In an era where digital imagery plays an increasingly pivotal role in shaping public perception, especially within cultural heritage domains, understanding the emotional impact of visual content has become essential. Researchers have now harnessed sophisticated machine learning techniques to quantify emotional cues embedded in cultural heritage images, unveiling nuanced dynamics between image-based sentiments and public behaviors. Central to this breakthrough is the development of the Heritage Sentiment Index (HSI), a pioneering metric that translates complex visual emotions into actionable insights, paving the way for innovative approaches to heritage communication and management.
The core of this investigative effort is a convolutional neural network (CNN) model designed to classify the emotional valence of cultural heritage images into a simplified binary framework—positive or negative sentiment. By leveraging the Inception v3 architecture, renowned for its multi-branch convolutional design and superior image recognition capabilities, the researchers implemented a transfer learning strategy. This approach involved fine-tuning pre-trained models on curated datasets specifically aligned with cultural heritage themes, thus maximizing efficiency while minimizing the challenges associated with limited training data.
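The essence of the fine-tuning step described above is freezing a pre-trained convolutional base and training only a small binary sentiment head on the features it extracts. As a minimal illustration (not the paper's implementation), the sketch below trains such a head with plain NumPy, standing in synthetic vectors for the 2048-dimensional pooled features an Inception v3 base would produce; all names, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for features from a frozen Inception v3 base
# (its final pooling layer outputs 2048-dimensional vectors).
n, d = 200, 2048
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w > 0).astype(float)  # synthetic positive/negative labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Binary sentiment head: one dense layer trained by gradient descent
# on the cross-entropy loss, while the "base" stays fixed.
w = np.zeros(d)
b = 0.0
lr = 0.1
for _ in range(300):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - y)) / n
    b -= lr * np.mean(p - y)

train_acc = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"training accuracy of the head: {train_acc:.2f}")
```

In a real transfer-learning pipeline the frozen base would be the pre-trained network itself and the head would be trained in the same framework, but the division of labor is the same: pre-trained features, newly trained classifier.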
The training dataset used for the emotion classification model, known as DeepSent, is a product of meticulous annotation and consensus validation. Originating from social media platforms like Flickr and Twitter, this dataset initially encompassed a broad spectrum of daily life and natural imagery. Recognizing the stylistic and emotional divergence from cultural heritage images, the research team conducted rigorous screening to distill a high-confidence subset of 1,028 images exhibiting strong inter-rater agreement, ensuring robust model training and reliable sentiment classification.
Upon training, the model demonstrated impressive performance metrics, with 83.8% accuracy and an F1-score of 88.9%, affirming its capability to discern emotional valence in visual content. However, to ensure applicability within the specific context of cultural heritage, the team undertook extensive validation using a curated selection of 100 images sourced from Redbook (Xiaohongshu) and Instagram. This step involved multilayered filtering and manual annotation, juxtaposing human judgments with model predictions, yielding a solid 74.3% accuracy and a notable 90.2% recall, thereby confirming the model's sensitivity to negative emotional cues pertinent to heritage discourse.
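The reported figures follow the standard confusion-matrix definitions, with "negative sentiment" treated as the positive class given the study's emphasis on recall for negative imagery. A quick sketch (the counts below are illustrative, not the paper's):

```python
def accuracy(tp, fp, fn, tn):
    # Fraction of all predictions that were correct.
    return (tp + tn) / (tp + fp + fn + tn)

def recall(tp, fn):
    # Fraction of true negatives-in-sentiment the model caught.
    return tp / (tp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def f1(tp, fp, fn):
    # Harmonic mean of precision and recall.
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Illustrative counts for a 100-image validation set.
tp, fp, fn, tn = 37, 9, 4, 50
print(f"accuracy={accuracy(tp, fp, fn, tn):.3f}, "
      f"recall={recall(tp, fn):.3f}, F1={f1(tp, fp, fn):.3f}")
```

High recall with more modest accuracy, as reported in the validation, means the model rarely misses a negatively valenced image even if it sometimes flags a positive one.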
The Heritage Sentiment Index itself is derived as the proportion of images classified by the model as carrying negative sentiment relative to the total number of cultural heritage images posted each day. By converting raw model outputs into this daily summary statistic, the HSI encapsulates public exposure to negatively valenced heritage imagery on a temporal scale. This index forms the foundation for exploring how visual emotional signals influence public engagement, behavioral intentions, and sentiment expressed in comments within social media ecosystems.
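As described, the index reduces to a per-day negative fraction over the classifier's outputs. A minimal sketch, assuming a simple (date, label) record per image (the data structure is an illustration, not the paper's):

```python
from collections import defaultdict

def heritage_sentiment_index(posts):
    """HSI per day: share of images classified as negative sentiment.

    `posts` is an iterable of (date_string, label) pairs, where label is
    "negative" or "positive" as emitted by the image classifier.
    """
    neg = defaultdict(int)
    total = defaultdict(int)
    for day, label in posts:
        total[day] += 1
        if label == "negative":
            neg[day] += 1
    return {day: neg[day] / total[day] for day in total}

posts = [
    ("2023-05-01", "negative"), ("2023-05-01", "positive"),
    ("2023-05-01", "positive"), ("2023-05-02", "negative"),
    ("2023-05-02", "negative"), ("2023-05-02", "positive"),
]
print(heritage_sentiment_index(posts))
```

For this toy input the first day scores 1/3 and the second 2/3, i.e. the second day exposed viewers to proportionally more negatively valenced heritage imagery.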
Beyond the binary classification scheme, the researchers also explored a probabilistic variant of the HSI, accounting for the model’s predicted confidence in negative sentiment identification. This alternative, which aggregates the weighted probabilities of negative sentiment across images, reinforces the robustness of the initial metric, as regression analyses reveal persistent predictive power of both versions on a suite of public behavioral outcomes, ranging from tourism intentions to interaction metrics like shares and likes.
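The probabilistic variant swaps the hard labels for the model's predicted probability of negative sentiment, averaging those probabilities per day. A sketch under the same assumed record format (this aggregation rule is an interpretation of the description above, not the paper's exact formula):

```python
from collections import defaultdict

def probabilistic_hsi(posts):
    """Probability-weighted HSI: mean predicted P(negative) per day.

    `posts` is an iterable of (date_string, p_negative) pairs, where
    p_negative is the classifier's predicted probability that one
    image carries negative sentiment.
    """
    prob_sum = defaultdict(float)
    count = defaultdict(int)
    for day, p_neg in posts:
        prob_sum[day] += p_neg
        count[day] += 1
    return {day: prob_sum[day] / count[day] for day in count}

posts = [("2023-05-01", 0.9), ("2023-05-01", 0.2), ("2023-05-02", 0.6)]
print(probabilistic_hsi(posts))
```

Unlike the binary version, a confidently negative image (0.9) and a borderline one (0.55) contribute differently here, which is what makes the variant a useful robustness check.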
Recognizing that sentiment classification thresholds may affect results, the study further tested an asymmetric dual-threshold strategy. Images with ambiguous prediction probabilities were omitted to refine signal clarity. Despite these variations, the HSI consistently demonstrated significant associations with behavioral responses, underscoring its resilience to classification nuances and confirming its validity as an emotion-driven indicator within cultural heritage communication.
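The dual-threshold strategy can be sketched as follows: an image counts as negative only above an upper probability cutoff, as positive only below a lower one, and is dropped as ambiguous in between. The threshold values here are placeholders; the paper's settings are not given in this summary.

```python
def dual_threshold_hsi(day_probs, lo=0.35, hi=0.65):
    """HSI for one day under an asymmetric dual-threshold rule.

    `day_probs` holds each image's predicted P(negative). Images whose
    probability falls strictly between `lo` and `hi` are treated as
    ambiguous and excluded before computing the negative fraction.
    Returns None if no unambiguous images remain.
    """
    neg = sum(1 for p in day_probs if p >= hi)
    pos = sum(1 for p in day_probs if p <= lo)
    kept = neg + pos
    return neg / kept if kept else None

probs = [0.9, 0.8, 0.5, 0.4, 0.1]  # 0.5 and 0.4 are ambiguous, dropped
print(dual_threshold_hsi(probs))   # 2 negative out of 3 retained
```

Making `lo` and `hi` asymmetric around 0.5 lets the rule demand more evidence for one class than the other, which is the "asymmetric" part of the strategy.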
Complementing the image emotion recognition, the study introduced the Comment Sentiment Index (CSI), a textual sentiment measure computed through advanced natural language processing techniques. Utilizing Recursive Neural Tensor Networks (RNTN), the researchers extracted pessimism scores from user comments, effectively quantifying emotional tone at the sentence level. This multimodal approach allows for integrated analyses that compare the influence of both visual and textual emotional cues on public sentiment and behavior.
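An RNTN classifier of the kind used here assigns a sentiment label to each sentence; one natural way to roll those labels up into a daily pessimism score is the share of negative sentences across all comments. The aggregation rule below is an assumption for illustration, not the paper's exact formula:

```python
def comment_sentiment_index(comments):
    """CSI sketch: share of sentences labeled negative in a day's comments.

    `comments` is a list of comments, each a list of per-sentence labels
    ("negative" / "neutral" / "positive") as a sentence-level classifier
    such as an RNTN might emit. Returns None if there are no sentences.
    """
    labels = [lab for comment in comments for lab in comment]
    if not labels:
        return None
    return sum(lab == "negative" for lab in labels) / len(labels)

day = [
    ["negative", "neutral"],  # one comment with two sentences
    ["positive"],
    ["negative"],
]
print(comment_sentiment_index(day))  # 2 negative of 4 sentences -> 0.5
```

Working at the sentence level rather than the whole-comment level is what lets the measure capture mixed comments, e.g. praise for a site alongside dismay at its condition.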
A key revelation from comparing the HSI and CSI across longitudinal datasets is their modest yet statistically significant positive correlation. This suggests that while visual and textual sentiment signals partially converge, they also operate through distinct cognitive and communicative pathways. Images wield strong symbolic and evocative power eliciting immediate emotional responses, whereas textual comments often reflect reflective or moderated sentiments, providing complementary dimensions to cultural heritage narratives online.
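The "modest yet significant positive correlation" between the two daily series is an ordinary Pearson correlation; a self-contained sketch (the series values below are invented for the demo):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length daily series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative daily HSI and CSI values (not the study's data):
hsi = [0.20, 0.35, 0.30, 0.50, 0.25]
csi = [0.30, 0.32, 0.40, 0.45, 0.28]
print(f"r = {pearson_r(hsi, csi):.3f}")
```

A value near +1 would mean the visual and textual signals move in lockstep; the modest positive r the study reports is consistent with partial convergence through distinct pathways, as described above.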
Data underpinning these insights were gathered through meticulous multi-stage sampling from primary social media platforms Redbook and Instagram between 2021 and 2025. The dataset encompasses over 14,000 culturally relevant images, carefully vetted to exclude advertisements, irrelevant content, or sensitive cultural imagery. This comprehensive compilation spans key global heritage events, cross-cultural contexts, and temporal fluctuations, delivering a rich foundation to explore how public emotion is mediated through visual culture over time.
Temporal analyses of HSI and CSI reveal dynamic patterns responding to significant heritage moments. For example, images documenting the repatriation of Benin Bronzes in 2021 triggered elevated negative sentiments reflecting solemnity and cultural justice. Similarly, devastating damage to Ukrainian heritage sites during armed conflict produced sharp negative swings in both the visual and textual sentiment indices, illustrating how external shocks profoundly affect collective emotional landscapes. Positive surges in sentiment coincided with restorative milestones such as Notre-Dame Cathedral's reconstruction and UNESCO heritage listings, demonstrating the index's sensitivity to cultural pride and renewal narratives.
Intriguingly, immediate behavioral responses to increases in negative visual sentiment often manifested as suppression effects: reduced tourism intentions, sharing, and comment positivity on the following day, suggesting aversion or caution in engaging with negatively charged heritage content. However, these trends reversed in subsequent days, indicating potential rebound effects characterized by increased attention and interaction. These findings reveal a complex, time-dependent interplay between emotional exposure and public behavioral dynamics in heritage communication.
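The next-day suppression pattern is the kind of relationship a lagged regression detects: regress day t+1 engagement on day t's HSI and look at the sign of the slope. The single-lag OLS below is a sketch of the idea only; the study's actual specification likely includes controls and multiple lags.

```python
def lag1_slope(index, outcome):
    """OLS slope of next-day outcome on today's index value.

    A negative slope is the "suppression" pattern: higher exposure to
    negative-sentiment imagery today, lower engagement tomorrow.
    """
    xs = index[:-1]   # day t
    ys = outcome[1:]  # day t + 1
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Invented series where engagement dips the day after HSI spikes:
hsi = [0.2, 0.6, 0.2, 0.7, 0.3]
shares = [100, 95, 60, 90, 55]
print(f"lag-1 slope: {lag1_slope(hsi, shares):.1f}")
```

Repeating the regression at longer lags (t+2, t+3, ...) and finding the slope flip positive would be the statistical signature of the rebound effect described above.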
Such nuanced understanding heralds practical implications for cultural heritage professionals and digital communicators. The Heritage Sentiment Index offers a quantitatively grounded tool to monitor emotional climates surrounding heritage imagery in real-time, potentially guiding strategies for content curation, community engagement, and public relations. By identifying emotional troughs and peaks, institutions can tailor messaging to optimize positive engagement or address emerging controversies with sensitivity.
Moreover, this research advances methodological frontiers by integrating deep learning-based visual sentiment analysis with semantic-rich textual sentiment modeling. This multidisciplinary fusion addresses longstanding challenges in capturing the full emotional spectrum of digital cultural expression, especially in short and context-dependent social media texts where traditional lexicon-based methods fall short. The adoption of RNTN for textual sentiment, aligned with CNN-driven visual sentiment detection, sets a new benchmark in multimodal sentiment analytics.
Ethical considerations underpin the research design, emphasizing respect for cultural diversity, privacy, and transparency. The dataset exclusively comprises publicly available images free of copyright restrictions, while sensitive content involving religious or political symbolism was conscientiously excluded. The aggregate-level focus circumvents individual tracking to protect user anonymity, aligning with international standards like the Declaration of Helsinki.
Further validation efforts, including robustness checks with different sentiment threshold settings and unwinsorized data analysis, affirm the reliability and stability of the HSI as an empirical tool. These steps reinforce confidence in the index’s predictive capacity across diverse cultural heritage contexts, underscoring its potential utility in broader heritage science and digital humanities research.
The study’s temporal scope—from post-pandemic cultural revitalization phases to major heritage conferences and policy changes—offers valuable longitudinal insights into how visual emotional cues modulate public sentiment and behavior amidst evolving socio-cultural landscapes. This comprehensive dataset presents unique opportunities for future research investigating causal mechanisms, cultural differences, and content effectiveness in heritage communication strategies.
In summary, this pioneering investigation elucidates the intricate role of visual emotional cues embedded in cultural heritage images in shaping public sentiment and behavioral intentions. The development and application of the Heritage Sentiment Index, supported by advanced computational models, provide a transformative approach to decoding and leveraging emotional signals in digital heritage contexts. This work not only enriches academic discourse but also offers practical pathways for cultivating more engaging, respectful, and emotionally resonant cultural heritage experiences online.
Subject of Research:
Analysis of emotional cues in cultural heritage images using machine learning to quantify public sentiment and predict behavioral intentions.
Article Title:
The impact of visual emotional cues in cultural heritage on public sentiment and behavioral intention: an image emotion recognition approach.
Article References:
Lai, S., Tian, Y. & Zhang, Q. The impact of visual emotional cues in cultural heritage on public sentiment and behavioral intention: an image emotion recognition approach. npj Herit. Sci. 14, 85 (2026). https://doi.org/10.1038/s40494-026-02348-3
Image Credits: AI Generated

