On social media, tracing where rights-bearing data originates and how it spreads has become critically important, especially for online rumors. A recent study presents an approach to unraveling the provenance of data intertwined with rights and interests on social platforms. By applying semantic web technologies, the researchers devised a method that traces the creators and propagators of rumor data and also identifies the infringers and victims within this digital ecosystem.
Central to this approach is the Semantic Web Rule Language (SWRL), used together with the Pellet reasoner inside the widely used Protégé ontology editor. SWRL adds Horn-like logical rules to an ontology, extending what can be inferred beyond what the ontology's class and property axioms express on their own. This added reasoning ability lets the system detect, and so helps resolve, conflicting relationships between classes and instances that arise when the PROV-OCC model, a framework designed to capture provenance in an online context, is turned into an ontology. Through this reasoning, the system exposes subtle and often concealed data flows within social media networks.
The research treats rumors not as static blocks of information but as dynamic, rights-bearing data elements. To that end, the authors pose three pivotal capability questions that form the backbone of their investigative framework: Can the infringer and the specific infringed rights be identified? Can victims, along with the types of their infringed rights, be pinpointed? And crucially, can the actual pathways through which these infringements propagate across service platforms be reconstructed? Answering these questions is vital to addressing the data rights violations embedded in online rumor dissemination.
To operationalize these research questions, the team meticulously constructed SWRL rules that directly encode the logic necessary to map data rights infringements to their actors and pathways. These rules leverage object properties such as hasRevenueRight, hasCollectionRight, hasManagementRight, hasUsageRight, hasHoldingRight, hasPartialRevenueRight, and hasAssociatedWith—each representing a facet of the ownership or control over data elements. Entities within the ontology—labeled as “x” and “v” to denote respective instances of service objects and data elements—serve as the keystone in articulating these relationships. The integration of these properties embodies a nuanced representation of data rights which can model partial as well as full rights ownership and associated interactions.
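To make the idea of a Horn-like rule concrete, here is a minimal Python sketch of forward chaining over subject-predicate-object facts. It is not the authors' SWRL rule base, and it does not use Pellet or Protégé: the rule itself, the `forwards` and `infringesPartialRevenueRightOf` properties, and the sample facts are assumptions made for illustration. Only `hasPartialRevenueRight` and the variable names "x" and "v" come from the article.

```python
# Facts are (predicate, subject, object) triples. All individuals are hypothetical.
facts = {
    ("hasPartialRevenueRight", "G", "V"),
    ("forwards", "B", "V"),
    ("forwards", "G", "V"),
}

# A rule is (body, head); terms starting with "?" are variables.
# Hypothetical rule, written SWRL-style:
#   hasPartialRevenueRight(?g, ?v) ^ forwards(?x, ?v) ^ differentFrom(?x, ?g)
#     -> infringesPartialRevenueRightOf(?x, ?g)
RULE = (
    [
        ("hasPartialRevenueRight", "?g", "?v"),
        ("forwards", "?x", "?v"),
        ("differentFrom", "?x", "?g"),
    ],
    ("infringesPartialRevenueRightOf", "?x", "?g"),
)


def match(atom, fact, binding):
    """Unify one rule atom with one fact; return the extended binding or None."""
    if atom[0] != fact[0]:
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(body, known, binding=None):
    """Yield every variable binding that satisfies all atoms in the body."""
    binding = binding or {}
    if not body:
        yield binding
        return
    first, rest = body[0], body[1:]
    if first[0] == "differentFrom":
        # Evaluated on the current binding, so it must follow the atoms that bind its variables.
        a, b = (binding.get(t, t) for t in first[1:])
        if a != b:
            yield from solve(rest, known, binding)
        return
    for fact in known:
        extended = match(first, fact, binding)
        if extended is not None:
            yield from solve(rest, known, extended)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for b in list(solve(body, known)):
                derived = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


inferred = forward_chain(facts, [RULE])
print(sorted(inferred - facts))
```

With these sample facts the only new triple says that "B" infringes the partial revenue right of "G"; the `differentFrom` atom stops "G" from being flagged as infringing its own right. A reasoner such as Pellet does the same kind of derivation, but against the full ontology rather than a flat set of triples.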
The framework integrates the W7 and PROV-OCC models, combining their strengths to create an enriched ontology landscape where the subtleties of provenance in online rumor dissemination can be reasoned over. The reasoning process, performed by the Pellet reasoner inside Protégé, transcends the explicit data to unearth hidden logical assertions about the relationships among entities and their access or infringement actions. This is not simply theoretical; visualized in the project’s graphical output, the inferred relationships elegantly demonstrate how the system identifies original and secondary creators of rumor-related data elements, as well as those whose data rights have been infringed.
Crucially, the study’s findings illustrate the correctness of the model through a series of carefully conducted reasoning experiments. In one notable instance, the system correctly infers that service object “A” is the original creator of a key Rights-and-Interests-Attributed Data Element (designated “V”), establishing the provenance source point for the rumor dissemination. Simultaneously, service objects “B” and “C” are inferred as secondary creators, highlighting the dynamic nature of rumor modification and propagation as these actors contribute to the evolution of the data element. This layered understanding underscores the ontology’s capability to model detailed interactions and provenance beyond simple attribution.
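The distinction between original and secondary creators can be sketched in a few lines of Python. This is an illustration under assumptions, not the study's rule set: the relation names are borrowed from PROV-O (`wasAttributedTo`, `wasDerivedFrom`), and the derived elements "V1" and "V2" are hypothetical stand-ins for the modified versions of "V" that "B" and "C" produce.

```python
# Hypothetical provenance records with PROV-O-style relation names.
was_attributed_to = {"V": "A", "V1": "B", "V2": "C"}  # data element -> creator
was_derived_from = {"V1": "V", "V2": "V"}             # derived element -> source element


def root_of(element):
    """Follow derivation links back to the element with no recorded source."""
    seen = set()
    while element in was_derived_from and element not in seen:
        seen.add(element)  # guards against accidental cycles in the records
        element = was_derived_from[element]
    return element


def classify_creators(element):
    """Return (original creator, sorted list of secondary creators) for an element."""
    root = root_of(element)
    original = was_attributed_to[root]
    secondary = sorted(
        {
            creator
            for e, creator in was_attributed_to.items()
            if e != root and root_of(e) == root and creator != original
        }
    )
    return original, secondary


original, secondary = classify_creators("V")
print(original, secondary)
```

Here the creator of the root element is treated as the original creator, and anyone credited with an element derived from it is treated as a secondary creator, which matches the roles the article assigns to "A", "B", and "C".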
Perhaps even more impactful is the model’s success in identifying the victim of data rights infringement, labeled in the study as service object “G.” Leveraging the same SWRL rules and reasoning infrastructure, the system reliably establishes “G” as the entity whose data rights were violated through the circulation of rumor data. This precise mapping of infringed rights to specific victims and infringers is groundbreaking, providing a concrete computational method to trace the often ambiguous territoriality of digital data ownership in social media contexts.
The propagation pathway reconstructed through inference centers on the popular social service platform “DouYin.” From “A” to “B” and “C” and onward to various other browsing and forwarding entities, the flow chart exposes how rumors traverse complex social networks. These findings answer the research questions and provide empirical support for the hypothesis that a structured ontology combined with SWRL rules can reconstruct detailed paths of data rights infringement and dissemination, something long sought in digital provenance studies.
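Once forwarding relationships have been inferred, reconstructing a pathway is a graph traversal. The following Python sketch uses breadth-first search over a hypothetical edge list: "A", "B", and "C" come from the article, while "D" and "E" are placeholders for the unnamed downstream entities, and the edges themselves are assumed.

```python
from collections import deque

# Hypothetical forwarding edges as (sender, receiver) pairs.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E")]


def propagation_paths(edges, source):
    """Return the shortest forwarding path from source to every reachable entity."""
    graph = {}
    for sender, receiver in edges:
        graph.setdefault(sender, []).append(receiver)

    paths = {source: [source]}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in paths:  # first visit is the shortest path in BFS
                paths[nxt] = paths[node] + [nxt]
                queue.append(nxt)
    return paths


paths = propagation_paths(edges, "A")
print(paths["E"])
```

Breadth-first search is used here because it yields the shortest chain of forwards to each entity; a study interested in every route a rumor took would enumerate all paths instead.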
Beyond identification, the system captures the multiplicity of rights infringements occurring at different levels. It reveals the nuanced interplay where “A” infringes on “G”’s partial revenue rights to a data element, while “B” and “C” simultaneously infringe on both “G”’s and “A”’s partial revenue rights across various related data elements. Moreover, additional downstream service objects perpetuate this pattern of infringement in a cascading fashion that mimics real-world social media interactions. This multi-faceted infringement modeling enriches our understanding of data rights dynamics and sets a new standard for digital rights management research.
Incorporating the SWRL rules into Protégé’s rule base is critical to achieving these insights. As the study’s figures show, the rules encode logical conditions that differentiate original creators from secondary creators and infringers from victims. Keeping the rules in a Horn-like form trades some expressive power for more tractable reasoning, which the study presents as a way to keep the approach applicable to the large real-world data sets typical of social media platforms.
Notably, the research addresses the conflicts and logical inconsistencies that can emerge between classes and instances after ontologization. The Pellet reasoner’s consistency checking surfaces these conflicts so that they can be resolved, keeping the ontology consistent and reliable for provenance verification. This step is vital for maintaining semantic integrity when modeling complex social data, and it speaks to the robustness of the integrated ontology-reasoner setup the researchers propose.
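One common kind of conflict a reasoner reports is an individual asserted to belong to two classes that are declared disjoint. The Python sketch below checks only that one case and is far simpler than Pellet's consistency checking; the class names, the disjointness axiom, and the individual "X" are assumptions made for illustration.

```python
# Hypothetical disjointness axiom: nothing is both a service object and a data element.
disjoint = [("ServiceObject", "DataElement")]

# Hypothetical class assertions; "X" is deliberately mis-modeled.
types = {
    "A": {"ServiceObject"},
    "V": {"DataElement"},
    "X": {"ServiceObject", "DataElement"},
}


def find_conflicts(types, disjoint):
    """List (individual, class1, class2) for every violated disjointness axiom."""
    conflicts = []
    for individual, classes in sorted(types.items()):
        for c1, c2 in disjoint:
            if c1 in classes and c2 in classes:
                conflicts.append((individual, c1, c2))
    return conflicts


conflicts = find_conflicts(types, disjoint)
print(conflicts)
```

A check like this only locates the conflict. Deciding which assertion to retract or which axiom to revise remains a modeling decision.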
This work’s implications extend far beyond theoretical contributions. In an era where digital misinformation and rumor propagation pose severe societal risks, having a technologically rigorous method to trace data rights provenance stands to revolutionize how rights managers, platform operators, and regulatory bodies handle data misuse. By providing clarity on data ownership and infringement pathways, the approach fosters accountability and opens new avenues for legal and technological interventions against digital rights violations on social media.
Moreover, the integration of semantic web technologies with social media provenance narratives exemplifies the power of interdisciplinary research bridging computer science, law, and social sciences. This amalgamation has produced a pioneering model that interprets social data flows not just as network transactions but as rights-laden objects with traceable histories—a perspective that is innovative and urgently needed in the digital era.
In conclusion, this research marks a significant step forward in tracing the provenance of data rights within online rumor diffusion. By combining the analytical rigor of SWRL inference with the ontological depth of the W7 and PROV-OCC models, and operationalizing these within the Pellet reasoner and Protégé software, the authors have established a blueprint for rigorous, automated tracing of data ownership and infringement in complex social networks. This work not only answers its key research questions but also substantiates its hypothesis with meaningful empirical case studies, marking a critical milestone in digital provenance research.
As online platforms continue to grapple with misinformation and unauthorized use of data, the methods outlined here offer a promising pathway to greater transparency and enforceable accountability. The intersection of semantic reasoning and social media data provenance explored by this study represents a frontier where technology can fundamentally reshape digital rights governance for the better.
Subject of Research: Provenance of data with rights and interests in online rumor data circulation on social media
Article Title: Provenance of data with rights and interests in online rumor data element circulation on social media
Article References:
Zhao, J., Liu, H., Shu, K. et al. Provenance of data with rights and interests in online rumor data element circulation on social media. Humanit Soc Sci Commun 12, 1056 (2025). https://doi.org/10.1057/s41599-025-05437-z