Towards text copyright detection using metadata in web applications
Μπώκος, Γιώργος Δ.
Πούλος, Μάριος Σ.
Poulos, Marios S.
Bokos, George D.
Purpose – This paper presents the semantic content identifier (SCI), a permanent identifier computed through a linear-time onion-peeling algorithm. The algorithm extracts semantic features from a text, and this information is integrated into the permanent identifier.
Design/methodology/approach – The authors employ SCI to propose a mechanism for simultaneously checking the authenticity of information objects and the degree of similarity between them, and present an empirical investigation of the method. A management scenario is proposed for controlling the authentication process and detecting the degree of violation of documents.
Findings – Such a mechanism could be adopted as a component of libraries' strategy for protecting the copyright of documents published on the web.
Practical implications – The proposed numeric code can be used as a constituent part of the digital object identifier (DOI) system, making its computation more efficient and meaningful.
Originality/value – The identifier proposed in the paper can result in a more efficient index for identifying and retrieving objects in digital libraries, as well as in online repositories and commercial applications, allowing them to handle information retrieval requests more effectively.
- Journals, newspapers