Leveraging the Legacy of Conventional Libraries for Organizing Digital Libraries
Abstract
With the significant growth in the number of electronic documents
available on the Internet, intranets, and digital libraries, the need for
effective methods and systems to index and organize E-documents is greater
than ever. In this paper we introduce a new method of automatic text
classification that categorizes E-documents by utilizing the classification metadata
of books, journals and other library holdings that already exists in online
library catalogues. The method is based on identifying all references cited in a
given document and, using the classification metadata of these references as
catalogued in a physical library, devising an appropriate class for the document
itself according to a standard library classification scheme with the help of a
weighting mechanism. We have demonstrated the application of the proposed
method and assessed its performance by developing a prototype classification
system for classifying electronic syllabus documents archived in the Irish
National Syllabus Repository according to the well-known Dewey Decimal
Classification (DDC) scheme.
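
The reference-based approach summarized above can be illustrated with a minimal sketch. It assumes that the DDC number of each cited reference has already been retrieved from a library catalogue (None where no match was found), and that the weighting mechanism is a simple weighted vote. The function name classify_by_references, the weights parameter, and the truncation to three digits are hypothetical choices made for illustration, not the weighting scheme of the prototype system.

```python
from collections import defaultdict


def classify_by_references(reference_classes, weights=None, digits=3):
    """Return the DDC class with the highest total weight among the references.

    reference_classes: DDC numbers as strings (e.g. "005.133"), or None for
        references that could not be matched in a catalogue.
    weights: optional per-reference weights (assumed here; defaults to 1.0 each).
    digits: number of leading characters of the DDC number used as the class.
    """
    if weights is None:
        weights = [1.0] * len(reference_classes)

    scores = defaultdict(float)
    for ddc, weight in zip(reference_classes, weights):
        if ddc is None:
            continue  # skip references without classification metadata
        scores[ddc[:digits]] += weight

    if not scores:
        return None  # no reference could be classified
    # Ties are resolved in favour of the class encountered first.
    return max(scores, key=scores.get)


# Hypothetical DDC numbers for the references cited in one syllabus.
cited = ["005.133", "005.74", "510.2", None]
predicted = classify_by_references(cited)
```

In this example two of the three catalogued references fall under class 005, so that class receives the highest score. Passing explicit weights, for instance to favour references that appear in a required reading list, can change the outcome.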