SyGAR – A Synthetic Data Generator for Evaluating Name Disambiguation Methods
Ferreira, Anderson A.
Goncalves, Marcos Andre
Almeida, Jussara M.
Laender, Alberto H.F.
Name ambiguity in bibliographic citations is one of the hardest problems currently faced by the digital library community. Several disambiguation methods have been proposed in the literature, but none of them fully solves the problem. More importantly, nearly all of these methods were tested only in limited and restricted scenarios, which raises concerns about their practical applicability. In this work, we address these limitations by proposing SyGAR, a synthetic generator of ambiguous authorship records. The generator was validated against a gold standard collection of disambiguated records and applied to evaluate three disambiguation methods in a relevant scenario.