SyGAR – A Synthetic Data Generator for Evaluating Name Disambiguation Methods
Date
2009
Author
Ferreira, Anderson A.
Goncalves, Marcos Andre
Almeida, Jussara M.
Laender, Alberto H.F.
Veloso, Adriano
Abstract
Name ambiguity in the context of bibliographic citations is
one of the hardest problems currently faced by the digital library community.
Several methods have been proposed in the literature, but none
of them provides the perfect solution for the problem. More importantly,
basically all of these methods were tested in limited and restricted scenarios,
which raises concerns about their practical applicability. In this
work, we deal with these limitations by proposing a synthetic generator
of ambiguous authorship records called SyGAR. The generator was validated
against a gold standard collection of disambiguated records, and
applied to evaluate three disambiguation methods in a relevant scenario.