Evaluation in Context
Abstract
All search happens in a particular context, such as the specific collection
of a digital library, its associated search tasks, and its associated users.
Information retrieval researchers usually agree on the importance of context, but
they rarely address the issue. In particular, evaluation in the Cranfield tradition
requires abstracting away from individual differences between users. This paper
investigates whether we can bring some of this context into the Cranfield paradigm.
Our approach is as follows: we attempt to record the “context” of the
humans already in the loop, the topic authors/assessors, by designing targeted
questionnaires. The questionnaire data then becomes part of the evaluation test
suite as valuable information on the context of the search requests. We have experimented with
this questionnaire approach during the evaluation campaign of the INitiative for
the Evaluation of XML Retrieval (INEX). The results of this case study demonstrate
the viability of the questionnaire approach as a means to capture context
in evaluation. This can help explain and control some of the user or topic variation
in the test collection. Moreover, it allows us to break the set of topics down into
various meaningful categories, e.g., those that suit a particular task scenario, and
to zoom in on the relative performance for such a group of topics.
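
As an illustration of this last point, the following sketch groups topics by one questionnaire answer and averages a per-topic effectiveness score within each group. It is a minimal Python sketch with hypothetical topic identifiers, field names, and scores; it does not reflect the actual INEX questionnaire or result formats.

```python
# Hypothetical sketch: break topics into categories by a questionnaire field
# and compare a system's mean effectiveness per category.
from collections import defaultdict
from statistics import mean

# Per-topic questionnaire answers (topic id -> answers); values are illustrative.
questionnaire = {
    "T01": {"task_scenario": "known-item"},
    "T02": {"task_scenario": "exploratory"},
    "T03": {"task_scenario": "exploratory"},
}

# Per-topic effectiveness scores for one system (e.g., average precision); illustrative.
ap_scores = {"T01": 0.42, "T02": 0.17, "T03": 0.25}

def score_by_category(answers, scores, field):
    """Average the per-topic scores within each questionnaire category."""
    groups = defaultdict(list)
    for topic, ans in answers.items():
        if topic in scores:
            groups[ans[field]].append(scores[topic])
    return {category: mean(vals) for category, vals in groups.items()}

print(score_by_category(questionnaire, ap_scores, "task_scenario"))
# e.g. {'known-item': 0.42, 'exploratory': 0.21}
```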