A Hybrid Distributed Architecture for Indexing
Abstract
This paper presents a hybrid scavenger grid as an underlying hardware
architecture for search services within digital libraries. The hybrid scavenger
grid combines dedicated servers with dynamic resources, in the form of idle
workstations, to handle medium- to large-scale search engine workloads. The
dedicated resources are expected to have reliable and predictable behaviour. The
dynamic resources are used opportunistically without any guarantees of availability.
Test results confirmed that indexing performance is directly related to the
size of the hybrid grid and that intranet networking does not play a major role. A
system-efficiency and cost-effectiveness comparison of a grid and a multiprocessor
machine showed that, for modest to large workloads, the grid architecture
delivers better throughput per unit cost than the multiprocessor, at a
comparable system efficiency.