Identifying software packages on cloud machine instances using filesystem meta-data
Abstract
Utilization of the emerging grid and cloud infrastructure requires services which allow the user to identify the machine instances suitable for her software needs. Identifying the software packages installed on cloud machine instances is the first building block of such services. In the current study a software package identification system is developed. Data about the filesystem and the installed packages is collected from various cloud machine instances. Relations amongst software elements are analyzed and used to formulate a Semantic Software Graph, a graph representation of the filesystem data and the software package data which utilizes semantic graph technology. These relations are then examined to determine whether the related software elements belong to the same software package. Graph reduction algorithms are utilized to reduce the size of the Semantic Software Graph, and different graph clustering algorithms are applied to the resulting graph to group files into closely related groups. External evaluation measures are used to compare the resulting clusters to the expected software packages. The process is applied and evaluated on additional machine instances to demonstrate its general applicability. The evaluation results are encouraging and may be improved in future work.
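The pipeline summarized above (collect filesystem meta-data, relate files, cluster, compare against known packages) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the sample records in `FILES`, the single relation used (modification times within `max_dt` seconds), connected components as a stand-in for the unnamed graph clustering algorithms, and purity as one example of an external evaluation measure.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical records: (path, modification time in seconds, expected package).
FILES = [
    ("/usr/bin/foo", 1000, "foo"),
    ("/usr/share/foo/foo.conf", 1002, "foo"),
    ("/usr/share/foo/README", 1003, "foo"),
    ("/usr/bin/bar", 5000, "bar"),
    ("/usr/lib/bar/libbar.so", 5001, "bar"),
]


def cluster(files, max_dt=5):
    """Group files into connected components of an assumed relation graph.

    Two files are linked when their modification times differ by at most
    max_dt seconds (an illustrative relation, not the study's definition).
    """
    parent = list(range(len(files)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(files)), 2):
        if abs(files[i][1] - files[j][1]) <= max_dt:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, record in enumerate(files):
        groups[find(i)].append(record)
    return list(groups.values())


def purity(clusters):
    """External measure: fraction of files that fall in their cluster's majority package."""
    total = sum(len(c) for c in clusters)
    majority = sum(
        Counter(record[2] for record in c).most_common(1)[0][1] for c in clusters
    )
    return majority / total


if __name__ == "__main__":
    found = cluster(FILES)
    for group in found:
        print(sorted(path for path, _, _ in group))
    print("purity:", purity(found))
```

A real system would combine several relations and a dedicated clustering algorithm; the sketch only shows how clusters derived from meta-data can be scored against the expected packages.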
Collections
- Department of Informatics (Τμήμα Πληροφορικής)