Paper: Automatic Recognition of Object Names in Literature
Volume: 394, Astronomical Data Analysis Software and Systems (ADASS) XVII
Page: 377
Authors: Bonnin, C.; Lesteven, S.; Derriere, S.; Oberto, A.
Abstract: SIMBAD is a database of astronomical objects that provides, among other things, their bibliographic references in a large number of journals. Currently, these references have to be entered manually by librarians who read each paper. To cope with the increasing number of papers, CDS is developing a tool to assist the librarians in their work, taking advantage of the Dictionary of Nomenclature of Celestial Objects, which keeps track of object acronyms and of their origin. The program searches for object names directly in PDF documents by comparing the words with all the formats stored in the Dictionary of Nomenclature. It also searches for variable star names based on constellation names and for a large list of usual names such as Aldebaran or the Crab. Object names found in the documents often correspond to several astronomical objects. The system retrieves all possible matches, displays them with their object type given by SIMBAD, and lets the librarian make the final choice. The bibliographic reference can then be automatically added to the object identifiers in the database. In addition, the systematic use of the Dictionary of Nomenclature, which is updated manually, has made it possible to check it automatically and to detect errors and inconsistencies. Finally, the program collects additional information such as the position of the object names in the document (in the title, subtitle, abstract, table, figure caption...) and their number of occurrences. In the future, this will make it possible to calculate the 'weight' of an object in a reference and to provide SIMBAD users with important new information that will help them find the most relevant papers in the object reference list.
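The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the CDS implementation: the two acronym patterns, the short constellation list, the usual-name list, and the function name find_object_names are all hypothetical stand-ins, and the variable star pattern is a deliberate simplification of the real naming scheme. The sketch assumes the PDF text has already been extracted and split into labelled parts (title, abstract, and so on), and it only counts occurrences per part; it does not resolve a name to a SIMBAD object.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stand-ins for formats stored in the Dictionary of Nomenclature:
# each acronym maps to a regex for its identifiers (illustrative only).
FORMAT_PATTERNS = {
    "NGC": re.compile(r"\bNGC\s?(\d{1,4})\b"),
    "HD": re.compile(r"\bHD\s?(\d{1,6})\b"),
}

# Hypothetical, tiny subsets of the constellation and usual-name lists.
CONSTELLATIONS = ["Lyr", "Cyg", "Cep"]
USUAL_NAMES = ["Aldebaran", "Crab"]

# Simplified variable star pattern: one or two capital letters, or V plus
# digits, followed by a constellation abbreviation (e.g. "RR Lyr").
VARIABLE_STAR = re.compile(
    r"\b((?:[R-Z]|[A-Z]{2}|V\d{3,4})\s(?:" + "|".join(CONSTELLATIONS) + r"))\b"
)


def find_object_names(sections):
    """Count candidate object names per document part.

    sections: dict mapping a part label (e.g. "title") to its text.
    Returns a dict mapping each normalized name to a Counter of
    part label -> number of occurrences.
    """
    hits = defaultdict(Counter)
    for label, text in sections.items():
        # Acronym-based identifiers, normalized to "ACRONYM number".
        for acronym, pattern in FORMAT_PATTERNS.items():
            for match in pattern.finditer(text):
                hits[f"{acronym} {match.group(1)}"][label] += 1
        # Variable star names built from constellation abbreviations.
        for match in VARIABLE_STAR.finditer(text):
            hits[" ".join(match.group(1).split())][label] += 1
        # Usual names, matched as whole words.
        for name in USUAL_NAMES:
            count = len(re.findall(r"\b" + re.escape(name) + r"\b", text))
            if count:
                hits[name][label] += count
    return hits
```

Calling find_object_names({"title": "...", "abstract": "..."}) returns, for each candidate name, how often it appears in each part. A librarian-facing tool would still need to look each candidate up and present the possible objects for a final choice, as the abstract describes.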