Paper: The CDS Cross-match Service: Key Figures, Internals and Future Plans
Volume: 522, Astronomical Data Analysis Software and Systems XXVII
Page: 125
Authors: Pineau, F.; Boch, T.; Derrière, S.; Schaaff, A.
Abstract: In 2011 the CDS released a service enabling astronomers to cross-match (potentially very large) catalogues with unprecedentedly short runtimes. The service runs on a modest hardware infrastructure and uses custom indexed data files rather than a relational database management system. We present the capabilities and benchmarks of the cross-match service, as well as its evolution and current usage statistics within the community. Several new technologies have been tested since the release of the service, with its long-term sustainability and its evolution in the Big Data era in mind: the use of SSDs instead of regular spinning disks, and the implementation of a cross-match algorithm in the Apache Spark framework. This algorithm requires a specific co-partitioning and co-location of the data in order to ensure both good load balancing and scalability. Finally, we present a generalization of the two-catalogue case to probabilistic multi-catalogue cross-matching, both purely positional and taking into account photometric information by means of kernel density classification.
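
For readers unfamiliar with the term, the following is a minimal sketch of a two-catalogue positional cross-match: it pairs sources whose angular separation is below a chosen radius. It is not the CDS implementation, which the abstract describes as relying on custom indexed data files; this sketch uses a brute-force loop instead, and the function names (`angular_separation_deg`, `cross_match`), the `(ra, dec)` tuple layout in degrees, and the sample coordinates are all assumptions made for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(cat_a, cat_b, radius_arcsec):
    """Return (i, j, separation_arcsec) for every pair within radius_arcsec.

    Hypothetical helper: brute force, O(len(cat_a) * len(cat_b)), so only
    suitable for small illustrative catalogues.
    """
    matches = []
    for i, (ra_a, dec_a) in enumerate(cat_a):
        for j, (ra_b, dec_b) in enumerate(cat_b):
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec:
                matches.append((i, j, sep))
    return matches


# Made-up coordinates: the first pair is offset by 1 arcsec in declination.
catalogue_a = [(10.0, 20.0), (150.0, -30.0)]
catalogue_b = [(10.0, 20.0 + 1.0 / 3600.0), (200.0, 5.0)]
pairs = cross_match(catalogue_a, catalogue_b, radius_arcsec=2.0)
```

The quadratic cost of this loop is what makes spatial indexing or partitioning necessary for very large catalogues; the sketch deliberately leaves that part out.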
