Paper: |
Efficient and Scalable Cross-Matching of (Very) Large Catalogs |
Volume: |
442, Astronomical Data Analysis Software and Systems XX (ADASSXX) |
Page: |
85 |
Authors: |
Pineau, F.-X.; Boch, T.; Derriere, S. |
Abstract: |
Whether it be for building multi-wavelength datasets from independent surveys, studying changes in object luminosities,
or detecting moving objects (stellar proper motions, asteroids), cross-catalog matching is a technique widely used in
astronomy. The need for efficient, reliable and scalable cross-catalog matching is becoming even more pressing with
forthcoming projects which will produce huge catalogs in which astronomers will dig for rare objects, perform statistical
analysis and classification, or detect transients in real time.
We have developed a formalism and the corresponding technical framework to address the challenge of fast cross-catalog
matching. Our formalism supports more than simple nearest-neighbor search, and handles elliptical positional
errors. Scalability is improved by partitioning the sky using the HEALPix scheme and processing each sky cell
independently. The use of multi-threaded two-dimensional kd-trees adapted to equatorial coordinates
enables efficient neighbor search.
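
The partition-then-match idea can be illustrated with a minimal sketch. It is not the authors' implementation: a uniform 3D grid over unit vectors stands in for HEALPix cells, a brute-force scan of neighboring cells stands in for the kd-tree search, and a plain circular search radius replaces the elliptical-error formalism. The function names (unit_vector, cell_of, cross_match) and the sample catalogs are hypothetical.

```python
import math
from collections import defaultdict

def unit_vector(ra_deg, dec_deg):
    """Convert equatorial coordinates in degrees to a 3D unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def cell_of(vec, cell_size):
    """Index of the grid cell containing a vector (stand-in for a HEALPix cell)."""
    return tuple(math.floor(c / cell_size) for c in vec)

def cross_match(cat_a, cat_b, radius_arcsec):
    """For each source in cat_a, find its nearest cat_b source within the radius.

    Catalogs are lists of (ra_deg, dec_deg). Returns a list of
    (index_in_a, index_in_b, separation_arcsec) tuples.
    """
    radius = math.radians(radius_arcsec / 3600.0)
    chord = 2.0 * math.sin(radius / 2.0)  # straight-line distance for that angle
    cell_size = chord  # any match then lies in the same or an adjacent cell

    # Partition catalog B into cells; each cell can be processed independently.
    vec_b = [unit_vector(ra, dec) for ra, dec in cat_b]
    cells = defaultdict(list)
    for j, v in enumerate(vec_b):
        cells[cell_of(v, cell_size)].append(j)

    matches = []
    for i, (ra, dec) in enumerate(cat_a):
        va = unit_vector(ra, dec)
        cx, cy, cz = cell_of(va, cell_size)
        best = None
        # Scan the 27 cells around the source (brute force, no kd-tree here).
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                        d = math.dist(va, vec_b[j])
                        if d <= chord and (best is None or d < best[1]):
                            best = (j, d)
        if best is not None:
            sep = 2.0 * math.asin(min(1.0, best[1] / 2.0))
            matches.append((i, best[0], math.degrees(sep) * 3600.0))
    return matches

# Hypothetical toy catalogs; the second pair straddles RA = 0.
cat_a = [(10.0, 20.0), (359.99999, 0.0), (100.0, -45.0)]
cat_b = [(10.0, 20.0 + 0.5 / 3600.0), (0.00001, 0.0), (200.0, 30.0)]
for i, j, sep in cross_match(cat_a, cat_b, radius_arcsec=1.0):
    print(i, j, round(sep, 3))
```

Working with unit vectors rather than raw (RA, Dec) pairs avoids special handling of the RA wrap-around and of the poles in this sketch; the kd-trees described above are instead adapted to equatorial coordinates directly.
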
The whole process can run on a single computer, but could also use clusters of machines to cross-match future
very large surveys such as GAIA or LSST in a reasonable time. Current performance already allows 2MASS (∼470M sources) and SDSS DR7 (∼350M sources) to be matched on a single machine in less than 10 minutes.
We aim to provide astronomers with an online catalog cross-matching service that leverages
the catalogs present in the VizieR database.
This service will allow users both to access pre-computed cross-matches across some very large catalogs,
and to run customized cross-matching operations.
It will also support VO protocols for synchronous or asynchronous queries. |