|Machine-Assisted Discovery Through Identification and
Explanation of Anomalies in Astronomical Surveys
|532, ASTRONOMICAL DATA ANALYSIS SOFTWARE AND SYSTEMS XXX
|Wagstaff, K. L.; Huff, E.; Rebbapragada, U.
|Data volumes in modern astronomical surveys are large, and human
attention is comparatively scarce. The most interesting sources are
rare and may therefore remain permanently buried and undiscovered in
large archives. Many science goals from planned sky surveys (e.g., Roman,
SPHEREx, and Euclid) require exquisitely precise measurements taken
over billions of galaxies and stars. Existing validation techniques
appear unlikely to scale to the next generation of large sky surveys.
We propose the use of machine learning to identify, group, and explain
anomalies within very large data sets. The goal is to quickly
distinguish erroneous measurements and expected patterns in the data
from sources and statistical correlations with true astrophysical
origins. We illustrate the process of identifying and explaining
anomalies in a study conducted on sources observed by the Dark Energy
Survey. We found that 96% of automatically identified outliers in a
subset of 11M sources were likewise discarded by humans. In addition,
several unusual objects led to follow-up spectral observations with
the Palomar Observatory. We hypothesize that this discovery process,
when applied to other large-scale sky survey data sets, can result in
improved science yield and catalog validation.
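
As a rough illustration of the identify, group, and explain steps described above, the following Python sketch flags outliers in a toy catalog. It is not the method used in the study, which the abstract does not specify. It substitutes a simple robust z-score (median and median absolute deviation) for the machine learning model, uses the most deviant feature as a stand-in "explanation", and groups flagged sources by that feature. The feature names (mag_g, size), the threshold, and the toy values are all hypothetical.

```python
import statistics


def robust_z(values):
    """Robust z-scores from the median and median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    # 1.4826 rescales the MAD to match a Gaussian standard deviation.
    scale = 1.4826 * mad if mad > 0 else 1.0
    return [(v - med) / scale for v in values]


def rank_anomalies(catalog, features, threshold=5.0):
    """Return (index, score, top_feature) for sources above the threshold.

    The score is the largest absolute robust z-score over all features,
    and top_feature names the feature that produced it.
    """
    z = {f: robust_z([row[f] for row in catalog]) for f in features}
    flagged = []
    for i in range(len(catalog)):
        top = max(features, key=lambda f: abs(z[f][i]))
        score = abs(z[top][i])
        if score > threshold:
            flagged.append((i, score, top))
    flagged.sort(key=lambda item: -item[1])  # most anomalous first
    return flagged


def group_by_explanation(flagged):
    """Group flagged source indices by the feature that explains them."""
    groups = {}
    for index, _score, feature in flagged:
        groups.setdefault(feature, []).append(index)
    return groups


# Hypothetical toy catalog: 20 ordinary sources plus one with an
# implausibly large size, as a measurement artifact might produce.
catalog = [
    {"mag_g": 20.0 + 0.1 * (i % 5), "size": 1.0 + 0.05 * (i % 4)}
    for i in range(20)
]
catalog.append({"mag_g": 20.2, "size": 9.0})

flagged = rank_anomalies(catalog, ["mag_g", "size"])
groups = group_by_explanation(flagged)
for index, score, feature in flagged:
    print(f"source {index}: score {score:.1f}, driven by {feature}")
```

In a real survey the per-feature explanation would let a reviewer dismiss an entire group at once (for example, all sources flagged because of one known instrumental effect) and concentrate on the groups that have no ready explanation.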