Classification of Galactic Transients with Missing Data


Paper:	Classification of Galactic Transients with Missing Data
Volume:	538, ADASS XXXII
Page:	348
Authors:	Yuzuki Koga; Makoto Uemura; Ryosuke Sazaki; Shiro Ikeda
DOI:	10.26624/FHGO1766
Abstract:	The very early stages of galactic transients, such as nova explosions and dwarf nova outbursts, are poorly understood because their physical states change rapidly, within one day. A key problem is choosing among the options for follow-up observation of a transient, such as imaging or spectroscopy, just after its discovery. To date, domain experts have made this choice; thus, the appropriate observations could be performed only when an expert was at the observatory at night. We propose an automated system that can quickly perform the appropriate follow-up observations to identify the type of a transient. We have developed such a system, called SmartK, using the 1.5 m Kanata telescope of Hiroshima University. In SmartK, we choose the observation type based on the mutual information of each observation mode, which is calculated from the conditional probability of the measurements obtained with the follow-up observation and the class probability of the object. We estimate the class probability with supervised machine learning. Our training data set contains many missing values, which is a common problem in the classification of astronomical objects. Sparse multinomial logistic regression (SMLR) has been used as the discriminant model in SmartK. SMLR can create nonlinear decision boundaries, but it does not easily handle training data with missing values. Here, we define a generative model (GM), propose a strategy for making decisions with Bayes’ theorem, and compare its performance with that of SMLR. We find no significant difference in cross-validation accuracy between SMLR and GM. This suggests that simple decision boundaries are sufficient for our data. We prefer GM to SMLR because it handles missing values in the training data more easily. Additionally, GM enables us to identify anomalies that do not fall into any predefined type.
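
To illustrate why a generative model copes with missing values more easily, here is a minimal sketch. It is not the SmartK implementation. It assumes that each class is modelled by independent Gaussians per feature (a naive Bayes form, which the abstract does not specify) and that missing values are encoded as NaN. The function names `fit_gm`, `log_joint`, and `posterior`, and the toy data, are hypothetical. Missing entries are skipped when fitting, and unobserved features are dropped from the likelihood before Bayes' theorem is applied.

```python
import numpy as np

def fit_gm(X, y, n_classes, var_floor=1e-3):
    """Fit one Gaussian per (class, feature), ignoring NaN entries.

    Assumption: features are conditionally independent given the class.
    A feature never observed for a class keeps the default mean 0, variance 1.
    """
    n_features = X.shape[1]
    mean = np.zeros((n_classes, n_features))
    var = np.ones((n_classes, n_features))
    prior = np.zeros(n_classes)
    for c in range(n_classes):
        Xc = X[y == c]
        prior[c] = len(Xc) / len(X)
        for j in range(n_features):
            obs = Xc[~np.isnan(Xc[:, j]), j]  # observed values only
            if obs.size > 0:
                mean[c, j] = obs.mean()
                var[c, j] = max(obs.var(), var_floor)
    return prior, mean, var

def log_joint(x, prior, mean, var):
    """Return log p(x_observed, class) for every class.

    Under the independence assumption, marginalizing a missing feature
    simply removes its factor from the likelihood.
    """
    obs = ~np.isnan(x)
    ll = -0.5 * (np.log(2 * np.pi * var[:, obs])
                 + (x[obs] - mean[:, obs]) ** 2 / var[:, obs])
    return np.log(prior) + ll.sum(axis=1)

def posterior(x, prior, mean, var):
    """Class probabilities from Bayes' theorem, computed stably in log space."""
    lj = log_joint(x, prior, mean, var)
    p = np.exp(lj - lj.max())
    return p / p.sum()

# Hypothetical toy data: two classes, two features, several missing entries.
nan = np.nan
X = np.array([[0.0, 1.0], [0.2, nan], [nan, 1.2],
              [3.0, -1.0], [3.2, nan], [nan, -1.1]])
y = np.array([0, 0, 0, 1, 1, 1])

prior, mean, var = fit_gm(X, y, n_classes=2)
p_first_only = posterior(np.array([0.1, nan]), prior, mean, var)
p_second_only = posterior(np.array([nan, -1.0]), prior, mean, var)
p_nothing = posterior(np.array([nan, nan]), prior, mean, var)
```

When no feature is observed, the posterior reduces to the class prior. Summing `exp(log_joint)` over classes gives the marginal likelihood of the observed features, which is one possible basis for flagging objects that fit none of the predefined types. The abstract does not state the criterion used for that purpose.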