ASPCS
 
Paper: From Photometric Redshifts to Improved Weather Forecasts: Machine Learning and Proper Scoring Rules as a Basis for Interdisciplinary Work
Volume: 532, ASTRONOMICAL DATA ANALYSIS SOFTWARE AND SYSTEMS XXX
Page: 173
Authors: Polsterer, K.; D'Isanto, A.; Lerch, S.
Abstract: The number, size, and complexity of astronomical data sets and databases have grown rapidly over the last decades, driven by new technologies and dedicated survey telescopes. Besides the handling of poly-structured and complex data, sparse data has become a field of growing scientific interest. A specific field of Astroinformatics research is the estimation of redshifts of extragalactic sources from sparse photometric observations. Many techniques have been developed to produce those estimates with increasing precision. In recent years, models have been favoured that, instead of providing only a point estimate, generate probability density functions (PDFs) to characterize and quantify the uncertainties of their estimates. Crucial to the development of those models is a proper, mathematically principled way to evaluate and characterize their performance, based on scoring functions as well as on tools for assessing calibration. Still, inappropriate methods are used in the literature to express the quality of the estimates; these are often insufficient and can lead to misleading interpretations. In this work we summarize how to correctly evaluate errors and forecast quality when dealing with PDFs. We describe the use of the log-likelihood, the continuous ranked probability score (CRPS), and the probability integral transform (PIT) to characterize the calibration as well as the sharpness of predicted PDFs. We present what we achieved when using proper scoring rules to train deep neural networks and to evaluate the model estimates, and how this work led from well-calibrated redshift estimates to improvements in probabilistic weather forecasting. The presented work is an example of interdisciplinarity in data science and illustrates how methods can help to bridge gaps between different fields of application.
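The three quantities named in the abstract can be illustrated with a minimal sketch, assuming each predicted PDF is a single Gaussian with mean mu and standard deviation sigma. That assumption is made here for illustration only and is not taken from the paper, which does not specify its model in the abstract. The function names are hypothetical. The CRPS uses the standard closed form for a Gaussian forecast.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_score(mu, sigma, y):
    """Negative log-likelihood of observation y under N(mu, sigma^2); lower is better."""
    z = (y - mu) / sigma
    return -math.log(normal_pdf(z) / sigma)

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) at y; lower is better."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * normal_cdf(z) - 1.0)
                    + 2.0 * normal_pdf(z)
                    - 1.0 / math.sqrt(math.pi))

def pit(mu, sigma, y):
    """Probability integral transform: the predictive CDF evaluated at y."""
    return normal_cdf((y - mu) / sigma)

# Hypothetical example: the same observation scored under a broad
# forecast and under a sharper forecast with the same mean.
y_obs = 1.0
broad = crps_gaussian(0.0, 1.0, y_obs)
sharp = crps_gaussian(0.0, 0.1, y_obs)
```

In this sketch the sharper forecast receives the larger (worse) CRPS, because the observation lies far outside its narrow predictive distribution. For a calibrated model, PIT values collected over many objects should be approximately uniform on [0, 1], so a histogram of pit(...) values is a common visual check of calibration.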