Paper: Distributed Streaming Radio Astronomy Reduction with Dask
Volume: 532, ASTRONOMICAL DATA ANALYSIS SOFTWARE AND SYSTEMS XXX
Page: 337
Authors: Perkins, S.; Bester, L.; Hugo, B. V.; Kenyon, J. S.; Marais, P.; Smirnov, O.
Abstract:
To process “Big Data”, modern computing strategies such as MapReduce
and cluster computing frameworks such as Spark are used.
These frameworks lean towards a streaming, chunked,
functional programming style with minimal shared state.
Individual tasks processing chunks of data are flexibly scheduled
on multiple cores and nodes.
Legacy radio astronomy codes do not readily adapt to this paradigm.
To process the quantities of data produced by contemporary radio telescopes such as MeerKAT, and by future telescopes such as the SKA, radio astronomy codes will need to adapt to these paradigms.
This paper describes two Python libraries, dask-ms and Codex-Africanus, which enable the development of distributed, high-performance radio astronomy code with Dask. Dask is a lightweight Python parallelisation and distribution framework that seamlessly integrates with the PyData ecosystem to address radio astronomy “Big Data” challenges.
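The chunked, functional style described in the abstract can be illustrated with a minimal sketch. It uses only Python's standard library (`concurrent.futures`) rather than Dask, dask-ms or Codex-Africanus, and every name in it (`chunked`, `chunk_sum_of_squares`, `sum_of_squares`, `samples`) is hypothetical and chosen for illustration. It is not the paper's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def chunked(values, chunk_size):
    """Yield successive chunks (lists) of at most chunk_size items."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def chunk_sum_of_squares(chunk):
    """Pure per-chunk task: depends only on its input, no shared state."""
    return sum(v * v for v in chunk)


def sum_of_squares(values, chunk_size=4, workers=2):
    """Map the task over chunks in a worker pool, then reduce the partials."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_sum_of_squares, chunked(values, chunk_size)))
    return reduce(lambda a, b: a + b, partials, 0)


# Assumption: a short list of integers stands in for a stream of samples.
samples = list(range(10))
total = sum_of_squares(samples)
```

Because each task is a pure function of its own chunk, the pool is free to run tasks in any order and on any worker. Frameworks such as Dask rely on the same property to schedule tasks across multiple cores and nodes.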