When telescopes observe the Universe, they collect vast amounts of data over hours, months, or even years at a time, depending on what they are studying. Combining data from different telescopes is especially useful to astronomers: it lets them see different parts of the sky, or observe their targets in more detail or at different wavelengths. Each instrument has its own strengths, based on its location and capabilities.
“By setting this international standard, NRAO is taking a leadership role in ensuring that our global partners can efficiently utilize and share astronomical data,” said Jan-Willem Steeb, the technical lead of the new data processing program at the NSF NRAO. “This foundational work is crucial as we prepare for the immense data volumes anticipated from projects like the Wideband Sensitivity Upgrade to the Atacama Large Millimeter/submillimeter Array and the Square Kilometre Array Observatory in Australia and South Africa.” By addressing these key aspects, the new data model establishes a foundation for seamless data sharing and processing across various radio telescope platforms, both current and future.
International astronomy institutions collaborating with the NSF NRAO on this process include the Square Kilometre Array Observatory (SKAO), the South African Radio Astronomy Observatory (SARAO), the European Southern Observatory (ESO), the National Astronomical Observatory of Japan (NAOJ), and the Joint Institute for Very Long Baseline Interferometry European Research Infrastructure Consortium (JIVE).
The new data model was tested with example datasets from approximately 10 different instruments, including existing telescopes like the Australian Square Kilometre Array Pathfinder and simulated data from proposed future instruments like the NSF NRAO’s Next Generation Very Large Array. This broad collaboration ensures the model meets diverse needs across the global astronomy community. Extensive testing throughout this process ensures compatibility and functionality across a wide range of instruments. By addressing these aspects, the new data model establishes a more robust, flexible, and future-proof foundation for data sharing and processing in radio astronomy, significantly improving upon historical models.
“The new model is designed to address the limitations of aging models, in use for over 30 years, and created when computing capabilities were vastly different,” adds Jeff Kern, who leads software development for the NSF NRAO. “The new model updates the data architecture to align with current and future computing needs, and is built to handle the massive data volumes expected from next-generation instruments. It will be scalable, which ensures the model can cope with the exponential growth in data from future developments in radio telescopes.”
As part of this initiative, the NSF NRAO plans to release additional materials, including guides for various instruments and example datasets from multiple international partners. “The new data model is completely open-source and integrated into the Python ecosystem, making it easily accessible and usable by the broader scientific community,” explains Steeb. “Our project promotes accessibility and ease of use, which we hope will encourage widespread adoption and ongoing development.”
About NRAO
The National Radio Astronomy Observatory (NRAO) is a facility of the U.S. National Science Foundation, operated under cooperative agreement by Associated Universities, Inc.