Scientists keen to use MeerKAT data deluge
Scientists need to cope with the terabytes of data that MeerKAT will begin producing in a few weeks. The telescope’s 64 dishes will generate a gigabyte of data a second.
The last of SA’s 64 radio dishes for the homegrown MeerKAT telescope will be in the ground later in March. While radio engineers have been getting the telescope ready, data scientists are preparing for the deluge of data that the telescope will generate — and other scientific disciplines are eyeing this capacity.
SA has invested heavily in astronomy, in the region of billions of rand, although an exact figure is difficult to determine. The government views the scientific discipline as a way to generate the skilled graduates and technicians the economy requires, and to use SA’s geographic advantage of clear skies to boost research and technological niches.
On completion in the 2030s, the Square Kilometre Array (SKA) will be the world’s largest radio telescope, with a total receiving area of 1km². It will be 50 times more sensitive than current telescopes and will aim to answer some of humanity’s most enigmatic questions, such as whether humans are alone in the universe, what happened immediately after the big bang and how galaxies form.
It will eventually comprise more than a million antennas in Australia and thousands of dishes in Africa, with a high concentration in SA. Construction of phase one of the SKA is set to begin in 2020 and will incorporate SA’s MeerKAT into the larger SKA. MeerKAT, like the SKA, will be an interferometer, an instrument in which many dishes or antennas together act as a single telescope. Each dish collects the relatively weak radio signals from space that have to be combined, filtered and turned into data that is useful to astronomers.
Scientists need to cope with the terabytes of data that MeerKAT will begin producing in a few weeks. The telescope’s 64 dishes will generate a gigabyte of data a second.
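To put that rate in perspective, the short calculation below converts a gigabyte a second into hourly, daily and yearly volumes. It is only a rough sketch: it assumes decimal units (1 gigabyte = 10^9 bytes) and that the telescope produces data at that rate around the clock, neither of which the article states.

```python
# Rough sketch: turn "a gigabyte of data a second" into larger time spans.
# Assumptions (not from the article): decimal units and a rate sustained 24/7.
GB = 10**9
TB = 10**12
PB = 10**15

RATE_BYTES_PER_SECOND = 1 * GB  # the figure quoted for MeerKAT's 64 dishes

SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def volume_bytes(seconds: int, rate: int = RATE_BYTES_PER_SECOND) -> int:
    """Bytes produced over `seconds` at a constant `rate` in bytes per second."""
    return seconds * rate


per_hour_tb = volume_bytes(SECONDS_PER_HOUR) / TB
per_day_tb = volume_bytes(SECONDS_PER_DAY) / TB
per_year_pb = volume_bytes(SECONDS_PER_YEAR) / PB

print(f"Per hour: {per_hour_tb:.1f} TB")
print(f"Per day:  {per_day_tb:.1f} TB")
print(f"Per year: {per_year_pb:.1f} PB")
```

Under those assumptions the telescope would fill about 3.6 terabytes an hour and roughly 86 terabytes a day, which is why storing and processing the data at a single university is impractical.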
"The idea is that, though on paper you might be a leader of [an astronomy] project, unless you have the capability to process the data, you’re not a leader," says Russ Taylor, director of the Inter-university Institute for Data-Intensive Astronomy (Idia).
The institute will allow South African astronomers to work, for the first time, with telescope data, says Taylor, who is an SKA research chair.
Established in 2015, Idia is a collaboration worth more than R10m between four partners: the universities of Cape Town, Pretoria, the Western Cape and the North West.
The institute has developed a cloud-based platform that will allow astronomers from all over the country and the world to access the data that MeerKAT will create.
Astronomers and data scientists have already created images from data received from the first 16 MeerKAT dishes, which went live in 2016.
They were processed using the institute’s cloud-based system, allowing researchers in other parts of the country access to data without requiring servers and hardware to store it at their institutions.
Taylor hopes that their cloud-based infrastructure could be adopted by other African countries.
The major focus of Idia is data-intensive astronomy, says Ishwara Chandra, an associate professor at the National Centre for Radio Astrophysics at India’s Tata Institute of Fundamental Research, who is collaborating with the institute.
"The data from this instrument [MeerKAT] is complex and is also of huge volume, which makes it difficult for individual researchers to analyse by themselves.
"The ‘fear of large data’ may hinder more people from doing radio astronomy and use of the SKA," he says.
"To extract maximum science and attract the best talent to do science with the SKA, it is important to make science-ready products available to users."
Taylor says the cloud-based platform is about making the data more accessible to scientists. "We are building a system that empowers scientists, so that they can be part of processing data, a system that allows the researchers to work with the data itself and work with the analytics as if it was on their desktops," he says.
Mattia Vaccari, a data scientist at the University of the Western Cape, heads the Help-Idia Panchromatic Project, which aims to combine data of different wavelengths, such as optical and radio waves, to study galaxy evolution.
Val Munsami, head of the South African National Space Agency, sees space applications in the country’s SKA involvement and data-intensive research cloud. "It can be used for space applications across the continent," he says.
Glaudina Loots, director of health innovation at the Department of Science and Technology, told a panel at the Science Forum SA in Pretoria in December that her unit planned to "piggyback" on the astronomy investment and data infrastructure.
"Part of that is earmarked for precision medicine. If you can’t handle the data and have to export it out of the country, then you start running into problems," she said.