December 10, 2019
Researchers preserve and release trove of public, low-frequency radio data
At AGU's Fall Meeting, the preeminent international Earth and space science meeting, researchers unveiled the world's largest database of Extremely Low Frequency (ELF)/Very Low Frequency (VLF) data. The open-access database is named WALDO, which stands for Worldwide Archive of Low-frequency Data and Observations. Researchers will be able to access nearly 1000 terabytes (TB) of data to further scientific efforts in fields like space weather, ionospheric remote sensing, earthquake forecasting, and subterranean prospecting. Space weather effects can produce anything from beautiful auroras in the night sky to destructive effects on power grids and satellites, so both scientists and engineers are motivated to understand them and ultimately predict them.
The work to preserve hundreds of terabytes of ELF/VLF electromagnetic wave measurements and open them to researchers worldwide is a joint project of Stanford University, Georgia Institute of Technology and the University of Colorado Denver, with support from the National Science Foundation and the Department of Defense.
"It's exciting that we saved this data all these years because right now is the time when it is becoming most valuable with advances in computing power, Big Data algorithms and artificial intelligence," said Mark Golkowski, Ph.D., professor of Electrical Engineering, College of Engineering, Design and Computing, CU Denver.
Golkowski and Morris Cohen, Ph.D., associate professor in the School of Electrical and Computer Engineering at Georgia Tech, initiated the WALDO project as the culmination of a legacy that began at Stanford following World War II. Professor Robert Helliwell pioneered the field and the use of large antennas to capture low-frequency radio waves in remote locations like Antarctica and Alaska to study the complex physics of near-Earth space. Helliwell eventually passed the torch to Professor Umran Inan, who served as advisor to Golkowski and Cohen when they were students in his research program at Stanford.
"If there is one thing our advisor instilled in us, it was the sanctity of high quality science observations and the importance of preserving them." says Golkowski. "Unfortunately, this kind of archival work is often put on the back burner and it's only later that people say, 'if only we had data from 10 years ago, we would know if this was an anomaly or not.' Losing data is like the burning of the library at Alexandria. When it's gone, it's gone."
For years, researchers have transferred data from magnetic tapes to CDs to DVDs as technology advanced and outdated storage methods threatened the data. The advent of massive cloud storage has the added benefit of making the data accessible to researchers all over the world.
Through the efforts of Golkowski, Cohen and their students, nearly 80,000 DVDs of data are being uploaded to the cloud. At the time of the meeting, 200 TB of data had been uploaded, with another 800 TB to go.
While most of the data is from the last 20 years, some recordings date back to the 1970s and 1980s.
WALDO will also be a living repository, capturing ongoing data being collected by Georgia Tech and the University of Colorado Denver. For example, data collected during the 2017 Great American Solar Eclipse will be publicly available.
"The recordings capture a snapshot of the Earth's quickly changing atmosphere and space environment, which is why the effort to maintain the data we already have is crucial for future research. "It's shown me the effort necessary as a civilization to keep from losing the past," says Cohen. "While there is no question that the data on WALDO is a record of the planet's past and can inform on its present, anybody with experience in data analysis knows that one often has to comb through a lot of noise and lackluster observations to find the gem that will advance knowledge."
"This was the inspiration for the 'WALDO' name based on the children's cartoon character always hiding among the masses in his characteristic sweater," says Golkowski.
Golkowski and Cohen hope that opening up the database will inspire new discoveries and new uses for the datasets. Ever-improving computational power and data algorithms will no doubt play a role.
"We have a sense of the known unknowns. But who knows who many unknown unknowns are still out there. By making this data public, our hope is for other researchers to use these data sets in ways we haven't imagined yet," said Cohen, who applied WALDO data and found that signals at 60 Hz and its harmonics—the annoying noise that comes from power grids—can be used as a diagnostic for power grids and cybersecurity systems.
At CU Denver, Golkowski has used ELF observations of lightning to diagnose the upper atmosphere, which could eventually improve communication systems.
"Finally!", quipped Cohen, "we have an answer to the question `Where's WALDO?"