Turning network traffic data into music
Cybersecurity analysts deal with an enormous amount of data, especially when monitoring network traffic. Printed as text, a single day's worth of network traffic could fill something like a thick phonebook. Detecting an abnormality in it is like finding a needle in a haystack.
"It's an ocean of data," says Yang Cai, a senior systems scientist in CyLab. "The important patterns we need to see become buried by a lot of trivial or normal patterns."
Cai has been working for years to come up with ways to make abnormalities in network traffic easier to spot. A few years ago, he and his research group developed a data visualization tool that allowed one to see network traffic patterns, and now he has developed a way to hear them.
In a new study presented this week at the Conference on Applied Human Factors and Ergonomics, Cai and two co-authors show how cybersecurity data can be heard in the form of music. When there's a change in the network traffic, there is a change in the music.
"We wanted to articulate normal and abnormal patterns through music," Cai says. "The process of sonification—using audio to perceptualize data—is not new, but sonification to make data more appealing to the human ear is."
The researchers experimented with several different "sound mapping" algorithms, transforming numerical datasets into music with various melodies, harmonies, time signatures, and tempos. For example, the researchers assigned specific notes to the 10 digits that make up any number found in data: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. To represent the third and fourth digits of the mathematical constant pi (4 and 1), they set the time signature of one measure to 4/4 and that of the following measure to 1/4.
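To make the idea of a sound mapping concrete, here is a minimal Python sketch of the two mappings described above: digits to notes, and digits to time signatures. It is an illustration only, not the researchers' implementation. The digit-to-pitch table (a C major scale starting at middle C), the function names, and the choice to skip the digit 0 when forming a time signature are all assumptions made for this example; the article does not say which notes the study assigned to which digits.

```python
# Hypothetical digit-to-pitch table: the ten digits laid out on a C major
# scale starting at middle C (MIDI note 60). The study's actual mapping is
# not given in the article; this table is an assumption for illustration.
DIGIT_TO_MIDI = {
    0: 60, 1: 62, 2: 64, 3: 65, 4: 67,
    5: 69, 6: 71, 7: 72, 8: 74, 9: 76,
}

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(midi):
    """Convert a MIDI note number to a name such as 'C4' (middle C = 60)."""
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


def digits_of(value):
    """Return the decimal digits of a value, ignoring signs and separators."""
    return [int(ch) for ch in str(value) if ch.isdigit()]


def digits_to_notes(value):
    """Map every digit of a value to a note name using the table above."""
    return [midi_to_name(DIGIT_TO_MIDI[d]) for d in digits_of(value)]


def digits_to_time_signatures(digits, beat_unit=4):
    """Turn each digit into the time signature of one measure, e.g. 4 -> '4/4'.

    A digit of 0 is skipped, since a measure with zero beats is not a
    usable meter. That handling is an assumption of this sketch.
    """
    return [f"{d}/{beat_unit}" for d in digits if d != 0]


if __name__ == "__main__":
    pi_digits = digits_of("3.141")
    print(digits_to_notes("3.141"))
    # Third and fourth digits of pi (4 and 1) become 4/4 and 1/4 measures.
    print(digits_to_time_signatures(pi_digits[2:4]))
```

In this sketch, a value read from a traffic log (a byte count, say) would be passed through `digits_to_notes` to produce a short melodic fragment, so a shift in the numbers produces a shift in the melody.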
While this all may sound fairly complicated, one doesn't need to be a trained musician to hear these changes in the music, the researchers found. The team created music using network traffic data from a real malware distribution network and presented the music to non-musicians. They found that non-musicians were able to accurately recognize changes in pitch when the music was played on different instruments.
"We are not only making music, but turning abstract data into something that humans can process," the authors write in their study.
Cai says his vision is that someday an analyst will be able to explore cybersecurity data with virtual reality goggles that present a visualization of the network space. As the analyst moves closer to an individual data point, or a cluster of data, the music representing that data would gradually become more audible.
"The idea is to use all of humans' sensory channels to explore this cyber analytical space," Cai says.
Photo: co-authors Jakub Polaczyk (left) and Katelyn Croft (right), both former students of Cai's and alumni of Carnegie Mellon's College of Fine Arts.
While Cai himself is not a trained musician, his two co-authors on the study are. Jakub Polaczyk and Katelyn Croft were once students in Carnegie Mellon University's College of Fine Arts. Polaczyk obtained his Artist Diploma in Composition in 2013 and is currently an award-winning composer based in New York City. Croft obtained her master's degree in harp performance in 2020 and is currently in Taiwan studying the influence of Western music on Asian music.
Before graduating in 2020, Croft worked in Cai's lab on a virtual recital project. Polaczyk took Cai's university-wide course, "Creativity," in 2011, and the two have collaborated ever since.
"It has been a very nice collaboration," Cai says. "This kind of cross-disciplinary collaboration really exemplifies CMU's strengths."
More information: Jakub Polaczyk et al, Compositional Sonification of Cybersecurity Data in a Baroque Style, Advances in Artificial Intelligence, Software and Systems Engineering (2021). DOI: 10.1007/978-3-030-80624-8_38