Software bugs have been a concern for programmers for nearly 75 years, since the day computer scientist Grace Murray Hopper reported the cause of an error in an early Harvard Mark II computer: a moth stuck between relay contacts. The incident helped popularize the term "bug."

Bugs range from slight computer hiccups to catastrophes. In the 1980s, at least five patients died after a Therac-25 radiation therapy device malfunctioned due to an error by an inexperienced programmer. In 1962, NASA mission control destroyed the Mariner 1 space probe as it veered from its intended path over the Atlantic Ocean; incorrectly transcribed handwritten code was blamed. In 1982, a software bug later alleged to have been implanted into the Soviet trans-Siberian gas pipeline by the CIA triggered one of the largest non-nuclear explosions in history.

According to data management firm Coralogix, programmers produce 70 bugs per 1,000 lines of code, with each bug solution demanding 30 times more hours than it took to write the code in the first place. The firm estimates the United States spends $113 billion a year identifying and remediating bugs.

So Microsoft's recent announcement that it has successfully created a machine learning model that can accurately identify high-priority security bugs 97 percent of the time is welcome news.

In a report posted online earlier this month, Scott Christiansen, a senior security program manager at Microsoft, said, "We discovered that by pairing machine learning models with security experts, we can significantly improve the identification and classification of security bugs."

The model has an even higher success rate, 99 percent, at distinguishing between security and non-security bugs.

Microsoft used two statistical techniques to design its bug detection system. One, the term frequency-inverse document frequency (TF-IDF) algorithm, examines massive document collections for key words and calculates their relevance. The other, a logistic regression model, estimates the probability that an item belongs to a specific class or that an event occurs.
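To make the two techniques concrete, here is a minimal sketch, not Microsoft's implementation, of how TF-IDF features and a logistic regression classifier can be combined to score bug-report titles. The toy titles, the labels, the smoothed IDF formula, and every function name below are assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: bug-report titles labeled 1 (security) or 0 (non-security).
TITLES = [
    ("buffer overflow in image parser", 1),
    ("sql injection in login form", 1),
    ("privilege escalation via token reuse", 1),
    ("cross site scripting in comment field", 1),
    ("button misaligned on settings page", 0),
    ("typo in welcome message", 0),
    ("slow page load on dashboard", 0),
    ("dark mode colors look wrong", 0),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(doc, idf):
    """Map a document to an L2-normalized {term: tf * idf} vector; unseen terms are dropped."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(vectors, labels, epochs=500, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            p = sigmoid(bias + sum(weights.get(t, 0.0) * v for t, v in vec.items()))
            err = p - y  # gradient of the log loss with respect to the logit
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias

def predict_proba(title, idf, weights, bias):
    """Estimated probability that a title describes a security bug."""
    vec = tfidf(title, idf)
    return sigmoid(bias + sum(weights.get(t, 0.0) * v for t, v in vec.items()))

docs = [title for title, _ in TITLES]
labels = [label for _, label in TITLES]
idf = fit_idf(docs)
weights, bias = train([tfidf(d, idf) for d in docs], labels)
```

Calling `predict_proba("sql injection in search form", idf, weights, bias)` returns a value between 0 and 1; a threshold such as 0.5 turns it into a security or non-security decision. A production system would train on far more data and would likely rely on an established machine learning library.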

The program first classified security and non-security bugs and was then improved to classify degrees of threat as "critical," "important" or "low-impact."
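The two-step flow described above can be sketched as a small triage function. This is only an illustration of the control flow: the keyword rules below are hypothetical stand-ins for trained models, and the function and variable names are assumptions, not part of Microsoft's system.

```python
from typing import Callable, Optional

# Severity labels taken from the article.
SEVERITIES = ("critical", "important", "low-impact")

def triage(
    title: str,
    is_security: Callable[[str], bool],
    rate_severity: Callable[[str], str],
) -> Optional[str]:
    """Stage 1 filters out non-security bugs; stage 2 rates the rest."""
    if not is_security(title):
        return None  # non-security bug: no severity rating is assigned
    severity = rate_severity(title)
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity label: {severity!r}")
    return severity

# Hypothetical stand-in rules; a real system would plug in trained classifiers.
def keyword_is_security(title: str) -> bool:
    words = ("overflow", "injection", "execution", "leak", "xss")
    return any(word in title.lower() for word in words)

def keyword_severity(title: str) -> str:
    lowered = title.lower()
    if "execution" in lowered or "overflow" in lowered:
        return "critical"
    if "injection" in lowered:
        return "important"
    return "low-impact"
```

For example, `triage("remote code execution in parser", keyword_is_security, keyword_severity)` walks through both stages, while a title such as "typo in welcome message" is filtered out in the first stage and returns `None`.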

Christiansen said Microsoft's goal was to design a bug-detection system "with a level of accuracy that is as close as possible to that of a security expert."

A key breakthrough of this project, Christiansen explained, is that bug reports can be classified "even when solely the title is available for training and scoring."

"To the best of our knowledge, this is the very first work to do so," he said.

Microsoft says it will eventually release its work as open source on GitHub.

"Every day, software developers stare down a long list of features and bugs that need to be addressed," Christiansen said. "Security professionals try to help by using automated tools to prioritize security bugs, but too often, engineers waste time on false positives or miss a critical security vulnerability that has been misclassified. To tackle this problem data science and security teams came together to explore how machine learning could help."