Bug-repair system learns from example

Anyone who's downloaded an update to a computer program or phone app knows that most commercial software has bugs and security holes that require regular "patching."

Often, those bugs are simple oversights. For example, the program tries to read data that have already been deleted. The patches, too, are often simple—such as a single line of code that verifies that a data object still exists.

That simplicity has encouraged computer scientists to explore the possibility of automatic patch generation. Several research groups, including that of Martin Rinard, an MIT professor of electrical engineering and computer science, have developed templates that indicate the general forms that patches tend to take. Algorithms can then use the templates to generate and evaluate a host of candidate patches.

Recently, at the Association for Computing Machinery's Symposium on the Foundations of Software Engineering, Rinard, his student Fan Long, and Peter Amidon of the University of California at San Diego presented a new system that learns its own templates by analyzing successful patches to real software.

Where a hand-coded patch-generation system might feature five or 10 templates, the new system created 85, which makes it more diverse but also more precise. Its templates are more narrowly tailored to specific types of real-world patches, so it doesn't generate as many useless candidates. In tests, the new system, dubbed Genesis, repaired nearly twice as many bugs as the best-performing hand-coded template system.

Thinning the herd

"You are navigating a tradeoff," says Long, an MIT graduate student in electrical engineering and computer science and first author on the paper. "On one hand, you want to generate enough candidates that the set you're looking through actually contains useful patches. On the other hand, you don't want the set to include so many candidates that you can't search through it."

Every item in the data set on which Genesis was trained includes two blocks of code: the original, buggy code and the patch that repaired it. Genesis begins by constructing pairs of training examples, such that every item in the data set is paired off with every other item.

Genesis then analyzes each pair and creates a generic representation—a draft template—that will enable it to synthesize both patches from both originals. It may synthesize other, useless candidates, too. But the representation has to be general enough that among the candidates are the successful patches.

Next, Genesis tests each of its draft templates on all the examples in the training set. Each of the templates is based on only two examples, but it might work for several others. Each template is scored on two criteria: the number of errors that it can correct and the number of useless candidates it generates. For instance, a template that generates 10 candidates, four of which patch errors in the training data, might score higher than one that generates 1,000 candidates and five correct patches.

On the basis of those scores, Genesis selects the 500 most promising templates. For each of them, it augments the initial two-example training set with each of the other examples in turn, creating a huge set of three-example training sets. For each of those, it then varies the draft template, to produce a still more general template. Then it performs the same evaluation procedure, extracting the 500 most promising templates.

Covering the bases

After four rounds of this process, each of the 500 top-ranking templates has been trained on five examples. The final winnowing uses slightly different evaluation criteria, ensuring that every error in the training set that can be corrected will be. That is, there may be a template among the final 500 that patches only one bug, earning a comparatively low score in the preceding round of evaluation. But if it's the only template that patches that bug, it will make the final cut.

In the researchers' experiments, the final winnowing reduced the number of templates from 500 to 85. Genesis works with programs written in the Java programming language, and the MIT researchers compared its performance with that of the best-performing hand-coded Java patch generator. Genesis correctly patched defects in 21 of 49 test cases drawn from 41 open-source programming projects, while the previous system patched 11.

It's possible that more training data and more computational power—to evaluate more candidate templates—could yield still better results. But a system that allows programmers to spend only half as much time trying to repair bugs in their code would be useful nonetheless.

More information: Automatic Inference of Code Transforms for Patch Generation. people.csail.mit.edu/rinard/pa … er/fse17.genesis.pdf

Provided by Massachusetts Institute of Technology

This story is republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site that covers news about MIT research, innovation and teaching.

Bug-repair system learns from example

Thinning the herd

Covering the bases

Automatic bug-repair system fixes 10 times as many errors as its predecessors

Adobe's VideoGigaGAN uses AI to make blurry videos sharp and clear

Emulating neurodegeneration and aging in artificial intelligence systems

Holographic displays offer a glimpse into an immersive future

For more open and equitable public discussions on social media, try 'meronymity'

Researchers develop energy-efficient probabilistic computer by combining CMOS with stochastic nanomagnet

New computer vision tool can count damaged buildings in crisis zones and accurately estimate bird flock sizes

Game theory research shows AI can evolve into more selfish or cooperative personalities

Proof of concept study shows path to easier recycling of solar modules

New circuit boards can be repeatedly recycled

Researchers develop an automated benchmark for language-based task planners

Built-in bionic computing: Researchers develop method to control pneumatic artificial muscles

Custom-made catalyst leads to longer-lasting and more sustainable green hydrogen production

Researchers outline path forward for tandem solar cells

Researcher develop high-performance amorphous p-type oxide semiconductor

Scientists create new atomic clock that is both ultra-precise and sturdy

A framework to compare lithium battery testing data and results during operation

New approach could make reusing captured carbon far cheaper, less energy-intensive

How much energy can offshore wind farms in the U.S. produce? New study sheds light

Bug-repair system learns from example

Thinning the herd

Covering the bases

Let us know if there is a problem with our content

Thank you for taking time to provide your feedback to the editors

Share article

E-MAIL THE STORY