February 9, 2017 report
Google Brain uses neural networks to provide realistic enhanced resolution to low res pics
Movies and television shows sometimes blur the line between what is real or possible and what is not—one example is where a major character asks a technician to zoom in on a part of a picture and then to enhance it, revealing details to help find someone or something. Unfortunately, this isn't possible in real life. If an image does not contain enough information, there is no way for a computer to fill in what is missing. But it can create a reasonable guess, apparently, as the team at Google Brain has found.
Faced with a blurry eight-by-eight grid of pixels that supposedly represent a human face and tasked to enhance it to make the person recognizable, a computer would be as helpless as a human. So the team at Google Brain chose to use neural networks to see if a computer could instead make a reasonable guess about what the person actually looked like. To allow it to make such a guess, they started with a conditioning network, in which the system attempted to map the image with other existing higher resolution images (of faces, in this case) in a database—those higher res images were downsized to see how closely they matched the eight-by-eight grid the system was analyzing. The system then picked the image it thought was the closest.
Next, an existing network called PixelCNN added details attempting to approximate the original—details came in the form of classes such as the general shape of a chin or nose. To add resolution, multiple classes were used that covered all the basic components of a face, or in this case, a bedroom. The third step involved combining what the two neural networks wrought and producing a new image, one that was hopefully a reasonable facsimile of the original.
The researchers offer several examples of what the system is capable of doing along with a comparative analysis they carried out that showed how accurate humans found the results.
We present a pixel recursive super resolution model that synthesizes realistic details into images while enhancing their resolution. A low resolution image may correspond to multiple plausible high resolution images, thus modeling the super resolution process with a pixel independent conditional model often results in averaging different details—hence blurry edges. By contrast, our model is able to represent a multimodal conditional distribution by properly modeling the statistical dependencies among the high resolution image pixels, conditioned on a low resolution input. We employ a PixelCNN architecture to define a strong prior over natural images and jointly optimize this prior with a deep conditioning convolutional network. Human evaluations indicate that samples from our proposed model look more photo realistic than a strong L2 regression baseline.
© 2017 Tech Xplore