July 15, 2021
Learning aids: New method helps train computer vision algorithms on limited data
Researchers from Skoltech have found a way to help computer vision algorithms process satellite images of the Earth more accurately, even with very limited data for training. This will make various remote sensing tasks easier for machines and, ultimately, for the people who use their data. The paper outlining the new results was published in the journal Remote Sensing.
Researchers have been using computer vision and machine learning techniques to help with environmental monitoring for a while now. Tasks that may seem tedious and prone to human error are normally a piece of cake for algorithms. But before a neural network can successfully, say, discriminate between the kinds of trees in a forested area, it needs to be trained, and therein lies a challenge.
Satellite images are not your average cell phone photos, which you can take by the dozen in a moment: There are only so many shots available per orbit, the resolution is limited, and clouds can always get in the way. So, getting enough well-labeled images to train a neural network can be a nuisance, and scientists and engineers have created workarounds in the form of image augmentation.
"While they are very powerful, neural networks demand a lot of training data to achieve top results. Unfortunately, in practical tasks, we usually don't have enough data. To overcome this issue, data scientists apply various techniques that artificially increase datasets. One of the most popular methods is called image augmentation. It transforms images to add variability," Sergei Nesteruk, Skoltech Ph.D. student and co-author of the paper, explains.
Skoltech Professor Ivan Oseledets and his colleagues developed an augmentation method called MixChannel for multispectral satellite images. This method is based on substituting bands from original images with the same bands from images of another date covering the same area.
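To make the band-substitution idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function name `mix_channel`, the `swap_prob` parameter, and the choice to swap each band independently are assumptions made for illustration. It also assumes the images from different dates are already co-registered, so that each pixel covers the same ground location.

```python
import numpy as np


def mix_channel(stack, base_idx, rng, swap_prob=0.5):
    """Return a copy of one image with some bands taken from other dates.

    stack:     array of shape (n_dates, n_bands, height, width) holding
               co-registered multispectral images of the same area.
    base_idx:  index of the date used as the starting image.
    rng:       a numpy.random.Generator.
    swap_prob: hypothetical parameter, the chance that a given band is
               replaced by the same band from another date.
    """
    n_dates, n_bands = stack.shape[:2]
    if n_dates < 2:
        raise ValueError("need images from at least two dates")

    augmented = stack[base_idx].copy()
    other_dates = [d for d in range(n_dates) if d != base_idx]

    for band in range(n_bands):
        if rng.random() < swap_prob:
            # Take the same spectral band from a randomly chosen other date.
            donor = rng.choice(other_dates)
            augmented[band] = stack[donor, band]
    return augmented


# Usage: six dates, four bands, 8x8 pixels of synthetic data.
rng = np.random.default_rng(0)
stack = rng.random((6, 4, 8, 8), dtype=np.float32)
sample = mix_channel(stack, base_idx=0, rng=rng)
```

The labels stay those of the base image, since every donor band shows the same territory. Only the spectral content varies from one augmented sample to the next.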
"It is easy to use image augmentation for generic RGB images. But multispectral data is very complicated, and there was no efficient way to augment it. MixChannel is the novel augmentation technique designed to work specifically with multispectral data," Svetlana Illarionova, another co-author of the paper and Skoltech Ph.D. student, says.
To test their approach, the team used Sentinel-2 satellite images of conifer and deciduous boreal forests in the Arkhangelsk region of northern European Russia to train a convolutional neural network to classify these forests. "A straightforward approach for training a CNN classification model is to take a set of available satellite images for a given territory during a period of active vegetation. The training set is constructed by taking a random patch of a large image.... However, if we test the obtained model on an image taken on a date that was not included in the training set, the accuracy can drop dramatically," the authors write.
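The straightforward setup the authors describe, and the way it is tested on an unseen date, can be sketched roughly as follows. This is an illustrative sketch only: the helper names `random_patch` and `split_by_date`, the per-pixel label map, and the patch size are assumptions, not details taken from the paper.

```python
import numpy as np


def random_patch(image, label_map, patch_size, rng):
    """Cut one random square patch from a large image and its labels.

    image:     array of shape (n_bands, height, width).
    label_map: array of shape (height, width), assumed here to hold one
               class label per pixel.
    """
    _, height, width = image.shape
    if patch_size > min(height, width):
        raise ValueError("patch is larger than the image")

    top = int(rng.integers(0, height - patch_size + 1))
    left = int(rng.integers(0, width - patch_size + 1))
    rows = slice(top, top + patch_size)
    cols = slice(left, left + patch_size)
    return image[:, rows, cols], label_map[rows, cols]


def split_by_date(n_dates, held_out):
    """Keep one acquisition date for testing and train on the rest."""
    train_dates = [d for d in range(n_dates) if d != held_out]
    return train_dates, [held_out]


# Usage with synthetic data: six dates, four bands, 64x64 pixels.
rng = np.random.default_rng(0)
images = rng.random((6, 4, 64, 64), dtype=np.float32)
labels = rng.integers(0, 2, size=(64, 64))
train_dates, test_dates = split_by_date(n_dates=6, held_out=5)
patch, patch_labels = random_patch(images[train_dates[0]], labels, 16, rng)
```

Holding out a whole date, rather than random patches from every date, is what exposes the accuracy drop mentioned in the quote.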
Since it is normally quite cloudy in the Arkhangelsk region, the number of satisfactory satellite images was severely limited—to just six, in fact. But despite the small sample size, the new approach outperformed state-of-the-art solutions when tested with three neural networks, and as the authors note, it can be combined with other augmentation methods for even more training data.
Other remote sensing tasks this approach can help with include various environmental studies and precision agriculture: essentially any setting with medium spatial resolution data and few available images. In further work, the scientists will extend the method to more land cover types and to larger areas with different environmental conditions.