Kaolin applications. Credit: Murthy Jatavallabhula et al.

As most real-world environments are three-dimensional, deep learning models designed to analyze videos or complete tasks in real-world environments should ideally be trained on 3-D data. Technological tools such as robots, self-driving vehicles, smartphones, and other devices are currently generating a growing amount of 3-D data that could eventually be processed by deep learning algorithms.

Until now, however, training on this vast amount of 3-D data has been relatively difficult, as the necessary tools and platforms are only accessible to some artificial intelligence (AI) researchers. To address this lack of readily available tools, a team of researchers at NVIDIA has recently created Kaolin, an open-source PyTorch library aimed at advancing and facilitating 3-D deep learning research.

"Currently, there is not a single open-source software library that supports multiple representations of 3-D data, multiple tasks, and evaluation criteria," Krishna Murthy Jatavallabhula, one of the researchers who carried out the study, told TechXplore. "We decided to address this gap in the literature by creating Kaolin, the first comprehensive 3-D deep learning library."

Kaolin, the PyTorch library presented by Jatavallabhula and his colleagues, contains a variety of efficient, easy-to-use tools for constructing deep learning architectures that can analyze 3-D data. It also allows researchers to load, preprocess, and manipulate 3-D data before it is used to train deep learning algorithms.
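
To give a sense of what manipulating 3-D data can involve, here is a minimal sketch of one common preprocessing step: converting a point cloud into a voxel occupancy grid. It is written in plain Python for illustration only and does not use or reproduce Kaolin's API. The function name `voxelize` and its `resolution` parameter are hypothetical, and a real pipeline would typically run this as batched tensor operations on a GPU.

```python
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Point], resolution: int) -> Set[Voxel]:
    """Return the occupied cells of a resolution^3 grid covering the point cloud.

    Hypothetical helper for illustration; not part of any library.
    """
    pts = list(points)
    if not pts:
        return set()

    # Axis-aligned bounding box of the cloud.
    mins = [min(p[i] for p in pts) for i in range(3)]
    maxs = [max(p[i] for p in pts) for i in range(3)]

    # One uniform scale for all axes so the shape is not stretched.
    # A degenerate cloud (all points identical) falls back to an extent of 1.
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0

    occupied: Set[Voxel] = set()
    for p in pts:
        # Map each coordinate to a cell index, clamping the upper boundary
        # so that points on the far face land in the last cell.
        cell = tuple(
            min(int((p[i] - mins[i]) / extent * resolution), resolution - 1)
            for i in range(3)
        )
        occupied.add(cell)
    return occupied
```

Calling `voxelize(cloud, 32)` on a list of `(x, y, z)` tuples returns a set of integer cell indices. A sparse set is used here for simplicity; a dense array of zeros and ones is the more usual input to a 3-D convolutional network.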

Kaolin includes several graphics modules to edit 3-D images, with functions such as rendering, lighting, shading and view warping. Moreover, it supports a wide range of loss functions and evaluation metrics, allowing researchers to easily evaluate their deep learning algorithms.
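
As an illustration of the kind of loss function and evaluation metric used in 3-D deep learning, the sketch below computes a Chamfer distance between two point clouds, a measure widely used to compare a predicted shape with a reference shape. It is a plain-Python, brute-force version written for this article and is not Kaolin's implementation. The function names are hypothetical, and the choice of squared distances averaged over each cloud is one of several conventions in use.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _mean_nearest_sq(src: Sequence[Point], dst: Sequence[Point]) -> float:
    """Average, over src, of the squared distance to the nearest point in dst."""
    total = 0.0
    for p in src:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in dst)
    return total / len(src)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (squared-distance convention).

    Hypothetical helper for illustration; O(len(a) * len(b)) time.
    """
    if not a or not b:
        raise ValueError("both point clouds must be non-empty")
    # Sum the two directed terms so the result is symmetric in a and b.
    return _mean_nearest_sq(a, b) + _mean_nearest_sq(b, a)
```

The value is zero when the two clouds contain the same points and grows as the shapes diverge, which is what makes it usable both as a training loss and as an evaluation score.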


"Typically, 3-D deep learning researchers need to write a lot of boilerplate code for their research projects," Jatavallabhula explained. "With Kaolin, however, researchers only need to implement the novel parts of their project, as Kaolin packages a comprehensive set of utilities for data loading, conversion and evaluation."

Kaolin is a valuable tool both for developers experienced in building deep learning models and for those who are just starting out. The library also includes several state-of-the-art architectures that developers can use as a starting point or as a source of inspiration for their own models.

"While active 3-D deep learning researchers view Kaolin as a means to accelerate their research, newcomers into this field are turning to Kaolin for an idea of where to begin," Jatavallabhula said.

In the future, the open-source library presented by these researchers at NVIDIA could help to accelerate 3-D deep learning research, assisting developers in creating new AI architectures, as well as in training and evaluating them. Meanwhile, Jatavallabhula and his colleagues are planning to work on extending Kaolin and enhancing its capabilities further.

"Our plan is to add more deep learning models to our zoo (collection of AI models) and expand our coverage to a broader set of applications like self-driving cars and embodied agents needing 3-D learning," Jatavallabhula said. "In short, we plan on making Kaolin a one-stop platform for 3-D deep learning research."

More information: Kaolin: a PyTorch library for accelerating 3D deep learning research. arXiv:1911.05063 [cs.CV]. arxiv.org/abs/1911.05063

github.com/NVIDIAGameWorks/kaolin/