September 19, 2018 feature

Fast object detection in videos using region-of-interest packing

by Ingrid Fadelli, Tech Xplore

Researchers at the Robert Bosch Center for Data Science and Artificial Intelligence and Center for Computational Brain Research, Indian Institute of Technology Madras, and Purdue University have recently developed a new method of reducing computational requirements for object detection in videos using neural networks. Their technique, called Pack and Detect (PaD), was outlined in a paper pre-published on arXiv.

Object detection is a key aspect of many computer vision applications, such as object tracking, video summarization, and video search. While recent advances in machine learning have led to the development of increasingly accurate tools for completing this task, existing methods are still computationally intensive. For instance, processing a video at 300 x 300 resolution and 30 fps using the SSD300 object detection network, with VGG16 as its backbone, requires 1.87 trillion floating point operations per second (FLOPS).
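To put that figure in per-frame terms, the short calculation below divides the quoted throughput by the frame rate. It is only a back-of-the-envelope sketch based on the two numbers given above; the function and constant names are illustrative.

```python
# Figures quoted in the article for SSD300 with a VGG16 backbone.
FLOPS_PER_SECOND = 1.87e12  # floating point operations per second at 30 fps
FRAMES_PER_SECOND = 30


def flops_per_frame(flops_per_second: float, fps: float) -> float:
    """Return the number of operations needed to process a single frame."""
    return flops_per_second / fps


per_frame = flops_per_frame(FLOPS_PER_SECOND, FRAMES_PER_SECOND)
print(f"{per_frame / 1e9:.1f} billion operations per frame")
```

Under these numbers, each full-size frame costs roughly 62 billion floating point operations.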

The researchers observed that in some cases, however, most regions in a video frame are merely background, with salient objects occupying only a small fraction of the area in the frame. In addition, they found that there is a strong temporal correlation between consecutive frames. They leveraged these observations and proposed a new technique for object detection in videos that could reduce computational requirements for object detection tasks.

"We were inspired by the foveal mechanism in both biological and artificial vision systems," Athindran Ramesh Kumar, one of the researchers who carried out the study, told TechXplore. "Previous efforts pertaining to the foveal attention mechanisms in artificial vision systems focus on only one region in the image or on one object at a time. We wondered how a vision system would be if it could focus on all salient regions in the scene at once."

The object detection method devised by the researchers is thus inspired by biological vision systems. Unlike previous attempts, however, their system packs all the regions of interest together into a single frame instead of processing them sequentially.

"The objective of our work was to speed-up object detection in videos by focusing only on the salient regions in the frame and eliminating the background clutter," Balaraman Ravindran, another researcher who carried out the study, told TechXplore. "For eliminating background clutter, we exploited the temporal correlation between adjacent frames in a video. This is a property that video compression techniques use to reduce the storage and bandwidth requirements; we use it to speed up computation."

PaD, the object detection method proposed by Ravindran and his colleagues, works by processing frames at regular intervals at full size. These frames are referred to as "anchor frames." In all other frames, the tool identifies regions of interest based on where objects were located in the previous frame.
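The scheduling idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `anchor_interval` and `margin` parameters, and the choice to grow each previous detection by a fixed pixel margin, are assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def is_anchor_frame(frame_index: int, anchor_interval: int) -> bool:
    """Anchor frames occur at regular intervals and are processed at full size."""
    return frame_index % anchor_interval == 0


def rois_from_previous(boxes: List[Box], margin: int,
                       width: int, height: int) -> List[Box]:
    """Grow each box found in the previous frame by `margin` pixels
    (a hypothetical allowance for object motion), clipped to the frame."""
    rois = []
    for x0, y0, x1, y1 in boxes:
        rois.append((max(0, x0 - margin), max(0, y0 - margin),
                     min(width, x1 + margin), min(height, y1 + margin)))
    return rois


# Two detections from the previous frame of a 300 x 300 video.
previous = [(100, 120, 160, 200), (0, 10, 30, 40)]
rois = rois_from_previous(previous, margin=20, width=300, height=300)
print(rois)  # [(80, 100, 180, 220), (0, 0, 50, 60)]
```

Because consecutive frames are strongly correlated, an object is likely to remain inside such an enlarged region in the next frame; anchor frames give the detector a chance to pick up objects that newly enter the scene.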

"These regions of interest are arranged together like in a collage, which is used as input for the object detector," Anand Raghunathan, one of the researchers who carried out the study, told TechXplore. "The detections are then mapped back to the locations in the original image. This method is faster because the collage images are of smaller size than the full frames. We leverage the flexibility of popular object detectors such as SSD300 to process images at both full size and smaller sizes."
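The collage step and the mapping back to the original frame can be illustrated with a simple coordinate-only sketch. It assumes a greedy row-by-row packing and no rescaling of regions; the packing algorithm actually used in the paper is not described here, so `pack_rois`, `map_back` and `collage_width` are hypothetical names chosen for this example.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


class Placement(NamedTuple):
    roi: Box                 # region in the original frame
    offset: Tuple[int, int]  # top-left corner of that region in the collage


def pack_rois(rois: List[Box], collage_width: int) -> Tuple[List[Placement], int]:
    """Place regions left to right, starting a new row when one is full.
    Returns the placements and the collage height that was used."""
    placements = []
    x = y = row_height = 0
    for roi in rois:
        w, h = roi[2] - roi[0], roi[3] - roi[1]
        if w > collage_width:
            raise ValueError("region is wider than the collage")
        if x + w > collage_width:  # current row is full, start a new one
            x, y, row_height = 0, y + row_height, 0
        placements.append(Placement(roi, (x, y)))
        x += w
        row_height = max(row_height, h)
    return placements, y + row_height


def map_back(det: Box, placements: List[Placement]) -> Box:
    """Translate a detection from collage coordinates to frame coordinates,
    using the packed region that contains the detection's center."""
    cx, cy = (det[0] + det[2]) / 2, (det[1] + det[3]) / 2
    for p in placements:
        w, h = p.roi[2] - p.roi[0], p.roi[3] - p.roi[1]
        ox, oy = p.offset
        if ox <= cx < ox + w and oy <= cy < oy + h:
            dx, dy = p.roi[0] - ox, p.roi[1] - oy
            return (det[0] + dx, det[1] + dy, det[2] + dx, det[3] + dy)
    raise ValueError("detection does not fall inside any packed region")


rois = [(80, 100, 180, 220), (200, 50, 260, 110)]
placements, collage_height = pack_rois(rois, collage_width=150)
print(collage_height)                                # 180
print(map_back((10, 130, 50, 170), placements))      # (210, 60, 250, 100)
```

In this toy case the two regions fit in a 150 x 180 collage, less than a third of the pixels of a 300 x 300 frame, which is where the savings come from.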

The researchers evaluated their method on the ImageNet VID dataset and found that it sped up object detection by a factor of 1.25, with a drop in accuracy of less than 1.6 percent. In addition, they observed that the smaller collage frames took almost three times less time to process than full-size frames, with the FLOP count reduced by a factor of four.
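The two speed figures can be related with a rough calculation. The sketch below assumes that every frame is either processed at full size or as a collage that runs about three times faster, and it ignores the cost of building the collage; the resulting fraction is an illustration under those assumptions, not a number reported by the researchers.

```python
def overall_speedup(packed_fraction: float, packed_speedup: float) -> float:
    """Average speed-up when `packed_fraction` of the frames run
    `packed_speedup` times faster and the rest run at normal speed."""
    return 1.0 / ((1.0 - packed_fraction) + packed_fraction / packed_speedup)


def packed_fraction_for(target: float, packed_speedup: float) -> float:
    """Invert the formula above: which fraction of frames must be packed
    to reach the `target` overall speed-up?"""
    return (1.0 - 1.0 / target) / (1.0 - 1.0 / packed_speedup)


fraction = packed_fraction_for(target=1.25, packed_speedup=3.0)
print(f"{fraction:.2f}")
```

Under these simplifying assumptions, an overall speed-up of 1.25 would correspond to roughly three in ten frames being processed as collages, which shows why the overall gain is much smaller than the per-frame gain.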

Their study also highlighted two important aspects that could inform the development of faster and less computationally intensive methods of detecting objects in videos. First, objects of interest generally occupy only a small fraction of the pixels in a frame; second, adjacent frames in a video are strongly correlated.

"Our work can help make video analytics possible on resource-constrained devices at the edge of the Internet of Things by reducing computational requirements, or may improve the number of video streams that may be processed by a server in the cloud," Athindran said.

The study carried out by this team of researchers is an initial step toward the development of more effective object detection tools. They are now planning further investigations that could improve their method.

For instance, PaD currently selects anchor frames at regular intervals, but the researchers could develop a mechanism that identifies these key frames dynamically. They also plan to test their technique on more resource-constrained hardware, such as smartphones, wearable devices and smart home appliances.

"We handcrafted an algorithm to infer the regions of interest and form a collage image," Ravindran said. "But a fully neural system would have neural networks that generate the collage image based on the previous frame. This is a more ambitious line of future work."

More information: Pack and Detect: Fast object detection in videos using region-of-interest packing. arXiv:1809.01701v1 [cs.CV]. arxiv.org/abs/1809.01701

Citation: Fast object detection in videos using region-of-interest packing (2018, September 19) retrieved 19 April 2024 from https://techxplore.com/news/2018-09-fast-videos-region-of-interest.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
