November 7, 2018 feature

Object detection in 4K and 8K video using GPUs

by Ingrid Fadelli, Tech Xplore

Researchers at Carnegie Mellon University have recently developed a new model that enables fast and accurate object detection in high-resolution 4K and 8K video footage using GPUs. Their attention pipeline method evaluates every image or video frame in two stages, first at a coarse resolution and then at a refined one, which limits the total number of evaluations necessary.

In recent years, machine learning has attained remarkable results in computer vision tasks, including object detection. However, most object recognition models perform best on images with a relatively low resolution. As the resolution of recording devices rapidly improves, there is a rising need for tools that can process high-resolution data.

"We were interested in finding and overcoming the limitations of current approaches," Vít Růžička, one of the researchers who carried out the study told TechXplore. "While plenty of data sources record in high resolution, current state-of-the-art object detection models, such as YOLO, Faster RCNN, SSD, etc., work with images that have a relatively low resolution of approximately 608 x 608 px. Our main objective was to scale the object detection task to 4K-8K videos (up to 7680 x 4320 px) while maintaining high processing speed. We also wanted to understand if and by how much we can benefit from high resolution compared to using low-resolution images, in terms of accuracy of the models."

The attention pipeline proposed by Růžička and his colleague Franz Franchetti divides the task of object detection into two stages. In both stages, the researchers subdivided the original image by overlaying it with a regular grid and then applied the YOLO v2 model for fast object detection.
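To make the grid subdivision concrete, here is a minimal Python sketch of how a frame could be covered with square crops. It is an illustration only, not the researchers' code: the function name grid_crops, the 608 px crop size (borrowed from the input size Růžička mentions for current detectors), and the choice to space tiles evenly so that edge tiles overlap their neighbours are all assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def grid_crops(width: int, height: int, crop: int = 608) -> List[Box]:
    """Cover a width x height frame with a regular grid of crop x crop tiles.

    Assumption for this sketch: tiles are spaced evenly between the frame
    edges, so neighbouring tiles overlap slightly whenever the frame size is
    not an exact multiple of the crop size.
    """
    if crop > width or crop > height:
        raise ValueError("crop must fit inside the frame")

    def starts(length: int) -> List[int]:
        count = -(-length // crop)  # ceiling division: tiles needed per axis
        if count == 1:
            return [0]
        last = length - crop  # start offset of the final tile
        return [(i * last) // (count - 1) for i in range(count)]

    return [(x, y, x + crop, y + crop)
            for y in starts(height)
            for x in starts(width)]


crops_4k = grid_crops(3840, 2160)
crops_8k = grid_crops(7680, 4320)
print(len(crops_4k), len(crops_8k))  # 28 104
```

Under these assumptions, a single 8K frame produces 104 crops, which gives a sense of why evaluating every crop at full resolution is costly and why the crops are spread across several workers.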

"We create many small rectangular crops, which can be processed by YOLO v2 on several server workers, in a parallel manner," Růžička explained. "The first stage looks at the image downscaled into lower resolution and performs a fast object detection to get rough bounding boxes. The second stage uses these bounding boxes as an attention map to decide where we need to check the image under high-resolution. Therefore, when some areas of the image don't contain any object of interest, we can save on processing them under high resolution."

The researchers implemented their model in code, distributing its work across GPUs. They were able to maintain high accuracy while reaching an average performance of three to six fps on 4K videos and two fps on 8K videos. Their method yielded significant benefits: the measured average precision on the tested dataset increased from 33.6 AP₅₀ to 74.3 AP₅₀ when images were processed in high resolution rather than down-scaled to low resolution, which is how YOLO v2 generally works.

"Our method reduced the time necessary to process high-resolution images by approximately 20 percent, compared to processing every part of the original image under high resolution," Růžička said. "The practical implication of this is that near real-time 4K video processing is feasible. Our method also requires a lower number of server workers to complete this task."

Despite the very promising results attained by this new object detection method, the use of a regular grid overlaying the original image can give rise to a number of issues. For instance, it can sometimes result in detected objects being cut in half, which requires a post-processing step on the detected bounding boxes. Růžička and Franchetti are currently exploring ways of addressing and circumventing these problems to improve their model further.
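As a rough illustration of what such a post-processing step might look like, the Python sketch below merges same-label boxes that touch or overlap, as happens when one object is split by a crop border. This is a simple heuristic chosen for illustration, not the procedure used by Růžička and Franchetti; the function names and the gap tolerance are assumptions, and the heuristic would also merge two distinct objects of the same class that happen to sit side by side.

```python
from typing import List, Tuple

# (label, left, top, right, bottom) in full-frame pixel coordinates
Detection = Tuple[str, int, int, int, int]


def near(a: Detection, b: Detection, gap: int) -> bool:
    """Same label, and the boxes overlap or lie within `gap` pixels."""
    return (a[0] == b[0]
            and a[1] <= b[3] + gap and b[1] <= a[3] + gap
            and a[2] <= b[4] + gap and b[2] <= a[4] + gap)


def merge_split_detections(dets: List[Detection], gap: int = 2) -> List[Detection]:
    """Repeatedly fuse pairs of nearby same-label boxes into their union."""
    merged = list(dets)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                a, b = merged[i], merged[j]
                if near(a, b, gap):
                    merged[i] = (a[0],
                                 min(a[1], b[1]), min(a[2], b[2]),
                                 max(a[3], b[3]), max(a[4], b[4]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan after every merge
    return merged


# A person cut in two by a vertical crop border at x = 608, plus a car.
detections = [
    ("person", 500, 100, 608, 300),
    ("person", 608, 105, 700, 298),
    ("car", 900, 900, 1000, 1000),
]
result = merge_split_detections(detections)
print(result)
```

Here the two halves of the person are fused into a single box spanning both crops, while the car is left untouched.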

More information: Fast and accurate object detection in high resolution 4K and 8K video using GPUs. arXiv:1810.10551v1 [cs.CV]. arxiv.org/pdf/1810.10551.pdf

www.researchgate.net/publicati … d_8K_video_using_GPU

www.youtube.com/watch?v=07wCxSItnAk

www.researchgate.net/publicati … tation_for_HPEC_2018

Citation: Object detection in 4K and 8K video using GPUs (2018, November 7) retrieved 17 July 2024 from https://techxplore.com/news/2018-11-4k-8k-video-gpus.html

