May 23, 2024

System extracts spoken language from video recording, converts it to searchable text

by David Bradley, Inderscience


A team in South Korea has developed a new approach to searching through video content. The system, described in the International Journal of Computational Vision and Robotics, extracts the spoken word from a video recording, converts it to text, and then makes that text searchable. Importantly, the system does not rely on embedded keywords, curated tags, or hashtags associated with the video content.

The approach relies on the dialogue or spoken commentary corresponding to the scenes in the video that users might wish to find. It is, of course, superfluous if the video already has subtitles baked in. Nevertheless, it could be a boon for users wishing to search the millions of hours of video available in databases, on streaming services, and elsewhere on the internet, and it could also help with cataloguing videos.

Kitae Hwang, In Hwan Jung, and Jae Moon Lee of the School of Computer Engineering at Hansung University in Seoul have developed an Android app for use with compatible smartphones. It is worth noting, however, that there is at least one other app with the same name, so if this app is released in the Google Play Store, it is likely to need a different name.

The new app works by extracting audio from videos using FFmpeg and processing it in 10-second increments. Speech recognition technology then generates a transcription of each audio segment, and the transcriptions are indexed against the video timeline. This, the team explains, creates a searchable timeline for the video.
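To make the segmentation step concrete, here is a minimal Python sketch of how audio might be cut into 10-second pieces with FFmpeg and how those pieces map onto the video timeline. It is not the team's Android code: the function names, the output file pattern, and the mono 16 kHz audio settings are assumptions chosen for illustration, and the command is only built here, not run.

```python
from typing import List, Tuple

SEGMENT_SECONDS = 10  # increment length reported in the article


def ffmpeg_segment_command(video_path: str, out_pattern: str) -> List[str]:
    """Build an FFmpeg command that discards the video stream and writes
    the audio as consecutive fixed-length WAV files (hypothetical helper)."""
    return [
        "ffmpeg",
        "-i", video_path,
        "-vn",               # drop the video stream, keep audio only
        "-ac", "1",          # mono: an assumption, common for speech recognition
        "-ar", "16000",      # 16 kHz sample rate: also an assumption
        "-f", "segment",     # FFmpeg's segment muxer
        "-segment_time", str(SEGMENT_SECONDS),
        out_pattern,         # for example "chunk_%04d.wav"
    ]


def segment_windows(duration_s: float) -> List[Tuple[float, float]]:
    """Return the (start, end) time in seconds of each segment on the timeline."""
    windows = []
    start = 0.0
    while start < duration_s:
        end = min(start + SEGMENT_SECONDS, duration_s)
        windows.append((start, end))
        start = end
    return windows
```

Under these assumptions, `segment_windows(20 * 60)` yields 120 windows for a 20-minute video, each of which would be passed to a speech recogniser and stored with its start time.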

For a 20-minute video, the process is complete in just two to three minutes and runs in the background while the video plays. The team points out that users can then search for specific terms and find all mentions in the video.
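The search step can be pictured as a lookup over timestamped transcript segments. The sketch below is an illustration only, not the app's implementation: the `Segment` type, the function names, and the sample transcript are all made up, and matching is a simple case-insensitive substring test.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start_s: int  # offset of the segment on the video timeline, in seconds
    text: str     # transcription of that segment


def search(segments: List[Segment], term: str) -> List[int]:
    """Return the start offset of every segment whose text mentions the term."""
    needle = term.casefold()
    return [s.start_s for s in segments if needle in s.text.casefold()]


def format_offset(seconds: int) -> str:
    """Format a timeline offset as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


# Invented transcript for illustration; real text would come from speech recognition.
transcript = [
    Segment(0, "Welcome to the lecture on graph algorithms"),
    Segment(10, "Today we cover maximum flow"),
    Segment(20, "A flow network has a source and a sink"),
]

hits = search(transcript, "Flow")
print([format_offset(h) for h in hits])  # prints ['00:10', '00:20']
```

One limitation of this simple version is that a phrase split across two adjacent segments would not be found; a fuller implementation might also search the text of neighbouring segments joined together.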

The app will have applications in education, news analysis, and other information-dense video where quick access to specific information is needed. For instance, students reviewing lecture recordings or journalists searching for specific statements in interviews could make use of this app. There are many more scenarios where it would be useful to be able to search video in this manner.

More information: Kitae Hwang et al, An implementation of searchable video player, International Journal of Computational Vision and Robotics (2024). DOI: 10.1504/IJCVR.2024.138324

Provided by Inderscience

Citation: System extracts spoken language from video recording, converts it to searchable text (2024, May 23) retrieved 29 June 2024 from https://techxplore.com/news/2024-05-spoken-language-video-searchable-text.html

