An artificial observer will interact with commentators while analyzing real-time in-game state for the best spectator experience. Credit: Kyung-Joong Kim

Esports, already a billion-dollar industry, is growing, partly because of human game observers. They control the camera movement and show spectators the most engaging portions of the game screen. However, these observers might miss significant events occurring concurrently across multiple screens. They are also difficult to afford in small tournaments.

Consequently, the demand for automatic observers has grown. Artificial observing methods can be either rule-based or learning-based. Both predefine events and their importance, which requires extensive domain knowledge. Moreover, they cannot capture undefined events or detect changes in the significance of events.

Recently, researchers from South Korea, led by Dr. Kyung-Joong Kim, Associate Professor at the Gwangju Institute of Science and Technology, have proposed an approach to overcome these problems. "We have created an automatic observer using object detection algorithm, Mask R-CNN, to learn human spectating data," explains Dr. Kim. Their findings were made available online on October 10, 2022, in the journal Expert Systems with Applications.

The novelty lies in defining the object as the two-dimensional spatial area viewed by the spectator. In contrast, conventional object detection treats a single unit, for instance, a worker or a building, as the object. In this study, the researchers first collected StarCraft in-game human observation data from 25 participants.

Next, the viewports (the areas viewed by the spectator) were identified and labeled as "one." The rest of the screen was filled with "zeroes." The in-game features served as the input data, while the human observations constituted the target information.
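The labeling step described above can be illustrated with a minimal sketch. The grid size, the `(top, left, height, width)` rectangle convention, and the function name `viewport_mask` are assumptions made for illustration; the article does not describe the researchers' actual data format.

```python
def viewport_mask(map_height, map_width, top, left, view_height, view_width):
    """Return a map-sized grid with 1 inside the viewport and 0 elsewhere."""
    # Clip the viewport rectangle so it never extends past the map edges.
    row_start = max(0, top)
    row_end = min(map_height, top + view_height)
    col_start = max(0, left)
    col_end = min(map_width, left + view_width)

    # Start from an all-zero grid, then mark the viewed cells with ones.
    mask = [[0] * map_width for _ in range(map_height)]
    for r in range(row_start, row_end):
        for c in range(col_start, col_end):
            mask[r][c] = 1
    return mask


# Hypothetical 6x8 map with a 3x4 viewport whose top-left corner is at (1, 2).
mask = viewport_mask(6, 8, top=1, left=2, view_height=3, view_width=4)
for row in mask:
    print("".join(str(v) for v in row))
```

In a training setup of this kind, a grid like `mask` would play the role of the target, paired with in-game feature maps of the same spatial size as the input.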

The researchers then fed the data into a convolutional neural network (CNN), which learned the patterns of the viewports to find the "region of common interest" (ROCI), the most exciting area for spectators to watch. They then compared the ROCI Mask R-CNN approach with existing methods, both quantitatively and qualitatively.
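To give a rough intuition for what a "region of common interest" could mean, the sketch below counts how many spectators' viewports cover each map cell and reports the most-watched cells. This is only an illustrative counting exercise under assumed inputs, not the Mask R-CNN model used in the study; the function names and the `(top, left, height, width)` viewport format are hypothetical.

```python
def overlap_heatmap(map_height, map_width, viewports):
    """Count, per cell, how many viewports (top, left, height, width) cover it."""
    heat = [[0] * map_width for _ in range(map_height)]
    for top, left, height, width in viewports:
        # Clip each viewport to the map before counting its cells.
        for r in range(max(0, top), min(map_height, top + height)):
            for c in range(max(0, left), min(map_width, left + width)):
                heat[r][c] += 1
    return heat


def most_watched_cells(heat):
    """Return the highest overlap count and the cells that reach it."""
    peak = max(max(row) for row in heat)
    cells = [
        (r, c)
        for r, row in enumerate(heat)
        for c, count in enumerate(row)
        if count == peak
    ]
    return peak, cells


# Three hypothetical spectators looking at partly overlapping areas of a 6x8 map.
viewports = [(0, 0, 3, 4), (1, 2, 3, 4), (2, 3, 3, 4)]
heat = overlap_heatmap(6, 8, viewports)
peak, cells = most_watched_cells(heat)
print(peak, cells)
```

A learned model would predict such a region from in-game features alone, without access to the spectators' viewports at inference time.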

The quantitative evaluation showed that the CNN's predicted viewports were similar to the collected human observation data. Additionally, the ROCI-based method outperformed the others in the long run during the generalization test, which involved different matchup races, starting locations, and playing maps. The proposed observer was able to capture the scenes of interest to humans, something that behavior cloning, an imitation learning technique, could not do.
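The article does not say which similarity measure was used in the comparison. As an assumption, the sketch below uses intersection over union (IoU), a common way to compare two rectangles in object detection, to score how closely a predicted viewport matches a human one. The function name and the `(top, left, height, width)` format are hypothetical.

```python
def viewport_iou(a, b):
    """Intersection over union of two rectangles given as (top, left, height, width)."""
    a_top, a_left, a_h, a_w = a
    b_top, b_left, b_h, b_w = b

    # Overlap extent along each axis; zero when the rectangles do not meet.
    inter_h = max(0, min(a_top + a_h, b_top + b_h) - max(a_top, b_top))
    inter_w = max(0, min(a_left + a_w, b_left + b_w) - max(a_left, b_left))
    inter = inter_h * inter_w

    union = a_h * a_w + b_h * b_w - inter
    return inter / union if union else 0.0


# Hypothetical predicted and human viewports, offset horizontally by two cells.
predicted = (0, 0, 4, 4)
human = (0, 2, 4, 4)
iou = viewport_iou(predicted, human)
print(round(iou, 3))
```

A score of 1.0 means the two viewports coincide exactly, and 0.0 means they do not overlap at all.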

Dr. Kim points out the future applications of their work. "The framework can be applied to other games representing some of the overall game state, not only StarCraft. As services such as multi-screen transmission continue to grow in Esports, the proposed automatic observer will play a role in these deliverables. It will also be actively used in additional content developed in the future."

More information: Ho-Taek Joo et al, Learning to automatically spectate games for Esports using object detection mechanism, Expert Systems with Applications (2022). DOI: 10.1016/j.eswa.2022.118979

Provided by GIST (Gwangju Institute of Science and Technology)