Human game observers are a vital part of the Esports industry. They use extensive domain knowledge to decide what to show to the spectators. However, they may miss important events, which creates a need for automatic observers. Researchers from South Korea have recently proposed a framework that uses an object detection method, Mask R-CNN, and human observational data to find the ‘Region of Common Interest’ in StarCraft, a real-time strategy game.
Esports, already a billion-dollar industry, is growing, partly because of human game observers. They control the camera movement and show spectators the most engaging portions of the game screen. However, these observers might miss significant events occurring concurrently across multiple screens, and small tournaments often cannot afford them. Consequently, the demand for automatic observers has grown. Artificial observing methods can be either rule-based or learning-based. Both predefine events and their importance, which requires extensive domain knowledge. Moreover, they cannot capture undefined events or discern changes in the significance of events.
Recently, researchers from South Korea, led by Dr. Kyung-Jong Kim, Associate Professor at the Gwangju Institute of Science and Technology, have proposed an approach to overcome these problems. “We have created an automatic observer using object detection algorithm, Mask R-CNN, to learn human spectating data,” explains Dr. Kim. Their findings were made available online on 10 October 2022 and published in Volume 213, Part B of the journal Expert Systems with Applications.
The novelty lies in defining the object as the two-dimensional spatial area viewed by the spectator. In contrast, conventional object detection treats a single unit, for instance, a worker or a building, as the object. In this study, the researchers first collected StarCraft in-game human observation data from 25 participants. Next, the viewports (the areas viewed by the spectator) were identified and labeled as “one.” The rest of the screen was filled with “zeroes.” The in-game features served as the input data, while the human observations constituted the target information.
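The labeling step described above can be illustrated with a minimal sketch in Python. It is not the researchers' code: the function name viewport_mask, the grid dimensions, and the (left, top, width, height) viewport layout are all assumptions made for this example. It only shows the idea of marking viewed cells with ones and everything else with zeroes.

```python
from typing import List, Tuple

# Hypothetical viewport layout for this sketch: (left, top, width, height) in map cells.
Viewport = Tuple[int, int, int, int]


def viewport_mask(map_w: int, map_h: int, viewports: List[Viewport]) -> List[List[int]]:
    """Return a map_h x map_w grid holding 1 inside any viewport and 0 elsewhere."""
    mask = [[0] * map_w for _ in range(map_h)]
    for left, top, width, height in viewports:
        # Clip the rectangle to the map so parts that fall off the map are ignored.
        x0, y0 = max(left, 0), max(top, 0)
        x1, y1 = min(left + width, map_w), min(top + height, map_h)
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 1
    return mask


# One spectator viewport, 3 cells wide and 2 cells tall, on a small 8 x 6 map.
mask = viewport_mask(8, 6, [(2, 1, 3, 2)])
for row in mask:
    print("".join(str(cell) for cell in row))
```

In a setup like this, a grid of in-game features would be the input and a mask of this kind would be the target that the network learns to predict.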
The researchers then fed the data into a convolutional neural network (CNN), which learnt the patterns of the viewports to find the “region of common interest” (ROCI), the most exciting area for the spectators to watch. They then compared the ROCI Mask R-CNN approach with other existing methods quantitatively and qualitatively. The quantitative evaluation showed that the CNN’s predicted viewports were similar to the collected human observational data. Additionally, the ROCI-based method outperformed the others in the long run during the generalization test, which involved different matchup races, starting locations, and playing maps. The proposed observer was able to capture the scenes of interest to humans, whereas behavior cloning, an imitation learning technique, could not.
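One common way to quantify how similar a predicted viewport is to a human one is intersection over union (IoU) of the two binary masks. The article does not say which metric the researchers used, so the sketch below is only an assumed illustration; the function name mask_iou and the toy grids are hypothetical.

```python
from typing import List

Grid = List[List[int]]


def mask_iou(predicted: Grid, observed: Grid) -> float:
    """Intersection over union of two equally sized grids of 0s and 1s."""
    same_rows = len(predicted) == len(observed)
    if not same_rows or any(len(p) != len(o) for p, o in zip(predicted, observed)):
        raise ValueError("grids must have the same shape")
    intersection = 0
    union = 0
    for p_row, o_row in zip(predicted, observed):
        for p, o in zip(p_row, o_row):
            if p and o:
                intersection += 1
            if p or o:
                union += 1
    # Two empty masks are treated as identical (a choice made for this sketch).
    return intersection / union if union else 1.0


# Toy example: the predicted viewport is shifted one cell left of the human one.
predicted = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
observed = [[0, 0, 1, 1],
            [0, 0, 1, 1]]
score = mask_iou(predicted, observed)
print(round(score, 3))
```

A score of 1 would mean the two viewports coincide exactly, and a score of 0 would mean they do not overlap at all.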
Dr. Kim points out the future applications of their work. “The framework can be applied to other games representing some of the overall game state, not only StarCraft. As services such as multi-screen transmission continue to grow in Esports, the proposed automatic observer will play a role in these deliverables. It will also be actively used in additional content developed in the future.”