This Digital Hand Enables Hands-free Virtual Reality

University of Michigan

More than just a stand-in, the AI-powered agent can complete tasks by following simple voice commands that don't include nitty-gritty details.

Study: HandProxy: Expanding the Affordances of Speech Interfaces in Immersive Environments with a Virtual Proxy Hand (DOI: 10.1145/3749484)

A digital, voice-controlled hand could improve the convenience and accessibility of virtual and augmented reality by enabling hands-free use of games and apps. The prototype software was developed by computer scientists at the University of Michigan.

The researchers' software, called HandProxy, allows VR and AR users to interact with digital spaces by commanding a disembodied hand. Users can ask the hand to grab and move virtual objects, drag and resize windows, and perform gestures, such as a thumbs up. It can even manage complex tasks, such as "clear the table," without being told every in-between step, thanks to the interpretive power of GPT-4o, the AI model behind ChatGPT.

The hand's ability to independently parse complex tasks on the fly makes it more flexible than current VR voice-command features, which are limited to simple, system-level tasks, such as opening and scrolling through menus, or predefined commands within an app or game.

"Mobile devices have supported assistive technologies that enable alternative input modes and automated user-interface control, including AI-powered task assistants like Siri. But such capabilities are largely absent in VR and AR hand interactions," said Anhong Guo, the Morris Wellman Faculty Development Assistant Professor of Computer Science and Engineering.

"HandProxy is our attempt to enable users to fluidly transition between multiple modes of interaction in virtual and augmented reality, including controllers, hand gestures, and speech," said Guo, who is also the corresponding author of a study describing the software, published in Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies.

Enthusiasts praise VR for its immersion. Users want to be inside a virtual space, not just viewing it from the outside. The benefits, they claim, range from making games more exciting to training doctors and surgeons without risking lives.

[Photo: A researcher wearing a VR headset sits beside a colleague while a wall display shows a white, disembodied virtual hand hovering over a digital basket containing an apple.]
Yuxuan Liu (left, with headset), a doctoral student in computer science and engineering, and Chen Liang (right), another doctoral student in computer science and engineering, demonstrate how HandProxy follows voice commands inside a demo app. Image credit: Marcin Szczepanski, Michigan Engineering

Maximizing physical realism is key for suspending disbelief, so the industry has moved toward tactile control with hand-tracking cameras and gloves. But the focus on lifelike hand motions isn't ideal for every person or situation. VR users in cramped spaces might not have room for complicated gestures, and AR users may want to navigate small displays while their hands are full with cooking or cleaning.

A strict reliance on hand gestures becomes even more cumbersome for users who have motor impairments or other disabilities. People with muscular dystrophy and cerebral palsy have difficulty using VR, Scientific American reports. Tactile motions can dissuade some users with chronic illness from even trying VR. One Redditor shared that a chronic illness prevents them from enjoying games with repetitive swinging motions, and they were skeptical that VR would be right for them. HandProxy could help make VR more comfortable and approachable.

"If there is any built-in physics, which is true for most games and VR apps, HandProxy can interact with it," said Chen Liang, U-M doctoral student in computer science and engineering and the first author of the study. "Our virtual hand gives the same digital signal as the user's hand, so developers don't have to deliberately add something into their programs just for our system."
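Liang's point can be sketched in a few lines of code. The sketch below is purely illustrative, not from the HandProxy codebase: every name in it (`HandEvent`, `plan_actions`, `ProxyHand`) is a hypothetical stand-in. It shows the general idea that a voice command is planned into a sequence of synthetic hand events, which are then fed into the same input channel an app already uses for real hand tracking, so the app needs no changes. In the actual system, the planning step is handled by GPT-4o rather than the keyword lookup used here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch; names and structure are assumptions for illustration,
# not the HandProxy implementation.

@dataclass
class HandEvent:
    """A synthetic hand-tracking event, shaped like real tracker output."""
    kind: str                     # e.g. "move", "pinch", "release"
    target: Optional[str] = None  # object the action applies to, if any

def plan_actions(command: str) -> list:
    """Map a simple voice command to a sequence of hand events.
    A real system would delegate this step to a language model."""
    verbs = {
        "grab": [HandEvent("move"), HandEvent("pinch")],
        "drop": [HandEvent("release")],
    }
    words = command.lower().split()
    events = []
    for verb, steps in verbs.items():
        if verb in words:
            # Crude heuristic: treat the last word as the target object.
            target = words[-1] if words[-1] != verb else None
            events.extend(HandEvent(e.kind, target) for e in steps)
    return events

class ProxyHand:
    """Feeds synthetic events into the same input channel the app
    already listens on for real hand tracking."""
    def __init__(self, input_channel):
        self.input_channel = input_channel  # e.g. the app's event handler

    def execute(self, command: str) -> int:
        events = plan_actions(command)
        for event in events:
            self.input_channel(event)  # indistinguishable from real input
        return len(events)
```

Because the proxy emits ordinary hand-tracking events, an unmodified app that reacts to physics or gestures would respond the same way it does to a real hand, which is the property Liang describes.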

Some trial users are already enthusiastic about the tool's potential. In the study, 20 participants were asked to replicate tasks from a demo video, then freely explored HandProxy's capabilities for 10 minutes. Some participants were excited to have a virtual stand-in that they could "talk (to) normally and intuitively." But other participants, to the researchers' surprise, were more excited by the idea of having the hand do more abstract tasks that "aren't limited to the physical world."

"It could act like an agent, where a user gives it a high-level command, like 'organize my workspace,' and it finds a way to sort and close all your open windows," Liang said.

One barrier to adoption is that the hand sometimes misinterprets a user's commands. HandProxy was asked to do 781 tasks during the study, and while it correctly performed most of the tasks within one to four attempts, it failed at 64. For instance, the software didn't realize that one user was referring to a digital basket when they said "the brown object," and it didn't know to push a heart button when asked to "like the photo."

The researchers are currently working on ways to help the software interpret ambiguous speech, without taking too many liberties. One study participant offered a potential solution: allowing the hand to ask and answer questions.

The team has applied for patent protection with the assistance of Innovation Partnerships and is seeking partners to bring the technology to market.

The research was funded by the University of Michigan.
