Visualization space: natural interface


Introduction

Humans communicate using speech, gesture, and body motion, yet today's computers do not use this valuable information. Instead, computers force users to sit at a typewriter keyboard, stare at a TV-like display, and learn an endless set of arcane commands -- often leading to frustration, inefficiency, and disuse. We have created the Visualization Space -- a system that allows for natural interaction. The Visualization Space -- similar to our DreamSpace -- uses an intuitive yet richly interactive interface that allows the user to manipulate and navigate through all types of visual information. It "hears" users' voice commands and "sees" their gestures and body positions. Interactions are natural, more like human-to-human interactions. This information system understands the user, and -- just as important -- other users understand. Users are free to focus on virtual objects, information, understanding, and thinking, with minimal constraints or distractions by the computer, which is present only as wall-sized 3D images and sounds (but no keyboard, mouse, wires, wands, etc.).

The multimodal input interface combines

* voice: IBM ViaVoice speech recognition;
* body tracking: machine-vision image processing;
* understanding: context, and small amounts of learning.

The Visualization Space is essentially a smart room employing a deviceless natural multimodal interface built on these emerging technologies and combined with ever-cheaper computing power. This system was designed and used for scientific visualization applications. Future natural interfaces will allow information and communication anywhere, anytime, any way the user wants it -- in the office, home, car, kitchen, design studio, school, and amusement park.

[Figure: pointing]

Description

Our Visualization Space allows users to collaborate in a shared workspace using interactive visual computing. The system "hears" users' voice commands and "sees" their gestures and body positions. Interactions are natural, more like human-to-human interactions. The "computer" understands the user, and -- just as important -- other users understand. Users are free to focus on virtual objects, information, understanding, and thinking, with minimal constraints or distractions by "the computer", which is present only as wall-sized 3D images and sounds (but no keyboard, mouse, wires, wands, etc.). As shown in the schematic below, this intuitive human-like interaction is made possible by emerging interface technologies:

[Figure: schematic of the Visualization Space]
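This page does not say how the machine-vision system turns a tracked body into a pointing location. One common approach, shown here purely as an illustrative sketch and not as the VizSpace method, is to cast a ray from the user's head through the hand and intersect it with the plane of the display. The function name `pointing_target`, the coordinate convention (display in the plane z = 0, user at z > 0, units in meters), and the choice of head and hand as the two tracked points are all assumptions made for this example.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]  # (x, y, z) in meters (assumed convention)


def pointing_target(head: Vec3, hand: Vec3) -> Optional[Tuple[float, float]]:
    """Intersect the head-to-hand ray with the display plane z = 0.

    Returns the (x, y) point on the display, or None when the ray
    does not travel toward the display.
    """
    dx = hand[0] - head[0]
    dy = hand[1] - head[1]
    dz = hand[2] - head[2]
    if dz >= 0:
        # The hand is not closer to the display than the head,
        # so the ray never reaches the plane z = 0.
        return None
    t = -head[2] / dz  # ray parameter at which z becomes 0
    return (head[0] + t * dx, head[1] + t * dy)


# Hypothetical measurement: user 2 m from the display, arm raised forward.
hit = pointing_target(head=(0.0, 1.5, 2.0), hand=(0.25, 1.25, 1.5))
print(hit)  # (1.0, 0.5)
```

A real tracker would also need to smooth noisy measurements over time; that is left out to keep the sketch short.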

The Visualization Space is a networked workspace where the computing system adapts to the human to optimize ease of use, enjoyment, and the organization and understanding of information. The Visualization Space paradigm of computing is ideal for many applications:

scientific visualization

The interactive communication of complex visual concepts is now as easy as pointing and speaking. A high-speed network and a supercomputer (the IBM RS/6000 SP) provide computational power and the ability to handle enormous databases, e.g., geoseismic data. Users can collaborate remotely with users at other workspaces and workstations.

education and entertainment

In location-based entertainment (e.g., a virtual theme park), where interface hardware is often damaged, our hands-off gadget-free interface allows robust unobtrusive interactivity. Virtual adventures and walk-throughs are also more fun and memorable. A user might take another user on a tour of a historic site, a virtual factory, or a flying tour of the continents. (See DreamSpace for more info.)

Interaction example

The Visualization Space (early 1997) is pictured here, showing Mark Lucente in a typical interaction. To move one of the virtual objects - the earth - displayed on the large display, Mark simply points at the object and asks the system to "put that there":

[Figure: picture of interaction, before: "Put that..."]

[Figure: picture of interaction, after: "...there."]
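The "put that there" command only makes sense when the spoken words are combined with where the user was pointing as each word was said. The sketch below shows one simple way such a combination could work: each recognized word and each pointing measurement carries a timestamp, "that" is resolved to the object nearest the pointing location at that moment, and "there" is resolved to the pointing location when it was spoken. This is an illustration under stated assumptions, not the VizSpace implementation; the names `Word`, `PointingSample`, and `put_that_there`, the timestamp format, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) position on the display (assumed units)


@dataclass
class Word:
    text: str
    time: float  # seconds; when the speech recognizer heard the word


@dataclass
class PointingSample:
    time: float    # seconds; when the vision system took the measurement
    target: Point  # where the user was pointing on the display


def pointing_at(samples: List[PointingSample], t: float) -> Point:
    """Return the pointing target measured closest in time to t."""
    return min(samples, key=lambda s: abs(s.time - t)).target


def nearest_object(objects: Dict[str, Point], p: Point) -> str:
    """Return the name of the object whose position is closest to p."""
    def squared_distance(name: str) -> float:
        ox, oy = objects[name]
        return (ox - p[0]) ** 2 + (oy - p[1]) ** 2
    return min(objects, key=squared_distance)


def put_that_there(
    words: List[Word],
    samples: List[PointingSample],
    objects: Dict[str, Point],
) -> Optional[Tuple[str, Point]]:
    """Resolve 'put that there' into (object name, destination)."""
    texts = [w.text.lower() for w in words]
    if not samples or not objects:
        return None
    if "that" not in texts or "there" not in texts:
        return None
    that_time = words[texts.index("that")].time
    there_time = words[texts.index("there")].time
    name = nearest_object(objects, pointing_at(samples, that_time))
    return name, pointing_at(samples, there_time)


# Hypothetical scene and input streams.
objects = {"earth": (0.5, 1.0), "moon": (1.6, 1.2)}
words = [Word("put", 0.0), Word("that", 0.4), Word("there", 1.5)]
samples = [PointingSample(0.3, (0.55, 1.05)), PointingSample(1.4, (1.2, 0.6))]
print(put_that_there(words, samples, objects))  # ('earth', (1.2, 0.6))
```

Matching by nearest timestamp is the simplest possible alignment; it ignores recognizer latency, which a working system would have to account for.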


Frequently-asked questions

You ask....
...and I (Mark Lucente) answer. Send questions to Mark Lucente

Or look at the FAQs related to natural interaction.

What is VizSpace running on?
The VizSpace runs on an IBM PC. Both the IBM ViaVoice(tm) speech recognition and the vision input operate on a multi-processor IBM Netfinity 7000 (also an IBM PC 704) running Windows NT. Interface integration, communication, and application software all share this same PC. Computer graphics rendering power is provided by a standard graphics accelerator card. An ATM network link to an SP allows for additional application processing power.

Why are you using such a big PC?
Initially, we needed lots of computing power for the interface modalities (voice and vision), which are processed in the main CPU(s) -- not in specialized hardware. ViaVoice(tm) represents an advance in speech recognition technology that frees up CPU cycles, and the machine-vision system has been optimized and runs on one processor without significantly interfering with the rest of the system. Simpler PCs have also been used.

How much does it cost?
The IBM PC, sound card, video digitizer, camera, microphone, and graphics accelerator card currently in use cost about $15,000. This is far more power and bandwidth than is required (or utilized). The system runs (usually just as fast) on systems that cost under $10,000. The cost of the display depends on size: our rear-projection display is big (over 2 meters wide) and bright and costs about $30,000. A new display technology from IBM will deliver better performance for a fraction of the price. And smaller displays cost much less.

What else can it run on?
The VizSpace has been implemented on a variety of platforms during the past two years: IBM AIX (Unix), distributed across the network; a single-processor IBM IntelliStation; a single IBM ThinkPad; an IBM PC 704 server; and an IBM Netfinity 7000 PC server.

What do all these terms mean?: "natural computing"? "natural interface"? "Visualization Space"?
I believe that computers can be designed from the ground up to be as easy and natural to use as, say, talking to your best friend, your mom, your cactus. Making computers more natural to use ("natural computing") requires a new kind of "natural interface" -- one that allows humans to communicate the way they naturally communicate with each other: speaking, gesturing, moving around, etc.

The Visualization Space (also called "VizSpace" for short) uses a particular kind of natural interface. Users interact with and control the images displayed on the VizSpace simply by speaking, gesturing, etc. Other examples of natural interfaces are in the works: desks, tables, cars, kitchens, living rooms -- natural objects and environments that also happen to be "smart" and interactive.

A simplified version of the VizSpace (developed at IBM's T.J. Watson Research Lab in Yorktown, New York) was set up and demonstrated at the Comdex 1997 computer exhibition in Las Vegas in November 1997. The DreamSpace is the next generation of VizSpace, designed for a broader range of applications.


Contact: Mark Lucente, lucente@watson.ibm.com
last update: 1998 Apr
