User Interactions in xR

Discovering extended reality (xR)

Augmented, mixed, and virtual reality offer great new ways to work with 3D objects, 3D charts, data visualizations, and games, not to mention many uses that we haven’t even thought of yet. They enable entirely new applications and can make some existing applications a lot better to use.

Take a spreadsheet with a lot of data, for example. Displaying the data in a chart helps our brains to make sense of it very quickly and to spot things like trends, ranges, and anomalies. If it is a 3D chart on a 2D screen, some things might be hidden from view and it can be hard to judge relative sizes. Put that same 3D chart in a 3D environment, where you can walk around it, look at it from different angles, or virtually “grab” it and move it around, and a 3D chart becomes just as easy to read.

AR, MR and VR break out of flat displays and place the human in a world that is not limited to a small, flat area. This can offer some advantages to traditional office applications and even more so for new applications that are simply not feasible with flat displays.

Moving around

In this article we use xR as a generic term for augmented, mixed, and virtual reality. Moving around in xR already works quite well. In virtual reality, your body usually stays in one place or moves around in a limited physical space. (This also helps to avoid bumping into objects in the real world.) The virtual space, by contrast, is limitless. There are several methods for moving around in a virtual environment, such as “teleporting”: you point at the location where you want to go and trigger a jump. There are even hardware solutions, like one that lets you fly like a bird.

Moving around in augmented reality is, well, moving around in actual reality. The real world is augmented with extra information or objects. In this app, for example, you can go on a hike and get information about certain spots along the way, projected on a 360° view of the location.

Mixed reality is an enhanced form of augmented reality where your view of the real, physical world is enhanced with virtual objects. If you move your head, the virtual objects stay in the same place in the physical world.

To understand the difference between augmented and mixed reality, think of the augmented reality of using your phone and pointing your camera at something. Your phone adds objects or information on the screen on top of the camera image. Think of Pokémon Go. Mixed reality typically requires a headset that continually augments your view of the real world. Meta Glasses and Microsoft HoloLens are two examples.

Manipulating objects

Ideally, we would like to be able to touch virtual objects with our hands and realistically “feel” them. Several xR systems already detect arms and hands and simulate direct manipulation. Some can even apply pressure on your fingers so that you can feel like you are touching an object. Over time this will get more and more realistic.

Controllers are another option. With controllers you can point, click, rotate, and move in space, much like with a traditional computer mouse, except in 3D.

Voice is another option. You can tell the computer what to do with an object. This too is getting better and better with newer iterations of xR devices.

Interactions in xR

Let’s look at typical interactions like we are used to on regular computers.

Imagine trying to work with your laptop without using your mouse or keyboard. (No, it doesn’t have a touchscreen. It is a traditional laptop.)

Siri? Cortana? Google voice recognition? This may work for a limited set of commands but certainly not for everything you’d like to do. How would you navigate a spreadsheet? Create a slideshow? Draw an image?

Regular interactions like we are used to with mouse and keyboard are still difficult in xR. Today there is no standard mouse for xR. There is no standard pointer or cursor either. Keyboards floating in xR space are less convenient than writing messages using an old cell phone’s number pad. (Remember T9?)

An extra dimension

A traditional computer monitor is convenient. It is flat, it has a fixed size, and it stands still. It is easy to select something on a screen because it is easy to move a mouse pointer. Touch screens are even more intuitive. They do away with the pointer and let you touch an object on the screen directly. You feel your finger on the screen and you see the result of your action right away.

In contrast, xR environments are three-dimensional, they can take on any size and objects can move around in space.

This makes it harder to find things because they may be out of view and it makes it hard to select things with a pointer because you have to move the pointer in 3 dimensions instead of 2.

Some of today’s interaction methods for xR environments have two types of problems:

They may be useful but they don’t work well enough.
They are the wrong tool for the job.

Let’s look at some examples of these types of problems and how they may be fixed or avoided.

Great for specific applications

Some interaction methods are suitable in principle but just don’t work well enough yet. Microsoft’s HoloLens has a voice interface that doesn’t always understand what you say, and with Meta’s headset it is possible, but still hard, to manipulate objects with your hands.

Movies show us what it would be like if everything worked well. Take the movie “Iron Man”, for example: Jarvis, the AI assistant, understands everything Tony says, and all virtual objects can be grabbed and moved by hand.

I’m sure that we will get there sooner rather than later. Others seem to think so too. The headsets on the market right now are still regarded as developer models. Several companies are spending a lot of resources on solving these issues as soon as possible.

The right tool for the job

If you have a screw, a hammer is not a useful tool. In today’s xR environments we often lack the right interaction tools, at least if our goal is to replace a traditional computer.

Take voice input, for example. You cannot create a drawing using your voice. Other actions, like changing the volume of your headset, are possible using voice commands, but they are slow and imprecise.

If I say “turn down the volume 10%”, it could be interpreted as lowering the volume by 10% of the maximum, lowering it by 10% of the current level, or setting it to 10%. If you had a volume knob instead, you would just turn it and hear when it reaches the desired volume: quick and intuitive.
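To make the ambiguity concrete, here is a small arithmetic sketch. It assumes a volume scale from 0 to 100 and three plausible readings of the spoken command; the function name and the dictionary keys are made up for illustration and do not come from any real voice assistant.

```python
def volume_interpretations(current: float, percent: float = 10.0) -> dict:
    """Resulting volume (0-100 scale, an assumption) under three readings
    of the command "turn down the volume <percent>%"."""
    fraction = percent / 100.0
    return {
        # Reading 1: lower the volume by `percent` points of the full scale.
        "minus_percent_of_max": max(0.0, current - percent),
        # Reading 2: lower the volume by `percent` of its current level.
        "minus_percent_of_current": current * (1.0 - fraction),
        # Reading 3: set the volume to `percent` of the full scale.
        "set_to_percent": percent,
    }


# Starting from a volume of 60, the same sentence gives three different results.
results = volume_interpretations(current=60.0)
for reading, volume in results.items():
    print(f"{reading}: {volume:.1f}")
```

Starting at 60, the three readings end up at roughly 50, 54, and 10. A physical knob never raises this question because you hear the result while you turn it.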

User interface elements like knobs, dials, and buttons that are taken from traditional, 2-dimensional user interfaces don’t necessarily translate well into the 3-dimensional xR environment. The constraints of a 2D environment actually make it a lot easier to work with those controls. In xR they would need to be redesigned or replaced by something different.

We also need better xR user interface implementations that follow the basic usability principles that are true in any environment. For example, a consistent way to find and navigate a menu structure, allowing users to easily recover from mistakes, and good ways to give feedback to the user.

Don Norman’s “The Design of Everyday Things” is considered a standard text on this subject.

Having said that, current xR solutions do work really well for certain applications.

Watching immersive videos (360° or 180°) works really well. With cheap tools like Google Cardboard, which quickly transforms your own mobile phone into a VR viewer, a large audience already has access to compelling experiences.

Take the daily news site Blick.ch for example. We’ve helped them develop apps for VR content.

What is needed

Ideally, you would want the xR environment to behave and feel like a physical world where you can see, feel, and hear objects and activities, how and where you would expect.

Currently there are very few systems that offer a fully immersive and realistic experience. The ones that do require powerful and expensive hardware and a dedicated location. The Void is an example of a fully immersive environment with vision, sound, touch, smell, and even temperature.

If it is not possible to be completely realistic, the best approach is to offer alternatives that work effectively and reliably with the hardware that you have.

Take virtual reality using a mobile phone and Google Cardboard for example. A primitive solution that simulates a mouse pointer with Cardboard is the ‘gaze’ method: a pointer – usually a small dot – stays in a fixed position in front of your head. If you move your head, the pointer moves with it. If you move the pointer over an object and keep it in the same spot for a short period, the computer knows you want to ‘click’ in that spot.
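The gaze method boils down to a dwell timer. The sketch below is one possible way to write it, not how Cardboard itself implements it: the class name, the 1.5-second threshold, and the idea of passing in the id of the object under the pointer are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GazeDwellSelector:
    """Fires one 'click' when the pointer rests on the same object long enough."""

    dwell_seconds: float = 1.5  # assumed threshold; real apps tune this value
    _target: Optional[str] = field(default=None, init=False)
    _since: float = field(default=0.0, init=False)
    _fired: bool = field(default=False, init=False)

    def update(self, target: Optional[str], now: float) -> Optional[str]:
        """Call once per frame with the object under the pointer (or None).

        Returns the object's id in the frame where the dwell click fires,
        otherwise None.
        """
        if target != self._target:
            # The pointer moved to a different object (or to empty space):
            # restart the timer.
            self._target = target
            self._since = now
            self._fired = False
            return None
        if (
            target is not None
            and not self._fired
            and now - self._since >= self.dwell_seconds
        ):
            self._fired = True  # click only once per continuous dwell
            return target
        return None


# Simulated frames: (object under the pointer, timestamp in seconds).
selector = GazeDwellSelector(dwell_seconds=1.0)
frames = [("play", 0.0), ("play", 0.5), ("play", 1.0), ("play", 1.5), (None, 2.0)]
clicks = [selector.update(obj, t) for obj, t in frames]
print(clicks)
```

The waiting time is the weak point: a short threshold causes accidental clicks, while a long one makes every interaction feel slow.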

The gaze method may be okay for short interactions like starting a movie, but it gets tiresome quickly. It is also totally unsuitable for anything that requires precise movement, such as drawing.

With Cardboard you get one fixed button and no other physical control. This button can enhance the gaze method: you can click on an object when the pointer is over it instead of having to wait a few seconds.

Vive and Oculus have physical controllers that allow much better interaction. You don’t need to move a pointer with your head and eyes anymore; you can just interact with objects using the controllers.

(Image credit: VR Gamer)

This picture shows a Vive controller that has a laser beam coming out of it in the virtual environment. Point at an object with the laser beam to do something with it.
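Pointing with a laser beam is essentially a ray cast: follow a line from the controller and pick the first object it hits. The sketch below shows the idea with objects approximated as spheres. The scene, the object names, and both function names are invented for illustration; real engines provide their own ray-casting functions.

```python
import math
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]


def ray_sphere_distance(
    origin: Vec3, direction: Vec3, center: Vec3, radius: float
) -> Optional[float]:
    """Distance along a unit-length ray to the first hit on a sphere, or None."""
    to_center = tuple(c - o for o, c in zip(origin, center))
    # How far along the ray the point closest to the sphere center lies.
    t_closest = sum(a * b for a, b in zip(to_center, direction))
    # Squared distance from the sphere center to the ray.
    dist_sq = sum(a * a for a in to_center) - t_closest**2
    if dist_sq > radius**2:
        return None  # the ray passes beside the sphere
    half_chord = math.sqrt(max(0.0, radius**2 - dist_sq))
    t = t_closest - half_chord
    if t < 0:
        t = t_closest + half_chord  # the ray starts inside the sphere
    return t if t >= 0 else None  # negative: the sphere is behind the controller


def pick(
    origin: Vec3, direction: Vec3, objects: Dict[str, Tuple[Vec3, float]]
) -> Optional[str]:
    """Name of the nearest object hit by the beam, or None if nothing is hit."""
    length = math.sqrt(sum(d * d for d in direction))
    unit = tuple(d / length for d in direction)
    best_name, best_t = None, math.inf
    for name, (center, radius) in objects.items():
        t = ray_sphere_distance(origin, unit, center, radius)
        if t is not None and t < best_t:
            best_name, best_t = name, t
    return best_name


# Hypothetical scene: object name -> (center, radius).
scene = {
    "cube": ((0.0, 0.0, 5.0), 1.0),
    "lamp": ((0.0, 0.0, 10.0), 1.0),
    "chair": ((5.0, 0.0, 5.0), 1.0),
}
print(pick((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene))
```

Because the beam has a direction as well as a position, the user only has to aim instead of moving a pointer through all three dimensions.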

Mixed reality devices like Microsoft HoloLens and the Meta glasses currently use a mixture of a gaze pointer, hand recognition for gestures, hand recognition for direct manipulation of objects, and voice commands.

We have found several use cases that work quite well with HoloLens and have developed applications for health, public transport, and tourism, for example.

This picture shows a full-size model of a ship with a historic representation of what that type of ship looked like many years ago. The overlay is created live, visible to the person who is wearing the HoloLens.

Where we’re going

Microsoft is working with other manufacturers like Acer to bring (HoloLens) mixed reality devices to a broader audience. Apple has announced it is working on AR and has started implementing AR and VR technology at the operating system level with macOS 10.13 and iOS 11. Google, Facebook, and several other companies are also putting a lot of resources into research and development in xR.

It can be expected that the next generations of mixed reality and virtual reality devices will see significant improvements.

Currently the HoloLens projects its augmented view into a fairly narrow area in front of the wearer’s eyes. The field of view of mixed reality devices will expand.

Voice recognition will improve so that it becomes more like a truly conversational interface and it can distinguish between different people.

Eventually, physical controllers will become smaller and better. They may even disappear, as demonstrated in the “Iron Man” movie. If our hands can be detected with perfect accuracy, we don’t need the controller anymore. This has already happened with touch screens: you don’t need a mouse anymore.

Devices for tactile feedback will enhance the experience so you can feel virtual objects.

The devices will become faster, smaller, and cheaper. Currently many devices still need to be tethered to a powerful desktop computer but this is only a temporary situation.

Of course eventually all of these devices will disappear and be replaced by a direct brain interface. That is still a year or two away.