Coming to a campus near you: Voice assistants

Key takeaways

Privacy concerns should be considered with voice technology
Students are already personally using these devices
Voice technology can be applied to daily student life

There is a scene in Star Trek IV where Scotty, traveling back in time, sits down to program a 1980s-era computer and says, "Computer ... Computer?" When he gets no response, McCoy helpfully hands him a mouse. Scotty speaks into it: "Ah! Hello, Computer!" Still no response. Their host, exasperated, says, "Just use the keyboard!"

An important truth about human-computer interaction lies behind this bit of comic relief: people intuitively want to use all of their senses to control their environment. Speaking to a computer is not that unusual an idea. It is something people pick up immediately and begin using without prompting, as soon as it becomes available.

Voice interaction is especially compelling because it lets people do more than one thing at a time. You can gesture and speak; move about and speak; see and speak; and so on. For this reason, a conversation will always produce a more emotionally satisfying and deeper experience than using a medium such as a screen.

How it works

To hold a conversation with you, a computer first needs to listen to your speech, recognize the individual words and phrases, and then assemble them into a logical structure. People have been working on voice recognition for several decades. For a very long time, the software had to be trained to recognize each individual speaker, and even then accuracy was only in the mid-80% range. These problems have now been overcome: the software is speaker-independent and accurate in the high-90% range.

A big part of this success is due to the availability of cloud services, which can scale out enormously. Take someone's speech and farm out recognition to hundreds of parallel computing resources, and now you can achieve real-time interaction by voice.

The other half of the equation is connecting the recognized speech to the user's environment. Sure, you could say, "Turn on the lights" and software will understand you, but how does that turn into the act of turning on lights, and which lights are we referring to?

At present, this is done by embedding small computing devices in the environment and registering them with the cloud voice control service. The table lamp wirelessly connects to a small computer plugged into a wall socket. That computer registers with the cloud, in essence saying, "I am a 'light' and I can respond to commands to 'turn off' or 'turn on'." When a person walks into the room and says to a voice assistant such as Alexa or Google Home, "Turn on the light", the cloud service relays that command to the embedded light controller, and the light turns on.

Many devices can be automated in this way. A television set could register as a media playback device; a door lock could register as a lock; a laptop or a mobile device could register as a user interface; and so on.
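The registration-and-relay idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Device`, `VoiceService`, `register`, and `relay` are hypothetical and do not correspond to any vendor's actual API, speech recognition is assumed to have already produced text, and "which light" is resolved simply by the room the speaker is in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Device:
    """Hypothetical embedded controller, e.g. the small computer behind a lamp."""
    name: str   # e.g. "table lamp"
    kind: str   # what it registers as: "light", "lock", ...
    room: str   # where it is installed
    commands: Dict[str, Callable[[], str]] = field(default_factory=dict)


class VoiceService:
    """Stand-in for the cloud voice control service (not a real vendor API)."""

    def __init__(self) -> None:
        self._devices: List[Device] = []

    def register(self, device: Device) -> None:
        # The controller announces: "I am a <kind> and I respond to <commands>."
        self._devices.append(device)

    def relay(self, utterance: str, room: str) -> Optional[str]:
        # Assumption: the utterance is already recognized text.
        text = utterance.lower()
        for device in self._devices:
            # Only consider devices in the speaker's room whose kind was named.
            if device.room != room or device.kind not in text:
                continue
            for phrase, action in device.commands.items():
                if phrase in text:
                    return action()
        return None  # nothing registered could handle the request


class Lamp:
    """Toy lamp that just remembers whether it is on."""

    def __init__(self) -> None:
        self.on = False

    def turn_on(self) -> str:
        self.on = True
        return "light is on"

    def turn_off(self) -> str:
        self.on = False
        return "light is off"


service = VoiceService()
lamp = Lamp()
service.register(Device(
    name="table lamp",
    kind="light",
    room="dorm 101",
    commands={"turn on": lamp.turn_on, "turn off": lamp.turn_off},
))
print(service.relay("Turn on the light", room="dorm 101"))
```

A door lock or a television would register the same way, with a different `kind` and its own command phrases. A real service would use proper language understanding rather than substring matching.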

Privacy concerns

This all sounds like a wonderful, new world of love, peace, and flowers, but what about privacy and hacking? If I can just walk into a room and talk to a computing device, doesn't that imply that the computer is listening all the time in the background, waiting for me to say something it recognizes? Isn't that a problem?

Yes, the devices are always listening, and yes, that is something people will need to deal with. But just as the Internet forced people to deal with new privacy issues and adopt new behaviors, people will do the same here. Many voice assistant devices prominently feature a mute button to disable the microphone, for example. People will need to remember that a computer is listening in, and learn to disable it when necessary. Voice assistants also keep records of the commands they have been given. And, as with any other electronic device, those records could be requested by law enforcement.

It's fair to say that our existing privacy and trust controls are coarse-grained, stove-piped and difficult to use. Being able to put into place really personal, really flexible controls is an area of intense research. Establishing and controlling trust is one of the great unsolved problems of modern, inter-networked computing.

OK, but how real is all this?

In another nod to the futurist concepts of Star Trek, we've been using motion sensors for quite some time now to control doors, outdoor lights, faucets, and many other devices. It wasn't that long ago that it was novel to find a door opening automatically, perfectly timed so we didn't break our stride. Now, it is thoroughly routine.

Voice assistance technologies have definitely come to our mobile devices, and the intelligence behind them is improving rapidly. There is still a challenge with using voice in a noisy environment such as an automobile, however, and that is slowing adoption. For example, driving directions can't yet be done conversationally and entirely hands-free.

Voice assistants such as Amazon Alexa and Google Home are enormously popular devices. Alexa, for example, has shipped over 3 million units. Ellucian has been experimenting with these devices, and we have developed proofs of concept and demonstrations that use voice technologies to create a "dorm room of the future" where a student can get their schedule, send and receive video mail, attend online classes, and control the room environment, all using voice commands together with embedded video monitors. We have also experimented with interfacing voice commands with our backend systems, so that simple voice queries such as "What is my GPA?" could be asked.

What should colleges do next?

Voice assistants are already proliferating in college dorms. Just google "packing list for college" or "gifts for high school graduates" and you'll see these devices at the top of the list. Students are using these devices for their personal use and, as digital natives, will soon expect their schools to jump on board.

Although the technology is still in its early stages, this is an excellent time to experiment with conversational computing technologies. Put a voice assistant in lounges and classrooms. Connect them to audio/video devices and room controls. Start learning how to develop applications so students can ask about schedules, building locations, walking directions, events, menus, and the many other facts of daily student life.
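To give a feel for what such an application involves, here is a minimal Python sketch of the part that runs after speech has been turned into text: matching the question to an "intent" and looking up an answer. Everything in it is an assumption made for illustration. The sample `SCHEDULE` and `MENU` data, the regular expressions, and the `handle` function are hypothetical, and a real deployment would use the vendor's own development kit and query actual campus systems.

```python
import re

# Hypothetical sample data; a real application would query campus systems.
SCHEDULE = {"monday": ["BIO 101 at 9:00", "MATH 220 at 13:00"]}
MENU = {"lunch": "vegetable soup", "dinner": "pasta"}

DAYS = "monday|tuesday|wednesday|thursday|friday|saturday|sunday"


def answer_schedule(match: re.Match) -> str:
    """Answer a question about classes on a given day."""
    day = match.group("day")
    classes = SCHEDULE.get(day)
    if not classes:
        return f"You have no classes on {day.capitalize()}."
    return f"On {day.capitalize()} you have " + " and ".join(classes) + "."


def answer_menu(match: re.Match) -> str:
    """Answer a question about what a given meal is."""
    meal = match.group("meal")
    dish = MENU.get(meal)
    if dish is None:
        return f"I don't know the {meal} menu yet."
    return f"{meal.capitalize()} today is {dish}."


# Each intent pairs a pattern over the recognized text with a handler.
INTENTS = [
    (re.compile(rf"\b(?:schedule|classes)\b.*\b(?P<day>{DAYS})\b"), answer_schedule),
    (re.compile(r"\bwhat(?:'s| is) for (?P<meal>breakfast|lunch|dinner)\b"), answer_menu),
]


def handle(utterance: str) -> str:
    """Route recognized speech to the first intent whose pattern matches."""
    text = utterance.lower().strip()
    for pattern, handler in INTENTS:
        match = pattern.search(text)
        if match:
            return handler(match)
    return "Sorry, I can't help with that yet."


print(handle("What is my schedule on Monday?"))
print(handle("What's for lunch?"))
```

Adding a new kind of question, such as building locations or events, means adding one more pattern and handler to `INTENTS`. The fallback reply also shows which requests students make that the application cannot yet answer.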

And, most importantly, solicit feedback and suggestions from students, faculty, and administrative staff. There are several voice-to-text applications available from the device vendors, so designate an account to receive text suggestions, and have people just tell you what they'd like to see, conversationally.
