Accessibility is AI’s Biggest Challenge: How Alexa Aims to Make it Fairer
DESCRIPTION: Smart home technology, like Alexa, has been one of the biggest boons in recent years for people who are blind, and for people with disabilities altogether. Voice technology and AI help empower people in many ways, but one obstacle stands in its way: making it equitable. In this session, learn from Amazon about how they’re approaching the challenge ahead.
BRIAN: Accessibility is AI’s Biggest Challenge– How Alexa Aims to Make it Fairer. Moderator, Caroline Desrosiers. Speakers, Peter Korn and Dr. Joshua Miele.
CAROLINE DESROSIERS: Hello, everyone. Welcome to accessibility and AI’s biggest challenge– making it fairer for everyone. And today, we have Peter Korn and Josh Miele. Peter is the director of accessibility for devices and services at Amazon. He led the development of the VoiceView screen reader for Fire TV, Fire tablets, and Kindle devices. He initiated the first access features for voice interfaces, including Alexa captions and Tap to Alexa, as well as Show and Tell for Alexa. He also facilitated key improvements to React Native Accessibility, which is technology used by hundreds of applications that can all now be far more accessible. And taken all together, millions of people today use these accessibility features built into Amazon devices. And then we have Josh Miele who is a blind scientist, community leader and inventor with a history of developing innovative information accessibility solutions for blind people. Currently, Josh is a principal accessibility researcher at Amazon where he helps guide the non-visual customer experience for device accessibility and advises widely across Amazon on inclusive design and research methods. His work integrates disability inclusive design, accessibility engineering, disability studies, and other disciplines, applying emerging technologies and trends to a range of information accessibility challenges. Josh holds a bachelor’s degree in physics and a PhD in Psychoacoustics from the University of California at Berkeley. So welcome Josh and Peter. And we are looking forward to hearing all of your thoughts on today’s session topic, which is once again, Accessibility and AI’s Biggest Challenge– Making it Fairer for Everyone. And we’re going to be talking about how Amazon AI technologies like Alexa make deeply meaningful changes to the lives of blind and visually impaired people. 
So we’re going to talk about Alexa, speech recognition, and voice technology, how to delight a new and growing customer base of digital and accessibility natives, and hear about how Amazon is approaching the AI and accessibility challenges ahead. So we have a lot to cover. Let’s get right into it with the first question of the day. Alexa has been around for eight-ish years and has evolved a bit. Can you share a few examples of what Alexa can do today that she couldn’t do at first? Anything folks in the blind community should know about, but may not at the moment? And Peter, let’s start with you.
PETER KORN: Sure, Caroline. And thank you so much for having me. It’s a real pleasure to be here. Alexa’s first public outing was at the CSUN Assistive Technology Conference where we showed it first to blind people. And when we showed it at that time, it was this disembodied voice in the background that you could talk to and ask about the weather and the news and play songs. And that was amazing, that you could do it without having a microphone near you. But since that time, it’s grown a lot. And one great example is the Show and Tell feature that combines computer vision, machine learning, and Alexa to identify household items. And I want to pitch it over to Josh because he had a large thing to do with Show and Tell coming to the world.
JOSHUA MIELE: Thanks, Peter, and thanks, Caroline. Really fun conversation. I’m looking forward to what we have to talk about here today. So Show and Tell is a feature designed specifically for blind and visually impaired people on Alexa devices that we call multimodal devices. The ones that have screens in addition to Alexa. And they also have cameras. So the idea of Show and Tell was, if you are a blind chef or person at home and you’ve got a bunch of boxes or cans or packaged items that are not necessarily labeled in Braille, let’s say you just got them home from the grocery store or your categorization methods on the shelf weren’t very organized when you put things on the shelf, you can actually just hold a box or can up to the camera for Alexa to identify. And you just say, Alexa, what am I holding? And she’ll– the first time you do that, she’ll actually teach you how to do it. So there’s a bunch of steps to it. But once you’ve done it a time or two, Alexa will just jump right in and do her best to tell you what she’s seeing. And this is an experience that sounds really simple, but it’s actually something that not only serves a really important purpose for independent blind people to be able to, for example, open a can of olives instead of chili when olives are what they want, but also to make sure that the right information is being provided. So it doesn’t just do an exact match and figure out, oh, I know exactly what that is. And I’m 100% confident, but lighting varies, labeling varies, and sometimes your thumb is over part of the label. So an exact match isn’t always possible. So it was really important to make sure that our success– we were measuring success against not an exact match, but being helpful. And being helpful means providing an exact match if you can. But it also means, if you see branding information that you can– if she sees branding information that can be identified, then that will be spoken.
If neither of those can be done then, of course, there’s probably text-based information that can be spoken, read out loud from the label. And if it just says olives, that’s really helpful even if it doesn’t provide an exact match. So really understanding deeply what the customer experience is, what blind people need from something like this is part of what we’re trying to bring AI to do. And part of what we at Amazon are really trying to do with all of the experiences that we design if they’re– this one is specifically for blind and visually impaired people because most people who can see don’t need their cans read for them. But we’re interested in doing this kind of deep dive on how the interaction works for everyone. And AI is a really, really key part of being proactive, making the right decisions, taking steps for the benefit of the customer without necessarily making all those steps be asked for explicitly by the customer. Peter, you might have some other examples there.
PETER KORN: Yeah, no. I think another great example that is of very broad use that’s in the very same vein of being helpful is Alexa Routines. We didn’t have Routines eight years ago. Now, you can say Alexa, good morning, and your good morning routine involves turning on your coffee pot downstairs and turning on lighting at a low level so that you can head down the stairs and giving you your morning briefing and so on. And that’s a fantastic addition that we added in the last eight years.
JOSHUA MIELE: And these are examples of things that weren’t– they’re part of making things work better for everyone.
CAROLINE DESROSIERS: Oh, that’s fantastic. This is really AI for good. And I think that this transitions nicely to talking about a question that I’m sure a lot of people have on their minds from last year at Sight Tech Global. Some of your other colleagues at Amazon were speaking to the idea of ambient AI where you anticipate a customer speaking less to Alexa in the future and the AI, instead, more proactively acting on their behalf. Now, this could be very helpful. I think some people might also be concerned. So I’d love to hear your thoughts on the ambient AI, moving from just simple tasks like playing music to more proactive assistants like notifying a customer when a package was delivered. So are we getting closer at this point from a technology perspective to this ambient intelligence reality? And what’s changed over the past year since we saw that last presentation? And maybe this time I’ll start with you Josh.
JOSHUA MIELE: Sure. And I think I’ll toss out a few ideas, and then I’m definitely interested in hearing what Peter has to say also. But I mean, you can think of– you mentioned package delivery. It’s a great example: Alexa is tied into Amazon’s network, and knows when your package has been delivered. And until recently, we had a way of informing people that their package was delivered or rather, that there was a message or a notification pending for them. So if you saw the ring, the yellow ring of light on Alexa’s crown, or if you saw some other visual, other visual notifications were used, then you knew that there was a notification waiting for you. And you could say, what are my notifications? Or, cancel my notifications. However, if you couldn’t see– and this was one– this is a problem that many of our blind and visually impaired customers encountered, if you can’t see that light, that visual cue that was designed very intentionally to be unobtrusive and uninvasive was not available to you. You didn’t know that there was a message pending if you couldn’t see. And so we have recently released a new feature that you can turn on which is Notify When Nearby, which basically means that when she thinks that you are nearby, if she hears noise or if she hears your voice or if the camera is on and can see a person, then actually there will be a spoken notification. There will be a spoken message that there’s notifications pending so that you, as a blind person, don’t need to see that light or ask randomly, do I have notifications? So this is part of how we’re trying to make the system smarter and engage appropriately with people when you activate that feature. So if you know that you want to be told about notifications when you’re nearby, when she knows that you’re nearby, she will say, hey, you’ve got notifications. Do you want to hear them? There are probably some other examples that I– Peter, what do you think?
PETER KORN: Well, what I really like about Notify When Nearby is it addresses what our colleague Mark Mulcahy calls the candy bowl problem. Mark is a principal software engineer at Amazon. He led development of the VoiceView screen reader that’s built into all of our Echo Show devices. And he’s also blind. And when a colleague puts out a bowl of candy, at least when we’re back in the office, how do you know it’s there? And Notify When Nearby brings that equality, that fairness to our blind customers. And the other features that are part and parcel of ambient intelligence are, likewise, things we want to make sure are benefiting everyone. And a few examples of this. So I mentioned the Good Morning routine. But we also have something called hunches. So you might have a good night routine. And Alexa might have a hunch that normally when you go to bed, you turn the lights off downstairs. You make sure that the garage door is closed. And so the hunch would be Alexa saying, hey, do you want me to turn off the lights because I noticed you said goodnight. And that kind of hunch is another great example of taking a further big step down the road to this vision of full ambient intelligence. A third example is entertainment where I might be wondering with my family, what do we want to watch tonight? And we can just ask Alexa. And based on our watch history that we’ve shared, Alexa might suggest, hey, this might be a good night for a princess movie or a great night– or coming up on Christmas, maybe a traditional Christmas movie like Die Hard because you’re also an action fan. And this is, again, something that would benefit everybody. And today, as of– I don’t know– the last month, about 30% of smart home interactions are initiated by Alexa based on hunches, based on what she’s recognized is going on. And nearly 90% of Alexa’s daily routines are initiated without a customer saying so. So we’re taking really strong strides down the road of ambient intelligence.
CAROLINE DESROSIERS: Interesting. Josh, I’m curious to hear a little bit more about– just a follow-up question– about the research that powered these Alexa features like hunches and routines. Can you speak more to the research behind that, actually collecting information from Alexa users?
JOSHUA MIELE: Well, Amazon prides itself on being a customer-obsessed company. We want to delight our customers and we want to do everything we can to provide the products and services that people want. And that, of course, includes customers with disabilities. So we do a lot of customer research. And one of the things that we– when we begin to do, we start by asking our customers what kinds of things they want. We can talk a little bit more about how we engage with our customers with disabilities through conferences and through focus groups and stuff like that. But we begin, very often, by talking to people on the inside of Amazon. We have many, many people working for Amazon and many of them have disabilities. And so we conduct internal research ahead of our external research to make sure that we’re going in the right direction to get all the feedback we can, about how our new features are landing with all of our customers, including those with disabilities. So we knew that hunches and routines and Show and Tell were going to be very powerful because we started by asking Amazonians, folks that work at Amazon, and we knew that they were going to be extremely useful for people with disabilities because we have a very strong and healthy community of people with disabilities of all types working at Amazon. I can tell you a little bit more about the– we’re also very interested in engaging in– we do all kinds of research. And one of the things that we’re really leaning into is making sure that the research that we conduct, even if it’s not directly intended to be accessible research, we produce– we want to reach out to our customers.
And when we do, we want to make sure that the survey or the focus group or whatever activity we’re engaged with at the moment is going to be inclusive and accessible so that if somebody responds to a survey that’s not necessarily even about blindness or low vision, we want to know that survey tool is going to be accessible with a screen reader, is going to have appropriate contrast settings so that people with low vision are comfortable responding to it because when we ask to hear the voices of our customers, we want to know that we’re hearing all of the voices of our customers, including those with disabilities of all kinds.
PETER KORN: And I want to underscore something that Josh said about our employees with disabilities. I think that’s a real superpower that Amazon has in designing accessible products and services for everyone. We have thousands of people with disabilities and their allies in our AmazonPWD affinity group, and that strength has really borne out in the access features we’ve delivered. And I would invite all of the companies that are participating in Sight Tech Global to really lean into hiring and lean into employees with disabilities in how you make your products.
CAROLINE DESROSIERS: That’s great. So let’s really dig into the heart of this topic today, which is making AI fairer for everyone. How do you ensure at Amazon that you are building, for everyone, what are more examples of this, and when you’re creating the future of Alexa, how exactly are you working with folks to do that? Can you speak a little bit more to actually making the AI fair?
PETER KORN: So there are a couple of ways we do that. And the first is, of course, tapping our employees, tapping our customers, doing research to understand what are the pain points, what are the areas where we could be doing more, where Alexa or our other technologies could really solve pain points that customers with disabilities have. And I think it even goes beyond fairness and into opportunities where technology and digital services can just fundamentally be better than their physical alternatives. An electronic book is a book whose fonts can be as large as the size of your screen. Electronic shopping, online shopping, means you don’t have to reach for shelves that are hard to reach from your wheelchair or see what the two different cans are on the shelf because you don’t have Show and Tell with you in every store. Particularly now coming to AI and fairness, I’m delighted that we recently announced and began a collaboration with Apple, Meta, Microsoft, and the University of Illinois Urbana-Champaign, their Beckman Institute, on getting large data sets of impaired speech so that we can train our voice assistants to recognize speech from somebody who is perhaps fairly far along in ALS or whose speech is impaired by Down syndrome or cerebral palsy. And that’s both from a fairness point of view that everyone should be able to use Alexa. But it also leads into solving problems that really can’t be solved well any other way. If I don’t have motor skills, how do I turn on and off my light switch? And if that same neurological disease like ALS is impacting my speech as well as my motor skills, boy, it’s so important that I be able to do that with my voice or with whatever physical abilities I have. So I think it’s both fairness and even beyond fairness looking at, with Show and Tell, how we can make the lives of our customers better beyond what they have today in the physical world.
CAROLINE DESROSIERS: I think this raises an interesting point because a lot of people are growing up, not only as digital natives but also accessibility natives at this point. People who have had a disability for a long portion of their life and they’re familiar with how to embrace it and they’re raised in this age of digital tech, they’re very familiar with it. And we often talk about that community. But it seems we also don’t talk about the experience of the new blind cohort. And Josh, I’m interested to hear your thoughts on this.
JOSHUA MIELE: Yeah, I mean, the flip side of digital or accessibility natives is folks who are losing their vision late in life or as adults and don’t necessarily have– there’s a lot to learn about using a screen reader and using technology as a blind person. It is extremely challenging along with all of the other things that somebody has to learn when their vision is changing. And so one of the great things that I think we’re able to do with Alexa is to bootstrap or scaffold that process so that you can actually remain connected to and able to access the information you want. You can still read the news. You can still access podcasts. You can still watch movies with audio description. You can still listen to your favorite radio station. You can ask questions, get information from Wikipedia, all of this without knowing anything about how to use a screen reader or a computer as a blind person. And that’s not to say that folks shouldn’t or don’t need to learn those things. But it’s essential that you start somewhere because isolation is such a critical problem for aging populations, especially aging populations that are losing vision or other capabilities. To have a system that you can just talk to and get information from is really powerful. And that’s one of the most important groups of people that I think we’re able to help. We talk about digital equity. We talk about fairness. And this is one of the most important groups who are– they’re not part of the vocational rehabilitation. They’re not in education and they often get left behind. And this is why Alexa is such a powerful force in so many people’s lives as they are aging and losing vision.
PETER KORN: I also want to tease out something about accessibility natives because I think there’s another facet to that cohort that’s really interesting. A few years ago at the NFB National Convention, a young blind boy came up to our booth and said, Alexa, Siri, Google, what time is it? Alexa, Siri, Google, what’s the weather? And it was just completely natural to him. And then he found the remote to our Fire TV that we had in our booth, and he just– with complete comfort– navigated to Netflix, turned on audio description and settled in to watch Cars 3 because it was just– our whole purpose in the booth was to be there for him. And I loved the sense of entitlement because, of course, he should be entitled to all that technology has to offer. And I’m really looking forward to what he and all of his accessibility native cohort are going to demand of us in the coming years as they grow up and they’ve grown up with this and want yet still more. And I think that’s one of the exciting things about the years to come.
JOSHUA MIELE: And Peter, we contrast this with just a few years ago, people were shocked and delighted to even have a screen reader that worked well on a TV. And now the expectations are so much higher, and that’s what we want. That’s what we want to see. We don’t want it to be a surprise that there’s accessibility. We don’t want it to be a surprise that there’s good accessibility. We want people to expect accessibility across the board and to be divinely dissatisfied when it’s not up to their standards. And let us know, and help us make it better.
CAROLINE DESROSIERS: So thinking about the future at this point, I know there are accessibility related challenges that you’re facing today and some that you’re looking to tackle in terms of obstacles for the future in making voice technology and AI equitable. And with just under about 4 minutes left, I’d love to hear both of your thoughts on these challenges and what’s next from here.
PETER KORN: No, I’ll go ahead and–
JOSHUA MIELE: Do you want to start?
PETER KORN: Yeah, I’ll go ahead and start. I mean, I think, to your point about what we had a few years ago, I remember when VoiceView came out for the Kindle e-reader, and one of the comments in the review was they were surprised at how good the voice is in an e-reader. And I think we’re going to continue to see really exciting improvements in the quality and caliber of text-to-speech. And there’s a ton of AI behind that. Understanding what is being said, if I’m reading a Kindle book with Alexa and the passage in the book says, he jumped back frightened by the ghost that came through the door, you’d love that to be a frightened sounding voice. So there’s some neat stuff happening there. There’s some great stuff happening in computer vision. I think we’re really just scratching the surface. We recently introduced a feature on the Blink cameras where, on your iPhone, if you connect to your camera, it will actually describe what the camera sees. And I think there’s a ton more that’s going to happen there. Josh, anything you want to add?
JOSHUA MIELE: Yeah, I mean, it’s just there– we talk about AI as if we know what we’re talking about, and it’s going to be stunning to see what the technology evolves into in the next few years. And what I think is most interesting will be to discover the new uses that we haven’t even thought of. We’re already using AI for improving text-to-speech, for improving computer vision. And those are really– we haven’t begun to build– Amazon recently released a little robot called Astro that does a few things. But imagine what we could do with real AI and real– the capability of having machines move things for us, do things for us, and we’re seeing self-driving vehicles. That’s going to get better. And equity is always going to be an issue. We will always want to make it better. We will always want to make it more equitable. And we’re always going to be pushing the limits and using the same tools that where– the tools themselves will be the solution to the problems that the tools themselves present. So I think I’m so excited to be able to work among the accessibility folks at Amazon who are going to be finding these new applications of machine learning and what we call AI for accessibility purposes.
CAROLINE DESROSIERS: Very cool. Well, we’re all excited to find out what’s next. And thank you so much to Josh and Peter for sharing your thoughts on this topic. Accessibility and AI helps empower people in so many ways. But we absolutely need to make sure it is fair and equitable to everyone. So thanks, everyone, for listening. And back to you Sight Tech Global.