DESCRIPTION

Over the past several years at Sight Tech Global, our keynote speakers have examined AI's transformative impact on digital accessibility—how it has the potential to close long-standing gaps while, at times, introducing new ones. This year, Mike Shebanek and Saqib Shaikh reunite to explore how AI is reshaping daily life for blind and low-vision people, with Saqib sharing personal stories of empowerment using AI-driven agents and assistants. Together, they illuminate a rapidly evolving landscape where AI systems adapt to each user, proving that universal design is a pathway to technology that changes lives.
Speakers
- Moderator: Mike Shebanek, Accessibility Consultant
- Saqib Shaikh, Co-founder of Seeing AI, Microsoft
SESSION TRANSCRIPT
[MUSIC PLAYING]
VOICEOVER: Accessibility Pioneers and the Promise of AI: A Life-Changing Technology. Speaker: Saqib Shaikh, Project Leader for Seeing AI, Microsoft. Moderator: Mike Shebanek, Consultant.
MIKE SHEBANEK: Welcome to Sight Tech Global. It is year six, and we are so happy to be here again. My name is Mike Shebanek, and I’m here with Saqib Shaikh from Microsoft. And it seems like we were just here a minute ago, Saqib. It’s been a year, and we got so many things to cover. Uh, but before we jump in, uh, maybe just take a moment for those people who don’t know you, uh, to give a little bit of background on, on who you are and some of the things you’ve been working on.
SAQIB SHAIKH: Oh, wow. Yeah, thank you, Mike. It’s really good to be talking again, and to be here at Sight Tech Global. So I work at Microsoft. I lead the Seeing AI team here, where we’re looking at how emerging technologies, of course like artificial intelligence, can help the blind community. And the Seeing AI app is something I’ve sort of been a… This whole project started 10 years ago, which is crazy. So much has happened in that decade, but so much has happened in just the past year as well.
MIKE SHEBANEK: Yeah, and I, I can’t wait to jump into all of those things. Um, it is actually, I mean, even as we were thinking about preparing for this, so hard to kind of s- sort through like, what do we talk about? There’s so many things to cover. But I hope through this conversation, people in our audience will be able to get a sense for some of the big changes that are happening, some of the kind of major announcements, maybe some of the sneaky things that have happened that people don’t really realize are going to be big game-changers over time. Um, and so, um, I think the way I would think about this last year, maybe you would too, is things have sort of changed from, “In the future, it will be like…” to, “Today, I’m doing X.” Like, today I’m doing something with this. And I… And so I wanted to ask you, like even your daily life over the last year, um, like how are you using AI or some of these new technologies that have come out?
SAQIB SHAIKH: I think as I look back to last year in our session, we were talking about, you know, is this the moment, um, and, and we were talking about, you know, people will be thinking about AI more and more, but I think it’s really become much more real, and I’m wondering if you have the same feeling after, you know, the last year. And the speed at which things are changing is getting ever faster, but every year, AI can do even more. I remember, you know, like a couple of years ago, “Oh, we can, you know, get really rich descriptions of images. Oh, wow, and now we’re doing videos.” And now we’re actually doing real-time understanding of the world around you and helping you do things in the real world. And I think the key thing there is, it’s going from these theoretical things to very practical things that people are using in their everyday lives. It’s becoming much more mainstream. And that’s the physical world. So if we sort of split this, then, into the digital world, where I find myself just in my daily life using AI to do so many tasks which everyone would find useful, but I, as someone who’s blind, probably use in different ways. So I’ve been having a lot of fun with, with that recently.
MIKE SHEBANEK: (laughs) You, you just moved to a new home, I think, as you were telling me, and you were using Copilot for things that were like, “Wow, that’s brilliant.” Um, talk a little bit about what your experience was, moving into a new home and having Copilot, versus maybe a year or two ago if you’d done the same thing without Copilot. And, and describe what Copilot is for people who don’t know.
SAQIB SHAIKH: Yeah. Copilot is Microsoft’s, um, personal assistant, which… Sorry. It’s equivalent to many of these AI chat interfaces where you can talk to it, you can type to it, and it learns about you and can give you information from across the web. And yeah, it’s been really useful as I’ve been going through this house move for so many reasons. But if I had to pick two, it would be shopping and tactile instructions. So for me-
MIKE SHEBANEK: So, tactile instructions. (laughs) How is that helping you?
SAQIB SHAIKH: Well, the, the great thing is that Copilot knows about, um, things on the internet, like for this particular brand, what is the instruction manual? Like, you know, how am I going to set this device up? How am I going to unscrew this, or, you know, this bit of plumbing over there, what tool do I need? But then I can say, “Actually, you know what? Let’s do this with non-visual instructions.” And that’s true for these household tasks, but also, just this weekend, I was trying to, uh, teach the kid how to tie different knots for scouting, and I could just say, “Okay, give me a tactile, non-visual version of instructions on how to tie a knot,” and, and so forth. And I also mentioned shopping, because again, things which sighted people might just glance at, I can now say to Copilot or whichever AI, “Hey, I want to buy this thing. What are the popular brands? What are the popular options?” And okay, now we’re looking at these two models. Would you compare these things and, you know, put them in a paragraph instead of a table? Or whatever it be. It’s like the information comes to me in the format that I can consume most easily, and that’s helpful for everyone, but I think as someone who’s blind, if you learn to use these tools, it’s been a game-changer this year.
MIKE SHEBANEK: Yeah, I love how you described that this is, uh, allowing you to interact with the physical world. Like, a lot of times when we talk about accessibility, we think of it sort of digitally, and it’s like, well, a website or a form or something. Um, but you’re using this to do things like, I think you were telling me, even fold up a sofa or a couch. Like (laughs) how would you do that? Or find where the, the water supply turn-off knob or lever is, um, and, and translating these things into real-life things of everyday living, not just sort of online or digital.
SAQIB SHAIKH: Exactly. It’s kind of this continuum. So the work on Seeing AI initially was just like, you know, you’re gonna have your camera that’s viewing the world and help you get around the world, and at the other end, you’ve got this, um, something that’s looking just at your computer screen and helping you complete tasks there, and then you’ve got this intermediate, like you say, where… You’re in the physical world. It’s looking things up in the digital world and saying, you know, “Run your hand along this surface and look for this shaped, uh, screw, and then you’re gonna turn it this way.” And that’s, that’s really helpful.
MIKE SHEBANEK: And how would you compare that to, say, a year or two or three ago where you didn’t have that? Were, were you asking people to… for assistance? Were you looking it up, like, online PDFs on the web? Like, how, how were you, (laughs) how were you doing it? Or you just didn’t?
SAQIB SHAIKH: Well, no, I think you’re always finding the workarounds that work best, but the great thing here is, you know, um, depends on your personal situation, but sighted help is a, is a resource that I personally will go to when I need it. But you don’t want to be using it all the time, so this just gives me more independence. There are more things that I can do, or I get a few steps further along before I have to wait around to get a sighted person to check something, for example. So, yeah, that’s very empowering.
MIKE SHEBANEK: Awesome. Yeah, and I love the shopping example too. I think, um, I mean, we buy things all the time, every day, whether it’s just, you know, things to live on, uh, things around the house, clothes, food, what have you, but also services, things like that. And to have AI help out, uh, I mean, I, I… You tell me, has it changed, like, your quality of life a lot, or is it just, is it just that you get more done in a day?
SAQIB SHAIKH: Oh, quality of life, I don’t know. I, I just think it brings a whole new dimension to independence. Like, you know, when you’re researching which, which thing you’re gonna buy or how you’re gonna do something or, you know, I want to learn a recipe, this, that, whatever it be, all these things… There are, you know, many, many tasks in the world where sighted people, they’ll skim across a whole page of information and, and get what they need, whereas I have to listen to every single word, typically with a screen reader. So the ability for AI to just get me to the answer that I want in the most concise form has just been super valuable. And it’s, it’s not just about speed. I don’t want to say, “Oh, I’m just trying to get everything done faster,” but just, yeah, more things are feasible. More things are opened up to you.
MIKE SHEBANEK: Yeah. Yeah, I think one thing we should say is that, you know, AI is not just about speed or efficiency, although it certainly can help with that. But it’s really about access and the quality of the experience too. And, and you, you hinted at this, which I think is a really good point, which is that, you know, for people who are sighted, they can visually scan. They can look at, you know, a product page or a, you know, a comparison or something, and their eyes just naturally jump to the things they care about. Whereas if you’re listening to something via a screen reader, you’re going sequentially through the whole thing, and it takes a lot longer, but also it’s harder to do that kind of, like, random access or that… you know, that skimming ability. And so AI, when used in a thoughtful way, can actually provide that kind of capability where it’s like, “Oh, I just want to know about this aspect,” and you can jump right to it. Or, “I want to compare these two things side by side,” but presented in a way that makes sense for the way you’re consuming information. And that, that to me is like… that’s the power of it. It’s not just making things faster. It’s making things accessible in a fundamentally different way.
SAQIB SHAIKH: Yeah, exactly. And, you know, I’ve, I’ve used the phrase before, AI is a universal translator. So, you know, we’ve been doing all this work at Microsoft on being able to describe images, um, but the underlying technology is applicable to way more than just describing images. Like, an AI that can look at a photo and describe it can also look at a chart or a graph or a diagram and describe it. It can look at a user interface and tell me what’s on the screen. It can look at, you know, a physical object and tell me what it is. So the technology generalizes across so many different modalities, but also then that same technology can take information and present it in different ways. So, you know, whether I want it as a, um, you know, a verbal description or I want it structured as bullet points, or I want it translated into another language, or I want it summarized, or I want it expanded upon with more detail… Like, the AI becomes this universal translator that can take information from one form and put it in the form that works best for me. And that’s incredibly powerful.
MIKE SHEBANEK: Yeah, absolutely. And, and I wanted to, um, talk a little bit about some of the, uh, recent developments in the space. So, you know, we’ve been talking about AI broadly, but there’ve been some really significant advancements even in, like, the last six months to a year. Um, you know, things like GPT-4o from OpenAI, which has multi-modal capabilities. Um, Google has Gemini. Anthropic has Claude. Meta has Llama. Like, there’s all these different, uh, large language models and multi-modal models that are coming out. And I, I’m curious, from your perspective, like, how do you think about all of these different options? Is it just like, “Oh, more is better”? Or are there specific things that some of these models are better at than others? Like, how do you navigate, uh, this landscape as someone who’s trying to build products for the blind community?
SAQIB SHAIKH: Yeah, so I think about this in, in sort of two ways. One is, you know, as a product developer, I’m always looking at what’s the best technology available to solve a particular problem. So, you know, maybe one model is really good at understanding images, another one’s really good at, you know, conversational dialogue, another one’s really good at reasoning through complex tasks. So we’ll often use different models for different purposes depending on what they’re optimized for. But I think the other really important thing is that competition in this space is fantastic for accessibility. Because when you have multiple companies all trying to push the boundaries of what’s possible with AI, they’re all trying to make their models better, faster, more capable, more accurate. And that rising tide lifts all boats. Like, the more these companies are competing to have the best AI, the better the technology gets for everyone, including people with disabilities. And I think what’s really exciting is that many of these companies are also thinking about accessibility as a core part of what they’re building. So it’s not just an afterthought. It’s like, “How do we make sure that as we’re developing these models, they work well for people who are blind, for people who are deaf, for people with other disabilities?” And that’s, that’s really encouraging to see.
MIKE SHEBANEK: Yeah, I think that’s a really good point. And I, I’ve noticed that too, where, you know, accessibility is becoming more of a, a first-class concern rather than something that’s tacked on at the end. Um, and I think part of that is because there’s an awareness that, you know, if you design for accessibility from the beginning, you often end up with a better product for everyone, not just for people with disabilities. Um, but I, I wanted to ask you about something that, that’s been on my mind, which is, you know, we’re seeing all these amazing advancements in AI, and they’re enabling all these new capabilities. But there’s also, I think, some, some concerns or some questions about, you know, as these systems become more powerful and more capable, how do we make sure that they’re used responsibly? How do we make sure that they’re not introducing new biases or new forms of discrimination? And I’m curious, like, from your perspective working in this space, um, what are some of the things that you think about or that you’re concerned about, or that you’re working to address to make sure that as AI becomes more prevalent, it’s actually helping people and not creating new barriers?
SAQIB SHAIKH: Yeah, it’s a great question. I mean, I think there are a few things that come to mind. One is just making sure that the data that these models are trained on is representative. Like, if you’re training an AI to describe images, you need to make sure that the training data includes images of people with disabilities, includes diverse representation of different types of people, different environments, different contexts. Because if the training data is biased, then the model’s outputs are going to be biased too. I think another really important thing is making sure that we’re including people with disabilities in the development process. Not just as testers at the end, but actually as part of the team that’s designing and building these systems. Because we bring a unique perspective on what works, what doesn’t work, what could be harmful, what could be helpful. And that perspective is invaluable in making sure that these systems are built right from the beginning. And then I think there’s also this question of transparency and explainability. Like, when an AI makes a decision or provides information, it’s important to understand why it’s doing what it’s doing, especially if that decision could have a significant impact on someone’s life. So being able to say, “Okay, the AI told me X. How did it arrive at that conclusion? What information is it basing that on?” That’s really important for building trust and making sure that people can use these systems confidently.
MIKE SHEBANEK: Yeah, I think those are all really important points. And I, I think the one about including people with disabilities in the development process is particularly crucial because, you know, there’s this saying in the disability community: “Nothing about us without us.” And I think that really applies here. Like, you can’t build technology for people with disabilities without actually having people with disabilities involved in every step of the process. Um, and I wanted to ask you about another development that I think is really exciting, which is the emergence of AI-powered smart glasses. So, you know, we’ve seen products like the Ray-Ban Meta smart glasses, we’ve seen, uh, other companies working on similar things. And these seem like they could be really transformative for people who are blind or visually impaired because they provide this hands-free way to interact with AI and to get information about the world around you. Can you talk a little bit about, like, what you think the potential is there and, and what Microsoft is doing in that space?
SAQIB SHAIKH: Yeah, that’s so cool. Going back, I think 2014 was the first time we had a pair of glasses hooked up to an AI, and all that became Seeing AI, but we never managed to find a way to do a commercially viable, mass-market pair of glasses with a camera on that sees for you. And it’s really cool to see products like those from Meta, which actually have the camera built into the glasses. And we recently announced that Microsoft, through Seeing AI, is going to connect with those, uh, Ray-Ban Meta glasses and enable you to use Seeing AI and all those tools hands-free. So you know, while you’re holding your cane and maybe trying to open a door or carry some shopping, you know, we never have enough hands. (laughs) So, not needing to hold your phone with the camera is, I think, gonna unlock some new possibilities, because suddenly if the camera’s always there and it’s pointing wherever you’re looking, then it goes from, “Okay, tell me what you see right now,” to, like we said about agents, me and this AI in my pocket and the glasses, we’re gonna partner on completing some task together. And that’s a lot of the focus for the Seeing AI team right now: “Okay, you have this hands-free solution that’s always there. Let’s unlock new possibilities for what you can do in the world.”
MIKE SHEBANEK: Yeah. I think one of the things I love about this, that kind of, for me, makes it really real is, in the past, there have been prototypes, there have been some commercial products, but they cost multiple thousands of dollars. And, and now the prices on these are literally less than $300, which is roughly the cost of a pair of (laughs) glasses anyway. I don’t know, my glasses tend to cost me more than that, even with insurance coverage in the US. (laughs) So, uh, that price seems like it’s making it more accessible in terms of, you know, cost. But also, they’re available now at so many different outlets. Like, they’re easier to find. Before, you kind of had to know about these. Someone needed to tell you, and you kind of needed to be on the inside. But this is just regular old stuff now, um, and it kind of speaks to the idea that traditional products are just becoming more inclusive by design in a way that, you know, really hasn’t existed, uh, in the past. And I, you know, I’ve heard people in the blind community say, you know, there’s never been a better time to be blind because of all the technology available. And I, I would love to (laughs) actually get your thoughts on that. Is that just the perception people have in the moment, or is it actually true that, you know, there’s just more available now than there’s ever been before?
SAQIB SHAIKH: (laughs) Ah, well, you know, you’re asking a tech enthusiast here. You know I’ve dedicated my career to this, so- (laughs) … um, there’s a bit of bias there. But-
MIKE SHEBANEK: I hope you’re biased, but I want to hear your answer ’cause I think people would be interested, right?
SAQIB SHAIKH: Well, I think at any moment in time, I would have said yes. The answer is yes. And it- but it’s always been yes. So, you know, um, when I was in university, there were books I could read on the computer, but they were few and far between. And audiobooks were, you know, there, but not massively available. And then, you know, over time, the day that services came online where I could read hundreds of thousands, even millions, of books on the internet, that unlocked more, and so on all the way through. But yes, today, the technology is incredibly exciting. All the examples we’ve discussed, from processing information to understanding the visual world around you, to doing your schoolwork or your, um, work at work, um, there is just so much that is possible now that you couldn’t have done even a year or two ago. So I’m very optimistic, and year on year, there’s gonna be more and more things. So, excited to see where this all goes.
MIKE SHEBANEK: Yeah, absolutely. And, and I wanted to ask you as we kind of get to the close here, um, we’ve touched on so many things. Uh, how do you think we ensure that people with vision loss are always part of the development? Not sort of just as consumers, like, “Okay, thank you for making this for me, I’m so grateful,” but really to be decision-makers, policymakers, um, product designers. Um, you’ve been in the tech field a long time, you’ve worked with a lot of people in these roles, and, and so for people maybe in the audience who are in these roles, maybe they’re focused on accessibility, maybe they’re not and this is their first, uh, you know, contact with this topic, um, what are some of the ways they can be, um, part of this process, um, make sure that it’s not done for people who are blind but with and by people who are blind?
SAQIB SHAIKH: Yeah. I often think of my work in Seeing AI as a conversation between the blind community and the scientists. You know, how do we bridge the gap between what people need and what the technology is capable of? And I have the luxury of doing that because, um, my team creates products for the blind. But everyone can do this in their own way. Like, if you are in a role where you define policy or, um, are creating the technology, then make sure that people with disabilities are included as early in the process as feasible. But everyone can play their part in some way. It could be that you’re helping test some software, so you can get your perspectives, um, represented much earlier on in the development process. It could be that you’re an advocate and you’re gonna talk about the specific needs of this community. It could be that you’re reporting bugs because you find these models are not representative of people with disabilities or so forth, so make sure that you speak up and people know that. You might be teaching others. So, um, you may be a tech enthusiast, but then there will be others who are nervous about technology, so you might be the teacher. So I would just say, whatever your thing is, this is life-changing technology. And yes, we need to make sure that the companies making these technologies are aware of our needs, that the products meet our needs, and that it scales so that everyone out there benefits and learns about it as well.
MIKE SHEBANEK: Fantastic. Well, Saqib, thank you. Uh, unfortunately, our time (laughs) is up. As it always is too soon. Uh, I can’t wait till next year to hear what’s happening, you know, again, in, in another year. It’s gonna be incredible. But thank you as always for joining and giving your thoughts on what’s happening on the inside of the industry, but also as part of the blindness community, uh, your perspective on these changes and what we should be thinking about and getting ready for. No doubt a lot of these topics that you’ve touched on will be discussed throughout, uh, Sight Tech Global this year, uh, so we’re excited to hear more from a lot of presenters about some more details of these things that we just sort of touched on. Um, thank you everyone for joining. Thank you for listening. Uh, I hope you have a great Sight Tech Global and, uh, we’ll talk again soon.
SAQIB SHAIKH: Thank you. Always a pleasure. And if I can give a brief plug-
MIKE SHEBANEK: Yes.
SAQIB SHAIKH: … um, seeingai.com. And you can always get in touch, uh, Seeing AI or saqibs@microsoft.com.
MIKE SHEBANEK: Awesome. Thank you. Thank you, Saqib.
[MUSIC PLAYING]
