DESCRIPTION
How do we ensure that the next generation of AI systems truly works for blind and low-vision people? The answer is in the design, and for that, you need the right data. The vast majority of AI models are trained on datasets that underrepresent or incorrectly categorize disability, creating a "disability data desert" that reduces accuracy by up to 30% for disability-related objects. Without blindness-relevant data, AI models may confidently describe obstacles that are not there, misread signage, or mislabel medication. Discover how Be My Eyes' transparent, privacy-first data collection is revolutionizing AI's ability to understand the lived experience of blind and low-vision users.
Speakers
- Moderator: Karae Lisle, Chief Executive Officer, Vista Center for the Blind and Visually Impaired
- Mike Buckley, Chairman and CEO, Be My Eyes
SESSION TRANSCRIPT
[MUSIC PLAYING]
VOICEOVER: Designing AI That Sees Us: Powered by Blind-Centric Data. Speakers: Mike Buckley, Chairman and CEO, Be My Eyes. Moderator: Karae Lisle, CEO, Vista Center.
KARAE LISLE: Hello, everyone. My name is Karae Lisle, and I’m the CEO for Vista Center, the executive producer of Sight Tech Global. Today’s conversation is one I’ve been especially looking forward to because it sits at the intersection of innovation, responsibility, and lived experience. We are here to explore one of the most urgent questions in AI right now. How do we ensure that the next generation of AI systems truly work for blind and low vision people? And the answer is in design, and for that, you need the right data, you need the right perspectives, and you need leadership that understands that accessibility isn’t an edge case, it’s a core requirement.
Which brings me to today’s esteemed speaker, Mike Buckley, who is the CEO of Be My Eyes, a global platform serving millions of blind and low vision users in more than 150 countries. Under Mike’s leadership, Be My Eyes launched Be My AI, one of the first mainstream deployments of GPT-powered visual assistants and a tool that’s reshaping what day-to-day independence looks like for our community. But what truly distinguishes Mike is his vision for how AI should be built. He’s one of the strongest voices calling for AI systems trained on blind-centric data, data that reflects the real environments, questions, tasks, and challenges that blind users face every day. This session is about technology, yes, but it’s also about equity, agency, and the right to independence. Mike, thank you for being here with us and for everything that Be My Eyes continues to contribute to the blind and low vision community. Let’s dive in.
MIKE BUCKLEY: Uh, thank you so much. Uh, kudos back to you for everything you and Vista Center do, uh, to make the world more accessible. Uh, it does take a village. And, uh, I’m glad to share this space with you in that village, so happy to be here. Um, my name is Mike Buckley. I’m the CEO of Be My Eyes. I’m a white middle-aged male. I have brown, but rapidly graying hair, uh, and I’m in a home office flanked by a real guitar over my right shoulder and a Lego guitar over my left and some other doodads. But it’s, uh, it really is a pleasure to be here with you today.
Um, I’m here to make a desperate plea and, uh, make a pretty, I hope, adamant argument about the dramatic need to get more disability data into AI datasets. For those of you, uh, watching, listening, um, I’m showing a slide that says, “Why AI needs blind data.” And it’s a user holding a smartphone, an iPhone, as well as what looks to be a VR headset. Two things. That VR headset was AI generated, and it’s not real. It’s not one that exists in the world. However, I asked the four best models, the best AI models in the world, what that headset was. One told me it was an eye massager. One told me it was an Apple Vision Pro. One told me it was smart glasses, and one told me it might be an AR headset. It’s none of those. And so this underscores the problem that we’re gonna be talking about today.
Now, what happens when the failure mode is much more real? What happens when we’re talking about medication? What happens when we’re talking about, um, what I’m showing on a slide: a traffic light, expiration dates, spatial guidance at a bus stop, and descriptions in a train station? What happens when the use cases and the stakes are higher and we see failure modes in AI? Um, the risk for everyone on the planet is real, but the risk for the blind or low vision consumer is much more acute. And unfortunately, what we see is that models, even great models… And by the way, I’m a huge fan of AI. It’s important for the world. It’s great for the blind and low vision community, and we need to embrace it because we could enter a golden era of accessibility, but we have to be honest about its problems and issues.
Um, why does AI fail? Why does it give us confident and inaccurate descriptions? The slide I’m showing has two pictures. One is a pristine kitchen with everything beautifully labeled on a countertop, and the other literally looks like a tornado went through the kitchen. The issue with AI training data is that most of it is trained on beautiful, pristine, nice datasets. This creates systemic failures in AI design because it doesn’t reflect the messy reality of life, right? A beautiful image in perfect lighting conditions is not always going to help the AI answer for a blind or low vision consumer who’s holding their camera at an angle the AI thinks is odd or imperfect, or in lower light conditions. And so AI can struggle with real-world scenarios, leading to failures, by the way, that disproportionately risk harm to blind and low vision consumers, and that’s unacceptable.
Now, the good news is there is a way through this problem, and that is specifically by incorporating more blind and low vision datasets. Um, and why is that? One reason is that some of the realities of blind life can be messy and/or imperfect, right? Whether it’s a blurry photo, or as we talked about earlier, an odd angle, or what AI thinks is an odd angle, or different light conditions, blind datasets force these models to acknowledge uncertainty, right? Um, this is sometimes called in the literature a domain shift or distribution shift, which basically means the model is trained on something that’s different from the reality that’s in front of it. So these blind datasets have the power to force these models to acknowledge uncertainty and reduce falsehoods.
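[Editor's note: the "acknowledge uncertainty" behavior described above can be sketched in a few lines. This is a minimal, hypothetical illustration of confidence-thresholded abstention, not code from any real Be My Eyes, Microsoft, or OpenAI system; the labels, probabilities, and threshold are invented for the example.]

```python
# Hypothetical sketch: instead of always returning its top guess, a model
# abstains when its confidence falls below a threshold. All values here
# are illustrative, not from any real production system.

def describe_or_abstain(label_probs, threshold=0.7):
    """Return the top label, or an explicit 'uncertain' answer."""
    label, prob = max(label_probs.items(), key=lambda kv: kv[1])
    if prob < threshold:
        return "I can't tell from this image. Try adjusting the angle or lighting."
    return label

# A blurry photo spreads probability across labels, so the model abstains;
# a clear photo concentrates it, so the model answers.
blurry = {"VR headset": 0.35, "eye massager": 0.33, "smart glasses": 0.32}
clear = {"milk carton": 0.94, "juice box": 0.04, "other": 0.02}

print(describe_or_abstain(blurry))
print(describe_or_abstain(clear))
```

[The blurry input produces the abstention message rather than a confident wrong label, which is the behavior being argued for in the talk.]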
So think about, like, when a Be My Eyes volunteer or an Aira agent is asked by a blind or low vision consumer about something. That volunteer or that agent can have the blind or low vision person move the camera, right, to improve spatial reasoning, change the angle, improve the lighting, have a back and forth dialogue about what problem in the image is causing us not to be able to solve the problem. So, take an expiration date on a carton of milk. Well, if the camera’s not pointed at that expiration date, right, that human volunteer or that human agent can help. When AI gets that information, it should be then able to learn how to deal with that uncertainty, right, and then better guide a blind or low vision consumer through whatever task they need to complete.
But that also makes AI better for sighted users, right? And so not only is there a net benefit to the blind and low vision community in terms of accessibility or better AI for blind and low vision, but it’s gonna make AI better for anybody. So it’s incumbent upon us to apply as much pressure on these model makers as we can.
What I’d ask you to think about when you think about this problem is to go back in time. It’s 1996. The internet is starting to catch fire, and everyone is in a mad dash to create websites. What happened as a result of that mad dash? Well, one of the things that happened is people designed websites and internet experiences that weren’t grounded in disability, right, or accessibility. What is the net result of that? To this day, in 2025, more than 90% of websites have inaccessible elements. Now, think about that problem times 10 with AI, with a real-time voice agent guiding you through a subway or telling you about your medication. So this is not some theoretical problem that it’d be nice to solve. It’s a problem that we have to solve. And where we are in this cycle of AI right now is that foundation models are being developed as we speak, and we really need to figure out how to attack this problem.
Um, there’s a quotation from the founder of Be My Eyes, Hans-Jørgen Wiberg, who was a Vista honoree last year. Thank you very much for that. Um, in addition to being a remarkable human being, and in addition to taking his personal kind of problem and figuring out a technological solution that blended technology and human kindness, um, Hans has a great quotation about this, and I just wanna read it to you. Um, and he says, “We believe there’s a moral imperative to ensure AI models not only account for blindness and blind people, but do so in a way that’s consistent with our actual experiences, capabilities, and power. Our premise is simple. Our mission is to make the world more accessible for people who are blind or have low vision, and that has to include shaping the future of AI.” And I like to read that quotation whenever I talk to audiences about this issue, because I don’t think I could say it better. But it also talks about the moral imperative here, right, and taking actual lived experience and making that a part of the reality and the reflection, and by the way, the regurgitation that these models do for us. And so, um, anyway, I’ve always appreciated that comment by Hans.
KARAE LISLE: Yeah. It’s beautiful.
MIKE BUCKLEY: Um, so when we think about building better AI, I have six components on the slide here, and I’ll kind of go over them. But one is this notion of having real data, right? And it’s a huge component of making AI better. A second is that we really need to be consistent and forthright and aggressive about gathering preference data from blind evaluators of models regularly. And I give some credit to OpenAI here, um, because they allowed us to bring 19,000 beta testers to the initial launch of GPT-4. Um, Google and Gemini are doing some work here with other companies. Um, and so we really do need to make sure that we’re listening and getting model makers to listen to us.
Um, so one is real data. Two is making sure that we get preference data from blind users. The third is really providing some anti-ableism training, right, for the people who are involved in the development of these models on the engineering side, the computer vision side, all of the above. Um, fourth is implementing uncertainty labeling. That’s the thing we were talking about a minute or two ago, where the model acknowledges, “Hey, I can’t really tell right now,” rather than giving a very confident false answer. So that labeling and that uncertainty acknowledgment is a huge component and something that really requires a lot of improvement.
Um, the fifth is robustness of OCR, optical character recognition, right? Um, Seeing AI has been a leader here. Many of the models do a pretty good job on OCR, but as we charge headlong into a future of wearables, near real-time data, lots more imperfect lighting, spacing conditions, um, we have to make sure that OCR functionality and AI performs well in a multitude of environments, right? And so that’s gonna require testing. Um, I know you talked with Joe Devon about some of the work they’re doing on AI coding and making sure that the HTML is up to speed with respect to accessibility and assigning scores. We ought to be able to do the same thing on building scorecards for optical character recognition and uncertainty in all these models.
And then, um, the last thing, and this is something that we’re gonna ask model makers to invest in and something we’re gonna invest in, is the number six point on this slide, which is lowering what we call the prompt literacy barrier, right? Knowing how to ask the AI what you need and using specific language and specific cues is going to be a part of this battle, right? For example, sometimes you don’t need to hear about 72 items in the picture, including the pretty flowers over someone’s shoulder. You just need what the menu says, right? Or what the instructions say, or what your airport gate is. And so helping the community understand how to prompt and ask questions to get specifically what you want from AI is a really important component. And, by the way, Be My Eyes operates in over 180 countries and over 180 languages, so this is not something where just, you know, users in the US or UK and Europe need to be able to do this. This is training that has to happen globally so that, you know, a blind or low vision consumer in rural India can figure out how to get the information she needs just as well as someone sitting in Chicago.
Um, and so these are the components of building better AI, but arguably the most important here is getting a better dataset into these models. Um, I wanna talk a little bit about… The slide just changed. To talk a little bit about the partnership between Be My Eyes and Microsoft that we announced at the end of 2024. And this is something where we went out to the community and did a lot of listening, um, to talk about how we could thoughtfully, meaningfully, transparently incorporate data into models. And so, um, this collaboration with Microsoft is using Be My Eyes’ proprietary blind and low vision datasets to help better inform models. Um, and it’s part of something Jenny Lay-Flurrie and Microsoft have talked about a lot: bridging the disability data desert, which is what Jenny, and, uh, other scholars, have called it. Um, and making sure that this is done in a privacy-centric manner. And we have very strict limitations on how data can be used, and so there’s no personal information in the metadata and nobody’s allowed to market on this.
But this was the first partnership of its kind to address the disability data desert. And I really wanna give a shout-out to Microsoft for thoughtfully partnering with us on this. Um, Be My Eyes will be partnering with more model trainers in the near future. Um, unfortunately nothing to announce today, but we will probably have a number of announcements with other foundation model makers and others on using these datasets to address this very challenge. But I did want to give a public shout-out to Microsoft. Um, I know Saqib is at the conference too, and I’m a huge fan of his. Um, so thanks to all the team at Microsoft for working with us, not only on doing this, but also on highlighting the problem.
Um, I guess I’d just say we’re kind of at a crossroads right now, and this is where I’ll end up is, um, I think if we allow models to kind of go in the direction that they’re going right now and training on mostly clean data or more clean data, I think we’re gonna have a problem. I think we’re gonna replicate the problems that we saw on internet accessibility, only they’re gonna be more acute for the members of our community. And so we’ve gotta attack this, and we’ve got to take the path of more inclusive models. And I promise you, this is something that Be My Eyes is literally working on every day. We are making calls and emails and doing anything, everything we can to persuade, cajole. We have not started threatening yet, but maybe we need to think about that. (laughs) Um, because it’s a societal level imperative. It’s a moral imperative as Hans says. And so, um, I’m always happy to answer questions. Anybody can feel free to kind of email me at Be My Eyes. It’s mike@bemyeyes.com. Um, I sometimes get 200 emails when I do that, but I always commit to answer them all. And, um, Karae, I know you probably have some questions. Again, thank you for having me here to talk about something I’m truly quite passionate about.
KARAE LISLE: Yeah. Thank you. And I think you laid out the problem and some of the steps that you’re taking in a really clear manner, and I was, you know, I think our audience is following you really well. One thing I think that might help at this point is maybe to give an example. So, um, I’ll tell you more specifically. So, um, Microsoft Copilot is looking at some data from Be My Eyes. What might that dataset or that use case be, just to give us an example of how Copilot’s learning?
MIKE BUCKLEY: So it might be kind of a video, literally, of someone asking about the expiration date on something, or is this the correct airport gate, right? And the reason this data is powerful is because a lot of it has to do with the conversation between two humans that are being kind to each other to solve a problem, right? Um, is this airport gate A6? And then a volunteer or a trained agent or another human being says, “Oh, can you lift your camera up a little higher? Can you pan to the right?” Right? It’s actually giving the information to the AI that says, “I don’t have the answer in front of me, and so I need to do some sort of action, either with my hand or something else,” and that can make these systems better performing across the board. So that’s kind of one example.
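[Editor's note: the kind of human-assisted session Mike describes can be pictured as a structured training record. The sketch below is entirely hypothetical; the field names and values are invented for illustration and are not Be My Eyes' actual data schema.]

```python
# Hypothetical sketch of one human-assisted session as a training record.
# The value for a model lies in the corrective turns, where a volunteer asks
# the user to adjust the camera instead of guessing from a bad frame.

session_record = {
    "task": "confirm airport gate",
    "turns": [
        {"speaker": "user", "text": "Is this airport gate A6?"},
        {"speaker": "volunteer", "text": "Can you lift your camera up a little higher?"},
        {"speaker": "user", "action": "raised camera"},
        {"speaker": "volunteer", "text": "Yes, the sign says A6."},
    ],
    "resolution": "solved",
}

# Pull out the turns where the helper requested a camera adjustment; these
# teach a model to ask for a better view rather than answer confidently.
corrective_turns = [
    t for t in session_record["turns"]
    if t["speaker"] == "volunteer" and "camera" in t.get("text", "")
]
print(len(corrective_turns))
```

[The point of a record like this is that the dialogue itself, not just the image, carries the signal: when the answer is not visible, the right move is a repositioning request.]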
KARAE LISLE: I think that’s a really good example. So almost, uh, if you will, framing the right object, which is very, very difficult to do, um, even for a sighted person with a regular camera, let alone without sight. So I can understand that one. And then what about an example where the medicine bottle has an expiration date? You said that AI can confidently falsely identify or falsely label it. Are we looking, do you think, for the AI to be smart enough to say, “I can’t read this. Please don’t take it”? Because we don’t want the wrong data, but what’s the alternative to that?
MIKE BUCKLEY: I think that’s that training to be able to acknowledge uncertainty that we talked about, right? Like, first of all, we always say, like, never trust AI with, like, anything life or death, medical information, allergens, things like that. It’s not worth it, right? So connect with a Be My Eyes volunteer. If you have the means, call an Aira agent or another professional agent, call a family member, et cetera. And so I don’t ever want anybody making kind of the true kind of safety decisions based on this data, based on these models, where they are today. I think, so two things. I think we want to get the models to a place where they acknowledge uncertainty, and then we also wanna get them to a place where they’re much less likely to be wrong.
And thankfully, what we’ve seen over the last couple of years is that hallucinations, you know, which is AI industry parlance for, like, confident errors, um, have dropped a lot, right? The models are hallucinating a lot less, but they’re still doing it, right? Which is why every AI session on Be My Eyes gives you the opportunity to roll over to a human volunteer if you ever wanna verify anything. And so I think that’s the short-term solution, and then the mid-term solution is better uncertainty labeling and then better models.
KARAE LISLE: Yeah. I think that’s right. And I think there can be dialogue that gets more specific as well, right? Um, and I think we see that with the telephone models, right? So there’s a bit of AI when you’re doing your phone call to your bank, and it says, “I didn’t understand that,” right? Or, you know, “Please re-enter your numbers,” or “Try again.” So there are, like, phrases that we know, and I, I guess that would probably bubble up as well with AI.
MIKE BUCKLEY: Yeah. And, and for sure, and like you raised a pet peeve of mine, by the way, so I’m gonna talk about it. But, like, most AI systems won’t let you read, like, the numbers on your credit card, right? And things like that. And I understand why, right? And it’s really important, but, like, that’s something that needs to be addressed. And not everybody has the means or the global scale to call a professional agent or even a volunteer, right? You know, not everybody is on Be My Eyes or Aira or Seeing AI or whatever it is, you know. And so we have to figure out how to put this power and make these models better in those types of situations.
KARAE LISLE: So let me pivot for a minute, because I love the path that we’re on. But now I can hear the naysayers a little bit and I wanna make sure they get heard, which is, “Oh my God, you’ve got all this data. How are you protecting the privacy of the data, um, and the confidentiality of someone who might have shown their medicine and their name or something like that?” I know you all are thinking about it. I just wanted a chance to talk that through.
MIKE BUCKLEY: Such a good question, and obviously we and Microsoft are very, like, hyper-concerned about that issue. Like, the first thing is we strip anything personal from the metadata, right? So that gets taken out. The second is we have pretty sophisticated software that can identify the things you’re talking about. So if there is a medication label, right? Um, we won’t include that for now, until there’s a better way to do that. Um, we can work on better lighting conditions and better OCR, which will improve medication reading. So we strip out personal information from the metadata, and we screen anything that we share for anything personal. Everything from, like, license plates and names and addresses and medical information, all of that is pulled out, right? So that’s not a part of the training.
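[Editor's note: the two-step screening Mike describes, stripping personal fields from metadata and excluding sensitive content categories, can be sketched as below. The field names, patterns, and categories are invented for illustration; this is not Be My Eyes' actual pipeline.]

```python
import re

# Hypothetical sketch of a two-step screening pass over a shared sample:
# 1) drop personal fields from the metadata, and
# 2) flag content categories that are excluded from training entirely.
# All names and patterns here are illustrative assumptions.

PERSONAL_FIELDS = {"user_id", "gps_location", "device_serial", "phone_number"}
EXCLUDED_CONTENT = re.compile(
    r"\b(medication|prescription|license plate|home address)\b", re.I
)

def screen_sample(metadata, description):
    """Return (cleaned_metadata, include_in_training)."""
    cleaned = {k: v for k, v in metadata.items() if k not in PERSONAL_FIELDS}
    include = not EXCLUDED_CONTENT.search(description)
    return cleaned, include

meta = {"timestamp": "2024-11-02", "user_id": "u123", "gps_location": "37.4,-122.1"}
cleaned, include = screen_sample(meta, "a prescription bottle on a counter")
print(cleaned)
print(include)
```

[Here the personal fields are dropped and the sample is excluded from training because the description mentions a prescription, matching the "won't include that for now" policy described above.]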
Um, the other thing that we did is as part of our Microsoft announcement, we announced principles, right? One is on transparency and disclosure, meaning we’re not only gonna tell you what’s going on and when we’re doing this, we’re also going to name names on our partners, right? In terms of who we’re working with. Um, the second is stripping out the metadata. The third, and this is really important, is we prohibit any partners of ours from using the data in any way that’s unrelated to training. So there’s no weird marketing crap, or anything that goes on, and there’s no sharing the data with external third parties or anything like that. So there are really quite strict limitations.
Um, the fourth is that the blind members of the Be My Eyes board of directors have veto power over any decision that we make here or any kind of partnership that we want to do. And I think, um, we also announced enhanced protections for the people using our service, which are things like opt-outs, which we never had before. So anybody who wants to can opt out. But we also made a commitment that we wouldn’t share Be My AI results either. So we want Be My AI to be all about how you want to use it.
KARAE LISLE: Okay. So it’s mainly the human to human conversations that you’re pulling.
MIKE BUCKLEY: Right. And this is where the market’s going, right? Like you interacting probably with an AI system using your voice. And that’s kind of what we have to get better on, that as well as the visual descriptions of all sorts of stuff.
KARAE LISLE: Great. Okay. Um, any other compliance issues that you’re working through that you wanna tell the audience about so that the anxiety continues to go down about using AI?
MIKE BUCKLEY: You know, um, well, I don’t think I want anxiety to be huge on using AI, but I do think you should be an informed consumer and have a dose of skepticism, right? And a bit of self-protective instincts, right? I think it’s a beautiful thing, like if you’re at a restaurant and you get all the menu choices, that’s really cool, right? Are you always gonna trust it to identify an allergen? I think I would pretty much double check with the waiter or the server on some capacity on that. And so, um, I think we want a slightly skeptical consumer, um, but I think that the power and the transformative ability for independence and power for the blind or low vision consumer, I think it’s important to experiment with and use these tools in ways that work for you and your environment. Trust but verify when it’s meaningful.
KARAE LISLE: Yeah. I like that. And then, um, we’re getting close to the end of our session today. We’ve been talking, and your passion is palpable about what is needed. And when you’re talking about it, you’re saying it in a way that suggests you’re worried it might not happen, right? You’re at that level of energy and that level of convincing, and I love that. But if you lift your chin from that for a minute and look out over the horizon… I think we’re very optimistic at Vista Center, and from listening to all the other speakers at the conference, we’ve got about 50 speakers this year. Are you optimistic? I’m kind of baiting the question a bit, and I’m not trying to, but what is your view of how AI will benefit the community in the future?
MIKE BUCKLEY: I’d call it cautious optimism, but here’s the reason I’m optimistic: there’s a tremendous economic incentive for these model makers to do this, right? I mean, if you think about the internet, most companies screwed this up, and by the way, we all did, including companies that I worked at at the time, right? Um, I don’t think we initially saw where the economic problem was with an inaccessible website, right? We didn’t feel it. That may be because disability numbers weren’t seen as big enough. It may be because of the sobering economic stats related to the blind and low vision community, about 70% of people being underemployed or unemployed. It may be a host of those things. But there’s a crazy race occurring right now in AI. Once these model makers see that blind and low vision data can improve the model for everybody, that’s pretty cool. So do I think that some of these people are doing it out of the goodness of their heart and for moral reasons? Yes, right? The work and the headaches we took on with Meta, working with them on the glasses and doing six months of engineering work together, happened because Meta wanted these things to be accessible, right? Same thing with our work with Microsoft. Same thing with the beta testers and OpenAI.
Um, but longer term, the longer tail of model makers and AI generally, if you want your model to be awesome and solve more of humanity’s needs, this data can help you do that.
KARAE LISLE: Well, it can become a differentiator in the next couple years, and then it becomes a must have.
MIKE BUCKLEY: Totally. And that’s when we know our mission, you and I, and the leaders, the accessibility leaders have really accomplished it, is when it’s ubiquitous, right? Um, and we all benefit from it. And, you know, the curb cut model is proven once again.
KARAE LISLE: So I mean, you know, you’re hired on our sales team, right? Like, you know, look, FOMO, I hope we can get some FOMO going.
MIKE BUCKLEY: Yeah. Yeah. Right? For sure. Um, because look, companies aren’t bad people, but companies do act out of their economic interests. That’s just reality, right? And so we need to make our argument part of their economic interests, and I think we’re on the verge of successfully doing that. Now we just gotta do more, faster, better. (laughs)
KARAE LISLE: Yeah, exactly. Always, always. Well, Sight Tech Global is one of the pillars that we have at Vista Center to kind of keep holding that up. And we appreciate that you’ve got a couple of pillars in your court as well. We loved this session so much. Um, we love your passion. Be My Eyes is a terrific company, and we were so excited to honor Hans. Um, and, you know, to have you all participate. And so we’re finishing up with this session. Thank you, Mike Buckley, CEO of Be My Eyes, and we appreciate so much everything that your company is doing for the blind and low vision community.
MIKE BUCKLEY: Thank you so much for having me, and thank you for all the work that you do every day in this world.
KARAE LISLE: Thank you.
[MUSIC PLAYING]
