AI gets complicated: emerging debates and surprising challenges

DESCRIPTION

For all the remarkable advantages AI has brought to accessibility, there’s always been a backbeat of issues, notably around certain kinds of bias against people with disabilities. Now that AI is infiltrating more and more day-to-day experiences and generative AI is taking wing, the expanse of issues for blind people is growing fast. In some cases, it’s all about advocating for technologies like autonomous taxis (think Waymo) or facial recognition that present big advantages but are opposed by other interests in the name of privacy or public safety; in other cases, the challenge is making sure emerging generative AIs take into account the worlds of eBraille and the always-emerging language of the community.

Speakers
- Bryan Bashin, Vice Chair, Be My Eyes
- Dr. Sheri Wells-Jensen, Associate Professor, Bowling Green State University
- Chancey Fleet, Assistive Technology Coordinator, New York Public Library
- Moderator: Merve Hickok, Founder, AIethicist.org
SESSION TRANSCRIPT

[MUSIC PLAYING]

MERVE HICKOK: Hello, everyone. This is Merve Hickok, founder of AIethicist.org and President of the Center for AI and Digital Policy. It is a great pleasure and honor to be back at Sight Tech this year. I still have conversations in my head from what we discussed last year and some of the actual partnerships that grew out of last year’s conversations. Today I’m joined by three amazing panelists that I would like to introduce. I’ll mention their names and ask them to introduce themselves, because I know I’m not going to be able to do justice to their amazing work. And then we’re going to talk about AI and generative AI systems, and all the controversies, the conflicts, and what we want to do, how we want to improve these systems. Without further ado, let me go to our first panelist, Chancey Fleet. Chancey, could you give us a bit of your background, please?

CHANCEY FLEET: Absolutely. I am a blind technologist, and I’m the Assistive Technology Coordinator at New York Public Library, where I curate group workshops about various tech topics, including the mindful use of generative AI. And I run our Dimensions Lab, where anybody can come in and get the space, equipment, skill building, and community that they need to create tactile graphics, images, and 3D models.

MERVE HICKOK: And I’m so fascinated by those conversations, and I have some very specific questions toward that end as well. Next, we have Bryan Bashin. Bryan, please go.

BRYAN BASHIN: Thanks, Merve. I’m Bryan Bashin. And I’m, for this purpose, Vice Chair of Be My Eyes, which is an app that connects 7 million volunteers with half a million blind people in 120 countries. Six months ago, we married AI to that. And now blind people have the chance to use that, snap pictures, get descriptions. And it has entered us into a very interesting world that we’ll talk about shortly.

MERVE HICKOK: I’m looking forward to it. And last but not least, Dr. Sheri Wells-Jensen.

SHERI WELLS-JENSEN: Thank you so much. I’m so glad to be here. I am a blind Associate Professor of Linguistics at Bowling Green State University in Ohio. And my research and volunteer activities are centered around access to outer space for people with disabilities. And so I am an amateur but excited, and hopefully a little bit linguistically informed, user of all the good AI stuff that’s out there.

MERVE HICKOK: Which is becoming a more and more important conversation in the world of generative AI systems and large language models as well. Thank you for making me part of this conversation. I really look forward to the insights. Without further ado, let me do a quick introduction to set the stage for our session, for our audience. As a society, we’ve been using AI systems for several years now, whether it is by our own choice, where we want to benefit from them and make the decision to use them, or by being subject to systems such as surveillance systems, job applications, and social media content moderation, just to name a few. We experience the implications of these systems, both benefits and negative consequences. Generative AI systems such as ChatGPT, DALL-E, Midjourney, et cetera, bring a next wave of challenges and opportunities, where this technology can be used both in the public as well as the private sphere. Our emphasis in this short time we have today is on specific real-world applications and the twists and turns. AI frontiers are very dynamic and very much open to the influence of the community. And if we as a community understand what’s happening, learn to identify and articulate the community’s interests, and shape the outcomes in whatever forums, that is our way to shape our future and how we benefit from these systems as well. So let’s start with generative systems’ use of language and images. And I would like to put this question to all of our panelists as a starter. With AI-powered language systems, whether we use them in speech recognition, voice assistants, content moderation, et cetera, or translation systems, we’re already suffering from omission or underrepresentation of languages. Many of the world’s languages are not resourced, or are under-resourced or underrepresented in these datasets, not as much as English is. So my question to all of our panelists, and feel free to start as you wish, maybe with Chancey first: What do you see as some of the risks of omission from generative AI of identities, of languages and systems?

CHANCEY FLEET: We already live on a planet where English is super dominant. And that has impacts for people who are attempting to engage with literacy and literature, and for scholars and students who need to communicate in English because English is so dominant in those spaces. And we have an opportunity with large language models to redress some of those communication barriers, to lift up other languages, and to make sure that content that’s created in other languages can be understood in English and vice versa. It’s possible to build a bridge. But it’s also possible to build a moat, because what we get out of these systems is what we put in. And if we don’t put in the corpus of training data that is available from other languages, then the dominance of English is just going to be replicated. And the existing power imbalances are just going to play out again in the same old way in this new context. I think it takes intentionality and mindfulness on the part of folks that curate what goes into large language models. But I also think that it takes a conscious endeavor to engage communities that have historically been at the periphery or disengaged, and to hear from communities of non-native English speakers about how to construct a more linguistically just future with large language models, and what specific steps need to be taken. We need to be listening to the voices of people who aren’t represented in virtual rooms like this one.

MERVE HICKOK: Thanks so much, Chancey. Dr. Sheri, please go ahead.

SHERI WELLS-JENSEN: Yeah, sorry. I was just going to throw my linguist hat on here. So we’ve got 7,000 languages spoken on Earth today, which is fewer than we used to have. It’s hard to track back and know how many languages there were 1,000 years ago. But that’s what we’ve got now. But we do recognize the trend that languages are literally dying quickly. So we’ve got 7,000 languages spoken now. If we continue the way we’re continuing, if the curve keeps going the way the curve is going, we will lose 90% of those languages within the next century. The conservative estimate is that we will lose 50% of them. But most linguists go with 90%. So we’re about to go from 7,000 languages to 700 languages. So Chancey is exactly right that this is a huge problem, and it’s a loss of the history of human cognition and human linguistic ability. We’re not just disadvantaging people, which is terrible, but we’re also losing track. We will be losing track, as we lose these languages, of the knowledge that’s stored in those languages. Not just of ways of thinking, but sometimes literally names of species and what those species were good for, and folk medicine. If you need a practical reason, that can be one of them, right? So if I live on the Solomon Islands, and my mother’s mother used to treat a fever with a particular kind of plant, and we no longer know the word for that plant or even that the plant exists, as both species and linguistic diversity decline, that’s a measurable, tangible loss for humankind. We’ll also lose the ideas of what the human mind can do and how the human mind can link language and thought as we lose more linguistic diversity. So when I send my students out to find languages on the web, just again, like Chancey said, they’re not finding them all. And this isn’t just a matter of, hey, get on the ball and get your language on the web, my friends. It’s an economic problem as well. So why is there not a full and complete sketch of the grammar of Oroha, a language spoken on the Solomon Islands, and a dictionary of that language online? Well, there are only about 30 speakers left of that language. And none of those people, unfortunately, has the economic force to cause all that work to happen. They simply can’t pay for it. And linguists don’t have freedom from other work to get that job done. So there are a lot of things that we lose in terms of human potential when we begin to isolate ourselves down from 7,000 to 700 languages.

MERVE HICKOK: You’re saying we’re losing not only linguistic diversity, but our collective intelligence as humanity?

SHERI WELLS-JENSEN: Yes.

MERVE HICKOK: Our collective heritage as well?

SHERI WELLS-JENSEN: Right. What can a mind do? How can minds make language? We’ve got some idea now of what the diversity of human languages is like. But we’re not even through doing what we have. And before we get them all studied, they’re disappearing at a rate of, if you do the math, something like a language every two weeks, which is heartbreaking.

MERVE HICKOK: It is heartbreaking, indeed. Bryan, I would like to get your insights about what you see as additional risks.

BRYAN BASHIN: Yes. Well, empirically, with Be My Eyes, we were learning a few things. One of them is it’s relatively easy to get the output from AI into many languages. In fact, right now if you use Be My AI, you can take a picture and get the output in, I don’t know, four or five dozen languages. That’s not the 7,000 that Sheri was talking about. But I think that the ability to translate into hundreds of languages is soon upon us. That’s not all of the problem, though. The real problem is how do you get the AI engine to understand us. We’re all three on this panel blind people. When we go up and start talking to anybody on the street, they will know we’re blind. And there’ll be a modification of language when they start talking to us. Sometimes that’s terrible. Other times, it’s very useful. So when I seek information from an AI agent, I want to know, for instance, not just a broad description of something. I might want to know, “Shoreline that building until you get to the opening and turn right.” I want to be able to say to it, upon my direction, “I’m a blind person. I need different kinds of instructions and cues. And some things, I don’t particularly want to prioritize.” Right now the engine doesn’t know who it’s talking to. But as we start becoming a world full of diversity and difference, the individual’s ability to assert who they are, and then to modify what the AI is doing, is really important. Imagine you’re a low-vision person. You don’t need the big things described in a picture. You need the small things described first. And so that modification is a new language of dialogue that’s yet to be developed.

MERVE HICKOK: I would love to come back to the points that all three of you raised in different questions. Bryan, just to your point: we, as users of the current technology, don’t have much agency or control over these technologies or over what we might need. It is usually what a developer or a company thinks a user might need, which tends to be a wide gap, and continues to be a wide gap. But following on the issues of omission, and how we could benefit if these languages were actually included and we had more control: preparing for this panel, we had conversations about Braille. Although it is not a language per se but a code, we mentioned that Braille suffers from the same issues of omission. What do you think could be some of the benefits of linking Braille to a large language model such as ChatGPT? And Chancey, maybe I can start with you again.

CHANCEY FLEET: Sure, absolutely. So I think the benefits are pretty obvious. And then the problem is more interesting than it seems at first blush, and more complex. So clearly, there is a problem in our community with on-demand access to Braille. And that might include Braille that’s properly formatted for different presentation types, including the crop of Braille, capable Braille and graphics displays that you might hear about other places during Sight Tech Global this week. And if human labor can be reduced and materials on demand can be prepared in an appropriate format and with appropriate markup, clearly, that’s a benefit to literacy in the community. But then if we go deeper, there are things about Braille and about non-visual experience that are not, to my mind, well-represented enough on the web to have all made it into large language models. So it’s a sign of trouble that large language models tend to struggle with Braille. Although they know dozens and dozens of languages, they can’t go ahead and put a simple Braille label on a simple pie chart. But if you go a little bit deeper, there’s a lot that we know about constructing tactile graphics that also doesn’t seem like it’s made it into the large language models. So, for example, we can’t semantically use color to indicate anything with tactile graphics. And so if we’ve got a pie chart, we may need to show different textures on that pie chart instead. Maybe we’re going to use a waffle grid, and a dot gradient, and blank space, and vertical lines. And there we have four wedges in a pie chart. Or maybe we’re going to go the minimalist route, and simply Braille label the four otherwise empty wedges of a pie chart and communicate just as clearly, if not more so. So for large language models to be of maximal benefit to us, we need them to understand what is aesthetically pleasing, semantically appropriate, and legible from the standpoint of a tactile graphics user. And we’ve got a long way to go. I won’t tell you which large language model told me this. But I was asking about some tips for preparing Braille embedded in tactile graphics a few weeks ago. And one of the tips that I got was that when you’re including Braille, you should always make sure to use a bold dark ink.

SHERI WELLS-JENSEN: Nice.

MERVE HICKOK: There you go. We have the recommendation. And I’ll share with you later, after this panel: I was playing around with Bing ahead of this panel this morning. And I got some rather concerning representations when I asked how a blind person uses generative AI systems. The response I got was a white guy wearing black glasses touching the monitor screen with a cane.

SHERI WELLS-JENSEN: Wow.

MERVE HICKOK: And I’ll share the image later. But amazing. On the monitor was an eye as well. So I don’t know what the logic behind it is.

CHANCEY FLEET: The LLM does seem to have a really firm grasp on the fears and stereotypes that people who are sighted have deep in their hearts.

BRYAN BASHIN: Well, it gets to a very important point, which is these are probabilistic models. And in a probabilistic world, we are 1/3 of 1% of the population. We don’t exist. And so it’s not surprising, Merve, that you found such weird tropes. And the question is then, how do you tune it? How do you train it? Because the average person on the street is not the guide that I want as people understand my needs and my community’s needs.

MERVE HICKOK: Absolutely. And obviously, generative AI systems and large language models are not the first time we’re experiencing underrepresentation, or harms, or needs not being reflected in the design. Speaking of some of the biases or underrepresentation, we also talked about the dangers of linguistic drift. Sheri, maybe you can start us on this conversation. We talked about the sanitization of disability language through the systems’ avoidance of identity-first and liberatory language. And I’ll go over to the other panelists as well. But can you expand upon how this manifests itself in generative AI systems? Again, what are the risks for this community?

SHERI WELLS-JENSEN: I mean, I think one of the things that gets baked in pretty early on is this desire to not offend without understanding what offense is. So, for example, often when I go someplace and I need help, I might hear one person call across the shop to another person, “Get someone to help this person over here because she’s, because she’s, because she is special and needs some help,” instead of just using the word blind. The word blind is labeled by some people as offensive, when it is just the adjective that describes the inability to see. And I worry a lot about the ease with which avoiding even mentioning blindness can spread. And I worry a lot about things like, if I refer to myself as a blind person, I am often corrected. What I get back from ChatGPT is that I’m a person who is blind. And it’s just not quite capable. I can’t get it to say a blind person without carrying on at some length about my pencils and my cup. So I think there are concerns about how we want to refer to ourselves. Because you think about how most of what’s in there is Mr. Magoo and other representations of fictional blind people. And then maybe some of it’s about real blind people. And then my individual preferences as a human being, which I would like to spread for the benefit of other human beings, aren’t quite as present. I think Chancey had a lot of really cool things to say in our pre-talk about the world that ChatGPT represents to us when we ask it questions.

MERVE HICKOK: Yeah, absolutely. Before I go to you, Chancey, the thing is, again, none of these are solely generative AI-related problems, right? So we had similar things with social media, and content moderation, and auto-flagging of toxic language, and what social media platforms consider as toxic when you use identity-first or liberatory language. We see similar censorship and avoidance in large language models as well. But to your point, Chancey, how do generative AI systems describe us or paint the world for us?

CHANCEY FLEET: That’s a great question. I mean, when I ask something about blind people, and I get back a response about visually impaired people, it’s irritating to me. But it doesn’t affect me profoundly because I have blind pride stockpiled. What I’m the most worried about is the impact on people who are coming to learn as newly blind or low-vision folks, or as folks who imagine that they want to be allies, and the anodyne, Hallmark-ready version of disability that LLMs in their current instantiation are going to serve. And I have great hope that a different representation and discovery journey is possible with large language models. But here’s where we are right now. I’m very unlikely to be offered the term blind to use. I’m very likely to be offered the term low-vision or visually impaired. I will be offered person-first language. I will be offered sometimes solid enough information about how a blind person would go about something non-visually. But I’m unlikely to discover anything truly radical. I’m unlikely to be served, for instance, the reasons why we have a widespread Braille literacy and image literacy crisis and a tactile graphics gap. Because current LLMs don’t have the depth of understanding about power imbalances and systemic inequities and how those don’t square up with our inherent aptitudes. There’s some wisdom in the community that, to my mind, barely makes it online and often doesn’t make it into the training data. So I am concerned that our representation in large language models, while not overtly offensive, is going to be mediatised, sanitized, and diluted to a point that very seldom does someone who comes looking for information within a conversation with a large language model come away with a call to action or a liberatory idea of what’s possible for our community. And that, I think, needs fixing.

MERVE HICKOK: And on that, thank you so much for that. And I want to shift to the technology itself, but maybe continue with the same question with Bryan. If, instead of just large language models, we focus on facial description technology, for example, are the descriptions that we get what a blind user might want or need when using facial description technology? Or is that also predefined for the users?

BRYAN BASHIN: Well, speaking for the universe of blind people, I can just maybe say that more is better. I want lots of description. I want the same ability to understand the physical appearance of people that a sighted person standing in the room has. Now, of course, we’re in the middle of a legal battle about identification. But the description is something we want. Fraught with that are the dangers of bias: how you describe, what you describe. And guess what? We live in a country where there’s no consensus on that. No consensus on what’s PC. You move to another state, and it’s entirely different expectations. Same in disability. So what I’m thinking about in the future is, OK, it’s great that in these first few months since March 14, we’ve had access to these models, and that they can do what they do. Now we’ve got to think about the interface of the future, which should bend to the desires of the individual. Just as in some early access technology, you could say, oh, I want this described in feet, or in meters, or in clock face for directions, or whatever it is. I want the user ultimately to be able to prescribe all kinds of settings and preferences for where it wants its eye to focus, how much, and where do you align. This is all inchoate right now. But it’s going to be how I think everybody interfaces with AI technology. Because nobody wants to live in that 1950s Walt Disney World where everything is boring and perfect, and the descriptors that add beauty to the universe are not there. So that’s the future, I think.

MERVE HICKOK: Looking forward to that future where there is a driver’s seat and not just a passenger seat with this technology. I want to be mindful of our time. I want to ask one question again, focusing on facial recognition, but keeping it as a broader question, and go to all three of you very shortly. In an era, in a time where AI is getting better, and we experience more deployments of this technology, facial recognition technology, in our daily lives, what would you like to see done better in the design or deployment? We mentioned some of those, obviously: representation and the ability to control the outputs. But what would be, for your personal use, the better changes in the design and deployment?

SHERI WELLS-JENSEN: So I guess I could jump in. I mean, I agree with Bryan that I want all the detail. But I also want a decision made. When I take a picture of people in my family, it describes them. But I don’t know that that’s really them. It will describe them, but I think, well, I don’t know what that means. And then it describes me. It can’t use enough specificity so that I recognize even a picture of myself. It has to tell me. And it has to tell me with some degree of what its probability rating is. Because that’s what sighted people get, right? Oh, I think that was my neighbor on the street. But I’m not 100% sure. So it’d be nice if I could get a judgment. But then I think it’s essential that judgment be coupled with some, what is my fudge factor, how sure was I. Because again, that’s what sighted people get. And that’s what’s useful: yeah, I think that’s what it is, and how sure are you.

MERVE HICKOK: Thank you for that. Chancey, what about you?

CHANCEY FLEET: Speaking frankly, I am curious about facial recognition and am delighted by it in instances when it’s been available. Perhaps I notice an actor in a meme. Great, I’ve enjoyed myself. It is way down my list of priorities. I would prefer to have facial description. For example, for self-description. Somebody took a photo of me. Am I going to quote-unquote decide that I like it? Facial description helps me in that way. Facial recognition, on the other hand, offers me an equity of access at the same time that it potentially interacts with other people’s freedoms in a really different way. We all have to worry about the potential acceleration of facial recognition for the aims of surveillance and tracking. And I’m not at all sure yet how to navigate the tension between my quote-unquote “right to access” and the right of other folks to go unnoticed or forgotten in the Digital Commons if that is what they wish. And I think that we in the accessibility world and ethicists in the social justice world need to find common cause and keep having the hard conversations because it’s not an easy question. And the best possible answer will only come if we continue to dialogue really robustly and work through what all the implications are. It’s my belief that we’re only at the very beginning of understanding everything that we stand to gain and stand to lose.

MERVE HICKOK: Thanks so much for that. And nothing is black and white. And to your point, the conversations and the decisions are, I think, going to make it or break it. Or as you put it at the very beginning, whether this technology will be a bridge or a moat in our lives and our access to resources. With that, I know we’re two minutes behind schedule, past our time. Bryan, very quickly, before we finish, any closing remarks from you?

BRYAN BASHIN: Oh, well, on this business of the facial stuff, which I think people on the outside of blindness think is more important maybe than sometimes we do. I do think that the standard of who’s a public figure, those people are public figures. I want to know them. I want to be able to call them out in my social feeds and elsewhere. And then an emerging solution is that for people in my life, people in Sheri and Chancey’s lives, you can store this information on your local device, and it would only be available to you, and not the cloud. And so the people who are in your life that you care about can easily be identified in photographs and other things. But as a global, social, cloud, surveillance network, I don’t think anybody is pressing for that. Right now there are too many people with too much libertarian and other aspirations to make that something we’re likely to do, in Western culture at least. But we’re on the brink of this brave new world where this technology can be used for the most amazing of humanistic things, to enrich our lives, to ennoble our lives, to integrate and all of that. But we’ve got to get it right. And the main thing is that the people who think about AI have to know that we exist, and we matter, and we should be at that digital table.

MERVE HICKOK: I could not summarize and finish better. So I’m not even going to try. Thank you so much for your insights and for your closing remarks. And thank you, Chancey and Dr. Sheri, as well for your remarks. I appreciate Sight Tech providing us the opportunity to have this discussion today. Take care, everyone.

CHANCEY FLEET: Thank you.

BRYAN BASHIN: Thank you.

[MUSIC PLAYING]
