Full Transcript
Jagadish:
The real inspiration came to me when I met my visually impaired friend. She was explaining the challenges she faces daily, and I was struck by this irony that, you know, I’d been teaching robots for years how to see things.
Jeff:
Please welcome Intel’s Director of Technology Advocacy and AI for Good, Hema Chamraj, and Artificial Intelligence engineer and developer, Jagadish Mahendran.
Jagadish:
The system has an AI system that is housed in a backpack, and the cameras are concealed in the vest and the fanny pack.
Hema:
You’re talking about a thumbnail-sized chip, right? It’s a very tiny chip, and that is enabling this AI function to happen inside the device, or in this case inside the camera. The small form factor, the low weight, the low power, the low cost, all of these are coming together with the newest AI developments, and it’s only limited by the imagination and dreams of innovative developers like Jagadish and his team.
Jeff:
It is exciting when Intel announces an AI-powered backpack that can help the visually impaired navigate and perceive the world around them.
Hema:
You can’t send everything back into the cloud or into somewhere else to process, it has to happen immediately, right? And more and more we’re able to do AI at the edge.
Jagadish:
The major accomplishment here is that using Intel’s products, we are able to do everything in real time, run all these complex models in real time, including the AI models and the depth processing, and provide information in real time, which is a huge and very valuable thing to do, especially for a project serving the visually impaired.
Jeff:
Now, please welcome Hema and Jagadish, taking artificial intelligence to the edge. Great, we’re all here. Why don’t we just start out with introductions? Hema, you want to go first?
Hema:
Sure thing. My first name is Hema, that’s easy to say, right? Hema, and my last name is Chamraj: Hema Chamraj. I’m the Director of Technology Advocacy and AI for Good at Intel, and we are committed to creating world-changing technology that enriches the lives of every person on earth. That’s something we take very seriously, and I have an awesome role working on projects that deliver real impact in people’s lives. So that’s my introduction, and I’ll hand it off to Jagadish.
Jagadish:
Hi, Jeff, very nice to meet you. Thanks for having me. My name is Jagadish Mahendran. I am an AI and perception engineer with a background in deep learning, computer vision, and robotics. I’m also a researcher at the Institute for AI, University of Georgia. For the past few years I’ve developed AI and perception systems for various kinds of robots, including inventory robots and kitchen robots. I am deeply honored that my project, an AI-based visual assistance system, won the grand prize at the OpenCV Spatial AI Competition 2020, the world’s largest competition in this area. It is an absolute pleasure to be part of this podcast. Thanks for having me.
Jeff:
Well, thank you very much for those wonderful introductions, and welcome to Blind Abilities. I’m Jeff Thompson, and with me in the studio, from the San Francisco Lighthouse Adaptations store, is Raqi Gomez. Thanks for being here, Raqi.
Raqi:
Hello, thank you.
Jeff:
Well, this is really something that we’re both really interested in. When I first heard about it, I was really excited, kind of like making a human into an autonomous car or making a human autonomous. But you even said that you’ve been waiting 10 years for this to come along, Raqi.
Raqi:
Oh my goodness, something like that. I’m just fangirling a little bit over here, only in that it’s such a wonderful thing to see you working on something like this, because I have long envisioned a 21st-century facelift, if you will, being given to travelers who are blind, who orient and travel in public spaces. With vehicles and all these other things, even my vacuum cleaner can navigate autonomously, so we can certainly improve the ways in which information is communicated to blind people. When Jeff mentioned this, I couldn’t sleep last night, I was so excited. So it’s really a pleasure to be here; I’m just anxious to learn more.
Jeff:
Well, thanks for being here, Raqi. Hema, how did Jagadish’s project, his initiative, come across to Intel? What is the connection of how you met Jagadish?
Hema:
Yeah, so in my role, right, I get to meet innovators like Jagadish. But in this context, we’ve been part of this organization called OpenCV. The beginnings of what is now called OpenCV actually started within Intel, back in 1998. An Intel employee had the idea that computer vision was going to be a very important technology that should be open sourced, so he started working on it and put a development team together. But since 2008 it has moved out of Intel and become something much, much larger, and like Jagadish was saying, it’s become one of the largest communities that comes together to develop innovative solutions around the tools and technologies from the OpenCV organization. So we collaborate closely with projects that happen in OpenCV, and last year, as part of OpenCV’s 20th anniversary celebration, Intel partnered with OpenCV, announced the OpenCV Spatial AI Competition, and sponsored it. Jagadish happens to be one of the winners of that competition, and that’s how I got to know him.
Jeff:
Jagadish, when I was reading about you, what you’ve accomplished through this, winning this competition, and the product, the initiative behind it is really interesting. Why don’t you unpack it for our listeners a little bit, tell us about your concept and where you see this going?
Jagadish:
Yeah, absolutely. To answer this question properly, I should start with a little background: how this idea occurred to me, and how it led me here. The idea first occurred to me in 2013, when I started my master’s, eight years ago. Back then I couldn’t make much progress, for various reasons. One was that I was new to the field, and deep learning was not yet mainstream in computer vision the way it is now. But thanks to these amazing improvements in the area of deep learning, especially thanks to Intel, we were able to achieve significant progress and do a lot of tasks that we could not do just eight years ago. The real inspiration came when I met my visually impaired friend, as she was explaining the challenges she faces daily. I was struck by this irony that, you know, I’ve been teaching robots for years how to see things, while there are people who cannot see, who have problems navigating their surroundings, and who need help. That is what fueled my motivation to develop a visual assistance system, and that’s what led me to this project. To describe the project itself: the system has an AI system housed in a backpack, and the cameras are concealed in a vest and a fanny pack, along with batteries that can last for eight hours. The system is powered by the OAK-D camera kit, which runs on the Intel Movidius VPU chip, an AI chip, along with the Intel OpenVINO software. The system also comes with a voice interface, through which the user gets critical updates. The user can also engage with the system using voice commands via Bluetooth-enabled earphones, and the system responds with verbal information on how to safely navigate the neighborhood. So that is the high-level architecture of the project.
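For developers following along, here is a minimal, runnable sketch of the perception-to-speech loop Jagadish describes. Every name in it (detect_objects, speak, the thresholds) is a simplified stand-in, not the project’s actual code; the real system runs its models on the Movidius VPU and speaks through Bluetooth earphones.

```python
import time

def detect_objects(frame_id):
    # Stand-in for on-device inference: (label, angle in degrees from
    # straight ahead, distance in meters). The real system runs its
    # models on the Movidius VPU via OpenVINO.
    return [("car", 20, 6.0), ("person", -40, 0.8)]

def speak(message):
    # Stand-in for text-to-speech over the Bluetooth earphones.
    print(f"[voice] {message}")

def perception_loop(num_frames=3, critical_range_m=1.0):
    for frame_id in range(num_frames):
        for label, angle, dist in detect_objects(frame_id):
            if dist < critical_range_m:
                # Critical updates are spoken immediately; they override
                # everything else (see the prioritization discussion below).
                speak(f"Stop! {label} within {dist:.1f} meters")
        time.sleep(1 / 30)  # roughly a 30 fps frame budget

perception_loop()
```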
Raqi:
I’m curious about one thing as I’m listening to you talk, and it’s so exciting to see this, I’m getting speechless over here. But I’m curious: is there a means of communicating with the device quietly or silently? Is it exclusively done through voice interaction, or is there also a means of overriding that and navigating in an area where a user isn’t able to speak to the device, maybe a place where there are a lot of people and, you know, you’re trying to navigate through a crowded theater or a lecture hall or somewhere you can’t speak openly? Is there a means of directing the device manually, I should say?
Jagadish:
Yeah, so it works like this. Generally, the system provides these voice updates, right, and the updates are classified by criticality. Say there is an obstacle very close to the user: those updates override everything else. Even if the user had asked for something else, when the system detects a critical situation, that voice update takes over and provides the critical information first, to ensure the user is safe. Then the user can request other features, like describing the environment, setting the location via GPS, or accessing more of the features. The idea is to prioritize the user’s safety: no matter what the user is doing with the system, a critical update supersedes every other feature. A little more about the features: for example, if the user wants to learn about the objects around them, they can request that via voice command using certain keywords. Using the keyword “describe,” the system will begin to describe the surroundings and list the objects it sees. For example, if there is a car on the right side at 20 degrees, this will be translated to clock notation, which would be one o’clock. And if there is a person on the left side at, say, 40 degrees, this will be translated to ten o’clock, right? There is another feature as the user walks: as the user gets close to a stop sign, the system will announce that it is going to activate crosswalk detection, and it will start to perform crosswalk detection. Most of these features run on user request, right, but crosswalk detection happens the moment a stop sign is detected. However, if there is a critical update, say a bike coming closer within a certain range, all these updates will be superseded by the critical update, saying, you know, there is an obstacle within, say, three feet or two feet, so that the user stays safe.
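To spell out the geometry: one mapping consistent with the two examples given (20 degrees right becomes one o’clock, 40 degrees left becomes ten o’clock) is to round the angle’s magnitude up to the next 30-degree “hour.” The sketch below implements that assumed convention; the project’s actual mapping may differ.

```python
import math

def angle_to_clock(angle_deg):
    """Convert a horizontal angle (degrees; positive = right of straight
    ahead, negative = left) to clock notation, where straight ahead is
    12 o'clock and each clock hour spans 30 degrees."""
    hours = math.ceil(abs(angle_deg) / 30)   # how many hours off 12
    if angle_deg >= 0:
        hour = hours % 12 or 12              # right side: 1, 2, ...
    else:
        hour = (12 - hours) % 12 or 12       # left side: 11, 10, ...
    return f"{hour} o'clock"

print(angle_to_clock(20))    # 1 o'clock   (car on the right at 20 degrees)
print(angle_to_clock(-40))   # 10 o'clock  (person on the left at 40 degrees)
```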
Raqi:
Is that obstacle, is that indicated by a voice? Or is that a tone? Or is it haptic? Is there some means of telling the user without being communicated audibly, it’s coming closer, it’s coming closer, or three feet, two feet? You know, is there a way of notifying the user?
Jagadish:
Yes, it is a voice that is played as an audio through Bluetooth-enabled earphones.
Raqi:
I see.
Jeff:
I really like that it prioritizes, because I might be looking for a store or something like that, but it’s more important to know there’s something moving in my path. So that’s really neat that it has that feature.
Jagadish:
Yeah. To give you some perspective: the main focus in developing this project was, you know, to ask visually impaired people for suggestions on how they would use this product and what they want it to be. We conducted a lot of interviews, took their suggestions, and continuously tried to integrate that input as we developed the project. That is why we wanted to have a very good user interface. One of the common issues with existing solutions is that some of these apps provide a continuous bombardment of information, and the user cannot concentrate after a while. This is something we wanted to avoid: keep the rate of information provided to the user limited, and prioritize it by criticality.
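That “limit and prioritize” behavior suggests a simple pattern: critical messages always pass, while informational ones respect a cooldown. The class below is purely an illustration of that idea, not the project’s code.

```python
import time

class UpdateFilter:
    """Prioritized, rate-limited voice updates: critical messages always
    get through; informational ones are throttled by a cooldown."""

    CRITICAL, INFO = 0, 1

    def __init__(self, info_cooldown_s=3.0):
        self.info_cooldown_s = info_cooldown_s
        self._last_info = float("-inf")

    def should_speak(self, priority):
        now = time.monotonic()
        if priority == self.CRITICAL:
            return True                       # safety always wins
        if now - self._last_info >= self.info_cooldown_s:
            self._last_info = now
            return True
        return False                          # drop low-value chatter

f = UpdateFilter()
print(f.should_speak(UpdateFilter.INFO))      # True  (first info update)
print(f.should_speak(UpdateFilter.INFO))      # False (inside cooldown)
print(f.should_speak(UpdateFilter.CRITICAL))  # True  (always)
```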
Jeff:
That’s a really great point. Raqi and I were talking about that: being bombarded by so much information almost paralyzes you, it stops you from moving forward, because you have to take in so much. So that prioritization, and the way you’re working with this, is great, and I’m glad you’re getting input from people who are blind or visually impaired to shape what this does and how it operates.
Raqi:
I was so glad to hear that too.
Hema:
I don’t know if this is true, but Jagadish, you can add on, because you’re thinking of adding more things as you conduct this testing, right? Like you were thinking of adding other features, like haptics, maybe a wearable, that could be in a future version of your project. Is that accurate? I thought I captured something like that: in addition to the audible interface, you wanted to do more, maybe like a wearable bracelet or something?
Jagadish:
Yeah, that’s a very good point. Let me talk about our next steps, and then I’ll get to the point Hema just raised. As the next step, we have already formed a fantastic team called Mirror. It contains people from various backgrounds who are volunteering to make a positive impact in the community, and it includes people who are visually impaired and want to make a good impact on the visually impaired community. We are in the process of raising funds for the initial phase of prototype testing, which would involve the prototype being tested by our own team members and then, eventually, by people outside the team. That’s a little about our team. One of our main goals is to make this solution open source and free, so that anyone can contribute and it becomes self-sustaining. We are actively welcoming developers to contribute and integrate whatever features they think will be useful for this project. And this is where the point Hema made makes a lot of sense: we want to make this project as open as possible, so that everyone can contribute. Along those lines, we are also thinking of adding more user interfaces to make it as simple as possible. One of them might be haptic feedback, like a ring, so that the user is provided with information through vibrations, which reduces the attention they have to give to audio and lets them pay more attention to the actual environment. This is in our pipeline, and there are a lot of exciting things and very interesting milestones that we have already laid out for the coming years.
Raqi:
Sounds like it.
Hema:
Exactly. Yeah, you said at the beginning that you’re so excited to hear about this project, and I heard Jagadish say “exciting” too, so I’ll use the same word: exciting is what comes to mind, because these are really exciting times. Like Jagadish said, eight years ago the technology had not evolved to the point where this could be built, and now it is a reality, right? They have created a real system that works. That’s what is exciting for us: the technology has evolved, and we are able to bring it down to a point where, even though when people think about AI they quickly picture some huge, humongous system that needs to be in place, here you’re talking about a thumbnail-sized chip, right? It’s a very tiny chip, and that is enabling these AI functions to happen inside the device, or in this case inside the camera. The small form factor, the low weight, the low power, the low cost, all of these are coming together with the newest AI developments. And of course there is the software that is paired with the hardware, like the OpenVINO toolkit, that really makes this come to life, because what was not possible a few years ago is now possible. It’s only limited by, you know, the imagination and dreams that innovative developers like Jagadish can come up with, and that’s what is exciting for me when I think about what Jagadish and his team have been able to do with this project.
Jagadish:
Yeah, I think Hema made an excellent point. If you had wanted to do this project just a few years ago, it would have been a totally different problem, because the solutions were quite different and the hardware was quite different. Running these AI algorithms requires a lot of computational power, which used to mean massive GPUs, which are heavy, expensive, and, you know, need fans and power sources. Imagine carrying all of that in a backpack to solve this problem; it’s practically not feasible, right? But what we’re able to achieve through Intel’s solutions is amazing: all this technology has been compressed into a hardware form factor similar to a USB stick. You can just plug it into a laptop and carry it. And they also provide this amazing model-optimization software, I’m talking about OpenVINO, Intel’s open software here, which makes models run even faster on their hardware. So this is just a fantastic achievement and a huge boost in technology when it comes to AI.
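For anyone curious what that software pairing looks like in practice, this is roughly how a model is loaded onto a Movidius VPU with the 2021-era OpenVINO Inference Engine Python API. The file names and image are placeholders; the project’s actual models and pre-processing will differ.

```python
import cv2
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
# Read an optimized Intermediate Representation model (placeholder files).
net = ie.read_network(model="model.xml", weights="model.bin")
# Target the Movidius VPU ("MYRIAD" is the device name OpenVINO uses).
exec_net = ie.load_network(network=net, device_name="MYRIAD")

input_name = next(iter(net.input_info))
_, _, h, w = net.input_info[input_name].input_data.shape

frame = cv2.imread("street.jpg")                     # placeholder image
blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1)  # HWC -> CHW
blob = np.expand_dims(blob, 0)                       # add batch dimension

results = exec_net.infer(inputs={input_name: blob})
```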
Jeff:
Is that what led you to the, I believe it’s called the Movidius VPU, from Intel? Is it the size, the form factor, and everything, that made you implement that in your work?
Jagadish:
That’s right.
Jeff:
Great. I think it’s really neat that you’re doing this. As Raqi said, work of this nature goes into, call it autonomy, autonomous cars or something like that: all the factors that go in there, all the processing, the speed of processing. It’s always been a concern of mine when someone’s developing this. I call it information pollution; you said it was bombarding. You don’t want that, and I’m glad you’re looking at that, and at the importance of information like a curb, a sidewalk, or a curb cut. It would be really neat to know you could receive some input from your device, in some form factor like you’re looking into, such that a person who is visually impaired would flinch, or unmistakably just take a left, or dodge something, from that interface, that haptic, that sound; that they would have the inclination to respond without processing, almost instinctually, a knee-jerk reaction to make a motion. That would be really something.
Jagadish:
Yeah, you just made a great point, Jeff. The beauty of this project is that it’s open source, and we are welcoming people to contribute in any way possible. The problem you’re mentioning is doable; it’s not really hard. It’s a matter of, you know, coming up with a new hardware design so that you can attach devices to parts of the body and activate them using certain signals. That is quite doable, and hopefully in the future we’ll include all these features in our project.
Hema:
And also, if I can add one more thing to what Jagadish said, I think it’s the real-time aspect of what you said, Jeff. The flinching, you know, something that can vibrate or give haptic feedback, should happen so instantaneously, right? Because a sudden movement could be dangerous. And that’s also the beauty of doing this: AI at the edge is so important because you can’t send everything back into the cloud or somewhere else to process, it has to happen immediately, right? And that is where we are in this day and age of AI; more and more we are able to do AI at the edge. Jagadish, did you have anything to add on the real-time aspect of it?
Jagadish:
Yeah. You know, the biggest challenge with these algorithms is that they are huge and take a lot of time to process. To run them in real time you need a lot of hardware, right, and that means complex, large form-factor cards and a lot of other overhead. With Intel’s Movidius, all of this is compressed into a small form factor, and we can process all this complex data in real time. It’s pretty much like the person is actually seeing it live, which is a massive feature, right? Imagine if there is an obstacle and the system updates you after you collide with it; that’s pretty dangerous, right? You don’t want a system like that. So one of the major accomplishments here is that using Intel’s products we are able to do everything in real time, run all these complex models in real time, including the AI models and the depth processing, and provide information in real time, which is a huge and very valuable thing to do, especially for a project serving the visually impaired.
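One way to keep a system like this honest about “real time” is to measure each frame’s end-to-end latency against a budget. A small sketch of that idea (the 15-updates-per-second budget is illustrative, not the project’s figure):

```python
import time

FRAME_BUDGET_S = 1 / 15   # illustrative real-time budget

def process_frame(frame):
    time.sleep(0.02)      # stand-in for depth processing + VPU inference

for frame in range(5):
    start = time.monotonic()
    process_frame(frame)
    latency = time.monotonic() - start
    status = "over budget" if latency > FRAME_BUDGET_S else "real-time OK"
    print(f"frame {frame}: {latency * 1000:.0f} ms, {status}")
```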
Raqi:
That’s great for advance detection, because timing is so critical.
Jeff:
Mm-hm, and I imagine it’ll be learning all along as well.
Jagadish:
Yes. The idea is to keep adding models for interesting problems. For example, if you want to add a feature that can detect grocery items, that is doable: once we collect a good amount of data, you should be able to train a model and potentially integrate it into our system, so that a visually impaired person can actually see what vegetables, fruits, or other grocery items they are looking at.
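Because the project is meant to be open source and extensible, a contributed feature like grocery detection could amount to registering a new voice keyword against a new model. The dispatcher below is hypothetical; the handler bodies are stand-ins for real trained models.

```python
HANDLERS = {}

def feature(keyword):
    """Decorator that registers a handler for a spoken keyword."""
    def register(fn):
        HANDLERS[keyword] = fn
        return fn
    return register

@feature("describe")
def describe_surroundings(frame):
    return "car at 1 o'clock, person at 10 o'clock"   # canned example

@feature("groceries")
def describe_groceries(frame):
    # A contributor would plug in a model trained on grocery images here.
    return "bananas ahead, apples to the right"

def on_voice_command(keyword, frame=None):
    handler = HANDLERS.get(keyword)
    return handler(frame) if handler else "Sorry, unknown command."

print(on_voice_command("groceries"))
```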
Raqi:
Wow, amazing.
Jeff:
That’s great.
Raqi:
Phenomenal.
Jeff:
How can people find out more about Intel and what you’re doing with this initiative?
Jagadish:
I’ll be able to provide a link to our group, Mirror, the team that I was talking about earlier, and I guess Hema will be able to provide information from the Intel perspective.
Hema:
We can provide information about this project specifically and, you know, all the details behind it, the ones that Jagadish talked about: say, the Intel hardware and software, and more broadly our collaboration with organizations like OpenCV. Those are things we can share with you. And Jagadish has a lot of plans for what this can become, because it’s open source; it’s being put out in the open-source community so that developers who have ideas can contribute to this project, and we are really excited about what this could be. So Jagadish can provide some details, and we’ll do the same on the Intel side.
Jeff:
We want to thank you for planting the seeds to get this initiative started, and for the great work that you’re doing. Just knowing that someone is working on something like this got Raqi and me excited, and we were so eager to get to this interview and learn more about it. We’ll put all those links in the show notes, so if anyone wants to follow up and see what’s happening with the project, or get involved by paying attention to it and giving the team feedback of some sort, that would be really great. Thank you, Jagadish, and thank you, Hema.
Hema:
It was a pleasure talking to you, Jeff, and to Raqi. Thank you.
Jagadish:
Yeah, thanks for having me. Thank you very much.
Jeff:
Thanks for what you’re doing.
Raqi:
Yeah, thanks for giving us your time. This is fantastic work you’re doing.
Jeff:
Such a great time having Hema and Jagadish in the studio here to talk about this great innovation that they’re working on. So stay tuned, we’ll bring you more as soon as we hear about it. And for more podcasts with a blindness perspective, check us out on the web at www.blindabilities.com, on Twitter @BlindAbilities, and drop us a line, give us some feedback at 612-367-6093. We’d love to hear from you. And a big shout out to Chee Chau for his beautiful music, you can follow Chee Chau on Twitter @lcheechau. And from all of us here at Blind Abilities, to you, your family, and friends, stay well, stay informed, and stay strong. We want to thank you for listening. We hope you enjoyed, and until next time, bye-bye.
[Music] [Transition noise] -When we share
-What we see
-Through each other’s eyes…
[Multiple voices overlapping, in unison, to form a single sentence]
…We can then begin to bridge the gap between the limited expectations, and the realities of Blind Abilities.
Contact Your State Services
If you reside in Minnesota and would like to know more about Transition Services from State Services, contact Transition Coordinator Sheila Koenig by email or via phone at 651-539-2361.
Contact:
You can follow us on Twitter @BlindAbilities
On the web at www.BlindAbilities.com
Send us an email
Get the Free Blind Abilities App on the App Store and Google Play Store.
Check out the Blind Abilities Community on Facebook, the Blind Abilities Page, and the Career Resources for the Blind and Visually Impaired group.