WEBVTT - Talking Tech 30th April 2024

0:00:21.248 --> 0:00:25.268
<v S1>Hello everyone. Welcome to Talking Tech. This edition available from

0:00:25.268 --> 0:00:29.318
<v S1>April the 30th, 2024. I'm Stephen Jolley. Great to have

0:00:29.318 --> 0:00:33.248
<v S1>you with us. Listening maybe through Vision Australia Radio, associated

0:00:33.248 --> 0:00:37.388
<v S1>stations of RPH Australia or maybe the Community Radio Network.

0:00:37.568 --> 0:00:40.058
<v S1>There is also the podcast. To catch that, all you

0:00:40.058 --> 0:00:42.668
<v S1>need to do is search for the two words Talking

0:00:42.698 --> 0:00:46.088
<v S1>Tech and down it can all come, usually on a Tuesday afternoon

0:00:46.088 --> 0:00:49.238
<v S1>just after it's been produced. Another option is to ask

0:00:49.238 --> 0:00:52.508
<v S1>your Siri device or smart speaker to play Vision Australia

0:00:52.508 --> 0:00:57.758
<v S1>Radio Talking Tech podcast. Vision Australia Radio Talking Tech podcast.

0:00:58.178 --> 0:01:02.858
<v S1>Normally with me is Vision Australia's National Advisor on Access Technology,

0:01:02.858 --> 0:01:06.338
<v S1>David Woodbridge. Unfortunately, David can't be with us this week,

0:01:06.338 --> 0:01:10.118
<v S1>so instead we'll bring you a conversation that David and

0:01:10.118 --> 0:01:15.218
<v S1>I had about four months ago. Towards the end of 2023.

0:01:15.218 --> 0:01:19.358
<v S1>Reflecting on the year's developments in technology, particularly from the

0:01:19.358 --> 0:01:22.418
<v S1>perspective of people who are blind or have low vision.

0:01:23.138 --> 0:01:27.008
<v S1>So here we go. This conversation, taken from the edition

0:01:27.008 --> 0:01:33.338
<v S1>of December the 26th, 2023. So let's reflect today on

0:01:33.338 --> 0:01:36.818
<v S1>what has probably been the major talking point in the

0:01:36.818 --> 0:01:42.668
<v S1>tech world over the last year. And that's A.I., Artificial Intelligence.

0:01:42.698 --> 0:01:46.538
<v S1>And I must say that I first really started to

0:01:46.538 --> 0:01:51.518
<v S1>think about it sort of in current times, around November 2022,

0:01:51.728 --> 0:01:56.468
<v S1>when a report on ABC was talking about this new thing, uh,

0:01:56.468 --> 0:02:01.868
<v S1>this website, OpenAI, and you could do really interesting

0:02:01.868 --> 0:02:04.628
<v S1>things like get it to write letters for you or

0:02:04.628 --> 0:02:09.938
<v S1>write essays. And it's all taken off. I think of A.I.

0:02:09.968 --> 0:02:14.318
<v S1>really as taking advantage of the really high computing power

0:02:14.318 --> 0:02:16.748
<v S1>that we have these days to be able to do

0:02:16.748 --> 0:02:19.928
<v S1>things that we just couldn't imagine doing before, but really

0:02:19.928 --> 0:02:22.808
<v S1>useful things. How would you describe it?

0:02:22.958 --> 0:02:25.568
<v S2>I tend to describe it as a computer system that

0:02:25.568 --> 0:02:30.218
<v S2>responds to, uh, speech or typing input. So those two

0:02:30.248 --> 0:02:33.758
<v S2>modes are primarily. The second thing I always think about

0:02:33.758 --> 0:02:36.668
<v S2>is the fact that it can actually make decisions based

0:02:36.668 --> 0:02:40.268
<v S2>on data. And the third thing that really comes off

0:02:40.268 --> 0:02:43.958
<v S2>that data is that the fact that it can recognize patterns,

0:02:43.958 --> 0:02:48.638
<v S2>because it has this whole database of information, that it

0:02:48.638 --> 0:02:53.558
<v S2>can look up extremely fast and recognize patterns of your

0:02:53.558 --> 0:02:57.158
<v S2>own personal use or in general use around the world.

0:02:57.158 --> 0:02:59.948
<v S2>So that's the main three things that always sticks out

0:02:59.948 --> 0:03:03.278
<v S2>for me with computer, artificial intelligence or AI.

0:03:03.608 --> 0:03:06.398
<v S1>Now if you ask some people what AI is, they'd say, oh,

0:03:06.428 --> 0:03:10.088
<v S1>something tech that's got the capacity to destroy civilization as

0:03:10.088 --> 0:03:12.788
<v S1>we know it, but it can do a lot of

0:03:12.788 --> 0:03:15.728
<v S1>very useful things, and let's not worry about the threats

0:03:15.728 --> 0:03:18.188
<v S1>of it at the moment, because that's a long conversation

0:03:18.188 --> 0:03:21.008
<v S1>some other time. What sorts of things is it doing?

0:03:21.008 --> 0:03:25.178
<v S2>It's actually quite amazing because when you think about AI,

0:03:25.268 --> 0:03:28.178
<v S2>we're already using it. So people get sort of blown

0:03:28.178 --> 0:03:32.468
<v S2>away or get nervous about ChatGPT and other systems,

0:03:32.768 --> 0:03:36.308
<v S2>but we're already using it in everyday lives. For example, um,

0:03:36.308 --> 0:03:38.978
<v S2>I've got a bit of a list here. So advanced

0:03:38.978 --> 0:03:42.728
<v S2>web search that's actually AI, because it also brings up

0:03:42.728 --> 0:03:49.538
<v S2>suggestions for you when you do a search. When you're watching, uh, Netflix, ABC iview,

0:03:49.538 --> 0:03:51.578
<v S2>I'm assuming, because I really haven't delved in too much.

0:03:51.578 --> 0:03:54.188
<v S2>When you watch a show, it then comes up with

0:03:54.188 --> 0:03:57.158
<v S2>other suggestions that other things that you may want to

0:03:57.158 --> 0:04:00.728
<v S2>watch in the future or now. So again, that's pattern

0:04:00.728 --> 0:04:03.428
<v S2>recognition and making a bit of a decisions for you.

0:04:03.938 --> 0:04:07.688
<v S2>Let's think about virtual assistants. So they're the classic A

0:04:07.688 --> 0:04:12.848
<v S2>lady, I won't say the name, Google and Siri, um,

0:04:12.848 --> 0:04:16.238
<v S2>which we always have in our smartphones. Again, to me

0:04:16.238 --> 0:04:19.988
<v S2>that's a basic version of AI, uh, because it's really

0:04:19.988 --> 0:04:24.038
<v S2>command line driven. It doesn't have these large language models

0:04:24.038 --> 0:04:27.338
<v S2>that it can look up. Speech recognition. So not just

0:04:27.338 --> 0:04:30.728
<v S2>talking to your smart speakers or these virtual assistants, uh,

0:04:30.728 --> 0:04:36.038
<v S2>but also controlling, uh, your computer and of course, dictating

0:04:36.068 --> 0:04:40.418
<v S2>to your computer, which is actually pretty amazing. Automated translation.

0:04:40.418 --> 0:04:43.448
<v S2>So for those that remember Hitchhiker's Guide to the Galaxy

0:04:43.448 --> 0:04:46.568
<v S2>and the famous Babel fish, which you stuck in your ear

0:04:46.568 --> 0:04:50.948
<v S2>for automatic language translation, uh, systems do that automatically for

0:04:50.948 --> 0:04:54.878
<v S2>you now. And that's again AI. Uh, nowadays, Stephen, we

0:04:54.878 --> 0:04:58.538
<v S2>can get an audio file of somebody talking and get

0:04:58.538 --> 0:05:05.018
<v S2>a transcription of it, uh, in text. So again, AI. Um,

0:05:05.018 --> 0:05:07.838
<v S2>I've already mentioned smart home, but that's really in relation

0:05:07.838 --> 0:05:11.108
<v S2>to virtual. But when I had this as a separate topic,

0:05:11.108 --> 0:05:15.188
<v S2>your systems learn when you come home, when you want

0:05:15.188 --> 0:05:18.038
<v S2>the air conditioner on, when you want the lights turned off.

0:05:18.038 --> 0:05:24.188
<v S2>That's all about pattern recognition. And for industry it's predictive maintenance.

0:05:24.188 --> 0:05:28.448
<v S2>So the AI systems will decide when, you know, the

0:05:28.448 --> 0:05:32.918
<v S2>trucks need servicing or the machinery that's used for packing

0:05:32.918 --> 0:05:37.898
<v S2>stuff in the warehouse needs maintenance. Natural language processing. And

0:05:37.898 --> 0:05:41.528
<v S2>this is really where things like, uh, ChatGPT and Copilot

0:05:41.528 --> 0:05:44.318
<v S2>and Bing stand out, because you can just have a

0:05:44.318 --> 0:05:46.328
<v S2>bit of a general chat to these things now. And

0:05:46.328 --> 0:05:49.928
<v S2>based on the keywords that you use, it decides what

0:05:49.928 --> 0:05:52.508
<v S2>things you want to look up and give the information

0:05:52.508 --> 0:05:55.868
<v S2>back to you. Content creation, which is one you already

0:05:55.868 --> 0:05:57.908
<v S2>mentioned there. That's where, for example, if I wanted to

0:05:57.908 --> 0:06:01.358
<v S2>write an email to a company about web accessibility or

0:06:01.358 --> 0:06:04.808
<v S2>application accessibility, I can say to the AI, look, I'd

0:06:04.808 --> 0:06:08.978
<v S2>like you to write a nice, gentle, but assertive email

0:06:08.978 --> 0:06:14.588
<v S2>about the importance of web or application accessibility. And it's

0:06:14.588 --> 0:06:17.198
<v S2>just the amazing stuff that it writes. So it says,

0:06:17.198 --> 0:06:19.718
<v S2>you know, dear Sir or madam, uh, we'd like to

0:06:19.718 --> 0:06:22.458
<v S2>bring to your attention, etc., etc., etc., and it really

0:06:22.458 --> 0:06:26.178
<v S2>does have all the main points, which is really amazing.

0:06:26.898 --> 0:06:29.208
<v S2>This is the one that I particularly like when I'm

0:06:29.208 --> 0:06:33.648
<v S2>using and doing research for the show is automatic text summarization.

0:06:33.648 --> 0:06:36.198
<v S2>So I can actually ask an AI to summarize an

0:06:36.198 --> 0:06:38.958
<v S2>article for me, or some text that I've given it,

0:06:38.958 --> 0:06:42.858
<v S2>and it will give me the main points. When we

0:06:42.858 --> 0:06:48.408
<v S2>access our smartphones, we're using facial recognition. Again, AI. Spell

0:06:48.408 --> 0:06:51.618
<v S2>checking: you might not think this is AI, but it

0:06:51.648 --> 0:06:56.118
<v S2>is indeed. One version of spell checking: word suggestions. So

0:06:56.118 --> 0:06:58.548
<v S2>this is not so much about word prediction, which is

0:06:58.548 --> 0:07:02.268
<v S2>another form of AI. This is where you partially type

0:07:02.268 --> 0:07:04.728
<v S2>in a word, and then the computer comes back and

0:07:04.728 --> 0:07:07.488
<v S2>gives you a list of other words that are similar

0:07:07.488 --> 0:07:09.408
<v S2>to the word you just started typing in, which is

0:07:09.408 --> 0:07:14.688
<v S2>actually really amazing. Uh, cybersecurity is another major one. Weather prediction,

0:07:14.688 --> 0:07:17.748
<v S2>because when you ask your weather system or your weather app,

0:07:17.748 --> 0:07:21.108
<v S2>what's the weather going to be? That's based on computer

0:07:21.108 --> 0:07:24.108
<v S2>generated weather tables around the world and looking at trends

0:07:24.108 --> 0:07:28.218
<v S2>and that sort of stuff. Uh, health and well-being, automated

0:07:28.218 --> 0:07:30.468
<v S2>customer systems. You know, when you ring up those phone

0:07:30.468 --> 0:07:33.138
<v S2>systems and it says, who would you like to chat to?

0:07:33.138 --> 0:07:36.108
<v S2>And you might say, well, for a mobile, you might

0:07:36.108 --> 0:07:39.018
<v S2>say mobile or mobile internet. And then it'll keep asking

0:07:39.018 --> 0:07:42.498
<v S2>you these questions, and I'll just say these ones quickly,

0:07:42.498 --> 0:07:46.158
<v S2>because they're all sort of similar: aviation, shipping, trains and

0:07:46.158 --> 0:07:50.118
<v S2>other forms of public transport, all again driven by AI.

0:07:50.118 --> 0:07:54.798
<v S2>Automatic self-driving cars, who doesn't want one of those? Automated drones.

0:07:54.798 --> 0:07:58.908
<v S2>And finally, robotics, which is the one that scares

0:07:58.908 --> 0:08:01.398
<v S2>people about, you know, you think of, um, Isaac Asimov

0:08:01.398 --> 0:08:04.068
<v S2>and other famous science fiction writers. That's the one that

0:08:04.068 --> 0:08:05.988
<v S2>always gets people worried because they think, well, robots are

0:08:05.988 --> 0:08:07.818
<v S2>going to take over the world. That'll be the end

0:08:07.818 --> 0:08:09.858
<v S2>of the human race. So they're all the main

0:08:09.858 --> 0:08:10.818
<v S2>general ones.

0:08:10.818 --> 0:08:15.318
<v S1>Let's talk now then, about specifically where it's making a

0:08:15.318 --> 0:08:17.388
<v S1>difference for people who are blind or have low vision,

0:08:17.388 --> 0:08:19.338
<v S1>where AI is making a difference.

0:08:19.338 --> 0:08:21.348
<v S2>So the ones that are making a huge difference, and

0:08:21.348 --> 0:08:23.148
<v S2>the ones that I've already said sort of cover these,

0:08:23.148 --> 0:08:25.248
<v S2>but these are more specific ones that I guess to

0:08:25.248 --> 0:08:28.758
<v S2>remind people. So object recognition and of course these are

0:08:28.758 --> 0:08:32.058
<v S2>mainly all based around your camera and your smartphone. So

0:08:32.058 --> 0:08:35.118
<v S2>that'll tell you what sort of objects may be around you.

0:08:35.958 --> 0:08:38.718
<v S2>And then we've also got distance from objects. So how

0:08:38.718 --> 0:08:42.048
<v S2>close is the object away. So that's almost like, you

0:08:42.048 --> 0:08:44.988
<v S2>know, a good old orientation and mobility tool, you know, beep

0:08:44.988 --> 0:08:49.908
<v S2>beep or now someone's away. Indoor navigation, as opposed to GPS.

0:08:49.938 --> 0:08:53.658
<v S2>This is indoor navigation where systems learn a route inside

0:08:53.658 --> 0:08:57.798
<v S2>a building and then can take identification markers of items

0:08:57.798 --> 0:09:01.068
<v S2>around you to position you in a building.

0:09:01.158 --> 0:09:05.058
<v S2>Sort of very similar to object recognition is scene detection.

0:09:05.058 --> 0:09:08.508
<v S2>So scene detection is a way where you can take

0:09:08.508 --> 0:09:12.888
<v S2>a picture of somebody's backyard or a park, and then

0:09:12.888 --> 0:09:15.678
<v S2>it will tell you the position of things in the park,

0:09:15.678 --> 0:09:19.308
<v S2>for example. So where the swings are, they'll identify the trees,

0:09:19.308 --> 0:09:20.568
<v S2>the seats and so on.

0:09:20.568 --> 0:09:23.238
<v S1>Now this next one really appeals to me on your list.

0:09:23.238 --> 0:09:26.148
<v S1>The barcode or QR code identification.

0:09:26.148 --> 0:09:30.078
<v S2>Again, being able to point your camera. And particularly there's

0:09:30.078 --> 0:09:32.658
<v S2>an app that makes it very easy to access barcodes.

0:09:32.808 --> 0:09:34.548
<v S2>I don't believe it does QR codes at the moment,

0:09:34.548 --> 0:09:38.718
<v S2>QR codes, but again, that's actually very effective. The one

0:09:38.718 --> 0:09:41.658
<v S2>that I don't like, because it's a factor of light,

0:09:41.658 --> 0:09:45.078
<v S2>is color recognition, because based on what the light's

0:09:45.078 --> 0:09:47.568
<v S2>bouncing back from the object, you can get different versions

0:09:47.568 --> 0:09:49.308
<v S2>of color. So that's a thing that I think needs

0:09:49.308 --> 0:09:52.608
<v S2>to be worked on. Uh, speaking about light, light detection.

0:09:52.608 --> 0:09:55.518
<v S2>So you can tell whether a light's on or off

0:09:55.518 --> 0:09:57.468
<v S2>or a device is on and off by the little

0:09:57.468 --> 0:10:00.648
<v S2>LED lights on the device. People detection. And I use

0:10:00.648 --> 0:10:04.248
<v S2>this quite a lot: handwriting recognition, as opposed to the

0:10:04.248 --> 0:10:07.368
<v S2>other one, which is also optical character recognition, OCR, for

0:10:07.368 --> 0:10:12.588
<v S2>reading text. So handwriting is very cool. Voice control. So again,

0:10:12.978 --> 0:10:16.428
<v S2>this is probably more advanced than just general speech recognition.

0:10:16.428 --> 0:10:20.028
<v S2>This is also being able to control your assistive technology,

0:10:20.088 --> 0:10:24.288
<v S2>whether it's a screen reader or screen magnifier, switch control, etc.

0:10:24.288 --> 0:10:27.558
<v S2>and interact with the computer. So you're using the computer,

0:10:27.558 --> 0:10:33.108
<v S2>the program and your assistive technology. It's really, really amazing.

0:10:33.348 --> 0:10:35.688
<v S1>And tell us about some of the apps that really

0:10:35.688 --> 0:10:37.548
<v S1>excel with AI.

0:10:37.578 --> 0:10:41.088
<v S2>These apps are actually going to be quite well known

0:10:41.088 --> 0:10:44.058
<v S2>to people. And the first one we have to mention

0:10:44.058 --> 0:10:47.508
<v S2>for both iOS and I'm pleased to say Android now is,

0:10:47.508 --> 0:10:50.568
<v S2>of course, the Seeing AI app from Microsoft. And that's

0:10:50.568 --> 0:10:54.618
<v S2>a suite of applications inside it called channels. So you've

0:10:54.618 --> 0:10:57.978
<v S2>got things like short text OCR, document text for OCR,

0:10:58.008 --> 0:11:02.478
<v S2>product recognition, which is barcode. Um, you've got people detection,

0:11:02.478 --> 0:11:07.758
<v S2>you've got scene detection, light detection, currency detection, color detection

0:11:07.758 --> 0:11:08.808
<v S2>and so on.

0:11:08.808 --> 0:11:10.968
<v S1>I just want to put in a quick word here

0:11:10.968 --> 0:11:15.018
<v S1>for the document reader in Seeing AI, because Seeing AI

0:11:15.018 --> 0:11:18.708
<v S1>has really improved. I put my rates notice in front

0:11:18.708 --> 0:11:22.828
<v S1>of it the other day and, instead of it reading

0:11:22.828 --> 0:11:25.498
<v S1>me the whole document, I said, can you tell me

0:11:25.498 --> 0:11:29.488
<v S1>when the due date is? Can you tell me the amount,

0:11:29.518 --> 0:11:32.578
<v S1>total amount owing, etc.? It's really good.

0:11:32.578 --> 0:11:35.578
<v S2>Yeah, it does work really well, particularly when you can actually, uh,

0:11:35.578 --> 0:11:37.648
<v S2>I don't like the word interrogate, but you can ask

0:11:37.648 --> 0:11:41.308
<v S2>questions about the document that it's just read, and you

0:11:41.308 --> 0:11:45.148
<v S2>can really narrow down on the information that it's got. Uh, so,

0:11:45.148 --> 0:11:47.458
<v S2>so to me, that's the main one that stands out,

0:11:47.458 --> 0:11:50.668
<v S2>you know, really for the last 12 months, um, of course,

0:11:50.668 --> 0:11:54.118
<v S2>the one, the main one besides Seeing AI, now on Android,

0:11:54.118 --> 0:11:57.748
<v S2>you've got the Lookout app, which does stuff primarily very

0:11:57.748 --> 0:12:02.548
<v S2>similar to Seeing AI. And then there's other, you know,

0:12:02.548 --> 0:12:05.458
<v S2>assistive tech ones and of course, the, the couple of

0:12:05.458 --> 0:12:08.878
<v S2>the main ones that I use for, you know, AI

0:12:08.908 --> 0:12:13.078
<v S2>chat type stuff is of course ChatGPT, because

0:12:13.078 --> 0:12:15.988
<v S2>it's got that voice mode now where you can speak

0:12:15.988 --> 0:12:19.318
<v S2>to it and it speaks to you. Uh, Bing, the

0:12:19.318 --> 0:12:23.278
<v S2>new version that's now called Copilot, that's a live search,

0:12:23.278 --> 0:12:25.558
<v S2>plus a large language model, and the one that I

0:12:25.558 --> 0:12:27.688
<v S2>use on the Mac all the time now is actually

0:12:27.688 --> 0:12:31.288
<v S2>Copilot itself via the web page on the Mac, which

0:12:31.378 --> 0:12:35.998
<v S2>of course is built in to Windows 10 and 11 now.

0:12:35.998 --> 0:12:37.468
<v S2>So that's actually pretty cool.

0:12:37.468 --> 0:12:41.278
<v S1>Perplexity is another very popular one that certainly works in

0:12:41.278 --> 0:12:45.028
<v S1>the iOS environment. And we mustn't forget to acknowledge the

0:12:45.058 --> 0:12:48.058
<v S1>Be My Eyes virtual assistant, Be My AI.

0:12:48.058 --> 0:12:50.848
<v S2>Yeah, that's actually really set the bar extremely high. And

0:12:50.848 --> 0:12:52.798
<v S2>of course, that's where you can take a photo of

0:12:52.798 --> 0:12:54.808
<v S2>something and it gives you back all the information. And

0:12:54.808 --> 0:12:58.318
<v S2>you can also ask it questions as well. So I think,

0:12:58.318 --> 0:13:00.748
<v S2>you know, given that this stuff's jumped up in the

0:13:00.748 --> 0:13:02.968
<v S2>last year or so, um, I think it's going to

0:13:02.968 --> 0:13:05.848
<v S2>be really amazing where this stuff goes in the next,

0:13:05.848 --> 0:13:07.348
<v S2>you know, next year or so even.

0:13:07.768 --> 0:13:08.128
<v S3>Um.

0:13:08.668 --> 0:13:11.848
<v S1>All right. So that's, uh, AI, and I wonder what

0:13:11.848 --> 0:13:14.158
<v S1>we'll be saying about it in 12 months' time.

0:13:14.458 --> 0:13:16.288
<v S2>I think we'll have our socks blown off in another

0:13:16.288 --> 0:13:20.998
<v S2>12 months. Besides the web and mobile phones, this, for me,

0:13:20.998 --> 0:13:24.388
<v S2>is the next third thing in 30 years of doing

0:13:24.388 --> 0:13:27.298
<v S2>my job that I've been really, really, really excited about.

0:13:27.298 --> 0:13:31.078
<v S1>So that's AI, our take on it at the moment.

0:13:31.138 --> 0:13:34.258
<v S1>Before we go, a reminder of where people can find

0:13:34.258 --> 0:13:36.898
<v S1>details of what we've been talking about in this and

0:13:36.898 --> 0:13:38.458
<v S1>previous editions of the program.

0:13:38.728 --> 0:13:41.728
<v S2>As always, you can check out my blog site, which

0:13:41.728 --> 0:13:46.318
<v S2>is David Wood B R dot Podbean, p o d b e a n dot com.

0:13:46.318 --> 0:13:51.178
<v S1>David Wood B R dot Podbean, p o d b e a n dot com. To

0:13:51.178 --> 0:13:52.228
<v S1>write to the program.

0:13:52.228 --> 0:13:56.128
<v S2>You can write to me at david.woodbridge, how

0:13:56.128 --> 0:13:58.018
<v S2>it sounds, at visionaustralia

0:13:58.198 --> 0:14:04.858
<v S1>dot org. david.woodbridge at visionaustralia.org, or jolleystephen

0:14:04.858 --> 0:14:11.248
<v S1>at gmail.com. jolleystephen, j o l l e y

0:14:11.428 --> 0:14:17.068
<v S1>s t e p h e n at gmail.com. This

0:14:17.068 --> 0:14:20.158
<v S1>has been Talking Tech. We've been hearing from Vision Australia's

0:14:20.158 --> 0:14:25.228
<v S1>National Advisor on Access Technology, David Woodbridge, reflecting on developments

0:14:25.228 --> 0:14:29.578
<v S1>in technology throughout 2023. I'm Stephen Jolley. Take care. We'll

0:14:29.578 --> 0:14:31.138
<v S1>talk more tech next week. See you.