WEBVTT - Why you need to talk, not type, to your AI 

0:00:00.600 --> 0:00:04.040
<v Speaker 1>Here's a quick question for you. What percentage of your

0:00:04.120 --> 0:00:09.360
<v Speaker 1>time working on your computer do you spend typing versus talking?

0:00:10.360 --> 0:00:13.680
<v Speaker 1>If your answer is something like ninety five percent typing,

0:00:14.200 --> 0:00:18.000
<v Speaker 1>five percent talking, or maybe zero percent talking, you are

0:00:18.079 --> 0:00:22.840
<v Speaker 1>leaving some big productivity gains on the table because talking

0:00:22.960 --> 0:00:26.920
<v Speaker 1>is faster. It gets ideas out before your analytical brain

0:00:27.000 --> 0:00:30.320
<v Speaker 1>starts second guessing them, and for most tasks, it produces

0:00:30.400 --> 0:00:34.640
<v Speaker 1>better first drafts than anything you labor over at a keyboard.

0:00:35.240 --> 0:00:38.720
<v Speaker 1>Yet when Neo and I ask audiences this question, almost

0:00:38.800 --> 0:00:42.839
<v Speaker 1>everyone is still close to one hundred percent typing. So

0:00:42.920 --> 0:00:45.720
<v Speaker 1>today we're going to get into why voice should be

0:00:45.840 --> 0:00:50.120
<v Speaker 1>a core part of your workflow, not just a novelty.

0:00:50.840 --> 0:00:53.920
<v Speaker 1>We are going to cover when talking beats typing and

0:00:53.960 --> 0:00:57.200
<v Speaker 1>also when it doesn't, the difference between dictation mode and

0:00:57.240 --> 0:01:00.279
<v Speaker 1>advanced voice mode in your AI tools and this big

0:01:00.360 --> 0:01:04.640
<v Speaker 1>software we both rely on day to day to do this,

0:01:05.480 --> 0:01:07.240
<v Speaker 1>and by the end of this episode you are going

0:01:07.319 --> 0:01:09.600
<v Speaker 1>to have a very clear picture of where to start

0:01:09.800 --> 0:01:12.800
<v Speaker 1>and probably a new piece of software to try out

0:01:12.920 --> 0:01:22.320
<v Speaker 1>before the week is over. Welcome to How I AI with

0:01:22.400 --> 0:01:27.000
<v Speaker 1>me, Dr. Amantha Imber, and Neo Applin, head of Inventium AI.

0:01:27.560 --> 0:01:30.640
<v Speaker 1>Each episode we share one practical way to use AI

0:01:30.800 --> 0:01:34.959
<v Speaker 1>better at work and in life. No fluff, no tech jargon,

0:01:35.319 --> 0:01:36.959
<v Speaker 1>just things you can use straight away.

0:01:38.000 --> 0:01:40.360
<v Speaker 2>So Amantha, the world of work has changed, particularly the way

0:01:40.360 --> 0:01:43.160
<v Speaker 2>we work with our computers has changed over the times,

0:01:43.480 --> 0:01:45.080
<v Speaker 2>and we've got AI and all these kind of things

0:01:45.080 --> 0:01:48.360
<v Speaker 2>are now available to us. But question for you, what's

0:01:48.400 --> 0:01:50.400
<v Speaker 2>the thing that's changed the most in the way that

0:01:50.440 --> 0:01:53.840
<v Speaker 2>you work with your computer in the last couple of years?

0:01:54.520 --> 0:01:58.400
<v Speaker 1>The biggest change by far of how I interact not

0:01:58.720 --> 0:02:01.760
<v Speaker 1>just with AI, but also with my computer, is that

0:02:01.840 --> 0:02:06.600
<v Speaker 1>I have gone from typing being how I interact with

0:02:06.760 --> 0:02:10.160
<v Speaker 1>my AI and computer being the predominant way to I

0:02:10.240 --> 0:02:14.920
<v Speaker 1>would say ninety percent of my interaction when I'm working

0:02:14.919 --> 0:02:16.720
<v Speaker 1>with just me and my AI or me and my

0:02:16.800 --> 0:02:21.320
<v Speaker 1>computer is through talking to it, like blah blah talk

0:02:21.360 --> 0:02:24.200
<v Speaker 1>talk rather than type. So about ten percent of what

0:02:24.240 --> 0:02:27.520
<v Speaker 1>I would do on the computer is typing, ninety percent

0:02:27.760 --> 0:02:28.400
<v Speaker 1>is talking.

0:02:28.960 --> 0:02:30.160
<v Speaker 2>That's a huge change.

0:02:30.520 --> 0:02:35.920
<v Speaker 1>It is absolutely massive, and it has made me so

0:02:36.480 --> 0:02:40.960
<v Speaker 1>much faster and so much more effective. And I also

0:02:41.080 --> 0:02:44.000
<v Speaker 1>think in the age of AI, where so much of

0:02:44.040 --> 0:02:49.960
<v Speaker 1>our written communication sounds very AI-ified, like, as one example,

0:02:50.040 --> 0:02:53.800
<v Speaker 1>like one use case from my life, any almost any

0:02:53.840 --> 0:02:57.600
<v Speaker 1>email that I write, except for quite lengthy ones that

0:02:57.720 --> 0:03:02.800
<v Speaker 1>need a lot of content summarization or documenting something. But

0:03:02.919 --> 0:03:06.280
<v Speaker 1>like when I'm just sending a much less complex email

0:03:06.320 --> 0:03:09.639
<v Speaker 1>to someone, I will talk it out and then

0:03:09.639 --> 0:03:13.440
<v Speaker 1>it will appear through dictation software on my computer. And

0:03:13.520 --> 0:03:16.720
<v Speaker 1>I feel that that sets me apart from the vast

0:03:16.760 --> 0:03:21.720
<v Speaker 1>majority of people who are using AI to write their emails.

0:03:22.680 --> 0:03:24.920
<v Speaker 1>What do you reckon it is for your workflows?

0:03:25.480 --> 0:03:30.760
<v Speaker 2>I would say it's probably eighty-twenty: eighty percent typing, twenty

0:03:30.800 --> 0:03:35.040
<v Speaker 2>percent voice. Now I'm late to the voice game, and

0:03:35.080 --> 0:03:37.240
<v Speaker 2>I see you as inspiration on trying to get further

0:03:37.360 --> 0:03:41.480
<v Speaker 2>up there. For me, it's about different use cases on

0:03:41.520 --> 0:03:44.480
<v Speaker 2>how and why I need to type versus when I

0:03:44.560 --> 0:03:47.480
<v Speaker 2>need to talk, And I'll explain that in a minute.

0:03:47.640 --> 0:03:50.520
<v Speaker 2>But for me, it's all about speed. And I

0:03:50.520 --> 0:03:52.680
<v Speaker 2>can see why people are getting AI to write their

0:03:52.720 --> 0:03:54.920
<v Speaker 2>emails because it's faster for me to give you five

0:03:55.000 --> 0:03:58.160
<v Speaker 2>dot points typing slow typing, and then saying, write the

0:03:58.200 --> 0:04:00.480
<v Speaker 2>rest of this email. So that probably takes I don't know,

0:04:01.200 --> 0:04:04.120
<v Speaker 2>let's say thirty seconds, maybe a minute something like that.

0:04:04.160 --> 0:04:06.200
<v Speaker 2>I write my five dot points. I type them and

0:04:06.200 --> 0:04:08.040
<v Speaker 2>then I get to clean it up. But if you're

0:04:08.080 --> 0:04:10.720
<v Speaker 2>talking that, it'll probably take you about the same time.

0:04:11.600 --> 0:04:15.920
<v Speaker 2>So I can see the time saving of using AI.

0:04:15.960 --> 0:04:18.040
<v Speaker 2>I can see the benefit of using our own voice

0:04:18.760 --> 0:04:21.120
<v Speaker 2>to be able to write those things. And so yeah,

0:04:21.440 --> 0:04:24.720
<v Speaker 2>using dictation software, I'm getting more and more into. But

0:04:24.760 --> 0:04:27.320
<v Speaker 2>it's great for those kind of things. But question for you,

0:04:27.400 --> 0:04:29.159
<v Speaker 2>Amantha: you're using it a lot of the time. When are

0:04:29.200 --> 0:04:31.599
<v Speaker 2>you using it? What purposes do you think it's best

0:04:31.680 --> 0:04:34.400
<v Speaker 2>for, versus when you would be typing?

0:04:34.160 --> 0:04:38.880
<v Speaker 1>So I use it for something I've been using

0:04:38.880 --> 0:04:42.520
<v Speaker 1>it for for quite a long time, like at least

0:04:42.680 --> 0:04:46.680
<v Speaker 1>a year, probably longer though, is I will often go

0:04:46.800 --> 0:04:49.840
<v Speaker 1>for a walk with my AI and I will talk

0:04:49.880 --> 0:04:52.760
<v Speaker 1>to it, particularly when I'm trying to nut through a problem.

0:04:53.240 --> 0:04:58.760
<v Speaker 1>And you know, Neo, you've built various GPTs and agents

0:04:58.760 --> 0:05:03.520
<v Speaker 1>for Inventium that will help tease out creative solutions to problems.

0:05:03.520 --> 0:05:07.760
<v Speaker 1>We've got our Brainstorm Buddy GPT, which I definitely use

0:05:08.000 --> 0:05:10.520
<v Speaker 1>quite a lot. So I will often go for a

0:05:10.560 --> 0:05:14.240
<v Speaker 1>walk with my AI, nut through a problem. Certainly when

0:05:14.279 --> 0:05:16.479
<v Speaker 1>I was working on the Energy Game book, I would

0:05:16.520 --> 0:05:19.200
<v Speaker 1>go for I would say daily walks when I was

0:05:19.240 --> 0:05:21.440
<v Speaker 1>trying to nut through the structure of the book and

0:05:21.480 --> 0:05:23.480
<v Speaker 1>also the mechanics of the game that I was trying

0:05:23.520 --> 0:05:26.479
<v Speaker 1>to design. That would just be a daily thing for me.

0:05:27.680 --> 0:05:31.480
<v Speaker 1>So that's been in my workflow for quite some time.

0:05:31.880 --> 0:05:36.960
<v Speaker 1>But in terms of when I'm at my computer in

0:05:37.000 --> 0:05:39.640
<v Speaker 1>my inbox, I use it all the time. There's not

0:05:39.760 --> 0:05:42.200
<v Speaker 1>too many emails except for the more complex ones that

0:05:42.279 --> 0:05:50.000
<v Speaker 1>I'm using AI to write with me. But gosh, it's

0:05:50.080 --> 0:05:53.160
<v Speaker 1>really it's anything. It's anything from like when I'm sending

0:05:53.800 --> 0:05:56.920
<v Speaker 1>text messages to people, I will always talk those rather

0:05:57.000 --> 0:06:00.520
<v Speaker 1>than type those with my phone because I've also got

0:06:00.560 --> 0:06:04.400
<v Speaker 1>dictation software on my phone. I'm talking things. If I

0:06:04.480 --> 0:06:07.560
<v Speaker 1>am labeling a file, I will talk out the label

0:06:07.640 --> 0:06:11.919
<v Speaker 1>rather than type it. Like it's literally everything except for

0:06:11.960 --> 0:06:14.440
<v Speaker 1>some very specific use cases where I think that typing

0:06:14.520 --> 0:06:19.960
<v Speaker 1>is still superior. And what I think because I mean,

0:06:20.000 --> 0:06:22.360
<v Speaker 1>we both do a lot of keynote speaking on AI,

0:06:22.760 --> 0:06:25.600
<v Speaker 1>and something I've been asking a lot of the audiences

0:06:25.640 --> 0:06:28.880
<v Speaker 1>that I've been in front of lately is what percentage

0:06:28.880 --> 0:06:32.159
<v Speaker 1>of the time are people talking versus typing to their

0:06:32.560 --> 0:06:36.920
<v Speaker 1>computer and specifically their AI and I'm still finding that

0:06:37.040 --> 0:06:41.400
<v Speaker 1>most people are at pretty close to one hundred percent typing,

0:06:41.800 --> 0:06:44.200
<v Speaker 1>and there's just there's so much to be gained, Like

0:06:44.480 --> 0:06:47.320
<v Speaker 1>you know, just as a very quick example, I mean,

0:06:47.360 --> 0:06:51.520
<v Speaker 1>the average person I think maybe types at about fifty

0:06:51.600 --> 0:06:53.760
<v Speaker 1>or sixty words per minute.

0:06:54.080 --> 0:06:56.520
<v Speaker 2>I think that feels about right. But you've also got

0:06:56.520 --> 0:06:58.400
<v Speaker 2>then the error rates on top of that, so you've

0:06:58.400 --> 0:07:00.000
<v Speaker 2>got to go. You can type fast, but you've got to

0:07:00.160 --> 0:07:03.599
<v Speaker 2>fix mistakes, so it's probably an average of forty words a

0:07:03.600 --> 0:07:04.400
<v Speaker 2>minute effectively.

0:07:04.839 --> 0:07:07.920
<v Speaker 1>That's so true. And for talking, I know that I

0:07:07.960 --> 0:07:10.160
<v Speaker 1>talk quite fast, but I talk at about one hundred

0:07:10.160 --> 0:07:13.360
<v Speaker 1>and forty words per minute, and so look at that,

0:07:13.440 --> 0:07:16.400
<v Speaker 1>like it's almost three times as fast as you know

0:07:16.440 --> 0:07:19.480
<v Speaker 1>for me to talk rather than type.

0:07:19.800 --> 0:07:23.240
<v Speaker 2>There's also another part of that, which is science. This

0:07:23.280 --> 0:07:25.240
<v Speaker 2>is one of the things we've been talking about in

0:07:25.760 --> 0:07:29.600
<v Speaker 2>your keynote, which is when you're just talking, you're

0:07:29.680 --> 0:07:31.559
<v Speaker 2>using a different part of your brain than if you're typing.

0:07:31.600 --> 0:07:34.000
<v Speaker 2>Typing is very analytical and things like that. You're getting

0:07:34.000 --> 0:07:37.840
<v Speaker 2>your brain involved. They call it prefrontal interference,

0:07:38.480 --> 0:07:41.040
<v Speaker 2>whereas if you're just talking, that prefrontal interference is not there.

0:07:41.080 --> 0:07:43.240
<v Speaker 2>You're not critiquing yourself so much as just getting out

0:07:43.280 --> 0:07:45.280
<v Speaker 2>from the heart to be able to get those things out,

0:07:45.320 --> 0:07:48.480
<v Speaker 2>and so your ideas will flow faster by talking than

0:07:48.480 --> 0:07:49.520
<v Speaker 2>it would be from typing.

0:07:50.560 --> 0:07:53.880
<v Speaker 1>And now let's get into when should we talk versus

0:07:53.960 --> 0:07:58.960
<v Speaker 1>when should we type? Because at Inventium AI we've got

0:07:59.080 --> 0:08:02.600
<v Speaker 1>very clear views on this. So Neo, when should people

0:08:03.320 --> 0:08:07.440
<v Speaker 1>be talking and when should they be typing to their

0:08:07.560 --> 0:08:09.000
<v Speaker 1>AI slash computer?

0:08:09.800 --> 0:08:11.600
<v Speaker 2>This is up to you as well, but we've got this.

0:08:11.840 --> 0:08:14.720
<v Speaker 2>We're a bit opinionated in these things. But talking is

0:08:14.760 --> 0:08:18.000
<v Speaker 2>so much better for idea generation. So as Amantha said,

0:08:18.040 --> 0:08:20.560
<v Speaker 2>when you're chatting to yourself, when you

0:08:20.640 --> 0:08:22.680
<v Speaker 2>go for the walk, all that kind of stuff, those

0:08:22.720 --> 0:08:24.960
<v Speaker 2>are great to be able to just talk it out.

0:08:25.040 --> 0:08:28.200
<v Speaker 2>You're not getting your analytical brain in the way. Same

0:08:28.240 --> 0:08:30.720
<v Speaker 2>thing with, like, exploratory thinking, like what's this going

0:08:30.760 --> 0:08:34.080
<v Speaker 2>to do? Looking at ideas, options, strategies, things like that.

0:08:34.920 --> 0:08:38.840
<v Speaker 2>Another part is anything like emotional or autobiographical like what

0:08:38.880 --> 0:08:41.679
<v Speaker 2>do I think about this? Or, you know, you're

0:08:41.720 --> 0:08:43.480
<v Speaker 2>getting down a personal letter or something like that. Then

0:08:44.559 --> 0:08:47.440
<v Speaker 2>very very much I would be using the dictation. I'd

0:08:47.440 --> 0:08:50.680
<v Speaker 2>also say that small bites are great for dictation. So

0:08:51.080 --> 0:08:54.080
<v Speaker 2>renaming a file name, for example, that's a really easy one.

0:08:54.280 --> 0:08:56.400
<v Speaker 2>You're changing the title of a document easy as well.

0:08:56.760 --> 0:08:59.600
<v Speaker 1>And I would say on that giving instructions is also

0:08:59.600 --> 0:09:02.000
<v Speaker 1>a lot easier to talk out. So for example, if I'm

0:09:02.080 --> 0:09:06.200
<v Speaker 1>prompting my AI or getting it to work through something,

0:09:06.520 --> 0:09:09.280
<v Speaker 1>it is much more natural for someone to talk through

0:09:09.280 --> 0:09:12.720
<v Speaker 1>those instructions because we're really used to giving people instructions

0:09:12.760 --> 0:09:15.959
<v Speaker 1>in our lives, particularly if you've got children, and so

0:09:16.120 --> 0:09:17.920
<v Speaker 1>that is a really natural one and a really good

0:09:18.040 --> 0:09:20.959
<v Speaker 1>use case for talking rather than typing.

0:09:21.760 --> 0:09:23.400
<v Speaker 2>And where I use typing, this is why I'm more

0:09:23.440 --> 0:09:26.640
<v Speaker 2>eighty percent than you, which is the other way around.

0:09:28.040 --> 0:09:33.000
<v Speaker 2>Adding thoughts to existing work, editing precision or logical structure

0:09:33.160 --> 0:09:36.080
<v Speaker 2>type stuff, you're much better off typing. I do an

0:09:36.080 --> 0:09:38.760
<v Speaker 2>awful lot of presentation decks, and so there's like graphic

0:09:38.800 --> 0:09:41.640
<v Speaker 2>design plus typing, and I find that just with number

0:09:41.640 --> 0:09:43.720
<v Speaker 2>of the keys and keys and keyboards and things like that,

0:09:43.920 --> 0:09:45.640
<v Speaker 2>I do find I do a lot more typing for

0:09:45.720 --> 0:09:49.280
<v Speaker 2>those types of things as well. So yeah, analytical stuff

0:09:49.360 --> 0:09:52.160
<v Speaker 2>maybe typing may be better. But I'm finding I'm doing a

0:09:52.400 --> 0:09:54.960
<v Speaker 2>blend now and getting down my first draft using

0:09:55.160 --> 0:09:58.880
<v Speaker 2>voice and then I'm going through and editing with typing.

0:09:59.280 --> 0:10:00.520
<v Speaker 2>So it's a good one two punch.

0:10:00.960 --> 0:10:03.560
<v Speaker 1>Okay, Now let's talk about the tools and also the

0:10:03.600 --> 0:10:08.319
<v Speaker 1>functions within the tools. So let's first talk about the

0:10:08.880 --> 0:10:14.320
<v Speaker 1>AI tools like Copilot, ChatGPT, Claude, Gemini, the most common ones.

0:10:14.760 --> 0:10:18.600
<v Speaker 1>So how can people talk? Like, what do those little

0:10:18.600 --> 0:10:22.000
<v Speaker 1>buttons look like? How do they unlock voice?

0:10:22.480 --> 0:10:25.360
<v Speaker 2>Yep, the different tools have different buttons on there. Just

0:10:25.400 --> 0:10:27.120
<v Speaker 2>on the normal prompt window, you'll see that there's a

0:10:27.120 --> 0:10:30.680
<v Speaker 2>little microphone. The microphone is dictation, and there's another little

0:10:30.679 --> 0:10:34.559
<v Speaker 2>one that looks like a squiggly waveform, and that one's advanced

0:10:34.720 --> 0:10:37.720
<v Speaker 2>voice mode. Now not all of them have the advanced

0:10:37.760 --> 0:10:39.840
<v Speaker 2>voice mode, but the dictation's on pretty much all

0:10:39.840 --> 0:10:40.120
<v Speaker 2>of them.

0:10:40.040 --> 0:10:40.360
<v Speaker 1>I think.

0:10:40.800 --> 0:10:42.760
<v Speaker 2>So dictation is great if you just want to get

0:10:42.800 --> 0:10:47.960
<v Speaker 2>your thoughts down uninterrupted. It just basically transcribes everything you say,

0:10:48.160 --> 0:10:50.240
<v Speaker 2>ums ers, all those kind of things are captured in

0:10:50.240 --> 0:10:52.920
<v Speaker 2>there as well, and that's great where you've just got

0:10:52.920 --> 0:10:56.720
<v Speaker 2>an idea stream of consciousness, get my thoughts down. Advanced

0:10:56.760 --> 0:10:58.360
<v Speaker 2>voice mode is great when you want to have a

0:10:58.480 --> 0:11:02.160
<v Speaker 2>chat via your mouth with AI. You talk to it.

0:11:02.160 --> 0:11:04.800
<v Speaker 2>It talks to you. It's not writing on the screen

0:11:04.880 --> 0:11:06.520
<v Speaker 2>so much. Well it does that as well, but it's

0:11:06.520 --> 0:11:08.880
<v Speaker 2>just talking to you. And both are great for different

0:11:08.880 --> 0:11:10.800
<v Speaker 2>things. Like if you've got ideation, you want to

0:11:10.840 --> 0:11:13.880
<v Speaker 2>throw ideas around, I'd use the advanced voice mode. If

0:11:13.880 --> 0:11:15.800
<v Speaker 2>you want to get your first draft of thoughts down,

0:11:15.880 --> 0:11:18.240
<v Speaker 2>like a brain dump, then absolutely I would be using

0:11:18.280 --> 0:11:20.800
<v Speaker 2>the dictation mode on that. And so that's available on

0:11:21.160 --> 0:11:23.920
<v Speaker 2>the mainstream tools. But the other tools that you use

0:11:24.160 --> 0:11:24.880
<v Speaker 2>are quite different.

0:11:25.400 --> 0:11:29.000
<v Speaker 1>Yes, So my main go to tool, which would be

0:11:29.760 --> 0:11:33.880
<v Speaker 1>software like aside from the standard software like PowerPoint and

0:11:34.200 --> 0:11:38.840
<v Speaker 1>Word and Chrome or Brave as our browser, this would

0:11:38.840 --> 0:11:41.920
<v Speaker 1>be the software that if you were to say, okay,

0:11:41.960 --> 0:11:46.840
<v Speaker 1>this software is now taken off your computer, I would

0:11:46.880 --> 0:11:49.920
<v Speaker 1>be sobbing in the corner, curled up in a ball.

0:11:50.480 --> 0:11:54.000
<v Speaker 1>So this is Wispr Flow, and I want to get

0:11:54.080 --> 0:12:00.680
<v Speaker 1>the spelling right, so it's W-I-S-P-R Flow. Two words,

0:12:01.080 --> 0:12:03.760
<v Speaker 1>and this is I've tested quite a lot of software,

0:12:03.920 --> 0:12:09.360
<v Speaker 1>this is by far the most accurate and useful. So

0:12:09.360 --> 0:12:14.640
<v Speaker 1>that's an important combination accurate plus useful for dictation software.

0:12:14.840 --> 0:12:17.040
<v Speaker 1>It's accurate. I think we all know what accurate means.

0:12:17.320 --> 0:12:21.720
<v Speaker 1>But useful. What it does is if I am talking

0:12:21.880 --> 0:12:25.000
<v Speaker 1>and then I'm like, actually, no, sorry, this and blah

0:12:25.080 --> 0:12:29.640
<v Speaker 1>blah blah, so it will auto correct mistakes that I make.

0:12:29.760 --> 0:12:33.520
<v Speaker 1>It will also remove filler words like um and ah,

0:12:33.600 --> 0:12:39.800
<v Speaker 1>and it will also auto punctuate, like if it can

0:12:39.840 --> 0:12:43.040
<v Speaker 1>see that I am like first, do this, second, do this, third,

0:12:43.080 --> 0:12:46.600
<v Speaker 1>do that, it will automatically put those in numbers or

0:12:47.360 --> 0:12:51.880
<v Speaker 1>bullet points, depending on what it sounds like. My intent is.

0:12:51.920 --> 0:12:56.480
<v Speaker 1>It's very good at understanding that. And really the only

0:12:56.559 --> 0:12:59.880
<v Speaker 1>problem that I've found is that it still sometimes uses

0:13:00.040 --> 0:13:04.320
<v Speaker 1>American spelling. So I will always proofread anything that

0:13:04.360 --> 0:13:06.800
<v Speaker 1>I do dictate in, you know, if it's something that

0:13:06.840 --> 0:13:09.199
<v Speaker 1>will be read by others, like an email for example.

0:13:10.160 --> 0:13:13.040
<v Speaker 1>But I have found that it is leagues above anything

0:13:13.040 --> 0:13:14.160
<v Speaker 1>else I have tried.

0:13:15.040 --> 0:13:18.160
<v Speaker 2>I have tried others because you know, I like to

0:13:18.200 --> 0:13:21.160
<v Speaker 2>try different pieces of software. First off, I love Wispr Flow.

0:13:21.280 --> 0:13:23.920
<v Speaker 2>It is great and you can set the shortcut on

0:13:23.960 --> 0:13:27.199
<v Speaker 2>your keyboard and you just press that button, you talk

0:13:27.240 --> 0:13:30.840
<v Speaker 2>into your microphone and you're off and running. It then places

0:13:30.880 --> 0:13:33.080
<v Speaker 2>that text anywhere, which I love. So you don't have

0:13:33.120 --> 0:13:35.640
<v Speaker 2>to go to the Wispr Flow app. You are using

0:13:35.679 --> 0:13:38.400
<v Speaker 2>whichever app you are already in. You press the button

0:13:38.480 --> 0:13:41.560
<v Speaker 2>and it then just types for you, and I love that.

0:13:41.679 --> 0:13:44.040
<v Speaker 2>It's seamless, the way it works. Some of the

0:13:44.080 --> 0:13:46.040
<v Speaker 2>others you have to go and dictate to the app

0:13:46.040 --> 0:13:47.760
<v Speaker 2>and then you cut and paste, which can be a

0:13:47.800 --> 0:13:51.480
<v Speaker 2>bit clunky. Too much trouble, too much trouble. But yeah,

0:13:51.480 --> 0:13:54.160
<v Speaker 2>Wispr Flow for the accuracy, the formatting, and for the

0:13:54.160 --> 0:13:58.320
<v Speaker 2>auto correcting is awesome. Some of the others that I've used,

0:13:58.960 --> 0:14:00.600
<v Speaker 2>open Whisper, I've had a bit of a play with.

0:14:01.679 --> 0:14:05.520
<v Speaker 2>I'm looking at another one at the moment called where

0:14:05.559 --> 0:14:09.280
<v Speaker 2>are we? Let me just get it up. MacWhisper

0:14:09.600 --> 0:14:12.240
<v Speaker 2>is another one, and I'm quite enjoying MacWhisper. It's

0:14:12.559 --> 0:14:15.520
<v Speaker 2>a more open source kind. Actually, not open, it's free.

0:14:15.600 --> 0:14:18.160
<v Speaker 2>There we are, there's a paid version and a free version.

0:14:18.480 --> 0:14:20.480
<v Speaker 2>It's not quite as good as Wispr Flow, but for

0:14:20.560 --> 0:14:24.040
<v Speaker 2>the price you're paying, it's excellent. The price you're paying

0:14:24.040 --> 0:14:29.680
<v Speaker 2>of zero, yeah, exactly. For the Windows people out there,

0:14:29.960 --> 0:14:32.320
<v Speaker 2>one I've seen is called Braina. I've never

0:14:32.360 --> 0:14:35.800
<v Speaker 2>actually played with Braina yet. But ultimately, none of these

0:14:35.840 --> 0:14:38.960
<v Speaker 2>are new, because one bit of software I was using

0:14:38.960 --> 0:14:42.840
<v Speaker 2>back in two thousand and one was Dragon NaturallySpeaking,

0:14:43.000 --> 0:14:45.680
<v Speaker 2>for those who are in those kind of words. But

0:14:46.240 --> 0:14:48.160
<v Speaker 2>you can add them to your work day, and there's

0:14:48.200 --> 0:14:50.480
<v Speaker 2>lots of them out there. As much as I love

0:14:50.520 --> 0:14:54.000
<v Speaker 2>Wispr Flow, and if you can't be bothered trying

0:14:54.000 --> 0:14:56.760
<v Speaker 2>them all, try Wispr Flow, like it is far and

0:14:56.800 --> 0:14:59.120
<v Speaker 2>away better than the others. But what I do like

0:14:59.160 --> 0:15:02.160
<v Speaker 2>seeing is there's a bunch of different software out there,

0:15:02.280 --> 0:15:04.760
<v Speaker 2>so it's not just a one and done. You can

0:15:04.800 --> 0:15:07.080
<v Speaker 2>do things where you can run all of the software

0:15:07.160 --> 0:15:09.720
<v Speaker 2>on your computer so your voice doesn't go to the cloud,

0:15:09.920 --> 0:15:11.760
<v Speaker 2>so therefore it's nice and safe and secure

0:15:12.080 --> 0:15:14.120
<v Speaker 2>and things like that. So there are local models you

0:15:14.160 --> 0:15:16.960
<v Speaker 2>can try. There are ones that you can play around with.

0:15:17.000 --> 0:15:19.960
<v Speaker 2>Maybe the interface will be different than you want. Some of

0:15:19.960 --> 0:15:21.840
<v Speaker 2>the other features you might have there, which might be

0:15:21.880 --> 0:15:24.240
<v Speaker 2>I can upload an audio file, maybe a

0:15:24.320 --> 0:15:26.880
<v Speaker 2>voice memo, and it'll then transcribe that as well. So

0:15:27.200 --> 0:15:29.320
<v Speaker 2>all these different features, but they can help you in

0:15:29.320 --> 0:15:29.880
<v Speaker 2>different ways.

0:15:30.880 --> 0:15:34.080
<v Speaker 1>So I hope that you are now feeling inspired to

0:15:34.320 --> 0:15:37.360
<v Speaker 1>change your workflow. If you are someone that is doing

0:15:37.480 --> 0:15:40.680
<v Speaker 1>majority typing to your computer, maybe you'll try talking to

0:15:40.720 --> 0:15:45.480
<v Speaker 1>it today and if you know someone, maybe you sit

0:15:45.560 --> 0:15:47.800
<v Speaker 1>next to someone that's just like a hardcore typer and

0:15:47.840 --> 0:15:50.560
<v Speaker 1>you never hear them talking to their computer, maybe share

0:15:50.600 --> 0:15:53.640
<v Speaker 1>this episode with them for a little bit of inspiration.

0:15:55.240 --> 0:15:59.120
<v Speaker 1>How I AI was hosted by me, Amantha Imber, and Neo Applin.

0:15:59.360 --> 0:16:02.160
<v Speaker 1>A big thank you to Martin Imber, who does our sound editing,

0:16:02.320 --> 0:16:05.640
<v Speaker 1>and Jim Rubio for production support, and thank you to

0:16:05.760 --> 0:16:08.040
<v Speaker 1>John Kilby who composed the theme music.