WEBVTT - A Conversation with Nerds

0:00:04.160 --> 0:00:07.520
<v Speaker 1>Get in touch with technology with Tech Stuff from how stuff works

0:00:07.800 --> 0:00:14.680
<v Speaker 1>dot com. Hey everyone, and welcome to Tech Stuff. I

0:00:14.720 --> 0:00:19.480
<v Speaker 1>am your host, Jonathan Strickland. Today I am incredibly fortunate

0:00:19.560 --> 0:00:22.680
<v Speaker 1>I have not one, but two amazing guests to talk

0:00:22.760 --> 0:00:25.759
<v Speaker 1>about our topic of discussion today, which is really going

0:00:25.840 --> 0:00:30.880
<v Speaker 1>to be about Amazon's Alexa service and what it can do,

0:00:31.120 --> 0:00:34.080
<v Speaker 1>what it's like to develop for it, and why

0:00:34.120 --> 0:00:38.880
<v Speaker 1>should we be excited about these voice recognition services in general. Uh.

0:00:39.159 --> 0:00:41.640
<v Speaker 1>I guarantee you by the end of this episode you'll

0:00:41.640 --> 0:00:44.400
<v Speaker 1>be as excited about them as I already am. So

0:00:44.479 --> 0:00:47.920
<v Speaker 1>let me introduce my guests. First, on the phone, I've

0:00:47.960 --> 0:00:51.960
<v Speaker 1>got Dave Isbitski from Amazon. Hi, Dave. Hey, how you

0:00:51.960 --> 0:00:55.880
<v Speaker 1>doing? Great. Thank you so much for joining me, really appreciate it.

0:00:56.280 --> 0:00:59.920
<v Speaker 1>And here in the studio, in the flesh, Josh

0:01:00.120 --> 0:01:02.760
<v Speaker 1>Skeen from Big Nerd Ranch. Now, Josh, you've been working

0:01:02.840 --> 0:01:06.720
<v Speaker 1>on developing sort of the the the how to guide

0:01:06.920 --> 0:01:10.160
<v Speaker 1>of developing for Amazon's Alexa. Is that correct? Yeah, that's right.

0:01:10.160 --> 0:01:13.160
<v Speaker 1>I've been working with David and others in Amazon to

0:01:13.480 --> 0:01:17.000
<v Speaker 1>build some developer education tools to write apps for the

0:01:17.040 --> 0:01:20.080
<v Speaker 1>Alexa Skills Kit platform. Now, this is great. I'm

0:01:20.120 --> 0:01:22.679
<v Speaker 1>glad that I've got two experts on the subject here

0:01:22.760 --> 0:01:26.520
<v Speaker 1>because I do lots of research and I love to

0:01:26.600 --> 0:01:30.120
<v Speaker 1>chat about technology, and I'm very passionate about the subject.

0:01:30.600 --> 0:01:34.160
<v Speaker 1>But it's always a great thrill for me to have

0:01:34.440 --> 0:01:37.319
<v Speaker 1>experts on the subject matter here as well, so that

0:01:37.360 --> 0:01:40.160
<v Speaker 1>they can even fill in those gaps that are within

0:01:40.200 --> 0:01:44.319
<v Speaker 1>my understanding, because I'm coming from a consumer standpoint primarily,

0:01:44.360 --> 0:01:48.720
<v Speaker 1>I am not someone who has had a deep education

0:01:49.000 --> 0:01:52.000
<v Speaker 1>in the field of coding and developing for this sort

0:01:52.040 --> 0:01:57.000
<v Speaker 1>of stuff. I have a liberal arts degree, but I'm

0:01:57.120 --> 0:01:58.920
<v Speaker 1>so glad to have you guys here to talk about.

0:01:59.000 --> 0:02:01.559
<v Speaker 1>Let's start at the very top and work our way down.

0:02:02.200 --> 0:02:06.360
<v Speaker 1>So Alexa is uh a kind of a personal assistant

0:02:06.400 --> 0:02:08.720
<v Speaker 1>that can do lots of different stuff and depends heavily

0:02:08.800 --> 0:02:14.440
<v Speaker 1>upon voice recognition, speech recognition, natural language processing. And I

0:02:14.520 --> 0:02:18.320
<v Speaker 1>don't think a lot of people have a true understanding

0:02:18.400 --> 0:02:23.200
<v Speaker 1>or appreciation of how big a deal that is that

0:02:23.600 --> 0:02:26.800
<v Speaker 1>the way we humans communicate and the way that computers

0:02:26.880 --> 0:02:31.000
<v Speaker 1>quote unquote think is very different. So, Dave, can you

0:02:31.040 --> 0:02:35.079
<v Speaker 1>talk a little bit about the challenges of developing something

0:02:35.120 --> 0:02:41.799
<v Speaker 1>that can actually work with natural language? Certainly, I um

0:02:41.840 --> 0:02:43.920
<v Speaker 1>and and thank you for the intro. I uh, I

0:02:44.000 --> 0:02:46.560
<v Speaker 1>love talking about technology too, and I am far from

0:02:46.560 --> 0:02:49.440
<v Speaker 1>an expert. This has been a learning journey for me.

0:02:49.520 --> 0:02:52.960
<v Speaker 1>There are people who have been working on voice and

0:02:53.120 --> 0:02:56.800
<v Speaker 1>natural language and AI for thirty plus years, and I

0:02:56.840 --> 0:03:00.360
<v Speaker 1>feel like we're in a sea change now with the

0:03:00.400 --> 0:03:06.400
<v Speaker 1>power of the cloud and just how affordable you know everybody, Um,

0:03:06.600 --> 0:03:08.360
<v Speaker 1>you may have a tablet or you may have a

0:03:08.360 --> 0:03:11.880
<v Speaker 1>smart home. Just that technology has become so affordable for

0:03:11.919 --> 0:03:15.040
<v Speaker 1>everybody that we can finally do something like this. And

0:03:15.800 --> 0:03:21.680
<v Speaker 1>Amazon's vision is basically the Star Trek computer. If everybody remembers,

0:03:22.160 --> 0:03:25.080
<v Speaker 1>it doesn't matter if it's you know, the Next Generation,

0:03:25.160 --> 0:03:29.000
<v Speaker 1>the original series, Deep Space nine, any or the movies, right,

0:03:29.320 --> 0:03:33.040
<v Speaker 1>there was always a voice that a human being could

0:03:33.120 --> 0:03:36.360
<v Speaker 1>just call out to in the air. I remember watching

0:03:36.440 --> 0:03:41.000
<v Speaker 1>Next Generation and, uh, Worf would go and play opera, right,

0:03:41.120 --> 0:03:44.040
<v Speaker 1>Klingon opera, or Picard would ask for music just

0:03:44.480 --> 0:03:46.520
<v Speaker 1>and they would walk in the room. And I was

0:03:46.560 --> 0:03:48.960
<v Speaker 1>just thinking about that the other day walking into my office,

0:03:49.000 --> 0:03:52.680
<v Speaker 1>I was like, gosh, I'm asking a computer to turn

0:03:52.720 --> 0:03:54.560
<v Speaker 1>on the lights and play a piece of music. This

0:03:54.720 --> 0:03:58.400
<v Speaker 1>is we are living in science fiction, right, And so

0:03:58.440 --> 0:04:02.480
<v Speaker 1>that's the basic idea behind Alexa: it is

0:04:02.520 --> 0:04:06.400
<v Speaker 1>a service from Amazon that we make available to anyone

0:04:06.760 --> 0:04:10.640
<v Speaker 1>for free, and we've put it into actual pieces of

0:04:10.680 --> 0:04:14.119
<v Speaker 1>hardware that we make and we sell to Amazon customers

0:04:14.160 --> 0:04:17.080
<v Speaker 1>as well. And some of your listeners may know that

0:04:17.200 --> 0:04:20.720
<v Speaker 1>as the Amazon Echo, the Dot, the Tap, or the

0:04:20.760 --> 0:04:25.000
<v Speaker 1>Fire TV. And the basic premise is that Alexa can

0:04:25.160 --> 0:04:29.159
<v Speaker 1>understand us as a human being, and then she can

0:04:29.160 --> 0:04:32.440
<v Speaker 1>talk to all the technology in our lives so that

0:04:32.480 --> 0:04:35.440
<v Speaker 1>we don't have to learn that user interface, you know,

0:04:35.520 --> 0:04:38.160
<v Speaker 1>the whole twelve o'clock blinking light. How do I go

0:04:38.200 --> 0:04:40.560
<v Speaker 1>ahead and change that. I should be able to just

0:04:40.760 --> 0:04:45.200
<v Speaker 1>ask the device itself to set the time. I should

0:04:45.240 --> 0:04:47.159
<v Speaker 1>not have to worry about that as a human being.

0:04:47.240 --> 0:04:51.120
<v Speaker 1>And that's really what Alexa is all about. She can

0:04:51.240 --> 0:04:55.080
<v Speaker 1>talk to human beings, she can understand human beings, and

0:04:55.400 --> 0:04:58.159
<v Speaker 1>then she can go ahead and tell machines what we're

0:04:58.200 --> 0:05:01.359
<v Speaker 1>actually asking for. That is really important, to have that

0:05:01.440 --> 0:05:04.760
<v Speaker 1>sort of translator between us and the machine world. One

0:05:04.800 --> 0:05:08.680
<v Speaker 1>of the things that I found extremely frustrating early in

0:05:08.760 --> 0:05:13.719
<v Speaker 1>the era of home automation is that it was an all

0:05:13.839 --> 0:05:16.479
<v Speaker 1>or nothing kind of approach, depending upon what you wanted

0:05:16.480 --> 0:05:20.040
<v Speaker 1>to do you had to essentially go with one provider

0:05:20.400 --> 0:05:23.320
<v Speaker 1>for everything, one manufacturer for all of your stuff, because

0:05:23.320 --> 0:05:26.280
<v Speaker 1>it wouldn't talk to each other. You have different protocols,

0:05:26.560 --> 0:05:30.000
<v Speaker 1>you have different approaches to the way that they would

0:05:30.080 --> 0:05:32.520
<v Speaker 1>integrate with each other. And if you had everything from

0:05:32.600 --> 0:05:37.480
<v Speaker 1>one company, awesome, everything talks to each other, it's fantastic.

0:05:37.520 --> 0:05:41.040
<v Speaker 1>But if you're like a regular human being who can't

0:05:41.040 --> 0:05:44.760
<v Speaker 1>necessarily outfit an entire home all at once with the

0:05:44.880 --> 0:05:48.400
<v Speaker 1>same sort of technologies, you want it to be able to

0:05:48.400 --> 0:05:50.640
<v Speaker 1>talk to each other. So I think one of the

0:05:50.720 --> 0:05:54.600
<v Speaker 1>big in my mind, one of the big bonuses of

0:05:54.680 --> 0:05:58.200
<v Speaker 1>an approach like Alexa is the idea that you have

0:05:58.400 --> 0:06:01.880
<v Speaker 1>this go between that can do that work for you

0:06:02.120 --> 0:06:05.800
<v Speaker 1>where it can start to compensate for the fact that

0:06:05.839 --> 0:06:10.599
<v Speaker 1>these technologies don't natively talk to each other necessarily. Um. Right.

0:06:10.680 --> 0:06:14.520
<v Speaker 1>A great thing about standards is there's so many of them, right, yeah, yeah.

0:06:14.520 --> 0:06:17.080
<v Speaker 1>I love the fact that you know, the term standard

0:06:17.320 --> 0:06:19.520
<v Speaker 1>means the opposite of what you could expect it to

0:06:19.600 --> 0:06:23.440
<v Speaker 1>be and um and and that is one of the

0:06:23.440 --> 0:06:27.200
<v Speaker 1>most popular uses we have seen for the Echo device.

0:06:27.279 --> 0:06:29.640
<v Speaker 1>I know that was my journey too, is I started

0:06:29.680 --> 0:06:32.640
<v Speaker 1>out you know. I used it for music and general queries,

0:06:32.640 --> 0:06:35.600
<v Speaker 1>and I said, you know, I heard this thing can

0:06:35.640 --> 0:06:37.880
<v Speaker 1>do stuff with smart home. I have no idea what

0:06:37.960 --> 0:06:39.960
<v Speaker 1>smart home is. I have no idea what any of

0:06:39.960 --> 0:06:43.520
<v Speaker 1>these terms like IoT, right, Internet of Things. It doesn't make

0:06:43.560 --> 0:06:45.680
<v Speaker 1>sense to me. I want a light I can turn on.

0:06:46.080 --> 0:06:49.000
<v Speaker 1>So I went to Amazon and I searched for smart

0:06:49.080 --> 0:06:51.120
<v Speaker 1>light bulb I think is what I found, and it

0:06:51.200 --> 0:06:54.320
<v Speaker 1>was I think it was fifteen bucks. I figured i'd

0:06:54.400 --> 0:06:56.200
<v Speaker 1>order it and see if it would work. And there

0:06:56.279 --> 0:06:59.640
<v Speaker 1>was you know, kind of a general instruction on Amazon site.

0:07:00.080 --> 0:07:02.960
<v Speaker 1>But what you can actually do with Alexa is you

0:07:02.960 --> 0:07:06.400
<v Speaker 1>get the device and you say, "Alexa, discover devices," and

0:07:06.400 --> 0:07:08.640
<v Speaker 1>then she figures out what you put into your home.

0:07:08.720 --> 0:07:11.080
<v Speaker 1>You don't have to go and figure that out. And

0:07:11.120 --> 0:07:15.040
<v Speaker 1>we've found enough customer demand that we've actually created, if

0:07:15.040 --> 0:07:19.200
<v Speaker 1>you go to Amazon dot com slash smart Home, people

0:07:19.240 --> 0:07:21.559
<v Speaker 1>can go there and they'll see all of the smart

0:07:21.600 --> 0:07:24.760
<v Speaker 1>home devices that Alexa can just talk to and will

0:07:24.800 --> 0:07:27.880
<v Speaker 1>automatically work. So it makes it a much easier process

0:07:28.320 --> 0:07:31.520
<v Speaker 1>versus trying to figure out all the individual pieces to buy,

0:07:31.600 --> 0:07:33.760
<v Speaker 1>who talks to who and everything that you mentioned. You

0:07:33.760 --> 0:07:36.560
<v Speaker 1>can just go ahead and screw in

0:07:36.600 --> 0:07:38.280
<v Speaker 1>a light bulb and be able to talk to it. No,

0:07:39.360 --> 0:07:42.240
<v Speaker 1>that's incredible from a consumer standpoint, right, the idea that

0:07:42.280 --> 0:07:46.400
<v Speaker 1>you've made this sort of a seamless experience so that

0:07:46.800 --> 0:07:50.600
<v Speaker 1>you don't have that frustrating moment where even with something

0:07:50.640 --> 0:07:53.320
<v Speaker 1>as simple as bluetooth pairing, for some people that that

0:07:53.520 --> 0:07:57.080
<v Speaker 1>is that is a barrier, right that they have to wait,

0:07:57.120 --> 0:07:59.600
<v Speaker 1>do I have to on each device when the light

0:07:59.680 --> 0:08:02.440
<v Speaker 1>is blinking, it means that what's happening? You know,

0:08:02.600 --> 0:08:04.880
<v Speaker 1>just something as simple as that can be really frustrating

0:08:04.920 --> 0:08:08.080
<v Speaker 1>for some people. So to take that step away, uh,

0:08:08.240 --> 0:08:12.560
<v Speaker 1>is really really an ingenious and helpful thing to do. Now, Now, Josh,

0:08:13.040 --> 0:08:16.440
<v Speaker 1>you've worked very hard to help with the back end

0:08:16.480 --> 0:08:19.360
<v Speaker 1>of this so that people who are developing for Alexa

0:08:19.760 --> 0:08:23.920
<v Speaker 1>can take advantage of this and give people a chance

0:08:24.000 --> 0:08:28.239
<v Speaker 1>to make Alexa do some pretty incredible things. So, first

0:08:28.280 --> 0:08:32.640
<v Speaker 1>of all, I gotta I gotta lay down some vocabulary.

0:08:32.800 --> 0:08:37.040
<v Speaker 1>Right before we started recording, you talked about how a

0:08:37.120 --> 0:08:39.079
<v Speaker 1>thing on your phone you refer to it as a skill,

0:08:39.080 --> 0:08:41.439
<v Speaker 1>and you think all of us, because I'm so deep

0:08:41.480 --> 0:08:44.760
<v Speaker 1>in this Amazon world, skills and apps. So, so tell

0:08:44.800 --> 0:08:49.000
<v Speaker 1>people what what exactly is an Alexa skill? Well, it's

0:08:49.000 --> 0:08:53.160
<v Speaker 1>interesting. Skill, the term: you think initially a skill is

0:08:53.200 --> 0:08:56.160
<v Speaker 1>something that you acquire or learn over time. And I

0:08:56.200 --> 0:08:59.320
<v Speaker 1>believe that Amazon used that terminology as a nod towards

0:09:00.000 --> 0:09:02.880
<v Speaker 1>really the machine learning aspect of what they offer in

0:09:02.920 --> 0:09:06.120
<v Speaker 1>the cloud, right, um, which is and David was speaking

0:09:06.160 --> 0:09:08.959
<v Speaker 1>to this a bit ago. Um, you know, the machine

0:09:09.040 --> 0:09:13.640
<v Speaker 1>learning component of what the Amazon service offers, and that

0:09:13.840 --> 0:09:19.280
<v Speaker 1>is the incredibly complex problem, right of taking spoken words

0:09:19.440 --> 0:09:23.640
<v Speaker 1>and resolving them to a format that uh software can

0:09:23.679 --> 0:09:27.880
<v Speaker 1>actually work with and treat in a predictable way. Um.

0:09:27.920 --> 0:09:32.199
<v Speaker 1>You have to imagine all the timing, the inflection, the variation,

0:09:32.320 --> 0:09:37.320
<v Speaker 1>just regional differences, uh, that someone's going to ask for something.

0:09:37.360 --> 0:09:39.400
<v Speaker 1>And then also you have to think about all the

0:09:39.480 --> 0:09:42.280
<v Speaker 1>various ways that someone could ask for the exact same things.

0:09:42.440 --> 0:09:45.840
<v Speaker 1>It's an astronomical number of ways that that can happen.

0:09:47.000 --> 0:09:50.640
<v Speaker 1>And it's the Alexa Skills Kit platform that actually resolves

0:09:50.679 --> 0:09:54.040
<v Speaker 1>that problem, um in a lot of ways, UM for you.

0:09:54.240 --> 0:09:57.000
<v Speaker 1>So I believe that's why they chose that terminology. It

0:09:57.040 --> 0:10:01.280
<v Speaker 1>makes sense, right? And actually Amazon's corpus, a corpus, by

0:10:01.320 --> 0:10:04.600
<v Speaker 1>the way, to throw another technology term into the mix,

0:10:04.800 --> 0:10:08.760
<v Speaker 1>UM is really a collection of data, and Amazon their

0:10:08.800 --> 0:10:12.960
<v Speaker 1>service offers a large collection of that data, which

0:10:13.000 --> 0:10:16.200
<v Speaker 1>is ever increasing by the way. UM. That simplifies the

0:10:16.240 --> 0:10:20.319
<v Speaker 1>problem of resolving that speech and converting it into UM.

0:10:20.400 --> 0:10:23.680
<v Speaker 1>What what the platform calls an intent, which is really

0:10:23.679 --> 0:10:25.880
<v Speaker 1>an indication of something that someone would like to do

0:10:25.960 --> 0:10:29.240
<v Speaker 1>at a given time. Yeah, it's interesting because if you

0:10:29.280 --> 0:10:32.440
<v Speaker 1>think about it from a classical computing standpoint, if you

0:10:32.480 --> 0:10:36.040
<v Speaker 1>wanted your computer to do something specific, uh, let's go

0:10:36.080 --> 0:10:37.840
<v Speaker 1>back to the DOS days. We're not gonna go all

0:10:37.880 --> 0:10:39.440
<v Speaker 1>the way back. We'll go back to DOS days because

0:10:39.440 --> 0:10:42.240
<v Speaker 1>that's that's my childhood. Then you would type in a

0:10:42.280 --> 0:10:45.600
<v Speaker 1>command and and the computer knew exactly what you wanted

0:10:45.640 --> 0:10:48.520
<v Speaker 1>to do as long as the actual program is installed

0:10:48.520 --> 0:10:52.520
<v Speaker 1>on your computer. Because you're following a very specific protocol

0:10:52.559 --> 0:10:55.160
<v Speaker 1>that does not vary. That's right, always going to be

0:10:55.200 --> 0:10:56.760
<v Speaker 1>the same. It's a one to one thing, and it's

0:10:56.800 --> 0:11:00.240
<v Speaker 1>a textual interface and very consistent. Yes, but when you

0:11:00.240 --> 0:11:03.120
<v Speaker 1>get to two different people, just just two people, and

0:11:03.160 --> 0:11:05.719
<v Speaker 1>you just want them to ask for the same thing,

0:11:05.800 --> 0:11:09.320
<v Speaker 1>but you're not guiding them in how to ask

0:11:09.320 --> 0:11:12.040
<v Speaker 1>for that same thing. That's where you start getting

0:11:12.080 --> 0:11:16.920
<v Speaker 1>into this, uh, this this variability and even if they're

0:11:16.920 --> 0:11:19.720
<v Speaker 1>both saying the exact same phrase, if they're from my

0:11:19.840 --> 0:11:21.599
<v Speaker 1>neck of the woods, there might be a bit of

0:11:21.640 --> 0:11:24.600
<v Speaker 1>a droll that's right there. If they're up over in Maine,

0:11:24.679 --> 0:11:26.559
<v Speaker 1>it's going to be a different sound if they're if

0:11:26.559 --> 0:11:28.800
<v Speaker 1>they're a non-native English speaker, there's gonna be

0:11:29.000 --> 0:11:32.480
<v Speaker 1>an inflection in their voice from whatever language was their

0:11:32.480 --> 0:11:36.400
<v Speaker 1>primary language. So these are all non trivial problems actually

0:11:36.920 --> 0:11:39.679
<v Speaker 1>in the in the programming world. Yes, so there's that problem,

0:11:39.800 --> 0:11:42.360
<v Speaker 1>and then there's also an Amazon refers to this as

0:11:42.440 --> 0:11:46.400
<v Speaker 1>the interaction model, which is the number of variances in

0:11:46.480 --> 0:11:50.200
<v Speaker 1>how someone could ask for something, and the platform takes

0:11:50.280 --> 0:11:53.880
<v Speaker 1>a fuzzy matching approach to solving that problem. Right, So

0:11:54.400 --> 0:11:57.440
<v Speaker 1>instead of providing an exhaustive list of all of the

0:11:57.520 --> 0:12:01.320
<v Speaker 1>various ways that someone could ask for information for an airport,

0:12:01.440 --> 0:12:06.880
<v Speaker 1>for example, um, instead they use a artificial intelligence approach

0:12:06.880 --> 0:12:10.840
<v Speaker 1>to that problem and generalize a set of training data

0:12:10.880 --> 0:12:14.640
<v Speaker 1>that you actually provide as a developer, uh to to

0:12:14.679 --> 0:12:18.120
<v Speaker 1>simplify that problem down. Okay, so yeah, because the first

0:12:18.160 --> 0:12:19.640
<v Speaker 1>thing I was guessing was like I wonder if it's

0:12:19.679 --> 0:12:21.480
<v Speaker 1>going to be probabilistic. It is one of those things

0:12:21.480 --> 0:12:24.560
<v Speaker 1>where it assigns a probability that I'm pretty sure this

0:12:24.600 --> 0:12:26.760
<v Speaker 1>is what they're asking for, so let's go for that.

0:12:26.920 --> 0:12:29.720
<v Speaker 1>Uh exactly. Yeah. We've talked about that with some of

0:12:29.760 --> 0:12:33.440
<v Speaker 1>the other artificial intelligence platforms out there, things like IBM's

0:12:33.480 --> 0:12:36.720
<v Speaker 1>Watson being a very simple example, right, simple in

0:12:36.720 --> 0:12:39.560
<v Speaker 1>the sense that it's easy to understand. It's actually a

0:12:39.600 --> 0:12:42.760
<v Speaker 1>pretty complicated machine as it turns out. But the fact

0:12:42.840 --> 0:12:44.559
<v Speaker 1>that you would say, all right, well, when it was

0:12:44.559 --> 0:12:47.679
<v Speaker 1>playing Jeopardy, it would never buzz in unless that certainty

0:12:47.760 --> 0:12:50.920
<v Speaker 1>was greater than like an eight threshold. And once you

0:12:51.000 --> 0:12:53.959
<v Speaker 1>explain to people that's what we mean by probabilistic, where

0:12:54.040 --> 0:12:58.679
<v Speaker 1>a computer is determining, well, how how quote unquote sure

0:12:58.800 --> 0:13:01.880
<v Speaker 1>am I that this is the intent of the person giving

0:13:01.920 --> 0:13:05.080
<v Speaker 1>the command then act upon it. One of the things

0:13:05.080 --> 0:13:08.160
<v Speaker 1>I wanted to also mention about Alexa skills, so we

0:13:08.320 --> 0:13:11.760
<v Speaker 1>talked about how it's it's voice commands, right that you're

0:13:11.760 --> 0:13:16.360
<v Speaker 1>giving it. These have their own kind of anatomy, Right,

0:13:16.440 --> 0:13:20.320
<v Speaker 1>You've got the skill, where how how you activate the

0:13:20.320 --> 0:13:23.240
<v Speaker 1>skill itself? What you call upon in order to make

0:13:23.240 --> 0:13:26.280
<v Speaker 1>the skill happen, and UM, I wonder if you can

0:13:26.320 --> 0:13:28.560
<v Speaker 1>maybe go into just a little bit like it might

0:13:28.559 --> 0:13:30.959
<v Speaker 1>not be something that the end user would necessarily think

0:13:30.960 --> 0:13:33.839
<v Speaker 1>about the developers would think about. Yeah, no exactly that.

0:13:34.000 --> 0:13:36.160
<v Speaker 1>You know, it's got its own grab bag of terminology,

0:13:36.240 --> 0:13:39.040
<v Speaker 1>but it's really not that complicated once you get past

0:13:39.040 --> 0:13:42.240
<v Speaker 1>those initial things. Um. You know. So as a developer

0:13:42.320 --> 0:13:45.079
<v Speaker 1>coming to the platform, the first thing that I asked was, well,

0:13:45.120 --> 0:13:47.560
<v Speaker 1>what the heck is the name of my app or skill?

0:13:47.679 --> 0:13:50.800
<v Speaker 1>I mean, um and updating that in my mind? UM,

0:13:50.840 --> 0:13:53.600
<v Speaker 1>and that is called the invocation name on the platform.

0:13:53.720 --> 0:13:57.240
<v Speaker 1>So what that invocation does is it invocation name does?

0:13:57.320 --> 0:13:59.920
<v Speaker 1>is it maps a user's words. Basically, I think of it

0:14:00.120 --> 0:14:02.480
<v Speaker 1>as a namespace. Like, I came to, you know,

0:14:02.960 --> 0:14:06.600
<v Speaker 1>Alexa Skills Kit development as a Java developer, and being

0:14:06.600 --> 0:14:10.520
<v Speaker 1>able to give package names or namespaces to classes

0:14:10.840 --> 0:14:14.319
<v Speaker 1>as a Java developer is very helpful. Um. And really

0:14:14.360 --> 0:14:18.439
<v Speaker 1>that's what it does, uh, an invocation name. So for example, UM,

0:14:18.480 --> 0:14:21.560
<v Speaker 1>in our class at Big Nerd Ranch that we've built for

0:14:21.560 --> 0:14:25.120
<v Speaker 1>for Amazon, we we've given We've built a skill to

0:14:25.280 --> 0:14:29.280
<v Speaker 1>give you information about airports and UM, you you say, Alexa,

0:14:29.400 --> 0:14:32.200
<v Speaker 1>ask airport info for flight delays at ATL.

0:14:32.280 --> 0:14:36.440
<v Speaker 1>For example, that airport info word or words is an

0:14:36.440 --> 0:14:39.600
<v Speaker 1>invocation name, so that initially brings up the skill. It

0:14:39.680 --> 0:14:42.440
<v Speaker 1>launches it so to speak UM. And then you've got

0:14:43.040 --> 0:14:46.840
<v Speaker 1>sample utterances UM. I mentioned that you train UM. I

0:14:47.240 --> 0:14:49.120
<v Speaker 1>kind of think of it as a brain in the cloud,

0:14:49.360 --> 0:14:51.360
<v Speaker 1>UM that lives in the cloud to do our bidding

0:14:51.400 --> 0:14:55.240
<v Speaker 1>for us for resolving the spoken words. UH. These sample

0:14:55.320 --> 0:14:59.080
<v Speaker 1>utterances are what effectively resolve what someone's asking for to

0:14:59.720 --> 0:15:02.840
<v Speaker 1>an indication or an intent that we'd like to ask

0:15:02.880 --> 0:15:07.080
<v Speaker 1>for airport information. And Amazon has made some really smart

0:15:07.160 --> 0:15:12.000
<v Speaker 1>decisions about making that a black box. Effectively, as a developer,

0:15:12.440 --> 0:15:15.000
<v Speaker 1>all we do is provide that training data and on

0:15:15.040 --> 0:15:17.600
<v Speaker 1>the other side we get an indication of what came out.

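NOTE
The sample-utterance idea described above can be sketched roughly as follows. This is an illustration only, not Amazon's implementation: the intent name AirportInfoIntent, the slot name AIRPORTCODE, and the function resolveIntent are hypothetical, and the matcher does exact template matching, whereas the real service generalizes from the samples in ways a developer never sees.
```javascript
// Hypothetical interaction model for the airport skill discussed in the episode.
// Intent and slot names are assumptions made for illustration.
const interactionModel = {
  invocationName: "airport info",
  intents: [
    {
      name: "AirportInfoIntent",
      sampleUtterances: [
        "flight delays at {AIRPORTCODE}",
        "delays at {AIRPORTCODE}",
        "what is the status of {AIRPORTCODE}",
      ],
    },
  ],
};
// Toy stand-in for the "black box": tries each sample as a literal template.
function resolveIntent(model, spokenText) {
  const text = spokenText.trim();
  for (const intent of model.intents) {
    for (const sample of intent.sampleUtterances) {
      // Turn "{AIRPORTCODE}" into a named capture group of the same name.
      const pattern = sample.replace(/\{(\w+)\}/g, "(?<$1>.+)");
      const match = text.match(new RegExp("^" + pattern + "$", "i"));
      if (match) return { intent: intent.name, slots: { ...match.groups } };
    }
  }
  return null; // nothing matched; the real service would still pick a best guess
}
```
Under these assumptions, resolveIntent(interactionModel, "flight delays at ATL") yields the intent name plus an AIRPORTCODE slot, which is the kind of structured result the skill service receives instead of raw audio.
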
0:15:18.040 --> 0:15:21.360
<v Speaker 1>We're not required to you know, set up a machine

0:15:21.480 --> 0:15:24.800
<v Speaker 1>learning server or you know, artificial intelligence or deal with

0:15:24.840 --> 0:15:28.440
<v Speaker 1>any of those algorithms that who knows how many countless

0:15:28.840 --> 0:15:33.600
<v Speaker 1>engineers and hours Amazon has invested into building that infrastructure. UM.

0:15:33.640 --> 0:15:36.360
<v Speaker 1>But we've got that as a tool to resolve the

0:15:36.360 --> 0:15:40.520
<v Speaker 1>information down to something that our skill service can work with.

0:15:40.720 --> 0:15:43.680
<v Speaker 1>So with an Alexa skill, you've got a skill interface

0:15:43.720 --> 0:15:45.960
<v Speaker 1>and a skill service. The skill interface is where this

0:15:46.040 --> 0:15:49.000
<v Speaker 1>brain that I keep mentioning lives, and the skill service

0:15:49.200 --> 0:15:53.560
<v Speaker 1>is it's really anything that can speak HTTPS. UM. Now

0:15:54.000 --> 0:15:57.240
<v Speaker 1>with our class, we're using Node.js, which is, you know,

0:15:57.280 --> 0:16:00.240
<v Speaker 1>it's JavaScript on the server side. Everybody's done a

0:16:00.360 --> 0:16:03.520
<v Speaker 1>dab of JavaScript here and there anyway, so it's an easy

0:16:03.560 --> 0:16:05.880
<v Speaker 1>language for people to pick up and get into. If

0:16:05.880 --> 0:16:09.360
<v Speaker 1>you've done you know, um uh, any web development, you've

0:16:09.400 --> 0:16:13.480
<v Speaker 1>likely got some JavaScript exposure. UM. So it's really an

0:16:13.520 --> 0:16:16.520
<v Speaker 1>extension upon that using some you know, you've got some

0:16:16.560 --> 0:16:19.600
<v Speaker 1>additional things like file I/O and being able to write

0:16:19.640 --> 0:16:23.480
<v Speaker 1>to a database using node. Uh. But with that JavaScript layer,

0:16:23.760 --> 0:16:26.160
<v Speaker 1>we get events from the Skill interface and we're able

0:16:26.200 --> 0:16:29.560
<v Speaker 1>to process those events and then send a message back

0:16:29.680 --> 0:16:34.120
<v Speaker 1>using JSON, right, JavaScript Object Notation, uh, to the

0:16:34.280 --> 0:16:37.480
<v Speaker 1>device, and she's able to speak our response.

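NOTE
The skill service described above, something that speaks HTTPS, receives events, and answers in JSON, might look roughly like this in Node.js. This is a sketch under assumptions, not the Big Nerd Ranch course code: AirportInfoIntent, AIRPORTCODE, handleRequest, buildSpeech, and the placeholder status table are hypothetical, and the HTTPS or hosting layer is left out.
```javascript
// Placeholder data for illustration only; a real service would query a live source.
const PLACEHOLDER_STATUS = { ATL: "no reported delays" };
// Wrap plain text in the JSON envelope that the device speaks back to the user.
function buildSpeech(text, shouldEndSession = true) {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text },
      shouldEndSession,
    },
  };
}
// Route an incoming event from the skill interface to a spoken reply.
function handleRequest(event) {
  const { request } = event;
  if (request.type === "LaunchRequest") {
    // The user opened the skill without asking anything yet; keep the session open.
    return buildSpeech("Which airport would you like to hear about?", false);
  }
  if (request.type === "IntentRequest" && request.intent.name === "AirportInfoIntent") {
    const code = request.intent.slots.AIRPORTCODE.value;
    const status = PLACEHOLDER_STATUS[code] || "no information available";
    return buildSpeech(`${code}: ${status}.`);
  }
  return buildSpeech("Sorry, I did not understand that.");
}
```
The point of the split is that the service only ever sees the resolved intent and slot values, never the audio, so the same handler could sit behind any web framework or serverless function.
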
0:16:37.640 --> 0:16:40.400
<v Speaker 1>It's pretty cool. This is really well, I'm sorry, Dave,

0:16:40.440 --> 0:16:43.880
<v Speaker 1>please go ahead. I was just gonna say to um

0:16:43.920 --> 0:16:46.160
<v Speaker 1>to to add on to that. You can think of

0:16:46.200 --> 0:16:49.440
<v Speaker 1>it in terms of you know, we've we talked before

0:16:49.520 --> 0:16:54.360
<v Speaker 1>about Alexa translating human language into a way that she

0:16:54.440 --> 0:16:58.800
<v Speaker 1>can talk to technology. It is the same when third

0:16:58.960 --> 0:17:02.280
<v Speaker 1>party developers go ahead and enhance what she can understand

0:17:02.760 --> 0:17:07.040
<v Speaker 1>through skills. So, if you are an Amazon customer and

0:17:07.160 --> 0:17:10.440
<v Speaker 1>you have an Echo and you talk to your Echo,

0:17:11.000 --> 0:17:14.040
<v Speaker 1>and perhaps you're using what Josh described, you're using a

0:17:14.119 --> 0:17:18.680
<v Speaker 1>starting phrase, uh, and an invocation name to launch something

0:17:18.760 --> 0:17:22.480
<v Speaker 1>like, let's say, Fitbit or Uber or your Domino's

0:17:22.520 --> 0:17:26.359
<v Speaker 1>easy order. Your voice goes to Amazon. Your voice is

0:17:26.440 --> 0:17:29.359
<v Speaker 1>never shared with a third party developer. You should think

0:17:29.440 --> 0:17:32.800
<v Speaker 1>of Alexa almost as that friend that can not

0:17:32.880 --> 0:17:35.840
<v Speaker 1>only translate to other pieces of technology, but she can

0:17:35.880 --> 0:17:38.960
<v Speaker 1>talk to all of these third party developers who have

0:17:39.080 --> 0:17:44.280
<v Speaker 1>their own technology. Uh. And so in essence, for for Fitbit,

0:17:45.000 --> 0:17:48.080
<v Speaker 1>when you talk to her, she understands that you're asking

0:17:48.320 --> 0:17:50.320
<v Speaker 1>how you know, how did you do today? And then

0:17:50.359 --> 0:17:53.600
<v Speaker 1>she goes and she talks to Fitbit, and Fitbit says, oh,

0:17:53.680 --> 0:17:55.119
<v Speaker 1>you know, we have these servers and we have all

0:17:55.160 --> 0:17:57.960
<v Speaker 1>this data and we know, uh, you know,

0:17:58.040 --> 0:17:59.920
<v Speaker 1>we we know you're a customer. We're gonna return that.

0:18:00.400 --> 0:18:03.600
<v Speaker 1>So go tell that customer this. And then Alexa goes

0:18:04.000 --> 0:18:08.000
<v Speaker 1>and she tells you the information that Fitbit had for you. So,

0:18:08.040 --> 0:18:13.040
<v Speaker 1>in in essence, it is translating between those things and uh,

0:18:13.160 --> 0:18:17.480
<v Speaker 1>you mentioned before the probability. Uh, it's very interesting. As

0:18:17.560 --> 0:18:22.040
<v Speaker 1>human beings when we have a conversation, we are constantly

0:18:22.080 --> 0:18:25.399
<v Speaker 1>making choices like that too. I bring all of my

0:18:25.480 --> 0:18:28.120
<v Speaker 1>experience to this podcast today, right, So when we're having

0:18:28.160 --> 0:18:31.280
<v Speaker 1>a conversation, it's based on all the years that I've

0:18:31.320 --> 0:18:34.280
<v Speaker 1>had a conversation and the understanding of what those words

0:18:34.800 --> 0:18:37.479
<v Speaker 1>actually mean. And there's a lot of things that go

0:18:37.600 --> 0:18:40.000
<v Speaker 1>unsaid. So for example, we all know what time

0:18:40.000 --> 0:18:42.320
<v Speaker 1>of the day it is where we are, we know

0:18:42.400 --> 0:18:46.000
<v Speaker 1>where we live, we know what country we're in. I

0:18:46.119 --> 0:18:48.280
<v Speaker 1>know you know, I know how to stand up, I

0:18:48.400 --> 0:18:50.600
<v Speaker 1>know what eyes and nose look like. I know what

0:18:50.640 --> 0:18:53.200
<v Speaker 1>a computer is. I knew what you meant when you referenced DOS.

0:18:54.320 --> 0:18:57.520
<v Speaker 1>A lot of machine learning is that a computer doesn't

0:18:57.560 --> 0:19:01.560
<v Speaker 1>necessarily have that context, right, And so when you're a

0:19:01.600 --> 0:19:06.119
<v Speaker 1>third party developer, you know the the interaction model that

0:19:06.240 --> 0:19:09.119
<v Speaker 1>Josh described, basically, think of it like, uh, you and

0:19:09.160 --> 0:19:10.679
<v Speaker 1>I are going to have a discussion around a new

0:19:10.680 --> 0:19:13.359
<v Speaker 1>topic today. Maybe it's something to do with fitness, and

0:19:13.440 --> 0:19:15.000
<v Speaker 1>so we all, you know, we all go look at

0:19:15.000 --> 0:19:17.080
<v Speaker 1>a wiki or maybe we look at Reddit, and we

0:19:17.160 --> 0:19:19.160
<v Speaker 1>define all the terms and then you and I can

0:19:19.359 --> 0:19:22.359
<v Speaker 1>can have a conversation and actually understand all of that.

0:19:22.440 --> 0:19:25.600
<v Speaker 1>And so that's what you're basically setting up when you

0:19:25.640 --> 0:19:28.400
<v Speaker 1>want to add an additional skill to Alexa is Hey,

0:19:28.440 --> 0:19:31.600
<v Speaker 1>here's all the terms, and when people ask for it

0:19:31.680 --> 0:19:34.800
<v Speaker 1>like this in language, here's what I want you to

0:19:34.840 --> 0:19:38.720
<v Speaker 1>tell me that they asked. And Dave you touched upon

0:19:38.840 --> 0:19:41.159
<v Speaker 1>something early on in that too that I think is

0:19:41.560 --> 0:19:44.879
<v Speaker 1>interesting and important to point out the idea that uh

0:19:44.880 --> 0:19:49.440
<v Speaker 1>with with you communicating to Alexa, your information is going

0:19:49.480 --> 0:19:52.880
<v Speaker 1>to Amazon. It's not being propagated across the internet willy nilly,

0:19:53.040 --> 0:19:57.200
<v Speaker 1>being spread everywhere. That also ends up being an important

0:19:57.240 --> 0:19:59.679
<v Speaker 1>part of some of the restrictions around the types of

0:19:59.680 --> 0:20:03.359
<v Speaker 1>skills that Amazon will accept for Alexa, one being

0:20:03.680 --> 0:20:07.240
<v Speaker 1>that they, there aren't going to be, uh, child ones,

0:20:07.400 --> 0:20:11.879
<v Speaker 1>ones for children specifically, because there is a very real

0:20:11.960 --> 0:20:15.879
<v Speaker 1>concern about privacy and information, particularly for children who may

0:20:15.880 --> 0:20:20.520
<v Speaker 1>not fully understand or recognize the importance of that. Um.

0:20:20.560 --> 0:20:23.959
<v Speaker 1>So are there other restrictions on the types of skills

0:20:24.000 --> 0:20:27.880
<v Speaker 1>Amazon is going to accept? I would imagine that anything

0:20:27.880 --> 0:20:33.760
<v Speaker 1>that was against the law, right out the door, not happening. Yeah,

0:20:33.800 --> 0:20:37.879
<v Speaker 1>you know, um in the scenario that you described, because

0:20:37.880 --> 0:20:41.480
<v Speaker 1>Alexa today has no way to distinguish between different voices.

0:20:41.520 --> 0:20:44.280
<v Speaker 1>So if we had an Echo in the room and

0:20:44.359 --> 0:20:47.080
<v Speaker 1>I was talking and you were talking, that is both

0:20:47.240 --> 0:20:50.679
<v Speaker 1>a human being, it's it's language. She cannot differentiate. So

0:20:50.720 --> 0:20:53.679
<v Speaker 1>she cannot differentiate whether it is a child asking for

0:20:53.760 --> 0:20:56.879
<v Speaker 1>something or an adult. And because of that, uh, and

0:20:56.960 --> 0:20:59.600
<v Speaker 1>you know, there are COPPA regulations and, and everything else

0:20:59.640 --> 0:21:02.320
<v Speaker 1>when it comes to mobile and web existing today, we

0:21:02.320 --> 0:21:04.480
<v Speaker 1>want to make sure that we're honoring that and that

0:21:04.560 --> 0:21:11.880
<v Speaker 1>we are protecting folks. Um your um uh second question

0:21:12.080 --> 0:21:15.920
<v Speaker 1>around um, if there's anything else that we're not necessarily allowing,

0:21:15.960 --> 0:21:19.600
<v Speaker 1>I think, um, there are. The guidelines are very similar.

0:21:19.680 --> 0:21:22.439
<v Speaker 1>So we do have an Android app store today, an

0:21:22.480 --> 0:21:25.359
<v Speaker 1>Amazon App Store which has very similar policy. So it

0:21:25.400 --> 0:21:28.520
<v Speaker 1>is something that Amazon has been in that that space

0:21:28.680 --> 0:21:32.119
<v Speaker 1>for years and we've kind of learned as we've gone through.

0:21:32.560 --> 0:21:35.680
<v Speaker 1>We just launched an update to the Alexa Skills section

0:21:36.640 --> 0:21:39.440
<v Speaker 1>where we have categories, and those categories are very similar

0:21:39.440 --> 0:21:42.160
<v Speaker 1>to what people would expect in a mobile app store

0:21:42.280 --> 0:21:45.800
<v Speaker 1>for you know, games, technology, things like that. So uh,

0:21:45.880 --> 0:21:48.080
<v Speaker 1>you know, it's very, it's got to be PG-13

0:21:48.160 --> 0:21:51.960
<v Speaker 1>and G, you know, no R-rated content, anything like that.

0:21:52.640 --> 0:21:57.439
<v Speaker 1>No, um, personally identifiable information. You would not. You know,

0:21:57.480 --> 0:22:00.400
<v Speaker 1>we're very careful about if you keep asking people all

0:22:00.440 --> 0:22:02.520
<v Speaker 1>about them, why are you asking about them? And you

0:22:02.560 --> 0:22:06.040
<v Speaker 1>better have a privacy policy for that that somebody has

0:22:06.080 --> 0:22:09.439
<v Speaker 1>allowed you to ask for that information. We do not

0:22:09.600 --> 0:22:14.520
<v Speaker 1>share any information about you. We don't share device information, um,

0:22:14.720 --> 0:22:18.280
<v Speaker 1>although you know developers of course keep asking for those

0:22:18.320 --> 0:22:19.960
<v Speaker 1>things you know they want to know is this coming

0:22:19.960 --> 0:22:23.360
<v Speaker 1>from an Echo or a mobile app? Because Alexa will run

0:22:23.480 --> 0:22:26.600
<v Speaker 1>in anything. So there are, uh, apps for the iPhone,

0:22:26.640 --> 0:22:28.879
<v Speaker 1>apps for Android where you can talk to Alexa and

0:22:28.920 --> 0:22:31.399
<v Speaker 1>get access to all your skills. You can get access

0:22:31.400 --> 0:22:34.200
<v Speaker 1>to all of your smart home functionality as well, using

0:22:34.200 --> 0:22:39.480
<v Speaker 1>the same familiar Alexa experience. See and I am glad

0:22:39.520 --> 0:22:41.600
<v Speaker 1>you were able to address that. I think that that

0:22:41.760 --> 0:22:44.600
<v Speaker 1>actually is a very important thing for any platform to

0:22:44.680 --> 0:22:48.520
<v Speaker 1>do well. You do. Actually it seems counterintuitive, but you

0:22:48.600 --> 0:22:51.680
<v Speaker 1>do need to have those guidelines and restrictions there. And

0:22:52.119 --> 0:22:56.520
<v Speaker 1>if you ever want to see what can happen if

0:22:56.560 --> 0:22:59.560
<v Speaker 1>you don't put them there, you can look at some

0:23:00.000 --> 0:23:05.040
<v Speaker 1>any dramatic lessons we've learned through things like Microsoft Tay,

0:23:05.640 --> 0:23:10.240
<v Speaker 1>which was certainly not intended to turn into a big problem.

0:23:10.359 --> 0:23:13.840
<v Speaker 1>It was that the intention, the intent was how long

0:23:13.880 --> 0:23:16.480
<v Speaker 1>did that last? For like a day and hours? It

0:23:16.560 --> 0:23:18.920
<v Speaker 1>was twenty four hours and then they pulled the plug. Yeah,

0:23:18.920 --> 0:23:20.879
<v Speaker 1>I did a full episode on Microsoft Tay, so I'm not

0:23:20.920 --> 0:23:23.040
<v Speaker 1>going to go back and do that, obviously, I'm not

0:23:23.080 --> 0:23:25.359
<v Speaker 1>going to ask either of you to comment on that

0:23:25.400 --> 0:23:29.240
<v Speaker 1>anymore than but just to just to say, like, if

0:23:29.280 --> 0:23:32.639
<v Speaker 1>you have a system and you don't have those restrictions

0:23:32.640 --> 0:23:37.120
<v Speaker 1>in and we are all in many ways like children,

0:23:37.240 --> 0:23:40.960
<v Speaker 1>and sometimes as children, you want to test boundaries, and

0:23:41.080 --> 0:23:45.600
<v Speaker 1>if you find there are no boundaries, problems happen. So

0:23:46.240 --> 0:23:49.520
<v Speaker 1>I'm in favor of boundaries personally. That's actually one thing,

0:23:49.840 --> 0:23:52.560
<v Speaker 1>uh that Amazon has done a really good job of

0:23:52.800 --> 0:23:58.200
<v Speaker 1>is curating the experience as well. You know, they literally, um,

0:23:58.240 --> 0:24:02.120
<v Speaker 1>as you submit your skill, audit that skill pretty thoroughly

0:24:02.119 --> 0:24:04.560
<v Speaker 1>and pretty rigorously to make sure that it conforms to

0:24:05.440 --> 0:24:09.720
<v Speaker 1>good Voice User Experience guidelines. Well, and when you're talking

0:24:09.720 --> 0:24:12.600
<v Speaker 1>about a device that that you know people think of

0:24:12.720 --> 0:24:16.640
<v Speaker 1>as listening to you, obviously you have a great responsibility

0:24:17.280 --> 0:24:20.439
<v Speaker 1>in order to provide an experience that isn't going to

0:24:20.600 --> 0:24:23.919
<v Speaker 1>be negative in any way. Uh, knowing that you know

0:24:23.960 --> 0:24:26.040
<v Speaker 1>you have to take a lot of time and effort

0:24:26.080 --> 0:24:29.560
<v Speaker 1>to make certain that you avoid any problems that could

0:24:29.600 --> 0:24:31.840
<v Speaker 1>come later down the line. I mean, that's got to

0:24:31.880 --> 0:24:36.119
<v Speaker 1>be pretty pretty top concern from Amazon. Yeah, that was,

0:24:37.080 --> 0:24:40.320
<v Speaker 1>you know, very very important for us at Amazon when

0:24:40.320 --> 0:24:43.680
<v Speaker 1>we created the device. And so if you're not using

0:24:43.680 --> 0:24:46.479
<v Speaker 1>an Echo, you're using another device, you know, maybe it's

0:24:46.480 --> 0:24:48.919
<v Speaker 1>a mobile app, or you know, maybe it's in the

0:24:49.000 --> 0:24:52.040
<v Speaker 1>car or in a clock radio. All of those devices

0:24:52.040 --> 0:24:55.120
<v Speaker 1>are push-to-talk. So those devices are never listening

0:24:55.480 --> 0:24:59.400
<v Speaker 1>unless you hit a button. If you have an Echo device,

0:25:00.080 --> 0:25:03.000
<v Speaker 1>you have the ability to hit a mute button, and

0:25:03.040 --> 0:25:05.119
<v Speaker 1>when you hit a mute button, you actually see a

0:25:05.160 --> 0:25:07.879
<v Speaker 1>red ring that goes around the outside letting you know,

0:25:08.400 --> 0:25:10.879
<v Speaker 1>and we do cut power to the microphone as well

0:25:11.359 --> 0:25:15.040
<v Speaker 1>when you hit that. Otherwise, Alexa is never listening unless

0:25:15.119 --> 0:25:18.320
<v Speaker 1>she hears her name. So when you say Alexa, then

0:25:18.359 --> 0:25:21.639
<v Speaker 1>we begin to record your voice. Your voice again is

0:25:21.640 --> 0:25:23.600
<v Speaker 1>only sent to Amazon. We do not send that to

0:25:23.680 --> 0:25:27.480
<v Speaker 1>third parties. And then you can open the Alexa app itself.

0:25:27.520 --> 0:25:29.880
<v Speaker 1>You can see every single little thing that you've ever

0:25:30.000 --> 0:25:33.960
<v Speaker 1>said to Alexa, after you've said Alexa and then talked.

0:25:34.760 --> 0:25:37.040
<v Speaker 1>And then you also have the ability to delete any

0:25:37.080 --> 0:25:39.720
<v Speaker 1>one of those, or if you'd like, you can contact

0:25:39.760 --> 0:25:42.720
<v Speaker 1>us and remove your entire history as well. So we

0:25:42.760 --> 0:25:45.959
<v Speaker 1>do put all the control into the customer's hands. And

0:25:46.119 --> 0:25:49.560
<v Speaker 1>obviously that was this great foresight on on the part

0:25:49.560 --> 0:25:51.760
<v Speaker 1>of Amazon, because you could easily imagine that if you

0:25:51.760 --> 0:25:55.120
<v Speaker 1>did not build that into your design from the ground up,

0:25:55.560 --> 0:25:58.400
<v Speaker 1>that you would you would very quickly realize the need

0:25:58.520 --> 0:26:01.800
<v Speaker 1>for that. And if that that, that's not a good feeling.

0:26:02.600 --> 0:26:04.520
<v Speaker 1>I think in any of this, this scenario, because it

0:26:04.600 --> 0:26:07.600
<v Speaker 1>is very new for people, it's about building up trust

0:26:08.400 --> 0:26:11.560
<v Speaker 1>and it's about you know, I use terminology of crawling,

0:26:12.280 --> 0:26:15.560
<v Speaker 1>then walking, and then running, and I think as a technologist,

0:26:15.600 --> 0:26:18.359
<v Speaker 1>I always want to run, but it's important in a

0:26:18.400 --> 0:26:21.679
<v Speaker 1>space like this to start out crawling, even if that

0:26:21.760 --> 0:26:25.280
<v Speaker 1>means you're limiting what you can actually do with the device.

0:26:25.520 --> 0:26:28.239
<v Speaker 1>And one of the you know, the things that I

0:26:28.280 --> 0:26:31.359
<v Speaker 1>feel is a sign that that's been a success is

0:26:31.400 --> 0:26:36.160
<v Speaker 1>that people come and they ask for more. You know, now, hey,

0:26:36.200 --> 0:26:38.040
<v Speaker 1>I want Alexa to do more. It's okay, you know,

0:26:38.080 --> 0:26:40.480
<v Speaker 1>give me permission. I wanted to control this. I wanted

0:26:40.520 --> 0:26:42.400
<v Speaker 1>to control that, and gee, it would be great if

0:26:42.400 --> 0:26:45.040
<v Speaker 1>we call it like, so people are now they're fine

0:26:45.119 --> 0:26:47.640
<v Speaker 1>with the fact that they can talk to Alexa. Now

0:26:47.680 --> 0:26:49.720
<v Speaker 1>I want Alexa to do even more. I want Alexa

0:26:50.160 --> 0:26:52.520
<v Speaker 1>to wake up and start talking to me even if

0:26:52.560 --> 0:26:54.560
<v Speaker 1>I haven't talked to her, which is something, you know,

0:26:54.640 --> 0:26:57.560
<v Speaker 1>that she doesn't do today. She will never interrupt you

0:26:58.000 --> 0:27:02.360
<v Speaker 1>or start talking out of nowhere unless you first engage.

0:27:02.400 --> 0:27:04.560
<v Speaker 1>And I think that's a sign of customer trust and

0:27:04.600 --> 0:27:07.280
<v Speaker 1>people getting excited about where the technology is headed. That's

0:27:07.280 --> 0:27:09.600
<v Speaker 1>pretty cool. I also want to say something else that

0:27:09.600 --> 0:27:13.240
<v Speaker 1>I think is really cool. Um, you guys might disagree,

0:27:13.600 --> 0:27:16.960
<v Speaker 1>and many listeners may disagree. But so I went back

0:27:17.000 --> 0:27:20.200
<v Speaker 1>and I was looking at There are blogs for specifically

0:27:20.240 --> 0:27:24.040
<v Speaker 1>for Alexa developers, which I recommend my listeners go out

0:27:24.160 --> 0:27:26.800
<v Speaker 1>check out those blogs. They are not written in a

0:27:26.840 --> 0:27:30.919
<v Speaker 1>way that is, uh, that's so dense or so technical

0:27:31.119 --> 0:27:34.040
<v Speaker 1>that they are inaccessible. They are very accessible and I

0:27:34.080 --> 0:27:36.400
<v Speaker 1>read over quite a few of them before we had

0:27:36.440 --> 0:27:39.200
<v Speaker 1>this conversation, and one of the reasons why I wanted

0:27:39.200 --> 0:27:41.960
<v Speaker 1>to bring this up. While they are incredibly helpful and technical,

0:27:42.760 --> 0:27:45.639
<v Speaker 1>one of the examples that was used in, uh, in

0:27:45.680 --> 0:27:49.280
<v Speaker 1>one of the blog posts spoke to a deep core

0:27:49.400 --> 0:27:52.960
<v Speaker 1>within me where it was actually being used to explain

0:27:53.040 --> 0:27:55.760
<v Speaker 1>what is a launch phrase, what's an invocation name, and

0:27:55.800 --> 0:27:59.800
<v Speaker 1>it was about, um, using Dungeon Dice, d20, for

0:28:00.320 --> 0:28:04.520
<v Speaker 1>that was me. Yeah, I'm as a hardcore D and

0:28:04.600 --> 0:28:08.040
<v Speaker 1>D fan from way back. I hung out with Gary

0:28:08.080 --> 0:28:14.200
<v Speaker 1>Gygax, creator of Dungeons and Dragons. Yeah, edition campaign right now. Actually,

0:28:17.400 --> 0:28:21.679
<v Speaker 1>if I hold this up talking about the video, you

0:28:21.720 --> 0:28:23.679
<v Speaker 1>see all my D and D manuals right there. They

0:28:23.760 --> 0:28:26.800
<v Speaker 1>are my originals. I can report that Dave in fact

0:28:26.880 --> 0:28:30.560
<v Speaker 1>does have a stack, approximately, looks like about two feet

0:28:30.560 --> 0:28:34.879
<v Speaker 1>tall of D and D manuals behind him. Um, first edition,

0:28:35.000 --> 0:28:37.439
<v Speaker 1>all the way through a second. Oh Dave, you and

0:28:37.480 --> 0:28:42.240
<v Speaker 1>me man this blogged this room and on the phone.

0:28:42.680 --> 0:28:44.600
<v Speaker 1>Uh yeah, so that was one of those things. But

0:28:44.760 --> 0:28:49.000
<v Speaker 1>I like that the examples you guys give are interesting,

0:28:49.320 --> 0:28:53.320
<v Speaker 1>they are easy to understand, and you also it's outside

0:28:53.400 --> 0:28:56.800
<v Speaker 1>of that initial reaction I think a lot of people

0:28:56.840 --> 0:28:59.719
<v Speaker 1>would have when they hear the word Amazon. Of course,

0:29:00.320 --> 0:29:02.800
<v Speaker 1>their first thought is going to go towards online shopping

0:29:03.160 --> 0:29:04.720
<v Speaker 1>and they're thinking, oh, well, this is going to be

0:29:04.760 --> 0:29:06.600
<v Speaker 1>an app that just makes it easier for me to

0:29:06.600 --> 0:29:08.960
<v Speaker 1>buy things. But then you get into something like this

0:29:09.000 --> 0:29:11.080
<v Speaker 1>and you're like, well, no, this is here's something where

0:29:11.080 --> 0:29:13.640
<v Speaker 1>imagine you've got a table and you know, you don't

0:29:13.760 --> 0:29:17.480
<v Speaker 1>have a metric ton of dice weighing the table down.

0:29:17.640 --> 0:29:19.320
<v Speaker 1>You actually have an app and you can just call

0:29:19.400 --> 0:29:23.280
<v Speaker 1>on any time you need, and then that becomes incorporated

0:29:23.320 --> 0:29:26.560
<v Speaker 1>into your game. It's almost like Alexa is playing the

0:29:26.560 --> 0:29:29.880
<v Speaker 1>game with you, and people start to realize, oh, there's

0:29:30.000 --> 0:29:33.760
<v Speaker 1>other stuff this can do that that aren't that isn't

0:29:33.800 --> 0:29:36.520
<v Speaker 1>related to buying things. I personally think one of the

0:29:36.560 --> 0:29:39.640
<v Speaker 1>cool decisions Amazon made with this is really treating it

0:29:39.680 --> 0:29:43.640
<v Speaker 1>like an interface and as a developer, implementing that interface,

0:29:43.840 --> 0:29:46.720
<v Speaker 1>you know, uh, for your purposes is really what the

0:29:46.760 --> 0:29:50.040
<v Speaker 1>platform is great at. I mean, at work, for example,

0:29:50.080 --> 0:29:53.040
<v Speaker 1>we had a hackathon recently and um, I wrote a

0:29:53.080 --> 0:29:55.560
<v Speaker 1>service in Elixir to be able to control, and

0:29:55.600 --> 0:29:57.440
<v Speaker 1>this runs on a Raspberry Pi, to be able to

0:29:57.440 --> 0:30:01.520
<v Speaker 1>control the servo locks on one of the doors in our building.

0:30:01.840 --> 0:30:04.520
<v Speaker 1>And I recently was able to write a skill that

0:30:04.640 --> 0:30:08.520
<v Speaker 1>interacts with that Raspberry Pi via the web to open

0:30:08.560 --> 0:30:11.400
<v Speaker 1>the door, um when we see a visitor come by.

0:30:11.480 --> 0:30:14.880
<v Speaker 1>So the platform is totally open as a developer, and

0:30:14.880 --> 0:30:17.280
<v Speaker 1>and, uh, there's something else I wanted to touch on

0:30:17.320 --> 0:30:19.600
<v Speaker 1>as well, that is mentioned in the blogs and we've

0:30:19.640 --> 0:30:21.560
<v Speaker 1>kind of touched talked about a little bit, is that

0:30:21.640 --> 0:30:24.240
<v Speaker 1>the the coding side of this, it is a server side,

0:30:24.240 --> 0:30:28.040
<v Speaker 1>not a device side, not a client side kind of service. Josh,

0:30:28.120 --> 0:30:30.480
<v Speaker 1>can you talk a little bit about about why that

0:30:30.640 --> 0:30:32.720
<v Speaker 1>is just for people who might wonder like, well, why

0:30:32.840 --> 0:30:36.120
<v Speaker 1>would all this be? Why would all the the hard work,

0:30:36.160 --> 0:30:38.200
<v Speaker 1>the crunching of numbers, if you will, Why is that

0:30:38.280 --> 0:30:42.000
<v Speaker 1>happening in the cloud and not on on a dedicated device. Yeah,

0:30:42.000 --> 0:30:43.520
<v Speaker 1>that's a great question. I mean, I think there are

0:30:43.520 --> 0:30:46.080
<v Speaker 1>a couple of really solid reasons right away. You can

0:30:46.120 --> 0:30:48.800
<v Speaker 1>list for one, um, you know, the fact that the

0:30:48.840 --> 0:30:53.840
<v Speaker 1>Amazon Echo you'll likely have numerous devices, and being able

0:30:53.880 --> 0:30:56.040
<v Speaker 1>to enable that skill and have it in the cloud

0:30:56.280 --> 0:30:59.920
<v Speaker 1>um is an instantaneous thing across all of those devices,

0:31:00.480 --> 0:31:02.720
<v Speaker 1>right So, I think that's one really solid reason that

0:31:02.760 --> 0:31:07.160
<v Speaker 1>Amazon went with that architecture. The other reason is moving everything

0:31:07.320 --> 0:31:10.440
<v Speaker 1>to the cloud, right keeping it on that infrastructure, it

0:31:10.480 --> 0:31:15.120
<v Speaker 1>allows for Amazon to iterate incredibly quickly on that skill

0:31:15.320 --> 0:31:18.480
<v Speaker 1>interface side of things. Um, you know, not having that

0:31:18.560 --> 0:31:22.640
<v Speaker 1>constraint to the device and worrying about shipping new hardware

0:31:22.680 --> 0:31:27.080
<v Speaker 1>to make changes over time. Amazon can simply change out

0:31:27.120 --> 0:31:31.520
<v Speaker 1>that interface and interaction model, um, uh, you know,

0:31:32.400 --> 0:31:37.160
<v Speaker 1>pretty much seamlessly. Um and it works um incredibly well.
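
NOTE
A minimal sketch of the server-side skill service described here, assuming Python.
The voice service posts JSON naming the intent it resolved, and the skill service
answers with JSON containing the text to speak, so nothing has to run on the device.
The function handle_request, the intent AirportInfoIntent, and the slot AirportCode
are hypothetical names. The JSON field names follow the general shape of Alexa Skills
Kit requests and responses and should be checked against the current documentation.
```python
import json
# Hypothetical skill service entry point: JSON text in, JSON text out.
def handle_request(body: str) -> str:
    event = json.loads(body)
    request = event.get("request", {})
    if request.get("type") == "IntentRequest":
        intent = request.get("intent", {})
        # "AirportInfoIntent" and "AirportCode" are made-up names for illustration.
        if intent.get("name") == "AirportInfoIntent":
            slot = intent.get("slots", {}).get("AirportCode", {})
            code = slot.get("value", "unknown")
            text = f"Looking up delays for airport {code}."
        else:
            text = "Sorry, I do not know how to help with that."
    else:
        # Any other request type is treated here as the user simply opening the skill.
        text = "Welcome to airport info."
    speech = {"type": "PlainText", "text": text}
    reply = {"version": "1.0", "response": {"outputSpeech": speech, "shouldEndSession": True}}
    return json.dumps(reply)
```
Because the handler is only JSON over the web, it could be hosted on any server and
rewritten in any language without touching the device.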

0:31:38.200 --> 0:31:41.400
<v Speaker 1>And that's a that's a big part of it. It

0:31:41.440 --> 0:31:44.480
<v Speaker 1>also enables um, you know, third party developers to host

0:31:45.160 --> 0:31:48.120
<v Speaker 1>their own services wherever they want, and whatever technology they

0:31:48.120 --> 0:31:51.000
<v Speaker 1>want, doesn't have to be in the Amazon cloud, using any

0:31:51.080 --> 0:31:55.600
<v Speaker 1>language that they want. And we knew, you know, uh,

0:31:55.840 --> 0:31:58.200
<v Speaker 1>we thought of some pretty interesting things for Alexa, but

0:31:58.280 --> 0:32:02.480
<v Speaker 1>we knew we really needed to, in essence, crowdsource her

0:32:02.520 --> 0:32:06.760
<v Speaker 1>ability to get smarter. Uh. And there's now over fifteen

0:32:06.840 --> 0:32:11.200
<v Speaker 1>hundred skills out there today by third-party developers

0:32:12.080 --> 0:32:15.160
<v Speaker 1>who are creating things with, you know, like Josh described,

0:32:15.520 --> 0:32:18.280
<v Speaker 1>about controlling doors and putting things in Raspberry Pis and

0:32:18.760 --> 0:32:22.960
<v Speaker 1>robots and drones and health health and life sciences and

0:32:23.040 --> 0:32:27.360
<v Speaker 1>doctors offices and hospitals and games and virtual reality, you know,

0:32:27.480 --> 0:32:30.680
<v Speaker 1>things that we never would have been able to write ourselves,

0:32:30.680 --> 0:32:33.640
<v Speaker 1>that people have just taken uh and they've run with it.

0:32:33.720 --> 0:32:36.040
<v Speaker 1>And I continue to be impressed every day by what

0:32:36.400 --> 0:32:38.760
<v Speaker 1>people are doing. Yeah, exactly, Dave. And I mean, you know,

0:32:38.880 --> 0:32:42.400
<v Speaker 1>you can embed you know, Alexa in if you're a

0:32:42.400 --> 0:32:46.440
<v Speaker 1>hardware designer, you can build your own version of Echo

0:32:46.560 --> 0:32:49.640
<v Speaker 1>with extended capabilities. In fact, you could have a display

0:32:49.960 --> 0:32:52.240
<v Speaker 1>or you know, if you have visions about what that

0:32:52.280 --> 0:32:55.480
<v Speaker 1>could be like, you can embed what's called Alexa Voice

0:32:55.480 --> 0:32:59.640
<v Speaker 1>Service within that hardware and actually create your own version

0:32:59.680 --> 0:33:03.560
<v Speaker 1>of of the Echo, which is incredibly powerful. Interesting. So

0:33:04.040 --> 0:33:06.680
<v Speaker 1>let's talk a bit now, Josh. You you work very

0:33:06.720 --> 0:33:10.080
<v Speaker 1>closely over at Big Nerd Ranch in developing the kind of

0:33:10.240 --> 0:33:14.320
<v Speaker 1>uh, curriculum that someone who wants to develop for Alexa

0:33:14.320 --> 0:33:16.400
<v Speaker 1>would find very helpful, to actually learn

0:33:16.440 --> 0:33:20.160
<v Speaker 1>how to code something and craft something. And you've mentioned

0:33:20.160 --> 0:33:23.720
<v Speaker 1>before also that that one of the test skills people

0:33:23.720 --> 0:33:27.880
<v Speaker 1>will develop is one about getting information about airports and

0:33:28.400 --> 0:33:32.160
<v Speaker 1>potential delays. Let's, let's kind of, in a layman's sense,

0:33:32.280 --> 0:33:35.520
<v Speaker 1>kind of walk through what is the process in general

0:33:35.680 --> 0:33:38.680
<v Speaker 1>of of developing for Alexa, and then at the end

0:33:38.680 --> 0:33:41.560
<v Speaker 1>of it, I think we can do a quick little demonstration. Alexa,

0:33:41.640 --> 0:33:44.200
<v Speaker 1>by the way, I didn't introduce her. I feel like

0:33:44.200 --> 0:33:46.720
<v Speaker 1>such a cad I didn't introduce her, but she is

0:33:46.800 --> 0:33:49.800
<v Speaker 1>also in the studio with us. So Alexa, how are

0:33:49.800 --> 0:33:54.520
<v Speaker 1>you doing today? Great. And if you're listening, make sure

0:33:54.560 --> 0:33:58.280
<v Speaker 1>you're muted. Although you were already listening your your echo

0:33:58.320 --> 0:34:01.160
<v Speaker 1>has probably gone on several times. My my parents are

0:34:01.200 --> 0:34:04.040
<v Speaker 1>going to send me a very mad email. My parents,

0:34:04.080 --> 0:34:07.240
<v Speaker 1>by the way, they have they're the ones who asked

0:34:07.240 --> 0:34:10.239
<v Speaker 1>me to do this podcast because my parents own own

0:34:10.320 --> 0:34:12.680
<v Speaker 1>an Echo, and they talk to it all the time,

0:34:12.719 --> 0:34:15.040
<v Speaker 1>and they demonstrate it to me all the time. And uh,

0:34:15.080 --> 0:34:17.640
<v Speaker 1>the only reason I haven't picked one up yet is

0:34:17.680 --> 0:34:19.120
<v Speaker 1>I've got to wait for my wife to go out

0:34:19.120 --> 0:34:21.319
<v Speaker 1>of town just long enough so I can get it

0:34:21.320 --> 0:34:23.560
<v Speaker 1>and incorporated into everything, so that when she comes back

0:34:23.560 --> 0:34:26.600
<v Speaker 1>and sees that I bought it, it's so awesome and

0:34:26.640 --> 0:34:29.080
<v Speaker 1>incorporated and integrated that she would never want to get

0:34:29.160 --> 0:34:32.640
<v Speaker 1>rid of it. Yeah, I run the wife, I do

0:34:32.719 --> 0:34:38.040
<v Speaker 1>the wife test with technology as well, and she's slowly

0:34:38.120 --> 0:34:42.480
<v Speaker 1>warmed up to it. She was resistant at first, but she's

0:34:42.560 --> 0:34:45.759
<v Speaker 1>just accepted it into our home. Step one is is

0:34:45.800 --> 0:34:48.759
<v Speaker 1>get my wife comfortable with Alexa. Step two is to

0:34:48.760 --> 0:34:51.920
<v Speaker 1>get her into a D and D campaign. All right,

0:34:52.000 --> 0:34:54.799
<v Speaker 1>So so we're I come to you, Josh, I just say,

0:34:54.840 --> 0:34:57.320
<v Speaker 1>I say I am interested in starting to develop for Alexa.

0:34:57.719 --> 0:35:01.440
<v Speaker 1>What's kind of the process of of learning to develop

0:35:01.480 --> 0:35:05.160
<v Speaker 1>and, and the process of actually developing a skill. Yeah.

0:35:05.239 --> 0:35:08.120
<v Speaker 1>And so you mentioned earlier that you had read our

0:35:08.160 --> 0:35:11.480
<v Speaker 1>blog and you followed that arc. It's um as you

0:35:11.520 --> 0:35:14.320
<v Speaker 1>get into the process, it's really a couple of things

0:35:14.400 --> 0:35:17.680
<v Speaker 1>that you get your brain wrapped around. Initially, you know,

0:35:17.800 --> 0:35:20.280
<v Speaker 1>most developers are going to be coming from a graphical

0:35:20.400 --> 0:35:23.560
<v Speaker 1>user interface background, right, I mean that's the lay of

0:35:23.600 --> 0:35:27.040
<v Speaker 1>the land these days. Of the applications being written, though

0:35:27.040 --> 0:35:30.399
<v Speaker 1>that's changing, are all graphical, right, because it's either meant

0:35:30.440 --> 0:35:36.040
<v Speaker 1>for a smartphone screen or it's on a laptop or desktop. Yeah. Once, once,

0:35:36.080 --> 0:35:41.800
<v Speaker 1>once Mac OS and, and Microsoft Windows really took hold,

0:35:42.200 --> 0:35:45.960
<v Speaker 1>we started to see other types of UIs disappear, and

0:35:46.000 --> 0:35:48.960
<v Speaker 1>then we just began to assume that was the only

0:35:48.960 --> 0:35:52.439
<v Speaker 1>way you could do things for a while. Yeah, exactly right. Uh.

0:35:52.520 --> 0:35:55.080
<v Speaker 1>You know, however, the paradigm as we see is changing.

0:35:55.960 --> 0:36:00.880
<v Speaker 1>So the first thing to understand is the skill interface

0:36:01.000 --> 0:36:04.200
<v Speaker 1>portion of what I consider to be a two part thing.

0:36:04.239 --> 0:36:07.680
<v Speaker 1>It's a skill interface and then a skill service. The

0:36:07.719 --> 0:36:11.560
<v Speaker 1>skill interface, like I said, is responsible for resolving the

0:36:11.640 --> 0:36:15.000
<v Speaker 1>user's words. So the first step of building a skill,

0:36:15.040 --> 0:36:18.840
<v Speaker 1>which is defining the interaction model, and so I consider

0:36:19.200 --> 0:36:22.400
<v Speaker 1>consider that to be setting an invocation name, which we

0:36:22.440 --> 0:36:25.759
<v Speaker 1>talked about earlier. It's the name of your skill, um

0:36:25.760 --> 0:36:28.120
<v Speaker 1>and how a user is going to be communicating with it.
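
NOTE
A rough sketch of what an invocation name does, assuming Python. The spoken phrase
names a skill, and the voice service uses that name to decide which skill receives
the rest of the request. The SKILLS registry, the skill labels, and the route
function are all hypothetical; this is not how Amazon implements the lookup.
```python
from typing import Optional, Tuple
# Hypothetical registry: each invocation name must be unique to one skill.
SKILLS = {"airport info": "AirportInfoSkill", "dungeon dice": "DungeonDiceSkill"}
def route(utterance: str) -> Optional[Tuple[str, str]]:
    """Return (skill, remaining words) for 'ask <invocation name> ...', or None."""
    words = utterance.lower().strip()
    if not words.startswith("ask "):
        return None
    rest = words[len("ask "):]
    for name, skill in SKILLS.items():
        # Match the whole invocation name, not just a prefix of a longer word.
        if rest == name or rest.startswith(name + " "):
            return skill, rest[len(name):].strip()
    return None
```
The remaining words are what the sample utterances would then be matched against.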

0:36:28.239 --> 0:36:30.319
<v Speaker 1>So this is so that you can you can create

0:36:30.360 --> 0:36:33.040
<v Speaker 1>a distinction between the skill that you're developing and every

0:36:33.040 --> 0:36:36.600
<v Speaker 1>other skill that's out there, available with Alexa. Yeah,

0:36:36.600 --> 0:36:38.400
<v Speaker 1>that's exactly right, and it's going to be unique to

0:36:38.400 --> 0:36:41.080
<v Speaker 1>the skill that you're building. Um. Amazon will in fact,

0:36:41.400 --> 0:36:43.320
<v Speaker 1>uh not allow you to have you know, of course,

0:36:43.400 --> 0:36:47.319
<v Speaker 1>the same name of another skill that that introduces an

0:36:47.320 --> 0:36:51.200
<v Speaker 1>interesting new, like, domain squatting type problem, doesn't it?

0:36:51.280 --> 0:36:54.160
<v Speaker 1>I wonder what will become of that. At any rate, um,

0:36:54.239 --> 0:36:57.319
<v Speaker 1>So in our class, for example, one skill that we

0:36:57.360 --> 0:36:59.560
<v Speaker 1>teach you to write, which gives you a pretty good

0:36:59.560 --> 0:37:03.200
<v Speaker 1>cross section of all the different capabilities of the platform,

0:37:03.320 --> 0:37:06.399
<v Speaker 1>is the airport info skill. So my first step would

0:37:06.400 --> 0:37:10.799
<v Speaker 1>be defining the invocation name of airport info in Amazon's

0:37:10.880 --> 0:37:14.520
<v Speaker 1>skill interface. And this is literally a web portal that

0:37:14.560 --> 0:37:18.680
<v Speaker 1>you visit and configure in your web browser. So as

0:37:18.719 --> 0:37:20.520
<v Speaker 1>you configure that, the next step that you get

0:37:20.560 --> 0:37:25.080
<v Speaker 1>to is defining the sample utterances that as a developer,

0:37:25.560 --> 0:37:29.080
<v Speaker 1>I need to hook into, um, as a result of

0:37:29.120 --> 0:37:32.960
<v Speaker 1>the resolution of what the interaction model said someone had

0:37:33.000 --> 0:37:37.440
<v Speaker 1>just spoken into the device, right. So this is the

0:37:37.560 --> 0:37:39.920
<v Speaker 1>kind of magic that we were talking about earlier. The black

0:37:40.000 --> 0:37:44.440
<v Speaker 1>box that Amazon offers you, um, that leverages the artificial

0:37:44.480 --> 0:37:48.000
<v Speaker 1>intelligence and machine learning, you know, the cutting edge technology

0:37:48.160 --> 0:37:52.600
<v Speaker 1>really there, um, and we train that model up with

0:37:52.800 --> 0:37:58.000
<v Speaker 1>sample utterances. So those sample utterances for airport information, um,

0:37:58.040 --> 0:38:01.040
<v Speaker 1>it's actually, uh, it's got a couple of different aspects.

0:38:01.160 --> 0:38:04.880
<v Speaker 1>One is, um, we want to be able to resolve

0:38:05.280 --> 0:38:08.560
<v Speaker 1>the various ways that someone will ask for information about

0:38:08.560 --> 0:38:11.520
<v Speaker 1>an airport. And then secondly, we've got to have a

0:38:11.560 --> 0:38:15.480
<v Speaker 1>way to pass a variable into the application or the skill.

0:38:15.680 --> 0:38:18.560
<v Speaker 1>So I want to be able to, for example, determine

0:38:18.880 --> 0:38:22.719
<v Speaker 1>if someone had said SFO or ATL, um, right,

0:38:22.800 --> 0:38:25.640
<v Speaker 1>they're asking about a specific airport code, and I

0:38:25.680 --> 0:38:27.600
<v Speaker 1>need to be able to throw that into a variable.

0:38:28.040 --> 0:38:31.080
<v Speaker 1>So the interaction model has a mechanism for doing that

0:38:31.120 --> 0:38:35.479
<v Speaker 1>called a slot, and a slot is, it's kind of, um,

0:38:35.520 --> 0:38:38.880
<v Speaker 1>it's Amazon's terminology for basically a variable assignment that we

0:38:38.880 --> 0:38:41.719
<v Speaker 1>want to be able to do. So in that in

0:38:41.760 --> 0:38:46.480
<v Speaker 1>that set of sample utterances, we give strings that represent

0:38:46.640 --> 0:38:49.640
<v Speaker 1>how a user could ask for airport information. So I might

0:38:49.680 --> 0:38:52.680
<v Speaker 1>ask for, uh, you know, Alexa, ask airport info for

0:38:52.840 --> 0:38:56.640
<v Speaker 1>flight delays at ATL, um, for flight information

0:38:56.719 --> 0:38:59.320
<v Speaker 1>at ATL, for delay status at ATL.

0:38:59.440 --> 0:39:01.560
<v Speaker 1>You see the variations of ways that someone could

0:39:01.560 --> 0:39:05.319
<v Speaker 1>ask for that info. Um, now, like I said, it's generalized,

0:39:05.400 --> 0:39:08.720
<v Speaker 1>so in other words, it's a fuzzy resolution between those

0:39:08.719 --> 0:39:13.319
<v Speaker 1>phrases and, um, an intent. So an intent is an

0:39:13.320 --> 0:39:15.960
<v Speaker 1>indication of what someone would like your skill to do.

0:39:17.000 --> 0:39:18.839
<v Speaker 1>And we want our skill to be able to give

0:39:18.920 --> 0:39:22.760
<v Speaker 1>us information about an airport. And so the skill interface,

0:39:22.840 --> 0:39:25.439
<v Speaker 1>once we've set up that sample utterances list and we've

0:39:25.480 --> 0:39:29.760
<v Speaker 1>said okay, the words that fall at this portion of

0:39:29.960 --> 0:39:32.560
<v Speaker 1>what someone says are actually going to be dropped into

0:39:32.600 --> 0:39:35.680
<v Speaker 1>this new thing called a slot. And I want to

0:39:35.680 --> 0:39:38.040
<v Speaker 1>be able to have that as a variable on the

0:39:38.080 --> 0:39:41.360
<v Speaker 1>second portion of building a skill, which is the skill service.
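
NOTE
A minimal sketch of the sample utterances, slot, and intent described above,
written in Python for brevity (the class itself uses Node). The intent name
AirportInfoIntent, the slot name AirportCode, and the resolve helper are all
hypothetical. Alexa's real interaction model is trained and fuzzy; this toy
version only matches the templates exactly, so treat it as an illustration.

NOTE
```python
import re
#
# Hypothetical sample utterances for an assumed "AirportInfoIntent".
# {AirportCode} marks the slot: the part of the phrase captured as a variable.
SAMPLE_UTTERANCES = [
    "flight delays at {AirportCode}",
    "flight information at {AirportCode}",
    "delay status at {AirportCode}",
]
#
def to_pattern(template):
    """Turn one sample utterance into a regex with a named group per slot."""
    escaped = re.escape(template)
    # re.escape rewrites "{AirportCode}" as "\{AirportCode\}"; swap it for a named group.
    body = re.sub(r"\\\{(\w+)\\\}", r"(?P<\1>.+)", escaped)
    return re.compile(body + "$", re.IGNORECASE)
#
def resolve(spoken):
    """Toy exact-template resolver: returns the intent and slot values, or None."""
    for template in SAMPLE_UTTERANCES:
        match = to_pattern(template).match(spoken)
        if match:
            return {"intent": "AirportInfoIntent", "slots": match.groupdict()}
    return None
#
print(resolve("flight delays at ATL"))
```

NOTE
All three templates map to the same intent, which mirrors the point made in
the conversation: many phrasings, one intent, and the airport code lands in
a slot that the skill service can read as a variable.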

0:39:41.960 --> 0:39:44.279
<v Speaker 1>So once the skill interface has done that work for me,

0:39:44.680 --> 0:39:47.360
<v Speaker 1>and sends the information of what it found, of what

0:39:47.480 --> 0:39:50.160
<v Speaker 1>someone had said to the skill service, that's where we're

0:39:50.160 --> 0:39:53.840
<v Speaker 1>in node land, or like David said, it's really any

0:39:54.040 --> 0:39:57.560
<v Speaker 1>programming language that can speak HTTPS and live on a

0:39:57.600 --> 0:40:00.719
<v Speaker 1>server. One of the steps that actually happens at that

0:40:00.760 --> 0:40:04.720
<v Speaker 1>point too, is after you've set up everything that Josh

0:40:04.719 --> 0:40:07.799
<v Speaker 1>talked about, in essence, this is what I want to

0:40:07.800 --> 0:40:11.160
<v Speaker 1>talk about. Here's some examples of me using it in

0:40:11.160 --> 0:40:13.920
<v Speaker 1>a sentence. That's where the computer science comes in. So

0:40:14.000 --> 0:40:17.280
<v Speaker 1>that's where it runs in the cloud, and we actually

0:40:17.280 --> 0:40:21.480
<v Speaker 1>do a bunch of uh, you know, AI and machine

0:40:21.560 --> 0:40:24.240
<v Speaker 1>learning and everything that sits on top of Amazon Web Services,

0:40:24.280 --> 0:40:27.920
<v Speaker 1>and we generate a lexicon and a language model, so

0:40:28.360 --> 0:40:31.560
<v Speaker 1>it's it's all done ahead of time, and it's almost

0:40:31.560 --> 0:40:34.160
<v Speaker 1>a you know, me being a sci fi geek, it's

0:40:34.160 --> 0:40:37.440
<v Speaker 1>almost like The Matrix when Neo says, teach me kung fu,

0:40:37.800 --> 0:40:39.919
<v Speaker 1>and then he's like, I now know kung fu. That's

0:40:39.960 --> 0:40:42.960
<v Speaker 1>the part that we do through the portal. So now

0:40:43.000 --> 0:40:46.520
<v Speaker 1>Alexa knows kung fu, uh, and she can talk to

0:40:46.560 --> 0:40:48.600
<v Speaker 1>you about kung fu. But you actually have to build

0:40:48.640 --> 0:40:51.520
<v Speaker 1>something to be able to respond to that. And that's

0:40:51.560 --> 0:40:53.640
<v Speaker 1>what Josh is talking about now when we move on

0:40:53.680 --> 0:40:56.160
<v Speaker 1>to the node piece, right, Yeah, so this is the

0:40:56.200 --> 0:40:59.279
<v Speaker 1>magic black box in the cloud that has you know,

0:40:59.360 --> 0:41:03.239
<v Speaker 1>the cutting edge artificial intelligence technology that Amazon offers us

0:41:03.960 --> 0:41:07.080
<v Speaker 1>um on the interaction model side of things, and as

0:41:07.080 --> 0:41:11.080
<v Speaker 1>a developer, I provide the training data. Uh, when David

0:41:11.120 --> 0:41:14.160
<v Speaker 1>said it sort of prepares things ahead of time, it

0:41:14.200 --> 0:41:19.160
<v Speaker 1>literally bakes that training data down into something that occurs,

0:41:19.280 --> 0:41:22.279
<v Speaker 1>you know, in near real time. That interaction is is

0:41:22.400 --> 0:41:25.760
<v Speaker 1>very very fast and there's there's little to no latency

0:41:26.040 --> 0:41:28.640
<v Speaker 1>um and the reason is because they use that sample

0:41:28.680 --> 0:41:31.839
<v Speaker 1>data to prepare a model ahead of time. Uh, it's

0:41:31.880 --> 0:41:35.080
<v Speaker 1>really cool. It's it's super cool. It's you know, we're

0:41:35.160 --> 0:41:37.520
<v Speaker 1>we're living a sci fi novel right now. I feel

0:41:37.520 --> 0:41:39.759
<v Speaker 1>like I'm in Cryptonomicon or so. Well, it's it's

0:41:39.760 --> 0:41:42.279
<v Speaker 1>got to be exciting too to be a developer that

0:41:42.400 --> 0:41:47.440
<v Speaker 1>gets to take advantage of that level of of technology

0:41:47.760 --> 0:41:50.960
<v Speaker 1>and not have to build it yourself, Like to to

0:41:51.360 --> 0:41:54.279
<v Speaker 1>develop on top of something that has already got this

0:41:54.320 --> 0:41:57.080
<v Speaker 1>amazing capability and on the shoulders of giants. I mean,

0:41:57.080 --> 0:42:00.279
<v Speaker 1>we're leveraging all of that platform and infrastructure and

0:42:00.400 --> 0:42:03.799
<v Speaker 1>research that's that's occurred there. Uh. And right now there

0:42:03.880 --> 0:42:06.719
<v Speaker 1>is a language model out in the cloud that knows

0:42:06.800 --> 0:42:10.040
<v Speaker 1>all about rolling Dungeons and Dragons dice, which is, it's

0:42:10.040 --> 0:42:12.520
<v Speaker 1>a reassuring thought, isn't it. I would say that that

0:42:12.600 --> 0:42:16.239
<v Speaker 1>was critical and critical hit it too a good initiative

0:42:16.680 --> 0:42:21.600
<v Speaker 1>in doing that, I'm terrible. I've lost my saving throw.

0:42:21.640 --> 0:42:27.279
<v Speaker 1>It gets being cool, that's not she will actually she'll

0:42:27.280 --> 0:42:31.400
<v Speaker 1>make a little scathing comment about you rolling a one.

0:42:31.880 --> 0:42:33.759
<v Speaker 1>That's that's pa why I showed it so like if

0:42:33.800 --> 0:42:36.400
<v Speaker 1>you could aller it. But the big light of my

0:42:36.480 --> 0:42:39.400
<v Speaker 1>kids rolling some dice if they had one, and she'll say, hey,

0:42:39.440 --> 0:42:41.520
<v Speaker 1>do you have a vorpal sword? If you hit twenty,

0:42:41.800 --> 0:42:46.120
<v Speaker 1>oh nice because off the head there you go. Uh so,

0:42:46.480 --> 0:42:48.399
<v Speaker 1>all right, So you you get to this point where

0:42:48.440 --> 0:42:51.440
<v Speaker 1>you've you've defined what it is you, you know, in code,

0:42:51.440 --> 0:42:53.799
<v Speaker 1>You've defined what it is that the action that needs

0:42:53.840 --> 0:42:56.799
<v Speaker 1>to happen, um and you've you've built it. So we've

0:42:56.800 --> 0:42:59.360
<v Speaker 1>provided the training data and the interaction model, right, and

0:42:59.400 --> 0:43:04.200
<v Speaker 1>we've said when you uh hear words that are along

0:43:04.239 --> 0:43:08.560
<v Speaker 1>the lines of our request for airport information, I want

0:43:08.600 --> 0:43:11.960
<v Speaker 1>you to send an airport info intent as part of

0:43:11.960 --> 0:43:15.239
<v Speaker 1>a JSON payload to my skill service. And I also

0:43:15.280 --> 0:43:17.640
<v Speaker 1>want you to grab I want you to wrap up

0:43:17.719 --> 0:43:21.000
<v Speaker 1>what they said at this portion of that utterance into

0:43:21.040 --> 0:43:24.600
<v Speaker 1>a variable that I can use on my skill service. Um,

0:43:24.640 --> 0:43:28.120
<v Speaker 1>so the skill service in our class is living on, um,

0:43:28.160 --> 0:43:32.600
<v Speaker 1>it's actually another Amazon service called Lambda, um, which is

0:43:32.640 --> 0:43:35.880
<v Speaker 1>an HTTPS server that spins up and shuts down. It's

0:43:35.920 --> 0:43:39.440
<v Speaker 1>kind of like Heroku. If anyone's ever done Ruby on Rails

0:43:39.480 --> 0:43:42.280
<v Speaker 1>development here, Heroku is a real go-to for handling

0:43:42.280 --> 0:43:44.799
<v Speaker 1>the DevOps side of things, um, and it's kind of

0:43:44.800 --> 0:43:47.560
<v Speaker 1>an on demand server platform that we're using in the

0:43:47.560 --> 0:43:53.200
<v Speaker 1>class, uh, and so on that, on that AWS Lambda instance,

0:43:53.280 --> 0:43:58.160
<v Speaker 1>we've written Node code that handles parsing the HTTPS, um,

0:43:58.320 --> 0:44:01.239
<v Speaker 1>JSON payload that comes across from the skill interface and

0:44:01.280 --> 0:44:04.400
<v Speaker 1>it says, okay, I've gotten JSON information here, and I

0:44:04.440 --> 0:44:09.400
<v Speaker 1>received an airport info intent, um, and I've got a

0:44:09.480 --> 0:44:12.920
<v Speaker 1>variable of ATL here. And with that information

0:44:13.239 --> 0:44:15.200
<v Speaker 1>I can then do what I will with it. I

0:44:15.200 --> 0:44:17.360
<v Speaker 1>mean I can I can go make a web request

0:44:17.640 --> 0:44:20.640
<v Speaker 1>or write to a database. I could call off to

0:44:20.680 --> 0:44:23.480
<v Speaker 1>another service, um, and that's actually what we do with

0:44:23.520 --> 0:44:27.680
<v Speaker 1>airport info, is we hit the Federal Aviation Administration's servers

0:44:28.320 --> 0:44:31.400
<v Speaker 1>and request the status for ATL. Then we

0:44:31.480 --> 0:44:34.440
<v Speaker 1>build a string which is what Alexa is going to

0:44:34.480 --> 0:44:37.760
<v Speaker 1>respond with, and send it back to the Skill interface

0:44:38.680 --> 0:44:41.600
<v Speaker 1>and that gets forwarded onto the device. So that's kind

0:44:41.600 --> 0:44:44.320
<v Speaker 1>of the round trip of how an interaction with the

0:44:44.360 --> 0:44:46.839
<v Speaker 1>skill would work and what we would do as the developer. There,

0:44:46.960 --> 0:44:50.200
<v Speaker 1>so got the skill interface, skill service, skill service responds

0:44:50.280 --> 0:44:52.120
<v Speaker 1>to the interface and hands it back to the device.
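
NOTE
A rough sketch of the round trip just described, written in Python for
brevity (the class uses Node on AWS Lambda). The request and response field
names follow the Alexa custom skill JSON format as best understood here.
The intent name AirportInfoIntent, the slot name AirportCode, and
fetch_delay_status are hypothetical; the lookup returns canned data rather
than calling the FAA, so this shows the shape of the flow, not the real skill.

NOTE
```python
def fetch_delay_status(airport_code):
    """Hypothetical stand-in for the FAA lookup: canned data, no network call."""
    canned = {"ATL": {"name": "Hartsfield-Jackson Atlanta International", "delay": False}}
    return canned.get(airport_code.upper())
#
def build_response(speech_text):
    """Wrap plain text in the envelope that is sent back to the skill interface."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": True,
        },
    }
#
def lambda_handler(event, context):
    """Entry point in the (event, context) shape AWS Lambda uses for Python."""
    request = event["request"]
    if request["type"] != "IntentRequest" or request["intent"]["name"] != "AirportInfoIntent":
        return build_response("Sorry, I can only look up airport delays.")
    # The slot value is the variable the interaction model pulled out of the utterance.
    code = request["intent"].get("slots", {}).get("AirportCode", {}).get("value", "")
    status = fetch_delay_status(code)
    if status is None:
        return build_response("I could not find that airport.")
    if status["delay"]:
        return build_response("There is currently a delay at " + status["name"] + ".")
    return build_response("There is currently no delay at " + status["name"] + ".")
```

NOTE
The string inside outputSpeech is what Alexa speaks. Everything before it
(parsing the intent, reading the slot, calling another service) is ordinary
server code, which is why any language that can sit behind HTTPS would do.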

0:44:52.239 --> 0:44:58.040
<v Speaker 1>Now for practical experience for the person who's actually using Alexa. Obviously,

0:44:58.440 --> 0:45:01.200
<v Speaker 1>all of that information, while awesome, is not something that

0:45:01.200 --> 0:45:02.960
<v Speaker 1>you have to worry about. If you did have to

0:45:03.000 --> 0:45:06.040
<v Speaker 1>worry about it, then you would not really have a

0:45:06.800 --> 0:45:09.560
<v Speaker 1>consumer product in your hands. You'd be a developer. But

0:45:10.080 --> 0:45:12.880
<v Speaker 1>in order for you to understand what is what's the

0:45:12.960 --> 0:45:15.239
<v Speaker 1>end result with that, I thought it'd be cool. Josh,

0:45:15.320 --> 0:45:18.400
<v Speaker 1>would you mind asking Alexa, kind of to give an example

0:45:18.480 --> 0:45:21.440
<v Speaker 1>of that app in action? Yeah? Sure, So it's skill

0:45:21.480 --> 0:45:23.960
<v Speaker 1>I should say I have. Don't worry about it. That's

0:45:23.960 --> 0:45:26.759
<v Speaker 1>a common blunder initially coming into the thing. I think

0:45:26.800 --> 0:45:30.120
<v Speaker 1>I said app once today as well, So we don't

0:45:30.160 --> 0:45:34.680
<v Speaker 1>have any app. That's fair, that's fair, that's a good

0:45:34.719 --> 0:45:39.279
<v Speaker 1>skill idea. Yeah, so alright, so let me first, let's

0:45:39.360 --> 0:45:44.120
<v Speaker 1>unmute her, all right? So okay, so, Alexa, ask

0:45:44.200 --> 0:45:48.200
<v Speaker 1>airport info for flight delays at ATL. There

0:45:48.280 --> 0:45:51.759
<v Speaker 1>is currently no delay at Hartsfield-Jackson Atlanta International.

0:45:53.080 --> 0:45:56.759
<v Speaker 1>Ladies and gentlemen, this is an amazing day. Not only

0:45:56.760 --> 0:46:01.280
<v Speaker 1>did we get to hear a voice recognition service work

0:46:01.600 --> 0:46:03.520
<v Speaker 1>in real time on this show, but there are no

0:46:03.640 --> 0:46:11.640
<v Speaker 1>flight delays at the airport. Actually yeah, it's it's it's

0:46:11.640 --> 0:46:13.600
<v Speaker 1>not calling it's not really calling the airport, it's not

0:46:13.640 --> 0:46:16.560
<v Speaker 1>calling the FAA, uh. But this is this is

0:46:16.600 --> 0:46:21.320
<v Speaker 1>really interesting again showing just a very simple application. Obviously,

0:46:21.360 --> 0:46:25.560
<v Speaker 1>you could ask for lots of different things. Sorry, skill,

0:46:26.480 --> 0:46:28.600
<v Speaker 1>but you could ask for lots of different things, including

0:46:28.680 --> 0:46:31.080
<v Speaker 1>like if you were curious about what's the weather going

0:46:31.120 --> 0:46:33.160
<v Speaker 1>to be? Like what what was the time of day?

0:46:33.760 --> 0:46:36.640
<v Speaker 1>Can you play my favorite playlist on such and such?

0:46:37.040 --> 0:46:40.000
<v Speaker 1>All of these sort of things that are all really

0:46:40.040 --> 0:46:41.680
<v Speaker 1>just you know, if you come up with an idea

0:46:41.760 --> 0:46:45.120
<v Speaker 1>that could be uh essentially if you could if you

0:46:45.120 --> 0:46:49.120
<v Speaker 1>could do it across a computer web browser, and if

0:46:49.160 --> 0:46:52.800
<v Speaker 1>you're able to translate that into an experience that works,

0:46:53.480 --> 0:46:58.759
<v Speaker 1>especially speech um in a speech role, then it's it's

0:46:58.920 --> 0:47:01.960
<v Speaker 1>totally possible. And not only that, but I saw because I

0:47:02.000 --> 0:47:04.239
<v Speaker 1>was curious about this. There are some things that we

0:47:04.320 --> 0:47:08.000
<v Speaker 1>do with computers where if you were asking a question,

0:47:08.480 --> 0:47:11.160
<v Speaker 1>you might need a little more information or a little

0:47:11.200 --> 0:47:14.040
<v Speaker 1>bit more of Uh, it might be a little difficult

0:47:14.280 --> 0:47:18.200
<v Speaker 1>to explain something just simply in spoken word. Uh. I

0:47:18.280 --> 0:47:21.360
<v Speaker 1>find that all the time doing an audio podcast, that

0:47:21.480 --> 0:47:25.240
<v Speaker 1>it can be kind of challenging to explain a certain

0:47:25.280 --> 0:47:28.200
<v Speaker 1>concept without the use of visual aids. I saw that

0:47:28.320 --> 0:47:30.799
<v Speaker 1>with Alexa you could actually pair that with something like

0:47:30.840 --> 0:47:33.399
<v Speaker 1>a display on an app where you can you can

0:47:33.440 --> 0:47:36.480
<v Speaker 1>have a little more information about whatever the request is. Yeah,

0:47:36.480 --> 0:47:40.919
<v Speaker 1>that's right. Yes, so Amazon also provides an additional part

0:47:40.960 --> 0:47:44.479
<v Speaker 1>of that response from the server called a card um.

0:47:44.520 --> 0:47:47.280
<v Speaker 1>So if we sent that information onto the skill interface,

0:47:47.400 --> 0:47:51.440
<v Speaker 1>what would happen is we could preserve or send additional

0:47:51.480 --> 0:47:56.040
<v Speaker 1>info to the Alexa companion app, which is in your browser,

0:47:56.080 --> 0:47:59.200
<v Speaker 1>it's on your phone, it's on the Android and iOS

0:47:59.400 --> 0:48:02.280
<v Speaker 1>client that interfaces with it, and we'd have a history

0:48:02.560 --> 0:48:05.439
<v Speaker 1>um and additional info as well um that we could

0:48:05.440 --> 0:48:08.839
<v Speaker 1>display with that card mechanism. Right, that's a great way

0:48:08.840 --> 0:48:11.839
<v Speaker 1>of getting around what could be a real challenge. I mean,

0:48:11.880 --> 0:48:15.719
<v Speaker 1>that's something that because we've designed so much of our

0:48:15.760 --> 0:48:21.800
<v Speaker 1>interaction to be primarily a visual experience, there's certain tasks

0:48:21.840 --> 0:48:25.200
<v Speaker 1>that you do that don't translate as easily into something

0:48:25.239 --> 0:48:28.200
<v Speaker 1>that's audio based. Yeah, it's a really interesting aspect of

0:48:28.280 --> 0:48:30.839
<v Speaker 1>voice user interface design as well. I mean it's such

0:48:30.880 --> 0:48:35.600
<v Speaker 1>a um ephemeral format. It's it's here and then it's gone,

0:48:35.960 --> 0:48:39.520
<v Speaker 1>and it's an interesting problem coming from you know, the

0:48:39.520 --> 0:48:43.719
<v Speaker 1>graphical user interface background that everybody's got. Um, I think

0:48:43.719 --> 0:48:46.200
<v Speaker 1>being able to hook a little bit into the

0:48:46.400 --> 0:48:49.120
<v Speaker 1>GUI side of things and persist data, um, and

0:48:49.160 --> 0:48:51.400
<v Speaker 1>allow people to refer back to that, um, you know,

0:48:51.520 --> 0:48:55.719
<v Speaker 1>it does give your skill legs beyond just that one interaction.

0:48:56.080 --> 0:49:00.320
<v Speaker 1>You know, for situational experiences and on-demand info, voice

0:49:00.320 --> 0:49:03.040
<v Speaker 1>is great, but being able to carry it beyond that

0:49:03.200 --> 0:49:05.280
<v Speaker 1>is also one of the things that the platform offers.
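
NOTE
A small sketch of the card mechanism mentioned above, in Python for brevity.
It assumes the response JSON accepts a card object of type Simple with a
title and content alongside the spoken text. The function name
build_response_with_card and the sample strings are hypothetical.

NOTE
```python
def build_response_with_card(speech_text, card_title, card_text):
    """Return a response that is spoken aloud and also leaves a card in the companion app."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            # The card persists after the spoken reply is gone, so it can hold more detail.
            "card": {"type": "Simple", "title": card_title, "content": card_text},
            "shouldEndSession": True,
        },
    }
#
reply = build_response_with_card(
    "There is currently no delay at Hartsfield-Jackson Atlanta International.",
    "Airport Info: ATL",
    "No delays reported at ATL at the time of the request.",
)
print(reply["response"]["card"]["title"])
```

NOTE
The spoken text stays short because voice is ephemeral, while the card
carries the detail a listener might want to look at later in the app.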

0:49:05.640 --> 0:49:08.640
<v Speaker 1>That's fantastic. Now I have a question for both of you.

0:49:08.680 --> 0:49:10.920
<v Speaker 1>I'm going to kind of start wrapping this up because

0:49:11.520 --> 0:49:14.080
<v Speaker 1>I feel like we've got a nice, a nice foundation

0:49:14.160 --> 0:49:16.399
<v Speaker 1>for a discussion here and I think that it will

0:49:16.480 --> 0:49:19.960
<v Speaker 1>really uh, it's really enlightening to understand, like what are

0:49:20.000 --> 0:49:22.680
<v Speaker 1>the challenges not just the challenges, but what are the

0:49:23.120 --> 0:49:26.600
<v Speaker 1>potential benefits of this kind of technology. My question to

0:49:26.680 --> 0:49:29.480
<v Speaker 1>both of you, and it's a pretty simple one, is

0:49:30.239 --> 0:49:33.520
<v Speaker 1>what is your personal favorite skill that you've had a

0:49:33.600 --> 0:49:36.239
<v Speaker 1>chance to play with on Alexa and Dave, I'm gonna

0:49:36.280 --> 0:49:40.799
<v Speaker 1>have you go first. Yeah. So, um, besides, you know,

0:49:41.200 --> 0:49:44.279
<v Speaker 1>using the dice with my kids the funniest one, and

0:49:44.320 --> 0:49:46.560
<v Speaker 1>this is this speaks I think a little bit to

0:49:46.800 --> 0:49:50.839
<v Speaker 1>my maturity level. Is I have this skill enabled. It's

0:49:50.880 --> 0:49:54.480
<v Speaker 1>called for a fart, and you can ask for a

0:49:54.600 --> 0:49:58.080
<v Speaker 1>fart and you will get one through Alexa. So I

0:49:58.200 --> 0:50:02.880
<v Speaker 1>actually I have dots so upstairs in my house. Yeah,

0:50:02.920 --> 0:50:05.839
<v Speaker 1>all right, there's a dot in each one of my

0:50:05.920 --> 0:50:08.880
<v Speaker 1>kid's rooms. And then I have a full echo in

0:50:08.920 --> 0:50:11.600
<v Speaker 1>my bedroom. And there's actually a place in that upper

0:50:11.640 --> 0:50:14.600
<v Speaker 1>hallway that I can say, Alexa, ask for a fart,

0:50:14.680 --> 0:50:20.200
<v Speaker 1>and there is a symphony musical returned across the entire

0:50:20.280 --> 0:50:24.959
<v Speaker 1>upstairs and I immediately hear, Dad, not again. A nice,

0:50:25.040 --> 0:50:30.160
<v Speaker 1>a nice chorus followed by groans. I can appreciate that

0:50:30.200 --> 0:50:35.520
<v Speaker 1>as someone who loves puns and groan-worthy humor. Um, yeah,

0:50:35.560 --> 0:50:39.239
<v Speaker 1>that's fantastic. All right, So Josh, you your answer, that's

0:50:39.239 --> 0:50:42.400
<v Speaker 1>an interesting one. Well, I have to say on the

0:50:42.440 --> 0:50:46.480
<v Speaker 1>pragmatic side of things, um, I have used the Lyft skill

0:50:47.080 --> 0:50:49.919
<v Speaker 1>to great effect. So you just say, Alexa, ask Lyft

0:50:49.960 --> 0:50:52.000
<v Speaker 1>for a ride, and guess what a car shows up

0:50:52.560 --> 0:50:55.759
<v Speaker 1>at your door right to pick you up. Amazing, and

0:50:55.800 --> 0:50:57.799
<v Speaker 1>it's such a good fit for the platform as well.

0:50:58.160 --> 0:51:00.040
<v Speaker 1>I mean, you're walking out the door, you say, I

0:51:00.080 --> 0:51:01.920
<v Speaker 1>don't want to get my phone out. Let's let's have

0:51:01.960 --> 0:51:05.040
<v Speaker 1>a car show up right now. Uh, and it happens.

0:51:05.120 --> 0:51:09.640
<v Speaker 1>That's very convenient. And there is a if you you know,

0:51:09.640 --> 0:51:11.120
<v Speaker 1>we talked a little bit about D and D. If

0:51:11.160 --> 0:51:13.839
<v Speaker 1>you ever played um that goes all the way back

0:51:13.880 --> 0:51:17.120
<v Speaker 1>into the old bulletin board systems days of Space Empire

0:51:17.480 --> 0:51:20.959
<v Speaker 1>or anything like that, there is a skill called Star

0:51:21.120 --> 0:51:24.880
<v Speaker 1>Lanes by Jo Jaquinta that basically allows you to

0:51:25.040 --> 0:51:27.440
<v Speaker 1>do that and you can play It's a multi player

0:51:27.560 --> 0:51:31.799
<v Speaker 1>online Space Empire game through Alexa, and that is one

0:51:31.840 --> 0:51:34.040
<v Speaker 1>of my other favorites. I would highly recommend checking that

0:51:34.080 --> 0:51:36.480
<v Speaker 1>out if you've ever played any of those games in

0:51:36.480 --> 0:51:38.760
<v Speaker 1>the past. That could be quite addicting. That's really cool,

0:51:38.920 --> 0:51:42.600
<v Speaker 1>I mean, and it really does speak to the potential

0:51:42.719 --> 0:51:47.160
<v Speaker 1>for things that we can't even necessarily anticipate right now

0:51:47.400 --> 0:51:50.440
<v Speaker 1>that could end up being either uh, it could end

0:51:50.520 --> 0:51:52.880
<v Speaker 1>up being something where you know, people talk about for

0:51:52.880 --> 0:51:54.919
<v Speaker 1>a little while like, oh, that's really clever. It's really

0:51:54.920 --> 0:51:57.400
<v Speaker 1>neat use of the technology, or it could truly be

0:51:57.480 --> 0:52:01.239
<v Speaker 1>transformative to the point where we didn't think about this

0:52:01.360 --> 0:52:03.480
<v Speaker 1>and now we can't think of what what life would

0:52:03.480 --> 0:52:05.840
<v Speaker 1>be without it. I mean, that's the that's the cool

0:52:06.360 --> 0:52:09.240
<v Speaker 1>promise of this kind of tech is that it stands

0:52:09.280 --> 0:52:13.880
<v Speaker 1>to be really disruptive, uh, for a type of technology

0:52:13.920 --> 0:52:17.760
<v Speaker 1>that's been fairly set in its ways for the last

0:52:17.840 --> 0:52:21.359
<v Speaker 1>several decades. And um, I love seeing this. I love

0:52:21.360 --> 0:52:24.239
<v Speaker 1>the discussions about machine learning and artificial intelligence. I love

0:52:24.320 --> 0:52:27.560
<v Speaker 1>the discussions about natural language and the challenges that we

0:52:27.600 --> 0:52:30.360
<v Speaker 1>face when we try to create interfaces that can accept

0:52:30.440 --> 0:52:33.560
<v Speaker 1>natural language as an input. Guys, I have to thank

0:52:33.560 --> 0:52:36.520
<v Speaker 1>you so much for joining me on tech Stuff, Dave,

0:52:36.719 --> 0:52:39.880
<v Speaker 1>Thank you, Josh, thank you. I really hope that you

0:52:40.000 --> 0:52:43.399
<v Speaker 1>enjoyed your time here on on David. I know you're

0:52:43.400 --> 0:52:47.400
<v Speaker 1>not actually in the studio, but I'll pick up my

0:52:47.560 --> 0:52:50.560
<v Speaker 1>laptop and I'll just show you around in a minute.

0:52:50.600 --> 0:52:53.799
<v Speaker 1>Thank you. Yeah. I really enjoyed being here today and

0:52:53.840 --> 0:52:56.000
<v Speaker 1>having this conversation. Thank you so much for having me

0:52:56.040 --> 0:52:58.160
<v Speaker 1>on the show. Absolutely yeah, thanks for having us, Jonathan,

0:52:58.200 --> 0:53:02.000
<v Speaker 1>It was a really stimulating conversation. Uh. Yeah. And guys, like

0:53:02.040 --> 0:53:04.960
<v Speaker 1>I said, if you want to learn more about developing

0:53:05.000 --> 0:53:06.960
<v Speaker 1>for Alexa, or you just want to kind of get

0:53:07.000 --> 0:53:10.279
<v Speaker 1>a better idea of what's going on when you are

0:53:10.480 --> 0:53:14.359
<v Speaker 1>using a device or a service with Alexa incorporated into

0:53:14.400 --> 0:53:17.440
<v Speaker 1>it and you're wondering, well, what's actually happening? Those like

0:53:17.480 --> 0:53:20.879
<v Speaker 1>I said, those blog posts are really accessible. I've read

0:53:21.480 --> 0:53:24.320
<v Speaker 1>a lot of developer blogs over the last ten years,

0:53:24.640 --> 0:53:27.239
<v Speaker 1>and they are written in a way that is much

0:53:27.440 --> 0:53:31.560
<v Speaker 1>easier to understand, even if you're coming in from a

0:53:31.640 --> 0:53:35.200
<v Speaker 1>purely just just an area of curiosity, much easier to

0:53:35.280 --> 0:53:37.480
<v Speaker 1>understand than some of the other ones I've encountered.

0:53:38.320 --> 0:53:42.719
<v Speaker 1>That makes me feel good. And I not that you

0:53:42.760 --> 0:53:46.000
<v Speaker 1>know it's I am. I am a budding uh podcaster,

0:53:46.120 --> 0:53:49.560
<v Speaker 1>but just last week I did actually launch a podcast

0:53:49.600 --> 0:53:53.400
<v Speaker 1>along the same lines of just talking about Alexa, you know,

0:53:53.440 --> 0:53:56.840
<v Speaker 1>similar to the blog post learning Um. The first episode

0:53:56.880 --> 0:53:59.200
<v Speaker 1>aired with Charlie Kindel, who runs our smart home,

0:53:59.239 --> 0:54:02.120
<v Speaker 1>so it's me asking him, you know, what that team

0:54:02.440 --> 0:54:04.640
<v Speaker 1>set out to accomplish. If you're, if you're interested,

0:54:04.680 --> 0:54:06.640
<v Speaker 1>if you're if your listeners want to check it out,

0:54:06.680 --> 0:54:10.600
<v Speaker 1>it's bit.ly Alexa Dev Chat, or just look for Alexa

0:54:10.640 --> 0:54:14.680
<v Speaker 1>Dev Chat on iTunes or Stitcher or TuneIn or

0:54:14.719 --> 0:54:18.120
<v Speaker 1>any of the other catchers. Yeah yeah, I should also

0:54:18.160 --> 0:54:22.120
<v Speaker 1>mention the video series for learning Alexa Skills Kit development

0:54:22.840 --> 0:54:26.920
<v Speaker 1>just rolled out. At least the first two videos have

0:54:27.080 --> 0:54:29.520
<v Speaker 1>rolled out, and um I believe Amazon is planning on

0:54:29.640 --> 0:54:33.000
<v Speaker 1>rolling all of them out in the next few days. Um,

0:54:33.040 --> 0:54:34.960
<v Speaker 1>so those are going to be available online. If you

0:54:34.960 --> 0:54:39.000
<v Speaker 1>search for, you know, Big Nerd Ranch Amazon training videos,

0:54:39.000 --> 0:54:41.480
<v Speaker 1>you should you should be able to find it that way.

0:54:41.480 --> 0:54:45.120
<v Speaker 1>Fantastic guys. Thank you again and listeners out there. If

0:54:45.120 --> 0:54:46.440
<v Speaker 1>you want to get in touch with me, you have

0:54:46.440 --> 0:54:48.919
<v Speaker 1>a suggestion for a future episode, you've got a follow-up question,

0:54:48.960 --> 0:54:51.360
<v Speaker 1>You want me to either run down Dave or Josh

0:54:51.360 --> 0:54:53.879
<v Speaker 1>and ask them, or you know, just want to say hi.

0:54:54.400 --> 0:54:57.479
<v Speaker 1>You can email me. The address is tech stuff at how stuff

0:54:57.480 --> 0:54:59.760
<v Speaker 1>works dot com or drop me a line on Twitter

0:54:59.880 --> 0:55:02.880
<v Speaker 1>or Facebook. The handle for both of those is tech

0:55:02.880 --> 0:55:06.120
<v Speaker 1>stuff HSW, and I'll talk to you again

0:55:06.840 --> 0:55:14.960
<v Speaker 1>really soon. For more on this and thousands of other topics,

0:55:15.200 --> 0:55:26.840
<v Speaker 1>visit how stuff works dot com.