XR for Business

Shaping the Digital World with Our Hands, with Clay AIR’s Varag Gharibjanian



We're used to navigating our computers with keyboards, mice, and maybe trackpads. But those inputs were designed for the desktop; they're clunky for XR interfaces. That's why we need gesture controls ASAP, according to today's guest, Clay AIR's Varag Gharibjanian.

Alan: Hey, everyone, Alan Smithson here. Today we're speaking with Varag Gharibjanian, the
chief revenue officer at Clay AIR, a software company shaping the
future of how we interact with the digital world, using natural
gesture recognition. We're going to find out how Clay will bring our
real hands into the virtual world. Coming up next, on the XR for
Business podcast.

Varag, welcome to the show, my friend.

Varag: Hey, Alan. Glad to be here.

Alan: It's my absolute pleasure to have you on the show. I know you guys are working on some cutting-edge stuff, so why don't I not ruin it and just let you tell us: what is Clay AIR?

Varag: So Clay is a software company specializing in hand tracking and gesture recognition, mostly in the AR and VR space. We're also tackling a couple of other industries -- automotive, for one. And our third product category we call Clay Control, which covers all the devices that can use gesture interaction at a distance.

Alan: Are you doing this from single cameras, or is this from infrared cameras, or a combination of everything?

Varag: Yes, so Clay's-- we're hardware agnostic, so it'll work across all those types you just said. It could be one camera, two cameras, or more, and all different types: RGB cameras like you'll find on everyday smartphones, cameras embedded in AR and VR devices, monochrome ones, time-of-flight ones. We're pure software, and we've built compatibility with most of those camera types now, which gives us a lot of flexibility and is really useful.

Alan: So I'm going to be able to look at watches on my wrist in AR, right? Like, I'm going to be able to hold my hand up and see what the newest, latest, greatest watch is?

Varag: It's actually pretty cool that you say that, because that's one of the use cases that comes inbound to us often. It hasn't happened yet, but companies are definitely brainstorming around how you track the hands with just a smartphone and overlay something on them.

Alan: We actually did it. We did a project just using Google's hand tracking library. We managed to make the watch sit on the wrist, but it was kind of glitchy. It would sit weird. And yeah, it was-- it was not great, but we made it work; it just wasn't sellable.

Varag: Yeah.

Alan: So this is really foundational software. And I know you guys are working with some of the larger manufacturers. You want to talk about that -- or can you talk about that -- and then what that might look like?

Varag: Yeah, I can speak a little bit about that. So we feel -- like you said -- this is
software that really needs to be optimized for the hardware that it's
working on. The deeper it is in the stack, the better performance
you'll get, and the better synergies you'll get with all the other
technologies that are working on these devices. So that's why when I
joined the company, really, I made the focus to get as deep into the
stack as possible. We looked at the market that time a couple of
years ago to look at who is really central to defining the reference
stack. What's going to most AR and VR devices? And to me, Qualcomm
made the most sense. So we spent a lot of time working with them. As
you know -- and some of our listeners might know -- they really do
define a lot of what goes into the guts of most AR and VR devices
today. And likely in the future, too. So we work closely with them.
What that means is that from a software architecture standpoint and a
hardware standpoint, we try to make our software as optimized as
possible for their reference designs. And as a result of that, any
OEMs that want to pick up a Qualcomm chip and all the technologies
around it, we're really well suited to fit along the side of all
those other technologies.

Alan: With Qualcomm's 835 -- or 845, whatever -- their new chips are kind of really powering the future devices. I
know pretty much all of the standalone VR headsets right now and most
of the AR glasses are running on the Qualcomm chips. So this kind of
opens up the world of spatial computing. And hand tracking needs to
be there; it's part and parcel. And now that Qualcomm's announced their new XR2 chip, I think this is going to really unlock it. It's basically 10x performance right across the board. So you guys are well-positioned for that.

Varag: Absolutely, yeah. And it's pretty cool. The best part of that partnership, in a sense, is not just the amazing people we get to work with, but also that for every chip that comes out, we get some of the first reference designs into our office and our labs. And it's pretty cool that we get to experiment with the latest and greatest machine learning models and try to get the most out of those chips. With every single chip that comes out -- which makes sense -- we get a little more accuracy and a few more points of interest at a lower power consumption on the hardware. So it's pretty cool to see that evolve. Even day to day, it moves more quickly than people think.

Alan: I think this whole technology stack of XR is moving. It's actually moving way, way faster than I had ever anticipated. I was looking at 2025 for ubiquitous AR glasses. But after seeing what came out of CES this year, and learning about this Qualcomm XR2 chip, you've now got AR glasses coming out en masse. At CES, I think there were 11 AR glasses that came out this year, all of them in the form factor of a pair of sunglasses.

Varag: What did you think of that, by the way? Because for me, when I saw that -- and I saw Nreal's glasses, which to me are the closest thing you've got to consumer AR -- I saw a lot of companies coming out kind of mirroring what they were doing. Slightly different form factors; they seemed like fast followers, in a way. To me that seemed like a good signal.

Alan: I think it's a great signal. I think what will happen is -- like with the VR market -- there were thousands of Chinese knockoffs of VR headsets, and all of them have gone out of business now. I think you're going to see this kind of flooding of the market with cheap AR glasses. And it will come down to things like embedding technologies like Clay into them, and having proper Qualcomm chipsets in there.
it really comes down to having the full tech stack to deliver on the
quality that people really want. So I think you're going to see lots
of incumbents, you're going to see a lot of people come in and try to
take over the market. But these are not easy problems to solve, as
you know. And then--

Varag: Especially because every maker has got its own hardware make-up, in a sense. And so it's really about bringing these technologies together in that specific device. I think what Qualcomm is doing, and companies like us working together, is making it more-- look, I'm really looking forward to-- and this is why I joined the company, in a sense. I want that iPhone moment within this industry, where the right technologies come together and a, quote unquote, dominant design is developed.

Alan: With Facebook now working on AR, and you've got Magic Leap, and Microsoft's HoloLens.

Varag: Yeah.

Alan: So they've kind of set the bar pretty high. The HoloLens was an amazing piece of tech, and now there's HoloLens 2. And, you know, Magic Leap-- people are going to expect that when they walk around, holograms stay put. They stay rock solid, steady, attached to the real world, which really gives you that kind of mixed reality experience. But it's not easy to do that. And I think a lot of these companies are like, "Here's a heads-up display, you can see things in mid-air." But they're not thinking about the tracking. They're not thinking about object recognition. They're not thinking of the full tech stack that's required for real, pervasive augmented reality in the world. Then you've got glasses like the North glasses, which just give you a little heads-up display, like your Apple Watch display. And I think those are going to have a great place, too. But let's get back to hand tracking, because this is a vital part. And I know Oculus Quest just introduced hand tracking. And I think the Quest -- I just read an article today -- the Quest accounted for over half of all the VR headsets sold in the world last year.

Varag: Wow.

Alan: Yeah, it is-- it's a game changer. And they introduced hand tracking. And as soon as we start to see apps that have that built in, that will be the new standard. That'll be the new normal. Just like standalone, no-wires, non-tethered VR is now the new normal. We're not going backwards. We're not going, "Hey, let me connect this to a big computer and cables and stuff." Nobody wants that anymore.

Varag: Right, exactly, exactly. And you know what? By the way, Alan, I think it's already becoming-- you're right, it's not the standard yet amongst users and consumers, but amongst the OEMs -- and that's who I speak with a lot -- it is becoming that. They want the same thing, ASAP.

Alan: What's the next step for hand tracking, then? You're able to track hands very precisely. Are things like mid-air haptics-- I guess Leap Motion would be a competitor to you guys, even though they're using a hardware solution to do that. But things like mid-air haptics, with Ultrahaptics -- or Ultraleap, now. These things are really foundational and interesting.

Varag: Yeah, I think there are still some things to be solved even on basic, marker-less, hardware-less, computer-vision-based hand tracking. Things like occlusion, field of view, and compute all need to get even better. So what I mean by that is, if you take
your hands and you go out of the field of view and you quickly bring
them back into the field of view, what's that activation time look
like? How quickly do you go from not recognizing that hand to recognizing it? And in particular, a lot of people are now using machine learning models -- optimizing those models, making sure there are fewer false positives, handling other hands in the field of view -- there's still some of that to be solved. And then
from a compute standpoint, just making sure it all works and it's not
burning the battery. There's still some things to be done there. And
that's an ongoing challenge, especially if you're not like a huge
multi-billion dollar company like Facebook, and you don't have all
the resources they have to stitch that all together in the best way.
So companies like ours are getting better and better at doing that
and offering that. Once that's solved -- or in tandem -- you're right, there are some cool things you might be able to do with haptics. Because one thing you notice is that as our technology gets better and better, and you really feel like your hands are there, it becomes confusing when you actually go to touch something in the virtual world -- especially in VR -- and you don't get any of that sensation back. It's really confusing to your brain.
So having something like that -- even if it's a small form of
feedback -- could be really, really helpful. Although -- at least for consumer VR -- I'm against adding any extra hardware. It's hard enough for the regular consumer, even the early majority of consumers, to adopt VR. We don't want to make them wear anything more on their hands; even controllers, to me, can sometimes be excessive or too much of a requirement.
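The re-acquisition problem Varag describes -- how long the tracker takes to pick a hand back up after it re-enters the field of view -- can be sketched as a simple per-frame measurement. Everything here (the function name, the boolean frame logs) is an illustrative assumption, not Clay AIR's actual pipeline:

```python
# Hypothetical sketch: measuring hand re-acquisition latency - the number
# of frames between a hand re-entering the field of view and the tracker
# reporting it again. The per-frame boolean logs are made-up data.
from typing import List, Optional

def reacquisition_frames(in_view: List[bool], detected: List[bool]) -> Optional[int]:
    """Frames from the hand re-entering view until first detection, or None."""
    for i, visible in enumerate(in_view):
        if visible:
            # Hand is back in view: count frames until the tracker sees it.
            for j in range(i, len(detected)):
                if detected[j]:
                    return j - i
            return None  # hand visible but never re-acquired
    return None  # hand never re-entered the view

# Hand re-enters on frame 2; tracker catches it on frame 4 -> 2-frame lag.
in_view  = [False, False, True, True, True, True]
detected = [False, False, False, False, True, True]
print(reacquisition_frames(in_view, detected))  # 2
```

At a typical 60 fps camera, each frame of lag is about 17 ms, which is why "activation time" is worth measuring in frames.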

Alan: It's interesting you say that, because I actually got to try something at CES last year, and it was these little sensors that went over my fingers, almost like a pulse oximeter.

Varag: Okay.

Alan: And they just gave slight haptic feedback on the tips of your fingers. And it turned out that was actually quite convincing. We didn't need a whole haptic glove or anything; it was just a very small sensor that clipped onto your finger. And I'm assuming -- mine was wired -- but I'm assuming it could be Bluetooth or whatever. But it was enough sensation that I actually felt it was real. I reached out and touched a fire, and it buzzed. Then I jumped back. [chuckles] I felt like an idiot. Freaked me out.

Varag: It's pretty cool, yeah. And I even met this company not too long ago -- unfortunately I'm blanking on the name right now, but it was at VRX, last time I was there -- and they were working on something really interesting with haptics. They were changing what that haptic feedback felt like, both in frequency and intensity, based on what you were touching within VR. Which I thought was really cool. I'm looking at a desk right now; touching a desk in real life feels very different from touching this cup right here. So how do you bring those differences into VR too, along with hand tracking? That's super, super exciting. And I think it definitely makes sense now -- as you know -- to do all this in the enterprise space. But rolling that into consumer, where prices need to come down and things need to be even easier to use, that's yet to be seen. Maybe there needs to be something embedded in the headset, with ultrasound emitters around it or something like that. That's pretty far out there, though, I think.

Alan: Yeah, I don't know. There are going to be so many use cases for this. I'm just thinking of the different ways hand tracking can be deployed -- use cases from e-commerce, being able to see products in your hands, but also just interacting in the real world in a natural way. One of the things that always stuck with me is that when HoloLens 1 came out, when you showed a kid something, they'd just pick it up and do it. But when you showed anybody over about 30 that clicking gesture the HoloLens required -- you had to reach out and do that thumb-point-click thing -- nobody could figure it out. Everybody just reached out and tried to push things; they just reached out naturally to touch them. And I believe HoloLens 2 has addressed that. But there are going to be so many different devices out there. And I think you guys have made a really good call in putting it at the chipset level and really making it available as that reference design for everybody.

Varag: Yeah, it's paid off so far, in the sense that when we go to OEMs or companies that are really building this hardware -- or even those that are buying it -- they love the fact that we understand the architecture; the software itself understands the rest of the architecture. And we've got a version of something we can put in there that, most importantly, interoperates with everything else. Because there are just so many other functions you need to support -- sometimes we're literally using the same cameras that are being used for other functions, too. So we need to respect those as well and make sure they work harmoniously together, to enable those use cases.

Alan: Indeed. So speaking of use cases, let's dig into some of them. Say I've got a pair of AR glasses or VR goggles on. What are some of the use cases where I'd want gesture recognition? I'm just looking at the page on your site about gesture recognition, where you have flip, grab, up, down, swipe, pinch, victory, point -- all these things. But what are some practical use cases that people can wrap their heads around?

Varag: First thing I'd like to say is, I think of hand tracking and gestures and some of the other input methods as the keyboard and mouse of the PC era -- basic navigation around the device. And I think it's not going to be hand tracking alone; it will be all of those inputs in some multi-modal form together. So that's number one: just getting around the device easily. When you first put that headset on, you don't necessarily want to have to reach for controllers or something else every time you want to navigate the device. So I think that's first and foremost. And then there's in-app control as well. Obviously for gaming -- this is more on the consumer side -- you can be shooting out of your hand; we've done applications like that before. But especially on the enterprise side, where things are being used a lot today, I see things like communicating with the device, or doing grab gestures as shortcuts in the device, making a lot of sense. Sometimes it's just capturing a screenshot of what's going on -- you might do one of those gestures to take a screenshot, or take an image of what you're looking at in AR. But what comes up a lot -- for gestures, at least -- is shortcuts. To the extent that there's something you want to do very quickly in-device, that you do repeatedly and often, using a gesture to do it can make a lot of sense. Especially when some of the other inputs are just not convenient -- with voice, for example, if there's a noisy background, you might want to just use your hands instead. And in the enterprise use case, if you've got gloves on or something, you can't necessarily touch the device itself, so you might want to use a gesture. Those shortcuts can be things like muting and unmuting, changing volume, changing apps, waking the device up or putting it to sleep. So on and so forth.
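The shortcut idea Varag describes -- one frequent device action bound to one gesture -- boils down to a small dispatch table. The gesture names and device functions below are purely illustrative assumptions, not Clay AIR's actual SDK:

```python
# Hypothetical sketch: dispatching recognized gestures to device shortcuts.
# Gesture names ("grab", "pinch") and actions are illustrative placeholders.
from typing import Callable, Dict

def take_screenshot() -> str:
    return "screenshot saved"

def mute_device() -> str:
    return "muted"

def wake_device() -> str:
    return "awake"

# One shortcut per gesture, as in the enterprise examples above.
SHORTCUTS: Dict[str, Callable[[], str]] = {
    "grab": take_screenshot,
    "pinch": mute_device,
    "swipe_up": wake_device,
}

def on_gesture(gesture: str) -> str:
    """Run the shortcut bound to a recognized gesture; ignore unknown ones."""
    action = SHORTCUTS.get(gesture)
    return action() if action else "no-op"

print(on_gesture("grab"))  # screenshot saved
print(on_gesture("wave"))  # no-op
```

The useful property of the table is exactly the one discussed: each entry replaces a multi-step menu navigation with a single hand motion.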

Alan: Gestures as shortcuts are going to be amazing, because -- like you said -- multi-modal is going to be really important: being able to use voice, being able to use gaze, understanding what you're looking at. I think once we have eye tracking plus gesture recognition plus voice, it's just going to be "Show me that thing," and it knows you're pointing, it knows what you're looking at, it knows what you said, and it gives you what you want.
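The "show me that thing" idea Alan describes -- a vague voice command disambiguated by gaze or pointing -- can be sketched as a tiny fusion rule. The function and target names here are assumptions for illustration only:

```python
# Hypothetical sketch of multi-modal intent resolution: a deictic voice
# command ("show me that") is resolved by whatever the gaze or pointing
# ray currently hits. Names are illustrative, not any real XR API.
from typing import Optional

def resolve_command(utterance: str, gaze_target: Optional[str]) -> str:
    """Combine speech with gaze: deictic words defer to what's looked at."""
    if "that" in utterance or "this" in utterance:
        # The words alone are ambiguous; gaze supplies the referent.
        return gaze_target if gaze_target else "ask user to clarify"
    return utterance  # fully specified command; no gaze needed

print(resolve_command("show me that", gaze_target="watch_model_x"))  # watch_model_x
print(resolve_command("open settings", gaze_target=None))            # open settings
```

Real systems weight each modality by confidence rather than using a hard rule, but the structure -- speech for the verb, gaze or hands for the noun -- is the same.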

Varag: And Alan, one thing I think about a lot is-- I like to think several years ahead; it's fun. When you think about all those things coming together, just imagine what you can observe about-- some people might say this is creepy, but what you can observe about what a user wants and what they're doing. Today we're looking at someone's behavior in an app, on a phone, or on their laptop, and you can track a lot through the onboard sensors. But just imagine what you could track about where someone's eyes are looking in a given scene, how long they're looking at a certain ad, or specifically their hands. We track at least 22 points of interest in some cases, and we can see where they are in 3D space, relative to all the other things in 3D space. And that, I think, is really interesting for capturing the intent of what a user is doing. Super exciting stuff. And how you can maybe monetize that data at some point is interesting too. But just the fact that you can understand what a user wants from how they're moving their hands around -- I think hands are a really natural, expressive signal of what a person is doing.
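The intent-capture idea Varag mentions -- 22 tracked hand keypoints positioned relative to other things in 3D space -- can be sketched naively as a nearest-object check on the index fingertip. The keypoint index, coordinates, and scene names are all illustrative assumptions:

```python
# Hypothetical sketch: a tracked hand as ~22 3D keypoints, plus a naive
# intent signal - which scene object is closest to the index fingertip.
import math
from typing import Dict, List, Tuple

Point3 = Tuple[float, float, float]

def nearest_object(fingertip: Point3, objects: Dict[str, Point3]) -> str:
    """Return the name of the scene object nearest the fingertip in 3D."""
    return min(objects, key=lambda name: math.dist(fingertip, objects[name]))

# 22 keypoints per hand; assume index 8 is the index fingertip (a common
# convention in hand-landmark models, assumed here for illustration).
hand: List[Point3] = [(0.0, 0.0, 0.0)] * 22
hand[8] = (0.30, 0.10, 0.50)

scene = {"ad_panel": (0.32, 0.12, 0.55), "menu": (1.0, 1.0, 1.0)}
print(nearest_object(hand[8], scene))  # ad_panel
```

Dwell time near an object -- how many consecutive frames the fingertip stays closest to it -- is the hand-tracking analogue of the gaze-on-ad metric mentioned above.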

Alan: Indeed. I wonder if you're going to have people flipping the bird, and can you edit that out, so it doesn't register it? [laughs]

Varag: [laughs] That could be added. Like, every time that's read--

Alan: "This user needs help." And support comes up. [laughs] "I see you're having troubles. Can I help you?"

Varag: Exactly.

Alan: Awesome. So let's get practical here. If you're going to be building this into all sorts of things -- not just VR and AR headsets, but automobiles and that sort of thing -- why would I want gesture or hand recognition in a car?

Varag: That's a good question. I'll make it broader than automotive: anywhere the rules of how you interact with a device are either being written for the first time or being rewritten, that's where gestures and hand tracking are interesting. In the automotive space, a lot of players are trying to figure out what the future looks like, especially with autonomous driving coming; I think that's part of what's driving it. It means passengers and drivers have an opportunity for a different experience of how they interact with the car, and that's where something like gestures comes in. Even before things are autonomous, actually -- one of our clients in that space is Renault, and I think one of the key drivers there is safety. The use case is drivers making gestures to control the infotainment system. Any function of the infotainment system that takes too many button clicks to get to -- again, you have a shortcut to the feature. Or the button you want to press is just too far away, and there's too much reach. That seems like a small thing, but really, when you're driving, milliseconds matter. It's to make things a little more convenient for the driver, so they can mute, unmute, answer a call, end a call, zoom in, zoom out, and so on. We can map gestures to the functions that make the most sense for that specific car. And it's not just for convenience but also for safety: the more we can reduce the cognitive load on drivers, the safer it is. That's the idea.

Alan: Yeah, I guess that makes sense. So what's next for you guys, then? You've partnered with Qualcomm. You're gonna be kind of at the root level of all of these new devices. What's next?

Varag: What's next is going to be continuing that work we're doing with Qualcomm, and really trying to onboard more and more of their customers. This year we've been focusing on bringing on Tier 1 companies that are building AR/VR devices. In almost every case, these device makers make little tweaks and changes to the reference design they get -- they might take different kinds of cameras -- so there's some work for Clay to onboard them in the right way and make sure the hand tracking works well with all the other functions. We'd like to own that process a little bit too, and get involved. So I think it'll be the year for us: our tech has never been as ready as it is today. What I mean by that is we're now interoperable with the cameras most available in the market, including the monochrome cameras used for six-degrees-of-freedom tracking. So the next steps will just be bringing on those big customers. I think the demand is going to be there this year more than ever. That's what I'm excited for, mainly because we have software that runs on the right hardware more than ever. And I think the fact that Oculus launched this as a feature proves that it's important, and that everyone else now wants it too. So that's what I'm excited about for AR and VR in 2020, for sure. And from a technology standpoint, we're just always-- that's never going to stop. The key thing we're going to be driving is more detailed and more accurate tracking, while monitoring how much we're really consuming of the CPU, the GPU, or the DSPs onboard, and optimizing like crazy, because we don't want to be--

Alan: You guys don't want to be the one that sucks up your battery.

Varag: Totally. Yeah. We want to be the one where it's like, "That works better than anyone else," and no one says you're too expensive.

Alan: I have to ask, because I'm on your new website -- it's not published yet, but it'll be out soon -- you have here, "scalability: one to four hands."

Varag: [laughs]

Alan: I'm not sure what that's all about, but alright!

Varag: [laughs] So maybe that applies to, like-- [laughs] maybe that applies to the other use cases. So we have three product categories; we've got Clay Reality for all the AR and VR stuff. But for one to four hands -- four hands could make sense in situations where you've got a really, really big screen at big shows, things like that. You might want to have multiple people, multiple hands.

Alan: I was also thinking of being able to see other people's hands come into the space -- in virtual space -- and hand them things.

Varag: I haven't seen that come up yet. But I could see it coming, for sure. So, yeah, absolutely.

Alan: Very interesting. What is one problem in the world that you want to see solved using XR technologies?

Varag: That's a good question. I think there's a lot that AR and VR can solve. What I'm personally most excited about -- this is less about Clay and more about me -- is the problem of distance. That's very general, but what got me super excited about XR in the first place was this idea that two people can be far away from one another. I know we can connect now through a lot of different mediums, but I think those mediums really only get us halfway there. And strangely, being halfway there makes us less likely to actually be there, in person, face-to-face. I really-- I actually despise that about social media and all the other technologies that help us connect today. I feel like people are now lazier than ever about really connecting. What I'm excited about is that AR and VR, I think, are the best candidate for a technology that can bring us 90 percent of the way there, maybe even close to 100. There are a lot of problems that get solved by being able to do that. One is just connecting better with people -- seeing a loved one more often and feeling like you're present with them; that's great. And if it's trying to do business with someone who's really far away, feeling present with them in a room -- like you can in VR -- is pretty amazing. So that's what I'd say.

Alan: That's a really good answer, and I think it's a great way to position XR: as a medium by which we can create a more global, unified world.

Varag: Absolutely. I mean, the closest thing I did to that-- I haven't done enough multiplayer or multi-user VR experiences lately, but a while back -- I guess maybe a year and a half ago -- I tried a company's software you may or may not have heard of, called VRChat.

Alan: Yeah, absolutely. I actually did a-- ooh, back maybe three years ago, I was in Gunther's Universe, which was one of the VRChat rooms, and I was interviewed there. It was super fun.

Varag: Yeah. And I just remember sitting there thinking-- I'm sure it's much better today than when I tried it -- and I was like, "Oh my God." I just remember thinking, "Wow, I really feel like I'm here with someone who's a stranger." And it was kind of strange, because it creeped me out a little bit, in the sense that I was like, "Wait, I'm in a room with this person, and I don't know them." And strangely enough, I didn't have all the information you get in person -- because I love networking, I love meeting strangers all the time -- but when you're in person, you get 100 percent of who they are. You see them, you interact with them, you shake their hand. In VR, at that time, you were only halfway there.

Alan: The first time I had that was with Altspace. I was lying in bed, and I put on Altspace. It was with the Gear VR.

Varag: Yeah.

Alan: And I was just walking around, kind of talking to myself, like, "What is this silly thing?" And somebody stood beside me and started saying, "Hey, it must be your first time in here." And I couldn't figure out that they were talking to me. I was like, "What?!"

Varag: That was your first time. That's an interesting note, for sure.

Alan: Yeah, and it's funny, because up until recently, Altspace really didn't improve. But I was in there the other day, and it actually looks like they've made some real improvements, which is fantastic to see. Everything's getting better.

Varag: Absolutely. I've got to check that out.

Alan: And then the new one that we really love is Meet In VR. It's a Danish company, and they've really nailed the interactions for corporate clients, for enterprise, because it allows you to be in a VR chatroom, but you have access to photos, you have access to 3D models, you can write on the walls, you can have conversations. It's-- I don't know. To me, it's the most comprehensive business tool for communication. It's really great.

Varag: That's a big one, because I'm on the business side of Clay, which means I travel around a lot. I actually like traveling; I don't mind it too much. But being able to do that pretty quickly in virtual reality would be amazing. I've got to try that out. It hasn't happened for me yet.

Alan: Indeed. Well, Varag, I want to thank you again for taking the time out of your busy schedule to join us on the show.

Varag: Thanks for having me, Alan. This was fun.

Alan: Absolutely. Where can people find out more information about Clay and the work you guys are doing?

Varag: They can go to www.clayair.io. They can reach me by email, too -- it's my first name, [email protected]. We have a new website coming soon, and we'll be announcing it on LinkedIn, social media, and all that good stuff. Alan, I love what you do in the industry. Please keep it up, and keep in touch.

Alan: Thank you very much. And you know what? I'm going to take this opportunity to tell people to subscribe to the podcast. I always forget this part, and people keep telling me, "You've got to tell them to subscribe." So subscribe to the podcast, and you'll get more XR for Business deliciousness, all the time.

Varag: [chuckles]

Alan: [chuckles]


XR for Business, by Alan Smithson from MetaVRse

4.5

12 ratings