Even with all
the advancements in CG animation, it still can't capture that
distinctly lifelike essence a real human exudes. But XR can capture
that essence -- volumetrically. Metastage CEO Christina Heller drops
by to discuss the process of translating the aura of a person into XR
space.
Alan: I have a really special
guest today; Christina Heller, the CEO of Metastage. Metastage is an
XR studio that puts real performances into AR and VR through
volumetric capture. Metastage is the first US partner for the
Microsoft Mixed Reality Capture Software, and their soundstage is
located in Culver City, California. Prior to Metastage, Christina
co-founded and led VR Playhouse. So, between Metastage and VR
Playhouse, she's helped produce over 80 immersive experiences. To
learn more about Christina Heller and Metastage, you can visit
metastage.com.
Welcome to the show, Christina.
Christina: Thank you so much for
having me.
Alan: It's my absolute pleasure.
We met, maybe three years ago? At VRTO?
Christina: Yes, that's correct.
Alan: Yeah, we got to try your
incredible experiences, mostly in the field of 360 video. And you've
kind of taken the leap to the next level of this stuff. So, talk to
us about Metastage.
Christina: Sure. As you said,
it's a company that specializes in volumetric capture. I think, in
the future, you'll see other things, but at the moment, we specialize
in volumetric capture. Specifically, using the Microsoft Mixed
Reality Capture system, which is an incredibly sophisticated way of
taking real people and authentic performances, and then bringing them
into full AR and VR experiences, where you can move around these
characters, and it's as if they are doing that action right in front
of you.
Alan: Let's just go back a
little bit. What is volumetric capture, for those who have no idea
what volumetric capture is?
Christina: Sure. For a long
time, if you wanted to put real people into AR/VR experiences, you
had basically two ways of doing it. You could either animate it; that
is, you would try to create -- using mo-cap and animation -- the most
lifelike recreation of a human character possible. Think, like, video
games; when you go play a video game and they've got a character
playing a scene out with you. If you wanted to put real people into
these XR experiences, that was the most common way to do it.
Then there was also volumetric capture,
which, for a long time, just wasn't quite -- I would say -- at the
technological sophistication people wanted in order to integrate it into
projects. Volumetric capture -- thanks to the Microsoft system, I
think -- is finally really ready to be used in a major way in all
these projects. And basically what it does is, we use 106 video
cameras, and we film a performance from every possible angle. So,
we're getting a ton of data. We use 53 RGB cameras and 53 infrared
cameras. The infrared is what we use to calculate the depth and the
volume of the person that's performing at the center of the stage.
The RGB cameras are what's capturing all the texture and visual data.
Then, we put that through the Microsoft
software, and on the other end of it you get a fully 3D asset that
really maintains the integrity and fidelity of the performance that
was captured on the stage. That was kind of the challenge with the
animated assets: they might get close, but they had that uncanny
valley thing going.
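(A minimal sketch, not from the interview: the geometry Christina describes -- infrared cameras measuring depth, RGB cameras supplying texture -- boils down to back-projecting each depth pixel through its camera's calibration into a shared stage coordinate frame. The function names and calibration inputs below, such as K and cam_to_world, are illustrative assumptions, not Microsoft's actual reconstruction pipeline, which additionally fuses the views into a tracked, textured mesh.)

```python
import numpy as np

def backproject_view(depth, rgb, K, cam_to_world):
    """Turn one camera pod's frame into a colored world-space point cloud.

    depth        : (H, W) array of distances from the infrared/depth camera
    rgb          : (H, W, 3) color image from the paired RGB camera
    K            : 3x3 intrinsic matrix (focal lengths, principal point)
    cam_to_world : 4x4 extrinsic pose from stage calibration
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]                   # per-pixel row/column indices
    z = depth.ravel()
    valid = z > 0                               # depth sensors report 0 on dropout
    # Pixel -> camera-space point, scaled by the measured depth.
    x = (u.ravel() - K[0, 2]) / K[0, 0] * z
    y = (v.ravel() - K[1, 2]) / K[1, 1] * z
    pts = np.stack([x, y, z, np.ones_like(z)])  # homogeneous camera-space points
    world = (cam_to_world @ pts)[:3].T          # into the shared stage frame
    return world[valid], rgb.reshape(-1, 3)[valid]

# Fusing every viewpoint covers the performer from all angles; surface
# reconstruction and mesh tracking would come after this step.
def fuse_stage(views):
    clouds = [backproject_view(*view) for view in views]
    return (np.concatenate([pts for pts, _ in clouds]),
            np.concatenate([col for _, col in clouds]))
```

In a rig like the one described, each of the 53 infrared/RGB camera pairs would contribute one such view per frame of the performance.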
Alan: Yeah, those are creepy.
Christina: Yeah. And so if
you're not familiar with the term "uncanny valley,"
basically with people and animals -- or like, dynamic, organic,
moving objects -- if you get it kind of close, but not fully there in
te