Join the VFX community by becoming a b&a Patreon...and get bonus content!
Performance capture performed simultaneously above and below water, a new muscle strain-based facial system that moves away from blend shapes, and new R&D on a simulation framework to solve multiple types of water were among the big developments for the film. Plus, the VFX team answer a major VFX question about a key scene.
When James Cameron’s Avatar was released in 2009, it was a showcase of several technological and artistic breakthroughs across performance capture, virtual cameras and native stereo photography. Then there were the several leaps forward that visual effects studio Wētā FX (then Weta Digital) implemented when taking a ‘template’ from the edited performances and translating performance capture to CG characters, while also realizing detailed facial rigging and animation, texturing, rendering, and compositing.
Jump to 2022 and the release of Avatar: The Way of Water, where Cameron’s Lightstorm Entertainment and Wētā FX have again raised the bar in visual effects–and not just in terms of performance capture and characters, but also one of the film’s biggest flexes: digital water.
Indeed, this new film would bring further developments to real-time performance capture, such as new additions to the motion capture suits and HMCs, a robotic eyeline system, live-action simulcam, and real-time depth compositing.
Not to mention enabling the underwater capture of actors in the massive Manhattan Beach Studios tank that was 120 feet long, 60 feet wide and 30 feet deep – this even allowed for capture when the actors went from above to below the water, and vice versa. Then there was, of course, the use of 3D, high dynamic range and high frame rate.
Wētā FX, too, reached new heights for The Way of Water in many different areas. The first was in facial rigging and animation, where the studio moved away from blend shapes to a new muscle strains system (with built-in neural network) that could be driven by both performance capture and of course by animators.
Then for the many different types of CG water featured in the film–waves, reefs, underwater currents, bubbles, foam, on characters et cetera–Wētā invested deeply into specific R&D on fluid solvers within their in-house simulation framework Loki, alongside a multitude of other tools, including the proprietary physically-based renderer Manuka.
Here, befores & afters got to sit down to discuss the latest advancements with the key VFX personnel from Wētā FX on The Way of Water: senior visual effects supervisor Joe Letteri, Wētā FX senior visual effects supervisor Eric Saindon, and Wētā FX senior animation supervisor Daniel Barrett (Richard Baneham was also a member of this principal VFX team as executive producer/Lightstorm’s visual effects supervisor and virtual second unit director).
We also highlight, in the article, some published research from SIGGRAPH on Wētā FX’s facial pipeline and water simulation R&D. And, the final question below puts to rest one of the great VFX questions that emanated from the trailer for The Way of Water…
New on-set: capture, real-time depth compositing and robotic eyelines
b&a: Joe, what were the new things you needed to solve for on-set performance capture for this film, especially with water?
Joe Letteri: Well, the capture itself was actually handled by the team at Lightstorm, rather than us–that’s where Jim did his performance capture, with that team using a big tank with a wave mover. Their approach was to build a motion capture volume and sink it into the tank, then build another volume that was above the tank, because we needed action in and out of the water. Characters bounce up and jump into the water, so the two had to work together. They’re similar in idea, but in practice they’re different.
Back on Apes, we switched from using optical light to using infrared light because it works in a lot of situations where optical light has problems. You don’t get stray reflections and things like that. However, you can’t use infrared underwater because the red wavelengths get absorbed within a few feet, so that doesn’t work.
It was Ryan Champney, part of the Giant team at Lightstorm, who came up with a super blue, almost ultraviolet, light that was used for that. It’s super blue underwater and infrared above the water. We then put the two volumes together and realised we could actually get decent data by tracking it that way. That was the big win, because at some point we thought, ‘Well, this could turn out to just be really good reference.’ But in fact it turned out to be pretty usable data, right Dan? We were able to use a lot of it pretty cleanly.
Dan Barrett: We were, yeah. Obviously, it was fantastic to have buy-in from the cast. First of all, they learned how to stay underwater comfortably–and they could stay underwater for a long time. Then once they were comfortable, they gave us all these great performances down there, which was pretty amazing.
Joe Letteri: So, we’d have actors swimming underwater. Now, obviously humans can’t go as fast as the actual Metkayina who are adapted for underwater, but you still get correct swimming motion and buoyancy. And you get a performance that is only possible when you’re submerged in water, as opposed to trying to do it dry for wet.
Then, the animation team would take over and say, ‘Okay, well, I see the swim, but this needs to be a power stroke that’s propelled by a tail.’ So the team would add those extra motions and extend it, building a proper character performance from that basis.
b&a: Eric, what were the things done on set that might have been new on the film in terms of say body and facial movement?
Eric Saindon: As far as the on-set facial capture goes, we didn’t do a lot different other than shifting to two cameras to get better capture information. It gives you a lot more shape in the face and better information on the motion. The detail on the lips, too, things like that.
The other big tech differences on set were things like the real-time depth compositing, which allowed us to layer things properly in camera. Previously we would’ve put either A over B, or B over A, but if Quaritch walked in front of Spider, say, and then behind Spider, we wouldn’t have been able to get that previously.
Now with the real-time depth compositing, you can actually place the characters properly in space, and get the proper composition. Because of the height difference, if Quaritch [Stephen Lang in Avatar form] was supposed to be behind Spider [Jack Champion, a human character], he’s so big that, composition-wise, it would’ve been a little wacky on set. You wouldn’t have known how big he was, and his proportions in relation to Spider would have been off. With our new system, Jim gets a really great idea of what the shots are going to be, how to compose them, and how to shoot them while on set.
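To illustrate the idea – a simplified sketch, not Wētā FX’s or Lightstorm’s actual system – instead of choosing a fixed A-over-B layer order, a depth composite compares depth at every pixel and keeps whichever element is closer to camera, so a character can pass both in front of and behind another within the same shot:

```python
import numpy as np

def depth_composite(rgb_a, depth_a, rgb_b, depth_b):
    """Per-pixel depth composite: at each pixel, keep whichever element
    is closer to camera, instead of a fixed A-over-B layer order."""
    a_in_front = depth_a < depth_b          # smaller depth = closer to camera
    mask = a_in_front[..., np.newaxis]      # broadcast the mask over RGB channels
    return np.where(mask, rgb_a, rgb_b)

# Toy 1x2-pixel example: element A is in front on the left pixel only.
rgb_a   = np.array([[[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]])   # red layer
rgb_b   = np.array([[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]])   # blue layer
depth_a = np.array([[1.0, 5.0]])
depth_b = np.array([[2.0, 3.0]])

out = depth_composite(rgb_a, depth_a, rgb_b, depth_b)
# left pixel comes from A (closer), right pixel from B
```

With per-pixel depth for the live plate and the CG character, the same comparison lets one element cross behind and in front of the other over the course of a shot, which a single fixed layer order cannot do.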
That ‘depth comp’ was used in combination with the new eyeline system, which is kind of stolen from sports games–the camera is on a wire like the ones you see running over an NFL game, say. We had a setup like that, which allowed us to put a video monitor floating in space at the right location for Quaritch’s head when Spider is interacting with him. It showed Stephen Lang’s capture performance and moved in sync with the CG character’s movements in the scene, which gave Jack Champion an eyeline for where to look, as well as timing and a performance to play off. This in turn gave us a better performance and allowed us to use our new facial system and our animation in a great way.
b&a: I was going to bring up the integration of say, live-action Spider with CG Avatars, and the Na’vi and other characters. I just thought that was seamless. Can you talk more about that compositing and integration?
Eric Saindon: Well, I think it all starts at the very beginning. So many times you shoot a character on a greenscreen and you end up shooting roughly what you think that character’s going to do. Composition-wise, you might shoot it a little bit weird, then you have to put it on a card and make it fit with the performance you want.
Here, everything was done with performance capture in almost a theatrical setting, with all of the actors together going through an entire scene without any cameras–all the actors knew the process. Then when it came to live action, Jack understood what the scene was because he’s performed it already.
It also meant that Jim knew what he was going to get. He could see the other performers acting with Jack, as CG characters with Spider together. It allowed for a much better composition and timing for the interaction between the characters. It meant we could get a great idea of what the shot was going to be, or what plate was going to work with the others. It just set us down a great path.
From emotionally plausible to anatomically plausible
b&a: Wētā FX has done so much in the field of digital humans and characters with its facial system–what was the new approach you took here?
Joe Letteri: For me, it came out of the system we’ve been using ever since Gollum. We were just banging our heads up against the limitations of it, so I had an idea for how to solve it, but Dan actually had to deal with it.
Dan Barrett: We’ve been using our FACS system for years. It wasn’t a muscle-based system; it was basically a blend shape system based on the surface of the face. It was more about decomposed expressions and how they combined when the animator used the sliders. You can do really great emotional performances with that, but you can quite easily get off-model. You can quite easily combine things that shouldn’t be combined, and then all of a sudden you can’t quite recognize your character and you wonder what’s going on, so you have to go back in and work out what’s happened. Sometimes you’d even have to rebuild it.
I think in the past we’ve been good at that emotionally plausible work, but we really wanted to be much more anatomically plausible. This new system uses muscle strains rather than just surface faces. We essentially measured the muscle strains from the actor’s face by reconstructing their faces, which could then be applied to the characters.
All of these shapes, every single movement, were based on a data set that had been captured from the actors’ faces. From the stereo cameras–the face cam–we could get fantastic solves from the performers that as an animation tool remained plausible.
The way that we interacted with it, we could pull points on the face, so if you wanted someone to smile, you pulled those corners back, and it would allow that to happen, but it wouldn’t allow it to go too far. Maybe if you pushed the cheek up, it could go a bit further, because that’s a fuller smile. So it was both an incredibly powerful solving tool to get those performances from the day, while being a great tool for animators as it kept us in line.
b&a: And then on top of that, I feel like Wētā FX has been doing amazing things with some deep learning and machine learning techniques on top of all that facial as well–was that the case here?
Joe Letteri: We started to use it where it was appropriate. The face was a really interesting example, because as Dan said, we wanted to understand the muscles but you can’t dissect the face live while a performance is happening. So the only thing we could do was try to infer it. In trying to understand the structures of the muscles, I realized that if you describe that linkage properly, you’re describing a very similar setup to how you would build a neural network. So I thought, well, rather than try to have a neural network guess using deep learning–which is what you would normally do–why don’t we just tell the neural network what it’s going to do? That gave us a more direct way of approaching it.
The old system is a blend shape system. It’s very linear, so if you want to have an eye blink, for example, you actually have intermediate poses within the system. ‘0.2’ will have the eye partially shut, ‘0.5’ will be more shut. So you go through these steps to approach a rotation. What you can do with the neural networks is encode the rotations directly. You can just say ‘eye open, eye close’ and all the in-betweens get handled by the neural network, not by the team building the puppet.
That’s incredibly important because most of the important things on the face happen rotationally. The muscles around your eyes and the muscles around your mouth are orbital muscles, and your jaw opens rotationally. So, it simplifies the concept of building it by pushing it into this neural network. You get nice smooth transitions.
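The geometric point Letteri is making can be shown with a toy example – illustrative only, nothing like Wētā FX’s actual network. Linearly blending surface positions, as a blend shape does, pulls the in-between point off the arc that a rotating eyelid actually follows, while interpolating the rotation itself keeps every in-between on the arc:

```python
import math

def lid_linear_blend(open_pos, closed_pos, t):
    """Blend-shape style: linearly interpolate the surface point itself.
    The in-between cuts across the chord, so the lid point collapses
    toward the pivot instead of following the anatomical arc."""
    return tuple(a + t * (b - a) for a, b in zip(open_pos, closed_pos))

def lid_rotation(radius, open_angle, closed_angle, t):
    """Rotation-aware: interpolate the angle, then evaluate the point.
    Every in-between stays on the arc (constant distance from pivot)."""
    angle = open_angle + t * (closed_angle - open_angle)
    return (radius * math.cos(angle), radius * math.sin(angle))

radius = 1.0
open_a, closed_a = math.radians(90), math.radians(0)   # lid up -> lid down
p_open = lid_rotation(radius, open_a, closed_a, 0.0)
p_closed = lid_rotation(radius, open_a, closed_a, 1.0)

halfway_linear = lid_linear_blend(p_open, p_closed, 0.5)
halfway_rot = lid_rotation(radius, open_a, closed_a, 0.5)

dist_linear = math.hypot(*halfway_linear)   # ~0.707: drifted inside the arc
dist_rot = math.hypot(*halfway_rot)         # 1.0: still on the arc
```

This is why a linear blend shape system needs hand-built intermediate poses to approximate a blink, whereas encoding the rotation directly makes all the in-betweens fall out for free.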
That helps, for example, when a lot of the takes from an actor get smashed together. Jim likes the first part here, the second part here, because he’s changed the timing of the cut and he wants the pacing to work differently. So now you’ve got a hard cut between the two. With the blend shape system, it is really a pain to deal with that! The eyes are going to be in a different position. One might have been looking right, or looking left, but it’s not the right expression. This system allows you to plausibly blend across those transitions, in a way that would feel natural for the specific actor. This allows the animators to focus on the timing of it all, rather than the mechanics.
You can read more about Wētā FX’s new APFS (Anatomically Plausible Facial System) in the SIGGRAPH Asia Technical Paper: ‘Animatomy: an Animator-centric, Anatomically Inspired System for 3D Facial Modeling, Animation and Transfer’
b&a: Obviously a huge thing Wētā FX had to solve for the film was water. You’ve got normal waves, crashing waves, underwater, of course. Characters going from above to below, below to above. Is there something you could say generally, Joe, about how you started approaching water?
Joe Letteri: Yeah, and we also had hair and costumes in and out of the water, these things all had to work together. It’s another one of those things that we started thinking about after we did the first film, just “How do you make this better?”
There are two tricky things about water. One is we all know what it looks like, in all its forms: splashes, bubbles, foam, anything you can do with water. But each of those things is a different form – a big swimming pool of water is different to an ocean of water, a rolling wave is different to a breaking wave, and a creature swimming through the water is different to a boat moving through the water, which is different to a creature jumping out of the water. They’re all different things.
We realized that we couldn’t really solve all that with one solver, because there are different states for each form. Bulk water reacts to air drag differently than small particles do, because you’re changing the shape and the scale. What we realized is that water exists in all these different states, so we wrote a suite of solvers in Loki, each one tuned for the optimal state.
What Loki does is essentially parse the simulation and assign the correct solver to the correct state, while allowing room to blend between the two. The artists have control over things like blend points and where they move those solvers. This allows us to go back and say, ‘Yes, now we can solve it in one pass.’ We’re running all the solvers in parallel, each one allocated its portion of it.
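As a rough sketch of the idea – a toy dispatcher, nothing like Loki’s real architecture, with the solver behaviors and blend weights invented for illustration – you can picture each region of the simulation tagged with a state, handled by the solver tuned for that state, and blended where two states meet:

```python
# Toy sketch (not Loki's actual API): tag each region of a simulation
# with a state, run the solver tuned for that state, and blend results
# where two states meet. The "solvers" are stand-in velocity updates.

SOLVERS = {
    "bulk":  lambda v: v * 0.98,   # stand-in: lightly damped bulk-water update
    "spray": lambda v: v * 0.80,   # stand-in: heavily air-dragged droplets
}

def step(regions):
    """regions: list of (state, velocity, blend_weight_toward_spray)."""
    out = []
    for state, v, w in regions:
        if state == "blend":
            # Artist-controllable blend point between the two solvers.
            v_bulk = SOLVERS["bulk"](v)
            v_spray = SOLVERS["spray"](v)
            out.append((1 - w) * v_bulk + w * v_spray)
        else:
            out.append(SOLVERS[state](v))
    return out

velocities = step([("bulk", 10.0, 0.0), ("blend", 10.0, 0.5), ("spray", 10.0, 1.0)])
# bulk region barely damped, spray region heavily damped, blend region in between
```

In a real framework each region would be a spatial partition of the domain and each solver a full fluid integrator, which is also what makes the per-region work easy to farm out across machines.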
That was the approach we decided to take. I don’t know how you could have done it any other way. Plus, it had the advantage that the big solves could be broken out onto multiple machines in parallel. We could do solves all across the water. For example, when we had a bunch of boats in the water, we could do high-res solves over and around the boats and blend them into the bigger tank. That was why we built Loki a few years ago, to get this film going.
You can read more about Loki in the SIGGRAPH 2022 Technical Paper: ‘Loki: a unified multiphysics simulation framework for production’.
You can also read some previous research published at SIGGRAPH 2019 on: ‘A practical guide to thin film and drips simulation’.
b&a: Eric, what kind of conversations were there internally, but also with Jim, about the ‘look’ of underwater?
Eric Saindon: Well, Jim is a bit of an expert on water and underwater photography, especially. The way you look at the creatures underwater, you look at a Tulkun and you think ‘whale’, but a Tulkun is really five times the size of a whale. So when you get underwater you want the particulate, because you want to feel the medium, especially when you’re watching it in stereo. You want to feel that water around you, you don’t want them to look like they’re floating in air, so you have to add particulate. But, you don’t want to overdo it because then in reality you wouldn’t see more than five meters, and you want to be able to feel the characters and understand the distance.
So, we had to play with lots of different things, depending on the scene. You want to have the particulate, and you want to be able to see a little bit of the reds underwater. They drop off so quickly that, in reality, everything would be blue in every shot because of the scale of everything. The audience expects to see a certain color, and the audience is so much smarter these days at understanding shots and understanding CG that if you fake it too much, you’ll lose them. If you go too far into the reality of, ‘Everything would be blue, you wouldn’t ever see anything,’ then it takes the audience out of the film a little, so it’s always a fine balance. Jim was really good at knowing that balance, and guiding us along for the ‘feel’ he wanted for shots.
b&a: Dan, in terms of creatures in the water, I particularly loved the skimwings. I just wanted to ask you about the animation challenges for something like that which are both in and on the water.
Dan Barrett: They were fun creatures. We looked at flying fish predominantly, for their wing and ground effects and for their flight. Jim was quite clear on what he wanted their wake to look like, based on reference that we’d seen in flying fish. Water resistance was really important for us to get right.
That was probably even more of an issue with the boats. It was something that we knew we were going to have to be on top of. We knew that if you’re animating a boat, all of a sudden you’ve got a terrain that is irregular, for starters, so your boat has to travel in an irregular way. But it’s also a terrain that can very easily change; there might be a direction for a wave phase to change, and all of a sudden our work needs to be redone. So we developed systems that meant a boat could adhere to the surface. If you were driving slowly over waves, it would just stick to the waves, and if you started speeding up, then you’d start getting the kind of air that you would expect if you were going at a greater speed.
And then, water is roughly 800 times denser than air, so if you do hit a wave with that boat and it starts flying, it’s got gravity acting upon it in a medium that’s much less dense than the water it then hits, so you need to know how that’s going to decelerate. You need to calculate buoyancy.
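The physics Barrett describes can be sketched with a toy vertical-motion integrator – illustrative only, with invented coefficients, and nothing like Wētā FX’s coupled solvers. Gravity acts throughout, while buoyancy and far stronger drag switch on once the boat drops below the surface:

```python
# Illustrative only (not a production solver): a boat's vertical motion
# under gravity, with buoyancy and much stronger drag below the surface.
# All coefficients are assumed values chosen for the demonstration.

G = -9.81            # gravity (m/s^2)
DRAG_AIR = 0.02      # per-second drag coefficient in air (assumed)
DRAG_WATER = 4.0     # water resists motion far more strongly (assumed)
BUOYANCY = 25.0      # net upward acceleration when submerged (assumed)

def simulate_splashdown(height, vz, dt=0.01, steps=300):
    """Euler-integrate height/velocity; height 0.0 is the water surface.
    Returns the height trajectory over time."""
    traj = []
    for _ in range(steps):
        submerged = height < 0.0
        accel = G + (BUOYANCY if submerged else 0.0)
        drag = DRAG_WATER if submerged else DRAG_AIR
        vz += (accel - drag * vz) * dt
        height += vz * dt
        traj.append(height)
    return traj

traj = simulate_splashdown(height=2.0, vz=0.0)
# The boat free-falls, plunges below the surface, decelerates sharply
# under water drag, and buoyancy pushes it back up toward the surface.
```

Even this crude model shows why animators need fast blocking feedback before the full coupled simulation runs: the deceleration at the surface, not the fall, dominates the motion.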
So even before we got into the coupling of the water and object, simulated against each other, we had to be able to block these things quickly. We had to be able to iterate fast as well, and we had to make sure that we weren’t doing what animators are prone to do, which is crash spectacularly into water. Our simulations are so true to the real world that if you suddenly plow a boat deep into the water, the solver is going to think that the mass is far, far greater than your boat actually is, and there’ll be water hitting the lens from a hundred meters away.
So we developed these tools knowing that that was coming. As an animation team, we were more closely entwined with the FX team than we’ve ever been on any project before. We worked essentially in a parallel workflow, where we would do our bit with our simulation on our animation tools, then they’d do their bit. There were times when perhaps you’d have a boat that’s quite near another boat, and all of a sudden you’re creating wakes, so once that’s there, it starts affecting the boats beside them. We’d be in a circular and coupled workflow, as well as a simulation paradigm.
b&a: One of my favorite shots or sequences is when the kids are first hesitantly jumping into the water to follow the other clan. It goes from above water to below water. How was that accomplished?
Joe Letteri: The performance capture tank was key to that, because we had the reference of them jumping in the water. So you’ve got that slowdown when they hit the water, you’ve got their proper behavior when you see them under the water, including the way they were trying to reorient themselves. I think that was pretty critical.
The other thing that made it really believable was the work we did on air bubbles. Air bubbles are much harder than we thought going into it. You’ve got cavitation, as well as all the air that gets entrained in the hair and costume and gets dragged down, before being released again and recombining into these blobby shapes.
They’re very familiar shapes. Again, we know what those look like. You know who really knows what those look like? Jim. We spent a long time trying to get that feeling of the flow of the bubbles. It’s not something that our software particularly wanted to do, right out of the box, so we had to rewrite a lot of the code to understand that entrained air solution because you’re inverting it. You’re doing bubbles inside of the water, not the other way around.
It’s similar to what Dan was talking about when a boat hits the water, but you’ve got the other problem where bubbles are rising. They’re floating up through the water, but as soon as they hit the air, technically they’re supersonic. It’s really hard to simulate that state change, so we had to write something to handle that boundary and get the correct bubbling on the water. That also drove into the foam–what we call foam–which is, close-up, a big collection of bubbles. One of the hardest things we had to do was that little bit of lapping water that creates those bubbles along the edge of a surface.
You can read more about some of Wētā FX’s research into bubbles in these presentations:
SIGGRAPH 2020 Talk: ‘Underwater bubbles and coupling’.
SIGGRAPH 2022 Technical Paper: ‘Guided bubbles and wet foam for realistic whitewater simulation’.
b&a: Well, that leads into another favorite shot, which is where the family is re-grouping on a bit of a rocky outcrop. They’re coming out of the water to rest on this rock. And they’re wet, the hair’s wet. There’s Spider there as well. Eric, I just thought this shot works so well because, well, I wasn’t clear if you actually went and shot that maybe on some sort of rock set piece or out on some bay.
Eric Saindon: The fact that you don’t know works perfectly for us. We actually had a wave pool that we shot Jack in – it was just a little rock and Jack in the wave pool. He climbed up out of the water and sat on that rock. We used Jack and a lot, but not all, of the practical water, because we needed to connect that water with the background water – the integration of the two waters together. Then there was the integration of all the rocks and the other characters.
That is a great scene, it’s obviously an important family moment in the film. The performances around that whole scene are spectacular. For the water integration and everything, there really is a little bit of everything there. Because of the way the system was built, and because we knew what Jim wanted with the cadence of the waves, we were able to connect the CG water and the live action water together to drive that sequence and really make it work.
Joe Letteri: A lot of effort went into that integration, too. You’ve got this water surface that’s continuous, from close-up to camera and farther out. Because we were shooting it in native stereo, there was no place to hide that. Remarkably, our depth compositing system helped us there – it did a pretty good job of extracting the water surface. It had no right to work that well on that surface, but it did! And that really helped us get that integration.
Eric Saindon: Yeah, we were able to create geometry from the water surface itself, which allowed us to then use that to help drive the FX side, because we knew where the blend needed to be and how to make them work together.
Real or CG?
b&a: Finally, there was a lot of online speculation after people saw that shot of the hands tightening the leather straps on the creature on the water surface, and people wondering whether it was practical or CG. To settle the debate, can you tell me if that was real or CG?
Eric Saindon: The shot in question was both live action and CG. The props department built an Ilu saddle and strap for Kevin Dorman to sit on in a small pool on stage. Kevin’s hand and forearm were painted by Sarah Rubano using reference of Jake’s arm from the CG model. Jim Cameron was then able to get the performance he wanted for the wrapping of the strap around Jake’s hand and interacting with the water. Once we got the plates at Wētā FX, we matchmoved the motion and used CG for the straps above Jake’s wrists. We used real water over the saddle and around the hands and fingers. CG water was used to extend the plate and to get the interaction of Jake’s body in the water.