Why Wētā FX changed its animation system and adopted new deep learning methodologies for ‘The Way of Water.’
The history of CG characters crafted over the years by Wētā FX is almost like a history of CG facial animation itself. Standouts such as Gollum, King Kong, the Na’vi, Caesar, Alita and Junior each evidenced significant developments in the studio’s workflow from facial capture to final animation.
However, much of Wētā FX’s approach for these characters had come to rely on FACS and blendshapes. While powerful, that methodology often required time-consuming manipulation of shapes and sliders to keep the result ‘true’ to the actor’s performance.
For the Na’vi and Avatars in James Cameron’s Avatar: The Way of Water, Wētā FX reimagined its approach to facial animation, moving away from blendshapes and instead incorporating muscle fibre curves that drive skin deformation. The curves closely mimic real facial muscle behavior. In addition, changes in muscle strain over time served as both the training data and the latent space for a neural network.
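As a purely illustrative sketch (not Wētā FX’s actual code), strain along a fibre curve can be expressed as the relative change in the curve’s arc length between a rest pose and a captured frame. The fibre coordinates and the `fibre_strain` helper below are hypothetical; per-frame values like this are only meant to suggest the kind of signal the system could learn from:

```python
import numpy as np

def fibre_strain(rest_points, deformed_points):
    """Strain of one muscle fibre curve: relative change in arc length.
    Each argument is an (N, 3) array of samples along the fibre polyline."""
    rest_len = np.linalg.norm(np.diff(rest_points, axis=0), axis=1).sum()
    cur_len = np.linalg.norm(np.diff(deformed_points, axis=0), axis=1).sum()
    return (cur_len - rest_len) / rest_len

# Hypothetical fibre at rest: a straight 3-point polyline of length 2.0.
rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
# The same fibre contracted to length 1.6 in a captured frame.
contracted = np.array([[0.0, 0.0, 0.0], [0.8, 0.0, 0.0], [1.6, 0.0, 0.0]])

s = fibre_strain(rest, contracted)  # negative strain = contraction
# Stacking strains for every fibre on every frame yields the time-varying
# feature vectors a learning system could train on.
```

The key point is that strain is a compact, anatomically meaningful scalar per fibre, which makes it a natural candidate for driving a learned model rather than raw vertex positions.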
Several steps also included ground-truth reference of performers from actor scans and capture data. The new system, called the Anatomically Plausible Facial System (APFS), produced results that were not only realistic but also plausible: the mesh stayed within the ‘expression manifold’ of the CG characters and the performers behind them. Yet these performances could still be tailored per-shot by animators where necessary. In the end, more than 15 characters and over 3000 facial performances were animated with the APFS on The Way of Water.
To get a sense of the changes to the facial system, and where machine learning techniques came in, befores & afters spoke to Wētā FX facial motion supervisor Stuart Adcock and pre-production supervisor Marco Revelant about the advent of APFS. This is an excerpt from issue #11 of befores & afters magazine.
‘What if we learn it?’
It was on the original Avatar and several surrounding projects that Wētā FX established the somewhat de facto standard of the FACS puppet driven from attributes or action units. “That means,” explains Stuart Adcock, “that you’re essentially playing with around 50 core shapes on a puppet for the animators to use as inputs. What happens is, those puppets get very complex because when those linear shapes combine with each other, they create forms that are not particularly accurate. The linear stack tends to be quite hard for us to manage, so we add a lot of corrective shapes to try and correct when two shapes pull together, and before long it starts to evolve into quite a tricky beast.”
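Adcock’s description of the linear stack and its correctives can be sketched in a few lines. This is a toy illustration, not the studio’s puppet rig; the shapes, weights, and the multiplicative corrective trigger are all assumptions for demonstration:

```python
import numpy as np

# Toy mesh: 4 vertices in 3D. A production face mesh has tens of thousands.
neutral = np.zeros((4, 3))

# Two hypothetical FACS-style shapes, stored as deltas from neutral.
shape_a = np.array([[0.0, 1.0, 0.0]] * 4)  # e.g. a brow raise
shape_b = np.array([[1.0, 0.0, 0.0]] * 4)  # e.g. a brow squeeze

# Corrective delta that fires only when A and B activate together, patching
# the inaccurate form their purely linear sum would otherwise produce.
corrective_ab = np.array([[0.0, 0.0, -0.5]] * 4)

def evaluate(w_a, w_b):
    """Linear blendshape stack with one combination corrective."""
    return (neutral
            + w_a * shape_a
            + w_b * shape_b
            + (w_a * w_b) * corrective_ab)  # kicks in on co-activation

solo = evaluate(1.0, 0.0)   # one shape alone: a clean linear offset
combo = evaluate(1.0, 1.0)  # together: the corrective fires as well
```

With ~50 core shapes, the number of possible pairwise (and higher-order) combinations needing correctives grows rapidly, which is exactly the “tricky beast” Adcock describes.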
“We’ve always invested a lot of time into our puppets, and we’ve always felt like we got a pretty good representation,” continues Adcock. “But it was not as streamlined or precise as we wanted for The Way of Water. We wanted to go that extra step, plus we felt like we’d squeezed the lemon dry on FACS puppets.”
Marco Revelant adds that one inherent challenge with the FACS approach was that it is based on research relating to humans and human expressions under stress. “It was more about why the face works, why you do an expression, not how you do an expression. The starting point for us was, ‘Okay, how do we go beyond this?’”
As it often does, Wētā FX turned to scientific research and reference to further understand the face and go beyond what they had already done with FACS. “We experimented a lot with physical simulation behavior, trying to basically create a muscle system, and trying to drive it in a forward manner, by animating muscles with muscles,” notes Revelant. “We actually went to a maxillofacial department at a university to try to understand how a face works. It turns out they don’t really have a clear idea how the face works. They know that there is a cause and effect, but they don’t know what actually makes the difference.”
That meant the idea of simulation ‘went out the window’. It was senior visual effects supervisor Joe Letteri who then suggested a different approach, relates Revelant. “Joe said, ‘What if we learn it?’ That’s when the ideas started coming. The problem was, what do we learn and how do we learn?”
Wētā FX had gone down the deep learning path before for Gemini Man, with a different solver. “On that show,” outlines Adcock, “we used a learning technique to learn marker data on an actor in terms of the markers and what FACS shapes should be driven by those marker configurations. It was learning the relationship between markers and shapes, but I think the same limitations still applied. Those shapes, in a number of combinations, would look off-model, a bit funky, and we’d have to put effort in to try and either rein the animation back again, try and teach animators to not lean on certain combinations, or to go back to the modelers and get combination shapes cleaned up for when the face looked a bit broken.”
For The Way of Water, Adcock says Letteri wanted to ensure that animators could not get the characters into poses that did not look anatomically correct. “He wanted to learn from big datasets on what the face is actually capable of doing, and then in any scenario, if an animator wants to drive it through direct manipulation or pull individual muscle movements, the output is something that we have already learned, that is, the output is something that we’ve actually verified and can understand.”
“In essence,” continues Adcock, “the output you get from driving the face is anatomically plausible. That’s the system’s name, APFS, Anatomically Plausible Facial System. The output should be representative of some of the input. And often, in facial animation, when we do use these FACS puppets, in a way we treat them a bit like a musical instrument. The individual shapes are like individual notes, and we learn to play the correct chords to play the music and to drive the face. Joe just wanted to make sure that we couldn’t accidentally play bad chords.”
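The ‘no bad chords’ idea, constraining arbitrary animator input so the output always lands on what the system has learned, can be illustrated with a deliberately simple stand-in: a fixed linear subspace playing the role of the learned expression manifold. A real system would use a neural network’s latent space rather than anything this simple; the basis and pose values here are hypothetical:

```python
import numpy as np

# Hypothetical stand-in for a learned expression manifold: a subspace
# spanned by two orthonormal directions in a 4D "pose" space.
basis = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])

def make_plausible(pose):
    """Project an arbitrary pose onto the learned subspace, so whatever
    the animator inputs, the output stays on the manifold."""
    coords = basis @ pose    # encode: coordinates within the subspace
    return basis.T @ coords  # decode: nearest point on the subspace

# An 'off-model' pose with components outside the learned space...
raw = np.array([0.5, 0.25, 0.9, -0.3])
safe = make_plausible(raw)
# ...comes back with the off-manifold components removed.
```

The encode/decode round trip is the essence of the constraint: inputs can be anything, but outputs are, by construction, poses the model has already seen and verified.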
Read more in issue #11.