From on-set motion capture to final performance. An excerpt from issue #47 of befores & afters magazine.
In The Fantastic Four: First Steps, Ben Grimm—The Thing—is given a rocky exterior and super-strength as a result of his cosmic ray exposure. Previous incarnations of the character were largely realized with prosthetics and make-up effects, but in this case, The Thing was almost entirely CG, based on the performance of Ebon Moss-Bachrach.
“Ebon’s performance was always going to be key for Ben,” states production visual effects supervisor Scott Stokdyk. “So we captured his movement with a stereo Technoprops head-mounted camera setup and an Xsens inertial capture suit. The actor also often wore a fractal tracking suit. For scenes where Ebon interacted with other actors, we had him wear some volume-adding pieces, e.g. shoulder pads or a prosthetic hand, but mostly we just let him give a natural performance.”
On set, the filmmakers did employ a stand-in Thing in full costume—one of over a dozen outfits created for the character. This included a prosthetic head, which served as lighting reference and represented the volume of space The Thing would occupy. “In some cases,” notes Stokdyk, “we were able to use the costume in-camera while replacing the head and/or hands.”
For the body animation of The Thing, Stokdyk explains that “the performance had to be modified to convey a sense of mass, and to adjust proportions of The Thing versus Ebon. Fortunately, we made the eye level the same between Ebon and The Thing, to avoid eye-line issues. For the most part, we were slowing down or reducing motions of The Thing compared to Ebon’s performance.”
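As a rough illustration of that kind of retarget (the function names and numbers below are invented, not production values), a captured body curve can be damped in both amplitude and tempo, while the matched eye level means no vertical offset is needed at all:

```python
import numpy as np

def retarget_body_curve(times, values, neutral, amplitude=0.8, tempo=0.9):
    """Damp a captured joint curve for a heavier character.

    times:     keyframe times in seconds
    values:    captured joint angles (degrees) at those times
    neutral:   the joint's rest angle
    amplitude: < 1.0 reduces how far the motion travels
    tempo:     < 1.0 slows the motion down (stretches the timing)
    """
    values = neutral + amplitude * (np.asarray(values) - neutral)  # reduce motion
    times = np.asarray(times) / tempo                              # slow it down
    return times, values

# Matching eye level between actor and character sidesteps eye-line fixes:
# with eye heights equal, only the proportions below the head need remapping.
actor_eye_height = 1.70   # metres, illustrative only
thing_eye_height = 1.70   # built to match, per the production's approach
root_offset = thing_eye_height - actor_eye_height  # zero by construction
```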

“For facial animation,” continues Stokdyk, “we didn’t want to visibly squash and stretch the rocks, so the teams at all the companies had to implement tools to manage rock movement. ILM spent a lot of time at the start of development on range-of-motion tests to work out issues, and they dialed in how much ‘grout’ was between the rocks as well as how to rotate/move the rocks to overcome visual problems.”
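The article doesn’t detail those rock-management tools, but a common way to keep rocks rigid over a deforming face is to fit a best-fit rigid transform to each rock’s patch of skin, using the Kabsch algorithm, and let the ‘grout’ absorb the residual stretch. A minimal sketch of that fit:

```python
import numpy as np

def rigid_fit(rest_pts, deformed_pts):
    """Best-fit rotation R and translation t mapping rest_pts onto
    deformed_pts (Kabsch algorithm), so a rock can follow the skin
    without squashing or stretching. Both inputs are (N, 3) arrays."""
    c0 = rest_pts.mean(axis=0)
    c1 = deformed_pts.mean(axis=0)
    H = (rest_pts - c0).T @ (deformed_pts - c1)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c1 - R @ c0
    return R, t

# Per frame: sample the deforming face mesh under each rock's footprint,
# fit R and t, and move the rock rigidly; any residual deformation is
# carried by the narrow 'grout' strips between the rocks.
```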
Designs for The Thing were led by Marvel head of visual development Ryan Meinerding. ILM was then responsible for the asset build, including modeling, texturing, lookdev and facial shapes. The main VFX vendors—ILM, Imageworks, Framestore and Digital Domain—each handled final shots of The Thing. Leaping off from the Xsens and head-mounted camera setup for The Thing, the production would ultimately employ a novel approach to facial animation. Here, Digital Domain engaged its proprietary markerless facial capture system, Masquerade3, as a postvis pipeline shared with four other vendors on the film. For The Thing scenes, the studio assembled postvis with a Masquerade3 facial FACS solve to be provided to the finals vendors and to previs vendor The Third Floor. It offered a work-in-progress ‘preview’ look at the translation of the performance to the CG character.
“The idea here,” discusses Digital Domain visual effects supervisor Phil Cramer, “was to see if we could take Ebon’s performance through a super-fast turnaround into postvis. Our motion capture supervisor, Connor Murphy, was tasked with finding a way. We had just come up with Masquerade3, so he sent the footage to us. We did a test for him to see how long it would take and what quality we would reach if we processed everything. It was not just processing the selects that the client made, it was processing everything captured.”
Digital Domain’s Masquerade3 is an evolution of the studio’s Masquerade toolset, which takes facial motion capture data from head-mounted cameras and transforms it, including with machine learning techniques, into high-resolution 3D data for use with digital humans or CG characters. Masquerade3 allowed for markerless capture, which meant Moss-Bachrach did not need dots drawn on his face each day of the shoot. It also meant that the other vendors creating The Thing shots, which would normally rely on markers for their solves, could benefit directly from receiving Digital Domain’s Masquerade3 solve. Digital Domain took the same approach with facial capture for the Silver Surfer and Galactus, providing facial capture solves to other vendors.
During a shoot day, Imaginarium Studios collected motion capture data. Digital Domain would then batch-process the facial solve data overnight. “It was somewhat petrifying for me to agree to give that data over blindly to the other vendors,” admits Cramer. “I didn’t want to bombard them with awful data. But it ended up being really good. Our facial rigging supervisor, Rickey Cloudsdale, absolutely nailed this. Initially, we had to calibrate the cameras to make sure there was consistency. From that point, we used the same training set that we had for it. We even did some reshoots where Ebon now had a beard, and it was no problem.”
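Cramer doesn’t spell out the batch machinery, but conceptually the overnight pass amounts to running the same calibrated solve over every take from the day rather than just editorial selects. A hypothetical sketch of that loop (the solver interface and folder layout are assumptions):

```python
from pathlib import Path

def nightly_batch(capture_root: Path, solver, out_root: Path):
    """Run the facial solve over everything captured that day,
    not just editorial selects (all names here are illustrative)."""
    for take in sorted(capture_root.glob("*/hmc/*.take")):
        solved = solver.solve(take)          # calibrated, same training set
        out = out_root / take.parent.parent.name / (take.stem + ".facs")
        out.parent.mkdir(parents=True, exist_ok=True)
        solved.save(out)                     # FACS curves for all vendors
```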
“Connor would then do what we called the ‘assembly’ step,” adds Cramer. “It was like a layout stage to gather the face and body capture data, merge it, and put in some rough camera tracks and line-up frames. That would be the package that would go to The Third Floor for their postvis. Later, the same package, or an update to that package, would go to the finals vendors. It was really helpful as a layout step that was taken out of everyone’s pipeline and done at the production level. What it meant was, every vendor would start with the same data.”
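One way to picture that production-level deliverable is as a per-shot bundle; the field names below are assumptions, but they track what Cramer describes (body capture, facial solve, rough camera tracks and line-up frames):

```python
from dataclasses import dataclass, field

@dataclass
class AssemblyPackage:
    """One shot's production-level layout bundle (illustrative fields)."""
    shot: str
    body_capture: str          # solved Xsens body animation file
    facial_solve: str          # Masquerade3 FACS solve file
    camera_track: str          # rough camera solve for layout
    lineup_frames: list[str] = field(default_factory=list)  # plate stills

    def merge_note(self) -> str:
        # The same bundle goes to postvis and, later, to the finals vendors,
        # so every studio starts from identical data.
        return f"{self.shot}: body + face merged, ready for postvis/finals"
```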
Although a number of vendors realized shots featuring The Thing, ILM was tasked with the initial build. “It had to look absolutely like real rock,” remarks ILM visual effects supervisor Daniele Bigi. “We actually received a package at ILM of many different reference rocks. These were used on set and shot together with the HDRI as a reference for the light interaction and for the texture. But the main thing I felt we had to get right was the intersection of the rocks on the body.”
Meinerding’s concepts depicted The Thing with only small gaps between the rocks, and a distinctive surface underneath them—something ILM internally described as ‘mortar’. “At the end of the day,” says Bigi, “that mortar was like tissue that kept the rocks together. But with the small gaps, we actually started doing very early test animations, even before finishing the model, to see where the rocks would go when they moved. I asked my team to do several tests. For example, I asked, what if those stretchy areas were happening outside the camera frustum, on the side away from camera? It was as if we were creating gaps, and we were squashing elements that were not visible to camera. Could that be a solution? We did use that approach sometimes in the end.”
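The frustum idea Bigi describes reduces to a per-rock visibility test: deformation is only permitted on rocks the camera cannot see. A minimal sketch, assuming a simple cone approximation of a pinhole camera’s frustum:

```python
import numpy as np

def in_frustum(point, cam_pos, cam_dir, fov_deg):
    """True if a world-space point lies inside a symmetric cone
    approximating the camera frustum. cam_dir must be normalized;
    a pinhole camera is assumed."""
    v = point - cam_pos
    v = v / np.linalg.norm(v)
    return np.dot(v, cam_dir) > np.cos(np.radians(fov_deg) / 2.0)

def squash_weight(rock_center, cam_pos, cam_dir, fov_deg=50.0):
    """Allow full squash only on rocks the camera cannot see."""
    return 0.0 if in_frustum(rock_center, cam_pos, cam_dir, fov_deg) else 1.0
```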

“The other approach we thought about was,” continues Bigi, “what if we are sometimes stretching the rocks a tiny bit, but we are keeping the angles between the edges completely correct? This way, you don’t see the skinning intervening and deforming the entire rock, but you have an inner part of the surface that can actually collapse a bit, and then all the angles will remain exactly the same. We also sometimes embraced the intersections, allowing the rocks to go on top of each other. So, even though there might be some intersections, when we ran the simulations, you’d see the rocks getting closer and closer to each other, with one of them going above the other, so that it looked like the rocks were colliding and riding over without truly penetrating each other.”
On The Thing’s body, ILM also ran rigid body simulations to help ensure the movement did not appear too smooth, as Bigi elaborates. “I thought we needed some sort of friction. If you imagine these rocks are there and they’re moving and they’re bumping into each other, they cannot always move gently in and out. I wanted to add something else. And so, in Houdini we sometimes added a simulation where some of the rocks were actually colliding.”
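ILM did this in Houdini; purely as an illustration of the principle, a crude sphere-proxy pass can push overlapping rocks apart so that neighbours read as jostling rather than gliding:

```python
import numpy as np

def separate_rocks(centers, radii, iterations=4):
    """Crude rigid-contact pass over sphere proxies for the rocks:
    push overlapping pairs apart along their contact normal so the
    motion reads as rocks bumping, not skin sliding.
    centers is an (n, 3) array; radii is a length-n sequence."""
    centers = centers.copy()
    n = len(centers)
    for _ in range(iterations):
        for i in range(n):
            for j in range(i + 1, n):
                d = centers[j] - centers[i]
                dist = np.linalg.norm(d)
                overlap = radii[i] + radii[j] - dist
                if overlap > 0.0 and dist > 1e-9:
                    push = 0.5 * overlap * (d / dist)
                    centers[i] -= push   # split the correction evenly
                    centers[j] += push
    return centers
```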
Of course, in many of The Thing’s scenes, he is wearing either a Fantastic Four ‘tech suit’ or other clothing. This meant ILM and the other vendors had to deal with cloth sims on top of his rocky surface. Bigi and ILM initially took a physically plausible approach to these cloth sims. “I was very curious to see what would happen if we did it correctly, that is, if we kept the geometry of the rocks underneath and we ran a cloth simulation. Well, the result was awkward; it was very visible. The tech suit required simulating quite a thick material, and the thick fabric was smoothing out the collision and creating some weird bumps. So, what we ended up doing was taking the geometry of the rock underneath, creating a much, much smoother version, and running the collision off that. It wasn’t technically accurate, but visually it was more pleasing. The only thing we kept very precise was the collisions around the sleeves.”
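The smoothed collider ILM describes can be approximated with a few rounds of Laplacian smoothing over the rock geometry, with the cloth then colliding against the relaxed surface rather than every crevice. A sketch with simple uniform weights:

```python
import numpy as np

def laplacian_smooth(verts, neighbors, rounds=10, strength=0.5):
    """Relax a mesh toward its neighbour averages to build a smooth
    cloth-collision proxy. Uses uniform weights; 'neighbors' maps each
    vertex index to a list of adjacent vertex indices."""
    v = verts.copy()
    for _ in range(rounds):
        avg = np.array([v[nbrs].mean(axis=0) for nbrs in neighbors])
        v = v + strength * (avg - v)
    return v

# The cloth solver collides against laplacian_smooth(rock_verts, adjacency);
# the detailed rocks are kept only for render, echoing ILM's approach of
# trading physical accuracy for a visually pleasing drape.
```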
ILM then took the aforementioned body capture data, and the facial capture data orchestrated by Digital Domain, and brought it into the studio’s animation pipeline. Some bespoke FACS setups were added to the mix, while ILM also sought to solve animation issues specific to the character. “For example,” mentions Bigi, “Ebon is an amazing actor, very expressive, and there would sometimes be a lot of motion from his brow and his forehead. But with The Thing, the forehead is made of four rocks. We couldn’t translate and apply the same movement to those rocks. We therefore had to figure out how much we really needed to move this area, or any area of the face, to get the right look. We might move the eyebrow in a convincing, natural way, but if you then applied that to a long bar of rock brow, it would look somewhat cartoony. It became distracting and was creating quite a lot of shadow.”
“So,” adds Bigi, “the real work and the creative part was taking the data, absolutely respecting the timing, but finding the right look. If he was moving his cheek, some of the rocks around the cheek were absolutely moving at the right timing, but the amplitude of the motion was completely different.”
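That principle of respecting the timing while reworking the amplitude maps naturally onto FACS-style animation curves: the keys stay where they are, and only their values are scaled per facial region. A hedged sketch (the region gains are invented, not production values):

```python
# Per-region gains are illustrative only.
REGION_GAIN = {"brow": 0.35, "cheek": 0.6, "mouth": 0.8}

def rescale_facs_curve(keys, region):
    """Scale FACS key values for a rocky face while preserving timing.

    keys: list of (time, value) pairs from the actor's solve.
    Times pass through untouched; only amplitudes change, so the
    beat of the performance survives the new anatomy."""
    gain = REGION_GAIN.get(region, 1.0)
    return [(t, v * gain) for t, v in keys]
```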
One element of Moss-Bachrach’s performance that ILM always retained was the actor’s eyes, notes Bigi. “We took the scans of Ebon and we literally split the scan in half, moved it exactly where his eyes were, and we mimicked the curvature of the eyelid exactly with the rocks. Every time he was blinking, the curvature between these rocks was absolutely identical to Ebon’s.”
To further match the look of the actor, ILM referenced the nasolabial folds that ran around Moss-Bachrach’s cheeks. “When it was possible,” says Bigi, “we tried to line up or change the orientation of the rocks to create gaps between them that corresponded to the lines of Ebon. Sometimes it was possible to translate, and sometimes, because of the design, it was impossible to align. But it’s something that we considered from the get-go. Once you add all of these things, then you start to get the result that you are after. It’s almost unconscious, but you start to compare Ebon’s performance with The Thing and you see that there is something in it that is really coming from the performance, even though it’s a completely different geometry and anatomy.”
As noted, other vendors including Digital Domain and Framestore were also responsible for shots involving The Thing. Each studio had to build a complex facial rig to translate the mocap reference onto a skeleton formed of rocks rather than bones and muscles. From Framestore visual effects supervisor Rob Allman’s perspective, his team worked extensively to preserve Moss-Bachrach’s performance and deal with rock intersections. “We had to work out, what is he going to look like when his mouth is as mobile as Ebon’s mouth? If it looked too fleshy, then it began to look like we’d just got bits of polystyrene and put them onto a mask. But if you stiffen it up too much, he becomes a Muppet with only a flapping mouth.”
Framestore began with a typical FACS setup, adding cluster controls around the eyes and mouth and implementing controls for individual rocks so that any intersections could be fixed, and so that more nuanced expressions could be animated. “We had to find ways to convey expression without breaking what he looked like,” states Allman.
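One way to picture that layering: each rock’s final transform composes the FACS-driven pose with a per-rock correction, so intersections can be fixed without touching the underlying solve. A minimal sketch using 4x4 matrices:

```python
import numpy as np

def final_rock_xform(facs_xform, animator_offset=None):
    """Compose the rig-driven pose for one rock with an optional
    per-rock correction layer (both are 4x4 transform matrices)."""
    if animator_offset is None:
        animator_offset = np.eye(4)
    return facs_xform @ animator_offset  # offset applied in rock-local space

# Cluster controls drive groups of facs_xforms around the eyes and mouth;
# animator_offset is the per-rock knob used to clear intersections or to
# push a nuance the solve alone couldn't reach.
```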