A step-by-step breakdown of 4D scanning

Taking 3D photogrammetry scans to the next level

You might already be familiar with the now-common process of scanning actors in a photogrammetry booth in order to later build a digi-double for, say, stunt work, face replacements, or even a photoreal CG human for central performances.

Often, photogrammetry booths are used for ‘static’ 3D scans, that is, scanning actors in very neutral or specific poses (mostly of faces). More recently, various scanning companies have begun offering 4D scans, where the actor gives a performance and a 3D scan is crafted for each frame of video.

Pixel Light Effects, in Vancouver and Beijing, has been offering 3D scanning since 2014. Their work can be seen in Deadpool 2, Altered Carbon, The Wandering Earth, and Sonic. Recently, the studio started testing 4D capture. I saw these tests and asked if they might be able to break down the process for before & afters readers (and me!) in order to get a handle on how it all works.

This ‘Valentina 4D Demo’ breakdown utilizes a 4D scan test conducted in partnership between Pixel Light Effects & Reblika. Here, the facial performance was captured in ‘4D’ in much the same way as 3D facial blendshapes are captured. It is effectively 3D scanning over time; each frame is an OBJ model with textures.
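Because each captured frame ends up as an OBJ plus its texture, a 4D take is essentially a numbered file sequence. Here is a minimal sketch of how such a sequence might be paired up by frame number. The file naming (`valentina_0001.obj`, `valentina_0001.jpg`) and the helper names are hypothetical and only for illustration; they are not the studios' actual convention.

```python
import re
from collections import defaultdict

# Hypothetical naming: <name>_<frame>.<ext>, e.g. valentina_0001.obj
FRAME_RE = re.compile(
    r"^(?P<stem>.+?)[._](?P<frame>\d+)\.(?P<ext>obj|jpg|png|exr)$",
    re.IGNORECASE,
)

def group_by_frame(filenames):
    """Map frame number -> {'mesh': ..., 'texture': ...} for a file list."""
    frames = defaultdict(dict)
    for name in filenames:
        match = FRAME_RE.match(name)
        if match is None:
            continue  # ignore files that are not part of the sequence
        kind = "mesh" if match["ext"].lower() == "obj" else "texture"
        frames[int(match["frame"])][kind] = name
    return dict(sorted(frames.items()))

def incomplete_frames(frames):
    """Frame numbers that lack either a mesh or a texture."""
    return [n for n, parts in frames.items() if set(parts) != {"mesh", "texture"}]

files = [
    "valentina_0002.jpg",
    "valentina_0001.obj",
    "valentina_0001.jpg",
    "valentina_0002.obj",
    "valentina_0003.obj",  # texture missing for this frame
    "notes.txt",
]
sequence = group_by_frame(files)
print(sequence[1])
print(incomplete_frames(sequence))
```

A check like `incomplete_frames` is handy before any batch step, since a single dropped frame would break the per-frame pairing.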

Follow their steps below.

1. The scanning setup

Pixel Light Effects’ scanning booth uses 16 high-speed machine vision cameras, which are frame-accurately synchronized with over 100 LEDs. The LEDs flash at the same frame rate as the cameras, so they don’t look as bright to the actors, who can then still perform naturally.

The main difference between the new 4D system and Pixel Light Effects’ typical photogrammetry rig is that it captures the ‘in-betweens’. The final footage has realistic and smooth transitions between the shapes. It can be used as rigging reference as well as training data for face rigs in facial motion capture.

2. Capturing the actor

Inside the booth, the actor first goes through some of the typical FACS shapes for calibration, and then begins their performance. At 30 fps, up to 30 minutes of footage can be captured in one take, or 15 minutes at 60 fps. The takes often run anywhere between 30 seconds and 2 minutes.
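The two limits quoted above work out to the same number of frames, which suggests (though the studio does not say so) that the cap is a fixed frame budget rather than a time limit. A quick sketch of the arithmetic, with a hypothetical helper name:

```python
def frame_count(fps, seconds):
    """Number of frames recorded at a given frame rate over a duration."""
    return round(fps * seconds)

# Maximum take lengths quoted in the article
max_at_30 = frame_count(30, 30 * 60)  # 30 minutes at 30 fps
max_at_60 = frame_count(60, 15 * 60)  # 15 minutes at 60 fps
print(max_at_30, max_at_60)  # 54000 54000

# Typical takes: 30 seconds to 2 minutes, here at 30 fps
print(frame_count(30, 30), frame_count(30, 120))  # 900 3600
```

So even a short 30-second take means around 900 individual scans to process at 30 fps.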

Early on, Pixel Light Effects used non-polarized lighting, but found that the tip of the nose was often missing since it is a common hotspot for reflections. So all the LEDs, as well as the camera lenses, are now polarized in order to cancel the reflections.

3. Turning raw data into a mesh

The raw footage is massive, over 2 GB per second at 30 fps, before it is wrapped into a consistent topology. Effectively, image sequences are captured from the cameras, and RealityCapture from Capturing Reality is used to ‘batch process’ them into meshes.
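To get a feel for what that data rate means in practice, here is a rough back-of-the-envelope sketch. It treats the quoted 2 GB per second as a lower bound, assumes decimal units (1 GB = 1,000 MB), and assumes the data is spread evenly across the 16 cameras; none of those details are confirmed by the studio.

```python
GB_PER_SECOND = 2.0  # "over 2GB per second" at 30 fps, used as a lower bound
FPS = 30
CAMERAS = 16         # cameras in the booth

def take_size_gb(seconds, gb_per_second=GB_PER_SECOND):
    """Approximate raw size of a take, in GB, at a constant data rate."""
    return seconds * gb_per_second

# Data per captured moment (all cameras) and per single camera image
mb_per_frame_set = GB_PER_SECOND * 1000 / FPS
mb_per_camera_image = mb_per_frame_set / CAMERAS

print(f"{mb_per_frame_set:.1f} MB per frame across all cameras")
print(f"{mb_per_camera_image:.2f} MB per camera image")
print(f"{take_size_gb(30):.0f} GB for a 30-second take")
print(f"{take_size_gb(120):.0f} GB for a 2-minute take")
print(f"{take_size_gb(30 * 60) / 1000:.1f} TB for a full 30-minute take")
```

Under these assumptions, a typical 2-minute take is already at least 240 GB of raw images before any meshing begins.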

Reblika did the clean-up as well as the wrapping of the base mesh. They then completed further lookdev in Arnold, with the hair groomed using XGen.

The 4D mesh was tracked by Pixel Light Effects in Russian3DScanner’s Wrap3 and optimized with its OpticFlow function. The studio is currently testing Wrap4D, which is specifically designed for processing 4D captures. For this Valentina demo, Pixel Light Effects had to jump between Wrap3 and other software, although now Wrap4D would provide a one-stop solution.

4. Crafting the facial performance

One of the biggest challenges with 4D capture, according to the team behind this demo, is that eyes and lips can become deformed and projected incorrectly. Reblika came up with a solution for this: relax the entire face and project additional details on top. Correction of the eyelids was done in two passes.

Minimal rigging was done. Since the scan does not come with eyes and teeth data, Reblika did parent constraining and used the 4D data to drive the lower teeth movement. Once the jaw was positioned correctly, everything was able to follow.
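For readers unfamiliar with parent constraining, the idea is that the child (here, the lower teeth) keeps a fixed offset from the parent (the jaw), so wherever the jaw goes the teeth follow. The sketch below illustrates that general technique with plain 4x4 rigid transforms. It is not Reblika's actual setup: the function names, the single-axis jaw hinge, and all the numbers are assumptions made up for the example.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(m):
    """Invert a rigid transform (rotation plus translation, no scale)."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]
    trans = [m[i][3] for i in range(3)]
    inv_trans = [-sum(rot_t[i][k] * trans[k] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [inv_trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def hinge(angle_deg, tx, ty, tz):
    """Rigid transform: rotation about X (a crude jaw hinge) plus a translation."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[1.0, 0.0, 0.0, tx],
            [0.0, c, -s, ty],
            [0.0, s, c, tz],
            [0.0, 0.0, 0.0, 1.0]]

def constraint_offset(parent_rest, child_rest):
    """Child transform expressed in the parent's space at bind time."""
    return mat_mul(rigid_inverse(parent_rest), child_rest)

def follow_parent(parent_now, offset):
    """World transform of the child once the parent has moved."""
    return mat_mul(parent_now, offset)

# Made-up rest poses for the jaw and the lower teeth
jaw_rest = hinge(0.0, 0.0, -2.0, 1.0)
teeth_rest = hinge(0.0, 0.0, -1.5, 2.0)
offset = constraint_offset(jaw_rest, teeth_rest)

# A made-up "jaw open" pose for one frame, e.g. derived from the 4D data
jaw_open = hinge(12.0, 0.0, -2.2, 1.0)
teeth_open = follow_parent(jaw_open, offset)
print([round(row[3], 3) for row in teeth_open[:3]])  # new teeth position
```

In a per-frame setup, `jaw_open` would be recomputed for every frame of the capture and the teeth would simply inherit the motion through the stored offset.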

Additional tongue animation was also created by Reblika to fill in the void. By studying the original raw sequence in color, Reblika discovered that the lips were almost never fully closed. In order to make them deform more naturally, Reblika created correctives to simulate sticky lips. In real life, how much lips stick to each other depends on the amount of saliva and how much they compress.

Reblika sculpted additional secondary details around the eyes and lips to capture Valentina’s likeness. Tertiary maps from Texturing.xyz were then layered on to finalize the lookdev in Maya using Arnold.