Understanding one of the most useful compositing techniques for quality paint and clean-up, in this tutorial from Escape Studios’ Allar Kaasik.
Part of a good compositor’s skillset is the ability to paint photoreal images. This might be a step in completing a clean-up shot, where you need to paint a clean-plate for projections, or you might be doing wire-removal frame by frame, while making sure that every patch looks neat and remains consistent throughout the shot.
When dealing with complex tasks like these, it helps to break your challenges down into smaller chunks, so that you can tackle them individually and gradually approach a good-looking final solution. Understanding the concept of spatial frequencies allows you to separate details of different sizes, retouch photos for beauty, and deal with the shading and texture of real-world footage.
Follow along with this tutorial from Escape Studios Senior VFX Lecturer Allar Kaasik.
Frequencies, Filtering and Signal Processing
The first term that we need to explain is “spatial frequencies”. In common language, a frequency describes how often something occurs per unit of time, but we can expand that idea and measure things using any unit. For example, I could count the number of computers per row in the classroom where I teach. You could count the number of toilet rolls per shopping cart in a supermarket, books on every shelf, flowers in every garden. When we speak about digital film, we can count the dark and the bright pixels in every frame.
For example, in the image below (Figure 1) we have a piece of greenscreen with a tracking marker on it. When we zoom into it (Figure 2) we can see that the picture is made out of individual pixels with different values, plotted here as an RGB graph using Frank Rueter’s “SliceTool” gizmo for Nuke.
If we zoom back out and draw a graph for a slice through the middle of the image (Figure 3) we can see that on the left the greenscreen is relatively even, with a brighter area in the middle where the green tracking marker sits, and on the right is an evenly bright white wall. Notice that the even-looking green still gives a slightly noisy signal, since there is a bit of texture in the greenscreen fabric and a little bit of noise in the camera’s sensor.
When talking about the details in the image, we can describe their size, or we can describe their frequency, using pixels to measure distance in space. The noise and fabric texture vary a lot over a small space (i.e. pixels next to each other differ from each other), so they are high-frequency. The tracking marker is not that small but also not very big, so you could call it mid-frequency. The overall difference between the greenscreen and the wall behind it spans a large area, i.e. it is low-frequency.
Now that we can measure the frequency, or at least understand what it means, we can begin to manipulate it. Filtering an image involves taking a pixel and a bunch of its neighbouring pixels to get a new value for that particular pixel. If we average together all the pixels in a certain neighbourhood, then we get a smoother-looking, blurrier image, which is a lower-frequency result. In Figure 4, we have blurred the image, which removed the high-frequency noise and gave us a smoother graph, and in Figure 5, we have blurred it further to make it even smoother.
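If you would like to see the averaging in numbers rather than in a graph, here is a minimal Python sketch that works on a single row of pixel values. The sample values and the box_blur and roughness helpers are made up for illustration (they are not part of Nuke or the SliceTool gizmo), and a plain box average stands in for whatever blur your software uses.

```python
def box_blur(signal, radius):
    """Average each sample with its neighbours within `radius` (edges are clamped)."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def roughness(signal):
    """Mean absolute difference between neighbouring samples.

    A crude, hypothetical stand-in for "how much high frequency is left".
    """
    return sum(abs(b - a) for a, b in zip(signal, signal[1:])) / (len(signal) - 1)


# Made-up slice: an even 0.4 "greenscreen" with alternating +/-0.02 "noise".
slice_values = [0.4 + (0.02 if i % 2 else -0.02) for i in range(40)]

slightly_blurred = box_blur(slice_values, 1)  # small neighbourhood
very_blurred = box_blur(slice_values, 4)      # larger neighbourhood

# The larger the neighbourhood, the smoother (lower frequency) the result.
print(roughness(slice_values), roughness(slightly_blurred), roughness(very_blurred))
```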
So far so good, but how does any of this knowledge about making images blurrier help us with getting photoreal results in digital painting?
Differences Between Frequencies, Laplacian
In the previous section, we saw that we can remove details (or frequencies) from images by blurring them. But can we also extract the details that the filtering removed? Turns out that we can! If we subtract two differently softened images from each other, what is left over must be the detail that the stronger blur removed. For example, if we subtract the very blurry (low-frequency) Figure 5 from the less blurry Figure 4 from earlier, we can see that we have extracted the tracking marker in Figure 6. When subtracting values from each other we get both positive and negative values as a result, so we can add a bit of grey to visualise the values that went below zero (Figure 7).
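The subtraction can be sketched on a single row of pixels as well. This is only an illustration under my own assumptions: the signal values, the two sigma values and the gaussian_blur helper are invented for the example, and the 0.5 grey offset is simply a convenient amount for viewing.

```python
import math


def gaussian_blur(signal, sigma):
    """Blur a 1D list with a normalised Gaussian kernel (edges are clamped)."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    total = sum(weights)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, weights):
            acc += w * signal[min(max(i + k, 0), n - 1)]
        out.append(acc / total)
    return out


# Made-up slice: noisy 0.3 "greenscreen" with a brighter marker at samples 28..31.
signal = [0.3 + (0.01 if i % 2 else -0.01) for i in range(60)]
for i in range(28, 32):
    signal[i] += 0.2

less_blurry = gaussian_blur(signal, 1.5)  # noise gone, marker still visible
very_blurry = gaussian_blur(signal, 8.0)  # marker smeared out as well

# What the stronger blur removed: the mid-sized marker detail.
band = [a - b for a, b in zip(less_blurry, very_blurry)]

# The band holds negative values too, so offset by mid-grey for viewing.
visualised = [v + 0.5 for v in band]

print(band.index(max(band)), min(band), max(band))
```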
The difference between two blurred images (filtered with a Gaussian kernel if you want to sound clever in a conversation) is called a Laplacian (again, if you want to sound clever; strictly speaking it is a “difference of Gaussians”, which approximates the Laplacian of a Gaussian-blurred image). And you can do this in any software! In Photoshop, it is already built in and it is called a “High Pass” filter, which relates to the signal processing theory of only letting the higher-frequency values pass and removing the lower frequencies, as opposed to blurring, which removes high frequencies and lets low frequencies pass through. BorisFX Silhouette 2020 introduced the “Detail Separation” tool in the Paint node (also available now as an OFX plugin). In Foundry’s Nuke you can use the “Laplacian” node or find a gizmo or toolset from Nukepedia.com (there are several). If the software that you use does not come with a “high-pass” filter, then now you know how to build one yourself by simply blurring an image and subtracting the result from the original (that does not sound as clever, so keep calling it “Laplacian”).
A Practical Example of Painting Skin
It is time to put the above to some practical use. In Figure 8 we have a picture of a cool looking guy with tracking markers drawn onto his face and in Figure 9 we are concentrating on one particular tracking marker on his left cheek (screen right).
When dealing with skin retouching, we must be extra careful to preserve the texture, so as not to make the skin feel plastic, especially when doing paint work on faces, as we are highly tuned to pick out anything that stands out there. So, the first step is to soften your image until the skin texture that you want to preserve has disappeared (Figure 10). This means that any paintwork that you do on the softened image is not going to affect that texture. In Figure 11, I used a technique called “EdgeBlur/UnPremult” to first make a hole in my graph, and then fill it with an estimate of nearby colours. The difference between the blurred image and the original gives us the skin texture (Figure 12), which we can now fix with a simple clone-paint technique (Figure 13) and then add the result back onto our blurry image (Figure 14) to have the texture and shading in the same image again.
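The whole round trip can be sketched on one row of pixels. Everything here is an assumption made for illustration: the "skin" is a made-up lighting ramp plus an alternating texture, a box blur stands in for the softening, and a simple linear fill across the hole stands in for the “EdgeBlur/UnPremult” step, which this sketch does not reproduce.

```python
def box_blur(signal, radius):
    """Average each sample with its neighbours within `radius` (edges are clamped)."""
    n = len(signal)
    return [sum(signal[min(max(j, 0), n - 1)]
                for j in range(i - radius, i + radius + 1)) / (2 * radius + 1)
            for i in range(n)]


n = 60
shading = [0.3 + 0.4 * i / (n - 1) for i in range(n)]   # low-frequency lighting
texture = [0.03 if i % 2 else -0.03 for i in range(n)]  # high-frequency "pores"
clean = [s + t for s, t in zip(shading, texture)]       # what we hope to recover

original = list(clean)
for i in range(28, 32):          # a dark "tracking marker" drawn on the skin
    original[i] -= 0.25

# 1. Soften until the texture is (nearly) gone, then take the difference.
low = box_blur(original, 3)
high = [o - l for o, l in zip(original, low)]

# 2. Fix the shading: fill a hole around the marker from its clean edges.
start, end = 24, 36              # the marker widened by the blur radius
low_fixed = list(low)
for i in range(start, end):
    t = (i - start + 1) / (end - start + 1)
    low_fixed[i] = low[start - 1] * (1 - t) + low[end] * t

# 3. Fix the texture: clone-paint the high frequencies from a clean area.
offset = 14                      # hypothetical clone offset
high_fixed = list(high)
for i in range(start, end):
    high_fixed[i] = high[i - offset]

# 4. Add the texture back onto the shading.
result = [l + h for l, h in zip(low_fixed, high_fixed)]

print(max(abs(r - c) for r, c in zip(result, clean)))
```

Note that low plus high gives back the original exactly, so nothing is lost by working on the two parts separately.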
That seems like a lot of work for one little tracking marker, so is it worth the trouble? Figure 16 shows the exact same clone-paint stroke that we used to paint the texture, but applied straight to the original image instead. The area highlighted with the red circle shows where we are cloning the texture from to cover the tracking marker. We can clearly see that patch on the left as it looks greener, which is the result of also copying over some of the low-frequency shading difference.
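To put a number on that shading difference, here is a small sketch that clones the same patch twice: once straight onto the plate, and once in the high frequencies only. The plate, the patch positions and the box_blur helper are all invented for the example.

```python
def box_blur(signal, radius):
    """Average each sample with its neighbours within `radius` (edges are clamped)."""
    n = len(signal)
    return [sum(signal[min(max(j, 0), n - 1)]
                for j in range(i - radius, i + radius + 1)) / (2 * radius + 1)
            for i in range(n)]


n = 60
shading = [0.3 + 0.4 * i / (n - 1) for i in range(n)]   # lighting gets brighter
texture = [0.03 if i % 2 else -0.03 for i in range(n)]  # fine surface detail
plate = [s + t for s, t in zip(shading, texture)]

low = box_blur(plate, 3)
high = [p - l for p, l in zip(plate, low)]

src, dst, size = 10, 40, 8       # hypothetical clone source, target and width

# Clone on the original image: the patch drags its darker shading along.
direct = list(plate)
direct[dst:dst + size] = plate[src:src + size]

# Clone in the high frequencies only, then add the untouched shading back.
high_cloned = list(high)
high_cloned[dst:dst + size] = high[src:src + size]
separated = [l + h for l, h in zip(low, high_cloned)]

direct_error = max(abs(a - b) for a, b in zip(direct, plate))
separated_error = max(abs(a - b) for a, b in zip(separated, plate))
print(direct_error, separated_error)
```

In this made-up plate the texture repeats perfectly, so the separated clone is practically invisible; on real footage it will not be perfect, but the shading mismatch is still avoided.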
This effect is even better illustrated by moving a small patch around the image: it almost looks like it changes colour as we do that (Figure 17). The problem is that as compositors we tend to trust our eyes, but our eyes can fool us as well. In Figure 18 you can see a grey circle moving across a gradient, and it seems to change brightness as well.
A few years ago, there was a famous picture on the internet of strawberries that were not red but looked red. Many people explained it with “colour constancy” theory, whereby our knowledge of the world can have top-down effects on our senses. I think that both of these visual illusions are more related to information processing in the very early stages of the visual system, via lateral inhibition in our retinas. The ganglion cells in our eyes are connected in such a way that the signal they send forward to the brain already has a “Laplacian effect” applied to it (but don’t quote me on that, please, since I did not study neuroscience). Regardless of the underlying explanation, when we move the patch in the high frequencies only (Figure 19), it doesn’t feel out of place wherever we put it.
Neat! So, it is almost as if science thinks that painting your frequencies separately is a better approach as well!
Conclusion and Further Discussion
In the tutorial above we have seen an introduction to “spatial frequencies” and how that concept can be put to practical use when painting out tracking markers. If you are interested in learning more about it, you should look at the 1984 paper “Pyramid Methods in Image Processing” (Adelson et al.). They used a similar technique of scaling an image down (making it smaller, and smaller, and smaller) and comparing the different scales to each other to the same effect. Another consideration, suggested in our initial explanation, is to separate out the high, mid and low frequencies (not just the high and low), so that you can more accurately select the elements that you want to treat.
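A three-way split works the same way as the two-way one, just with two blurs of different strength. The sketch below is a minimal illustration with made-up values and blur sizes; it uses a box blur rather than the pyramid scaling from the paper.

```python
def box_blur(signal, radius):
    """Average each sample with its neighbours within `radius` (edges are clamped)."""
    n = len(signal)
    return [sum(signal[min(max(j, 0), n - 1)]
                for j in range(i - radius, i + radius + 1)) / (2 * radius + 1)
            for i in range(n)]


# Made-up slice: fine noise everywhere plus a mid-sized bump at samples 27..32.
signal = [0.3 + (0.01 if i % 2 else -0.01) for i in range(60)]
for i in range(27, 33):
    signal[i] += 0.2

slightly_soft = box_blur(signal, 1)   # only the finest detail removed
very_soft = box_blur(signal, 8)       # the bump smeared out as well

high = [s - a for s, a in zip(signal, slightly_soft)]       # noise and texture
mid = [a - b for a, b in zip(slightly_soft, very_soft)]     # marker-sized detail
low = very_soft                                             # overall shading

# Adding the three bands back together gives the original slice again.
rebuilt = [l + m + h for l, m, h in zip(low, mid, high)]
print(max(abs(r - s) for r, s in zip(rebuilt, signal)))
```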
For example, when doing beauty retouching shots, you might keep the skin texture, pores and cilia in the high frequencies untouched, treat some skin discolouration in the mid frequencies, while preserving the shading and lighting information in the low frequencies. Also, the problem with using a Gaussian filter to smooth the image is that it blurs across the borders of edges, while bilateral-type filters (such as the recent GPU-powered Bilateral node in Nuke) can blur within borders. This means that you could use a bilateral blur to smooth the greenscreen, while the edge between the greenscreen and the wall at the back still remains sharp. Mads Hagbarth Lund put together his own neat GPU-accelerated Wavelet Blur gizmo a few years earlier, which used his own implementation of the bilateral blur. Also, median filters may give better results when removing small details that are very different from their surroundings without blurring the image too much.
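For the curious, here is a rough sketch of both ideas on one row of pixels. It is a textbook-style bilateral and median filter written from scratch with made-up parameter values, not the implementation used by Nuke’s Bilateral node or by the Wavelet Blur gizmo.

```python
import math
import statistics


def bilateral_blur(signal, radius, sigma_space, sigma_range):
    """Gaussian-style blur that down-weights neighbours with very different values."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w_space = math.exp(-((j - i) ** 2) / (2 * sigma_space ** 2))
            w_range = math.exp(-((signal[j] - signal[i]) ** 2) / (2 * sigma_range ** 2))
            acc += w_space * w_range * signal[j]
            total += w_space * w_range
        out.append(acc / total)
    return out


def median_filter(signal, radius):
    """Replace each sample with the median of its neighbourhood."""
    n = len(signal)
    return [statistics.median(signal[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


# Made-up slice: noisy 0.3 "greenscreen" (samples 0..29), then a 0.9 "wall".
signal = [0.3 + (0.02 if i % 2 else -0.02) for i in range(30)] + [0.9] * 30

smoothed = bilateral_blur(signal, 5, 3.0, 0.1)
# A huge sigma_range switches the value weighting off: an ordinary Gaussian blur.
plain = bilateral_blur(signal, 5, 3.0, 1e9)

# One bright outlier sample, the kind of small detail a median removes.
spiky = list(signal)
spiky[10] = 1.0
despiked = median_filter(spiky, 2)

print(smoothed[29], smoothed[30])   # either side of the screen/wall edge
print(plain[29], plain[30])         # the ordinary blur bleeds across the edge
print(spiky[10], despiked[10])
```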
In the end, it all boils down to softening your image by some amount and then comparing it with the original to extract the details, so you can paint elements separately.
About the Author
Allar Kaasik is a Senior VFX Lecturer at Escape Studios, where he looks after all the MA degrees and teaches advanced compositing techniques to anyone who is willing to listen. His background is in compositing for commercials and films, but he also has degrees in Computer Vision, Computing & IT and Psychology, Digital Post Production and Television Production.