The Paul Debevec long read

April 4, 2024
On stage discussing digital humans at SIGGRAPH Asia 2023.

A reflection on his lasting impact on filmmaking and visual effects. 

You probably (or hopefully) know a lot about the long history Paul Debevec has had in computer graphics and the significant impact he has made on visual effects. As a researcher, Debevec has brought us new ways to think about image-based lighting, virtual humans and virtual cinematography. 

Debevec’s projects, like the Light Stages at USC ICT, have become synonymous with scanning and re-lighting actors in Hollywood productions, not to mention his earlier work in photogrammetry and how it influenced the creation of synthetic environments.

At SIGGRAPH Asia 2023 in Sydney, Debevec, who is now chief research officer at Eyeline Studios, presented an incredible historical look back at his career. befores & afters then got a chance to revisit some of these career highlights in a sit-down interview. 

I encourage everyone to first check out Debevec’s homepage at https://pauldebevec.com for the actual research papers, going back to his first image-based lighting approaches at UC Berkeley and then at USC and elsewhere. 

Since I’ve chatted with Debevec about much of that earlier work before, we pick things up when he joined Google in 2016, then look at his stint at Netflix. Along the way, please explore the many hyperlinks to different pieces of Debevec’s research (and that of his colleagues) from over the years.

b&a: I don’t think I’ve talked to you too much about Google in terms of your time there, but I’m curious about what takeaways you had from Google and what your main area of research was there?

Paul Debevec: I was at Google for almost exactly five years, hired to work on virtual reality with Google Daydream. It started with a dinner conversation with Marc Levoy at the CVPR 2015 conference. I had just presented a virtual reality light field demo at the FMX conference that I’d done with OTOY. Jules Urbach at OTOY had been showing some pre-rendered Octane models that he turned into light fields that you could see in VR, which I thought was awesome. Since I was consulting for OTOY, I said, ‘Hey, I think I can figure out a way to do this with real photos.’ So I worked with Greg Downing to build a rig that spun a fisheye camera around on a post, capturing all the rays of light coming into a sphere, and then I translated that into their light field renderer for VR.

Paul Debevec speaks at SIGGRAPH Asia 2023.

The FMX conference was coming up and I said, ‘How about we talk about this at FMX? VR is so hot right now, people would love it.’ Jules approved it, we wrote up an article about it, and people who were working in VR and interested in photoreal six-degrees-of-freedom video paid attention. Our work apparently got the attention of Google, where Marc Levoy, one of the light field pioneers from Stanford, had gotten Clay Bavor, the head of Google VR, excited about light fields for VR.

Ultimately, our little light fields for VR demo achieved a big part of what Google wanted, and that put me square in the crosshairs of the people recruiting for Google Daydream. It actually took a year for me to be ready to move on from the graphics lab at USC. I had just taken on Chloe LeGendre as a graduate student, and I knew that was going to be a nice PhD thesis to supervise. And there was also a big negotiation with Google so they could get licenses to all the IP they should have if they were going to hire us, and for me to get permission to continue to work at USC as adjunct faculty to finish up with my graduate students.

Three USC ICT team members and I went to Google in June of 2016. I actually thought they were hiring us for Light Stage related things, but what they cared about most was light fields for virtual reality. I got to work with Matt Pharr of PBRT fame for a while, and when he went to NVIDIA, I ended up running that project. It resulted in the demo Welcome to Light Fields on Steam VR, which used a new light field capture system. We went on to develop a complete system for light field video, where you could move your head and see the video rendered in real time from your particular viewpoint. We sadly didn’t get permission to make a new Steam VR demo, but we published the work here at SIGGRAPH Asia, including some downloadable demos you can run on PC-based VR headsets.

The representation that we came up with to do light field video, notably, was a forerunner to Google’s NeRF work, since we decided to represent the scene as an RGB+opacity volume instead of the view-dependent textured polygon meshes we had used in Welcome to Light Fields. I’d hired a research engineer named John Flynn back into Google to join our team, and he proposed a machine learning technique to convert a set of 2D photos into a set of layered multiplane images (MPIs). This represents the scene as a volume of opacity and RGB, very NeRF-like, except it was explicit rather than implicit in the sense of being encoded in the weights of a neural network. Nonetheless, we started getting view interpolation results that just blew our minds. It was like, oh my God, we freaking solved view interpolation with this, because we would use gradient descent machine learning to solve for the best RGB-alpha volume that reproduces the views we actually shot. And then it ended up extrapolating to novel viewpoints extremely well.
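
To make that idea concrete, here is a minimal toy sketch, not Google’s pipeline, of what “gradient descent on an RGB-alpha volume that reproduces the shot views” can look like: a stack of RGBA layers is composited back to front with per-layer parallax, and the layer colors and opacities are optimized so the renders match the captured views. The parallax model, array sizes and every variable name here are illustrative assumptions, sketched with PyTorch.

```python
# Toy MPI-style fit: optimize per-layer RGB + alpha so that back-to-front
# compositing reproduces a set of "captured" views. Purely illustrative.
import torch

H, W, LAYERS, VIEWS = 32, 48, 8, 5

# Stand-ins for the captured views; in practice these are photos with known poses.
captured = torch.rand(VIEWS, 3, H, W)

# Unknowns: per-layer color and opacity logits (the explicit RGB+alpha volume).
rgb_logit   = torch.randn(LAYERS, 3, H, W, requires_grad=True)
alpha_logit = torch.full((LAYERS, 1, H, W), -2.0, requires_grad=True)

def render(view_idx):
    """Back-to-front 'over' compositing; nearer layers get more parallax."""
    out = torch.zeros(3, H, W)
    for k in range(LAYERS):                        # k = 0 is the farthest layer
        shift = k * (view_idx - VIEWS // 2)        # toy integer-pixel parallax
        c = torch.roll(torch.sigmoid(rgb_logit[k]),   shifts=shift, dims=-1)
        a = torch.roll(torch.sigmoid(alpha_logit[k]), shifts=shift, dims=-1)
        out = c * a + out * (1 - a)
    return out

opt = torch.optim.Adam([rgb_logit, alpha_logit], lr=0.05)
for step in range(200):
    loss = sum(((render(v) - captured[v]) ** 2).mean() for v in range(VIEWS))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Once fitted, render(v) for nearby values of v gives novel-viewpoint renders.
```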

Some of our colleagues in Google Research, including Jon Barron, were paying attention to our light field work and working with some of our data, and had the idea: why don’t we just encode all that in the weights of a neural network? And then NeRFs were born.

Paul Debevec speaks at SIGGRAPH Asia 2023.

It was fun to work on all that at Google with light fields and VR. But frustratingly, about two years in, Google decided to exit the VR business. That was sad, but they had just realized there’s not enough market to earn Google-scale money on VR. So Google pivoted to AR, and that wasn’t the worst thing, because we found new cool problems to work on, like lighting estimation from just background plates, without a light probe. We built a machine learning data collection system, which was a cell phone with a stick holding a mirror ball, a diffuse ball and a matte silver ball in the bottom of the frame. We built four of these cheap rigs and paid people to walk around four different cities, indoors and outdoors, and capture data that shows: if the background looks like this, then if you’d shot a light probe, it would’ve been this. From that data we trained a network to look at a new background and then hallucinate what the light probe would have been. And that’s a useful AR feature, because then if you’re putting your Pokemon into your field of view, you can light the Pokemon with lighting that’s harmonized to the view. It doesn’t work as well as sending a professional VFX crew to measure the lighting, but you can’t do that in a cell phone app, and it works in real time. Our feature shipped in ARCore, and it’s even been leveraged in the newer Pokemon apps that use ARCore.
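
As a rough sketch of the training setup being described, and nothing more than that: the collected pairs of background frames and measured light probes become supervision for a network that regresses a probe from a background alone. The architecture, resolutions and names below are all invented for illustration; Google’s actual model and data are far more sophisticated.

```python
# Illustrative "background -> light probe" regressor trained on paired data.
import torch
import torch.nn as nn

class ProbeFromBackground(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, 16 * 32 * 3),   # tiny 16x32 light probe
        )

    def forward(self, bg):
        return self.net(bg).view(-1, 3, 16, 32)

model = ProbeFromBackground()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-ins for the collected pairs: background frames and the probes measured
# (via the mirror/diffuse/matte-silver spheres) for those same frames.
backgrounds = torch.rand(8, 3, 128, 128)
probes      = torch.rand(8, 3, 16, 32)

for step in range(100):
    loss = ((model(backgrounds) - probes) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# At inference, model(new_background) hallucinates a probe for relighting AR content.
```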

Another thing I did at Google, which ended up consuming most of the political capital that I came in with, was to design and build a new Light Stage system there. I’d originally wanted to build a big Light Stage, like the 50-foot version I showed a visualization of at the end of my SIGGRAPH 2002 talk. Google’s Playa Vista campus had this huge Howard Hughes hangar where the Spruce Goose airplane was built (as seen in 2004’s The Aviator). I thought this would be perfect for a big light stage, and for a while, things were heading that way. But at some point, the people at the YouTube space in Hangar 17 across the street (where Avatar’s motion capture was filmed) said, ‘No, we need somewhere to store props in that space.’ So we lost that battle.

That’s when I had the idea to take the plans for our facial capture Light Stage, expand it from 2.5 meters to 3.5 meters, and have two sets of LEDs in each of the lights: Lambertian multispectral LEDs to light the whole body, and focused LEDs with polarization control to light the face. We’d also have two kinds of cameras on the stage, some that were wide-angle to see most of the body, and some that were zoomed in for the face. So this was our first dual face-and-body scanning Light Stage. My lab at USC ICT built a similar system for Christian Theobalt’s group at the Max Planck Institute, and they’re going to use it for their research.

So, since we built the Light Stage at Google, we could record lots of people in lots of lighting conditions from lots of viewpoints, and we started training ML models on that data. A lot of the team that I worked with came into Google with Shahram Izadi, who can definitely be credited for holding an umbrella over all of this effort. Having come from Microsoft’s Holoportation project, they were experts in volumetric capture and creating consistent mesh topology over time. We built what at the time was by far the most powerful relightable volumetric capture system, which we published as The Relightables paper at SIGGRAPH Asia 2019.

The same light stage system let us collect all the data we needed for SIGGRAPH 2019’s Single Image Portrait Relighting and our greatly improved SIGGRAPH 2021 follow-up Total Relighting, where our algorithms can change the lighting on a selfie photograph of someone who’s never stepped inside of a light stage, and change out the background to match. For this we captured data of nearly 100 people – including OLAT (One Light at a Time) reflectance fields from my SIGGRAPH 2000 paper on the first light stage – to generate paired training data which shows that if a person looks like this lit one way, then they should look like this lit this other way. It was gratifying to see work from two decades before having an impact in the age of machine learning. I just ran into Hoon Kim here at SIGGRAPH Asia and was pleased to see how relighting techniques like this are making their way into film production in products such as Beeble’s SwitchLight.
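
The OLAT idea is worth spelling out, because it is what makes that paired training data possible: relighting a reflectance field is a linear combination of the one-light-at-a-time photographs, weighted by the colors the target environment sends from each light’s direction. Here is a minimal sketch with random stand-in data; the array shapes and names are illustrative, not from any particular codebase.

```python
# Image-based relighting from OLAT data: weight each one-light photo by the
# HDRI environment's color from that light's direction and sum.
import numpy as np

NUM_LIGHTS, H, W = 156, 256, 256            # e.g. one image per light-stage LED

olat = np.random.rand(NUM_LIGHTS, H, W, 3)  # OLAT photos, linear radiance
env  = np.random.rand(NUM_LIGHTS, 3)        # HDRI sampled at each light's direction

# Each relit pixel is the sum over lights of
# (appearance under that light) * (environment color from that direction).
relit = np.einsum('lhwc,lc->hwc', olat, env)

# Paired relighting training data comes from evaluating this with many
# different 'env' vectors for the same captured subject.
```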

The final big project that I contributed to at Google was Project Starline, which is a very impressive 3D video conferencing system. It was a good project for me to join because it felt like a natural extension of the work I did at SIGGRAPH 2009, where we brought a 3D video teleconferencing system they were aware of, and it showed just how cool it is to talk to somebody in 3D where they can make proper eye contact with you or shift their gaze to anyone else you happen to be standing next to. One of our 3D teleconferencing participants at SIGGRAPH 2009 was NVIDIA’s CEO Jensen Huang, and it’s been fun to see NVIDIA’s recent 3D videoconferencing systems at SIGGRAPH ETech led by one of my former Ph.D. students, Koki Nagano, who worked on our video projector array system to create 3D hologram-like conversations with survivors of the Holocaust.

b&a: How did you then move onto Netflix?

Paul Debevec: I had originally planned to work for Google for four years – that’s often how long you’re incentivized to remain at a big tech company – and then decide if I wanted to try something else. Perhaps I’d go on to become more directly involved in filmmaking again.

On stage discussing digital humans at SIGGRAPH Asia 2023.

Four years into my Google tenure was June 2020, and since the COVID pandemic had just locked us into our homes, it didn’t feel like a great time to change jobs. So I stayed at Google for a fifth year. Google had been a bit frustrating, since I’d been contacted about a number of projects that would have been exciting opportunities to apply and extend the techniques from our research. One request that came through was a project to go to London and record a light field portrait of Queen Elizabeth II. This seemed like a very exciting opportunity and one which would be a nice bookend to the 3D Presidential Portrait I had helped record of President Obama in 2014 with a mobile light stage.

My manager, Shahram, thought this would be interesting, but there wasn’t a VP who was personally excited about it, so the project didn’t go anywhere. Another opportunity which came our way was to test the relightable volumetric capture system in a virtual production context. An industry colleague who was doing work for Disney for The Mandalorian show asked if they could do some tests with our system, since they thought it would be great for adding realistic animated characters into the show’s real-time virtual production backgrounds. I thought that would be great! I pitched it to Google, knowing there were Star Wars fans among our executive management and even some Star Wars collaborations with Google Cardboard and Google Seurat in the VR days. But they just said, “No, that’s not really our business model.” That was frustrating!

So Google wasn’t letting me or my team work on creative collaborations with the cutting-edge technology we were developing. Cool things like these don’t directly make money, but they do move the ball forward. They get your whole team more excited about what you’re doing. They push you to take the technology further to get an amazing result, and you end up in a better position at the end of it, because your technology has been proven out. It’s solidified. More people know about it. It’s in a better position to lead to revenue at that point.

So, when I got a LinkedIn ping from Netflix’s Girish Balakrishnan, their director of virtual production, saying that Netflix had a new position open for Director of Research, I got back to him right away. A few discussions later I had an attractive offer. During my interviews I asked, “Will I be able to work on virtual production technology there?” Because another thing that had happened was that virtual production technology had sprung to life while I was at Google. I’d been doing research in the virtual production space for 20 years, from our LED stage at SIGGRAPH 2002 to working on the LED image-based lighting technology for Gravity, and I thought, “Why does it seem like everyone else is having fun with this right now except me?” So Netflix presented an opportunity for that.

At Netflix, we had a chance to do research to improve virtual production technology. The biggest shortcoming I wanted to address was that even though LED stages are intended to produce realistic lighting on the actors – that’s the reason we put LEDs on the ceiling and all around the walls – the results weren’t very accurate in practice. The stages being used had a lot of issues with dynamic range and color rendition, and just didn’t get image-based lighting to work the way we’d been able to in our research and on Gravity.

Paul Debevec speaks at SIGGRAPH Asia 2023.

So during my transition from Google to Netflix, I worked with the USC Entertainment Technology Center to evaluate a state-of-the-art virtual production stage for its image-based lighting capability. I noticed these stages don’t really light people the way they would look if they were actually in the scene you’re trying to simulate – often, not even close. The panels couldn’t go bright enough. They didn’t cover all the angles you need to light people from, and the color rendition was terrible. When you light people with only red, green, and blue LEDs, your colors are way off, especially skin tones. Darker skin tones shift toward red, and lighter skin tones shift toward pink – actually slightly magenta, because a little bit at the blue end of the spectrum also gets thrown off for the lighter skin tones. We sometimes call this the “lobster effect”, since you look like a boiled red lobster when you get on a virtual production stage. Yellow, orange, and cyan materials can look dark and dull, too, since there are gaps in the RGB lighting spectrum there. So all that has to be color corrected out in postproduction.

I’d noted this problem in our SIGGRAPH 2002 paper, and we developed RGBW multispectral lights in 2003 to show that multispectral LED lighting could accurately replicate the color rendition properties of traditional illuminants like daylight and  incandescent. In 2013, we built Light Stage X to be an LED lighting reproduction stage which surrounded actors with RGB plus amber, cyan, and white LEDs to produce accurate color rendition for any real-world environment. At SIGGRAPH 2016, we published a straightforward way of driving multispectral lighting to reproduce the colors of an HDRI panorama as well as the color rendition properties observed through one or more color charts, which I don’t think people had done before that.
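
As a simplified sketch of the kind of solve the SIGGRAPH 2016 approach describes (one possible formulation, with placeholder data rather than real measurements): measure how a color chart responds to each LED channel driven individually, then solve non-negatively for the channel intensities whose combination best reproduces the chart as photographed under the target illumination.

```python
# Illustrative multispectral LED solve: non-negative least squares fit of
# per-channel drive levels to match a color chart under a target illuminant.
import numpy as np
from scipy.optimize import nnls

NUM_PATCHES, NUM_CHANNELS = 24, 6        # e.g. 24-patch chart, RGB+A+C+W LEDs

# A[i*3:(i+1)*3, j] = camera RGB of chart patch i lit by LED channel j alone,
# at unit drive level (measured once per stage/camera combination).
A = np.random.rand(NUM_PATCHES * 3, NUM_CHANNELS)

# b = camera RGB of the same chart photographed under the real-world
# illumination the LED stage should reproduce.
b = np.random.rand(NUM_PATCHES * 3)

weights, residual = nnls(A, b)           # non-negative LED drive levels
print("per-channel drive levels:", weights)
```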

Wearing my Netflix badge, I was able to chat with companies like ROE and AOTO and Kino-Flo, who make LED panels and lighting solutions, and ask them if they had any plans to improve the color rendition properties of their LED products. I asked if they might be able to add some broad spectrum LEDs to these panels, which to some was an odd concept, since if you add a white LED to an RGB panel, it does absolutely nothing to increase the color gamut of that panel – you can make the “color” white from RGB, but with a spectrum very different from that of daylight or incandescent light. The manufacturers were more interested in trying to expand the color gamut of their panels. The only project I knew of trying to add a new LED spectrum to a panel was adding cyan, which is actually the one place where you can significantly increase the color gamut, because you can go outside of the triangle between red, green and blue there.

I’ve since seen an LED panel display that adds a cyan LED, and I don’t know if it’s just because we’re not used to looking at it, but it didn’t seem to be an appealing color – it was a strange, dusty turquoise kind of look. I would’ve thought it’d be great for underwater scenes at least, but you can go see Avatar: The Way of Water with just the three RGB primaries on a laser projector and be absolutely satisfied with the gorgeousness of the color of every blue, green, emerald, aqua and aquamarine that you’ve got. So I’m not sure that we need to add more LEDs to expand our color gamut for what we can display, but we really do need additional LED spectra to improve our color rendition when we use these systems for lighting.

I’m happy to say that as of last year, ROE, AOTO, and Kino-Flo have all developed new LED panels which add a broad-spectrum white emitter, and the color rendition improvement in how they light actors is remarkable.  It would be even better to have a dedicated yellow and a dedicated cyan LED like we added for Light Stage X which we showed could produce 99% accurate color rendition. But we also showed that you can get 98% accurate color rendition just by adding the white LED for most subjects and illuminants.  So I think the research helped influence the development of RGBW image-based lighting LED panels.  And I hope to see virtual production stages incorporating these new products!

Most recently, we presented our Magenta Greenscreen research at DigiPro 2023 to try to improve the training data available for accurate keying algorithms, since keying, surprisingly, in this day and age is still not fully automated to movie-quality results. There’s a lot of thankless effort and expense involved in that. The idea was, if we can derive accurate mattes – particularly on a virtual production stage where everything is live – and put in any background that we need, then or after the fact, then we wouldn’t have to worry as much about getting the background perfectly right, final-pixel, in camera, and that would alleviate a lot of the stress associated with the LED stage virtual production process.

b&a: When you released that magenta greenscreen research, it actually got a lot of press, which I think sort of took it the wrong way. People took away the wrong idea, that you were suggesting replacing greenscreens. Your research was more about, I think, using machine learning and training to get better results, right? What are you hoping comes out of it?

Paul Debevec: Yes, we developed the technique as a way to obtain a rich set of real-world ground truth training data for ML-based natural image matting algorithms, where the actors can be automatically keyed off of any background and lighting. But since the stories ran the teaser image of our paper, which has our actors looking quite magenta on a greenscreen – that’s the image the computer analyzes, where the green channel is a direct record of the actor’s holdout matte – I think the too-casual reporter imagined that we were suggesting that movie sets of the future should look like this!
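
For readers wondering how the green channel can be “a direct record of the holdout matte”: roughly, the actor is lit only by red and blue LEDs while the wall behind them emits green, so any green reaching the camera is un-occluded background, and dividing by a clean plate of the empty wall gives the matte. Here is a minimal sketch of that arithmetic; the variable names are illustrative, and the paper’s full pipeline adds a learned colorization step for the foreground.

```python
# Matte extraction from the green channel under magenta foreground lighting.
import numpy as np

frame       = np.random.rand(1080, 1920, 3)   # magenta-lit actor, green wall
clean_plate = np.random.rand(1080, 1920, 3)   # same green wall with no actor

# Green seen through/around the actor, normalized by the unobstructed wall.
transmission = frame[..., 1] / np.maximum(clean_plate[..., 1], 1e-6)
alpha = np.clip(1.0 - transmission, 0.0, 1.0)  # actor's holdout matte

# The red and blue channels carry the (magenta-lit) actor, which a learned
# model then restores to full color.
```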

Paul Debevec speaks at SIGGRAPH Asia 2023.

If I find myself directing a movie, I promise to look for a way to pull off a few shots with actual magenta greenscreen. We’ll rehearse it in white light to get the colorization training data, but we’ll actually shoot a take that goes into the movie with the magenta light, and I think that will be fantastic. The pinkish magenta light is fun to be in, and it actually doesn’t look nearly as extreme as the teaser photo seen in the articles, because that was color matrixed into the primaries of the camera. In person, the red and blue LEDs excite all three of your retina’s cones, middle wavelengths included, so things look pink but not searing magenta. I don’t think the cast of Barbie would find it unfamiliar.

And if you did literally want to shoot with it, we showed in the paper how you can time-multiplex it. We were able to have the LED panels very rapidly switch between magenta-green and green-magenta lighting, so it just looks like you’re lit with gray light both in front and behind, and we time the camera exposure to pick out just the magenta-green frames. But what we really want to do is be able to get a good alpha channel for anybody anywhere, without crews having to wait for a greenscreen to be put up. Productions already often don’t wait to put up a greenscreen, which leads to a lot of rotoscoping. And no one has suggested that’s the best use of the time and talent of the visual effects industry.

The data that the magenta-greenscreen technique yields will let us train a good model to get the alpha channel, which should be useful in regular vfx production, but also in real-time filmmaking, because if you can see a high-quality composite on set, even in the presence of bounce light from your LED panels, you can see what will be possible to achieve in postproduction and know you will be happy with it.

b&a: This is something we have talked about before, but I always love drawing a line from your research through to what filmmakers and visual effects artists are doing today. The classic one was the Campanile film at UC Berkeley, with the tower, and John Gaeta seeing that and using the idea to craft the backgrounds for bullet time in The Matrix. There’s also LED lighting and re-lighting. I’m wondering, is there something you are seeing being done today where you go, oh, we touched on that in our previous research?

Paul Debevec: The most recent example was the work done on Thor: Love and Thunder with the time-multiplexed lighting technique that we’d demonstrated at SIGGRAPH 2005.  That’s one of my favorite papers since it landed at the top of the paper reviews scoring list that year and for once I could relax at the committee meeting knowing that my paper had a very strong chance of being accepted.  In 2016, when my lab was working on scanning actors like Mark Ruffalo and Cate Blanchett for Thor: Ragnarok, vfx supervisor Jake Morrison and director Taika Waititi came over to visit our lab too.  When we have visitors who might use some of our light stage techniques on a production, we play a reel of all the cool things you can do with the Light Stages.  It’s kind of like a menu of possibilities. For the Light Stages, a killer application was high resolution polarized gradient facial scanning, because it’s been the highest resolution 3D scan you can get, down to skin pores and fine creases, for digi-doubles.  A great number of productions have been able to do great work starting with that data.

Paul Debevec speaks at SIGGRAPH Asia 2023.

The next most common thing used in productions has been our OLAT data, which as we discussed earlier is getting more and more prevalent, since that’s the data that’s really useful for machine learning. Google, Meta and Adobe all now have their own light stages for their relighting projects. And after OLATs, it’s been our technique for skin microgeometry, where I designed a little plate that you could push your face up to and our light stage records skin samples at a hundredth of a millimeter of detail to create really nice procedural skin shaders.

And time-multiplexed lighting was something that we’d always present, where we very rapidly change the lighting to come from different directions to get relightable performance data recorded on a high-speed camera.  We’d demonstrate it with a clip from a 2013 piece about the technique done for the Discovery Channel. They were familiar with that from Thor: Ragnarok, where they set up their own strobe lighting array to get kind of the time slice version of lighting to happen for the Valkyrie sequence.  And then they implemented a much more complete version of performance relighting with time-multiplexed lighting for Thor: Love and Thunder for the Moon of Shame sequence on that film.

b&a: I think that’s probably my favorite thing about your work, Paul – the research side of it turning into actual film production. And then of course, you also being involved in the film production, especially digital humans. I wonder what you’ve been seeing recently with digital humans, and what do you think still needs to be done in terms of making them more photoreal, or more emotional, or more relatable? Where do we still need to go, in your view? Are we ‘there’ yet?

Paul Debevec: We’re not quite there yet. But what I think we need to be doing, because we’re seeing all sorts of exciting new capabilities in the research realm, is to make these tools something that can be wielded by filmmakers. I think the best content results when you have the best creative people using the best tools, and at the moment we have a whole new array of impressive tools being developed, but they seem to just spray in every possible direction. We need to be able to use these things with intention, with supervision, and to enhance what all of the craftspeople are doing.

Paul Debevec speaks at SIGGRAPH Asia 2023.

It took us a long time to notice and figure out how to simulate the subtleties of human facial appearance, such as the multi-layer subsurface scattering in the skin, the asperity scattering from peach fuzz, and the dynamic facial microgeometry you get when you squinch your flesh, and how the specular highlights change as a result.

In terms of noticing and simulating all the subtleties of how faces move, I think we’re still near the beginning.  Good films are the result of a good story written by a writer and compelling performances achieved through actors working with a director, and I think the first thing we should do is to find ways to give new creative tools to the filmmakers who will know how to use them. I think it’s less interesting to develop new tools to do what we already can do very well.  Early movies staged their shots not too differently to how they would have been blocked on a theatrical stage, staying wide without close-ups.  That was basically trying to directly translate an established medium to a new one.  I think too much of the discussion now is how to make current kinds of films using new tools.  I think it will be far more interesting to discover new kinds of storytelling experiences which can be brought to life with these new tools. Let’s make the stuff we haven’t seen yet.  That’s what I’m excited about.

Paul Debevec and Ian Failes at SIGGRAPH Asia 2023.
