Behind the scenes of the crowds of ‘Bohemian Rhapsody’, ‘Rocketman’ and ‘Yesterday’
In case you didn’t notice, there have been a few movies in recent times about singers. Three of those – Bohemian Rhapsody, Rocketman and Yesterday – all featured huge concert scenes, with VFX studios helping to deliver the necessary cheering crowds.
With DNEG, Cinesite and Union VFX, befores & afters goes crowd surfing to find out how these studios populated stadiums and other locations with excited audiences, and sometimes just the singer himself.
Beatle-mania (with a touch of Ed Sheeran)
In Yesterday, singer Jack Malik (Himesh Patel) rises to stardom by playing Beatles songs that the rest of the world seems to have forgotten. At one point, he performs them at Wembley Stadium, and also earlier at a beachside concert at Gorleston-on-Sea. Those concert scenes required a unique approach to crowd creation, which was handled by Union VFX.
For the Wembley Stadium concert, the filmmakers modeled their scene on real concerts performed by Ed Sheeran (who also has a role in the film). Rather than generate a CG crowd or use crowd sprites, footage of the actual crowds at Sheeran’s concerts was used in the film. Patel would then perform his scenes to an empty stadium and be composited into the already crowd-filled shots.
[twenty20 img1=”3186″ img2=”3185″ offset=”0.5″]
“For the energy that director Danny Boyle wanted to get from the crowd,” says Union VFX visual effects supervisor Adam Gascoyne, “we all thought it would be very difficult to achieve that with a CG crowd because of the random nature of it. You can see it, the energy, especially the energy when you watch one of Ed’s shows, he’s winding everyone up into an absolute frenzy and you really feel that.”
The action takes place in Wembley, but a Sheeran concert in Cardiff was also used – “you can double them quite easily because the stadiums are roughly the same shape,” says Gascoyne.
[twenty20 img1=”3195″ img2=”3193″ offset=”0.5″]
The process began with a recce of three of Ed Sheeran’s concerts. “We got to know the lighting setups and work with the lighting team. From that we started to piece together a plan of rough lighting setups that we could use from Ed’s show, on the crowd, to use in our sections of the film. And we started planning shots of Jack on stage, and thinking about where we could put the cameras within Wembley and Cardiff.”
The plan was to shoot all of Ed Sheeran’s concerts – full of up to 80,000 people – from start to finish. Then, once Sheeran had finished and left the stadium, the team continued working by shooting Patel all night on the stage with the same lighting setups, but in an empty stadium. There were nine planned camera positions around the stadium, allowing the team to capture the action of Jack on stage, plus several moving cameras, including Spider-cams.
[twenty20 img1=”3188″ img2=”3187″ offset=”0.5″]
The planning lead-up to the shoots helped with one of the more challenging aspects of this approach; the crowd would be reacting to Sheeran’s music, but in the final shots they of course had to react to Jack’s Beatles music. “Luckily,” says Gascoyne, “you start seeing similarities in what you want the crowd to be doing when they’re listening to, for instance, ‘Here Comes The Sun.’ It might be waving their hands in the air, and you notice that the crowd start doing things that you want to start incorporating into the songs that Jack’s singing. So, when Jack is singing ‘Saw Her Standing There’, the crowd for those shots is when Ed Sheeran is singing ‘Bloodstream’.”
Union VFX’s work for the Wembley Stadium shots was mostly in compositing Jack into the real crowd plates, but there were additional moments that required specific VFX. For example, Sheeran at one point in his concert has the crowd hold up their phones; the lights were enhanced for the scene using Houdini particles so that the ’twinkling’ was particularly magical. Some of the light-show display graphics for Jack’s concert were also re-designed to match the Beatles music.
[twenty20 img1=”3190″ img2=”3189″ offset=”0.5″]
It was an intense shoot. Preparations – over two nights at Wembley and four in Cardiff – would begin around 11am. Ed Sheeran’s concert would start around 6pm. Then, after the concert ended, the team would shoot through until around 7am the next morning. “It was pretty insane,” acknowledges Gascoyne. “They were long, long days and long, long nights.”
Playing the beach
Earlier in the film, Jack launches his Beatles music album with a rooftop concert on the beach at Gorleston. Around 6,000 extras turned up to shoot these scenes for real, with Union VFX then extending the crowd with a further 24,000 people.
“We had the largest crowd-call of any movie in the UK, maybe any movie in the world, barring Gandhi,” reveals Gascoyne. “These 6,000 people turned up, all based on goodwill and Danny asking them nicely to come along. It was an enormous amount and it looked really good, but we enhanced it a bit with CG crowd. We were able to do that because we had the energy of the real crowd as well, and then linked it across.”
[twenty20 img1=”3200″ img2=”3199″ offset=”0.5″]
Digital crowd members were created based on a motion capture shoot done at the studio, with an artist dancing and replicating the kinds of moves that were performed on location. Union relied on the crowd software Golaem for the crowd sims, and then also incorporated extra CG elements such as balloons, beach balls, festival flags and even blow-up pink flamingos. “It all helped with the energy,” says Gascoyne.
DNEG will rock you
Bohemian Rhapsody’s climactic Live Aid concert sees Freddie Mercury (Rami Malek) and his Queen band members play to a gigantic crowd, also at Wembley Stadium, but back in 1985. Here, the actors performed on a mock stage mostly against greenscreen with a small crowd, while DNEG would craft stadium crowds based on individual video captures of different performances. The studio used an image-based modeling approach to convert those video captures into ‘3D sprites’, replicated across 100,000 crowd agents.
“The director wanted the crowd captured ‘in camera’, he was strongly against the use of animated fully CGI 3D characters driven by A.I. networks which would be our usual method,” says DNEG R&D supervisor Ted Waine. “This requirement was very much client driven, they felt that animated crowd agents wouldn’t match the appearance and dynamism of the Live Aid crowd. This meant we had to use video based performance capture but beyond that specification it was up to us to deliver a solution.”
[twenty20 img1=”3212″ img2=”3211″ offset=”0.5″]
Waine notes, too, that another key requirement was that the crowd should react in different and specific ways to each of Queen’s songs. They would need to do certain actions in time with the music, such as stamping their feet and clapping once for ‘We Will Rock You’ or doing a double hand clap for ‘Radio GaGa’. Audio guide tracks helped make this possible.
The effort began with performance capture shoots over seven days. “Each day was 10 hours and cameras were running for about 8 hours a day, so about 50 or 60 hours of performances in total,” describes Waine. “We captured a total of about 70TB of video data, constituting several hundred individual performances to our audio guide tracks.”
[twenty20 img1=”3210″ img2=”3209″ offset=”0.5″]
“Camera timecode sync with the audio track was ensured on-set so we could extract each separate action from the performance, timed to the music, automatically,” adds Waine. “This would allow us to easily set-up the crowd afterwards so that they were all clapping in time to each other and the music even though they were recorded independently.”
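The timecode-synced extraction Waine describes can be sketched roughly as follows. This is an illustrative Python sketch, not DNEG's pipeline: the function names, the HH:MM:SS:FF timecode format and the 24 fps rate are assumptions. Because camera and audio timecode are jam-synced, locating an action inside any camera's clip reduces to a subtraction.

```python
# Illustrative sketch of timecode-based action extraction.
# Assumptions: HH:MM:SS:FF non-drop timecode at 24 fps.

FPS = 24

def timecode_to_frame(tc: str, fps: int = FPS) -> int:
    """Convert an HH:MM:SS:FF timecode string to an absolute frame count."""
    hh, mm, ss, ff = (int(p) for p in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

def extract_action(camera_start_tc: str, cue_start_tc: str,
                   duration_s: float, fps: int = FPS) -> tuple[int, int]:
    """Return the (first, last) frame of a music cue within a camera's clip.

    Camera timecode is jam-synced to the audio guide track, so the
    offset from the camera's first frame to the cue is a subtraction.
    """
    start = (timecode_to_frame(cue_start_tc, fps)
             - timecode_to_frame(camera_start_tc, fps))
    return start, start + int(duration_s * fps) - 1

# Hypothetical example: camera rolled at 01:00:00:00; a clap cue in the
# guide track starts at 01:02:30:00 and lasts four seconds.
first, last = extract_action("01:00:00:00", "01:02:30:00", 4.0)
print(first, last)  # 3600 3695
```

With every performance tagged this way, clips recorded independently can all be conformed to the same musical beat automatically, which is what let the crowd clap in unison.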
It was realized early that turning those video captures into usable elements would be tricky if they were only one angle (since projections onto say 2D cards – a common approach for crowd replication – would not work for the planned dynamic camera moves). So, an array of cameras was used to record crowd extras performing. “We could process the multi-view video into lightweight 3D elements, which we called ‘3D sprites’, using image based modelling,” explains Waine. “Essentially we do stereo feature matching across pairs of cameras in the array to get a depth image, and we can join the depth images from each pair of cameras and generate a 3D surface, onto which we re-project the video to give it texture.”
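The stereo feature-matching step Waine outlines can be illustrated with a toy block-matcher: for each pixel in the left camera's image, search along the scanline of the right camera's image for the horizontal shift (disparity) with the lowest matching cost; disparity then converts to depth given the camera baseline. This is a minimal NumPy sketch on synthetic data, assuming rectified cameras and a sum-of-absolute-differences cost; DNEG's in-house tools are of course far more sophisticated.

```python
# Toy stereo block matching (SAD cost) on a synthetic rectified pair,
# illustrating the depth-image step behind the '3D sprites'.
import numpy as np

def block_match_disparity(left, right, block=5, max_disp=16):
    """For each pixel, find the horizontal shift that best matches."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y-half:y+half+1, x-half:x+half+1]
            costs = [np.abs(patch - right[y-half:y+half+1,
                                          x-d-half:x-d+half+1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp

# Synthetic pair: the right view is the left view shifted by 4 pixels,
# i.e. a flat scene with constant disparity 4.
rng = np.random.default_rng(0)
left = rng.random((40, 60))
right = np.roll(left, -4, axis=1)
d = block_match_disparity(left, right)
print(int(np.median(d[10:30, 25:50])))  # 4
```

Joining such depth images from each camera pair into one surface, then re-projecting the video onto it for texture, gives the lightweight textured mesh the team called a 3D sprite.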
[twenty20 img1=”3214″ img2=”3213″ offset=”0.5″]
This was done with DNEG’s own in-house photogrammetry, depth reconstruction and pointcloud meshing tools, plus a set of Python scripts to provide a pipeline and distribute the processing. Waine acknowledges this was computationally expensive, but some of the up-front image processing was done in Nuke, such as procedural greenscreen keying and motion vector generation – “so that we could inject motion data into the 3D sprites and add motion blur effects at render time,” he says.
To ensure the crowds were broken up with distinctive movements, DNEG’s lighting and layout teams developed set-ups that would allow them to choose and distribute a mix of crowd actions for each shot, with some dancing in time to the music, some just watching, some cheering and clapping and so on.
[twenty20 img1=”3216″ img2=”3215″ offset=”0.5″]
Says Waine: “We would first use very lightweight flat sprites to develop the layout and get it approved before running the full extraction to 3D sprites and switching them in. We then used some tricks like a procedural randomized hue variation to further add variation to the appearance of the crowd because duplication of sprites was still an issue in some cases. Although we had a few hundred performances, the crowd was 100,000 people, so any given sprite had to be duplicated a few hundred times. Randomizing the time offset of each sprite also helped mask duplication.”
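The duplication-masking tricks Waine mentions are easy to sketch: each of the 100,000 agents instances one of a few hundred sprites, but carries its own random hue shift and clip time offset, so two copies of the same performer rarely read as identical. This Python sketch is illustrative only; the attribute names and value ranges are assumptions, not DNEG's setup.

```python
# Illustrative crowd build: a few hundred source sprites instanced across
# 100,000 agents, each with a per-agent hue shift and frame offset to
# mask the duplication. (Attribute names/ranges are assumptions.)
import random

def build_crowd(n_agents, n_sprites, clip_frames, seed=1):
    rng = random.Random(seed)
    crowd = []
    for agent in range(n_agents):
        crowd.append({
            "sprite": agent % n_sprites,            # each sprite reused many times
            "hue_shift": rng.uniform(-0.05, 0.05),  # small hue rotation at render time
            "frame_offset": rng.randrange(clip_frames),  # de-sync the performance loop
        })
    return crowd

crowd = build_crowd(n_agents=100_000, n_sprites=300, clip_frames=480)

# Every sprite is duplicated a few hundred times...
copies_of_first = [a for a in crowd if a["sprite"] == 0]
print(len(copies_of_first))  # 334
```

Even with hundreds of copies of each sprite, the chance that two copies share both the same hue shift and the same frame offset is negligible, which is why the repetition stops being noticeable.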
DNEG worked in Clarisse 3.5 to do this, relying on the software’s instancing and scattering toolset. “Under the regular crowd generation workflow we would need experienced crowd TDs working in other 3D animation and effects applications like Maya or Houdini, but for Bohemian Rhapsody we had limited access to those sorts of artists but many shots to do,” notes Waine. “So developing the Clarisse workflow allowed a wider team of artists to build crowds than otherwise might have been possible, and was key to delivering the work.”
Shots where the crowd came close to camera were the most challenging, admits Waine. This was because the 3D sprites were relatively low resolution and worked mostly for mid to background crowds.
“The very front crowd might be from the plate itself, i.e. a thin layer of the crowd extras on set. Behind them we might use higher res 2D elements directly from the video capture if the angles worked, then behind them we would use a higher resolution version of the 3D sprites with more detail in the texture data, and then beyond them the ‘regular’ lightweight 3D sprites. Integrating these elements was a good challenge for the comp team.”
[perfectpullquote align=”right” bordertop=”false” cite=”” link=”” color=”” class=”” size=””]Each day was 10 hours and cameras were running for about 8 hours a day – about 50 or 60 hours of performances in total.[/perfectpullquote]
Other challenges also came from those sweeping camera moves. Here, the 3D nature of the sprites tended to work for most of the shots, but those sprites were not a full 360 degrees (the camera array captured around 120 degrees of the performers).
“For this reason,” says Waine, “during performance capture some actions were repeated with the actor facing side-on or away from the camera array so we had a sprite that would work viewed from the side or behind. This made the layout stage more complicated because on some shots the crowd had to be segmented depending on whether the camera could see the front, the side or the backs of the people in the crowd and we would have to offset the sprites differently to pick the desired angle from the performance for each segment.”
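The view-dependent segmentation Waine describes boils down to an angle test per agent: compare the agent's facing direction with the direction to camera and pick the front, side or back capture variant accordingly. The thresholds and function names in this Python sketch are assumptions for illustration, not DNEG's actual layout tool.

```python
# Illustrative view-dependent sprite selection: segment the crowd by the
# angle between each agent's facing and the direction to camera.
# (Thresholds and names are assumptions.)
import math

def pick_variant(agent_pos, agent_facing_deg, cam_pos):
    """Choose which captured angle of the performance to instance."""
    to_cam = math.degrees(math.atan2(cam_pos[1] - agent_pos[1],
                                     cam_pos[0] - agent_pos[0]))
    # Relative angle between where the agent faces and where the camera is,
    # wrapped into [0, 180].
    rel = abs((to_cam - agent_facing_deg + 180) % 360 - 180)
    if rel <= 60:
        return "front"   # camera roughly in front of the agent
    if rel <= 120:
        return "side"
    return "back"        # camera behind the agent

# Crowd faces a stage at the origin: a camera on stage sees fronts,
# a camera behind the crowd sees backs.
agent = (0.0, 50.0)
facing = -90.0  # facing toward the stage (negative y)
print(pick_variant(agent, facing, (0.0, 0.0)))    # front
print(pick_variant(agent, facing, (0.0, 200.0)))  # back
```

Each segment then gets the sprite variant (and time offset) that shows the performance from the matching captured angle.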
In addition to the crowds themselves, DNEG also built stages, vehicles, digi-doubles and stadium pieces to match the 1980s era. The final scenes often started with a real set-piece and then transitioned into a CG build.
Dodger Stadium, re-visited
Rocketman’s enormous stadium sequence – when Elton John (Taron Egerton) plays to thousands of people at Dodger Stadium in 1975 – saw a different approach to crowd replication, this time from Cinesite. Egerton performed the scenes in Shepperton, with the crowd process beginning with extras filmed on a greenscreen stage on a per-scene basis.
“They performed various actions such as dancing, cheering, applauding or, for one scene, simply ‘milling about,’” advises Cinesite visual effects supervisor Holger Voss. “Each of these actions was filmed at six different angles to camera. The setup was such that the camera remained static while the scene lights were rotated for each angle. In addition to angles and actions, different period-matching costumes were worn by the crowd performers. This resulted in a sizeable number of elements. All in, well over 150 green screen crowd takes with up to three extras per shot were used in the final sprite crowds.”
Then, Cinesite used 2D extraction techniques to create crowd sprites that were turned into individual cards, with specific mattes assigned to the crowd members’ costumes and faces. “These cards were then published to FX animation, where they were randomized and placed in the respective CG scene,” says Voss. “These scenes were then rendered and handed back to compositing for final integration in the shots. With the help of proprietary Cinesite pipeline tools, compositing artists had control of the costume colors and exposure levels for each sprite individually if needed.”
[perfectpullquote align=”right” bordertop=”false” cite=”” link=”” color=”” class=”” size=””]All in, well over 150 green screen crowd takes with up to three extras per shot were used in the final sprite crowds.[/perfectpullquote]
“All shots were composited using Nuke, in combination with the proprietary pipeline tools,” continues Voss. “Maya was used for the camera layout; these layouts were then populated with particles, i.e. extras, in Houdini. This tool was also used to assign action instances pulling from the respective crowd source clips. Lastly, lighters using the Cinesite group’s in-house ‘Gaffer’ toolset were able to re-assign textures during shading and lighting.”
Dodger Stadium itself was of course a CG build. “We spent a considerable amount of time researching the architecture, period-specific decor and color schemes and even the audio technology for each of the concert venues,” notes Voss. “Thankfully, there is a lot of information available on-line and in archives. Elton’s Dodger Stadium concerts from 1975 are particularly well documented; we watched the 2-hour documentary quite a bit in preparation, and it guided us in terms of adding banners, beach balls flying overhead, lighters and sparklers, all of which helped sell the final shots.”
Yesterday images courtesy Union VFX. © 2019 Universal Pictures.
Rocketman images © 2019 Paramount Pictures.
Bohemian Rhapsody images courtesy DNEG. © 2018 Twentieth Century Fox Film Corporation. All rights reserved.
Bohemian Rhapsody is available on Blu-ray, DVD & Digital Download now.