When real-time digital puppeteering and stop-motion combined
One of the most memorable effects in Irvin Kershner’s RoboCop 2 is the distinctive polygonal CG face of Cain, the film’s villain turned berserker robot. It was a face appearing on a screen reaching out of Cain’s metallic body, and it came to life in the film via a combination of a stop-motion animated version of the character led by Tippett Studio and ground-breaking digital puppetry by deGraf/Wahrman.
The puppeteer for Cain’s face was Trey Stokes, who also worked on several of deGraf/Wahrman’s well-known puppeteered characters such as Mike The Talking Head. RoboCop 2 will soon reach its 30th anniversary; to celebrate, Stokes looks back at the ingenious (and in some ways mind-boggling) way Cain was puppeteered – in real-time – for the film in an era that came just before the explosion of CGI in films, and at the seemingly laborious but clever way the resulting animation was transferred to the stop-motion puppet.
b&a: You were freelancing for deGraf/Wahrman at the time – what were some of the other projects you’d been working on with them before RoboCop 2?
Trey Stokes: The first was Mike The Talking Head at SIGGRAPH ’88. I’d just done The Blob, and our VFX Supervisor was Mike Fink. He knew d/W wanted a puppeteer to demonstrate this new software they had developed, called “Perform”, and he suggested I meet with them. So I did, and a week later we were at SIGGRAPH in Atlanta with Mike Gribble, the host of the film show.
We opened the show with a semi-improvised comedy bit, with human Mike on stage and a CG version of Mike on the theater screen. Afterward, we found out many people just assumed CG Mike was pre-rendered and human Mike was only pretending to interact with him. No matter how many times we emphasized that this was a live performance, it just wasn’t in people’s heads that a computer character could be done in real-time.
We did a similar performance to open TED 2, the second TED conference – the Head became a Harry Marks lookalike and we did a live bit with the real Harry. Then Mike became an alien woman and did video bumpers for a Spanish festival called ArtFutura. And he was the on-screen narrator for a gonzo Australian movie called Sons of Steel. There’s a glimpse of him in the trailer here – but I have no idea how much is in the movie, I’ve never seen it. I don’t know if it’s ever been available outside Australia.
The project most people might recognize became the cover art for a video called ‘Beyond The Mind’s Eye’, made of clips of CG animation cut together to a synth score.
The Head acts as host of the video, but the footage is actually from a theme park ride called ‘Journey to the Fourth Dimension’ where the Head was “Max”, the computer navigator of a time machine. In Mind’s Eye they dubbed some English dialog onto Max, but the actual project was in Japanese.
For all these projects the Head was performed live – even if the end result was pre-rendered, we had no ability to edit the performance, at least not in the early days. So I had to perform every clip of animation over and over until I got the lip-sync and overall performance right all the way through, and some of those takes were very long. To this day I can recite lines like “The human mind is insufficient to navigate such a complex journey, so I will pilot you into the Fourth Dimension” in rapid-fire Japanese that I learned entirely phonetically for Max. I’m still hoping for a chance to use that line in real life.
b&a: What do you remember was the brief for Cain from production, and what had been the discussion about how it would be incorporated into Tippett’s stop motion puppet?
Trey Stokes: I just remember they wanted something new and cool. d/W had been trying to convince Hollywood to use the tools they’d developed and this was the first big studio project that agreed to try it. Phil and Craig Hayes came down from Berkeley, Irvin Kershner also came in and we showed them what the character could do. Phil and Craig both played with the system and were fascinated by it, so I’m pretty sure I witnessed the birth of the idea that became the Jurassic Park DID years later. Pretty soon after that we got storyboards and a deadline and we were off and running. We had exactly two weeks from ‘go’ to final film delivery.
b&a: In terms of the live puppeteering, can you break down how that worked at d/W? What was the Waldo set-up, and what was your technique and approach to operating it? How had the CG face model been built?
Trey Stokes: The Talking Head team for pretty much every project consisted of Greg Ercolano, J. Walt Adamczyk, Ken Cope, and me. Ken mostly modeled, Greg and J. Walt mostly wrote code. And Sally Syberg and Anne Adams were the producers for the RoboCop project. All I did was operate. I got a crash course in how CG worked, but I couldn’t do any of it. I learned to call an exclamation point a ‘bang’, but that was pretty much it.
The quick video below is someone using the waldo, but I don’t know who that is!
You can see the basic setup – the 3-axis roll-cage thingy was usually patched into the three axes of the head rotation. In the middle of the roll cage was a toggle for the mouth. So the right hand was a lot like operating a hand puppet and fairly intuitive to use. The other controller was a 2-axis joystick that had multiple functions. For lip sync, each of the four extremes of the joystick was a mouth shape – move it one way and the face made an “O”, another way was “EE” and then “AH” and “EH”.
You morphed between mouth shapes with the left hand while working the jaw with the right. Close the mouth and that was “MM”, open it while morphing the lips toward EE and you’ve got “ME” and so on. It was similar to work I’d done with rubber creatures and radio controls, except this creature only existed on a screen. The joystick could also be re-assigned to the eyes or eyebrows or to move the CG lights around. All of this hardware was custom-built and hooked into a Silicon Graphics computer, usually an IRIS as I recall.
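For readers who think in code, the joystick lip-sync Stokes describes maps naturally onto what is now called blend-shape (morph target) animation. The following is a minimal sketch under stated assumptions, not a reconstruction of Perform: the function names, the axis-to-shape assignment, and the linear blending are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]
Mesh = List[Vertex]


def clamp(value: float, low: float = -1.0, high: float = 1.0) -> float:
    """Keep a stick reading inside its physical range."""
    return max(low, min(high, value))


def joystick_weights(x: float, y: float) -> Dict[str, float]:
    """Map a 2-axis stick position to weights for four mouth shapes.

    Assumption: which direction produces which shape is arbitrary here;
    the interview only says each extreme was one shape.
    """
    x, y = clamp(x), clamp(y)
    return {
        "EE": max(0.0, x),
        "O": max(0.0, -x),
        "AH": max(0.0, y),
        "EH": max(0.0, -y),
    }


def blend(neutral: Mesh, targets: Dict[str, Mesh], weights: Dict[str, float]) -> Mesh:
    """Offset each neutral vertex toward every target by that target's weight."""
    result = []
    for i, base in enumerate(neutral):
        moved = list(base)
        for name, weight in weights.items():
            target = targets[name][i]
            for axis in range(3):
                moved[axis] += weight * (target[axis] - base[axis])
        result.append((moved[0], moved[1], moved[2]))
    return result
```

In a setup like this, the jaw toggle under the right hand would simply be one more weighted target, which is how closing the jaw while pushing toward "EE" could produce "ME".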
The Head always had to have the fewest possible polygons so it could run in real-time. So it wasn’t really ever a head, just a face – you could only rotate him so far before you saw the edges. Like a Halloween mask, with the eyeballs and teeth just hanging in space behind it. Ken would tweak the head model for every project, so for RoboCop 2 he made it resemble Tom Noonan, the human Cain, based on a laser scan of the actor.
An example of how quickly we had to get this done – for Cain’s death scene, we needed some sort of dramatic visual effect to show him coming apart. But we didn’t have any good tools for that – just the head and the very basic Perform software. And there wasn’t time to create proper custom head models to show his disintegration.
So Greg Ercolano put the Cain model file up on a screen – but I don’t mean the image of Cain and modeling tools and so on. I mean the raw data, literally just a text file of numbers. Greg pointed to the screen and said, ‘Change some of those numbers.’ So I did – I randomly changed some numbers here and there.
We saved several versions of that and then loaded those files into Cain. Neutral Cain was the same, but move the joystick and his forehead would tear open or he would explode into a demented starfish. Now we had a whole collection of broken Cain meshes we could animate and blend to create his death throes. So I guess I can say I did some computer modeling for RoboCop 2, but it’s not something I would put on my resume.
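The "change some of those numbers" trick can be imitated in a few lines. This is only a sketch, assuming a mesh stored as whitespace-separated numbers in plain text; the real d/W file format is not described in the interview, and `corrupt_mesh_text` and its parameters are hypothetical names. A seeded random generator stands in for Stokes editing values by hand.

```python
import random


def corrupt_mesh_text(text: str, fraction: float = 0.1,
                      scale: float = 2.0, seed: int = 0) -> str:
    """Return a copy of the mesh text with some numbers randomly nudged.

    Assumption: every numeric token is fair game. A real mesh file may
    also hold indices, which this sketch would damage as well.
    """
    rng = random.Random(seed)
    out_lines = []
    for line in text.splitlines():
        new_tokens = []
        for token in line.split():
            try:
                value = float(token)
            except ValueError:
                # Leave labels or keywords untouched.
                new_tokens.append(token)
                continue
            if rng.random() < fraction:
                value += rng.uniform(-scale, scale)
            new_tokens.append(f"{value:g}")
        out_lines.append(" ".join(new_tokens))
    return "\n".join(out_lines)


# Several "broken Cain" variants from one neutral mesh, one per seed.
neutral_text = "0 0 0\n1.5 2 3\n-1 0.25 4"
broken_variants = [corrupt_mesh_text(neutral_text, fraction=0.5, seed=s)
                   for s in range(3)]
```

Because each variant keeps the same number of vertices as the neutral mesh, it could be loaded as one more shape to blend toward, which matches the description of the neutral face staying intact until the joystick moves.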
b&a: What did you find, during puppeteering, worked and didn’t work well in terms of gross and more minute movements?
Trey Stokes: The Head couldn’t move very quickly – or rather it could, but that could overwhelm the graphics engine and he’d start to drop frames. So for most projects, I’d move the head slowly and deliberately, which tended to give him a sort of sinister vibe. On the other hand he could lip sync as fast as I could go – fewer polygons to move.
But for RoboCop 2 he didn’t have to talk, so that simplified things. And Cain was supposed to be glitchy – he’s a prototype, like ED-209. That freed us up a lot – the face could turn further because now it was cool to see that he was just a skin with nothing behind it. And I could move him as fast as I liked – if he dropped frames, that was fine. The controls were also adjusted so I could drive the facial expressions beyond what a human could possibly do. We wanted to show that this really was a computer image and not a Max Headroom-style faux-CG effect, which is what people would have assumed in those days.
b&a: What was the output of the puppeteering, in terms of digital files and formats or film, and how did you deliver that to Tippett? What was then their approach in projecting that onto the puppet screen?
Trey Stokes: All our animation had to be done first so that Phil’s team could do the stop-mo shots with the face on the robot’s screen. But with such a tight deadline, we couldn’t do tests and get Phil’s feedback, or Irvin’s either. We had to get it right on the first try. And the storyboards were very specific, down to the exact length of each shot. So it was a little daunting, knowing we had one chance to nail all these bits of animation, in real-time.
But then – it hit us that we shouldn’t approach this like animation, but as performance. That was the strength of a real-time system. A five second animation took five seconds to create. But ten seconds only took…five more seconds! So instead of doing shots with exactly the number of frames requested, we did takes like an actor would. We let the recording run while I tried variations of each shot, and even improvised some stuff. That way, Phil could choose a section of performance that worked best, with lots of extra head and tail frames so he could slip the timings however he needed.
And that was revolutionary for the time, when every frame of animation was so expensive. It wasn’t normal to tell a client, ‘You asked for ninety seconds of animation but we’re sending you four minutes. That okay?’
The only piece of animation that wasn’t done with Perform was when Cain pushed through a ‘wall’ of computer text. That was keyframed, along with some very early computer compositing. I think it was Adrian Iler, another of the d/W team, who did that shot.
Now, as to how the shots actually got delivered…this might sound a little insane, but here we go:
1. I recorded the takes, we played them back and made our selects. Then we had to output those to film, so…
2. In a side room was a 35mm movie camera aimed at a hi-res monitor, with a black cloth over the whole thing. Each frame would render at full resolution on the monitor – it took about a second per frame. The film camera would snap that frame, then the next frame would render, and so on. That was all automated so all we had to do was wait. And not walk around too much, because that could vibrate the rig while it was shooting.
3. We sent the film to the lab, got it back, screened it once to make sure we hadn’t screwed anything up, then FedEx’d it to Phil in Berkeley.
4. Phil had the film transferred to laserdiscs, because a laserdisc could freeze one video frame indefinitely, then advance exactly one frame and freeze again.
5. The laserdisc signal was piped into a Sony Watchman that was built into the Cain stop motion model.
6. Phil’s team animated each Cain shot, moving the stop-motion model one frame and advancing the Cain video one frame as well, then snapping a frame of the whole scene with their film camera.
So, the highest resolution computer image that could be created at the time was shot on film, transferred to laserdisc, played back on the tiniest TV that existed at the time and then…shot on film again.
As I recall there was also a larger Cain robot head, with a full-sized TV built into it, for the closer shots. I wasn’t there of course, but I think that was the case. So, not as crazy as a Watchman. Still pretty crazy.
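To get a feel for the waiting involved in step 2, here is a rough back-of-the-envelope calculation. It assumes the standard 24 frames per second for 35mm film and takes the "about a second per frame" render time literally; it ignores any extra time for the camera to expose and advance. The four-minute figure is borrowed from Stokes's own example of what a delivery might look like, not a confirmed total.

```python
FILM_FPS = 24  # standard 35mm sound speed, assumed for this estimate


def film_out_estimate(animation_seconds: float,
                      render_seconds_per_frame: float = 1.0,
                      fps: int = FILM_FPS):
    """Return (frame count, total seconds) to shoot an animation off a monitor."""
    frames = round(animation_seconds * fps)
    return frames, frames * render_seconds_per_frame


frames, seconds = film_out_estimate(4 * 60)
print(frames, seconds / 60)  # 5760 96.0
```

So under these assumptions, four minutes of performance would mean roughly an hour and a half of tiptoeing around the camera rig.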
b&a: Can you talk about your memories of seeing final Cain shots for the first time?
Trey Stokes: I didn’t see the end result until the movie came out. And it turned out they had used a lot of what we had done, much more than the original brief. Some of those little improv moments even made the cut.
For example – when RoboCop pulls Cain’s brain out, the face zooms forward until it’s just one eye staring out of the screen. I’d done an alternate take where Cain’s eyeball popped completely out of his head, like a Tex Avery gag. I assumed it’d be too over the top but since we could do it, why not include it? That’s the version they used, and I’m very proud.
Some of my scribbled performance notes (including ‘try with eye pop’), below:
You can see another bit of improv in Cain’s final meltdown. Folks who know computer graphics will recognize that the face is just switching between shaded and wireframe and normals. We could toggle those view modes with the button box but the system couldn’t record that or play it back. So although the animation was pre-recorded, the mode-switching I did live, as the film camera was running.
As I said, Cain took about a second per frame to render – so every second I would toggle him to a different mode. I think on a couple of frames I accidentally switched him off completely and he disappeared.
So it was a bit of a Hail Mary play – we figured it would look pretty good, but we didn’t actually know until we got the film back, and by then it was too late either way. So Cain’s final glitching meltdown was done live onto film, in one pass, once – and they used just about every frame of it in the movie. Still kinda boggles my mind.
b&a: How far further did d/W take this kind of live puppeteering/CG approach at the time?
Trey Stokes: After RoboCop 2, I was out of town a lot, doing other gigs. I think there were a few more Talking Head projects, but I wasn’t part of them. And d/W was doing other CG projects all along, the real-time stuff wasn’t their only thing. I wasn’t around when they decided to close up shop and so I don’t know exactly why that happened. I do think it’s unfortunate that they closed just as CG was becoming accepted in moviemaking, since they’d worked so hard to make that happen. But everyone at d/W went on to do groundbreaking work at companies like ILM and Disney.
And because I’d met Phil and Craig on RoboCop 2, later they brought me in on Starship Troopers and we used real-time animation in that. J. Walt hired me to do a theme park project based on Yellow Submarine that included a real-time CG Captain Fred who talked to the guests. The technology that was pioneered at d/W continued in all sorts of ways at other companies, so the legacy of Mike the Talking Head is woven pretty deeply into the history of CG animation.
By the way, other folks might not define what we were doing as ‘motion capture’, but I sure do, and that makes Cain the first motion-captured CG character to appear in a Hollywood movie. And the honor of first motion-captured CG character in ANY movie goes to Sons of Steel!
What’s ironic about that is – the stop-motion that Tippett Studio did in RoboCop 2 is astounding. In my opinion, Cain is the pinnacle of stop motion as a realistic VFX technique.
It had never been done better, and now probably never will be. Because built right into Cain, in its earliest, crudest form, was the technology that would effectively end stop-motion as a VFX tool just a couple of years later. Nobody knew it at the time, but RoboCop 2 was a turning point in VFX, with its past and its future combined into a single character.