What even is animation? — Just to do Something Bad

A message from the future! Future Raf from the distant year 2021 wants you to know that he no longer thinks of motion generation methods as being as fundamentally incompatible as he made them out to be here. In fact, some of that seeming incompatibility comes from the nature of keyframes! That’s just one of the ways that keyframes have screwed us. This post still has a lot of useful stuff about the nature of animation though!

So, this animation thing. What is it, exactly? I mean, like, on a philosophical level? Time for some navel gazing!

Animation is difficult to define, but most definitions I’ve seen rely on the concept of frame-by-frame creation or manipulation. Something recorded from the real world in real time is not animation--anything created frame-by-frame is.

I think this definition made sense before the advent of CG, but it makes very little sense now. Is a VFX shot of an explosion making use of fluid simulation, and containing no characters, animated? By this definition, it is. Is a shot that uses only motion capture data, with no intervention by an animator, animation? (I know that doesn’t happen much in real life, but bear with me.) By this definition, it is. But neither of those scenarios describe something much like the animation an animator does.

Conversely, is motion graphics animation? This definition includes that too, but in that case I think it’s clearly correct--a motion graphics artist is an animator, just not a character animator. There is something similar between the processes a character animator and a motion graphics animator use that is fundamentally different from a shot that relies purely on simulation or mocap. The conventional definition fails to “cut the problem at its joints,” and leads to a lot of misunderstanding about what animation is or isn’t good for, and how it can be used.

I think this all becomes a lot clearer if you abandon the “frame-by-frame” definition and look at animation as just one method of creating motion data. I propose that there are three main methods of authoring motion:

This is motion recorded from a performance in real time. Performance capture, puppeteering, and live action film are all methods of recording motion.

This is motion created algorithmically from a set of starting conditions. Unlike the other two, generated motion only exists in CG. This includes all simulation and most procedural animation.

This is motion defined directly by a human artist. This includes CG animation (character animation or otherwise), but also drawn animation and stop motion.

These three methods of generating motion are very different from each other in terms of how they relate to time. Recorded motion is, of course, authored in real time. Generated motion may or may not be real-time. It does, however, have an “arrow of time,” albeit one imposed by a simulation’s reliance on prior states rather than the second law of thermodynamics.

Animated motion alone allows independence from time’s arrow.* An animator builds up a shot from its “bones”—usually storytelling poses, but this applies even to the “layered” animation approach—in a way completely at odds with the basic process of either recorded or generated motion. This is both animation’s great strength (the artistic possibilities offered by this way of looking at motion) and it’s great weakness (it’s really goddamn time-consuming).

Most shots in a conventional CG production process will use some combination of these three methods. Keyframed animation will be passed to the FX department for cloth and hair sim. Motion captured motion will be adjusted and reworked by an animator. But because the processes and basic relationship to time used by each motion creation method are incompatible, using more then one will force your pipeline into an exceptionally rigid configuration.

Do you discover, after simulating a character’s clothing, that its silhouette no longer reads? You must exit the simulation process and return to the animation process—where you can no longer see the effects of your simulation. Do you discover, while reworking a motion captured performance, that it must be significantly different to fulfill the needs of the project (or the whims of the client)? Your choices are either to turn it into animated motion, or to return to the motion capture stage and throw out any work you’ve done up to that point, since a process that is not time-independent cannot be easily used to modify existing motion.

Recorded and generated motion might conceivably be made compatible in terms of process if the generated motion was calculated in real-time as the motion was recorded, but neither can be made compatible with animated motion by the very nature of the processes involved. You can’t run a simulation backwards.** The real world, meanwhile, is so famously strict about its arrow of time that reversing its direction requires violating fundamental physical laws, usually the purview of a Doctor of some sort (notable Doctors with experience in this sort of thing include Dr Emmet Brown and “just The Doctor, thanks,” although I understand that they have beef).

Interestingly, this isn’t true of many other parts of the CG production process, even though they are not used to create motion. It’s entirely possible, for instance, to animate and light a shot concurrently, updating the data in the lighting file as new animation revisions become available. The only reason we do not generally do this in the other direction, pushing lighting information to animation scenes, is just that since most lighting and shading is not intended for real-time use it wouldn’t be much use to an animator. That’s a technological limitation, not an inherent consequence of incompatible processes, and it’s one that isn’t even that hard to bridge: many studios have pipelines that use final rendered assets and their actual renderer for playblasts. Of course, the very best case scenario would be real-time rendering in-viewport.

Similarly, modeling and rigging processes do not produce the same kind of hard incompatibility as the various processes associated with motion authoring. Certainly, most riggers would prefer to have a model locked before rigging begins, but this is more of a bulwark against careless modelers and indecisive directors then an inherent incompatibility of processes—there is no reason one could not rig a base mesh while a modeler continues to work on surface detail, assuming one trusted the modeler not to make proportional changes that would cause major rig revisions (which is a very big assumption). Since I often act as modeler, rigger, and animator, I will often make modeling changes in situ.

Pipeline implications aside, the different methods of motion authoring are also fundamentally good for different things. This may seem obvious--no one tries to motion capture a dragon, simulate a main character’s performance, or animate realistic clothing behavior--but I don’t think that the differences are always fully appreciated. Specifically, there is a reason why animation lends itself so readily to comedy and action genres, and has such difficulty with subtlety.

Human perception and understanding of the actual behavior of the world around us is awful. Half the information we think we have about what happens around us is just bullshit our brains make up. This is terrible for pretty much everything we do, except art. It’s great for art, because it’s possible to appeal to those skewed expectations to create artistic effects that cannot be perceived in the real world, because they don’t actually happen.

For animation, this means appealing to human cluelessness about physics. I’m not talking about the classic “cartoon physics” cliches--walk out over a cliff and don’t fall till you look down etc--but something much more elemental about how movement is portrayed. For instance, “hang time” at the top of a character’s leap looks great to the human eye, even though the way it’s usually portrayed in animation is flat-out physically impossible. Animation can produce aesthetic effects that cannot be recorded and would be exceedingly difficult to generate, precisely because the direct, time-independent control of every aspect of movement by a human artist allows for the creation of movement that is wrong but feels right.

Conversely, aesthetic effects that rely on a great deal of fidelity to real life are precisely what animation struggles with. I want to animate clothing because I intend to animate it in a highly stylized manner--hand animating realistic clothing would be completely insane. At the far end of the spectrum you get something like a photorealistic face. Ironically, that’s pretty much the one thing we are good at perceiving, and animating one successfully is so incredibly difficult that I don’t think anyone has ever actually succeeded in doing so, even once, to this very day.

It will not surprise readers of this blog that all my interest is in animated motion, and that I have little use for the other two. Their incompatibilities with the process of animation make them a bad choice for the kind of fast production I’m interested in. However, there’s some question about whether these three categories fully encompass what’s possible. Not all procedural animation techniques necessarily have an “arrow of time,” and there is some possibility of developing some sort of “assisted animation” process where time-independent procedural techniques are used by an animator while animating. Better automatic inbetweening through an ML-assisted breakdown tool, for instance, is something me and Tagore Smith have discussed a bit in the past, and there may be some real potential there to speed up the animation process. But the potential for harmony between algorithmic and animated processes remains largely untapped. For the moment, I intend to deal with the problem by telling all procedural and simulated motion generation methods to keep their damn dirty hands off my characters.

* Stop motion animation seems like a good counter argument to my definition here--doesn’t it always have to proceed forward frame by frame, and doesn’t that give it an inherent time arrow just like generated and recorded motion? My answer would be that it still falls into the category of animated motion since arcs, poses, and performance details can all be decided on ahead of time with precision (even if they often aren’t)--indeed, I understand it’s quite common at studios like Laika for animators to “pose block” a stop motion shot for approval, and then use that as a skeleton to build the final shot on. It’s is a bit of a grey area, though.

** Some may take issue with my contention that simulation can’t be defined in a time-independent way, since simulations can have goals. While this does allow you to define what some aspects of the simulation will look like at a particular point in time, I don’t think it’s the same thing as the time-independence of the animation process, since you still can’t actually know how your simulation will reach that goal until you run it.