This is a summary of Scott McCloud's "Understanding Comics".

Chapter 1: Setting the Record Straight

Comics is a medium, not a genre.

In one of the few previous books discussing comics as a medium, Will Eisner's Comics and Sequential Art, comics is defined, unsurprisingly, as Sequential Art. Here McCloud expands and formalizes that definition (in a rare panel that reduces neatly to pure text): "com.ics (kom'iks) n. plural in form, used with a singular verb. 1. Juxtaposed pictorial and other images in deliberate sequence, intended to convey information and/or to produce an aesthetic response in the viewer."

The definition is dense (in the book it is developed over several pages, with McCloud's cartoon avatar taking questions from an audience), but to quote another part of the chapter[1]:

The secret is not in what the definition says but what it doesn't say! ||

For example, our definition says nothing about superheroes or funny animals. Nothing about fantasy/science fiction or reader age. No genres are listed in our definition, no types of subject matter, no styles of prose or poetry. ||

Nothing is said about paper and ink. No printing process is mentioned. Printing itself isn't even specified![2] Nothing is said about technical pens or bristol board or Windsor & Newton Finest Sable Series 7 Number 2 Brushes![3] No materials are ruled out by our definition. No tools are prohibited. ||

There is no mention of black lines and flat colored ink. No calls for exaggerated anatomy or for representational art of any kind. No schools of art are banished by our definition, no philosophies, no movements, no ways of seeing are out of bounds![4]

Like text, comics can be used in uncountable ways; unlike text, its potential has until recently been tragically squandered.


  1. In quoting without pictures, a lot has to be left out (here I've denoted frame breaks with || ). One of the joys of this book, and one of many reasons it has to be in comics, is that the phenomena talked about in the text are often played out in the art (to oversimplify a complex and interdependent relationship). Some simple examples: later in the book, in the middle of the explanation of the effects of the non sequitur panel transition, the narrative is interrupted by a panel containing only a fork. In Chapter 2, after explaining Magritte's The Treachery of Images, the McCloud-avatar asks, "Do you hear what I'm saying? If you do, have your ears checked, because no one said a word." In an example closer to home (if somewhat more diffuse), the quote directly following the sentence to which this is footnoted describes comics freed from the constraints of adolescent power fantasies, and lo and behold, the very comics describing this are themselves free of that mold. McCloud points out, later, that one of the advantages of comics as a medium is the ease with which text and subtext can spar, as in Art Spiegelman's Maus, where the Jews are drawn as mice and the Nazis as cats.
  2. In the discussion that led up to this passage, McCloud pointed out that under his definition, artifacts like Trajan's Column would be comics, and traced the medium's development into the current century, highlighting notables like William Hogarth's A Harlot's Progress, Max Ernst's A Week of Kindness, and the works of Rodolphe Töpffer, Lynd Ward, and Frans Masereel, all of which are traditionally considered non-Comics.
  3. Indeed, McCloud now proudly uses a tablet and Photoshop, and publishes many of his comics on his website.
  4. Two footnotes in one:
    • On the other hand, McCloud points out that his definition excludes single-panel works like The Family Circus and The Far Side.
    • These issues are further explored in Reinventing Comics, which focuses on the possibilities the internet creates.

Chapter 2: The Vocabulary of Comics

Comics are built out of icons -- "For the purposes of this chapter, I'm using the word icon to mean any image used to represent a person, place, thing or idea." Letters and words are completely abstract icons, bearing no physical resemblance to their ideas; pictures have varying levels of abstraction, from the photograph of a face, full of color and shading, an ink copy of photons taken in through a shutter, to =D

That smiley, that :-) , that set of dots and lines that looks like a face to us only because our brains contain a lot of face-seeing hardware (after all, no one sees eyes in solitary colons, noses in solitary dashes), is a cartoon. Cartoons focus our attention -- through simplification, by eliminating superfluous features, they amplify the features that remain -- but, McCloud explains, that is not the entirety of their drawing power:

When two people interact, they usually look directly at one another, seeing their partner's features in vivid detail. || Each one also sustains a constant awareness of his or her own face, but this mind-picture is not nearly so vivid; just a sketchy arrangement...a sense of shape...a sense of general placement. || Something as simple and basic as a cartoon. || Thus, when you look at a photo or a realistic drawing of a face, you see it as the face of another. || But when you enter the world of the cartoon, you see yourself. ... The cartoon is a vacuum into which our identity and awareness are pulled, || an empty shell that we inhabit which enables us to travel into another realm. We don't just observe the cartoon, we become it!

(McCloud goes on to discuss how varying levels of iconism and realism are used in comics to achieve various effects -- for example, a sword might be drawn as a few quick lines in one panel, to symbolize its connectedness with the iconic character holding it (just as the car one drives becomes, to an extent, an extension of the self), and drawn as detailed and photorealistic in the next panel, as the character gazes down at its strange engravings, to symbolize its mystery, its otherness, to allow the reader to see it, rather than be it.)

The spectrum as described thus far:

real face --> photograph of a face --> realistic drawing --> cartoony drawing --> smiley

The photograph requires little cognition; it is received as pure visual input. The smiley requires an effort, if an unconscious one, a piecing together of the abstract lines. Continuing the spectrum to the right, McCloud argues, we arrive at the word FACE -- bold and distinct, with relatively few letters, reminiscent of prehistoric times, when words and pictures were one -- then at a more detailed description ("Two eyes, one nose, one mouth" is the example given), then at still higher levels of abstraction ("Thy youth's proud livery, as gazed on now...").

reality --> photo --> Batman --> Charlie Brown --> =D --> FACE --> Two eyes, one nose, one mouth --> Thy youth's proud livery, as gazed on now...

Our need for a unified language of comics sends us toward the center where words and pictures are like two sides of the same coin! || But our need for sophistication in comics seems to lead us outwards, where words and pictures are most separate.

In addition to iconic abstraction, there is abstraction in the more traditional sense, the abstraction of abstract art. Thus, the spectrum becomes a pyramid:

 The Picture Plane/Art Object
               / .\ \
              /  . \  \
             /      \   \
            /  .     \    \
           / .        \   .|
          /        .   \  / language
         /.             \/

The pyramid has limitations, of course (Where do The Treachery Of Images and Fountain fall? Does the horizontal axis along the verbal edge dictate how the words are displayed on the page or how they describe things?) but it is a cool little invention.

Chapter 3: Blood in the Gutter

 ^_^   -_^ 

As part of normal life, everyone learns to assume certain things (or so one assumes) -- that the world doesn't disappear when you're not looking, for example, that the house across the street has furniture and interior walls. "As infants, we're unable to commit that act of faith. If we can't see it, hear it, smell it, taste it or touch it, it isn't there! The game 'Peek-A-Boo' plays on this idea. Gradually, we learn that even though the sight of mommy comes and goes, mommy remains. || This phenomenon of observing the parts and perceiving the whole has a name. It's called closure."

You performed closure when you saw the lines at the top of this section as two anime smileys; more to the point of the chapter, you performed closure when you saw the two smileys as a single winking smiley. "See that space between the panels? That's what comics aficionados have named 'the gutter!' And despite its unceremonious title, the gutter plays host to much of the magic and mystery that are at the very heart of comics... If visual iconography is the vocabulary of comics, closure is its grammar."

Of course, other media make use of closure as well -- in movies, our minds effortlessly connect each frame to those preceding and following it -- but comics requires conscious (or semiconscious), high-level closure between every pair of panels. McCloud has categorized panel-to-panel transitions into six classes:

1. Moment-to-moment
The same subject is displayed in adjacent instants, like a movie running jerkily on a slow computer. Very little closure is required.
2. Action-to-action
The focus remains on a single subject, but this time, two separate, consecutive actions are displayed (for example, the first panel might contain a car speeding along, the second the car smashing into a tree).
3. Subject-to-subject
Both panels are within the same scene or idea, but each portrays a different subject. ("John: What more could go wrong? || Catherine: Well, at least Jerry never called! || Telephone: Rring ")
4. Scene-to-scene
Just what it sounds like: great leaps in time or space. ("Detective: He can't outrun us forever. || Image of darkened house with caption: Ten years later...") Lots o' closure -- deductive reasoning, even -- is often required to link the panels into a single narrative.
5. Aspect-to-aspect
"Bypasses time for the most part and sets a wandering eye on different aspects of a place, mood, or idea."
6. Non-sequitur
Panels with no logical relationship. (McCloud argues, though, that any panels placed side by side will inevitably generate the impression of some sort of relationship in the reader's mind. "--alchemy at work in the space between panels which can help us find meaning or resonance in even the most jarring of combinations.")

(The transition at the top of this section, iconic and decontextualized as it is, manages to fall into two separate categories, Moment-to-moment and Action-to-action.)

Looking at how often each panel transition is used in a particular comic can reveal some interesting things. Jack Kirby's pioneering style, as exemplified in a Fantastic Four comic from 1966, breaks down as follows: 65% action-to-action (type 2), 20% subject-to-subject (type 3), 15% scene-to-scene (type 4); the remaining transition types are unused. Here's the bar graph McCloud makes of the data:

   | | | | | | |
90%| | | | | | |
   | | | | | | |
   | |.| | | | |
   | |M| | | | |
50%| |M| | | | |
   | |M| | | | |
   | |M| | | | |
   | |M|M|.| | |
10%| |M|M|M| | |
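
A tally like this is easy to reproduce for any comic once each transition has been labeled by hand. What follows is only a minimal Python sketch, not McCloud's method: the function names, the label format (one integer from 1 to 6 per transition, numbered as in the list above), and the sample list are all assumptions made for illustration. The sample is constructed to match the Kirby percentages quoted above rather than taken from an actual page count.

```python
from collections import Counter

# The six transition classes, numbered as in the list above.
TRANSITION_NAMES = {
    1: "moment-to-moment",
    2: "action-to-action",
    3: "subject-to-subject",
    4: "scene-to-scene",
    5: "aspect-to-aspect",
    6: "non-sequitur",
}


def transition_breakdown(labels):
    """Return {type: percent} for types 1-6, given one label per transition."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {t: 0.0 for t in TRANSITION_NAMES}
    return {t: 100.0 * counts[t] / total for t in TRANSITION_NAMES}


def text_chart(breakdown, step=10):
    """Render the breakdown as horizontal bars, roughly one '#' per `step` percent."""
    lines = []
    for t, name in TRANSITION_NAMES.items():
        bar = "#" * round(breakdown[t] / step)
        lines.append(f"{t}. {name:<18} {breakdown[t]:5.1f}% {bar}")
    return "\n".join(lines)


# Hypothetical labels, built only to reproduce the 65/20/15 split quoted above.
sample_labels = [2] * 65 + [3] * 20 + [4] * 15
kirby = transition_breakdown(sample_labels)
print(text_chart(kirby))
```

The labeling itself is the hard part, and it is a judgment call (the winking smiley in this section already straddles two categories), so two readers may well produce somewhat different charts for the same comic.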

As it turns out, almost every American comic -- regardless of storytelling style, regardless of genre (with a few experimental exceptions like Art Spiegelman's early work) -- charts similarly, from issue 1 of X-Men to Heartbreak Soup to Betty and Veronica to Naughty Bits to Frank in the River to A Contract With God to Maus to Donald Duck. A similar survey of European comics (Squeak the Mouse, Asterix, Welcome to Alflolol, The Long Tomorrow, Manhattan, Clik!, The Black Island, The Clock Strikes) yields similar results.

But here's a popular mainstream Japanese comic from Osamu Tezuka:

   | | | | | | |
90%| | | | | | |
   | | | | | | |
   | | | | | | |
   | | | | | | |
50%| | | | | | |
   | |M| | | | |
   | |M|M| | | |
   | |M|M| |L| |
10%|L|M|M|M|M| |

In Japan, where comics developed mostly in isolation following World War II, and where they are often published in gigantic anthologies rather than tiny magazines (lessening the premium on space and thus the emphasis on concise, action-oriented transitions 2-4), the charts look quite different. More important still, Eastern culture has bequeathed an emphasis on holism and contemplation (aspect-to-aspect works well for setting a mood), an emphasis on the power of intervals -- of silence in a song, of negative space in a painting.

In comics this means a renewed emphasis on the power of closure, on the strange alchemy that occurs in the gutter. The effect is to spark the imagination, to engage every one of the five senses, rather than simply sight. (Sadly, replicating McCloud's demonstration of this in ASCII art is prohibitively difficult.)

The comics creator asks us to join in a silent dance of the seen and unseen. The visible and invisible. || This dance is unique to comics. No other artform gives so much to its audience while asking so much from them as well. || This is why I think it's a mistake to see comics as a mere hybrid of graphic arts and prose fiction. What happens between these panels is a kind of magic only comics can create.

Chapter 4: Time Frames

We've been trained to see a picture as a snapshot, as a single moment in time, but any student of visual physiology can tell you that the vast majority of information is taken in by a small area at the center of the field of vision; our eyes compensate for this porthole effect by darting continually around and our brains compensate by maintaining the illusion of detailed peripheral vision (for a demonstration, try reading this sentence while keeping your eyes fixed on one of its component words). Much modern art acknowledges this (Cubism, for example, incorporates many perspectives into one image), and so do many comics. McCloud's example is a long panel containing several chatting family members; reading from left to right, the conversation takes maybe 15 seconds to unfold, and the expression and pose of each person matches the moment his or her particular statement is made -- "one panel, containing several panels".

To some extent, then, space in comics translates into time; as your eyes cross the page they also pass through the seconds (or the hours, or the years). While a frame generally denotes a particular moment (in addition to serving other purposes beyond the scope of this entry), a moment is a slippery thing that in practice can be any length at all. In determining a more specific time period, the reader relies heavily on context, on a page's particular content, and, especially, on the sounds portrayed in the text, which we have not been conditioned to think of as having a duration of zero.

(Here there is a quick digression about direction. What would happen if instead of reading solely left-to-right, a comic was made into a labyrinth wherein at every intersection the reader could choose which direction the story takes? What would happen if a comic was circular, its end connecting to its beginning?)

And then there's motion. It can, of course, be portrayed through the frame transitions described in chapter 3, but shortly after the era of futurism had ended, shortly after the invention of the motion picture, comics invented the motion line, which lies "somewhere between the futurists' dynamic movement and Duchamp's diagrammatic concept of movement." Over a period of decades, motion lines evolved from "wild, messy, almost desperate attempts to chart the paths of moving objects through space" to something "more refined and stylized, even diagrammatic", then, eventually, "became so stylized as to almost have a life and physical presence all their own!"

I've been trying to figure out what makes comics tick for years and I'm still amazed at the strangeness of it all. || But no matter how bizarre the workings of time in comics is -- || -- the face it presents to the reader -- || -- is one of simple normalcy. || Or the illusion of it, anyway. || It all depends on your frame of mind.

Chapter 5: Living in Line

This chapter relies very heavily on visual examples, so it is difficult to summarize here.

Chapter 6: Show and Tell

In the beginning, pictures and words were two sides of the same coin. In school, at show-and-tell, you show, and you tell, interchangeably -- "This is my's got one of these things"; in picture books, simple words combine similarly with images. As we grow up, we learn to separate show and tell from each other -- we paint pictures without words, and read books without pictures.

A similar progression can be traced through history. In cave paintings, the people shown were iconic, symbol-esque, almost like letters, as were the flat, bright images of the early Greeks and Cretans and the line-drawings of ancient Egypt; early written language, likewise, was full of the descendants of those cave paintings, letters that were pictures. Words and images were side by side, at the lower-left vertex of McCloud's great pyramid.

Over the course of the next few thousand years, they diverged. Letters sacrificed visual representation for writing ease (and, later, printing ease) and pictures grew richer and more complex until looking at them was more like looking at reality than at thoughts.