MeGUI/x264 Settings/x264 Stats Output

This page attempts to document what the stats mean and, more importantly, what you can learn from them: which encode settings might be useful, and what your encode settings actually did.

Fiddling with settings is not usually needed, beyond picking a preset (and sometimes one or more --tune settings). If you do change a setting, you still need to look at the output to really check that it looks better at the same bitrate than it did with the other setting.

Stats output can be useful to give ideas about what you might want to fiddle with, not as a measure of whether it worked. --tune ssim --ssim (or psnr) can be useful, but can't help with psy settings (psychovisual, i.e. looks-better-to-humans, but worse quality metrics).

If you made an encode, and want to try different settings, your best bet is to do a 2pass encode, targeting the same bitrate as your first encode, but with modified settings. Use the same source, of course. Then you can visually or computationally (SSIM & PSNR) compare your two same-bitrate encodes. Remember that CRF's behaviour is affected by many settings, so you can't say that settings that produce a smaller file size with the same CRF were better. You have to compare the quality, too. (x264 can accurately hit a bitrate target, not SSIM or PSNR targets, which is why 2pass is recommended for comparing the rate-distortion performance of different settings.)
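The same-bitrate comparison described above can be scripted. The following is a minimal sketch, not a definitive recipe: it only builds the two x264 command lines for a 2pass encode at a given bitrate. The file names, the stats file name and the helper name two_pass_commands are made up for illustration; --pass, --bitrate, --stats and -o are ordinary x264 options.

```python
import os


def two_pass_commands(source, output, bitrate_kbps, extra_args=(),
                      stats_file="x264_2pass.log"):
    """Build argument lists for a first and second x264 pass.

    All file names here are hypothetical. extra_args holds the settings
    you want to test (for example ["--ref", "4"]).
    """
    common = ["x264", "--bitrate", str(bitrate_kbps),
              "--stats", stats_file, *extra_args]
    # The first pass only gathers statistics, so its video output is discarded.
    first = common + ["--pass", "1", "-o", os.devnull, source]
    # The second pass uses the statistics to hit the target bitrate.
    second = common + ["--pass", "2", "-o", output, source]
    return first, second


# Example: re-encode at 1500 kbit/s with a modified ref setting.
for command in two_pass_commands("in.avs", "test_ref4.264", 1500, ["--ref", "4"]):
    print(" ".join(command))
```

Running both variants of an encode through the same helper, with only extra_args changed, keeps the bitrate target identical so that any quality difference comes from the settings under test.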

Typical x264 output:

avis [info]: 1280x720 @ 1.77 fps (40997 frames)

When using Avisynth, this line shows basic information about the input.

x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 PHADD SSE4 Cache64

Note that this does not necessarily match the capabilities your CPU has. On some chips, x264 won't use a certain instruction set because it is actually slower there.

x264 [info]: profile High, level 4.0

Information on the H.264 profile and level for the stream. The profile is implied by the options used (e.g. B-frames imply at least Main profile, 8x8dct implies at least High profile) and is given here for information only. The level is a number written into the bitstream. You can either explicitly set the level you want with --level, or not do so and let x264 guess (reasonably accurately).

x264 [info]: frame I:879   Avg QP:21.39  size: 59921
x264 [info]: frame P:24856 Avg QP:25.44  size: 12473
x264 [info]: frame B:61727 Avg QP:28.98  size:  3759

For each of the three frame types, this shows the total number of frames, the average quantizer over every macroblock in frames of that type, and the average size of a frame of that type. The QPs should be related by --ipratio and --pbratio, but mbtree, psy/aq, and ratecontrol modes other than constant-QP will all move the averages away from those exact ratios.
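As a rough illustration of what can be read off these three lines, the sketch below parses them and estimates the total video size, the share of bytes spent on each frame type, and the average bitrate. It assumes that "size" is the average frame size in bytes, and the frame rate passed in is a made-up value, since the frame lines themselves do not contain one. Because the printed averages are rounded, the totals are approximate.

```python
import re

# Matches lines such as: "x264 [info]: frame P:24856 Avg QP:25.44  size: 12473"
FRAME_LINE = re.compile(
    r"frame (?P<type>[IPB]):\s*(?P<count>\d+)\s+"
    r"Avg QP:\s*(?P<qp>[\d.]+)\s+size:\s*(?P<size>\d+)"
)

SAMPLE = """\
x264 [info]: frame I:879   Avg QP:21.39  size: 59921
x264 [info]: frame P:24856 Avg QP:25.44  size: 12473
x264 [info]: frame B:61727 Avg QP:28.98  size:  3759
"""


def parse_frame_stats(log_text):
    """Return {frame type: {"count", "avg_qp", "avg_size"}} from x264 output."""
    stats = {}
    for m in FRAME_LINE.finditer(log_text):
        stats[m["type"]] = {
            "count": int(m["count"]),
            "avg_qp": float(m["qp"]),
            "avg_size": int(m["size"]),  # assumed to be bytes per frame
        }
    return stats


def summarize(stats, fps):
    """Estimate total size, per-type byte share (%) and bitrate in kbit/s."""
    total_frames = sum(s["count"] for s in stats.values())
    total_bytes = sum(s["count"] * s["avg_size"] for s in stats.values())
    byte_share = {
        t: 100.0 * s["count"] * s["avg_size"] / total_bytes
        for t, s in stats.items()
    }
    duration_seconds = total_frames / fps
    kbps = total_bytes * 8 / duration_seconds / 1000
    return {
        "total_frames": total_frames,
        "total_bytes": total_bytes,
        "byte_share": byte_share,
        "kbps": kbps,
    }


# fps=25.0 is a hypothetical frame rate chosen only for this example.
summary = summarize(parse_frame_stats(SAMPLE), fps=25.0)
print(summary["total_frames"], summary["total_bytes"])
for frame_type, share in summary["byte_share"].items():
    print(f"{frame_type}: {share:.1f}% of bytes")
```

The estimated bitrate is one way to pick the target for a 2pass re-encode with different settings.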

x264 [info]: consecutive B-frames:  4.8%  6.0% 12.1% 51.5% 15.9%  8.0%  1.8%

Percentage of frames within a sequence of this many B-frames. P = 1, PB = 2, PBB = 3, etc. If the numbers trail off to near-zero, you likely won't gain anything by raising the consecutive-B-frame limit. (The converse isn't always true: there aren't necessarily significant gains from --bframes > 3, even if x264 chooses to use 3 consecutive B-frames most of the time.)
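A small sketch of how this line could be read programmatically. It assumes, following the description above, that the first entry is the share for zero B-frames (a lone P) and that the number of entries is therefore the B-frame limit plus one. The function names are hypothetical.

```python
def parse_consecutive_bframes(line):
    """Return the list of percentages from a 'consecutive B-frames' line."""
    values = line.split("consecutive B-frames:")[1]
    return [float(token.rstrip("%")) for token in values.split()]


def describe_bframe_usage(shares):
    """Summarise the distribution; index i means i consecutive B-frames."""
    most_common = max(range(len(shares)), key=lambda i: shares[i])
    return {
        "bframe_limit": len(shares) - 1,   # assumed: entries = limit + 1
        "most_common_run": most_common,
        "share_at_limit": shares[-1],      # near zero: raising the limit is unlikely to help
    }


line = ("x264 [info]: consecutive B-frames:  4.8%  6.0% 12.1% 51.5% "
        "15.9%  8.0%  1.8%")
print(describe_bframe_usage(parse_consecutive_bframes(line)))
```

For the sample line, most frames sit in runs of 3 B-frames and only 1.8% reach the limit, which matches the "trails off to near-zero" case.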

x264 [info]: mb I  I16..4: 17.9% 68.9% 13.2%
x264 [info]: mb P  I16..4:  9.2% 12.1%  0.6%  P16..4: 40.4%  5.6%  6.7%  0.1%  0.0%    skip:25.2%
x264 [info]: mb B  I16..4:  0.9%  1.1%  0.1%  B16..8: 36.7%  2.7%  0.5%  direct: 2.1%  skip:55.9%  L0:44.6% L1:52.5% BI: 3.0%

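The numbers in each row sum to 100% (give or take rounding), accounting for all macroblocks in each frame type. The L0/L1/BI list at the end of the B row is a separate breakdown that sums to 100% on its own.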
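The sums can be checked with a few lines of code. This is only a sketch: it treats everything before "L0:" as the macroblock-type breakdown and everything from "L0:" onwards as a separate prediction-direction list, which is an assumption about the line layout based on the sample output.

```python
import re

PERCENT = re.compile(r"(\d+(?:\.\d+)?)%")


def macroblock_sum(line):
    """Sum the macroblock-type percentages, ignoring the L0/L1/BI list."""
    body = line.split("L0:")[0]
    return round(sum(float(p) for p in PERCENT.findall(body)), 1)


def prediction_sum(line):
    """Sum the L0/L1/BI percentages of a B row, or None if there are none."""
    if "L0:" not in line:
        return None
    tail = line[line.index("L0:"):]
    return round(sum(float(p) for p in PERCENT.findall(tail)), 1)


ROWS = [
    "x264 [info]: mb I  I16..4: 17.9% 68.9% 13.2%",
    "x264 [info]: mb P  I16..4:  9.2% 12.1%  0.6%  P16..4: 40.4%  5.6%  6.7%"
    "  0.1%  0.0%    skip:25.2%",
    "x264 [info]: mb B  I16..4:  0.9%  1.1%  0.1%  B16..8: 36.7%  2.7%  0.5%"
    "  direct: 2.1%  skip:55.9%  L0:44.6% L1:52.5% BI: 3.0%",
]

for row in ROWS:
    print(macroblock_sum(row), prediction_sum(row))
```

Small deviations from 100 are expected, because each printed value is rounded to one decimal place.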

For each of the three frame types (I,P,B), show what partitions were used. I frames can only use I macroblocks, while P and B frames can use I or their native macroblock type.

The I16..4 and P16..4/B16..8 groups show what percentage of macroblocks in that frame type were coded as I or as the native type (P in P frames, B in B frames), broken down by partition size. The three I numbers represent i16x16, i8x8 and i4x4. The five P numbers represent 16x16, 16x8/8x16, 8x8, 8x4/4x8 and 4x4. B gets only the first three of those, because B can't use 8x4/4x8 or 4x4.

Skip shows the percentage of macroblocks coded with the skip vector and no residual, while direct shows those coded with the skip vector plus a residual. What is the skip vector? It is a motion vector that is not stored in the stream at all: the decoder derives it from already-decoded data (see the "direct mvs" line below for the two derivation methods). So a lot of skip and direct blocks means that this predicted vector, with or without a residual, was a better RD tradeoff than spending bits on an explicitly coded vector or on some other way of coding the block.

For B partitions, another list is tacked on to the end of the line. B partitions can predict from previous frames (L0 reference list), future frames (L1 reference list), or from a blend of a past and a future frame (BI(directional)).

x264 [info]: 8x8 transform intra:56.8% inter:76.6%

How often 8x8dct was actually used.

x264 [info]: direct mvs  spatial:99.9% temporal:0.1%

Percentage of frames using each method of direct/skip motion vector calculation. See above note about the skip vector. 2pass mode is needed for --direct auto to use more than a tiny amount of temporal MVs on typical content, but crf is still widely recommended when you don't have a specific bitrate target.

x264 [info]: coded y,uvDC,uvAC intra: 32.3% 49.4% 12.6% inter: 5.9% 10.5% 0.5%

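For each kind of block, what fraction was coded (i.e. has non-zero DCT coefficients). y is luma; uvDC and uvAC are the DC and AC coefficients of chroma. The first three numbers are for intra blocks, the last three for inter blocks.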

x264 [info]: i16 v,h,dc,p: 32% 19% 10% 39%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 10% 14%  8% 11% 12% 11% 10%  9%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 18% 15%  7%  7% 11% 11% 11%  9% 11%
x264 [info]: i8c dc,h,v,p: 39% 26% 21% 14%

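How often each prediction mode was used for each intra partition type (i8c is chroma). The modes are v = vertical, h = horizontal, dc = DC, p = plane, ddl = diagonal down-left, ddr = diagonal down-right, vr = vertical-right, hd = horizontal-down, vl = vertical-left, hu = horizontal-up. (FIXME: find a link for more detail about H.264 prediction modes, and/or any suggestion about how this info is useful for seeing what your settings did.)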

x264 [info]: Weighted P-Frames: Y:2.7% UV:2.1%
x264 [info]: ref P L0: 56.2% 10.9% 15.7%  5.2%  3.9%  2.8%  2.4%  1.3%  1.3%  0.3%  0.0%
x264 [info]: ref B L0: 81.1%  9.2%  4.6%  1.9%  1.3%  1.0%  0.6%  0.2%
x264 [info]: ref B L1: 94.6%  5.4%

Which reference picture was actually used by P and B partitions. L0 is the list of past reference pictures. L1 is the future. This example output came from a preset=veryslow encode, with ref=9, but the P reference list has 11 entries. The last two are the virtual duplicates produced by the way x264 implements weightp. The decoder of the produced stream won't see them; they don't take DPB (decoded picture buffer) space.

As with consecutive B-frames, if the list trails off to near zero, you could have used a lower ref count with little impact. Note that this list doesn't tell you how much better a match was found: even if the list doesn't trail off to zero, it might not hurt compression much at all to use fewer refs if the refs from farther back were only slightly better. It also doesn't tell you if there were really good matches farther back than it was checking. ((speculation) Only very synthetic input with a repeating pattern has much chance of seeing a big jump in compressibility when going e.g. from 8 to 12 refs, with a pattern that repeats every 10 frames.)
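To see how quickly a ref list trails off, one can accumulate the percentages and find how many entries are needed to cover most of the usage. The sketch below does that; the 95% threshold is an arbitrary choice for illustration, the function name is hypothetical, and, as noted above, the result says nothing about how much compression the remaining refs actually bought. For P frames the trailing weightp duplicates are also counted as list entries here.

```python
def refs_needed(shares, coverage=95.0):
    """Return how many leading list entries cover `coverage` percent of usage."""
    total = 0.0
    for n, share in enumerate(shares, start=1):
        total += share
        if total >= coverage:
            return n
    # Rounding in the printed values can keep the sum just below the target.
    return len(shares)


# Percentages copied from the sample output above.
REF_P_L0 = [56.2, 10.9, 15.7, 5.2, 3.9, 2.8, 2.4, 1.3, 1.3, 0.3, 0.0]
REF_B_L0 = [81.1, 9.2, 4.6, 1.9, 1.3, 1.0, 0.6, 0.2]
REF_B_L1 = [94.6, 5.4]

for name, shares in [("P L0", REF_P_L0), ("B L0", REF_B_L0), ("B L1", REF_B_L1)]:
    print(name, refs_needed(shares))
```

A low number relative to the list length is a hint that a lower ref setting may be worth testing with a same-bitrate 2pass encode, not proof that it will look the same.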

External links: