Digital video has some properties and terminology inherited from the old analog systems used in television broadcast, as well as some new properties brought by new technologies. Detailed information about those properties and systems is beyond the scope of this manual, but here are some explanations which might help you in using clips and generating video with Kdenlive.
Frames per second
The human eye is a wonderful mechanism, but it can be fooled. This is what made cinema possible back in the 1800s --- if a sequence of slightly different still pictures is shown rapidly, the eye cannot distinguish individual images, creating the illusion of movement. The same thing happens if you take a pen between your fingers and move it up and down while keeping your fingers flexible --- this will make the pen look as if it's melted.
In video language, each of the pictures in a sequence is called a frame, and we have to show the frames at a certain speed to create the illusion of movement. This speed is measured in frames per second (fps). The human eye cannot distinguish individual frames shown at a rate of about 14fps or more, although movement at that rate might still look clumsy. This was roughly the speed used in early cinema movies. Slightly higher speeds allow an illusion of smoother movement.
When television was created, one of the problems found was that, besides showing the frames at a certain rate, the frame display in the receiver had to be synchronised with that of the transmitter. The lack of precise electronic circuits and components at the time led the engineers to use a simple oscillator which was ready for use in the wall: the frequency of the Alternating Current (AC) electricity network. In some countries AC was generated at 50 Hz, and in other countries it came to the outlets at 60 Hz. This will have more meaning when we reach the explanations of the NTSC and PAL systems.
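The relation between a frame rate and time can be sketched in a few lines of Python. This is only an illustration: the function names `frame_duration_ms` and `frames_in` are made up for this example and are not part of Kdenlive.

```python
from fractions import Fraction

def frame_duration_ms(fps):
    """Time each frame stays on screen, in milliseconds."""
    return float(1000 / Fraction(fps))

def frames_in(seconds, fps):
    """Number of whole frames shown in the given number of seconds."""
    return int(seconds * Fraction(fps))

# 25 frames per second: each frame lasts 40 ms, 10 seconds hold 250 frames.
print(frame_duration_ms(25))   # 40.0
print(frames_in(10, 25))       # 250
```

Using `Fraction` keeps the arithmetic exact for rates that are not whole numbers, such as 30000/1001.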
The aspect ratio is basically the proportion between the screen width and the screen height.
The two most used aspect ratios are:
- 4:3 - This is the standard format for analog television, also known as "pan format".
- 16:9 - This format comes from cinema, and it is also used by the so-called wide-screen television.
When a 16:9 movie must be shown on a 4:3 screen, it either has to be cut at both sides, removing columns (and content), or it can be shown with black borders at the top and bottom, which allows the whole width to fit in the narrower screen. This latter display method is known as letterbox.
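The two ways of fitting a wide picture in a narrower screen come down to simple arithmetic. The sketch below is an illustration only; the function names `letterbox` and `crop_sides` and the example sizes are assumptions chosen for this example.

```python
from fractions import Fraction

def letterbox(src_aspect, screen_w, screen_h):
    """Keep the whole width: return (picture_height, height_of_each_black_bar)."""
    picture_h = int(screen_w / src_aspect)
    bar_h = (screen_h - picture_h) // 2
    return picture_h, bar_h

def crop_sides(src_w, src_h, target_aspect):
    """Keep the whole height: return (kept_width, columns_removed_per_side)."""
    kept_w = int(src_h * target_aspect)
    return kept_w, (src_w - kept_w) // 2

# A 16:9 movie on a 640x480 (4:3) screen: 360 lines of picture, 60-line bars.
print(letterbox(Fraction(16, 9), 640, 480))     # (360, 60)
# A 1920x1080 (16:9) picture cut down to 4:3: 1440 columns kept, 240 lost per side.
print(crop_sides(1920, 1080, Fraction(4, 3)))   # (1440, 240)
```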
Interlacing is another property which came from television. The cinescopes (or CRTs, Cathode Ray Tubes) used until the 90s were not fast enough to write a whole picture on the screen 50 or 60 times a second.
The solution found was to split each picture into two fields: the first one made of all the odd lines which composed the picture, and the second one of the even lines. The human eye could not distinguish the two fields shown in such quick succession, and the materials used in the tube screens would, anyway, retain light for a little while after the electron beam hit them. At such speed, the combination of the two fields would be imperceptible.
This is interlacing. Following the frequency chosen by the television engineers, 50 or 60 fields are shown per second, making for, respectively, 25 or 30 frames per second.
Later, faster cinescopes allowed the full display of pictures without the need for interlacing, and non-interlaced video allows better quality, especially noticeable in frozen images. Sometimes, converting interlaced video to non-interlaced video or vice versa is desirable.
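The split of a picture into two fields can be sketched as follows, treating a frame as a plain list of scan lines numbered from 1. Putting the fields back together line by line (often called "weave") is only the simplest way of getting a non-interlaced frame again, and is not necessarily what a video editor does. The function names are assumptions made for this example.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its odd and even fields."""
    odd_field = frame[0::2]    # lines 1, 3, 5, ...
    even_field = frame[1::2]   # lines 2, 4, 6, ...
    return odd_field, even_field

def weave(odd_field, even_field):
    """Rebuild a full frame by alternating lines from the two fields."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    if len(odd_field) > len(even_field):
        frame.append(odd_field[-1])   # odd line count: one line left over
    return frame

frame = ["line1", "line2", "line3", "line4", "line5"]
odd, even = split_fields(frame)
print(odd)    # ['line1', 'line3', 'line5']
print(even)   # ['line2', 'line4']
print(weave(odd, even) == frame)   # True
```

Since two fields make one frame, the frame rate is half the field rate: 50 fields per second give 25 frames, and 60 fields give 30.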
When we are editing video, we often need a reference to a specific point of the video, for example where a certain scene begins. The most precise measure since the early editing machines is the timecode, which came with magnetic video tape technology and remained in digital video.
The timecode of a recording is embedded in the video information and it counts the time since the start of the recording. It is usually shown in the format hh:mm:ss:ff, where hh is the number of hours, mm means minutes, ss are the seconds, and ff is the number of frames within a second.
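Converting between a frame count and the hh:mm:ss:ff format is straightforward when the frame rate is a whole number. The sketch below assumes an integer rate such as 25fps; the function names are made up for this example.

```python
def frames_to_timecode(frame_count, fps):
    """Format a frame count as hh:mm:ss:ff for an integer frame rate."""
    ff = frame_count % fps
    total_seconds = frame_count // fps
    ss = total_seconds % 60
    mm = (total_seconds // 60) % 60
    hh = total_seconds // 3600
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

def timecode_to_frames(timecode, fps):
    """Convert an hh:mm:ss:ff string back into a frame count."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

print(frames_to_timecode(90137, 25))          # 01:00:05:12
print(timecode_to_frames("01:00:05:12", 25))  # 90137
```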
NTSC is a television broadcast system created by the National Television System Committee in the USA. Although it is an analog system, NTSC is used as a reference for digital video for the screen size and number of frames per second. It has the following characteristics:
- 352x240 pixels screen size (720x480 for DVD)
- 30fps. More precisely, 30000/1001fps or approximately 29.97fps.
The strange frame rate of NTSC makes it necessary to use the technique commonly called drop frame in video editors like Kdenlive. Every minute, except each 10th minute, two frame numbers are simply skipped in the time count (no picture is actually discarded), so the timecode keeps pace with the clock as if the video ran at 30fps. Without this trick, a timecode counted at "real 30fps" would slowly drift away from the actual running time, by about 3.6 seconds per hour.
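The drop frame count can be sketched as below. This is a minimal illustration and not Kdenlive's own code. It assumes the usual convention in which frame numbers 00 and 01 are skipped at the start of every minute, except minutes divisible by ten, and in which a semicolon before the frame field marks a drop frame timecode. The function name is made up for this example.

```python
def frames_to_dropframe_timecode(frame_number):
    """Label a 29.97fps frame count with a 30fps drop frame timecode."""
    frames_per_10_minutes = 17982   # 10 * 60 * 30 labels minus 9 * 2 skipped
    frames_per_minute = 1798        # 60 * 30 labels minus 2 skipped
    blocks, rest = divmod(frame_number, frames_per_10_minutes)

    skipped = 18 * blocks           # 9 skipping minutes per 10-minute block
    if rest >= 1800:                # past the first minute, which skips nothing
        skipped += 2 * ((rest - 1800) // frames_per_minute + 1)

    label = frame_number + skipped  # position in a plain 30fps count
    ff = label % 30
    ss = (label // 30) % 60
    mm = (label // 1800) % 60
    hh = label // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"

print(frames_to_dropframe_timecode(1799))    # 00:00:59;29
print(frames_to_dropframe_timecode(1800))    # 00:01:00;02
print(frames_to_dropframe_timecode(107892))  # 01:00:00;00
```

The last line shows the point of the technique: one hour of video at 30000/1001fps holds 107892 frames, and the timecode reads exactly one hour.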
The PAL system was created by German electronics giant AEG-Telefunken and became widely adopted in many other countries of Europe, South America, Asia and Africa. Its name comes from Phase Alternating Line, a rather ingenious method found by the engineers to allow automatic color correction in television. This resulted in PAL being described as "Pictures Always Loveable", while NTSC was known as "Never Twice the Same Color"...
Just like NTSC, PAL is also used as a reference in digital video for image size and picture rate:
- 352x288 pixels screen size (720x576 for DVD)
- 25fps
However, a number of PAL variants were created for different countries and some of them do NOT follow these standards. The PAL-M system adopted in Brazil and Laos, for example, keeps the "color burst" and modulation of PAL while taking the picture size and frames-per-second properties of NTSC.
Chrominance and Luminance: colors in video
When color TV became commercially possible, a number of black and white TV sets were already in people's living rooms and it would make no sense to throw them away. Also, the idea of transmitting color TV with separate signals for the three basic colors (Red, Green and Blue, the famous RGB) was impossible in practice, because of the large bandwidth required for analog transmission. It would be too much information to fit in one standard television channel, 6 MHz wide.
Engineers found a brilliant solution, once more benefiting from a limitation of the human eye. They created a color system based on subtraction which became known as YCC, as opposed to the addition system used for RGB.
The main part of the analog color TV signal is made of luminance, which defines only the brightness, ranging from full black to full white. This is exactly what the black and white TV needed, so those old TV sets could remain useful just by ignoring the color information. Added to it, but taking much less bandwidth, are the chrominance signals for the colors. These signals are processed locally by the receiver with a little color logic. Since the blend of all colors at full brightness results in pure white, which is also the maximum value possible for the luminance signal, the color that is not transmitted can be recovered by subtracting the transmitted color values from the luminance signal.
To be exact, the blend for making white is about 30% red, 59% green and 11% blue, that is, Y = 0.30 R + 0.59 G + 0.11 B. So, to keep the required bandwidth as low as possible, only two chrominance signals are added to the luminance: those for red (Cr) and blue (Cb), each carried as a difference from the luminance (Cr stands for R - Y and Cb for B - Y, leaving scaling factors aside). Red then comes from Y + Cr and blue from Y + Cb, and the amount of green in a pixel is whatever remains of the luminance once the weighted red and blue are subtracted: G = (Y - 0.30 R - 0.11 B) / 0.59.
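The color arithmetic can be tried out with a short sketch. It makes two assumptions which are not taken from the text above: the weights are the more precise 0.299 for red, 0.587 for green and 0.114 for blue, and the chrominance signals are plain differences (B - Y and R - Y) without the scaling and offsets that real Cb and Cr signals carry. With the weights taken into account, green is recovered as the part of the luminance left after the weighted red and blue are subtracted. The function names are made up for this example.

```python
# Assumed luminance weights for red, green and blue (they add up to 1).
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_luma_chroma(r, g, b):
    """Return (Y, B - Y, R - Y) for color values between 0.0 and 1.0."""
    y = KR * r + KG * g + KB * b
    return y, b - y, r - y

def luma_chroma_to_rgb(y, cb, cr):
    """Recover R, G and B from the luminance and the two differences."""
    r = y + cr
    b = y + cb
    g = (y - KR * r - KB * b) / KG   # green is what remains of the luminance
    return r, g, b

# Pure white has full luminance and no color difference at all
# (up to floating-point rounding).
print(rgb_to_luma_chroma(1.0, 1.0, 1.0))

# Any color survives the round trip, up to rounding errors.
print(luma_chroma_to_rgb(*rgb_to_luma_chroma(0.2, 0.6, 0.4)))
```

A black and white set would use only the first value, Y, and ignore the other two.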
Then, a coloured image in TV is made half of the luminance signal and half of the chrominance signals (a proportion of 4:2:2). The color information itself can use less bandwidth because the human eye is much more sensitive to the brightness and contours of an image than to color differences. You can prove that by looking at drawings made only from colors, without black or darker contours.
Although this solution was made for analog television, and digital video allows complete color information in a narrower bandwidth, the YCC scheme was preserved because it allows better compression. The DV25 standard for digital video uses even less bandwidth for the color information, in a luminance/chrominance proportion of 4:1:1, which can lead to visual artifacts in operations such as compositing an image with another of a person shot against a blue background (the famous chromakey composition). The more professionally oriented DV50 standard uses 4:2:2 color sampling by default.
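What proportions such as 4:2:2 and 4:1:1 mean for the amount of data can be estimated with a small calculation. The sketch assumes the common reading of the notation, in which the three numbers describe a block of pixels four wide and two high: the first is the number of luminance samples per row, the second the number of samples of each chrominance signal in the first row, and the third the number in the second row. The function names are made up for this example.

```python
def samples_per_pixel(j, a, b):
    """Average number of samples stored per pixel for a j:a:b proportion."""
    luma = 2 * j            # one luminance sample per pixel, two rows
    chroma = 2 * (a + b)    # two chrominance signals over both rows
    return (luma + chroma) / (2 * j)

def luma_share(j, a, b):
    """Fraction of all samples which belong to the luminance signal."""
    luma = 2 * j
    chroma = 2 * (a + b)
    return luma / (luma + chroma)

print(samples_per_pixel(4, 4, 4))  # 3.0 (complete color information)
print(samples_per_pixel(4, 2, 2))  # 2.0
print(samples_per_pixel(4, 1, 1))  # 1.5
print(luma_share(4, 2, 2))         # 0.5 (half luminance, half chrominance)
```

Under this reading, 4:1:1 keeps only a quarter of the color samples of a complete picture, which is consistent with the rough edges that can appear in chromakey work.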