Intro to Video Streaming and Video Audio Codecs

Xah Lee, 2010-07-04

This page surveys the current video and audio codecs, container file formats, and streaming protocols used for streaming multimedia.

I need to set up embedded videos on my site as well as on sites i work for. So, today, i started a comprehensive study on video streaming. Here are some learning notes. (The last time i worked with video was in 1995-1997. At the time, video streaming was still pretty much science fiction.)

Playing a video in a browser from a site such as YouTube involves several separate technologies. Here's a summary of the basics:

  • video codec. The video must be encoded into a standard format (i.e. a sequence of bits). Usually this means using a particular compression scheme. The encoding and decoding algorithm, together with the format, is called a codec. (examples: H.264, MPEG-4 Part 2, MPEG-2 as used on DVD-Video, WMV, ...)
  • audio codec. The audio part of the video must also be encoded. It is usually treated and stored separately from the video. (examples: MP3, AAC, WAV, AC-3 (Dolby Digital), FLAC, ...)
  • multimedia container format. The encoded video and audio are saved in a container file format. That is, a file format that contains video, audio, and other items such as subtitles. (examples: QuickTime's “.mov”, Microsoft's “.avi” and “.wmv”, Adobe Flash Video, DVD, MPEG, ...)
  • streaming protocol. The file must be served over a special network protocol, e.g. by a streaming server, because it's not used like a normal file. You want users to be able to watch the movie as soon as it starts to download, and to play or pause it at any time.
  • application support. The web browser must have special code or a plugin for movie files, so that, for example, a movie displays a still frame even when not playing, and shows play/pause buttons, a full screen option, etc. (Popularly done with the Adobe Flash plugin, or Java, or HTML5's video tag.)
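As a rough illustration of how the first three pieces relate, here is a minimal Python sketch. The class name `MediaFile` and its fields are made up for this example; the two pairings are common ones (WebM normally carries VP8 video with Vorbis audio, and Ogg usually carries Theora with Vorbis).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaFile:
    """Hypothetical model: one container wrapping one video and one audio stream."""
    container: str    # file format that wraps the streams
    video_codec: str  # how the picture is compressed
    audio_codec: str  # how the sound is compressed

    def describe(self) -> str:
        return (f"{self.container} container: "
                f"{self.video_codec} video + {self.audio_codec} audio")


# Illustrative pairings only, not an exhaustive list.
WEBM = MediaFile("WebM", "VP8", "Vorbis")
OGG = MediaFile("Ogg", "Theora", "Vorbis")

print(WEBM.describe())
print(OGG.describe())
```

The point of the sketch is only that the container and the codecs are independent choices: the same codec can appear in different containers, and one container can hold different codecs.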

Video Codecs


H.264 (aka MPEG-4 Part 10, or AVC (Advanced Video Coding)). It is currently the most popular video codec. It is used in Blu-ray Disc, YouTube, iTunes, many nations' broadcasting, and other video-related applications. See: List of video services using H.264.

The first version of H.264 was completed in 2003-05.


A competitor to H.264 is VP8. It is currently owned by Google and released as open source. VP8 is endorsed by the FSF.

The container format associated with VP8 is WebM. The WebM format is also free, and is based on the free Matroska container format.

The WebM file format is competing to be the default video format for HTML5 video.

WMV (VC-1)

Windows Media Video (WMV) refers to several video codecs from Microsoft, but mostly the latest, WMV 9 (aka VC-1). WMV 9 was submitted to a standards body (SMPTE) and standardized as VC-1 in 2006. It is widely supported, and is used in Blu-ray Discs, Xbox 360, and PlayStation 3. It's a competitor to H.264.

Besides WMV 9, there are 2 other codecs: WMV Screen and WMV Image. The Screen one is optimized for screen recordings, e.g. tutorials on using an application. The Image one is optimized for slideshows.


Another widely used set of codecs comes with Apple's QuickTime (see below).


Sorenson codec

Sorenson codec refers to 2 proprietary codecs. Quote:

  • Sorenson Video (aka Sorenson Video Codec, Sorenson Video Quantizer, SVQ). Used by Apple's QuickTime, but was phased out in the mid-2000s.
  • Sorenson Spark (aka Sorenson H.263). Used by Adobe Flash, but was phased out in the mid-2000s.


Theora is a free lossy video compression codec. Its technical quality is not as good as that of H.264 or VP8. It is based on On2's VP3, originally a proprietary format that On2 later released as free. It is not widely supported. Theora is usually stored in the Ogg container format, together with the free lossy audio codec Vorbis.

DivX and Xvid

DivX started as an open source project in ~2000 but became proprietary, and Xvid is forked from it. Both are usually used as the format for ripped DVDs. Neither defines a new codec or container format; rather, both implement a subset of the MPEG-4 standard (MPEG-4 Part 2) and rely on existing container formats. The original DivX began as a hack of Microsoft's MPEG-4 version 3 codec.


Audio Codecs

There are tens of audio codecs. Some are lossy, some lossless. Here are some popular lossy ones:

  • MP3, from the standard MPEG-1 Audio Layer 3. Most popular. Started the digital music era in the late 1990s.
  • AAC. Used in iTunes, iPod, iPhone, etc. Generally gives better sound quality than MP3 at the same bit rate.
  • Windows Media Audio (WMA). Microsoft's answer. WMA is part of Microsoft's Windows Media framework. WMA can refer to 4 codecs: WMA, WMA Pro, WMA Lossless, WMA Voice.
  • Vorbis. Open source. Typically used together with the Ogg container format. Superior to MP3, and probably inferior to AAC.

There are a number of free and lossless codecs for audio. The most popular is probably FLAC. A lossless audio codec typically compresses a music file to roughly half its original size.

For audio file formats (not compressed), the most popular ones are Microsoft's WAV and Apple's AIFF. These are pretty old, dating from around 1990. Note that both formats actually support compression, but audio stored in these formats is almost always uncompressed.
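To see what "not compressed" means in practice, here is a small sketch using Python's standard `wave` module. It writes one second of silence at CD settings (44.1 kHz, 16-bit, stereo) into memory and compares the file size with the raw sample data. The function name `make_silent_wav` is just for this example.

```python
import io
import wave

SAMPLE_RATE = 44_100  # CD sample rate, samples per second per channel
CHANNELS = 2          # stereo
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)


def make_silent_wav(seconds: int) -> bytes:
    """Build an uncompressed PCM WAV file in memory containing silence."""
    frames = bytes(seconds * SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(frames)
    return buf.getvalue()


wav_bytes = make_silent_wav(1)
pcm_bytes = SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH  # raw samples for one second

print("raw sample bytes per second:", pcm_bytes)
print("header overhead in bytes:", len(wav_bytes) - pcm_bytes)
print("bit rate of uncompressed CD audio:", pcm_bytes * 8, "bits/s")
```

Even pure silence takes the full size, because the file stores every sample as-is. The file is just a small header followed by the samples.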

Note that about 300 kilobits per second gets you CD-quality audio (using lossy compression), while DVD-quality video is about 5 megabits per second. That's about 17 times more.
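The arithmetic behind that comparison, as a quick Python sketch. The two bit rates are the rough figures quoted above, and the one-hour duration is just an example.

```python
AUDIO_BPS = 300_000    # lossy "CD quality" audio, rough figure from the text
VIDEO_BPS = 5_000_000  # "DVD quality" video, rough figure from the text


def megabytes(bits_per_second: int, seconds: int) -> float:
    """Size in megabytes (10^6 bytes) of a stream at a constant bit rate."""
    return bits_per_second * seconds / 8 / 1_000_000


ratio = VIDEO_BPS / AUDIO_BPS
print(f"video/audio bit rate ratio: {ratio:.1f}")                 # 16.7
print(f"one hour of audio: {megabytes(AUDIO_BPS, 3600):.0f} MB")  # 135
print(f"one hour of video: {megabytes(VIDEO_BPS, 3600):.0f} MB")  # 2250
```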

The need for audio codec research has passed. Computer storage and processing power today can deal with audio with no problem, and use of lossless codecs for audio is increasingly popular. So, for movie streaming, the video part is the primary concern.

Multimedia Container Formats

QuickTime (“.mov” or “.qt”) is Apple's container format. Widely used.

AVI is Microsoft's tech, fairly old, started in early 1990s. Widely used.

Advanced Systems Format (ASF) is Microsoft's container format, part of Microsoft's Windows Media framework.

Matroska (“.mkv”) is a free container format. Google recently adopted a subset of it as the basis of WebM, to be used together with VP8.

Ogg is another free multimedia container format. Its tech quality is often in dispute. It is used by Wikipedia.
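Each of these container formats starts with a recognizable byte signature, which is how players tell them apart regardless of file name. Here is a minimal sketch of that idea. The function name `sniff_container` is made up, the check is deliberately simplistic, and it is not how any particular player does it.

```python
def sniff_container(head: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if head.startswith(b"\x1a\x45\xdf\xa3"):
        # EBML header. Matroska and WebM share it; telling them apart
        # requires reading the DocType field, which this sketch skips.
        return "Matroska/WebM"
    if head.startswith(b"OggS"):
        return "Ogg"
    if head[:4] == b"RIFF" and head[8:12] == b"AVI ":
        return "AVI"
    if head.startswith(bytes.fromhex("3026b2758e66cf11")):
        # First 8 bytes of the ASF header object GUID.
        return "ASF"
    if head[4:8] in (b"ftyp", b"moov", b"mdat", b"free", b"wide"):
        # QuickTime files begin with an atom: 4-byte size, then 4-byte type.
        return "QuickTime/MP4 family"
    return "unknown"


print(sniff_container(b"OggS\x00\x02" + bytes(20)))
print(sniff_container(b"RIFF\x00\x00\x00\x00AVI LIST"))
print(sniff_container(b"\x00\x00\x00\x14ftypqt  "))
```

To use it on a real file, read the first 16 bytes or so in binary mode and pass them in.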


QuickTime (QT) is Apple's multimedia framework. It supports audio and video, as well as interactive panoramic images and such things as MIDI. It supports many codecs for audio and video.

The file format of QT is “.mov”. Quote:

The QuickTime (.mov) file format functions as a multimedia container file that contains one or more tracks, each of which stores a particular type of data: audio, video, effects, or text (e.g. for subtitles). Each track either contains a digitally-encoded media stream (using a specific codec) or a data reference to the media stream located in another file. Tracks are maintained in a hierarchical data structure consisting of objects called atoms. An atom can be a parent to other atoms or it can contain media or edit data, but it cannot do both.
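The atom structure is simple enough to walk by hand. Each atom begins with a 4-byte big-endian size (which counts the header itself) followed by a 4-byte type code. Below is a minimal sketch that lists the top-level atoms in a byte string. The function name and the sample bytes are made up for illustration, and it does not descend into child atoms.

```python
import struct


def list_top_level_atoms(data: bytes) -> list:
    """Return (type, size) for each top-level atom in QuickTime-style data."""
    atoms = []
    offset = 0
    while offset + 8 <= len(data):
        size, kind = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the atom runs to the end of the file.
            size = len(data) - offset
        if size < header:
            break  # malformed atom; stop rather than loop forever
        atoms.append((kind.decode("latin-1"), size))
        offset += size
    return atoms


# Hand-built sample: a 16-byte "ftyp" atom followed by an empty "free" atom.
sample = (struct.pack(">I4s", 16, b"ftyp") + b"qt  " + bytes(4)
          + struct.pack(">I4s", 8, b"free"))
print(list_top_level_atoms(sample))
```

A parent atom such as “moov” could be explored by calling the same function on its payload, i.e. the bytes after its 8-byte header.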

QT 7.x was current from 2005 (OS X 10.4) through version 7.6 in 2009 (OS X 10.6). After that, the next version is QT X (10), which is supposedly completely rewritten for 64-bit computing and somewhat incompatible with past QT versions. However, QT X relies on QT 7 for dealing with older codecs and other files such as MIDI.

Some more Wikipedia quotes:

QuickTime X is a combination of two technologies: QuickTime Kit Framework (QTKit) and QuickTime X Player.

... many Apple products (such as iTunes and Apple TV) still use the older QuickTime 7 engine.

QT Streaming

QuickTime Streaming Server (QTSS) is a server or service daemon built into Apple's Mac OS X Server that delivers video and audio on request to users over a computer network, including the Internet. Its primary GUI configuration tool is QTSS Publisher and its web-based administration port is 1220.

QuickTime Broadcaster is an audio and video RTP/RTSP server by Apple Computer for Mac OS X. It is separate from Apple's QuickTime Streaming Server, as it is not a service daemon but a desktop application.


FFmpeg and VLC

FFmpeg is an open source project on video and audio tech. Three notable components from FFmpeg are:

  • libavcodec, an audio/video codec library used by several other projects.
  • libavformat, an audio/video container mux and demux library
  • ffmpeg command line program for transcoding multimedia files.

For everyday use, the most interesting part is the “ffmpeg” command line tool, which lets you convert one video format to another.
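As a sketch of what such a conversion looks like, the Python snippet below only builds the command line and prints it. The file names are placeholders. The encoder names `libvpx` (VP8) and `libvorbis` are an assumption about how your ffmpeg was built, since not every build includes them.

```python
import shlex


def ffmpeg_transcode_command(src: str, dst: str,
                             video_codec: str, audio_codec: str) -> list:
    """Build an ffmpeg argument list that re-encodes src into dst."""
    return [
        "ffmpeg",
        "-i", src,             # input file
        "-c:v", video_codec,   # video encoder to use
        "-c:a", audio_codec,   # audio encoder to use
        dst,                   # output file; container is picked from the extension
    ]


# Placeholder file names; assumes an ffmpeg build with libvpx and libvorbis.
cmd = ffmpeg_transcode_command("input.mov", "output.webm", "libvpx", "libvorbis")
print(shlex.join(cmd))

# To actually run it (requires ffmpeg on PATH and a real input file):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Note how the command mirrors the layering described earlier: you choose the video codec, the audio codec, and the container separately.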

VLC is a movie player. It was originally designed as a server/client pair for streaming multimedia, but is now a single application. It was at one point used by Google for Google Video, until they switched to Flash. VLC can also be used on the command line.

Streaming Technologies

The following are the most commonly used protocols for streaming media, each with a particular purpose:

  • RTSP. Session control, e.g. sending play and pause requests from the client.
  • RTP. Carries the streaming media payload.
  • RTCP. Monitors transmission statistics and QoS information.

The above combo is usually referred to as “RTSP/RTP”.
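To give a feel for the split between control and payload, here is a small Python sketch. RTSP requests are plain text, much like HTTP, while every RTP packet starts with a 12-byte binary header. The URL, session id, and packet field values below are made up for illustration. Nothing is sent over the network.

```python
import struct


def rtsp_request(method: str, url: str, cseq: int, session: str = "") -> str:
    """Format a minimal RTSP/1.0 request as text."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session:
        lines.append(f"Session: {session}")
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed header at the start of an RTP packet."""
    b0, b1, seq, timestamp, ssrc = struct.unpack_from(">BBHII", packet, 0)
    return {
        "version": b0 >> 6,          # 2 for current RTP
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,   # identifies the codec in use
        "sequence": seq,             # lets the receiver detect loss and reordering
        "timestamp": timestamp,      # media clock, used for playback timing
        "ssrc": ssrc,                # identifies the stream source
    }


# Hypothetical values for illustration.
play = rtsp_request("PLAY", "rtsp://example.com/movie", 3, session="12345678")
print(play)

fake_packet = struct.pack(">BBHII", 0x80, 96, 1, 90000, 0x1234) + b"payload"
print(parse_rtp_header(fake_packet))
```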

Adobe Flash uses its own Real Time Messaging Protocol (RTMP).

Microsoft was using Microsoft Media Server (MMS), but it was deprecated in 2003. Now Microsoft uses RTSP.

HTTP Live Streaming is Apple's tech, new in 2009 with QuickTime X. It differs from the others in that it is HTTP based. It has been proposed as an internet standard.
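In practice, “HTTP based” means the movie is cut into short segment files, and a plain-text playlist (“.m3u8”) lists them. The player downloads the playlist and then each segment with ordinary HTTP requests. Below is a minimal Python sketch that reads such a playlist. The playlist content and segment names are made up, and only the `#EXTINF` tag is interpreted.

```python
# A made-up media playlist, in the style of an HTTP Live Streaming .m3u8 file.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10,
segment0.ts
#EXTINF:10,
segment1.ts
#EXTINF:5,
segment2.ts
#EXT-X-ENDLIST
"""


def parse_media_playlist(text: str) -> list:
    """Return (uri, duration_in_seconds) for each segment in the playlist."""
    segments = []
    duration = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>" describes the next URI line.
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#") and duration is not None:
            segments.append((line, duration))
            duration = None
    return segments


segments = parse_media_playlist(SAMPLE_PLAYLIST)
for uri, seconds in segments:
    print(uri, seconds)
print("total seconds:", sum(seconds for _, seconds in segments))
```

Because everything is just files over HTTP, this approach works with ordinary web servers and caches, with no special streaming server needed.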

Some detail:

Here's a list of video hosting services: Comparison of video services. Contains some detail of what protocol they use.

See also: Comparison of streaming media systems.

Some References

Besides Wikipedia, here's some other articles i used for this article.

Comparison of codecs:

  • First Look: H.264 and VP8 Compared (2010-05-20), by Jan Ozer.
  • The first in-depth technical analysis of VP8 (2010-05-19), by Jason Garrett-Glaser (an x264 and ffmpeg developer; college student).
  • Video on the Web (2009-03), by Till Halbach (comparison of Dirac, Dirac Pro, Theora, H.264).
  • “[whatwg] H.264-in-<video> vs plugin APIs” (2009-06-13), by Chris DiBona (google employee)

Comparison of container formats:

  • Ogg objections (2010-03-03), by Mans Rullgard (ffmpeg developer).
  • In Defense of Ogg's Good Name (2010-04-27), by Christopher Montgomery (Ogg designer).

Audio codecs comparison:


  • Apple proposes HTTP streaming feature as IETF standard (2009-07-09), by Chris Foresman.
