IP: Its Past, Its Future and Its Impact on Broadcasting
For many decades broadcasting was based on synchronous, circuit-based methods such as analogue video or the Serial Digital Interface (SDI), both carried over coaxial cabling. Only in the last two decades has this changed, with Internet Protocol (IP) technologies replacing these costly and inflexible methods.
For simplicity, this piece focuses on broadcast production and B2B content delivery over IP networks. In particular, it discusses contribution (getting content from the field to a broadcaster) and primary distribution (delivering produced content to other broadcasters or cable operators). There have also been many changes in B2C delivery of broadcast television, not to mention OTT services such as Netflix using adaptive streaming to deliver high-quality video to many devices in the home, but these have so far developed into a separate ecosystem from B2B video and are better addressed in a different piece.
It is also important to point out that the term “IP” does not necessarily mean using the Internet. It merely refers to using the technologies (protocols) behind the Internet; the content may or may not traverse the Internet itself. This piece will go through the developments in chronological order and explore how they affected the industry at the time and how future developments will impact broadcasters.
MPEG Transport Streams over IP
In the late 1990s, as the dot-com era drove up network speeds and digital television rolled out, there was a push to move away from broadcast-specific interfaces such as the Asynchronous Serial Interface (ASI) to IP networks. This was a key change: IP uses packet switching rather than circuit switching, so MPEG Transport Stream (MPEG-TS) data was cut into discrete IP packets and sent individually across the network. In theory these packets could be reordered or even lost in transit, though any network designer would try to avoid this. This was a large step-change compared to the continuous signals carried over coaxial cabling (SDI and ASI).
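In practice, MPEG-TS packets are a fixed 188 bytes, and a common convention is to pack seven of them (1316 bytes) into each UDP datagram so that the datagram plus headers fits within a standard 1500-byte Ethernet MTU. A minimal sketch of that packetisation:

```python
# Sketch: pack fixed 188-byte MPEG-TS packets into UDP payloads, seven per
# datagram, so each payload (7 * 188 = 1316 bytes) plus IP/UDP headers stays
# within a typical 1500-byte Ethernet MTU.

TS_PACKET_SIZE = 188          # fixed MPEG-TS packet size
TS_PACKETS_PER_DATAGRAM = 7   # common choice: 7 * 188 = 1316 bytes

def datagrams_from_ts(ts_data: bytes):
    """Yield UDP payloads, each carrying up to seven whole TS packets."""
    assert len(ts_data) % TS_PACKET_SIZE == 0, "input must be whole TS packets"
    chunk = TS_PACKET_SIZE * TS_PACKETS_PER_DATAGRAM
    for offset in range(0, len(ts_data), chunk):
        yield ts_data[offset:offset + chunk]

# Example: 14 toy TS packets (each begins with the 0x47 sync byte).
stream = (b"\x47" + b"\x00" * 187) * 14
payloads = list(datagrams_from_ts(stream))
print(len(payloads), len(payloads[0]))  # 2 datagrams of 1316 bytes each
```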
Generally speaking, the broadcast industry uses fixed bit-rate content, so the Transmission Control Protocol (TCP) used by websites was unsuitable: its latency and throughput vary when packets are lost. Instead the User Datagram Protocol (UDP) was used, which provides no guarantees of delivery. UDP also allowed the use of multicast on the network, something still not widely used in IT to this day. Multicast allows multiple receivers of a single transport stream, with replication of the stream handled by the network equipment. IP networks also allowed multiple streams to traverse a single cable; this was possible with ASI too, but required costly re-multiplexers. Streams could also be bidirectional on IP networks, in contrast to unidirectional ASI. As will be a common theme in this piece, IP technology comes from a much larger industry than broadcast, allowing for major economies of scale.
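A receiver opts into a multicast stream by joining the group, after which the network replicates the packets towards it. A minimal sketch using the standard sockets API (the group address 239.1.1.1 and port 5000 are illustrative only):

```python
# Sketch: a receiver joining a UDP multicast group to pick up a transport
# stream. The network, not the sender, replicates the stream to each member.
# Group 239.1.1.1 and port 5000 are illustrative values, not real assignments.
import socket
import struct

GROUP, PORT = "239.1.1.1", 5000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# IP_ADD_MEMBERSHIP takes the group address plus the local interface to join on.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
try:
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
except OSError:
    pass  # hosts without a multicast route will refuse the join

# data, _ = sock.recvfrom(1500)  # each datagram typically carries 7 TS packets
print(len(mreq))  # the 8-byte membership request: group + interface
```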
There was, however, one missing technical element: a way of measuring how much packet loss had occurred. MPEG-TS has a 4-bit continuity counter, which was adequate for the rare, small amounts of corruption seen on ASI, but 4 bits do not provide enough resolution to measure packet loss on IP networks. So the Internet Engineering Task Force (IETF), which manages many IP protocol definitions, created RFC 2250 to frame MPEG-TS data inside the Real-time Transport Protocol (RTP, not to be confused with RTSP or RTCP). This places an RTP header before the MPEG-TS data inside a UDP packet. Most importantly, the RTP header contains a 16-bit sequence number that allows packet loss to be measured precisely and packets to be reordered accurately. A timestamp is also present, but it is often ignored because many transport streams contain programs with different internal clocks and only one can be chosen.
It is worth pointing out that in small networks where packet loss is unlikely, RTP headers are sometimes omitted. As a result of these developments, broadcasters could build networks which transported compressed video much more efficiently.
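The fixed RTP header is 12 bytes, and the 16-bit sequence number wraps at 65535, so loss counting must be done modulo 2^16. A sketch of parsing the header and counting lost packets (payload type 33 is the registered value for MPEG-TS):

```python
# Sketch: parse the fixed 12-byte RTP header (the framing RFC 2250 uses for
# MPEG-TS) and count lost packets from the 16-bit sequence number, handling
# the wrap from 65535 back to 0.
import struct

def parse_rtp_header(packet: bytes):
    """Return (payload_type, sequence, timestamp, ssrc) from an RTP packet."""
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    assert (b0 >> 6) == 2, "RTP version must be 2"
    return b1 & 0x7F, seq, ts, ssrc

def count_lost(prev_seq: int, seq: int) -> int:
    """Packets missing between two consecutive arrivals, modulo the 16-bit wrap."""
    return (seq - prev_seq - 1) & 0xFFFF

# Payload type 33 is MPEG-TS; here the sequence number sits right at the wrap.
pkt = struct.pack("!BBHII", 0x80, 33, 65535, 900000, 0x1234) + b"\x47" * 188
pt, seq, ts, ssrc = parse_rtp_header(pkt)
print(pt, seq)                # 33 65535
print(count_lost(65535, 2))   # 2: packets 0 and 1 were lost across the wrap
```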
Initially MPEG-TS transport over IP was used on local networks (e.g. within a cable headend), but there was subsequently a push to use wide-area networks, where packets traverse routers and switches that can drop them. Broadly speaking, there are two types of packet loss: random (“salt-and-pepper”) loss and burst loss. As a result, a basic Forward Error Correction (FEC) protocol based on XORing packet data was developed in the early 2000s. As shown in Figure 2, by using row and column FEC packets (an FEC matrix), both types of packet loss can be recovered, at the expense of a small buffer on the receiving device.
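The XOR property behind the matrix is simple: a parity packet XORed with all surviving packets in its row or column reproduces a single missing packet. A toy sketch of the column case, with illustrative matrix dimensions and packet contents:

```python
# Sketch: column FEC by XOR. One parity packet protects each column of the
# row/column matrix; XORing the parity with the surviving packets in that
# column rebuilds a single lost packet. Matrix size and payloads are toy values.
from functools import reduce

def xor_packets(packets):
    """XOR a list of equal-length byte strings together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

L, D = 4, 3  # L columns x D rows of media packets
packets = [bytes([i]) * 8 for i in range(L * D)]  # 12 toy 8-byte packets

# One XOR parity packet per column (column i holds packets i, i+L, i+2L, ...).
column_fec = [xor_packets(packets[i::L]) for i in range(L)]

# Simulate losing packet 5 (column 1) and rebuild it from survivors + parity.
lost = 5
survivors = [packets[i] for i in range(lost % L, L * D, L) if i != lost]
recovered = xor_packets(survivors + [column_fec[lost % L]])
print(recovered == packets[lost])  # True: the parity rebuilt the lost packet
```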
By using this protocol, broadcasters were able to use generic IP networks to transport compressed video, increasing flexibility and reducing dependence on broadcast-specific technologies such as dedicated fibre or satellite.
It can also be beneficial for reliability to use more than one telecom provider, so with the protocol known as ST 2022-7 (“hitless switching”) the same packets are transmitted over two providers. A buffer at the receiver takes in both sets of packets and uses the RTP sequence number to detect duplicates. This means that if a given link fails or has packet loss, packets can be used from the alternate link.
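The core of the receiver is a de-duplication buffer keyed on the sequence number: emit each sequence number once, whichever path delivered it first. A simplified sketch (a sorted list stands in for the real-time reordering buffer):

```python
# Sketch: ST 2022-7 style seamless protection at the receiver. The same RTP
# packets arrive over two providers; de-duplicating on the sequence number
# means either path alone, or the two combined, keeps the output stream whole.

def merge_paths(path_a, path_b):
    """Merge two (seq, payload) streams, emitting each sequence number once."""
    seen = set()
    output = []
    # Sorting stands in for the receiver's small real-time reordering buffer.
    for seq, payload in sorted(path_a + path_b):
        if seq not in seen:
            seen.add(seq)
            output.append((seq, payload))
    return output

# Path A lost packets 3 and 4; path B lost packet 1. Combined, nothing is lost.
path_a = [(0, b"p0"), (1, b"p1"), (2, b"p2"), (5, b"p5")]
path_b = [(0, b"p0"), (2, b"p2"), (3, b"p3"), (4, b"p4"), (5, b"p5")]
merged = merge_paths(path_a, path_b)
print([seq for seq, _ in merged])  # [0, 1, 2, 3, 4, 5]: fully recovered
```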
The public internet and the cloud
In many locations, private connectivity for broadcast is prohibitively expensive, so some broadcasters used the public internet to transport feeds. Some used FEC as shown above, with some success, although FEC cannot handle burst losses larger than the matrix size. As a result, proprietary protocols such as Zixi and VideoFlow were created that used packet retransmission, known as ARQ (Automatic Repeat reQuest), sometimes combined with FEC. ARQ works by retransmitting lost packets, but at the expense of a large buffer, especially when the round-trip time between source and destination is long (e.g. London to Sydney).
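The receiver side of a generic ARQ scheme (not the proprietary wire formats of Zixi or VideoFlow, which are not public) can be sketched as: detect gaps in the sequence numbers, request those packets again, and hold a buffer that covers at least one round trip so the retransmission can arrive before playout. The traffic figures below are illustrative only:

```python
# Sketch: the receiver side of a generic ARQ scheme. Gaps in the sequence
# numbers trigger retransmission requests (NACKs); the receive buffer must
# cover at least one round trip so a resent packet arrives before playout.

def nacks_for(received_seqs, highest_expected):
    """Return the sequence numbers to request again, given what has arrived."""
    have = set(received_seqs)
    return [s for s in range(highest_expected + 1) if s not in have]

arrived = [0, 1, 2, 5, 6]       # packets 3 and 4 never made it
print(nacks_for(arrived, 6))    # [3, 4]: ask the sender for these again

# Rough, illustrative sizing: a 300 ms London-Sydney round trip on a 20 Mb/s
# stream means buffering at least 0.3 s * 20e6 / 8 bytes before retransmitted
# packets can land.
buffer_bytes = int(0.3 * 20e6 / 8)
print(buffer_bytes)  # 750000 bytes, i.e. roughly 750 kB
```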
The prevalence of proprietary protocols proved to be a problem, so the industry came together to create the RIST (Reliable Internet Stream Transport) protocol, which builds on RTP and uses RTCP retransmission messages. In 2017, Haivision open-sourced its in-house SRT protocol, based on UDT, a file transfer protocol. Currently a “format war” is taking place between RIST and SRT.
Throughout the late 2010s and early 2020s there was a gradual push to move satellite- and fibre-based services to public internet transport. Likewise, using the cloud requires traversing shared infrastructure, so a recovery protocol is needed. Together with improvements in internet speed, these technologies allow broadcasters to use more generic connectivity and to broadcast from a wider variety of locations. Equally, primary distribution can now be performed without costly fibre or satellite connections.
Separately to this, workflows built around cellular bonding have transformed newsgathering. These aggregate 4G internet connections from different mobile providers, and they adapt the bitrate, and often the picture resolution, as the networks become congested (e.g. during a protest) or signal quality drops. Many traditional satellite newsgathering vehicles have been replaced with 4G vehicles. It is worth noting that all these solutions are “walled-garden” products; there is no common industry approach to cellular bonding. The technology has not yet extended widely to sports, except for web sports, largely because sports have more technically demanding requirements than news: sports broadcasters won’t tolerate fluctuations in picture quality or resolution. 5G will change this, as it will provide high-bandwidth connectivity from many locations.
The bigger picture is that B2B broadcast video has been abstracted away from industry-specific, precisely timed, circuit-based video solutions to lossy and jittery public internet solutions. It can be argued that the use of Zoom or FaceTime for newsgathering during the pandemic is the next step, as these are video transport mechanisms with consumer-grade characteristics (low-end camera sources, variable frame rates and so on). This change is somewhat analogous to the IT industry’s move to cloud and serverless: an increasing abstraction of processing.
IP for broadcast production
Whilst broadcast contribution moved to IP, production remained in the world of SDI coaxial cables, largely because of the high data rates and precise timing requirements involved. Yet as shown in Figure 4, networking speeds have grown drastically, thanks to consumer demand for video, big data and so on, and it was inevitable that the broadcast industry would have to move.
There were a few false starts with industry-specific technologies such as Audio Video Bridging, but the industry took a step in this direction with ST 2022-6, which maps an SDI signal directly into fixed-size IP packets. As with MPEG-TS, this allowed high-bandwidth networks to carry multiple uncompressed streams in both directions over a single cable. Unfortunately, by encapsulating SDI directly into IP, it continued to carry historical signals such as the vertical blanking interval and the CRC (cyclic redundancy check). Timing was added using the Precision Time Protocol (PTP), which allowed timing to travel down the same cabling in place of the legacy analogue genlock signal. The industry thereby gained a common source of time, frequency and phase, most importantly without leap seconds, and this again removed the need for another legacy signal. For many broadcasters, especially those whose SDI routers were full, ST 2022-6 was a useful transition.
ST 2022-6 suffered from two main problems: it wasted bandwidth on legacy blanking data, and any device that wanted to process just part of the signal, such as the audio, had to receive a large amount of unnecessary video data. As a result, ST 2110 was created, separating video, audio and ancillary data into separate RTP flows. Packets carry timestamps locked to PTP, and receivers have to re-synchronise all the received flows. Because the flows contain uncompressed data without any framing, a control system such as NMOS IS-04/IS-05 is needed to manage which multicasts receivers join and to tell receivers important information about the flows, such as the video width and height. NMOS has proven complex and time-consuming to implement.
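To illustrate how timestamps are locked to PTP: video flows use a 90 kHz RTP media clock, and the sender derives the timestamp by converting PTP epoch time to that clock and truncating to the 32-bit RTP field. A sketch under those assumptions (the PTP time used is an arbitrary example value):

```python
# Sketch: deriving an ST 2110 video RTP timestamp from PTP time, assuming the
# 90 kHz RTP media clock used for video flows. The timestamp is PTP time since
# the epoch, converted to clock ticks and truncated to the 32-bit RTP field.
from fractions import Fraction

RTP_VIDEO_CLOCK = 90_000  # Hz: the RTP media clock for uncompressed video

def rtp_timestamp(ptp_seconds: Fraction) -> int:
    """Truncate PTP time (seconds since the epoch) to a 32-bit 90 kHz tick."""
    return int(ptp_seconds * RTP_VIDEO_CLOCK) % (1 << 32)

# Two frames of 50 fps video are 1/50 s apart: exactly 1800 ticks at 90 kHz.
t0 = Fraction(1_700_000_000)   # an arbitrary example PTP time, in seconds
t1 = t0 + Fraction(1, 50)
print(rtp_timestamp(t1) - rtp_timestamp(t0))  # 1800 ticks per 50 fps frame
```

Because every sender derives timestamps from the same PTP time, a receiver can align video, audio and ancillary flows that were captured at the same instant, even though they arrive as independent streams.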
The bigger goal of using IP is to allow broadcasters to use COTS (commercial off-the-shelf) equipment. Whilst COTS networking equipment such as switches is used, the ideal would be to use COTS IT equipment such as servers. However, the tight timing requirements of ST 2022-6 and ST 2110 make this very challenging, so few vendors have implemented software stacks and a large amount of uncompressed-IP equipment remains fixed-function hardware. As broadcasters look to compete with streaming providers, this rigid approach to live production workflows continues to hold them back; IP should enable more flexibility, but we have not reached this goal yet.
In the meantime, various proprietary approaches to live production over IP, such as NDI, have gained market share. NDI takes a compressed approach, using a modified version of MPEG-2 encoding; its framing and transport are entirely proprietary, but it has the advantage of running over gigabit switches, which are much less costly.
The evolution of IP in broadcast
The pandemic has changed the way live television is produced, and there has been an early push to move processes to the cloud. The cloud solves one of the major problems broadcasters have: they must maintain enough fixed-function equipment for the busiest day of the year, and most of the time that equipment sits idle. This is in contrast to streaming services, which are cloud-native. The cloud would allow elastic production techniques, scaling resources up and down. But there has been limited progress on multi-vendor cloud production techniques: Amazon CDI is one approach, and the Video Services Forum Ground-Cloud-Cloud-Ground (VSF GCCG) working group is another. Arguably this is the last major technical problem left in broadcasting, and it remains to be seen what future technologies will affect broadcasters.