The ICE LLC Difference
A single Interactive Content Engine can provide Video on Demand, Time Shift TV, Network PVR, and Targeted Digital Commercial Insertion, and do so with equivalent reliability but far lower cost than competing architectures and products. It is scalable from hundreds of streams to hundreds of thousands of streams without content replication, and is based on commodity products. Systems can be expanded with hardware current at the time of expansion, without requiring a "fork lift" upgrade. The simplicity of the Interactive Content Engine Architecture leads to its power and reliability.
Einstein said it best:
"Everything should be made as simple as possible, but no simpler."
Simplicity in design has been a hallmark of our earlier products. These included a random access commercial insertion controller sold by Texscan for fourteen years, a multiuser software menu package bundled by Compupro for twelve years, and automated time delay devices used for time zone realignment of audio and video content by Oceanic Cable, among others, for 25 years. We believe that simplicity extends not only to the design of the device, but to its operation. And not just to its day-to-day operation, but its operation during emergency situations, whether caused by hardware or human error.
We have proven that simplicity of design leads to long term reliability, in spite of low initial cost.
Like Einstein, we also believe that this simplicity should not imply a limitation in functionality, or shortcuts in design. For example, the video server that is the focus of this paper (the Interactive Content Engine, ICE) does not use any "tricks" in the delivery of content. When we talk about the number of streams it can produce, we are talking about point to point, individually controllable streams. If required, the Interactive Content Engine can produce each stream from a different source of content.
"Server" is a poorly defined term in computing. It can imply a single machine, an assembly of machines, or even a software package. So we have used the term "engine" to imply a group of components working together. The Interactive Content Engine is an assembly of identical, small computers that would be considered servers. We don't need all of their capabilities, but it would cost significantly more to build custom products, so our architecture is able to use commodity devices. Fortunately, this not only gives us the ability to produce devices of various stream densities and overall size, but also the ability to accommodate new technologies over time and integrate them in existing systems (no "fork lift upgrades" required).
We have long advocated gigabit ethernet (soon to be 10 Gb/s or faster at commodity pricing) as the most appropriate server interconnect. DVB ASI interconnects, used by the first few generations of video servers, cost about $500 to $700 per 100 megabits of data bandwidth, and required critical timing (500 nanoseconds) even though the signal was frequently retimed downstream. Gigabit ethernet costs about $1 per 100 megabits of bandwidth, and relaxes the timing requirements of other server components by about five orders of magnitude (from 500 ns to 50 ms). The remarkable cost reductions from this alone have caused its widespread adoption, with at least one company (Teleste of Finland) giving us credit.
However, we have taken it a step further than competing products with our patent pending Synchronous Switch Architecture, which allows us to use the same remarkably inexpensive but reliable technology as the interconnect for our engine. We use a carefully managed gigabit ethernet switch as our backplane, and are able to take advantage of its full backplane bandwidth. Our only scaling limit is the largest currently available switch, which is already in the terabit range and increasing. The enormous backplane bandwidth not only enables scaling, but allows us to scale without the necessity of replicating content.
Content replication is required by servers whose access bandwidth to a given title is limited by their architecture. RAM based servers are frequently touted as having sufficient bandwidth that replication is not required. However, electronic memory costs about 100 times as much as hard disk memory, so a server that stores all content in RAM is very expensive. And that expense is a waste, because high bandwidth is only required for periods of peak demand in individual titles. Those parts of a title that no one is watching require zero bandwidth! Typically, demand clusters around the end time of a preceding event. So for example, if the evening news ends at 6:30PM, a cluster of demand would be expected for a popular title centered around that time, requiring a significant amount of bandwidth. However, by 6:45PM, the demand for the very beginning of that title is likely to be significantly less, as most viewers will have moved on and be about 15 minutes into the title. The need for great bandwidth will now be centered about fifteen minutes into the title, and little bandwidth will be required for its beginning.
The Interactive Content Engine uses a caching approach that automatically keeps the most popular moments of a title in RAM, multiplying the effective available bandwidth and avoiding the need for (and expense of) content replication. So it combines the cost advantage of a hard disk based server with the bandwidth advantage of a RAM based server. Interestingly, the advantage of caching, or of RAM based servers, is based on some titles being far more popular than others, which is the usual case. But in other contexts (for example, education or corporate training) it is possible that each stream being produced by the server comes from a different title. For an Interactive Content Engine, this is no problem. The system may be configured with an emphasis on disk bandwidth instead of RAM bandwidth, with no software changes required. In fact, a system can be readily reconfigured in the field if needs should change, just by sliding in more disk storage. And for a given storage capacity, a hard drive based server is physically smaller than a RAM based server.
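The idea of keeping only the most popular moments of a title in RAM can be sketched as a toy popularity cache. This is an illustrative Python sketch under assumed behavior, not the actual ICE algorithm; the block naming and ranking policy here are hypothetical simplifications:

```python
from collections import defaultdict

class BlockCache:
    """Toy popularity cache: hold only the most-requested content
    blocks in a fixed-size RAM set; everything else stays on disk."""

    def __init__(self, ram_blocks):
        self.ram_blocks = ram_blocks      # number of blocks that fit in RAM
        self.hits = defaultdict(int)      # lifetime request count per block
        self.cache = set()                # blocks currently resident in RAM

    def request(self, block):
        self.hits[block] += 1
        served = "RAM" if block in self.cache else "disk"
        # Re-rank so the hottest blocks stay resident in RAM.
        hottest = sorted(self.hits, key=self.hits.get, reverse=True)
        self.cache = set(hottest[:self.ram_blocks])
        return served

# Demand clusters ~15 minutes into a title; those blocks end up in RAM,
# while the rarely watched beginning is served from disk.
cache = BlockCache(ram_blocks=2)
for _ in range(100):
    cache.request("title42:min15")
    cache.request("title42:min16")
print(cache.request("title42:min15"))   # RAM
print(cache.request("title42:min00"))   # disk
```

A real server would rank by recent rather than lifetime demand and cache at much finer granularity, but the effect is the same: bandwidth concentrates on a few hot spots, so only those need RAM speeds.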
From the standpoint of reliability, our Engine requires just one level of redundancy. Each of the RAID arrays in the engine spans multiple individual servers (which we call Storage Processor Nodes, or SPNs). If a drive fails, its contents are regenerated from parity information, and the operator is notified. If an entire SPN fails, each of its drives appears as a single failed drive in a different RAID array. Each array regenerates its contents, and operation continues normally. This also isolates drive failures caused by external factors, unlike a single box RAID approach.
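The single-failure regeneration described above can be illustrated with simple XOR parity. This is a minimal sketch assuming one parity block per stripe; the actual array layout and parity rotation across SPNs are not described in this paper:

```python
def xor_blocks(blocks):
    """XOR equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# A stripe spread across four data nodes plus one parity node.
data = [b"node0001", b"node0002", b"node0003", b"node0004"]
parity = xor_blocks(data)

# Node 2 fails: its block is regenerated from the survivors plus parity,
# because XORing everything except the lost block yields the lost block.
survivors = [data[0], data[1], data[3], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[2]
```

Because each drive in a failed SPN belongs to a different stripe group, an SPN failure looks to each array like a single lost block, which is exactly the case parity can repair.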
For more than a decade, we have also advocated the use of commodity disk drives, especially with the advent of Serial ATA drives, which are intended to be hot-swappable (replaceable during continued normal operation). Other video server manufacturers have typically used SCSI or Fibre Channel drives, which are four to ten times more expensive for the same storage capacity, with marginal bandwidth benefits. Commodity drives are produced in large numbers with very slim profit margins, where warranty failures have a much larger effect on the manufacturer's bottom line. It always seemed to us that this called for the manufacturers to make commodity drives their most reliable product. In fact, recent studies by Google and others have shown that there is no difference in reliability between SATA, IDE, SCSI, and Fibre Channel drives (see storagemojo.com, "Everything You Know About Disks Is Wrong"). Our approach of building RAID arrays across servers leads to even greater reliability than single box RAID solutions.
The Interactive Content Engine is based on a simple, straightforward architecture that requires just two types of commodity components: server computers (carefully chosen to meet our criteria), and gigabit (or beyond) ethernet switches. As a result, the server can be optimized for lowest cost, smallest size, greatest stream density, or any other parameters required by our customers. For example, there is no problem with replacing 3.5" hard drives (lowest cost per storage capacity) with 2.5" hard drives (greatest storage density, low power consumption). In fact, when holographic storage and optical processing become competitive, those components can drop right into the Interactive Content Engine Architecture, or even be added to existing Engines.
Delivering many independent streams of isochronous video, at almost any bit rate, is about the hardest task for a server. Since we can do that, we can deliver just about any form of content or data -- or record it. The number of streams we support, and their composite bandwidth, can be used for input, output, or a mix. In most applications, we are ingesting real time signals and loading content from local and worldwide libraries while delivering many times more output streams. However, with the large composite bandwidth our Engine possesses, it is also possible to simultaneously record hundreds or thousands of cameras and store their content for long periods, or to do a snapshot backup dump of a transaction server to minimize its nightly downtime. (The backup can then be spooled off of the ICE at leisure to another backup medium, if that is required.) This also means that in our primary role, a single Interactive Content Engine can be used to provide Video on Demand, Time Shift Television, Network PVR, and Digital Commercial Insertion -- even individually targeted commercial insertion.
Further presentations on the operation of the Interactive Content Engine may be seen at contentengines.com.
A Comparison of the Motorola B-1 Server to an Interactive Content Engine
Motorola White Paper
While this paper provides a perceptive overview of the future of video services, it is based on some incorrect assumptions, which I'll try to clarify below.
"With 250 channels of broadcasting over a 24-hour period, there is the potential to create 6,000 hours of content every day. To handle this massive content load, the video server at the heart of even medium-sized Time-Shifted TV deployments must be able to ingest, store, and stream video simultaneously and in real-time. Most video servers available today can only perform one of these functions at a time."
Most video servers available today, including our ICE, have no trouble performing all of these functions simultaneously. Not only that, but 250 simultaneous channels, each arriving at real-time speed (loading at 1x real time by definition), are about the equivalent of 250 users of the system watching linearly (the least demanding case). Assuming a 20% peak demand in a 50,000 subscriber cable system (small), the server has to be able to provide up to 10,000 simultaneous streams. The 250 streams of ingest mentioned by Motorola is a small number, made smaller by the fact that each channel has to be ingested only once, but may be watched thousands of times while it is stored on the server! The hard work of compressing the streams and encrypting them if necessary happens before they reach the server, and in many cases they arrive already compressed and encrypted.
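The arithmetic above, spelled out (all inputs are the round numbers stated in the paragraph):

```python
subscribers = 50_000       # a small cable system
peak_fraction = 0.20       # assumed peak concurrent demand
ingest_channels = 250      # broadcast channels being recorded in real time

peak_streams = int(subscribers * peak_fraction)
print(peak_streams)                      # 10000 simultaneous output streams
print(peak_streams / ingest_channels)    # 40.0 -- output dwarfs ingest
```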
"Using trick files [ed: for fast forward and rewind] increases overhead by 30%, adding cost to the equation by forcing providers to install additional racks of disks and power supplies."
"Because the B-1 simply steps through I-frames, trick files are not [Motorola's emphasis] created and overhead is reduced from the 30% commonly seen with disk based servers to less than 1%."
We prefer to use trick files for prerecorded content, because they are so much more pleasant to use (resembling film running fast, rather than a jerky sequence of almost random frames). However, with real time content this is not possible due to time constraints. So for real time content, we use a similar I-Frame approach. We are fully capable of doing this with all of our content, if desired. However, the extra storage for the superior approach costs about 1/100 as much for us as it does for Motorola's RAM based approach, so you can see why they wouldn't want to do it. It is easier to do the I-Frame approach in RAM, but it really isn't a big deal for us to do it.
"Video service providers require video servers that can make the content available for streaming and viewing within five seconds of ingesting it -- another highly complex operation."
Uh, no, we find it easy. Papers written in 1991 through 1996, hinting at our architecture, explicitly mention time delayed viewing (and were later used as prior art in a patent lawsuit between SeaChange and nCube). From the beginning, our architecture has made low latency viewing a priority.
"But when cable operators expanded their service portfolios to include SVOD [subscription video on demand] in 2002, viewing rates increased significantly. At the same time, operators had to extend disk based systems originally designed to support 200 to 300 titles to support libraries that now contained 1000 to 2000 titles. Cable operators were forced to add a significant number of servers and disks to handle the load. Adding Time-Shifted TV to existing MOD, VOD, and SVOD services will only increase the strain on already over-taxed systems."
Cable operators typically compress content to about 3.75 Mb/sec, which requires about 3 GB of storage per movie title (90 minutes or so). 2000 titles require 6 TB of storage, plus redundancy, or about 8 TB for the purposes of argument. Assuming there are 6,000 hours of new real time content every day (truly an outer space number; 1,000 hours per day of unique content would be high), at 2 GB per hour, 12 TB of additional storage is required per day. If we are storing it for twelve days, the total storage capacity required would be about 200 TB with redundancy. With hard disk storage costing about $0.20 per gigabyte, this would add about $40,000 for disks alone. With DRAM based storage, at $20 per gigabyte, this would add about $4,000,000 to the RAM only cost of the server. (These numbers disregard the implementation costs of the additional storage, which for disks are low and for RAM are proprietary and high.)
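The storage and cost figures can be checked directly from the paragraph's own round numbers:

```python
# All inputs are the document's round figures.
library_tb = 2000 * 3 / 1000         # 2000 titles at ~3 GB each -> 6 TB
daily_tb = 6000 * 2 / 1000           # 6,000 hours/day at ~2 GB/hour -> 12 TB/day
window_tb = daily_tb * 12            # twelve days retained -> 144 TB
total_tb = 200                       # ~144 TB + library + redundancy, rounded up

disk_cost = total_tb * 1000 * 0.20   # hard disk at $0.20/GB
ram_cost = total_tb * 1000 * 20      # DRAM at $20/GB

print(library_tb, daily_tb, window_tb)   # 6.0 12.0 144.0
print(int(disk_cost), int(ram_cost))     # 40000 4000000
```

The hundredfold price gap per gigabyte is what turns a modest disk expense into a multimillion dollar RAM bill.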
It is likely that in this scenario, Motorola would actually opt for disk based storage for some of the less popular content, in spite of their claims. With 1,000 hours of new content per day, twelve days would add about 30 TB. Note that for disk storage, this requires as few as 30 disks, which consume far less space than 30 TB of RAM storage. Commodity DRAM typically comes on sticks holding 2 GB; even at 4 GB per stick, it would take 250 sticks per terabyte, plus sockets, support circuitry, and power consumption.
"Disk-based VOD servers handle ultra-high concurrency rates by duplicating content, and require approximately 30% overhead to support concurrent viewing requests."
There are server architectures that require replication of content, and the corresponding overhead as well as much more complex software to anticipate demand. ICE is not one of them. Not requiring replication has always been a hallmark of our design. In fact, the Motorola B-1 requires replication when providing more than 15,000 4 Mb/s streams (adjusted from their claim of 30,000 2 Mb/s streams to reflect actual cable bit rates).
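The adjustment of Motorola's stream count works out as follows, holding their total bandwidth constant and using the round 4 Mb/s figure for actual cable bit rates:

```python
claimed_streams = 30_000     # Motorola's claimed stream count
claimed_rate = 2             # Mb/s per stream in their claim
cable_rate = 4               # Mb/s, closer to actual cable rates

total_gbps = claimed_streams * claimed_rate / 1000        # fixed total bandwidth
adjusted = claimed_streams * claimed_rate // cable_rate   # streams at 4 Mb/s

print(total_gbps)   # 60.0 Gb/s either way
print(adjusted)     # 15000 realistic streams
```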
The Motorola paper was written by a marketing person using data from the mid 1990s. We have the positive attributes they brag about, but at a remarkably lower cost. Our architecture is designed to support the massive storage required by Time Shift TV, while the B-1 maxes out at 1.28 TB of RAM storage (as of 12/07), or about 600 hours. The Motorola unit is either really hard disk based, or remarkably expensive, and may be both. The Interactive Content Engine achieves all of Motorola's goals, but at a far lower cost, without sacrificing reliability.