Size matters

We are starting to get into detailed discussions about data and content management strategy across the business, and what is unbelievable is our capability to produce vast amounts of data. As Moore's law has exponentially increased processing power, the net result has been bigger and bigger files, and more and more of them.
Not only that, but the traditional barriers that regulated the creation of new content (mostly film stock and processing costs) have all but disappeared. For example, the number of photographs that we take to record one of the events we work on has probably increased five-fold. And the reality is that storing content digitally is substantially more expensive than the old way of storing reels or sheets of celluloid in tins: up to 12 times more expensive, according to research by AMPAS from 2008.
At present, on average, our London office generates around 150 GB of data every week. That's from about 200 people, and excludes most of the moving image material. To support that growth we have around 12 TB of high-availability, high-resilience NAS storage, a few tens of TB of nearline archive, and about 90 TB of offline storage (which is where most of our moving image archive resides). The offline storage costs about £100/TB as a one-off cost and the NAS £30,000 per year (for our external costs). If you include all of the surrounding costs of staff, power and so on, that figure probably rests at about £100,000. By comparison, I could get 16 TB of storage from Google each year for about US$4,000, or from Amazon for about US$2,500.
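To make the comparison concrete, here is a rough back-of-envelope sketch in Python using only the figures quoted above. It assumes that 1 TB is 1,000 GB, that there are 52 working weeks in a year, and that the £100,000 figure is the all-in annual cost of the NAS; the variable and function names are mine. Pounds and dollars are deliberately kept separate rather than converted at a guessed exchange rate.

```python
# Back-of-envelope storage costs, using the figures quoted in the post.
# Assumptions (not from the post): 1 TB = 1000 GB, 52 weeks per year,
# and the 100,000 figure read as the all-in annual cost of the NAS.

NAS_CAPACITY_TB = 12
NAS_EXTERNAL_GBP_PER_YEAR = 30_000
NAS_ALL_IN_GBP_PER_YEAR = 100_000

CLOUD_CAPACITY_TB = 16
GOOGLE_USD_PER_YEAR = 4_000
AMAZON_USD_PER_YEAR = 2_500

WEEKLY_GROWTH_GB = 150


def cost_per_tb(total_cost, capacity_tb):
    """Annual cost per terabyte, in whatever currency total_cost is in."""
    return total_cost / capacity_tb


nas_external = cost_per_tb(NAS_EXTERNAL_GBP_PER_YEAR, NAS_CAPACITY_TB)
nas_all_in = cost_per_tb(NAS_ALL_IN_GBP_PER_YEAR, NAS_CAPACITY_TB)
google = cost_per_tb(GOOGLE_USD_PER_YEAR, CLOUD_CAPACITY_TB)
amazon = cost_per_tb(AMAZON_USD_PER_YEAR, CLOUD_CAPACITY_TB)

# New data generated per year, excluding most moving image material.
annual_growth_tb = WEEKLY_GROWTH_GB * 52 / 1000

print(f"NAS, external costs only: GBP {nas_external:,.0f} per TB per year")
print(f"NAS, all-in:              GBP {nas_all_in:,.0f} per TB per year")
print(f"Google:                   USD {google:,.2f} per TB per year")
print(f"Amazon:                   USD {amazon:,.2f} per TB per year")
print(f"Annual growth:            {annual_growth_tb:.1f} TB")
```

Even before any currency conversion, the per-terabyte gap between the NAS and the hosted options is roughly an order of magnitude, although the two are not like-for-like services.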
It seems that the business has gone through cycles about every three to four years where we run out of space, panic a bit, and then invest in a new technology platform that at the time appears to solve the problem forever. Except that in about three years' time it's full to bursting. This is M25-theory. If you build more capacity without regulation, you generate more demand that eventually creates a feedback loop of uncontrollable consumption. Compare that to the M6 Toll Road… still, to this day, a pleasurable driving experience because it is empty (or is it because it speeds you past Birmingham?).
A technological approach to the problem would be to find ways to get more space more cheaply. That's certainly something we will do, and the economics of storage now seem to point to the need, above everything else, to invest in bandwidth to get to cheap storage.
However, in isolation, that's the equivalent of just adding a new lane to an over-busy highway. Tolls are required to regulate demand, and by increasing the use of pay-per-use infrastructure services, cost scales with usage in a way that is equitable. On-premises models of charging for services are inherently nonsensical because the lumps of cost that a firm had to endure to introduce or enhance a service were usually massive multiples of the charges doled out to consuming business units. If one department decided they didn't want to use a particular service, that cost needed to be reallocated across the remaining consumers. Cost actually had no relation to usage.
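The difference between the two charging models can be sketched in a few lines of Python. Everything here is illustrative: the department names, their usage, the equal-split allocation rule and the £250/TB rate are hypothetical, and only the £100,000 fixed cost echoes a figure from this post.

```python
# Illustrative comparison of two charging models.
# Department names, usage figures, the equal-split rule and the
# per-TB rate are all hypothetical, chosen only to show the effect.

FIXED_SERVICE_COST = 100_000   # lump cost of running the service per year
RATE_PER_TB = 250              # hypothetical pay-per-use rate per TB per year


def fixed_allocation(total_cost, departments):
    """Split a fixed cost equally across whoever still uses the service."""
    share = total_cost / len(departments)
    return {dept: share for dept in departments}


def pay_per_use(rate_per_tb, usage_tb):
    """Charge each department only for the terabytes it consumes."""
    return {dept: rate_per_tb * tb for dept, tb in usage_tb.items()}


usage = {"events": 6, "film": 4, "finance": 2}   # TB used, hypothetical

before_fixed = fixed_allocation(FIXED_SERVICE_COST, list(usage))
before_ppu = pay_per_use(RATE_PER_TB, usage)

# Finance decides it no longer wants the service.
remaining = {dept: tb for dept, tb in usage.items() if dept != "finance"}

after_fixed = fixed_allocation(FIXED_SERVICE_COST, list(remaining))
after_ppu = pay_per_use(RATE_PER_TB, remaining)

print("Fixed allocation, events before/after:",
      round(before_fixed["events"]), round(after_fixed["events"]))
print("Pay-per-use, events before/after:",
      before_ppu["events"], after_ppu["events"])
```

Under the fixed model the remaining departments' bills jump when one consumer leaves, even though their own usage has not changed; under pay-per-use their bills stay where they were.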
Moving to scalable charging models alone, however, is unlikely to lead to the desired change in behaviour. Often such models just end up with profit centres scrimping and saving and having inappropriate services as a result, and cost centres hoovering up resource “because that's the cost of our operation” (if you want evidence of that, ask a CFO to quantify the business benefit of his ERP system…).
The changes that we are going to need to make are to introduce more rigorous processes of librarianship. Whilst at a consumer level data volume issues are seemingly solved, at a professional level the boundaries of affordable data storage are constantly being pushed. 4,000-line video from a Red camera doesn't upload to YouTube too well. We have to be able to make decisions about what is worthy of our archive, and also what is the swarf that has been generated along the way and should now be discarded. Software will help us – and a push for 2011 will be content and asset management – but our ability to make valued decisions about what is worth storing is crucial too.
