Commentary & Analysis
Where is that file when I need it?
By WhatTheyThink Staff
Published: August 23, 2004
By Pat Taylor, Proactive Technologies
August 23, 2004 -- Harley’s prepress department has plenty of Macs and a network switch that connects the creative group to the file server. The file server is a Mac with half a gig of RAM and an aging RAID attached like a sidecar to the faded Apple engine. I can tell by looking at the cable connection that the RAID is older than my youngest son.
“Got all my production data on that RAID right there,” boasts Harley, “and I back it up every week. That RAID of mine holds about 70 Gigs and we keep it pretty full. So when we finish making plates for a job, we copy the folder onto a CD and delete it to make RAID space for the next job.”
He steps back and, with his left hand on his hip, uses his right arm to sweep across the wall of CDs that represent his company’s job archive. An entire wall is covered with CDs in jewel cases lined up like books on library shelves. Each case is numbered, and that number is recorded by hand in a journal (just like the one in which Grandpa kept his accounting records). Next to each number in the journal are the details about the job data stored on that CD; it looks like Harley’s version of the Dewey Decimal System.
“Pretty slick, huh?” crows Harley. “Every job we’ve done for the last eight years...”
“Do you pull up old jobs very often?”
“Oh, yeah. Almost every day.”
His answer echoes distant in my ears; I am lost for a moment in the dusty flophouse of data. There are literally thousands of CDs stacked against the wall, behind which is a warehouse of paper job bags scattered willy-nilly in shopping carts, on shelves, and in boxes. I feel like I have stumbled into the final scene of that Indiana Jones movie where the Ark of the Covenant was stored away for safekeeping in a warehouse just like this one, never to be found again.
“Ever had any trouble finding a file, Harley?” I ask.
“Well, sometimes it takes a while to finger through this journal, and different operators have different ways of listing a CD’s contents. I remember lookin’ for a job for a half-day once before we gave up and reworked it. That don’t happen often; most of the time, we can get the job back in a couple hours.”
“And you can wait a couple hours for a file?”
“Not sure we have a choice. This is the way we’ve been doin’ it for years.”
*****
Systems for storing the job and image data that is central to a printing business are quickly becoming the cornerstone of a company’s digital infrastructure. It is not unusual to find a prepress department with eight operators using up to a terabyte of disk storage. The cost and complexity of storage technologies make purchasing decisions difficult, and we often take the same path as Harley: ‘going with what you know’ and getting left in the past.
The good news is that -- in spite of the wide variety of new data storage technologies and all those confusing acronyms -- storage is becoming more affordable and easier to manage than in the past. The most important considerations when purchasing storage for your printing operation today are (a) ensuring that the cost of storage ‘real estate’ is in line with the value of the data stored on that device, and (b) streamlining or automating the movement or migration of files from one storage medium to another.
Less Expensive RAIDs
The core storage technology for any of us in the printing and publishing business is RAID: Redundant Array of Independent Disks (or Inexpensive Disks, depending on your source). RAID storage is fault-tolerant, meaning that you can experience a single disk drive failure without losing the data on the array. While RAID has been a mainstay in our industry for more than a decade, it has historically been constructed using the highest quality (i.e., most expensive) disk drives available.
Recent developments in RAID technology allow us to use less expensive disk drives to create fault-tolerant arrays. These RAIDs utilize the IDE disk drives typically used in desktop PCs. IDE disk drives have a 1- or 3-year warranty (as opposed to 5-year warranties on high-end SCSI and Fibre Channel disk drives), but the resulting decrease in cost-per-gigabyte of storage compels us to find a place for this low-cost alternative to high-dollar storage.
IDE RAIDs are now deployed in a number of useful roles in prepress and printing. Using IDE RAIDs as ‘targets’ for mirroring production data stored on the high-end production RAIDs (an application referred to as Disk-to-Disk Backup) provides much quicker restoration of data than traditional tape backup. Additionally, they can be attached to the LAN (Local Area Network) as an archive repository. When a job is completed, the prepress managers can literally drag-and-drop the job folder from the Production server onto the Archive server. There is no administrative burden (such as burning CDs, journaling their content, maintaining a library, etc.). Retrieving a completed job is as simple as going to the Chooser on any Mac on the LAN and selecting ‘Archive’ to view a file system identical to the one used on the production side of the business.
If CD or DVD is the medium of choice for storing old job data, all those discs can be loaded into a jukebox and attached to the network, providing access to job information to anyone on the LAN. There are storage management systems that will search, define, and catalog the jobs on the CDs in your jukebox.
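
For shops that script parts of their own workflow, the archive-and-catalog idea can be illustrated in a few lines of Python. This is only a rough sketch and not how any particular storage management product works: the function names (archive_job, load_catalog, find_jobs), the JSON catalog file, and the folder layout are all hypothetical, chosen for illustration. It moves a finished job folder from production storage to an archive location and records the folder's file names so operators can search by keyword instead of paging through a journal.

```python
import json
import shutil
from pathlib import Path


def load_catalog(catalog_path):
    """Return the list of catalog entries, or an empty list if none exist yet."""
    catalog_path = Path(catalog_path)
    if catalog_path.exists():
        return json.loads(catalog_path.read_text())
    return []


def archive_job(job_dir, archive_root, catalog_path):
    """Move a finished job folder to the archive and record it in the catalog.

    The catalog format (a JSON list of entries) is an assumption made for
    this sketch, not a standard.
    """
    job_dir = Path(job_dir)
    archive_root = Path(archive_root)
    catalog_path = Path(catalog_path)

    archive_root.mkdir(parents=True, exist_ok=True)
    destination = archive_root / job_dir.name
    if destination.exists():
        raise FileExistsError(f"{destination} is already in the archive")

    # Record file names relative to the job folder before moving it.
    files = sorted(
        p.relative_to(job_dir).as_posix()
        for p in job_dir.rglob("*")
        if p.is_file()
    )
    shutil.move(str(job_dir), str(destination))

    entry = {"job": job_dir.name, "location": str(destination), "files": files}
    catalog = load_catalog(catalog_path)
    catalog.append(entry)
    catalog_path.write_text(json.dumps(catalog, indent=2))
    return entry


def find_jobs(catalog_path, term):
    """Return catalog entries whose job name or file names contain the term."""
    term = term.lower()
    return [
        entry
        for entry in load_catalog(catalog_path)
        if term in entry["job"].lower()
        or any(term in name.lower() for name in entry["files"])
    ]
```

An operator looking for an old cover image might then call find_jobs("catalog.json", "cover") and get back the job name and archive location of every matching folder. A real deployment would also need to handle concurrent users and verify the copy before removing the original, which this sketch leaves out.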
Fault-tolerant design
The idea is to design a data storage system that is fault-tolerant (in order to provide protection for these job and image assets) and price-balanced for the data lifecycle specific to your operation. You need an appropriate amount of high-quality storage (SCSI and Fibre Channel RAIDs) for production data. This primary storage should be supplemented with less expensive “near-line” storage subsystems (IDE RAIDs, CD/DVD jukeboxes, etc.) to house data accessed less frequently.
You will realize a greater return on your investment in storage by deploying systems that automate the migration of job folders and files from high-dollar ‘real estate’ to low-cost ‘warehouse space’ (IDE RAIDs) when the job is off the press and out the door.
To complete the storage component of your digital infrastructure, you need to make sure that there is (a) a tested plan in place for backing up data without impeding production and (b) a disaster recovery plan that provides for the immediate restoration of production data needed to continue business-as-usual.
And what was Harley able to do with his mess? Well, we racked all his CDs up into a jukebox and cataloged the job folders with a Mac-based asset management system. Now operators pull old jobs from archives in minutes instead of hours, and Harley hasn’t had to finger through that journal once in the last three months. The next step will be to digitize all those paper job bags in the back -- but that’s another story.