Documentary data wrangling demystified, Part II: Choosing storage

In this multi-part post, I share how we managed digital media on a feature-length documentary project. In Part I – Securing Your Footage, I covered how to prevent data loss. Let’s continue today with a look at primary storage for a tapeless workflow. Next time, I’ll cover organizing your media for editing in Final Cut Pro X.

Primary storage. On a documentary, it’s typical to shoot for a long time before production wraps. In our film Beyond Naked, two years elapsed from first shot till locked picture (nearly a year in production, and more than a year in post). The amount of resulting footage can easily surpass that of a narrative film. So if you’re making a doc, you’re likely to have some serious data demands.

On my first short film a few years back, I solved storage problems by simply filling up individual external hard drives. Then I’d back them up by mirroring them to other drives. The problem with this approach is that it’s a pain to perform this backup every time you add new media, and so like anything that’s a pain, you end up putting it off. Before you know it, it’s been weeks since you backed up. So when a drive fails, you’re really going to be screwed. And as the old data wrangler saying goes, “it’s not whether a drive will fail, but when.”

So forget primary storage on individual hard drives. As a first line of resistance against data loss, you MUST be able to replace a defective drive without losing anything. That means selecting a RAID. Or something like it. (For a thorough discussion of RAID types as they apply to video editing, check out this post by Larry Jordan.)

Throughout the production of Beyond Naked, we used a Drobo Pro for our primary storage. It uses a proprietary RAID-like system (called “Beyond RAID”) that has some advantages over traditional RAIDs. By default, it’s configured so that if one drive fails, you can replace it without any data loss. You can even configure it to survive two drives failing simultaneously. The tradeoff is that you get less storage space.

Drobos are also hot swappable, so recovering from a drive failure is as simple as pulling the failed drive, and shoving in a new one. Everything will magically pick up right where you left it (although it takes a few hours of read/write time before the recovery is complete). During production, our Drobo experienced a massive failure that prevented it from booting up. It was heart-stopping, but all we had to do was remove all of the hard drives, and send the unit back to Drobo. Even though it was a few months out of warranty, Drobo sent us a replacement Drobo Pro, no questions asked. When it arrived, we very carefully plugged the drives back in, held our breath, and turned the power on. The lights blinked on, red at first, then green. Good to go.

Another benefit of Drobos is that they can handle any type and size of SATA drive you can fit into them, including SATA 3. So as drives get bigger, faster and more affordable, you can replace smaller ones with bigger ones (if you have an older Drobo, a firmware upgrade is required to recognize drives larger than 2TB). We started with 4TB of storage on our Drobo when we purchased it in 2009, and today it’s grown to 16TB. We’ve still got plenty of room to expand.

But there is a major downside to using the Drobo Pro: it’s slow. Really slow. Although it is advertised as connecting at up to 100MB/sec via gigabit Ethernet, I have found this connection to be totally unstable. Using it invariably causes my 2011 iMac (as well as my previous MacBook Pro) to freeze and require a force-reboot. Repeated support tickets to Drobo have yet to resolve the issue.

So for Drobo Pro, Firewire 800 is the fastest connection I can count on. Here’s what that means on my editing suite:

Unfortunately, that’s nowhere near fast enough to edit HD video. As a safe place for simply storing media, though, it works fine.

If I were purchasing primary storage today, I would stay far away from any system that didn’t support a Thunderbolt connection. If you want a recommendation, I would heartily recommend the Promise Pegasus R6, after the great experience we’ve had with our Pegasus R4 (which I’ll elaborate on shortly).

Storage for Editing. By coincidence, the day we wrapped principal photography on our film was the same day that Apple released Final Cut Pro X. As a frustrated Final Cut 7 user, I made a snap decision to switch. I don’t regret that decision for a minute, despite all the venom that oozed from the professional editing community. FCPX has given me (an artist, not an engineer) god-like powers to skim through mountains of data, and has put wings on my editing. Still, there were challenges.

It turns out that in order to live up to its billing, FCPX demands fast everything (a newer computer, a Thunderbolt connection, and zippy drives). We couldn’t afford to purchase a Thunderbolt RAID that would hold all 4.5 terabytes of original media for the film, but we did scrape together enough cash from our remaining Kickstarter funds to purchase a 4TB RAID.

Calculating storage space. This raised a question: If we used FCPX to create proxy media for us, would the entire film fit on a 4TB RAID with enough free space left over for editing?

You guessed it, there’s an app for that. The best one is called Katadata. AJA Datacalc also works, but with a more rudimentary UI. The beauty of Katadata is that you can select your camera type, and options narrow immediately to those supported by your camera, making the menus much easier to navigate. The app also automatically adds multiple shoot info, and offers the option to email the results of your calculation. Katadata costs $4.99 in the App Store.

We calculated that after converting to proxy, we could fit the entire project into about 2.5 terabytes of proxy footage. So with the last of our Kickstarter funds, we purchased a Promise Pegasus R4 (Promise was the only Thunderbolt drive manufacturer actually shipping drives at the time) and set about the task of importing and organizing our footage for editing within FCPX. I’ll cover how we handled that task in Part III of this post.
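If you’d rather rough out the numbers yourself before reaching for an app, here is a minimal sketch of the arithmetic. It is not how Katadata works internally; the function names, the 40 Mbit/s bitrate and the 10-hour figure are all made up for illustration, and the default of keeping a quarter of the drive free is just my own headroom rule of thumb. Only the 2.5TB, 4.5TB and 4TB figures come from our project.

```python
def estimate_size_tb(hours: float, mbit_per_sec: float) -> float:
    """Approximate footage size in decimal terabytes for a duration and bitrate."""
    seconds = hours * 3600
    total_bytes = seconds * mbit_per_sec * 1_000_000 / 8  # bits -> bytes
    return total_bytes / 1e12


def fits_with_headroom(media_tb: float, raid_tb: float, max_fill: float = 0.75) -> bool:
    """True if the media occupies no more than max_fill of the RAID's capacity."""
    return media_tb <= raid_tb * max_fill


# Hypothetical example: 10 hours shot at an assumed 40 Mbit/s.
print(round(estimate_size_tb(10, 40), 2), "TB")

# Figures from our project: proxy media vs. original media on a 4TB RAID.
print(fits_with_headroom(2.5, 4.0))  # proxy media
print(fits_with_headroom(4.5, 4.0))  # original media
```

With our numbers, the proxy media fits with room to spare, while the original media would not fit at all.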

The Pegasus R4 turned out to be amazing for us. Just take a look at the performance we’re getting with ours today, with the project finished editing, and the drive way more than 3/4 full:

An important aside: avoid letting your drives grow beyond 3/4 full for best editing performance. Hard drives require some headroom in order to perform at their peak. Things can really slow down as a drive approaches getting full, so just be keenly aware of that and never let your editing storage drives get too full.
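If you want something to nag you before a drive crosses that line, a few lines of Python will do it. This is only a sketch: the 0.75 limit is my rule of thumb from above, the function names are my own, and you would swap in the mount point of your own editing drive (on a Mac, something under /Volumes) for the "." used here.

```python
import shutil


def fill_fraction(path: str) -> float:
    """Fraction of the volume containing `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def is_too_full(path: str, limit: float = 0.75) -> bool:
    """True if the volume is fuller than `limit` (0.75 means three-quarters)."""
    return fill_fraction(path) > limit


# "." checks whatever volume the script runs from; replace with your drive's path.
volume = "."
print(f"{fill_fraction(volume):.0%} full, over the limit: {is_too_full(volume)}")
```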

Another Pegasus R4 plus is that it’s relatively small, about the size of a toaster. At least, as long as this drive was parked on my desk, it seemed small. But after a month of daily packing it up and transporting it to my co-editor’s place for work, it began to seem rather large. The cardboard box that it shipped in began to fall apart, and I searched in vain for a Pelican case that would neatly contain it for secure travel. I looked everywhere, and discovered they don’t make one.

As an aside, if you’re thinking “I’ll just use my internal hard drive for editing storage,” that probably won’t work. On all but the newest computers with SSDs or Apple’s new Fusion drives, an internal hard drive is slower than external storage on a fast connection like eSATA or Thunderbolt. You’re better off using external storage for another reason, too: inevitably at some stage of the film, you will need to take your media with you (to a colorist like John Davidson or an audio mixing facility, for example). However, if you have a 2012 iMac, you might want to forget everything I just said and use your screaming fast Fusion drive to edit on, as long as you’re backing up regularly.

Fast, portable storage. As we got deeper and deeper into editing, Lisa and I needed to share files more and more often. We opted for LaCie 2big 4TB Thunderbolt drives. The most economical place to purchase them, we found, is MacMall, which routinely sells refurbished ones for under $300. We got two of them. We were initially reluctant to buy refurbished drives, but our budget constraints forced our hand, and they have performed flawlessly. You can fit one into a Pelican 1400, with room for the Thunderbolt cable stowed in the top lid under the foam, with the power cable off to the side below.

Because we wanted maximum performance, we left them at RAID level 0, their default, which yields this:

At RAID 0, if one of the disks inside the enclosure fails, you lose everything on the unit. But as long as you remember this is editing storage – not primary storage – it’s less scary. If something fails, you don’t lose footage – just the work you’ve done in editing it. So of course, regular backups are absolutely critical. It would be far more secure to set these drives to RAID level 1, but this results in significantly slower editing response times in FCPX. And I’ve found that if I have to wait for my hard drives when I’m editing, I get distracted, and pulled out of “the zone.” So we committed to RAID 0 and to making regular backups.
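To make the tradeoff concrete, here is a small sketch of what the two RAID levels mean for a two-disk enclosure. I’m assuming the 4TB unit is built from two 2TB disks, and the function names are my own. It only covers RAID 0 (striping) and RAID 1 (mirroring), and it says nothing about speed, which is the whole reason we chose RAID 0.

```python
def usable_tb(disk_tb: float, disk_count: int, level: int) -> float:
    """Usable capacity for RAID 0 (striping) or RAID 1 (mirroring)."""
    if level == 0:
        return disk_tb * disk_count  # every disk contributes capacity
    if level == 1:
        return disk_tb  # every disk holds the same full copy
    raise ValueError("this sketch only covers RAID levels 0 and 1")


def disk_failures_survived(disk_count: int, level: int) -> int:
    """How many disks can fail before data is lost."""
    if level == 0:
        return 0  # any single failure destroys the stripe set
    if level == 1:
        return disk_count - 1  # one surviving mirror is enough
    raise ValueError("this sketch only covers RAID levels 0 and 1")


# Assumed layout: two 2TB disks in one enclosure.
for level in (0, 1):
    print(f"RAID {level}: {usable_tb(2.0, 2, level)} TB usable, "
          f"survives {disk_failures_survived(2, level)} disk failure(s)")
```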

Storing in more than one location. Having all the original media stored on the Drobo meant our footage was safe if one hard drive failed. But what if the house caught on fire? It’s a common-sense good idea to have your film media in two places just in case. We couldn’t afford to buy a second Drobo. So we simply filled up a bunch of older 1TB and 2TB LaCie drives from our previous project. We copied all our original media onto these, and put them in a box in Lisa’s closet. Offsite backup – check.
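Before a box of drives goes into a closet, it’s worth confirming the copies are actually complete. We didn’t use a script for this, so treat the following as one possible approach rather than our workflow: it walks a source folder, and reports any file that is missing from the backup or whose SHA-256 checksum differs. The function names are my own, and the folder paths you pass in are up to you.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """SHA-256 of a file, read in chunks so large clips don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_copies(source_dir: Path, backup_dir: Path) -> list[str]:
    """Relative paths of source files that are missing or different in the backup."""
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_dir)
        dst = backup_dir / relative
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(str(relative))
    return problems
```

An empty list back from `find_bad_copies` means every file in the source folder has an identical twin in the backup. Hashing terabytes of footage takes a while, so this is a job to leave running overnight.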

Is this archival? No, but… for our purpose, it doesn’t need to be. Studies show that disconnected hard drives in storage lose about 1 percent of their data-holding capacity per year. But experts say that you can safely store data for 2-5 years without danger. So as a good, short-term solution, it works. Just don’t put them in a drawer and forget about them, or you may be sorry in 10 or 15 years.

Ultra-portable storage. One final note about storage. When I was in the UK over the holidays last year, I took the film (proxy only) with me on a single LaCie 2big, intending to do a first-pass sound mix and color correction on my MacBook Pro. I realized after I arrived that I had forgotten about 1/4 of the files! Altogether they totaled about 600 gigs. Way too much to transfer over the sketchy internet connection at the place I was staying.

So Lisa, back home in Seattle, simply loaded the missing files onto a USB 3 2TB WD My Passport for Mac portable external hard drive. It is so tiny it fit into a DHL envelope. The cost of shipping from Seattle to England ($100) was almost as much as the cost of the drive ($150).
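For a sense of why shipping a drive beat the internet, here is the back-of-the-envelope math. I never measured the connection, so the 5 Mbit/s figure below is purely an assumed speed for illustration; the 600GB is the real size of the missing files.

```python
def transfer_hours(size_gb: float, mbit_per_sec: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes) at a sustained link speed."""
    total_bits = size_gb * 1e9 * 8
    seconds = total_bits / (mbit_per_sec * 1e6)
    return seconds / 3600


# 600GB of missing files over an assumed (not measured) 5 Mbit/s connection.
hours = transfer_hours(600, 5)
print(f"{hours:.0f} hours, or about {hours / 24:.1f} days")
```

At that assumed speed the transfer would run for more than a week and a half without a single interruption, which makes a courier envelope look fast.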

WD claims the drive is “ultra fast,” but even on my MacBook Pro 2012, with USB 3, it clocks modestly compared to Thunderbolt:

We will be tasking this drive with doing on-set file backups on many of our shoots in the coming year.

OK, so we’ve got storage covered. Next question: How does one organize media for feature-length documentary film editing? I’ll address that in my next post.

9 thoughts on “Documentary data wrangling demystified, Part II: Choosing storage”

  1. Matt Cadwallader

    Great overview. The new-ish Drobo 5D seems to be similar to the Drobo Pro, except it has a Thunderbolt connection. I’m looking into a storage solution right now, and while I’ve considered the R6, I’d prefer not to have to fuss with RAID configurations. Would you still recommend the Pegasus with the 5D now available?

    1. Dan McComb

      Hi Matt,

      I’m a little gun shy about Drobo after the bad experience I had, so if it were me, I would go with the Pegasus. It totally rocks. And I’ve never had to do any RAID configuration on it besides turning it on. It does take a few hours to configure itself initially, but without you having to do anything other than accept defaults, which work fast and flawlessly.

  2. Marcel Beck

    Hello Dan,

    Would you recommend the Pelican for airline travel? I am shooting a documentary in 3 different countries, which means my gear and I are going on a long road trip.

    Do you think I could get a Pelican case big enough for the Pegasus R4?


    1. Dan McComb

      If you’re going to be traveling, I would get something more compact than the R4. It’s shaped like a toaster and will not fit neatly into any current Pelican cases. When I’m traveling, I take a LaCie 2big 4TB Thunderbolt hard drive. These are now shipping in 4-, 6- and 8-TB configurations. The drive fits neatly and compactly into a Pelican 1400 case. If you aren’t worried about editing and just need to store the data in transit, I would take something even smaller. A couple of USB 3 drives (one for backing up the other) in a compact enclosure should do the trick, such as this one:

  3. Michele

    Hi Dan,

    I have read this a few times now. I still need some advice, however. I am deciding between two external hard drives for a feature-length doc: the Promise Technology 8TB Pegasus R4 RAID Storage with Thunderbolt (4x 2TB) or the LaCie 10TB 5big Thunderbolt Series 5-Bay RAID. I have about 4 TB of original data for a doc I’ve been working on for 2-3 years, and I really want to make the best decision about storage and editing. The LaCie is cheaper and has more space, but would those features be eclipsed by the performance of the Pegasus? Thanks!

