Drobo Saga: Part 2
Saturday, January 31, 2009 at 03:45PM
[Update: read the resolution to this story.]
This is an update to a previous post. Before you read this, make sure you've read Part One of this story.
After I posted my last, I moved the Drobo over to my Mac mini, so that I could leave it churning as long as it wanted. In the meantime, I received email from Ralph Herbst, the Director of Customer Service at Data Robotics, asking to set up a call for the next day.
Literally 20 minutes before the call, my Drobo volumes reappeared on the mini. I had spent the intervening waking hours reconnecting my Aperture library to an rsync backup of the referenced masters that were on the Drobo. Thanks to my backup regimen, I did not lose a single one of the 20,000 photos that were on the device.
So we had the call. On the line were myself, Ralph Herbst, Tom Loverro (Director of Product Marketing) and Valorie Koch (Technical Support Mgr.). The DR people were very apologetic over the way the technical support incident was handled. In particular, they made it clear that literally everything I had been told by DR tech support was wrong, which tells you a story in itself. My disconnection procedure was 100% correct: unmount the drives, then pull the FireWire cable.
Based on some information that I had (quickly) gathered since the device came back online, we eventually got round to surmising that the problem was a corruption of the HFS+ directory structure on one of the volumes. My Drobo is partitioned into two volumes: 1.5TB for Time Machine and the rest of the virtual 16TB volume for data that I don't keep on my MacBook Pro. Fortunately, the Time Machine volume was the one that was damaged. As I write this, I'm recovering the last of my data from the other partition. That data seems to be OK.
The lingering question, though, is how that volume came to be damaged. I suppose that's something that we'll never know. I can't say that this absolves the Drobo of blame, nor that it conclusively proves that the Drobo itself is somehow defective. It's a known unknown and I guess it will probably stay that way.
I do have one suspicion, though: I've noticed that when your machine goes to sleep for an extended period, the Drobo also goes to sleep. When the computer wakes up, it does so a lot faster than the Drobo and, as a result, Mac OS X throws up the "Device Disconnected" dialog that you see if you pull a USB drive too soon. That's not usually a good thing.
Data Robotics have offered to replace my Drobo unit as a precaution. As I said in my last post, I'll probably continue to use the Drobo in some capacity but those masters are going on a backed-up 1TB eSATA drive from now on.
Let me leave you with this thought for now: music and videos can be replaced with the application of time and/or money. You will never be able to recreate a lost photograph of the day your child was born. Back them up. Twice.
Reader Comments (10)
I am glad to hear your pictures were all recovered. In the end, that is what matters most. I wonder if the corruption had anything to do with Time Machine itself. It seems to be a fairly flaky technology. When it first came out, it was specifically recommended not to be used in conjunction with Aperture. That allegedly got fixed, but it does not inspire confidence. I have 2 Drobo units at work right now. Both are fairly new, and neither of them has been put to that much use. I hope they remain stable.
Hope your pics remain safe and your sanity is left intact! cheers
I am glad to hear the update Fraser. We will speak to you again I believe Monday. We apologize about the cr@p tech support you got the first time around.
Just as a reminder and shout out to all folks everywhere, I [heart] Drobo as much as anyone and love its safety, expandability and awesome ease of use but do remember that Drobo can't prevent viruses, file system corruption, earthquakes, fires, floods, three year olds etc. so having an offsite strategy is part and parcel of best practices. I highly recommend Peter Krogh's DAM Book (Digital Asset Management) as a bible for anyone (especially photographers) in the audience. I personally also use a free piece of software from Crashplan.com to sync my data offsite (especially my Aperture 2 library) once a day for free (I did the initial sync locally and do incrementals over the internet via Crashplan). I live in earthquake-prone SF! :-)
Best Regards,
-Tom
Hi Fraser, glad to hear things got solved.
While I commend DR's response, the cynic in me can't help but wonder if they would have been as eager to help if you weren't a well known developer and much-read voice in the Mac community...
i really don't see why RAID5 or drobo raid (which is a sort of raid5) is a good backup solution.
pull out a disk from a raid5 and it's UNREADABLE !!!! pull out a disk from a drobo and it's UNREADABLE!!!!!
raid was not invented for backup but for high availability data storage service....if a drive fails, the volume still works....THAT'S IT !!!!!!!
i'd rather use raid1 where every drive remains readable in case of enclosure failure.
raid0 work drive + On site RAID1 backup + offsite raid1 backup = seems a good solution.
one more thing: drives left spinning all day have a high failure rate.
so leave backup drives unplugged between backups
> You will never be able to recreate a lost photograph of the day
> your child was born. Back them up. Twice.
And then, periodically, do a test restore to check the integrity of your backups. Multiple backups are no more use than a chocolate fireguard if you can't restore from even one of them. GIGO still applies.
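A test restore can be checked mechanically rather than by eye. What follows is only a minimal sketch, assuming the backup has been restored into a scratch folder and the originals are still readable; the function names `file_digest` and `compare_trees` are made up for illustration and do not belong to any backup tool mentioned here. It compares SHA-256 checksums file by file and reports anything missing or different.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(original: Path, restored: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored tree.

    Hypothetical helper: it only looks at files under `original`, so extra
    files in `restored` are not reported.
    """
    problems = []
    for src in sorted(original.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original)
        dst = restored / rel
        if not dst.is_file():
            problems.append(f"missing: {rel}")
        elif file_digest(src) != file_digest(dst):
            problems.append(f"differs: {rel}")
    return problems
```

Calling `compare_trees(Path("masters"), Path("restore-test"))` (both folder names are placeholders) returns an empty list when every file came back intact. It cannot tell you whether the originals were already damaged when they were backed up, so GIGO still applies.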
Here's an interesting talk about SoftRAID. The forthcoming version 4 sounds interesting.
http://tinyurl.com/bjce48
> HFS+ is a tremendously crappy filesystem, as a matter of fact it’s one of the
> most horrific bits of OS X. Instability in HFS+ doesn’t surprise me at all. No
> amount of stability in the storage will save you if the filesystem is unreliable
> and crufty.
Sorry, but I disagree. HFS+ is a long way from being "tremendously crappy". It certainly isn't "horrific", "unreliable" or "crufty", particularly compared to the other common filesystems in use today. It is an unusual design, but that doesn't in and of itself make it worse than its contemporaries, and it actually allows it to provide some useful features (very fast whole-filesystem enumeration, for instance; on other systems you typically have to do a recursive directory scan whereas on HFS+ you can linearly scan the Catalog File).
I'd actually go so far as to say that HFS+ is rather a good filesystem in many respects. Even more so given the age of the core parts of its design, much of which dates back to the original 1985 version of HFS.
If I were going to guess where the blame lies here, I'd point my finger squarely at Drobo. It seems to me that it's far too clever for its own good, and that leads to a significant risk of data loss if it misunderstands the current state of the disk in any way. I'd love to know how much testing they've done and exactly how they did it. I already suspect it wasn't enough, because they went and claimed that Drobo worked with "all your favourite utilities", and it clearly doesn't work with iPartition… we've had a number of customers try to repartition Drobo units and end up with all their data trashed because Drobo didn't know what was going on.
I started using Drobo as my primary drive for media, music and photos.
1st unit arrived with no working fans, which I did not know because I had not seen one before but after seeing it run hot I got it replaced. 2nd unit fans worked but then failed, melting case and damaging a drive (heat). 3rd unit has been fine and I lost no data. Tech support was great when I finally got escalated.
Drobo is no longer primary storage for me, I keep it outside any cabinet (or I would use external fan as safety) and I power off when I go on vacation. Media now lives on a mirrored drive and gets backed up to Drobo.
by the way: the upcoming version of softraid (softraid 4) is awesome.
i'm sorry but pleased drobo users, or pleased users of any RAID enclosure under $2000, are only the users who have had nothing worse than a single drive failure.
with drobo or raid5 if 2 disks fail: 100% loss.
with drobo or raid5 if the enclosure fails: you have to wait for a brand new enclosure to read your drives.
best backup solution: on site raid1 backup + offsite raid1 backup.
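The two-disk claim above can be illustrated with the XOR parity that plain single-parity RAID 5 relies on. This is a toy sketch under stated assumptions: one stripe, four hypothetical disks, made-up block contents and helper names. Drobo's internal layout is proprietary and is not modelled here.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes]) -> bytes:
    """Rebuild the one missing block of a stripe from every surviving block,
    data and parity alike. Only valid when exactly one block is missing."""
    return xor_blocks(surviving)


# One stripe across a hypothetical four-disk array: three data blocks plus parity.
data = [b"IMG1", b"IMG2", b"IMG3"]
parity = xor_blocks(data)

# One disk lost: XOR of the survivors gives back the missing block.
recovered = rebuild_missing([data[0], data[2], parity])
print(recovered == data[1])  # prints True

# Two disks lost: the survivors only yield the XOR of the two missing blocks,
# which is not enough information to separate them again.
leftover = xor_blocks([data[0], parity])
print(leftover == xor_blocks([data[1], data[2]]))  # prints True
```

With one block gone the parity fills the hole exactly; with two gone, all that remains is their combined XOR, so neither block can be reconstructed from that stripe.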
I bought a Drobo with FW800 on it and had no end of problems with waking up and then sleeping after a few hours.
Since I purchased the unit from Amazon and it was only a couple of weeks old, I decided to send it in for replacement. I pulled the drives and when I got my new Drobo, it worked like a charm. I didn't lose any data, but with all things hard drive, I'll be backing up the Drobo as well... probably with a second Drobo.