Storage Migrations – the pain, oh the pain
Chris Evans, the storage architect, just published a post about storage migrations. He talks about how the increasing storage capacity of arrays means that when you have to replace them, the process is ever-so-much more painful. I completely agree, but I think there are a few points to consider when looking at this process.
One is that I see fewer and fewer organizations opting for the massive centralized storage arrays, partially for the reasons Chris lists in his blog post. The trend seems to be, at least in my neck of the woods, dedicated midrange arrays that are focused around individual application or business unit silos. For example, you might have a CLARiiON box dedicated to a VMware SAN that consists of a hundred servers, or a large NetApp array that is supporting 50 Oracle RAC clusters. In both cases, migrating from one array to another can be substantially different, thanks to the workload and the application-level technologies available.
Another is that in very large environments, even with the best documentation and planning, the complexity of large SAN environments and the communication involved can result in challenges. For example, back in my days at EMC I was part of a large project migrating several thousand servers from one set of arrays to another. Due to SAN fabric constraints, we were actually moving servers from one pair of fabrics to another, with SRDF being used to mirror between the source and the target Symmetrixes. We had even done a dry run with the customer, and everything had gone smoothly with the test servers.
However, when we flipped the big switch and migrated the first 50 Windows servers over to the new fabrics with the new zoning, all of the data drives were there, but none of the quorum devices. When we asked the EMC CE why that was, he explained that EMC didn’t configure quorum devices to be replicated – and then when we asked the customer why they didn’t see this during their dry run, they explained that they’d tested it with their development servers…which happened to be non-clustered. Whoops.
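The lesson there is that the invariant we needed – every device a host sees on the source side has a replicated counterpart on the target side – is mechanically checkable before cutover. Here’s a minimal sketch of that kind of pre-cutover sanity check; the host and device names are hypothetical, and in practice the inventories would come from your array and fabric management tooling rather than hand-built dicts.

```python
# Hypothetical pre-cutover sanity check: compare the devices each host sees
# on the source fabric against the replicated devices on the target fabric.

def missing_replicas(source_devices, target_devices):
    """Return, per host, the source devices with no counterpart on the target."""
    gaps = {}
    for host, devices in source_devices.items():
        missing = sorted(set(devices) - set(target_devices.get(host, [])))
        if missing:
            gaps[host] = missing
    return gaps

# A clustered production host exposes a quorum device that a dry run on
# non-clustered development servers would never have exercised.
source = {
    "win-prod-01": ["data01", "data02", "quorum01"],
    "win-dev-01":  ["data01", "data02"],
}
target = {
    "win-prod-01": ["data01", "data02"],  # quorum01 was never replicated
    "win-dev-01":  ["data01", "data02"],
}

print(missing_replicas(source, target))  # {'win-prod-01': ['quorum01']}
```

Run against the real production inventory – not just the dev boxes – a check like this would have flagged the unreplicated quorum devices before anyone flipped the big switch.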
I think the reality is that the more applications live on one storage array, the more painful that migration becomes, independent of how much storage lives on that array.