Joe Tucci has been awfully busy of late.
As CEO of Hopkinton, Mass.-based EMC, he has been steering his company beyond its roots as a maker of data storage gear. In July, EMC announced it would acquire storage software company Legato. In October, the company agreed to buy Documentum, which makes content-management software.
And just a few weeks ago, EMC announced plans to acquire software company VMware, a move that could let Tucci's company reach further into the world of utility computing.
As EMC becomes more software-focused, a key piece of its strategy centers on so-called information lifecycle management. Tucci has trumpeted this concept of more-efficient data storage, as have competitors such as Hewlett-Packard and StorageTek. CNET News.com recently talked with Tucci about what information lifecycle management means, and how the company aims to stand out from the pack.
Q: You have talked a lot about information lifecycle management lately. What does it mean and how does it differ from what happens now with data storage?
A: You have a lot of choices now in storage. You have your high-end storage, you have your midtier storage, you have midtier storage with ATA drives in them--which drop the cost--and of course, there is low-end storage. There is NAS (network attached storage). Of course there's still tape. What ILM (information lifecycle management) does is basically offer you a tiered approach. How valuable is this information? What kind of performance do you need? Where is the best place for storing it, the lowest cost that meets your requirements?
The second element of it is the information needs to be protected. And therefore if you can put up with a long recovery time you can take the data that's on disk and back it up on tape. If it is a very mission-critical set of information you might want to do a disk replica; you may want to have an in-the-same-array disk replica; you may want to have a remote disk replica. We can take other kinds of data that are still very important to the enterprise and back it up using ATA technology--much, much more reliable, much, much faster than tape. It gives you a quicker recovery obviously. And then of course you still have the tape option.
Then of course the value of information changes over time, and there's the whole question of how do you move your information from high-end storage to midtier storage to ATA storage to, say, Centera fixed-content storage with immutability? There is a whole aspect here of data mobility. The second piece is data protection and data mobility.
And then the third layer is the central place to manage it all. That is what makes up information lifecycle management. So you can dial in the protection that you need.
How is it that we can do this now, where we couldn't in the past?
Well, technology wasn't there to do this in the past. I mean you didn't have this class of ATA storage. It really got introduced into the market in 2003.
Is that a serial ATA you are talking about?
No, parallel ATA. Now there's a place at about a penny a megabyte that gives you an alternative to tape. But again the speed between the two, the reliability between the two is night and day.
So is the ATA much more reliable besides being much faster?
Yeah. When you put a tape in a vault, there is a chance--and not a remote chance--that you're not going to be able to read it. With this technology, that virtually never happens.
What about the idea of how to judge the value of data? Isn't that something that has been kind of a holy grail also? How are you going to keep track of which is the data that is important and which isn't?
Our customers have to know their own business processes, right? But we can tell them how often it is accessed, what kind of utilization they are getting from it. And then they should know from their business process how that relates back to their business process. They can set policies. And then we came out with software that helps them automatically move the information.
For example, if you're a bank and your business process is funds transfer, in a fraction of a second you could move a billion dollars. That information is kind of critical and the bank would know that, right? We can give them the data about the data. From those two points they can develop policies and then move the data using our software.
To what extent can ILM be done right now, or is it still sort of a dream?
The tiered storage is there today. Everything I talked about in storage. This is not in the future. These pieces are there today.
Give me an example of what is not there now that could be helpful down the line. Is there a particular kind of metadata that is missing right now, Joe? (Metadata is descriptive information about data.)
I mean there are a lot of kinds. It's not that it is missing; it is how easy it is to get. You know there have been a lot of advances made there, but we are certainly not where we want to be. I mean it is a journey in that area.
I'll give you a statistic. For every copy of information that exists on disks, there's eight and a half copies on tape. This has to do with incremental backup. I mean there is no real good way to basically have this metadata engine so powerful that I could say that I will only need one copy on tape. And I will let the disk handle the incremental adds and when I get enough of them, I'll update it again, and get rid of the old copy, you follow?
That would lead to more efficient storage.
Yes. It's just one piece. There are other things that are there but not used enough. There are easier ways to provision. Provisioning--which means assigning jobs to specific hardware--is very difficult. To add 200GB of storage, you have to go through a lot of steps. We have some tools to automate that and those tools will get a lot more automated as time goes on.
To what extent is ILM something that customers are really calling for versus something that you and other industry players are using as a kind of marketing slogan to try to spur excitement in the industry?
Customers are asking for all the elements. Do customers want tiered storage? Yes. Do customers need to have mobility to move the information across the tiered storage? Yes. Are they asking for those things? Yes. Do customers want a central point to manage all the information? Yes. Do customers want more metadata about their information to make better informed decisions and policies? Yes.
So every one of these things they're crying for. So basically we put that under an umbrella called information lifecycle management.
How is what you are doing different from other companies offering ILM products or strategies?
It's the most complete, right? If you go to the first layer--the first layer I defined as tiered arrays or tiered storage, right? There is no other vendor that has the vast breadth of a product line that we have. The high end of storage, the midtier, the ATA, and the ability on the same array to put a rack of Fibre Channel, another rack of Fibre Channel, two racks of ATA, a rack of Fibre Channel, three racks of ATA. Mix and match any way you want.
There is no other vendor that has that plus the full array of the NAS heads that we have. You know we have the NetWin series, which we did in conjunction with Microsoft. We have our own Celerra series. Again, we have the Centera built exclusively for fixed content and immutability.
Number two, you go to the second line, which is the protection and the data mobility. Again we have more ways of moving information than any other company, by far. And then of course more ways of dialing in more levels of protection than any other company.
Then you move to the management. EMC ControlCenter has significant share out there, almost 50 percent share. And you back that with (the fact that) this is a company of substantial size with substantial financial resources that is fully dedicated to this, as opposed to a piece of the company. So you put those things all together and I think that's our advantage.