Hi Craig,

DataDomain certainly knows the differences, but to expand on why I think the terminology still gets confused: the archiving vendors have for years used the term "de-duplication" in their jargon to refer to SIS (single-instance storage). They say "de-duplication" instead of SIS because they often apply it only to larger objects (such as binary attachments), whereas most storage vendors apply SIS to all objects. In other words, they are trying to differentiate what they do at the application layer from what storage vendors might do at their tier.

There are good arguments for single-instancing only the attachments in email: less I/O means less CPU, which means fewer data center resources, less power, and so on. Do you really want to spend CPU cycles and I/O de-duplicating millions of 1K GIF files that appear in the standard signature of a corporation's emails, when you get much better bang for the buck by dealing only with binary files over a certain size?
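To make that trade-off concrete, here is a minimal sketch of size-thresholded SIS. It is only an illustration under my own assumptions: the 64 KB cutoff, the `AttachmentStore` class, and the `hash_ops` counter are hypothetical names and values, not any archiving vendor's implementation.

```python
import hashlib

# Hypothetical cutoff: objects smaller than this are stored without hashing.
MIN_DEDUP_SIZE = 64 * 1024


class AttachmentStore:
    """Toy single-instance store that only de-duplicates larger objects."""

    def __init__(self, min_size=MIN_DEDUP_SIZE):
        self.min_size = min_size
        self.by_hash = {}    # digest -> object bytes (one copy per unique object)
        self.small = []      # small objects, stored as-is with no hashing
        self.hash_ops = 0    # how many objects we paid the hashing cost for

    def put(self, data: bytes):
        # Small objects (e.g. signature GIFs) skip the hash and lookup entirely.
        if len(data) < self.min_size:
            self.small.append(data)
            return None
        # Large objects are hashed whole; identical ones are stored only once.
        self.hash_ops += 1
        digest = hashlib.sha256(data).hexdigest()
        self.by_hash.setdefault(digest, data)
        return digest
```

With this shape, a million copies of a 1K signature image cost no hashing at all, while a large attachment mailed to a hundred people is hashed a hundred times but stored once.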

So when many customers hear “de-duplication” today, they think of how archival vendors have used the term for their implementations of SIS. But as you know, that's changing quickly.

We always try to explain the differences and advantages at a very high level, using email as an example: “Say you have a document in an archive that uses SIS. The user opens the original document, changes only the date, and saves it. It's now a different document, so SIS will treat the entire document as new and store it again in the archive. With block-level de-duplication, the technology recognizes that only the date changed and the rest of the document is the same, so it stores only the small piece of data that holds the date and not the entire document again.” The difference could be a couple of megabytes vs. a few bytes (in practice, one small block) being written to disk. Now multiply that a few million times over 10 to 15 years. That's the difference. The audience typically gets it. The simpler the example, the better.
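The same example can be sketched in a few lines of code. This is a toy under stated assumptions, not how any particular product works: the 4 KB fixed block size, the function names, and the synthetic 2 MB "document" are all mine, and the date is overwritten in place so the block boundaries stay aligned.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def sis_store(store: dict, doc: bytes) -> int:
    """Whole-object SIS: return the number of new bytes written."""
    digest = hashlib.sha256(doc).hexdigest()
    if digest in store:
        return 0               # identical document already archived
    store[digest] = doc        # any change at all means a full new copy
    return len(doc)


def block_store(store: dict, doc: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Fixed-size block-level de-duplication: return new bytes written."""
    written = 0
    for i in range(0, len(doc), block_size):
        block = doc[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:    # only blocks never seen before are stored
            store[digest] = block
            written += len(block)
    return written


# A synthetic 2 MB document made of 512 distinct 4 KB blocks.
original = b"".join(i.to_bytes(4, "big") * 1024 for i in range(512))
# "Change only the date": overwrite the first 10 bytes, same length.
edited = b"2008-12-02" + original[10:]

sis, blocks = {}, {}
sis_first, sis_second = sis_store(sis, original), sis_store(sis, edited)
blk_first, blk_second = block_store(blocks, original), block_store(blocks, edited)
print("SIS bytes written:  ", sis_first, sis_second)
print("Block bytes written:", blk_first, blk_second)
```

One caveat on the design: fixed-size blocks only work this neatly when the edit does not shift the rest of the file. If the new date were longer or shorter, every later block would move, which is why real block-level products generally use smarter (variable-size) chunking.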

Good Luck,
Peter
December 2008

