
* Gustavo Leite

If we look up synonyms for "perish" in the dictionary, we find words such as die, succumb, be extinguished, decay or rot. From this idea of perishing, of data that spoils, the concept of ROT was born: an acronym for something equally bad in corporate data management, namely redundant, obsolete and trivial data.

Just as food spoils when it goes unused for too long, ROT data is the result of failing to apply good data management practices. And did you know that up to a third of corporate data can be considered ROT? Another 52% is data of unknown value, at least part of which may also fall under this concept.

Let's briefly explore each category of ROT data:

  • Redundant Data: Data redundancy can be a good thing. In fact, it's a fundamental part of the 3-2-1-1 rule of data backup and recovery:
      • Keep at least three copies of your data in different locations.
      • Use at least two separate storage media.
      • Store at least one copy of your data offsite.
      • Keep at least one copy of your data on immutable storage.

But the redundant data we're talking about here is different – it's unnecessary duplicate data that therefore has no value. For example, multiple identical copies of a spreadsheet that an employee has saved on the corporate network, all of which are then needlessly copied into each of the backups called for by the 3-2-1-1 rule.

  • Obsolete Data: There are reasons to keep data that may no longer be in use. The most obvious are compliance rules and regulations that stipulate that certain types of data must be stored for certain periods of time, even if they are no longer needed for day-to-day operations. But not all data fits into this category. In fact, most doesn't. Data that is no longer needed because it has been replaced by updated information, and that has no legal value, is obsolete data.
  • Trivial Data: As the name implies, trivial data is information that is simply not important. It has no value in terms of corporate knowledge, business insight or record keeping. Think of the endless instant messages and kitten videos that pile up everywhere, for example.
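
To make the redundant category concrete, here is a minimal sketch of one common way to spot unnecessary duplicates: grouping files by a hash of their contents. The function names `file_digest` and `find_redundant_copies` are illustrative assumptions, not part of any particular product, and a real data management platform would do far more than this.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def find_redundant_copies(root: Path) -> Dict[str, List[Path]]:
    """Group files under root by content hash.

    Only groups with two or more files are returned, i.e. candidates
    for redundant data. Nothing is deleted here.
    """
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Each group returned holds files with identical contents. Which copy to keep is a policy decision, and copies held deliberately under the 3-2-1-1 rule should of course be excluded from the scan.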

ROT data is bad for many reasons, including:

  • Increased Data Security Risks: The more unnecessary data you have, the harder it is to protect what really matters from threats like ransomware and other data breaches.
  • Increased Data Compliance and Governance Risks: ROT data often does not comply with rules and regulations that dictate that certain types of data must be deleted after certain periods of time.
  • Increased Liability Risk: Data held beyond the required retention period could be used against your organization in legal and financial proceedings.
  • Reduced Productivity: ROT data makes it difficult for employees to find the information they need to do their jobs efficiently and effectively.

But the reason I really want to focus on here is your bottom line. ROT data is a significant source of excessive storage costs. Traditionally, this was an issue that companies faced in their own datacenters, as they needed to add more physical infrastructure. But as cloud-based data, especially data in the complex multi-cloud environments found in most enterprises today, outstrips the amount of data companies store on-premises, it is also becoming increasingly problematic in the cloud.

In fact, recent research suggests that 94% of organizations are failing to stay within their cloud budgets, overspending by an average of 43%. And what are they overspending on in the cloud? You guessed it: storage, including backup and recovery, is number one on the list.

A big part of this is that the tools offered by cloud service providers (CSPs) to help reduce ROT data are not up to the task – this is simply not a CSP's top priority. Their top priority is selling more cloud storage, not less. I'm not suggesting that they're nefariously ignoring the fact that you're putting a lot of ROT data on their servers, which they charge you for, but they certainly don't have the strongest incentive to invest the research and development needed to build the robust data management tools you need to reduce your ROT data.

So how do you reduce ROT data in the cloud? The first step is to create a data taxonomy or classification system – a set of definitions, labels and groups to organize your cloud-based data. This will help you identify your ROT data.
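
As an illustration only, a taxonomy of this kind can be sketched as a small set of labels plus rules that assign one label to each data asset. Everything here is a hypothetical assumption: the `DataAsset` fields, the three-year threshold, the list of trivial file types and the label names. Real definitions depend on your business and on the regulations that apply to you.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical taxonomy rules, chosen only for this example.
TRIVIAL_EXTENSIONS = {".tmp", ".gif", ".mp4"}
OBSOLETE_AFTER = timedelta(days=3 * 365)


@dataclass
class DataAsset:
    name: str
    last_modified: date
    is_duplicate: bool = False               # e.g. flagged by a content-hash scan
    retention_until: Optional[date] = None   # compliance retention date, if any


def classify(asset: DataAsset, today: date) -> str:
    """Return one label: 'retain', 'redundant', 'trivial', 'obsolete' or 'active'."""
    # A compliance retention requirement takes precedence over every other rule.
    if asset.retention_until is not None and asset.retention_until >= today:
        return "retain"
    if asset.is_duplicate:
        return "redundant"
    if PurePosixPath(asset.name).suffix.lower() in TRIVIAL_EXTENSIONS:
        return "trivial"
    if today - asset.last_modified > OBSOLETE_AFTER:
        return "obsolete"
    return "active"
```

The order of the checks is itself a design choice: retention is tested first so that data you are legally required to keep is never labelled as ROT.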

As part of this, establish a single source of truth (SSOT) location for each category of your data in the cloud, where the "right" version of each data asset is saved, reducing the chance that ROT versions exist elsewhere in the cloud.

Then define policies for managing the ROT data you've identified – rules and procedures for removing it from the cloud.
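
One simple way to express such policies is a table that maps each taxonomy label to an action. The sketch below is a hypothetical example: the label names, the actions and the `plan_actions` helper are assumptions made for illustration, and it only produces a plan for review rather than touching any data.

```python
from typing import Dict, List, Tuple

# Hypothetical policy table: taxonomy label -> action.
ROT_POLICY: Dict[str, str] = {
    "redundant": "deduplicate",
    "obsolete": "archive",
    "trivial": "delete",
}


def plan_actions(labelled_assets: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn (asset_name, label) pairs into (asset_name, action) pairs.

    Assets whose label has no policy entry are left untouched ("keep").
    Nothing is removed here; the result is a dry-run plan.
    """
    return [(name, ROT_POLICY.get(label, "keep")) for name, label in labelled_assets]
```

Keeping the policy as data rather than as scattered conditions makes it easier to review, audit and update as your taxonomy changes.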

Finally, remember that this is an iterative process – continually update your data taxonomy, continually manage your SSOT location to ensure it is being used properly, and regularly enforce your ROT data policies with the procedures you have in place to get rid of it.

If this all sounds complex and tedious, that's because it is. In fact, the easiest way to do it all is to extend your existing enterprise data management platform to your cloud environments so that it can autonomously sort your cloud-based data, deduplicate your unnecessarily redundant cloud data, and archive or delete your obsolete and trivial cloud data.

In conclusion, ROT data is bad. And the same problems it can cause in your data center also exist in the cloud. Some of them might even be worse in the cloud, where controlling costs can be much more difficult. If reducing ROT data wherever it is – on-premises and in the cloud – were an easy manual process, you probably wouldn't have any, but I can assure you that you do. The right tools will help you reduce the cloud costs associated with ROT data, simplify overall data management and increase data protection in the cloud.

*Gustavo Leite is Vice President for Latin America at Veritas Technologies

Notice: The opinion presented in this article is the responsibility of its author and not of ABES - Brazilian Association of Software Companies
