*By Gustavo Leite
As the world works towards vital net zero carbon targets, digitalization has become essential to delivering green and efficient strategies. Data is central to driving better business outcomes and ensuring a sustainable future. Yet our new data-driven world presents its own sustainability challenges. The data centers that house our digital reserves require massive amounts of energy. Global emissions from cloud computing, for example, are predicted to account for over 3.5% of greenhouse gas emissions, even more than commercial flights.
Over the past decade, efforts have been made to make data centers more sustainable. However, while the infrastructure may become greener, wasted storage undermines those efforts. Continuously storing useless data drains precious resources. According to Veritas research, the power required to store this "dark" data results in up to 6.4 million tons of CO2 being emitted needlessly each year. Analysts predict that by 2025 around 91 ZB of dark data will be held unnecessarily, more than four times the current volume.
Dark data
On average, our research found that 52% of data stored by organizations is "dark": its contents and value are unknown, and it is essentially useless until its value (if any) is determined. At the same time, it is estimated that about a third of organizational data is Redundant, Obsolete and Trivial (ROT).
In short, swathes of data are being stored for no reason. ROT data is a major contributor to high storage costs. Recent global research suggests that more than nine out of ten organizations exceed their cloud budgets, overspending by an average of 43%, primarily on storage, backup and recovery. Much has been said about the financial cost of dark data, but the environmental cost is often overlooked. Eliminating massive amounts of data waste can dramatically reduce an organization's carbon footprint, leading to greater sustainability and lower costs. Companies must therefore tighten their data management strategies, use the right tools to identify valuable data, and rid their data centers of unnecessary, energy-consuming dark data.
Data management is crucial
Data management is a crucial first step for organizations to efficiently analyze data at scale.
This starts with data mapping and discovery: understanding how information flows through an organization. Gaining visibility into where sensitive data is stored, who has access to it and how long it is retained is the first port of call when identifying dark data. A one-off exercise is not enough, however. Organizations need to invest in an ongoing, proactive data management program. This gives them continuous visibility into their data, storage and backup infrastructure and lets them make informed decisions about data deletion on an ongoing basis. Left unchecked, dark data and accumulated ROT drain resources across the board.
In addition, data minimization and purpose limitation can reduce the amount of data stored and ensure that what is retained is directly related to its purpose. The use of classification, flexible data retention policies and compliant policy mechanisms means that non-relevant information can be reliably deleted. Not only does this reduce the amount of dark data consuming data center resources, but it can also support compliance with data protection regulations such as the GDPR.
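As a rough illustration of how classification and retention policies can work together, the following Python sketch decides what to do with a piece of data based on its category and age. The category names, the retention periods and the function name `retention_decision` are hypothetical choices made for this example, not values taken from any regulation or product.

```python
from datetime import date, timedelta

# Hypothetical retention periods, in days, per data category.
# Real periods depend on the organization's legal and business requirements.
RETENTION_DAYS = {
    "invoice": 7 * 365,
    "marketing_lead": 2 * 365,
    "temp_export": 30,
}


def retention_decision(category, created, today, policy=RETENTION_DAYS):
    """Return "keep", "delete" or "review" for one item of data.

    Items whose category is unknown are never deleted automatically.
    They are flagged for review, since unclassified data is by definition dark.
    """
    if category not in policy:
        return "review"
    expires = created + timedelta(days=policy[category])
    return "delete" if today >= expires else "keep"


if __name__ == "__main__":
    today = date(2024, 3, 1)
    print(retention_decision("temp_export", date(2024, 1, 1), today))
    print(retention_decision("invoice", date(2024, 1, 1), today))
    print(retention_decision(None, date(2024, 1, 1), today))
```

The point of the design is that deletion only ever follows from an explicit classification and an explicit policy entry. Anything unclassified is surfaced for a decision rather than silently kept or silently removed.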
For many organizations, reducing dark data and ROT is not a simple task. The process can be complex, and many enterprise data management solutions still rely on manual deployment and maintenance, which reduces operational agility.
With the amount of data created and stored exploding, this is not a task that companies can handle manually. Automating the analysis, tracking and reporting of dark data is essential when dealing with what may be petabytes of data and billions of files. Furthermore, the need for multi-cloud strategies requires a new approach to data management.
Autonomous data management
The ultimate tool for organizations now is autonomous data management. Here, artificial intelligence (AI) and machine learning (ML) technologies automate data management processes and minimize human intervention and oversight. By automating the provisioning, optimization, recovery and configuration of data management technologies in multi-cloud environments, companies can get a much clearer and more accurate picture of their data in a much shorter amount of time, no matter what it is or where it is stored.
For example, enterprise data management platforms can now autonomously classify cloud-based data, deduplicate redundant copies, and archive or delete stale and trivial data. This automated approach to data insight must also be integrated with archiving, backup, and cybersecurity solutions to prevent data loss and ensure policy-based data retention.
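To make the idea of automated deduplication and stale-data detection concrete, here is a minimal Python sketch. It assumes files are represented as simple in-memory records, treats two files as duplicates when their SHA-256 content hashes match, and treats a file as stale when it has not been accessed for a configurable number of days. The `FileRecord` type, the function names and the 365-day default are assumptions made for illustration. Commercial platforms use far more sophisticated, ML-assisted techniques.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FileRecord:
    """Hypothetical in-memory stand-in for a stored file."""
    path: str
    content: bytes
    last_accessed: date


def find_duplicates(records):
    """Group paths by SHA-256 of their content.

    Returns only the groups that contain more than one path,
    i.e. sets of files with byte-identical content.
    """
    by_digest = defaultdict(list)
    for rec in records:
        digest = hashlib.sha256(rec.content).hexdigest()
        by_digest[digest].append(rec.path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


def find_stale(records, today, max_idle_days=365):
    """Return paths not accessed within the last max_idle_days days."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(r.path for r in records if r.last_accessed < cutoff)


if __name__ == "__main__":
    records = [
        FileRecord("a.txt", b"hello", date(2024, 1, 1)),
        FileRecord("b.txt", b"hello", date(2020, 1, 1)),
        FileRecord("c.txt", b"other", date(2024, 6, 1)),
    ]
    print(find_duplicates(records))
    print(find_stale(records, today=date(2024, 7, 1)))
```

In practice the output of such checks would feed an archive-or-delete workflow governed by retention policy and backed by backups, rather than triggering deletion directly.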
Ongoing digital transformation makes understanding the content and context of data a top priority for organizations, particularly as many of these transformative projects seek to deliver greater sustainability. The energy used to store useless data is pure waste. Imagine if we could automatically remove 85% of that junk data from data centers: that would allow for a huge leap towards net zero.
Reducing the environmental impact of our data storage footprint will be imperative if we are to avoid creating an even greater mass of waste data as the cloud evolves. Green strategies driven by digitization cannot be allowed to fail in the shadow of dark data that consumes power in the background, silently undoing good work. The journey to a sustainable cloud hinges on tackling data waste. The best solution for managing data waste in complex hybrid and multi-cloud environments is autonomous operation, which minimizes reliance on manual processes by combining hyper-automation with data-driven intelligence.
*Gustavo Leite is Vice President for Latin America at Veritas Technologies
Notice: The opinion presented in this article is the responsibility of its author and not of ABES - Brazilian Association of Software Companies