The complexity crisis: why observability is the foundation of digital resilience

*By Bob Wambach

The stark reality for businesses today is that software failures will pose a growing threat. What determines whether a failure becomes headline news or a mere footnote is the organization's ability to detect, diagnose, and recover from it in real time. That ability doesn't come from traditional monitoring, which is characterized by fragmented data and organizational silos.

Developing and delivering robust and resilient software requires deep, AI-powered, end-to-end observability that provides a consistent, unified source of truth. This need becomes even more critical as today's enterprise software environments become more complex, encompassing cloud-native applications, multicloud deployments, third-party services, APIs, and now the growing influence of Artificial Intelligence (AI). These layered environments introduce significant opacity into the software supply chain, making it more difficult to manage risk, performance, and resilience at scale.

Hidden vulnerabilities in modern software stacks pose risks to businesses that rely on a vast ecosystem of interconnected technologies. A single misconfigured update or a vulnerability in a widely deployed third-party agent can have cascading effects within minutes, impacting the customer experience, operations, and ultimately, business continuity.

According to Adaptavist research, 421% of organizations expect to experience an incident caused by one of their suppliers. To make matters worse, teams are often left in the dark when something goes wrong, which is both frustrating and costly. To operate confidently, companies need visibility into their entire digital supply chain, which basic monitoring doesn't provide. Whereas basic monitoring often focuses on isolated metrics or alerts, observability offers a unified, real-time view of the entire technology stack, enabling faster, data-driven decisions at scale. AI-powered observability covers every digital component of the business, from infrastructure and services to applications and user experience.

Today, observability has gone from being a technical choice to a strategic one. In fact, it's evolving beyond its current role in IT and DevOps to become a fundamental element of modern business strategy. As a result, it plays a critical role in managing risk, maintaining uptime, and protecting digital trust.

Observability also enables organizations to proactively detect anomalies before they become major disruptions, quickly identify root causes in complex, distributed systems, and automate responses to reduce Mean Time to Resolution (MTTR). The result is faster, smarter, and more resilient operations, giving teams the confidence to innovate without compromising system stability—a crucial advantage in a world where digital resilience and agility must go hand in hand.
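
To make two of these ideas concrete, here is a minimal sketch in Python, not a description of any particular observability platform. It flags an anomaly when a metric sample deviates sharply from its recent history, and it computes MTTR as the average time between detection and resolution. The function names, the window size, the threshold, and the sample data are all illustrative assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing window.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` samples.
    (Hypothetical helper; real platforms use far richer baselines.)
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: the z-score is undefined, so skip this sample.
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


def mean_time_to_resolution(incidents):
    """Average (resolved - detected) over a list of incident timestamp pairs."""
    durations = [resolved - detected for detected, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)


# Illustrative data only: response times in milliseconds with one spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 450, 101]
spikes = find_anomalies(latencies_ms)

# Illustrative incidents: one resolved in 30 minutes, one in 90 minutes.
start = datetime(2025, 1, 1, 9, 0)
incidents = [
    (start, start + timedelta(minutes=30)),
    (start + timedelta(hours=5), start + timedelta(hours=6, minutes=30)),
]
mttr = mean_time_to_resolution(incidents)
```

In this toy data, only the 450 ms sample stands out against its trailing window, and the two incidents average out to an MTTR of one hour. Shortening that figure is the practical effect of detecting the spike automatically rather than waiting for a customer report.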

It's important to emphasize that resilient systems need to absorb impacts without collapsing. This requires technical and cultural investment, from adopting shared responsibility among teams to using modern implementation strategies. But modern strategies only work if teams have access to real-time feedback and clarity, allowing organizations to understand what's happening, why, and how to act—before the customer notices any failure.

The rise of agentic AI also brings a new layer of complexity and risk. As organizations increasingly adopt generative and agentic AI to accelerate innovation, they also expose themselves to new types of risks. Agentic AI can be configured to act independently, making changes, triggering workflows, and even deploying code without direct human involvement. This level of autonomy can increase productivity, but it also poses serious challenges.

For example, a misconfigured agent or a malicious prompt can trigger rapid cascading consequences. Small failures can quickly grow into major problems that are broader in scope and harder to contain. Real-time, AI-powered observability platforms are essential not only for monitoring what agents do, but also for understanding how they act, how they interact with other systems, and when intervention is necessary. Observability helps safely harness the potential of agentic AI, paving the way for more autonomous operations.

The market leaders of the future will be those capable of adopting and adapting to new technologies, embracing agentic AI while recognizing the increased exposure and compliance demands it brings. These leaders will need to shift from reactive to proactive and preventive operations.

AI-powered real-time observability can automate accurate responses without relying on someone to push the automation button. Organizations that invest in this approach are going beyond preparing for the next potential disruption. They're building a foundation for trust, agility, and continuous innovation that propels their businesses into the future.

*By Bob Wambach, Vice President of Portfolio and Strategy at Dynatrace

Notice: The opinion presented in this article is the responsibility of its author and not of ABES - Brazilian Association of Software Companies