How Agentic AI is Reinventing Observability - Six Five On The Road

Baha Azarmi, General Manager of Observability at Elastic, joins host Jason Andersen to discuss how agentic AI is advancing observability tools, enabling faster insights, automated operations, and improved root cause detection for modern IT teams.

How are agentic AI capabilities reshaping the observability landscape and enabling organizations to move toward more autonomous operations?

From AWS re:Invent 2025, host Jason Andersen is joined by Elastic's Baha Azarmi, General Manager, Observability, for a conversation on how agentic AI is reinventing observability. They focus on the role of AI-native observability, the transition from reactive dashboards to intelligent automation, and the innovative capabilities Elastic is introducing to drive operational transformation.

Key Takeaways Include:

🔹Achieving Proactive Monitoring: How agentic AI is transforming observability from reactive monitoring, manual dashboards, and alerts to systems that perceive, reason, and act for accelerated issue resolution and prevention.

🔹Defining AI-Native Observability: Elastic’s approach to embedding AI as a core architectural component, driving contextual insights directly from machine data.

🔹Overcoming Observability Challenges: How agentic AI addresses alert fatigue, data complexity, and operational silos, enabling teams to optimize costs and reduce manual toil.

🔹Autonomous Operations: The evolution from AIOps to AutOps, including steps organizations can take to automate root cause analysis and operational decision-making.

🔹The Future of Elastic’s AI-powered Platform: Insights into upcoming innovations and strategic direction for agentic AI in observability.

Learn more at Elastic.

Watch the full video at sixfivemedia.com, and be sure to subscribe to our YouTube channel, so you never miss an episode.


Disclaimer: Six Five On The Road is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.

Transcript

Jason Andersen:

Hi, this is Jason Andersen, Principal Analyst with Moor Insights & Strategy, and we're here with Six Five On The Road at AWS re:Invent 2025. One of the great topics about AI is how it can be used in observability use cases. And AI, in particular agentic AI, is reinventing that whole process. Joining me today is Baha Azarmi from Elastic, and he's going to talk to us a little bit about Elastic's point of view on observability and where observability is headed. So it's nice to meet you, Baha.

Baha Azarmi: Yeah, nice to meet you, Jason. Thank you for having me here. Oh, great. 

Jason Andersen: Yeah, no problem. So really just to start out at the high level, right? In terms of AI and its role in observability, and now even further into operations, what's the state of the market? How did we get here, or how are we getting where we're going, maybe, is a better way to put it.

Baha Azarmi:

Yeah, yeah, I think we learn a ton from our own past. Okay. Yeah, if we look at how our customers started to use Elastic for observability, they started with logs mainly. Sure. You know, bringing their logs into Elasticsearch, using Logstash, Kibana, you heard about ELK. Yeah, for sure. Strong brand in the open source community. And so the reason they did that is because a lot of customers and a lot of developers had to deal with tons of logs, right? And grepping through it was not scalable. So they brought that into Elastic, because something was magic about aggregating data in Elastic, and it's still magic. And so that was a way for them to scale relative to the amount of logs they're getting every day and look at this through visualizations and dashboards and all that.

Then we saw, probably five or six years ago, AIOps coming in, and to be honest, without being too critical, it was just a false promise to the customers, in terms of the market and vendors talking about AIOps and how we could auto-remediate issues and incidents. Still not there, yes. Still not there, but you know what? Now we have the opportunity. Now we have this opportunity because of a couple of things. We saw the LLMs coming and being really good with unstructured data, being able to just extract meaning from unstructured data. We also have done tons of innovation in our platform. We've been building our vector database, our semantic search capabilities, our hybrid search capabilities. We have our own model. If we look at Elastic, there is a platform with tons of primitives that serve different solutions, including mine, the observability solution.

And so we're leveraging that. How? Well, now we're building a context, and this context for observability, you know, SREs are not necessarily thinking about a context. They think about, hey, how can you help me do my RCA? But under the hood, what we do is combine the different signals, logs, traces, metrics, together into our Elastic context. And now we're serving this to an LLM. And so an SRE will have this experience of getting into Elastic and chatting through their own RCA, having an agent looking at this context and leveraging the logs and metrics to help the user solve an incident. And so that part is just another iteration of the whole journey.

Jason Andersen:

In a way, even just with AI and LLMs, it was very much a reactive type of thing. With agents, you're moving more proactive, right? I imagine, instead of just asking, hey, I saw something and something's wrong, help me fix it, now it's, hey you, something's wrong, you should take a look at it, right?

Baha Azarmi:

You know what, Jason? We felt that too. Yeah? And I think we realized after a couple of years, because we've been into this for two, three years now. Okay. We've been working on our own gen AI capabilities. And one thing that we know is that we can't just expect the user to just chat.

Jason Andersen:

Okay.

Baha Azarmi:

The same way that you cannot expect them to be in front of a dashboard and start clicking across the visualizations and dashboards, you won't expect everyone to just go and start chatting. They're going to also waste time. And so, there is a company we acquired called Keep. And Keep has built a workflow engine. And so, like we did with tons of features, we put that into our platform, and it now serves different solutions, serves my solution. And so the way we are thinking about it is, instead of asking the user to go chat, we're going to define automation that will shorten the investigation path. So imagine, for example, thinking simply about the low-value tasks, such as: I have an incident and I have to restart something. I don't want to do that manually. A workflow can do that for me. I have an incident and I want at least 80%, 90% of the investigation to be done for me, through agentic AI combined with workflow automation. So this is how we combine both, and that's how we're going to implement our automated RCA. And the third one is, you know, there's a category of users that will always want to chat with their data to do root cause analysis. And so that's how we think about it. We categorize those three types of use cases, combining our workflow engine and our agent builder engine.

Jason Andersen:

So do you think that, and it's really a profound point, if you think of an SRE, what's your expectation of how their role is changing, and what new behaviors you want them to have? Or is it just, I really can't expect them to have new behaviors? In a product sense, what's the trade-off in terms of how disruptive we make this for the practitioner?

Baha Azarmi:

Yeah, that's a great question. And tons of people are asking that. So I think, number one, we cannot expect every SRE to know everything about the application stack they have to observe. They cannot be experts in everything. And so one thing we're doing is that we enable them to be experts without being experts. Suddenly, they just have to ask an agent to do something about something they don't know, such as, where is the problem? And then the agent is informing them, hey, you have this Redis instance, it doesn't have enough memory, here is the thing you should do. Hey, you have made this code change, and this code change is creating either contention on this service or a performance issue on your checkout service, and your revenue is going down. See what I mean? All that dependency impact analysis is done by the agent without the SRE having to do this, because it's also not realistic to expect them to know everything about the system they're observing. There is that.

The way we think about observability for us is that it's going to be agent first. Even in the way we're developing our own observability platform, the features are somehow going to be tools for agents. But it doesn't mean that the human is not going to be involved. They will still be in the loop, of course. For example, I just talked about the code change. When you look at the incidents in observability, the majority of them is someone changed code, someone changed something in infra, it crashes, something goes wrong. So an agent can go and, not remediate, sorry, do the automated RCA and then propose a code change. Right? So imagine that code change could be in GitHub as a pull request. And I am the developer and I see this as a potential solution to my problem. So now I have the ability to accept, or even debate with the agent that, no, that's not right, like, I reviewed the code and I think we could do this and that. So the same type of experience you would expect from an IDE today, if I'm a developer, you need to have the same for an observability solution. I think it's not acceptable if you don't have that.

Jason Andersen:

Well, and it kind of ties into the whole shift-left movement, right? I mean, as we've automated the process of writing code, we've given people more responsibilities in a lot of ways with the shift left, right? Yes. And I think that's a pretty critical thing because, to your point, now depth of expertise is becoming a fleeting thing, right? People have to be broader versus deeper in many, not in all, but in many respects. Kind of switching gears a little bit: you talk about this journey and this path, and you mentioned humans in the loop, but we're also starting to see a lot of talk about autonomy, in terms of agents operating autonomously and even, you know, getting to the point where they may commit changes themselves and do those things. Can you talk a little bit about that part of the journey? As we project forward, what will humans delegate or not? That balancing act is going to invariably happen, right? As models get more powerful and as tooling gets smarter, we should expect that to do more for us, right?

Baha Azarmi:

Totally. I think at Elastic, we tend to pay attention to giving a lot of control to the user, while at the same time, we're opinionated about what good looks like. So we come with, you know, pre-built experiences and, you know, paths to follow to do something. That's like the open-ended way. But then, being autonomous also means that we need to give a lot of control and transparency into what's happening. So if I have an agent executing different steps, the least I can do for an SRE is to tell them what happened, what my chain of thought is, what the different steps are, here are the queries that I've executed as an agent, and here is the way you can debug them. So that's a first principle in what we're building. We want to make sure that we give that level of transparency to the user. The other thing is, when you think about what I said around agent builder being our agent engine and our workflow engine, we also want to give a ton of control to our users to define their own paths, their own remediation paths. They're automated, they're agent-powered, but the user will be able to just choose how things need to be executed if they want to. Does that mean that they can't just hand it off to the agent? No, it doesn't mean that. They can. And of course, we're going to come up with functionalities that will help take action, remediate automatically. We will have that. But we still want to give control to the user. Because the reality is that the market is adopting AI step by step, baby steps. Absolutely. And I think, you know, there's an assumption that everyone is jumping on it. But people are conscious of the impact it could have. And they're looking at this as, can I actually keep control of what I'm doing? And for us, that's very important.

Jason Andersen:

Good. I mean, is that generally the theme? You have observability, but that seems to be the theme through the entire Elastic Stack, right?

Baha Azarmi:

Yes.

Jason Andersen:

Right. Cool. Excellent. Excellent. So, you know, just the last follow-up question for you: we're here at re:Invent, there's a lot of discussion, a lot of ideas, a lot of stuff going on with AWS. Is that formulating and putting any ideas in your head as you look forward over the next, say, 6 to 12 months?

Baha Azarmi:

Yeah, I think what I'm excited about is that every vendor, including AWS, is coming up with their agentic offering. We heard about the DevOps agent, Frontier agents, which I think is pretty cool. So what I'm excited about, and what I'm thinking about, is that with our agent, you know, you can expose it as an MCP, you can talk to other agents, you can use other agents as a tool. And so, you know, when you combine both together, this is going to be huge value for the customers. And so that's what I'm very excited about.

Jason Andersen:

That's great. That's really good to hear. So, well, thank you. Thank you for your time today. It was very enlightening and I appreciate it. Just to close out, again, if you want to learn more about observability, and a little bit more about AI and observability, check out Elastic. This is Jason, Six Five On The Road from AWS re:Invent. Thank you.
