Observability and security have come to the forefront of IT service delivery, a convergence that was long overdue. This was the urgent theme of the 2022 Splunk conference in Las Vegas.
The latest Atlassian outage goes to show that every cloud provider is prone to unplanned downtime sooner or later. While every company strives for the unicorn status of zero downtime, that is almost impossible to achieve in the face of “Unknown Unknowns.” I analyze the outage and offer some ways to limit the damage if disaster strikes you.
Despite rising investments in artificial intelligence (AI) by today’s enterprises, trust in the insights delivered by AI can be hit or miss with the C-suite. In this article, published in Harvard Business Review, Joe and I discuss what you can do to close that trust gap.
The recent AWS US-East-1 outage is a catalyst for customers to rapidly address their AIOps and Observability capabilities, especially the monitoring/observability portion. If an AWS cloud region can take a hit because of a misconfiguration, the organizations that depend on it are even more vulnerable. I analyze the outage and offer some ways to mitigate that risk.
In the RPA space, it is a dog-eat-dog world. Not only are established players fighting it out, but many new startups are also trying to disrupt this space using AI/ML technologies. Automation Anywhere just acquired FortressIQ to up the ante! Read my analysis of how it changes the RPA landscape.
AWS always used to come across as the landing place for digital innovators to experiment, innovate, and then productionize. They always had a good story for attracting bleeding-edge innovators. This time I felt they missed that beat a little. Overall, the event came across as less innovative and more incremental to what they already have. There were no earth-shattering new initiatives that blew me away. It could be because they wanted to play it safe with the change at the helm. What do you think?
I was fortunate enough to be invited to attend and speak at the Refresh 2021 conference in Las Vegas earlier this month. This blog is my review of the conference.
When it comes to crisis and incident management in the cloud/digital era, HOPE IS NOT A STRATEGY! A properly set up Incident Management process should identify the incidents, provide you with Root Cause Analysis (RCA), propose possible fixes, and escalate the issue to the right SRE, DevOps engineer, or SME in a matter of minutes.
In the digital economy, you must move fast to survive, not in six-month release cycles. But fast release cycles, continuous releases, and a mature CI/CD pipeline are only a portion of the solution. If you break your systems at a faster rate but cannot fix them faster as well, you are setting yourself up for unplanned disasters that will hurt your business sooner rather than later. I discuss some of the fixes in this blog.
What are the criteria for selecting a good AIOps solution? How do you compare and measure the solutions against one another, especially when there are so many out there, all claiming to solve the problem better than the others? In this article, I outline the top 5 criteria that all buyers should keep in mind when considering an AIOps solution. Let me know if you have more.