Root Cause Analysis Q&A Featuring Forrester Senior Analyst Andrew Hewitt
A look at how holistic, scalable, artificial intelligence-powered RCA can have a profound effect on distributed workforces
The importance of root cause analysis for digital workplaces shouldn’t be underestimated — especially with dispersed workforces that rely heavily on their devices to work from anywhere at any time. But it can sometimes be difficult for businesses to understand the value of efficient, artificial intelligence-driven root cause analysis and the impact it can have on end users as well as the bottom line.
So how can IT make the case for tools that quickly and holistically identify the root cause of issues and help intelligently resolve problems at scale? Following our recent webinar on experience level agreements (XLAs), we asked our guest speaker Andrew Hewitt, a Forrester Senior Analyst and expert on employee experience, to answer a few key questions about root cause analysis and the vital role it plays in modern IT strategies.
Keep reading to see his responses or download the PDF version: Making the Business Case for Root Cause Analysis.
Root Cause Analysis Q&A with Andrew Hewitt
Q: Why do IT teams often report difficulty in quickly and effectively identifying root cause?
The first and most important reason is IT hygiene. This is especially true if the organization does not have good hygiene when it comes to endpoint management. Without ensuring a consistent patch success rate, prompting users to restart devices, wiping caches when full, and maintaining visibility over malware, employee devices can start to behave erratically in ways that don’t make sense or have no historical precedent.
The second and very common reason is that IT increasingly owns less of the end-user technology than it did in the past. In the old days, the IT organization owned the PC, the apps, the network connectivity, the authentication, etc., but today third-party managed service providers or hyperscale providers own or operate much of the technology employees use daily. Deciphering whether an issue on an endpoint is related to a faulty hard drive versus a SaaS outage or poor ISV app design can be difficult. This gets even harder when organizations start to do RCA for remote workers, where personally owned devices, unknown peripherals, and shaky at-home Internet connectivity inhibit RCA even further.
Q: Why is effective root cause analysis particularly important to supporting a remote or hybrid workforce?
In a remote-work scenario, the employee relies exclusively on technology not just to do their job, but also to connect them to their company’s culture and coworkers. The stakes are simply much higher if there is an issue that prevents them from connecting to the internet, accessing the right information to do their jobs, or collaborating seamlessly with their colleagues. It’s a matter of engagement over burnout. Getting to the root cause faster is essential to remote worker engagement because the longer they’re left in the dark, the more likely they are to burn out over the long term.
When it comes to hybrid work, the number of potential issues escalates drastically as employees move locations, connect to different Wi-Fi networks, reserve conference rooms and desks, and then return home for a few days a week. Effective RCA is important in this scenario because the organization needs visibility over a multitude of constantly changing locations and workstyles, each of which opens up new opportunities for device, app, and network failures. I would argue that RCA is hardest in the hybrid work scenario because of the sheer volume of different locations and scenarios that IT has to manage.
Q: What should organizations look for when evaluating tools and solutions that can support root cause analysis?
Organizations should look for three primary characteristics when evaluating tools and solutions that can support root cause analysis: holistic, fast, and scalable.
A holistic RCA capability spans all of the different technologies that impact productivity. Many tools today focus on one aspect of RCA, such as device performance. In reality, many other technologies impact productivity, from network connectivity at home, to ISV outages, to ISP outages, to backend server issues, and the list goes on. A great RCA tool can trace the issue across all of these areas to pinpoint its exact location.
A fast RCA capability enables IT admins to quickly resolve issues without having to dig too deep into the console. Some vendors today offer a button as simple as “Fix” next to an issue, enabling the IT admin to simply click and remediate the issue. This of course takes a lot of historical knowledge and artificial intelligence-based capabilities to understand whether the proposed fix does indeed solve the problem. Importantly, this capability isn’t buried deep in the console — it’s located on the landing page of the console, which makes it far easier to fix problems quickly. Some platforms even have suggested remediation actions to help IT admins prioritize the right fixes and their impact on digital employee experience.
A scalable RCA capability is fast and holistic as well, but can execute RCA across many devices, applications, users, etc. at the same time. Instead of having to do RCA on a 1:1 basis (one admin investigates one device), admins can run RCA across hundreds or thousands of devices to determine what other devices are also having the same issue and perform a bulk remediation. The less admins have to perform RCA on individual devices and the more they can do it in bulk, the faster users can get back to productivity.
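As a rough illustration of the bulk approach described above, the following is a minimal Python sketch that groups devices by the issue they report, so that an admin could target one remediation at the whole group. The function names, device IDs, and issue labels are all hypothetical; they are not taken from any vendor's product.

```python
from collections import defaultdict


def group_devices_by_issue(reports):
    """Map each issue label to the sorted list of devices reporting it.

    `reports` is an iterable of (device_id, issue) pairs. Both fields are
    hypothetical placeholders for whatever telemetry a real tool collects.
    """
    groups = defaultdict(set)
    for device_id, issue in reports:
        groups[issue].add(device_id)
    return {issue: sorted(devices) for issue, devices in groups.items()}


def bulk_remediation_candidates(reports, min_devices=2):
    """Keep only issues seen on at least `min_devices` devices."""
    groups = group_devices_by_issue(reports)
    return {
        issue: devices
        for issue, devices in groups.items()
        if len(devices) >= min_devices
    }


# Illustrative data only.
reports = [
    ("laptop-01", "vpn_client_crash"),
    ("laptop-02", "vpn_client_crash"),
    ("laptop-03", "disk_full"),
    ("laptop-04", "vpn_client_crash"),
]
candidates = bulk_remediation_candidates(reports)
```

Here `candidates` holds only the issue shared by several devices, which is the group an admin would remediate in bulk rather than one device at a time.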
Q: Is the value of root cause analysis truly understood at the business level? If not, what is not understood/appreciated?
RCA is often too complex a topic for the business. Out of all my research on employee experience over the years, the one thing that stands out most is that people fundamentally care most about making progress in their daily work: productivity, in other words. In the context of RCA, business leaders and individual contributors only care about getting back to work as quickly as possible, regardless of what actually caused the issue. In some cases, you might find some tech-savvy users who would be interested in resolving issues themselves in the future, but most employees don’t fall into that category.
What we often see at Forrester is that employees equate their overall experience with their service desk experience, which is a limited way to look at digital experience. What’s missing is an understanding of the chain of technologies that stitch together to create the overall digital employee experience. For example, many employees don’t understand why certain resources require a VPN or why VPN usage impacts overall PC performance, but they do understand the frustration of a slow computer that’s forced to sign into a VPN for every resource they need to be productive. Similarly, employees might confuse a SaaS connectivity outage with an issue with their ISP. At the end of the day, the result is the same: the employee can’t work and they’re frustrated. They don’t necessarily care why the issue happened, just that it’s fixed.
Overall, business leaders could benefit by learning more about the benefits of good RCA because it’s so fundamentally linked with ensuring technology doesn’t hinder employee success. It’s up to IT to ensure that they’re effectively marketing the value of RCA.
Q: How should organizations modernize their approach to identifying root cause?
AI-based approaches to RCA are the future of better issue remediation. Identifying vendors that have significant experience with RCA and a library of potential scenarios should be top of mind for organizations that want to modernize their approach to RCA. AI-based approaches are superior to traditional RCA because they use historical data to identify likely causes of an issue based on prior events. They can also help suggest remediations that are likely to work given the issue at hand. Ideally, your modern approach to RCA should also be holistic enough to see across the entire chain of digital employee experience, including devices, apps, network connectivity, SaaS, and beyond.
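To make the idea of using historical data concrete, here is a deliberately simple Python sketch that ranks likely causes of a symptom by how often each cause was confirmed for that symptom in past incidents. This is plain frequency counting, a stand-in for the far richer models a real AI-based tool would use. The function name, symptom labels, and cause labels are hypothetical.

```python
from collections import Counter


def rank_likely_causes(history, symptom):
    """Rank past root causes for `symptom` by their share of prior incidents.

    `history` is a list of (symptom, confirmed_cause) pairs from resolved
    incidents. Returns (cause, share) tuples, most frequent first, or an
    empty list when the symptom has never been seen before.
    """
    counts = Counter(cause for seen, cause in history if seen == symptom)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(cause, n / total) for cause, n in counts.most_common()]


# Illustrative incident history only.
history = [
    ("slow_login", "vpn_latency"),
    ("slow_login", "vpn_latency"),
    ("slow_login", "full_disk"),
    ("app_crash", "outdated_patch"),
]
ranked = rank_likely_causes(history, "slow_login")
```

In this toy history, the cause that appears most often for the symptom is ranked first, which is the intuition behind suggesting the remediation most likely to work.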
Of course, investing in the right tools that have these capabilities is only one part of the story. Part of a great RCA strategy also involves a culture shift, one in which an organization can use data to verify which technology is causing the issue. Belief in this data is absolutely essential for convincing product owners that it is indeed their product that is causing the degradation in experience. The organization needs to be willing to invest in learning, training, and improved collaboration to truly modernize its RCA strategy.
Support a Better RCA Strategy
Discover how digital experience management gives IT the tools to dig into root causes quickly and efficiently. Request a customized demo today.