Best practices for taking a hybrid approach to AIOps

  



Good managers empower their teams with tools that help them excel at their jobs. Adopting hybrid cloud benefits IT organizations by enabling IT teams to innovate with agility, create better customer experiences, fuel business growth and build competitive advantage. However, managing hybrid applications can be a challenge for IT operations teams, who must sift through terabytes of data generated by data sources that are often siloed and disparate. This is why leading IT organizations are turning to AIOps to improve IT operational resiliency and the productivity of their teams.

At THINK 2021, we learned that leading IT organizations want to take a hybrid approach to AIOps, both to support their digital transformation efforts and to take full advantage of the insights that AI and ML technologies can generate across their hybrid application landscapes.

However, the journey to AIOps is incremental, and each customer will take a slightly different path. Based on our work with many IBM Z customers, we have captured a set of best practices that can help accelerate that journey and deliver value at every step.

Figure 1: Journey to AIOps: four stages of operation. Source: Jason English, Intellyx, “Accelerating AIOps on the Mainframe”, April 2021.

Three key capability areas of AIOps can empower IBM Z IT ops teams and accelerate customer AIOps journeys: accurately detecting emerging problems across hybrid cloud; diagnosing problems and deciding how to fix them quickly in dynamic, complex environments; and acting swiftly to resolve issues with intelligent automation. A recent report from Intellyx describes several best practices for achieving operational resiliency through AIOps strategies that include IBM Z. Within IBM, we have summarized these best practices as:

Detect

  • Monitoring and observability: identify poorly performing APIs quickly and resolve them faster, with full-stack monitoring for early detection of Z incidents
  • Application performance management: isolate problems faster in hybrid applications that include IBM Z, with end-to-end tracking visibility
  • Anomaly detection: avoid outages with advance notification of unusual behavior, before it affects end users or SLAs
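
To make the anomaly detection idea above concrete, here is a minimal sketch of one common approach: flagging a metric sample that strays far from its recent baseline. This post does not describe how any IBM Z product detects anomalies, so everything here is an assumption for illustration only: the `detect_anomalies` helper, the window size, the threshold and the sample data are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate sharply from the recent baseline.

    Hypothetical helper (not a product API): a sample is flagged when it lies
    more than `threshold` standard deviations from the mean of the previous
    `window` unflagged samples.
    """
    recent = deque(maxlen=window)  # sliding baseline of recent unflagged samples
    flagged = []
    for index, value in enumerate(samples):
        if len(recent) == window:
            baseline = mean(recent)
            spread = stdev(recent)
            # Skip the check when the baseline is perfectly flat (spread == 0).
            if spread > 0 and abs(value - baseline) > threshold * spread:
                flagged.append(index)
                continue  # keep the outlier out of the baseline
        recent.append(value)
    return flagged


# Made-up response times in milliseconds, with one obvious spike.
response_times_ms = [102, 98, 101, 99, 100, 103, 97, 250, 101, 99]
print(detect_anomalies(response_times_ms))  # prints [7]
```

In practice, the value of this kind of check is the advance notice: the spike at index 7 is flagged as soon as it arrives, before a sustained slowdown would breach an SLA. Real deployments typically account for seasonality and correlate many metrics, which this sketch does not attempt.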

Decide

Act

  • Collaborative incident remediation: meet demanding SLA availability requirements in changing hybrid environments by resolving incidents faster through chat-based operations and user-friendly dashboards
  • Intelligent automation: greatly reduce the coding needed to implement cross-enterprise system automation, using end-to-end, goal-driven, policy-based system automation that is consistent and reliable across the enterprise
  • Storage automation: automate repetitive, time-consuming storage tasks and transform them into best-practice policies that can be initiated on command or triggered when an event occurs, with little or no human intervention
  • Predictive workload automation: enable predictive workload automation with open scheduling that integrates with DevOps and hybrid cloud solutions

Wherever you are on your journey to adopting more of these AIOps best practices, we have developed the following resources to help:

  • To assess your current stage of AIOps maturity and identify action-oriented next steps for adopting more AIOps best practices, inquire about the 15-minute online AIOps Assessment for IBM Z.
  • Join the AIOps on IBM Z Community to follow the launch of a 10-part blog series describing the above best practices and to engage directly with our AIOps product teams. Note: We will update this blog with hyperlinks to the 10 blogs as they become available, so check back here for the full picture.
  • Finally, to research the IBM Z products that implement AIOps technologies to improve operational resiliency, visit our product portfolio page.



Comments

Tue June 08, 2021 10:13 AM

This is a great launch of a series of blogs!  I am looking forward to the upcoming 10 blogs that will cover Monitoring & Observability; Operations Analytics; Automation; and Performance & Capacity Management.  Please engage with us on these topics!