Power up your IT operations

What you need to know about AIOps

Content provided by IBM and TNW

As our lives become more digitized, the IT infrastructure supporting the applications and services we use has become increasingly complex. Services can run in the cloud, on-premise, serverless, or in hybrid setups, which makes it possible to accommodate different kinds of applications, environments, and audiences.

However, managing such complex IT architectures is becoming increasingly difficult. There are too many moving parts, which makes it hard to optimize IT, predict and prevent outages, and respond to incidents after they happen.

Fortunately, AIOps — the use of AI in IT operations — is a fast-developing field that can address some of these challenges through automation. Here’s everything you need to know about AIOps and what it can do for your organization.

The challenges of modern IT

“The industry is facing three major trends, and first is complexity,” says Pratik Gupta, CTO at IBM Automation.

More organizations are using cloud IT, in many instances in conjunction with on-premise servers. This is in addition to all kinds of serverless technologies, APIs, microservices, and the like that become integrated into applications.

“Many organizations use multiple clouds — up to five. You have on-prem environments, you have cloud environments. It is much more complex than it used to be,” Gupta says.

The second trend is scale.

“We have seen ten years of digitization in one year during the covid pandemic. Organizations are moving to more digital experiences and more applications for getting work done. There are a lot more applications in this hybrid cloud state,” Gupta says.

And third? Well, that is skills.

“Most C-level execs do not have the time or talent to manually manage IT environments, which as we know, are becoming extraordinarily complex,” Gupta says.

These trends are driving interest in automating the IT environment and getting help from AI.

“AI and automation, which can be referred to as intelligent automation, are no longer nice-to-have. It’s a necessity and it’s actually differentiating companies, and those who use automation and AI are going to fare much better,” Gupta says.

This is where AIOps enters the picture. AIOps is a series of tools and services that use AI to automate all IT operations, from monitoring and collecting information to optimizing machines and services, and predicting and resolving incidents.

Observability

“We think of applying AIOps as a transformation not only for technology but also for people,” Gupta says. “People have to understand that this is a way of augmenting their job and not a way of replacing them.”

Basically, AIOps helps IT staff do things that were impossible with their previous tools. The first step to implementing AIOps is to collect quality information about your IT infrastructure and operations. This is important not only to provide you with a better image of your IT infrastructure, but also to train and guide AI systems to optimize and monitor them. This first stage of AIOps is called “observability.”

“Observability is different from past application performance monitoring (APM) in the sense that observability is about collecting all the data,” Gupta says. “Whereas old APM legacy tools may sample information purely from a performance management perspective, observability is capturing information to do AIOps.”

An example of observability tools is IBM Instana Observability, a solution that can capture metrics, traces, and logs from applications running on different computing platforms, going all the way from mobile devices to on-premise servers to mainframes and virtual machines running in the cloud. According to Gupta:

“One of the things observability tools like Instana help you do is to find root causes faster, which application or microservice is causing errors, and directly pinpoint it using very strong heuristics and algorithms and AI.”

AI-powered observability can lead to huge gains. Consider ExaVault (acquired by Files.com), a company that provides file-transfer services to large organizations. ExaVault’s API receives 35,000 requests per minute and over 50 million calls per day. Availability is very critical for ExaVault, but since each customer uses the service and API in different ways, it is very difficult for the company to oversee all activities through traditional monitoring methods.

Using Instana, ExaVault was able to establish observability in its API to monitor and control availability in a way that was impossible with previous APM tools. As a result, the company was able to track and resolve issues faster than before. It reached 99.99% availability and reduced mean time to resolution (MTTR) by around 57%.
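To put those figures in perspective, here is a minimal Python sketch of the arithmetic behind them. It assumes a 365-day year and a perfectly steady request rate, and the function names are illustrative only; nothing here comes from ExaVault or Instana.

```python
MINUTES_PER_DAY = 24 * 60
MINUTES_PER_YEAR = 365 * MINUTES_PER_DAY  # assumption: non-leap year


def requests_per_day(requests_per_minute: int) -> int:
    """Scale a per-minute request rate to a full day, assuming a steady rate."""
    return requests_per_minute * MINUTES_PER_DAY


def downtime_budget_minutes(availability_percent: float,
                            period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Minutes of downtime allowed in a period at a given availability level."""
    return period_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # 35,000 requests per minute is 50,400,000 per day,
    # in line with "over 50 million calls per day".
    print(requests_per_day(35_000))
    # 99.99% availability leaves roughly 52.6 minutes of downtime per year.
    print(round(downtime_budget_minutes(99.99), 1))
```

In other words, at this level of availability the entire annual downtime allowance is under an hour, which is why faster root-cause analysis matters so much.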

Optimization

“In today’s complex environments of cloud, on-prem, and hybrid, once an app is deployed, no human being can mentally monitor and manage how to set things up, configure them properly, and make sure they have the right performance, right server size, memory allocation, and so on,” Gupta says. “These are currently managed through smart guesses.”

Another important aspect of AIOps is the optimization of IT resources. An example is IBM Turbonomic, a tool that analyzes end-to-end environments and creates a single-view topology of the system. Turbonomic can process data from different aspects of the system, including service-level objectives, application configurations, and pricing and contracts. It takes in all this information and helps you optimize the components of your IT ecosystem to achieve different goals, such as improving availability or reducing waste and costs. Depending on your requirements, Turbonomic can automatically optimize your IT components or provide you with recommendations.

A study found that on average, the application of Turbonomic results in a 471% return on investment and the payback period is under six months. Automation tools like Turbonomic help IT departments avoid overprovisioning infrastructure, which on average results in a 75% reduction in IT spend.

For example, BBC Studios used Turbonomic to manage its network of more than 1,000 virtual machines. By applying Turbonomic, the BBC team was able to obtain a full-stack view of their environment, allowing them to better understand the cause of performance problems and bring their environment back into a maximally efficient and performant state. Turbonomic provided them with action recommendations as well as predictions on the impact of each action.

The team started by reviewing Turbonomic’s recommendations and manually implementing them. They ended up automating part of the resizing actions without manual intervention. The smart and automated optimization of their IT resources enabled them to reclaim hundreds of gigabytes of memory and dozens of virtual CPUs in a single month.

Incident prevention and resolution

One of the challenges of complex IT infrastructures is predicting when and where failures will happen — and taking the right measures to prevent them. Another challenge is finding the cause of failures and responding to them in a timely manner. Fortunately, this is another area where AIOps can help.

An example is IBM Cloud Pak for Watson AIOps, a solution that collects all the incidents, metrics, traces, logs, and tickets from an IT system and analyzes them in a generalized AI framework with machine learning models. Cloud Pak for Watson AIOps can help predict blast radius, which is the effect that the outage of a particular component will have on other parts of the system. Accordingly, it can provide recommendations on how to prevent such incidents. As Gupta explains:

“It is a tool that provides a general framework for understanding what happens in the system and taking actions in response to incidents both predictably and proactively.”
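The idea of a blast radius can be illustrated with a small dependency graph. The Python sketch below is a simplified illustration, not how Cloud Pak for Watson AIOps works internally: it assumes the dependencies between services are already known, and the service names and the `blast_radius` function are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: each service lists the services it calls.
DEPENDS_ON = {
    "web-frontend": ["checkout-api", "search-api"],
    "checkout-api": ["payments-db", "auth-service"],
    "search-api": ["search-index"],
    "auth-service": ["users-db"],
    "payments-db": [],
    "search-index": [],
    "users-db": [],
}


def blast_radius(depends_on, failed):
    """Return every service that directly or indirectly depends on `failed`."""
    # Invert the edges so we can look up who depends on a given service.
    dependents = {name: [] for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    # Breadth-first walk from the failed component up through its dependents.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent != failed and parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


if __name__ == "__main__":
    # An outage of users-db reaches auth-service, checkout-api, and web-frontend.
    print(sorted(blast_radius(DEPENDS_ON, "users-db")))
```

A real system would have to discover this graph from traces and logs and weigh how severe each downstream effect is; the sketch only shows why the failure of one low-level component can reach far beyond it.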

Incident prediction is especially useful for organizations that are responsible for critical infrastructure. For example, Taiwan’s National Center for High-performance Computing (NCHC) runs dozens of supercomputers and provides computation resources for all kinds of operations, including drug research and scientific projects. NCHC used Cloud Pak for Watson AIOps to establish an AI-based automation system for predicting incidents and improving resilience.

Cloud Pak for Watson AIOps used structured and unstructured data from NCHC’s compute network to train AI models to automatically and proactively manage problems and incidents. Thanks to automation, NCHC was able to achieve a 55% shorter mean time to detect (MTTD) issues that would affect its service. They were also able to detect potential outages 25 hours in advance, giving them vital time to resolve incidents before they happen.
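For readers unfamiliar with the metric, MTTD is simply the average gap between when an incident starts and when it is detected. The Python sketch below shows one way to compute it and the percentage improvement between two periods. The incident timestamps are made-up illustration data, not NCHC's, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_time_to_detect(incidents):
    """Average minutes between start and detection for (started, detected) pairs."""
    delays = [(detected - started).total_seconds() / 60
              for started, detected in incidents]
    return mean(delays)


def percent_reduction(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (before - after) / before * 100


def make_incidents(delays_minutes):
    """Build illustrative incident records with the given detection delays."""
    start = datetime(2023, 1, 1, 9, 0)
    return [(start, start + timedelta(minutes=m)) for m in delays_minutes]


if __name__ == "__main__":
    # Hypothetical detection delays (in minutes) before and after automation.
    before = mean_time_to_detect(make_incidents([40, 20, 30]))
    after = mean_time_to_detect(make_incidents([12, 15, 13.5]))
    print(before, after, round(percent_reduction(before, after), 1))
```

With these invented numbers, MTTD drops from 30 minutes to 13.5 minutes, which is the kind of change a 55% reduction describes.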

Beyond IT

The benefits of AIOps can go beyond reducing IT costs and outages to creating better applications and serving customers. According to Gupta:

“We’re seeing a shift in the thinking from managing IT as a cost center to managing IT as an enabler for revenue. Not only does AIOps optimize IT infrastructure dynamically and result in savings, but it also frees up the people to do more business-critical work.”

For example, AIOps can help developer teams understand bottlenecks and the effects of failures in advance. This helps them design their applications and systems with robustness built into them, instead of responding to failures ad hoc.

“If you shift left and say how should a development team build their application to be more resilient to failure, the things we do include how code changes affect the quality of the release going out,” Gupta says.

By spending less time addressing technical failures, developers can focus more on creating better products that solve customer problems.

“Several studies show AIOps are resulting in more clients coming to web applications,” Gupta says. “The reason is that the people in IT were now more focused on doing work that is aligned with the business and generates revenue.”

The field is just beginning to take off, and there are many developments in artificial intelligence research that can find their way into AIOps.

“We started off with advanced heuristics, added machine learning models, and we are seeing more and more foundation models in IT and AIOps,” Gupta says.

“Going forward, we’ll see a lot more use of natural language processing and foundation models impacting how IT is managed. We’re going to see a huge amount of intelligence and AI brought to bear in managing IT systems. We see an exciting road ahead with this evolution of using AI in IT. We should stay tuned because the next few years are going to be very exciting in terms of how AI is affecting IT.”

Content provided by IBM and TNW
