Integration of AI agents for the automation of SaaS system maintenance

By Nasreddine Menacer, Ph.D. in Robotics | Assistant Professor at aivancity

A SaaS (Software as a Service) offering is an application accessible online, without the need for local installation, typically via a web browser. Tools such as Google Workspace, Salesforce, Notion, and Stripe are common examples. This model has become the norm in most organizations, to the point where a company’s software infrastructure now relies on a set of interconnected services, often operated by different providers. Reports such as those from Okta show that a company uses an average of nearly 90 SaaS applications, with significantly higher numbers in large organizations.

This reliance on a large number of services introduces structural complexity. An application no longer functions as an isolated unit, but as an assembly of distributed components: external APIs, authentication systems, databases, cloud services, and data pipelines. Each component evolves independently, making the overall system vulnerable to local changes.

In this context, incidents often stem from these interactions. An API changes a response schema, an endpoint is deprecated, an authentication mechanism changes, or a service slows down and triggers a cascade of timeouts. The application code remains unchanged, but the system’s behavior no longer matches what was expected.

The handling of these incidents still relies heavily on human intervention. A developer analyzes the logs, identifies the point of failure, formulates a hypothesis, tests a fix, and validates the result. This process is well-established, but it remains entirely reactive.

At the same time, artificial intelligence has become widely adopted in many operational applications. It is used to generate code, analyze data, and support decision-making. It is therefore reasonable to ask the following question: if these models are capable of producing code and interpreting technical situations, why not use them to directly assist with system maintenance?

To answer this question, we need to clarify exactly what these models do.

The language models used today are autoregressive: they generate a sequence of tokens one at a time, each selected from a probability distribution conditioned on the tokens that precede it. Their role is to produce an output based on a given context. This context includes both the user input and a set of predefined instructions, often in the form of a system prompt. This framework shapes the model’s behavior, which remains limited to a generation task. A generative model, used on its own, has neither access to an external environment nor the ability to take action. It produces a response, then stops. This limitation explains why a model used on its own cannot manage a maintenance process.

The shift to agents relies on a different approach to integration. The model is embedded within a system that provides it with the means to interact with its environment. It can access tools, execute queries, read logs, query APIs, and modify a state within a controlled environment.

The process relies on explicit orchestration. A task is broken down into several steps, each corresponding to a call to the model with a specific objective. This sequence, often implemented as prompt chaining, helps structure the overall behavior. The model analyzes the data, decides on an action, interprets the result, and then feeds the information into the next step.
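As a rough illustration of prompt chaining, the sketch below runs a fixed list of prompt templates in order and feeds each output into the next step. The names `run_chain`, `fake_model`, and `STEPS` are hypothetical, and `fake_model` is a deterministic stand-in for a real model call so that the example runs without any external service.

```python
from typing import Callable

# Hypothetical stand-in type for a model call: takes a prompt, returns text.
Model = Callable[[str], str]


def run_chain(model: Model, steps: list[str], initial_input: str) -> list[str]:
    """Run each prompt template in order, feeding the previous output forward."""
    outputs = []
    current = initial_input
    for template in steps:
        prompt = template.format(input=current)
        current = model(prompt)
        outputs.append(current)
    return outputs


def fake_model(prompt: str) -> str:
    # Deterministic stub (assumption): a real system would call a language model here.
    return f"[answer to: {prompt}]"


# Each step has one specific objective, mirroring analyze -> decide -> interpret.
STEPS = [
    "Summarize these logs: {input}",
    "Propose one action given this summary: {input}",
    "Interpret the result of this action: {input}",
]

results = run_chain(fake_model, STEPS, "TypeError on /checkout")
for step_output in results:
    print(step_output)
```

The orchestration lives entirely in ordinary code: the model is only asked to answer one narrow question per step, which is what keeps the overall behavior structured.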

In this context, the agent does not rely on autonomous intelligence in the strict sense, but rather on a combination of rules, tools, and local decisions made by the model. This structure is well-suited to tasks such as maintenance, which require incremental analysis and interaction with multiple sources of information.

Software maintenance relies on structured procedures. When an incident occurs, the analysis begins with an examination of logs and metrics, continues with the identification of a failure point, and then proceeds to the development and validation of a fix.

This type of process can be partially handled by an agent, provided it is given controlled access to the necessary resources: logs, traces, documentation, code, and the test environment.

The architecture of such an agent typically consists of several components:

  • An orchestrator, which defines the sequence of steps
  • A tool registry that lists available actions (reading logs, calling APIs, running tests)
  • A language model used for local analysis and decision-making
  • A working memory that stores intermediate states
  • An isolated environment for testing changes
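The components listed above can be sketched in a few lines, under strong simplifying assumptions. Everything here is hypothetical: the `Agent` class, the two tools, and `PRODUCTION_STATE` are illustrative names, the "language model" is replaced by a rule-based `decide` function, and the isolated environment is reduced to a deep copy of a dictionary.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable

Tool = Callable[[dict], Any]

# Hypothetical production state; tools must never modify it directly.
PRODUCTION_STATE = {"amount_parser": "raw"}


@dataclass
class Agent:
    decide: Callable[[dict], str]               # stand-in for the language model
    tools: dict[str, Tool]                      # tool registry: name -> callable
    memory: dict = field(default_factory=dict)  # working memory for intermediate states

    def run(self, max_steps: int = 5) -> dict:
        """Orchestrator loop: ask for an action, run the tool, store the result."""
        for _ in range(max_steps):
            action = self.decide(self.memory)
            if action == "stop":
                break
            if action not in self.tools:
                raise ValueError(f"unknown tool: {action}")
            self.memory[action] = self.tools[action](self.memory)
        return self.memory


def read_logs(memory: dict) -> list[str]:
    # Stub: a real tool would query a log store.
    return ["TypeError: unsupported operand type(s) for +: 'int' and 'str'"]


def run_tests(memory: dict) -> bool:
    sandbox = copy.deepcopy(PRODUCTION_STATE)  # isolated copy
    sandbox["amount_parser"] = "cast_to_int"   # candidate change, applied to the copy only
    # Stub verdict: a real implementation would run a test suite against the sandbox.
    return sandbox["amount_parser"] == "cast_to_int"


def rule_based_decide(memory: dict) -> str:
    # Stand-in for the model's local decision: pick the next missing step.
    if "read_logs" not in memory:
        return "read_logs"
    if "run_tests" not in memory:
        return "run_tests"
    return "stop"


agent = Agent(decide=rule_based_decide,
              tools={"read_logs": read_logs, "run_tests": run_tests})
memory = agent.run()
print(memory)
```

The `max_steps` bound and the explicit registry lookup are deliberate: the loop cannot run indefinitely, and the agent cannot invoke anything that was not registered in advance.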

The process of resolving an incident then follows a clear sequence. The agent gathers the relevant context, identifies a likely point of failure, proposes one or more hypotheses, and then tests potential fixes in a controlled environment. The results are analyzed before any decision is made to implement a solution.

A SaaS service uses a payment API to process transactions. Following an update on the provider’s end, the type of the `amount` field in the JSON response changes from an integer to a string.

In the existing code, this field is used directly in a numerical operation, which causes an error. The logs show an exception related to an unexpected type, and some transactions fail.

In a typical scenario, a developer analyzes the logs, identifies the source of the problem, modifies the parsing logic to explicitly convert the value, and then deploys a fix.
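A fix of this kind could look like the following sketch. It assumes, which the scenario does not state, that `amount` is expressed as a whole number of minor units (for example cents); the function name `parse_amount` is hypothetical.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(raw: object) -> int:
    """Return the amount as an int, accepting either an int or a numeric string."""
    if isinstance(raw, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("amount must not be a boolean")
    if isinstance(raw, int):
        return raw
    if isinstance(raw, str):
        try:
            value = Decimal(raw.strip())
        except InvalidOperation as exc:
            raise ValueError(f"amount is not numeric: {raw!r}") from exc
        # Assumption: amounts are whole minor units, so fractions are rejected.
        if not value.is_finite() or value != value.to_integral_value():
            raise ValueError(f"amount is not a whole number: {raw!r}")
        return int(value)
    raise TypeError(f"unsupported type for amount: {type(raw).__name__}")


# Both the old (integer) and the new (string) response formats are accepted.
old_total = parse_amount(1999) + 1
new_total = parse_amount("1999") + 1
print(old_total, new_total)
```

Failing loudly on anything unexpected, rather than silently coercing it, keeps a future schema change visible in the logs instead of hiding it in corrupted totals.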

With an integrated agent, the sequence might be as follows:

The agent detects an unusual spike in errors on a specific route. It retrieves the relevant logs and identifies an exception related to a field type. It extracts samples of recent API responses, compares them to archived responses, and notes a difference in structure.
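The comparison between archived and recent responses can be approximated by recording which value types appear under each field. The sketch below only inspects top-level fields; the function names and the sample payloads are made up for illustration.

```python
def collect_types(samples: list[dict]) -> dict[str, set[str]]:
    """Map each top-level field to the set of type names observed for it."""
    seen: dict[str, set[str]] = {}
    for sample in samples:
        for key, value in sample.items():
            seen.setdefault(key, set()).add(type(value).__name__)
    return seen


def schema_drift(archived: list[dict], recent: list[dict]) -> dict[str, tuple[set[str], set[str]]]:
    """Return fields whose observed types differ, as (before, after) pairs."""
    before = collect_types(archived)
    after = collect_types(recent)
    drift = {}
    for key in sorted(before.keys() | after.keys()):
        old_types = before.get(key, set())
        new_types = after.get(key, set())
        if old_types != new_types:
            drift[key] = (old_types, new_types)
    return drift


# Hypothetical samples: the field `amount` changes from an integer to a string.
archived_responses = [{"id": "tx_1", "amount": 1999, "currency": "EUR"}]
recent_responses = [{"id": "tx_2", "amount": "1999", "currency": "EUR"}]

drift = schema_drift(archived_responses, recent_responses)
print(drift)
```

A field that appears or disappears also shows up in the result, with an empty set on the side where it is missing.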

Based on this observation, the agent formulates a hypothesis about the cause of the error, then makes a change to the code or data mapping. This change is tested in a staging environment using representative test cases. If the tests pass and the data flow becomes functional again, the fix is considered ready to apply.

At this point, there are two options: either the system automatically applies the correction within a limited scope, or it requires human approval before deployment.

It is technically possible to integrate this type of agent into a production system and automate part of the maintenance process, including in response to actual incidents. The necessary components already exist: generative models, orchestration tools, isolated environments, and test pipelines.

On the other hand, it would be unwise to let an agent act without supervision.

A generative model produces plausible solutions, with no guarantee of overall correctness. In a complex system, a local change can have unintended consequences that are difficult to anticipate. The question is not whether the agent can make a mistake, but under what conditions that error is contained. A robust architecture therefore imposes explicit constraints:

  • Restriction of available actions (read, write, targeted modification)
  • Isolation of test environments
  • Human approval for changes that affect production
  • Comprehensive logging of actions
  • Systematic rollback capability
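Several of these constraints can be expressed directly in code. The sketch below is one possible arrangement, not a reference design: the `Guard` class, the action names, and `apply_with_rollback` are all hypothetical, and the audit log is a plain in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Guard:
    allowed_actions: set[str]   # allowlist: anything else is refused
    needs_approval: set[str]    # actions that require a named human approver
    audit_log: list[str] = field(default_factory=list)

    def authorize(self, action: str, approved_by: Optional[str] = None) -> bool:
        if action not in self.allowed_actions:
            decision = "denied: not in allowlist"
        elif action in self.needs_approval and approved_by is None:
            decision = "pending: human approval required"
        else:
            decision = "allowed"
        # Every request is logged, whether or not it is allowed.
        self.audit_log.append(f"{action} -> {decision}")
        return decision == "allowed"


def apply_with_rollback(state: dict, change: dict, check: Callable[[dict], bool]) -> dict:
    """Apply `change` to a copy of `state`; return the original if `check` fails."""
    candidate = {**state, **change}
    return candidate if check(candidate) else dict(state)


guard = Guard(
    allowed_actions={"read_logs", "run_tests", "deploy_fix"},
    needs_approval={"deploy_fix"},
)
print(guard.authorize("read_logs"))                        # read-only action
print(guard.authorize("deploy_fix"))                       # blocked until approved
print(guard.authorize("deploy_fix", approved_by="alice"))  # approved by a human
```

The point of the design is that the model never decides whether it is allowed to act; that decision is taken by deterministic code outside the model.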

In this context, the agent serves to speed up diagnosis and correction, without replacing human judgment in critical decisions.

The integration of agents into maintenance builds on mechanisms already in place. Systems have long utilized forms of automated resilience, such as retries, circuit breakers, and auto-scaling.
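For readers unfamiliar with these mechanisms, here is a deliberately simplified circuit breaker. It only counts consecutive failures; production implementations usually add a timed "half-open" state that lets a trial call through after a cooldown, which this sketch omits. The class name and the threshold are illustrative.

```python
class CircuitBreaker:
    """Stop calling a failing dependency after too many consecutive failures."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, func, *args, **kwargs):
        if self.is_open:
            # Fail fast instead of adding load to a dependency that is already struggling.
            raise RuntimeError("circuit open: call skipped")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success resets the counter
        return result


breaker = CircuitBreaker(max_failures=2)
print(breaker.call(lambda: "ok"))
```

The rule is fixed in advance: the breaker reacts to a failure count, with no interpretation of why the calls are failing.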

Generative models extend these mechanisms: because they can interpret context, they can handle situations that were not fully specified in advance.

In practice, this allows for the delegation of part of the incident analysis and resolution process, within a framework governed by rules and control mechanisms.

In the medium term, SaaS architectures are expected to incorporate dedicated maintenance agents. Their role will be to assist with diagnostics and certain corrective actions, which will have a direct impact on response times and operational costs.

For teams, this reduces the time spent on recurring incidents. For companies, it strengthens product positioning. The ability to quickly identify a problem and implement a fix, backed by continuous monitoring, is a concrete selling point in terms of reliability and resolution time.

In this context, the key challenge remains the speed of incident detection and resolution, supported by continuous monitoring and structured response procedures. This level of responsiveness is one of the criteria customers use to evaluate a service.

Bouzenia, I., Devanbu, P., & Pradel, M. (2025). RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. Proceedings of the IEEE/ACM International Conference on Software Engineering.

Jin, H., Huang, L., Cai, H., Yan, J., Li, B., & Chen, H. (2024). From Large Language Models to LLM-Based Agents for Software Engineering: A Survey of Current Trends, Challenges, and Future Directions. arXiv preprint.

Liu, J., Wang, K., Chen, Y., Peng, X., Chen, Z., Zhang, L., & Lou, Y. (2024). Large Language Model-Based Agents for Software Engineering: A Survey. arXiv preprint.

Yang, J., Jimenez, C. E., Wettig, A., Lieret, K., Yao, S., Narasimhan, K., & Press, O. (2024). SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering. Advances in Neural Information Processing Systems.

Xia, C. S., Deng, Y., Dunn, S., & Zhang, L. (2024). Agentless: Demystifying LLM-based Software Engineering Agents. arXiv preprint.

Jimenez, C. E., Yang, J., Wettig, A., Yao, S., Pei, K., Press, O., & Narasimhan, K. (2024). SWE-bench: Can Language Models Resolve Real-World GitHub Issues? International Conference on Learning Representations.

OpenAI & Princeton NLP. (2024). SWE-bench Verified: A Human-Validated Benchmark for Real-World Software Engineering Tasks.

Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023). ReAct: Synergizing Reasoning and Acting in Language Models. International Conference on Learning Representations.

Microsoft Research. (2025). AIOpsLab: A Holistic Framework for Evaluating AI Agents to Enable Autonomous Cloud Operations. arXiv preprint.

Yehudai, A., Eden, L., Li, A., Uziel, G., Zhao, Y., Bar-Haim, R., Cohan, A., & Shmueli-Scheuer, M. (2025). Survey on the Evaluation of LLM-based Agents. arXiv preprint.

Okta. (2024). Businesses at Work 2024. Okta Research Report.

Zhang, Q., et al. (2025). Advances in Automated Program Repair: A Comprehensive Review. Knowledge and Information Systems.
