Kelzodi blog: AI in Practice

Using AI to Preserve Knowledge and Accelerate Maintenance of Enterprise Java Applications

Introduction

Enterprise Java applications often remain in production for decades, supporting critical business processes such as inventory management, order processing, logistics, finance, manufacturing and customer services. Although these systems continue to deliver considerable business value, organizations frequently encounter a common problem: the gradual loss of technical knowledge.

Developers move to other projects, documentation becomes outdated, and maintenance increasingly depends on a small number of specialists who understand the application's internal behavior. Eventually, the greatest risk is no longer the technology itself, but the disappearance of the knowledge required to maintain and evolve it.

Artificial Intelligence offers an opportunity to fundamentally change this situation.

The objective is not to replace the software engineer responsible for maintaining the application. Enterprise software maintenance still requires experienced professionals capable of understanding software architecture, making design decisions, validating business rules and ensuring software quality.

Instead, the goal is to provide an experienced Java architect or software engineer with an intelligent assistant that understands the application almost as well as its original development team did. Once the initial learning phase is complete, the AI assists the engineer continuously: it accelerates incident resolution, improves the quality and safety of software changes, generates technical documentation, identifies architectural dependencies, and even helps explain the business processes implemented by the application.

Rather than replacing engineers, the proposed solution creates a permanent knowledge platform that captures years of technical expertise and transforms it into an intelligent maintenance assistant available throughout the application's entire lifecycle.

Proposed Solution

The proposed solution is based on an on-premise AI platform combining modern Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), semantic search, automated reverse engineering and continuous operational monitoring.

Rather than training a custom model specifically for the application, the platform continuously builds a structured knowledge repository by analysing every available source of information. The LLM consults this repository through a RAG architecture, allowing it to answer questions using accurate, up-to-date and verifiable project knowledge.

The implementation naturally evolves through four project phases:

Architecture Preparation
Knowledge Acquisition
RAG Construction
Automated Reverse Engineering

Once these phases have been completed, the platform becomes an AI-powered maintenance assistant that continuously evolves alongside the application.

Recommended Architecture

The proposed architecture consists of several layers that are loosely coupled but work closely together.

Knowledge Sources

The platform continuously ingests information from multiple technical and functional sources:

Java source code
Relational Database metadata
Configuration files
User Interface descriptions
Functional documentation
Technical documentation
Source code repositories
Application logs
Monitoring platforms
Issue management systems
Architecture diagrams
Deployment pipelines

Ingestion Layer

The ingestion layer is responsible for collecting and normalising information.

Typical responsibilities include:

Repository connectors
Metadata extraction
Source code parsing
Document parsing
Content chunking
Metadata enrichment
Continuous synchronization

Knowledge Base

The processed information is transformed into semantic embeddings and stored inside the knowledge repository.

The knowledge base typically contains:

Vector Database
Semantic Index
Metadata Index
Knowledge Graph
Hybrid Search Index

This repository becomes the technical memory of the application.
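
At its core, the vector database answers one question: which stored chunks are closest to the query embedding? The following is a minimal sketch of that lookup, assuming the embeddings have already been computed by some embedding model and fit in memory. The class name `InMemoryVectorIndex` and the `Hit` record are hypothetical, and a real vector database would normally use an approximate nearest-neighbour index rather than this linear scan.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a vector database (illustration only). */
final class InMemoryVectorIndex {

    /** A chunk identifier together with its similarity to the query. */
    record Hit(String id, double score) {}

    /** Cosine similarity of two vectors of equal dimension; 0.0 if either is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Scores every stored embedding against the query and returns the k best hits. */
    static List<Hit> topK(Map<String, double[]> embeddings, double[] query, int k) {
        return embeddings.entrySet().stream()
                .map(e -> new Hit(e.getKey(), cosine(e.getValue(), query)))
                .sorted(Comparator.comparingDouble(Hit::score).reversed())
                .limit(k)
                .toList();
    }
}
```

The chunk identifiers returned by `topK` are what the orchestration layer would resolve back into text and metadata before building a prompt.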

AI Orchestration Layer

The orchestration layer represents the intelligence of the platform.

Instead of simply forwarding user questions to the LLM, this layer builds the complete execution context.

Its responsibilities include:

Query Understanding
Semantic Search
Context Assembly
Prompt Orchestration
Tool Calling
Conversation Memory
Context Compression
Security Policies
LLM Interaction
Response Generation
Source Citation

Prompt Orchestration is one of the most important components of the solution. It dynamically constructs the final prompt sent to the LLM by combining retrieved documents, source code, database metadata, operational information, previous conversation context and task-specific instructions.
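
To make the idea concrete, here is a minimal sketch of how such a prompt might be assembled. Everything in it is an assumption made for illustration: the class name `PromptAssembler`, the `Snippet` record, the section headings inside the prompt and the character budget used as a crude form of context compression. A real orchestration layer would count tokens and apply security policies before anything reaches the LLM.

```java
import java.util.List;

/** Hypothetical prompt assembler: combines instructions, retrieved context and history. */
final class PromptAssembler {

    /** A retrieved piece of knowledge with the identifier used for source citation. */
    record Snippet(String sourceId, String text) {}

    static String assemble(String taskInstructions, List<Snippet> retrieved,
                           List<String> conversation, String question, int maxContextChars) {
        StringBuilder sb = new StringBuilder();
        sb.append("## Instructions\n").append(taskInstructions).append("\n")
          .append("Cite sources by their [id]. If the context is insufficient, say so.\n\n");

        sb.append("## Context\n");
        int used = 0;
        // Snippets are assumed to arrive ordered by relevance; stop at the first
        // snippet that would exceed the context budget.
        for (Snippet s : retrieved) {
            if (used + s.text().length() > maxContextChars) {
                break;
            }
            sb.append('[').append(s.sourceId()).append("]\n").append(s.text()).append("\n\n");
            used += s.text().length();
        }

        if (!conversation.isEmpty()) {
            sb.append("## Conversation so far\n");
            conversation.forEach(turn -> sb.append(turn).append('\n'));
            sb.append('\n');
        }

        sb.append("## Question\n").append(question).append('\n');
        return sb.toString();
    }
}
```

Keeping the source identifier next to each snippet is what later allows the response to cite where an answer came from.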

Applications

Once deployed, the platform provides several capabilities:

AI Technical Assistant
Incident Analyzer
Code Explorer
Dependency Explorer
Business Process Explorer
Documentation Generator
Change Impact Analyzer
Refactoring Assistant

Live Operational Connections

To keep the knowledge continuously updated, the platform should maintain live connections with operational systems such as:

Source Code Repository
Relational Database metadata
Logging Platform
Monitoring Platform
CI/CD Pipeline
Issue Tracking System
Documentation Repository

These connectors keep the knowledge repository current with new deployments, incidents and code changes, so the assistant's answers reflect the application as it runs today.

Information Required by the AI Platform

The effectiveness of the assistant depends directly on the quality and completeness of the information supplied during the knowledge acquisition process.

Source Code

The complete Java application should be indexed, including:

Controllers
Services
Repositories
Entities
DTOs
Mappers
Validation rules
Batch jobs
Scheduled tasks
Security configuration
Integration services
Unit and integration tests

The objective is to understand not only individual classes but also the relationships between components.
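
One inexpensive way to start capturing those relationships is to scan import statements. The sketch below is only a rough approximation under stated assumptions: the class name `ImportScanner` and the `com.acme.` package prefix used in the usage note are hypothetical, and a regular expression will also match imports inside comments and cannot see same-package or fully qualified references. A proper Java parser would be more reliable.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical scanner that lists the application packages or classes a source file imports. */
final class ImportScanner {

    // Matches "import a.b.C;", "import a.b.*;" and "import static a.b.C.member;".
    // Group 1 holds the imported name without a trailing ".*".
    private static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+(?:static\\s+)?([\\w.]+?)(?:\\.\\*)?\\s*;", Pattern.MULTILINE);

    /** Returns the sorted set of imported names that start with the given package prefix. */
    static Set<String> importsWithPrefix(String javaSource, String packagePrefix) {
        Set<String> result = new TreeSet<>();
        Matcher m = IMPORT.matcher(javaSource);
        while (m.find()) {
            if (m.group(1).startsWith(packagePrefix)) {
                result.add(m.group(1));
            }
        }
        return result;
    }
}
```

Calling `importsWithPrefix(source, "com.acme.")` for every file yields one edge list per class, which is enough to build a first dependency graph. Note that for static imports the returned name includes the imported member.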

Relational Database

The AI should analyse the complete logical database model:

Tables
Columns
Relationships
Primary Keys
Foreign Keys
Constraints
Views
Functions
Stored Procedures
Triggers
Indexes

This information allows the assistant to reconstruct the application's data model.
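
Much of this model can be read through standard JDBC metadata. As an illustration, the sketch below turns the foreign keys of one table into short sentences that could be embedded and indexed. The class name `SchemaDescriber` and the `ForeignKey` record are hypothetical. `DatabaseMetaData.getImportedKeys` and its column labels are part of the JDBC API, but how table names must be cased, and whether catalog and schema can be left as `null`, depends on the database and driver.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that converts foreign-key metadata into indexable text. */
final class SchemaDescriber {

    /** One column-level foreign-key relationship. */
    record ForeignKey(String fkTable, String fkColumn, String pkTable, String pkColumn) {

        /** A one-line description of the relationship, suitable for embedding. */
        String describe() {
            return fkTable + "." + fkColumn + " references " + pkTable + "." + pkColumn;
        }
    }

    /** Reads the foreign keys declared on the given table (no catalog or schema filter). */
    static List<ForeignKey> foreignKeysOf(Connection connection, String table) throws SQLException {
        DatabaseMetaData meta = connection.getMetaData();
        List<ForeignKey> keys = new ArrayList<>();
        try (ResultSet rs = meta.getImportedKeys(null, null, table)) {
            while (rs.next()) {
                keys.add(new ForeignKey(
                        rs.getString("FKTABLE_NAME"), rs.getString("FKCOLUMN_NAME"),
                        rs.getString("PKTABLE_NAME"), rs.getString("PKCOLUMN_NAME")));
            }
        }
        return keys;
    }
}
```

The same pattern extends to `getTables`, `getColumns` and `getIndexInfo`. Stored procedures and triggers usually require vendor-specific catalog queries.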

User Interface

Business knowledge is often better represented by the application's user interface than by the source code itself.

Useful information includes:

Screen captures
Navigation flows
Field descriptions
User manuals
Functional specifications

This enables the AI to relate technical implementation with actual business operations.

Documentation

Every available document should be incorporated:

Functional Specifications
Technical Specifications
Architecture Documents
Deployment Guides
API Documentation
Integration Specifications
Existing Diagrams

Operational Knowledge

Production experience represents one of the most valuable knowledge sources.

The platform should ingest:

Historical incidents
Root Cause Analysis reports
Previous fixes
Frequently executed SQL queries
Production logs
Stack traces
Monitoring alerts

Over time, operational knowledge becomes part of the assistant's expertise.

Phase 1: Architecture Preparation

The first phase focuses on designing the AI ecosystem.

Typical activities include:

Selecting the on-premise LLM
Selecting the Vector Database
Designing the RAG architecture
Designing the AI Orchestration Layer
Defining metadata models
Defining security policies
Identifying information sources
Designing update mechanisms
Defining governance policies

At this stage no application knowledge has yet been generated. The objective is to prepare the platform.

Phase 2: Knowledge Acquisition

Once the infrastructure is available, the platform begins collecting information from every available source.

Inputs include:

Source Code
Relational Database metadata
Documentation
User Interfaces
Configuration Files
Architecture Diagrams
Logs
Monitoring Information
Incident History

Each source is parsed, divided into semantic chunks, enriched with metadata and stored inside the knowledge repository.

This phase creates the knowledge foundation upon which the entire system will operate.
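
The parse, chunk and enrich step can be sketched in a few lines. The example below is deliberately simplistic: it uses fixed-size, overlapping windows of lines as a stand-in for genuinely semantic chunking, which would split at class, method or paragraph boundaries instead. The class name `SourceChunker` and the metadata keys (`path`, `type`, `startLine`, `endLine`) are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical chunker: overlapping line windows, each enriched with metadata. */
final class SourceChunker {

    /** A chunk of text plus the metadata later used for filtering and citation. */
    record Chunk(String text, Map<String, String> metadata) {}

    static List<Chunk> chunk(String sourcePath, String sourceType, String content,
                             int maxLines, int overlapLines) {
        if (maxLines <= 0 || overlapLines < 0 || overlapLines >= maxLines) {
            throw new IllegalArgumentException("require 0 <= overlapLines < maxLines");
        }
        String[] lines = content.split("\\R", -1);
        List<Chunk> chunks = new ArrayList<>();
        int step = maxLines - overlapLines;
        for (int start = 0; start < lines.length; start += step) {
            int end = Math.min(start + maxLines, lines.length);
            String text = String.join("\n", Arrays.copyOfRange(lines, start, end));
            // Line numbers are 1-based and inclusive so they can be cited directly.
            chunks.add(new Chunk(text, Map.of(
                    "path", sourcePath,
                    "type", sourceType,
                    "startLine", String.valueOf(start + 1),
                    "endLine", String.valueOf(end))));
            if (end == lines.length) {
                break; // the last window already reached the end of the source
            }
        }
        return chunks;
    }
}
```

The overlap keeps a statement that straddles a window boundary visible in both neighbouring chunks, at the cost of some duplicated text in the index.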

Phase 3: RAG Construction

After acquiring the available knowledge, the Retrieval-Augmented Generation platform is built.

Activities include:

Embedding generation
Vector indexing
Metadata indexing
Knowledge graph construction
Hybrid search configuration
Semantic retrieval optimisation
Prompt template design
Prompt orchestration workflows
Tool integration
Context ranking
Response validation

The resulting RAG platform allows the AI to retrieve accurate and relevant technical information before generating any response.

Unlike a standalone LLM, which answers from its training data alone, the assistant grounds each answer in the organization's own technical knowledge and can cite the sources it used.
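
Hybrid search needs a way to merge a keyword ranking with a vector ranking. One widely used option, shown here as an illustration rather than as the method this platform prescribes, is reciprocal rank fusion: each document scores the sum of 1 / (k + rank) over the rankings in which it appears. The class name `HybridRanker` is hypothetical, and the constant k (often set to 60) is a tuning parameter.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical hybrid ranker based on reciprocal rank fusion (RRF). */
final class HybridRanker {

    /**
     * Fuses several rankings (best result first) into one list.
     * Ties are broken alphabetically so the result is deterministic.
     */
    static List<String> fuse(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                // i is zero-based, so the rank used in the formula is i + 1.
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Double>comparingByKey()))
                .map(Map.Entry::getKey)
                .toList();
    }
}
```

Because only ranks are used, the keyword and vector scores never need to be normalised onto a common scale.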

Phase 4: Automated Reverse Engineering

Once sufficient knowledge has been collected, the AI begins reconstructing the application's architecture automatically.

This is where the platform starts generating new technical knowledge rather than simply indexing existing information.

Technical Models

The AI can automatically generate:

Application Architecture
Layer Dependencies
Component Relationships
Service Catalogue
API Catalogue
Package Dependencies
Deployment Architecture

Data Models

The assistant reconstructs:

Logical Data Models
Entity Relationships
Data Flows
Database Dependencies

Business Models

Business knowledge extracted from code and documentation includes:

Business Entities
Business Rules
Validation Rules
Decision Logic
Domain Concepts

Workflow Reconstruction

The AI can automatically identify workflows such as:

Order Creation
Order Validation
Inventory Allocation
Stock Reservation
Inventory Updates
Shipment
Order Cancellation
Inventory Adjustments

Dependency Analysis

The assistant identifies:

Cross-module dependencies
Component interactions
Service dependencies
Data dependencies
External integrations

Change Impact Analysis

The platform can estimate:

Components affected by a modification
Potential regressions
Downstream impacts
Risk areas

Automatic Documentation

The reverse engineering process continuously generates documentation such as:

Architecture Documentation
Technical Documentation
API Documentation
Data Model Documentation
Business Process Documentation
Sequence Diagrams
Component Diagrams
Workflow Diagrams
Dependency Diagrams

At this point, the organization has effectively rebuilt the technical knowledge of the application, even if much of the original documentation has been lost.

Operational AI-Assisted Maintenance

Once the platform has completed the reverse engineering process, it becomes an intelligent assistant for day-to-day software maintenance.

Incident Analysis

The assistant can:

Analyse production incidents
Explain stack traces
Suggest root causes
Recommend diagnostic SQL queries
Identify affected components
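
Identifying affected components often starts with the stack trace itself. As a minimal sketch, the code below extracts the frames that belong to the application's own packages, which can then be used as retrieval keys into the knowledge base. The class name `StackTraceFrames` and the `com.acme.` prefix in the usage note are hypothetical, and the regular expression covers the common `at package.Class.method(File.java:line)` shape rather than every variant a JVM can print.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical extractor of application frames from a Java stack trace. */
final class StackTraceFrames {

    /** One stack frame; line is -1 when the trace gives no line number. */
    record Frame(String className, String method, int line) {}

    // Optional "module/" prefix, then class name, method name and an optional ":line".
    private static final Pattern FRAME = Pattern.compile(
            "^\\s*at\\s+(?:\\S*/)?([\\w.$]+)\\.([\\w$<>]+)\\((?:[^:)]*:(\\d+))?[^)]*\\)");

    /** Returns, in trace order, the frames whose class name starts with the prefix. */
    static List<Frame> applicationFrames(String stackTrace, String packagePrefix) {
        List<Frame> frames = new ArrayList<>();
        for (String lineText : stackTrace.split("\\R")) {
            Matcher m = FRAME.matcher(lineText);
            if (m.find() && m.group(1).startsWith(packagePrefix)) {
                int line = m.group(3) == null ? -1 : Integer.parseInt(m.group(3));
                frames.add(new Frame(m.group(1), m.group(2), line));
            }
        }
        return frames;
    }
}
```

Calling `applicationFrames(trace, "com.acme.")` drops framework and JDK frames, so the assistant can focus retrieval on the classes the team actually maintains.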

Business Understanding

Engineers can ask questions such as:

How is inventory updated?
Which validations occur before an order is confirmed?
What happens when an order is cancelled?
Which services update stock levels?

Code Understanding

The assistant explains:

Business logic
Algorithms
Class responsibilities
Method interactions
Design patterns
Technical decisions

Change Impact Analysis

Before modifying the application, the AI can identify:

Impacted services
Impacted database objects
Affected APIs
Dependencies
Potential regressions

Safe Change Assistance

Rather than changing the software automatically, the assistant proposes improvements for human review.

Typical outputs include:

Implementation suggestions
Refactoring opportunities
SQL improvements
Performance recommendations
Security improvements
Regression test suggestions

Human validation remains mandatory before any deployment.

Documentation Assistance

The platform continuously generates and updates:

Technical documentation
Business documentation
Architecture diagrams
API descriptions
Operational guides

Continuous Knowledge Evolution

A defining characteristic of the platform is that it never stops learning.

Every software release enriches the knowledge repository through:

New source code
Database schema evolution
New documentation
Production incidents
Monitoring information
User feedback
Deployment history

The RAG repository continuously evolves, making the assistant increasingly accurate and valuable over time.

Instead of becoming obsolete, the knowledge platform grows alongside the application.
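
Keeping the repository current does not require re-indexing everything after each release. A simple approach, sketched below under stated assumptions, is to hash each source and re-embed only what changed. The class name `ChangeDetector` is hypothetical, the hash store lives in memory rather than in a database, and deleted sources are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical change detector that decides which sources need re-indexing. */
final class ChangeDetector {

    // path -> hash of the content that was last indexed
    private final Map<String, String> indexedHashes = new HashMap<>();

    /** Lower-case hexadecimal SHA-256 digest of the UTF-8 encoded content. */
    static String sha256(String content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Returns the paths that are new or changed and records their hashes as indexed. */
    Set<String> sync(Map<String, String> currentSources) {
        Set<String> toReindex = new TreeSet<>();
        currentSources.forEach((path, content) -> {
            String hash = sha256(content);
            // put returns the previous hash, or null for a path seen for the first time.
            if (!hash.equals(indexedHashes.put(path, hash))) {
                toReindex.add(path);
            }
        });
        return toReindex;
    }
}
```

A connector could call `sync` after every merge or deployment and pass only the returned paths on to chunking and embedding.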

Conclusion

The proposed solution should not be viewed as an attempt to automate software maintenance or replace experienced software engineers.

Its real purpose is to preserve the technical knowledge accumulated over many years and provide software architects and maintenance engineers with an intelligent assistant capable of understanding both the software architecture and the underlying business processes in a fraction of the time traditionally required.

By combining modern Large Language Models, Retrieval-Augmented Generation, automated reverse engineering, semantic search and continuously evolving operational knowledge, organizations can transform legacy enterprise applications into self-documented systems supported by AI.

The result is not autonomous maintenance, but AI-Augmented Software Engineering: significantly faster onboarding of new maintainers, quicker incident resolution, safer software evolution, continuously updated documentation, deeper understanding of business processes, and ultimately, a substantial reduction in the long-term maintenance cost and risk of enterprise applications.

It is worth noting that several commercial solutions already pursue a similar vision of AI-assisted software engineering. Among them, Sourcegraph Cody is probably one of the closest, providing semantic code search, repository-wide understanding, Retrieval-Augmented Generation (RAG), and AI-assisted development over large codebases. However, the approach proposed in this article aims to go beyond source code analysis. It envisions a unified software knowledge platform that combines source code, relational database metadata, user interface descriptions, technical and functional documentation, operational logs, monitoring data, incident history, and deployment information into a continuously evolving knowledge repository. The objective is not only to assist developers while writing code, but to reconstruct the application's technical architecture, business processes, workflows, dependencies, and operational knowledge, creating a long-term AI companion for software maintenance, onboarding, architectural understanding, and safer system evolution.

Monday, June 29, 2026
