Monday, June 1, 2026

Why the Future of Document Management Is Not Another ECM

For more than two decades, Enterprise Content Management (ECM) systems have been the foundation of corporate document management.

They have proven their value by providing secure storage, version control, workflows, permissions, audit trails, records management and compliance capabilities. Platforms such as SharePoint, Oracle WebCenter Content (WCC), OpenText, Alfresco and many others continue to manage millions of business-critical documents every day.

The problem is not that these systems have failed.

The problem is that the expectations of users have changed.

The New Challenge: Knowledge Discovery

Traditionally, document management was focused on storing and retrieving documents.

Users knew what they were looking for:

  • Find a document.

  • Locate the latest version.

  • Check who approved it.

  • Review its history.

Artificial Intelligence introduces a completely different expectation.

Users now want answers rather than documents.

They want to ask questions such as:

  • Which systems are impacted by this change?

  • What requirements are affected by this decision?

  • Which documents contain conflicting information?

  • What risks are associated with this component?

  • What knowledge already exists about this topic?

Answering these questions requires much more than document storage and keyword search.

It requires understanding relationships, context, dependencies and knowledge hidden inside thousands of documents.

This is something that traditional ECM platforms were never designed to do.

Why Simply Adding AI Is Not Enough

Many current initiatives attempt to connect existing ECM repositories to chatbots or generic AI assistants.

While this approach can produce impressive demonstrations, it often struggles in real-world environments.

The reason is simple.

Documents are distributed across multiple repositories, each with its own structure, permissions, metadata models and lifecycle rules.

A typical organization may store information in:

  • SharePoint

  • Oracle WebCenter Content

  • OpenText

  • Alfresco

  • File shares

  • PLM systems

  • Email archives

Each repository contains valuable knowledge, but none of them provides a complete picture.

Adding an AI assistant on top of a single repository does not solve the fragmentation problem.

The Reality: Nobody Wants to Replace Their ECM

This is where many proposed solutions become unrealistic.

Large organizations have invested years, sometimes decades, building their document management ecosystems.

They have:

  • Millions of documents.

  • Complex workflows.

  • Regulatory requirements.

  • Existing integrations.

  • Thousands of users.

No organization wants to hear:

"Replace your entire ECM infrastructure."

The cost, risk and disruption would be enormous.

In practice, most organizations will continue using their existing ECM platforms for many years.

And that is perfectly reasonable.

A Different Approach

Instead of replacing existing ECM systems, a more realistic strategy is to preserve them as the official systems of record.

Their role remains essential:

  • Document storage.

  • Version control.

  • Security.

  • Compliance.

  • Auditability.

  • Records management.

What changes is what sits above them.

Rather than building yet another ECM, organizations should introduce a new layer capable of connecting all existing repositories and transforming the information they contain into usable knowledge.

This new layer becomes the bridge between traditional document management and modern AI capabilities.


Preserving Existing ECM Investments

One of the key advantages of this approach is that it does not require organizations to replace their existing ECM platforms.

Systems such as SharePoint, Oracle WebCenter Content, OpenText or other repositories can continue operating exactly as they do today. Documents remain in their current locations, managed by the same permissions, workflows and governance processes already in place.

The new platform simply connects to these repositories through their existing APIs and content services.

Rather than moving documents, the platform discovers and indexes knowledge from multiple sources while leaving the original content untouched.
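
As a rough illustration of what "connect without moving" could look like, the sketch below defines a read-only connector contract and builds an index of references rather than copies. The names `DocumentRef`, `RepositoryConnector`, `InMemoryConnector` and `build_index`, and the example URLs, are hypothetical; a real connector would call each repository's own API instead of reading from an in-memory list.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol


@dataclass(frozen=True)
class DocumentRef:
    """Pointer to a document that stays in its source repository."""
    repository: str   # illustrative label such as "sharepoint" or "wcc"
    source_id: str    # identifier assigned by the source system
    title: str
    url: str          # link back to the original, which is never copied or moved


class RepositoryConnector(Protocol):
    """Hypothetical read-only contract: there is no write or delete method."""
    name: str

    def list_documents(self) -> Iterator[DocumentRef]: ...


class InMemoryConnector:
    """Stand-in for a real connector that would call the repository's API."""

    def __init__(self, name: str, rows: list[dict]):
        self.name = name
        self._rows = rows

    def list_documents(self) -> Iterator[DocumentRef]:
        for row in self._rows:
            yield DocumentRef(self.name, row["id"], row["title"], row["url"])


def build_index(connectors: list[RepositoryConnector]) -> dict[tuple[str, str], DocumentRef]:
    """Collect references from every repository into one lookup table."""
    index = {}
    for connector in connectors:
        for ref in connector.list_documents():
            index[(ref.repository, ref.source_id)] = ref
    return index


sharepoint = InMemoryConnector("sharepoint", [
    {"id": "42", "title": "Brake system spec", "url": "https://sharepoint.example/42"},
])
wcc = InMemoryConnector("wcc", [
    {"id": "WCC-7", "title": "Brake test report", "url": "https://wcc.example/WCC-7"},
])
index = build_index([sharepoint, wcc])
```

Because the contract exposes only a listing method, the platform has no way to alter source content through it, which is the property this approach depends on.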

This significantly reduces risk, cost and implementation effort.

Starting Small

Perhaps the most important aspect of this architecture is that it does not require a massive transformation project from day one.

A first version could start with a very simple user interface:

  • A unified search screen.
  • An AI-powered chat interface.
  • Basic document discovery across repositories.
  • Simple knowledge exploration capabilities.

Behind the scenes, the platform would connect to existing ECMs and gradually build a unified knowledge layer.

Additional capabilities such as impact analysis, knowledge graphs, intelligent agents and advanced workflows could then be introduced incrementally.

This allows organizations to begin generating value from the first release while evolving towards a much more powerful Document Intelligence Platform over time.

Instead of replacing existing systems, the new platform enhances them, bringing AI-powered knowledge discovery to repositories that organizations already trust and depend on.

From Multiple Repositories to a Unified Knowledge Layer

The objective is not to migrate documents.

The objective is to unify access to knowledge.

In this model, existing repositories remain untouched:

  • SharePoint continues managing SharePoint documents.

  • Oracle WCC continues managing WCC content.

  • Other repositories continue performing their current role.

Above them, a new intelligence layer is introduced.

This layer is responsible for:

  • Discovering information across repositories.

  • Extracting knowledge from documents.

  • Understanding relationships between information.

  • Building semantic indexes.

  • Enforcing the security and permission rules defined in the source repositories.

  • Providing AI-powered search and analysis capabilities.
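
One way to picture how search and permissions fit together is "security trimming": every indexed entry carries the groups allowed to read it, and the search drops anything the current user cannot see before matching. The sketch below is a simplified assumption, not a prescribed design. `IndexedDocument`, `allowed_groups` and `search` are hypothetical names, and plain keyword matching stands in for a real search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDocument:
    doc_id: str
    repository: str
    text: str
    allowed_groups: frozenset  # assumed to be copied from the source repository's ACL


def search(index, query, user_groups):
    """Return ids of documents that match the query and that the user may read."""
    terms = query.lower().split()
    hits = []
    for doc in index:
        # Security trimming first: a document the user cannot read is never matched.
        if not (doc.allowed_groups & user_groups):
            continue
        if all(term in doc.text.lower() for term in terms):
            hits.append(doc.doc_id)
    return hits


index = [
    IndexedDocument("SP-1", "sharepoint", "Brake system specification", frozenset({"engineering"})),
    IndexedDocument("WCC-7", "wcc", "Brake supplier contract", frozenset({"legal"})),
]
engineer_hits = search(index, "brake", frozenset({"engineering"}))
```

Here an engineer searching for "brake" sees only the SharePoint specification, while a user in both groups would see both documents.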

Users no longer need to know where information is stored.

They interact with a unified knowledge platform capable of accessing multiple repositories behind the scenes.

The Next Generation of Document Management

The future is unlikely to be another standalone ECM platform.

Instead, it is likely to be a new architecture where existing ECMs continue acting as trusted repositories while a new intelligence layer provides AI-driven discovery, analysis and knowledge management capabilities.

This approach protects previous investments, reduces migration risks and enables organizations to benefit from Artificial Intelligence without disrupting their existing document management landscape.

The challenge is no longer managing documents.

The challenge is understanding and exploiting the knowledge contained within them.

And that requires a new architectural foundation.

Proposed Technology Stack

At this stage, the objective is not to build every component from scratch. Instead, the platform should leverage proven technologies for each layer while focusing development efforts on the areas that create real business value.

For the user interface and application layer, a rapid development platform such as Jmix provides an excellent starting point. It enables the fast creation of enterprise-grade user interfaces, administration screens, workflows, dashboards and security models, allowing the project to deliver working functionality in a relatively short timeframe.

For the intelligence layer, modern open-source AI technologies provide the foundation for knowledge discovery and semantic search. Large Language Models (LLMs) can be deployed locally to ensure full control over data, while vector databases can be used to support semantic retrieval and knowledge exploration.
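
To make "semantic retrieval" concrete, here is a minimal sketch of what a vector database does at its core: it ranks stored embeddings by their similarity to a query embedding. The three-dimensional vectors are hand-written toy values and the document ids are invented; in practice an embedding model would produce the vectors and a vector database would store and search them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store.items(), key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy embeddings (assumed values, for illustration only).
store = {
    "spec-brakes": [0.9, 0.1, 0.0],
    "report-brake-test": [0.8, 0.3, 0.1],
    "hr-policy": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedding of a brake-related question
nearest = top_k(query, store)
```

The two brake-related documents rank above the HR policy even though no keywords are compared, which is the behavior that separates semantic retrieval from keyword search.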

The platform can therefore be structured around several complementary layers:

  • Application Layer: User interfaces, administration, workflows and dashboards.
  • Knowledge Layer: Unified access to information coming from multiple repositories.
  • AI Layer: Semantic search, document understanding, summarization and knowledge discovery.
  • Security and Governance Layer: Permissions, auditability, classification and compliance.
  • Repository Layer: Existing ECMs and other systems of record.

A possible implementation could combine:

  • Jmix for rapid enterprise application development.
  • Spring Boot for backend services and integration.
  • Local LLMs for secure AI processing.
  • Vector databases for semantic search and retrieval.
  • Knowledge graph technologies for relationship discovery and impact analysis.
  • Existing ECM platforms as trusted systems of record.

The key point is that none of these technologies replace the current repositories. Instead, they work together to create a new layer of intelligence capable of discovering, connecting and exploiting the knowledge already stored across the organization.

This approach allows the project to start with a simple and practical first release while providing a clear path towards a much more advanced Document Intelligence Platform in future iterations.

 


Recommended Technologies

AI Layer

The AI Layer should be built using proven open-source technologies rather than developed from scratch. The objective is to leverage mature components and focus development efforts on the capabilities that create real business value.

| Capability | Recommended Technologies | Purpose |
| --- | --- | --- |
| Large Language Models (LLMs) | Llama, Mistral, Mixtral, Qwen | Document understanding, summarization, question answering and reasoning |
| LLM Runtime / Inference Engine | vLLM, Ollama, Text Generation Inference (TGI) | Efficient execution of AI models, either for production or development environments |
| Vector Database | Qdrant, pgvector, Milvus | Semantic search, embeddings storage and Retrieval-Augmented Generation (RAG) |
| Knowledge Graph | Neo4j, ArangoDB | Relationship discovery, dependency mapping and impact analysis |
| Agent Framework | Dify, LangGraph | AI workflows, intelligent assistants and agent orchestration |

For an initial release, a combination of Jmix, Spring Boot, Dify, vLLM and Qdrant could provide a fast path towards a working platform with AI-powered search, document chat and semantic retrieval capabilities.
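
Document chat in such a first release would typically follow the retrieval-augmented pattern: retrieve a few relevant passages, then ask the model to answer from those passages only. The sketch below shows just the prompt-assembly step under that assumption. `build_prompt`, the passage fields and the prompt wording are hypothetical, and the call to the model itself is deliberately left out.

```python
def build_prompt(question, passages, max_passages=3):
    """Assemble a prompt that restricts the model to the retrieved passages."""
    lines = ["Answer using only the sources below. Cite sources as [n].", ""]
    for n, passage in enumerate(passages[:max_passages], start=1):
        # Each source keeps its repository and title so answers can be traced back.
        lines.append(f"[{n}] ({passage['repository']}: {passage['title']}) {passage['text']}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


passages = [
    {"repository": "sharepoint", "title": "Brake system spec",
     "text": "The ABS controller must respond within 10 ms."},
    {"repository": "wcc", "title": "Brake test report",
     "text": "Measured response time was 12 ms in test run 4."},
]
prompt = build_prompt("Does the ABS controller meet its timing requirement?", passages)
```

Keeping the repository and title next to each passage lets the answer cite where every statement came from, which matters when the sources live in different systems.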

As the platform evolves, more advanced technologies such as LangGraph and Neo4j can be introduced to support sophisticated agent workflows, relationship analysis and knowledge discovery scenarios.
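
A question such as "Which systems are impacted by this change?" is, at its simplest, a graph traversal. The sketch below uses a plain dictionary in place of a graph database and walks it breadth-first. The identifiers (`REQ-12`, `SPEC-brakes` and so on) and the edge direction ("is depended on by") are assumptions made for the example.

```python
from collections import deque


def impacted(graph, changed):
    """Breadth-first walk over 'is depended on by' edges from the changed item."""
    seen = {changed}
    queue = deque([changed])
    order = []
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, []):
            if dependent not in seen:  # guards against cycles and repeated visits
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


# Each key lists the items that depend on it (hypothetical data).
graph = {
    "REQ-12": ["SPEC-brakes"],
    "SPEC-brakes": ["TEST-plan-4", "SYS-abs-controller"],
    "SYS-abs-controller": ["RISK-17"],
}
affected = impacted(graph, "REQ-12")
```

Starting from a changed requirement, the walk reaches the specification, the test plan, the controller and finally the associated risk, nearest dependents first.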

The key point is that these technologies are not the product itself. They are building blocks. The real value lies in the intelligence layer built on top of them, including knowledge federation, document classification, metadata extraction, relationship discovery, impact analysis and security-aware access to information across multiple repositories.

Knowledge Layer

The Knowledge Layer is the core of the platform. Its role is to connect existing repositories, normalize their information, apply security rules and transform distributed documents into usable knowledge.

| Capability | Recommended Technologies | Purpose |
| --- | --- | --- |
| Repository Connectors | REST APIs, CMIS, Microsoft Graph API, Oracle WCC APIs | Connect to SharePoint, WCC, ECMs and other repositories without replacing them |
| Integration and Synchronization | Spring Boot, Apache Camel, Kafka, RabbitMQ | Move metadata, events and document updates between repositories and the intelligence platform |
| Metadata Federation | PostgreSQL, Oracle, Elasticsearch / OpenSearch | Normalize metadata from different systems into a common searchable model |
| Security Federation | LDAP, Active Directory, Keycloak, OAuth2 / OpenID Connect | Preserve permissions, roles and identity rules across repositories |
| Knowledge Processing | Apache Tika, OCR engines, custom extraction services | Extract text, structure and relevant information from documents |
| Search Indexing | OpenSearch, Elasticsearch | Support fast keyword search, filtering and faceted navigation |

This layer is especially important because it prevents the platform from becoming just another isolated repository. Instead, it acts as a bridge between existing ECMs and the new AI capabilities.
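
Keeping the index in step with the repositories usually means consuming change events. A hedged sketch of that idea follows: each event either updates or removes an index entry, and duplicate or out-of-date events are ignored. The event shape (`type`, `repository`, `doc_id`, `version`, `title`) is an assumption; real repositories and message brokers define their own formats.

```python
def apply_event(index, event):
    """Apply one change event; replaying the same event leaves the index unchanged."""
    key = (event["repository"], event["doc_id"])
    if event["type"] == "deleted":
        index.pop(key, None)
        return
    current = index.get(key)
    # Skip stale or duplicate events for documents that are still in the index.
    if current is not None and current["version"] >= event["version"]:
        return
    index[key] = {"version": event["version"], "title": event["title"]}


index = {}
events = [
    {"type": "updated", "repository": "wcc", "doc_id": "WCC-7", "version": 1, "title": "Brake test report"},
    {"type": "updated", "repository": "wcc", "doc_id": "WCC-7", "version": 2, "title": "Brake test report (rev B)"},
    {"type": "updated", "repository": "wcc", "doc_id": "WCC-7", "version": 1, "title": "Brake test report"},
    {"type": "updated", "repository": "sharepoint", "doc_id": "42", "version": 5, "title": "Brake system spec"},
    {"type": "deleted", "repository": "sharepoint", "doc_id": "42"},
]
for event in events:
    apply_event(index, event)
```

After the five events, only the second revision of the WCC report remains: the late version 1 event is ignored and the SharePoint document has been removed again.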

The key idea is that documents can remain in their current systems of record, while the Knowledge Layer creates a unified view of their metadata, content, permissions and relationships.

In other words, the Knowledge Layer is what allows the platform to connect SharePoint, WCC, other ECMs, PLM systems, databases and file shares under a common intelligence model.
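
Metadata federation can be as simple as a per-repository mapping onto shared field names. In the sketch below the source field names are used for illustration only and should be checked against each system's actual schema; the common model (`title`, `owner`, `modified`) and the `normalize` helper are hypothetical.

```python
# Source field -> common field, per repository (illustrative mappings).
FIELD_MAPS = {
    "sharepoint": {"Title": "title", "Author": "owner", "Modified": "modified"},
    "wcc": {"dDocTitle": "title", "dDocAuthor": "owner", "dInDate": "modified"},
}


def normalize(repository, raw):
    """Translate one repository-specific metadata record into the common model."""
    record = {"repository": repository}
    for source_field, common_field in FIELD_MAPS[repository].items():
        record[common_field] = raw.get(source_field)  # missing fields become None
    return record


sp_raw = {"Title": "Brake system spec", "Author": "a.lopez", "Modified": "2026-05-12"}
wcc_raw = {"dDocTitle": "Brake test report", "dDocAuthor": "jsmith", "dInDate": "2026-04-30"}
records = [normalize("sharepoint", sp_raw), normalize("wcc", wcc_raw)]
```

Once both records share the same keys, a single query such as "all documents owned by jsmith" can run across repositories without knowing either system's native field names.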

Presentation Layer

The Presentation Layer (called the Application Layer in the overview above) is responsible for providing simple, powerful and user-friendly access to the platform. It should allow users to search, explore, analyze and interact with knowledge without needing to know where documents are physically stored.

| Capability | Recommended Technologies | Purpose |
| --- | --- | --- |
| Enterprise UI Development | Jmix, Vaadin | Rapid development of enterprise screens, administration panels, dashboards and workflows |
| Advanced Web Interfaces | React, Angular, Vue | Build richer user experiences such as AI search, document exploration and visual analysis |
| AI Chat Interface | Dify UI, custom React UI, Vaadin components | Provide conversational access to documents and knowledge |
| Dashboards and Analytics | Jmix dashboards, Apache Superset, Grafana | Display document metrics, usage, quality indicators and knowledge insights |
| Graph Visualization | Neo4j Bloom, Cytoscape.js, React Flow, D3.js | Visualize relationships between documents, systems, requirements and risks |
| Document Viewer | PDF.js, OnlyOffice, Collabora | Preview documents, compare versions and display extracted knowledge next to the original content |


For the first version, Jmix remains a very appropriate option because it allows the team to build useful enterprise interfaces quickly, including search screens, metadata views, administration panels and basic workflows.

Later, more advanced interfaces can be introduced using React or specialized visualization libraries for AI chat, relationship graphs, impact analysis and document intelligence dashboards.

The main goal of this layer is to make the platform feel simple for the user, even if the underlying architecture connects many repositories, AI services and knowledge sources behind the scenes.

Example of a Unified Search and Knowledge Discovery Experience

The Unified Search and Knowledge Discovery screen is the primary entry point to the platform. It combines traditional document search with AI-powered knowledge discovery, allowing users to search across multiple repositories through a single interface.

Users can perform natural language queries, apply advanced filters and explore documents stored in different systems such as SharePoint, WCC, engineering repositories and other ECM platforms.

Search results are enriched with AI-generated summaries, metadata, relationships and impact information, helping users understand not only which documents exist, but also how they are connected to other systems, requirements, reports and business processes.

The interface is organized into three main areas:

  • Advanced Filters Panel: Allows users to refine searches by repository, document type, classification, status, program, owner, date range and tags.
  • Results Panel: Displays matching documents together with summaries, metadata and related knowledge.
  • Knowledge Panel: Provides additional context, including relationships, dependencies, impact analysis and AI-generated insights.
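
The behavior of the Advanced Filters Panel can be sketched as two small operations over the normalized results: narrowing the list by the selected values, and counting how many results fall under each value so the panel can show totals. The field names and sample data are assumptions for the example; a search engine would normally compute both on the server side.

```python
from collections import Counter


def apply_filters(results, filters):
    """Keep results whose value for every filtered field is among the allowed values."""
    return [
        r for r in results
        if all(r.get(field) in allowed for field, allowed in filters.items())
    ]


def facet_counts(results, field):
    """Count results per value of one field, as shown next to each filter option."""
    return Counter(r[field] for r in results)


results = [
    {"id": "SP-1", "repository": "sharepoint", "doc_type": "specification", "status": "approved"},
    {"id": "WCC-7", "repository": "wcc", "doc_type": "report", "status": "draft"},
    {"id": "AF-3", "repository": "alfresco", "doc_type": "report", "status": "approved"},
]
filters = {"repository": {"sharepoint", "wcc"}, "status": {"approved"}}
filtered = apply_filters(results, filters)
by_type = facet_counts(results, "doc_type")
```

With these filters only the approved SharePoint specification remains, while the facet counts still describe the full result set so users can see what each option would return.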

This approach transforms document retrieval into knowledge discovery, enabling users to find information faster, understand its context and assess its potential impact across the organization.