Published at: "Expert Systems with Applications"

GraphRAG-ASCOC: A Lightweight Framework for Adaptive Synonym-aware Clustering and Ontology Completion

ABSTRACT: Recent advances in large language models (LLMs) are accelerating the shift from document-centric workflows toward the model-driven paradigm of Model-Based Systems Engineering (MBSE). However, when vast, highly structured industrial standards must be converted into executable Web Ontology Language (OWL) ontologies for expert-system reasoning and decision support, text-centric LLM pipelines suffer from hallucination and context-window limits. Our earlier GraphRAG-KM framework partially mitigated these issues, yet synonym redundancy, cluster imbalance, and missing relations still constrain downstream reasoning. To address these gaps, we present GraphRAG-ASCOC (Adaptive Synonym-aware Clustering and Ontology Completion), an enhanced, cost-effective, closed-loop pipeline that outputs compact knowledge bases suitable for intelligent decision support. First, a synonym-merging module combines a mutual-nearest-neighbour criterion with FAISS retrieval to consolidate lexical variants and eliminate redundancy. Second, a multi-feature fusion clustering method uses a scoring function to select the optimal cluster count automatically, ensuring semantic coherence. Finally, to recover implicit relations, we introduce TLC (TransE–LLM Completion), a lightweight module that enriches the ontology with high-confidence triples via embedding recall and semantic validation. To evaluate practical utility, the framework was applied to the U.S. tactical data-link standard MIL-STD-6016B. Experiments demonstrate substantial gains over the original GraphRAG-KM pipeline in entity compression, clustering coherence, and relation-completion F1, producing compact OWL ontologies suitable for rule engines, simulators, and other expert-system components.
These results confirm the enhanced pipeline’s practical readiness for large-scale defence standards and other industrial documents, paving the way for reliable knowledge bases in real-world expert-system applications across defence and manufacturing domains.
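
The mutual-nearest-neighbour criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes entity names already have embedding vectors, uses brute-force cosine similarity in NumPy in place of FAISS retrieval, and the function names and the `threshold` parameter are hypothetical. Two entities are merged only when each is the other's nearest neighbour and their similarity clears the threshold.

```python
import numpy as np


def mutual_nearest_neighbour_pairs(embeddings, threshold=0.9):
    """Return index pairs (i, j), i < j, that are each other's nearest neighbour.

    Assumption: cosine similarity computed by brute force; a FAISS index would
    replace this step on large entity sets.
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalise rows
    sim = x @ x.T                                     # cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)                    # exclude self-matches
    nearest = sim.argmax(axis=1)                      # nearest neighbour of each row

    pairs = []
    for i, j in enumerate(nearest):
        j = int(j)
        # Keep the pair once (i < j), only if the relation is mutual
        # and the similarity is high enough (hypothetical threshold).
        if i < j and nearest[j] == i and sim[i, j] >= threshold:
            pairs.append((i, j))
    return pairs


def merge_synonyms(names, embeddings, threshold=0.9):
    """Group entity names so that mutual nearest neighbours share one group."""
    parent = list(range(len(names)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in mutual_nearest_neighbour_pairs(embeddings, threshold):
        parent[find(j)] = find(i)

    groups = {}
    for idx, name in enumerate(names):
        groups.setdefault(find(idx), []).append(name)
    return list(groups.values())


if __name__ == "__main__":
    # Toy vectors chosen by hand for illustration; they are not real embeddings.
    names = ["data link", "datalink", "message", "radio"]
    vectors = [[1.0, 0.0, 0.0], [0.99, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(merge_synonyms(names, vectors))
```

Requiring the nearest-neighbour relation to hold in both directions is what keeps a popular term from absorbing every loosely related variant: "message" has "datalink" as its nearest neighbour in the toy data, but the relation is not mutual, so it stays in its own group.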
