
Context Engineering Framework (CEF) - Architecture

Version: beta-0.5
Status: Research Beta (Optimized for Rapid Prototyping & Analysis)
Date: November 27, 2025
Target Audience: AI Conference - Technical Experts


Executive Summary

Context Engineering Framework (CEF) is a domain-agnostic Java ORM for knowledge-model persistence in LLM applications. Just as Hibernate abstracts relational databases for transactional data, CEF abstracts knowledge stores (graph + vector) for LLM context.

Key Innovation

Relationship Navigation → Semantic Search → Keyword Fallback

CEF provides an ORM layer for context engineering - define knowledge models (entities, relationships), persist them to dual stores, and query context through intelligent assembly strategies that automatically fall back to ensure comprehensive results.

Core Differentiators

Feature        | RDBMS ORM (Hibernate)             | Knowledge Model ORM (CEF)
---------------|-----------------------------------|---------------------------------------------------------------
Data Model     | Tables, columns, foreign keys     | Nodes, edges, semantic relations
Persistence    | Single database                   | Dual (graph + vector stores)
Query Strategy | SQL optimization                  | Relationship navigation + semantic search
Context        | Transactional data                | Knowledge context for LLMs
Fallback       | Query rewrite                     | 3-level automatic (relation → semantic → keyword)
Schema         | JPA annotations                   | Domain-agnostic Node/Edge model
Storage        | Pluggable (MySQL, Postgres, etc.) | Pluggable (JGraphT, Neo4j, Qdrant, Pinecone)
Scale          | Millions of rows                  | 100K nodes (JGraphT, tested) / millions (Neo4j, experimental)

Problem Statement

Why an ORM for Context Engineering?

Just as developers needed Hibernate to abstract RDBMS complexity for transactional data, LLM applications need an ORM to abstract knowledge stores for context engineering.

  1. Knowledge Models ≠ Data Models - LLM context requires semantic relationships, not just foreign keys
  2. Dual Persistence Complexity - Managing graph stores (relationships) + vector stores (semantics) is error-prone
  3. No Standard Patterns - Every project reinvents persistence, caching, and query optimization
  4. Entity-Relationship Reasoning - LLMs need to navigate "Patient → Condition → Treatment" relationships
  5. Storage Flexibility - Development (in-memory) to production (Neo4j, Qdrant) without code changes

What CEF Provides

  • Knowledge Model ORM - Define entities (Node) and relationships (Edge) like JPA @Entity
  • Dual Persistence - Transparent management of graph + vector stores
  • Intelligent Context Assembly - Relationship navigation with automatic fallback strategies
  • Domain Agnostic - Framework provides primitives, you define semantics
  • Pluggable Storage - Swap backends via configuration (JGraphT, Neo4j, Postgres, Qdrant, Pinecone)
  • Research Patterns - Batch operations, monitoring, reproducible benchmarks


Architecture Overview

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│ LLM Application │
│ (OpenAI, Ollama, vLLM, etc.) │
└────────────────────────────┬────────────────────────────────────┘

│ MCP Tool Call
│ retrieve_context(textQuery, graphHints)

┌─────────────────────────────────────────────────────────────────┐
│ Context Engineering Framework (CEF) │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ KnowledgeRetriever (MCP Tool) │ │
│ │ • Dynamic schema injection (LLM learns graph structure) │ │
│ │ • 3-level fallback (Hybrid → Vector → Keyword) │ │
│ │ • Configurable required fields │ │
│ └──────────────┬──────────────────────┬────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────┐ ┌─────────────────────────────────┐ │
│ │ GraphStore API │ │ VectorStore API │ │
│ │ ┌───────────────┐ │ │ ┌───────────────────────────┐ │ │
│ │ │ JGraphT (def) │ │ │ │ Postgres+pgvector (def) │ │ │
│ │ │ Neo4j (opt) │ │ │ │ DuckDB VSS (opt) │ │ │
│ │ │ TinkerPop │ │ │ │ Qdrant (opt) │ │ │
│ │ └───────────────┘ │ │ │ Pinecone (opt) │ │ │
│ └─────────────────────┘ │ └───────────────────────────┘ │ │
│ │ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Knowledge Model (Domain Agnostic) │ │
│ │ • Node (id, label, properties, vectorizableContent) │ │
│ │ • Edge (relationType, source, target, weight) │ │
│ │ • RelationType (name, semantics, bidirectional) │ │
│ │ • Chunk (content, embedding, linkedNodeId) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Parser System (User Extensible) │ │
│ │ • YAML/JSON/CSV (SnakeYAML, Jackson, OpenCSV) │ │
│ │ • ANTLR (optional, for complex 10-50 page PDFs) │ │
│ │ • Custom parsers (user-defined, priority-based) │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ Pluggable Storage Layer │
│ │
│ Graph: JGraphT (in-memory, <100K) | Neo4j (millions) │
│ Vector: Postgres/pgvector | DuckDB | Qdrant | Pinecone │
│ Config: cef.graph.store, cef.vector.store │
└─────────────────────────────────────────────────────────────────┘

Core Components

1. Knowledge Model (Domain Agnostic)

Philosophy: Framework provides primitives, users define semantics.

Node - Universal Entity

@Entity
public class Node {
    UUID id;
    String label;                    // User-defined: "Patient", "Product", "Document"
    Map<String, Object> properties;  // Flexible JSONB bag
    String vectorizableContent;      // Auto-embedded if present
    Timestamp created, updated;
}

Edge - Typed Relationships

@Entity
public class Edge {
    UUID id;
    String relationType;             // User-defined: "TREATS", "CONTAINS", "IS_A"
    UUID sourceNodeId;
    UUID targetNodeId;
    Map<String, Object> properties;
    Double weight;                   // For weighted graphs
}

RelationType - Semantic Hints for Reasoning

@Entity
public class RelationType {
    String name;
    RelationSemantics semantics;     // HIERARCHICAL, ASSOCIATIVE, CAUSAL, TEMPORAL, SPATIAL, CUSTOM
    boolean bidirectional;
}

enum RelationSemantics {
    HIERARCHICAL,  // Parent-child or taxonomy (IS_PART_OF, IS_A)
    ASSOCIATIVE,   // General link (RELATED_TO, REFERENCES)
    CAUSAL,        // Cause-effect (CAUSES, TREATS)
    TEMPORAL,      // Time-based (BEFORE, AFTER)
    SPATIAL,       // Location (NEAR, LOCATED_IN)
    CUSTOM         // User-defined
}

Key Insight: Framework uses RelationSemantics for traversal logic (e.g., follow HIERARCHICAL for parent-child queries) without knowing domain specifics.
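To make this division of responsibility concrete, the following self-contained sketch mirrors the Node/Edge/RelationType primitives above with local records and shows a domain team expressing "Patient → Condition" entirely in data. The record definitions, property values, and the chosen semantics are illustrative, not the framework's actual classes.

import java.util.*;

// Standalone illustration (not the framework's API): local records mirroring the
// Node/Edge/RelationType primitives above. The framework never learns what a
// "Patient" is; the domain vocabulary lives in user-supplied data.
public class DomainModelSketch {
    record Node(UUID id, String label, Map<String, Object> properties, String vectorizableContent) {}
    record Edge(UUID id, String relationType, UUID sourceNodeId, UUID targetNodeId, double weight) {}
    enum RelationSemantics { HIERARCHICAL, ASSOCIATIVE, CAUSAL, TEMPORAL, SPATIAL, CUSTOM }
    record RelationType(String name, RelationSemantics semantics, boolean bidirectional) {}

    public static void main(String[] args) {
        // Relation vocabulary is user-defined; only its semantics hint is framework-visible.
        RelationType hasCondition = new RelationType("HAS_CONDITION", RelationSemantics.ASSOCIATIVE, false);

        Node patient = new Node(UUID.randomUUID(), "Patient",
                Map.of("name", "John", "age", 54), "John, 54, history of type 2 diabetes");
        Node condition = new Node(UUID.randomUUID(), "Condition",
                Map.of("name", "Diabetes Type 2", "icd10Code", "E11"), "Diabetes mellitus type 2");

        Edge edge = new Edge(UUID.randomUUID(), hasCondition.name(), patient.id(), condition.id(), 1.0);
        System.out.println(patient.label() + " --" + edge.relationType() + "--> " + condition.label());
    }
}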

2. Intelligent Retrieval with 3-Level Fallback

@MCPTool(name = "retrieve_context")
public SearchResult retrieve(RetrievalRequest request) {

    // Step 1: Graph Reasoning (if entity provided)
    ReasoningContext context = null;
    if (request.graphHints() != null) {
        UUID nodeId = resolveEntity(request.graphHints());
        context = graphStore.traverse(nodeId, depth, semantics);
        // Returns: related nodes, paths, extracted keywords
    }

    // Step 2: Hybrid Search (graph-constrained vector)
    if (context != null) {
        Set<UUID> contextChunkIds = getChunksForNodes(context.relatedNodes());
        SearchResult result = vectorStore.vectorSearchFiltered(
            embed(enhanceQuery(request.query(), context.keywords())),
            contextChunkIds,
            topK
        );
        if (result.size() >= minResults) return result;  // Success!
    }

    // Step 3: Fallback to Pure Vector Search
    SearchResult result = vectorStore.vectorSearch(
        embed(request.query()),
        topK
    );
    if (result.size() >= minResults) return result;      // Acceptable

    // Step 4: Final Fallback to BM25 Keyword Search
    return vectorStore.fullTextSearch(request.query(), topK);  // Never empty
}

Why This Works:

  1. Hybrid (Graph + Vector) - Best quality, leverages structured relationships
  2. Pure Vector - Good quality, works when no entity context
  3. BM25 Keyword - Acceptable quality, guarantees non-empty results

3. MCP Tool with Dynamic Schema Injection

Problem: How does the LLM learn the graph structure it needs to construct entity-aware queries?

Solution: Inject schema into tool description at runtime.

@PostConstruct
public void registerTool() {
    GraphSchema schema = schemaRegistry.discoverSchema();  // At startup

    // formatSchema(schema) renders the discovered structure, e.g.:
    //   Node Types: Patient(name,age,gender), Doctor(name,specialization),
    //               Condition(name,icd10Code), Medication(name,dosage)
    //   Relations:  HAS_CONDITION(Patient→Condition), TREATS(Doctor→Patient),
    //               PRESCRIBED_MEDICATION(Patient→Medication)
    String description = """
        Retrieve context using graph reasoning + semantic search.

        GRAPH SCHEMA:
        %s

        REQUEST SCHEMA:
        - textQuery: REQUIRED (natural language)
        - graphHints: OPTIONAL (use when entity mentioned in query)
        - entityLabel: REQUIRED if graphHints (from schema above)
        - entityProperty: REQUIRED if graphHints
        - entityValue: REQUIRED if graphHints

        EXAMPLES:
        1. Pure semantic: {"textQuery": "diabetes symptoms"}
        2. Entity-aware:  {"textQuery": "John's conditions",
                           "graphHints": {"entityLabel": "Patient",
                                          "entityProperty": "name",
                                          "entityValue": "John"}}
        """.formatted(formatSchema(schema));

    mcpToolRegistry.register("retrieve_context", description, this::retrieve);
}

Result: The LLM learns the graph structure on its first call and constructs correct entity-aware queries without hardcoding.
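For illustration, a hypothetical formatSchema helper of the kind the snippet above relies on might render the discovered schema as shown below. GraphSchema, NodeType, and RelationDef are local records invented for this sketch; only the rendered output format follows the description text above.

import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch of a schema-rendering helper for dynamic tool descriptions.
// These record types are illustrative stand-ins, not the framework's actual types.
public class SchemaFormatterSketch {
    record NodeType(String label, List<String> properties) {}
    record RelationDef(String name, String sourceLabel, String targetLabel) {}
    record GraphSchema(List<NodeType> nodeTypes, List<RelationDef> relations) {}

    static String formatSchema(GraphSchema schema) {
        String nodes = schema.nodeTypes().stream()
                .map(n -> n.label() + "(" + String.join(",", n.properties()) + ")")
                .collect(Collectors.joining(", "));
        String relations = schema.relations().stream()
                .map(r -> r.name() + "(" + r.sourceLabel() + "→" + r.targetLabel() + ")")
                .collect(Collectors.joining(", "));
        return "Node Types: " + nodes + "\nRelations: " + relations;
    }

    public static void main(String[] args) {
        GraphSchema schema = new GraphSchema(
                List.of(new NodeType("Patient", List.of("name", "age", "gender")),
                        new NodeType("Condition", List.of("name", "icd10Code"))),
                List.of(new RelationDef("HAS_CONDITION", "Patient", "Condition")));
        System.out.println(formatSchema(schema));
    }
}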

4. Pluggable Storage Architecture

GraphStore Interface

public interface GraphStore {
    void addNode(Node node);
    void addEdge(Edge edge);
    ReasoningContext traverse(UUID startId, int depth, Set<RelationSemantics> semantics);
    Optional<GraphPath> findPath(UUID from, UUID to);
    String getImplementation();
}

Implementations:

  • JGraphTGraphStore (default) - In-memory, <100K nodes, O(1) lookups
  • Neo4jGraphStore (optional) - Millions of nodes, Cypher queries
  • TinkerPopGraphStore (optional) - Gremlin-compatible (JanusGraph, etc.)
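As a sketch of how the cef.graph.store switch might be honored, backend selection can be expressed as conditional Spring beans. The configuration class, bean methods, and the Neo4jProperties type below are assumptions; only the GraphStore interface and the implementation class names come from the architecture above.

import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Sketch: pluggable graph backends selected by the cef.graph.store property.
// Constructors and the Neo4jProperties holder are illustrative assumptions.
@Configuration
public class GraphStoreConfig {

    @Bean
    @ConditionalOnProperty(name = "cef.graph.store", havingValue = "jgrapht", matchIfMissing = true)
    public GraphStore jgraphtGraphStore() {
        return new JGraphTGraphStore();            // in-memory default, <100K nodes
    }

    @Bean
    @ConditionalOnProperty(name = "cef.graph.store", havingValue = "neo4j")
    public GraphStore neo4jGraphStore(Neo4jProperties props) {
        return new Neo4jGraphStore(props.uri());   // disk-based, millions of nodes
    }
}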

VectorStore Interface

public interface VectorStore {
    void indexChunk(Chunk chunk);
    List<RankedChunk> vectorSearch(float[] embedding, int topK);
    List<RankedChunk> vectorSearchFiltered(float[] embedding, Set<UUID> chunkIds, int topK);
    List<RankedChunk> fullTextSearch(String query, int topK);
}

Implementations:

  • PostgresVectorStore (default) - Postgres + pgvector extension
  • DuckDBVectorStore (lightweight) - DuckDB with VSS extension
  • QdrantVectorStore (optional) - Specialized vector DB
  • PineconeVectorStore (optional) - Cloud-hosted
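To make the VectorStore contract tangible, here is a toy in-memory sketch (not a CEF implementation) showing cosine-ranked search, the graph-constrained filtered variant used by hybrid search, and a naive term-overlap stand-in for the BM25 fallback. Chunk and RankedChunk are simplified local records.

import java.util.*;
import java.util.stream.Collectors;

// Toy reference implementation of the VectorStore contract for illustration only.
public class ToyVectorStore {
    record Chunk(UUID id, String content, float[] embedding) {}
    record RankedChunk(Chunk chunk, double score) {}

    private final Map<UUID, Chunk> chunks = new HashMap<>();

    public void indexChunk(Chunk chunk) { chunks.put(chunk.id(), chunk); }

    public List<RankedChunk> vectorSearch(float[] query, int topK) {
        return vectorSearchFiltered(query, chunks.keySet(), topK);
    }

    // The filtered variant is what enables hybrid search: only chunks linked to
    // nodes found by graph traversal are considered.
    public List<RankedChunk> vectorSearchFiltered(float[] query, Set<UUID> allowedIds, int topK) {
        return chunks.values().stream()
                .filter(c -> allowedIds.contains(c.id()))
                .map(c -> new RankedChunk(c, cosine(query, c.embedding())))
                .sorted(Comparator.comparingDouble(RankedChunk::score).reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }

    // Naive term-overlap stand-in for the BM25 keyword fallback.
    public List<RankedChunk> fullTextSearch(String query, int topK) {
        Set<String> terms = new HashSet<>(Arrays.asList(query.toLowerCase().split("\\s+")));
        return chunks.values().stream()
                .map(c -> new RankedChunk(c, Arrays.stream(c.content().toLowerCase().split("\\s+"))
                        .filter(terms::contains).count()))
                .sorted(Comparator.comparingDouble(RankedChunk::score).reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }

    private static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-9);
    }
}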

Configuration:

cef:
  graph:
    store: jgrapht          # jgrapht | neo4j | tinkerpop
    jgrapht:
      max-nodes: 100000
    neo4j:
      uri: bolt://localhost:7687

  vector:
    store: postgres         # postgres | duckdb | qdrant | pinecone
    duckdb:
      path: ./data/knowledge.duckdb
    qdrant:
      host: localhost
      port: 6333

Reasoning Context Extraction

Core Algorithm: Depth-limited breadth-first traversal with semantic filtering.

public ReasoningContext traverse(UUID startId, int depth, Set<RelationSemantics> semantics) {
    Node rootNode = findNode(startId);
    Set<Node> visitedNodes = new HashSet<>();
    List<GraphPath> paths = new ArrayList<>();

    Queue<TraversalState> queue = new LinkedList<>();
    queue.offer(new TraversalState(startId, 0, new ArrayList<>()));

    while (!queue.isEmpty()) {
        TraversalState state = queue.poll();

        if (state.depth >= depth) continue;

        Node current = findNode(state.nodeId);
        visitedNodes.add(current);

        // Get outgoing edges filtered by semantics
        List<Edge> edges = getOutgoingEdges(state.nodeId)
            .stream()
            .filter(e -> semantics == null ||
                         semantics.contains(getRelationType(e).semantics()))
            .toList();

        for (Edge edge : edges) {
            Node target = findNode(edge.targetNodeId);
            visitedNodes.add(target);

            List<Edge> newPath = new ArrayList<>(state.path);
            newPath.add(edge);
            paths.add(new GraphPath(newPath));

            queue.offer(new TraversalState(edge.targetNodeId, state.depth + 1, newPath));
        }
    }

    // Extract keywords from visited nodes
    Set<String> keywords = visitedNodes.stream()
        .flatMap(n -> extractKeywords(n).stream())
        .collect(Collectors.toSet());

    return new ReasoningContext(rootNode, visitedNodes, paths, keywords, depth);
}

Example: Medical Query "Treatments for John's diabetes"

Input: graphHints = {entityLabel: "Patient", entityProperty: "name", entityValue: "John"}
depth = 2

Step 1: Resolve entity → Patient(id=uuid-123, name="John")

Step 2: Traverse graph:
Patient(John) --HAS_CONDITION--> Condition(Diabetes Type 2)
Patient(John) --TREATED_BY--> Doctor(Smith)
Condition(Diabetes) --TREATED_WITH--> Medication(Metformin)
Condition(Diabetes) --HAS_GUIDELINE--> Document(Treatment Protocol)

Step 3: Extract context:
- Related nodes: [Patient(John), Condition(Diabetes), Doctor(Smith),
Medication(Metformin), Document(Protocol)]
- Keywords: ["diabetes", "type 2", "metformin", "treatment", "protocol"]
- Paths: [John→Diabetes, John→Smith, Diabetes→Metformin, Diabetes→Protocol]

Step 4: Enhance query:
Original: "treatments for diabetes"
Enhanced: "treatments diabetes type 2 metformin protocol"

Step 5: Hybrid search:
Search only chunks linked to: [Diabetes, Metformin, Protocol]
Result: Relevant treatment chunks (high precision)
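Step 4 (query enhancement) can be illustrated with a small sketch: append graph-derived keywords that the user's query does not already contain. The enhanceQuery name matches the retrieval pseudocode earlier; this particular implementation is an assumption.

import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch of the query-enhancement step: graph-extracted keywords are
// appended to the original query unless already present.
public class QueryEnhancerSketch {
    static String enhanceQuery(String query, Set<String> keywords) {
        Set<String> queryTerms = Arrays.stream(query.toLowerCase().split("\\s+"))
                .collect(Collectors.toSet());
        String extras = keywords.stream()
                .filter(k -> !queryTerms.contains(k.toLowerCase()))
                .collect(Collectors.joining(" "));
        return extras.isBlank() ? query : query + " " + extras;
    }

    public static void main(String[] args) {
        System.out.println(enhanceQuery("treatments for diabetes",
                new LinkedHashSet<>(List.of("diabetes", "type 2", "metformin", "protocol"))));
        // -> "treatments for diabetes type 2 metformin protocol"
    }
}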

Parser System Architecture

Philosophy: Simple formats use standard libraries. ANTLR only for complex custom formats.

When to Use ANTLR

Complex Structured Documents

  • 10-50 page medical lab reports with specific sections
  • Legal contracts with defined clause structure
  • Custom domain-specific languages (DSLs)
  • Proprietary file formats with grammar

Don't Use ANTLR For

  • YAML/JSON/CSV files (use SnakeYAML, Jackson, OpenCSV)
  • Plain text documents
  • HTML/XML (use Jsoup, standard parsers)

Example: ANTLR for Medical PDFs

Grammar (LabReport.g4):

grammar LabReport;

labReport   : header sections footer ;
header      : PATIENT_ID NAME DOB ;
sections    : testSection+ ;
testSection : TEST_NAME results ;
results     : (TEST_RESULT UNIT RANGE)+ ;

Parser Implementation:

@Component
public class MedicalPdfParser extends AbstractParser<LabReportData> {

    @Override
    public LabReportData parse(InputStream input, ParseContext context) {
        String pdfText = extractTextFromPdf(input);

        LabReportLexer lexer = new LabReportLexer(CharStreams.fromString(pdfText));
        LabReportParser parser = new LabReportParser(new CommonTokenStream(lexer));

        LabReportVisitor visitor = new LabReportVisitor();
        return visitor.visit(parser.labReport());
    }

    @Override
    public List<NodeInput> extractNodes(LabReportData data) {
        // Convert parsed data to Node/Edge/Chunk inputs
    }
}

Benefit: Define grammar once, parse thousands of similar documents efficiently.
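The "custom parsers (user-defined, priority-based)" extension point mentioned earlier can be pictured as a small SPI. The supports() and priority() methods below are assumptions about how registration might work; ParseContext and NodeInput are the types already used in the example above.

import java.io.InputStream;
import java.util.List;

// Hypothetical sketch of a user-extensible parser SPI. Method names supports() and
// priority() are assumptions; higher priority would win when several parsers match.
public interface Parser<T> {
    boolean supports(String fileName, String mimeType);   // e.g. fileName.endsWith(".pdf")
    int priority();                                        // ordering among matching parsers
    T parse(InputStream input, ParseContext context);      // produce a domain object
    List<NodeInput> extractNodes(T parsed);                // map it to Node/Edge/Chunk inputs
}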


Performance Characteristics

Indexing Performance

Operation                          | Target | Notes
-----------------------------------|--------|----------------------------------
Index 1,000 nodes                  | <2s    | Batch insert with transaction
Index 10,000 nodes                 | <10s   | Parallel embedding generation
Full index (50K nodes, 200K edges) | <60s   | Initial load from database
Incremental node add               | <50ms  | Single insert + in-memory update

Retrieval Performance

Operation                  | Target | Implementation
---------------------------|--------|------------------------------------
Graph traversal (depth 3)  | <50ms  | JGraphT in-memory BFS
Vector search (10K chunks) | <100ms | Postgres pgvector HNSW
Hybrid search              | <150ms | Graph traversal + filtered vector
BM25 full-text             | <80ms  | PostgreSQL ts_vector index

Memory Usage

Scale                  | JGraphT Memory | Notes
-----------------------|----------------|----------------------------------
10K nodes, 50K edges   | ~35MB          | O(n) node index, O(m) edge list
50K nodes, 250K edges  | ~180MB         | Acceptable for most applications
100K nodes, 500K edges | ~350MB         | Recommended maximum for JGraphT
1M+ nodes              | Use Neo4j      | Switch to disk-based graph DB

Configuration Reference

Complete Configuration Example

cef:
  # Graph storage backend
  graph:
    store: jgrapht                  # jgrapht (default) | neo4j | tinkerpop
    preload-on-startup: true
    jgrapht:
      max-nodes: 100000
      max-edges: 500000
    neo4j:
      uri: bolt://localhost:7687
      username: neo4j
      password: ${NEO4J_PASSWORD}

  # Vector storage backend
  vector:
    store: postgres                 # postgres (default) | duckdb | qdrant | pinecone
    postgres: {}                    # Uses spring.datasource configuration
    duckdb:
      path: ./data/knowledge.duckdb
    qdrant:
      host: localhost
      port: 6333
      api-key: ${QDRANT_API_KEY}
      collection: cef-chunks
    pinecone:
      api-key: ${PINECONE_API_KEY}
      environment: us-east-1
      index-name: cef-chunks

  # MCP tool configuration
  mcp:
    tool-format: openai             # openai (default) | mcp
    schema-injection: dynamic       # static | dynamic (default)
    required-fields:
      text-query: true              # Always required
      graph-hints: false            # Optional by default (trust LLM)
      semantic-keywords: false

  # Embedding configuration
  embedding:
    provider: ollama                # ollama | openai | custom
    model: nomic-embed-text
    dimension: 768
    base-url: http://localhost:11434

  # Search configuration
  search:
    default-depth: 2
    default-top-k: 10
    min-results-threshold: 3        # Fallback trigger

  # Database configuration
  storage:
    postgres:
      connection:
        url: jdbc:postgresql://localhost:5432/cef
        username: cef_user
        password: ${DB_PASSWORD}
      schemas:
        graph: cef_graph
        vector: cef_vector
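As an illustration of how such a block might be consumed in a Spring Boot application, the cef.search settings above could bind to a typed properties record. The record name is an assumption; Spring Boot's relaxed binding maps default-top-k to defaultTopK, and registering the record (for example via @EnableConfigurationProperties) is assumed to be handled by the application.

import org.springframework.boot.context.properties.ConfigurationProperties;

// Sketch: typed binding for the cef.search block shown above (record name assumed).
@ConfigurationProperties(prefix = "cef.search")
public record SearchProperties(
        int defaultDepth,           // cef.search.default-depth
        int defaultTopK,            // cef.search.default-top-k
        int minResultsThreshold) {  // cef.search.min-results-threshold (fallback trigger)
}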

Production Deployment Considerations

Scaling Strategies

Small Scale (<10K nodes)

  • JGraphT + Postgres/pgvector
  • Single instance deployment
  • Memory: 2-4GB
  • Cost: Minimal (open source)

Medium Scale (10K-100K nodes)

  • JGraphT + DuckDB (lightweight) or Postgres
  • Single instance with L2 cache (Redis)
  • Memory: 4-8GB per instance
  • Cost: Low (open source + cache)

Large Scale (1M+ nodes)

  • Experimental / Roadmap
  • Neo4j + Qdrant
  • Note: While the architecture supports pluggable backends like Neo4j, the current v0.5 implementation loads the active graph into memory for performance. Distributed graph partitioning is planned for v1.0.

High Availability

Deployment Model: This release is designed as a Single-Instance Research Pod.

  • Ideal for: Ad-hoc analysis, nightly batch jobs, and isolated research environments.
  • Not for: Multi-instance clusters (distributed locking is on the roadmap).

Monitoring & Observability

Metrics Exported (Micrometer):

  • cef.retrieval.duration - Retrieval latency histogram
  • cef.retrieval.strategy - Counter by strategy (hybrid/vector/bm25)
  • cef.graph.traversal.nodes - Nodes visited per traversal
  • cef.vector.search.latency - Vector search duration
  • cef.cache.hit_rate - L1/L2 cache effectiveness
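A sketch of how the first two metrics could be recorded with Micrometer follows; the wrapper class and its wiring are assumptions, while the metric names and the strategy tag values (hybrid/vector/bm25) come from the list above and the fallback design.

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import java.util.function.Supplier;

// Sketch: record retrieval latency and strategy counts around a retrieval call.
public class RetrievalMetrics {

    private final MeterRegistry registry;

    public RetrievalMetrics(MeterRegistry registry) {
        this.registry = registry;
    }

    public <T> T timed(String strategy, Supplier<T> retrieval) {
        Timer.Sample sample = Timer.start(registry);
        try {
            return retrieval.get();
        } finally {
            // cef.retrieval.duration histogram, tagged by the strategy that answered
            sample.stop(registry.timer("cef.retrieval.duration", "strategy", strategy));
            // cef.retrieval.strategy counter (hybrid | vector | bm25)
            registry.counter("cef.retrieval.strategy", "strategy", strategy).increment();
        }
    }
}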

Distributed Tracing (OpenTelemetry):

@WithSpan
public SearchResult retrieve(RetrievalRequest request) {
    Span span = Span.current();
    span.setAttribute("query", request.query());
    span.setAttribute("depth", request.depth());
    // ... retrieval logic
}

Health Checks:

@Component
public class CefHealthIndicator implements HealthIndicator {
    @Override
    public Health health() {
        return Health.up()
            .withDetail("graph_store", graphStore.getImplementation())
            .withDetail("vector_store", vectorStore.getImplementation())
            .withDetail("node_count", graphStore.getNodeCount())
            .withDetail("chunk_count", vectorStore.getChunkCount())
            .withDetail("memory_usage_mb", getMemoryUsage())
            .build();
    }
}

Comparison with Existing Solutions

vs. Traditional Data ORMs (Hibernate/JPA)

Aspect           | Hibernate/JPA                   | CEF Framework
-----------------|---------------------------------|-------------------------------------------
Purpose          | Transactional data persistence  | Knowledge context persistence
Data Model       | Tables, columns, FK constraints | Nodes, edges, semantic relations
Storage          | Single RDBMS                    | Dual (graph + vector stores)
Query Approach   | SQL (structured)                | Relationship navigation + semantic search
Caching          | L1/L2 cache                     | L1/L2 cache (planned)
Lifecycle Hooks  | @PrePersist, @PostLoad          | Planned
Batch Operations | StatelessSession                | Batch indexing
Domain Focus     | Business entities               | Knowledge entities for LLMs

vs. LangChain/LlamaIndex

Aspect           | LangChain/LlamaIndex   | CEF Framework
-----------------|------------------------|------------------------------------
Paradigm         | Document processing    | ORM for knowledge models
Language         | Python-first           | Java/Spring
Relationships    | Limited                | First-class (like JPA @OneToMany)
Context Assembly | Vector similarity only | Relationship navigation + semantic
Persistence      | Manual                 | Declarative (like JPA)
Domain Model     | Document-centric       | Entity-relationship model
Storage          | Vendor-specific        | Pluggable (like JDBC drivers)
Enterprise       | Limited                | Spring Boot ecosystem

vs. Standalone Graph Databases (Neo4j)

Aspect            | Neo4j Alone          | CEF Framework
------------------|----------------------|------------------------------------
Abstraction Level | Database             | ORM framework
Vendor Lock-In    | Yes (Neo4j required) | No (pluggable like Hibernate)
Vector Search     | Plugin/addon         | First-class dual persistence
Small Projects    | Overkill             | Suitable (in-memory JGraphT)
ORM Features      | None                 | Caching, lifecycle hooks, batch ops
Scale             | Millions+ nodes      | 10K-1M+ (configurable backend)
Cost              | $$$ (enterprise)     | $ (open source default)

Research Foundations

CEF architecture builds on established research:

  1. GraphRAG (Microsoft Research, 2024)
    "Graph-augmented RAG improves retrieval precision by 23-35% over pure vector search"

    • CEF implements similar graph traversal + keyword extraction
  2. Multi-Hop Reasoning (Stanford, 2023)
    "Following relationship paths enables complex question answering"

    • CEF's ReasoningContext provides multi-hop paths to LLM
  3. Hybrid Search (Google, 2023)
    "Combining keyword + semantic search outperforms either alone"

    • CEF's 3-level fallback ensures graceful degradation
  4. Entity-Centric Retrieval (Meta AI, 2024)
    "Entity-aware RAG improves factual accuracy for entity-specific queries"

    • CEF's graphHints enables entity-centric retrieval

Use Cases

Medical Knowledge Systems

Domain: Hospital records, patient data, medical literature
Scale: 50K patients, 100K conditions, 200K documents
Graph: Patient → Condition → Treatment → Medication
Benefit: "What treatments has John received?" traverses graph for precise context
Domain: Contracts, case law, regulations
Scale: 10K cases, 50K clauses, 100K precedents
Graph: Case → Precedent → Statute → Clause
Benefit: "Find precedents for clause X" follows citation graph

Financial Risk Assessment

Domain: Companies, transactions, risk factors
Scale: 100K companies, 1M transactions, 500K reports
Graph: Company → Transaction → Counterparty → Risk
Benefit: "Risk exposure for company Y" aggregates relationship-based risk

E-Commerce Recommendations

Domain: Products, categories, user behavior
Scale: 1M products, 10M users, 50M interactions
Graph: User → Purchase → Product → Category → Similar Products
Benefit: "Products for user Z" combines graph + vector for hybrid recommendations

Future Roadmap

Near-Term (3-6 months)

  • Temporal graph support (time-travel queries)
  • Multi-tenancy (isolated graphs per tenant)
  • Graph algorithms (PageRank, community detection)
  • Advanced caching (distributed, write-through)

Medium-Term (6-12 months)

  • Real-time graph updates (streaming indexing)
  • Graph visualization export (GraphML, Cypher)
  • Query optimization (cost-based planning)
  • Cross-graph federation (query multiple graphs)

Long-Term (12+ months)

  • Graph neural networks (GNN embeddings)
  • Automated schema discovery (ML-based)
  • Multi-modal support (images, audio in chunks)
  • Federated learning (privacy-preserving RAG)


Conclusion

Context Engineering Framework (CEF) addresses critical gaps in existing RAG solutions by combining:

  1. Graph Reasoning - Structured relationships enhance retrieval precision
  2. Automatic Fallback - 3-level strategy prevents zero-result failures
  3. Domain Agnostic - User-defined schemas, framework provides primitives
  4. Pluggable Architecture - Scale from 10K to millions of nodes via config
  5. MCP Integration - Dynamic schema injection enables agentic LLM behavior

Target Audience: Enterprise Research Pods and Academic teams building knowledge-intensive LLM applications where entity relationships matter (medical, legal, finance).

Key Innovation: Graph reasoning → vector search → keyword fallback with LLM-driven entity resolution via MCP tool with dynamic schema injection.

Deployment Status: Ready for Ephemeral Research Pods and Single-Instance Analysis. Not recommended for long-running, high-concurrency enterprise clusters in this release.


Document Version: 2.0
Last Updated: November 27, 2025
Author: Mahmudur R Manna (mrmanna) - Founder and Principal Architect, DDSE Foundation
Organization: DDSE Foundation (Decision-Driven Software Engineering)

License: MIT License - Copyright (c) 2024-2025 DDSE Foundation