Release Notes
Version beta-0.5 (November 27, 2025)
First Public Beta Release
This is the initial beta release of the Context Engineering Framework (CEF) from DDSE Foundation. CEF provides an ORM-like abstraction for LLM context engineering, managing knowledge models through dual persistence (graph + vector stores).
Release Highlights
- ORM for Context Engineering - Define knowledge models (nodes, edges) like JPA entities
- Dual Persistence - Automatic management of graph and vector stores
- Intelligent Context Assembly - 3-level strategy (relationship navigation → semantic → keyword)
- Structured Patterns - Repository layer, service patterns, lifecycle hooks
- Comprehensive Documentation - USER_GUIDE, ARCHITECTURE, examples
Core Features
Knowledge Model ORM
- Entity Persistence - Node and Edge entities with JSONB properties
- Relationship Navigation - Multi-hop graph traversal with semantic filtering
- Vectorizable Content - Automatic embedding generation and persistence
- RelationType System - Semantic hints (HIERARCHY, CAUSALITY, ASSOCIATION, etc.)
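The multi-hop traversal with relationship filtering described above can be pictured as a depth-limited breadth-first search that only follows edges of selected types. The sketch below is a minimal illustration using plain JDK collections; `MultiHopSketch`, `Link`, and `reachable` are hypothetical names and are not CEF's `Node`, `Edge`, or `GraphStore` API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: these types are illustrative and are not CEF classes.
final class MultiHopSketch {

    /** A directed, typed edge between two node ids. */
    record Link(String from, String type, String to) {}

    /**
     * Breadth-first traversal from start that follows only edges whose type is
     * in allowedTypes and stops after maxDepth hops.
     * Returns the reachable node ids (excluding start) in discovery order.
     */
    static Set<String> reachable(List<Link> links, String start,
                                 Set<String> allowedTypes, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        Deque<Map.Entry<String, Integer>> queue = new ArrayDeque<>();
        queue.add(Map.entry(start, 0));
        while (!queue.isEmpty()) {
            var current = queue.poll();
            if (current.getValue() == maxDepth) {
                continue; // hop budget exhausted on this branch
            }
            // Linear scan over all edges keeps the sketch short; a real store would index by source node.
            for (Link link : links) {
                if (link.from().equals(current.getKey())
                        && allowedTypes.contains(link.type())
                        && visited.add(link.to())) {
                    queue.add(Map.entry(link.to(), current.getValue() + 1));
                }
            }
        }
        visited.remove(start);
        return visited;
    }

    public static void main(String[] args) {
        List<Link> links = List.of(
                new Link("patient-1", "HAS_CONDITION", "diabetes"),
                new Link("diabetes", "TREATED_BY", "metformin"),
                new Link("patient-1", "SEES", "doctor-7"));
        // Two hops, ignoring SEES edges.
        System.out.println(reachable(links, "patient-1",
                Set.of("HAS_CONDITION", "TREATED_BY"), 2));
    }
}
```

Lowering `maxDepth` to 1 in this sketch would stop at the condition node, which is the trade-off a configurable hop depth exposes: more hops bring in more related context at the cost of a wider search.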
Storage Backends (Pluggable)
- ✅ DuckDB - Embedded database (default, tested)
- ✅ JGraphT - In-memory graph store (default, tested)
- ⚠️ PostgreSQL - External database with pgvector (configured, untested)
- ⚠️ Neo4j - Graph database for large-scale deployments (configured, untested)
- ⚠️ Qdrant - Vector database (configured, untested)
- ⚠️ Pinecone - Cloud vector database (configured, untested)
LLM Integration
- ✅ vLLM - Production inference server with Qwen3-Coder-30B-A3B-Instruct-FP8 (tested)
- ✅ Ollama Embeddings - nomic-embed-text model, 768 dimensions (tested)
- ⚠️ OpenAI - GPT-4, GPT-3.5 Turbo (configured, untested)
- ⚠️ Ollama LLM - Llama 3.x models (configured, untested)
Context Assembly
- Pattern-Based Retrieval - GraphPattern with TraversalStep and Constraint
- Multi-Hop Reasoning - Configurable depth (1-5 hops)
- 3-Level Fallback - Graph → Hybrid → Vector-only
- Semantic Filtering - Relationship semantics-aware traversal
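One way to read the 3-level fallback is as an ordered chain of retrieval strategies, where each level is tried only if the previous one returned too little context. The sketch below assumes that reading; the `Strategy` and `Outcome` types and the "minimum chunk count" rule are illustrative assumptions, since these notes do not state when CEF actually falls through to the next level.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: strategy names mirror the release notes, but the types
// and the fall-through rule are assumptions, not CEF's API.
final class FallbackSketch {

    /** A named retrieval strategy mapping a query to context chunks. */
    record Strategy(String name, Function<String, List<String>> retrieve) {}

    /** The strategy that answered, plus the chunks it produced. */
    record Outcome(String strategy, List<String> chunks) {}

    /**
     * Tries strategies in order and returns the first result with at least
     * minChunks chunks; if none qualifies, returns the last strategy's result.
     */
    static Outcome assemble(List<Strategy> strategies, String query, int minChunks) {
        Outcome last = new Outcome("none", List.of());
        for (Strategy strategy : strategies) {
            last = new Outcome(strategy.name(), strategy.retrieve().apply(query));
            if (last.chunks().size() >= minChunks) {
                return last;
            }
        }
        return last;
    }

    public static void main(String[] args) {
        List<Strategy> chain = List.of(
                new Strategy("graph", q -> List.of()),                 // no pattern match
                new Strategy("hybrid", q -> List.of("chunk-a")),       // too few chunks
                new Strategy("vector-only", q -> List.of("chunk-a", "chunk-b")));
        Outcome outcome = assemble(chain, "diabetes treatments", 2);
        System.out.println(outcome.strategy() + " " + outcome.chunks());
    }
}
```

Keeping the strategies in a plain ordered list makes the fallback order explicit and easy to test in isolation.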
Developer Experience
- Repository Pattern - Domain-specific facades over ORM layer
- Service Layer - Business logic separation with transaction support
- Reactive API - Spring WebFlux + R2DBC for non-blocking I/O
- Configuration - YAML-based with sensible defaults
What's Included
Framework (cef-framework)
<dependency>
  <groupId>org.ddse.ml</groupId>
  <artifactId>cef-framework</artifactId>
  <version>beta-0.5</version>
</dependency>
- KnowledgeIndexer - Entity persistence (like EntityManager)
- KnowledgeRetriever - Context queries (like Repository)
- GraphStore - Pluggable graph backend interface
- VectorStore - Pluggable vector backend interface
- Node, Edge, Chunk, RelationType - Core domain entities
- GraphPattern, TraversalStep, Constraint - Query DSL
Comprehensive Test Suite
- Medical Domain: 150 patients, 5 conditions, 7 medications, 15 doctors (177 nodes, 455 edges)
- Financial Domain: SAP-simulated data (vendors, materials, purchase orders, invoices)
- Benchmarks: 4 complex scenarios proving Knowledge Model superiority
- Results: 60-220% more chunks retrieved than vector-only search (see Benchmark Analysis)
Documentation
- USER_GUIDE.md - Complete ORM integration guide (30KB, 1,200 lines)
- ARCHITECTURE.md - Technical deep dive
- QUICKSTART.md - Getting started in 5 minutes
- KNOWN_ISSUES.md - Testing status and limitations
- README.md - Project overview
Testing Status
Thoroughly Tested ✅
- DuckDB embedded database
- JGraphT in-memory graph (up to 100K nodes)
- vLLM with Qwen3-Coder-30B-A3B-Instruct-FP8
- Ollama embeddings (nomic-embed-text, 768 dimensions)
- Pattern-based retrieval with multi-hop reasoning
- Medical domain example with benchmarks
Configured but Untested ⚠️
- PostgreSQL + pgvector
- Neo4j graph database
- OpenAI GPT models
- Ollama LLM models (Llama 3.x)
- Qdrant vector database
- Pinecone vector database
See Known Issues for details.
Getting Started
Prerequisites
- Java 17+
- Maven 3.8+
- Docker & Docker Compose
Quick Start
# Clone repository
git clone <repository-url>
cd cef
# Start services (Ollama for embeddings)
docker-compose up -d
# Note: vLLM (Qwen3-Coder-30B) required for benchmark reproduction
# See https://docs.vllm.ai/ for installation
# Build framework
mvn clean install
# Run test suite (includes benchmarks)
cd cef-framework
mvn test
# View benchmark results (paths are relative to cef-framework; `open` is macOS-specific, use xdg-open on Linux)
open BENCHMARK_REPORT.md
open BENCHMARK_REPORT_2.md
open SAP_BENCHMARK_REPORT.md
Example Usage
import java.util.Map;

// Define knowledge model
Node patient = new Node(null, "Patient",
    Map.of("name", "John", "age", 45),
    "Patient John with diabetes");
// Persist entity
indexer.indexNode(patient).block();
// Define relationship (patientId and diabetesId are the IDs of previously indexed nodes)
Edge hasCondition = new Edge(null, "HAS_CONDITION",
    patientId, diabetesId, null, 1.0);
indexer.indexEdge(hasCondition).block();
// Query context
SearchResult result = retriever.retrieve(
    RetrievalRequest.builder()
        .query("diabetes treatments")
        .depth(2)
        .topK(10)
        .build()
);
Benchmark Results
A comprehensive test suite validates the Knowledge Model's advantage over vector-only approaches.
Test Domains
- Medical Clinical Decision Support
  - 177 nodes: Patients, Conditions, Medications, Doctors
  - 455 edges: Multi-hop relationships
  - 4 complex scenarios: Contraindication discovery, behavioral patterns, cascading risks, transitive exposure
- Financial SAP-Simulated Data
  - Enterprise procurement workflows
  - Vendor-Material-Invoice relationships
  - Transaction pattern analysis
Key Findings
| Scenario | Vector-Only | Knowledge Model | Improvement |
|---|---|---|---|
| Multi-hop contraindication | 5 chunks | 12 chunks | +140% |
| Behavioral risk patterns | 5 chunks | 8 chunks | +60% |
| Cascading side effects | 5 chunks | 8 chunks | +60% |
| Transitive exposure risk | 5 chunks | 16 chunks | +220% |
| Average | 5.0 chunks | 11.0 chunks | +120% |
Latency: 26ms avg (only +19.5% vs vector-only 21.8ms)

See Benchmark Analysis for detailed analysis.
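The Improvement column in the Key Findings table follows from the chunk counts as (Knowledge Model − Vector-Only) / Vector-Only. The short sketch below recomputes the per-scenario and average figures from the table; `ImprovementSketch` and `improvementPercent` are illustrative names, not part of the benchmark suite.

```java
// Illustrative arithmetic only: recomputes the table's percentages from its chunk counts.
final class ImprovementSketch {

    /** Relative gain of knowledgeModel over vectorOnly, in percent. */
    static double improvementPercent(double vectorOnly, double knowledgeModel) {
        return (knowledgeModel - vectorOnly) / vectorOnly * 100.0;
    }

    public static void main(String[] args) {
        int vectorOnly = 5;                    // chunks per scenario, from the table
        int[] knowledgeModel = {12, 8, 8, 16}; // chunks per scenario, from the table
        double total = 0;
        for (int chunks : knowledgeModel) {
            total += chunks;
            System.out.printf("%d -> %d chunks: %+.0f%%%n",
                    vectorOnly, chunks, improvementPercent(vectorOnly, chunks));
        }
        double average = total / knowledgeModel.length;
        System.out.printf("average %.1f chunks: %+.0f%%%n",
                average, improvementPercent(vectorOnly, average));
    }
}
```

Note that the metric counts retrieved chunks, so it measures how much context each approach surfaces rather than answer quality directly.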
Performance Characteristics
Tested Configuration: DuckDB + JGraphT + vLLM (Qwen3-Coder-30B) + Ollama (nomic-embed-text)
| Operation | Performance | Notes |
|---|---|---|
| Node indexing | <50ms per node | Single insert |
| Batch indexing | ~2s per 1000 nodes | Transactional batch |
| Graph traversal (depth 2) | <50ms | JGraphT in-memory |
| Vector search (10K chunks) | ~100ms | DuckDB brute-force |
| Hybrid assembly | ~150ms | Graph + vector combined |
| Embedding generation | ~200ms per chunk | Ollama nomic-embed-text |
Benchmark Results (Medical Domain):
- Vector-only: 60 chunks retrieved
- Knowledge Model ORM: 132 chunks retrieved (120% improvement)
- Relationship-aware context with proper entity boundaries
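The "DuckDB brute-force" note in the performance table means every stored embedding is compared against the query, so search cost grows linearly with the number of chunks. The sketch below shows that idea in plain Java with cosine similarity; it is an in-memory illustration under the assumption of non-zero vectors, not the SQL that CEF issues against DuckDB, and `BruteForceSearchSketch` is a hypothetical name.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration of brute-force vector search; not CEF's DuckDB query.
final class BruteForceSearchSketch {

    /** Cosine similarity of two equal-length, non-zero vectors. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Scores every chunk against the query and returns the ids of the k most similar. */
    static List<String> topK(Map<String, float[]> chunks, float[] query, int k) {
        return chunks.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, float[]> e) -> -cosine(e.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Two-dimensional toy vectors; the tested embedding model produces 768 dimensions.
        Map<String, float[]> chunks = Map.of(
                "chunk-a", new float[] {1f, 0f},
                "chunk-b", new float[] {0f, 1f},
                "chunk-c", new float[] {1f, 1f});
        System.out.println(topK(chunks, new float[] {1f, 0f}, 2));
    }
}
```

An approximate index such as HNSW avoids the full scan at the cost of exactness, which is why its absence matters mainly as the chunk count grows.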
Configuration Example
cef:
  # Storage backend
  database:
    type: duckdb # or postgresql
    duckdb:
      path: ./data/cef.duckdb
  # Graph store
  graph:
    store: jgrapht # or neo4j
    preload-on-startup: true
  # Vector store
  vector:
    store: duckdb # or postgres, qdrant, pinecone
  # LLM provider
  llm:
    default-provider: vllm # or ollama, openai
    vllm:
      base-url: http://localhost:8001
      model: Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8
  # Embedding provider
  embedding:
    provider: ollama # or openai
    model: nomic-embed-text
    dimension: 768
Known Limitations
- JGraphT Memory - Recommended maximum 100K nodes (~350MB)
- PostgreSQL Untested - Schema provided but not integration tested
- Concurrent Indexing - Not thread-safe; load data sequentially
- DuckDB Vector Search - Brute-force only, no HNSW index
- No Schema Validation - RelationType semantics are advisory
- Limited Observability - Basic metrics only; enhanced observability is planned for v0.6
See Known Issues for complete list and workarounds.
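For the concurrent-indexing limitation, one generic way to keep loading sequential when data arrives from several threads is to funnel all writes through a single worker thread. The sketch below shows that pattern with a JDK single-thread executor and a stand-in for the indexer; `SequentialIndexingSketch` is hypothetical and does not call CEF's `KnowledgeIndexer`, and the workarounds listed in Known Issues take precedence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

// Hypothetical sketch: serializes writes to a non-thread-safe component via one worker thread.
final class SequentialIndexingSketch {

    /** Stand-in for a non-thread-safe indexer: a plain ArrayList with no locking. */
    private final List<String> indexed = new ArrayList<>();

    /** All index operations run on this single thread, one at a time, in submission order. */
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    /** Safe to call from any thread; the actual write happens on the writer thread. */
    void submit(String nodeLabel) {
        writer.execute(() -> indexed.add(nodeLabel));
    }

    /** Stops accepting work, waits for queued writes to finish, and returns a snapshot. */
    List<String> drain() {
        writer.shutdown();
        try {
            if (!writer.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for indexing", e);
        }
        return List.copyOf(indexed);
    }

    public static void main(String[] args) {
        SequentialIndexingSketch sketch = new SequentialIndexingSketch();
        // Producers run in parallel, but every write is applied sequentially.
        IntStream.range(0, 1_000).parallel().forEach(i -> sketch.submit("node-" + i));
        System.out.println(sketch.drain().size() + " nodes indexed");
    }
}
```

The design choice here is to leave the non-thread-safe component untouched and move the serialization to the caller, which trades indexing throughput for correctness.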
Documentation
- User Guide - Complete integration guide with ORM patterns
- Architecture - Technical architecture and design decisions
- Quick Start - Get started in 5 minutes
- Known Issues - Testing status and limitations
Roadmap
v0.6 (Planned - Q1 2026)
- PostgreSQL production testing and validation
- Thread-safe concurrent indexing
- Enhanced observability (metrics, tracing)
- Performance optimizations for large graphs
- OpenAI integration testing
v0.7 (Planned - Q2 2026)
- Intelligent context truncation
- L1/L2 caching implementation
- Multi-tenancy patterns
- Qdrant/Pinecone testing
v1.0 (Planned - Q3 2026)
- Production-grade release
- Comprehensive test coverage
- Auto-migration support
- Schema validation framework
- All backends community-tested
Contributing
We welcome contributions, especially:
- Testing untested configurations (PostgreSQL, OpenAI, Neo4j)
- Performance benchmarking on different scales
- Documentation improvements
- Bug reports and fixes
- Feature requests
See Known Issues for areas needing community testing and validation.
License
MIT License
Copyright (c) 2024-2025 DDSE Foundation
See LICENSE file for details.
Acknowledgments
- DDSE Foundation - https://ddse-foundation.github.io/
- Author - Mahmudur R Manna (mrmanna), Founder and Principal Architect
- Built with Spring Boot, Spring AI, JGraphT, DuckDB, and pgvector
- Inspired by Hibernate/JPA ORM patterns
Support
- Documentation: User Guide | Architecture
- Issues: GitHub Issues (repository link TBD)
- Community: DDSE Foundation website
- Email: Contact through DDSE Foundation
Thank you for trying CEF beta-0.5!
We appreciate your feedback as we work toward v1.0. Please report any issues or share your success stories with the community.
DDSE Foundation - Decision-Driven Software Engineering
https://ddse-foundation.github.io/