Known Issues

Version: beta-0.5
Last Updated: November 27, 2025

Testing Status

✅ Tested Configurations

The following configurations have been thoroughly tested and validated:

Database: DuckDB (embedded)
LLM Provider: vLLM with Qwen3-Coder-30B-A3B-Instruct-FP8
Embedding Model: Ollama with nomic-embed-text (768 dimensions)
Graph Store: JGraphT (in-memory)
Operating Systems: Linux

⚠️ Untested Configurations

The following configurations are implemented and available but not yet tested in production:

Databases

PostgreSQL - Configuration available, schema provided, but not fully tested
- R2DBC reactive driver configured
- pgvector extension support included
- Migration scripts available
- Status: Needs integration testing

LLM Providers

OpenAI - Client factory implemented, configuration available
- GPT-4, GPT-3.5 Turbo support
- Status: Needs API key testing
Ollama (LLM) - Configuration available for Llama models
- Llama 3.2, Llama 3.1 support
- Status: Needs comprehensive testing

Vector Stores

Qdrant - Interface implemented, configuration available
- Status: Needs deployment and integration testing
Pinecone - Interface implemented, configuration available
- Status: Needs API key and integration testing

Graph Stores

Neo4j - Interface defined, configuration available
- Cypher query support planned
- Status: Requires Neo4j instance and integration testing

Known Issues

1. Memory Management (JGraphT)

Issue: In-memory graph store (JGraphT) may experience memory pressure with large knowledge bases.

Details:

Recommended maximum: 100,000 nodes
Memory usage: ~350MB for 100K nodes, 500K edges
No automatic pagination or lazy loading

Workaround:

Set an explicit maximum heap size with -Xmx and monitor heap usage
Consider Neo4j for graphs >100K nodes
Implement periodic graph pruning for time-bound data

Status: By design - JGraphT is optimized for fast in-memory operations
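
One way to act on the heap-monitoring workaround is to check heap usage from inside the application before loading more graph data. The sketch below uses only the JDK's MemoryMXBean; the class name HeapWatch and the 0.8 threshold mentioned afterwards are illustrative assumptions, not part of the framework.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of the framework): reports how much of the
// maximum heap (-Xmx) is currently in use, so callers can prune or stop
// loading graph data before the JVM runs out of memory.
final class HeapWatch {

    // Fraction of the maximum heap in use, in the range 0.0 to 1.0.
    // Returns -1.0 if the JVM does not report a maximum heap size.
    static double usedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        if (max <= 0) {
            return -1.0;
        }
        return (double) heap.getUsed() / max;
    }

    // True when heap usage is known and at or above the threshold (e.g. 0.8).
    static boolean underPressure(double threshold) {
        double used = usedFraction();
        return used >= 0 && used >= threshold;
    }
}
```

A loader could call HeapWatch.underPressure(0.8) between batches and pause or prune when it returns true. For scale, the figures above work out to roughly 3.5 KB per node including its share of edges (350 MB / 100,000 nodes).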

2. Embedding Generation Performance

Issue: Batch embedding generation can be slow for large document sets.

Details:

Single-threaded embedding calls to Ollama
~200ms per embedding with nomic-embed-text
Initial index of 10,000 chunks takes ~30-40 minutes

Workaround:

Use batch indexing during off-peak hours
Consider pre-generating embeddings offline
Increase Ollama concurrency settings

Status: Planned for optimization in v0.6
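
The figures above are consistent with each other: 10,000 chunks × 200 ms = 2,000 s, or about 33 minutes when calls run one at a time. As a sketch of how client-side parallelism could shorten this, the hypothetical helper below fans calls out over a fixed thread pool. The class ParallelEmbedder and the embedFn parameter are assumptions standing in for whatever code calls the embedding model; they are not framework APIs, and the speedup only materializes if the Ollama server actually processes requests concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper (not the framework's indexer): runs an embedding
// function over many chunks with a bounded number of worker threads.
final class ParallelEmbedder {

    // Best-case estimate in minutes: total sequential time divided by workers.
    static double estimateMinutes(int chunks, long perCallMillis, int workers) {
        return (chunks * (double) perCallMillis) / workers / 60_000.0;
    }

    // Embeds every chunk using `workers` threads; results keep input order.
    // `embedFn` stands in for whatever client call produces one embedding.
    static List<float[]> embedAll(List<String> chunks,
                                  Function<String, float[]> embedFn,
                                  int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<float[]>> futures = new ArrayList<>();
            for (String chunk : chunks) {
                futures.add(pool.submit(() -> embedFn.apply(chunk)));
            }
            List<float[]> out = new ArrayList<>(chunks.size());
            for (Future<float[]> future : futures) {
                try {
                    out.add(future.get());
                } catch (Exception e) {
                    throw new IllegalStateException("Embedding failed", e);
                }
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }
}
```

Under these assumptions, estimateMinutes(10_000, 200, 4) gives about 8.3 minutes, which is a best-case figure rather than a measured result.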

3. PostgreSQL Schema Auto-Creation

Issue: Unlike JPA with schema generation enabled, R2DBC does not create tables automatically on startup.

Details:

Schema files provided in src/main/resources/
Requires manual execution or Flyway/Liquibase integration

Workaround:

Run schema-postgresql.sql manually before first startup
Or configure Flyway migration (example in docs)

Status: Documentation updated, auto-migration planned for v1.0

4. Vector Search with DuckDB

Issue: DuckDB vector similarity search uses brute-force comparison (no HNSW index).

Details:

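Query time grows linearly with the number of chunks; performance degrades noticeably with >50,000 chunks
No approximate nearest neighbor (ANN) index is used; the HNSW index offered by DuckDB's VSS extension is experimental and is not used here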

Workaround:

Use PostgreSQL with pgvector for large-scale deployments
Implement result caching for frequent queries

Status: Limitation of the current DuckDB integration - consider PostgreSQL for production
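
Result caching for frequent queries could look like the following minimal sketch, which builds a small LRU cache on LinkedHashMap. SearchResultCache and its method names are hypothetical; the key would typically combine the query text and topK, and the cache would need clearing whenever chunks are re-indexed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical LRU cache for search results (not part of the framework).
final class SearchResultCache<K, V> {

    private final Map<K, V> entries;

    SearchResultCache(int maxEntries) {
        // accessOrder=true keeps the least recently used entry first, so
        // removeEldestEntry evicts the entry that was used longest ago.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached value, or runs `search` once and caches its result.
    synchronized V getOrCompute(K key, Function<K, V> search) {
        V cached = entries.get(key);
        if (cached == null) {
            cached = search.apply(key);
            entries.put(key, cached);
        }
        return cached;
    }

    // Number of entries currently held.
    synchronized int size() {
        return entries.size();
    }
}
```

A cache like this only helps when the same query recurs; it does not change the cost of a first-time brute-force scan.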

5. Concurrent Indexing

Issue: Multiple concurrent indexing operations may cause race conditions in graph updates.

Details:

JGraphT in-memory store is not thread-safe by default
Adding edges while nodes are being updated concurrently may leave the graph in an inconsistent state

Workaround:

Use sequential indexing for initial data load
Implement application-level locking for concurrent updates

Status: Thread-safe wrapper planned for v0.6
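
Application-level locking could be sketched as follows. To keep the example self-contained, a plain adjacency map stands in for the JGraphT graph; LockedGraph and its methods are hypothetical names, and a real implementation would guard the framework's graph store in the same way.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper (not the framework's graph store): every mutation
// takes the write lock, every read takes the read lock.
final class LockedGraph {

    private final Map<String, Set<String>> adjacency = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void addNode(String id) {
        lock.writeLock().lock();
        try {
            adjacency.computeIfAbsent(id, k -> new HashSet<>());
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Adds both endpoints and the edge as one atomic step, so no other
    // thread can observe an edge whose nodes are missing.
    void addEdge(String from, String to) {
        lock.writeLock().lock();
        try {
            adjacency.computeIfAbsent(from, k -> new HashSet<>()).add(to);
            adjacency.computeIfAbsent(to, k -> new HashSet<>());
        } finally {
            lock.writeLock().unlock();
        }
    }

    int nodeCount() {
        lock.readLock().lock();
        try {
            return adjacency.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    int edgeCount() {
        lock.readLock().lock();
        try {
            return adjacency.values().stream().mapToInt(Set::size).sum();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A read-write lock lets queries proceed in parallel while indexing is idle, at the cost of blocking all readers during each write.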

6. Context Window Token Limits

Issue: Large relationship subgraphs may exceed LLM context windows.

Details:

No automatic context pruning or summarization
Deep graph traversals (depth >3) can generate excessive context

Workaround:

Limit traversal depth to 2-3 hops
Use topK parameter to control result size
Implement custom context filtering in application layer

Status: Intelligent context truncation planned for v0.7
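
Custom context filtering in the application layer could start from a traversal that is bounded in both depth and size. The sketch below is a plain breadth-first walk over an adjacency map; BoundedTraversal, maxDepth, and maxNodes are hypothetical names chosen for illustration and are separate from the framework's own topK parameter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of the framework): collects a bounded
// neighbourhood so the resulting context stays within a predictable size.
final class BoundedTraversal {

    // Breadth-first walk from `start`, visiting nodes at most `maxDepth` hops
    // away and returning at most `maxNodes` node ids (start included).
    static List<String> collect(Map<String, List<String>> adjacency,
                                String start, int maxDepth, int maxNodes) {
        List<String> result = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        seen.add(start);
        int depth = 0;
        while (!frontier.isEmpty() && depth <= maxDepth) {
            int levelSize = frontier.size();
            for (int i = 0; i < levelSize; i++) {
                String node = frontier.poll();
                if (result.size() >= maxNodes) {
                    return result; // size cap reached, stop early
                }
                result.add(node);
                for (String next : adjacency.getOrDefault(node, List.of())) {
                    if (seen.add(next)) {
                        frontier.add(next);
                    }
                }
            }
            depth++;
        }
        return result;
    }
}
```

Because the walk is breadth-first, the size cap drops the most distant nodes first, which is usually the desired order when trimming context.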

7. Relationship Type Validation

Issue: No runtime validation that edges match defined RelationType schemas.

Details:

Framework accepts any relation type string
Semantic hints (HIERARCHY, CAUSALITY) are advisory only

Workaround:

Implement application-level validation
Use strict entity builders with type checking

Status: Schema validation framework planned for v0.8
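
Application-level validation could be as simple as checking each edge against a declared set of allowed relation types and endpoint types before it is written. The sketch below is a minimal illustration; RelationValidator and the example types (TREATS, Doctor, Patient) are hypothetical and do not reflect the framework's RelationType API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical validator (not part of the framework): rejects edges whose
// relation type or endpoint types were never declared.
final class RelationValidator {

    // relation type -> allowed "SourceType->TargetType" pairs
    private final Map<String, Set<String>> allowedPairs = new HashMap<>();

    // Declares that `relationType` may connect sourceType to targetType.
    RelationValidator allow(String relationType, String sourceType, String targetType) {
        allowedPairs.computeIfAbsent(relationType, k -> new HashSet<>())
                    .add(sourceType + "->" + targetType);
        return this;
    }

    boolean isValid(String relationType, String sourceType, String targetType) {
        Set<String> pairs = allowedPairs.get(relationType);
        return pairs != null && pairs.contains(sourceType + "->" + targetType);
    }

    // Throws if the edge does not match any declared rule.
    void requireValid(String relationType, String sourceType, String targetType) {
        if (!isValid(relationType, sourceType, targetType)) {
            throw new IllegalArgumentException(
                "Relation " + relationType + " not allowed from "
                + sourceType + " to " + targetType);
        }
    }
}
```

Calling requireValid in the code path that creates edges turns a silently accepted typo in a relation type into an immediate error.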

8. Docker Compose Networking

Issue: Ollama container may not be accessible from host on some Docker Desktop versions.

Details:

The Compose service name ollama (as in ollama:11434) resolves only inside the Docker network, so lookups from the host machine fail
WSL2 networking issues on Windows

Workaround:

Use localhost:11434 instead of ollama:11434 in application.yml
Or add 127.0.0.1 ollama to /etc/hosts

Status: Documentation updated with troubleshooting steps

9. Query Performance Monitoring

Issue: Limited built-in query performance metrics.

Details:

No automatic slow query logging
Graph traversal metrics not exposed via actuator

Workaround:

Enable DEBUG logging for org.ddse.ml.cef
Use Spring Boot Actuator with custom metrics

Status: Enhanced observability planned for v0.6
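
Until slow query logging is built in, calls can be timed in application code. The sketch below wraps a query in a timer and logs a warning when it exceeds a threshold; SlowQueryTimer and its methods are hypothetical names, and it uses java.util.logging only to stay dependency-free.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical helper (not part of the framework): measures how long a
// query takes and records those at or above a threshold.
final class SlowQueryTimer {

    private static final Logger LOG = Logger.getLogger(SlowQueryTimer.class.getName());

    private final long thresholdMillis;
    private final AtomicLong slowCount = new AtomicLong();

    SlowQueryTimer(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Runs the query, returns its result, and logs it if it was slow.
    <T> T time(String label, Supplier<T> query) {
        long start = System.nanoTime();
        try {
            return query.get();
        } finally {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            if (elapsedMs >= thresholdMillis) {
                slowCount.incrementAndGet();
                LOG.warning("Slow query '" + label + "' took " + elapsedMs + " ms");
            }
        }
    }

    // Number of calls that met or exceeded the threshold so far.
    long slowCount() {
        return slowCount.get();
    }
}
```

In a Spring Boot application, the same measurement could instead be recorded with a Micrometer timer so that it appears alongside the other Actuator metrics.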

10. Multi-tenancy Support

Issue: No built-in tenant isolation for knowledge graphs.

Details:

Single shared graph store per application instance
No tenant-specific caching or partitioning

Workaround:

Deploy separate application instances per tenant
Implement application-level filtering with tenant_id in node properties

Status: Multi-tenancy patterns documented in USER_GUIDE.md
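
Application-level filtering on tenant_id might look like the minimal sketch below. Nodes are modelled as plain property maps purely for illustration; TenantFilter is a hypothetical helper, not the framework's API, and the real node type will differ.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical helper (not part of the framework): keeps only the nodes
// whose tenant_id property matches the caller's tenant.
final class TenantFilter {

    static final String TENANT_KEY = "tenant_id";

    // Nodes without a tenant_id are excluded, so untagged data is never
    // returned to any tenant by accident.
    static List<Map<String, Object>> forTenant(List<Map<String, Object>> nodes,
                                               String tenantId) {
        Objects.requireNonNull(tenantId, "tenantId");
        return nodes.stream()
                .filter(node -> tenantId.equals(node.get(TENANT_KEY)))
                .collect(Collectors.toList());
    }
}
```

Filtering after retrieval is the weakest form of isolation, since all tenants still share one graph and one cache; separate instances per tenant remain the stricter option.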

Reporting Issues

If you encounter issues not listed here:

Check the User Guide Troubleshooting section
Review Architecture for design constraints
Enable DEBUG logging: logging.level.org.ddse.ml.cef=DEBUG
Report via GitHub Issues with:
- Configuration details (database, LLM provider, models)
- Error logs and stack traces
- Steps to reproduce
- Expected vs actual behavior

Testing Contributions Welcome

We welcome community testing and feedback on untested configurations:

PostgreSQL integration - Production deployment reports
OpenAI provider - API compatibility testing
Neo4j graph store - Large-scale graph performance
Qdrant/Pinecone - Vector store benchmarking
Windows/macOS - Platform compatibility

Contact DDSE Foundation at https://ddse-foundation.github.io/ for contribution guidelines.

DDSE Foundation - https://ddse-foundation.github.io/
