A JavaScript library for Retrieval-Augmented Generation (RAG) within the QVAC ecosystem. Build powerful, context-aware AI applications with seamless document ingestion, vector search, and LLM integration.
npm install @qvac/rag
Each pluggable adapter has specific dependency requirements. Choose the adapters you need and install their dependencies:
HyperDBAdapter - Decentralized vector database
npm install corestore hyperdb hyperschema
BaseDBAdapter - Custom database interface
# No dependencies - implement your own database logic
QvacLlmAdapter - QVAC runtime models
npm install @qvac/llm-llamacpp
# Option 1: Directly through the addon (you will need local model files)
# No additional dependencies. See example in `examples/direct-rag.js`
# Option 2: Through runtime manager. See example in `examples/quickstart.js`
npm install @qvac/rt @qvac/router-inference @qvac/manager-inference
HttpLlmAdapter - HTTP API integration (OpenAI, Anthropic, etc.)
npm install bare-fetch
BaseLlmAdapter - Custom LLM interface
# No dependencies - implement your own LLM logic
QVAC Embedding Addon - Local model inference
npm install @qvac/embed-llamacpp
# Option 1: Directly through the addon (you will need local model files)
# No additional dependencies. See example in `examples/direct-rag.js`
# Option 2: Through runtime manager. See example in `examples/quickstart.js`
npm install @qvac/rt @qvac/router-inference @qvac/manager-inference
Custom Embedding Functions - Any service you prefer
# No dependencies - implement your own embedding logic and plug it in
LLMChunkAdapter - Intelligent text chunking
# Required
npm install llm-splitter
BaseChunkAdapter - Custom chunking interface
# No dependencies - implement your own chunking logic
Full-featured setup (default adapters with all features):
npm install @qvac/rag
# Database: HyperDBAdapter
npm install corestore hyperdb hyperschema
# LLM: QvacLlmAdapter
npm install @qvac/rt @qvac/router-inference @qvac/manager-inference @qvac/llm-llamacpp
# Embedding: QVAC Embedding Addon
npm install @qvac/embed-llamacpp
# Chunking: LLMChunkAdapter
npm install llm-splitter
Lightweight HTTP setup (cloud LLMs, minimal dependencies):
npm install @qvac/rag
# Database: HyperDBAdapter (still need vector storage)
npm install corestore hyperdb hyperschema
# LLM: HttpLlmAdapter for OpenAI/Anthropic
npm install bare-fetch
# Chunking: LLMChunkAdapter (basic word tokenization)
npm install llm-splitter
Custom implementation (bring your own adapters):
npm install @qvac/rag
# No additional dependencies - use your custom BaseDBAdapter, BaseLlmAdapter, BaseChunkAdapter
Installation Strategy:
Core dependencies are always installed (@qvac/error, ready-resource, uuid-random)
Adapter dependencies are listed as devDependencies for seamless testing
Use npm install --omit=dev to exclude testing dependencies
Performance Benefits: Production deployments get minimal bundle sizes while development and testing have full functionality. Dependencies are only loaded at runtime when specific adapters are used.
The library follows a modular architecture:
RAG (Orchestrator)
├── Core Services
│ ├── ChunkingService - Text segmentation and tokenization
│ └── EmbeddingService - Vector generation and processing
└── Business Services
├── IngestionService - Document ingestion workflow
└── RetrievalService - Context retrieval workflow
Adapters (Plugin System)
├── Database Adapters
│ ├── HyperDBAdapter - HyperDB implementation
│ └── BaseDBAdapter - Custom database interface
├── LLM Adapters
│ ├── QvacLlmAdapter - QVAC runtime models
│ ├── HttpLlmAdapter - HTTP API integration
│ └── BaseLlmAdapter - Custom LLM interface
└── Chunking Adapters
├── LLMChunkAdapter - Intelligent text chunking
└── BaseChunkAdapter - Custom chunking interface
new RAG({
llm: BaseLlmAdapter, // Optional: LLM adapter (required for inference)
embeddingFunction: EmbeddingFunction, // Required: embedding function
dbAdapter: BaseDBAdapter, // Required: Database adapter
chunker: BaseChunkAdapter, // Optional: Custom chunker
chunkOpts: ChunkOpts, // Optional: Chunking configuration
});
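For example, a minimal sketch wiring the constructor together (the embedding function and store path below are illustrative placeholders, not part of the library):
const Corestore = require("corestore");
const { RAG, HyperDBAdapter } = require("@qvac/rag");
// Illustrative embedding function; plug in any service you prefer
const embeddingFunction = async (text) => fetchEmbeddingSomehow(text);
const store = new Corestore("./my-rag-data");
const rag = new RAG({
  embeddingFunction,
  dbAdapter: new HyperDBAdapter({ store }),
  chunkOpts: { chunkSize: 256, chunkOverlap: 50 },
});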
The default database adapter requires a Corestore instance for persistent storage:
const Corestore = require("corestore");
const { HyperDBAdapter } = require("@qvac/rag");
// Create a Corestore instance with persistent storage
const store = new Corestore("./my-rag-data");
// Create database adapter with store
const dbAdapter = new HyperDBAdapter({ store });
// Alternative: Use external HyperDB instance
const HyperDB = require("hyperdb");
const dbSpec = require("./path/to/your/db-spec");
const hypercore = store.get({ name: "my-db" });
const db = HyperDB.bee(hypercore, dbSpec);
const dbAdapter = new HyperDBAdapter({ db });
Configuration Options:
store: Corestore instance (required when not providing db)
db: External HyperDB instance (optional)
dbName: Name for the hypercore (default: 'rag-vector-store')
documentsTable, vectorsTable, etc.: Configurable table names
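For example, a sketch overriding the defaults (the names below are illustrative):
const dbAdapter = new HyperDBAdapter({
  store,
  dbName: "my-knowledge-base",
  documentsTable: "kb-documents",
  vectorsTable: "kb-vectors",
});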
generateEmbeddings(text)
Generate embeddings for a single text.
await rag.generateEmbeddings(text: string): Promise<number[]>
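For example:
const vector = await rag.generateEmbeddings("What is retrieval-augmented generation?");
console.log(vector.length); // embedding dimensionality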
generateEmbeddingsForDocs(docs, opts?)
Generate embeddings for a set of documents.
await rag.generateEmbeddingsForDocs(
docs: string | string[],
opts?: {
chunk?: boolean,
chunkOpts?: BaseChunkOpts,
signal?: AbortSignal
}
): Promise<{ [key: string]: number[] }>
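A quick sketch (document texts are illustrative):
const embeddings = await rag.generateEmbeddingsForDocs(
  ["First document text...", "Second document text..."],
  { chunk: true }
);
// Each key maps to the embedding vector for a document (or chunk)
console.log(Object.keys(embeddings).length);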
chunk(input, chunkOpts?)
Split text into chunks using the configured chunking options.
await rag.chunk(
input: string | string[],
chunkOpts?: BaseChunkOpts // Override default chunking options
): Promise<Doc[]>
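For example, overriding the defaults for one call (longArticleText and the override values are illustrative):
const chunks = await rag.chunk(longArticleText, {
  chunkSize: 128,
  chunkOverlap: 20,
});
console.log(`${chunks.length} chunks produced`);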
ingest(docs, opts?)
Full pipeline: chunk, embed, and save documents to the vector database.
await rag.ingest(
docs: string | string[],
opts?: {
chunk?: boolean, // Default: true
chunkOpts?: BaseChunkOpts,
dbOpts?: DbOpts,
onProgress?: (stage, current, total) => void, // Stage-aware progress
progressInterval?: number, // Report every N docs (default: 10)
signal?: AbortSignal // Cancellation support
}
): Promise<{
processed: SaveEmbeddingsResult[],
droppedIndices: number[]
}>
Progress Stages:
chunking - Document chunking phase
embedding - Embedding generation phase
saving:deduplicating - Checking for duplicates
saving:preparing - Computing hashes/centroids
saving:writing - Writing to database
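For example, a sketch that logs each reported stage (docs is an illustrative array of strings):
const { processed, droppedIndices } = await rag.ingest(docs, {
  onProgress: (stage, current, total) => {
    console.log(`[${stage}] ${current}/${total}`);
  },
  progressInterval: 25, // report every 25 docs instead of the default 10
});
console.log(`Saved ${processed.length}, dropped ${droppedIndices.length}`);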
saveEmbeddings(embeddedDocs, opts?)
Save embedded documents directly to the vector database. Documents must have id, content, and embedding fields.
await rag.saveEmbeddings(
embeddedDocs: EmbeddedDoc[],
opts?: SaveEmbeddingsOpts
): Promise<SaveEmbeddingsResult[]>
Options:
dbOpts - Database adapter options
onProgress(current, total) - Progress callback
signal - AbortSignal for cancellation
search(query, params?)
Search for documents based on semantic similarity.
await rag.search(
query: string,
params?: {
topK?: number, // Number of results (default: 5)
n?: number, // Centroids to search (default: 3)
signal?: AbortSignal
}
): Promise<SearchResult[]>
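For example (the query is illustrative; the exact SearchResult shape depends on the database adapter):
const results = await rag.search("How does chunking work?", { topK: 3 });
results.forEach((result) => console.log(result));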
infer(query, opts?)
Generate AI responses using retrieved context.
await rag.infer(
query: string,
opts?: {
topK?: number, // Context docs to retrieve
n?: number, // Centroids to search
llmAdapter?: BaseLlmAdapter, // Override default LLM
signal?: AbortSignal
}
): Promise<any> // Format depends on LLM adapter
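For example (the query is illustrative; the response format depends on the configured LLM adapter):
const response = await rag.infer("Summarize what the documents say about chunking", {
  topK: 5,
});
console.log(response);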
reindex(opts?)
Optimize database index structure to improve search quality. Implementation depends on the database adapter (e.g., HyperDBAdapter uses k-means centroid rebalancing).
await rag.reindex(
opts?: {
onProgress?: (stage, current, total) => void,
signal?: AbortSignal
}
): Promise<{
reindexed: boolean,
details?: Record<string, any> // Adapter-specific details
}>
Note: Progress stages and details vary by adapter. HyperDBAdapter reports: collecting, clustering, reassigning, updating.
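For example, a sketch that reports progress during reindexing:
const { reindexed, details } = await rag.reindex({
  onProgress: (stage, current, total) => {
    console.log(`[reindex:${stage}] ${current}/${total}`);
  },
});
if (reindexed) console.log("Index optimized:", details);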
deleteEmbeddings(ids)
Delete embeddings for documents from the vector database.
await rag.deleteEmbeddings(ids: string[]): Promise<boolean>
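For example (document ids are illustrative):
const deleted = await rag.deleteEmbeddings(["doc-1", "doc-2"]);
console.log(deleted); // boolean result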
setLlm(llmAdapter)
Set the default LLM adapter for the RAG instance.
rag.setLlm(llmAdapter: BaseLlmAdapter): void
The LLMChunkAdapter provides token-aware chunking with flexible splitting and grouping strategies.
{
chunkSize: 256, // Max tokens per chunk
chunkOverlap: 50, // Overlapping tokens
chunkStrategy: 'paragraph', // How chunks are grouped: 'character' | 'paragraph'
splitStrategy: 'token', // Built-in tokenizers: 'token' | 'word' | 'sentence' | 'line' | 'character'
splitter: (text) => string[] // Custom tokenizer (overrides splitStrategy)
}
Default: Token-based chunking
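For instance, constructing a chunker with these defaults made explicit (this assumes LLMChunkAdapter is exported from @qvac/rag alongside HyperDBAdapter):
const { LLMChunkAdapter } = require("@qvac/rag");
const chunker = new LLMChunkAdapter({
  chunkSize: 256,
  chunkOverlap: 50,
  chunkStrategy: "paragraph",
  splitStrategy: "token",
});
// Pass it to the orchestrator via the chunker option
const rag = new RAG({ /* ...other adapters... */ chunker });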
Use model-specific tokenizers for accurate chunk sizing:
// Install: npm install tiktoken
const tiktoken = require("tiktoken");
// Create a tiktoken-based splitter
const encoding = tiktoken.encoding_for_model("text-embedding-ada-002");
const decoder = new TextDecoder(); // reuse one decoder instead of one per token
const chunker = new LLMChunkAdapter({
  splitter: (text) => {
    // Encode to token ids, then decode each id back to its text piece.
    // Array.from is needed because encode() returns a typed array, whose
    // .map() cannot produce strings.
    const tokens = encoding.encode(text);
    return Array.from(tokens, (t) => decoder.decode(encoding.decode([t])));
  },
  chunkSize: 256,
});
// Free the encoding once you are done chunking (not before)
encoding.free();
Note: Custom splitters must preserve original text (no lowercasing/transformations).
Get started with these examples:
Complete RAG workflow with document ingestion, search, and inference:
bare examples/quickstart.js
Comparing different tokenizers and chunking approaches:
bare examples/chunking.js
To run the tests, use the following commands:
# Unit tests
npm run test:unit
# Integration tests
npm run test:integration
# All tests
npm test
Important: Before running the integration tests, make sure you have installed the required libraries specified in the integration tests.
This project is licensed under the Apache-2.0 License – see the LICENSE file for details.
For any questions or issues, please open an issue on the GitHub repository.