$ npm install @qvac/bci-whispercpp
Brain-Computer Interface (BCI) neural signal transcription addon for qvac, powered by the tetherto/qvac-ext-lib-whisper.cpp fork of whisper.cpp.
Transcribes multi-channel neural signals (e.g., 512-channel microelectrode array recordings) into text using a BCI-trained whisper model running natively via GGML. Output matches the Python BrainWhisperer reference model exactly.
Neural Signal (512ch, 20ms bins)
│
▼
┌──────────────────────────────┐
│ NeuralProcessor (C++) │
│ - Gaussian smoothing │ std=2, kernel=100
│ - Day-specific projection │ low-rank (A·B) + month + softsign
│ - Pad to 3000 frames │ mel-major layout for whisper.cpp
└──────────────┬───────────────┘
│ mel features (512 × 3000)
▼
┌──────────────────────────────┐
│ whisper.cpp (patched) │
│ - conv1 (k=7, 512→384) │ BCI-trained embedder weights
│ - conv2 (k=3, stride=2) │
│ - Positional encoding │ learned time PE + sinusoidal day PE
│ - 6-layer encoder │ windowed attention (w=57) on layers 0–3
│ - 4-layer decoder (LoRA) │ beam search, length_penalty=0.14
└──────────────┬───────────────┘
│
▼
Text output
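To make the first preprocessing stage concrete, here is a minimal sketch of Gaussian smoothing along the time axis with the parameters shown in the diagram (std=2, kernel=100). The real NeuralProcessor is C++; the function names, the kernel normalisation, and the zero-padded edge handling below are assumptions for illustration and may differ from the native code.

```javascript
// Sketch only: normalisation and edge handling are assumptions, not the addon's code.

// Build a Gaussian kernel with `size` taps and standard deviation `std`,
// normalised so the taps sum to 1.
function gaussianKernel (size = 100, std = 2) {
  const center = (size - 1) / 2
  const kernel = new Float64Array(size)
  let sum = 0
  for (let i = 0; i < size; i++) {
    const d = i - center
    kernel[i] = Math.exp(-(d * d) / (2 * std * std))
    sum += kernel[i]
  }
  for (let i = 0; i < size; i++) kernel[i] /= sum
  return kernel
}

// Smooth one channel along time. Samples outside the signal count as zero,
// and the output has the same length as the input.
function smoothChannel (signal, kernel) {
  const half = Math.floor(kernel.length / 2)
  const out = new Float64Array(signal.length)
  for (let t = 0; t < signal.length; t++) {
    let acc = 0
    for (let i = 0; i < kernel.length; i++) {
      const src = t + i - half
      if (src >= 0 && src < signal.length) acc += kernel[i] * signal[src]
    }
    out[t] = acc
  }
  return out
}

const kernel = gaussianKernel()
const flat = new Float64Array(300).fill(1) // one channel, 300 bins of constant activity
const smoothed = smoothChannel(flat, kernel)
console.log(smoothed[150].toFixed(6), smoothed[0].toFixed(6))
```

Away from the edges a constant signal passes through unchanged, while the first and last bins are attenuated by the zero padding.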
Native GGML inference matches the Python BrainWhisperer reference on all test samples:
| Sample | Ground Truth | GGML Native Output | WER |
|---|---|---|---|
| 0 | "You can see the code at this point as well." | "You can see the good at this point as well." | 10.0% |
| 1 | "How does it keep the cost down?" | "How does it keep the cost down?" | 0.0% |
| 2 | "Not too controversial." | "Not too controversial." | 0.0% |
| 3 | "The jury and a judge work together on it." | "The jury and a judge work together on it." | 0.0% |
| 4 | "Were quite vocal about it." | "We're quite vocal about it." | 20.0% |
| Average | | | 6.0% |
Binary files with the following layout:
| Offset | Type | Description |
|---|---|---|
| 0 | uint32 | Number of timesteps |
| 4 | uint32 | Number of channels |
| 8 | float32[] | Feature data (row-major: features[t * channels + c]) |
Each timestep represents a 20ms bin of neural activity. Channels correspond to individual electrodes in a microelectrode array (typically 512 channels).
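The layout above can be written and read with a DataView. The sketch below uses hypothetical helper names (encodeNeuralSignal, parseNeuralSignal) that are not part of the package, and it assumes the float32 values are little-endian like the header fields.

```javascript
// Hypothetical helpers for the documented layout; not part of the package API.
// Assumption: float32 feature values are stored little-endian.

// Encode a row-major [timesteps x channels] feature array into header + body bytes.
function encodeNeuralSignal (features, timesteps, channels) {
  if (features.length !== timesteps * channels) {
    throw new RangeError('features.length must equal timesteps * channels')
  }
  const bytes = new Uint8Array(8 + features.length * 4)
  const view = new DataView(bytes.buffer)
  view.setUint32(0, timesteps, true) // offset 0: number of timesteps
  view.setUint32(4, channels, true) // offset 4: number of channels
  for (let i = 0; i < features.length; i++) {
    view.setFloat32(8 + i * 4, features[i], true) // offset 8: feature data
  }
  return bytes
}

// Parse header + body bytes back into a feature array with an (t, c) accessor.
function parseNeuralSignal (bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  const timesteps = view.getUint32(0, true)
  const channels = view.getUint32(4, true)
  if (bytes.byteLength < 8 + timesteps * channels * 4) {
    throw new RangeError('body is shorter than the header declares')
  }
  const features = new Float32Array(timesteps * channels)
  for (let i = 0; i < features.length; i++) {
    features[i] = view.getFloat32(8 + i * 4, true)
  }
  // Row-major lookup: features[t * channels + c]
  const at = (t, c) => features[t * channels + c]
  return { timesteps, channels, features, at }
}

const encoded = encodeNeuralSignal([0, 1, 2, 3, 4, 5], 3, 2)
const decoded = parseNeuralSignal(encoded)
console.log(decoded.timesteps, decoded.channels, decoded.at(2, 1)) // 3 2 5
```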
cd packages/bci-whispercpp
npm install
VCPKG_ROOT=/path/to/vcpkg npm run build
Prerequisite: the VCPKG_ROOT environment variable must be set.
To run an example you need the BCI model files and (for batch mode) the test fixtures. The download script fetches the model files from the QVAC model registry (no GitHub CLI, no auth) and the neural-signal fixtures from the public release tarball (the fixtures aren't in the registry yet).
cd packages/bci-whispercpp
npm install # installs @qvac/registry-client (devDependency)
# Download model files (ggml-bci-windowed.bin + bci-embedder.bin) + test fixtures
npm run download-models
# node scripts/download-models.js --models # models only (from the registry)
# node scripts/download-models.js --fixtures # fixtures only
# node scripts/download-models.js --force # re-download even if present
The models land in models/ and the neural-signal fixtures in test/fixtures/. Then run an example:
# Transcribe all bundled fixture samples and print WER
bare examples/transcribe-neural.js --batch
# Transcribe a single neural signal file
bare examples/transcribe-neural.js test/fixtures/neural_sample_0.bin
# Streaming transcription over a sliding window
bare examples/transcribe-stream-neural.js test/fixtures/neural_sample_0.bin
By default the examples look for models/ggml-bci-windowed.bin (with bci-embedder.bin alongside it). Override with WHISPER_MODEL_PATH=/path/to/ggml-bci-windowed.bin or by passing the model path as the final argument.
The model files come from the qvac model registry (engine @qvac/bci-whispercpp, S3 source qvac_models_compiled/bci-whispercpp/...). If you already have them locally, skip the download step and point WHISPER_MODEL_PATH at your copy. The fixtures URL can be overridden with BCI_FIXTURES_URL.
Prerequisites: numpy, torch, and transformers (pip install numpy torch transformers).
Convert a trained BrainWhisperer checkpoint. This produces two files, both required for inference:
| File | Size | Description |
|---|---|---|
| ggml-bci-windowed.bin | ~84 MB | GGML model: whisper encoder/decoder (LoRA-merged), tokenizer, positional embedding, windowed attention header |
| bci-embedder.bin | ~24 MB | Day projection weights: low-rank A·B matrices per recording day, month projections, session-to-day mapping |
python3 scripts/convert-model.py \
--checkpoint /path/to/epoch=93-val_wer=0.0910.ckpt
Both files are written to models/ by default. All flags are optional:
| Flag | Default | Description |
|---|---|---|
| --output | models/ggml-bci-windowed.bin | GGML model output path |
| --embedder-output | models/bci-embedder.bin | Embedder weights output path |
| --day-idx | 1 | Day index for baked positional embedding |
| --window-size | 57 | Windowed attention size (0 to disable) |
| --last-window-layer | 3 | Last encoder layer with windowed attention |
| --f32 | off | Use f32 for all tensors (avoids f16 precision loss, ~2x larger) |
Important: By default both files must be in the same directory at runtime — the addon resolves bci-embedder.bin next to the GGML model file and will fail if it is missing. To store the embedder elsewhere, pass an explicit path via files.embedder (see Usage).
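The lookup rule described above can be sketched as follows. resolveEmbedderPath is a hypothetical helper, not part of the package, and it uses Node's POSIX path functions for illustration; the native addon's actual resolution logic is not shown in this document.

```javascript
const path = require('node:path').posix

// Hypothetical helper mirroring the documented rule: an explicit files.embedder
// wins, otherwise bci-embedder.bin is expected next to the GGML model file.
function resolveEmbedderPath (files) {
  if (files.embedder) return files.embedder
  return path.join(path.dirname(files.model), 'bci-embedder.bin')
}

const byDefault = resolveEmbedderPath({ model: './models/ggml-bci-windowed.bin' })
const overridden = resolveEmbedderPath({
  model: './models/ggml-bci-windowed.bin',
  embedder: './weights/bci-embedder.bin'
})
console.log(byDefault) // models/bci-embedder.bin
console.log(overridden) // ./weights/bci-embedder.bin
```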
The package's default export is the high-level BCIWhispercpp class. It owns model lifecycle, an inference queue, and a sliding-window streaming driver on top of the native addon.
const BCIWhispercpp = require('@qvac/bci-whispercpp')
const bci = new BCIWhispercpp({
files: { model: './models/ggml-bci-windowed.bin' },
opts: { stats: true } // optional — surfaces runtime stats on response.stats
}, {
whisperConfig: { language: 'en', temperature: 0.0 },
bciConfig: { day_idx: 1 }, // session day index for day-specific projection
miscConfig: { caption_enabled: false }
})
By default the companion bci-embedder.bin must sit next to files.model; the native addon resolves it relative to the model path and will fail to load otherwise. To keep the embedder in a different location, pass its path explicitly via files.embedder:
const bci = new BCIWhispercpp({
  files: {
    model: './models/ggml-bci-windowed.bin',
    embedder: './weights/bci-embedder.bin' // optional override
  }
})
await bci.load()
load() is idempotent — calling it again unloads the existing model and re-initialises with the current config. There is no progress callback today.
Use this when you have the full neural signal up-front. transcribe() accepts the raw bytes (header + body); transcribeFile() is a convenience wrapper that reads the file for you.
const fs = require('bare-fs')
const response = await bci.transcribeFile('./signal.bin')
// or: const response = await bci.transcribe(new Uint8Array(fs.readFileSync('./signal.bin')))
const segments = await response.await()
const text = segments.map(s => s.text).join('').trim()
console.log(text)
if (response.stats) console.log(response.stats) // when opts.stats: true
Concurrent calls are serialised — a second transcribe() waits for the first to settle.
transcribeStream() consumes a stream of bytes (async iterable, sync iterable, Uint8Array, or array of chunks) and decodes a sliding window over the body as data arrives. The first 8 bytes of the stream must be the standard [T u32 LE, C u32 LE] header (T is ignored in stream mode; C must be non-zero).
const response = await bci.transcribeStream(chunkIterable, {
windowTimesteps: 1500, // default
hopTimesteps: 500, // default — must be < windowTimesteps
emit: 'delta' // 'delta' (default) | 'full'
})
response.onUpdate(segments => {
// emit: 'delta' — newly-discovered tail segments since the last window.
// Each segment carries native fields (text, t0, t1, ...) plus
// windowStartTimestep so you can map back to the stream timeline.
// emit: 'full' — single { text } entry with the full running transcript.
for (const s of segments) process.stdout.write(s.text)
})
await response.await() // resolves when the stream ends and the final window decodes
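One way to produce a chunkIterable is a generator that yields the 8-byte header followed by slices of the float32 body. The sketch below is an assumption-laden example: streamHeader and toChunks are hypothetical names, the body is assumed to be little-endian float32 (the native byte order of Float32Array on common platforms), and chunks are aligned to whole timesteps even though this document does not say whether that is required.

```javascript
// Hypothetical helpers; not part of the package API.

// Build the stream header [T u32 LE, C u32 LE]. T is ignored in stream mode,
// so it defaults to 0 here; C must be non-zero.
function streamHeader (channels, timesteps = 0) {
  if (!Number.isInteger(channels) || channels <= 0) {
    throw new RangeError('channels must be a positive integer')
  }
  const header = new Uint8Array(8)
  const view = new DataView(header.buffer)
  view.setUint32(0, timesteps, true)
  view.setUint32(4, channels, true)
  return header
}

// Yield the header, then the body in chunks of `timestepsPerChunk` timesteps.
function * toChunks (channels, body, timestepsPerChunk = 50) {
  yield streamHeader(channels)
  const bytes = new Uint8Array(body.buffer, body.byteOffset, body.byteLength)
  const step = timestepsPerChunk * channels * 4 // 4 bytes per float32
  for (let off = 0; off < bytes.byteLength; off += step) {
    yield bytes.subarray(off, Math.min(off + step, bytes.byteLength))
  }
}

const body = new Float32Array(120 * 4) // 120 timesteps x 4 channels of zeros
const chunks = [...toChunks(4, body)]
console.log(chunks.map(c => c.byteLength)) // [ 8, 800, 800, 320 ]
```

A sync generator like this is one of the accepted input shapes listed above; an async generator reading from a socket or file would follow the same pattern.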
Streaming constraints:
| Option | Constraint |
|---|---|
| windowTimesteps | positive integer, ≤ 2900 (MAX_WINDOW_TIMESTEPS) |
| hopTimesteps | positive integer, < windowTimesteps |
| emit | 'delta' or 'full' |
Only one stream may be active at a time. response.stats is not populated for streams.
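The constraints above, and the way a window slides over the body, can be sketched in a few lines. validateStreamOpts and planWindows are hypothetical helpers; in particular, the schedule (full windows every hop, then one shorter flush window for any remaining tail) is an assumption about how a sliding-window driver could behave, not a description of the package's internals.

```javascript
const MAX_WINDOW_TIMESTEPS = 2900
const BIN_MS = 20 // each timestep is a 20 ms bin

// Check the documented constraints and fill in the documented defaults.
function validateStreamOpts ({ windowTimesteps = 1500, hopTimesteps = 500, emit = 'delta' } = {}) {
  if (!Number.isInteger(windowTimesteps) || windowTimesteps <= 0 ||
      windowTimesteps > MAX_WINDOW_TIMESTEPS) {
    throw new RangeError(`windowTimesteps must be an integer in 1..${MAX_WINDOW_TIMESTEPS}`)
  }
  if (!Number.isInteger(hopTimesteps) || hopTimesteps <= 0 || hopTimesteps >= windowTimesteps) {
    throw new RangeError('hopTimesteps must be a positive integer < windowTimesteps')
  }
  if (emit !== 'delta' && emit !== 'full') {
    throw new TypeError("emit must be 'delta' or 'full'")
  }
  return { windowTimesteps, hopTimesteps, emit }
}

// Assumed schedule: [start, end) ranges in timesteps for a body of known length.
function planWindows (totalTimesteps, opts) {
  const { windowTimesteps, hopTimesteps } = validateStreamOpts(opts)
  const windows = []
  let start = 0
  while (start + windowTimesteps <= totalTimesteps) {
    windows.push([start, start + windowTimesteps])
    start += hopTimesteps
  }
  const covered = windows.length ? windows[windows.length - 1][1] : 0
  if (covered < totalTimesteps) windows.push([start, totalTimesteps]) // flush the tail
  return windows
}

const plan = planWindows(2700)
console.log(plan) // [ [ 0, 1500 ], [ 500, 2000 ], [ 1000, 2500 ], [ 1500, 2700 ] ]
console.log(1500 * BIN_MS / 1000, 500 * BIN_MS / 1000) // 30 10 (seconds per window, per hop)
```

With the defaults, each window covers 30 s of signal and advances by 10 s.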
await bci.cancel() // abort an in-flight job or stream; instance remains usable
await bci.unload() // release native resources; bci.load() can be called again
await bci.destroy() // permanent — instance cannot be reused
The package re-exports computeWER(hypothesis, reference) for evaluation:
const { computeWER } = require('@qvac/bci-whispercpp')
const wer = computeWER('how does it keep the cost down', 'how does it keep the cost down?')
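For readers unfamiliar with the metric, word error rate is the word-level edit distance between hypothesis and reference divided by the number of reference words. The sketch below is an illustration, not the package's computeWER: the function name wordErrorRate is hypothetical, and its normalisation (lowercase, split on whitespace, punctuation kept) is an assumption chosen so that it reproduces the per-sample figures in the results table.

```javascript
// Illustrative word error rate; NOT the package's computeWER implementation.
// Assumed normalisation: lowercase, split on whitespace, punctuation kept.
function wordErrorRate (hypothesis, reference) {
  const hyp = hypothesis.toLowerCase().split(/\s+/).filter(Boolean)
  const ref = reference.toLowerCase().split(/\s+/).filter(Boolean)
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1

  // Levenshtein distance over words, keeping only the previous row.
  let prev = Array.from({ length: hyp.length + 1 }, (_, j) => j)
  for (let i = 1; i <= ref.length; i++) {
    const cur = [i]
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = prev[j - 1] + (ref[i - 1] === hyp[j - 1] ? 0 : 1)
      cur[j] = Math.min(substitution, prev[j] + 1, cur[j - 1] + 1)
    }
    prev = cur
  }
  return prev[hyp.length] / ref.length
}

const sample0 = wordErrorRate(
  'You can see the good at this point as well.',
  'You can see the code at this point as well.'
)
const sample4 = wordErrorRate("We're quite vocal about it.", 'Were quite vocal about it.')
console.log(sample0, sample4) // 0.1 0.2
```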
response.await() resolves to an array of segments; response.onUpdate(cb) receives the same shape per emission:
[
{ text: ' How does it keep the cost down?', t0: 0, t1: 280, /* ... */ }
]
In streaming delta mode each segment is annotated with windowStartTimestep. In full mode the array contains a single { text } entry.
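A small collector shows how delta emissions could be accumulated into a running transcript and mapped back to the stream timeline. createDeltaCollector is a hypothetical helper, the segment values are made up for the example, and the only conversion used is the documented 20 ms per timestep; the units of t0 and t1 are not specified here, so they are left untouched.

```javascript
const BIN_MS = 20 // each timestep is a 20 ms bin

// Hypothetical helper: accumulate 'delta' emissions from onUpdate callbacks.
function createDeltaCollector () {
  const parts = []
  return {
    parts,
    // Record each segment's text and where its window began, in milliseconds.
    push (segments) {
      for (const s of segments) {
        parts.push({ text: s.text, windowStartMs: s.windowStartTimestep * BIN_MS })
      }
    },
    // Join all text seen so far, as in the batch example.
    text () {
      return parts.map(p => p.text).join('').trim()
    }
  }
}

// Made-up emissions standing in for two onUpdate calls.
const collector = createDeltaCollector()
collector.push([{ text: ' How does it', t0: 0, t1: 120, windowStartTimestep: 0 }])
collector.push([{ text: ' keep the cost down?', t0: 20, t1: 160, windowStartTimestep: 500 }])
console.log(collector.text()) // How does it keep the cost down?
console.log(collector.parts[1].windowStartMs) // 10000
```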
| Script | Purpose |
|---|---|
| npm run test:unit | JS unit tests (brittle-bare test/unit/*.test.js); no model required |
| npm run test:integration | JS integration tests against the native addon; requires WHISPER_MODEL_PATH |
| npm run test:cpp | C++ unit tests (GoogleTest); bare-make rebuilds the addon with BUILD_TESTING=ON |
| npm run test:dts | Type-checks the published index.d.ts |
| npm test | Runs test:unit + test:integration |
# JS unit tests
npm run test:unit
# JS integration tests
WHISPER_MODEL_PATH=./models/ggml-bci-windowed.bin npm run test:integration
# C++ unit tests
VCPKG_ROOT=/path/to/vcpkg npm run test:cpp
# .d.ts typecheck
npm run test:dts
Integration tests require both ggml-bci-windowed.bin and bci-embedder.bin to be present in the same directory, plus the neural-signal fixtures. The quickest way to get them is npm run download-models (see Quickstart); alternatively produce the models yourself via Model Conversion.
BCIWhispercpp accepts two arguments:
new BCIWhispercpp(args, config)
| Field | Type | Description |
|---|---|---|
| files.model | string | Required. Path to BCI GGML model file. By default bci-embedder.bin must sit alongside it. |
| files.embedder | string | Optional explicit path to the embedder weights file. Overrides the default lookup of bci-embedder.bin next to files.model. |
| logger | object | Optional logger; wrapped in @qvac/logging. Defaults to a noop logger. |
| opts.stats | boolean | When true, runtime stats are surfaced on response.stats for batch jobs. Default false. |
The convenience defaults below are surfaced explicitly. Any other whisper_full_params key is forwarded untouched to whisper.cpp — see Advanced configuration.
| Parameter | Type | Default | Description |
|---|---|---|---|
| language | string | "en" | Language code |
| temperature | number | 0.0 | Sampling temperature |
| n_threads | number | 0 (auto) | Number of threads |
| Parameter | Type | Default | Description |
|---|---|---|---|
| day_idx | number | 0 | Session day index for the day-specific low-rank projection at runtime. Distinct from the conversion-time --day-idx flag, which bakes a positional embedding into ggml-bci-windowed.bin. |
These keys back the whisper_context. Changing any of them between jobs forces a full model reload (unload → re-init → warmup), which can take several seconds.
| Parameter | Type | Description |
|---|---|---|
| model | string | Optional override; usually set via args.files.model. |
| use_gpu | boolean | Enable GPU acceleration (Metal on macOS by default). |
| flash_attn | boolean | Enable flash attention. |
| gpu_device | number | Select a non-default GPU device. |
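Because a change to any of these keys triggers a reload, a caller may want to detect that before submitting the next job. The sketch below is a hypothetical caller-side check (changedContextKeys is not part of the package); it simply compares the four keys from the table between two plain config objects.

```javascript
// Keys from the table above; changing any of them forces a full model reload.
const CONTEXT_KEYS = ['model', 'use_gpu', 'flash_attn', 'gpu_device']

// Hypothetical helper: list the context keys that differ between two configs.
function changedContextKeys (prev = {}, next = {}) {
  return CONTEXT_KEYS.filter(key => prev[key] !== next[key])
}

const before = { use_gpu: true, flash_attn: false }
const after = { use_gpu: true, flash_attn: true }
const changed = changedContextKeys(before, after)
console.log(changed) // [ 'flash_attn' ]
if (changed.length > 0) {
  console.log('next job will reload the model; expect a delay of several seconds')
}
```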
| Parameter | Type | Default | Description |
|---|---|---|---|
| caption_enabled | boolean | false | Format segments with <\|start\|>..<\|end\|> markers. |
Streaming options (transcribeStream()):
| Parameter | Type | Default | Constraint | Description |
|---|---|---|---|---|
| windowTimesteps | number | 1500 | positive integer, ≤ 2900 (MAX_WINDOW_TIMESTEPS) | Decode window size in 20 ms timesteps. |
| hopTimesteps | number | 500 | positive integer, < windowTimesteps | How far the window advances between decodes (one third of the window by default, so consecutive windows overlap by ~67%). |
| emit | string | 'delta' | 'delta' or 'full' | 'delta' emits the newly-discovered tail per window with native segment fields plus windowStartTimestep. 'full' emits a single { text } entry with the running transcript. |
The encoder accepts up to ~3000 timesteps per forward pass; MAX_WINDOW_TIMESTEPS = 2900 keeps a safety margin so partial flush windows always fit.
whisperConfig is a thin pass-through to whisper.cpp's whisper_full_params. For the full surface (decoding strategy, beam search, VAD, suppression, callbacks, etc.) refer to the upstream reference:
- whisper_full_params in whisper.cpp
- @qvac/transcription-whispercpp for richer usage patterns (VAD, chunking, live streaming).
The BCI patches live in the tetherto/qvac-ext-lib-whisper.cpp fork (v1.8.4.2) and are consumed via the qvac-registry-vcpkg port:
| Feature | Description |
|---|---|
| Variable conv1 kernel | Read n_audio_conv1_kernel from model header (k=7 for 512ch BCI vs k=3 for audio) |
| Windowed attention | Attention mask with configurable window size/layer params in header |
| BCI SOS tokens | BCI-specific start-of-sequence token handling |
| Graph placement fix | Correct encoder-graph mask population for the encoder graph refactor |
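To illustrate what a windowed attention mask looks like in general, the sketch below builds a symmetric band mask. This is a generic illustration and not the fork's implementation: the function names are hypothetical, and the interpretation of the window size (position i may attend to j when |i - j| ≤ floor(w / 2)) is an assumption; the patched whisper.cpp may define the window differently.

```javascript
// Generic band mask; the exact window semantics in the fork are an assumption here.
// mask[i][j] is true when position i may attend to position j.
function windowedAttentionMask (nPositions, windowSize) {
  const half = Math.floor(windowSize / 2)
  const mask = []
  for (let i = 0; i < nPositions; i++) {
    const row = new Array(nPositions)
    for (let j = 0; j < nPositions; j++) {
      // A window size of 0 disables windowing (full attention).
      row[j] = windowSize === 0 || Math.abs(i - j) <= half
    }
    mask.push(row)
  }
  return mask
}

// Windowing applies to encoder layers 0..lastWindowLayer (3 by default).
function layerUsesWindow (layer, lastWindowLayer = 3) {
  return layer <= lastWindowLayer
}

const small = windowedAttentionMask(5, 3)
console.log(small[0]) // [ true, true, false, false, false ]
const visible = windowedAttentionMask(200, 57)[100].filter(Boolean).length
console.log(visible) // 57
console.log([0, 3, 4, 5].map(l => layerUsesWindow(l))) // [ true, true, false, false ]
```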
All errors thrown by this package are QvacErrorAddonBCI instances (extending QvacErrorBase from @qvac/error) and use codes in the range 26001–27000.
| Code | Name | When |
|---|---|---|
| 26001 | FAILED_TO_LOAD_WEIGHTS | Native addon failed to load the GGML model |
| 26002 | FAILED_TO_CANCEL | cancel() failed at the addon layer |
| 26003 | FAILED_TO_APPEND | Append to processing queue failed |
| 26004 | FAILED_TO_DESTROY | destroy() failed at the addon layer |
| 26005 | FAILED_TO_ACTIVATE | addon.activate() failed during load() |
| 26006 | INVALID_NEURAL_INPUT | Batch input rejected by the addon |
| 26007 | JOB_ALREADY_RUNNING | transcribe() called while a job is in flight |
| 26008 | MODEL_NOT_LOADED | Inference called before load() or after destroy() |
| 26009 | MODEL_FILE_NOT_FOUND | files.model missing/unreadable, or files.embedder provided but missing/invalid |
| 26010 | BUFFER_LIMIT_EXCEEDED | Neural signal buffer exceeded the addon limit |
| 26011 | FAILED_TO_START_JOB | Addon refused to start the job |
| 26012 | INVALID_CONFIG | Constructor / context configuration rejected |
| 26013 | EMBEDDER_WEIGHTS_INVALID | bci-embedder.bin failed validation |
| 26014 | STREAM_ALREADY_ACTIVE | transcribeStream() called while one is already active |
| 26015 | INVALID_STREAM_INPUT | Bad stream input type or streamOpts |
| 26016 | INVALID_STREAM_HEADER | Stream [T u32, C u32] header malformed (e.g. C == 0) |
| 26017 | WINDOW_TOO_LARGE | windowTimesteps exceeds MAX_WINDOW_TIMESTEPS (2900) |
Codes are also re-exported via require('@qvac/bci-whispercpp/lib/error').ERR_CODES for programmatic matching.
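Programmatic matching might look like the sketch below. To stay self-contained it copies a few codes from the table into a local object instead of requiring the package, and it assumes that thrown errors expose a numeric code property; that property name, along with the helper names and the category labels, is an assumption for illustration.

```javascript
// Subset of codes copied from the table above (in real code, use the re-exported ERR_CODES).
const ERR_CODES = {
  JOB_ALREADY_RUNNING: 26007,
  MODEL_NOT_LOADED: 26008,
  STREAM_ALREADY_ACTIVE: 26014,
  WINDOW_TOO_LARGE: 26017
}

// Assumption: errors carry a numeric `code`; the documented range is 26001-27000.
function isBciError (err) {
  return Number.isInteger(err?.code) && err.code >= 26001 && err.code <= 27000
}

// Hypothetical classification a caller might use to decide how to react.
function classifyError (err) {
  if (!isBciError(err)) return 'unknown'
  switch (err.code) {
    case ERR_CODES.JOB_ALREADY_RUNNING:
    case ERR_CODES.STREAM_ALREADY_ACTIVE:
      return 'busy' // wait for the current job or stream, then retry
    case ERR_CODES.MODEL_NOT_LOADED:
      return 'not-loaded' // call load() first
    default:
      return 'other'
  }
}

console.log(classifyError({ code: 26014 })) // busy
console.log(classifyError({ code: 26008 })) // not-loaded
console.log(classifyError(new Error('boom'))) // unknown
```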
- tetherto/qvac-ext-lib-whisper.cpp
- @qvac/transcription-whispercpp
- qvac-registry-vcpkg
- scripts/convert-model.py
- Day index (day_idx): selects the day-specific low-rank projection (A·B) baked into bci-embedder.bin. Sessions recorded on different days use different projections.
- Windowed attention (w=57 over layers 0–3 by default), configured at model conversion time.

Apache-2.0