Made by Antonio Ramirez

ai-cli

0.3.1

ai

A tiny, agent-native CLI for generating images, video and text with dead-simple commands, stdin support and predictable artifact outputs. Uses Vercel AI SDK and AI Gateway for unified access to hundreds of models.

Install

npm install -g ai-cli

Requires Node.js 20+ and an AI Gateway API key or a provider-specific key (e.g. OPENAI_API_KEY).

Usage

ai image "a cute dog"
ai video "a spinning triangle"
ai text "explain quantum computing"
ai models                          # list available models

Piping and References

ai image "a dragon" | ai video "animate this"
ai video -i input.png "animate this"
ai image --image reference.png "make a sticker in this style"
ai image -i sketch.png -i palette.jpg "render this product concept"
ai text --image screenshot.png "what is broken in this UI?"
cat photo.png | ai text "describe this image"
cat notes.txt | ai text "summarize this"
git diff | ai text "explain these changes"

Common Options

All commands support:

-m, --model <id>         Model ID (creator/model-name), comma-separated for multi-model
-o, --output <path>      Output file path or directory
-n, --count <n>          Number of generations per model (default: 1)
-p, --concurrency <n>    Max parallel generations (default: 4, video: 2)
-q, --quiet              Suppress progress output
--json                   Output metadata as JSON

Model IDs can be specified as creator/model-name or just model-name (resolved against models fetched from the gateway):

ai text -m gpt-5.5 "hello"          # resolves to openai/gpt-5.5
ai image -m flux-2-pro "a sunset"   # resolves to bfl/flux-2-pro
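The resolution rule above can be pictured with a small shell sketch. This is an illustration only, not the CLI's actual implementation: resolve_model is a hypothetical helper, and the catalog is a hand-written newline-separated list standing in for whatever the gateway returns.

```shell
# Hypothetical helper: resolve a bare model name against a catalog of
# creator/model-name IDs (one per line). Not part of ai-cli.
resolve_model() {
  name=$1
  catalog=$2
  case $name in
    */*) printf '%s\n' "$name"; return 0 ;;  # already creator/model-name
  esac
  # Take the first ID whose part after the slash matches the bare name.
  match=$(printf '%s\n' "$catalog" | awk -F/ -v n="$name" '$2 == n { print; exit }')
  [ -n "$match" ] || return 1                # unknown model name
  printf '%s\n' "$match"
}

# Assumed catalog contents, taken from the examples above.
catalog='openai/gpt-5.5
bfl/flux-2-pro'

resolve_model flux-2-pro "$catalog"
```

How the real CLI handles a bare name that matches several creators is not documented here; the sketch simply takes the first match.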

image

-i, --image <path-or-url>  Reference image path or URL (repeatable)
--size <WxH>               Image size (e.g. 1024x1024)
--aspect-ratio <W:H>       Aspect ratio (e.g. 16:9)
--quality <level>          Quality (standard, hd)
--style <style>            Style (vivid, natural)
--no-preview               Disable inline image preview

Reference images can be local paths, file:// URLs, http(s):// URLs or data URLs. You can repeat --image to pass multiple references, and you can still pipe one image through stdin:

cat input.png | ai image -i style.png "combine the subject with this style"

Reference-image support is model-dependent; unsupported models may reject image inputs.

video

-i, --image <path-or-url>  Image input path or URL
--aspect-ratio <W:H>       Aspect ratio (e.g. 16:9)
--duration <seconds>       Duration in seconds
--no-preview               Disable inline video frame preview

Image inputs can be local paths, file:// URLs, http(s):// URLs or data URLs. Video generation accepts one input image, provided either through --image or piped stdin:

ai video -i input.png "animate this"
cat input.png | ai video "animate this"

text

-f, --format <fmt>         Output format: md, txt (default: md)
-i, --image <path-or-url>  Image input path or URL for vision (repeatable)
-s, --system <prompt>      System prompt
--max-tokens <n>           Maximum tokens to generate
-t, --temperature <n>      Temperature (0-2)

For vision-capable text models, ai text accepts images from --image or piped stdin:

ai text -i chart.png -i table.jpg "summarize the data"
cat screenshot.png | ai text "list the visible errors"

models

--type <type>            Filter by type: text, image, video
--creator <name>         Filter by creator (e.g. openai, google)
--json                   Output as JSON (includes descriptions)

All model types (text, image, video) are fetched live from the AI Gateway.

Multi-Model Comparison

Generate with multiple models by comma-separating -m:

ai image "a sunset" -m "openai/gpt-image-1,xai/grok-imagine-image,bfl/flux-2-pro"

Combine with -n to generate multiple per model:

ai image "a sunset" -n 2 -m "openai/gpt-image-1,bfl/flux-2-pro"   # 4 images total
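The total number of outputs is the number of models multiplied by -n. A quick way to check a planned run before spending credits, using a hypothetical helper (total_generations is not part of ai-cli):

```shell
# Hypothetical helper: outputs produced for a comma-separated -m value
# and a -n count (the count defaults to 1, as in the CLI).
total_generations() {
  models=$1
  count=${2:-1}
  # Count the models by counting comma-separated fields.
  n_models=$(printf '%s\n' "$models" | awk -F, '{ print NF }')
  echo $(( n_models * count ))
}

total_generations "openai/gpt-image-1,bfl/flux-2-pro" 2   # prints 4
```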

Inline Preview

When running in a terminal that supports the Kitty graphics protocol (Kitty, Ghostty, WezTerm, Warp, iTerm2), generated images and videos are displayed inline automatically. Video previews decode an H.264 keyframe from the midpoint of the video using openh264 compiled to WebAssembly, so no native dependencies are required. Use --no-preview or AI_CLI_PREVIEW=0 to disable previews, or set AI_CLI_PREVIEW=1 to force them on in terminals that are not detected.

Output Behavior

  • text: saves to <id>.md when run interactively; writes to stdout when piped
  • image/video: saves to <id>.png / <id>.mp4 when run interactively; writes raw binary to stdout when piped
  • -o <dir>: saves inside the directory with auto-generated names

When the CLI needs to choose a filename, it uses a response id when available and falls back to a random 8-character id.
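The filename fallback can be sketched as follows. This is an assumption-laden illustration, not the CLI's code: choose_filename is a hypothetical helper, and the random id's alphabet (hex here) is a guess, since only the 8-character length is stated above.

```shell
# Hypothetical helper: build an output filename from a response id,
# falling back to a random 8-character id when none is available.
choose_filename() {
  response_id=$1
  ext=$2
  if [ -z "$response_id" ]; then
    # 4 random bytes printed as hex give exactly 8 characters.
    # The hex alphabet is an assumption; the README only fixes the length.
    response_id=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
  fi
  printf '%s.%s\n' "$response_id" "$ext"
}

choose_filename "resp_123" png   # id available: resp_123.png
choose_filename "" mp4           # no id: 8 random characters plus .mp4
```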

Environment Variables

Variable             Description
AI_GATEWAY_API_KEY   AI Gateway authentication key
OPENAI_API_KEY       Provider-specific key (or other provider keys)
AI_CLI_TEXT_MODEL    Default text model (overrides openai/gpt-5.5)
AI_CLI_IMAGE_MODEL   Default image model (overrides openai/gpt-image-2)
AI_CLI_VIDEO_MODEL   Default video model (overrides bytedance/seedance-2.0)
AI_CLI_OUTPUT_DIR    Default output directory for generated files
AI_CLI_PREVIEW       Set to 1 to force inline image preview, 0 to disable
NO_COLOR             Disable ANSI color output
FORCE_COLOR          Force color output even when not a TTY

The -m flag always takes priority over AI_CLI_*_MODEL env vars. The -o flag always takes priority over AI_CLI_OUTPUT_DIR.
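The precedence (flag, then environment variable, then built-in default) maps onto nested shell parameter expansion. A minimal sketch for the text command, where effective_text_model and the example/... model IDs are hypothetical and used only for illustration:

```shell
# Hypothetical helper: pick the text model the way the precedence rule
# describes. $1 is the value passed to -m (empty when the flag is absent).
effective_text_model() {
  flag_value=$1
  # -m wins; otherwise AI_CLI_TEXT_MODEL; otherwise the documented default.
  printf '%s\n' "${flag_value:-${AI_CLI_TEXT_MODEL:-openai/gpt-5.5}}"
}

AI_CLI_TEXT_MODEL=example/env-model      # placeholder ID, not a real model
effective_text_model ""                  # env var replaces the default
effective_text_model example/flag-model  # -m value replaces the env var
```

The same pattern would apply to AI_CLI_IMAGE_MODEL, AI_CLI_VIDEO_MODEL and AI_CLI_OUTPUT_DIR with their respective flags.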

Timeouts

Requests that exceed the timeout are aborted automatically:

Command   Timeout
text      120 seconds
image     120 seconds
video     300 seconds

Exit Codes

Code   Meaning
0      Success
1      All generations failed
2      Partial failure (some succeeded, some failed)
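In scripts and agent workflows, these codes let a caller tell a total failure from a partial one. A small sketch, where describe_ai_exit is a hypothetical helper and the commented usage assumes ai-cli is installed:

```shell
# Hypothetical helper: translate an ai-cli exit code into a short label.
describe_ai_exit() {
  case $1 in
    0) echo "success" ;;
    1) echo "all generations failed" ;;
    2) echo "partial failure" ;;
    *) echo "unexpected exit code $1" ;;  # codes not listed in the table
  esac
}

# Usage sketch (left commented out because it needs ai-cli and an API key):
#   ai image "a sunset" -n 4 -o out/
#   describe_ai_exit $?
describe_ai_exit 2
```

On a partial failure (code 2), a caller might keep the files that were written and retry only the missing generations.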

License

Apache-2.0