castkit: CLI Demo Videos From One Command

castkit is an open-source Rust CLI demo video generator. Run one command on any binary and get a polished MP4 or GIF. Works headless in CI.

Every CLI tool I’ve ever shipped had the same problem: zero visual presence. Then I built castkit - a CLI demo video generator that turns any binary into a polished MP4 or GIF with one command. No screen recording software. No manual scripting. No video editing. Open source, written in Rust, and the meta-demo sells it better than I can: castkit generates its own demo video.

Check the repo before reading further if you want to see the output first.

Why I actually built this

Screen recording CLI tools is a pain in the ass. You open your terminal, run the command, inevitably typo something on the third take, realize the font looks weird, forget to hide your API key in the environment, and end up with a 45-second raw recording you now have to edit in Final Cut.

The existing options all have real tradeoffs. asciinema is free and captures terminal output, but the playback looks like a terminal log - no visual polish, no branding, nothing you’d put on a landing page. Screen Studio costs $89 and produces beautiful results, but you’re still recording manually. VHS from Charm is the closest thing to what I wanted - declarative, scriptable - but you have to write a .tape file by hand for every recording. No auto-discovery. No agent support. Still requires you to think about the demo structure yourself.

The trigger for actually building this was working on AI coding agents. I kept hitting the same pattern: an agent builds a CLI tool, the CLI tool works, and then the last step is… what? Push to GitHub with a README? That’s a dead end. I wanted the agent to be able to say “generate a demo video” as the final build step. No human intervention. No manual configuration. Just: here’s a binary, produce something I can ship.

That didn’t exist. So I built it.

What castkit actually does

The pipeline is: binary/command → discover → plan → record → redact → render → encode → MP4 or GIF.

Each stage does real work.

Discover reads your tool’s --help output and README to understand what commands exist, what flags do what, and what a reasonable demo flow looks like. It builds a structured picture of the tool without you explaining anything.

Plan takes that understanding and generates a demo script as editable JSON. This is the part I’m most proud of. You can run castkit plan on any CLI tool, get a JSON file showing every scene, every command, every pause duration - and edit it before recording. It’s not a black box.

Record runs the actual PTY recording with human-like typing jitter. Real typing cadence, not sleep 0.1 between every character. The variation is calibrated to feel like a person typed it, not a script.
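The jitter idea can be sketched in a few lines. This is an illustrative, std-only approximation, not castkit's actual code: the delay is a base cadence plus a deterministic pseudo-random variation derived from the character and its position, so a replay of the same plan types identically.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical sketch of a per-keystroke delay with human-like jitter.
/// Hashing (char, index) gives a stable pseudo-random offset, so the
/// "human" variation is reproducible across recordings of the same plan.
fn keystroke_delay_ms(ch: char, index: usize, base_ms: u64, jitter_ms: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (ch, index).hash(&mut hasher);
    // Map the hash into [0, jitter_ms) and add it to the base cadence.
    base_ms + hasher.finish() % jitter_ms
}

fn main() {
    for (i, ch) in "castkit --help".chars().enumerate() {
        let delay = keystroke_delay_ms(ch, i, 60, 90);
        // A real recorder would sleep here before writing the byte to the PTY.
        println!("type {:?} after {} ms", ch, delay);
    }
}
```

Deterministic jitter matters in CI: re-running the pipeline on the same plan produces a byte-identical timing track, which keeps the output video stable.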

Redact runs automatically before anything gets rendered. Environment variables, API keys, tokens - anything that looks like a secret gets masked. Safe by default. You don’t have to remember to do this.
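The shape of that pass looks roughly like this. A minimal sketch only, assuming a simple name-based heuristic; castkit's real redaction is more thorough than matching variable names:

```rust
/// Illustrative sketch: mask the value of any env-style assignment whose
/// name looks secret. The marker list and masking format are assumptions,
/// not castkit's actual rules.
fn redact_line(line: &str) -> String {
    const MARKERS: [&str; 4] = ["KEY", "TOKEN", "SECRET", "PASSWORD"];
    match line.split_once('=') {
        Some((name, _value))
            if MARKERS.iter().any(|m| name.to_ascii_uppercase().contains(m)) =>
        {
            // Drop the value entirely rather than partially masking it.
            format!("{}=********", name)
        }
        _ => line.to_string(),
    }
}

fn main() {
    println!("{}", redact_line("API_KEY=sk-abc123")); // API_KEY=********
    println!("{}", redact_line("PATH=/usr/bin"));     // PATH=/usr/bin
}
```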

Render is where it gets interesting. castkit uses cosmic-text for text shaping and tiny-skia for pixel rendering. Full software renderer - no GPU dependency, no display server needed, runs in CI. It draws macOS window chrome (traffic lights, title bar, drop shadow), handles auto-zoom with easing, crossfade transitions between scenes, cursor smoothing with blink animation.

Encode pipes frames to ffmpeg for H.264 MP4 or GIF output.
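The handoff is the standard rawvideo-over-stdin recipe. This sketch shows the flag set I'd expect for that recipe, assuming 30 fps RGBA frames - it is not necessarily castkit's exact invocation:

```rust
/// Build the ffmpeg argument list for encoding a raw RGBA frame stream
/// from stdin into an H.264 MP4. Flag choices here are a common recipe,
/// assumed rather than taken from castkit's source.
fn ffmpeg_args(width: u32, height: u32, fps: u32, out: &str) -> Vec<String> {
    vec![
        "-f".into(), "rawvideo".into(),        // headerless pixel stream
        "-pix_fmt".into(), "rgba".into(),
        "-s".into(), format!("{}x{}", width, height),
        "-r".into(), fps.to_string(),
        "-i".into(), "-".into(),               // read frames from stdin
        "-c:v".into(), "libx264".into(),
        "-pix_fmt".into(), "yuv420p".into(),   // widest player compatibility
        out.into(),
    ]
}

fn main() {
    let args = ffmpeg_args(1280, 720, 30, "demo.mp4");
    // The real pipeline would spawn the child with piped stdin and write
    // each rendered frame's pixel buffer to it:
    //   Command::new("ffmpeg").args(&args).stdin(Stdio::piped()).spawn()
    println!("ffmpeg {}", args.join(" "));
}
```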

The whole thing is a single binary plus ffmpeg. No runtime dependencies. No Python environment. No Node. Ship it anywhere.

How castkit works end to end

The one-command path is:

castkit demo ./your-binary

That’s it. It discovers, plans, records, and renders without you specifying anything. You get an MP4 with branding, themes, and proper window chrome.

If you want more control:

castkit plan ./your-binary --output demo-plan.json
# edit demo-plan.json
castkit record --plan demo-plan.json --output demo.mp4

The plan JSON looks like this:

{
  "title": "castkit demo",
  "theme": "catppuccin",
  "style": "dark",
  "scenes": [
    {
      "type": "command",
      "command": "castkit --help",
      "duration_ms": 2000,
      "pause_after_ms": 1500
    },
    {
      "type": "command",
      "command": "castkit demo ./my-tool",
      "duration_ms": 3000,
      "pause_after_ms": 1000
    }
  ]
}

You can add intro cards, outro branding, adjust timing, swap themes. The plan is the contract between what you want and what gets rendered.

Themes and visual styles

Built-in color themes: catppuccin, tokyo-night, dracula, one-dark. Visual styles: dark, light, minimal, ocean, hacker. These aren’t just color swaps - each style affects font weight, padding, window chrome treatment, and background rendering.

The hacker style does exactly what you think it does.

Two recording modes

Terminal mode is the classic full-terminal recording. The full PTY, command output scrolling, cursor - everything you’d see if you were sitting at the machine.

Web mode is for CLI tools that spawn a browser or have web UI output. It captures that context instead.

The technical stack

Rust was the right call here for a few reasons. The rendering pipeline needs to be deterministic and fast - you’re compositing frames, and any variability in timing shows up in the output video. Rust’s ownership model also made the PTY recording safe to implement without the race conditions you’d fight in Go or Python.

Key crates:

  • vt100 - terminal state parsing and capture. This is the core of getting accurate terminal output into a data structure we can render.
  • cosmic-text - text layout with proper Unicode support, ligatures, font fallback. CLI output has a lot of edge cases here.
  • tiny-skia - pure-Rust 2D renderer. No Cairo, no Skia binding hell, no native dependency chain.
  • portable-pty - cross-platform PTY. The thing that lets you actually run a real shell session and capture it.
  • clap - CLI interface. The irony of using a CLI framework to build a CLI demo tool is not lost on me.

The rendering pipeline goes: PTY session → vt100 parser → terminal cell grid → tiny-skia frame → raw pixel buffer → ffmpeg stdin. Each frame is fully rendered in software. That’s intentional - it means the tool runs in headless CI without needing a display.
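The "cell grid → frame" step is the easiest to picture in code. A minimal sketch with invented types - real castkit shapes glyphs with cosmic-text and paints with tiny-skia, but the row-major RGBA buffer handed to ffmpeg is the same idea. Only cell backgrounds are painted here:

```rust
/// Hypothetical cell type: just a background color. A real cell would
/// also carry the glyph, foreground color, and style attributes.
struct Cell {
    bg: [u8; 4], // RGBA
}

/// Paint every cell's background into a row-major RGBA buffer.
/// Assumes a non-empty, rectangular grid.
fn paint_backgrounds(grid: &[Vec<Cell>], cell_w: usize, cell_h: usize) -> Vec<u8> {
    let (rows, cols) = (grid.len(), grid[0].len());
    let (w, h) = (cols * cell_w, rows * cell_h);
    let mut buf = vec![0u8; w * h * 4];
    for (r, row) in grid.iter().enumerate() {
        for (c, cell) in row.iter().enumerate() {
            // Fill this cell's rectangle pixel by pixel.
            for y in r * cell_h..(r + 1) * cell_h {
                for x in c * cell_w..(c + 1) * cell_w {
                    let i = (y * w + x) * 4;
                    buf[i..i + 4].copy_from_slice(&cell.bg);
                }
            }
        }
    }
    buf
}
```

Everything above this buffer - chrome, shadows, zoom - composites on top of it before the frame goes out.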

The auto-zoom is probably the feature that makes demos look professional without any work. When a command produces long output, castkit calculates the bounding box of the relevant content and applies a smooth zoom-in with easing. It’s the thing that makes it look like someone edited the recording, when nobody did.
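The easing itself is a few lines. This sketch assumes a cubic ease-in-out curve - castkit's actual curve may differ - where `t` runs from 0.0 to 1.0 over the zoom and the result interpolates between the start and target scale:

```rust
/// Cubic ease-in-out: slow start, fast middle, slow finish.
/// (An assumed curve; the real tool may use a different one.)
fn ease_in_out_cubic(t: f64) -> f64 {
    if t < 0.5 {
        4.0 * t * t * t
    } else {
        let u = -2.0 * t + 2.0;
        1.0 - u * u * u / 2.0
    }
}

/// Interpolate the zoom scale for frame progress t in [0, 1].
fn zoom_scale(t: f64, from: f64, to: f64) -> f64 {
    from + (to - from) * ease_in_out_cubic(t)
}
```

Applied per frame, this is what turns a hard cut into something that reads as deliberate camera work.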

Why not Go or Python

I get asked this. Go would’ve been fine for the CLI surface, but the rendering layer involves pixel-level compositing with tight frame timing. The garbage collector pauses in Go show up as dropped frames when you’re pushing raw buffers to ffmpeg at 30fps. Python isn’t even in the conversation for a tool you want to distribute as a single binary.

Rust also gave me compile-time guarantees on the PTY session lifecycle. A PTY that doesn’t get cleaned up properly hangs your terminal. With Rust’s Drop trait handling cleanup, that class of bug just doesn’t happen.
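The guarantee looks like this in miniature, with a hypothetical wrapper type: whatever owns the PTY does its teardown in `Drop`, so cleanup runs on every exit path, including panics and early returns.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Stand-in for observable cleanup; real code would restore termios
// settings and close the PTY file descriptors instead.
static CLEANED: AtomicBool = AtomicBool::new(false);

/// Hypothetical session wrapper. Owning the PTY through a type with a
/// Drop impl means forgetting cleanup is not possible.
struct PtySession;

impl Drop for PtySession {
    fn drop(&mut self) {
        CLEANED.store(true, Ordering::SeqCst);
    }
}

fn main() {
    {
        let _session = PtySession;
        // Recording happens here; even a panic would still run Drop.
    }
    assert!(CLEANED.load(Ordering::SeqCst));
    println!("pty cleaned up on scope exit");
}
```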

Running castkit in CI

This is where it gets useful beyond the local workflow. Because castkit runs headless - no display server, no GPU, just CPU and ffmpeg - it works in GitHub Actions without any special setup.

name: Generate Demo

on:
  push:
    branches: [main]

jobs:
  demo:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install ffmpeg
        run: sudo apt-get install -y ffmpeg
      - name: Download castkit
        run: |
          curl -L https://github.com/deeflect/castkit/releases/latest/download/castkit-linux-x86_64 -o castkit
          chmod +x castkit
      - name: Build your tool
        run: cargo build --release
      - name: Generate demo
        run: ./castkit demo ./target/release/your-tool --output demo.mp4
      - name: Upload demo artifact
        uses: actions/upload-artifact@v4
        with:
          name: demo-video
          path: demo.mp4

Every push regenerates the demo. If your tool’s output changes, the demo reflects it automatically. No stale GIFs in your README. The agent-native workflow I actually want is this: Claude Code builds the tool, the CI pipeline generates the demo, the demo gets committed to the repo. Zero human steps in the video production chain.

That’s not hypothetical. I’ve been running this on a few of my own tools for the past few weeks and it works exactly as described.

Why this matters for CLI tool marketing

Most developers shipping CLI tools think of documentation as the end of the marketing funnel. It’s not. The funnel is:

  1. Someone sees the GitHub repo linked somewhere
  2. They skim the README in about 8 seconds
  3. They either get it or they don’t

A 15-second GIF that shows the tool running converts browsers into users at a completely different rate than text. This isn’t an opinion - every product person who’s run A/B tests on landing pages with and without video knows this. The video wins. Every time.

But developers don’t make demo videos because making demo videos is annoying. It’s a context switch out of the thing you’re building and into video production, which is a different skill set, different tooling, different mental mode. castkit makes that context switch disappear. You’re still in the terminal. You’re still thinking like an engineer. You just run a command.

The agent-native angle is where I think this gets genuinely interesting going forward. Right now I use Claude Code for a lot of my CLI development. The last step of that workflow - “generate a demo for this” - can now be a single tool call. The coding agent builds the thing and generates the demo as part of the same build step. No human in the loop for that part.

That’s the version of developer tooling I actually want to work in.

What’s missing and what’s next

The current version has two gaps I know about:

Windows support is partial. The PTY layer uses platform abstractions but I haven’t battle-tested it on Windows. If you hit issues, open an issue - I want to fix this.

Dynamic content is tricky. If your CLI tool outputs things that change every run (timestamps, generated IDs, random data), the recording captures whatever happened during that specific run. There’s no redaction for content variance the way there is for secrets. You can work around this by mocking your tool’s output in the plan, but it’s not automatic yet.

On the roadmap: a --dry-run mode that shows you the rendered plan without executing real commands, support for recording multiple tools in one session (useful for showing integrations), and a web viewer for the JSON plans so non-engineers can review before recording.

The code is MIT licensed and the repo is open. PRs are welcome. If you build something with it, I want to see the demo.

The meta-demo point

I said it up top but it’s worth landing: castkit generates its own demo video. That’s not a marketing stunt. It’s the proof of concept. If a tool can document itself, the core premise works.

Every CLI tool I build from here ships with a demo generated by castkit. Not because I’ve committed to some content strategy, but because it takes one command and it looks good. That’s the bar. If it takes more than one command or it looks bad, it doesn’t get used.

Right now it’s at one command and it looks good. Check the 31 Rust CLI tools list for what I’m building next, or browse my coding stack if you want more of this kind of thing.


Go run castkit demo ./your-tool on whatever you’re building. If the output looks wrong or the plan generation misses something obvious, open an issue with your --help output. That’s the fastest way to make the discovery smarter.