v0.2.0 — Now available

Your terminal.
Your models.
Your rules.

Open-source AI coding agent built in Rust. Switch between Claude, GPT-4o, DeepSeek, Gemini, or any local Ollama model with one flag — 23 tools, full permission controls, MCP server built in.

terminal
$ curl -sSfL https://cuervo.cloud/install.sh | sh
6 AI Providers · 23 Built-in Tools · 2,226 Tests passing · 100% Open Source
About Cuervo

AI that works the way you do

Halcón CLI was built out of frustration with AI assistants that only chat. We built an agent that actually runs code, edits files, commits to git, and searches your codebase, and does it safely, with your permission.

It runs entirely on your machine. No cloud intermediary, no data sent anywhere except directly to the AI provider of your choice. Open-source under Apache 2.0 — inspect every line, modify anything, contribute back.

Apache 2.0 🦀 100% Rust 🔒 Local-first 🚫 No telemetry
01
Developer-first design

Every feature was designed by developers who use it daily. If it doesn't feel right in the terminal, it doesn't ship.

02
Production-grade reliability

2,226 tests, atomic file writes, circuit breakers, tool loop guards — because your codebase deserves better than a chatbot wrapper.
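The atomic-write guarantee mentioned above can be sketched in a few lines of Rust. This is a simplified illustration, not Cuervo's actual implementation: a plain read-back check stands in for the SHA-256 verification to keep the sketch dependency-free.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// Write to a temp file in the same directory, flush to disk, then rename
/// over the target. Rename is atomic on POSIX filesystems, so readers never
/// observe a half-written file.
fn atomic_write(path: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut f = File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?; // ensure bytes reach the disk before the rename
    fs::rename(&tmp, path)?;
    debug_assert_eq!(fs::read(path)?, data); // stand-in for a SHA-256 check
    Ok(())
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("cuervo_demo.txt");
    atomic_write(&path, b"hello")?;
    assert_eq!(fs::read(&path)?, b"hello");
    fs::remove_file(&path)
}
```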

03
Transparent & auditable

Every tool call, every permission decision, every model invocation is logged. Know exactly what Cuervo did and why.

CLI in action

See what Cuervo can do

From simple questions to complex multi-step agentic tasks — from your terminal.

chat
# Basic AI chat session
halcon chat "Explain this Rust function and suggest improvements"
◆ Halcón [claude-sonnet-4-6 · anthropic]
The function transform_pipeline() applies a three-stage transformation:
1. Filters items where score > 0.7
2. Maps each to ProcessedItem with normalization
3. Collects into a new Vec
💡 Suggestion: Use iterator chains — 40% fewer allocations:
// Refactored
items.iter()
    .filter(|i| i.score > 0.7)
    .map(ProcessedItem::from)
    .collect()
# Multi-step agentic task
halcon chat "Add validation to all API endpoints in src/api/"
◆ Planning task...
Step 1: Scan directory structure
Step 2: Read each endpoint handler
Step 3: Add validation from existing patterns
Step 4: Run tests to verify
directory_tree src/api/ (0.3s) — 6 files
file_read src/api/users.rs (0.1s)
file_edit src/api/users.rs (0.2s)
bash "cargo test api" (4.1s) — 12 passed
✓ Done. Added validation to 6 endpoints. All tests pass.
# Available tools
halcon tools list
file_read ReadOnly Read files with optional line ranges
file_write Destructive Write atomically (SHA-256 verified)
file_edit Destructive Edit files with string replacement
bash Destructive Execute shell commands
grep ReadOnly Search file contents with regex
directory_tree ReadOnly Visualize directory structure
git_status ReadOnly Show working tree status
git_commit Destructive Create commits
web_search ReadOnly Search the web (Brave API)
symbol_search ReadOnly Find symbols across Rust/Python/JS
task_track ReadOnly Track tasks in-session
+ 12 more tools
# TUI cockpit
halcon chat --tui
◆ Halcón SESSION abc123 anthropic / claude-sonnet-4-6 ● Running
Activity
◆ Halcón R1
Planning the refactor...
✓ file_read (0.2s)
✓ file_edit (0.3s)
Plan · 2/4
✓ Scan files
⚙ Edit handlers
○ Run tests
○ Summary
Tokens: 4,821
Cost: $0.012
[Space] pause · [N] step · [F1] panel · [F6] sessions
Features

Everything you need.
Nothing you don't.

Built for developers who want a serious AI coding partner — not a chatbot wrapper.

Multi-Model Intelligence

Switch between Claude, GPT-4o, DeepSeek, Gemini, and local Ollama — one flag, no reconfiguration. Automatic fallback and latency-aware routing built in.
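The fallback behavior can be sketched as a trait plus an ordered retry loop. Everything below is hypothetical illustration — the `Provider` trait and the toy `Flaky`/`Reliable` implementations are not Cuervo's real types, and the actual routing also weighs recent per-provider latency.

```rust
// Hypothetical provider abstraction for the sketch.
trait Provider {
    fn invoke(&self, prompt: &str) -> Result<String, String>;
}

struct Flaky;    // simulates an outage: always errors
struct Reliable; // always answers

impl Provider for Flaky {
    fn invoke(&self, _p: &str) -> Result<String, String> { Err("timeout".into()) }
}
impl Provider for Reliable {
    fn invoke(&self, p: &str) -> Result<String, String> { Ok(format!("echo: {p}")) }
}

/// Try each provider in order; the first success wins.
fn invoke_with_fallback(providers: &[&dyn Provider], prompt: &str) -> Result<String, String> {
    for p in providers {
        if let Ok(out) = p.invoke(prompt) {
            return Ok(out);
        }
        // else: treat this provider as tripped and move on to the next
    }
    Err("all providers failed".into())
}

fn main() {
    let providers: [&dyn Provider; 2] = [&Flaky, &Reliable];
    let out = invoke_with_fallback(&providers, "hi").unwrap();
    assert_eq!(out, "echo: hi");
}
```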

23 Built-in Tools

File ops, bash, git, grep, web search, directory tree, symbol search, background tasks — all with three-tier permission controls and human-in-the-loop authorization.

TUI Cockpit

Full-featured terminal UI with live activity zone, side panel for plan/metrics/context, real-time token tracking, session browser, and clickable stop button.

Permission Controls

ReadOnly / ReadWrite / Destructive permission model. Every sensitive tool requires user consent — with deny-always and non-interactive modes for CI/automation.
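The three-tier model reduces to a small decision function. A minimal sketch, with hypothetical type names — the shipped policy engine may differ in detail:

```rust
// Tiers ordered by risk; derived Ord lets us compare them directly.
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum Permission { ReadOnly, ReadWrite, Destructive }

#[derive(Debug, PartialEq)]
enum Decision { Allow, Ask, Deny }

/// A call is auto-allowed up to the granted tier; beyond it, an interactive
/// session escalates to a human prompt, while non-interactive (CI) runs deny.
fn gate(required: Permission, granted: Permission, interactive: bool) -> Decision {
    if required <= granted {
        Decision::Allow
    } else if interactive {
        Decision::Ask  // human-in-the-loop consent
    } else {
        Decision::Deny // deny-always for automation
    }
}

fn main() {
    assert_eq!(gate(Permission::ReadOnly, Permission::ReadOnly, false), Decision::Allow);
    assert_eq!(gate(Permission::Destructive, Permission::ReadOnly, true), Decision::Ask);
    assert_eq!(gate(Permission::Destructive, Permission::ReadOnly, false), Decision::Deny);
}
```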

MCP Server

Exposes all 23 tools as an MCP server for IDE integration. Works with Cursor, VS Code, and any MCP-compatible client over stdio JSON-RPC — zero config.

Self-Correcting Agent

Detects when it's stuck before you do. Bayesian anomaly detection + ARIMA resource forecasting + reflexion loop — the agent corrects itself, so you don't have to restart.
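The EMA side of that self-correction is easy to sketch: track an exponential moving average and variance of a per-round signal (say, tool latency or token burn) and flag observations far outside it. This is a toy stand-in, not the HICON implementation, and the Bayesian and ARIMA components are not shown.

```rust
/// Minimal EMA-based anomaly detector (hypothetical; illustration only).
struct EmaDetector {
    mean: f64,
    var: f64,
    alpha: f64, // smoothing factor in (0, 1)
}

impl EmaDetector {
    fn new(alpha: f64) -> Self {
        Self { mean: 0.0, var: 0.0, alpha }
    }

    /// Feed one observation; returns true when it deviates more than
    /// 3 sigma from the running EMA, then updates the estimates.
    fn observe(&mut self, x: f64) -> bool {
        let diff = x - self.mean;
        let anomalous = self.var > 0.0 && diff.abs() > 3.0 * self.var.sqrt();
        self.mean += self.alpha * diff;
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * diff * diff);
        anomalous
    }
}

fn main() {
    let mut d = EmaDetector::new(0.3);
    for x in [1.0, 1.1, 0.9, 1.0, 1.05] {
        d.observe(x); // warm up on steady readings
    }
    assert!(d.observe(10.0)); // sudden spike is flagged
}
```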

Providers

Works with every major AI provider

Switch with a single flag — same interface, every model.

Anthropic
● Active
claude-opus-4-6
claude-sonnet-4-6
claude-haiku-4-5
OpenAI
● Active
gpt-4o
gpt-4o-mini
o1
o3-mini
DeepSeek
● Active
deepseek-chat
deepseek-coder
deepseek-reasoner
Gemini
● Active
gemini-2.0-flash
gemini-1.5-pro
Ollama
⬡ Local
any local model
llama3
deepseek-coder-v2
OpenAI Compat
⇄ Compat
Together AI
Groq
LM Studio
any endpoint

Any OpenAI-compatible endpoint works out of the box — --provider custom --base-url https://...

Architecture

Production-grade
agent architecture

Cuervo runs a multi-round agent loop with FSM state tracking, parallel tool batching, and a 5-tier context memory pipeline (L0–L4). Not a thin API wrapper.

HICON metacognitive loop — Bayesian anomaly detection + EMA self-correction
L0–L4 context pipeline — HotBuffer → SlidingWindow → ColdStore → BM25 → Archive
ARIMA resource predictor — Forecasts token usage with 95% CI per session
Speculative tool execution — Pre-fetches ReadOnly tool results before model responds
Playbook-based planning — Auto-learns successful execution plans for reuse
// Agent loop (simplified)
loop {
    // HICON: anomaly check
    metacognitive.pre_round();
    // Context (L0-L4 pipeline)
    let ctx = pipeline.build_messages();
    // Speculative pre-fetch
    let cached = speculator.check_cache();
    // Model invocation
    let stream = provider.invoke(ctx);
    // Tool execution (parallel)
    let results = executor.run_batch(tools);
    // Self-correct if anomaly
    corrector.evaluate(&results);
    if done { break; }
}
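The L0–L4 context pipeline named above can be sketched as a tiered lookup. The struct and fields here are hypothetical, and a substring match stands in for the real BM25 scoring; the archive tier is omitted.

```rust
// Illustration only: hot entries always ship, the sliding window
// contributes its tail, and cold storage is retrieved by relevance.
struct Pipeline {
    hot: Vec<String>,    // L0 HotBuffer: always included
    window: Vec<String>, // L1 SlidingWindow: recent turns
    cold: Vec<String>,   // L2+ ColdStore: retrieved on demand
}

impl Pipeline {
    fn build_messages(&self, query: &str, tail: usize) -> Vec<String> {
        let mut msgs = self.hot.clone();
        let start = self.window.len().saturating_sub(tail);
        msgs.extend_from_slice(&self.window[start..]);
        // Retrieval stand-in: keep cold entries mentioning the query
        msgs.extend(self.cold.iter().filter(|m| m.contains(query)).cloned());
        msgs
    }
}

fn main() {
    let p = Pipeline {
        hot: vec!["system prompt".into()],
        window: vec!["turn 1".into(), "turn 2".into(), "turn 3".into()],
        cold: vec!["old note about validation".into(), "unrelated".into()],
    };
    let msgs = p.build_messages("validation", 2);
    assert_eq!(msgs, ["system prompt", "turn 2", "turn 3", "old note about validation"]);
}
```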
2,226
tests
<2ms
overhead/round
100%
Rust
momoto-ui-core · Rust/WASM

Perceptual color science,
live in your browser

Halcón's UI system is powered by momoto-ui-core — a Rust/WASM engine that derives perceptually accurate UI state tokens from OKLCH color science. WCAG 2.1 + APCA validated in real time.
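The WCAG 2.1 side of that validation is simple enough to sketch directly. The relative-luminance and contrast-ratio formulas below come straight from the WCAG 2.1 definitions; momoto-ui-core's OKLCH and APCA math is not shown.

```rust
/// WCAG 2.1 relative luminance of an sRGB color (channels 0..=255).
fn relative_luminance(r: u8, g: u8, b: u8) -> f64 {
    fn lin(c: u8) -> f64 {
        let c = c as f64 / 255.0;
        if c <= 0.03928 { c / 12.92 } else { ((c + 0.055) / 1.055).powf(2.4) }
    }
    0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b)
}

/// WCAG contrast ratio: (L_lighter + 0.05) / (L_darker + 0.05).
fn contrast_ratio(a: (u8, u8, u8), b: (u8, u8, u8)) -> f64 {
    let la = relative_luminance(a.0, a.1, a.2);
    let lb = relative_luminance(b.0, b.1, b.2);
    let (hi, lo) = if la > lb { (la, lb) } else { (lb, la) };
    (hi + 0.05) / (lo + 0.05)
}

fn main() {
    let ratio = contrast_ratio((255, 255, 255), (0, 0, 0));
    println!("white on black: {ratio:.1}:1"); // 21.0:1 — passes AA and AAA
}
```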

OKLCH · WCAG 2.1 · APCA

Brand Palette

Derived State Tokens

# Base OKLCH
oklch(0.62  0.22  38)
# Engine: TokenDerivationEngine
# ~0.02ms cache hit · 0.2ms miss

A11y Validation

Powered by momoto-ui-core · Rust/WASM · OKLCH perceptual model
OKLCH Color Space
Perceptually uniform. No sRGB distortions.
TokenDerivationEngine
0.02ms cache hit · 0.2ms cold miss
WCAG 2.1 + APCA
AA / AAA / Lc contrast validation
Rust → WASM
~45 KB gzip. Zero JS dependencies.
CLI Reference

Every command at a glance

12 top-level commands. All with --help for full options.

halcon --help
halcon chat "<prompt>" Start AI-powered agentic chat
halcon chat --tui Launch the TUI cockpit interface
halcon -p deepseek chat Use a specific AI provider
halcon auth login anthropic Store API key in system keychain
halcon auth status Show all configured providers
halcon tools list Show all 23 available tools
halcon tools validate Validate tool configurations
halcon tools add <name> Add a custom tool from manifest
halcon chat --full Enable orchestration, reflexion, tasks
halcon memory search "<query>" Search across all session memory
halcon update Self-update to latest release
halcon mcp-server Start as MCP server for IDEs
halcon doctor Check system health and config
Run halcon <command> --help for full options. Full documentation ↗
--provider <name> Override AI provider (anthropic, openai, deepseek, ollama, gemini)
--model <name> Override model for this session
--tui Launch the TUI cockpit interface
--no-tools Disable all tool use (chat only)
--expert Enable verbose expert mode output
--dry-run Preview actions without executing
Quick Start

Up and running in 60 seconds

Three commands. That's all it takes.

01
Install
curl -sSfL https://cuervo.cloud/install.sh | sh
macOS · Linux · Windows (PowerShell available)
02
Add API key
halcon auth login anthropic
Or: openai, deepseek, gemini, ollama
03
Start coding
halcon chat --tui
Or: halcon chat "your prompt here"

Ready to ship
faster?

Open source. Self-hostable. No cloud required.
Just you, your terminal, and the best AI models.