v0.2.0 — Now available

Your terminal.
Your models.
Your rules.

Open-source AI coding agent built in Rust. Switch between Claude, GPT-4o, DeepSeek, Gemini, or any local Ollama model with one flag — 23 tools, full permission controls, MCP server built in.

terminal
$ curl -sSfL https://cuervo.cloud/install.sh | sh
6 AI Providers · 23 Built-in Tools · 2,226 Tests passing · 100% Open Source
About Cuervo

AI that works the way you do

Halcón CLI was built out of frustration with AI assistants that only chat. We built an agent that actually runs code, edits files, commits to git, and searches your codebase, and does it safely, with your permission.

It runs entirely on your machine. No cloud intermediary, no data sent anywhere except directly to the AI provider of your choice. Open-source under Apache 2.0 — inspect every line, modify anything, contribute back.

Apache 2.0 🦀 100% Rust 🔒 Local-first 🚫 No telemetry
01
Developer-first design

Every feature was designed by developers who use it daily. If it doesn't feel right in the terminal, it doesn't ship.

02
Production-grade reliability

2,226 tests, atomic file writes, circuit breakers, tool loop guards — because your codebase deserves better than a chatbot wrapper.
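The atomic-write guarantee mentioned above can be sketched in a few lines of Rust. This is a simplified illustration, not Cuervo's actual implementation: a plain read-back check stands in for the SHA-256 verification to keep the sketch dependency-free.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// Write to a temp file in the same directory, flush to disk, then rename
/// over the target. Rename is atomic on POSIX filesystems, so readers never
/// observe a half-written file.
fn atomic_write(path: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut f = File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?; // ensure bytes reach the disk before the rename
    fs::rename(&tmp, path)?;
    debug_assert_eq!(fs::read(path)?, data); // stand-in for a SHA-256 check
    Ok(())
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("cuervo_demo.txt");
    atomic_write(&path, b"hello")?;
    assert_eq!(fs::read(&path)?, b"hello");
    fs::remove_file(&path)
}
```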

03
Transparent & auditable

Every tool call, every permission decision, every model invocation is logged. Know exactly what Cuervo did and why.

CLI in action

See what Cuervo can do

From simple questions to complex multi-step agentic tasks — from your terminal.

chat
# Basic AI chat session
halcon chat "Explain this Rust function and suggest improvements"
◆ Halcón [claude-sonnet-4-6 · anthropic]
The function transform_pipeline() applies a three-stage transformation:
1. Filters items where score > 0.7
2. Maps each to ProcessedItem with normalization
3. Collects into a new Vec
💡 Suggestion: Use iterator chains — 40% fewer allocations:
// Refactored
items.iter()
    .filter(|i| i.score > 0.7)
    .map(ProcessedItem::from)
    .collect()
# Multi-step agentic task
halcon chat "Add validation to all API endpoints in src/api/"
◆ Planning task...
Step 1: Scan directory structure
Step 2: Read each endpoint handler
Step 3: Add validation from existing patterns
Step 4: Run tests to verify
directory_tree src/api/ (0.3s) — 6 files
file_read src/api/users.rs (0.1s)
file_edit src/api/users.rs (0.2s)
bash "cargo test api" (4.1s) — 12 passed
✓ Done. Added validation to 6 endpoints. All tests pass.
# Available tools
halcon tools list
file_read ReadOnly Read files with optional line ranges
file_write Destructive Write atomically (SHA-256 verified)
file_edit Destructive Edit files with string replacement
bash Destructive Execute shell commands
grep ReadOnly Search file contents with regex
directory_tree ReadOnly Visualize directory structure
git_status ReadOnly Show working tree status
git_commit Destructive Create commits
web_search ReadOnly Search the web (Brave API)
symbol_search ReadOnly Find symbols across Rust/Python/JS
task_track ReadOnly Track tasks in-session
+ 12 more tools
# TUI cockpit
halcon chat --tui
◆ Halcón SESSION abc123 anthropic / claude-sonnet-4-6 ● Running
Activity
◆ Halcón R1
Planning the refactor...
✓ file_read (0.2s)
✓ file_edit (0.3s)
Plan · 2/4
✓ Scan files
⚙ Edit handlers
○ Run tests
○ Summary
Tokens: 4,821
Cost: $0.012
[Space] pause · [N] step · [F1] panel · [F6] sessions
Features

Everything you need.
Nothing you don't.

Built for developers who want a serious AI coding partner — not a chatbot wrapper.

Multi-Model Intelligence

Switch between Claude, GPT-4o, DeepSeek, Gemini, and local Ollama — one flag, no reconfiguration. Automatic fallback and latency-aware routing built in.
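The fallback behavior can be sketched as a trait plus an ordered retry loop. Everything below is hypothetical illustration — the `Provider` trait and the toy `Flaky`/`Reliable` implementations are not Cuervo's real types, and the actual routing also weighs recent per-provider latency.

```rust
// Hypothetical provider abstraction for the sketch.
trait Provider {
    fn invoke(&self, prompt: &str) -> Result<String, String>;
}

struct Flaky;    // simulates an outage: always errors
struct Reliable; // always answers

impl Provider for Flaky {
    fn invoke(&self, _p: &str) -> Result<String, String> { Err("timeout".into()) }
}
impl Provider for Reliable {
    fn invoke(&self, p: &str) -> Result<String, String> { Ok(format!("echo: {p}")) }
}

/// Try each provider in order; the first success wins.
fn invoke_with_fallback(providers: &[&dyn Provider], prompt: &str) -> Result<String, String> {
    for p in providers {
        if let Ok(out) = p.invoke(prompt) {
            return Ok(out);
        }
        // else: treat this provider as tripped and move on to the next
    }
    Err("all providers failed".into())
}

fn main() {
    let providers: [&dyn Provider; 2] = [&Flaky, &Reliable];
    let out = invoke_with_fallback(&providers, "hi").unwrap();
    assert_eq!(out, "echo: hi");
}
```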

23 Built-in Tools

File ops, bash, git, grep, web search, directory tree, symbol search, background tasks — all with three-tier permission controls and human-in-the-loop authorization.

TUI Cockpit

Full-featured terminal UI with live activity zone, side panel for plan/metrics/context, real-time token tracking, session browser, and clickable stop button.

Permission Controls

ReadOnly / ReadWrite / Destructive permission model. Every sensitive tool requires user consent — with deny-always and non-interactive modes for CI/automation.
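The three-tier model reduces to a small decision function. A minimal sketch, with hypothetical type names — the shipped policy engine may differ in detail:

```rust
// Tiers ordered by risk; derived Ord lets us compare them directly.
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum Permission { ReadOnly, ReadWrite, Destructive }

#[derive(Debug, PartialEq)]
enum Decision { Allow, Ask, Deny }

/// A call is auto-allowed up to the granted tier; beyond it, an interactive
/// session escalates to a human prompt, while non-interactive (CI) runs deny.
fn gate(required: Permission, granted: Permission, interactive: bool) -> Decision {
    if required <= granted {
        Decision::Allow
    } else if interactive {
        Decision::Ask  // human-in-the-loop consent
    } else {
        Decision::Deny // deny-always for automation
    }
}

fn main() {
    assert_eq!(gate(Permission::ReadOnly, Permission::ReadOnly, false), Decision::Allow);
    assert_eq!(gate(Permission::Destructive, Permission::ReadOnly, true), Decision::Ask);
    assert_eq!(gate(Permission::Destructive, Permission::ReadOnly, false), Decision::Deny);
}
```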

MCP Server

Exposes all 23 tools as an MCP server for IDE integration. Works with Cursor, VS Code, and any MCP-compatible client over stdio JSON-RPC — zero config.

Self-Correcting Agent

Detects when it's stuck before you do. Bayesian anomaly detection + ARIMA resource forecasting + reflexion loop — the agent corrects itself, so you don't have to restart.
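The EMA side of that self-correction is easy to sketch: track an exponential moving average and variance of a per-round signal (say, tool latency or token burn) and flag observations far outside it. This is a toy stand-in, not the HICON implementation, and the Bayesian and ARIMA components are not shown.

```rust
/// Minimal EMA-based anomaly detector (hypothetical; illustration only).
struct EmaDetector {
    mean: f64,
    var: f64,
    alpha: f64, // smoothing factor in (0, 1)
}

impl EmaDetector {
    fn new(alpha: f64) -> Self {
        Self { mean: 0.0, var: 0.0, alpha }
    }

    /// Feed one observation; returns true when it deviates more than
    /// 3 sigma from the running EMA, then updates the estimates.
    fn observe(&mut self, x: f64) -> bool {
        let diff = x - self.mean;
        let anomalous = self.var > 0.0 && diff.abs() > 3.0 * self.var.sqrt();
        self.mean += self.alpha * diff;
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * diff * diff);
        anomalous
    }
}

fn main() {
    let mut d = EmaDetector::new(0.3);
    for x in [1.0, 1.1, 0.9, 1.0, 1.05] {
        d.observe(x); // warm up on steady readings
    }
    assert!(d.observe(10.0)); // sudden spike is flagged
}
```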

Providers

Works with every major AI provider

Switch with a single flag — same interface, every model.

Anthropic
● Active
claude-opus-4-6
claude-sonnet-4-6
claude-haiku-4-5
OpenAI
● Active
gpt-4o
gpt-4o-mini
o1
o3-mini
DeepSeek
● Active
deepseek-chat
deepseek-coder
deepseek-reasoner
Gemini
● Active
gemini-2.0-flash
gemini-1.5-pro
Ollama
⬡ Local
any local model
llama3
deepseek-coder-v2
OpenAI Compat
⇄ Compat
Together AI
Groq
LM Studio
any endpoint

Any OpenAI-compatible endpoint works out of the box — --provider custom --base-url https://...

Architecture

Production-grade
agent architecture

Cuervo runs a multi-round agent loop with FSM state tracking, parallel tool batching, and a 5-tier context memory pipeline (L0–L4). Not a thin API wrapper.

HICON metacognitive loop — Bayesian anomaly detection + EMA self-correction
L0–L4 context pipeline — HotBuffer → SlidingWindow → ColdStore → BM25 → Archive
ARIMA resource predictor — Forecasts token usage with 95% CI per session
Speculative tool execution — Pre-fetches ReadOnly tool results before model responds
Playbook-based planning — Auto-learns successful execution plans for reuse
// Agent loop (simplified)
loop {
    // HICON: anomaly check
    metacognitive.pre_round();
    // Context (L0-L4 pipeline)
    let ctx = pipeline.build_messages();
    // Speculative pre-fetch
    let cached = speculator.check_cache();
    // Model invocation
    let stream = provider.invoke(ctx);
    // Tool execution (parallel)
    let results = executor.run_batch(tools);
    // Self-correct if anomaly
    corrector.evaluate(&results);
    if done { break; }
}
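The L0–L4 context pipeline named above can be sketched as a tiered lookup. The struct and fields here are hypothetical, and a substring match stands in for the real BM25 scoring; the archive tier is omitted.

```rust
// Illustration only: hot entries always ship, the sliding window
// contributes its tail, and cold storage is retrieved by relevance.
struct Pipeline {
    hot: Vec<String>,    // L0 HotBuffer: always included
    window: Vec<String>, // L1 SlidingWindow: recent turns
    cold: Vec<String>,   // L2+ ColdStore: retrieved on demand
}

impl Pipeline {
    fn build_messages(&self, query: &str, tail: usize) -> Vec<String> {
        let mut msgs = self.hot.clone();
        let start = self.window.len().saturating_sub(tail);
        msgs.extend_from_slice(&self.window[start..]);
        // Retrieval stand-in: keep cold entries mentioning the query
        msgs.extend(self.cold.iter().filter(|m| m.contains(query)).cloned());
        msgs
    }
}

fn main() {
    let p = Pipeline {
        hot: vec!["system prompt".into()],
        window: vec!["turn 1".into(), "turn 2".into(), "turn 3".into()],
        cold: vec!["old note about validation".into(), "unrelated".into()],
    };
    let msgs = p.build_messages("validation", 2);
    assert_eq!(msgs, ["system prompt", "turn 2", "turn 3", "old note about validation"]);
}
```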
2,226
tests
<2ms
overhead/round
100%
Rust
momoto-ui-core · Rust/WASM

Perceptual color science,
live in your browser

Halcón's UI system is powered by momoto-ui-core — a Rust/WASM engine that derives perceptually accurate UI state tokens from OKLCH color science. WCAG 2.1 + APCA validated in real time.
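The WCAG 2.1 side of that validation is simple enough to sketch directly. The relative-luminance and contrast-ratio formulas below come straight from the WCAG 2.1 definitions; momoto-ui-core's OKLCH and APCA math is not shown.

```rust
/// WCAG 2.1 relative luminance of an sRGB color (channels 0..=255).
fn relative_luminance(r: u8, g: u8, b: u8) -> f64 {
    fn lin(c: u8) -> f64 {
        let c = c as f64 / 255.0;
        if c <= 0.03928 { c / 12.92 } else { ((c + 0.055) / 1.055).powf(2.4) }
    }
    0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b)
}

/// WCAG contrast ratio: (L_lighter + 0.05) / (L_darker + 0.05).
fn contrast_ratio(a: (u8, u8, u8), b: (u8, u8, u8)) -> f64 {
    let la = relative_luminance(a.0, a.1, a.2);
    let lb = relative_luminance(b.0, b.1, b.2);
    let (hi, lo) = if la > lb { (la, lb) } else { (lb, la) };
    (hi + 0.05) / (lo + 0.05)
}

fn main() {
    let ratio = contrast_ratio((255, 255, 255), (0, 0, 0));
    println!("white on black: {ratio:.1}:1"); // 21.0:1 — passes AA and AAA
}
```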

OKLCH · WCAG 2.1 · APCA

Brand Palette

Derived State Tokens

# Base OKLCH
oklch(0.62  0.22  38)
# Engine: TokenDerivationEngine
# ~0.02ms cache hit · 0.2ms miss

A11y Validation

Powered by momoto-ui-core · Rust/WASM · OKLCH perceptual model
OKLCH Color Space
Perceptually uniform. No sRGB distortions.
TokenDerivationEngine
0.02ms cache hit · 0.2ms cold miss
WCAG 2.1 + APCA
AA / AAA / Lc contrast validation
Rust → WASM
~45 KB gzip. Zero JS dependencies.
CLI Reference

Every command at a glance

12 top-level commands. All with --help for full options.

halcon --help
halcon chat "<prompt>" Start AI-powered agentic chat
halcon chat --tui Launch the TUI cockpit interface
halcon -p deepseek chat Use a specific AI provider
halcon auth login anthropic Store API key in system keychain
halcon auth status Show all configured providers
halcon tools list Show all 23 available tools
halcon tools validate Validate tool configurations
halcon tools add <name> Add a custom tool from manifest
halcon chat --full Enable orchestration, reflexion, tasks
halcon memory search "<query>" Search across all session memory
halcon update Self-update to latest release
halcon mcp-server Start as MCP server for IDEs
halcon doctor Check system health and config
Run halcon <command> --help for full options. Full documentation ↗
--provider <name> Override AI provider (anthropic, openai, deepseek, ollama, gemini)
--model <name> Override model for this session
--tui Launch the TUI cockpit interface
--no-tools Disable all tool use (chat only)
--expert Enable verbose expert mode output
--dry-run Preview actions without executing
Quick Start

Up and running in 60 seconds

Three commands. That's all it takes.

01
Install
curl -sSfL https://cuervo.cloud/install.sh | sh
macOS · Linux · Windows (PowerShell available)
02
Add API key
halcon auth login anthropic
Or: openai, deepseek, gemini, ollama
03
Start coding
halcon chat --tui
Or: halcon chat "your prompt here"

Ready to ship
faster?

Open source. Self-hostable. No cloud required.
Just you, your terminal, and the best AI models.