v0.2.0 stable

The terminal AI agent

Halcón connects your terminal to Claude, GPT-4o, DeepSeek, Gemini, or any local Ollama model — with 21 built-in tools, layered permissions, and a native MCP server for your IDE.

$ curl -sSfL cuervo.cloud/install.sh | sh

SHA-256 verified · no sudo · installs to ~/.local/bin

v0.2.0 · macOS · Linux · Windows · Written in Rust
21 built-in tools · 5 AI providers · 100% written in Rust · <2ms agent overhead

Multi-model

Works with every major AI provider

Switch providers and models with a single flag. No reconfiguration, no friction.

Anthropic: Claude Opus 4.6 · Claude Sonnet 4.5 · Claude Haiku 4.5
OpenAI: GPT-4o · GPT-4o mini · o1 · o3-mini
DeepSeek: deepseek-chat · deepseek-coder · deepseek-reasoner
Google Gemini: Gemini 2.0 Flash · Gemini 2.5 Pro
Ollama (local · free): Llama 3 · Mistral · DeepSeek Coder V2 · + any model

Switch with one flag

$ halcon -p openai chat
$ halcon -p ollama -m llama3 chat

Capabilities

Everything you need to automate your workflow

Production-grade, backed by 2,200+ tests. Multi-model. Zero native dependencies.

21 built-in tools

File read/write/edit, bash, git, web search, regex grep, code symbols, HTTP requests, and background jobs — all wired into the agent loop.

╌╌ R2
Read project structure
directory_tree cuervo-cli/
14 crates · 2.3s
file_read src/main.rs
grep "fn main" · 0.4s
Found entry point at line 42…

Permission-first

Every tool is classified. ReadOnly tools run silently. ReadWrite and Destructive tools require explicit confirmation.

ReadOnly — silent
ReadWrite — confirm
Destructive — [y/N]
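As an illustration, a Destructive tool call pauses for confirmation before anything runs. The transcript below is a sketch: the prompt wording and command are assumptions, not verbatim output.

```
$ halcon chat "Clean out the build artifacts"
⟳ bash rm -rf target/          Destructive
  Run this command? [y/N] y
✓ bash
```

ReadOnly tools in the same session would stream past without any prompt.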

TUI Cockpit

3-zone terminal UI: live token counters, agent FSM state, plan progress, and real-time activity stream. Pause, step, or cancel mid-execution.

halcon chat --tui

Episodic memory

Remembers decisions, file paths, and learnings across sessions using BM25 relevance ranking with temporal decay scoring.

MCP Server built in

Run halcon mcp-server to expose all 21 tools via JSON-RPC over stdio. Wire into VS Code, Cursor, or any MCP-compatible IDE.
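An MCP client drives the server with JSON-RPC 2.0 messages over stdin/stdout. The sketch below only builds and validates such a request; the final pipe into halcon mcp-server is illustrative, and the tools/list method name follows MCP convention rather than documented behavior of this server.

```shell
# A JSON-RPC 2.0 request an MCP client would write to the server's stdin.
request='{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# Sanity-check that the request is well-formed JSON before sending it.
echo "$request" | python3 -c 'import sys, json; json.load(sys.stdin); print("ok")'

# Illustrative: pipe it into the server and read the tool list from stdout.
# echo "$request" | halcon mcp-server
```

An IDE integration does the same thing for you: it spawns the process and speaks JSON-RPC over the pipe.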

Multi-provider routing

Automatic fallback between providers, latency-aware model selection, and cost tracking per session. Set balanced, fast, or cheap strategies.

deepseek-chat 420ms · claude-sonnet 680ms · gpt-4o-mini 310ms · ollama/llama3 95ms

p95 latency · real session data · automatic routing
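The routing strategies could plausibly be selected per invocation; the --strategy flag spelling below is an assumption (the session headers elsewhere on this page show strategy: balanced, but no flag is documented here).

```
$ halcon --strategy cheap chat "Summarize the diff"
```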

TUI Cockpit

Full situational awareness.

The 3-zone TUI gives you live token counters, real-time agent FSM state, execution plan progress, side panel metrics, and a scrollable activity stream — all updating as the agent works.

  • Pause, step, or cancel the agent mid-execution
  • Live token cost tracked per round
  • Tool execution timeline with durations
  • Provider fallback indicators
  • Context compression events in real time
halcon chat --tui
halcon · TUI Cockpit
● RUNNING
SESSION a3f8c1 anthropic / claude-sonnet-4-5 R3   ↑1.2k ↓4.8k $0.0024
╌╌ R1
Analyze project structure
⟳ directory_tree 240ms
✓ directory_tree
╌╌ R2
Read key source files
⟳ file_read src/main.rs
✓ file_read
╌╌ R3
◆ Halcón The project has 14 crates. The main entry point...

Plan

Analyze structure
Read sources
Generate report

Tokens

In 1.2k
Out 4.8k
Cost $0.002
Ask anything or give a task… ⏸ Space

In action

One agent, every workflow

From quick questions to complex multi-step engineering tasks.

~ $ halcon chat "Analyze this repo and find any N+1 query issues"
◈ Planning...
▸ Scan directory structure
▸ Read database layer files
▸ Analyze query patterns
⟳ directory_tree src/ 170ms
✓ directory_tree 23 files
⟳ file_read src/db/users.rs 38ms
⟳ file_read src/db/posts.rs 31ms
✓ file_read 2 files
◆ halcon Found 2 N+1 issues: get_posts_by_user() loads all posts then fetches author per row. list_comments() makes one query per post. Use JOIN or batch WHERE id IN (?) instead.
~ $ halcon -p deepseek chat "Write a Rust async web scraper"
provider: deepseek · model: deepseek-chat · strategy: balanced
◈ Planning...
▸ Design scraper architecture
▸ Write reqwest + tokio implementation
▸ Add error handling and rate limiting
⟳ file_write src/scraper.rs 2.1s
✓ file_write 148 lines written
◆ halcon Created src/scraper.rs — async scraper using reqwest, tokio, concurrent requests via JoinSet, exponential backoff, and configurable rate limiting.
↑ 842 tokens · ↓ 2.1k tokens · $0.0003 · 4.8s
~ $ halcon chat --tui
Launching TUI Cockpit...
SESSION f2a1b9 anthropic / claude-sonnet-4-5 R2 ↑1.8k ↓5.2k $0.0031
╌╌ R1
▸ Understand the codebase
✓ directory_tree · file_read
╌╌ R2
⟳ grep "TODO" **/*.rs
▸ [Space] pause [N] step [Esc] cancel

Get started

Up and running in 60 seconds

1

Install Halcón CLI

Installs to ~/.local/bin/halcon. SHA-256 verified.

$ curl -sSfL cuervo.cloud/install.sh | sh
2

Configure your API key

Stores your key securely in the OS keychain.

$ halcon auth login anthropic
3

Start coding with AI

Ask a question, give a task, or launch the full TUI cockpit with --tui.

$ halcon chat "Refactor my login function to use async/await"

Supports Anthropic, OpenAI, DeepSeek, Gemini, and local Ollama.

Free forever — no account required

Start shipping faster today.

One command install. No configuration. Every major AI provider. macOS · Linux · Windows.

$ curl -sSfL cuervo.cloud/install.sh | sh
✓ SHA-256 verified · ✓ No sudo needed · ✓ Installs to ~/.local/bin · ✓ 100% open audit trail