v0.2.0 stable

The terminal AI agent

Halcón connects your terminal to Claude, GPT-4o, DeepSeek, Gemini, or any local Ollama model — with 21 built-in tools, layered permissions, and a native MCP server for your IDE.

$ curl -sSfL cuervo.cloud/install.sh | sh

SHA-256 verified · no sudo · installs to ~/.local/bin

v0.2.0 · macOS · Linux · Windows · Written in Rust
21 built-in tools · 5 AI providers · 100% written in Rust · <2ms agent overhead

Multi-model

Works with every major AI provider

Switch providers and models with a single flag. No reconfiguration, no friction.

Anthropic: Claude Opus 4.6 · Claude Sonnet 4.5 · Claude Haiku 4.5
OpenAI: GPT-4o · GPT-4o mini · o1 · o3-mini
DeepSeek: deepseek-chat · deepseek-coder · deepseek-reasoner
Google Gemini: Gemini 2.0 Flash · Gemini 2.5 Pro
Ollama (local · free): Llama 3 · Mistral · DeepSeek Coder V2 · + any model

Switch with one flag

$ halcon -p openai chat
$ halcon -p ollama -m llama3 chat

Capabilities

Everything you need to automate your workflow

Production-grade, backed by 2,200+ tests. Multi-model. Zero native dependencies.

21 built-in tools

File read/write/edit, bash, git, web search, regex grep, code symbols, HTTP requests, and background jobs — all wired into the agent loop.

╌╌ R2
Read project structure
directory_tree cuervo-cli/
14 crates · 2.3s
file_read src/main.rs
grep "fn main" · 0.4s
Found entry point at line 42…

Permission-first

Every tool is classified. ReadOnly tools run silently. ReadWrite and Destructive tools require explicit confirmation.

ReadOnly — silent
ReadWrite — confirm
Destructive — [y/N]
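As an illustration, a Destructive tool call pauses for confirmation before anything runs. The transcript below is a sketch: the prompt wording and command are assumptions, not verbatim output.

```
$ halcon chat "Clean out the build artifacts"
⟳ bash rm -rf target/          Destructive
  Run this command? [y/N] y
✓ bash
```

ReadOnly tools in the same session would stream past without any prompt.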

TUI Cockpit

3-zone terminal UI: live token counters, agent FSM state, plan progress, and real-time activity stream. Pause, step, or cancel mid-execution.

halcon chat --tui

Episodic memory

Remembers decisions, file paths, and learnings across sessions using BM25 relevance ranking with temporal decay scoring.

MCP Server built in

Run halcon mcp-server to expose all 21 tools via JSON-RPC over stdio. Wire into VS Code, Cursor, or any MCP-compatible IDE.
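An MCP client drives the server with JSON-RPC 2.0 messages over stdin/stdout. The sketch below only builds and validates such a request; the final pipe into halcon mcp-server is illustrative, and the tools/list method name follows MCP convention rather than documented behavior of this server.

```shell
# A JSON-RPC 2.0 request an MCP client would write to the server's stdin.
request='{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# Sanity-check that the request is well-formed JSON before sending it.
echo "$request" | python3 -c 'import sys, json; json.load(sys.stdin); print("ok")'

# Illustrative: pipe it into the server and read the tool list from stdout.
# echo "$request" | halcon mcp-server
```

An IDE integration does the same thing for you: it spawns the process and speaks JSON-RPC over the pipe.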

Multi-provider routing

Automatic fallback between providers, latency-aware model selection, and cost tracking per session. Set balanced, fast, or cheap strategies.

deepseek-chat 420ms · claude-sonnet 680ms · gpt-4o-mini 310ms · ollama/llama3 95ms

p95 latency · real session data · automatic routing
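The routing strategies could plausibly be selected per invocation; the --strategy flag spelling below is an assumption (the session headers elsewhere on this page show strategy: balanced, but no flag is documented here).

```
$ halcon --strategy cheap chat "Summarize the diff"
```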

TUI Cockpit

Full situational awareness.

The 3-zone TUI gives you live token counters, real-time agent FSM state, execution plan progress, side panel metrics, and a scrollable activity stream — all updating as the agent works.

  • Pause, step, or cancel the agent mid-execution
  • Live token cost tracked per round
  • Tool execution timeline with durations
  • Provider fallback indicators
  • Context compression events in real time
halcon chat --tui
halcon · TUI Cockpit
● RUNNING
SESSION a3f8c1 anthropic / claude-sonnet-4-5 R3   ↑1.2k ↓4.8k $0.0024
╌╌ R1
Analyze project structure
⟳ directory_tree 240ms
✓ directory_tree
╌╌ R2
Read key source files
⟳ file_read src/main.rs
✓ file_read
╌╌ R3
◆ Halcón The project has 14 crates. The main entry point...

Plan

Analyze structure
Read sources
Generate report

Tokens

In 1.2k
Out 4.8k
Cost $0.002
Ask anything or give a task… ⏸ Space

In action

One agent, every workflow

From quick questions to complex multi-step engineering tasks.

~ $ halcon chat "Analyze this repo and find any N+1 query issues"
◈ Planning...
▸ Scan directory structure
▸ Read database layer files
▸ Analyze query patterns
⟳ directory_tree src/ 170ms
✓ directory_tree 23 files
⟳ file_read src/db/users.rs 38ms
⟳ file_read src/db/posts.rs 31ms
✓ file_read 2 files
◆ halcon Found 2 N+1 issues: get_posts_by_user() loads all posts then fetches author per row. list_comments() makes one query per post. Use JOIN or batch WHERE id IN (?) instead.
~ $ halcon -p deepseek chat "Write a Rust async web scraper"
provider: deepseek · model: deepseek-chat · strategy: balanced
◈ Planning...
▸ Design scraper architecture
▸ Write reqwest + tokio implementation
▸ Add error handling and rate limiting
⟳ file_write src/scraper.rs 2.1s
✓ file_write 148 lines written
◆ halcon Created src/scraper.rs — async scraper using reqwest, tokio, concurrent requests via JoinSet, exponential backoff, and configurable rate limiting.
↑ 842 tokens · ↓ 2.1k tokens · $0.0003 · 4.8s
~ $ halcon chat --tui
Launching TUI Cockpit...
SESSION f2a1b9 anthropic / claude-sonnet-4-5 R2 ↑1.8k ↓5.2k $0.0031
╌╌ R1
▸ Understand the codebase
✓ directory_tree · file_read
╌╌ R2
⟳ grep "TODO" **/*.rs
▸ [Space] pause [N] step [Esc] cancel

Get started

Up and running in 60 seconds

1

Install Halcón CLI

Installs to ~/.local/bin/halcon. SHA-256 verified.

$ curl -sSfL cuervo.cloud/install.sh | sh
2

Configure your API key

Stores your key securely in the OS keychain.

$ halcon auth login anthropic
3

Start coding with AI

Ask a question, give a task, or launch the full TUI cockpit with --tui.

$ halcon chat "Refactor my login function to use async/await"

Supports Anthropic, OpenAI, DeepSeek, Gemini, and local Ollama.

Free forever — no account required

Start shipping faster today.

One command install. No configuration. Every major AI provider. macOS · Linux · Windows.

$ curl -sSfL cuervo.cloud/install.sh | sh
✓ SHA-256 verified · ✓ No sudo needed · ✓ Installs to ~/.local/bin · ✓ 100% open audit trail