v0.2.0 · macOS · Linux · Windows

The terminal AI agent that runs on any model.

Halcón connects your terminal to Claude, GPT-4o, DeepSeek, Gemini, or any local Ollama model — with 21 built-in tools, layered permissions, and a native MCP server for your IDE.

One-line install

$ curl -sSfL cuervo.cloud/install.sh | sh

SHA-256 verified · no sudo required · installs to ~/.local/bin
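
If ~/.local/bin is not already on your PATH, add it once; this is standard shell setup, not Halcón-specific:

$ export PATH="$HOME/.local/bin:$PATH"   # persist it in ~/.bashrc or ~/.zshrc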

Works with every major AI provider

Switch providers and models with a single flag. No reconfiguration, no friction.

• Anthropic · 3 models: Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5
• OpenAI · 4 models: GPT-4o, GPT-4o mini, o1, o3-mini
• DeepSeek · 3 models: deepseek-chat, deepseek-coder, deepseek-reasoner
• Google Gemini · 2 models: Gemini 2.0 Flash, Gemini 2.5 Pro
• Ollama · local · free: Llama 3, Mistral, DeepSeek Coder V2, + any model

Switch with one flag

$ halcon -p openai chat
$ halcon -p ollama -m llama3 chat

Everything you need to automate your workflow

Production-grade, backed by 2,200+ tests. Multi-model. Zero native dependencies.

21 built-in tools

File read/write/edit, bash execution, git operations, web search, regex grep, code symbol extraction, HTTP requests, and background jobs — all wired into the agent loop.
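
A single prompt can chain several of these tools in one agent loop; the prompt below is illustrative:

$ halcon chat "Grep the codebase for TODO comments, then write a summary to TODO.md"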

Permission-first by design

Every tool is classified: ReadOnly tools execute silently, ReadWrite and Destructive tools require your explicit confirmation. The agent never runs `rm` or `git commit` without you.
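
In practice, a read-only task runs straight through while a destructive one stops for confirmation first; the prompts below are illustrative:

$ halcon chat "List the largest files in this repo"   # ReadOnly: executes silently
$ halcon chat "Remove all build artifacts"            # Destructive: confirms before any rm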

TUI Cockpit

A 3-zone terminal UI with live token counters, agent FSM state, plan progress, side panel metrics, and a real-time activity stream. Pause, step, or cancel the agent mid-execution.

Episodic memory

The agent remembers decisions, file paths, and learnings across sessions using BM25 relevance search, temporal decay scoring, and automatic consolidation. No setup required.
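
A common way to combine these two signals, offered here as an assumption rather than Halcón's documented formula, is score = bm25(query, memory) × exp(−λ · age): older memories fade unless they keep matching strongly.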

MCP Server built in

Run `halcon mcp-server` to expose all 21 tools via JSON-RPC over stdio. Wire it into VS Code, Cursor, or any IDE that supports the Model Context Protocol.
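
Because the transport is plain JSON-RPC over stdio, you can smoke-test the server from the shell. The request below assumes the standard MCP initialize handshake; the client fields are illustrative:

$ echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"0.0.0"}}}' | halcon mcp-server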

Multi-provider routing

Automatic fallback between providers, latency-aware model selection, and cost tracking per session. Set `balanced`, `fast`, or `cheap` routing strategies in config.
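
As a sketch only, assuming a TOML config file (the path and key names here are illustrative, not the documented schema):

$ cat >> ~/.config/halcon/config.toml <<'EOF'
# hypothetical path and keys, for illustration only
routing = "balanced"   # or "fast" / "cheap"
EOF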

TUI Cockpit

Full situational awareness.

The 3-zone TUI gives you live token counters, real-time agent FSM state, execution plan progress, side panel metrics, and a scrollable activity stream — all updating as the agent works.

  • Pause, step, or cancel the agent mid-execution
  • Live token cost tracked per round
  • Tool execution timeline with durations
  • Provider fallback indicators
  • Context compression events in real time

$ halcon chat --tui

halcon · TUI Cockpit                                              ● RUNNING
SESSION a3f8c1 · anthropic / claude-sonnet-4-5 · R3 · ↑1.2k ↓4.8k · $0.0024

╌╌ R1  Analyze project structure
   ⟳ directory_tree  240ms
   ✓ directory_tree
╌╌ R2  Read key source files
   ⟳ file_read  src/main.rs
   ✓ file_read
╌╌ R3
   ◆ Halcón  The project has 14 crates. The main entry point...

Plan    Analyze structure · Read sources · Generate report
Tokens  In 1.2k · Out 4.8k · Cost $0.002

Ask anything or give a task…                                      ⏸ Space

Up and running in 60 seconds

1. Install Halcón CLI

   Installs to ~/.local/bin/halcon. SHA-256 verified.

   $ curl -sSfL cuervo.cloud/install.sh | sh

2. Configure your API key

   Stores your key securely in the OS keychain.

   $ halcon auth login anthropic

3. Start coding with AI

   $ halcon chat "Refactor my login function to use async/await"

   Or launch the full TUI cockpit:
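   $ halcon chat --tui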

Supports Anthropic, OpenAI, DeepSeek, Gemini, and local Ollama.