Skip to content

CLI Configuration Reference

Configuration guide for Hyper-Extract CLI, supporting OpenAI, Alibaba Cloud Bailian, and Local vLLM deployments.


Quick Start

Choose your deployment method and follow the steps.

he config init -p openai -k YOUR_OPENAI_API_KEY

This creates ~/.he/config.toml with OpenAI presets: - LLM: gpt-4o-mini - Embedder: text-embedding-3-small

he config init -p bailian -k YOUR_BAILIAN_API_KEY

This creates ~/.he/config.toml with Bailian presets: - LLM: qwen3.6-plus - Embedder: text-embedding-v4

Start the LLM and Embedding services first, then configure separately:

# LLM service
he config llm -p vllm \
  -u http://localhost:8000/v1 \
  -k dummy \
  -m Qwen/Qwen3.5-9B

# Embedding service
he config embedder -p vllm \
  -u http://localhost:8001/v1 \
  -k dummy \
  -m BAAI/bge-m3

Verify Configuration

he config show
┌─────────────────────────────────────────────────────────────┐
│              Hyper-Extract Configuration                    │
├──────────┬──────────┬─────────────────────┬────────────┬────┤
│ Service  │ Provider │ Model               │ API Key    │ ...│
├──────────┼──────────┼─────────────────────┼────────────┼────┤
│ LLM      │ bailian  │ qwen3.6-plus        │ sk-xxxx... │ ...│
│ Embedder │ bailian  │ text-embedding-v4   │ sk-xxxx... │ ...│
└──────────┴──────────┴─────────────────────┴────────────┴────┘
┌──────────────────────────────────────────────────────────────────┐
│                Hyper-Extract Configuration                       │
├──────────┬──────────┬─────────────────────┬──────────┬───────────┤
│ Service  │ Provider │ Model               │ API Key  │ Base URL  │
├──────────┼──────────┼─────────────────────┼──────────┼───────────┤
│ LLM      │ vllm     │ Qwen/Qwen3.5-9B     │ dummy... │ localhost…│
│ Embedder │ vllm     │ BAAI/bge-m3         │ dummy... │ localhost…│
└──────────┴──────────┴─────────────────────┴──────────┴───────────┘

CLI Command Reference

Command Overview

Command Common Flags Description
he config init -p, --provider — Provider preset
-k, --api-key — API key
-u, --base-url — Custom API endpoint
Initialize configuration (first time)
he config llm -p, --provider — Provider
-m, --model — Model name
-k, --api-key — API key
-u, --base-url — Custom API endpoint
--show — Show current config
--unset — Clear configuration
Configure LLM
he config embedder -p, --provider — Provider
-m, --model — Model name
-k, --api-key — API key
-u, --base-url — Custom API endpoint
--show — Show current config
--unset — Clear configuration
Configure embedder
he config show View full configuration

Common Usage Examples

Interactive initialization (recommended for first-time users):

he config init
# Follow prompts to select provider, enter model name and API key

View LLM configuration:

he config llm --show

Clear configuration (reset to defaults):

he config llm --unset
he config embedder --unset

Per-Platform Configuration Examples

# One-click setup
he config init -p openai -k sk-your-key

# Or switch model
he config llm -p openai -k sk-your-key -m gpt-4o
# One-click setup (default qwen3.6-plus + text-embedding-v4)
he config init -p bailian -k sk-your-key

# Or switch model
he config llm -p bailian -k sk-your-key -m qwen-plus
# Start services first, then configure separately
he config llm -p vllm \
  -u http://localhost:8000/v1 \
  -k dummy \
  -m Qwen/Qwen3.5-9B

he config embedder -p vllm \
  -u http://localhost:8001/v1 \
  -k dummy \
  -m BAAI/bge-m3

LLM and Embedder can use different providers:

# LLM via Bailian, Embedding via local vLLM
he config llm -p bailian -k sk-your-key
he config embedder -p vllm \
  -u http://localhost:8001/v1 \
  -k dummy \
  -m BAAI/bge-m3

# Or LLM via local, Embedding via Bailian
he config llm -p vllm \
  -u http://localhost:8000/v1 \
  -k dummy \
  -m Qwen/Qwen3.5-9B
he config embedder -p bailian -k sk-your-key

Configuration File Details

Running he config init automatically creates ~/.he/config.toml.

File Location

  • Linux/macOS: ~/.he/config.toml
  • Windows: %USERPROFILE%\.he\config.toml

Configuration Format

[llm]
provider = "bailian"
model = "qwen3.6-plus"
api_key = "sk-your-api-key"
base_url = "https://dashscope.aliyuncs.com/compatible-mode/v1"

[embedder]
provider = "bailian"
model = "text-embedding-v4"
api_key = ""
base_url = ""
[llm]
provider = "vllm"
model = "Qwen/Qwen3.5-9B"
api_key = "dummy"
base_url = "http://localhost:8000/v1"

[embedder]
provider = "vllm"
model = "BAAI/bge-m3"
api_key = "dummy"
base_url = "http://localhost:8001/v1"
[llm]
provider = "bailian"
model = "qwen3.6-plus"
api_key = "sk-your-api-key"
base_url = ""

[embedder]
provider = "vllm"
model = "BAAI/bge-m3"
api_key = "dummy"
base_url = "http://localhost:8001/v1"

Configuration Options

Section Key Description Default
[llm] provider Provider preset
[llm] model LLM model name Depends on provider preset
[llm] api_key API key Required
[llm] base_url API base URL Depends on provider preset
[embedder] provider Provider preset
[embedder] model Embedding model name Depends on provider preset
[embedder] api_key API key (empty inherits from llm)
[embedder] base_url API base URL (empty inherits from llm)

See the Provider System compatibility table for default models per provider.


Environment Variables

The following environment variables can be used as configuration fallback:

Variable Maps To Example
OPENAI_API_KEY llm.api_key, embedder.api_key sk-...
OPENAI_BASE_URL llm.base_url https://api.openai.com/v1

Priority Note: Environment variables take precedence over config file, but are overridden by command-line flags.

Use Cases: - Temporarily switch API keys - Inject secrets in CI/CD environments - Avoid hardcoding keys in config files


Troubleshooting

"API key not found"

he config init -p bailian -k YOUR_API_KEY

"The model does not exist" / 404

Check that the configured model name is available on the platform. See available models in Provider System.

"Failed to connect to API"

# Check configuration
he config llm --show

# Reset base_url
he config llm --base-url https://dashscope.aliyuncs.com/compatible-mode/v1

"Connection refused"

vLLM service is not running or has stopped:

# Check if services are running
curl http://localhost:8000/v1/models
curl http://localhost:8001/v1/models

# Restart services

"CUDA out of memory"

Lower --gpu-memory-utilization or use a smaller model / quantized version.


See Also