he parse¶

Extract knowledge from documents and save to a knowledge abstract.

Synopsis¶

he parse INPUT [OPTIONS]

Arguments¶

Argument	Description
`INPUT`	Input file path, directory, or `-` for stdin

Options¶

Option	Short	Description
`--output`	`-o`	Output directory (required)
`--template`	`-t`	Template to use (omit for interactive selection)
`--method`	`-m`	Method template (e.g., `light_rag`, `graph_rag`)
`--lang`	`-l`	Language: `zh` or `en` (required for knowledge templates)
`--force`	`-f`	Force overwrite existing output
`--no-index`	—	Skip building search index

Examples¶

Basic Usage¶

Extract from a single file:

he parse document.md -t general/biography_graph -o ./output/ -l en

Interactive Template Selection¶

Omit -t to select from available templates:

he parse document.md -o ./output/ -l en

# You'll see:
# Select a template:
#   [1] general/biography_graph
#   [2] general/graph
#   [3] finance/earnings_summary
#   ...
# Enter number or search keyword:

Process a Directory¶

Extract from all .md and .txt files in a directory:

he parse ./documents/ -t general/concept_graph -o ./output/ -l en

Files are combined in alphabetical order before extraction.

Using Methods Instead of Templates¶

Use underlying extraction methods:

he parse document.md -m light_rag -o ./output/

Methods always use English prompts.

Force Overwrite¶

Overwrite existing output directory:

he parse document.md -t general/biography_graph -o ./output/ -l en -f

Skip Index Building¶

Speed up extraction if you don't need search/chat:

he parse document.md -t general/biography_graph -o ./output/ -l en --no-index

Build index later with he build-index.

Read from Stdin¶

cat document.md | he parse - -t general/biography_graph -o ./output/ -l en

Output Structure¶

./output/
├── data.json           # Extracted knowledge (entities, relations, etc.)
├── metadata.json       # Extraction metadata
│   ├── template        # Template used
│   ├── lang           # Language
│   ├── created_at     # Creation timestamp
│   └── updated_at     # Last update timestamp
└── index/             # Vector search index (if built)
    ├── index.faiss
    └── docstore.json

Language Support¶

Templates support multiple languages:

# English
he parse doc.md -t general/biography_graph -l en -o ./output/

# Chinese
he parse doc.md -t general/biography_graph -l zh -o ./output/

Choose the language that matches your document for best results.

Common Use Cases¶

Research Paper¶

he parse paper.md -t general/concept_graph -o ./paper_kb/ -l en

Biography¶

he parse biography.md -t general/biography_graph -o ./bio_kb/ -l en

Legal Contract¶

he parse contract.md -t legal/contract_obligation -o ./contract_kb/ -l en

Financial Report¶

he parse earnings.md -t finance/earnings_summary -o ./finance_kb/ -l en

Error Handling¶

"Output directory already exists"¶

The output directory exists and is not empty. Solutions:

Use -f to force overwrite
Choose a different output path
Remove the existing directory first

"Template not found"¶

The specified template doesn't exist. Solutions:

List available templates: he list template
Use interactive selection (omit -t)
Check template path spelling

"Language is required"¶

Knowledge templates require a language flag. Methods don't:

# Template - requires -l
he parse doc.md -t general/biography_graph -o ./out/ -l en

# Method - no -l needed
he parse doc.md -m light_rag -o ./out/

Best Practices¶

Choose the right template — Match your document type
Use correct language — Improves extraction quality
Organize outputs — Use descriptive directory names
Skip index during batch — Use --no-index, build once at the end