he parse¶
Extract knowledge from documents and save to a knowledge abstract.
Synopsis¶
Arguments¶
| Argument | Description |
|---|---|
| INPUT | Input file path, directory, or - for stdin |
Options¶
| Option | Short | Description |
|---|---|---|
| --output | -o | Output directory (required) |
| --template | -t | Template to use (omit for interactive selection) |
| --method | -m | Method template (e.g., light_rag, graph_rag) |
| --lang | -l | Language: zh or en (required for knowledge templates) |
| --force | -f | Force overwrite existing output |
| --no-index | (none) | Skip building search index |
Examples¶
Basic Usage¶
Pass a file path as INPUT to extract from a single file.
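A minimal sketch of a single-file run, assuming the he CLI is installed and on PATH. The wrapper name parse_file is hypothetical, and general/graph is used only because it appears in the interactive template list:

```shell
# Hypothetical wrapper around a single-file run; requires the he CLI on PATH.
# $1 = input file, $2 = output directory.
parse_file() {
    he parse "$1" -t general/graph -l en -o "$2"
}

# Usage: parse_file document.md ./output/
```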
Interactive Template Selection¶
Omit -t to select from available templates:
he parse document.md -o ./output/ -l en
# You'll see:
# Select a template:
# [1] general/biography_graph
# [2] general/graph
# [3] finance/earnings_summary
# ...
# Enter number or search keyword:
Process a Directory¶
Pass a directory as INPUT to extract from all .md and .txt files in it.
Files are combined in alphabetical order before extraction.
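To see which files a directory run would pick up, and in what order, a small preview helper can be useful. This is a sketch under an assumption: it sorts file names byte-wise (LC_ALL=C), while the collation he actually uses is not stated here. The name list_inputs is hypothetical.

```shell
# Preview the .md and .txt files in a directory, sorted by name.
# Assumption: byte-wise (LC_ALL=C) ordering; the tool's own collation may differ.
# $1 = directory to inspect.
list_inputs() {
    for f in "$1"/*.md "$1"/*.txt; do
        # Skip unmatched glob patterns and anything that is not a regular file.
        if [ -f "$f" ]; then
            basename "$f"
        fi
    done | LC_ALL=C sort
}

# Usage: list_inputs ./docs/
```

If the order matters for your documents, prefixing file names with numbers (01-intro.md, 02-body.md) keeps it predictable.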
Using Methods Instead of Templates¶
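Pass -m with a method name, such as light_rag or graph_rag, to run an underlying extraction method instead of a template.
Methods always use English prompts, so -l is not needed.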
Force Overwrite¶
Pass -f (--force) to overwrite an existing output directory.
Skip Index Building¶
Pass --no-index to speed up extraction when you don't need search or chat.
Build the index later with he build-index.
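Skipping the index pays off most in batch runs. The sketch below parses every .md file in a directory into its own output directory with --no-index. It is a dry run by default and only prints the commands; set RUN=1 to execute them. The function name, the RUN variable, and the one-directory-per-file layout are assumptions for illustration, not part of he.

```shell
# Batch sketch: one output directory per input file, index building skipped.
# Dry run by default (prints each command); set RUN=1 to execute with he on PATH.
# $1 = source directory, $2 = root directory for outputs.
batch_parse() {
    src_dir=$1
    out_root=$2
    for f in "$src_dir"/*.md; do
        [ -f "$f" ] || continue
        name=$(basename "$f" .md)
        # Reuse the positional parameters as the command line to print or run.
        set -- he parse "$f" -t general/graph -l en -o "$out_root/$name/" --no-index
        if [ "${RUN:-0}" = 1 ]; then
            "$@"
        else
            echo "$*"
        fi
    done
}

# Usage: batch_parse ./docs ./outputs
```

Build the indexes afterwards with he build-index; its arguments are not covered on this page.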
Read from Stdin¶
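Passing - as INPUT reads the document from stdin (see Arguments). A minimal sketch, assuming he is on PATH; parse_stdin is a hypothetical helper, and the template is named explicitly on the assumption that the interactive prompt is not usable while stdin carries the document:

```shell
# Hypothetical helper: feed text to he on stdin by passing "-" as INPUT.
# $1 = output directory.
parse_stdin() {
    he parse - -t general/graph -l en -o "$1"
}

# Usage: cat notes.md | parse_stdin ./output/
```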
Output Structure¶
./output/
├── data.json          # Extracted knowledge (entities, relations, etc.)
├── metadata.json      # Extraction metadata
│   ├── template       # Template used
│   ├── lang           # Language
│   ├── created_at     # Creation timestamp
│   └── updated_at     # Last update timestamp
└── index/             # Vector search index (if built)
    ├── index.faiss
    └── docstore.json
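A quick way to confirm that a run produced the layout above is to check for each documented file. The helper below is a sketch; check_output is a hypothetical name, and the file list is taken directly from the tree.

```shell
# Report which of the documented files are present in an output directory.
# $1 = output directory.
check_output() {
    for rel in data.json metadata.json index/index.faiss index/docstore.json; do
        if [ -e "$1/$rel" ]; then
            echo "$rel: present"
        else
            echo "$rel: missing"
        fi
    done
}

# Usage: check_output ./output/
```

After a --no-index run, the two index/ entries are expected to be reported as missing.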
Language Support¶
Templates support two languages, English (en) and Chinese (zh):
# English
he parse doc.md -t general/biography_graph -l en -o ./output/
# Chinese
he parse doc.md -t general/biography_graph -l zh -o ./output/
Choose the language that matches your document for best results.
Common Use Cases¶
Research Paper¶
Biography¶
Legal Contract¶
Financial Report¶
Error Handling¶
"Output directory already exists"¶
The output directory exists and is not empty. Solutions:
- Use -f to force overwrite
- Choose a different output path
- Remove the existing directory first
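In scripts, the condition behind this error can be checked before calling he. A minimal sketch, with output_in_use as a hypothetical helper name; it only mirrors the "exists and is not empty" rule described above and does not call he itself:

```shell
# Succeeds (exit status 0) when the path is a directory with at least one entry.
# $1 = intended output directory.
output_in_use() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Usage: output_in_use ./output/ && echo "pick another path or pass -f"
```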
"Template not found"¶
The specified template doesn't exist. Solutions:
- List available templates: he list template
- Use interactive selection (omit -t)
- Check template path spelling
"Language is required"¶
Knowledge templates require a language flag. Methods don't:
# Template - requires -l
he parse doc.md -t general/biography_graph -o ./out/ -l en
# Method - no -l needed
he parse doc.md -m light_rag -o ./out/
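A wrapper can catch a missing or unsupported language before he is invoked. This is a sketch only: valid_lang and parse_with_template are hypothetical names, and the accepted values are the two codes listed under Options.

```shell
# Hypothetical pre-check for template runs: accept only the documented codes.
valid_lang() {
    case "$1" in
        zh|en) return 0 ;;
        *)     return 1 ;;
    esac
}

# $1 = input, $2 = template, $3 = language, $4 = output directory.
parse_with_template() {
    if ! valid_lang "$3"; then
        echo "language must be zh or en, got: '$3'" >&2
        return 2
    fi
    he parse "$1" -t "$2" -l "$3" -o "$4"
}

# Usage: parse_with_template doc.md general/biography_graph zh ./out/
```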
Best Practices¶
- Choose the right template: match your document type
- Use the correct language: improves extraction quality
- Organize outputs: use descriptive directory names
- Skip index during batch: use --no-index, build once at the end
See Also¶
- he feed: Add documents incrementally
- he build-index: Build search index
- he list: List available templates
- Template Library