Home
Transform documents into structured knowledge with one command.
"Say goodbye to document anxiety and see information at a glance."
Hyper-Extract is an intelligent, LLM-powered knowledge extraction framework. It transforms unstructured text into persistent, predictable, and strongly-typed knowledge structures, from simple lists to complex knowledge graphs, hypergraphs, and spatio-temporal graphs.
5-Minute Quick Start
→ Ready to dive deeper? Check out the Getting Started Guide or jump to CLI / Python SDK documentation.
What Makes Hyper-Extract Different?
- 8 Auto-Types: From simple AutoList/AutoModel to advanced AutoGraph, AutoHypergraph, and AutoSpatioTemporalGraph. Pick the right structure for your data.
- 10+ Extraction Engines: Built-in support for GraphRAG, LightRAG, Hyper-RAG, KG-Gen, iText2KG, and more. Choose the best method for your use case.
- 80+ Domain Templates: Ready-to-use templates for Finance, Legal, Medical, TCM, and Industry. Zero configuration needed.
- Incremental Evolution: Feed new documents to continuously expand your knowledge abstract. No need to reprocess everything.
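The idea behind incremental evolution can be sketched in a few lines of plain Python. This is a hypothetical illustration, not Hyper-Extract's implementation or API: it assumes knowledge is stored as a set of (subject, relation, object) triples, and the `merge` helper is invented for this example.

```python
# Hypothetical sketch: knowledge held as a set of
# (subject, relation, object) triples. Not Hyper-Extract's API.

def merge(existing, new_triples):
    """Return the expanded knowledge plus the triples that were actually new."""
    added = set(new_triples) - existing
    return existing | added, added


# Knowledge extracted from a first document.
knowledge = {("Acme", "founded_in", "2019")}

# A second document repeats one fact and contributes one new fact.
from_doc_2 = [
    ("Acme", "founded_in", "2019"),
    ("Acme", "headquartered_in", "Berlin"),
]

knowledge, added = merge(knowledge, from_doc_2)
print(len(knowledge), sorted(added))
```

Only the second document is processed in the second step; the facts from the first document are kept as they are and the repeated fact is not duplicated.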
Choose Your Path
- CLI User
  Process documents directly from your terminal. Perfect for:
  - Quick knowledge extraction
  - Batch document processing
  - Building knowledge abstracts without coding
- Python Developer
  Integrate into your Python applications. Perfect for:
  - Custom extraction pipelines
  - Integration with existing workflows
  - Building AI-powered applications
- Want to Learn More?
  Understand the concepts and architecture:
  - How Auto-Types work
  - Choosing extraction methods
  - Creating custom templates
The 8 Auto-Types at a Glance
| Type | Use Case | Example Output |
|---|---|---|
| AutoModel | Structured summaries | A pydantic model with specific fields |
| AutoList | Collections of items | A list of entities or facts |
| AutoSet | Deduplicated collections | A set of unique items |
| AutoGraph | Entity-relationship networks | Knowledge graph with nodes and edges |
| AutoHypergraph | Multi-entity relationships | Hyperedges connecting multiple nodes |
| AutoTemporalGraph | Time-based relationships | Graph with temporal information |
| AutoSpatialGraph | Location-based relationships | Graph with geographic data |
| AutoSpatioTemporalGraph | Time + Space combined | Full context with when and where |
→ Learn which Auto-Type to choose
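To make the difference between the graph-shaped types concrete, here is a minimal sketch using plain Python dataclasses. The classes `Edge`, `Hyperedge`, and `SpatioTemporalEdge` are hypothetical illustrations, not Hyper-Extract's actual classes; they only show how the shape of the extracted data differs.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical illustration only: these are NOT Hyper-Extract classes.

@dataclass(frozen=True)
class Edge:
    """Binary relation, as in an ordinary knowledge graph."""
    source: str
    relation: str
    target: str


@dataclass(frozen=True)
class Hyperedge:
    """One relation that connects any number of nodes at once."""
    relation: str
    nodes: frozenset


@dataclass(frozen=True)
class SpatioTemporalEdge(Edge):
    """Binary relation annotated with when and where it holds."""
    time: Optional[str] = None
    place: Optional[str] = None


# Example sentence: "Alice, Bob and Carol co-founded Acme in Berlin in 2019."
pairwise = [
    Edge("Alice", "co_founded", "Acme"),
    Edge("Bob", "co_founded", "Acme"),
    Edge("Carol", "co_founded", "Acme"),
]
joint = Hyperedge("co_founded", frozenset({"Alice", "Bob", "Carol", "Acme"}))
situated = SpatioTemporalEdge(
    "Alice", "co_founded", "Acme", time="2019", place="Berlin"
)

print(len(pairwise), len(joint.nodes), situated.time, situated.place)
```

A plain graph needs three separate edges for the founding event, while a single hyperedge keeps all four participants together. The spatio-temporal variant keeps the binary shape and adds the "when" and "where".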
Architecture Overview
Hyper-Extract follows a three-layer architecture:
graph TD
    A[Your Document] --> B[CLI / Python API]
    B --> C[Template]
    B --> D[Method]
    C --> E[Auto-Type]
    D --> E
    E --> F[Structured Knowledge]
    subgraph "Layer 3: Templates & Methods"
        C
        D
    end
    subgraph "Layer 2: Core Engine"
        E
    end
    subgraph "Layer 1: Output"
        F
    end
- Auto-Types: Define the output data structure (8 types)
- Methods: Provide extraction algorithms (RAG-based and Typical)
- Templates: Offer domain-specific, ready-to-use configurations
You can use Hyper-Extract at any level: pick a template for quick results, choose a method for more control, or work directly with Auto-Types for full customization.
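The layering can be pictured with a small offline sketch. Everything here is hypothetical and invented for illustration (`Template`, `capitalised_words`, `as_set`); it is not Hyper-Extract's API, and the "method" is a trivial stand-in for what would normally be an LLM-backed extraction algorithm.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical sketch of the layering; none of these names come from Hyper-Extract.

def as_set(items):
    """Auto-Type layer: defines the output structure (a deduplicated, sorted list)."""
    return sorted(set(items))


def capitalised_words(text):
    """Method layer: the extraction algorithm.

    A real method would call an LLM; this stand-in picks capitalised
    words so that the sketch runs without any network access.
    """
    return [w.strip(".,") for w in text.split() if w[:1].isupper()]


@dataclass
class Template:
    """Template layer: bundles a method and an output type into a ready-made config."""
    name: str
    method: Callable[[str], list]
    auto_type: Callable[[list], list]

    def run(self, text):
        # Extract with the method, then shape the result with the output type.
        return self.auto_type(self.method(text))


people_and_orgs = Template("people_and_orgs", capitalised_words, as_set)
result = people_and_orgs.run("Alice met Bob. Bob works at Acme.")
print(result)  # ['Acme', 'Alice', 'Bob']
```

Using the template alone corresponds to the quick path; swapping in a different `method` or `auto_type` corresponds to the more customized paths described above.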
Comparison with Other Tools
| Feature | GraphRAG | LightRAG | KG-Gen | Hyper-Extract |
|---|---|---|---|---|
| Knowledge Graph | โ | โ | โ | โ |
| Temporal Graph | โ | โ | โ | โ |
| Spatial Graph | โ | โ | โ | โ |
| Hypergraph | โ | โ | โ | โ |
| Domain Templates | โ | โ | โ | โ |
| CLI Tool | โ | โ | โ | โ |
| Multi-language | โ | โ | โ | โ |
Documentation Structure
- Getting Started: Installation and your first extraction
- CLI Guide: Complete terminal workflow documentation
- Python SDK: API reference and developer guides
- Concepts: Understanding the architecture
- Templates: Domain-specific extraction templates
- Resources: FAQ, troubleshooting, and contributing
Contributing
Contributions are welcome! Whether it's bug reports, feature requests, or documentation improvements, please feel free to submit an issue or pull request.
License
Hyper-Extract is licensed under the Apache-2.0 License.