Troubleshooting¶

Solutions to common issues.

Installation Issues¶

pip install fails¶

Problem: pip install hyperextract exits with an error

Solutions:

Upgrade pip: pip install --upgrade pip
Confirm you are running Python 3.11 or later: python --version

Install in a virtual environment:

python -m venv venv
source venv/bin/activate
pip install hyperextract
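
The two environment checks above can also be run from Python itself. This is a minimal sketch using only the standard library; the function name `check_environment` is made up for illustration and is not part of hyperextract.

```python
import sys

MIN_VERSION = (3, 11)  # minimum Python version stated in this guide


def check_environment(version_info=sys.version_info,
                      prefix=sys.prefix,
                      base_prefix=sys.base_prefix):
    """Return a list of problems; an empty list means the basics look fine."""
    problems = []
    if tuple(version_info[:2]) < MIN_VERSION:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} is older than 3.11"
        )
    # Inside a venv, sys.prefix points at the environment and therefore
    # differs from sys.base_prefix.
    if prefix == base_prefix:
        problems.append("no virtual environment is active")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```

Run it with the same `python` you use for `pip install`, so the check covers the interpreter that will actually receive the package.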

ImportError: No module named 'hyperextract'¶

Problem: import hyperextract fails even though installation succeeded

Solutions:

Check the Python version: python --version (need 3.11+)
Verify the installation: pip list | grep hyper
Check that the virtual environment you installed into is activated
Reinstall: pip install --force-reinstall hyperextract
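
A common cause of this error is that `pip` installed the package into a different interpreter than the one running your code. The sketch below, which uses only the standard library, reports which interpreter is running and whether it can see the package. The helper name `describe_install` is hypothetical.

```python
import importlib.util
import sys


def describe_install(package="hyperextract"):
    """Report the running interpreter and whether it can locate `package`."""
    spec = importlib.util.find_spec(package)  # None when the package is absent
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_install())
```

If `found` is false, installing with `python -m pip install hyperextract` from that same interpreter keeps `pip` and `python` pointed at the same environment.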

Configuration Issues¶

API Key Not Found¶

Error: No API key configured

Solutions:

CLI Configuration (recommended):
```
he config init -k YOUR_API_KEY
```
Environment Variable:
```
export OPENAI_API_KEY=your-api-key
```
Verify Configuration:
```
he config show
```
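
To confirm that the environment variable is actually visible to Python processes, a small standard-library check can help. This sketch inspects only `OPENAI_API_KEY`; it does not read whatever `he config init` stores, and the function name `api_key_status` is an illustration, not a hyperextract API.

```python
import os


def api_key_status(env=os.environ, name="OPENAI_API_KEY"):
    """Classify the variable as 'missing', 'empty', or 'set'."""
    value = env.get(name)
    if value is None:
        return "missing"
    if not value.strip():
        return "empty"
    return "set"


if __name__ == "__main__":
    # Prints the status only, never the key itself.
    print(f"OPENAI_API_KEY is {api_key_status()}")
```

A result of `missing` in a new terminal usually means the `export` line was run in another shell session and was not added to a shell startup file.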

Invalid API Key¶

Error: Authentication failed

Solutions:

Verify that the key is correct
Check for extra spaces or line breaks around the key
Try regenerating the key in the OpenAI dashboard
Check that the account behind the key has available credits
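
Stray whitespace and pasted quotes are hard to spot by eye. The following sketch flags them without contacting any service. It cannot tell whether a key is valid, and `key_problems` is a hypothetical helper name.

```python
def key_problems(key):
    """Return formatting problems that commonly break a pasted API key."""
    problems = []
    stripped = key.strip()
    if key != stripped:
        problems.append("leading or trailing whitespace")
    if any(ch.isspace() for ch in stripped):
        problems.append("whitespace inside the key")
    if stripped[:1] in {'"', "'"} or stripped[-1:] in {'"', "'"}:
        problems.append("surrounding quotes")
    return problems


if __name__ == "__main__":
    import os

    # Report problems only; the key itself is never printed.
    print(key_problems(os.environ.get("OPENAI_API_KEY", "")) or "no formatting problems")
```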

Runtime Issues¶

Template Not Found¶

Error: Template 'xxx' not found

Solutions:

List available templates:
```
he list template
```

Check the spelling, including any suffix in the template name:

# Correct
he parse doc.md -t general/biography_graph

# Incorrect
he parse doc.md -t general/biography

Use Python to search:

from hyperextract import Template
templates = Template.list(filter_by_query="bio")
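
When a name is almost right, fuzzy matching against the known names can surface the intended template. This is a sketch built on the standard library's `difflib`. The list of names is hard-coded here for illustration; in practice, copy the names printed by `he list template`. `suggest_templates` is a hypothetical helper, not part of hyperextract.

```python
from difflib import get_close_matches


def suggest_templates(name, available, limit=3):
    """Return names from `available` that are textually close to `name`."""
    return get_close_matches(name, available, n=limit, cutoff=0.6)


# Illustrative list only; replace with the output of `he list template`.
available = ["general/biography_graph", "general/graph", "general/model"]

if __name__ == "__main__":
    print(suggest_templates("general/biography", available))
```

The ranking is purely textual, so treat the suggestions as candidates to check rather than as a definitive answer.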

Language Required¶

Error: --lang is required

Solution:

# Add language flag
he parse doc.md -t general/biography_graph -o ./out/ -l en

Note: Method templates do not require a language.

Output Directory Exists¶

Error: Output directory already exists

Solutions:

Force overwrite:

he parse doc.md -t general/graph -o ./out/ -l en -f

Use different directory:

he parse doc.md -t general/graph -o ./out2/ -l en

Remove existing:

rm -rf ./out/
he parse doc.md -t general/graph -o ./out/ -l en
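
In scripts, the "use a different directory" option can be automated so that existing output is never overwritten or deleted. This is a minimal sketch with `pathlib`. The helper `next_free_dir` and its numbering scheme (`out`, `out2`, `out3`, ...) are assumptions chosen to mirror the example above.

```python
from pathlib import Path


def next_free_dir(base):
    """Return `base` if it does not exist, else the first free numbered sibling."""
    base = Path(base)
    if not base.exists():
        return base
    n = 2
    while True:
        candidate = base.with_name(f"{base.name}{n}")
        if not candidate.exists():
            return candidate
        n += 1


if __name__ == "__main__":
    # Pass the printed path to `he parse ... -o <path>`.
    print(next_free_dir("./out/"))
```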

Index and Search Issues¶

Index Not Found¶

Error: Search index not built

Solution:

he build-index ./output/

Search Returns Empty¶

Problem: he search finds no results

Solutions:

Verify index exists:

he info ./output/
# Should show: Index: Built

Try different query:

he search ./output/ "different keywords"

Increase the number of results returned (top_k):
```
he search ./output/ "query" -n 10
```

Check data exists:

he info ./output/
# Should show Nodes > 0

Chat Fails¶

Error: Chat failed: index not found

Solution:

he build-index ./output/
he talk ./output/ -q "your question"

Performance Issues¶

Extraction is Very Slow¶

Problem: Extraction takes too long

Solutions:

Skip index building during batch processing:

he parse doc.md -t general/graph -o ./out/ -l en --no-index

Reduce chunk size (Python):

from hyperextract import Template
ka = Template.create("general/graph", "en")
ka.chunk_size = 1024  # Default: 2048

Reduce workers (if hitting rate limits):

from hyperextract import Template
ka = Template.create("general/graph", "en")
ka.max_workers = 5  # Default: 10

Out of Memory¶

Problem: The process is killed or raises a memory error

Solutions:

Process smaller chunks:

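# split_document stands for your own splitting helper (not defined in this guide);
# result is the extraction result you are adding text to.
for chunk in split_document(text, chunk_size=1000):
    result.feed_text(chunk)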

Save intermediate results:

for i, doc in enumerate(documents, start=1):
    result.feed_text(doc)
    if i % 5 == 0:  # checkpoint after every 5 documents
        result.dump(f"./checkpoint_{i}/")

Don't build the index at intermediate steps:

# Build only once, after all text has been fed
result.build_index()
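
The chunking snippet above calls `split_document`, which this guide does not define. One possible version is sketched below. It is not part of hyperextract, and it assumes `chunk_size` counts characters. It prefers to cut at blank lines so that paragraphs stay together where possible.

```python
def split_document(text, chunk_size=1000):
    """Yield pieces of `text`, each at most `chunk_size` characters long.

    Cuts at the last blank line inside each window when one exists,
    otherwise at the window boundary.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            cut = text.rfind("\n\n", start, end)
            if cut > start:
                end = cut + 2  # keep the blank line with the preceding chunk
        yield text[start:end]
        start = end
```

Because it is a generator, only one chunk is held in memory at a time in addition to the source text, and joining the chunks reproduces the original text exactly.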

Data Issues¶

No Entities Extracted¶

Problem: Extraction finishes but the result contains no entities

Solutions:

Check input text:
```
wc -c document.md  # Byte count should be greater than 0
```
Try different template:
```
he parse doc.md -t general/model -l en
```

Check language:

# Wrong language
he parse chinese_doc.md -t general/graph -l en

# Correct
he parse chinese_doc.md -t general/graph -l zh
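
When processing many files, a rough heuristic can catch a wrong language flag before extraction runs. The sketch below only distinguishes the two codes used in this guide, `zh` and `en`, by counting CJK ideographs. The threshold of 0.3 is an arbitrary assumption, and `guess_lang` is a hypothetical helper rather than a hyperextract function.

```python
def guess_lang(text, threshold=0.3):
    """Return 'zh' if CJK ideographs are at least `threshold` of the text, else 'en'."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        raise ValueError("document has no non-whitespace text")
    # U+4E00..U+9FFF is the CJK Unified Ideographs block.
    cjk = sum(1 for ch in chars if "\u4e00" <= ch <= "\u9fff")
    return "zh" if cjk / len(chars) >= threshold else "en"
```

The `ValueError` for whitespace-only input doubles as the "check input text" step: an empty document cannot yield entities under any template.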

Corrupted Knowledge Abstract¶

Problem: The knowledge abstract fails to load, or reading it raises errors

Solutions:

Check file structure:

ls ./ka/
# Should have: data.json, metadata.json

Validate JSON:

python -c "import json; json.load(open('./ka/data.json'))"

Re-extract:

rm -rf ./ka/
he parse doc.md -t general/graph -o ./ka/ -l en
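
The first two checks can be combined into one script that covers both files named above. This sketch verifies only that each file exists and parses as JSON; it knows nothing about the schema hyperextract expects, and `check_ka_dir` is a hypothetical helper name.

```python
import json
from pathlib import Path

EXPECTED_FILES = ("data.json", "metadata.json")  # names listed in this guide


def check_ka_dir(path):
    """Return a list of problems found in the directory; empty means it parses."""
    problems = []
    for name in EXPECTED_FILES:
        file = Path(path) / name
        if not file.is_file():
            problems.append(f"{name}: missing")
            continue
        try:
            with file.open(encoding="utf-8") as fh:
                json.load(fh)
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            problems.append(f"{name}: invalid JSON ({exc})")
    return problems


if __name__ == "__main__":
    print(check_ka_dir("./ka/") or "both files parse as JSON")
```

If either file is reported as missing or invalid, re-extracting as shown above is usually simpler than repairing the file by hand.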

Still Having Issues?¶

Check logs: look for detailed error messages
Update to the latest version: pip install --upgrade hyperextract
Check GitHub Issues: github.com/yifanfeng97/hyper-extract/issues
Create a new issue: include the error messages and reproduction steps

Debug Mode¶

Enable verbose output:

import logging
logging.basicConfig(level=logging.DEBUG)

from hyperextract import Template
ka = Template.create("general/graph", "en")

Or in CLI config:

[defaults]
verbose = true
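
When preparing a bug report, it can be convenient to capture the debug output in a file instead of the terminal. The sketch below is a variation on the Python snippet above that uses only the standard `logging` module. The file name and the helper `enable_debug_log` are illustrative assumptions.

```python
import logging


def enable_debug_log(path="hyperextract-debug.log"):
    """Send DEBUG-level records from all loggers to the file at `path`."""
    logging.basicConfig(
        level=logging.DEBUG,
        filename=path,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        force=True,  # replace handlers configured earlier (Python 3.8+)
    )
```

Call `enable_debug_log()` before creating a template so that records emitted during setup are captured too, and check the file for secrets such as API keys before attaching it to an issue.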