Skip to content

Troubleshooting

Solutions to common issues.


Installation Issues

pip install fails

Problem: Installation fails with errors

Solutions:

  1. Upgrade pip: pip install --upgrade pip
  2. Use Python 3.11+: python --version
  3. Install in virtual environment:
    python -m venv venv
    source venv/bin/activate
    pip install hyperextract
    

ImportError: No module named 'hyperextract'

Problem: Can't import after installation

Solutions:

  1. Check Python version: python --version (need 3.11+)
  2. Verify installation: pip list | grep hyper
  3. Check virtual environment is activated
  4. Reinstall: pip install --force-reinstall hyperextract

Configuration Issues

API Key Not Found

Error: No API key configured

Solutions:

  1. CLI Configuration (recommended):

    he config init -k YOUR_API_KEY
    

  2. Environment Variable:

    export OPENAI_API_KEY=your-api-key
    

  3. Verify Configuration:

    he config show
    

Invalid API Key

Error: Authentication failed

Solutions:

  1. Verify key is correct
  2. Check for extra spaces
  3. Try regenerating key in OpenAI dashboard
  4. Check if key has available credits

Runtime Issues

Template Not Found

Error: Template 'xxx' not found

Solutions:

  1. List available templates:

    he list template
    

  2. Check spelling:

    # Correct
    he parse doc.md -t general/biography_graph
    
    # Incorrect
    he parse doc.md -t general/biography
    

  3. Use Python to search:

    from hyperextract import Template
    templates = Template.list(filter_by_query="bio")
    

Language Required

Error: --lang is required

Solution:

# Add language flag
he parse doc.md -t general/biography_graph -o ./out/ -l en

Note: Method templates don't require language.

Output Directory Exists

Error: Output directory already exists

Solutions:

  1. Force overwrite:

    he parse doc.md -t general/graph -o ./out/ -l en -f
    

  2. Use different directory:

    he parse doc.md -t general/graph -o ./out2/ -l en
    

  3. Remove existing:

    rm -rf ./out/
    he parse doc.md -t general/graph -o ./out/ -l en
    


Index and Search Issues

Index Not Found

Error: Search index not built

Solution:

he build-index ./output/

Search Returns Empty

Problem: he search finds no results

Solutions:

  1. Verify index exists:

    he info ./output/
    # Should show: Index: Built
    

  2. Try different query:

    he search ./output/ "different keywords"
    

  3. Increase top_k:

    he search ./output/ "query" -n 10
    

  4. Check data exists:

    he info ./output/
    # Should show Nodes > 0
    

Chat Fails

Error: Chat failed: index not found

Solution:

he build-index ./output/
he talk ./output/ -q "your question"

Performance Issues

Extraction is Very Slow

Problem: Taking too long to process

Solutions:

  1. Skip index during batch:

    he parse doc.md -t general/graph -o ./out/ -l en --no-index
    

  2. Reduce chunk size (Python):

    ka = Template.create("general/graph", "en")
    ka.chunk_size = 1024  # Default: 2048
    

  3. Reduce workers (if hitting rate limits):

    ka = Template.create("general/graph", "en")
    ka.max_workers = 5  # Default: 10
    

Out of Memory

Problem: Process killed or memory error

Solutions:

  1. Process smaller chunks:

    for chunk in split_document(text, chunk_size=1000):
        result.feed_text(chunk)
    

  2. Save intermediate results:

    for i, doc in enumerate(documents):
        result.feed_text(doc)
        if i % 5 == 0:
            result.dump(f"./checkpoint_{i}/")
    

  3. Don't build index for intermediate steps:

    # Build only at the end
    result.build_index()
    


Data Issues

No Entities Extracted

Problem: Empty result

Solutions:

  1. Check input text:

    wc -l document.md  # Check not empty
    

  2. Try different template:

    he parse doc.md -t general/model -l en
    

  3. Check language:

    # Wrong language
    he parse chinese_doc.md -t general/graph -l en
    
    # Correct
    he parse chinese_doc.md -t general/graph -l zh
    

Corrupted Knowledge Abstract

Problem: Can't load or errors reading

Solutions:

  1. Check file structure:

    ls ./ka/
    # Should have: data.json, metadata.json
    

  2. Validate JSON:

    python -c "import json; json.load(open('./ka/data.json'))"
    

  3. Re-extract:

    rm -rf ./ka/
    he parse doc.md -t general/graph -o ./ka/ -l en
    


Still Having Issues?

  1. Check logs — Look for detailed error messages
  2. Update to latestpip install --upgrade hyperextract
  3. Check GitHub Issuesgithub.com/yifanfeng97/hyper-extract/issues
  4. Create new issue — Include error messages and reproduction steps

Debug Mode

Enable verbose output:

import logging
logging.basicConfig(level=logging.DEBUG)

from hyperextract import Template
ka = Template.create("general/graph", "en")

Or in CLI config:

[defaults]
verbose = true