Search and Chat¶
Advanced - Post-Extraction
This guide covers using extracted knowledge. You should be comfortable with Level 1: Using Templates first.
Query your knowledge abstract using semantic search and conversational AI.
Overview¶
After extracting knowledge, you can interact with it in two ways:
- Search — Find specific entities and relations
- Chat — Have natural language conversations
Both require a search index to be built first.
Building the Index¶
from hyperextract import Template

# `text` is a string holding the source document to extract from
ka = Template.create("general/biography_graph", "en")
result = ka.parse(text)

# Build search index
result.build_index()
Note: Index building is required before search/chat operations.
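One way to avoid forgetting this step is a small wrapper that builds the index on first use. The sketch below is hypothetical: `IndexedResult` is not part of hyperextract, its `top_k=5` default is an assumption, and it relies only on the `build_index()`, `search()`, and `chat()` calls shown in this guide. A stand-in class (`FakeResult`) replaces a real extraction result so the snippet runs on its own.

```python
class IndexedResult:
    """Hypothetical wrapper: builds the index lazily, exactly once."""

    def __init__(self, result):
        self._result = result
        self._indexed = False

    def _ensure_index(self):
        if not self._indexed:
            self._result.build_index()
            self._indexed = True

    def search(self, query, top_k=5):  # default of 5 is an assumption
        self._ensure_index()
        return self._result.search(query, top_k=top_k)

    def chat(self, query, top_k=5):
        self._ensure_index()
        return self._result.chat(query, top_k=top_k)


class FakeResult:
    """Stand-in for an extraction result, for demonstration only."""

    def __init__(self):
        self.build_calls = 0

    def build_index(self):
        self.build_calls += 1

    def search(self, query, top_k=5):
        return [], []

    def chat(self, query, top_k=5):
        return f"answer to: {query}"


fake = FakeResult()
wrapped = IndexedResult(fake)
wrapped.search("inventions")
wrapped.chat("What did Tesla invent?")
print(fake.build_calls)  # the index was built once, on the first call
```

With a real result, `IndexedResult(result)` could be passed around in place of `result` wherever only search and chat are needed.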
Semantic Search¶
Basic Search¶
# Search for relevant items (returns tuple of nodes, edges)
nodes, edges = result.search("inventions", top_k=5)
for node in nodes:
    print(f"Node: {node.name}")
for edge in edges:
    print(f"Edge: {edge.source} -> {edge.target}")
Search Parameters¶
nodes, edges = result.search(
    query="electrical engineering achievements",
    top_k=10  # Number of results for both nodes and edges
)
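To make `top_k` concrete, here is a generic illustration of top-k retrieval by cosine similarity over toy vectors. It is not hyperextract's implementation, which this guide does not describe. The vectors, item names, and the `top_k_by_cosine` helper are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_cosine(query_vec, items, top_k=5):
    """Return the names of the top_k items most similar to query_vec."""
    ranked = sorted(items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Toy "index": (name, embedding) pairs with made-up values.
toy_index = [
    ("AC motor", [0.9, 0.1, 0.0]),
    ("Tesla coil", [0.7, 0.3, 0.1]),
    ("Belgrade", [0.0, 0.1, 0.9]),
]

print(top_k_by_cosine([1.0, 0.0, 0.0], toy_index, top_k=2))
# → ['AC motor', 'Tesla coil']
```

Raising `top_k` only widens the cut-off in the ranked list, so the extra results are the less similar ones.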
Working with Results¶
nodes, edges = result.search("Nobel Prize")
# Process nodes
for node in nodes:
    print(f"Node: {node.name} ({node.type})")
# Process edges
for edge in edges:
    print(f"Edge: {edge.source} -> {edge.target}")
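When a search returns many nodes, grouping them by type can make the output easier to scan. The following is a minimal sketch that assumes nodes expose the `name` and `type` attributes used above. `group_by_type` is a hypothetical helper, and the stand-in nodes and their type labels are made up.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_type(nodes):
    """Map each node type to the list of node names of that type."""
    groups = defaultdict(list)
    for node in nodes:
        # Fall back to "unknown" if a node carries no type attribute.
        groups[getattr(node, "type", "unknown")].append(node.name)
    return dict(groups)


# Stand-in nodes for demonstration; real nodes come from result.search(...).
demo_nodes = [
    SimpleNamespace(name="Nikola Tesla", type="person"),
    SimpleNamespace(name="Thomas Edison", type="person"),
    SimpleNamespace(name="Tesla coil", type="invention"),
]

for node_type, names in group_by_type(demo_nodes).items():
    print(f"{node_type}: {', '.join(names)}")
```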
Search Use Cases¶
# Find specific people
people_nodes, people_edges = result.search("scientists who worked with Tesla", top_k=10)
# Find concepts
concept_nodes, concept_edges = result.search("alternating current system", top_k=5)
# Find events
event_nodes, event_edges = result.search("important dates in Tesla's life", top_k=10)
Chat Interface¶
Single Question¶
# Ask a question
response = result.chat("What were Tesla's major achievements?")
print(response.content)
Accessing Retrieved Context¶
response = result.chat("What did Tesla invent?")
print(response.content)
# Access nodes and edges used to generate response
retrieved_nodes = response.additional_kwargs.get("retrieved_nodes", [])
retrieved_edges = response.additional_kwargs.get("retrieved_edges", [])
print(f"Based on {len(retrieved_nodes)} nodes and {len(retrieved_edges)} edges")
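For logging or review, the retrieved context can be condensed into a short text summary. This sketch assumes only what the snippet above shows: a response with `additional_kwargs` holding `retrieved_nodes` and `retrieved_edges`. `describe_sources` is a hypothetical helper and the demo response is a stand-in with made-up data.

```python
from types import SimpleNamespace


def describe_sources(response, limit=3):
    """Summarize the nodes and edges attached to a chat response."""
    kwargs = getattr(response, "additional_kwargs", None) or {}
    nodes = kwargs.get("retrieved_nodes", [])
    edges = kwargs.get("retrieved_edges", [])
    lines = [f"{len(nodes)} nodes, {len(edges)} edges"]
    lines += [f"node: {n.name}" for n in nodes[:limit]]
    lines += [f"edge: {e.source} -> {e.target}" for e in edges[:limit]]
    return "\n".join(lines)


# Stand-in response object, for demonstration only.
demo_response = SimpleNamespace(
    content="Tesla invented the induction motor.",
    additional_kwargs={
        "retrieved_nodes": [SimpleNamespace(name="Induction motor")],
        "retrieved_edges": [
            SimpleNamespace(source="Nikola Tesla", target="Induction motor")
        ],
    },
)
print(describe_sources(demo_response))
```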
Chat Parameters¶
response = result.chat(
    query="Explain the War of Currents",
    top_k=10  # More context for complex questions
)
Chat Use Cases¶
# Summarization
summary = result.chat("Summarize Tesla's career in 3 sentences")
# Explanation
explanation = result.chat("What is the significance of the Tesla coil?")
# Comparison
comparison = result.chat("How did Tesla's approach differ from Edison's?")
# Timeline
timeline = result.chat("What happened in Tesla's life between 1880-1890?")
Search vs Chat¶
| Feature | Search | Chat |
|---|---|---|
| Returns | Raw entities/relations | Natural language answer |
| Speed | Fast | Slower (LLM call) |
| Use for | Finding specific data | Understanding/explaining |
| Precision | Matches ranked by semantic similarity | Interpretive |
| Output | Structured | Free text |
When to Use Each¶
Use Search when:
- You need specific entities
- Building reports or summaries
- Exporting data
- Fast lookup needed

Use Chat when:
- Explaining concepts
- Answering complex questions
- Summarizing content
- Interactive exploration
Advanced Patterns¶
Search then Chat¶
# First, search for specific nodes
nodes, edges = result.search("wireless technology", top_k=5)
# Then, ask about them
if nodes:
    context = ", ".join([node.name for node in nodes])
    response = result.chat(f"Explain the significance of {context}")
    print(response.content)
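Search results may contain repeated names, and a long list can bloat the follow-up question. The sketch below shows one way to build the prompt defensively. `build_followup` and its `max_names` cap are hypothetical, and the nodes are stand-ins that carry only the `name` attribute used above.

```python
from types import SimpleNamespace


def build_followup(nodes, max_names=5):
    """Build a follow-up question from unique node names, or None if empty."""
    names = []
    for node in nodes:
        if node.name not in names:  # keep first occurrence, preserve order
            names.append(node.name)
    if not names:
        return None
    return "Explain the significance of " + ", ".join(names[:max_names])


# Stand-in nodes for demonstration; real nodes come from result.search(...).
demo_nodes = [
    SimpleNamespace(name="Wardenclyffe Tower"),
    SimpleNamespace(name="Tesla coil"),
    SimpleNamespace(name="Wardenclyffe Tower"),
]
print(build_followup(demo_nodes))
# → Explain the significance of Wardenclyffe Tower, Tesla coil
```

The returned string can then be passed to `result.chat(...)` when it is not `None`.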
Iterative Exploration¶
# Start broad
response = result.chat("What are the main topics in this document?")
print(response.content)
# Drill down
response = result.chat("Tell me more about the Tesla coil")
print(response.content)
# Specific question
response = result.chat("How does the Tesla coil work?")
print(response.content)
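The same broad-to-specific sequence can be driven from a list of questions, which keeps a transcript for later review. This is a sketch under assumptions: `explore` is a hypothetical helper, and `fake_chat` stands in for `result.chat`, returning an object with a `content` attribute as in the examples above.

```python
from types import SimpleNamespace


def explore(chat_fn, questions):
    """Ask each question in order and collect (question, answer) pairs."""
    transcript = []
    for question in questions:
        response = chat_fn(question)
        transcript.append((question, response.content))
    return transcript


def fake_chat(question):
    """Stand-in for result.chat, for demonstration only."""
    return SimpleNamespace(content=f"(answer to: {question})")


questions = [
    "What are the main topics in this document?",
    "Tell me more about the Tesla coil",
    "How does the Tesla coil work?",
]
for question, answer in explore(fake_chat, questions):
    print(f"Q: {question}\nA: {answer}\n")
```

With a real result, `explore(result.chat, questions)` would issue one chat call per question.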
Building a Research Assistant¶
from hyperextract import Template


class ResearchAssistant:
    def __init__(self, ka_path):
        # Load a previously saved knowledge abstract and index it
        self.ka = Template.create("general/concept_graph", "en")
        self.ka.load(ka_path)
        self.ka.build_index()

    def ask(self, question):
        response = self.ka.chat(question)
        return response.content

    def find(self, query, n=5):
        return self.ka.search(query, top_k=n)

    def summarize(self):
        return self.ask("Summarize the main points of this paper")


# Usage
assistant = ResearchAssistant("./paper_kb/")
print(assistant.summarize())
print(assistant.ask("What are the limitations?"))
Best Practices¶
Search Tips¶
- Use natural language — "wireless transmission" not "wireless"
- Be specific — "Tesla's patents" not "Tesla"
- Increase top_k for broad queries — top_k=10 or more
- Filter results by type — Check hasattr(item, 'name')
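The last tip can be sketched as a small filter. It assumes node-like items have a `name` attribute while edge-like items have `source` and `target`, as in the examples in this guide. `filter_nodes` is a hypothetical helper, and the items and type labels are made up.

```python
from types import SimpleNamespace


def filter_nodes(items, node_type=None):
    """Keep items that have a name; optionally keep only one node type."""
    named = [item for item in items if hasattr(item, "name")]
    if node_type is None:
        return named
    return [item for item in named if getattr(item, "type", None) == node_type]


# Stand-in mix of node-like and edge-like items, for demonstration only.
mixed = [
    SimpleNamespace(name="Nikola Tesla", type="person"),
    SimpleNamespace(name="Tesla coil", type="invention"),
    SimpleNamespace(source="Nikola Tesla", target="Tesla coil"),
]
print([item.name for item in filter_nodes(mixed)])
# → ['Nikola Tesla', 'Tesla coil']
print([item.name for item in filter_nodes(mixed, node_type="person")])
# → ['Nikola Tesla']
```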
Chat Tips¶
- Ask clear questions — Specific questions get better answers
- Use context — Build understanding progressively
- Adjust top_k — Complex questions need more context
- Check sources — Review retrieved_nodes and retrieved_edges for accuracy
Troubleshooting¶
"Index not built"¶
# Build the index before calling search() or chat()
result.build_index()
"No results found"¶
# Try different phrasing
results = result.search("inventions") # Try synonyms
results = result.search("discoveries")
results = result.search("contributions")
# Increase top_k
results = result.search("Tesla", top_k=20)
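Trying alternative phrasings can be automated with a loop that stops at the first query returning anything. This is a minimal sketch: `search_with_fallbacks` is a hypothetical helper, and `fake_search` stands in for `result.search`, returning a `(nodes, edges)` tuple as described above.

```python
def search_with_fallbacks(search_fn, phrasings, top_k=5):
    """Return (query, nodes, edges) for the first phrasing with any results."""
    for query in phrasings:
        nodes, edges = search_fn(query, top_k=top_k)
        if nodes or edges:
            return query, nodes, edges
    return None, [], []


def fake_search(query, top_k=5):
    """Stand-in for result.search with made-up data, for demonstration only."""
    hits = {"contributions": (["AC induction motor"], [])}
    return hits.get(query, ([], []))


matched, nodes, edges = search_with_fallbacks(
    fake_search, ["inventions", "discoveries", "contributions"]
)
print(matched, nodes)
# → contributions ['AC induction motor']
```

With a real result, `search_with_fallbacks(result.search, [...])` would behave the same way, provided `result.search` accepts the query as its first positional argument.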
"Irrelevant chat responses"¶
# Increase context
response = result.chat(question, top_k=10)
# Or rephrase question
response = result.chat("Be more specific: ...")
See Also¶
Related Workflows:
- Incremental Updates — Add more content
- Saving and Loading — Persist for later use

Basics:
- Using Templates — Level 1 fundamentals
- Working with Auto-Types — Level 2 customization