Research API Reference
Table of Contents
- Overview
- Prompting Best Practices
- Model Selection
- Key Parameters
- Basic Usage
- Streaming vs Polling
- Structured Output vs. Report
- Response Fields
- Summary
Overview
The Research API conducts comprehensive research on any topic with automatic source gathering, analysis, and response generation with citations. It's an end-to-end solution when you need AI-powered research without building your own pipeline.
Prompting Best Practices
State a clear research goal and include every detail and direction the agent needs to pursue it.
Guidelines:
- Be specific when you can. Include known details: target market, competitors, geography, constraints
- Stay open-ended only for discovery. Make it explicit: "tell me about the most impactful AI innovations in healthcare in 2025"
- Avoid contradictions. Don't include conflicting constraints or goals
- Share what's already known. Include prior assumptions so research doesn't repeat existing knowledge
- Keep prompts clean and directed. Clear task + essential context + desired output format
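The "clear task + essential context + desired output format" pattern can be sketched as a small prompt builder. `build_research_prompt` is a hypothetical helper written for illustration, not part of the Tavily SDK, and the section labels it emits ("What we already know", "Desired output") are assumptions rather than anything the API requires.

```python
def build_research_prompt(task, known_context=None, output_format=None):
    """Assemble a prompt as: task, then known context, then desired output.

    Hypothetical helper for illustration only; the Research API simply
    receives the resulting string as its `input`.
    """
    parts = [task.strip()]
    if known_context:
        # Listing prior knowledge helps the research avoid repeating it.
        facts = "\n".join(f"- {fact.strip()}" for fact in known_context)
        parts.append("What we already know:\n" + facts)
    if output_format:
        parts.append("Desired output: " + output_format.strip())
    return "\n\n".join(parts)


prompt = build_research_prompt(
    task="Research Notion's 2026 outlook, including market position and growth risks.",
    known_context=[
        "Primarily serves SMB and mid-market teams",
        "Most often competes with Confluence and ClickUp",
    ],
    output_format="A short brief with citations.",
)
print(prompt)
```

The resulting string would be passed as the `input` parameter of `research()`.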
Example Queries
Company research:
Research the company ____ and its 2026 outlook. Provide a brief overview
of the company, its products, services, and market position.

Competitive analysis:
Conduct a competitive analysis of ____ in 2026. Identify their main
competitors, compare market positioning, and analyze key differentiators.

With prior context:
We're evaluating Notion as a potential partner. We already know they
primarily serve SMB and mid-market teams, expanded their AI features
significantly in 2025, and most often compete with Confluence and ClickUp.
Research Notion's 2026 outlook, including market position, growth risks,
and where a partnership could be most valuable. Include citations.

Model Selection
| Model | Best For |
|---|---|
| pro | Comprehensive, multi-agent research for complex, multi-domain topics |
| mini | Targeted, efficient research for narrow or well-scoped questions |
| auto | When unsure how complex research will be (default) |
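One way to encode the table above in application code is a small selection function. This is only a sketch: `choose_model` and its `subtopic_count` argument are hypothetical, and treating "more than one subtopic" as the threshold for `pro` is an assumption, not a rule stated by the API.

```python
def choose_model(subtopic_count=None):
    """Pick a model name from a rough estimate of how many subtopics a query spans.

    Assumed heuristic (not from the API docs):
      - unknown scope        -> "auto" (the API default)
      - one narrow question  -> "mini"
      - several subtopics    -> "pro"
    """
    if subtopic_count is None:
        return "auto"
    if subtopic_count < 1:
        raise ValueError("subtopic_count must be at least 1")
    return "mini" if subtopic_count == 1 else "pro"


print(choose_model())    # scope unknown
print(choose_model(1))   # single well-scoped question
print(choose_model(5))   # multi-domain analysis
```

When the scope is hard to estimate up front, leaving the choice to `auto` avoids hard-coding a guess.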
Pro Model
Multi-agent research suited for complex topics spanning multiple subtopics or domains. Use for deeper analysis, thorough reports, or maximum accuracy.
result = client.research(
    input="Analyze the competitive landscape for ____ in the SMB market, "
          "including key competitors, positioning, pricing models, customer "
          "segments, recent product moves, and defensible advantages or risks "
          "over the next 2-3 years.",
    model="pro"
)

Mini Model
Optimized for targeted, efficient research. Best for narrow or well-scoped questions where you still benefit from agentic searching and synthesis.
result = client.research(
    input="What are the top 5 competitors to ____ in the SMB market, and how do they differentiate?",
    model="mini"
)

Key Parameters
research()
| Parameter | Type | Default | Description |
|---|---|---|---|
| input | string | Required | The research topic or question |
| model | enum | "auto" | "mini", "pro", or "auto" |
| stream | boolean | false | Enable streaming responses |
| output_schema | object | null | JSON Schema for structured output |
| citation_format | enum | "numbered" | "numbered", "mla", "apa", "chicago" |
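Because `model` and `citation_format` are enums, it can help to validate arguments locally before starting a task. The sketch below mirrors the table above; `build_research_kwargs` is a hypothetical helper, not an SDK function, and it assumes the client accepts these parameters as keyword arguments exactly as named in the table.

```python
ALLOWED_MODELS = ("mini", "pro", "auto")
ALLOWED_CITATION_FORMATS = ("numbered", "mla", "apa", "chicago")


def build_research_kwargs(topic, model="auto", stream=False,
                          output_schema=None, citation_format="numbered"):
    """Validate arguments against the parameter table and return a kwargs dict."""
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("input is required and must be a non-empty string")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {ALLOWED_MODELS}, got {model!r}")
    if citation_format not in ALLOWED_CITATION_FORMATS:
        raise ValueError(
            f"citation_format must be one of {ALLOWED_CITATION_FORMATS}, "
            f"got {citation_format!r}"
        )
    if output_schema is not None and not isinstance(output_schema, dict):
        raise TypeError("output_schema must be a dict (JSON Schema) or None")

    kwargs = {
        "input": topic,
        "model": model,
        "stream": bool(stream),
        "citation_format": citation_format,
    }
    # Only send output_schema when one was provided (the default is null).
    if output_schema is not None:
        kwargs["output_schema"] = output_schema
    return kwargs


kwargs = build_research_kwargs("EV battery supply chain risks", model="mini",
                               citation_format="apa")
print(kwargs)
# With a real client, this would be used as: client.research(**kwargs)
```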
get_research()
| Parameter | Type | Description |
|---|---|---|
| request_id | string | Task ID from research() response |
Basic Usage
Research tasks are two-step: initiate with research(), retrieve with get_research().
import time
from tavily import TavilyClient

client = TavilyClient()

# Step 1: Start research task
result = client.research(
    input="Latest developments in quantum computing and their practical applications",
    model="pro"
)
request_id = result["request_id"]

# Step 2: Poll until completed
response = client.get_research(request_id)
while response["status"] not in ["completed", "failed"]:
    print(f"Status: {response['status']}... polling again in 10 seconds")
    time.sleep(10)
    response = client.get_research(request_id)

# Step 3: Handle result
if response["status"] == "failed":
    raise RuntimeError(f"Research failed: {response.get('error', 'Unknown error')}")

report = response["content"]
sources = response["sources"]

Streaming vs Polling
- Streaming: best for user interfaces where you want real-time updates.
- Polling: best for background processes where you check status periodically.
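For background jobs, a polling loop usually benefits from an upper time limit so a stuck task cannot block forever. The following is a minimal sketch, assuming only that the client exposes `get_research(request_id)` returning a dict with a `status` key, as described in this reference. `wait_for_research` and its `interval` and `timeout` defaults are hypothetical choices, not SDK features.

```python
import time


def wait_for_research(client, request_id, interval=10.0, timeout=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_research() until the task completes, fails, or times out.

    `sleep` and `clock` are injectable so the loop can be exercised in tests
    without real waiting. The defaults are illustrative assumptions.
    """
    deadline = clock() + timeout
    while True:
        response = client.get_research(request_id)
        if response["status"] in ("completed", "failed"):
            return response
        if clock() >= deadline:
            raise TimeoutError(
                f"Research {request_id} still {response['status']!r} "
                f"after {timeout} seconds"
            )
        sleep(interval)
```

A caller would start the task with `research()`, pass the returned `request_id` to this helper, and then inspect `status` on the returned response before reading `content` and `sources`.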
Streaming
Enable real-time progress monitoring with stream=True.
stream = client.research(
    input="Latest developments in quantum computing",
    model="pro",
    stream=True
)

for chunk in stream:
    print(chunk.decode('utf-8'))

Event Types
| Event Type | Description |
|---|---|
| Tool Call | Agent initiates action (Planning, WebSearch, etc.) |
| Tool Response | Results after tool execution with sources |
| Content | Research report streamed as markdown (or JSON with output_schema) |
| Sources | Complete list of sources, emitted after content |
| Done | Signals completion |
Tool Types
| Tool | Description | Models |
|---|---|---|
| Planning | Initializes research strategy | mini, pro |
| WebSearch | Executes web searches | mini, pro |
| Generating | Creates final report | mini, pro |
| ResearchSubtopic | Deep research on subtopics | pro only |
Typical Flow
1. Planning: Tool Call → Tool Response
2. WebSearch: Tool Call → Tool Response (with sources)
3. ResearchSubtopic cycles (Pro mode only)
4. Generating: Tool Call → Tool Response
5. Content chunks (markdown or structured JSON)
6. Sources event
7. Done event
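The flow above maps naturally onto progress messages in a UI. The sketch below assumes each stream chunk has already been parsed into an `(event_type, tool)` pair; this reference does not specify the wire format of the chunks, so that parsing step is left out and the pair representation is purely hypothetical. The event and tool names come from the tables above, while `describe_event` and the message wording are illustrative.

```python
# Progress text per tool; the wording is an illustrative assumption.
STATUS_MESSAGES = {
    "Planning": "Planning research strategy",
    "WebSearch": "Searching the web",
    "ResearchSubtopic": "Researching subtopics",
    "Generating": "Writing the report",
}


def describe_event(event_type, tool=None):
    """Turn an already-parsed stream event into a short progress message."""
    if event_type == "Tool Call":
        return STATUS_MESSAGES.get(tool, f"Running {tool}") + "..."
    if event_type == "Tool Response":
        return f"{tool} finished"
    if event_type == "Content":
        return "Receiving report content"
    if event_type == "Sources":
        return "Sources received"
    if event_type == "Done":
        return "Research complete"
    return f"Unrecognized event: {event_type}"


# A mini-model run following the typical flow (no ResearchSubtopic cycles).
typical_flow = [
    ("Tool Call", "Planning"),
    ("Tool Response", "Planning"),
    ("Tool Call", "WebSearch"),
    ("Tool Response", "WebSearch"),
    ("Tool Call", "Generating"),
    ("Tool Response", "Generating"),
    ("Content", None),
    ("Sources", None),
    ("Done", None),
]
for event_type, tool in typical_flow:
    print(describe_event(event_type, tool))
```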
See streaming cookbook and polling cookbook for complete examples.
Structured Output vs. Report
| Format | Best For |
|---|---|
| Report (default) | Reading, sharing, or displaying verbatim (chat interfaces, briefs, newsletters) |
| Structured Output | Data enrichment, pipelines, or powering UIs with specific fields |
Structured Output
Use output_schema to receive research in a predefined JSON structure.
schema = {
    "properties": {
        "summary": {
            "type": "string",
            "description": "Executive summary of findings"
        },
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Main takeaways from the research"
        },
        "metrics": {
            "type": "object",
            "properties": {
                "market_size": {"type": "string", "description": "Total market size"},
                "growth_rate": {"type": "number", "description": "Annual growth percentage"}
            }
        }
    },
    "required": ["summary", "key_points"]
}

result = client.research(
    input="Electric vehicle market analysis 2024",
    output_schema=schema
)

Schema Best Practices
- Write clear field descriptions. 1-3 sentences explaining what the field should contain
- Match the structure you need. Use arrays, objects, enums appropriately (e.g., competitors: string[], not "A, B, C")
- Avoid duplicate fields. Keep each field unique and specific
- Use required arrays to enforce mandatory fields at any nesting level
Supported types: object, string, integer, number, array
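These practices can be checked mechanically before a schema is sent. The sketch below is a hypothetical linter, not something the API or SDK provides. It flags types outside the supported list above, fields without a description, and `required` entries that are not defined in `properties`, recursing into nested objects and arrays of objects.

```python
SUPPORTED_TYPES = {"object", "string", "integer", "number", "array"}


def lint_schema(schema, path="$"):
    """Return a list of human-readable problems found in an output schema."""
    problems = []
    properties = schema.get("properties", {})

    # Every required name should correspond to a defined property.
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(
                f"{path}: required field '{name}' is not defined in properties"
            )

    for name, spec in properties.items():
        field_path = f"{path}.{name}"
        field_type = spec.get("type")
        if field_type not in SUPPORTED_TYPES:
            problems.append(f"{field_path}: unsupported type {field_type!r}")
        if not spec.get("description"):
            problems.append(f"{field_path}: missing description")
        if field_type == "object":
            problems.extend(lint_schema(spec, field_path))
        elif field_type == "array":
            items = spec.get("items", {})
            if items.get("type") not in SUPPORTED_TYPES:
                problems.append(
                    f"{field_path}: unsupported item type {items.get('type')!r}"
                )
            elif items.get("type") == "object":
                problems.extend(lint_schema(items, f"{field_path}[]"))
    return problems


draft = {
    "properties": {
        "summary": {"type": "string", "description": "Executive summary"},
        "tags": {"type": "array", "items": {"type": "boolean"},
                 "description": "Topic labels"},
        "metrics": {
            "type": "object",
            "properties": {
                "growth_rate": {"type": "number",
                                "description": "Annual growth percentage"},
            },
        },
    },
    "required": ["summary", "key_points"],
}
for problem in lint_schema(draft):
    print(problem)
```

An empty list means the schema passed these particular checks; it does not guarantee the API will accept it.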
Streaming with Structured Output
When output_schema is provided, content arrives as structured JSON:
stream = client.research(
    input="AI agent frameworks comparison",
    model="mini",
    stream=True,
    output_schema={
        "properties": {
            "summary": {"type": "string", "description": "Executive summary"},
            "key_points": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["summary", "key_points"]
    }
)

for chunk in stream:
    data = chunk.decode('utf-8')
    print(data)  # Content chunks will be structured JSON

Response Fields
research() Response
| Field | Description |
|---|---|
| request_id | Unique identifier for tracking |
| created_at | Timestamp when task was created |
| status | Initial status |
| input | The research topic submitted |
| model | Model used by research agent |
get_research() Response
| Field | Description |
|---|---|
| status | "pending", "processing", "completed", "failed" |
| content | Generated research report (when completed) |
| sources | Array of source citations |
| response_time | Time in seconds |
Source Object
| Field | Description |
|---|---|
| url | Source URL |
| title | Source title |
| citation | Formatted citation string |
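As an illustration of working with source objects, the sketch below renders a `sources` array as a numbered reference list. `render_sources` is a hypothetical helper, the sample data is invented for the example, and the fallback from `citation` to `title` to `url` is an assumption about how you might want to handle missing fields.

```python
def render_sources(sources):
    """Format source objects (url, title, citation) as a numbered list."""
    lines = []
    for number, source in enumerate(sources, start=1):
        # Prefer the formatted citation, then the title, then the bare URL.
        label = source.get("citation") or source.get("title") or source["url"]
        lines.append(f"[{number}] {label} <{source['url']}>")
    return "\n".join(lines)


# Invented sample data shaped like the Source Object table above.
sample_sources = [
    {
        "url": "https://example.com/report",
        "title": "Example Report",
        "citation": "Example Author. Example Report. 2025.",
    },
    {"url": "https://example.com/blog", "title": "Example Blog Post"},
]
print(render_sources(sample_sources))
```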
Summary
- Be specific in prompts: include known details such as target market, competitors, geography, and constraints
- Share prior context: include what you already know to avoid repetition
- Choose the right model: mini for focused queries, pro for comprehensive multi-domain analysis
- Use streaming for UX: display real-time progress during long research tasks
- Use structured output for pipelines: define schemas for consistent, parseable responses
- Use reports for reading: the default format is best for chat interfaces and sharing
For more examples, see the Tavily Cookbook and live demo.