# Before: a standard OpenAI client
from openai import OpenAI

client = OpenAI(api_key="sk-...")

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.7,
    max_tokens=150
)
# After: the same client, pointed at Perplexity
from openai import OpenAI

client = OpenAI(
    api_key="pplx-...",
    base_url="https://api.perplexity.ai"
)

response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.7,
    max_tokens=150
)
✓ Just change the base URL and API key - all your existing code works!
Direct search with structured results
POST /search
{
  "query": "latest AI news",
  "search_type": "news",
  "limit": 10
}
Search API Docs
AI-powered answers with citations
POST /chat/completions
{
  "model": "sonar-pro",
  "messages": [{
    "role": "user",
    "content": "What's new in AI?"
  }]
}
LLM API Docs
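The two request shapes above differ only in endpoint and body; as a sketch, they can be expressed as plain payload builders (the helper names here are hypothetical, and the field names follow the examples above):

```python
def search_payload(query: str, search_type: str = "news", limit: int = 10) -> dict:
    """Body for POST /search: direct search with structured results."""
    return {"query": query, "search_type": search_type, "limit": limit}


def chat_payload(question: str, model: str = "sonar-pro") -> dict:
    """Body for POST /chat/completions: an AI-generated answer with citations."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
```

Use the Search API when you want raw ranked results to process yourself, and the chat endpoint when you want a synthesized answer.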
model: Select from Perplexity's model lineup
messages: Standard conversation array format
temperature: Control response randomness (0-2)
max_tokens: Set response length limits
top_p: Nucleus sampling control
stream: Real-time streamed responses
frequency_penalty: Control word repetition
presence_penalty: Manage topic diversity
search_domain_filter: Include or exclude specific domains, e.g. ["include:arxiv.org", "exclude:reddit.com"]
search_recency_filter: Filter by content freshness: "month", "week", "day", "hour"
search_context_size: Control search depth: "low" | "medium" | "high"
search_mode: Web or academic sources: "web" | "academic"
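Since the search controls in the second half of this list are Perplexity extensions rather than standard OpenAI parameters, one way to forward them through the OpenAI Python SDK is its `extra_body` argument. A minimal sketch, with parameter names taken from the list above (verify them against the current Perplexity docs):

```python
# Perplexity-specific search controls, grouped for reuse
search_controls = {
    "search_domain_filter": ["include:arxiv.org", "exclude:reddit.com"],
    "search_recency_filter": "week",
    "search_context_size": "medium",
    "search_mode": "web",
}

# The OpenAI SDK forwards non-standard fields via extra_body:
# response = client.chat.completions.create(
#     model="llama-3.1-sonar-large-128k-online",
#     messages=[{"role": "user", "content": "Recent ML papers?"}],
#     extra_body=search_controls,
# )
```

Keeping the Perplexity-only fields in one dict makes it easy to strip them out again if a request needs to run against a plain OpenAI endpoint.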
Parse intent and context
Choose search strategy
Retrieve relevant data
Generate final answer
Perplexity's multi-stage pipeline ensures accurate, relevant, and up-to-date responses
"What is the capital of France?"
→ May skip search
"Latest AI news today"
→ Requires fresh search
"Compare quantum computing approaches"
→ Deep search needed
"Write a poem about coding"
→ No search required
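As a toy illustration only (this is not Perplexity's actual classifier), a keyword heuristic that reproduces the four routing decisions above might look like:

```python
# Illustrative keyword sets; a real classifier would use the model itself
FRESHNESS_WORDS = {"latest", "today", "news", "breaking"}
CREATIVE_WORDS = {"write", "poem", "story", "compose"}
DEEP_WORDS = {"compare", "analyze", "evaluate"}


def search_strategy(query: str) -> str:
    """Map a query to one of the four routing decisions above."""
    words = set(query.lower().split())
    if words & CREATIVE_WORDS:
        return "no search required"
    if words & DEEP_WORDS:
        return "deep search needed"
    if words & FRESHNESS_WORDS:
        return "requires fresh search"
    return "may skip search"
```

The real pipeline parses intent from full context rather than keywords, but the routing outcomes are the same four buckets.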
Search: arXiv, PubMed, Google Scholar
search_mode="academic"
Priority: Recent results
search_recency_filter="hour"
Filter: Specific domains
search_domain_filter=[...]
Use: Model knowledge only
disable_search=True
Search strategy: Broad Search or Targeted Search
Search depth (search_context_size):
"low": 3-5 sources • Fast • Cost-effective
"medium": 10-15 sources • Balanced • Standard
"high": 20+ sources • Comprehensive • Research-grade
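A trivial helper for choosing a tier (illustrative only; the priority names are made up here, and the tier values come from the list above):

```python
def pick_context_size(priority: str) -> str:
    """Map a speed/cost/quality priority to a search_context_size value."""
    tiers = {"fast": "low", "balanced": "medium", "research": "high"}
    return tiers[priority]
```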
Pull key facts from each source
Source 1: "GDP grew 3.2% in Q3"
Source 2: "Inflation at 2.1%"
Source 3: "Fed signals rate pause"
Cross-reference and validate claims
Compose natural, accurate response
Apple's Q4 2024 revenue reached $94.8 billion, representing an 8.1% year-over-year increase. iPhone sales grew 6% to $46.2 billion, while Services revenue hit a record $22.3 billion. China sales declined 2.5% due to increased competition.
{
  "company": "Apple Inc.",
  "quarter": "Q4 2024",
  "total_revenue": 94800000000,
  "growth_rate": "8.1%",
  "segments": {
    "iphone": {
      "revenue": 46200000000,
      "growth": "6%"
    },
    "services": {
      "revenue": 22300000000,
      "record": true
    }
  },
  "regional_performance": {
    "china": {
      "change": "-2.5%",
      "reason": "increased competition"
    }
  }
}
Apple's Q4 2024 revenue reached $94.8 billion, up 8.1% year-over-year. iPhone sales grew 6% while Services hit record highs.
✓ Temporal marker: "latest", "2024"
✓ Technical domain: "quantum computing"
✓ Research intent: "breakthroughs"
→ Decision: Fresh academic + tech search needed
{
  "search_mode": "web",  // Not purely academic
  "search_recency_filter": "month",
  "search_domain_filter": [
    "include:arxiv.org",
    "include:nature.com",
    "include:ieee.org",
    "include:quantumcomputing.com"
  ],
  "search_context_size": "high"
}
"The latest quantum computing breakthroughs in 2024 include Google's Willow chip achieving below-threshold error correction [1], IBM's Condor processor exceeding 1000 qubits [2], and Harvard's demonstration of logical qubits with 99.5% fidelity [3]..."
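Answers like the one above carry bracketed [n] markers, and the API response also includes a list of source URLs, so a small formatter can render them as footnotes (the URL below is a placeholder, not a real citation):

```python
def with_sources(content: str, citations: list) -> str:
    """Append a numbered source list matching the [n] markers in the answer."""
    notes = "\n".join(f"[{i}] {url}" for i, url in enumerate(citations, 1))
    return f"{content}\n\nSources:\n{notes}"


answer = "Google's Willow chip achieved below-threshold error correction [1]."
urls = ["https://example.com/willow"]  # placeholder URL for illustration
print(with_sources(answer, urls))
```

The index in the citations list lines up with the [n] markers, so footnote numbering stays consistent.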
Let AI automatically decide when web search is needed
response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",
    messages=messages,
    enable_search_classifier=True  # Smart search activation
)
Turn off web search for deterministic, offline responses
response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",
    messages=messages,
    disable_search=True  # Pure model knowledge only
)
Include relevant image URLs in responses
return_images=True
→ Adds visual evidence and context to your answers
Generate intelligent follow-up questions
return_related_questions=True
→ Boost user engagement and discovery
schema = {
    "type": "object",
    "properties": {
        "company": {"type": "string"},
        "revenue": {"type": "number"},
        "growth_rate": {
            "type": "string",
            "pattern": "^\\d+\\.\\d+%$"
        },
        "key_metrics": {
            "type": "array",
            "items": {"type": "string"}
        }
    },
    "required": ["company", "revenue"]
}
response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",
    messages=[{
        "role": "user",
        "content": "Analyze Apple's earnings"
    }],
    response_format={
        "type": "json_schema",
        "json_schema": {"schema": schema}
    }
)
{
  "company": "Apple Inc.",
  "revenue": 94800000000,
  "growth_rate": "8.1%",
  "key_metrics": [
    "iPhone sales up 6%",
    "Services revenue $22.3B",
    "China sales declined 2.5%"
  ]
}
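The schema-constrained response arrives as plain JSON text, so the last step is to parse it and check the schema's two required fields. A minimal sketch (in practice `raw` would be `response.choices[0].message.content`, and a library like `jsonschema` would do full validation):

```python
import json

# Sample payload mirroring the structured response shown above
raw = """
{
  "company": "Apple Inc.",
  "revenue": 94800000000,
  "growth_rate": "8.1%",
  "key_metrics": ["iPhone sales up 6%"]
}
"""

data = json.loads(raw)
for field in ("company", "revenue"):  # the schema's "required" list
    if field not in data:
        raise ValueError(f"missing required field: {field}")
```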
import base64

# Upload and analyze documents
with open("report.pdf", "rb") as file:
    file_content = base64.b64encode(file.read()).decode()

response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize key findings"},
                {"type": "document", "document": {"data": file_content, "type": "pdf"}}
            ]
        }
    ]
)
# Submit the request as a background job
job = client.async_jobs.create(
    model="llama-3.1-sonar-large-128k-online",
    messages=messages
)

# Poll for completion
status = client.async_jobs.get(job.id)
# Status: pending → processing → complete

if status.status == "complete":
    results = status.result
Perfect for batch processing, complex research, and long-running analysis tasks
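The poll step above can be wrapped in a small generic helper. A sketch, where `get_status` is any zero-argument callable returning an object with `.status` and `.result` (for example `lambda: client.async_jobs.get(job.id)`, if the job client behaves as shown in this guide):

```python
import time


def wait_for_job(get_status, poll_seconds: float = 2.0, timeout: float = 600.0):
    """Poll until the job reports 'complete', then return its result."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status.status == "complete":
            return status.result
        if status.status == "failed":
            raise RuntimeError("async job failed")
        time.sleep(poll_seconds)
    raise TimeoutError("async job did not finish in time")
```

Keeping the polling interval and timeout as parameters lets batch pipelines tune them per workload.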
Real-time news monitoring:
search_recency_filter="hour"
search_domain_filter=[
    "include:reuters.com",
    "include:bloomberg.com"
]

Academic research:
search_mode="academic"
search_context_size="high"
return_related_questions=True

Financial analysis:
response_format={
    "type": "json_schema",
    "json_schema": {"schema": financial_schema}
}
search_domain_filter=["include:sec.gov"]

Contract data extraction:
# Upload and extract data
{"type": "document",
 "document": {"data": contract_pdf}}
response_format={
    "type": "json_schema",
    "json_schema": {"schema": extraction_schema}
}
OpenAI-compatible for instant migration
Advanced search controls unique to Perplexity
Structured outputs and file processing
Zero data retention guarantee
Enterprise security and compliance
Get started quickly with our official Python, TypeScript, and JavaScript SDKs with full documentation and examples. → View All SDKs
Test and explore Perplexity's API capabilities interactively before writing any code. → Try Playground
Connect Perplexity to your tools and workflows with our official MCP server implementation. → Setup MCP Server
Step-by-step tutorial to migrate from OpenAI and explore Perplexity's unique features. → Start Building