Build an AI-Powered App with Claude API: Complete Developer Guide
I've been building tools with the Claude API for the past year — everything from automated code reviewers to content pipelines to data processing workflows. Unlike chatbot wrappers, the API lets you embed AI capabilities directly into your applications where they actually provide value. This guide covers everything I wish I'd known when I started, from basic usage to production patterns.
TL;DR: Learn to build production-ready AI applications using the Anthropic SDK. Covers messages API, tool use, streaming, and a practical code reviewer example.
Setup
Install the Anthropic SDK and set up your API key:
npm install @anthropic-ai/sdk
Get your API key from console.anthropic.com and set it as an environment variable:
export ANTHROPIC_API_KEY=sk-ant-...
The Messages API: Basics
The Messages API is the core of every Claude interaction:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
async function askClaude(prompt: string): Promise<string> {
  const message = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: prompt },
    ],
  });
  // Extract text from the response
  const textBlock = message.content.find((block) => block.type === 'text');
  return textBlock?.text ?? '';
}
const response = await askClaude('Explain closures in JavaScript in 3 sentences.');
console.log(response);
System Prompts
System prompts set the behavior and personality of the model:
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: `You are a senior software engineer doing code reviews.
Be direct, specific, and constructive.
Focus on bugs, security issues, and performance problems.
Ignore minor style preferences.`,
  messages: [
    { role: 'user', content: `Review this code:\n\n${codeSnippet}` },
  ],
});
Multi-Turn Conversations
Pass the full conversation history to maintain context:
const conversationHistory: Anthropic.MessageParam[] = [];
async function chat(userMessage: string): Promise<string> {
  conversationHistory.push({ role: 'user', content: userMessage });
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: conversationHistory,
  });
  const assistantMessage = response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
  conversationHistory.push({ role: 'assistant', content: assistantMessage });
  return assistantMessage;
}
await chat('What is a race condition?');
await chat('Show me an example in Node.js.');
await chat('How do I fix it?');
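Because the whole history is resent on every call, input tokens grow with each turn. Below is a minimal sketch of one way to cap that growth, assuming a simple keep-the-most-recent policy. ChatTurn and trimConversation are hypothetical names, not SDK exports, and the sketch assumes the API expects the first message to come from the user.

```typescript
// Hypothetical local type mirroring the { role, content } shape used above.
type ChatTurn = { role: 'user' | 'assistant'; content: string };

// Keep at most `maxTurns` of the most recent messages, then drop any leading
// assistant messages so the trimmed history still opens with a user turn.
function trimConversation(turns: ChatTurn[], maxTurns: number): ChatTurn[] {
  // slice(-0) would return the whole array, so handle non-positive limits explicitly.
  const recent = maxTurns > 0 ? turns.slice(-maxTurns) : [];
  const firstUser = recent.findIndex((turn) => turn.role === 'user');
  return firstUser === -1 ? [] : recent.slice(firstUser);
}
```

Dropping old turns also drops whatever context they carried, so a real application might prefer to summarize them instead. Treat the limit as a tuning knob rather than a recommendation.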
Streaming Responses
For real-time output in UIs, use streaming:
async function streamResponse(prompt: string) {
  const stream = client.messages.stream({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  });
  // Process chunks as they arrive
  for await (const event of stream) {
    if (
      event.type === 'content_block_delta' &&
      event.delta.type === 'text_delta'
    ) {
      process.stdout.write(event.delta.text);
    }
  }
  // Get the final message object
  const finalMessage = await stream.finalMessage();
  console.log('\n\nTokens used:', finalMessage.usage);
}
Streaming in a Web API
Pipe streaming responses to an HTTP client:
// Next.js API route
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const stream = client.messages.stream({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 2048,
    messages: [{ role: 'user', content: prompt }],
  });
  // Convert to a ReadableStream for the Response
  const encoder = new TextEncoder();
  const readableStream = new ReadableStream({
    async start(controller) {
      for await (const event of stream) {
        if (
          event.type === 'content_block_delta' &&
          event.delta.type === 'text_delta'
        ) {
          controller.enqueue(encoder.encode(event.delta.text));
        }
      }
      controller.close();
    },
  });
  return new Response(readableStream, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}
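On the receiving end, the browser (or any fetch client) has to decode those bytes incrementally. Here is a sketch of that consumer, assuming the route above is served at a hypothetical /api/chat path. readTextStream is an illustrative helper, not part of any SDK.

```typescript
// Read a byte stream chunk by chunk and hand decoded text to `onText`.
// Resolves with the full text once the stream closes.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onText: (text: string) => void
): Promise<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder('utf-8');
  let full = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } makes the decoder hold back a multi-byte character
    // that was split across two chunks instead of emitting a broken one.
    const text = decoder.decode(value, { stream: true });
    if (text) {
      full += text;
      onText(text);
    }
  }
  // Flush anything the decoder is still buffering.
  const tail = decoder.decode();
  if (tail) {
    full += tail;
    onText(tail);
  }
  return full;
}

// Usage sketch ('/api/chat' is an assumed route path):
// const res = await fetch('/api/chat', {
//   method: 'POST',
//   body: JSON.stringify({ prompt: 'Hello' }),
// });
// if (res.body) await readTextStream(res.body, (t) => console.log(t));
```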
Tool Use (Function Calling)
Tool use lets Claude call functions you define. The model decides when to call a tool, provides the arguments, and you execute it and return the result.
const tools: Anthropic.Tool[] = [
  {
    name: 'get_weather',
    description: 'Get the current weather for a city',
    input_schema: {
      type: 'object',
      properties: {
        city: {
          type: 'string',
          description: 'City name, e.g., "San Francisco"',
        },
        unit: {
          type: 'string',
          enum: ['celsius', 'fahrenheit'],
          description: 'Temperature unit',
        },
      },
      required: ['city'],
    },
  },
];
async function chatWithTools(userMessage: string) {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    tools,
    messages: [{ role: 'user', content: userMessage }],
  });
  // Check if Claude wants to use a tool
  if (response.stop_reason === 'tool_use') {
    const toolUse = response.content.find(
      (block) => block.type === 'tool_use'
    );
    if (toolUse && toolUse.type === 'tool_use') {
      // Execute the tool. The SDK types `input` as unknown, so cast it to
      // the object shape the handler expects.
      const result = await executeFunction(
        toolUse.name,
        toolUse.input as Record<string, unknown>
      );
      // Send the result back to Claude
      const followUp = await client.messages.create({
        model: 'claude-sonnet-4-20250514',
        max_tokens: 1024,
        tools,
        messages: [
          { role: 'user', content: userMessage },
          { role: 'assistant', content: response.content },
          {
            role: 'user',
            content: [
              {
                type: 'tool_result',
                tool_use_id: toolUse.id,
                content: JSON.stringify(result),
              },
            ],
          },
        ],
      });
      return followUp;
    }
  }
  return response;
}
async function executeFunction(name: string, input: Record<string, unknown>) {
  if (name === 'get_weather') {
    // Call your actual weather API here
    return { temperature: 22, condition: 'sunny', city: input.city };
  }
  throw new Error(`Unknown function: ${name}`);
}
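The example above handles a single tool call with an if-chain. Once you register several tools, or a response contains more than one tool_use block, a lookup table tends to scale better. The following sketch builds one tool_result block per call. ToolCall, ToolResultBlock, toolHandlers, and runToolCalls are hypothetical local definitions that only mirror the block shapes shown above, and the sketch reports failures to the model through the tool_result is_error field rather than throwing.

```typescript
// Local stand-ins for the block shapes used above (not SDK imports).
type ToolCall = { type: 'tool_use'; id: string; name: string; input: unknown };
type ToolResultBlock = {
  type: 'tool_result';
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};
// A handler may return a plain value or a promise; both are awaited below.
type ToolHandler = (input: Record<string, unknown>) => unknown;

// Hypothetical registry: tool name -> implementation.
const toolHandlers = new Map<string, ToolHandler>([
  ['get_weather', (input) => ({ temperature: 22, condition: 'sunny', city: input.city })],
]);

// Run every requested tool and build one tool_result block per call.
async function runToolCalls(calls: ToolCall[]): Promise<ToolResultBlock[]> {
  const results: ToolResultBlock[] = [];
  for (const call of calls) {
    try {
      const handler = toolHandlers.get(call.name);
      if (!handler) throw new Error(`Unknown tool: ${call.name}`);
      const output = await handler((call.input ?? {}) as Record<string, unknown>);
      results.push({
        type: 'tool_result',
        tool_use_id: call.id,
        content: JSON.stringify(output),
      });
    } catch (error) {
      // Tell the model the call failed so it can recover or explain.
      const message = error instanceof Error ? error.message : String(error);
      results.push({
        type: 'tool_result',
        tool_use_id: call.id,
        content: message,
        is_error: true,
      });
    }
  }
  return results;
}
```

The returned array would go in as the content of the follow-up user message, in place of the single hand-built tool_result above.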
Practical Example: AI Code Reviewer
Let's build a real code review tool that analyzes pull request diffs:
import Anthropic from '@anthropic-ai/sdk';
import { execFileSync } from 'node:child_process';
const client = new Anthropic();
interface ReviewResult {
  summary: string;
  issues: Array<{
    severity: 'critical' | 'warning' | 'suggestion';
    file: string;
    line?: number;
    description: string;
  }>;
  approved: boolean;
}
async function reviewPullRequest(baseBranch = 'main'): Promise<ReviewResult> {
  // Get the diff. execFileSync passes arguments straight to git without a
  // shell, so an unexpected branch name cannot inject shell commands.
  const diff = execFileSync('git', ['diff', `${baseBranch}...HEAD`], {
    encoding: 'utf8',
  });
  if (!diff.trim()) {
    return { summary: 'No changes found.', issues: [], approved: true };
  }
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 4096,
    system: `You are a senior software engineer reviewing a pull request.
Analyze the diff for:
1. Bugs and logic errors
2. Security vulnerabilities (injection, XSS, auth bypass)
3. Performance issues (N+1 queries, missing indexes, memory leaks)
4. Error handling gaps
Respond with JSON matching this schema:
{
  "summary": "One paragraph summary",
  "issues": [
    {
      "severity": "critical|warning|suggestion",
      "file": "path/to/file",
      "line": 42,
      "description": "What's wrong and how to fix it"
    }
  ],
  "approved": true/false
}
Only flag real issues. Do not nitpick style or formatting.`,
    messages: [
      {
        role: 'user',
        content: `Review this pull request diff:\n\n${diff}`,
      },
    ],
  });
  const text = response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
  // Parse the JSON response
  const jsonMatch = text.match(/\{[\s\S]*\}/);
  if (!jsonMatch) {
    throw new Error('Failed to parse review response');
  }
  return JSON.parse(jsonMatch[0]) as ReviewResult;
}
// Run the review
const review = await reviewPullRequest();
console.log(`Summary: ${review.summary}`);
console.log(`Approved: ${review.approved}`);
console.log(`Issues found: ${review.issues.length}`);
for (const issue of review.issues) {
  const icon =
    issue.severity === 'critical' ? 'X' :
    issue.severity === 'warning' ? '!' : '-';
  console.log(`[${icon}] ${issue.file}:${issue.line ?? '?'} - ${issue.description}`);
}
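The "as ReviewResult" cast in reviewPullRequest trusts whatever JSON the model returned. Below is a dependency-free sketch of a runtime check that could replace it. parseReviewResult and its helper types are hypothetical names, the CheckedReview shape is copied from the ReviewResult interface above so the block stands alone, and a schema library would do the same job in fewer lines.

```typescript
// Same shape as the ReviewResult interface above, repeated so this sketch stands alone.
interface ReviewIssue {
  severity: 'critical' | 'warning' | 'suggestion';
  file: string;
  line?: number;
  description: string;
}
interface CheckedReview {
  summary: string;
  issues: ReviewIssue[];
  approved: boolean;
}

const SEVERITIES: readonly string[] = ['critical', 'warning', 'suggestion'];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function isReviewIssue(value: unknown): value is ReviewIssue {
  return (
    isRecord(value) &&
    typeof value.severity === 'string' &&
    SEVERITIES.includes(value.severity) &&
    typeof value.file === 'string' &&
    (value.line === undefined || typeof value.line === 'number') &&
    typeof value.description === 'string'
  );
}

// Parse raw model output and throw unless it matches the expected shape.
function parseReviewResult(raw: string): CheckedReview {
  const data: unknown = JSON.parse(raw);
  if (
    !isRecord(data) ||
    typeof data.summary !== 'string' ||
    typeof data.approved !== 'boolean' ||
    !Array.isArray(data.issues) ||
    !data.issues.every(isReviewIssue)
  ) {
    throw new Error('Review response does not match the expected schema');
  }
  return {
    summary: data.summary,
    approved: data.approved,
    issues: data.issues as ReviewIssue[],
  };
}
```

Failing loudly here is a design choice: a malformed review that silently reports zero issues is worse than a crashed CI step.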
Error Handling and Retries
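Production applications need robust error handling. The SDK already retries some failures on its own (by default, connection errors and 408, 409, 429, and 5xx responses are retried twice with a short backoff), so an explicit loop like the one below is mainly useful when you want custom logging or a different retry policy. Keep in mind that it stacks on top of the built-in retries unless you lower the client's maxRetries option: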
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
async function robustQuery(
  prompt: string,
  maxRetries = 3
): Promise<string> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const message = await client.messages.create({
        model: 'claude-sonnet-4-20250514',
        max_tokens: 1024,
        messages: [{ role: 'user', content: prompt }],
      });
      const text = message.content.find((b) => b.type === 'text');
      return text?.text ?? '';
    } catch (error) {
      if (error instanceof Anthropic.RateLimitError) {
        // Rate limited: wait and retry
        const waitTime = Math.pow(2, attempt) * 1000;
        console.log(`Rate limited. Waiting ${waitTime}ms...`);
        await new Promise((resolve) => setTimeout(resolve, waitTime));
        continue;
      }
      // status can be undefined (e.g. connection failures), so default it
      if (error instanceof Anthropic.APIError && (error.status ?? 0) >= 500) {
        // Server error: retry
        console.log(`Server error (${error.status}). Attempt ${attempt}/${maxRetries}`);
        await new Promise((resolve) => setTimeout(resolve, 1000 * attempt));
        continue;
      }
      // Client error (400, 401, etc.): do not retry
      throw error;
    }
  }
  throw new Error(`Failed after ${maxRetries} attempts`);
}
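The loop above hard-codes its retry policy into one call site. Below is a sketch of the same idea as a reusable helper, with random jitter added so that many clients hitting a rate limit do not all retry at the same instant. withRetry, backoffDelay, and the isRetryable callback are hypothetical names, and the delay constants are arbitrary defaults.

```typescript
// Exponential backoff with full jitter: a random delay between 0 and
// min(capMs, baseMs * 2^attempt). `attempt` starts at 0.
function backoffDelay(attempt: number, baseMs = 1000, capMs = 30_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

// Call `fn` until it succeeds, `isRetryable` returns false, or attempts run out.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 3,
  baseMs = 1000
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const isLastAttempt = attempt === maxAttempts - 1;
      // Rethrow immediately instead of sleeping when no retry will follow.
      if (isLastAttempt || !isRetryable(error)) throw error;
      const delay = backoffDelay(attempt, baseMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error('maxAttempts must be at least 1');
}

// Usage sketch, reusing askClaude from earlier in this guide:
// const text = await withRetry(
//   () => askClaude('Summarize this changelog.'),
//   (e) => e instanceof Anthropic.RateLimitError
// );
```

Unlike the loop above, this version does not sleep after the final failed attempt, and it rethrows the original error rather than a generic one.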
Cost Management
Monitor and control your API costs:
// Track token usage
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: prompt }],
});
console.log('Input tokens:', message.usage.input_tokens);
console.log('Output tokens:', message.usage.output_tokens);
// Estimate cost, assuming Claude Sonnet pricing of $3 per million input
// tokens and $15 per million output tokens (as of 2026; check current pricing)
const inputCost = (message.usage.input_tokens / 1_000_000) * 3;
const outputCost = (message.usage.output_tokens / 1_000_000) * 15;
console.log(`Estimated cost: $${(inputCost + outputCost).toFixed(4)}`);
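The arithmetic above generalizes to a small helper once you call more than one model. Here is a sketch, assuming prices are expressed in US dollars per million tokens. The PRICING table is a hypothetical placeholder that reuses only the Sonnet figures from the snippet above, so fill in current numbers from the pricing page before relying on it.

```typescript
type TokenUsage = { input_tokens: number; output_tokens: number };
type ModelPrice = { inputPerMTok: number; outputPerMTok: number };

// Placeholder prices in USD per million tokens. The single row reuses the
// figures from the snippet above; verify them before relying on the result.
const PRICING = new Map<string, ModelPrice>([
  ['claude-sonnet-4-20250514', { inputPerMTok: 3, outputPerMTok: 15 }],
]);

function estimateCostUsd(model: string, usage: TokenUsage): number {
  const price = PRICING.get(model);
  if (!price) throw new Error(`No pricing configured for model: ${model}`);
  return (
    (usage.input_tokens / 1_000_000) * price.inputPerMTok +
    (usage.output_tokens / 1_000_000) * price.outputPerMTok
  );
}

// 2,000 input tokens and 500 output tokens:
// 0.002 * 3 + 0.0005 * 15 = 0.006 + 0.0075 = 0.0135
const exampleCost = estimateCostUsd('claude-sonnet-4-20250514', {
  input_tokens: 2000,
  output_tokens: 500,
});
console.log(exampleCost.toFixed(4)); // prints 0.0135
```

Throwing on an unknown model is deliberate: a missing price that silently counts as zero would hide real spend.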
Tips for Reducing Token Usage
- Be specific in system prompts: shorter, focused instructions use fewer tokens
- Use max_tokens wisely: set it close to the expected response length, not the maximum. It is a cap on output, so a tight value bounds the worst-case cost of a runaway response
- Truncate large inputs: send relevant code sections, not entire files
- Choose the right model: use Haiku for simple tasks, Sonnet for most work, Opus for complex reasoning
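For the tip about truncating large inputs, here is a rough sketch of a character-budget guard. It assumes about four characters per token, which is only a rule of thumb for English text and code and not the real tokenizer. truncateToBudget and the marker string are hypothetical.

```typescript
const APPROX_CHARS_PER_TOKEN = 4; // heuristic, not the actual tokenizer

// Keep the head and tail of `text` and cut out the middle once the rough
// token estimate exceeds `maxTokens`.
function truncateToBudget(text: string, maxTokens: number): string {
  const maxChars = maxTokens * APPROX_CHARS_PER_TOKEN;
  if (text.length <= maxChars) return text;
  const marker = '\n... [truncated] ...\n';
  // Reserve room for the marker so the result stays within maxChars
  // (as long as maxChars is at least as long as the marker itself).
  const keep = Math.max(0, maxChars - marker.length);
  const head = Math.ceil(keep / 2);
  const tail = keep - head;
  return text.slice(0, head) + marker + (tail > 0 ? text.slice(-tail) : '');
}
```

Keeping both ends is a guess that the start and end of a diff or log matter most. For source files, selecting the relevant functions is usually a better strategy than blind truncation.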
Pro Tips
- Always validate JSON responses from Claude. Use zod or ajv to validate the schema before trusting the output.
- Set reasonable max_tokens for each use case. Code reviews need 2-4K tokens, summaries need 500-1K, classifications need 100-200.
- Use the system prompt for behavior, not the user message. System prompt tokens are billed as input like any others, but a stable system prompt gives consistent behavior across requests and is a natural candidate for prompt caching.
- Implement request timeouts for production APIs. A 30-second timeout prevents requests from hanging indefinitely.
- Log every API call in production: token counts, latency, model version. You need this data for cost optimization.
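For the timeout tip, the SDK client accepts a timeout option (in milliseconds) in its constructor, which is usually the simplest route for API calls. When other async work sits in the same request path, a generic deadline helper can cover the rest. Below is a sketch, with withTimeout as a hypothetical name.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
// This only stops waiting; it does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}

// Usage sketch, reusing reviewPullRequest from earlier in this guide:
// const review = await withTimeout(reviewPullRequest(), 30_000);
```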
Key Takeaways
- The Messages API is the foundation: model, messages array, max_tokens
- Use streaming for real-time UIs and long responses
- Tool use lets Claude call your functions with structured arguments
- Always implement retries with exponential backoff for production apps
- Track token usage for cost management
- Choose the right model for each task to balance quality and cost