I've been building tools with the Claude API for the past year — everything from automated code reviewers to content pipelines to data processing workflows. Unlike chatbot wrappers, the API lets you embed AI capabilities directly into your applications where they actually provide value. This guide covers everything I wish I'd known when I started, from basic usage to production patterns.

TL;DR: Learn to build production-ready AI applications with the Anthropic SDK. Covers the Messages API, tool use, streaming, and a practical code reviewer example.

Setup

Install the Anthropic SDK and set up your API key:

npm install @anthropic-ai/sdk

Get your API key from console.anthropic.com and set it as an environment variable:

export ANTHROPIC_API_KEY=sk-ant-...

The Messages API: Basics

The Messages API is the core of every Claude interaction:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function askClaude(prompt: string): Promise<string> {
  const message = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: prompt },
    ],
  });

  // Extract text from the response
  const textBlock = message.content.find((block) => block.type === 'text');
  return textBlock?.text ?? '';
}

const response = await askClaude('Explain closures in JavaScript in 3 sentences.');
console.log(response);
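The response's content is an array of typed blocks, so a reply can hold more than one text block, or none at all when the model only requests a tool. Here is a minimal sketch of a helper that joins every text block. The types are simplified local stand-ins, and `ContentBlock` and `extractText` are illustrative names, not SDK exports.

```typescript
// Simplified stand-ins for the SDK's content block types (assumption:
// only the fields used in this sketch are modeled).
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown };

type TextBlock = Extract<ContentBlock, { type: 'text' }>;

// Join every text block in order; non-text blocks are skipped.
function extractText(content: ContentBlock[]): string {
  return content
    .filter((block): block is TextBlock => block.type === 'text')
    .map((block) => block.text)
    .join('');
}

// Hypothetical sample data, not a real API response
const sample: ContentBlock[] = [
  { type: 'text', text: 'Closures capture ' },
  { type: 'tool_use', id: 'toolu_example', name: 'lookup', input: {} },
  { type: 'text', text: 'their enclosing scope.' },
];

console.log(extractText(sample)); // prints "Closures capture their enclosing scope."
```

The explicit type predicate in `filter` keeps the `.text` access type-safe even on TypeScript versions that do not infer the narrowing on their own.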

System Prompts

System prompts set the behavior and personality of the model:

const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: `You are a senior software engineer doing code reviews.
Be direct, specific, and constructive.
Focus on bugs, security issues, and performance problems.
Ignore minor style preferences.`,
  messages: [
    { role: 'user', content: `Review this code:\n\n${codeSnippet}` },
  ],
});

Multi-Turn Conversations

Pass the full conversation history to maintain context:

const conversationHistory: Anthropic.MessageParam[] = [];

async function chat(userMessage: string): Promise<string> {
  conversationHistory.push({ role: 'user', content: userMessage });

  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: conversationHistory,
  });

  const assistantMessage = response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');

  conversationHistory.push({ role: 'assistant', content: assistantMessage });
  return assistantMessage;
}

await chat('What is a race condition?');
await chat('Show me an example in Node.js.');
await chat('How do I fix it?');
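Because the full history is resent on every call, input tokens grow with each turn. One possible way to bound that is to keep only the most recent messages. The sketch below uses a local `ChatMessage` type and a hypothetical `trimHistory` helper. It assumes that dropping old turns is acceptable for your use case and that the trimmed list should still open with a user message.

```typescript
// Local stand-in for the SDK's message param type (assumption: string content only)
type ChatMessage = { role: 'user' | 'assistant'; content: string };

// Keep at most `maxMessages` of the newest messages, then drop any leading
// assistant messages so the list still starts with a user turn.
function trimHistory(history: ChatMessage[], maxMessages: number): ChatMessage[] {
  if (maxMessages <= 0) return [];
  const recent = history.slice(-maxMessages);
  const firstUser = recent.findIndex((m) => m.role === 'user');
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

// Hypothetical conversation used only for illustration
const history: ChatMessage[] = [
  { role: 'user', content: 'What is a race condition?' },
  { role: 'assistant', content: 'Two operations racing on shared state.' },
  { role: 'user', content: 'Show me an example in Node.js.' },
  { role: 'assistant', content: 'Here is one...' },
  { role: 'user', content: 'How do I fix it?' },
];

const trimmed = trimHistory(history, 4);
console.log(trimmed.length); // prints 3 (the leading assistant message is dropped)
```

Trimming by message count is the crudest policy. Summarizing older turns is another option, at the cost of an extra API call.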

Streaming Responses

For real-time output in UIs, use streaming:

async function streamResponse(prompt: string) {
  const stream = client.messages.stream({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  });

  // Process chunks as they arrive
  for await (const event of stream) {
    if (
      event.type === 'content_block_delta' &&
      event.delta.type === 'text_delta'
    ) {
      process.stdout.write(event.delta.text);
    }
  }

  // Get the final message object
  const finalMessage = await stream.finalMessage();
  console.log('\n\nTokens used:', finalMessage.usage);
}
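The loop above only acts on `content_block_delta` events whose delta is a `text_delta`. The same filter can be written as a small reducer, which is handy when a UI needs the accumulated text so far rather than each chunk. The event types below are simplified local stand-ins for the SDK's stream events, and `appendDelta` is a hypothetical helper.

```typescript
// Simplified stream event shapes (assumption: only the variants needed here)
type StreamEvent =
  | { type: 'message_start' }
  | {
      type: 'content_block_delta';
      delta:
        | { type: 'text_delta'; text: string }
        | { type: 'input_json_delta'; partial_json: string };
    }
  | { type: 'message_stop' };

// Append text deltas to the running text; every other event leaves it unchanged.
function appendDelta(soFar: string, event: StreamEvent): string {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    return soFar + event.delta.text;
  }
  return soFar;
}

// Hypothetical event sequence for illustration
const events: StreamEvent[] = [
  { type: 'message_start' },
  { type: 'content_block_delta', delta: { type: 'text_delta', text: 'Hel' } },
  { type: 'content_block_delta', delta: { type: 'input_json_delta', partial_json: '{"a"' } },
  { type: 'content_block_delta', delta: { type: 'text_delta', text: 'lo' } },
  { type: 'message_stop' },
];

const streamedText = events.reduce(appendDelta, '');
console.log(streamedText); // prints "Hello"
```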

Streaming in a Web API

Pipe streaming responses to an HTTP client:

// Next.js API route
export async function POST(req: Request) {
  const { prompt } = await req.json();

  const stream = client.messages.stream({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 2048,
    messages: [{ role: 'user', content: prompt }],
  });

  // Convert to a ReadableStream for the Response
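  const encoder = new TextEncoder(); // reuse one encoder for every chunk
  const readableStream = new ReadableStream({
    async start(controller) {
      try {
        for await (const event of stream) {
          if (
            event.type === 'content_block_delta' &&
            event.delta.type === 'text_delta'
          ) {
            controller.enqueue(encoder.encode(event.delta.text));
          }
        }
        controller.close();
      } catch (error) {
        // Surface upstream failures instead of leaving the response hanging
        controller.error(error);
      }
    },
  });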

  return new Response(readableStream, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}

Tool Use (Function Calling)

Tool use lets Claude request calls to functions you define. The model decides when a tool is needed and supplies the arguments; your code executes the function and sends the result back in a follow-up message.

const tools: Anthropic.Tool[] = [
  {
    name: 'get_weather',
    description: 'Get the current weather for a city',
    input_schema: {
      type: 'object',
      properties: {
        city: {
          type: 'string',
          description: 'City name, e.g., "San Francisco"',
        },
        unit: {
          type: 'string',
          enum: ['celsius', 'fahrenheit'],
          description: 'Temperature unit',
        },
      },
      required: ['city'],
    },
  },
];

async function chatWithTools(userMessage: string) {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    tools,
    messages: [{ role: 'user', content: userMessage }],
  });

  // Check if Claude wants to use a tool
  if (response.stop_reason === 'tool_use') {
    // A response can contain several tool_use blocks, and each one
    // needs a matching tool_result in the next user message
    const toolResults: Anthropic.ToolResultBlockParam[] = [];

    for (const block of response.content) {
      if (block.type !== 'tool_use') continue;

      // Execute the tool (input is typed as unknown, so narrow it here)
      const result = await executeFunction(
        block.name,
        block.input as Record<string, unknown>
      );

      toolResults.push({
        type: 'tool_result',
        tool_use_id: block.id,
        content: JSON.stringify(result),
      });
    }

    // Send the results back to Claude
    const followUp = await client.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 1024,
      tools,
      messages: [
        { role: 'user', content: userMessage },
        { role: 'assistant', content: response.content },
        { role: 'user', content: toolResults },
      ],
    });

    return followUp;
  }

  return response;
}

async function executeFunction(name: string, input: Record<string, unknown>) {
  if (name === 'get_weather') {
    // Call your actual weather API here
    return { temperature: 22, condition: 'sunny', city: input.city };
  }
  throw new Error(`Unknown function: ${name}`);
}
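The input of a tool call is model-generated JSON, so it is worth checking it against the tool's schema before executing anything. Below is a minimal hand-rolled sketch for the `get_weather` tool above. `parseWeatherInput` and `WeatherInput` are illustrative names, and defaulting `unit` to celsius is my assumption. A schema library would work equally well.

```typescript
const UNITS = ['celsius', 'fahrenheit'] as const;
type Unit = (typeof UNITS)[number];

interface WeatherInput {
  city: string;
  unit: Unit;
}

function isUnit(value: unknown): value is Unit {
  return typeof value === 'string' && (UNITS as readonly string[]).includes(value);
}

// Validate untrusted tool input and return a typed value, or throw with a
// message that can be sent back to the model as an error result.
function parseWeatherInput(input: unknown): WeatherInput {
  if (typeof input !== 'object' || input === null) {
    throw new Error('Tool input must be an object');
  }
  const { city, unit } = input as Record<string, unknown>;

  if (typeof city !== 'string' || city.trim() === '') {
    throw new Error('"city" must be a non-empty string');
  }
  if (unit === undefined) {
    // Assumption: fall back to celsius when the optional unit is omitted
    return { city: city.trim(), unit: 'celsius' };
  }
  if (!isUnit(unit)) {
    throw new Error('"unit" must be "celsius" or "fahrenheit"');
  }
  return { city: city.trim(), unit };
}

console.log(parseWeatherInput({ city: ' San Francisco ' }));
// prints { city: 'San Francisco', unit: 'celsius' }
```

Throwing a descriptive error here gives you something useful to return in the `tool_result` content, so the model can correct its arguments on the next turn.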

Practical Example: AI Code Reviewer

Let's build a real code review tool that analyzes pull request diffs:

import Anthropic from '@anthropic-ai/sdk';
import { execFileSync } from 'node:child_process';

const client = new Anthropic();

interface ReviewResult {
  summary: string;
  issues: Array<{
    severity: 'critical' | 'warning' | 'suggestion';
    file: string;
    line?: number;
    description: string;
  }>;
  approved: boolean;
}

async function reviewPullRequest(baseBranch = 'main'): Promise<ReviewResult> {
  // Get the diff. execFileSync passes arguments straight to git (no shell),
  // so a branch name cannot inject shell commands.
  const diff = execFileSync('git', ['diff', `${baseBranch}...HEAD`], {
    encoding: 'utf8',
    maxBuffer: 10 * 1024 * 1024, // allow diffs larger than the 1 MB default
  });

  if (!diff.trim()) {
    return { summary: 'No changes found.', issues: [], approved: true };
  }

  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 4096,
    system: `You are a senior software engineer reviewing a pull request.
Analyze the diff for:
1. Bugs and logic errors
2. Security vulnerabilities (injection, XSS, auth bypass)
3. Performance issues (N+1 queries, missing indexes, memory leaks)
4. Error handling gaps

Respond with JSON matching this schema:
{
  "summary": "One paragraph summary",
  "issues": [
    {
      "severity": "critical|warning|suggestion",
      "file": "path/to/file",
      "line": 42,
      "description": "What's wrong and how to fix it"
    }
  ],
  "approved": true/false
}

Only flag real issues. Do not nitpick style or formatting.`,
    messages: [
      {
        role: 'user',
        content: `Review this pull request diff:\n\n${diff}`,
      },
    ],
  });

  const text = response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');

  // Parse the JSON response
  const jsonMatch = text.match(/\{[\s\S]*\}/);
  if (!jsonMatch) {
    throw new Error('Failed to parse review response');
  }

  return JSON.parse(jsonMatch[0]) as ReviewResult;
}

// Run the review
const review = await reviewPullRequest();
console.log(`Summary: ${review.summary}`);
console.log(`Approved: ${review.approved}`);
console.log(`Issues found: ${review.issues.length}`);

for (const issue of review.issues) {
  const icon =
    issue.severity === 'critical' ? 'X' :
    issue.severity === 'warning' ? '!' : '-';
  console.log(`[${icon}] ${issue.file}:${issue.line ?? '?'} — ${issue.description}`);
}
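The `JSON.parse(...) as ReviewResult` cast above only tells the compiler what shape to expect; nothing checks it at runtime. Here is a dependency-free sketch of a parser that verifies the shape before returning it. `parseReviewResult` and the helper guards are hypothetical names, and the interfaces are repeated so the block stands alone.

```typescript
type Severity = 'critical' | 'warning' | 'suggestion';

interface ReviewIssue {
  severity: Severity;
  file: string;
  line?: number;
  description: string;
}

interface ReviewResult {
  summary: string;
  issues: ReviewIssue[];
  approved: boolean;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function isIssue(value: unknown): value is ReviewIssue {
  if (!isRecord(value)) return false;
  const severityOk =
    value.severity === 'critical' ||
    value.severity === 'warning' ||
    value.severity === 'suggestion';
  return (
    severityOk &&
    typeof value.file === 'string' &&
    (value.line === undefined || typeof value.line === 'number') &&
    typeof value.description === 'string'
  );
}

// Extract the first {...} span from the model's text and validate its shape.
function parseReviewResult(text: string): ReviewResult {
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) throw new Error('No JSON object found in response');

  const data: unknown = JSON.parse(match[0]);
  if (!isRecord(data)) throw new Error('Response JSON is not an object');

  const { summary, issues, approved } = data;
  if (
    typeof summary !== 'string' ||
    typeof approved !== 'boolean' ||
    !Array.isArray(issues) ||
    !issues.every(isIssue)
  ) {
    throw new Error('Response JSON does not match ReviewResult');
  }
  return { summary, issues, approved };
}

// Hypothetical model output with prose around the JSON
const raw = 'Here is my review:\n{"summary":"Looks fine.","issues":[],"approved":true}';
console.log(parseReviewResult(raw).approved); // prints true
```

If you already use zod or ajv, the same check is a few lines of schema. The point is only that the cast alone should not be trusted.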

Error Handling and Retries

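Production applications need robust error handling. The SDK already retries some failures on its own (twice by default, configurable through the client's maxRetries option), so the explicit loop below is mainly useful when you want to control the policy yourself: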

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function robustQuery(
  prompt: string,
  maxRetries = 3
): Promise<string> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const message = await client.messages.create({
        model: 'claude-sonnet-4-20250514',
        max_tokens: 1024,
        messages: [{ role: 'user', content: prompt }],
      });

      const text = message.content.find((b) => b.type === 'text');
      return text?.text ?? '';
    } catch (error) {
      if (error instanceof Anthropic.RateLimitError) {
        // Rate limited: wait and retry
        const waitTime = Math.pow(2, attempt) * 1000;
        console.log(`Rate limited. Waiting ${waitTime}ms...`);
        await new Promise((resolve) => setTimeout(resolve, waitTime));
        continue;
      }

      if (error instanceof Anthropic.APIError && error.status >= 500) {
        // Server error: retry
        console.log(`Server error (${error.status}). Attempt ${attempt}/${maxRetries}`);
        await new Promise((resolve) => setTimeout(resolve, 1000 * attempt));
        continue;
      }

      // Client error (400, 401, etc.): do not retry
      throw error;
    }
  }

  throw new Error(`Failed after ${maxRetries} attempts`);
}
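The loop above waits a fixed power-of-two delay, which means many clients that were rate limited at the same moment will all retry at the same moment too. A common refinement is to cap the delay and add random jitter. The sketch below shows the "full jitter" variant. `backoffDelayMs` is a hypothetical helper, the 30-second cap is an arbitrary assumption, and unlike the loop above it starts at the base delay for attempt 1.

```typescript
// Exponential backoff with a cap and full jitter.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number, // 1-based attempt number
  baseMs = 1000,
  capMs = 30_000,
  random: () => number = Math.random
): number {
  // Grows as base, 2x base, 4x base, ... up to the cap
  const exponential = Math.min(capMs, baseMs * 2 ** (attempt - 1));
  // Full jitter: pick uniformly from [0, exponential)
  return Math.floor(random() * exponential);
}

// With a fixed "random" value of 0.5 the delays are easy to follow:
const half = () => 0.5;
console.log(backoffDelayMs(1, 1000, 30_000, half)); // prints 500
console.log(backoffDelayMs(3, 1000, 30_000, half)); // prints 2000
console.log(backoffDelayMs(10, 1000, 30_000, half)); // prints 15000 (capped at 30s, then halved)
```

In the retry loop, this would replace the `Math.pow(2, attempt) * 1000` expression.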

Cost Management

Monitor and control your API costs:

// Track token usage
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: prompt }],
});

console.log('Input tokens:', message.usage.input_tokens);
console.log('Output tokens:', message.usage.output_tokens);

// Estimate cost (Sonnet list prices at the time of writing: $3 input, $15 output per million tokens; check the current pricing page)
const inputCost = (message.usage.input_tokens / 1_000_000) * 3;
const outputCost = (message.usage.output_tokens / 1_000_000) * 15;
console.log(`Estimated cost: $${(inputCost + outputCost).toFixed(4)}`);
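The same arithmetic is easier to reuse if the prices are passed in rather than hard-coded. Here is a small sketch, where `estimateCostUsd` and the `Pricing` shape are hypothetical names and the $3 / $15 figures are simply the ones from the snippet above.

```typescript
// Mirrors the two usage fields read in the snippet above
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

// Prices in USD per million tokens
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const inputCost = (usage.input_tokens / 1_000_000) * pricing.inputPerMTok;
  const outputCost = (usage.output_tokens / 1_000_000) * pricing.outputPerMTok;
  return inputCost + outputCost;
}

// Figures taken from the example above; verify against current pricing
const sonnetPricing: Pricing = { inputPerMTok: 3, outputPerMTok: 15 };

// 2,000 input tokens  -> 0.002 * 3   = $0.0060
//   500 output tokens -> 0.0005 * 15 = $0.0075
const cost = estimateCostUsd({ input_tokens: 2_000, output_tokens: 500 }, sonnetPricing);
console.log(`$${cost.toFixed(4)}`); // prints "$0.0135"
```

Output tokens cost five times as much as input tokens at these prices, so a long response usually matters more than a long prompt.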

Tips for Reducing Token Usage

Be specific in system prompts — shorter, focused instructions use fewer tokens
Use max_tokens wisely — set it to the expected response length, not the maximum
Truncate large inputs — send relevant code sections, not entire files
Choose the right model — use Haiku for simple tasks, Sonnet for most work, Opus for complex reasoning
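For the "truncate large inputs" tip, here is a rough sketch of one way to cap an input such as a diff while keeping both its start and its end. `truncateInput` is a hypothetical helper, and it counts characters as a cheap proxy for tokens, which is only an approximation.

```typescript
const TRUNCATION_MARKER = '\n[... truncated ...]\n';

// Keep the head and tail of `text` so the result is at most `maxChars` long
// (assuming maxChars is at least the marker length).
function truncateInput(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;

  const keep = Math.max(0, maxChars - TRUNCATION_MARKER.length);
  const head = Math.ceil(keep / 2);
  const tail = keep - head;

  // slice(-0) would return the whole string, so guard the tail explicitly
  const tailText = tail > 0 ? text.slice(-tail) : '';
  return text.slice(0, head) + TRUNCATION_MARKER + tailText;
}

// Hypothetical 200-character input
const longInput = 'a'.repeat(100) + 'b'.repeat(100);
const shortened = truncateInput(longInput, 60);
console.log(shortened.length); // prints 60
```

Cutting blindly in the middle can split a diff hunk, so for code review it is often better to select whole files or hunks first and use a cap like this only as a last resort.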

Pro Tips

Always validate JSON responses from Claude. Use zod or ajv to validate the schema before trusting the output.
Set reasonable max_tokens for each use case. Code reviews need 2-4K, summaries need 500-1K, classifications need 100-200.
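Use the system prompt for behavior, not the user message. System prompt tokens are billed as ordinary input tokens, but a stable system prompt keeps behavior consistent across requests and is a natural candidate for prompt caching if you enable it.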
Implement request timeouts for production APIs. A 30-second timeout prevents requests from hanging indefinitely.
Log every API call in production — token counts, latency, model version. You need this data for cost optimization.
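For the timeout tip, the SDK client accepts a `timeout` option, which is usually the better choice because it aborts the request itself. When you need a deadline around arbitrary async work (for example, a tool call plus the follow-up request), a generic wrapper is one option. `withTimeout` below is a hypothetical helper. Note that it only stops waiting; it does not cancel the underlying work.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });

  // Whichever settles first wins; always clear the timer afterwards so it
  // does not keep the process alive.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}

// Usage sketch with a stand-in for a slow API call
async function demo(): Promise<string> {
  const slowCall = new Promise<string>((resolve) =>
    setTimeout(() => resolve('done'), 200)
  );
  try {
    return await withTimeout(slowCall, 10);
  } catch (error) {
    return error instanceof Error ? error.message : String(error);
  }
}

demo().then((result) => console.log(result)); // prints "Timed out after 10ms"
```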

Key Takeaways

The Messages API is the foundation: model, messages array, max_tokens
Use streaming for real-time UIs and long responses
Tool use lets Claude call your functions with structured arguments
Always implement retries with exponential backoff for production apps
Track token usage for cost management
Choose the right model for each task to balance quality and cost

