
Integrating the Anthropic Claude API into a Next.js App

A practical guide to adding Claude AI to your Node.js or Next.js application — covering streaming responses, tool use, prompt caching, and patterns that work in production.

Deepak Kaushal·April 10, 2026·6 min read

Anthropic's Claude API is one of the most capable LLM APIs available. This guide walks through integrating it into a Next.js application — from basic completions to streaming, tool use, and prompt caching — using patterns I've applied in production AI products.

Installation and Basic Setup

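Install the Anthropic SDK and store your API key in a server-side environment variable such as ANTHROPIC_API_KEY (in Next.js, typically in .env.local and without the NEXT_PUBLIC_ prefix). The key must never be exposed to the browser, so always call the API from a Next.js Route Handler or Server Action.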

npm install @anthropic-ai/sdk

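// app/api/chat/route.js
import Anthropic from '@anthropic-ai/sdk';

// Created once per server instance; the key is read from the server-side environment only.
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

export async function POST(req) {
  const { message } = await req.json();
  // Reject empty or non-string input before spending tokens on it.
  if (typeof message !== 'string' || message.trim() === '') {
    return Response.json({ error: 'message must be a non-empty string' }, { status: 400 });
  }
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    messages: [{ role: 'user', content: message }],
  });
  // content is an array of blocks; join the text blocks instead of assuming index 0 is text.
  const text = response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
  return Response.json({ text });
}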

Streaming Responses for Better UX

For conversational UI, streaming is essential. Instead of waiting for the full response, stream tokens to the client as they're generated. In Next.js, return a ReadableStream from your Route Handler and read it on the client from the fetch response body. Users see the first words as soon as the model starts generating, rather than after the whole reply is complete, which matters a great deal for perceived responsiveness.
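The streaming flow described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the object returned by the SDK's client.messages.stream(...) is an async iterable of events, and that text arrives in content_block_delta events whose delta has type text_delta. The helper names textStreamFrom and readTextStream, and the route path in the comment, are hypothetical.

```javascript
// Server side: turn an async iterable of Claude stream events into a byte
// stream containing only the generated text.
// `events` is assumed to be the value returned by client.messages.stream({...}).
function textStreamFrom(events) {
  const encoder = new TextEncoder();
  const iterator = events[Symbol.asyncIterator]();
  return new ReadableStream({
    // pull() runs only when the consumer wants more data, so a slow client
    // applies backpressure instead of buffering the whole reply in memory.
    async pull(controller) {
      while (true) {
        const { done, value } = await iterator.next();
        if (done) {
          controller.close();
          return;
        }
        // Only text deltas carry user-visible output; other events are skipped.
        if (value.type === 'content_block_delta' && value.delta.type === 'text_delta') {
          controller.enqueue(encoder.encode(value.delta.text));
          return;
        }
      }
    },
    // If the browser disconnects, stop consuming the upstream events.
    async cancel() {
      await iterator.return?.();
    },
  });
}

// Route Handler wiring (hypothetical file app/api/chat/stream/route.js):
//   export async function POST(req) {
//     const { message } = await req.json();
//     const events = client.messages.stream({
//       model: 'claude-sonnet-4-6',
//       max_tokens: 1024,
//       messages: [{ role: 'user', content: message }],
//     });
//     return new Response(textStreamFrom(events), {
//       headers: { 'Content-Type': 'text/plain; charset=utf-8' },
//     });
//   }

// Client side: read chunks as they arrive and hand each piece of text to onText.
// `body` is the `body` property of a fetch() response.
async function readTextStream(body, onText) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } keeps multi-byte characters intact across chunk boundaries.
    onText(decoder.decode(value, { stream: true }));
  }
}
```

In a React component, onText would typically append to a piece of state so the reply grows on screen as chunks arrive.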

Prompt Caching for Cost Reduction

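If your application sends the same long system prompt on every request (documentation, a product catalogue, a knowledge base), prompt caching can cut the cost of those repeated input tokens by up to 90% and reduce latency noticeably. Mark the end of the static part of your prompt with a cache_control breakpoint. Anthropic caches the prompt prefix up to that point server-side and reuses it across requests for a short time window (five minutes by default). The first request pays a small premium to write the cache, and later requests read from it at a reduced rate. It is one of the most impactful optimisations for production AI apps.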
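As a rough illustration of where the breakpoint goes, the sketch below builds the request parameters with the system prompt split into text blocks, and marks the large static block with cache_control. The helper names buildCachedRequest and cacheStats, and the instruction text, are hypothetical. The usage field names are the ones the API is assumed to report for cache writes and reads.

```javascript
// Builds parameters for client.messages.create() with a cacheable system prompt.
// Everything up to and including the block carrying cache_control forms the
// cached prefix, so the static material goes first and per-request content last.
function buildCachedRequest(staticContext, userMessage) {
  return {
    model: 'claude-sonnet-4-6', // model ID taken from the article's example
    max_tokens: 1024,
    system: [
      { type: 'text', text: 'You are a helpful support assistant.' },
      {
        type: 'text',
        text: staticContext, // e.g. documentation or a product catalogue
        cache_control: { type: 'ephemeral' },
      },
    ],
    messages: [{ role: 'user', content: userMessage }],
  };
}

// Summarises how many input tokens were written to or read from the cache.
// A non-zero `read` value on later requests suggests the cache is being hit.
function cacheStats(usage) {
  return {
    written: usage.cache_creation_input_tokens ?? 0,
    read: usage.cache_read_input_tokens ?? 0,
    uncached: usage.input_tokens ?? 0,
  };
}
```

Calling cacheStats(response.usage) after each request is a simple way to check that caching is taking effect. Very short prompts may fall below the minimum cacheable length and will not be cached at all.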

Tool Use (Function Calling)

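Claude's tool use lets the model call external functions (search, databases, APIs) and incorporate the results into its response. Define each tool with a name, a description, and a JSON Schema for its input, and pass the list in the API call. Then implement a loop: when the response has stop_reason "tool_use", run the function named in each tool_use content block, send the output back as a tool_result block in a user message, and call the API again until Claude returns a final answer. This is how you build AI assistants that can take actions, not just generate text.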
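The loop described above might look something like the following sketch. The tool get_order_status, the handlers map, the runToolLoop helper, and the maxTurns cap are all hypothetical names chosen for illustration. The client is passed in as a parameter and is assumed to be an Anthropic SDK instance.

```javascript
const MODEL = 'claude-sonnet-4-6'; // model ID taken from the article's example

// Hypothetical tool definition: a name, a description, and a JSON Schema for the input.
const tools = [
  {
    name: 'get_order_status',
    description: 'Look up the shipping status of an order by its ID.',
    input_schema: {
      type: 'object',
      properties: { order_id: { type: 'string', description: 'The order ID' } },
      required: ['order_id'],
    },
  },
];

// Calls Claude repeatedly, executing requested tools, until it gives a final answer.
// `handlers` maps tool names to async functions that receive the tool input.
async function runToolLoop(client, handlers, userMessage, maxTurns = 5) {
  const messages = [{ role: 'user', content: userMessage }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const response = await client.messages.create({
      model: MODEL,
      max_tokens: 1024,
      tools,
      messages,
    });
    if (response.stop_reason !== 'tool_use') {
      // Final answer: join the text blocks.
      return response.content
        .filter((block) => block.type === 'text')
        .map((block) => block.text)
        .join('');
    }
    // Echo the assistant turn back so each tool_use id can be matched to its result.
    messages.push({ role: 'assistant', content: response.content });
    const results = [];
    for (const block of response.content) {
      if (block.type !== 'tool_use') continue;
      let content;
      let isError = false;
      try {
        const handler = handlers[block.name];
        if (!handler) throw new Error(`Unknown tool: ${block.name}`);
        content = JSON.stringify(await handler(block.input));
      } catch (err) {
        // Report the failure to the model instead of crashing the request.
        content = String(err.message);
        isError = true;
      }
      results.push({
        type: 'tool_result',
        tool_use_id: block.id,
        content,
        is_error: isError,
      });
    }
    messages.push({ role: 'user', content: results });
  }
  throw new Error('Tool loop exceeded maxTurns');
}
```

The maxTurns cap is a design choice rather than an API requirement: it keeps a misbehaving tool or prompt from producing an unbounded chain of billable calls.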

Production Considerations

Rate limiting, retry logic, and error handling are critical in production. The Anthropic SDK automatically retries certain failures (such as connection errors, rate-limit responses, and 5xx errors) with exponential backoff, and the retry count is configurable through the maxRetries client option. Implement per-user rate limits on your backend to prevent abuse. Log every API call with its token counts for cost monitoring. Anthropic's pricing is per token, so understanding usage patterns directly impacts your margins.
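To make the per-user limit and token logging concrete, here is a small sketch. The fixed-window limiter keeps its counters in process memory, which is an assumption that only holds for a single server instance. Deployments with several instances or serverless functions would need a shared store instead. The names createRateLimiter and usageLogEntry are hypothetical.

```javascript
// Fixed-window rate limiter: at most `limit` requests per user per `windowMs`.
// `now` is injectable so the behaviour can be tested without real timers.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // userId -> { start, count }
  return function allow(userId) {
    const t = now();
    const current = windows.get(userId);
    if (!current || t - current.start >= windowMs) {
      // First request, or the previous window has expired: start a new window.
      windows.set(userId, { start: t, count: 1 });
      return true;
    }
    if (current.count >= limit) return false;
    current.count += 1;
    return true;
  };
}

// Builds a structured log record from a Messages API response for cost tracking.
function usageLogEntry(userId, response) {
  const { input_tokens = 0, output_tokens = 0 } = response.usage ?? {};
  return {
    userId,
    model: response.model,
    input_tokens,
    output_tokens,
    total_tokens: input_tokens + output_tokens,
  };
}

// Usage inside a Route Handler (sketch):
//   const allow = createRateLimiter({ limit: 20, windowMs: 60_000 });
//   if (!allow(userId)) {
//     return Response.json({ error: 'Too many requests' }, { status: 429 });
//   }
//   const response = await client.messages.create({ ... });
//   console.log(JSON.stringify(usageLogEntry(userId, response)));
```

Writing the log entry as JSON makes it straightforward to aggregate token totals per user or per model later.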

Need AI Integration Help?

I've built production AI features using OpenAI, Anthropic Claude, and Google Gemini — from simple completions to multi-agent workflows with tool use and RAG pipelines over private data. If you need AI integrated into your product, get in touch.
