Node.js SDK
Install and use the official Deepcrawl JavaScript/TypeScript SDK client in Node runtimes.
Deepcrawl’s SDK wraps every API endpoint with typed methods, retry helpers, and structured errors. Use it anywhere you can run server-side JavaScript—Next.js route handlers, serverless functions, edge workers (Node-compatible), CLI tools, or background jobs.
Requirements
- Node.js 18.17+ (Node 20+ recommended for built-in `fetch` and `AbortController`).
- Access to a Deepcrawl API key (see Quick Start).
- Environment variables stored securely (local `.env`, platform secrets, etc.).
`DeepcrawlApp` is server-only. Do not import it inside client-side components or ship it in the browser bundle.
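In Next.js projects, one way to enforce this at build time is the `server-only` package, which fails the build if the module ever reaches a client bundle. A minimal sketch (assumes you have installed `server-only`):

```ts
// src/lib/deepcrawl.ts
import 'server-only'; // build-time guard: errors if imported from client code
import { DeepcrawlApp } from 'deepcrawl';

export const deepcrawl = new DeepcrawlApp({
  apiKey: process.env.DEEPCRAWL_API_KEY as string,
});
```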
Install the package
```bash
pnpm add deepcrawl
# or: npm install deepcrawl
# or: yarn add deepcrawl
# or: bun add deepcrawl
```
Step 1. Configure environment variables
Choose where to persist your API key (shell export, `.env` file, secret manager). For Next.js, use `.env.local` and access it via `process.env` in server code.
Optional: declare a custom `DEEPCRAWL_API_URL` if you self-host the workers. Leave it unset to use `https://api.deepcrawl.dev`.
Restart your dev server so the new environment variables load.
```
# .env.local (server only)
DEEPCRAWL_API_KEY=dc_your_API_key

# Optional for self-hosted API deployment
DEEPCRAWL_API_URL=https://your-worker.example.com
```
Step 2. Initialize the client
```ts
// src/lib/deepcrawl.ts
import { DeepcrawlApp } from 'deepcrawl';

export const deepcrawl = new DeepcrawlApp({
  apiKey: process.env.DEEPCRAWL_API_KEY as string,
  baseUrl: process.env.DEEPCRAWL_API_URL, // optional, for self-hosted API deployments
});
```
The constructor automatically sets headers (`Authorization`, `x-api-key`, `User-Agent`), negotiates retries for read/link endpoints, and uses keep-alive HTTPS agents in Node runtimes. You can pass a custom `fetch` (for polyfills) or `fetchOptions` to tweak timeouts.
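For example, a client whose requests abort after 15 seconds, built on the `fetch` option mentioned above; the wrapper below is a sketch, not part of the SDK:

```ts
import { DeepcrawlApp } from 'deepcrawl';

// Sketch: wrap the global fetch so every SDK request aborts after 15s.
// Note: this overrides any caller-provided signal, which is fine for a sketch.
const fetchWithTimeout: typeof fetch = (input, init) =>
  fetch(input, { ...init, signal: AbortSignal.timeout(15_000) });

export const deepcrawl = new DeepcrawlApp({
  apiKey: process.env.DEEPCRAWL_API_KEY as string,
  fetch: fetchWithTimeout,
});
```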
Basic usage
```ts
import { deepcrawl } from '@/lib/deepcrawl';

export async function getMarkdown(url: string) {
  const markdown = await deepcrawl.getMarkdown(url);
  // const markdown = await deepcrawl.getMarkdown(url, { ...options }); // with more options
  return markdown; // string (clean markdown)
}
```

```ts
import { deepcrawl } from '@/lib/deepcrawl';

export async function readUrl(url: string) {
  const result = await deepcrawl.readUrl(url);
  // const result = await deepcrawl.readUrl(url, { ...options }); // with more options
  return result;
}
```

```ts
import { deepcrawl } from '@/lib/deepcrawl';

export async function extractLinks(url: string) {
  const result = await deepcrawl.extractLinks(url);
  // const result = await deepcrawl.extractLinks(url, { ...options }); // with more options
  if ('tree' in result && result.tree) {
    // TypeScript infers LinksSuccessResponseWithTree
    console.log(result.tree.metadata?.title);
  } else {
    // TypeScript infers LinksSuccessResponseWithoutTree
    console.log(result.title);
  }
  return result;
}
```
Sharing one client instance
- Instantiate the client once per process (singleton module export) to reuse HTTPS agents.
- In serverless/edge functions, create the client outside the handler to benefit from warm invocations.
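A sketch of the second pattern for a Node-based serverless runtime where `process.env` is available at module load (for example AWS Lambda or Vercel Functions). On platforms that inject secrets per request, such as Cloudflare Workers, construct the client inside the handler instead, as the Hono and Worker examples below do:

```ts
import { DeepcrawlApp } from 'deepcrawl';

// Module scope: survives across warm invocations, so HTTPS agents are reused.
const deepcrawl = new DeepcrawlApp({
  apiKey: process.env.DEEPCRAWL_API_KEY as string,
});

// Handler scope: runs once per invocation and reuses the shared client.
export async function handler(event: { url: string }) {
  const markdown = await deepcrawl.getMarkdown(event.url);
  return { statusCode: 200, body: JSON.stringify({ markdown }) };
}
```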
Handling errors and retries
```ts
import {
  DeepcrawlAuthError,
  DeepcrawlRateLimitError,
  DeepcrawlReadError,
} from 'deepcrawl';
import { deepcrawl } from '@/lib/deepcrawl';

export async function safeGetMarkdown(url: string) {
  try {
    return await deepcrawl.getMarkdown(url);
  } catch (error) {
    if (error instanceof DeepcrawlAuthError) {
      // Missing/invalid API key
      throw new Error('Check Deepcrawl credentials');
    }
    if (error instanceof DeepcrawlRateLimitError) {
      const retryAfter = error.data?.retryAfter ?? 60;
      throw new Error(`Rate limited, retry in ${retryAfter}s`);
    }
    if (error instanceof DeepcrawlReadError) {
      // Access error.data for the original request payload
      console.error('Read error:', error.data);
    }
    throw error;
  }
}
```
- The SDK retries transient network errors and certain read/link operations up to two times. For custom logic, wrap calls with your own retry policy (see the sketch below).
- All error classes inherit from `DeepcrawlError`, so you can add a fallback handler.
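A sketch of such a policy that honors the `retryAfter` hint from rate-limit errors; the helper name and attempt count are illustrative, not part of the SDK:

```ts
import { DeepcrawlRateLimitError } from 'deepcrawl';
import { deepcrawl } from '@/lib/deepcrawl';

// Hypothetical helper: retry on rate limits, waiting out the server's hint.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  for (let i = 1; ; i++) {
    try {
      return await fn();
    } catch (error) {
      if (i >= attempts || !(error instanceof DeepcrawlRateLimitError)) throw error;
      const delaySeconds = error.data?.retryAfter ?? 2 ** i; // fall back to exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delaySeconds * 1000));
    }
  }
}

const markdown = await withRetry(() => deepcrawl.getMarkdown('https://example.com'));
```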
Framework examples
If you want to expose Deepcrawl as a service inside your own system, the following examples show common setups.
```ts
// app/api/markdown/route.ts
import { NextResponse } from 'next/server';
import { deepcrawl } from '@/lib/deepcrawl';

export async function POST(req: Request) {
  const { url } = await req.json();
  const markdown = await deepcrawl.getMarkdown(url);
  return NextResponse.json({ markdown });
}
```

```ts
// src/index.ts (Hono)
import { Hono } from 'hono';
import { DeepcrawlApp } from 'deepcrawl';

type Bindings = {
  DEEPCRAWL_API_KEY: string;
};

const app = new Hono<{ Bindings: Bindings }>();

app.post('/read', async (c) => {
  // Bindings are per-request in Workers, so construct the client inside the handler.
  const deepcrawl = new DeepcrawlApp({
    apiKey: c.env.DEEPCRAWL_API_KEY,
  });
  const { url } = await c.req.json();
  try {
    const result = await deepcrawl.readUrl(url);
    return c.json(result);
  } catch (error) {
    return c.json({ error: (error as Error).message }, 500);
  }
});

export default app;
```

```ts
// src/index.ts (Cloudflare Worker)
import { DeepcrawlApp } from 'deepcrawl';

interface Env {
  DEEPCRAWL_API_KEY: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 });
    }
    const deepcrawl = new DeepcrawlApp({
      apiKey: env.DEEPCRAWL_API_KEY,
    });
    try {
      const { url } = await request.json();
      const markdown = await deepcrawl.getMarkdown(url);
      return new Response(JSON.stringify({ markdown }), {
        headers: { 'Content-Type': 'application/json' },
      });
    } catch (error) {
      return new Response(
        JSON.stringify({ error: (error as Error).message }),
        { status: 500, headers: { 'Content-Type': 'application/json' } }
      );
    }
  },
};
```

```ts
import express from 'express';
import { deepcrawl } from './lib/deepcrawl';

const app = express();
app.use(express.json());

app.post('/api/read', async (req, res) => {
  try {
    const result = await deepcrawl.readUrl({
      url: req.body.url,
      metadata: true,
    });
    res.json(result);
  } catch (error) {
    res.status(500).json({ error: (error as Error).message });
  }
});

app.listen(3000);
```
Client Options Overview
```ts
import { DeepcrawlApp } from 'deepcrawl';

export const deepcrawl = new DeepcrawlApp({
  apiKey: process.env.DEEPCRAWL_API_KEY as string,
  baseUrl: process.env.DEEPCRAWL_API_URL, // optional self-hosted origin
  ...deepcrawlConfig, // see below
});
```
You can override `fetch`, `fetchOptions`, or `headers` during construction if you need custom agents, proxies, or tracing metadata.
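For instance, attaching a static tracing header to every request; the header name is an example, not something the API requires:

```ts
import { DeepcrawlApp } from 'deepcrawl';

export const deepcrawl = new DeepcrawlApp({
  apiKey: process.env.DEEPCRAWL_API_KEY as string,
  // Illustrative header; use whatever your observability stack expects.
  headers: { 'x-service-name': 'content-pipeline' },
});
```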
DeepcrawlConfig
See the SDK reference for the full list of `DeepcrawlConfig` props and types.
DeepcrawlFetchOptions
See the SDK reference for the full list of `DeepcrawlFetchOptions` props and types.
Base Hosts & Authentication
- Production host: `https://api.deepcrawl.dev`.
- Self-hosted Worker: pass `baseUrl` to `new DeepcrawlApp({ baseUrl })` or send requests directly to your Worker origin.
- Authentication: send `Authorization: Bearer <DEEPCRAWL_API_KEY>` or `x-api-key: <DEEPCRAWL_API_KEY>` (see the raw request sketch below). Dashboard sessions may also forward signed cookies; prefer API keys for automation.
- Content types: POST endpoints expect JSON (UTF-8); GET endpoints accept query parameters that map to the option types below.
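A sketch of a raw request without the SDK; the `/read` path is an assumption here, so check the API reference for the exact endpoint:

```ts
const baseUrl = process.env.DEEPCRAWL_API_URL ?? 'https://api.deepcrawl.dev';

// Assumed endpoint path; consult the API reference before relying on it.
const response = await fetch(`${baseUrl}/read`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.DEEPCRAWL_API_KEY}`, // or 'x-api-key'
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ url: 'https://example.com' }),
});
const result = await response.json();
```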
Rate limits & retries
- Workers return `429` with code `RATE_LIMITED` and a `retryAfter` duration (seconds).
- The SDK automatically retries idempotent operations (`getMarkdown`, `readUrl`, `extractLinks`) with exponential backoff unless the error is explicitly typed.
- Enable caching (`cacheOptions`) to reduce repeated calls and token costs, as in the sketch below.
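A sketch of a cached read; `cacheOptions` comes from the SDK, but the fields shown (`enabled`, `ttl`) are guesses, so check the option types in the SDK reference:

```ts
import { deepcrawl } from '@/lib/deepcrawl';

// Hypothetical cacheOptions fields; consult the SDK reference for the real shape.
const result = await deepcrawl.readUrl('https://example.com', {
  cacheOptions: { enabled: true, ttl: 3600 }, // illustrative: cache for one hour
});
```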
What’s next
- Explore the SDK reference for every method, type, and option.
- Check the Playground guide to prototype requests before porting them to code.
- Combine with the Logs API to inspect past runs, collect analytics, or export markdown snapshots.