Deepcrawl

Get Links

Fetch extracted links for a page via GET /links.

getLinks is the companion to extractLinks that you can call directly from your browser URL bar. It shares the same schema but is served over a GET request, which makes it ideal for quick lookups or copy-paste-ready results where you don't need to POST a body.

No prerequisites required

Link extraction works by parsing the actual HTML content of your target page; no sitemap.xml, robots.txt, or other configuration files are needed. Deepcrawl discovers links by analyzing the page structure, so it works on any website regardless of its SEO setup.
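To make the idea of HTML-based discovery concrete, here is a minimal sketch that pulls anchor hrefs out of raw HTML and sorts them into internal and external links. It is not Deepcrawl's actual parser, which this page does not show; the function name sketchExtractLinks and the regex-based approach are assumptions chosen only for illustration.

```typescript
// Illustrative only: a naive regex scan, NOT Deepcrawl's real implementation.
// sketchExtractLinks is a hypothetical name used for this example.
function sketchExtractLinks(
  html: string,
  pageUrl: string,
): { internal: string[]; external: string[] } {
  const base = new URL(pageUrl);
  const internal = new Set<string>();
  const external = new Set<string>();

  // Match the href attribute of <a> tags (quoted values only).
  const hrefPattern = /<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;

  while ((match = hrefPattern.exec(html)) !== null) {
    let resolved: URL;
    try {
      // Resolve relative hrefs against the page URL.
      resolved = new URL(match[1], base);
    } catch {
      continue; // skip hrefs that cannot be parsed as URLs
    }
    // Ignore non-web schemes such as mailto: or javascript:.
    if (resolved.protocol !== "http:" && resolved.protocol !== "https:") continue;

    // Same hostname as the page counts as internal, anything else as external.
    (resolved.hostname === base.hostname ? internal : external).add(resolved.href);
  }

  return { internal: Array.from(internal), external: Array.from(external) };
}

const sampleHtml =
  '<a href="/docs/api/hono">API</a> <a href="https://github.com/honojs/hono">GitHub</a>';
console.log(sketchExtractLinks(sampleHtml, "https://hono.dev/"));
```

A production extractor would use a real HTML parser rather than a regex; the point here is only that links can be discovered from the page itself, with no sitemap involved.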

The abbreviated snapshot below comes from a real crawl of hono.dev. If you are already logged into the dashboard, click this URL in your browser to see the raw response, or try it out here in the playground.

https://hono.dev/docs/concepts/motivation
https://hono.dev/docs/concepts/routers
https://hono.dev/docs/getting-started/basic
https://hono.dev/docs/api/hono
https://hono.dev/examples/web-api
https://hono.dev/examples/proxy
https://hono.dev/llms.txt
https://hono.dev/llms-full.txt
https://hono.dev/llms-small.txt

When to use this endpoint

  • You want to preview link extraction results without constructing a POST payload.
  • You’re integrating from environments that only allow GET (e.g., low-code tooling, browser tabs using cookie auth).
  • You don’t need to modify request options beyond query parameters.

For deeper customization (tree generation, exclusion patterns, metadata, metrics), use extractLinks.

Request format

curl \
  -H "Authorization: Bearer $DEEPCRAWL_API_KEY" \
  "https://api.deepcrawl.dev/links?url=https://example.com"

  • Optionally, add query parameters mirroring the POST options (tree, metadata, cleanedHtml, link extraction toggles, etc.).

import { DeepcrawlApp } from 'deepcrawl';

const deepcrawl = new DeepcrawlApp({
  apiKey: process.env.DEEPCRAWL_API_KEY as string,
});

// getLinksOptions is a placeholder for any GetLinksOptions fields you want to set.
const links = await deepcrawl.getLinks('https://example.com', {
  ...getLinksOptions,
});

  • Responses default to the same structure as extractLinks: tree data when tree=true, otherwise only extracted links.

Query parameters - GetLinksOptions

  • Even though this is a GET request, the available fields align with GetLinksOptions. Send them as query parameters (e.g., tree=true, metadata=true, includeExternal=false).
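
As a rough illustration of sending options as query parameters, the sketch below builds a GET /links URL with the standard URL API. The helper name buildGetLinksUrl and the LinkQueryOptions type are hypothetical; the endpoint and the option names (tree, metadata, includeExternal) are taken from the examples on this page.

```typescript
// Hypothetical helper for illustration; not part of the Deepcrawl SDK.
type LinkQueryOptions = Record<string, string | number | boolean | undefined>;

function buildGetLinksUrl(targetUrl: string, options: LinkQueryOptions = {}): string {
  const endpoint = new URL("https://api.deepcrawl.dev/links");

  // The target page goes in the url parameter; searchParams percent-encodes it.
  endpoint.searchParams.set("url", targetUrl);

  // Every defined option becomes a plain key=value query parameter.
  for (const [key, value] of Object.entries(options)) {
    if (value !== undefined) {
      endpoint.searchParams.set(key, String(value));
    }
  }
  return endpoint.toString();
}

const requestUrl = buildGetLinksUrl("https://example.com", {
  tree: true,
  metadata: true,
  includeExternal: false,
});
console.log(requestUrl);
```

Letting URLSearchParams handle the encoding avoids hand-escaping the target URL, which matters as soon as that URL carries its own query string.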

Response structure - GetLinksResponse

  • This is a union of two shapes:

    1. GetLinksResponseWithTree (when tree is enabled in options) – includes a tree hierarchy you can traverse, and metadata is nested in the tree node.
    2. GetLinksResponseWithoutTree (when tree is false in options) – omits tree, returning only extracted links and metadata.

To narrow the type safely, check if ('tree' in response && response.tree) before reading the tree.
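
The narrowing check can be sketched as follows. The interfaces here are simplified, hypothetical stand-ins for GetLinksResponseWithTree and GetLinksResponseWithoutTree, reduced to a few of the fields shown in the example responses on this page; the real SDK types may differ.

```typescript
// Simplified stand-ins for the SDK response types (assumptions, not the real definitions).
interface SketchTreeNode {
  url: string;
  children?: SketchTreeNode[];
}

interface SketchResponseWithTree {
  success: true;
  targetUrl: string;
  tree: SketchTreeNode;
}

interface SketchResponseWithoutTree {
  success: true;
  targetUrl: string;
  extractedLinks: { internal: string[]; external: string[] };
}

type SketchLinksResponse = SketchResponseWithTree | SketchResponseWithoutTree;

// Collect every URL in a tree, parent first, then children in order.
function walkTree(node: SketchTreeNode, out: string[]): void {
  out.push(node.url);
  for (const child of node.children ?? []) {
    walkTree(child, out);
  }
}

function collectUrls(response: SketchLinksResponse): string[] {
  // The 'in' check narrows the union to the tree-bearing shape.
  if ("tree" in response && response.tree) {
    const urls: string[] = [];
    walkTree(response.tree, urls);
    return urls;
  }
  // Otherwise fall back to the flat link lists, if present.
  if ("extractedLinks" in response) {
    return [...response.extractedLinks.internal, ...response.extractedLinks.external];
  }
  return [];
}
```

With this pattern the compiler only allows access to tree inside the first branch, so code written against the flat shape cannot accidentally read a missing tree.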

Example response:

  • With tree:
{
  requestId: "123e4567-e89b-12d3-a456-426614174000",
  success: true,
  cached: false,
  targetUrl: "https://example.com",
  timestamp: "2024-01-15T10:30:00.000Z",
  ancestors: ["https://example.com"],
  tree: {
    url: "https://example.com",
    name: "Home",
    lastUpdated: "2024-01-15T10:30:00.000Z",
    metadata: { title: "Example", description: "..." },
    extractedLinks: { internal: [...], external: [...] },
    children: [...]
  }
}
  • Without tree:
{
  requestId: "123e4567-e89b-12d3-a456-426614174000",
  success: true,
  cached: false,
  targetUrl: "https://example.com",
  timestamp: "2024-01-15T10:30:00.000Z",
  title: "Example Website",
  description: "Welcome to our site",
  metadata: { title: "Example", description: "..." },
  extractedLinks: { internal: [...], external: [...] }
}

Errors use the shared schema.

Tuning tips

  • Combine the GET endpoint with the Playground for rapid iteration. Once you are satisfied with the results, move to POST calls to leverage cached options.
  • Use includeExternal=true when mapping third-party references; toggle includeMedia to catalog assets.
  • Pair outputs with readUrl to fetch detailed content for the most important links.

Need richer control? Jump to extractLinks.