Metadata Indexer

Metadata Indexer

An open source and self-hostable service that indexes all the embed urls mentioned in Farcaster casts, and their metadata.

Why should you use this, and not build your own, or load the data on pageload?

  • We've put 100+ hours of work into this tool; is this a core differentiator for your app? If not, perhaps use this infra rather than building it yourself
  • You can and are free to use this even if you aren't integrating Mod.
  • Fetching metadata on user's pageload is slow (1s+) and you can't always be sure of the dimensions of the image(s), so you may accidentally introduce Cumulative Layout Shift (opens in a new tab)
  • It supports special handlers for NFT-extended open graph spec (opens in a new tab), CAIP-19 urls, Opensea, Zora, Images and Arweave files.
  • This indexer live indexes every new Farcaster cast as soon as it's propagated to our hub.
  • It supports JSON Linked Data (opens in a new tab) and gives you additional metadata about the page
  • We're already indexing and caching the metadata of over 500,000 urls shared on Farcaster.
  • It's MIT Licensed and if you don't want to rely on our APIs or are worried about it shutting down, you can always self host (opens in a new tab) it easier than you can build it yourself.

How to use

Endpoint to get the Metadata for a list of cast hashes

POST https://api.modprotocol.org/api/cast-embeds-metadata

curl --request POST \
--url https://api.modprotocol.org/api/cast-embeds-metadata \
--header 'Content-Type: application/json' \
--data '["0x15f0c75fd1735789b52c708f858816d149910c1d","0x1af019b5ad82f159efa3f2c51b4d4653604180d4"]'

This will return a JSON object with the following structure:

{
  "0x1234...abcd": [
    {
      "castHash": "0x1234...abcd",
      "url": "...",
      "normalizedUrl": "...",
      "index": 0,
      "metadata": {
        "title": "Example Title",
        "description": "Example Description",
        "image": {
          "url": "https://example.com/image.png"
        }
        // ...
      }
    }
  ]
}

Returned metadata objects conform to the UrlMetadata type. This can then be used to render embeds in a cast.

import { UrlMetadata } from "@mod-protocol/core";
 
cast.embeds.map((embed) => {
  const metadata: UrlMetadata = metadataResponse[cast.hash][embed.url];
  return <RichEmbed metadata={metadata} />;
});

Endpoint to get the Metadata for a list of urls

/api/cast-embeds-metadata/by-url

Fetching metadata from the cache by url is as simple as making a POST request to the following endpoint with a list of urls in the body.

curl --request POST \
--url https://api.modprotocol.org/api/cast-embeds-metadata/by-url \
--header 'Content-Type: application/json' \
--data '["https://google.com","https://twitter.com"]'

This will return a JSON object with the following structure:

{
  "https://google.com": {
    "image": {
      "url": "https://www.google.com/images/branding/googleg/1x/googleg_standard_color_128dp.png",
      "height": 128,
      "width": 128
    },
    "description": "Celebrate Native American Heritage Month with Google",
    "title": "Google",
    "publisher": "google.com",
    "logo": {
      "url": "https://www.google.com/favicon.ico"
    },
    "mimeType": "text/html; charset=UTF-8"
  },
  "https://twitter.com": {
    "description": "From breaking news and entertainment to sports and politics, get the full story with all the live commentary.",
    "title": "@ on Twitter",
    "publisher": "Twitter",
    "logo": {
      "url": "https://abs.twimg.com/favicons/twitter.3.ico"
    },
    "mimeType": "text/html; charset=utf-8"
  }
}

Returned metadata objects conform to the UrlMetadata type. This can then be used to render embeds in a cast.

import { UrlMetadata } from "@mod-protocol/core";
 
const metadata: UrlMetadata = metadataResponse[embed.url];
return <RichEmbed metadata={metadata} />;

Typescript type of Metadata

type UrlMetadata = {
  image?: {
    url?: string;
    width?: number;
    height?: number;
  };
  "json-ld"?: Record<string, object[]>;
  description?: string;
  alt?: string;
  title?: string;
  publisher?: string;
  logo?: {
    url: string;
  };
  customOpenGraph?: {
    "fc:frame"?: string;
    "fc:frame:image"?: string;
    "fc:frame:button:1"?: string;
    "fc:frame:button:2"?: string;
    "fc:frame:button:3"?: string;
    "fc:frame:button:4"?: string;
  };
  nft?: {
    owner?: FarcasterUser;
    tokenId?: string;
    mediaUrl?: string;
    mintUrl?: string;
    mimeType?: string;
    collection: {
      id: string;
      name: string;
      chain: string;
      contractAddress: string;
      creatorAddress: string;
      itemCount: number;
      ownerCount: number;
      mintUrl?: string;
      description?: string;
      imageUrl?: string;
      openSeaUrl?: string;
      creator?: FarcasterUser;
    };
  };
};

Refreshing the cache

It is possible to manually trigger a refresh of the cache for a given URL by making a request to the following endpoint:

POST /refresh?url=<url-encoded-url>