Massimo Luraschi

Speed up NFT indexing with Subsquid

Summary

This series of articles is aimed at providing examples of, and explanations for, some of the features of Subsquid.

This instalment showcases two things, only one of which is inherently part of the Subsquid SDK. The second is nonetheless something the SDK makes possible and easier to achieve, and since many developers might not know about it, it's worth covering.

Although the title of this article specifically mentions NFTs, the subjects treated here translate to any indexing project that needs to repeatedly invoke a smart contract function to access its state.

The topics covered by this article are Server Extensions (through custom resolvers) and the Multicall contract.

The project discussed in this article is hosted on a GitHub repository:

https://github.com/RaekwonIII/bored-ape-yacht-club-indexing

The repository is also configured so you can run the project on Gitpod:

Open in Gitpod

Introduction

As this series of articles on use cases of the Subsquid framework grows, I want to keep adding examples and covering features, including some that the average Subsquid developer might not be aware of.

This time it's the turn of Server Extensions. The Subsquid documentation has an extensive page detailing this functionality, so this article is mostly aimed at providing a practical example.

On top of this, since more examples are better than fewer, I am taking this project as a chance to show how to use external API calls to grab the token metadata from the tokenURI, similarly to how it was done in the ENS article.

The Multicall contract is also a very interesting talking point, and it should become a staple for every developer who needs to repeatedly call smart contract functions to access the contract's state. Aggregating these calls into a single one contributes to the efficiency of your indexer and maintains good overall performance.

Project setup

If you haven’t already, you need to install Subsquid CLI first:

npm i -g @subsquid/cli@latest

As with most squid ETLs, it all starts with creating a project from a template. In a terminal, launch the command:

sqd init bored-ape-yacht-club -t evm

Here, bored-ape-yacht-club is the name we are going to give to our project; you can change it to anything you like. The -t evm flag specifies which template should be used, making sure the setup creates a project from the EVM-indexing template.

Let’s also install the dependencies by launching this command:

npm i

Contract ABI

The next step is importing the contract's ABI into the project and generating TypeScript bindings for it. I found the BAYC token contract on an Ethereum block explorer and noted its address.

I then generated TypeScript boilerplate code to interact with it, thanks to Subsquid's CLI command:

sqd typegen 0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d

If you take a look at the commands.json file, specifically at the typegen shortcut defined in it, you'll see that it adds the --multicall option.

This means the command will also generate a file named multicall.ts, containing specific bindings for the Multicall contract I discussed in the introduction. These will be useful in the indexing logic.
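For reference, that typegen shortcut may look roughly like this (a sketch only; the exact entry in your generated commands.json may differ between template versions, and extra CLI arguments like the contract address are appended to the cmd when you run sqd typegen):

{
  "commands": {
    "typegen": {
      "description": "Generate TypeScript ABI bindings in src/abi",
      "cmd": ["squid-evm-typegen", "./src/abi", "--multicall"]
    }
  }
}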

Schema

For this project, I didn't steer far away from other similar NFT indexing projects I have shown in previous articles. As usual, I am interested in Transfers and Tokens, as well as the Owner of each token, and I'm also capturing Contract information.

I did, however, add the imageUrl field to the Token model, because this is part of what I wanted to achieve by using the Multicall contract. Here's the schema:

type Token @entity {
  id: ID!
  owner: Owner
  uri: String
  imageUrl: String
  transfers: [Transfer!]! @derivedFrom(field: "token")
  contract: Contract
}

type Owner @entity {
  id: ID!
  ownedTokens: [Token!] @derivedFrom(field: "owner")
}

type Contract @entity {
  id: ID!
  name: String! @index
  symbol: String!
  totalSupply: BigInt!
  tokens: [Token!]! @derivedFrom(field: "contract")
}

type Transfer @entity {
  id: ID!
  token: Token!
  from: Owner
  to: Owner
  timestamp: DateTime! @index
  block: Int! @index
  transactionHash: String!
}

Let's save this content in the schema.graphql file and update the TypeScript models, by running the command:

sqd codegen
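Once codegen has run, the generated entity classes can be imported from the model module in the processor code. A minimal sketch (the field values here are hypothetical):

import { Owner, Token } from "./model";

// Generated entity constructors accept a Partial of their fields
const owner = new Owner({ id: "0xabc".toLowerCase() });
const token = new Token({ id: "1", owner });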

Data indexing

For the data processing itself, I didn't want to create anything too complex, because it's not the focus of the project. Also, I am not going to provide the full code listing here; I'd recommend taking a look at the repository on GitHub instead.

Instead, I'm going to provide a list of functionalities that the logic accomplishes, and some code snippets to showcase how they are implemented:

  • configure the indexer to extract data generated by the BAYC smart contract and filter for the event I am interested in: Transfer
// BAYC token contract address
const contractAddress = "0xBC4CA0EdA7647A8aB7C2061c2E118A18a936f13D".toLowerCase();

// Multicall2 contract on Ethereum mainnet, used to batch contract calls
const multicallAddress = "0x5ba1e12693dc8f9c48aad8770482f4739beed696".toLowerCase();

const processor = new EvmBatchProcessor()
.setDataSource({
  chain: process.env.RPC_ENDPOINT,
  archive: "https://eth.archive.subsquid.io",
})
.addLog(contractAddress, {
  filter: [
    [
      events.Transfer.topic,
    ],
  ],
  data: {
    evmLog: {
      topics: true,
      data: true,
    },
    transaction: {
      hash: true,
    },
  },
});
  • Process the batch of events that the Subsquid Archive provides to the processor at recurring intervals: verify the event kind, use a separate function to extract the data and save it into a temporary array of data interfaces, then use another function to process this array and save the information to the database
processor.run(new TypeormDatabase(), async (ctx) => {
  const baycDataArr: BAYCData[] = [];

  for (let c of ctx.blocks) {
    for (let i of c.items) {
      if (i.address === contractAddress && i.kind === "evmLog") {
        if (i.evmLog.topics[0] === events.Transfer.topic) {
          const baycData = handleTransfer({
            ...ctx,
            block: c.header,
            ...i,
          });
          baycDataArr.push(baycData);
        }
      }
    }
  }

  await saveBAYCData(
    {
      ...ctx,
      block: ctx.blocks[ctx.blocks.length - 1].header,
    },
    baycDataArr
  );
});

type BAYCData = {
  id: string;
  from: string;
  to: string;
  tokenId: bigint;
  timestamp: Date;
  block: number;
  transactionHash: string;
};
  • Here is the handleTransfer function, responsible for decoding the Transfer event and saving its data into a BAYCData interface.
function handleTransfer(
  ctx: LogHandlerContext<
    Store,
    { evmLog: { topics: true; data: true }; transaction: { hash: true } }
  >
): BAYCData {
  const { evmLog, block, transaction } = ctx;

  const { from, to, tokenId } = events.Transfer.decode(evmLog);

  const baycData: BAYCData = {
    id: `${transaction.hash}-${evmLog.address}-${tokenId.toBigInt()}-${
      evmLog.index
    }`,
    from,
    to,
    tokenId: tokenId.toBigInt(),
    timestamp: new Date(block.timestamp),
    block: block.height,
    transactionHash: transaction.hash,
  };
  return baycData;
}

Token URI via Multicall contract

At this point, the transferred NFTs have been identified and the transfer information is stored in the baycDataArr array, but it's the saveBAYCData function that makes this information permanent.

Its logic is actually very similar to the one in the ENS tokens article, so I am not going to repeat it here. Furthermore, the code in the GitHub repository can be used as a reference (and I advise looking at it).
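For orientation, here is a heavily condensed sketch of what saveBAYCData does (the names mirror the snippets in this article; the real function in the repository does considerably more):

// Per-batch entity caches, also used by the Multicall section below
const tokens: Map<string, Token> = new Map();
const owners: Map<string, Owner> = new Map();
const tokensWithNoImage: string[] = [];

for (const data of baycDataArr) {
  // look up (or create) the Owner entities for `from` and `to`,
  // build or update the Token, and create a Transfer entity...
}

// ...then persist everything in bulk at the end of the batch
await ctx.store.save([...owners.values()]);
await ctx.store.save([...tokens.values()]);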

I am going to focus instead on the parts that make this project worth talking about. Right at the end of the function, after the loop over the baycDataArr array has ended and before saving to the database, the code section below uses the Multicall contract to fetch each tokenURI, aggregating multiple calls into a single one.

// Contract state must be queried at a specific height: use the latest block in the batch
const maxHeight = maxBy(baycDataArr, data => data.block)!.block;
// Instantiate the typegen-generated Multicall binding at the Multicall contract address
const multicall = new Multicall(ctx, {height: maxHeight}, multicallAddress);

ctx.log.info(`Calling multicall for ${baycDataArr.length} tokens...`);

// Aggregate one tokenURI call per transferred token, in pages of 100 calls per multicall
const results = await multicall.tryAggregate(
  functions.tokenURI,
  baycDataArr.map(data => [contractAddress, [BigNumber.from(data.tokenId)]] as [string, BigNumber[]]),
  100
);
results.forEach((res, i) => {
  let t = tokens.get(baycDataArr[i].tokenId.toString());
  if (t) {
    let uri = '';
    if (res.success) {
      uri = <string>res.value;
    } else if (res.returnData) {
      uri = <string>functions.tokenURI.tryDecodeResult(res.returnData) || '';
    }
    t.uri = uri;
    // Queue tokens whose image hasn't been fetched yet
    if (!tokenIdToImageUrl.has(t.id)) tokensWithNoImage.push(t.id);
  }
})

ctx.log.info(`Done`);

It's worth noting that tokenIdToImageUrl is a Map defined at the very top of the file, which acts as a "cache" of tokens we have already encountered and whose image we have already fetched.

Then, right after processing all the results from the Multicall contract, we can fetch the image URL we need with this piece of code (note: api is an instance of the Axios HTTP client, also instantiated at the top of the file):

await Promise.all(tokensWithNoImage.map( async (id) => {
  const t = tokens.get(id);
  if (t && t.uri) {
    try {
      // Fetch the token's metadata from its URI and extract the image URL
      const res = await api.get(t.uri);
      tokenIdToImageUrl.set(id, res.data.image);
      t.imageUrl = res.data.image;
    }
    catch (error) {
      console.log(error);
    }
  }
}))
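For completeness, the top-of-file declarations mentioned in the last two paragraphs might look like this (a sketch with hypothetical options; the repository may configure things differently):

import axios from "axios";

// "Cache" mapping token id -> image URL, so each token's metadata is fetched only once
const tokenIdToImageUrl = new Map<string, string>();

// Axios instance used for the metadata requests
const api = axios.create({
  timeout: 5000, // tokenURI endpoints (often IPFS gateways) can be slow
});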

Custom Resolver

The data processing implemented so far made sure I saved all the Transfers, but what if I wanted to do some aggregations? Let's imagine I wanted to know the daily total of transfers: how would I access this kind of data?

There is a common pattern offered by GraphQL servers which might be less known to Subsquid developers: writing your own custom resolver as a server extension and specifying custom queries.

I set out to create an index.ts file under src/server-extension/resolvers and built my own daily aggregations using simple SQL queries. The GraphQL server picks up resolvers exported from this location automatically, so no further registration is needed. Here is how to get the daily Transfers of BAYC NFTs:

// src/server-extension/resolvers/index.ts
import 'reflect-metadata'
import type { EntityManager } from 'typeorm'
import { Field, ObjectType, Query, Resolver } from 'type-graphql'
import { Transfer } from '../../model'

@ObjectType()
export class TransfersDayData {
    @Field(() => Date, { nullable: false })
    day!: Date

    @Field(() => Number, {nullable: false})
    count!: number

    constructor(props: Partial<TransfersDayData>) {
        Object.assign(this, props)
    }
}

@Resolver()
export class TransfersDayDataResolver {
    constructor(private tx: () => Promise<EntityManager>) {}

    @Query(()=>[TransfersDayData])
    async getTransfersDayData(): Promise<TransfersDayData[]> {
        const manager = await this.tx()
        const repository = manager.getRepository(Transfer)

        const data: {
            day: string
            count: string
        }[] = await repository.query(`
            SELECT DATE(timestamp) AS day, COUNT(*) as count
            FROM transfer
            GROUP BY day
            ORDER BY day DESC
        `)
        return data.map(
            (i) => new TransfersDayData({
                day: new Date(i.day),
                // Postgres returns COUNT(*) as a string, so convert it to a number
                count: Number(i.count)
            })
        )
    }
}

Start indexing

If you have followed along and want to try this at home, at this point you should launch the database Docker container and generate a new migration file (this will also take care of removing any existing migration files left over from the template project):

sqd up
sqd migration:generate

Launch the processor:

sqd process

And finally, in a separate terminal window, launch the GraphQL service:

sqd serve

Let’s open the browser at the address: localhost:4350/graphql and test our work so far.

Let’s find all the tokens that have a value in the uri field, and get some information about them, including the uri itself:

query MyQuery {
    tokens(limit: 10, where: {uri_isNull: false}) {
        id
        uri
        owner {
            id
        }
    }
}

But ultimately, thanks to the code section that visits this very same URL, we were able to save the image URL, which we can query:

query MyQuery {
    tokens(limit: 10, where: {imageUrl_isNull: false}) {
        id
        imageUrl
        uri
    }
}

Let’s copy the image URL from the field and paste it in another tab, and that’s our bored ape.

We can also use the server extension we built: select the day and the count, and when we execute the query, we get a list of days and the related count of transfers for each day. This information was retrieved via a custom SQL query.
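For example, using the query name defined in the resolver above:

query MyQuery {
    getTransfersDayData {
        day
        count
    }
}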

Gitpod

Another option, as mentioned at the start of the article, is to run the project on Gitpod by clicking the related link in the project’s README file, which will launch the entire project in a cloud-hosted execution environment for you.

Conclusion

In conclusion, with this article I wanted to show that it's possible to:

  • fully index the Bored Ape Yacht Club NFT collection

  • fetch their metadata, including their token URI, in a performant way, thanks to the Multicall contract

  • use API calls to these endpoints to locally save the URLs of the images

  • create our own personalized query by extending the GraphQL server and implementing a custom resolver

So, in this article, we learned that the Subsquid SDK gives you the freedom to use external libraries and perform API calls to augment your on-chain data, that the Multicall contract helps you keep the performance of your indexer in check, and that custom resolvers allow developers to introduce custom manipulations and aggregations on top of the indexed data.

If you found this article interesting and want to read more, please follow me and, most importantly, Subsquid.

Subsquid socials:

Website | Twitter | Discord | LinkedIn | Telegram | GitHub | YouTube
