Scrape OpenSea Data with Python

This blog was originally posted to Crawlbase Blog

Scraping data from OpenSea is super helpful, especially if you’re into NFTs (Non-Fungible Tokens), which have taken off in the last few years. NFTs are unique digital assets—art, collectibles, virtual goods—secured by blockchain technology. As one of the largest NFT marketplaces, OpenSea hosts millions of NFTs across categories, so it’s a go-to for collectors, investors, and developers. Whether you’re tracking trends, prices, or specific collections, having this data is gold.

But OpenSea uses JavaScript to load most of its data, so traditional scraping won’t work. That’s where the Crawlbase Crawling API comes in—it can handle JavaScript-heavy pages, so it’s the perfect solution for scraping OpenSea data.

In this post, we’ll show you how to scrape OpenSea data, collection pages, and individual NFT detail pages using Python and the Crawlbase Crawling API. Let’s get started!

Why Scrape OpenSea for NFT Data?

Scraping OpenSea can help you track and analyze valuable NFT data, including prices, trading volumes, and ownership information. Whether you’re an NFT collector, a developer building NFT-related tools, or an investor looking to understand market trends, extracting data from OpenSea gives you the insights you need to make informed decisions.

Here are some reasons why scraping OpenSea is important:

An image stating the reasons to scrape OpenSea for NFT data

  1. Track NFT Prices: Monitor individual NFT prices or entire collections over time.
  2. Analyze Trading Volumes: Understand how in-demand certain NFTs are based on sales and trading volumes.
  3. Discover Trends: Find out which NFT collections and tokens are the hottest in real time.
  4. Monitor NFT Owners: Scrape ownership data to see who owns specific NFTs or how many tokens a wallet holds.
  5. Automate Data Collection: Instead of checking OpenSea manually, you can automatically collect the data and save it in formats like CSV or JSON.

OpenSea’s website uses JavaScript rendering, so scraping it can be tricky. But with the Crawlbase Crawling API, you can handle this problem and extract the data easily.

What Data Can You Extract From OpenSea?

When scraping OpenSea, it’s important to know what data to focus on. The platform has a ton of information about NFTs (Non-Fungible Tokens), and extracting the right data will help you track performance, analyze trends, and make decisions. Here’s what to extract:

Image of key data points to extract from OpenSea

  1. NFT Name: The unique name given to each NFT, which often carries branding or collection significance.
  2. Collection Name: The NFT collection to which the individual NFT belongs. Collections usually represent sets or series of NFTs.
  3. Price: The NFT listing price. This is important for understanding market trends and determining the value of NFTs.
  4. Last Sale Price: The price the NFT was previously sold at. It gives a history for NFT market performance.
  5. Owner: The NFT's present holder (usually a wallet address).
  6. Creator: The artist or creator of the NFT. Creator information is important for tracking provenance and originality.
  7. Number of Owners: Some NFTs have multiple owners, which indicates how widely held the token is.
  8. Rarity/Attributes: Many NFTs have traits that make them unique and more desirable.
  9. Trading Volume: The overall volume of sales and transfers of the NFT or the entire collection.
  10. Token ID: The unique identifier for the NFT on the blockchain, useful for tracking specific tokens across platforms.
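
To make these data points concrete, here is a rough sketch of what a single scraped NFT record could look like as a Python dictionary. The field names and values below are purely illustrative, not an actual OpenSea response:

# Illustrative only: a hypothetical structure for one scraped NFT record.
# Field names and values are made up to mirror the data points listed above.
nft_record = {
    'nft_name': 'Example NFT #1234',                    # 1. NFT Name
    'collection_name': 'Example Collection',            # 2. Collection Name
    'price': '0.85 ETH',                                # 3. Price (current listing)
    'last_sale_price': '0.72 ETH',                      # 4. Last Sale Price
    'owner': '0xAbC...123',                             # 5. Owner (wallet address)
    'creator': '0xDeF...456',                           # 6. Creator
    'num_owners': 1,                                    # 7. Number of Owners
    'attributes': {'fur': 'blue', 'hat': 'beanie'},     # 8. Rarity/Attributes
    'trading_volume': '1,200 ETH',                      # 9. Trading Volume (collection-level)
    'token_id': '1234',                                 # 10. Token ID
}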

OpenSea Scraping with Crawlbase Crawling API

The Crawlbase Crawling API makes OpenSea data scraping easy. Since OpenSea uses JavaScript to load its content, traditional scraping methods will fail. But the Crawlbase API works like a real browser so you can get all the data you need.

Why Use Crawlbase Crawling API for OpenSea

  1. Handles Dynamic Content: The Crawlbase Crawling API can handle JavaScript-heavy pages and ensures scraping happens only after all NFT data (prices, ownership) has loaded.
  2. IP Rotation: To prevent getting blocked by OpenSea’s security, Crawlbase rotates IP addresses. So you can scrape multiple pages without worrying about rate limits or bans.
  3. Fast Performance: Crawlbase is fast and efficient for scraping large data volumes, saving you time especially when you have many NFTs and collections.
  4. Customizable Requests: You can adjust headers, cookies and other parameters to fit your scraping needs and get the data you want.
  5. Scroll-Based Pagination: Crawlbase supports scroll-based pagination so you can get more items on collection pages without having to manually click through each page.

Crawlbase Python Library

Crawlbase also offers a Python library that makes it easy to integrate Crawlbase products into your projects. You’ll need an access token, which you can get by signing up with Crawlbase.

Here’s an example to send a request to Crawlbase Crawling API:

from crawlbase import CrawlingAPI

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({'token': 'CRAWLBASE_JS_TOKEN'})

def make_crawlbase_request(url):
    response = crawling_api.get(url)

    if response['headers']['pc_status'] == '200':
        html_content = response['body'].decode('utf-8')
        return html_content
    else:
        print(f"Failed to fetch the page. Crawlbase status code: {response['headers']['pc_status']}")
        return None
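
To try the helper out, you can call it with any OpenSea URL, for example the collection page we scrape later in this post:

# Example usage (the URL is the collection page used later in this post)
html = make_crawlbase_request("https://opensea.io/collection/courtyard-nft")
if html:
    print(f"Fetched {len(html)} characters of HTML")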

Note: Crawlbase provides two types of tokens: a Normal Token for static sites and a JavaScript (JS) Token for dynamic or browser-rendered content, which is necessary for scraping OpenSea. Crawlbase also offers 1,000 free requests to help you get started, and you can sign up without a credit card. For more details, check the Crawlbase Crawling API documentation.

In the next section, we’ll set up your Python environment for scraping OpenSea effectively.

Setting Up Your Python Environment

Before scraping data from OpenSea, you need to set up your Python environment. This setup will ensure you have all the necessary tools and libraries to make your scraping process smooth and efficient. Here’s how to do it:

Installing Python and Required Libraries

Install Python: Download Python from the official website and follow the installation instructions. Make sure to check "Add Python to PATH" during installation.

Set Up a Virtual Environment (optional but recommended): This keeps your project organized. Run these commands in your terminal:

cd your_project_directory
python -m venv venv
venv\Scripts\activate  # Windows
# or
source venv/bin/activate  # macOS/Linux

Install Required Libraries: Run the following command to install necessary libraries:

pip install beautifulsoup4 crawlbase pandas
  • beautifulsoup4: For parsing and extracting data from HTML.
  • crawlbase: For using the Crawlbase Crawling API.
  • pandas: For handling and saving data in CSV format.
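
As a quick sanity check, you can confirm the libraries installed correctly by importing them (this step is optional):

# Optional sanity check: these imports should succeed if installation worked.
import bs4
import crawlbase
import pandas

print(bs4.__version__, pandas.__version__)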

Choosing an IDE

Select an Integrated Development Environment (IDE) to write your code. Popular options include PyCharm, Visual Studio Code, and Jupyter Notebook.

Now that your Python environment is set up, you’re ready to start scraping OpenSea collection pages. In the next section, we will inspect the HTML for CSS selectors.

Scraping OpenSea Collection Pages

In this section, we will scrape collection pages from OpenSea. Collection pages show various NFTs grouped under specific categories or themes. To do this efficiently, we will go through the following steps:

Inspecting the HTML for CSS Selectors

Before we write our scraper, we need to understand the structure of the HTML on OpenSea collection pages. Here’s how to find the CSS selectors:

  1. Open the Collection Page: Go to the OpenSea website and navigate to any collection page.
  2. Inspect the Page: Right-click on the page and select “Inspect” or press Ctrl + Shift + I to open the Developer Tools.

OpenSea Collection page HTML inspect

  3. Find Relevant Elements: Look for the elements that contain the NFT details. Common data points are:
  • Title: In a <span> with data-testid="ItemCardFooter-name".
  • Price: Located within a <div> with data-testid="ItemCardPrice", specifically in a nested <span> with data-id="TextBody".
  • Image URL: In an <img> tag with the image source in the src attribute.
  • Link: The NFT detail page link is in an <a> tag with the class Asset--anchor.

Writing the Collection Page Scraper

Now that we have the CSS selectors, we can write our scraper. We will use the Crawlbase Crawling API to handle JavaScript rendering via its ajax_wait and page_wait parameters. Below is the implementation of the scraper:

from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup
import pandas as pd

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({'token': 'CRAWLBASE_JS_TOKEN'})

def make_crawlbase_request(url):
    options = {
        'ajax_wait': 'true',
        'page_wait': '5000'
    }

    response = crawling_api.get(url, options)

    if response['headers']['pc_status'] == '200':
        html_content = response['body'].decode('utf-8')
        return html_content
    else:
        print(f"Failed to fetch the page. Crawlbase status code: {response['headers']['pc_status']}")
        return None

def scrape_opensea_collection(html_content):
    soup = BeautifulSoup(html_content, 'html.parser')
    data = []

    # Find all NFT items in the collection
    nft_items = soup.select('div.Asset--loaded > article.AssetSearchList--asset')

    for item in nft_items:
        title = item.select_one('span[data-testid="ItemCardFooter-name"]').text.strip() if item.select_one('span[data-testid="ItemCardFooter-name"]') else ''
        price = item.select_one('div[data-testid="ItemCardPrice"] span[data-id="TextBody"]').text.strip() if item.select_one('div[data-testid="ItemCardPrice"] span[data-id="TextBody"]') else ''
        image = item.select_one('img')['src'] if item.select_one('img') else ''
        link = item.select_one('a.Asset--anchor')['href'] if item.select_one('a.Asset--anchor') else ''

        # Add the extracted data to the list
        data.append({
            'title': title,
            'price': price,
            'image_url': image,
            'link': f"https://opensea.io{link}"  # Construct the full URL
        })

    return data

Here we initialize the Crawlbase Crawling API and create a function make_crawlbase_request to get the collection page. The function waits for any AJAX requests to complete and waits 5 seconds for the page to fully render before passing the HTML to the scrape_opensea_collection function.

In scrape_opensea_collection, we parse the HTML with BeautifulSoup and extract details about each NFT item using the CSS selectors we defined earlier. We get the title, price, image URL and link for each NFT and store this in a list which is returned to the caller.

Handling Pagination in Collection Pages

OpenSea uses scroll-based pagination, so more items load as you scroll down the page. We can use the scroll and scroll_interval parameters for this. This way we don’t need to manage pagination explicitly.

options = {
    'ajax_wait': 'true',
    'scroll': 'true',
    'scroll_interval': '20'  # Scroll for 20 seconds
}

This will make the crawler scroll for 20 seconds so we get more items.

Storing Data in a CSV File

After we scrape the data, we can store it in a CSV file. This is a common format and easy to analyze later. Here’s how:

def save_data_to_csv(data, filename='opensea_data.csv'):
    df = pd.DataFrame(data)
    df.to_csv(filename, index=False)
    print(f"Data saved to {filename}")

Complete Code Example

Here’s the complete code that combines all the steps:

from crawlbase import CrawlingAPI
import pandas as pd
from bs4 import BeautifulSoup

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({'token': 'CRAWLBASE_JS_TOKEN'})

def make_crawlbase_request(url):
    options = {
        'ajax_wait': 'true',
        'scroll': 'true',
        'scroll_interval': '20'  # Scroll for 20 seconds
    }

    response = crawling_api.get(url, options)

    if response['headers']['pc_status'] == '200':
        html_content = response['body'].decode('utf-8')
        return html_content
    else:
        print(f"Failed to fetch the page. Crawlbase status code: {response['headers']['pc_status']}")
        return None

def scrape_opensea_collection(html_content):
    soup = BeautifulSoup(html_content, 'html.parser')
    data = []

    # Find all NFT items in the collection
    nft_items = soup.select('div.Asset--loaded > article.AssetSearchList--asset')

    for item in nft_items:
        title = item.select_one('span[data-testid="ItemCardFooter-name"]').text.strip() if item.select_one('span[data-testid="ItemCardFooter-name"]') else ''
        price = item.select_one('div[data-testid="ItemCardPrice"] span[data-id="TextBody"]').text.strip() if item.select_one('div[data-testid="ItemCardPrice"] span[data-id="TextBody"]') else ''
        image = item.select_one('img')['src'] if item.select_one('img') else ''
        link = item.select_one('a.Asset--anchor')['href'] if item.select_one('a.Asset--anchor') else ''

        # Add the extracted data to the list
        data.append({
            'title': title,
            'price': price,
            'image_url': image,
            'link': f"https://opensea.io{link}"  # Construct the full URL
        })

    return data

def save_data_to_csv(data, filename='opensea_data.csv'):
    df = pd.DataFrame(data)
    df.to_csv(filename, index=False)
    print(f"Data saved to {filename}")

if __name__ == "__main__":
    url = "https://opensea.io/collection/courtyard-nft"
    html_content = make_crawlbase_request(url)

    if html_content:
        data = scrape_opensea_collection(html_content)  # Extract data from HTML content
        save_data_to_csv(data)

opensea_data.csv Snapshot:

opensea_data.csv file snapshot

Scraping OpenSea NFT Detail Pages

In this section, we will learn how to scrape NFT detail pages on OpenSea. Each NFT has its own detail page with more information, such as its title, description, price history, and other details. We will follow these steps:

Inspecting the HTML for CSS Selectors

Before we write our scraper, we need to find the HTML structure of the NFT detail pages. Here’s how to do it:

  1. Open an NFT Detail Page: Go to OpenSea and open any NFT detail page.
  2. Inspect the Page: Right-click on the page and select “Inspect” or press Ctrl + Shift + I to open the Developer Tools.

OpenSea NFT detail page HTML inspect

  3. Locate Key Elements: Search for the elements that hold the NFT details. Here are the common data points to look for:
  • Title: In an <h1> tag with class item--title.
  • Description: In a <div> tag with class item--description.
  • Price: In a <div> tag with class Price--amount.
  • Image URL: In an <img> tag inside a <div> with class media-container.
  • Link to the NFT page: The current URL of the NFT detail page.

Writing the NFT Detail Page Scraper

Now that we have our CSS selectors, we can write our scraper. We’ll use the Crawlbase Crawling API to render JavaScript. Below is an example of how to scrape data from an NFT detail page:

from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup
import pandas as pd

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({'token': 'CRAWLBASE_JS_TOKEN'})

def make_crawlbase_request(url):
    options = {
        'ajax_wait': 'true',
        'page_wait': '5000'
    }

    response = crawling_api.get(url, options)

    if response['headers']['pc_status'] == '200':
        html_content = response['body'].decode('utf-8')
        return html_content
    else:
        print(f"Failed to fetch the NFT detail page. Crawlbase status code: {response['headers']['pc_status']}")
        return None

def scrape_opensea_nft_detail(html_content, url):
    soup = BeautifulSoup(html_content, 'html.parser')

    title = soup.select_one('h1.item--title').text.strip() if soup.select_one('h1.item--title') else ''
    description = soup.select_one('div.item--description').text.strip() if soup.select_one('div.item--description') else ''
    price = soup.select_one('div.Price--amount').text.strip() if soup.select_one('div.Price--amount') else ''
    image_urls = [img['src'] for img in soup.select('div.media-container img')]
    link = url  # The link is the current URL

    nft_data = {
        'title': title,
        'description': description,
        'price': price,
        'images_url': image_urls,
        'link': link
    }

    return nft_data

Storing Data in a CSV File

Once we have scraped the NFT details, we can save them in a CSV file. This allows us to easily analyze the data later. Here’s how to do it:

def save_nft_data_to_csv(data, filename='opensea_nft_data.csv'):
    df = pd.DataFrame([data])  # Convert the single NFT data dictionary to a DataFrame
    df.to_csv(filename, index=False)
    print(f"NFT data saved to {filename}")

Complete Code Example

Here’s the complete code that combines all the steps for scraping NFT detail pages:

from crawlbase import CrawlingAPI
from bs4 import BeautifulSoup
import pandas as pd

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({'token': 'CRAWLBASE_JS_TOKEN'})

def make_crawlbase_request(url):
    options = {
        'ajax_wait': 'true',
        'page_wait': '5000'
    }

    response = crawling_api.get(url, options)

    if response['headers']['pc_status'] == '200':
        html_content = response['body'].decode('utf-8')
        return html_content
    else:
        print(f"Failed to fetch the NFT detail page. Crawlbase status code: {response['headers']['pc_status']}")
        return None

def scrape_opensea_nft_detail(html_content, url):
    soup = BeautifulSoup(html_content, 'html.parser')

    title = soup.select_one('h1.item--title').text.strip() if soup.select_one('h1.item--title') else ''
    description = soup.select_one('div.item--description').text.strip() if soup.select_one('div.item--description') else ''
    price = soup.select_one('div.Price--amount').text.strip() if soup.select_one('div.Price--amount') else ''
    image_urls = [img['src'] for img in soup.select('div.media-container img')]
    link = url  # The link is the current URL

    nft_data = {
        'title': title,
        'description': description,
        'price': price,
        'images_url': image_urls,
        'link': link
    }

    return nft_data

def save_nft_data_to_csv(data, filename='opensea_nft_data.csv'):
    df = pd.DataFrame([data])  # Convert the single NFT data dictionary to a DataFrame
    df.to_csv(filename, index=False)
    print(f"NFT data saved to {filename}")

# Example usage
if __name__ == "__main__":
    nft_url = "https://opensea.io/assets/matic/0x251be3a17af4892035c37ebf5890f4a4d889dcad/94953658332979117398233379364809351909803379308836092246404100025584049123386"
    html_content = make_crawlbase_request(nft_url)

    if html_content:
        nft_data = scrape_opensea_nft_detail(html_content, nft_url)  # Extract data from HTML content
        save_nft_data_to_csv(nft_data)  # Save NFT data to CSV

opensea_nft_data.csv Snapshot:

opensea_nft_data.csv file snapshot

Optimize OpenSea NFT Data Scraping

Scraping OpenSea opens up a whole world of NFTs and market data. Throughout this blog, we covered how to scrape OpenSea using Python and the Crawlbase Crawling API. By understanding the layout of the site and using the right tools, you can get valuable insights while keeping ethics in mind.

As you get deeper into your scraping projects, remember to store the data in human-readable formats, like CSV files, to make analysis a breeze. The NFT space is moving fast, and staying aware of new trends and technologies will help you get the most out of your data collection efforts. With the right mindset and tools, you can find some great insights in the NFT market.
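
For example, if you prefer JSON over CSV, a small helper like the sketch below would work with the same list of dictionaries returned by scrape_opensea_collection (this is an optional addition, not part of the code above):

import json

def save_data_to_json(data, filename='opensea_data.json'):
    # Write the scraped records (a list of dictionaries) as readable JSON.
    with open(filename, 'w', encoding='utf-8') as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    print(f"Data saved to {filename}")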

If you want to do more web scraping, check out our guides on scraping other key websites.

📜 How to Scrape Monster.com
📜 How to Scrape Groupon
📜 How to Scrape TechCrunch
📜 How to Scrape X.com Tweet Pages
📜 How to Scrape Clutch.co

If you have any questions or want to give feedback, our support team can help you with web scraping. Happy scraping!

Frequently Asked Questions

Q. Why should I web scrape OpenSea?

Web scraping is a way to automatically extract data from websites. By scraping OpenSea, you can grab important information about NFTs, such as their prices, descriptions, and images. This data helps you analyze market trends, track specific collections or compare prices across NFTs. Overall, web scraping provides valuable insights that can enhance your understanding of the NFT marketplace.

Q. Is it legal to scrape data from OpenSea?

Web scraping is a legal gray area. Many websites, including OpenSea, allow data collection for personal use, but always read the terms of service before you start. Make sure your scraping activities comply with the website’s policies and copyright laws. Ethical scraping means using the data responsibly and not flooding the website’s servers.
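
One simple way to keep your scraping polite is to pause between requests when fetching multiple pages. Here is a rough sketch that reuses the functions from this post (the list of URLs and the delay length are arbitrary examples):

import time

# Hypothetical list of collection pages to scrape politely, one at a time.
collection_urls = [
    "https://opensea.io/collection/courtyard-nft",
]

for url in collection_urls:
    html = make_crawlbase_request(url)
    if html:
        items = scrape_opensea_collection(html)
        print(f"Scraped {len(items)} items from {url}")
    time.sleep(5)  # arbitrary pause so we don't flood the site with requests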

Q. What tools do I need to start scraping OpenSea?

To start scraping OpenSea, you’ll need a few tools. Install Python and libraries like BeautifulSoup and pandas for data parsing and manipulation. You’ll also use the Crawlbase Crawling API to handle dynamic content and JavaScript rendering on OpenSea. With these tools in place, you’ll be ready to scrape and analyze NFT data.
