
František Nesveda for Apify

Originally published at blog.apify.com

Apify ❤️ Python: Releasing a Python SDK for Actors

Whether you are scraping with BeautifulSoup, Scrapy, Selenium, or Playwright, the Apify Python SDK helps you run your project in the cloud at any scale.

At Apify, our mission is to empower people to create great web scrapers using the best technologies possible and to run them in the cloud effortlessly. That's why we're thrilled to introduce our new Apify SDK for Python, allowing you to write Apify Actors in Python and tap into the wide range of libraries and tools in the Python ecosystem that make web scraping simple and efficient.

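from apify import Actor
from bs4 import BeautifulSoup
import requests

async def main():
    async with Actor:
        # Read the Actor input, e.g. {"url": "https://example.com"}
        actor_input = await Actor.get_input()
        # Download the page and parse its HTML
        response = requests.get(actor_input['url'])
        soup = BeautifulSoup(response.content, 'html.parser')
        # Store the page title in the Actor's default dataset
        await Actor.push_data({
            'url': actor_input['url'],
            'title': soup.title.string,
        })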

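The example above indexes the input directly, so it fails with an unhelpful error when the Actor is started without a url field. Here is a minimal sketch of validating the input first. The helper name get_start_url is hypothetical and not part of the SDK, and it assumes that Actor.get_input() can return None when no input was provided.

```python
def get_start_url(actor_input):
    """Return the 'url' field from the Actor input, or raise a clear error.

    Hypothetical helper, not part of the Apify SDK.
    """
    # Assumption: the input may be None when the Actor was started without any.
    actor_input = actor_input or {}
    url = actor_input.get("url")
    # Reject missing, non-string, or blank values with a readable message.
    if not isinstance(url, str) or not url.strip():
        raise ValueError("Actor input must contain a non-empty 'url' string")
    return url.strip()
```

Inside main(), you could then call get_start_url(await Actor.get_input()) and pass the result to requests.get().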

When combined with the Apify platform, Actors have access to a wide variety of features designed specifically to meet developers' web scraping and automation needs. These include on-demand scaling of computing resources, run scheduling and monitoring, data center and residential proxies, as well as the ability to publish Actors in Apify Store and even monetize your code.

Whether you have a simple scraper using BeautifulSoup, a powerful web spider written with Scrapy, or you use Selenium or Playwright to automate browser interaction, the Apify SDK for Python will help you run your projects in the cloud at any scale.


Getting Started

Actors were designed to be used together with the Apify platform. So, to unlock their full potential, let's create one in Apify Console. This is a fairly straightforward process, and you will only need to sign up for a free Apify account to follow along.

Once you're in Apify Console, go to Actors > Create New, and you'll be presented with a choice of Actor templates.


We have predefined Actor templates for all the major web scraping libraries like Scrapy, BeautifulSoup, Playwright, and Selenium.

Once you create an Actor from your selected Actor template, you can edit its code to perform the scraping tasks you need, run the Actor, and, if you're happy with it, integrate it with your existing data pipelines and schedule it to scrape data at regular intervals.

Creating Actors locally

If you want to create and run Apify Actors directly on your local computer so that you can, for example, track the source code in a version control system, you can do so with the Apify CLI by running the command apify create my-python-actor.


When you execute that command, you'll be presented with the same choice of templates as in Apify Console. Once you choose a template, an Actor will be created for you in the my-python-actor directory, and all its requirements will be installed in a virtual environment in my-python-actor/.venv. To run the Actor, you can just run cd my-python-actor && apify run.

When you run an Actor locally, its output is stored in the storage folder. There, you can find the contents of the Actor's default dataset, key-value store, and request queue.
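To inspect the results of a local run from Python, you can read the dataset files straight from the storage folder. The sketch below assumes that each dataset item is saved as a separate numbered JSON file under storage/datasets/default and that metadata files start with a double underscore. Neither detail is stated in this article, so check the folder on your machine first. The function name load_local_dataset is hypothetical.

```python
import json
from pathlib import Path


def load_local_dataset(storage_dir="storage", dataset_name="default"):
    """Return all items of a locally stored dataset as a list.

    Assumed layout: <storage_dir>/datasets/<dataset_name>/<number>.json
    """
    dataset_dir = Path(storage_dir) / "datasets" / dataset_name
    # No local run yet (or a different layout): nothing to load.
    if not dataset_dir.is_dir():
        return []
    items = []
    # Sorting the numbered file names keeps the items in the order they were pushed.
    for item_path in sorted(dataset_dir.glob("*.json")):
        # Skip metadata files (assumed to start with a double underscore).
        if item_path.name.startswith("__"):
            continue
        items.append(json.loads(item_path.read_text(encoding="utf-8")))
    return items
```

Calling load_local_dataset() from the Actor directory after apify run should then return the items the Actor pushed, such as the url and title pairs from the example at the top of this post.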

To push the Actor to Apify Console and run it there, you can use the apify push command, which will upload the Actor's source code to the Apify platform and build the Actor there.

Get in touch

We're excited to see what you will create with the Apify SDK for Python. If you find any issues, please report them in the SDK's GitHub repository.

🐍 Try writing an Actor in Python

And don't forget to join our developer community on Discord. We will be waiting for you there to hear your feedback and help you with any questions that might arise.
