<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kevin Cole</title>
    <description>The latest articles on DEV Community by Kevin Cole (@kcole93).</description>
    <link>https://dev.to/kcole93</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1185698%2Fad7adffe-73ff-4166-bc7b-977f77462776.png</url>
      <title>DEV Community: Kevin Cole</title>
      <link>https://dev.to/kcole93</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kcole93"/>
    <language>en</language>
    <item>
      <title>'Aw CRUD!' Enable Your Custom GPT to Talk to Any Airtable Base</title>
      <dc:creator>Kevin Cole</dc:creator>
      <pubDate>Thu, 16 Nov 2023 21:48:39 +0000</pubDate>
      <link>https://dev.to/kcole93/aw-crud-enable-your-custom-gpt-to-talk-to-any-airtable-base-4pfl</link>
      <guid>https://dev.to/kcole93/aw-crud-enable-your-custom-gpt-to-talk-to-any-airtable-base-4pfl</guid>
      <description>&lt;p&gt;Last week, OpenAI announced a number of new models, features and pricing changes to their suite of tools. Among these announcements was the introduction of "Custom GPTs" which can be fed specific instructions and documents. While these features offer a convenient no-code alternative to assembling your own chain of LLM tools and agents, perhaps the most impactful feature is the ability to easily provide your Custom GPTs with access to "Actions": third-party integrations accessed via an API.&lt;/p&gt;

&lt;p&gt;In this post, I'll provide step-by-step instructions for connecting your custom GPTs to any Airtable base you own, enabling your GPT to find and read records, update or create new records, and delete existing records.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt; In order to follow this tutorial, you'll need an OpenAI subscription (to access the Custom GPTs beta feature) and an Airtable account (free or paid).&lt;/p&gt;

&lt;h2&gt;Initializing Your Custom GPT&lt;/h2&gt;

&lt;p&gt;For this tutorial, we'll build a Custom GPT that helps us track interesting GitHub projects by adding them to our Airtable Base &lt;a href="https://airtable.com/appKjP9iTJnUvFJXj/shrLrIfi7KDF7ehD8"&gt;"Cool Projects"&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Of course, given the combined flexibility of Airtable and ChatGPT, there are few limits on the kinds of projects you can build by combining these two tools. You could, for example, use an Airtable base as a kind of long-term memory bank for your Custom GPT, as a bare bones &lt;strong&gt;R&lt;/strong&gt;etrieval &lt;strong&gt;A&lt;/strong&gt;ugmented &lt;strong&gt;G&lt;/strong&gt;eneration (RAG) tool, etcetera.&lt;/p&gt;

&lt;p&gt;To start, ensure you're logged into your OpenAI account and visit the &lt;a href="https://chat.openai.com/gpts/discovery"&gt;'Explore'&lt;/a&gt; page. There, click "Create a GPT" to begin configuring your custom GPT.&lt;/p&gt;

&lt;p&gt;While you can interact with the "Create" dialog to set up some of the GPT's basic parameters, I don't recommend doing so because it's quite easy to hit the current 50 messages/hour cap for GPT-4. Instead, click on the 'Configure' tab.&lt;/p&gt;

&lt;p&gt;We'll name our GPT 'Project Tracker' and provide a description and a short set of instructions.&lt;/p&gt;

&lt;p&gt;We can refine the &lt;strong&gt;Instructions&lt;/strong&gt; later, so for now we'll just enter something straightforward:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Project Tracker is a friendly assistant which uses its set of Actions to interact with records in the 'Cool Projects' Airtable Base.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At this point, we can decide which &lt;strong&gt;Capabilities&lt;/strong&gt; to provide our custom GPT. We'll definitely want to enable Web Browsing, since we'll ask the GPT to visit and summarize GitHub repositories by providing it with links. I'll skip DALL-E Image Generation for now, but enable the Code Interpreter to allow us to do some cool analysis tasks using our Airtable data.&lt;/p&gt;

&lt;p&gt;Let's also generate a nice icon for the custom GPT. To do so, I'll switch back to the 'Create' tab and prompt the GPT Builder.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Prompt:&lt;/strong&gt; Generate a pixel art illustration of a lightbulb to use as my GPT's icon.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5OfFqrj_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j2mjqipa3czgjqndv47w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5OfFqrj_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j2mjqipa3czgjqndv47w.png" alt="A screenshot of the Custom GPT Configuration screen showing the settings described above." width="800" height="729"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;We'll start off with this basic configuration.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Don't forget to save your changes so far!&lt;/p&gt;

&lt;p&gt;Now that we've initialized our Project Tracker GPT, we'll need to turn to Airtable to generate instructions on how our GPT can interface with the Airtable API and our Airtable Base.&lt;/p&gt;

&lt;h2&gt;Generating an OpenAPI Schema from Our Airtable Base&lt;/h2&gt;

&lt;p&gt;For the purposes of this demonstration, I've created a publicly accessible Airtable Base called &lt;a href="https://airtable.com/appKjP9iTJnUvFJXj/shrLrIfi7KDF7ehD8"&gt;"Cool Projects"&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To get the best performance out of your custom GPT, there are a few things I'd recommend. First, make use of Airtable's field descriptions to provide semantically rich descriptions of each field's purpose. We'll pass these field descriptions to the custom GPT when defining our Actions, and although I can't confirm it, giving the GPT this extra context about how each field is used seems to improve performance.&lt;/p&gt;

&lt;p&gt;To start, I'll add a few of my own GitHub projects manually. This will provide some context for the custom GPT regarding how we expect the data to be stored and formatted. While you &lt;em&gt;could&lt;/em&gt; start with an empty base and have your GPT add all the records to it, I find it performs best when there is pre-existing content to model.&lt;/p&gt;

&lt;h3&gt;Generating an OpenAPI Schema&lt;/h3&gt;

&lt;p&gt;OpenAI's Actions API enables custom GPTs to interface with third-party applications through those platforms' &lt;strong&gt;A&lt;/strong&gt;pplication &lt;strong&gt;P&lt;/strong&gt;rogramming &lt;strong&gt;I&lt;/strong&gt;nterfaces (APIs). To instruct the GPT on how to interact with this API, the Custom GPT Editor accepts an &lt;a href="https://www.openapis.org/"&gt;OpenAPI&lt;/a&gt; schema, a standardized format for defining how an API expects to be interacted with.&lt;/p&gt;

&lt;p&gt;In order to enable our custom GPT to interact with our Airtable base, we'll have to provide our own OpenAPI schema. Luckily, we don't need to generate this schema by hand!&lt;/p&gt;

&lt;p&gt;I've created a script for Airtable's Scripting extension that can generate this schema instantly, adapting the (very well implemented!) work of GitHub user &lt;a href="https://github.com/TheF1rstPancake/AirtableOpenAPICustomBlock"&gt;TheF1rstPancake&lt;/a&gt; on a custom Airtable Block that produces an OpenAPI specification from any Airtable base.&lt;/p&gt;

&lt;p&gt;To use it, we'll need to add the &lt;strong&gt;Scripting&lt;/strong&gt; extension to our base (&lt;strong&gt;Extensions&lt;/strong&gt; → &lt;strong&gt;+ Add an extension&lt;/strong&gt; → &lt;strong&gt;Scripting&lt;/strong&gt;).&lt;/p&gt;

&lt;p&gt;We'll edit the script, deleting the boilerplate code and replacing it with our own script, which you can find in &lt;a href="https://gist.github.com/kcole93/5d86cfe6fbcec60cc5ece6e2246158d6"&gt;this GitHub Gist&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After pasting the code into the Script Editor, click &lt;strong&gt;Run&lt;/strong&gt; and you'll be presented with a full OpenAPI schema for your Airtable Base. This schema will allow you to grant your custom GPT full &lt;strong&gt;C&lt;/strong&gt;reate &lt;strong&gt;R&lt;/strong&gt;ead &lt;strong&gt;U&lt;/strong&gt;pdate &lt;strong&gt;D&lt;/strong&gt;elete (CRUD) access to your base. Before we finish defining our GPT's Actions, however, we'll need to create a &lt;strong&gt;P&lt;/strong&gt;ersonal &lt;strong&gt;A&lt;/strong&gt;ccess &lt;strong&gt;T&lt;/strong&gt;oken (PAT) that grants our custom GPT the permission scopes necessary to interact with our Airtable base via the Airtable API, and which we'll use to authenticate the custom GPT agent.&lt;/p&gt;

&lt;h4&gt;🚧 Warning 🚧&lt;/h4&gt;

&lt;p&gt;By default, the OpenAPI schema includes &lt;em&gt;all&lt;/em&gt; methods of interacting with the Airtable API, &lt;strong&gt;including destructive methods like PATCH and DELETE&lt;/strong&gt;.  It also includes &lt;strong&gt;all of the specified base's tables&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This means that if you were, for example, to make a custom GPT, provide it with your OpenAPI schema, and subsequently share that custom GPT with others, &lt;strong&gt;anyone&lt;/strong&gt; could instruct the GPT to perform destructive operations on your Airtable base, like deleting all of its records or overwriting existing records! Please keep this in mind when developing and sharing your custom GPTs. &lt;/p&gt;

&lt;p&gt;You can limit the operations available to the GPT, but you'll need to manually remove their corresponding "path" definitions from the generated schema (for example, using your favorite text editor) before defining your GPT's Actions.&lt;sup id="fnref1"&gt;1&lt;/sup&gt; &lt;/p&gt;
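&lt;p&gt;As a sketch of what that pruning might look like, here's how you could strip destructive methods programmatically. Note that the schema object below is a simplified, illustrative stand-in, not the generator's exact output:&lt;/p&gt;

```javascript
// Sketch: strip destructive operations from a generated OpenAPI schema
// before pasting it into the GPT editor. The schema below is a simplified
// stand-in for the generator's real output; path and summary names are
// illustrative.
const schema = {
  openapi: "3.0.0",
  paths: {
    "/Projects": {
      get: { summary: "List records" },
      post: { summary: "Create records" },
      patch: { summary: "Update records" },
      delete: { summary: "Delete records" },
    },
  },
};

// Keep only the non-destructive methods on every path
for (const path of Object.values(schema.paths)) {
  delete path.patch;
  delete path.delete;
}

console.log(Object.keys(schema.paths["/Projects"])); // [ 'get', 'post' ]
```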

&lt;p&gt;It's both possible &lt;em&gt;and recommended&lt;/em&gt; to limit the scope of the GPT's permissions when generating its Airtable PAT; however, I wouldn't recommend only using permissions scope limitations to restrict your GPT's capabilities, as it will attempt to use any defined Actions and report authentication/permission errors in the chat output.&lt;/p&gt;

&lt;h3&gt;Generating a Personal Access Token for Your Custom GPT&lt;/h3&gt;

&lt;p&gt;Since Airtable is deprecating API keys at the end of January 2024, we'll need to create a Personal Access Token instead.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;To create a PAT to authenticate with and access the Airtable API, visit &lt;a href="https://airtable.com/create/tokens"&gt;https://airtable.com/create/tokens&lt;/a&gt;, and click &lt;strong&gt;Create new token&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Give this PAT a meaningful name, and assign its scope of permissions according to your needs. For this demonstration, I'll enable full read/write capabilities so that my custom GPT can add new records and update existing ones.&lt;/p&gt;

&lt;p&gt;Be sure to carefully select the scope of access for this PAT. You'll want to limit the scope to the single Base that you're working with.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rKHsCZy3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jdqp3bna2mkpnl81czhy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rKHsCZy3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jdqp3bna2mkpnl81czhy.png" alt="A screenshot demonstrating the creation of a PAT in the Airtable interface." width="800" height="586"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;It's worthwhile to give our PAT a meaningful name, because record revision histories in the Airtable interface will show this name to identify changes made by our custom GPT.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Once you click &lt;strong&gt;Create token&lt;/strong&gt;, you'll be presented with the PAT. Be sure to copy this to a text file on your desktop, since we'll need to provide it to our custom GPT when setting up our actions. If you forget the PAT, you can regenerate it with the same settings from the Personal access token dashboard in Airtable.&lt;/p&gt;

&lt;h2&gt;Adding Airtable Access to Your Custom GPT&lt;/h2&gt;

&lt;p&gt;To finish enabling your custom GPT to interface with your Airtable base, edit your GPT and open the &lt;strong&gt;Configure&lt;/strong&gt; pane. Click on &lt;strong&gt;Add Action&lt;/strong&gt;, and paste your generated OpenAPI Schema into the &lt;strong&gt;Schema&lt;/strong&gt; input. You should see a list of all of the actions generated from this schema.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PTOsHNaG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2eh7lkni8j675qzcmris.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PTOsHNaG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2eh7lkni8j675qzcmris.png" alt="A screenshot demonstrating pasting the generated OpenAPI schema into the Actions configuration for our custom GPT." width="800" height="822"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Under "Available actions" we now see a list of the different operations available for each table in our base.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As a final step, we need to set up our authentication configuration. Click on the cog icon by the &lt;strong&gt;Authentication&lt;/strong&gt; input, and select &lt;strong&gt;API Key&lt;/strong&gt;. Paste your PAT into the &lt;strong&gt;API Key&lt;/strong&gt; field, and select "Bearer" as the &lt;strong&gt;Auth Type&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JyLL-PNm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jblkhgo8clntzqsc4dk1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JyLL-PNm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jblkhgo8clntzqsc4dk1.png" alt="A screenshot showing the Authentication configuration dialog for the custom GPT's Actions." width="800" height="818"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can go ahead and test out your Actions by clicking the &lt;strong&gt;Test&lt;/strong&gt; button.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---_kAjVdT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lr5cncytool10rpslawu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---_kAjVdT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lr5cncytool10rpslawu.png" alt="A screenshot showing the output after running a test of the 'Get Projects' operation. The Project Tracker GPT calls the HTTP endpoint for our API, retrieves information from our Airtable base, and summarizes that information in the chat response." width="800" height="818"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;A test of our "Get Projects" operation demonstrates that our custom GPT is now able to read data in from our Airtable base!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And that's it! Your custom GPT now has full CRUD access to your Airtable base. Now, you can do things like ask your GPT to add new records to your base's tables:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--57LviuMk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/264o9ecsnwz6gk9oyg98.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--57LviuMk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/264o9ecsnwz6gk9oyg98.png" alt="A screenshot showing a test prompt to the Project Tracker GPT. Prompt: 'I'd like you to visit my GitHub repo and add any other pinned projects to my Airtable base's 'Projects' table: [link to author's GitHub profile] The GPT browses the provided link, makes a number of API calls, and reports that it has successfully added four new projects to the Airtable base." width="800" height="818"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Now you can use natural language to instruct your custom GPT to interact with your Airtable API.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Tips and Reflections&lt;/h2&gt;

&lt;p&gt;While testing out the new Custom GPT feature, I've come across a few hiccups:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Out-of-date Action schemas are occasionally 'remembered' by your custom GPT, leading it to form incorrect calls to the Airtable API. A workaround seems to be deleting your Action entirely, saving the custom GPT, and then re-adding your updated Action schema.&lt;/li&gt;
&lt;li&gt;Custom GPTs don't seem to hold the full OpenAPI schema in memory when deciding on actions. For example, I don't believe that the entire "schemas" specification is followed at all times, so occasionally the custom GPT will try to send malformed API calls or attempt to push/patch malformed objects. It is responsive to feedback, at least!&lt;/li&gt;
&lt;li&gt;You can add additional information about how to best navigate/utilize the information in your Airtable base via the custom GPT's &lt;strong&gt;Instructions&lt;/strong&gt; field; however, GPTs don't always make 'informed' decisions about how to query or transform information received from outside sources. For example, if you ask the GPT to retrieve a record, it may attempt to filter on the record's title; if there is no match and the API returns no results, it will complain that it wasn't able to find the record instead of fetching all records and then conducting a more 'semantic' search on the results.&lt;/li&gt;
&lt;li&gt;There's a limit to how much information your custom GPT will ingest. As a result, you'll need to make strategic use of Airtable's Views feature if your use case involves a lot of records or records with long text data. The GPT generally attempts to limit returned information using the query parameters/filters specified in the Airtable API, but occasionally makes mistakes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By default, the "typecast" option in the Airtable API is disabled. This is meant to prevent the custom GPT from creating new options in Single/Multi-Select fields. However, if you would like to enable this behavior you can manually change the value of the "typecast" parameter in the OpenAPI specification.&lt;/p&gt;
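&lt;p&gt;For reference, &lt;code&gt;typecast&lt;/code&gt; is a top-level option in the body of Airtable's create/update requests. A sketch of what a create payload looks like with it enabled (the field names here are illustrative):&lt;/p&gt;

```javascript
// Sketch of an Airtable create-records payload with typecast enabled.
// With typecast: true, Airtable will coerce values and create new
// Single/Multi-Select options on the fly instead of rejecting unknown
// values. Field names are illustrative.
const payload = {
  records: [
    { fields: { Name: "Cranberry", Tags: ["edge", "astro"] } },
  ],
  typecast: true, // the generated schema leaves this disabled by default
};

console.log(payload.typecast); // true
```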

&lt;p&gt;🔐 Another note: I'd recommend opting out of sharing your chats with OpenAI to improve their services, especially if you're dealing with personal or sensitive data! To do so, deselect 'Use conversation data in your GPT to improve our models' under &lt;strong&gt;Additional Settings&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;Footnotes&lt;/h2&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;I may refactor the current script to enable users to specify which methods they would like to include/exclude from the generated schema, but currently this is not supported and the limitations of Airtable's Scripting runtime make adding this functionality somewhat painful. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;While it's not a best practice to assign a PAT to a third-party integration, there are a number of limitations which make it unfeasible to use Airtable for public-facing purposes, including for a publicly accessible custom GPT. There is a limit of 5 API requests per second per base, which makes concurrent connections essentially impossible with a user base of non-trivial size. For this reason, I've eschewed creating an OAuth integration, but you should consider doing so if you plan to share your GPT with others. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Cranberry: The Cure for Your URI</title>
      <dc:creator>Kevin Cole</dc:creator>
      <pubDate>Sat, 21 Oct 2023 05:33:21 +0000</pubDate>
      <link>https://dev.to/kcole93/cranberry-the-cure-for-your-uri-k90</link>
      <guid>https://dev.to/kcole93/cranberry-the-cure-for-your-uri-k90</guid>
      <description>&lt;p&gt;Everyone is familiar with the term URL. Even if you couldn't recall off the top of your head that this abbreviation stands for &lt;strong&gt;U&lt;/strong&gt;niform &lt;strong&gt;R&lt;/strong&gt;esource &lt;strong&gt;L&lt;/strong&gt;ocator, you'd be able to describe URLs as the addresses of specific websites that you put into your browser's address bar or, more frequently, click to via hyperlinks on other webpages.&lt;/p&gt;

&lt;p&gt;You may not be so familiar with &lt;strong&gt;U&lt;/strong&gt;niform &lt;strong&gt;R&lt;/strong&gt;esource &lt;strong&gt;I&lt;/strong&gt;dentifiers (URIs), a &lt;a href="https://en.wikipedia.org/wiki/List_of_URI_schemes"&gt;much broader syntax&lt;/a&gt; for identifying different resources across the web. For example, there are URI schemes which allow you to invoke a device's SMS messaging capabilities to compose a message; perhaps a more familiar example is Spotify's URI scheme, which allows you to open a specific song, artist or playlist in the desktop/mobile application via a link.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- An example of an SMS URI --&amp;gt;&lt;/span&gt;
sms:+15105550101?body=hello

&lt;span class="c"&gt;&amp;lt;!-- An example of the Spotify URI scheme --&amp;gt;&lt;/span&gt;
spotify:artist:7t0rwkOPGlDPEhaOcVtOt9

&lt;span class="c"&gt;&amp;lt;!-- An example of the Git protocol URI scheme --&amp;gt;&lt;/span&gt;
git://github.com/whatwg/url.git

&lt;span class="c"&gt;&amp;lt;!-- Zotero's URI Protocol --&amp;gt;&lt;/span&gt;
zotero://select/library/collections/{collection_key}/items/{item_key}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While URIs provide a ton of useful functionality, actually using them can be a bit difficult. Many of the places you'd think to use a hyperlink to provide clickable access to a URI refuse to validate the input URI. For example, it's (currently) not possible to provide a URI as the target of a link in the popular note taking application &lt;a href="https://notion.so"&gt;Notion.so&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;That's always bothered me, especially because for my work I make heavy use of the research and citation manager &lt;a href="https://zotero.org"&gt;Zotero&lt;/a&gt;, which allows you to create URIs that, when opened, point you to a specific item in your library of saved materials. When working with a team of researchers in a library of hundreds or even thousands of documents, being able to send someone one of these URIs is a huge time saver!&lt;/p&gt;

&lt;p&gt;As I was driving the other day I had a shockingly simple realization: why not make a micro-service which accepts a Zotero URI as a part of a traditional URL, and redirects the browser to that URI when opened? In part, I was inspired by the ease with which Gitpod allows you to append a repository's URL to their own web address in order to open the repository in Gitpod. And so, I set out to solve this uncomfortable URI woe. And thus was born: &lt;a href="https://cranberry.vercel.app/"&gt;Cranberry&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;Cranberry leverages the advantages of so-called &lt;a href="https://en.wikipedia.org/wiki/Edge_computing"&gt;'edge computing'&lt;/a&gt;, a paradigm focused on ensuring low latency by performing computation as close to the data source and client as possible. This is made possible by hosting Cranberry on Vercel and utilizing their &lt;a href="https://vercel.com/docs/functions/edge-middleware"&gt;Edge Middleware&lt;/a&gt; to redirect requests containing a Zotero URI. I decided to use my favorite lightweight web development framework, &lt;a href="https://astro.build"&gt;Astro&lt;/a&gt;, although Vercel and its Edge Middleware services are compatible with a number of different frameworks. &lt;/p&gt;

&lt;p&gt;In this article, I'll provide a step-by-step walk-through documenting how to set up this type of micro-service, and also reflect on some of the benefits and drawbacks of the approach taken.&lt;/p&gt;

&lt;h2&gt;Defining Our Goals&lt;/h2&gt;

&lt;p&gt;For this micro-service, we want to be able to do the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Serve a small homepage describing the project.&lt;/li&gt;
&lt;li&gt;Redirect any URL containing a Zotero URI to the specified URI. (We'll use a query parameter &lt;code&gt;?uri={USER_PROVIDED_URI}&lt;/code&gt; )&lt;sup id="fnref1"&gt;1&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;Provide appropriate and understandable error messages.&lt;/li&gt;
&lt;/ol&gt;
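&lt;p&gt;Building one of these Cranberry-fied links is then just a matter of URL-encoding the Zotero URI into the query string (the Cranberry domain is the one from above; the collection/item keys below are placeholders):&lt;/p&gt;

```javascript
// Build a shareable Cranberry link from a Zotero URI by URL-encoding it
// into the ?uri= query parameter. The collection/item keys are placeholders.
const zoteroURI = "zotero://select/library/collections/ABCD1234/items/WXYZ5678";
const link = `https://cranberry.vercel.app/?uri=${encodeURIComponent(zoteroURI)}`;

console.log(link);
// https://cranberry.vercel.app/?uri=zotero%3A%2F%2Fselect%2Flibrary%2Fcollections%2FABCD1234%2Fitems%2FWXYZ5678
```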

&lt;p&gt;It's also helpful to define a few anti-goals:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We don't want to redirect to user-provided URLs or other URI protocols, as doing so could make our service useful for bad actors trying to maliciously redirect users.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;We don't want to send HTML to the client unless useful (e.g., visitors to the homepage who haven't come via a link containing a URI or in case of errors).&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;Initializing Your Astro Project&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a new project with npm&lt;/span&gt;
npm create astro@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Houston, Astro's adorable mascot, will guide you through the initial set-up. Choose a name and location for your project, whether to start with some basic scaffolding files, and whether you plan to use TypeScript. Let's use the basic project template; in this project we'll use a mixture of both JavaScript and TypeScript.&lt;/p&gt;

&lt;p&gt;After your project initializes, be sure to change into the project directory.&lt;/p&gt;

&lt;h2&gt;Enabling Server-Side Rendering&lt;/h2&gt;

&lt;p&gt;As mentioned, I decided to use Server-Side Rendering (SSR) for this project, both because I wanted to familiarize myself with this rendering mode and because Edge-enabled SSR integrations offer a few benefits for this use case, namely:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Because the majority of requests to our website's servers are merely meant to instantly redirect to the provided URI, there's no need to actually serve any website content. We'll only want to serve HTML to users who are visiting the homepage, or in case of an error.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;By using an SSR host with a global Content Delivery Network (CDN) and Edge-powered middleware, we can further minimize latency, especially for users who would otherwise be further away from our server.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Since we perform our operations at the Edge, we can ensure that all user agents and browsers will be able to utilize our service, in accordance with the principle of progressive enhancement. If we were to rely on client-side manipulation, the service would only work if JavaScript were supported and enabled.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;However, there's a flip side to this coin! For one, depending on a (company's) CDN network and using their proprietary Edge function service and runtime means that our website and its functionality is &lt;strong&gt;much less portable&lt;/strong&gt; than relying on more traditional middleware strategies or even taking a client-side approach.&lt;/p&gt;

&lt;p&gt;Since we're planning to deploy via Vercel, our next step will be to install and configure the Vercel SSR Adapter for Astro.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# This will install the adapter and update your astro config file&lt;/span&gt;
npx astro add vercel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you prefer to manually install dependencies:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install the adapter via npm:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @astrojs/vercel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Update your &lt;code&gt;astro.config.mjs&lt;/code&gt;:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;astro/config&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;vercel&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@astrojs/vercel/serverless&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;defineConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;vercel&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In order to test our edge middleware locally in development, we'll also need to install the Vercel CLI tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; vercel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Setting Up Our Middleware&lt;/h2&gt;

&lt;p&gt;So, what exactly does 'middleware' do? In the context of web development, middleware describes computations and modifications performed between the time a server receives an incoming request and the time the response to that request is issued. It's a kind of interception layer that enables all kinds of important procedures, like authenticating users before serving protected content, or fetching data from databases and other APIs and including that data in the response.&lt;/p&gt;

&lt;p&gt;In our project, middleware will allow us to intercept incoming requests (e.g., when someone clicks on a Cranberry-fied link or navigates to the homepage) and to decide on the appropriate response based on the data available in that request. More specifically, we'll want to evaluate whether or not the request contains a URI and, if so, whether the URI provided is 'valid' for our purposes (e.g., if the URI uses the &lt;code&gt;zotero://&lt;/code&gt; protocol).&lt;/p&gt;

&lt;p&gt;In other words, there should be three potential responses returned:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Case One: No URI included → show project homepage.&lt;/li&gt;
&lt;li&gt;Case Two: URI is included, and uses the Zotero protocol → immediately redirect to the provided URI.&lt;/li&gt;
&lt;li&gt;Case Three: URI is included, but is invalid → return error message and show error page.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this plan in mind, let's create the &lt;code&gt;middleware.js&lt;/code&gt; file that Vercel will use to register our Edge Middleware.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Originally this function was defined locally, but we'll use it again later to validate user input on our homepage so I've encapsulated it and imported it here&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;isValidURI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./src/utils/isValidZoteroURI&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="c1"&gt;// We use the config export to define a matcher pattern, which tells Vercel which paths to run our middlewear on&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;// Match the root path&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Here's where we define our middlewear functionality.&lt;/span&gt;
&lt;span class="c1"&gt;// The function should return a response, and is async&lt;/span&gt;
&lt;span class="c1"&gt;// so that we can fetch the Error page and return it in our response&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Grab the incoming request's url&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Then extract the URI appended to the url&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;uriParam&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;searchParams&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;uri&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// CASE ONE&lt;/span&gt;
    &lt;span class="c1"&gt;// If there's no URI, return null to make no change to the response&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;uriParam&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// This means a normal GET request will fetch the home page&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// CASE TWO&lt;/span&gt;
    &lt;span class="c1"&gt;// If theres a URI and it's valid, try to redirect&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isValidZoteroURI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uriParam&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// See below for why we have to catch errors even after validating URIs&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uriParam&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;302&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Redirect to the URI&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Redirection error:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;handleError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Failed to redirect.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;handleError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;The URI provided is invalid.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Handle errors by returning a 400 status &amp;amp; displaying error page&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;handleError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;errorMessage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Fetch the page created by `error.astro`&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;errorUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;errorPageResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;errorUrl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;errorPageContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;errorPageResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// Serve our error page to provide direction for human users &lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;errorPageContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/html&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Error-Message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;errorMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's take a look at the contents of &lt;code&gt;./src/utils/isValidZoteroURI.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;isValidZoteroURI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// We grab the scheme (text before the first colon)&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;scheme&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;// Return true if the protocol uses zotero's scheme&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zotero&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scheme&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Otherwise, return false&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
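&lt;p&gt;As a quick sanity check, here's how the validator behaves on a few sample inputs. The function is reproduced below in plain JavaScript so you can run it standalone; the example URIs are illustrative, not real Zotero item keys.&lt;/p&gt;

```javascript
// Standalone copy of isValidZoteroURI for quick testing
function isValidZoteroURI(uri) {
  // Grab the scheme (text before the first colon), lowercased
  const scheme = uri.split(':')[0].toLowerCase();
  // Accept only the zotero:// protocol
  return ['zotero'].includes(scheme);
}

console.log(isValidZoteroURI('zotero://select/library/items/ABCD1234')); // true
console.log(isValidZoteroURI('ZOTERO://select/library/items/ABCD1234')); // true (scheme check is case-insensitive)
console.log(isValidZoteroURI('https://example.com')); // false
```

&lt;p&gt;Note that the array-based check makes it easy to whitelist additional schemes later by simply adding them to the array.&lt;/p&gt;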



&lt;p&gt;With that, we should have functioning middleware to intercept and respond to incoming requests accordingly. All that's left is to put together the &lt;code&gt;.astro&lt;/code&gt; files to handle our front-end, and then to deploy to Vercel.&lt;/p&gt;

&lt;p&gt;You can test your middleware locally by running &lt;code&gt;vercel dev&lt;/code&gt; in your project's root directory -- but you'll need valid HTML pages served at the endpoints &lt;code&gt;/index.html&lt;/code&gt; and &lt;code&gt;/error.html&lt;/code&gt;; otherwise, the middleware function will throw a 404 error. If you'd like to see how I put together my minimal frontend, take a look at Cranberry's &lt;a href=""&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To test in a staging deployment, run &lt;code&gt;vercel deploy&lt;/code&gt;. By deploying to a staging environment, you'll be able to see your Vercel Edge/Serverless function logs for debugging.&lt;/p&gt;

&lt;p&gt;When you're happy with the way things are looking and working, you can deploy to production with &lt;code&gt;vercel deploy --prod&lt;/code&gt;.&lt;/p&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;I originally planned to use the shorter URL fragment syntax (&lt;code&gt;#&lt;/code&gt;), but this approach only works for client-side manipulation, as URL fragments aren't sent to the server and are only used in the browser. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Initially, I planned to make a kind of 'universal URI-as-URL wrapper' that would allow any URI which could be successfully constructed into a URL with the Web API's &lt;code&gt;new URL()&lt;/code&gt; constructor to be used to create a redirect. However, I ran into a few issues. For one, this was an obvious security concern, since potential bad actors could use the service to obfuscate malicious URLs/URIs. On a more practical note, Vercel's Edge Runtime seems to use different polyfills and a non-standard implementation of &lt;code&gt;Response.redirect()&lt;/code&gt;, which made it impossible to redirect to some (valid) URIs. This led me down the rabbit hole of URL technical specifications, state-machine based parsing and validation, and a firm reminder of how much of a miracle internet interoperability is. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;
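&lt;p&gt;For the curious, here's a rough sketch of why the 'universal URI-as-URL wrapper' idea from the second footnote is risky: the &lt;code&gt;new URL()&lt;/code&gt; constructor happily parses schemes you'd never want to redirect to. This is illustrative only, not code from the project.&lt;/p&gt;

```javascript
// Naive 'accept anything that parses' validation -- too permissive
function parsesAsURL(candidate) {
  try {
    new URL(candidate); // throws a TypeError on unparseable input
    return true;
  } catch {
    return false;
  }
}

console.log(parsesAsURL('zotero://select/library/items/ABCD1234')); // true
console.log(parsesAsURL('javascript:alert(1)')); // also true -- the security concern
console.log(parsesAsURL('not a uri')); // false
```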

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>vercel</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Setting Up a (Free*) Collaborative Python Development Environment for a Small Team</title>
      <dc:creator>Kevin Cole</dc:creator>
      <pubDate>Mon, 16 Oct 2023 15:38:34 +0000</pubDate>
      <link>https://dev.to/kcole93/setting-up-a-free-collaborative-python-development-environment-for-a-small-team-bpo</link>
      <guid>https://dev.to/kcole93/setting-up-a-free-collaborative-python-development-environment-for-a-small-team-bpo</guid>
      <description>&lt;p&gt;Perhaps you've found yourself in this pickle: you're preparing to dig into a new coding project but this time, it won't just be &lt;em&gt;you&lt;/em&gt; doing the work.&lt;/p&gt;

&lt;p&gt;It's one thing to initiate a repository on your local machine and invite others to collaborate asynchronously through remote source management tools like Github, Gitlab or Codeberg; introducing the prospect of real-time collaboration and managing the dreaded "works on my machine" bumbles might warrant a bit more architectural thinking.&lt;/p&gt;

&lt;p&gt;I found myself in this situation last week while initiating a new research project that will require a small team of collaborators to work on a Python-heavy project exploring the potentials and pitfalls of leveraging LLMs to provide enhanced general orientation for asylum-seekers on their rights and the procedures which apply to them. It's always a struggle to tame that initial urge to jump into your IDE of choice and start coding, but it was clear that this project could benefit from some measures to ensure new team members could be brought onboard and enabled to contribute with minimal friction.&lt;/p&gt;

&lt;p&gt;In this post, I'll walk through our decision-making process and provide a guide to setting up a collaborative Python development environment with GitPod, GitHub and Jupyter Notebook.&lt;/p&gt;

&lt;p&gt;If you're just interested in the step-by-step guide, feel free to jump right there. And if you're really in a rush, you can fork &lt;a href="https://github.com/kcole93/python-cde-gitpod"&gt;this template repository&lt;/a&gt; as a basis for your own dev environment hosted by Gitpod.&lt;/p&gt;

&lt;h2&gt;
  
  
  To Containerize or Not to Containerize?
&lt;/h2&gt;

&lt;p&gt;If you're planning to work collaboratively on a Python-heavy project, one of the first determinations you'll need to make is whether or not you plan to use "containerization". To provide a simple definition relevant to our project planning, "containerization" is an approach to software development and deployment where code and/or services are run within a minimal virtualized runtime environment that is either hosted on a local machine or run remotely (aka, "in the cloud").&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;What's the point of "containerization"? In our use case, containerization's main benefit is to ensure that all contributing developers can access and run code in a single, standardized environment to reduce errors and incompatibilities caused by differing local runtime environments and configurations. It allows us to largely sidestep issues faced when contributors are working on different operating systems, have locally-installed libraries/packages which introduce incompatibilities, and the myriad other configuration permutations that naturally result from the ways in which we all use our own computers.&lt;/p&gt;

&lt;p&gt;And what about the drawbacks? In some cases, running your development environment within a container might introduce too much additional overhead. For example, running containers locally could require you to provide guidance on installing Docker on a plethora of individual computers—ack! Containerization can also lead to performance bottlenecks, especially for compute-heavy tasks like machine learning or 3D graphics rendering. Finally, taking full advantage of containerization frequently entails hosting these containerized environments remotely, and this can be quite costly!&lt;/p&gt;

&lt;p&gt;So, how to decide? Ultimately, you'll need to take a hard look at your organization's use case to make a final determination. In our case, the following considerations were a top priority:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Contributors should be able to access the development environment with just a fundamental understanding of common software development tools: VS Code or a similar IDE and git-based source control.&lt;sup id="fnref2"&gt;2&lt;/sup&gt; It's a priority to enable contributors to learn through their research in this project, so barriers to entry must be as low as possible.&lt;/li&gt;
&lt;li&gt;Contributors should not have to consider and manage project dependencies. Our priority is to enable direct contribution, rather than having to fiddle with configurations.&lt;/li&gt;
&lt;li&gt;The solution must enable secure handling of authentication data, allowing the group's work to be shared openly while also permitting the use of access-controlled resources (in our case, AWS Bedrock's LLM APIs).&lt;/li&gt;
&lt;li&gt;Considering the size of our team and the nature of our (non-profit) research, costs should be minimal (ideally, $0.00!).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Given this set of criteria, we opted to containerize our development environment but specifically chose to use a hosted "cloud-based development environment", rather than running our development container locally (for example, with Docker) or setting up our own (read, self-managed) remotely hosted container runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's a Cloud-based Development Environment?
&lt;/h2&gt;

&lt;p&gt;Cloud-based Development Environments (CDEs) are &lt;em&gt;remotely hosted&lt;/em&gt; runtimes used to enable one or many developers to work on software from different devices, and increasingly they're coupled with other tools in the developer's toolchain to provide one-click access to a "ready-to-code" state. You've probably come across some of the more popular service providers like GitPod, GitHub Codespaces, or Google's Cloud Workstations.&lt;/p&gt;

&lt;p&gt;So, what's the deal with CDEs? To put it plainly, many popular CDE solutions exist in a grey zone between the three main cloud service business models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Software as a Service" (SaaS)&lt;/li&gt;
&lt;li&gt;"Platform as a Service" (PaaS)&lt;/li&gt;
&lt;li&gt;"Infrastructure as a Service" (IaaS)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As others have rightfully pointed out,&lt;sup id="fnref3"&gt;3&lt;/sup&gt; this means that using a (non self-hosted) CDE product makes you a current or potential future customer. At the same time, these products also deliver a valuable service, namely simplifying the overhead required to provision and manage your own remotely-run container instance.&lt;/p&gt;

&lt;p&gt;In the current environment of VC-funded "blitzscaling," small teams/projects (and in our case, particularly non-profit organizations) are generally able to skate by on the "generous free tiers" made possible by this phenomenon, though the usual cautionary warnings still apply:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generous free tiers are frequently a "loss leader" offering, and as many a hobbyist has learned from experience (ahem, Heroku), they may one day simply cease to exist.
&lt;/li&gt;
&lt;li&gt;That related dread-word, "Vendor Lock-In".
&lt;/li&gt;
&lt;li&gt;Overdependence on abstracted/productized solutions can lead to knowledge/practical experience gaps in teams.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Like most things in life, there are benefits and tradeoffs to be considered, so it's crucial to take a hard look at your project, team and organization's requirements, goals, resources and options while deciding on a path forward. Just don't get so bogged down in the weeds that you forget that you &lt;em&gt;can&lt;/em&gt; alter the direction of that path, even if doing so down the line might incur costs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our Solution: GitPod, GitHub &amp;amp; Jupyter Notebook
&lt;/h2&gt;

&lt;p&gt;After a bit of reflection on our primary goals, anti-goals, and operational constraints, we landed on the following set of tools for our Python-focused, research-oriented project:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://gitpod.io/"&gt;GitPod&lt;/a&gt;: GitPod is an (open sourced!) CDE solution which strongly integrates with VS Code, provides straightforward configuration for the workspace's underlying container image, and provides some built-in support for handling access to the dev workspace and environmental variables.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/"&gt;GitHub&lt;/a&gt;: Our organization already uses GitHub to host and manage remote repositories, so this was a bit of a given.&lt;sup id="fnref4"&gt;4&lt;/sup&gt; Using a git-based source control system is a practical necessity in this type of collaborative project, allowing for &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://jupyter.org/"&gt;Jupyter Notebook&lt;/a&gt;: Because our project is research-focused, we decided to use Jupyter Notebook to maximize the accessibility and reproducibility of our work by leveraging the ability to directly document our approaches with Markdown in Jupyter Notebook's &lt;code&gt;.ipynb&lt;/code&gt; files.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You might &lt;em&gt;not&lt;/em&gt; want to take this approach if your project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requires or greatly benefits from hardware acceleration;&lt;/li&gt;
&lt;li&gt;Requires you to run multiple concurrent and/or persistent services (databases, authentication servers, etc.);&lt;/li&gt;
&lt;li&gt;Needs to support full-time contributors: GitPod's free tier is &lt;a href="https://www.gitpod.io/pricing"&gt;&lt;strong&gt;capped at 50 hours of container up-time per month!&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setting Up Our Workspace
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Initialize Your Repository&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The first step to getting your Gitpod workspace running is to &lt;a href="https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-new-repository"&gt;initialize a GitHub repository&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Create a GitPod Workspace (and optionally, a Gitpod Project)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
You can open your repository (or any repo your GitHub account has access to) in a Gitpod 'workspace' (i.e., ephemeral containerized runtime environment) by prepending &lt;code&gt;gitpod.io/#&lt;/code&gt; to your GitHub repo's URL. For example, you can open the forem project in a Gitpod Workspace by navigating to the following URL: &lt;code&gt;https://gitpod.io/#https://github.com/forem/forem&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
You'll be asked to select your preferred editor experience, be that VS Code for the Browser, VS Code Desktop or another supported desktop IDE, or via SSH. Regardless of your editing method of choice, your next step will be setting up Gitpod's configuration files.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adding Gitpod Dotfiles&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Most configuration for Gitpod Workspaces is handled by two Dotfiles which you'll want to place in the root directory of your project's repo: &lt;code&gt;.gitpod.yml&lt;/code&gt; and &lt;code&gt;.gitpod.Dockerfile&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;.gitpod.Dockerfile&lt;/strong&gt;: This (optional) file gives you the flexibility to use your own custom Dockerfile, rather than one of Gitpod's &lt;a href="https://hub.docker.com/u/gitpod/"&gt;official Docker images&lt;/a&gt;. For this setup, we'll create a &lt;code&gt;.gitpod.Dockerfile&lt;/code&gt; to ensure our container uses a consistent Python version.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;.gitpod.yml&lt;/strong&gt;: This file specifies the underlying Gitpod workspace image to use for your runtime environment and allows you to define commands to be run on workspace startup as well as what ports (if any) you'd like to expose.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's our &lt;code&gt;.gitpod.Dockerfile&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; gitpod/workspace-full&lt;/span&gt;

&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; gitpod&lt;/span&gt;

&lt;span class="c"&gt;# Install and set global Python version to 3.11&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pyenv &lt;span class="nb"&gt;install &lt;/span&gt;3.11 &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; pyenv global 3.11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This Dockerfile provides Gitpod with instructions to spin up containers for our workspace using the default image, which comes pre-bundled with typical development tools. We chose to use the default image out of convenience, but the &lt;code&gt;gitpod/workspace-python&lt;/code&gt; image would have provided a lighter out-of-the-box footprint.&lt;/p&gt;

&lt;p&gt;It also instructs Gitpod to run two pyenv commands in that image: one to install Python 3.11, and one to set it as the global Python version. This is an important step for our project because some LangChain dependencies are currently incompatible with Python versions &amp;gt;= 3.12.&lt;/p&gt;

&lt;p&gt;We used the following configuration in our &lt;code&gt;.gitpod.yml&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.gitpod.Dockerfile&lt;/span&gt;

&lt;span class="na"&gt;tasks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;init&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install -r requirements.txt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This configuration does two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It instructs Gitpod to use our custom workspace image.&lt;/li&gt;
&lt;li&gt;It instructs Gitpod to run the command &lt;code&gt;pip install -r requirements.txt&lt;/code&gt; when the workspace container starts up, ensuring that all necessary Python libraries are installed and available in the runtime environment.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Initialize &lt;code&gt;requirements.txt&lt;/code&gt; and Commit Changes&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Before we finish our work initializing Gitpod, we'll need to install some of our known requirements and, importantly, persist our changes by committing to our GitHub repository. You can use the terminal session connected to your Gitpod workspace to install Python libraries with pip. In our case, we'll install Jupyter Notebook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;jupyter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since we're working in an ephemeral workspace (container) and in a collaborative project, we'll need to be sure to save these changes by committing and pushing them to our GitHub repo. We'll use pip's &lt;code&gt;freeze&lt;/code&gt; command to save our currently-installed libraries to a &lt;code&gt;requirements.txt&lt;/code&gt; file in the root directory of our repo, so that all necessary libraries are installed each time our workspace image is recreated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip freeze &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"requirements.txt"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command redirects the output of &lt;code&gt;pip freeze&lt;/code&gt; (a list of currently-installed libraries and their versions) to the text file &lt;code&gt;requirements.txt&lt;/code&gt;, referenced in our &lt;code&gt;.gitpod.yml&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;Now we're ready to persist all of these changes by committing and pushing to our GitHub repository. You can use the built-in VS Code (or your editor of choice) 'Source Control' panel, or alternatively use the terminal in your workspace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Stage changed files&lt;/span&gt;
git add &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="c"&gt;# Commit changes with a message&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Your commit message here"&lt;/span&gt;
&lt;span class="c"&gt;# Push these changes to the main branch of your remote repo&lt;/span&gt;
git push &lt;span class="nt"&gt;--set-upstream&lt;/span&gt; origin main 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, you've got a functional cloud-based development environment! To add contributors, you'll just need to ensure that they have appropriate access to your project's repository and that they've created a Gitpod account. As a next step, you might consider setting up branch protections in your GitHub repo to prevent unintentional commits to your &lt;code&gt;main&lt;/code&gt; branch by contributors, or further defining your development environment by specifying VS Code extensions to be pre-installed via your &lt;code&gt;.gitpod.yml&lt;/code&gt; configuration file.&lt;/p&gt;
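&lt;p&gt;Pre-installing editor extensions can be sketched in &lt;code&gt;.gitpod.yml&lt;/code&gt; like so (the extension IDs below are illustrative; substitute whichever extensions your team relies on):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .gitpod.yml (illustrative example)
vscode:
  extensions:
    - ms-python.python
    - ms-toolsai.jupyter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With this in place, every fresh workspace opens with the same editor tooling, which keeps new contributors from having to assemble their own extension setup.&lt;/p&gt;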

&lt;p&gt;If like our team you're planning to use Jupyter Notebook files, note that you'll get the best support using VS Code for Desktop, rather than the browser.&lt;sup id="fnref5"&gt;5&lt;/sup&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Concerns and Reflections
&lt;/h2&gt;

&lt;p&gt;Rolling out a CDE with Gitpod turned out to be pretty simple, but the "cautionary tales" aren't without virtue. With just 50 hours of container uptime a month, it's clear that Gitpod can't provide a completely cost-free solution for professional teams working full-time, and there are strong arguments to be made for simplifying this collaborative project's architecture by using just a GitHub repo and a more robust environment management tool, like Conda.&lt;/p&gt;

&lt;p&gt;Working in a nonprofit and humanitarian organization brings with it a particular set of needs and goals which aren't always reflected in tech-first corporations, and it's important not to brush aside these priorities in favor of the architecture du jour.&lt;/p&gt;

&lt;p&gt;Some other potential drawbacks to the CDE approach include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Network latency, which can be a major obstacle especially when working with contributors in areas with intermittent or weak internet connections;&lt;/li&gt;
&lt;li&gt;More limited options for managing larger file storage without introducing additional complexity;&lt;/li&gt;
&lt;li&gt;Limitations on data sovereignty given reliance on third-party hosting of project repositories.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That being said, using a CDE in the specific context of this initiative allows our team to lower barriers to meaningful contribution by focusing on commonly taught tooling (VS Code, GitHub) and to minimize time spent troubleshooting platform- and machine-specific installation woes. It gives new contributors day-one exposure to the team's work, while also allowing the abstractions upholding the 'ready-to-code' environment to be elegantly surfaced and retired: once contributors are confident setting up their own environment, they can transition to development on their local machines (with or without containerization) by cloning the repository locally and installing dependencies.&lt;/p&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;In practice, the term containerization encapsulates a broad approach in software engineering that may be used towards various ends, such as isolating the execution environment of potentially hazardous code from critical systems. IBM has a helpful overview of the topic at: &lt;a href="https://www.ibm.com/topics/containerization"&gt;https://www.ibm.com/topics/containerization&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;VS Code's &lt;a href="https://code.visualstudio.com/docs/sourcecontrol/overview"&gt;official documentation&lt;/a&gt; on using the Source Control panel is quite helpful as a teaching/learning resource! ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;I found Mike Nikle's blog post on the subject to offer a strong, if perhaps overly skeptical, view on the drawbacks of adopting CDE products: &lt;a href="https://www.mikenikles.com/blog/dev-environments-in-the-cloud-are-a-half-baked-solution"&gt;"Dev environments in the cloud are a half-baked solution"&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn4"&gt;
&lt;p&gt;It's worth noting here that Gitpod also supports GitLab and Bitbucket. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn5"&gt;
&lt;p&gt;Per Gitpod's documentation: &lt;a href="https://www.gitpod.io/docs/introduction/languages/python#jupyter-notebooks-in-vs-code"&gt;https://www.gitpod.io/docs/introduction/languages/python#jupyter-notebooks-in-vs-code&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>python</category>
      <category>github</category>
      <category>cloud</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
