Amal Shaji

Posted on • Originally published at
Pasteit! - A pastebin on IPFS

IPFS stands for InterPlanetary File System. It is similar to the idea of torrents, but better. IPFS is a peer-to-peer hypermedia protocol designed to make the web faster, safer, and more open. I'm not going to nerd out over IPFS any further; if you're curious, read the IPFS whitepaper.

I stumbled upon IPFS a couple of years ago and found it interesting. Back then, the only way I knew to access IPFS was to spin up your own node (or maybe I just didn't research enough). Today there are multiple free IPFS endpoints we can use to interact with the IPFS network.

This article is about storing text data on the IPFS network for free, something I've been working on for the past few days.

Tools used

  • FastAPI
  • MongoDB
  • Svelte
  • Infura IPFS endpoint

Why use a backend?

It is easy to make GET/POST requests to the IPFS endpoint using JavaScript's fetch API. The problem is that IPFS creates a hash for each file, and this hash is the only way to identify the file.

Such hashes are not easy to remember, so we store an alias for each hash in a database.
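The idea can be sketched with an in-memory mapping (a plain dict standing in for the database; the alias and sample hash below are illustrative):

```python
# A plain dict stands in for the database: it maps a short,
# memorable alias to the long IPFS content hash.
aliases = {}

def remember(short, ipfs_hash):
    aliases[short] = ipfs_hash

def resolve(short):
    # Returns None when the alias is unknown
    return aliases.get(short)

remember("a1b2c3", "QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG")
print(resolve("a1b2c3"))  # prints the full hash
```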

FastAPI orchestrates the whole flow: we'll build API endpoints that tie the database and IPFS together.

Building the application

Setup env variables

# .env
MONGO_CON_STRING=mongodb://localhost:27017

Setup MongoDB

Let's use Docker to spin up MongoDB. Docker removes the overhead of a local installation and other basic setup.

# pull MongoDB
docker pull mongo

# Start mongo container
docker run -it -v mongodata:/data/db -p 27017:27017 --name ipfs-store -d mongo

-v mongodata:/data/db

The -v flag mounts a volume. Mapping the container's /data/db to a volume is important so the data persists even after the container is stopped. Here mongodata is a named Docker volume; Docker creates it automatically on first use, so there is no folder you need to create yourself.

# requirements.txt
# (package list inferred from the imports used below)
fastapi
uvicorn
pymongo
python-dotenv
ipfs-api

Code the database

We'll use pymongo to communicate with our database.

# database/

from os import getenv
from typing import Optional

from pymongo import MongoClient

class DataBase:
    def __init__(self) -> None:
        self.client = MongoClient(getenv("MONGO_CON_STRING"))
        self.db = self.client.pasteit
        self.col = self.db.links

    def set(self, short: str, hash: str) -> str:
        # If this hash was already stored, reuse its existing short
        short_exists = self.col.find_one({"hash": hash})
        if short_exists is not None:
            return short_exists.get("short")
        data = {"short": short, "hash": hash}
        self.col.insert_one(data)
        return short

    def get(self, short: str) -> Optional[str]:
        data = self.col.find_one({"short": short})
        if data is not None:
            return data.get("hash")
        return None

    def close(self) -> None:
        self.client.close()

Creating abstractions like this makes the code easier to read. I defined the set and get methods as a series of pymongo operations that get the job done.

Every database insertion will be a document of this form,

    {"short": "<short id>", "hash": "<ipfs hash>"}

You can also use Redis here, since all insertions are simple key-value pairs; I used MongoDB because this application is deployed on Vercel with MongoDB Atlas.

The code above is fairly simple. The get method fetches the hash for a given short, and the set method stores a short: hash pair, after first making sure the hash isn't already in the database.
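The deduplication behaviour of set can be illustrated without a running MongoDB. Below, an in-memory list stands in for the links collection (this is a sketch of the semantics, not the pymongo code above):

```python
# In-memory stand-in for the `links` collection, to illustrate
# the set/get semantics: identical content (same hash) reuses
# the short that was stored first.
collection = []

def set_short(short, ipfs_hash):
    for doc in collection:
        if doc["hash"] == ipfs_hash:  # hash already stored?
            return doc["short"]       # reuse its existing short
    collection.append({"short": short, "hash": ipfs_hash})
    return short

def get_hash(short):
    for doc in collection:
        if doc["short"] == short:
            return doc["hash"]
    return None

first = set_short("aaa111", "QmHash")
second = set_short("bbb222", "QmHash")  # same content, new short requested
print(first == second)  # True: the original short is returned both times
```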

Make the IPFS connection

# ipfs/

import ipfsApi
from uuid import uuid4

class IPFS:
    def __init__(self) -> None:
        self.ipfs = ipfsApi.Client("", 5001)  # Infura IPFS endpoint

    def add(self, text: str) -> str:
        # Write the text to a temporary file, then upload the file
        filename = f"/tmp/{str(uuid4())}"
        with open(filename, "w") as f:
            f.write(text)
        res = self.ipfs.add(filename)
        return res[0].get("Hash")

    def cat(self, hash: str) -> str:
        data = self.ipfs.cat(hash)
        return data

Communication with the IPFS endpoint is just GET/POST requests with a payload, but you need to take care of the encoding. I used a library that has already done the basics for us.

We define an add method, which writes the input string to a file and then uploads it to IPFS. The cat method reads the data using the hash.

Code the server

The server has two endpoints: /api/v1 to post the text to be uploaded, and / to fetch data using the short URL.


# DataBase and IPFS are the classes defined in database/ and ipfs/ above

from uuid import uuid4

from fastapi import Depends, FastAPI
from pydantic import BaseModel

app = FastAPI()

class Data(BaseModel):
    text: str

async def connection() -> dict:
    return {"db": DataBase(), "ipfs": IPFS()}

@app.post("/api/v1/")
async def pasteit(data: Data, con: dict = Depends(connection)) -> dict:
    hash = con["ipfs"].add(data.text)
    short = str(uuid4())[:6]
    short = con["db"].set(short, hash)
    return {"message": short}

@app.get("/{short}")
async def get_paste(short: str, con: dict = Depends(connection)) -> dict:
    hash = con["db"].get(short)
    if hash is not None:
        data = con["ipfs"].cat(hash)
        return {"message": data}
    return {"message": "invalid short"}

Here we assume that all data is uploaded successfully. We then create a custom identifier for each hash using the first six characters of uuid.uuid4(). Let's run a collision test on this method of short generation.


from uuid import uuid4

def get_id() -> str:
    return str(uuid4())[:6]

def test_n(n: int) -> None:
    outputs = [get_id() for _ in range(n)]
    unique_outputs = set(outputs)
    fraction = 1 - (len(unique_outputs) / len(outputs))
    print(f"Test for {n} shorts, collision: {fraction*100:.2f}")

if __name__ == "__main__":
    for n in (100, 1000, 10000, 100000, 1000000):
        test_n(n)
-> python
Test for 100 shorts, collision: 0.00
Test for 1000 shorts, collision: 0.00
Test for 10000 shorts, collision: 0.05
Test for 100000 shorts, collision: 0.26
Test for 1000000 shorts, collision: 2.93

-> python
Test for 100 shorts, collision: 0.00
Test for 1000 shorts, collision: 0.00
Test for 10000 shorts, collision: 0.01
Test for 100000 shorts, collision: 0.27
Test for 1000000 shorts, collision: 2.92

I'd say the test passed, except for n = 1,000,000, which produced ~29,000 collisions (about 3%). But it's safe to assume we're not going to get that many requests in a short span of time.
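These numbers line up with the birthday-problem estimate: six hex characters give N = 16^6 ≈ 16.8 million possible shorts, and the expected number of unique ids after n draws is roughly N(1 - e^(-n/N)). The quick check below is my own back-of-the-envelope calculation, not from the original post:

```python
import math

def expected_collision_fraction(n: int, space: int) -> float:
    # Expected fraction of duplicate shorts when drawing n ids
    # uniformly from `space` possibilities (birthday-problem estimate):
    # expected unique ids ≈ space * (1 - e^(-n/space))
    expected_unique = space * (1 - math.exp(-n / space))
    return 1 - expected_unique / n

N = 16 ** 6  # six hex characters taken from uuid4
for n in (10_000, 100_000, 1_000_000):
    print(f"n={n}: ~{expected_collision_fraction(n, N) * 100:.2f}% duplicates")
```

For n = 1,000,000 this predicts about 2.9% duplicates, matching the ~2.93% measured above.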

The frontend

# src/App.svelte

<script>
    let data = "";
    let hash = "";
    const upload = () => {
        fetch("http://localhost:8000/api/v1", {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({ text: data }),
        })
            .then((res) => res.json())
            .then((json) => (hash = json.message));
    };
</script>

<textarea id="data" bind:value={data} />
<button id="upload" on:click={upload}>Upload</button>

This code should give you a fair idea of how the frontend is built. The current text limit is set to 200 characters.

What's next for pasteit!?

I'm planning to convert this into a file-sharing service on IPFS. Maybe throw in a little encryption to make people interested!



Check out the final application at

GitHub :
