
KILLALLSKYWALKER


The Chosen One: Mage AI

As I wrote previously about how I found Mage AI, I will now try to explain why it is the chosen one, at least for me.

Why "the chosen one"? There is a long-running debate about who the chosen one actually is, Anakin or Luke, and the answer depends on how you see it. The same goes for tools: the right tool for the right job.

The chosen one

When I tried Mage AI, from the very first run it already felt like the chosen one:

  • Really simple to start: with a single docker command, or with docker compose, you can already use it end to end

  • Interactive development: it uses notebook-style blocks, so it is really easy to test and get feedback

  • Production ready: scheduling, monitoring, built-in retries, notifications, and a lot of connectors and plugins (of course there are some tweaks you need to make, but they are really minimal)

  • Team friendly: really easy for beginners to use (when I built the pipeline I was the only person on the team who knew how to use it, but after I did a sharing session on the tool, bringing other devs on board was really easy)

Instead of struggling to learn the tool, I could build the pipeline right away.

How Simple It Actually Is

This is just a demo to show why it is simple, interactive, production ready, and team friendly.

docker run -it -p 6789:6789 -v $(pwd):/home/src mageai/mageai /app/run_app.sh mage start demo

Once it is running, go to localhost:6789 and you will see this dashboard:

Mage AI Dashboard

Add your first pipeline by clicking the Pipelines menu:

Pipelines

Once you are on the Pipelines page, click the New button and fill in the information like below:

First pipeline

Once done, you can see your pipeline's details, where you have the option to add blocks such as data loader, transformer, data exporter, and others. For now we only need three blocks: data loader, transformer, and data exporter. Once these three are set up, we already have the complete set of components that covers the ETL flow:

  1. Extract (using the data loader block)
  2. Transform (using the transformer block)
  3. Load (using the data exporter block)

Actually, you can chain more blocks for a more complex and more dynamic flow, but for this demo these three are sufficient.
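To make the flow concrete, here is a minimal plain-Python sketch of the same three steps, without Mage AI at all. The function names and the sample rows are made up for illustration; in Mage AI each function would live in its own block, and the output of one block is passed to the next.

```python
import csv
import io


def extract() -> list[dict]:
    # Stand-in for the data loader block: hard-coded rows instead of an API call.
    return [
        {"name": "Tatooine", "climate": "arid", "url": "https://example.invalid/planets/1"},
        {"name": "Hoth", "climate": "frozen", "url": "https://example.invalid/planets/2"},
    ]


def transform(rows: list[dict]) -> list[dict]:
    # Stand-in for the transformer block: drop a column we do not need.
    return [{k: v for k, v in row.items() if k != "url"} for row in rows]


def load(rows: list[dict]) -> str:
    # Stand-in for the data exporter block: build CSV text instead of writing a file.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Each step's output is the next step's input, just like chained blocks.
csv_text = load(transform(extract()))
print(csv_text)
```

Because every step only depends on the output of the previous one, adding a fourth step is just one more function in the chain, which is the same idea as chaining more blocks.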

Complete Block Setup

Data Loader Block
Ignore @test for now; it is not covered in this demo. This block is where we load the planets data from the SWAPI API.

import io
import pandas as pd
import requests
if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader
if 'test' not in globals():
    from mage_ai.data_preparation.decorators import test


@data_loader
def load_data_from_api(*args, **kwargs):
    """
    Load Star Wars planets data from SWAPI
    """
    url = 'https://swapi.info/api/planets'
    # A timeout keeps the block from hanging forever if the API does not respond
    response = requests.get(url, timeout=30)

    # Check if request was successful
    if response.status_code != 200:
        raise Exception(f"API request failed with status code: {response.status_code}")

    data = response.json()

    df = pd.DataFrame(data)

    return df


@test
def test_output(output, *args) -> None:
    """
    Template code for testing the output of the block.
    """
    assert output is not None, 'The output is undefined'
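To see what `pd.DataFrame(data)` does with the JSON response, here is a small offline sketch. The two records are hand-written stand-ins that only mimic the shape of a planets response (a list of objects, some with list-valued fields); they are not real API output.

```python
import pandas as pd

# Hand-written stand-in for response.json(); NOT real API output.
data = [
    {"name": "Tatooine", "climate": "arid", "residents": ["r1", "r2"], "films": ["f1"], "url": "u1"},
    {"name": "Hoth", "climate": "frozen", "residents": [], "films": ["f2"], "url": "u2"},
]

# Each dict becomes one row; each key becomes one column.
df = pd.DataFrame(data)

print(df.shape)
print(list(df.columns))
```

List-valued columns such as residents and films do not fit neatly into a flat CSV, which is a reasonable motivation for dropping them in the next block.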

Transformer Block
You can directly use the template that the block provides for the transformation. Don't worry, you can still add your own methods for your own transformations.

from mage_ai.data_cleaner.transformer_actions.base import BaseAction
from mage_ai.data_cleaner.transformer_actions.constants import ActionType, Axis
from mage_ai.data_cleaner.transformer_actions.utils import build_transformer_action
from pandas import DataFrame

if 'transformer' not in globals():
    from mage_ai.data_preparation.decorators import transformer
if 'test' not in globals():
    from mage_ai.data_preparation.decorators import test


@transformer
def execute_transformer_action(df: DataFrame, *args, **kwargs) -> DataFrame:
    """
    Execute Transformer Action: ActionType.REMOVE

    Docs: https://docs.mage.ai/guides/transformer-blocks#remove-columns
    """
    action = build_transformer_action(
        df,
        action_type=ActionType.REMOVE,
        arguments=['residents','films','url'],  # Specify columns to remove
        axis=Axis.COLUMN,
    )

    return BaseAction(action).execute(df)


@test
def test_output(output, *args) -> None:
    """
    Template code for testing the output of the block.
    """
    assert output is not None, 'The output is undefined'

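If you would rather write your own transformation than use the template action, a plain pandas version of the same column removal could look like the sketch below. The function name `remove_columns` and the sample frame are hypothetical; inside Mage AI the same logic would go in the body of a function decorated with `@transformer`.

```python
import pandas as pd


def remove_columns(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    # errors="ignore" skips names that are not present instead of raising KeyError.
    return df.drop(columns=columns, errors="ignore")


# Hypothetical sample frame with the same column names as the demo.
sample = pd.DataFrame(
    {
        "name": ["Tatooine"],
        "climate": ["arid"],
        "residents": [["r1"]],
        "films": [["f1"]],
        "url": ["u1"],
    }
)

cleaned = remove_columns(sample, ["residents", "films", "url"])
print(list(cleaned.columns))
```

`DataFrame.drop` returns a new frame and leaves the input untouched, so the upstream data stays as it was loaded.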

Export Block
The demo only exports to CSV. In a real case we would export to the real destination.

from mage_ai.io.file import FileIO
from pandas import DataFrame

if 'data_exporter' not in globals():
    from mage_ai.data_preparation.decorators import data_exporter


@data_exporter
def export_data_to_file(df: DataFrame, **kwargs) -> None:
    """
    Template for exporting data to filesystem.

    Docs: https://docs.mage.ai/design/data-loading#fileio
    """
    filepath = 'star_wars.csv'
    FileIO().export(df, filepath)


Now you have all the blocks and the pipeline is complete. What is left is to set the schedule for when you want it to be triggered. Once you set that, it is good to go :) So can you see how simple and easy it is to get an end-to-end pipeline up and running with Mage AI? The chosen one, Mage AI.
