DEV Community

Vaibhav Dubey

I built an open-source library to generate specialised machine learning models from natural language

Smolmodels is an open-source library that uses graph search with LLM-based code generation to automatically create lightweight, task-specific ML models from natural language descriptions.

Here's how it works on a time-series prediction example. Assume df is a pandas DataFrame containing the “air passengers” dataset from statsmodels.

    import smolmodels as sm

    model = sm.Model(
        intent="Predict the number of international air passengers (in thousands) in a given month, based on historical time series data.",
        input_schema={"Month": str},
        output_schema={"Passengers": int}
    )

    model.build(dataset=df, provider="openai/gpt-4o")

    prediction = model.predict({"Month": "2019-01"})

    sm.models.save_model(model, "air_passengers")
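
The snippet above takes df as given. As a rough sketch of the shape it presumably needs, here is the first year of the classic air passengers series laid out as rows whose keys match the two schemas. This is plain Python with no smolmodels calls. The column names are an assumption (the post does not show the dataset's actual columns), and conforms is a hypothetical helper written for this illustration, not part of the library.

```python
# Schemas copied from the example above: column name -> Python type.
input_schema = {"Month": str}
output_schema = {"Passengers": int}

# First year (1949) of the classic air passengers series, in thousands.
first_year = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]

# One row per month, keyed by the schema column names (assumed, see above).
rows = [
    {"Month": f"1949-{month:02d}", "Passengers": passengers}
    for month, passengers in enumerate(first_year, start=1)
]


def conforms(row, schema):
    """Hypothetical helper: True if the row has every schema column with the declared type."""
    return all(isinstance(row.get(column), expected) for column, expected in schema.items())


# Every row should satisfy the input and output schemas combined.
full_schema = {**input_schema, **output_schema}
all_rows_conform = all(conforms(row, full_schema) for row in rows)
print(rows[0], all_rows_conform)
```

With pandas installed, pandas.DataFrame(rows) would turn these rows into a dataframe of that shape; the full series would need to be loaded from your own copy of the dataset.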

Under the hood, the library:

  • Parses the intent to identify ML task type and constraints
  • Uses graph search to explore potential model architectures
  • Automatically optimises the solutions produced
  • Generates task-specific training code
  • Generates inference code to use the model
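
To make the search and optimisation steps above a bit more concrete, here is a toy best-first search over candidate solutions in plain Python. It is an illustrative sketch only, not smolmodels' actual implementation: the toy series, the (lag, window) candidates, and the score, refine and best_first_search functions are all hypothetical stand-ins. In the library itself the candidates are presumably LLM-generated training code rather than two integers.

```python
import heapq

# Toy data: a strictly periodic series (period 4), standing in for a real dataset.
PATTERN = [10, 20, 30, 20]
SERIES = [PATTERN[t % 4] for t in range(40)]

MAX_LAG, MAX_WINDOW = 6, 3


def score(candidate):
    """Mean absolute error of a lagged-average forecaster on the last 20 points (lower is better)."""
    lag, window = candidate
    errors = []
    for t in range(20, len(SERIES)):
        history = [SERIES[t - lag * k] for k in range(1, window + 1)]
        prediction = sum(history) / len(history)
        errors.append(abs(SERIES[t] - prediction))
    return sum(errors) / len(errors)


def refine(candidate):
    """Yield neighbouring candidates: change one hyperparameter by one step."""
    lag, window = candidate
    steps = [(lag + 1, window), (lag - 1, window), (lag, window + 1), (lag, window - 1)]
    for new_lag, new_window in steps:
        if 1 <= new_lag <= MAX_LAG and 1 <= new_window <= MAX_WINDOW:
            yield (new_lag, new_window)


def best_first_search(start, budget=30):
    """Expand the most promising candidate first until the budget runs out or the error hits zero."""
    start_score = score(start)
    frontier = [(start_score, start)]  # min-heap ordered by score
    seen = {start}
    best, best_score = start, start_score
    evaluations = 1
    while frontier and evaluations < budget and best_score > 0:
        _, current = heapq.heappop(frontier)
        for neighbour in refine(current):
            if neighbour in seen or evaluations >= budget:
                continue
            seen.add(neighbour)
            neighbour_score = score(neighbour)
            evaluations += 1
            heapq.heappush(frontier, (neighbour_score, neighbour))
            if neighbour_score < best_score:
                best, best_score = neighbour, neighbour_score
    return best, best_score


best, best_score = best_first_search(start=(1, 1))
print(best, best_score)
```

Starting from a naive candidate, the search keeps refining whichever candidate currently scores best, and on this toy series it ends up at a lag of 4, which matches the period of the data. The evaluation budget caps how many candidates are ever scored, which matters when each evaluation means training a model.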

The project is at the alpha stage, and the source is on GitHub: https://github.com/plexe-ai/smolmodels

It is licensed under Apache-2.0, so feel free to use it however you like. Or just tear it apart in the comments if you think this is dumb. We’d love some feedback, and we’re very open to code contributions!
