Flask used to be the first thing that came to mind when data scientists wanted to spin up a Python-based data science app, but there is a better option now. To put an interactive front end on a machine learning or visualization script, Streamlit is much faster, since it removes the need to write any front-end code.
We'll go through, step by step, how to build a Streamlit app. I will also review some pros and cons of Streamlit.
Who is this for
Anyone who wants to put an interactive user interface or a visible front end on their Python scripts.
Prerequisite
Python knowledge
What can it do
Streamlit can be used to build machine learning/AI apps, display exploratory or analytical data visualizations, or both at the same time.
⛵ Getting started
Hello world
To get started, first install the library
pip install streamlit
Then create a folder called streamlit_demo (or your preferred name) and add an app.py file to it.
import streamlit as st
st.text("Hello world")
In a terminal, cd into the folder and run
streamlit run app.py
Streamlit prints the app's local URL in the terminal.
A browser window then opens automatically, and you have your first app up and running 🎉
Next we can start adding UI components. In app.py each component behaves like a variable: calling it returns the widget's current value. Add this snippet to see what that means.
if st.checkbox("show"):
    st.title("My first app")
Common UI components include the radio button, checkbox, selectbox, slider, and text input; the full list can be found in this documentation.
Now that we have a basic idea of how this works, we can start reading data and make a simple plot. The data and code for the following sections can be found in this gist.
import streamlit as st
import pandas as pd
import numpy as np
# header
st.title("London Bikes")
st.subheader("Number of Trips starting from Hyde Park Corner")
# read data - source: London Bicycle Hires from Greater London Authority on Google Datasets via Bigquery
df = pd.read_csv('london_bikes.csv')
# area plot
df['start_day'] = pd.to_datetime(df['start_date']).dt.date
df_trips_by_day = df['start_day'].value_counts()
st.area_chart(df_trips_by_day)
Using Streamlit's built-in charting function, we made our first plot in a few lines of code, and it's interactive and downloadable!
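One detail worth knowing about the aggregation step: value_counts() orders its result by frequency, not by date. The sketch below, which uses a small made-up DataFrame in place of london_bikes.csv (only the start_date column is assumed), adds sort_index() so the daily counts come out in chronological order. The helper name trips_by_day is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for london_bikes.csv; the values are invented.
sample = pd.DataFrame({
    "start_date": [
        "2017-06-02 08:15:00",
        "2017-06-01 09:00:00",
        "2017-06-02 17:30:00",
        "2017-06-02 18:05:00",
        "2017-06-01 12:45:00",
        "2017-06-03 10:00:00",
    ]
})


def trips_by_day(df: pd.DataFrame) -> pd.Series:
    """Count trips per calendar day, ordered chronologically."""
    days = pd.to_datetime(df["start_date"]).dt.date
    # value_counts() sorts by count; sort_index() restores date order.
    return days.value_counts().sort_index()


daily = trips_by_day(sample)
print(daily)
```

The resulting Series can be passed to st.area_chart in the same way as df_trips_by_day above.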
Going one step further, one can visualize the destinations of all the trips that started from Hyde Park Corner.
💡 Tips and Tricks
As one adds libraries and functions to the app, it becomes more important to organize the dependencies in one place. A nifty, optional trick to make the workflow easier is to create a Makefile and a requirements.txt.
requirements.txt lists the packages needed for the app to run.
The Makefile provides recipes for setting up the app.
For example, when you run make install, it will automatically install all the dependencies.
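As a rough sketch, the two files could look like this for the app above. The package list simply mirrors the imports in app.py, and the run target is an extra convenience assumed here, not something from the original project.

```
streamlit
pandas
numpy
```

```makefile
# Makefile (recipe lines must be indented with a tab, not spaces)
.PHONY: install run

# Install everything listed in requirements.txt
install:
	pip install -r requirements.txt

# Start the app locally
run:
	streamlit run app.py
```

In practice it is usually worth pinning versions in requirements.txt so that the app behaves the same on another machine.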
Now that we have had a glimpse of what Streamlit can do, we can weigh it against other frameworks.
⚖️ Pros and Cons
Streamlit has a long list of pros which I love:
Accessible app making for everyone (who uses Python). This is the main draw, since it can save time by allowing one to focus on the data science aspect, and also suits those who may not want to learn HTML/CSS. The learning curve is fairly flat.
Covers most common UI components needed in a data app. Plus, they look good! It includes sliders, checkboxes, radio buttons, a collapsible sidebar, a progress bar, file upload, etc. Overall, these features and the ease of use are impressive. It would be even nicer to have an information button next to certain components to offer further explanation.
Supports multiple interactive visualization libraries, such as Plotly, Altair, Bokeh, Vega-Lite, and pydeck.
And here are some other libraries it's compatible with.
And since it's Python-based, one can run an ML algorithm in the same app and then plot charts of its output, such as cluster or classification labels.
💡 Tips and Tricks
To view more UI chart components in action, this demo app is a good place to start where you can view both the chart and code side-by-side.
Some cons:
I shall start off by saying that none of these cons is prohibitive. They are more nice-to-haves, listed here to keep the evaluation of the tool balanced.
- Convenience vs flexibility. This is not specific to Streamlit: the higher level a framework is, the less customization it tends to provide.
In Streamlit's case, one cannot currently customize the layout (though it's on their roadmap), which makes it less suitable for complex dashboards that require a container-like layout. A Streamlit layout mainly consists of a big vertical panel and a small side panel. One cannot really position two charts side by side or add custom elements at specific locations. If you want to customize everything, you might have to use React.js/Vue.js.
- Size of data input. Streamlit has a soft limit of 50MB for data upload. I will be interested to see how the framework enables large-scale applications going forward.
- Limited support for video/animation. There is no native support for video formats and no play button. This can limit certain use cases involving video analysis and animation.
Last but not least, one needs to be aware that Streamlit is focused on DS/ML use cases and thus will not offer the full suite of functionality that Flask/Django have. This is not necessarily a con; having a focus could be a good thing. It simply means that if one wants to build apps with other features such as user authentication, newsletter subscription, or user-to-user interaction, it's better to look elsewhere.
⌛ Wrap-up
Streamlit is simple enough that everyone can use it. It can be a superb option if you need a quick solution for an interactive data app, especially one with both a machine learning algorithm running in the background and interactive plots. Hopefully we had some fun building data science apps today. In a future post, we'll cover how to deploy the app to the cloud and containerize it using Docker.
For more information, these resources might be useful.
awesome-streamlit
streamlit gallery