In today’s data-driven world, APIs play a vital role in connecting services and applications. APIs allow access to real-time information from anywhere on the web.
We will learn how to extract data from an API using Python with the powerful requests library.
💡 What Is an API?
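An API (Application Programming Interface) is a set of rules that lets one program request data or services from another. When we say "extracting data from an API," we're typically referring to making HTTP requests to an endpoint provided by a service and getting back structured data, usually as JSON.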
1. 📦 Getting Started: Installing the requests Library
pip install requests
2. Python Script to Fetch Data
# Import the requests library
import requests
# Define the URL to fetch data from
url = "https://jsonplaceholder.typicode.com/posts"
# Make the GET request to the URL defined above
response = requests.get(url)
# Check if the request was successful
if response.status_code == 200:
    data = response.json()  # parse the JSON data
else:
    print("Failed to retrieve data:", response.status_code)
Understanding the Code
- requests.get(url): Sends an HTTP GET request to the specified URL.
- response.status_code: Shows the result of the request (200 means OK).
- response.json(): Parses the JSON response into a Python list or dictionary.
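To make the last point concrete, here is a minimal offline sketch of the kind of objects response.json() hands back. The sample payload is made up for illustration, although it mirrors the userId, id, title, and body fields that JSONPlaceholder posts use. titles_by_user is a hypothetical helper, not part of requests.

```python
import json

# Made-up sample in the same shape as a JSONPlaceholder /posts response
sample_body = """
[
  {"userId": 1, "id": 1, "title": "first post", "body": "hello"},
  {"userId": 1, "id": 2, "title": "second post", "body": "world"},
  {"userId": 2, "id": 3, "title": "third post", "body": "again"}
]
"""

# A JSON array becomes a Python list and each JSON object becomes a dict
posts = json.loads(sample_body)


def titles_by_user(posts, user_id):
    """Return the titles of all posts written by the given user."""
    return [post["title"] for post in posts if post["userId"] == user_id]


print(type(posts).__name__, type(posts[0]).__name__)
print(titles_by_user(posts, 1))
```

Once the data is an ordinary list of dictionaries, the usual Python tools (loops, comprehensions, indexing by key) apply directly.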
When we send requests to an API, the status code tells us the result of each request.
Here are some common HTTP status codes:
Status Code | Meaning | Description |
---|---|---|
200 | OK | The request was successful and the server returned the data. |
400 | Bad Request | The server could not understand the request due to invalid syntax. |
403 | Forbidden | You do not have permission to access the resource. |
404 | Not Found | The requested resource does not exist. |
500 | Internal Server Error | The server encountered an unexpected condition. |
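If you want these meanings available in code, Python's standard library already knows the standard reason phrases. The sketch below is one possible way to label a status code. describe_status and its category names are hypothetical choices made for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # code is not a registered HTTP status

    if 100 <= code < 200:
        category = "informational"
    elif 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirect"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return f"{code} {phrase} ({category})"


for code in (200, 400, 403, 404, 500):
    print(describe_status(code))
```

Grouping by range is handy because a 4xx code usually means the request needs fixing, while a 5xx code points to a problem on the server side.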
Handling Query Parameters
Many APIs require you to send additional information as query parameters in your URL. For example, let’s say we have an API that lets us search for users, and we can specify the user’s name as a query parameter, like this: https://api.example.com/users?name=john.
Here’s how you can do it with requests:
import requests
url = "https://api.example.com/users"
params = {'name': 'john'}
response = requests.get(url, params=params)
print(response.json())
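When you pass params, requests builds the final URL for you. The standard-library sketch below approximates that encoding step so you can see what the query string looks like without sending a request. build_url is a hypothetical helper and the page parameter is made up for illustration.

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Append params to base_url as a URL-encoded query string."""
    return f"{base_url}?{urlencode(params)}"


# -> https://api.example.com/users?name=john
print(build_url("https://api.example.com/users", {"name": "john"}))

# Spaces and other special characters are escaped for you
# -> https://api.example.com/users?name=john+doe&page=2
print(build_url("https://api.example.com/users", {"name": "john doe", "page": 2}))
```

With a real request, printing response.url shows the URL that was actually sent, which is useful when debugging a query that returns unexpected results.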
Handling Errors
Things can go wrong: the server might be down, the request might time out, or the API might return an error status. It's always good practice to include error handling:
try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Raise an exception for bad status codes
    data = response.json()
except requests.exceptions.RequestException as e:
    print("An error occurred:", e)
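If you want different messages for different failures, requests exposes more specific exception classes that all derive from RequestException. The following is a sketch under that assumption. fetch_json is a hypothetical helper name, and returning None on failure is just one possible design.

```python
import requests


def fetch_json(url, timeout=10):
    """Return the parsed JSON from url, or None if anything goes wrong."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.Timeout:
        print("The request timed out")
    except requests.exceptions.ConnectionError:
        print("Could not connect to the server")
    except requests.exceptions.HTTPError as e:
        print("Server returned an error status:", e.response.status_code)
    except requests.exceptions.RequestException as e:
        # Catch-all for anything else requests raises, e.g. a malformed URL
        print("Request failed:", e)
    except ValueError:
        # Older requests versions raise a plain ValueError for invalid JSON
        print("Response body was not valid JSON")
    return None


# A URL without a scheme fails before any network traffic is sent
print(fetch_json("not-a-valid-url"))
```

The order of the except clauses matters: the specific classes come first and the general RequestException last, otherwise the general handler would swallow everything.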
Fetching Stock Data from Alpha Vantage
Alpha Vantage provides real-time and historical financial market data through a set of powerful data APIs and spreadsheets.
For this project we will use a free API to fetch daily, weekly, and monthly stock data and turn it into a clean, usable format.
We will define a Python function, get_time_series, that connects to the API, requests stock data for a given symbol, formats the data into a neat pandas DataFrame, and returns the result.
import requests
import pandas as pd

API_KEY = 'your_api_key_here'  # Alpha Vantage API key
BASE_URL = 'https://www.alphavantage.co/query'  # Alpha Vantage API endpoint

def get_time_series(symbol, function):
    params = {
        'function': function,
        'symbol': symbol,
        'apikey': API_KEY,
        'datatype': 'json'
    }
    response = requests.get(BASE_URL, params=params, timeout=10)
    data = response.json()
    # Check for any API errors
    if 'Error Message' in data or 'Note' in data:
        raise ValueError(f"Error fetching data for {symbol}: {data.get('Error Message', data.get('Note', 'Unknown error'))}")
    # Determine the key for the time series data
    time_series_key = next((k for k in data if 'Time Series' in k), None)
    if time_series_key is None:
        raise ValueError(f"No time series data found for {symbol}")
    df = pd.DataFrame.from_dict(data[time_series_key], orient='index')
    df.index = pd.to_datetime(df.index)
    df = df.sort_index(ascending=False)  # Sort so the most recent data comes first
    return df
🧑‍💻 Function Breakdown: get_time_series(symbol, function)
The function get_time_series(symbol, function) does the following:
- Takes Two Parameters:
  - symbol: The stock ticker (e.g., 'AAPL' for Apple).
  - function: The type of time series data (e.g., 'TIME_SERIES_DAILY', 'TIME_SERIES_WEEKLY').
- Sends a Request to Alpha Vantage:
  - Uses the provided symbol and function to request data from Alpha Vantage’s API.
- Extracts the Time Series Data:
  - Looks through the API response to find the key that holds the time series data (like "Time Series (Daily)").
- Converts to a pandas DataFrame:
  - The raw data from the API is turned into a DataFrame, which makes it easier to manipulate and analyze.
  - The DataFrame is sorted so that the most recent data comes first.
- Returns the Data:
  - The function returns the cleaned-up DataFrame containing the stock's time series data.
In essence, this function grabs stock data from Alpha Vantage, converts it to a readable format, and sorts it for easy use!
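The error check and the key lookup can be tried offline, without an API key. The sketch below isolates those two steps in a hypothetical helper, extract_time_series, and runs it on a made-up payload shaped like a daily response. The dates and prices are invented.

```python
def extract_time_series(data):
    """Return (key, series) for the time series part of a parsed response."""
    # The function above treats these two keys as signs of an API problem
    for error_key in ("Error Message", "Note"):
        if error_key in data:
            raise ValueError(data[error_key])

    # The series key varies with the requested function, so search for it
    for key, value in data.items():
        if "Time Series" in key:
            return key, value
    raise ValueError("No time series found in response")


# Made-up payload shaped like a daily response
sample = {
    "Meta Data": {"2. Symbol": "AAPL"},
    "Time Series (Daily)": {
        "2024-01-02": {"4. close": "100.0"},
        "2024-01-03": {"4. close": "101.0"},
    },
}

key, series = extract_time_series(sample)
print(key)
print(sorted(series, reverse=True))  # most recent date first
```

Searching for the substring "Time Series" rather than hard-coding one key is what lets the same code handle daily, weekly, and monthly responses.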
Here’s how you’d use the function:
apple_data = get_time_series('AAPL', 'TIME_SERIES_DAILY')
print(apple_data.head())
This will show you Apple’s most recent daily stock data. In the raw response the columns are labeled 1. open, 2. high, 3. low, 4. close, and 5. volume, and the values arrive as strings.
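Alpha Vantage's JSON labels the columns with numeric prefixes such as "1. open" and returns every value as a string, so a small clean-up step helps before doing any arithmetic. The sketch below shows one way to do that. clean_columns is a hypothetical helper, and the rows are made-up numbers standing in for a real response.

```python
import pandas as pd

# Made-up rows in the shape get_time_series returns for daily data
raw = pd.DataFrame.from_dict(
    {
        "2024-01-03": {"1. open": "101.0", "2. high": "103.5", "3. low": "100.2",
                       "4. close": "102.8", "5. volume": "1200"},
        "2024-01-02": {"1. open": "100.0", "2. high": "101.7", "3. low": "99.4",
                       "4. close": "100.9", "5. volume": "900"},
    },
    orient="index",
)
raw.index = pd.to_datetime(raw.index)


def clean_columns(df):
    """Strip numeric prefixes from column names and convert values to numbers."""
    # "1. open" -> "open"; names without a prefix are left unchanged
    cleaned = df.rename(columns=lambda name: name.split(". ", 1)[-1])
    # Convert each column from strings to a numeric dtype
    return cleaned.apply(pd.to_numeric)


clean = clean_columns(raw)
print(clean.dtypes)
print(clean["close"].mean())
```

After this step, expressions like clean["close"].mean() or clean["high"] - clean["low"] work as numeric operations instead of failing or concatenating strings.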
🧹 Conclusion
Extracting data from APIs using Python is an essential skill for developers, data scientists, and data engineers. With just a few lines of code using the requests library, you can tap into vast amounts of real-time information from across the internet.
Whether you're building something big or just experimenting, APIs open a gateway to endless possibilities.