Before food delivery apps came along, ordering food online was a pain. I started using Swiggy, the Indian food delivery app, in 2016, and because it made it way too easy to get food delivered, my orders went through the roof. I thought it would be fun to analyze how much, and what, I have ordered.
This post has two sections: a) how I did it, and b) the numbers and my analysis. Depending on your interest, you can skip either.
Swiggy doesn't have an API to get your personal data. However, it has a page that lists your past orders. Trouble is, it only shows five orders at a time, and you have to click "Show more orders" to go further back. Naturally, you have to automate it somehow.
I first logged in to the web version of the app, since it's comparatively easier to analyze the requests there. It seems the orders page makes an XHR request to the URL `/dapi/order/all?order_id=$$ORDER_ID$$` to fetch the orders, where `$$ORDER_ID$$` is used for pagination.
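The pagination scheme boils down to a loop: fetch a page, note the id of its last order, and pass that id as `order_id` on the next request. Here's a minimal mock of that flow, where the hypothetical `fetch_page` stands in for the real curl call and the ids are invented:

```shell
# fetch_page is a stand-in for the real HTTP request (hypothetical).
# It returns a space-separated "page" of order ids, newest first.
fetch_page() {
    case "$1" in
        "")   echo "105 104 103 102 101" ;;  # first page: no order_id param
        101)  echo "100 99 98 97 96"     ;;  # second page, starting after 101
        96)   echo ""                    ;;  # empty page: we reached the end
    esac
}

next_order_id=""
all=""
while true; do
    page=$(fetch_page "$next_order_id")
    [ -z "$page" ] && break          # no more orders: stop
    all="$all $page"
    next_order_id="${page##* }"      # last id on the page seeds the next call
done
echo "fetched:$all"
```

The same structure carries over directly once `fetch_page` is replaced with the actual curl request.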
The response is an array of JSON objects with a lot of fields.
We don't need to figure out which parts of the response we need just yet; we can store it all and worry about that later.
I will use Postgres to store this data because a) it just works, and b) it has great support for querying JSON. Installing Postgres locally is really simple on Macs via the Postgres app.
Here's the schema I will use:

```sql
CREATE TABLE swiggy (
    order_id text PRIMARY KEY,
    data     jsonb
);
```
Why just two columns? Because `order_id` is enough to uniquely identify every order, and in the `data` column we can dump the whole object. As I said earlier, Postgres makes it really easy to query JSON.
Let's get the data! I will be using bash for this. "BASH???? ARGHHH. WHY???", you may ask. I used to have a similar aversion towards bash, but in the past year, I have realized it makes it really easy to do certain things once you get the hang of it.
It's perfectly fine if you want to use a different approach, as the basic ideas are very much portable: using `axios` instead of cURL, for example.
First, the basic request. We need to figure out whether a simple request for the orders works at all. The easiest way to do that is to "Copy as cURL" the request from the developer console and run it in your terminal.
If it works, that's great! You can then start removing the unneeded parameters: extra headers, user agent, and tracking cookies. After trimming, the basic request to get orders boiled down to this:
```shell
curl -sL "https://www.swiggy.com/dapi/order/all?order_id=$next_order_id" \
    -H "cookie: _session_tid=$SESSION_ID"
```
`$SESSION_ID` is the long parameter visible in the first screenshot.
Once the request works, we can parse the JSON and iterate over the orders:

```shell
IFS=$'\n'
for order in $(echo "$all_orders" | jq -c '.data.orders[]'); do
    echo "$order"
done
```

Note the `[]` at the end of the jq filter: it expands the array so that each order becomes a separate item, printed as a single compact line thanks to `-c`.
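To see exactly what the loop iterates over, here's the same filter run on a toy response. The field names mirror the real API's shape, but the values are made up:

```shell
# A toy response with the same shape as the real reply; values are invented.
# jq -c prints each order as one compact line, one line per loop iteration.
sample='{"data":{"orders":[{"order_id":"1","restaurant_name":"Chai Shai"},{"order_id":"2","restaurant_name":"Burger King"}]}}'
echo "$sample" | jq -c '.data.orders[]'
```

Each output line is a self-contained JSON object, which is what makes the line-oriented bash loop workable at all.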
Why the `IFS=$'\n'`? Because by default, bash also splits on tabs and spaces, and various names in the response (restaurant names, delivery person names) can contain spaces.
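The difference is easy to demonstrate with a small, self-contained example (the names here are just placeholders):

```shell
# Word-splitting demo: names contain spaces, so the default IFS breaks one
# name into several loop items, while IFS=$'\n' keeps each line intact.
names=$'Chai Shai\nBurger King'

count_default=0
for n in $names; do count_default=$((count_default + 1)); done   # splits on spaces too

IFS=$'\n'
count_newline=0
for n in $names; do count_newline=$((count_newline + 1)); done   # splits on newlines only
unset IFS

echo "default=$count_default newline=$count_newline"
```

With the default IFS the two names become four items; with `IFS=$'\n'` they stay as two.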
Now we just need to store each order in the DB and fetch the next page. Putting it all together, the whole script looks like this:
```shell
#!/bin/bash

# Your Swiggy session ID, check screenshot #1
SESSION_ID="__SESSION_ID__"

while true; do
    all_orders=$(curl -sL "https://www.swiggy.com/dapi/order/all?order_id=$next_order_id" \
        -H "cookie: _session_tid=$SESSION_ID")

    IFS=$'\n'
    for order in $(echo "$all_orders" | jq -c '.data.orders[]'); do
        # We wrap $order in $$ (Postgres dollar-quoting) because it takes
        # care of quotes and spaces in the data
        psql "postgres://shubhamjain@localhost/stats" -c "
            INSERT INTO swiggy (order_id, data)
            VALUES (
                '$(echo "$order" | jq -r '.order_id')',
                \$\$$order\$\$
            );
        "
    done

    # The next order_id is the order_id of the last object in the array
    next_order_id=$(echo "$all_orders" | jq -r '.data.orders[-1].order_id')
done
```
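If the dollar-quoting trick looks opaque, here is what it expands to in isolation, using a toy order object rather than real data:

```shell
# \$\$ in the bash string becomes a literal $$ in the SQL, so Postgres
# treats everything between the two $$ markers as one string literal,
# quotes and all. (Toy order object, not real data.)
order='{"order_id":"123","restaurant_name":"Chai Shai"}'
sql="INSERT INTO swiggy (order_id, data) VALUES ('123', \$\$$order\$\$);"
echo "$sql"
```

The JSON lands inside the `$$ ... $$` pair untouched, which is why quotes in restaurant names don't break the statement.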
After running the script, you should see something like this:
At some point it will start failing, which means you've reached the end and can quit the script.
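Instead of waiting for the failure, the loop could also check explicitly whether the page is empty. A sketch, with `$all_orders` hard-coded to what the end-of-history response presumably looks like:

```shell
# Explicit stop condition: zero orders on the page means we're done.
# $all_orders is hard-coded here to a presumed end-of-history response.
all_orders='{"data":{"orders":[]}}'

count=$(echo "$all_orders" | jq '.data.orders | length')
if [ "$count" -eq 0 ]; then
    echo "reached the end"
fi
```

Dropping a `break` in that branch would let the script exit cleanly on its own.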
Note: None of this is the ideal way to do things. I wasn't interested in programming properly, only hacking together something cool.
Cool! Now we have the data. Let's jump to answering questions.
A LOT! I knew I was ordering a ton, but I had never put a number on it. Actually doing that with this project, I was taken aback by the number of orders I had made and the money I had spent.
```sql
SELECT COUNT(*),
       EXTRACT('year' FROM DATE(data ->> 'order_time')::TIMESTAMP) AS year
FROM swiggy
GROUP BY 2;
```
Total orders: 915
How much have I spent?
```sql
SELECT SUM(CAST(data ->> 'order_total_with_tip' AS DOUBLE PRECISION)),
       EXTRACT('year' FROM DATE(data ->> 'order_time')::TIMESTAMP)
FROM swiggy
GROUP BY 2;
```
Total spent: ₹1,84,329
That seems like a lot of money. Enough to buy a decent second-hand car, at least.
The Pareto Principle seems to apply very well here. Most of my orders are from a cafe called Chai Shai. I love ordering from there: it has good snacks and everything is well-packaged.
```sql
SELECT COUNT(*), data ->> 'restaurant_name'
FROM swiggy
GROUP BY 2
ORDER BY 1 DESC;
```
It seems I order more during the morning and afternoon than in the evening. That can be explained by the fact that I often don't feel that hungry in the evening and I prefer something light for dinner.
```sql
SELECT COUNT(*),
       EXTRACT('hour' FROM TO_TIMESTAMP(data ->> 'order_time', 'YYYY-MM-DD hh24:mi:ss')::TIMESTAMP)
FROM swiggy
GROUP BY 2;
```
It seems I'm fonder of ordering a quick bite or dessert than a full meal. Desserts are something I should definitely cut down on.
```sql
SELECT cuisine, COUNT(*)
FROM (
    SELECT JSONB_ARRAY_ELEMENTS_TEXT(data -> 'restaurant_cuisine') AS cuisine
    FROM swiggy
) x
GROUP BY 1
ORDER BY 2 DESC;
```
It was an interesting exercise to understand my habits over the years. There are tons of ways I could dig deeper into the data, but for now, I will stop here. Insight fatigue isn't great either.
I don't think ordering in is necessarily bad, but it is lazy. Knowing the scale of my habit, I know a little conscious effort to order less would help.