Anyone building a cooking, meal planning, or grocery app hits the same wall fast: people do not save recipes as clean data. They save them as pictures. A screenshot of a Pinterest pin. A frame grabbed from a TikTok. A photo of a page in a cookbook. A wall of text pasted into a note.
If your app needs ingredients, servings, and steps as real fields, that messy input is a genuine problem. Scraping a web page only works when there is a web page, and most recipes shared on social apps never have one. So you end up writing brittle parsers, bolting on OCR, and babysitting edge cases forever.
This is the exact gap I built an API to close.
The idea
Send a recipe in any common form (a photo, a screenshot, a web link, or pasted text) and get back clean structured JSON. Every ingredient comes split into a name, a quantity, and a unit, with the unit normalized to a fixed vocabulary. The category, the serving count, and the instructions come back too.
The piece most tools skip is the picture. Reading a recipe out of an image needs vision, not HTML parsing. That part is built in here, so you never have to stand up your own OCR pipeline or stitch together a vision model yourself.
One endpoint, three inputs
You call a single endpoint and send one of three body shapes.
From pasted text:
{
  "type": "text",
  "content": "Garlic Butter Shrimp. 1 lb shrimp, 3 tbsp butter, 4 cloves garlic, 1/2 tsp salt. Cook 5 minutes."
}
From a web link:
{
  "type": "url",
  "content": "https://www.example.com/garlic-butter-shrimp"
}
From a screenshot or photo, pass the image as base64 (you can send several photos of the same recipe and they get combined):
{
  "type": "image",
  "images": [
    { "base64": "<base64 image data>", "mimeType": "image/jpeg" }
  ]
}
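If you would rather not hand-write those three shapes, a few small builders keep them in one place. This is only a sketch: the function names `textBody`, `urlBody`, and `imageBody` are my own illustration, not part of the API, and the image bytes are assumed to already be in memory (from a file read or an upload).

```javascript
// Hypothetical helpers: these names are illustrative, not part of the API.

// Body for pasted text.
function textBody(content) {
  return { type: "text", content };
}

// Body for a web link.
function urlBody(url) {
  return { type: "url", content: url };
}

// Body for one or more photos of the same recipe.
// Each photo is { bytes: Buffer | Uint8Array, mimeType: string }.
function imageBody(photos) {
  return {
    type: "image",
    images: photos.map(({ bytes, mimeType }) => ({
      // Buffer is a Node global; it encodes the raw bytes as base64.
      base64: Buffer.from(bytes).toString("base64"),
      mimeType,
    })),
  };
}

// Usage: placeholder bytes stand in for a real JPEG here.
const body = imageBody([
  { bytes: Buffer.from("placeholder bytes"), mimeType: "image/jpeg" },
]);
console.log(JSON.stringify(body, null, 2));
```

Whatever the builder returns goes through `JSON.stringify` as the request body, exactly like the hand-written objects above.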
A full call
Here is a request in Node 18 or later, where fetch is built in. Run it as an ES module so the top-level await works. Grab your key and the exact host string from the API page on RapidAPI, then drop them in.
// Replace the key and host with the values shown on the API page on RapidAPI.
const res = await fetch("https://recipe-extractor-screenshot-photo-and-url-to-json.p.rapidapi.com/extract", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-RapidAPI-Key": "YOUR_RAPIDAPI_KEY",
    "X-RapidAPI-Host": "recipe-extractor-screenshot-photo-and-url-to-json.p.rapidapi.com"
  },
  body: JSON.stringify({
    type: "text",
    content: "Garlic Butter Shrimp. 1 lb shrimp, 3 tbsp butter, 4 cloves garlic, 1/2 tsp salt. Cook 5 minutes."
  })
});

// fetch only rejects on network failure, so check the HTTP status explicitly.
if (!res.ok) {
  throw new Error(`Recipe extraction failed with HTTP ${res.status}`);
}

const data = await res.json();
console.log(data.recipe);
And the response:
{
  "recipe": {
    "name": "Garlic Butter Shrimp",
    "description": "A quick garlic butter shrimp skillet.",
    "category": "dinner",
    "servings": 2,
    "ingredients": [
      { "name": "shrimp", "quantity": "1", "unit": "lb" },
      { "name": "butter", "quantity": "3", "unit": "tbsp" },
      { "name": "garlic", "quantity": "4", "unit": "clove" },
      { "name": "salt", "quantity": "0.5", "unit": "tsp" }
    ],
    "instructions": "Cook for 5 minutes."
  }
}
Notice the ingredients arrive already broken into fields, ready to drop straight into a database or a shopping cart. No second parsing pass, no regex zoo.
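To make "drop straight into a database" concrete, here is a minimal sketch that flattens one recipe into rows for an ingredients table. The helper name `toIngredientRows` and the row shape are hypothetical. I am also assuming quantities arrive as decimal strings like `"0.5"`, as in the response above; anything that does not parse cleanly is stored as `null` rather than guessed.

```javascript
// Hypothetical helper: turns one extracted recipe into flat rows.
function toIngredientRows(recipe, recipeId) {
  return recipe.ingredients.map((ing, position) => {
    // Assumption: quantity is a decimal string such as "0.5".
    // Blank or non-string values would coerce to 0, so reject them first.
    const isText = typeof ing.quantity === "string" && ing.quantity.trim() !== "";
    const amount = isText ? Number(ing.quantity) : NaN;
    return {
      recipeId,
      position, // keeps the original ingredient order
      name: ing.name,
      quantity: Number.isFinite(amount) ? amount : null,
      unit: ing.unit ?? null,
    };
  });
}

// Usage with the ingredients from the response above.
const sample = {
  ingredients: [
    { name: "shrimp", quantity: "1", unit: "lb" },
    { name: "butter", quantity: "3", unit: "tbsp" },
    { name: "garlic", quantity: "4", unit: "clove" },
    { name: "salt", quantity: "0.5", unit: "tsp" },
  ],
};
console.log(toIngredientRows(sample, 42));
```

Converting the quantity to a number once, at import time, means the rest of your app never has to touch the string form.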
Where this fits
Meal planning and grocery apps can let users import a recipe by photo or link, then build a shopping list automatically from the structured ingredients. Nutrition trackers can read the ingredient list to estimate macros. Recipe organizers can digitize screenshots and cookbook photos into a searchable library. And if you are feeding a model, clean structured fields are far easier to work with than raw HTML.
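The shopping list case is short enough to sketch. The function name `buildShoppingList` is hypothetical, and the approach rests on a few assumptions: ingredients are merged only when both name and unit match (no unit conversion), names are compared case-insensitively, and quantities that are not numeric are left out of the sum while the item still appears on the list.

```javascript
// Hypothetical helper: merges ingredients from several extracted recipes.
function buildShoppingList(recipes) {
  const totals = new Map();
  for (const recipe of recipes) {
    for (const { name, quantity, unit } of recipe.ingredients) {
      const item = name.toLowerCase();
      // Assumption: same name and same unit means the amounts can be added.
      const key = `${item}|${unit ?? ""}`;
      const entry = totals.get(key) ?? { name: item, unit: unit ?? null, quantity: 0 };
      const amount = Number(quantity);
      // Non-numeric quantities are skipped, but the item stays on the list.
      if (Number.isFinite(amount)) entry.quantity += amount;
      totals.set(key, entry);
    }
  }
  return [...totals.values()];
}

// Usage: two recipes that both call for butter in tablespoons.
const list = buildShoppingList([
  { ingredients: [{ name: "butter", quantity: "3", unit: "tbsp" }] },
  { ingredients: [{ name: "Butter", quantity: "2", unit: "tbsp" }] },
]);
console.log(list);
```

Grouping on the unit is only safe because the units come back normalized to a fixed vocabulary. With free-form units you would need a conversion step first.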
Try it
The API lives on RapidAPI with a free tier so you can test it in a minute. Find it here:
If you build something with it, I would love to hear what you made.
Top comments (1)
Recipe extraction is a deceptively good benchmark for structured output because the source fights back: fractions, ranges, 'a pinch', ingredients that only appear inside instruction steps. The schema carries most of the quality here. Loose fields like quantity-as-string push the ambiguity downstream to every consumer; strict fields force resolution once, at extraction time, while the surrounding context still exists to resolve it with. Same tradeoff as classic ETL, just with a model in the middle.