I recently built a small Instagram profile scraper in Go, packaged it as an Apify Actor, and published it so other people can use it without maintaining the infrastructure themselves.
The goal was simple: fetch public Instagram profile data by username and return clean, automation-friendly JSON. I did not want browser automation, heavy dependencies, or deeply nested output that becomes painful to use in datasets, exports, or pipelines.
The problem
A lot of scraping projects work, but they are hard to operationalize.
They rely on full browser stacks, break on minor changes, or return raw payloads that still need another transformation layer before they become useful. For profile lookups, I wanted something much lighter:
- input: one or more Instagram usernames
- output: structured profile data
- deployment: packaged for Apify
- operations: proxy-ready and resilient to partial failures
The approach
I built the Actor in pure Go with no external dependencies beyond the standard library.
Instead of browser automation, the scraper makes a direct request to Instagram’s web profile endpoint and sends the headers that Instagram expects for that request. That keeps the runtime small and fast, which is a good fit for an Apify Actor.
The Actor accepts either a legacy username field or a usernames array, normalizes the input, strips any leading @, and removes duplicates. That makes it easier to use both manually and from automations.
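A minimal sketch of that normalization step could look like this. The `Input` struct, its JSON field names, and the choice to lowercase usernames are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Input mirrors the two accepted shapes: a legacy single field and an array.
// The JSON field names are assumed from the description above.
type Input struct {
	Username  string   `json:"username"`
	Usernames []string `json:"usernames"`
}

// normalizeUsernames merges both fields, trims whitespace, strips one
// leading "@", lowercases (an assumption), and drops empties and duplicates
// while preserving first-seen order.
func normalizeUsernames(in Input) []string {
	candidates := make([]string, 0, len(in.Usernames)+1)
	if in.Username != "" {
		candidates = append(candidates, in.Username)
	}
	candidates = append(candidates, in.Usernames...)

	seen := make(map[string]struct{}, len(candidates))
	out := make([]string, 0, len(candidates))
	for _, raw := range candidates {
		name := strings.ToLower(strings.TrimPrefix(strings.TrimSpace(raw), "@"))
		if name == "" {
			continue
		}
		if _, dup := seen[name]; dup {
			continue
		}
		seen[name] = struct{}{}
		out = append(out, name)
	}
	return out
}

func main() {
	in := Input{
		Username:  "@nasa",
		Usernames: []string{"NASA", " natgeo ", "@natgeo", ""},
	}
	fmt.Println(normalizeUsernames(in)) // prints [nasa natgeo]
}
```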
What the scraper returns
The Actor extracts and normalizes the most useful profile fields, including:
- username and internal Instagram ID
- full name and biography
- follower, following, and post counts
- profile picture URLs
- private, verified, business, and professional flags
- related profiles
- latest posts
The latestPosts section is where I spent more time than expected. I did not want to return only a shortcode and a caption. I wanted each post to be immediately useful, so I included things like:
- caption text
- hashtags and mentions parsed from the caption
- like and comment counts
- dimensions
- image URLs
- tagged users
- child posts for carousel content
- normalized timestamps
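Two of the items above, parsing hashtags and mentions from the caption and normalizing timestamps, can be sketched with the standard library alone. The regular expressions are simplified guesses at what counts as a tag, and the example assumes the raw timestamp arrives as Unix seconds; neither detail comes from the Actor itself.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

var (
	// Simplified patterns: letters, digits, underscore for hashtags;
	// ASCII letters, digits, underscore, dot for mentions.
	hashtagRe = regexp.MustCompile(`#([\p{L}\p{N}_]+)`)
	mentionRe = regexp.MustCompile(`@([A-Za-z0-9_.]+)`)
)

// extractTags returns the unique, lowercased captures of re in caption,
// in order of first appearance. Trailing dots are trimmed so a mention at
// the end of a sentence ("@nasa.") still yields "nasa".
func extractTags(re *regexp.Regexp, caption string) []string {
	seen := map[string]struct{}{}
	var out []string
	for _, m := range re.FindAllStringSubmatch(caption, -1) {
		tag := strings.ToLower(strings.TrimRight(m[1], "."))
		if tag == "" {
			continue
		}
		if _, dup := seen[tag]; dup {
			continue
		}
		seen[tag] = struct{}{}
		out = append(out, tag)
	}
	return out
}

// normalizeTimestamp converts Unix seconds (assumed input format) to an
// RFC 3339 string in UTC.
func normalizeTimestamp(unixSeconds int64) string {
	return time.Unix(unixSeconds, 0).UTC().Format(time.RFC3339)
}

func main() {
	caption := "Launch day! #Space #space @nasa. cc @Nat.Geo"
	fmt.Println(extractTags(hashtagRe, caption)) // prints [space]
	fmt.Println(extractTags(mentionRe, caption)) // prints [nasa nat.geo]
	fmt.Println(normalizeTimestamp(0))           // prints 1970-01-01T00:00:00Z
}
```

Note that a pattern this simple would also treat the domain part of an email address as a mention, so a production version would likely need stricter boundaries.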
That way, the Actor output is already useful for lead generation, competitor monitoring, influencer research, and internal dashboards.
Making it practical for Apify
Building the scraper itself was only half the task. The other half was productizing it.
I added:
- an Apify input schema for usernames
- a dataset schema for cleaner output browsing
- a Docker build so the Actor can run consistently
- dataset push logic so each profile is saved directly to the Apify dataset
- proxy support for more reliable requests at scale
One implementation detail I care about is failure handling. If one username is invalid or unavailable, the whole run should not fail. The Actor skips missing profiles and continues processing the rest. It only fails the run on actual technical errors such as network or dataset write failures.
That matters in production much more than it seems during local development.
What I learned
A few lessons stood out while building this:
First, scraping is only part of the value. Data shape matters just as much. A flat, predictable output is more valuable than a huge raw JSON blob.
Second, operational details matter early. Timeouts, proxy support, and partial-failure handling are not “later” concerns if you want to publish a usable product.
Third, packaging changes how you think. Once I decided to publish the scraper on Apify, I had to think less like a developer running a script and more like someone maintaining a small API product.
Final result
The result is a lightweight Instagram Profile Scraper Actor in Go that can fetch one or many public profiles and return structured output ready for datasets and automations.
If you want to try it without building your own pipeline, you can check it out here:
- Link
If you are building scraping tools yourself, my main advice is this: optimize for usable output, not just successful requests. That is usually what makes the difference between a side script and a product.



