
Vladimir Klepov


I built Hippotable for in-browser data analysis

I'm happy to announce the first public release of Hippotable — a tool that lets you analyze data without leaving your browser, on desktop & mobile.

I often analyze small- to mid-sized datasets for work and for fun — e.g. to find out the distribution of a certain bug by platform, or calculate unique affected users. But what tools do I have to help me here?

  1. Bash lets you sort | uniq | wc -l, which is handy, but building advanced pipelines is hard.
  2. Google Sheets does the job, but struggles above 10K rows due to all the cruft, and using it for sensitive data such as personal budgets or user data is a no-no.
  3. Python + Jupyter + pandas is up to any data problem, but it's overkill for my simple use cases, and requires a lot of code.

So I set out to build a simple browser-based tool to do the job. Hippotable can:

  • Open CSV files up to 100 MB in size.
  • Scroll through thousands of rows.
  • Filter and sort your data in real time.
  • Aggregate / groupby data to gain deeper insights.
  • 🏗️ Build powerful data pipelines with multiple filter / aggregate steps.
  • Share results with CSV export.

It's also free and open source.

Example

Now, let me walk you through an example of analyzing an annotated movie dataset from Kaggle. Let's start simple and see which countries, on average, make the best movies. Group by country, sort by average rating:

[Screenshot: movies grouped by country, sorted by average rating]
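
For readers who want to see the same operation spelled out, here is a minimal sketch of "group by country, sort by average rating" in plain Python. It is not how Hippotable implements it; the column names (`country`, `rating`) and the sample rows are made up for illustration and will differ from the actual Kaggle dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real dataset has its own columns and values.
movies = [
    {"title": "A", "country": "France", "rating": 7.0},
    {"title": "B", "country": "France", "rating": 8.0},
    {"title": "C", "country": "Italy", "rating": 5.0},
    {"title": "D", "country": "Japan", "rating": 9.0},
]


def average_rating_by_country(rows):
    """Group rows by country; return (country, mean rating, movie count), best first."""
    ratings = defaultdict(list)
    for row in rows:
        ratings[row["country"]].append(row["rating"])
    summary = [(country, mean(values), len(values)) for country, values in ratings.items()]
    return sorted(summary, key=lambda item: item[1], reverse=True)


ranking = average_rating_by_country(movies)
for country, avg, count in ranking:
    print(f"{country}: {avg:.2f} ({count} movies)")
```

In this toy data a country with a single well-rated film lands at the top, which is the same effect visible in the real result.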

Hm, this looks like a selection of countries which happened to co-produce a decent film once, not that interesting. Let's try again, removing countries that have <10 movies:

[Screenshot: the same ranking, limited to countries with at least 10 movies]
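
The extra step is a filter on the group size, applied before sorting. A rough Python sketch of the idea, again with hypothetical column names and synthetic rows rather than the real dataset:

```python
from collections import defaultdict
from statistics import mean


def rank_countries(rows, min_movies=10):
    """Average rating per country, keeping only countries with at least min_movies films."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["country"]].append(row["rating"])
    # Drop small groups first, so one-off co-productions cannot top the ranking.
    kept = {country: values for country, values in groups.items() if len(values) >= min_movies}
    averages = [(country, mean(values)) for country, values in kept.items()]
    return sorted(averages, key=lambda item: item[1], reverse=True)


# Synthetic data: one excellent Japanese film, ten French films, twelve Italian films.
rows = [{"country": "Japan", "rating": 9.0}]
rows += [{"country": "France", "rating": 7.0} for _ in range(10)]
rows += [{"country": "Italy", "rating": 5.0} for _ in range(12)]

result = rank_countries(rows)
print(result)
```

The threshold of 10 comes from the text above; everything else in the sketch is an assumption.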

Now that's unexpected! In case you're curious, lots and lots of bad films come from Italy:

[Screenshot: low-rated movies from Italy]

Combining multiple filter and aggregation layers enables really powerful processing pipelines. For example, here are countries that were home to most great directors (see, not all is lost for Italy):

[Screenshot: countries ranked by number of great directors]
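
To make the layering concrete, here is one way such a pipeline could look in plain Python: aggregate per director, filter, then aggregate again per country. The post does not say how a "great director" is defined, so the thresholds below (average rating of at least 8.0 over at least 2 movies) are purely assumptions, as are the column names and the sample rows.

```python
from collections import Counter, defaultdict
from statistics import mean


def great_directors_by_country(rows, min_avg=8.0, min_movies=2):
    """Count directors per country whose films meet the (assumed) 'great' thresholds."""
    # Step 1: group movies by director.
    by_director = defaultdict(list)
    for row in rows:
        by_director[row["director"]].append(row)

    # Step 2: keep directors with enough films and a high enough average rating.
    home_country = {}
    for director, films in by_director.items():
        if len(films) >= min_movies and mean(f["rating"] for f in films) >= min_avg:
            # Assumption: a director's country is the most common country of their films.
            home_country[director] = Counter(f["country"] for f in films).most_common(1)[0][0]

    # Step 3: group the remaining directors by country, largest count first.
    return Counter(home_country.values()).most_common()


# Hypothetical rows for illustration only.
rows = [
    {"director": "Rossi", "country": "Italy", "rating": 8.5},
    {"director": "Rossi", "country": "Italy", "rating": 8.1},
    {"director": "Bianchi", "country": "Italy", "rating": 9.0},
    {"director": "Bianchi", "country": "Italy", "rating": 8.0},
    {"director": "Dupont", "country": "France", "rating": 8.2},
    {"director": "Dupont", "country": "France", "rating": 8.4},
    {"director": "Smith", "country": "USA", "rating": 4.0},
    {"director": "Smith", "country": "USA", "rating": 9.0},
    {"director": "Tanaka", "country": "Japan", "rating": 9.5},
]

result = great_directors_by_country(rows)
print(result)
```

Each step consumes the output of the previous one, which is the same shape as stacking filter and aggregate steps in the tool.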


That's it for today! Give Hippotable a try and star it on GitHub to help spread the word. Join me next time to learn about the amazing tech I used to make this happen.
