DEV Community

Discussion on: Hacktoberfest 2020 — Who's looking for contributors?

YaelRiv • Edited

lakeFS is a data lake management platform written in Go.
This is our first Hacktoberfest, and we couldn't be more excited.

We created Hacktoberfest issues that new contributors can easily tackle:
github.com/treeverse/lakeFS/labels...

You're also welcome to join our Slack channel for help and more ideas.

treeverse / lakeFS

An open source platform that delivers resilience and manageability to object-storage based data lakes

What is lakeFS

lakeFS is an open source layer that delivers resilience and manageability to object-storage based data lakes.

With lakeFS you can build repeatable, atomic and versioned data lake operations - from complex ETL jobs to data science and analytics.

lakeFS supports AWS S3 or Google Cloud Storage as its underlying storage service. It is API-compatible with S3 and works with modern data frameworks such as Spark, Hive, AWS Athena, and Presto.

For more information, see the official documentation.

Capabilities

Development Environment for Data

  • Experimentation - try tools, upgrade versions and evaluate code changes in isolation.
  • Reproducibility - go back to any point in time and get a consistent version of your data lake.

Continuous Data Integration

  • Ingest new data safely by enforcing best practices - make sure new data sources adhere to your lake’s rules, such as format and schema enforcement and naming conventions.