Daniel Githinji

The Unspoken Truths of Dev Work: How I Wrestled My Data Science Project onto GitHub (and learned some secrets)

Every developer knows the drill: write the code, solve the problem, push to GitHub. Simple, right? Not always. As I tackled the Week 3 homework for the DataTalks.Club scholarship, focused on lead scoring with a classic ML workflow, my journey from Google Colab to a clean GitHub commit turned into a mini-saga of real-world dev friction.

This isn't just about getting the right answers (though I got those too, for Q1-Q6 on mutual information, feature elimination, and regularization). It's about how the actual process of deployment and version control can sometimes teach you more than the algorithms themselves.

The Setup: Colab Comfort & GitHub Ambition
My environment of choice was Google Colab—fast, free, and great for quick iterations on ML tasks. My goal: complete the homework, export the notebook, and push it to my GitHub repo (dgithinjibit/dataTalksClubWeek3). Sounds straightforward for a developer building a platform like SyncSenta, right?

The plan was clear:

Code the ML solution in Colab.

git clone the empty repo in Colab.

git add, git commit, git push.

Friction Point #1: The Phantom Branch - src refspec main does not match any
With the initial code written, I hit my first snag. I had run git init (unnecessary when you clone an existing repo, since the clone is already a repository) and then tried to push before committing anything. Git answered with the infamous:

error: src refspec main does not match any
error: failed to push some refs to 'https://github.com/dgithinjibit/dataTalksClubWeek3.git'
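The Lesson: This error is Git's polite way of saying, "Hey, you're trying to push a main branch, but your local repository doesn't have a main branch with any commits yet!" A branch only comes into existence with its first commit, so a freshly initialised repo has nothing to push. The same message appears if the local branch has a different name, such as master.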

The Fix: I ran git clone on the (empty) remote repo, moved my notebook into the cloned folder with mv, then ran git add and the crucial first git commit. That commit creates the local main branch. Always commit before you push, especially the first time on a new branch.
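One way to catch this state before pushing is to ask Git whether HEAD resolves to a commit at all. The following is a minimal sketch, assuming git is available on the PATH; the helper names has_commits and current_branch are hypothetical and only wrap standard git commands.

```python
import subprocess


def has_commits(repo_dir: str) -> bool:
    """Return True if HEAD in repo_dir resolves to a commit."""
    # `git rev-parse --verify --quiet HEAD` exits non-zero when the current
    # branch has no commits yet (fresh `git init` or a clone of an empty repo).
    result = subprocess.run(
        ["git", "rev-parse", "--verify", "--quiet", "HEAD"],
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def current_branch(repo_dir: str) -> str:
    """Return the branch name HEAD points at, even before the first commit."""
    result = subprocess.run(
        ["git", "symbolic-ref", "--short", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

If has_commits returns False, or current_branch returns something other than main, a git push origin main is likely to fail with the refspec error shown above.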

Friction Point #2: The Elusive File - mv: cannot stat '../datatalksweek3.ipynb': No such file or directory
Even after getting the Git commands right, the Colab file system threw a curveball. My mv command to move the notebook into the repo folder failed:

mv: cannot stat '../datatalksweek3.ipynb': No such file or directory
The Lesson: The notebook you edit in Colab is not automatically a file on the runtime's disk. A notebook opened from or saved to Google Drive lives in Drive, not in the /content/ directory, and the Colab runtime can't see it until you explicitly mount Drive.

The Fix: Mounting Google Drive (from google.colab import drive; drive.mount(...)) and using the absolute path (/content/drive/MyDrive/Colab Notebooks/datatalksweek3.ipynb) solved it. Because that path contains a space, it has to be quoted in the shell command. It's a reminder that even in "cloud" environments, file paths are king.
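When the exact location is uncertain, a short search can confirm where the file actually sits before any mv is attempted. This is a rough sketch using only the standard library; find_notebook is a hypothetical helper, and searching a large Drive recursively may be slow.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_notebook(name: str, roots: Iterable[str]) -> Optional[Path]:
    """Return the first file called `name` found under any of `roots`, else None."""
    for root in roots:
        base = Path(root)
        if not base.is_dir():
            # For example, Drive is not mounted, so the directory does not exist.
            continue
        for candidate in sorted(base.rglob(name)):
            if candidate.is_file():
                return candidate
    return None
```

In Colab this might be called as find_notebook("datatalksweek3.ipynb", ["/content/drive/MyDrive", "/content"]) once Drive is mounted. A return value of None suggests the mount step was skipped or the filename differs.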

Friction Point #3: The Secret Sneak - GitHub Push Protection Strikes!
Just when I thought I was home free, GitHub's automated security stepped in:

remote: error: GH013: Repository rule violations found for refs/heads/main.
remote: - GITHUB PUSH PROTECTION
remote: —————————————————————————————————————————
remote: - Push cannot contain secrets
remote: —— GitHub Personal Access Token ——————————————————————
remote: locations: datatalksweek3.ipynb:1
The Lesson: My Personal Access Token (PAT), which I used to authenticate Git operations in Colab, had been saved inside the notebook, both in the cell where I used it and in that cell's stored output. GitHub's Push Protection caught it. This is critical: never commit secrets to a repository, least of all a public one. It's a fundamental security principle.

The Fix (The "Clean Code" Way):

Downloaded the notebook locally. This was the fastest way to get full control.

Manually edited the .ipynb file in VS Code, deleting the entire cell where the PAT was used and, crucially, where its output was stored. No git filter-repo gymnastics needed.

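Used git reset HEAD^ --hard to undo the bad commit locally. (This needs an earlier commit to reset to, and --hard also resets the working tree, so the cleaned notebook has to be copied back in afterwards.)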

git add, git commit the clean notebook.

git push origin main (no force needed this time, as the remote was still empty).
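I cleaned the file by hand, but the same idea can be scripted. The sketch below is an alternative to the manual edit, not what I actually ran: it clears every code cell's stored output and redacts token-like strings from cell sources. The names scrub_notebook and contains_token are hypothetical, and the regular expression is an assumption based on the ghp_ and github_pat_ prefixes GitHub uses for its tokens.

```python
import json
import re

# Assumed token shapes: classic PATs start with "ghp_", fine-grained ones
# with "github_pat_". Treat this pattern as illustrative, not exhaustive.
TOKEN_PATTERN = re.compile(r"(?:ghp_[A-Za-z0-9]{36}|github_pat_[A-Za-z0-9_]{20,})")


def scrub_notebook(nb: dict) -> dict:
    """Return a copy of a notebook dict with outputs cleared and tokens redacted."""
    clean = json.loads(json.dumps(nb))  # deep copy via a JSON round-trip
    for cell in clean.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop stored output
            cell["execution_count"] = None  # reset the run counter
        source = cell.get("source", "")
        if isinstance(source, list):        # source may be a list of lines
            cell["source"] = [TOKEN_PATTERN.sub("<REDACTED>", s) for s in source]
        else:
            cell["source"] = TOKEN_PATTERN.sub("<REDACTED>", source)
    return clean


def contains_token(nb: dict) -> bool:
    """Return True if any token-like string remains anywhere in the notebook."""
    return TOKEN_PATTERN.search(json.dumps(nb)) is not None
```

A notebook would be loaded with json.load, passed through scrub_notebook, checked with contains_token, and written back with json.dump before git add. A scrubbed file does not make an exposed token safe again, so revoking it is still the cautious move.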

The Takeaway: Beyond the Algorithm
This wasn't just homework; it was a crash course in the practicalities of a developer's workflow. From understanding how Git actually builds its history, to navigating cloud file systems, to prioritizing security and clean commit practices—these "frictions" are where the real learning happens.

As I continue building SyncSenta and pushing the boundaries of AI in education, these are the lessons that build resilient systems and a robust development pipeline. It's not just about writing code; it's about owning the entire stack and adapting to every challenge that comes your way.

What are your "tedious" dev lessons that turned into critical skills? Share in the comments!