How Perfection Prevents Progress: A Lesson From My Thesis
How I fell into a classic developer's trap, and how you can avoid making the same mistake

Ever spent weeks building a system for a problem that could have been solved in a few hours?
That's exactly the trap I fell into while working on my machine learning thesis at Tetra Pak.
The Over-Engineering Trap
My task seemed straightforward: clean messy data so I could build a prediction model.
But instead of rolling up my sleeves and getting started, I spent weeks trying to architect the perfect system:
- Advanced visualizations
- Complex data classes
- The ultimate data pipeline that could process anything
The result? Almost nothing to show for it.
The One Simple Truth I Discovered
"You can't automate what you haven't first done manually."
As developers, we're notorious for spending 2 hours automating a 5-minute task. Without realizing it, I'd spent 3 weeks trying to automate something that I could do manually in 3 hours.
The Breakthrough Moment
Everything changed when I shifted my goal from:
"Create the perfect data cleaning pipeline that works for every dataset"
to:
"Clean just one dataset to 80% of what I would want it to be"
Results? Task completed in one morning. I was stunned.
This simple, focused approach was so easy to iterate on that I could quickly scale it to cover most of my dataset.
The 80/20 Rule for Data Scientists
If you're stuck in data cleaning hell, remember that you don't need:
- A perfect data pipeline
- Classes handling all possible data formats
- Perfectly cooperating functions
All you need is one notebook file that takes one data file from unworkable to acceptable.
It's faster, it actually works, and most importantly, you'll make real progress instead of chasing perfection.
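Here's a minimal sketch of what "one notebook, one data file" could look like in practice. The column names and sample rows are invented for illustration (they are not the thesis data), and the code sticks to Python's standard library so it runs anywhere; in a real notebook, pandas would be the more common choice.

```python
import csv
import io

# Hypothetical raw export: inconsistent casing, stray whitespace,
# missing values, and a duplicated row. Illustrative data only.
RAW = """sensor_id, temperature ,status
A1, 21.5 ,OK
a1, 21.5 ,ok
B2,,OK
C3, n/a ,FAIL
D4, 19.0 ,ok
"""


def clean_rows(text):
    """Take one raw CSV string to an 'acceptable' list of dicts."""
    reader = csv.reader(io.StringIO(text))
    # Normalize header names: trim whitespace and lowercase.
    header = [name.strip().lower() for name in next(reader)]

    cleaned, seen = [], set()
    for raw in reader:
        if not raw:
            continue  # skip blank lines
        row = dict(zip(header, (cell.strip() for cell in raw)))

        # Make the text columns consistent.
        row["sensor_id"] = row["sensor_id"].upper()
        row["status"] = row["status"].upper()

        # Drop rows without a usable numeric reading.
        try:
            row["temperature"] = float(row["temperature"])
        except ValueError:
            continue

        # Drop exact duplicates (after normalization).
        key = (row["sensor_id"], row["temperature"], row["status"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Nothing here is reusable or general, and that's the point: every rule is hard-coded for this one file. Once it works, the parts worth turning into functions tend to become obvious.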