StockSimPy is a lightweight Python library for simple stock backtesting. The goal is to understand Pandas, experiment with stock strategies better, and create an easy-to-use alternative to more complex backtesting tools. This is part 3 of the series where I build this library in public.
After finishing the basic indicator calculation functions, I needed a way to keep track of all the stock information in an organised, reusable format. That’s where StockData comes in: it acts as a container for everything you’ll need in backtesting or simulation.
I initially thought it would be easy to code, since it only needed to hold the data and offer some simple import and export. I was quite wrong. It turns out working with data can be messy.
Data Validation
When importing stock data, you can’t assume the columns are always consistent. Strategies require the use of different features, but some fields are essential:
The tricky part, though, is naming conventions. What do I mean?
Let's take “Open” as an example; it could show up as “OPEN”, “open”, “OpeN”, “open_price”, “OpenPrice”, “openPrice”, and many other wild naming styles.
Lowercasing handles some cases, but what about the ones with “price” in the name? Then I thought — I could easily search for the substring “open” in the whole word. This covers all the cases I mentioned above, but if open is named something else entirely, it wouldn’t work.
A more comprehensive approach might be to create a full-blown synonym-matching system. But that might be overkill for now. Still, I might add it as a feature in the future if somebody requests it.
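Here is a minimal sketch of the lowercase-and-substring idea described above. It is not the library's actual code: the function name normalize_columns and the list of required fields (I am assuming the usual open, high, low, close, volume set) are illustrative only.

```python
# Assumed set of essential fields; the real library may require a different list.
REQUIRED = ("open", "high", "low", "close", "volume")


def normalize_columns(columns, required=REQUIRED):
    """Map original column names to canonical lowercase names.

    A column matches a required field if the lowercased column name
    contains the field name as a substring.
    """
    mapping = {}
    for field in required:
        matches = [col for col in columns if field in str(col).lower()]
        if not matches:
            raise ValueError(f"Missing required column: {field!r}")
        # If several columns match, the first one wins. A stricter version
        # could raise here instead of guessing.
        mapping[matches[0]] = field
    return mapping
```

With a DataFrame, the returned mapping can be passed to df.rename(columns=mapping). As noted above, this still fails when a field is named something else entirely, and a loose substring match can pick the wrong column when two names both contain the keyword.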
Data Import
The most important feature of StockData is importing data; without that, it’s just an empty shell.
I was quite skeptical about creating these import functions at first. I considered leaving import up to the user (just pass in a Pandas DataFrame), but having built-in loaders felt more convenient. So far, StockData supports imports from:
- SQLite
- CSV
- Excel
- Pandas DataFrame
- Python dictionary
- JSON
(This process felt quite repetitive as I was just using built-in pandas functions or just straight-up copying documentation.)
To simplify things, I added an auto_loader() function that picks the correct import based on the file extension of the source parameter. I used **kwargs so users can pass in additional parameters.
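The extension-based dispatch can be sketched roughly like this. This is my assumption of how such a loader might look, not the code in the library: the _READERS table and the exact signature are hypothetical, and SQLite is left out because it needs a connection and a query rather than just a path.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to a pandas reader function.
_READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xlsx": pd.read_excel,
    ".xls": pd.read_excel,
}


def auto_loader(source, **kwargs):
    """Pick a pandas reader from the extension of `source`.

    Any extra keyword arguments are forwarded to the chosen reader.
    """
    suffix = Path(source).suffix.lower()
    try:
        reader = _READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None
    return reader(source, **kwargs)
```

For example, auto_loader("prices.csv", parse_dates=["Date"]) would forward parse_dates to pd.read_csv.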
On top of that, StockData integrates directly with yfinance (optional dependency). This allows fetching live stock data for a given ticker and date range, making it much more practical.
For testing purposes, there’s also a generate_mock_data() function. It isn’t designed for real backtesting but is useful for experimenting with new features.
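To give an idea of what a mock-data generator can look like, here is a small random-walk sketch. The parameters (days, start_price, seed), the column names and the return distribution are all assumptions of mine for illustration; the library's own generate_mock_data() may work differently.

```python
import numpy as np
import pandas as pd


def generate_mock_data(days=100, start_price=100.0, seed=None):
    """Build a fake OHLCV DataFrame from a simple random walk."""
    rng = np.random.default_rng(seed)
    # Daily returns drawn from a normal distribution.
    returns = rng.normal(loc=0.0005, scale=0.02, size=days)
    close = start_price * np.cumprod(1 + returns)
    # Each day opens at the previous close; the first day opens at start_price.
    open_ = np.concatenate(([start_price], close[:-1]))
    # High and low stretch slightly beyond the open/close range.
    spread = rng.uniform(0.0, 0.01, size=days)
    high = np.maximum(open_, close) * (1 + spread)
    low = np.minimum(open_, close) * (1 - spread)
    volume = rng.integers(100_000, 1_000_000, size=days)
    index = pd.bdate_range("2024-01-01", periods=days, name="Date")
    return pd.DataFrame(
        {"Open": open_, "High": high, "Low": low, "Close": close, "Volume": volume},
        index=index,
    )
```

Passing a fixed seed makes the output reproducible, which is handy when a test depends on the generated prices.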
Data Export
Here is a question: why export data you already imported? Two reasons:
- Users might want to inspect or clean their data after transformations.
- I will soon integrate the indicator functions from earlier posts with StockData, so exporting results will be handy.
Export currently supports all the same formats mentioned in import, plus SQL. There is also a flexible to_custom() function that lets you define your own export method.
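One way a custom export hook like this can work is to accept any callable and hand it the data. The sketch below assumes exactly that; the class name StockDataSketch, the df attribute and the to_custom signature are hypothetical stand-ins, not the library's real API.

```python
import pandas as pd


class StockDataSketch:
    """Hypothetical stand-in for StockData, holding a single DataFrame."""

    def __init__(self, df):
        self.df = df

    def to_custom(self, exporter, **kwargs):
        """Pass a copy of the data to a user-supplied callable.

        The copy keeps the exporter from mutating the stored data.
        Whatever the exporter returns is returned to the caller.
        """
        return exporter(self.df.copy(), **kwargs)


def to_tsv_string(df, float_format="%.2f"):
    """Example exporter: render the data as tab-separated text."""
    return df.to_csv(sep="\t", float_format=float_format)
```

A call would then look like StockDataSketch(df).to_custom(to_tsv_string, float_format="%.4f"), with the keyword arguments forwarded to the exporter.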
The twist is that this step turned out to be more about data flexibility than about "storing data." With StockData in place, stocksimpy now has a solid foundation for testing.
If you want to use this library in the future, or have any ideas that I could add, go for it. Ask me in the comments or connect with me on socials. I want to make this project something useful.
Follow the rest of the series, watch me build in public.
- 🌟Star this on GitHub
- 🧠 Follow me on Twitter / X
- 🔷 I’m now on Bluesky
- 📰 Or read more of my posts here on Medium
- 💬 Let’s connect on LinkedIn
Thanks for reading.