Managing Supply Chain Data Flow in Side Projects in 3 Simple Steps

#life #supplychain #datamanagement #sideprojects

Introduction: My Dance with Supply Chain Data in My Own Projects

In my own projects, I sometimes run into unexpected needs. Especially when I set up small-scale systems for production, supply, or shipping, managing the data flow correctly becomes as critical as the software itself. When I designed these flows in large corporate ERP systems, I saw firsthand how complex the organizational processes we were digitizing really were. Going that deep in a side project, however, is usually an unnecessary burden. The goal is maximum efficiency with minimum effort: knowing when to say "it's good enough."

In this post, I will share the 3 fundamental steps I apply in my own projects to manage supply chain data flow without overcomplicating it. Compared to large-scale systems, these steps are deliberately pragmatic and direct. My goal is to give you methods you can apply on your own, not ones designed for companies.

Step 1: Identifying Data Sources and Destinations

The first step in any data management process is to clarify what kind of data we are dealing with and where we want it to end up. In my own projects, the data usually lives in a simple Excel spreadsheet or a PostgreSQL database. What matters is knowing where the data comes from and in what format.

When I worked on a production ERP, the basic supply chain flow ran from raw material sourcing through production planning, operator screens, and shipment tracking to invoicing. Each point in this flow had its own data types: raw material stock quantities, lead times for production planning, operational data entered by operators, shipping addresses and delivery statuses, invoice details... Each was a separate area of focus.

ℹ️ Example Scenario: Mini Stock Tracking System

In my own small side projects, for instance, when setting up a simple stock tracking system for a workshop, my data sources would be:

Input Data: Name of newly arrived raw material, quantity, date of arrival, and supplier information. This is usually provided via a web form or manual entry.

Output Data: Name of raw material used in production, quantity, and date of use. This is also generated by operator inputs.

Destination Data: Instantaneous stock levels, alerts for depleted or low materials, tracking of most used materials. This data is presented through reporting screens or notifications.

Supplier Data: Supplier contact information and past order data.

Even in this simple scenario, knowing where the data comes from (input/output) and where it needs to go (stock levels, alerts) directly shapes the database schema and the reporting logic. Without this clarity, data loss or incorrect analyses in later steps become much more likely.
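To make the scenario concrete, here is a minimal Python sketch of how those sources and destinations could map onto a schema. It uses an in-memory SQLite database only so that it runs anywhere; the table names, column names, and sample rows are all hypothetical and chosen for illustration, not taken from a real project.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative.
SCHEMA = """
CREATE TABLE suppliers (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    contact TEXT
);
CREATE TABLE materials (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    min_level REAL NOT NULL DEFAULT 0
);
-- Input and output data share one movements table:
-- positive quantity = receipt, negative quantity = use in production.
CREATE TABLE stock_movements (
    id          INTEGER PRIMARY KEY,
    material_id INTEGER NOT NULL REFERENCES materials(id),
    supplier_id INTEGER REFERENCES suppliers(id),
    quantity    REAL NOT NULL,
    moved_at    TEXT NOT NULL
);
"""


def current_stock(conn):
    """Return {material name: (current quantity, is_low)} derived from movements."""
    rows = conn.execute(
        """
        SELECT m.name, COALESCE(SUM(s.quantity), 0) AS qty, m.min_level
        FROM materials m
        LEFT JOIN stock_movements s ON s.material_id = m.id
        GROUP BY m.id
        """
    ).fetchall()
    return {name: (qty, qty < min_level) for name, qty, min_level in rows}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data: one supplier, one material, one receipt, one usage.
conn.execute("INSERT INTO suppliers (id, name) VALUES (1, 'Example Supplier')")
conn.execute("INSERT INTO materials (id, name, min_level) VALUES (1, 'Screw M3', 100)")
conn.executemany(
    "INSERT INTO stock_movements (material_id, supplier_id, quantity, moved_at) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, 500, "2024-01-05"), (1, None, -420, "2024-01-09")],
)

levels = current_stock(conn)
print(levels)
```

The design choice in this sketch is that the destination data (stock level, low-stock alert) is never stored; it is derived from the input and output records, so there is only one place where numbers can go wrong.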

When identifying data sources, it's important to ask, "Why does this data exist?" If data doesn't have a clear purpose in the workflow, collecting it only brings an unnecessary burden. In my own projects, I usually try to ensure that data directly meets a business need, rather than collecting it just "in case it's needed." This allows me to keep data volume under control and speed up analyses.

Checklist for Data Sources and Destinations

What data enters the system? (e.g., Raw material information, order details, operational logs)
Who or what generates this data? (e.g., Supplier, operator, sensor)
What format does the data arrive in? (e.g., CSV, JSON, manual entry, database record)
What does this data represent? (e.g., Stock quantity, shipment status, production step)
Where should this data go? (e.g., Stock table, shipment tracking screen, reporting dashboard)
What decisions should this data support? (e.g., Placing orders, stopping production, planning shipments)

The answers to these questions form the first step in shaping your data model and infrastructure.
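If it helps, the checklist can be kept as a small structured record per data flow, so unanswered questions are easy to spot. This is only a sketch in Python; the class name `DataFlowEntry`, its fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlowEntry:
    """One row of the checklist; each field answers one question."""
    data: str         # What data enters the system?
    producer: str     # Who or what generates it?
    fmt: str          # What format does it arrive in?
    meaning: str      # What does it represent?
    destination: str  # Where should it go?
    decision: str     # What decision should it support?

    def missing_answers(self):
        """Return the names of the questions that are still unanswered."""
        return [name for name, value in vars(self).items() if not value.strip()]


# Illustrative entry with one question deliberately left open.
receipt = DataFlowEntry(
    data="Raw material receipt",
    producer="Operator via web form",
    fmt="Manual entry",
    meaning="Quantity added to stock",
    destination="Stock table",
    decision="",
)
print(receipt.missing_answers())
```

An entry whose `decision` field stays empty is a good candidate for data that should not be collected at all.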

Step 2: Simplifying and Automating Data Flow

After identifying data sources and destinations, the next step is to plan how the data will reach its destination. In side projects, the goal is a flow that is as simple and automated as possible, rather than a stack of complex integration layers. This usually means minimizing manual steps and keeping data transformations straightforward.

In large corporate projects, data flows are typically managed through ETL (Extract, Transform, Load) tools or complex middleware services. However, in my own projects, setting up and maintaining such tools can be a waste of time. Therefore, I lean towards lighter solutions like direct database queries, simple API calls, or even scheduled scripts.

💡 Simple Automation Example: PostgreSQL and pg_cron

In my own small side projects, if I'm using a PostgreSQL database and need to perform data updates or reporting operations at specific intervals, using a PostgreSQL extension like pg_cron is a great solution. This allows me to schedule SQL queries directly from within the database.

For example, I can easily automate tasks like updating stock levels every night or flagging orders that meet certain criteria using pg_cron. This eliminates the need to set up a separate scheduled task manager (cron job) and allows me to manage everything in one place.

An example pg_cron job:
-- Flag low-stock items every night at 02:00 (pg_cron evaluates schedules in GMT by default)
SELECT cron.schedule(
    '0 2 * * *',
    $$
    UPDATE stok_malzemeleri
    SET durum = 'azaliyor'
    WHERE mevcut_miktar < minimum_stok_seviyesi AND durum <> 'azaliyor';
    $$
);
This type of automation eliminates the need for manual data checks and allows me to detect potential issues (like running out of stock) early on.

Another way to simplify data flow is to standardize data formats. If you are receiving data from different sources, converting them all into a common structure makes the processing much easier. For instance, standardizing all date formats to a single format like YYYY-MM-DD HH:MM:SS ensures that comparisons and sorting are done without errors.
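As a rough illustration of that standardization step, the following Python sketch tries a short list of known input formats and emits YYYY-MM-DD HH:MM:SS. The function name and the list of accepted formats are assumptions for the example; in particular, it assumes day-first ordering for dotted and slashed dates, which you would need to adjust to your own sources.

```python
from datetime import datetime

# Assumed input formats; extend or reorder to match your actual sources.
KNOWN_FORMATS = (
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
    "%d.%m.%Y %H:%M",
    "%d/%m/%Y",
    "%Y-%m-%d",
)


def normalize_timestamp(raw):
    """Convert a date string in any known format to 'YYYY-MM-DD HH:MM:SS'."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # Not this format, try the next one.
    # Fail loudly instead of guessing, so bad rows are noticed early.
    raise ValueError(f"Unrecognized date format: {raw!r}")


print(normalize_timestamp("05.01.2024 14:30"))
print(normalize_timestamp("2024-01-05"))
```

Raising an error for unknown formats is deliberate: a silently misread date is harder to track down later than a rejected row.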

Things to Consider When Automating Data Flow

Identify Repetitive Tasks: Which steps are repeated daily, weekly, or monthly? Aim to automate these steps.
Use Simple Tools: Instead of complex ETL tools, prefer lighter solutions like scripts (Python, Bash), database features (scheduled jobs such as pg_cron, stored procedures), or simple API integrations.
Manage Data Transformations: Bring data from different sources into a common format. This simplifies data cleaning and integration.
Add Error Handling: Plan how errors will be managed in automated processes. Set up mechanisms that report errors (email, logging).
Consider Scalability: If your project grows, can your chosen automation method handle this growth? Keeping it simple initially makes future restructuring easier.
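The error handling point in the list above can be sketched as a thin wrapper around each automated step. This is a minimal Python example under stated assumptions: `run_job` and the `notify` callback are hypothetical names, and the callback stands in for whatever reporting channel you use (email, chat message, and so on).

```python
import logging

logger = logging.getLogger("stock_jobs")


def run_job(name, job, notify=None):
    """Run one automated step; log failures and optionally report them.

    Returns (True, result) on success and (False, None) on failure,
    so a scheduler loop can continue with the remaining jobs.
    """
    try:
        result = job()
    except Exception as exc:
        # logger.exception records the full traceback alongside the message.
        logger.exception("Job %s failed", name)
        if notify is not None:
            notify(f"Job {name} failed: {exc}")
        return False, None
    logger.info("Job %s finished", name)
    return True, result


# Usage: collect notifications in a list instead of sending real emails.
sent = []
ok, value = run_job("nightly-stock-check", lambda: 1 / 0, notify=sent.append)
print(ok, sent)
```

Returning a status instead of re-raising keeps one failing job from stopping the others, while the log and the notification make sure the failure is still visible.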

Step 3: Visualizing and Analyzing Data

After establishing and automating the data flow, it's necessary to visualize and analyze this data to make it meaningful. In my own side projects, this is usually done through simple dashboards or regular reports. The goal is to quickly see the trends, problems, and opportunities that the data reveals.

In large ERP systems, dedicated BI (Business Intelligence) tools are typically used for this task. In side projects, however, the licensing costs and complexity of such tools push me toward more accessible options. Open-source tools that work with PostgreSQL, or even simple charting libraries, are usually enough.

⚠️ Data Visualization with PostgreSQL: Nginx and Grafana

For projects hosted on my own server (VPS), I usually meet my data visualization needs with Grafana. Grafana can connect to many different data sources, including PostgreSQL, and allows for the creation of very effective dashboards. If my project has a web interface, I can securely access Grafana by using Nginx as a reverse proxy.

For example, to follow the stock level of a raw material used in production over time, I can use a query like this in Grafana:
-- Show quantity and minimum level over time from the stock movements table
SELECT
    zaman AS "time",
    malzeme_adi AS "Material Name",
    mevcut_miktar AS "Current Quantity",
    minimum_stok_seviyesi AS "Minimum Level"
FROM stok_hareketleri
WHERE malzeme_adi = 'Screw M3'
ORDER BY zaman DESC
LIMIT 100;
When this query is visualized as a line graph in Grafana, it allows me to instantly see the material's change over time and whether it's approaching the minimum level. This type of visualization is much faster and more effective than manually reviewing reports.

In the analysis part, it's not enough to just answer "what happened?"; we also need to ask "why did it happen?". Understanding the reasons behind the data is critical for preventing future problems or capitalizing on opportunities. In my own projects, I usually try to detect anomalies or specific patterns by monitoring the data's change over time.
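One simple way to start looking for such anomalies is to flag large jumps between consecutive readings. The Python sketch below is only one possible heuristic, not a full anomaly detection method; the function name, the 30% threshold, and the sample readings are all assumptions for illustration.

```python
def sudden_changes(readings, threshold=0.3):
    """Flag readings whose relative change from the previous one exceeds threshold.

    readings: list of (timestamp, quantity) tuples sorted by time.
    Returns a list of (timestamp, previous quantity, current quantity).
    """
    flagged = []
    for (_, prev), (ts, curr) in zip(readings, readings[1:]):
        if prev == 0:
            continue  # Relative change is undefined when starting from zero.
        if abs(curr - prev) / abs(prev) > threshold:
            flagged.append((ts, prev, curr))
    return flagged


# Made-up daily stock readings for a single material.
history = [
    ("2024-01-01", 500),
    ("2024-01-02", 480),
    ("2024-01-03", 200),
    ("2024-01-04", 190),
]
print(sudden_changes(history))
```

A flagged reading only answers "what happened?"; it marks the point where asking "why did it happen?" is worth the time.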

Tips for Data Visualization and Analysis

Choose the Right Visualization Type: Match the chart to the data: line graphs for time series, bar graphs for categories, and pie charts for proportions.
Keep it Simple: Dashboards should not be complex. The most important metrics and trends should be prominent.
Make It Interactive: If possible, add features that let users filter data and drill down into details.
Monitor Trends: Track the data's change over time. Sudden drops or increases can be indicators of potential problems or opportunities.
Investigate Correlations: Examine the relationships between different data sets. How does a change in one data point affect another?
Don't Hesitate to Ask Questions: Data should tell you a story. Ask more questions about points you don't understand or find suspicious.

Conclusion: A Cycle of Controlled Growth and Learning

My approach to managing supply chain data flow in my own projects is essentially about simplifying and applying what I've learned from large-scale systems. Clarifying data sources and destinations, keeping the flow as automated and simple as possible, and finally visualizing and analyzing the data to make it meaningful are the cornerstones of this process.

By following these three steps, I can both increase the operational efficiency of my own projects and make more informed decisions over time. The important thing is to build a feasible, sustainable system rather than chase perfection from the start. This way, my relationship with data stays healthy, and I can extend the system as the project grows. This approach makes complex processes more manageable and keeps me in a continuous learning loop.

The methods I've shared in this post have worked well for me in small and medium-sized projects. If you are facing similar challenges in your own projects, I recommend trying these steps.