Cem AKAN

Posted on May 29

Rethinking Smart Parking: A Dynamic Line and Box Approach to Computer Vision

#ai #automation #machinelearning

In the world of computer vision, there is a dirty secret. Most smart systems are actually quite rigid. If you want to track a parking lot, the standard industry practice involves a developer sitting in front of a monitor for hours, manually clicking on every single corner of every single parking spot. They draw hundreds of static polygons. It is a fragile, exhausting process. If the camera angle shifts by a few degrees or a maintenance crew repaints the lines, the whole system becomes blind.

We knew we needed a system that could think for itself. Instead of drawing a map and telling the machine where to look, why not build a system that understands the environment by observing the cars? This is how Solovision was born. We moved away from hardcoded coordinates and toward a methodology we call the Line and Box approach.

We are all too familiar with the frustration of circling a parking lot for twenty minutes. It is a waste of fuel, time, and sanity. But fixing this should not require a human to manually draw every single spot in a database.

[other]A quick look at the daily struggle of finding a parking spot and how automated detection can change the experience.[/other]

Here is a deep dive into the logic, the math, and the chaotic reality of making a machine see like a human.

The Core Concept: Let the Cars Define the Spots

The fundamental idea behind this new approach is simple. Instead of explicitly defining where the parking spots are, we let the computer vision algorithm figure it out based on where cars are currently parked.

If you see a row of five cars parked next to each other, human logic immediately tells you that they are sitting in a designated parking row. If there is a large gap between the second and third car, you intuitively know that gap is an empty parking space. Our goal was to translate this human intuition into algorithmic logic.

To do this, the system runs a video feed through a pipeline built on Python, OpenCV, and the YOLOv8 object detection model.

Step 1: Vehicle Detection (The Red Boxes)

The first step is standard object detection. The system takes the video feed and processes it frame by frame using the YOLOv8 model.

YOLOv8 is fast and accurate at identifying vehicles. For every car it detects, it draws a bounding box and returns the coordinates. In our visualization, we represent these occupied spaces with red boxes. This per-vehicle isolation is the foundation of everything that follows, because you cannot connect the dots until you find them.
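To make the hand-off to the next steps concrete, here is a minimal sketch of what "red boxes" could look like as data. It assumes the detector output has already been unpacked into plain `(x1, y1, x2, y2, confidence, class_id)` tuples; the `Box` type, the `vehicle_boxes` helper, and the `min_conf` value are hypothetical names chosen for illustration, not taken from the Solovision code.

```python
from typing import List, NamedTuple, Tuple

# COCO class ids, which the pretrained YOLOv8 checkpoints use:
# 2 = car, 3 = motorcycle, 5 = bus, 7 = truck.
VEHICLE_CLASS_IDS = {2, 3, 5, 7}


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)

    @property
    def width(self) -> float:
        return self.x2 - self.x1


def vehicle_boxes(detections, min_conf: float = 0.4) -> List[Box]:
    """Keep only confident vehicle detections.

    Each detection is assumed to be (x1, y1, x2, y2, confidence, class_id).
    The 0.4 confidence cut-off is an illustrative default, not a tuned value.
    """
    return [
        Box(x1, y1, x2, y2)
        for x1, y1, x2, y2, conf, cls in detections
        if int(cls) in VEHICLE_CLASS_IDS and conf >= min_conf
    ]


# Made-up detections for a single frame.
frame_detections = [
    (100, 200, 180, 260, 0.91, 2),  # car
    (300, 205, 380, 262, 0.88, 2),  # car
    (500, 210, 520, 270, 0.95, 0),  # person, ignored
    (600, 200, 690, 265, 0.20, 7),  # low-confidence truck, ignored
]
boxes = vehicle_boxes(frame_detections)
```

Only the box centers and widths are needed downstream, which is why the sketch exposes them as properties.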

Step 2: Dynamic Parking Line Inference (The Blue Line)

This is where the methodology departs from traditional static mapping. We take the coordinates of all those red boxes and try to find geometric relationships between them.

The algorithm looks for cars that are aligned with one another by searching for collinearity among the bounding boxes. When it finds a group of cars sitting in a line, it calculates a dominant axis, or regression line, through them. This line represents the center of that specific parking row. We visualize it as a blue line running through the red boxes.
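One way to compute such a dominant axis is to take the principal axis of the box centers. The sketch below is an illustration under that assumption, not necessarily the exact fit used in the project; `fit_row_axis` and `distance_to_axis` are hypothetical names. Unlike an ordinary y-on-x regression, the principal axis also behaves well for rows that run vertically in the image.

```python
import math


def fit_row_axis(centers):
    """Return (centroid, unit_direction) of the principal axis through 2D points."""
    n = len(centers)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mx = sum(x for x, _ in centers) / n
    my = sum(y for _, y in centers) / n
    # Second-order central moments of the point cloud.
    sxx = sum((x - mx) ** 2 for x, _ in centers)
    syy = sum((y - my) ** 2 for _, y in centers)
    sxy = sum((x - mx) * (y - my) for x, y in centers)
    # Orientation of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def distance_to_axis(point, origin, direction):
    """Perpendicular distance from a point to the line origin + t * direction."""
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    return abs(dx * direction[1] - dy * direction[0])


# Made-up centers of four cars parked along a slightly tilted row.
row_centers = [(100, 200), (200, 210), (300, 220), (400, 230)]
origin, direction = fit_row_axis(row_centers)

# A car from another row sits far from this axis and should not join the group.
stray = (250, 300)
stray_distance = distance_to_axis(stray, origin, direction)
```

The perpendicular distance gives a simple collinearity check: a car belongs to the row only if its center lies within some tolerance of the fitted axis.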

Step 3: Free Spot Generation (The Green Boxes)

Once the system has established a parking line, it starts looking for gaps. It measures the distance between the red boxes along that specific blue line.

Of course, not every gap is a parking space. Sometimes cars are just parked a bit far apart. To handle this, the algorithm compares the gap size to a dynamic threshold, typically around 1.5 times the average width of the detected cars in that row. If the gap is large enough to fit a vehicle, the system infers that the empty space is an available parking spot and generates a green box to represent it.
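A minimal sketch of this gap test follows. It assumes each car has already been projected onto the row line as a `(start, end)` interval, and that the gap is measured edge to edge between neighboring cars; the post does not say whether the real implementation measures edge to edge or center to center, so treat that as an assumption. `find_free_spots` is a hypothetical name.

```python
def find_free_spots(car_intervals, factor=1.5):
    """Return the gaps along a row that are wide enough to count as free spots.

    car_intervals: (start, end) extent of each car along the row line, in pixels.
    factor: multiple of the average car width a gap must reach (1.5 per the text).
    """
    cars = sorted(car_intervals)
    if len(cars) < 2:
        return []  # a single car gives no gap to measure
    avg_width = sum(end - start for start, end in cars) / len(cars)
    threshold = factor * avg_width
    free = []
    # Walk over neighboring pairs and measure the edge-to-edge gap.
    for (_, prev_end), (next_start, _) in zip(cars, cars[1:]):
        if next_start - prev_end >= threshold:
            free.append((prev_end, next_start))
    return free


# Made-up row: four cars of width 80 with one wide gap in the middle.
row = [(0, 80), (100, 180), (330, 410), (430, 510)]
free_spots = find_free_spots(row)  # threshold is 1.5 * 80 = 120
```

Here the 20-pixel gaps between neighbors are ignored, while the 150-pixel gap between the second and third car clears the 120-pixel threshold and becomes a green box.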

Because the system calculates all of this on the fly, it tolerates small changes in the scene. If the camera angle shifts slightly, the algorithm simply recalculates the lines from the new visual data.

The User Experience

To make this technology easy to use, we built a clean web interface. Users do not need to be developers to run it. They can simply upload a pre-recorded video file or paste a live camera stream URL directly into the dashboard.

Once the stream is connected, the system goes straight to work. The user is taken to the Live Detection screen, where they can watch the algorithm process the environment in real time.

When Logic Meets the Real World

In clean diagrams, this methodology looks great. However, applying computer vision to a real parking lot introduces plenty of edge cases. The algorithm does not see the world in three dimensions like we do. It just sees a flat, two-dimensional grid of pixels.

When we first tested the dynamic line logic without strict constraints, the output was incredibly noisy.

Look closely at the image above. This raw debugging state illustrates the challenge of dynamic inference. The system detects the cars with red boxes just fine, but when it tries to draw the blue lines to establish parking rows, it struggles.

Because of the camera perspective, cars in the back rows look like they are sitting right on top of cars in the front. The algorithm tries to connect a car from row one to a car from row three, generating a web of overlapping lines. When the lines are wrong, the gap measurement logic fails. It sees a false gap between two incorrectly connected cars and starts placing green boxes right in the middle of driving lanes.

This false line merging is our biggest hurdle. Sometimes the algorithm figures out the general direction but is slightly off, connecting cars that have nothing to do with each other.

Now contrast that noise with the correct state we are actively building toward.

In a successful detection, the algorithm isolates each parallel row. The blue lines stay strictly within their designated rows, which lets the gap measurement work as intended and place green boxes exactly where real empty spots exist.

Improving the Logic

To move from this noisy debugging state to a stable release, we are implementing several backend improvements. Simple dot-to-dot connections are not enough.

First, we are improving tracking stability. By integrating an object tracker like DeepSORT or ByteTrack, we can maintain a consistent identity for each vehicle across frames. This stops the bounding boxes from flickering, which in turn prevents the green boxes from disappearing and reappearing randomly.
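To show the idea of frame-to-frame identity, here is a deliberately simplified stand-in: a greedy IoU matcher. It is not DeepSORT or ByteTrack, which add motion models, appearance features, and handling for briefly lost tracks; `iou`, `associate`, and the `min_iou` value are hypothetical and only illustrate how a box can keep its id between frames.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, next_id, min_iou=0.3):
    """Greedily match new detections to the previous frame's tracks.

    tracks: dict mapping track id -> box from the previous frame.
    Returns (updated_tracks, next_id). Unmatched detections get fresh ids.
    """
    updated = {}
    unmatched = dict(tracks)
    for det in detections:
        best_id, best_score = None, min_iou
        for track_id, box in unmatched.items():
            score = iou(box, det)
            if score >= best_score:
                best_id, best_score = track_id, score
        if best_id is None:
            best_id = next_id  # no overlap with any known car: new vehicle
            next_id += 1
        else:
            del unmatched[best_id]  # each track is used at most once per frame
        updated[best_id] = det
    return updated, next_id


# Made-up boxes: two cars in frame 1, the same two (jittered, reordered) plus a new one in frame 2.
frame1 = [(100, 200, 180, 260), (300, 200, 380, 260)]
frame2 = [(302, 201, 382, 261), (101, 199, 181, 259), (500, 200, 580, 260)]

tracks, next_id = associate({}, frame1, next_id=0)
tracks, next_id = associate(tracks, frame2, next_id)
```

Even though the detector returns the cars in a different order in the second frame, the two parked cars keep ids 0 and 1, and only the newcomer receives a new id.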

Second, we are overhauling the line logic itself. Instead of simple alignment checks, we are moving toward more robust algorithms like RANSAC or the Hough transform. These can find the true dominant axis of a row of parked cars while ignoring outliers caused by camera perspective. We are also adding a post-processing step that enforces parallel lines and merges short adjacent segments that belong to the same axis.
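For readers unfamiliar with RANSAC, the sketch below shows the core loop on box centers: sample two points, form a candidate line, count the points that fall within a tolerance, and keep the best consensus set. The function name, iteration count, tolerance, and sample data are all illustrative assumptions rather than project settings.

```python
import math
import random


def ransac_line(points, n_iters=200, inlier_tol=10.0, seed=0):
    """Return the largest set of points lying within inlier_tol of one line.

    A fixed seed keeps the sketch reproducible; real code may prefer fresh randomness.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue  # identical points do not define a line
        # Unit normal of the candidate line through the two sampled points.
        nx, ny = -(y2 - y1) / length, (x2 - x1) / length
        inliers = [
            p for p in points
            if abs((p[0] - x1) * nx + (p[1] - y1) * ny) <= inlier_tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Made-up centers: five cars in one row, plus two cars from a row farther back.
front_row = [(100, 200), (200, 205), (300, 198), (400, 203), (500, 200)]
back_row_cars = [(250, 120), (350, 110)]
row_members = ransac_line(front_row + back_row_cars)
```

A plain least-squares fit over all seven points would be dragged toward the back-row cars. The consensus step instead discards them, and the surviving inliers can then be refit to get the final blue line.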

Data Logging and Analytics

Once the real-time detection is stabilized, the final piece of the puzzle is data. The Flask backend sends the status of every spot to the frontend for live monitoring while simultaneously logging it to a PostgreSQL database. If the database is unavailable, the system falls back to a CSV file.

This historical data is where the real value lies. It allows parking operators to see overall statistics, analyze trends, and identify peak traffic hours. For instance, the system can flag that peak congestion happens between 8 and 9 p.m. and recommend alternative times for visitors.
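The database-first, CSV-fallback pattern can be sketched as follows. The `db_writer` argument is a hypothetical callable standing in for whatever actually performs the PostgreSQL insert, and the row layout is made up for illustration.

```python
import csv
import io


def log_spot_status(rows, db_writer, csv_file):
    """Write rows to the database; if that fails, append them to a CSV file.

    db_writer: hypothetical callable that inserts rows and raises on failure.
    csv_file: any writable text file object used as the fallback.
    Returns "db" or "csv" so the caller knows where the data went.
    """
    try:
        db_writer(rows)
        return "db"
    except Exception:  # sketch only: narrow this to the driver's errors in real code
        csv.writer(csv_file).writerows(rows)
        return "csv"


def unavailable_db(rows):
    """Simulates a database that cannot be reached."""
    raise ConnectionError("database unreachable")


# Made-up status row: (timestamp, spot id, status).
status_rows = [("2024-05-29T20:15:00", "row1-spot3", "free")]
fallback_buffer = io.StringIO()
destination = log_spot_status(status_rows, unavailable_db, fallback_buffer)
```

Returning the destination makes it easy to count fallbacks and to replay the CSV into the database once it is reachable again.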
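As a rough illustration of how a peak hour could be pulled out of the logged data, the sketch below averages occupancy per hour of day. The record layout, the `peak_hour` name, and the sample numbers are assumptions made for this example, not real measurements.

```python
from collections import defaultdict
from datetime import datetime


def peak_hour(records):
    """Return the hour of day (0-23) with the highest mean occupied-spot count.

    records: iterable of (iso_timestamp, occupied_count) pairs; must be non-empty.
    """
    totals = defaultdict(lambda: [0, 0])  # hour -> [sum of counts, number of samples]
    for timestamp, occupied in records:
        hour = datetime.fromisoformat(timestamp).hour
        totals[hour][0] += occupied
        totals[hour][1] += 1
    return max(totals, key=lambda h: totals[h][0] / totals[h][1])


# Made-up log samples for a single day.
samples = [
    ("2024-05-29T09:30:00", 20),
    ("2024-05-29T14:00:00", 12),
    ("2024-05-29T20:10:00", 41),
    ("2024-05-29T20:40:00", 45),
]
busiest = peak_hour(samples)
```

With these samples, the busiest hour is 20, which matches the "between 8 and 9 p.m." style of recommendation described above.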

Moving Forward

The Line and Box methodology is our attempt to shift away from rigid manual setups toward dynamic, intelligent inference. Sure, wrestling with real-world camera angles and two-dimensional perspective is a headache, but the payoff is a highly scalable system. We are building something that can look at a parking lot, figure out the geometry on its own, and just work.

No more drawing fifty manual polygons every time a camera shifts :3

If you want to see how we are tackling these perspective challenges, check out the source code and explore the repository on GitHub at https://github.com/solovision/Solovision to see the progress for yourself.