Toki Hirose

Posted on Mar 29 • Edited on Apr 25

3. Exploring Feature Distributions from Pedestrian Trajectories

#statistics #computervision #python #datascience

Motivation

In this article, I explore the statistical distributions of features extracted from pedestrian trajectories. Although the tracking accuracy still has room for improvement, analyzing the distributions of speed and related features allows me to characterize average pedestrian behavior from real video data. I have recently been studying information geometry, and I want to investigate how pedestrian behavior changes across locations by modeling it as probability distributions. By constructing a statistical manifold from these distributions across multiple locations, I aim to capture differences between pedestrian populations that conventional clustering methods such as k-means — which rely on Euclidean distance — cannot detect.

Note

I use AI assistance to draft and polish the English, but the analysis, interpretation, and core ideas are my own. Learning to write technical English is itself part of this project.

Introduction

In Article 1, I detected pedestrians in video footage and visualized their trajectories. In Article 2, I projected those trajectories onto a map using homography. Known limitations remain — detection accuracy and calibration point selection for the homography transform — but these are out of scope for this article and will be addressed in a future iteration. In this article, I analyze the speed and behavioral distributions of the detected pedestrians.

The data used in this analysis was collected at the Ginza pedestrian zone (歩行者天国) in Tokyo.

Feature Extraction

Each pedestrian trajectory stores a timestamp, pixel coordinates within the video frame, and the height of the detected bounding box. From a fixed camera, this is sufficient to measure speed, acceleration, dwell time, and several behavioral indicators.

Pixel to Meter Conversion

All trajectory measurements are initially in pixel units. To convert to real-world distances, I used the bounding box height as a depth proxy. For each pair of consecutive frames in a track, I computed the average bounding box height and used it to derive a scale factor, assuming an average human height of H_REAL = 1.7 m:

scale [m/px] = H_REAL / avg_bbox_height_px
real_dist [m] = pixel_dist [px] × scale

This approach effectively performs implicit monocular depth estimation using the known height of a pedestrian as a reference object.
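A minimal sketch of this conversion, assuming each track is stored as a list of (t_sec, x_px, y_px, bbox_h_px) tuples. That layout and the function name are hypothetical and chosen only for illustration:

```python
import math

H_REAL = 1.7  # assumed average pedestrian height in metres (value from the text)

def real_step_distances(track, h_real=H_REAL):
    """Depth-normalised distance in metres for each pair of consecutive points.

    `track` is a hypothetical list of (t_sec, x_px, y_px, bbox_h_px) tuples.
    """
    dists = []
    for (_, x0, y0, h0), (_, x1, y1, h1) in zip(track, track[1:]):
        avg_h = 0.5 * (h0 + h1)        # average bbox height of the two frames
        scale = h_real / avg_h         # metres per pixel at this depth
        pixel_dist = math.hypot(x1 - x0, y1 - y0)
        dists.append(pixel_dist * scale)
    return dists

# A 5 px step seen with a 170 px tall bounding box (0.01 m/px) is about 0.05 m.
track = [(0.0, 100.0, 200.0, 170.0), (0.5, 103.0, 204.0, 170.0)]
print(real_step_distances(track))
```

The sketch does not guard against zero or missing bounding box heights; a real pipeline would need to skip or repair such steps.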

Speed

Per-step real-world speed was computed by dividing the depth-normalized displacement between frames by the elapsed time. For each track, I extracted:

real_speed_mean: mean walking speed
real_speed_cv: coefficient of variation (std / mean), which normalizes variability relative to the mean and indicates how much a single trajectory fluctuates in pace
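As a rough sketch, the two speed features can be computed from per-step distances and elapsed times. The input lists and the use of the population standard deviation are assumptions; the text does not say which estimator was used:

```python
import statistics

def speed_features(step_dists_m, step_dts_s):
    """Mean speed and coefficient of variation for one trajectory.

    step_dists_m: depth-normalised distance of each step, in metres.
    step_dts_s:   elapsed time of each step, in seconds.
    """
    speeds = [d / dt for d, dt in zip(step_dists_m, step_dts_s) if dt > 0]
    mean = statistics.fmean(speeds)
    std = statistics.pstdev(speeds)  # population std (an assumption)
    cv = std / mean if mean > 0 else float("nan")
    return {"real_speed_mean": mean, "real_speed_cv": cv}

features = speed_features([0.5, 1.0, 1.5], [0.5, 0.5, 0.5])
print(features)  # per-step speeds are 1, 2 and 3 m/s, so the mean is 2 m/s
```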

Acceleration

Acceleration was computed as the finite difference of per-step speed divided by elapsed time. I extracted the mean of signed acceleration (real_accel_mean). Since acceleration can be positive or negative, the signed mean captures whether there is a net directional bias across the trajectory.

I also computed decel_ratio, the fraction of acceleration steps where the value is negative (i.e., the pedestrian is decelerating). A value near 0.5 indicates balanced acceleration and deceleration (typical steady walking); values above 0.5 indicate dominant braking behavior. The mean absolute acceleration (real_accel_abs_mean) captures the intensity of speed change, whereas decel_ratio captures how often the pedestrian slows down, so the two are complementary indicators. Note that real_accel_abs_mean is not among the eight features analyzed below.
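The acceleration features can be sketched as follows. The function name and input layout are hypothetical, and treating real_accel_abs_mean as the mean of |a| is one plausible reading rather than a confirmed definition:

```python
def accel_features(speeds, dts):
    """Acceleration features from per-step speeds.

    speeds: per-step speeds in m/s.
    dts:    elapsed time in seconds between consecutive speed samples,
            so len(dts) == len(speeds) - 1.
    """
    accels = [(v1 - v0) / dt
              for v0, v1, dt in zip(speeds, speeds[1:], dts) if dt > 0]
    n = len(accels)
    return {
        "real_accel_mean": sum(accels) / n,                     # signed mean
        "real_accel_abs_mean": sum(abs(a) for a in accels) / n,  # assumed: mean |a|
        "decel_ratio": sum(a < 0 for a in accels) / n,           # share of braking steps
    }

print(accel_features([1.0, 1.5, 1.0, 1.0], [0.5, 0.5, 0.5]))
```

Steps with exactly zero acceleration are counted as non-decelerating here, which pulls decel_ratio slightly below 0.5 for tracks with many constant-speed steps.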

Speed Skewness

For each trajectory, I computed the skewness of the per-step speed distribution (speed_skew). A value near zero indicates a symmetric speed profile — steady walking. A negative value indicates the presence of a slow phase within the trajectory (hesitation or stopping). A positive value indicates a fast phase (e.g., hurrying after a pause). This captures the temporal asymmetry of the speed profile in a single scalar, which mean and variance alone cannot express.
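A small sketch of the skewness computation, using the biased Fisher-Pearson coefficient (third central moment divided by the second central moment to the power 1.5). Whether the original analysis used the biased or the bias-corrected estimator is not stated, so this choice is an assumption:

```python
def speed_skew(speeds):
    """Biased sample skewness g1 = m3 / m2**1.5 of per-step speeds."""
    n = len(speeds)
    if n < 3:
        return float("nan")
    mean = sum(speeds) / n
    m2 = sum((v - mean) ** 2 for v in speeds) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in speeds) / n  # third central moment
    if m2 == 0:
        return float("nan")  # constant speed: skewness is undefined
    return m3 / m2 ** 1.5

print(speed_skew([1.0, 1.0, 1.0, 5.0]))  # one fast step: positive skew
print(speed_skew([5.0, 5.0, 5.0, 1.0]))  # one slow step: negative skew
```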

Stop Ratio

I computed stop_ratio as the fraction of steps where real-world speed falls below 0.3 m/s. A higher stop ratio indicates that the pedestrian spent more time engaging with the environment — reading a sign, looking at a shop window, or reacting to a stimulus.
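In code, this is a simple thresholded fraction. The sketch below assumes per-step speeds are already available in m/s; the constant and function names are illustrative:

```python
STOP_SPEED = 0.3  # m/s threshold from the text

def stop_ratio(speeds, threshold=STOP_SPEED):
    """Fraction of steps whose real-world speed is below the threshold."""
    if not speeds:
        return float("nan")
    return sum(v < threshold for v in speeds) / len(speeds)

print(stop_ratio([0.1, 0.2, 1.0, 1.4]))  # two of four steps are below 0.3 m/s
```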

Path Straightness

real_straightness is the ratio of the depth-normalized displacement from start to end point to the total real path length (range: 0–1). Values near 1 indicate a nearly straight trajectory; values near 0 indicate winding, reversal, or wandering. Both displacement and total path length are depth-normalized using the bounding box height at the respective trajectory points, correcting for the variation in pixel scale with camera distance.
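One possible way to compute this ratio is sketched below. It assumes a track is a list of (x_px, y_px, bbox_h_px) tuples, scales the net displacement by the average of the first and last bounding box heights, and clamps the result to 1. All three choices are assumptions made for illustration, not a description of the actual pipeline:

```python
import math

H_REAL = 1.7  # assumed average pedestrian height in metres

def real_straightness(track, h_real=H_REAL):
    """Net displacement divided by total path length, both depth-normalised."""
    if len(track) < 2:
        return float("nan")
    path = 0.0
    for (x0, y0, h0), (x1, y1, h1) in zip(track, track[1:]):
        path += math.hypot(x1 - x0, y1 - y0) * h_real / (0.5 * (h0 + h1))
    (xs, ys, hs), (xe, ye, he) = track[0], track[-1]
    net = math.hypot(xe - xs, ye - ys) * h_real / (0.5 * (hs + he))
    if path == 0.0:
        return float("nan")  # the pedestrian never moved
    # With strongly varying depth the ratio can exceed 1, so clamp it.
    return min(1.0, net / path)

l_shaped = [(0.0, 0.0, 170.0), (3.0, 0.0, 170.0), (3.0, 4.0, 170.0)]
print(real_straightness(l_shaped))  # net 5 px over a 7 px path: about 0.71
```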

Dwell Time

duration_sec is the observation duration of each track, computed as the number of frames from first to last detection multiplied by the frame interval.
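As a minimal illustration, assuming frame indices and a known frame rate (the argument names and the 30 fps value are hypothetical):

```python
def track_duration_sec(first_frame, last_frame, fps):
    """Observation duration: frames elapsed times the frame interval (1 / fps)."""
    return (last_frame - first_frame) / fps

print(track_duration_sec(30, 150, fps=30.0))  # 120 frames at 30 fps
```

This counts the frames elapsed between the first and last detection; counting both endpoints inclusively would add one frame interval.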

Distribution Analysis

I plotted the frequency distribution of each feature across all tracks and fitted candidate probability distributions using the Kolmogorov-Smirnov (KS) test.

The KS statistic D measures the maximum difference between the empirical CDF of the data and the theoretical CDF of a candidate distribution. A smaller D indicates a better fit. The corresponding p-value is the probability, under the null hypothesis that the data come from the candidate distribution, of observing a deviation at least as large as D; a higher p-value means weaker evidence against the candidate. When comparing multiple candidate distributions, I selected the one with the smallest KS statistic (breaking ties by p-value).

Candidates were assigned based on the support of each feature:

Positive-valued: Normal, Log-normal, Half-normal, Gamma, Exponential
Bounded [0, 1]: Beta, Uniform, Normal, Log-normal
Full real line: Normal, Laplace, t(df=5), Cauchy
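The selection procedure for the positive-valued group might look like the sketch below, using scipy.stats. The names POSITIVE_CANDIDATES and rank_fits are illustrative, fixing the location parameter at zero for the positive-support families is an assumption, and the synthetic Gamma sample only stands in for real data:

```python
import numpy as np
from scipy import stats

# Candidate families with per-family fit options (floc=0 pins the lower bound).
POSITIVE_CANDIDATES = {
    "Normal": (stats.norm, {}),
    "Log-normal": (stats.lognorm, {"floc": 0}),
    "Half-normal": (stats.halfnorm, {"floc": 0}),
    "Gamma": (stats.gamma, {"floc": 0}),
    "Exponential": (stats.expon, {"floc": 0}),
}

def rank_fits(x, candidates):
    """Fit each candidate by maximum likelihood, then rank by KS statistic."""
    rows = []
    for name, (dist, fit_kwargs) in candidates.items():
        params = dist.fit(x, **fit_kwargs)
        result = stats.kstest(x, dist.cdf, args=params)
        rows.append((name, result.statistic, result.pvalue))
    # Smallest D first; ties are broken by the larger p-value.
    return sorted(rows, key=lambda r: (r[1], -r[2]))

rng = np.random.default_rng(0)
sample = rng.gamma(shape=9.0, scale=0.15, size=600)  # synthetic stand-in data
for name, d, p in rank_fits(sample, POSITIVE_CANDIDATES):
    print(f"{name:12s} D={d:.3f} p={p:.3f}")
```

One caveat: when the parameters are estimated from the same sample that is then tested, the standard KS p-value tends to be optimistic, so it is safer to read it as a relative indicator than as an exact significance level.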

Feature	Unit	Mean	Std	Best Distribution	KS	p-value	Accept (p>0.05)
`real_speed_mean`	m/s	1.360	0.420	Gamma	0.037	0.350	✓
`real_speed_cv`	—	0.894	0.383	Log-normal	0.061	0.019	✗
`real_accel_mean`	m/s²	0.026	5.362	Cauchy	0.034	0.474	✓
`stop_ratio`	—	0.085	0.089	Beta	0.118	0.000	✗
`speed_skew`	—	2.460	1.866	Gamma	0.031	0.591	✓
`real_straightness`	—	0.484	0.268	Beta	0.031	0.466	✓
`decel_ratio`	—	0.499	0.054	Normal	0.067	0.008	✗
`duration_sec`	s	3.910	3.076	Gamma	0.045	0.161	✓

Results

The eight features selected for downstream analysis and their associated distribution families are as follows:

Feature	Best Distribution	Accept (p>0.05)	Rationale
`real_speed_mean`	Gamma	✓	Best fit for walking speed distribution
`real_speed_cv`	Log-normal	✗	Lowest KS among all candidates; captures individual variation in pace
`real_accel_mean`	Cauchy	✓	Lowest KS; heavy tails reflect rare but strong braking and surging events
`stop_ratio`	Beta	✗	Lowest KS; bounded in [0,1]; sensitive indicator for behavioral intervention
`speed_skew`	Gamma	✓	Lowest KS; captures temporal asymmetry in the speed profile
`real_straightness`	Beta	✓	Lowest KS; bounded in [0,1]; captures path linearity
`decel_ratio`	Normal	✗	Lowest KS; captures deceleration frequency
`duration_sec`	Gamma	✓	Best fit for observation duration

For features where the KS test rejected the null hypothesis (p < 0.05), I adopted the distribution with the lowest KS statistic as the working approximation. The reasoning is described in the Discussion.

Interpretation

real_speed_mean

Why this distribution?
Speed is non-negative and right-skewed: most pedestrians walk at a typical pace, with a tail of faster walkers. The Gamma distribution naturally accommodates a lower bound at zero and a right tail, making it a better fit than the Normal distribution, which would assign probability mass to negative values.

What the data reveals
The analysis reveals that the mean walking speed is 1.36 m/s, consistent with typical pedestrian speeds in a busy shopping district like Ginza. Speeds above 2.5 m/s are present but rare, representing pedestrians in a hurry.

real_speed_cv

Why this distribution?
The coefficient of variation is a positive quantity representing normalized speed variability within a trajectory. In a dense urban environment like Ginza, pedestrians frequently adjust pace due to crowds, direction changes, and window shopping. Tracking ID switches — where the system reassigns an ID between two different individuals — can also cause artificial speed jumps. Both effects contribute to a right-skewed distribution with a heavy tail, which the Log-normal distribution captures well.

What the data reveals
The mean CV of 0.89 indicates that most pedestrians show substantial within-trajectory speed variation. This is expected given the high foot traffic and the current limitations of the CentroidTracker-based tracking system.

real_accel_mean

Why this distribution?
Signed mean acceleration can be positive or negative, so a bell-shaped distribution centered near zero is expected. In a pedestrian zone without traffic signals, pedestrians are not forced to stop — sharp decelerations more likely reflect voluntary behavior such as entering a shop or looking at a sign. The Cauchy distribution captures the same bell shape as the Normal but with much heavier tails, consistent with the occasional extreme acceleration events observed.

What the data reveals
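The distribution is centered very close to zero (sample mean = 0.026 m/s², against a standard deviation of 5.362 m/s²), which is consistent with pedestrians neither systematically accelerating nor decelerating over their trajectories. Because a Cauchy distribution has no finite mean, the sample mean is only a rough indicator of location here. The heavy Cauchy tails reflect rare but strong braking or surging events.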

stop_ratio

Why this distribution?
stop_ratio is bounded in [0, 1] and represents the fraction of steps below a speed threshold within a trajectory. The Beta distribution is the natural choice for proportions on this interval.

What the data reveals
The distribution peaks near zero and 0.05, indicating that most pedestrians rarely drop below 0.3 m/s. However, the long right tail — reaching up to approximately 0.4 — confirms that a meaningful minority paused significantly, likely engaging with shop displays or other environmental stimuli.

real_straightness

Why this distribution?
real_straightness is bounded in [0, 1], making the Beta distribution the natural candidate. In a shopping district, pedestrians follow moderately direct paths but with meaningful deviation — neither hugging a straight line nor wandering randomly. This intermediate behavior, away from both boundary extremes, is well captured by a Beta distribution with parameters that place probability mass in the interior of [0, 1].

What the data reveals
The distribution is spread across the interior of [0, 1] with a mean of 0.48, indicating that pedestrians follow paths that are neither fully straight nor fully winding. This intermediate behavior is consistent with movement through a busy shopping district, where pedestrians maintain a general direction but frequently deviate in response to storefronts, crowds, and signage.

speed_skew

Why this distribution?
Per-step speed is non-negative, so the within-trajectory speed distribution has a natural floor at zero. This floor creates an inherent right skew in the speed profile, which means speed_skew is almost always positive (98.4% of tracks in this dataset). The Gamma distribution, which is defined for positive values and flexible in shape, fits this behavior well.

What the data reveals
The mean speed_skew is 2.46 (std = 1.87, as reported in the table above), indicating that most pedestrians have a few brief high-speed moments within an otherwise slower trajectory. This is consistent with occasional surges in pace, such as stepping around another person or crossing a gap in the crowd, embedded in otherwise steady walking.

decel_ratio

Why this distribution?
decel_ratio measures the fraction of steps where acceleration is negative. For steady walking, this should be close to 0.5 — alternating acceleration and deceleration in roughly equal measure. The Normal distribution centered near 0.5 captures this well.

What the data reveals
The mean is 0.499, confirming that pedestrians in this dataset decelerate and accelerate in nearly equal proportions. There is no systematic braking bias in baseline behavior, which provides an important reference for detecting shifts caused by an intervention in Articles 12 and 13.

duration_sec

Why this distribution?
Observation duration is positive and right-skewed: most pedestrians cross the field of view in a few seconds, but some linger, pause, or follow longer paths. The Gamma distribution accommodates the peak at short durations and the long right tail.

What the data reveals
The median duration is approximately 2 seconds, consistent with pedestrians walking through the frame. The right tail extends to around 14 seconds. One limitation of this approach is that the maximum observed duration is relatively short — it is likely that some pedestrians who stopped for longer had their track IDs reassigned by the tracker, truncating their trajectories.

Discussion

KS Test Sensitivity at Large Sample Sizes

The Kolmogorov-Smirnov test becomes increasingly sensitive as sample size grows. With n ≈ 600, even small deviations from a theoretical distribution are detectable, so the null hypothesis can be rejected even when the visual fit is clearly reasonable. The KS statistic D measures the maximum gap between empirical and theoretical CDFs, so a small D still indicates a close fit even when the null hypothesis is formally rejected. For this reason, I adopted the distribution with the smallest D as the working approximation for all features, including those where the null hypothesis was rejected.
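To make the sample-size effect concrete, the sketch below evaluates the standard large-sample approximation for the two-sided KS critical value, D_crit ≈ sqrt(-ln(α/2) / 2) / sqrt(n). The function name is illustrative, and the approximation assumes a fully specified null distribution, so it is only a rough guide when parameters are fitted:

```python
import math

def ks_critical_d(n, alpha=0.05):
    """Asymptotic two-sided KS critical value for sample size n."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.358 for alpha = 0.05
    return c_alpha / math.sqrt(n)

for n in (100, 600, 2400):
    print(n, round(ks_critical_d(n), 4))
```

For n = 600 the threshold is roughly 0.055, which sits between the largest accepted D in the table (0.045) and the smallest rejected D (0.061). Because the threshold shrinks as 1/sqrt(n), quadrupling the sample size halves the deviation that the test will tolerate.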

Feature Correlation and Implications for Manifold Construction

The correlation matrix reveals that speed_skew and real_speed_cv are strongly correlated (r = 0.77). Highly correlated features introduce redundant dimensions into the statistical manifold, which may affect the geometry of the parameter space in Article 4. Both features are retained for now, and their effect on the manifold structure will be examined in that article.
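For reference, the Pearson coefficient behind such a correlation matrix can be sketched in a few lines. The function name and toy inputs are illustrative, and this assumes the reported r is a Pearson correlation, which the text does not state explicitly:

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Toy per-track values standing in for speed_skew and real_speed_cv.
print(pearson_r([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))
```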

The Cauchy Distribution and Fisher Information

The Cauchy distribution for real_accel_mean is a physically meaningful result: it reflects the occasional extreme acceleration events, such as sudden stops at a shop entrance or abrupt lane changes, that occur against a background of smooth walking. However, the Cauchy distribution has no defined mean or variance. Its Fisher information metric is nevertheless well defined: for the location-scale family (μ, γ) it is diagonal with both entries equal to 1/(2γ²), so it depends only on the scale parameter γ. In Article 4, the treatment of the location parameter μ in the manifold construction will therefore require careful consideration.
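The Fisher information of the Cauchy family can be checked numerically. The sketch below integrates the products of the score functions for Cauchy(μ, γ) with a midpoint rule, using the substitution x = μ + γ·tan(θ), under which the density becomes uniform in θ. The function name and grid size are illustrative choices:

```python
import math

def cauchy_fisher_information(gamma, n=20000):
    """Return (g_mu_mu, g_mu_gamma, g_gamma_gamma) for Cauchy(mu, gamma).

    With x = mu + gamma * tan(theta), theta is uniform on (-pi/2, pi/2),
    so the expectation becomes a plain integral over theta. The result
    does not depend on mu.
    """
    h = math.pi / n
    g_mm = g_mg = g_gg = 0.0
    for i in range(n):
        theta = -math.pi / 2.0 + (i + 0.5) * h                 # midpoint of cell i
        z = math.tan(theta)                                    # z = (x - mu) / gamma
        s_mu = 2.0 * z / (gamma * (1.0 + z * z))               # d log f / d mu
        s_gamma = (1.0 / gamma) * (1.0 - 2.0 / (1.0 + z * z))  # d log f / d gamma
        w = h / math.pi                                        # probability of cell i
        g_mm += s_mu * s_mu * w
        g_mg += s_mu * s_gamma * w
        g_gg += s_gamma * s_gamma * w
    return g_mm, g_mg, g_gg

print(cauchy_fisher_information(gamma=2.0))  # compare with 1 / (2 * gamma**2) = 0.125
```

Both diagonal entries should come out close to 1/(2γ²) and the off-diagonal entry close to zero, which is the sense in which the metric depends only on γ.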

Limitations of the Data Collection Method

One limitation of this approach is that the CentroidTracker used for pedestrian tracking is prone to ID switching when pedestrians cross paths. This produces artificial velocity jumps in some trajectories and likely inflates the values of real_speed_cv and speed_skew. Improving tracking accuracy — for example by switching to a re-identification-based tracker — would reduce this artifact.

A second limitation is that the depth normalization via bbox_height assumes full-body visibility. For occluded or partially cropped pedestrians, the scale factor may be unreliable.

Toward Detecting Behavioral Change

The eight distribution families established here define the coordinate system of the statistical manifold in Article 4. In future articles (Articles 12 and 13), I will investigate how the parameters of these distributions shift when a standing demonstration is present. A systematic shift in Beta parameters for stop_ratio, or a change in the Cauchy scale parameter for real_accel_mean, would provide a quantitative signature of behavioral disruption.

Conclusion

In this article, I extracted eight features from pedestrian trajectories obtained from video footage and fitted probability distributions to each feature using the KS test. The analysis indicates that mean walking speed, speed skewness, and dwell time are best described by Gamma distributions; speed variability by a Log-normal distribution; signed mean acceleration by a Cauchy distribution; stop ratio and path straightness by Beta distributions; and deceleration ratio by a Normal distribution. For speed variability, stop ratio, and deceleration ratio, the KS test rejected the fit at p < 0.05, so those families serve as working approximations. These eight features and their associated distribution families form the coordinate parameterization of the statistical manifold to be constructed in Article 4.

In the next article

In the next article, I will apply this parameterization schema to pedestrian trajectory data collected at multiple stations along the Yamanote Line. Each location will be represented as a point on a statistical manifold, where coordinates are the estimated parameters of the eight distributions identified here. Using the Fisher information metric and e/m geodesics, I will compute distances between locations in a way that reflects genuine differences in pedestrian behavior — differences that Euclidean distance in raw feature space cannot capture.