Note: I use AI assistance to draft and polish the English, but the analysis, interpretation, and core ideas are my own. Learning to write technical English is itself part of this project.
Introduction
In this article, I extend the pedestrian trajectory feature distributions measured in Article 3 to analyze pedestrian trajectory distributions across multiple urban locations. Rather than applying PCA, I construct a manifold where each location's distribution becomes a single coordinate point, and compute KL divergences based on dual flat structure for comparison between stations. By embedding each observation point's information onto the manifold, we can connect how differences in pedestrian behavioral dynamics are influenced by differences in urban spatial structure — a connection developed further in the next article.
The key idea is to treat pedestrian trajectory observations not as isolated events but as distributions, so that each observation point becomes a single point on the information-geometric manifold. Comparing stations along geodesics of the dual flat structure, which separates the natural parameters of the pedestrian trajectory distributions from their expectation parameters, allows us to observe non-linear behavioral differences in a geometrically precise way.
Motivation
At the end of last year, I read Spivak's "Can the Subaltern Speak?" I felt that the method of searching for where silence lies is similar to acoustic reflection surveying in geophysics. Acoustic reflection surveying sends sound into the ground; where strata boundaries exist, reflection intensity changes. Graphing the vertical intensity changes reveals stratum surfaces as regions of high reflection. Where liquid exists underground, reflections become chaotic rather than coherent.
When ideology is treated as its environment, well-discussed places show clear strata surfaces, while places that go undiscussed become regions that cannot be measured by that observation method. Like acoustic reflection surveying, the purpose of this series is to capture reflection surfaces by applying some action and identify where invisible places are.
Around the same time, I encountered information geometry. KL divergence measures the asymmetric difference between two distributions. I thought that invisible differences between places — the kind that symmetric metrics erase — might emerge precisely in that asymmetry.
Methodological Foundation
Why Information Geometry Over PCA?
PCA's essential operation is dimensionality reduction — discarding data. It treats observations as points in Euclidean space and projects onto directions of maximum variance, focusing on the observations themselves and summarizing inter-indicator relationships linearly. Euclidean distance between parameters does not correspond to "statistical distinguishability."
Information geometry addresses this through its dual flat structure:
- Distributions as manifold points: Entire distributions are the unit of analysis, not individual observations
- Fisher metric: The unique invariant metric on statistical manifolds, measuring distance as statistical distinguishability — how easily two distributions can be separated from data
- Dual structure (e-connection / m-connection): Naturally separates observational indicators from distributional parameters:
  - e-connection: captures changes in natural parameters (generative mechanisms)
  - m-connection: captures changes in expectation parameters (observable statistics)
  - The same observed change may reflect different magnitudes of change in the generative mechanism depending on the location on the manifold, a non-symmetry PCA cannot capture in principle
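The claim that Euclidean distance between parameters does not track statistical distinguishability can be illustrated with a small sketch. The parameter pairs below are hypothetical, chosen only for illustration, and are not fitted values from this series. Both pairs are the same Euclidean distance apart in (μ, σ), yet their KL divergences differ by orders of magnitude.

```python
import math

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form D_KL(N(mu1, sigma1) || N(mu2, sigma2))."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

def euclidean(p, q):
    """Plain Euclidean distance between two (mu, sigma) parameter vectors."""
    return math.dist(p, q)

# Hypothetical parameter pairs (NOT fitted values from the article).
# Each pair differs by 0.5 in mu and shares the same sigma.
narrow_a, narrow_b = (1.0, 0.1), (1.5, 0.1)   # tightly concentrated speeds
broad_a, broad_b = (1.0, 2.0), (1.5, 2.0)     # widely spread speeds

d_narrow = euclidean(narrow_a, narrow_b)
d_broad = euclidean(broad_a, broad_b)
kl_narrow = kl_normal(*narrow_a, *narrow_b)
kl_broad = kl_normal(*broad_a, *broad_b)

print(f"Euclidean: narrow={d_narrow:.3f}  broad={d_broad:.3f}")
print(f"KL:        narrow={kl_narrow:.5f}  broad={kl_broad:.5f}")
```

The narrow pair is easy to tell apart from a handful of samples, while the broad pair overlaps almost entirely. The Euclidean distance is identical in both cases.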
Station as a Point on the Manifold
Data was collected at five locations: Ginza 1-chome, Ginza 2-chome, Shinjuku, Kamata, and Shinbashi.
| point_name | tracks | lat | lon | point attribution |
|---|---|---|---|---|
| Ginza1 | 618 | 35.67380 | 139.76772 | Shopping / Tourism |
| Ginza2 | 517 | 35.67385 | 139.76775 | Shopping / Tourism |
| Shinjuku | 989 | 35.69183 | 139.70259 | Large-scale commercial / Transit hub |
| Kamata | 745 | 35.56262 | 139.71545 | Commercial / Transit hub |
| Shinbashi | 1106 | 35.66575 | 139.75797 | Business |
The point attribution labels indicate the urban function and regional character of each location. "Shopping/Tourism" suggests pedestrian influx is primarily for sightseeing and shopping; "Business" indicates a high proportion of commuting and work-related use.
Shinjuku, as a major terminal station, is a complex point where commercial and transit functions overlap. Although it shares a commercial character with Ginza, Shinjuku is expected to show greater diversity in travel purpose and speed distribution, with more pronounced mixing of stop and through-traffic behavior. Its manifold point therefore likely reflects broader distributions and more mixed traffic behavior than Ginza's. Each station's point on the manifold is determined by its trajectory feature distribution parameters; point attribution functions as an "urban context" tag attached to that point.
Note: Ginza 1 and 2 are at adjacent intersections and overlap at this map scale.
Each station is represented by fitting the same 8-feature distribution schema established in Article 3:
| Feature | Distribution | Parameters |
|---|---|---|
| real_speed_mean | Normal | (μ, σ) |
| real_speed_cv | Log-normal | (s, loc, scale) |
| real_accel_abs_mean | Half-normal | (loc, σ) |
| stop_ratio | Beta | (α, β) |
| real_straightness | Beta | (α, β) |
| speed_skew | Gamma | (a, loc, scale) |
| decel_ratio | Beta | (α, β) |
| duration_sec | Log-normal | (s, loc, scale) |
The manifold point for station s is the concatenated vector of all fitted parameters:
p_s = (μ_speed, σ_speed, s_cv, …, s_dur, scale_dur) ∈ M_U
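As a rough sketch of how such a parameter vector could be assembled, the example below fits three of the eight features by closed-form maximum likelihood with loc fixed at 0 and concatenates the results. The sample data is synthetic, the function names are my own, and this is not necessarily the fitting procedure used in Article 3. The Beta and Gamma features have no closed-form MLE and would need a numerical fitter, so they are left out here.

```python
import math
import random
import statistics

def fit_normal(xs):
    """MLE for Normal: returns (mu, sigma)."""
    return statistics.fmean(xs), statistics.pstdev(xs)

def fit_lognormal(xs):
    """MLE for Log-normal with loc fixed at 0: returns (s, loc, scale)."""
    logs = [math.log(x) for x in xs]
    return statistics.pstdev(logs), 0.0, math.exp(statistics.fmean(logs))

def fit_halfnormal(xs):
    """MLE for Half-normal with loc fixed at 0: returns (loc, sigma)."""
    return 0.0, math.sqrt(statistics.fmean([x * x for x in xs]))

# Synthetic stand-in data (NOT real trajectories), one value per track.
random.seed(0)
n_tracks = 2000
speed_mean = [random.gauss(1.4, 0.5) for _ in range(n_tracks)]
speed_cv = [random.lognormvariate(math.log(0.8), 0.4) for _ in range(n_tracks)]
accel_abs = [abs(random.gauss(0.0, 2.0)) for _ in range(n_tracks)]

# Concatenate the fitted parameters into one coordinate vector.
point = [
    *fit_normal(speed_mean),      # (mu, sigma)
    *fit_lognormal(speed_cv),     # (s, loc, scale)
    *fit_halfnormal(accel_abs),   # (loc, sigma)
]
print([round(v, 4) for v in point])
```

The full p_s would extend this vector with the Beta, Gamma, and second Log-normal parameters in the same feature order as the table.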
As a concrete example, the manifold coordinates for Shinjuku are:
| Feature | Distribution | Fitted Parameters |
|---|---|---|
| real_speed_mean | Normal | μ=1.4625, σ=0.5319 |
| real_speed_cv | Log-normal | s=0.3845, loc=0, scale=0.8025 |
| real_accel_abs_mean | Half-normal | loc=0, σ=28.2603 |
| stop_ratio | Beta | α=0.7074, β=7.8578 |
| real_straightness | Beta | α=0.8339, β=0.7307 |
| speed_skew | Gamma | a=1.861, loc=0, scale=1.1513 |
| decel_ratio | Beta | α=31.5501, β=31.492 |
| duration_sec | Log-normal | s=0.8659, loc=0, scale=2.1172 |
This parameter vector defines Shinjuku's single point on M_U. The same procedure is applied to all five stations to populate the manifold.
Dual Flat Structure
M_U is a dually flat manifold because each feature distribution belongs to an exponential family. The product of independent exponential families inherits this structure with:
- Natural parameters θ: Canonical exponential family parametrization
- Expectation parameters η: E[T(X)] where T(X) are sufficient statistics
- Related by Legendre transform: η = ∇_θ ψ(θ) where ψ is the log-partition function
For each distribution family:
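- Normal(μ,σ): θ = (μ/σ², −1/(2σ²)), η = (μ, μ²+σ²)
- Log-normal(s,scale): θ = (m/s², −1/(2s²)) where m = log(scale), η = (m, m²+s²)
- Half-normal(σ): θ = −1/(2σ²), η = σ²
- Gamma(a,scale): θ = (a−1, −1/scale), η = (ψ(a)−log(1/scale), a·scale)
- Beta(α,β): θ = (α−1, β−1), η = (ψ(α)−ψ(α+β), ψ(β)−ψ(α+β))

In the Gamma and Beta entries, ψ with a scalar argument denotes the digamma function, not the log-partition function ψ(θ) used in the Legendre relation.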
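The conversion formulas above can be sketched in plain Python. The helper names are my own, and the small `digamma` routine (recurrence plus an asymptotic series) is a stand-in for whatever special-function library was actually used. The input values are Shinjuku's fitted parameters from the table above, taken in feature order.

```python
import math

def digamma(x):
    """Digamma function for x > 0: recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 6.0:               # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series

def normal_dual(mu, sigma):
    theta = (mu / sigma**2, -1 / (2 * sigma**2))
    eta = (mu, mu**2 + sigma**2)
    return theta, eta

def lognormal_dual(s, scale):
    # Same as the Normal case applied to log X, with m = log(scale).
    return normal_dual(math.log(scale), s)

def halfnormal_dual(sigma):
    return (-1 / (2 * sigma**2),), (sigma**2,)

def gamma_dual(a, scale):
    theta = (a - 1, -1 / scale)
    eta = (digamma(a) - math.log(1 / scale), a * scale)
    return theta, eta

def beta_dual(alpha, beta):
    theta = (alpha - 1, beta - 1)
    psi_sum = digamma(alpha + beta)
    eta = (digamma(alpha) - psi_sum, digamma(beta) - psi_sum)
    return theta, eta

# Shinjuku's fitted parameters, in the feature order of the schema.
blocks = [
    normal_dual(1.4625, 0.5319),       # real_speed_mean
    lognormal_dual(0.3845, 0.8025),    # real_speed_cv
    halfnormal_dual(28.2603),          # real_accel_abs_mean
    beta_dual(0.7074, 7.8578),         # stop_ratio
    beta_dual(0.8339, 0.7307),         # real_straightness
    gamma_dual(1.861, 1.1513),         # speed_skew
    beta_dual(31.5501, 31.492),        # decel_ratio
    lognormal_dual(0.8659, 2.1172),    # duration_sec
]
theta = [v for th, _ in blocks for v in th]
eta = [v for _, et in blocks for v in et]

print(len(theta), len(eta))
print("theta:", [round(v, 4) for v in theta])
print("eta:  ", [round(v, 4) for v in eta])
```

Because the inputs are the rounded table values, small differences in the last printed digits relative to coordinates computed from unrounded fits are to be expected.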
For example, applying real_speed_mean (Normal distribution) to the dual flat structure:
- Natural parameters θ
  - θ₁ = μ/σ²: the mean weighted by precision. If pedestrian speeds are highly variable (large σ²), this value is small, a weak signal. If everyone walks at nearly the same speed, it is large, a strong signal.
  - θ₂ = −1/(2σ²): encodes the precision of the distribution. A narrow spread yields a large negative value; a broad distribution yields a value close to zero.
- Expectation parameters η
  - η₁ = μ: simply the mean walking speed, directly readable from the data.
  - η₂ = μ² + σ²: the second moment, encoding both the mean and the spread of speeds.
Applying these conversions to Shinjuku's fitted parameters yields a 15-dimensional coordinate vector in each system (θ-dim = η-dim = 15, one dimension per sufficient statistic across all 8 features):
θ (Shinjuku): [ 5.1697e+00 -1.7674e+00 -1.4882e+00 -3.3827e+00 -6.0000e-04
-2.9260e-01 6.8578e+00 -1.6610e-01 -2.6930e-01 8.6100e-01
-8.6860e-01 3.0550e+01 3.0492e+01 1.0005e+00 -6.6690e-01]
η (Shinjuku): [ 1.4625e+00 2.4218e+00 -2.2000e-01 1.9620e-01 7.9864e+02
-3.2875e+00 -9.1700e-02 -9.8470e-01 -1.2312e+00 4.6990e-01
2.1425e+00 -7.0020e-01 -7.0210e-01 7.5010e-01 1.3124e+00]
These two vectors are the dual coordinates of Shinjuku's point on M_U. The θ-coordinates encode the generative mechanism (natural parameters), while the η-coordinates encode the observable statistics (expectation parameters). Their relationship via the Legendre transform is what makes geodesic and divergence calculations tractable.
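The Legendre relation η = ∇_θ ψ(θ) can be checked numerically for a single block. The sketch below uses the Normal family, whose log-partition function (up to an additive constant that does not affect the gradient) is ψ(θ) = −θ₁²/(4θ₂) − ½·log(−2θ₂). It compares a finite-difference gradient against the closed-form η for Shinjuku's real_speed_mean. The function names and the step size are my own choices.

```python
import math

def log_partition_normal(theta1, theta2):
    """psi(theta) for the Normal family, up to an additive constant."""
    return -theta1**2 / (4 * theta2) - 0.5 * math.log(-2 * theta2)

def numerical_gradient(f, point, h=1e-6):
    """Central finite-difference gradient of f at point."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        grad.append((f(*up) - f(*down)) / (2 * h))
    return grad

# Shinjuku real_speed_mean, from the fitted-parameter table.
mu, sigma = 1.4625, 0.5319
theta = (mu / sigma**2, -1 / (2 * sigma**2))

eta_closed = (mu, mu**2 + sigma**2)              # closed-form expectation parameters
eta_numeric = numerical_gradient(log_partition_normal, theta)

print("closed form:", [round(v, 6) for v in eta_closed])
print("numeric:    ", [round(v, 6) for v in eta_numeric])
```

The two agree to within finite-difference error, which is the sense in which θ and η are dual coordinates of the same point.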
Fisher Information Metric
The Fisher information matrix G for M_U is block diagonal due to feature independence:
G = diag(G₁, G₂, …, G₈)
Each block G_i is computed analytically for the corresponding distribution family.
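A minimal sketch of the block-diagonal assembly is shown below for two of the eight features. The source does not state which coordinate system the blocks are expressed in; here I assume the ordinary (μ, σ) and σ coordinates, where the Normal block is diag(1/σ², 2/σ²) and the Half-normal block (loc = 0) is 2/σ². The helper names are hypothetical.

```python
def fisher_normal(mu, sigma):
    """Fisher information of Normal(mu, sigma) in (mu, sigma) coordinates."""
    return [[1 / sigma**2, 0.0],
            [0.0, 2 / sigma**2]]

def fisher_halfnormal(sigma):
    """Fisher information of Half-normal (loc = 0) in the sigma coordinate."""
    return [[2 / sigma**2]]

def block_diag(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    n = sum(len(b) for b in blocks)
    G = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                G[offset + i][offset + j] = value
        offset += len(b)
    return G

# Shinjuku's real_speed_mean and real_accel_abs_mean parameters.
G = block_diag([
    fisher_normal(1.4625, 0.5319),
    fisher_halfnormal(28.2603),
])
for row in G:
    print([round(v, 5) for v in row])
```

All entries that couple different features are zero, which is what feature independence means for the metric.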
Results
Manifold Construction
I applied the schema to 5 JRE Line stations with available trajectory data. Each station's manifold coordinates were computed by fitting distributions to the trajectory features extracted as in Article 3.
KL Divergence Between Stations
Because the features are modeled as independent (the same property that makes G block diagonal), the KL divergence between stations p and q decomposes into a sum over features:
D_KL(p ‖ q) = Σᵢ D_KL(pᵢ ‖ qᵢ)
For example, for a Normal-distributed feature:
D_KL(𝒩(μ₁,σ₁) ‖ 𝒩(μ₂,σ₂)) = log(σ₂/σ₁) + (σ₁² + (μ₁−μ₂)²) / (2σ₂²) − 1/2
Each distribution family (Log-normal, Half-normal, Gamma, Beta) has its own closed-form expression, computed analytically in the same way.
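As a sanity check on the Normal closed form, the sketch below compares it with the generic exponential-family expression D_KL(p ‖ q) = ψ(θ_q) − ψ(θ_p) − (θ_q − θ_p)·η_p, which uses only the dual coordinates. The first parameter pair is Shinjuku's real_speed_mean; the second station is hypothetical, since only Shinjuku's fitted values are listed in this article.

```python
import math

def kl_normal(mu1, s1, mu2, s2):
    """Closed-form D_KL(N(mu1, s1) || N(mu2, s2))."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def normal_theta(mu, s):
    return (mu / s**2, -1 / (2 * s**2))

def normal_eta(mu, s):
    return (mu, mu**2 + s**2)

def psi_normal(theta):
    """Log-partition function of the Normal family (additive constant dropped)."""
    t1, t2 = theta
    return -t1**2 / (4 * t2) - 0.5 * math.log(-2 * t2)

def kl_bregman(p, q):
    """D_KL(p || q) from dual coordinates: psi(tq) - psi(tp) - (tq - tp) . eta_p."""
    theta_p, theta_q = normal_theta(*p), normal_theta(*q)
    eta_p = normal_eta(*p)
    inner = sum((tq - tp) * e for tp, tq, e in zip(theta_p, theta_q, eta_p))
    return psi_normal(theta_q) - psi_normal(theta_p) - inner

shinjuku = (1.4625, 0.5319)   # real_speed_mean (mu, sigma) from the article
other = (1.30, 0.40)          # hypothetical second station

forward = kl_normal(*shinjuku, *other)
reverse = kl_normal(*other, *shinjuku)
print(f"closed form  D(p||q) = {forward:.6f}")
print(f"dual coords  D(p||q) = {kl_bregman(shinjuku, other):.6f}")
print(f"closed form  D(q||p) = {reverse:.6f}")
```

The station-level divergence would then be the sum of such per-feature terms, one per distribution family.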
The divergence matrix shows clear behavioral differences between stations, with some pairs showing much higher divergence than others.
| D(row ‖ column) | Ginza1 | Ginza2 | Shinjuku | Kamata | Shinbashi |
|---|---|---|---|---|---|
| Ginza1 | 0.000 | 0.200 | 0.175 | 0.165 | 0.665 |
| Ginza2 | 0.152 | 0.000 | 0.319 | 0.341 | 0.521 |
| Shinjuku | 0.208 | 0.467 | 0.000 | 0.453 | 0.740 |
| Kamata | 0.167 | 0.439 | 0.424 | 0.000 | 1.566 |
| Shinbashi | 0.448 | 0.490 | 0.505 | 0.984 | 0.000 |
Note that D(p‖q) ≠ D(q‖p): KL divergence is asymmetric. For example, D(Kamata‖Shinbashi) = 1.566 while D(Shinbashi‖Kamata) = 0.984. For every other station, the largest divergence in either direction is the one involving Shinbashi, suggesting that Shinbashi occupies a behaviorally distinct region of M_U.
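The matrix above can be reduced to one symmetric value and one signed asymmetry per pair. In the sketch below, rows are p and columns are q, as implied by the Kamata and Shinbashi example. Taking the mean of the two directions as the symmetrized KL is my assumption; it agrees with the Sym-KL values reported later in this article to the precision shown.

```python
stations = ["Ginza1", "Ginza2", "Shinjuku", "Kamata", "Shinbashi"]

# D[i][j] = D_KL(stations[i] || stations[j]), copied from the table above.
D = [
    [0.000, 0.200, 0.175, 0.165, 0.665],
    [0.152, 0.000, 0.319, 0.341, 0.521],
    [0.208, 0.467, 0.000, 0.453, 0.740],
    [0.167, 0.439, 0.424, 0.000, 1.566],
    [0.448, 0.490, 0.505, 0.984, 0.000],
]

pairs = []
for i in range(len(stations)):
    for j in range(i + 1, len(stations)):
        sym = 0.5 * (D[i][j] + D[j][i])    # symmetrized KL (mean of both directions)
        asym = D[i][j] - D[j][i]           # signed asymmetry, forward minus reverse
        pairs.append((stations[i], stations[j], sym, asym))

# Largest symmetrized divergence first.
pairs.sort(key=lambda row: row[2], reverse=True)
for a, b, sym, asym in pairs:
    print(f"{a:>9} <-> {b:<9}  sym={sym:.3f}  asym={asym:+.3f}")
```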
This figure shows how much KL divergence arises per feature for each station pair. Row 1: D(p‖q) — forward direction. Row 2: D(q‖p) — reverse direction. Row 3: D(p‖q) − D(q‖p) — asymmetry (with zero line). The third row in particular reveals which features are driving the KL asymmetry.
e-Geodesics and m-Geodesics
The dual flat structure enables two types of geodesics between stations:
- e-geodesic: Straight line in θ-space (natural parameter space)
- m-geodesic: Straight line in η-space (expectation parameter space)
The asymmetry between these paths reveals the curvature of the behavioral manifold. For pairs with high KL divergence, the midpoint of the e-geodesic and m-geodesic can differ significantly, indicating nonlinear relationships between generative mechanisms and observed statistics.
The figure below shows the e-geodesic (left) and m-geodesic (right) between two observation points, along with the KL divergence computed at intermediate points along each path (center). t=0 represents the start point and t=1 the end point. At t=0 and t=1 the KL divergence is zero by definition; it reaches its maximum at t=0.5, the midpoint.
| Pair | Sym-KL | Max Div | Mean Div | Nonlinearity |
|---|---|---|---|---|
| Kamata↔Shinbashi | 1.27542 | 0.04743 | 0.02420 | High |
| Shinjuku↔Shinbashi | 0.62270 | 0.01401 | 0.00695 | Medium |
| Ginza1↔Shinbashi | 0.55669 | 0.01352 | 0.00672 | Medium |
| Ginza2↔Shinjuku | 0.39324 | 0.00896 | 0.00446 | Medium |
| Ginza2↔Shinbashi | 0.50506 | 0.00890 | 0.00442 | Medium |
| Ginza2↔Kamata | 0.39019 | 0.00546 | 0.00269 | Low |
| Shinjuku↔Kamata | 0.43869 | 0.00372 | 0.00182 | Low |
| Ginza1↔Ginza2 | 0.17615 | 0.00202 | 0.00099 | Low |
| Ginza1↔Shinjuku | 0.19169 | 0.00178 | 0.00087 | Low |
| Ginza1↔Kamata | 0.16602 | 0.00081 | 0.00040 | Low |
- Max Div: max geodesic gap
- Mean Div: mean geodesic gap
The maximum geodesic gap is 0.047, which is small — indicating that this manifold is relatively flat. The e-geodesic and m-geodesic paths can be considered approximately identical.
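The geodesic gap can be sketched for a single Normal feature. The e-geodesic interpolates linearly in θ and the m-geodesic linearly in η; both points are mapped back to (μ, σ) and compared. Measuring the gap as D_KL(e-point ‖ m-point) is my assumption, since the source does not specify the direction, and the second station's parameters are hypothetical.

```python
import math

def to_theta(mu, s):
    return (mu / s**2, -1 / (2 * s**2))

def to_eta(mu, s):
    return (mu, mu**2 + s**2)

def from_theta(t1, t2):
    var = -1 / (2 * t2)
    return (t1 * var, math.sqrt(var))

def from_eta(e1, e2):
    return (e1, math.sqrt(e2 - e1**2))

def lerp(a, b, t):
    """Straight-line interpolation between coordinate tuples a and b."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))

def kl_normal(mu1, s1, mu2, s2):
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def geodesic_gap(p, q, t):
    """KL divergence between the e-geodesic and m-geodesic points at time t."""
    e_point = from_theta(*lerp(to_theta(*p), to_theta(*q), t))
    m_point = from_eta(*lerp(to_eta(*p), to_eta(*q), t))
    return kl_normal(*e_point, *m_point)

p = (1.4625, 0.5319)   # Shinjuku real_speed_mean (mu, sigma)
q = (1.30, 0.40)       # hypothetical second station

gaps = [geodesic_gap(p, q, k / 10) for k in range(11)]
max_gap = max(gaps)
mean_gap = sum(gaps) / len(gaps)
print(f"max gap = {max_gap:.5f}, mean gap = {mean_gap:.5f}")
```

The gap vanishes at both endpoints, where the two paths meet, and is positive in between. The article's Max Div and Mean Div columns summarize this kind of curve, presumably over all eight features rather than one.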
PCA Distance vs Sym-KL Divergence
Image 5 illustrates the discrepancy between PCA and KL divergence. Each point represents an observation station projected onto two PCA components from the 15-dimensional θ feature space. The edges between points encode Sym-KL divergence: thicker edges indicate greater distributional difference. Notably, pairs such as Ginza1 and Shinjuku appear far apart in PCA space despite having one of the smaller KL divergences — a clear demonstration of how the Euclidean and Fisher metrics can produce conflicting orderings.
| Pair | PCA dist | Sym-KL | PCA dist (norm) | Sym-KL (norm) | diff (KL - PCA) |
|---|---|---|---|---|---|
| Kamata vs Shinbashi | 7.272 | 1.275 | 0.899 | 1.000 | 0.101 |
| Shinjuku vs Shinbashi | 7.907 | 0.623 | 1.000 | 0.412 | -0.588 |
| Ginza1 vs Shinbashi | 6.019 | 0.557 | 0.699 | 0.352 | -0.347 |
| Ginza2 vs Shinbashi | 3.914 | 0.505 | 0.364 | 0.306 | -0.059 |
| Shinjuku vs Kamata | 6.168 | 0.439 | 0.723 | 0.246 | -0.477 |
| Ginza2 vs Shinjuku | 6.910 | 0.393 | 0.841 | 0.205 | -0.636 |
| Ginza2 vs Kamata | 3.574 | 0.390 | 0.310 | 0.202 | -0.108 |
| Ginza1 vs Shinjuku | 4.954 | 0.192 | 0.530 | 0.023 | -0.507 |
| Ginza1 vs Ginza2 | 2.811 | 0.176 | 0.189 | 0.009 | -0.180 |
| Ginza1 vs Kamata | 1.624 | 0.166 | 0.000 | 0.000 | 0.000 |
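The normalized columns in the table are consistent with min-max scaling of each raw column to the 0 to 1 range. The sketch below recomputes them from the raw PCA distance and Sym-KL values; the variable names are my own.

```python
# (pair, raw PCA distance, raw Sym-KL), copied from the table above.
rows = [
    ("Kamata vs Shinbashi",   7.272, 1.275),
    ("Shinjuku vs Shinbashi", 7.907, 0.623),
    ("Ginza1 vs Shinbashi",   6.019, 0.557),
    ("Ginza2 vs Shinbashi",   3.914, 0.505),
    ("Shinjuku vs Kamata",    6.168, 0.439),
    ("Ginza2 vs Shinjuku",    6.910, 0.393),
    ("Ginza2 vs Kamata",      3.574, 0.390),
    ("Ginza1 vs Shinjuku",    4.954, 0.192),
    ("Ginza1 vs Ginza2",      2.811, 0.176),
    ("Ginza1 vs Kamata",      1.624, 0.166),
]

def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

pca_norm = min_max([r[1] for r in rows])
kl_norm = min_max([r[2] for r in rows])
diff = [k - p for k, p in zip(kl_norm, pca_norm)]

for (name, _, _), p, k, d in zip(rows, pca_norm, kl_norm, diff):
    print(f"{name:<22} PCA={p:.3f}  KL={k:.3f}  diff={d:+.3f}")
```

A negative diff means the pair looks farther apart in the PCA projection than its Sym-KL rank would suggest.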
Discussion
PCA Distance vs KL Divergence
Although PCA was used for visualization, the actual analysis uses 15-dimensional θ/η vectors. Comparing normalized PCA distance and Sym-KL on a 0–1 scale reveals different orderings — pairs involving Shinjuku show particularly large discrepancies. This arises from a fundamental difference in what each metric measures: PCA reduces to 2 components and computes Euclidean distance; KL divergence applies the Fisher metric across all 15 components. A large KL divergence means the two distributions are statistically distinguishable with fewer samples — it captures how different the distributions are, not just how far apart their parameter vectors sit in Euclidean space.
The manifold itself is relatively flat (maximum geodesic gap = 0.047), meaning e-geodesic and m-geodesic paths are nearly identical. The discrepancy with PCA is therefore not a curvature effect but a metric effect: information geometry preserves the statistical structure of distributions, while PCA discards it. The dual structure further separates generative parameters from observable statistics — a distinction PCA cannot make. Unlike regression approaches that treat residuals as noise, the adjoint functor framework (developed in Article 5) will interpret residuals as structural untranslatability between urban form and pedestrian behavior.
Asymmetry
D(Kamata‖Shinbashi) > D(Shinbashi‖Kamata) means there are behaviors among Kamata pedestrians that rarely occur at Shinbashi. Features that produce asymmetry are those where one station has behavior patterns that simply don't appear at the other. For example, if the stop_ratio asymmetry is large, the stopping-rate distribution of Kamata pedestrians has a broader (or narrower) tail than that of Shinbashi pedestrians. This asymmetry can represent functionally meaningful urban differences, a property that symmetric measures such as Euclidean distance in PCA space cannot capture.
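To see how a single Beta-distributed feature such as stop_ratio can contribute asymmetry, the sketch below evaluates the closed-form Beta KL divergence in both directions. The first parameter pair is Shinjuku's fitted stop_ratio; the second is hypothetical, since the Kamata and Shinbashi parameters are not listed in this article. The `digamma` helper is my own stand-in for a library routine.

```python
import math

def digamma(x):
    """Digamma function for x > 0: recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 6.0:               # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series

def log_beta_fn(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def kl_beta(a1, b1, a2, b2):
    """Closed-form D_KL(Beta(a1, b1) || Beta(a2, b2))."""
    return (log_beta_fn(a2, b2) - log_beta_fn(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))

shinjuku_stop = (0.7074, 7.8578)   # fitted stop_ratio (alpha, beta) from the article
other_stop = (2.0, 6.0)            # hypothetical station with fewer near-zero stop ratios

forward = kl_beta(*shinjuku_stop, *other_stop)
reverse = kl_beta(*other_stop, *shinjuku_stop)
print(f"D(Shinjuku || other) = {forward:.4f}")
print(f"D(other || Shinjuku) = {reverse:.4f}")
print(f"asymmetry            = {forward - reverse:+.4f}")
```

The sign of the asymmetry indicates which station places probability mass where the other has little, which is the per-feature quantity shown in the third row of the asymmetry figure.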
Conclusion
By constructing M_U as a statistical manifold, we can quantify pedestrian behavioral differences between stations with geometric precision. The dual flat structure characterizes the geometry of this distribution space and enables geometrically meaningful comparison of station distributions. This approach moves beyond simple correlation analysis to detect structural differences in how urban environments shape human movement patterns. The dual structure of M_U will also serve as the foundation for decomposing intervention effects in Series 2.
In the next article, I'll compare this pedestrian manifold with a spatial manifold constructed from OSMnx street network features, establishing the adjoint relationship between urban structure and pedestrian behavior.




