DEV Community

Toki Hirose

Urban visible structure mediates the effect of intervention on pedestrian behavior

Introduction

The central question of my research is how the internal structure of a city's atmosphere affects human behavior. In this article, I investigate which geometric properties of a city's road network change that behavior. When I stand among pedestrians holding a protest message, their behavior changes slightly, and the amount of change seems to differ from place to place. For example, pedestrians in Shinbashi, a famously busy area, do not even look at my message, whereas in Kofu, a regional city, they start talking to me.
In the previous articles, I recorded pedestrian video while standing with a placard near the camera, detected pedestrian trajectories with a centroid tracker, extracted features from the trajectories, and fitted statistical distributions to them. I then calculated the KL divergence between observation points and tested whether the differences in KL divergence correlated with differences in road network geometry.
Article 5 closed by identifying several open problems. Since then, additional issues have emerged during the analysis, so I list them all here.

  1. All recordings were taken while I was holding a protest placard next to the camera, so there was no baseline condition.
  2. The dataset covered only five observation points, which was insufficient.
  3. Although a centroid tracker was used initially, it frequently swapped IDs between pedestrians as they crossed paths.
  4. Duration_sec, one of the pedestrian features, tended to be affected by the camera's angle of view and by the city structure.
  5. Using symmetric KL divergence discarded the directional information that asymmetric KL divergence preserves.
  6. The road network radius (420 m) was likely too large to capture the immediate urban context relevant to pedestrians.

To address these issues, I conducted a follow-up experiment on 5 May 2026.

  1. Recordings were divided into two segments: a baseline phase (without intervention) and an intervention phase (in which I stood holding the placard).
  2. Data collection expanded to 15 locations around Yamanote Line and Sobu Line stations. Both baseline and intervention segments were recorded at 12 locations (2 in Shinjuku, 1 in Ikebukuro, 2 in Ueno, 1 in Akihabara, 2 in Kinshicho, 1 in Shinbashi, 1 in Shinagawa, 2 in Shibuya). Only baseline data were collected at 3 additional locations (Ikebukuro, Gotanda, Kinshicho).
  3. The centroid tracker was replaced with ByteTrack. ByteTrack reduced ID swaps; this improvement was confirmed by increased trajectory straightness.
  4. Duration_sec was excluded.
  5. Asymmetric KL divergence was used instead of symmetric KL divergence.
  6. The road network extraction radius was reduced from 420 m to 100 m.

The statistical methodology remained unchanged, but the new experimental design enabled me to measure the difference in pedestrian behavior before and during the intervention at each location. For locations with both baseline and intervention recordings, I could compute coordinates on the manifold for each condition and measure the distance between them using asymmetric KL divergence. This allowed me to investigate how road network geometry influenced the magnitude and direction of behavioral change.

Formalization

| Variable | Meaning | Observable? |
|---|---|---|
| C | Visible urban structure (roads) | Yes (fixed) |
| S | Hidden power structure (latent) | No (indirect only) |
| U | Pedestrian behaviour before intervention | Yes |
| U_I | Pedestrian behaviour during intervention | Yes |
| I | Intervention (standing demonstration) | Yes (controlled) |

This table summarizes the observable variables and the one unobservable variable. In this article, the change in pedestrian behavior between before and during the intervention is calculated as U_I - U, where I denotes the intervention: my standing with a protest message. According to Bratton, the City Layer comprises two aspects: physical structures and virtual structures. The virtual structure (norms, social expectations, and the implicit rules of a space) functions as a hidden power that shapes individual behavior. Pedestrians respond to both the visible urban structure C and this hidden power structure S. The intervention changes the hidden power structure in the City Layer, though it does not alter the physical structure.
The Markov kernel κ(C, S) represents how pedestrian behavior is conditionally generated given the observable structure C and the hidden structure S.
U = κ(C, S)
The intervention I perturbs this Markov kernel. Under intervention, the behavior changes to:
U_I = κ(C, S | I)
U_I and U are observed at the same location. Since the intervention does not alter the physical structure C, taking the difference U_I - U eliminates C, leaving only the effect of the intervention on S.
U_I - U = κ(C, S | I) - κ(C, S) = S_I - S
The hidden power structure of a space cannot be observed directly, but the left-hand side, U_I - U, can. By analyzing how the intervention shifts pedestrian behavior, this article investigates the atmospheric structure S that operates within urban space.

Asymmetric KL divergence

KL divergence is defined as:
D(p||q) = ∫ p(x) log (p(x)/q(x)) dx
This formula shows that KL divergence grows in regions where p has probability mass and q does not. D(q||p) is calculated in the same way and generally yields a different value. Although I used the symmetric KL divergence, (1/2)(D(p‖q) + D(q‖p)), to compare locations in previous articles, I adopt the asymmetric KL divergence in this article.
D(baseline ‖ intervention) measures how much probability the baseline distribution places in regions where the intervention distribution has little. In other words, it measures how strongly existing pedestrian behaviors are suppressed by the intervention.
D(intervention ‖ baseline) is the reverse: it shows how much new pedestrian behavior the intervention generates.
The symmetric KL divergence loses this distinction between suppression and generation of pedestrian behavior.
The asymmetry ratio further clarifies which effect, suppression or generation, dominates at each location.
r = D(baseline||intervention)/D(intervention||baseline)

  • r >> 1: the demonstration primarily suppresses existing behaviour
  • r << 1: the demonstration primarily generates new behaviour
  • r ≈ 1: bidirectional shift — the intervention both removes and creates
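As a minimal sketch of the two directed divergences and their ratio, the snippet below uses a discrete (binned) approximation of D(p||q). This is an assumption for illustration only: the article computes KL divergence between fitted distributions, and the bin probabilities, the `eps` floor, and the function names here are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete D(p || q) = sum_i p_i * log(p_i / q_i).

    Bins where p_i == 0 contribute nothing; q_i is floored at eps
    (an illustrative choice) to avoid division by zero.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total


def asymmetry_ratio(baseline, intervention):
    """Return (D(baseline||intervention), D(intervention||baseline), r)."""
    kl_fwd = kl_divergence(baseline, intervention)
    kl_rev = kl_divergence(intervention, baseline)
    r = kl_fwd / kl_rev if kl_rev > 0.0 else math.inf
    return kl_fwd, kl_rev, r


# Hypothetical binned distributions of one feature at one location.
baseline = [0.25, 0.25, 0.25, 0.25]
intervention = [0.05, 0.15, 0.30, 0.50]

kl_fwd, kl_rev, r = asymmetry_ratio(baseline, intervention)
print(f"KL_fwd={kl_fwd:.4f}  KL_rev={kl_rev:.4f}  r={r:.3f}")
```

In this toy case the baseline keeps mass in bins that the intervention nearly empties, so the forward divergence is the larger of the two and r exceeds 1.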

Data Acquisition

Pedestrian behavior data

At each location, I recorded the timestamp at which the standing demonstration began. This allowed me to divide each video into two segments: baseline (before the intervention) and intervention (during the demonstration). Each segment was treated as a point on the manifold M_U. Baseline points are labeled {observation_point}_baseline and intervention points {observation_point}_intervention. Locations with only baseline or only intervention data were plotted on M_U but excluded from the analysis of Markov kernel change.

Urban structure data

At each location, I extracted road network data within a 100 m radius using OSMnx.

Excluding outliers

Since the intervention consisted of my standing among pedestrians with a protest message, the videos also captured my own body, standing still or walking around near the center of the frame. While standing, my body appeared as a single trajectory with a high stop ratio and a long duration. Therefore, I filtered out trajectories exceeding an upper fence computed from the baseline segment's distribution.
upper fence = Q3_baseline + 1.5 × IQR_baseline
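The fence rule can be sketched as follows. This is an illustration under assumptions: the text does not say which feature the fence is applied to, so the example uses hypothetical trajectory durations, and the quartile method (`inclusive`) is also a choice made here rather than the article's.

```python
import statistics


def upper_fence(baseline_values):
    """Tukey upper fence: Q3 + 1.5 * IQR, computed on baseline data only."""
    q1, _, q3 = statistics.quantiles(baseline_values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def drop_outliers(values, fence):
    """Keep only trajectories whose value does not exceed the fence."""
    return [v for v in values if v <= fence]


# Hypothetical per-trajectory durations in seconds.
baseline_durations = [1, 2, 3, 4, 5, 6, 7, 8, 9]
intervention_durations = [2, 10, 13, 40]  # 40 s could be the demonstrator

fence = upper_fence(baseline_durations)
kept = drop_outliers(intervention_durations, fence)
print(fence, kept)
```

Because the fence comes from the baseline segment alone, the demonstrator's own long, stationary trajectory in the intervention segment cannot inflate the threshold.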

Feature extraction

Pedestrian features (M_U)

I recorded pedestrian trajectories using a GoPro Hero 8 mounted on a tripod. ByteTrack extracted each pedestrian's pixel coordinates, temporal intervals, and bounding box heights across consecutive frames. To convert pixel distances to real-world meters, I divided each frame's bounding box height by 1.7 m (an assumed average human height) to obtain a scaling factor in pixels per meter.
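The conversion can be sketched like this. It is a simplified illustration, not the article's pipeline: the function names are hypothetical, and using the bounding box height of the current frame for each step is an assumption.

```python
import math

ASSUMED_HEIGHT_M = 1.7  # average human height used as the reference length


def pixels_per_meter(bbox_height_px):
    """Scaling factor: how many pixels correspond to one meter at this depth."""
    return bbox_height_px / ASSUMED_HEIGHT_M


def step_speed_mps(p0_px, p1_px, bbox_height_px, dt_sec):
    """Real-world speed of one step between two pixel positions."""
    dist_px = math.hypot(p1_px[0] - p0_px[0], p1_px[1] - p0_px[1])
    dist_m = dist_px / pixels_per_meter(bbox_height_px)
    return dist_m / dt_sec


# A pedestrian whose box is 170 px tall moves 50 px in 0.5 s.
speed = step_speed_mps((100, 200), (130, 240), bbox_height_px=170, dt_sec=0.5)
print(round(speed, 3))
```

Scaling per frame means a pedestrian far from the camera (small box) and one close to it (large box) get comparable speeds for the same real displacement.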

| Feature | Meaning | Fitted distribution |
|---|---|---|
| real_speed_mean | Mean walking speed [m/s] | Gamma |
| real_speed_cv | Coefficient of variation of speed (std / mean) | Log-normal |
| real_accel_abs_mean | Mean of absolute acceleration [m/s²] | Half-normal |
| stop_ratio | Fraction of steps where real-world speed falls below 0.3 m/s | Beta |
| real_straightness | Ratio of the depth-normalized displacement from start to end point to the total real path length (range: 0–1) | Beta |
| speed_skew | Skewness of the per-step speed distribution | Gamma |
| decel_ratio | Fraction of acceleration steps where the value is negative | Beta |
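To make the feature definitions concrete, here is a small sketch that computes several of them for one trajectory. It assumes positions have already been converted to meters and sampled at a fixed time step; the function name, the use of the population standard deviation, and the sample trajectory are illustrative assumptions.

```python
import math
import statistics

STOP_SPEED_MPS = 0.3  # threshold from the stop_ratio definition


def trajectory_features(points_m, dt_sec):
    """Compute per-trajectory features from at least three (x, y) points in meters."""
    steps = [math.dist(a, b) for a, b in zip(points_m, points_m[1:])]
    speeds = [d / dt_sec for d in steps]
    accels = [(v1 - v0) / dt_sec for v0, v1 in zip(speeds, speeds[1:])]
    mean_speed = statistics.fmean(speeds)
    path_len = sum(steps)
    return {
        "real_speed_mean": mean_speed,
        "real_speed_cv": statistics.pstdev(speeds) / mean_speed if mean_speed > 0 else 0.0,
        "real_accel_abs_mean": statistics.fmean(abs(a) for a in accels),
        "stop_ratio": sum(v < STOP_SPEED_MPS for v in speeds) / len(speeds),
        "real_straightness": math.dist(points_m[0], points_m[-1]) / path_len if path_len > 0 else 0.0,
        "decel_ratio": sum(a < 0 for a in accels) / len(accels),
    }


# Hypothetical trajectory: walks, pauses for one step, walks again.
feats = trajectory_features([(0, 0), (1, 0), (2, 0), (2, 0), (3, 0)], dt_sec=1.0)
for name, value in feats.items():
    print(f"{name}: {value:.4f}")
```

Each trajectory yields one value per feature, and the per-location distribution of those values is what the distributions in the table are fitted to.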

Road network features (M_C)

At each location, I retrieved the road network within a 100 m radius of the observation point with OSMnx. Each network edge was treated as one observation to calculate a distribution of network geometry. I explained this methodology in detail in Article 5.

| Feature | Meaning | Distribution |
|---|---|---|
| edge_length | Distribution of street segment lengths | Log-normal |
| circuity_mapped | Directness of each segment | Beta |
| node degree | Intersection complexity | Poisson |
| betweenness_centrality | Node centrality in network flow | Poisson |
| bearing | Compass direction of each segment | von Mises |
| edge_density_km | Street km per km² | scalar |
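As a rough sketch of how two of these features could be derived from a list of edge lengths, the snippet below fits a log-normal by maximum likelihood and computes an edge density. It makes assumptions not stated in the article: the edge lengths are hypothetical values in meters, and the area is taken to be the full circle of the extraction radius.

```python
import math
import statistics


def lognormal_mle(lengths_m):
    """Maximum-likelihood (mu, sigma) of a log-normal: moments of log-lengths."""
    logs = [math.log(x) for x in lengths_m]
    return statistics.fmean(logs), statistics.pstdev(logs)


def edge_density_km(lengths_m, radius_m):
    """Total edge length in km divided by the circular area in km^2."""
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    return (sum(lengths_m) / 1000.0) / area_km2


# Hypothetical street segment lengths within a 100 m radius.
edge_lengths = [12.0, 35.5, 48.0, 20.3, 77.1, 15.8]

mu, sigma = lognormal_mle(edge_lengths)
density = edge_density_km(edge_lengths, radius_m=100.0)
print(f"mu={mu:.3f}  sigma={sigma:.3f}  edge_density_km={density:.2f}")
```

With a 100 m radius the area is only about 0.031 km², so a handful of short segments already produces a density of several km per km².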

Information geometry

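Statistical manifolds of exponential families have two dual coordinate systems: natural parameters and expectation parameters. Natural parameters are the coefficients of the sufficient statistics in the density, while expectation parameters are the expected values of those statistics, which encode moments such as the mean and variance.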
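As an illustration of the two coordinate systems, consider the log-normal family, one of the distributions fitted in this article. With sufficient statistics (log x, (log x)²), the natural parameters are θ = (μ/σ², -1/(2σ²)) and the expectation parameters are η = (μ, μ² + σ²). The sketch below converts between them; the function names are hypothetical, and the article's own parameterization may differ.

```python
def natural_params(mu, sigma):
    """Natural parameters theta of a log-normal with log-mean mu, log-std sigma."""
    var = sigma ** 2
    return (mu / var, -1.0 / (2.0 * var))


def expectation_params(mu, sigma):
    """Expectation parameters eta = (E[log x], E[(log x)^2])."""
    return (mu, mu ** 2 + sigma ** 2)


def from_natural(theta):
    """Recover (mu, sigma) from the natural parameters."""
    theta1, theta2 = theta
    var = -1.0 / (2.0 * theta2)
    return (theta1 * var, var ** 0.5)


theta = natural_params(0.5, 2.0)
eta = expectation_params(0.5, 2.0)
print(theta, eta, from_natural(theta))
```

Both coordinate systems describe the same point on the manifold; the shift direction on M_U is measured in the θ-coordinates.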

Analysis

Sensitivity is location-specific, not feature-specific

Taking the difference between before and during the intervention cancels out the visible structure of urban space.
Δ = U_I - U = S_I - S
Figure: asymmetric KL divergence and asymmetry ratio for each observation location
This figure shows the KL divergence for each location and feature. The top panel shows D(baseline ‖ intervention), the middle panel shows D(intervention ‖ baseline), and the bottom panel shows their ratio. Within each direction, the values are similar across features, but they differ substantially across locations. KL divergences are especially large at crosswalk locations, where pedestrians can only move directly across the camera's field of view. At square locations, by contrast, the KL divergences were lower than at crosswalks. The asymmetry ratio was larger than one at some observation points: Shinjuku EastGateSquare, Akihabara Crosswalk_StationFront, and Shinbashi. These three locations share no obvious geographic or morphological characteristic.

| Feature | bottom 1 | bottom 2 | top 2 | top 1 |
|---|---|---|---|---|
| Speed mean [m/s] | Ikebukuro_SF_CW (-0.1511) | Shibuya_CW_MiyashitaPark (-0.0779) | Kamata (+0.1322) | Ueno_Shinobazu_StationExit (+0.1813) |
| Speed CV | Kamata (-0.0473) | Shinjuku_EGS (-0.0180) | Shibuya_CW_MiyashitaPark (+0.0534) | Kinshicho_CW2 (+0.0548) |
| Accel abs mean [m/s²] | Ikebukuro_SF_CW (-3.4607) | Uenopark_CW (-1.5465) | Kinshicho_CW2 (+2.4712) | Ueno_Shinobazu_StationExit (+10.5036) |
| Stop ratio | Kamata (-0.0145) | Shinjuku_EGS (-0.0118) | Kinshicho_CW2 (+0.0124) | Shibuya_CW_MiyashitaPark (+0.0185) |
| Straightness | Kinshicho_CW2 (-0.0772) | Shibuya_CW_MiyashitaPark (-0.0766) | Shinbashi (+0.0236) | Kamata (+0.0692) |
| Speed skew | Kamata (-0.2122) | Akihabara_CW_SF (-0.1259) | Ueno_Shinobazu_StationExit (+0.0876) | Kinshicho_CW2 (+0.0926) |
| Decel ratio | Kinshicho_CW2 (-0.0104) | Ikebukuro_SF_CW (-0.0064) | Ueno_Shinobazu_StationExit (+0.0077) | Kamata (+0.0091) |

This table lists, for each feature, the two locations with the largest negative shift and the two with the largest positive shift between baseline and intervention. Changes vary widely across features and locations, but some locations appear for multiple features. Ueno_Shinobazu_StationExit showed large positive shifts, whereas Ikebukuro_SF_CW and Shinjuku_EastGate_Station showed large negative shifts. Kamata, Shibuya_Crosswalk_MiyashitaPark, and Kinshicho_Crosswalk2 showed large shifts in both directions. These results suggest that sensitivity to the intervention is tied to location rather than to any particular feature: each location exhibits an intrinsic sensitivity to intervention that is independent of which feature is measured.

Complex road networks make urban space robust

To understand the Markov kernel κ(C, S), I investigated whether the visible structure M_C constrains how the hidden structure S responds to intervention.
Q1 (Magnitude): Does the size of the KL shift (KL_fwd, KL_rev, KL_total, asym_ratio) correlate with the M_C θ-coordinates (natural parameters) of each observation?

Q2 (Direction): Does the direction of the shift on M_U correlate with M_C?
Direction was defined as the difference vector in natural parameters from the baseline point to the intervention point.
Δθ = θ_intervention - θ_baseline
The unit vector u_i = Δθ_i / |Δθ_i| gives the perturbation direction on M_U.
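The direction vector can be sketched as below. The θ values are hypothetical and the vectors are plain Python lists, with one entry per natural-parameter coordinate.

```python
import math


def shift_direction(theta_baseline, theta_intervention):
    """Unit vector of the shift from baseline to intervention in theta-coordinates."""
    delta = [b - a for a, b in zip(theta_baseline, theta_intervention)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm == 0.0:
        raise ValueError("baseline and intervention coincide; direction undefined")
    return [d / norm for d in delta]


# Hypothetical natural parameters for one location.
u = shift_direction([1.0, 2.0], [4.0, 6.0])
print(u)
```

Normalizing removes the magnitude of the shift, so Q2 compares only where on M_U the behavior moves, while Q1 handles how far it moves.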

Figure: natural parameters of M_C versus KL divergence change indices
These scatter plots show the principal components of the natural parameters on M_C against the KL divergence in each direction, the asymmetry ratio, and the principal components of the shift direction on M_U. Two relationships were statistically significant (p ≤ 0.05): MC_PC2 with the asymmetry ratio, and MC_PC2 with the direction of behavioral shift on M_U.

Figure: weight heatmap of M_C's principal components
This figure shows the weights of M_C's principal components. Component 1 loads mainly on betweenness centrality and node degree, so it represents intermediation within the network. Component 2 loads heavily on node degree, edge_length, and scalar_edge_density_km, so it represents network complexity.

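The positive correlation between MC_PC2 and asym_ratio (ρ = +0.589, p = 0.021) indicates that the asymmetry ratio rises with network complexity, i.e. D(baseline ‖ intervention) becomes large relative to D(intervention ‖ baseline). I interpret this as complex road networks suppressing the intervention effect. The negative correlation between MC_PC2 and dir_MU_PC1 (ρ = -0.518, p = 0.048) shows that in complex networks the intervention shifts pedestrian behavior toward constant speed and straighter paths.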

For each feature, the table below shows the Spearman correlation between M_C's principal component 2 and D(baseline ‖ intervention). Only decel_ratio has a positive correlation; the others are negative or close to zero. With only 15 samples, none of them reaches statistical significance. As network complexity increases, D(baseline ‖ intervention) increases, while deceleration behavior decreases. This pattern suggests that complex networks constrain behavioral diversity.

| Feature | spearman_rho | p_value | n |
|---|---|---|---|
| decel_ratio | 0.346429 | 0.205896 | 15 |
| stop_ratio | -0.010714 | 0.969770 | 15 |
| real_straightness | -0.017857 | 0.949635 | 15 |
| real_accel_abs_mean | -0.057143 | 0.839700 | 15 |
| real_speed_cv | -0.110714 | 0.694463 | 15 |
| real_speed_mean | -0.271429 | 0.327789 | 15 |
| speed_skew | -0.300000 | 0.277317 | 15 |
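For readers unfamiliar with the statistic, Spearman's ρ is the Pearson correlation of the ranks of two variables. The sketch below is a plain-Python version with average ranks for ties; it does not compute p-values, and the input values are hypothetical rather than the data behind the table.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks (undefined if either input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical PC2 scores and KL values for five locations.
pc2 = [1.0, 2.0, 3.0, 4.0, 5.0]
kl_fwd = [0.2, 0.1, 0.4, 0.3, 0.5]
rho = spearman_rho(pc2, kl_fwd)
print(round(rho, 3))
```

Because only ranks matter, a single location with an extreme KL value cannot dominate the coefficient, which is useful with as few as 15 observations.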

Interpretation

Summary of analysis

  1. Intervention effects emerge on simple roads with high pedestrian flow, but are muted in squares where pedestrians naturally linger.
  2. Complex road networks suppress intervention effects, shifting behavior toward homogeneity and straightness.

The sensitivity of the invisible power structure S to intervention depends on network complexity

Complex road networks suppress the perturbation and make pedestrian behavior more homogeneous. This means that M_C determines the sensitivity to intervention. By contrast, simple networks such as crosswalks show larger changes under intervention.
In such simple networks, the constraints imposed by M_C are weak, and the intervention can perturb pedestrians more freely through the hidden urban structure S. The sensitivity of κ(C, ·) depends on network complexity: the more complex the road network, the harder it is for the intervention to perturb the kernel.
I initially formulated κ(C, S) assuming that C and S were independent. In this analysis, however, complex visible structure constrains the freedom of the hidden structure to perturb behavior. This reflects Bratton's description of the City Layer as hardware and software in constant interaction: when the City Layer's hardware (complex street networks) is intricate, the software (the hidden power S) has less freedom to modify the Markov kernel.

Answering the initial question: what structure does urban atmosphere have?

To summarize: urban atmosphere emerges from the interplay of visible (C) and invisible (S) structures. The complexity of the road network determines how freely the hidden structure can respond to intervention. Why does a complex network, such as a square, make pedestrian behavior hard to change? The answer lies in freedom of movement. In a complex road network, pedestrians have multiple routes to their destination. They can recognize the demonstration and navigate around it. On simple roads, pedestrians have no alternative; their attention is captured when they walk past the demonstration.

References

Bratton, B. H. (2016). The Stack: On Software and Sovereignty. MIT Press.
Fritz, T., et al. (2023). Representable Markov Categories and Comparison of Statistical Experiments in Categorical Probability. https://strathprints.strath.ac.uk/85547/1/Fritz_etal_TCS_2023_Representable_Markov_categories_and_comparison_of_statistical_experiments.pdf
https://www.authoritarian-stack.info
Boeing, G. (2025). Modeling and Analyzing Urban Networks and Amenities with OSMnx. Geographical Analysis 57 (4), 567-577. doi:10.1111/gean.70009
