How We Built the Case
Transparent documentation of our data science approach: we analyzed 3.7 million violation records to make the case that targeted enforcement can protect student study time.
Data Sources & Scale
Our analysis combines multiple authoritative datasets to build a comprehensive picture of bus enforcement patterns and their impact on student commutes.
MTA Bus Automated Camera Enforcement Violations
Complete violation records including location, time, route, and enforcement status
MTA Bus Speed Data
Average speeds by route and time period for before/after analysis
CUNY Campus Locations
Coordinates and enrollment data for proximity analysis
Bus Route Geometry
GTFS route shapes and stop locations for spatial analysis
Data Quality & Validation
Analysis Pipeline
Step-by-step breakdown of our methodology, from raw data to actionable insights.
Data Integration
Combined violation records with bus stop locations using spatial joins
Temporal Analysis
Identified peak violation patterns by hour, day, and academic calendar
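As a rough illustration of this step, the sketch below buckets violations by weekday and hour and picks the busiest bucket. The timestamps are made-up sample values and the function name `violations_by_weekday_and_hour` is hypothetical; the actual pipeline may aggregate differently, for example by adding an academic-calendar flag to the bucket key.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps (ISO 8601), not real violation records.
timestamps = [
    "2024-09-03T08:15:00",
    "2024-09-03T08:40:00",
    "2024-09-03T17:05:00",
    "2024-09-07T08:20:00",
]

def violations_by_weekday_and_hour(stamps):
    """Count violations per (weekday, hour) bucket.

    Weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    counts = Counter()
    for stamp in stamps:
        dt = datetime.fromisoformat(stamp)
        counts[(dt.weekday(), dt.hour)] += 1
    return counts

counts = violations_by_weekday_and_hour(timestamps)
# most_common(1) returns the single bucket with the highest count.
peak_bucket, peak_count = counts.most_common(1)[0]
print(peak_bucket, peak_count)
```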
CUNY Proximity Mapping
Calculated distances from violations to CUNY campuses using buffer analysis
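A minimal sketch of how a distance-and-buffer check of this kind could work, using the Haversine formula and the 50 m, 100 m, and 500 m radii listed under Spatial Analysis. The coordinates are illustrative placeholders rather than real campus or violation locations, and `haversine_m` and `smallest_buffer` are names chosen for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def smallest_buffer(distance_m, radii=(50, 100, 500)):
    """Return the smallest buffer radius containing the point, or None."""
    for radius in sorted(radii):
        if distance_m <= radius:
            return radius
    return None

# Placeholder coordinates for illustration only.
campus = (40.7000, -73.9000)
violation = (40.7000, -73.8990)

distance = haversine_m(*campus, *violation)
print(round(distance, 1), smallest_buffer(distance))
```

For millions of records, a spatial index or a projected coordinate system would likely be faster than pairwise distance calls; the loop-free function here only shows the geometry.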
Enforcement Paradox Calculation
Developed metric combining violation intensity with speed improvement
Impact Quantification
Estimated time lost and academic disruption for student populations
Statistical Methods
Significance Testing
- Chi-square tests for categorical associations
- Mann-Whitney U tests for non-parametric comparisons
- Bonferroni correction for multiple comparisons
- Bootstrap confidence intervals (10,000 resamples)
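Two of the methods above can be sketched with the standard library alone. The example assumes a percentile bootstrap (the specific bootstrap variant is not stated here) and uses invented speed values; `bootstrap_ci` and `bonferroni` are hypothetical helper names, not functions from the project code.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_resamples=10_000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    n = len(sample)
    # Resample with replacement, recompute the statistic, then sort.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

def bonferroni(p_values, alpha=0.05):
    """Flag which p-values stay significant after Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

# Invented bus speeds (mph) for illustration only.
speeds = [8.1, 7.4, 9.0, 6.8, 7.9, 8.5, 7.2, 8.8]
lower, upper = bootstrap_ci(speeds)
print(lower, upper)

# Three hypothetical tests: only p-values at or below 0.05 / 3 survive.
significant = bonferroni([0.001, 0.02, 0.04])
print(significant)
```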
Spatial Analysis
- Haversine distance calculations
- K-means clustering for hotspot identification
- Moran's I for spatial autocorrelation
- Buffer analysis with 50 m, 100 m, and 500 m tolerances
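To make the spatial autocorrelation check concrete, here is a small sketch of global Moran's I computed from a list of values and a spatial weights matrix. The four-location chain and the violation counts are invented for illustration, and the choice of binary adjacency weights is an assumption; the project may use distance-based weights instead.

```python
def morans_i(values, weights):
    """Global Moran's I for values with a square spatial weights matrix.

    I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / w_sum) * (cross / variance_sum)

# Four hypothetical locations along one street; neighbours are adjacent stops.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [10, 9, 2, 1]     # high counts sit next to high counts
alternating = [10, 1, 10, 1]  # high counts sit next to low counts

print(morans_i(clustered, chain_weights))
print(morans_i(alternating, chain_weights))
```

Positive values suggest that similar violation counts cluster together, while negative values suggest a checkerboard pattern; values near zero are consistent with no spatial structure.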
Reproducible Research
All analysis code and documentation are available for verification and replication.
Limitations & Future Directions
Transparent acknowledgment of constraints and opportunities for expansion.
Current Limitations
- Weather and seasonal variations not fully modeled
- Individual student commute patterns estimated from aggregate data
- Long-term academic impact requires longitudinal study
- Cost-benefit analysis based on estimated enforcement costs
Future Enhancements
- Real-time violation prediction using machine learning
- Student survey integration for commute pattern validation
- Pilot program impact measurement framework
- Expansion to other educational institution networks
Have suggestions for improving our methodology? We welcome feedback from researchers, policymakers, and fellow students.