The "Bad Apples" Problem
Can We Identify Them?
Explore how data density bias affects our understanding of complaint concentration, and simulate the impact of early warning system policies.
Simulation Parameters
The Data Density Illusion
When complaints are sparse relative to officers, even random assignment creates apparent concentration. The "top 2% account for X% of complaints" statistic is misleading unless it is compared with the share the top 2% would hold under random assignment (the null distribution).
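The effect can be illustrated with a minimal sketch that assigns complaints to officers uniformly at random and measures the share held by the top 2%. This is not the page's own simulation code; the function names, the officer and complaint counts, and the choice of a uniform draw per complaint are all assumptions made for illustration.

```python
import random
from collections import Counter


def top_share(counts, top_frac=0.02):
    """Share of all complaints held by the top `top_frac` of officers."""
    n_top = max(1, round(len(counts) * top_frac))
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(sorted(counts, reverse=True)[:n_top]) / total


def simulate_random_counts(n_officers, n_complaints, rng):
    """Assign each complaint to an officer chosen uniformly at random."""
    tally = Counter(rng.randrange(n_officers) for _ in range(n_complaints))
    return [tally.get(i, 0) for i in range(n_officers)]


def null_top_share(n_officers, n_complaints, top_frac=0.02, n_sims=200, seed=0):
    """Average top-share over many random assignments (the null baseline)."""
    rng = random.Random(seed)
    shares = [
        top_share(simulate_random_counts(n_officers, n_complaints, rng), top_frac)
        for _ in range(n_sims)
    ]
    return sum(shares) / len(shares)


if __name__ == "__main__":
    # Hypothetical sizes: sparse data (0.5 complaints per officer on average).
    sparse = null_top_share(n_officers=1000, n_complaints=500)
    # Under a perfectly even spread the top 2% would hold exactly 2%.
    print(f"Top 2% share under random assignment: {sparse:.1%}")
```

Raising `n_complaints` while holding `n_officers` fixed should pull the random-assignment share back toward 2%, which is the sense in which the illusion depends on data density.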
Complaint Concentration Curves
The gap between the red line (actual) and the gray dashed line (random) represents true concentration. The gap between the gray dashed line and the black line (uniform) represents data density bias: concentration that appears even under pure randomness.
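One way such curves could be computed is sketched below, assuming each curve plots the cumulative share of complaints against the cumulative share of officers, with officers ranked from most to fewest complaints. The ranking direction and the helper names `concentration_curve` and `max_gap` are assumptions; the page does not say how its curves are built.

```python
def concentration_curve(counts):
    """Return (officer_fraction, complaint_fraction) points.

    Officers are sorted from most to fewest complaints, so a perfectly
    uniform distribution traces the diagonal.
    """
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    n = len(ordered)
    if n == 0 or total == 0:
        return []
    curve = []
    running = 0
    for rank, count in enumerate(ordered, start=1):
        running += count
        curve.append((rank / n, running / total))
    return curve


def max_gap(curve_a, curve_b):
    """Largest vertical distance by which curve_a sits above curve_b."""
    return max(a[1] - b[1] for a, b in zip(curve_a, curve_b))


if __name__ == "__main__":
    # Toy data: four officers, four complaints.
    actual = concentration_curve([0, 3, 0, 1])
    uniform = concentration_curve([1, 1, 1, 1])
    print(actual)
    print(max_gap(actual, uniform))
```

Applying `max_gap` to the actual and random curves, and then to the random and uniform curves, gives a rough numeric counterpart to the two gaps described above.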
Understanding the Math
The naive calculation suggests that the top 2% of officers are 8.8x more likely to generate complaints. However, even under purely random assignment, the top 2% would appear 3.8x more likely, simply because of statistical clustering.
After correcting for data density bias, the top 2% are actually only 2.3x more likely to generate complaints (roughly the naive 8.8x divided by the 3.8x random baseline). That is still elevated, but far less dramatic than the naive estimate.
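The arithmetic can be checked with a short sketch. It assumes that "Nx more likely" means the top group's share of complaints divided by its share of officers, and that the correction is a simple ratio of the naive multiple to the random-assignment multiple. Both are interpretations consistent with the figures above rather than formulas given in the text, and the function names are hypothetical.

```python
def concentration_multiple(top_share, top_frac=0.02):
    """How many times its 'fair share' of complaints the top group holds.

    Assumption: the multiple is the complaint share divided by the officer share.
    """
    return top_share / top_frac


def corrected_multiple(naive_multiple, null_multiple):
    """Naive multiple rescaled by the multiple expected under random assignment."""
    return naive_multiple / null_multiple


if __name__ == "__main__":
    naive = 8.8  # figure quoted in the text
    null = 3.8   # figure quoted in the text
    print(round(corrected_multiple(naive, null), 1))  # 2.3
```

Under this reading, an 8.8x multiple for the top 2% would correspond to that group holding about 17.6% of complaints, against about 7.6% under random assignment.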