Stop and Frisk: Spatial Analysis of Racial Differences

by AO · June 24, 2015

This article is originally published at https://stablemarkets.wordpress.com

Stops in 2014. Red lines indicate high white stop density areas and blue shades indicate high black stop density areas.
Notice that high white stop density areas are very different from high black stop density areas.
The star in Brooklyn marks the location of officers Liu’s and Ramos’ deaths. The star on Staten Island marks the location of Eric Garner’s death.

In my last post, I compiled and cleaned publicly available data on over 4.5 million stops over the past 11 years.

I also presented preliminary summary statistics showing that blacks had been consistently stopped 3-6 times more than whites over the last decade in NYC.

Since the last post, I managed to clean and reformat the coordinates marking the location of the stops. Although I compiled data for 2003-2014, coordinates were available only for 2004 and 2007-2014. All the code can be found in my GitHub repository.

My goals were to:

1. See whether blacks and whites were being stopped at the same locations.
2. Identify areas with especially high numbers of stops and see how these areas changed over time.

Killing two birds with one stone, I made density plots to identify areas with high and low stop densities. Snapshots were taken at two-year intervals from 2007 to 2013. Stops of whites are indicated by red contour lines and stops of blacks by blue shades.
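For readers who want to see what a density surface like this involves, here is a minimal sketch of a 2D Gaussian kernel density estimate evaluated on a grid. It is written in plain Python for illustration and is not the code from the GitHub repository. The function names, the bandwidth, and the toy coordinates are all assumptions made up for this example.

```python
import math

def gaussian_kde_grid(points, xs, ys, bandwidth):
    """Evaluate a 2D Gaussian kernel density estimate at every (x, y) grid point.

    points: list of (x, y) stop coordinates.
    xs, ys: grid coordinates along each axis.
    Returns a list of rows indexed as grid[iy][ix].
    """
    if not points:
        raise ValueError("need at least one point")
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    two_h2 = 2.0 * bandwidth ** 2
    grid = []
    for gy in ys:
        row = []
        for gx in xs:
            total = 0.0
            for px, py in points:
                total += math.exp(-((px - gx) ** 2 + (py - gy) ** 2) / two_h2)
            row.append(norm * total)
        grid.append(row)
    return grid

def peak_cell(grid, xs, ys):
    """Return the (x, y) grid point with the highest estimated density."""
    _, iy, ix = max(
        (value, iy, ix)
        for iy, row in enumerate(grid)
        for ix, value in enumerate(row)
    )
    return xs[ix], ys[iy]

# Hypothetical toy coordinates in arbitrary units, NOT real stop locations.
stops_group_a = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2)]
stops_group_b = [(4.0, 4.0), (4.1, 3.8), (3.9, 4.2), (4.2, 4.1)]

axis = [i * 0.5 for i in range(11)]  # grid points from 0.0 to 5.0
density_a = gaussian_kde_grid(stops_group_a, axis, axis, bandwidth=0.5)
density_b = gaussian_kde_grid(stops_group_b, axis, axis, bandwidth=0.5)

print(peak_cell(density_a, axis, axis))
print(peak_cell(density_b, axis, axis))
```

Contour lines for one group and filled shades for the other can then be drawn from the two grids with any plotting library. The bandwidth controls how smooth the surface is, so the apparent size of a high-density area depends on that choice.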

There are two things to note:

1. The snapshots indicate that, in those years, blacks and whites were stopped at very different locations. Whites were stopped predominantly in Staten Island, Brooklyn, and Manhattan, while blacks were stopped predominantly around the Brooklyn/Queens border and the Manhattan/Bronx border. There is very little overlap between the high white and high black stop density areas, and these spatial discrepancies are consistent over the period shown.
2. The high-density areas are getting larger over time as the total number of stops declines (indicated by the range of the map legends).
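The "very little overlap" observation is visual, but it could also be put into a number. One possible approach, sketched below under my own assumptions, is to mark the top fraction of grid cells in each density surface as "high density" and compute the Jaccard overlap of the two sets. The 25% cutoff, the function names, and the toy grids are hypothetical and do not come from the original analysis.

```python
def high_density_cells(grid, top_fraction):
    """Return the set of (row, col) cells whose density falls in the top fraction."""
    cells = [
        (value, r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
    ]
    cells.sort(reverse=True)
    keep = max(1, round(top_fraction * len(cells)))
    return {(r, c) for _, r, c in cells[:keep]}

def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical 4x4 density grids, NOT derived from the stop data.
density_group_a = [
    [9, 8, 1, 0],
    [7, 6, 1, 0],
    [1, 1, 2, 1],
    [0, 0, 1, 1],
]
density_group_b = [
    [0, 0, 1, 1],
    [0, 8, 1, 1],
    [1, 1, 9, 7],
    [0, 1, 6, 10],
]

hot_a = high_density_cells(density_group_a, top_fraction=0.25)
hot_b = high_density_cells(density_group_b, top_fraction=0.25)
overlap = jaccard(hot_a, hot_b)
print(sorted(hot_a & hot_b), round(overlap, 3))
```

A value near 0 means the two groups' hot spots rarely coincide, and a value near 1 means they are almost the same cells. Computing this per year would also give a rough check on whether the discrepancy is stable over time.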

Here is the map of stops in 2014, the last year for which I have data:

In 2014, we see more concentrated stops of blacks along the coast of Staten Island. In fact, Eric Garner died in precisely one of these high-density areas. The location of his death is marked with a star.

Officers Liu and Ramos also died in a high black stop density area (location marked with the star in Brooklyn).

Importance. It’s easy to see the importance of such spatial analyses. They add several layers of information on top of the basic summary statistics I presented in my previous post. As the maps above show, terrible events can happen in high-density areas.

Simultaneity. Let’s say we overlay this stop and frisk data with perfectly measured crime data (the potential mismeasurement of “crime” is discussed below) and find that high black stop density areas actually have low crime density. We cannot necessarily conclude that the NYPD is engaging in a racist expansion of stops in black areas despite low crime rates. What if crime rates are low because of the high number of stops? With the current data, it’s hard to say which way the causality runs.

Unobserved Factors. Simultaneity aside, we also have unobserved factors to contend with. Are the spatial discrepancies visualized above due to racist police segmenting geographically to efficiently target blacks? Or are the spatial discrepancies simply due to the fact that blacks and whites, in general, live and/or hang out in very different places? Without additional data, it’s hard to say.

Difficulty Establishing Simple Claims. Even the relatively simple claim of “blacks commit crimes at higher than average rates” is difficult to establish. When most people speak of “crime rates”, they are actually referring to arrest rates. We usually don’t observe crimes because criminals aren’t generally upfront people who self-report their crimes. So, we use police arrest data as a proxy for crime. However, if we think that police are inherently racist, then the arrest data they record would also be biased upward. Arrest rates could be much higher than crime rates. My point is that even establishing simple claims requires great care (both in how we phrase the claims and how we attempt to answer them) and is often difficult.
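To make the proxy problem concrete, here is a small worked example under a deliberately simple model that I am assuming for illustration: the observed arrest rate equals the true offense rate times the probability that an offense leads to an arrest. All numbers and group labels below are hypothetical and are not estimates of anything.

```python
def observed_arrest_rate(offense_rate, arrest_prob_given_offense):
    """Arrests per person under a simplified model where every arrest follows a real offense."""
    return offense_rate * arrest_prob_given_offense

# Hypothetical numbers chosen only for illustration.
offense_rate = 0.05  # identical true offense rate in both groups
arrest_prob = {"group_1": 0.20, "group_2": 0.40}  # group_2 is policed twice as heavily

rates = {
    group: observed_arrest_rate(offense_rate, prob)
    for group, prob in arrest_prob.items()
}
arrest_ratio = rates["group_2"] / rates["group_1"]
print(rates, round(arrest_ratio, 2))
```

In this toy setup both groups offend at the same rate, yet the arrest data alone would suggest a twofold difference. The gap comes entirely from the difference in policing intensity, which is exactly the quantity we cannot observe from arrest records.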

Racism. As I said above, issues such as simultaneity and unobserved factors make it very difficult to establish even simple relationships or claims. It is even harder to establish the inherent racism of an entire group of people, or the inherent criminality of an entire group of people. Much more information is needed.

I hope that making this data available and clean for public use will help researchers address some of these difficulties. Again, all of my code and datasets are available on GitHub. My hope is that other people will combine this data with their own data to reach more impactful conclusions. As always, please cite when sharing.
