Introduction

As it is currently winter, the goal of this analysis is to identify which areas of Boston register the most Boston 311 complaints about snow and which register the fewest. Snow can be quite disruptive in urban areas because it interrupts transit systems and general movement around the city, and Boston 311 calls provide point data showing where the most disruption and inconvenience occur. The analysis uses two variables from 2017: requests for snow plowing and miscellaneous snow complaints. The hypothesis is that the highest concentration of points will be located in the northern part of the city, with lower concentrations in the southern parts. This is based on prior knowledge of the dataset, in which past analyses showed this spatial pattern.

Point Mapping

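# load packages (dplyr for %>% and filter, tmap for maps, spatstat for point patterns)
library(dplyr)
library(tmap)
library(spatstat)
# filter the data to select variables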
snow311 <- boston311 %>%
  filter(grepl("Request for Snow Plowing|Snow Complaint", TYPE))
unique(snow311$TYPE)
## [1] "Request for Snow Plowing"                      
## [2] "Misc. Snow Complaint"                          
## [3] "Request for Snow Plowing (Emergency Responder)"
# interactive map of variables
tmap_mode("view")
## tmap mode set to interactive viewing
tm_basemap(leaflet::providers$OpenStreetMap.BlackAndWhite) + tm_shape(snow311_sf) +
  tm_dots(col="TYPE")
# map points by neighborhoods
tm_shape(bostonNeighborhoods_sf) + tm_polygons(col="requests") +
  tm_shape(snow311_sf) + tm_dots(alpha=0.1)
# map density by neighborhoods
bostonNeighborhoods_sf %>%
  mutate(requestMile = requests/SqMiles) %>%
  tm_shape(.) + tm_polygons(col="requestMile", style="quantile", n=5,
                            title="311 Requests/Sq. Mile") +
  tm_shape(snow311_sf) + tm_dots(alpha=0.1)

Kernel Density Estimates (KDE)

# look at density using Kernel Density Estimate (KDE)
Snow_Request_Density <- density(rcsnow_ppp)
plot(Snow_Request_Density, main="Total Snow Request Density")
contour(Snow_Request_Density, add=TRUE)

# explore density patterns of individual variables
misc_ppp <- subset(rcsnow_ppp, marks=="Misc. Snow Complaint")
plow_ppp <- subset(rcsnow_ppp, marks=="Request for Snow Plowing")
Density_Plow_Points <- density(plow_ppp)
plot(Density_Plow_Points, main="Plow Request Density")

Density_Misc_Points <- density(misc_ppp)
plot(Density_Misc_Points, main="Misc. Snow Complaint Density")

K-Functions

# K-function to see whether plowing requests are more clustered or more dispersed than a random pattern
# uses the plow-request subset (plow_ppp)
kplow <- Kest(plow_ppp, correction = "border")
plot(kplow, main="Request for Snow Plowing K-function")

kplow.env <- envelope(plow_ppp, Kest, correction = "border")
## Generating 99 simulations of CSR  ...
## Done.
plot(kplow.env, main="Plow Request Envelope")

# K-function to see whether misc. snow complaints are more clustered or more dispersed than a random pattern
# uses the misc. snow complaint subset (misc_ppp)
kmisc <- Kest(misc_ppp, correction = "border")
plot(kmisc, main="Misc. Snow Complaint K-function")

kmisc.env <- envelope(misc_ppp, Kest, correction = "border")
## Generating 99 simulations of CSR  ...
## Done.
plot(kmisc.env, main="Misc. Snow Complaint Envelope")

Data and Methods

First, Boston 311 data is downloaded and read into R as a CSV file. Next, the variable formats are changed so that case enquiry IDs are stored as characters rather than numbers, and dates are converted to a proper date format using the lubridate library. The two target variables, Requests for Snow Plowing and Misc. Snow Complaints, are then filtered from the dataset. The data is then subset so that only points from 2017 are kept, in order to look at the most recent full winter on record. For the spatial analysis to work, the points need to be converted to a simple features object in the proper coordinate system, WGS 84. An interactive map is then made from the simple features object to display the Boston 311 points on an OpenStreetMap basemap.
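The preparation steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the small in-memory data frame stands in for the downloaded CSV, and the column names `case_enquiry_id`, `open_dt`, `latitude`, and `longitude` are assumed rather than taken from the analysis (only `TYPE` appears in the code shown earlier).

```r
# Minimal sketch of the preparation steps; column names and sample rows are assumptions.
library(dplyr)
library(lubridate)
library(sf)

# Stand-in for reading the downloaded 311 CSV; values are made up for illustration
boston311 <- data.frame(
  case_enquiry_id = c(101, 102, 103, 104),
  open_dt   = c("2017-01-08 09:15:00", "2017-02-10 17:40:00",
                "2016-12-30 08:00:00", "2017-03-14 12:05:00"),
  TYPE      = c("Request for Snow Plowing", "Misc. Snow Complaint",
                "Request for Snow Plowing", "Request for Pothole Repair"),
  latitude  = c(42.36, 42.33, 42.30, 42.35),
  longitude = c(-71.06, -71.09, -71.12, -71.07),
  stringsAsFactors = FALSE
)

snow311 <- boston311 %>%
  mutate(case_enquiry_id = as.character(case_enquiry_id),  # IDs as text, not numbers
         open_dt = ymd_hms(open_dt)) %>%                    # parse strings into date-times
  filter(grepl("Request for Snow Plowing|Snow Complaint", TYPE),  # snow variables only
         year(open_dt) == 2017)                                   # keep 2017 only

# Convert to a simple features object in WGS 84 (EPSG:4326)
snow311_sf <- st_as_sf(snow311, coords = c("longitude", "latitude"), crs = 4326)
```

With the sample rows above, only the two 2017 snow records survive the filter; the 2016 plow request and the pothole request are dropped.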

To map point density by neighborhood, a Boston neighborhoods shapefile is downloaded from worldmap.harvard.edu and brought into R as a simple features object using the st_read function. The point simple features object is then joined to the neighborhood simple features object using the st_join function, and the number of points in each Boston neighborhood is counted and mapped to show total points per neighborhood and points per square mile in each neighborhood. Following the creation of the maps, the simple features objects are converted to ppp objects, and the neighborhood simple features object is converted to an sp object to be used as a window for the ppp objects. A Boston town boundary is created by downloading the town boundaries of Massachusetts from MassGIS and filtering them to show only Boston. This is done to ensure the Boston boundary is accurate. Marks are then added to the ppp objects for use in the Kernel Density Estimate (KDE) maps.
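The join-and-count step might look something like the sketch below. The two square polygons, the three points, and the `Name` column are hypothetical stand-ins for the neighborhood shapefile and the 311 points; `requests`, `SqMiles`, and `requestMile` mirror the names used in the mapping code, but how the original analysis built them is not shown, so this is one plausible approach rather than the actual one.

```r
library(dplyr)
library(sf)

# Two hypothetical square "neighborhoods" in planar coordinates
square <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}
neighborhoods_sf <- st_sf(
  Name     = c("A", "B"),
  SqMiles  = c(1, 1),
  geometry = st_sfc(square(0, 0), square(1, 0))
)

# Three hypothetical request points: two fall in A, one in B
points_sf <- st_sf(
  id = 1:3,
  geometry = st_sfc(st_point(c(0.2, 0.5)), st_point(c(0.7, 0.3)), st_point(c(1.5, 0.5)))
)

# Attach the enclosing neighborhood to each point, then count points per neighborhood
counts <- st_join(points_sf, neighborhoods_sf) %>%
  st_drop_geometry() %>%
  count(Name, name = "requests")

# Bring the counts back to the polygons and compute requests per square mile
neighborhoods_sf <- neighborhoods_sf %>%
  left_join(counts, by = "Name") %>%
  mutate(requests = coalesce(requests, 0L),   # neighborhoods with no points get 0
         requestMile = requests / SqMiles)
```

Dividing by area matters because a large neighborhood can collect many requests simply by covering more ground, which is why the analysis maps both the totals and the per-square-mile values.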

To make the KDE maps, the Boston 311 simple features object is transformed to the Massachusetts state plane coordinate system using st_transform and then converted to a ppp object using as.ppp, with the Boston sp object as the window. Marks are then created by applying as.factor to the request type of the new point simple features object, so that all Boston 311 plow and snow complaint points can be examined together. Next, separate ppp objects are created for plow points and misc. snow complaint points, and the three ppp objects are mapped using the density function.
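As a rough illustration of the marked point pattern and the per-type density surfaces, the sketch below builds a ppp object from simulated coordinates. Everything here is assumed for the example: the random points stand in for the projected 311 locations, and the rectangular window stands in for the Boston boundary used in the analysis.

```r
library(spatstat)

set.seed(1)
# Hypothetical projected coordinates (metres) standing in for the transformed 311 points
n <- 200
x <- runif(n, 0, 1000)
y <- runif(n, 0, 1000)
type <- sample(c("Request for Snow Plowing", "Misc. Snow Complaint"), n, replace = TRUE)

# A rectangular window is an assumption; the analysis uses the Boston boundary instead
win <- owin(xrange = c(0, 1000), yrange = c(0, 1000))
rcsnow_ppp <- ppp(x, y, window = win, marks = as.factor(type))

# One ppp per request type, then a kernel density surface for each
plow_ppp <- subset(rcsnow_ppp, marks == "Request for Snow Plowing")
misc_ppp <- subset(rcsnow_ppp, marks == "Misc. Snow Complaint")
plow_density <- density(plow_ppp)
misc_density <- density(misc_ppp)
plot(plow_density, main = "Plow Request Density (simulated)")
```

Because the marks are stored as a factor on the combined pattern, each subset keeps the same window, so the resulting density surfaces cover the same area and can be compared side by side.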

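The final step is to compute K-functions for plow points and misc. snow complaint points to see whether they are more or less clustered than they would be if they were randomly placed throughout the city. This is done by applying the Kest function to each variable's ppp object and plotting the result with the basic plot function. Simulation envelopes, based on 99 simulations of complete spatial randomness (CSR), are then generated with the envelope function and plotted for comparison.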
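For reference, the comparison against randomness rests on the standard definition of Ripley's K-function, written out below. Here \lambda denotes the intensity of the pattern (points per unit area) and r the distance; these symbols are introduced for this note and are not used elsewhere in the report.

```latex
K(r) = \frac{1}{\lambda}\,
       \mathbb{E}\big[\text{number of other points within distance } r
       \text{ of a typical point}\big],
\qquad
K_{\mathrm{CSR}}(r) = \pi r^{2}
```

An estimated K(r) that lies above \pi r^{2} suggests clustering at distance r, while one that lies below suggests a more dispersed pattern. The simulation envelope indicates roughly how far the estimate can stray from \pi r^{2} under randomness alone.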

Results

According to each map created, the highest numbers of Boston 311 points for Snow Plow Requests and Misc. Snow Complaints are found in the northern part of the city, and they decrease gradually moving south and towards the airport, which is a non-residential area. This is shown most clearly by the KDE maps, especially the Total Snow Request Density map, whose contour lines mark the steadily decreasing gradient away from the densest area in northern Boston. This is in line with the hypothesis: no spatial anomalies were found, and the spatial pattern was consistent across both variables.

Discussion

The purpose of the analysis was to attempt to disprove the hypothesis by checking whether snow-related Boston 311 data, represented by Snow Plow Request points and Misc. Snow Complaint points, showed a different spatial pattern, perhaps because certain parts of the city are more impacted by snow than others. Unfortunately, no such deviation from the general trend of the dataset as a whole was found, as the snow variables behaved spatially much like other variables in the Boston 311 dataset.

One difficulty in the analysis came when creating the marks for each variable. One variable, Unshoveled Sidewalk, was removed from the analysis during this step because it plotted points outside the window, so the number of marks did not match the number of points. Further questions for the analysis include whether the pattern would change drastically depending on snowfall amounts or when looking at past years, as only 2017 was observed in the analysis. It would also be interesting to see how snow cleanup services are allocated throughout the city to compare the spatial patterns.
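One possible way to handle the marks mismatch described above, which was not the approach taken in this analysis, is to drop out-of-window points before building the ppp object and to subset the marks with the same index. The sketch below assumes made-up coordinates and a rectangular window as a stand-in for the city boundary.

```r
library(spatstat)

# Hypothetical coordinates; the last point lies outside the window
x <- c(10, 40, 75, 130)
y <- c(20, 60, 35, 50)
type <- c("Request for Snow Plowing", "Misc. Snow Complaint",
          "Unshoveled Sidewalk", "Unshoveled Sidewalk")
win <- owin(xrange = c(0, 100), yrange = c(0, 100))  # stand-in for the city boundary

# Keep only points inside the window, and subset the marks with the same index
keep <- inside.owin(x, y, win)
snow_ppp <- ppp(x[keep], y[keep], window = win, marks = as.factor(type[keep]))
```

Filtering the coordinates and the marks with the same logical vector keeps the two the same length, so a variable with a few stray points would not necessarily have to be dropped entirely.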