Introduction

Geographic Information Systems (GIS) have become a larger factor in sports analytics as teams embrace data analysis within their organizations. GIS provides an in-depth look at spatial data that can directly shape a team's plan, which has made analytics an important role on many teams' staffs. It produces visual analysis that can be communicated to both the coaching staff and individual players, and it makes it possible to quantify the spatial data created on the court. The analyst can therefore provide both spatial and statistical analysis to the coaching staff. This is especially true for basketball, a spatial game in which certain players or teams make better use of particular locations on the court. Coaching staffs use these locations to develop their game plans on both offense and defense (Miller and Bornn 2017). GIS has also been used to evaluate individual players' shooting abilities and their fit with a team, which gives the general management staff better insight when building a roster.

Another application of GIS is improving shooter development by tracking the spatial rim pattern in relation to the location on the court (Marty 2018). As sports analytics grows, teams will continue to seek new information to gain a competitive advantage over their peers. One casual observation of the impact of analytics on basketball is how teams have shifted from shooting two-pointers to shooting three-pointers. This change was brought about by studying analytics and evaluating the efficiency of shot locations. Players are also chosen with greater care, with more thought given to how a player fits the team's playstyle and how well the player ranks in advanced metrics. In the past, a player was evaluated on production, also known as box score statistics. Today, the player is evaluated more on the context of their game (Shea 2014).

In this project, I explore spatial differences in a team's performance using shot location data from the 2017-2018 season for the Boston Celtics. I create shot charts and hexagonal (hexbin) shot charts for the team's wins and losses in the programming language R, then compare the charts to note any differences between the two. I hypothesize that the data will show a significant difference between wins and losses and could provide further insight into the team's performance.

Literature Review

With this hypothesis, it is important to clearly define and visualize the separate shot locations in the Celtics’ wins and losses. As teams continue to apply the knowledge from analytics to try to improve their performance, this information must be in a usable format for both the coaching and management personnel (Fry and Ohlmann 2012). Using R, this analysis can be done in different formats to benefit the team using GIS. One application is evaluating shooting patterns (Marty 2018). This is done by evaluating both the location of the shot on the court and how the ball interacts with the hoop (Marty 2018). The results enable teams to evaluate shooters in more depth than current practice and allow coaches to teach players better shooting methods.



A second application of GIS for basketball is mapping NBA strategies (Miller and Bornn 2017). By integrating machine learning into basketball, further intelligence can be gathered that will translate into more successful coaching (Miller and Bornn 2017). Between individual player trajectories and possession mapping, coaches can further evaluate both their own and the opposing team’s play (Miller and Bornn 2017). These tasks can all be accomplished with GIS, but ultimately this information is only valuable if it improves the team’s win count.



To analyze the data in R, the project builds on a script by Ed Maia. The script uses data taken directly from NBA.com to analyze player shot data. It can produce shot charts, hexbin shot charts, and accuracy charts (Maia 2015). It works with the JSON data created by both the NBA and companies like ESPN. This GIS application, along with the products created by the script, provides enough information to benefit a team’s game planning.

Methodology

This analysis uses secondary data: 2017-2018 shot chart information for the Boston Celtics from NBASavant.com, which scrapes its data from sites like ESPN.com (Willman 2019). The data is recorded for every game and includes information such as location, result, player data, and type of shot. This kind of data is often used by both fans and media members to show box score statistics (Shea 2014). The R script takes the box score statistics and adds context to them, which is more valuable for understanding spatial information.



NBASavant.com uses web scraping, the process of extracting data from a website, to gather the necessary information for each game. The NBA previously provided JSON data, which contains the information for shot charts, through its website, but it has stopped doing so over the past few years. Since the information is no longer published there, programmers have used web scraping to extract the JSON information from sites like NBA.com and ESPN.com. This is how NBASavant.com gets its information for shot charts. Wins and losses were added to the data from NBASavant.com in order to test the project’s hypothesis. Some games are missing from the data, since it was taken from a third-party website. As seen below, the dataset still contains over four thousand shots, which is close to sixty percent of the total shots taken by the team in the 2017-2018 season.
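The paper does not describe how the win and loss labels were attached, so the following is only a minimal sketch of one way it could be done in R: joining a per-game lookup table to the shot log on the game date. The `results` table and the object names are hypothetical; the two dates, outcomes, and player names are taken from the `str()` output shown in the Results section.

```r
# Toy shot log: one row per shot (a tiny illustrative subset).
shots <- data.frame(
  game_date = c("10/17/2017", "10/17/2017", "12/4/2017"),
  name      = c("Marcus Smart", "Kyrie Irving", "Terry Rozier"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per game with its outcome.
results <- data.frame(
  game_date = c("10/17/2017", "12/4/2017"),
  result    = c("Loss", "Win"),
  stringsAsFactors = FALSE
)

# Left join: every shot keeps its row and picks up the result of its game.
labelled <- merge(shots, results, by = "game_date", all.x = TRUE)

# Shots from a game missing in the lookup table would appear as NA here.
table(labelled$result, useNA = "ifany")
```

Keeping `all.x = TRUE` means a shot is never silently dropped when its game has no recorded result, which matters for a dataset that is already missing some games.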



Ed Maia wrote the script to create shot charts and hexbin shot charts. The script uses ggplot2 to draw the shot chart (Maia 2015). The packages grid and jpeg are used to create the image of the basketball court (Maia 2015), and the coord_fixed function keeps the image from being distorted (Maia 2015). For the hexbin shot chart, the function stat_binhex, which relies on the hexbin package, is used in place of geom_point as another way to map the points (Maia 2015). This was done because the hexbin shot chart has gained popularity in the NBA through the work of Kirk Goldsberry and because it provides another visual aid for analyzing the data.

Results

library(rjson)
## Warning: package 'rjson' was built under R version 3.5.2
library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0       v purrr   0.2.5  
## v tibble  2.0.1       v dplyr   0.8.0.1
## v tidyr   0.8.1       v stringr 1.3.1  
## v readr   1.3.1       v forcats 0.3.0
## Warning: package 'ggplot2' was built under R version 3.5.2
## Warning: package 'tibble' was built under R version 3.5.2
## Warning: package 'readr' was built under R version 3.5.2
## Warning: package 'dplyr' was built under R version 3.5.2
## -- Conflicts ------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
bostonshot<-read.csv("nba_savant.csv", stringsAsFactors = FALSE)
#read the data
str(bostonshot)
## 'data.frame':    4070 obs. of  23 variables:
##  $ name             : chr  "Marcus Smart" "Kyrie Irving" "Kyrie Irving" "Gordon Hayward" ...
##  $ team_name        : chr  "Boston Celtics" "Boston Celtics" "Boston Celtics" "Boston Celtics" ...
##  $ game_date        : chr  "10/17/2017" "10/17/2017" "10/17/2017" "10/17/2017" ...
##  $ result           : chr  "Loss" "Loss" "Loss" "Loss" ...
##  $ season           : int  2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
##  $ espn_player_id   : int  2990992 6442 6442 4249 6442 2596158 3917376 NA NA NA ...
##  $ team_id          : int  1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 ...
##  $ espn_game_id     : int  400974437 400974437 400974437 400974437 400974437 400974437 400974437 400974437 400974437 400974437 ...
##  $ period           : int  4 4 1 1 3 3 1 3 2 2 ...
##  $ minutes_remaining: int  7 0 11 9 7 0 2 4 9 1 ...
##  $ seconds_remaining: int  39 33 44 14 12 52 17 15 56 50 ...
##  $ shot_made_flag   : int  0 0 1 1 0 0 0 0 0 0 ...
##  $ action_type      : chr  "Driving Floating Bank Jump Shot" "Driving Floating Jump Shot" "Driving Floating Jump Shot" "Fadeaway Jump Shot" ...
##  $ shot_type        : chr  "2PT Field Goal" "2PT Field Goal" "2PT Field Goal" "2PT Field Goal" ...
##  $ shot_distance    : int  6 8 10 9 25 25 25 12 25 17 ...
##  $ opponent         : chr  "Cleveland Cavaliers" "Cleveland Cavaliers" "Cleveland Cavaliers" "Cleveland Cavaliers" ...
##  $ x                : int  52 -86 -1 -9 3 71 -160 -96 -107 131 ...
##  $ y                : int  37 17 100 91 255 242 200 80 233 117 ...
##  $ dribbles         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ touch_time       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ defender_name    : logi  NA NA NA NA NA NA ...
##  $ defender_distance: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ shot_clock       : int  0 0 0 0 0 0 0 0 0 0 ...
# convert x, y, and shot distance into numeric
bostonshot$x <- as.numeric(as.character(bostonshot$x))
bostonshot$y <- as.numeric(as.character(bostonshot$y))
bostonshot$shot_distance <- as.numeric(as.character(bostonshot$shot_distance))

# have a look at the data
View(bostonshot)

In order to understand the information extracted above, it is important to set the boundaries of the basketball court.
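Before fixing those boundaries, it helps to know what the coordinates measure. A quick check, offered here as a sketch rather than as part of the original script, suggests that `x` and `y` are recorded in tenths of a foot with the hoop at the origin. That reading is an assumption, tested only against the ten rows printed by `str(bostonshot)` above.

```r
# First ten values of x, y and shot_distance as printed by str(bostonshot).
x <- c(52, -86, -1, -9, 3, 71, -160, -96, -107, 131)
y <- c(37, 17, 100, 91, 255, 242, 200, 80, 233, 117)
shot_distance <- c(6, 8, 10, 9, 25, 25, 25, 12, 25, 17)

# Assumption: (0, 0) is the hoop and one unit is a tenth of a foot,
# so the straight-line distance in feet is sqrt(x^2 + y^2) / 10.
implied_distance <- floor(sqrt(x^2 + y^2) / 10)

# TRUE when the assumption reproduces the recorded distances for these rows.
all(implied_distance == shot_distance)
```

Under this reading, the plotting limits of -250 to 250 on the x axis correspond to a court 50 feet wide.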

library(ggplot2)

# simple plot using shot_type to colour the dots
ggplot(bostonshot, aes(x=x, y=y)) +
  geom_point(aes(colour = shot_type))+
  labs(caption = "Figure 1")

library(grid)
library(jpeg)
## Warning: package 'jpeg' was built under R version 3.5.2
library(RCurl)
## Warning: package 'RCurl' was built under R version 3.5.2
## Loading required package: bitops
## 
## Attaching package: 'RCurl'
## The following object is masked from 'package:tidyr':
## 
##     complete
# half court image
courtImg.URL <- "https://thedatagame.files.wordpress.com/2016/03/nba_court.jpg"
court <- rasterGrob(readJPEG(getURLContent(courtImg.URL)),
                    width=unit(1,"npc"), height=unit(1,"npc"))


# plot using NBA court background and colour by shot type
ggplot(bostonshot, aes(x=x, y=y)) + 
  annotation_custom(court, -250, 250, -50, 420) +
  geom_point(aes(colour = shot_type)) +
  xlim(-250, 250) +
  ylim(-50, 420)+
  labs(caption = "Figure 2")
## Warning: Removed 16 rows containing missing values (geom_point).

# plot using ggplot and NBA court background image
ggplot(bostonshot, aes(x=x, y=y)) +
  annotation_custom(court, -250, 250, -50, 420) +
  geom_point(aes(colour = shot_type)) +
  xlim(250, -250) +
  ylim(-50, 420) +
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Boston Celtics Shot Chart 2017-2018") +
  labs(caption = "Figure 3")+
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 15, lineheight = 0.9, face = "bold"))
## Warning: Removed 16 rows containing missing values (geom_point).
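The warning above reports that 16 rows were dropped. A likely explanation, stated here as an assumption, is that those shots fall outside the axis limits set by `xlim()` and `ylim()`, for example attempts from beyond half court. The sketch below shows how such shots could be counted; `count_outside` is a hypothetical helper and the coordinates are toy values.

```r
# Axis limits used for the court image in the plots above.
x_limits <- c(-250, 250)
y_limits <- c(-50, 420)

# Count shots that fall outside the plotted area.
count_outside <- function(x, y, xlim = x_limits, ylim = y_limits) {
  outside <- x < xlim[1] | x > xlim[2] | y < ylim[1] | y > ylim[2]
  sum(outside, na.rm = TRUE)
}

# Toy coordinates: the last two points lie beyond the plotted half court.
toy_x <- c(52, -86, 0, 260)
toy_y <- c(37, 17, 450, 100)
count_outside(toy_x, toy_y)
```

Running `count_outside(bostonshot$x, bostonshot$y)` on the real data would show whether the limits account for all 16 dropped rows.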

This next step cleans up the image and adds the Boston Celtics logo to help identify where the data came from (Figure 4).

library(grid)
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
library(png)
library(RCurl)

# download the Celtics logo and save it as a raster object
# (the variable name playerImg is kept as in the original script)
playerImg.URL <- "http://content.sportslogos.net/logos/6/213/full/slhg02hbef3j1ov4lsnwyol5o.png"
playerImg <- rasterGrob(readPNG(getURLContent(playerImg.URL)), 
                        width=unit(0.15, "npc"), height=unit(0.15, "npc"))


# plot using ggplot and NBA court background
ggplot(bostonshot, aes(x=x, y=y)) +
  annotation_custom(court, -250, 250, -50, 420) +
  geom_point(aes(colour = shot_type)) +
  xlim(250, -250) +
  ylim(-50, 420) +
  labs(caption = "Figure 4")+
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Boston Celtics Shot Chart 2017-2018") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 15, lineheight = 0.9, face = "bold"))
## Warning: Removed 16 rows containing missing values (geom_point).
# add celtics logo and footnote to the plot
pushViewport(viewport(x = unit(0.9, "npc"), y = unit(0.9, "npc")))
grid.draw(playerImg)
grid.text(label = "thedatagame.com.au", just = "centre", vjust = 50)

This next visual (Figure 5) is a hexbin shot chart, a format made popular by Kirk Goldsberry (Maia 2015). The chart shows the number of shots the Boston Celtics took from each location, which gives insight into where the Celtics prefer to shoot.

library(hexbin)
## Warning: package 'hexbin' was built under R version 3.5.2
# plot shots using ggplot, hex bins, NBA court background image.
ggplot(bostonshot, aes(x=x, y=y)) + 
  annotation_custom(court, -250, 250, -52, 418) +
  stat_binhex(bins = 25, colour = "gray", alpha = 0.7) +
  scale_fill_gradientn(colours = c("yellow","orange","red")) +
  guides(alpha = FALSE, size = FALSE) +
  xlim(250, -250) +
  ylim(-52, 418) +
  labs(caption = "Figure 5")+
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Boston Celtics Hexbin Shot Chart 2017-2018") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 17, lineheight = 1.2, face = "bold"))
## Warning: Removed 16 rows containing non-finite values (stat_binhex).
## Warning: Removed 2 rows containing missing values (geom_hex).
# add Celtics logo and footnote to the plot
pushViewport(viewport(x = unit(0.9, "npc"), y = unit(0.9, "npc")))
grid.draw(playerImg)
grid.text(label = "thedatagame.com.au", just = "centre", vjust = 50)

To test the hypothesis, the data above must be separated into wins and losses. Figure 6 is the shot chart for wins.

# create object for the Celtics' wins

shotwin<- subset(bostonshot, result == "Win")

str(shotwin)
## 'data.frame':    2783 obs. of  23 variables:
##  $ name             : chr  "Terry Rozier" "Kyrie Irving" "Kyrie Irving" "Kyrie Irving" ...
##  $ team_name        : chr  "Boston Celtics" "Boston Celtics" "Boston Celtics" "Boston Celtics" ...
##  $ game_date        : chr  "12/4/2017" "12/4/2017" "12/4/2017" "12/4/2017" ...
##  $ result           : chr  "Win" "Win" "Win" "Win" ...
##  $ season           : int  2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
##  $ espn_player_id   : int  3074752 6442 6442 6442 3213 6442 2990992 2990992 3213 3917376 ...
##  $ team_id          : int  1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 ...
##  $ espn_game_id     : int  400975094 400975094 400975094 400975094 400975094 400975094 400975094 400975094 400975094 400975094 ...
##  $ period           : int  2 1 3 4 1 2 3 3 1 4 ...
##  $ minutes_remaining: int  7 11 6 1 3 5 5 0 1 2 ...
##  $ seconds_remaining: int  28 37 4 12 42 49 4 29 22 7 ...
##  $ shot_made_flag   : int  1 1 0 1 0 0 1 0 0 0 ...
##  $ action_type      : chr  "Driving Floating Bank Jump Shot" "Driving Floating Bank Jump Shot" "Driving Floating Jump Shot" "Driving Floating Jump Shot" ...
##  $ shot_type        : chr  "2PT Field Goal" "2PT Field Goal" "2PT Field Goal" "2PT Field Goal" ...
##  $ shot_distance    : num  4 3 10 6 8 16 20 22 25 24 ...
##  $ opponent         : chr  "Milwaukee Bucks" "Milwaukee Bucks" "Milwaukee Bucks" "Milwaukee Bucks" ...
##  $ x                : num  13 10 46 42 79 -163 47 229 90 -28 ...
##  $ y                : num  46 31 99 52 28 -11 203 -15 243 245 ...
##  $ dribbles         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ touch_time       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ defender_name    : logi  NA NA NA NA NA NA ...
##  $ defender_distance: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ shot_clock       : int  0 0 0 0 0 0 0 0 0 0 ...
# plot using ggplot and NBA court background for the Celtics' wins
ggplot(shotwin, aes(x=x, y=y)) +
  annotation_custom(court, -250, 250, -50, 420) +
  geom_point(aes(colour = shot_type)) +
  xlim(250, -250) +
  ylim(-50, 420) +
  labs(caption = "Figure 6")+
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Boston Celtics Shot Chart 2017-2018 Wins") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 15, lineheight = 0.9, face = "bold"))
## Warning: Removed 12 rows containing missing values (geom_point).
# add Celtics logo and footnote to the plot
pushViewport(viewport(x = unit(0.9, "npc"), y = unit(0.9, "npc")))
grid.draw(playerImg)
grid.text(label = "thedatagame.com.au", just = "centre", vjust = 50)

Figure 7 shows the Hexbin Shot Chart for wins.

#hexbins
ggplot(shotwin, aes(x=x, y=y)) + 
  annotation_custom(court, -250, 250, -52, 418) +
  stat_binhex(bins = 25, colour = "gray", alpha = 0.7) +
  scale_fill_gradientn(colours = c("yellow","orange","red")) +
  guides(alpha = FALSE, size = FALSE) +
  xlim(250, -250) +
  ylim(-52, 418) +
  labs(caption = "Figure 7")+
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Boston Celtics Hexbin Shot Chart Wins") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 17, lineheight = 1.2, face = "bold"))
## Warning: Removed 12 rows containing non-finite values (stat_binhex).
## Warning: Removed 1 rows containing missing values (geom_hex).
# add Celtics logo and footnote to the plot
pushViewport(viewport(x = unit(0.9, "npc"), y = unit(0.9, "npc")))
grid.draw(playerImg)
grid.text(label = "thedatagame.com.au", just = "centre", vjust = 50)

Figure 8 shows the Shot Chart for losses.

# create object for the Celtics' losses
shotloss <- subset(bostonshot, result == "Loss")

str(shotloss)
## 'data.frame':    1287 obs. of  23 variables:
##  $ name             : chr  "Marcus Smart" "Kyrie Irving" "Kyrie Irving" "Gordon Hayward" ...
##  $ team_name        : chr  "Boston Celtics" "Boston Celtics" "Boston Celtics" "Boston Celtics" ...
##  $ game_date        : chr  "10/17/2017" "10/17/2017" "10/17/2017" "10/17/2017" ...
##  $ result           : chr  "Loss" "Loss" "Loss" "Loss" ...
##  $ season           : int  2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
##  $ espn_player_id   : int  2990992 6442 6442 4249 6442 2596158 3917376 NA NA NA ...
##  $ team_id          : int  1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 1610612738 ...
##  $ espn_game_id     : int  400974437 400974437 400974437 400974437 400974437 400974437 400974437 400974437 400974437 400974437 ...
##  $ period           : int  4 4 1 1 3 3 1 3 2 2 ...
##  $ minutes_remaining: int  7 0 11 9 7 0 2 4 9 1 ...
##  $ seconds_remaining: int  39 33 44 14 12 52 17 15 56 50 ...
##  $ shot_made_flag   : int  0 0 1 1 0 0 0 0 0 0 ...
##  $ action_type      : chr  "Driving Floating Bank Jump Shot" "Driving Floating Jump Shot" "Driving Floating Jump Shot" "Fadeaway Jump Shot" ...
##  $ shot_type        : chr  "2PT Field Goal" "2PT Field Goal" "2PT Field Goal" "2PT Field Goal" ...
##  $ shot_distance    : num  6 8 10 9 25 25 25 12 25 17 ...
##  $ opponent         : chr  "Cleveland Cavaliers" "Cleveland Cavaliers" "Cleveland Cavaliers" "Cleveland Cavaliers" ...
##  $ x                : num  52 -86 -1 -9 3 71 -160 -96 -107 131 ...
##  $ y                : num  37 17 100 91 255 242 200 80 233 117 ...
##  $ dribbles         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ touch_time       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ defender_name    : logi  NA NA NA NA NA NA ...
##  $ defender_distance: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ shot_clock       : int  0 0 0 0 0 0 0 0 0 0 ...
# plot using ggplot and NBA court background for the Celtics' losses
ggplot(shotloss, aes(x=x, y=y)) +
  annotation_custom(court, -250, 250, -50, 420) +
  geom_point(aes(colour = shot_type)) +
  xlim(250, -250) +
  ylim(-50, 420) +
  labs(caption = "Figure 8")+
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Boston Celtics Shot Chart 2017-2018 Losses") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 15, lineheight = 0.9, face = "bold"))
## Warning: Removed 4 rows containing missing values (geom_point).
# add Celtics logo and footnote to the plot
pushViewport(viewport(x = unit(0.9, "npc"), y = unit(0.9, "npc")))
grid.draw(playerImg)
grid.text(label = "thedatagame.com.au", just = "centre", vjust = 50)

Figure 9 shows the Hexbin Shot Chart for losses.

#hexbins

ggplot(shotloss, aes(x=x, y=y)) + 
  annotation_custom(court, -250, 250, -52, 418) +
  stat_binhex(bins = 25, colour = "gray", alpha = 0.7) +
  scale_fill_gradientn(colours = c("yellow","orange","red")) +
  guides(alpha = FALSE, size = FALSE) +
  xlim(250, -250) +
  ylim(-52, 418) +
  labs(caption = "Figure 9")+
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Boston Celtics Hexbin Shot Chart Losses") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 17, lineheight = 1.2, face = "bold"))
## Warning: Removed 4 rows containing non-finite values (stat_binhex).
# add Celtics logo and footnote to the plot
pushViewport(viewport(x = unit(0.9, "npc"), y = unit(0.9, "npc")))
grid.draw(playerImg)
grid.text(label = "thedatagame.com.au", just = "centre", vjust = 50)

The script below gives a side-by-side comparison of the season shot totals for wins and losses (Figure 10). As displayed, the Celtics seem to have shot the ball better from beyond the three-point line and in the midrange region by the free throw line in their wins. Shots in wins also cover more locations on the court, whereas the chart for losses has larger white regions. The numbers represent the number of shots located in each hexbin.

library(cowplot)
## Warning: package 'cowplot' was built under R version 3.5.3
## 
## Attaching package: 'cowplot'
## The following object is masked from 'package:ggplot2':
## 
##     ggsave
winhex <- ggplot(shotwin, aes(x=x, y=y)) + 
  annotation_custom(court, -250, 250, -52, 418) +
  stat_binhex(bins = 25, colour = "gray", alpha = 0.7) +
  scale_fill_gradientn(colours = c("yellow","orange","red")) +
  guides(alpha = FALSE, size = FALSE) +
  xlim(250, -250) +
  ylim(-52, 418) +
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Wins") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 17, lineheight = 1.2, face = "bold"))


losshex <- ggplot(shotloss, aes(x=x, y=y)) + 
  annotation_custom(court, -250, 250, -52, 418) +
  stat_binhex(bins = 25, colour = "gray", alpha = 0.7) +
  scale_fill_gradientn(colours = c("yellow","orange","red")) +
  guides(alpha = FALSE, size = FALSE) +
  xlim(250, -250) +
  ylim(-52, 418) +
  labs(caption = "Figure 10")+
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  ggtitle("Losses") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 17, lineheight = 1.2, face = "bold"))

plot_grid(winhex, losshex, labels = "Boston Celtics Season Shot Chart")
## Warning: Removed 12 rows containing non-finite values (stat_binhex).
## Warning: Removed 1 rows containing missing values (geom_hex).
## Warning: Removed 4 rows containing non-finite values (stat_binhex).
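Because each hexbin counts attempts, the visual comparison could be backed up with numbers. The following is a minimal sketch, not part of the original script, of how make rates and the share of three-point attempts could be computed by game result. The data frame is a made-up toy sample that reuses the column names of `bostonshot`; the label "3PT Field Goal" is assumed to mirror the "2PT Field Goal" value seen in the data.

```r
# Toy sample with the same column names as bostonshot (values are made up).
toy <- data.frame(
  result         = c("Win", "Win", "Win", "Win", "Loss", "Loss", "Loss", "Loss"),
  shot_type      = c("3PT Field Goal", "3PT Field Goal", "2PT Field Goal", "2PT Field Goal",
                     "3PT Field Goal", "2PT Field Goal", "2PT Field Goal", "2PT Field Goal"),
  shot_made_flag = c(1, 0, 1, 1, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Make rate by game result and shot type (the mean of a 0/1 flag is a rate).
make_rate <- aggregate(shot_made_flag ~ result + shot_type, data = toy, FUN = mean)

# Share of attempts that are threes, computed within wins and within losses.
three_share <- prop.table(table(toy$result, toy$shot_type), margin = 1)[, "3PT Field Goal"]

make_rate
three_share
```

Replacing `toy` with `bostonshot` would give the corresponding figures for the real season data.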

In Figure 11, the same data is used as in Figure 10, with the color ramp normalized across both hexbin shot charts. This enables further comparison between the two separate charts. It shows a larger volume of three-point shooting compared to two-point shooting during wins. In losses, there is less of a difference between the two and a larger number of areas on the court from which shots were not attempted.

# facet by game result so wins and losses share one colour ramp
ggplot(bostonshot, aes(x = x, y = y)) +
  annotation_custom(court, -250, 250, -52, 418) +
  stat_binhex(bins = 25, colour = "gray", alpha = 0.7) +
  # the gradient is set on fill, since colour is fixed to gray above
  scale_fill_gradientn(colours = rev(rainbow(5))) +
  xlim(250, -250) +
  ylim(-52, 418) +
  labs(caption = "Figure 11") +
  ggtitle("HexShot Shots Wins/Losses Comparison") +
  xlab("") +
  ylab("") +
  geom_rug(alpha = 0.2) +
  coord_fixed() +
  facet_wrap(~ result)
## Warning: Removed 16 rows containing non-finite values (stat_binhex).
## Warning: Removed 2 rows containing missing values (geom_hex).
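A shared color ramp still compares raw counts, and the dataset holds 2,783 shots from wins against 1,287 from losses, so bins in wins will tend to look fuller. One hedged way to put the two on the same footing is to compare proportions instead of counts. The sketch below does this for distance bands on toy data; the band break points are illustrative assumptions, not official NBA shot zones.

```r
# Toy shots: game result plus shot_distance in feet (values are made up).
toy <- data.frame(
  result        = c(rep("Win", 6), rep("Loss", 3)),
  shot_distance = c(2, 5, 12, 25, 26, 24, 3, 15, 25)
)

# Assumed distance bands, closed on the left: [0, 8), [8, 16), [16, 24), [24, Inf).
toy$band <- cut(toy$shot_distance,
                breaks = c(0, 8, 16, 24, Inf),
                labels = c("0-8 ft", "8-16 ft", "16-24 ft", "24+ ft"),
                right = FALSE)

# Raw counts favour wins simply because wins contain more shots.
counts <- table(toy$band, toy$result)

# Column-wise proportions: each result's shares sum to 1.
shares <- prop.table(counts, margin = 2)
round(shares, 2)
```

The same idea applies per hexbin: dividing each bin's count by the total number of shots for that result gives the share of attempts from that location.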

Conclusion

The data shows a noticeable difference in the Boston Celtics' shot charts between wins and losses. This is observed spatially in both the regular shot charts and the hexbin shot charts. The most obvious spatial differences are an increase in efficiency from three-point range and around the free throw line. Shots attempted in wins are also spread more widely than in losses, as indicated by the larger amount of white space in the hexbin shot chart for losses. These results support the importance of spatial information in analyzing team performance.

Further analysis for the coaching staff could compare individual players to the team. This would help identify spatial locations where a player performs better or worse and allow the coaching staff to put the player in the locations that result in the most success. The same could be done for opposing teams and players to identify defensive schemes that limit their most successful locations on the court. General management staff could also use this location data to see how free agents or trade targets could affect the team with their shot locations.

Another logical step forward would be to analyze spatial locations in relation to the play calling of the coaching staff. This would further benefit the team and give more context on how the spatial location of the shots was decided. Also, the rest of the shot data for the 2017-2018 regular season needs to be analyzed. While assumptions can be made with the current dataset, questions remain about that season because of the missing data. Because basketball is a spatial sport, there is much that GIS can do to benefit a team. This project alone shows how a team differs spatially in wins and losses.

References

Franks, A., A. Miller, L. Bornn, and K. Goldsberry. 2015. Characterizing the spatial structure of defensive skill in professional basketball. Institute of Mathematical Statistics 9:94-121.

Fry, M., and J. Ohlmann. 2012. Introduction to the Special Issue on Analytics in Sports, Part I: General Sports Applications. Interfaces 42:105-108.

Jensen, D. 2014. Spatial Analysis and Visualization in the NBA using GIS applications. California State University Department of Geography.

Maia, E. 2015. How to create NBA shot charts in R. Data Science and Sports Analytics. https://thedatagame.com.au/2015/09/27/how-to-create-nba-shot-charts-in-r/ (last accessed 29 April 2019).

Maia, E. 2017. NBA-Shot-Charts/NBA shot charts.R. NBA-Shot-Charts. https://github.com/Maiae/NBA-Shot-Charts/blob/master/NBA shot charts.R (last accessed 29 April 2019).

Marty, R. 2018. High-resolution shot capture reveals systematic biases and an improved method for shooter evaluation. 42 Analytics.

Miller, A., and L. Bornn. 2017. Possession Sketches: Mapping NBA Strategies. 42 Analytics.

Shea, S. 2014. Basketball Analytics: Spatial Tracking. CreateSpace Independent Publishing Platform.

Willman, D. 2019. Shot Tracker. NBA Savant. http://www.nbasavant.com/shot_search.php (last accessed 29 April 2019).