4 The Madison Square Garden Effect in the NBA
Author: Robert Hutto
4.0.1 What is Madison Square Garden?
MSG, or “The Garden”, is a historic arena located in the heart of Manhattan. The Garden opened on February 11, 1968, making it the oldest major sporting facility in the New York metropolitan area and the oldest active arena in the NBA. Beyond basketball, the arena hosts concerts, boxing matches, political events, and other major cultural performances, making it one of the most recognizable venues in the world. The Garden is especially iconic in the world of basketball, with many revering it as the “Mecca of Basketball”.
4.0.2 What makes MSG so special?
MSG is the premier indoor venue of New York City, the most visited and most densely populated major city in the United States and the nation’s largest media market. Events held at MSG often receive disproportionate national and international attention, and NBA games played at The Garden are more likely to be nationally televised, widely discussed in sports media, and preserved in highlight compilations than games played elsewhere.
MSG is also the home court of the New York Knicks, one of the NBA’s most valuable and widely followed franchises, regardless of on-court success. Knicks fans are notoriously vocal, expressive, and unafraid to engage with players. Courtside seats are typically filled with famous actors, musicians, athletes, and public figures, many of whom are highlighted on broadcasts and arena displays. On any given night, there are dozens of high-profile celebrities present throughout the arena, creating a feeling that the game is a public showcase, rather than just another regular-season game. Given the arena’s historic status, visibility, and the fans it brings in, performances at MSG are perceived as particularly meaningful and memorable.
In line with this, MSG boasts a long list of iconic player performances. Michael Jordan scored 55 points in his “double-nickel” game at MSG on March 28, 1995, shortly after returning from his first retirement, reasserting himself as the face of the league. On February 2, 2009, Kobe Bryant set what was then the record for most points scored in a game at The Garden with a 61-point performance, solidifying his legacy as one of the best scorers the NBA has ever seen. Stephen Curry’s breakout game on February 27, 2013, in which he made 11 of his 13 three-point attempts and scored 54 points at The Garden, is often referenced as the moment he rose to star status. Nearly nine years later at The Garden, Curry hit his 2,974th career three-pointer, surpassing Ray Allen for the NBA’s all-time record in made three-pointers. The list goes on.
This combination of historical performances, celebrity presence, media visibility, and competitive intensity has produced a widely accepted narrative that the pressures associated with MSG uniquely influence individual games and player performances. Clutch star players rise to the occasion with standout performances, while others “shrink under the lights” and struggle to perform, further entrenching the belief that the arena itself exerts an influence on performance.
4.0.3 Is the MSG effect real?
While perceptions of the “MSG Effect” are deeply ingrained in basketball culture, they are rooted in isolated moments, media framing, and retrospective storytelling rather than statistics. The present analysis seeks to address this gap by examining whether this “MSG Effect” is actually detectable in statistics, or if it is simply a product of our narrative-obsessed imagination.
4.0.4 Three overarching research questions:
4.0.4.1 Q1: Do the New York Knicks experience a special home-court advantage due to playing at MSG?
4.0.4.2 Q2: Do visiting players play differently at MSG than other arenas?
4.0.4.3 Q3: Who benefits the most from playing at MSG?
4.1 —————————————————————————–
4.2 NBA Data Project
4.2.0.1 hoopR allows us to call all NBA game data from the 2002 season to present, so that’s what we will work with.
library(hoopR)
library(tidyverse)
#> ── Attaching core tidyverse packages ──── tidyverse 2.0.0 ──
#> ✔ dplyr 1.1.4 ✔ readr 2.1.5
#> ✔ forcats 1.0.1 ✔ stringr 1.6.0
#> ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
#> ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
#> ✔ purrr 1.2.0
#> ── Conflicts ────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
seasons <- 2002:most_recent_nba_season()
# Let's download game-level schedule data for every game played in this era.
sched <- load_nba_schedule(seasons = seasons)
# Only standard NBA games (excludes ALLSTAR, USA/WORLD, EAST/WEST, etc.)
sched <- sched %>%
filter(type_abbreviation == "STD")
nba_abbrevs <- sched %>%
select(home_abbreviation, away_abbreviation) %>%
pivot_longer(cols = everything(), values_to = "team_abbreviation") %>%
distinct(team_abbreviation)
# Let's create a dataset with only games played at MSG.
msg_games <- sched %>%
filter(venue_full_name == "Madison Square Garden") %>% # venue name is in the schedule data
transmute(
game_id,
season,
season_type,
game_date,
venue_full_name,
home_abbreviation,
away_abbreviation,
home_score,
away_score,
home_winner,
neutral_site
)
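# Check which season types and home teams appear among MSG games, then keep only Knicks home games at a non-neutral site.
# (After the STD filter above, only season_type 2 remains in this sample.)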
msg_games %>% count(season_type, sort = TRUE)
#> ── ESPN NBA Schedule from hoopR data repository ────────────
#> ℹ Data updated: 2026-01-06 07:28:14 EST
#> # A tibble: 1 × 2
#> season_type n
#> <int> <int>
#> 1 2 1001
msg_games %>% count(home_abbreviation, sort = TRUE) %>% head(10)
#> ── ESPN NBA Schedule from hoopR data repository ────────────
#> ℹ Data updated: 2026-01-06 07:28:14 EST
#> # A tibble: 3 × 2
#> home_abbreviation n
#> <chr> <int>
#> 1 NY 999
#> 2 EAST 1
#> 3 IND 1
msg_knicks_home_games <- msg_games %>%
filter(home_abbreviation == "NY", neutral_site == FALSE)
msg_knicks_home_games %>%
count(season_type, sort = TRUE)
#> ── ESPN NBA Schedule from hoopR data repository ────────────
#> ℹ Data updated: 2026-01-06 07:28:14 EST
#> # A tibble: 1 × 2
#> season_type n
#> <int> <int>
#> 1 2 999
# Load player box scores for all games in all seasons.
pb <- load_nba_player_box(seasons = seasons)
pb %>%
filter(team_abbreviation %in% c("NY", "NYK")) %>%
count(team_abbreviation, sort = TRUE)
#> ── ESPN NBA Player Boxscores from hoopR data repository ────
#> ℹ Data updated: 2026-01-06 07:28:12 EST
#> # A tibble: 1 × 2
#> team_abbreviation n
#> <chr> <int>
#> 1 NY 26886
# Let's add some composite measures of offensive and defensive stat creation.
pb <- pb %>%
filter(!did_not_play, minutes > 0) %>%
mutate(
# True Shooting Percentage
denom = 2 * (field_goals_attempted + 0.44 * free_throws_attempted),
ts = if_else(denom > 0, points / denom, NA_real_),
# Composite performance metrics
offensive_output = points + rebounds + assists,
defensive_output = steals + blocks
)
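As a quick check on the True Shooting formula used above, the sketch below applies it to a made-up stat line. The helper name `true_shooting()` and the numbers are illustrative assumptions; they are not part of hoopR or of the data.

```r
# Hypothetical helper mirroring the TS% definition above:
# TS = PTS / (2 * (FGA + 0.44 * FTA)), with NA when there are no attempts.
true_shooting <- function(points, fga, fta) {
  denom <- 2 * (fga + 0.44 * fta)
  ifelse(denom > 0, points / denom, NA_real_)
}

# Made-up stat lines: 30 points on 20 FGA and 10 FTA, and a line with no attempts.
ts_example <- true_shooting(points = c(30, 0), fga = c(20, 0), fta = c(10, 0))

# The first value is 30 / 48.8 (about 0.615); the second is NA.
round(ts_example, 3)
```

Guarding the zero-attempt case keeps players who never shot from contributing undefined values to an average.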
# Create dataset of all player box scores only from games at MSG. Categorize home/away players. Calculate TS%.
pb_msg <- pb %>%
inner_join(
msg_knicks_home_games,
by = c("game_id", "season", "season_type", "game_date")
) %>%
mutate(
at_msg = TRUE,
is_knicks = (team_abbreviation == "NY"),
is_home = (home_away == "home"),
is_away = (home_away == "away"),
ts = points / (2 * (field_goals_attempted + 0.44 * free_throws_attempted))
)
pb_road_flagged <- pb %>%
filter(home_away == "away", !did_not_play, minutes > 0) %>%
left_join(
msg_knicks_home_games %>% transmute(game_id, at_msg = TRUE),
by = "game_id"
) %>%
mutate(
at_msg = if_else(is.na(at_msg), FALSE, at_msg),
ts = points / (2 * (field_goals_attempted + 0.44 * free_throws_attempted))
)
4.3 —————————————————————————–
4.4 Q1: Do the New York Knicks experience a special home-court advantage due to playing at MSG?
4.4.0.1 Where do the Knicks rank in terms of home court advantage?
# Let's find each team's home court advantage (average total points scored at home - average total points scored away).
non_nba_teams <- c("EAST", "WEST", "USA", "WORLD", "GIA", "LEB")
team_game <- pb %>%
filter(
!did_not_play,
minutes > 0,
) %>%
group_by(game_id, team_abbreviation, home_away) %>%
summarise(
team_points = sum(points, na.rm = TRUE),
.groups = "drop"
)
team_home_away <- team_game %>%
filter(!team_abbreviation %in% non_nba_teams) %>%
group_by(team_abbreviation, home_away) %>%
summarise(
avg_points = mean(team_points, na.rm = TRUE),
.groups = "drop"
)
team_home_advantage <- team_home_away %>%
pivot_wider(
names_from = home_away,
values_from = avg_points
) %>%
mutate(
home_court_advantage = home - away
) %>%
select(team_abbreviation, home_court_advantage)
nba_abbrevs <- sched %>%
select(home_abbreviation, away_abbreviation) %>%
pivot_longer(
cols = everything(),
values_to = "team_abbreviation"
) %>%
distinct(team_abbreviation)
team_home_advantage_nba <- team_home_advantage %>%
semi_join(nba_abbrevs, by = "team_abbreviation")
knicks_abbrevs <- c("NY", "NYK")
team_home_advantage_ranked <- team_home_advantage_nba %>%
mutate(is_knicks = team_abbreviation %in% knicks_abbrevs) %>%
arrange(desc(home_court_advantage)) %>%
mutate(rank = row_number())
# Display:
knicks_row <- team_home_advantage_ranked %>%
filter(is_knicks)
display_table <- team_home_advantage_ranked %>%
filter(!team_abbreviation %in% non_nba_teams) %>%
mutate(
home_court_advantage = round(home_court_advantage, 2)
) %>%
select(rank, team_abbreviation, home_court_advantage)
knitr::kable(
display_table,
caption = "Team-Level Home Court Advantage (Home − Away Points)"
)
| rank | team_abbreviation | home_court_advantage |
|---|---|---|
| 1 | DEN | 4.46 |
| 2 | POR | 4.29 |
| 3 | ATL | 3.83 |
| 4 | WSH | 3.72 |
| 5 | MIL | 3.59 |
| 6 | OKC | 3.51 |
| 7 | MIA | 3.51 |
| 8 | SAC | 3.50 |
| 9 | GS | 3.45 |
| 10 | SA | 3.43 |
| 11 | UTAH | 3.40 |
| 12 | IND | 3.32 |
| 13 | NJ | 3.16 |
| 14 | NO | 3.10 |
| 15 | CLE | 3.08 |
| 16 | DAL | 2.99 |
| 17 | ORL | 2.88 |
| 18 | TOR | 2.79 |
| 19 | MEM | 2.74 |
| 20 | DET | 2.54 |
| 21 | LAL | 2.53 |
| 22 | PHX | 2.53 |
| 23 | NY | 2.44 |
| 24 | CHA | 2.40 |
| 25 | PHI | 2.21 |
| 26 | BOS | 2.17 |
| 27 | SEA | 2.04 |
| 28 | LAC | 2.02 |
| 29 | HOU | 1.58 |
| 30 | MIN | 0.92 |
| 31 | CHI | 0.82 |
| 32 | BKN | 0.43 |
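4.4.0.2 New York’s home court advantage (+2.44 points) ranks 23rd of the 32 team abbreviations in the data, placing the Knicks in the bottom third. (The list has 32 rows rather than 30 because relocated franchises appear under both abbreviations: NJ and BKN, SEA and OKC.) Still, their home–away scoring differential is only about 0.35 points below the average of the listed teams (about 2.79), indicating that Madison Square Garden does not confer a markedly weaker or stronger team-level advantage.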
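To put the Knicks' figure in context, here is a minimal sketch that works directly from the values in the table above (typed in by hand, so it does not need the hoopR download). The variable names are illustrative.

```r
# Home-court advantage values copied from the table above (home minus away points).
hca <- c(
  DEN = 4.46, POR = 4.29, ATL = 3.83, WSH = 3.72, MIL = 3.59, OKC = 3.51,
  MIA = 3.51, SAC = 3.50, GS = 3.45, SA = 3.43, UTAH = 3.40, IND = 3.32,
  NJ = 3.16, NO = 3.10, CLE = 3.08, DAL = 2.99, ORL = 2.88, TOR = 2.79,
  MEM = 2.74, DET = 2.54, LAL = 2.53, PHX = 2.53, NY = 2.44, CHA = 2.40,
  PHI = 2.21, BOS = 2.17, SEA = 2.04, LAC = 2.02, HOU = 1.58, MIN = 0.92,
  CHI = 0.82, BKN = 0.43
)

league_mean <- mean(hca)
league_sd   <- sd(hca)

# Knicks' rank (1 = largest advantage) and distance from the mean in SD units.
ny_rank <- unname(rank(-hca, ties.method = "min")["NY"])
ny_z    <- unname((hca["NY"] - league_mean) / league_sd)

round(c(mean = league_mean, sd = league_sd, ny_rank = ny_rank, ny_z = ny_z), 2)
```

A z-score well inside one standard deviation of the mean is consistent with reading the Knicks' home-court advantage as unremarkable rather than unusually weak.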
4.5 —————————————————————————–
4.6 Q2: Do visiting players play differently at MSG than other arenas?
4.6.1 For context, let’s look at the league-wide home vs. away comparisons.
league_home_away <- pb %>%
filter(!did_not_play, minutes > 0) %>%
mutate(
location = if_else(home_away == "home", "Home", "Away"),
ts = points / (2 * (field_goals_attempted + 0.44 * free_throws_attempted))
) %>%
group_by(location) %>%
summarise(
games = n_distinct(game_id),
pts = mean(points, na.rm = TRUE),
ts = mean(ts, na.rm = TRUE),
tov = mean(turnovers, na.rm = TRUE),
offensive_output = mean(offensive_output, na.rm = TRUE),
defensive_output = mean(defensive_output, na.rm = TRUE),
.groups = "drop"
)
league_home_away
#> # A tibble: 2 × 7
#> location games pts ts tov offensive_output
#> <chr> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 Away 31229 9.84 0.519 1.33 16.0
#> 2 Home 31229 10.1 0.531 1.31 16.6
#> # ℹ 1 more variable: defensive_output <dbl>
t.test(points ~ home_away, data = pb)
#>
#> Welch Two Sample t-test
#>
#> data: points by home_away
#> t = -13.268, df = 645877, p-value < 2.2e-16
#> alternative hypothesis: true difference in means between group away and group home is not equal to 0
#> 95 percent confidence interval:
#> -0.3146714 -0.2336676
#> sample estimates:
#> mean in group away mean in group home
#> 9.836286 10.110455
# Players score more points (+0.27 PTS/G) at home vs. away (p-value < .001).
t.test(ts ~ home_away, data = pb)
#>
#> Welch Two Sample t-test
#>
#> data: ts by home_away
#> t = -18.207, df = 618892, p-value < 2.2e-16
#> alternative hypothesis: true difference in means between group away and group home is not equal to 0
#> 95 percent confidence interval:
#> -0.01308774 -0.01054380
#> sample estimates:
#> mean in group away mean in group home
#> 0.5192728 0.5310886
# Players score more efficiently (+1.18 TS percentage points) at home vs. away (p-value < .001).
t.test(turnovers ~ home_away, data = pb)
#>
#> Welch Two Sample t-test
#>
#> data: turnovers by home_away
#> t = 7.8553, df = 645871, p-value = 3.996e-15
#> alternative hypothesis: true difference in means between group away and group home is not equal to 0
#> 95 percent confidence interval:
#> 0.02064273 0.03436863
#> sample estimates:
#> mean in group away mean in group home
#> 1.332744 1.305238
# Players turn the ball over less often (-0.03 TO/G) at home vs. away (p-value < .001).
t.test(offensive_output ~ home_away, data = pb)
#>
#> Welch Two Sample t-test
#>
#> data: offensive_output by home_away
#> t = -17.986, df = 645721, p-value < 2.2e-16
#> alternative hypothesis: true difference in means between group away and group home is not equal to 0
#> 95 percent confidence interval:
#> -0.5773124 -0.4638555
#> sample estimates:
#> mean in group away mean in group home
#> 16.04081 16.56139
# Players produce more offensive stats (+0.52) at home vs. away (p-value < .001).
t.test(defensive_output ~ home_away, data = pb)
#>
#> Welch Two Sample t-test
#>
#> data: defensive_output by home_away
#> t = -14.912, df = 645133, p-value < 2.2e-16
#> alternative hypothesis: true difference in means between group away and group home is not equal to 0
#> 95 percent confidence interval:
#> -0.05672701 -0.04354752
#> sample estimates:
#> mean in group away mean in group home
#> 1.178608 1.228745
# Players produce more defensive stats (+0.05) at home vs. away (p-value < .001).
league_long <- league_home_away %>%
pivot_longer(
cols = c(pts, ts, tov, offensive_output, defensive_output),
names_to = "metric",
values_to = "value"
) %>%
mutate(
metric = recode(metric,
pts = "Points per player-game",
ts = "True Shooting (TS%)",
tov = "Turnovers per player-game",
offensive_output = "Offensive Output (PTS + REB + AST)",
defensive_output = "Defensive Output (STL + BLK)"
)
)
league_long <- league_long %>%
group_by(metric) %>%
mutate(z_value = (value - mean(value)) / sd(value)) %>%
ungroup()
# Bar plot
ggplot(league_long, aes(x = location, y = value, fill = location)) +
geom_col(width = .85) +
facet_wrap(~ metric, scales = "free_y") +
labs(
title = "League-Wide Home vs Away Performance",
x = NULL,
y = NULL
) +
theme_minimal(base_size = 12) +
theme(
legend.position = "none",
strip.text = element_text(face = "bold")
)
4.6.1.1 Across the league, players perform better at home games than away games.
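With several hundred thousand player-games per group, even tiny differences produce very small p-values, so it helps to look at effect size as well. The sketch below approximates Cohen's d from the t statistics and degrees of freedom printed above. It assumes the home and away groups are roughly equal in size and variance; `d_from_t()` is a hypothetical helper, not a function from any package.

```r
# Approximate Cohen's d from a two-sample t statistic.
# Assumption: equal group sizes and similar variances, so
# n1 = n2 = (df + 2) / 2 and d = |t| * sqrt(1 / n1 + 1 / n2).
d_from_t <- function(t_stat, df) {
  n_per_group <- (df + 2) / 2
  abs(t_stat) * sqrt(2 / n_per_group)
}

# t statistics and Welch degrees of freedom reported in the tests above.
league_tests <- data.frame(
  metric = c("points", "ts", "turnovers", "offensive_output", "defensive_output"),
  t_stat = c(-13.268, -18.207, 7.8553, -17.986, -14.912),
  df     = c(645877, 618892, 645871, 645721, 645133)
)
league_tests$approx_d <- d_from_t(league_tests$t_stat, league_tests$df)
league_tests
```

Under these assumptions every approximate d comes out below 0.05, so the home advantage at the player-game level is statistically clear but small in magnitude.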
4.6.2 Let’s see if visiting players play better or worse at MSG compared to other away games.
# Do visiting players post better or worse averages at MSG compared to other arenas?
# Note: To isolate the effect of Madison Square Garden on visitors, this comparison uses away-team player rows only. Knicks players' home performances are therefore excluded, and the MSG group reflects visiting players exclusively.
opponent_msg_summary <- pb_road_flagged %>%
group_by(at_msg) %>%
summarise(
games = n_distinct(game_id),
pts = mean(points, na.rm = TRUE),
ts = mean(ts, na.rm = TRUE),
tov = mean(turnovers, na.rm = TRUE),
offensive_output = mean(offensive_output, na.rm = TRUE),
defensive_output = mean(defensive_output, na.rm = TRUE),
.groups = "drop"
)
opponent_msg_summary
#> # A tibble: 2 × 7
#> at_msg games pts ts tov offensive_output
#> <lgl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 FALSE 30254 9.83 0.519 1.33 16.0
#> 2 TRUE 975 9.96 0.529 1.30 16.1
#> # ℹ 1 more variable: defensive_output <dbl>
t.test(points ~ at_msg, data = pb_road_flagged)
#>
#> Welch Two Sample t-test
#>
#> data: points by at_msg
#> t = -1.5019, df = 10704, p-value = 0.1332
#> alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
#> 95 percent confidence interval:
#> -0.29435401 0.03896862
#> sample estimates:
#> mean in group FALSE mean in group TRUE
#> 9.832303 9.959996
# The difference in scoring (+0.13 PTS/G) by visiting players at MSG compared to other arenas is not statistically significant (p-value = 0.13).
t.test(ts ~ at_msg, data = pb_road_flagged)
#>
#> Welch Two Sample t-test
#>
#> data: ts by at_msg
#> t = -3.7425, df = 10246, p-value = 0.0001832
#> alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
#> 95 percent confidence interval:
#> -0.015206877 -0.004752634
#> sample estimates:
#> mean in group FALSE mean in group TRUE
#> 0.5189624 0.5289422
# The difference in shooting efficiency (+1.0 TS%) at MSG compared to other arenas *is* statistically significant (p-value < .001).
t.test(turnovers ~ at_msg, data = pb_road_flagged)
#>
#> Welch Two Sample t-test
#>
#> data: turnovers by at_msg
#> t = 2.2537, df = 10754, p-value = 0.02423
#> alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
#> 95 percent confidence interval:
#> 0.00415104 0.05959276
#> sample estimates:
#> mean in group FALSE mean in group TRUE
#> 1.333738 1.301866
# The difference in turnovers committed (-0.03 TO/G) at MSG compared to other arenas *is* statistically significant (p-value = .024).
t.test(offensive_output ~ at_msg, data = pb_road_flagged)
#>
#> Welch Two Sample t-test
#>
#> data: offensive_output by at_msg
#> t = -0.25467, df = 10709, p-value = 0.799
#> alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
#> 95 percent confidence interval:
#> -0.2619123 0.2016807
#> sample estimates:
#> mean in group FALSE mean in group TRUE
#> 16.03987 16.06998
# The difference in offensive stat creation (+0.03) is not statistically significant (p-value = 0.80).
t.test(defensive_output ~ at_msg, data = pb_road_flagged)
#>
#> Welch Two Sample t-test
#>
#> data: defensive_output by at_msg
#> t = 3.6609, df = 10766, p-value = 0.0002525
#> alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
#> 95 percent confidence interval:
#> 0.02229100 0.07367352
#> sample estimates:
#> mean in group FALSE mean in group TRUE
#> 1.180105 1.132122
# The difference in defensive stat creation (-0.05, i.e. fewer steals and blocks at MSG) *is* statistically significant (p-value < .001).
road_means <- pb_road_flagged %>%
group_by(at_msg) %>%
summarise(
ts = mean(ts, na.rm = TRUE),
turnovers = mean(turnovers, na.rm = TRUE),
defensive_output = mean(defensive_output, na.rm = TRUE),
.groups = "drop"
) %>%
mutate(
location = if_else(at_msg, "Away at MSG", "Away (Other)")
) %>%
select(location, ts, turnovers, defensive_output)
road_long <- road_means %>%
pivot_longer(
cols = c(ts, turnovers, defensive_output),
names_to = "metric",
values_to = "value"
) %>%
mutate(
metric = recode(metric,
ts = "True Shooting (TS%)",
turnovers = "Turnovers per player-game",
defensive_output = "Defensive Output (STL + BLK)"
),
location = factor(location, levels = c("Away (Other)", "Away at MSG"))
)
ggplot(road_long, aes(x = location, y = value, fill = location)) +
geom_col(width = 0.8) +
facet_wrap(~ metric, scales = "free_y") +
geom_hline(yintercept = 0, linetype = "dashed") +
labs(
title = "Visiting Player Performance: MSG vs Other Away Arenas",
x = NULL,
y = NULL
) +
theme_minimal(base_size = 12) +
# Apply custom theme tweaks after theme_minimal() so they are not overridden.
theme(
legend.position = "none",
strip.text = element_text(face = "bold"),
axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1)
)
4.6.2.1 Compared to other away games, visiting players shoot more efficiently and turn the ball over slightly less often at MSG, but they also record fewer steals and blocks (1.13 vs. 1.18 per player-game). Scoring and overall offensive output do not differ significantly. This offers partial support for the notion that playing at MSG elevates performances, at least in shooting efficiency for visiting players, although part of the gap could reflect the Knicks’ defense over this period rather than the arena itself. This leads us to our next question.
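Because five separate t-tests were run on the same visiting-player data, some adjustment for multiple comparisons is worth considering. The sketch below applies Holm's method via base R's `p.adjust()` to the p-values printed above; choosing Holm is an assumption of this sketch, not something the original analysis did.

```r
# p-values reported by the five MSG vs. other-arena t-tests above.
p_raw <- c(
  points           = 0.1332,
  ts               = 0.0001832,
  turnovers        = 0.02423,
  offensive_output = 0.799,
  defensive_output = 0.0002525
)

# Holm's step-down adjustment controls the family-wise error rate.
p_holm <- p.adjust(p_raw, method = "holm")
round(p_holm, 4)
```

After adjustment, the TS% and defensive-output differences remain below 0.05, while the turnover difference no longer does. The player-game rows are also not independent (the same players and games recur), so even the adjusted values are best read as rough guides.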
4.7 —————————————————————————–
4.8 Q3: Who benefits the most from playing at MSG?
4.8.1 Which players put up the best performances at MSG? (min = 8 games played at MSG)
4.8.1.1 IMPORTANT NOTE: These figures use away games played at MSG only, restricted to player-games of at least 15 minutes. Since Knicks players do not play away games at MSG, any current or former Knick who appears is represented only by games played at MSG as a visitor with another team.
player_msg_overall <- pb_road_flagged %>%
filter(at_msg == TRUE, !did_not_play, minutes >= 15) %>%
mutate(
total_output = offensive_output + defensive_output
) %>%
group_by(athlete_id, athlete_display_name) %>%
summarise(
games = n(),
avg_pts = mean(points, na.rm = TRUE),
avg_ts = mean(ts, na.rm = TRUE),
avg_off = mean(offensive_output, na.rm = TRUE),
avg_def = mean(defensive_output, na.rm = TRUE),
avg_tot = mean(total_output, na.rm = TRUE),
.groups = "drop"
)
player_msg_overall_clean <- player_msg_overall %>%
filter(games >= 8)
# Here are the top 20 players with the highest total outputs at MSG.
player_msg_overall_clean %>%
arrange(desc(avg_tot)) %>%
select(
athlete_display_name,
games,
avg_pts,
avg_ts,
avg_off,
avg_def,
avg_tot
) %>%
head(20)
#> # A tibble: 20 × 7
#> athlete_display_name games avg_pts avg_ts avg_off avg_def
#> <chr> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 Kobe Bryant 12 33.9 0.623 44.1 1.83
#> 2 Anthony Davis 9 28.6 0.602 42.6 3
#> 3 LeBron James 31 28.2 0.590 42.7 2.58
#> 4 Kevin Durant 11 31.2 0.659 43.4 1.55
#> 5 James Harden 14 27.6 0.633 42.1 2.71
#> 6 Joel Embiid 11 27.5 0.595 41.4 2.27
#> 7 Giannis Antetokounm… 20 23.4 0.593 40.0 2.5
#> 8 Stephen Curry 12 28.3 0.623 39.9 1.83
#> 9 Russell Westbrook 17 22.2 0.530 39.1 1.94
#> 10 Allen Iverson 12 26.6 0.469 38.1 2.5
#> 11 Trae Young 11 25.6 0.529 38.4 1.18
#> 12 Devin Booker 9 31.2 0.613 37.9 1.33
#> 13 Nikola Jokic 10 23.4 0.657 37.9 1.2
#> 14 Donovan Mitchell 11 26.6 0.577 36.5 1.82
#> 15 Dirk Nowitzki 15 26.5 0.636 36.5 1.47
#> 16 Jayson Tatum 14 23.6 0.572 35.1 2.57
#> 17 DeMarcus Cousins 9 20.2 0.528 34.3 3.22
#> 18 Zach LaVine 10 26.8 0.615 35.5 1.7
#> 19 Kyrie Irving 12 26 0.593 35.9 1.25
#> 20 Tracy McGrady 11 23.1 0.516 34.1 2.27
#> # ℹ 1 more variable: avg_tot <dbl>
# Here are the top 20 players with the highest true shooting % at MSG.
player_msg_overall_clean %>%
arrange(desc(avg_ts)) %>%
select(
athlete_display_name,
games,
avg_pts,
avg_ts,
avg_off,
avg_def
) %>%
head(20)
#> # A tibble: 20 × 6
#> athlete_display_name games avg_pts avg_ts avg_off avg_def
#> <chr> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 Joe Ingles 8 10.6 0.774 17.1 0.625
#> 2 Patrick Patterson 10 8.5 0.772 13.3 1.6
#> 3 DeAndre Jordan 12 11.1 0.749 21.7 2.08
#> 4 Rudy Gobert 11 12.4 0.739 22.5 2.27
#> 5 Kevin Martin 8 24.2 0.712 30 1.38
#> 6 Jae Crowder 11 14.1 0.690 19.4 1.09
#> 7 Jonas Jerebko 10 8.5 0.685 14.7 1
#> 8 Wally Szczerbiak 9 16.4 0.684 22.7 1
#> 9 Richaun Holmes 8 10.2 0.682 16.9 1.5
#> 10 Joe Harris 9 9.67 0.682 13.7 0.667
#> 11 Nick Collison 8 6.88 0.680 12.8 0.875
#> 12 Blake Griffin 10 21 0.676 30.7 1.6
#> 13 Cameron Johnson 8 16.2 0.674 21.5 0.75
#> 14 Kelly Olynyk 14 10.6 0.671 16.8 0.786
#> 15 Domantas Sabonis 12 19.1 0.670 34.1 0.75
#> 16 JJ Redick 15 16 0.666 20.4 0.333
#> 17 Ed Davis 9 9.67 0.665 17.8 1
#> 18 Draymond Green 12 10.6 0.664 23.8 2.92
#> 19 Corey Maggette 10 18.9 0.661 25 0.9
#> 20 Kevin Durant 11 31.2 0.659 43.4 1.55
4.8.2 Who steps up their game the most playing at MSG vs. other away games?
# Let's compute every player's MSG advantage score = (average offensive + defensive output at MSG away games - average offensive + defensive output at other away games).
player_msg_advantage <- pb_road_flagged %>%
filter(!did_not_play, minutes >= 15) %>%
mutate(total_output = offensive_output + defensive_output) %>%
group_by(athlete_id, athlete_display_name, at_msg) %>%
summarise(
games = n(),
avg_total = mean(total_output, na.rm = TRUE),
avg_pts = mean(points, na.rm = TRUE),
avg_ts = mean(ts, na.rm = TRUE),
.groups = "drop"
) %>%
pivot_wider(
names_from = at_msg,
values_from = c(games, avg_total, avg_pts, avg_ts)
) %>%
mutate(
msg_adv_total = avg_total_TRUE - avg_total_FALSE,
msg_adv_pts = avg_pts_TRUE - avg_pts_FALSE,
msg_adv_ts = avg_ts_TRUE - avg_ts_FALSE
)
player_msg_advantage <- player_msg_advantage %>%
filter(games_TRUE >= 8)
# Let's identify the top MSG risers and chokers.
msg_extremes <- bind_rows(
player_msg_advantage %>% arrange(desc(msg_adv_total)) %>% slice_head(n = 20) %>% mutate(group = "MSG Risers"),
player_msg_advantage %>% arrange(msg_adv_total) %>% slice_head(n = 20) %>% mutate(group = "MSG Chokers")
) %>%
mutate(
athlete_display_name = factor(athlete_display_name, levels = athlete_display_name[order(msg_adv_total)])
)
ggplot(msg_extremes, aes(x = msg_adv_total, y = athlete_display_name, fill = group)) +
geom_col(width = 0.75) +
facet_wrap(~ group, scales = "free_y") +
geom_vline(xintercept = 0, linetype = "dashed", color = "grey40") +
labs(
title = "MSG Risers and Chokers: Total Stat Output",
subtitle = "Within-player difference in total output at MSG vs other away games",
x = "MSG Total Output − Other Away Total Output",
y = NULL
) +
theme_minimal(base_size = 12) +
theme(legend.position = "none", strip.text = element_text(face = "bold"))
4.8.2.1 These are the players whose statistical outputs change the most playing at MSG vs. other arenas. Knicks fans, do any stand out in your memory? Do any surprise you?
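One caution when reading these rankings: with as few as 8 MSG games per player, large differences can arise by chance alone. The simulation below is a sketch under purely illustrative assumptions (the player count, game counts, and game-to-game standard deviation are made up, not estimated from the data). It shows how big the top "risers" would look if MSG had no effect at all.

```r
set.seed(1)

# Illustrative assumptions (not estimated from the data):
n_players <- 300  # players meeting the MSG games minimum
n_msg     <- 8    # MSG away games per player (the minimum used above)
n_other   <- 200  # other away games per player
game_sd   <- 10   # game-to-game SD of one player's total output

# Simulate per-player differences in mean output when the true MSG effect is zero.
null_diff <- replicate(n_players, {
  mean(rnorm(n_msg, mean = 0, sd = game_sd)) -
    mean(rnorm(n_other, mean = 0, sd = game_sd))
})

# Spread of the pure-noise differences and the 20 largest simulated "risers".
sd(null_diff)
head(sort(null_diff, decreasing = TRUE), 20)
```

Under these assumptions the noise-only differences have a standard deviation of roughly `game_sd * sqrt(1 / n_msg + 1 / n_other)`, so the extremes of the real ranking should be compared against that scale before being read as genuine MSG effects.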
4.8.3 Let’s also look at shooting efficiency.
ts_extremes <- bind_rows(
player_msg_advantage %>% arrange(desc(msg_adv_ts)) %>% slice_head(n = 20) %>% mutate(group = "TS Risers"),
player_msg_advantage %>% arrange(msg_adv_ts) %>% slice_head(n = 20) %>% mutate(group = "TS Chokers")
) %>%
mutate(
athlete_display_name = factor(athlete_display_name, levels = athlete_display_name[order(msg_adv_ts)])
)
ggplot(ts_extremes, aes(x = msg_adv_ts, y = athlete_display_name, fill = group)) +
geom_col(width = 0.75) +
facet_wrap(~ group, scales = "free_y") +
geom_vline(xintercept = 0, linetype = "dashed", color = "grey40") +
labs(
title = "MSG Risers and Chokers: Shooting Efficiency",
subtitle = "Within-player difference in True Shooting at MSG vs other away games",
x = "MSG Advantage (TS%)",
y = NULL
) +
theme_minimal(base_size = 12) +
theme(legend.position = "none", strip.text = element_text(face = "bold"))
4.8.3.1 These are the players whose shooting efficiency changed the most playing at MSG vs. other arenas.
4.8.4 How do the stars of the NBA today perform at MSG compared to other venues?
# Let's make a dataset using recent all-stars from 2024 and 2025.
recent_all_stars <- c(
"LeBron James", "Stephen Curry", "Kevin Durant", "Giannis Antetokounmpo",
"Nikola Jokic", "Joel Embiid", "Luka Doncic", "Jayson Tatum",
"Jimmy Butler", "Damian Lillard", "Anthony Davis", "Kawhi Leonard",
"Shai Gilgeous-Alexander", "Devin Booker", "Jaylen Brown", "Kyrie Irving",
"Tyrese Haliburton", "Donovan Mitchell", "Bam Adebayo", "Jalen Brunson",
"Anthony Edwards", "Julius Randle", "Trae Young", "Pascal Siakam",
"James Harden", "Jalen Williams", "Evan Mobley", "Victor Wembanyama",
"Cade Cunningham", "Tyler Herro", "Jaren Jackson Jr.", "Darius Garland",
"Alperen Sengun", "Tyrese Maxey", "Paolo Banchero", "Scottie Barnes"
)
# Remove the MSG game minimum for the young guys on this list. (This is ugly but I figured it'd work)
player_msg_advantage1 <- pb_road_flagged %>%
filter(!did_not_play, minutes >= 15) %>%
mutate(total_output = offensive_output + defensive_output) %>%
group_by(athlete_id, athlete_display_name, at_msg) %>%
summarise(
games = n(),
avg_total = mean(total_output, na.rm = TRUE),
avg_pts = mean(points, na.rm = TRUE),
avg_ts = mean(ts, na.rm = TRUE),
.groups = "drop"
) %>%
pivot_wider(
names_from = at_msg,
values_from = c(games, avg_total, avg_pts, avg_ts)
) %>%
mutate(
msg_adv_total = avg_total_TRUE - avg_total_FALSE,
msg_adv_pts = avg_pts_TRUE - avg_pts_FALSE,
msg_adv_ts = avg_ts_TRUE - avg_ts_FALSE
)
allstar_msg_adv <- player_msg_advantage1 %>%
filter(athlete_display_name %in% recent_all_stars) %>%
mutate(
msg_adv_ts = msg_adv_ts * 100)
# Let's make a scatterplot with TS% change on the y axis and total output change on the x axis.
library(ggrepel)
ggplot(player_msg_advantage1,
aes(x = msg_adv_total, y = msg_adv_ts)) +
geom_point(
data = allstar_msg_adv,
size = 3
) +
geom_text_repel(
data = allstar_msg_adv,
aes(label = athlete_display_name),
size = 3,
color = "blue",
max.overlaps = Inf
) +
geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
geom_vline(xintercept = 0, linetype = "dashed", color = "grey50") +
labs(
title = "How Recent NBA All-Stars Perform at Madison Square Garden",
subtitle = "Differences in total output (x) and shooting efficiency (y) at MSG vs. other away games",
x = "Change in Total Stat (Offensive + Defensive) Output",
y = "Change in TS percentage points"
) +
theme_minimal(base_size = 12)