The 2025 Brooklyn Open Data Collection: Analyst Portfolios: 3 Environmental Stressors and Social Complaints in New York City

3 Environmental Stressors and Social Complaints in New York City

Author: Emma Valentina

Climate change increasingly affects urban environments through flooding, extreme weather, and infrastructure strain. Beyond physical impacts, environmental stressors may influence psychological well-being and social behavior, particularly in dense urban settings such as New York City. Environmental stress theory suggests that chronic exposure to environmental risk can heighten stress, frustration, and conflict within communities.

This project investigates whether environmental risk exposure, measured through flooding-related complaints, is associated with social indicators of anxiety, operationalized as noise complaints across New York City. Using publicly available NYC Open Data, it demonstrates how open civic data can be used to explore psychological and behavioral patterns at the community level in a reproducible way.

This analysis is framed as a proposal suitable for submission to NYC Open Data Week 2026 and emphasizes transparency, reproducibility, and civic relevance.

3.1 Research Question

How does exposure to environmental risk (measured by flooding complaints) relate to social indicators of anxiety (measured by noise complaints) across New York City boroughs?

3.2 Data Sources

This project uses four datasets from the NYC Open Data Portal:

  1. 311 Service Requests (Noise Complaints subset)
  2. 311 Service Requests (Flooding-related complaints subset)
  3. E-Designations
  4. Hazard Mitigation Plan - Mitigation Actions Database

Due to the size of the full 311 Service Requests dataset, noise complaint data were accessed via the NYC Open Data API. Smaller supporting datasets were downloaded and imported as Excel files. This mixed-format approach improves efficiency while maintaining transparency and reproducibility.

3.3 Reproducible Workflow

library(tidyverse)
library(lubridate)
library(readxl)
library(httr)
library(jsonlite)

knitr::opts_chunk$set(
  message = FALSE,
  warning = FALSE
)

3.4 Loading Downloaded Excel Datasets

edesignation <- read_excel(
  "E-Designations.xlsx",
  na = c("N/A", "NA", "")
)

hazard_mitigation <- read_excel(
  "Hazard_Mitigation_Plan.xlsx",
  na = c("N/A", "NA", "")
)

street_flooding <- read_excel(
  "Street_Flooding_Complaints.xlsx",
  na = c("N/A", "NA", "")
)

3.5 Accessing NYC Open Data via API (311 Noise Complaints)

To avoid downloading the full 311 dataset, noise complaints were retrieved using the NYC Open Data API. Only relevant variables and records were requested.

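# Request a Socrata endpoint and return the JSON payload as a tibble.
get_nyc_data <- function(url) {
  response <- GET(url)
  # Stop on HTTP errors instead of trying to parse an error body.
  stop_for_status(response)
  content(response, as = "text", encoding = "UTF-8") |>
    fromJSON(flatten = TRUE) |>
    as_tibble()
}

# Noise complaints created since 2018, capped at 50,000 rows.
noise_url <- URLencode(paste0(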
  "https://data.cityofnewyork.us/resource/erm2-nwe9.json?",
  "$select=unique_key,created_date,complaint_type,borough,city,",
  "x_coordinate_state_plane,y_coordinate_state_plane&",
  "$where=complaint_type LIKE 'Noise%' ",
  "AND created_date >= '2018-01-01T00:00:00'&",
  "$limit=50000"
))

noise <- get_nyc_data(noise_url)
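
Because the request above caps the result at 50,000 rows, one alternative is to let the API do the counting so that only one row per borough and year comes back. The sketch below only builds such a query URL; it does not send the request. It is an illustration rather than the method used in this chapter: the helper name `build_soql_url` is hypothetical, and the use of `date_extract_y()` with `$group` is an assumption about how the query could be phrased in SoQL.

```r
# Hypothetical helper: assemble a SoQL query URL from named parameters.
build_soql_url <- function(base_url, params) {
  # Percent-encode each value so spaces, quotes, and % survive transport.
  encoded <- vapply(
    params,
    function(value) URLencode(value, reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste0(names(params), "=", encoded, collapse = "&"))
}

# Assumed query: count noise complaints per borough and year on the server.
agg_url <- build_soql_url(
  "https://data.cityofnewyork.us/resource/erm2-nwe9.json",
  c(
    "$select" = "borough,date_extract_y(created_date) AS year,count(*) AS noise_complaints",
    "$where"  = "complaint_type LIKE 'Noise%' AND created_date >= '2018-01-01T00:00:00'",
    "$group"  = "borough,year",
    "$limit"  = "50000"
  )
)
```

Passing `agg_url` to `get_nyc_data()` should return the aggregated counts directly. The year and count columns may arrive as character values from the JSON response, in which case they would need converting with `as.integer()` before any join.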

3.6 Data Cleaning and Preparation

Data were cleaned to ensure consistent borough labeling and proper date formatting. Complaint counts were aggregated at the borough-year level to allow comparisons across time and location.

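noise_clean <- noise |>
  filter(!is.na(borough)) |>
  mutate(
    # The API returns ISO 8601 timestamps such as "2018-01-01T00:00:00.000".
    created_date = ymd_hms(created_date),
    year = year(created_date),
    # Standardize labels so they match the flooding data when joined.
    Borough = toupper(trimws(borough))
  ) |>
  group_by(Borough, year) |>
  summarise(
    noise_complaints = n(),
    .groups = "drop"
  )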

flood_clean <- street_flooding |>
  filter(!is.na(Borough)) |>
  mutate(
    # Standardize labels so they match the noise data when joined.
    Borough = toupper(trimws(Borough)),
    created_date = as.POSIXct(`Created Date`),
    year = year(created_date)
  ) |>
  group_by(Borough, year) |>
  summarise(
    flood_complaints = n(),
    .groups = "drop"
  )
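
The cleaning steps above can be illustrated on a small example. The following sketch uses made-up records (not the project's data) to show why borough labels are standardized before counting: without it, "BROOKLYN" and "Brooklyn " would be tallied as separate groups. The year is taken from the first four characters of the timestamp, which assumes ISO 8601 strings.

```r
library(dplyr)

# Toy records with inconsistent borough labels (illustrative values only).
toy_requests <- tibble(
  borough = c("BROOKLYN", "Brooklyn ", "QUEENS", NA, "QUEENS"),
  created_date = c(
    "2019-03-01T10:15:00.000", "2019-07-04T22:30:00.000",
    "2019-01-12T08:00:00.000", "2020-05-05T12:00:00.000",
    "2020-09-09T23:45:00.000"
  )
)

toy_counts <- toy_requests |>
  # Drop records with no borough, as in the cleaning step above.
  filter(!is.na(borough)) |>
  mutate(
    # Trim whitespace and upper-case so spelling variants collapse.
    Borough = toupper(trimws(borough)),
    # ISO 8601 timestamps start with the four-digit year.
    year = as.integer(substr(created_date, 1, 4))
  ) |>
  # One row per borough-year with the number of complaints.
  count(Borough, year, name = "complaints")
```

Here the two Brooklyn records from 2019 fall into a single BROOKLYN row, and the record with a missing borough is excluded.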

3.7 Merging Datasets

complaints <- left_join(
  flood_clean,
  noise_clean,
  by = c("Borough", "year")
)
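
A left join keeps every flooding row, so any borough-year without a matching noise count ends up with a missing value (the summary output in the next section reports 57 such rows). A quick way to see which keys fail to match is an anti-join. The sketch below uses made-up tables named `flood_toy` and `noise_toy`, which are hypothetical stand-ins rather than the project's data.

```r
library(dplyr)

# Illustrative borough-year counts (values are invented).
flood_toy <- tibble(
  Borough = c("BRONX", "BRONX", "QUEENS"),
  year = c(2016L, 2019L, 2019L),
  flood_complaints = c(120L, 150L, 300L)
)
noise_toy <- tibble(
  Borough = c("BRONX", "QUEENS"),
  year = c(2019L, 2019L),
  noise_complaints = c(900L, 1400L)
)

# Rows of the left table with no partner in the right table.
unmatched <- anti_join(flood_toy, noise_toy, by = c("Borough", "year"))

# An inner join keeps only borough-years observed in both sources.
overlap <- inner_join(flood_toy, noise_toy, by = c("Borough", "year"))
```

In this toy case `unmatched` holds the single 2016 Bronx row, which a left join would carry forward with a missing noise count. Whether to keep such rows or restrict the analysis to the overlap is a design choice; `lm()` drops rows with missing values by default either way.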

3.8 Descriptive Statistics

summary(complaints)
#>    Borough               year      flood_complaints
#>  Length:87          Min.   :2010   Min.   :   1.0  
#>  Class :character   1st Qu.:2014   1st Qu.: 220.5  
#>  Mode  :character   Median :2018   Median : 394.0  
#>                     Mean   :2018   Mean   : 512.8  
#>                     3rd Qu.:2022   3rd Qu.: 765.5  
#>                     Max.   :2025   Max.   :1571.0  
#>                                                    
#>  noise_complaints  
#>  Min.   :    1.00  
#>  1st Qu.:    8.25  
#>  Median :  528.00  
#>  Mean   : 1654.70  
#>  3rd Qu.: 1235.25  
#>  Max.   :12350.00  
#>  NA's   :57

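This summary provides an overview of variation in environmental risk exposure and social complaint behavior across boroughs and years. Noise counts are missing (NA) for 57 of the 87 borough-year rows, so only 30 rows carry both measures.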

3.9 Visualization 1: Flooding Complaints by Borough

ggplot(flood_clean, aes(x = Borough, y = flood_complaints)) +
  geom_col(fill = "#2a9d8f") +
  labs(
    title = "Street Flooding Complaints by Borough",
    x = "Borough",
    y = "Number of Flooding Complaints"
  ) +
  theme_minimal()

3.10 Visualization 2: Flooding and Noise Complaints

# Standardize borough labels so they match the keys in the manual color scale.
complaints <- complaints |>
  mutate(Borough = toupper(trimws(Borough)))

ggplot(complaints, aes(x = flood_complaints, y = noise_complaints, color = Borough)) +
  geom_point(size = 3, alpha = 0.7) +
  geom_smooth(method = "lm", se = TRUE, color = "black") +
  scale_color_manual(values = c(
    "MANHATTAN" = "#2a9d8f",
    "BROOKLYN" = "#264653",
    "BRONX" = "#8ac926",
    "QUEENS" = "#56c596",
    "STATEN ISLAND" = "steelblue"
  )) + 
  labs(
    title = "Relationship Between Flooding and Noise Complaints",
    x = "Flooding Complaints",
    y = "Noise Complaints"
  ) +
  theme_minimal()

3.11 Statistical Analysis

model <- lm(noise_complaints ~ flood_complaints, data = complaints)
summary(model)
#> 
#> Call:
#> lm(formula = noise_complaints ~ flood_complaints, data = complaints)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -1848.6 -1543.1 -1220.2  -332.9 10785.9 
#> 
#> Coefficients:
#>                   Estimate Std. Error t value Pr(>|t|)
#> (Intercept)      1487.2655   881.3323   1.688    0.103
#> flood_complaints    0.2803     1.1848   0.237    0.815
#> 
#> Residual standard error: 2876 on 28 degrees of freedom
#>   (57 observations deleted due to missingness)
#> Multiple R-squared:  0.001994,   Adjusted R-squared:  -0.03365 
#> F-statistic: 0.05595 on 1 and 28 DF,  p-value: 0.8147
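
Written out, the model above is a simple linear regression of borough-year noise counts on flooding counts. Substituting the reported coefficients gives the fitted line, and the t value for the slope is the estimate divided by its standard error:

```latex
\begin{aligned}
\text{noise}_{i} &= \beta_0 + \beta_1\,\text{flood}_{i} + \varepsilon_{i} \\
\widehat{\text{noise}}_{i} &= 1487.27 + 0.2803\,\text{flood}_{i} \\
t_{\beta_1} &= \frac{0.2803}{1.1848} \approx 0.237
\end{aligned}
```

With 28 residual degrees of freedom, a t value this small corresponds to the reported p-value of 0.815.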

3.12 Results

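The regression analysis evaluates whether higher levels of flooding complaints are associated with increased noise complaints. The estimated slope is positive but small and not statistically significant (b = 0.28, SE = 1.18, p = 0.815), and flooding complaints explain about 0.2% of the variance in noise complaints (R-squared = 0.002) across the 30 borough-year observations that have both measures. A positive relationship would align with environmental stress theory, suggesting that climate-related disruptions may be linked to heightened social tension and anxiety-related behaviors, but these data do not provide evidence of one.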
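
When reporting a regression like this one, it can help to pull the key quantities out of the fitted object rather than reading them off the printed summary. The sketch below shows one way to do so in base R. It fits a model to a small made-up data frame (`toy`, with illustrative values only), so the numbers it produces are not the project's results.

```r
# Illustrative values only; not the project's data.
toy <- data.frame(
  flood_complaints = c(100, 200, 300, 400, 500),
  noise_complaints = c(1100, 1350, 1200, 1500, 1450)
)

toy_model <- lm(noise_complaints ~ flood_complaints, data = toy)

# Coefficient table: estimate, standard error, t value, p-value.
coef_table <- summary(toy_model)$coefficients
slope      <- coef_table["flood_complaints", "Estimate"]
p_value    <- coef_table["flood_complaints", "Pr(>|t|)"]

# 95% confidence interval for the slope.
slope_ci <- confint(toy_model, "flood_complaints", level = 0.95)

# Share of variance in noise complaints explained by the model.
r_squared <- summary(toy_model)$r.squared
```

The same calls applied to `model` would return the slope, p-value, and R-squared shown in the printed output, along with a confidence interval that conveys how imprecisely the slope is estimated.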

3.13 Discussion

This project demonstrates how climate-related environmental stressors may influence social behavior in urban environments. Flooding represents a tangible, recurring disruption, while noise complaints may reflect increased frustration or conflict within communities. Together, these indicators provide insight into how climate risk may shape everyday social experiences in New York City.

By integrating psychological theory with open civic data, this analysis highlights the value of interdisciplinary approaches to understanding climate impacts.

3.14 Limitations and Future Directions

311 complaint data reflect reporting behavior rather than direct psychological measurement and may vary by access to city services or trust in government. The noise request was also limited to 50,000 records created since 2018, so noise counts are unavailable for earlier flooding years and may undercount later ones. Future work could incorporate demographic or spatial analyses, examine temporal patterns around extreme weather events, or integrate measures of pro-environmental behavior and peer influence.

3.15 Connection to Open Data

This project emphasizes transparency and reproducibility by combining downloaded open datasets with API-based data access. NYC Open Data enables researchers and communities to explore complex environmental and social issues using publicly accessible information, supporting informed decision-making and civic engagement.

3.16 Conclusion

Environmental stressors associated with climate change may have meaningful social and psychological consequences in urban settings. Using NYC Open Data, this project provides a reproducible framework for examining how environmental risk relates to social indicators of anxiety across New York City. These findings underscore the importance of incorporating psychological perspectives into climate resilience planning.
