Churches Out, Tattoo Parlors In: Measuring Retail Turnover in New York City’s “Gentrification Core”
Anyone who has lived in or visited New York City, or even seen it on the screen, knows that it is a city of commerce and activity. Part and parcel of that is that it is teeming with retail. To support the densest and most vertical city in the nation requires a tight network of storefronts hawking everything from a coffee or a pack of cigarettes to couches and electronics, to say nothing of the myriad services and experiences like nail salons and escape rooms. This is true even today, just a few years after a global pandemic made a danger of physical proximity and turbocharged the shift to online shopping.
But retail is a notoriously tricky business, and in a fast-paced city like New York, it should be no surprise that close to half of the stores operating today weren’t here before the pandemic! New storefronts also serve a purpose beyond hawking bacon, egg & cheese sandwiches and haircuts: they give passersby quick visual cues about the nature of a neighborhood, especially when they conform to or undermine a preexisting perception of that nature. You probably wouldn’t bat an eye at a French leather goods store on Madison Avenue, but it’s newsworthy when one opens in a formerly industrial waterfront district.[1]
“Gentrification” as a concept is hotly debated on social media and in community meetings, but it’s not strictly defined in the social sciences and can even be a controversial academic subject. It was coined by a British sociologist in the 1960s to describe middle-class residents moving to and fixing up areas that had formerly been working-class.[2] It’s important to consider alongside the mid-century white flight to the suburbs, dismantling of the social safety net in the 1980s, and the relative urban safety that followed the crack and murder epidemic of the 1990s.
But whereas displacement is a concrete term that refers to the actual removal of residents, often by great structural and capitalistic forces, gentrification doesn’t necessarily entail any end result like that, and is the work not of real estate corporations or municipalities, but of individuals like you and me (although capital does play a big part). And even if your average urban dweller probably couldn’t tell you if or how much local rents had changed over the past few years, they will likely be able to point to the new coffee shop or the closed auto body shop as signs that things aren’t what they were.
We know from decades of urban geographic research that race, income and housing precarity are spatially nonrandom, often thanks to a history of government-sanctioned segregation and home-lending practices. Moore & Diez Roux showed not only that Census tracts in three U.S. cities were highly racially segregated, but that the distribution of types of food establishment was as well, with negative health implications for racial minorities (2006). Baghestani et al. show that access to subway stations is highly differentiated by race and income, despite the fact that the system was built over a hundred years ago (2024). McLafferty and Grady used kernel estimation to map the density of prenatal clinics in Brooklyn against accessibility to different ethnic neighborhoods (2005).
But retail is also spatially nonrandom in a way that we already know about: through zoning, New York City and countless other municipalities dictate what kinds of uses can go where, often with admirable intentions (e.g. keeping residents safely distant from a source of pollution), but potentially also with harmful and/or unforeseen side effects. So the question this exercise seeks to answer is: are the types of storefronts closing and opening in gentrifying neighborhoods (defined narrowly later) different from retail turnover elsewhere in the city?
Data sources & description
The primary source of my storefront data is from a company called Live XYZ. Live XYZ bills itself as a “comprehensive database of NYC’s 150,000+ storefronts and public spaces in every borough on every block along with occupancy data, vacancy data, contact information, and more” (LiveXYZ). The storefront data are collected and re-verified by a team of human scouts on a rolling neighborhood-by-neighborhood basis. As a city worker, I have access to this directory under an agreement with the New York City Office of Technology and Innovation. I hope to not only use this dataset for my capstone project, but also for a future neighborhood-level retail analysis in my work for the Department of Housing Preservation & Development.
In April 2025, when the data for this project were downloaded, there were 209,952 operating storefronts in the five boroughs, roughly one for every 40 New York City residents. These locations are plotted as dots in Figure 1, where Manhattan is a virtual yellow chunk of land, but even other parts of the city are dense enough that you can clearly make out the voids: JFK airport, Flushing-Meadows Park, Prospect Park and the Green-Wood Cemetery. The further you get from the center, the more you can see long spindly tails where retailers cluster along stretches of road.
The stores are categorized into 16 different “macro”-categories (arts & culture, auto, body, drinks, entertainment, essentials, fashion, food, groups, home & hobby, lodging, miscellaneous, municipal, parks & recreation, services, transport), of which food is by far the most represented, with about 20% of all storefront points. The analysis that follows, comparing the gentrified part of the city with the rest of the city, will hinge on the distribution of the subcategories under these larger ones, of which there are over 1,000!
Figure 1: Overview of the universe of storefronts captured by the data source in April 2025 with the status of “operating”
For the demographic data used to identify the gentrifying/gentrified part of the city to focus on, I started with tables downloaded directly from the U.S. Census American Community Survey (ACS) for Census tracts within the five boroughs for two different time periods: 2018 and 2023. For reasons discussed below, I then switched to ACS data that had already been aggregated to the Community District (CD) level by the NYU Furman Center, a university resource dedicated to housing and socioeconomic inequality issues.
At first, I had chosen only race and income data from the U.S. Census to serve as my measures of whether a tract was experiencing gentrification. I wanted to look at the change in the white share of the population between the two time points, as well as the change in median income. Again, gentrification is a slippery term, but these two measures are, at the least, widely accepted indicators in the discourse.[3] Once I started browsing the data available from the Furman Center at the CD level, I added percent change in gross apartment rent.
Data conditioning and transformation
The preparation and analysis of my data relied on two principal steps: first, to identify the spatial definition of gentrification to focus on, and second, to calculate the storefront difference between that zone and the rest of the city by looking at subcategories of openings and closings. For the first part, I conducted a spatial join of my demographic data from the Census and CoreData spreadsheets to shapefiles for Census tracts and Community Districts, respectively. If I hadn’t chosen to switch from the Census tract level to the CD level (discussed in greater detail below), I would have had to do much more manual conditioning of the Census data.
For the storefront data, I had to manually code opening dates for the individual businesses, as this data point wasn’t provided as a field in the raw data download. Live XYZ has only been around since 2015, and they didn’t complete a full initial sweep of the entire city’s businesses until January 2019. So each row of storefront data has a “space creation date,” which is the date Live XYZ entered the physical retail space into their database, and a “place creation date,” which is the first date Live XYZ logged the business that is or was inhabiting that space. To give the example of my pharmacy: 200 Water Street, New York, NY 10038 would be the “space,” while Duane Reade would be the currently inhabiting “place.”
So where the space creation and place creation dates are the same, that was the first capture of an existing business, and we don’t know how old that business is. But if the place creation date is later than the space creation date, the business most likely opened shortly before the place creation date: Live XYZ had already identified the space, and this was a newly operating business there. This is the best proxy I could come up with for newly opened businesses.
All places tagged “permanently closed” as a status are considered closed. This includes ones that were classed as “newly opened” by the logic above, so businesses that opened during the observation period (2018-2025) but had been reclassified as closed by the time I downloaded the data in April 2025 would be considered closed.
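The classification rules above can be sketched in a few lines of Python. This is only an illustration, not the workflow actually used for the project: the Storefront record and its field names (space_created, place_created, status) are hypothetical stand-ins for the Live XYZ columns, and the sketch assumes a business counts as newly opened when its place creation date falls after the space creation date, since the space was already in the database when the business was logged.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Storefront:
    # Hypothetical field names standing in for the raw download's columns.
    space_created: date  # date the physical retail space was first logged
    place_created: date  # date the occupying business was first logged
    status: str          # e.g. "operating" or "permanently closed"


def classify(row: Storefront) -> str:
    """Label a row as 'closed', 'opened', or 'existing'."""
    if row.status == "permanently closed":
        # Closure takes precedence, even for businesses that opened
        # during the observation period.
        return "closed"
    if row.place_created > row.space_created:
        # The space was already known, so this business arrived later.
        return "opened"
    # Same dates: first capture of a business of unknown age.
    return "existing"


# Made-up example rows, one per outcome.
rows = [
    Storefront(date(2019, 1, 15), date(2019, 1, 15), "operating"),
    Storefront(date(2019, 1, 15), date(2022, 6, 1), "operating"),
    Storefront(date(2019, 1, 15), date(2021, 3, 10), "permanently closed"),
]
labels = [classify(r) for r in rows]
print(labels)  # ['existing', 'opened', 'closed']
```

Checking the closed status first mirrors the rule that a business which opened and then closed within the observation period is counted as closed.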
Analysis
Again, the process for arriving at my results relies first on defining what will be my zone of interest vis-a-vis gentrification, and then doing a comparison of the retail turnover in that district. After first experimenting with tract-level cluster analysis in ArcGIS, I ultimately decided to manually group a selection of CDs based on the multivariate combination of three gentrification indicators: white population share shift, high earner share shift (households earning $100,000 or more in 2023 dollars), and percent increase in rent. From that newly identified “gentrification core,” I identified the storefront points that lay “entirely within” the boundaries of the newly dissolved layer, and broke them down by subcategory and compared that list with those outside the boundaries—the “rest of the city.”
At first I chose the Census tract as the sub-geography to map the changes in income and share of the white population, with data downloaded from the US Census website. From these I created two Anselin LISA cluster maps to show those areas where Census tracts appeared together in the extremes (clusters) and those areas that stood out against their neighbors (outliers) for each measure. But this approach yielded challenges on two fronts.
First, some Census tract boundaries changed shape or were split up in the middle of my comparison timeline, notably when tracts were redrawn following the 2020 Census. This is particularly problematic for an analysis like mine, because areas where new Census tracts are being created are by definition areas of population change, particularly where new people are moving in. For example, the waterfront areas of northern Brooklyn and Long Island City in Queens, where we know there has been great population growth due to a Bloomberg administration rezoning,[4] are missing from my maps.
I think I could have remedied this by manually combining the raw figures from the (often two) new Census tracts that were present in the post-2020 American Community Survey to create a composite figure that would spatially equate with the Census tract that existed pre-2020. But this was going to be not only laborious, but also a potentially imprecise and misleading exercise—I didn’t know for sure that Census Tract 101 pre-2020 always matches with Census Tracts 101.1 and 101.2 post-2020. Besides, I knew there was a sub-geography whose boundaries do not change often—the Community District (CD). And as we’ll see with the CD maps below, these larger subdivisions offer clearer geographic trends for my indicators, the second challenge I identified in this tract-level series.
Figure 2: Shift in the proportion of white population by Census tract, 2018-2023 (left), and cluster/outlier analysis of same (right).
Figure 3: Median income change by Census tract (%), 2018-2023 (left), and cluster/outlier analysis of same (right).
Nevertheless, we can glean some interesting insights from these tract-level maps, especially as they support what we’ll see later at the CD level, only in finer detail. The densest clusters of the shift in the white share of the population can be seen in the Bed-Stuy/central Brooklyn area, as well as in the southwestern Bronx and the Jamaica section of Queens (Figure 2). For change in median income, the only truly standout and contiguous Census tracts are in the same Bed-Stuy/central Brooklyn core seen in the white shift, but this time extending north to Williamsburg, east to Bushwick, and west to downtown Brooklyn and Park Slope (Figure 3). This almost completely overlaps with what I’ll call the “gentrification core” in the CD-level analysis.
The bivariate map combining white shift and income change underscores the analysis above, but is a little hard to read: there are simply so many Census tracts in tight proximity doing different things! The CD-level maps below will be much easier to interpret at a city-wide level, even if we have to look at three side-by-side. But helpfully, the Bed-Stuy section still stands out as a patch of dark blue (Figure 4).
Figure 4: Bivariate analysis combining median income change by Census tract (%), 2018-2023 (pink), and shift in the proportion of white population by Census tract, 2018-2023 (teal), and where high changes in both overlap (dark blue).
The CD-level data I obtained from the NYU Furman CoreData portal not only corrects for the changing boundaries of the Census tracts, but also takes some of the work out of identifying clusters. Looking at a map of a geography as extensive and complex as New York City’s, it’s a lot easier to identify trends from 59 Community Districts than from hundreds or thousands of tiny tracts. In Figure 5, you can see the choropleth maps of my three gentrification indicators (shift in white population share, shift in share of high earners, percent increase in rent) arranged side-by-side. Using only four classes of Jenks natural breaks classification, we can clearly see the CDs that scored highly on each measure.
Six CDs score high for white share shift; except for Manhattan 10 (central Harlem), all are in north central Brooklyn. For high earner shift, all six high-scoring CDs are in north central Brooklyn, as are all four for rent increase. But across all three measures, three CDs score consistently high, and they happen to be contiguous: Brooklyn CDs 2, 3 and 4... what a cluster! This district, stretching from Brooklyn Heights and DUMBO on the East River waterfront, inland through downtown Brooklyn, Fort Greene, Clinton Hill and Bedford-Stuyvesant, to Bushwick on the border with Queens in the east, is what I will call the “gentrification core” (GC).
Figure 5: White shift, high earner shift and rent increase, varying timescales, with the three Community Districts (CDs) that scored high on all three outlined in teal
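As a rough illustration of how the core falls out of the three maps, the selection amounts to a set intersection. The sketch below assumes each indicator has already been reduced to the set of CDs in its top class. BK02, BK03, BK04 and MN10 come from the discussion above, while the identifiers CD-A through CD-D are placeholders for the other high-scoring districts, which are not listed individually here.

```python
# CDs in the top class for each indicator. Only BK02, BK03, BK04 and MN10
# are named in the text; CD-A .. CD-D are placeholders, not real districts.
high_white_shift = {"BK02", "BK03", "BK04", "MN10", "CD-A", "CD-B"}
high_earner_shift = {"BK02", "BK03", "BK04", "CD-A", "CD-C", "CD-D"}
high_rent_increase = {"BK02", "BK03", "BK04", "CD-B"}

# The "gentrification core" is whatever scores high on all three measures.
gentrification_core = sorted(
    high_white_shift & high_earner_shift & high_rent_increase
)
print(gentrification_core)  # ['BK02', 'BK03', 'BK04']
```

Requiring all three indicators is a deliberately strict rule: a district such as Manhattan 10, which scores high only on white share shift, drops out.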
Results
In the GC, 10,370 storefronts already existed and stayed in business throughout the examination period (2018-2025). During the same period, 7,629 new operations opened. From this we can calculate a “novelty rate” of 42.4% (7,629 / (10,370 + 7,629)). For the rest of the city (RC), there were 136,651 existing storefronts, with 160,966 newly opened, giving us a novelty rate of 54.1%. By this measure, the rest of the city actually saw faster storefront generation than the GC, which is perhaps surprising if we think of a gentrifying neighborhood as, at its core, one experiencing change.
The GC saw 4,841 closings, giving it a “failure rate” of 31.8% (4,841 / (10,370 + 4,841)). Meanwhile, the RC had 76,394 closings, for a failure rate of 35.9%. So by this measure, storefronts in the GC are slightly more sustainable propositions than in the RC. With lower rates of both storefront creation and failure, the GC is surprisingly a space of continuity surrounded by a city marked by change.
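Both rates use the same arithmetic: the number of events (openings or closings) divided by that number plus the storefronts that persisted through the period. A minimal Python sketch using the counts reported above; the function name turnover_rate is just a label chosen for illustration.

```python
def turnover_rate(events: int, existing: int) -> float:
    """Share of events among events plus persisting storefronts."""
    return events / (existing + events)


# Counts reported in the text: (existing, opened, closed).
counts = {
    "GC": (10_370, 7_629, 4_841),
    "RC": (136_651, 160_966, 76_394),
}

for zone, (existing, opened, closed) in counts.items():
    novelty = turnover_rate(opened, existing)
    failure = turnover_rate(closed, existing)
    print(f"{zone}: novelty {novelty:.1%}, failure {failure:.1%}")

# GC: novelty 42.4%, failure 31.8%
# RC: novelty 54.1%, failure 35.9%
```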
For context, Figure 6 is a heatmap showing the distribution of closings (yellow-brown) citywide with openings in blue. They are adjusted for the fact that there were more openings than closings. You can see closings have a hotter central hotspot (in white) in the Manhattan central business district, while openings have more representation in outlying areas like eastern Queens and Staten Island. This conforms with the post-Covid trend of more New Yorkers working from home and spending more time and money in their neighborhoods.
Figure 6: Storefront closings (left) and openings (right)
In Figures 7 and 8, I have mapped first the openings and closings in the GC distinguished by color, and then just the openings, distinguished by macro-category. At a glance, it’s hard to make out any clear trends from these maps. In Figure 7, I placed the closings layer on top of the openings because there were fewer of them, but they inevitably obscure places where a new business opened in the place of an old one. It seems that openings may have spread to side streets, while closings (and hidden openings) are more clustered along the main thoroughfares.
In Figure 8, the most obvious takeaway is that there are no clearly identifiable “restaurant rows” or “fashion districts”: every category is intermingled with the others. This contrasts with LiveXYZ’s main map (https://www.livexyz.com/), which codes the categories by color and on which you can clearly make out the yellow fashion hotspot of SoHo and the small but very orange strip of flower shops on West 28th Street.
Figure 7: Storefront turnover, 2018-2025, “gentrification core”
Figure 8: Openings by category, “gentrification core”
Figure 9 is a histogram showing the incidence of each subcategory in both the GC and the RC. This is purely for graphic purposes as it would be very unwieldy to show the labels for all 1,000+ subcategories. But you can see the “long tail” created by many niche subcategories boasting only one or a few instances trailing the towering “convenience store”—far and away the most common in both geographies.
Figure 9: Subcategories of openings in gentrification core (top) vs. rest of the city (bottom)
In Table 1, I compared the fifty most common openings in the GC with those in the rest of the city. During the observed time period, the GC saw openings divided into 568 distinct subcategories, ranging from “convenience store” with 340 openings, to “Arabic clothing store,” “jazz club,” and “yarn store,” with one each. During the same time period, the rest of the city saw openings split across 1,037 subcategories, with the same number one spot: 5,618 new convenience stores!
In green, I highlighted the 29 subcategories whose share of total openings in the GC (column D) was higher than their share in the rest of the city (whether or not they made the top 50 there). If this subset of the city had been chosen at random (that is, not picked out for its supposed uniqueness as a “gentrification hotspot”), we’d expect roughly half of the subcategories to be overrepresented relative to the rest of the city, or even fewer if the two sets of shares were very close. And we wouldn’t expect much variation from one column to the other. So this table shows us that the types of storefronts opening in the GC are, in fact, slightly different from those in the rest of the city, and some categories dramatically so.
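The percentage-point comparison behind the highlighting can be sketched as follows. The counts here are made up for illustration only and are not taken from Table 1, and share_gap is a hypothetical helper name.

```python
from collections import Counter


def share_gap(core: Counter, rest: Counter) -> dict[str, float]:
    """Percentage-point gap between each subcategory's share of openings
    in the core and its share in the rest of the city."""
    core_total = sum(core.values())
    rest_total = sum(rest.values())
    return {
        sub: 100 * (n / core_total - rest.get(sub, 0) / rest_total)
        for sub, n in core.most_common()  # most frequent in the core first
    }


# Toy opening counts (illustrative, not from the dataset).
core_openings = Counter(
    {"convenience store": 40, "coffee shop": 35, "yoga studio": 25}
)
rest_openings = Counter(
    {"convenience store": 500, "coffee shop": 300, "laundromat": 200}
)

gaps = share_gap(core_openings, rest_openings)
overrepresented = [sub for sub, gap in gaps.items() if gap > 0]
print(overrepresented)  # ['coffee shop', 'yoga studio']
```

Comparing shares rather than raw counts matters because the rest of the city has far more openings overall. A subcategory absent from the rest of the city is treated as having a share of zero there.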
Discussion & Conclusion
Some of the highlighted subcategories read like a cartoon sketch of a gentrified block: real estate agency (overrepresented by 0.38 percentage points), art gallery (+0.36 pts.), thrift store (+0.58 pts.), tattoo parlor (+0.44 pts.), cocktail bar (+0.43 pts.), fitness studio (+0.26 pts.), and coworking space (+0.41 pts.). And clocking in at number 50 is one of the subcategories I semi-jokingly hypothesized would be overrepresented in these neighborhoods in my initial project proposal: the yoga studio (+0.34 pts.). Others are more enigmatic: Mexican restaurant (+0.19 pts.) or sandwich shop (+0.20 pts.). And of the 29 highlighted subcategories, 12 don’t even crack the top 50 for the rest of the city.
Of course, these are my rather subjective, personal associations with an image of the “gentrified neighborhood.” I don’t know if there’s any good, scientifically rigorous way to group these subcategories one way or the other. But for the purposes of this project, I think it’s safe to say the subcategories in my spatially defined zone do not closely match those of the rest of the city.
The comparison of the subcategories that closed over the same period is less conclusive. There are 27 subcategories that appear more frequently among closings in the GC than in the rest of the city (Table 2). But several of these (e.g. coffee shop, real estate agency, bar) are also overrepresented in openings, meaning perhaps that these are just industries that see lots of turnover. A few of them, though, point to a neighborhood growing more childless and less religious: churches, daycare centers, and religious centers are all overrepresented among closures in the GC.
This exploration has shown that the make-up of store openings and closings in a zone defined as increasingly white, wealthy, and expensive to live in is in fact different from that in the rest of the city. While not necessarily surprising, this information could inform planning discussions when rezonings are under consideration. Next steps might be a regression analysis, zonal analysis of opening/closing overlap, space-time cube mapping, and/or a more systematized categorization of “gentrification” store subcategories. As long as New York City remains an ever-changing city, this dataset and these questions will remain noteworthy!
References
American Community Survey, NYU Furman Center. May 21, 2024. Real Median Gross Rent 2006-2022. NYU Furman Center’s CoreData.nyc. Retrieved May 10, 2025, from https://furmancenter.org/neighborhoods.
Baghestani, A., Nikbakht, M., Kucheva, Y., & Afshar, A. (2024). Assessing spatial and racial equity of Subway Accessibility: Case Study of New York City. Cities, 155, 105489. https://doi.org/10.1016/j.cities.2024.105489
Census (2000) and American Community Survey (2022), Furman Center. May 21, 2024. Racial and Ethnic Composition. NYU Furman Center’s CoreData.nyc. Retrieved May 10, 2025, from https://furmancenter.org/neighborhoods.
Census (2000) and American Community Survey (2022) via IPUMS USA, Furman Center. May 21, 2024. Household Income Distribution. NYU Furman Center’s CoreData.nyc. Retrieved May 10, 2025, from https://furmancenter.org/neighborhoods.
LiveXYZ. Directory. Live XYZ Inc. Retrieved April 17, 2025, from https://directory.livexyz.com/places.
McLafferty, S., & Grady, S. (2005). Immigration and geographic access to prenatal clinics in Brooklyn, NY: A Geographic Information Systems Analysis. American Journal of Public Health, 95(4), 638–640. https://doi.org/10.2105/ajph.2003.033985
Moore, L. V., & Diez Roux, A. V. (2006). Associations of Neighborhood Characteristics with the Location and Type of Food Stores. American Journal of Public Health, 96(2), 325–330.
U.S. Census Bureau. (n.d.). Census Tract – Median Household Income 2023. U.S. Department of Commerce. Retrieved April 14, 2025, from https://data.census.gov/
U.S. Census Bureau. (n.d.). Census Tract – Race 2023. U.S. Department of Commerce. Retrieved April 14, 2025, from https://data.census.gov/
[1] https://www.brooklynpaper.com/hermes-williamsburg-fake-historic-design/
[2] https://www.kqed.org/news/136343/gentrification-a-word-from-another-place-and-time#:~:text=In%201964%2C%20Glass%20observed%20%E2%80%9COne,the%20middle%2Dclass%20...
[3] https://cityobservatory.org/everything-that-causes-gentrification-from-a-to-zv2-2/?utm_source=substack&utm_medium=email
[4] https://www.politico.com/states/new-york/city-hall/story/2014/02/the-quiet-massive-rezoning-of-new-york-078398