Clear-Sighted Statistics
Chapter 6: Index Numbers
I. Introduction
Index numbers are everywhere. Turn to the financial page of any newspaper and you will find indices mentioned. (The plural of index is indices or indexes.) We have been using indices ever since 1764 when the Italian economist Giovanni Rinaldo, the Count of Carli, invented them to compare the prices of grain, wine, and oil for a 250-year period.
Index numbers are widely used in economics, business, politics, and other arenas to compare the relative difference among quantitative variables. Marketers, for example, develop a Brand Development Index (BDI) and a Category Development Index (CDI) when they plan marketing activities in local markets. Statisticians use index numbers when they calculate the coefficient of variation (CV), which we discussed in Chapter 5. And, also as discussed in Chapter 5, index numbers are used to calculate a geometric mean when the data contain negative numbers. Investors and financial analysts use indices to understand the strengths and weaknesses of financial markets. The Dow Jones Industrial Average, Standard & Poor’s 500 Index, the NIKKEI 225, and the Nasdaq-100 are influential indices that reveal the strength of financial markets. Economists use the government’s Consumer Price Index (CPI) and the Producer Price Index (PPI) to analyze changes in the prices of a “market basket” of goods. In economics, a “market basket” refers to a mix of goods and services purchased consistently. Such indices are often considered measures of the state of the economy.
In this chapter, we will review index numbers. After completing this chapter, you will:
• Understand why we use index numbers.
• Know how to interpret index numbers.
• Calculate simple unweighted index numbers.
• Use index numbers to calculate the Coefficient of Variation (CV).
• Use index numbers to calculate the geometric mean when the data have negative numbers.
• Calculate unweighted price indices.
• Calculate weighted indices like the Laspeyres and Paasche price indices as well as Fisher’s Ideal Index and the Value Index.
• Identify special-purpose indices published by the government and financial institutions.
This chapter contains the following files, which you should download:
• Chapter06_BDIs_CDIs.xlsx
• Chapter06_Exercises.xlsx
• Chapter06_GEOMEAN_index.xlsx
• Chapter06_SubwaySystem_Stations.xlsx
• Chapter06_UnweightedPriceIndex.xlsx
• Chapter06_WeightedIndex.xlsx
II. What Index Numbers Do and How to Interpret Them
Index numbers are a convenient way to compare the differences in a series of numbers. Each difference is expressed relative to one value, called the base or comparison value. Most commonly used indices show changes over time. The Consumer Price Index is just one example of an index that compares changes in data over time. But index numbers are not always used to compare changes over time. Indices like the coefficient of variation, brand development index, and category development index do not show changes over time.
Here is the formula for a simple index number:
Equation 1: Simple Index Number Formula
Index Number = (Selected Value / Base Value) × 100
Expressed symbolically, the formula is:
Equation 2: Simple Index Number Symbolic Formula
P = (Pt / Po) × 100
Where: P stands for the index number
Po is the base value
Pt is the selected value
Interpreting Simple Index Numbers
Let’s calculate some simple index numbers and interpret the results. In 2015, CityMetric, an arm of the British magazine New Statesman, published a report on the number of stations in the world’s ten biggest subway systems. Here is their data comparing the number of subway stations in each city. The base value for these indices is the 10-city average of 287.9 stations. This data can be found in Chapter06_SubwaySystem_Stations.xlsx.
Figure 1: Number of Subway Stations for 10 International Cities
How do we interpret these index numbers? Paris and Madrid both index at 105 compared to the 10-city average. This means these cities have 5 percent more subway stations than the 10-city average. An index close to 100 is considered “average.” Please note: Index numbers are often rounded off to the nearest whole number.
Let’s turn to the subway system in Tokyo, which has only 179 stations compared to the ten-city average of 287.9 stations. Tokyo’s index is 62. This means that Tokyo has only 62 percent of the 10-city average, or 38 percent fewer stations than the 10-city average, found by 100 – 62 = 38. An index below 100 means that the selected value is below the base value, which is always 100.
The smallest possible index number is zero. Nice, France, for instance, does not have a subway, hence no subway stations. Its index, therefore, would be zero, found by (0/287.9)*100.
An index number above 100 means that the selected value is larger than the base. An index of 200 indicates that the value is twice as large as the base, 300 indicates that the value is three times the base, and so forth. The index for New York City, with its 468 subway stations, is 163, or 63 percent above the 10-city average.
We could compare New York City to Tokyo. If we used Tokyo as the base, the index for New York City would be 261, found by (468/179)*100. This means that New York City has 161 percent more subway stations than Tokyo. If we designate New York City as the base, Tokyo has an index of 38, found by (179/468)*100. This means that Tokyo has only 38 percent as many stations as New York City, or has 62 percent fewer stations, found by 100 percent – 38 percent.
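The calculations in this section can be checked with a few lines of Python. This is a minimal sketch: the function name simple_index is our own, and the station counts and the 10-city average are the ones quoted in the text.

```python
def simple_index(selected, base):
    """Return the simple index number: (selected / base) * 100."""
    if base == 0:
        raise ValueError("the base value must not be zero")
    return selected / base * 100


TEN_CITY_AVERAGE = 287.9  # base value quoted in the text

# Station counts quoted in the text (Nice has no subway).
stations = {"New York City": 468, "Tokyo": 179, "Nice": 0}

# Index each city against the 10-city average, rounded to a whole number.
indices = {city: round(simple_index(count, TEN_CITY_AVERAGE))
           for city, count in stations.items()}
print(indices)

# Changing the base changes the index.
nyc_vs_tokyo = round(simple_index(468, 179))  # Tokyo as the base
tokyo_vs_nyc = round(simple_index(179, 468))  # New York City as the base
print(nyc_vs_tokyo, tokyo_vs_nyc)
```

Note that swapping the base does not simply flip the sign of the percentage difference: New York City has 161 percent more stations than Tokyo, while Tokyo has 62 percent fewer stations than New York City.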
III. Simple Index Numbers in Marketing: BDIs and CDIs
On my first day as an assistant account executive at the New York office of the old, prestigious advertising agency Foote, Cone & Belding, my boss greeted me. She said that at 1:30 p.m. we would be talking to the media department about our client’s “beady eyes” and “seedy eyes.” These terms were new to me. I was confused. Was she insulting our client? She then gave me a sheet of paper with six columns. The first column contained the names of the Nielsen DMAs (Designated Market Areas, which are explained below). The next column was the percentage of U.S. households represented by the Nielsen DMAs. The remaining columns showed the brand sales and percentage of total sales in these DMAs, BDIs (Brand Development Indices), category sales in these DMAs, and CDIs (Category Development Indices). I immediately retreated to my office, took out my media planning textbook (there was no Internet back then), and read about these terms and learned how to calculate BDIs and CDIs.
Nielsen DMAs (Designated Market Areas) are geographic regions in the U.S. in which local television viewing habits are measured by the A. C. Nielsen Company. There are 210 DMAs in the U.S. One variable Nielsen reports is the proportion of American households in each DMA. Marketers use DMAs to plan local media campaigns. A BDI is an index of the proportion of a brand’s sales in a market, with that market’s proportion of households as the base. A CDI is an index of the proportion of a category’s sales in a market, with that market’s proportion of households as the base.
The Excel file Chapter06_BDIs_CDIs.xlsx shows fictitious BDIs and CDIs for a made-up chocolate bar called Whizzo. The British comedy troupe Monty Python has a very funny routine about Whizzo Chocolate. One of Whizzo Chocolate’s flavors is “Crunchy Frog” with a real, uncooked frog added for that special crunch. Click on this Whizzo Chocolate link to view this comedy bit.
Figure 2 shows the top 20 Nielsen DMAs for 2018-2019. It also shows fictitious chocolate category sales and Whizzo Chocolate sales along with the corresponding Category Development Indices, and Brand Development Indices. According to Nielsen, which also tracks retail sales, chocolate market sales are approximately $22.4 billion for that year. For our fictitious chocolate maker, Whizzo, let’s set the annual retail sales at $250 million.
Figure 2: BDIs and CDIs
The New York DMA represents 6.441 percent of U.S. households. New York’s share of chocolate category sales was 8.05 percent. The CDI for New York is 125, found by (8.05/6.441)*100. The percent of Whizzo’s sales in New York was set at 19.62 percent. The BDI for New York is 305, found by (19.62/6.441)*100.
What does this say about New York? It says two things:
1. New York is a strong chocolate market. Chocolate sales are 25% higher than its population would suggest.
2. New Yorkers are especially fond of chocolate with raw frogs as evidenced by the brand’s exceptionally high BDI of 305.
Whizzo’s sales in New York, therefore, are over three times the sales of an average market. Detroit, on the other hand, with a 33 CDI and a BDI of 45 is an under-developed chocolate market as well as a weak market for Whizzo Chocolate.
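Here is a short Python sketch of the New York calculation. The function name development_index is our own invention; the three percentages are the ones given above for the New York DMA.

```python
def development_index(share_of_sales_pct, share_of_households_pct):
    """Index a market's share of sales against its share of households."""
    return share_of_sales_pct / share_of_households_pct * 100


# New York DMA figures quoted in the text (Whizzo's sales are fictitious).
NY_HOUSEHOLDS_PCT = 6.441       # share of U.S. households
NY_CATEGORY_SALES_PCT = 8.05    # share of chocolate category sales
NY_BRAND_SALES_PCT = 19.62      # share of Whizzo sales

cdi = development_index(NY_CATEGORY_SALES_PCT, NY_HOUSEHOLDS_PCT)
bdi = development_index(NY_BRAND_SALES_PCT, NY_HOUSEHOLDS_PCT)
print(f"CDI = {cdi:.0f}, BDI = {bdi:.0f}")
```

The same function serves for both indices because a BDI and a CDI differ only in whose sales go in the numerator: the brand's or the category's.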
IV. Using Simple Index Numbers to Calculate the Coefficient of Variation and Geometric Mean
A. Coefficient of Variation
In Chapter 5, you learned that the coefficient of variation (CV) is a standardized measure for comparing the dispersion of two or more distributions that have different ratio scale measurements. The CV can be reported as an index number, a decimal, or a percentage. Here are the formulas for the CV as an index:
Table 1: Coefficient of Variation as an Index and Percentage
Where: σ is the population standard deviation
μ is the population mean
s is the sample standard deviation
X̅ is the sample mean
In Chapter 5 we compared the coefficients of variation for two distributions: The price of a Big Mac in 20 countries and the Monthly Mobile Data Usage in gigabytes in these countries. Here are the coefficients of variation reported as an index and decimal:
Table 2: Coefficient of Variation Reported as an Index and Percentage
Reporting the CV as an index or decimal leads to the same conclusion. The coefficient of variation as an index for Big Mac is 34.29, the decimal is 0.3429. Conclusion: The standard deviation for Big Mac prices is 34.29 percent of the mean price. For monthly mobile data usage, the index number is 73.78 and the decimal is 0.7378. The standard deviation for monthly mobile data usage is 73.78 percent of the mean. Based on these coefficients of variation, mobile data usage is more variable than Big Mac prices.
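To see the index and decimal forms side by side, here is a minimal Python sketch using the sample formula (s divided by the sample mean, times 100). The two small samples are hypothetical and are not the Chapter 5 data; the function name cv_index is our own.

```python
from statistics import mean, stdev


def cv_index(sample):
    """Coefficient of variation as an index: (s / sample mean) * 100."""
    return stdev(sample) / mean(sample) * 100


# Hypothetical samples measured in different units (not the Chapter 5 data).
prices_usd = [2.0, 4.0, 6.0]
usage_gb = [1.0, 2.0, 9.0]

cv_prices = cv_index(prices_usd)
cv_usage = cv_index(usage_gb)

# The decimal form is simply the index divided by 100.
print(f"Prices: index {cv_prices:.2f}, decimal {cv_prices / 100:.4f}")
print(f"Usage:  index {cv_usage:.2f}, decimal {cv_usage / 100:.4f}")
```

Because the CV is unitless, the two results can be compared directly even though one sample is in dollars and the other in gigabytes.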
B. Geometric Mean
In Chapter 5, we introduced the geometric mean, which is the preferred measure for finding the average change in growth rates, ratios, percentages, or index numbers over time. It is always less than or equal to the arithmetic mean.
The one limitation of the geometric mean is that none of the values can be negative. Suppose you want to calculate the average annual rate of change for an investment that lost money during some of the years you owned it. You cannot use the rate of return calculated as a percentage because some of the years have a negative percentage. In Figure 3, Excel returns the #NUM! error in cell B6. This error occurs when a formula contains a numeric value that is not valid for the function. Excel’s GEOMEAN function requires every value to be greater than zero, so the negative percentages trigger the error. These data can be found in Chapter06_GEOMEAN_index.xlsx.
Figure 3: Geometric Mean with Negative Percentages and Index Numbers
Let’s review your investment: In Year 1, you invested $10,000. At the end of that year, you made $1,500 for a 15 percent rate of return. The second year, however, your investment lost $2,875, or 25 percent of its value. You have a negative rate of return, -25%. At the end of Year 2, your investment is now worth $8,625, found by $11,500 - $2,875. Year 3 was another bad year. You lost another 10 percent or $862.50. At the start of Year 4, your investment was worth $7,762.50, and during that year your investment gained a modest 2 percent. The value of your investment increased by $155.25, and is now worth $7,917.75. Your loss for the four-year period: $2,082.25.
What is your rate of return using the geometric mean? If you use the rates of return shown in Cells B2:B5, you get the #NUM! error in B6, indicating that the answer is not valid. It is invalid because your annual rates of return in Cells B2:B5 include negative numbers, -25% and -10%. The workaround is to convert these percentage rates to index numbers. The conversion of the percentage rates to index numbers is shown in Column F with the formulas in Column G. The index numbers are calculated using the starting value as the “base” and the closing value as the “selected value.” The index numbers for these four years are: 115, 75, 90, and 102. Using index numbers removes the negative percent changes. You can now calculate the geometric mean using Excel or by hand. The geometric mean is 94.33. Because this is below 100, you lost money on the investment. The average annual rate of return for this investment is -5.67%, found by 94.33 – 100. Clearly this was not a good investment.
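The same workaround can be sketched in Python. The year-end values are the ones from the investment described above; the variable names are our own, and statistics.geometric_mean requires Python 3.8 or later.

```python
from statistics import geometric_mean

# Value at the start of Year 1, then at the end of Years 1 through 4.
values = [10_000.00, 11_500.00, 8_625.00, 7_762.50, 7_917.75]

# Each year's index uses the starting value as the base
# and the closing value as the selected value.
index_numbers = [round(close / start * 100, 2)
                 for start, close in zip(values, values[1:])]
print(index_numbers)

gm = geometric_mean(index_numbers)
average_rate = gm - 100  # average annual percent change

print(f"Geometric mean index: {gm:.2f}")
print(f"Average annual rate of return: {average_rate:.2f}%")
```

As a check, compounding the geometric mean for four years reproduces the closing value: $10,000 × (0.9433)^4 is approximately $7,918.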
V. Simple Price Indices
A. Simple Price Index
A simple price index is used to compare changes in price over two periods for a market basket of items. One way to calculate a simple average price index is to calculate the arithmetic mean of the indices. Here is the formula:
Equation 3: Simple Price Index
P = ΣPi / n
Where: Σ indicates the operation of addition
P is the average index
Pi are the individual indices
n is the number of indices
Figure 4 compares retail prices for 2009 and 2019 for four chocolate manufacturers along with a fifth category for small brands called “All Others.” The simple indices are shown in Column D and the formulas are in Column E. The simple price index, shown in cell D7, is the mean of the five indices:
Figure 4: Simple Average Price Index
This data can be found in the workbook Chapter06_UnweightedPriceIndex.xlsx in the Simple Average Index worksheet. The simple price index of 146.67 indicates that the retail prices have increased by 46.67 percent. We would typically report this index as 147. We should also note that while Whizzo is more expensive than other brands, its price has increased at a lower rate: 133 versus 150, or 33.33 percent versus 50 percent.
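Here is a minimal Python sketch of the simple average price index. It assumes, as the “133 versus 150” comparison above implies, that the four non-Whizzo series each index at 150 and Whizzo indexes at 133.33; the underlying prices in Figure 4 are not reproduced here, and the function name is our own.

```python
def simple_average_price_index(indices):
    """Arithmetic mean of the individual price indices: sum(Pi) / n."""
    return sum(indices) / len(indices)


# Assumed individual indices: four series at 150, Whizzo at 133.33.
individual_indices = [150.0, 150.0, 150.0, 150.0, 133.33]

average_index = simple_average_price_index(individual_indices)
print(f"Simple average price index: {average_index:.2f}")
```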
B. Simple Aggregate Price Index
An alternative method for calculating a simple, or unweighted, price index is an aggregate price index. The prices for each item are summed (not the indices), and then the index is calculated from the sums for the base and selected periods. Here is the formula for the Simple Aggregate Price Index:
Equation 4: Simple Aggregate Price Index
P = (ΣPt / ΣPo) × 100
Where: ΣPo is the sum of the values in the base period
ΣPt is the sum of the values in the selected period
Figure 5 shows the simple aggregate index number for the fictional retail prices of the five chocolate brands in 2009 and 2019.
Figure 5: Single Aggregate Index Number
The index of 145.5 indicates that the retail prices have increased by 45.5 percent. Again, we would round off this index to 146. This data can be found in the workbook Chapter06_UnweightedPriceIndex.xlsx in the Simple Aggregate Price Index worksheet.
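The following Python sketch shows the aggregate method. The three pairs of prices are hypothetical and are not the values in Figure 5; the function name is our own.

```python
def simple_aggregate_price_index(base_prices, selected_prices):
    """Sum the prices first, then index: (sum(Pt) / sum(Po)) * 100."""
    return sum(selected_prices) / sum(base_prices) * 100


# Hypothetical prices for three items (not the Figure 5 data).
base_prices = [1.00, 2.00, 3.00]      # Po
selected_prices = [1.50, 3.00, 4.00]  # Pt

aggregate_index = simple_aggregate_price_index(base_prices, selected_prices)
print(f"Simple aggregate price index: {aggregate_index:.2f}")
```

Because the prices are summed before the index is taken, higher-priced items have more influence on the result. That is why this method and the simple average of the individual indices usually give slightly different answers for the same data.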
VI. Weighted Price Indices
Using a weighted price index is often considered more appropriate than using an unweighted index because each item’s price is weighted by the quantity of that item. The two most commonly used weighted price indices are the Laspeyres Price Index and Paasche Price Index. They measure the change in price for a market basket of different items. Both are named after their creators: economist Etienne Laspeyres and economist, political scientist, and statistician Hermann Paasche. These indices have shortcomings, however, which have led to the development of a third index, Fisher’s Ideal Index. There is also another weighted index called the Value Index.
A. Laspeyres and Paasche Indices
The difference between the Laspeyres and Paasche indices is the quantities they use as weights. The Laspeyres index is a base period weighted index because it weights the prices of both periods by the base period’s quantities. The Paasche index is a current period weighted index because it weights the prices of both periods by the current, or observed, period’s quantities. This will become clearer after we examine the formulas and see an example of how to calculate these indices.
The Laspeyres index is defined as the sum of the prices for the observed period times the base period’s quantities, divided by the sum of the prices for the base period times the base period’s quantities, multiplied by 100. Here is this formula in words:
Equation 5: Laspeyres Index Formula in Words
Laspeyres Index = (Sum of Observed Period Prices × Base Period Quantities / Sum of Base Period Prices × Base Period Quantities) × 100
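The Paasche index is defined as the sum of the prices for the observed period times the observed period’s quantities, divided by the sum of the prices for the base period times the observed period’s quantities, multiplied by 100. Here is this formula in words:
Equation 6: Paasche Index Formula in Words
Paasche Index = (Sum of Observed Period Prices × Observed Period Quantities / Sum of Base Period Prices × Observed Period Quantities) × 100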
Here are the formulas in symbols for the two indices.
Table 3: Formulas for the Laspeyres and Paasche Indices
Laspeyres Index: P = (ΣPtQo / ΣPoQo) × 100
Paasche Index: P = (ΣPtQt / ΣPoQt) × 100
Where: Pt is the Price for the Observed Period
Po is the Price for the Base Period
Qt is the Quantity for the Observed Period
Qo is the Quantity for the Base Period
To calculate these indices, we need a market basket of goods. For our market basket, the retail prices were sourced from http://www.thepeoplehistory.com/pricebasket.html. Figure 6 shows our market basket. Quantities were selected at random. The data are available in the Chapter06_WeightedIndex.xlsx workbook on the Market Basket worksheet. The calculations for the Laspeyres and Paasche indices are also in this workbook under separate worksheets.
Figure 6: Market Basket for Weighted Indices
The Laspeyres index for this market basket is 141.519, or 142, based on the calculation shown in Figure 7:
Figure 7: Laspeyres Index – Excel
The Laspeyres index of 142 indicates that the price of this market basket has increased by 42 percent (41.519 percent to be precise). In this market basket, bacon, with its high quantity of 115 units, has the most weight.
Figure 8 shows the calculations for the Paasche index, which is 137 (137.34). This indicates that the price of this market basket has increased by 37 percent, a smaller increase than the Laspeyres index shows.
Figure 8: Paasche Index – Excel
Let’s review: We have the same market basket. Yet the indices are different: Laspeyres is 142 and Paasche is 137. Which index should we use? To answer this question, we need to highlight the advantages and disadvantages of both of these weighted indices.
Table 4: Advantages and Disadvantages of Laspeyres and Paasche Indices
When time and money are a concern, use the Laspeyres index.
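The two weighted indices can be sketched in Python as follows. The three-item basket is hypothetical and is not the market basket in Figure 6, so the results differ from the 142 and 137 reported above; the function names are our own.

```python
# Each row: (item, base price Po, observed price Pt,
#            base quantity Qo, observed quantity Qt)
# Hypothetical values for illustration only.
basket = [
    ("bread", 1.00, 2.00, 10, 8),
    ("milk", 2.00, 3.00, 5, 6),
    ("eggs", 3.00, 4.00, 2, 3),
]


def laspeyres_index(rows):
    """Weight both periods' prices by base period quantities (Qo)."""
    numerator = sum(pt * qo for _, po, pt, qo, qt in rows)
    denominator = sum(po * qo for _, po, pt, qo, qt in rows)
    return numerator / denominator * 100


def paasche_index(rows):
    """Weight both periods' prices by observed period quantities (Qt)."""
    numerator = sum(pt * qt for _, po, pt, qo, qt in rows)
    denominator = sum(po * qt for _, po, pt, qo, qt in rows)
    return numerator / denominator * 100


laspeyres = laspeyres_index(basket)
paasche = paasche_index(basket)
print(f"Laspeyres: {laspeyres:.2f}")
print(f"Paasche:   {paasche:.2f}")
```

The only difference between the two functions is which quantity column supplies the weights. The Laspeyres index needs quantities for the base period only, which is one reason it is less costly to maintain.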
B. Fisher’s Ideal Index (FII)
You may wonder if there is a way to overcome the drawbacks of the Laspeyres index, which over-estimates the effect of price, and of the Paasche index, which under-estimates the effect of price. Fisher’s Ideal Index is such a method. Named after the American economist Irving Fisher, it is called “ideal” because it corrects Laspeyres’ positive price bias and Paasche’s negative price bias. Fisher’s Ideal Index, or FII, is the geometric mean of the Laspeyres and Paasche indices, that is, the square root of their product. Equation 7 shows the formula for Fisher’s Ideal Index:
Equation 7: Formula for Fisher’s Ideal Index
FII = √(Laspeyres Index × Paasche Index)
Because we have one Laspeyres index and one Paasche index, each comparing the base period with the observed period, we find Fisher’s Ideal Index by taking the square root of the product of the two indices, 141.52 * 137.34. The answer is 139.41, or 139 when we round. The calculation is shown in Figure 9:
Figure 9: Fisher’s Ideal Index Calculation
The advantage of the Fisher’s Ideal Index is that it corrects for the upward bias of the Laspeyres Index and the downward bias of the Paasche Index. As a result, the FII will always be between the Laspeyres and Paasche indices. The disadvantage of the FII is that it can be a bit more difficult to calculate than the Laspeyres and Paasche indices.
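A short Python check of this calculation, using the Laspeyres and Paasche values reported above (the function name is our own):

```python
from math import sqrt


def fisher_ideal_index(laspeyres, paasche):
    """Geometric mean of the two indices: sqrt(Laspeyres * Paasche)."""
    return sqrt(laspeyres * paasche)


LASPEYRES = 141.52  # from the text
PAASCHE = 137.34    # from the text

fii = fisher_ideal_index(LASPEYRES, PAASCHE)
print(f"Fisher's Ideal Index: {fii:.2f}")
```

Because the geometric mean of two positive numbers always lies between them, the result falls between the Paasche and Laspeyres values.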
C. Value Index
Unlike the Laspeyres and Paasche indices, which measure only the change in price, a value index measures the change in price and quantity. Equation 8 shows the formula for the Value index:
Equation 8: Formula for the Value Index
V = (ΣPtQt / ΣPoQo) × 100
Where: V = Value Index
ΣPtQt is the sum of the prices for the selected period times the quantities for the selected period
ΣPoQo is the sum of the prices for the base period times the quantities for the base period
Using the same market basket used for the Laspeyres and Paasche indices, the value index is 147 (146.82). Figure 10 shows the value index:
Figure 10: Excel Calculation of the Value Index
How do we interpret the value index of 147 (146.82)? The value of the market basket increased by approximately 47 percent. These calculations can be found on the Value worksheet in Chapter06_WeightedIndex.xlsx.
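Here is a minimal Python sketch of the value index. The three-item basket is hypothetical, not the market basket used above, so the result differs from 147; the function name is our own.

```python
# Each row: (item, base price Po, selected price Pt,
#            base quantity Qo, selected quantity Qt)
# Hypothetical values for illustration only.
basket = [
    ("bread", 1.00, 2.00, 10, 8),
    ("milk", 2.00, 3.00, 5, 6),
    ("eggs", 3.00, 4.00, 2, 3),
]


def value_index(rows):
    """Total spending in the selected period relative to the base period."""
    selected_value = sum(pt * qt for _, po, pt, qo, qt in rows)
    base_value = sum(po * qo for _, po, pt, qo, qt in rows)
    return selected_value / base_value * 100


value = value_index(basket)
print(f"Value index: {value:.2f}")
```

Each period is valued at its own prices and its own quantities, so the index reflects changes in both.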
VII. Government and Financial Indices
The federal government, financial services companies, and trade associations produce and publish their own indices. Here are a few well-known examples:
1. The Consumer Price Index (CPI) is published by the United States Department of Labor’s Bureau of Labor Statistics. According to the Bureau, the “CPI is a measure of the average change over time in the prices paid by urban consumers for a market basket of consumer goods and services. Indexes are available for the U.S. and various geographic areas. Average price data for select utility, automotive fuel, and food items are also available.”1 In June 2019, the CPI for all urban consumers was 256.143 over its 1982-1984 base.2 This means that prices have increased 156.143 percent over those of the 1982 to 1984 period. A basket of goods that cost $100 in the 1982 to 1984 period would cost $256.14 in June 2019.
2. The Producer Price Index (PPI) is also published by the United States Department of Labor’s Bureau of Labor Statistics. According to the Bureau, “The Producer Price Index (PPI) program measures the average change over time in the selling prices received by domestic producers for their output. The prices included in the PPI are from the first commercial transaction for many products and some services.”3
3. The Dow Jones Industrial Average (DJIA) is a financial index published by S&P Dow Jones Indices LLC. Started in 1896, it measures the daily stock prices of 30 large companies on the New York Stock Exchange and NASDAQ. The DJIA is a closely watched index because it serves as a proxy for the health of financial markets and the American economy.
4. S&P 500 Index: This barometer for large capitalization American equities includes the top 500 companies based on market capitalization. This covers approximately 80 percent of available market capitalization.
5. The Russell 2000 Index is a financial index published by FTSE Russell, a subsidiary of the London Stock Exchange Group. The Russell 2000 is a stock market index composed of 2,000 publicly-traded small-capitalization American firms.
6. NASDAQ-100 is an index published by Nasdaq, Inc. NASDAQ, originally an acronym for National Association of Securities Dealers Automated Quotations, is an electronic marketplace for buying and selling securities. This index includes 100 of the largest non-financial companies listed on the Nasdaq exchange based on market capitalization.
7. NIKKEI 225 is a stock market index for the Tokyo Stock Exchange. It is a price-weighted index that measures the performance of 225 publicly-traded companies for a broad selection of industrial sectors.
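The CPI interpretation given in the first item of this list can be sketched in Python. This is a simple illustration, not the Bureau’s methodology; the function name is our own, and the index value is the June 2019 figure quoted above.

```python
def cost_at_index(base_period_cost, index):
    """Cost today of a basket that cost base_period_cost when the index was 100."""
    return base_period_cost * index / 100


CPI_JUNE_2019 = 256.143  # 1982-1984 = 100, as quoted in the text

cost_now = cost_at_index(100, CPI_JUNE_2019)
percent_increase = CPI_JUNE_2019 - 100

print(f"A $100 basket from 1982-1984 costs ${cost_now:.2f}")
print(f"Prices increased {percent_increase:.3f} percent")
```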
VIII. Summary
Index numbers provide a measure of the relative difference between a base value and a selected value. There are many formulas for calculating index numbers. The correct formula depends on the analyst’s objectives and the nature of the data.
IX. Exercises
Data for these questions can be found in Chapter06_Exercises.xlsx.
Exercise 1: Table 5 shows the median price for homes in the U.S. from third quarter 2017 to second quarter 2020 in thousands of dollars.
• Create index numbers for these data using Q3 2017 as the base.
• Calculate the geometric mean using these index numbers. Comment on your findings.
Table 5: Median Price for Existing Homes in the U.S. Source: National Association of Realtors, April 2019
*Estimated
Exercise 2: Table 6 lists the fifteen largest Heavy Urban Rapid Rail Systems in the U.S. (Think trains, not trams or trolleys.)
• Create a simple index for the length of track and number of stations for these urban rail systems. Use the arithmetic mean for your bases.
• Comment on your findings.
Table 6: Heavy Urban Rapid Rail Systems in the U.S.
Exercise 3: Gasoline Prices:
Table 7 shows the price of a gallon of gasoline in U.S. dollars for 19 countries. The data were collected during fourth quarter 2018 by the International Monetary Fund, the World Bank, and the United Nations. These data were published on www.globalpetrolprices.com.
• Create unweighted indices for gasoline prices. Use the U.S. as your base.
• Comment on your findings.
Table 7: Gasoline Prices per Gallon in U.S. Dollars, 4th Qtr. 2018
Exercise 4: Geometric Mean with Negative Numbers
Table 8 shows an investment you made four years ago. The table includes the starting and closing value of this investment for the four years you held it. It also shows the annual rate of return. What is your average annual rate of return? Using Excel, calculate the geometric mean and explain your results.
Table 8: Geometric Mean for an Investment Using Index Numbers
Exercise 5: Calculating BDIs and CDIs
You are an executive at Grabbit & Runne, a small advertising agency. Your biggest client is Whizzo Chocolate. Its marketing director has looked at the BDIs and CDIs you sent her for the top 20 Nielsen DMAs. See Chapter06_BDIs_CDIs.xlsx. She requests that you provide the BDIs and CDIs for the 21st through the 25th Nielsen DMA. Provide these calculations and comment on your findings. The data necessary for your calculations is in Chapter06_Exercises.xlsx under the worksheet 4) BDIs_CDIs and in Figure 11:
Figure 11: Nielsen DMA: Five Mid-Sized Markets
Exercise 6: Weighted Indices
Table 9 shows a market basket of goods for 2010 and 2019. Using 2010 as the base:
• Calculate a Laspeyres index and comment
• Calculate a Paasche index and comment
• Calculate a Fisher’s Ideal index and comment
• Calculate a Value index and comment
Table 9: Market Basket – Source: http://www.thepeoplehistory.com/pricebasket.html
Except where otherwise noted, Clear-Sighted Statistics is licensed under a
Creative Commons License. You are free to share derivatives of this work for
non-commercial purposes only. Please attribute this work to Edward Volchok.