Grocery Price Index Frequently Asked Questions

We know how important it is for the grocery industry to understand pricing trends, especially in such volatile times. Because we have the largest, most hyper-local, and most real-time data, with history, we created the index to give this visibility to ourselves, our customers, and the general public.

The biggest difference is that we created price indices not only at the national level but at local levels as well. We know from our customers that price competition happens locally: consumers shop for the best prices at nearby stores, which is a key reason why demand for Datasembly's solutions has grown so rapidly. So we decided to create this index, which is actually a series of indices at state and major metro levels, to show pricing trends where they are most relevant to consumers.

We had a hypothesis that pricing trends would likely differ between very urban and very rural areas because of supply chain issues, different levels of demand, and different competitive dynamics, since rural areas generally have fewer stores spread over a large area.

We also found out that the NCHS (National Center for Health Statistics) had defined a six-level urban-rural classification scheme for all U.S. counties and that we could align our store and pricing data to that scheme. This allowed us to see pricing trends across these different urban/rural segments and we found some very interesting variances that you can see within the index.

The following is a detailed description of the six segments:

Large Central Metro (Inner Cities)

Counties in MSAs (metropolitan statistical areas) of 1 million or more population that: 1. Contain the entire population of the largest principal city of the MSA, or 2. Have their entire population contained in the largest principal city of the MSA, or 3. Contain at least 250,000 inhabitants of any principal city of the MSA.

Large Fringe Metro (Suburbs)
Counties in MSAs of 1 million or more population that did not qualify as large central metro counties.
Medium Metro
Counties in MSAs of populations of 250,000 to 999,999.
Small Metro
Counties in MSAs of populations less than 250,000.
Micropolitan
Counties in micropolitan statistical areas.
Non-core (Rural)
Nonmetropolitan counties that did not qualify as micropolitan.
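
The six segment definitions above can be read as a simple decision rule. The following is a minimal sketch only, not the NCHS or Datasembly implementation. The `County` fields and the `classify` function are hypothetical names introduced for illustration, and the three large central metro conditions are assumed to have been evaluated elsewhere and collapsed into a single flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class County:
    # All fields are hypothetical inputs for illustration.
    msa_population: Optional[int]  # population of the county's MSA, or None if not in an MSA
    in_micropolitan_area: bool     # True if the county is in a micropolitan statistical area
    meets_central_criteria: bool   # True if any of the three large central metro conditions holds


def classify(county: County) -> str:
    """Return the urban-rural segment name for a county."""
    if county.msa_population is not None:
        if county.msa_population >= 1_000_000:
            if county.meets_central_criteria:
                return "Large Central Metro"
            return "Large Fringe Metro"
        if county.msa_population >= 250_000:
            return "Medium Metro"
        return "Small Metro"
    # Nonmetropolitan counties are either micropolitan or non-core.
    return "Micropolitan" if county.in_micropolitan_area else "Non-core"
```

For example, a county in an MSA of 600,000 people would be classified as Medium Metro regardless of the other two fields.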

We used the United States Office of Management and Budget (OMB) definition of metropolitan statistical areas (MSAs) and took the top 54 of the 392 within that list. The OMB defines a Metropolitan Statistical Area as one or more adjacent counties, or county equivalents, that have at least one urban core area of at least 50,000 population, plus adjacent territory that has a high degree of social and economic integration with the core as measured by commuting ties.

We analyzed the category taxonomies of the top 15 grocery retailers in the country, identified the most commonly used high-level categories, and rationalized the differences to come up with what we think is an easy-to-understand and comprehensive set of categories.

Two things to keep in mind. First, the scale of the changes is in most instances smaller than it looks on the graph, as many of the spikes are just fractions of a percentage point. Second, we have observed that during COVID many banners stopped putting many of their products on promotion, so the week-to-week spikes are sometimes attributable to changes in promotion. You will probably also notice significant changes during the holiday season, and those often happen because of holiday promotions.

We can already use our customers' own hierarchical categories within our applications to let them do pricing, promotion, and assortment analysis using our data. We are currently working on a capability that provides the same type of functionality you see in our grocery pricing index, using our customers' own categories. We don't yet have a date for when this capability will be available.

One of the unique aspects of our algorithm is that we create an index for each individual product represented in the index. This means that we can aggregate those indices any way we like (by geo, state, or rural/urban segment). We have already calculated indices by banner as well, but we don't show them in the public version.

We chose the products in each category that had the most coverage across stores in the United States. To get rid of the "noise" from the hundreds of thousands of products we have in each category, we decided to use the top 1,000 products in terms of store coverage.
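
As a rough illustration of selecting by store coverage, the sketch below counts the distinct stores in which each product is observed and keeps the top `n`. The function name, the `(product_id, store_id)` input shape, and the tie-breaking by product id are all assumptions made for this example; the source does not describe how coverage is measured or how ties are resolved.

```python
from collections import defaultdict


def top_products_by_coverage(observations, n=1000):
    """Return up to n product ids ranked by number of distinct stores.

    observations: iterable of (product_id, store_id) pairs for one category
    (a hypothetical input format).
    """
    stores_by_product = defaultdict(set)
    for product_id, store_id in observations:
        stores_by_product[product_id].add(store_id)
    # Sort by descending store count; break ties by product id (an assumption).
    ranked = sorted(
        stores_by_product,
        key=lambda p: (-len(stores_by_product[p]), p),
    )
    return ranked[:n]
```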

The index will be updated on a weekly basis.

Overview

Indices are always relative to the price of products collected in the first week of October 2019. We refer to this week as the base week.

Aggregations Calculation

For an individual product -> price_index = target_week.best_price / base_week.best_price. The best price is the lower of the list price or a promotion price if the product is on promotion.
For each store, an aggregation is created for each category. The aggregation is simply the average of the indices of the products in that category.
Banner-level category aggregations are based on the store-level category aggregations.
State-level category aggregations are based on the store-level category aggregations.
Geography-level category aggregations are based on the store-level category aggregations.
Overall category aggregations are based on the banner-level category aggregations.
Since the quantity of products is not uniform we apply weights to the aggregation calculations.
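
The calculation above can be sketched in a few lines. This is an illustrative sketch, not Datasembly's code: all function names are hypothetical, and because the text does not say which weights are applied, the roll-up below assumes each store-level aggregate is weighted by the number of products behind it.

```python
def best_price(list_price, promo_price=None):
    """Lower of the list price and the promotion price, if there is a promotion."""
    return list_price if promo_price is None else min(list_price, promo_price)


def price_index(target_best_price, base_best_price):
    """Index of one product relative to the base week."""
    return target_best_price / base_best_price


def store_category_index(product_indices):
    """Simple average of the product indices for one category in one store."""
    return sum(product_indices) / len(product_indices)


def weighted_rollup(store_aggregates):
    """Roll store-level aggregates up to a higher level (banner, state, geography).

    store_aggregates: list of (store_category_index, product_count) pairs.
    Weighting by product count is an assumption, not taken from the source.
    """
    total_weight = sum(weight for _, weight in store_aggregates)
    return sum(index * weight for index, weight in store_aggregates) / total_weight
```

A product with a base-week best price of 2.00 and a target-week best price of 3.00 has an index of 1.5, meaning its price is 50% above the base week.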

Assortment Changes

The assortment of products that represents each category in the Grocery Pricing Index often changes. This also happens differently at each level of aggregation, since the assortment can differ slightly between geographies.
There are generally two types of assortment changes that need to be accounted for:
a. Product appears in a target week but not in the base week.
b. Product appears in the base week but not in a target week.

Appears in the target week but not in the base week

A best_price for that product for the base_week needs to be derived.
That best_price will be derived using the first week, after the base week, for which a best_price appears for that product. That week will be called new_base_week.
That calculation is as follows for each category in each store:
a. Find the first week, after the base week, for which a best_price appears for that product. This will be the new_base_week.
b. Calculate the store index for that category for the new_base_week using only products that have best_prices in both the base_week and the new_base_week.
c. Divide the best_price for the product in the new_base_week by the index from "b" above.
d. Use the value in "c" as the derived best_price for that product in the base_week that was missing the best_price.
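
Steps a through d above can be sketched as follows for one category in one store. This is a minimal sketch under stated assumptions: the function name, the `prices` layout (week, then product, then best_price), and the choice to return `None` when no derivation is possible are all hypothetical and not taken from the source.

```python
def derive_base_price(product, prices, base_week, weeks):
    """Derive a base-week best_price for a product missing from the base week.

    prices: dict mapping week -> {product: best_price} (hypothetical layout).
    weeks: chronologically ordered list of weeks that includes base_week.
    """
    # Step a: first week after the base week in which the product has a price.
    later_weeks = weeks[weeks.index(base_week) + 1:]
    new_base_week = next(
        (w for w in later_weeks if product in prices.get(w, {})), None
    )
    if new_base_week is None:
        return None  # the product never appears after the base week

    base = prices[base_week]
    new = prices[new_base_week]

    # Step b: store category index for new_base_week, using only products
    # priced in both the base week and the new base week.
    common = [p for p in new if p in base]
    if not common:
        return None  # no shared products to build the index from
    category_index = sum(new[p] / base[p] for p in common) / len(common)

    # Steps c and d: deflate the new-base-week price by the category index.
    return new[product] / category_index
```

For instance, if the category index in the new_base_week is 1.1 and the product's best_price that week is 3.30, the derived base-week best_price is about 3.00.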

Appears in the base week but not in the target week

The algorithm for this case is far simpler than the one above. No derived best prices are created.
When a base price exists for a product but is missing in a target week, that product is just not included in the averaging of the products in that category to create the category index.
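
A short sketch of this rule: products that have a base price but no price in the target week are simply left out of the average. The function name and the dictionary inputs are hypothetical, and returning `None` when no product is priced in both weeks is an assumption.

```python
def category_index(base_prices, target_prices):
    """Average product index for a category, skipping products missing in the target week.

    base_prices, target_prices: dicts mapping product -> best_price.
    """
    indices = [
        target_prices[product] / base_price
        for product, base_price in base_prices.items()
        if product in target_prices  # missing products are not averaged
    ]
    if not indices:
        return None  # assumption: no index when nothing overlaps
    return sum(indices) / len(indices)
```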



Why did you create the Grocery Price Index?

Why is your Grocery Price Index different from other grocery pricing indexes that are available?

How did you come up with the Urban / Rural Locales and what do they mean?

How did you come up with a definition of the metro areas?

How did you come up with the Categories?

Some of the metro area and state indices seem to have a lot of sharp spikes from one week to another, especially during the beginning of COVID. Did prices really change that much in a short period?

As a retailer or CPG, can you provide me with this type of indexing capability for my own categories?

Will I be able to see index details by banner?

How did you choose what products to put into each category?

How often will you be updating the index?

What is the methodology used for the index calculations?
