Wednesday, December 07, 2005

Fishsticks gets aggregation bias 

Craig Westover does a nice analysis of the Pioneer Press story on the effect of Cities-area smoking bans on revenues for area bars and restaurants. As a statistical issue, the PP analysis uses a statement about the mean of a distribution of revenue effects to say something about individual observations within that distribution. To wit:
The Pioneer Press does an excellent job of data collection. I'm not about to argue with their findings per se; however, nowhere in the article do they explain the significance of working with aggregate data. The significance is, aggregate data hides the impact of the smoking ban on individuals most affected by the ban. That is a characteristic of aggregate data and why it is used to justify projects with unintended consequences.

[When talking about education, many of the same people supporting smoking bans and education writers are quick to point out (correctly) that aggregate data showing student improvement hides the fact that there is a significant gap between white students and students of color. Aggregate data works the same way in underreporting the impact of the ban on specific segments of the hospitality industry.]

I wrote to the authors of the PP study and have their data. It is indeed a set of totals for taxable sales of food and liquor by zip code, along with the number of establishments. As Captain Fishsticks points out, that tells us nothing about individual harm. And the article does tell us that some areas on average were harmed, as you'd guess.
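
To see the point with numbers, here is a minimal Python sketch. The five establishments and their sales changes are invented for illustration and are not drawn from the Pioneer Press data; `zip_total` and `losers` are hypothetical helper names.

```python
# Hypothetical year-over-year changes in taxable sales (in $ thousands)
# for five made-up establishments in a single zip code.
# These figures are illustrative only, NOT the Pioneer Press data.
changes = {
    "large_restaurant_a": 120,
    "large_restaurant_b": 60,
    "small_bar_a": -40,
    "small_bar_b": -35,
    "small_bar_c": -25,
}


def zip_total(changes):
    """Aggregate view: the one number a zip-code total reports."""
    return sum(changes.values())


def losers(changes):
    """Disaggregated view: which establishments lost sales."""
    return sorted(name for name, delta in changes.items() if delta < 0)


print("Zip-code total change:", zip_total(changes))
print("Establishments with lower sales:", losers(changes))
```

In this made-up zip code the total rises even though three of the five establishments lose sales. A data set that records only the total and the number of establishments cannot tell that case apart from one where every establishment gained a little.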

The PP study ignores a study done by Hennepin County itself a couple of months ago. Unlike the reporters, the county researchers had individual-business-level data, and they found that there were declines particularly for smaller establishments.
By segmenting the 497 Hennepin County establishments by total revenue size, there is some evidence that smaller businesses were more likely to show a decline in liquor sales and less likely to recoup the decline through increased food sales.

It would appear that businesses where liquor sales exceed food sales were more likely to see reduced liquor sales with no offset from increased food sales.

On the other hand, there also is evidence that establishments where food sales exceed liquor sales were more likely to have some of the lower liquor sales offset by higher food sales.

This matches Craig's story of Acme Comedy, a place large enough and with enough demand that it could raise prices and take enough extra from its remaining patrons to recoup the loss of sales from its bar business. Of the 270 Hennepin County establishments that had lost liquor sales, 106 were smaller places (less than $200k total taxable sales over 9 months; the average for the county is $570k). More of this type of analysis will be needed to verify claims of harm, and to do it requires more disaggregated data.
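
The arithmetic behind those figures is short enough to check directly. The sketch below uses only the numbers quoted above; the variable names are mine, and the comment about what is missing is an observation on the figures cited here, not a finding from the county study.

```python
# Figures quoted above from the Hennepin County study.
total_establishments = 497       # establishments in the county sample
lost_liquor_sales = 270          # of those, establishments with lower liquor sales
small_among_losers = 106         # losers with under $200k taxable sales over 9 months
small_threshold_k = 200          # "smaller place" cutoff, in $ thousands
county_average_k = 570           # county average taxable sales, in $ thousands

share_losing = lost_liquor_sales / total_establishments
small_share_of_losers = small_among_losers / lost_liquor_sales
threshold_vs_average = small_threshold_k / county_average_k

print(f"Share of all establishments with lower liquor sales: {share_losing:.1%}")
print(f"Share of those that were smaller places: {small_share_of_losers:.1%}")
print(f"Small-place cutoff as a share of the county average: {threshold_vs_average:.1%}")

# Not computable from the figures quoted here: the number of smaller places
# among all 497 establishments, which is what would show whether smaller
# places are over-represented among the losers.
```

That last comment is the reason more disaggregated data matters: the shares above describe who lost sales, but judging whether small places were hit disproportionately needs the size breakdown of the full sample as well.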
