Guest Essay by Kip Hansen


This essay is the second in a series of essays about Averages, their use and misuse. My interest is in the logical and scientific errors, the informational errors, that can result from what I have playfully called “The Laws of Averages”.

Averages

Both the word and the concept “average” are subject to a great deal of confusion and misunderstanding among the general public, and both have seen an overwhelming amount of “loose usage” even in scientific circles, not excluding peer-reviewed journal articles and scientific press releases. For that reason, I gave a refresher on Averages in Part 1 of this series. If your maths or science background is near the great American average, I suggest you take a quick look at the primer in Part 1 before reading on.

A Beam of Darkness Into the Light

The purpose of presenting different views of any data set (any collection of information or measurements about a thing, a class of things, or a physical phenomenon) is to let us see that information from different intellectual and scientific angles, giving us better insight into the subject of our studies and, we hope, a better understanding.

Modern statistical [software] packages allow even high school students to perform sophisticated statistical tests of data sets and to manipulate and view the data in myriad ways. In a broad general sense, the availability of these packages now allows students and researchers to make [often unfounded] claims for their data by using statistical methods to arrive at numerical results, all without understanding either the methods or the true significance or meaning of the results. I learned this by judging High School Science Fairs and later by reading the claims made in many peer-reviewed journals. One hotly discussed current controversy is the prevalence of using “P-values” to prove that trivial results are somehow significant because “that’s what P-values less than 0.05 do”. At the High School Science Fair, students were including ANOVA test results about their data, yet none of them could explain what ANOVA was or how it applied to their experiments.

Modern graphics tools allow all sorts of graphical methods of displaying numbers and their relationships. The US Census Bureau has a whole section of visualizations and visualization tools. An online commercial service, Plotly, can create a very impressive array of visualizations of your data in seconds. They have a level of free service that has been more than adequate for almost all of my uses [and a truly incredible collection of possibilities for businesses and professionals at a rate of about a dollar a day]. RAWGraphs has a similar free service.

The complex computer programs used to create metrics like Global Average Land and Sea Temperature or Global Average Sea Level are believed by their creators and promoters to actually produce a single-number answer, an average, accurate to hundredths or thousandths of a degree or to fractions of a millimeter. Where quantitatively accurate values are not claimed, accurate anomalies or valid trends are claimed instead. Opinions vary wildly on the value, validity, accuracy and precision of these global averages.

Averages are just one of a vast array of different ways to look at the values in a data set. As I have shown in the primer on averages, there are three primary types of averages — Mean, Median, and Mode — as well as a number of more exotic types.
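As a rough illustration of how the three primary averages can give three different answers for the same numbers, here is a minimal Python sketch. The list of values is invented purely for this example and is not taken from any data set discussed in this series; the functions come from Python's standard statistics module.

```python
from statistics import mean, median, multimode

# Hypothetical values, invented for illustration only.
values = [60, 61, 61, 61, 63, 64, 66, 68, 70, 70, 71]

# Mean: the sum of the values divided by how many there are.
print("mean:", mean(values))

# Median: the middle value once the values are sorted.
print("median:", median(values))

# Mode: the most frequently occurring value (multimode returns a list,
# because a data set can have more than one).
print("mode(s):", multimode(values))
```

For this made-up list the three averages all differ, which is a reminder that "the average" is not a single thing until one says which average is meant.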

In Part 1 of this series, I explained the pitfalls of averages of heterogeneous, incommensurable objects or data about objects. Such attempts end up with Fruit Salad, an average of Apples-and-Oranges: illogical or unscientific results, with meanings that are illusive, imaginary, or so narrow as not to be very useful. Such averages are often imbued by their creators with significance — meaning — that they do not have.

The purpose of looking at data in different ways, such as looking at a Mean, a Median, or a Mode of a numerical data set, is to lead to a better understanding. It is therefore important to understand what actually happens when numerical results are averaged: in what ways averaging improves our understanding, and in what ways it reduces it.

A Simple Example:

Let’s consider the heights of the boys in Mrs. Larsen’s hypothetical 6th Grade class at an all-boys school. We want to know their heights in order to place a horizontal chin-up bar between two strong upright beams for them to exercise on (or as mild constructive punishment: “Jonny, ten chin-ups, if you please!”). The boys should be able to reach the bar easily by jumping up a bit, so that when they hang by their hands their feet don’t touch the floor.

The Nurse’s Office supplies the heights of the boys, which are averaged to get the arithmetical mean of 65 inches.

Using the generally accepted body part ratios we do quick math to approximate the needed bar height in inches:

Height/2.3 = Arm length (shoulder to fingertips)

65/2.3 ≈ 28 inches (approximate arm length)

65 + 28 = 93 inches = 7.75 feet or 236 cm
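The quick math above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the text: the function name and constants are my own labels, and rounding the arm length to the nearest whole inch is an assumption made to match the "28" used above.

```python
ARM_RATIO = 2.3       # height / arm length, the body ratio used in the text
CM_PER_INCH = 2.54

def bar_height_inches(height_in):
    """Standing height plus approximate arm length (shoulder to fingertips)."""
    arm = round(height_in / ARM_RATIO)  # assumption: round to the nearest inch
    return height_in + arm

bar = bar_height_inches(65)             # 65 inches is the Mean Height
print("bar height (inches):", bar)
print("bar height (feet):", bar / 12)
print("bar height (cm):", round(bar * CM_PER_INCH))
```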

Our calculated bar height fits nicely in a classroom with 8.5 foot ceilings, so we are good. Or are we? Do we have enough information from our calculation of the Mean Height?

Let’s check by looking at a bar graph of all the heights of all the boys:

[Figure: bar chart of the heights of the boys in Mrs. Larsen’s class]

This visualization, like our calculated average, gives us another way to look at the information, the data on the heights of boys in the class. Because the boys range from just five feet tall (60 inches) all the way to almost 6 feet (71 inches), we realize that no single bar height will be ideal for all of them. However, we now see that 82% of the boys are within 3 inches either way of the Mean Height, and our calculated bar height will do fine for them. The 3 shortest boys may need a little step to stand on to reach the bar, and the 5 tallest boys may have to bend their legs a bit to do chin-ups. So we are good to go.

But when they tried the same approach in Mr. Jones’ class, they had a problem.

There are 66 boys in this class and their Average Height (mean) is also 65 inches, but the heights have a different distribution:

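[Figure: bar chart of the heights of the 66 boys in Mr. Jones’ class, with Mean +/- 3 inches shaded light blue]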

Mr. Jones’ class has a different ethnic mix, which results in an uneven distribution, much less centered around the mean. Using the same Mean +/- 3 inches (light blue) as in our previous example, we capture only 60% of the boys instead of 82%. In Mr. Jones’ class, 26 of the 66 boys would not find the horizontal bar set at 93 inches convenient. For this class, the solution was a variable-height bar with two settings: one for the boys 60-65 inches tall (32 boys) and one for the boys 66-72 inches tall (34 boys).
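The "Mean +/- 3 inches" check is easy to sketch in Python. The two lists below are small, invented stand-ins, not the actual class data behind the charts; they are built only so that both have a mean of 65 inches while one is clustered around the mean and the other splits into a shorter and a taller group. The function name share_within is likewise my own.

```python
from statistics import mean

def share_within(heights, tolerance=3):
    """Fraction of heights lying within +/- tolerance of the list's own mean."""
    center = mean(heights)
    inside = [h for h in heights if abs(h - center) <= tolerance]
    return len(inside) / len(heights)

# Hypothetical heights in inches, invented for illustration only.
clustered = [60, 63, 64, 64, 65, 65, 65, 66, 66, 67, 70]
two_groups = [60, 61, 61, 62, 62, 63, 68, 69, 69, 70, 70]

for name, heights in [("clustered", clustered), ("two groups", two_groups)]:
    print(f"{name}: mean = {mean(heights)}, "
          f"within 3 inches of mean = {share_within(heights):.0%}")
```

Both lists report the same mean, yet a bar height chosen from that mean suits far fewer of the boys in the second list.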

For Mr. Jones’ class, the average height, the Mean Height, did not illuminate the information about the boys’ heights in a way that gave us a better understanding. We needed a closer look at the information to see our way through to the better solution. The variable-height bar works well for Mrs. Larsen’s class as well, with the lower setting good for 25 boys and the higher setting good for 21 boys.

Combining the data from both classes gives us this chart:

[Figure: bar chart of the combined heights of the boys in both classes]

This little example is meant to illustrate that while averages, like our Mean Height, serve well in some circumstances, they do not do so in others.

In Mr. Jones’ class, the larger number of shorter boys was obscured, hidden, covered-up, averaged-out by relying on the Mean Height to inform us of the best solutions for the horizontal chin-up bar.

It is worth noting that Mrs. Larsen’s class, shown in the first bar chart above, has a distribution of heights that more closely mirrors what is called a Normal Distribution, a graph of which looks like this:

[Figure: graph of a Normal Distribution]

Most of the values form a hump in the middle and fall off evenly, more or less, in both directions. Averages are good summaries of data sets that look like this, if one is careful to use a range on either side of the Mean. Means are not so good for data sets like Mr. Jones’ class, or for the combination of the two classes. Note that the Arithmetical Mean is exactly the same for all three data sets of boys’ heights (the two classes and the combined set), but the distributions are quite different and lead to different conclusions.
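Why the combined data set must share the same mean can be checked with a short Python sketch. It uses only the class sizes and means given in the text; the figure of 46 boys for Mrs. Larsen's class is inferred from the 25 + 21 boys served by the two bar settings, and the function name is my own.

```python
def combined_mean(groups):
    """Mean of the pooled data, given (count, mean) pairs for each group."""
    total = sum(count * group_mean for count, group_mean in groups)
    size = sum(count for count, _ in groups)
    return total / size

larsen = (46, 65)  # 46 boys (inferred as 25 + 21), mean height 65 inches
jones = (66, 65)   # 66 boys, mean height 65 inches

print("combined mean:", combined_mean([larsen, jones]))
```

The pooled mean depends only on the group sizes and group means, so it carries no information about how the heights inside each class are spread out.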

US Median Household Income

A very common measure of economic well-being in the United States is the US Census Bureau’s annual US Median Household Income.

First note that it is given as a MEDIAN — which means that there should be an equal number…