Measures of central tendency attempt to summarise a set of data with a single value that describes the centre or middle of the scores.
The three main measures of central tendency are the mean, median, and mode. Deciding which one is best depends on some other characteristics of the particular set of data, and we will look further into the suitability of the different measures in our lesson on describing distributions.
Mean: Often referred to as the average, this is the sum of the scores divided by the number of scores.
Median: The middle value of an ordered set of data, or the value that separates the bottom half and top half of the scores.
Mode: The most frequently occurring value. For continuous data or data grouped in class intervals, we talk about the modal class (the most frequently occurring class) rather than a mode.
The mean is described as the average of the numbers in a data set. It is defined as the sum of the scores divided by the number of scores.
The symbol for the mean of a sample is $\overline{x}$, whilst the population mean is represented by the symbol $\mu$ (Greek letter 'mu'). We typically don't have data for every member of the population, so we usually don't know $\mu$ exactly, but we can estimate it by using the sample mean, $\overline{x}$, from a well-designed survey.
If certain scores are repeated, such as when information is given in a frequency table, then we can find the total sum of all scores by multiplying each unique score by its frequency and adding the results.
We summarise the calculation of the mean below.
The mean of a set of data is calculated by:
$\text{Mean}=\frac{\text{Total sum of all scores}}{\text{Number of scores}}$
If certain scores are repeated, then:
$\text{Total sum of all scores}=\text{sum of}\ \left(\text{Unique score}\times\text{Frequency}\right)$
Now let's look at a few examples of calculating the mean of different data sets.
Find the mean from the data in the stem plot below.
Stem | Leaf |
---|---|
$2$ | $3$ $8$ |
$3$ | $1$ $1$ $1$ |
$4$ | $0$ $3$ |
$5$ | $0$ $3$ $8$ $8$ |
$6$ | $2$ $2$ $9$ |
$7$ | $1$ $8$ |
$8$ | $3$ |
$9$ | $0$ $0$ $1$ |
Think: We can find the mean by adding up all of the scores, then dividing the total by the number of scores.
Do:
$\text{Mean}=\frac{\text{Total of all scores}}{\text{Number of scores}}$
$=\frac{23+28+3\times31+40+43+50+53+2\times58+2\times62+69+71+78+83+2\times90+91}{20}$
$=\frac{1142}{20}$
$=57.1$
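The working above can be checked with a short Python sketch. It is illustrative only: the name `stem_plot` is made up for this example, and it assumes each leaf is a units digit attached to its stem (so stem $2$, leaf $3$ is the score $23$), as the worked solution does.

```python
# Stem plot from the example: each stem (tens digit) maps to its leaves (units digits).
stem_plot = {
    2: [3, 8],
    3: [1, 1, 1],
    4: [0, 3],
    5: [0, 3, 8, 8],
    6: [2, 2, 9],
    7: [1, 8],
    8: [3],
    9: [0, 0, 1],
}

# Rebuild the individual scores, e.g. stem 2 with leaf 3 becomes 23.
scores = [10 * stem + leaf for stem, leaves in stem_plot.items() for leaf in leaves]

total = sum(scores)   # total sum of all scores
count = len(scores)   # number of scores
mean = total / count

print(total, count, mean)  # 1142 20 57.1
```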
A statistician has organised a set of data into the frequency table shown.
Score ($x$) | Frequency ($f$) |
---|---|
$44$ | $8$ |
$46$ | $10$ |
$48$ | $6$ |
$50$ | $18$ |
$52$ | $5$ |
(a) Complete the frequency distribution table by adding a column showing the total sum for each unique score.
Think: For each unique score ($x$-value), multiply it by the number of times that score appears. In other words, multiply the unique score by its frequency $\left(f\right)$ to find the total sum for that score.
Do: So for a score of $44$, which occurred $8$ times, the total score is $44\times8=352$. Completing the entire column, we get the following table.
Score ($x$) | Frequency ($f$) | $fx$ |
---|---|---|
$44$ | $8$ | $352$ |
$46$ | $10$ | $460$ |
$48$ | $6$ | $288$ |
$50$ | $18$ | $900$ |
$52$ | $5$ | $260$ |
Totals | $47$ | $2260$ |
(b) Calculate the mean of this data set. Round your answer to two decimal places.
Think: We calculate the mean by dividing the sum of the scores (that is, the sum of all the $fx$ values) by the number of scores (the total frequency).
Do:
$\text{Mean}=\frac{\text{Total of all scores}}{\text{Number of scores}}$
$=\frac{2260}{47}$
$=48.09$ ($2$ d.p.)
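The frequency-table method can also be sketched in Python. This is a minimal illustration, assuming the table is stored as a dictionary from score to frequency; the names `frequency_table` and `fx` are chosen for this example only.

```python
# Frequency table from the example: score -> frequency.
frequency_table = {44: 8, 46: 10, 48: 6, 50: 18, 52: 5}

# The fx column: each unique score multiplied by its frequency.
fx = {score: score * freq for score, freq in frequency_table.items()}

total_sum = sum(fx.values())                  # sum of the fx column
total_freq = sum(frequency_table.values())    # number of scores
mean = total_sum / total_freq

print(total_sum, total_freq, round(mean, 2))  # 2260 47 48.09
```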
Find the mean of the following scores:
$8$, $15$, $6$, $27$, $3$.
In each game of the season, a basketball team recorded the number of 'three-point shots' they scored. The results for the season are represented in the given dot plot.
What was the total number of points scored from three-point shots during the season?
What was the mean number of points scored from three-point shots each game? Round to two decimal places if necessary.
What was the mean number of three-point shots per game this season? Round your answer to two decimal places if necessary.
The mean of $4$ scores is $21$. If three of the scores are $17$, $3$ and $8$, find the $4$th score, $x$.
The median is one way of describing the middle or the centre of a data set using a single value. The median is the middle score in a data set.
The data must be ordered (usually in ascending order) before calculating the median.
Suppose we have five numbers in our data set: $4$, $11$, $15$, $20$ and $24$.
The median would be $15$ because it is the value right in the middle. There are two numbers on either side of it.
$4,11,\boxed{15},20,24$
If we have a larger data set, however, we may not be able to see straight away which term is in the middle. There are two methods we can use to help us work this out.
Once a data set is ordered, we can cross out numbers in pairs (one high number and one low number) until there is only one number left. Let's check out this process using an example. Here is a data set with nine numbers: $1,1,3,5,7,9,9,10,15$. Crossing out $1$ and $15$, then $1$ and $10$, then $3$ and $9$, then $5$ and $9$, leaves only $7$, so the median is $7$.
Note that this process will only leave one term if there is an odd number of terms to start with. If there is an even number of terms, this process will leave two terms instead. If you cross them all out, you've gone too far! To find the median of a set with an even number of terms, we take the mean of the two remaining middle terms.
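The cross-out method can be sketched in Python as follows. This is a hedged illustration rather than a standard routine: the function name `median_by_crossing_out` is hypothetical, and the two sample data sets are the nine-number and four-number sets used in this lesson.

```python
def median_by_crossing_out(scores):
    """Find the median by repeatedly removing the lowest and highest score."""
    remaining = sorted(scores)  # the data must be ordered first
    if not remaining:
        raise ValueError("need at least one score")
    # Cross out one low and one high score until one or two scores are left.
    while len(remaining) > 2:
        remaining = remaining[1:-1]
    # One score left: that is the median. Two left: take their mean.
    return sum(remaining) / len(remaining)


odd_result = median_by_crossing_out([1, 1, 3, 5, 7, 9, 9, 10, 15])
even_result = median_by_crossing_out([8, 12, 17, 20])
print(odd_result, even_result)  # 7.0 14.5
```

Stopping while one or two scores remain is what prevents "going too far" with an even number of terms.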
We can also work out which term will be the middle number by considering whether there is an odd or even number of scores, and then using a formula.
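We summarise the formulas below.
Let $n$ be the number of terms.
If $n$ is odd, the median is the $\frac{n+1}{2}$th term.
If $n$ is even, the median is the mean of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th terms.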
Let's use the same set of nine numbers from the previous example, $1,1,3,5,7,9,9,10,15$. We can see that there is an odd number of scores, $n=9$, so the position of the median is:
$\text{Position of median}=\frac{9+1}{2}$ (where we've used $\frac{n+1}{2}$)
$=5$th term (simplifying the fraction)
So again, we find that the median is $7$.
Let's now try this with an even number of terms. Here is a data set with four terms: $8,12,17,20$. This time, we have $n=4$. What would happen if we used the same procedure as above?
$\text{Position of median}=\frac{4+1}{2}$ (where we've used $\frac{n+1}{2}$ again)
$=2.5$th term (simplifying the fraction)
What does the "$2.5$th term" mean? Well, just like when we used the "cross-out" method, the $2.5$th term means the average (mean) of the $2$nd and $3$rd terms. This is why, when the number of scores, $n$, is even, we find the average of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th terms.
Again, remember that the data must be in order before counting along to the median position. So in this example, the median will be the average of $12$ and $17$.
$\text{Median}=\frac{12+17}{2}$ (taking the average of the $2$nd and $3$rd scores)
$=14.5$ (simplifying the fraction)
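The position method translates directly into code. The sketch below is one possible way to write it; the function name `median_by_position` is hypothetical, and positions are counted from $1$ as in the text, so the code subtracts $1$ when indexing the Python list.

```python
def median_by_position(scores):
    """Find the median using the position formulas for odd and even n."""
    ordered = sorted(scores)  # the data must be in order before counting along
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one score")
    if n % 2 == 1:
        # Odd n: the median is the (n + 1) / 2 th term.
        position = (n + 1) // 2
        return ordered[position - 1]
    # Even n: average the (n / 2)th and (n / 2 + 1)th terms.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


odd_median = median_by_position([1, 1, 3, 5, 7, 9, 9, 10, 15])
even_median = median_by_position([8, 12, 17, 20])
print(odd_median, even_median)  # 7 14.5
```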
Consider the following scores:
$23,25,13,9,11,21,24,17,20$
Sort the scores in ascending order.
Calculate the median.
Write down $4$ consecutive odd numbers whose median is $40$.
Determine the following using the histogram:
The total number of scores.
The median.
The mode is another measure of central tendency; that is, it's a third way of describing a value that represents the centre of the data set. The mode describes the most frequently occurring score. For continuous data or data grouped in class intervals, we talk about the modal class (the most frequently occurring class) rather than a mode.
Let's say we ask $10$ people how many pets they have. $2$ people say no pets, $6$ people say one pet and $2$ people say they have two pets. What is the most common number of pets for people to have? In this case, the most common number is one pet, because the largest number of people, which was $6$, had one pet. So the mode of this data set is $1$.
Data can have more than one mode when several outcomes have the same highest frequency. When the data has two or more modes, we refer to it as being multimodal, and if it has exactly two modes, it is called bimodal.
Note: We can also refer to the general shape of the data as being bimodal if the data has two clear peaks. When talking about the general shape the peaks do not need to be of exactly the same height.
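To make the idea of one or several modes concrete, here is a small Python sketch that counts raw scores. The function name `find_modes` and the list `bimodal_scores` are invented for illustration; only the pets data comes from the example above.

```python
from collections import Counter


def find_modes(scores):
    """Return every score that shares the highest frequency (needs at least one score)."""
    counts = Counter(scores)          # score -> how many times it occurs
    highest = max(counts.values())    # the largest frequency
    return sorted(score for score, count in counts.items() if count == highest)


# Pets survey: 2 people with no pets, 6 with one pet, 2 with two pets.
pets = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2]
pet_modes = find_modes(pets)

# A made-up data set in which two scores tie for the highest frequency.
bimodal_scores = [1, 1, 2, 2, 3]
two_modes = find_modes(bimodal_scores)

print(pet_modes, two_modes)  # [1] [1, 2]
```

Returning a list rather than a single value means bimodal and multimodal data are reported in full.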
A statistician organised a set of data into the frequency table shown below. Find the mode of the data.
Score ($x$) | Frequency ($f$) |
---|---|
$10$ | $26$ |
$20$ | $10$ |
$30$ | $18$ |
$40$ | $18$ |
$50$ | $15$ |
Think: The mode is the score that occurs most frequently.
Do: The highest number in the frequency column is $26$. This corresponds to the score of $10$, and therefore the mode is $10$.
Reflect: At a glance, it may seem unusual that $10$ is the mode, since the mode measures central tendency, and $10$ is far from being the centre of the numbers that we saw between $10$ and $50$.
The mode measures central tendency, but a different kind of central tendency. It tells us where the data likes to "bunch up", which gives us an approximation for what score we're likely to draw if we sample from the data set.
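The contrast in the reflection can be seen numerically. The sketch below reads the mode straight from the frequency table and, for comparison, computes the mean of the same table. The mean is not part of the original question and is included only as an illustration; the variable names are assumptions of this example.

```python
# Frequency table from the example: score -> frequency.
frequency_table = {10: 26, 20: 10, 30: 18, 40: 18, 50: 15}

# The mode is the score with the largest frequency.
mode = max(frequency_table, key=frequency_table.get)

# For comparison only: the mean of the same data.
total_sum = sum(score * freq for score, freq in frequency_table.items())
total_freq = sum(frequency_table.values())
mean = total_sum / total_freq

print(mode, round(mean, 2))  # 10 28.39
```

Here the mode sits at the lowest score while the mean is close to the middle of the range, showing that the two measures describe the centre in different ways.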
Find the mode of the following scores:
$8,18,5,2,2,10,8,5,14,14,8,8,10,18,14,5$
Find the mode from the histogram shown.
For data grouped in intervals, such as continuous data, we cannot find the exact measures of centre because we do not have the individual scores. We can, however, find approximate measures by representing all scores in an interval by the class centre (midpoint) of that interval.
Estimate the mean for the data represented in the grouped frequency table:
Class | Frequency |
---|---|
$30-<40$ | $12$ |
$40-<50$ | $16$ |
$50-<60$ | $25$ |
$60-<70$ | $7$ |
Think: To estimate the mean for the data, we first need to determine the class centres, which will be used to represent all the scores in a class. For instance, the class centre for the first interval is $\frac{30+40}{2}=35$.
We then use: $\text{Total sum of all scores}\approx\text{sum of}\ \left(\text{Class centre}\times\text{Frequency}\right)$
Do:
Class | Class centre | Frequency |
---|---|---|
$30-<40$ | $35$ | $12$ |
$40-<50$ | $45$ | $16$ |
$50-<60$ | $55$ | $25$ |
$60-<70$ | $65$ | $7$ |
$\text{Mean}=\frac{\text{Total sum of all scores}}{\text{Number of scores}}$
$\approx\frac{35\times12+45\times16+55\times25+65\times7}{60}$
$=49.5$
Thus, the mean for this data set is approximately $49.5$.
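The class-centre estimate can be sketched in Python as below. The representation of each class as a `(lower, upper)` pair and the name `grouped_table` are assumptions made for this example. The frequencies are those used in the worked solution, including $7$ for the last class.

```python
# Grouped frequency table: ((lower bound, upper bound), frequency).
grouped_table = [
    ((30, 40), 12),
    ((40, 50), 16),
    ((50, 60), 25),
    ((60, 70), 7),   # frequency 7, as used in the worked solution
]

estimated_total = 0
total_freq = 0
for (lower, upper), freq in grouped_table:
    class_centre = (lower + upper) / 2      # midpoint represents every score in the class
    estimated_total += class_centre * freq  # class centre x frequency
    total_freq += freq

estimated_mean = estimated_total / total_freq
print(total_freq, estimated_mean)  # 60 49.5
```

The result is only an estimate because the individual scores inside each class are unknown.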
Consider the table below.
Score | Frequency |
---|---|
$1$ - $4$ | $2$ |
$5$ - $8$ | $7$ |
$9$ - $12$ | $15$ |
$13$ - $16$ | $5$ |
$17$ - $20$ | $1$ |
Use the midpoint of each class interval to determine an estimate for the mean of this sample distribution. Round your answer to one decimal place.
Which is the modal group?
$1$ - $4$
$17$ - $20$
$13$ - $16$
$5$ - $8$
$9$ - $12$
Consider the table below.
Score ($x$) | Frequency |
---|---|
$0\le x<20$ | $4$ |
$20\le x<40$ | $15$ |
$40\le x<60$ | $23$ |
$60\le x<80$ | $73$ |
$80\le x<100$ | $45$ |
Use the midpoint of each class interval to determine an estimate for the mean of this sample distribution. Round your answer to one decimal place.
Which is the modal group?
$0\le x<20$
$60\le x<80$
$20\le x<40$
$40\le x<60$
$80\le x<100$
Throughout this chapter, and in particular for moderate to large data sets, you should use appropriate technology, such as a calculator with a statistics mode or a statistics program on your computer.
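As one example of such technology, Python's built-in `statistics` module can find all three measures of centre. This is only a sketch of one option, not the tool the lesson prescribes; the data set is the nine-number set from the median section, and `statistics.multimode` requires Python 3.8 or later.

```python
import statistics

scores = [1, 1, 3, 5, 7, 9, 9, 10, 15]

mean = statistics.mean(scores)        # sum of scores divided by number of scores
median = statistics.median(scores)    # middle score of the ordered data
modes = statistics.multimode(scores)  # every score with the highest frequency

print(round(mean, 2), median, modes)  # 6.67 7 [1, 9]
```

This data set is bimodal, since $1$ and $9$ each occur twice.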