When we display bivariate data that appears to have a linear relationship, we often wish to find a line that best models the relationship so we can see the trend and make predictions. We call this the line of best fit.
We want to draw a line of best fit for the following scatterplot:
Let's try drawing three lines across the data and consider which is most appropriate.
We can tell straight away that $A$ is not the right line. This data appears to have a positive linear relationship, but $A$ has a negative gradient. $B$ has the correct sign for its gradient, and it passes through three points! However, there are many more points above the line than below it, and we should try to make sure the line of best fit passes through the centre of all the points. This means that line $C$ is the best fit for this data out of the three lines.
Below is an example of what a good line of best fit might look like.

The following scatter plot shows the data for two variables, $x$ and $y$.
A scatter plot with both the x- and y-axes ranging from $0$ to $10$. Both axes are marked at intervals of $1$. Eight data points are plotted as solid black dots on the scatter plot. The first data point is at $\left(1,2\right)$. The second data point is at $\left(2,1\right)$. The third data point is at $\left(3,3\right)$. The fourth data point is at $\left(4,5\right)$. The fifth data point is at $\left(5,6\right)$. The sixth data point is at $\left(6,5\right)$. The seventh data point is at $\left(7,7\right)$. The eighth data point is at $\left(8,7\right)$. The coordinates are not explicitly labeled or given.
Determine which of the following graphs contains the line of best fit.
A scatter plot with both the x- and y-axes ranging from $0$ to $10$. Both axes are marked at intervals of $1$. Eight data points are plotted as solid black dots on the scatter plot. The first data point is at $\left(1,2\right)$. The second data point is at $\left(2,1\right)$. The third data point is at $\left(3,3\right)$. The fourth data point is at $\left(4,5\right)$. The fifth data point is at $\left(5,6\right)$. The sixth data point is at $\left(6,5\right)$. The seventh data point is at $\left(7,7\right)$. The eighth data point is at $\left(8,7\right)$. A straight green line runs diagonally from the lower left of the plot, upwards to the upper right of the plot. The line passes through one of the plotted points. The plotted points are positioned closely along the line. Two points are above the line. Five points are below the line. The coordinates of the plotted points are not explicitly labeled.
A scatter plot with both the x- and y-axes ranging from $0$ to $10$. Both axes are marked at intervals of $1$. Eight data points are plotted as solid black dots on the scatter plot. The first data point is at $\left(1,2\right)$. The second data point is at $\left(2,1\right)$. The third data point is at $\left(3,3\right)$. The fourth data point is at $\left(4,5\right)$. The fifth data point is at $\left(5,6\right)$. The sixth data point is at $\left(6,5\right)$. The seventh data point is at $\left(7,7\right)$. The eighth data point is at $\left(8,7\right)$. A straight green line runs diagonally from the lower left of the plot, upwards to the upper right of the scatter plot. The plotted points are positioned very closely along this line. Four data points are above the line. Four data points are below this line. The coordinates of the plotted points are not explicitly labeled.
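The "centre of all the points" idea can be checked by counting how many points sit above, on, and below a candidate line. The sketch below does this for the eight points described above. The function name `count_sides` and the two candidate lines are assumptions chosen for illustration; they are not the lines drawn in the graphs.

```python
# The eight data points described for the scatter plot.
points = [(1, 2), (2, 1), (3, 3), (4, 5), (5, 6), (6, 5), (7, 7), (8, 7)]


def count_sides(points, gradient, intercept, tol=1e-9):
    """Count the points above, on, and below y = gradient * x + intercept."""
    above = on = below = 0
    for x, y in points:
        # Vertical distance from the line to the point.
        residual = y - (gradient * x + intercept)
        if abs(residual) <= tol:
            on += 1
        elif residual > 0:
            above += 1
        else:
            below += 1
    return above, on, below


# Two hypothetical candidate lines (assumed for illustration only).
print(count_sides(points, 1, 1))  # y = x + 1 -> (0, 3, 5)
print(count_sides(points, 1, 0))  # y = x     -> (3, 2, 3)
```

The first candidate passes through three points but leaves every other point below it, so it is unbalanced. The second has the same number of points on each side, which is closer to what we want from a line of best fit.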
The following scatter plot shows the data for two variables, $x$ and $y$.
Determine which of the following graphs contains the line of best fit.
Use the line of best fit to estimate the value of $y$ when $x=4.5$.
$4.5$
$5$
$5.5$
$6$
Use the line of best fit to estimate the value of $y$ when $x=9$.
$6.5$
$7$
$8.4$
$9.5$
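Once a line of best fit is known, making a prediction means substituting the $x$-value into its equation. The sketch below makes two assumptions: that the data is the eight points described earlier, and that the line is found by the least-squares method. Least squares is one common formal way to define a line of best fit; a line drawn by eye will give similar but not identical estimates. The names `least_squares_line` and `predict` are illustrative.

```python
# Assumed data: the eight points described earlier in this section.
points = [(1, 2), (2, 1), (3, 3), (4, 5), (5, 6), (6, 5), (7, 7), (8, 7)]


def least_squares_line(points):
    """Return (gradient, intercept) of the least-squares line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    gradient = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


gradient, intercept = least_squares_line(points)


def predict(x):
    """Estimate y for a given x using the fitted line."""
    return gradient * x + intercept


for x in (4.5, 9):
    print(f"x = {x}: estimated y = {predict(x):.1f}")
```

Note that $x=9$ lies outside the range of the plotted data, so that estimate is an extrapolation and should be treated with more caution than the estimate at $x=4.5$.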