Algebra: Box Plots, Interquartile Range and Outliers, Explained!

Ahoy math friends! This post takes a look at one method of analyzing data: the box plot. This method is great for visually identifying outliers and the overall spread of numbers in a data set.

Box plots look something like this:

[Image: example box plot]

Why Box Plots?

Box Plots are a great way to see the distribution of a set of data at a glance. For example, if we wanted to visualize the range of temperatures recorded over a day in NYC, we would gather all of our data (the temperatures for the day), and once a box plot was made, we could easily identify the highest and lowest temperatures in relation to the median (the middle number of the ordered data).

From looking at a Box Plot, we can also quickly find the Interquartile Range and the upper and lower Outliers. Don’t worry, we’ll go over each of these later, but first, let’s construct our Box Plot!

[Image: temperature data]

-> First, we want to put all of our temperatures in order from smallest to largest.

[Image: temperatures in order]

-> Now we can find Quartile 1 (Q1), Quartile 2 (Q2) (which is also the median), and Quartile 3 (Q3). We do this by splitting the data into sections and finding the middle value of each section.

[Image: quartiles marked on the ordered data]

Q1=Median of first half of data

Q2=Median of entire data set

Q3=Median of second half of data
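The three definitions above can be sketched in a few lines of Python. This is only an illustration: the temperatures are made-up values (not the ones from the post's images), the `quartiles` helper is a hypothetical name, and leaving the middle value out of both halves when the count is odd is one common convention that the post does not spell out.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)            # order from smallest to largest
    n = len(data)
    half = n // 2
    lower = data[:half]              # first half of the data
    upper = data[half + n % 2:]      # second half (skips the middle value if n is odd)
    return median(lower), median(data), median(upper)

# Hypothetical temperatures, chosen only for illustration
temps = [88, 65, 92, 72, 80, 70, 89, 85]

q1, q2, q3 = quartiles(temps)
print(q1, q2, q3)  # 71.0 82.5 88.5
```

Other tools (spreadsheets, calculators, statistics libraries) sometimes use slightly different quartile rules, so small differences from this method are normal.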

-> Now that we have all of our quartiles, we can make our Box Plot! We also need to take note of the minimum and maximum values of our data, which are 65 and 92 respectively. Let’s lay out all of our data below and then build our box plot:

[Image: minimum, quartiles, and maximum laid out]

[Image: completed box plot]
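To get a feel for how the five numbers turn into a picture, here is a rough text-only sketch of a box plot. The minimum (65) and maximum (92) come from the post; the three quartile values are assumed for illustration, and `ascii_box_plot` is a hypothetical helper, not a standard function. It also assumes the maximum is larger than the minimum.

```python
def ascii_box_plot(minimum, q1, q2, q3, maximum, width=40):
    """Draw a one-line text box plot from a five-number summary."""
    span = maximum - minimum         # assumes maximum > minimum

    def col(x):
        # Map a data value to a column index between 0 and width - 1
        return round((x - minimum) / span * (width - 1))

    line = [" "] * width
    for i in range(col(minimum), col(q1)):
        line[i] = "-"                # left whisker: minimum up to Q1
    for i in range(col(q1), col(q3) + 1):
        line[i] = "="                # the box: Q1 through Q3
    for i in range(col(q3) + 1, col(maximum) + 1):
        line[i] = "-"                # right whisker: Q3 up to maximum
    line[col(minimum)] = "|"         # minimum marker
    line[col(maximum)] = "|"         # maximum marker
    line[col(q2)] = "M"              # median marker
    return "".join(line)

# 65 and 92 are from the post; 71, 82.5 and 88.5 are assumed quartiles
plot = ascii_box_plot(65, 71, 82.5, 88.5, 92)
print(plot)
```

The `=` stretch is the box (Q1 to Q3), `M` marks the median, and the `-` runs on either side are the whiskers reaching out to the minimum and maximum.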

Now that we have our Box Plot, we can easily find the Interquartile Range and upper/lower Outliers.


-> The Interquartile Range (IQR) is the difference between Q3 and Q1, that is, IQR = Q3 - Q1. Since we know both of these values, this should be easy!

[Image: Interquartile Range calculation]

Next, we calculate the upper/lower Outliers.


-> The Upper/Lower Outliers are extreme data points that can skew the data, affecting the distribution and our impression of the numbers. To see if there are any outliers in our data, we use the following formulas, which give the cutoffs for extreme data points below and above the central data points.

Lower Outlier cutoff = Q1 - (1.5 x IQR)

Upper Outlier cutoff = Q3 + (1.5 x IQR)

*These numbers tell us that any data points below 44.75 or above 114.75 would be considered outliers, ultimately skewing our data. Since our minimum (65) and maximum (92) both fall between these cutoffs, our temperatures contain no outliers. For example, if we had a temperature of [value shown in image] or [value shown in image], these would both be considered outliers.
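Here is a small sketch of the outlier check in Python, assuming the common 1.5 x IQR rule. The values Q1 = 71 and Q3 = 88.5 are assumptions, picked because they reproduce the 44.75 and 114.75 cutoffs quoted above; the function names and the sample list are hypothetical too.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (iqr, lower_cutoff, upper_cutoff) for the k x IQR rule."""
    iqr = q3 - q1                    # Interquartile Range
    return iqr, q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3):
    """List the values that fall outside the two cutoffs."""
    _, low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]

# Assumed quartiles (not read from the post's images)
iqr, low, high = outlier_fences(71, 88.5)
print(iqr, low, high)  # 17.5 44.75 114.75

# Made-up sample: 40 and 120 fall outside the cutoffs, 65 and 92 do not
print(find_outliers([65, 70, 40, 92, 120], 71, 88.5))  # [40, 120]
```

Values sitting exactly on a cutoff are not flagged here, since the post describes outliers as points below or above those numbers.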


Practice Questions:

[Image: practice questions]

Solutions:

[Images: worked solutions to the practice questions]

Still got questions? No problem! Don’t hesitate to comment with any questions or check out the video above. Happy calculating! 🙂
