Lesson #8 - Tool #5 - Scatter Diagram
A Tool to Show Relationships between Variables or Attributes

© The Quality Web, authored by Frank E. Armstrong, Making Sense
Chronicles - 2003 - 2016

# TOOL #5 - THE SCATTER DIAGRAM

The Scatter Diagram is another Quality Tool. It shows the
relationship between "paired data" and can provide more useful
information about a production process. What is meant by
"paired data"? It is two kinds of data collected together, most
often a suspected cause and its effect. Besides this
"cause-and-effect" relationship, paired data may also describe a
relationship between one cause and another, or between one cause
and several others. For example, you could consider the
relationship between an ingredient and the product hardness;
between the cutting speed of a blade and the variations observed
in the length of parts; or between the illumination levels on the
production floor and the mistakes made in quality inspection of
the product.
To illustrate this relationship, below are a few examples of scatter
diagrams indicating the relationships between paired data. We will
discuss how to interpret these charts, and then we will learn how
to make one with paper and pencil.
The first diagram exhibits strong correlation, or a strong
connection from one attribute to another. The second diagram
has a moderate correlation, and the third diagram has no
correlation, which means one attribute does not contribute to the
other.
In the above examples, you can see that the dots, which are
actually data points, form different patterns. The strong
correlation indicates that there is a close relationship between the
data that is paired together. In the middle diagram, you see a
slightly different pattern indicating that some data points show a
relationship and others do not. The last diagram on the right
indicates that there is no correlation, or no relationship at all,
between the paired data.
In the first diagram on the left, you would be able to determine
that one measurement has a strong relationship to the other. This
is strong evidence that one item closely affects the other,
although the diagram alone does not prove which one is the cause.
In the last diagram on the right, you would be able to determine
that there is no relationship between the two items. You would
then need to review the "Cause-and-Effect" Diagram or
"brainstorming" session to find another item that might have a
relationship to the primary item you measured.
The middle diagram is the one that is going to cause you some
grief. This particular diagram is more difficult to interpret, and
requires a more detailed investigation into which data points
correlate and which data points show no relationship. Then, you
need to try to determine why certain ones reveal a relationship
and others do not.
How To Make A Scatter Diagram
The Basic Scatter Diagram Layout
Once again, it is best if you have graph paper to make your
diagram with. However, I am going to show you how to do this
with a spreadsheet form, and at the bottom of this lesson, there is
a blank spreadsheet that you can use for the production floor.
On gridline or graph paper:
STEP #1 - Draw an "L" form just like you did for the Pareto diagram
(see the figure below). Mark your scale units at even multiples,
such as 10, 20, etc., so that the scale is evenly spaced.
STEP #2 - On the Horizontal axis (Known as the "X" axis, from Left
to Right) you place the Independent or "cause" variable.
STEP #3 - On the Vertical axis (Known as the "Y" axis, from Bottom
to Top) you place the Dependent or "effect" variable.
STEP #4 - Plot each data point at the intersection of its X and Y
values. For example, for X = 5, Y = 2: go right 5 spaces, and then
go up 2 spaces to plot the point.
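The four steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original lesson: the function name `plot_points`, the grid size, and the sample data are assumptions made for the example. It draws the "L" form with text characters and marks each (X, Y) pair by going right X spaces and up Y spaces.

```python
def plot_points(points, width=10, height=6):
    """Return a text grid with '*' at each (x, y); the origin is bottom-left.

    `plot_points`, `width`, and `height` are hypothetical names chosen
    for this sketch.
    """
    # One row per Y value, one column per X value.
    grid = [[" "] * (width + 1) for _ in range(height + 1)]
    for x, y in points:
        # Go right x spaces, then up y spaces (STEP #4).
        if 0 <= x <= width and 0 <= y <= height:
            grid[y][x] = "*"
    lines = []
    # Print the highest Y first so the vertical axis reads bottom to top.
    for y in range(height, -1, -1):
        lines.append(f"{y:2d} |" + "".join(grid[y]))
    # The horizontal (X) axis closes the "L" form.
    lines.append("   +" + "-" * (width + 1))
    return "\n".join(lines)


# Made-up paired data; (5, 2) is the example point from STEP #4.
sample = [(5, 2), (1, 1), (3, 2), (7, 4), (9, 5)]
chart = plot_points(sample)
print(chart)
```

On paper the procedure is the same: the cause goes on the horizontal axis, the effect on the vertical axis, and each pair becomes one dot.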
Linear Relationship: does the Data "Line Up"?
Linearity has Four Parameters:
1. Correlation - Measures how well the data line up. The
more the data resembles a straight line, the higher the
correlation to each other.
2. Slope - Measures the steepness of the data. The steeper
the data slope, assuming the correlation is good, the greater
the importance of the relationship, because a change in the
"X" variable will have a larger impact on the "Y" variable.
3. Direction - The "X" variable can have a positive or a
negative impact on the "Y" variable. In a negative
correlation, as one factor goes up, the other goes down. In a
positive correlation, both factors move in the same
direction. In the graph examples below, you can see that the
positive correlation moves from the lower left toward the
upper right. The negative correlation moves from the upper
left toward the lower right.
4. Y Intercept - Where a line drawn through the data crosses
the "Y" axis. When the "X" scale starts at zero, for a
positive correlation it represents the minimum "Y" value; for
a negative correlation it represents the maximum "Y" value.
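The four parameters can also be estimated numerically. The sketch below is an assumption-laden illustration, not the lesson's own method: it uses the Pearson correlation coefficient and an ordinary least-squares line, which are common choices but are not named in this lesson. The function name `linear_summary` and the sample data are hypothetical.

```python
import math


def linear_summary(xs, ys):
    """Estimate correlation, slope, direction, and Y intercept.

    Uses Pearson's r and a least-squares line (an assumed technique;
    the lesson itself works by eye on paper).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)        # 1. how well the data line up
    slope = sxy / sxx                     # 2. steepness of the line
    if slope > 0:                         # 3. direction of the relationship
        direction = "positive"
    elif slope < 0:
        direction = "negative"
    else:
        direction = "none"
    intercept = mean_y - slope * mean_x   # 4. where the line crosses the Y axis
    return {
        "correlation": r,
        "slope": slope,
        "direction": direction,
        "y_intercept": intercept,
    }


# Made-up data that lies exactly on the line Y = 2X + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
summary = linear_summary(xs, ys)
print(summary)
```

A correlation near +1 or -1 means the points lie close to a straight line; a value near 0 corresponds to the "no correlation" pattern.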
You can see that the data pattern moving from the bottom left
upward to the top right indicates a positive correlation between
the data. This is an upward sloping data grouping.
Conversely, the data pattern moving from the top left
downward to the bottom right indicates a negative correlation
between the data, and hence a downward sloping data grouping.
TEST YOUR LEARNING
On the Scatter Chart AT THIS LINK, you are going to plot the data
from the table above. This sample data is taken from a
manufacturing process. Two related factors were thought to
affect the outcome of the product: the conveyor speed in
centimeters per second, and the cut length of the product. The
problem is that there was a fairly large inconsistency in the cut
lengths of the rubber tubing produced. The goal is to determine
whether there is a relationship between the production conveyor
speed and the resulting cut lengths.
You will now plot the data from the above data sheet onto the
manual scatter diagram, and then add up the totals. You will see a
correlation: as the conveyor speed increases, the length of the
cut piece increases as a rule. However, you will also notice that
conveyor speed is not the only probable cause. The dispersion of
the cut lengths for the same conveyor speed is due to other causes,
which would need to be considered. The point is that while there
is a partial relationship, there is still more to discover, and more
brainstorming activity is required.
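The "add up the totals" step can be sketched as a tally of how many data points fall in each cell of the chart, with a total per column and per row. The readings below are invented for illustration only; they are not the lesson's data table, and the name `tally_with_totals` is hypothetical.

```python
from collections import Counter


def tally_with_totals(pairs):
    """Count points per (x, y) cell and total them by column and by row."""
    counts = Counter(pairs)
    xs = sorted({x for x, _ in pairs})
    ys = sorted({y for _, y in pairs})
    # Column total: all points plotted above one X value.
    column_totals = {x: sum(counts[(x, y)] for y in ys) for x in xs}
    # Row total: all points plotted beside one Y value.
    row_totals = {y: sum(counts[(x, y)] for x in xs) for y in ys}
    return counts, column_totals, row_totals


# Invented (conveyor speed in cm/s, cut length) readings.
readings = [
    (5, 100), (5, 101), (5, 100),
    (6, 101), (6, 102),
    (7, 102), (7, 103), (7, 102),
]
counts, column_totals, row_totals = tally_with_totals(readings)
print(column_totals)
print(row_totals)
```

The column totals and the row totals must each add up to the number of readings, which is a quick check that no point was missed or plotted twice.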
There is another Scatter Diagram method to be considered
further, in which you test the correlation between two kinds of
data using 4 Quadrants, and calculate the difference per
quadrant. This, however, is a more complicated method and is
often used with Design of Experiments. We will not consider this
method within this lesson.
CHECK YOUR WORK
Hopefully, you actually did spend the time plotting the Scatter
Diagram on the attached chart. The best way to understand it is
to actually create one yourself. You Learn Best by Doing it
Yourself!!
Your column totals and row totals, along with your finished
Scatter Diagram, should resemble the final product I have
prepared for you. CLICK HERE to check your finished work against
the actual chart.