
Most people assume that parts mass-produced on a machine are all exactly alike. In fact, even with the best machines and processes, no two parts are exactly the same. A product has a target, or "mean", specification with a plus/minus tolerance; any part produced within that range is an acceptable part. The object is to hit the target specification, but that is not always possible. The purpose of a histogram is to take the data collected from a process and display it graphically, so you can see how the distribution of the data centers itself around the mean, or main specification. From the data, the histogram will graphically show:
Below, you will see an example of a histogram. Notice that there is one main peak, but also two secondary peaks on either side of the main peak.
The easiest way to explain how a histogram is formed is this: the range of the data is split into equal-sized bins (called classes), and then the number of data points that fall into each bin is counted. The best way to understand how a histogram is formed is to actually prepare one, so you should try to do the same as you follow along.
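The binning idea described above can be sketched in a few lines of Python. This is only an illustration: the function name `bin_counts` and the sample values are made up for this sketch and are not part of the lesson's data.

```python
def bin_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < low or v > high:
            continue  # values outside the range are ignored in this sketch
        # A value exactly on the top edge is placed in the last bin.
        idx = min(int((v - low) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Made-up sample values, for illustration only.
sample = [1.0, 1.5, 2.2, 2.8, 3.1, 3.3, 3.9, 4.6]
counts = bin_counts(sample, low=1.0, high=5.0, n_bins=4)

# Print one row of X marks per bin, like a hand tally.
for i, c in enumerate(counts):
    print(f"bin {i}: {'X' * c}")
```

Each row of X marks is one bar of the histogram turned on its side, which is the same picture the manual tally form produces.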
We will use the data listed in Figure 6 for our exercise. This data represents measurements taken from a process that makes machine parts, produced on Line A and Line B. The specification is 150 ± 0.5 mm. The values on the chart were arrived at by subtracting 150 mm from the measured value and then multiplying by 10. For example, a measurement of 149.9 mm gives 149.9 - 150 = -0.1, which multiplied by 10 equals -1. We will now make a histogram of the data listed below, compare the parts produced on Line A and Line B, and then overlay the two.
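The conversion from a measurement to a chart value can be written as a small Python helper. This is a minimal sketch: the function names `to_coded` and `from_coded` are hypothetical, and the rounding step assumes measurements are recorded to the nearest 0.1 mm, which the lesson implies but does not state.

```python
def to_coded(measured_mm, target_mm=150.0, scale=10):
    """Convert a measurement in mm to the chart value: (measured - 150) * 10."""
    # round() absorbs floating-point noise; assumes readings to the nearest 0.1 mm.
    return round((measured_mm - target_mm) * scale)


def from_coded(u, target_mm=150.0, scale=10):
    """Convert a chart value back to millimeters."""
    return target_mm + u / scale


for measured in (149.9, 150.0, 150.3):
    print(measured, "->", to_coded(measured))
```

Working with small whole numbers instead of values such as 149.9 makes the hand tally and the later arithmetic much easier.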
Figure 6 Figure 8 
Figure 7
CLASS EXERCISE: Plot the data from Lines A & B on the Manual Graph (an Excel version is also available). In this first exercise, put an "X" or "1" under the "Tally" column on the left side of the form, in the appropriate row, for every data point in Line A. Then do the same for the Line B data under the "Tally" column on the right side of the form. Total the number of occurrences of each value in the column marked "Frequency", then add the frequencies for A and B and put that number in the column marked "A + B frequency". When you are finished, your form should look like the completed example. The form you just completed is actually a histogram of both Line A and Line B. If you were to plot those numbers on an Excel bar graph, they would look like this:
Figure 9 - Line A Histogram
Figure 10 - Line B Histogram
Remember that the specification was 150.0 ± 0.5 mm; therefore, any plot on either graph that is more than +5 or less than -5 is a nonconforming product and is unacceptable. Visually, by comparing both histograms, you can see that Line A is shifted to the right of the center-line specification (150.0), while Line B is shifted to the left.
CLASS EXERCISE 2: Next, take the totals of Line A & B and plot that histogram. On your first exercise sheet, you added A + B and put that number in the far right column. Use the attached HISTOGRAM FORM to make your plots, putting an "X" in each square. I have already put the totals from your first sheet in the Frequency column. If we were to overlay both graphs, or plot both sets together, the histogram would look like Figure 11, and your form should look just like this.
Figure 11 - Combined Histogram of Line A & B
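The tallying done by hand in the two exercises can also be sketched in Python. The readings below are hypothetical values invented for this sketch; they are not the Figure 6 data, so the counts will not match your form. The nonconformance check uses the rule stated above: any converted value above +5 or below -5 is out of specification.

```python
from collections import Counter

# Hypothetical converted readings, for illustration only (not the Figure 6 data).
line_a = [3, 2, 4, 1, 3, 6, 2, 3]
line_b = [-2, -1, -3, 0, -2, -6, -1, -2]

freq_a = Counter(line_a)     # Line A "Frequency" column
freq_b = Counter(line_b)     # Line B "Frequency" column
freq_ab = freq_a + freq_b    # "A + B frequency" column

# Print the tally form: one row per converted value, X marks as the tally.
print("  u | Line A | Line B | A+B")
for u in range(6, -7, -1):
    tally_a = "X" * freq_a[u]
    tally_b = "X" * freq_b[u]
    print(f"{u:+3d} | {tally_a:<6} | {tally_b:<6} | {freq_ab[u]}")

# Any value beyond +5 or -5 is outside 150 ± 0.5 mm.
nonconforming = [u for u in line_a + line_b if abs(u) > 5]
print("Nonconforming readings:", nonconforming)
```

Adding the two `Counter` objects plays the same role as summing the two Frequency columns on the form.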

To calculate the mean (Xbar), or average value, and the standard deviation used for further statistical computations, we will use the chart below for Line A. The standard deviation is a measure of variability. Data is always scattered around the zone of central tendency, and the extent of this scatter is called dispersion or variation. Range is a simple measure of variation, but the most important measure is the standard deviation, which is the square root of the population variance. To read the chart: the first column shows each actual measurement on the right, with its converted "ui" value to the left of the measurement. The next column (fi) indicates how many times each value was recorded in the data. The third column (ui) repeats the converted value shown in the first column, that is, the class representative value in converted form. The fourth column is the second column multiplied by the third column (fi * ui). For example, 1 times -2 = -2.
The fifth column is a little tricky. You take the "fi" value and multiply it by the square of "ui" (for example, with ui = -2 and fi = 1: -2 * -2 = 4, times 1 = 4). I have done Line A for you. Now you need to practice by doing Line B, and then by computing the values for the combined Line A & Line B. Use the following BLANK FORMS to do your calculations. I have provided the initial numbers for you. After you have completed the exercise, you may select Exercise 3 below (see Check Your Work) and compare your results. Now, to compute Xbar and the standard deviation (s) from the table for Line A, we use the following formula:
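In symbols, and assuming the usual short-cut method for converted data, the computation for Line A takes roughly the following form. The symbols X_0 (the specification value, 150) and h (the factor 0.1) are names introduced here for illustration. The divisor N under the square root is an assumption that follows the "population variance" wording; dividing by N - 1 instead gives the same standard deviation to two decimal places.

```latex
\[
\bar{X} = X_0 + h \cdot \frac{\sum f_i u_i}{N}
        = 150 + 0.1 \times \frac{138}{60}
        = 150.23
\]
\[
s = h \sqrt{\frac{\sum f_i u_i^{2} - \left(\sum f_i u_i\right)^{2} / N}{N}}
  = 0.1 \times \sqrt{\frac{532 - 138^{2}/60}{60}}
  \approx 0.19
\]
```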
To explain the above formula: 150 is the specification value, 138 is the total of the fi * ui column, 60 is the total number of measurements taken (N), and 0.1 is the conversion factor that returns the result to millimeters (it undoes the earlier multiplication by 10). In the standard deviation formula, 532 is the total of the fi * ui^{2} column.
CLASS EXERCISE 4: You have already calculated the information on the previous form for Line B and for the combined Line A & B. Now it is time for you to COMPUTE THE MEAN and STANDARD DEVIATION for LINE B and for the combined LINE A & B. To help ensure that you are on the right track, I have given you the answers below. However, you still need to do the actual calculations for practice, to make sure you understand how to get the right answers.
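If you would like to check your arithmetic, the calculation can be sketched in Python using the Line A totals quoted above (138, 532 and N = 60). The function names and the tiny frequency table are hypothetical, made up for this sketch. The divisor is an assumption: `ddof=0` divides by N (population form) and `ddof=1` divides by N - 1; for Line A both round to the same two-decimal value.

```python
import math


def column_totals(freq):
    """Return (N, sum of fi*ui, sum of fi*ui^2) for a table {ui: fi}."""
    n = sum(freq.values())
    sum_fu = sum(f * u for u, f in freq.items())        # fourth column total
    sum_fu2 = sum(f * u * u for u, f in freq.items())   # fifth column total
    return n, sum_fu, sum_fu2


def coded_mean_and_std(sum_fu, sum_fu2, n, target=150.0, factor=0.1, ddof=0):
    """Mean and standard deviation in mm from the converted-column totals."""
    mean = target + factor * sum_fu / n
    # Sum of squared deviations of the converted values about their mean.
    ss = sum_fu2 - sum_fu ** 2 / n
    std = factor * math.sqrt(ss / (n - ddof))
    return mean, std


# Tiny made-up table: value -2 once, 0 three times, +1 twice.
print(column_totals({-2: 1, 0: 3, 1: 2}))

# Line A totals as quoted in the text.
xbar, s = coded_mean_and_std(sum_fu=138, sum_fu2=532, n=60)
print(f"Xbar = {xbar:.2f} mm, s = {s:.2f} mm")
```

To check Line B or the combined data, pass in the column totals from your own completed form.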

With the specification of 150 ± 0.5 mm, the width of the class, or class interval (here, the full width of the specification range), is 1 mm. This is about five times the standard deviation (s) of Line A, about five times the standard deviation (s) of Line B, and about four times the standard deviation (s) of A & B combined. For products to remain within specification, the width of the class should be at least SIX TIMES the standard deviation (s). The Process Capability Index (C_{p}) is a value indicating how capable a process is of producing product without many defects. The higher the process capability index, the narrower the spread of the process relative to the specification width, and the lower the possibility of defects. The process capability index (C_{p}) can be expressed as follows:
C_{p} = (width of class) / (6s) > 1
For Line A: C_{p} = 1.0 / (6 * .19) = 1.0 / 1.14 = .88
For Line B: C_{p} = 1.0 / (6 * .21) = 1.0 / 1.26 = .79
For Line A & B combined: C_{p} = 1.0 / (6 * .26) = 1.0 / 1.56 = .64
While both lines produce products close to the center of the specification, both indexes are less than 1, which indicates that defectives will be produced. Notice that when you combine Line A and Line B, you have defectives on both sides of the specification, so the number of defectives produced actually increases and the C_{p} drops even lower. For a process to be suitable, it should have a C_{p} greater than 1.0. The higher the number, the smaller the process spread relative to the specification width. In the chart below, you can see the C_{p}, or Process Capability Index, relative to the total product outside the two-sided specification limits, or +/- tolerance.
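The C_{p} arithmetic is easy to check with a short Python sketch. The function name `process_capability` is made up for this illustration; the specification width (1.0 mm) and the three standard deviations are the rounded values used in this lesson.

```python
def process_capability(spec_width, s):
    """C_p = specification width divided by six standard deviations."""
    return spec_width / (6 * s)


# Rounded standard deviations from the lesson, in mm.
lines = [("Line A", 0.19), ("Line B", 0.21), ("A & B combined", 0.26)]

for name, s in lines:
    cp = process_capability(1.0, s)
    verdict = "suitable" if cp > 1.0 else "defectives expected"
    print(f"{name}: Cp = {cp:.2f} ({verdict})")
```

Because the rounded standard deviations are used as inputs, the last digit of each result can differ slightly from a calculation carried out with unrounded values.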
What can we do to eliminate the defectives and improve the process capability?

Check your results on Exercise 3 here.

