Notes on Statistical Analysis used in SPC Control

Statistical analysis of a process is based on probabilities, not certainties. When you are monitoring a process and an SPC chart shows a point at or beyond a control limit, it does not necessarily mean the process is out of control. In a typical Shewhart-style control chart with 3-sigma limits, about 99.73% of samples from an in-control, normally distributed process fall within the limits, so the probability that a given sample will randomly exceed them is 1.0 - 0.9973 = 0.0027, or about 1 in 370. What it does mean is that you should monitor the process more carefully, and if the control limit violations continue, something is probably going wrong with the process (it is drifting out of control).
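The 1 in 370 figure can be reproduced directly from the normal distribution. The following is a minimal sketch, assuming the plotted statistic is normally distributed and the process is in control; the function name is only illustrative.

```python
import math

def two_sided_tail_probability(k_sigma: float) -> float:
    """Probability that a normal variate falls outside +/- k_sigma of its mean."""
    # For a standard normal Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return 1.0 - math.erf(k_sigma / math.sqrt(2.0))

# Chance that a single in-control sample lands outside the 3-sigma limits.
p_false_alarm = two_sided_tail_probability(3.0)

# Average number of samples between such false alarms.
average_run_length = 1.0 / p_false_alarm

print(f"false alarm probability per sample: {p_false_alarm:.4f}")
print(f"average samples between false alarms: {average_run_length:.0f}")
```

Changing the argument to 2.0 gives the corresponding rate for 2-sigma limits, which shows how quickly false alarms multiply as limits are tightened.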

When you are monitoring a Variable Control Chart, which uses continuous numeric sample data, it makes no practical difference whether a point is flagged with a strict test (> the upper control limit, or < the lower control limit) or an inclusive test (>= the upper control limit, or <= the lower control limit). The difference between a > test and a >= test is far smaller than the numeric precision of the items you are measuring. It would be like doing your taxes to 0.000001 of a cent when all of the important values are measured in whole dollars. If a point is near, at, or beyond the 3-sigma limit, you should treat that as a strong sign that the process may be out of control.

If you are monitoring an Attribute Control Chart, which by definition uses integer-based sample data (i.e. number of defects or number of defective parts), the choice between a strict test (> the upper control limit, or < the lower control limit) and an inclusive test (>= the upper control limit, or <= the lower control limit) can be very significant. The difference between a > test and a >= test is one whole integer unit, which is as large as the resolution of the counts you are measuring, so a control limit that lands exactly on an integer value can be hit by real samples.
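To see how much the choice of test can matter for count data, here is a small sketch that assumes a c-chart (defects per sample, modeled as Poisson) with a hypothetical average of 4 defects, chosen so that the upper control limit lands exactly on an integer. The numbers are illustrative only.

```python
import math

def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson-distributed count X."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

c_bar = 4.0                           # assumed average defects per sample
ucl = c_bar + 3.0 * math.sqrt(c_bar)  # c-chart upper limit: 4 + 3 * 2 = 10
ucl_count = int(ucl)                  # the limit is an exact integer in this case

# False alarm probability when the alarm fires only for count > UCL.
p_strict = 1.0 - poisson_cdf(ucl_count, c_bar)

# False alarm probability when the alarm also fires for count == UCL.
p_inclusive = 1.0 - poisson_cdf(ucl_count - 1, c_bar)

print(f"P(count >  UCL) = {p_strict:.5f}")
print(f"P(count >= UCL) = {p_inclusive:.5f}")
print(f"ratio = {p_inclusive / p_strict:.2f}")
```

Under these assumptions the inclusive test produces false alarms nearly three times as often as the strict test (roughly 0.008 versus 0.003 per sample), a difference that would be invisible on a chart of continuous measurements.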

All manufacturing processes have an inherent mean and sigma (the measure of process variation, expressed as the overall standard deviation). Statistical methods can estimate the mean and sigma values, but the result is only an estimate. This is where sampling theory comes in. Depending on the sampling strategy, there will usually be one specific chart type which is optimal. The range method of estimating a sample population's variance, used in the XBar-R chart, has been popular in quality control since it was first formalized in the 1920s, no doubt because of its ease of application in the days before calculators and computers. Relative to the direct standard deviation calculation, estimating the standard deviation using the range method is 100% efficient for subgroups of 2 points, and about 87% efficient for subgroups of 9 points. Median-Range charts are still used for a slightly different reason. When the sample subgroup data tends to be highly skewed (i.e. it does not follow a normal distribution), the median value of the subgroup can be a more accurate way to estimate the center of the subgroup than the actual mean calculation.
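The range method mentioned above can be sketched in a few lines. This example assumes equal-sized subgroups and uses the standard tabulated d2 constants to convert the average subgroup range into a sigma estimate. The function name and the measurement values are hypothetical.

```python
import math
import statistics

# Standard d2 constants: expected range of n normal samples, in sigma units.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}

def xbar_r_limits(subgroups):
    """Return (grand mean, sigma estimate, LCL, UCL) for an XBar chart."""
    n = len(subgroups[0])
    if any(len(s) != n for s in subgroups):
        raise ValueError("all subgroups must have the same size")
    grand_mean = statistics.fmean(statistics.fmean(s) for s in subgroups)
    r_bar = statistics.fmean(max(s) - min(s) for s in subgroups)
    sigma_hat = r_bar / D2[n]                    # range-based estimate of sigma
    half_width = 3.0 * sigma_hat / math.sqrt(n)  # 3-sigma limits for subgroup means
    return grand_mean, sigma_hat, grand_mean - half_width, grand_mean + half_width

# Hypothetical measurements: four subgroups of five parts each.
samples = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.0, 10.3, 9.7, 10.1, 9.9],
    [9.8, 10.0, 10.2, 10.1, 10.0],
    [10.2, 9.9, 10.0, 9.8, 10.1],
]
grand_mean, sigma_hat, lcl, ucl = xbar_r_limits(samples)
print(f"center = {grand_mean:.3f}, sigma estimate = {sigma_hat:.3f}")
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

Only a subtraction per subgroup and one division are needed to estimate sigma, which is why the method was practical for hand calculation.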

There are a large number of special rules, above and beyond the standard Shewhart chart 3-sigma limits. These include complicated rulesets such as the Western Electric Rules (WECO), Nelson, Juran, Gitlow, Hughes, AIAG, Westgard and Duncan. Almost all include the 2 of 3 beyond 2-sigma rule, the 4 of 5 beyond 1-sigma rule, and the 8 of 8 on one side of the centerline rule. Once you start adding more than the simple 3-sigma control rule tests to a process, you are adding more ways for the chart to signal. Each quasi-independent control rule is going to produce its own false positives. So if you have 15 control rules, each rule will have its own average run length (300 to 1000 samples, for example) before, on average, a false positive is produced. If you evaluate all 15 control rules every sample interval, the probability that at least one control rule produces a false positive is much higher than for the simple 3-sigma test alone. Ultimately it is a matter of cost. Is it worth the time and effort evaluating and chasing down false positives, trying to catch a true positive, compared to just waiting until the 3-sigma limits are triggered? As long as the control limits lie within the specification limits, no actual product has been rejected in either case.
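A rough feel for how false positives accumulate can be had with a short calculation. This sketch assumes the rules are fully independent, which real rulesets are not (the text above calls them quasi-independent), so treat the result as an approximation. The run length of 500 is just a hypothetical value from the 300 to 1000 range.

```python
def combined_false_alarm_rate(average_run_lengths):
    """Probability that at least one rule fires falsely on a given sample.

    Assumes each rule fires independently with probability 1 / ARL.
    """
    p_no_alarm = 1.0
    for arl in average_run_lengths:
        p_no_alarm *= 1.0 - 1.0 / arl
    return 1.0 - p_no_alarm

# The 3-sigma rule alone, with its average run length of about 370.
single_rule = combined_false_alarm_rate([370])

# Fifteen rules, each assumed to have an average run length of 500.
fifteen_rules = combined_false_alarm_rate([500] * 15)

print(f"3-sigma rule only: 1 false alarm per {1 / single_rule:.0f} samples")
print(f"15 rules together: 1 false alarm per {1 / fifteen_rules:.0f} samples")
```

Under these assumptions the combined ruleset raises a false alarm roughly once every 34 samples, about ten times as often as the 3-sigma rule alone.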

The Specification Limits of a process are often confused with control limits. Specification limits are imposed externally and are not calculated based on the manufacturing process under control. They represent the maximum deviation allowable for the process variable being measured. Parts will be (or should be) rejected if they are out of spec. Specification limits are calculated based on input from customers and/or engineering. Usually specification limits are going to be wider than the SPC 3-sigma limits, because you want the SPC control limits to trip before you get to the specification limits. The SPC control limits give you advance notice that the process is going south before you start rejecting parts based on specification limits. Another way to look at it is that specification limits are set by marketing and engineering, while the SPC control limits are set by the statistical variation inherent in the machines making the product.
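The relationship between the two kinds of limits can be illustrated with a small classifier. This is only a sketch: the function name, the limit values and the three outcome labels are hypothetical, and it assumes the control limits lie inside the specification limits.

```python
def classify_measurement(value, lcl, ucl, lsl, usl):
    """Classify one measurement against control and specification limits."""
    if value < lsl or value > usl:
        return "reject"       # out of spec: the part itself is unacceptable
    if value < lcl or value > ucl:
        return "investigate"  # in spec, but the process is signaling a problem
    return "ok"               # inside both sets of limits

# Hypothetical limits: spec limits from engineering, control limits from the process.
LSL, USL = 9.0, 11.0   # specification limits
LCL, UCL = 9.6, 10.4   # 3-sigma control limits

for measurement in (10.1, 10.6, 11.2):
    print(measurement, classify_measurement(measurement, LCL, UCL, LSL, USL))
```

The middle outcome is the advance notice described above: the part is still acceptable, but the chart is warning that the process may be heading toward the specification limits.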