Treatment of Errors and
Uncertainties in Physics Experiments
You, the experimenter, will go and measure some
physical quantities numerically (using some piece of equipment to
do so). You make a measurement. The measurement is necessarily
somewhat different than the true value of the physical
quantity.
First, we need to define some terms:
- Accuracy: This is the extent to
which your measurement is in fact close to the
true value. If you do not a priori know the true value,
then it may be difficult to determine to what extent your
measurement is accurate.
- Precision: This is the extent to
which you can specify the exactness of a measurement. For
example, to report that the time is about 3 PM is less
precise than to say the time is 3:02:45. Being more
precise does not always imply being more accurate.
- Statistical (Random) Uncertainty: This
is a key idea for this lab. The statistical uncertainty
of a measurement is the uncertainty that reflects the
fact that every time you make a measurement, you must, by
necessity, measure a slightly different quantity each
time. The tendency for a measured value to "jump
around" from measurement to measurement is the
statistical error.
- Systematic Uncertainty: This is
uncertainty and error in your measurement caused by
anything that is not statistical uncertainty. This
includes instrumental effects, failing to take things into
account (will the change in barometric air pressure
impact this measurement?) and gross (stupid) errors.
Experimentalists often use the words "error"
and "uncertainty" interchangeably. Get used to it.
Here is a key point: Statistical errors can, in
general, be estimated in only one way: by repeating the
measurement several times. Therefore, in all your work in the
lab, you must repeat measurements to determine the statistical
error.
This means, in particular, that you cannot estimate
statistical error by simply looking at the precision of your
instrument, or by counting significant digits, or any of that
sort of thing. If you tell me, for example, that a ruler has tick
marks to make measurement to within 1.0 mm, this is an incorrect
estimate of the statistical error. To determine the
statistical error, you must repeatedly make the same identical
measurement with the same ruler many times to determine the error
on your measurement statistically. You need to effectively
repeat the identical experiment in order to understand the
statistical errors. In general, this is the only proper way
to measure statistical uncertainty.
(An aside here -- having said this, there are
certain physical situations where for some reason or another you
may be able to mathematically characterize the statistical errors
in some way, and therefore can treat them accordingly. For
example, you may be able to argue on a physical basis that some
quantity is fluctuating according to a prescription of a Gaussian
random deviate, or a Poisson deviate. In some cases you may be
able to make an assumption about the form of the
statistical errors. However, if you do such a thing, you may be
on thin ice, particularly if you cannot justify your assumptions.
In general, you don't really know if you can make any assumptions
about the form of the statistical error. As a rule, it is
safer to measure the statistical uncertainties directly from the
experiment by repeating the measurement rather than making some
assumption about the form of the statistical error. You can't
go wrong if you measure the statistical error empirically.)
Systematic errors are not so easy to measure.
Try to estimate these on a physical basis -- or include them in
your model. You should strive to reduce systematic errors to as
close to zero as possible -- or at least they should be small
compared to your statistical errors. If the unaccounted-for
systematic errors in an experiment are larger than the
statistical errors, then you will not be able to verify any
physical model. Such an experiment is, by definition, a poor
experiment. Avoid doing any poor experiments.
To estimate the statistical error on a
measurement, repeat the measurement several times exactly the
same way (at the same point) and calculate the standard
deviation (sometimes called the "root-mean-square"
or RMS) of the values you measure. The standard deviation
represents an estimate on the error of any one measurement.
The error on the mean (i.e. average) is approximately the
standard deviation divided by the square-root of the number of
measurements.
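The recipe above can be sketched in a few lines of Python. This is only an illustration: the readings are made-up values, and the variable names are not part of these notes.

```python
import math
import statistics

# Hypothetical repeated readings of the same quantity (illustrative values only).
readings = [9.79, 9.82, 9.80, 9.85, 9.78, 9.81]

n = len(readings)
mean = statistics.mean(readings)

# Sample standard deviation (n - 1 in the denominator): the estimated
# statistical error of any one reading.
std_dev = statistics.stdev(readings)

# Estimated statistical error of the mean of all n readings.
error_on_mean = std_dev / math.sqrt(n)

print(f"mean               = {mean:.3f}")
print(f"standard deviation = {std_dev:.3f}")
print(f"error on the mean  = {error_on_mean:.3f}")
```

The standard deviation is the number to use for an error bar on a single measured point; the error on the mean applies only if the plotted point is itself an average of the n readings.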
Often you can use measured statistical errors
on one of your data points to infer the statistical errors on the
other points. (In other words you do not necessarily need to
repeat ALL measurements many times, but for each experiment be
sure to repeat at least one measurement several times to get a
measure of the statistical error.)
You must account for the statistical error on
your measured points by representing these uncertainties as error
bars on your plots. In nearly every lab here, you are varying
some quantity, X, and then measuring the impact on some quantity,
Y. Measure the statistical error on Y. Then plot Y vs. X with
error bars on Y that are statistical errors. This is what a
physicist means by error bars. You must show such a plot
in your lab report. A plot without error bars is just plain wrong.
Once you have plotted the points, do a fit to
some model function (usually a line) that describes the physics
you expect to verify. A least-squared linear regression is
usually fine. You can fit by eye if you are careful, so far as I
am concerned. Care with a ruler and graph paper works as well as
tabulated data and computer graphics. But I must see the plot.
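For those who prefer the computer to the ruler, here is a minimal sketch of an unweighted least-squares straight-line fit. The function name `fit_line` and the data values are hypothetical, chosen only for illustration; a weighted fit would be the natural refinement if the error bars differ from point to point.

```python
def fit_line(xs, ys):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and cross deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (X, Y) data, illustrative only.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x_values, y_values)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```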
Once you have done a best fit, however, you
need to address the following question: is the best fit a good
fit? In other words, does the model fit the data to within
the uncertainties prescribed by the error bars? This is the
critical question for the experimental physicist. Your goal is
not to measure a number. Your goal is not even to measure the
"right" number. Really, your goal is to determine if
the physical model is supported by the data. In my opinion,
if you take this notion to heart, you will understand the soul of
experimental physics. To do this you must numerically answer the
question: "Does the data fit the model to within the
statistical uncertainties on the measurements?"
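One common way to put a number on this question is the reduced chi-squared statistic. These notes do not prescribe a particular test, so treat the sketch below as one possible choice rather than the required method. The data, the error bars, the fitted line, and the function name are all hypothetical.

```python
def reduced_chi_squared(xs, ys, y_errors, model, n_fitted_params):
    """Sum of squared (residual / error bar), divided by the degrees of freedom."""
    chi2 = sum(((y - model(x)) / err) ** 2
               for x, y, err in zip(xs, ys, y_errors))
    degrees_of_freedom = len(xs) - n_fitted_params
    return chi2 / degrees_of_freedom


# Hypothetical data with statistical error bars on Y (illustrative only).
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
y_errors = [0.2, 0.2, 0.2, 0.2, 0.2]


def line(x):
    # Assumed best-fit line for this data set: slope 1.99, intercept 0.05.
    return 1.99 * x + 0.05


value = reduced_chi_squared(x_values, y_values, y_errors, line, n_fitted_params=2)
print(f"reduced chi-squared = {value:.2f}")
```

A value near 1 says the points scatter about the model by roughly the size of their error bars. A value much larger than 1 suggests the model does not describe the data (or the errors were underestimated); a value much smaller than 1 suggests the error bars were overestimated.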
Error Analysis (Non-Calculus)
These notes are designed to supplement the treatments in any freshman physics
laboratory manual. The level of presentation does not use calculus, and is
suitable for freshmen. Because of the dearth of technical symbols available in
the HTML language, and the inconsistent way browsers and readers display them,
I have avoided use of the standard (delta)x notation for error in x. Instead,
I use upper case letters for measured quantities, and lower case for their
errors: A ± a. The notation <x> represents the mean (arithmetic
average) value of X. I use s to represent standard deviation (usually it is
the Greek lower case sigma). ASCII notation is used for equations to shorten
the download time, compared to bitmapped graphics.
A new section of examples (Section J) has been added, October 6, 1996. The
section letter labels are now in alphabetical order.
A. UNCERTAINTIES (ERRORS) OF MEASUREMENT
Consistent with current practice, the term "error" is used here as a
synonym for "experimental uncertainty."
No measurement is perfectly accurate or exact. Many instrumental, physical and
human limitations cause measurements to deviate from the "true" values
of the quantities being measured. These deviations are called "experimental
uncertainties," but more commonly the shorter word "error" is
used.
What is the "true value" of a measured quantity? We can think of it
as the value we'd measure if we somehow eliminated all error from instruments
and procedure. This is a natural enough concept, and a useful one, even though
at this point in the discussion it may sound like circular logic.
We can improve the measurement process, of course, but since we can never
eliminate measurement errors entirely, we can never hope to measure true
values. We have only introduced the concept of true value for
purposes of discussion. When we specify the "error" in a quantity or
result, we are giving an estimate of how much that measurement is likely to
deviate from the true value of the quantity. This estimate is far more than a
guess, for it is founded on a physical analysis of the measurement process and a
mathematical analysis of the equations which apply to the instruments and to the
physical process being studied.
A measurement or experimental result is of little use if nothing is known
about the probable size of its error. We know nothing about the reliability of a
result unless we can estimate the probable sizes of the errors and uncertainties
in the data which were used to obtain that result.
That is why it is important for students to learn how to determine
quantitative estimates of the nature and size of experimental errors and to
predict how these errors affect the reliability of the final result. Entire
books have been written on this subject.[1] The following discussion is designed
to make the student aware of some common types of errors and some simple ways to
quantify them and analyze how they affect results.
A warning: Some introductory laboratory manuals still use
old-fashioned terminology, defining experimental error as a comparison of the
experimental result with a standard or textbook value, treating the textbook
value as if it were a true value. This is misleading, and is not
consistent with current practice in the scientific literature. This sort of
comparison with standard values should be called an experimental
discrepancy to avoid confusion with measures of error (uncertainty). The
only case I can think of where this measure is marginally appropriate as a
measure of error is the case where the standard value is very much
more accurate than the experimental value.
Consider the case of an experimenter who measures an important quantity
which no one has ever measured before. Obviously no comparison can be made
with a standard value. But this experimenter is still obligated to provide a
reasonable estimate of the experimental error (uncertainty).
Consider the more usual case where the experimenter measures something to
far greater accuracy than anyone previously achieved. The comparison with the
previous (less accurate) results is certainly not a measure of the error.
And often you are measuring something completely unknown, like the density
of an unknown metal alloy. You have no standard value with which to compare.
So, if you are using one of these lab manuals with the older, inadequate,
definition of error, simply substitute "experimental discrepancy"
wherever you see "experimental error" in the book. Then, don't
forget, that you are also obligated to provide an experimental error
estimate, and support it. If you determine both the error and the discrepancy,
the experimental discrepancy should fall within the error limits of both your
value and the standard value. If it doesn't, you have some explaining, and
perhaps further investigation, to do.
B. DETERMINATE AND INDETERMINATE ERRORS
Experimental errors are of two types: (1) indeterminate and (2) determinate (or
systematic) errors.
1. Indeterminate Errors.[2]
Indeterminate errors are present in all experimental measurements. The name
"indeterminate" indicates that there's no way to determine the size or
sign of the error in any individual measurement. Indeterminate errors cause a
measuring process to give different values when that measurement is repeated
many times (assuming all other conditions are held constant to the best of the
experimenter's ability). Indeterminate errors can have many causes, including
operator errors or biases, fluctuating experimental conditions, varying
environmental conditions and inherent variability of measuring instruments.
The effect that indeterminate errors have on results can be somewhat reduced
by taking repeated measurements then calculating their average. The average is
generally considered to be a "better" representation of the "true
value" than any single measurement, because errors of positive and negative
sign tend to compensate each other in the averaging process.
2. Determinate (or Systematic) Errors.
The terms determinate error and systematic error are
synonyms. "Systematic" means that when the measurement of a quantity
is repeated several times, the error has the same size and algebraic sign
for every measurement. "Determinate" means that the size and sign
of the errors are determinable (if the determinate error is recognized and
identified).
A common cause of determinate error is instrumental or procedural bias.
For example: a miscalibrated scale or instrument, a color-blind observer
matching colors.
Another cause is an outright experimental blunder. Examples: using an
incorrect value of a constant in the equations, using the wrong units, reading a
scale incorrectly.
Every effort should be made to minimize the possibility of these errors, by
careful calibration of the apparatus and by use of the best possible measurement
techniques.
Determinate errors can be more serious than indeterminate errors for three
reasons. (1) There is no sure method for discovering and identifying them just
by looking at the experimental data. (2) Their effects can not be
reduced by averaging repeated measurements. (3) A determinate error has the same
size and sign for each measurement in a set of repeated measurements, so there
is no opportunity for positive and negative errors to offset each other.
C. PRECISION AND ACCURACY
A measurement with relatively small indeterminate error is said to have high
precision. A measurement with small indeterminate error and small
determinate error is said to have high accuracy. Precision does not
necessarily imply accuracy. A precise measurement may be inaccurate if it has a
determinate error.
D. STANDARD WAYS FOR COMPARING QUANTITIES
1. Deviation.
When a set of measurements is made of a physical quantity, it is useful to
express the difference between each measurement and the average (mean) of the
entire set. This is called the deviation of the measurement from the
mean. Use the word deviation when an individual measurement of a set is
being compared with a quantity which is representative of the entire set.
Deviations can be expressed as absolute amounts, or as percents.
2. Difference.
There are situations where we need to compare measurements or results which
are assumed to be about equally reliable, that is, to express the absolute or
percent difference between the two. For example, you might want to compare two
independent determinations of a quantity, or to compare an experimental result
with one obtained independently by someone else, or by another procedure. To
state the difference between two things implies no judgment about which is more
reliable.
3. Experimental discrepancy.
When a measurement or result is compared with another which is assumed or
known to be more reliable, we call the difference between the two the experimental
discrepancy. Discrepancies may be expressed as absolute discrepancies or as
percent discrepancies. It is customary to calculate the percent by dividing the
discrepancy by the more reliable quantity (then, of course, multiplying by 100).
However, if the discrepancy is only a few percent, it makes no practical
difference which of the two is in the denominator.
E. MEASURES OF ERROR
The experimental error [uncertainty] can be expressed in several standard ways:
1. Limits of error
Error limits may be expressed in the form Q ± q where Q is the measured
quantity and q is the magnitude of its limit of error.[3] This expresses the
experimenter's judgment that the "true" value of Q lies between Q - q
and Q + q. This entire interval within which the measurement lies is called the range
of error. Manufacturer's performance guarantees for laboratory instruments
are often expressed this way.
2. Average deviation[4]
This measure of error is calculated in this manner: First calculate the mean
(average) of a set of successive measurements of a quantity, Q. Then find the
magnitude of the deviations of each measurement from the mean. Average these
magnitudes of deviations to obtain a number called the average deviation
of the data set. It is a measure of the dispersion (spread) of the measurements
with respect to the mean value of Q, that is, of how far a typical measurement
is likely to deviate from the mean.[5] But this is not quite what is needed to
express the quality of the mean itself. We want an estimate of how far the mean
value of Q is likely to deviate from the "true" value of Q. The
appropriate statistical estimate of this is called the average deviation of
the mean. To find this rigorously would involve us in the theory of
probability and statistics. We will state the result without proof.[6]
For a set of n measurements Q_i whose mean value is <Q>,[7]
the average deviation of the mean (A.D.M.) is:
$$\text{Ave. Dev. of the Mean} = \frac{\sum_{i=1}^{n} \left| Q_i - \langle Q \rangle \right|}{(n-1)^{1/2}\, n} \qquad (1)$$
The vertical bars enclosing an expression mean "take the absolute value
of" that expression. That means that if the expression is negative, make it
positive.
If the A.D.M. is quoted as the error measure of a mean, <Q>_exp,
this is equivalent to saying that the probability of <Q>_exp
lying within one A.D.M. of the "true" value of Q, Q_true, is
58%, and the odds against it lying outside of one A.D.M. are 1.4 to 1.
As a rough rule of thumb, the probability of <Q>_exp being
within three A.D.M. (on either side) of the true value is nearly 100% (actually
98%). This is a useful relation for converting (or comparing) A.D.M. to limits
of error.[8]
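The average deviation and the A.D.M. are easy to compute by machine. The sketch below takes the denominator of equation (1) to be (n-1)^(1/2) times n; that reading is an assumption of this example, as are the function names and the data values.

```python
import math


def average_deviation(values):
    """Mean of the magnitudes of the deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def average_deviation_of_mean(values):
    """A.D.M., assuming equation (1) reads SUM|Q_i - <Q>| / (sqrt(n - 1) * n)."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / (math.sqrt(n - 1) * n)


# Hypothetical repeated measurements of Q (illustrative only).
q_values = [10.1, 9.9, 10.2, 9.8, 10.0]

print(f"average deviation = {average_deviation(q_values):.3f}")
print(f"A.D.M.            = {average_deviation_of_mean(q_values):.3f}")
```

The first number describes the scatter of individual measurements; the second, smaller number is the one to quote as the error of the mean.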
3. Standard Deviation.
[This section is included for completeness, and may be skipped or skimmed
unless your instructor specifically assigns it.]
The standard deviation is a well known, widely used, and
statistically well-founded measure of error. For a set of n measurements Q_i
whose mean value is <Q>, the standard deviation of the mean is found from:
$$\text{Std. Dev. of the Mean} = \sqrt{\frac{\sum_{i=1}^{n} \left( Q_i - \langle Q \rangle \right)^2}{n(n-1)}} \qquad (2)$$
The sum is from i=1 to n.
This form of the equation is not very convenient for calculations. By
expanding the summand it may be recast into a form which lends itself to
efficient computation with an electronic calculator:
$$\text{Std. Dev. of the Mean} = \sqrt{\frac{\left( \sum_{i=1}^{n} Q_i^2 \right) - n \langle Q \rangle^2}{n(n-1)}} \qquad (3)$$
[Note that n<Q>^2 is a separate term in the numerator; it
is not summed over.]
The calculation of the standard deviation requires two summations, one a sum
of the data values (to obtain <Q>), and one a sum of the squares of the
data values. Many electronic calculators allow these two sums to be obtained
with only one entry of each data value. This is a good feature to have in a
scientific calculator. When n is large, the quantity n(n-1) becomes
approximately n^2, further simplifying the work.
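To see that the two forms agree, here is a small sketch that computes the standard deviation of the mean both ways: from the squared deviations as in equation (2), and from the two running sums as in equation (3), with n<Q>^2 as the separate term. The function names and data are hypothetical.

```python
import math


def sdm_from_deviations(values):
    """Equation (2): uses the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n * (n - 1)))


def sdm_from_running_sums(values):
    """Equation (3): one pass, accumulating the sum and the sum of squares."""
    n = 0
    total = 0.0
    total_of_squares = 0.0
    for v in values:
        n += 1
        total += v
        total_of_squares += v * v
    mean = total / n
    return math.sqrt((total_of_squares - n * mean ** 2) / (n * (n - 1)))


# Hypothetical repeated measurements (illustrative only).
q_values = [10.1, 9.9, 10.2, 9.8, 10.0]

print(f"from deviations:   {sdm_from_deviations(q_values):.4f}")
print(f"from running sums: {sdm_from_running_sums(q_values):.4f}")
```

The running-sums form subtracts two nearly equal numbers, so it can lose precision when the scatter is tiny compared with the mean. For typical lab data this is not a concern.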
The use of the standard deviation is hardly justified unless the experimenter
has taken a large number of repeated measurements of each
experimentally determined quantity. This is seldom the case in the freshman
laboratory.
It can be shown that when the measurements are distributed according to the
"normal" ("Gaussian")[11] distribution, average deviations
and standard deviations are related by a simple formula:[12]
[average deviation] = 0.80 [standard deviation]
This is a useful "rule
of thumb" when it is necessary to compare the two measures of error or
convert from one to the other.
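For reference, the factor 0.80 is the rounded value of a known constant. For a Gaussian distribution with standard deviation s, the exact relation (stated here without proof) is:

```latex
[\text{average deviation}] = \sqrt{\frac{2}{\pi}}\; s \approx 0.798\, s
```

The relation holds for the underlying distribution; for a small set of measurements the two computed measures will only approximately satisfy it.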
F. STANDARD METHODS FOR EXPRESSING ERROR
1. Absolute Error.
Uncertainties may be expressed as absolute measures, giving the size
of a quantity's uncertainty in the same units as the quantity itself.
Example. A piece of metal is weighed a number of times, and the
average value obtained is: M = 34.6 gm. By analysis of the scatter of the
measurements, the uncertainty is determined to be m = 0.07 gm. This absolute
uncertainty may be included with the measurement in this manner: M = 34.6 ±
0.07 gm.
The value 0.07 after the ± sign in this example is the estimated absolute
error in the value 34.6.
2. Relative (or Fractional) Error.
Uncertainties may be expressed as relative measures, giving the
ratio of the quantity's uncertainty to the quantity itself. In general:
$$\text{Relative error} = \frac{\text{absolute error in a measurement}}{\text{size of the measurement}} \qquad (5)$$
Example. In the previous example, the uncertainty in M = 34.6 gm was m =
0.07 gm. The relative uncertainty is therefore:
$$\frac{m}{M} = \frac{0.07\ \text{gm}}{34.6\ \text{gm}} = 0.002, \text{ or, if you wish, } 0.2\%$$
It is a matter of taste whether one chooses to express relative errors "as
is" (as fractions), or as percents. I prefer to work with them as fractions
in calculations, avoiding the necessity for continually multiplying by 100. Why
do unnecessary work?
But when expressing final results, it is often meaningful to express the
relative uncertainty as a percent. That's easily done, just multiply the
relative uncertainty by 100. This one is 0.2%.
3. Absolute or relative form; which to use.
Common sense and good judgment must be used in choosing which form to use to
represent the error when stating a result. Consider a temperature
measurement with a thermometer known to be reliable to ± 0.5 degree Celsius.
Would it make sense to say that this causes a 0.5% error in measuring the
boiling point of water (100 degrees) but a whopping 10% error in the measurement
of cold water at a temperature of 5 degrees? Of course not! [And what if the
temperatures were expressed in degrees Kelvin? That would seem to reduce the
percent errors to insignificance!] Errors and discrepancies expressed as
percents are meaningless for some types of measurements. Sometimes this is due
to the nature of the measuring instrument, sometimes to the nature of the
measured quantity itself, or the way it is defined.
There are cases where absolute errors are inappropriate and therefore the
errors should be expressed in relative form. There are also cases where the
reverse is true.
Sometimes both absolute and relative error measures are necessary to
completely characterize a measuring instrument's error. For example, if a
plastic meter stick uniformly expanded, the effect could be expressed as a
percent determinate error. If one half millimeter were worn off the zero end
of a stick, and this were not noticed or compensated for, this would best be
expressed as an absolute determinate error. Clearly both errors might be present
in a particular meter stick. The manufacturer of a voltmeter (or other
electrical meter) usually gives its guaranteed limits of error as a constant
determinate error plus a `percent' error.
Both absolute and relative forms of error may appear in the intermediate
algebraic steps when deriving error equations. [This is discussed in section H
below.] This is merely a computational artifact, and has no bearing on the
question of which form is meaningful for communicating the size and nature of
the error in data and results.
G. IMPORTANCE OF REPEATED MEASUREMENTS
A single measurement of a quantity is not sufficient to convey any information
about the quality of the measurement. You may need to take repeated measurements
to find out how consistent the measurements are.
If you have previously made this type of measurement, with the same
instrument, and have determined the uncertainty of that particular measuring
instrument and process, you may appeal to your experience to estimate the
uncertainty. In some cases you may know, from past experience, that the
measurement is scale limited, that is, that its uncertainty is smaller
than the smallest increment you can read on the instrument scale. Such a
measurement will give the same value exactly for repeated measurements of the
same quantity. If you know (from direct experience) that the measurement is
scale limited, then quote its uncertainty as the smallest increment you can read
on the scale.
Students in this course won't need to become experts in the fine details of
statistical theory. But they should be constantly aware of the experimental
errors and do whatever is necessary to find out how much they affect results.
Care should be taken to minimize errors. The sizes of experimental errors in
both data and results should be determined, whenever possible, and quantified by
expressing them as average deviations. [In some cases common-sense experimental
investigation can provide information about errors without the use of involved
mathematics.]
The student should realize that the full story about experimental errors has
not been given here, but will be revealed in later courses and more advanced
laboratory work.
H. PROPAGATION OF DETERMINATE ERRORS
The importance of estimating data errors is due to the fact that data errors
propagate through the calculations to produce errors in results. It is the
size of a data error's effect on the results which is most important. Every
effort should be made to determine reasonable error estimates for every
important experimental result.
We illustrate how errors propagate by first discussing how to determine the
error limits (or maximum error) of results by considering how data errors
propagate through simple mathematical operations. We first confine our attention
to determinate errors, which have a known sign. In this way we will
discover certain useful rules for error propagation, and we'll then be able to
extend the rules to other error measures and also to indeterminate errors.
We are developing the mathematical rules for "finite differences,"
the algebra of numbers which have relatively small variations imposed upon them.
The finite differences are those variations from "true values" caused
by experimental errors.
Suppose that an experimental result is calculated from the sum of two data
quantities A and B. For this discussion we'll use a and b to represent the
errors in A and B respectively. The data quantities are written to explicitly
show the errors:
(A + a) and (B + b)
We allow that a and b may be either positive or negative, the signs being
"in" the symbols "a" and "b."
The result of adding A and B to get R is expressed by the equation: R = A +
B. With the errors explicitly included, this is written:
(A + a) + (B + b) = (A + B) + (a + b)
The result with its error, r, explicitly shown, is: (R + r):
(R + r) = (A + B) + (a + b)
The error in R is therefore: r = a + b.
We conclude that the error in the sum of two quantities is just the sum of
the errors in those quantities. You can easily work out for yourself the case
where the result is calculated from the difference of two quantities.
In that case the error in the result will be the difference in the errors.
Summarizing:
- Sum and difference rule. When two quantities are added (or subtracted),
their determinate errors add (or subtract).
Now let's consider a result obtained by multiplication, R = AB. With errors
explicitly included:
(R + r) = (A + a)(B + b) = AB + aB + Ab + ab
or: r = aB + Ab + ab
This doesn't look promising for stating as a simple rule. However, when we
express the errors in relative form, things look better. If the error a
is small relative to A, and b is small relative to B, then (ab) is certainly
small relative to AB, as well as small compared to (aB) and (Ab). Therefore we
neglect the term (ab) (throw it out), since we are interested only in error
estimates to one or two significant figures. Now we express the relative error
in R as
$$\frac{r}{R} = \frac{aB + bA}{AB} = \frac{a}{A} + \frac{b}{B}$$
This gives us a very simple rule:
- Product rule. When two quantities are multiplied, their relative
determinate errors add.
A similar procedure may be carried out for the quotient of two quantities, R =
A/B.
$$\frac{r}{R} = \frac{\dfrac{A + a}{B + b} - \dfrac{A}{B}}{\dfrac{A}{B}} = \frac{\dfrac{(A + a)B}{(B + b)B} - \dfrac{A(B + b)}{B(B + b)}}{\dfrac{A}{B}}$$
$$= \frac{(A + a)B - A(B + b)}{A(B + b)} \approx \frac{aB - Ab}{AB} = \frac{a}{A} - \frac{b}{B}$$
The approximation made in the next to last step was to neglect b in the
denominator, which is valid if the relative errors are small. So the result is:
- Quotient rule. When two quantities are divided, the relative
determinate error of the quotient is the relative determinate error of the
numerator minus the relative determinate error of the denominator.
A consequence of the product rule is this:
- Power rule. When a quantity Q is raised to a power, P, the relative
determinate error in the result is P times the relative determinate error in
Q. This also holds for negative and fractional powers, e.g. the relative
determinate error in the square root of Q is one half the relative determinate
error in Q.
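The rules above are approximations that neglect small terms such as ab, so it is worth checking them against a direct calculation. The sketch below does this for a product, a quotient, and a cube; the values of A, B and their signed determinate errors are hypothetical.

```python
# Hypothetical data values with signed determinate errors (illustrative only).
A, a = 20.0, 0.2     # A reads 1% too high
B, b = 5.0, -0.1     # B reads 2% too low

rel_a = a / A
rel_b = b / B


def relative_error(result_with_errors, result_without_errors):
    """Relative error r/R found by direct calculation."""
    return (result_with_errors - result_without_errors) / result_without_errors


# Each entry pairs the value predicted by the rule with the direct calculation.
checks = {
    "product A*B":  (rel_a + rel_b, relative_error((A + a) * (B + b), A * B)),
    "quotient A/B": (rel_a - rel_b, relative_error((A + a) / (B + b), A / B)),
    "power A**3":   (3 * rel_a, relative_error((A + a) ** 3, A ** 3)),
}

for name, (rule_value, direct_value) in checks.items():
    print(f"{name}: rule {rule_value:+.4f}, direct {direct_value:+.4f}")
```

The rule and the direct calculation agree to within the neglected second-order terms, which is ample for error estimates quoted to one or two significant figures.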
I. PROPAGATION OF INDETERMINATE ERRORS
Indeterminate errors have unknown sign. If their distribution is symmetric about
the mean, then they are unbiased with respect to sign. Also, if indeterminate
errors in different quantities are independent of each other, their signs have a
tendency to offset each other in computations.[11]
When we are only concerned with limits of error (or maximum error)
we must assume a "worst-case" combination of signs. In the case of
subtraction, A - B, the worst-case deviation of the answer occurs when the
errors are either +a and -b or -a and +b. In either case, the maximum error will
be (a + b).
In the case of the quotient, A/B, the worst-case deviation of the answer
occurs when the errors have opposite sign, either +a and -b or -a and +b. In
either case, the maximum size of the relative error will be (a/A + b/B).
The results for the operations of addition and multiplication are the same as
before. In summary, maximum indeterminate errors propagate according to
the following rules:
- Addition and subtraction rule. The absolute indeterminate errors add.
- Product and quotient rule. The relative indeterminate errors add.
A consequence of the product rule is this:
- Power rule. When a quantity Q is raised to a power, P, the relative
error in the result is P times the relative error in Q. This also holds for
negative and fractional powers, e.g. the relative error in the square root of
Q is one half the relative error in Q.
These rules apply only when combining independent errors, that
is, individual errors which are not dependent on each other in size or sign.
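The maximum-error rules above can be written as three small helper functions. This is a sketch only: the function names and the example values are hypothetical, and the power rule is taken with the magnitude of P so that the result is always a positive error size (an assumption of this example).

```python
def sum_or_difference_error(a, b):
    """Maximum absolute error of A + B or A - B: absolute errors add."""
    return abs(a) + abs(b)


def product_or_quotient_relative_error(A, a, B, b):
    """Maximum relative error of A * B or A / B: relative errors add."""
    return abs(a / A) + abs(b / B)


def power_relative_error(Q, q, P):
    """Maximum relative error of Q ** P: |P| times the relative error in Q."""
    return abs(P) * abs(q / Q)


# Hypothetical data: A = 20.0 +/- 0.2 and B = 5.0 +/- 0.1 (illustrative only).
A, a = 20.0, 0.2
B, b = 5.0, 0.1

print(f"A - B   = {A - B:.1f} +/- {sum_or_difference_error(a, b):.1f}")
quotient = A / B
quotient_error = quotient * product_or_quotient_relative_error(A, a, B, b)
print(f"A / B   = {quotient:.2f} +/- {quotient_error:.2f}")
print(f"sqrt(A) has relative error {power_relative_error(A, a, 0.5):.4f}")
```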
It can be shown (but not here) that these rules also apply sufficiently well
to errors expressed as average deviations. The one drawback to this is that the
error estimates made this way are still overconservative in that they do not
fully account for the tendency of error terms associated with independent errors
to offset each other. This, however, would be a minor correction of little
importance in our work in this course.
Error propagation rules may be derived for other mathematical operations as
needed. For example, the rules for errors in trig functions may be derived by
use of trig identities, using the approximations: sin β = β and cos
β = 1, valid when β is small. Rules for exponentials may be derived
also.
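As a worked illustration of the trig case, suppose the result is R = sin Θ and the angle Θ has a small error t, measured in radians. The symbols Θ and t are chosen here for the example only. The angle-sum identity and the small-angle approximations give:

```latex
\sin(\Theta + t) = \sin\Theta\,\cos t + \cos\Theta\,\sin t
\approx \sin\Theta + t\,\cos\Theta
\quad\Longrightarrow\quad
r \approx t\,\cos\Theta ,
\qquad
\frac{r}{R} \approx \frac{t}{\tan\Theta}
```

The error in the sine is therefore largest near Θ = 0 and vanishes, to this order, near 90 degrees.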
When mathematical operations are combined, the rules may be successively
applied to each operation, and an equation may be algebraically derived[12]
which expresses the error in the result in terms of errors in the data. Such an
equation can always be cast into standard form in which each error
source appears in only one term. Let x represent the error in X, y the error in
Y, etc. Then the error r in any result R, calculated by any combination of
mathematical operations from data values X, Y, Z, etc. is given by:
$$r = (c_x)\,x + (c_y)\,y + (c_z)\,z + \ldots \text{ etc.}$$
which may always be algebraically rearranged to:
$$\frac{r}{R} = \{C_x\}\frac{x}{X} + \{C_y\}\frac{y}{Y} + \{C_z\}\frac{z}{Z} + \ldots \text{ etc.} \qquad (5)$$
The coefficients (c_x) and {C_x} etc. in each term are extremely important because
they, along with the sizes of the errors, determine how much each error affects
the result. The relative size of the terms of this equation shows us the
relative importance of the error sources. It's not the relative size of the
errors (x, y, etc), but the relative size of the error terms which tells us
their relative importance.
If this error equation was derived from the determinate-error rules,
the relative errors in the above equation might have + or - signs. The
coefficients may also have + or - signs, so the terms themselves may have + or -
signs. It is therefore possible for terms to offset each other.
If this error equation was derived from the indeterminate error
rules, the error measures appearing in it are inherently positive. The
coefficients will turn out to be positive also, so terms cannot offset each
other.
It is convenient to know that the indeterminate error equation may be
obtained directly from the determinate-error equation by simply choosing the
worst-case, i.e., by taking the absolute value of every term. This forces all
terms to be positive. This step is only done after the
determinate-error equation has been fully derived in standard form.
The error equation in standard form is one of the most useful tools for
experimental design and analysis. It should be derived (in algebraic form) even
before the experiment is begun, as a guide to experimental strategy. It can show
which error sources dominate, and which are negligible, thereby saving time one
might spend fussing with unimportant considerations. It can suggest how the
effects of error sources might be minimized by appropriate choice of the sizes
of variables. It can tell you how good a measuring instrument you need to
achieve a desired accuracy in the results.
The student who neglects to derive and use this equation may spend an entire
lab period using instruments, strategy, or values insufficient to the
requirements of the experiment. And he may end up without the slightest idea why
the results were not as good as they ought to have been.
A final comment for those who wish to use standard deviations as
indeterminate error measures: Since the standard deviation is obtained from the
average of squared deviations, equation (5) must be modified--each term
of the equation (both sides) must be squared:
$$ \left(\frac{r}{R}\right)^2 = C_x^2\left(\frac{x}{X}\right)^2 + C_y^2\left(\frac{y}{Y}\right)^2 + C_z^2\left(\frac{z}{Z}\right)^2 \qquad (6) $$
This rule is given here without proof.
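As a rough illustration of the difference between equation (5) and equation (6), the sketch below combines the same set of error terms both ways. The function names and the three term values are hypothetical, chosen only to show the arithmetic.

```python
import math

def combine_linear(terms):
    """Worst-case combination in the style of equation (5): sum of magnitudes."""
    return sum(abs(t) for t in terms)

def combine_quadrature(terms):
    """Combination in the style of equation (6): square root of the sum of squares."""
    return math.sqrt(sum(t * t for t in terms))

# Hypothetical relative error terms C_x(x/X), C_y(y/Y), C_z(z/Z)
terms = [0.03, 0.04, 0.01]

linear = combine_linear(terms)
quad = combine_quadrature(terms)
print(f"linear sum: {linear:.4f}")
print(f"quadrature: {quad:.4f}")
```

The quadrature result is never larger than the linear sum, which is the sense in which the simple sum is the more conservative estimate.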
J. EXAMPLES
Example 1: A student finds the constant acceleration of a slowly moving
object. The equation used is s = (1/2)at^2. The time, t, is measured
with a stopwatch, the distance, s, with a meter stick.
s = 2 ± 0.005 meter. This is 0.25%.
t = 4.2 ± 0.2 second. This is 4.8%.
What is the acceleration and its estimated error?
We shift to capital letters for quantities, lower case for errors. Solve the
equation for the result, a. A = 2S/T^2. Its indeterminate-error
equation is:
$$ \frac{a}{A} = 2\,\frac{t}{T} + \frac{s}{S} $$
The factor of 2 in the time term causes that term to dominate, for application
of the rule for errors in quantities raised to a power causes the 4.8% error in
the time to be doubled, giving over 9.5% error in T^2. The 1/4 percent
error due to the distance measurement is clearly negligible compared to the 9.5%
error due to the time measurement, so the result (the acceleration) is written:
A = 0.23 ± 0.02 m/s^2.
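The arithmetic of Example 1 can be checked with a short script. This is only a sketch; the variable names follow the capital/lower-case convention of the text and are otherwise arbitrary.

```python
# Data from Example 1: (value, absolute indeterminate error)
S, s = 2.0, 0.005   # distance, meters
T, t = 4.2, 0.2     # time, seconds

A = 2 * S / T**2    # acceleration, m/s^2

# Indeterminate-error equation: a/A = 2(t/T) + s/S
time_term = 2 * t / T
distance_term = s / S
rel_a = time_term + distance_term
a = rel_a * A       # absolute error in the acceleration

print(f"A = {A:.2f} m/s^2")
print(f"time term = {time_term:.4f}, distance term = {distance_term:.4f}")
print(f"a = {a:.2f} m/s^2")
```

Printing the two terms separately makes the dominance of the time measurement visible at a glance.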
Example 2: A result is calculated from the equation R = (G+H)/Z, the
data values being:
G = 20 ± 0.5 (2.5%)
H = 16 ± 0.5 (3.1%)
Z = 106 ± 1.0 (0.9%)
The ± symbol tells us that these errors are indeterminate. The calculation of R
requires both addition and division, and gives the value R = 36/106 = 0.340. The
error calculation requires both the addition and quotient rules, applied in
succession, in the same order as the operations performed in calculating R
itself.
The division rule requires that we use relative (fractional) errors.
The absolute error in the numerator is, by the addition rule, 0.5 + 0.5 = 1.0,
so its relative error is 1.0/36 = 0.028.
The relative error in the denominator is 1.0/106 = 0.0094. The relative error in
the denominator is added to that of the numerator to give 0.028 + 0.0094 =
0.0374, which is the relative error in R. That's 3.74%.
If the absolute error in R is required, it is (0.0374)R =
(0.0374)(0.340) = 0.0127.
The result, with its error, may be expressed by rounding the absolute error
to one or two significant figures:
R = 0.340 ± 0.013
Example 3: Write a determinate-error equation for example 2.
We follow the same steps, but represent the errors symbolically. Let N
represent the numerator, N=G+H. The determinate error in N is then g+h. The
relative error in the numerator is (g+h)/N. The relative error in the
denominator is z/Z. The relative error in R is then:
$$ \frac{r}{R} = \frac{g+h}{G+H} - \frac{z}{Z} = \frac{g}{G+H} + \frac{h}{G+H} - \frac{z}{Z} $$
$$ \frac{r}{R} = \frac{G}{G+H}\,\frac{g}{G} + \frac{H}{G+H}\,\frac{h}{H} - \frac{z}{Z} $$
The absolute error may be written, using R = (G+H)/Z:
$$ r = \frac{RG}{G+H}\,\frac{g}{G} + \frac{RH}{G+H}\,\frac{h}{H} - \frac{Rz}{Z} = \frac{1}{Z}\,g + \frac{1}{Z}\,h - \frac{G+H}{Z^2}\,z $$
Check:
$$ r = \frac{0.5}{106} + \frac{0.5}{106} - \frac{36(1)}{106^2} = 0.0094 - 0.0032 = 0.006 $$
This equation is in standard form; each error, g, h, and z appears in only
one term, that term representing that error's contribution to the error in
R. The result for R = (G-H)/Z looks the same except that the second term has a
negative sign.
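One hedged way to check a determinate-error equation is to compare it with a direct recalculation using shifted data. The sketch below does this for R = (G+H)/Z, taking all three errors as positive (an assumption made only for this check); the function name `result` is illustrative.

```python
G, H, Z = 20.0, 16.0, 106.0
g, h, z = 0.5, 0.5, 1.0   # signed determinate errors, all taken positive here

def result(G, H, Z):
    """The formula under study, R = (G + H)/Z."""
    return (G + H) / Z

# Standard-form determinate-error equation: r = g/Z + h/Z - (G+H)z/Z^2
r_equation = g / Z + h / Z - (G + H) * z / Z**2

# Direct recalculation with each data value shifted by its error
r_direct = result(G + g, H + h, Z + z) - result(G, H, Z)

print(f"from the error equation: {r_equation:.5f}")
print(f"by direct recalculation: {r_direct:.5f}")
```

The two values agree to first order; the small remaining difference comes from second-order terms that the error equation neglects.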
Example 4: Derive the indeterminate error equation for this same
formula, R = (G+H)/Z.
Here's where our previous work pays off. Look at the determinate error
equation of example 3 and rewrite it for the worst case of signs of the
terms. That's equivalent to making all of the terms of the standard
form equation positive:
$$ \frac{r}{R} = \frac{G}{G+H}\,\frac{g}{G} + \frac{H}{G+H}\,\frac{h}{H} + \frac{z}{Z} $$
The absolute error in the result would be 0.0094 + 0.0032 = 0.0126, which rounds
to the 0.013 found in example 2.
Example 5: Rework example 2, this time using the indeterminate error
equation obtained in example 4.
Putting in the values:
$$ \frac{r}{R} = \frac{20}{20+16}\,\frac{0.5}{20} + \frac{16}{20+16}\,\frac{0.5}{16} + \frac{1}{106} $$
$$ \frac{r}{R} = \frac{20}{36}\,\frac{0.5}{20} + \frac{16}{36}\,\frac{0.5}{16} + \frac{1}{106} $$
$$ \frac{r}{R} = 0.555(0.025) + 0.444(0.031) + 0.0094 $$
$$ \frac{r}{R} = 0.014 + 0.014 + 0.0094 = 0.0374 $$
This is less than 4%.
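The same calculation can be sketched in code, which also shows how much each term contributes to the total. The variable names are illustrative. Because the script carries full precision, its total may differ in the last digit from a hand calculation that rounds each term first.

```python
G, g = 20.0, 0.5
H, h = 16.0, 0.5
Z, z = 106.0, 1.0

# Terms of the indeterminate-error equation in standard form
term_g = (G / (G + H)) * (g / G)
term_h = (H / (G + H)) * (h / H)
term_z = z / Z
rel_r = term_g + term_h + term_z

for name, term in [("g", term_g), ("h", term_h), ("z", term_z)]:
    share = 100 * term / rel_r
    print(f"term in {name}: {term:.4f} ({share:.0f}% of the total)")
print(f"r/R = {rel_r:.4f}")
```

Comparing the sizes of the terms, rather than the sizes of the errors themselves, is what identifies the dominant error source.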
Example 6: A result, R, is calculated from the equation R = (G+H)/Z,
with the same data values as the previous example. After the experiment is
finished, it is discovered that the value of Z was 0.05 too small because of a
systematic error in the measuring instrument. The result was obtained from
averaging large amounts of data, and the task of recalculating a correction to
each value is daunting. But that's not necessary. Use this information to correct
the result.
Look at the determinate error equation:
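$$ \frac{r}{R} = \frac{G}{G+H}\,\frac{g}{G} + \frac{H}{G+H}\,\frac{h}{H} - \frac{z}{Z} $$
The -0.05 error in Z represents a relative error of -0.05/106 in Z. Assuming
zero determinate error in G and H, we have:
r/R = -(z/Z) = -(-0.05/106)
So, with R = 36/106 = 0.340: r = (0.05/106)(0.340) = 0.00016
The calculated R is therefore too large by about 0.00016, and the corrected
result is obtained by subtracting this amount from R.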
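A quick numerical check of this kind of correction is sketched below. It assumes, as stated in the example, that the measured Z was 0.05 too small and that G and H have no determinate error; the variable names are illustrative.

```python
G, H = 20.0, 16.0
Z_measured = 106.0
z = -0.05   # determinate error in Z: the measured value was 0.05 too small

R_measured = (G + H) / Z_measured

# From the determinate-error equation with g = h = 0: r/R = -(z/Z)
r = -(z / Z_measured) * R_measured

# Remove the error from the measured result
R_corrected = R_measured - r

# Direct recalculation with the true value of Z, for comparison
R_direct = (G + H) / (Z_measured - z)

print(f"r = {r:.7f}")
print(f"corrected R = {R_corrected:.6f}, direct recalculation = {R_direct:.6f}")
```

The correction from the error equation and the direct recalculation agree to well within the precision of the data.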
Example 7: The density of a long copper rod is to be obtained. Its
length is measured with a meter stick, its diameter with micrometer calipers,
and its mass with an electronic balance.
L = 60.0 ± 0.1 cm (0.17%)
D = 0.632 ± 0.002 cm (0.32%) [The error in D^2 is therefore 0.64%]
m = 162.0 ± 0.01 g (0.006%)
The cross sectional area is πr^2 = πD^2/4. So the
density is ρ = m/V = 4m/(πLD^2). The relative error in the result (the
density) should be no more than (0.17% + 0.64% + 0.006% = 0.816%) or about 0.8%.
This is written:
density = 8.61 ± 0.07 g/cm^3
A reference book gives 8.87 g/cm^3 as the density of copper. The
experimental discrepancy is 0.26 g/cm^3, several times the estimated error,
indicating that something is wrong. The
student who took this data may have blundered in a measurement. Maybe the
material wasn't pure copper, but a copper alloy. If it is a measurement blunder,
the diameter measurement is the most likely suspect.
K. THE OBJECTIVES OF LABORATORY WORK
A good way to conclude this chapter is to consider what the students' objectives
in laboratory ought to be. The freshman laboratory is not the same as a
research lab, but we hope that the student will become aware of some of the
concerns, methods, instruments, and goals of physics researchers.
Experiments in freshman lab fall into several categories. In each case below,
we indicate what the student's responsibility should be.
1. To measure a fundamental physical quantity.
The student designs an experimental strategy to obtain the most accurate
result with the available equipment. The student must understand the operation
of the equipment and investigate the inherent uncertainties in the experiment
fully enough to state the limits of error of the data and result(s) with
confidence that the "true" values (if they were known) would not lie
outside of the stated error limits.
2. To confirm or verify a well-known law or principle.
In this case it is not enough to say "The law was (or was not)
verified." The experimenter must state to what error limits the
verification holds, and for what limits on range of data, experimental
conditions, etc. It is too easy to over-generalize. A student in freshman lab
does not verify a law, say F = ma, for all possible cases where that
law might apply. The student probably investigated the law in the more limited
case of the gravitational force, near the earth's surface, acting on a small
mass falling over distances of one or two meters. The student should state these
limitations. One should not broadly claim to have "verified Newton's
law." Even worse would be to claim to have "proved Newton's law."
3. To investigate a phenomenon in order to formulate a law or relation
which best describes it.
Here it is not enough to find a law that "works"; you must also show that
the law you find is a better representation of the data than other laws you
might test. For example, you might have a graph of experimental data which
"looks like" some power of x. You find a power which seems to fit.
Another student says it "looks like" an exponential function of x. The
exponential curve is tried and seems to fit. So which is the "right"
or "best" relation? You may be able to show that one of them is better
at fitting the data. One may be more physically meaningful, in the context of
the larger picture of established physics laws and theory. But it may be that
neither one is a clearly superior representation of the data. In that case you
should redesign the experiment in such a way that it can conclusively decide
between the two competing hypotheses.
The reader of your report will look very carefully at the "results and
conclusions" section, which represents your claims about the outcome of the
experiment. The reader will also look to see whether you have justified your
claims by specific reference to the data you took in the experiment. Your claims
must be supported by the data, and should be reasonable (within the limitations
of the experiment). This is a test of your understanding of the experiment, of
your judgment in assessing the results, and your ability to communicate.
L. CONCLUSION
Error analysis is not an "after-the-fact" activity; it pervades the
entire experimental process from experiment design through data-taking to the
final analysis of the results. Nor is it a "cut-and-dried" procedure
or set of recipes for "calculating errors." While there are
statistical mathematical criteria which underlie the entire process,
considerable insight and judgment and common sense must be brought to bear on
the experiment to properly assess the dynamical interaction of the error
sources. The experimenter must understand the physics which bears on
the experiment to do a proper job of this. The experimenter must exercise
judgment and common sense in choosing experimental strategies to improve
results, and in choosing methods for determining the effect of experimental
uncertainties. When error analysis is treated as a "mindless"
calculation process, the gravest blunders of analysis and interpretation can
occur.
APPENDIX I. MEASURES OF UNCERTAINTY
The size of the experimental uncertainty in a set of measurements may be
expressed in several ways, depending on how "conservative" you want to
be.
1. Limits of error.
An attempt to specify the entire range in which all measurements
will lie. In practice one specifies the range within which the measured values
lie.
2. Average deviation.
The average deviation of a set of measurements from its
mean is found by summing the absolute values of the deviations of the n
measurements, then dividing the sum by (n-1). This measure describes the
"spread" of the set of measurements.
When one wishes to make inferences about how far an estimated mean is likely
to deviate from the "true" mean value of the parent distribution, use
the average deviation of the mean. To calculate it, sum the absolute values of
the deviations of the n measurements, then divide this sum by n(n-1)^(1/2). This
measure expresses the quality of your estimate of the mean. This is the
measure we call the uncertainty (or error) in the mean.
This last definition automatically includes two mathematical corrections, one
required to make inferences about the parent distribution from a finite sample
of data, and one to correct for the fact that you have used only a small
sample.
3. Standard deviation.
The standard deviation has become a "standard" method for
expressing uncertainties because it is supported by a well-developed
mathematical model. Unfortunately it is only appropriate when the experimenter
(a) has large data samples, and (b) knows that the distribution of the data is
really Gaussian, or near-Gaussian. Therefore its use in the freshman lab is
seldom justified--something like using a sledgehammer to crack a walnut.
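The average-deviation measures of item 2 can be sketched in a few lines of Python. The sketch takes the deviations as absolute values (see endnote 4) and reads the divisor for the average deviation of the mean as n times the square root of (n-1); both readings are assumptions about the intended definitions, and the function names and sample readings are hypothetical.

```python
import math

def average_deviation(values):
    """Sum of absolute deviations from the mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / (n - 1)

def average_deviation_of_mean(values):
    """Sum of absolute deviations from the mean, divided by n * sqrt(n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / (n * math.sqrt(n - 1))

# Hypothetical repeated stopwatch readings, in seconds
times = [4.1, 4.3, 4.2, 4.4, 4.0]

mean_time = sum(times) / len(times)
spread = average_deviation(times)
error_in_mean = average_deviation_of_mean(times)
print(f"mean = {mean_time:.2f} s")
print(f"average deviation = {spread:.2f} s")
print(f"average deviation of the mean = {error_in_mean:.2f} s")
```

The first measure describes the scatter of individual readings; the second, smaller one is the uncertainty to quote for the mean.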
APPENDIX II. CALCULATIONS USING STANDARD DEVIATIONS
The rules for error propagation for the elementary algebraic operations may be
restated to apply when standard deviations are used as the error measure for
random (indeterminate) errors:
- When independently measured quantities are added or subtracted, the
standard deviation of the result is the square root of the sum of the
squares of the standard deviations of the quantities.
- When independently measured quantities are multiplied or divided, the
relative (fractional or percent) standard deviation of the result is the
square root of the sum of the squares of the relative standard deviations of
the quantities.
These are cumbersome to write. The simple underlying idea is this:
When using standard deviations, the rules for combining average deviations
are modified in this way: Instead of simply summing the error measures, you
square them, sum the squares and then take the square root of the sum. This is
called "summing in quadrature."
Are Standard Deviations Better? Too many elementary laboratory manuals
stress the standard deviation as the one standard way to express error measures.
However, one can find, from standard statistical theory that when very few
measurements are made, the error estimates themselves will have low precision.
The uncertainty of an error estimate made from n pieces of data is
$$ \frac{100}{[2(n-1)]^{1/2}}\ \text{percent} \qquad (7) $$
So we'd have to average 51 independent values to obtain a 10% error in the
determination of the error. We would need 5000 measurements to get an error
estimate good to 1%. If only 10 measurements were made, the uncertainty in the
standard deviation is about 24%. This is why we have continually stressed that
error estimates of 1 or 2 significant figures are sufficient when data samples
are small.
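Equation (7) is easy to tabulate. The sketch below evaluates it for the sample sizes mentioned in the text; the function name is hypothetical.

```python
import math

def error_of_error_percent(n):
    """Percent uncertainty of an error estimate made from n data values, per equation (7)."""
    if n < 2:
        raise ValueError("at least two measurements are needed")
    return 100.0 / math.sqrt(2 * (n - 1))

for n in (10, 51, 5000):
    print(f"n = {n:5d}: {error_of_error_percent(n):.1f}%")
```

Because the uncertainty falls only as the square root of the sample size, gaining one more reliable digit in an error estimate costs roughly a hundred times as many measurements.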
This is just one reason why the use of the standard deviation in elementary
laboratory is seldom justified. How often does one take more than a few
measurements of each quantity? Does one even take enough measurements to
determine the nature of the error distribution? Is it Gaussian, or something
else? One usually doesn't know. If it isn't close to Gaussian, the whole
apparatus of the usual statistical error rules for standard deviation must be
modified. But the rules for maximum error, limits of error, and average error
are sufficiently conservative and robust that they can still be reliably used
even for small samples.
However, when three or more different quantities contribute to a result, a
more realistic measure of error is obtained by using the `adding in quadrature'
method described at the beginning of this section.
Just as it's bad form to display more significant figures than are justified,
or to claim more significance for results than is warranted by the experiment,
so, too, it is bad form to use statistical techniques and measures of error to
express results when the data does not justify those error measures nor the
mathematical rules used to obtain them. This implies more quality and significance
in the results than may be the case, and borders on scientific fraud.
APPENDIX III. IMPORTANCE OF INDEPENDENCE WHEN USING ALGEBRAIC ERROR
PROPAGATION EQUATIONS
The algebraic rules given for propagation of indeterminate errors are one way to
derive correct error equations, but must be used with care. Here's an example
which illustrates a pitfall you must avoid.
A student wishes to calculate the error equation for two resistors in
parallel. The equation for parallel resistors is:
$$ \frac{1}{R} = \frac{1}{X} + \frac{1}{Y} \qquad (8) $$
The student solves this for R, obtaining:
$$ R = \frac{XY}{X+Y} \qquad (9) $$
The error in the denominator is, by the sum rule, x+y. To proceed, we must use
the quotient rule, which requires relative error measures. So the
student converts the error in the denominator to relative form, (x+y)/(X+Y). The
rest involves products and quotients, so the relative determinate error in R is
found to be:
$$ \frac{r}{R} = \frac{x}{X} + \frac{y}{Y} - \frac{x+y}{X+Y} \qquad (10) $$
The next step requires some algebra to cast this in standard form, but let's not
waste the effort, for this equation is already wrong!
Why? Eq. 9 has X and Y in both numerator and denominator. Therefore the
numerator and denominator are not independent. The quotient rule is not
valid when the numerator and denominator aren't independent.
To avoid this blunder, do whatever algebra is necessary to rearrange the
original equation so that application of the rules will never require combining
errors for non-independent quantities. In fact, the form of the equation given
in Eq. 8 is an ideal starting point, for all of its operations (+ and /) involve
independent quantities.
To do this correctly, begin with Eq. 8 (in which each quantity appears only
once and there is no question that every operation is independent). The relative
error in 1/X is, by the quotient rule, (0 - x/X) which is simply -x/X. The error
in 1/X is therefore (-x/X)(1/X) = -x/X^2. Likewise the error in 1/Y is
-y/Y^2 and in 1/R is -r/R^2. Finally, using the addition rule
for errors, the result is:
$$ \frac{r}{R^2} = \frac{x}{X^2} + \frac{y}{Y^2}, \quad\text{or}\quad \frac{r}{R} = \frac{R}{X}\,\frac{x}{X} + \frac{R}{Y}\,\frac{y}{Y}, \quad\text{or}\quad r = \frac{R^2}{X}\,\frac{x}{X} + \frac{R^2}{Y}\,\frac{y}{Y} \qquad (11) $$
Or, using Eq. 9, the right side can be expressed in terms of measured quantities
only.
$$ \frac{r}{R} = \frac{Y}{X+Y}\,\frac{x}{X} + \frac{X}{X+Y}\,\frac{y}{Y} \qquad (12) $$
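A numerical sketch may help show why independence matters for indeterminate errors. Applying the product, sum and quotient rules term by term to Eq. 9, as if numerator and denominator were independent, adds three positive contributions; Eq. 12 adds only two weighted ones. The resistor values and errors below are hypothetical, and the worst case is found by direct recalculation (R increases with both X and Y, so both errors are taken upward).

```python
X, Y = 100.0, 200.0   # hypothetical resistances, ohms
x, y = 1.0, 1.0       # hypothetical indeterminate errors, ohms

def parallel(X, Y):
    """Equivalent resistance of two resistors in parallel, Eq. 9."""
    return X * Y / (X + Y)

R = parallel(X, Y)

# Naive use of the rules on Eq. 9, treating numerator and denominator as independent
rel_naive = x / X + y / Y + (x + y) / (X + Y)

# Eq. 12, taken with all terms positive
rel_eq12 = (Y / (X + Y)) * (x / X) + (X / (X + Y)) * (y / Y)

# Worst case by direct recalculation
rel_direct = (parallel(X + x, Y + y) - R) / R

print(f"naive rules: {rel_naive:.4f}")
print(f"Eq. 12:      {rel_eq12:.4f}")
print(f"direct:      {rel_direct:.4f}")
```

For these values the naive estimate is more than twice the worst case actually possible, while Eq. 12 reproduces the direct result to first order.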
EXERCISES
In the following situations, consider common sense physical principles to
determine which is the most meaningful way to describe the error: as an
absolute error or a fractional error, an indeterminate error or a determinate
error, a precise measure or an accurate one. Support your answers by stating
your reasoning.
(1) A batch of plastic meter sticks is accurately manufactured, but a year after
leaving the factory the plastic shrank fairly uniformly by an average amount of
2 mm.
(2) The knife edges of a mechanical balance (used for weighing objects) have
become blunted.
(3) The fast/slow setting screw in a precision mechanical stopwatch is
misadjusted.
(4) The supports of the cone bearing in a mechanical electrical voltmeter
have become loose so that the pointer bearing is very loosely confined.
(5) The effect (small) of air drag on a measurement of the acceleration due
to gravity by a falling body experiment.
(6) (a) The effect of uncontrolled and unmeasured laboratory temperature on a
delicate mechanical instrument which makes measurements daily over many months.
(b) The effect of temperature on the instrument if the experiment took 60
seconds to complete.
(7) The effect of air drag on the period of a pendulum.
(8) The effect of very impure alcohol used as the liquid in the determination
of density of a solid by Archimedes' principle. [The solid is weighed when
immersed in the liquid and the formula for the result contains the density of
the liquid.]
In the next group of exercises, assume the following data: A = 10, B = 2, C = 5,
D = 20. In each case the formula for the result, R, is given. Calculate the
numeric value of R. Find the determinate error equation in each case, and then
use it to answer the specific question asked.
(9) Equation: R = (C - B)/A. Use the determinate-error equation to find what
the value of R would be if B were actually 2.1 instead of 2. Check your answer
by direct calculation.
$$ \frac{r}{R} = \frac{c-b}{C-B} - \frac{a}{A} $$
Hint: Without actually writing the whole determinate-error equation, we
can write the term of that equation which gives the contribution due to
error in B.
$$ \frac{r}{R} = \frac{-B}{C-B}\,\frac{b}{B}, $$
due to error in B alone.
(10) Equation: R = (C/A) - C - 5. Use the error equation to find R if C were
changed to 4.7. Check answer by direct calculation.
(11) Equation: R = (D^2 C^2)^(-3)/(D - A)^2. Find
how R changes if D changes to 22, A changes to 12 and C changes to 5.3 (all at
once).
(12) Equation: R = D sin [(A - C)/3B]. Find how R changes if C increases by
2%. Remember that arguments of trig functions are always in radians.
(13) Equation: R = exp[(C - B)/D] Find how R changes if B decreases by 2% and
D increases by 4 units. This is standard notation: exp(x) means the same as e^x.
Here e is, of course, the base of natural logarithms.
This last group of questions is more general and requires careful thought and
analysis of all possibilities. Be sure to consider these in the most general
context, considering all possible measures of error: indeterminate, determinate,
relative and absolute. The statements might be true for one kind of error
measure and false for others. If so, specify this in your answer.
(14) A student says, "When two measurements are mathematically combined,
the error in the result is always greater than the error of either of the
measurements." Discuss this statement critically.
(15) Another student says, "When two measurements have 2% error, and
they are used in an equation to calculate a result, the result will have 4%
error." Discuss, critically.
(16) Still another student says, "When several measurements are used to
calculate a result, the error in the result can never be less than the error of
the worst measurement". Discuss, critically.
(17) Yet another student says, "When several measurements are used to
calculate a result, and the error of one is 10 times as large as the next worst
one, you might as well neglect all but the worst one in the error propagation
equation." Discuss, critically.
ENDNOTES
1. Some of the better treatments of error analysis are:
- Young, Hugh D. Statistical Treatment of Experimental Data.
McGraw-Hill 1962.
- Baird, D. C. Experimentation, an introduction to measurement theory
and experiment design. Second edition. Prentice-Hall, 1988.
- Taylor, John R. An Introduction to Error Analysis. University
Science Books, 1982.
- Meiners, Harry F., Eppenstein and Moore. Laboratory Physics.
Wiley, 1969.
- Swartz, Clifford E. Used Math, for the first two years of college
science. Prentice-Hall, 1973. American Institute of Physics, 1996.
Chapter 1 discusses error analysis at the level suitable for Freshman.
- Swartz, Clifford E. and Thomas Miner. Teaching Introductory Physics, A
Sourcebook. American Institute of Physics, 1977. Chapter 2 of this
valuable book gives an account of error analysis which is entirely
consistent with my own philosophy on the matter. It discusses three levels
of treatment of errors.
- Significant Figures--a first approximation to error analysis. (But one
not adequate for undergraduate laboratory work in physics.)
- Absolute and Percentage Errors--a second approximation to error
analysis. This is the level we have discussed at length above. Swartz
and Miner say "[These] rules are ... often satisfactory. Indeed,
for most introductory laboratory work, they are the only valid rules."
- Data Distribution Curves--a third approximation to error analysis.
This includes the use of standard deviations as a measure of error, and
the rules for combining them. I cannot resist quoting from this book:
The use of this third approximation to error analysis is justified
only when certain experimental conditions and demands are met. If the
formalism is applied blindly, as it often is, sophisticated precision
may be claimed when it does not exist at all. The situation is
aggravated by the easy availability of statistical programs on many
hand calculators. Just enter a few numbers, press the keys, and
standard deviations and correlations will come tumbling out to 10
insignificant figures.
2. Some books call these "random errors." This is a poor name, for
indeterminate errors in measurements are not entirely random according to the
mathematical definition of random. I've also seen them called "chance
errors." Some other synonyms for indeterminate errors are: accidental,
erratic, and statistical errors.
3. The magnitude of a quantity is its size, without regard to its algebraic
sign.
4. The average deviation might more properly be called the "average
absolute deviation," or "mean absolute deviation," since it is a
mean of the absolute values of the deviations, not of the deviations
themselves. [The mean of the deviations of a symmetric distribution would be
zero.]
5. In the statistical study of uncertainties, the words "average"
and "mean" are not used as if they were complete synonyms. When
referring to the average of a set of data measurements, the word
"mean" is always used, rather than "average." When referring
to other averaging processes the word "average" is preferred. Perhaps
this usage distinction is to avoid generating a clumsy name like "mean
deviation of the mean."
6. See Laboratory Physics by Meiners, Eppenstein and Moore for more
details about the average deviation, and other measures of dispersion.
7. This relatively new notation for mean values is, I think, neater and
easier to read than the old notation of putting a bar over the Q.
8. For a good discussion see Laboratory Physics by Meiners,
Eppenstein and Moore. There (on p. 36) you will find a side-by-side calculation
of average deviation and standard deviation, and a discussion of how they
compare as measures of error.
9. The Gaussian distribution, sometimes called the "normal curve of
error" has the equation:
$$ f(X) = C\,e^{-(X - \langle X \rangle)^2 / (2s^2)} $$
where <X> is the mean value of the measurement X, and s is the standard
deviation of the measurements. C is a scaling constant. f(X) is the number of
measurements falling within a range of values from X to X + x, where x is small.
This is the famous "bell-shaped curve" of statistics.
10. See Meiners et al., who comment: "This means that for many
purposes, we can use the average deviation...instead of the standard deviation.
This is an advantage because the average deviation is easier to compute than the
standard deviation."
11. Independent errors are those for which the error of one individual
measurement is not dependent on the errors in other measurements. No error
influences the others, or is mathematically determinable from the others.
12. Calculus may be used instead.

