/*********************************************************/
start SS (result, y, XE, XR, level, printss, printh);
/*
SS.SAS
SUBROUTINE TO COMPUTE ANALYSIS OF VARIANCE
NUMERATOR SUMS OF SQUARES
Donald B. Macnaughton
donmac@matstat.com
ABSTRACT
This SAS IML subroutine computes analysis of variance nu-
merator sums of squares for statistical tests in unbalanced
analysis of variance. The subroutine can compute the fol-
lowing types of sums of squares:
- HTO (Higher-level Terms are Omitted)
= SPSS ANOVA Experimental
= SAS Type II in the two-way case
- SAS Type II
- HTOS (Higher-level Terms are Omitted unless Significant)
= superset of SAS Type II and HTO
- HTI (Higher-level Terms are Included)
= SAS Type III
= SPSS ANOVA Unique
= the default method in many analysis of variance pro-
grams
- sequential
= SAS Type I
= SPSS ANOVA Hierarchical in the two-way case
- SPSS ANOVA Hierarchical
- other types of sums of squares (including those in analy-
sis of covariance) that can be specified as being equal to
the difference between the residual sums of squares of two
overparameterized model equations.
Each time the subroutine is called it computes the single
sum of squares specified by the calling arguments and re-
turns the result to the calling program through the returned
argument RESULT.
SUBROUTINE USES
The subroutine
- provides an easy way of computing numerator sums of
squares that cannot be computed in some statistical pack-
ages
- provides a way of checking how a particular sum of squares
is computed in a statistical package
- illustrates the mathematical aspects of computing sums of
squares
- assists in performing simulation tests of conjectures
about analysis of variance sums of squares.
SUBROUTINE MAIN ARGUMENTS
When calling this subroutine you must specify three main ar-
guments: y, XE, and XR.
The first main argument, y, must be a column vector contain-
ing the values of the response variable obtained in the ex-
periment.
The second and third main arguments, XE and XR, must be sub-
matrices of the (full-column-rank) design matrix for the ex-
periment. These submatrices jointly specify the particular
sum of squares to be computed. XE must contain the subma-
trix associated with the effect being tested. XR must con-
tain the horizontal concatenation of the submatrices associ-
ated with all the other terms (except the error term) on the
right side of the two model equations whose residual sums of
squares you wish to difference.
I give examples of specification of y, XE, and XR in two
sample calling programs (1998a, 1998b).
SUBROUTINE METHOD
As noted above, the subroutine is supplied with the vector y
and the matrices XE and XR. Here is the code (all four
lines) for computing the requested sum of squares and stor-
ing it in the variable RESULT:
H = XE - XR * ginv(XR) * XE   <-- hypothesis matrix
                                  (dimensions are N x DF)
PM = H * ginv(H)              <-- projection matrix (N x N)
yp = PM * y                   <-- projection of y (N x 1)
result = ssq(yp)              <-- squared length of the
                                  projection (1 x 1).
If you are familiar with matrix algebra, the above code will
probably be understandable, even if you are not familiar
with SAS IML. But here are definitions for functions that
might be puzzling:
ginv(X) = Moore-Penrose generalized inverse of X
ssq(x) = sum of the squares of the elements in x.
The subroutine method is derived from discussion and proofs
by Hocking (1985) and Searle (1987). Aldrich (1998) dis-
cusses the history of the methods. I discuss the method in
more detail below.
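For readers who wish to experiment outside SAS IML, the four
statements above can be sketched in Python with NumPy (an
illustration only, not part of the subroutine; np.linalg.pinv
plays the role of IML's ginv, and the toy data and coding are
arbitrary choices):

```python
import numpy as np

# Toy unbalanced one-way layout: three groups with 3, 2, and 2
# observations (arbitrary numbers, for illustration only).
y = np.array([4.0, 5.0, 6.0, 7.0, 9.0, 3.0, 2.0]).reshape(-1, 1)

# XR: intercept column (the reduced model); XE: two effect-coded
# columns for the group effect (the term being tested).
XR = np.ones((7, 1))
XE = np.array([[ 1,  0], [ 1,  0], [ 1,  0],
               [ 0,  1], [ 0,  1],
               [-1, -1], [-1, -1]], dtype=float)

H = XE - XR @ np.linalg.pinv(XR) @ XE    # hypothesis matrix (N x DF)
PM = H @ np.linalg.pinv(H)               # projection matrix (N x N)
yp = PM @ y                              # projection of y (N x 1)
result = (yp ** 2).sum()                 # squared length (1 x 1)

# For this one-way layout, result equals the between-groups sum
# of squares, sum(n_i * (mean_i - grand_mean)^2) = 1487.5/49.
print(result)
```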
SUMMARY OF THE REST OF THE SUBROUTINE
The remaining lines in this subroutine (717 lines) contain
- a discussion of details of the method discussed above
- a discussion of details of the subroutine operation and
- the executable version of the four statements given above.
I recommend that you omit reading these lines unless you are
interested in the details. (To find the end of the subrou-
tine to return to reading the calling program, search for
the row of asterisks.)
THE HYPOTHESIS MATRIX H
As noted above, the first step of this subroutine is to gen-
erate the hypothesis matrix H, using the following state-
ment:
H = XE - XR * ginv(XR) * XE.
Actually, the above statement is one of two different meth-
ods that the subroutine can use to generate H. (I describe
how to specify which method is to be used in the Secondary
Arguments section below.) The other method is
H = XE - XR * inv(XR` * XR) * XR` * XE.
Algebraically, both methods yield exactly the same result
for H, as shown by Graybill's theorem 6.2.16 (1983, 112).
(At several places in this subroutine I state that two meth-
ods yield exactly the same result.  Although these state-
ments are algebraically correct, in a computer program the
two methods will usually produce slightly different values
because of differences in roundoff errors.  One should be
aware of these differences, but they can usually be ignored
because they are usually confined to the last few signifi-
cant digits of computations that SAS performs with sixteen
decimal digits of precision.)
H has the following properties:
1. H has N rows, where N is the number of rows in y, XE,
and XR.
2. H has DF columns, where DF is the number of degrees of
freedom of the effect being tested.
3. Choose any cell in the table that summarizes the layout
of the experiment. (For an example of such a table, see
Searle's carrot seed germination data [1987, 79], repro-
duced in table 1 in Macnaughton [1998a].) All the rows
in H associated with the chosen cell are identical to
each other.
4. The sum of each column of H equals zero. This is sur-
prising because the sums of the columns of XE and XR,
which are used to generate H, generally do NOT equal
zero in unbalanced experiments.
5. If H has only one column (i.e., DF = 1), we can view the
elements in H as a statement of the hypothesis being
tested. That is, the elements in H are (indirectly)
multiplied by the corresponding elements in the response
data vector (in a "contrast") as a step in computing the
sum of squares.
6. If H has more than one column, the columns of H are
"linearly independent".
7. If H has more than one column, the elements in each col-
umn of H can be viewed as a "portion" of the hypothesis
being tested. That is, each column of H represents a
separate contrast that is applied to the response data
vector. The results of these contrasts are mathemati-
cally combined to compute the sum of squares. (The con-
trasts are not directly applied to the response data
vector, but only indirectly through PM, as discussed be-
low.)
8. Surprisingly, H is not unique. That is, for any given H
we can replace it by any of an infinite number of "re-
lated" matrices (which have, of course, the same number
of rows and columns as H), and this subroutine will re-
turn exactly the same value for the sum of squares. H
does not need to be unique to yield a unique sum of
squares because the columns of H are not defining them-
selves. Instead, the columns are defining a unique sub-
space of the N-dimensional vector space under study.
Linear algebra shows that other versions of the matrix H
can be used to define the same unique subspace. More
specifically, suppose T is ANY "non-singular" matrix
with DF rows and DF columns. Then we can replace H with
H1 = H * T and this subroutine will return exactly the
same sum of squares.
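Property 8 is easy to check numerically.  The sketch below
(Python with NumPy, an illustration only; the sizes N = 6 and
DF = 2 and the random data are arbitrary) multiplies H by a
random non-singular T and confirms that the projection matrix,
and hence the sum of squares, is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((6, 2))      # arbitrary N x DF hypothesis matrix
y = rng.standard_normal((6, 1))      # arbitrary response vector

T = rng.standard_normal((2, 2))      # random DF x DF matrix ...
while abs(np.linalg.det(T)) < 1e-6:  # ... kept only if non-singular
    T = rng.standard_normal((2, 2))
H1 = H @ T

PM = H @ np.linalg.pinv(H)
PM1 = H1 @ np.linalg.pinv(H1)
ss = ((PM @ y) ** 2).sum()
ss1 = ((PM1 @ y) ** 2).sum()

print(np.allclose(PM, PM1))          # same projection matrix
print(abs(ss - ss1) < 1e-8)          # same sum of squares
```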
THE PROJECTION MATRIX PM
As noted above in the "Method" section, the next step after
computing H is to compute the projection matrix PM, using
the following statement:
PM = H * ginv(H).
Actually (as with H), the above statement is one of two dif-
ferent methods that this subroutine can use to generate PM.
The other method is
PM = H * inv(H` * H) * H`.
As (again) shown by Graybill's theorem 6.2.16, both methods
yield exactly the same value for PM.
The method immediately above is equivalent to computing
Hocking's P(c) in (6.168) (1985, 153). This method is par-
tially shown to be valid in Graybill's Theorem 4.4.1 (1983,
73). Harville proves some useful results for projection ma-
trices (1997, sec 12.3).
Note how the above two equations imply that PM is a "normal-
ized" version of the hypothesis matrix H. PM has the fol-
lowing properties:
1. PM has N rows and N columns.
2. Choose any cell in the table that summarizes the layout
of the experiment. All the rows in PM associated with
that cell are identical to each other.
3. PM is symmetric. That is, the first row has exactly the
same elements (in left-to-right order) as the first col-
umn (in top-to-bottom order), and the second row has ex-
actly the same elements as the second column, and so on.
4. The sum of the elements in each row of PM (and the sum
of the elements in each column) is equal to zero. This
implies that if a row of PM is multiplied by the vector
y, it produces a "contrast" of the values in y.
5. The trace of PM (i.e., the sum of the elements on the
descending diagonal) is equal to DF, the number of de-
grees of freedom associated with the hypothesis being
tested.
6. The rank of PM (i.e., the number of linearly independent
rows or columns in PM) is equal to DF. Thus if DF is
equal to 1, any row in PM is a multiple of any other
row. If DF is equal to k, any row in PM can be gener-
ated as a linear combination of any k mutually independ-
ent other rows.
7. The sum of the squares of the elements in PM is equal to
DF.
8. The rows (and columns) of PM are linear combinations of
the columns of H.
9. PM is unique. That is, suppose a projection matrix PM
is computed from a given hypothesis matrix H. Suppose
that then H is multiplied by any non-singular matrix T
with DF rows and DF columns to yield H1. Suppose that
then a new projection matrix PM1 is computed from H1.
Then PM1 = PM.
10. PM is idempotent. That is, if we multiply PM by itself,
the answer we get is PM.
11. If (as discussed below) we project an arbitrary vector x
through PM to yield x1, and if we then project x1
through PM to yield x2, we will find that x1 and x2 are
identical, although they will generally differ from x.
12. PM has DF eigenvalues (characteristic values) that are
equal to +1 and the remaining eigenvalues are equal to
zero. The eigenvectors (characteristic vectors) of PM
that correspond to the non-zero eigenvalues can be used
as columns to form a valid version of the hypothesis ma-
trix H.
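Several of these properties (symmetry, trace = DF, sum of
squared elements = DF, idempotence, and the eigenvalue pattern)
hold for any projection matrix and can be verified numerically.
Here is a sketch in Python with NumPy using an arbitrary H with
N = 8 and DF = 3 (properties 2 and 4 additionally depend on the
structure that XE and XR supply, so they are not checked here):

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.standard_normal((8, 3))     # arbitrary N = 8, DF = 3 example
PM = H @ np.linalg.pinv(H)
DF = H.shape[1]

sym = np.allclose(PM, PM.T)         # property 3: symmetric
trace = np.trace(PM)                # property 5: trace = DF
sumsq = (PM ** 2).sum()             # property 7: sum of squares = DF
idem = np.allclose(PM @ PM, PM)     # property 10: idempotent
eigs = np.linalg.eigvalsh(PM)       # property 12: DF ones, rest zeros
print(sym, idem, round(trace, 10), round(sumsq, 10))
```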
yp, THE PROJECTION OF y BY PM
Thus far, we have only used information about the *predic-
tor* variables in the experiment to derive the following
four new matrices: XE, XR, H, and PM. That is, we have not
yet taken any account of the values of the response variable
stored in the vector y. The next step in computing the sum
of squares is to mathematically marry the response variable
and the predictor variables. We do so by using the projec-
tion matrix PM to "project" y. That is, we postmultiply the
projection matrix by y to yield a new vector, called yp, as
follows:
yp = PM * y.
The vector yp has the following properties:
1. Like y, yp is an N x 1 column vector.
2. Choose any cell in the table that summarizes the layout
of the experiment. All the elements in yp associated
with that cell are identical to each other.
THE SUM OF SQUARES AS THE SQUARED LENGTH OF yp
The desired sum of squares is simply the squared length of
yp, and is computed as
result = ssq(yp).
Thus the sum of squares is simply the squared length of the
projection of the vector y by the projection matrix PM.  PM
has two important properties related to the projection of y:
1. If no relationship exists between the response variable
and the particular set of predictor variables associated
with XE (and if certain well-known assumptions are ade-
quately satisfied), the length of the projection of y can
be expected to be "short"; it will be roughly equal to a
known length (i.e., a length that can be computed from
the data). On the other hand if a relationship between
the variables *does* exist, the length of the projection
of y will tend to be longer than the known length. Thus
computing a p-value is simply computing whether the pro-
jection of y is longer than could be reasonably expected
if no relationship exists.
2. If we are studying a balanced experiment, and if we com-
pute projection matrices PM1 and PM2 for any two of the
effects (main effects or interactions) in the experiment,
PM1 * PM2 = 0, where 0 is an N x N matrix of zeros.
(This impressive result does not generally occur in un-
balanced experiments.) This means that in a balanced ex-
periment the projection of any vector y by PM1 is "or-
thogonal" to (i.e., at right angles to) the projection of
the same vector (or any other vector y1) by PM2. It also
means that if an experiment is balanced, none of the ef-
fects in the experiment can "contaminate" the statistical
tests of other effects in the experiment. (This contami-
nation, which I shall demonstrate in later material, oc-
curs with some statistical tests in unbalanced experi-
ments.)
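Property 2 can be demonstrated with a small balanced layout.
The sketch below (Python with NumPy, illustrative only) uses a
balanced 2 x 2 design with two observations per cell and
effect-coded main-effect columns for factors A and B; the two
projection matrices multiply to the zero matrix:

```python
import numpy as np

# Balanced 2 x 2 layout, two observations per cell (N = 8).
a = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float).reshape(-1, 1)
b = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float).reshape(-1, 1)
j = np.ones((8, 1))                  # intercept column

def hyp(XE, XR):
    """H = XE - XR * ginv(XR) * XE, as in the subroutine."""
    return XE - XR @ np.linalg.pinv(XR) @ XE

HA = hyp(a, np.hstack([j, b]))       # test A, adjusting for 1 and B
HB = hyp(b, np.hstack([j, a]))       # test B, adjusting for 1 and A
PM1 = HA @ np.linalg.pinv(HA)
PM2 = HB @ np.linalg.pinv(HB)

print(np.allclose(PM1 @ PM2, 0))     # True: the projections are orthogonal
```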
OTHER METHODS OF COMPUTING THE SUM OF SQUARES
Simple linear algebra implies that we can also compute the
desired sum of squares as
result = yp` * yp.
Also, Searle (1987) shows in (82) on page 264, (90) on page
272, and (90) on page 318 that we can compute the desired
sum of squares directly from PM as the quadratic form
result = y` * PM * y.
I find the projection approach (a geometric approach) easier
to understand than the quadratic form approach (an algebraic
approach). I visualize the response vector as an arrow in
the N-dimensional vector space that is "projected through"
the projection matrix to generate another arrow in a sub-
space of the first space. The length of the second arrow is
related to selected properties of the first arrow. In par-
ticular, the projection matrix is specifically designed so
that the length of the second arrow shows (in a way that is
as mathematically "clear" as possible) the strength of sup-
port (provided by the y-values obtained in the experiment)
for the hypothesis that the relationship between variables
associated with the projection matrix exists in the popula-
tion.
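The three formulas give identical answers up to roundoff, which
is easy to confirm numerically.  A sketch in Python with NumPy
(an arbitrary N = 7, DF = 2 example; the equality of the quad-
ratic form with the squared projection rests on the symmetry
and idempotence of PM):

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((7, 2))      # arbitrary N = 7, DF = 2 example
y = rng.standard_normal((7, 1))
PM = H @ np.linalg.pinv(H)
yp = PM @ y

ss_len = (yp ** 2).sum()             # result = ssq(yp)
ss_dot = float(yp.T @ yp)            # result = yp` * yp
ss_quad = float(y.T @ PM @ y)        # result = y` * PM * y
print(ss_len, ss_dot, ss_quad)       # three equal values
```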
GENERAL COMMENTS ABOUT THE SUBROUTINE METHOD
It is helpful to review what this subroutine accomplishes.
In essence, the calling program passes the matrices XE and
XR to the subroutine. These matrices contain the specifica-
tions of the layout of the experiment and the specification
of two model equations. The calling program also passes the
vector y to the subroutine. This vector contains the values
of the response variable obtained in the experiment. Fol-
lowing Yates' characterization of analysis of variance sums
of squares (1934, 63), the subroutine uses the three argu-
ments to compute the difference between the residual sums of
squares of the two model equations. Depending on how the
calling program specifies the values of XE and XR, this al-
lows the subroutine to compute a sum of squares using any of
the seven approaches to computing analysis of variance sums
of squares named in the abstract of the subroutine. (I il-
lustrate how to call the subroutine to compute sums of
squares for some of the approaches in two computer programs
[1998a, 1998b].)
SUBROUTINE SECONDARY ARGUMENTS
The remaining lines in this subroutine (419 lines) contain
- details of the subroutine operation and
- the executable version of the statements given above.
You can find the end of the subroutine to return to reading
the calling program by searching for the row of asterisks.
To use this subroutine you must supply values for three sec-
ondary arguments: LEVEL, PRINTSS, and PRINTH. These argu-
ments control details of how the subroutine performs the
computation and prints the results.
The argument LEVEL controls which method the subroutine uses
to compute the hypothesis matrix H and the projection matrix
PM. If you set LEVEL to 1, the two matrices are computed
using the standard inverse. If you set LEVEL to 2, the two
matrices are computed using the generalized inverse (as
shown in the Method section). Using LEVEL = 1 seems to
yield solutions that are slightly more accurate.
The argument PRINTSS controls whether the subroutine prints
the sum of squares it has computed. If you set PRINTSS to
1, the subroutine prints the value of the sum of squares in
25.15 format. If you set PRINTSS to zero, the subroutine
does not print the sum of squares but instead only returns
the value to the calling program through the argument
RESULT. If you set PRINTSS to 2, the subroutine prints all
the printable digits of the computed sum of squares in E-no-
tation for possible comparison against other computed val-
ues. If PRINTSS is 1 or 2, the subroutine also prints the
results of tests of the integrity of the projection matrix
PM. (I describe the tests below.)
The argument PRINTH controls whether the subroutine prints
the following three intermediate results:
- the hypothesis matrix H
- the projection matrix PM and
- the projection, yp, of the response data vector.
If you set PRINTH to 1, the subroutine prints the intermedi-
ate results. If you set PRINTH to 2, the subroutine prints
the intermediate results but prints the hypothesis matrix
and the projection of the response vector transposed, which
can sometimes save space in the output. If you set PRINTH
to zero, the subroutine does not print intermediate results.
SUBROUTINE NOTES
The numerical method of computing sums of squares used by
this subroutine is efficient for supporting the conceptual
approach to analysis of variance sums of squares of comput-
ing the difference between the residual sums of squares of
two overparameterized model equations. However, the method
is generally not the most efficient method in terms of
- economy of computation in memory required
- economy of computation in time required
- minimization of roundoff errors for large datasets or ill-
conditioned data.
Nevertheless, the subroutine generally yields highly accu-
rate sums of squares in minimal time and is therefore more
than adequate for most applications that cannot be handled
by a general analysis of variance program.
One inefficiency of the method used by this subroutine re-
lates to the inclusion of identical rows within each of the
following five arrays:
- XE and XR, the relevant submatrices of the design matrix
- H, the hypothesis matrix
- PM, the projection matrix
- yp, the projection of vector y.
That is, (as noted above) for any given cell in the table
that summarizes the layout of the experiment, all the rows
in each of the five arrays that correspond to that cell are
(within the array) identical -- one row for each value of
the response variable associated with the cell. Sunwoo and
Kim (1997) discuss an approach to analyzing unbalanced ex-
periments that eliminates this duplication. For very large
experiments, this subroutine could be enhanced to take ac-
count of the Sunwoo and Kim approach, thereby substantially
increasing the computational efficiency (at the expense of
an increase in complexity).
(The enhancement might work as follows: Compute y1, which
is an m-dimensional column vector containing the cell means
[or possibly cell totals] of the values of the response
variable, where m is the number of cells in the experiment.
Compute XE, XR, H, and PM as above but on the basis of an
equivalent experiment with only one observation per cell.
Compute yp1 as the projection of y1 by PM. Scale yp1 with
the Sunwoo and Kim T matrix to [in effect] yield yp, and
compute the desired sum of squares from yp as before.)
The hypothesis matrix H I use in this subroutine has dimen-
sions N x DF. This is the transpose of the hypothesis ma-
trix Ronald Hocking discusses in his book (1985, 153), and I
discuss in a paper (1997, appendix C). I have used the
transpose because it yields slightly simpler notation.
To conserve memory, the subroutine erases XE and XR after
using them.
If you wish to run this subroutine on an EBCDIC system
(e.g., an IBM mainframe), see the note near the end.
SUBROUTINE EXECUTABLE STATEMENTS
First, check the arguments passed to the subroutine and stop
if a problem is found.
*/
if level ^= 1 & level ^= 2 then do;
print '***ERROR*** in call to SS subroutine.';
print 'Value of LEVEL is' level;
print 'LEVEL must be 1 or 2. Execution terminated.';
abort;
end;
if printss ^= 0 & printss ^= 1 & printss ^= 2 then do;
print '***ERROR*** in call to SS subroutine.';
print 'Value of PRINTSS is' printss;
print 'Value must be 0, 1, or 2. Execution terminated.';
abort;
end;
if printh ^= 0 & printh ^= 1 & printh ^= 2 then do;
print '***ERROR*** in call to SS subroutine.';
print 'Value of PRINTH is' printh;
print 'PRINTH must be 0, 1, or 2. Execution terminated.';
abort;
end;
if type(y) = 'U' | type(XE) = 'U' | type(XR) = 'U' then do;
print '***ERROR*** in call to SS subroutine.';
string = 'One or more of the matrices y, XE';
string = string + ', and XR do not exist. You must ';
string = string + 'specify the three matrices before ';
string = string + 'calling the SS subroutine.';
print string;
print 'Execution terminated.';
abort;
end;
n = nrow(y);
if nrow(XE) ^= n | nrow(XR) ^= n then do;
string = '***ERROR*** in call to SS subroutine. ';
string = string + 'Discrepancy found between the ';
string = string + 'number of rows in y, XE, and XR:';
print string;
nrow_y = n;
nrow_XE = nrow(XE);
nrow_XR = nrow(XR);
print nrow_y nrow_XE nrow_XR;
print 'Execution terminated.';
abort;
end;
/*
Compute Searle's M1 as defined in his (76) on pages 263 and
318 of his book (1987) as
M1 = I(n) - XR * ginv(XR)
where
I(n) = the identity matrix (of dimension n x n with 1's
on the diagonal and zeros elsewhere).
Then compute the hypothesis matrix
H = M1 * XE.
Note that XE in this subroutine is equivalent to Searle's X2
and XR is equivalent to Searle's X1.
The following statements perform the above arithmetic but
bypass the intermediate step of computing M1. The chosen
method depends on the value of LEVEL.
*/
if level = 1 then
H = XE - XR * inv(XR` * XR) * XR` * XE;
else
H = XE - XR * ginv(XR) * XE;
/*
Since they are no longer needed, erase XE and XR to conserve
memory. Note that erasing the two matrices here means that
after returning from this subroutine the values in the ma-
trices will not be available in the calling program.
To conserve memory, other matrices are also erased below as
soon as they are no longer needed.
*/
free XE XR;
/*
Compute the projection matrix PM using the appropriate
method, as determined by LEVEL.
*/
if level = 1 then
PM = H * inv(H` * H) * H`;
else
PM = H * ginv(H);
if printh = 0 then free H;
/*
Compute the projection of y.
*/
yp = PM * y;
if printss = 0 then free PM;
/*
Compute the desired sum of squares as the squared length of
the projection yp.
*/
result = ssq(yp);
if printh = 0 then free yp;
/*
If requested, print the computed sum of squares.
*/
if printss = 1 then print result [format=25.15];
if printss = 2 then print result [format=e23.];
/*
If requested, print the intermediate results:
- the hypothesis matrix
- the projection matrix
- the projection of y.
Print the hypothesis matrix and the projection of y untrans-
posed or transposed, depending on the value of PRINTH.
*/
if printh = 1 then do;
print ,
'Hypothesis matrix H = XE - XR * ginv(XR) * XE:', H;
print ,'Projection matrix PM = H * ginv(H):', PM;
print ,'Projection of y: yp = PM * y:', yp;
end;
if printh = 2 then do;
print ,
'Transpose of hypoth. matrix H = XE - XR * ginv(XR) * XE:',
(H`);
print ,'Projection matrix PM = H * ginv(H):', PM;
print ,'Transpose of projection of y: yp = PM * y:',
(yp`);
end;
/*
If the argument PRINTSS requested printing of sums of
squares then, as a way of (indirectly) checking the accuracy
of the subroutine, perform checks on PM to see how close it
is to having the following properties:
- symmetric
- rows sum to zero
- idempotent.
*/
if printss > 0 then do;
/*
The checks are done by first computing the largest relative
difference between corresponding values in two matrices that
should be identical. The relative difference between two
matrices P and Q (called the "relative error") is defined as
E = (P - Q) / P.
Both the subtraction and the division in the above expres-
sion are done on an element-by-element basis. Thus E is it-
self a matrix with the same dimensions as P and Q. The sub-
routine computes L, the largest absolute value of the ele-
ments in E as a measure of the equality of the two matrices.
The subroutine then converts L to a rough number of digits
of accuracy with the formula
ndigits = -log10(L).
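The relative-error and ndigits computation can be sketched in
Python with NumPy (illustrative only; P and Q here are a hand-
made pair of matrices that agree except for a simulated round-
off perturbation):

```python
import numpy as np

P = np.array([[ 0.5, -0.5],
              [-0.5,  0.5]])
Q = P + np.array([[1e-12, -2e-12],   # simulated roundoff differences
                  [3e-12, -1e-12]])

E = (P - Q) / P             # element-by-element relative error
L = np.max(np.abs(E))       # largest absolute relative error
ndigits = -np.log10(L)      # rough number of digits of agreement
print(round(ndigits, 1))    # about 11.2 digits
```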
First generate PMD, the divisor matrix for two of the rela-
tive errors. PMD is simply PM, except that we ensure that
none of the values in PMD is zero.
*/
if all(PM ^= 0) then PMD = PM;
else do;
PMD = PM + ((PM = 0) * .6e-78);
nzero = sum(PM = 0);
string = 'elements of PM were zero. .6E-78 ';
string = string + 'was added to these elements ';
string = string + 'to avoid dividing by zero in ';
string = string + 'computing relative errors.';
print nzero string;
end;
/*
Second, set DECACC, which is the number of decimal digits of
accuracy carried in the computations. This value depends on
the computer operating system and language used to run the
subroutine. For my (Windows) computer the correct value
with SAS is 16.
*/
decacc = 16;
/*
Third, compute the maximum absolute relative difference be-
tween corresponding elements in PM and its transpose as a
measure of the symmetry of PM.
*/
mrelerrS = max( abs( (PM - PM`)/PMD ) );
if mrelerrS = 0 then ndigitsS = decacc;
else ndigitsS = -log10(mrelerrS);
/*
Fourth, compute the maximum absolute relative error between
the sum of elements in a row of PM and zero; use the average
of the absolute values of the elements in the row as the di-
visor. (See page 47 of the IML manual for subscript reduc-
tion operators [SAS Institute Inc., 1989].)
*/
D = abs(PM);
mrelerrR = max( abs( PM[,+] / D[,:] ) );
if mrelerrR = 0 then ndigitsR = decacc;
else ndigitsR = -log10(mrelerrR);
/*
Fifth, compute the maximum absolute relative difference be-
tween corresponding elements in PM and PM-squared as a meas-
ure of how close PM is to being idempotent.
*/
mrelerrI = max( abs( (PM - (PM * PM)) / PMD ) );
if mrelerrI = 0 then ndigitsI = decacc;
else ndigitsI = -log10(mrelerrI);
/*
Print the computed numbers of digits of accuracy.
*/
print ndigitsS [format=5.1]
ndigitsR [format=5.1]
ndigitsI [format=5.1];
end; /* of if printss > 0 then do */
/*
SUBROUTINE NOTE FOR USERS OF EBCDIC COMPUTER SYSTEMS
The NOT operator (^) above in this subroutine is the correct
operator for ASCII systems. If the subroutine is run on an
EBCDIC system (e.g., an IBM mainframe), you may have to
change each occurrence of ^ to the EBCDIC logical NOT opera-
tor, which looks like a minus sign with a short vertical bar
dropping down from the right end (and which is EBCDIC hexa-
decimal 5F).
SUBROUTINE REFERENCES
Aldrich, J. 1998. Doing least squares: Perspectives from
Gauss and Yule. _International Statistical Review_ 66,
61-81.
Graybill, F. A. 1983. _Matrices with Applications in
Statistics_ 2d ed. Belmont, CA: Wadsworth.
Harville, D. A. 1997. _Matrix Algebra From a Statistician's
Perspective._ New York: Springer-Verlag.
Hocking, R. R. 1985. _The Analysis of Linear Models._
Monterey, CA: Brooks/Cole.
Macnaughton, D. B. 1997. Which sums of squares are best in
unbalanced analysis of variance. Available at
http://www.matstat.com/ss/
Macnaughton, D. B. 1998a. PR0139.HTM: Computing numerator
sums of squares in unbalanced analysis of variance: Two-
way case. Available in the "Computer Programs" section
at http://www.matstat.com/ss/
Macnaughton, D. B. 1998b. PR0165.HTM: Computing numerator
sums of squares in unbalanced analysis of variance:
Three-way case. Available in the "Computer Programs"
section at http://www.matstat.com/ss/
SAS Institute Inc. 1989. _SAS/IML Software: Usage and Refer-
ence, Version 6, First Edition._ Cary, NC: author.
Searle, S. R. 1987. _Linear Models for Unbalanced Data._
New York: Wiley.
Sunwoo, H., and B. C. Kim. 1997. Analysis of the unbalanced
linear model based on the balanced model. _Journal of
Statistical Computation and Simulation_ 56, 373-385.
Yates, F. 1934. The analysis of multiple classifications
with unequal numbers in the different classes. _Journal
of the American Statistical Association_ 29, 51-66.
*/
finish SS; /* end of subroutine SS
version of June 19, 1998
***********************************************************/