Estimability Tools for Package Developers

by Russell V. Lenth

Abstract: When a linear model is rank-deficient, predictions based on that model become questionable because not all predictions are uniquely estimable. However, some of them are, and the estimability package provides tools that package developers can use to tell which is which. With these tools, a model object's predict method could return estimable predictions as-is while flagging non-estimable ones in some way, so that the user knows which predictions to believe. The estimability package also provides, as a demonstration, an estimability-enhanced epredict method to use in place of predict for models fitted using the stats package.

Introduction

Consider a linear regression or mixed model having fixed component of the matrix form $X\beta$. If $X$ is not of full column rank, then there is not a unique estimate $b$ of $\beta$. However, consider using $\lambda' b$ to estimate the value of some linear function $\lambda'\beta = \sum_j \lambda_j \beta_j$. (We use $x'$ to denote the transpose of a vector $x$.) For some $\lambda$s, the prediction depends on the solution $b$; but for others, the estimable ones, it does not.

An illustration

An example will help illustrate the issues. In the following commands, we create four predictors x1–x4 and a response variable y:

x1 <- -4:4
x2 <- c(-2, 1, -1, 2, 0, 2, -1, 1, -2)
x3 <- 3 * x1 - 2 * x2
x4 <- x2 - x1 + 4
y <- 1 + x1 + x2 + x3 + x4 + c(-.5, .5, .5, -.5, 0, .5, -.5, -.5, .5)

Clearly, x3 and x4 depend linearly on x1, x2, and the intercept. Let us fit two versions of the same model to these data, entering the predictors in different orders, and compare the regression coefficients:

mod1234 <- lm(y ~ x1 + x2 + x3 + x4)
mod4321 <- lm(y ~ x4 + x3 + x2 + x1)
zapsmall(rbind(b1234 = coef(mod1234), b4321 = coef(mod4321)[c(1, 5:2)]))

      (Intercept) x1 x2 x3 x4
b1234           5  3  0 NA NA
b4321         -19 NA NA  3  6

Note that in each model, two regression coefficients are NA. This indicates that the associated predictors were excluded due to their linear dependence on the others.
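As a quick check (a small sketch using base R's model.matrix and qr, not part of the article's code), we can confirm that the model matrix has five columns but only rank 3:

X <- model.matrix(mod1234)  # 9 x 5: intercept and x1..x4
qr(X)$rank                  # 3, so two columns are linearly dependent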

The problem that concerns us is making predictions on new values of the predictors. Here are some predictor combinations to try:

testset <- data.frame(
    x1 = c(3, 6, 6, 0, 0, 1),
    x2 = c(1, 2, 2, 0, 0, 2),
    x3 = c(7, 14, 14, 0, 0, 3),
    x4 = c(2, 4, 0, 4, 0, 4))

And here is what happens when we make the predictions:

cbind(testset,
    pred1234 = predict(mod1234, newdata = testset),
    pred4321 = suppressWarnings(predict(mod4321, newdata = testset)))

  x1 x2 x3 x4 pred1234 pred4321
1  3  1  7  2       14       14
2  6  2 14  4       23       47
3  6  2 14  0       23       23
4  0  0  0  4        5        5
5  0  0  0  0        5      -19
6  1  2  3  4        8       14

Warning message:
In predict.lm(mod1234, new = testset) :
  prediction from a rank-deficient fit may be misleading

Note that the two models produce identical predictions in some cases (rows 1, 3, and 4) and different ones in the others. There is also a warning message (suppressed for the second call) telling us that this could happen.

It happens that cases 1, 3, and 4 are estimable, and the others are not. It would be helpful to know which predictions to trust, rather than receiving only a vague warning. The epredict function in estimability (Lenth, 2015) accomplishes this:

require("estimability") rbind(epred1234 = epredict(mod1234, newdata = testset),

  • epred4321 = epredict(mod4321, newdata = testset))

           1  2  3 4  5  6
epred1234 14 NA 23 5 NA NA
epred4321 14 NA 23 5 NA NA

The results for non-estimable cases are indicated by NAs. Note that in both models, the same new-data cases are identified as estimable, and they are exactly the ones that yield the same predictions. Note also that estimability was determined separately for each model; we do not need to compare two different fits to establish it.
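To see why rows 1, 3, and 4 are the estimable ones, recall that the only linear dependencies in the model matrix are x3 = 3*x1 - 2*x2 and x4 = x2 - x1 + 4. In the following sketch, the null-space vectors n1 and n2 are hand-derived from those dependencies (our own illustration, not output of the package); a new-data row is estimable exactly when its $\lambda$ vector is orthogonal to both:

# Null-space vectors of the model matrix, in (Intercept, x1, x2, x3, x4) order
n1 <- c(0, 3, -2, -1, 0)   # from 3*x1 - 2*x2 - x3 = 0
n2 <- c(4, -1, 1, 0, -1)   # from 4 - x1 + x2 - x4 = 0
L <- cbind(1, as.matrix(testset))   # each row is a lambda vector
as.vector(abs(L %*% n1) < 1e-8 & abs(L %*% n2) < 1e-8)
# TRUE FALSE TRUE TRUE FALSE FALSE  -- rows 1, 3, and 4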

Adding estimability checking to a modeling package

It is a simple matter to add estimability checking to the predict method(s) in a new or existing package.

  1. The package should import the estimability package.
  2. Use estimability::nonest.basis to obtain the basis for non-estimable functions of the regression coefficients, preferably in the model-fitting code.
  3. In the predict method, call estimability::is.estble on the model matrix for new data. (It is not necessary to check estimability of predictions on the original data.)

In code, this is implemented something like this:

fitmodel <- function(...) {
    ...
    code to create: 'X' (the fixed-effects model matrix),
                    'QR' (QR decomposition of 'X'),
                    'object' (the fitted model object)
    ...
    object$nonest <- estimability::nonest.basis(QR)
    object
}

predict.mymodel <- function(object, newdata, tol = 1e-8, ...) {
    ...
    if (!missing(newdata)) {
        ...
        code to create: 'newX' (the model matrix for 'newdata')
        ...
        estble <- estimability::is.estble(newX, object$nonest, tol)
        ...
        code to flag non-estimable predictions
        ...
    }
    ...
}

Basics of estimability

A linear function $\lambda'\beta$ is estimable if there is a vector $a$ such that $E(a'Y) = \lambda'\beta$; see, e.g., Searle (1997) or Seber and Lee (2003). Since $E(Y) = X\beta$, the $i$th element of $X\beta$, namely $x_i'\beta$ where $x_i'$ is the $i$th row of $X$, is estimable for each $i$, and so are any linear combinations of these. Indeed, $\lambda'\beta$ is estimable if and only if $\lambda'$ is in the row space of $X$, or equivalently, $\lambda$ is in the column space of $X'$ (Monahan, 2008, Result 3.1).

From the above result, we can establish estimability of $\lambda'\beta$ either by showing that $\lambda$ is in the column space of $X'$, or by showing that it is orthogonal to the null space of $X$ (Monahan, 2008, Methods 3.2 and 3.3, respectively). One way to implement the second idea is to note that if $(X'X)^-$ is any generalized inverse of $X'X$, then $P = (X'X)(X'X)^-$ is a projection onto the column space of $X'$, and so $I - P$ is a projection onto the null space of $X$. The columns of $I - P$ thus comprise a basis for this null space. Accordingly, estimability of $\lambda'\beta$ can be determined by showing that $\lambda'(I - P) = 0'$.
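This projection method is easy to sketch in R (a minimal illustration, using MASS::ginv as the generalized inverse; the helper name nullproj is ours):

library(MASS)  # for ginv()

nullproj <- function(X) {             # returns I - P
    XtX <- crossprod(X)               # X'X
    diag(ncol(X)) - XtX %*% ginv(XtX)
}

X  <- model.matrix(mod1234)
IP <- nullproj(X)
lambda <- c(1, 3, 1, 7, 2)            # testset row 1, with intercept
max(abs(lambda %*% IP))               # ~0, so lambda'beta is estimable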

SAS uses $I - P$, as described above, to test estimability. Of course, in computation we need to set a tolerance for being close enough to $0$. SAS deems $\lambda'\beta$ non-estimable if $\lambda \ne 0$ and $\max|\lambda'(I - P)| > \psi \cdot \max|\lambda|$, where $\psi$ is a tolerance factor with a default value of $10^{-4}$. For details, see SAS Institute Inc. (2013), the chapter on "Shared Concepts," documentation for the ESTIMATE statement and its SINGULAR option.
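Continuing the sketch above, SAS's criterion might be coded like this (sas_nonest is a hypothetical helper of ours, not a SAS or estimability function):

sas_nonest <- function(lambda, IP, psi = 1e-4) {
    any(lambda != 0) && max(abs(lambda %*% IP)) > psi * max(abs(lambda))
}

sas_nonest(c(1, 3, 1, 7, 2), IP)    # FALSE: estimable (testset row 1)
sas_nonest(c(1, 6, 2, 14, 4), IP)   # TRUE:  non-estimable (testset row 2)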

Methods used by estimability

In the estimability package, we obtain the null basis in a different way than shown above, in part because the QR decomposition of $X$ is already (usually) available in an 'lm' object. Suppose that $X$ is $n \times p$; then we can write $X = QR$, where $Q$ ($n \times p$) has orthonormal columns, and $R$ ($p \times p$) is upper triangular with nonzero diagonal elements. If $X$ is of rank $r < p$, then columns are pivoted so that the linearly dependent ones come last, and we only need $r$ columns of $Q$ and the top $r$ rows of $R$, along with the pivoting information. Call these pivoted, down-sized matrices $\tilde Q$ and $\tilde R$ respectively, and let $\tilde X$ (still $n \times p$) denote $X$ with its columns permuted according to the pivoting. We now have that $\tilde X = \tilde Q \tilde R$, and $\tilde R$ comprises the first $r$ rows of an upper triangular matrix. Observe that each row of $\tilde X$ is thus a linear combination of the rows of $\tilde R$; that is, the row space of $\tilde X$ is the same as the row space of $\tilde R$. So the much smaller matrix $\tilde R$ has everything we need to know to determine estimability.
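For example, with the rank-deficient model matrix from the illustration, the pieces described above can be extracted with base R's pivoting qr (a sketch; the package works from the QR decomposition stored in the fitted model):

X  <- model.matrix(mod1234)
QR <- qr(X)              # QR decomposition with column pivoting
r  <- QR$rank            # 3
Rt <- qr.R(QR)[1:r, ]    # R-tilde: the top r rows, columns already pivoted
QR$pivot                 # the pivoting order of the columns of X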

Now, let $d = p - r$ denote the rank deficiency, and define

\[
S = \begin{bmatrix} \tilde{R}'_{\,p \times r} & \begin{matrix} 0_{r \times d} \\ I_{d \times d} \end{matrix} \end{bmatrix}
\]

Then $S$ is a lower triangular matrix with nonzero diagonal, hence it is of full rank $p = r + d$. Let $S = TU$ be the QR decomposition of $S$. We then have that $T$ has orthonormal columns, and in fact it is a Gram-Schmidt orthonormalization of $S$. Accordingly, the first $r$ columns of $T$ comprise an orthonormalization of the columns of $\tilde R'$, and thus they form a basis for the row space of $\tilde R$ and hence of $\tilde X$. Letting $\tilde N$ denote the last $d$ columns of $T$, we have that $\tilde N$ has orthonormal columns, all of which are orthogonal to the first $r$ columns of $T$ and hence to the row space of $\tilde R$ and $\tilde X$. It follows that $\tilde N$ is an orthonormal basis for the null space of $\tilde X$. After permuting the rows of $\tilde N$ according to the original pivoting in the QR decomposition of $X$, we obtain a basis $N$ for the null space of $X$.
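A simplified sketch of this construction, continuing the previous example (our own illustration of the algorithm; estimability::nonest.basis is the real implementation):

p <- ncol(X); d <- p - r
S <- cbind(t(Rt), rbind(matrix(0, r, d), diag(d)))  # p x p, full rank
Tmat <- qr.Q(qr(S))                                 # orthonormal columns of T
N <- matrix(0, p, d)
N[QR$pivot, ] <- Tmat[, (r + 1):p]   # un-pivot the rows of N-tilde
zapsmall(crossprod(X %*% N))         # ~0: columns of N are in the null space of X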

To test estimability of $\lambda'\beta$, first compute $z' = \lambda' N$, which is a $d$-vector. Theoretically, $\lambda'\beta$ is estimable if and only if $z'z = 0$; but for computational purposes, we need to set a tolerance for its being suitably small. Again, we opt to deviate from SAS's approach, which is based on $\max|z_i|$. Instead, we deem $\lambda'\beta$ non-estimable when $\lambda \ne 0$ and $z'z > \tau \cdot \lambda'\lambda$, where $\tau$ is a tolerance value. Our default value is $\tau = (10^{-4})^2 = 10^{-8}$. The rationale for this criterion is that it is rotation-invariant: we could replace $N$ by $NV$, where $V$ is any $d \times d$ orthogonal matrix, and $z'z$ will be unchanged. Such rotation invariance seems a desirable property because the assessment of estimability should not depend on which null basis is used.
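The test itself is a one-liner; here is a minimal sketch for a single $\lambda$, using the basis N constructed above (is_estimable is our own helper; estimability::is.estble is the package's vectorized version):

is_estimable <- function(lambda, N, tau = 1e-8) {
    z <- crossprod(N, lambda)        # z = N'lambda, a d-vector
    all(lambda == 0) || sum(z^2) <= tau * sum(lambda^2)
}

is_estimable(c(1, 3, 1, 7, 2), N)    # TRUE  (testset row 1)
is_estimable(c(1, 6, 2, 14, 4), N)   # FALSE (testset row 2)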

Bells and whistles

The estimability package's epredict function serves as a demonstration of adding estimability checking to the predict methods in the stats package. It works for 'lm', 'aov', 'glm', and 'mlm' objects. It also provides additional options for the type argument of predict. When newdata is provided, type = "estimability" returns a logical vector showing which rows of newdata are estimable; and type = "matrix" returns the model matrix for newdata.
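For instance, with the model and new data from the illustration:

epredict(mod1234, newdata = testset, type = "estimability")  # logical vector
epredict(mod1234, newdata = testset, type = "matrix")        # 6 x 5 model matrix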

An accompanying enhancement is eupdate, which runs update on a model object and also adds a nonest member. Subsequent epredict calls on this object will use that as the null basis instead of having to reconstruct it. For example,

mod123 <- eupdate(mod1234, . ~ . - x4)

mod123$nonest

           [,1]
[1,]  0.0000000
[2,] -0.8017837
[3,]  0.5345225
[4,]  0.2672612
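Up to sign, this is the vector $(0, 3, -2, -1)'/\sqrt{14}$, reflecting the single remaining dependence x3 = 3*x1 - 2*x2.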

Now let’s find out which predictions in testset are estimable with this model:

epredict(mod123, newdata = testset, type = "estimability")

    1     2     3     4     5     6
 TRUE  TRUE  TRUE  TRUE  TRUE FALSE

Thus, more of these test cases are estimable when we drop x4 from the model. Looking at testset, this makes sense because two of the previously non-estimable rows differ from estimable ones only in their values of x4.

Conclusion

In rank-deficient linear models used for predictions on new predictor sets, it is important for users to know which predictions are to be trusted, and which of them would change if the same model had been specified in a different way. The estimability package provides an easy way to add this capability to new and existing model-fitting packages.

Bibliography

R. V. Lenth. estimability: Estimability Tools for Linear Models, 2015. URL http://CRAN.R-project.org/package=estimability. R package version 1.1.

J. F. Monahan. A Primer on Linear Models. Chapman & Hall/CRC, 2008.

SAS Institute Inc. SAS/STAT Software, Version 13.2. SAS Institute Inc., Cary, NC, 2013. URL http://www.sas.com/.

S. R. Searle. Linear Models. Wiley Classics. Wiley-Interscience, 1997.

G. A. F. Seber and A. J. Lee. Linear Regression Analysis. John Wiley & Sons, Inc., second edition, 2003.

Russell V. Lenth
The University of Iowa
241 Schaeffer Hall
Iowa City, IA 52242 USA
russell-lenth@uiowa.edu