Is there a way to calculate the standard error estimates and t-values of my beta coefficients if my design matrix does not have full rank, so that X'X is not invertible?

I am writing a script in Ruby to do least squares regression and analysis of variance. I can compute the beta coefficients for my least squares approximation with...

b = ((X'*X)^(-1))*X'*y

However, this assumes my matrix has full rank.
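The snippet above is pseudocode; as a sanity check, here is a minimal NumPy sketch of the same normal-equations estimate (toy data, assumed), cross-checked against a general least-squares solver:

```python
import numpy as np

# Toy full-rank design: intercept plus one predictor.
X = np.column_stack([np.ones(10), np.arange(10.0)])
y = 2.0 + 3.0 * np.arange(10.0)

# Normal-equations estimate: b = (X'X)^(-1) X' y.
# This only works when X'X is invertible, i.e. X has full column rank.
b = np.linalg.inv(X.T @ X) @ X.T @ y

# Cross-check against a least-squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```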

I have data sets that occasionally contain a set of linearly dependent columns, so X'X will not invert. I want to keep the linearly dependent columns in my design matrix, and I discovered at PlanetMath that I can compute the betas using the "pseudo-inverse" from singular value decomposition (SVD)...

b = V*((S'*S)^(-1))*S'*U'*y
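For what it's worth, here is a hedged NumPy sketch of that pseudo-inverse route (toy data, assumed). When X is rank-deficient, S contains zero singular values, so in practice those are truncated rather than inverted; that is exactly what `numpy.linalg.pinv` does, and it yields the minimum-norm least-squares solution:

```python
import numpy as np

# Rank-deficient design: the third column is the sum of the first two.
x1 = np.arange(8.0)
x2 = np.array([1.0, 0, 2, 1, 3, 0, 1, 2])
X = np.column_stack([x1, x2, x1 + x2])
y = 2.0 * x1 + 0.5 * x2

# SVD-based pseudo-inverse: invert only the nonzero singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
tol = max(X.shape) * np.finfo(float).eps * s.max()
s_inv = np.array([1.0 / si if si > tol else 0.0 for si in s])
b = Vt.T @ (s_inv * (U.T @ y))   # minimum-norm least-squares solution

# Agrees with numpy's built-in pseudo-inverse.
b_pinv = np.linalg.pinv(X) @ y
```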

T-values are calculated using the diagonal of the matrix C = (X'*X)^(-1); scaled by s^2, C is the covariance matrix of the beta estimates:

t[i] = b[i] / (s * Math.sqrt(C[i][i]))


b[i] : beta for the ith feature
s : residual standard error of the regression
X : design matrix (the matrix of independent variables)
C = ((X'*X)^(-1))
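Putting the pieces together for the full-rank case, a small NumPy sketch (toy data, assumed) of exactly this t-value computation:

```python
import numpy as np

# Full-rank example: intercept plus one predictor, mild deterministic "noise".
n = 12
x = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x + 0.1 * np.sin(x)

b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
dof = n - X.shape[1]                  # n minus number of estimated coefficients
s = np.sqrt(resid @ resid / dof)      # residual standard error
C = np.linalg.inv(X.T @ X)            # (X'X)^(-1)
t = b / (s * np.sqrt(np.diag(C)))     # t[i] = b[i] / (s * sqrt(C[i][i]))
```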

Is there another way to calculate C using the S, U, or V from Singular Value Decomposition?

P.S.: I also found this paper to be quite relevant.

asked 04 Aug '12, 19:40 by dpott197

edited 07 Aug '12, 14:21 by fbahr ♦

There is no free lunch here. If X is less than full rank, then there is a unique best (least squared error) set of estimates of y that are linear in x, but there are uncountably many coefficient vectors that produce those estimates. For any variable involved in a linear relation (which may be all of x or a subset), there is guaranteed to be at least one coefficient vector that assigns it a zero coefficient, and at least one that doesn't (unless, by cosmic coincidence, every variable in that subset of x gets a zero coefficient). This would make t-values uninterpretable even if you could compute them.
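To make the non-uniqueness concrete, a small NumPy illustration (toy data, assumed): with a column equal to the sum of two others, two different coefficient vectors, one putting zero weight on the dependent column and one not, produce identical fitted values.

```python
import numpy as np

# Rank-deficient design: the third column is the sum of the first two.
x1 = np.arange(6.0)
x2 = np.array([2.0, 0, 1, 3, 1, 0])
X = np.column_stack([x1, x2, x1 + x2])   # rank 2, not 3

b_with = np.array([2.0, 3.0, 1.0])   # nonzero weight on the dependent column
b_zero = np.array([3.0, 4.0, 0.0])   # same fit, zero weight on it

# Identical predictions, so the data cannot distinguish the two.
same_fit = np.allclose(X @ b_with, X @ b_zero)
```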


answered 05 Aug '12, 15:52 by Paul Rubin ♦♦


Which makes me wonder... both PCR and PLSR [1] [2] have been developed to overcome problems that arise when \(X\) is rank-deficient. So: are \(t\)-values calculated from \(\beta\)s obtained through PCR or PLSR interpreted differently from \(t\)-values obtained from "standard" (multiple) LLSR results? [see also: Numerical Linear Algebra in Data Mining, section 3.4]

(07 Aug '12, 14:56) fbahr ♦

Oh well... on second thought, "overcome ... rank deficiencies" probably refers to column rank deficiencies (i.e., cases of collinearity) only.

(07 Aug '12, 15:41) fbahr ♦

PCR and PLSR linearly transform the original predictor matrix X into a full-rank matrix Z of smaller dimension (fewer variables), which eliminates the multicollinearity problem. You get the beta coefficients and their t-statistics for the regression of the response variable on the z variables in the usual way; but the z variables are linear combinations of the x variables, and there is no way to convert t-statistics for the coefficients of the z variables into t-statistics for the coefficients of the x variables.

(07 Aug '12, 17:46) Paul Rubin ♦♦
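A hedged NumPy sketch of the PCR construction described above (toy data, assumed): center X, keep the principal components with nonzero variance, and regress on the resulting full-rank score matrix Z.

```python
import numpy as np

# Rank-deficient predictors: the third column is the sum of the first two.
x1 = np.arange(8.0)
x2 = np.array([1.0, 0, 2, 1, 3, 0, 1, 2])
X = np.column_stack([x1, x2, x1 + x2])
y = 2.0 * x1 + 0.5 * x2 + 0.05 * np.sin(x1)

# Center, then extract principal components via the SVD of the centered matrix.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
keep = s > 1e-10 * s.max()          # drop zero-variance directions
Z = Xc @ Vt[keep].T                 # full-rank matrix of "z variables"

# The usual normal equations now go through without trouble.
b_z = np.linalg.inv(Z.T @ Z) @ Z.T @ yc
fit_pcr = Z @ b_z + y.mean()
```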

Unfortunately, the matrix C does not exist.

You have to consider reducing the size of the matrix X, for example through principal component analysis (PCA)...


\[C := V \times [S'S]^{-1} \times V'\]


\[ \begin{eqnarray} X &:= &U S V'\\ b &:= &[X'X]^{-1} X' y, \quad\textrm{so} &&\\ b &= &[V S' U' U S V']^{-1} V S' U' y\\ U' U &= &I\\ b &= &[V S' S V']^{-1} V S' U' y = V [S' S]^{-1} V' V S' U' y\\ V' V &= &I\\ b &= &V [S' S]^{-1} S' U' y\\ \textrm{cov}(b) &= &[X'X]^{-1} s^2 = V [S' S]^{-1} V' s^2 ??? \end{eqnarray} \]

But isn't it the case that in this problem \(\textrm{det}(S'S)=0\), so \([S'S]^{-1}\) again does not exist, and thus \(C\) does not exist either?
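A quick numeric check of both halves of this (toy matrices, assumed): for a full-rank X the identity \((X'X)^{-1} = V [S'S]^{-1} V'\) checks out, while a rank-deficient X produces a zero singular value, so \(S'S\) is indeed singular.

```python
import numpy as np

# Full-rank case: the SVD identity for (X'X)^(-1) holds.
X = np.column_stack([np.ones(6), np.arange(6.0)])
U, s, Vt = np.linalg.svd(X, full_matrices=False)
lhs = np.linalg.inv(X.T @ X)
rhs = Vt.T @ np.diag(1.0 / s**2) @ Vt        # V (S'S)^(-1) V'

# Rank-deficient case: a zero singular value makes S'S singular.
x1 = np.arange(6.0)
Xd = np.column_stack([x1, 2.0 * x1])         # second column is a multiple of the first
sd = np.linalg.svd(Xd, compute_uv=False)
```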

PCA: \[ \begin{eqnarray} Z &= &P \times X \end{eqnarray} \]

We can remove some variables in \(Z\) and estimate \(b_z\) instead of \(b_x\). But if \(Z\) has fewer variables, we cannot simply write something like:

\[ \begin{eqnarray} b_x &= &P' \times b_z\\ t_x &= &P' \times t_z \end{eqnarray} \]

Well done, Florian!


answered 07 Aug '12, 18:06 by Slavko

edited 09 Aug '12, 04:15


Many thanks, Florian! Last night I spent a lot of time looking for an example of how to use LaTeX on this site.

(08 Aug '12, 04:53) Slavko

Regarding the PCA derivation above, if X has less than full rank, one or more of the Z variables will have zero variance (i.e., be constant). If we start out with X and Y centered, then the constant columns of Z will be identically zero, which means you will be unable to estimate regression coefficients for them.

(08 Aug '12, 17:03) Paul Rubin ♦♦

The matrix \(Z\) is smaller because we omit the variables with zero variance. Moreover, the remaining variables in \(Z\) are uncorrelated with each other, which follows from the PCA construction.

(08 Aug '12, 18:49) Slavko
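A quick NumPy check of that zero-variance point (toy data, assumed): after centering a rank-deficient X, one principal direction carries numerically zero variance, so its scores are identically zero and no coefficient can be estimated for it.

```python
import numpy as np

# Centered rank-deficient X: the third column is the sum of the first two.
x1 = np.arange(8.0)
x2 = np.array([1.0, 0, 2, 1, 3, 0, 1, 2])
X = np.column_stack([x1, x2, x1 + x2])
Xc = X - X.mean(axis=0)

# The smallest singular value is (numerically) zero...
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T   # principal-component scores

# ...so the scores of the last component are essentially all zeros.
```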

Thank you, Paul; in fact PCR is a one-way ticket. If \(Z\) has fewer variables, it is not possible to estimate regression coefficients for X. Well, yesterday at 2 am I forgot about the variable containing all 1's..

In general, last night these arrays somehow looked different :-) I corrected my answer. I am glad that there is a place like this, where I can get to know the views of wonderful people who are willing to help others. Creating this site was a very valuable initiative.

(09 Aug '12, 03:49) Slavko

Seen: 2,665 times

OR-Exchange! Your site for questions, answers, and announcements about operations research.