Is there a way to calculate the standard error estimates and t-values of my beta coefficients if my design matrix does not have full rank, so that X'X is not invertible?

I am writing a script in Ruby to do least squares regression and analysis of variance. I can compute the beta coefficients for my least squares approximation with...

b = ((X'*X)^(-1))*X'*y

However, this assumes my matrix has full rank.
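The snippet above is pseudocode; as a sanity check, here is a minimal NumPy sketch of the same normal-equations estimate (toy data, assumed), cross-checked against a general least-squares solver:

```python
import numpy as np

# Toy full-rank design: intercept plus one predictor.
X = np.column_stack([np.ones(10), np.arange(10.0)])
y = 2.0 + 3.0 * np.arange(10.0)

# Normal-equations estimate: b = (X'X)^(-1) X' y.
# This only works when X'X is invertible, i.e. X has full column rank.
b = np.linalg.inv(X.T @ X) @ X.T @ y

# Cross-check against a least-squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```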

I have data sets that occasionally contain a set of linearly dependent columns, so X'X will not invert. I want to keep the linearly dependent columns in my design matrix, and I discovered at PlanetMath that I can compute the betas using the "pseudo-inverse" from singular value decomposition (SVD)...

b = V*((S'*S)^(-1))*S'*U'*y
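For what it's worth, here is a hedged NumPy sketch of that pseudo-inverse route (toy data, assumed). When X is rank-deficient, S contains zero singular values, so in practice those are truncated rather than inverted; that is exactly what `numpy.linalg.pinv` does, and it yields the minimum-norm least-squares solution:

```python
import numpy as np

# Rank-deficient design: the third column is the sum of the first two.
x1 = np.arange(8.0)
x2 = np.array([1.0, 0, 2, 1, 3, 0, 1, 2])
X = np.column_stack([x1, x2, x1 + x2])
y = 2.0 * x1 + 0.5 * x2

# SVD-based pseudo-inverse: invert only the nonzero singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
tol = max(X.shape) * np.finfo(float).eps * s.max()
s_inv = np.array([1.0 / si if si > tol else 0.0 for si in s])
b = Vt.T @ (s_inv * (U.T @ y))   # minimum-norm least-squares solution

# Agrees with numpy's built-in pseudo-inverse.
b_pinv = np.linalg.pinv(X) @ y
```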

T-values are calculated using the diagonal of the matrix C = (X'*X)^(-1); scaled by s^2, C is the covariance matrix of the beta estimates:

t[i] = b[i] / (s * Math.sqrt(C[i][i]))


b[i] : beta for the ith feature
s : residual standard error of the regression
X : design matrix (the matrix of independent variables)
C = ((X'*X)^(-1))
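Putting the pieces together for the full-rank case, a small NumPy sketch (toy data, assumed) of exactly this t-value computation:

```python
import numpy as np

# Full-rank example: intercept plus one predictor, mild deterministic "noise".
n = 12
x = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x + 0.1 * np.sin(x)

b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
dof = n - X.shape[1]                  # n minus number of estimated coefficients
s = np.sqrt(resid @ resid / dof)      # residual standard error
C = np.linalg.inv(X.T @ X)            # (X'X)^(-1)
t = b / (s * np.sqrt(np.diag(C)))     # t[i] = b[i] / (s * sqrt(C[i][i]))
```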

Is there another way to calculate C using the S, U, or V from Singular Value Decomposition?

P.S.: I also found this paper to be quite relevant.

asked 04 Aug '12, 19:40 by dpott197

edited 07 Aug '12, 14:21 by fbahr ♦

There is no free lunch here. If X is less than full rank, then there is a unique best (least squared error) set of estimates of y that are linear in x, but there are uncountably many coefficient vectors that produce those estimates. For any variable involved in a linear relation (which may be all of x or a subset), there is guaranteed to be at least one coefficient vector that assigns it a zero coefficient, and at least one that doesn't (unless, by cosmic coincidence, every variable in that subset of x gets a zero coefficient). This would make t-values uninterpretable even if you could compute them.
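To make the non-uniqueness concrete, a small NumPy illustration (toy data, assumed): with a column equal to the sum of two others, two different coefficient vectors, one putting zero weight on the dependent column and one not, produce identical fitted values.

```python
import numpy as np

# Rank-deficient design: the third column is the sum of the first two.
x1 = np.arange(6.0)
x2 = np.array([2.0, 0, 1, 3, 1, 0])
X = np.column_stack([x1, x2, x1 + x2])   # rank 2, not 3

b_with = np.array([2.0, 3.0, 1.0])   # nonzero weight on the dependent column
b_zero = np.array([3.0, 4.0, 0.0])   # same fit, zero weight on it

# Identical predictions, so the data cannot distinguish the two.
same_fit = np.allclose(X @ b_with, X @ b_zero)
```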


answered 05 Aug '12, 15:52 by Paul Rubin ♦♦


Which makes me wonder... both PCR and PLSR [1] [2] have been developed to overcome problems that arise when \(X\) is rank-deficient. So: are \(t\)-values calculated from \(\beta\)s obtained through PCR or PLSR interpreted differently from \(t\)-values obtained from "standard" (multiple) LLSR results? [see also: Numerical Linear Algebra in Data Mining, section 3.4]

(07 Aug '12, 14:56) fbahr ♦

Oh well... on second thought, "overcome ... rank deficiencies" probably refers to column rank deficiencies (i.e., cases of collinearity) only.

(07 Aug '12, 15:41) fbahr ♦

PCR and PLSR linearly transform the original predictor matrix X into a full-rank matrix Z of smaller dimension (fewer variables), which eliminates the multicollinearity problem. You get the beta coefficients and their t-statistics for the regression of the response variable on the z variables in the usual way; but the z variables are linear combinations of the x variables, and there is no way to convert t-statistics for the coefficients of the z variables into t-statistics for the coefficients of the x variables.

(07 Aug '12, 17:46) Paul Rubin ♦♦
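A hedged NumPy sketch of the PCR construction described above (toy data, assumed): center X, keep the principal components with nonzero variance, and regress on the resulting full-rank score matrix Z.

```python
import numpy as np

# Rank-deficient predictors: the third column is the sum of the first two.
x1 = np.arange(8.0)
x2 = np.array([1.0, 0, 2, 1, 3, 0, 1, 2])
X = np.column_stack([x1, x2, x1 + x2])
y = 2.0 * x1 + 0.5 * x2 + 0.05 * np.sin(x1)

# Center, then extract principal components via the SVD of the centered matrix.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
keep = s > 1e-10 * s.max()          # drop zero-variance directions
Z = Xc @ Vt[keep].T                 # full-rank matrix of "z variables"

# The usual normal equations now go through without trouble.
b_z = np.linalg.inv(Z.T @ Z) @ Z.T @ yc
fit_pcr = Z @ b_z + y.mean()
```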

Unfortunately, the matrix C does not exist.

You have to consider reducing the size of the matrix X, for example through principal component analysis (PCA)...


\[C := V \times [S'S]^{-1} \times V'\]


\[ \begin{eqnarray} X &:= &U S V'\\ b &:= &[X'X]^{-1} X' y, \quad\textrm{so} &&\\ b &= &[V S' U' U S V']^{-1} V S' U' y\\ U' U &= &I\\ b &= &[V S' S V']^{-1} V S' U' y = V [S' S]^{-1} V' V S' U' y\\ V' V &= &I\\ b &= &V [S' S]^{-1} S' U' y\\ \textrm{cov}(b) &= &[X'X]^{-1} s^2 = V [S' S]^{-1} V' s^2 ??? \end{eqnarray} \]

But isn't it the case that in this problem \(\textrm{det}(S'S)=0\), so \([S'S]^{-1}\) again does not exist, and thus \(C\) does not exist either?
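A quick numeric check of both halves of this (toy matrices, assumed): for a full-rank X the identity \((X'X)^{-1} = V [S'S]^{-1} V'\) checks out, while a rank-deficient X produces a zero singular value, so \(S'S\) is indeed singular.

```python
import numpy as np

# Full-rank case: the SVD identity for (X'X)^(-1) holds.
X = np.column_stack([np.ones(6), np.arange(6.0)])
U, s, Vt = np.linalg.svd(X, full_matrices=False)
lhs = np.linalg.inv(X.T @ X)
rhs = Vt.T @ np.diag(1.0 / s**2) @ Vt        # V (S'S)^(-1) V'

# Rank-deficient case: a zero singular value makes S'S singular.
x1 = np.arange(6.0)
Xd = np.column_stack([x1, 2.0 * x1])         # second column is a multiple of the first
sd = np.linalg.svd(Xd, compute_uv=False)
```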

PCA: \[ \begin{eqnarray} Z &= &P \times X \end{eqnarray} \]

We can remove some variables in \(Z\) and estimate \(b_z\) instead of \(b_x\). But if \(Z\) has fewer variables, we cannot simply write something like:

\[ \begin{eqnarray} b_x &= &P' \times b_z\\ t_x &= &P' \times t_z \end{eqnarray} \]

Well done, Florian!


answered 07 Aug '12, 18:06 by Slavko

edited 09 Aug '12, 04:15


Many thanks, Florian! Last night I spent a lot of time looking for an example of how to use LaTeX on this site.

(08 Aug '12, 04:53) Slavko

Regarding the PCA derivation above, if X has less than full rank, one or more of the Z variables will have zero variance (i.e., be constant). If we start out with X and Y centered, then the constant columns of Z will be identically zero, which means you will be unable to estimate regression coefficients for them.

(08 Aug '12, 17:03) Paul Rubin ♦♦

The matrix \(Z\) is smaller because we omit the variables with zero variance. Moreover, the remaining variables in \(Z\) are uncorrelated with each other, which follows from the PCA construction.

(08 Aug '12, 18:49) Slavko
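A quick NumPy check of that zero-variance point (toy data, assumed): after centering a rank-deficient X, one principal direction carries numerically zero variance, so its scores are identically zero and no coefficient can be estimated for it.

```python
import numpy as np

# Centered rank-deficient X: the third column is the sum of the first two.
x1 = np.arange(8.0)
x2 = np.array([1.0, 0, 2, 1, 3, 0, 1, 2])
X = np.column_stack([x1, x2, x1 + x2])
Xc = X - X.mean(axis=0)

# The smallest singular value is (numerically) zero...
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T   # principal-component scores

# ...so the scores of the last component are essentially all zeros.
```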

Thank you, Paul; in fact PCR is a one-way ticket. If \(Z\) has fewer variables, it is not possible to estimate regression coefficients for X. Well, yesterday at 2 am I forgot about the variable containing all 1's..

In general, last night these arrays somehow looked different :-) I corrected my answer. I am glad that there is a place like this, where I can get to know the views of wonderful people who are willing to help others. Creating this site was a very valuable initiative.

(09 Aug '12, 03:49) Slavko

Seen: 2,665 times

OR-Exchange! Your site for questions, answers, and announcements about operations research.