
[Help-gsl] Pvalue of weighted linear regression

From: Benjamin Otto
Subject: [Help-gsl] Pvalue of weighted linear regression
Date: Tue, 15 Jan 2008 17:36:48 +0100

```
Hi,

when I calculate a linear regression with gsl_fit_linear() there are two
different ways of obtaining the pvalue.

ESS <- sum((y-yestimate)^2) or direct use of "sumsq" (which is
returned by the function)
SSR <- sum((yestimate-mean(y))^2)
TSS <- sum((y-mean(y))^2)
R^2 <- SSR/TSS or 1-ESS/TSS

F-statistic:
R^2/(p-1)
---------
(1-R^2)/(n-p)
where n is the number of elements and p the number of fitted
parameters (intercept and slope), so normally p=2
and the term simplifies to:
f_stat = R^2/((1-R^2)/(n-2))

and the p-value is:
1 - gsl_cdf_fdist_P (f_stat,1,n-2)

Now my question: What changes here when I perform a weighted linear
regression with gsl_fit_wlinear?

The "sumsq" is already calculated by the function, but if I had to
calculate the ESS manually then I would have to multiply the
(y-yestimate) differences with the weights before applying the power of
2 and the sum. The same should apply to the TSS, shouldn't it? And then
the rest of the calculation remains the same as for a nonweighted
regression.

But obviously there still HAS to be another difference in the
calculation, because when I perform the same regression in R with
lm(y~x,weights=weights) I DO get a different R^2 value and thereby a
different F-statistic. And this is not a basic calculation error, because
my slope and intercept are identical to those from R.

Any suggestions as to what I am doing wrong?

best regards,

Benjamin Otto

--
Mandatory disclosures pursuant to the German Act on Electronic Commercial
Registers, Cooperative Registers and the Company Register (EHUG):

Universitätsklinikum Hamburg-Eppendorf
Public-law corporation
Place of jurisdiction: Hamburg

Members of the Board:
Prof. Dr. Jörg F. Debatin (Chairman)
Dr. Alexander Kirstein
Ricarda Klein
Prof. Dr. Dr. Uwe Koch-Gromus

```
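
The quantities described in the message can be sketched in C as follows. This is only an illustration under stated assumptions, not a reply from the thread. The helper `weighted_fit_stats` and the struct `fit_stats` are hypothetical names, not part of GSL. `c0` and `c1` are assumed to be the intercept and slope returned by `gsl_fit_wlinear()`. The sketch applies each weight once to the squared residual (rather than multiplying the residual by the weight before squaring) and takes TSS about the weighted mean of y. That is the convention R's `summary()` uses for a weighted `lm` fit with an intercept, so it may account for the differing R^2.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical container for the two statistics discussed in the message. */
typedef struct {
    double r2;      /* coefficient of determination, 1 - ESS/TSS */
    double f_stat;  /* F statistic with (1, n-2) degrees of freedom */
} fit_stats;

/* Hypothetical helper (not a GSL function): weighted R^2 and F statistic
 * for a straight-line fit y = c0 + c1*x with p = 2 parameters.
 * Assumption: c0 and c1 come from gsl_fit_wlinear() called with the
 * same x, y and w arrays. */
fit_stats weighted_fit_stats(const double *x, const double *y,
                             const double *w, size_t n,
                             double c0, double c1)
{
    assert(n > 2);  /* need at least one residual degree of freedom */

    /* Weighted mean of y: sum(w*y) / sum(w). */
    double sum_w = 0.0, sum_wy = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum_w  += w[i];
        sum_wy += w[i] * y[i];
    }
    const double ybar = sum_wy / sum_w;

    /* ESS = sum(w * (y - yestimate)^2), TSS = sum(w * (y - ybar)^2).
     * The weight multiplies the squared difference, so it enters once. */
    double ess = 0.0, tss = 0.0;
    for (size_t i = 0; i < n; i++) {
        const double resid = y[i] - (c0 + c1 * x[i]);
        const double dev   = y[i] - ybar;
        ess += w[i] * resid * resid;
        tss += w[i] * dev * dev;
    }

    fit_stats s;
    s.r2 = 1.0 - ess / tss;
    s.f_stat = s.r2 / ((1.0 - s.r2) / (double)(n - 2));
    return s;
}
```

With all weights equal to 1 this reduces to the unweighted formulas in the message. The ESS computed here should agree with the `chisq` value that `gsl_fit_wlinear()` returns, and the p-value could then be obtained as `gsl_cdf_fdist_Q(s.f_stat, 1.0, n - 2.0)`, the upper tail of the same distribution as `1 - gsl_cdf_fdist_P(...)`.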