Um blog sobre nada

A collection of useless things that may turn out to be useful

Residuals and R square

Posted by Diego on November 11, 2014


(Still working with the linear model l defined here)

 

Residuals are the vertical distances between each actual point and the estimate given by the regression line.
Below we can see each (X, Y) point, the “predicted point” (the value of y obtained by substituting x into the function y = 2.0654x – 1.3368) and the residual (actual Y – predicted Y):

X   Y   Predicted point   Residual
1   5   0.7286            4.2714
2   6   2.794             3.206
3   7   4.8594            2.1406
4   8   6.9248            1.0752
5   9   8.9902            0.0098
6   10  11.0556           -1.0556
7   11  13.121            -2.121
8   12  15.1864           -3.1864
9   13  17.2518           -4.2518
10  14  19.3172           -5.3172
11  15  21.3826           -6.3826
12  20  23.448            -3.448
13  25  25.5134           -0.5134
14  33  27.5788           5.4212
15  34  29.6442           4.3558
16  35  31.7096           3.2904
17  36  33.775            2.225
18  37  35.8404           1.1596
19  38  37.9058           0.0942
20  39  39.9712           -0.9712
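
The two computed columns can be reproduced directly from the rounded equation. This is only a sketch: the data vectors are typed in from the table above, and the names slope, intercept, predicted and residual are mine, not part of the original model object.

```r
# Data typed in from the table (assumed to match the original data set)
X <- 1:20
Y <- c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 33, 34, 35, 36, 37, 38, 39)

# Rounded coefficients quoted in the text: y = 2.0654x - 1.3368
slope <- 2.0654
intercept <- -1.3368

# Predicted point: substitute each x into the line
predicted <- slope * X + intercept

# Residual: actual point minus predicted point
residual <- Y - predicted

tab <- data.frame(X, Y, Predicted = predicted, Residual = residual)
print(tab)
```

Because the coefficients are rounded to four decimals, these numbers match the table but differ slightly from what the fitted model itself stores.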

 

 

The same residuals can be accessed using l$residuals.


 

The predicted points can be found using: l$fitted.values


FYI, l$model will return the data.
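
Since the model was defined in an earlier post, here is a sketch that rebuilds it so the three components can be inspected. I am assuming the data frame is called foo (as in the R code further down) and that l was created with lm(Y ~ X, data = foo); the original definition may differ.

```r
# Assumed reconstruction of the data and of the model l
foo <- data.frame(
  X = 1:20,
  Y = c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 33, 34, 35, 36, 37, 38, 39)
)
l <- lm(Y ~ X, data = foo)

# Intercept and slope of the fitted line
print(coef(l))

# Residuals (actual - predicted); residuals(l) returns the same values
print(head(l$residuals))

# Predicted points; fitted(l) returns the same values
print(head(l$fitted.values))

# The data used to fit the model (a data frame with Y and X)
print(head(l$model))
```

Adding the fitted values and the residuals gives back the original Y, which is a quick sanity check on both vectors.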

 

R2 is the percentage of the variation in Y that is explained by the regression model.

It is calculated as 1 minus the ratio of the total squared error with the line to the total squared error from the meanY.

 


 

 
 

 

X   Y   Predicted point  Residual (error)  Squared error (residual^2)  Squared error from mean
1   5   0.7286           4.2714            18.24485796                 235.6225
2   6   2.794            3.206             10.278436                   205.9225
3   7   4.8594           2.1406            4.58216836                  178.2225
4   8   6.9248           1.0752            1.15605504                  152.5225
5   9   8.9902           0.0098            9.604E-05                   128.8225
6   10  11.0556          -1.0556           1.11429136                  107.1225
7   11  13.121           -2.121            4.498641                    87.4225
8   12  15.1864          -3.1864           10.15314496                 69.7225
9   13  17.2518          -4.2518           18.07780324                 54.0225
10  14  19.3172          -5.3172           28.27261584                 40.3225
11  15  21.3826          -6.3826           40.73758276                 28.6225
12  20  23.448           -3.448            11.888704                   0.1225
13  25  25.5134          -0.5134           0.26357956                  21.6225
14  33  27.5788          5.4212            29.38940944                 160.0225
15  34  29.6442          4.3558            18.97299364                 186.3225
16  35  31.7096          3.2904            10.82673216                 214.6225
17  36  33.775           2.225             4.950625                    244.9225
18  37  35.8404          1.1596            1.34467216                  277.2225
19  38  37.9058          0.0942            0.00887364                  311.5225
20  39  39.9712          -0.9712           0.94322944                  347.8225
Sum:                                       215.7045116                 3052.55

MeanY: 20.35

Total squared error with the line: 215.70
Squared error from the meanY: 3052.55

Percentage of the total variation that is not explained by the line (215.7/3052.55): 0.070663711
7% of the total variation is not explained by the variation in X

R2 (1 – 0.0707): 0.929336289
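
The table above came from a spreadsheet, but the same columns can be sketched in R. As before, I am assuming foo holds the data and l is lm(Y ~ X, data = foo); the column names are my own. R uses the unrounded coefficients, so individual rows differ from the table in the later decimals.

```r
# Assumed reconstruction of the data and of the model l
foo <- data.frame(
  X = 1:20,
  Y = c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 33, 34, 35, 36, 37, 38, 39)
)
l <- lm(Y ~ X, data = foo)

# One column per column of the table
tab <- data.frame(
  X = foo$X,
  Y = foo$Y,
  Predicted = fitted(l),
  Residual = resid(l),
  SqError = resid(l)^2,
  SqErrorFromMean = (foo$Y - mean(foo$Y))^2
)
print(tab, row.names = FALSE)

# Share of the variation NOT explained by the line, and R2
unexplained <- sum(tab$SqError) / sum(tab$SqErrorFromMean)
r2 <- 1 - unexplained
print(c(unexplained = unexplained, r2 = r2))
```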

 

Residual variation: sum of (Yi – Yi_hat)^2
Total variation: sum of (Yi – mean(Y))^2

The term R^2 represents the percentage of the total variation described by the model: 1 – (residual variation / total variation).
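
Written out as a formula, with the numbers from the table plugged in. The labels SS_res and SS_tot are my own shorthand for the residual and total variation.

```latex
\[
SS_{res} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2, \qquad
SS_{tot} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2
\]
\[
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}
    = 1 - \frac{215.70}{3052.55}
    \approx 0.9293
\]
```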

 

For a simple linear regression with one predictor, like this one, R2 is equal to the squared correlation between X and Y:
R2 = r^2

 

The mean of the residuals is 0 and the residuals are not correlated with the predictor X:
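
Both properties can be checked numerically. A sketch, again assuming that foo holds the data and that l was fitted with lm(Y ~ X, data = foo). Because of floating point arithmetic the results are tiny numbers rather than exact zeros.

```r
# Assumed reconstruction of the data and of the model l
foo <- data.frame(
  X = 1:20,
  Y = c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 33, 34, 35, 36, 37, 38, 39)
)
l <- lm(Y ~ X, data = foo)

# Mean of the residuals: zero up to rounding error
res_mean <- mean(resid(l))
print(res_mean)

# Correlation between the residuals and the predictor: also zero up to rounding error
res_cor <- cor(resid(l), foo$X)
print(res_cor)
```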


 

 

Using R

1) Manually:

Calculate the mean:

> mu <- mean(foo$Y)
> mu
[1] 20.35

 

Calculate the squared error from the mean:

> sTot <- sum((foo$Y-mu)^2)
> sTot
[1] 3052.55

 

R provides the function deviance, which for a linear model returns the sum of the squared residuals, i.e. the squared differences between the actual values and the values estimated by the fitted line (defined by its slope and intercept). (You’d get the same result if you squared all the values in the “Residual” column and summed them.)

 

> sRes <- deviance(l)
> sRes
[1] 215.7045
 
> 1-sRes/sTot
[1] 0.9293363
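
To confirm that deviance really is just the sum of the squared residuals, here is a short sketch (same assumptions as before: foo holds the data and l is lm(Y ~ X, data = foo); sRes_manual is a name I made up).

```r
# Assumed reconstruction of the data and of the model l
foo <- data.frame(
  X = 1:20,
  Y = c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 33, 34, 35, 36, 37, 38, 39)
)
l <- lm(Y ~ X, data = foo)

# Square every residual and sum them by hand
sRes_manual <- sum(resid(l)^2)

# Compare with the value returned by deviance()
print(sRes_manual)
print(deviance(l))
print(isTRUE(all.equal(sRes_manual, deviance(l))))
```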
 
2) Predefined calculations:
> summary(l)$r.squared
[1] 0.9293363
 
OR
 
> cor(foo$Y, foo$X)^2
[1] 0.9293363
 
