Easy Tutorial

R - Linear Regression


In statistics, linear regression is a form of regression analysis that models the relationship between a dependent variable and one or more independent variables with a linear equation, known as the linear regression equation, whose coefficients are usually estimated by least squares.

Simply put, it is a statistical method for quantifying how one variable depends on one or more other variables.

In regression analysis, if there is only one independent variable and one dependent variable, and their relationship can be approximated by a straight line, this type of regression analysis is called simple linear regression analysis. If the regression analysis includes two or more independent variables and the relationship between the dependent variable and the independent variables is linear, it is called multiple linear regression analysis.

The mathematical equation for simple linear regression analysis:

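y = ax + b

Here y is the dependent variable, x is the independent variable, a is the slope of the line, and b is the intercept.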
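As a rough illustration of where a and b come from, the sketch below computes the least-squares slope and intercept by hand with the standard closed-form formulas, using the height and weight sample from this tutorial. It is only a sketch for the single-predictor case; in practice lm() does this work.

```r
# Height (cm) and weight (kg) sample used throughout this tutorial
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Least-squares slope: covariance of x and y divided by the variance of x
a <- cov(x, y) / var(x)

# Least-squares intercept: the fitted line passes through (mean(x), mean(y))
b <- mean(y) - a * mean(x)

cat("slope a =", a, "\n")
cat("intercept b =", b, "\n")
```

These two values should agree with the coefficients that lm(y ~ x) reports for the same data.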

Next, we build a model that predicts a person's weight from their height:

  1. Collect sample data: height and weight.
  2. Use the lm() function to create a relationship model.
  3. Find the coefficients from the created model and create a mathematical equation.
  4. Get a summary of the relationship model to inspect the prediction error, i.e., the residuals (the differences between actual and estimated values).
  5. Use the predict() function to predict a person's weight.

Prepare Data

The following are height and weight data for individuals:

# Height, in cm
151, 174, 138, 186, 128, 136, 179, 163, 152, 131

# Weight, in kg
63, 81, 56, 91, 47, 57, 76, 72, 62, 48

lm() Function

In R, you can perform linear regression using the lm() function.

The lm() function is used to create a relationship model between independent and dependent variables.

The syntax for the lm() function is as follows:

lm(formula, data)

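Parameter descriptions:

  - formula: a formula describing the model, such as y ~ x (dependent variable on the left, independent variables on the right).
  - data: an optional data frame containing the variables named in the formula.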
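The examples in this tutorial pass plain vectors to lm(), so the data argument never appears. A minimal sketch of the data-frame form follows; the names people, height_cm and weight_kg are made up for illustration and are not part of the original example.

```r
# Hypothetical data frame holding the same sample as the tutorial
people <- data.frame(
  height_cm = c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131),
  weight_kg = c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
)

# The formula refers to column names; lm() looks them up in `data`
model <- lm(weight_kg ~ height_cm, data = people)

# Named vector with the intercept and the slope for height_cm
print(coef(model))
```

Keeping the variables in a data frame also makes prediction easier later, because predict() matches new data to the model by column name.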

Create a relationship model and get the coefficients:

Example

# Sample data
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Submit to lm() function
relation <- lm(y ~ x)

print(relation)

Executing the above code outputs:

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x  
    -38.4551       0.6746
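Step 3 of the plan asks for a mathematical equation built from these coefficients. One possible way to do that is sketched below with coef(), which returns the coefficients as a named vector. The variable names a, b and equation are illustrative choices.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# coef() returns a named numeric vector: "(Intercept)" and "x"
coeffs <- coef(relation)
b <- coeffs[["(Intercept)"]]  # intercept
a <- coeffs[["x"]]            # slope

# Write the fitted line in the form y = ax + b
equation <- sprintf("y = %.4f * x + (%.4f)", a, b)
print(equation)
```

With the coefficients shown above, the fitted line is approximately y = 0.6746x - 38.4551: each additional centimetre of height adds about 0.67 kg to the estimated weight.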

Use the summary() function to get a summary of the relationship model:

Example

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Submit to lm() function
relation <- lm(y ~ x)

print(summary(relation))

Executing the above code outputs:

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.3002 -1.6629  0.0412  1.8944  3.9775 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -38.45509    8.04901  -4.778  0.00139 ** 
x             0.67461    0.05191  12.997 1.16e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.253 on 8 degrees of freedom
Multiple R-squared:  0.9548,    Adjusted R-squared:  0.9491 
F-statistic: 168.9 on 1 and 8 DF,  p-value: 1.164e-06
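The figures in this summary can also be read out programmatically. The sketch below pulls the residuals and R-squared from the fitted model and recomputes the residual standard error by hand, to show how it relates to the residuals. The variable names are illustrative.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

fit_summary <- summary(relation)

# Residuals: actual weight minus fitted weight, one per observation
res <- residuals(relation)

# Residual standard error: sqrt(sum of squared residuals / residual df)
rse <- sqrt(sum(res^2) / df.residual(relation))

# Share of the variance in y explained by the model
r_squared <- fit_summary$r.squared

print(round(res, 4))
cat("Residual standard error:", rse, "\n")
cat("R-squared:", r_squared, "\n")
```

The hand-computed value should match fit_summary$sigma, which is the "Residual standard error" line printed by summary().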

predict() Function

The predict() function is used to predict values based on the model we have established.

The syntax for the predict() function is as follows:

predict(object, newdata)

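Parameter descriptions:

  - object: a model that has already been fitted, for example with lm().
  - newdata: a data frame containing new values for the independent variable; its column names must match the variable names used in the model formula.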

The following example predicts a new weight value:

Example

# Sample data
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Submit to lm() function
relation <- lm(y ~ x)

# Predict the weight for a height of 170 cm
a <- data.frame(x = 170)
result <- predict(relation, a)
print(result)


Executing the above code outputs:

       1 
76.22869 
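predict() can also handle several new values at once and, for lm models, report an interval around each prediction. The following is a minimal sketch; the heights 160, 170 and 180 cm and the name new_people are arbitrary choices for illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# The column must be named x, matching the variable in the formula y ~ x
new_people <- data.frame(x = c(160, 170, 180))

# interval = "prediction" adds lower and upper bounds for a single new person
pred <- predict(relation, newdata = new_people,
                interval = "prediction", level = 0.95)

# Matrix with columns fit (point estimate), lwr and upr (interval bounds)
print(pred)
```

With only ten observations, the intervals are likely to be fairly wide, which is a useful reminder that the point estimate alone overstates how precisely weight can be predicted from height.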


We can also generate a chart:

Example

# Sample data
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Generate png image
png(file = "linearregression.png")

# Generate chart: weight on the x-axis, height on the y-axis
plot(y, x, col = "blue", main = "Height & Weight Regression",
     cex = 1.3, pch = 16, xlab = "Weight in Kg", ylab = "Height in cm")

# Add the fitted line after the plot exists; lm(x ~ y) matches the plotted axes
abline(lm(x ~ y))

# Close the graphics device so the png file is written to disk
dev.off()

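The resulting chart is a scatter plot of the sample points, with weight on the x-axis and height on the y-axis, and the fitted regression line drawn through them.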
