Module 7: Progression in Regression Analysis

Hi everyone! 

This week we are focusing on regression analysis, from linear regression with one predictor variable to multivariable regression! Simply put, regression is an estimate, or best fit, of the relationship between two variables, which gives us a tool to predict the outcome of one variable based on its linear relationship to another variable.

1. In this segment of the assignment, we will use the following regression equation: Y = a + bX + e
Where:
Y is the value of the dependent variable, what is being predicted or explained

a or Alpha, a constant; equals the value of Y when the value of X = 0

b or Beta, the coefficient of X; the slope of the regression line; how much Y changes for each one-unit change in X

X is the value of the independent variable, what is predicting or explaining the value of Y

e is the error term; the error in predicting the value of Y, given the value of X (it is not displayed in most regression equations)

A reminder about the lm() function:

lm([target variable] ~ [predictor variables], data = [data source])

The data in this assignment:

x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

1.1 Define the relationship model between the predictor and the response variable:

The model is y = a + bx. The fitted slope is positive, so there is a positive linear relationship between x and y: y increases by about 3.269 for every one-unit increase in x.


1.2 Calculate the coefficients.





b = 3.269, a = 19.206
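
As a quick check of these values, here is a minimal sketch that fits the model with lm() on the x and y vectors given above and compares the result with the closed-form least-squares formulas. The names fit, b_manual and a_manual are only illustrative.

```r
# Data from the assignment
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit y = a + b*x by ordinary least squares
fit <- lm(y ~ x)
a <- unname(coef(fit)["(Intercept)"])
b <- unname(coef(fit)["x"])

# Cross-check with the closed-form formulas:
# b = Sxy / Sxx and a = mean(y) - b * mean(x)
b_manual <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a_manual <- mean(y) - b_manual * mean(x)

print(round(c(a = a, b = b), 3))
```

Both routes should agree with the intercept and slope reported above once rounded to three decimals.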


2. Problem

Apply the simple linear regression model (see the formula above) to the data set called "visit" (shown below), and estimate the discharge duration if the waiting time since the last eruption has been 80 minutes.
> head(visit) 
  discharge  waiting 
1     3.600      79 
2     1.800      54 
3     3.333      74 
4     2.283      62 
5     4.533      85 
6     2.883      55 

Use the formula discharge ~ waiting with data = visit.

2.1 Define the relationship model between the predictor and the response variable.

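There is a positive linear relationship between discharge duration and waiting time. The fitted line is:

Y = 0.06756x - 1.53317

where Y is the discharge duration and x is the waiting time, both in minutes.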


2.2 Extract the parameters of the estimated regression equation with the coefficients function.
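
A minimal sketch of this step follows. The full "visit" data set is not reproduced in this post, so the sketch assumes only the six rows shown by head(visit); fitting those six rows appears to give the same coefficients as the equation in 2.1. The names visit_head, visit_fit and params are illustrative.

```r
# Assumption: only the six rows printed by head(visit) are used here
visit_head <- data.frame(
  discharge = c(3.600, 1.800, 3.333, 2.283, 4.533, 2.883),
  waiting   = c(79, 54, 74, 62, 85, 55)
)

# Fit discharge = a + b * waiting
visit_fit <- lm(discharge ~ waiting, data = visit_head)

# coefficients() returns a named vector: "(Intercept)" is a, "waiting" is b
params <- coefficients(visit_fit)
print(params)
```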





2.3 Determine the fit of the eruption duration using the estimated regression equation.


Using the estimated regression line, the next discharge after an 80-minute wait is predicted to be about 3.871 minutes long.
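
The prediction can be sketched with predict(), again assuming the model is fitted on just the six rows shown by head(visit), since the full data set is not included in this post. The names visit_head, visit_fit and new_wait are illustrative.

```r
# Assumption: only the six rows printed by head(visit) are used here
visit_head <- data.frame(
  discharge = c(3.600, 1.800, 3.333, 2.283, 4.533, 2.883),
  waiting   = c(79, 54, 74, 62, 85, 55)
)
visit_fit <- lm(discharge ~ waiting, data = visit_head)

# The new data must use the same column name as the predictor in the formula
new_wait <- data.frame(waiting = 80)
predicted_discharge <- unname(predict(visit_fit, newdata = new_wait))

print(round(predicted_discharge, 3))
```

The same number comes from plugging 80 into the fitted equation by hand: 0.06756 * 80 - 1.53317.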



3.  Multiple regression

We will use a very famous dataset in R called mtcars. This dataset was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

This data frame contains 32 observations on 11 (numeric) variables.

[, 1] mpg   Miles/(US) gallon
[, 2] cyl   Number of cylinders
[, 3] disp  Displacement (cu.in.)
[, 4] hp    Gross horsepower
[, 5] drat  Rear axle ratio
[, 6] wt    Weight (1000 lbs)
[, 7] qsec  1/4 mile time
[, 8] vs    Engine (0 = V-shaped, 1 = straight)
[, 9] am    Transmission (0 = automatic, 1 = manual)
[,10] gear  Number of forward gears
[,11] carb  Number of carburetors

To call the mtcars data in R:
R comes with several built-in datasets, which are generally used as demo data for playing with R functions. One of those built-in datasets is mtcars.
In this question, we will use 4 of the variables found in mtcars, selected with the following code:

input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))

R will display the first six rows of the four selected columns.

3.1 Examine the multiple regression model stated above and its coefficients, using 4 variables from mtcars (mpg, disp, hp and wt).
- Report on the result and explain: what do the multiple regression model and its coefficients tell us about the data?





This multiple regression model tells us that the mpg of these 32 cars has a negative relationship with the displacement, horsepower, and weight of the car. Holding the other two predictors fixed, each additional 1000 lbs of weight is associated with about 3.8 fewer mpg, each additional unit of horsepower with about 0.03 fewer mpg, and each additional cubic inch of displacement with about 0.000937 fewer mpg.
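
A minimal sketch of the fit behind this interpretation, using the built-in mtcars data and the same four columns selected above. The name model is illustrative.

```r
# mtcars ships with base R (datasets package)
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# mpg as the response, with disp, hp and wt as predictors
model <- lm(mpg ~ disp + hp + wt, data = input)

# Each slope is the expected change in mpg for a one-unit increase
# in that predictor, holding the other two predictors fixed
model_coefs <- coef(model)
print(model_coefs)
```

summary(model) gives standard errors and p-values as well, which helps in judging how much weight to put on each individual slope.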

4. From our textbook, p. 110, Exercise 5.1
With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression to the relation.

plot(metabolic.rate~body.weight,data=rmr)

4.1 According to the fitted model, what is the predicted metabolic rate for a body weight of 70 kg? 

The predicted metabolic rate for a body weight of 70 kg is 1305.394.
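
A minimal sketch of how this prediction can be obtained. It assumes the rmr data set comes from the ISwR package that accompanies the textbook, and that the package is installed; the code is skipped otherwise. The names rmr_fit and rmr_pred are illustrative.

```r
# Assumption: rmr is provided by the ISwR package
if (requireNamespace("ISwR", quietly = TRUE)) {
  data("rmr", package = "ISwR")

  # Fit metabolic.rate = a + b * body.weight
  rmr_fit <- lm(metabolic.rate ~ body.weight, data = rmr)

  # Predict for a body weight of 70 kg
  rmr_pred <- unname(predict(rmr_fit, newdata = data.frame(body.weight = 70)))
  print(rmr_pred)
}
```

Calling abline(rmr_fit) right after the plot() call above would draw the fitted line on the scatterplot.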





-Ramya's POV
