Posts

Showing posts from November, 2022

Final Project: Serially looking at your Cereal Choices

Hi everyone! Almost everyone starts their morning with a nice bowl of cereal. In the grocery aisle, you choose a box of Cheerios or chocolate puffs. However, your decision to buy that cereal may not have been completely your choice at all. You just picked from the options that were conveniently placed in front of you. My project will analyze the relationship between the shelf a cereal is placed on and that cereal's sugar, fiber, and calorie content. Do the ingredients of cereals, whether healthy or sugary, play a role in which shelf cereals are placed on and in their ratings?

Two-Part Analysis: Do sugar, calorie, and fiber content have a true mean difference with that cereal's placement on shelf 1, 2, or 3 (counting from the ground)? I will use boxplots to understand each ingredient's distribution across the shelves and perform ANOVA tests to determine whether each ingredient is significant for the placement of the cereal. ANOVA i...
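A minimal R sketch of the boxplot-plus-ANOVA approach described above, using a toy data frame; the name `cereal` and the columns `shelf`, `sugars`, `fiber`, and `calories` are placeholders for illustration, not the project's actual data.

```r
# A minimal sketch, assuming a data frame with one row per cereal and
# columns shelf, sugars, fiber, calories (toy values, not the real data).
cereal <- data.frame(
  shelf    = factor(rep(1:3, each = 5)),
  sugars   = c(3, 4, 2, 5, 3, 10, 12, 11, 9, 13, 6, 7, 5, 8, 6),
  fiber    = c(5, 6, 4, 5, 7, 1, 0, 1, 2, 1, 3, 2, 3, 4, 2),
  calories = c(90, 100, 90, 110, 100, 120, 130, 110, 120, 130,
               100, 110, 100, 120, 110)
)

# Boxplot of sugar content by shelf to compare the distributions
boxplot(sugars ~ shelf, data = cereal,
        xlab = "Shelf (1 = closest to the ground)",
        ylab = "Sugar per serving")

# One-way ANOVA: is there a true mean difference in sugar across shelves?
summary(aov(sugars ~ shelf, data = cereal))
```

The same boxplot and aov() calls can be repeated with fiber and calories as the response to test each ingredient separately.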

Module 12: Timely Time Series

Hi everyone! This week we are looking into time series, where we dive into the past to predict the future! Our goals in using time series analysis include identifying patterns in periodic data, predicting short-term trends, and modeling these findings. Time series data has several components, the most common being the trend and seasonal components. If our data fluctuates a lot around a common line or "zone", we can focus on methods that capture the trend component, like simple exponential smoothing. If more of the variation comes from periodic fluctuations, we use methods like decomposing and seasonally adjusting the series. We will be focusing on exponential smoothing in our data today! This is useful for short-term forecasting, with an alpha coefficient between 0 and 1 indicating the weight of the moving average. A smaller alpha means the data is smoothed more, while a higher alpha indicates less smoothing. Let's get started! The table...
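As a rough illustration of simple exponential smoothing in R, here is a minimal sketch using base R's HoltWinters() with the trend and seasonal terms switched off; the `sales` values are made up and are not the post's data.

```r
# A minimal sketch of simple exponential smoothing on made-up monthly data.
sales <- c(112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118)
sales_ts <- ts(sales, frequency = 12, start = c(2021, 1))

# beta and gamma are turned off, so only the level is smoothed;
# alpha (the smoothing weight between 0 and 1) is estimated from the data.
fit <- HoltWinters(sales_ts, beta = FALSE, gamma = FALSE)
fit$alpha

# Short-term forecast a few periods ahead
predict(fit, n.ahead = 3)
```

A small alpha here would mean the forecast leans heavily on the smoothed history, while an alpha near 1 tracks the most recent observations closely.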

Module 11: Logical Logistic Regression

Hi everyone! This week we looked into logistic regression, where, like the other forms of regression analysis we have covered, we estimate a dependent variable based on the effect of the independent variable. The unique aspect of logistic regression is that the dependent variable is binary, so the outcome has two options, like pass or fail, yes or no, etc. The formula is y = e^(b0 + b1x) / (1 + e^(b0 + b1x)), where y is the output proportion between 0 and 1. The best-fitting b values make the predicted y as close as possible to the observed outcomes. In R, we fit a logistic regression using the function glm().

10.1: 1. Set up an additive model for the ashina data, part of the ISwR package. 2. This data contains additive effects of subject, period, and treatment. Compare the results with those obtained from t tests. The treatment is significant in the ANOVA results at a p value of 0.01228. The t test of getting the treatment vs not shows a significant p ...
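Here is a minimal glm() sketch of the logistic model above, using made-up pass/fail data rather than the ashina exercise; the names `study_hours` and `passed` are illustrative only.

```r
# A minimal sketch of logistic regression on made-up binary data.
study_hours <- c(1, 2, 2, 3, 4, 5, 6, 7, 8, 9)
passed      <- c(0, 0, 0, 0, 1, 0, 1, 1, 1, 1)

# glm() with family = binomial fits y = e^(b0 + b1x) / (1 + e^(b0 + b1x))
fit <- glm(passed ~ study_hours, family = binomial)
summary(fit)            # b0 and b1 estimates with their p values

# Predicted proportions (between 0 and 1) for new x values
predict(fit, newdata = data.frame(study_hours = c(3.5, 7.5)),
        type = "response")
```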

Module 10: Varying Multivariate Regression

Hi everyone! This week we dive into a more complex regression analysis than the linear regression we have done before. Previously we had one predictor variable and one outcome variable. In multivariate multiple regression, we have multiple predictor (or independent) variables but one outcome variable we are analyzing. The formula looks like y = b1x1 + b2x2 + … + bnxn + error. We will look into the relationships between the variables collectively using ANOVA analysis and individually through linear regression (lm) to understand which variables are contributing to the significance of the relationship. R squared will be an important factor to look into because it shows the amount of variation in the dependent variable explained by the regression; a larger R squared indicates the predictors are better able to predict the outcome.

9.1. Conduct ANOVA (analysis of variance) and regression coefficients to the data from cyst...
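A minimal sketch of fitting a multiple-predictor regression and reading off the ANOVA table and R squared, using R's built-in mtcars data as a stand-in for the exercise data; mpg, wt, hp, and disp are mtcars columns, not variables from the exercise.

```r
# A minimal sketch of multiple regression with several predictors
# and one outcome, using the built-in mtcars data.
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

summary(fit)             # individual coefficients, p values, R squared
anova(fit)               # ANOVA table: contribution of each predictor

# R squared: proportion of variation in the outcome explained by the model
summary(fit)$r.squared
```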