View(phone)

Background

What is the relationship between phone screen size and phone price?

In this analysis I examine the relationship between the two using simple linear regression. The results are interpreted through the slope and the y-intercept of the fitted line. The slope is interpreted as “the change in the average y-value for a one unit change in the x-value,” and the y-intercept is the average y-value when the x-value is zero.

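Two equations are used in linear regression. The first is the true model, with unknown parameters and an error term; the second is the fitted line, with estimates computed from the data.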

\[ {Y_i} = \overbrace{\beta_0}^\text{y-int} + \overbrace{\beta_1}^\text{slope} * {X_i} + \epsilon_i \quad \text{where} \ \epsilon_i \sim N(0, \sigma^2) \]

\[ {\hat{Y}_i} = \overbrace{b_0}^\text{est. y-int} + \overbrace{b_1}^\text{est. slope} * {X_i} \]
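To make the second equation concrete, the sketch below computes the estimates b_0 and b_1 directly from the usual least-squares formulas (slope = sample covariance divided by the sample variance of x; intercept = mean of y minus slope times mean of x) and compares them with lm(). It assumes the data frame phone contains exactly the 11 rows listed in "The Data" section, and rebuilds it here so the example runs on its own.

```r
# Data copied from the table in "The Data" section of this report.
phone <- data.frame(
  Screen = c(6.1, 6.1, 6.6, 6.5, 4.7, 6.9, 6.2, 6.8, 7.6, 6.5, 6.6),
  Price  = c(1000, 830, 750, 700, 400, 1300, 1400, 588, 2000, 400, 600)
)

# Least-squares estimates from their closed-form definitions.
b1 <- cov(phone$Screen, phone$Price) / var(phone$Screen)  # est. slope
b0 <- mean(phone$Price) - b1 * mean(phone$Screen)         # est. y-int

# lm() should give the same two numbers.
fit <- lm(Price ~ Screen, data = phone)
round(c(b0 = b0, b1 = b1), 1)
round(coef(fit), 1)
```

Both lines should agree with the estimates reported in the Linear Regression output, up to rounding.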

Hypotheses

\[ \left.\begin{array}{ll} H_0: \beta_1 = 0 \\ H_a: \beta_1 \neq 0 \end{array} \right\} \ \text{Slope Hypotheses} \]

\[ \left.\begin{array}{ll} H_0: \beta_0 = 0 \\ H_a: \beta_0 \neq 0 \end{array} \right\} \ \text{Intercept Hypotheses} \]

Data Analysis

Basic Statistics

pander(phone %>% 
  summarise(Correlation = cor(Screen, Price, use="complete.obs")))
Correlation
0.5688
pander(phone %>% 
  group_by(Screen) %>% 
  summarise(avgprice = mean(Price)))

Screen avgprice
4.7 400
6.1 915
6.2 1400
6.5 550
6.6 675
6.8 588
6.9 1300
7.6 2000

Linear Regression

mylm <- lm(Price ~ Screen, data = phone)
pander(summary(mylm))
              Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)   -1601      1215         -1.317    0.2202
Screen        390.6      188.3        2.075     0.06784
Fitting linear model: Price ~ Screen
Observations   Residual Std. Error   \(R^2\)   Adjusted \(R^2\)
11             423.3                 0.3235    0.2484
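The t value and p-value for the slope can be reproduced by hand from the summary table above. This is a minimal sketch that plugs in the rounded numbers as reported, so the results may differ slightly in the last digit. The variable names are illustrative only.

```r
# Values copied from the regression summary (rounded as reported).
b1    <- 390.6   # estimated slope
se_b1 <- 188.3   # standard error of the slope
n     <- 11      # number of observations
df    <- n - 2   # two parameters estimated: intercept and slope

# Test statistic under H0: beta_1 = 0, and its two-sided p-value.
t_stat  <- (b1 - 0) / se_b1
p_value <- 2 * pt(-abs(t_stat), df)

round(t_stat, 3)
round(p_value, 4)
```

The same calculation with the intercept row (estimate -1601, standard error 1215) gives the intercept test.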
par(mfrow = c(1,3))
plot(mylm, which = 1:2)
plot(mylm$residuals)

ggplot(phone, aes(x = Screen, y = Price)) +
  geom_point() +
  geom_smooth(method = "lm", se=FALSE) +
  theme_bw()

pander(confint(mylm, level = 0.90))
  5 % 95 %
(Intercept) -3828 626.6
Screen 45.48 735.7
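The 90% interval for the slope follows from estimate ± critical t value × standard error, with 9 residual degrees of freedom. The sketch below uses the rounded estimate and standard error from the regression summary, so it matches the confint() output only approximately.

```r
# Rounded values from the regression summary.
b1    <- 390.6   # estimated slope
se_b1 <- 188.3   # standard error of the slope
df    <- 9       # 11 observations minus 2 estimated parameters

# A 90% interval leaves 5% in each tail, so use the 95th percentile of t.
t_crit <- qt(0.95, df)
ci     <- b1 + c(-1, 1) * t_crit * se_b1

round(ci, 1)
```

Because the interval lies entirely above zero, it is consistent with a positive slope at the 90% confidence level.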

The Data

pander(phone, split.tables = Inf)
Screen Price
6.1 1000
6.1 830
6.6 750
6.5 700
4.7 400
6.9 1300
6.2 1400
6.8 588
7.6 2000
6.5 400
6.6 600

Interpretation

\[ \underbrace{\hat{Y}_i}_\text{Price} = \overbrace{-1601}^\text{est. y-int} + \overbrace{390.6}^\text{est. slope} \underbrace{X_i}_\text{Screen} \]

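The equation above is the fitted line from the linear regression model. The estimated slope says that the average price increases by about 390.6 for each one-unit increase in screen size. The intercept of -1601 is the predicted price at a screen size of zero, which lies far outside the observed sizes (4.7 to 7.6), so it has no practical meaning; its p-value of 0.2202 also gives no evidence that it differs from zero.

The slope has a p-value of 0.06784, so there is insufficient evidence at the 0.05 level to conclude that it differs from zero, although the 90% confidence interval (45.48 to 735.7) does exclude zero. With an \(R^2\) of 0.3235, screen size explains only about 32% of the variation in price, and the residual standard error of 423.3 is large relative to the prices in the data. Price clearly depends on more than screen size, so this model is not a strong basis for conclusions about phone price.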
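As an illustration of how the fitted equation is used, and of how far its predictions can be from the data, the sketch below predicts the price for a screen size of 6.5 and compares it with the two phones of that size in the data set. The helper name predict_price is hypothetical, and the coefficients are the rounded values from the equation above.

```r
# Fitted line with the rounded coefficients from the Interpretation equation.
predict_price <- function(screen) -1601 + 390.6 * screen

predicted <- predict_price(6.5)   # about 937.9

# The two phones in the data with Screen = 6.5.
observed  <- c(700, 400)
residuals <- observed - predicted

predicted
residuals
```

Both observed prices fall well below the predicted value, which shows how much of the price is left unexplained by screen size alone.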