High School Seniors

library(tidyverse)
library(DT)
library(pander)
library(mosaic)
library(car)

# Read the High School Seniors (HSS) survey data directly from GitHub and drop every row that has a missing value.
dat <- read_csv("https://github.com/kctolli/MATH325/raw/master/Data/HighSchoolSeniors.csv") %>% na.omit()

df <- dat %>% select(Gender, Video_Games_Hours, Social_Websites_Hours, Texting_Messaging_Hours, Computer_Use_Hours, Watching_TV_Hours)

Questions and Hypotheses

How does electronic use time relate to gender? Do males spend more time on electronics?
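Stated formally, and assuming the two-sided test at the 0.05 level that is run in the Interpretation section, the hypotheses can be written as below. The symbols for the two group means of weekly electronic use hours are notation chosen here, not taken from the original write-up.

```latex
H_0: \mu_{\text{Female}} - \mu_{\text{Male}} = 0
\qquad
H_a: \mu_{\text{Female}} - \mu_{\text{Male}} \neq 0
\qquad
\alpha = 0.05
```

A one-sided alternative would match the question "Do males spend more time?" more directly, but the analysis as run uses the two-sided form.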

Data Analysis

The column names below give a brief glimpse of the base data set. The data come from a survey given to the high school seniors who participated in the study.

names(df)

## [1] "Gender"                  "Video_Games_Hours"      
## [3] "Social_Websites_Hours"   "Texting_Messaging_Hours"
## [5] "Computer_Use_Hours"      "Watching_TV_Hours"

HSS <- df %>% 
  mutate(Electronic_Use = Video_Games_Hours + Social_Websites_Hours + Texting_Messaging_Hours + Computer_Use_Hours + Watching_TV_Hours) %>% 
  select(- Video_Games_Hours, - Social_Websites_Hours, - Texting_Messaging_Hours, - Computer_Use_Hours,  - Watching_TV_Hours) %>% 
  filter(Electronic_Use <= 160)

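The code above adds up the hours spent on video games, social websites, texting, computer use, and watching TV to create a single Electronic_Use variable, then drops the five original columns. The filter keeps only students who reported at most 160 total hours per week (a week has 168 hours), which screens out implausibly large totals.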
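To make the transformation concrete, here is a minimal base R sketch on a hypothetical three-student data frame. The values in `toy` are invented for illustration. Only the column names and the 160-hour cutoff come from the analysis.

```r
# Hypothetical toy data with the same column names as df (values are made up)
toy <- data.frame(
  Gender                  = c("Male", "Female", "Female"),
  Video_Games_Hours       = c(5, 0, 100),
  Social_Websites_Hours   = c(2, 10, 30),
  Texting_Messaging_Hours = c(1, 20, 20),
  Computer_Use_Hours      = c(2, 15, 20),
  Watching_TV_Hours       = c(1, 5, 10)
)

# Sum the five hour columns row by row
hour_cols <- setdiff(names(toy), "Gender")
toy$Electronic_Use <- rowSums(toy[, hour_cols])

# Keep only plausible totals and the two columns used later
toy_hss <- toy[toy$Electronic_Use <= 160, c("Gender", "Electronic_Use")]
print(toy_hss)
```

The third student totals 180 hours, so that row is removed by the filter while the other two are kept.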

names(HSS)

## [1] "Gender"         "Electronic_Use"

The interactive table below shows the data set used for the numerical summary. The first five rows are displayed.

datatable(HSS, options=list(lengthMenu = c(5,10,50)), extensions="Responsive")

	Gender	Electronic_Use
1	Male	11
2	Female	75
3	Female	35
4	Male	16
5	Male	38
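A per-gender summary (count, mean, standard deviation) is one way to condense the table into numbers. The sketch below uses only the five rows displayed above, so its output is illustrative and is not the full-sample result. The names `first_rows` and `summary_by_gender` are introduced here.

```r
# The five rows shown in the table above (not the full data set)
first_rows <- data.frame(
  Gender         = c("Male", "Female", "Female", "Male", "Male"),
  Electronic_Use = c(11, 75, 35, 16, 38)
)

# Count, mean, and standard deviation of Electronic_Use within each gender
group_n    <- tapply(first_rows$Electronic_Use, first_rows$Gender, length)
group_mean <- tapply(first_rows$Electronic_Use, first_rows$Gender, mean)
group_sd   <- tapply(first_rows$Electronic_Use, first_rows$Gender, sd)

summary_by_gender <- data.frame(
  Gender = names(group_mean),
  n      = as.vector(group_n),
  mean   = as.vector(group_mean),
  sd     = as.vector(group_sd)
)
print(summary_by_gender)
```

On the full data, `favstats(Electronic_Use ~ Gender, data = HSS)` from the already loaded mosaic package should give a comparable table in one call.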

Graphical Summaries

Where do the outliers lie? Which gender has more outliers? Which gender's data stay closer to the mean? Which gender has a higher total electronic use time per week?

ggplot(data = HSS) + 
  geom_boxplot(aes(x = Gender, y = Electronic_Use, color = Gender)) +
  theme_bw()

ggplot(data = HSS) + 
  geom_col(aes(x = Gender, y = Electronic_Use, fill = Gender)) +
  theme_bw()
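The outlier questions can also be answered numerically. The sketch below counts outliers per gender with the 1.5 x IQR rule through base R's `boxplot.stats()`. The data in `toy` are hypothetical, and `boxplot.stats()` uses Tukey's hinges, which can differ slightly from the quartiles that ggplot2 draws, so counts on the real data may not match the plot exactly.

```r
# Hypothetical toy data (values invented for illustration)
toy <- data.frame(
  Gender         = rep(c("Female", "Male"), times = c(6, 5)),
  Electronic_Use = c(30, 35, 40, 45, 50, 150, 10, 20, 30, 40, 50)
)

# Split the hours by gender and pull the points flagged by the 1.5 x IQR rule
outliers_by_gender <- lapply(
  split(toy$Electronic_Use, toy$Gender),
  function(x) boxplot.stats(x)$out
)

# Number of flagged points in each group
outlier_counts <- sapply(outliers_by_gender, length)
print(outlier_counts)
```

In this toy example only the 150-hour response in the Female group is flagged.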

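Based on the graphical summaries, my hypothesis that males spend more time on electronics appears to be wrong, since both graphs lean toward the female side. Note that the bar chart shows summed hours, so its heights also reflect how many students of each gender responded. The graphs alone cannot settle the question, so a t test is used next to reach a more formal conclusion.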

Interpretation

ttest = t.test(Electronic_Use ~ Gender, data = HSS, mu = 0, alternative = "two.sided", conf.level = 0.95)

pander(ttest)

Welch Two Sample t-test: `Electronic_Use` by `Gender` (continued below)
Test statistic	df	P value	Alternative hypothesis
0.5338	302.7	0.5939	two.sided

mean in group Female	mean in group Male
44.22	42.16

Above are the results of an independent samples t test. An independent samples t test is used when a value is hypothesized for the difference between two (possibly) different population means, μ1 - μ2. Here the hypothesized difference is 0. The sample mean for females (44.22 hours) is higher than for males (42.16 hours), so average electronic use in this sample is slightly greater for females. I find this interesting since males tend to play more video games than females.
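To show what the Welch test computes, the sketch below calculates the test statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand, then compares them with `t.test()`. The two vectors are hypothetical values, not the survey data.

```r
# Hypothetical weekly hours for two small groups (values invented)
female <- c(40, 55, 38, 60, 47, 52)
male   <- c(35, 50, 42, 30, 58, 44, 39)

n_f <- length(female)
n_m <- length(male)

# Per-group variance of the sample mean
vf <- var(female) / n_f
vm <- var(male) / n_m

# Welch t statistic for H0: mu_female - mu_male = 0
se     <- sqrt(vf + vm)
t_stat <- (mean(female) - mean(male)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (vf + vm)^2 / (vf^2 / (n_f - 1) + vm^2 / (n_m - 1))

# Two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

# t.test() runs the Welch test by default (var.equal = FALSE)
check <- t.test(female, male)
print(c(t = t_stat, df = df_welch, p = p_value))
```

The hand-computed values should agree with `check$statistic`, `check$parameter`, and `check$p.value`. The non-integer degrees of freedom in the output above (302.7) come from the same approximation.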

The p-value is 0.5939, which is greater than the significance level of 0.05, so we fail to reject the null hypothesis. There is insufficient evidence that mean weekly electronic use differs between males and females.

t Test

Week 3
