d <- read.csv("county.csv")
View(d)
# The dataset is named "d".
# View the dataset. It describes US counties:
# each county's state, population, income, etc.
# QUESTIONS
# 1) What percentage of the dataset is populated with NA values?
# Hint: Calculate total NAs then divide by (number of rows X number of columns)
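# A minimal sketch of the hint above on a made-up data frame.
# (toy_na and its columns are hypothetical, not part of county.csv.)
toy_na <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))
# Total NAs divided by (number of rows x number of columns), as a percentage.
pct_na <- sum(is.na(toy_na)) / (nrow(toy_na) * ncol(toy_na)) * 100
pct_na
# mean(is.na(toy_na)) * 100 should give the same figure, since is.na()
# on a data frame returns a logical matrix.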
# 2) Show the county name and population in 2017 for the state of Connecticut
# where unemployment rate is greater than 5.0.
# 3) Show the county name, state and per capita income for counties whose
# population change is positive and unemployment rate is less than 2.
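# Questions 2 and 3 both come down to filtering rows on two conditions
# and keeping a few columns. A sketch on made-up data follows; toy_sub and
# its column names (name, state, pop2017, unemployment_rate) are assumptions
# and may not match the columns in county.csv.
toy_sub <- data.frame(
  name = c("A County", "B County", "C County", "D County"),
  state = c("Connecticut", "Connecticut", "Connecticut", "Ohio"),
  pop2017 = c(1000, 2000, 1500, 3000),
  unemployment_rate = c(5.5, 4.0, NA, 6.1)
)
# which() drops rows where the condition is NA; indexing with the raw
# logical vector would return an all-NA row for "C County" instead.
keep <- which(toy_sub$state == "Connecticut" & toy_sub$unemployment_rate > 5)
ct_high <- toy_sub[keep, c("name", "pop2017")]
ct_high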
# 4) We are wondering about population change from 2010 to 2017
# in counties where poverty rate is greater than 20 and
# unemployment rate is greater than 8.
# For this purpose, create a new variable using the following formula
# as a new column of the dataset d.
# popChng17_20 = (pop2017 - pop2010)/pop2010.
#
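# A sketch of the formula above on made-up data. toy_chg and its values are
# hypothetical; the column names (pop2010, pop2017, poverty,
# unemployment_rate) are assumed to match those in d.
toy_chg <- data.frame(
  pop2010 = c(100, 200, 400),
  pop2017 = c(110, 180, 400),
  poverty = c(25, 22, 10),
  unemployment_rate = c(9, 8.5, 3)
)
# Add the relative change as a new column.
toy_chg$popChng17_20 <- (toy_chg$pop2017 - toy_chg$pop2010) / toy_chg$pop2010
# Keep only rows with poverty > 20 and unemployment rate > 8.
high_need <- toy_chg$popChng17_20[toy_chg$poverty > 20 & toy_chg$unemployment_rate > 8]
high_need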
# 5) The population in “Hoonah Angoon Census Area” of Alaska in 2017
# is missing (i.e. NA). However, a quick Google search showed that
# this population is actually 2139. Now replace the NA in this
# spot with 2139.
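# One way to overwrite a single missing cell, sketched on made-up data.
# toy_fix, "X Area" and "Y Area" are hypothetical; in d, the row would be
# picked by the real county name and state instead.
toy_fix <- data.frame(
  name = c("X Area", "Y Area"),
  state = c("Alaska", "Alaska"),
  pop2017 = c(NA, 500)
)
# Select the row by name and state, then assign into the pop2017 column.
toy_fix$pop2017[toy_fix$name == "X Area" & toy_fix$state == "Alaska"] <- 2139
toy_fix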
# 6) We are wondering about mean poverty level for metro and non-metro counties
# in the state of Connecticut. Calculate them. Which one is higher?
# (NOTE: To ignore NAs, use na.rm = TRUE when calculating means)
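# A sketch of a group-wise mean with na.rm = TRUE, on made-up data.
# toy_grp is hypothetical; the "yes"/"no" coding of metro is an assumption.
toy_grp <- data.frame(
  metro = c("yes", "yes", "no", "no"),
  poverty = c(10, NA, 14, 16)
)
# tapply() applies mean() within each metro group; na.rm is passed to mean().
grp_means <- tapply(toy_grp$poverty, toy_grp$metro, mean, na.rm = TRUE)
grp_means
# For the actual question, d would first be restricted to Connecticut rows.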
# 7) Which year has the highest variation in terms of county populations?
# Is it 2000, 2010 or 2017? (NOTE: To ignore NAs, use na.rm = TRUE )
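# A sketch that compares spread across several columns at once, using the
# standard deviation as the measure of variation (var() would rank the
# columns the same way). toy_var is made-up data; the column names
# pop2000, pop2010 and pop2017 are assumptions.
toy_var <- data.frame(
  pop2000 = c(10, 20, NA),
  pop2010 = c(10, 30, 50),
  pop2017 = c(5, 45, 100)
)
# sapply() runs sd() on each column; na.rm = TRUE is passed through to sd().
sds <- sapply(toy_var, sd, na.rm = TRUE)
sds
# Name of the column with the largest standard deviation.
names(which.max(sds))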
# 8) Create a histogram for homeownership variable
# with 40 bars. Comment on the skewness of the distribution.
# 9) Create a boxplot of the poverty variable for metro and non-metro areas
# (HINT: use y ~ x notation in the boxplot.) Comment on the plot:
# which location has higher poverty overall?
# 10) Assume that you are investigating the variables that
# could be associated with ‘poverty’ variable. Create scatter
# plots on a 2 x 2 panel for poverty vs. unemployment_rate,
# homeownership, per_capita_income, and pop_change.
# Which variables seem to be associated with poverty?
# (NOTE: If you get “figure margins too large”, enlarge the plotting
# window to the left and upward)
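# A sketch of the 2 x 2 panel layout with base graphics. toy_plot holds
# random made-up values, so its plots show no real association; the column
# names are assumed to match those in d.
set.seed(1)
toy_plot <- data.frame(
  poverty = runif(30, 5, 30),
  unemployment_rate = runif(30, 2, 10),
  homeownership = runif(30, 50, 90),
  per_capita_income = runif(30, 15000, 40000),
  pop_change = runif(30, -5, 5)
)
# par(mfrow = c(2, 2)) splits the device into 2 rows x 2 columns and
# returns the previous setting so that it can be restored afterwards.
old_par <- par(mfrow = c(2, 2))
for (v in c("unemployment_rate", "homeownership", "per_capita_income", "pop_change")) {
  plot(toy_plot[[v]], toy_plot$poverty, xlab = v, ylab = "poverty")
}
par(old_par)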
#################################################
# In this part, use ggplot2 and dplyr packages
#################################################
library(ggplot2)
library(dplyr)
# 11) We wonder how the unemployment rate changes
# as education level changes.
# For this purpose, create a boxplot for unemployment_rate vs. median_edu.
# Label the x-axis as "Education Level".
# Comment on the chart. How does the unemployment rate change?
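# A sketch of a ggplot2 boxplot with a custom x-axis label, on made-up data.
# toy_edu and its education categories are hypothetical and may not match
# the levels of median_edu in d.
library(ggplot2)
toy_edu <- data.frame(
  median_edu = c("hs_diploma", "hs_diploma", "some_college",
                 "some_college", "bachelors", "bachelors"),
  unemployment_rate = c(6.1, 5.4, 4.8, 4.2, 3.0, 2.7)
)
# One box per education level; labs() sets the axis title.
p_edu <- ggplot(toy_edu, aes(x = median_edu, y = unemployment_rate)) +
  geom_boxplot() +
  labs(x = "Education Level")
p_edu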
# 12) Using dplyr and ggplot2, find population change from 2010 to 2017
# in counties where poverty rate is greater than 20 and
# unemployment rate is greater than 8.
# For this purpose, mutate a new variable
# called popChng17_20 = (pop2017 - pop2010)/pop2010.
# Then, create a ggplot boxplot showing popChng17_20 vs metro.
# 13) Group the dataset by State, and then summarize using
# count, mean unemployment rate, and mean per capita income.
# Sort the result by mean unemployment rate.
# (Hint: Remove NAs when calculating means: na.rm = TRUE)
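# A sketch of the group / summarise / sort pipeline on made-up data.
# toy_state, its values, and the summary column names (count, mean_unemp,
# mean_income) are hypothetical; the column in d may be spelled differently.
library(dplyr)
toy_state <- data.frame(
  state = c("A", "A", "B", "B", "B"),
  unemployment_rate = c(4, 6, 3, NA, 5),
  per_capita_income = c(20000, 30000, 25000, 27000, NA)
)
toy_summary <- toy_state %>%
  group_by(state) %>%
  summarise(
    count = n(),                                          # rows per state
    mean_unemp = mean(unemployment_rate, na.rm = TRUE),   # NAs ignored
    mean_income = mean(per_capita_income, na.rm = TRUE)
  ) %>%
  arrange(mean_unemp)                                     # ascending order
toy_summary
# arrange(desc(mean_unemp)) would sort from highest to lowest instead.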