STA 326 2.0 Programming and Data Analysis with R

R Data Import and Export

Dr Thiyanga Talagala

Online distance learning/teaching materials during the COVID-19 outbreak.

Data import with readr

R package

readr: part of the core tidyverse.

library(tidyverse)

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

✓ ggplot2 3.3.5     ✓ purrr   0.3.4
✓ tibble  3.1.2     ✓ dplyr   1.0.7
✓ tidyr   1.1.3     ✓ stringr 1.4.0
✓ readr   1.4.0     ✓ forcats 0.5.1

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()

`readr` data import functions

read_csv: reads comma-delimited files.
read_csv2: reads semicolon-separated files, where the comma is used as the decimal mark.
read_tsv: reads tab-delimited files.
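The three functions can be compared on the same small table. The sketch below is only an illustration: the file contents and object names are made up, and the files are written to a temporary folder so nothing in your project is touched.

```r
library(readr)

# Hypothetical example data, written to temporary files
csv_file  <- tempfile(fileext = ".csv")
csv2_file <- tempfile(fileext = ".csv")
tsv_file  <- tempfile(fileext = ".tsv")

writeLines(c("id,height", "1,1.5", "2,1.7"), csv_file)     # comma-delimited
writeLines(c("id;height", "1;1,5", "2;1,7"), csv2_file)    # semicolon-delimited, decimal comma
writeLines(c("id\theight", "1\t1.5", "2\t1.7"), tsv_file)  # tab-delimited

# Each function expects a different delimiter
from_csv  <- read_csv(csv_file)
from_csv2 <- read_csv2(csv2_file)
from_tsv  <- read_tsv(tsv_file)

# All three tibbles hold the same values
from_csv
from_csv2
from_tsv
```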


🛠 Import data from a .csv file

Syntax

datasetname <- read_csv("include_file_path")

When you run read_csv(), it prints a column specification that gives the name and type of each column.
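The printed column specification can also be inspected afterwards with spec(), and the types can be stated explicitly through the col_types argument instead of being guessed. A minimal sketch, assuming a made-up file with columns id, name and mark:

```r
library(readr)

# Hypothetical example file in a temporary location
marks_file <- tempfile(fileext = ".csv")
writeLines(c("id,name,mark", "1,Ann,72.5", "2,Ben,64"), marks_file)

# Let readr guess the column types; the guess is printed as a message
marks <- read_csv(marks_file)
spec(marks)   # shows the specification that was used

# State the column types explicitly instead of relying on the guess
marks_typed <- read_csv(
  marks_file,
  col_types = cols(
    id   = col_integer(),
    name = col_character(),
    mark = col_double()
  )
)
marks_typed
```

Stating the types is optional, but it makes the import reproducible if the file later gains rows that would change the guess.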

Switch to R


If the file is saved inside the project folder

Demo: Go to Google Classroom and watch the video importdatacsv1.mov

If the file is saved outside the project folder

Demo: Go to Google Classroom and watch the video importdatacsv2.mov
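The two cases differ only in the path you give to read_csv(). The sketch below is an illustration, not the content of the videos: a temporary folder named myproject (a hypothetical name) stands in for the project folder, and setwd() imitates what opening an RStudio project does for you.

```r
library(readr)

# A temporary folder stands in for the project folder (hypothetical layout)
project_dir <- file.path(tempdir(), "myproject")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("id,mark", "1,72", "2,64"),
           file.path(project_dir, "data", "marks.csv"))

# Inside the project folder: a path relative to the working directory is enough
old_wd <- setwd(project_dir)
inside <- read_csv("data/marks.csv")
setwd(old_wd)

# Outside the project folder: give the full (absolute) path
full_path <- file.path(project_dir, "data", "marks.csv")
outside <- read_csv(full_path)
```

Building the path with file.path() avoids having to type the path separator for your operating system by hand.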


🛠 Importing a csv file from a website

Syntax

datasetname <- read_csv("include url here")

Example

url <- "https://thiyanga.netlify.app/project/datasets/foodlabel.csv"
foodlabel <- read_csv(url)

Warning: Missing column names filled in: 'X43' [43]

Parsed with column specification:
cols(
  .default = col_double()
)

See spec(...) for full column specifications.

head(foodlabel, 1)

# A tibble: 1 x 80
  Gender   Age Education Employment Income Housesize children marital fshopper
   <dbl> <dbl>     <dbl>      <dbl>  <dbl>     <dbl>    <dbl>   <dbl>    <dbl>
1      1    22         5          4      3         5        2       0        0
# … with 71 more variables: mplanner <dbl>, place <dbl>, FA <dbl>,
#   Diabetes <dbl>, Metabolic cyndrents <dbl>, Other <dbl>, specific <dbl>,
#   job1 <dbl>, job2 <dbl>, Exercise <dbl>, Health <dbl>, taste <dbl>,
#   easy <dbl>, familiarity <dbl>, friends <dbl>, Useful <dbl>, Easiness <dbl>,
#   Sufficient <dbl>, Trusfulness <dbl>, Clear <dbl>, attractive pack <dbl>,
#   hc/nutriclaims <dbl>, graphical <dbl>, Free/prize <dbl>, source <dbl>,
#   netquan <dbl>, low in fat <dbl>, low in cho <dbl>, sodium <dbl>,
#   e labels <dbl>, place2 <dbl>, fa2 <dbl>, Health_1 <dbl>, X43 <dbl>,
#   f1 <dbl>, f2 <dbl>, f3 <dbl>, f4 <dbl>, f5 <dbl>, f6 <dbl>, f7 <dbl>,
#   f8 <dbl>, f9 <dbl>, f10 <dbl>, f11 <dbl>, f12 <dbl>, f13 <dbl>, f14 <dbl>,
#   f15 <dbl>, f16 <dbl>, f17 <dbl>, f18 <dbl>, i1 <dbl>, i2 <dbl>, i3 <dbl>,
#   i4 <dbl>, i5 <dbl>, i6 <dbl>, i7 <dbl>, i8 <dbl>, i9 <dbl>, i10 <dbl>,
#   i11 <dbl>, i12 <dbl>, i13 <dbl>, i14 <dbl>, i15 <dbl>, i16 <dbl>,
#   i17 <dbl>, i18 <dbl>, cluster <dbl>
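The warning above appears because one header cell in the file is empty, so readr generates a name for that column. The effect can be reproduced offline with a small made-up file. The generated name depends on the readr version (X43 in the output above; other versions may use a different pattern), so this sketch does not assume a particular one.

```r
library(readr)

# Hypothetical file whose second column has no name in the header row
gap_file <- tempfile(fileext = ".csv")
writeLines(c("a,,c", "1,2,3", "4,5,6"), gap_file)

# readr fills in a name for the unnamed column and tells you that it did
gap <- read_csv(gap_file)
names(gap)

# Give the column a meaningful name of your own (here: "b")
names(gap)[2] <- "b"
gap
```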


`read.csv` and `read_csv`

read.csv() is in base R.
read_csv() is in readr, which is part of the tidyverse.
read.csv() performs a similar job to read_csv(), but returns a data frame rather than a tibble.
read_csv() works well with other parts of the tidyverse.
read_csv() is typically faster than read.csv().
read_csv() always reads variables containing text as character variables. In contrast, in R versions before 4.0.0 the base R function read.csv() converted character variables to factors by default; since R 4.0.0 it keeps them as character unless stringsAsFactors = TRUE is set.
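The difference in the returned object and in the treatment of text columns can be checked directly. This is a small sketch with a made-up file; stringsAsFactors = TRUE is set explicitly so that the factor conversion is visible whichever version of R you run.

```r
library(readr)

# Hypothetical example file
pets_file <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Rex,3", "Tom,5"), pets_file)

# Base R, with the factor conversion switched on explicitly
base_df <- read.csv(pets_file, stringsAsFactors = TRUE)

# readr
tidy_df <- read_csv(pets_file)

class(base_df)        # a plain data frame
class(tidy_df)        # a tibble
class(base_df$name)   # text column stored as a factor
class(tidy_df$name)   # text column stored as character
```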


🛠 Writing to a File

We can save a tibble (or data frame) to a csv file using write_csv().
write_csv() is in the readr package.

Syntax

write_csv(name_of_the_data_set_you_want_to_save, "path_to_write_to")

Example

data(iris)
# This will save inside your project folder
write_csv(iris, "iris.csv") 
# This will save inside the data folder, which is inside your project folder
# (the data folder must already exist)
write_csv(iris, "data/iris.csv")
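To check that a file was written as expected, it can be read straight back in. The sketch below writes to a data folder inside a temporary directory (an assumption made so that the example does not create files in your project) and creates that folder first.

```r
library(readr)

# Create the target folder first; a temporary location is used here
out_dir <- file.path(tempdir(), "data")
dir.create(out_dir, showWarnings = FALSE)

# Write the data set, then read it back in
out_file <- file.path(out_dir, "iris.csv")
write_csv(iris, out_file)
iris_back <- read_csv(out_file)

dim(iris_back)             # same number of rows and columns as iris
class(iris_back$Species)   # the factor comes back as a character column
```

A csv file stores only text, so the factor levels of Species are not kept; convert the column back with factor() if you need them.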

Switch to R

Demo: Go to Google Classroom and watch the video exportdatacsv.mov


🛠 Importing Excel .xlsx files

Syntax

library(readxl)
mydata <- read_xlsx("file_path")
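An Excel workbook can hold several sheets, so it is often useful to list them before importing. The sketch below uses datasets.xlsx, an example workbook that ships with the readxl package, so it should run without any file of your own; the object names are illustrative.

```r
library(readxl)

# Path to an example workbook installed with readxl
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheets in the workbook
excel_sheets(xlsx_path)

# Import one sheet by name
cars <- read_xlsx(xlsx_path, sheet = "mtcars")

# Import only the first five rows of that sheet
first_rows <- read_xlsx(xlsx_path, sheet = "mtcars", n_max = 5)
```

If sheet is not given, read_xlsx() reads the first sheet.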

Switch to R

Demo: Go to Google Classroom and watch the video readxlsx.mov


Importing SAS, SPSS and Stata files

The functions below are in the haven package.

library(haven)

SAS

read_sas("mtcars.sas7bdat")
write_sas(mtcars, "mtcars.sas7bdat")

SPSS

read_sav("mtcars.sav")
write_sav(mtcars, "mtcars.sav")

Stata

read_dta("mtcars.dta")
write_dta(mtcars, "mtcars.dta")
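The calls above assume that the files already exist in the working directory. A self-contained way to try them is a round trip: write mtcars to a temporary file and read it back. This sketch covers only the Stata and SPSS formats, and the file locations are temporary ones chosen for illustration.

```r
library(haven)

# Temporary files, so nothing is written to the project folder
dta_file <- tempfile(fileext = ".dta")
sav_file <- tempfile(fileext = ".sav")

# Export the built-in mtcars data set
write_dta(mtcars, dta_file)
write_sav(mtcars, sav_file)

# Import the files again
from_stata <- read_dta(dta_file)
from_spss  <- read_sav(sav_file)

dim(from_stata)
dim(from_spss)
```

The row names of mtcars (the car names) are not written to these files, so store them in a column first if you need to keep them.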


Importing other types of data

feather: for sharing with Python and other languages
httr: for web APIs
jsonlite: for JSON
rvest: for web scraping
xml2: for XML

Working with feather, httr, jsonlite, rvest and xml2 is beyond the scope of the course.


Slides available at: hellor.netlify.app
