01 – Finding your way in R (Part 1)

Kieran Healy

January 23, 2024

Data Wrangling &
Data Visualization

What we’ll be Doing

Course Website

Get up and Running: Install R

The R Project Website

Install RStudio

The RStudio / Posit Website

Download Problem Set 01

Open it

  • Uncompress the Zip file if that doesn’t happen automatically
  • Double-click the 01-problem-set.Rproj file.
  • RStudio should launch

Try rendering the problem set

We take a broadly Plain Text approach

The plain person’s guide
  • Using R and the Tidyverse can be understood within this broader context.
  • The same principles would apply to, e.g., using Python or similar tools.
  • For more on this see the notes at https://mptc.io

RStudio is an IDE for R

A kitchen is an IDE for Meals

R and RStudio

Your code is what’s real in your project

Consider not showing output inline

Writing documents

Using Quarto to produce and reproduce your work

Where we want to end up

  • PDF out
  • HTML out
  • Word out

How to get there?

  • Write an R script with some notes inside. Create some figures and tables, paste them into our document.
  • This will work, but we can do better.

We can make this …

… by writing this

The code gets replaced by its output

  • This approach has its limitations, but it’s very useful.

  • When learning these workflows, stick with the defaults at the beginning. Later, you can customize the look of the output in all kinds of ways.
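As a rough sketch of what "writing this" can look like, the R code below writes a tiny Quarto source file to a temporary folder. The file name, title, and contents are hypothetical and chosen only for illustration. The source mixes plain-text notes with a chunk of R code, and rendering replaces the chunk with its output.

```r
# Hypothetical file name; any path ending in .qmd would do.
qmd_path <- file.path(tempdir(), "example-report.qmd")

# Three backticks mark the start and end of a code chunk in the source file.
fence <- strrep("`", 3)

qmd_lines <- c(
  "---",
  "title: \"A small example\"",
  "format: html",
  "---",
  "",
  "Some notes written in plain text.",
  "",
  paste0(fence, "{r}"),
  "my_numbers <- c(1, 1, 2, 4, 1, 3, 1, 5)",
  "mean(my_numbers)",
  fence,
  "",
  "More notes after the code."
)

writeLines(qmd_lines, qmd_path)

# Rendering needs Quarto installed. Assuming the quarto R package is
# available, this call would produce the HTML version:
# quarto::quarto_render(qmd_path)
```

In RStudio the same thing is usually done by opening the .qmd file and clicking the Render button.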

The right frame of mind

Learn by doing

  • This is like learning how to drive a car, or how to cook in a kitchen … or learning to speak a language.
  • After some orientation to what’s where, you will learn best by doing.
  • Software is a pain, but you won’t crash the car or burn your house down.

TYPE OUT
YOUR CODE
BY HAND

Samuel Beckett

Getting Oriented

Loading the tidyverse libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package to force all conflicts to become errors

The tidyverse has several components.

We’ll return to this message about Conflicts later.
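The Conflicts note says that dplyr and the built-in stats package each define functions named filter() and lag(). Once the tidyverse is loaded, a plain filter() refers to dplyr's version. One way to be explicit about which one is meant is the package::function() form. This is a minimal sketch, assuming dplyr is installed, and not necessarily how the course handles conflicts later.

```r
# mtcars is a small data set that ships with R.

# dplyr's filter() keeps the rows of a table that match a condition.
four_cyl <- dplyr::filter(mtcars, cyl == 4)
nrow(four_cyl)

# stats' filter() is a different tool: it applies a linear filter
# (here a 3-point moving average) to a numeric series.
smoothed <- stats::filter(1:10, rep(1 / 3, 3))
smoothed
```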

Tidyverse components

Call the package with library(tidyverse) and it loads, among others:

  • ggplot2: Draw graphs
  • tibble: Nicer data tables
  • tidyr: Tidy your data
  • readr: Get data into R
  • purrr: Fancy iteration
  • dplyr: Action verbs for tables
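To give a feel for two of these components working together, here is a small sketch that assumes tibble and dplyr are installed. The table and its column names are made up for illustration, and the native pipe |> needs R 4.1 or later.

```r
library(tibble)
library(dplyr)

# A made-up table, built with tibble()
scores <- tibble(
  name  = c("a", "b", "c"),
  score = c(3, 5, 4)
)

# dplyr verbs: keep some rows, then sort them
top_scores <- scores |>
  filter(score > 3) |>
  arrange(desc(score))

top_scores
```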

What R looks like

Code you can type and run:

## Inside code chunks, lines beginning with a # character are comments
## Comments are ignored by R

my_numbers <- c(1, 1, 2, 4, 1, 3, 1, 5) # Anything after a # character is ignored as well

Output:

my_numbers 
[1] 1 1 2 4 1 3 1 5

This is equivalent to running the code above, typing my_numbers at the console, and hitting enter.
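Once an object like my_numbers exists, it can be passed to functions. The sketch below uses only base R, and the particular operations are picked for illustration.

```r
my_numbers <- c(1, 1, 2, 4, 1, 3, 1, 5)

length(my_numbers)   # how many elements the vector holds
sum(my_numbers)      # add the elements up
mean(my_numbers)     # their average
my_numbers * 2       # arithmetic applies to every element at once
```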

By convention, code output in rendered documents is often prefixed by ## (the output shown here omits the prefix).

Also by convention, when R prints a vector, each line of output starts with a bracketed counter giving the position of that line's first element. For example,

letters
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
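The same bracketed positions can be used to pull elements out of a vector. In this short base R sketch the vector x is hypothetical, made long enough that its printed output will usually wrap onto several lines.

```r
letters[20]      # the element at position 20
letters[1:3]     # the first three elements

# A hypothetical vector of 30 numbers. When printed, each line of
# output starts with the position of its first element.
x <- 101:130
x
```

The position at which output wraps depends on the width of the console, so the counters on later lines can differ from one machine to another.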