Make Some Graphs

Soc 690S: Week 03a

Kieran Healy

Duke University

January 2025

Load our libraries

library(here)      # manage file paths
library(socviz)    # data and some useful functions
library(tidyverse) # your friend and mine
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(gapminder) # some data

Nearly done with the scaffolding

  • ✅ Thought about elements of visualization
  • ✅ Gotten oriented to R and RStudio
  • ✅ Knitted a document
  • ✅ Written a bit of ggplot code
  • ⬜ Get my data into R
  • ⬜ Make a plot with it

Feed ggplot tidy data

What is tidy data?

Tidy data is in long format

Every column is a single variable

Grolemund & Wickham

Every row is a single observation

Every cell is a single value

Get your data into long format

Very, very often, the solution to some data-wrangling or data visualization problem in a Tidyverse-focused workflow is:

First, get the data into long format

Then do the thing you want.

Untidy data exists for good reasons

Storing and printing data in long format entails a lot of repetition:

library(palmerpenguins)
penguins |> 
  group_by(species, island, year) |> 
  summarize(bill = round(mean(bill_length_mm, na.rm = TRUE),2)) |> 
  tinytable::tt()
species island year bill
Adelie Biscoe 2007 38.32
Adelie Biscoe 2008 38.70
Adelie Biscoe 2009 39.69
Adelie Dream 2007 39.10
Adelie Dream 2008 38.19
Adelie Dream 2009 38.15
Adelie Torgersen 2007 38.80
Adelie Torgersen 2008 38.77
Adelie Torgersen 2009 39.31
Chinstrap Dream 2007 48.72
Chinstrap Dream 2008 48.70
Chinstrap Dream 2009 49.05
Gentoo Biscoe 2007 47.01
Gentoo Biscoe 2008 46.94
Gentoo Biscoe 2009 48.50

Untidy data exists for good reasons

A wide format is easier and more efficient to read in print:

penguins |> 
  group_by(species, island, year) |> 
  summarize(bill = round(mean(bill_length_mm, na.rm = TRUE), 2)) |> 
  pivot_wider(names_from = year, values_from = bill) |> 
  tinytable::tt()
species island 2007 2008 2009
Adelie Biscoe 38.32 38.70 39.69
Adelie Dream 39.10 38.19 38.15
Adelie Torgersen 38.80 38.77 39.31
Chinstrap Dream 48.72 48.70 49.05
Gentoo Biscoe 47.01 46.94 48.50

But also for less good reasons

Spot the untidiness
  • 😠 More than one header row
  • 😡 Mixed data types in some columns
  • 💀 Color and typography used to encode variables and their values

Fix it before you import it

Prevention is better than cure!

An excellent article by Karl Broman and Kara Woo:

Data organization in spreadsheets

The most common tidyr operation

Pivoting from wide to long:

edu
# A tibble: 366 × 11
   age   sex    year total elem4 elem8   hs3   hs4 coll3 coll4 median
   <chr> <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl>  <dbl>
 1 25-34 Male   2016 21845   116   468  1427  6386  6015  7432     NA
 2 25-34 Male   2015 21427   166   488  1584  6198  5920  7071     NA
 3 25-34 Male   2014 21217   151   512  1611  6323  5910  6710     NA
 4 25-34 Male   2013 20816   161   582  1747  6058  5749  6519     NA
 5 25-34 Male   2012 20464   161   579  1707  6127  5619  6270     NA
 6 25-34 Male   2011 20985   190   657  1791  6444  5750  6151     NA
 7 25-34 Male   2010 20689   186   641  1866  6458  5587  5951     NA
 8 25-34 Male   2009 20440   184   695  1806  6495  5508  5752     NA
 9 25-34 Male   2008 20210   172   714  1874  6356  5277  5816     NA
10 25-34 Male   2007 20024   246   757  1930  6361  5137  5593     NA
# ℹ 356 more rows

Here, a “Level of Schooling Attained” variable is spread across the columns, from elem4 to coll4. We need a key column called “education” with the various levels of schooling, and a corresponding value column containing the counts.

Wide to long with pivot_longer()

We’re going to put the columns elem4:coll4 into a new column, creating a new categorical measure named education. The numbers currently under each column will become a new value column corresponding to that level of education.

edu |> 
  pivot_longer(elem4:coll4, names_to = "education")
# A tibble: 2,196 × 7
   age   sex    year total median education value
   <chr> <chr> <int> <int>  <dbl> <chr>     <dbl>
 1 25-34 Male   2016 21845     NA elem4       116
 2 25-34 Male   2016 21845     NA elem8       468
 3 25-34 Male   2016 21845     NA hs3        1427
 4 25-34 Male   2016 21845     NA hs4        6386
 5 25-34 Male   2016 21845     NA coll3      6015
 6 25-34 Male   2016 21845     NA coll4      7432
 7 25-34 Male   2015 21427     NA elem4       166
 8 25-34 Male   2015 21427     NA elem8       488
 9 25-34 Male   2015 21427     NA hs3        1584
10 25-34 Male   2015 21427     NA hs4        6198
# ℹ 2,186 more rows

Wide to long with pivot_longer()

We can name the value column whatever we like. Here it’s a number of people.

edu |> 
  pivot_longer(elem4:coll4, 
               names_to = "education", 
               values_to = "n")
# A tibble: 2,196 × 7
   age   sex    year total median education     n
   <chr> <chr> <int> <int>  <dbl> <chr>     <dbl>
 1 25-34 Male   2016 21845     NA elem4       116
 2 25-34 Male   2016 21845     NA elem8       468
 3 25-34 Male   2016 21845     NA hs3        1427
 4 25-34 Male   2016 21845     NA hs4        6386
 5 25-34 Male   2016 21845     NA coll3      6015
 6 25-34 Male   2016 21845     NA coll4      7432
 7 25-34 Male   2015 21427     NA elem4       166
 8 25-34 Male   2015 21427     NA elem8       488
 9 25-34 Male   2015 21427     NA hs3        1584
10 25-34 Male   2015 21427     NA hs4        6198
# ℹ 2,186 more rows

How to get your own data into R

Reading in CSV files

  • Base R has read.csv()
  • Corresponding tidyverse “underscored” version: read_csv().
  • It is pickier and more talkative than the Base R version. Use it instead.
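
A minimal sketch of what “pickier and more talkative” means in practice. The inline data and its column names are invented for illustration; wrapping the string in I() tells readr (version 2.0 or later) to treat it as literal data rather than as a file path.

```r
library(readr)

# Invented inline data, standing in for a file on disk
raw <- I("id,score,group\n1,3.5,a\n2,4.0,b\n3,NA,a\n")

# Talkative: with no col_types, read_csv() reports the column types it guessed
df_guess <- read_csv(raw)

# Picky: declare the types you expect, so surprises get flagged
df_spec <- read_csv(raw,
                    col_types = cols(id = col_integer(),
                                     score = col_double(),
                                     group = col_character()))

spec(df_spec)      # the column specification that was used
problems(df_spec)  # any parsing problems (none for this data)
```

Declaring col_types is optional, but it means a file that changes shape will complain loudly instead of being read in silently.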

Where’s my data? Using here()

  • If we’re loading a file, it’s coming from somewhere.
  • If it’s a file on our hard drive somewhere, we will need to interact with the file system. We should try to do this in a way that avoids absolute file paths.
# This is not portable!
df <- read_csv("/Users/kjhealy/Documents/data/misc/project/data/mydata.csv")
  • We should also do it in a way that is platform independent.
  • This makes it easier to share your work, move it around, etc. Projects should be self-contained.

Where’s my data? Using here()

The here package and its here() function build paths relative to the top level of your R project.

here() # this path will be different for you
[1] "/Users/kjhealy/Documents/courses/socdata.co"
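
A small sketch of building a file path with here(). The folder and file names are the ones used for this seminar's project; on your machine the file may not exist yet, so the last line simply reports TRUE or FALSE.

```r
library(here)

# Build a path from its parts; here() supplies the project root and the
# separators, so the same call works on macOS, Windows, and Linux
data_path <- here("files", "data", "organdonation.csv")
data_path

# Check that the file is where you think it is before trying to read it
file.exists(data_path)
```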

Where’s the data? Using here()

This seminar’s files all live in an RStudio project. It looks like this:

/Users/kjhealy/Documents/courses/socdata.co
├── R
├── README.qmd
├── _extensions
├── _freeze
├── _quarto.yml
├── _site
├── _targets
├── _targets.R
├── _variables.yml
├── about.qmd
├── assets
├── assignment
├── content
├── data
├── deploy.sh
├── example
├── files
├── fonts
├── images
├── index.qmd
├── projects
├── renv
├── renv.lock
├── schedule
├── slides
├── socdata.co.Rproj
├── staging
└── syllabus

I want to load files from the data folder, but I also want you to be able to load them. I’m writing this from somewhere deep in the slides folder, but you won’t be there. Also, I’m on a Mac, but you may not be.

Where’s the data? Using here()

So:

## Load the file relative to the path from the top of the project, without separators, etc
organs <- read_csv(file = here("files", "data", "organdonation.csv"))

Where’s the data? Using here()

organs
# A tibble: 238 × 21
   country  year donors   pop pop.dens   gdp gdp.lag health health.lag pubhealth
   <chr>   <dbl>  <dbl> <dbl>    <dbl> <dbl>   <dbl>  <dbl>      <dbl>     <dbl>
 1 Austra…    NA  NA    17065    0.220 16774   16591   1300       1224       4.8
 2 Austra…  1991  12.1  17284    0.223 17171   16774   1379       1300       5.4
 3 Austra…  1992  12.4  17495    0.226 17914   17171   1455       1379       5.4
 4 Austra…  1993  12.5  17667    0.228 18883   17914   1540       1455       5.4
 5 Austra…  1994  10.2  17855    0.231 19849   18883   1626       1540       5.4
 6 Austra…  1995  10.2  18072    0.233 21079   19849   1737       1626       5.5
 7 Austra…  1996  10.6  18311    0.237 21923   21079   1846       1737       5.6
 8 Austra…  1997  10.3  18518    0.239 22961   21923   1948       1846       5.7
 9 Austra…  1998  10.5  18711    0.242 24148   22961   2077       1948       5.9
10 Austra…  1999   8.67 18926    0.244 25445   24148   2231       2077       6.1
# ℹ 228 more rows
# ℹ 11 more variables: roads <dbl>, cerebvas <dbl>, assault <dbl>,
#   external <dbl>, txp.pop <dbl>, world <chr>, opt <chr>, consent.law <chr>,
#   consent.practice <chr>, consistent <chr>, ccode <chr>

And there it is.

read_csv() has variants

  • read_csv() Field separator is a comma: ,
organs <- read_csv(file = here("files", "data", "organdonation.csv"))
  • read_csv2() Field separator is a semicolon: ;
# Example only
my_data <- read_csv2(file = here("data", "my_euro_file.csv"))

Both are special cases of read_delim()
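
To make that concrete, here is a sketch that reads the same semicolon-separated text twice, once with read_csv2() and once with read_delim(). The data is invented for illustration, and the locale settings shown are my reading of what read_csv2() assumes (comma as decimal mark, period as grouping mark).

```r
library(readr)

# Invented example: semicolon-separated fields with decimal commas
euro <- I("city;temp\nParis;12,5\nRome;18,25\n")

# read_csv2() assumes ';' between fields and ',' as the decimal mark
a <- read_csv2(euro)

# The same thing spelled out with read_delim()
b <- read_delim(euro,
                delim = ";",
                locale = locale(decimal_mark = ",", grouping_mark = "."))

a$temp
b$temp
```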

Other species are also catered to

  • read_tsv() Tab separated.
  • read_fwf() Fixed-width files.
  • read_log() Log files (i.e. computer log files).
  • read_lines() Just read in lines, without trying to parse them.

Also often useful …

  • read_table()

For data that’s separated by one (or more) columns of space.

And for foreign file formats …

The haven package provides

  • read_dta() Stata
  • read_spss() SPSS
  • read_sas() SAS
  • read_xpt() SAS Transport

Make these functions available with library(haven)

The readxl package provides

  • read_xlsx() Modern Excel files
  • read_xls() Older Excel files
  • Plus a suite of functions for dealing with e.g. tabbed spreadsheets
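
For tabbed spreadsheets, a minimal sketch using one of the example workbooks that ships with readxl, so nothing needs to be downloaded. Which sheet to read is an arbitrary choice here.

```r
library(readxl)

# readxl ships with a few example workbooks; this one has several sheets
path <- readxl_example("datasets.xlsx")

# List the sheet (tab) names
excel_sheets(path)

# Read one sheet by name; read_excel() works out whether the file
# is .xls or .xlsx
mtcars_xl <- read_excel(path, sheet = "mtcars")
mtcars_xl
```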

You can read files remotely, too

  • You can give these functions local files, or they can also be pointed at URLs.
  • Compressed files (.gz, .bz2, .xz, .zip) will be automatically uncompressed.
  • (Be careful what you download from remote locations!)
organ_remote <- read_csv("https://kjhealy.co/organdonation.csv")
organ_remote
# A tibble: 238 × 21
   country  year donors   pop pop.dens   gdp gdp.lag health health.lag pubhealth
   <chr>   <dbl>  <dbl> <dbl>    <dbl> <dbl>   <dbl>  <dbl>      <dbl>     <dbl>
 1 Austra…    NA  NA    17065    0.220 16774   16591   1300       1224       4.8
 2 Austra…  1991  12.1  17284    0.223 17171   16774   1379       1300       5.4
 3 Austra…  1992  12.4  17495    0.226 17914   17171   1455       1379       5.4
 4 Austra…  1993  12.5  17667    0.228 18883   17914   1540       1455       5.4
 5 Austra…  1994  10.2  17855    0.231 19849   18883   1626       1540       5.4
 6 Austra…  1995  10.2  18072    0.233 21079   19849   1737       1626       5.5
 7 Austra…  1996  10.6  18311    0.237 21923   21079   1846       1737       5.6
 8 Austra…  1997  10.3  18518    0.239 22961   21923   1948       1846       5.7
 9 Austra…  1998  10.5  18711    0.242 24148   22961   2077       1948       5.9
10 Austra…  1999   8.67 18926    0.244 25445   24148   2231       2077       6.1
# ℹ 228 more rows
# ℹ 11 more variables: roads <dbl>, cerebvas <dbl>, assault <dbl>,
#   external <dbl>, txp.pop <dbl>, world <chr>, opt <chr>, consent.law <chr>,
#   consent.practice <chr>, consistent <chr>, ccode <chr>

You can read files remotely, too

  • Unfortunately readxl does not support getting data from remote URLs.
  • We can work around this. But it is annoying because …

Wide Remote Data: Topical Edition

https://kjhealy.co/sd/enduse_imports.xlsx

get_excel_file <- function(url) {
  httr::GET(url, httr::write_disk(tf <- tempfile(fileext = ".xlsx")))
  readxl::read_xlsx(tf) 
}

enduse <- get_excel_file("https://kjhealy.co/sd/enduse_imports.xlsx")
enduse
# A tibble: 22,042 × 14
   CTY_CODE CTY_DESC    END_USE COMM_DESC    value_14 value_15 value_16 value_17
   <chr>    <chr>       <chr>   <chr>           <dbl>    <dbl>    <dbl>    <dbl>
 1 0000     World Total 00000   Green coffee  5.23e 9  5.12e 9  4.79e 9  5.18e 9
 2 0000     World Total 00010   Cocoa beans   1.31e 9  1.43e 9  1.29e 9  1.19e 9
 3 0000     World Total 00020   Cane and be…  1.60e 9  1.74e 9  1.77e 9  1.66e 9
 4 0000     World Total 00100   Meat produc…  1.21e10  1.28e10  1.07e10  1.10e10
 5 0000     World Total 00110   Dairy produ…  1.95e 9  2.14e 9  2.02e 9  1.95e 9
 6 0000     World Total 00120   Fruits, fro…  1.46e10  1.58e10  1.71e10  1.83e10
 7 0000     World Total 00130   Vegetables    1.09e10  1.13e10  1.25e10  1.28e10
 8 0000     World Total 00140   Nuts          2.39e 9  2.80e 9  2.90e 9  3.33e 9
 9 0000     World Total 00150   Food oils, …  7.00e 9  6.05e 9  6.22e 9  6.85e 9
10 0000     World Total 00160   Bakery prod…  9.34e 9  9.65e 9  1.07e10  1.11e10
# ℹ 22,032 more rows
# ℹ 6 more variables: value_18 <dbl>, value_19 <dbl>, value_20 <dbl>,
#   value_21 <dbl>, value_22 <dbl>, value_23 <dbl>

Let’s transform it to long format.
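
One way to do it, sketched on a small invented tibble with the same layout as enduse (identifier columns followed by yearly columns named value_14, value_15, and so on). The numbers are made up, and reading the two-digit suffix as a year after 2000 is an assumption based on the column names.

```r
library(tidyr)
library(dplyr)
library(tibble)

# Invented stand-in with the same layout as enduse
enduse_toy <- tribble(
  ~CTY_CODE, ~CTY_DESC,     ~END_USE, ~COMM_DESC,     ~value_14, ~value_15, ~value_16,
  "0000",    "World Total", "00000",  "Green coffee", 10,        11,        12,
  "0000",    "World Total", "00010",  "Cocoa beans",  20,        21,        22
)

enduse_long <- enduse_toy |>
  pivot_longer(cols = starts_with("value_"),
               names_to = "year",
               names_prefix = "value_",    # drop the "value_" prefix
               values_to = "value") |>
  mutate(year = 2000L + as.integer(year))  # "14" becomes 2014

enduse_long
```

With the real data, the same pipeline should apply with enduse in place of enduse_toy.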

A Plot’s Components

What we need our code to make

  • Data represented by visual elements;
  • like position, length, color, and size;
  • Each measured on some scale;
  • Each scale with a labeled guide;
  • With the plot itself also titled and labeled.

How does ggplot do this?

ggplot’s flow of action

Here’s the whole thing, start to finish. We’ll go through it step by step.

[Figure: ggplot’s flow of action, from what we start with to where we’re going, showing the core steps and the optional steps]

ggplot’s flow of action: required

  • Tidy data
  • Aesthetic mappings
  • Geom

Let’s go piece by piece

Start with the data

gapminder
# A tibble: 1,704 × 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# ℹ 1,694 more rows
dim(gapminder)
[1] 1704    6

Create a plot object

Data is the gapminder tibble.

p <- ggplot(data = gapminder)

Map variables to aesthetics

Tell ggplot the variables you want represented by visual elements on the plot

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))

Map variables to aesthetics

The mapping = aes(...) call links variables to things you will see on the plot.

x and y represent the quantities determining position on the x and y axes.

Other aesthetic mappings can include, e.g., color, shape, size, and fill.

Mappings do not directly specify the particular, e.g., colors, shapes, or line styles that will appear on the plot. Rather, they establish which variables in the data will be represented by which visible elements on the plot.

p has data and mappings but no geom

p

This empty plot has no geoms.

Add a geom

p + geom_point() 

A scatterplot of Life Expectancy vs GDP

Try a different geom

p + geom_smooth() 

Life Expectancy vs GDP, using a smoother.

Build your plots layer by layer

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y=lifeExp))
p + geom_smooth()

Life Expectancy vs GDP, using a smoother.

This process is additive

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y=lifeExp))
p + geom_smooth() +
  geom_point()

Every geom is a function

Functions take arguments

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))
p + geom_point() + 
  geom_smooth(method = "lm") 

Keep Layering

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))
p + geom_point() +
    geom_smooth(method = "lm") +
    scale_x_log10()

Fix the labels

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y=lifeExp))
p + geom_point() +
    geom_smooth(method = "lm") +
    scale_x_log10(labels = scales::label_dollar())

Add labels, title, and caption

p <- ggplot(data = gapminder, 
            mapping = aes(x = gdpPercap, 
                          y = lifeExp))
p + geom_point() + 
  geom_smooth(method = "lm") +
    scale_x_log10(labels = scales::label_dollar()) +
    labs(x = "GDP Per Capita", 
         y = "Life Expectancy in Years",
         title = "Economic Growth and Life Expectancy",
         subtitle = "Data points are country-years",
         caption = "Source: Gapminder.")

Mapping vs Setting your plot’s aesthetics

“Can I change the color of the points?”

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp,
                          color = "purple"))

## Put in an object for convenience
p_out <- p + geom_point() +
    geom_smooth(method = "loess") +
    scale_x_log10()

What has gone wrong here?

p_out

Try again

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))

## Put in an object for convenience
p_out <- p + geom_point(color = "purple") +
    geom_smooth(method = "loess") +
    scale_x_log10()

Try again

p_out
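
The difference between the two attempts can be checked directly. This sketch drops the smoother to keep things quick and uses ggplot2's layer_data() to look at the colors actually computed for the points. Inside aes(), "purple" is treated as a variable with one value and handed to the default color scale; outside aes(), it is used as a literal color.

```r
library(ggplot2)
library(gapminder)

# Mapped: "purple" is treated as data, a variable with a single value
p_mapped <- ggplot(data = gapminder,
                   mapping = aes(x = gdpPercap,
                                 y = lifeExp,
                                 color = "purple")) +
  geom_point()

# Set: "purple" is passed to the geom as a literal color
p_set <- ggplot(data = gapminder,
                mapping = aes(x = gdpPercap,
                              y = lifeExp)) +
  geom_point(color = "purple")

# The colors ggplot computed for the points in each plot
unique(layer_data(p_mapped)$colour)  # the color scale's default, not purple
unique(layer_data(p_set)$colour)     # "purple"
```

This is also why the first version draws a legend with a single entry labeled “purple”: mapped variables get a scale and a guide, set values do not.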

Geoms can take many arguments

  • Here we set alpha, color, and linewidth. Meanwhile x and y are mapped.
  • We also give non-default values to some other arguments (se and method).
p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp)) 
p_out <- p + geom_point(alpha = 0.3) +
    geom_smooth(color = "orange", 
                se = FALSE, 
                linewidth = 8, 
                method = "lm") +
    scale_x_log10()

Geoms can take many arguments

p_out

alpha for overplotting

p <- ggplot(data = gapminder, 
            mapping = aes(x = gdpPercap, 
                          y = lifeExp))
p + geom_point(alpha = 0.3) + 
  geom_smooth(method = "lm") +
    scale_x_log10(labels = scales::label_dollar()) +
    labs(x = "GDP Per Capita", 
         y = "Life Expectancy in Years",
         title = "Economic Growth and Life Expectancy",
         subtitle = "Data points are country-years",
         caption = "Source: Gapminder.")

Map or Set values per geom

Geoms can take their own mappings

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp,
                          color = continent,
                          fill = continent))
p + geom_point() +
    geom_smooth(method = "loess") +
    scale_x_log10(labels = scales::label_dollar())

Geoms can take their own mappings

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap,
                          y = lifeExp))
p + geom_point(mapping = aes(color = continent)) +
    geom_smooth(method = "loess") +
    scale_x_log10(labels = scales::label_dollar())

Pay attention to which scales and guides are drawn, and why

Guides and scales reflect aes() mappings

  • mapping = aes(color = continent, fill = continent)

Guides and scales reflect aes() mappings

  • mapping = aes(color = continent, fill = continent)

  • mapping = aes(color = continent)

Remember: Every mapped variable has a scale

Saving your work

Use ggsave()

## Save the most recent plot
ggsave(filename = "figures/my_figure.png")

## Use here() for more robust file paths
ggsave(filename = here("figures", "my_figure.png"))

## A plot object
p_out <- p + geom_point(mapping = aes(color = log(pop))) +
    scale_x_log10()

ggsave(filename = here("figures", "lifexp_vs_gdp_gradient.pdf"), 
       plot = p_out)

ggsave(here("figures", "lifexp_vs_gdp_gradient.png"), 
       plot = p_out, 
       width = 8, 
       height = 5)
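
A self-contained sketch of the same idea that you can run anywhere. It assumes the target folder may not exist yet, so it creates it first. The temporary folder and the file name demo_plot.png are placeholders; in a project you would more likely use something like here("figures").

```r
library(ggplot2)
library(gapminder)

# A temporary folder keeps this sketch from touching your project
fig_dir <- file.path(tempdir(), "figures")
dir.create(fig_dir, recursive = TRUE, showWarnings = FALSE)

p_demo <- ggplot(data = gapminder,
                 mapping = aes(x = gdpPercap,
                               y = lifeExp)) +
  geom_point(alpha = 0.3) +
  scale_x_log10()

out_file <- file.path(fig_dir, "demo_plot.png")

# width and height are in inches by default;
# the file type is taken from the extension
ggsave(filename = out_file, plot = p_demo, width = 8, height = 5)

file.exists(out_file)
```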

In code chunks

Set options in any chunk:

RMarkdown Style


```{r, fig.height=8, fig.width=5, fig.show="hold", fig.cap="A caption"}
gapminder |> 
  ggplot(mapping = aes(x = gdpPercap, y = lifeExp)) + 
  geom_point()
```

Quarto Style


```{r}
#| fig.height: 8
#| fig.width: 5
#| fig.show: "hold"
#| fig.cap: "A caption"

gapminder |> 
  ggplot(mapping = aes(x = gdpPercap, y = lifeExp)) + 
  geom_point()

```

Or for the whole document:

knitr::opts_chunk$set(warning = TRUE,
                        message = TRUE,
                        fig.retina = 3,
                        fig.align = "center",
                        fig.asp = 0.7,
                        dev = c("png", "pdf"))

Getting Help

How to read an R Help page
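
A short sketch of how to get to a help page in the first place, using geom_smooth() as an arbitrary example. Help pages follow a common layout, with Usage, Arguments, and Examples sections among others; args() prints the same signature that the Usage section shows.

```r
library(ggplot2)

# Open the help page for a function (two equivalent forms)
?geom_smooth
help("geom_smooth", package = "ggplot2")

# The Usage section lists the arguments and their defaults;
# args() prints the same signature at the console
args(geom_smooth)

# Argument names only
names(formals(geom_smooth))
```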
