Formatting + Summary Statistics

Solutions

Below, we want to accomplish the following:

Packages

We put ## in front of the text we want to turn into a section header. The more # symbols we add, the smaller the header text becomes!

#| marks a code chunk option. We need to make sure that there is no space between # and |, and that the option lines come first in the code chunk, before any code.

We can turn messages off by using message: false as seen above.

We can use echo: false to hide code (but it still runs in the background!)

We can use eval: false to stop the code chunk from running (but the code will still show in the document).
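As a rough sketch of how these options sit inside a chunk, the body of an R code chunk might look like the following. The particular combination of options and the `mpg_summary` name are illustrative assumptions, not taken from this document.

```r
#| message: false
#| echo: false

# The #| lines above are chunk options: no space between # and |,
# and they come before any code in the chunk.
# With echo: false, this code is hidden in the rendered document but still runs.
mpg_summary <- summary(mtcars$mpg)
mpg_summary
```

Swapping `echo: false` for `eval: false` would do the opposite: the code would be shown but not run.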

We bold text by putting two stars on each side of what we want to bold! We italicize using one star on each side.

Data

glimpse(mtcars)
Rows: 32
Columns: 11
$ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…

Summary Statistics

Packages

mtcars

Pull up the help file for summarise using ?summarise in the Console. Read the description and the list of useful functions, and then scroll down to the examples. Copy the first example into the code chunk below and run it. What is it doing? Practice reading this code as a sentence!

Note: There are two different pipes in R: |> and %>%. They have identical functionality within the scope of this course. I will be using the |> pipe, as it has some computational benefits beyond the scope of 511.
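To see the two pipes side by side, here is a minimal sketch assuming the dplyr package is installed (dplyr makes `%>%` available). The names `native_result` and `magrittr_result` are made up for illustration.

```r
library(dplyr)

# Same summary written with the native pipe |> ...
native_result <- mtcars |>
  summarise(mean_disp = mean(disp), n_count = n())

# ... and with the magrittr pipe %>%
magrittr_result <- mtcars %>%
  summarise(mean_disp = mean(disp), n_count = n())

# Compare the two results
identical(native_result, magrittr_result)
```

Both versions read the same way: take mtcars, and then summarise it.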

mtcars |>
  summarise(mean_disp = mean(disp), n_count = n())
  mean_disp n_count
1  230.7219      32

Note: we should not give columns the same names as common functions. This is why the count column is called n_count rather than n.

Now, let’s go through the second example together!

mtcars |> #and then
  group_by(cyl) |> #group by cyl (4,6,8)
  summarise(mean_displacement = mean(disp), n_count = n()) #calculate summary statistics
# A tibble: 3 × 3
    cyl mean_displacement n_count
  <dbl>             <dbl>   <int>
1     4              105.      11
2     6              183.       7
3     8              353.      14

We can also group by more than one variable.

mtcars |> #and then
  group_by(cyl, vs) |> #group by cyl (4,6,8) and vs (0,1)
  summarise(mean_displacement = mean(disp), n_count = n()) #calculate summary statistics
`summarise()` has grouped output by 'cyl'. You can override using the `.groups`
argument.
# A tibble: 5 × 4
# Groups:   cyl [3]
    cyl    vs mean_displacement n_count
  <dbl> <dbl>             <dbl>   <int>
1     4     0              120.       1
2     4     1              104.      10
3     6     0              155        3
4     6     1              205.       4
5     8     0              353.      14
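The message in the output above says that the result is still grouped by cyl. One way to handle this, sketched here assuming dplyr is loaded, is to set the `.groups` argument of `summarise()` explicitly. The name `by_cyl_vs` is hypothetical, used only for illustration.

```r
library(dplyr)

by_cyl_vs <- mtcars |> #and then
  group_by(cyl, vs) |> #group by cyl and vs
  summarise(
    mean_displacement = mean(disp),
    n_count = n(),
    .groups = "drop" #return an ungrouped result, so no grouping message is needed
  )

by_cyl_vs
```

With `.groups = "drop"` the summary values are the same as before, but the result no longer carries the cyl grouping into later steps.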