Functions + Arguments
A walkthrough of what they are and what to think about
Functions
A function is an action that we want to perform, typically on some sort of data structure. When we call a function, its name is always followed by an opening and a closing parenthesis.
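We need to be careful about spacing when using functions. Specifically, the function name and its () should not have a space between them. R will usually still run the code if a space is there, but it is harder to read and goes against the tidyverse style guide.
summarise () # avoid: space between the function name and ()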
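As a small illustration of why the parentheses matter, here is a minimal sketch using base R's mean(). The names x and result are made up for this example.

```r
# x and result are hypothetical names used only for this sketch
x <- c(2, 4, 6)

# Without parentheses, mean is just the function object; nothing is calculated
is.function(mean)

# With parentheses directly after the name, the function is called on x
result <- mean(x)
result
```

The parentheses are what tell R to actually perform the action.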
Arguments
Arguments are the values we provide to a function to control what it does. In R, a function can take several arguments, which are separated by commas. In the following example, kable() is the function, and it has the argument digits = 3. Arguments go inside the function's parentheses, and do not need their own set of parentheses.
kable(digits = 3)
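To see arguments in action without any extra packages, here is a minimal sketch using base R's round(), which also has a digits argument. The names one_arg and two_args are made up for this example.

```r
# one_arg and two_args are hypothetical names used only for this sketch

# One argument: the number to round (digits defaults to 0)
one_arg <- round(3.14159)

# Two arguments, separated by a comma; neither needs its own parentheses
two_args <- round(3.14159, digits = 2)

one_arg   # 3
two_args  # 3.14
```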
Note: Functions can be passed as arguments to other functions! We see this a lot when calculating summary statistics and making graphs.
mtcars |>
  group_by(cyl) |>
  summarise(mean_disp = mean(disp),
            n = n())
summarise() is a function. However, we are also using other functions (as arguments within summarise()) to calculate our statistics! mean() is a function. So is n(). These are functions because they are actions we want to take, and each needs its own set of opening and closing parentheses.
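The same nesting idea can be sketched with base R alone. In this example, which is an illustration rather than part of the original pipeline, mean() is used inside round(); the names inner, two_steps, and nested are made up.

```r
# inner, two_steps, and nested are hypothetical names used only for this sketch

# Two separate steps: calculate the mean, then round it
inner <- mean(mtcars$disp)
two_steps <- round(inner, digits = 1)

# One nested step: mean() sits inside round() as its first argument,
# and each function still has its own pair of parentheses
nested <- round(mean(mtcars$disp), digits = 1)

identical(two_steps, nested)  # TRUE
```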
Parentheses
Un-closed parentheses can cause a lot of headaches. I suggest thinking about the following:
– Understand that every function needs a set of parentheses (one to open; one to close)
– Understand that arguments go inside functions, and do not require parentheses of their own
– Clicking next to a parenthesis in your editor shows which parenthesis closes it.
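One way to see what an un-closed parenthesis does is to ask R whether a piece of code is complete before running it. The sketch below stores two snippets as text and checks them with base R's parse(). The helper is_complete() and the names unclosed and closed are made up for this illustration.

```r
# Hypothetical snippets stored as text so we can check them without running them
unclosed <- "mean(c(1, 2, 3)"    # one ( is never closed
closed   <- "mean(c(1, 2, 3))"   # every ( has a matching )

# is_complete() is a helper written for this sketch, not a built-in function
is_complete <- function(code) {
  tryCatch({
    parse(text = code)   # parse() raises an error if the code is incomplete
    TRUE
  }, error = function(e) FALSE)
}

is_complete(unclosed)  # FALSE
is_complete(closed)    # TRUE
```

Counting the opening and closing parentheses in each snippet gives the same answer: the first has two ( and only one ).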
Walkthrough
mtcars |>
  group_by(cyl) |>
  summarise(mean_disp = mean(disp),
            n = n())
We know that group_by() is a function (action) we are performing on mtcars, with the argument cyl going inside the parentheses. We make sure the function is closed before piping to the next line.
summarise() is an action we want to perform (i.e. a function), so it has parentheses. The name of the new data frame column (i.e. mean_disp =) is something passed into the function, making it part of an argument with no parentheses of its own. We nest a function within summarise() to calculate the mean. Calculating the mean is an action, hence a function with parentheses; disp is the argument passed to it. We separate arguments with a , as seen above. The same logic applies to n = n().
Notice that there is an ending ) after the function n(). This closes summarise(), and is necessary for our code to work.
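The walkthrough can also be sketched in two separate steps, which may make it easier to see where each function opens and closes. This assumes the dplyr package is installed and R is version 4.1 or later (for the |> pipe). The intermediate names grouped and result are made up for this example.

```r
library(dplyr)

# grouped and result are hypothetical names used only for this sketch

# Step 1: group_by() opens and closes on the same line, with cyl inside
grouped <- mtcars |>
  group_by(cyl)

# Step 2: summarise() opens, takes two arguments separated by a comma,
# and is closed by the final ) on its own line
result <- grouped |>
  summarise(
    mean_disp = mean(disp),
    n = n()
  )

result
```

Putting the closing ) of summarise() on its own line is a layout choice, not a requirement; it just makes the final parenthesis easier to spot.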