2 Intro to R
We will begin our adventure by opening RStudio. If this is your first time opening RStudio, you should see the following panes:
- Console (entire left)
- Environment/History (upper right)
- Files/Plots/Packages/Help (lower right)
You can change the default location of the panes, among many other things: Customizing RStudio.
For now, place your cursor in the console so we can start coding with R!
2.1 Using interactive windows
Throughout this text, we will use interactive code blocks provided by the leanrR package. This way, you can run code without having to open RStudio!
Below you can see an example of an interactive R window. In the white window (under the button with the text Start Over) is where you will write code to be executed. To run code you have written, click the green Run Code button and observe how information pops up under the window. If no errors were made, this is where the answer will be returned. Error messages are shown in a red box, also under the learnR window.
Now spend a few minutes using the window below as a calculator and run simple calculations by clicking Run Code.
2.2 Objects
R is an object-oriented programming language. This means R creates different types of objects that we can manipulate with functions and operators.
To create an object in R, we can assign a value to an object using an assignment operator using either a left arrow <-
or an equal sign =
. Click the “Run Code” button to get started and play around with the code!
In plain English, the above snippet tells us that “five times ten is assigned to my_object”.
By convention, we use <-
to assign variables. Don’t be lazy and use =
to assign variables. Although it will work, it will just sow confusion later. Code is miserable to read on a good day. Give your eyes a break and use <-
.
Although object names are flexible, we need to follow some rules:
- Object names cannot start with a digit and cannot contain certain other characters such as a comma or a space.
- As a general rule of thumb, object names should be short and meaningful. Misleading or overly long object names will make it a pain to debug your code.
Below are examples of various object name conventions. My best advice would be to pick one and stick with it.
this_is_snake_case
other.people.use.periods
evenOthersUseCamelCase
Let’s make another assignment:
<- 2.5 this_is_a_really_long_name
To inspect the object we’ve just created, try out RStudio’s auto-complete feature: type the first few characters, press TAB
, add characters until you get what you want, then press return/enter.
2.3 Functions
We will use functions in most of our work with R, either pre-written or ones we write ourselves. Functions help us easily repeat instructions and carry out multiple tasks in a single step, saving us a lot of space in our code.
You can call functions like this:
functionName(arg1 = val1, arg2 = val2, ...)
Notice that we use =
instead of <-
within a function. Here, arg1
and arg2
are the arguments of the function. Likewise, val1
and val2
are the parameters of arg1
and arg2
.
Let’s try using seq()
which makes regular sequences of numbers:
seq(1, 10)
## [1] 1 2 3 4 5 6 7 8 9 10
The above snippet also demonstrates something about how R resolves function arguments. You can always specify name = value
if you’re unsure. If you don’t, R attempts to resolve by position. In the above snippet, R assumed we wanted a sequence from = 1
that goes to = 10
.
As an exercise, try creating a sequence of numbers from 1 to 10 by increments of 2:
If you just make an assignment, you don’t see the assigned value. To show the assigned value, just call the variable.
<- seq(1, 10)
one_to_ten one_to_ten
## [1] 1 2 3 4 5 6 7 8 9 10
You can shorten this common action by surrounding the assignment with parentheses.
<- seq(1, 10)) (one_to_ten
## [1] 1 2 3 4 5 6 7 8 9 10
Not all functions have (or require) arguments:
date()
## [1] "Tue Feb 8 15:02:12 2022"
If you’ve been following along in RStudio, look at your workspace (in the upper right pane.) The workspace is where user-defined objects accumulate. You can also get a listing of these objects with commands:
objects()
ls()
If you want to remove the object named one_to_ten
, you can do this:
rm(one_to_ten)
To remove everything:
rm(list = ls())
or click the broom icon in RStudio’s Environment pane.
2.4 Math operators
Here are some basic math operations you can perform in R. Try playing around with them in the interactive window.
2.5 Conditionals
Conditional statements check if a condition is true or false using logical operators (operators that return either TRUE
or FALSE
). For example:
20 == 10*2
## [1] TRUE
"hello" == "goodbye"
## [1] FALSE
These statements return a value is of type "logical"
, which is either TRUE
if the condition is satisfied, or FALSE
if the condition is not satisfied. One important note is that TRUE
and FALSE
are objects on their own, rather than the strings “true” and “false”.
Conditional statements are made with a range of logical operators. Here are some examples:
Operator | Plain English |
---|---|
== |
is equal to |
!= |
is not equal to |
< or > |
is less than OR is greater than |
<= or >= |
is less than or equal to OR is greater than or equal to |
is.na() |
is an NA value |
There are other logical operators, including %in%
, which checks if a value is present in a vector of possible values. Try playing around with the following statements and checking their output by running the code.
If you’re keen, you’ll notice in the last line that we use c()
to group objects together. This data structure is called a vector. As a brief introduction, vectors combine objects of the same type. Don’t worry too much about the specifics of vectors, as we will cover it in much greater depth in the next chapter.
We can also combine conditions using the logical and (&
) along with the logical or (|
). The logical &
returns TRUE
if and only if both conditions are true, and it returns FALSE
otherwise. Let’s look at the following examples:
# is (5 greater than 2) AND (6 greater than 10)?
5 > 2) & (6 >= 10) (
## [1] FALSE
# is (5 greater than 2) OR (6 greater than 10)?
5 > 2) | (6 >= 10) (
## [1] TRUE
2.5.1 If statements
Conditional statements generate logical values to filter inputs. if
statements use conditional statements to control flow of a program. Below is the general form of an if
statement:
if (the conditional statement is TRUE) {
do something }
Let’s look at an example:
Try assigning 6 to x
and predict the output.
Although an if
statement alone is handy, we often want to check multiple conditions. We can add more conditions and associated actions with else if
statements.
Suppose we want to send an automated message to our friends. Here’s how we can do it:
<- "Jasmine"
friend if (friend == "Jason") {
<- "Hi, Jason!"
msg else if (friend == "Jasmine") {
} <- "How are you, Jasmine?"
msg
} msg
## [1] "How are you, Jasmine?"
We can specify what to do if none of the conditions are TRUE
by using else
on its own. Try modifying the code below to print “Stranger danger!” if our friend’s name isn’t “Jason” or “Jasmine”.
2.6 Working directory
Any process running on your computer has a notion of its “working directory”. By default in R, a working directory is where R will look for files you ask it to load. It is also where any files you write to disk will go. You can explicitly get your working directory with the getwd()
function:
getwd()
The working directory is also displayed at the top of the RStudio console.
You can set your working directory at the command line like so:
setwd("path-to-my-directory/")
The setwd()
function is extremely useful for times you want to read in external data, such as a .csv file.
2.6.1 Other important things
Below is a collection of important miscellaneous items to consider.
- R scripts are usually saved with a
.R
or.r
suffix. - Comments start with one or more
#
symbols. Use them. RStudio helps you (de)comment selected lines withCtrl
+Shift
+C
(Windows and Linux) orCommand
+Shift
+C
(Mac). - Clean out the workspace. You can do so by clicking the broom icon or by typing
rm(list = ls())
into the console.
This workflow will serve you well in the future:
- Create an RStudio project for an analytical project.
- Keep inputs there (we’ll soon talk about importing).
- Keep scripts there; edit them, run them in bits or as a whole from there.
- Keep outputs there.
Avoid using your mouse for your workflow. Firstly, using the keyboard is faster. Secondly, writing code instead of clicking helps with reproducibility. That is, it will be much easier to retrospectively determine how a numerical table or PDF was actually produced.
Many experienced users never save the workspace, never save .RData
files (I’m one of them), and never save or consult the history. Once/if you get to that point, there are options available in RStudio to disable the loading of .RData
and permanently suppress the prompt on exit to save the workspace (go to Tools > Options > General).