Commit db6acec

Merge pull request #50 from UtrechtUniversity/dorien-changes
Update materials
2 parents a69cd76 + 87074ac commit db6acec

7 files changed: 126 additions, 85 deletions

book/slides/slides_introduction.html

Lines changed: 48 additions & 37 deletions
Large diffs are not rendered by default.

book/slides/slides_introduction.qmd

Lines changed: 40 additions & 27 deletions
@@ -627,7 +627,8 @@ df[df$name=="Bob", "age"]
 
 ## Answers to exercise 5
 
-1. From your dataframe `df`, return complete rows for everyone living in a country of your choice.
+1. From your dataframe `df`, return all columns for everyone living in a country of your choice.
+
 ```{r}
 #| label: exercise-5.1
 df[df$country=="UK", ]
@@ -844,6 +845,23 @@ if(number >= 18){
 print(age_category)
 ```
 
+Bonus exercise: expand the if-else statement to assign "toddler" if number is smaller than 2:
+
+```{r}
+#| label: exercise-7-bonus
+number <- 8
+
+if(number >= 18){
+age_category <- "adult"
+} else if(number < 2) {
+age_category <- "toddler"
+} else {
+age_category <- "minor"
+}
+
+print(age_category)
+```
+
 # Programming: functions {background-color=#FFCD00}
 
 ## Functions: a sequence
@@ -862,34 +880,9 @@ mean(df$age)
 mean(1:100)
 ```
 
-## Functions
-
-**Functions can also be used to make a complex line of code easier to write/read:**
-
-You write the function once:
-```{r}
-#| label: function-find-bobs-age
-find_bobs_age <- function(data){
-bobs_age <- data[data$name == "Bob", "age"]
-return(bobs_age)
-}
-```
-
-Now, every time you want to find Bob's age you use:
-```{r}
-#| label: use-find-bobs-age
-find_bobs_age(df)
-```
-
-. . .
-
-**Functions are the bread and butter of programming!**
-
-A good script will consist mostly of functions, with a minimal amount of code that applies the functions.
-
 ## Functions in R
 
-- To make a function, use the function `function()`:
+- To make a function yourself, use the function `function()`:
 ```r
 myFun <- function()
 ```
@@ -936,6 +929,26 @@ myFun(3, 4)
 myFun(90, 71)
 ```
 
+## Functions
+
+**Functions are the bread and butter of programming!**
+
+A good script will consist mostly of functions, with a minimal amount of code that applies the functions.
+
+. . .
+
+```r
+myFun <- function(arg1, arg2){
+multiplication <- arg1 * arg2
+return(multiplication)
+}
+```
+
+Note that:
+
+- arg1 and arg2 are **internal variables**: they do not exist outside of the function and they are used within the function to perform the code.
+- The output of a function is always spit out using `return` (**not** `print`).
+
 # Go to exercise 8
 
 ## Answers to exercise 8
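
Note on the new "Functions" slide in `slides_introduction.qmd`: the point about internal variables can be checked with a short sketch like the one below. It uses base R only. The `exists()` checks are an illustration added here and are not part of the slides; they assume no objects named `arg1` or `multiplication` were created earlier in the session.

```r
# Same function as on the new "Functions" slide
myFun <- function(arg1, arg2) {
  multiplication <- arg1 * arg2
  return(multiplication)
}

# The returned value can be stored and reused
result <- myFun(3, 4)
print(result)  # 12

# The internal variables are not visible outside the function
# (assumes nothing called `arg1` or `multiplication` exists in the session already)
print(exists("multiplication"))
print(exists("arg1"))
```

R also returns the value of the last evaluated expression when `return()` is left out; the explicit `return()` simply matches the style used in the slides.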

book/slides/slides_tidyverse.html

Lines changed: 1 addition & 0 deletions
@@ -2404,6 +2404,7 @@ <h2>Learn more</h2>
 <li>Statistical Inference via Data Science: A ModernDive into R and the Tidyverse: <a href="https://moderndive.com/" class="uri">https://moderndive.com/</a></li>
 <li>Big Book of R: <a href="https://www.bigbookofr.com/" class="uri">https://www.bigbookofr.com/</a></li>
 <li>Better spreadsheets: <a href="https://better-spreadsheets.netlify.app/" class="uri">https://better-spreadsheets.netlify.app/</a></li>
+<li><a href="https://bookdown.org/ndphillips/YaRrr/">YaRrr! The Pirate’s Guide to R</a></li>
 </ul>
 <p><strong>See also the <a href="https://utrechtuniversity.github.io/workshop-introduction-to-R-and-data/what-next.html">What’s next page</a>.</strong></p>
 </section>

book/slides/slides_tidyverse.qmd

Lines changed: 1 addition & 0 deletions
@@ -1008,6 +1008,7 @@ What have we learned this afternoon?
 - Statistical Inference via Data Science: A ModernDive into R and the Tidyverse: <https://moderndive.com/>
 - Big Book of R: <https://www.bigbookofr.com/>
 - Better spreadsheets: <https://better-spreadsheets.netlify.app/>
+- [YaRrr! The Pirate’s Guide to R](https://bookdown.org/ndphillips/YaRrr/)
 
 **See also the [What's next page](https://utrechtuniversity.github.io/workshop-introduction-to-R-and-data/what-next.html).**
 

book/what-next.qmd

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ Not finished learning? Feel free to check our website, [uu.nl/rdm](https://www.u
 - Statistical Inference via Data Science: A ModernDive into R and the Tidyverse: <https://moderndive.com/>
 - Big Book of R: <https://www.bigbookofr.com/>
 - Better spreadsheets: <https://better-spreadsheets.netlify.app/>
+- [YaRrr! The Pirate’s Guide to R](https://bookdown.org/ndphillips/YaRrr/)
 
 ### Data visualization
 

course-materials/baseR_exercises.Rmd

Lines changed: 17 additions & 2 deletions
@@ -96,7 +96,7 @@ Before you start, please run this code:
 rm(name,age,country)
 ```
 
-1. From your dataframe df, return complete rows for everyone living in a country of your choice.
+1. From your dataframe df, return all columns for everyone living in a country of your choice.
 
 2. Return only the names of everyone in your data frame df under 40.
 
@@ -131,7 +131,7 @@ is.na(NA)
 ### Exercise 7: If statement
 
 Make an if statement that tests if a number is larger than 18.
-If the number is larger, the variable age_category should be assigned the value "adult". If not, the variable age_category should be assigned the vaue "minor".
+If the number is larger, the variable age_category should be assigned the value "adult". If not, the variable age_category should be assigned the value "minor".
 
 ```{r}
 number <- 8
@@ -143,6 +143,21 @@ if(){
 print(age_category)
 ```
 
+BONUS exercise:
+
+Expand your if-else statement from above with an additional condition: if the number is less than 2, age_category should get the value "toddler" assigned.
+
+```{r}
+number <- 8
+
+if(){
+
+}
+
+print(age_category)
+```
+
+
 ### Exercise 8: Function
 
 Turn the if statement from the last exercise into a function. Let the user provide the value for number, and return the age_category.
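
Note on the reworded Exercise 5 item in `baseR_exercises.Rmd` ("return all columns"): the wording maps onto leaving the column position empty in `df[rows, columns]`. A minimal sketch, assuming a hypothetical toy data frame; the values below are made up for illustration and the workshop's own `df` may differ.

```r
# Hypothetical toy data; the workshop's own `df` may hold different values
df <- data.frame(
  name    = c("Ann", "Bob", "Chloe"),
  age     = c(35, 22, 50),
  country = c("UK", "USA", "UK")
)

# Row condition before the comma, nothing after it: every column is kept
uk_rows <- df[df$country == "UK", ]
print(uk_rows)

# Naming a column after the comma returns only that column
uk_names <- df[df$country == "UK", "name"]
print(uk_names)
```

The first call returns a data frame with all three columns; the second returns only the `name` values for the matching rows.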

course-materials/datascience_exercises.Rmd

Lines changed: 18 additions & 19 deletions
@@ -56,7 +56,7 @@ library(readxl)
 Use the function `read_excel()` to read a related data set `penguins_isotopes.xlsx` (an Excel file!), also located in your `data` folder, into R.
 
 ```{r}
-penguins_isotopes <- ???(path = "???")
+penguins_isotopes <- ???
 ```
 
 ## Exercise 3
## Exercise 3
@@ -68,9 +68,7 @@ For this function, you need to provide:
 - The name of the folder and the file name: "data/penguins_isotopes.csv"
 
 ```{r}
-???(penguins_isotopes,
-# Name of the file, including extension (.csv)
-file = "???")
+
 ```
 
 # Chapter 2: Selecting & Filtering Data
@@ -95,18 +93,20 @@ Use the function `select()` to include the right columns in our data subset.
 **Note** that in order to work with the subset from Exercise 4, we need to use that data frame (`penguin_subset`) as our data input!
 
 ```{r}
-penguins_subset <- select(???)
+penguins_subset <- ???
 ```
 
 
 ## Exercise 5
 
 For some of the penguins, the sex was not determined. Use the function `filter()` to keep only the rows where the column `Sex` specifies either "FEMALE" or "MALE".
 
+Use the `penguins_subset` dataframe that you created in the previous exercise as input.
+
 One way to do this (though not the only way) is to filter out the rows where `Sex` is NA. Remember the function `is.na()` and remember that to negate a condition, you use the operator! For example, `!is.na(NA)` returns `FALSE`.
 
 ```{r}
-penguins_subset_2 <- filter(penguins_subset, ???)
+penguins_subset_2 <- ???
 ```
 
 
@@ -117,17 +117,17 @@ Use the `mutate()` function to create a new column called `culmen_ratio`, which
 Make sure to work with the `penguins_subset_2` dataframe.
 
 ```{r}
-penguins_subset_3 <- mutate(penguins_subset_2, ???)
+penguins_subset_3 <- ???
 ```
 
 ## Exercise 7
 
-In the next chapter, we will make our dataframe tidy. Before we do that, we will rename the `Culmen_Length_mm` and `Culmen_Depth_mm` columns in order to make subsequent operations easier.
+In the next chapter, we will make our dataframe tidy. Before we do that, we will rename the `Culmen_Length_mm` column as `length` and `Culmen_Depth_mm` as `depth` in order to make subsequent operations easier.
+
+Use the `rename` function to rename these columns. Make sure to take `penguins_subset_3` as the input dataframe.
 
 ```{r}
-penguins_subset_4 <- rename(penguins_subset_3,
-length = ???,
-depth = ???)
+penguins_subset_4 <- ???
 ```
 
 # Exercise 8
@@ -157,8 +157,8 @@ For this, you need to specify:
 ```{r}
 penguins_long <- penguins_subset_5 %>%
 pivot_longer(cols = c(???, ???),
-names_to = "culmen_element",
-values_to = "measurement")
+??? = "culmen_element",
+??? = "measurement")
 ```
 
 ### Exercise 10
@@ -175,29 +175,28 @@ Make sure to feed the right data frame into the workflow: the column `culmen_ele
 
 ```{r}
 penguins_summary <- penguins_long %>%
-group_by(???) %>%
-summarize(avg = ???,
-sd = ???)
+??? %>%
+???
 ```
 
 ## Exercise 11
 
 Use the function `full_join()` to merge the `penguins_summary` and `penguins_long` data frames, so that each row in the long data frame will have additional columns with the mean and standard deviation for its group.
 
 ```{r}
-penguins_join <- full_join(???)
+penguins_join <- ???
 ```
 
 # Chapter 4: Data Visualization
 
 ## Exercise 12
 
 Using the ggplot2 package, let's plot the culmen length against their flipper length from the `penguins` dataframe.
-Culmen_Length_mm and Flipper_Length_mm are both continuous variables, and therefore we are now choosing a scatterplot.
+`Culmen_Length_mm` and `Flipper_Length_mm` are both continuous variables, and therefore we are now choosing a scatterplot.
 
 In the code chunk below:
 
-- Put Culmen_Length_mm on the x-axis, and Flipper_Length_mm on the y-axis
+- Put `Culmen_Length_mm` on the x-axis, and `Flipper_Length_mm` on the y-axis
 - Color the points according to Species
 - Add scatter points using the geom `geom_point()`
 - Add x- and y-labels and a title with `labs()`
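
Note on Exercise 5 in `datascience_exercises.Rmd`: now that the `filter()` scaffold is gone, the `!is.na()` hint carries more weight. A minimal base R sketch of how the negation behaves, assuming a hypothetical toy vector that stands in for the `Sex` column (the values are made up and no packages are needed):

```r
# Hypothetical toy vector standing in for the `Sex` column
sex <- c("FEMALE", NA, "MALE", NA)

# is.na() flags the missing values; ! flips each TRUE/FALSE
keep <- !is.na(sex)
print(keep)

# The logical vector can then be used to keep only the non-missing entries
print(sex[keep])
```

A condition of this kind, written on the column name, is what goes inside `filter()` after the data frame argument.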
