Commit f6d11b8

Merge pull request #43 from UtrechtUniversity/dorien-changes
first try at automation, content edits
2 parents 70db62f + 651cac4 commit f6d11b8

62 files changed: 406 additions and 337 deletions

.gitignore

Lines changed: 1 addition & 2 deletions
@@ -41,5 +41,4 @@ rsconnect/
 .Rproj.user
 
 # Project-specific files that need to be ignored
-data/penguins_isotopes.csv
-datascience_solutions_files/
+docs/

LICENSE.md

Lines changed: 0 additions & 8 deletions
This file was deleted.

book/_quarto.yml

Lines changed: 3 additions & 0 deletions
@@ -1,6 +1,9 @@
 project:
   type: book
   output-dir: ../docs
+  pre-render:
+    - "pre-rendering/zip-course-materials.R"
+    - "pre-rendering/render-slides.R"
 
 book:
   title: "Introduction to R & Data"

book/data-structures.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#data-structures" title="Data structures" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#data-structures)
+
 ## Video
 
 {{< video https://youtu.be/Ffk2Kxa_b_M >}}

book/data-visualization.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_tidyverse.html#data-visualization" title="Data visualization with ggplot2" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_tidyverse.html#data-visualization)
+
 ## Video
 
 {{< video https://vimeo.com/470862707 >}}

book/functions.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#programming-functions" title="Functions" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#programming-functions)
+
 ## Video
 
 {{< video https://youtu.be/P_qSXHyIUpQ >}}

book/if-statements.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#programming-if-statements" title="If-statements" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#programming-if-statements)
+
 ## Video
 
 {{< video https://youtu.be/ASVKW4dyLZI >}}

book/importing-data.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_tidyverse.html#importing-data" title="Importing data" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_tidyverse.html#importing-data)
+
 ## Video
 
 {{< video https://vimeo.com/470836273 >}}

book/indexing-dataframes.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#indexing-a-data-frame" title="Indexing dataframes" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#indexing-a-data-frame)
+
 ## Video
 
 {{< video https://youtu.be/m15hbXG6I-Y >}}

book/indexing-vectors-and-lists.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#indexing-vectors-lists" title="Indexing vectors and lists" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#indexing-vectors-lists)
+
 ## Video
 
 {{< video https://youtu.be/e10nO2swYIE >}}

book/introduction-tidyverse.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_tidyverse.html" title="Introduction to the Tidyverse" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_tidyverse.html)
+
 ## Video
 
 {{< video https://vimeo.com/470831693 >}}

book/introduction.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html" title="Introduction to R and Rstudio" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html)
+
 ## Video
 
 {{< video https://youtu.be/FFYSAUJ305o >}}

book/loops.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#programming-loops" title="For-loops" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#programming-loops)
+
 ## Video
 
 {{< video https://youtu.be/K4KSjizSJFk >}}

book/missing-data.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#missing-data" title="Missing data" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#missing-data)
+
 ## Video
 
 {{< video https://youtu.be/4gVvlg1Itzs >}}

book/piping.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_tidyverse.html#piping-operations" title="Piping operations" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_tidyverse.html#piping-operations)
+
 ## Exercise 8
 
 Now try out exercise 8 in `datascience_exercises.Rmd`.
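
The footnote added to the slides in this commit notes that base R's `|>` works very similarly to magrittr's `%>%`. A minimal sketch of that claim, assuming R 4.1.0 or later (where `|>` is available); the `values` vector is made up for illustration, and the magrittr comparison only runs if that package happens to be installed:

```r
# Made-up input, for illustration only
values <- c(4, 9, 16)

# Nested calls read inside-out ...
nested <- round(mean(sqrt(values)), 1)

# ... while the base R pipe reads left to right
piped <- values |> sqrt() |> mean() |> round(1)

# Both spellings give the same result
stopifnot(identical(nested, piped))

# The magrittr pipe behaves the same way for a simple chain like this
if (requireNamespace("magrittr", quietly = TRUE)) {
  `%>%` <- magrittr::`%>%`
  piped_magrittr <- values %>% sqrt() %>% mean() %>% round(1)
  stopifnot(identical(piped, piped_magrittr))
}
```

For simple left-to-right chains like this one the two pipes can be swapped freely; the tidyverse blog post linked in the footnote covers the cases where they differ.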

book/pre-rendering/render-slides.R

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+library(quarto)
+print("Let's render the Morning slides first")
+quarto_render("slides/slides_introduction.qmd")
+
+print("Moving onto the Tidyverse slides")
+quarto_render("slides/slides_tidyverse.qmd")
+
+print("Done rendering the slides! Now onto the book itself...")

book/pre-rendering/zip-course-materials.R

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+print("Zipping the course materials folder")
+
+zip(zipfile = "../course-materials.zip",
+files = "../course-materials")
+
+print("Course materials zipped!")
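
A slightly more defensive variant of the zipping step could look like the sketch below. It is not part of the commit: the function name `zip_course_materials()` and its checks are assumptions for illustration, and `utils::zip()` relies on an external `zip` program being available on the system.

```r
# Hypothetical helper, not part of the commit: zip a folder and fail loudly.
zip_course_materials <- function(src_dir, zip_path) {
  # Stop early if the folder to archive is missing
  if (!dir.exists(src_dir)) {
    stop("Folder to zip not found: ", src_dir)
  }
  # Start from a clean archive; the external zip program typically
  # updates an existing archive in place rather than replacing it
  if (file.exists(zip_path)) {
    file.remove(zip_path)
  }
  # utils::zip() returns the exit status of the external command
  status <- utils::zip(zipfile = zip_path, files = src_dir)
  if (!identical(as.integer(status), 0L)) {
    stop("zip exited with status ", status)
  }
  invisible(zip_path)
}
```

In the pre-render script this might be called as `zip_course_materials("../course-materials", "../course-materials.zip")`, so that a missing folder stops the render instead of passing silently.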

book/schedule.qmd

Lines changed: 2 additions & 2 deletions
@@ -11,9 +11,9 @@
 | 12:40 | Recap & Questions |
 | **12:45** | **Lunch break** |
 | 13:30 | Introduction to Tidyverse, Importing data: _Exercises 0-3_ |
-| 14:15 | Selecting & filtering data, Piping operations: _Exercises 4-8_ |
+| 14:15 | Selecting, filtering, mutating and renaming data, Piping operations: _Exercises 4-8_ |
 | 14:55 | Recap & Questions |
 | **15:00** | **Coffee break** |
 | 15:15 | Tidy data, Summarizing & combining data: _Exercises 9-11_ |
-| 16:00 | Data visualization: _Exercise 12_ |
+| 16:00 | Data visualization: _Exercise 12-13_ |
 | 16:55 | Final recap and closing |

book/slides/slides_tidyverse.html

Lines changed: 109 additions & 131 deletions
Large diffs are not rendered by default.

book/slides/slides_tidyverse.qmd

Lines changed: 30 additions & 31 deletions
@@ -7,12 +7,11 @@ execute:
 format:
   revealjs:
     pagetitle: "The data science workflow with Tidyverse"
-    theme: [default]
+    theme: default
     css: custom.css
     embed-resources: true
+    keep-md: false
     smaller: true
-  gfm:
-    mermaid-format: png
 ---
 
 # {background-color=#FFCD00}
@@ -468,7 +467,7 @@ penguins_subset_4 <- rename(penguins_subset_3,
 
 A key tidyverse component that chains all data science steps together:
 
-`%>%`
+`%>%`[^1]
 
 . . .
 
@@ -479,6 +478,10 @@ Why?
 - no need to save intermediate R objects with `<-`
 - easily add and/or delete steps in your pipeline without breaking the code
 
+[^1]: Base R now also has a piping operator: `|>`
+which [works very similarly](https://www.tidyverse.org/blog/2023/04/base-vs-magrittr-pipe/) to the magrittr piping operator `%>%`
+
+
 ## Pipe operator: how it works
 
 ```{r}
@@ -537,15 +540,17 @@ penguins_subset_5 <- penguins %>%
 
 ## Tidy Data
 
-> Tidy data sets are all alike; but every messy data set is messy in its own way ([Wickham/Grolemund, 2017](https://r4ds.had.co.nz/tidy-data.html)]
+> Tidy data sets are all alike; but every messy data set is messy in its own way ([Wickham/Grolemund, 2017](https://r4ds.had.co.nz/tidy-data.html))
 
 . . .
 
 Tidy Data Principles: principles for structuring tabular data sets:
 
-1. Each variable forms a column.
-2. Each observation forms a row.
-3. Each type of observational unit forms a table.
+1. Each variable must have its own column.
+2. Each observation must have its own row.
+3. Each value must have its own cell.
+
+![](https://d33wubrfki0l68.cloudfront.net/6f1ddb544fc5c69a2478e444ab8112fb0eea23f8/91adc/images/tidy-1.png "Visualization of the tidy data principles. Source: R 4 Data Science, https://r4ds.had.co.nz/tidy-data.html")
 
 ## Our df - but extended
 
@@ -567,14 +572,14 @@ Wide or long?
 
 . . .
 
-That's right: **wide**. Why is this not tidy?
+**Wide!** Why is this not tidy?
 
 . . .
 
 - Values in column names
 - Multiple observations per row
 
-## Our extended df - Long format
+## Wide vs Long
 
 ```{r}
 #| label: pivot-ext-df
@@ -586,18 +591,8 @@ df_long <- df_ext %>%
 values_to = "mood") %>%
 mutate(week = as.numeric(gsub("mood_wk", "", week))) %>%
 arrange(name, week)
-
-df_long
 ```
 
-. . .
-
-- 1 observation (week) per row
-- multiple rows per individual
-
-## Wide vs Long
-
-
 ::: columns
 ::: {.column width="53%"}
 **Wide**
@@ -608,7 +603,8 @@ df_long
 head(df_ext, 4)
 ```
 
-- All observations on 1 individual in 1 row
+- Values in column names
+- Multiple observations per row: all observations on 1 individual in 1 row
 - Not tidy
 
 :::
@@ -624,13 +620,18 @@ head(df_ext, 4)
 head(df_long, 4)
 ```
 
-- No values as column headers
 - Single observation (weight, mood) in a single row
-- Often needed for data visualization
+- No values in column names
+- Tidy!
+
 :::
 :::
 :::
 
+. . .
+
+Tidy data is a consistent way of storing data + most R functions work with vectors of values (columns). Tidyverse packages are designed to work with tidy data (`dplyr`, `ggplot2`, etc.)
+
 ## tidyr: Tidy Messy Data
 
 Do It Yourself:
@@ -943,16 +944,10 @@ df_long %>%
 theme(text = element_text(size = 25))
 ```
 
-## ggplot2 resources
-
-Useful resources for plotting with ggplot2:
-
-- Choose a visualization > Get example code: <https://www.data-to-viz.com/>
-- ggplot2 book: <https://ggplot2-book.org/>
-- ggplot flipbook: <https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html>
-
 # Go to Exercise 12 - 13
 
+Tip: Choose a visualization -> Get example code: <https://www.data-to-viz.com/>
+
 ## Answers to Exercise 12
 
 A scatterplot of Culmen_Length_mm against Flipper_Length_mm per Island.
@@ -994,6 +989,10 @@ What have we learned this afternoon?
 - Data Science in a Box: <https://datasciencebox.org/content>
 - Learning statistics with R: <https://learningstatisticswithr.com/book/>
 - Statistical Inference via Data Science: A ModernDive into R and the Tidyverse: <https://moderndive.com/>
+- Big Book of R: <https://www.bigbookofr.com/>
+- Better spreadsheets: <https://better-spreadsheets.netlify.app/>
+
+**See also the [What's next page](https://utrechtuniversity.github.io/workshop-introduction-to-R-and-data/what-next.html).**
 
 ## Where to find us
 

11 binary files not shown.

book/subsetting-and-mutating-data.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_tidyverse.html#selecting-filtering-data" title="Selecting and filtering data" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_tidyverse.html#selecting-filtering-data)
+
 ## Video
 
 {{< video https://vimeo.com/470859983 >}}

book/summarizing-combining-data.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_tidyverse.html#summarizing-and-combining-data" title="Summarizing and combining data" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_tidyverse.html#summarizing-and-combining-data)
+
 ## Exercise 10-11
 
 Now try out exercises 10 and 11 in `datascience_exercises.Rmd`.

book/summary.qmd

Lines changed: 0 additions & 3 deletions
This file was deleted.

book/syntax-and-data-types.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#r-syntax-data-types" title="Syntax and data types" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#r-syntax-data-types)
+
 ## Video
 
 {{< video https://youtu.be/S8zTmEvpYYk >}}

book/transformations-and-tidy-data.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_tidyverse.html#tidy-data" title="Tidy data" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_tidyverse.html#tidy-data)
+
 
 ## Video
 
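
The tidy-data slides changed in this commit reshape a wide data frame (`df_ext`) into long format. The sketch below reproduces that idea on a small made-up data set. `df_wide` and its values are hypothetical stand-ins, and the `cols = starts_with("mood_wk")` selection is an assumption, since the diff does not show that part of the original call:

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per person, one mood column per week.
# The week number is stored in the column names, which is not tidy.
df_wide <- tibble(
  name = c("Ann", "Bob"),
  mood_wk1 = c(7, 5),
  mood_wk2 = c(6, 8)
)

# Reshape to long format: one row per person per week
df_long <- df_wide %>%
  pivot_longer(cols = starts_with("mood_wk"),
               names_to = "week",
               values_to = "mood") %>%
  # Turn "mood_wk1" into the number 1, as in the slides
  mutate(week = as.numeric(gsub("mood_wk", "", week))) %>%
  arrange(name, week)

df_long
```

The result has one row per person per week (four rows here), with `week` as an ordinary numeric column instead of a value hidden in the column names.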

book/vectors.qmd

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 <iframe class="slide-deck" src="slides/slides_introduction.html#vectors-in-r" title="Vectors in R" width="100%" height="540"></iframe>
 ```
 
+[Link to the slides](slides/slides_introduction.html#vectors-in-r)
+
 ## Video
 
 {{< video https://youtu.be/XMFjteCdHbQ >}}

book/what-next.qmd

Lines changed: 23 additions & 1 deletion
@@ -1,3 +1,25 @@
 # What's next?
 
-Not finished learning? Feel free to check our website, [uu.nl/rdm](https://www.uu.nl/en/research/research-data-management/training-workshops/) for our other workshops and trainings.
+Not finished learning? Feel free to check our website, [uu.nl/rdm](https://www.uu.nl/en/research/research-data-management/training-workshops/) for our other workshops and trainings.
+
+### More R resources
+
+- R 4 Data Science: <https://r4ds.hadley.nz/>
+- Data Science in a Box: <https://datasciencebox.org/content>
+- Learning statistics with R: <https://learningstatisticswithr.com/book/>
+- Statistical Inference via Data Science: A ModernDive into R and the Tidyverse: <https://moderndive.com/>
+- Big Book of R: <https://www.bigbookofr.com/>
+- Better spreadsheets: <https://better-spreadsheets.netlify.app/>
+
+### Data visualization
+
+- Choose a visualization > Get example code: <https://www.data-to-viz.com/>
+- ggplot2 book: <https://ggplot2-book.org/>
+- ggplot flipbook: <https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html>
+- Better plots: <https://better-plots.netlify.app/>
+
+### Get practice
+
+- Starting a data analysis project - Extra practice exercises: <https://tavareshugo.github.io/r-intro-tidyverse-gapminder/90-appendix-exercises/index.html>
+- Exercises data wrangling with the tidyverse: <https://data-se.netlify.app/2021/02/24/exercises-to-data-wrangling-with-the-tidyverse/>
+- Use real public data: <https://github.com/awesomedata/awesome-public-datasets>

course-materials.zip

1 MB
Binary file not shown.
