Have you ever reproduced someone else’s research analysis?
How about reproducing your own old work? What tools did you use?
Do you know how to reproduce your collaborator’s work? Was it hard or easy?
How would you go about extending the analysis further?
If you notice a data error how easy would it be to re-create the analysis?
What if your collaborator is no longer available?
[Figure: Wizardry. Credit: Alison Hill, from R Markdown Anatomy]
R Markdown can help generate:
It enables you to:
R Markdown documents are fully reproducible and support many static and dynamic output formats, including PDF, HTML, MS Word and Beamer. You can combine narrative text with the code of your data analysis to produce an elegantly formatted storytelling journey.
It is a variant of Markdown that has embedded R code chunks (denoted by three back ticks), to be used with knitr to make it easy to create reproducible web-based reports.
An R Markdown document is a plain text file with the extension .Rmd.
To use R Markdown you will need to install the package from CRAN and load it with:
install.packages("rmarkdown", repos = "http://cran.us.r-project.org")
suppressPackageStartupMessages(library(rmarkdown))
Before we start learning about RMarkdown, let us do an exercise of RMarkdown appreciation.
Inside the data folder you will find the following csv files:
Split into groups of three and discuss how you would produce a Word document or a PowerPoint presentation to present your findings for the following problem:
Open the exyu_olympic.csv file in Excel and try to:
We will work on this for the next 10 minutes.
We already have the R code that will provide the results for those questions.
Open the RMarkdown4RR.Rproj file. Once you get RStudio up and running for the given project, click on the “R/script1.R” file and run it chunk by chunk.
The question we have now is: Can we put this into a document or a presentation? Of course we can, but we need to learn how to do it.
Task 1:
Open the file RMarkdown_Intro.Rmd
Change the title of the Markdown Document from My First Markdown Document to RMarkdown Introduction.
Click the “Knit” button to see the compiled version of your sample code.
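The Knit button in RStudio calls rmarkdown::render() behind the scenes, so the same compilation can be run from the console. The sketch below is only an illustration: it writes a small stand-in document to a temporary file (its contents are made up for this example and are not the workshop's RMarkdown_Intro.Rmd) and renders it only when pandoc is available.

```r
library(rmarkdown)

# Three back ticks, built with strrep() so they can sit inside this example.
fence <- strrep("`", 3)

# A stand-in document (hypothetical content, not the workshop file).
rmd_lines <- c(
  "---",
  "title: \"RMarkdown Introduction\"",
  "output: html_document",
  "---",
  "",
  "This is my first R Markdown document.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)

# render() needs pandoc, which ships with RStudio; skip quietly without it.
out_file <- NULL
if (pandoc_available()) {
  out_file <- render(rmd_file, quiet = TRUE)  # returns the output file path
}
```

For the workshop file itself, the equivalent call would be render("RMarkdown_Intro.Rmd") with the project folder as the working directory.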
Task 2: Let's format this document further by:
Changing the author of the document to your own name
Rewriting the first sentence of the document to say “This is my first R Markdown document”
Recompiling the document so you can see your changes
You can turn a word into a link by surrounding it in square brackets: [ ] and then placing the full URL, including https://, right after it in parentheses: ( ), like this:
[RStudio](https://www.rstudio.com)
Task 3: Make GitHub in the following paragraph link to https://github.com/TanjaKec/RMarkdown4RR
To embed formatting instructions into your document using Markdown, you would surround text by:
one asterisk to make it italic: *italic*
two asterisks to make it bold: **bold** and
backticks to make it monospaced: `monospaced`.
To make an ordered list you need to place each item on a new line, starting with a number followed by a period and a space:
Note that you need to place a blank line between the list and any paragraphs that come before it.
Task 4:
When analysing data… The variables can be one of two broad types:
Attribute variable: has its outcomes described in terms of its characteristics or attributes;
Measured variable: has the resulting outcome expressed in numerical terms.
R code
To embed an R code chunk you would use three back ticks:
chunk of code
Task 5: Replace the cars data set with the gapminder data set. Don’t forget to load the gapminder package using library(gapminder).
If you haven’t got the gapminder package available in your system you will need to install it: install.packages("gapminder").
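Put together, the install-if-missing and load steps for Task 5 might look like the sketch below. The requireNamespace() check and the explicit CRAN mirror are additions of this example, not part of the workshop material; the check only avoids reinstalling a package that is already present.

```r
# Install gapminder only if it is not already available (needs internet access).
if (!requireNamespace("gapminder", quietly = TRUE)) {
  install.packages("gapminder", repos = "https://cloud.r-project.org")
}

library(gapminder)

# The same kind of summary the template produces for cars, now for gapminder.
summary(gapminder)
```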
R code
You can also embed plots by setting echo = FALSE in the code chunk to prevent printing of the R code that generates the plot:
chunk of code
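echo = FALSE is set per chunk in the chunk header. If you would rather hide the code in every chunk, knitr also lets you set the option once, typically in a setup chunk at the top of the document. A minimal sketch, assuming knitr is installed (it comes with rmarkdown):

```r
library(knitr)

# Hide the source code of all later chunks; output and plots are still shown.
opts_chunk$set(echo = FALSE)

# Check the current default for the option.
opts_chunk$get("echo")
```

An individual chunk can still override the default by setting echo = TRUE in its own header.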
Task 6: Replace the base boxplot of mpg vs. cyl with a ggplot boxplot to examine the relationship between continent and lifeExp (note that you can use some of the dplyr functions!).
suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(gapminder)  # provides the gapminder data set

# ggplot boxplot of life expectancy by continent, with jittered points on top
ggplot(gapminder, aes(x = continent, y = lifeExp)) +
  geom_boxplot(outlier.colour = "hotpink") +
  geom_jitter(position = position_jitter(width = 0.1, height = 0),
              alpha = .2) +
  labs(title = "Life Exp. vs. Continent",
       x = "Continent", y = "Life Exp.") +
  theme(legend.position = "none",
        # linewidth replaces the deprecated size argument (ggplot2 >= 3.4.0)
        panel.border = element_rect(fill = NA,
                                    colour = "black",
                                    linewidth = .75),
        plot.title = element_text(hjust = 0.5))
Finally, if you wish to add mathematical equations to your Markdown document you can easily embed LaTeX maths equations into your report.
To display an equation on its own line, surround it with double dollar symbols:
$$
y = a + bx
$$
To embed an equation in line within the text, use only one dollar symbol on each side: $y = a + bx$.
Task 7: Display the equation in the Including Mathematical Equations paragraph on its own line.
Now that you’ve got the basics of rmarkdown, we will move on to editing more sophisticated features of your dynamic document.
When creating an HTML document from R Markdown, you need to specify the HTML document output format in the YAML metadata of your document. You can learn more about it by checking this chapter of R Markdown: The Definitive Guide book.
Let us go through the next set of prepared .rmd files in your RMarkdown4RR project folder.
Do you remember our RMarkdown Appreciation exercise?
There is a list of files you should open and knit to see what features are incorporated into the documents and to learn how it is done.
File 01_rmdApprec.Rmd incorporates the R code given in the R/script1.R file. Open this Rmd file and check its metadata. Try to play around with the document’s layout by changing some of the features, such as the table of contents (TOC) using the toc option, or the theme (for more available themes, see the blog post r-markdown-theme-gallery).
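The toc and theme options can also be tried out from the console without editing the YAML header, by passing an output format object to rmarkdown::render(). This is only a sketch: the option values are arbitrary picks, and the render call is skipped unless 01_rmdApprec.Rmd is in the working directory and pandoc is available.

```r
library(rmarkdown)

# An HTML output format with a table of contents and a Bootswatch theme.
# The values below are illustrative choices, not the workshop's settings.
fmt <- html_document(toc = TRUE, toc_depth = 2, theme = "flatly")

# Render only if the workshop file and pandoc are both available.
input <- "01_rmdApprec.Rmd"
if (file.exists(input) && pandoc_available()) {
  render(input, output_format = fmt)
}
```

Swapping the theme value and re-rendering is a quick way to compare layouts before settling on one in the document's metadata.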
File 02_rmdApprec.Rmd enables you to create scientific and technical writing, native to the web, by using the Distill Basics template. For more see https://rstudio.github.io/distill/basics.html.
File 03_rmdApprec.Rmd shows you how to add a static image file to your document. You should check the blog post Tips and tricks for working with images and figures in R Markdown documents by Zev Ross.
File 04_rmdApprec.Rmd illustrates happy collaboration by converting Rmd to docx.
File 05_rmdApprec.Rmd shows how to put your work into a slide show.
File 06_rmdApprec.Rmd is a slide show you can upload to RStudio’s RPubs server for sharing documents on the web.
Install the rticles package to get all of the available rmd templates for various paper articles.
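Once rticles is installed, its templates appear in RStudio's New R Markdown dialog under From Template. They can also be listed from the console. A small sketch, which does nothing if the package is not installed:

```r
# install.packages("rticles")  # run once if the package is missing

if (requireNamespace("rticles", quietly = TRUE)) {
  # journals() returns the names of the article templates shipped with rticles.
  templates <- rticles::journals()
  print(head(templates))
}
```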
This section is based on the material developed by Dr Alison Hill for R Markdown for Medicine workshop.
R is a powerful tool for reproducible research. There are many other packages that you could also use for sharing your work. Creating a Shiny application using the shiny package is a nice way of engaging with the audience while sharing your work. Here is an example of reproducible research for the problem used in the Resampling in the Undergraduate Statistics Curriculum paper:
https://tatjanakec.shinyapps.io/permutation_bootstrap/
Reproducible research should not be a burden for anybody who takes their research seriously. If nothing else, the good habits of reproducibility may actually turn out to be a time-saver in the longer run for any research practitioner.
Easy to use
Making a mistake is not ‘the end of the world’
Allows you to keep a history of changes to your project through which it is easy to navigate
Takes up minimal space
Backup of your project
No need for a server: easy to set up
GitHub’s strong community: your colleagues are probably already there
Provides tools to help enhance collaboration
A common location to share your work
We will go through the basic steps of connecting RStudio with your GitHub account, but for more detailed instructions you should check Happy Git with R.
We are going to assume you are already familiar with and have done:
Chapter 5: Register a GitHub account
Chapter 6: Install or upgrade R and RStudio
Give it a meaningful name
Copy the repo’s HTTPS address
Open a new project in RStudio: File → New Project…
Select Version Control → Git
Paste the address of your Git repo and check the box for Open in new session before you hit the Create Project button.
You’re ready to go!
You would definitely find the following useful:
Material is released under a Creative Commons Attribution-ShareAlike 4.0 International License.