Automatically generate personalized certificates and other documents with R

Examples for automatically creating customized PDF or Word documents using R and R Markdown

Overview

Generating personalized documents is a common task, and programs like Word or similar software can do it. But I like to use R if I can, and the whole R Markdown system is well suited to the repeated generation of customized documents.

I’m certainly not the first one to have the idea of using R, and in fact my initial approach is based on this prior post. In this brief post, I describe a few ways of using R and R Markdown to auto-generate custom documents, and provide example code and explanations for anyone who might want to use this (including my future self).

Rationale

As a teacher, I occasionally need to generate personalized certificates for students to indicate they successfully completed a workshop, class, or similar event. Another use case is to generate personalized letters of acceptance (or rejection) to some program. Many other use cases exist. In fact, my first and original application was to automatically generate gift certificates for all my nieces and nephews 😁.

For this example, I assume that we are trying to generate certificates of completion for a group of students, with each certificate showing their name and score.

Overall setup

You need the following elements:

  • A data frame containing the information you want to be personalized, e.g. the name and score of each student.
  • An Rmd document that serves as the template for the certificate. I show two examples, for use with either LaTeX/PDF or Word as output.
  • A short script that reads the data frame, reads the template, then personalizes the template and creates output (done via R Markdown/knitr/pandoc).

We’ll discuss each element next.

Data frame

This part is fairly straightforward. You need a file (e.g. an Excel or CSV file) that you can load as a data frame and that contains the information you want to use for each personalized document. Here is an example of a very simple data frame containing names and scores for 2 students, which is stored in this CSV file.

LastName  FirstName  Score
Smith     Jennifer   93
Jones     Paul       82

How you collect that data is up to you. If the file containing your data is messier, you might have to do a few cleaning steps before you can use it.
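As a rough sketch of what such cleaning steps could look like, the snippet below reads a slightly messy version of the example data, trims stray whitespace, and drops incomplete rows. The inline `csv_text` stands in for the real file so the example runs on its own, and the specific problems (padded names, an empty row) are assumptions made for illustration.

```r
# Inline stand-in for student_data.csv (assumed contents, with some mess added)
csv_text <- "LastName,FirstName,Score
 Smith ,Jennifer,93
Jones,Paul,82
,,"

# Read every column as character, as the processing script does.
# strip.white trims padding; na.strings = "" turns empty fields into NA.
students <- read.csv(text = csv_text, colClasses = "character",
                     strip.white = TRUE, na.strings = "")

# Drop rows that are missing any value
students <- students[complete.cases(students), ]

# Fail early if a column the template relies on is absent
required <- c("LastName", "FirstName", "Score")
stopifnot(all(required %in% names(students)))

print(students)
```

For an Excel file you would swap `read.csv()` for a suitable reader, but the checks afterwards stay the same.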

Templates

There are different ways you can create the templates and thus the output, with different advantages and disadvantages. I’ll describe and show examples for LaTeX/PDF and Word, and discuss other options briefly.

LaTeX/PDF

For this approach, you start with a template file that contains some LaTeX commands, including the placeholders that will get personalized. Here is the code for an example file; you can get the file here. We’ll look at the resulting output below.

---
title: ""
output: pdf_document
classoption: landscape
---

\begin{center}
\includegraphics[height=4cm]{fig1.png}
\hfill
\includegraphics[height=4cm]{fig2.jpg} \\
\bigskip
{\Huge\bf Certificate of Accomplishment } \\
\bigskip
{\Huge <<FIRSTNAME>> <<LASTNAME>> } \\
\bigskip
{\Large has successfully completed the course {\it Generating Certificates with R} with a score of <<SCORE>> } \\
\bigskip
{\Huge Congratulations!}
\end{center}

This template places 2 images at the top, writes some text, and most importantly, adds some placeholder text that will be customized for each student with the script shown below. It doesn’t matter what placeholder text you write, as long as it’s unique, so that the replacement changes only the text you want replaced. Enclosing the placeholder in special characters such as << >> is a good option for this, but it’s not required.
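One way to check the placeholders before generating anything is to list them and compare them with the columns of the data. The sketch below does that in base R; `count_placeholders()` is a hypothetical helper, and it assumes placeholders are written as upper-case column names inside << >>, as in the example template.

```r
# Hypothetical helper: count each <<NAME>> placeholder found in a template string
count_placeholders <- function(template) {
  hits <- regmatches(template, gregexpr("<<[A-Z]+>>", template))[[1]]
  table(hits)
}

# Shortened stand-in for the template text (assumption for illustration)
template <- "{\\Huge <<FIRSTNAME>> <<LASTNAME>> } with a score of <<SCORE>>"
columns  <- c("LastName", "FirstName", "Score")

counts   <- count_placeholders(template)
found    <- names(counts)
expected <- paste0("<<", toupper(columns), ">>")

# Placeholders without a matching column, and columns the template never uses
unmatched <- setdiff(found, expected)
unused    <- setdiff(expected, found)

# Placeholders that occur more than once
repeated  <- found[counts > 1]

print(counts)
```

The `repeated` check matters because `str_replace()` only replaces the first match, so a placeholder that appears twice would need `str_replace_all()` instead.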

The advantages of the LaTeX/PDF approach are that 1) LaTeX allows you to do a lot of formatting and customization of the template so it looks exactly the way you want it, and 2) the end product is a PDF file, which is easy to print or share with the intended recipients. The disadvantage is that you need to know a bit of LaTeX to set up your template, or at least be willing to spend some time with Google until you have found all the snippets of commands you need for the layout you want.
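One LaTeX detail that can matter here: characters such as &, %, _ or # have a special meaning in LaTeX, so a name or score containing them may break the PDF build unless it is escaped first. The example data has no such characters, so this is only a precaution. A minimal sketch of an escaping step, with `escape_latex()` as a hypothetical helper that is not part of the original script:

```r
# Hypothetical helper: escape characters that LaTeX treats specially
escape_latex <- function(x) {
  specials <- c(
    "\\" = "\\textbackslash{}",
    "&"  = "\\&", "%" = "\\%", "$" = "\\$", "#" = "\\#",
    "_"  = "\\_", "{" = "\\{", "}" = "\\}",
    "~"  = "\\textasciitilde{}", "^" = "\\textasciicircum{}"
  )
  vapply(x, function(s) {
    # Work character by character so already-escaped text is never escaped again
    chars <- strsplit(s, "")[[1]]
    hit <- chars %in% names(specials)
    chars[hit] <- specials[chars[hit]]
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

safe_name <- escape_latex("Smith & Sons")

# Insert the escaped text with a literal (fixed) replacement
line <- gsub("<<LASTNAME>>", safe_name, "{\\Huge <<LASTNAME>> }", fixed = TRUE)
cat(line, "\n")
```

The sketch uses base R’s `gsub()` with `fixed = TRUE` so that the backslashes in the escaped text are inserted as they are.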

Word

For this approach, you start with a template file that contains commands that lead to a decent-looking Word document. Again, it needs to include the placeholders that will get personalized. Here is the code for an example file; you can get the file here. We’ll look at the end result below.

---
title: ""
output: 
  word_document:
    reference_docx: wordstyletemplate.docx
---

![Image](fig1.png)

# Certificate of Accomplishment

:::{custom-style="mystyle1"}
<<FIRSTNAME>>  <<LASTNAME>>
:::

has successfully completed the course "Generating Certificates with R" with a score of <<SCORE>>

:::{custom-style="mystyle2"}
Congratulations!
:::

```{r fig2, echo=FALSE, out.width="50%"}
knitr::include_graphics("fig2.jpg")
```

Note that by default, going from R Markdown to Word doesn’t give you much ability to apply formatting. However, it is possible to do a decent amount of formatting using a Word style template. I have another blog post which describes this approach, and I’m using it here.

Even with the Word style formatting, some things can’t be controlled well. Placement and sizing of figures is the main problem, whether you include the figures with basic Markdown commands or use the include_graphics() function. You’ll see the problem if you try to run this example (code below). As such, for something that includes figures, using the LaTeX/PDF workflow seems almost always better. A scenario where the Word setup might be useful is if you want to produce customized letters. The one main advantage of a Word output (in addition to not having to figure out LaTeX commands) is that the output can be further edited if needed.

Other options

I believe the PDF or Word outputs are best for most instances, but occasionally another format might be needed. You can use this overall approach to generate other outputs, for instance the standard R Markdown HTML output, or different versions of presentation slides (e.g. ioslides, Beamer, PowerPoint), etc. In principle, any R Markdown output format should work. You just need to alter your template file accordingly.
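If you switch formats, the output file name needs the matching extension. Here is a small sketch of one way to handle that, using a lookup of a few rmarkdown format names. `cert_filename()` is a hypothetical helper, and the render call is shown only as a comment because it needs rmarkdown, pandoc and a template written for that format.

```r
# File extensions for a few rmarkdown output formats
output_ext <- c(
  html_document           = ".html",
  pdf_document            = ".pdf",
  word_document           = ".docx",
  ioslides_presentation   = ".html",
  beamer_presentation     = ".pdf",
  powerpoint_presentation = ".pptx"
)

# Hypothetical helper: build the output file name for a given format
cert_filename <- function(last, first, format) {
  if (!format %in% names(output_ext)) {
    stop("unsupported output format: ", format)
  }
  paste0(paste(last, first, "Certificate", sep = "_"), output_ext[[format]])
}

html_name <- cert_filename("Smith", "Jennifer", "html_document")
print(html_name)

# With a suitable template saved as tmp.Rmd, rendering could then look like:
# rmarkdown::render("tmp.Rmd", output_format = "html_document",
#                   output_file = html_name)
```

Keep in mind that LaTeX commands in a template only apply to PDF-based formats, so an HTML or slide template needs its own layout code.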

Processing script

Once you have generated your data and template, you only need a few lines of code to read the template, personalize it, and turn it into the wanted output. Here is example code that shows how to process the data and template files above; you can get the file here.

#####################################################
# automatically create customized certificates
#####################################################

# load needed packages
library('readr')    # read_file() and write_file()
library('dplyr')    # the %>% pipe
library('stringr')  # str_replace()

# load data, read everything in as a string/character
df <- read.csv("student_data.csv", colClasses = 'character')

# set this to TRUE if you want to generate Word output, FALSE for PDF
word_out <- TRUE

# load either the PDF or the Word certificate template
template_file <- if (word_out) "certificate_template_word.Rmd" else "certificate_template_pdf.Rmd"
template <- readr::read_file(template_file)

# run through all students, generate a personalized certificate for each
# seq_len() also handles an empty data frame, where 1:nrow(df) would not
for (i in seq_len(nrow(df)))
{

  # replace the placeholder words in the template with the student information
  current_cert <- template %>%
    str_replace("<<FIRSTNAME>>", df[i,'FirstName']) %>%
    str_replace("<<LASTNAME>>", df[i,'LastName']) %>%
    str_replace("<<SCORE>>", df[i,'Score'])

  # generate an output file name based on the student name
  out_filename <- paste(df[i,'LastName'], df[i,'FirstName'], 'Certificate', sep = "_")
  out_filename <- paste0(out_filename, if (word_out) '.docx' else '.pdf')

  # save the customized Rmd to a temporary file
  write_file(current_cert, "tmp.Rmd")

  # create the certificate using R Markdown.
  # the output format comes from the YAML header of the template;
  # output_file only sets the name of the resulting file
  rmarkdown::render("tmp.Rmd", output_file = out_filename)

  # the temporary Rmd file can be deleted.
  file.remove("tmp.Rmd")

}
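
If your data has more columns than the three used here, the replacement step can be made generic. The sketch below is one possible variation, not the script from this post: `fill_template()` and `make_filename()` are hypothetical helpers, the placeholder convention (upper-case column name inside << >>) is assumed from the templates above, and the rendering step is left out so the example runs on its own.

```r
# Hypothetical helper: replace every <<COLUMN>> placeholder with one row's values
fill_template <- function(template, row) {
  for (col in names(row)) {
    placeholder <- paste0("<<", toupper(col), ">>")
    # fixed = TRUE matches the placeholder literally and replaces all occurrences
    template <- gsub(placeholder, row[[col]], template, fixed = TRUE)
  }
  template
}

# Hypothetical helper: build an output file name, keeping only ASCII letters,
# digits, underscores and hyphens (names with accents would lose characters)
make_filename <- function(last, first, word_out) {
  stem <- paste(last, first, "Certificate", sep = "_")
  stem <- gsub("[^A-Za-z0-9_-]", "", stem)
  paste0(stem, if (word_out) ".docx" else ".pdf")
}

# Made-up data and a shortened template, for illustration only
students <- data.frame(
  LastName  = c("Smith", "O'Neil"),
  FirstName = c("Jennifer", "Mary Ann"),
  Score     = c("93", "82"),
  stringsAsFactors = FALSE
)
template <- "<<FIRSTNAME>> <<LASTNAME>> scored <<SCORE>>"

for (i in seq_len(nrow(students))) {
  cert <- fill_template(template, students[i, ])
  file <- make_filename(students$LastName[i], students$FirstName[i],
                        word_out = FALSE)
  cat(file, ":", cert, "\n")
}
```

Cleaning the file name is optional, but it avoids surprises when a name contains spaces, apostrophes or slashes.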

Results

If you run the script for the PDF output, you’ll get two PDF certificates; here is one of them. Similarly, if you run the Word template, you get two Word documents; here is one of them. As mentioned above, the Word output is not that nicely formatted, because the figure placement and sizing can’t really be controlled.

Further Resources

To reproduce the example here, you can download this zip file, which contains the data file, the R Markdown template for PDF output, the R Markdown template for Word output, the certificate generator script, the files for figure 1 and figure 2, and the Word style file. Place all files into a folder and run the certificate_generator.R script to produce either PDF or Word output.
