To see whether I could create a plot with a subtitle, I went back to some of my own code drawing on the Lahman database package. The code below summarizes the data using dplyr, and creates a ggplot2 plot showing the average number of runs scored per team per game in every season from 1901 through 2014, including a trend line using the loess smoothing method.
This is an update to my series of blog posts, most recently 2015-01-06, visualizing run scoring trends in Major League Baseball.
# load the package into R, and open the data table 'Teams' into the
# workspace
library(Lahman)
data(Teams)
#
# package load
library(dplyr)
library(ggplot2)
#
# CREATE SUMMARY TABLE
# ====================
# create a new dataframe that
# - filters from 1901 [the establishment of the American League] to the most recent year,
# - filters out the Federal League
# - summarizes the total number of runs scored, runs allowed, and games played
# - calculates the league runs and runs allowed per game
MLB_RPG <- Teams %>%
  filter(yearID > 1900, lgID != "FL") %>%
  group_by(yearID) %>%
  summarise(R = sum(R), RA = sum(RA), G = sum(G)) %>%
  mutate(leagueRPG = R / G, leagueRAPG = RA / G)
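To see what that pipeline is doing, here is a minimal sketch that runs the same steps on a tiny made-up table. The `toy_teams` data frame and its values are hypothetical (they are not taken from Lahman); only the column names match the `Teams` table.

```r
library(dplyr)

# Hypothetical team-season rows, for illustration only (not real Lahman data)
toy_teams <- data.frame(
  yearID = c(2000, 2000, 2001, 2001),
  lgID   = c("AL", "NL", "AL", "FL"),
  R      = c(800, 700, 750, 600),
  RA     = c(780, 720, 760, 640),
  G      = c(162, 160, 162, 150)
)

# Same steps as above: drop the Federal League row, total runs and games
# by season, then divide to get runs per team per game
toy_rpg <- toy_teams %>%
  filter(yearID > 1900, lgID != "FL") %>%
  group_by(yearID) %>%
  summarise(R = sum(R), RA = sum(RA), G = sum(G)) %>%
  mutate(leagueRPG = R / G, leagueRAPG = RA / G)

toy_rpg
```

Because every game appears once for each of the two teams that played it, dividing total runs by total team-games gives runs per team per game, which is the value plotted on the y-axis.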
Plot the MLB runs per game trend
Below is the code to create the plot, including the formatting. Note the hjust=0 (for horizontal justification = left) in the plot.title line. This is needed because the title is centred by default, while the subtitle will be justified to the left.
MLBRPGplot <- ggplot(MLB_RPG, aes(x = yearID, y = leagueRPG)) +
  geom_point() +
  theme_bw() +
  theme(panel.grid.minor = element_line(colour = "gray95")) +
  scale_x_continuous(breaks = seq(1900, 2015, by = 20)) +
  scale_y_continuous(limits = c(3, 6), breaks = seq(3, 6, by = 1)) +
  xlab("year") +
  ylab("team runs per game") +
  # loess trend line; a smaller span follows the points more closely
  geom_smooth(method = "loess", span = 0.25) +
  ggtitle("MLB run scoring, 1901-2014") +
  # left-justify the title so it lines up with the subtitle added later
  theme(plot.title = element_text(hjust = 0, size = 16))
MLBRPGplot
[Figure: MLB run scoring, 1901-2014]
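The span value in the smoother controls how much of the data each local fit uses. As a rough sketch of its effect, the code below fits base R's loess() to a made-up wavy series with two different spans and compares how closely each fit follows the points. The `toy` data frame is simulated for illustration and has nothing to do with the actual run scoring numbers.

```r
# Simulated series for illustration only (not real run scoring data)
set.seed(42)
toy <- data.frame(year = 1901:2014)
toy$rpg <- 4.5 + 0.5 * sin((toy$year - 1901) / 10) +
  rnorm(nrow(toy), sd = 0.1)

# Fit the same model with a wide span (the loess() default) and a narrow one
fit_wide   <- loess(rpg ~ year, data = toy, span = 0.75)
fit_narrow <- loess(rpg ~ year, data = toy, span = 0.25)

# Residual sum of squares: a lower value means the curve hugs the points more
rss_wide   <- sum(residuals(fit_wide)^2)
rss_narrow <- sum(residuals(fit_narrow)^2)
c(wide = rss_wide, narrow = rss_narrow)
```

A narrow span suits a series like run scoring, where the ups and downs over a decade or two are the story; a wide span would flatten them out.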
Adding a subtitle: the function
So now we have a nice-looking dot plot showing the average number of runs scored per game for the years 1901-2014. But a popular feature of charts, particularly in magazines, is a subtitle that summarizes what the chart shows and/or what the author wants to emphasize.
In this case, we could legitimately say something like any of the following:
- The peak of run scoring in the 2000 season has been followed by a steady drop
- Teams scored 20% fewer runs in 2014 than in 2000
- Team run scoring has fallen to just over 4 runs per game from the 2000 peak of 5 runs
- Run scoring has been falling for 15 years, reversing a 30 year upward trend
How can we add a subtitle to our chart that does that?
The function Bob Rudis has created quickly and easily allows us to add a subtitle. The following code is taken from his blog post. Note that the code for this function relies on two additional packages, grid and gtable. Other than the package loads, this is a straight copy/paste from Bob's blog post.
library(grid)
library(gtable)
ggplot_with_subtitle <- function(gg,
                                 label="",
                                 fontfamily=NULL,
                                 fontsize=10,
                                 hjust=0, vjust=0,
                                 bottom_margin=5.5,
                                 newpage=is.null(vp),
                                 vp=NULL,
                                 ...) {
  if (is.null(fontfamily)) {
    gpr <- gpar(fontsize=fontsize, ...)
  } else {
    gpr <- gpar(fontfamily=fontfamily, fontsize=fontsize, ...)
  }
  subtitle <- textGrob(label, x=unit(hjust, "npc"), y=unit(hjust, "npc"),
                       hjust=hjust, vjust=vjust,
                       gp=gpr)
  data <- ggplot_build(gg)
  gt <- ggplot_gtable(data)
  gt <- gtable_add_rows(gt, grobHeight(subtitle), 2)
  gt <- gtable_add_grob(gt, subtitle, 3, 4, 3, 4, 8, "off", "subtitle")
  gt <- gtable_add_rows(gt, grid::unit(bottom_margin, "pt"), 3)
  if (newpage) grid.newpage()
  if (is.null(vp)) {
    grid.draw(gt)
  } else {
    if (is.character(vp)) seekViewport(vp) else pushViewport(vp)
    grid.draw(gt)
    upViewport()
  }
  invisible(data)
}
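The core trick in the function is to turn the plot into a gtable, open up a new row under the title, and drop a text grob into it. Here is a stripped-down sketch of just that step on a toy plot. It is not Bob's code: the `toy_plot`, `note`, and "toy-subtitle" names are made up for illustration, and instead of hard-coding row and column numbers it looks up the positions of the "title" and "panel" entries in the layout, on the assumption that a single-panel plot has one of each.

```r
library(ggplot2)
library(grid)
library(gtable)

# Toy plot with made-up values, for illustration only
toy_plot <- ggplot(data.frame(x = 1:5, y = c(2, 4, 3, 5, 4)), aes(x, y)) +
  geom_point() +
  ggtitle("Toy title")

# Convert the plot to a gtable (a grid of grobs with a layout table)
gt <- ggplotGrob(toy_plot)

# Find where the title and the panel sit in the layout
title_row <- gt$layout$t[gt$layout$name == "title"]
panel_col <- gt$layout$l[gt$layout$name == "panel"]

# The text to place under the title, left-justified
note <- textGrob("A hypothetical subtitle",
                 x = unit(0, "npc"), hjust = 0,
                 gp = gpar(fontsize = 10))

# Add a row just below the title row, tall enough for the text,
# then place the text grob in that new row above the panel column
gt2 <- gtable_add_rows(gt, grobHeight(note), pos = title_row)
gt2 <- gtable_add_grob(gt2, note, t = title_row + 1, l = panel_col,
                       clip = "off", name = "toy-subtitle")

# Draw the modified table
grid.newpage()
grid.draw(gt2)
```

Looking the positions up by name is a design choice for this sketch; the function above uses fixed positions instead.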
Adding a subtitle
- Rename the active plot object gg (simply because that's what Bob's code uses)
- Define the text that we want to be in the subtitle
- Call the function
# set the name of the current plot object to `gg`
gg <- MLBRPGplot
# define the subtitle text
subtitle <-
  "Run scoring has been falling for 15 years, reversing a 30 year upward trend"
ggplot_with_subtitle(gg, subtitle,
                     bottom_margin=20, lineheight=0.9)
[Figure: MLB run scoring, 1901-2014 with a subtitle]
Wasn't that easy? Thanks, Bob!
And it's going to get easier; in the few days since his blog post, Bob has taken this into the ggplot2 development environment, working on the code necessary to add this as a simple extension to the package's already extensive functionality. And Jan Schulz has chimed in, adding the ability to add a text annotation (e.g. the data source) under the plot. It's early days, but it's looking great. (See ggplot2 Pull request #1582.) Thanks, Bob and Jan!
And thanks also to the rest of the ggplot2 developers, for helping those of us who use the package create good-looking and effective data visualizations. Ain't open development great?
The code for this post (as an R markdown file) can be found in my Bayesball GitHub repo.
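For anyone reading this with a later ggplot2 installed: subtitle and caption support did make it into the package, and as far as I know it is available from version 2.2.0 onward through labs(). The sketch below assumes such a version; the `toy` data frame and the label text are made up for illustration.

```r
library(ggplot2)

# Made-up values for illustration only (not taken from the Lahman data)
toy <- data.frame(year = 2000:2004, rpg = c(5.1, 4.8, 4.6, 4.7, 4.8))

toy_plot <- ggplot(toy, aes(x = year, y = rpg)) +
  geom_point() +
  theme_bw() +
  # title, subtitle and caption are all set in one call
  labs(
    title = "Toy run scoring plot",
    subtitle = "A subtitle set directly through labs()",
    caption = "Source: made-up values"
  ) +
  theme(plot.title = element_text(hjust = 0, size = 16))

toy_plot
```

With this approach there is no need for the grid and gtable steps, and the result is still an ordinary ggplot object that can be modified or saved in the usual way.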
-30-