Last time around I used R to plot the average runs per game for the American League, starting in 1901. Now I’ll do the same for the National League. I'll save a comparison of the two leagues for my next post.
A fundamental principle of programming is that code can be repurposed for different sets of data. So much of what I’m going to describe recycles the R code I used for the AL exercise.
Starting with the preliminary step, I went back to Baseball Reference for the data and gave it the same sort of finessing I described for the AL. Once the data was read into the R workspace, I simply copied the AL code and changed the variable names to create new objects and variables. (I could have simply rerun the same code, but I wanted to have both the AL and NL data and trend lines available for comparison.) This included creating new LOESS trend lines.
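For anyone who would rather not copy and rename, here is a rough sketch of how the same recycling could be done with a small function instead. This is not the code I actually ran: the helper `fit_rpg_trend`, the column names `year` and `rpg`, and the `span` value are all assumptions made for illustration, and the numbers are random placeholders, not the Baseball Reference figures.

```r
# Hypothetical helper (not from the original post): fit a LOESS trend to
# runs per game by year and attach the fitted values as a new column.
# Assumes the data frame has columns named `year` and `rpg`.
fit_rpg_trend <- function(league_data, span = 0.5) {
  fit <- loess(rpg ~ year, data = league_data, span = span)
  league_data$trend <- predict(fit)  # fitted values at the observed years
  league_data
}

# Placeholder data so the sketch runs on its own.
# These are random numbers, NOT real league averages.
set.seed(1)
years <- 1901:2012
al <- data.frame(year = years, rpg = 4.5 + rnorm(length(years), sd = 0.4))
nl <- data.frame(year = years, rpg = 4.4 + rnorm(length(years), sd = 0.4))

# The same function serves both leagues, and both results stay in the
# workspace for a later comparison.
al_trend <- fit_rpg_trend(al)
nl_trend <- fit_rpg_trend(nl)

# Plot the NL points, then overlay the LOESS trend line.
plot(nl_trend$year, nl_trend$rpg,
     xlab = "Year", ylab = "Runs per game",
     main = "NL runs per game (placeholder data)")
lines(nl_trend$year, nl_trend$trend, col = "blue", lwd = 2)
```

Wrapping the fit in a function keeps the AL and NL objects separate, which is the same thing I was after by renaming variables, but with only one copy of the LOESS code to maintain.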
July 17, 2012
July 14, 2012
I have started to explore the functionality of R, the statistical and graphics programming language. And what better data to play with than that of Major League Baseball?
There have already been some good examples of using R to analyze baseball data. The most comprehensive is the ongoing series at The Prince of Slides (Brian Mills, aka Millsy), cross-posted at the R-bloggers site. I am nowhere near that level, but explaining what I've done is a valuable exercise for me. As Joseph Joubert said (no doubt in French), "To teach is to learn twice over."
So after some reading (I have found Paul Teetor's R Cookbook particularly helpful) and working through some examples I found on the web, I decided to take some time series data, calculate a trend line, and then plot the points and the trend line together. I started with the American League data, from the league's origins in 1901 through to the All-Star break of 2012. For this, I relied on this handy table at Baseball Reference.
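In outline, the plan looks something like the sketch below. Treat it as a minimal illustration rather than my exact code: the object name `al`, the column names `year` and `rpg`, and the default LOESS settings are assumptions, and the values are random placeholders standing in for the Baseball Reference table.

```r
# Placeholder data: one row per season from 1901 to 2012.
# The rpg values are random numbers, NOT the real AL averages.
set.seed(42)
al <- data.frame(year = 1901:2012)
al$rpg <- 4.5 + rnorm(nrow(al), sd = 0.4)

# Calculate a LOESS trend line for runs per game over time.
al_loess <- loess(rpg ~ year, data = al)
al$trend <- predict(al_loess)  # fitted values at each observed year

# Plot the points, then draw the trend line on top of them.
plot(al$year, al$rpg,
     xlab = "Year", ylab = "Runs per game",
     main = "AL runs per game (placeholder data)")
lines(al$year, al$trend, col = "red", lwd = 2)
```

The `lines()` call works here because the rows are already in year order; with unsorted data the line would zigzag, so sorting by year first would be the safer habit.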
Step 1: load the data into the R workspace. This required a bit of finessing in software outside R. Any text editor such as Notepad or TextPad would do the trick. I pasted the table into the text editor, tidied up the things listed below, and then saved the file with a .csv extension.