December 7, 2012

West Coast League baseball coming to Victoria

We're getting closer to the arrival of summer collegiate baseball in Victoria.  The Victoria HarbourCats are one of two new clubs in the West Coast League, along with the Medford Rogues.  The team has been generating some buzz already, seven months before playing a game. The club posted their 2013 schedule yesterday.  And to top things off, the HarbourCats will be hosting the league All Star Game in July.

The best sources of information about the HarbourCats are the club's official homepage and the fan blog.

Given my own proclivities, I've created a Google map plotting the locations of all the teams in the league.  A zoomed-out screenshot is below.



October 25, 2012

"Baseball's Timeless Tradition"

Worth four minutes of your time:  Stephen Brunt's "The World Series: Baseball's Timeless Tradition" video, broadcast as part of Sportsnet Canada's World Series coverage.


October 6, 2012

Neyer on *that* Infield Fly

As predicted, there has been a great deal of chatter on the application of the Infield Fly rule in the eighth inning of last night's Cardinals-Braves NL Wild Card game.

The best thing I've read is Rob Neyer's analysis at SB Nation, "Everything you always wanted to know about the Infield Fly Rule*". Neyer covers the rule in general, its application in this particular case (he thinks it was an appropriate call, and one which realistically did not hurt the Braves), and sums up with the fact that the game turned not on this call, but the Braves' three errors and failure to score when, late in the game, they had runners in scoring position.


October 5, 2012

The origins of the Infield Fly Rule

I'm sure lots of ink will be spilled in the hours/days to come, but here's an interesting legal note: "The Common Law Origins of the Infield Fly Rule" (University of Pennsylvania Law Review, Vol.123:1474-1481, June 1975.)

The Infield Fly Rule is obviously not a core principle of baseball. Unlike the diamond itself or the concepts of "out" and "safe," the Infield Fly Rule is not necessary to the game. Without the Infield Fly Rule, baseball does not degenerate into bladderball the way the collective bargaining process degenerates into economic warfare when good faith is absent. It is a technical rule, a legislative response to actions that were previously permissible, though contrary to the spirit of the sport.

Here's a discussion of the article at Topography of Ignorance.

(Tip of the Mariners cap to ronb78 who pasted the link at Lookout Landing.)


August 31, 2012

Open data comes to the Premier League

In the world of sabermetrics, what we now think of as "open data" has been the norm for years, thanks to the likes of Retrosheet, Sean Lahman, etc.  But in other sports that has not been the case; witness this article at The Atlantic on the opening of performance data by the Manchester City Football Club and the team's own news release.


August 10, 2012

Trends in run scoring - comparing the leagues

My previous two posts have looked at using R to create trend lines for the run scoring environments in the American and National leagues.  This time around, I'll plot the two against each other to allow for some comparisons.

(The code below assumes that you've read the data into your workspace and calculated the LOESS trend lines, as I did in the previous two posts.)

One of the things I quickly appreciated about the R environment is the option to quickly compare and manipulate (for example, multiply) data from two different source files without having to cut-and-paste the data together.  For everything in this post, we've got two data tables (one for each league) and they remain separate.
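
The idea of keeping the two league tables separate while still computing across them can be sketched roughly as follows. This is a Python stand-in rather than the R code from the post; the names `al_rpg`, `nl_rpg` and `league_gap` are hypothetical, and the runs-per-game values are invented placeholders, not real league figures.

```python
# Invented runs-per-game values keyed by year; NOT real league data.
al_rpg = {2010: 4.5, 2011: 4.5, 2012: 4.5}
nl_rpg = {2010: 4.25, 2011: 4.0, 2012: 4.25}


def league_gap(al, nl):
    """Return AL minus NL runs per game for every year found in both tables."""
    shared_years = sorted(al.keys() & nl.keys())
    return {year: al[year] - nl[year] for year in shared_years}


# The two source tables are never merged; only the derived result is new.
gap = league_gap(al_rpg, nl_rpg)
print(gap)  # {2010: 0.25, 2011: 0.5, 2012: 0.25}
```

Both tables are left untouched, so either one can still be plotted or summarized on its own afterwards.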

July 17, 2012

Trends in run scoring, NL edition (more R)

Last time around I used R to plot the average runs per game for the American League, starting in 1901. Now I’ll do the same for the National League.  I'll save a comparison of the two leagues for my next post.

A fundamental principle of programming is that code can be repurposed for different sets of data. So much of what I’m going to describe recycles the R code I used for the AL exercise.

So starting with the preliminary step, I went back to Baseball Reference for the data, followed up by the same sort of finessing described for the AL. Once the data was read into the R workspace, I simply copied the AL code, and changed the variable names to create new objects and variables.  (I could have simply rerun the same code, but I wanted to have both the AL and NL data and trend lines available for comparison.)  This included creating new LOESS trend lines.
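
One way to picture that reuse is a single smoothing function applied to each league in turn, with the results stored under separate names. The sketch below is in Python, not the R used in the post, and it makes several assumptions: `loess_fit` is a hypothetical, simplified local linear fit with tricube weights (R's `loess()` defaults to a local quadratic fit with `span = 0.75`), and the data values are invented placeholders.

```python
def loess_fit(xs, ys, span=0.75):
    """Simplified LOESS: a weighted straight-line fit around each x value."""
    n = len(xs)
    k = max(2, int(round(span * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in xs:
        # The k-th smallest distance to x0 sets the local bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Invented placeholder values, NOT real league data.
years = [2008, 2009, 2010, 2011, 2012]
al_rpg = [4.8, 4.8, 4.4, 4.5, 4.4]
nl_rpg = [4.5, 4.4, 4.3, 4.1, 4.2]

# Same function, different inputs, separate result names for later comparison.
al_trend = loess_fit(years, al_rpg)
nl_trend = loess_fit(years, nl_rpg)
```

Wrapping the steps in a function is an alternative to copying the code and renaming variables; either way, both trend lines stay available side by side.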

July 14, 2012

Trends in AL run scoring (using R)

I have started to explore the functionality of R, the statistical and graphics programming language. And what better data to play with than that of Major League Baseball?

There have already been some good examples of using R to analyze baseball data. The most comprehensive is the on-going series at The Prince of Slides (Brian Mills, aka Millsy), cross-posted at the R-bloggers site. I am nowhere near that level, but explaining what I've done is a valuable exercise for me -- as Joseph Joubert said (no doubt in French) "To teach is to learn twice over." 

So after some reading (I have found Paul Teetor's R Cookbook particularly helpful) and working through some examples I found on the web, I decided to plot some time series data, calculate a trend line, and then plot the points and trend line. I started with the American League data, from its origins in 1901 through to the All Star break of 2012.  For this, I relied on this handy table at Baseball Reference.

Step 1: load the data into the R workspace.  This required a bit of finessing in software outside R. Any text editor such as Notepad or TextPad would do the trick.  What I did was paste the table into the text editor, tidy up the things listed below, and then save the file with a .csv extension.
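
For readers who want to see what reading such a tidied file involves, here is a minimal sketch in Python (the post itself used R). The column names `Year` and `RPG`, the helper `read_runs_per_game`, and the sample values are all assumptions made for illustration; they are not the actual Baseball Reference layout or data.

```python
import csv
import io

# Placeholder text standing in for the saved .csv file. The column names
# and values are invented for illustration, NOT real league data.
SAMPLE_CSV = """Year,RPG
1901,5.0
1902,4.5
1903,4.0
"""


def read_runs_per_game(handle):
    """Return a list of (year, runs_per_game) tuples from an open CSV file."""
    reader = csv.DictReader(handle)
    return [(int(row["Year"]), float(row["RPG"])) for row in reader]


al_data = read_runs_per_game(io.StringIO(SAMPLE_CSV))
print(al_data)  # [(1901, 5.0), (1902, 4.5), (1903, 4.0)]
```

With a real file, the same function could be given a handle from `open("al_runs.csv", newline="")`, where the file name is again hypothetical.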

July 7, 2012

Strike Three

A great series over at Deadspin, Better Know An Umpire, comes to an end.  For a final wrap-up, they've posted a gallery of animated GIFs, showing every umpire's strike three punch-out call.

Lots of fun.


April 23, 2012

Leadoff walks

It's well established that leadoff walks are no worse than leadoff singles.

The pre-eminent research is by David W. Smith (of Retrosheet fame), whose paper "Does Walking the Leadoff Batter Lead to Big Innings?", published in SABR's The Baseball Research Journal in 2007 (#35, pp. 23-24), answered the question with an emphatic "NO". Smith looked at game records from 1974 to 2002, and found that just under 40% of hitters who reached first base, regardless of method of getting there (single, walk, or hit-by-pitch), scored a run.

More recently, plen's "The Leadoff Walk" looked at a longer time frame (starting in 1952 and ending perhaps in 2009), and John Dewan wrote about "The Dread Leadoff Walk" (also here). Both found roughly the same thing as Smith. plen's assessment was a couple of percentage points lower than Smith's, reflecting the lower run scoring environments in the years on either side of Smith's analysis.

Miguel Cabrera. Taking a pitch doesn't make a dramatic photo.

Which brings us to today. "An in-depth look at the leadoff walk in 2011" was written by Guy Spurrier (at the National Post). Spurrier references Smith's work and then goes on to use data to give us a look at the season just passed. In addition to providing some nifty graphics, it also provides details about the best and worst players in this category.

Miguel Cabrera led the Majors in 2011 with 23 leadoff walks -- not a surprising result, given that he was third overall with 108 walks on the season, trailing only Jose Bautista (132) and Joey Votto (110).


April 13, 2012

Mariano Rivera in 3D

Mariano Rivera, 1993

We already knew that Mariano Rivera has been consistently amazing for a long time (16 seasons and counting), and with what appears to be a confoundingly limited arsenal -- he only throws two pitches, a straight fastball and the infamous cutter. Tom Verducci's Sports Illustrated feature from 2009, "The Sure Thing", remains one of the best written summaries of Rivera's success.

Also from 2009, iamawesomer published an article on Beyond the Boxscore, "Mariano's Gonna Cut You, Everybody Knows It, And Nobody Can Do Anything About It" that used PitchFX data to analyze Rivera's pitches -- notably the location and movement.

But this great New York Times 3D video goes a long way in demonstrating the effectiveness of his pitches by giving us a batter's-eye view of the pitches, from when the swing has to start to where the ball will eventually be when it gets to the batter.

Wonderful stuff.


April 9, 2012

Will academic journals change?

At the end of January I wrote a short post with the title "The bizarre world of academic journals", with a link to an article at The Atlantic.

Today The Guardian published an article, "Academic spring: how an angry maths blog sparked a scientific revolution", which tells the story of Tim Gowers, a mathematics professor at Cambridge University who has sparked a widespread protest against the largest publisher of academic journals, Elsevier. This article, like the earlier piece in The Atlantic, takes a critical view of the business model that the academic publishers have relied on.

For those of us outside The Academy, the trend toward more accessible research papers is nothing but a good thing.


April 6, 2012

The end of new ballparks?

Writing as part of the New Yorker "Sporting Scene" blog, Reeves Wiedeman points out that we have likely seen The End of the Retro Ballpark.

The architecture firm Populous has designed 18 of the last 23 new ballparks, both retro (starting the whole trend with Camden Yards in Baltimore) and thoroughly modern (Marlins Park). But both trends may be at an end, since it is unlikely there will be any new ballparks built in the near future. Wiedeman notes that there are three groups of MLB ballparks: the historic icons (Wrigley, Fenway), the newer parks, and the homes of clubs that are struggling to get funding for a new park (Oakland, Tampa Bay).


April 2, 2012

Ballparks via Google Maps

Mike Fast, an analyst who used to write for Baseball Prospectus but who is now employed by the Astros, put together a Google Map of the Astros organization -- the location of the team's minor league affiliates.

I liked the idea so much I spent a few minutes plotting the same thing for the Seattle Mariners and their minor league affiliates.

For a more complete view of the Mariners affiliates, visit the Wikipedia entry on the topic (which includes a historic timeline of the changing affiliations) or the team's affiliates news page.


April 1, 2012

On retro ballparks

Mark Byrnes in The Atlantic: Cities argues that the wave of retro ballparks that began 20 years ago with the construction of Camden Yards in Baltimore is over.  Byrnes states that Citi Field might be the last (and arguably the most forced) of the retro movement, and that the Great American Ball Park in Cincinnati bucked the trend and ushered in a new wave of ballparks with contemporary design elements.


March 8, 2012

Probability and Statistics Cookbook

A great resource -- Matthias Vallentin has assembled this "Probability and Statistics Cookbook", full of formulas, functions, and graphs.


February 26, 2012

The Interior Angel Stadium

It’s winter, and there is no baseball. The first spring training exhibition games in Florida and Arizona start at the end of the week, and Opening Day – in Tokyo – is a month away.

In “The Interior Stadium”, an essay that appears in the book The Summer Game (1972), Roger Angell writes:
Baseball has one saving grace that distinguishes it–for me, at any rate–from every other sport. Because of its pace, and thus the perfectly observed balance, both physical and psychological, between opposing forces, its clean lines can be restored in retrospect. This inner game—baseball in the mind—has no season, but it is best played in the winter, without the distraction of other baseball news.
The world has changed dramatically in the 40 years since Angell wrote those words. There is constant baseball news now during the off-season—rumours and confirmed trades and free agent signings, PED testing scandals, and from the sabermetric community a steady stream of analysis of the season just past and projections of the season to come.

But recalling the games of the past is still best done in the winter. And I have been thinking lately about the one MLB game I attended during the 2011 season. This was the final game of a three-game series that the A’s played against the Angels, on Sunday, September 25. I am a fan of neither team, but we were going to be in the area, so off we went.

The teams had split the first two games of the series, Oakland winning the Friday opener 3-1 and the Angels taking Saturday’s game 4-2. That win put the Angels 2 ½ games behind the Red Sox and 1 ½ behind the Rays in the chase for the wild card (the Rangers had already clinched the AL West on Friday). For all intents, this was a game the Angels needed to win if their playoff hopes were to be sustained as they headed into the final series of the season, three games against the Rangers.

But I had spent the past five days on a cruise ship, sailing down the coast from Vancouver to Los Angeles, and had been off the grid—all of this was news to me when we landed in L.A. on Sunday morning. So it was a quick glance at the MLB website (the "Road to October" article I read before the game is here) to get the lay of the land before we headed to the ballpark for the afternoon game.

This was my second visit to Angel Stadium, and it strikes me (speaking as one who has been to all of four MLB ballparks) as rather generic. The stadium sits in the middle of a sea of asphalt, surrounded on all sides by parking lots. With that said, once inside the park, it’s a decent place for a ballgame. The ushers and other staff are helpful and friendly, and the place is clean and has a great family-friendly atmosphere. On this last observation, it probably didn’t hurt that it was Fan Appreciation Day, the downside of which was that the team store (50% off everything!) was packed. So no ballpark souvenirs for us on this visit.

They paved paradise, and put up a parking lot. And a baseball stadium.

My wife and I had seats up in the lower tier of the upper deck—a few rows shy of nosebleed—just to the first base side of home. When we arrived at our seats batting practice was winding down, and the place was starting to fill up. (The announced attendance was just over 40,000.) At first, I thought that the chap sitting next to me had divided loyalties, dressed as he was in a red Angels cap and a shirt with the green and gold of the A’s. It took me a moment to realize that it was a football jersey; he spent the entire ballgame listening to the Packers game on his radio, happening 2,000 miles away in Chicago. Most of the folks around us—including quite a few families with kids, which is always great to see—were wearing some form of Angel gear, except for a contingent of maybe eight people a few rows in front of us who were there to root for Oakland. Loudly.

The starting pitchers both entered the game with ERAs north of 5, Rich Harden for the A’s and Joel Pinero for the Angels. Based on that, I anticipated a game with plenty of runs; a slug-fest, even.

But this is baseball, so instead we got something quite different. The game opened with Pinero setting down the A’s in order. In the bottom of the first, the second Angel batter, Howie Kendrick, tripled and scored when Bobby Abreu singled. 1-0 Angels after one inning.

The Angels got something started in the second, with Callaspo leading off with a single, but ultimately they failed to score and left two stranded. In the third, Abreu smacked a solo homer, and Vernon Wells hit another solo shot in the sixth. Fireworks over the waterfalls, much to the delight of the crowd—although they really aren’t that effective on a sunny California afternoon.

Meanwhile, Pinero was mowing down the Athletics; he was perfect through 4 1/3 innings. Oakland’s #5 batter David DeJesus got the team’s first hit with one out in the 5th inning, only to have the next batter ground into an inning-ending double play.

Pinero lasted into the seventh, giving up two one-out singles before being lifted for Cassevah. I was surprised by how few fans acknowledged Pinero as he left the field—he’d been very effective, giving up three hits, no walks, and striking out four. But apparently almost nobody noticed. Ultimately the two Oakland runners failed to score, so after seven innings, the score was 3-0.

Things got lively in the eighth inning. Oakland loaded the bases on a pair of singles and a walk, and then plated two runs on another single and a sac fly, narrowing the gap to 3-2 Angels.

Wells strikes out as Hunter and Callaspo pull off a double-steal.

But the Angels got the two runs right back in their half of the eighth. Torii Hunter walked to lead off, and then Navarro bunted to move Hunter to second. Callaspo walked, and he and Hunter pulled off a double-steal. This was followed by a single to right field, and both Hunter and Callaspo scored.  Needless to say, the crowd was pretty pleased at this turn of events. 5-2 Angels after eight innings, with the win expectancy at 97.5%.

But this is baseball. In the top of the ninth, Josh Willingham led off with a homer. Oakland then proceeded to score three more runs on a string of hits, a goofy error, and a sac fly. After 8 1/2 innings, Oakland was leading 6-5 and had turned the win expectancy to 83.3% in their favour. At this point, the Oakland fans in front of us were more enthusiastic than any of the Angels fans around us had been all game.

In the bottom of the ninth, the crowd didn’t have any spark (not that they had a lot to begin with), and the team didn’t give them anything to cheer about either. Howie Kendrick managed a two-out walk, but that was it. And with the loss, the Angels’ hopes of making the playoffs dimmed to only the faintest of glimmers. (In the final series of the season, they ended up being swept by the World Series-bound Rangers, quite the opposite to what they needed.)

From a neutral fan’s point of view, this was a great game. Pinero’s pitching was worthy of note, and the unlikely Oakland comeback made for an exciting finish.

Some game summaries:


February 13, 2012

Bill James in People magazine

People, June 3, 1991.

No kidding.

First up (May 31, 1982) was a review of the then-newly published Baseball Abstract in the "Picks and Pans" section.

Then in 1991 a personal profile entitled "Holy R.b.i.—it's Statman!" (note: capitalization of RBI is in the archive) appeared, and can be found here.


January 26, 2012

Statistical approaches

At the Statistics for Experimental Biologists blog, a short article about the four main ways to approach a statistical problem:  Bayesian, frequentist, information-theoretic, and likelihood, entitled "Putting the methods you use into context". The article points out that there are overlaps between the four approaches, and ends with the statement "Knowing the big picture allows you to reflect on the methods you use, and ask whether they are appropriate for the task. It is also a useful antidote to some of the dogmatism associated with statistical analyses (you don't have to do something one way just because that's how you saw others do it)."

The article also includes some references and links to peer-reviewed journal articles in PDF form that are (contrary to my previous post) freely available to the hoi polloi.  One of those is by Gerd Gigerenzer, and appeared in the Journal of Socio-Economics in 2004. In the article, titled "Mindless Statistics", Gigerenzer defines the "null ritual" as having three steps:
1. set up a statistical null hypothesis, but do not specify your own hypothesis nor any alternative hypothesis,
2. use the 5% significance level for rejecting and accepting the null hypothesis, and
3. always perform this procedure.
Gigerenzer then asks "Why do intelligent people engage in statistical rituals rather than in statistical thinking?" His conclusions are rooted in the ritualistic nature of statistical hypothesis testing, in that it has all of the elements of social rituals.  First, there is the repetition of the same action (repeating the test over and over).  Second, there is a focus on special numbers and colours -- 0.05 and 0.01, two standard deviations, etc.  Third, a ritual incorporates fears about serious sanctions for transgressions; in the academy, the adjudicators are professors, academic advisors, journal editors, and textbook publishers.  And finally, the ritual relies on "wishful thinking and delusions that virtually eliminate critical thinking", which in statistical research is the p-value.

I will finish with Gigerenzer's final two paragraphs, quoted here in their entirety:

We know but often forget that the problem of inductive inference has no single solution. There is no uniformly most powerful test, that is, no method that is best for every problem. Statistical theory has provided us with a toolbox with effective instruments, which require judgment about when it is right to use them. When textbooks and curricula begin to teach the toolbox, students will automatically learn to make judgments. And they will realize that in many applications, a skilful and transparent descriptive data analysis is sufficient, and preferable to the application of statistical routines chosen for their complexity and opacity. Judgment is part of the art of statistics.

To stop the ritual, we also need more guts and nerves. We need some pounds of courage to cease playing along in this embarrassing game. This may cause friction with editors and colleagues, but it will in the end help them to enter the dawn of statistical thinking.


January 25, 2012

The bizarre world of academic journals

On a fairly regular basis, for both my work and for light entertainment, I seek out journal articles written by leading scholars.  And I begin every search knowing there is a high probability that it will be foiled, and I will find myself unable to access those materials.

Laura McKenna, writing for The Atlantic, sums up the bizarre nature of academic journals with this well-written and succinct article titled "Locked in the Ivory Tower: Why JSTOR Imprisons Academic Research".

Highly recommended.


January 13, 2012

SABR Analytics Conference

The Society for American Baseball Research (SABR) has just announced that they will be hosting the first-ever conference dedicated to baseball analytics from March 15-17, 2012.

All of the "featured speakers" shown are team general managers and executives -- but I anticipate that as more speakers are announced there will be some number crunchers added to the list.

Hopefully there will be a conference proceedings document produced for those of us who won't be able to attend.


January 7, 2012

The music of my mind

Jason Branon at Baseball Nation offers up "ProGS - The Pronunciation Guide For Sabermetricians", which is just what it says.

At first I thought, why bother? I tended to side with Jason's leader, Rob Neyer -- the only people who read this stuff never leave the house. And then I realized that isn't entirely true -- at the annual SABR convention (in Minneapolis this year), baseball nerds get together, and the quantitative specialists need to be able to argue about the "new stats" and the interpretation thereof, rather than wasting precious time and energy arguing about pronunciation. (Which is a whole other dimension of nerdiness.)

Perhaps someone can create a sabermetric rewrite of the Gershwin brothers' "Let's Call the Whole Thing Off" -- the tuh-MAY-toe / ta-MAH-toe song.

Fred shows Ginger how to execute an effective throw from second base on a double play pivot.