<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-684119683005319088</id><updated>2012-02-16T10:27:44.923-08:00</updated><category term='win expectancy'/><category term='Cliff Lee'/><category term='weighting'/><category term='minor leagues'/><category term='Billy Beane'/><category term='Victoria Seals'/><category term='Tom Wilhelmsen'/><category term='actuarial'/><category term='XKCD'/><category term='SABR'/><category term='Minnesota Twins'/><category term='sabermetrics'/><category term='ERA'/><category term='small sample size'/><category term='regression toward the mean'/><category term='Dave Winfield'/><category term='streak'/><category term='baseball bookshelf'/><category term='regression'/><category term='perfect game'/><category term='academics'/><category term='Andrew Gelman'/><category term='no-hitter'/><category term='Albert Pujols'/><category term='Vancouver Canadians'/><category term='infographics'/><category term='Bo Hart'/><category term='Strat-o-matic'/><category term='physics'/><category term='football'/><category term='run expectancy'/><category term='probability'/><category term='prediction'/><category term='monte carlo'/><category term='Golden Baseball League'/><category term='pythagorean'/><category term='New York Yankees'/><category term='pitching'/><category term='World Series'/><category term='Detroit Tigers'/><category term='golf'/><category term='WPA'/><category term='log5'/><category term='Derek Jeter'/><category term='Bayesian'/><category term='politics'/><category term='Jason Kendall'/><category term='real life'/><category term='random'/><category term='Bill James'/><category term='good math-bad 
statistics'/><category term='humour'/><category term='Baltimore Orioles'/><category term='wins'/><category term='Mariano Rivera'/><category term='philosophy'/><category term='sportcaster'/><category term='luck'/><category term='salary'/><category term='epistemology'/><category term='batting'/><category term='poisson'/><category term='Aleks Jakulin'/><category term='linear weights'/><category term='Seattle Mariners'/><category term='economics'/><category term='unemployment'/><category term='slugging'/><category term='payroll'/><category term='talent distribution'/><category term='Michael Lewis'/><category term='statistics'/><category term='Greg Maddux'/><category term='correlation'/><category term='Josh Hamilton'/><category term='Moneyball'/><category term='selection bias'/><category term='tennis'/><category term='skill'/><category term='Boston Red Sox'/><title type='text'>Bayes Ball</title><subtitle type='html'>The Reverend Thomas Bayes never saw a baseball, but he would have enjoyed thinking about the probabilistic nature of the game.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>53</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7042493892150814234</id><published>2012-02-13T14:38:00.001-08:00</published><updated>2012-02-13T18:53:05.022-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><title type='text'>Bill James in People magazine</title><content type='html'>&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://img2.timeinc.net/people/i/2007/archive/covers/91/6_3_91_205x273.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" src="http://img2.timeinc.net/people/i/2007/archive/covers/91/6_3_91_205x273.jpg" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;People, June 3, 1991.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;No kidding.  &lt;br /&gt;&lt;br /&gt;First up (May 31, 1982) was &lt;a href="http://www.people.com/people/archive/article/0,,20082251,00.html"&gt;a review of the then-newly published Baseball Abstract&lt;/a&gt; in the "Picks and Pans" section.&lt;br /&gt;&lt;br /&gt;Then in 1991 a personal profile entitled "Holy R.b.i.—it's Statman!" 
(note: capitalization of RBI is in the archive) appeared, and can be found &lt;a href="http://www.people.com/people/archive/article/0,,20115245,00.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7042493892150814234?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7042493892150814234/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2012/02/bill-james-in-people-magazine.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7042493892150814234'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7042493892150814234'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2012/02/bill-james-in-people-magazine.html' title='Bill James in People magazine'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5748822904910510637</id><published>2012-01-26T15:52:00.000-08:00</published><updated>2012-01-26T15:52:11.788-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='good math-bad statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='academics'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Bayesian'/><title type='text'>Statistical approaches</title><content type='html'>At the Statistics for Experimental Biologists blog, a short article about the four main ways to approach a statistical problem:&amp;nbsp; Bayesian, frequentist, information-theoretic, and likelihood, entitled "&lt;a href="http://labstats.net/articles/overview.html"&gt;Putting the methods you use into context&lt;/a&gt;". The article points out that there are overlaps between the four approaches, and ends with the statement "Knowing the big picture allows you to reflect on the methods you use, and ask whether they are appropriate for the task. It is also a useful antidote to some of the dogmatism associated with statistical analyses (you don't have to do something one way just because that's how you saw others do it)."&lt;br /&gt;&lt;br /&gt;The article also includes some references and links to peer-reviewed journal articles in PDF form that are (contrary to my &lt;a href="http://bayesball.blogspot.com/2012/01/bizarre-world-of-academic-journals.html" target="_blank"&gt;previous post&lt;/a&gt;) freely available to the hoi polloi.&amp;nbsp; One of those is by &lt;a href="http://www.mpib-berlin.mpg.de/en/staff/gerd-gigerenzer" target="_blank"&gt;Gerd Gigerenzer&lt;/a&gt;, and appeared in the &lt;i&gt;Journal of Socio-Economics&lt;/i&gt; in 2004.  In the article, titled "&lt;a href="http://people.umass.edu/%7Ebioep740/yr2009/topics/Gigerenzer-jSoc-Econ-1994.pdf" target="_blank"&gt;Mindless Statistics&lt;/a&gt;", Gigerenzer defines the "null ritual" as having three steps:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;1. set up a statistical null hypothesis, but do not specify your own hypothesis nor any alternative hypothesis,&lt;br /&gt;2. use the 5% significance level for rejecting and accepting the null hypothesis, and&lt;br /&gt;3. 
always perform this procedure.&lt;/blockquote&gt;Gigerenzer then asks "Why do intelligent people engage in statistical rituals rather than in statistical thinking?" His conclusions are rooted in the ritualistic nature of statistical hypothesis testing, in that it has all of the elements of social rituals.&amp;nbsp; First, there is the repetition of the same action (repeating the test over and over).&amp;nbsp; Second, there is a focus on special numbers and colours -- 0.05 and 0.01, two standard deviations, etc.&amp;nbsp; Third, a ritual incorporates fears about serious sanctions for transgressions; in the academy, the adjudicators are professors, academic advisors, journal editors, and textbook publishers.&amp;nbsp; And finally, the ritual relies on "wishful thinking and delusions that virtually eliminate critical thinking", which in statistical research is the &lt;i&gt;p&lt;/i&gt;-value.&lt;br /&gt;&lt;br /&gt;I will finish with Gigerenzer's final two paragraphs, quoted here in their entirety:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;We know but often forget that the problem of inductive inference has no single solution.&lt;br /&gt;There is no uniformly most powerful test, that is, no method that is best for every problem.&lt;br /&gt;Statistical theory has provided us with a toolbox with effective instruments, which require&lt;br /&gt;judgment about when it is right to use them. When textbooks and curricula begin to teach&lt;br /&gt;the toolbox, students will automatically learn to make judgments. And they will realize that&lt;br /&gt;in many applications, a skilful and transparent descriptive data analysis is sufficient, and&lt;br /&gt;preferable to the application of statistical routines chosen for their complexity and opacity.&lt;br /&gt;Judgment is part of the art of statistics.&lt;br /&gt;To stop the ritual, we also need more guts and nerves. We need some pounds of courage&lt;br /&gt;to cease playing along in this embarrassing game. 
This may cause friction with editors and&lt;br /&gt;colleagues, but it will in the end help them to enter the dawn of statistical thinking.&lt;/blockquote&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5748822904910510637?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5748822904910510637/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2012/01/statistical-approaches.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5748822904910510637'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5748822904910510637'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2012/01/statistical-approaches.html' title='Statistical approaches'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-455058682776222607</id><published>2012-01-25T22:10:00.000-08:00</published><updated>2012-01-25T22:10:44.180-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='academics'/><title type='text'>The bizarre world of academic journals</title><content type='html'>On a fairly regular basis, for both my work and for light entertainment, I seek out journal articles written by leading scholars. 
&amp;nbsp;And I begin every search knowing there is a high probability that it will be foiled, and I will find myself unable to access those materials.&lt;br /&gt;&lt;br /&gt;Laura McKenna, writing for &lt;i&gt;The Atlantic&lt;/i&gt;, sums up the bizarre nature of academic journals with this well-written and succinct article titled &lt;a href="http://www.theatlantic.com/business/archive/2012/01/locked-in-the-ivory-tower-why-jstor-imprisons-academic-research/251649/" target="_blank"&gt;"Locked in the Ivory Tower: Why JSTOR Imprisons Academic Research"&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Highly recommended.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-455058682776222607?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/455058682776222607/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2012/01/bizarre-world-of-academic-journals.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/455058682776222607'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/455058682776222607'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2012/01/bizarre-world-of-academic-journals.html' title='The bizarre world of academic journals'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-3240819900290458163</id><published>2012-01-13T13:55:00.000-08:00</published><updated>2012-01-13T13:56:17.426-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SABR'/><category scheme='http://www.blogger.com/atom/ns#' term='sabermetrics'/><title type='text'>SABR Analytics Conference</title><content type='html'>The Society for American Baseball Research (SABR) has just announced that they will be hosting the first-ever &lt;a href="http://sabr.org/analytics" target="_blank"&gt;conference dedicated to baseball analytics&lt;/a&gt; from March 15-17, 2012.&lt;br /&gt;&lt;br /&gt;All of the "featured speakers" shown are team general managers and executives&amp;nbsp;-- but I anticipate that as more speakers are announced there will be some number crunchers added to the list.&lt;br /&gt;&lt;br /&gt;Hopefully there will be a conference proceedings document produced for those of us who won't be able to attend.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-3240819900290458163?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/3240819900290458163/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2012/01/sabr-analytics-conference.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3240819900290458163'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3240819900290458163'/><link rel='alternate' type='text/html' 
href='http://bayesball.blogspot.com/2012/01/sabr-analytics-conference.html' title='SABR Analytics Conference'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5072164582651752178</id><published>2012-01-07T17:18:00.000-08:00</published><updated>2012-01-07T17:32:30.793-08:00</updated><title type='text'>The music of my mind</title><content type='html'>Jason Branon at Baseball Nation offers up "&lt;a href="http://www.blogger.com/ProGS%20-%20The%20Pronunciation%20Guide%20For%20Sabermetricians"&gt;ProGS - The Pronunciation Guide For Sabermetricians&lt;/a&gt;", which is just what it says.&lt;br /&gt;&lt;br /&gt;At first I thought, why bother? I tended to side with Jason's leader, Rob Neyer -- the only people who read this stuff never leave the house.  And then I realized that isn't entirely true -- at the annual SABR convention (&lt;a href="http://sabr.org/convention"&gt;in Minneapolis this year&lt;/a&gt;), baseball nerds get together, and the quantitative specialists need to be able to argue about the "new stats" and the interpretation thereof, rather than wasting precious time and energy arguing about pronunciation.  
(Which is a whole other dimension of nerdiness.)&lt;br /&gt;&lt;br /&gt;Perhaps someone can create a sabermetric rewrite of the Gershwin brothers' "Let's call the whole thing off" -- the tuh-MAY-toe / ta-MAH-toe song.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://3.bp.blogspot.com/-GqeydSciZqA/TwjqT6Q9PnI/AAAAAAAAAIg/z9WWZC1Smwo/s1600/fred+and+ginger+3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="279" src="http://3.bp.blogspot.com/-GqeydSciZqA/TwjqT6Q9PnI/AAAAAAAAAIg/z9WWZC1Smwo/s320/fred+and+ginger+3.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;i&gt;Fred shows Ginger how to execute an effective throw from second base on a double play pivot.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5072164582651752178?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5072164582651752178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2012/01/ironic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5072164582651752178'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5072164582651752178'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2012/01/ironic.html' title='The music of my mind'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-GqeydSciZqA/TwjqT6Q9PnI/AAAAAAAAAIg/z9WWZC1Smwo/s72-c/fred+and+ginger+3.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7313495038203240318</id><published>2011-12-14T19:21:00.000-08:00</published><updated>2011-12-14T19:28:25.476-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unemployment'/><category scheme='http://www.blogger.com/atom/ns#' term='real life'/><title type='text'>OT: Unemployment Rates in the U.S.A.</title><content type='html'>A great example of &lt;a href="http://flowingdata.com/2011/12/12/fox-news-still-makes-awesome-charts/" target="_blank"&gt;a truly awful graph&lt;/a&gt; was posted on Flowing Data, starting &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/this_week_in_chart_failure/" target="_blank"&gt;a conversation on The Book&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I posted the following comment there:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span class="Apple-style-span" style="background-color: #cccccc;"&gt;Wexler/27 beat me to the BLS data sets... I will &amp;nbsp;note that the "discouraged" numbers can be found in the "characteristics of the unemployed" tables. &amp;nbsp;The different ways of parsing what constitutes "unemployment" are ways to try to get to the nuance in why people aren't working. &amp;nbsp;The narrow "actively seeking work" definition of defining who is in the labour market is a way to cut through demographic changes (e.g. 
in the post-WWII period, when most women were June Cleavering and not seeking work), increases in post-secondary enrollments, etc.&lt;br /&gt;With that said, the increase in people who have thrown in the towel is, to me, one of the most disturbing parts of the current recession.&lt;br /&gt;Wexler/7 and MGL/8 raise the question of attribution -- how much is the President (in the current circumstance, Obama) responsible for unemployment rates?&lt;br /&gt;I dug up some historical U.S. unemployment rate data going back to 1948, and I have posted a chart of it to my own blog (because I have no idea how to do that directly). &lt;br /&gt;A summary: &amp;nbsp;the current increase in unemployment started in the last year of G.W. Bush's 2nd term (rising from 5.0% in January 2008 to 6.8% by November, the month of the election). &amp;nbsp;Going back, there was a peak in unemployment during the first G.W. presidency, and another that straddled G.H.W. Bush and Clinton. &amp;nbsp;And the worst unemployment rates since the Great Depression (higher rates and with a longer peak than the current phase) were during the first term of Ronald Reagan's presidency. &amp;nbsp;&lt;/span&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Here's a chart of U.S. 
unemployment rates from January 1948 to November 2011:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://3.bp.blogspot.com/-zaoMNHpVUH8/TullN3Jf7GI/AAAAAAAAAII/ad0wXWr6LpM/s1600/US+unemp+SA_1948-2011.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="225" src="http://3.bp.blogspot.com/-zaoMNHpVUH8/TullN3Jf7GI/AAAAAAAAAII/ad0wXWr6LpM/s320/US+unemp+SA_1948-2011.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;And for those of you wanting to focus on the more recent period, the past 20 years (less a month):&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://1.bp.blogspot.com/-C4yXZrAqjUQ/TullOQP9n3I/AAAAAAAAAIQ/F3iZioP5-AU/s1600/US+unemp+SA_1992-2011.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="225" src="http://1.bp.blogspot.com/-C4yXZrAqjUQ/TullOQP9n3I/AAAAAAAAAIQ/F3iZioP5-AU/s320/US+unemp+SA_1992-2011.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I obtained the U.S. Department of Labor data set from the &lt;a href="http://research.stlouisfed.org/" target="_blank"&gt;Economic Research&lt;/a&gt; pages of the Federal Reserve Bank of St. 
Louis&amp;nbsp;&lt;a href="http://research.stlouisfed.org/fred2/data/UNRATE.txt" target="_blank"&gt;here&lt;/a&gt;, and have converted that text file to &lt;a href="https://docs.google.com/open?id=0B7t4wpcrwqkBNDBhZjAyYWQtNzk4NS00MGQ1LWEyMTMtMDcxZTliNzJiMDk4" target="_blank"&gt;an Excel file&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;More Bureau of Labor Statistics (BLS) unemployment data can be found &lt;a href="http://www.bls.gov/bls/unemployment.htm" target="_blank"&gt;here&lt;/a&gt;,&amp;nbsp;&lt;a href="http://www.bls.gov/cps/tables.htm" target="_blank"&gt;here&lt;/a&gt; and &lt;a href="http://www.bls.gov/cps/" target="_blank"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7313495038203240318?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7313495038203240318/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/12/ot-unemployment-rates-in-usa.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7313495038203240318'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7313495038203240318'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/12/ot-unemployment-rates-in-usa.html' title='OT: Unemployment Rates in the U.S.A.'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-zaoMNHpVUH8/TullN3Jf7GI/AAAAAAAAAII/ad0wXWr6LpM/s72-c/US+unemp+SA_1948-2011.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-6523464198658055162</id><published>2011-11-21T10:32:00.000-08:00</published><updated>2011-11-21T10:42:18.897-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='infographics'/><category scheme='http://www.blogger.com/atom/ns#' term='prediction'/><category scheme='http://www.blogger.com/atom/ns#' term='Vancouver Canadians'/><category scheme='http://www.blogger.com/atom/ns#' term='minor leagues'/><title type='text'>Farm system success</title><content type='html'>Flip Flop Flyball has had a number of good infographics (and humour items) in the past, but their recent &lt;a href="http://www.flipflopflyin.com/flipflopflyball/info-organizations2011.html"&gt;"Wins and loses throughout each team's system"&lt;/a&gt; chart is particularly&amp;nbsp;interesting. &amp;nbsp;One thing that caught my eye is that no team in the Houston Astros system managed to break the .500 mark in 2011.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://www.flipflopflyin.com/flipflopflyball/info-organizations2011.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320px" src="http://www.flipflopflyin.com/flipflopflyball/info-organizations2011.png" width="253px" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;This raises a question in my mind. 
Can the current&amp;nbsp;performance of&amp;nbsp;minor league affiliates be used to predict MLB team performance at some future date?&amp;nbsp; (Economists would call this a leading indicator.)&amp;nbsp; All of the research on minor league performance that I'm aware of is in service of&amp;nbsp;forecasting individual player performance.&amp;nbsp;For a good review of that work, see "&lt;a href="http://www.fangraphs.com/library/index.php/the-projection-rundown-the-basics-on-marcels-zips-cairo-oliver-and-the-rest/" target="_blank"&gt;The Projection Rundown&lt;/a&gt;" at Fangraphs.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;But I'm wondering if the fluid nature of the minor leagues will yield any sort of meaningful result at the team level.&amp;nbsp; Not only are players constantly moving up and down between the levels, it also seems to me that they are every bit as likely (if not more so) to move mid-season&amp;nbsp;from one organization's farm system to another (resource: &lt;a href="http://www.baseballamerica.com/statistics/players/mlfa.php" target="_blank"&gt;Baseball America's listing of minor league players&lt;/a&gt;).&amp;nbsp; And the farm teams themselves are prone to shifting from one organization to another, and moving up and down the levels.&amp;nbsp; As one example, the &lt;a href="http://www.minorleaguebaseball.com/index.jsp?sid=t435" target="_blank"&gt;Vancouver Canadians&lt;/a&gt; of the single-A Northwest League were affiliates of the&amp;nbsp;Oakland A's for 11 seasons, but in 2011 came under the Toronto Blue Jays umbrella (they finished with a 0.513 record, second in their division).&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Which then leads me back to the infographic: other than 2011 
results, does it tell us anything?&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;-30-&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-6523464198658055162?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/6523464198658055162/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/11/farm-system-success.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6523464198658055162'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6523464198658055162'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/11/farm-system-success.html' title='Farm system success'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7134240297759720915</id><published>2011-11-14T14:50:00.000-08:00</published><updated>2011-11-14T14:50:54.902-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Moneyball'/><category scheme='http://www.blogger.com/atom/ns#' term='Michael Lewis'/><category scheme='http://www.blogger.com/atom/ns#' term='Billy Beane'/><title type='text'>Lewis &amp; Beane interview</title><content 
type='html'>&lt;em&gt;Moneyball&lt;/em&gt; (the movie) opens in the U.K. on November 25, and as part of the publicity, &lt;em&gt;The Financial Times&lt;/em&gt; features an &lt;a href="http://www.ft.com/intl/cms/s/2/3f5cc88c-0b21-11e1-ae56-00144feabdc0.html" target="_blank"&gt;in-depth interview with both Michael Lewis and Billy Beane&lt;/a&gt;&amp;nbsp;by Simon&amp;nbsp;Kuper.&lt;br /&gt;&lt;br /&gt;It's quite a revealing interview that digs into the relationship between Lewis and Beane -- why Lewis was interested in finding the story, and why Beane let Lewis hang around.&lt;br /&gt;&lt;br /&gt;But to whet your appetite, here are a couple of highlights.  First, a quote from Michael Lewis: &lt;br /&gt;&lt;blockquote&gt;"Baseball is a stupid-making enterprise in that nobody wants to be singled out or say something dumb. You wander in the clubhouse and it’s amazing how incurious the players are. One reason I was attracted to Scott Hatteberg [the former A’s player] as a character: he was just curious: ‘What the hell are you doing here, man?’”&lt;/blockquote&gt;&lt;br /&gt;On the criticisms of &lt;i&gt;Moneyball&lt;/i&gt;:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;There are two silly objections often made to Lewis’s book. The first is that if Moneyball works so well, then why haven’t the A’s had a winning season since 2006? We meet on a sunny October morning, mid-playoffs, a perfect day for baseball, but the team’s season has long since ended.&lt;br /&gt;&lt;br /&gt;However, the people who make this objection don’t seem to grasp the basic principles of imitation and catch-up. Once all teams are playing Moneyball, then playing Moneyball no longer gives you an edge. Indeed, the richer clubs have the means to play it smarter. The New York Yankees recently hired 21 statisticians, Beane marvels.&lt;br /&gt;&lt;br /&gt;The other common snipe is that Beane should never have spilled his secrets to Lewis. That ruined the A’s, the critics say. But Lewis dismisses the charge. 
First, he notes, Beane had never imagined their conversations would spiral into a book. Lewis says, “I was going to do something little. By the time I thought I was going to do something big I’d hung around so much it would have been socially awkward to ask me to leave.”&lt;br /&gt;&lt;br /&gt;Second, notes Lewis, by 2002 &lt;em&gt;Moneyball&lt;/em&gt; was already spreading. The book ends with the Red Sox offering Beane the highest GM’s salary in baseball history. Only when Beane turned them down, having decided after Stanford that he’d never do anything just for money again, did the Red Sox hire Epstein. “The market was moving already,” says Lewis. “The teams that wanted to do it were going to do it anyway, so no book was going to make any difference. My view is the only effect of the book was to give them [the A’s] the credit. If no book had been written, Theo would have been branded the man who reinvented baseball.”&lt;/blockquote&gt;&lt;br /&gt;Of course, Epstein's stuff worked in the playoffs.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7134240297759720915?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7134240297759720915/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/11/lewis-beane-interview.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7134240297759720915'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7134240297759720915'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/11/lewis-beane-interview.html' title='Lewis &amp; Beane interview'/><author><name>Martin 
Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-2999802234562411038</id><published>2011-11-07T12:06:00.000-08:00</published><updated>2011-11-07T12:06:21.096-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='baseball bookshelf'/><category scheme='http://www.blogger.com/atom/ns#' term='SABR'/><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><category scheme='http://www.blogger.com/atom/ns#' term='sabermetrics'/><title type='text'>The Sabermetric bookshelf, #2</title><content type='html'>&lt;em&gt;Baseball Analyst&lt;/em&gt;, 1982-1989 (Bill James, publisher and editor)&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;hr /&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;SABR is now hosting -- with the blessing of Bill James, and through the work of &lt;a href="http://sabermetricresearch.blogspot.com/" target="_blank"&gt;Phil Birnbaum&lt;/a&gt;&amp;nbsp;-- &lt;a href="http://sabr.org/latest/baseball-analyst-archives-now-available" target="_blank"&gt;the complete &lt;em&gt;Baseball Analyst&lt;/em&gt;&lt;/a&gt;.&amp;nbsp; Between 1982 and 1989, Bill James published 40 issues of&amp;nbsp;&lt;em&gt;Baseball Analyst&lt;/em&gt;, which in retrospect is now recognized as the launch pad for some fundamental thinking about using quantitative approaches to understand baseball. &lt;br /&gt;&lt;br /&gt;The initial issue got off to a great start, with an article about fielding by Paul Schwarzenbart. 
In his introduction to the issue, James writes that the article "demonstrates that fielding statistics, like batting and pitching&amp;nbsp;but apparently even more so, are the products in part&amp;nbsp;of circumstances as well as men."&amp;nbsp;This is a topic that, 30 years later, continues to provide plenty of fodder for analysis (e.g. this blog post from a month ago by Tangotiger, "&lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/not_all_fielding_opportunities_are_created_the_same/" target="_blank"&gt;Not all fielding opportunities are created the same&lt;/a&gt;").&lt;br /&gt;&lt;br /&gt;In later issues, there are articles covering the usual parade of topics: clutch hitting, ballpark effects, how much young pitchers should work, ageing of ball players,&amp;nbsp;and of course movie reviews.&lt;br /&gt;&lt;br /&gt;There are also familiar names: Pete Palmer, Phil Birnbaum, and Bill James himself.&lt;br /&gt;&lt;br /&gt;All in all, &lt;em&gt;Baseball Analyst&lt;/em&gt; is an interesting time capsule.&amp;nbsp;The tools the sabermetric community uses to communicate have shifted --&amp;nbsp;when was the last time you subscribed to a&amp;nbsp;magazine produced on a typewriter and mimeograph?&amp;nbsp;But more importantly,&amp;nbsp;it&amp;nbsp;demonstrates how thinking about these topics has shifted. 
This shift is both because of further research (we know more than we used to)&amp;nbsp;and because of the proliferation of data and cheap computing power.&lt;br /&gt;&lt;br /&gt;But it also shows that in spite of 30 years of analysis, there are still many questions unresolved.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-2999802234562411038?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/2999802234562411038/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/11/sabermetric-bookshelf-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2999802234562411038'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2999802234562411038'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/11/sabermetric-bookshelf-2.html' title='The Sabermetric bookshelf, #2'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-4818279641751139198</id><published>2011-10-18T11:57:00.000-07:00</published><updated>2011-10-18T11:57:36.530-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='World Series'/><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><category scheme='http://www.blogger.com/atom/ns#' 
term='prediction'/><title type='text'>World Series prediction: the Bill James method</title><content type='html'>Bill James developed a method for predicting playoff series winners, last updated in the 1984 edition of &lt;em&gt;Baseball Abstract&lt;/em&gt; in an essay titled "The World Series Prediction System, Revisited".&amp;nbsp; At that point, it had a pretty good track record -- 73% success in predicting the winner of all the postseason series in the 20th century.&lt;br /&gt;&lt;br /&gt;Mike Lynch over at seamheads.com used the method (without any adjustments, updates, or other tweaks)&amp;nbsp;to &lt;a href="http://seamheads.com/2010/10/26/bill-james-world-series-predictor-goes-with/"&gt;predict the 2010 World Series&lt;/a&gt; -- which correctly identified the Giants.&lt;br /&gt;&lt;br /&gt;This year, &lt;a href="http://seamheads.com/2011/10/17/and-your-2011-world-series-winner-is/"&gt;Lynch has again used the tool&lt;/a&gt; and tabulated the Rangers and the Cardinals according to the Bill James method.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;The result:&amp;nbsp; the Rangers come out as solid favorites.&lt;br /&gt;&lt;br /&gt;(A couple of other older references to previous use of the method are &lt;a href="http://www.baseballthinkfactory.org/files/newsstand/discussion/bill_james_prediction_system/"&gt;here&lt;/a&gt; and &lt;a href="http://sonsofsamhorn.yuku.com/forum/viewtopic/id/17633"&gt;here&lt;/a&gt;. 
Other than that, I haven't found anything on the web that uses or updates the method.)&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-4818279641751139198?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/4818279641751139198/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/10/world-series-prediction-bill-james.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4818279641751139198'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4818279641751139198'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/10/world-series-prediction-bill-james.html' title='World Series prediction: the Bill James method'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7993177031169757870</id><published>2011-10-17T19:11:00.000-07:00</published><updated>2011-10-17T19:11:54.308-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='World Series'/><category scheme='http://www.blogger.com/atom/ns#' term='log5'/><category scheme='http://www.blogger.com/atom/ns#' term='Bayesian'/><category scheme='http://www.blogger.com/atom/ns#' term='prediction'/><title type='text'>World Series prediction</title><content type='html'>The 2011 World 
Series starts in a couple of days, and it's time for the pundits to come out and make their predictions.&amp;nbsp; Over on coolstandings.com they've posted their &lt;a href="http://www.coolstandings.com/playoff_outcomes2.asp?sn=2011"&gt;prediction for the World Series&lt;/a&gt;.&amp;nbsp; Here's a screenshot of their "smart" prediction:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://4.bp.blogspot.com/-Zs6ZIXqCWAI/TpzLNdloFTI/AAAAAAAAAHo/za5CuaIQ9Ng/s1600/coolstandings_worldseries2011_20111017.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="286" src="http://4.bp.blogspot.com/-Zs6ZIXqCWAI/TpzLNdloFTI/AAAAAAAAAHo/za5CuaIQ9Ng/s320/coolstandings_worldseries2011_20111017.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;(The "dumb" prediction is 50/50 for either team, so there's no point talking about that. And I've posted a screenshot, since their predictions are live and will change upon the outcome of the first game of the World Series. An example of &lt;a href="http://en.wikipedia.org/wiki/Monty_Hall_problem"&gt;the Monty Hall problem&lt;/a&gt;, in real life.)&lt;br /&gt;&lt;br /&gt;To summarize:&amp;nbsp; Texas shows as having a 68.2% probability of winning the World Series.&lt;br /&gt;&lt;br /&gt;I'm not sure of the details of their methodology, but we can use each team's regular season win/loss record to employ the &lt;a href="http://www.tangotiger.net/wiki/index.php?title=Log5"&gt;"log5"&lt;/a&gt; approach to come up with our own prediction.&amp;nbsp; So I did that, and my first prediction is for a Texas victory (58% probability) -- and if pressed to predict the series length, it would be Texas in 6 games (17% of the outcomes are Texas 4-2). 
&amp;nbsp;Both probabilities are&amp;nbsp;substantially lower than the coolstandings prediction.&lt;br /&gt;&lt;br /&gt;But we can be a bit more sophisticated in our approach, using an adjusted win/loss percentage that employs a Bayesian adjustment to each team's final result.&amp;nbsp; (This is &lt;a href="http://bayesball.blogspot.com/2011/05/early-season-standings-and-bayes.html"&gt;the same method I used back in May&lt;/a&gt; for the early season results -- after 162 games the impact of the prior is much reduced.) This changes Texas' winning percentage to 0.571, and St. Louis to 0.543. &amp;nbsp;(Google doc spreadsheet &lt;a href="https://docs.google.com/spreadsheet/ccc?key=0Art4wpcrwqkBdEdMaGFMdlZhVWoweDhfWVJTYXpBZFE&amp;amp;hl=en_US#gid=0"&gt;here&lt;/a&gt;.) &amp;nbsp;Using the log5 formula, this gives the Rangers a 0.528 probability of beating the Cardinals in any single game.&lt;br /&gt;&lt;br /&gt;Working through the 7-game series, Texas' probability of winning the World Series is 56%.&lt;br /&gt;&lt;br /&gt;And we can be still more clever, by considering the road/home splits of each team. 
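&lt;br /&gt;&lt;br /&gt;Before turning to the splits, here is a minimal Python sketch of the arithmetic used so far. It is not the spreadsheet linked above: the function names log5 and series_win_prob are made up for illustration, and it assumes every game is independent with the same single-game probability. Fed the raw regular-season percentages (.593 for Texas, .556 for St. Louis), it gives Texas roughly a 0.538 chance in any one game and roughly a 58% chance of winning a seven-game series.

```python
import math


def log5(p_a, p_b):
    """Chance that team A beats team B in one game (the log5 formula)."""
    a_wins = p_a * (1 - p_b)
    b_wins = p_b * (1 - p_a)
    return a_wins / (a_wins + b_wins)


def series_win_prob(p, wins_needed=4):
    """Chance of winning a best-of-(2 * wins_needed - 1) series.

    Assumes independent games with a constant win probability p,
    so home and road games are not distinguished here.
    """
    q = 1 - p
    total = 0.0
    # The winner takes the final game after losing k of the earlier ones.
    for k in range(wins_needed):
        total += math.comb(wins_needed - 1 + k, k) * p ** wins_needed * q ** k
    return total


# Raw 2011 regular-season winning percentages quoted in this post.
texas, st_louis = 0.593, 0.556
p_game = log5(texas, st_louis)
p_series = series_win_prob(p_game)
print(round(p_game, 3), round(p_series, 2))
```

The same two calls can be repeated with the posterior percentages in place of the raw ones.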
&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Team &amp;nbsp; &amp;nbsp; &amp;nbsp;W-L &amp;nbsp; &amp;nbsp; % &amp;nbsp; posterior&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;-------- ----- &amp;nbsp;---- &amp;nbsp;---------&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Texas &amp;nbsp; &amp;nbsp;96-66 &amp;nbsp;.593 &amp;nbsp; .571&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;- home &amp;nbsp; 52-29 &amp;nbsp;.642 &amp;nbsp; .591&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;- road &amp;nbsp; 44-37 &amp;nbsp;.543 &amp;nbsp; .527&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;St Louis 90-72 &amp;nbsp;.556 &amp;nbsp; .543&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;- home &amp;nbsp; 45-36 &amp;nbsp;.556 &amp;nbsp; .535&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;- road &amp;nbsp; 45-36 &amp;nbsp;.556 &amp;nbsp; .535&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The home-road splits improve things for the Cardinals, since they had a better home record than Texas' road record and thus become more likely to win a home game. As well, the Cardinals have home field advantage (but only on game 7 -- the Rangers have home field advantage in a 5-game series. But I digress.) 
&amp;nbsp;After using the home-road splits, Texas still remains the favorite, but the probability is down to 54%.&lt;br /&gt;&lt;br /&gt;While my approaches still give Texas the biggest likelihood of victory, my estimates are less emphatic than the probabilities over at coolstandings.&amp;nbsp;&amp;nbsp;Based on the characterizations used at coolstandings,&amp;nbsp;my methods lie somewhere&amp;nbsp;between "dumb" and "smart". &amp;nbsp;"Average intelligence", perhaps.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://1.bp.blogspot.com/-paIs_KtItLc/TpzdgDwypOI/AAAAAAAAAHw/68xY8PdHOFE/s1600/TowMater.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="267" src="http://1.bp.blogspot.com/-paIs_KtItLc/TpzdgDwypOI/AAAAAAAAAHw/68xY8PdHOFE/s320/TowMater.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;i&gt;Tow Mater says "Rangers in 6. 
But I had the Phillies and the Brewers beating the Cardinals, too".&lt;/i&gt;&lt;/div&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7993177031169757870?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7993177031169757870/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/10/world-series-prediction.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7993177031169757870'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7993177031169757870'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/10/world-series-prediction.html' title='World Series prediction'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-Zs6ZIXqCWAI/TpzLNdloFTI/AAAAAAAAAHo/za5CuaIQ9Ng/s72-c/coolstandings_worldseries2011_20111017.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-6150671007738114278</id><published>2011-10-08T11:49:00.000-07:00</published><updated>2011-10-08T11:49:45.180-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='infographics'/><title type='text'>And on the topic of infographics...</title><content type='html'>&lt;div class="separator" style="clear: both; 
text-align: center;"&gt;&lt;a href="http://farm7.static.flickr.com/6190/6143338263_d2497c02fe_z.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;br /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="margin-left: 1em; margin-right: 1em; text-align: left;"&gt;&lt;a href="http://farm7.static.flickr.com/6190/6143338263_d2497c02fe_z.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://farm7.static.flickr.com/6190/6143338263_d2497c02fe_z.jpg" width="251" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&amp;nbsp;Source: &amp;nbsp;&lt;a href="http://www.flickr.com/photos/smoy/6143338263/"&gt;http://www.flickr.com/photos/smoy/6143338263/&lt;/a&gt;, via&amp;nbsp;&lt;a href="http://andrewgelman.com/2011/09/meta-infographic/"&gt;Andrew Gelman&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-6150671007738114278?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/6150671007738114278/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/10/and-on-topic-of-infographics.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6150671007738114278'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6150671007738114278'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/10/and-on-topic-of-infographics.html' title='And on the topic of infographics...'/><author><name>Martin 
Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://farm7.static.flickr.com/6190/6143338263_d2497c02fe_t.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-6787132960156139379</id><published>2011-10-07T12:12:00.000-07:00</published><updated>2011-10-07T12:14:47.590-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='New York Yankees'/><category scheme='http://www.blogger.com/atom/ns#' term='infographics'/><category scheme='http://www.blogger.com/atom/ns#' term='Detroit Tigers'/><category scheme='http://www.blogger.com/atom/ns#' term='WPA'/><title type='text'>WPA contribution infographic</title><content type='html'>I like these WPA&amp;nbsp;word cloud&amp;nbsp;graphics from SB Nation by Kevin Dame, describing the player contributions in last night's Tiger-Yankee ALDS&amp;nbsp;game 5.&amp;nbsp; (From &lt;br /&gt;&lt;a href="http://mlb.sbnation.com/2011/10/7/2474640/yankees-tigers-game-5-visual-box-score"&gt;http://mlb.sbnation.com/2011/10/7/2474640/yankees-tigers-game-5-visual-box-score&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://assets.sbnation.com/assets/739644/heros___goats_tigers.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="214px" kca="true" src="http://assets.sbnation.com/assets/739644/heros___goats_tigers.jpg" width="320px" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://assets.sbnation.com/assets/739640/heros___goats_yankees.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="214px" kca="true" src="http://assets.sbnation.com/assets/739640/heros___goats_yankees.jpg" width="320px" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;One of the things I like is that they emphasize that&amp;nbsp;WPA (&lt;a href="http://www.fangraphs.com/library/index.php/misc/wpa/"&gt;Win Probability Added&lt;/a&gt;)&amp;nbsp;is circumstantial.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;For the Tiger pitching staff, the starter Fister gave up only one run over five innings (that is, four scoreless innings), but gets a smaller font than the closer Valverde&amp;nbsp;who worked only the scoreless ninth inning.&amp;nbsp; An easy example: Fister worked a 1-2-3 1st inning with a 2-0 lead, which was worth&amp;nbsp;0.052 WPA.&amp;nbsp; By contrast, Valverde's 1-2-3 9th inning with a one-run lead (3-2 score) was worth 0.222 WPA.&amp;nbsp; Being later in the game and with a tighter score yielded a higher WPA.&lt;br /&gt;&lt;br /&gt;And for the Yankee hitters,&amp;nbsp;ARod's strikeout to end the game (the end of Valverde's 1-2-3 ninth)&amp;nbsp;was only one-third as important to the Yankee defeat (-0.053 WPA)&amp;nbsp;as Swisher's strikeout to end the 7th inning, when the bases were loaded (-0.154).&amp;nbsp; Of course, on Swisher's strikeout&amp;nbsp;Tiger pitcher Joaquin Benoit set himself up for the big 0.154 WPA by coming in with a runner on 1st, then giving up two singles to load the bases, followed by a walk to close the lead to one run.&lt;br /&gt;&lt;br /&gt;(The &lt;a href="http://www.fangraphs.com/boxscore.aspx?date=2011-10-06&amp;amp;team=Yankees&amp;amp;dh=0"&gt;Fangraphs box score has the details&lt;/a&gt;&amp;nbsp;that were used to make the word clouds, while the individual&amp;nbsp;play log, with the WPA for each at-bat, is &lt;a 
href="http://www.fangraphs.com/plays.aspx?date=2011-10-06&amp;amp;team=Yankees&amp;amp;dh=0"&gt;here&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-6787132960156139379?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/6787132960156139379/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/10/win-expectancy-contribution.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6787132960156139379'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6787132960156139379'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/10/win-expectancy-contribution.html' title='WPA contribution infographic'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-3171534932911954759</id><published>2011-10-04T13:55:00.000-07:00</published><updated>2011-10-04T13:55:45.163-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='actuarial'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='Josh Hamilton'/><title type='text'>Actuarial baseball</title><content type='html'>&lt;em&gt;Been off the grid for a while...&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;A couple of 
weeks ago, Josh Hamilton of the Rangers hit a grand slam that got an above-average amount of attention, since it was tied into a promotion being run by a flooring company.&amp;nbsp; The title of this article describing the homer could instead be "Josh Hamilton's grand slam yields big insurance payout":&lt;br /&gt;&lt;a href="http://espn.go.com/dallas/mlb/story/_/id/6971469/texas-rangers-josh-hamilton-slam-triggers-free-flooring-payout"&gt;http://espn.go.com/dallas/mlb/story/_/id/6971469/texas-rangers-josh-hamilton-slam-triggers-free-flooring-payout&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Somebody, somewhere, in some insurance company, sold coverage for this promotion. &amp;nbsp;And that same&amp;nbsp;somebody (we hope)&amp;nbsp;must have sat down and calculated the probability of Hamilton hitting a grand slam over a one month period, and set the premium based on that probability.&lt;br /&gt;&lt;br /&gt;Summary:&amp;nbsp; insurance is gambling.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-3171534932911954759?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/3171534932911954759/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/10/actuarial-baseball.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3171534932911954759'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3171534932911954759'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/10/actuarial-baseball.html' title='Actuarial baseball'/><author><name>Martin 
Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-1845382549352152462</id><published>2011-08-09T13:47:00.000-07:00</published><updated>2011-08-09T13:47:33.376-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='baseball bookshelf'/><category scheme='http://www.blogger.com/atom/ns#' term='epistemology'/><category scheme='http://www.blogger.com/atom/ns#' term='philosophy'/><category scheme='http://www.blogger.com/atom/ns#' term='Bayesian'/><title type='text'>Bayes book</title><content type='html'>I recently learned about a new book by Sharon Bertsch McGrayne, &lt;em&gt;&lt;a href="http://yalepress.yale.edu/book.asp?isbn=9780300169690"&gt;The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy&lt;/a&gt;&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;I haven't got a copy yet, but based on a couple of reviews, it seems like it's going to be a good read. 
The &lt;a href="http://www.significancemagazine.org/details/review/1062663/The-Theory-That-Would-Not-Die-by-Sharon-Bertsch-McGrayne.html"&gt;review in Significance magazine&lt;/a&gt; describes it thus: "At times reading like a historical account, at times like investigative journalism, at yet other times like a statistical commentary."&lt;br /&gt;&lt;br /&gt;Other reviews:&amp;nbsp; &lt;a href="http://www.nytimes.com/2011/08/07/books/review/the-theory-that-would-not-die-by-sharon-bertsch-mcgrayne-book-review.html?_r=1&amp;amp;pagewanted=all"&gt;New York Times Sunday Book Review&lt;/a&gt;, &lt;a href="http://articles.boston.com/2011-06-05/ae/29686089_1_spam-21st-century-common-sense"&gt;Boston Globe&lt;/a&gt;, and &lt;a href="http://www.nature.com/nature/journal/v475/n7357/full/475450a.html"&gt;Nature&lt;/a&gt; (subscription required for on-line access).&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-1845382549352152462?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/1845382549352152462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/08/bayes-book.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1845382549352152462'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1845382549352152462'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/08/bayes-book.html' title='Bayes book'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-863571199096450694</id><published>2011-06-30T13:18:00.000-07:00</published><updated>2011-07-15T22:52:48.354-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='academics'/><category scheme='http://www.blogger.com/atom/ns#' term='physics'/><title type='text'>Physics of baseball</title><content type='html'>In 1990, I read a great little book called &lt;em&gt;The Physics of Baseball&lt;/em&gt; by Robert K. Adair (now in its third edition).&amp;nbsp; It's strongly recommended to anyone interested in the subject matter.&amp;nbsp; And here's &lt;a href="http://www.popularmechanics.com/science/4256812"&gt;a short Q&amp;amp;A with Adair&lt;/a&gt; at &lt;em&gt;Popular Mechanics&lt;/em&gt; from a couple of years ago.&lt;br /&gt;&lt;br /&gt;But a few other items of note on this subject have popped up recently.&lt;br /&gt;&lt;br /&gt;First, a &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/launch_angle_speed_off_the_bat_trajectory/"&gt;great chart at The Book&lt;/a&gt;, with the speed of the ball off the bat as the X axis and the angle of launch as Y, showing the outcome (from ground ball to home run) at the points X,Y.&lt;br /&gt;&lt;br /&gt;Then, there's an article in the latest issue of the &lt;em&gt;American Journal of Physics&lt;/em&gt; by Faber, Smith, Nathan, and Russell called "&lt;a href="http://www.kettering.edu/physics/drussell/bats-new/Papers/CheatingPaper.pdf"&gt;Corked bats, juiced balls, and humidors: The physics of cheating&lt;/a&gt;".&amp;nbsp;&amp;nbsp;You can also find&amp;nbsp;&lt;a href="http://www.smithsonianmag.com/science-nature/The-Physics-of-Cheating-in-Baseball.html"&gt;a short summary of the article at Smithsonian.com&lt;/a&gt;.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;There are three questions asked, each 
with a nuanced answer.&lt;br /&gt;&lt;br /&gt;1. Question: "Can a baseball be hit&amp;nbsp;farther with a corked bat?"&lt;br /&gt;Answer: "...&amp;nbsp;while corking may not allow a batter to hit the ball farther, it may well allow a batter to hit the ball solidly more often."&lt;br /&gt;&lt;br /&gt;2. Question: "Is the baseball juiced?"&lt;br /&gt;Answer: The researchers "found no evidence that baseballs of today are more or less lively than baseballs used in the late 1970s."&lt;br /&gt;&lt;br /&gt;3. Question: "What's the deal with the humidor?" (Or, "is it plausible that the humidor accounts for the decrease in offensive statistics at Coors Field since 2002?")&lt;br /&gt;Answer: Yes.&lt;br /&gt;&lt;br /&gt;For those interested in a deeper dive into this topic, one of the co-authors of the study,&amp;nbsp;Alan Nathan, has a &lt;a href="http://webusers.npl.illinois.edu/~a-nathan/pob/"&gt;page dedicated to the physics of baseball&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Update 2011-07-13: &amp;nbsp;Tango at The Book posted &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/physics_of_bats_and_balls/"&gt;a link to the Smithsonian article&lt;/a&gt;, and there has been plenty of commentary including a number of responses from Alan Nathan.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-863571199096450694?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/863571199096450694/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/06/physics-of-baseball.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/863571199096450694'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/863571199096450694'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/06/physics-of-baseball.html' title='Physics of baseball'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5350976388522788726</id><published>2011-06-01T20:25:00.000-07:00</published><updated>2011-06-01T20:26:36.098-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='regression toward the mean'/><category scheme='http://www.blogger.com/atom/ns#' term='Bayesian'/><category scheme='http://www.blogger.com/atom/ns#' term='Minnesota Twins'/><title type='text'>Two months in, a Bayesian look at the standings</title><content type='html'>At the end of April, I posted &lt;a href="http://bayesball.blogspot.com/2011/05/early-season-standings-and-bayes.html"&gt;"Early season standings and Bayes"&lt;/a&gt; that took two different approaches to regressing the early season standings to come up with a prediction for the eventual result for the full season.&lt;br /&gt;&lt;br /&gt;So here we are at the end of May, and there's been a lot of movement in the standings, so here's &lt;a href="https://spreadsheets.google.com/spreadsheet/ccc?key=0Art4wpcrwqkBdHdTUDdqeUJWbUZENW9xRVVDZVZFc0E&amp;hl=en_US#gid=0"&gt;an update to the spreadsheet&lt;/a&gt;.  Although the Phillies and the Indians remain at the top of the standings, they are starting to regress downwards.  At the end of April, both teams were "on pace" to win 112 games in the season, but the regression showed a more modest result of 93 wins.  
A month later both teams are "on pace" to win 100, but the Bayesian approach suggests that they are more likely to win 91 games.&lt;br /&gt;&lt;br /&gt;If you are a Twins or an Astros fan, there is no solace in the fact that both teams have not regressed toward the mean over the past month, but have instead continued to play at roughly the same level they exhibited in April.  The regression model now predicts the Twins will end up at 65 wins, which would be the lowest in MLB.  Of course, this prediction is based only on the team's performance to date -- it doesn't consider &lt;a href="http://minnesporta.wordpress.com/2011/05/30/twins-injury-updates/"&gt;the number of injuries the Twins are currently dealing with&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5350976388522788726?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5350976388522788726/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/06/two-months-in-bayesian-look-at.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5350976388522788726'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5350976388522788726'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/06/two-months-in-bayesian-look-at.html' title='Two months in, a Bayesian look at the standings'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-4692853596966522221</id><published>2011-05-29T09:23:00.000-07:00</published><updated>2011-05-29T09:28:18.460-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><category scheme='http://www.blogger.com/atom/ns#' term='random'/><category scheme='http://www.blogger.com/atom/ns#' term='XKCD'/><category scheme='http://www.blogger.com/atom/ns#' term='sportcaster'/><title type='text'>Ouch</title><content type='html'>&lt;a href="http://xkcd.com/904/"&gt;XKCD on sports&lt;/a&gt;. &amp;nbsp;Is there a way to test (in a quantitative manner) the hypothesis that baseball is the worst offender?&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://imgs.xkcd.com/comics/sports.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://imgs.xkcd.com/comics/sports.png" width="240" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-4692853596966522221?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/4692853596966522221/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/05/ouch.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4692853596966522221'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4692853596966522221'/><link rel='alternate' type='text/html' 
href='http://bayesball.blogspot.com/2011/05/ouch.html' title='Ouch'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-1973677540524557128</id><published>2011-05-06T13:15:00.000-07:00</published><updated>2011-05-06T15:45:05.142-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='real life'/><category scheme='http://www.blogger.com/atom/ns#' term='academics'/><category scheme='http://www.blogger.com/atom/ns#' term='Derek Jeter'/><category scheme='http://www.blogger.com/atom/ns#' term='sabermetrics'/><title type='text'>When labour market research goes to the ballpark</title><content type='html'>In a recently issued paper called "&lt;a href="http://harrisschool.uchicago.edu/programs/beyond/workshops/ppepapers/fall2010_feldman.pdf"&gt;Productivity, Wages, and Marriage: The Case of Major League Baseball&lt;/a&gt;", economists Francesca Cornaglia and Naomi E. Feldman examine the "marriage premium" -- the fact that controlling for other influencing factors, married men earn more than unmarried men. In most situations, a variety of confounding variables muddy the waters -- things like geographic location, differences across occupations, and poor productivity measures. Cornaglia and Feldman innovatively use information from MLB to control for those variables. 
&lt;br /&gt;&lt;br /&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-9gUGJvQNNeM/TcRWWonjI9I/AAAAAAAAAEk/JbJbZgpdGK8/s1600/1993-topps-98-derek-jeter-rc.jpg" imageanchor="1" style="clear: right; cssfloat: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="320px" j8="true" src="http://2.bp.blogspot.com/-9gUGJvQNNeM/TcRWWonjI9I/AAAAAAAAAEk/JbJbZgpdGK8/s320/1993-topps-98-derek-jeter-rc.jpg" width="230px" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Derek Jeter, the exception that proves the rule.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;The abstract:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Using a sample of professional baseball players from 1871 - 2007, this paper aims at analyzing a longstanding empirical observation that married men earn significantly more than their single counterparts holding all else equal. There are numerous conflicting explanations, some of which reflect subtle sample selection problems (that is, men who tend to be successful in the workplace or have high potential wage growth also tend to be successful in attracting a spouse) and some of which are causal (that is, marriage does indeed increase productivity for men). Baseball is a unique case study because it has a long history of statistics collection and numerous direct measurements of productivity. Our results show that the marriage premium also holds for baseball players, where married players earn up to 20% more than those who are not married, even after controlling for selection. 
The results are generally robust only for players in the top third of the ability distribution and post 1975 when changes in the rules that govern wage contracts allowed for players to be valued closer to their true market price. Nonetheless, there do not appear to be clear differences in productivity between married and nonmarried players. We discuss possible reasons why employers may discriminate in favor of married men.&lt;/blockquote&gt;&lt;br /&gt;You can hear Dr. &lt;span style="font-family: URWBookmanL-Ligh;"&gt;Cornaglia discuss the research&amp;nbsp;on the BBC programme&lt;/span&gt; &lt;a href="http://www.bbc.co.uk/programmes/b010mwbt"&gt;&lt;i&gt;More or Less&lt;/i&gt; (2011-04-29)&lt;/a&gt;, starting at roughly 10'50".&lt;br /&gt;&lt;br /&gt;My initial reaction regards neither the findings nor the methodology, but the fact that other than a mention of &lt;a href="http://baseball1.com/statistics/"&gt;the Lahman database&lt;/a&gt;, the list of references does not include any of the work of the sabermetric research community. At one point in the discussion of productivity measures the authors write "Most modern-day baseball enthusiasts and commentators consider the latter two statistics [&lt;a href="http://en.wikipedia.org/wiki/On-base_plus_slugging"&gt;OPS&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Equivalent_average"&gt;EqA&lt;/a&gt;] to be the most accurate measures of a player’s productivity", but the authors neither refer to any authority to support that statement nor discuss the&amp;nbsp;fact&amp;nbsp;that &lt;a href="http://www.insidethebook.com/ee/index.php/site/article/why_is_eqa_so_complicated/"&gt;others have critiqued those measures&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;This is not the first time that academics have utilized the contributions of the sabermetric community in supporting their research (in this case, it provides a vital element in the foundation of the productivity measure) but then failed to acknowledge that work. 
For a well-reasoned discussion of &lt;em&gt;that&lt;/em&gt; topic, please read Phil Birnbaum's &lt;a href="http://sabermetricresearch.blogspot.com/2010/01/chopped-liver-ii.html"&gt;"Chopped liver II"&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-1973677540524557128?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/1973677540524557128/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/05/when-labour-market-research-goes-to.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1973677540524557128'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1973677540524557128'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/05/when-labour-market-research-goes-to.html' title='When labour market research goes to the ballpark'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-9gUGJvQNNeM/TcRWWonjI9I/AAAAAAAAAEk/JbJbZgpdGK8/s72-c/1993-topps-98-derek-jeter-rc.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5959934006153171743</id><published>2011-05-01T11:56:00.000-07:00</published><updated>2011-05-01T11:56:43.179-07:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='regression toward the mean'/><category scheme='http://www.blogger.com/atom/ns#' term='Bayesian'/><category scheme='http://www.blogger.com/atom/ns#' term='Boston Red Sox'/><title type='text'>Early season standings and Bayes</title><content type='html'>Early season performance has been a hot topic this year (not that it isn't a topic of discussion every year). &amp;nbsp;&lt;a href="http://bayesball.blogspot.com/2011/04/on-pace-for-162-win-season.html"&gt;I wrote about it&lt;/a&gt;, using a simple approach of assuming that every team is .500, and a more recent addition in the blogosphere is &lt;a href="http://mlb.sbnation.com/2011/4/27/2137412/april-standings-mean-more-than-you-might-think"&gt;Rob Neyer's take&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Last week &lt;a href="http://www.3-dbaseball.net/2011/04/bayes-regression-and-red-sox-but-mostly.html"&gt;Kincaid over at 3-D baseball had a great post&lt;/a&gt; that used Boston's 2-10 start to go down a detailed and more sophisticated Bayesian path to estimating the team's true talent. &amp;nbsp;Tango posted a &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/bayes_and_regression_again/"&gt;link to Kincaid's blog&lt;/a&gt;, and added a few details that incorporate actual observations. &amp;nbsp;A key element in this is that the observed spread of talent is wider than the theoretical .500 level of all teams. &amp;nbsp;(If all teams were .500, the random component would result in a standard deviation of 0.039. 
In reality, the standard deviation is wider, at 0.071 -- the implication of this is that there are real talent differences between teams, with some teams having a true talent level above .500 and others below.)&lt;br /&gt;&lt;br /&gt;My modest contribution to this thread is here: &lt;a href="https://spreadsheets.google.com/ccc?key=0Art4wpcrwqkBdGdyTkd1Wm9mWXF1THJFdzFvNWFzY1E&amp;amp;hl=en"&gt;&amp;nbsp;a Google doc spreadsheet&lt;/a&gt; that shows every MLB team's current record (as of 2011-04-30), and then applies two different Bayesian-based methods to predict each team's final season outcome.&lt;br /&gt;&lt;br /&gt;The first set is the yellow columns, which replicate Kincaid's "shortcut" approach, with the implied regression of 69 games noted by Tango. The blue columns take a different approach that uses the standard deviation of both the observed performance to date and the long-term observations (every MLB team season outcome from 1961-2010) as the prior.&lt;br /&gt;&lt;br /&gt;The difference in the results generated by these two approaches is relatively modest. &amp;nbsp;(It's worth noting that the relative position in the standings does not change.) What is apparent is that with roughly 25 games played this season, there are solid differences appearing in the team performances. &amp;nbsp;This method forecasts that Cleveland and Philadelphia will regress downward from .692 (a 112 win season) to .573 and end up with 93 wins. At the bottom of the table, it suggests that the Twins will improve from .346 (58 wins on the season) to .444 and a much more respectable 74 wins over the course of the season. 
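&lt;br /&gt;&lt;br /&gt;For readers who want to try the arithmetic themselves, here is a minimal Python sketch of the "add some games of .500 play" style of regression described above. It is an illustration only, not the formulas used in the spreadsheet: the function names are made up for this example, the 18-8 record is an assumption chosen because it matches the .692 quoted above, and the results should not be expected to reproduce the spreadsheet's columns exactly.&lt;br /&gt;
&lt;pre&gt;
```python
import math

GAMES_IN_SEASON = 162  # length of an MLB regular season


def luck_sd(games: int, win_prob: float = 0.5) -> float:
    """Standard deviation of winning percentage produced by chance alone."""
    return math.sqrt(win_prob * (1.0 - win_prob) / games)


def talent_sd(observed_sd: float, games: int) -> float:
    """Spread of true talent once the luck component is removed from the observed spread."""
    return math.sqrt(observed_sd ** 2 - luck_sd(games) ** 2)


def regressed_pct(wins: int, losses: int, regression_games: float) -> float:
    """Shrink a record toward .500 by padding it with regression_games of .500 play."""
    return (wins + 0.5 * regression_games) / (wins + losses + regression_games)


def projected_wins(wins: int, losses: int, regression_games: float,
                   season_games: int = GAMES_IN_SEASON) -> float:
    """Wins already banked plus the regressed rate applied to the remaining schedule."""
    remaining = season_games - (wins + losses)
    return wins + regressed_pct(wins, losses, regression_games) * remaining


if __name__ == "__main__":
    print(round(luck_sd(GAMES_IN_SEASON), 3))  # the 0.039 quoted in the post
    print(round(talent_sd(0.071, GAMES_IN_SEASON), 3))
    # Assumed example record: 18-8, consistent with the .692 quoted above
    print(round(regressed_pct(18, 8, 69), 3))
    print(round(projected_wins(18, 8, 69), 1))
```
&lt;/pre&gt;
The design choice is that the number of regression games acts like a prior: the more games of .500 play added, the harder an early record is pulled back toward the mean, and a team that is exactly .500 is left unchanged.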
&lt;br /&gt;&lt;br /&gt;-30-&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5959934006153171743?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5959934006153171743/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/05/early-season-standings-and-bayes.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5959934006153171743'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5959934006153171743'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/05/early-season-standings-and-bayes.html' title='Early season standings and Bayes'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-2219146433457541517</id><published>2011-04-28T12:14:00.000-07:00</published><updated>2011-04-28T12:14:37.567-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='football'/><category scheme='http://www.blogger.com/atom/ns#' term='correlation'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><title type='text'>Words with meaning</title><content type='html'>Article at Slate entitled "&lt;a 
href="http://www.slate.com/id/2292312/"&gt;Turning words into touchdowns&lt;/a&gt;", on the work of &lt;a href="http://www.achievementmetrics.com/"&gt;Achievement Metrics&lt;/a&gt;. This company takes interviews from various players and parses them, and then correlates the pattern with on- and off-field performance.&amp;nbsp; From AM's website:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Our analyses of players’ speech, arrest, and suspension data have shown that differences in players’ speech while in college can predict which players are more likely to exhibit off- and on-field behavioral problems during their professional careers.&lt;/blockquote&gt;&lt;br /&gt;I'm not sure whether this would work for baseball, since the players&amp;nbsp;just &lt;a href="http://www.youtube.com/watch?v=KeVca9MwDX8"&gt;speak in clich&lt;span style="font-family: 'Book Antiqua','serif'; font-size: 11pt; line-height: 115%; mso-ansi-language: EN-CA; mso-bidi-font-family: 'Times New Roman'; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin;"&gt;é&lt;/span&gt;s&lt;/a&gt;. 
(Warning: clip includes profanities.)&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-2219146433457541517?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/2219146433457541517/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/words-with-meaning.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2219146433457541517'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2219146433457541517'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/words-with-meaning.html' title='Words with meaning'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5755091415778131427</id><published>2011-04-19T19:01:00.000-07:00</published><updated>2011-04-20T06:12:13.751-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='Strat-o-matic'/><title type='text'>Baseball fans are crazy</title><content type='html'>Or so says &lt;a href="http://tsutpen.blogspot.com/2011/04/art-of-american-fantasy-48.html"&gt;this ad&lt;/a&gt; for &lt;a href="http://www.strat-o-matic.com/products/baseball"&gt;Strat-o-matic&lt;/a&gt;, posted at &lt;a 
href="http://tsutpen.blogspot.com/"&gt;&lt;i&gt;If Charlie Parker was a Gunslinger...&lt;/i&gt;&lt;/a&gt;&lt;br /&gt;&amp;nbsp;(Note: the main &lt;i&gt;Charlie Parker&lt;/i&gt; page is not appropriate for at-work viewing.)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://img.photobucket.com/albums/v280/tomasutpen/0411/americanfantasy.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://img.photobucket.com/albums/v280/tomasutpen/0411/americanfantasy.jpg" width="205" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;(Click the image to enlarge.)&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5755091415778131427?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5755091415778131427/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/baseball-fans-are-crazy.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5755091415778131427'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5755091415778131427'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/baseball-fans-are-crazy.html' title='Baseball fans are crazy'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-53721258153903383</id><published>2011-04-12T08:43:00.000-07:00</published><updated>2011-04-12T08:48:54.248-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='win expectancy'/><category scheme='http://www.blogger.com/atom/ns#' term='Seattle Mariners'/><title type='text'>Kicking at the darkness</title><content type='html'>Last night (2011-04-11) the Seattle Mariners pulled off a preposterous comeback, defeating the Blue Jays 8-7 after trailing 0-7 heading into the seventh inning. Other teams have had comebacks from being down by 7 runs, and pulled off comebacks in bigger games. But as &lt;a href="http://www.lookoutlanding.com/2011/4/12/2105773/luis-rodriguez-seattle-mariners-toronto-blue-jays"&gt;Rob Neyer has pointed out&lt;/a&gt;, what made this so unexpected and&amp;nbsp;so special was that the Mariners have been, in a word, hapless. The early part of&amp;nbsp;this game was the best/worst example of their struggles. &lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.fangraphs.com/boxscore.aspx?date=2011-04-11&amp;amp;team=Mariners&amp;amp;dh=0&amp;amp;season=2011"&gt;FanGraphs plot&lt;/a&gt; (chart below) follows what has become a disturbing Mariner trend this year -- the line quickly&amp;nbsp;plummets to the sub-10% &lt;a href="http://www.fangraphs.com/library/index.php/misc/we/"&gt;win expectancy&lt;/a&gt; range in the early innings, and&amp;nbsp;slowly&amp;nbsp;drifts towards zero from&amp;nbsp;there. (Check out the games vs. 
Cleveland &lt;a href="http://www.fangraphs.com/wins.aspx?date=2011-04-10&amp;amp;team=Mariners&amp;amp;dh=0&amp;amp;season=2011#"&gt;the day before&lt;/a&gt; and &lt;a href="http://www.fangraphs.com/wins.aspx?date=2011-04-08&amp;amp;team=Mariners&amp;amp;dh=0&amp;amp;season=2011#"&gt;the home opener on 2011-04-08&lt;/a&gt;&amp;nbsp;for recent examples).&amp;nbsp; This time, after bottoming out at 0.3% when Luis&amp;nbsp;Rodriguez&amp;nbsp;(the game's eventual hero) struck out to lead off the Mariner half of the seventh inning, the WE&amp;nbsp;line zigzagged its way to the other end of the scale.&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-ne2U8z4dTKM/TaRzjZthxCI/AAAAAAAAAEg/ZVVEX-XmzLk/s1600/20110411_BlueJays_Mariners_0.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="203" r6="true" src="http://4.bp.blogspot.com/-ne2U8z4dTKM/TaRzjZthxCI/AAAAAAAAAEg/ZVVEX-XmzLk/s320/20110411_BlueJays_Mariners_0.png" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Blue Jays @ Mariners, 2011-04-11 (source: FanGraphs)&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;For the Mariners, a second consecutive 100 loss season (which would be the third in&amp;nbsp;four seasons) is not at all out of the question. But for the fans who stuck with it last night, this was one for the ages.&amp;nbsp; Or the &lt;a href="http://www.ussmariner.com/2011/04/11/game-ten-recap-2/"&gt;U.S.S. Mariner game summary&lt;/a&gt;, quoted here in its entirety: "That was horrible, then awesome. Baseball is fun."&lt;br /&gt;&lt;br /&gt;The title I used for this entry&amp;nbsp;is a reference to Bruce Cockburn's song "Lovers in a Dangerous Time". 
In the article linked above, Neyer wrote "It was somebody smart, or maybe an episode of &lt;em&gt;Scrubs&lt;/em&gt;, that said nothing worth having comes easy." The song contains the line "Nothing worth having comes without some kind of fight/Got to kick at the darkness until it bleeds daylight".&amp;nbsp;In the late innings of last night's game, the Mariners&amp;nbsp;showed some kick.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-53721258153903383?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/53721258153903383/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/kicking-at-darkness.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/53721258153903383'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/53721258153903383'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/kicking-at-darkness.html' title='Kicking at the darkness'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-ne2U8z4dTKM/TaRzjZthxCI/AAAAAAAAAEg/ZVVEX-XmzLk/s72-c/20110411_BlueJays_Mariners_0.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-4929761412637656994</id><published>2011-04-11T14:06:00.001-07:00</published><updated>2011-04-11T14:06:43.732-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='real life'/><category scheme='http://www.blogger.com/atom/ns#' term='regression toward the mean'/><title type='text'>Social mobility toward the mean</title><content type='html'>The &lt;a href="http://www.bbc.co.uk/programmes/b0100j90"&gt;April 8, 2011&amp;nbsp;edition of the BBC radio program &lt;em&gt;More or Less&lt;/em&gt;&lt;/a&gt;* includes a discussion of regression toward the mean in the context of social mobility stats in the U.K.&amp;nbsp;&amp;nbsp;Most of the analysis has focussed on the impact social class has on long-term education outcomes. In particular, much has been made of the fact that the analysis suggests that the low ability children from high social class catch up and pass the high&amp;nbsp;ability children from low social class.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-3fUXFRRbqO8/TaNQD-KcCrI/AAAAAAAAAEc/SeqSTOLhPdw/s1600/bbc_social+mobility.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="241" r6="true" src="http://2.bp.blogspot.com/-3fUXFRRbqO8/TaNQD-KcCrI/AAAAAAAAAEc/SeqSTOLhPdw/s320/bbc_social+mobility.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;But in the broadcast Daniel Read, professor at Warwick Business School, has offered &lt;a href="http://www.alphagalileo.org/ViewItem.aspx?ItemId=100508&amp;amp;CultureCode=en"&gt;a critique&lt;/a&gt; (link to written version) that points out that the analysis has not accounted for&amp;nbsp;"one of the oldest statistical problems of all" (the&amp;nbsp;BBC's&amp;nbsp;description): regression toward the mean. 
The source of the problem is correctly identified as the bias introduced by including only the highest and lowest performers in the groups shown in the chart. The children closest to the mean for each social class have been excluded.&lt;br /&gt;&lt;br /&gt;Because only the extreme ends of the&amp;nbsp;education outcomes tests&amp;nbsp;of the two social class groups have been selected, the poorest performers&amp;nbsp;naturally show improvements while the highest performers show&amp;nbsp;declines. From the broadcast:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-family: inherit;"&gt;It's not that it's [the differences in outcomes&amp;nbsp;between social classes]&amp;nbsp;all fluke. But if there's any element of&amp;nbsp;luck at all -- which there&amp;nbsp;surely&amp;nbsp;is, because we're talking about ability tests for toddlers&amp;nbsp;-- then we have to allow for what we'd expect to happen when that luck fails to last.&amp;nbsp; And what we'd expect to happen is pretty much what the graph in the&amp;nbsp;government's social mobility strategy shows, which is that the next time you test the children&amp;nbsp;&lt;em&gt;all&lt;/em&gt; the high performers have dropped off. But especially the poorer kids who, remember, Nick Clegg says were disadvantaged from birth. And all the lower performers have caught up, but especially the richer kids. And then as you continue to test, the richer kids gain on the poorer kids at a very much less dramatic pace.&lt;/span&gt;&lt;/blockquote&gt;The easiest way to spot the regression toward the mean? The enormous change from the first to the second measurement, as much of the selection bias at the first measurement point disappears. The high and low performers were selected not on the basis of their long-term outcomes, but on the results of the&amp;nbsp;first test. 
In subsequent tests the children in these extreme cases&amp;nbsp;will move toward the mean, and closer to their "true talent".&lt;br /&gt;&lt;br /&gt;Accounting for regression toward the mean&amp;nbsp;does not mean that social class doesn't have&amp;nbsp;a relationship with&amp;nbsp;education outcomes. But accounting for the regression toward the mean would moderate the magnitude of the difference between the two social classes.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;*The linked page has a text&amp;nbsp;summary of the program, a copy of the chart in question, streaming audio of the program,&amp;nbsp;and links to&amp;nbsp;the podcast and supporting documents. The&amp;nbsp;item begins at roughly&amp;nbsp;17' 25" of the podcast.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-4929761412637656994?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/4929761412637656994/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/social-mobility-toward-mean.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4929761412637656994'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4929761412637656994'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/social-mobility-toward-mean.html' title='Social mobility toward the mean'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-3fUXFRRbqO8/TaNQD-KcCrI/AAAAAAAAAEc/SeqSTOLhPdw/s72-c/bbc_social+mobility.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-6061774547773621254</id><published>2011-04-10T09:56:00.000-07:00</published><updated>2011-04-10T10:59:24.493-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='small sample size'/><category scheme='http://www.blogger.com/atom/ns#' term='good math-bad statistics'/><title type='text'>Meaningless numbers</title><content type='html'>There's been plenty of chatter on the sabermetric blogs lately about the meaningless stats bandied about by broadcasters during the early stages of the season. &amp;nbsp;The best way I've seen the validity of these numbers debunked is on &lt;a href="http://www.lookoutlanding.com/2011/4/9/2101675/seattle-mariners-cleveland-indians"&gt;Jeff Sullivan's post on the Indians-Mariners game&lt;/a&gt;&amp;nbsp;at Lookout Landing (on SB Nation):&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif; font-size: 13px; line-height: 17px;"&gt;In the bottom of the first, the broadcast flashed a&amp;nbsp;&lt;a class="sbn-auto-link" href="http://www.sbnation.com/mlb/players/33392/justin-masterson" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; background-position: initial initial; background-repeat: initial initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #0d3b77; font-size: 13px; font-weight: bold; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; 
margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none; vertical-align: baseline;"&gt;Justin Masterson&lt;/a&gt;&amp;nbsp;[the Indians' starting pitcher]&amp;nbsp;stat graphic showing his lefty/righty splits on the season. After one game. The only thing I wish is that they would've shown his home/road splits instead.&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-6061774547773621254?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/6061774547773621254/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/meaningless-numbers.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6061774547773621254'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6061774547773621254'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/meaningless-numbers.html' title='Meaningless numbers'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-4719383551649191358</id><published>2011-04-08T11:03:00.000-07:00</published><updated>2011-04-08T11:52:51.585-07:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='probability'/><title type='text'>On pace for a 162 loss season!</title><content type='html'>&lt;a href="http://bayesball.blogspot.com/2011/04/on-pace-for-162-win-season.html"&gt;Two days ago I responded to an on-line article&lt;/a&gt; about the Orioles' 4-0 start to the season, pointing out that it's not at all surprising -- using the laws of binomial probability -- to see three of the 30 MLB teams at 4-0 to start the season. &lt;br /&gt;&lt;br /&gt;What I didn't mention were the teams that started the season on a losing streak. And now, heading into this weekend's play, it's the 0-6 Red Sox and Rays that are getting the attention of the punditocracy. &amp;nbsp;For the Red Sox, it's the poorest start since the 1945 season, and the Rays haven't ever gone 0-6 to start the season in their comparatively short history.&lt;br /&gt;&lt;br /&gt;Some writers&amp;nbsp;have acknowledged the probability and the history of being 0-6: &amp;nbsp;Dave Cameron at FanGraphs writes &lt;a href="http://www.fangraphs.com/blogs/index.php/is-it-time-to-panic-in-boston/"&gt;"Is it time to panic in Boston?"&lt;/a&gt;, and Cliff Corcoran's piece at S.I. is &lt;a href="http://sportsillustrated.cnn.com/2011/writers/cliff_corcoran/04/06/bad.starts/index.html"&gt;"It's still early, but history is against winless Red Sox, Rays and Astros"&lt;/a&gt; (which was&amp;nbsp;written before yesterday's games, when the teams were 0-5).&lt;br /&gt;&lt;div&gt;&lt;br /&gt;An entirely different view can be found&amp;nbsp;at Baseball Prospectus, where &lt;a href="http://www.baseballprospectus.com/article.php?articleid=13499"&gt;Steven Goldman&lt;/a&gt; uses the 1987 Brewers, who went 13-0 and then 20-3 (Goldman writes, "on pace for a 141 win season")&amp;nbsp;before hitting a 12 game losing streak, and the opening sequence of Tom Stoppard's&amp;nbsp;existentialist&amp;nbsp;play &lt;i&gt;&lt;a 
href="http://en.wikipedia.org/wiki/Rosencrantz_and_Guildenstern_Are_Dead"&gt;Rosencrantz And Guildenstern Are Dead&lt;/a&gt;&lt;/i&gt; to suggest that sometimes things operate&amp;nbsp;outside the laws of&amp;nbsp;probability.&lt;br /&gt;&lt;br /&gt;(&lt;a href="http://www.youtube.com/watch?v=RjOqaD5tWB0"&gt;Watch the scene in question&lt;/a&gt;, from the 1990 film with Gary Oldman and Tim Roth as the title characters.)&lt;br /&gt;&lt;br /&gt;In the play, the characters are faced with a preposterously long string of coin-flips that land heads. This leads Guildenstern to&amp;nbsp;say "A weaker man might be moved to re-examine his faith, if in nothing else at least in the law of probability." &lt;br /&gt;&lt;br /&gt;Even though&amp;nbsp;the two&amp;nbsp;teams are 0-6, the weakness of my faith in the laws of probability is not yet tested.&lt;br /&gt;&lt;br /&gt;-30-&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-4719383551649191358?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/4719383551649191358/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/on-pace-for-162-loss-season.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4719383551649191358'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4719383551649191358'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/on-pace-for-162-loss-season.html' title='On pace for a 162 loss season!'/><author><name>Martin 
Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-8048890674790518154</id><published>2011-04-07T17:31:00.000-07:00</published><updated>2011-04-07T17:31:22.854-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><category scheme='http://www.blogger.com/atom/ns#' term='prediction'/><category scheme='http://www.blogger.com/atom/ns#' term='Andrew Gelman'/><category scheme='http://www.blogger.com/atom/ns#' term='politics'/><title type='text'>Gelman on baseball</title><content type='html'>Andrew Gelman has published a few blog articles lately that hit on baseball.&lt;br /&gt;&lt;br /&gt;First up, &lt;a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2011/04/bill_james_and.html"&gt;"Bill James and the base-rate fallacy"&lt;/a&gt;, where he points out a flaw in James' reasoning that arises from the&amp;nbsp;"availability heuristic".&lt;br /&gt;&lt;br /&gt;Second, at &lt;a href="http://statisticsforum.wordpress.com/"&gt;The Statistics Forum&lt;/a&gt;, a comparison of predicting future performance at a significant transition point in &lt;a href="http://statisticsforum.wordpress.com/2011/04/06/minor-league-stats-predict-major-league-performance-sarah-palin-and-some-differences-between-baseball-and-politics/"&gt;"Minor-league Stats Predict Major League Performance, Sarah Palin, and Some Differences Between Baseball and Politics"&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I don't have anything to add, other than to say it's encouraging to see one of the best statistical thinkers in the academy using baseball as a point of reference.&lt;br /&gt;&lt;br /&gt;-30-&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-8048890674790518154?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/8048890674790518154/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/gelman-on-baseball.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/8048890674790518154'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/8048890674790518154'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/gelman-on-baseball.html' title='Gelman on baseball'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-106134032994413762</id><published>2011-04-06T17:28:00.000-07:00</published><updated>2011-05-29T09:24:54.498-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Baltimore Orioles'/><category scheme='http://www.blogger.com/atom/ns#' term='wins'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='log5'/><category scheme='http://www.blogger.com/atom/ns#' term='XKCD'/><title type='text'>On pace for a 162 win season!</title><content type='html'>Only at the beginning of the season would a&amp;nbsp;four-game winning streak get you &lt;a 
href="http://sportsillustrated.cnn.com/2011/writers/jon_heyman/04/06/orioles.macphail/index.html?xid=cnnbin&amp;amp;hpt=Sbin"&gt;an article on S.I.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The probability of&amp;nbsp;a true .500 team&amp;nbsp;playing four games against&amp;nbsp;other true&amp;nbsp;.500 teams and winning them all is 6.25% -- or roughly 2 out of 30. In other words, at this point in the season we could realistically expect&amp;nbsp;two teams to be at 4-0.&amp;nbsp; Once we consider that in real life teams are not so&amp;nbsp;perfectly matched, thus raising the odds of the more powerful team being successful in all four games,&amp;nbsp;the fact that there are three teams with 4-0 records (Orioles, Reds, and Rangers)&amp;nbsp;isn't a surprise.&lt;br /&gt;&lt;br /&gt;The only surprise in all of this is that the Orioles (widely touted to finish last in the A.L. East) swept the&amp;nbsp;Rays in a three game series in Tampa Bay and then went on to beat the Tigers at home in their fourth game of the season. But then again, the probability of a .400 team going 4-0 against a .500* team is just under 3%.&amp;nbsp; Not very good odds, but something we would expect to see on occasion.&lt;br /&gt;&lt;br /&gt;(Shout out to Tango, who raised the "&lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/on_pace_to/"&gt;on pace&lt;/a&gt;" problem a couple of days ago, and The Book readers who added various lucid and perceptive comments, including an &lt;a href="http://xkcd.com/605/"&gt;XKCD&lt;/a&gt; cartoon. You can never go wrong using an XKCD cartoon to illustrate your point.)&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Update: the New Utosky Bolshevik Show takes the Red Sox 0-3 start as its jumping off point for a post titled &lt;a href="http://blog.daniel-watkins.co.uk/red-sox-arent-doomed"&gt;The Red Sox Aren't Doomed&lt;/a&gt;, demonstrating the same thing I did but with graphs and Python script. 
&amp;nbsp;Score one for the NUBS.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;* Changed from ".600".&amp;nbsp; Comment #1 below was generated because of this typo; #2 is my detailed response.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-106134032994413762?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/106134032994413762/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/04/on-pace-for-162-win-season.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/106134032994413762'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/106134032994413762'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/04/on-pace-for-162-win-season.html' title='On pace for a 162 win season!'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-1280738204874144660</id><published>2011-03-31T21:43:00.000-07:00</published><updated>2011-04-06T17:29:39.963-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='talent distribution'/><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><category scheme='http://www.blogger.com/atom/ns#' term='Tom Wilhelmsen'/><title type='text'>Developing talent</title><content 
type='html'>&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;Today (Opening Day 2011), an excerpt from Bill James' forthcoming book&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&lt;span id="btAsinTitle"&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;i&gt;Solid Fool's Gold: Detours on the Way to Conventional Wisdom&lt;/i&gt;&amp;nbsp;appeared on the Slate site. The article is titled &lt;a href="http://www.blogger.com/goog_906644461"&gt;"&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;a href="http://www.blogger.com/goog_906644461"&gt;Shakespeare and Verlander:&amp;nbsp;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-weight: normal;"&gt;&lt;a href="http://www.slate.com/id/2289380"&gt;Why are we so good at developing athletes and so lousy at developing writers?"&lt;/a&gt;, and in it he provides some profound insights into discrimination in sports compared to the rest of society.&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;But along the way to that point, James&amp;nbsp;takes a shot at&amp;nbsp;the conventional wisdom that expansion dilutes the talent pool. 
James' contrary view is that expansion creates a short-term dilution, but over the long term more talent develops to fill the increased demand.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-weight: normal;"&gt;The thesis is built on&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-weight: normal;"&gt;&amp;nbsp;James' assertion that raw talent is abundant, and simply needs the right opportunities -- incentives -- to be developed. In James' thought experiment, an expansion of MLB from 30 teams to 300 would over the long term have no impact on the level of talent, as talent development would expand to ensure the newly available opportunities were filled.&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;But can we really believe this? &amp;nbsp;There has been plenty of discussion elsewhere about the distribution of baseball talent (for example,&amp;nbsp;&lt;a href="http://www.sabernomics.com/sabernomics/index.php/2010/12/agreeing-and-disagreeing-with-bill-james/"&gt;Sabernomics&lt;/a&gt; and &lt;a href="http://tangotiger.net/talent.html"&gt;The Book&lt;/a&gt;), all of which would, at first glance, seem to run contrary to Bill James' argument. But those talent curves are drawn based on the current system of incentives, with enough room for 25 roster players on 30 MLB teams and roughly 9,000 players in pro ball in North America and a few more thousand around the world.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;Criticisms &amp;nbsp;of Bill James' essay will no doubt focus on the fact that expanding the number of MLB teams beyond 30 requires some of the non-roster players currently in the minors to move up to The Show ... 
they aren't good enough to play today, but in an expansion environment they would be.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;This might be true in the short term, but as Bill James argues, over the long haul the change in opportunities would shift, and talent would be developed to fill the new opportunities.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;Currently around the margins of professional baseball are &lt;a href="http://seattletimes.nwsource.com/html/mariners/2014631684_mari30.html"&gt;men who have given up baseball to work as a bartender&lt;/a&gt;, and those who have decided to pursue excellence in another sport. Players in both these groups would demonstrate different behaviour when provided a different set of incentives. &amp;nbsp;The shape of the distribution curve would not change, and the average player's performance would also be unchanged, but the absolute number of players would increase.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://mariners.sportspressnw.com/files/2011/02/IMG_4505.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="213" src="http://mariners.sportspressnw.com/files/2011/02/IMG_4505.jpg" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-size: small;"&gt;&lt;i&gt;Tom Wilhelmsen, former bartender, now pitching for the Seattle 
Mariners.&lt;/i&gt;&lt;/span&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;The latter group (the athletically gifted stars in other sports) would provide the increased numbers of players at the top end of the distribution curve, becoming the star players on Teams #31 through #300.&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;The bartenders of today would become the focus of rigorous development regimes. It's important to remember that not only would there be 10 times more opportunities at every level, but there would also be 10 times more teams trying to succeed, and 10 times more scouts, coaches, and others keen to see their players develop into stars. And this would be repeated around the world, ensuring that the best athletes are active in the sport that provides the greatest opportunities. &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;Given enough time, there would be enough players developed to stock 300 teams with no decrease in overall quality of play.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;There are examples of this in the past. 
One recent example is the growth of information technology occupations -- 40 years ago, very few individuals (both in terms of absolute numbers and as a percentage of the workforce) knew how to write a computer program. But with increased job opportunities and an expansion of training, people who might otherwise have chosen other occupations and career paths now can write computer programs. This does not mean the talent pool of computer programmers has been diluted; in fact, an argument could be made that the average talent and the high-end extreme of talent have increased.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;Another parallel is the availability of natural resources that lay unused until somebody found a use for them. Petroleum was known to exist for centuries, but wasn't a sought-after resource until the mid-nineteenth century when a method to distill kerosene was developed, making it a cheap alternative to whale oil. 
In a short period of time opportunities expanded, and as a result there was a rush to develop this previously ignored resource.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;-30-&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, sans-serif; font-size: x-small; font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-1280738204874144660?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/1280738204874144660/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/03/developing-talent.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1280738204874144660'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1280738204874144660'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/03/developing-talent.html' title='Developing talent'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-297528846080333381</id><published>2011-02-06T19:22:00.000-08:00</published><updated>2011-02-06T19:23:06.261-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='epistemology'/><category scheme='http://www.blogger.com/atom/ns#' term='Aleks Jakulin'/><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><category scheme='http://www.blogger.com/atom/ns#' term='Andrew Gelman'/><title type='text'>Modeling: insights from the pros</title><content type='html'>It's been a busy few weeks, so I've spent Super Bowl Sunday* catching up on the various blogs that I try to follow. A couple of posts from Andrew Gelman and Aleks Jakulin caught my eye: &lt;i&gt;&lt;a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2011/01/why_cant_i_be_m.html"&gt;Why can't I be more like Bill James, or, The use of default and default-like models&lt;/a&gt;&lt;/i&gt; and the two-part Model Makers' Hippocratic Oath (&lt;i&gt;&lt;a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2011/02/model_makers_hi.html"&gt;Part 1&lt;/a&gt;&lt;/i&gt; and &lt;i&gt;&lt;a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2011/02/an_addition_to.html"&gt;Part 2&lt;/a&gt;&lt;/i&gt;).&lt;br /&gt;&lt;br /&gt;All these posts are worth reading in their entirety, but they all boil down to the quote from &lt;a href="http://en.wikiquote.org/wiki/George_Box"&gt;George E.P. Box&lt;/a&gt;: "Essentially, all models are wrong, but some are useful." Knowing (or if you're the author, admitting) the limitations of the model is the most important step in understanding how useful a model might be. 
&lt;br /&gt;&lt;br /&gt;&lt;i&gt;*Pitchers and catchers start reporting for spring training one week today!&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-297528846080333381?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/297528846080333381/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/02/being-like-bill.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/297528846080333381'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/297528846080333381'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/02/being-like-bill.html' title='Modeling: insights from the pros'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5085648449615673341</id><published>2011-01-28T19:05:00.000-08:00</published><updated>2011-05-29T09:25:20.379-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><category scheme='http://www.blogger.com/atom/ns#' term='weighting'/><category scheme='http://www.blogger.com/atom/ns#' term='XKCD'/><title type='text'>The risks of adjusting performance stats</title><content type='html'>From &lt;a href="http://xkcd.com/852/"&gt;XKCD&lt;/a&gt;. 
&amp;nbsp;Not baseball, but it brings to mind park effect, league context, &lt;a href="http://www.baseball-reference.com/about/equiv_stats.shtml"&gt;etc&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://imgs.xkcd.com/comics/local_g.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://imgs.xkcd.com/comics/local_g.png" width="245" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5085648449615673341?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5085648449615673341/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/01/risks-of-adjusting-performance-stats.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5085648449615673341'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5085648449615673341'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/01/risks-of-adjusting-performance-stats.html' title='The risks of adjusting performance stats'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-2303061748698355971</id><published>2011-01-05T22:36:00.001-08:00</published><updated>2011-01-06T06:07:05.032-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='epistemology'/><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><category scheme='http://www.blogger.com/atom/ns#' term='Andrew Gelman'/><title type='text'>Andrew Gelman's "5 Books"</title><content type='html'>Andrew Gelman is one of the most interesting (IMHO) social scientist/statisticians in The Academy. Not only does he have serious statistical chops (he co-authored &lt;i&gt;Bayesian Data Analysis&lt;/i&gt; with Carlin, Stern, and Rubin), but he also has published a raft of papers on voting patterns. His blog &lt;a href="http://www.stat.columbia.edu/~cook/movabletype/mlm/"&gt;Statistical Modeling, Causal Inference, and Social Science&lt;/a&gt; -- written with his colleague Aleks Jakulin -- offers wide ranging commentary on everything from statistical theory and philosophy, to R (the statistical software), to all manner of social statistics.&lt;br /&gt;Gelman was recently approached by &lt;a href="http://thebrowser.com/"&gt;The Browser&lt;/a&gt; to suggest five books on how people vote in the U.S., but instead he provided &lt;a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2011/01/5_books.html"&gt;a list of five excellent books about statistics&lt;/a&gt;.&amp;nbsp; #1 on his list:&amp;nbsp; Bill James’ &lt;em&gt;Baseball Abstracts 1982-1986&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-2303061748698355971?l=bayesball.blogspot.com' alt='' 
/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/2303061748698355971/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2011/01/blogger-bayes-ball-edit-post-gelman.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2303061748698355971'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2303061748698355971'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2011/01/blogger-bayes-ball-edit-post-gelman.html' title='Andrew Gelman&amp;#39;s &amp;quot;5 Books&amp;quot;'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-6893402910290211984</id><published>2010-12-22T20:45:00.000-08:00</published><updated>2011-04-06T17:30:21.365-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ERA'/><category scheme='http://www.blogger.com/atom/ns#' term='good math-bad statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='talent distribution'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='regression toward the mean'/><category scheme='http://www.blogger.com/atom/ns#' term='pitching'/><category scheme='http://www.blogger.com/atom/ns#' term='selection bias'/><title type='text'>The ERA distribution curve</title><content type='html'>&lt;b&gt;NOTE: Tango and MGL at The Book, through the &lt;a 
href="http://www.insidethebook.com/ee/index.php/site/comments/lemonade/"&gt;"Lemonade"&lt;/a&gt; thread, have critiqued the analysis below and pointed out errors in my assumptions. These errors mean that my closing conclusions are wrong -- good math, but bad statistics on my part. Through this post, you will find italicized text describing my errors.&lt;br /&gt;&lt;i&gt;Revised January 5, 2011&lt;/i&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;J.C. Bradbury's recent blog&amp;nbsp;postings (&lt;a href="http://www.sabernomics.com/sabernomics/index.php/2010/12/agreeing-and-disagreeing-with-bill-james/"&gt;here&lt;/a&gt; and &lt;a href="http://www.sabernomics.com/sabernomics/index.php/2010/12/whats-wrong-with-replacement-level-valuing-of-players"&gt;here&lt;/a&gt;) have included histograms showing the distribution of ERA across major league pitchers for the 2009 season.&amp;nbsp; For his analysis, Bradbury omitted those pitchers with fewer than 100 batters faced -- in both his blog and book &lt;em&gt;Hot Stove Economics&lt;/em&gt;, he justifies this due to the wide variance in ERA scores, much of which will be due to the small number of "samples" for each pitcher.&amp;nbsp; (As we saw in my &lt;a href="http://bayesball.blogspot.com/2010/11/bo-knows-probability.html"&gt;earlier post about Bo Hart&lt;/a&gt;, it's possible for an average&amp;nbsp;player to do very well over the short term; the inverse applies too.)&lt;br /&gt;&lt;br /&gt;But&amp;nbsp;a few comments on Bradbury's blog from readers&amp;nbsp;ask about the impact that those "missing cases", who account for nearly a third (28%) of all individuals who pitched in MLB in 2009, would have on the curve.&lt;br /&gt;&lt;br /&gt;Here's the answer:&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;em&gt;Figure 1: MLB Pitching, 2009 -- Number of Pitchers by ERA, by Number of Batters Faced&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TRK6fkMG6ZI/AAAAAAAAADg/1Nn3QQp87jk/s1600/pitching_2009_ch_ERA-BFP.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="291" src="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TRK6fkMG6ZI/AAAAAAAAADg/1Nn3QQp87jk/s400/pitching_2009_ch_ERA-BFP.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Incorporating the &amp;lt;100 BFP pitchers (the black chunks of each bar)&amp;nbsp;adds&amp;nbsp;pitchers across the whole range, although they are skewed to the right (i.e. higher ERAs).&amp;nbsp; While there is a stack on the left with very low ERAs, there's a bigger group of players with an ERA greater than 10. (The highest ERA of this group was 135.00.)&lt;br /&gt;&lt;br /&gt;&lt;i&gt;NOTE 1: ERA is a poor measure to use for this type of evaluation -- for pitchers with a low number of batters faced or innings pitched, it's easy for huge numbers to appear. That 135.00 ERA is the equivalent of 5 earned runs with only a single recorded out (a rate of 15 earned runs per inning).&amp;nbsp; These exaggerated values then lead to an upward distortion of the mean for the group.&amp;nbsp; A better measure would be wOBA, or another measure that resembles a probability between 0 and 1.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The table below shows the average ERA of this group and three other groups based on the number of batters faced. &amp;nbsp;What we see is that the &amp;lt;100 BFP pitchers have a higher ERA than those who pitched more frequently. &amp;nbsp;(This difference is statistically significant.) &amp;nbsp;In spite of the variation in their ERAs, this group is, on average, less skilled than the other three groupings of pitchers.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;NOTE 2: This is where I went wrong. The math is correct, but there is bias in the sample that I ignored. 
We can be fairly confident that pitchers who get off to a poor start won't get many opportunities to pitch -- and therefore won't get the opportunity to regress to the mean. Pitchers who do better at the start of their season will continue to pitch, and regress to the mean.&amp;nbsp; This process may take them some time, which may push them over the arbitrary line of 100 batters faced.&amp;nbsp; Thus the statistical significance is an artifact of the bias.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Figure 2: MLB Pitching, 2009 -- Average ERA, by Number of Batters Faced&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TRK6mdsNn7I/AAAAAAAAADo/zEEdE2zug4A/s1600/pitching_2009_ERA-BFP.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="135" src="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TRK6mdsNn7I/AAAAAAAAADo/zEEdE2zug4A/s400/pitching_2009_ERA-BFP.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;In a &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/lemonade/"&gt;thread on The Book blog&lt;/a&gt; that covered this same topic, I made a similar statement (reply #8): "What I’m trying to say is that our best estimate of the “true talent” of this group is an ERA of 8.11 [in the current case, 8.72], and that estimate is quite accurate". That statement got a response from Tango (reply #9) of "That is not accurate. If you look at how those pitchers who faced fewer than 100 batters did in the season preceding or the season following, THAT will give you a much better indicator of the true talent level."&lt;br /&gt;&lt;br /&gt;So let me clarify. &amp;nbsp;The average level of skill of the pitchers who faced fewer than 100 batters in 2009, is an average ERA of 8.72. 
Although Tango is correct in his assertion that the poorest performers would regress upwards, by the same token the best pitchers (some of whom managed a 0.00 ERA in their short stint) would get worse. But if we were to let all 188 of them continue to pitch, we could be 95% certain that the "true" ERA of the group would end up somewhere between 6.92 and 10.52.&lt;br /&gt;&lt;br /&gt;Even the lower bound (i.e. the lowest score we would expect with our more rigorous testing) is higher than the upper bound of any of the other groups.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;NOTE 3:&amp;nbsp; My statement above would be correct, if it were not for the bias in the sample.&amp;nbsp; My belief had been that this group would regress not to the MLB average, but to the average of the &amp;lt;100 BFP pitchers.&amp;nbsp; But because of the selection bias, this does not hold true.&lt;/em&gt;&lt;br /&gt;Here's a simple example to demonstrate how this works. Think of the probability professor's favourite tool, the coin toss. If we had a penny, tossed it repeatedly -- say, 10,000 times -- and recorded the result each time, the proportion of heads would very accurately reflect the true probability of heads for that individual penny. And we'd need plenty of tosses to get an accurate measure of the single penny.&lt;br /&gt;&lt;br /&gt;But what if instead of one penny we had 188 pennies, and we varied the number of tosses each penny got? Although the average number of tosses would be 50, some pennies might get only one toss, while others would get as many as 100 tosses. Some of those short sequences might come up all heads, while others would heavily favour the tails. 
On average, though, across the 188 pennies, we would find that the group average was a close reflection of the "true average" of the group.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;NOTE 4: The error in the initial assumption causes my coin flipping analogy to&amp;nbsp;fall apart.&amp;nbsp; If “success” is a head, then the coin that comes up heads more than half the time (&amp;gt;0.5) will keep being flipped, possibly with enough flips to no longer be part of the “low flip” group (over that arbitrary threshold).&amp;nbsp; Meanwhile, a coin that runs tails more often will get pulled from the trials quickly, and end up &amp;lt;0.5 and with few flips.&amp;nbsp; Thus, as a group, the coins with a smaller number of flips will end up looking worse than those that keep getting flipped.&amp;nbsp; Selection bias causes an apparent difference, where none really exists. &lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;And so it is with the pitchers in question. If they were like the other pitchers in MLB, we would expect that some of the &amp;lt;100 batters faced pitchers would have ERAs above the league average, while others would fall below. What we see, however, is that while there is a wide variation, the average is substantially higher than that of the other groups of pitchers. 
&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;em&gt;NOTE 5:&amp;nbsp; ...because of selection bias!&amp;nbsp; The lesson:&amp;nbsp; selection bias&amp;nbsp;can crop up&amp;nbsp;anywhere, even if you are not the one doing the selecting.&lt;/em&gt;&lt;/div&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-6893402910290211984?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/6893402910290211984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/12/era-distribution-curve.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6893402910290211984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6893402910290211984'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/12/era-distribution-curve.html' title='The ERA distribution curve'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_DcSPZvAgn6Y/TRK6fkMG6ZI/AAAAAAAAADg/1Nn3QQp87jk/s72-c/pitching_2009_ch_ERA-BFP.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-8422844289390170841</id><published>2010-12-20T18:36:00.001-08:00</published><updated>2011-04-06T17:30:37.438-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='Mariano Rivera'/><category scheme='http://www.blogger.com/atom/ns#' term='ERA'/><category scheme='http://www.blogger.com/atom/ns#' term='talent distribution'/><category scheme='http://www.blogger.com/atom/ns#' term='Bill James'/><category scheme='http://www.blogger.com/atom/ns#' term='pitching'/><category scheme='http://www.blogger.com/atom/ns#' term='Greg Maddux'/><title type='text'>Agreeing with Bill James</title><content type='html'>In 1988, the Bill James Abstract included &lt;a href="http://www.baseball1.com/bb-data/bbd-bj1.html"&gt;"A Bill James Primer"&lt;/a&gt;, with 15 statements expressing what he deemed to be useful knowledge. On that list was:&lt;br /&gt;&lt;span class="Apple-style-span"&gt;&lt;i&gt;2. Talent in baseball is not normally distributed. It is a pyramid. For every player who is 10 percent above the average player, there are probably twenty players who are 10 percent below average.&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;I agree. (&lt;a href="http://www.sabernomics.com/sabernomics/index.php/2010/12/agreeing-and-disagreeing-with-bill-james/"&gt;Others don't&lt;/a&gt;; for further discussion &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/lemonade/"&gt;also see here&lt;/a&gt;.)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;But what is this thing called "talent"? Talent is a combination of a high level of skill and sustained, consistent performance. Skill in baseball is measured through metrics such as &lt;a href="http://en.wikipedia.org/wiki/Earned_run_average"&gt;ERA&lt;/a&gt; (earned run average) and &lt;a href="http://en.wikipedia.org/wiki/On-base_plus_slugging"&gt;OPS&lt;/a&gt; (on-base average plus slugging percentage) -- measures that turn counting stats into an efficiency or rate measure. 
While measures of this type are important, they fail to account for the fact that some players have lengthy careers, while other players have very short MLB careers.&lt;/span&gt;&lt;span class="Apple-style-span"&gt; Teams will sign long-term contracts with aging superstars because the player's skill is still above average, even though it may have diminished with age. &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;In short, career length becomes a valid proxy for talent.&lt;/span&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span class="Apple-style-span"&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;The charts below plot the number of pitchers over the period 1996-2009, by both the number of games played (which favours the relief pitchers) and innings pitched (which favours the starters). During this period a total of 2,134 individuals pitched in MLB -- but the charts show that very few of them stuck around for any length of time.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span"&gt;At the head of the "games" list at 898 is the still-active &lt;a href="http://www.baseball-reference.com/players/r/riverma01.shtml"&gt;Mariano Rivera&lt;/a&gt;, while the pitcher with the most innings over this period was &lt;a href="http://www.baseball-reference.com/players/m/maddugr01.shtml"&gt;Greg Maddux&lt;/a&gt; (2887.67 innings; and Maddux threw more than 2,100 innings before 1996, as well). These two individuals, and other Hall of Fame calibre pitchers, are out at the far right of the long tail. 
Close to the origin at the left are pitchers whose entire career lasted but 1/3 of an inning -- a single out.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Figure 1: Number of Pitchers, by Career Innings Pitched (1996-2009)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TRATIzUm4uI/AAAAAAAAACw/VWQs5_unpT0/s1600/career%2Binnings.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5552959382484345570" src="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TRATIzUm4uI/AAAAAAAAACw/VWQs5_unpT0/s320/career%2Binnings.jpg" style="cursor: hand; display: block; height: 218px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Figure 2: Number of Pitchers, by Career Games (1996-2009)&lt;/span&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TRATUbGkYUI/AAAAAAAAAC4/Xmp1vIig258/s1600/career%2Bgames.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5552959582141440322" src="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TRATUbGkYUI/AAAAAAAAAC4/Xmp1vIig258/s320/career%2Bgames.jpg" style="cursor: hand; display: block; height: 218px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;But what of the average skill level of those pitchers? Pitchers who get a small amount of MLB experience (fewer than 27 innings) have a higher ERA than those who get more opportunities to pitch. This group -- 27% of all MLB pitchers -- recorded an average ERA of 8.08, compared to 5.15 for the 42% who pitched between 27 and 269 innings, and 4.45 for the 27% who threw between 270 and 1349 innings. 
The elite, those who pitched 1350 innings and above, recorded the lowest ERA of all, 4.17.&lt;br /&gt;&lt;br /&gt;In spite of the wide variance in the ERAs of the coffee drinkers, the differences in the mean scores are statistically significant.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Figure 3: MLB Pitchers, average ERA, by number of innings pitched (1996-2009)&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TRECQEvbDhI/AAAAAAAAADA/hbpwOmp-Wd8/s1600/ERA%2Bby%2BIP.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5553222290698341906" src="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TRECQEvbDhI/AAAAAAAAADA/hbpwOmp-Wd8/s400/ERA%2Bby%2BIP.jpg" style="cursor: hand; display: block; height: 123px; margin: 0px auto 10px; text-align: center; width: 400px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;In summary: there is an abundance of players who are less talented than the major league average, while at the same time the number of above-average talents is low. The distribution, at the major league level, is not normal. 
Just like Bill James said 22 years ago.&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;-30- &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-8422844289390170841?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/8422844289390170841/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/12/agreeing-with-bill-james.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/8422844289390170841'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/8422844289390170841'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/12/agreeing-with-bill-james.html' title='Agreeing with Bill James'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_DcSPZvAgn6Y/TRATIzUm4uI/AAAAAAAAACw/VWQs5_unpT0/s72-c/career%2Binnings.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-674200541930118389</id><published>2010-12-16T08:34:00.000-08:00</published><updated>2011-04-06T17:30:50.219-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><category scheme='http://www.blogger.com/atom/ns#' term='Cliff Lee'/><category scheme='http://www.blogger.com/atom/ns#' term='salary'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Greg Maddux'/><title type='text'>Angell turns comic</title><content type='html'>Roger Angell, writing on the New Yorker site, cracks me up with his article &lt;a href="http://www.newyorker.com/online/blogs/sportingscene/2010/12/stats.html"&gt;"Stats"&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Yes, Cliff Lee had a high UPUBB (Unexpectedly Passing Up Big Bucks), but where does it rank in the history of the game?  Greg Maddux did the same in 1993, when the Yankees made him a better (well, more financially lucrative) offer than the Braves, but has anyone done a similar analysis of non-monetary influences on player signing decisions?&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-674200541930118389?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/674200541930118389/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/12/angell-turns-comic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/674200541930118389'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/674200541930118389'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/12/angell-turns-comic.html' title='Angell turns comic'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7442832126090226183</id><published>2010-12-10T16:33:00.000-08:00</published><updated>2011-04-06T17:31:04.476-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Albert Pujols'/><category scheme='http://www.blogger.com/atom/ns#' term='slugging'/><category scheme='http://www.blogger.com/atom/ns#' term='regression toward the mean'/><category scheme='http://www.blogger.com/atom/ns#' term='Jason Kendall'/><category scheme='http://www.blogger.com/atom/ns#' term='selection bias'/><category scheme='http://www.blogger.com/atom/ns#' term='batting'/><title type='text'>Slugging regression II</title><content type='html'>&lt;div&gt;Building on my &lt;a href="http://bayesball.blogspot.com/2010/12/slugging-regression.html"&gt;previous post&lt;/a&gt;, this time around we'll look at a bigger group of hitters, those with at least 75 at-bats in both 2007 and 2008. This is a total of 360 players.&lt;/div&gt;&lt;div&gt;Theoretically, with a lower at-bat threshold, we would see a greater number of very high SLG values and also a larger number of below-average SLG values.  But we've already seen hints that player talent gets evaluated early on (in the previous post, I identified the fact that the worst SLG in the 400+ group wasn't as far below average as the best SLG was above it).&lt;/div&gt;&lt;br /&gt;&lt;div&gt;How to read the charts below: in both cases, there are 25 players plotted. Those that fall between the 100% and zero lines are regressing to the league mean. And the closer they are to the zero line, the bigger the regression.  
As shown in Figure 1, 22 of the top sluggers regressed toward the mean in 2008, 3 improved (led by Albert Pujols) and none fell below the league average.&lt;/div&gt;&lt;br /&gt;&lt;div&gt;For these players, 66% of their 2008 SLG score was accounted for by their 2007 SLG (and therefore the league average accounted for 34%).&lt;/div&gt;&lt;br /&gt;&lt;div&gt;An interesting observation is that these players are by and large the same as the 400+ AB group I dealt with in the previous post. Of the 25, 19 had 400+ ABs in both years.  And of the remaining 6, 4 of the players had below 400 in 2007 and then over 400 in 2008.  This group includes familiar names -- &lt;a href="http://www.thebaseballcube.com/players/H/Josh-Hamilton.shtml"&gt;Josh Hamilton&lt;/a&gt;, &lt;a href="http://www.thebaseballcube.com/players/M/David-Murphy.shtml"&gt;David Murphy&lt;/a&gt;, and &lt;a href="http://www.thebaseballcube.com/players/R/Cody-Ross.shtml"&gt;Cody Ross&lt;/a&gt;.  All of them are young sluggers who did well in a short stint in 2007, and were given the opportunity to continue to play in 2008.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div align="center"&gt;&lt;em&gt;Figure 1: Top 25 SLG (2007), minimum 75 at-bats&lt;br /&gt;&lt;/em&gt;&lt;/div&gt;&lt;a href="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TQLJKhFePMI/AAAAAAAAACg/MIc0bPIiLWw/s1600/slug75-top.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5549218873391070402" src="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TQLJKhFePMI/AAAAAAAAACg/MIc0bPIiLWw/s400/slug75-top.jpg" style="cursor: hand; display: block; height: 220px; margin: 0px auto 10px; text-align: center; width: 400px;" /&gt;&lt;/a&gt;&lt;br /&gt;For hitters at the bottom of the slugging table, we see a similar pattern of regression.  Figure 2 shows SLG "improvement" in the opposite direction:  the closer the bar gets to the bottom of the chart, the bigger the improvement.  
Thus of the 25 players, 17 regressed toward the mean without reaching it, and 3 others overshot it and exceeded the league average (the ones who fell "below zero"). The remaining 5, on the other hand, started out below average in 2007 and fared worse in 2008.&lt;br /&gt;For this group, the previous year's SLG accounted for only 55% of their 2008 SLG.&lt;br /&gt;&lt;div style="text-align: left;"&gt;And this group of 25 is decidedly not the same set of players as the weakest sluggers of the 400+ AB group.  Only 1 -- Jason Kendall -- appears in both lists.&lt;/div&gt;&lt;br /&gt;In short, the "survivorship bias" that keeps good players active with opportunities to hit has an inverse impact on the players at the bottom of the stack.  Unless they show a huge improvement that brings them much closer to the league average, these players seem to be destined for part-time roles.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;em&gt;Figure 2: Bottom 25 SLG (2007), minimum 75 at-bats&lt;/em&gt;&lt;/div&gt;&lt;a href="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TQLKQ8tY42I/AAAAAAAAACo/WfQUaUYrJKo/s1600/slug75-bottom.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5549220083397092194" src="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TQLKQ8tY42I/AAAAAAAAACo/WfQUaUYrJKo/s400/slug75-bottom.jpg" style="cursor: hand; display: block; height: 220px; margin: 0px auto 10px; text-align: center; width: 400px;" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This analysis is best described as "proof of concept".  
Certainly any conclusions drawn should be tentatively stated, and a more robust analysis over a greater number of seasons is warranted.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-30-&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7442832126090226183?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7442832126090226183/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/12/slugging-regression-ii.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7442832126090226183'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7442832126090226183'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/12/slugging-regression-ii.html' title='Slugging regression II'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_DcSPZvAgn6Y/TQLJKhFePMI/AAAAAAAAACg/MIc0bPIiLWw/s72-c/slug75-top.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-3821125369013502934</id><published>2010-12-10T10:06:00.001-08:00</published><updated>2011-04-06T17:31:16.535-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Albert Pujols'/><category scheme='http://www.blogger.com/atom/ns#' 
term='slugging'/><category scheme='http://www.blogger.com/atom/ns#' term='regression toward the mean'/><category scheme='http://www.blogger.com/atom/ns#' term='Jason Kendall'/><category scheme='http://www.blogger.com/atom/ns#' term='sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='batting'/><title type='text'>Slugging regression</title><content type='html'>Tango &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/tangotiger_challenge_of_the_day/"&gt;issued a multi-part challenge&lt;/a&gt;, of which the first part is:&lt;br /&gt;&lt;span style="font-family: arial;"&gt;&lt;em&gt;1. Take the top 10 in SLG in each of the last 10 years, and tell me what the overall average SLG of these 100 players was in the following year.&lt;/em&gt;&lt;/span&gt;&lt;br /&gt;&lt;em&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The point of the challenge is to demonstrate that top performing players will &lt;a href="http://en.wikipedia.org/wiki/Regression_toward_the_mean"&gt;regress toward the mean &lt;/a&gt;in subsequent seasons, and that the year under consideration accounts for, as a rule of thumb, 70% of the next season's performance, and the league average (to which their performance regresses) the other 30%.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In algebraic terms, X is predicted to be 70% when&lt;br /&gt;&lt;span style="font-family: arial;"&gt;X = (SLG2 - LSLG)/(SLG1-LSLG)&lt;/span&gt;&lt;br /&gt;Where:&lt;br /&gt;&lt;span style="font-family: arial;"&gt;SLG1 is season 1 slugging average,&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;SLG2 is season 2 slugging average,&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;LSLG is the average league slugging average (from season 1)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Leo quickly responded (comment #1 to Tango's post), with his calculation that for slugging, 73.3% was accounted for by the player's average in the first season. 
To my way of thinking, the challenge has been met -- job well done, Leo!&lt;br /&gt;&lt;br /&gt;But as I started to think about it further, I began to wonder how far through the rankings this rule of thumb holds -- as we approach the league average, the player's SLG and the league SLG become one and the same number, so the denominator of X approaches zero and the ratio becomes unstable. And at the opposite end of the scale -- the non-sluggers -- do they regress &lt;em&gt;upwards&lt;/em&gt; towards the mean?&lt;br /&gt;&lt;br /&gt;So my first step was to simplify the challenge, and only look at two consecutive seasons, 2007 and 2008. Using only those players who had a minimum of 400 at-bats each season, I pruned the list down to 139 players across the NL and AL. Simplifying matters further is the fact that the 2007 SLG for the NL was the same as the AL -- .423. So for my "top sluggers" I then looked at the top 25 across both leagues.&lt;br /&gt;&lt;br /&gt;The result: for these 25 players, on average, 66% of their 2008 SLG was accounted for through their 2007 score. A few percentage points from Tango's rule of thumb, but close enough.&lt;br /&gt;&lt;br /&gt;Charting the results shows that all but two of the top 25 sluggers regressed downwards towards the mean. And of the two, only one improved dramatically: &lt;a href="http://www.thebaseballcube.com/players/P/Albert-Pujols.shtml"&gt;Albert Pujols &lt;/a&gt;(who inched up still further in 2009, before regressing ever-so-slightly in 2010). 
Were Pujols not in the mix, the 2007 SLG would account for only 62% of the 2008 scores.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TQKYtUbHTbI/AAAAAAAAACA/I6llNP_RmHQ/s1600/slug07-08_1.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5549165595217841586" src="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TQKYtUbHTbI/AAAAAAAAACA/I6llNP_RmHQ/s320/slug07-08_1.jpg" style="cursor: hand; display: block; height: 208px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Another interesting observation is that of these top performers, not one fell so far in 2008 to end up with a SLG below the league average. That's not to say that it wouldn't happen, but it suggests that at the extreme end of the performance curve, as determined over the course of a full season, top performers really are above average. (NOTE: further testing required!)&lt;br /&gt;&lt;br /&gt;But what of the other end of the ranking? I looked at the lowest performing players that I had selected, and the rule of thumb does &lt;em&gt;not&lt;/em&gt; work. From the bottom up, the percentage explained was 87%, 84%, -4.4%, -28%, ...&lt;br /&gt;&lt;br /&gt;At this point, I started to wonder -- why &lt;em&gt;minus&lt;/em&gt; values? A quick check of the numbers, and I saw that these players regressed up, and to a point &lt;em&gt;above&lt;/em&gt; the league average.&lt;br /&gt;&lt;br /&gt;So what's different about the bottom of the range? It's simple: &lt;a href="http://en.wikipedia.org/wiki/Survivorship_bias"&gt;survivorship bias&lt;/a&gt;. My "sample" of 139 players who had 400+ ABs in each of 2007 and 2008, while ensuring I found the top hitters, automatically excluded those weak-slugging players who don't get many plate appearances but who collectively drag down the league average. Thus the "worst" players of the 139 with lots of ABs were not (by and large) far from the league average. 
The bottom of the list was &lt;a href="http://www.thebaseballcube.com/players/K/Jason-Kendall.shtml"&gt;Jason Kendall&lt;/a&gt;, who slugged .309 in 2007 for the A's and the Cubs while catching. Perform much worse than that, and you'll end up playing Triple A. Or in Kendall's case, for the Royals.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;On deck: regression toward the mean, SLG with 75+ ABs.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;-30-&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-3821125369013502934?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/3821125369013502934/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/12/slugging-regression.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3821125369013502934'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3821125369013502934'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/12/slugging-regression.html' title='Slugging regression'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_DcSPZvAgn6Y/TQKYtUbHTbI/AAAAAAAAACA/I6llNP_RmHQ/s72-c/slug07-08_1.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-3335693498224263746</id><published>2010-12-09T09:16:00.000-08:00</published><updated>2011-04-06T17:31:28.976-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='run expectancy'/><category scheme='http://www.blogger.com/atom/ns#' term='linear weights'/><title type='text'>The fundamentals</title><content type='html'>&lt;a href="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TQEQFwjQHWI/AAAAAAAAAB4/XVeD25gTSvM/s1600/agent-smith.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5548733907015310690" src="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TQEQFwjQHWI/AAAAAAAAAB4/XVeD25gTSvM/s200/agent-smith.jpg" style="cursor: hand; float: right; height: 200px; margin: 0px 0px 10px 10px; width: 155px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;Tango continues to provide the essential ingredients. First it was the &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/run_expectancy_matrix_1950_2010/"&gt;run expectancy matrix&lt;/a&gt;, now it's &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/bases_outs_by_event/"&gt;bases &amp;amp; outs by events&lt;/a&gt;.  
These two tables are fundamentals of sabermetrics -- understanding baseball (and in particular, the costs and benefits of different strategies) starts here.&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;em&gt;[Right: Agent Smith ponders the matrix.]&lt;/em&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;-30-&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-3335693498224263746?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/3335693498224263746/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/12/fundamentals.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3335693498224263746'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/3335693498224263746'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/12/fundamentals.html' title='The fundamentals'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_DcSPZvAgn6Y/TQEQFwjQHWI/AAAAAAAAAB4/XVeD25gTSvM/s72-c/agent-smith.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5886619365979739772</id><published>2010-12-07T20:49:00.000-08:00</published><updated>2011-04-06T17:31:41.589-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='run expectancy'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='prediction'/><title type='text'>Graphing run expectancy</title><content type='html'>&lt;a href="http://www.baseballprospectus.com/statistics/sortable/index.php?cid=139594"&gt;Baseball Prospectus&lt;/a&gt; provided the world with the 2010 situational run expectancies, and Joshua Maciel has provided a &lt;a href="http://henkakyuu.blogspot.com/2010/12/run-expectancy-by-base-out-state-2010.html"&gt;graphic display&lt;/a&gt;.  The graph was first posted to and then evolved as a result of feedback from &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/graphical_run_expectancy_chart/"&gt;readers at Tango's blog&lt;/a&gt; -- a fascinating process in and of itself.  
The graph is still a bit busy to my mind (but it's certainly not a &lt;a href="http://www.edwardtufte.com/tufte/books_vdqi"&gt;Tufte&lt;/a&gt;-ian duck...), but it's a fine piece of work displaying what is arguably one of the most important pieces of baseball data.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;div&gt;And remember, it was George Lindsey who first published these data, back in 1963.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_AtlVw4SMQfg/TP3N2JARj2I/AAAAAAAAAPA/mn3cuZSf98M/s1600/baseout-states-%25282010-12-08%2529.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" src="http://1.bp.blogspot.com/_AtlVw4SMQfg/TP3N2JARj2I/AAAAAAAAAPA/mn3cuZSf98M/s1600/baseout-states-%25282010-12-08%2529.png" style="cursor: hand; cursor: pointer; float: right; height: 240px; margin: 0 0 10px 10px; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;i&gt;[Right: Joshua's run expectancy chart. Click to see it full-size, and visit his blog for &lt;a href="http://henkakyuu.blogspot.com/2010/12/run-expectancy-by-base-out-state-2010.html"&gt;all the details&lt;/a&gt;.]&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-30-&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5886619365979739772?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5886619365979739772/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/12/graphing-run-expectancy.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5886619365979739772'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/684119683005319088/posts/default/5886619365979739772'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/12/graphing-run-expectancy.html' title='Graphing run expectancy'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_AtlVw4SMQfg/TP3N2JARj2I/AAAAAAAAAPA/mn3cuZSf98M/s72-c/baseout-states-%25282010-12-08%2529.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-6056166848859232644</id><published>2010-11-29T07:08:00.000-08:00</published><updated>2011-04-06T17:31:53.451-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='good math-bad statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='epistemology'/><category scheme='http://www.blogger.com/atom/ns#' term='philosophy'/><title type='text'>Good math, bad statistics</title><content type='html'>In the past few days, a pair of posts on other blogs caught my attention -- they seem to be coming at the same issue from different directions. &lt;br /&gt;&lt;div&gt;First, William R. Briggs posted "&lt;a href="http://wmbriggs.com/blog/?p=3169"&gt;Statistics Is Not Math&lt;/a&gt;" (November 16, 2010). Then, Tango over at The Book posted "&lt;a href="http://www.insidethebook.com/ee/index.php/site/article/detrending_when_statisticians_attack/"&gt;Detrending: when statisticians attack!&lt;/a&gt;" (November 24, 2010). 
I responded to the Tango post (comment #4), but I would like to elaborate further here.&lt;/div&gt;&lt;div&gt;One of the things that jumped out at me from Briggs' post was the statement that "Statistics rightly belongs to epistemology, the philosophy of how we know what we know. Probability and statistics can even be called quantitative epistemology." In other words, statistics is useful only if we have some understanding of the subject matter at hand. No amount of fancy math will help our understanding if we do not start our research with some knowledge of the topic.&lt;/div&gt;&lt;div&gt;In the "Detrending" post, Tango links to an unpublished (in the academic sense that it's not been published in a peer-reviewed journal) paper by three physicists, Alexander M. Petersen, Orion Penner, and H. Eugene Stanley, entitled &lt;a href="http://arxiv.org/PS_cache/arxiv/pdf/1003/1003.0134v1.pdf"&gt;"Detrending career statistics in professional baseball: Accounting for the steroids era and beyond"&lt;/a&gt;. I may offer a longer critique of this paper at a later date, but the first thing that jumps out is an apparent ignorance of The Literature (i.e. what's been written earlier about the topic -- baseball -- from a statistical perspective). This leads the authors to present as new findings conclusions that are already well established elsewhere (for example, that pitcher wins are not a good measure of pitcher performance, or that standardizing allows for inter-season comparisons).&lt;/div&gt;&lt;div&gt;There's lots of fancy math (some of which isn't as fancy or new-fangled as the authors seem to think) and plenty of Greek letters, but in the end it doesn't add a great deal to our understanding of baseball.&lt;/div&gt;&lt;div&gt;This article serves as a reminder that when we are assessing the quality of any sabermetric writing, we need to consider two factors:&lt;/div&gt;&lt;div&gt;1. 
Is the author using the appropriate statistical tools and interpreting the mathematical results correctly? &lt;/div&gt;&lt;div&gt;2. Does the author understand the game, including how baseball has evolved and the analytic literature that has been written over the past 50 years?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-30-&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-6056166848859232644?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/6056166848859232644/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/11/good-math-bad-statistics.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6056166848859232644'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6056166848859232644'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/11/good-math-bad-statistics.html' title='Good math, bad statistics'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-1431040099899105510</id><published>2010-11-22T21:16:00.000-08:00</published><updated>2011-04-06T17:32:07.528-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Bo Hart'/><category 
scheme='http://www.blogger.com/atom/ns#' term='streak'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='regression toward the mean'/><category scheme='http://www.blogger.com/atom/ns#' term='batting'/><title type='text'>Bo knows probability</title><content type='html'>&lt;a href="http://www.freewebs.com/316sports/bo%20hart.JPG" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" src="http://www.freewebs.com/316sports/bo%20hart.JPG" style="cursor: hand; float: left; height: 498px; margin: 0px 10px 10px 0px; width: 351px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Cardinals' second baseman Bo Hart&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;Over on &lt;a href="http://www.3-dbaseball.net/"&gt;3-D Baseball&lt;/a&gt;, Kincaid has a nice explanation of regression to the mean in a post titled &lt;a href="http://www.3-dbaseball.net/2010/11/on-correlation-regression-and-bo-hart.html"&gt;"On Correlation, Regression, and Bo Hart"&lt;/a&gt;. The blog entry starts with the story of Bo Hart, who got called up to the Cardinals in June 2003, and promptly hit .412 over his first 75 at-bats. 
Since Kincaid wrote a regression to the mean article, you can guess where Hart's season went -- he finished with &lt;a href="http://www.thebaseballcube.com/players/H/bo-hart.shtml"&gt;286 at-bats and a .277 average&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But Kincaid flirts with a few notions that I think are worth following in a bit more detail.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First up, what are the odds that a .277 hitter will hit .412 or better across a string of 75 at-bats?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The answer is roughly 1 in 200.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is calculated through the fact that the normal distribution approximates the binomial distribution -- in English, if you repeat a set of binomial trials, the histogram of the success rates for the trials will look like the normal curve. This leads us to the &lt;a href="http://en.wikipedia.org/wiki/Probability_density_function"&gt;probability density function&lt;/a&gt;; the area under that curve beyond a given point gives the probability that a value (in this case, a batting average of .412 or higher) occurs given the mean value (.277).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Using Bo Hart's season batting average of .277 as his "true talent" (or "population mean") across 75 at-bats, we can calculate the standard deviation of the distribution (0.052, the square root of .277 × .723 / 75). We then determine that .412 lies at 2.60 standard deviations from the mean (2.60=[.412-.277]/.052). The probability of a result 2.60 or more standard deviations above the mean is about 0.5% -- or 1 in 200.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What was unusual about Bo Hart is that his 1 in 200 string of successful at-bats occurred at the beginning of his Major League career. 
Calculating &lt;i&gt;that&lt;/i&gt; probability is a task for another day.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In my next post I will explore Kincaid's statements about evaluating "true talent" based on a number of observations. Specifically, I'll delve into the following questions: "At what point can we be relatively certain about our inferences of true talent based on observed performance? 75 PAs is not enough, and one million is plenty, but what about 1000?"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-30-&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-1431040099899105510?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/1431040099899105510/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/11/bo-knows-probability.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1431040099899105510'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1431040099899105510'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/11/bo-knows-probability.html' title='Bo knows probability'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-1019745901496500690</id><published>2010-11-14T17:22:00.001-08:00</published><updated>2011-04-06T17:32:29.039-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Golden Baseball League'/><category scheme='http://www.blogger.com/atom/ns#' term='Victoria Seals'/><title type='text'>Sealing the exits?</title><content type='html'>&lt;a href="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TOCcodI7NRI/AAAAAAAAABw/KBkE-Lpn_qI/s1600/3819374246_a70ec04980_o.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5539599760496473362" src="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TOCcodI7NRI/AAAAAAAAABw/KBkE-Lpn_qI/s400/3819374246_a70ec04980_o.jpg" style="cursor: hand; float: left; height: 271px; margin: 0px 10px 10px 0px; width: 400px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;The &lt;a href="http://victoriaseals.ca/"&gt;Victoria Seals&lt;/a&gt; of the &lt;a href="http://www.goldenbaseball.com/default.aspx"&gt;Golden Baseball League&lt;/a&gt; announced on Wednesday (November 10, 2010) that they were "ceasing operations". While acknowledging that the league has some serious challenges, the club was pointing more fingers at the City of Victoria, who owns and operates the Seals' home, Royal Athletic Park (RAP). &lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;div&gt;Like practically every other pro sports franchise that plays in a publicly owned facility, the Seals wanted a better deal. 
In the &lt;a href="http://victoriaseals.ca/component/content/article/42-rokstories/665-seals-cease-operations"&gt;news release and press conference&lt;/a&gt;, the Seals stated they had asked for a larger share of the gate and concession revenue, and solutions to a variety of issues relating to the playing field, including the ability to leave the outfield fence up all summer. The City's position is that they are unwilling to have taxpayers subsidize the club, and since RAP is a multi-purpose facility available for many users their hands are tied on the field issues.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I don't know enough about the details on either side to offer an informed opinion. But I can say both sides seem to have entrenched positions, based on their specific operating requirements (for the City, that includes the political reality as well as the business side) that seem reasonable enough. In short, there may not be a middle ground that is satisfactory to both parties.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The local press has played up the distance between the Seals and the City (the news article is &lt;a href="http://www.timescolonist.com/sports/After+years+Seals+adieu/3811493/story.html"&gt;here&lt;/a&gt;, but the local daily also weighed in with &lt;a href="http://www.timescolonist.com/sports/Another+team+falls+victim+bureaucracy/3816424/story.html"&gt;this opinion piece&lt;/a&gt; and &lt;a href="http://www.timescolonist.com/Seals+saga+familiar+these+shores/3816359/story.html"&gt;this more resigned article&lt;/a&gt;). But in so doing, the press has missed a key element that the Seals acknowledge: the Golden Baseball League is itself in a shambles. The team's press release describes the league as being in an "unstable state", but I suspect that understates the troubles.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To my way of thinking, the biggest problem is the distribution of the teams in the league. 
To expand the league off the mainland of North America to Victoria (on Vancouver Island) was one thing -- it guarantees a higher-per-mile travel cost (those ferries aren't cheap) and perhaps an extra hotel night. But what was the league thinking, given the evidence that the league is in a tenuous state to start with, adding clubs in Mexico (Tijuana -- not far from the mothballed San Diego club but across an international border) and in particular Hawaii (Maui)? &lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I have to wonder if, with their "cease operations" announcement, the management of the Seals is trying to press both the City of Victoria and the management of the Golden Baseball League. Perhaps it is just wishful thinking on my part, but if the Seals can get a more satisfactory arrangement with the City of Victoria (or another municipality in the greater Victoria area -- we have 13, after all), while pressuring the league to make some sensible choices about where the franchises are located, then perhaps we haven't seen the end of this round of professional baseball in Victoria.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Update (2010-11-15):&lt;/i&gt; The &lt;a href="http://www.theglobeandmail.com/news/national/british-columbia/tom-hawthorn/the-seals-depart-leaving-a-void-in-victoria/article1798610/?cmpid=rss1"&gt;Globe &amp;amp; Mail also chimes in&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-1019745901496500690?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/1019745901496500690/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/11/playing-all-their-cards.html#comment-form' 
title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1019745901496500690'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1019745901496500690'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/11/playing-all-their-cards.html' title='Sealing the exits?'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_DcSPZvAgn6Y/TOCcodI7NRI/AAAAAAAAABw/KBkE-Lpn_qI/s72-c/3819374246_a70ec04980_o.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7960796894659676749</id><published>2010-11-14T17:18:00.000-08:00</published><updated>2011-04-06T17:32:42.256-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='baseball bookshelf'/><category scheme='http://www.blogger.com/atom/ns#' term='sabermetrics'/><title type='text'>The Sabermetric bookshelf, #1</title><content type='html'>&lt;div&gt;&lt;a href="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TOCG0z91v9I/AAAAAAAAABo/UYvVuhwa1ds/s1600/numbersgame.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5539575783526612946" src="http://3.bp.blogspot.com/_DcSPZvAgn6Y/TOCG0z91v9I/AAAAAAAAABo/UYvVuhwa1ds/s320/numbersgame.jpg" style="cursor: hand; cursor: pointer; float: left; height: 262px; margin: 0 10px 10px 0; width: 195px;" /&gt;&lt;/a&gt;&lt;span style="font-style: italic;"&gt;The Numbers Game: Baseball's Lifelong Fascination with 
Statistics&lt;/span&gt;, by Alan Schwarz. 2004, St. Martin's Press.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;hr /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;In &lt;em&gt;The Numbers Game&lt;/em&gt;, Alan Schwarz presents a well-written and tidy history of the development and evolution of the statistics that record the history of the game.  Or more accurately, it's a history of baseball, and its evolution over the past century and a half, from the perspective of the numerical record and analysis of the game.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;Thus Schwarz begins in the mid-nineteenth century, with Henry Chadwick's influence on the information that got recorded.  But more importantly, Schwarz points out (and this becomes a recurring theme) that how the game was played was an influence on what got recorded.  In the early days, the ball was "pitched" to the batter in a way that facilitated batting it -- and because pitching was secondary to hitting and fielding, there was no record of pitching performance.  And as the game evolved, so did the numbers that recorded the game and got used to evaluate the players.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;A second recurring theme is the weaving of the technical aspects of the statistics with the personal characters of those who developed and promoted various measures.  This is very much a character-driven book -- we hear not only about the "why" of the statistics that were recorded, but the people who developed them and the means of recording them. So we hear about Al Munro Elias, Allan Roth's career with the Dodgers, Hal Richman's development of Strat-O-Matic, and George Lindsey's articles that appeared in academic journals beginning in the late 1950s.  
We also get an entire chapter devoted to the publication in 1969 of &lt;span style="font-style: italic;"&gt;The Baseball Encyclopedia&lt;/span&gt;, and another to Bill James.  &lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;One of the things that jumps out at me is the impact that computers -- particularly the personal computer -- have had on the volume of statistics available, and on the precision of the analysis that is now possible.  (And, what is perhaps a topic for another day, on the proliferation of analysts of varying quality.)&lt;/div&gt;&lt;div&gt;In &lt;i&gt;The Numbers Game&lt;/i&gt;, Schwarz has written what may well be the single best introduction to sabermetrics.  But it's not a technical manual that will tell you how to calculate any one statistic, or how another measure should be interpreted.  Instead it's a lively history of major league baseball, and of the numerical record and analysis that accompany it.&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Assessment: home run.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7960796894659676749?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7960796894659676749/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/11/sabermetric-bookshelf-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7960796894659676749'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7960796894659676749'/><link rel='alternate' type='text/html' 
href='http://bayesball.blogspot.com/2010/11/sabermetric-bookshelf-1.html' title='The Sabermetric bookshelf, #1'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_DcSPZvAgn6Y/TOCG0z91v9I/AAAAAAAAABo/UYvVuhwa1ds/s72-c/numbersgame.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-5013766054106702515</id><published>2010-11-14T14:34:00.002-08:00</published><updated>2010-11-14T14:49:15.473-08:00</updated><title type='text'>2010 in retrospect</title><content type='html'>&lt;div&gt;The 2010 MLB season has come to a close, and so begins a time of reflection and resolutions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;a) I started this blog, and rarely posted.  I started a few posts, often in response to other blogs, but finished fewer still.  I am confounded by the traffic on the blogosphere -- I thought I could respond thoughtfully and add something of value, but I find myself either repeating what gets said elsewhere, or sounding like a condescending pedant.  Or both.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So I'll start off on a different tack, starting now.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;b) I went to two MLB games in 2010.  (There's nothing like living across an international border to the closest team, and 4,300 km from the "national" club).  Two shutouts!  
Fangraphs has the results &lt;a href="http://www.fangraphs.com/wins.aspx?date=2010-05-28&amp;amp;team=Blue%20Jays&amp;amp;dh=0&amp;amp;season=2010"&gt;here&lt;/a&gt; and &lt;a href="http://www.fangraphs.com/wins.aspx?date=2010-10-01&amp;amp;team=Mariners&amp;amp;dh=0&amp;amp;season=2010"&gt;here&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;c) The local pro team just announced they are closing up shop.  More on that later.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-5013766054106702515?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/5013766054106702515/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/11/2010-in-retrospect.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5013766054106702515'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/5013766054106702515'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/11/2010-in-retrospect.html' title='2010 in retrospect'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' 
src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-2854663061914202386</id><published>2010-10-27T10:23:00.000-07:00</published><updated>2011-04-06T17:33:13.952-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><category scheme='http://www.blogger.com/atom/ns#' term='World Series'/><category scheme='http://www.blogger.com/atom/ns#' term='prediction'/><title type='text'>World Series predictions</title><content type='html'>The 2010 World Series starts this evening, and many pundits are making their predictions on who will win (a sample: &lt;a href="http://baseballanalysts.com/archives/2010/10/how_we_see_the.php"&gt;The Baseball Analysts&lt;/a&gt;, &lt;a href="http://bats.blogs.nytimes.com/2010/10/27/keeping-score-on-paper-this-series-is-a-toss-up/"&gt;New York Times&lt;/a&gt;, and &lt;a href="http://sportsillustrated.cnn.com/2010/writers/joe_sheehan/10/26/world.series.position.breakdown.giants.rangers/index.html"&gt;Sports Illustrated&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;But perhaps the best way to think about evaluating the pundits is contained in the article &lt;a href="http://www.vancouversun.com/business/Slick+talkers+forecasters/3732771/story.html"&gt;"Slick talkers and bad forecasters"&lt;/a&gt; by Dan Gardner. Gardner's article is about economic forecasting, but the point is relevant -- when it comes to predicting the outcome, a nuanced understanding of all of the influencing factors produces the best forecast. 
Or as Gardner puts it, "experts who gathered information from many sources, who were comfortable with complexity and uncertainty, and were more prepared to admit mistakes and adjust conclusions accordingly -- these were the experts worth listening to."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-2854663061914202386?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/2854663061914202386/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/10/world-series-predictions.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2854663061914202386'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2854663061914202386'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/10/world-series-predictions.html' title='World Series predictions'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-4319948607079173401</id><published>2010-08-12T09:46:00.000-07:00</published><updated>2010-08-12T10:10:54.168-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='skill'/><category scheme='http://www.blogger.com/atom/ns#' term='luck'/><category scheme='http://www.blogger.com/atom/ns#' term='golf'/><title type='text'>Skill and luck on the links</title><content 
type='html'>Differentiating skill and luck has been a hot topic of late -- here's an article at Slate by Michael Agger titled &lt;a href="http://www.slate.com/id/2263085/"&gt;"Dead Solid Lucky"&lt;/a&gt; looking at the topic in the context of golf.&lt;br /&gt;&lt;br /&gt;The article draws heavily on an analysis by Robert A. Connolly and Richard J. Rendleman Jr., both from the Kenan-Flagler Business School, University of North Carolina. (Their article, from the &lt;i&gt;Journal of the American Statistical Association&lt;/i&gt;, can be found in PDF format &lt;a href="http://www.dartmouth.edu/~stats/rendleman.pdf"&gt;here&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;A tidy summary, quoted directly from Agger's article: "How big a deal is luck on the golf course? On average, tournament winners are the beneficiaries of 9.6 strokes of good luck. Tiger Woods' superior putting, you'll recall, gives him a three-stroke advantage per tournament. Good luck is potentially three times more important. When Connolly and Rendleman looked at the tournament results, they found that (with extremely few exceptions) the top 20 finishers benefitted from some degree of luck. They played better than predicted. So, in order for a golfer to win, he has to both play well and get lucky."&lt;br /&gt;&lt;br /&gt;Sounds like real life.  
And baseball.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-4319948607079173401?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/4319948607079173401/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/08/skill-and-luck-on-links.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4319948607079173401'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/4319948607079173401'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/08/skill-and-luck-on-links.html' title='Skill and luck on the links'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-1459336676723713842</id><published>2010-07-27T13:11:00.000-07:00</published><updated>2011-04-06T17:34:33.922-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Dave Winfield'/><category scheme='http://www.blogger.com/atom/ns#' term='real life'/><category scheme='http://www.blogger.com/atom/ns#' term='skill'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='luck'/><category scheme='http://www.blogger.com/atom/ns#' term='sabermetrics'/><title type='text'>Skill, luck, and more than a little style</title><content 
type='html'>&lt;i&gt;Pictured: "Mr. May", Dave Winfield, comes through in the clutch in the 1992 World Series with, in his words, "One stinkin' little hit." The 11th-inning double drove in two runs and sealed the World Series win for the Blue Jays.&lt;/i&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TE8_Zve17bI/AAAAAAAAABI/yrgwn6vEjyQ/s1600/dave-winfield.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5498683381517774258" src="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TE8_Zve17bI/AAAAAAAAABI/yrgwn6vEjyQ/s320/dave-winfield.jpg" style="cursor: pointer; float: left; height: 320px; margin: 0pt 10px 10px 0pt; width: 226px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The BBC has posted an article and calculator (&lt;a href="http://www.bbc.co.uk/news/magazine-10729380"&gt;"Can chance make you a killer?"&lt;/a&gt;) that is used to demonstrate the challenges in differentiating luck from skill. In this case, a simple scenario with fixed parameters is linked to a calculator that generates the range of possibilities.&lt;br /&gt;&lt;br /&gt;While I'm not sure how this could be used in a baseball setting, it is a very good tool for demonstrating that it can be difficult -- particularly if you just look at "the numbers" in a selective way -- to make definitive statements about a player's ability. 
Such as, say, clutch hitting.&lt;br /&gt;&lt;br /&gt;(Acknowledgement: &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/chance_v_skill/"&gt;The Book&lt;/a&gt;.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-1459336676723713842?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/1459336676723713842/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/07/skill-luck-and-more-than-little-style.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1459336676723713842'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1459336676723713842'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/07/skill-luck-and-more-than-little-style.html' title='Skill, luck, and more than a little style'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_DcSPZvAgn6Y/TE8_Zve17bI/AAAAAAAAABI/yrgwn6vEjyQ/s72-c/dave-winfield.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-1154579288568362956</id><published>2010-07-26T19:21:00.000-07:00</published><updated>2011-04-06T17:35:53.581-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='real life'/><category 
scheme='http://www.blogger.com/atom/ns#' term='skill'/><category scheme='http://www.blogger.com/atom/ns#' term='luck'/><category scheme='http://www.blogger.com/atom/ns#' term='sabermetrics'/><title type='text'>Baseball imitates real life</title><content type='html'>How understanding luck in baseball can help understanding real life, or at least your investment portfolio: &lt;a href="http://contenta.mkt1710.com/lp/26966/115068/Untangling%20Skill%20and%20Luck.pdf"&gt;"Untangling skill and luck"&lt;/a&gt; by Michael J. Mauboussin.&lt;br /&gt;&lt;br /&gt;Mauboussin uses a variety of sabermetric analysis, including Jim Albert's 2004 paper &lt;a href="http://bayes.bgsu.edu/papers/paper_bavg.pdf"&gt;“A Batting Average: Does It Represent Ability or Luck?”&lt;/a&gt; and Tango's &lt;a href="http://www.insidethebook.com/ee/index.php/site/article/true_talent_levels_for_sports_leagues/"&gt;True Talent Level&lt;/a&gt; analysis.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-1154579288568362956?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/1154579288568362956/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/07/baseball-imitates-real-life.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1154579288568362956'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/1154579288568362956'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/07/baseball-imitates-real-life.html' title='Baseball imitates real life'/><author><name>Martin 
Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7603891894901867121</id><published>2010-07-18T07:01:00.000-07:00</published><updated>2011-04-06T17:35:36.645-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='wins'/><category scheme='http://www.blogger.com/atom/ns#' term='monte carlo'/><category scheme='http://www.blogger.com/atom/ns#' term='pythagorean'/><title type='text'>Probability of winning the division</title><content type='html'>The &lt;a href="http://www.coolstandings.com/baseball_standings.asp?i=1"&gt;Cool Standings&lt;/a&gt; website presents the probabilities that any Major League Baseball team will win the division or wild card.  They also have this available for the NHL, NFL, and NBA.  &lt;br /&gt;&lt;br /&gt;If I am interpreting &lt;a href="http://www.coolstandings.com/welcome.asp"&gt;their methodology&lt;/a&gt; correctly, they are using a &lt;a href="http://en.wikipedia.org/wiki/Pythagorean_expectation"&gt;Pythagorean&lt;/a&gt; basis in a &lt;a href="http://en.wikipedia.org/wiki/Monte_Carlo_simulation"&gt;Monte Carlo simulation&lt;/a&gt;.  
(To read more, here's an &lt;a href="http://sports.espn.go.com/mlb/news/story?id=3474432"&gt;ESPN article&lt;/a&gt; on the method.)&lt;br /&gt;&lt;br /&gt;Interesting...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7603891894901867121?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7603891894901867121/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/07/probability-of-winning-division.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7603891894901867121'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7603891894901867121'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/07/probability-of-winning-division.html' title='Probability of winning the division'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-476893169879957463</id><published>2010-07-14T18:26:00.000-07:00</published><updated>2011-04-06T17:35:21.210-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='poisson'/><category scheme='http://www.blogger.com/atom/ns#' term='no-hitter'/><title type='text'>Behind in the count</title><content type='html'>Time to get caught up with the goings-on elsewhere...&lt;br /&gt;&lt;br /&gt;First up, Tom at &lt;a 
href="http://tomflesher.com/2010/06/25/edwin-jackson-fourth-no-hitter-of-2010/"&gt;Heureusement, ici, c'est le Blog!&lt;/a&gt; cleverly adapted the same Poisson method I used for perfect games to examine the plethora of no-hitters this season.&lt;br /&gt;&lt;br /&gt;And no surprise, the number fits nicely with the outer range of the "expected" frequencies.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-476893169879957463?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/476893169879957463/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/07/behind-in-count.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/476893169879957463'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/476893169879957463'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/07/behind-in-count.html' title='Behind in the count'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-2714062865720586615</id><published>2010-06-24T18:01:00.000-07:00</published><updated>2010-06-24T19:50:43.991-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><category scheme='http://www.blogger.com/atom/ns#' term='tennis'/><title type='text'>Tireless 
tennis</title><content type='html'>After a bit of water-cooler chat today at work, I had planned to spend part of my evening working out the probability of the crazy Wimbledon match in which John Isner outlasted Nicolas Mahut, after a fifth set that ran 138 games before Isner prevailed 70-68.&lt;br /&gt;&lt;br /&gt;I got home to find not one but two well-presented analyses that tackle the question, so my work would be redundant.  Thus I present the following links for your reading pleasure:&lt;br /&gt;&lt;br /&gt;1. Carl Bialik, &lt;a href="http://blogs.wsj.com/dailyfix/2010/06/24/isner-fitting-winner-of-marathon-wimbledon-match"&gt;"Isner Fitting Winner of Marathon Wimbledon Match"&lt;/a&gt;, Wall Street Journal&lt;br /&gt;&lt;br /&gt;2. Phil Birnbaum, &lt;a href="http://sabermetricresearch.blogspot.com/2010/06/what-were-odds-of-70-68-score-at.html"&gt;"What were the odds of the 70-68 score at Wimbledon?"&lt;/a&gt;, Sabermetric Research.  (Birnbaum graciously acknowledges Bialik's article in an appended post-post foreword.)&lt;br /&gt;&lt;br /&gt;The final summary: this was a one-in-a-million (give or take, depending on some of the assumptions) event.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-2714062865720586615?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/2714062865720586615/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/06/tireless-tennis.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2714062865720586615'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/2714062865720586615'/><link rel='alternate' 
type='text/html' href='http://bayesball.blogspot.com/2010/06/tireless-tennis.html' title='Tireless tennis'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-6854781707308382058</id><published>2010-06-06T15:08:00.001-07:00</published><updated>2011-04-06T17:35:02.116-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='poisson'/><category scheme='http://www.blogger.com/atom/ns#' term='perfect game'/><category scheme='http://www.blogger.com/atom/ns#' term='probability'/><title type='text'>Perfectly random?</title><content type='html'>There has been much discussion about the recent run of perfect and almost-perfect games. A variety of hypotheses have been floated, including &lt;br /&gt;&lt;a href="http://econtricks.blogspot.com/2010/06/is-pitching-too-easy-now.html"&gt;pitching dominance&lt;/a&gt; (including a higher strike out ratio), &lt;a href="http://www.sabernomics.com/sabernomics/index.php/2010/06/why-more-perfect-games/"&gt;improved defense&lt;/a&gt;, and &lt;a href="http://www.associatedcontent.com/article/5437222/why_have_there_been_so_many_perfect.html?cat=14"&gt;the confluence of expansion, better player evaluation, and a drug-free world&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Perfect games are a rare event, so we run the risk of seeing a random cluster as a trend. There have now been 20 perfect games -- 18 in the "modern era" (since 1900), 14 since the expansion era began in 1961, and two so far in the 2010 season. 
How can we tell if this "streak" of two perfect games in a single season is simply a random fluctuation?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;i&gt;Calculating the probability of a perfect game: allowing runners&lt;/i&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;One approach is to calculate a theoretical probability based on on-base percentage (OBP). Tango has a blog entry &lt;a href="http://www.insidethebook.com/ee/index.php/site/article/perfect_game_calculation/"&gt;"Perfect Game calculation"&lt;/a&gt; that presents one approach. His estimate was 1 perfect game per 15,000.&lt;br /&gt;&lt;br /&gt;Another example of this appears in &lt;i&gt;Mathletics&lt;/i&gt; by &lt;a href="http://waynewinston.com/wordpress/"&gt;Wayne L. Winston&lt;/a&gt;, who calculated a probability of 0.0000489883, or 1 game in just over 20,400. Winston noted at the time the book went to press (before the 2009 season) there had been nearly 173,000 regular season games since 1900 and each game provides 2 opportunities for a perfect game (so we have 346,000 "team games"). Winston then goes on to note that we would therefore expect there to be 16.95 perfect games over that period -- almost perfectly matching the observed total of 17 to that point in time.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;A side note: after Mark Buehrle's perfect game in 2009, &lt;a href="http://baseballanalysts.com/archives/2009/07/perfect_games_a.php"&gt;Sky Andrecheck&lt;/a&gt; took a similar approach for individual players. 
He worked out the individual chances for the 16 modern-era players who had tossed a perfect game, based on the sum of the on-base percentage and reached-on-error percentage they allowed over their careers.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;i&gt;Calculating the probability of a perfect game: observed rate&lt;/i&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A second approach to calculating the probability is to compare the observed number of perfect games to the number of opportunities. I decided to use 1961 as year one. This was a natural point to begin -- this was the first year of baseball's expansion, and it falls mid-way between Don Larsen's 1956 World Series perfecto (which had been the first in 34 years) and Jim Bunning's 90-pitch masterpiece in 1964. Between 1961 and 2009 inclusive, there were 12 perfect games -- and there were 201,506 regular season "team games". This gives us a probability of 0.00005955 (12 divided by 201,506), or 1 perfect game every 16,790 team games played.&lt;br /&gt;&lt;br /&gt;This method yields a result that is roughly the mid-point between Tango's and Winston's estimates.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;i&gt;What are the odds of two perfect games in one season?&lt;/i&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;While most statistical analysis makes the assumption that the distribution of the events is "normal", when we are dealing with rare discrete events the distribution does not resemble the normal distribution. 
The most common distribution used to model counts of rare events like this is the &lt;a href="http://en.wikipedia.org/wiki/Poisson_distribution"&gt;Poisson distribution&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Using the probability of 1 in 16,790, I calculated two scenarios: a season of 4,860 "team games" (the current number per season -- based on 2,430 games and therefore 4,860 perfect game opportunities) and a season of 4,112 "team games" (the average number since 1961). The probabilities, expected frequencies, and observed frequencies are as follows:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TBEmyXrY4ZI/AAAAAAAAAA4/GYTPmrs16D0/s1600/pg1.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5481204868277920146" src="http://1.bp.blogspot.com/_DcSPZvAgn6Y/TBEmyXrY4ZI/AAAAAAAAAA4/GYTPmrs16D0/s320/pg1.jpg" style="cursor: hand; cursor: pointer; float: center; height: 105px; margin: 0 10px 10px 0; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_DcSPZvAgn6Y/TBEnBrhcVJI/AAAAAAAAABA/-QahWp_2i7M/s1600/pg2.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5481205131302950034" src="http://2.bp.blogspot.com/_DcSPZvAgn6Y/TBEnBrhcVJI/AAAAAAAAABA/-QahWp_2i7M/s320/pg2.jpg" style="cursor: hand; cursor: pointer; float: center; height: 182px; margin: 0 10px 10px 0; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So over 50 seasons, we would predict that there would be between 1 and 2 seasons with 2 perfect games, and between 9 and 11 seasons with 1 perfect game.&lt;br /&gt;&lt;br /&gt;So the answer to the question posed in the title is "Yes -- two perfect games in one season is well within the expected distribution." 
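&lt;br /&gt;&lt;br /&gt;&lt;i&gt;For readers who want to check the arithmetic, here is a minimal sketch in Python. It is not the tool used to build the tables above; it simply assumes the observed rate of 12 perfect games in 201,506 team games, a span of 50 seasons, and the two season lengths quoted in this post. The names poisson_pmf and expected_seasons are illustrative only.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
import math

# Assumptions taken from the post: 12 perfect games in 201,506 team games
# (1961-2009), evaluated over the 50 seasons from 1961 through 2010.
RATE = 12 / 201506
SEASONS = 50


def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_seasons(team_games, max_k=3):
    """Expected number of seasons (out of SEASONS) with 0, 1, 2, ... perfect games."""
    lam = RATE * team_games  # expected perfect games in one season
    return [SEASONS * poisson_pmf(k, lam) for k in range(max_k)]


average = expected_seasons(4112)  # average team games per season since 1961
current = expected_seasons(4860)  # current team games per season

for k in range(3):
    print(f"{k} perfect games: {average[k]:.1f} to {current[k]:.1f} seasons")
```
&lt;/pre&gt;&lt;br /&gt;Under these assumptions the sketch gives between 9 and 11 seasons with one perfect game and between 1 and 2 seasons with two, in line with the ranges quoted above.&lt;br /&gt;&lt;br /&gt;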
The fact that 2010 has been the first season with 2 perfect games in the 50 years since 1961 fits perfectly with the expected distribution.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In future posts I will repeat the calculation of probabilities and frequencies, with modified probabilities (once the dust settles on the "correct" way to calculate the probabilities...)&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Comments and questions are always welcome.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-6854781707308382058?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/6854781707308382058/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/06/perfectly-random.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6854781707308382058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/6854781707308382058'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/06/perfectly-random.html' title='Perfectly random?'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_DcSPZvAgn6Y/TBEmyXrY4ZI/AAAAAAAAAA4/GYTPmrs16D0/s72-c/pg1.jpg' height='72' 
width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684119683005319088.post-7500083604523285178</id><published>2010-06-04T22:19:00.001-07:00</published><updated>2011-04-06T17:34:47.988-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='wins'/><category scheme='http://www.blogger.com/atom/ns#' term='regression'/><category scheme='http://www.blogger.com/atom/ns#' term='payroll'/><title type='text'>A closer look at payroll and performance</title><content type='html'>A &lt;a href="http://hawkonomics.blogspot.com/2010/06/payroll-and-performance-in-mlb.html"&gt;recent post on Hawkonomics &lt;/a&gt;presented a regression analysis of Major League Baseball team performance as a function of payroll.  This post has generated some chatter in the sabermetric blogs (&lt;a href="http://sabermetricresearch.blogspot.com/2010/06/payroll-and-wins-and-correlation.html" target="new"&gt;Sabermetric Research&lt;/a&gt; and &lt;a href="http://www.insidethebook.com/ee/index.php/site/article/no_no_no_no_no_part_2/" target="new"&gt;The Book&lt;/a&gt;).  If I may be so bold, the original post wasn't very well articulated, which has led to some critiques.  Herein I aim to repeat the original analysis, and provide some elaboration that will aid interpretation.&lt;br /&gt;&lt;br /&gt;It is clear that the calibre of the players on the team influences the number of wins.  What is less clear is the relationship between team calibre and the total amount the team pays in salaries.  We have all heard it said that rich teams "buy a championship" by loading up on highly paid free agents, but how true is it? &lt;br /&gt;&lt;br /&gt;This relationship has been analyzed in the past. One such analysis can be found in the book &lt;i&gt;The Wages of Wins&lt;/i&gt; by Berri, Schmidt, &amp;amp; Brook, and there are plenty of other sources around the sabermetric blogs. 
(One interesting visualization tool can be found on &lt;a href="http://www.benfry.com/salaryper/" target="new"&gt;Ben Fry’s site&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;One of the most common ways to test a relationship between two variables is through a regression analysis.  This is the approach taken by Stacey Brook over at Hawkonomics. (Note: Brook is one of the co-authors of &lt;i&gt;The Wages of Wins&lt;/i&gt;.)&lt;br /&gt;&lt;br /&gt;I have re-run the regression using the data supplied on his blog. I changed two things to make the results more readily comprehensible.  First, I changed the salary figures to be represented as millions; thus the Yankees’ payroll is expressed not as $206,333,389 but as $206.3.  More dramatically, I used each team’s current winning percentage and projected it out over 162 games – essentially a forecast of where the teams will end up at the end of the 2010 season if they continue at the pace established over the first ~50 games of the season.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;NOTE: These transformations alter neither the “goodness of fit” of the model nor the statistical significance.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Let’s look in detail at the model that results.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;1.  The Correlation: The strength of the relationship between team salaries and wins&lt;/b&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The correlation coefficient (often identified as the Pearson correlation coefficient, and represented as "R") is a unitless measure that simply tells us how much the two variables vary together in a linear manner.  If they both move up in lockstep, the correlation coefficient will be 1 (think of the temperature outside and the amount of electricity used to run air conditioners); if one moves up while the other moves down in lockstep, the R value will be -1 (think of the temperature outside and the amount of natural gas burned keeping your house warm).  
If there is no relationship at all, then R will equal zero (think of the temperature outside and the amount of energy used to heat the gallons of hot water used by teenagers in the shower).&lt;br /&gt;&lt;br /&gt;For MLB salaries and wins so far this season, the R value is 0.224.  This is interpreted as being a weak linear relationship. In plainer language, the data do not really follow a linear pattern.&lt;br /&gt;&lt;br /&gt;But there is another value -- R&lt;sup&gt;2&lt;/sup&gt; or R-squared -- that gives us some language to work with.  In this case, R-squared is 0.0503.  From this, we can say that salaries account for only 5.03% of the variation in team wins -- not a very big improvement in our ability to predict at all. &lt;br /&gt;&lt;br /&gt;This is easily seen in the X-Y chart below.  Salary is plotted across the bottom, with the forecast wins up the side (I converted the team win percentages to forecast season wins -- more on this later.) Each team is represented by one of the dark blue diamonds scattered about the chart.  The predicted values derived from the regression model are shown in the form of the red dots joined by a nice straight line. &lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TAv6-Hy6JjI/AAAAAAAAAAM/AD56ovvOFVA/s1600/salary+to+wins.jpg"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5479749316777748018" src="http://4.bp.blogspot.com/_DcSPZvAgn6Y/TAv6-Hy6JjI/AAAAAAAAAAM/AD56ovvOFVA/s320/salary+to+wins.jpg" style="cursor: pointer; display: block; height: 272px; margin: 0px auto 10px; text-align: center; width: 320px;" /&gt;&lt;/a&gt;&lt;br /&gt;[click for a bigger version]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;From this, it is easy to see that the model is not a very good predictor of actual wins.  While there are some blue diamonds that fall close to the line, there are others that are well above or below the line.  If there isn’t much difference between the actual and predicted values, we have a good model.  
That clearly is not the case here.&lt;br /&gt;&lt;br /&gt;The model tells us little about what makes a winning team, because a lot of the difference in team success cannot be explained by salaries.  In short, this model has no &lt;i&gt;oomph&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;(Back to the side note from earlier: the correlation coefficient will remain the same regardless of how we express our variables.  We can convert the dollars to a percentage of the average for the season (thus the low-spending Pirates would be said to have a salary that is 39% of the average while the Yankees are spending 228% of it), and the correlation coefficient remains at 0.224.  Or we could convert the winning percentage to actual wins, with no change in the R value.)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;2a.  The regression equation -- how much does a change in salaries influence wins?&lt;/b&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The regression equation is expressed as&lt;br /&gt;Y=constant+(beta*X)&lt;br /&gt;Where Y is the predicted number of wins, and X is the salary in millions of dollars.  The constant is the value of Y where X is equal to zero, also known as the Y intercept. The beta value is the change in the predicted Y value generated by an increase of one in X. The constant and the beta value are calculated in the model.&lt;br /&gt;&lt;br /&gt;In this model, it becomes&lt;br /&gt;Y=73.0+(0.087*X)&lt;br /&gt;&lt;br /&gt;The interpretation:  starting at a base of 73 wins, each extra $1 million spent yields an increase of 0.087 wins; put another way, a single extra win costs roughly $11.5 million (1 divided by 0.087).  A team that spends an average amount on salaries ($90.6 million) will get an average number of wins (81).&lt;br /&gt;&lt;br /&gt;When we start to think about this equation, it’s easy to see why the model isn’t very robust.  There are some teams that are going to end the season with fewer than 73 wins if they keep on the way they have been.  
To end up below 73 wins, the model says the players should be paying the team!&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;2b.  This year’s Moneyball teams&lt;/b&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The model does give us a way to see which teams are getting the most production (i.e. wins) for every dollar spent – the gap between the team’s actual performance and the number of wins predicted in the model is the “residual”, and it ranges from a high of 29 wins above what the model predicts for Tampa Bay and 21 for San Diego, to a low of -34 for Baltimore and -23 for Houston.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;3.  Statistical significance&lt;/b&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;This model is NOT statistically significant. &lt;br /&gt;&lt;br /&gt;So what?  All this means is that if we were to use another group of 30 team wins-team salary pairs, we would likely get a different R value.  We could improve the significance of the model with more team salary and wins data pairs. &lt;br /&gt;&lt;br /&gt;But if the data points are still as dispersed as they are in this case, more data points might yield a “statistically significant” model that (and this is the important part…) has the same correlation coefficient – the model would still have no oomph.  All we have then is a model that we can be confident tells us that team salary has a small relationship with the number of wins earned.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Some parting thoughts&lt;/b&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;So we have arrived at the inescapable conclusion that this model does not tell us much about what influences wins, since there is little relationship between salaries and wins at this point in the 2010 season.  The model is both weak and statistically insignificant.&lt;br /&gt;&lt;br /&gt;The fact that the model is so weak runs counter to earlier research, which tended to find a stronger relationship.  
Is 2010 different, or is it just too early in the season to tell?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Comments and questions are always welcome.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684119683005319088-7500083604523285178?l=bayesball.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bayesball.blogspot.com/feeds/7500083604523285178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bayesball.blogspot.com/2010/06/cant-buy-me-wins.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7500083604523285178'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684119683005319088/posts/default/7500083604523285178'/><link rel='alternate' type='text/html' href='http://bayesball.blogspot.com/2010/06/cant-buy-me-wins.html' title='A closer look at payroll and performance'/><author><name>Martin Monkman</name><uri>http://www.blogger.com/profile/05582544453619381290</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_DcSPZvAgn6Y/TAwMW7v4JlI/AAAAAAAAAAY/JOD53e64_I8/S220/charliebrownpitching.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_DcSPZvAgn6Y/TAv6-Hy6JjI/AAAAAAAAAAM/AD56ovvOFVA/s72-c/salary+to+wins.jpg' height='72' width='72'/><thr:total>2</thr:total></entry></feed>
