Getting Started Doing Baseball Analysis without Coding

There's lot of confusion about how best to get started doing baseball analysis. It doesn't have to be difficult! You can start doing it right away, even if you don't know anything about R, Python, Ruby, SQL or machine learning (most GMs can't code). Learning these and other tools makes it easier and faster to do analysis, but they're only part of the process of constructing a well-reasoned argument. They're important, of course, because they can turn 2 months of hard work into 10 minutes of typing. Even if you don't like mathematics, statistics, coding or databases, they're mundane necessities that can make your life much easier and your analysis more powerful.

Here are two example problems. You don't have to do these specifically, but they illustrate the general idea. Write up your solutions, then publish them for other people to make some (hopefully) helpful comments and suggestions. This can be on a blog or through a versioning control platform like GitHub (which is also great for versioning any code or data your use). Try to write well! A great argument, but poorly written and poorly presented isn't going to be very convincing. Once it's finished, review and revise, review and revise, review and revise. When a team you follow makes a move, treat it as a puzzle for you to solve. Why did they do it, and was it a good idea?

Pick a recent baseball trade. For example, the Padres traded catcher Yasmani Grandal for Dodgers outfielder Matt Kemp. It's never that simple of course; the Padres aren't paying all of Matt Kemp's salary. Find out what the salary obligations were for each club in this trade. Using your favorite public projection system, where were the projected surplus values for each player at the time of the trade? Of course, there were other players involved in that trade, too. What were the expected surplus values of those players? From the perspective of surplus values, who won or lost this trade? Finally, why do you think each team made this trade, especially considering that they were division rivals? Do you think one or both teams made any mistakes in reasoning; if so, what were they, and did the other team take advantage of those mistakes?
Pick any MLB team and review the draft picks they made in the 2015 draft for the first 10 rounds. Do you notice any trends or changes from the 2014 draft? Do these picks agree or disagree with the various public pre-draft player rankings? Which picks were designed to save money to help sign other picks? Identify those tough signs. Was the team actually able to sign them, and were the picks to save money still reasonably good picks? Do you best to identify which picks you thought were good and bad, write them down in a notebook with your reasoning, then check back in 6 months and a year. Was your reasoning correct? If not, what were your mistakes and how can you avoid making them in the future?

The Angry Statistician

Search This Blog

Probability and Cumulative Dice Sums

Getting Started Doing Baseball Analysis without Coding

Comments

Post a Comment

Popular posts from this blog

Notes on Setting up a Titan V under Ubuntu 17.04

Mixed Models in R - Bigger, Faster, Stronger

A Bayes' Solution to Monty Hall