[Last Updated: December 31st, 2017]
Here are some notes on our fantasy golf product on DailyRoto. Any specific questions you have can be sent to firstname.lastname@example.org or email@example.com.
This note will be continually updated. The goal of the note is to eliminate any confusion users may have with respect to what is going on in the background of the product. The better you understand the shortcomings and assumptions behind the model, the more effective you will be in using it.
The product includes our usual finish probability model (Win, Top3, Top5, Top10, Top20, Top30), a new *interactive* fantasy model, and a head-to-head player comparison tool.
FINISH PROBABILITY MODEL
You can read details about this model here.
Differences between fantasy “finish points” projections and finish probability model: At times, you will likely notice that the finish points projections from the fantasy model (described below) do not line up perfectly with the finish probability model. It is hard to estimate expected finish points just by looking at the finish probabilities, but in some cases a discrepancy will be obvious (e.g. one player has higher probabilities for all finish types than another player, yet is projected to have fewer finish points). These discrepancies will happen from time to time simply because the models are different, and both have their strengths and weaknesses. The finish probability model predicts each player’s round-by-round scores (and then simulates finishes), while the fantasy model simply predicts finish points based on each player’s historical finish points. Because finish points are allocated in DraftKings and FanDuel contests in a highly non-linear fashion (i.e. the bulk of the points go to the top finishers), this creates differences in the models’ predictions. For example, Rickie Fowler will do better relative to Justin Thomas in the finish probability model (which is built on scoring) than in the fantasy model, because Fowler has had a great scoring average over the past two seasons but not that many wins, while for Thomas the opposite is true: he has a lot of wins considering what his scoring average has been. If you understand the differences between the models, you can use this to your advantage. We are essentially predicting the same thing in two different ways; so, if a player is liked by both models (relative to the general markets, say), this is a stronger signal than either model gives on its own.
The fantasy model has two components: the projections and the optimizer.
The “projections” are the expected number of points for each player in this week’s field. Projections for both scoring points and finish points are provided. Scoring points are defined as all points apart from those allocated for finish position.
These projections are basically a function of two things:
1) a weighted average of long-term form (previous 2 years), short-term form (previous 3 months), and course history (performance at this week’s course(s) since 2010).
2) how the course(s) for that week are expected to play relative-to-par.
The projection page defaults to parameter settings that we have estimated from our model. However, users can adjust these to their liking (more on this below).
Overview of our model: We first collect and adjust the “scoring” points and “finish” points in all PGA Tour and European Tour events from 2013 onwards. This adjustment accounts for course difficulty and field strength (i.e. scoring 10 points more than average at the U.S. Open is better than scoring 10 points more than average at the Sanderson Farms). Adjusted points are a relative measure; they indicate the number of points a player earned above or below the average player in our data (we set the average to zero, for ease of interpretation). So, for example, if Jordan Spieth has long-term form for scoring points of +30 points, that means he is on average 30 scoring points better than the average player in our data. Because we’ve taken out any differences in points due to course difficulty and field strength, adjusted points from different events can be directly compared.
To get our projections, we estimate two simple models, one for predicting adjusted scoring points and one for adjusted finish points, that take as inputs each player’s long-term average, short-term average, and course history average. We find the optimal weighting to be about 75%, 20%, 5% for long-term, short-term, and course history, respectively. For simplicity, we use a single set of weights for projecting scoring points and finish points (in practice, the weights are very similar anyway).
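The weighting described above can be sketched in a few lines. This is only an illustration of a 75 / 20 / 5 weighted average; the names PlayerForm and blend, and the example numbers, are hypothetical and not taken from our actual code.

```python
from dataclasses import dataclass

# Default weights from the text: long-term, short-term, course history.
DEFAULT_WEIGHTS = (0.75, 0.20, 0.05)


@dataclass
class PlayerForm:
    """Adjusted points averages for one player (hypothetical container)."""
    long_term: float       # previous 2 years
    short_term: float      # previous 3 months
    course_history: float  # this week's course(s)


def blend(form, weights=DEFAULT_WEIGHTS):
    """Weighted average of the three adjusted-points averages."""
    w_long, w_short, w_course = weights
    return (w_long * form.long_term
            + w_short * form.short_term
            + w_course * form.course_history)


# Made-up player: +30 long-term, +20 short-term, +10 course history.
example_projection = blend(PlayerForm(30.0, 20.0, 10.0))
```

With these made-up inputs the long-term average dominates, which is the intended effect of the default weights.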
For finish points projections, once we have the weighted averages for all players in the field, we are done. We simply normalize the projected finish points to make sure the total number of finish points adds up to the total available that week (which is typically the same in most events, unless it’s a small field).
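One simple way to do this normalization is a proportional rescale, sketched below. This assumes the raw projections are already on a non-negative points scale; the function name and the example values are hypothetical, and the exact normalization we use may differ.

```python
def normalize_finish_points(raw_points, total_available):
    """Rescale projections proportionally so they sum to total_available.

    raw_points: dict mapping player name -> raw projected finish points
                (assumed non-negative for this sketch).
    """
    raw_total = sum(raw_points.values())
    if raw_total <= 0:
        raise ValueError("raw projections must sum to a positive number")
    scale = total_available / raw_total
    return {name: pts * scale for name, pts in raw_points.items()}


# Made-up three-player field with 100 finish points available.
normalized = normalize_finish_points({"A": 30.0, "B": 15.0, "C": 5.0}, 100.0)
```

A proportional rescale keeps the ratios between players unchanged; only the overall level moves.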
For scoring points projections, there is an additional step: we take into account the expected course difficulty (i.e. at easier courses scoring points will tend to be higher and more spread out than at harder courses). This stands in contrast to finish points, where we do not allow course difficulty to impact projections (other than through the course history weight). The parameter we must specify is the expected scoring average relative-to-par at the current week’s course. This scoring average adjustment affects players differently: while everyone’s scoring points projection will go up when the course is made easier, some players’ projections will go up more than others. The differences between players are not huge: 0.5-1 point differences in players’ responsiveness to a 1-stroke change in course difficulty are typical. On average, the better players’ point projections tend to go up more when the course is playing easier (and, consequently, go down more when the course is playing harder). This is only a rough correlation and does not hold universally (e.g. there are many good players who do better, in relative terms, at tougher courses).
Adjusting the Weights: The weights range from 0 to 10. When you adjust the weights, we re-scale them to sum to 1. Therefore, a weighting of 10, 10, 10 is equivalent to a weighting of 5, 5, 5. If you move everything to 0, you will get errors.
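The re-scaling can be illustrated as follows. The function name is hypothetical; the sketch only shows why 10, 10, 10 and 5, 5, 5 are equivalent, and why setting everything to 0 cannot work (there is nothing to divide by).

```python
def rescale_weights(long_term, short_term, course_history):
    """Re-scale slider values (each between 0 and 10) so they sum to 1."""
    sliders = (long_term, short_term, course_history)
    if any(w < 0 or w > 10 for w in sliders):
        raise ValueError("each weight must be between 0 and 10")
    total = sum(sliders)
    if total == 0:
        # All sliders at zero: the weights are undefined.
        raise ValueError("at least one weight must be greater than 0")
    return tuple(w / total for w in sliders)
```

For example, rescale_weights(10, 10, 10) and rescale_weights(5, 5, 5) both give one third to each component.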
If a player has no rounds available to construct his short-term or course history average, then we replace it with the projected points of the average player in our entire data set (i.e. PGA Tour and European Tour from 2013-present). Therefore, it’s important to be careful when you put a high weight on short-term form: if a player hasn’t played in the previous 3 months, he will be given the form of an average player. Similarly, if the event is being played at a new course, and you weight course history 100%, all players will have (roughly) the same projected points (continue reading to see why it won’t be identical for all players).
Rookies’ projections are unaffected by the weights: their projections are equal to the average points earned by all previous rookies in our data.
Another important point: we make adjustments to these averages (long-term / short-term / course history) based on the number of events they are calculated from. The fewer events a player has played, the more his averages get discounted towards zero (remember that zero is the mean of all our adjusted points measures). For example, if a long-term average of adjusted points is calculated from just 3 events, it will be discounted quite dramatically towards zero. Unfortunately, this is a bit of a black box from a user’s standpoint. Most weeks, it won’t affect too many players: if a player has played more than 15 events in the last two years, no discounting is done. For the short-term form and course history, very little discounting is done. Also, if a player has played only a few events, he gets a general discounting (i.e. a downward shift in his projected points), as players with only a few events tend to do worse. (This is why weighting course history 100% at a new course won’t give perfectly uniform projections.)
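To give a feel for what discounting towards zero looks like, here is a minimal sketch. The linear rule below is purely an assumption for illustration and is not the actual formula we use; the only detail taken from the text is that long-term averages built on more than 15 events are left alone. The function name is hypothetical.

```python
FULL_CREDIT_EVENTS = 15  # from the text: beyond this, no discounting


def discount_toward_zero(average, n_events,
                         full_credit_events=FULL_CREDIT_EVENTS):
    """Shrink an adjusted-points average toward zero when n_events is small.

    ASSUMPTION: a simple linear reliability factor n_events / 15, capped
    at 1. The real discounting rule is not published.
    """
    if n_events < 0:
        raise ValueError("n_events cannot be negative")
    reliability = min(1.0, n_events / full_credit_events)
    return average * reliability
```

Under this assumed rule, a +30 average built on 3 events would be cut to about +6, while the same average built on 20 events would stay at +30.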
Adjusting Expected Course Scoring Average: As described above, this parameter is important for pinning down the average level of scoring points in your projections. The easier the course is playing relative-to-par, the higher the average number of scoring points will be that week. We have estimated the relationship between an event’s scoring average and the average scoring points at that event. When you alter this parameter, we plug it into this estimated equation and it shifts all players’ projections up (if the course is playing easier) or down (if the course is playing harder).
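The direction of this adjustment can be sketched as below. Everything numeric here is hypothetical: the linear form, the baseline of even par, and the points-per-stroke values are assumptions chosen for illustration, not our estimated equation.

```python
def adjust_scoring_points(base_points, expected_to_par,
                          baseline_to_par=0.0, points_per_stroke=3.0):
    """Shift a scoring-points projection for expected course difficulty.

    expected_to_par: expected scoring average relative to par this week
                     (negative means the course plays under par, i.e. easier).
    points_per_stroke: ASSUMED responsiveness of this player's points to a
                       1-stroke change in course difficulty.
    """
    strokes_easier = baseline_to_par - expected_to_par
    return base_points + points_per_stroke * strokes_easier


# Two made-up players whose responsiveness differs by 0.5 points per stroke.
easy_week_a = adjust_scoring_points(60.0, -1.0, points_per_stroke=3.0)
easy_week_b = adjust_scoring_points(60.0, -1.0, points_per_stroke=3.5)
hard_week_a = adjust_scoring_points(60.0, +1.0, points_per_stroke=3.0)
```

Both players gain points when the course plays a stroke easier, but the more responsive player gains slightly more, which mirrors the player-specific differences described earlier.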
Manual Bump: If you can’t adjust the above parameters to get exactly what you want, you can manually boost or downgrade a player’s point projection.
The optimizer takes your projections from the main table as inputs and provides the lineups that will maximize points, subject to some user-defined parameters. The baseline optimizer will maximize each lineup’s expected points (that is, the sum of the values in the projection table). To do this, it simply checks all possible teams and adds up their projected points. You can choose how many lineups you would like the optimizer to return (i.e. top lineup, top 10 lineups, top 1000 lineups, etc.).
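A brute-force search of this kind can be sketched as follows. The salary cap constraint, the function name, and the tiny four-player field are assumptions added for illustration (the text above does not spell out the contest rules); a real field and lineup size would be much larger.

```python
import heapq
from itertools import combinations


def top_lineups(players, lineup_size, salary_cap, n_lineups=1):
    """Return the n_lineups highest-projected lineups under the salary cap.

    players: dict mapping name -> (salary, projected_points).
    Checks every possible combination, so it is only practical for
    modest field sizes.
    """
    feasible = []
    for combo in combinations(sorted(players), lineup_size):
        salary = sum(players[name][0] for name in combo)
        if salary <= salary_cap:
            points = sum(players[name][1] for name in combo)
            feasible.append((points, combo))
    # Largest projected totals first.
    return heapq.nlargest(n_lineups, feasible)


# Made-up field: name -> (salary, projected points).
field = {
    "A": (10000, 80.0),
    "B": (9000, 75.0),
    "C": (7000, 60.0),
    "D": (6000, 50.0),
}
best_two = top_lineups(field, lineup_size=2, salary_cap=16000, n_lineups=2)
```

In this toy field the pair A and B has the most points but breaks the cap, so the search returns the best pairs that fit.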
Lock-in / Eliminate: Use this if there is a certain player you want in all your lineups, or a certain player you want excluded from all your lineups. The optimizer will search through all possible lineups that meet your restrictions.
Adjusting Projected Ownership: You can maximize points conditional on your lineup having less than some aggregate projected ownership %. So this will return the “x” best lineups that have a lower aggregate projected ownership than the value you’ve inputted. At the moment, this is only available for DraftKings.
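The lock-in, eliminate, and ownership settings can all be thought of as filters on which lineups the optimizer is allowed to consider. The sketch below assumes that aggregate ownership means the sum of the players’ projected ownership percentages; the function and parameter names are hypothetical.

```python
def passes_filters(lineup, locked=(), excluded=(),
                   ownership=None, max_total_ownership=None):
    """Check one candidate lineup against the user's restrictions.

    lineup: iterable of player names.
    ownership: dict mapping name -> projected ownership % (optional).
    max_total_ownership: lineup must be strictly below this total (optional).
    """
    members = set(lineup)
    if not set(locked) <= members:   # every locked player must be present
        return False
    if members & set(excluded):      # no eliminated player may be present
        return False
    if ownership is not None and max_total_ownership is not None:
        total = sum(ownership[name] for name in members)
        if total >= max_total_ownership:
            return False
    return True


# Made-up projected ownership percentages.
own = {"A": 25.0, "B": 20.0, "C": 10.0, "D": 5.0}
```

For example, with a limit of 40, the lineup A and B (45 in total) is rejected while B and C (30 in total) is allowed.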
Adjusting Risk Preference: Instead of maximizing expected points, you can maximize points at any percentile (percentiles range from 0 to 100). Remember, our forecasts are probabilistic: the default projections are the expected, or average, points of each player / lineup, but it is the case that for any range of points (e.g. between 400 and 500) there is a corresponding probability of each lineup attaining that range of points. The “Xth” percentile for a lineup is the number of points the lineup will attain or be lower than “X” percent of the time. So, if the 90th percentile for a lineup is 500 points, this means that lineup will get 500 points or less 90% of the time. Equivalently, it means the lineup will get 500 points or more 10% of the time. So, if you maximize at the 90th percentile, this is a “riskier” strategy in the sense that you are choosing lineups with the greatest upside (i.e. the highest point totals on their “good” days) and not necessarily the highest expected points.
Conversely, maximizing lineups at the 10th percentile is a “safer” strategy. The 10th percentile is the number of points a team will be below 10% of the time (or, again more intuitively, above 90% of the time). You are choosing teams that have the least downside (i.e. they have the highest point totals on their “bad” days) and not necessarily the highest expected points.
It is often the case that the best “safe” teams and the best “risky” teams are one and the same; they are just good across the board.
A few more details: lineups that are composed of players with higher standard deviations (i.e. players who are more inconsistent) are going to have higher “top” percentiles (e.g. 80th, 90th) but lower “bottom” percentiles (e.g. 10th, 20th). The opposite is true for lineups composed of players who are more consistent (i.e. lower standard deviations). So this is what will drive any differences between the top lineups at each percentile; lineups with consistent players will tend to, all else equal, do better when maximizing at lower percentiles, and lineups with inconsistent players will do better at the higher percentiles.
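The effect of consistency on percentiles can be illustrated with a small sketch. It assumes that each player’s points are normally distributed and independent of the other players’, which is a simplification made here for illustration and not a description of how our forecasts are built. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def lineup_percentile(means, sds, percentile):
    """Points total a lineup stays at or below `percentile`% of the time.

    ASSUMES independent, normally distributed player points, so the
    lineup mean is the sum of means and the variance is the sum of
    variances. percentile must be strictly between 0 and 100.
    """
    lineup_mean = sum(means)
    lineup_sd = sqrt(sum(sd * sd for sd in sds))
    return NormalDist(lineup_mean, lineup_sd).inv_cdf(percentile / 100)


# Two made-up lineups with the same expected points (400 in total).
means = [70.0, 70.0, 65.0, 65.0, 65.0, 65.0]
steady_sds = [15.0] * 6    # consistent players
volatile_sds = [30.0] * 6  # inconsistent players

steady_90 = lineup_percentile(means, steady_sds, 90)
volatile_90 = lineup_percentile(means, volatile_sds, 90)
steady_10 = lineup_percentile(means, steady_sds, 10)
volatile_10 = lineup_percentile(means, volatile_sds, 10)
```

Under these assumptions the volatile lineup has the higher 90th percentile and the steady lineup has the higher 10th percentile, even though both have the same expected points.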