During the application/interview process with the Philadelphia 76ers front office, I was presented with the project of predicting three-point shooting percentages for all NBA players this season. From my general awareness of statistical projection systems for baseball, the basis of the model would be to use historical data to estimate a player’s true skill level. However, there are additional factors that could influence a player’s percentages. For example, a player’s true skill level can evolve over time, and while the direction and extent of that change may vary significantly by player, there could be some generic trend evident across basketball. In addition, a player’s shooting percentages can depend heavily on the intra-game context of his shots, such as the location of the shots (distance or location along the arc), how open he is, and whether the shots are off-the-dribble or catch-and-shoot. Furthermore, there may be subtle inter-game influences such as whether the games occur at home or on the road and how much travel and rest time the player has had. Of the many variables that could in theory impact three-point shooting percentages, many are either unknown themselves or may turn out to have minimal average effects. As a result, the goal of this project was to build a model foundation that can predict three-point shooting percentages on its own and that can be extended in the future to include additional variables.
The Numbers in 5p0rt5
Separating the numbers that lie from those that don't
Sunday, November 9, 2014
Thursday, June 5, 2014
2014 NBA Playoffs Finals Preview
SAS 5 times: 3-2-0
Wednesday, June 4, 2014
2014 NBA Draft Big Board 1.0
This is the first version of my attempt at creating a 2014 NBA Draft Big Board. First, here are some of its guiding principles:
Sunday, May 18, 2014
2014 NBA Playoffs Conference Finals Preview
LAC 6 times: 2-4-0
WAS 3 times: 3-1-0
Granted, this was an extremely small sample size, and adding my second-round results to my first-round results doesn't significantly increase the likelihood a coin-flip strategy would match my record. The bigger issue is the 2-4 record betting on the Clippers given the confidence I had in that bet. Specifically, a 57%-weighted coin would be just as likely to finish with a record as poor as 2-4 as a fair coin would be to finish with a record as good as 25-19.
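The comparison between the two coins can be checked with exact binomial tail sums. The sketch below is only an illustration under stated assumptions: it treats every bet as an independent trial, ignores pushes, and takes 25-19 to mean 44 decided bets. The helper names are made up for this example.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent bets, each won with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_most(k, n, p):
    """Probability of k or fewer wins (a record at least this poor)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def prob_at_least(k, n, p):
    """Probability of k or more wins (a record at least this good)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

# A 57%-weighted coin finishing 2-4 or worse over 6 bets.
p_weighted_poor = prob_at_most(2, 6, 0.57)

# A fair coin finishing 25-19 or better over 44 bets.
p_fair_good = prob_at_least(25, 44, 0.5)

print(f"57% coin, 2-4 or worse:    {p_weighted_poor:.4f}")
print(f"fair coin, 25-19 or better: {p_fair_good:.4f}")
```

Running it prints the two tail probabilities side by side, so the "just as likely" claim can be judged directly.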
Monday, May 5, 2014
2014 NBA Playoffs 2nd Round Preview
ATL 6 times: 3-2-1
MIA 3 times: 2-1-0
BKN 7 times: 4-3-0
CHI 2 times: 1-1-0
DAL 1 time: 1-0-0
SAS 1 time: 0-1-0
MEM 7 times: 4-3-0
GSW 7 times: 5-2-0
HOU 1 time: 0-1-0
Saturday, April 19, 2014
2014 NBA Playoffs 1st Round Preview
Tuesday, April 15, 2014
Tanking in the NBA
People disagree about the significance of the tanking problem in the NBA, but no one doubts that it exists. Most of the media coverage on tanking has focused only on the race for draft lottery ping-pong balls that was especially evident this year, given the expected strength of the incoming draft class and the projected gap between the top teams and the bottom teams before the season even began. This kind of tanking can manifest in many different forms and degrees, with some front offices actively trading away productive players (Boston trading Pierce, Garnett, Lee, and Crawford or Philadelphia trading Turner and Hawes), others benching players towards the end of the year citing bogus injuries (Milwaukee holding Sanders out until it was beneficial to medically clear him to start his marijuana suspension), and others simply making no effort to improve the team at any point in the season (Philadelphia not bothering to reach the salary floor or Utah trading for Jefferson and Biedrins to reach the salary floor). Still, this might not even be the most egregious way in which teams actively try to lose games, as many of these draft lottery tankers initially tried to compete and arguably only Philadelphia, Utah, and Boston stuck to season-long losing blueprints. There are two rules that even more directly incentivize teams to lose intentionally, and each of these is more easily fixable.
Wednesday, March 20, 2013
How to Mathematically Win March Madness
No, this is not a strategy for how to actually win the national championship, as if I had this strategy, I would not be at home writing about it. However, given that an estimated $2.5 billion is gambled on the tournament, the real winners of March Madness aren't necessarily the National Champions but rather the people who win their bracket pools. The most significant source of value added in filling out a bracket is actual basketball knowledge, so for example, if someone could predict every matchup with over 80% accuracy, he wouldn't need any other strategies. But for normal people, here are a few tips to keep in mind:
1) Predicting the champion correctly is paramount
The degree of importance obviously depends on the scoring system, but I have yet to see a system that doesn't increase point totals in each round. Using the default system of 1 point for each correct first-round game, 2 points for each correct second-round game,..., and 32 points for the correct champion, a correctly picked champion already nets 63 out of a total 192 points (the champion has to win each of its games); by comparison, correctly picking every first-round and second-round game (a total of 48 games) only nets 64 points.
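The point totals above follow directly from the scoring table. As a quick check, here is a small sketch that assumes a standard 64-team bracket (play-in games ignored) and the doubling scoring system described in the text; the variable names are purely illustrative.

```python
# Points per correct pick in each round, and number of games in each round,
# for a 64-team bracket under the default doubling system.
ROUND_POINTS = [1, 2, 4, 8, 16, 32]
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]

# Every round is worth 32 points in total, so six rounds give the full pot.
total_points = sum(pts * games for pts, games in zip(ROUND_POINTS, GAMES_PER_ROUND))

# A correct champion must have won one game in every round.
champion_points = sum(ROUND_POINTS)

# A perfect first and second round: 32 + 16 = 48 games.
early_games = sum(GAMES_PER_ROUND[:2])
early_points = sum(pts * games for pts, games in zip(ROUND_POINTS[:2], GAMES_PER_ROUND[:2]))

print(total_points, champion_points, early_games, early_points)  # 192 63 48 64
```

In other words, one correct champion is worth about as much as 48 correct early-round picks.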
2) Matchups are not binary
This means that if one were simply trying to maximize expected point total, the final four should almost always consist of high seeds, even if they aren't necessarily the best teams in their respective regions. This is simply because they usually have the easiest expected early round opponents. So even if we think an 8 seed would likely beat a 1 seed (either because of matchup issues or simply because the 1 seed is overseeded while the 8 seed is underseeded), choosing the 8 seed to advance for this reason alone isn't wise as it's much less likely to even advance past the first round and make the 1-8 matchup. Pulling arbitrary numbers out of my hat, that would mean the 8 seed would need to be something like 75% favorites against the 9 seed and 60% favorites against the 1 seed as the 1 seed is almost always close to 100% against the 16 seed. In terms of actually executable bracket picking strategy, given Tip #1, one should attempt to work backwards by picking the champion, then the National Championship Game, and then the Final Four, and so on. This may mean that there are matchups in one's bracket in which the winner chosen isn't necessarily the team more likely to win that specific matchup, which is something that usually befuddles network "analysts."
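To make the 1-seed versus 8-seed arithmetic concrete, the sketch below computes each team's chance of winning its first two games. It uses the 75% and 60% figures from the text and assumes the 1 seed always beats the 16 seed. The 1 seed's win probability against the 9 seed is not given in the text, so the 80% used here is a purely hypothetical value, and the function name is made up for illustration.

```python
def two_round_advance_probs(p8_over_9, p8_over_1, p1_over_9):
    """Return (P(1 seed wins two games), P(8 seed wins two games)).

    Assumes the 1 seed always beats the 16 seed in the first round.
    """
    # The 8 seed must beat the 9 seed, then beat the 1 seed.
    eight = p8_over_9 * p8_over_1
    # The 1 seed faces the 8 seed with probability p8_over_9, otherwise the 9 seed.
    one = p8_over_9 * (1 - p8_over_1) + (1 - p8_over_9) * p1_over_9
    return one, eight

# 75% and 60% come from the text; 0.80 is a hypothetical assumption.
one_seed, eight_seed = two_round_advance_probs(0.75, 0.60, 0.80)
print(f"1 seed advances: {one_seed:.2f}")    # about 0.50
print(f"8 seed advances: {eight_seed:.2f}")  # about 0.45
```

Under these assumptions the 1 seed is still the better pick to advance, even though the 8 seed is favored in the head-to-head matchup, because the 8 seed first has to survive a far tougher opening game.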
3) In any pools of significant size, choose the team that is the most undervalued by the public and not necessarily the team that is the most likely to win
In the most simplified case, consider a 2 team tournament between Team A and Team B and a pool with n brackets (where n is a large number). Team A is a 67% favorite over Team B, and 90% of people pick Team A over Team B. Assuming a sensible payout structure (i.e. the money is split between the tying brackets or one of the tying brackets is chosen at random to win the payout), one's EV for picking Team A is X*67%/(90%*n) = 0.74*X/n, where X is the monetary size of the pool, while one's EV for picking Team B is X*33%/(10%*n) = 3.3*X/n. Adding more rounds definitely complicates matters, as there is now more than one path to win a pool (there isn't one specific game that determines who wins). However, in the case of extremely large pools in which one can reasonably assume that each team in the field is picked to win by at least one bracket, the champion likely must be picked correctly for a bracket to have a chance at winning. This basically means that if one were trying to win the challenge for best bracket that both Yahoo! and ESPN offer, one should pick the team that exhibits the largest discrepancy between how often it is chosen as champion and its actual championship chances.
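The large-pool rule amounts to ranking teams by win probability divided by pick share. The sketch below applies that ratio to the two-team example from the text; the function name and the dictionary layout are assumptions made for illustration, and the result is expressed in units of X/n.

```python
def large_pool_value(win_prob, pick_share):
    """Approximate EV of a pick in units of X/n: chance of being right
    divided by the share of the pool one would split the pot with."""
    return win_prob / pick_share

# (win probability, share of brackets picking the team), taken from the text.
teams = {"Team A": (0.67, 0.90), "Team B": (0.33, 0.10)}

values = {name: large_pool_value(p, share) for name, (p, share) in teams.items()}
best_pick = max(values, key=values.get)

for name, value in values.items():
    print(f"{name}: {value:.2f} * X/n")  # Team A: 0.74, Team B: 3.30
print("Most undervalued:", best_pick)    # Team B
```

The same ratio extends to a full field of champions, provided one has estimates of both each team's title odds and how often the public picks it.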
In smaller office pools, the strategy might be significantly different because, as mentioned earlier, there can be multiple potential paths to winning the pool. As the most extreme case, consider a 2 person office pool. Using the same 2 team tournament example from before, one wins the whole pool when the opponent picks the other team and one's own team wins, and splits the pool whenever the opponent makes the same pick. One's EV for picking Team A is therefore X*(67%*10%+90%/2) = 0.517*X, while one's EV for picking Team B is X*(33%*90%+10%/2) = 0.347*X. As a result, in a 2 person pool, one should always choose the most likely outcome, while in an infinite person pool, one should always choose the outcome that is most undervalued (defined simply as % chance/% chosen), and everything in between involves some balance between the two.
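Both extremes can be reproduced with one exact calculation for a pool of any size. The sketch below is one possible formalization, not necessarily the author's: it assumes each of the other n-1 entrants independently picks a team with probability equal to its public pick share, and that ties split the pot evenly, including the case where every entrant picks the loser. The function name is hypothetical.

```python
from math import comb

def pool_ev(win_prob, pick_share, n):
    """EV, as a fraction of the pot X, of picking a team in an n-entry pool.

    win_prob:   probability the picked team wins
    pick_share: probability each other entrant picks the same team
    """
    others = n - 1
    ev = 0.0
    for k in range(others + 1):
        # Probability that exactly k of the other entrants made the same pick.
        prob_k = comb(others, k) * pick_share**k * (1 - pick_share) ** (others - k)
        # If the pick wins, the pot is split with those k entrants.
        ev += prob_k * win_prob / (k + 1)
    # If the pick loses but everyone made it, all n entrants tie.
    ev += (1 - win_prob) * pick_share**others / n
    return ev

print(f"2-person pool, Team A: {pool_ev(0.67, 0.90, 2):.3f}")  # 0.517
print(f"2-person pool, Team B: {pool_ev(0.33, 0.10, 2):.3f}")  # 0.347

# In a 100-entry pool the EV per entry approaches win_prob / pick_share.
print(f"100-person pool, Team A: {100 * pool_ev(0.67, 0.90, 100):.2f} * X/n")
print(f"100-person pool, Team B: {100 * pool_ev(0.33, 0.10, 100):.2f} * X/n")
```

Varying n between these two cases shows where the preferred pick flips from the favorite to the undervalued team.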
Will following any of these tips guarantee a win? Of course not. Similar to how end of game coaching won't matter if players simply don't hit shots, making strategic adjustments won't matter if someone simply adds no value in predicting matchups. However, there's no reason to sacrifice any EV if one doesn't need to, and each addition at the margin can add up. Good luck bracketing.