In many situations, such reinforcement learning is an essential strategy, allowing people to optimize behavior to fit a constantly changing situation. However, the Israeli scientists discovered that it was a terrible approach in basketball, as learning and performance are “anticorrelated.” In other words, players who have just made a three-point shot are much more likely to take another one, but much less likely to make it:
What is the effect of the change in behaviour on players’ performance? Intuitively, increasing the frequency of attempting a 3pt after made 3pts and decreasing it after missed 3pts makes sense if a made/missed 3pts predicted a higher/lower 3pt percentage on the next 3pt attempt. Surprizingly [sic], our data show that the opposite is true. The 3pt percentage immediately after a made 3pt was 6% lower than after a missed 3pt. Moreover, the difference between 3pt percentages following a streak of made 3pts and a streak of missed 3pts increased with the length of the streak. These results indicate that the outcomes of consecutive 3pts are anticorrelated.
This anticorrelation works in both directions, as players who missed a previous three-pointer were more likely to score on their next attempt. A brick was a blessing in disguise.
The underlying study, showing a “failure of reinforcement learning,” is here.
Suppose you just hit a 3-pointer and now you are holding the ball on the next possession. You are an experienced player (they used NBA data), so you know if you are truly on a hot streak or if that last make was just a fluke. The defense doesn’t. What the defense does know is that you just made that last 3-pointer and therefore you are more likely to be on a hot streak and hence more likely than average to make the next 3-pointer if you take it. Likewise, if you had just missed the last one, you are less likely to be on a hot streak, but again only you would know for sure. Even when you are feeling it you might still miss a few.
That means that the defense guards against the three-pointer more when you just made one than when you didn’t. Now, back to you. You are only going to shoot the three-pointer again if you are really feeling it. That’s correlated with the success of your last shot, but not perfectly. Thus, the data will show positive autocorrelation in your 3-point attempts: you shoot the three more often after a make than after a miss.
Furthermore, when the defense is defending the three-pointer you are less likely to make it, other things equal. Since the defense is correlated with your last shot, your likelihood of making the 3-pointer is also correlated with your last shot. But inversely this time: if you made the last shot the defense is more aggressive so conditional on truly being on a hot streak and therefore taking the next shot, you are less likely to make it.
(Let me make the comparison perfectly clear: you take the next shot if you know you are hot, but the defense defends it only if you made the last shot. So conditional on taking the next shot you are more likely to make it when the defense is not guarding against it, i.e. when you missed the last one.)
You shoot more often and miss more often conditional on a previous make. Your private information about your make probability coupled with the strategic behavior of the defense removes the paradox. It’s not possible to “arbitrage” away this wedge because whether or not you are “feeling it” is exogenous.
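A minimal sketch of the argument above, worked out exactly with Bayes' rule in a toy two-possession model. Everything here is an assumption made for illustration, not something taken from the study: the function name `two_possession_model`, the parameter values, the symmetric persistence of the hot/cold state, and the rule that the first three is taken regardless of state.

```python
from fractions import Fraction


def two_possession_model(p_hot=Fraction(1, 2), stay=Fraction(4, 5),
                         make_hot_loose=Fraction(1, 2),
                         make_hot_tight=Fraction(2, 5),
                         make_cold=Fraction(1, 5)):
    """Exact attempt and make rates on possession 2, by outcome of shot 1.

    Hypothetical assumptions:
      * Possession 1: the player takes a three against normal defense.
      * The hot/cold state carries over with probability `stay`.
      * Possession 2: the player shoots a three only if hot, and the
        defense plays tight only if the first three went in.
    """
    results = {}
    for label, made_first in (("after_make", True), ("after_miss", False)):
        # Likelihood of the first outcome in each state.
        like_hot = make_hot_loose if made_first else 1 - make_hot_loose
        like_cold = make_cold if made_first else 1 - make_cold
        # Bayes' rule: probability the player was hot on possession 1.
        post_hot = p_hot * like_hot / (p_hot * like_hot + (1 - p_hot) * like_cold)
        # Probability the player is hot on possession 2, hence shoots again.
        attempt_rate = post_hot * stay + (1 - post_hot) * (1 - stay)
        # Only hot players shoot, so the make rate depends on the defense alone.
        make_rate = make_hot_tight if made_first else make_hot_loose
        results[label] = (attempt_rate, make_rate)
    return results


for label, (attempt_rate, make_rate) in two_possession_model().items():
    print(label, "attempt rate:", attempt_rate, "make rate:", make_rate)
```

With these made-up numbers the player shoots again 22/35 of the time after a make but only 28/65 of the time after a miss, while the make rate conditional on shooting is 2/5 after a make and 1/2 after a miss. That is the same pattern as in the data, produced without any mistake by the shooter.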
15 comments
January 29, 2012 at 11:24 pm
Anonymous
Why isn’t this just regression to the mean?
January 30, 2012 at 10:04 am
jeff
As I understand it, regression to the mean would be based on a comparison of the early and late phases of a given streak. (if you got lucky and shot well early, then you are likely to do worse than that later on, whereas if you got unlucky and shot poorly, you are likely to do better later on.)
The authors are doing a different comparison. They are comparing two different streaks. In both streaks you started by shooting a 3 pointer, and in the first streak your three-pointer went in but in the second it missed. The comparison is the likelihood of making the next shot across these two scenarios.
One way to see the difference is to consider the case where shot success is iid over time. Then you would get regression to the mean but you would see no difference in the authors’ comparison.
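The iid case can be checked with a quick simulation. This is only a sketch: the make probability `P_MAKE`, the sample size, and the helper `next_shot_percentages` are made up for illustration and are not from the paper.

```python
import random


def next_shot_percentages(shots):
    """Return (pct after a make, pct after a miss) for a list of booleans."""
    pairs = list(zip(shots, shots[1:]))  # (previous shot, next shot)
    after_make = [nxt for prev, nxt in pairs if prev]
    after_miss = [nxt for prev, nxt in pairs if not prev]
    return sum(after_make) / len(after_make), sum(after_miss) / len(after_miss)


rng = random.Random(0)  # fixed seed so the run is repeatable
P_MAKE = 0.35           # hypothetical constant make probability
shots = [rng.random() < P_MAKE for _ in range(200_000)]

pct_after_make, pct_after_miss = next_shot_percentages(shots)
print(f"after make: {pct_after_make:.3f}, after miss: {pct_after_miss:.3f}")
```

Because every shot is an independent draw, both percentages should land close to `P_MAKE`, so the authors' comparison shows essentially no gap even though any lucky stretch is still followed by more ordinary shooting.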
January 30, 2012 at 11:55 am
twicker
My understanding is that regression to the mean assumes, first, that there’s a mean to regress to – in other words, a made-shot percentage that is independent of what came before, not dependent on what came before.
Thus, if someone does abnormally well (or poorly), then, unless their mean has moved to the new, unusual level (e.g., all previous measurements were made when they were drunk, and now they’re sober, or vice versa), the next measurement (e.g., shot) will be more likely to be at the mean level instead of at the abnormal level – not because of any correction of a level, but because abnormal things are abnormal, and normal performance is normal. By the theory of regression to the mean, the made-shot percentage of Shot B (the second shot) should be the same regardless of whether the previous shot (Shot A) was good or not – one should not depend on, or predict, the other.
What happens here is that the made-shot percentage of B specifically depends on what happened in Shot A (the previous shot). Thus, because a dependency exists, this would not be regression to the mean.
January 30, 2012 at 12:46 am
twicker
How does one know that one is truly hot (esp. given the abundant evidence we have for overconfidence)? And, if one is an experienced NBA player, wouldn’t one immediately discount any feeling of “hotness” that was predicated on a previous made shot, knowing, from experience, that the defense will be stronger and that you’ll be less likely to make it?
Again: because, as you mention, these are experienced players (i.e., those who are more likely to somehow accurately know when they are “hot” and somehow avoid the overconfidence effect), they should automatically assume closer defense when they make a shot, especially when they are demonstrably “hot,” and know that they will, therefore, be less likely to make it because of the defense, hot or not.
In other words, I don’t see that it’s achieved the equilibrium you describe, since the shooter knows that the defender knows that the shooter made the last shot. I do, however, see evidence that people who have made a shot will be overconfident in their ability for the next shot, and that people who missed the previous shot will be underconfident – i.e., both sets of players will, as the article states, tend to overgeneralize.
I’ll note that the authors briefly discuss the defensive actions of the other team – but the authors do so to note that, after a made three-pointer, the shooter making the original three-pointer should have less of a chance to shoot another one since the defense will do more to try to prevent that person from subsequently getting the ball or having a clear shot – yet those shooters still shoot more. Further, shooters who miss should have looser defense on them (as you note), meaning they should have more opportunities to get the ball and to have a clean shot – yet they shoot less. This, even though these are experienced players who should recognize, and plan for, and be able to take advantage of, tighter and looser defense, respectively.
January 30, 2012 at 10:11 am
jeff
The shooter knows the defense will be tougher. But because he is hot, it is still optimal for him to shoot despite the defense. The fact that the probability of making it is lower than it would have been had he missed the first shot is not directly relevant for deciding whether to shoot another three. What matters is the expected value of shooting the next three versus doing something else (like shooting a two.)
January 30, 2012 at 11:58 am
twicker
So, I’m still left with two questions:
1) Again – how does a shooter know that s/he is hot (note that the effect exists for both NBA and WNBA players)?
2) If the expert is expert enough to know when s/he is “hot,” how does s/he not know that s/he will be guarded more closely after a made shot and that, thus, s/he has a lower likelihood of making the next shot? In other words, what would be the process that leads to an accurate assessment of Part A but somehow leads to a completely wack assessment of Part B – especially when it’s a heck of a lot easier to predict B than A?
January 30, 2012 at 9:32 pm
jeff
The thing that’s confusing here is that when I write (and when the original authors wrote) “you are less likely to make the shot…” the mind jumps to a conclusion and pays less attention to what comes next. Because “less likely to make the shot” sounds like it implies that you shouldn’t be taking the shot.
But what comes after the “than” in “you are less likely to make the shot than…” is important for the meaning. Consider the following possible completions of the sentence:
You are less likely to make the shot than
1. If you were Wilt Chamberlain
2. If you had just missed your last shot
These two completions have this in common: they refer to contingencies which are out of your control and are false whether you take a 3 point shot or a 2 point shot and are therefore irrelevant to your decision about which of those two shots to take.
So when you ask “how does he not know that he is less likely to make it?” I say “he *does* know.” But that doesn’t stop him from shooting. Because what affects whether or not he will shoot is not how much more or less likely it is he will make it compared to a contingency that hasn’t happened and can’t happen, but how many points his team is going to make if he shoots a three-pointer compared to doing something else.
Now things are a little more interesting when you notice, as Alex F did, that more aggressive defense against the three pointer must come at the expense of less aggressive defense on some margin. That substitution could be across players (they defend another of your teammates less), across time (they spend more energy defending you now and less later), or across types of shots you might take (guard against the three, giving you more room to make a two.)
Either of the first two leaves the analysis unchanged. But if guarding against the three increases your relative odds of making a two versus the three, then there is an additional calculation that has to work out. In order that you continue to shoot the three in the face of the tougher defense, it must be that the expected value of the three-pointer (make probability times 3 points) exceeds that of the two (higher make probability but times only 2 points).
As long as this is true, i.e. as long as being hot means that shooting a 3 is more attractive than shooting a 2 even when the defense is looking for the 3, the data can be rationalized. You will find autocorrelation in 3 point shooting and the probability of making a 3 will be highest when you missed the last shot, and lowest when you made the last shot.
(Indeed as the streak continues, the defense gets more and more convinced you are hot, progressively tightens the defense and the effect gets even stronger, exactly as the paper shows.)
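The expected-value condition in this comment is simple enough to write down directly. A small sketch, with hypothetical function names and made-up probabilities:

```python
def prefers_three(p_three, p_two):
    """True if a three has higher expected points than a two."""
    return 3 * p_three > 2 * p_two


def break_even_three_pct(p_two):
    """Three-point make probability at which both shots are worth the same."""
    return 2 * p_two / 3


# Hypothetical hot shooter facing tight three-point defense:
# 40% on the three versus 55% on the two.
print(prefers_three(0.40, 0.55))   # 3 * 0.40 = 1.2 points vs 2 * 0.55 = 1.1

# Hypothetical cold shooter: 30% on the three versus 50% on the two.
print(prefers_three(0.30, 0.50))   # 3 * 0.30 = 0.9 points vs 2 * 0.50 = 1.0
```

So a tighter defense can lower the three-point percentage and the three can still be the right shot, as long as that percentage stays above two thirds of the two-point percentage.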
January 30, 2012 at 1:27 am
Alex F
Not sure I follow your logic, Jeff. Defending three-pointers is presumably a substitute for defending two-pointers. So in my not-hot state, against a normal defense, I make two-pointers with probability X and three-pointers with probability Y. In my not-hot state, against a 3-pt defense, I make two-pointers with probability Z>X and three-pointers with probability <Y. In my hot-state, against a 3-pt defense, I make two pointers with probability ≥Z and three-pointers with probability maybe above Y, maybe below.
So if I'm not hot but you think I might be hot, I am less likely to shoot 3-pointers. I am unambiguously better off than if I'm not hot and you think I'm not hot.
If I'm hot and you think I might be hot, I might be more or less likely to shoot 2-pointers. However, I'm unambiguously better off than if I'm either not hot and you think I'm not hot, or not hot and you think I'm hot. So if it is the case that I'm shooting more 3-pointers than before, it must be that I am more likely to make them.
In other words, the prediction is that if I shoot more 3's after I just made a 3, I should probably be making them more often.
The logic falls apart if it's not a single defender changing his strategy, but rather the defense starting to double-team me or whatever and take defenders off my teammates. Then Z can be less than X and the whole argument falls apart.
January 30, 2012 at 9:15 am
jeff
Hi Alex F. Two things are enough:
1. You shoot the three if and only if you are hot (conditional on shooting at all)
2. Defending the three lowers your chance of making the three regardless of your state.
It’s easy to find parameters for which this is true and consistent with optimality. Here’s the simplest.
1. You miss the three for sure when you’re not hot.
2. The expected value of shooting a three is higher than of shooting a two when you are hot regardless of the defense. (the defense lowers the probability of making the three but the expected value is still higher than a two.)
These imply that you shoot the three when you are hot, but not when you’re cold. If you made the last shot you are more likely hot and so more likely to shoot the three again. But you face tougher defense than if you were hot and missed the last shot. So compared to that situation (the only situation you would shoot the three again if you missed the last one) you make the three less often.
January 30, 2012 at 3:56 pm
Jason Collins
An interesting aside to this problem is the lack of evidence for hot streaks, even when you take the defence out of the equation (i.e., free throws). If the player feels hot, it is likely an illusion (http://www.psych.cornell.edu/sec/pubPeople/tdg1/Gilo.Vallone.Tversky.pdf)
January 30, 2012 at 4:23 pm
twicker
@Jason: Indeed.
January 30, 2012 at 7:39 pm
dan s
that’s pretty subtle (do you think it’s true? i.e. even if 1 can construct rational equilibrium consistent with data, still more likely players too quick to think they’re hot from 1 made shot, right?)
also, re last 2 comments, have to plug my own work here, sorry
(link to paper: hothand.pdf)
January 30, 2012 at 9:44 pm
jeff
Hey Dan:
I don’t know what to think. I guess the point I am trying to make is that there is an indispensable role for theory in empirical studies that are purported to show behavioral phenomena. We want to know definitively when a behavioral model is required and to be able to draw such a conclusion we need to do our best to understand the different interpretations of the data. And when we have an ambiguity like this one we need to work harder to come up with a better test.
January 30, 2012 at 10:09 pm
jeff
Also I should add that what’s subtle is the interpretation of the data by the statistician. The prescribed behavior for the players is as simple as can be: shoot the three when you are feeling it, shoot the two (or pass) when you are not. Ratchet up the defense when the guy made his most recent three.
January 30, 2012 at 11:00 pm
dan s
hmm, i was thinking parameters for your argument to work are likely implausible, but maybe not. the post is certainly a nice illustration of broader point anyway (the value of theory)