One of the core assumptions when playing and researching Pokémon GO is that different events in the game are independent from one another. You’re no more likely to hatch a Deino from your next 10km egg just because you’ve been unlucky so far. Finding a shiny Pokémon doesn’t become any more or less probable when you’re on a dry spell. Pokémon with perfect IVs don’t come in clusters. Our brains are “wired” to help us find order in a disordered world, and most claims of patterns can — after investigation — be ascribed to the human tendency to find patterns where none exist. However, the Silph Research Group has uncovered an in-game mechanic where a clear dependence between successive events does exist: Special Lure Modules.
The findings in this article, comprising data from more than 13,000 spawns from Special Lure Modules recorded over the course of many months, provide an example that breaks this long-held assumption of independence. To our knowledge, this is the first time that such a mechanic has been documented in Pokémon GO. In today’s article, we’re going to take a closer look at several surprising and puzzling observations from Special Lures, and speculate a bit about what’s going on underneath the surface. Let’s dive in!
Review of Special Lure Modules
The Silph Research Group previously published our findings on the species available from Special Lure Modules. We found that each Special Lure type attracts 13 specific species. These species make up about 50% of the Pokémon generated by the Special Lure, while the other 50% are a mixture of Pokémon from the surrounding biome. Throughout this article, we’ll use the terms “Lure spawns” and “Biome spawns” to describe these two different groups.
It was later uncovered that the Biome spawns from Special Lures are biased towards Pokémon in the second half of the Pokédex, so that in effect, the Lure spawns tend to replace spawns from Pokémon found in the first half of the Pokédex.
Since then, Researchers have discovered several additional oddities in the behavior of Special Lure Module spawns, listed below. We suspect that all of these observations, new and old, share the same underlying cause, though we have not been able to pinpoint exactly what that cause is; we speculate about this further later in the article.
- Lure spawns tend to follow Lure spawns, and Biome spawns tend to follow Biome spawns. This means that we tend to see longer-than-expected streaks of each kind.
- There is a periodic autocorrelation between Lure and Biome spawns. The period of this autocorrelation is about 34 spawns.
- Specific species tend to be grouped within streaks of Lure spawns. A heat map of which species tend to occur subsequent to one another suggests that the Lure species may have a specific and possibly circular ordering.
A Series of Coin Flips?
Whether a Special Lure Module generates a Lure spawn or a Biome spawn is a random process. On the whole, roughly half the spawns from Special Lure Modules are Lure spawns, and half are Biome spawns. It would be natural to guess that this random process is like the flip of a fair coin. However, this guess does not hold up to the observed data.
Successive flips of a fair coin are independent of one another: whether the last flip was heads or tails, the next flip will be heads half the time and tails the other half. However, what we observe from Special Lure Modules is very different from the flip of a fair coin: each spawn is followed by the same type of spawn (Lure or Biome) nearly 70% of the time, and by the other type approximately 30% of the time.¹ That is, Lure spawns are more likely to be followed by Lure spawns, and Biome spawns are more likely to be followed by Biome spawns, and this is statistically significant (with a p-value below 10⁻¹⁰). The tendency for each spawn to be followed by one of the same type can lead to long streaks of spawns from one species pool or the other.
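As a rough illustration of the calculation described above, the sketch below counts how often a spawn is followed by one of the same type and computes a two-sided p-value against the fair-coin hypothesis using a normal approximation. The function names and the toy sequence are invented for illustration; this is not the Research Group's analysis code or data, and the exact test used for the article is not stated.

```python
import math

def same_type_fraction(spawns):
    """Return (fraction of consecutive pairs sharing a type, number of pairs).

    `spawns` is a sequence of labels such as "L" (Lure) and "B" (Biome)
    recorded from a single Lure, in order.
    """
    pairs = list(zip(spawns, spawns[1:]))
    same = sum(1 for a, b in pairs if a == b)
    return same / len(pairs), len(pairs)

def fair_coin_p_value(fraction, n_pairs):
    """Two-sided p-value for H0: same-type probability is 0.5.

    Uses the normal approximation to the binomial (an assumption here).
    """
    z = (fraction - 0.5) / math.sqrt(0.25 / n_pairs)
    return math.erfc(abs(z) / math.sqrt(2))

# Toy sequence invented for illustration only.
toy = "LLLBBBBLLLLBBLLLBBBB"
frac, n = same_type_fraction(toy)
print(f"same-type fraction: {frac:.2f} over {n} pairs")
print(f"p-value vs. fair coin: {fair_coin_p_value(frac, n):.3f}")
```

With only 19 pairs, a same-type fraction near 70% is weak evidence on its own. When the same fraction holds across thousands of pairs, the p-value under this approximation becomes vanishingly small. Pairs should only be counted within a single Lure, never across the boundary between two different Lures.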
So, at least at first blush, the spawn-generation process is modeled better by the flip of an unfair coin that lands heads 70% of the time and tails 30% of the time, with heads representing “select the next spawn from the same spawn pool” and tails representing “select the next spawn from the other spawn pool.” The following diagram illustrates this concept visually.
If the “unfair coin” model described in the previous paragraphs were correct, then while successive spawns would exhibit high correlation, spawns separated by more and more time should exhibit less and less correlation, a phenomenon known as “mixing.”
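To make the idea of mixing concrete, here is a minimal sketch of the "unfair coin" model as a symmetric two-state chain. For such a chain with a stay probability of p, the expected correlation between spawns k apart is (2p - 1)^k, which is a standard result for this kind of model rather than something taken from our data. The function names, the seed, and the 70% default are illustrative assumptions.

```python
import random

def simulate_unfair_coin(n_spawns, stay_prob=0.7, seed=0):
    """Simulate spawn types (0 = Biome, 1 = Lure) where each spawn
    repeats the previous type with probability `stay_prob`."""
    rng = random.Random(seed)
    spawns = [rng.randint(0, 1)]
    for _ in range(n_spawns - 1):
        stay = rng.random() < stay_prob
        spawns.append(spawns[-1] if stay else 1 - spawns[-1])
    return spawns

def expected_correlation(lag, stay_prob=0.7):
    """Theoretical lag-k correlation for the symmetric two-state chain."""
    return (2 * stay_prob - 1) ** lag

# The expected correlation shrinks geometrically as the separation grows.
for lag in (1, 2, 5, 10, 17):
    print(lag, expected_correlation(lag))
```

Under this model the correlation is about 0.4 for adjacent spawns and has already fallen to roughly 0.0001 by a separation of 10, so any sizeable correlation or anti-correlation at larger separations is not something the unfair coin alone would produce.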
To test this, we represented the output of a Special Lure as a string of 0s and 1s, with 0 for Biome spawns and 1 for Lure spawns, and calculated the correlation between spawns separated by different distances in the string: can any correlation be seen between all pairs of spawns 2 apart in the string? 5 apart? 10 apart? We quantified this relationship using a correlation coefficient. A correlation coefficient can have any value between -1 and 1, with correlations greater than 0 indicating a positive relationship (spawns tend to come from the same pool), and correlations less than 0 indicating a negative one (spawns tend to come from different pools). A value of 0 is expected for uncorrelated events.
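The calculation just described can be sketched as follows. The article does not say which correlation coefficient was used, so this example assumes the ordinary Pearson coefficient between each spawn and the spawn `lag` positions later; the function names and the toy string are hypothetical.

```python
import math

def bits_from_string(text):
    """Convert a string such as "0110" into a list of 0/1 integers."""
    return [int(ch) for ch in text]

def lag_correlation(bits, lag):
    """Pearson correlation between bits[i] and bits[i + lag].

    Requires 1 <= lag < len(bits). Returns NaN when either side
    never varies, since the coefficient is undefined in that case.
    """
    x = bits[:-lag]
    y = bits[lag:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

# Toy string with a period of 4, invented for illustration only.
toy = bits_from_string("0011001100110011")
for lag in (1, 2, 3, 4):
    print(lag, round(lag_correlation(toy, lag), 3))
```

In this toy string, spawns half a period apart (2) always come from different pools and spawns a full period apart (4) always come from the same pool, so the coefficient swings between the two extremes of its range.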
Before the outbreak of COVID-19, Lures placed during Community Day lasted for three hours. A few of our Researchers took advantage of this bonus by placing Special Lures during the final minute of Community Day and recording spawns thereafter, providing up to 120 uninterrupted spawns and allowing us to examine correlations along much longer strings of spawns than the usual 30-minute, 20-spawn Lures would allow.
For these long chains of spawns, the figure below shows the correlation coefficient plotted as a function of spawn separation distance for five Special Lures of various durations.²
Surprisingly, the correlations for all five Lures display a consistent cyclic behavior with a period of about 34 spawns. More precisely, there is a significant anti-correlation between pairs of spawns that are 17 apart in each string, leading to correlation in spawns 34 apart, anti-correlation in spawns 51 apart, and so forth. The data becomes noisier towards the right side of the plot as the sample sizes become smaller (due to both fewer Lures and fewer pairs of spawns separated by that distance). We won’t hazard a guess at whether these correlations should be expected to dampen over time or whether they would continue indefinitely.
Averaging the Lures together, we see that the peaks and valleys of the correlation plot are much larger than those expected by chance. The following graph compares our observed data with “random” data generated using the biased coin flip model above (which allows for the positive correlation of spawns close together, but exhibits mixing over the longer term).
We confirmed that this periodicity appears to be present in normal 30-minute 20-spawn Special Lures as well, although only the first half of the first correlation cycle can be observed.
Circularity in Lure Species
So far, we’ve limited our discussion to patterns of Lure spawns and Biome spawns. What about the individual species themselves?
Just as Lure spawns are more likely to be followed by another Lure spawn, certain Special Lure species are significantly more likely to follow others than would be expected by chance. The heat map below shows which species boosted by the Glacial Lure Module tend to follow each other. Similar maps can be made for Mossy and Magnetic Lure Modules. Each square shows the over- (or under-) representation of one species pair, with the preceding spawn on the horizontal axis and the following spawn on the vertical axis. The darker the color, the more over-represented the pairing.³ We have manually ordered the species so that “hotter” parts of the map are closer to the middle.
There are four interesting observations that we would like to highlight about the above plot.
- As already stated, some species pairs are far more likely to occur in series than would be expected by chance (Chi-squared p-value<0.001 for all 13 species). Naturally, there were also many species pairs that were seen much less frequently than would be expected.
- Species do not tend to follow themselves. The dashed line that runs through the plot was added to visually highlight the diagonal of the plot. These squares show how often a species tends to spawn twice in a row. The lighter colors of these squares indicate that, on average, the same species spawning twice in a row is no more likely than would be expected by random chance.
- The pattern that emerged from our manual sorting of species suggests that the Lure species have an ordering. For example, Snover tends to follow Wailmer, which tends to follow Spheal, which tends to follow Seel, and so forth. The reverse direction is also correlated, though not as strongly. Had the associations between species been random, it would be difficult to arrange the species such that the darker regions almost entirely fall along a 1:1 line.
- Furthermore, the darker regions that appear in the top left and bottom right of the plot are a hallmark of circularity. To visualize this, imagine placing additional copies of the heat map in a grid. The areas in the bottom right and top left now fall next to the main diagonal dark area, creating continuous lines of association through the mosaic. The circularity isn’t perfect: Feebas doesn’t fit quite as well as the other species, and different observers might make slightly different cycles out of the same data. Still, the overall structure looks something like this:
Speculation and Parting Words
We believe that the above observations are all likely linked to some sort of underlying quirk in how random numbers are created to generate Special Lure spawns, rather than to a deliberate data-generating process. Many pseudo-random number generators rely on a “seed” value, which is then transformed using a deterministic formula to get an output. The output of such a formula can appear random, although it is not truly random. Some simple algorithms rely on a modulus operation (taking the remainder left after one number is divided by another) to create random values. One of many possible explanations for the observations in this article is that the exact time of the spawn, potentially with other information, might be used by the server as a seed to generate Lure spawns.
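To show what a seeded, modulus-based generator can look like, here is a minimal sketch of a linear congruential generator, one of the simplest textbook designs. It is purely illustrative: we have no knowledge of the algorithm used by the game's servers, and the constants below are commonly cited textbook values, not anything recovered from Pokémon GO.

```python
from itertools import islice

def lcg(seed, modulus=2**31, a=1103515245, c=12345):
    """Linear congruential generator: next = (a * current + c) mod modulus.

    The constants are widely quoted textbook values, used here only
    as an example.
    """
    state = seed
    while True:
        state = (a * state + c) % modulus
        yield state

def first_values(seed, count):
    """Return the first `count` outputs for a given seed."""
    return list(islice(lcg(seed), count))

# Same seed, same "random" sequence: the output is fully deterministic.
print(first_values(42, 3) == first_values(42, 3))  # True

# With a power-of-two modulus, the lowest bit simply alternates.
low_bits = [value % 2 for value in first_values(42, 8)]
print(low_bits)  # [1, 0, 1, 0, 1, 0, 1, 0]
```

The alternating low bit is a well-known weakness of this family of generators: values that look random as a whole can hide short, regular cycles in part of their output. Whether anything similar is behind the patterns we observed remains speculation.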
We’re uncertain what the potential ramifications of these observations might be for other game features. It’s possible that these oddities are strictly limited to Special Lure Modules. It’s also possible, however, that researchers have stumbled onto a pattern that can be applied more generally to numerous aspects of the Pokémon GO world. The truth may even fall somewhere in between these extremes.
Thanks for reading, travelers, and we’ll see you on the Road.
Analysis: Scientist Titleist and Gluglumaster
Graphics and data presentation: Scientist WoodWoseWulf and Titleist
Editing: Scientist Cham1nade and Lead Researchers Gustavobc and Paleshadow
Data collection: 91 amazingly patient researchers
¹For all analysis in the article, we assume that any spawn of the 13 Lure species comes from the “Lure spawn” pool, even though it’s likely some of these spawns were drawn from the “Biome spawns.”
²Extended Lure types and lengths:
| Lure Type | Lure Length |
Post-publication edit: The figure in this article related to the plot for the above table had the colors flipped for Mossy and Magnetic Lures – this was corrected shortly after publication.
³More precisely, the colors in the heat map represent the Jaccard similarity coefficient between the two species. The highest coefficient was 0.21 for the (Spheal, Wailmer) pairing, while the lowest coefficient was zero for several pairings.
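For readers unfamiliar with the Jaccard coefficient, the sketch below shows one plausible way to compute it for an ordered species pair. The set definitions are our assumption for this example (set A is the positions where the first species spawned, set B is the positions whose next spawn was the second species), and the function name and toy spawn list are hypothetical; the exact construction behind the heat map may differ.

```python
def jaccard_follow(spawns, first, second):
    """Jaccard coefficient |A & B| / |A | B| for an ordered species pair.

    A: positions i where spawns[i] is `first`.
    B: positions i where spawns[i + 1] is `second`.
    These set definitions are an assumption made for illustration.
    """
    positions = range(len(spawns) - 1)
    a = {i for i in positions if spawns[i] == first}
    b = {i for i in positions if spawns[i + 1] == second}
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy spawn list invented for illustration only.
toy = ["Seel", "Spheal", "Wailmer", "Snover",
       "Seel", "Spheal", "Wailmer", "Feebas"]
print(jaccard_follow(toy, "Spheal", "Wailmer"))  # 1.0
print(jaccard_follow(toy, "Wailmer", "Snover"))  # 0.5
print(jaccard_follow(toy, "Snover", "Spheal"))   # 0.0
```

A coefficient of 1 means the second species always and only follows the first, while 0 means the pairing was never seen.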