What is the “Hotel Problem” and how does it affect the interpretation of data?
In general, analytics tools count users, but each analytics tool has its own way of counting. User counts can appear to be incorrect for various reasons, especially when measuring users over time. For example, if a user installs your app on Monday and then returns on Tuesday, a usage chart that reports users per day will show one user on Monday and one user on Tuesday, for an apparent total of two. But if you modify the same chart to report total users across days (that is, not split by day), the chart will show only one user, since only one unique user was observed across the evaluated time period. This common source of confusion is called the Hotel Problem. (For more information on the Hotel Problem, see the Wikipedia article on Web Analytics.)
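The Monday/Tuesday example above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not how Localytics computes its reports; the `sessions` list and both helper functions are hypothetical names invented for this sketch.

```python
from collections import defaultdict

# Hypothetical session log: one (user_id, day) pair per session.
sessions = [
    ("user_a", "Mon"),  # installs the app on Monday
    ("user_a", "Tue"),  # returns on Tuesday
]

def users_per_day(sessions):
    """Return {day: number of unique users seen on that day}."""
    seen = defaultdict(set)
    for user_id, day in sessions:
        seen[day].add(user_id)
    return {day: len(users) for day, users in seen.items()}

def unique_users(sessions):
    """Return the number of unique users across the whole period."""
    return len({user_id for user_id, _ in sessions})

daily = users_per_day(sessions)
sum_of_daily = sum(daily.values())
unique_total = unique_users(sessions)

print(daily)         # {'Mon': 1, 'Tue': 1}
print(sum_of_daily)  # 2
print(unique_total)  # 1
```

Adding up the per-day counts gives 2, while deduplicating across the period gives 1. Both numbers are correct; they answer different questions.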
This can be particularly confusing when evaluating new versus returning users. For example, Localytics counts a user as new only the first time they open your app. Therefore, a user who opens your app for the first time on a given day and returns later that same day will be counted once as a new user and once as a returning user. This means that if you split the number of users from a given day between new and returning, the two figures added together will likely exceed the number of unique users from that day (because some users are properly identified as both new and returning).
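A minimal sketch of the new-versus-returning split, assuming the simple rule described above (a user is new only on their first open). The data and the `split_new_returning` function are hypothetical and exist only to show why the two groups can overlap.

```python
# Hypothetical sessions in chronological order: (user_id, day).
sessions = [
    ("user_a", "Mon"),  # first-ever open: counted as new
    ("user_a", "Mon"),  # second open, same day: counted as returning
    ("user_b", "Mon"),  # first-ever open: counted as new
]

def split_new_returning(sessions, day):
    """Return (new_users, returning_users) as sets for the given day."""
    seen_before = set()
    new_users, returning_users = set(), set()
    for user_id, session_day in sessions:
        is_first_open = user_id not in seen_before
        seen_before.add(user_id)
        if session_day != day:
            continue
        if is_first_open:
            new_users.add(user_id)
        else:
            returning_users.add(user_id)
    return new_users, returning_users

new_users, returning_users = split_new_returning(sessions, "Mon")
unique_on_day = new_users | returning_users

print(len(new_users))        # 2
print(len(returning_users))  # 1
print(len(unique_on_day))    # 2
```

Here new plus returning is 3, but only 2 unique users opened the app on Monday, because user_a appears in both groups.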
A variation of this problem appears when counting users who have enabled or disabled push messaging. Over time, a user can enable or disable your ability to send them push messages in your app. Since our SDK records whether push is enabled on every event, a user can show as both enabled and disabled if the report's time period includes the moment they switched. A discrepancy can also occur in the number of users that push messages are sent to, because a known user and an anonymous user are counted separately. For example, if a user enables push messaging while anonymous and later disables push messaging while known, that person will technically be counted as two different users with different push enabled settings.
Let's say you're looking at a Localytics report covering the month of April. A user who had push disabled on April 1 decided to enable it on April 15. If you filter the April 30-day report by Push Enabled, that user would be counted for April. If you remove that filter and instead filter by Push Disabled, they would again be counted for April, since that user was both enabled and disabled over that time period.
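The April scenario can be sketched the same way. The event tuples and the `users_matching` helper below are hypothetical; the sketch assumes only what is described above, namely that each event carries the user's push setting at the time of the event.

```python
# Hypothetical April events: (user_id, day_of_april, push_enabled).
events = [
    ("user_a", 1, False),   # push disabled at the start of the month
    ("user_a", 15, True),   # enables push on April 15
    ("user_b", 10, True),   # push enabled all month
]

def users_matching(events, push_enabled):
    """Return the set of users with at least one event matching the filter."""
    return {user_id for user_id, _, flag in events if flag == push_enabled}

enabled = users_matching(events, True)
disabled = users_matching(events, False)
in_both = enabled & disabled
unique_total = enabled | disabled

print(len(enabled))       # 2
print(len(disabled))      # 1
print(sorted(in_both))    # ['user_a']
print(len(unique_total))  # 2
```

The Push Enabled and Push Disabled filters together report 3 users, yet only 2 unique users were active in April, because user_a matches both filters.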
The Hotel Problem will also create a discrepancy between unique users and total users over time. For example, when looking at Users by Day, the dashboard displays the number of unique users for each specific day. However, the summary bar at the top of the screen shows unique users across the entire time period, and users will inevitably overlap as they have sessions on different days. As a result, the Total column in Users by Day is the sum of each day's users rather than the number of unique users across the selected time range, which is a perfect example of the Hotel Problem.