Monday, October 17, 2022 | 12:30 p.m. - 1:45 p.m.
There is great interest in using adaptive sampling methods, such as reinforcement learning and bandit algorithms, to optimize digital interventions in areas like mobile health and education. A major obstacle to more widespread use of such algorithms in practice is the lack of assurance that the resulting adaptively collected data can be used to reliably answer inferential questions, including questions about time-varying causal effects. In this work, we introduce the adaptive sandwich estimator to quantify uncertainty for Z-estimators on data collected by a large class of adaptive algorithms that learn to select actions by pooling the data of multiple users. Our approach is applicable to longitudinal data settings, and in simpler settings, our results generalize those in the adaptive clinical trial literature.