Methodology
- [1] arXiv:2405.18597 [pdf, ps, html, other]
Title: Causal inference in the closed-loop: marginal structural models for sequential excursion effects
Comments: 19 pages, 9 figures
Subjects: Methodology (stat.ME); Applications (stat.AP)
Optogenetics is widely used to study the effects of neural circuit manipulation on behavior. However, the paucity of causal inference methodological work on this topic has resulted in analysis conventions that discard information and constrain the scientific questions that can be posed. To fill this gap, we introduce a nonparametric causal inference framework for analyzing "closed-loop" designs, which use dynamic policies that assign treatment based on covariates. In this setting, standard methods can introduce bias and occlude causal effects. Building on the sequentially randomized experiments literature in causal inference, our approach extends history-restricted marginal structural models for dynamic regimes. In practice, our framework can identify a wide range of causal effects of optogenetics on trial-by-trial behavior, such as fast/slow-acting, dose-response, additive/antagonistic, and floor/ceiling effects. Importantly, it does so without requiring negative controls, and can estimate how causal effect magnitudes evolve across time points. From another view, our work extends "excursion effect" methods (popular in the mobile health literature) to enable estimation of causal contrasts for treatment sequences greater than length one, in the presence of positivity violations. We derive rigorous statistical guarantees, enabling hypothesis testing of these causal effects. We demonstrate our approach on data from a recent study of dopaminergic activity on learning, and show how our method reveals relevant effects obscured in standard analyses.
- [2] arXiv:2405.18722 [pdf, ps, html, other]
Title: Adaptive and Efficient Learning with Blockwise Missing and Semi-Supervised Data
Subjects: Methodology (stat.ME)
Data fusion is an important way to realize powerful and generalizable analyses across multiple sources. However, differing data collection capabilities across the sources have become a prominent issue in practice. This can result in blockwise missingness (BM) of covariates, which is troublesome for integration. Meanwhile, the high cost of obtaining gold-standard labels can leave the response missing for a large proportion of samples, known as the semi-supervised (SS) problem. In this paper, we consider a challenging scenario confronting both the BM and SS issues, and propose a novel Data-adaptive projecting Estimation approach for data FUsion in the SEmi-supervised setting (DEFUSE). Starting with a complete-data-only estimator, it involves two successive projection steps to reduce its variance without incurring bias. Compared to existing approaches, DEFUSE achieves a two-fold improvement. First, it leverages the BM labeled sample more efficiently through a novel data-adaptive projection approach robust to model misspecification on the missing covariates, leading to better variance reduction. Second, our method further incorporates the large unlabeled sample to enhance the estimation efficiency through imputation and projection. Compared to the previous SS setting with complete covariates, our work reveals a more essential role of the unlabeled sample in the BM setting. These advantages are justified in asymptotic and simulation studies. We also apply DEFUSE for the risk modeling and inference of heart diseases with the MIMIC-III electronic medical record (EMR) data.
- [3] arXiv:2405.18836 [pdf, ps, html, other]
Title: Do Finetti: On Causal Effects for Exchangeable Data
Subjects: Methodology (stat.ME); Machine Learning (cs.LG)
We study causal effect estimation in a setting where the data are not i.i.d. (independent and identically distributed). We focus on exchangeable data satisfying an assumption of independent causal mechanisms. Traditional causal effect estimation frameworks, e.g., relying on structural causal models and do-calculus, are typically limited to i.i.d. data and do not extend to more general exchangeable generative processes, which naturally arise in multi-environment data. To address this gap, we develop a generalized framework for exchangeable data and introduce a truncated factorization formula that facilitates both the identification and estimation of causal effects in our setting. To illustrate potential applications, we introduce a causal Pólya urn model and demonstrate how intervention propagates effects in exchangeable data settings. Finally, we develop an algorithm that performs simultaneous causal discovery and effect estimation given multi-environment data.
- [4] arXiv:2405.18856 [pdf, ps, other]
Title: Inference under covariate-adaptive randomization with many strata
Subjects: Methodology (stat.ME); Statistics Theory (math.ST)
Covariate-adaptive randomization is widely employed to balance baseline covariates in interventional studies such as clinical trials and experiments in development economics. Recent years have witnessed substantial progress in inference under covariate-adaptive randomization with a fixed number of strata. However, concerns have been raised about the impact of a large number of strata on its design and analysis, which is a common scenario in practice, such as in multicenter randomized clinical trials. In this paper, we propose a general framework for inference under covariate-adaptive randomization, which extends the seminal works of Bugni et al. (2018, 2019) by allowing for a diverging number of strata. Furthermore, we introduce a novel weighted regression adjustment that ensures efficiency improvement. On top of establishing the asymptotic theory, practical algorithms for handling situations involving an extremely large number of strata are also developed. Moreover, by linking design balance and inference robustness, we highlight the advantages of stratified block randomization, which enforces better covariate balance within strata compared to simple randomization. This paper offers a comprehensive landscape of inference under covariate-adaptive randomization, spanning from fixed to diverging to extremely large numbers of strata.
- [5] arXiv:2405.18873 [pdf, ps, html, other]
Title: A Return to Biased Nets: New Specifications and Approximate Bayesian Inference
Subjects: Methodology (stat.ME); Social and Information Networks (cs.SI)
The biased net paradigm was the first general and empirically tractable scheme for parameterizing complex patterns of dependence in networks, expressing deviations from uniform random graph structure in terms of latent "bias events," whose realizations enhance reciprocity, transitivity, or other structural features. Subsequent developments have introduced local specifications of biased nets, which reduce the need for approximations required in early specifications based on tracing processes. Here, we show that while one such specification leads to inconsistencies, a closely related Markovian specification both evades these difficulties and can be extended to incorporate new types of effects. We introduce the notion of inhibitory bias events, with satiation as an example, which are useful for avoiding degeneracies that can arise from closure bias terms. Although our approach does not lead to a computable likelihood, we provide a strategy for approximate Bayesian inference using random forest prevision. We demonstrate our approach on a network of friendship ties among college students, recapitulating a relationship between the sibling bias and tie strength posited in earlier work by Fararo.
- [6] arXiv:2405.19058 [pdf, ps, html, other]
Title: Participation bias in the estimation of heritability and genetic correlation
Subjects: Methodology (stat.ME)
It is increasingly recognized that participation bias can pose problems for genetic studies. Recently, to overcome the challenge that genetic information of non-participants is unavailable, it has been shown that by comparing the IBD (identity by descent) shared and not-shared segments among the participants, one can estimate the genetic component underlying participation. That, however, does not directly address how to adjust estimates of heritability and genetic correlation for phenotypes correlated with participation. Here, for phenotypes whose mean differences between population and sample are known, we demonstrate a way to do so by adopting a statistical framework that separates out the genetic and non-genetic correlations between participation and these phenotypes. Crucially, our method avoids making the assumption that the effect of the genetic component underlying participation is manifested entirely through these other phenotypes. Applying the method to 12 UK Biobank phenotypes, we found that 8 have significant genetic correlations with participation, including body mass index, educational attainment, and smoking status. For most of these phenotypes, without adjustments, estimates of heritability and of the absolute value of genetic correlation would be biased downward.
- [7] arXiv:2405.19145 [pdf, ps, html, other]
Title: L-Estimation in Instrumental Variables Regression for Censored Data in Presence of Endogeneity and Dependent Errors
Subjects: Methodology (stat.ME)
In this article, we propose L-estimators of the unknown parameters in the instrumental variables regression in the presence of censored data under endogeneity. We allow the random errors involved in the model to be dependent. The proposed estimation procedure is a two-stage procedure, and the large sample properties of the proposed estimators are established. The utility of the proposed methodology is demonstrated for various simulated data and a benchmark real data set.
- [8] arXiv:2405.19231 [pdf, ps, html, other]
Title: Covariate Shift Corrected Conditional Randomization Test
Subjects: Methodology (stat.ME)
Conditional independence tests are crucial across various disciplines in determining the independence of an outcome variable $Y$ from a treatment variable $X$, conditioning on a set of confounders $Z$. The Conditional Randomization Test (CRT) offers a powerful framework for such testing by assuming known distributions of $X \mid Z$; it controls the Type-I error exactly, allowing for the use of flexible, black-box test statistics. In practice, testing for conditional independence often involves using data from a source population to draw conclusions about a target population. This can be challenging due to covariate shift -- differences in the distribution of $X$, $Z$, and surrogate variables, which can affect the conditional distribution of $Y \mid X, Z$ -- rendering traditional CRT approaches invalid. To address this issue, we propose a novel Covariate Shift Corrected Pearson Chi-squared Conditional Randomization (csPCR) test. This test adapts to covariate shifts by integrating importance weights and employing the control variates method to reduce variance in the test statistics and thus enhance power. Theoretically, we establish that the csPCR test controls the Type-I error asymptotically. Empirically, through simulation studies, we demonstrate that our method not only maintains control over Type-I errors but also exhibits superior power, confirming its efficacy and practical utility in real-world scenarios where covariate shifts are prevalent. Finally, we apply our methodology to a real-world dataset to assess the impact of a COVID-19 treatment on the 90-day mortality rate among patients.
- [9] arXiv:2405.19312 [pdf, ps, html, other]
Title: Causal Inference for Balanced Incomplete Block Designs
Subjects: Methodology (stat.ME)
Researchers often turn to block randomization to increase the precision of their inference or due to practical considerations, such as in multi-site trials. However, if the number of treatments under consideration is large it might not be practical or even feasible to assign all treatments within each block. We develop novel inference results under the finite-population design-based framework for a natural alternative to the complete block design that does not require reducing the number of treatment arms, the balanced incomplete block design (BIBD). This includes deriving the properties of two estimators for BIBDs and proposing conservative variance estimators. To assist practitioners in understanding the trade-offs of using BIBDs over other designs, the precisions of resulting estimators are compared to standard estimators for the complete block design, the cluster-randomized design, and the completely randomized design. Simulations and a data illustration demonstrate the strengths and weaknesses of using BIBDs. This work highlights BIBDs as practical and currently underutilized designs.
New submissions for Thursday, 30 May 2024 (showing 9 of 9 entries)
- [10] arXiv:2405.18459 (cross-list from cs.IT) [pdf, ps, html, other]
Title: Probing the Information Theoretical Roots of Spatial Dependence Measures
Comments: COSIT-2024 Conference Proceedings
Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Methodology (stat.ME)
Intuitively, there is a relation between measures of spatial dependence and information theoretical measures of entropy. For instance, we can provide an intuition of why spatial data is special by stating that, on average, spatial data samples contain less than expected information. Similarly, spatial data, e.g., remotely sensed imagery, that is easy to compress is also likely to show significant spatial autocorrelation. Formulating our (highly specific) core concepts of spatial information theory in the widely used language of information theory opens new perspectives on their differences and similarities and also fosters cross-disciplinary collaboration, e.g., with the broader AI/ML communities. Interestingly, however, this intuitive relation is challenging to formalize and generalize, leading prior work to rely mostly on experimental results, e.g., for describing landscape patterns. In this work, we will explore the information theoretical roots of spatial autocorrelation, more specifically Moran's I, through the lens of self-information (also known as surprisal) and provide both formal proofs and experiments.
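As a rough illustration of the two quantities this abstract connects, the sketch below computes Moran's I from its standard textbook definition and the self-information (surprisal) of an event. The function names, the binary chain adjacency, and the toy values are assumptions made for illustration; none of them come from the paper, and the sketch does not reproduce the paper's formal results.

```python
import math

def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations over all ordered pairs.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * cross / variance_sum

def self_information(p):
    """Surprisal, in bits, of an event with probability p."""
    return -math.log2(p)

def chain_weights(n):
    """Hypothetical binary adjacency for n cells arranged in a row."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

w = chain_weights(4)
clustered = morans_i([0, 0, 1, 1], w)    # similar values sit next to each other
alternating = morans_i([0, 1, 0, 1], w)  # neighbors always differ
```

On this toy chain the clustered pattern gives a positive Moran's I and the alternating pattern gives a negative one, which matches the usual reading of positive versus negative spatial autocorrelation.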
- [11] arXiv:2405.18518 (cross-list from cs.LG) [pdf, ps, html, other]
Title: LSTM-COX Model: A Concise and Efficient Deep Learning Approach for Handling Recurrent Events
Subjects: Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
In the current field of clinical medicine, traditional methods for analyzing recurrent events have limitations when dealing with complex time-dependent data. This study combines Long Short-Term Memory networks (LSTM) with the Cox model to enhance the model's performance in analyzing recurrent events with dynamic temporal information. Compared to classical models, the LSTM-Cox model significantly improves the accuracy of extracting clinical risk features and exhibits lower Akaike Information Criterion (AIC) values, while maintaining good performance on simulated datasets. In an empirical analysis of bladder cancer recurrence data, the model successfully reduced the mean squared error during the training phase and achieved a Concordance index of up to 0.90 on the test set. Furthermore, the model effectively distinguished between high and low-risk patient groups, and the identified recurrence risk features such as the number of tumor recurrences and maximum size were consistent with other research and clinical trial results. This study not only provides a straightforward and efficient method for analyzing recurrent data and extracting features but also offers a convenient pathway for integrating deep learning techniques into clinical risk prediction systems.
- [12] arXiv:2405.18563 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Counterfactual Explanations for Multivariate Time-Series without Training Datasets
Subjects: Machine Learning (cs.LG); Methodology (stat.ME)
Machine learning (ML) methods have experienced significant growth in the past decade, yet their practical application in high-impact real-world domains has been hindered by their opacity. When ML methods are responsible for making critical decisions, stakeholders often require insights into how to alter these decisions. Counterfactual explanations (CFEs) have emerged as a solution, offering interpretations of opaque ML models and providing a pathway to transition from one decision to another. However, most existing CFE methods require access to the model's training dataset, few methods can handle multivariate time-series, and none can handle multivariate time-series without training datasets. These limitations can be formidable in many scenarios. In this paper, we present CFWoT, a novel reinforcement-learning-based CFE method that generates CFEs when training datasets are unavailable. CFWoT is model-agnostic and suitable for both static and multivariate time-series datasets with continuous and discrete features. Users have the flexibility to specify non-actionable, immutable, and preferred features, as well as causal constraints which CFWoT guarantees will be respected. We demonstrate the performance of CFWoT against four baselines on several datasets and find that, despite not having access to a training dataset, CFWoT finds CFEs that make significantly fewer and significantly smaller changes to the input time-series. These properties make CFEs more actionable, as the magnitude of change required to alter an outcome is vastly reduced.
- [13] arXiv:2405.18601 (cross-list from stat.ML) [pdf, ps, html, other]
Title: From Conformal Predictions to Confidence Regions
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Conformal prediction methodologies have significantly advanced the quantification of uncertainties in predictive models. Yet, the construction of confidence regions for model parameters presents a notable challenge, often necessitating stringent assumptions regarding data distribution or merely providing asymptotic guarantees. We introduce a novel approach termed CCR, which employs a combination of conformal prediction intervals for the model outputs to establish confidence regions for model parameters. We present coverage guarantees that hold under minimal assumptions on the noise and are valid in the finite-sample regime. Our approach is applicable to both split conformal predictions and black-box methodologies including full or cross-conformal approaches. In the specific case of linear models, the derived confidence region manifests as the feasible set of a Mixed-Integer Linear Program (MILP), facilitating the deduction of confidence intervals for individual parameters and enabling robust optimization. We empirically compare CCR to recent advancements in challenging settings such as with heteroskedastic and non-Gaussian noise.
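For readers unfamiliar with the split conformal intervals that this abstract takes as an input, here is a minimal sketch of the standard construction based on absolute residuals from a held-out calibration set. It is not the CCR method itself; the function names and the toy residuals are hypothetical.

```python
import math

def split_conformal_radius(calibration_residuals, alpha):
    """Half-width of a split conformal interval from absolute residuals."""
    n = len(calibration_residuals)
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this level
    return sorted(calibration_residuals)[rank - 1]

def prediction_interval(point_prediction, radius):
    """Symmetric interval around a point prediction."""
    return (point_prediction - radius, point_prediction + radius)

# Toy absolute residuals |y - f(x)| on a calibration set (assumed values).
residuals = [0.3, 1.2, 0.7, 0.1, 0.9, 0.5, 1.5, 0.2, 0.8]
radius = split_conformal_radius(residuals, alpha=0.2)
interval = prediction_interval(10.0, radius)
```

With nine calibration residuals and a miscoverage level of 0.2, the radius is the eighth smallest residual. When the calibration set is too small for the requested level, the sketch returns an infinite radius instead of understating the uncertainty.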
- [14] arXiv:2405.18621 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Multi-Armed Bandits with Network Interference
Subjects: Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
Online experimentation with interference is a common challenge in modern applications such as e-commerce and adaptive clinical trials in medicine. For example, in online marketplaces, the revenue of a good depends on discounts applied to competing goods. Statistical inference with interference is widely studied in the offline setting, but far less is known about how to adaptively assign treatments to minimize regret. We address this gap by studying a multi-armed bandit (MAB) problem where a learner (e-commerce platform) sequentially assigns one of $\mathcal{A}$ possible actions (discounts) to $N$ units (goods) over $T$ rounds to minimize regret (maximize revenue). Unlike traditional MAB problems, the reward of each unit depends on the treatments assigned to other units, i.e., there is interference across the underlying network of units. With $\mathcal{A}$ actions and $N$ units, minimizing regret is combinatorially difficult since the action space grows as $\mathcal{A}^N$. To overcome this issue, we study a sparse network interference model, where the reward of a unit is only affected by the treatments assigned to $s$ neighboring units. We use tools from discrete Fourier analysis to develop a sparse linear representation of the unit-specific reward $r_n: [\mathcal{A}]^N \rightarrow \mathbb{R}$, and propose simple, linear regression-based algorithms to minimize regret. Importantly, our algorithms achieve provably low regret both when the learner observes the interference neighborhood for all units and when it is unknown. This significantly generalizes other works on this topic, which impose strict conditions on the strength of interference on a known network and also compare regret to a markedly weaker optimal action. Empirically, we corroborate our theoretical findings via numerical simulations.
- [15] arXiv:2405.18671 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Watermarking Counterfactual Explanations
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Methodology (stat.ME)
The field of Explainable Artificial Intelligence (XAI) focuses on techniques for providing explanations to end-users about the decision-making processes that underlie modern-day machine learning (ML) models. Within the vast universe of XAI techniques, counterfactual (CF) explanations are often preferred by end-users as they help explain the predictions of ML models by providing an easy-to-understand & actionable recourse (or contrastive) case to individual end-users who are adversely impacted by predicted outcomes. However, recent studies have shown significant security concerns with using CF explanations in real-world applications; in particular, malicious adversaries can exploit CF explanations to perform query-efficient model extraction attacks on proprietary ML models. In this paper, we propose a model-agnostic watermarking framework (for adding watermarks to CF explanations) that can be leveraged to detect unauthorized model extraction attacks (which rely on the watermarked CF explanations). Our novel framework solves a bi-level optimization problem to embed an indistinguishable watermark into the generated CF explanation such that any future model extraction attacks that rely on these watermarked CF explanations can be detected using a null hypothesis significance testing (NHST) scheme, while ensuring that these embedded watermarks do not compromise the quality of the generated CF explanations. We evaluate this framework's performance across a diverse set of real-world datasets, CF explanation methods, and model extraction techniques, and show that our watermarking detection system can be used to accurately identify extracted ML models that are trained using the watermarked CF explanations. Our work paves the way for the secure adoption of CF explanations in real-world applications.
- [16] arXiv:2405.18987 (cross-list from econ.EM) [pdf, ps, other]
Title: Transmission Channel Analysis in Dynamic Models
Subjects: Econometrics (econ.EM); Methodology (stat.ME)
We propose a framework for the analysis of transmission channels in a large class of dynamic models. To this end, we formulate our approach both using graph theory and potential outcomes, which we show to be equivalent. Our method, labelled Transmission Channel Analysis (TCA), allows for the decomposition of total effects captured by impulse response functions into the effects flowing along transmission channels, thereby providing a quantitative assessment of the strength of various transmission channels. We establish that this requires no additional identification assumptions beyond the identification of the structural shock whose effects the researcher wants to decompose. Additionally, we prove that impulse response functions are sufficient statistics for the computation of transmission effects. We also demonstrate the empirical relevance of TCA for policy evaluation by decomposing the effects of various monetary policy shock measures into instantaneous implementation effects and effects that likely relate to forward guidance.
- [17] arXiv:2405.19225 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Synthetic Potential Outcomes for Mixtures of Treatment Effects
Subjects: Machine Learning (cs.LG); Econometrics (econ.EM); Methodology (stat.ME)
Modern data analysis frequently relies on the use of large datasets, often constructed as amalgamations of diverse populations or data-sources. Heterogeneity across these smaller datasets constitutes two major challenges for causal inference: (1) the source of each sample can introduce latent confounding between treatment and effect, and (2) diverse populations may respond differently to the same treatment, giving rise to heterogeneous treatment effects (HTEs). The issues of latent confounding and HTEs have been studied separately but not in conjunction. In particular, previous works only report the conditional average treatment effect (CATE) among similar individuals (with respect to the measured covariates). CATEs cannot resolve mixtures of potential treatment effects driven by latent heterogeneity, which we call mixtures of treatment effects (MTEs). Inspired by method of moment approaches to mixture models, we propose "synthetic potential outcomes" (SPOs). Our new approach deconfounds heterogeneity while also guaranteeing the identifiability of MTEs. This technique bypasses full recovery of a mixture, which significantly simplifies its requirements for identifiability. We demonstrate the efficacy of SPOs on synthetic data.
- [18] arXiv:2405.19317 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Adaptive Generalized Neyman Allocation: Local Asymptotic Minimax Optimal Best Arm Identification
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Econometrics (econ.EM); Methodology (stat.ME); Machine Learning (stat.ML)
This study investigates a local asymptotic minimax optimal strategy for fixed-budget best arm identification (BAI). We propose the Adaptive Generalized Neyman Allocation (AGNA) strategy and show that its worst-case upper bound of the probability of misidentifying the best arm aligns with the worst-case lower bound under the small-gap regime, where the gap between the expected outcomes of the best and suboptimal arms is small. Our strategy corresponds to a generalization of the Neyman allocation for two-armed bandits (Neyman, 1934; Kaufmann et al., 2016) and a refinement of existing strategies such as the ones proposed by Glynn & Juneja (2004) and Shin et al. (2018). Compared to Komiyama et al. (2022), which proposes a minimax rate-optimal strategy, our proposed strategy has a tighter upper bound that exactly matches the lower bound, including the constant terms, by restricting the class of distributions to the class of small-gap distributions. Our result contributes to the longstanding open issue about the existence of asymptotically optimal strategies in fixed-budget BAI, by presenting the local asymptotic minimax optimal strategy.
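The classical two-armed Neyman allocation that this abstract generalizes samples each arm in proportion to its outcome standard deviation, which minimizes the variance of the difference in sample means for a fixed budget. The sketch below illustrates only that classical rule with known standard deviations; it is not the AGNA strategy, and the function names and numbers are hypothetical.

```python
def neyman_allocation(std_devs):
    """Sampling fractions proportional to each arm's standard deviation."""
    total = sum(std_devs)
    return [s / total for s in std_devs]

def diff_in_means_variance(std_devs, fractions, budget):
    """Variance of the difference in sample means for two arms."""
    return sum(s ** 2 / (f * budget) for s, f in zip(std_devs, fractions))

stds = [1.0, 3.0]                  # assumed known standard deviations
neyman = neyman_allocation(stds)   # noisier arm receives more samples
uniform = [0.5, 0.5]
var_neyman = diff_in_means_variance(stds, neyman, budget=100)
var_uniform = diff_in_means_variance(stds, uniform, budget=100)
```

In this toy setting the Neyman split assigns three quarters of the budget to the noisier arm and gives a smaller variance than an even split. In practice the standard deviations are unknown, which is why adaptive strategies estimate them during the experiment.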
Cross submissions for Thursday, 30 May 2024 (showing 9 of 9 entries)
- [19] arXiv:2106.14083 (replaced) [pdf, ps, html, other]
Title: Bayesian Time-Varying Tensor Vector Autoregressive Models for Dynamic Effective Connectivity
Subjects: Methodology (stat.ME); Applications (stat.AP)
In contemporary neuroscience, a key area of interest is dynamic effective connectivity, which is crucial for understanding the dynamic interactions and causal relationships between different brain regions. Dynamic effective connectivity can provide insights into how brain network interactions are altered in neurological disorders such as dyslexia. Time-varying vector autoregressive (TV-VAR) models have been employed to draw inferences for this purpose. However, their significant computational requirements pose challenges, since the number of parameters to be estimated increases quadratically with the number of time series. In this paper, we propose a computationally efficient Bayesian time-varying VAR approach. For dealing with large-dimensional time series, the proposed framework employs a tensor decomposition for the VAR coefficient matrices at different lags. Dynamically varying connectivity patterns are captured by assuming that at any given time only a subset of components in the tensor decomposition is active. Latent binary time series select the active components at each time via an innovative and parsimonious Ising model in the time-domain. Furthermore, we propose sparsity-inducing priors to achieve global-local shrinkage of the VAR coefficients, automatically determine the rank of the tensor decomposition, and guide the selection of the lags of the auto-regression. We demonstrate the performance of our model formulation via simulation studies and data from a real fMRI study involving a book reading experiment.
- [20] arXiv:2307.11941 (replaced) [pdf, ps, html, other]
Title: Visibility graph-based covariance functions for scalable spatial analysis in non-convex domains
Comments: expanded with supporting information
Subjects: Methodology (stat.ME)
We present a new method for constructing valid covariance functions of Gaussian processes for spatial analysis in irregular, non-convex domains such as bodies of water. Standard covariance functions based on geodesic distances are not guaranteed to be positive definite on such domains, while existing non-Euclidean approaches fail to respect the partially Euclidean nature of these domains where the geodesic distance agrees with the Euclidean distances for some pairs of points. Using a visibility graph on the domain, we propose a class of covariance functions that preserve Euclidean-based covariances between points that are connected in the domain while incorporating the non-convex geometry of the domain via conditional independence relationships. We show that the proposed method preserves the partially Euclidean nature of the intrinsic geometry on the domain while maintaining validity (positive definiteness) and marginal stationarity of the covariance function over the entire parameter space, properties which are not always fulfilled by existing approaches to construct covariance functions on non-convex domains. We provide useful approximations to improve computational efficiency, resulting in a scalable algorithm. We compare the performance of our method with those of competing state-of-the-art methods using simulation studies on synthetic non-convex domains. The method is applied to data regarding acidity levels in the Chesapeake Bay, showing its potential for ecological monitoring in real-world spatial applications on irregular domains.
- [21] arXiv:2309.03952 (replaced) [pdf, ps, html, other]
Title: The Causal Roadmap and Simulations to Improve the Rigor and Reproducibility of Real-Data Applications
Subjects: Methodology (stat.ME)
The Causal Roadmap outlines a systematic approach to asking and answering questions of cause-and-effect: define the quantity of interest, evaluate needed assumptions, conduct statistical estimation, and carefully interpret results. To protect research integrity, it is essential that the algorithm for statistical estimation and inference be pre-specified prior to conducting any effectiveness analyses. However, it is often unclear which algorithm will perform optimally for the real-data application. Instead, there is a temptation to simply implement one's favorite algorithm -- recycling prior code or relying on the default settings of a computing package. Here, we call for the use of simulations that realistically reflect the application, including key characteristics such as strong confounding and dependent or missing outcomes, to objectively compare candidate estimators and facilitate full specification of the Statistical Analysis Plan. Such simulations are informed by the Causal Roadmap and conducted after data collection but prior to effect estimation. We illustrate with two worked examples. First, in an observational longitudinal study, outcome-blind simulations are used to inform nuisance parameter estimation and variance estimation for longitudinal targeted minimum loss-based estimation (TMLE). Second, in a cluster randomized trial with missing outcomes, treatment-blind simulations are used to examine Type-I error control in Two-Stage TMLE. In both examples, realistic simulations empower us to pre-specify an estimation approach that is expected to have strong finite sample performance and also yield quality-controlled computing code for the actual analysis. Together, this process helps to improve the rigor and reproducibility of our research.
- [22] arXiv:2401.16567 (replaced) [pdf, ps, html, other]
-
Title: Parallel Affine Transformation Tuning of Markov Chain Monte Carlo
Comments: 37 pages, 9 figures, 10 tables
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
The performance of Markov chain Monte Carlo samplers strongly depends on the properties of the target distribution such as its covariance structure, the location of its probability mass and its tail behavior. We explore the use of bijective affine transformations of the sample space to improve the properties of the target distribution and thereby the performance of samplers running in the transformed space. In particular, we propose a flexible and user-friendly scheme for adaptively learning the affine transformation during sampling. Moreover, the combination of our scheme with Gibbsian polar slice sampling is shown to produce samples of high quality at comparatively low computational cost in several settings based on real-world data.
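The idea of running a sampler in an affinely transformed space can be illustrated with a minimal sketch. This is not the adaptive scheme proposed in the paper: the transformation here is fixed in advance, the sampler is plain random-walk Metropolis rather than Gibbsian polar slice sampling, and the target, `shift`, and `scale` values are assumptions chosen for illustration. If x = scale * z + shift, the density of z is the target density at x times |scale|, so a unit-step sampler in z-space can handle a target that is badly located and scaled in x-space.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of N(100, 0.01^2): far from 0 and very narrow."""
    return -0.5 * ((x - 100.0) / 0.01) ** 2


# Hypothetical fixed affine map x = SCALE * z + SHIFT (assumed known here;
# an adaptive scheme would learn these quantities during sampling).
SHIFT = 100.0
SCALE = 0.01


def log_transformed(z):
    """Log density of z = (x - SHIFT) / SCALE, including the Jacobian term."""
    return log_target(SCALE * z + SHIFT) + math.log(abs(SCALE))


def rw_metropolis(log_density, z0, n_steps, step, rng):
    """Random-walk Metropolis with Gaussian proposals of standard deviation `step`."""
    z = z0
    logp = log_density(z)
    chain = []
    for _ in range(n_steps):
        proposal = z + step * rng.gauss(0.0, 1.0)
        logp_proposal = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(z)).
        if math.log(1.0 - rng.random()) < logp_proposal - logp:
            z, logp = proposal, logp_proposal
        chain.append(z)
    return chain


# Sample in the transformed space, then map the draws back to the original space.
z_chain = rw_metropolis(log_transformed, 0.0, 5000, 1.0, random.Random(0))
x_chain = [SCALE * z + SHIFT for z in z_chain]
```

In the transformed space the target is approximately standard normal, so a proposal step of 1.0 works without problem-specific tuning; the same step size applied directly to `log_target` would reject almost every proposal.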
- [23] arXiv:2405.08177 (replaced) [pdf, ps, html, other]
-
Title: Parameter identifiability, parameter estimation and model prediction for differential equation models
Comments: 22 pages, 6 Figures
Subjects: Methodology (stat.ME)
Interpreting data with mathematical models is an important aspect of real-world applied mathematical modelling. Very often we are interested in understanding the extent to which a particular data set informs and constrains model parameters. This question is closely related to the concept of parameter identifiability, and in this article we present a series of computational exercises to introduce tools that can be used to assess parameter identifiability, estimate parameters and generate model predictions. Taking a likelihood-based approach, we show that very similar ideas and algorithms can be used to deal with a range of different mathematical modelling frameworks. The exercises and results presented in this article are supported by a suite of open access codes that can be accessed on GitHub.
- [24] arXiv:2206.12235 (replaced) [pdf, ps, html, other]
-
Title: Guided sequential ABC schemes for intractable Bayesian models
Comments: 47 pages, added new case study (Lotka-Volterra), see also Table 3 in supplementary
Subjects: Computation (stat.CO); Methodology (stat.ME)
Sequential algorithms such as sequential importance sampling (SIS) and sequential Monte Carlo (SMC) have proven fundamental in Bayesian inference for models not admitting a readily available likelihood function. For approximate Bayesian computation (ABC), SMC-ABC is the state-of-the-art sampler. However, since the ABC paradigm is intrinsically wasteful, sequential ABC schemes can benefit from well-targeted proposal samplers that efficiently avoid improbable parameter regions. We contribute to the ABC modeller's toolbox with novel proposal samplers that are conditional on summary statistics of the data. In a sense, the proposed parameters are "guided" to rapidly reach regions of the posterior surface that are compatible with the observed data. This speeds up the convergence of these sequential samplers, thus reducing the computational effort, while preserving the accuracy of the inference. We provide a variety of guided Gaussian and copula-based samplers for both SIS-ABC and SMC-ABC, easing inference for challenging case studies, including multimodal posteriors, highly correlated posteriors, hierarchical models with about 20 parameters, and a simulation study of cell movements using more than 400 summary statistics.
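To see why the ABC paradigm is described as wasteful, consider a minimal sketch of plain rejection ABC, the simplest ABC sampler. This is not one of the guided SIS-ABC or SMC-ABC samplers from the paper. The toy model (a normal mean with known unit variance), the uniform prior, the summary statistic, and the tolerance are all assumptions made for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumed for illustration): n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def rejection_abc(observed_summary, n_obs, n_proposals, tolerance, rng):
    """Keep prior draws whose simulated summary lands near the observed one."""
    accepted = []
    for _ in range(n_proposals):
        theta = rng.uniform(-5.0, 5.0)  # draw from an assumed Uniform(-5, 5) prior
        summary = statistics.fmean(simulate(theta, n_obs, rng))
        if abs(summary - observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(1.5, 50, rng)  # synthetic "observed" data with true mean 1.5
observed_summary = statistics.fmean(observed)

n_proposals = 5000
posterior_draws = rejection_abc(observed_summary, 50, n_proposals, 0.1, rng)
acceptance_rate = len(posterior_draws) / n_proposals
```

Because proposals come from the prior, most of them fall in regions incompatible with the data and are discarded, so `acceptance_rate` is small. Proposal samplers that condition on the observed summaries aim to reduce exactly this waste.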
- [25] arXiv:2207.09054 (replaced) [pdf, ps, html, other]
-
Title: Towards a Low-SWaP 1024-beam Digital Array: A 32-beam Sub-system at 5.8 GHz
Authors: Arjuna Madanayake, Viduneth Ariyarathna, Suresh Madishetty, Sravan Pulipati, R. J. Cintra, Diego Coelho, Raíza Oliveira, Fábio M. Bayer, Leonid Belostotski, Soumyajit Mandal, Theodore S. Rappaport
Comments: 22 pages, 8 figures, 3 tables. Fixed typo in Table 1
Journal-ref: IEEE Transactions on Antennas and Propagation, v. 68, n. 2, Feb. 2020
Subjects: Signal Processing (eess.SP); Systems and Control (eess.SY); Numerical Analysis (math.NA); Methodology (stat.ME)
Millimeter wave communications require multibeam beamforming in order to utilize wireless channels that suffer from obstructions, path loss, and multi-path effects. Digital multibeam beamforming has maximum degrees of freedom compared to analog phased arrays. However, circuit complexity and power consumption are important constraints for digital multibeam systems. A low-complexity digital computing architecture is proposed for a multiplication-free 32-point linear transform that approximates multiple simultaneous RF beams similar to a discrete Fourier transform (DFT). Arithmetic complexity due to multiplication is reduced from the FFT complexity of $\mathcal{O}(N \log N)$ for DFT realizations, down to zero, thus yielding a 46% and 55% reduction in chip area and dynamic power consumption, respectively, for the $N=32$ case considered. The paper describes the proposed 32-point DFT approximation targeting 1024 beams using a 2D array, and shows the multiplierless approximation and its mapping to a 32-beam sub-system consisting of 5.8 GHz antennas that can be used for generating 1024 digital beams without multiplications. Real-time beam computation is achieved using a Xilinx FPGA at 120 MHz bandwidth per beam. Theoretical beam performance is compared with measured RF patterns from both a fixed-point FFT and the proposed multiplier-free algorithm, and the two are in good agreement.
- [26] arXiv:2305.15988 (replaced) [pdf, ps, other]
-
Title: Non-Log-Concave and Nonsmooth Sampling via Langevin Monte Carlo Algorithms
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Computation (stat.CO); Methodology (stat.ME)
We study the problem of approximate sampling from non-log-concave distributions, e.g., Gaussian mixtures, which is often challenging even in low dimensions due to their multimodality. We focus on performing this task via Markov chain Monte Carlo (MCMC) methods derived from discretizations of the overdamped Langevin diffusion, which are commonly known as Langevin Monte Carlo algorithms. Furthermore, we are also interested in two nonsmooth cases for which a large class of proximal MCMC methods have been developed: (i) a nonsmooth prior is considered with a Gaussian mixture likelihood; (ii) a Laplacian mixture distribution. Such nonsmooth and non-log-concave sampling tasks arise from a wide range of applications to Bayesian inference and imaging inverse problems such as image deconvolution. We perform numerical simulations to compare the performance of the most commonly used Langevin Monte Carlo algorithms.
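As a concrete reference point, the simplest Langevin Monte Carlo algorithm is the unadjusted Langevin algorithm (ULA), the Euler-Maruyama discretization of the overdamped Langevin diffusion: x_{k+1} = x_k + h * grad log p(x_k) + sqrt(2h) * xi_k, with xi_k standard normal. The sketch below applies it to a one-dimensional, equal-weight Gaussian mixture. The mixture parameters, step size, and chain length are assumptions chosen for illustration, and the code is not taken from the paper, which compares a wider range of algorithms.

```python
import math
import random


def grad_log_density(x, means=(-2.0, 2.0), sigma=1.0):
    """Gradient of the log density of an equal-weight 1D Gaussian mixture."""
    # Component log-weights at x, shifted by their maximum for numerical stability.
    log_w = [-((x - m) ** 2) / (2.0 * sigma ** 2) for m in means]
    max_log_w = max(log_w)
    w = [math.exp(lw - max_log_w) for lw in log_w]
    # Responsibility-weighted average of the per-component gradients.
    weighted = sum(wi * (m - x) / sigma ** 2 for wi, m in zip(w, means))
    return weighted / sum(w)


def ula(grad, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm: gradient drift plus Gaussian noise, no accept/reject."""
    x = x0
    samples = []
    for _ in range(n_steps):
        x = x + step * grad(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


samples = ula(grad_log_density, 0.0, 0.05, 20000, random.Random(1))
second_moment = sum(s * s for s in samples) / len(samples)
```

For this mixture the exact second moment is 2^2 + 1^2 = 5, so `second_moment` gives a rough sanity check. ULA has no Metropolis correction, so its stationary distribution carries a discretization bias that shrinks with the step size.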
- [27] arXiv:2402.00358 (replaced) [pdf, ps, other]
-
Title: nhppp: Simulating Nonhomogeneous Poisson Point Processes in R
Comments: 32 pages, 7 figures, 6 tables, 8 algorithms
Subjects: Computation (stat.CO); Methodology (stat.ME)
We introduce the 'nhppp' package for fast, memory-efficient simulation of events from one-dimensional non-homogeneous Poisson point processes (NHPPPs) in R. We developed it to facilitate the sampling of event times in discrete event and statistical simulations. The package's functions are based on three algorithms that provably sample from a target NHPPP: the time-transformation of a homogeneous Poisson process (of intensity one) via the inverse of the integrated intensity function; the generation of a Poisson number of order statistics from a fixed density function; and the thinning of a majorizing NHPPP via an acceptance-rejection scheme. We present a study of the numerical accuracy and time performance of the algorithms. We illustrate its use with simple reproducible examples.
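The thinning algorithm mentioned above can be sketched in a few lines. The Python code below is an illustration only and is unrelated to the R package's actual functions or interface. It uses the special case of a constant majorizing intensity `lambda_max`; the function names and the example intensity 2 + sin(t) are assumptions made for this sketch.

```python
import math
import random


def sample_nhpp_thinning(intensity, lambda_max, t_start, t_end, rng):
    """Sample event times in (t_start, t_end] by thinning a homogeneous process.

    Assumes intensity(t) <= lambda_max for every t in the interval.
    """
    events = []
    t = t_start
    while True:
        # Next candidate from a homogeneous Poisson process of rate lambda_max.
        t += rng.expovariate(lambda_max)
        if t > t_end:
            break
        # Accept the candidate with probability intensity(t) / lambda_max.
        if rng.random() * lambda_max <= intensity(t):
            events.append(t)
    return events


def example_intensity(t):
    """Hypothetical intensity, bounded above by 3 for all t."""
    return 2.0 + math.sin(t)


rng = random.Random(0)
events = sample_nhpp_thinning(example_intensity, 3.0, 0.0, 10.0, rng)
```

The expected number of events equals the integrated intensity over the interval, here 20 + 1 - cos(10), roughly 21.8. A tighter majorizer rejects fewer candidates, which is why the general version of the algorithm thins a majorizing NHPPP rather than a constant-rate process.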
- [28] arXiv:2402.07868 (replaced) [pdf, ps, other]
-
Title: Nesting Particle Filters for Experimental Design in Dynamical Systems
Comments: Accepted to ICML 2024
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
In this paper, we propose a novel approach to Bayesian experimental design for non-exchangeable data that formulates it as risk-sensitive policy optimization. We develop the Inside-Out SMC$^2$ algorithm, a nested sequential Monte Carlo technique to infer optimal designs, and embed it into a particle Markov chain Monte Carlo framework to perform gradient-based policy amortization. Our approach is distinct from other amortized experimental design techniques, as it does not rely on contrastive estimators. Numerical validation on a set of dynamical systems showcases the efficacy of our method in comparison to other state-of-the-art strategies.
- [29] arXiv:2403.03208 (replaced) [pdf, ps, html, other]
-
Title: Active Statistical Inference
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Inspired by the concept of active learning, we propose active inference, a methodology for statistical inference with machine-learning-assisted data collection. Assuming a budget on the number of labels that can be collected, the methodology uses a machine learning model to identify which data points would be most beneficial to label, thus effectively utilizing the budget. It operates on a simple yet powerful intuition: prioritize the collection of labels for data points where the model exhibits uncertainty, and rely on the model's predictions where it is confident. Active inference constructs provably valid confidence intervals and hypothesis tests while leveraging any black-box machine learning model and handling any data distribution. The key point is that it achieves the same level of accuracy with far fewer samples than existing baselines relying on non-adaptively-collected data. This means that for the same number of collected samples, active inference enables smaller confidence intervals and more powerful p-values. We evaluate active inference on datasets from public opinion research, census analysis, and proteomics.