Quantitative Biology

New submissions

New submissions for Thu, 2 May 24

[1]  arXiv:2405.00070 [pdf, other]
Title: Bayesian-Guided Generation of Synthetic Microbiomes with Minimized Pathogenicity
Journal-ref: The 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE EMBC), 2024
Subjects: Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI)

Synthetic microbiomes offer new possibilities for modulating microbiota, to address the barriers in multidrug resistance (MDR) research. We present a Bayesian optimization approach to enable efficient searching over the space of synthetic microbiome variants to identify candidates predictive of reduced MDR. Microbiome datasets were encoded into a low-dimensional latent space using autoencoders. Sampling from this space allowed generation of synthetic microbiome signatures. Bayesian optimization was then implemented to select variants for biological screening to maximize identification of designs with restricted MDR pathogens based on minimal samples. Four acquisition functions were evaluated: expected improvement, upper confidence bound, Thompson sampling, and probability of improvement. Based on each strategy, synthetic samples were prioritized according to their MDR detection. Expected improvement, upper confidence bound, and probability of improvement consistently produced synthetic microbiome candidates with significantly fewer searches than Thompson sampling. By combining deep latent space mapping and Bayesian learning for efficient guided screening, this study demonstrated the feasibility of creating bespoke synthetic microbiomes with customized MDR profiles.
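
The four acquisition functions named in this abstract have standard textbook forms when the surrogate model returns a Gaussian posterior (mean and standard deviation) for each candidate. The following is a minimal sketch of those forms, not the authors' implementation: the function names, the `xi` and `kappa` parameters, and the convention of maximizing a score (for example, a negated MDR measure) are all assumptions made for illustration.

```python
import math
import random


def _norm_pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    # Standard normal cumulative distribution via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    # Expected amount by which a candidate exceeds the best score so far.
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * _norm_cdf(z) + sigma * _norm_pdf(z)


def probability_of_improvement(mu, sigma, best, xi=0.0):
    # Probability that a candidate exceeds the best score so far.
    if sigma <= 0.0:
        return 1.0 if mu > best + xi else 0.0
    return _norm_cdf((mu - best - xi) / sigma)


def upper_confidence_bound(mu, sigma, kappa=2.0):
    # Optimistic estimate: posterior mean plus kappa standard deviations.
    return mu + kappa * sigma


def thompson_sample(mu, sigma, rng):
    # One random draw from the candidate's Gaussian posterior.
    return rng.gauss(mu, sigma)


def pick_next(candidates, score):
    # Index of the candidate (mu, sigma) with the highest acquisition score.
    return max(range(len(candidates)), key=lambda i: score(*candidates[i]))
```

As a usage example, `pick_next([(0.2, 0.1), (0.1, 0.5)], lambda m, s: upper_confidence_bound(m, s))` selects the second, more uncertain candidate, which shows how the upper confidence bound trades predicted score against exploration.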

[2]  arXiv:2405.00128 [pdf, other]
Title: Target-Specific De Novo Peptide Binder Design with DiffPepBuilder
Subjects: Biomolecules (q-bio.BM)

Despite the exciting progress in target-specific de novo protein binder design, peptide binder design remains challenging due to the flexibility of peptide structures and the scarcity of protein-peptide complex structure data. In this study, we curated a large synthetic dataset, referred to as PepPC-F, from the abundant protein-protein interface data and developed DiffPepBuilder, a de novo target-specific peptide binder generation method that utilizes an SE(3)-equivariant diffusion model trained on PepPC-F to co-design peptide sequences and structures. DiffPepBuilder also introduces disulfide bonds to stabilize the generated peptide structures. We tested DiffPepBuilder on 30 experimentally verified strong peptide binders with available protein-peptide complex structures. DiffPepBuilder was able to effectively recall the native structures and sequences of the peptide ligands and to generate novel peptide binders with improved binding free energy. We subsequently conducted de novo generation case studies on three targets. In both the regeneration test and case studies, DiffPepBuilder outperformed AfDesign and RFdiffusion coupled with ProteinMPNN, in terms of sequence and structure recall, interface quality, and structural diversity. Molecular dynamics simulations confirmed that the introduction of disulfide bonds enhanced the structural rigidity and binding performance of the generated peptides. As a general peptide binder de novo design tool, DiffPepBuilder can be used to design peptide binders for given protein targets with three dimensional and binding site information.

[3]  arXiv:2405.00159 [pdf, other]
Title: Heterogeneity analysis provides evidence for a genetically homogeneous subtype of bipolar-disorder
Subjects: Genomics (q-bio.GN)

Bipolar disorder is a highly heritable brain disorder which affects an estimated 50 million people worldwide. Due to recent advances in genotyping technology and bioinformatics methodology, as well as the increase in the overall amount of available data, our understanding of the genetic underpinnings of BD has improved. A growing consensus is that BD is polygenic and heterogeneous, but the specifics of that heterogeneity are not yet well understood. Here we use a recently developed technique to investigate the genetic heterogeneity of bipolar disorder. We find strong statistical evidence for a 'bicluster': a subset of bipolar subjects that exhibits a disease-specific genetic pattern. The structure illuminated by this bicluster replicates in several other data-sets and can be used to improve BD risk-prediction algorithms. We believe that this bicluster is likely to correspond to a genetically-distinct subtype of BD. More generally, we believe that our biclustering approach is a promising means of untangling the underlying heterogeneity of complex disease without the need for reliable subphenotypic data.

[4]  arXiv:2405.00255 [pdf, other]
Title: Reliability and predictability of phenotype information from functional connectivity in large imaging datasets
Subjects: Neurons and Cognition (q-bio.NC)

One of the central objectives of contemporary neuroimaging research is to create predictive models that can disentangle the connection between patterns of functional connectivity across the entire brain and various behavioral traits. Previous studies have shown that models trained to predict behavioral features from the individual's functional connectivity have modest to poor performance. In this study, we trained models that predict observable individual traits (phenotypes) and their corresponding singular value decomposition (SVD) representations - herein referred to as latent phenotypes - from resting-state functional connectivity. For this task, we predicted phenotypes in two large neuroimaging datasets: the Human Connectome Project (HCP) and the Philadelphia Neurodevelopmental Cohort (PNC). We illustrate the importance of regressing out confounds, which could significantly influence phenotype prediction. Our findings reveal that both phenotypes and their corresponding latent phenotypes yield similar predictive performance. Interestingly, only the first five latent phenotypes were reliably identified, and using just these reliable phenotypes for predicting phenotypes yielded a similar performance to using all latent phenotypes. This suggests that the predictable information is present in the first latent phenotypes, allowing the remainder to be filtered out without any harm in performance. This study sheds light on the intricate relationship between functional connectivity and the predictability and reliability of phenotypic information, with potential implications for enhancing predictive modeling in the realm of neuroimaging research.
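
To make the idea of an SVD-derived latent phenotype concrete, the sketch below extracts only the leading singular triplet of a subjects-by-phenotypes matrix. It is an illustration under stated assumptions rather than the study's pipeline: power iteration is swapped in for a full SVD to keep the example dependency-free, the column centering is an assumed preprocessing step, and the function name `leading_latent_phenotype` is hypothetical.

```python
import math


def leading_latent_phenotype(matrix, iterations=200):
    """Leading singular value, loadings, and per-subject scores.

    `matrix` is a list of rows (subjects), each a list of phenotype values.
    Power iteration on X^T X is used here instead of a full SVD.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Center each phenotype column (assumed preprocessing step).
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    x = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]

    # Asymmetric start vector; a start exactly orthogonal to the leading
    # direction would not converge, which this choice makes unlikely.
    v = [float(j + 1) for j in range(n_cols)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]

    for _ in range(iterations):
        u = [sum(r[j] * v[j] for j in range(n_cols)) for r in x]  # X v
        w = [sum(x[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # X^T X v
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # all-constant data: no variance to explain
        v = [c / norm for c in w]

    # Scores are the projections of each subject onto the loadings.
    scores = [sum(r[j] * v[j] for j in range(n_cols)) for r in x]
    sigma = math.sqrt(sum(s * s for s in scores))
    return sigma, v, scores
```

The returned `scores` play the role of one latent phenotype value per subject; further components would require deflating the matrix and repeating the iteration.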

[5]  arXiv:2405.00333 [pdf, other]
Title: Reevaluating coexistence and stability in ecosystem networks to address ecological transients: methods and implications
Subjects: Populations and Evolution (q-bio.PE); Applications (stat.AP)

Representing ecosystems at equilibrium has been foundational for building ecological theories, forecasting species populations and planning conservation actions. The equilibrium "balance of nature" ideal suggests that populations will eventually stabilise to a coexisting balance of species. However, a growing body of literature argues that the equilibrium ideal is inappropriate for ecosystems. Here, we develop and demonstrate a new framework for representing ecosystems without considering equilibrium dynamics. Instead, far more pragmatic ecosystem models are constructed by considering population trajectories, regardless of whether they exhibit equilibrium or transient (i.e. non-equilibrium) behaviour. This novel framework maximally utilises readily available, but often overlooked, knowledge from field observations and expert elicitation, rather than relying on theoretical ecosystem properties. We developed innovative Bayesian algorithms to generate ecosystem models in this new statistical framework, without excessive computational burden. Our results reveal that our pragmatic framework could have a dramatic impact on conservation decision-making and enhance the realism of ecosystem models and forecasts.

[6]  arXiv:2405.00513 [pdf, ps, other]
Title: 3D MR Fingerprinting for Dynamic Contrast-Enhanced Imaging of Whole Mouse Brain
Subjects: Quantitative Methods (q-bio.QM)

Quantitative MRI enables direct quantification of contrast agent concentrations in contrast-enhanced scans. However, the lengthy scan times required by conventional methods are inadequate for tracking contrast agent transport dynamically in mouse brain. We developed a 3D MR fingerprinting (MRF) method for simultaneous T1 and T2 mapping across the whole mouse brain with 4.3-min temporal resolution. We designed a 3D MRF sequence with variable acquisition segment lengths and magnetization preparations on a 9.4T preclinical MRI scanner. Model-based reconstruction approaches were employed to improve the accuracy and speed of MRF acquisition. The method's accuracy for T1 and T2 measurements was validated in vitro, while its repeatability of T1 and T2 measurements was evaluated in vivo (n=3). The utility of the 3D MRF sequence for dynamic tracking of intracisternally infused Gd-DTPA in the whole mouse brain was demonstrated (n=5). Phantom studies confirmed accurate T1 and T2 measurements by 3D MRF with an undersampling factor up to 48. Dynamic contrast-enhanced (DCE) MRF scans achieved a spatial resolution of 192 x 192 x 500 µm³ and a temporal resolution of 4.3 min, allowing for the analysis and comparison of dynamic changes in concentration and transport kinetics of intracisternally infused Gd-DTPA across brain regions. The sequence also enabled highly repeatable, high-resolution T1 and T2 mapping of the whole mouse brain (192 x 192 x 250 µm³) in 30 min. We present the first dynamic and multi-parametric approach for quantitatively tracking contrast agent transport in the mouse brain using 3D MRF.

[7]  arXiv:2405.00530 [pdf, ps, other]
Title: The Highly Durable Antibacterial Gel-like Coatings for Textiles
Subjects: Tissues and Organs (q-bio.TO)

Hospital-acquired infections are considered a priority for public health systems and pose a significant burden on society. High-touch surfaces of healthcare centers, including textiles, provide a suitable environment for pathogenic bacteria to grow, necessitating the incorporation of effective antibacterial agents into textiles. This paper introduces a highly durable antibacterial gel-like solution, Silver Shell finish, which contains chitosan-bound silver chloride microparticles. The study investigates the coating's environmental impact, health risks, and durability during repeated washing. The structure of the Silver Shell finish was studied using Transmission Electron Microscopy (TEM) and Energy-Dispersive X-ray Spectroscopy (EDX). TEM images showed a core-shell structure, with chitosan forming a protective shell around groupings of silver microparticles. Field Emission Scanning Electron Microscopy (FESEM) demonstrated the uniform deposition of Silver Shell on the surface of fabrics. AATCC Test Method 100 was employed to quantitatively analyze the antibacterial properties of fabrics coated with silver microparticles. Two types of bacteria, Staphylococcus aureus (S. aureus) and Escherichia coli (E. coli), were used in this study. The antibacterial results showed a 100% reduction for both S. aureus and E. coli after 75 wash cycles in the samples coated using crosslinking agents. The coated samples without a crosslinking agent exhibited a 99.88% and 99.81% reduction for S. aureus and E. coli, respectively, after 50 washing cycles. AATCC-147 was performed to investigate the coated samples' leaching properties and the crosslinking agent's effect against S. aureus and E. coli. All coated samples demonstrated remarkable antibacterial efficacy even after 75 wash cycles.

[8]  arXiv:2405.00541 [pdf, other]
Title: New Trends on the Systems Approach to Modeling SARS-CoV-2 Pandemics in a Globally Connected Planet
Subjects: Populations and Evolution (q-bio.PE); Adaptation and Self-Organizing Systems (nlin.AO); Physics and Society (physics.soc-ph)

This paper presents a critical analysis of the literature and prospective research ideas for modeling the epidemics caused by the SARS-CoV-2 virus. It goes beyond deterministic population dynamics to consider several key complexity features of the system under consideration, in particular the multiscale features of the dynamics, from contagion to the subsequent competition between the immune system and the proliferating virus. Other topics addressed in this work include the propagation of epidemics in a territory, taking into account local transportation networks, the heterogeneity of the population, and the study of social and economic problems in populations involved in the spread of epidemics. The overall content aims to show how new mathematical tools can be developed to address the above topics and how mathematical models and simulations can contribute to the decision making of crisis managers.

Cross-lists for Thu, 2 May 24

[9]  arXiv:2405.00129 (cross-list from cs.SI) [pdf, other]
Title: Complex contagions can outperform simple contagions for network reconstruction with dense networks or saturated dynamics
Comments: 8 pages, 5 figures
Subjects: Social and Information Networks (cs.SI); Populations and Evolution (q-bio.PE); Machine Learning (stat.ML)

Network scientists often use complex dynamic processes to describe network contagions, but tools for fitting contagion models typically assume simple dynamics. Here, we address this gap by developing a nonparametric method to reconstruct a network and dynamics from a series of node states, using a model that breaks the dichotomy between simple pairwise and complex neighborhood-based contagions. We then show that a network is more easily reconstructed when observed through the lens of complex contagions if it is dense or the dynamic saturates, and that simple contagions are better otherwise.
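
The dichotomy this abstract refers to can be illustrated with the two textbook extremes. The sketch below is not the paper's model, which is described as interpolating between these cases; the function names, the per-contact probability `beta`, and the fractional `threshold` rule are assumptions chosen for illustration.

```python
def simple_contagion_prob(infected_neighbors, beta):
    # Simple (pairwise) contagion: each infected neighbor independently
    # transmits with probability beta, so exposures combine as
    # 1 - (1 - beta)^m for m infected neighbors.
    return 1.0 - (1.0 - beta) ** infected_neighbors


def threshold_contagion_prob(infected_neighbors, degree, threshold, beta):
    # Complex (neighborhood-based) contagion, threshold variant: infection
    # is possible only once the infected fraction of the neighborhood
    # reaches `threshold`, and then occurs with probability beta.
    if degree == 0:
        return 0.0
    fraction = infected_neighbors / degree
    return beta if fraction >= threshold else 0.0
```

For a node with four neighbors, `simple_contagion_prob(1, 0.5)` is already 0.5, whereas `threshold_contagion_prob(1, 4, 0.5, 0.9)` is 0.0 until at least two neighbors are infected; that difference in how exposures combine is what distinguishes the two families.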

[10]  arXiv:2405.00166 (cross-list from cs.LG) [pdf, other]
Title: Discovering intrinsic multi-compartment pharmacometric models using Physics Informed Neural Networks
Comments: Accepted into the International conference on Scientific Computation and Machine Learning 2024 (SCML 2024)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Quantitative Methods (q-bio.QM)

Pharmacometric models are pivotal across drug discovery and development, playing a decisive role in determining the progression of candidate molecules. However, the derivation of mathematical equations governing the system is a labor-intensive trial-and-error process, often constrained by tight timelines. In this study, we introduce PKINNs, a novel purely data-driven pharmacokinetic-informed neural network model. PKINNs efficiently discovers and models intrinsic multi-compartment-based pharmacometric structures, reliably forecasting their derivatives. The resulting models are both interpretable and explainable through Symbolic Regression methods. Our computational framework demonstrates the potential for closed-form model discovery in pharmacometric applications, addressing the labor-intensive nature of traditional model derivation. With the increasing availability of large datasets, this framework holds the potential to significantly enhance model-informed drug discovery.
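
For readers unfamiliar with multi-compartment structures, the following sketch integrates a generic two-compartment pharmacokinetic model with first-order transfer and elimination. It shows the kind of hand-written equations such models consist of and is not PKINNs itself: the rate-constant names `k10`, `k12`, `k21`, the function name, and the choice of a classical Runge-Kutta integrator are assumptions for illustration.

```python
def two_compartment(a1, a2, k10, k12, k21, dt, steps):
    """Integrate drug amounts in a central (a1) and peripheral (a2) compartment.

    k10: elimination rate from the central compartment
    k12: transfer rate central -> peripheral
    k21: transfer rate peripheral -> central
    Returns a list of (time, a1, a2) tuples, computed with fixed-step RK4.
    """

    def deriv(x1, x2):
        d1 = -(k10 + k12) * x1 + k21 * x2
        d2 = k12 * x1 - k21 * x2
        return d1, d2

    trajectory = [(0.0, a1, a2)]
    for i in range(1, steps + 1):
        s1 = deriv(a1, a2)
        s2 = deriv(a1 + 0.5 * dt * s1[0], a2 + 0.5 * dt * s1[1])
        s3 = deriv(a1 + 0.5 * dt * s2[0], a2 + 0.5 * dt * s2[1])
        s4 = deriv(a1 + dt * s3[0], a2 + dt * s3[1])
        a1 += dt * (s1[0] + 2.0 * s2[0] + 2.0 * s3[0] + s4[0]) / 6.0
        a2 += dt * (s1[1] + 2.0 * s2[1] + 2.0 * s3[1] + s4[1]) / 6.0
        trajectory.append((i * dt, a1, a2))
    return trajectory
```

A quick sanity check on any such model is mass balance: with `k10` set to zero the total amount `a1 + a2` should stay constant, and with a positive `k10` it should decrease over time.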

[11]  arXiv:2405.00184 (cross-list from cs.LG) [pdf, other]
Title: Semi-Supervised Hierarchical Multi-Label Classifier Based on Local Information
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)

Scarcity of labeled data is a common problem in supervised classification, since hand-labeling can be time-consuming, expensive, or difficult; on the other hand, large amounts of unlabeled information can be found. The problem of scarcity of labeled data is even more acute in hierarchical classification, because the data of a node is split among its children, which results in few instances associated with the deepest nodes of the hierarchy. In this work, we propose the semi-supervised hierarchical multi-label classifier based on local information (SSHMC-BLI), which can be trained with labeled and unlabeled data to perform hierarchical classification tasks. The method can be applied to any type of hierarchical problem; here we focus on the most difficult case: hierarchies of DAG type, where the instances can be associated with multiple paths of labels which can finish in an internal node. SSHMC-BLI builds pseudo-labels for each unlabeled instance from the paths of labels of its labeled neighbors, while it considers whether the unlabeled instance is similar to its neighbors. Experiments on 12 challenging datasets from functional genomics show that making use of unlabeled along with labeled data can help to improve the performance of a supervised hierarchical classifier trained only on labeled data, even reaching statistical significance.

[12]  arXiv:2405.00202 (cross-list from cs.LG) [pdf, other]
Title: Leveraging Active Subspaces to Capture Epistemic Model Uncertainty in Deep Generative Models for Molecular Design
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

Deep generative models have been accelerating the inverse design process in material and drug design. Unlike their counterpart property predictors in typical molecular design frameworks, generative molecular design models have seen fewer efforts on uncertainty quantification (UQ) due to computational challenges in Bayesian inference posed by their large number of parameters. In this work, we focus on the junction-tree variational autoencoder (JT-VAE), a popular model for generative molecular design, and address this issue by leveraging the low dimensional active subspace to capture the uncertainty in the model parameters. Specifically, we approximate the posterior distribution over the active subspace parameters to estimate the epistemic model uncertainty in an extremely high dimensional parameter space. The proposed UQ scheme does not require alteration of the model architecture, making it readily applicable to any pre-trained model. Our experiments demonstrate the efficacy of the AS-based UQ and its potential impact on molecular optimization by exploring the model diversity under epistemic uncertainty.

[13]  arXiv:2405.00577 (cross-list from cs.LG) [pdf, ps, other]
Title: Discovering robust biomarkers of neurological disorders from functional MRI using graph neural networks: A Review
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP); Neurons and Cognition (q-bio.NC)

Graph neural networks (GNN) have emerged as a popular tool for modelling functional magnetic resonance imaging (fMRI) datasets. Many recent studies have reported significant improvements in disorder classification performance via more sophisticated GNN designs and highlighted salient features that could be potential biomarkers of the disorder. In this review, we provide an overview of how GNN and model explainability techniques have been applied on fMRI datasets for disorder prediction tasks, with a particular emphasis on the robustness of biomarkers produced for neurodegenerative diseases and neuropsychiatric disorders. We found that while most studies have performant models, salient features highlighted in these studies vary greatly across studies on the same disorder and little has been done to evaluate their robustness. To address these issues, we suggest establishing new standards that are based on objective evaluation metrics to determine the robustness of these potential biomarkers. We further highlight gaps in the existing literature and put together a prediction-attribution-evaluation framework that could set the foundations for future research on improving the robustness of potential biomarkers discovered via GNNs.

Replacements for Thu, 2 May 24

[14]  arXiv:2311.07338 (replaced) [pdf, other]
Title: A mathematical model of the visual MacKay effect
Subjects: Optimization and Control (math.OC); Analysis of PDEs (math.AP); Numerical Analysis (math.NA); Neurons and Cognition (q-bio.NC)
[15]  arXiv:2401.16544 (replaced) [pdf, other]
Title: Stochastic Distinguishability of Markovian Trajectories
Subjects: Statistical Mechanics (cond-mat.stat-mech); Biological Physics (physics.bio-ph); Quantitative Methods (q-bio.QM)