Quantitative Biology

New submissions

[ total of 28 entries: 1-28 ]

New submissions for Mon, 6 May 24

[1]  arXiv:2405.01616 [pdf, other]
Title: Generative Active Learning for the Search of Small-molecule Protein Binders
Subjects: Biomolecules (q-bio.BM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecules to discover candidates with a desired property. We apply LambdaZero with molecular docking to design novel small molecules that inhibit the enzyme soluble Epoxide Hydrolase 2 (sEH), while enforcing constraints on synthesizability and drug-likeness. LambdaZero provides an exponential speedup in terms of the number of calls to the expensive molecular docking oracle, and LambdaZero de novo designed molecules reach docking scores that would otherwise require the virtual screening of a hundred billion molecules. Importantly, LambdaZero discovers novel scaffolds of synthesizable, drug-like inhibitors for sEH. In in vitro experimental validation, a series of ligands from a generated quinazoline-based scaffold were synthesized, and the lead inhibitor N-(4,6-di(pyrrolidin-1-yl)quinazolin-2-yl)-N-methylbenzamide (UM0152893) displayed sub-micromolar enzyme inhibition of sEH.

[2]  arXiv:2405.01664 [pdf, ps, other]
Title: Stress-induced Eukaryotic Translational Regulatory Mechanisms
Comments: 37 pages, 7 figures, review article
Subjects: Molecular Networks (q-bio.MN)

The eukaryotic protein synthesis process entails intricate stages governed by diverse mechanisms to tightly regulate translation. Translational regulation during stress is pivotal for maintaining cellular homeostasis, ensuring the accurate expression of essential proteins crucial for survival. This selective translational control mechanism is integral to cellular adaptation and resilience under adverse conditions. This review manuscript explores various mechanisms involved in selective translational regulation, focusing on mRNA-specific and global regulatory processes. Key aspects of translational control include translation initiation, which is often a rate-limiting step, and involves the formation of the eIF4F complex and recruitment of mRNA to ribosomes. Regulation of translation initiation factors, such as eIF4E, eIF4E2, and eIF2, through phosphorylation and interactions with binding proteins, modulates translation efficiency under stress conditions. This review also highlights the control of translation initiation through factors like the eIF4F complex and the ternary complex and underscores the importance of eIF2α phosphorylation in stress granule formation and cellular stress responses. Additionally, the impact of amino acid deprivation, mTOR signaling, and ribosome biogenesis on translation regulation and cellular adaptation to stress is discussed. Understanding the intricate mechanisms of translational regulation during stress provides insights into cellular adaptation mechanisms and potential therapeutic targets for various diseases, offering valuable avenues for addressing conditions associated with dysregulated protein synthesis.

[3]  arXiv:2405.01703 [pdf, other]
Title: Fluid-structure interaction simulations for the prediction of fractional flow reserve in pediatric patients with anomalous aortic origin of a coronary artery
Subjects: Tissues and Organs (q-bio.TO); Medical Physics (physics.med-ph)

Computer simulations of blood flow in patients with anomalous aortic origin of a coronary artery (AAOCA) have the promise to provide insight into this complex disease. They provide an in-silico experimental platform to explore possible mechanisms of myocardial ischemia, a potentially deadly complication for patients with this defect. This paper focuses on the question of model calibration for fluid-structure interaction models of pediatric AAOCA patients. Imaging and cardiac catheterization data provide partial information for model construction and calibration. However, parameters for downstream boundary conditions needed for these models are difficult to estimate. Further, important model predictions, like fractional flow reserve (FFR), are sensitive to these parameters. We describe an approach to calibrate downstream boundary condition parameters to clinical measurements of resting FFR. The calibrated models are then used to predict FFR at stress, an invasively measured quantity that can be used in the clinical evaluation of these patients. We find reasonable agreement between the model predicted and clinically measured FFR at stress, indicating the credibility of this modeling framework for predicting hemodynamics of pediatric AAOCA patients. This approach could lead to important clinical applications since it may serve as a tool for risk stratifying children with AAOCA.

[4]  arXiv:2405.01715 [pdf, other]
Title: Identification of SNPs in genomes using GRAMEP, an alignment-free method based on the Principle of Maximum Entropy
Subjects: Genomics (q-bio.GN); Information Theory (cs.IT); Applications (stat.AP)

Advances in high-throughput sequencing technologies provide a large number of genomes to be analyzed, so computational methodologies play a crucial role in analyzing and extracting knowledge from the data generated. Investigating genomic mutations is critical because of their impact on chromosomal evolution, genetic disorders, and diseases. Sequence alignment is commonly used to analyze genomic variations; however, this approach can be computationally expensive and potentially arbitrary in scenarios with large datasets. Here, we present a novel method for identifying single nucleotide polymorphisms (SNPs) in DNA sequences from assembled genomes. This method uses the principle of maximum entropy to select the most informative k-mers specific to the variant under investigation. The use of this informative k-mer set enables the detection of variant-specific mutations in comparison to a reference sequence. In addition, our method offers the possibility of classifying novel sequences with no need for organism-specific information. GRAMEP demonstrated high accuracy in both in silico simulations and analyses of real viral genomes, including Dengue, HIV, and SARS-CoV-2. Our approach maintained accurate SARS-CoV-2 variant identification while demonstrating a lower computational cost compared to the gold-standard statistical tools. The source code for this proof-of-concept implementation is freely available at https://github.com/omatheuspimenta/GRAMEP.

[5]  arXiv:2405.01896 [pdf, ps, other]
Title: A Step Test to Evaluate the Susceptibility to Severe High-Altitude Illness in Field Conditions
Authors: Eric Hermand (URePSSS, H&P), Léo Lesaint, Laura Denis (H&P), Jean-Paul Richalet (INSEP), François Lhuissier (H&P)
Comments: High Altitude Medicine and Biology, 2024
Subjects: Tissues and Organs (q-bio.TO)

A laboratory-based hypoxic exercise test, performed on a cycle ergometer, can be used to predict susceptibility to severe high-altitude illness (SHAI) through the calculation of a clinicophysiological SHAI score. Our objective was to design a field-condition test and compare its derived SHAI score and various physiological parameters, such as peripheral oxygen saturation (SpO2), and cardiac and ventilatory responses to hypoxia during exercise (HCRe and HVRe, respectively), to the laboratory test. A group of 43 healthy subjects (15 females and 28 males), with no prior experience at high altitude, performed a hypoxic cycle ergometer test (simulated altitude of 4,800 m) and step tests (20 cm high step) at 3,000, 4,000, and 4,800 m simulated altitudes. According to tested altitudes, differences were observed in O2 desaturation, heart rate, and minute ventilation (p < 0.001), whereas the computed HCRe and HVRe were not different (p = 0.075 and p = 0.203, respectively). From the linear relationships between the step test and SHAI scores, we defined a risk zone, allowing us to evaluate the risk of developing SHAI and take adequate preventive measures in field conditions, from the calculated step test score for the given altitude. The predictive value of this new field test remains to be validated in real high-altitude conditions.

[6]  arXiv:2405.02038 [pdf, other]
Title: Dimensionality reduction of neuronal degeneracy reveals two interfering physiological mechanisms
Subjects: Neurons and Cognition (q-bio.NC); Mathematical Physics (math-ph); Cell Behavior (q-bio.CB)

Neuronal systems maintain stable functions despite large variability in their physiological components. Ion channel expression, in particular, is highly variable in neurons exhibiting similar electrophysiological phenotypes, which poses questions regarding how specific ion channel subsets reliably shape neuron intrinsic properties. Here, we use detailed conductance-based modeling to explore the origin of stable neuronal function from variable channel composition. Using dimensionality reduction, we uncover two principal dimensions in the channel conductance space that capture most of the variance of the observed variability. Those two dimensions correspond to two physiologically relevant sources of variability that can be explained by feedback mechanisms underlying regulation of neuronal activity, providing quantitative insights into how channel composition links to neuronal electrophysiological activity. These insights allowed us to understand and design a model-independent, reliable neuromodulation rule for variable neuronal populations.

[7]  arXiv:2405.02076 [pdf, other]
Title: SCIMAP: A Python Toolkit for Integrated Spatial Analysis of Multiplexed Imaging Data
Comments: 6 pages, 1 figure
Subjects: Quantitative Methods (q-bio.QM); Tissues and Organs (q-bio.TO)

Multiplexed imaging data are revolutionizing our understanding of the composition and organization of tissues and tumors. A critical aspect of such tissue profiling is quantifying the spatial relationships among cells at different scales, from the interaction of neighboring cells to recurrent communities of cells of multiple types. This often involves statistical analysis of 10^7 or more cells in which up to 100 biomolecules (commonly proteins) have been measured. While software tools currently cater to the analysis of spatial transcriptomics data, there remains a need for toolkits explicitly tailored to the complexities of multiplexed imaging data, including the need to seamlessly integrate image visualization with data analysis and exploration. We introduce SCIMAP, a Python package specifically crafted to address these challenges. With SCIMAP, users can efficiently preprocess, analyze, and visualize large datasets, facilitating the exploration of spatial relationships and their statistical significance. SCIMAP's modular design enables the integration of new algorithms, enhancing its capabilities for spatial analysis.

[8]  arXiv:2405.02117 [pdf, ps, other]
Title: Multi-grid reaction-diffusion master equation: applications to morphogen gradient modelling
Subjects: Quantitative Methods (q-bio.QM); Biological Physics (physics.bio-ph)

The multi-grid reaction-diffusion master equation (mgRDME) provides a generalization of stochastic compartment-based reaction-diffusion modelling described by the standard reaction-diffusion master equation (RDME). By enabling different resolutions on lattices for biochemical species with different diffusion constants, the mgRDME approach improves both accuracy and efficiency of compartment-based reaction-diffusion simulations. The mgRDME framework is examined through its application to morphogen gradient formation in stochastic reaction-diffusion scenarios, using both an analytically tractable first-order reaction network and a model with a second-order reaction. The results obtained by the mgRDME modelling are compared with the standard RDME model and with the (more detailed) particle-based Brownian dynamics simulations. The dependence of error and numerical cost on the compartment sizes is defined and investigated through a multi-objective optimization problem.

[9]  arXiv:2405.02136 [pdf, other]
Title: bio2Byte Tools deployment as a Python package and Galaxy tool to predict protein biophysical properties
Subjects: Quantitative Methods (q-bio.QM)

We introduce a unified Python package for the prediction of protein biophysical properties, streamlining previous tools developed by the Bio2Byte research group. This suite facilitates comprehensive assessments of protein characteristics, incorporating predictors for backbone and sidechain dynamics, local secondary structure propensities, early folding, long disorder, beta-sheet aggregation and FUS-like phase separation. Our package significantly eases the integration and execution of these tools, enhancing accessibility for both computational and experimental researchers.

Cross-lists for Mon, 6 May 24

[10]  arXiv:2405.01554 (cross-list from cs.LG) [pdf, other]
Title: Early-stage detection of cognitive impairment by hybrid quantum-classical algorithm using resting-state functional MRI time-series
Comments: 28 pages, 10 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)

Following the recent development of quantum machine learning techniques, the literature has reported several quantum machine learning algorithms for disease detection. This study explores the application of a hybrid quantum-classical algorithm for classifying region-of-interest time-series data obtained from resting-state functional magnetic resonance imaging in patients with early-stage cognitive impairment based on the importance of cognitive decline for dementia or aging. Classical one-dimensional convolutional layers are used together with quantum convolutional neural networks in our hybrid algorithm. In the classical simulation, the proposed hybrid algorithms showed higher balanced accuracies than classical convolutional neural networks under similar training conditions. Moreover, a total of nine brain regions (left precentral gyrus, right superior temporal gyrus, left rolandic operculum, right rolandic operculum, left parahippocampus, right hippocampus, left medial frontal gyrus, right cerebellum crus, and cerebellar vermis) among 116 brain regions were found to be relatively effective brain regions for the classification based on the model performances. The associations of the selected nine regions with cognitive decline, as found in previous studies, were additionally validated through seed-based functional connectivity analysis. We confirmed both the improvement of model performance with the quantum convolutional neural network and the neuroscientific validity of the brain regions from our hybrid quantum-classical model.

[11]  arXiv:2405.01960 (cross-list from cond-mat.soft) [pdf, other]
Title: Proliferation-driven mechanical feedback regulates cell dynamics in growing tissues
Comments: 5 figures. arXiv admin note: text overlap with arXiv:2202.04806
Subjects: Soft Condensed Matter (cond-mat.soft); Statistical Mechanics (cond-mat.stat-mech); Biological Physics (physics.bio-ph); Cell Behavior (q-bio.CB)

Local stresses in a tissue, a collective property, regulate cell division and apoptosis. In turn, cell growth and division induce active stresses in the tissue. As a consequence, there is a feedback between cell growth and local stresses. However, how the cell dynamics depend on local stress-dependent cell division and the feedback strength is not fully understood. Here, we probe the consequences of stress-mediated growth and cell division on cell dynamics using agent-based simulations of a two-dimensional growing tissue. We discover a rich dynamical behavior of individual cells, ranging from jamming (mean square displacement, $\Delta (t) \sim t^{\alpha}$ with $\alpha$ less than unity), to hyperdiffusion ($\alpha > 2$) depending on cell division rate and the strength of the mechanical feedback. Strikingly, $\Delta (t)$ is determined by the tissue growth law, which quantifies cell proliferation (number of cells $N(t)$ as a function of time). The growth law ($N(t) \sim t^{\lambda}$ at long times) is regulated by the critical pressure that controls the strength of the mechanical feedback and the ratio between cell division-apoptosis rates. We show that $\lambda \sim \alpha$, which implies that higher growth rate leads to a greater degree of cell migration. The variations in cell motility are linked to the emergence of highly persistent forces extending over several cell cycle times. Our predictions are testable using cell-tracking imaging techniques.

[12]  arXiv:2405.01974 (cross-list from cs.LG) [pdf, other]
Title: Multitask Extension of Geometrically Aligned Transfer Encoder
Comments: 7 pages, 3 figures, 2 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Quantitative Methods (q-bio.QM)

Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transfer Encoder (GATE), to a multi-task setup. Thus, we connect multiple molecular tasks by aligning the curved coordinates onto locally flat coordinates, ensuring the flow of information from source tasks to support performance on target data.

[13]  arXiv:2405.01983 (cross-list from cs.AI) [pdf, other]
Title: Model-based reinforcement learning for protein backbone design
Subjects: Artificial Intelligence (cs.AI); Biomolecules (q-bio.BM)

Designing protein nanomaterials of predefined shape and characteristics has the potential to dramatically impact the medical industry. Machine learning (ML) has proven successful in protein design, reducing the need for expensive wet lab experiment rounds. However, challenges persist in efficiently exploring the protein fitness landscapes to identify optimal protein designs. In response, we propose the use of AlphaZero to generate protein backbones, meeting shape and structural scoring requirements. We extend an existing Monte Carlo tree search (MCTS) framework by incorporating a novel threshold-based reward and secondary objectives to improve design precision. This innovation considerably outperforms existing approaches, leading to protein backbones that better respect structural scores. The application of AlphaZero is novel in the context of protein backbone design and demonstrates promising performance. AlphaZero consistently surpasses baseline MCTS by more than 100% in top-down protein design tasks. Additionally, our application of AlphaZero with secondary objectives uncovers further promising outcomes, indicating the potential of model-based reinforcement learning (RL) in navigating the intricate and nuanced aspects of protein design.

[14]  arXiv:2405.01989 (cross-list from math.OC) [pdf, other]
Title: Parameter estimation in ODEs: assessing the potential of local and global solvers
Subjects: Optimization and Control (math.OC); Quantitative Methods (q-bio.QM)

We consider the problem of parameter estimation in dynamic systems described by ordinary differential equations. A review of the existing literature emphasizes the need for deterministic global optimization methods due to the nonconvex nature of these problems. Recent works have focused on expanding the capabilities of specialized deterministic global optimization algorithms to handle more complex problems. Despite advancements, current deterministic methods are limited to problems with a maximum of around five state and five decision variables, prompting ongoing efforts to enhance their applicability to practical problems.
Our study seeks to assess the effectiveness of state-of-the-art general-purpose global and local solvers in handling realistic-sized problems efficiently, and to evaluate their capabilities to cope with the nonconvex nature of the underlying estimation problems.

[15]  arXiv:2405.02003 (cross-list from cond-mat.stat-mech) [pdf, ps, other]
Title: Smoothly vanishing density in the contact process by an interplay of disorder and long-distance dispersal
Comments: 8 pages, 6 figures
Subjects: Statistical Mechanics (cond-mat.stat-mech); Disordered Systems and Neural Networks (cond-mat.dis-nn); Populations and Evolution (q-bio.PE)

Realistic modeling of ecological population dynamics requires spatially explicit descriptions that can take into account spatial heterogeneity as well as long-distance dispersal. Here, we present Monte Carlo simulations and numerical renormalization group results for the paradigmatic model, the contact process, in the combined presence of these factors in both one- and two-dimensional systems. Our results confirm our analytic arguments stating that the density vanishes smoothly at the extinction threshold, in a way characteristic of infinite-order transitions. This extremely smooth vanishing of the global density entails an enhanced exposure of the population to extinction events. At the same time, a reverse order parameter, the local persistence, displays a discontinuity characteristic of mixed-order transitions, as it approaches a non-universal critical value algebraically with an exponent $\beta_p'<1$.

Replacements for Mon, 6 May 24

[16]  arXiv:2306.00695 (replaced) [pdf, ps, other]
Title: Demixing fluorescence time traces transmitted by multimode fibers
Comments: Main text: 20 pages, 7 Figures. Supp info: 14 pages, 11 Figures
Subjects: Optics (physics.optics); Applied Physics (physics.app-ph); Biological Physics (physics.bio-ph); Neurons and Cognition (q-bio.NC)
[17]  arXiv:2307.06472 (replaced) [pdf, other]
Title: Early Autism Diagnosis based on Path Signature and Siamese Unsupervised Feature Compressor
Subjects: Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
[18]  arXiv:2308.13257 (replaced) [pdf, other]
Title: Alternating Shrinking Higher-order Interactions for Sparse Neural Population Activity
Comments: 5 figures
Subjects: Neurons and Cognition (q-bio.NC)
[19]  arXiv:2310.04420 (replaced) [pdf, other]
Title: BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity
Comments: ICLR 2024. Project page: this https URL
Subjects: Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
[20]  arXiv:2311.12410 (replaced) [pdf, other]
Title: nach0: Multimodal Natural and Chemical Languages Foundation Model
Comments: Accepted to Chemical Science Journal. Models are publicly available via this https URL and this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
[21]  arXiv:2312.15795 (replaced) [pdf, ps, other]
Title: Biparental Reproduction may Enhance Species Sustainability by Conserving Shared Parental Traits more Faithfully than Monoparental Reproduction
Comments: This version has an analytical angle. See V1 for simulations of a slightly different model
Subjects: Populations and Evolution (q-bio.PE)
[22]  arXiv:2312.16074 (replaced) [pdf, other]
Title: Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding
Subjects: Populations and Evolution (q-bio.PE); Machine Learning (stat.ML)
[23]  arXiv:2402.09330 (replaced) [pdf, other]
Title: 3D-based RNA function prediction tools in rnaglib
Subjects: Biomolecules (q-bio.BM); Machine Learning (cs.LG)
[24]  arXiv:2403.03134 (replaced) [pdf, other]
Title: Simplicity in Complexity: Explaining Visual Complexity using Deep Segmentation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)
[25]  arXiv:2404.04858 (replaced) [pdf, other]
Title: Do the receptive fields in the primary visual cortex span a variability over the degree of elongation of the receptive fields?
Authors: Tony Lindeberg
Comments: 14 pages, 7 figures. Note: Companion paper regarding theoretical analysis in arXiv:2304.11920
Subjects: Neurons and Cognition (q-bio.NC)
[26]  arXiv:2404.09411 (replaced) [pdf, other]
Title: Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformers
Comments: To appear at the Forty-first International Conference on Machine Learning (ICML2024)
Subjects: Machine Learning (cs.LG); Computational Geometry (cs.CG); Genomics (q-bio.GN)
[27]  arXiv:2405.00541 (replaced) [pdf, other]
Title: New Trends on the Systems Approach to Modeling SARS-CoV-2 Pandemics in a Globally Connected Planet
Subjects: Populations and Evolution (q-bio.PE); Adaptation and Self-Organizing Systems (nlin.AO); Physics and Society (physics.soc-ph)
[28]  arXiv:2405.00810 (replaced) [pdf, other]
Title: A Simple Comparison of Biochemical Systems Theory and Metabolic Control Analysis
Authors: Herbert M Sauro
Subjects: Molecular Networks (q-bio.MN); Quantitative Methods (q-bio.QM); Subcellular Processes (q-bio.SC)