Disordered Systems and Neural Networks
- [1] arXiv:2405.15220 [pdf, ps, html, other]
Title: Hybrid scaling theory of localization transition in a non-Hermitian disorder Aubry-André model
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn)
In this paper, we study the critical behaviors in the non-Hermitian disorder Aubry-André (DAA) model, where the non-Hermiticity is introduced by nonreciprocal hopping. We employ the localization length $\xi$, the inverse participation ratio ($\rm IPR$), and the real part of the energy gap between the first excited state and the ground state, $\Delta E$, as the characteristic quantities to describe the critical properties of the localization transition. By performing the scaling analysis, the critical exponents of the non-Hermitian Anderson model and the non-Hermitian DAA model are obtained, and these critical exponents are different from their Hermitian counterparts, indicating that the Hermitian and non-Hermitian disorder and DAA models belong to different universality classes. The critical exponents of the non-Hermitian DAA model are remarkably different from those of both the pure non-Hermitian AA model and the non-Hermitian Anderson model, showing that disorder is an independent relevant direction at the critical point of the non-Hermitian AA model. We further propose a hybrid scaling theory to describe the critical behavior in the overlapping critical region constituted by the critical regions of the non-Hermitian DAA model and the non-Hermitian Anderson localization transition.
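As a rough illustration of one of the quantities named in this abstract, the inverse participation ratio can be computed for a small nonreciprocal quasiperiodic chain. This is a minimal sketch and not the authors' code: the Hamiltonian convention (hopping `J*exp(+g)` to the right and `J*exp(-g)` to the left, potential `2*lam*cos(2*pi*alpha*j + phi)`, uniform disorder of strength `delta`, periodic boundaries) and all function and parameter names are assumptions made for the example.

```python
import numpy as np


def daa_hamiltonian(L, lam, g, delta=0.0, J=1.0, phi=0.0, seed=0):
    """Single-particle matrix of a nonreciprocal Aubry-Andre chain with optional disorder.

    Assumed convention (not taken from the paper): hopping J*exp(+g) to the right,
    J*exp(-g) to the left, on-site potential 2*lam*cos(2*pi*alpha*j + phi) + delta*w_j
    with w_j uniform in [-1, 1], and periodic boundary conditions.
    """
    alpha = (np.sqrt(5.0) - 1.0) / 2.0  # inverse golden ratio
    rng = np.random.default_rng(seed)
    sites = np.arange(L)
    onsite = 2.0 * lam * np.cos(2.0 * np.pi * alpha * sites + phi)
    onsite = onsite + delta * rng.uniform(-1.0, 1.0, size=L)
    H = np.diag(onsite).astype(complex)
    for j in range(L):
        k = (j + 1) % L              # periodic neighbour
        H[k, j] += J * np.exp(g)     # hop j -> j+1
        H[j, k] += J * np.exp(-g)    # hop j+1 -> j
    return H


def ipr_of_right_eigenvectors(H):
    """IPR = sum_j |psi_j|^4 / (sum_j |psi_j|^2)^2 for every right eigenvector of H."""
    _, vecs = np.linalg.eig(H)
    prob = np.abs(vecs) ** 2
    return (prob ** 2).sum(axis=0) / prob.sum(axis=0) ** 2


if __name__ == "__main__":
    for lam in (0.2, 5.0):
        H = daa_hamiltonian(L=144, lam=lam, g=0.5)
        print(lam, ipr_of_right_eigenvectors(H).mean())
```

In this sketch a mean IPR of order $1/L$ signals extended states, while a value of order one signals localized states; periodic boundaries are chosen so that the skin effect of open nonreciprocal chains does not mask the comparison.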
New submissions for Monday, 27 May 2024 (showing 1 of 1 entries)
- [2] arXiv:2405.14927 (cross-list from quant-ph) [pdf, ps, html, other]
Title: Slow measurement-only dynamics of entanglement in Pauli subsystem codes
Comments: 21 pages, 11 figures
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Strongly Correlated Electrons (cond-mat.str-el)
We study the non-unitary dynamics of a class of quantum circuits based on stochastically measuring check operators of subsystem quantum error-correcting codes, such as the Bacon-Shor code and its various generalizations. Our focus is on how properties of the underlying code are imprinted onto the measurement-only dynamics. We find that in a large class of codes with nonlocal stabilizer generators, at late times there is generically a nonlocal contribution to the subsystem entanglement entropy which scales with the subsystem size. The nonlocal stabilizer generators can also induce slow dynamics, since depending on the rate of competing measurements the associated degrees of freedom can take exponentially long (in system size) to purify (disentangle from the environment when starting from a mixed state) and to scramble (become entangled with the rest of the system when starting from a product state). Concretely, we consider circuits for which the nonlocal stabilizer generators of the underlying subsystem code take the form of subsystem symmetries. We present a systematic study of the phase diagrams and relevant time scales in two and three spatial dimensions for both Calderbank-Shor-Steane (CSS) and non-CSS codes, focusing in particular on the link between slow measurement-only dynamics and the geometry of the subsystem symmetry. A key finding of our work is that slowly purifying or scrambling degrees of freedom appear to emerge only in codes whose subsystem symmetries are nonlocally generated, a strict subset of those whose symmetries are simply nonlocal. We comment on the link between our results on subsystem codes and the phenomenon of Hilbert-space fragmentation in light of their shared algebraic structure.
- [3] arXiv:2405.14936 (cross-list from quant-ph) [pdf, ps, html, other]
Title: Local and nonlocal stochastic control of quantum chaos: Measurement- and control-induced criticality
Comments: 16 pages, 10 figures
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Chaotic Dynamics (nlin.CD)
We theoretically study the topology of the phase diagram of a family of quantum models inspired by the classical Bernoulli map under stochastic control. The quantum models inherit a control-induced phase transition from the classical model and also manifest an entanglement phase transition intrinsic to the quantum setting. This measurement-induced phase transition has been shown in various settings to either coincide or split off from the control transition, but a systematic understanding of the necessary and sufficient conditions for the two transitions to coincide in this case has so far been lacking. In this work, we generalize the control map to allow for either local or global control action. While this does not affect the classical aspects of the control transition that is described by a random walk, it significantly influences the quantum dynamics, leading to the universality class of the measurement-induced transition being dependent on the locality of the control operation. In the presence of a global control map, the two transitions coincide and the control-induced phase transition dominates the measurement-induced phase transition. Contrarily, the two transitions split in the presence of the local control map or additional projective measurements and generically take on distinct universality classes. For local control, the measurement-induced phase transition recovers the Haar logarithmic conformal field theory universality class found in feedback-free models. However, for global control, a novel universality class with correlation length exponent $\nu \approx 0.7$ emerges from the interplay of control and projective measurements. This work provides a more refined understanding of the relationship between the control- and measurement-induced phase transitions.
- [4] arXiv:2405.15376 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Fast, accurate training and sampling of Restricted Boltzmann Machines
Comments: 18 pages, 8 figures
Subjects: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech)
Thanks to their simple architecture, Restricted Boltzmann Machines (RBMs) are powerful tools for modeling complex systems and extracting interpretable insights from data. However, training RBMs, like other energy-based models, on highly structured data poses a major challenge, as effective training relies on mixing the Markov chain Monte Carlo simulations used to estimate the gradient. This process is often hindered by multiple second-order phase transitions and the associated critical slowdown. In this paper, we present an innovative method in which the principal directions of the dataset are integrated into a low-rank RBM through a convex optimization procedure. This approach enables efficient sampling of the equilibrium measure via a static Monte Carlo process. By starting the standard training process with a model that already accurately represents the main modes of the data, we bypass the initial phase transitions. Our results show that this strategy successfully trains RBMs to capture the full diversity of data in datasets where previous methods fail. Furthermore, we use the training trajectories to propose a new sampling method, parallel trajectory tempering, which allows us to sample the equilibrium measure of the trained model much faster than previous optimized MCMC approaches and to obtain a better estimation of the log-likelihood. We illustrate the success of the training method on several highly structured datasets.
- [5] arXiv:2405.15480 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Fundamental limits of weak learnability in high-dimensional multi-index models
Authors: Emanuele Troiani, Yatin Dandi, Leonardo Defilippis, Lenka Zdeborová, Bruno Loureiro, Florent Krzakala
Subjects: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Computational Complexity (cs.CC)
Multi-index models -- functions which only depend on the covariates through a non-linear transformation of their projection on a subspace -- are a useful benchmark for investigating feature learning with neural networks. This paper examines the theoretical boundaries of learnability in this hypothesis class, focusing particularly on the minimum sample complexity required for weakly recovering their low-dimensional structure with first-order iterative algorithms, in the high-dimensional regime where the number of samples $n=\alpha d$ is proportional to the covariate dimension $d$. Our findings unfold in three parts: (i) first, we identify under which conditions a trivial subspace can be learned with a single step of a first-order algorithm for any $\alpha\!>\!0$; (ii) second, in the case where the trivial subspace is empty, we provide necessary and sufficient conditions for the existence of an easy subspace consisting of directions that can be learned only above a certain sample complexity $\alpha\!>\!\alpha_c$. The critical threshold $\alpha_{c}$ marks the presence of a computational phase transition, in the sense that no efficient iterative algorithm can succeed for $\alpha\!<\!\alpha_c$. In a limited but interesting set of really hard directions -- akin to the parity problem -- $\alpha_c$ is found to diverge. Finally, (iii) we demonstrate that interactions between different directions can result in an intricate hierarchical learning phenomenon, where some directions can be learned sequentially when coupled to easier ones. Our analytical approach is built on the optimality of approximate message-passing algorithms among first-order iterative methods, delineating the fundamental learnability limit across a broad spectrum of algorithms, including neural networks trained with gradient descent.
- [6] arXiv:2405.15539 (cross-list from stat.ML) [pdf, ps, other]
Title: A generalized neural tangent kernel for surrogate gradient learning
Comments: 52 pages, 3 figures + 2 supplementary figures
Subjects: Machine Learning (stat.ML); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG); Probability (math.PR); Neurons and Cognition (q-bio.NC)
State-of-the-art neural network training methods depend on the gradient of the network function. Therefore, they cannot be applied to networks whose activation functions do not have useful derivatives, such as binary and discrete-time spiking neural networks. To overcome this problem, the activation function's derivative is commonly substituted with a surrogate derivative, giving rise to surrogate gradient learning (SGL). This method works well in practice but lacks theoretical foundation. The neural tangent kernel (NTK) has proven successful in the analysis of gradient descent. Here, we provide a generalization of the NTK, which we call the surrogate gradient NTK, that enables the analysis of SGL. First, we study a naive extension of the NTK to activation functions with jumps, demonstrating that gradient descent for such activation functions is also ill-posed in the infinite-width limit. To address this problem, we generalize the NTK to gradient descent with surrogate derivatives, i.e., SGL. We carefully define this generalization and expand the existing key theorems on the NTK with mathematical rigor. Further, we illustrate our findings with numerical experiments. Finally, we numerically compare SGL in networks with sign activation function and finite width to kernel regression with the surrogate gradient NTK; the results confirm that the surrogate gradient NTK provides a good characterization of SGL.
- [7] arXiv:2405.15560 (cross-list from cond-mat.mes-hall) [pdf, ps, html, other]
Title: Super-diffusive transport in two-dimensional Fermionic wires
Comments: 11 pages, 7 figures
Subjects: Mesoscale and Nanoscale Physics (cond-mat.mes-hall); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Quantum Physics (quant-ph)
We consider a two-dimensional model of a Fermionic wire in contact with reservoirs along its two opposite edges. With the reservoirs biased around a Fermi level, $E$, we study the scaling of the conductance of the wire with its length, $L$, as the width of the wire $W\rightarrow\infty$. The wire is disordered along the direction of the transport, so the conductance is expected to decay exponentially with the length of the wire. However, we show that our model exhibits a super-diffusive scaling ($1/L^{1/2}$) of the conductance within $|E|<E_c$. This behavior is attributed to the presence of eigenstates of diverging localization length as $W\rightarrow\infty$. At $|E|=E_c$, the conductance behavior is sensitive to the disorder and scales sub-diffusively as $1/L^{3/2}$ and $1/L^{5/2}$ for zero and nonzero expectation value of the disorder, respectively. Furthermore, at this Fermi level and at certain points in the parameter space of the wire, the behavior of the conductance is also sensitive to the sign of the expectation value of the disorder. At these points we find $1/L^{7/4}$ for zero expectation value of the disorder and $1/L$, $1/L^{3}$ for different signs of the expectation value of the disorder.
- [8] arXiv:2405.15699 (cross-list from stat.ML) [pdf, ps, html, other]
Title: Dimension-free deterministic equivalents for random feature regression
Subjects: Machine Learning (stat.ML); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG)
In this work we investigate the generalization performance of random feature ridge regression (RFRR). Our main contribution is a general deterministic equivalent for the test error of RFRR. Specifically, under a certain concentration property, we show that the test error is well approximated by a closed-form expression that only depends on the feature map eigenvalues. Notably, our approximation guarantee is non-asymptotic, multiplicative, and independent of the feature map dimension -- allowing for infinite-dimensional features. We expect this deterministic equivalent to hold broadly beyond our theoretical analysis, and we empirically validate its predictions on various real and synthetic datasets. As an application, we derive sharp excess error rates under standard power-law assumptions of the spectrum and target decay. In particular, we provide a tight result for the smallest number of features achieving optimal minimax error rate.
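For readers unfamiliar with the setting, the empirical quantity that such a deterministic equivalent approximates can be sketched numerically. The block below is only an illustration of random feature ridge regression and its test error; it does not implement the paper's closed-form expression. The ReLU feature map, the linear target, the ridge scaling by `n_train`, and every name and default value are assumptions chosen for the example.

```python
import numpy as np


def random_feature_ridge_test_error(n_train=400, n_test=2000, d=20, p=200,
                                    ridge=1e-2, seed=0):
    """Fit ridge regression on p frozen random ReLU features and return
    (test error, error of the trivial predictor f(x) = 0)."""
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)        # hypothetical linear target direction
    W = rng.standard_normal((p, d))     # random first-layer weights, never trained

    def features(X):
        # ReLU random features; the choice of nonlinearity is an assumption
        return np.maximum(X @ W.T / np.sqrt(d), 0.0)

    X_train = rng.standard_normal((n_train, d))
    X_test = rng.standard_normal((n_test, d))
    y_train, y_test = X_train @ beta, X_test @ beta

    Phi = features(X_train)
    # Ridge solution a = (Phi^T Phi + n * ridge * I)^{-1} Phi^T y
    gram = Phi.T @ Phi + n_train * ridge * np.eye(p)
    a = np.linalg.solve(gram, Phi.T @ y_train)

    test_error = float(np.mean((features(X_test) @ a - y_test) ** 2))
    baseline = float(np.mean(y_test ** 2))
    return test_error, baseline


if __name__ == "__main__":
    err, base = random_feature_ridge_test_error()
    print("RFRR test error:", err, "trivial baseline:", base)
```

Sweeping `p` or `n_train` in this sketch produces the kind of empirical error curves against which a closed-form prediction could be compared.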
- [9] arXiv:2405.15712 (cross-list from stat.ML) [pdf, ps, html, other]
Title: Infinite Limits of Multi-head Transformer Dynamics
Subjects: Machine Learning (stat.ML); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG)
In this work, we analyze various scaling limits of the training dynamics of transformer models in the feature learning regime. We identify the set of parameterizations that admit well-defined infinite width and depth limits, allowing the attention layers to update throughout training--a relevant notion of feature learning in these models. We then use tools from dynamical mean field theory (DMFT) to analyze various infinite limits (infinite key/query dimension, infinite heads, and infinite depth) which have different statistical descriptions depending on which infinite limit is taken and how attention layers are scaled. We provide numerical evidence of convergence to the limits and discuss how the parameterization qualitatively influences learned features.
Cross submissions for Monday, 27 May 2024 (showing 8 of 8 entries)
- [10] arXiv:2403.07565 (replaced) [pdf, ps, html, other]
Title: Logarithmic critical slowing down in complex systems: from statics to dynamics
Comments: 22 pages, 2 figures
Journal-ref: Phys. Rev. B 109, 174211 (2024)
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Materials Science (cond-mat.mtrl-sci); Soft Condensed Matter (cond-mat.soft); Statistical Mechanics (cond-mat.stat-mech)
We consider second-order phase transitions in which the order parameter is a replicated overlap matrix. We focus on a tricritical point that occurs in a variety of mean-field models and that, more generically, describes higher order liquid-liquid or liquid-glass transitions. We show that the static replicated theory implies slowing down with a logarithmic decay in time. The dynamical equations turn out to be those predicted by schematic Mode Coupling Theory for supercooled viscous liquids at an $A_3$ singularity, where the parameter exponent is $\lambda=1$. We obtain a quantitative expression for the parameter $\mu$ of the logarithmic decay in terms of cumulants of the overlap, which are physically observable in experiments or numerical simulations.
- [11] arXiv:2405.13508 (replaced) [pdf, ps, html, other]
Title: Furutsu-Novikov--like cross-correlation--response relations for systems driven by shot noise
Comments: 12 pages, 9 figures
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Biological Physics (physics.bio-ph)
We consider a dynamic system that is driven by an intensity-modulated Poisson process with intensity $\Lambda(t)=\lambda(t)+\varepsilon\nu(t)$. We derive an exact relation between the input-output cross-correlation in the spontaneous state ($\varepsilon=0$) and the linear response to the modulation ($\varepsilon>0$). This can be regarded as a variant of the Furutsu-Novikov theorem for the case of shot noise. As we show, the relation is still valid in the presence of additional independent noise. Furthermore, we derive an extension to Cox-process input, i.e. to colored shot noise. We discuss applications to particle detection and to neuroscience. Using the new relation, we obtain a fluctuation-response relation for a leaky integrate-and-fire neuron. We also show how the new relation can be used in a remote control problem in a recurrent neural network. The relations are numerically tested for both stationary and non-stationary dynamics. Lastly, extensions to marked Poisson processes and to higher-order statistics are presented.
- [12] arXiv:2405.14289 (replaced) [pdf, ps, html, other]
Title: Generating-functional analysis of random Lotka-Volterra systems: A step-by-step guide
Comments: 81 pages, 8 figures
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Populations and Evolution (q-bio.PE)
This paper provides what is hopefully a self-contained set of notes describing the detailed steps of a generating-functional analysis of systems of generalised Lotka-Volterra equations with random interaction coefficients. Nothing in these notes is original; instead, the generating-functional method (also known as the Martin-Siggia-Rose-De Dominicis-Janssen formalism) and the resulting dynamic mean field theories have been used for the study of disordered systems and spin glasses for decades. But it is hard to find unifying sources which would allow a beginner to learn step-by-step how these methods can be used. My aim is to provide such a source. Most of the calculations are specific to generalised Lotka-Volterra systems, but much can be transferred to disordered systems more generally.
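The kind of system these notes analyse can be explored numerically with a few lines of code. The following is a minimal sketch rather than anything taken from the paper: it assumes the common form $\dot{x}_i = x_i\,(1 - x_i + \sum_j \alpha_{ij} x_j)$ with independent Gaussian couplings of mean `mu/N` and standard deviation `sigma/sqrt(N)`, integrates it with a plain Euler scheme, and uses hypothetical names and defaults throughout.

```python
import numpy as np


def simulate_glv(N=100, mu=0.0, sigma=0.5, dt=0.02, steps=10000, seed=0):
    """Euler integration of dx_i/dt = x_i * (1 - x_i + sum_j alpha_ij x_j).

    Assumed model: alpha_ij are independent Gaussians with mean mu/N and
    standard deviation sigma/sqrt(N), with no self-interaction beyond -x_i.
    Returns the final abundances and the coupling matrix.
    """
    rng = np.random.default_rng(seed)
    alpha = mu / N + sigma / np.sqrt(N) * rng.standard_normal((N, N))
    np.fill_diagonal(alpha, 0.0)
    x = rng.uniform(0.5, 1.5, size=N)   # arbitrary positive initial abundances
    for _ in range(steps):
        growth = 1.0 - x + alpha @ x
        # Clip at zero so that discretisation error cannot make abundances negative
        x = np.maximum(x + dt * x * growth, 0.0)
    return x, alpha


if __name__ == "__main__":
    x, alpha = simulate_glv()
    residual = np.abs(x * (1.0 - x + alpha @ x)).max()
    print("mean abundance:", x.mean())
    print("fraction above 1e-6:", (x > 1e-6).mean())
    print("largest |dx/dt| at the end:", residual)
```

For a small `sigma` such as the default, the abundances in this sketch are expected to settle to a fixed point, and statistics such as the mean abundance and the surviving fraction are the sort of quantities a dynamic mean field theory aims to predict.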
- [13] arXiv:2401.14339 (replaced) [pdf, ps, html, other]
Title: Variational Quantum Algorithms for the Allocation of Resources in a Cloud/Edge Architecture
Comments: 14 pages, 13 figures
Journal-ref: IEEE Transactions on Quantum Engineering, 2024
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Other Condensed Matter (cond-mat.other)
Modern Cloud/Edge architectures need to orchestrate multiple layers of heterogeneous computing nodes, including pervasive sensors/actuators, distributed Edge/Fog nodes, centralized data centers and quantum devices. The optimal assignment and scheduling of computation on the different nodes is a very difficult problem, with NP-hard complexity. In this paper, we explore the possibility of solving this problem with Variational Quantum Algorithms, which can become a viable alternative to classical algorithms in the near future. In particular, we compare the performances, in terms of success probability, of two algorithms, i.e., Quantum Approximate Optimization Algorithm (QAOA) and Variational Quantum Eigensolver (VQE). The simulation experiments, performed for a set of simple problems that involve a Cloud and two Edge nodes, show that the VQE algorithm ensures better performances when it is equipped with appropriate circuit ansatzes that are able to restrict the search space. Moreover, experiments executed on real quantum hardware show that the execution time, when increasing the size of the problem, grows much more slowly than the trend obtained with classical computation, which is known to be exponential.
- [14] arXiv:2405.08571 (replaced) [pdf, ps, html, other]
Title: Mean-field theory of first-order quantum superconductor-insulator transition
Comments: 32 pages, 5 figures
Subjects: Superconductivity (cond-mat.supr-con); Disordered Systems and Neural Networks (cond-mat.dis-nn)
Recent experimental studies on strongly disordered indium oxide films have revealed an unusual first-order quantum phase transition between the superconducting and insulating states (SIT). This transition is characterized by a discontinuous jump from non-zero to zero values of superfluid stiffness at the critical point, contradicting the conventional "scaling scenario" typically associated with SIT. In this paper, we present a theoretical framework for understanding this first-order transition. Our approach is based on the concept of competition between two fundamentally distinct ground states that arise from electron pairs initially localized by strong disorder: the superconducting state and the Coulomb glass insulator. These ground states are distinguished by two crucially different order parameters, suggesting a natural expectation of a discontinuous transition between them at $T=0$. This transition occurs when the magnitudes of the superconducting gap $\Delta$ and the Coulomb gap $E_C$ become comparable. Additionally, we extend our analysis to low non-zero temperatures and provide a mean-field "phase diagram" in the plane of $(T/\Delta,E_C/\Delta)$. Our results reveal the existence of a natural upper bound for the kinetic inductance of strongly disordered superconductors.
- [15] arXiv:2405.13098 (replaced) [pdf, ps, html, other]
Title: Glassy dynamics in deep neural networks: A structural comparison
Subjects: Computational Physics (physics.comp-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech)
Deep Neural Networks (DNNs) share important similarities with structural glasses. Both have many degrees of freedom, and their dynamics are governed by a high-dimensional, non-convex landscape representing either the loss or energy, respectively. Furthermore, both experience gradient descent dynamics subject to noise. In this work we investigate, by performing quantitative measurements on realistic networks trained on the MNIST and CIFAR-10 datasets, the extent to which this qualitative similarity gives rise to glass-like dynamics in neural networks. We demonstrate the existence of a Topology Trivialisation Transition as well as the previously studied under-to-overparameterised transition analogous to jamming. By training DNNs with overdamped Langevin dynamics in the resulting disordered phases, we do not observe diverging relaxation times at non-zero temperature, nor do we observe any caging effects, in contrast to glass phenomenology. However, the weight overlap function follows a power law in time, with an exponent of approximately -0.5, in agreement with the Mode-Coupling Theory of structural glasses. In addition, the DNN dynamics obey a form of time-temperature superposition. Finally, dynamic heterogeneity and ageing are observed at low temperatures. These results highlight important and surprising points of both difference and agreement between the behaviour of DNNs and structural glasses.