Information Theory
- [1] arXiv:2405.19509 [pdf, ps, html, other]
Title: Leveraging partial stragglers within gradient coding
Comments: 12 pages, 7 figures
Subjects: Information Theory (cs.IT)
Within distributed learning, workers typically compute gradients on their assigned dataset chunks and send them to the parameter server (PS), which aggregates them to compute either an exact or approximate version of $\nabla L$ (gradient of the loss function $L$). However, in large-scale clusters, many workers are slower than their promised speed or even failure-prone. A gradient coding solution introduces redundancy within the assignment of chunks to the workers and uses coding theoretic ideas to allow the PS to recover $\nabla L$ (exactly or approximately), even in the presence of stragglers. Unfortunately, most existing gradient coding protocols are inefficient from a computation perspective as they coarsely classify workers as operational or failed; the potentially valuable work performed by slow workers (partial stragglers) is ignored. In this work, we present novel gradient coding protocols that judiciously leverage the work performed by partial stragglers. Our protocols are efficient from a computation and communication perspective and numerically stable. For an important class of chunk assignments, we present efficient algorithms for optimizing the relative ordering of chunks within the workers; this ordering affects the overall execution time. For exact gradient reconstruction, our protocol is around $2\times$ faster than the original class of protocols and for approximate gradient reconstruction, the mean-squared-error of our reconstructed gradient is several orders of magnitude better.
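The redundancy idea behind gradient coding can be illustrated with the classic three-worker scheme that tolerates any one full straggler. This is a minimal sketch of that baseline, not the protocol proposed in the paper; the names `encode`, `decode`, and `DECODE_COEFFS` are assumptions made for illustration. Each worker holds two of the three chunk gradients and sends one linear combination, and the PS recovers the full sum from any two messages.

```python
from fractions import Fraction

def encode(g1, g2, g3):
    """Return the message each of the three workers sends to the PS.

    Worker 1 holds chunks 1 and 2, worker 2 holds chunks 2 and 3,
    worker 3 holds chunks 1 and 3 (illustrative assignment).
    """
    half = Fraction(1, 2)
    w1 = [half * a + b for a, b in zip(g1, g2)]   # g1/2 + g2
    w2 = [b - c for b, c in zip(g2, g3)]          # g2 - g3
    w3 = [half * a + c for a, c in zip(g1, g3)]   # g1/2 + g3
    return {1: w1, 2: w2, 3: w3}

# Decoding coefficients for every pair of workers that responded.
DECODE_COEFFS = {
    frozenset({1, 2}): {1: 2, 2: -1},  # 2*w1 - w2
    frozenset({1, 3}): {1: 1, 3: 1},   # w1 + w3
    frozenset({2, 3}): {2: 1, 3: 2},   # w2 + 2*w3
}

def decode(received):
    """Recover g1 + g2 + g3 from the messages of any two workers."""
    coeffs = DECODE_COEFFS[frozenset(received)]
    dim = len(next(iter(received.values())))
    return [sum(coeffs[w] * received[w][d] for w in coeffs) for d in range(dim)]

if __name__ == "__main__":
    msgs = encode([1, 2], [3, 4], [5, 6])
    # Pretend worker 3 is a straggler and never responds.
    print(decode({1: msgs[1], 2: msgs[2]}))
```

This baseline discards everything a slow worker has already computed, which is the coarse operational-or-failed classification the abstract criticizes.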
- [2] arXiv:2405.19540 [pdf, ps, html, other]
Title: Computing Low-Entropy Couplings for Large-Support Distributions
Authors: Samuel Sokota, Dylan Sam, Christian Schroeder de Witt, Spencer Compton, Jakob Foerster, J. Zico Kolter
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)
Minimum-entropy coupling (MEC) -- the process of finding a joint distribution with minimum entropy for given marginals -- has applications in areas such as causality and steganography. However, existing algorithms are either computationally intractable for large-support distributions or limited to specific distribution types and sensitive to hyperparameter choices. This work addresses these limitations by unifying a prior family of iterative MEC (IMEC) approaches into a generalized partition-based formalism. From this framework, we derive a novel IMEC algorithm called ARIMEC, capable of handling arbitrary discrete distributions, and introduce a method to make IMEC robust to suboptimal hyperparameter settings. These innovations facilitate the application of IMEC to high-throughput steganography with language models, among other settings. Our codebase is available at this https URL .
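To make the coupling problem concrete, the sketch below builds a low-entropy joint distribution for two small marginals with the well-known greedy heuristic (repeatedly match the largest remaining masses). It is not ARIMEC or any IMEC variant from the paper; the function names and the tolerance `tol` are assumptions for illustration only.

```python
import math

def greedy_coupling(p, q, tol=1e-12):
    """Greedy heuristic for a low-entropy coupling of marginals p and q.

    Returns a dict mapping (i, j) to the joint mass placed on that cell.
    """
    p, q = list(p), list(q)
    joint = {}
    while True:
        i = max(range(len(p)), key=p.__getitem__)  # largest remaining row mass
        j = max(range(len(q)), key=q.__getitem__)  # largest remaining column mass
        mass = min(p[i], q[j])
        if mass <= tol:  # nothing left to assign (up to rounding)
            break
        joint[(i, j)] = joint.get((i, j), 0.0) + mass
        p[i] -= mass
        q[j] -= mass
    return joint

def entropy_bits(masses):
    """Shannon entropy in bits of a collection of probability masses."""
    return -sum(m * math.log2(m) for m in masses if m > 0)

if __name__ == "__main__":
    joint = greedy_coupling([0.5, 0.25, 0.25], [0.6, 0.4])
    print(joint, entropy_bits(joint.values()))
```

Each iteration exhausts at least one row or column, so the loop runs at most `len(p) + len(q)` times and the resulting joint has correspondingly few nonzero cells.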
- [3] arXiv:2405.19596 [pdf, ps, html, other]
Title: The weight hierarchies of three classes of linear codes
Subjects: Information Theory (cs.IT)
Studying the generalized Hamming weights of linear codes is a significant research area within coding theory, as it provides valuable structural information about the codes and plays a crucial role in determining their performance in various applications. However, determining the generalized Hamming weights of linear codes, particularly their weight hierarchy, is generally a challenging task. In this paper, we focus on investigating the generalized Hamming weights of three classes of linear codes over finite fields. These codes are constructed from different defining sets. By analysing the intersections between the defining sets and the duals of all $r$-dimensional subspaces, we obtain inequalities on the sizes of these intersections. Then, by constructing subspaces that reach the upper bounds of these inequalities, we determine the complete weight hierarchies of these codes.
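For readers unfamiliar with the term, the $r$-th generalized Hamming weight is the smallest support size of an $r$-dimensional subcode, and the weight hierarchy is the list of these values for $r = 1, \dots, k$. The sketch below computes it by brute force for a small binary code; it is only a definition-level illustration, not the defining-set technique of the paper, and the function names and the choice of the $[7,4]$ Hamming code are assumptions.

```python
from itertools import combinations

def gf2_rank(vectors):
    """Rank over GF(2) of integers interpreted as bit vectors."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b from v if present
        if v:
            basis.append(v)
    return len(basis)

def weight_hierarchy(generators, n):
    """Brute-force weight hierarchy of the binary code spanned by `generators`.

    `generators` are assumed linearly independent bitmasks of length n.
    """
    k = len(generators)
    codewords = set()
    for mask in range(1, 2 ** k):
        word = 0
        for i in range(k):
            if (mask >> i) & 1:
                word ^= generators[i]
        codewords.add(word)
    hierarchy = []
    for r in range(1, k + 1):
        best = n
        for subset in combinations(codewords, r):
            if gf2_rank(subset) == r:
                support = 0
                for c in subset:
                    support |= c  # support of a span = union of basis supports
                best = min(best, bin(support).count("1"))
        hierarchy.append(best)
    return hierarchy

if __name__ == "__main__":
    hamming_7_4 = [0b1000110, 0b0100101, 0b0010011, 0b0001111]
    print(weight_hierarchy(hamming_7_4, 7))
```

The enumeration grows exponentially with the dimension, which is why closed-form results such as those in the abstract are needed for code families of arbitrary size.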
- [4] arXiv:2405.19911 [pdf, ps, html, other]
Title: Full weight spectrum one-orbit cyclic subspace codes
Subjects: Information Theory (cs.IT); Combinatorics (math.CO)
For a linear Hamming metric code of length n over a finite field, the number of distinct weights of its codewords is at most n. The codes achieving the equality in the above bound were called full weight spectrum codes. In this paper we will focus on the analogous class of codes within the framework of cyclic subspace codes. Cyclic subspace codes have garnered significant attention, particularly for their applications in random network coding to correct errors and erasures. We investigate one-orbit cyclic subspace codes that are full weight spectrum in this context. Utilizing number theoretical results and combinatorial arguments, we provide a complete classification of full weight spectrum one-orbit cyclic subspace codes.
- [5] arXiv:2405.19965 [pdf, ps, html, other]
Title: Several classes of BCH codes of length $n=\frac{q^{m}-1}{2}$
Subjects: Information Theory (cs.IT)
BCH codes are an important class of cyclic codes, and have wide applications in communication and storage systems. In this paper, we study the negacyclic BCH code and cyclic BCH code of length $n=\frac{q^m-1}{2}$. For negacyclic BCH code, we give the dimensions of $\mathcal C_{(n,-1,\delta,0)}$ for $\widetilde{\delta} =a\frac{q^m-1}{q-1},aq^{m-1}-1$ ($1\leq a <\frac{q-1}{2}$) and $\widetilde{\delta} =a\frac{q^m-1}{q-1}+b\frac{q^m-1}{q^2-1},aq^{m-1}+(a+b)q^{m-2}-1$ $(2\mid m,1\leq a+b \leq q-1$, $\left\lceil \frac{q-a-2}{2}\right\rceil\geq 1)$. The dimensions of negacyclic BCH codes $\mathcal C_{(n,-1,\delta,0)}$ with few nonzeros and $\mathcal C_{(n,-1,\delta,b)}$ with $b\neq 1$ are settled. For cyclic BCH code, we give the weight distributions of extended codes $\overline{\mathcal C}_{(n,1,\delta,1)}$ for $\delta=\delta_1,\delta_2$ and the parameters of dual code $\mathcal C^{\perp}_{(n,1,\delta,1)}$ for $\delta_2\leq \delta \leq \delta_1$.
- [6] arXiv:2405.20047 [pdf, ps, html, other]
Title: Schubert Subspace Codes
Subjects: Information Theory (cs.IT); Combinatorics (math.CO)
In this paper, we initiate the study of constant dimension subspace codes restricted to Schubert varieties, which we call Schubert subspace codes. These codes have a very natural geometric description, as objects that we call intersecting sets with respect to a fixed subspace. We provide a geometric construction of maximum size constant dimension subspace codes in some Schubert varieties with the largest possible value for the minimum subspace distance. Finally, we generalize the problem to different values of the minimum distance.
- [7] arXiv:2405.20073 [pdf, ps, html, other]
Title: Power Allocation for Cell-Free Massive MIMO ISAC Systems with OTFS Signal
Comments: This work is submitted to IEEE for possible publication
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Applying integrated sensing and communication (ISAC) to a cell-free massive multiple-input multiple-output (CF mMIMO) architecture has attracted increasing attention. This approach equips CF mMIMO networks with sensing capabilities and resolves the problem of unreliable service at cell edges in conventional cellular networks. However, existing studies on CF-ISAC systems have focused on the application of traditional integrated signals. To address this limitation, this study explores the employment of the orthogonal time frequency space (OTFS) signal as a representative of innovative signals in the CF-ISAC system, and the system's overall performance is optimized and evaluated. A universal downlink spectral efficiency (SE) expression is derived regarding multi-antenna access points (APs) and optional sensing beams. To streamline the analysis and optimization of the CF-ISAC system with the OTFS signal, we introduce a lower bound on the achievable SE that is applicable to OTFS-signal-based systems. Based on this, a power allocation algorithm is proposed to maximize the minimum communication signal-to-interference-plus-noise ratio (SINR) of users while guaranteeing a specified sensing SINR value and meeting the per-AP power constraints. The results demonstrate the tightness of the proposed lower bound and the efficiency of the proposed algorithm. Finally, the superiority of using the OTFS signals is verified by a 13-fold expansion of the SE performance gap over the application of orthogonal frequency division multiplexing signals. These findings could guide the future deployment of the CF-ISAC systems, particularly in the field of millimeter waves with a large bandwidth.
New submissions for Friday, 31 May 2024 (showing 7 of 7 entries)
- [8] arXiv:2405.19337 (cross-list from cs.ET) [pdf, ps, other]
Title: Information-theoretic language of proteinoid gels: Boolean gates and QR codes
Comments: 7 pages, 5 figures
Subjects: Emerging Technologies (cs.ET); Information Theory (cs.IT); Biological Physics (physics.bio-ph); Chemical Physics (physics.chem-ph)
With an aim to build analog computers out of soft matter fluidic systems in the future, this work attempts to invent a new information-theoretic language, in the form of two-dimensional Quick Response (QR) codes. This language is, effectively, a digital representation of the analog signals shown by the proteinoids. We use two different experimental techniques: (i) a voltage-sensitive dye and (ii) a pair of differential electrodes, to record the analog signals. The analog signals are digitally approximated (synthesised) by sampling the analog signals into a series of discrete values, which are then converted into binary representations. We have shown the AND-OR-NOT-XOR-NOR-NAND-XNOR gate representation of the digitally sampled signal of proteinoids. Additional encoding schemes are applied to convert the binary code identified above to a two-dimensional QR code. As a result, the QR code becomes a digital, unique marker of a given proteinoid network. We show that it is possible to retrieve the analog signal from the QR code by scanning the QR code using a mobile phone. Our work shows that soft matter fluidic systems, such as proteinoids, can have a fundamental information-theoretic language, unique to their internal information transmission properties (electrical activity in this case) - such a language can be made universal and accessible to everyone using 2D QR codes, which can digitally encode their internal properties and give an option to recover the original signal when required. On a more fundamental note, this study identifies the techniques of approximating continuum properties of soft matter fluidic systems using a series representation of gates and QR codes, which are a piece-wise digital representation, and thus one step closer to programming the fluids using information-theoretic methods, as suggested almost a decade ago by Tao's fluid program.
- [9] arXiv:2405.19889 (cross-list from eess.SP) [pdf, ps, html, other]
Title: Deep Joint Semantic Coding and Beamforming for Near-Space Airship-Borne Massive MIMO Network
Comments: Major Revision by IEEE JSAC
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG); Multimedia (cs.MM)
The near-space airship-borne communication network is recognized as an indispensable component of the future integrated ground-air-space network thanks to airships' advantage of long-term residency at stratospheric altitudes, but it urgently needs reliable and efficient Airship-to-X links. To improve the transmission efficiency and capacity, this paper proposes to integrate semantic communication with massive multiple-input multiple-output (MIMO) technology. Specifically, we propose a deep joint semantic coding and beamforming (JSCBF) scheme for an airship-based massive MIMO image transmission network in space, in which semantics from both source and channel are fused to jointly design the semantic coding and physical-layer beamforming. First, we design two semantic extraction networks to extract semantics from the image source and the channel state information, respectively. Then, we propose a semantic fusion network that can fuse these semantics into complex-valued semantic features for subsequent physical-layer transmission. To efficiently transmit the fused semantic features at the physical layer, we then propose hybrid data- and model-driven semantic-aware beamforming networks. At the receiver, a semantic decoding network is designed to reconstruct the transmitted images. Finally, we perform end-to-end deep learning to jointly train all the modules, using the image reconstruction quality at the receivers as a metric. The proposed deep JSCBF scheme fully combines the efficient source compressibility and robust error correction capability of semantic communication with the high spectral efficiency of massive MIMO, achieving a significant performance improvement over existing approaches.
- [10] arXiv:2405.19891 (cross-list from quant-ph) [pdf, ps, other]
Title: Improving the Fidelity of CNOT Circuits on NISQ Hardware
Comments: 67 pages, 33 figures, and 9 tables
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
We introduce an improved CNOT synthesis algorithm that considers nearest-neighbour interactions and CNOT gate error rates in noisy intermediate-scale quantum (NISQ) hardware. Compared to IBM's Qiskit compiler, it improves the fidelity of a synthesized CNOT circuit by about 2 times on average (up to 9 times). It lowers the synthesized CNOT count by a factor of 13 on average (up to a factor of 162).
Our contribution is twofold. First, we define a $\textsf{Cost}$ function by approximating the average gate fidelity $F_{avg}$. According to the simulation results, $\textsf{Cost}$ fits the error probability of a noisy CNOT circuit, $\textsf{Prob} = 1 - F_{avg}$, much tighter than the commonly used cost functions. On IBM's fake Nairobi backend, it matches $\textsf{Prob}$ to within $10^{-3}$. On other backends, it fits $\textsf{Prob}$ to within $10^{-1}$. $\textsf{Cost}$ accurately quantifies the dynamic error characteristics and shows remarkable scalability. Second, we propose a noise-aware CNOT routing algorithm, NAPermRowCol, by adapting the leading Steiner-tree-based connectivity-aware CNOT synthesis algorithms. A weighted edge is used to encode a CNOT gate error rate and $\textsf{Cost}$-instructed heuristics are applied to each reduction step. NAPermRowCol does not use ancillary qubits and is not restricted to certain initial qubit maps. Compared with algorithms that are noise-agnostic, it improves the fidelity of a synthesized CNOT circuit across varied NISQ hardware. Depending on the benchmark circuit and the IBM backend selected, it lowers the synthesized CNOT count up to $56.95\%$ compared to ROWCOL and up to $21.62\%$ compared to PermRowCol. It reduces the synthesis $\textsf{Cost}$ up to $25.71\%$ compared to ROWCOL and up to $9.12\%$ compared to PermRowCol. Our method can be extended to route a more general quantum circuit, giving a powerful new tool for compiling on NISQ devices.
- [11] arXiv:2405.20115 (cross-list from quant-ph) [pdf, ps, html, other]
Title: Monogamy of nonlocality from multipartite information causality
Comments: First draft, comments are welcome!
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT)
The monogamy of nonlocality is one of the most intriguing and cryptographically significant predictions of quantum theory. The physical principle of information causality offers a promising means to understand and restrict the extent of nonlocality without invoking the abstract mathematical formalism of quantum theory. In this article, we demonstrate that the original bipartite formulation of information causality cannot imply non-trivial monogamy relations, thereby refuting the previous claims. Nevertheless, we show that the recently proposed multipartite formulation of information causality implies stronger-than-no-signaling monogamy relations. We use these monogamy relations to enhance the security of device-independent quantum key distribution against a no-signaling eavesdropper constrained by information causality.
Cross submissions for Friday, 31 May 2024 (showing 4 of 4 entries)
- [12] arXiv:2202.11694 (replaced) [pdf, ps, other]
Title: An information-theoretic proof of the Erdős-Kac theorem
Comments: There is an error in this paper that is addressed in arXiv:2308.10817
Subjects: Information Theory (cs.IT)
In this article we show that the Erdős-Kac theorem, which informally states that the number of prime divisors of very large integers converges to a normal distribution, has an elegant proof via Algorithmic Information Theory.
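For reference, the informal statement above corresponds to the following standard formulation of the theorem. The notation is introduced here and is not taken from the abstract: $\omega(n)$ is the number of distinct prime divisors of $n$ and $\Phi$ is the standard normal distribution function.

```latex
\[
\lim_{N\to\infty}\frac{1}{N}\,
\#\left\{\, 3\le n\le N \;:\;
\frac{\omega(n)-\log\log n}{\sqrt{\log\log n}}\le x \right\}
= \Phi(x)
\qquad\text{for every real } x,
\]
\[
\Phi(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^{2}/2}\,dt .
\]
```

In words, $\omega(n)$ behaves like a normal random variable with mean and variance both equal to $\log\log n$.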
- [13] arXiv:2206.01312 (replaced) [pdf, ps, other]
Title: Optimization of Energy-Constrained IRS-NOMA Using a Complex Circle Manifold Approach
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
This work investigates the performance of intelligent reflective surface (IRS)-assisted uplink non-orthogonal multiple access (NOMA) in energy-constrained networks. Specifically, we formulate and solve two optimization problems; the first aims at minimizing the sum of users' transmit power, while the second targets maximizing the system-level energy efficiency (EE). The two problems are solved by jointly optimizing the users' transmit powers and the beamforming coefficients at the IRS subject to the users' individual uplink rate and transmit power constraints. A novel and low-complexity algorithm is developed to optimize the IRS beamforming coefficients by optimizing the objective function over a complex circle manifold (CCM). To efficiently optimize the IRS phase shifts over the manifold, the optimization problem is reformulated into a feasibility expansion problem, which is reduced to a max-min signal-to-interference-plus-noise ratio (SINR) problem. Then, with the aid of a smoothing technique, the exact penalty method is applied to transform the problem from constrained to unconstrained. The proposed solution is compared against three semi-definite programming (SDP)-based benchmarks, which are semi-definite relaxation (SDR), SDP-difference of convex (SDP-DC) and sequential rank-one constraint relaxation (SROCR). The results show that the manifold algorithm provides better performance than the SDP-based benchmarks, and at a much lower computational complexity for both the transmit power minimization and EE maximization problems. The results also reveal that IRS-NOMA is only superior to orthogonal multiple access (OMA) when the users' target achievable rate requirements are relatively high.
- [14] arXiv:2310.16555 (replaced) [pdf, ps, html, other]
Title: Towards Information Theory-Based Discovery of Equivariances
Comments: 23 pages, 0 figures
Subjects: Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Group Theory (math.GR)
The presence of symmetries imposes a stringent set of constraints on a system. This constrained structure allows intelligent agents interacting with such a system to drastically improve the efficiency of learning and generalization, through the internalisation of the system's symmetries into their information-processing. In parallel, principled models of complexity-constrained learning and behaviour make increasing use of information-theoretic methods. Here, we wish to marry these two perspectives and understand whether and in which form the information-theoretic lens can "see" the effect of symmetries of a system. For this purpose, we propose a novel variant of the Information Bottleneck principle, which has served as a productive basis for many principled studies of learning and information-constrained adaptive behaviour. We show (in the discrete case and under a specific technical assumption) that our approach formalises a certain duality between symmetry and information parsimony: namely, channel equivariances can be characterised by the optimal mutual information-preserving joint compression of the channel's input and output. This information-theoretic treatment furthermore suggests a principled notion of "soft" equivariance, whose "coarseness" is measured by the amount of input-output mutual information preserved by the corresponding optimal compression. This new notion offers a bridge between the field of bounded rationality and the study of symmetries in neural representations. The framework may also allow (exact and soft) equivariances to be automatically discovered.
- [15] arXiv:2312.13921 (replaced) [pdf, ps, html, other]
Title: Hulls of projective Reed-Muller codes over the projective plane
Subjects: Information Theory (cs.IT); Commutative Algebra (math.AC)
By solving a problem regarding polynomials in a quotient ring, we obtain the relative hull and the Hermitian hull of projective Reed-Muller codes over the projective plane. The dimension of the hull determines the minimum number of maximally entangled pairs required for the corresponding entanglement-assisted quantum error-correcting code. Hence, by computing the dimension of the hull we now have all the parameters of the symmetric and asymmetric entanglement-assisted quantum error-correcting codes constructed with projective Reed-Muller codes over the projective plane. As a byproduct, we also compute the dimension of the Hermitian hull for affine Reed-Muller codes in 2 variables.
- [16] arXiv:2405.17789 (replaced) [pdf, ps, html, other]
Title: On the Downlink Average Energy Efficiency of Non-Stationary XL-MIMO
Comments: 13 pages, 11 figures
Subjects: Information Theory (cs.IT)
Extra large-scale multiple-input multiple-output (XL-MIMO) is a key technology for future wireless communication systems. This paper considers the effects of the visibility region (VR) at the base station (BS) in a non-stationary multi-user XL-MIMO scenario, where only part of the antennas can receive the users' signals. In time division duplexing (TDD) mode, we first estimate the VR at the BS by detecting the energy of the received signal during the uplink training phase. The probabilities of two detection errors are derived and the uplink channel on the detected VR is estimated. In downlink data transmission, to avoid cumbersome Monte-Carlo trials, we derive a deterministic approximate expression for the ergodic average energy efficiency (EE) with regularized zero-forcing (RZF) precoding. In frequency division duplexing (FDD) mode, the VR is estimated in uplink training and then the channel information of the detected VR is acquired from the feedback channel. In downlink data transmission, the approximation of the ergodic average EE is also derived with RZF precoding. Invoking the approximate results, we propose an alternating optimization algorithm to design the detection threshold and the pilot length in both TDD and FDD modes. The numerical results reveal the impacts of VR estimation error on the ergodic average EE and demonstrate the effectiveness of our proposed algorithm.
- [17] arXiv:2205.07596 (replaced) [pdf, ps, html, other]
Title: Exact Exponents for Concentration and Isoperimetry in Product Polish Spaces
Comments: IEEE Transactions on Information Theory
Subjects: Probability (math.PR); Information Theory (cs.IT); Functional Analysis (math.FA); Metric Geometry (math.MG)
In this paper, we derive variational formulas for the asymptotic exponents (i.e., convergence rates) of the concentration and isoperimetric functions in the product Polish probability space under certain mild assumptions. These formulas are expressed in terms of relative entropies (which are from information theory) and optimal transport cost functionals (which are from optimal transport theory). Hence, our results verify an intimate connection among information theory, optimal transport, and concentration of measure or isoperimetric inequalities. In the concentration regime, the corresponding variational formula is in fact a dimension-free bound in the sense that this bound is valid for any dimension. A cardinality bound for the alphabet of the auxiliary random variable in the expression of the asymptotic isoperimetric exponent is provided, which makes the expression computable by a finite-dimensional program for the finite alphabet case. We lastly apply our results to obtain an isoperimetric inequality in the classic isoperimetric setting, which is asymptotically sharp under certain conditions. The proofs in this paper are based on information-theoretic and optimal transport techniques.
- [18] arXiv:2302.09904 (replaced) [pdf, ps, html, other]
Title: WW-FL: Secure and Private Large-Scale Federated Learning
Comments: WWFL combines private training and inference with secure aggregation and hierarchical FL to provide end-to-end protection and to facilitate large-scale global deployment
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)
Federated learning (FL) is an efficient approach for large-scale distributed machine learning that promises data privacy by keeping training data on client devices. However, recent research has uncovered vulnerabilities in FL, impacting both security and privacy through poisoning attacks and the potential disclosure of sensitive information in individual model updates as well as the aggregated global model. This paper explores the inadequacies of existing FL protection measures when applied independently, and the challenges of creating effective compositions.
Addressing these issues, we propose WW-FL, an innovative framework that combines secure multi-party computation (MPC) with hierarchical FL to guarantee data and global model privacy. One notable feature of WW-FL is its capability to prevent malicious clients from directly poisoning model parameters, confining them to less destructive data poisoning attacks. We furthermore provide a PyTorch-based FL implementation integrated with Meta's CrypTen MPC framework to systematically measure the performance and robustness of WW-FL. Our extensive evaluation demonstrates that WW-FL is a promising solution for secure and private large-scale federated learning.
- [19] arXiv:2304.11996 (replaced) [pdf, ps, html, other]
Title: Applications of Information Inequalities to Database Theory Problems
Comments: This paper was invited for LICS'2023
Subjects: Databases (cs.DB); Information Theory (cs.IT)
The paper describes several applications of information inequalities to problems in database theory. The problems discussed include: upper bounds of a query's output, worst-case optimal join algorithms, the query domination problem, and the implication problem for approximate integrity constraints. The paper is self-contained: all required concepts and results from information inequalities are introduced here, gradually, and motivated by database problems.
- [20] arXiv:2402.10025 (replaced) [pdf, ps, other]
Title: An improved lower bound on the Shannon capacities of complements of odd cycles
Comments: 7 pages, 1 figure
Subjects: Combinatorics (math.CO); Information Theory (cs.IT)
Improving a 2003 result of Bohman and Holzman, we show that for $n \geq 1$, the Shannon capacity of the complement of the $(2n+1)$-cycle is at least $(2^{r_n} + 1)^{1/r_n} = 2 + \Omega(2^{-r_n}/r_n)$, where $r_n = \exp(O((\log n)^2))$ is the number of partitions of $2(n-1)$ into powers of $2$. We also discuss a connection between this result and work by Day and Johnson in the context of graph Ramsey numbers.
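The bound in this abstract is easy to evaluate numerically for small $n$. The sketch below counts partitions into powers of $2$ with a standard coin-change recurrence and then evaluates $(2^{r_n}+1)^{1/r_n}$; the function names are assumptions chosen for illustration, and nothing here reproduces the paper's construction.

```python
import math

def binary_partitions(m):
    """Number of partitions of m into powers of 2 (order irrelevant)."""
    ways = [1] + [0] * m          # ways[t] = partitions of t using parts so far
    part = 1
    while part <= m:
        for t in range(part, m + 1):
            ways[t] += ways[t - part]
        part *= 2
    return ways[m]

def capacity_lower_bound(n):
    """Evaluate (2**r_n + 1) ** (1 / r_n) with r_n = binary_partitions(2 * (n - 1))."""
    r = binary_partitions(2 * (n - 1))
    # Work in logarithms so that large r does not overflow a float.
    return math.exp(math.log(2 ** r + 1) / r)

if __name__ == "__main__":
    for n in range(1, 7):
        r = binary_partitions(2 * (n - 1))
        print(n, r, capacity_lower_bound(n))
```

For $n = 2$ the formula gives $\sqrt{5}$, which matches the known Shannon capacity of the 5-cycle (the 5-cycle is its own complement); the bound then approaches $2$ from above as $r_n$ grows.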