Sparsification of Large Ultrametric Matrices: Insights into the Microbial Tree of Life* Journal Article uri icon

Overview

abstract

  • AbstractStrictly ultrametric matrices appear in many domains of mathematics and science; nevertheless, they can be large and dense, making them difficult to store and manipulate, unlike large but sparse matrices. In this manuscript, we exploit that strictly ultrametric matrices can be represented as binary trees to sparsify them via an orthonormal base change based on Haar-like wavelets. We show that, with overwhelmingly high probability, only an asymptotically negligible fraction of the off-diagonal entries in random but large strictly ultrametric matrices remain non-zero after the base change; and develop an algorithm to sparsify such matrices directly from their tree representation. We also identify the subclass of matrices diagonalized by the Haar-like wavelets and supply a sufficient condition to approximate the spectrum of strictly ultrametric matrices outside this subclass. Our methods give computational access to the covariance matrix of the microbiologists’ Tree of Life, which was previously inaccessible due to its size, and motivate introducing a new wavelet-based (beta-diversity) metric to compare microbial environments. Unlike the established (beta-diversity) metrics, the new metric may be used to identify internal nodes (i.e., splits) in the Tree that link microbial composition and environmental factors in a statistically significant manner.MSC codes05C05, 15A18, 42C40, 65F55, 92C70

publication date

  • August 21, 2022

has restriction

  • green

Date in CU Experts

  • August 30, 2022 2:52 AM

Full Author List

  • Gorman ED; Lladser ME

author count

  • 2

Other Profiles