Homomogeneous length functions on Groups

A PolyMath adventure

Siddhartha Gadgil

Indian Institute of Science

joint with

the rest of (spontaneous) polymath 14

The PolyMath 14 participants

  • Tobias Fritz, MPI MIS
  • Siddhartha Gadgil, IISc, Bangalore
  • Apoorva Khare, IISc, Bangalore
  • Pace Nielsen, BYU
  • Lior Silberman, UBC
  • Terence Tao, UCLA
  • On Saturday, December 16, 2017, Terrence Tao posted on his blog a question, which Apoorva Khare had asked him.
  • Is there a homogeneous, (conjugacy invariant) length function on the free group on two generators?
  • Six days later, this was answered in a collaboration involving several mathematicians (and a computer).
  • This the story of the answer and its discovery.

Length functions

  • Fix a group $G$, i.e. a set with an associative product with inverses.
  • A pseudo-length function $l: G \to [0, \infty)$ is a function such that:
    • $l(e) = 0$.
    • $l(g^{-1}) = l(g)$ , for all $g \in G$.
    • (Triangle inequality) $l(gh) \leq l(g) + l(h)$, for all $g,h\in G$.
  • A length function is a pseudo-length function such that $l(g) > 0$ whenever $g\neq e$ (positivity condition).
  • A pseudo-length function $l$ is said to be conjugacy invariant if $l(ghg^{-1})=l(h)$ for all $g, h \in G$.
  • $l$ is said to be homogeneous if $l(g^n) = nl(g)$ for all $g \in G$.
  • On the group $\mathbb{Z}^2$ of pairs of integers, a (conjugacy invariant) homogeneous length function is given by $l_{\mathbb{Z}^2}((a,b)) = |a| + |b|$.
  • On the other hand, if $l$ is homogeneous and $g^n=e$ then $l(g)=0$ as $nl(g) = l(g^n) = 0$.
  • Thus, if $G$ has torsion, i.e., there exists $g\neq e\in G$, such that $g^n =0$ for some $n > 0$, then there is no homogeneous length function on $G$.

Free group $\mathbb{F}_2$ on $\alpha$, $\beta$

  • Consider words in four letters $\alpha$, $\beta$, $\bar{\alpha}$ and $\bar{\beta}$.
  • We multiply words by concatenation, e.g. $\alpha\beta\cdot\bar{\alpha}\beta = \alpha\beta\bar{\alpha}\beta$.
  • Two words are regarded as equal if they can be related by cancellating adjacent letters that are inverses, e.g. $\alpha\beta\bar{\beta}\alpha\beta = \alpha\alpha\beta$.
  • This gives a group; e.g. $(\alpha\bar{\beta}\alpha\beta\beta)^{-1} = \bar{\beta}\bar{\beta}\bar{\alpha}\beta\bar{\alpha}$.
  • The word length on $\mathbb{F_2}$ is the length of the shortest word representing an element.

Pullback pseudo-lengths

  • We define $ab: \mathbb{F}_2 \to \mathbb{Z}^2$ by total exponents (ignoring order), e.g. $ab(\alpha\alpha\beta\bar{\alpha}\bar{\beta}\alpha\bar{\beta}) = (2, -1)$.
  • We get a pullback pseudo-length $l$ on $\mathbb{F}_2$ by $l_{ab}(g) = l_{\mathbb{Z}^2}(ab(g))$.
  • As $l_{\mathbb{Z}^2}$ is homogeneous, so is $l_{ab}$.
  • However, this is not a length function as $l_{ab}([\alpha, \beta])=0$, where $[g, h] = ghg^{-1}h^{-1}$ (violating positivity).
  • More generally, every group $G$ has an Abelianization $ab: G \to G/[G, G]$.

The Main results

  • Question: Is there a homogeneous length function on the free group on two generators?
  • Answer: No; we in fact describe all homogeneous pseudo-lengths.
  • Theorem: Any homogeneous pseudo-length function on a group $G$ is the pullback of a pseudo-length on its Abelianization $G / [G, G]$.
  • Corollary: If $G$ is not abelian (e.g. $G = \mathbb{F}_2$) there is no homogeneous length function on $G$.
  • In fact, every homogeneous pseudo-length is the pullback of a norm on a vector space.

First reductions

  • (Fritz): Homomogeneous implies Conjugacy invariant.
  • (Tao, Khare): $l(g^2) = 2l(g) \forall g\in G$ implies Homogeneity.

Finding bounds: contradict positivity?

  • Fix $l$ homomogeneous pseudo-length function.
  • The triangle inequality gives bounds on $l(gh)$ in terms of $l(g)$ and $l(h)$.
  • Homogeneity gives a bound on $l(g)$ in terms of a bound on $l(g^n)$; also use conjugacy invariance.
  • Find bounds for elements in terms of bounds for others, especially
    • $l([x, y])$ in terms of $l(x)$ and $l(y)$
    • $l(g)$ for fixed $g$ in terms of bounds on generators.
  • Combine bounds, including by interpreting in terms of convexity.

Seeking a homogeneous length function

  • We can attempt to construct pseudo-lengths that are positive, or at least such that $l([\alpha, \beta])> 0$.
  • Can consider quotients larger than $\mathbb{Z}^2$: negative results by Sawin, Silberman, Tao.
  • Given a candidate $l: G \to [0, \infty)$, try to fix this to ensure homogeneity and the triangle inequality.
  • There were also attempted constructions based on functional analysis.

Repairing lengths

  • We start with a candidate function $l: G \to \mathbb{R}$, say the word length.
  • Homogenize: $\bar{l}(g) = lim_{n\to \infty} \frac{l(g^n)}{n}$.
  • The limit exists by sub-addivity of $l(g^n)$ (for fixed $g$) by Fekete's lemma.
  • Kobayashi construction: Fix triangle inequality by $\bar{l}(g) = \inf\{\sum_{i=1}^nl(g_i) : g=\prod_{i=1}^n g_i \}$.
  • I hoped to combine these with an approach based on non-crossing matchings.

Non-crossing matchings

  • Given a word in $\mathbb{F}_2$, we consider matchings such that
    • letters are paired with their inverses,
    • there are no crossings
  • The norm is the number of unmatched letters.
  • For $g\in \mathbb{F}_2$, define the Watson-Crick length $l^{}_{WC}(g)$ to be the minimum number of unmatched letters over all non-crossing matchings.
  • $l^{}_{WC}$ is unchanged under cancellation (hence well-defined on $\mathbb{F}_2$) and conjugacy invariant.
  • However, if $g=\alpha[\alpha, \beta]$, then $l^{}_{WC}(g^2)= 4$, but $l^{}_{WC}(g) = 3$, violating homogeneity.
  • Non-crossing matchings correspond to decompositions into conjugates, e.g., $(\alpha[\alpha, \beta])^2 = ((\beta)^\alpha (\alpha^2)^{\bar{\beta}})^\alpha\bar{\beta}$, where ($g^h$ denotes $hgh^{-1}$; giving proofs of bounds and showing $l^{}_{WC}$ is the maximal conjugacy-invariant length.

The great bound chase

  • By Tuesday morning, most people were convinced that there are no homogeneous length functions on the free group.
  • There was a steady improvements in the combinatorial/analytic bounds on $l([\alpha, \beta])$.
  • These seemed to be stuck above 1 (as observed by Khare) - but eventually broke this barrier (work of Fritz, Khare, Nielsen, Silberman, Tao).
  • At this stage, my approach diverged.

    Computer Assisted proofs: beyond (symbolic) computation, enumeration?

  • We can recursively compute the matching length $l^{}_{WC}(g)$ for a word $g$.
  • This gives an upper bound on all conjugacy-invariant normalized lengths.
  • This can be combined with using homogenity.
  • Using this, I obtained an upper bound of about $0.85$ on $l(\alpha, \beta)$.
  • This was upgraded to a (computer) checkable proof.
  • On Thursday morning, I posted an in principle human readable proof of a bound.
  • The computer-generated proof was studied by Pace Nielsen, who extracted the internal repetition trick.
  • This was extended by Nielsen and Fritz and generalized by Tao; from this Fritz obtained the key lemma:
  • Let $f(m, k) = l(x^m[x, y]^k)$. Then $f(m, k) \leq\frac{f(m-1, k) + f(m+1, k-1)}{2}$.
  • A probabilistic argument of Tao finished the proof.

Computer bounds and proofs

  • For $g=ah$, $a \in \{\alpha, \beta, \bar{\alpha}, \bar{\beta}\}$, the length $l^{}_{WC}(g)$ is the minimum of:
    • $1 + l^{}_{WC}(h)$ : corresponding to $a$ unmatched.
    • $\min\{l^{}_{WC}(x) + l^{}_{WC}(y): h = x\bar{a}y \}$.
  • We can describe a minimal non-crossing matching by a similar recursive definition.
  • A similar recursion gives a proof of a bound on $l(g)$ for $g$ a homogeneous, conjugacy-invariant length with $l(\alpha)=l(\beta)=1$.
  • We can also use homogeneity for selected elements and powers to bound the length function $l$.

Proofs of $l(word)\leq bound$:


sealed abstract class LinNormBound(
	val word: Word,
	val bound: Double)

final case class Gen(n: Int) extends
  LinNormBound(Word(Vector(n)), 1)

final case class ConjGen(n: Int,pf: LinNormBound) extends
  LinNormBound(n +: pf.word :+ (-n), pf.bound)

final case class Triang(
  pf1: LinNormBound, pf2: LinNormBound) extends
    LinNormBound( pf1.word ++ pf2.word, pf1.bound + pf2.bound)

final case class PowerBound(
  baseword: Word, n: Int, pf: LinNormBound) extends
    LinNormBound(baseword, pf.bound/n){
      require(pf.word == baseword.pow(n))
    }

final case object Empty extends
 LinNormBound(Word(Vector()), 0)

On the computer proof

  • Our proofs, and their correctness are self-contained, and separate from the act of finding the proof.
  • Hence the correctness of the proof depended only on the correctness of the code defining proofs, which was tiny.
  • Further, while the proof was large, the search space for the proof was enormous.
  • Hence the proof had information content beyond establishing the truth of a statement.
  • In particular, one can learn from such a proof.

Limitations of the computer proof

  • The elements for which we applied homogeneity were selected by hand.
  • More importantly, in our representations of proofs, the bounds were only for concrete group elements.
  • In particular, we cannot
    • represent inequalities for expressions.
    • use induction.
  • Would want proof in complete foundations - which I completed a few days after the PolyMath proof.
  • Lemma: If $x = s(wy)s^{-1} = t(zw^{-1})t^{-1}$, we have $l(x)\leq \frac{l(y)+ l(z)}{2}$.
  • Proof: $l(x^nx^n) = l(s(wy)^ns^{-1}t(zw^{-1})^nt^{-1})$ $\leq n(l(y) +l(z)) + 2(l(s) + l(t))$
  • Use $l(x) = \frac{l(x^nx^n)}{2n}$ and take limits.
  • Lemma: $f(m, k) \leq\frac{f(m-1, k) + f(m+1, k-1)}{2}$, where $f(m, k) = l(x^m[x, y]^k)$.
  • In other words, for $Y$ a Rademacher random variable, i.e., $Y$ is $1$ or $-1$ each with probability $1/2$, $f(m, k)\leq E(f( (m, k - 1/2) + Y(1, -1/2) ))$.
  • Hence for $Y_i$ i.i.d. Rademacher random variables, $f(0, n) \leq E(f((Y_1 + Y_2 + \dots + Y_{2n}) (1, -1/2) ))$,
  • By triangle inequality, $f(a, b) \leq A\Vert (a, b) \Vert$.
  • $Y_1 + Y_2 + \dots Y_{2n}$ has mean $(0, 0)$ and variance $2n$, so $E(\Vert Y_1 + Y_2 + \dots Y_{2n} \Vert) \leq B\sqrt{n}$.
  • We deduce that $l([x, y]^n) = f(0, n) \leq C\sqrt{n}$, hence $l([x, y]) = 0$.

Epilogue: Quasification

  • The function $l: G \to [0, \infty)$ is a quasi-pseudo-length function if there exists $c \in \mathbb{R}$ such that $l(gh) \leq l(g) + l(h) + c$, for all $g,h\in G$.
  • We see that for a homogeneous quasi-pseudo-length function, $l([x, y]) \leq 4c$ for all $x, y\in G$.
  • Free groups have homogeneous quasi-pseudo-length functions that are not pullbacks of norms.
  • But, for a group with vanishing stable commutator length, e.g. if $G$ is solvable or $G=Sl(n , \mathbb{Z})$, $n\geq 3$, any homogeneous quasi-pseudo-length function is equivalent to the pullback of a norm.