Homomogeneous length functions on Groups

A PolyMath adventure

Siddhartha Gadgil

Indian Institute of Science

joint with

the rest of (spontaneous) polymath 14

The PolyMath 14 participants

Tobias Fritz, MPI MIS
Siddhartha Gadgil, IISc, Bangalore
Apoorva Khare, IISc, Bangalore
Pace Nielsen, BYU
Lior Silberman, UBC
Terence Tao, UCLA

On Saturday, December 16, 2017, Terrence Tao posted on his blog a question, which Apoorva Khare had asked him.

Is there a homogeneous, (conjugacy invariant) length function on the free group on two generators?

Six days later, this was answered in a collaboration involving several mathematicians (and a computer).
This the story of the answer and its discovery.

Length functions

Fix a group $G$, i.e. a set with an associative product with inverses.
A pseudo-length function $l: G \to [0, \infty)$ is a function such that:
- $l(e) = 0$.
- $l(g^{-1}) = l(g)$ , for all $g \in G$.
- (Triangle inequality) $l(gh) \leq l(g) + l(h)$, for all $g,h\in G$.
A length function is a pseudo-length function such that $l(g) > 0$ whenever $g\neq e$ (positivity condition).

A pseudo-length function $l$ is said to be conjugacy invariant if $l(ghg^{-1})=l(h)$ for all $g, h \in G$.
$l$ is said to be homogeneous if $l(g^n) = nl(g)$ for all $g \in G$.
On the group $\mathbb{Z}^2$ of pairs of integers, a (conjugacy invariant) homogeneous length function is given by $l_{\mathbb{Z}^2}((a,b)) = |a| + |b|$.
On the other hand, if $l$ is homogeneous and $g^n=e$ then $l(g)=0$ as $nl(g) = l(g^n) = 0$.
Thus, if $G$ has torsion, i.e., there exists $g\neq e\in G$, such that $g^n =0$ for some $n > 0$, then there is no homogeneous length function on $G$.

Free group $\mathbb{F}_2$ on $\alpha$, $\beta$

Consider words in four letters $\alpha$, $\beta$, $\bar{\alpha}$ and $\bar{\beta}$.
We multiply words by concatenation, e.g. $\alpha\beta\cdot\bar{\alpha}\beta = \alpha\beta\bar{\alpha}\beta$.
Two words are regarded as equal if they can be related by cancellating adjacent letters that are inverses, e.g. $\alpha\beta\bar{\beta}\alpha\beta = \alpha\alpha\beta$.
This gives a group; e.g. $(\alpha\bar{\beta}\alpha\beta\beta)^{-1} = \bar{\beta}\bar{\beta}\bar{\alpha}\beta\bar{\alpha}$.
The word length on $\mathbb{F_2}$ is the length of the shortest word representing an element.

Pullback pseudo-lengths

We define $ab: \mathbb{F}_2 \to \mathbb{Z}^2$ by total exponents (ignoring order), e.g. $ab(\alpha\alpha\beta\bar{\alpha}\bar{\beta}\alpha\bar{\beta}) = (2, -1)$.
We get a pullback pseudo-length $l$ on $\mathbb{F}_2$ by $l_{ab}(g) = l_{\mathbb{Z}^2}(ab(g))$.
As $l_{\mathbb{Z}^2}$ is homogeneous, so is $l_{ab}$.
However, this is not a length function as $l_{ab}([\alpha, \beta])=0$, where $[g, h] = ghg^{-1}h^{-1}$ (violating positivity).
More generally, every group $G$ has an Abelianization $ab: G \to G/[G, G]$.

The Main results

Question: Is there a homogeneous length function on the free group on two generators?
Answer: No; we in fact describe all homogeneous pseudo-lengths.
Theorem: Any homogeneous pseudo-length function on a group $G$ is the pullback of a pseudo-length on its Abelianization $G / [G, G]$.
Corollary: If $G$ is not abelian (e.g. $G = \mathbb{F}_2$) there is no homogeneous length function on $G$.
In fact, every homogeneous pseudo-length is the pullback of a norm on a vector space.

First reductions

(Fritz): Homomogeneous implies Conjugacy invariant.
(Tao, Khare): $l(g^2) = 2l(g) \forall g\in G$ implies Homogeneity.

Finding bounds: contradict positivity?

Fix $l$ homomogeneous pseudo-length function.
The triangle inequality gives bounds on $l(gh)$ in terms of $l(g)$ and $l(h)$.
Homogeneity gives a bound on $l(g)$ in terms of a bound on $l(g^n)$; also use conjugacy invariance.
Find bounds for elements in terms of bounds for others, especially
- $l([x, y])$ in terms of $l(x)$ and $l(y)$
- $l(g)$ for fixed $g$ in terms of bounds on generators.
Combine bounds, including by interpreting in terms of convexity.

Seeking a homogeneous length function

We can attempt to construct pseudo-lengths that are positive, or at least such that $l([\alpha, \beta])> 0$.
Can consider quotients larger than $\mathbb{Z}^2$: negative results by Sawin, Silberman, Tao.
Given a candidate $l: G \to [0, \infty)$, try to fix this to ensure homogeneity and the triangle inequality.
There were also attempted constructions based on functional analysis.

Repairing lengths

We start with a candidate function $l: G \to \mathbb{R}$, say the word length.
Homogenize: $\bar{l}(g) = lim_{n\to \infty} \frac{l(g^n)}{n}$.
The limit exists by sub-addivity of $l(g^n)$ (for fixed $g$) by Fekete's lemma.
Kobayashi construction: Fix triangle inequality by $\bar{l}(g) = \inf\{\sum_{i=1}^nl(g_i) : g=\prod_{i=1}^n g_i \}$.
I hoped to combine these with an approach based on non-crossing matchings.

Non-crossing matchings

Given a word in $\mathbb{F}_2$, we consider matchings such that
- letters are paired with their inverses,
- there are no crossings
The norm is the number of unmatched letters.

For $g\in \mathbb{F}_2$, define the Watson-Crick length $l^{}_{WC}(g)$ to be the minimum number of unmatched letters over all non-crossing matchings.
$l^{}_{WC}$ is unchanged under cancellation (hence well-defined on $\mathbb{F}_2$) and conjugacy invariant.
However, if $g=\alpha[\alpha, \beta]$, then $l^{}_{WC}(g^2)= 4$, but $l^{}_{WC}(g) = 3$, violating homogeneity.
Non-crossing matchings correspond to decompositions into conjugates, e.g., $(\alpha[\alpha, \beta])^2 = ((\beta)^\alpha (\alpha^2)^{\bar{\beta}})^\alpha\bar{\beta}$, where ($g^h$ denotes $hgh^{-1}$; giving proofs of bounds and showing $l^{}_{WC}$ is the maximal conjugacy-invariant length.

The great bound chase

By Tuesday morning, most people were convinced that there are no homogeneous length functions on the free group.
There was a steady improvements in the combinatorial/analytic bounds on $l([\alpha, \beta])$.
These seemed to be stuck above 1 (as observed by Khare) - but eventually broke this barrier (work of Fritz, Khare, Nielsen, Silberman, Tao).
At this stage, my approach diverged.

Computer Assisted proofs: beyond (symbolic) computation, enumeration?

We can recursively compute the matching length $l^{}_{WC}(g)$ for a word $g$.
This gives an upper bound on all conjugacy-invariant normalized lengths.
This can be combined with using homogenity.
Using this, I obtained an upper bound of about $0.85$ on $l(\alpha, \beta)$.
This was upgraded to a (computer) checkable proof.
On Thursday morning, I posted an in principle human readable proof of a bound.

The computer-generated proof was studied by Pace Nielsen, who extracted the internal repetition trick.
This was extended by Nielsen and Fritz and generalized by Tao; from this Fritz obtained the key lemma:
Let $f(m, k) = l(x^m[x, y]^k)$. Then $f(m, k) \leq\frac{f(m-1, k) + f(m+1, k-1)}{2}$.
A probabilistic argument of Tao finished the proof.

Computer bounds and proofs

For $g=ah$, $a \in \{\alpha, \beta, \bar{\alpha}, \bar{\beta}\}$, the length $l^{}_{WC}(g)$ is the minimum of:
- $1 + l^{}_{WC}(h)$ : corresponding to $a$ unmatched.
- $\min\{l^{}_{WC}(x) + l^{}_{WC}(y): h = x\bar{a}y \}$.
We can describe a minimal non-crossing matching by a similar recursive definition.
A similar recursion gives a proof of a bound on $l(g)$ for $g$ a homogeneous, conjugacy-invariant length with $l(\alpha)=l(\beta)=1$.
We can also use homogeneity for selected elements and powers to bound the length function $l$.

Proofs of $l(word)\leq bound$:


sealed abstract class LinNormBound(
	val word: Word,
	val bound: Double)

final case class Gen(n: Int) extends
  LinNormBound(Word(Vector(n)), 1)

final case class ConjGen(n: Int,pf: LinNormBound) extends
  LinNormBound(n +: pf.word :+ (-n), pf.bound)

final case class Triang(
  pf1: LinNormBound, pf2: LinNormBound) extends
    LinNormBound( pf1.word ++ pf2.word, pf1.bound + pf2.bound)

final case class PowerBound(
  baseword: Word, n: Int, pf: LinNormBound) extends
    LinNormBound(baseword, pf.bound/n){
      require(pf.word == baseword.pow(n))
    }

final case object Empty extends
 LinNormBound(Word(Vector()), 0)

On the computer proof

Our proofs, and their correctness are self-contained, and separate from the act of finding the proof.
Hence the correctness of the proof depended only on the correctness of the code defining proofs, which was tiny.
Further, while the proof was large, the search space for the proof was enormous.
Hence the proof had information content beyond establishing the truth of a statement.
In particular, one can learn from such a proof.

Limitations of the computer proof

The elements for which we applied homogeneity were selected by hand.
More importantly, in our representations of proofs, the bounds were only for concrete group elements.
In particular, we cannot
- represent inequalities for expressions.
- use induction.
Would want proof in complete foundations - which I completed a few days after the PolyMath proof.

Lemma: If $x = s(wy)s^{-1} = t(zw^{-1})t^{-1}$, we have $l(x)\leq \frac{l(y)+ l(z)}{2}$.
Proof: $l(x^nx^n) = l(s(wy)^ns^{-1}t(zw^{-1})^nt^{-1})$ $\leq n(l(y) +l(z)) + 2(l(s) + l(t))$
Use $l(x) = \frac{l(x^nx^n)}{2n}$ and take limits.

Lemma: $f(m, k) \leq\frac{f(m-1, k) + f(m+1, k-1)}{2}$, where $f(m, k) = l(x^m[x, y]^k)$.
In other words, for $Y$ a Rademacher random variable, i.e., $Y$ is $1$ or $-1$ each with probability $1/2$, $f(m, k)\leq E(f( (m, k - 1/2) + Y(1, -1/2) ))$.
Hence for $Y_i$ i.i.d. Rademacher random variables, $f(0, n) \leq E(f((Y_1 + Y_2 + \dots + Y_{2n}) (1, -1/2) ))$,
By triangle inequality, $f(a, b) \leq A\Vert (a, b) \Vert$.
$Y_1 + Y_2 + \dots Y_{2n}$ has mean $(0, 0)$ and variance $2n$, so $E(\Vert Y_1 + Y_2 + \dots Y_{2n} \Vert) \leq B\sqrt{n}$.
We deduce that $l([x, y]^n) = f(0, n) \leq C\sqrt{n}$, hence $l([x, y]) = 0$.

Epilogue: Quasification

The function $l: G \to [0, \infty)$ is a quasi-pseudo-length function if there exists $c \in \mathbb{R}$ such that $l(gh) \leq l(g) + l(h) + c$, for all $g,h\in G$.
We see that for a homogeneous quasi-pseudo-length function, $l([x, y]) \leq 4c$ for all $x, y\in G$.
Free groups have homogeneous quasi-pseudo-length functions that are not pullbacks of norms.
But, for a group with vanishing stable commutator length, e.g. if $G$ is solvable or $G=Sl(n , \mathbb{Z})$, $n\geq 3$, any homogeneous quasi-pseudo-length function is equivalent to the pullback of a norm.