Atkinson’s Lemma

Birkhoff’s ergodic theorem is arguably one of the most important theorems in ergodic theory. While it is a neat convergence result, it gives very little information about how the Birkhoff averages of a function converge. Do they converge quickly or slowly? Is the convergence uniform, at least for nice functions? And do the averages overestimate or underestimate the limit? It is this last question that Atkinson’s Lemma addresses. These are some notes about it based on a talk by Barak Weiss. The lemma comes in several forms, and below we state one that is useful for understanding why Atkinson’s Lemma holds. To state it, denote for a measure-preserving system (X, \Sigma, \mu; \phi), with Koopman operator T_\phi f = f\circ\phi, and a function f\in\uL^1(X, \Sigma, \mu) the Birkhoff sums by S_nf \defeq \sum_{k=0}^{n-1} T_\phi^k f and the Birkhoff averages by A_nf \defeq \frac{1}{n}S_n f.

Lemma. Let (X, \Sigma, \mu; \phi) be an ergodic measure-preserving system and f\in\uL^1(X, \Sigma, \mu) real-valued. Then for almost every x\in X, there are strictly increasing sequences (n_j)_{j\in\N} and (m_j)_{j\in\N} such that for all j\in\N

(1)   \begin{align*} {A_{n_j}}f(x) \leq \int_X f \dmu \leq A_{m_j}f(x). \end{align*}

Assume that, for almost every x\in X, eventually A_nf(x) > 0. Then Atkinson’s Lemma shows that \int_X f\dmu > 0 and hence, by Birkhoff’s ergodic theorem, A_nf(x) = \int_X f\dmu + o(1). In other words: If the Birkhoff sums S_nf are eventually strictly positive, then they already have to grow linearly. This phenomenon of turning a qualitative property into a quantitative one makes Atkinson’s lemma an important result for several applications. One of them is the positivity of the top Lyapunov exponent for random walks, which turns (via logarithmic reduction to the additive case) unbounded growth of random matrix products into exponential growth.
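
To spell out this step, here is a short sketch for a fixed x at which both Atkinson’s lemma and Birkhoff’s theorem apply; the index N(x), from which on the averages are positive, is notation introduced only for this sketch.

    \begin{align*} 0 < A_{n_j}f(x) \leq \int_X f\dmu \quad \text{whenever } n_j \geq N(x), \qquad \text{and therefore} \qquad S_nf(x) = n\int_X f\dmu + o(n) \quad \text{with } \int_X f\dmu > 0. \end{align*}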

Also, note that while for generic f, the inequalities in Atkinson’s lemma can even be assumed to be strict, the function f = 0 shows that, in general, one cannot expect the inequalities to be strict for every function.

Corollary. Let (X, \Sigma, \mu; \phi) be an ergodic measure-preserving system and f\in\uL^1(X, \Sigma, \mu) real-valued. Then the following assertions are equivalent.

  1. \int_X f\dmu = 0
  2. For almost every x\in X, there are strictly increasing sequences (n_j)_{j\in\N} and (m_j)_{j\in\N} such that for all j\in\N

        \begin{align*} A_{n_j}f(x) \leq 0 \leq A_{m_j}f(x).\end{align*}

Proof. One implication follows from Atkinson’s lemma and the other implication from Birkhoff’s ergodic theorem. \qedsymbol
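
In more detail, and only as a sketch: the implication from 1 to 2 is inequality (1) with \int_X f\dmu = 0. For the converse, Birkhoff’s ergodic theorem shows that both subsequences of averages converge to the integral for almost every x\in X, so that

    \begin{align*} \int_X f\dmu = \lim_{j\to\infty} A_{n_j}f(x) \leq 0 \leq \lim_{j\to\infty} A_{m_j}f(x) = \int_X f\dmu. \end{align*}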

Now, why should Atkinson’s lemma be true? To understand this, recall that a contraction T on a Banach space E is mean ergodic if and only if

    \begin{align*} E = \fix(T)\oplus\overline{\ran}(I-T). \end{align*}

In light of von Neumann’s mean ergodic theorem, this means that for an ergodic measure-preserving system (X, \Sigma, \mu; \phi) and a function f\in \uL^1(X, \Sigma, \mu), f admits a unique decomposition as

    \begin{align*} f = \int_X f\dmu + g \end{align*}

where g \in \overline{\ran}(I - T_\phi). Assume for a moment that g lies in fact in \ran(I - T_\phi), i.e., g = h - T_\phi h for some h\in \uL^1(X, \Sigma, \mu). Then

    \begin{align*} f = \int_X f\dmu + \left(h - T_\phi h\right) \end{align*}

and in this case Atkinson’s lemma follows very quickly from Poincaré’s recurrence theorem since

    \begin{align*} A_nf = \int_X f\dmu + \frac{1}{n}\left(h - T_\phi^n h\right) \end{align*}

which is either constant if h is constant or (by ergodicity) oscillating if h is nonconstant. Most of the work in Atkinson’s lemma thus goes into extending this idea to the general case involving \overline{\ran}(I-T_\phi) instead of \ran(I-T_\phi). It is not immediately obvious why this transition should be possible since \uL^1-perturbations usually do not interact well with almost everywhere convergence. Therefore, before we prove Atkinson’s lemma, it is necessary to gain a better understanding of when a function g in \overline{\ran}(I-T_\phi) actually lies in \ran(I-T_\phi) and how functions in \overline{\ran}(I-T_\phi)\setminus\ran(I-T_\phi) behave.
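
For the special case g = h - T_\phi h considered above, the telescoping can be written out as follows. This is a sketch under the assumption that the Birkhoff sums start at k = 0, that is, S_n = \sum_{k=0}^{n-1} T_\phi^k, and it uses that T_\phi fixes constants.

    \begin{align*} S_n\left(h - T_\phi h\right) = \sum_{k=0}^{n-1}\left(T_\phi^k h - T_\phi^{k+1}h\right) = h - T_\phi^n h, \qquad \text{so} \qquad A_nf - \int_X f\dmu = \frac{1}{n}\left(h - T_\phi^n h\right). \end{align*}

The sign of the right-hand side at x is the sign of h(x) - h(\phi^n(x)), which is what recurrence controls.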

To understand this problem, note that

(2)   \begin{align*} S_{n+1}g = S_n \left(T_\phi g\right) + g \end{align*}

or, equivalently,

    \begin{align*} g = S_{n+1}g - T_\phi \left( S_n g\right). \end{align*}
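
Identity (2) can be checked directly from the definition. The computation below is a sketch assuming S_n g = \sum_{k=0}^{n-1} T_\phi^k g; it also records that T_\phi commutes with S_n, which is what the second form uses.

    \begin{align*} S_{n+1}g = g + \sum_{k=1}^{n} T_\phi^k g = g + \sum_{k=0}^{n-1} T_\phi^k\left(T_\phi g\right) = g + S_n\left(T_\phi g\right) = g + T_\phi\left(S_n g\right). \end{align*}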

Now, if we knew that (S_ng)_{n\in\N} converged, then g would lie in \ran(I - T_\phi). However, this is almost never the case. That being said, it is not actually necessary to take a limit: it suffices to eliminate the dependence on n, e.g. by taking the infimum or supremum over n\in\N. This leads to the following criterion.

Lemma. Let (X, \Sigma, \mu; \phi) be an ergodic measure-preserving system and g\in\overline{\ran}(I-T_\phi) real-valued. Define almost everywhere

    \begin{align*} h\colon X \to \overline{\R}, \quad h(x) \defeq \inf_{n\in\N} S_ng(x). \end{align*}

Then \mu([h = -\infty]) \in \{0,1\}. If \mu([h = -\infty]) = 0, then g = h - T_\phi h.

The upshot of this lemma is: A function g in \overline{\ran}(I-T_\phi) that is not of the form g = h - T_\phi h for some function h must necessarily have Birkhoff sums that are unbounded from below and, indeed, also from above, as can be seen by passing to -g. In other words, the Birkhoff sums of a typical function in \overline{\ran}(I-T_\phi) will exhibit strongly oscillating behavior. To be fair, one has to admit that this lemma is not quite what one would hope for since it is not clear (to me) whether h is integrable and hence whether g \in \ran(I - T_\phi). However, for establishing oscillating behavior, it is sufficient to represent g as h - T_\phi h, regardless of whether h is integrable or not. Thus, our quest remains unaffected!

Proof. For almost every x\in X,

    \begin{align*} T_\phi h(x) = \inf_{n\in\N} S_{n+1} g(x) - g(x) \geq \inf_{n\in\N} S_n g(x) - g(x) = h(x) - g(x). \end{align*}

Therefore, the set [h = -\infty] is \phi-invariant and so \mu([h = -\infty]) \in \{0,1\} by ergodicity. If \mu([h = -\infty]) = 0, we already know by definition that h \leq g and so \mu([h = +\infty]) = 0, showing that h is finite almost everywhere. In this case, we can rearrange the above inequality to g \geq h - T_\phi h.
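
Two steps of this paragraph can be unpacked as follows; this is a sketch, and the abbreviation B \defeq [h = -\infty] is introduced only here. The first equality in the display above is (2) in the form T_\phi(S_n g) = S_{n+1}g - g, applied inside the infimum. The inequality T_\phi h \geq h - g, together with g being finite almost everywhere, shows that h(\phi(x)) = -\infty forces h(x) = -\infty, so

    \begin{align*} T_\phi h(x) = \inf_{n\in\N} S_ng(\phi(x)) = \inf_{n\in\N}\left(S_{n+1}g(x) - g(x)\right), \qquad \phi^{-1}(B) \subseteq B, \qquad \mu\left(\phi^{-1}(B)\right) = \mu(B). \end{align*}

Hence B and \phi^{-1}(B) differ only by a null set, which is the form of invariance that ergodicity requires.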

To see that equality holds, note that g - (h- T_\phi h) \geq 0, so this function has a well-defined (possibly infinite) integral. (We do not know whether h is integrable.) Therefore, by a slight generalization of Birkhoff’s ergodic theorem using cut-offs,

    \begin{align*} \lim_{n\to \infty} A_n\left(T_\phi h - h\right)(x) &= \lim_{n\to \infty} A_n\left(g - (h - T_\phi h)\right)(x) \\ &= \int_X g - (h - T_\phi h) \dmu \quad \mu\text{-a.e.} \end{align*}

since \int_X g\dmu = 0. If we can show that this vanishes, then g - (h- T_\phi h) = 0 since a nonnegative function with zero integral vanishes almost everywhere. However, for almost every x\in X

    \begin{align*} \int_X g - (h - T_\phi h) \dmu &= \lim_{n\to\infty} A_n\left(g - (h - T_\phi h)\right)(x) \\ &= \lim_{n\to \infty} A_n\left(T_\phi h - h\right)(x) \\ &= \lim_{n\to \infty} \left(\frac{1}{n}T_\phi^nh(x) - \frac{1}{n}h(x)\right) \\ &= \liminf_{n\to\infty} \frac{1}{n}T_\phi^n h(x) \\ &= 0 \end{align*}

since by Poincaré recurrence, almost every x\in X returns infinitely often to a set of positive measure where h is bounded. Hence, g = h - T_\phi h. \square

With this criterion in our pocket, Atkinson’s lemma is a piece of cake.

Proof (of Atkinson’s Lemma). First, suppose f can be written as

    \begin{align*} f = \int_X f\dmu + (h - T_\phi h) \end{align*}

for some (not necessarily integrable) h\colon X \to \R. Then the claim is immediate if h is essentially constant and follows swiftly from Poincaré recurrence otherwise since S_n(h - T_\phi h) = h - T_\phi^{n}h will oscillate almost everywhere. If f does not admit such a representation, then neither does -f and so by the lemma above the Birkhoff sums (S_n(f-\int f\dmu))_{n\in\N} must be unbounded from above and below, i.e.,

    \begin{align*} \inf_{n\in \N} S_n\left(f-\int_Xf\dmu\right) = -\infty \qquad \text{and} \qquad \sup_{n\in \N} S_n\left(f-\int_X f\dmu\right) = +\infty \qquad \mu\text{-a.e.} \end{align*}

In particular, for almost every x\in X

    \begin{align*} S_n(f)(x) \leq n\int_X f\dmu \qquad \text{and} \qquad n\int_X f\dmu \leq S_n(f)(x) \end{align*}

each hold infinitely often. Dividing by n yields the claim. \square
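
The oscillation claim in the first case can be made explicit along the following lines. This is a sketch rather than the argument from the talk; the rational q and the abbreviation a \defeq \operatorname{ess\,inf} h (possibly -\infty) are notation introduced here. If q\in\mathbb{Q} satisfies \mu([h \leq q]) > 0, then by ergodicity almost every orbit visits [h\leq q] infinitely often. Applying this to the countably many such q shows that for almost every x with h(x) > a,

    \begin{align*} h(\phi^n(x)) \leq q < h(x) \quad \text{for infinitely many } n\in\N, \text{ for any } q\in\mathbb{Q} \text{ with } a < q < h(x). \end{align*}

If instead h(x) = a and \mu([h = a]) > 0, Poincaré recurrence returns x to [h = a] infinitely often, so h(\phi^n(x)) = h(x) for infinitely many n. Either way h(x) - h(\phi^n(x)) \geq 0 infinitely often, and the symmetric argument with the essential supremum gives h(x) - h(\phi^n(x)) \leq 0 infinitely often.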

The lemma that we used should be compared to the Gottschalk-Hedlund Theorem in topological dynamics:

Theorem. Let (K; \phi) be a minimal topological dynamical system on a compact metric space K. If f\in\uC(K), then

    \begin{align*} \exists g\in\uC(K)\colon f = g - T_\phi g \quad \iff \quad \exists M > 0 \forall x\in K\forall N\in\N: \left|\sum_{n=0}^N T_\phi^nf(x)\right| < M. \end{align*}
