We discuss and compare two ways to estimate a short second moment of \(L\)-functions on \({\mathop{\mathrm{PGL}}}_2\) in the spectral aspect: that of Iwaniec (Iwaniec 1992), using the approximate functional equation and Kuznetsov formula, and that of (Nelson 2021) using period integrals.
Remark 1. This is a lightly edited version of “log-2022-03-08-a.tex”. It needs a lot of polishing, but might be useful to someone even in its current state.
Take \[G := {\mathop{\mathrm{PGL}}}_2(\mathbb{R}),\] \[\Gamma := {\mathop{\mathrm{PGL}}}_2(\mathbb{Z}) \hookrightarrow G,\] \[H := {\mathop{\mathrm{GL}}}_1(\mathbb{R}) \cong \begin{pmatrix} \ast & 0 \\ 0 & 1 \end{pmatrix} \hookrightarrow G,\] \[\Gamma_H := {\mathop{\mathrm{GL}}}_1(\mathbb{Z}) = \{\pm 1\} \hookrightarrow H.\] Let \(\pi \subset L^2_{\text{cusp}}(\Gamma \backslash G)\) be a cuspidal automorphic representation corresponding to a Maass form of eigenvalue \(1/4 + T^2\). We aim to show that \[\label{eqn:35ac318817}\tag{1.1} L(\pi,\tfrac{1}{2}) \ll T^{1/2 - \delta}\] for some fixed \(\delta > 0\).
Such an estimate was first shown by Iwaniec (Iwaniec 1992) (initially under an additional hypothesis, which Iwaniec later observed could be removed using the identity \(\lambda(p)^2 - \lambda(p^2) = 1\)). Iwaniec’s method consists of estimating an amplified second moment over \(T\) in an interval of length a bit more than \(1\). This method was adapted to \({\mathop{\mathrm{PGL}}}_3\) by Blomer–Buttcane (Blomer and Buttcane 2020a, 2020b), using an amplified fourth moment, and then to \({\mathop{\mathrm{GL}}}_n\) in (Nelson 2021), using an amplified \(2(n-1)\)th moment. In the works of Iwaniec and Blomer–Buttcane, the moment is estimated using the approximate functional equation, Kuznetsov formula, and several applications of Poisson summation. In (Nelson 2021), the moment is estimated implicitly by direct analysis of an integral representation for the \(L\)-function, with vectors chosen as in (Nelson and Venkatesh 2021) and the averaging implemented via the pretrace formula as in (Iwaniec and Sarnak 1995).
Remark 2. The best known bound of the form \((1.1)\) is due to Ivić (Ivić 2001), who showed that \((1.1)\) holds for any fixed \(\delta < 1/6\). Ivić’s method consists of estimating an unamplified fourth moment over \(T\) in an interval of length a bit more than \(T^{1/3}\) (compare with (Michel and Venkatesh 2010; Nelson 2019; Blomer, Jana, and Nelson 2023; Balkanova, Frolenkov, and Wu 2022)). Such approaches using “higher moments” have no known extension beyond \({\mathop{\mathrm{GL}}}_2\). It would be significant to identify such an extension.
The space \(L^2_{\mathop{\mathrm{cusp}}}(\Gamma \backslash G)\) consists of square-integrable functions on \(\Gamma \backslash G\) whose constant term vanishes. It is simultaneously a representation for the group \(G\), acting via right translation, and the Hecke algebra, acting via left translation with respect to double cosets in \(\Gamma \backslash {\mathop{\mathrm{PGL}}}_2(\mathbb{Q}) / \Gamma\). Under these actions, it decomposes as a direct sum of irreducible representations, each occurring with multiplicity one. We denote by \(\pi\) the space of smooth vectors inside one such representation.
The Hecke eigenvalues of \(\pi\) are described by a multiplicative function \(\lambda_\pi : \mathbb{N} \rightarrow \mathbb{C}\), which specifies the eigenvalues for a spanning set of double cosets. For \(\varphi \in \pi\), one defines the Whittaker function \(W_\varphi : G \rightarrow \mathbb{C}\) by \[W_\varphi (g) := \int_{x \in \mathbb{R} / \mathbb{Z} } \varphi \left( \begin{pmatrix} 1 & x \\ 0 & 1 \\ \end{pmatrix} g \right) e(-x) \, d x,\] where we employ the standard abbreviation \(e(t) := e^{2 \pi i t}\). Conversely, \(\varphi\) may be recovered from \(\lambda_\pi\) and \(W_\varphi\) via the formula \[\label{eqn:35ac31d7c1}\tag{2.1} \varphi(g) = \sum _{n \neq 0} \frac{\lambda_{\pi}(|n|)}{|n|^{1/2}} W_\varphi \left( \begin{pmatrix} n & 0 \\ 0 & 1 \end{pmatrix} g \right),\] with the sum taken over all nonzero integers \(n\).
The Whittaker model \(\mathcal{W}(\pi)\) for \(\pi\) is defined to consist of all functions of the form \(W_\varphi\). The theory of the Kirillov model implies that the restriction map \[\mathcal{W}(\pi) \rightarrow \{\text{functions } H \rightarrow \mathbb{C} \}\] is injective, and that its image contains \(C_c^\infty(H)\). Combining this fact with \((2.1)\) tells us that
any \(\varphi \in \pi\) is determined by the restriction \(W_\varphi|_H : H \rightarrow \mathbb{C}\), and
every smooth compactly-supported function \(H \rightarrow \mathbb{C}\) arises as \(W_\varphi|_H\) for a unique \(\varphi \in \pi\).
The \(L\)-function \(L(\pi,s)\) may be defined for \(\Re(s) > 1\) by the absolutely convergent Dirichlet series \(\sum_{n \in \mathbb{N} } \lambda_\pi(n) n^{-s}\) and in general by meromorphic continuation. By unfolding \((2.1)\), one obtains the Hecke/Jacquet–Langlands integral representation \[\label{eq:int-_h-varphi}\tag{3.1} \int _{\Gamma_{H} \backslash H} \varphi = L(\pi,\tfrac{1}{2}) \int _H W_\varphi.\]
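As a sketch of the unfolding, take \(\Re(s)\) large enough that everything converges absolutely (the value at \(s = \tfrac{1}{2}\) is then understood by analytic continuation). We assume here, as conventions not spelled out above, that \(H\) carries the Haar measure \(d^\times y = dy/|y|\) and that \(\Gamma_H \backslash H\) is identified with \(\{y > 0\}\), and we write \(a(y) := \begin{pmatrix} y & 0 \\ 0 & 1 \end{pmatrix}\). Inserting \((2.1)\) and substituting \(y \mapsto y/|n|\) in the \(n\)th term gives \[\begin{aligned} \int_{\Gamma_H \backslash H} \varphi \, |.|^{s-1/2} &= \sum_{n \neq 0} \frac{\lambda_\pi(|n|)}{|n|^{1/2}} \int_{y > 0} W_\varphi(a(n y)) \, y^{s-1/2} \, d^\times y \\ &= \sum_{n \neq 0} \frac{\lambda_\pi(|n|)}{|n|^{s}} \int_{y > 0} W_\varphi(a(\operatorname{sgn}(n) y)) \, y^{s-1/2} \, d^\times y \\ &= \Big( \sum_{n \geq 1} \frac{\lambda_\pi(n)}{n^{s}} \Big) \int_{y \in \mathbb{R}^\times} W_\varphi(a(y)) \, |y|^{s-1/2} \, d^\times y, \end{aligned}\] where in the last step the terms \(n\) and \(-n\) combine to cover the positive and negative halves of \(H\). Specializing to \(s = \tfrac{1}{2}\) recovers \((3.1)\).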
We denote by \(\mathfrak{g}\) the Lie algebra of \(G\) and by \(\mathfrak{g}^*\) its linear dual. Using the trace pairing, we may identify \(\mathfrak{g}^*\) with \(\mathfrak{g}\), which we identify further with the space \({\mathop{\mathrm{\mathfrak{s}\mathfrak{l}}}}_2(\mathbb{R})\) of traceless \(2 \times 2\) matrices \(\xi\). The coadjoint orbit \[\mathcal{O}_\pi \subseteq \mathfrak{g}^*\] attached to \(\pi\) is the one-sheeted hyperboloid cut out by the equation \[\det(\xi) = -T^2.\] (See for instance §6.2-§6.3 of Quantum variance III.) It comes with a natural symplectic volume form that describes the character of \(\pi\) via the Kirillov formula (see (Nelson and Venkatesh 2021, sec. 6)). We note that in the coordinates \[\xi = \begin{pmatrix} a & b \\ c & -a \end{pmatrix} = \begin{pmatrix} x & y - z \\ y + z & -x \end{pmatrix},\] we have \[\det(\xi) = -a^2 - b c = z^2 - y^2 - x^2.\] In particular, \(\mathcal{O}_\pi\) contains the circle \(\{(x,y,0) : x^2 + y^2 = T^2\}\). The circular strip \(\{(x,y,z) \in \mathcal{O}_\pi : |z| \leq 1/2\}\) has symplectic volume one; in the orbit method heuristic described in (Nelson and Venkatesh 2021, sec. 1.7), it corresponds to the weight zero vector in \(\pi\).
The integrals \((3.1)\) may be understood as describing how \(\pi\) oscillates against the trivial character of \(H\). The orbit method suggests (Nelson and Venkatesh 2021, sec. 1.9) that such oscillation may be understood in terms of the intersection \[\mathcal{O}_\pi \cap \mathfrak{h}^\perp,\] i.e., the preimage in \(\mathcal{O}_\pi\) of the trivial element \(0\) of \(\mathfrak{h}^*\). That intersection is given in \((x,y,z)\) coordinates by \[\{(0,y,z) : y^2 - z^2 = T^2\}\] and in \((a,b,c)\) coordinates by \[\{(0,b,c) : b c = T^2\}.\] It is a closed \(H\)-orbit, with trivial stabilizer.
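The last claim can be checked by hand. A sketch, assuming (as is standard for an invariant pairing such as the trace pairing) that the coadjoint action of \(H\) on \(\mathfrak{g}^* \cong \mathfrak{g}\) is given by conjugation: for \(h = \begin{pmatrix} y & 0 \\ 0 & 1 \end{pmatrix}\) with \(y \in \mathbb{R}^\times\), \[h \begin{pmatrix} a & b \\ c & -a \end{pmatrix} h^{-1} = \begin{pmatrix} a & y b \\ y^{-1} c & -a \end{pmatrix}.\] Thus \(h\) sends \((0,b,c)\) to \((0, y b, y^{-1} c)\), preserving the product \(b c = T^2\). Given two points \((0,b,c)\) and \((0,b',c')\) of the intersection, the unique \(y\) carrying the first to the second is \(y = b'/b\), which may be negative; this is why both branches of the hyperbola lie in a single orbit, and why the stabilizer is trivial.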
We pick a point \(\tau \in \mathcal{O}_\pi \cap \mathfrak{h}^\perp\) of size comparable to \(T\). For concreteness, let us take \[\tau = \begin{pmatrix} 0 & T \\ T & 0 \end{pmatrix}.\]
Remark 3. For the variant problem concerning \(\pi\) attached to a holomorphic form of weight \(2 k\), we would take \(T := k - 1/2\), with \(\mathcal{O}_\pi\) cut out by \(\det(\xi) = T^2\), and \[\tau = \begin{pmatrix} 0 & -T \\ T & 0 \\ \end{pmatrix}.\] See for instance §8.4 of Quantum variance III.
We seek a unit vector \(\varphi \in \pi\) that is “localized at \(\tau\).” Informally, this means that for each fixed Lie algebra element \(X \in \mathfrak{g}\), we have \[\label{eq:x-varphi-=}\tag{5.1} X \varphi = {i \langle X, \tau \rangle} \varphi + \operatorname{O}(T^{1/2}).\] For further informal discussion of this concept, see (Nelson and Venkatesh 2021, sec. 1.7) and (Nelson 2023, sec. 2.5). For a precise definition, see §3 of this note; for exercises, see this note; for further discussion, see §7.1 of Quantum variance III, or (Nelson 2021, sec. 14).
Such a vector \(\varphi\) may be described readily in the Kirillov model for \(\pi\). Recall (from §2.3) that this model consists of the restrictions to \(H\) of elements \(W\) of the Whittaker model \(\mathcal{W}(\pi)\). It will be convenient to think of such restricted elements \(W\) as functions on \(\mathbb{R}^\times\) via the abbreviation \[W(y) := W\left( \begin{pmatrix} y & 0 \\ 0 & 1 \\ \end{pmatrix} \right).\] We consider the following basis elements for \(\mathfrak{g}\): \[\partial_a = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \partial_b = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \partial_c = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} .\] The action of the first two of these elements on the Kirillov model is given very simply: \[\partial_a W(y) = 2 y W'(y), \quad \partial_b W(y) = 2 \pi i y W(y).\]
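Both formulas follow from a one-line computation. A sketch, assuming that \(X \in \mathfrak{g}\) acts by differentiating right translation, \((X W)(g) = \frac{d}{dt} W(g \exp(t X)) |_{t=0}\), and writing \(a(y) := \begin{pmatrix} y & 0 \\ 0 & 1 \end{pmatrix}\) and \(n(x) := \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}\) (notation introduced here for convenience). The definition of \(W_\varphi\) gives \(W(n(x) g) = e(x) W(g)\). Since \(\exp(t \partial_a) = \begin{pmatrix} e^{t} & 0 \\ 0 & e^{-t} \end{pmatrix}\) coincides with \(a(e^{2t})\) in \({\mathop{\mathrm{PGL}}}_2\), and since \(a(y) n(t) = n(y t) a(y)\), we get \[\partial_a W(y) = \frac{d}{dt} W(y e^{2t}) \Big|_{t=0} = 2 y W'(y), \qquad \partial_b W(y) = \frac{d}{dt} \, e(y t) W(y) \Big|_{t=0} = 2 \pi i y W(y).\]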
From now on, we will be very vague and informal with asymptotic notation. For a precise discussion, we refer to Exercise 8 of this note. Since \(\langle \partial_a, \tau \rangle = 0\) and \(\langle \partial_b, \tau \rangle = T\), the condition \((5.1)\) says in particular that \[\partial_a W_\varphi = \operatorname{O}(T^{1/2}), \quad \partial_b W_\varphi = i T W_\varphi + \operatorname{O}(T^{1/2}).\] These formulas suggest taking the following smoothened \(L^2\)-normalized characteristic function: \[\label{eqn:35ac3e31cb}\tag{5.2} W_\varphi(y) = T^{1/4} 1_{T / 2 \pi + \operatorname{O}(T^{1/2})}^{\text{smooth}}(y).\] Using that \(W_\varphi\) is an eigenfunction of the Casimir operator, one can check also that \[\partial_c W_\varphi = i T W_\varphi + \operatorname{O}(T^{1/2}).\] This is the content of Exercise 8 of this note. For \({\mathop{\mathrm{GL}}}_n\), precise forms of all these estimates are established in (Nelson 2021, pt. 3).
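For reference, the pairings used above can be computed directly. A sketch, assuming that the trace pairing is normalized as \(\langle X, \xi \rangle := \operatorname{trace}(X \xi)\) (the normalization consistent with the values quoted above): \[\partial_a \tau = \begin{pmatrix} 0 & T \\ -T & 0 \end{pmatrix}, \quad \partial_b \tau = \begin{pmatrix} T & 0 \\ 0 & 0 \end{pmatrix}, \quad \partial_c \tau = \begin{pmatrix} 0 & 0 \\ 0 & T \end{pmatrix},\] so that \(\langle \partial_a, \tau \rangle = 0\) and \(\langle \partial_b, \tau \rangle = \langle \partial_c, \tau \rangle = T\). On the support of \((5.2)\) we have \(2 \pi y = T + \operatorname{O}(T^{1/2})\), whence \(\partial_b W_\varphi = 2 \pi i y W_\varphi = i T W_\varphi + \operatorname{O}(T^{1/2}) W_\varphi\), while each derivative of a bump of width \(T^{1/2}\) costs a factor \(T^{-1/2}\), so that \(\partial_a W_\varphi = 2 y W_\varphi'\) has size \(T \cdot T^{-1/2} = T^{1/2}\) relative to \(W_\varphi\). The normalization is likewise consistent: with respect to \(d^\times y = dy/|y|\) (again an assumed convention), \[\int_H |W_\varphi|^2 \, d^\times y \approx (T^{1/4})^2 \cdot \frac{T^{1/2}}{T} = 1.\]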
Remark 4. This choice of \(\varphi\) is closely related to the “microlocal lift” of Zelditch (Zelditch 1987) et al., see (Lindenstrauss 2001; Silberman and Venkatesh 2007; Anantharaman and Zelditch 2007). More precisely, it is (asymptotically) a \(G\)-translate of the usual definition; the limit invariance for \(L^2\)-masses will be with respect to the stabilizer \(G_\tau\) of \(\tau\), namely \[\label{eqn:G-tau}\tag{5.3} G_\tau = \left\{ \begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix} \right\},\] rather than with respect to the diagonal subgroup.
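That the matrices in \((5.3)\) fix \(\tau\) can be verified directly (a quick check, writing \(g_t\) for the matrix with parameter \(t\), a notation introduced here): \[g_t \tau = T \begin{pmatrix} \sinh t & \cosh t \\ \cosh t & \sinh t \end{pmatrix} = \tau g_t,\] so \(g_t \tau g_t^{-1} = \tau\). Equivalently, \(g_t = \exp(t \tau / T)\), since \((\tau/T)^2\) is the identity matrix.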
By choosing the smooth bump function implicit in \((5.2)\) to be nonnegative, we may arrange that \[\int_{H} W_\varphi \asymp T^{-1/4},\] so that \((3.1)\) reads \[\label{eqn:35ac3e32a4}\tag{5.4} T^{-1/4} L(\pi,\tfrac{1}{2}) \asymp \int_{\Gamma_{H} \backslash H} \varphi.\] Our task is to show that the right hand side is \(\ll T^{1/4-\delta}\).
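The bookkeeping behind these exponents, in the same informal spirit (a sketch, assuming the Haar measure \(d^\times y = dy/|y|\) on \(H\)): the bump in \((5.2)\) has height \(T^{1/4}\) and is supported on an interval of length \(\approx T^{1/2}\) around \(y \approx T\), so \[\int_H W_\varphi \, d^\times y \approx T^{1/4} \cdot \frac{T^{1/2}}{T} = T^{-1/4}.\] Granting a bound \(\int_{\Gamma_H \backslash H} \varphi \ll T^{1/4 - \delta}\), the relation \((5.4)\) then gives \[L(\pi,\tfrac{1}{2}) \ll T^{1/4} \cdot T^{1/4 - \delta} = T^{1/2 - \delta},\] which is \((1.1)\).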
We now indicate why the integral on the right hand side of \((5.4)\) may be effectively truncated to a fixed compact set. For a detailed discussion of this point, see (Nelson 2021, sec. 5.3) or (Michel and Venkatesh 2010, sec. 5.1.4) or (Schumacher 2020, sec. 3) (or §3 of this informal note).
Consider, for a fixed even “truncation” function \(\mathcal{T} \in C_c^\infty(H)\), the map \[I : H \rightarrow \mathbb{C}\] \[I(Y) := \int _{y \in \Gamma_{H} \backslash H} \mathcal{T}(y/Y) \varphi(y) \, d^\times y\] assigning to a parameter \(Y\) the smoothened integral of \(\varphi|_H\) over the corresponding dyadic range. We have \[\int _{\Gamma_{H} \backslash H} \varphi |.|^{s-1/2} = L(\pi,s) Z(W_\varphi,s),\] where \(Z(W_\varphi,s)\) denotes the local zeta integral \[Z(W_\varphi,s) := \int _{H} W_\varphi |.|^{s-1/2}.\] For \(s = \operatorname{O}(1)\), we see by explicit calculation that \[Z(W_\varphi,s) \approx T^{s-3/4}.\] If moreover \(0 \leq \Re(s) \leq 1\), then the convexity bound reads \[L(\pi,s) \ll T^{1 - \Re(s)}.\] These bounds become only polynomially worse in \(|s|\) if we relax the condition \(s = \operatorname{O}(1)\) to \(\Re(s) = \operatorname{O}(1)\). Multiplying them together, shifting \(s\) by \(1/2\), and multiplying by the rapidly-decaying Mellin transform of the fixed test function \(\mathcal{T}\), we deduce the Mellin transform estimate \[|\Re(s)| \leq 1/2 \implies \tilde{I}(s) \ll T^{1/4} (1 + |s|)^{-\infty}.\] It follows readily, by Mellin inversion, that for fixed \(\kappa > 0\), we incur the acceptable error \(\operatorname{O}(T^{1/4 -\kappa/2})\) by smoothly truncating the integral \(\int_{\Gamma_{H} \backslash H} \varphi\) to the range \(\{T^{-\kappa} < |y| < T^{\kappa}\}\). If we seek only a qualitative subconvex bound, then we can take \(\kappa\) as small as we like, so there is no harm in truncating to \(\{ |y| = T^{o(1)} \}\). (Note that these arguments are specific to \(\varphi\): for general vectors, one cannot always truncate to the range \(|y| = T^{o(1)}\).)
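The deduction can be sketched as follows, under conventions that we fix here for illustration: identify \(\Gamma_H \backslash H\) with \(\{y > 0\}\), and define Mellin transforms by \(\tilde{f}(s) := \int_{0}^{\infty} f(Y) Y^{s} \, d^\times Y\). Substituting \(Y = y/u\) in the inner integral, \[\tilde{I}(s) = \int_{y > 0} \varphi(y) \int_{Y > 0} \mathcal{T}(y/Y) Y^{s} \, d^\times Y \, d^\times y = \tilde{\mathcal{T}}(-s) \int_{y > 0} \varphi(y) y^{s} \, d^\times y = \tilde{\mathcal{T}}(-s) \, L(\pi, \tfrac{1}{2} + s) \, Z(W_\varphi, \tfrac{1}{2} + s).\] For \(\sigma := \Re(s) \in [-\tfrac{1}{2}, \tfrac{1}{2}]\), the convexity bound in the form \(L(\pi, \tfrac{1}{2} + s) \ll T^{1/2 - \sigma}\) and the estimate \(Z(W_\varphi, \tfrac{1}{2} + s) \approx T^{\sigma - 1/4}\) multiply to \(T^{1/4}\), independently of \(\sigma\). Mellin inversion along the line \(\Re(s) = \sigma\) then gives \[I(Y) = \frac{1}{2 \pi i} \int_{\Re(s) = \sigma} \tilde{I}(s) Y^{-s} \, d s \ll T^{1/4} Y^{-\sigma},\] and choosing \(\sigma = \pm \tfrac{1}{2}\) according to whether \(Y \geq 1\) or \(Y \leq 1\) yields \(I(Y) \ll T^{1/4} \min(Y, Y^{-1})^{1/2}\). In particular, the ranges \(Y \geq T^{\kappa}\) and \(Y \leq T^{-\kappa}\) contribute \(\operatorname{O}(T^{1/4 - \kappa/2})\).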
The model problem is thus to bound \[\label{eq:int-_y-in}\tag{6.1} \int _{y \in H, y \asymp 1} \varphi(y) \, d^\times y.\]
We use a convolution kernel \(\omega \in C_c^\infty(G)\) to “remember” many of the symmetries satisfied by \(\varphi\). (When amplifying, we really take \(\omega \in C_c^\infty({\mathop{\mathrm{PGL}}}_2(\mathbb{A}))\).) Roughly speaking, we take \(\omega\) to be a character multiple of an approximate subgroup of \(G\): \[\omega := \mathop{\mathrm{vol}}(J)^{-1} 1_J^{\text{smooth}} \chi_\tau^{-1},\] where:
\(J\) is a subset of \(G\) roughly of the shape \[J = (1 + \operatorname{O}(T^{-\varepsilon})) \cap (G_\tau + \operatorname{O}(T^{-1/2-\varepsilon})),\] with \(G_\tau\) the stabilizer of \(\tau\), as described in \((5.3)\).
\(1_J^{\text{smooth}}\) is a smoothened characteristic function of \(J\).
\(\chi_\tau\) is the “approximate character” of \(J\) attached to \(\tau\), given near the identity in exponential coordinates by \[\chi_\tau(\exp(X)) = e^{i \langle X, \tau \rangle}.\]
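To make the shape of \(J\) and \(\chi_\tau\) concrete, here is a sketch in exponential coordinates. The basis below is a choice made here for illustration, and \(\langle X, \tau \rangle\) is taken to be \(\operatorname{trace}(X \tau)\) (an assumed normalization). The Lie algebra of \(G_\tau\) is spanned by \(\tau / T = \partial_b + \partial_c\), and a complement is spanned by \(\partial_a\) and \(\partial_b - \partial_c\). Writing \(X = t \, \tau/T + u \, \partial_a + v \, (\partial_b - \partial_c)\), the set \(J\) corresponds roughly to \[|t| \leq T^{-\varepsilon}, \qquad |u|, |v| \leq T^{-1/2 - \varepsilon},\] and since \(\operatorname{trace}(\tau^2) = 2 T^2\) while \(\langle \partial_a, \tau \rangle = \langle \partial_b - \partial_c, \tau \rangle = 0\), \[\chi_\tau(\exp(X)) = e^{i \langle X, \tau \rangle} = e^{2 i T t}.\] Thus \(\chi_\tau\) oscillates only along \(G_\tau\), at frequency \(\approx T\), and \(\mathop{\mathrm{vol}}(J) \approx T^{-\varepsilon} \cdot (T^{-1/2 - \varepsilon})^2 = T^{-1 - 3 \varepsilon}\).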
We may also describe \(\omega\) in terms of the function \(\mathfrak{g}^* \rightarrow \mathbb{C}\) obtained by taking the Fourier transform of the pullback \(\omega \circ \exp\). This function is roughly a smoothened characteristic function of a “coin-shaped” neighborhood of \(\tau\), of thickness \(T^\varepsilon\) (resp. \(T^{1/2+\varepsilon}\)) in directions transverse (resp. tangential) to the coadjoint orbit \(\mathcal{O}_\pi\) at \(\tau\). The intersection of this neighborhood with \(\mathcal{O}_\pi\) has symplectic volume \(\approx T^{2 \varepsilon} \approx 1\). The orbit method heuristic suggests that for an irreducible representation \(\sigma\) of \(G\), we have \(\sigma(\omega) \approx 0\) unless \(\sigma\) is a principal series representation of parameter \(T + \operatorname{O}(T^{\varepsilon})\), in which case \(\sigma(\omega)\) is approximately a projection onto a rank \(\approx T^{2 \varepsilon} \approx 1\) “subspace” of vectors microlocalized at \(\tau\). In particular, \[\label{eqn:35ac3e34eb}\tag{7.1} \pi(\omega) \varphi \approx \varphi.\] These heuristics and definitions can be made precise, and the above approximation holds in an extremely strong sense (i.e., up to \(\operatorname{O}(T^{-\infty})\) with respect to any fixed seminorm).
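The volume count can be sketched as follows, up to constants. By the normalization recalled earlier, a strip of unit height around the circle \(\{(x,y,0) : x^2 + y^2 = T^2\}\), which has Euclidean area \(\approx T\), has symplectic volume one; so near \(\tau\), which is the point \((x,y,z) = (0,T,0)\) of that circle, symplectic volume is roughly Euclidean area divided by \(T\). The coin has diameter \(T^{1/2 + \varepsilon}\) in the two directions tangent to \(\mathcal{O}_\pi\), so \[\operatorname{vol}_{\mathrm{symp}}(\text{coin} \cap \mathcal{O}_\pi) \approx \frac{(T^{1/2+\varepsilon})^2}{T} = T^{2 \varepsilon},\] where \(\operatorname{vol}_{\mathrm{symp}}\) is notation introduced for this illustration. Dually, the thickness \(T^{\varepsilon}\) in the transverse direction is what restricts the parameter of \(\sigma\) to \(T + \operatorname{O}(T^{\varepsilon})\).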
For further informal discussion concerning \(\omega\) in a general setting, see (Nelson 2023, sec. 2). For a precise discussion in the current rank one example, see §7.1 of Quantum variance III.
Let’s explain how to recover the convexity bound from here. Our task is to show that \[\int _{y \in H, y \asymp 1} \pi(\omega) \varphi(y) \, d^\times y \ll T^{1/4}.\] We view the square of the left hand side as one term arising from an integrated pretrace formula, as in the sup norm story (Iwaniec and Sarnak 1995). Alternatively, we write the left hand side as the inner product of \(\varphi\) against a Poincaré series and apply Cauchy–Schwarz; see (Nelson 2023, sec. 5.3). Either way, we reduce to checking that \[\label{eq:int-_-substack-1}\tag{8.1} \int _{ \substack{ y_1, y_2 \asymp 1 } } \sum _{\gamma \in \Gamma } \omega (y_1^{-1} \gamma y_2) \ll T^{1/2}.\] Since \(\omega\) is supported on \(1 + \operatorname{O}(T^{-\varepsilon})\), we see that the only \(\gamma\) that contribute are those in \(\Gamma_{H}\). Combining the \(y_1\) and \(y_2\) integrals, it remains to check that \[\label{eq:int-_-substack}\tag{8.2} \int _{ \substack{ y \in H : y\asymp 1 } } \omega (y) \ll T^{1/2}.\] To that end, we observe that \[H \cap G_\tau = \{1\},\] i.e., that no nontrivial matrices are simultaneously diagonal and of the form \((5.3)\). (This is a baby case of the “stability” feature explained in (Nelson and Venkatesh 2021, sec. 1.9, §14).) It follows that (up to some \(\varepsilon\)’s in the exponents) \[H \cap J \subseteq 1 + \operatorname{O}(T^{-1/2}),\] and so the volume of the support of the integrand in \((8.2)\) is \(\operatorname{O}(T^{-1/2})\).
On the other hand, the magnitude of the integrand is \[\mathop{\mathrm{vol}}(J)^{-1} \approx T.\] Indeed, \(J\) has dimensions roughly \(1\) along one direction and \(T^{-1/2}\) along the remaining two directions.
These observations combine to give the required estimate \((8.2)\).
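In exponents, ignoring \(\varepsilon\)’s, the estimate reads \[\int_{y \in H : y \asymp 1} \omega(y) \ll \mathop{\mathrm{vol}}(H \cap J) \cdot \mathop{\mathrm{vol}}(J)^{-1} \approx T^{-1/2} \cdot T = T^{1/2}.\] The transversality behind the first factor can be seen at the level of matrices: a matrix of the form \((5.3)\) is diagonal only if \(\sinh t = 0\), i.e., \(t = 0\). At the level of Lie algebras, \(H\) is generated by \(\partial_a\), which is not a multiple of the generator \(\partial_b + \partial_c\) of the Lie algebra of \(G_\tau\); so an element \(\exp(u \, \partial_a)\) of \(H\) near the identity can lie within \(\operatorname{O}(T^{-1/2})\) of \(G_\tau\) only if \(u = \operatorname{O}(T^{-1/2})\).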
Remark 5. It’s clear in retrospect that we should have obtained such an estimate: the orbit method heuristic applied to \(\omega\) suggests that the left hand side of \((8.1)\) is a proxy for the sum of \(|L(\pi,\tfrac{1}{2})|^2\) over \(T\) in an interval of width roughly \(\operatorname{O}(T^\varepsilon)\) (see (Nelson 2023, sec. 2.3)), which is of the appropriate size for an averaged Lindelöf estimate to recover convexity.
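A sketch of the count behind this remark, ignoring constants and using Weyl’s law for \(\Gamma \backslash G\) (a standard fact not stated above): the number of \(\pi'\) with spectral parameter in \([T, T + T^{\varepsilon}]\) is \(\approx T^{1 + \varepsilon}\), so Lindelöf on average over this family would read \(\sum_{\pi'} |L(\pi', \tfrac{1}{2})|^2 \ll T^{1 + \operatorname{O}(\varepsilon)}\). Writing \(\varphi_{\pi'} \in \pi'\) for the analogous localized vector (notation introduced here), \((5.4)\) gives \(|\int_{\Gamma_H \backslash H} \varphi_{\pi'}|^2 \asymp T^{-1/2} |L(\pi', \tfrac{1}{2})|^2\), so \[\sum_{\pi'} \Big| \int_{\Gamma_H \backslash H} \varphi_{\pi'} \Big|^2 \ll T^{-1/2} \cdot T^{1 + \operatorname{O}(\varepsilon)} = T^{1/2 + \operatorname{O}(\varepsilon)},\] matching the size in \((8.1)\). Dropping all but one term gives \(L(\pi, \tfrac{1}{2}) \ll T^{1/2 + \operatorname{O}(\varepsilon)}\), which is the convexity bound.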
This section is just a stub; see (Nelson 2023, sec. 1.5, §2.7-2.10) for details and pictures relevant for this example. To carry out the amplification step, we basically need to know that the vector \(\varphi '\) obtained by averaging the translates of \(\varphi\) under elements of \(H\) of size \(\asymp 1\) satisfies a matrix coefficient estimate \(\langle g \varphi ', \varphi ' \rangle \ll T^{-\delta}\) except when \(g\) is very close to \(H\). This is word-for-word what happens in the sup norm problem (Iwaniec and Sarnak 1995), where, if we replace \(H\) with \(\mathop{\mathrm{SO}}(2)\), then \(\varphi '\) becomes something like the weight zero vector.
The “two arguments” are “equivalent.” Recall that Iwaniec (Iwaniec 1992) starts with the approximate functional equation, applies Kuznetsov, and then applies Poisson summation to both variables. The reduction to \((6.1)\) is essentially the approximate functional equation (TODO: explain this in some detail?). In the passage to \((8.1)\), rather than applying the pretrace formula, we could have instead applied the Fourier expansion of \(\varphi\) and averaged each term in the resulting double sum using Kuznetsov. A couple of applications of Poisson summation to the geometric side of Kuznetsov would then bring us right back to \((8.1)\).
It’s more interesting to compare the generalization of this argument to \({\mathop{\mathrm{GL}}}_3\) with Blomer–Buttcane (Blomer and Buttcane 2020a). The arguments are again ultimately “equivalent,” but the present approach seems less miraculous.