Andrew Sutherland, Arithmetic L-functions and their Sato-Tate distributions

May 9, 2020 14:04 · 11188 words · 53 minute read

R. Pries: Welcome to the VaNTAGe seminar. This is the third talk in the series of talks on the Sato-Tate conjecture, and today we're very happy to have Drew Sutherland, who is speaking on arithmetic L-functions and their Sato-Tate distributions. Drew, is it okay for us to record this talk and post it to YouTube? A. Sutherland: Yes, it is okay to post it to YouTube. Thank you, Rachel, and thank you for the invitation to speak in this seminar. When I first heard about the VaNTAGe seminar last fall, I thought it was a wonderful idea, something I wish I had thought of. I was very excited when Rachel asked me to speak in this seminar, and even more excited when she asked me to join her as a co-organizer. So this has been a lot of fun, and I'm looking forward to a lot of exciting seminars coming in the future. But let's start with this one, and let me get started. I thought I would kick off with a question of Nick Katz, from 2012, when he noted a simple thing that we don't know. Let X be a nice curve, say defined over Q: smooth, projective, geometrically integral. For each prime p we can compute the trace of Frobenius, which is p + 1 minus the number of F_p-rational points on the reduction of our curve modulo p. We know from the Weil bounds that we're going to get an integer whose absolute value is bounded by 2g\sqrt{p}, and if we normalize the trace by dividing by \sqrt{p}, we get a real number between -2g and 2g.
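A minimal sketch (not from the talk) of this computation for a hyperelliptic curve y^2 = f(x) with a single point at infinity, using a naive point count; the curve below is the genus 2 example that appears later in the talk, and p is assumed to be an odd prime of good reduction.

```python
from math import sqrt

def trace_of_frobenius(f_coeffs, p):
    """a_p = p + 1 - #X(F_p) for y^2 = f(x) with deg f odd (one point at
    infinity); f_coeffs lists the coefficients of f from degree 0 upward."""
    count = 1  # the point at infinity
    for x in range(p):
        fx = sum(c * pow(x, i, p) for i, c in enumerate(f_coeffs)) % p
        if fx == 0:
            count += 1  # one point (x, 0)
        elif pow(fx, (p - 1) // 2, p) == 1:
            count += 2  # f(x) is a nonzero square mod p: two points (x, +-y)
    return p + 1 - count

f = [1, -1, 0, 0, 0, 1]  # y^2 = x^5 - x + 1, a genus 2 curve (g = 2)
for p in [10007, 10009]:
    a_p = trace_of_frobenius(f, p)
    print(p, a_p, a_p / sqrt(p))  # normalized trace lies in [-2g, 2g] = [-4, 4]
```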

01:39 - So in particular the genus is bounded below by the absolute value of the normalized trace divided by 2, for every prime. So you might imagine that if we look at Frobenius traces for enough primes p, eventually we could push the genus up to its actual value - this lower bound would become sharp - and Katz asked whether that is always possible. Now in the case of genus 1 we know that it is: the Sato-Tate conjecture says that we're going to have a semicircular distribution that extends all the way to the edges of the interval, so if we sample enough normalized Frobenius traces we'll eventually see one that's large enough to prove that we have a genus 1 curve. In fact, as soon as we see one that's nonzero, we know that we have a curve of genus at least 1. But then you might ask: what about g > 1? Astonishingly, we don't know.

02:35 - For g=2, the best we can say is that the normalized Frobenius trace will be bigger than 2/3 infinitely often, and this is a highly non-trivial, very recent result of Noah Taylor from 2018. Once we go past genus 2, we know essentially nothing; even proving that the Frobenius traces are nonzero infinitely often is a hard problem. So this should give us pause. I give this example just to illustrate how challenging the question we're considering is: we're analyzing the shapes of these distributions, but even showing that they're non-trivial is hard to prove. Okay. So with that dose of humility, let us press on.

03:23 - Let me just recall the definition of the L-function of a curve, or equivalently of its Jacobian. It's defined as an Euler product taken over primes of the field of definition of the curve, in our case Q, and for each prime the polynomial L_p that appears in the Euler product comes from the numerator of the zeta function of the reduction of the curve modulo p. We know this L-polynomial has integer coefficients, it has degree 2g, and it is reciprocal to the characteristic polynomial of the Frobenius endomorphism of the Jacobian. Now those who study L-functions sort of "in the large", in a context that's much bigger than the one we're considering, often find it interesting to focus on a subset of L-functions that satisfy a set of axioms that we expect every reasonable L-function to satisfy. These were laid out by Selberg, so this class of L-functions is known as the Selberg class.
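In symbols, for a nice curve X of genus g over Q, the definition just described reads

$$ L(X,s) = \prod_{p} L_p(p^{-s})^{-1}, \qquad Z(X_p;T) = \frac{L_p(T)}{(1-T)(1-pT)} \quad\text{at good primes } p, $$

with L_p \in \mathbb{Z}[T] of degree 2g at good primes, the reciprocal of the characteristic polynomial of Frobenius acting on the Jacobian.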

04:26 - I'm actually going to restrict to a subset of the Selberg class: those whose Euler factors are defined by polynomials. Call this the polynomial Selberg class. The four properties that L-functions in the polynomial Selberg class have are: 1) they should have an analytic continuation that's holomorphic away from s=1; 2) there should be some choice of gamma factors that makes the completed L-function satisfy a functional equation, for some choice of a complex number epsilon of absolute value one - and we can define the degree of the L-function in terms of the shape of these gamma factors; 3) an L-function in the Selberg class should have coefficients that don't grow too quickly - they should satisfy what's known as the Ramanujan bound; 4) our L-function should have an Euler product: we should be able to express it as a product over primes, where we take some integer polynomial, plug in p^{-s} with s a complex variable, and then invert. For all but finitely many primes these L-polynomials should have degree equal to the degree of the L-function.

05:48 - If we take a nice curve over Q and normalise it appropriately so that it fits within this framework of analytic L-functions - meaning we want the axis of symmetry to be at 1/2 rather than at 1, as it is in the arithmetic normalization for a curve - we obtain an L-function that automatically satisfies properties 3 and 4. Property 3 comes from the Weil bounds, and property 4 holds just by definition: we defined the L-function of a curve in terms of an Euler product. Conjecturally, any L-function that comes from a curve or abelian variety defined over a number field satisfies all four properties; satisfying the first two is equivalent to satisfying the Hasse-Weil conjecture. In the case of genus 1 curves, we know this is true because of the modularity theorem. One remarkable fact about the Selberg class of L-functions is that it imposes some real rigidity on the set of L-functions we're considering.

06:54 - One striking example of this is known as "strong multiplicity one". This is a theorem from 2001 which says that if I have two Dirichlet series that both lie in the Selberg class with polynomial Euler factors, and their Dirichlet coefficients at prime indices agree for all but finitely many primes, then they must actually be the same L-function. In other words, an L-function in the Selberg class is automatically determined by all but finitely many of its coefficients a_p, and we are even free to choose the primes whose a_p we want to ignore. This will be very convenient, in particular in situations where it may be hard to compute certain a_p but you have a good method for computing lots and lots of the others. So henceforth we're going to assume that we're always working with L-functions that lie in this Selberg class. For curves over number fields, there is a natural choice of gamma factors, which I've denoted here Gamma_C: we define our completed L-function by taking the g-th power of Gamma_C, and we then expect our completed L-function to satisfy a functional equation in which two parameters appear.
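With one standard convention (an assumption on my part, since conventions vary) $\Gamma_{\mathbb{C}}(s) := 2(2\pi)^{-s}\Gamma(s)$, and in the analytic normalization, this completed L-function and its functional equation take the shape

$$ \Lambda(s) := N^{s/2}\,\Gamma_{\mathbb{C}}\!\left(s+\tfrac{1}{2}\right)^{g} L(s), \qquad \Lambda(s) = \varepsilon\,\Lambda(1-s). $$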

08:15 - There's the root number epsilon, which for curves over number fields is always going to be plus or minus 1, and an analytic conductor, a positive integer; you can take as its definition the unique positive integer for which this functional equation actually holds. Now, one of the virtues of having a functional equation is that it's actually really hard for L-functions to satisfy a functional equation - this is what's really behind the multiplicity one phenomenon. And we can test the validity of the coefficients of the Dirichlet series we're considering explicitly, by defining a function S(x) in terms of the inverse Mellin transform of the gamma factors at infinity. We define this as a sum over n going from 1 to infinity, but we're always free to truncate this sum if we wish. The way I've defined S(x), our completed L-function is going to look like an integral of S(x) over x going from 0 to infinity.

09:27 - We then have an analogue of the functional equation that's satisfied by S, and because this inverse Mellin transform decays quite rapidly, for any sufficiently large choice of a positive constant c_0, if we truncate this series by only considering the Dirichlet coefficients a_n for n <= c_0 x, we get a very good approximation to S(x), and we can compute an explicitly bounded error term that tells us how far our approximation differs from the true value. So our strategy is this: we fix some small set S of primes - primes for which maybe we don't know exactly what the Euler factor is, we don't know what a_p is. In the case of hyperelliptic curves a common choice is to take S to be the set containing just the prime 2; that's the one prime that can be really tricky to deal with. We also assume we have an upper bound on the conductor of our L-function - in the case of a hyperelliptic curve, we could just take the discriminant of our favorite model of the curve. There are then only finitely many possibilities for the root number, which has to be plus or minus 1, for the conductor, which has to divide our chosen upper bound, and for the Euler factors at the primes in our excluded set S.

10:53 - These L-polynomials have integer coefficients, and the coefficients are bounded by the Weil bounds, so there are only finitely many possibilities. Now suppose we can efficiently compute a_n for all n up to some bound proportional to the square root of our bound on the conductor, provided we're only required to do this for the n not divisible by any prime in our excluded set. We define a quantity that measures how far our truncated function S_0(x) is from satisfying the functional equation that the true S must satisfy, when we plug in x equal to some multiple of the square root of our conductor. We can test this for every candidate root number, candidate conductor, and candidate L-polynomials at the primes in our excluded set S. Whenever a choice makes the error larger than our explicit error bound, we know we must have the wrong choice - that S_0 cannot be the truncated version of the true S we're looking for - and we can discard it. The strong multiplicity one theorem tells us that, as long as we make this constant c_1 big enough, all but one choice is guaranteed to be eliminated eventually.
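A schematic sketch of this elimination loop. Everything named here is an assumption for the sake of illustration: `fe_error` stands for the functional-equation defect of the truncated S_0 at the test point x, and `error_bound` for the explicit truncation error bound; neither is implemented here.

```python
from itertools import product

def eliminate_candidates(root_numbers, conductors, bad_euler_factors,
                         fe_error, error_bound, test_points):
    """Keep only the candidate triples (epsilon, N, bad factors) whose
    truncated S_0 stays within the proven error bound at every test point;
    by strong multiplicity one, exactly one should survive eventually."""
    survivors = []
    for eps, N, bad in product(root_numbers, conductors, bad_euler_factors):
        if all(fe_error(eps, N, bad, x) <= error_bound(x) for x in test_points):
            survivors.append((eps, N, bad))
    return survivors
```

If more than one candidate survives, one increases the constant c_1 - that is, tests at larger x, which requires more coefficients a_n - and tries again.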

12:18 - One can explicitly determine a set of candidate values for how big this constant needs to be, and one of them is guaranteed to work. In practice, in the case of hyperelliptic curves - we did this in a project for genus 2 curves - we were always able to take c_1=16, and in some recent work on genus 3 curves we typically take c_1=30. But if you have a choice of c_1 that doesn't seem to be working, if you're left with too many possibilities, you just increase it until it does work. And of course I should note, as a footnote, that this is all subject to our assumption that the L-function we're considering actually lies in the Selberg class. If it doesn't - if it were ever to turn out that we couldn't rule out all but one possibility - we would have on our hands an explicit counterexample to the Hasse-Weil conjecture.

13:07 - Okay, just a side comment on conductor bounds. You might wonder how we come up with this integer M that bounds the conductor. Well, we know that only finitely many primes can divide the conductor: they have to lie among the primes of bad reduction of the Jacobian of our curve, which are certainly contained in the primes of bad reduction of the curve, and there are explicit bounds, due to Brumer and Kramer, on the power of each prime that can appear in the conductor; I've shown them in the table here. Now these don't give bounds that are optimal in most cases, so a better strategy, whenever you can, is to take an integral model for your curve that's as minimal as possible and then use its discriminant as an upper bound on your conductor.

14:00 - We know that always works for hyperelliptic curves, and if you define the discriminant appropriately, one expects this to hold in general. Okay, so we've talked about the Selberg class of L-functions; I now want to consider a slightly more precise notion of a class of L-functions of interest. These are the analytic L-functions described in a wonderful 2019 paper by Farmer, Pitale, Ryan and Schmidt. They define a class of analytic L-functions which satisfy the four properties we laid out for L-functions in the Selberg class, as well as some more refined conditions that an L-function is required to satisfy to be considered an analytic L-function; conjecturally, all the L-functions in the Selberg class with polynomial Euler factors do satisfy those additional properties. And among the universe of all analytic L-functions one can distinguish those of "arithmetic type". These are analytic L-functions for which there is some integer w_ar, an arithmetic weight, and a number field F, such that when we multiply each Dirichlet coefficient a_n by an appropriate power of n - namely n raised to the arithmetic weight divided by 2 - we wind up with an algebraic integer in the ring of integers of the number field F.

15:38 - So in order to be an arithmetic L-function there must exist some w and some F for which this is true for every n; the smallest F and w that work are known as the field of coefficients and the arithmetic weight of the L-function L(s). For curves over number fields, the field of coefficients is always going to be Q - it doesn't actually depend on the field of definition of the curve or abelian variety - so we always get what's known as a "rational L-function", an arithmetic L-function whose field of coefficients is Q. And for curves and abelian varieties, the arithmetic weight is always going to be 1, which is the same as the motivic weight.

16:23 - More generally, one expects that if you were given an L-function that comes from a motive, say a pure motive of weight w, the motivic weight w should be the same as the arithmetic weight; and in fact we expect that every arithmetic L-function should arise in this way. So let's take a look at an example. I've written down the first few Dirichlet coefficients of an arithmetic L-function here; if I click on this it should bring us to the homepage for this L-function in the LMFDB. Here we're looking at the L-function in its arithmetic normalization - the way you would normally work with it if you're coming from an arithmetic geometry perspective. But if we want to fit it into the class of analytic L-functions, we need to normalize it so that the axis of symmetry is at Re(s)=1/2, so let me do that by switching over to the analytic normalization, and wow, the coefficients don't look like integers anymore. But if you stare at them for a while, you realize that for each coefficient a_n, if we multiply the coefficient by the square root of n, we get an integer, or something that looks very close to an integer.
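A tiny illustration of that renormalization; the coefficient values below are made up for the sake of the example, and the check simply multiplies each analytic a_n by n^(w_ar/2) with w_ar = 1.

```python
from math import sqrt, isclose

# Hypothetical analytic-normalization coefficients a_n (not the LMFDB values)
analytic = {2: -0.7071067811865475, 4: -0.5, 5: 0.8944271909999159}

for n, a_n in analytic.items():
    arith = a_n * sqrt(n)  # multiply by n^(w_ar/2), here sqrt(n) since w_ar = 1
    print(n, arith, isclose(arith, round(arith), abs_tol=1e-9))  # integral?
```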

17:36 - Obviously we don't have enough precision here to tell for sure, and we're taking the square root because the exponent is 1/2, the arithmetic weight divided by 2. And here you can see on the L-function's homepage we actually show the weight of the L-function; in this case it's actually denoted the motivic weight, which is also considered a synonym for the arithmetic weight, because we expect they're always the same. The other invariant of the L-function listed here that I want to draw your attention to is the Sato-Tate group. In this case, the Sato-Tate group is D_2,1. This is one of the labels of the 52 Sato-Tate groups that Francesc talked about in his talk, and on the homepage of this Sato-Tate group in the LMFDB you can see moment statistics and also information about other Sato-Tate groups that are contained in, or contain, this particular Sato-Tate group.

18:27 - It also tells us, for example, that for this L-function we should expect the trace of Frobenius to satisfy a_p=0 exactly 3/4 of the time. Actually, let me switch back one more time to the homepage; there's one other thing I wanted to note about this arithmetic L-function. On the right side of the screen, we can see that this L-function actually has several origins. One of those origins - the way I first discovered this L-function - is as the L-function of a genus 2 curve. So here's the equation of that curve, and on the genus 2 curve's homepage we can see many invariants of the curve as well as invariants of its Jacobian, including quantities that appear in the BSD formula, as well as its Sato-Tate group, the same D_2,1 we noted above. But the thing I want to draw your attention to now is that the Jacobian of this genus 2 curve actually decomposes as the product of two elliptic curves, and these elliptic curves are actually twists of each other; this is what leads to the large number of possible origins of the L-function.

19:46 - Because instead of taking the genus 2 curve, we could take one of the elliptic curve factors and, for example, base change to a quadratic field where it becomes isogenous to the other elliptic curve factor. So for example we could go to this isogeny class of elliptic curves over Q(\sqrt{-2}), or we could go to this isogeny class of elliptic curves over Q(\sqrt{6}), and both of these elliptic curves over quadratic fields have the same L-function as the genus 2 curve we were looking at earlier. They both have associated modular forms that you can also find in the LMFDB. Okay, so now let me switch back to my slides. Coming back again to the paper of Farmer-Pitale-Ryan-Schmidt that I mentioned: in that paper there's a wonderful diagram, which you can find on page 21, that illustrates the relationships between various sets of L-functions, and the thing I want to draw your attention to on this slide is these four boxes in the upper left. We have this box of L-functions arising from Galois representations.

21:02 - We have L-functions arising from automorphic forms, the Q-automorphic L-functions; we have L-functions arising from motives; and then we have L-functions of arithmetic type. Each of the arrows in this diagram represents a conjecture that tells us that every L-function in one box is equal to an L-function in another box, and if you follow enough arrows and assume enough conjectures, you can get from any of these four boxes to any of the others. So all of these L-functions have what on the surface appear to be different definitions, and potentially define different sets of L-functions, but conjecturally they are all the same. For the rest of this talk we're going to work under the assumption that our arithmetic L-functions all have an associated motive. The advantage of doing that is that it gives us a well-defined way to attach a Sato-Tate group to an L-function.

22:09 - If we have a motivic L-function, we can always say that the Sato-Tate group of the L-function is just the Sato-Tate group of the corresponding motive. Now, as we saw in the example we were just looking at, there may be many motives that lead to the same L-function, but if they give the same L-function they necessarily all have the same Sato-Tate group. So let's start with the simplest case: for rational L-functions of degree 2 and weight 1 there are exactly three possible Sato-Tate distributions, and these should look familiar, because we saw all three in the first lecture in this series by Kiran Kedlaya. We have SU(2), which is the Sato-Tate group of an elliptic curve over Q that does not have complex multiplication. On the right-hand side we have the normalizer of U(1); this is the Sato-Tate group of an elliptic curve over Q that has CM, complex multiplication.

23:10 - The spike in the middle has density 1/2, corresponding to the fact that half of the Frobenius traces are going to be 0. And the third possibility is U(1) embedded inside SU(2); this is the Sato-Tate distribution of a CM elliptic curve defined over a field where the CM is defined. The difference between what we're looking at on this slide and what we saw in the first talk is that I'm now thinking of this purely from an L-function perspective - I've forgotten the elliptic curves. I'm just saying: if you give me a rational L-function of degree 2 and weight 1, its Sato-Tate group must be one of these three possibilities. If we now consider other possibilities for the weight w and degree d of the L-functions we're interested in, there's a whole zoo of different sources and different Sato-Tate groups that might arise. I've listed just some of them here; each of the blue highlighted L-functions or Dedekind zeta functions listed here is a link to the L-functions and Modular Forms Database, where you can go and see examples.

24:20 - So in weight 0, if we wanted a weight 0 degree 1 rational L-function, the thing to do - and this is really the only thing we can do - is to take a Dirichlet character which is either trivial or a quadratic character. These are precisely the Dirichlet characters that take integer values. This includes the Riemann zeta function, but also quadratic Dirichlet L-functions. If we wanted an L-function of weight 0 and degree 2, we could instead take the L-function of a weight 1 modular form. So let's do that: jump in here and take a look at this L-function. This is an L-function of degree 2 and motivic weight 0. If I go over here to the list of sources, I can jump to the modular form of weight 1; notice that the weight of a modular form is always 1 greater than its weight as a motive - its motivic weight is 1 less.

25:17 - So a weight 1 modular form has motivic weight 0. As I'm sure many of you know, every weight 1 modular form also has an associated Artin representation. The L-function of this Artin representation is necessarily the same as the L-function of the modular form, and this is another way to get an L-function of degree 2 and weight 0. More generally, we could take Artin representations of higher dimension, not just 2; as long as we restrict our attention to Artin representations that have rational traces, we get a rational L-function of weight 0 and degree n, where n is the dimension of the Artin representation. This also includes Dedekind zeta functions: even though the Artin L-functions that appear as factors of a Dedekind zeta function might not have rational coefficients, if we multiply them all together we know the Dedekind zeta function has rational coefficients, because its a_n simply count ideals of norm n. Moving on to weight 1, we've already seen lots of examples of L-functions of weight 1: this is what you get if you take an abelian variety or a curve over a number field.

26:29 - These also come from classical modular forms of weight 2; or, in the degree 4 case, we could get a weight 1 L-function by taking, say, a parallel weight 2 Hilbert modular form with rational coefficients. In weight 2, there are a few different ways to get a weight 2 L-function: we could take a weight 3 classical modular form, or we could take the symmetric square of an elliptic curve - the elliptic curve itself has weight 1 and degree 2, but when we take the symmetric square we get weight 2 and degree 3 - and so on. We could even take the symmetric cube, and this would be a way of getting a weight 3 degree 4 L-function. Another option is to consider hypergeometric motives, many of which you can find in the beta version of the L-functions and Modular Forms Database. If we take a hypergeometric motive, say with Hodge vector [1,1,1,1], we will get an L-function of weight 3 and degree 4.
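As a small sanity check on the symmetric square (a sketch of mine, not from the talk): if the elliptic curve's normalized trace is t = 2cos(theta_p), the three normalized Frobenius eigenvalues of the symmetric square are e^{2i theta}, 1, e^{-2i theta}, so its normalized degree 3 trace is 1 + 2cos(2 theta_p) = t^2 - 1.

```python
from math import sqrt

def sym2_normalized_trace(a_p, p):
    """Normalized trace of Sym^2 Frobenius (weight 2, degree 3) computed from
    elliptic curve data: t = a_p/sqrt(p) = 2cos(theta_p), trace = t^2 - 1."""
    t = a_p / sqrt(p)
    return t * t - 1.0

print(sym2_normalized_trace(-2, 5))  # illustrative values of (a_p, p)
```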

27:28 - And I'm highlighting this example because the degree and the weight of the L-function together determine where the Sato-Tate group lives: the Sato-Tate group is going to be a subgroup of an orthogonal group when the weight is even, and a subgroup of a unitary symplectic group USp(d) when the weight is odd. Notice that for unitary symplectic groups the degree d must be an even number, and so when the weight is odd, only even degrees can show up. That's why I've only listed even degrees next to weights 1 and 3; for even weights you can get many different degrees.

28:02 - The overriding rule is that the product of the weight and the degree is always going to be an even number. What's interesting about weight 3 and degree 4 is that we get Sato-Tate groups that are subgroups of USp(4), which is where the Sato-Tate groups of abelian surfaces live. But as Francesc noted when he was describing the classification of Sato-Tate groups of abelian surfaces, there are 55 Sato-Tate groups that satisfy the first two axioms he laid out, three of which are ruled out by the fourth axiom, which he referred to as the Serre axiom. In weight 3 that axiom doesn't apply - it seems specific to abelian varieties - and those three excluded Sato-Tate groups can actually occur: they can be realized by L-functions of weight 3 and degree 4. In addition, there's even another connected Sato-Tate group that shows up in weight 3: the degree 2 unitary group U(2) can be embedded inside USp(4), and it shows up as the connected Sato-Tate group of weight 3 motives that have degree 4 L-functions.

29:13 - Okay, so I want to switch back to our focus on Sato-Tate groups of abelian varieties and rational L-functions of weight 1; I just wanted to broaden the picture and make sure that you have a chance to see the larger universe of L-functions, rational L-functions, and the Sato-Tate groups one can attach to them. Okay. Anyone who's ever seen me give a talk about Sato-Tate distributions knows that at some point I'm going to show you an animated histogram, and we have now reached that point. This is one of our standard examples: a Sato-Tate histogram for the genus 2 curve y^2=x^5-x+1, and I'm going to go ahead and start the animation. What you're seeing in each frame is a histogram in which each bucket counts the number of normalized Frobenius traces that lie in some subinterval of the interval [-4,4], in which we know the normalized Frobenius trace of a genus 2 curve must lie.
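A bare-bones version of how such a histogram can be produced (a sketch, assuming `normalized_traces` is a list of values a_p/sqrt(p), for instance as computed by the naive point-counting sketch earlier):

```python
import matplotlib.pyplot as plt

def plot_sato_tate_histogram(normalized_traces, g=2, buckets=200):
    # Bucket counts over [-2g, 2g]; density=True rescales the histogram so
    # that it approximates the Sato-Tate density as more primes are included.
    plt.hist(normalized_traces, bins=buckets, range=(-2 * g, 2 * g), density=True)
    plt.xlabel("a_p / sqrt(p)")
    plt.ylabel("density")
    plt.show()
```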

30:23 - And you can see that we're getting a nice smooth curve here, corresponding to the Sato-Tate group USp(4); this is a generic genus 2 curve. What happens if we take a non-generic genus 2 curve, say this one? Now we get a Sato-Tate distribution that looks quite different. You can even see it starting to emerge in the picture that this Sato-Tate group has multiple components, and in this case the spike in the middle actually has positive density: its density is going to turn out to be exactly 1/6, corresponding to the fact that 1/6 of the Frobenius traces for this curve are 0. We can also look at an example in genus 3. This is a particular genus 3 curve, and my colleague David Harvey, who was involved in writing the code that allowed us to actually compute this Sato-Tate distribution, likes to call this the Batman distribution, for obvious reasons.

31:24 - So this curve has a very interesting Sato-Tate group and an interesting series of moments of its Sato-Tate distribution. In this case the spike in the middle has density 1/4, while the other spikes have density 0. So among the Sato-Tate groups of abelian surfaces that are defined over Q, in addition to the generic case there are 25 exceptional Sato-Tate trace distributions; I've shown them here. If we were to consider trace distributions of abelian surfaces over number fields, we would get a larger number - there are 36 possibilities. That's still fewer than 52, because the 52 Sato-Tate distributions of abelian surfaces do not all have distinct trace distributions.

32:20 - And in dimension 3 - Sato-Tate groups of abelian threefolds, or genus 3 curves - there are 410 possibilities altogether for the Sato-Tate groups, and 245 distinct trace distributions. That's far too many for me to fit on a slide, so what I've done instead is just show the 14 connected Sato-Tate groups; these are the 14 identity components of the Sato-Tate groups that arise in dimension 3. Okay, I've been talking for a while; this might be a good time to take any questions people have on the first half of the talk. R. Pries: It's going great. A. Sutherland: Okay, great. Good to know there are people out there - I can see 151 of you actually are out there. So, now that we've seen that we can think of a Sato-Tate group as being associated to an L-function, and we understand that to compute an L-function it's really enough to compute the a_p for enough primes p, I want to spend the second half of this talk explaining how one can do that efficiently enough to actually produce pretty pictures like this. Each of the pictures you're seeing here involves hundreds of millions of point-counting computations, over primes running up to a bound that's at least 10^9 in every case.

33:55 - So there are lots of ways to compute L-functions; I'm going to focus on algorithms that are practical to run in large volume for low genus curves. In the table here, I'm showing algorithms one might use to compute the L-polynomials L_p - the polynomials that show up in the Euler product of the L-function of our curve and also in the numerator of the zeta function. But we want to do this not just for one prime p; we want it for all good primes, or all but some finite set of excluded primes, up to some bound B. One approach would be to do this naively: literally count points on a suitable model of the curve. This leads to an algorithm that is roughly quasilinear in the number of points, and it's not going to be very effective when g gets big - in genus 3 we would be spending something on the order of p^3 time, and this for every p up to a bound B running up to, say, a billion; that's not practical.

35:07 - A more efficient approach, which turns out to be very efficient in the genus 1 case, is to use group computations in the Jacobian. Using algorithms like baby-steps giant-steps, you can efficiently compute the order of the Jacobian, which gives you information about the point counts of the curve, and you can use that to get an algorithm that runs in time proportional to the fourth root of p in genus 1 and p^{3/4} in genus 2; in genus 3, if you also count points over F_p - which you may as well do if you're going to spend time that's at least O(p) - you get an O(p) algorithm. A more efficient approach asymptotically, as g grows, is to use any of several techniques based on p-adic cohomology; Kedlaya's algorithm is perhaps the best known example. This allows you to compute the zeta function, including the numerator L_p, in time roughly proportional to the square root of p. The genus is relevant here, but I'm considering a fixed genus, so I'm not including it; a nice thing about this algorithm is that its complexity grows only as a polynomial function of the genus.

36:24 - Now, those of you who are familiar with Schoof's algorithm, or its generalization to abelian varieties by John Pila, might be wondering why we're even wasting time thinking about algorithms that are exponential in the size of the input. The input size - whether we write down our curve with coefficients modulo p, or with integer coefficients on the order of B - is logarithmic in p, and so one might hope for an algorithm whose running time is polynomial in log(p). Such algorithms exist: Schoof's algorithm in genus 1 has running time proportional to log(p)^5, and if you use the Elkies-Atkin improvements, the SEA algorithm, you can even bring that 5 down to a 4. Analogues of this exist in higher genus: appropriately optimized, the 8 here in genus 2 is, I think, originally due to Gaudry and Schost, and the 14 seen here in genus 3 is a recent result from Abelard's 2018 thesis.

37:32 - So these are impressive results, and really the only option when p is really big, say of cryptographic size. But for the problem we're considering, p is never going to be of cryptographic size, because by definition, if we want all of these L-polynomials for primes p up to B, the size of our output is exponential in log B, so we're never going to be able to make the upper bound on p all that big. And within the feasible range of computation, these approaches that are polynomial time in log(p) are not actually even competitive with other approaches that are exponential in log(p). However, there is a newer approach, introduced in 2014 by David Harvey, which gives what he calls an average polynomial time algorithm. The trick is that all of the other algorithms work prime by prime, but Harvey's algorithm doesn't work prime by prime: instead, it gives you the answer for all the good primes up to some bound B all at once.

38:40 - So it's going to be doing a computation in which you have essentially no choice but to compute all of them - or you may as well compute all of them, since it's no faster to compute just one. But when you compute all of them, you get a running time that, divided by the number of primes, works out to be proportional to the fourth power of log p. That's already better than Schoof's algorithm even in genus 1, and in higher genus it's dramatically better than any other approach that's known. Now, I've highlighted in blue the algorithms I actually used to create the histograms I've shown you, and notice that in genus 1 it's not the log(p)^4 algorithm that's highlighted.

39:15 - That's because for practical values of p, say p up to 2^32 or 2^40, the fourth root of p is actually smaller than log(p)^4, and even after you adjust for the constant factors, the constant factors in this line are better than the constant factors in that line. But for genus greater than 1, the average polynomial time approach is the way to go. The other point I want to make on this slide is that, for the purpose of computing L-functions, while in principle we might be very interested in knowing the full Euler factors, to compute the L-function we only really need to know the a_n up to this bound B, and for the most part that just means knowing the a_p: there are only about sqrt(B) primes p with p^2 <= B, and only about B^{1/3} primes p with p^3 <= B.

40:19 - So we can afford to spend a lot more time computing the a_{p^n} with n > 1, because there are so few of them. In fact, we could afford to use any algorithm that is linear in p to compute these a_{p^2} and a_{p^3}. As long as we have a fast algorithm, say an average polynomial time algorithm, for computing the a_p, we get an overall running time that's on average log(p)^4 per prime. This allows us to compute the entire L-function, provided we can take B up to some constant times the square root of the conductor of the Jacobian of our curve. Okay, and that makes this a very feasible computation in genus 1, 2 and 3; even in genus 4 this would still be feasible, as long as the conductor is not too big. Okay. So now I want to talk about Harvey's average polynomial time algorithm. First I'll state the results, and then I'll give some details on how one actually goes about running this algorithm.

41:26 - The algorithmic result is far more general than just computing the L-function of a curve: his algorithm can compute the Hasse-Weil zeta function of any arithmetic scheme. So let X be a scheme of finite type over Spec(Z). We can define the Hasse-Weil zeta function as a product of local zeta functions, where these local zeta functions - at good primes, anyway - look exactly like our definition of the zeta function of a curve. You can think of them as formal power series: we define them as exponential generating functions whose coefficients come from point counts, where we have to remember what we mean by a point count on a scheme. To count the F_{p^r}-rational points on our scheme, we compute a sum over closed points, weighted by the degree of the residue field. The details of this will be very familiar to experts, and for non-experts the details aren't so important; what really matters is that these local zeta functions are defined by point counts, and as long as we can count points we can compute each of the local zeta functions, and then the complete Hasse-Weil zeta function.
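For reference, the local factors just described take the familiar shape (with N_{p^r} the weighted point counts just defined):

$$ \zeta_X(s) = \prod_{p} Z_p(p^{-s}), \qquad Z_p(T) = \exp\Bigl(\sum_{r \ge 1} \frac{N_{p^r}}{r}\,T^r\Bigr). $$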

42:55 - Harvey's algorithm is going to give us a method to simultaneously compute all of these local zeta functions, and therefore compute the Hasse-Weil zeta function - or rather a partial Hasse-Weil zeta function, where we take the Euler product over all primes up to some bound - and that's exactly what we need to compute the L-function. Now, this language of Hasse-Weil zeta functions is not quite the same as that of L-functions. So a natural question would be: if I have a nice curve X over Q, say, and I pick an integral model \mathcal{X}, I can consider my integral model as an arithmetic scheme, so it has a Hasse-Weil zeta function, and I can ask what the difference is between the L-function of my curve and the Hasse-Weil zeta function of my arithmetic scheme. The answer is that they're basically the same. At good primes the local zeta functions will be exactly the same as the L-polynomials in the Euler product of the L-function, and from our multiplicity one perspective, where we know it's enough to understand what happens at all but finitely many primes p, that's all we need. So the primes where these two functions differ aren't necessarily of concern to us, but just for pedagogical purposes, I think it's important to know that they aren't quite the same: the L-function and the Hasse-Weil zeta function may differ at bad primes.

44:27 - But they don't necessarily always differ; it depends on how bad the reduction is. For example, if I take the elliptic curve 49a1, of conductor 49, and take its minimal Weierstrass equation as my arithmetic scheme - so here's an equation with integer coefficients - and I compute the local zeta function at 7, I get 1-7T^2. That's not the same as the L-polynomial at 7, which is just 1: the Euler factor at 7 for the L-function of this elliptic curve is 1, not 1-7T^2. On the other hand, if I were to instead look at the elliptic curve 11a1, which has semistable reduction at 11, the bad factors actually do agree. Okay. So now we can state Harvey's result. Let X be an arithmetic scheme; we fix X first. Given X, there is a deterministic algorithm that, given a prime p, computes the local zeta function at p in time quasilinear in p - more precisely, p log(p) times some small factor which is going to be some power of loglog(p).

45:45 - And you can do this very space-efficiently, using only O(log(p)) space. If you'd like to compute the local zeta function a bit faster, you can do it in time roughly proportional to the square root of p - now with a factor of log(p)^2 rather than log(p) - but it takes more space: you're going to need roughly O(\sqrt{p} log(p)) space. The third result, and the one that's most important for us, since we're interested in L-functions and not just zeta functions at a particular prime, is that there's a deterministic algorithm that, given an integer N, outputs the local Hasse-Weil zeta functions at all primes p up to N in time roughly N log(N)^3, using on the order of N log(N)^2 space. Now, there are about N/log(N) primes up to N, so this gives us an average of log(p)^4 time per prime. And as I mentioned, in all of these complexity bounds X is fixed; it's only p or N that is part of the input. So the description of X can actually be quite important in a practical implementation of this algorithm.

47:06 - If the description of X is very complicated, the constant factors can get unwieldy. But if we constrain our description of X - say, for a hyperelliptic curve, I can insist that you give me X as y^2 = f(x), not as the intersection of a whole bunch of equations in some high-dimensional projective space; or, for example, I might say give me a plane model for X, maybe even a singular one - I can take that as my model for X. In the case of plane curves, one can run through Harvey's description of this algorithm as given in his paper, and you get a running time that has exactly the dependence on N that's in his theorem, as we would expect, with dependence g^14 on the genus of the curve. Now, the 14 might look a little scary, but you should really think of it as a 7, because for plane curves the description of the curve - the number of coefficients - grows like g^2.

47:59 - So relative to the size of the input, this is really only a 7th power, not a 14th power. In fact, I expect one can improve this exponent a lot; this is just the bound one gets by running through the proof that Harvey gives. And I should mention that these aren't just existence statements: Harvey actually spells out the algorithm, at least in principle. Okay. So now I want to talk a little bit about the difference between what you can do in principle and what you can do in practice. To date, all practical implementations of Harvey's algorithm don't actually compute the entire local zeta function. What they do instead is compute just the L-polynomial modulo p - more precisely, they compute the Hasse-Witt, or Cartier-Manin, matrix of the reduction of the curve modulo p (a g by g matrix with coefficients in F_p) for all good primes p up to B. The key point is that the trace of this matrix is equal to the trace of Frobenius modulo p, and knowing its value mod p actually determines the trace of Frobenius as an integer once p is big enough: all we need is p > 16g^2, and that's a very small bound when g is small. For p smaller than this, we can just count points naively to figure out what the traces are.
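A sketch of that lifting step (assuming good reduction and p > 16g^2, which forces |a_p| <= 2g*sqrt(p) < p/2, so the residue mod p pins down the integer):

```python
def lift_trace(t_mod_p, p, g):
    """Recover a_p from its value mod p using the Weil bound |a_p| <= 2g*sqrt(p);
    valid whenever p > 16g^2, since then 2g*sqrt(p) < p/2."""
    assert p > 16 * g * g
    t = t_mod_p % p
    return t - p if t > p // 2 else t
```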

49:38 - Fast implementations of these algorithms are available in the following cases. For hyperelliptic curves over Q there's the implementation that David Harvey and I did; there are two versions of this, and I only mention the first one for historical reasons - if you're going to read a paper, read the second one. That's where we actually figured out how to do this properly. There's also an algorithm that treats geometrically hyperelliptic genus 3 curves defined over Q.

50:10 - These are curves that have a hyperelliptic model once you base change to a quadratic field, but over Q they look like degree 2 covers of a pointless conic, so the description is slightly more complicated. One can still implement an average polynomial time algorithm for computing their L-polynomials mod p; this was work I did with David Harvey and Maike Massierer. There's also an algorithm for smooth plane quartics; this is work in progress that we hope to publish later this year.

50:52 - This is work of David Harvey, Edgar Costa and myself. And there's also an algorithm for superelliptic curves that I just found out has been accepted to ANTS - so I'm excited about that - and a preprint of this paper is available on my website. If you're frustrated that there aren't more examples here, well, there's a toy implementation of Harvey's algorithm that works for any smooth plane curve, and it's not hard to generalize it to handle other curves as well; you can find it in the lecture notes for a summer school I gave on this topic last year in Bristol. This link will take you there, and you can work through the exercises I gave in the summer school course and then extend them to actually get a fast, efficient algorithm. So there's a lot of work still to do here, and I invite others to join us in the effort to compute more and more of these L-functions and more and more Sato-Tate distributions. Okay. In the last few minutes, I'm just going to very briefly sketch how this algorithm works, and to keep it brief I'm going to describe it in the simplest possible case, a genus 1 curve - which, as already noted, is actually the one case where you probably wouldn't bother to use this algorithm in the first place. But that's okay; it will help us understand what's going on.

52:11 - So let's suppose we have an elliptic curve, say y^2 = f(x) with f a cubic. I'm going to consider powers of this polynomial f, and in particular the coefficients of various powers of x in the powers of f. We know from the theory of the Hasse invariant that we can compute the trace of Frobenius mod p - I'm assuming p is odd here - by taking f^{(p-1)/2} and extracting the coefficient of x^{p-1}; that's one particular coefficient of this polynomial. So you could do this naively: just exponentiate this polynomial and pull out one coefficient.
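That naive version fits in a few lines (a sketch; it assumes p is an odd prime of good reduction for y^2 = f(x)):

```python
def trace_mod_p_naive(f, p):
    """Coefficient of x^(p-1) in f(x)^((p-1)/2) mod p, congruent to the trace
    of Frobenius mod p; f is the coefficient list [f0, f1, f2, f3] of a cubic."""
    def polmulmod(a, b):  # multiply two coefficient lists modulo p
        c = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            if ai:
                for j, bj in enumerate(b):
                    c[i + j] = (c[i + j] + ai * bj) % p
        return c
    g, base, n = [1], [c % p for c in f], (p - 1) // 2
    while n:  # square-and-multiply exponentiation of polynomials mod p
        if n & 1:
            g = polmulmod(g, base)
        base = polmulmod(base, base)
        n >>= 1
    return g[p - 1] if p - 1 < len(g) else 0
```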

53:00 - That's not the most efficient thing to do. A better thing to do is to derive linear recurrences that relate strings of, say, three consecutive coefficients in the n-th power of this polynomial f. One can easily write down a linear recurrence that relates these coefficients, and then express those recurrence relations as a matrix. Then we write down a starting vector: if we take n=0, that's a very easy polynomial to write down, because it has only one nonzero coefficient - the constant coefficient, which is 1 - so we can start with the base vector [0 0 1]. We then multiply it by this matrix enough times: we fix n to be (p-1)/2, let k run from 0 to p-1, and compute the vector we get when we multiply our initial vector by the product of the matrices.
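Here is one concrete way such a recurrence can look - a sketch I am deriving from the identity f (f^n)' = n f' f^n, not the exact matrices of Harvey's paper. Writing f^n = \sum_k g_k x^k with f = f_0 + f_1 x + f_2 x^2 + f_3 x^3 and f_0 nonzero mod p, comparing coefficients of x^k gives f_0 (k+1) g_{k+1} = \sum_{i=1}^{3} ((n+1) i - (k+1)) f_i g_{k+1-i}, so a sliding window of three consecutive coefficients advances one step at a time:

```python
def trace_mod_p_recurrence(f, p):
    """Same quantity as the naive version, via the recurrence
    f0*(k+1)*g_{k+1} = sum_{i=1..3} ((n+1)*i - (k+1)) * f_i * g_{k+1-i}
    with n = (p-1)/2; assumes p an odd prime of good reduction, f[0] != 0 mod p.
    (Packaging each window update as a 3x3 matrix gives the matrix-product
    formulation described next; real implementations also avoid inversions.)"""
    n = (p - 1) // 2
    f = [c % p for c in f]
    assert f[0] != 0
    inv_f0 = pow(f[0], p - 2, p)
    window = [0, 0, 1]  # (g_{k-2}, g_{k-1}, g_k) for f^n, starting at k = 0
    for k in range(p - 1):  # advance the window until it holds g_{p-1}
        s = sum(((n + 1) * i - (k + 1)) * f[i] * window[3 - i] for i in (1, 2, 3))
        g_next = s % p * inv_f0 % p * pow(k + 1, p - 2, p) % p
        window = [window[1], window[2], g_next]
    return window[2]
```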

54:05 - So we've reduced the problem of computing a particular coefficient of a particular power of this polynomial to computing the product of a bunch of matrices times a vector. This expression here gives the vector computation, with appropriate scaling to make sure we get the right answer, and if we write down this matrix in the case where we fix n to be (p-1)/2, and we now think of it as a matrix modulo p rather than a matrix with integer entries, we get a matrix that doesn't actually depend on p. I mean, it implicitly does, because we're thinking about it mod p, but we can now just think of the entries as integers, view these matrix multiplications as taking place over the integers, and reduce mod p at the end. So we wind up with a formula that tells us we can compute all of the traces of Frobenius we want by computing a product of matrices times an initial vector modulo p. That leaves us with the following problem: we have a sequence of matrices and a sequence of moduli - here our moduli are just going to be primes - and we want to compute the whole sequence of partial products: take the first k matrices and compute their product modulo the k-th modulus.

55:30 - We can turn this into a recursive algorithm where we pair up matrices, pair up moduli, and reduce to a smaller instance of the same problem. This gives us what's known as an accumulating remainder tree. The recursion defines a tree where at the bottom we have individual matrices; we have a tree of matrices and a tree of moduli. We work our way up the tree computing products, and then work our way down the tree reducing the appropriate subproducts modulo the appropriate products of moduli. At the leaves we wind up with what we want: the products of matrices M_0 up to M_{n-1}, modulo the n-th modulus. When you put all this together, the depth of the tree is on the order of log(B).
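A compact recursive sketch of an accumulating remainder tree for 3x3 integer matrices (a toy version of my own, not Harvey's optimized implementation; call it with `prefix` equal to the 3x3 identity matrix):

```python
def mat_mul(A, B):  # naive 3x3 integer matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_mod(A, m):  # reduce every entry modulo m
    return [[x % m for x in row] for row in A]

def accumulating_remainder_tree(prefix, matrices, moduli):
    """Return the list [prefix * M_0 * ... * M_{k-1} mod m_k] for each k.
    Reducing the carried prefix modulo the product of each half's moduli
    keeps the intermediate entries from blowing up."""
    if len(moduli) == 1:
        return [mat_mod(prefix, moduli[0])]
    mid = len(moduli) // 2
    m_left = m_right = 1
    for m in moduli[:mid]:
        m_left *= m
    for m in moduli[mid:]:
        m_right *= m
    prod_left = matrices[0]  # product of the left half's matrices
    for M in matrices[1:mid]:
        prod_left = mat_mul(prod_left, M)
    left = accumulating_remainder_tree(mat_mod(prefix, m_left),
                                       matrices[:mid], moduli[:mid])
    right = accumulating_remainder_tree(mat_mod(mat_mul(prefix, prod_left), m_right),
                                        matrices[mid:], moduli[mid:])
    return left + right
```

(This toy version recomputes subproducts as it recurses; the real algorithm builds the product trees once and reuses them, which is where the quasilinear running time comes from.)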

56:25 - In each recursive step we're multiplying and reducing a bunch of 3x3 matrices. The total bit size of the data involved is O(B log(B)), and we know we can do multiplication in quasilinear time. The size of the matrices is fixed - 3 by 3, so the 3^2 entries just contribute a constant factor - and we wind up with an algorithm that allows us to compute the entire tree and all the reductions in time proportional to B log(B)^3. If we think of this as an average over primes, it gives us the log(p)^4 that we want.

56:59 - One can also use these same matrix products for a single prime p, and get the O(p^{1/2}) time algorithm that I mentioned earlier. So I'll just end with an open problem. We've seen that we can solve the problem of computing all of the point counts for a fixed curve, or a fixed arithmetic scheme, up to some bound in average polynomial time - time that is on average polynomial in the size of the input. But it would be really nice to have a polynomial time algorithm that takes just a single curve over a finite field and tells you how many points are on that curve in time polynomial in the size of the input; we don't have that yet. If you fix the genus we do - Schoof's algorithm is polynomial time, but that's because g is hardwired to 1. If my algorithm is supposed to take an arbitrary curve of arbitrary genus, even if I restrict to hyperelliptic curves, the description of the curve - the number of coefficients - grows with the genus.

58:05 - So the size of the input in the case of a hyperelliptic curve would be on the order of g log(p), or for a plane curve on the order of g^2 log(p), and we don't have an algorithm whose running time is polynomial in that quantity. So I'll leave that as an open problem for people to think about for the future. Okay, and I think I'll stop there. [applause] R. Pries: Thank you Drew, that was a wonderful talk. Let's start: there were some questions from Andrew Obus in the chat window.

A. Obus: Okay, can I talk? Can people hear me? 58:50 - So I had a question about the hyperelliptic and superelliptic curves, to the extent that one does better with those than with general curves for this g^14 factor. I looked quickly at the papers you mentioned, and you have these algorithms that work for curves given by y^n = f(x) where f is a separable polynomial, so it's like a smooth plane curve with some stuff at infinity. Is this fundamentally easier than other cyclic covers of P^1 that maybe don't have a form like that - say, with a branch cycle description that's not (1 1 1 1 ... 1)? A. Sutherland: Great question. The answer is no, it's not fundamentally easier - in fact, it's fundamentally harder. So the good news, for those who are interested in branched covers, is that the formulas describing the Cartier-Manin matrix - the matrix of the Cartier-Manin operator - are a little more complicated to write down, but this has been done: Irene Bouw has computed formulas for the coefficients of the Cartier-Manin matrix, whose entries come from coefficients of some power of the polynomial defining the curve, and one can work out the details of the algorithm from there. I didn't do that in my paper, to the disappointment of some of my colleagues, just because it was more complicated and wasn't going to fit into an ANTS submission. But yes, it should not be harder.

00:25 - It'll be hard to work out the details, but once you've worked out all the details, the running time of the algorithm will actually be faster, because the matrix is smaller. I mean, in terms of the degree of the polynomial: if you have repeated roots in your polynomial, the genus is going to drop, and so the size of the matrices you're working with is going to go down. So the running time of the algorithm will actually be slightly faster. A. Obus: Faster relative to the degree of the polynomial, but kind of the same relative to the genus? A. Sutherland: Exactly; in terms of the dependence on the genus, it should be the same. A. Obus: Okay, thank you.

R. Pries: Let's see, there's also a question from Fabien Pazuki. 01:10 - F. Pazuki: So, well, hi guys, it's a pleasure. Thank you for your talk, Andrew, it was amazing. So the question is a bit vague. I was a bit surprised by the fact that the Batman density you showed didn't seem to be C^1 - I mean, it looked like it has spikes, right? So I was a bit curious: say we fix genus 2, you showed us the connected groups, I think in the case of connected groups... A. Sutherland: I'm trying to get to your Batman. This is the picture you're asking about, right? F. Pazuki: Yeah, this one.

01:52 - Yes, there we can see that you have these spikes going on. Do you know, say for genus 2 curves, if it has some kind of meaning in a different way - that maybe the density there is not as nice, maybe, as a sine function or a Gaussian? It's a bit vague, but it looks a bit strange that this density has spikes like that. Can you say something? Let's say the curve has a meaning over Q, the model is integral, so maybe the rank is special, or, I don't know? A. Sutherland: Yeah, so let me answer that. The picture is misleading: this spike in the middle is in fact far spikier than the others, by which I mean the spike in the middle has positive density. The trace...

none of the other spikes can have positive density; you can only have positive density at 0. Okay, that's not true in general: I didn't show them in any of these animations, but if you were looking at a histogram for the a_2 coefficient - say, the middle coefficient of the L-polynomial of an abelian surface, or in general the even-indexed coefficients of the L-polynomial - you can get positive density at other integer values. But for the odd-indexed coefficients you can't get positive density except at the middle. That doesn't mean there's not something interesting happening here. What's happening is that there is a CM elliptic curve factor - an isogeny factor, over some base extension or something similar - and you're seeing a component here where the distribution is really trying to be one of these guys. You'll see here that these approach infinity at the edges of the interval.

03:45 - And I think that is what is responsible for the spikes you're seeing in the Batman. But for the a_2 coefficient it's actually quite useful to look at the positive density spikes. In genus 2, for any integer in [-2,2] you can get positive density, and that's actually a really useful thing to look at when you're trying to determine what the Sato-Tate group is, or trying to distinguish two Sato-Tate groups. You can often have histograms that look very similar on the screen, but whose spike densities are quite different.

04:26 - And the density of a spike is something you can compute very quickly, because - in the case of genus 2, anyway - these are all rational numbers whose denominator has to divide 48. So you can tell them apart even with only a few thousand a_p: you can tell what these densities are looking like. We use this also in the project that's still ongoing in genus 3: we're using information about both the trace-zero density for the a_1 and a_3 coefficients, and also the positive-density integers for a_2, to help distinguish groups. But whether there's more one can say, sort of from an arithmetic perspective - yeah, I think that's a great question. K. Kedlaya: Maybe I can add one comment. A. Sutherland: Yes, please do.

K. Kedlaya: Just a comment. 05:19 - I think Fabien asked at some point specifically about connected groups. If you look at connected groups, you can still have these zero density spikes, like the one for the CM elliptic curve, but I believe you can't have positive density spikes when the group is connected. F. Pazuki: All right, thank you. R. Pries: Great; let's see, any other questions? Question: Also, you mentioned doing stuff for higher genus; can you do things for, say, abelian varieties, maybe principally polarized, that aren't Jacobians? A. Sutherland: Yeah, so in principle there's no reason not to. I guess the trick is that we need to be able to write down a description of the abelian variety. Now, one easy way to do what you're suggesting - and in fact the way I computed a lot of the pictures you see on the screen here - is to cheat: I didn't necessarily go and find a curve defined over Q with a given identity component.

06:37 - What I did instead is, for example to realize U(1) x U(1) x U(1), I just take three elliptic curves with CM whose CM fields are different, so I know they're non-isogenous, and I can compute the L-function of the product - the direct sum of the motives - just as a product of L-functions, and compute the histogram from that. So if you can describe your abelian variety for me, whether it's polarized or not - the algorithms don't really care about being principally polarized or not - they just care about having a description that you can write down. So certainly for computing products, and also quotients. I mean, there are situations where maybe your abelian variety is a Prym variety.

07:21 - Say I want an abelian surface that's coming from a Prym variety, defined by a map from a genus 3 curve to an elliptic curve: as long as I can run my algorithm on the genus 3 curve and on the elliptic curve, I can tell you exactly what the quotient looks like and compute the distribution. So I think probably the best approach in general, if you want to treat arbitrary abelian varieties that are not necessarily principally polarized, is to try to express them as a Prym and then run this algorithm on the top and the bottom of your Prym. R. Pries: Great; let's see, any other questions? Question: Although in that case you tend to deal with higher genus than you would otherwise. A. Sutherland: This is true, and I'm open to suggestions, but my instinct is that I'd be willing to accept bumping the genus up by 1 or 2 versus having to try to write down explicit equations for my abelian variety as a variety - that sounds a lot more difficult to me. But if there are other ways of describing the varieties you're interested in, there may be more efficient approaches.

08:41 - Certainly, if you can describe your variety as a product of other varieties, that's a huge help, because we can work on the factors independently. And that's another thing we have done. It might sound quite trivial, but another thing you can do - and this is how we had to realize some of the abelian 3-folds that show up in our classification - is define your abelian variety by taking a twist by some Hecke character, or some other construction where the L-polynomials you're interested in are computed in a somewhat complicated way, taking as input L-polynomials coming from things that are easy to compute and then manipulating them in some fashion. As long as we know theoretically that this gives the actual answer for the abelian variety - which we know exists, but is too painful to write down explicitly - that's another approach one can take. Question: One thing I was thinking about: what if we go to genus 4 varieties? A. Sutherland: Yeah.

09:51 - I haven't thought hard about the genus 4 case yet, but I know it's sort of the obvious next thing to look at, and I'm definitely very interested in it. Even aside from the classification question, just being able to compute them efficiently - I think it's very much within the realm of computation, and a lot of new interesting phenomena show up in genus 4. I mean, I sort of feel like for g = 1, 2, 3, 4 something brand new and exciting and interesting happens at every step; I'm not sure if that happens again at genus 5 or not. But for the moment, I'm very motivated to move on to genus 4. R. Pries: Well, thank you again Drew, this has been a fabulous talk. The next talk will be by David Zywina on May 5th; let's all thank Drew one more time. [applause]