Minimalistic typed Lua is here - FOSDEM 2020

Feb 2, 2020 19:35 · 4461 words · 21 minute read lua mode types start minimalistic

well I think we’re past time so let’s get started hi everyone I’m Hisham and I’ve been involved with Lua for a long time and I’m the maintainer of LuaRocks, the Lua package manager, and also I’m involved with other free software projects, mainly htop, I work at Kong where we develop also an open source project, Kong, we’re hiring by the way — are we? yes we are! so yeah we are here to talk about Lua and types and minimalism so this talk is actually a sequel because last year I was here at what was then called the Minimalistic Languages developer room to talk about “Minimalism versus types” so it was about the journey of, you know, the difficulties between Lua and the idea of types and I hinted in the end about some further development so my last slide was actually “to be continued” so here I am and it’s very fortunate that the the scope of the dev room was expanded this year and it’s now called “Minimalistic Experimental and Emerging languages” because what I’m going to be talking about today hopefully ticks all boxes alright so let’s get started I’m gonna start with a quick recap of what I talked about last year in case you weren’t here so we’re gonna be talking about adding types to languages essentially every modern practical programming languages has types even though we called some of them like typed and not- typed, untyped like they do have types because like strings and numbers there are different things even if you don’t have to name there your variables because essentially what we’re talking about is having dynamically typed languages where values have types but variables don’t so any variable can accept any value and here’s a bunch of examples and the statically typed languages where values have types but and variables also have types right so only things that match can be assigned to each other and in recent years there have been a lot of interest in bringing types because they help the programmer to avoid mistakes so there has been a bunch of projects related to that like Python has mypy and recently pytype, Ruby has Sorbet, Typescript has become super popular for for JavaScript and and so on but what about Lua? so last year I talked about the challenges involved in in bringing types of Lua which is such a minimalistic tiny language — there have been projects such as Tidal Lock and Typed Lua which is an academic project right but we never really got there so what’s the what’s the main challenge especially in a minimalistic language well adding types or really adding anything to a language makes it larger, so it makes it larger both conceptually and also in implementation. Last year I kind of argued that conceptually the complexity when you’re writing your programs—well not in the language itself but in your programs—the complexity of types are already there if you’re maybe using a dynamically typed language but you’re constructing complex data structures and well you have to put the right kinds of data in the right places or else your program is not correct. So the idea of all of the types there are already there. Your program is as complex is as conceptually complex as, you know, its purpose. Though the implementation one is kind of hard to counter argue it’s really a thing: if you’re going to be adding more stuff, more features, your implementation is going to get more complex so here’s the conundrum like if the language grows too much it doesn’t feel like Lua anymore you know if there’s a tons of concepts you know and it starts turning in slowly turning into C++, right? but if the if we keep the implementation and everything super minimalistic and simplistic then it will stop you from doing all of the powerful things that you can do with Lua because it’s like super dynamic and and very like freeform you know as dynamic languages are so if if it’s too simple then it’s too restrictive and it doesn’t feel like Lua anymore either right so but we want both we want a small language that fits you in your head and it’s like how I like to describe Lua and why I like it like the full reference manual is this like small single page like you can kind of read in one go and understand everything there is to understand about language but also I want a type checker that at least like catches when you make a silly typo or something like that right? I want the practical benefits of you know doing like pair programming with the compiler you know in order to catch my mistakes so the challenge really is to find this sweet spot between minimalism and functionality you know this is this is what this is about so I’m gonna be talking about TL I’m in search of a better name still right so this is like the temporary name that I still had last year I’m still haven’t found a better name yet which is a typed dialect of Lua so it’s let’s talk about the minimalism part so the implementation tries to be very minimal in the Lua spirit. Lua itself if you download from the lua.

org 05:38 - website you get like a 300 kB tarball from whichyou can build a full VM right and it’s written in pure standard C with no dependencies, that means only standard C library, that means no sockets you know it doesn’t come with with batteries included, that even means no directory traversal right if you want to do file system information have to get a an extra library. so TL is in that same vein: so it’s a single file it’s currently as less than 5,000 lines counting with blank lines and comments and everything right so it’s one pure Lua file with also with no dependencies and actually it’s a pure TL file which is transpiled — source-to-source compiled into into Lua it is a source-to-source compiler, or as people nowadays call it “transpiler”, so it’s essentially the output it’s essentially your input without the types so this is the breakdown of the source code on to which like every every part is you know so we have a lexer, we have a little pretty printer that I used to debug the lexer, a parser written by hand, recursive descent parser, probably could be smaller if I had used an external dependency to use a parser generator but I wanted to keep everything self-contained right so there’s like a small AST traversal that implements a visitor pattern which I use for the pretty printer that dumps the output file then the type checker which is the bulk of the thing, definition of types for just Luis tender library which is still incomplete you know in a small loader that allows you to load TL files with require from Lua code. so the idea of no dependencies is that it you can take the generated TL that Lua drop it in your lower project and off you go like you can require TL files and and mix them with your with your Lua files of course this has pros and cons as I said but for now this is what I’m going with so there’s also a TL it’s not like one single file because the bulk of it for running it in your project is one single file but there’s also like a little command line TL file which you can use to “check” a TL file so it’s going to look at your TL file see off the types match and if your program is going to go to the Good Place or the Bad Place and then there’s a “gen file.tl” which strips the types, generates a Lua file but does not type check it so even if your type program has type errors you can still convert that to Lua and run it and “tl run” which does all things: it checks the file and if it passes it generates the Lua file in memory and just runs it straight with your Lua VM so so that the experience becomes self-contained it has two modes of execution which is depending on that the extension of their file so that.tl would make it “strict mode” and.lua would be more of a “lax mode” essentially for interacting—for having Lua programs that interact with.

tl 08:47 - files so the difference would be something like this if you write a Lua, if you write code like this in Lua which—this is fine, but if you are in strict mode then that means you haven’t annotated any return types so that function is annotated so that it doesn’t return anything and we don’t do advanced—in a minimal spirit we don’t do advanced flow inference right so in strict mode this would be an error because it says like f(x) does not return anything and you’re trying to get its return value here when you’re when you’re calling it but that would be accepted in the.lua mode because in the.lua mode the return type of that function is unknown which is the type of everything that it’s not annotated so if you do that then strict mode is happy and it knows how you’re going to return a number and everything’s fine in in a in a.tl file so TL reports errors and unknowns separately so when you start to convert your lua file into a TL file it’s not that you’re going to get like you know hundreds or thousands of errors you’re gonna get hundreds or thousands of unknowns right and says it will look at it and it will decide that ok I don’t know anything about like there’s no type annotations anywhere I don’t know basically anything about this file so as you start to annotate the types then the values of the types start to propagate things start to be, become less unknown right and eventually it might spot errors and then this is how you you gradually convert your program so the type checker really as we have seen it is it is the bulk of the compiler right and it does it does very minimal it does just like local inference if you initialize a variable like “local x = 0” then it knows it is a number you don’t have to type in like the types every time depending like it they can take like the return for example in in that example it knows that f(x) returns a number so here z is a number as well and it also does some very very naive flow inferencing like for things like empty tables like that so that ks you seen like, it was inferred to become the string the first time it the first time it was used and in table.insert so in the end that that works and it knows it to returns an array of string and if there is an error in that you use it inconsistently it will report an error and it will say “well, I was expecting an array of strings because I inferred it like at that spot”. It records for anything that’s inferred that requires the provenance and where it was inferred so that you can say “okay, that’s why you guessed it wrong” so the main thing about the type checker and the bulk of its work is determining the types of tables that’s usually the most complicated thing about type checking in Lua because a table is such a general type that’s the main thing about lua there’s no separate arrays and dictionaries, or like lists and hashes, everything is a table and a table can do everything.

So what is a Lua table in practice? when we are typing it 12:02 - we have to determine that and so essentially tables in TL can be either maps they are declared like this: you know so like keys are strings and values are boolean, arrays, declared like this, they’re essentially the same as maps with number as keys but like for better error reporting and like register programmer intent, i kept it as a separate type, then records which are specified by name so a record is a table that has you know keys with names, and array-record which is like a record and array all in one which is a very common Lua pattern in which people store things in named keys but also in the numbered keys store an array with that kind of like see programmers do when they do a struct and then put an array in the end so this is a very common lua pattern and I kind of had to support it from day one and sometimes people do array-maps as well but I have not implemented that yet i’m sure people will ask about it very soon as they as more people start using it but i just didn’t have the need so I was going with the bare minimum that I needed because my very first project which I talked about last year was trying to bootstrap it, now we’ve got to the point where it did, by the time I ended my talk I remember I had like 300-something type errors and but now the thing actually type checks itself. So I mentioned that that records are declared by name so that was a big design choice to keep things minimalistic to keep things small right there was a choice of making type systems can be either nominal or structural and I chose nominal because it just makes everything simpler essentially to compare if things match you just check the names if, you know, if the names are the same then it’s fine so you can declare record types like this and then you use them by name so for now there is no inheritance or any kind of interfaces or traits and things like that so this is this is one limitation and but I think eventually something of that kind will have to be added but then as I was as I was writing and as I was typing the compiler itself the code of the compiler itself I started writing it in Lua because TL did not exists right so I wrote it to the point that it could start checking itself and then I started annotating the types right and then I realized that well when programming in a dynamic language with dynamic types is trivial to generate very generic code right so I realized that I actually needed generics in order to write the code like, in order to translate the code that I had already written in Lua right so I added that with the backtick there as a slight nod to OCaml’s equality types so this is the same code as as that one right but instead of like only strings and now it works on tables of any kind because I either had two choices when I got to the point that I was translating, I was typing that code like either I would have to add inheritance to the language and then readjust all my Lua code to make it more object-oriented or I could keep it the way I originally programmed in Lua but then I had to have generic types so so I added type variables like that, they only check for equality they’re not very smart you know, it’s a very minimal support that’s mostly useful for things like this which are general containers of things but it works quite well so I was able to use like the visitor for different things like pretty printing and and type-checking and all of that right so this was an important example of prioritizing the practical needs over going through a feature checklist right okay I have records I need inheritance or I need traits or I need this or I need that right it was going through what I needed so “yay, types! now what?” then I realized that once I had the types and I kept working on it like I didn’t, all of those silly typo errors were gone because they were immediately caught by the compiler and then all I had left were tons of tons like it seems that every error that I got was related to nil like what Turing Award winner Tony Hoare called his billion dollar mistake: adding nil references. So yeah, so we still have nils and as in Lua any variable of any type may be a nil and how do you solve that well in many statically typed languages you have option types right Maybe, Result in Rust but for Lua that would be very tricky because essentially like every table access can return a nil so that means that every table access returns an option type and you have to destructure that and check and like that? that really wouldn’t make sense. It would take—but what if we can have the compiler detect it? that would be smart and that would be fun to do so yes so I started—so this is the “experimental” part that I started going down this rabbit hole and doing something that I promised myself that I would not do which would be to start doing researchy type things in the project right so I started writing some flow analysis to see if I could detect the need for optional type detection or if you can derive that that table access is never nil and and all of that right so I started writing writing writing and get myself totally sidetracked into that fun project of dealing with nils right so to the point that I started to add the flow analysis and it started growing growing growing right and but essentially I was dug out by off the rabbit hole right and and by the fossum deadline because I wanted to have something to show here and most importantly by user feedback right and I want to shout out to Patrick who’s been active in the github project over there posting issues and I just merged his first a PR into the project so someone else was actually trying to use it and they were hitting the things that were actually missing when you want to get really work done right that I didn’t hit when I was writing the compiler because that was just one program so it’s interesting to take a look and what were those issues and how that goes so the the first of them (five minutes? all right! we’re in the tail end) so the first one was that was that “okay, I want to write a TL program, of course I’m gonna have to use Lua libraries because there are no TL libraries so how do I add types to existing Lua libraries so I can so I can use them?” so “will it support something equivalent to Typescript declaration files?” so that obviously that was sorely needed if you wanted to get anything done with the language so yeah so I got at it and did it, it was very quick ten line thing to just add that and have it okay so when we’re type checking I just blatantly copied Typescript’s standard for that which is.d.ts so essentially when type checking before trying to require the.tl file or a.

lua file you just try to, 19:27 - when type checking try to use the definition file instead to read the types from even though they’re the functions are not there but when running then you actually use the real code. The next thing he asked for so it’s okay I have a definition file so I can start writing a definition file I started writing definition file for the Love2D library for writing Lua games right so he got to a point there said some functions in Love have multiple overloads such as this one but now it seems that you know TL kind of ignores the previous ones and just keeps the last and that is really what the compiler was doing but he needed both definitions so turns out the Lua does not have function overloading like that but it’s super common to fake it like Love actually has one definition of that function but the types are dynamic so it just does some “if type of the first argument equals this and that”, you know “then do this otherwise do that” and it feels like the user has two different functions with the same name. The funny thing was that I had cheated myself and for the definition of the standard library I had added polymorphic functions like that but I did not expose the type in the language because Typescript does something complicated: intersection types for that but then I decided to just expose that with the exact syntax that he guessed. I said okay if you have multiple entries of function types in a record they are just mushed together as an ad hoc polymorphic function but then it was the last one which is a fun one and it’s the last thing we’re gonna talk about here. This challenge: how do you type that table over there which is an array of things well the Love documentation says that’s okay it’s a table containing colors of the strings in that form: {color1, string1, color2, string2…

} and each color is in the form {red, green, blue, alpha} 21:24 - what’s the type of this? so what’s the type of coloredtext? that depends on how precise you want to be about it, right? the first step could be any which is a type that TL has this says “I don’t know anything about this” or you could say “well actually it’s a table” which is more precise you cannot pass it a number but don’t know anything about the table we could say something that “it’s an array of something” because what the user asked for was union types, what he wanted to write was this “I wanted an array of strings or arrays of numbers” but TL does not support that so the best that TL can do at this point is number 3 but this if you can think about it, this accepts a lot of invalid tables as well you can put the colors and the names in the wrong order and this would still accept it it would not be a type error so essentially what you would want would be something like this right so in the in the odd ones you would have numbers and in the even ones you have strings like you know does your favorite programming language like, can your favorite programming language express this one, right? but this still accepts invalid tables right because actually what you want is in the odd ones you want arrays of numbers of length 4 right but this still accepts invalid tables right because well the the length of the full array has to be even because you have to have pairs of colors but this still accepts invalid things because the numbers in there have to be between 0 and 1 right and looks like this is now the actual real type that lives in the programmer’s head went like the dynamic programmer when they wrote about it, when they designed the function this is the type they really had in their heads even if they don’t realize it but this still accepts invalid tables because this is not accounting for nil right so yeah so as you can see here like each type here expands on it saying like something more precise about the idea of the definition you would have you would need to have something like Idris, dependently typed languages, where essentially you have to define like a proof of you know and you could have like number 8 right there are languages that can do that but they’re like mostly research languages now. TL could do number 3 but if you have the statically typed mindset you would probably write like this instead the definition of your function right and then this works today in TL and would give you the same guarantee as number 7 so well, so the conclusion is types in Lua did they deliver? is it actually easier to maintain multiplication working tail and my answer is yes because i never felt this fast programming in Lua responding to user’s feedback and refactoring things and adding features and all of that so so in closing well TL is here I just tagged the very first 0.1.0 like you know super early early release you can run it “luarocks install tl” I’m still looking for a better name suggestions are more than welcome and well Lua and types so join us! thank you generic records yes it does have them I’m 95% sure but yeah yes yes yes it does yes it does and it’s it’s it’s very naive in how it matches the variables it just like it goes very top to bottom it does a single pass and everything has to be like the first time the variable is used that’s the type it gets and it works well for simple cases right any other question while next one sets up here yeah good question in which for which version of Lua does this work right now the code is right now the code it generates supports like 5.1 and up and it can use the compat53 library in the back to to give better compatibility this is kind of still an open question but I probably want to support the latest Lua and the latest JIT right so now a question are there plans to take advantage of type annotation to generate better code No, I mean I can have a little more detail after my talk but just to give a quick overview the type of information that LuaJIT uses to optimize code is already present at runtime this is about statically checking types before running the program which is kind of separate to what the JIT compiler does you gonna switch speakers and There you go Gotta switch speakers and yeah well I’m happy to talk about this offline — thanks again .