The Lexer | TempleOS 5-Minute Random Code Walk-Thru #13 (2014)

Oct 12, 2020 15:13 · 778 words · 4 minute read

Okay, this is TempleOS five-minute random code walk-through, episode 13, and I'm your host, Terry Davis. So let's see where God takes us today. Alt+B. And it looks like we are in the compiler's lexical analyzer, under ifndef. So the compiler has a routine called Lex, and Lex is almost the front of the pipe. You can imagine a pipeline through the compiler: it starts with LexGetChar, then it goes into Lex, then it goes into, well, let's say parsing. Then it goes into the optimizing: optimizing one, two, three, four. That's the pipeline. Okay, so let's take a look at Lex when we compile a statement. Let's go to an if statement. Lex returns tokens. So this is after it got an if: it's checking for a parenthesis in the if statement, then Lex advances it past the parenthesis, then it parses an expression, checks for another parenthesis, and Lex advances it past that parenthesis.
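The if-statement step described above can be sketched in plain C. This is only an illustration under stated assumptions, not the TempleOS source: the token codes, the `Lexer` struct, `lex_get_char`, `lex`, `parse_expression` and `check_if_header` are all hypothetical names, and the "expression" is cut down to a single identifier or integer.

```c
#include <ctype.h>

/* Hypothetical token codes; punctuation is returned as its own character. */
enum { TK_EOF = 0, TK_IDENT = 256, TK_INT = 257 };

typedef struct {
    const char *p;  /* next unread character */
    int tok;        /* most recent token code */
} Lexer;

/* Front of the pipe: hand out one character, or 0 at end of input. */
int lex_get_char(Lexer *lx) {
    unsigned char c = (unsigned char)*lx->p;
    if (c) lx->p++;
    return c;
}

/* Second stage: group characters into a token and remember it. */
int lex(Lexer *lx) {
    int c = lex_get_char(lx);
    while (c == ' ') c = lex_get_char(lx);
    if (c == 0) return lx->tok = TK_EOF;
    if (isalpha(c) || c == '_') {
        while (isalnum((unsigned char)*lx->p) || *lx->p == '_') lex_get_char(lx);
        return lx->tok = TK_IDENT;
    }
    if (isdigit(c)) {
        while (isdigit((unsigned char)*lx->p)) lex_get_char(lx);
        return lx->tok = TK_INT;
    }
    return lx->tok = c;
}

/* Stand-in for the real expression parser: one identifier or integer. */
int parse_expression(Lexer *lx) {
    if (lx->tok != TK_IDENT && lx->tok != TK_INT) return 0;
    lex(lx);  /* advance past the operand */
    return 1;
}

/* Checks the text that follows `if`: '(' expression ')'. Returns 1 if OK. */
int check_if_header(const char *text_after_if) {
    Lexer lx = { text_after_if, TK_EOF };
    lex(&lx);                        /* prime the first token */
    if (lx.tok != '(') return 0;     /* check for the parenthesis */
    lex(&lx);                        /* Lex advances past it */
    if (!parse_expression(&lx)) return 0;
    if (lx.tok != ')') return 0;     /* check for the other parenthesis */
    lex(&lx);                        /* Lex advances past it */
    return 1;
}
```

Calling `check_if_header("(x) y")` follows the same order as the walk-through: check, advance, parse, check, advance. The parser only ever looks at the current token, which is why each check is followed by a call that advances.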

02:08 - So you can also check the value, you can also use the return value of Lex. So LexGetChar feeds into Lex. So what Lex does is: Lex returns EOF if it gets a zero. Otherwise, if it gets a-z, A-Z or underscore, or the upper 128 characters, those are legal in an identifier. It's not just signed characters; my compiler uses unsigned characters, so my operating system can use the full range of 8-bit characters. Anyway, if you get an identifier, it returns an identifier. If you get a 0-9 as the first character of a token, then it reads in the other digits for an integer and it returns an integer. What we were looking at was the ifndef. So we got a pound character up here... where is it... So the main switch statement got a pound character, and then we got a token, we got a keyword, like include. Normally these are the pre-compiler pass, but in mine they're merged into Lex. I don't have a pre-compiler. So ifndef is a pre-compiler keyword, and it enters a state machine. There's a state machine for... okay, so, ifndef in the middle of a define, what do we do? I don't know.
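A rough C sketch of that first-character dispatch follows. The names (`LexResult`, `lex_one`, `is_ident_start`, `is_ident_char`) and the token codes are hypothetical, and allowing digits inside an identifier after the first character is an assumption; the walk-through only states which characters may start a token. Reading the input through `unsigned char` is what makes bytes 128 to 255 compare as large positive values rather than negative ones.

```c
#include <stddef.h>

/* Hypothetical token codes; other characters are returned as themselves. */
enum { TK_EOF = 0, TK_IDENT = 256, TK_INT = 257 };

typedef struct {
    int token;      /* TK_EOF, TK_IDENT, TK_INT, or a raw character */
    long value;     /* meaningful only when token == TK_INT */
    size_t length;  /* number of bytes consumed */
} LexResult;

/* a-z, A-Z, underscore, or any of the upper 128 byte values. */
int is_ident_start(unsigned char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '_' || c >= 128;
}

/* Assumption: digits are allowed after the first character. */
int is_ident_char(unsigned char c) {
    return is_ident_start(c) || (c >= '0' && c <= '9');
}

/* Lex a single token from the start of text (no whitespace skipping). */
LexResult lex_one(const char *text) {
    const unsigned char *p = (const unsigned char *)text;
    LexResult r = { TK_EOF, 0, 0 };
    unsigned char c = p[0];

    if (c == 0) return r;                 /* zero means end of file */
    if (is_ident_start(c)) {
        size_t n = 1;
        while (is_ident_char(p[n])) n++;
        r.token = TK_IDENT;
        r.length = n;
        return r;
    }
    if (c >= '0' && c <= '9') {           /* read in the other digits */
        size_t n = 0;
        while (p[n] >= '0' && p[n] <= '9') {
            r.value = r.value * 10 + (p[n] - '0');
            n++;
        }
        r.token = TK_INT;
        r.length = n;
        return r;
    }
    r.token = c;                          /* e.g. '#' for a directive */
    r.length = 1;
    return r;
}
```

With plain `char` on a platform where it is signed, the test `c >= 128` could never be true, so the cast at the top of `lex_one` is the part that matters for the full 8-bit range.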

05:19 - Anyway, so if we got an ifndef, then we stay in a loop. This is advancing... oh, okay, okay, now I understand. So if we got an ifndef, we read a token. This is recursive, it calls itself. If the token is not an identifier, we bail out. If it is an identifier, it checks to see if the identifier is in the hash table. If it is not in the hash table, then the ifndef is true and we continue parsing normally: this continues the Lex loop, so it stays in this function in a loop. However, if it is defined, then ifndef's false. So if it is defined, we stay in a loop parsing characters until we hit another pound. You can have nested pound-ifs, so it has this j, which is the depth that it's nested. So we read those characters, and finally, when we get a pound, we lex the token to get the pound keyword, and if it's an else and we're back to one, then we break, and when we break, this will continue in the loop.

07:38 - So what we did: we just scanned in all the characters until we finished the ifndef, and it handles nested ifs, nested pound-ifs. Okay.
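The skipping loop with its depth counter can be sketched in C as follows. This is a simplified illustration, not the TempleOS code: `skip_inactive_block` and `directive_is` are hypothetical names, `depth` plays the role of `j`, the directive keyword is assumed to follow the pound character immediately, and pound characters inside strings or comments are not treated specially. Stopping on an `endif` that closes the outermost level is also an assumption; the walk-through only mentions the `else` case explicitly.

```c
#include <ctype.h>
#include <string.h>

/* True if the directive name starting at p is exactly kw. */
static int directive_is(const char *p, const char *kw) {
    size_t n = strlen(kw);
    return strncmp(p, kw, n) == 0 &&
           !isalnum((unsigned char)p[n]) && p[n] != '_';
}

/*
 * Called after a conditional turned out false. Skips text until the
 * matching #else or #endif and returns a pointer just past that keyword.
 * Returns a pointer to the terminating zero if the block never ends.
 */
const char *skip_inactive_block(const char *p) {
    int depth = 1;                  /* we are inside one conditional */
    while (*p) {
        if (*p++ != '#') continue;  /* read characters until a pound */
        if (directive_is(p, "if") || directive_is(p, "ifdef") ||
            directive_is(p, "ifndef")) {
            depth++;                /* nested conditional opens */
        } else if (directive_is(p, "endif")) {
            if (--depth == 0) return p + 5;
        } else if (directive_is(p, "else") && depth == 1) {
            return p + 4;           /* else at our own level: stop */
        }
    }
    return p;
}
```

An `else` seen while `depth` is greater than one belongs to a nested conditional, so it is ignored; only an `else` at depth one ends the skip and hands control back to the normal lexing loop.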