So, up until now in this module in the previous units, we discussed the notions of parsing and grammars and tokenizing in a rather general context. And from this unit onward, we are going to talk specifically about the Jack grammar, and about developing the parser and the syntax analyzer for Jack programs. All right, now as you noticed, we use a certain notation when we describe grammars. And what you see here is the subset of the Jack grammar that we used all along. And I'd like to say a few words about the notation that we used all the time without paying much attention to it. So basically, here is a formal definition of the notation. If I write something in black font, and I put it within two quotes, well this is a description of a token that must appear verbatim, as is in the input. And everything else is sort of meta language that describes the grammar. All the green stuff is the scriptors that I invented or more precisely Norman and I invented in order to describe the Jack grammar. And we're using a certain formalism here that we thought is appropriate for our purposes. And so we use parenthesis to describe groupings of tokens, we use a vertical bar to decide a choice between different options and so on and so forth. We also use, as I explained before, question mark to describe something that should appear zero or one time, and we use an asterisk to describe something that should appear zero or more times. So these are the rules of the game. Very simple, very straightforward and that's all we need in order to describe the structure of programming languages. All right, so with that in mind, I'd like to now unveil the entire Jack grammar in all it's glory and before I do so I want to warn you that it may look a little bit forbidding. But the good news is that, the only reason it looks forbidding is because I condensed it to a size that will fit on one slide. So, here we see it and actually I think Norma and I are quite proud that the whole language fits on one slide and if you were to zoom out from this image here which is lifted out of the book. You would see that it consist of four sections and we already discussed most of what we see here. So what I would like to do next is go through everyone one of these four sections and spend a few minutes talking bout it. All right, so first of all the language has a lexicon or lexical elements and in particular as we said in unit I think the second unit in this module when we talked about tokenizing we have keywords. We have symbols, integer constants, strong constants, and identifiers. And these four terminal rules describe all the lexical elements that can go into Jack programs. These are the atoms or the tokens from which Jack programs consist. All right, moving along what about the structure of a Jack program. Well a Jack program is a collection of one or more classes. Each of these classes is stored, handled and compiled separately. Just like we do with the Java and other object oriented languages. And the structure of each of these classes is as follows. And what you see here is a non-terminal rule that describes what it takes to be a class. So, a class begins with a keyword class then we have a class name, which as you will see later on is an identifier. Then we have curly left paren which begins the definition of the class. Then we may have zero or more declarations of either fields or static variables. So all together we call this class classVarDec. And then we may have zero or more subroutine declarations followed a right curly bracket. So that's what it takes to be a class in Jack. Now drilling into this definition, what is a classVarDec? Well it's easier the keyword static or the keyword field followed by a type, followed by a variable which as you will see later on is an identifier also and then I might have zero or more comma separated additional viable names. So it can have something like staticintx, y, z. This definition here will conform to this grammatical rule. Then we have a type, what is a type? Well, in Jack the only three primitive types. Int, char, boolean or in Java there are eight types or it just a matter of simple few more keywords. And because Jack is an object oriented language and other possibility for a type is the name of a class. So a class type Is also permissible. Subroutine declaration always begins with either constructor function or method. The subroutine can be either void or a typed subroutine. In other word it returns either a void or a typed value and so on and so forth. Okay, it might have a parameter list and as you see in the rule, a parameter list is a name of yet another non terminal rule. It's the next rule and indeed we see that the entire parameter list is optional. We have a question mark at the end. So we either have some parameters or we just have the parenthesis only. So you see a few more rules here and a few more terminals that describe what it takes to be a class name, subroutine name and variable name and that's it. That's the definition of the program structure in Jack. Okay, what else do we have? Well, in Jack we have statements, right? And this is something that we actually dealt with that before in our examples but basically, we have statement which are zero or more occurrences of statement in singular. And statement is either a let, an if, a while, a do, or a return. And for every one of these statements we have a rule that describes it. Once again, all this stuff is lifted straight out of the book, and I simply laid out in front of you in separate slides. By now, after we went through all the previous examples, I think that this stuff should be relatively straightforward. All right, finally we have expressions. That's the last thing that the grammar describes. And the expressions are similar to what we discussed in our previous examples and expression is a term followed by an optional opt term class. And the thing that makes these expressions somewhat complex in Jack is the term rule and as you see, the term rule is a little bit involved match more than the simple grammar that we saw previously. In fact, the handling of expressions is somewhat tricky and therefore I would like to defferate to the end of unit. So, please bear with me we put expression on the side. And I'd like to sort of summarize what we did so far. And then I'll back to the handling of expressions. So to recap, we gave the entire Jack grammar. And we need this grammar because this is the recipe according to which we're going to write our parser. And the parser is going to be a set of compile routines, one for almost every non-terminal rule that you see in the grammar. We're going to skip some of these rules because they are very shallow, and I'll have more to say about it in the subsequent units. But essentially, we're going to have a rule for every non-terminal, I'm sorry, we're going to have a parsing routine for every non-terminal rule. The parsing routine will be written according to the right-hand side of the corresponding rule. It will write the required XML, and may call in the process yet another routine to do the same, and so on, and so forth. So here you see the sort of tight relationship, the close relationship between a grammar and a parser that operates according to this grammar. All right, now I promised that I will handle expression separately and that's what I'd like to do next before we wrap up this unit. So let's zoom into the handling of expressions and we see here the grammatical definition of expressions inject. And the thing that makes the handling of expressions somewhat challenging is the handling of the term rule. The rule that handles the term nonterminal rule. Now think about it, if the expression, or I'm sorry, if the current token is an integer constant, simple. If it's a string constant, also simple. Likewise, if it's a keyword constant, we can easily handle it. However, if the current expression, I'm sorry, if the current token Is a variable name then it turns out that according to the rule here we have several options and each one of them should be handled separately. And here are some examples of what may sort of transpire when you have a token which is a variable name. It may be just the variable name foo, right? Which is easy to handle. But it may also be the name of an array, right? Foo followed by left square bracket followed by an expression that evaluates to some index of this array. It may also be foo. and then comes a call to either a method or a function, and we have to decide what we have here, so the bottom line is that we have to resolve whether we have any one of these options here. And in order to make this resolution, we have to look ahead to the next token down the token stream. So as a rule, when you have an identifier you cannot know immediately what you have. You have to look ahead at the next one. And then if the next one is a left square record, you do one thing. If it's dot you do another and so on and so forth. This is the only case in the Jack language where the language becomes LL(2) and you have to work a little bit more delicately to write code that takes care of these various possibilities. In all other cases, the language is completely LL(1), and the writing of the parser is relatively straightforward. So with that in mind, this is the end of Unit 4.6, and in the next unit, we're going to talk about finally how to write the Jack Analyzer itself.