5 Syntax analysis
  ---------------

5.1 Aim and structure of the syntax analyser
    ----------------------------------------

The syntax analyser is the heart of Inform, and contains within it the
specification of the language.

Inform code fundamentally consists of directives, which are instructions
to the compiler to do something, and routines containing statements, which
are lines of code to be executed at run-time.

The syntax analyser takes as input the stream of tokens from the lexer.
Some of these are interpreted as directives and acted on immediately.
Others are realised to be statements and compiled; we can regard the
output of the compiling part of the analyser as being a stream of
Z-machine assembly language, passed down into "asm.c".

In most modern compilers, the syntax analyser would convert the token
stream into a "parse tree" expressing the structure of the input:
something along the lines of

                         routine
                        /       \
               statement         statement
                   |                 |
                 "if"          ...etc., etc.
                /    \
        condition    statement
            |            |
          "=="        "print"
         /    \          |
       "x"    "0"  "Good heavens!"

This is aesthetic, and makes possible many optimisations (such as
elimination of common sub-expressions), since the compiler is able to see
the whole structure before beginning to compile it. But it uses up a fair
amount of memory (admittedly, most compilers only keep the tree for the
current routine, not the tree for the entire program); and a lot of speed,
as the tree is trawled through again and again.

Characteristically, Inform instead uses an algorithm which is about 40
years old (see ASU, p.181 et seq). It doesn't generate a parse tree for
whole routines, though it does do so for expressions, assignments and
conditions: for example, it would have built the little tree:

          "=="
         /    \
       "x"    "0"

For higher-level parsing, it works "top-down", making recursive function
calls which mimic a depth-first traversal of the tree drawn above.
That is,

    routine() calls
        statement() which reads the token "if" and calls
            condition() which works out the expression "x == 0" and
                        compiles some code to perform this test and
                        then to skip the next instruction if it's false
            and calls
            statement() which reads the token "print" and then the
                        token "Good heavens!" and compiles some code
                        to print this
    and routine() next calls
        statement() which reads... etc., etc.

Although we are only able to go through the tree once, it's a quick and
efficient trip, effectively using the C run-time stack to hold the parse
tree structure. The result is also a rather legible piece of source code
(as compared, for example, with the output from "yacc").

(The main reason we can escape from having to make parse trees is that our
target machine, the Z-machine, has an architecture making register
allocation unnecessary: the variables are the registers, and there's no
space or time penalty for use of the stack.)

The syntax analyser begins with parse_program() in the section "syntax.c".
Because its structure is embodied in the source code, there is relatively
little to say about it except to refer the reader to that: and to the
table in section 1.1 above.

5.2 Predictive parsing
    ------------------

Note that tokens can be read from many different places, and code can be
passed out from many different places: whereas the lexer and the assembler
each have a single channel of input and output.

As a typical example of the syntax analyser working as a "predictive
parser" and having to "backtrack", consider how parse_action() begins to
parse through action statements like

    <Take fishknife>;

tokenised as

    <   symbol "Take"   symbol "fishknife"   >

When it's called, the first token, "<" has already been read (this is how
parse_statement() knew to call parse_action() in the first place). What it
does is:

    "Predict" that it's going to be a << ... >> statement
        Ask the lexer for a token, expecting it to be another "<"
        Discover that, unfortunately, it's a symbol, so the prediction
            was wrong
        Backtrack to the point where it made the wrong prediction (this
            actually only means giving the token "Take" back into the
            lexer again)
    Now predict the next possibility, that it's a < ... > statement
        Ask the lexer for a token, expecting it to be a symbol name
        Discover that this time it is ("Take"), so the prediction was
            correct
    ... and so on.

The code to do this is more like:

    get_next_token();
    if (token == "<") return parse_double_angle_action();
    put_token_back();
    return parse_single_angle_action();

(The actual code doesn't make these inefficient function calls, and token
comparison is a bit fiddlier, but this is the idea.)

Clearly, the longer such a prediction goes on before it is found to be
wrong, the more tokens which the lexer has to take back again. The syntax
of the language determines this number, N: for many languages N = 1
(because they've been designed that way) but for Inform N = 5 (because it
was defined by an ignorant oaf). (Though, as will appear in the next
section, it would be possible to reduce this by implementing the grammar
differently: and there are compensations, such as the relative conciseness
of Inform code.)

5.3 The context-free grammar
    ------------------------

Here are the productions which the top-down syntax analyser implements.
(Most of the major ones are handled by routines.) "void_expression",
"constant", "condition" and "quantity" are left undefined: these are
handled by the expression parser according to an operator-precedence
grammar. "vcondition" is a condition whose first token is a VARIABLENAME.

Symbols are represented in capitals; NEWNAME means one not so far assigned
a value, while CLASSNAME, etc., mean an existing symbol of the given type.
Obsolete features which are still supported are omitted.
Details of conditional compilation are abbreviated heavily: the notation
PROGRAM means "anything at all until the right directive keyword comes up
at the same #if... level".

=========================================================================
   program     ->  directive ; program

   directive   ->  [ NEWNAME routine
                   "Abbreviate" strings
                   "Array" NEWNAME arraytype array
                   "Attribute" NEWNAME
                   "Attribute" NEWNAME "alias" ATTRIBUTENAME
                   "Class" NEWNAME objbody
                   "Class" NEWNAME ( constant ) objbody
                   "Constant" NEWNAME
                   "Constant" NEWNAME constant
                   "Constant" NEWNAME = constant
                   "Default" NEWNAME constant
                   "End"
                   "Extend" extension
                   "Fake_action" NEWNAME
                   "Global" NEWNAME
                   "Global" NEWNAME = constant
                   "Ifdef" NAME condcomp
                   "Ifndef" NAME condcomp
                   "Iftrue" constant condcomp
                   "Iffalse" constant condcomp
                   "Ifv3" constant condcomp
                   "Ifv5" constant condcomp
                   "Import" manifest
                   "Include" STRING
                   "Link" STRING
                   "Lowstring" NEWNAME STRING
                   "Message" diagnostic
                   "Nearby" objhead objbody
                   "Object" arrows objhead objbody
                   "Property" NEWNAME propdef
                   "Release" quantity
                   "Replace" ROUTINENAME
                   "Serial" STRING
                   "Statusline" "score"
                   "Statusline" "time"
                   "Stub" NAME quantity
                   "Switches" TEXT            <---- An unquoted string
                   "System_file"
                   "Trace" trace
                   "Verb" verb
                   "Zcharacter" zchar
                   CLASSNAME arrows objhead objbody

   condcomp    ->  ; PROGRAM ; "Endif"
                   ; PROGRAM ; "Ifnot" ; PROGRAM ; "Endif"
   manifest    ->  import "," manifest
                   import
   import      ->  "global" NEWNAME
   diagnostic  ->  STRING
                   "error" STRING
                   "fatalerror" STRING
                   "warning" STRING
   propdef     ->  propdefault
                   "additive" propdefault
   propdefault ->  quantity
=========================================================================
   trace       ->  t_section t_level                       Tracing
   t_section   ->  "objects"
                   "symbols"
                   "verbs"
                   "assembly"
                   "expressions"
                   "lines"
   t_level     ->  "on"
                   "off"
                   quantity
=========================================================================
   zchar       ->  char_spec                               Character set
                   STRING STRING STRING
                   "table" char_specs
                   "table" "+" char_specs
   char_specs  ->  char_spec char_specs
                   char_spec
   char_spec   ->  LITERAL_CHARACTER
=========================================================================
   arrows      ->  "->" arrows                             Object definition
   objhead     ->  NEWNAME
                   STRING
                   NEWNAME STRING
                   NEWNAME OBJECTNAME
                   STRING OBJECTNAME
                   NEWNAME STRING OBJECTNAME
   objbody     ->  segment objbody
                   segment "," objbody
   segment     ->  "class" class_s
                   "with" with_s
                   "private" with_s
                   "has" has_s
   class_s     ->  CLASSNAME class_s
   with_s      ->  PROPERTYNAME property
                   PROPERTYNAME property "," with_s
   has_s       ->  attributes
   property    ->  [ routine
                   arrayvals
=========================================================================
   arraytype   ->  ->                                      Arrays
                   -->
                   "string"
                   "table"
   array       ->  constant
                   STRING
                   arrayvals
                   [ manyvals ]
   manyvals    ->  constant manyvals
                   constant ; manyvals
   arrayvals   ->  constant arrayvals
=========================================================================
   extension   ->  "only" strings priority grammar         Grammar
                   STRING priority grammar
   priority    ->  "replace"
                   "first"
                   "last"
   verb        ->  strings v_setting
                   "meta" strings v_setting
   v_setting   ->  = STRING
                   grammar
   grammar     ->  g_line grammar
   g_line      ->  * tokens -> g_action
   g_action    ->  ACTIONNAME
                   ACTIONNAME "reverse"
   tokens      ->  g_token tokens
   g_token     ->  preposition
                   "noun"
                   "held"
                   "multi"
                   "multiheld"
                   "multiexcept"
                   "multiinside"
                   "creature"
                   "special"
                   "number"
                   "topic"
                   "noun" = ROUTINENAME
                   "scope" = ROUTINENAME
                   ATTRIBUTENAME
                   ROUTINENAME
   preposition ->  DICTIONARYWORD
                   DICTIONARYWORD / preposition
   strings     ->  STRING strings
   attributes  ->  attribute attributes
   attribute   ->  ATTRIBUTENAME
                   ~ ATTRIBUTENAME
=========================================================================
   routine     ->  * locals ; body ]                    Everything below
                   locals ; body ]                      here is code to
   locals      ->  NAME locals                          compile
   body        ->  action_case : body
                   "default" : body
                   statement body
   action_case ->  ACTIONNAME
                   ACTIONNAME , action_case
   block       ->  ;
                   statement ;
                   { states }
                   { states . NEWNAME ; }
   states      ->  statement states;
   s_block     ->  { s_states }
                   { s_states . NEWNAME ; }
   s_states    ->  case : s_states
                   "default" : s_states
                   statement s_states
   case        ->  range :
                   range , case
   range       ->  constant
                   constant "to" constant
=========================================================================
   statement   ->  . NEWNAME ; statement
                   "@" assembly ;
                   "<" action ">" ;
                   "<<" action ">>" ;
                   indiv_state
                   STRING
                   void_expression
   action      ->  ACTIONNAME
                   ACTIONNAME quantity
                   ACTIONNAME quantity quantity
   assembly    ->  opcode operands options
   opcode      ->  OPCODENAME
                   STRING                     <--- customised opcode
   operands    ->  operand operands                descriptions are
   operand     ->  constant                        not parsed by the
                   VARIABLENAME                    syntax analyser
                   sp
   options     ->  ? branch
                   -> store
                   -> store ? branch
   store       ->  VARIABLENAME
                   sp
   branch      ->  LABELNAME
                   ~ LABELNAME
=========================================================================
   indiv_state ->  "box" strings ;
                   "break" ;
                   "continue" ;
                   "do" eblock "until" ( condition ) ;
                   "font" "on" ;
                   "font" "off" ;
                   "for" ( for1 : for2 : for3 ) block ;
                   "give" quantity attributes ;
                   "if" ( condition ) block
                   "if" ( condition ) block "else" block
                   "inversion" ;
                   "jump" LABELNAME ;
                   "move" quantity "to" quantity ;
                   "new_line" ;
                   "objectloop" ( ospec ) block ;
                   "print" printlist ;
                   "print_ret" printlist ;
                   "quit" ;
                   "read" quantity quantity ;
                   "read" quantity quantity ROUTINENAME ;
                   "remove" quantity ;
                   "restore" LABELNAME ;
                   "return" ;
                   "return" quantity ;
                   "rtrue" ;
                   "rfalse" ;
                   "save" LABELNAME ;
                   "spaces" quantity ;
                   "string" quantity STRING ;
                   "style" textstyle ;
                   "switch" ( quantity ) s_block
                   "while" ( condition ) block
   for1        ->  void_expression
   for2        ->  condition
   for3        ->  void_expression
   ospec       ->  VARIABLENAME
                   VARIABLENAME "in" quantity
                   VARIABLENAME "near" quantity
                   VARIABLENAME "from" quantity
                   vcondition
   printlist   ->  printitem , printlist
                   printitem
   printitem   ->  STRING
                   quantity
                   ( form ) quantity
   form        ->  ROUTINENAME
                   "number"
                   "char"
                   "address"
                   "string"
                   "The"
                   "the"
                   "a"
                   "an"
                   "name"
                   "object"
                   "identifier"
   textstyle   ->  "roman"
                   "reverse"
                   "bold"
                   "underline"
                   "fixed"
=========================================================================

A few critical comments on the above. From this grammar it's possible to
work out N, the maximum look-ahead needed to distinguish which production
is being used in the source code. The worst cases are:

  (a) distinguishing productions (1) and (2) in

         ospec   ->  VARIABLENAME
                     VARIABLENAME "in" quantity            (1)
                     VARIABLENAME "near" quantity
                     VARIABLENAME "from" quantity
                     vcondition                            (2)

      which requires N = 5; these two productions need to be
      distinguished between because

         objectloop ( a in b )
         objectloop ( a in b ... )

      (the second containing a compound condition) compile quite
      different code: one loops through the siblings in the object tree,
      the other through the tree in numerical order.

  (b) distinguishing productions (1) and (2) in

         printitem ->  STRING
                       quantity                            (1)
                       ( form ) quantity                   (2)

      i.e., between

         print (routine-name) expression
         print (expression which happens to begin with a bracket)

      which requires N = 3.

The grammar contains one serious ambiguity: the innocent-looking
production

    arrayvals  ->  constant arrayvals

means that array initialisations list constants in sequence without commas
or other separating characters. This makes it impossible to distinguish
between unary and binary minus in a line like:

    Array X 2 - 1 ;

The answer is "binary", since the grammar makes the longest match
possible; but a "this is ambiguous" warning is issued.

A further inconvenience in the grammar, though not much of an ambiguity,
occurs in the initialisation part of "for" statements: there is a danger
of

    for (p=Class::)

being mis-parsed due to "::" being recognised as a binary operator
(without a right operand, which would cause an error) and not as two
consecutive ":" delimiters.
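The put-back mechanism behind the look-ahead count N can be made concrete
with a small sketch. The following C fragment is a toy, not Inform's
lexer: the token stream is a plain array of strings, and lexer_begin(),
put_tokens_back() and max_lookahead are hypothetical names introduced only
for illustration. It follows the parse_action() outline of section 5.2 and
records how many tokens were handed back.

```c
#include <assert.h>
#include <string.h>

/* Toy token stream (an assumption: real Inform tokens are not strings). */
static const char **tok_stream;   /* the source tokens                   */
static int tok_count, tok_pos;    /* stream length and read position     */
static int max_lookahead;         /* most tokens ever given back at once */

static void lexer_begin(const char **tokens, int count)
{   tok_stream = tokens; tok_count = count;
    tok_pos = 0; max_lookahead = 0;
}

/* Return the next token, or "" once the stream is exhausted. */
static const char *get_next_token(void)
{   return (tok_pos < tok_count) ? tok_stream[tok_pos++] : "";
}

/* Give the last n tokens back to the lexer, noting the depth. */
static void put_tokens_back(int n)
{   assert(n <= tok_pos);
    tok_pos -= n;
    if (n > max_lookahead) max_lookahead = n;
}

/* Called once the first "<" has been read.
   Returns 2 for a << ... >> statement, 1 for a < ... > statement. */
static int parse_action(void)
{   const char *t = get_next_token();    /* predict the << ... >> form */
    if (strcmp(t, "<") == 0) return 2;   /* the prediction held        */
    put_tokens_back(1);                  /* wrong: backtrack one token */
    return 1;                            /* predict the < ... > form   */
}
```

Here a wrong prediction costs a single token. Under the same scheme, the
objectloop case (a) would have to hand back up to five tokens before
trying its second production.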
If I were free to redesign the Inform grammar in the light of the last
three years' experience (which I am loath to do, since so much Inform
source code now exists), here are the changes I think I would make:

    introduce commas as separators between array values and <>
        parameters;
    remove the statements quit, restore, save, style, font, spaces, box,
        inversion and read: the first three ought to be used via the
        assembler anyway, and the last six ought to be system-provided
        functions;
    use single quotes to refer to dictionary words used as values of
        "name", thus removing an anomalous rule going back to Inform 1,
        and to refer to dictionary words in grammar;
    find some notation for literal characters which does not look like
        the notation for dictionary words: e.g., ''z'' rather than 'z';
    abolish the distinction between actions and fake actions;
    rename the Global directive "Variable";
    require an = sign to be used in "Constant X 24" directives.

5.4 Assigning values to symbols
    ---------------------------

Assigning values to symbols is the main way by which the syntax analyser
changes the way it will behave further on in the program. From a strict
theoretical point of view one could insist the symbols table contains the
only information remembered by the syntax analyser about the program so
far, which otherwise churns out code which it forgets as soon as it has
written: however, Inform's analyser also keeps a number of other data
structures, such as the game dictionary and the object tree.

When the lexer creates a new symbol (that is, when the table is searched
for a string which isn't there, so that it is added to the table as a new
entry), it has:

    value     256
    type      CONSTANT_T
    flags     only UNKNOWN_SFLAG
    line      the current source line

"Unknown" means that the syntax analyser doesn't recognise it as meaning
anything yet. The flag will only be cleared when the syntax analyser
assigns some value to the symbol (using the assign_symbol() routine), if
ever.
The line is reset to the source line then applying if the symbol is
assigned a value, but is otherwise never changed.

The type and value of a symbol are only altered via assign_symbol(). With
one exception (see CHANGE_SFLAG below), the value once assigned is never
subsequently altered by reassignment. Each symbol always has one of the
following 11 types:

  Type                   Meaning                   Value
  ------------------------------------------------------------------------
  CONSTANT_T             Defined constant or       Value of constant
                         value not known yet
  LABEL_T                Name of label in source   Label number
  GLOBAL_VARIABLE_T      Name of global variable   Z-machine variable number
  ARRAY_T                Name of array             Z-machine byte offset in
                                                   dynamic variables area
  ROUTINE_T              Name of routine           Scaled offset in Z-code
  ATTRIBUTE_T            Name of attribute         Z-machine attribute number
  PROPERTY_T             Name of common property   Z-machine property number
                                                   between 1 and 63
  INDIVIDUAL_PROPERTY_T  Name of indiv property    Property identifier >= 64
  OBJECT_T               Name of object            Z-machine object number - 1
  CLASS_T                Name of class             Z-machine object number - 1
                                                   of its class-object
  FAKE_ACTION_T          Name of fake action       Action number >= 256
  ------------------------------------------------------------------------

The full list of symbol flags, and their meanings, is as follows:

  ------------------------------------------------------------------------
  UNKNOWN_SFLAG      no value has been assigned to this symbol
  USED_SFLAG         the value of this has been used in an expression
  REPLACE_SFLAG      the programmer has asked to Replace a routine with
                     this name
  DEFCON_SFLAG       this constant was defined by the "Default" directive
  STUB_SFLAG         this routine was defined by the "Stub" directive
                     (similarly)
  INSF_SFLAG         this symbol was originally assigned a value in a
                     system_file
  SYSTEM_SFLAG       this symbol was assigned a value by Inform itself,
                     as one of the standard stock (such as "true" or
                     "recreate") provided to all programs
  UERROR_SFLAG       a "No such constant as this" error has already been
                     issued for this name
  ALIASED_SFLAG      this is an attribute or property name whose value
                     is shared with another attribute or property name
  CHANGE_SFLAG       this is a defined Constant whose value needs later
                     backpatching, or is a Label whose label number has
                     been allocated before its declaration in the code
  ACTION_SFLAG       this is a ## name (of an action or a fake action)
  REDEFINABLE_SFLAG  the symbol can be defined more than once using the
                     "Constant" directive, without errors being produced.
                     (Used for the special constants DEBUG, USE_MODULES,
                     MODULE_MODE and Grammar__Version.)
  ------------------------------------------------------------------------
  IMPORT_SFLAG       this name is "imported" (used in module compilation)
  EXPORT_SFLAG       this name was "exported" from some module (used in
                     story file compilation)
  ------------------------------------------------------------------------

USED_SFLAG is used only to decide whether or not to issue "declared but
not used" warnings. A symbol has been "used" if one of the following is
true:

    (i)    its value occurred in an expression;
    (ii)   it has been assigned, and tested with IFDEF or IFNDEF;
    (iii)  it's an action routine name used in grammar or via ## (e.g.
           use of ##Take causes TakeSub to be "used");
    (iv)   it's a fake action name used via ##;
    (v)    it's a property, attribute or class name used in an object or
           class definition;
    (vi)   it's a label name branched to in assembly language or the
           destination of a "jump" statement;
    (vii)  it's the routine name "Main";
    (viii) it's a routine name referred to by the obsolete #r$ construct;
    (ix)   it's a routine name of a veneer routine whose definition is
           being over-ridden in the source code, and to which a call has
           been compiled;
    (x)    it's a routine name used in a grammar token;
    (xi)   it's referred to in a module which has been linked into the
           story file being compiled.
Note that such warnings are not issued for object names, since in much
Inform 5 code objects are given irrelevant object symbol-names (the syntax
required it); nor for symbols defined by Inform, or in the veneer, or in a
system file. Warnings are never issued (except for labels and local
variables, which have local scope) when compiling modules, since there is
no way of knowing at compile time which exported symbols will be used and
which will not.

The warnings are issued at the end of the source code pass, but before the
veneer is compiled. The slines[] array is used to work out which source
lines to report from.

CHANGE_SFLAG is used when definitions such as:

    Constant frog_word = 'frog';

are reached, where the value (the dictionary address for 'frog') cannot
immediately be known. Such symbol values are backpatched later as needed.

All symbols in the table have "global scope" (that is, their definitions
are valid everywhere in the source program) except for label names, whose
scope is restricted to the current routine. Thus, the same label name can
be used in many different routines, referring to a different label in
each.

To achieve this, the routine-end routine in "asm.c" cancels the assignment
of label names, returning them to type CONSTANT_T and flag UNKNOWN_SFLAG.

Local variables also have local scope, but for efficiency reasons these
are not stored in the symbols table but in a special hash table in level 3
of the lexer.

Note that action names have a different "name-space" from other symbols:
this is why the library's action name "Score" is never confused with its
variable "score". Rather than use a different symbols table altogether,
actions are stored as integer constants in the form

    Score__A

(thus the value of this symbol is the value ##Score). Similarly, fake
actions are stored this way (but with type FAKE_ACTION_T rather than
CONSTANT_T); both forms of symbol are flagged ACTION_SFLAG.
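The life cycle of a symbol described in this section (created as
"unknown", assigned once, and, for labels only, cancelled at the end of
each routine) can be sketched in a few lines of C. This is a minimal
illustration and not Inform's symbols table: the struct layout, the flag
bit value, the linear search and the function names symbol_index(),
assign_symbol_sketch() and cancel_labels_sketch() are all assumptions
made for the example. Only the initial field values and the reset to
CONSTANT_T and UNKNOWN_SFLAG come from the text above.

```c
#include <assert.h>
#include <string.h>

enum { CONSTANT_T, LABEL_T, ROUTINE_T };  /* a subset of the 11 types   */
#define UNKNOWN_SFLAG 0x01                /* bit value is an assumption */
#define MAX_SYMBOLS   64

typedef struct
{   const char *name;
    int value, type, flags, line;
} symbol;

static symbol symbols[MAX_SYMBOLS];
static int no_symbols = 0;

/* Look a name up; if absent, add it as a fresh "unknown" symbol.
   (A linear search stands in for a real hash table.) */
static int symbol_index(const char *name, int current_line)
{   int i;
    for (i = 0; i < no_symbols; i++)
        if (strcmp(symbols[i].name, name) == 0) return i;
    assert(no_symbols < MAX_SYMBOLS);
    symbols[i].name  = name;
    symbols[i].value = 256;
    symbols[i].type  = CONSTANT_T;
    symbols[i].flags = UNKNOWN_SFLAG;
    symbols[i].line  = current_line;
    return no_symbols++;
}

/* Give a symbol its value and type, clearing "unknown" and
   resetting its line to the line of the assignment. */
static void assign_symbol_sketch(int i, int value, int type, int line)
{   symbols[i].value  = value;
    symbols[i].type   = type;
    symbols[i].flags &= ~UNKNOWN_SFLAG;
    symbols[i].line   = line;
}

/* At the end of a routine, cancel every label assignment so that
   the same label names can be reused in the next routine. */
static void cancel_labels_sketch(void)
{   int i;
    for (i = 0; i < no_symbols; i++)
        if (symbols[i].type == LABEL_T)
        {   symbols[i].type  = CONSTANT_T;
            symbols[i].flags = UNKNOWN_SFLAG;
        }
}
```

Because a cancelled label keeps its table entry, a later routine that
mentions the same name finds the existing symbol again rather than
creating a second one.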
5.5 Outputs other than assembly language
    ------------------------------------

Below the parse_routine() part of the syntax analyser, the only output is
assembly language (together with dictionary words and encoded text as
needed within it).

However, parse_directive() has several other outputs, as shown on the
source code map (section 1.2 above). Directives create objects to be
remembered and written into the final program: arrays, verb definitions,
actions, objects and so on. These will be called "program elements".

Directives also "create" constants, attributes, properties and fake
actions, but in these cases creation only consists of assigning a suitable
value to a symbol. So these do not count as program elements.

The data structures to hold "program elements" are all created and
maintained within "directs.c" and its subsidiaries (such as "objects.c",
the object-maker); they are then translated into a usable Z-machine form
in "tables.c". (With only trivial exceptions, the data structures are not
accessed anywhere else in Inform.)

  ---------------------------------------------------------------
  Program element       Section      Name of (main) data structure
  ---------------------------------------------------------------
  Global variable       arrays.c     global_initial_value[]
  Array                 arrays.c     dynamic_array_area[]
  Release number        directs.c    release_number
  Serial code           directs.c    serial_code_buffer[]
  Statusline flag       directs.c    statusline_flag
  Object tree           objects.c    objects[]
  Common property       objects.c    prop_default_value[]
    default value
  Class-to-object-      objects.c    class_object_numbers[]
    numbers table
  Common property       objects.c    *properties_table
    values for objects
  Individual prop       objects.c    *individuals_table
    values for objects
  Table of static       symbols.c    individual_name_strings[]
    string values for
    property names
  And attribute names   symbols.c    attribute_name_strings[]
  And action names      symbols.c    action_name_strings[]
  Abbreviation          text.c       abbreviations_at[]
    table entry
  Grammar table         verbs.c      Inform_verbs[]
  Token-routine         verbs.c      grammar_token_routine[]
    addresses
  List of dict          verbs.c      adjectives[]
    addresses for
    "adjective" words
  Action routine        verbs.c      action_byte_offset[]
    addresses
  ---------------------------------------------------------------

A rather unequal gathering: the storage required to hold the common
property values table may be as much as 32K, whereas the statusline flag
can be held in 1 bit.

There are three other program elements: Z-code, kept by "asm.c"; the
static strings of Z-encoded text, kept by "text.c"; and the dictionary,
also kept by "text.c".

5.6 Assembly operands
    -----------------

The type "assembly_operand" is used for all numerical values (and local or
global variable references) which are destined one day to be written into
the Z-machine. Many of these will indeed be operands in Z-code
instructions, hence the name. Others will be property values, array
entries and so on.

The definition is:

    {   int   type;
        int32 value;
        int   marker;
    }

which is a pretty generous memory allocation for holding a signed 16-bit
number.
(However, there are no large arrays of this type; type and marker could
easily each have type char, but I suspect this would be slower because of
alignment problems on some compilers; and speed does matter here.)

The possible types are:

    SHORT_CONSTANT_OT   number with value between 0 and 255
    LONG_CONSTANT_OT    number with any 16 bit value
    VARIABLE_OT         reference to the stack pointer if value is 0
                                     local variable if value 1 to 15
                                     global variable if value 16 to 255

In addition, two special types are in use by the expression parser:

    OMITTED_OT          (the operand holds no information)
    EXPRESSION_OT       reference to a parse tree

The "marker" value is used to record the origin of the data, which is
essential to make backpatching work. For example, in the line

    v = "Jackdaws love my big sphinx of quartz";

the right operand is marked with STRING_MV, because the value which needs
to be stored in the Z-machine is a packed string address and this cannot
be known until after the compilation pass (when the size of the tables and
the code area are known). Wherever the operand is written, in code or in a
table, "bpatch.c" will later find and correct it.

5.7 Translation to assembly language
    --------------------------------

The Z-machine has a very easy architecture to generate code for. Most
compilers' syntax analysers generate a "three-address code" as an
intermediate stage towards final object code; this is a sequence of
instructions in the form

    x = y <operator> z

together with conditional branches and labelled points to branch to. (See
ASU, p.466.) Translating this code into assembly language for some CPU is
a further compilation phase: the tricky part is not translating the
operators into instructions, but deciding where to locate the values x, y
and z. On most CPUs, a limited number of registers can hold values, and
arithmetic operations can only be performed on these registers; moreover,
holding data in a register is much more efficient than holding it
elsewhere.
The problem is therefore to allocate registers to quantities, and the
performance of the compiled code depends very heavily on how well this is
done. (Register allocation is far from being a solved problem in the way
that lexical analysis, say, is considered to be.)

What makes the Z-machine particularly easy is that its instruction set is
more or less three-address code already. Arithmetic can be performed with
constants as operands as well as with "registers"; and not only is every
local or global variable automatically allocated to a register of its own,
but a stack is available to hold intermediate values (and with no
performance penalty for its use, since it is accessible as though it were
a register).

The key point to remember when looking at Z-code is that writing a value
to the "stack-pointer variable" pushes it onto the Z-machine stack; using
the "stack-pointer variable" as the operand for an instruction pulls the
value being read off the stack.

Despite the availability of the stack, it's still convenient to have a few
spare variables to use as "registers" holding temporary values. Inform
reserves the variables

    249, 250, 251, ..., 255

for its own use: though this slightly exaggerates the position, since two
of these are used for the variables "self" and "sender". One more is used
to hold the switch-value of the current switch statement (if there is
one); the remaining four in order to implement various system functions
(such as "children") with inline-code rather than routine calls to the
veneer.

(The one inconvenience with the Z-machine's instruction set is that
there's no good way to read the top of the stack non-destructively, or to
duplicate the top value, or to reorder the top few values.)

The syntax analyser produces code by making function calls to the
assembler. There are four types of function call:

    assemble_routine_header(...)
    assemble_routine_end(...)

at the start and end of every routine;

    assemble_label_no(N)

to indicate that the Nth label belongs here (i.e., at the point where the
next instruction will be put); and then a fair variety of function calls
to generate actual instructions. For example,

    assemble_jump(N)

assembles an unconditional jump to label N. A typical "three-address code"
is assembled by a routine like

    assemble_2_to(mul_zc, AO1, AO2, stack_pointer)

meaning "the instruction mul_zc, which has 2 operands and has a result
which it writes to the variable indicated by the third operand". AO1 and
AO2 are assembly_operands (the abbreviation is often used in the source),
and so is stack_pointer (it has type VARIABLE_OT and value 0).

A typical example of how the top-down syntax analyser generates code is
given by the code for the "while" statement:

    case WHILE_CODE:
        assemble_label_no(ln = next_label++);
        match_open_bracket();
        code_generate(parse_expression(CONDITION_CONTEXT),
            CONDITION_CONTEXT, ln2 = next_label++);
        match_close_bracket();
        parse_code_block(ln2, ln, 0);
        assemble_jump(ln);
        assemble_label_no(ln2);
        return;

Note that this expects to match

    while ( condition ) statement or { statements }

The expression parser is called to turn the condition into a parse tree,
and its output is fed straight into the code generator for parse trees.
The label numbers ln2 and ln are supplied to the routine for parsing code
blocks because they indicate which labels the statements "break" and
"continue" should generate jumps to.
For example,

    while (i <= 10) print i++;

generates the assembly language

    .L0;
    @jg i 10 ?L1;
    @inc i;
    @print_num i;
    @jump L0;
    .L1;

In terms of function calls to "asm.c":

    assemble_label_no(0);
    assemble_2_branch(jg_zc, AO_for_i, AO_for_10, 1, TRUE);
    assemble_inc(AO_for_i);
    assemble_1(print_num_zc, AO_for_i);
    assemble_jump(0);
    assemble_label_no(1);

(Note that incrementing is needed so often that assemble_inc() is provided
as a shorthand: actually also because of a form of indirect addressing
used by the Z-machine for store_zc and inc_zc, dec_zc. assemble_store()
and assemble_dec() are also provided.)

5.8 Summary of assembly language instructions output
    ------------------------------------------------

The list of Z-machine instructions which Inform actually generates code
for by itself is quite short (about half of the 115 possible instructions
are ever used). The assembly-language "@ statement" parser can, of course,
generate every Z-machine instruction.

The instructions used for three-address-style code (that is, for program
control, arithmetic, etc.) are as follows. For function calls and returns,
unconditional jumps and for moving values between variables, memory
locations and the stack:

    call_*    rfalse    rtrue     ret_popped    ret
    inc       dec       store     push          pull
    storeb    storew    loadb     loadw
    jump      set_attr

(There are various forms of call instruction to do with the number of
operands, whether a return value is wanted or not, etc.). Seven
conditional branch instructions are used:

    je    jz    jl    jg    jin    test_attr    check_no_args

(the first four are numerical branches, the next two related to the object
tree, and the last one is used only in V5 or later games, and then only
for printing out tracing code): note that each can also be used in a
"negate this condition" form. Finally, signed 16-bit arithmetic and
unsigned 16-bit bitwise operations:

    add    sub    mul    div    mod    and    or    not

A further batch of instructions is used, so to speak, in lieu of calls to
a standard library of input/output and utility routines (like C's
"stdio"):

    print         print_ret     print_char    print_num
    print_addr    print_paddr   print_obj     new_line
    insert_obj    remove_obj
    get_parent    get_child     get_sibling
    get_prop      get_prop_addr get_prop_len  put_prop
    random        aread/sread
    quit          save          restore
    output_stream

with five more exotic screen control instructions used only if a "box"
statement is ever compiled:

    split_window    set_window    style    set_cursor    buffer_mode
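The phrase "signed 16-bit arithmetic" in the list above has a practical
consequence: results wrap around at the 16-bit boundary. The C fragment
below is a sketch of that value behaviour only. It is not taken from
Inform or from any interpreter, and the helper names z_signed(), z_add()
and so on are hypothetical. It assumes the host int is wider than 16
bits, and leaves out div and mod.

```c
#include <assert.h>   /* only needed for checks a caller might add */

/* Reduce any host value to a signed 16-bit quantity (-32768..32767). */
static int z_signed(unsigned v)
{   v &= 0xFFFFu;
    return (v & 0x8000u) ? (int) v - 0x10000 : (int) v;
}

/* Unsigned host arithmetic is used so that overflow is well defined
   in C; the low 16 bits of the result are then re-read as signed. */
static int z_add(int a, int b) { return z_signed((unsigned) a + (unsigned) b); }
static int z_sub(int a, int b) { return z_signed((unsigned) a - (unsigned) b); }
static int z_mul(int a, int b) { return z_signed((unsigned) a * (unsigned) b); }

/* Bitwise operations act on the unsigned 16-bit pattern. */
static int z_and(int a, int b) { return z_signed((unsigned) a & (unsigned) b); }
static int z_or (int a, int b) { return z_signed((unsigned) a | (unsigned) b); }
static int z_not(int a)        { return z_signed(~(unsigned) a); }
```

With these helpers, z_add(32767, 1) wraps to -32768 and z_not(0) gives
-1, the all-ones bit pattern.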