Skip to main content
留学咨询

辅导案例-COMP4403/COMP7402-Assignment 2

By May 15, 2020No Comments

Last updated: Thu 30 Apr 2020 11:29:17 AEST. COMP4403/COMP7402 – Compilers and Interpreters Assignment 2 Due date: 3pm (15:00) Friday 15th May, 2020 Weighting: 25% for both COMP4403 and COMP7402 Modify the LALR assignment 2 compiler for the PL0 language (provided on the course web page with this assignment) to add array and enumeration types, operations on arrays and enumerations, and a for statement. Assignment Compiler Files All sources for the assignment PL0 compiler are available as a2.zip (below). Please be sure to use the version for this assignment and not the one used for the tutorials or another assignment. There are differences (like the lexical tokens you need for this assignment are only defined in the assignment version). a2.zipThu 30 Apr 2020 11:28:08 AEST Save this .zip file and follow the instructions for setting up a compiler project in IntelliJ Setting up and running PL0 in IntelliJThu 30 Apr 2020 11:29:17 AEST Brief documentation on assignment 2 filesThu 30 Apr 2020 11:29:17 AEST Setting up Java-CUP and JFlex under EclipseThu 30 Apr 2020 11:02:22 AEST Here is the documentation for Java CUP [HTML] JFlex [HTML] For the most part you will not need these. https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:1/15 Please ensure you follow the course Piazza bulletin board for any updates and further information on the assignment. Read all the fine print below in detail before you start! And, most important, when you have finished implementing the assignment, come back and re-read the fine print again. Do not use imports for external packages other than those in java.util.*. Note that IntelliJ may offer the option of importing an external package to resolve an issue; please avoid accepting this option because it will often add an erroneous import that you will not need. Such imports lead to the compilation failing in the environment in which your compiler will be assessed because that environment does not include the external libraries. Please check you are not importing external libraries before submitting. You must only modify the files that must be submitted (see below). You must not modify any other files because we will be testing your implementation using the existing other files with your submitted files. Please do not reformat the files because we would like to just print the differences between the originals and the versions you hand in. Please keep the length of lines in your files below 100 characters, so that we can print them sensibly. Please avoid using non-standard characters, e.g. Chinese characters, including in the comments. Non-standard characters are not accepted by the Java compiler used to test your assignment and all comments should be readable by the person assessing your assignment. Your implementation should be in Java Project language level 8. Set the IntelliJ preferences for the Java project language level 8 under Project structure then Project (or use the “-source 1.8” option to the command line Java compiler). Please remove any debugging output before your assignment is submitted because debugging output will cause your program to fail our automated testing of your assignment. Either avoid using tabs or set your tabs stops to 4 spaces (this is the default for IntelliJ/Eclipse) so that your files will print sensibly. Overview Enumerated types https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:2/15 An enumerated type may be declared in a type definition. Subranges of enumerated types may form types. type Day = {Mon,Tue,Wed,Thu,Fri,Sat,Sun}; WeekDay = [Mon..Fri]; Day is an enumerated type with seven elements, Mon, Tue, Wed, Thu, Fri, Sat and Sun, each of which is a constant of type Day. WeekDay is a subrange with five elements, Mon through to Fri. One may declare a variable to be of an enumerated type: var d: Day; w: WeekDay; These may be assigned values of the enumerated type. d := Mon; w := d; Values of an enumerated type may be compared using all the usual relational operators: the elements are in increasing order of their position in the definition of the (base) enumerated type, e.g., Mon < Tue < Wed < Thu < Fri < Sat < Sun. The unary operators pred and succ can be used to give the predecessor and successor values, respectively, in the enumeration, e.g. succ Mon = Tue = pred Wed. Note that predecessor and successor wrap around so that succ Sun = Mon and pred Mon = Sun and hence the following code calls process for all values of d from Mon to Sun, inclusive, and leaves the final value of d as Mon (but see the for statement below for a simpler version of this code). d := Mon; while d < Sun do begin call process(); d := succ d end; call process(); d := succ d https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:3/15 Array types An array type may be declared in a type definition. The index type for an array may be either a subrange or an enumerated type. const N = 10; type S = [1..N]; V = array S of int; V2 = array [-N..N] of int; M = array S of V; Bookings = array Day of int; V is an array type with ten elements each of type integer and with indices 1 to 10. V2 is declared using an explicit subrange, which happens to range over both positive and negative values; it has 2N+1 elements. M is an array type with ten elements each of type V, with indices 1 to 10. Each element of M is itself an array of 10 integers, i.e., M contains a total of one hundred integers. The type Bookings defines an array with seven elements indexed by elements of type Day. One may declare a variable to be of an array type: var vec : V; mat: M; reserved: Bookings; Elements of arrays may be assigned appropriate values. vec[1] := 2; mat[10][1] := 42; mat[1] := vec; // array assignment reserved[Mon] := 3; The values of the elements may be accessed, for example, the following extracts the diagonal elements of mat and places them in vec. i := 1; while i <= N do https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:4/15 begin vec[i] := mat[i][i]; i := i + 1 end For statement You are also required to add a for statement to PL0. The above while loop can be rewritten using a for statement as follows. for i : 1..N do vec[i] := mat[i][i] od And the earlier while loop over days can be rewritten as follows. for d : Mon.. Sun do call process() od The for loop is not quite the same as the while because the control variable of the for loop is considered local to the loop (see static semantics below). Syntax Changes The following lexical tokens have been added to the lexical analyser, PL0.flex. Keywords are case sensitive. LCURLY → "{" RCURLY → "}" KW_ARRAY → "array" KW_FOR → "for" KW_OD → "od" KW_OF → "of" KW_PRED → "pred" KW_SUCC → "succ" https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:5/15 The syntax is given here in extended BNF. You need to transform it to BNF for use in Java- CUP. The syntax for type definitions (Type) is extended with alternatives for array types and enumeration types. Be careful to distinguish the use of braces as part of the EBNF notation (unquoted) from their use as lexical tokens (quoted). Type → "array" Type "of" Type Type → "{" EnumerationList "}" EnumerationList → IDENTIFIER { "," IDENTIFIER } A reference to an element of an array can be used as an LValue either within an expression or on the left side of an assignment. Hence we add a new alternative for an LValue. LValue → LValue "[" Condition "]" Unary operators are extended to include predecessor and successor. Unar yOperator → KW_PRED UnaryOperator → KW_SUCC Statement has an additional alternative for a for statement. Statement → "for" IDENTIFIER ":" Condition ".." Condition "do" StatementList "od" You need to add these productions and their associated actions to build the symbol table entries and abstract syntax trees to PL0.cup. The Parser Generator Java-CUP The parser specification for the compiler is specified in the file PL0.cup. You will need to add productions (and their associated actions) to the specification and then run the parser generator Java-CUP (manually or automatically) to generate the files CUPParser.java and CUPToken.java. Do not modify these two Java files directly (even if you think you understand them (do you really?)) - remake them from PL0.cup. You should make the compiler before you change anything just to see what forms of messages to expect. When you make the compiler (before you modify it) there will be some warning messages about the terminal symbols like ILLEGAL being declared but never used; these are to be expected at this stage. Any new warning/error messages will be significant. Beware that if there are Java-CUP errors reported, the Java files for the parser will not be https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:6/15 generated, so check for Java-CUP errors first. There is HTML documentation for Java- CUP available from the class web page (with the assignment). The Scanner Generator JFlex All the lexical tokens for this version of the compiler have already been added to the lexical analyser. The file Lexer.java is automatically generated by the scanner generator JFlex from PL0.flex; again, do not modify Lexer.java - remake Lexer.java from PL0.flex. Both Java-CUP and JFlex are available with the assignment files on the course web page, with instructions on how to run them in IntelliJ. Java archives for Java-CUP and JFlex are part of the IntelliJ project for the assignment. Static Semantic Restrictions Enumerated types Enumerated types are scalar types. For an enumerated type the elements are treated as constants of the enumerated type within the scope in which the type is defined. The elements cannot have names corresponding to other identifiers declared within the same scope. Obviously, an enumerated type cannot have multiple elements with the same name and two enumerated types in the same scope must have disjoint sets of elements. The elements can be treated as Const entries in the symbol table. In the static semantics, an enumeration consisting of a list of identifiers, ids, is treated as the type enumeration(ids). Each enumeration type is considered a separate primitive scalar type that is incompatible with all other types, except its subranges. Note that Type.java has already been extended with enumerated types, so your main task is to handle the parsing of enumerated type definitions to set up an EnumerationType. Expressions of enumeration types may be compared with other elements of the same enumeration (or a subrange of it) and assigned to variables of the enumeration type or a subrange of it, provided the value is in the subrange. Note that in the compiler the types of operators (like "_=_") are defined within the symbol table. You will need to extend the https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:7/15 types of comparison operators to handle comparison of values of an enumerated type. Hint: adding operator entries is similar to the adding the predefined operators in the class Predefined (except they do not get added to the predefined scope). The compiler supports overloaded operators. The pred and succ unary operators may be applied to an expression of an enumeration type and yield a value of the same type. The widening and narrowing rules for subranges apply to subranges of enumeration types in a manner similar to subranges of integers. Array types In an array type declaration, array S of T, the index type, S, must be a subrange type or an enumeration and the element type, T, can be any type including another array type, but cycles in type definitions are not permitted, e.g. the following is not valid as the definitions of types C and D form a cycle. C = array S of D; D = array S of C; An array with index type S and element type T is treated as the type array(S,T). In Type.java a class ArrayType has already been added to represent array types within the compiler. syms ⊢ typeof(S) = subrange(S1,v0,v1) syms ⊢ typeof(T) = T1 syms ⊢ typeof(array S of T) = array(subrange(S1,v0,v1),T1) and syms ⊢ typeof(E) = enumeration(ids) syms ⊢ typeof(T) = T1 syms ⊢ typeof(array E of T) = array(enumeration(ids),T1) For a subscripted array reference, e1[e2], e1 must be an L-Value and have a type that is a reference to an array type, and the type expression e2 used as the array index must be compatible with the type of the index of the array. The type of a subscripted array, e1[e2] is a reference to the element type of the array. syms ⊢ e2 : T1 https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:8/15 syms ⊢ e1 : ref(array(T1,T2)) syms ⊢ e1[e2] : ref(T2) Note that the type of e1[e2] is ref(T2) rather than T2 so that the subscripted array element can be used as an L-value, e.g., it can be used on the left side of an assignment. Assignment of whole arrays is allowed but other operations on whole arrays (e.g., comparison, etc.) are not supported. Note that, because the index type is subrange type or an enumeration, the bounds of the array are fixed (constant) at compile time. For statement For a for statement with lower and upper bound expressions, lower and upper, there must be some scalar type T such that the types of the two expressions are (coercible to) type T (e.g. by dereferencing or widening - see optDereference and optWidenSubrange in Type.java). The control variable, id, is treated as a new variable (of type ref(T)) local to the for statement. The name of the control variable, id, may be the same as an existing (local or control) variable, but within the loop any references to id are to the control variable, rather than the existing variable (unless there is a nested for loop with a control variable with the same name, which then takes over for its body (and so on)). The control variable is not allowed to be modified by the loop body. Given that the predicate isScalarType(T) is true if and only if T is a scalar type, the rule for a for follows. isScalarType(T) syms ⊢ lower : T syms ⊢ upper : T syms ⊕ { id ↦ VarEntry(ref(T)) } ⊢ WFStatement(sl) syms ⊢ WFStatement(for id : lower .. upper do sl od) In addition, the control variable is read only within sl. To facilitate the implementation of the for statement, the symbol table entry for a variable (SymEntry.VarEntry) has a flag to indicate whether or not the variable is read only. In addition, the symbol table (Scope) has a method, extendCurrentScope, to extend the current scope with a new scope at the https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:9/15 same lexical level. The extended scope can be used to declare the control variable with a scope which includes only the for statement. When space is allocated for variables declared in the extended scope they are effectively allocated within the parent scope (or for an extended scope within an extended scope within the grandparent scope, etc.). Leaving the extended scope can be done using the method getParent in class Scope. Dynamic Semantics Enumerations Variables and values of enumerated types act like other scalar types (because they are represented by integer values) and hence nothing special should be required for code generation except to implement predecessor and successor. The successor of the greatest element of an enumeration type wraps around to give the least element of t he type and the predecessor of the least element wraps around to give the greatest element of the type. Arrays The dynamic semantics of array accessing is conventional. An element of an array may be used like a variable whose type is the same as the element type of the array. For example, if the element type is int then it may be "assigned to", read, written and used in arithmetic expressions and comparisons (but if you start worrying about each of these you are making way more work for yourself than you need to). Accesses to array elements should be bounds checked at run time, i.e., the value of the index expression should be checked to make sure it is within the subrange defined by the index type (see the discussion of the BOUND instruction below). Variables of array type are local variables and hence are allocated on the stack just like any other local variable (not on the heap (as in Java "new")). The main difference from scalar variables is that their size is typically not one. The space allocated for an array should be just enough to fit all of its elements. For example, the array vec requires 10 words (one for each integer) and mat requires 100 words. https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:10/15 For statement The lower and upper bound expressions are evaluated once to v1 and v2, respectively, at the start of the execution of the for, not on every iteration. If v1 is greater than v2, then the subrange is empty, and hence the body is not executed at all. Otherwise, the for statement executes the statements in its body for each value in the subrange from v1 up to v2, inclusive, in increasing order. The control variable is assigned the corresponding value at the start of each iteration. The control variable may not be modified (it is read only) within the body of the for loop. It is only modified (incremented) by the for statement itself. Object Code The PL0 compiler generates code for the Stack Machine. A description of the stack machine is available in Inst.pdf. See also the file StackMachine.java (near the end) for details of the instruction set. Array indexing and bounds checking The BOUND instruction expects that the following have been loaded (pushed) onto the stack (in this order): 1. a value to be bounds checked, 2. a lower bound, and 3. an upper bound. The BOUND instruction pops the upper and lower bounds and checks that the value (now on top of the stack) is in the range. If it is in range, the value is left on the top of the stack (but the upper and lower bounds have been removed) and execution continues with the next instruction. If the value is out of bounds, the stack machine interpreter halts and prints out an error message that the value is out of range. Array copying https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:11/15 The code generation for the assignment statement currently handles assignments of items larger than a single word by using LOAD_MULTI and STORE_MULTI. If you would like to experiment with doing array assignments more efficiently, the stack machine has a COPY instruction, which takes three arguments from the stack which should be pushed in the following order: the source (or from) address, and the target (or to) address, the size of the block of memory to be moved. For statement It is desirable to reduce the overhead for repeated iterations of a loop because it may be executed a large number of times. You only need to generate code for the lower and upper bound expressions of the for loop once. In fact, it will be incorrect to evaluate the upper bound expression on every iteration because the body of the loop may change variables in that expression. Student Misconduct Students are reminded of the University's policy on student misconduct, including plagiarism. See the course profile and the School web page http://www.itee.uq.edu.au/itee-student-misconduct-including-plagiarism You are expected to protect your files so that they are not readable by other users. Your assignment is expected to be your own individual work and must not include code copied from other students, current or past. You are also reminded not to post your (partial) solutions to assignments to any place accessible by others, including the bulletin board or emailing to other students. If you need that sort of help consult the lecturer and/or tutor. Note that Piazza allows private posts to the instructors. This assignment compiler is provided solely for the purposes of doing this assignment and your solutions must never be shared, especially publicly, even after completion of the course. Such publication would be considered both student misconduct and a breach of copyright. https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:12/15 Late Submission A penalty of 5% of the maximum mark for an assignment will be deducted for each day (24 hours) or part thereof late up to a limit of 7 days, after which time assignments will not be accepted for assessment unless you have been granted an extension. As we plan to hand back assignments a week or two after submission, requests for an extension will not be accepted more than one week late, unless there are exceptional circumstances. Requests for extensions should be accompanied by suitable documentation (see the course profile for details). Personal hardware or computer failures are not grounds for extension. Assignments must be backed up on the university system. Submission Please keep the length of lines in your files below 100 characters, so that we can print them sensibly. You should avoid using tabs or set your tabs stops to 4 spaces so that when we print them (with tab stops set to 4 spaces) they will print sensibly. Do not forget to remove any code generating debugging output and any rogue external imports before submission. You must submit your completed assignment electronically through the assessment section of the course BlackBoard site (the BlackBoard Assessment page rather than the course web pages). You need to submit the following list of individual files (not a .zip or any other form of archive file) for evaluation and marking. Note that file names are case-sensitive. PL0.cup ExpNode.java ExpTransform.java StatementNode.java StatementTransform.java https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:13/15 StatementVisitor.java StaticChecker.java CodeGenerator.java You can submit your assignment multiple times, but only the last copy submitted will be retained for marking. Assessment The assignment is marked out of a total of 20 marks. Marks will be allocated as follows: 6 Syntax analysis and tree building 7 Static semantics checking 7 Code generation Marks will be awarded for the correctness of the changes to each category. Readability, modular structure, and structural and computational complexity are also criteria. For readability, we expect that you follow good software engineering practice, such as appropriate choices of variable names, consistent indentation, appropriate comments where needed, etc. For modularity we expect you introduce new methods where it makes sense to help structure the program and to avoid unnecessary duplication of code. Use of generic Java utility interfaces (like Set, Map, List, Queue, ...) and their implementations (like HashSet, ..., TreeMap, ..., LinkedList, ...) is encouraged. We expect you to produce well structured programs that are not unnecessarily complex, both structurally (e.g. complex control flow that is hard to follow), and in terms of execution time and space requirements, (e.g. an O(n) algorithm is preferred to an O(n ) algorithm, and a O(log n) algorithm is even better). We will not be concerned with the quality of syntactic error recovery because the parser generator CUP takes care of that for the most part, but you must handle semantic errors appropriately, i ncluding handling the situation when there is a syntax error, i.e., your compiler should not crash because of a syntax error. Your assignment files will be compiled in the context of the remaining assignment files and put through a sequence of tests. The total mark for this assignment will be limited by the 2 https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:14/15 overall success of the development in the following way: The program submitted does not compile: Maximum 10/20. The program submitted will not correctly handle any test case with the new facilities: Maximum 13/20. You are not required to correct any bugs that may exist in the original compiler. However, we would appreciate being informed of any such bugs that you detect, so that we can correct them, or any improvements to the compiler you might like to suggest. https://learn.uq.edu.au/bbcswebdav/pid-5316913-dt-content…_1/courses/COMP4403S_7020_21490/2020-assignment2.html 2020/5/7 下午9:07 ⻚码:15/15

admin

Author admin

More posts by admin