Code generation

This section describes how the generated code interacts with and supports the run time system. We need to look at function entry points and node descriptors.

Function entry points

Each function has multiple entry points. Which entry point is used depends on the context in which the function is called.

As a running example, consider the entry points generated for ++:

.export e_StdList_s++
.export e_StdList_ea++
.descexp e_StdList_d++ e_StdList_n++ e_StdList_l++ 2 0 "++"
.a 2 e_StdList_ea++
.o 2 0
e_StdList_l++
    push_args 0 1 1
    update_a 2 1
    create
    update_a 0 3
    pop_a 1
.d 3 0
    jmp e_StdList_ea++
.n 2 e_StdList_d++ e_StdList_ea++
.o 1 0
e_StdList_n++
    push_node _cycle_in_spine 2
.o 3 0
e_StdList_ea++
    jsr_eval 0
.o 3 0
e_StdList_s++
    | actual implementation ...

The strict entry point

The strict entry point starts has the prefix s (e.g., e_StdList_s++). All other entry points eventually end up here. It contains the actual implementation of the function, with the assumption that all strict arguments are in head normal form on the A stack or unboxed on the B stack. Below the arguments on the A stack is a thunk that can be filled with the result.

When the result of a function call is guaranteed to be needed, the strict entry point is used directly. The calling function makes sure to evaluate the strict arguments. This can be seen in the code generated for Start = [] ++ []:

    buildh _Nil 0
    buildh _Nil 0
.d 3 0
    jmp e_StdList_s++

Here it is the Start thunk that is below the arguments on the stack and is overwritten with the result. In some contexts, an empty _cycle_in_spine node is created to hold the result:

    create 
    buildh _Nil 0
    buildh _Nil 0
.d 3 0 
    jsr e_StdList_s++

The evaluate arguments entry point

In contexts where it is uncertain that the strict arguments are in head normal form or unboxed, the evaluate arguments entry point is used. The prefix for this entry point is ea (e.g., e_StdList_ea++). It uses jsr_eval to evaluate the necessary arguments and may use instructions like pushI_a to unbox values from the A stack to the B stack. It then moves on to the strict entry point.

The node entry point

The node entry point has the prefix n (e.g., e_StdList_n++). It is used when evaluating a thunk. The descriptor of a thunk points directly to the node entry point. To evaluate a thunk, all that is needed is to jump into this entry point with the node itself on the stack.

The thunk arguments are pushed to the stack, and the thunk itself is overwritten with _cycle_in_spine. This allows the run time system to recognize the situation that the evaluation of a thunk depends on its own result, as in Start = let xs = [hd xs] in xs.

This brings the stack in the state required for the evaluate arguments entry point.

The lazy entry point

The lazy entry point, with prefix l (e.g., e_StdList_l++) is used for the implementation of currying. It covers the case that all arguments but one are in a closure and the final argument is provided. The closure is on top of the stack; the argument is right below. The lazy entry point creates a new node for the result and brings the stack in the state required for the evaluate arguments entry point.

Descriptors

The first word of every node in the heap points to a descriptor. This descriptor is a section in the data segment containing information about the node type which is used in various places:

  • In the garbage collector, to determine the arity and node layout.
  • In the driver in _system.abc, to print values correctly.

To be able to check efficiently whether a node is a head normal form or a thunk, the descriptor of head normal forms is bitwise OR’ed with 2. This leads to the following Assembly code for a jsr_eval instruction:

ea4:                       # evaluate arguments entry point
        testb   $2,(%rcx)  # check for head normal form
        jne     e_310      # if so, omit evaluation
        call    *(%rcx)    # if not, evaluate with the node entry point
e_310:                     # proceed

Because there are different kinds of nodes, different types of descriptors are generated:

The exact contents of the descriptor varies, but includes such things as:

  • The arity. This is needed for the garbage collector.

    For records with unboxed values, the arity is stored as (a_size + b_size + 256) | (a_size << 16). For thunks with unboxed values, the arity is stored as (a_size + b_size) | (b_size << 8).

  • The name of the constructor, record, or function (in case of a closure). This is needed to print the node.

  • A type string, for records. This contains the type of each unboxed value, so that the run time system knows how to print them.

Curry tables

For nodes that can be curried (ADT constructors and function closures) a curry table is generated. It is best to look at an example, the descriptor for ++:

        .quad   e__StdList__d_A_A+2
        .globl  e__StdList__d_A_A
e__StdList__d_A_A:
        .word   0
        .word   16
        .long   yet_args_needed_0
        .word   1
        .word   8
        .long   e__StdList__l_A_A
        .word   2
        .word   0
        .word   0
        .word   2
        .long   m__StdList

e_StdList__d_A_A is the e_StdList_d++ descriptor (see symbol names), which is generated with the following ABC code:

.descexp e_StdList_d++ e_StdList_n++ e_StdList_l++ 2 0 "++"

2 here is the A stack arity and 0 is the B stack arity. ++ is the name, for printing, and the labels refer to the relevant entry points.

When a node (++) is generated, it uses the d++ descriptor. This descriptor thus points to 0, the arity.

When currying one argument into the closure (i.e., (++) xs), the label at offset 4 (for this target platform) is used, i.e., yet_args_needed_0. This label is defined in the run time system and will create a closure with one argument.

The closure with one argument has the descriptor e__StdList__d_A_A+8: it points to 1, which is the arity. Again, to curry one additional argument in (i.e., ((++) xs) ys), we use the label, here e__StdList__l_A_A. This is the lazy entry point, because we are precisely in that situation: there is a closure that expects one argument, and the argument is being provided.

The numbers 16 and 8 are the number of needed arguments, shifted to the right by 3: 2 and 1, respectively. These are used when currying more than one argument at the time (e.g., (++) xs ys). The run time system checks whether the number of provided arguments matches the number of needed arguments. If this is the case, it finds the lazy entry and finds the evaluate arguments entry point from right above it (this is what the .a 2 e_StdList_ea++ directive above the lazy entry point is for). It moves all arguments from the closure to the stack (in addition to the curried arguments which are already on the stack), and calls the evaluate arguments entry point. However, if still more arguments are needed, a new closure is created.

The last five lines of the descriptor are not part of the curry table.