Anyone running a Clean program in gdb, or inspecting its symbols in another
tool, will realise that Clean function names have been mangled to escape
special characters and to disambiguate duplicate names. For example, you may
find the symbol
e____SystemEnumStrict__s__from__s_I24. To demangle the name, we first have to
unescape it to get the ABC symbol name, and then we have to read the ABC code
to find the Clean function.
From symbol names to ABC symbols
To find the ABC symbol name related to this, we can unescape the name. The
escape function is defined in
Underscores are used for escaping, with
_N adding a second escape layer. We
have the following sequences:
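An unescape step of this kind is a left-to-right scan over the symbol. The Python sketch below is only an illustration, not the code generator's own function: the `unescape` helper and `ASSUMED_TABLE` are hypothetical names, and the two table entries (`__` for a literal underscore, `_I` for `;`) are assumptions made for the example symbol from the introduction. Fill the table from the actual list of sequences before relying on it. The sketch does not handle the second escape layer introduced by `_N`; any sequence missing from the table raises an error rather than being guessed.

```python
def unescape(symbol, table):
    """Replace every two-character sequence '_X' with table['X'].

    `table` maps the character after the underscore to its replacement.
    Sequences that are not in the table raise ValueError.
    """
    out = []
    i = 0
    while i < len(symbol):
        ch = symbol[i]
        if ch != "_":
            # Ordinary character: copy it unchanged.
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(symbol):
            raise ValueError("dangling underscore at end of symbol")
        code = symbol[i + 1]
        if code not in table:
            raise ValueError(f"unknown escape sequence _{code}")
        out.append(table[code])
        i += 2  # skip the underscore and the code character
    return "".join(out)


# ASSUMPTION: illustrative entries only, not the complete escape table.
ASSUMED_TABLE = {"_": "_", "I": ";"}

abc_label = unescape("e____SystemEnumStrict__s__from__s_I24", ASSUMED_TABLE)
print(abc_label)
```

Raising on unknown sequences is a deliberate choice here: a partially unescaped label would not match anything in the ABC file, so failing early is more useful than returning a wrong name.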
From ABC symbols to Clean functions
To find the Clean function belonging to an ABC symbol, first find the definition of the label.
If you have compiled with profiling information, look for the profiling
directive above the label. This directive often contains a more human-readable
name than the label itself (e.g., <case>[line:10];9;2).
Without such information, the easiest approach is to work out the call graph:
- Look at which functions are called from, or which thunks are built by, the
label. If a label builds an e_StdList_nreverse thunk, look for usages of
reverse in the Clean source file.
- If there are no clear outgoing links, try to find incoming links by searching
for usages of the label, and repeat. For instance, you may find s3 being used
in s5, and then repeat the process with s5.
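The search for incoming links can be scripted. The following Python sketch makes several assumptions that are not taken from this page: that label definitions in the ABC file start in the first column, that instructions are indented, and that lines starting with a dot are directives. The `incoming_links` helper and the `SAMPLE` listing are hypothetical and only serve to show the idea.

```python
import re


def incoming_links(abc_text, target):
    """Return the labels whose bodies mention `target`, in order of appearance."""
    users = []
    current = None
    # Match the target as a whole token, so that "s3" does not match "s35".
    token = re.compile(r"(?<![\w.;])" + re.escape(target) + r"(?![\w.;])")
    for line in abc_text.splitlines():
        if not line.strip():
            continue
        # ASSUMPTION: an unindented line that is not a directive defines a label.
        if not line[0].isspace() and not line.startswith("."):
            current = line.split()[0]
            continue
        # Skip mentions inside the target's own body (self-recursion).
        if current and current != target and token.search(line):
            if current not in users:
                users.append(current)
    return users


# Made-up ABC-like listing, for illustration only.
SAMPLE = "s3\n\tpop_a 1\n\trtn\ns5\n\tjsr s3\n\trtn\ns35\n\tjmp s5\n"

print(incoming_links(SAMPLE, "s3"))
print(incoming_links(SAMPLE, "s5"))
```

Calling the helper repeatedly, each time with one of the labels it returned, walks the call graph upwards until a label with a recognisable name is reached.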
By splitting up functions with long
# sequences into multiple functions, you
can create more symbol names, which makes it easier to find locations.
Also, exporting functions can make it easier to read the ABC code as the names of exported functions are more readable.