Code Eviction in the Maxine VM

Maxine features no interpreter but instead employs only just-in-time compilation. This implies that considerable amounts of machine code are created in the course of executing an application. Machine code can become outdated: a method may, after its first executions, be recompiled by the optimizing compiler, and static class initializers are executed only once. To address the resulting memory requirements, code eviction was introduced to Maxine in Fall 2011 for baseline code, that is, machine code generated by the T1X compiler.

Baseline Code Cache Management

Prior to the introduction of code eviction, Maxine featured two unmanaged code caches, that is, memory areas where machine code is placed. The boot code region contains all machine code belonging to the VM; it is filled during boot image building. The run-time code region was the one where all code generated by either of the JIT compilers went. After introducing code eviction, the boot code and run-time code regions still exist and are unmanaged, but the latter now only contains code generated by the optimizing compiler. One managed code region for baseline code was added. It adopts a semi-space scheme as known from garbage collection. The most important benefit of this scheme is that it implicitly compacts memory upon collection, so that bump-pointer allocation can be applied. The semi-space code region's to-space is where newly allocated code is placed (and where code surviving an eviction cycle is moved); its from-space is where code subject to eviction is found.
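The semi-space scheme with bump-pointer allocation can be illustrated with a minimal sketch. The class and method names below (SemiSpaceSketch, allocate, flip) are hypothetical and do not reflect Maxine's actual classes; addresses are modeled as plain integer offsets.

```java
// Hypothetical sketch, not Maxine code: a semi-space region with bump-pointer allocation.
final class SemiSpaceSketch {
    private final int spaceSize; // size of each semi-space in bytes
    private int toBase;          // start offset of the current to-space
    private int fromBase;        // start offset of the current from-space
    private int mark;            // bump pointer: next free offset in to-space

    SemiSpaceSketch(int spaceSize) {
        this.spaceSize = spaceSize;
        this.toBase = 0;
        this.fromBase = spaceSize;
        this.mark = toBase;
    }

    /** Returns the start offset of the allocated block, or -1 if to-space is full. */
    int allocate(int size) {
        if (mark + size > toBase + spaceSize) {
            return -1; // allocation failure: in the VM this is what triggers an eviction cycle
        }
        int start = mark;
        mark += size; // allocation is just a pointer bump
        return start;
    }

    /** Swaps the roles of the two spaces; survivors are then re-allocated into the new to-space. */
    void flip() {
        int tmp = toBase;
        toBase = fromBase;
        fromBase = tmp;
        mark = toBase;
    }

    int fromBase() {
        return fromBase;
    }
}
```

Because survivors are re-allocated contiguously after a flip, the free space in to-space is always one contiguous block, which is what makes the simple pointer bump sufficient.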

Code Eviction Workflow

An eviction cycle is triggered when the VM’s attempt to allocate space in the baseline code region fails. If that happens, all threads are suspended (eviction is a stop-the-world VM Operation). The workflow is controlled from the CodeEviction.doIt() method, and proceeds in the following steps.

Identify Survivors

The eviction logic needs to know which machine code survives the eviction cycle. This takes place in two steps:

  1. Walk all suspended threads’ call stacks and mark the baseline methods currently executing as live. Likewise, their direct callees, if they are baseline methods as well, are marked as live. This ensures that code of methods currently being run in any of the threads will not be evicted, and also that code of methods that are very likely to be invoked again survives.
  2. Iterate over the current to-space and mark as live those methods that need to be protected from eviction for reasons other than being executed or being likely to be invoked, namely methods that (a) have just been compiled but were not yet placed in the baseline code cache (such methods are typically the reason for an eviction cycle to be triggered in the first place), or (b) have an invocation counter within the optimization threshold and/or a type profile (such methods might soon be recompiled by the optimizing compiler, and the profile information gathered for them should not be lost).
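The two marking steps can be sketched as follows. This is only an illustration under simplifying assumptions: the Method class, its boolean flags, and identifySurvivors are hypothetical stand-ins, and call stacks are modeled as plain lists of methods rather than real stack frames.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Maxine code: deciding which baseline methods survive eviction.
final class SurvivorSketch {
    /** Simplified stand-in for a compiled method. */
    static final class Method {
        final String name;
        final List<Method> directCallees = new ArrayList<>();
        boolean baseline = true;        // false for optimized or boot code
        boolean justCompiled;           // compiled but not yet placed in the code cache
        boolean counterWithinThreshold; // invocation counter within optimization threshold
        boolean hasTypeProfile;

        Method(String name) {
            this.name = name;
        }
    }

    static Set<Method> identifySurvivors(List<List<Method>> threadStacks, List<Method> toSpace) {
        Set<Method> live = new LinkedHashSet<>();
        // Step 1: baseline methods on any call stack, plus their direct baseline callees.
        for (List<Method> stack : threadStacks) {
            for (Method m : stack) {
                if (!m.baseline) {
                    continue;
                }
                live.add(m);
                for (Method callee : m.directCallees) {
                    if (callee.baseline) {
                        live.add(callee);
                    }
                }
            }
        }
        // Step 2: methods protected for other reasons.
        for (Method m : toSpace) {
            if (m.justCompiled || m.counterWithinThreshold || m.hasTypeProfile) {
                live.add(m);
            }
        }
        return live;
    }
}
```

Everything not in the returned set is a candidate for invalidation in the next phase.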

Invalidate Non-live Methods

Invalidation takes place in three steps:

  1. The machine code of non-live methods in the baseline code region is overwritten with trap instructions. This is not strictly necessary but greatly helps in debugging.
  2. All entries in vtables and itables pointing to these methods are invalidated by letting them point to the respective trampolines again. This is facilitated by iterating over the hubs of all loaded classes.
  3. All direct calls to non-live methods are invalidated by letting them reference trampolines again as well. This implies iterating over all machine code in the three different code regions and checking direct call sites. Direct calls from the boot code region to baseline code are rare, so there is an optimization in place that collects all such calls (they are established at run-time) and thereby avoids iterating over the entire boot code region.
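As an illustration of the second step, the following sketch resets dispatch table entries that point into evicted code. It assumes, purely for the example, that a dispatch table is an array of code addresses and that there is one trampoline address per table index; CodeRange and resetToTrampolines are hypothetical names, not Maxine's API.

```java
import java.util.List;

// Hypothetical sketch, not Maxine code: pointing dispatch entries back at trampolines.
final class InvalidationSketch {
    /** Address range [start, end) occupied by an evicted method's machine code. */
    static final class CodeRange {
        final long start;
        final long end;

        CodeRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(long address) {
            return address >= start && address < end;
        }
    }

    /** Resets every entry that points into an evicted range; returns the number of entries patched. */
    static int resetToTrampolines(long[] dispatchTable, long[] trampolines, List<CodeRange> evicted) {
        int patched = 0;
        for (int i = 0; i < dispatchTable.length; i++) {
            for (CodeRange range : evicted) {
                if (range.contains(dispatchTable[i])) {
                    dispatchTable[i] = trampolines[i]; // next call through this slot recompiles the method
                    patched++;
                    break;
                }
            }
        }
        return patched;
    }
}
```

In the VM, a loop of this kind would run once per hub; direct call sites need a comparable pass over machine code rather than over tables.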

Move Live Methods

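This is the step where semi-space functionality is actually applied. It affects methods that have not been wiped in the previous step. In particular, it involves the following steps, of which the first six are performed for each live method and the remaining ones operate on the code regions and call stacks as a whole: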

  1. Invalidate vtable and itable entries.
  2. Copy the method’s bytes (code and literals arrays) in their entirety over to to-space.
  3. Wipe the method’s old machine code and literals arrays as described above.
  4. Memoise the old start of the method in from-space, and set new values for its start and end in to-space.
  5. Compute and set new values for the code and literals arrays and for the codeStart pointer.
  6. Advance the to-space allocation mark by the method’s size.
  7. Fix direct calls in and to moved code. Direct call sites are relative calls. Hence, all direct calls in moved code have to be adjusted. This is achieved by iterating over all baseline methods (at this point, only methods surviving eviction are affected) and fixing all direct call sites contained therein. Also, direct calls to moved code have to be adjusted. This is achieved by iterating over the optimized and boot code regions and fixing all direct calls to moved code.
  8. Compact the baseline code region’s target methods array by removing entries for wiped (stale) methods.
  9. Fix return addresses on call stacks, and code pointers in local variables. Walk all threads’ call stacks once more and fix return addresses that point to moved code. Likewise, fix pointers to machine code held in CodePointers in the frames of the methods. This logic makes use of the saved old code start of moved methods.
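The address arithmetic behind steps 7 and 9 can be sketched as follows. The example assumes an x86-64 style relative call (one opcode byte followed by a 32-bit displacement measured from the end of the instruction); the class and method names are hypothetical.

```java
// Hypothetical sketch, not Maxine code: address arithmetic for fixing up moved code.
final class CallFixupSketch {
    // Assumed call encoding: 1 opcode byte + 4 displacement bytes.
    static final int CALL_SIZE = 5;

    /** Displacement stored in a relative call at callSite that targets target. */
    static int displacement(long callSite, long target) {
        return Math.toIntExact(target - (callSite + CALL_SIZE));
    }

    /**
     * New displacement after the caller's code moved by callerDelta bytes and the
     * callee's code moved by calleeDelta bytes (either delta may be zero or negative).
     */
    static int fixDisplacement(int oldDisplacement, long callerDelta, long calleeDelta) {
        return Math.toIntExact(oldDisplacement + calleeDelta - callerDelta);
    }

    /** Relocates a return address or code pointer using the saved old start of the moved method. */
    static long relocate(long address, long oldStart, long newStart) {
        return address - oldStart + newStart;
    }
}
```

When caller and callee move by the same amount the displacement is unchanged, while a call from unmoved optimized or boot code into moved baseline code only sees the callee delta. This is why both the moved code and the other code regions have to be visited.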

Tracing and Logging

The eviction algorithm contains copious logging capability using the VMLogger mechanism. There are two distinct capabilities: logging the flow of the algorithm, and dumping pertinent state of the VM before and after the algorithm executes. Dumping is very verbose and does not place data in the VM log buffer. However, it is defined as a logging operation so that it can be enabled using a consistent mechanism.

The loggable operations are separated into four areas: statistics, algorithmic details, code moving, and dumping, with the operation names prefixed by Stats_, Details_, Move_ and Dump, respectively. The VM option -XX:+LogCodeEviction is used to enable logging, with tracing to the log file enabled with -XX:+TraceCodeEviction. The options -XX:+LogCodeEvictionInclude and -XX:+LogCodeEvictionExclude can be used to finely control which operations are logged/traced. The operation prefixes can be used in regular expressions to enable all operations in a particular area, for example: -XX:+LogCodeEvictionExclude=Stats_.*. Note that dumping must be explicitly enabled with -XX:+LogCodeEvictionInclude=Dump.
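How prefix-based regular expressions select operations can be illustrated with plain java.util.regex. The operation names used below (Stats_Surviving, Details_Mark, Move_Method) are made up for the example, and whether VMLogger applies full-match semantics is an assumption; the sketch only shows the general include/exclude filtering idea.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch, not Maxine code: include/exclude filtering of operation names.
final class LogFilterSketch {
    /** Returns the operations matching include (if given) and not matching exclude (if given). */
    static List<String> enabled(List<String> operations, String include, String exclude) {
        Pattern inc = include == null ? null : Pattern.compile(include);
        Pattern exc = exclude == null ? null : Pattern.compile(exclude);
        return operations.stream()
            .filter(op -> inc == null || inc.matcher(op).matches())
            .filter(op -> exc == null || !exc.matcher(op).matches())
            .collect(Collectors.toList());
    }
}
```

With an exclude pattern of Stats_.* every operation in the statistics area is filtered out, while the other areas remain enabled.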


Automatically generated from com.sun.max.vm.code.package-info