doc/design_onstack_replacement.txt

   1 ONSTACK REPLACEMENT IN CACAO
   2 ============================
   3 Author: Edwin Steiner
   4
   5 ** This is a design sketch! **
   6
   7
   8 Overview
   9 --------
  10
  11 On the whole, method replacement works like this:
  12
  13   * a method foo is compiled into a replaceable version fooV1.  fooV1
  14   has a set of "replacement points" and for each replacement point it
  15   has a description (either a data structure or generated code) how to
  16   map the "execution state" at this point to the "source state".
  17
  18   when the compiler decides to replace fooV2, it does the
  19   following:
  20
  21   * a new version fooV2 of the method is compiled. Together with fooV2
  22   the compiler creates a "replacement mapping" that maps each
  23   replacement point of fooV1 to a point in fooV2 and specifies how the
  24   source state has to be mapped to the execution state at this point.
  25
  26   * fooV2 replaces fooV1 in the vftbl. From now on all threads that
  27   enter foo will got to fooV2.
  28
  29   * If not already done, the compiler creates a chunk of
  30   "replacement-out code" for each replacement point of fooV1. If there
  31   is pre-generated replacement-out code it may have fields that are set
  32   now.
  33
  34   * each replacement point in fooV1 is patched with a jump to its chunk
  35   of replacement-out code.
  36
  37   * eventually threads will reach replacement points in fooV1 and be
  38   "beamed" to fooV2.
  39
  40   * fooV1 has to be kept around undefinitely because it is not possible
  41   (not feasible?) to determine if there is a thread that may reach fooV1
  42   code in the future.
  43
  44
  45 Replacement Points
  46 ------------------
  47
  48 Replacement points must be placed in a way such that there is a _static_
  49 upper bound to the number of instructions that a thread must execute
  50 _inside the method body_ until it reaches a replacement point.
  51
  52 NOTE that this does _not_ place an upper bound on the _time_ it takes to
  53 reach a replacement point, because execution can enter a blocking
  54 instruction or a method call at any point and take a possibly unlimited
  55 amount of time to complete that.
  56
  57 CACAO will place a replacement point at method entry and at each target
  58 of a backward branch.
  59
  60 XXX Is the replacement point at the method entry necessary?
  61 Replacement-at-entry is performed by the vftbl anyway.
  62
  63
  64 Execution State
  65 ---------------
  66
  67 The execution state that has to be transformed during
  68 onstack replacement consists of:
  69
  70    1) the program counter
  71
  72    2) live CPU registers
  73       XXX caution: BSR return values!!
  74
  75    3a) live local variables and spill slots
  76       on the stack
  77       XXX caution: BSR return values!!
  78
  79    3b) the locked object of the method if it is synchronized
  80
  81    3c) the return address to the caller and the
  82       link to the previous activation record
  83
  84    3d) saved callee-save registers
  85
  86
  87 ad 1) The program counter is in 1-to-1 correspondance to the replacement
  88 point reached.
  89
  90 ad 2) the register allocator knows which registers are live at the
  91 replacement point. This information can be encoded either in a special
  92 data structure or be transformed into generated replacement-out code
  93 that is created at the same time the method is compiled.
  94
  95 ad 3*) These data are similarily organized for all replacement points in
  96 a method so we probably want to treat these with common code in order to
  97 save space.
  98
  99
 100 Source State
 101 ------------
 102
 103 The source state is a virtual state comprising:
 104
 105    1) the Java Bytecode program counter
 106
 107    2) the values on the Java stack
 108
 109    3) the values in Java local variables
 110
 111    4) synchronization state and locked object of the method
 112
 113    5) the frames (activation records) of Java methods
 114       on the VM stack
 115
 116 The source state is "virtual" in the sense that there does not exists a
 117 real data structure containing these data. However, the source state
 118 must be reconstructable from the execution state at each replacement
 119 point.
 120
 121 It does not matter if the source state values are actually constructed
 122 or if there is a direct translation (execution state V1) -> (execution
 123 state V2) that uses the source state as a conceptual link
 124 between these states.
 125
 126
 127 One-Step vs. Multi-Step Replacement
 128 -----------------------------------
 129
 130 Conceptually the actions from reaching the replacement point in fooV1
 131 to continuing execution in fooV2 are the following:
 132
 133    1) read the execution state of the fooV1 activation
 134
 135    2) remove the activation record corresponding to this invocation
 136       of fooV1
 137
 138    3) build an activation record for the (faked)
 139       invocation of fooV2.
 140
 141    4) write the execution state of the fooV2 activation
 142
 143    5) jump to the fooV2 code
 144
 145
 146 It would be possible to perform all these actions with a single
 147 big chunk of generated code for each replacement point. I think,
 148 however, that a clean separation of the phases has many
 149 advantages.
 150
 151 Phases:
 152
 153         1) done by generated replacement-out code. (Possibly
 154            a replacement-point specific part and a method
 155            specific part).
 156
 157         2-3) implemented in C. Some parts are architecture
 158            specific, however, as they depend on the structure
 159            of the stack frames.
 160
 161         4-5) done by generated replacement-in code or a
 162            general replacement-in function written in assembler.
 163
 164 This separation minimizes platform specific assembler code
 165 and opens possibilities for debugging/logging that will
 166 probably be invaluable when implementing this stuff.
 167
 168
 169 Replacement-Out Code
 170 --------------------
 171
 172 Replacement-out code could look like this:
 173
 174   replacement_point_jumps_here:
 175         sub   %esp,SIZE_OF_STRUCT
 176         mov   REPLACEMENT_POINT_INDEX,ofsindex(%esp)
 177         mov   %regX,ofsX(%esp)
 178         mov   %regY,ofsX(%esp)
 179         jmp   method_specific_code
 180
 181    /* code for other replacement points */
 182
 183    method_specific_code:
 184         mov   %esp,%eax
 185         add   %eax,OFFSET_OF_STACKFRAME_FROM_CURRENT_ESP
 186         mov   %eax,ofsframe(%esp)
 187         /* other method specific copying ... */
 188         push  %esp /* arg0 */
 189         call  method_replace(execution_state *es)
 190
 191 Note that the current method can be found in the
 192 stack frame, so we do not have to explicitely pass that
 193 info. Information on the new code (fooV2) will be
 194 in the methodinfo.
 195
 196
 197 Splitting methodinfo and codeinfo
 198 ---------------------------------
 199
 200 Recompilation will require that cacao deal with
 201 a method (repr. by a `methodinfo`) that has several
 202 realizations in JIT code (`codeinfo` or something
 203 like that). There will always be a current codeinfo
 204 and zero or more old codeinfos that have to be
 205 kept around. (For example we may have to do stack
 206 traces from pcs in old code.)
 207
 208
 209 vim: tw=72
 210