mono/mini/TODO

* use a pool of MBState structures instead of a mempool to speed up
  monoburg.
* the decode tables in the burg-generated code could use short instead of int
  (this should save about 1 KB)
* track the use of ESP, so that we can avoid the x86_lea in the epilog
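
A minimal sketch of the pool idea from the first item, assuming a simple
free-list design. MBStateSketch, StatePool and the state_pool_* functions are
hypothetical names used only for illustration; the real MBState layout is
generated by monoburg and is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the monoburg state record (assumption). */
typedef struct MBStateSketch {
	struct MBStateSketch *next_free; /* link used only while on the free list */
	int op;
	unsigned short cost [4];
} MBStateSketch;

typedef struct {
	MBStateSketch *free_list; /* recycled states ready for reuse */
	size_t allocated;         /* number of states obtained from calloc */
} StatePool;

static void
state_pool_init (StatePool *pool)
{
	pool->free_list = NULL;
	pool->allocated = 0;
}

/* Return a zeroed state, reusing a recycled one when possible. */
static MBStateSketch *
state_pool_get (StatePool *pool)
{
	MBStateSketch *s = pool->free_list;

	if (s) {
		pool->free_list = s->next_free;
		memset (s, 0, sizeof (*s));
		return s;
	}
	s = calloc (1, sizeof (*s));
	if (s)
		pool->allocated++;
	return s;
}

/* Hand a state back to the pool instead of freeing it. */
static void
state_pool_put (StatePool *pool, MBStateSketch *s)
{
	assert (s != NULL);
	s->next_free = pool->free_list;
	pool->free_list = s;
}

/* Free every state currently on the free list. */
static void
state_pool_destroy (StatePool *pool)
{
	while (pool->free_list) {
		MBStateSketch *s = pool->free_list;
		pool->free_list = s->next_free;
		free (s);
	}
}
```

The point of the design is that states released after one method is compiled
can be reused for the next one without going back to the allocator. States
that are never returned with state_pool_put are not tracked by this sketch.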


Other Ideas:

* the ORP people avoid optimizations inside catch handlers just to save
  memory (for example, they skip allocating strings up front and instead
  allocate them when the code is executed, like the --shared option). But only
  a few functions use catch handlers, so I consider this a minor issue.

* some performance-critical functions should be inlined. These include:
        - mono_mempool_alloc and mono_mempool_alloc0
        - EnterCriticalSection and LeaveCriticalSection
        - TlsSetValue
        - mono_metadata_row_col
        - mono_g_hash_table_lookup
        - mono_domain_get
* load_class_names can be sped up by caching the per-namespace hash tables
  in a new hash table indexed by the index of the namespace in the blob heap.
* the managed/unmanaged boundary is quite slow:
        - it calls mono_get_lmf_addr, which calls TlsGetValue, which calls
          pthread_getspecific (). This means that three function calls are
          needed for each native function call.
* mono_find_jit_opcode_emulation is called many times during compilation,
  and each call involves a hash table lookup.
* mcs should create AssemblyBuilders with the Save flag instead of RunAndSave,
  so the runtime could avoid fully constructing the types in the dynamic
  assembly.
* if a function which involves locking is called from another function which
  acquires the same lock, it might be useful to create a separate _inner
  version of the function which does not re-acquire the lock. This is a perf
  win only if the function is called very often, like mono_get_method.
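
The _inner idea from the last item could look roughly like the sketch below.
It assumes a plain pthread mutex; loader_lock, lookup, lookup_inner,
lookup_sum and the cache array are hypothetical names, not the runtime's
actual functions or data.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical lock and cache; not the runtime's real data (assumption). */
static pthread_mutex_t loader_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_acquisitions; /* counts how often the lock was taken */
static int cache [16];

static void
loader_lock_acquire (void)
{
	pthread_mutex_lock (&loader_lock);
	lock_acquisitions++;
}

static void
loader_lock_release (void)
{
	pthread_mutex_unlock (&loader_lock);
}

/* _inner version: the caller must already hold loader_lock. */
static int
lookup_inner (int token)
{
	assert (token >= 0 && token < 16);
	if (!cache [token])
		cache [token] = token * 2 + 1; /* stand-in for the expensive work */
	return cache [token];
}

/* Public version: takes the lock, then defers to the _inner version. */
static int
lookup (int token)
{
	int res;

	loader_lock_acquire ();
	res = lookup_inner (token);
	loader_lock_release ();
	return res;
}

/* A caller that already holds the lock uses lookup_inner directly, so the
 * lock is taken once for the whole batch instead of once per lookup. */
static int
lookup_sum (int first, int count)
{
	int i, sum = 0;

	loader_lock_acquire ();
	for (i = 0; i < count; i++)
		sum += lookup_inner (first + i);
	loader_lock_release ();
	return sum;
}
```

With a non-recursive mutex the split is required for correctness, since
calling lookup from lookup_sum would deadlock. With a recursive lock it only
saves the repeated acquire and release.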

* the frame_state_for function in glibc 2.3.2 can't correctly decipher the
  unwind tables generated by gcc 3.3. It always tells the runtime that not all
  callee-saved registers are saved, even when the icall is marked with
  MONO_ARCH_SAVE_REGS. This forces the runtime to generate wrapper functions
  for all icalls, slowing things down greatly.

Usability
---------

* Remove the descriptions of the various optimization flags; have an
  extra --help-optimizations flag for that list.

* Remove the various graph options; have a separate --help-graph for
  that list.
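
One possible shape for the split help output, as a sketch only. The HelpTopic
enum, help_topic_for_arg and help_text are hypothetical names, and the help
strings are placeholders rather than the real option lists.

```c
#include <assert.h>
#include <string.h>

typedef enum {
	HELP_NONE,
	HELP_MAIN,
	HELP_OPTIMIZATIONS,
	HELP_GRAPH
} HelpTopic;

/* Map a command line argument to the help topic it requests, if any. */
static HelpTopic
help_topic_for_arg (const char *arg)
{
	assert (arg != NULL);
	if (strcmp (arg, "--help") == 0)
		return HELP_MAIN;
	if (strcmp (arg, "--help-optimizations") == 0)
		return HELP_OPTIMIZATIONS;
	if (strcmp (arg, "--help-graph") == 0)
		return HELP_GRAPH;
	return HELP_NONE;
}

/* The main help only points at the detailed lists; each list lives in its
 * own topic. All strings here are placeholders (assumption). */
static const char *
help_text (HelpTopic topic)
{
	switch (topic) {
	case HELP_MAIN:
		return "Usage: <program> [options] assembly\n"
		       "  --help-optimizations   list the optimization flags\n"
		       "  --help-graph           list the graph options\n";
	case HELP_OPTIMIZATIONS:
		return "Optimization flags:\n  (placeholder: one line per flag)\n";
	case HELP_GRAPH:
		return "Graph options:\n  (placeholder: one line per option)\n";
	default:
		return "";
	}
}
```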