* use a pool of MBState structures to speed up monoburg instead of using a
* the decode tables in the burg-generated code could use short instead of int
  (this should save about 1 KB)
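The saving is purely a matter of element width. A rough sketch of the arithmetic, assuming a hypothetical table of 512 entries (the real burg-generated tables and their sizes are not shown here) and using fixed-width types as stand-ins for int and short so the numbers do not depend on the platform:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry count; the real burg-generated tables differ. */
#define DECODE_ENTRIES 512

/* The same table with a 32-bit and a 16-bit element type. */
typedef int32_t WideTable[DECODE_ENTRIES];
typedef int16_t NarrowTable[DECODE_ENTRIES];

/* Copy a wide table into a narrow one, checking that no entry is truncated. */
static void narrow_table(const int32_t *wide, int16_t *narrow, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(wide[i] >= INT16_MIN && wide[i] <= INT16_MAX);
        narrow[i] = (int16_t)wide[i];
    }
}

/* Bytes saved per table by narrowing the element type. */
static size_t bytes_saved(void)
{
    return sizeof(WideTable) - sizeof(NarrowTable);
}
```

With these assumed sizes bytes_saved () is 512 * 2 = 1024 bytes. The change is only safe if every table entry fits in 16 bits, which is what the assert checks.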
* track the use of ESP, so that we can avoid the x86_lea in the epilog
* the ORP people avoid optimizations inside catch handlers, just to save
  memory (for example allocation of strings: instead they allocate strings
  when the code is executed, like the --shared option). But there are only a
  few functions using catch handlers, so I consider this a minor issue.
* some performance-critical functions should be inlined. These include:
  - mono_mempool_alloc and mono_mempool_alloc0
  - EnterCriticalSection and LeaveCriticalSection
  - mono_metadata_row_col
  - mono_g_hash_table_lookup
* load_class_names can be sped up by caching the per-namespace hash tables
  in a new hash table indexed by the index of the namespace in the blob heap.
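A possible shape for such a cache, as a sketch only: a small open-addressing table keyed by the namespace's heap index. All names here (NsCacheSlot, ns_cache_lookup, ns_cache_insert) and the fixed capacity are hypothetical, and the per-namespace table is kept as an opaque pointer rather than a real hash table type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NS_CACHE_SLOTS 64  /* hypothetical fixed capacity, must be a power of two */

typedef struct {
    uint32_t ns_index;  /* heap index of the namespace name */
    void *table;        /* per-namespace hash table; NULL marks an empty slot */
} NsCacheSlot;

static NsCacheSlot ns_cache[NS_CACHE_SLOTS];

/* Return the cached per-namespace table, or NULL if none is cached yet. */
static void *ns_cache_lookup(uint32_t ns_index)
{
    for (uint32_t i = 0; i < NS_CACHE_SLOTS; i++) {
        NsCacheSlot *slot = &ns_cache[(ns_index + i) & (NS_CACHE_SLOTS - 1)];
        if (slot->table == NULL)
            return NULL;               /* hit an empty slot: not cached */
        if (slot->ns_index == ns_index)
            return slot->table;
    }
    return NULL;                       /* table full and key not present */
}

/* Cache a per-namespace table. Returns 1 on success, 0 if the cache is full. */
static int ns_cache_insert(uint32_t ns_index, void *table)
{
    assert(table != NULL);
    for (uint32_t i = 0; i < NS_CACHE_SLOTS; i++) {
        NsCacheSlot *slot = &ns_cache[(ns_index + i) & (NS_CACHE_SLOTS - 1)];
        if (slot->table == NULL || slot->ns_index == ns_index) {
            slot->ns_index = ns_index;
            slot->table = table;
            return 1;
        }
    }
    return 0;
}
```

The loader would call ns_cache_lookup () first and only build (and then insert) the per-namespace table on a miss, so the namespace string itself is hashed at most once per namespace.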
* the managed/unmanaged boundary is quite slow:
  - it calls mono_get_lmf_addr, which calls TlsGetValue, which calls
    pthread_getspecific (). This means that 3 function calls are needed for
    each native function call.
* currently mono assumes that the CustomAttribute table is not sorted, so
  lookup in this table is slow. Furthermore, this lookup is used by
  field_is_thread_static, which is called many times.
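If the table can be relied on to be sorted by its parent column, the linear scan could become a binary search. A minimal sketch, assuming a simplified, hypothetical row struct (CattrRow) rather than the real metadata layout, and using the standard bsearch ():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified row; not the real metadata layout. */
typedef struct {
    uint32_t parent;  /* index of the element that owns the attribute */
    uint32_t type;    /* index of the attribute constructor */
} CattrRow;

/* Unsorted case: every row has to be visited, O(n). */
static const CattrRow *cattr_find_linear(const CattrRow *rows, size_t n,
                                         uint32_t parent)
{
    for (size_t i = 0; i < n; i++)
        if (rows[i].parent == parent)
            return &rows[i];
    return NULL;
}

/* bsearch comparator: the key is a parent index, the element is a row. */
static int cattr_compare_parent(const void *key, const void *elem)
{
    uint32_t parent = *(const uint32_t *)key;
    const CattrRow *row = elem;
    return (parent > row->parent) - (parent < row->parent);
}

/* Sorted case: O(log n). Requires rows to be sorted by parent.
   Returns some matching row (not necessarily the first), or NULL. */
static const CattrRow *cattr_find_sorted(const CattrRow *rows, size_t n,
                                         uint32_t parent)
{
    if (n == 0)
        return NULL;
    assert(rows != NULL);
    return bsearch(&parent, rows, n, sizeof(CattrRow), cattr_compare_parent);
}
```

A caller such as field_is_thread_static would then pay O(log n) per query instead of O(n). Since an element can carry several attributes, a real version would also have to walk to the neighbouring rows with the same parent.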
* mono_find_jit_opcode_emulation is called many times during compilation,
  and each call involves a hash table lookup.
* mcs should create AssemblyBuilders with the Run flag instead of RunAndSave,
  so the runtime could avoid fully constructing the types in the dynamic
* if a function which involves locking is called from another function which
  acquires the same lock, it might be useful to create a separate _inner
  version of the function which does not re-acquire the lock. This is a perf
  win only if the function is called many times, like mono_get_method.