+Here "abc" stays for "array bounds check", or "array bound checks", or
+some combination of the two.
+
+* Usage
+
+ Simply use the "abcrem" optimization invoking mono.
+
+ To see if bound checks are actually removed, use "mono -v" and
+ grep for "ARRAY-ACCESS" in the output, there should be a
+ message for each check that has been removed.
+
+ To trace the algorithm execution, use "-v -v -v", and be
+ prepared to be totally submersed by debugging messages...
+
+* Effectiveness
+
+ The abc removal code can now always remove bound checks from
+ "clean" array scans in loops, and generally anyway there are
+ clear conditions that already state that the index is "safe".
+
+ To be clearer, and give an idea of what the algorithm can and
+ cannot do without describing it in detail... keep in mind that
+ only "redunant" checks cannot be removed. By "redundant", I
+ mean "already explicitly checked" in the method code.
+
+ Unfortunately, analyzing complex expressions is not so easy
+ (see below for details), so the capabilities of this "abc
+ remover" are limited.
+
+ These are simple guidelines:
+
+ - Only expressions of the following kinds are handled:
+ - constant
+ - variable [+/- constant]
+ - Only comparisons between "handled" expressions are understood.
+ - "switch" statements are not yet handled.
+
+ This means that code like this will be handled well:
+
+ for (int i = 0; i < a.Length; i++) {
+ a [i] = .....
+ }
+
+ The "i" variable could be declared out of the "for", the "for"
+ could be a "while", and maybe even implemented with "goto",
+ the array could be scanned in reverse order, and everything
+ would still work.
+
+ I could have a temporary variable storing the array length,
+ and check on it inside the loop, and the abc removal would
+ still occurr, like this:
+
+ int i = 0;
+ int l = a.Length;
+ while ( i < l ) {
+ a [i] ......
+ }
+
+ or this:
+
+ int l = a.Length;
+ for (int i = l; i > 0; i--) {
+ a [i] = .....
+ }
+
+ The following two examples would work:
+
+ for (int i = 0; i < (a.Length -1); i++) .....
+ for (int i = 0; i < a.Length; i += 2) .....
+
+ But just something like this
+
+ int delta = -1;
+ for (int i = 0; i < (a.Length + delta); i++) .....
+
+ or like this
+
+ int delta = +2;
+ for (int i = 0; i < a.Length; i += delta) .....
+
+ would not work, the check would stay there. (unless
+ a combination of cfold, consprop and copyprop is used, too,
+ which would make the constant value of "delta" explicit).
+
+ Just to make you understand how things are tricky... this would work!
+
+ int limit = a.Length - 1;
+ for (int i = 0; i < limit; i++) {
+ a [i] = .....
+ }
+
+ A detailed explanation of the reason why things are done like
+ this is given below.
+
+* The Algorithm
+
+ This array bound check removal (abc removal) algorithm is
+ based on symbolic execution concepts. I handle the removal
+ like an "unreachable code elimination" (and in fact the
+ optimization could be extended to remove also other
+ unreachable sections of code, due to branches that "always go
+ the same way").
+
+ In symbolic execution, variables do not have actual (numeric)
+ values, but instead symbolic expressions (like "a", or "x+y").
+ Also, branch conditions are handled like symbolic conditions
+ (like "i<k"), which state relations between variable values.
+
+ The SSA representation inside mini is somewhat close to a
+ symbolic representation of the execution of the compiled
+ method.
+
+ Particularly, the representation of variable values is exactly
+ a symbolic one. It is enough to find all CEE_STIND_*
+ instructions which store to a local variable, and their second
+ argument is exactly the variable value. Actually, "cfg->vars
+ [<variable-index>]->def" should contain exactly those store
+ instructions, and the "get_variable_value_from_ssa_store"
+ function extracts the variable value from there.
+
+ On the other hand, the conditions under which each basic block
+ is executed are not made fully explicit.
+
+ However, it is not difficult to make them so.
+
+ Each BB that has more than one exit BB, in practice must end
+ either with a conditional branch instruction or with a switch
+ instruction.
+
+ In the first case, the BB has exactly two exit BBs, and their
+ execution conditions are easy to get from the condition of the
+ branch (see the "get_relation_from_branch_instruction"
+ function, and expecially the end of "analyze_block" in
+ abcremoval.c.
+
+ If there is a switch, the jump condition of every exit BB is
+ the equality of the switch argument with the particular index
+ associated with its case (but the current implementation does
+ not handle switch statements yet).
+
+ These individual conditions are in practice associated to each
+ arc that connects BBs in the CFG (with the simple case that
+ unconditional branches have a "TRUE" condition, because they
+ always happen).
+
+ So, for each BB, its *proper* entry condition is the union of
+ all the conditions associated to arcs that enter the BB. The
+ "union" is like a logical "or", in the sense that either of
+ the condition could be true, they are not necessarily all
+ true. This means that if I can enter a BB in two ways, and in
+ one case I know that "x>0", and in the other that "x==0",
+ actually in the BB I know that "x>=0", which is a weaker
+ condition (the union of the two).
+
+ Also, the *complete* entry condition for a BB is the
+ "intersection" of all the entry conditions of its
+ dominators. This is true because each dominator is the only
+ way to reach the BB, so the entry condition of each dominator
+ must be true if the control flow reached the BB. This
+ translates to the logical "and" of all the "proper" conditions
+ of the BBs met walking up in the dominator tree. So, if one
+ says "x>0", and another "x==1", then I know that "x==1", which
+ is a stronger condition (the intersection of the two).
+
+ Note that, if the two conditions were "x>0" and "x==0", then
+ the block would be unreachable (the intersection is empty),
+ because some branch is impossible.
+
+ Another observation is that, inside each BB, every variable is
+ subject to the complete entry condition of that very same BB,
+ and not the one in which it is defined (with the "complete
+ entry condition" being the thing I defined before, sorry if
+ these terms "proper" and "complete" are strange, I found
+ nothing better).
+
+ This happens because the branch conditions are related to the
+ control flow. I can define "i=a", and if I am in a BB where
+ "a>0", then "i>0", but not otherwise.
+
+ So, intuitively, if the available conditions say "i>=0", and i
+ is used as an index in an array access, then the lower bound
+ check can be omitted. If the condition also says
+ "(i>=0)&&(i<array.length)", the abc removal can occur.
+
+ So, a complete solution to the problem of abc removal would be
+ the following: for each array access, build a system of
+ equations containing:
+
+ [1] all the symbolic variable definitions
+
+ [2] the complete entry condition of the BB in which
+ the array access occurs
+
+ [3] the two "goal functions" ("index >=0" and "index <
+ array.length")
+
+ If the system is valid for *each possible* variable value, then the goal
+ functions are always true, and the abc can be removed.
+
+ All this discussion is useful to give a precise specification
+ to the problem we are trying to solve.
+
+ The trouble is that, in the general case, the resulting system
+ of equations is like a predicate in first order logic, which
+ is semi-decidable, and its general solution is anyway too
+ complex to be attempted in a JIT compiler (which should not
+ contain a full fledged theorem prover).
+
+ Therefore, we must cut some corner.
+
+ There is also another big problem, which is caused by
+ "recursive" symbolic definitions. These definition can (and
+ generally do) happen every time there is a loop. For instance,
+ in the following piece of code:
+
+ for ( int i = 0; i < array.length; i++ ) {
+ Console.WriteLine( "array [i] = " + array [i] );
+ }
+
+ one of the definitions of i is a PHI that can be either 0 or
+ "i + 1".
+
+ Now, we know that mathematically "i = i + 1" does not make
+ sense, and in fact symbolic values are not "equations", they
+ are "symbolic definitions".
+
+ The actual symbolic value of i is a generic "n", where "n" is
+ the number of iterations of the loop, but this is terrible to
+ handle (and in more complex examples the symbolic value of i
+ simply cannot be written, because i is calculated in an
+ iterative way).
+
+ However, the definition "i = i + 1" tells us something about
+ i: it tells us that i "grows". So (from the PHI definition) we
+ know that i is either 0, or "grows". This is enough to tell
+ that "i>=0", which is what we want! It is important to note
+ that recursive definitions can only occurr inside PHI
+ definitions, because actually a variable cannot be defined
+ *only* in terms of itself!
+
+ At this point, I can explain which corners I want to cut to
+ make the problem solvable. It will not remove all the abc that
+ could theoretically be removed, but at least it will work.
+
+ The easiest way to cut corners is to only handle expressions
+ which are "reasonably simple", and ignore the rest.
+
+ Keep in mind that ignoring an expression is not harmful in
+ itself. The algorithm will be simply "less powerful", because
+ it will ignore conditions that could have caused to the
+ removal of an abc, but will not remove checks "by mistake" (so
+ the resulting code will be in any case correct).
+
+ The expressions we handle are the following (all of integer
+ type):
+
+ - constant
+ - variable
+ - variable + constant
+ - constant + variable
+ - variable - constant
+
+ And, of course, PHI definitions.
+
+ Any other expression causes the introduction of an "any" value
+ in the evaluation, which makes all values that depend from it
+ unknown as well.
+
+ We will call these kind of definitions "summarizable"
+ definitions.
+
+ In a first attempt, we can consider only branch conditions
+ that have the simplest possible form (the comparison of two
+ summarizable expressions).
+
+ We can also simplify the effect of variable definitions,
+ keeping only what is relevant to know: their value range with
+ respect to zero and with respect to the length of the array we
+ are currently handling.
+
+ One particular note on PHI functions: they work (obviously)
+ like the logical "or" of their definitions, and therefore are
+ equivalent to the "logical or" of the summarization of their
+ definitions.
+
+ About recursive definitions (which, believe me, are the worst
+ thing in all this mess), we handle only "monotonic" ones. That
+ is, we try to understand if the recursive definition (which,
+ as we said above, must happen because of a loop) always
+ "grows" or "gets smaller". In all other cases, we decide we
+ cannot handle it.
+
+ One critical thing, once we have defined all these data
+ structures, is how the evaluation is actually performed.
+
+ In a first attempt I coded a "brute force" approach, which for
+ each BB tried to examine all possible conditions between all
+ variables, filling a sort of "evaluation matrix". The problem
+ was that the complexity of this evaluation was quadratic (or
+ worse) on the number of variables, and that many wariables
+ were examined even if they were not involved in any array
+ access.
+
+ Following the ABCD paper:
+
+ http://citeseer.ist.psu.edu/bodik00abcd.html
+
+ I rewrote the algorithm in a more "sparse" way.
+
+ Now, the main data structure is a graph of relations between
+ variables, and each attempt to remove a check performs a
+ traversal of the graph, looking for a path from the index to
+ the array length that satisfies the properties "index >= 0"
+ and "index < length". If such a path is found, the check is
+ removed. It is true that in theory *each* traversal has a
+ complexity which is exponential on the number of variables,
+ but in practice the graph is not very connected, so the
+ traversal terminates quickly.
+
+
+ Then, the algorithm to optimize one method looks like this:
+
+ [1] Preparation:
+
+ [1a] Build the SSA representation.
+
+ [1b] Prepare the evaluation graph (empty)
+
+ [1b] Summarize each varible definition, and put
+ the resulting relations in the evaluation
+ graph
+
+ [2] Analyze each BB, starting from the entry point and
+ following the dominator tree:
+
+ [2a] Summarize its entry condition, and put the resulting relations
+ in the evaluation graph (this is the reason
+ why the BBs are examined following the
+ dominator tree, so that conditions are added
+ to the graph in a "cumulative" way)
+
+ [2b] Scan the BB instructions, and for each array
+ access perform step [3]
+
+ [2c] Process children BBs following the dominator
+ tree (step [2])
+
+ [2d] Remove from the evaluation area the conditions added at step [2a]
+ (so that backtracking along the tree the area
+ is properly cleared)
+
+ [3] Attempt the removal:
+
+ [3a] Summarize the index expression, to see if we can handle it; there
+ are three cases: the index is either a
+ constant, or a variable (with an optional
+ delta) or cannot be handled (is a "any")
+
+ [3b] If the index can be handled, traverse the evaluation area searching
+ a path from the index variable to the array
+ length (if the index is a constant, just
+ examine the array length to see if it has
+ some relation with this constant)
+
+ [3c] Use the results of step [3b] to decide if the check is redundant
+