-* Emitcontext
+Error Reporting:
+----------------
- Do we really need to instantiate this variable all the time?
+ * Make yyerror show a nice syntax error, instead of the current mess.
- It could be static for all we care, and just use it for making
- sure that there are no recursive invocations on it.
+Iterators
+---------
+ * `yield' is no longer a keyword; it only has special
+ meaning before a `return' or `break' keyword.
-* Static-ization
+ * Study side effects with assign
+ * Study TemporaryStorage/LocalStorage -> Merge/rename
- Since AppDomain exists, maybe we can get rid of all the stuff
- that is part of the `compiler instance' and just use globals
- everywhere.
+ * Reset should throw not implemented now.
-* FindMembers
+Instance idea
+-------------
- Move our utility FindMembers from TypeContainer to Decl, because interfaces
- are also scanned with it.
+ It would be nice for things that can be "instances" to have an
+ EmitInstance method (which would default to doing nothing).
-* Ordering
+ The idea is to be able to use the instance data efficiently in stack
+ manipulations, as opposed to the current scheme, where we basically have
+ a few special cases.
- Can a constant_expression invoke overloaded operators?
- Explicit user-defined conversions?
+Optimization ideas
+------------------
-* Visibility
+ Currently, when we build a type cache, it contains private members,
+ internal members, and internal protected members; we should trim
+ these out, as they show up on the profile.
- I am not reporting errors on visibility yet.
+ We create too many ArrayLists; when we know the size, we should create
+ an array.
-* Enumerations
+ During parsing we use ArrayLists to accumulate data, like this:
- Currently I am not resolving enumerations.
+ thing:
+
+ thing_list
+ : thing { ArrayList a = new ArrayList (); a.Add ($1); $$ = a; }
+ | thing_list thing { ArrayList a = (ArrayList) $1; a.Add ($2); $$ = a; }
- Either I track them with `RecordEnum' as I do with classes,
- structs and interfaces or I rewrite the code to visit type
- containers and `walk' the enums with this process.
+ We probably could start using "Pairs" there:
-* Known problems:
+ thing_list
+ : thing { $$ = new Pair ($1, null); }
+ | thing_list thing { $$ = new Pair ($2, $1); }
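A minimal C# sketch of the Pair idea above. The `Pair` class and its `ToArray` helper are hypothetical names made up for illustration, not existing MCS code. Since each reduction puts the new element at the front, the chain ends up in reverse source order, so the helper counts the cells once and fills an exactly sized array from the back.

```csharp
using System;

// Hypothetical cons cell for parser lists; not an existing MCS class.
public sealed class Pair {
    public readonly object Head;
    public readonly Pair Tail;

    public Pair (object head, Pair tail)
    {
        Head = head;
        Tail = tail;
    }

    // Flattens a reversed chain into an array in source order.
    public static object [] ToArray (Pair last)
    {
        int count = 0;
        for (Pair p = last; p != null; p = p.Tail)
            count++;

        object [] result = new object [count];
        for (Pair p = last; p != null; p = p.Tail)
            result [--count] = p.Head;

        return result;
    }
}
```

This allocates one small cell per element plus a single array at the end, instead of a growing ArrayList per list.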
- Cast expressions
- They should use:
+Anonymous Methods
+-----------------
- OPEN_PARENS type CLOSE_PARENS
+ Plan:
- instead of the current production which is wrong, because it
- only handles a few cases.
+ * Resolve anonymous methods before.
+ * Each time a Local matches, if the mode is `InAnonymous', flag
+ the VariableInfo for `proxying'.
+ * During Resolve track the depth required for local variables.
+ * Before Emit, create proxy classes with proper depth.
+ * Emit.
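To make the `proxying' step concrete, here is a hand-written C# sketch of what moving a captured local into a proxy class amounts to. All names (`Counter`, `CounterProxy`, `ProxyDemo`) are hypothetical, and the class MCS actually generates may look different; this only shows the general shape of the transformation.

```csharp
using System;

public delegate int Counter ();

// Hand-written stand-in for a compiler-generated proxy class.
public sealed class CounterProxy {
    // The captured local lives here instead of on the stack.
    public int count;

    public int Increment ()
    {
        return ++count;
    }
}

public static class ProxyDemo {
    // Source form: the anonymous method captures the local `count'.
    public static Counter WithAnonymousMethod ()
    {
        int count = 0;
        return delegate { return ++count; };
    }

    // Lowered form: the local is replaced by a field on the proxy.
    public static Counter Lowered ()
    {
        CounterProxy proxy = new CounterProxy ();
        proxy.count = 0;
        return new Counter (proxy.Increment);
    }
}
```

Both delegates keep counting across calls, which is why the local cannot stay in the method's own frame.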
- Complex casts like:
+Open question:
+ Create a toplevel block for anonymous methods?
- Array r = (string []) object
+EmitContext.ResolveTypeTree
+---------------------------
- Wont be parsed.
-
-* Interfaces
+ We should investigate its usage. The problem is that by default
+ this will be set when calling FindType, which triggers a more expensive
+ lookup.
- For indexers, the output of ix2.cs is different from our
- compiler and theirs. They use a DefaultMemberAttribute, which
- I have yet to figure out:
+ I believe we should pass the current EmitContext (which has this turned off
+ by default) to ResolveType/ResolveTypeExpr and then have the routines that
+ need ResolveType to pass null as the emit context.
- .class interface private abstract auto ansi INTERFACE
- {
- .custom instance void [mscorlib]System.Reflection.DefaultMemberAttribute::.ctor(string)
- = ( 01 00 04 49 74 65 6D 00 00 ) // ...Item..
- ...
- }
+DeclareLocal audit
+------------------
-* Interface indexers
+ DeclareLocal is used in various statements. The audit should be done
+ in two steps:
- I have not figured out why the Microsoft version puts an
- `instance' attribute, and I am not generating this `instance' attribute.
+ * Identify all the DeclareLocal calls.
- Explanation: The reason for the `instance' attribute on
- indexers is that indexers only apply to instances
+ * Identify their uses.
-* Constructors
+ * Find if we can make wrapper functions for all of them.
- Currently it calls the parent constructor before initializing fields.
- It should do it the other way around.
+ Then we can move DeclareLocal into a helper class.
-* Use of EmitBranchable
+ This is required to fix foreach in iterators.
- Currently I use brfalse/brtrue in the code for statements, instead of
- using the EmitBranchable function that lives in Binary
+Large project:
+--------------
-* Create an UnimplementedException
+ Drop FindMembers as our API and instead extract all the data
+ out of a type the first time into our own data structures, and
+ use that to navigate and search the type instead of the
+ callback-based FindMembers.
- And use that instead of plain Exceptions to flag compiler errors.
+ Martin has done some of this work with his TypeHandle code
+ that we could use for this.
-* ConvertImplicit
+Notes on memory allocation
+--------------------------
- Currently ConvertImplicit will not catch things like:
+ Outdated:
- - IntLiteral in a float context to generate a -FloatLiteral.
- Instead it will perform an integer load followed by a conversion.
+ A run of the AllocationProfile shows that the compiler allocates roughly
+ 30 megabytes of strings. From those, 20 megabytes come from
+ LookupType.
-* In class.cs: Method.Define
+ See the notes on current_container problems below on memory usage.
- Need to use FindMembers to lookup the member for reporting
- whether a new is needed or not.
+LookupTypeReflection:
+---------------------
-* virtual-method.cs breaks
+ With something like `System.Object', LookupTypeReflection will be called
+ twice: once to find out that `System' is not a type and once
+ for System.Object.
- It breaks on the call to: new B ();
+ This is required because System.Reflection expects nested types to be
+ separated from their enclosing type by a plus sign, not by a dot.
- Where B is a class defined in the source code, my guess is that
- the look for ".ctor" fails
+ A nested class would be My+Class (My being the toplevel, Class the nested one).
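The plus-sign convention can be checked with a few lines of C#. The `My`/`Class` pair mirrors the example above; `NestedNameDemo` and its methods are hypothetical helpers, and the lookup goes through `Assembly.GetType` rather than the compiler's own LookupTypeReflection.

```csharp
using System;
using System.Reflection;

public class My {
    public class Class {
    }
}

public static class NestedNameDemo {
    // Reflection's name for the nested type uses `+', not `.'.
    public static string ReflectionName ()
    {
        return typeof (My.Class).FullName;
    }

    // Looks a name up in the assembly that defines My;
    // returns null when no such type exists.
    public static Type Lookup (string name)
    {
        return typeof (My).Assembly.GetType (name);
    }
}
```

Looking up the dotted spelling of the same name finds nothing, which is why a dotted source name may need more than one probe.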
-* Foreach on structure returns does not work
+ It is interesting to look at the most called lookups when bootstrapping MCS:
- I am generating invalid code instead of calling ldarga for the
- structure, I am calling ldarg:
+ 647 LTR: ArrayList
+ 713 LTR: System.Globalization
+ 822 LTR: System.Object+Expression
+ 904 LTR: Mono.CSharp.ArrayList
+ 976 LTR: System.Runtime.CompilerServices
+ 999 LTR: Type
+ 1118 LTR: System.Runtime
+ 1208 LTR: Mono.CSharp.Type
+ 1373 LTR: Mono.Languages
+ 1599 LTR: System.Diagnostics
+ 2036 LTR: System.Text
+ 2302 LTR: System.Reflection.Emit
+ 2515 LTR: System.Collections
+ 4527 LTR: System.Reflection
+ 22273 LTR: Mono.CSharp
+ 24245 LTR: System
+ 27005 LTR: Mono
- struct X {
- public IEnumerator GetEnumerator ();
- }
+ Analysis:
+ The 9 most frequent lookups (the last 9 rows above) are for namespaces, not types.
+
+ Mono.CSharp.Type happens to be a common lookup: the class Type is
+ used heavily in the compiler, in the default namespace.
+
+ RED FLAG:
+
+ `Type' is also looked up on its own a lot of the time. This happens
+ in parameter declarations, and I am not entirely sure that this is
+ correct (FindType will pass LookupInterfaceOrClass the current_type.FullName,
+ which for some reason is null!). This seems to be a problem with a lost
+ piece of context during FindType.
+
+ System.Object is also used a lot as a toplevel class, and we assume it will
+ have children; we should just shortcut this.
- X x;
+ A cache:
- foreach (object a in x){
- ...
+ Adding a cache and adding a catch for `System.Object' to flag that it won't be the
+ root of a hierarchy reduced the MCS bootstrap time from 10.22 seconds to 8.90 seconds.
+
+ This cache is currently enabled with SIMPLE_SPEEDUP in typemanager.cs. Memory consumption
+ went down from 74 megs to 65 megs with this change.
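A rough C# sketch of such a cache, assuming the important part is remembering misses as well as hits, since most of the hot lookups above are namespaces. `TypeLookupCache` and the `slowLookup` callback are hypothetical names; the code behind SIMPLE_SPEEDUP in typemanager.cs may be organized differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup cache; not the actual SIMPLE_SPEEDUP code.
public sealed class TypeLookupCache {
    readonly Dictionary<string, Type> cache = new Dictionary<string, Type> ();
    readonly Func<string, Type> slowLookup;

    // Number of times the expensive lookup actually ran.
    public int SlowLookups;

    public TypeLookupCache (Func<string, Type> slowLookup)
    {
        this.slowLookup = slowLookup;
    }

    public Type Lookup (string name)
    {
        Type t;
        // A cached null means "known not to be a type" (e.g. a namespace).
        if (cache.TryGetValue (name, out t))
            return t;

        SlowLookups++;
        t = slowLookup (name);
        cache [name] = t;
        return t;
    }
}
```

With this shape, a repeated probe for `System' costs one dictionary hit instead of another trip through reflection.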
+
+Ideas:
+------
+
+ Instead of the hack that *knows* about System.Object not having any children classes,
+ we should just make it simple for a probe to know that there is no need for it.
+
+The use of DottedName
+---------------------
+
+ We could probably use a different system to represent names, like this:
+
+ class Name {
+ string simplename;
+ Name parent;
}
- I need to get the address of that bad boy
+ So `System.ComponentModel' becomes:
-* Using Alias
+ x: (System, null)
+ y: (ComponentModel, x)
- Need to reset the aliases for each compilation unit, so an
- alias defined in a file does not have any effect on another one:
+ The problem is that we would still need to construct the name to pass to
+ GetType.
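A small C# sketch of the parent-linked name, including the step that rebuilds the dotted string for GetType. The member names and `GetFullName` are assumptions for illustration; this is not the compiler's actual class.

```csharp
using System;

// Hypothetical sketch of the parent-linked name described above.
public sealed class Name {
    public readonly string SimpleName;
    public readonly Name Parent;

    public Name (string simpleName, Name parent)
    {
        SimpleName = simpleName;
        Parent = parent;
    }

    // Rebuilds the dotted form, which is still needed for GetType.
    public string GetFullName ()
    {
        if (Parent == null)
            return SimpleName;
        return Parent.GetFullName () + "." + SimpleName;
    }
}
```

The simple names are shared through the parent links, but every call to `GetFullName` still allocates a fresh string, which is the cost the note points at.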
- File.cs
- =======
- namespace A {
- using X = Blah;
+ This has now been implemented; it is called "QualifiedIdentifier".
+
+current_container/current_namespace and the DeclSpace
+-----------------------------------------------------
+
+ We are storing fully qualified names in the DeclSpace instead of the node;
+ this is because `current_namespace' (Namespace) is not a DeclSpace like
+ `current_container'.
- class Z : X { <-- This X is `Blah'
+ The reason for storing the full names today is this:
+
+ namespace X {
+ class Y {
+ }
}
- File2.cs
- namespace {
- class Y : X { <-- This X Is not `Blah'
+ namespace A {
+ class Y {
}
}
- I think we can implement Aliases by having an `Alias' context in all
- the toplevel TypeContainers of a compilation unit. The children typecontainers
- just chain to the parents to resolve the information.
+ The problem is that we only use the namespace stack to track the "prefix"
+ for typecontainers, but namespaces are not typecontainers themselves, so we have
+ to use fully qualified names, because both X.Y and A.Y would be entered
+ in the toplevel type container. If we used the short names, there would be
+ a name clash.
- The driver advances the Alias for each file compiled, so that each file
- has its own alias set.
+ To fix this problem, we have to make namespaces DeclSpaces.
-* Tests
+ The full size, contrasted with the size that could be stored, is:
+ corlib:
+ Size of strings held: 368901
+ Size of strings short: 147863
- Write tests for the various reference conversions. We have
- test for all the numeric conversions.
+ System:
+ Size of strings held: 212677
+ Size of strings short: 97521
+
+ System.XML:
+ Size of strings held: 128055
+ Size of strings short: 35782
+
+ System.Data:
+ Size of strings held: 117896
+ Size of strings short: 36153
+
+ System.Web:
+ Size of strings held: 194527
+ Size of strings short: 58064
+
+ System.Windows.Forms:
+ Size of strings held: 220495
+ Size of strings short: 64923
-* Handle destructors specially
+
+TODO:
- Turn ~X () { a () } into:
- void Finalize () { try { a (); } finally { base.Finalize (); } }
+ 1. Create a "partial" emit context for each TypeContainer.
-* Handle volatile
+ 2. EmitContext should be partially constructed. No IL Generator.
-* Support Re-Throw exceptions:
+ interface_type review.
- try {
- X ();
- } catch (SomeException e){
- LogIt ();
- throw;
- }
+ parameter_array, line 952: `note: must be a single dimension array type'. Validate this
-* Remove the tree dumper
+Dead Code Elimination bugs:
+---------------------------
- And make all the stuff which is `public readonly' be private unless
- required.
+ I should also resolve all the children expressions in Switch, Fixed, Using.
-* Use of lexer.Location in the parser
+Major tasks:
+------------
- Currently we do:
+ Pinned and volatile require type modifiers that cannot be encoded
+ with Reflection.Emit.
- TOKEN nt TERMINAL nt TERMINAL nt3 {
- $$ = new Blah ($2, $4, $6, lexer.Location);
- }
+ Properties and 17.6.3: Finish it.
- This is bad, because the lexer.Location is for the last item in `nt3'
+ Implement base indexer access.
- We need to change that to use this pattern:
+readonly variables and ref/out
+
+BUGS
+----
- TOKEN { $$ = lexer.Location } nt TERMINAL nt TERMINAL nt3 {
- $$ = new Blah ($3, $5, $7, (Location) $2);
- }
+* Check for Final when overriding: if the parent is Final, then we can't
+ allow an override.
- Notice how numbering of the arguments changes, as the { $$ =
- lexer.Location } takes up a number
+* Interface indexers
+
+ I have not figured out why the Microsoft version puts an
+ `instance' attribute, and I am not generating this `instance' attribute.
-* Method Names
+ Explanation: The reason for the `instance' attribute on
+ indexers is that indexers only apply to instances
- Method names could be; `IFACE.NAME' in the method declaration,
- stating that they implement a specific interface method.
+* Break/Continue statements
- We currently fail to parse it.
+ A finally block should reset the InLoop/LoopBegin/LoopEnd, as
+ they are logically outside the scope of the loop.
-* Namespaces
+* Break/continue part 2.
- Apparently:
+ They should transfer control to the finally block if inside a try/catch
+ block.
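For reference, the language-level behavior that the second note has to preserve: leaving a try block through `break' runs the enclosing finally block before control reaches the end of the loop. A small C# sketch with hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class BreakFinallyDemo {
    // Records the order in which the loop body, the finally block
    // and the code after the loop run.
    public static List<string> Run ()
    {
        List<string> log = new List<string> ();
        for (int i = 0; i < 3; i++) {
            try {
                if (i == 1)
                    break;
                log.Add ("body " + i);
            } finally {
                log.Add ("finally " + i);
            }
        }
        log.Add ("after loop");
        return log;
    }
}
```

`Run` returns "body 0", "finally 0", "finally 1", "after loop": the break on the second iteration still goes through the finally block.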
- namespace X {
- }
+* Method Registration and error CS111
- namespace X {
- }
+ The way we use the method registration to signal CS111 is wrong.
+
+ Method registration should only be used to register MethodBuilders;
+ we need an alternative way of checking for duplicates.
- Is failing to create a single namespace
+* beforefieldinit
+
+ // CSC sets beforefieldinit
+ class X {
+     // .cctor will be generated by compiler
+     public static readonly object O = new System.Object ();
+     public static void Main () {}
+ }
+
-* Arrays
+PENDING TASKS
+-------------
- We need to make sure at *compile time* that the arguments in
- the expression list of an array creation are always positive.
+* Merge test-89 and test-34
-* Fix the new parameter mess that we introduced to support ref/out
+* Revisit
-* Reducer and -Literal
+ Primary-expression, as it has now been split into
+ non-array-creation-expression and array-creation-expression.
+
+* Code cleanup
- Maybe we should never handle -Literal in Unary expressions and let
- the reducer take care of it always?
+ The information passed when registering a method in InternalParameters
+ is duplicated; you can always get the types from the InternalParameters.
-* Implement dead code elimination in statement.cs
+* Emit modreq for volatiles
- It is pretty simple to implement dead code elimination in
- if/do/while
+ Handle modreq from public APIs.
-* Implement short circuit evaluation.
+* Emit `pinned' for pinned local variables.
+
+ Both `modreq' and pinned will require special hacks in the compiler.
+
+* Make sure that we are pinning the right variable
+
+* Merge tree.cs, rootcontext.cs
+
+OPTIMIZATIONS
+-------------
+
+* User Defined Conversions makes far too many calls to compute union sets that are not needed
+
+* Add test case for destructors
+
+* Places that use `Ldelema' are basically places where I will be
+ initializing a value type. I could apply an optimization to
+ disable the implicit local temporary from being created (by using
+ the method in New).
+
+* Dropping TypeContainer as an argument to EmitContext
+
+ My theory is that I can get rid of the TypeBuilder completely from
+ the EmitContext, and have typecasts where it is used (from
+ DeclSpace to where it matters).
+
+ The only pending problem is that the code that implements Aliases
+ is on TypeContainer, and probably should go in DeclSpace.
+
+* Use of local temporary in UnaryMutator
+
+ We should get rid of the LocalTemporary there for some cases.
+
+ This turns out to be very complex, at least for the post-version,
+ because of this case:
+
+ a = i++
+
+ To produce optimal code, it is necessary for UnaryMutator to know
+ that it is being assigned to a variable (the way the stack is laid
+ out using dup requires the store to happen inside UnaryMutator).
+
+* Tests
+
+ Write tests for the various reference conversions. We have
+ tests for all the numeric conversions.
+
+* Optimizations
+
+ In Indexers and Properties, probably support an EmitWithDup
+ that emits the code to call Get and then leaves a this pointer
+ on the stack, so that later a Store can be emitted using that
+ this pointer (consider Property++ or Indexer++).
+
+* Optimizations: variable allocation.
+
+ When local variables of a type are required, we should request
+ the variable and later release it when we are done, so that
+ the same local variable slot can be reused later on.
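A minimal C# sketch of the request/release scheme, assuming slots are pooled per type. `LocalSlotPool` is a hypothetical name, and slots are plain indices here, where a real emitter would hand out the LocalBuilder objects returned by DeclareLocal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool of reusable local variable slots, keyed by type.
public sealed class LocalSlotPool {
    readonly Dictionary<Type, Stack<int>> free = new Dictionary<Type, Stack<int>> ();
    int next;

    // Total number of distinct slots handed out so far.
    public int SlotsDeclared {
        get { return next; }
    }

    // Reuses a released slot of the same type, or declares a new one.
    public int Request (Type t)
    {
        Stack<int> s;
        if (free.TryGetValue (t, out s) && s.Count > 0)
            return s.Pop ();
        return next++;
    }

    // Marks a slot as available for later requests of the same type.
    public void Release (Type t, int slot)
    {
        Stack<int> s;
        if (!free.TryGetValue (t, out s)) {
            s = new Stack<int> ();
            free [t] = s;
        }
        s.Push (slot);
    }
}
```

Keying the pool by type matters because a slot declared for one type cannot be handed out for another.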
+
+* Add a cache for the various GetArrayMethod operations.
+
+* MakeUnionSet Callers
+
+ If the types are the same, there is no need to compute the unionset,
+ we can just use the list from one of the types.
+
+* Factor the lookup code for class declarations and interfaces
+ (interface.cs:GetInterfaceByName)
+
+RECOMMENDATIONS
+---------------
+
+* Use of lexer.Location in the parser
+
+ Currently we do:
+
+ TOKEN nt TERMINAL nt TERMINAL nt3 {
+ $$ = new Blah ($2, $4, $6, lexer.Location);
+ }
+
+ This is bad, because the lexer.Location is for the last item in `nt3'
+
+ We need to change that to use this pattern:
+
+ TOKEN { oob_stack.Push (lexer.Location) } nt TERMINAL nt TERMINAL nt3 {
+ $$ = new Blah ($3, $5, $7, (Location) oob_stack.Pop ());
+ }
- Currently our and/or operations do not implement short circuit
- evaluation.
+ Notice how numbering of the arguments changes as the
+ { oob_stack.Push (lexer.Location) } takes a "slot" in the productions.
-* Foreach and arrays
+* local_variable_declaration
- Support foreach (T t in Array)
+ Not sure that this grammar is correct; we might have to
+ resolve this during semantic analysis.
- And optimize to deal with the array rather than getting the enumerator
- out of it.
\ No newline at end of file