mono/profiler/log-profiler.txt

   1 # The Mono log profiler
   2
   3 The Mono *log* profiler can be used to collect a lot of information about
   4 a program running in the Mono runtime. This data can be used (both while the process
   5 is running and later) to do analyses of the program behaviour, determine
   6 resource usage, performance issues or even look for particular execution patterns.
   7
   8 This is accomplished by logging the events provided by the Mono runtime through the
   9 profiling interface and periodically writing them to a file which can be later
  10 inspected with the command line *mprof-report* program or with a GUI (not developed yet).
  11
  12 The events collected include (among others):
  13
  14 * method enter and leave
  15 * object allocation
  16 * garbage collection
  17 * JIT compilation
  18 * metadata loading
  19 * lock contention
  20 * exceptions
  21
  22 In addition, the profiler can periodically collect info about all the objects
  23 present in the heap at the end of a garbage collection (this is called heap shot
  24 and currently implemented only for the sgen garbage collector).
  25
  26 ## Basic profiler usage
  27
  28 The simpler way to use the profiler is the following:
  29
  30 `mono --profile=log program.exe`
  31
  32 At the end of the execution the file *output.mlpd* will be found in the current
  33 directory. A summary report of the data can be printed by running:
  34
  35 `mprof-report output.mlpd`
  36
  37 With this invocation a huge amount of data is collected about the program execution
  38 and collecting and saving this data can significantly slow down program execution.
  39 If saving the profiling data is not needed, a report can be generated directly with:
  40
  41 `mono --profile=log:report program.exe`
  42
  43 If the information about allocations is not of interest, it can be excluded:
  44
  45 `mono --profile=log:noalloc program.exe`
  46
  47 On the other hand, if method call timing is not important, while allocations are,
  48 the needed info can be gathered with:
  49
  50 `mono --profile=log:nocalls program.exe`
  51
  52 You will still be able to inspect information about the sequence of calls that lead
  53 to each allocation because at each object allocation a stack trace is collected as well.
  54
  55 To periodically collect heap shots (and exclude method and allocation events) use the
  56 following options (making sure you run with the sgen garbage collector):
  57
  58 `mono --gc=sgen --profile=log:heapshot program.exe`
  59
  60 ## Profiler option documentation
  61
  62 By default the *log* profiler will gather all the events provided by the Mono runtime
  63 and write them to a file named *output.mlpd*. When no option is specified, it
  64 is equivalent to using:
  65
  66 `--profile=log:calls,alloc,output=output.mlpd,maxframes=8,calldepth=100`
  67
  68 The following options can be used to modify this default behaviour. Each option
  69 is separated from the next by a `,` character, with no spaces and all the options
  70 are included after the *log:* profile module specifier.
  71
  72 * *help*: display concise help info about each available option
  73
  74 * *[no]alloc*: *noalloc* disables collecting object allocation info, *alloc* enables
  75 it if it was disabled by another option like *heapshot*.
  76
  77 * *[no]calls*: *nocalls* disables collecting method enter and leave events. When this
  78 option is used at each object allocation and at some other events (like lock contentions
  79 and exception throws) a stack trace is collected by default. See the *maxframes* option to
  80 control this behaviour. *calls* enables method enter/leave events if they were disabled
  81 by another option like *heapshot*.
  82
  83 * *heapshot*: collect heap shot data at each major collection. The frequency of the
  84 heap shots can be changed with the *hsmode* option below. When this option is used
  85 allocation events and method enter/leave events are not recorded by default: if they
  86 are needed, they need to be enabled explicitly.
  87
  88 * *hsmode=MODE*: modify the default heap shot frequency according to MODE.
  89 hsmode can be used multiple times with different modes: in that case a heap shot is
  90 taken if either of the conditions are met.
  91 MODE can be one of:
  92         * *NUM*ms: perform a heap shot if at least *NUM* milliseconds passed since
  93         the last one.
  94         * *NUM*gc: perform a heap shot every *NUM* garbage collections (either
  95         minor or major).
  96
  97
  98 * *time=TIMER*: use the TIMER timestamp mode. TIMER can have the following values:
  99         * *fast*: a usually faster but possibly more inaccurate timer
 100
 101 * *maxframes=NUM*: when a stack trace needs to be performed, collect *NUM* frames
 102 at the most. The default is 8.
 103
 104 * *calldepth=NUM*: ignore method enter/leave events when the call chain depth is
 105 bigger than NUM.
 106
 107 * *zip*: automatically compress the output data in gzip format.
 108
 109 * *output=OUTSPEC*: instead of writing the profiling data to the output.mlpd file,
 110 substitute *%p* in *OUTSPEC* with the current process id and *%t* with the current
 111 date and time, then do according to *OUTSPEC*:
 112         * if *OUTSPEC* begins with a *|* character, execute the rest as a program
 113         and feed the data to its standard input
 114         * if *OUTSPEC* begins with a *-* character, use the rest of OUTSPEC as
 115         the filename, but force overwrite any existing file by that name
 116         * otherwise write the data the the named file: note that is a file by that
 117         name already exists, a warning is issued and profiling is disabled.
 118
 119 * *report*: the profiling data is sent to mprof-report, which will print a summary
 120 report. This is equivalent to the option: `output=mprof-report -`. If the *output*
 121 option is specified as well, the report will be written to the output file instead of
 122 the console.
 123
 124 ## Analyzing the profile data
 125
 126 Currently there is a command line program (*mprof-report*) to analyze the
 127 data produced by the profiler. This is ran automatically when the *report*
 128 profiler option is used.
 129 Simply run:
 130
 131 `mprof-report output.mlpd`
 132
 133 to see a summary report of the data included in the file.
 134
 135 ### Trace information for events
 136
 137 Often it is important for some events, like allocations, lock contention
 138 and exception throws to know where they happened. Or we may want to see
 139 what sequence of calls leads to a particular method invocation. To see this
 140 info invoke mprof-report as follows:
 141
 142 `mprof-report --traces output.mlpd`
 143
 144 The maximum number of methods in each stack trace can be specified with the
 145 *--maxframes=NUM* option:
 146
 147 `mprof-report --traces --maxframes=4 output.mlpd`
 148
 149 The stack trace info will be available if method enter/leave events have been
 150 recorded or if stack trace collection wasn't explicitly disabled with the
 151 *maxframes=0* profiler option. Note that the profiler will collect up to 8
 152 frames by default at specific events when the *nocalls* option is used, so
 153 in that case, if more stack frames are required in mprof-report, a bigger
 154 value for maxframes when profiling must be used, too.
 155
 156 The *--traces* option also controls the reverse reference feature in the heapshot
 157 report: for each class it reports how many references to objects of that class
 158 come from other classes.
 159
 160 ### Sort order for methods and allocations
 161
 162 When a list of methods is printed the default sort order is based on the total time
 163 spent in the method. This time is wall clock time (that is, it includes the time
 164 spent, for example, in a sleep call, even if actual cpu time would be basically 0).
 165 Also, if the method has been ran on different threads, the time will be a sum
 166 of the time used in each thread.
 167
 168 To change the sort order, use the option:
 169
 170 `--method-sort=MODE`
 171
 172 where *MODE* can be:
 173
 174 * *self*: amount of time spent in the method itself and not in its callees
 175 * *calls*: the number of method invocations
 176 * *total*: the total time spent in the method.
 177
 178 Object allocation lists are sorted by default depending on the total amount
 179 of bytes used by each type.
 180
 181 To change the sort order of object allocations, use the option:
 182
 183 `--alloc-sort=MODE`
 184
 185 where *MODE* can be:
 186
 187 * *count*: the number of allocated objects of the given type
 188 * *bytes*: the total number of bytes used by objects of the given type
 189
 190 ### Selecting what data to report
 191
 192 The profiler by default collects data about many runtime subsystems and mprof-report
 193 prints a summary of all the subsystems that are found in the data file. It is possible
 194 to tell mprof-report to only show information about some of them with the following
 195 option:
 196
 197 `--reports=R1[,R2...]`
 198
 199 where the report names R1, R2 etc. can be:
 200
 201 * *gc*: garbage collection information
 202 * *alloc*: object allocation information
 203 * *call*: method profiling information
 204 * *metadata*: metadata events like image loads
 205 * *exception*: exception throw and handling information
 206 * *monitor*: lock contention information
 207 * *thread*: thread information
 208 * *heapshot*: live heap usage at heap shots
 209
 210 It is possible to limit some of the data displayed to a timeframe of the
 211 program execution with the option:
 212
 213 `--time=FROM-TO`
 214
 215 where *FROM* and *TO* are seconds since application startup (they can be
 216 floating point numbers).
 217
 218 Another interesting option is to consider only events happening on a particular
 219 thread with the following option:
 220
 221 `--thread=THREADID`
 222
 223 where *THREADID* is one of the numbers listed in the thread summary report
 224 (or a thread name when present).
 225
 226 By default long lists of methods or other information like object allocations
 227 are limited to the most important data. To increase the amount of information
 228 printed you can use the option:
 229
 230 `--verbose`
 231
 232 ### Track individual objects
 233
 234 Instead of printing the usual reports from the profiler data, it is possible
 235 to track some interesting information about some specific object addresses.
 236 The objects are selected based on their address with the *--track* option as follows:
 237
 238 `--track=0xaddr1[,0xaddr2,...]`
 239
 240 The reported info (if available in the data file), will be class name, size,
 241 creation time, stack trace of creation (with the *--traces* option), etc.
 242 If heapshot data is available it will be possible to also track what other objects
 243 reference one of the listed addresses.
 244
 245 The object addresses can be gathered either from the profiler report in some
 246 cases (like in the monitor lock report), from the live application or they can
 247 be selected with the *--find=FINDSPEC* option. FINDSPEC can be one of the
 248 following:
 249
 250 * *S:SIZE*: where the object is selected if it's size is at least *SIZE*
 251 * *T:NAME*: where the object is selected if *NAME* partially matches its class name
 252
 253 This option can be specified multiple times with one of the different kinds
 254 of FINDSPEC. For example, the following:
 255
 256 `--find=S:10000 --find=T:Byte[]`
 257
 258 will find all the byte arrays that are at least 10000 bytes in size.
 259
 260 ### Saving a profiler report
 261
 262 By default mprof-report will print the summary data to the console.
 263 To print it to a file, instead, use the option:
 264
 265 `--out=FILENAME`
 266
 267 ## Dealing with profiler slowness
 268
 269 If the profiler needs to collect lots of data, the execution of the program will
 270 slow down significantly, usually 10 to 20 times slower. There are several
 271 ways to reduce the impact of the profiler on the program execution.
 272
 273 ### Collect less data
 274
 275 Collecting method enter/leave events can be very expensive, especially in programs
 276 that perform many millions of tiny calls. The profiler option *nocalls* can be
 277 used to avoid collecting this data or it can be limited to only a few call levels
 278 with the *calldepth* option.
 279
 280 Object allocation information is expensive as well, though much less than
 281 method enter/leave events. If it's not needed, it can be skipped with the
 282 *noalloc* profiler option. Note that when method enter/leave events are
 283 discarded, by default stack traces are collected at each allocation and this
 284 can be expensive as well. The impact of stack trace information can be reduced
 285 by setting a low value with the *maxframes* option or by eliminating them
 286 completely, by setting it to 0.
 287
 288 The other major source of data is the heapshot profiler option: especially
 289 if the managed heap is big, since every object needs to be inspected. The *hsmode*
 290 option can be used to reduce the frequency of the heap shots.
 291
 292 ### Reduce the timestamp overhead
 293
 294 On many operating systems or architectures what actually slows down profiling
 295 is the function provided by the system to get timestamp information.
 296 The *time=fast* profiler option can be usually used to speed up this operation,
 297 but, depending on the system, time accounting may have some level of approximation
 298 (though statistically the data should be still fairly valuable).
 299
 300 ### Use a statistical profiler instead
 301
 302 See the mono manpage for the use of a statistical (sampling) profiler.
 303 The *log* profiler will be enhanced to provide sampling info in the future.
 304
 305 ## Dealing with the size of the data files
 306
 307 When collecting a lot of information about a profiled program, huge data
 308 files can be generated. There are a few ways to minimize the amount of data,
 309 for example by not collecting some of the more space-consuming information
 310 or by compressing the information on the fly or by just generating a summary
 311 report.
 312
 313 ### Reducing the amount of data
 314
 315 Method enter/leave events can be excluded completely with the *nocalls* option
 316 or they can be limited to just a few levels of calls with the *calldepth* option.
 317 For example, the option:
 318
 319 `calldepth=10`
 320
 321 will ignore the method events when there are more than 10 managed stack frames.
 322 This is very useful for programs that have deep recursion or for programs that
 323 perform many millions of tiny calls deep enough in the call stack. The optimal
 324 number for the calldepth option depends on the program and it needs to be balanced
 325 between providing enough profiling information and allowing fast execution speed.
 326
 327 Note that by default, if method events are not recorded at all, the profiler will
 328 collect stack trace information at events like allocations. To avoid gathering this
 329 data, use the *maxframes=0* profiler option.
 330
 331 Allocation events can be eliminated with the *noalloc* option.
 332
 333 Heap shot data can also be huge: by default it is collected at each major collection.
 334 To reduce the frequency, you can use the *hsmode* profiler option to collect for example
 335 every 5 collections (including major and minor):
 336
 337 `hsmode=5gc`
 338
 339 or when at least 5 seconds passed since the last heap shot:
 340
 341 `hsmode=5000ms`
 342
 343 ### Compressing the data
 344
 345 To reduce the amout of disk space used by the data, the data can be compressed
 346 either after it has been generated with the gzip command:
 347
 348 `gzip -9 output.mlpd`
 349
 350 or it can be compressed automatically by using the *zip* profiler option. Note
 351 that in this case there could be a significant slowdown of the profiled program.
 352
 353 The mprof-report program will tranparently deal with either compressed or
 354 uncompressed data files.
 355
 356 ### Generating only a summary report
 357
 358 Often it's enough to look at the profiler summary report to diagnose an issue and in this
 359 case it's possible to avoid saving the profiler data file to disk. This can be
 360 accomplished with the *report* profiler option, which will basically send the data
 361 to the mprof-report program for display.
 362
 363 To have more control of what summary information is reported (or to use a completely
 364 different program to decode the profiler data), the *output* profiler option can be
 365 used, with `|` as the first character: the rest of the output name will be
 366 executed as a program with the data fed in on the standard input.
 367
 368 For example, to print only the Monitor summary with stack trace information, you
 369 could use it like this:
 370
 371 `output=|mprof-report --reports=monitor --traces -`
 372