.TH mprof-report 1 ""
.SH The Mono log profiler
.PP
The Mono \f[I]log\f[] profiler can be used to collect a lot of
information about a program running in the Mono runtime.
This data can be used (both while the process is running and later)
to analyze the program behaviour, determine resource usage, find
performance issues or even look for particular execution patterns.
.PP
This is accomplished by logging the events provided by the Mono
runtime through the profiling interface and periodically writing
them to a file which can be later inspected with the command line
\f[I]mprof-report\f[] program or with a GUI (not developed yet).
.PP
The events collected include (among others):
.IP \[bu] 2
method enter and leave
.IP \[bu] 2
object allocation
.IP \[bu] 2
garbage collection
.IP \[bu] 2
JIT compilation
.IP \[bu] 2
metadata loading
.IP \[bu] 2
lock contention
.IP \[bu] 2
exceptions
.PP
In addition, the profiler can periodically collect info about all
the objects present in the heap at the end of a garbage collection
(this is called a heap shot and is currently implemented only for
the sgen garbage collector).
.SS Basic profiler usage
.PP
The simplest way to use the profiler is the following:
.PP
\f[B]mono\ --profile=log\ program.exe\f[]
.PP
At the end of the execution the file \f[I]output.mlpd\f[] will be
found in the current directory.
A summary report of the data can be printed by running:
.PP
\f[B]mprof-report\ output.mlpd\f[]
.PP
With this invocation a huge amount of data is collected about the
program execution, and collecting and saving this data can
significantly slow down the program.
If saving the profiling data is not needed, a report can be
generated directly with:
.PP
\f[B]mono\ --profile=log:report\ program.exe\f[]
.PP
If the information about allocations is not of interest, it can be
excluded:
.PP
\f[B]mono\ --profile=log:noalloc\ program.exe\f[]
.PP
On the other hand, if method call timing is not important, while
allocations are, the needed info can be gathered with:
.PP
\f[B]mono\ --profile=log:nocalls\ program.exe\f[]
.PP
You will still be able to inspect information about the sequence of
calls that lead to each allocation because at each object
allocation a stack trace is collected as well.
.PP
To periodically collect heap shots (and exclude method and
allocation events) use the following options (making sure you run
with the sgen garbage collector):
.PP
\f[B]mono\ --gc=sgen\ --profile=log:heapshot\ program.exe\f[]
.SS Profiler option documentation
.PP
By default the \f[I]log\f[] profiler will gather all the events
provided by the Mono runtime and write them to a file named
\f[I]output.mlpd\f[].
When no option is specified, it is equivalent to using:
.PP
\f[B]--profile=log:calls,alloc,output=output.mlpd,maxframes=8,calldepth=100\f[]
.PP
The following options can be used to modify this default behaviour.
Each option is separated from the next by a \f[B],\f[] character,
with no spaces, and all the options are included after the
\f[I]log:\f[] profile module specifier.
.IP \[bu] 2
\f[I]help\f[]: display concise help info about each available
option
.IP \[bu] 2
\f[I][no]alloc\f[]: \f[I]noalloc\f[] disables collecting object
allocation info, \f[I]alloc\f[] enables it if it was disabled by
another option like \f[I]heapshot\f[].
.IP \[bu] 2
\f[I][no]calls\f[]: \f[I]nocalls\f[] disables collecting method
enter and leave events.
When this option is used, a stack trace is collected by default at
each object allocation and at some other events (like lock
contentions and exception throws).
See the \f[I]maxframes\f[] option to control this behaviour.
\f[I]calls\f[] enables method enter/leave events if they were
disabled by another option like \f[I]heapshot\f[].
.IP \[bu] 2
\f[I]heapshot\f[]: collect heap shot data at each major collection.
The frequency of the heap shots can be changed with the
\f[I]hsmode\f[] option below.
When this option is used, allocation events and method enter/leave
events are not recorded by default: if they are needed, they need
to be enabled explicitly.
.IP \[bu] 2
\f[I]hsmode=MODE\f[]: modify the default heap shot frequency
according to MODE.
hsmode can be used multiple times with different modes: in that
case a heap shot is taken if any of the conditions is met.
MODE can be one of:
.RS 2
.IP \[bu] 2
\f[I]NUM\f[]ms: perform a heap shot if at least \f[I]NUM\f[]
milliseconds have passed since the last one.
.IP \[bu] 2
\f[I]NUM\f[]gc: perform a heap shot every \f[I]NUM\f[] garbage
collections (either minor or major).
.RE
.IP \[bu] 2
\f[I]time=TIMER\f[]: use the TIMER timestamp mode.
TIMER can have the following values:
.RS 2
.IP \[bu] 2
\f[I]fast\f[]: a usually faster but possibly more inaccurate timer
.RE
.IP \[bu] 2
\f[I]maxframes=NUM\f[]: when a stack trace needs to be performed,
collect \f[I]NUM\f[] frames at the most.
The default is 8.
.IP \[bu] 2
\f[I]calldepth=NUM\f[]: ignore method enter/leave events when the
call chain depth is bigger than NUM.
.IP \[bu] 2
\f[I]zip\f[]: automatically compress the output data in gzip
format.
.IP \[bu] 2
\f[I]output=OUTSPEC\f[]: instead of writing the profiling data to
the output.mlpd file, do according to \f[I]OUTSPEC\f[]:
.RS 2
.IP \[bu] 2
if \f[I]OUTSPEC\f[] begins with a \f[I]|\f[] character, execute the
rest as a program and feed the data to its standard input
.IP \[bu] 2
otherwise write the data to the named file
.RE
.IP \[bu] 2
\f[I]report\f[]: the profiling data is sent to mprof-report, which
will print a summary report.
This is equivalent to the option: \f[B]output=|mprof-report\ -\f[].
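.PP
As a minimal sketch of how the comma-separated option list described
above is assembled, the following shell function joins its arguments
with commas and prints the resulting \f[B]--profile\f[] argument
without running anything.
The name \f[I]profile_arg\f[] is a hypothetical helper (it is not part
of Mono) and \f[I]run1.mlpd\f[] is an arbitrary example file name.
.PP
.nf
```shell
# profile_arg: hypothetical helper, not part of Mono.
# Joins its arguments with "," and prefixes the log: module specifier.
profile_arg() {
    (
        IFS=,
        echo "--profile=log:$*"
    )
}

# Print (do not run) a command line that disables method enter/leave
# events, raises the stack trace depth and names the output file.
echo mono "$(profile_arg nocalls maxframes=16 output=run1.mlpd)" program.exe
```
.fi
.PP
The subshell keeps the change to IFS local to the helper, so the rest
of the script is not affected.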
.SS Analyzing the profile data
.PP
Currently there is a command line program (\f[I]mprof-report\f[])
to analyze the data produced by the profiler.
This is run automatically when the \f[I]report\f[] profiler option
is used.
Simply run:
.PP
\f[B]mprof-report\ output.mlpd\f[]
.PP
to see a summary report of the data included in the file.
.SS Trace information for events
.PP
For some events, like allocations, lock contention and exception
throws, it is often important to know where they happened.
Or we may want to see what sequence of calls leads to a particular
method invocation.
To see this info invoke mprof-report as follows:
.PP
\f[B]mprof-report\ --traces\ output.mlpd\f[]
.PP
The maximum number of methods in each stack trace can be specified
with the \f[I]--maxframes=NUM\f[] option:
.PP
\f[B]mprof-report\ --traces\ --maxframes=4\ output.mlpd\f[]
.PP
The stack trace info will be available if method enter/leave events
have been recorded or if stack trace collection wasn't explicitly
disabled with the \f[I]maxframes=0\f[] profiler option.
Note that the profiler will collect up to 8 frames by default at
specific events when the \f[I]nocalls\f[] option is used, so in
that case, if more stack frames are required in mprof-report, a
bigger value for \f[I]maxframes\f[] must be used when profiling,
too.
.PP
The \f[I]--traces\f[] option also controls the reverse reference
feature in the heapshot report: for each class it reports how many
references to objects of that class come from other classes.
.SS Sort order for methods and allocations
.PP
When a list of methods is printed the default sort order is based
on the total time spent in the method.
This time is wall clock time (that is, it includes the time spent,
for example, in a sleep call, even if actual cpu time would be
basically 0).
Also, if the method has been run on different threads, the time
will be a sum of the time used in each thread.
.PP
To change the sort order, use the option:
.PP
\f[B]--method-sort=MODE\f[]
.PP
where \f[I]MODE\f[] can be:
.IP \[bu] 2
\f[I]self\f[]: amount of time spent in the method itself and not in
its callees
.IP \[bu] 2
\f[I]calls\f[]: the number of method invocations
.IP \[bu] 2
\f[I]total\f[]: the total time spent in the method.
.PP
Object allocation lists are sorted by default depending on the
total amount of bytes used by each type.
.PP
To change the sort order of object allocations, use the option:
.PP
\f[B]--alloc-sort=MODE\f[]
.PP
where \f[I]MODE\f[] can be:
.IP \[bu] 2
\f[I]count\f[]: the number of allocated objects of the given type
.IP \[bu] 2
\f[I]bytes\f[]: the total number of bytes used by objects of the
given type
.SS Selecting what data to report
.PP
The profiler by default collects data about many runtime subsystems
and mprof-report prints a summary of all the subsystems that are
found in the data file.
It is possible to tell mprof-report to only show information about
some of them with the following option:
.PP
\f[B]--reports=R1[,R2...]\f[]
.PP
where the report names R1, R2 etc.
can be:
.IP \[bu] 2
\f[I]gc\f[]: garbage collection information
.IP \[bu] 2
\f[I]alloc\f[]: object allocation information
.IP \[bu] 2
\f[I]call\f[]: method profiling information
.IP \[bu] 2
\f[I]metadata\f[]: metadata events like image loads
.IP \[bu] 2
\f[I]exception\f[]: exception throw and handling information
.IP \[bu] 2
\f[I]monitor\f[]: lock contention information
.IP \[bu] 2
\f[I]thread\f[]: thread information
.IP \[bu] 2
\f[I]heapshot\f[]: live heap usage at heap shots
.PP
It is possible to limit some of the data displayed to a timeframe
of the program execution with the option:
.PP
\f[B]--time=FROM-TO\f[]
.PP
where \f[I]FROM\f[] and \f[I]TO\f[] are seconds since application
startup (they can be floating point numbers).
.PP
Another interesting option is to consider only events happening on
a particular thread with the following option:
.PP
\f[B]--thread=THREADID\f[]
.PP
where \f[I]THREADID\f[] is one of the numbers listed in the thread
summary report (or a thread name when present).
.PP
By default long lists of methods or other information like object
allocations are limited to the most important data.
To increase the amount of information printed you can use the
option:
.PP
\f[B]--verbose\f[]
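.PP
The selection options above can be combined on one command line.
The following sketch only prints such a command line, so it can be
inspected before it is run.
The function name \f[I]report_cmd\f[] and its argument order are
assumptions made for this example, they are not part of mprof-report.
.PP
.nf
```shell
# report_cmd: hypothetical helper that prints an mprof-report command line.
# Arguments: REPORTS FROM TO FILE (FROM and TO in seconds since startup).
report_cmd() {
    echo "mprof-report --reports=$1 --time=$2-$3 --verbose $4"
}

# Allocation and GC data for the window between 2.5 and 10 seconds.
report_cmd alloc,gc 2.5 10 output.mlpd
```
.fi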
.SS Track individual objects
.PP
Instead of printing the usual reports from the profiler data, it is
possible to track some interesting information about some specific
object addresses.
The objects are selected based on their address with the
\f[I]--track\f[] option as follows:
.PP
\f[B]--track=0xaddr1[,0xaddr2,...]\f[]
.PP
The reported info (if available in the data file) will be class
name, size, creation time, stack trace of creation (with the
\f[I]--traces\f[] option), etc.
If heapshot data is available it will be possible to also track
what other objects reference one of the listed addresses.
.PP
The object addresses can be gathered from the profiler report in
some cases (like in the monitor lock report), from the live
application, or they can be selected with the
\f[I]--find=FINDSPEC\f[] option.
FINDSPEC can be one of the following:
.IP \[bu] 2
\f[I]S:SIZE\f[]: where the object is selected if its size is at
least \f[I]SIZE\f[] bytes
.IP \[bu] 2
\f[I]T:NAME\f[]: where the object is selected if \f[I]NAME\f[]
partially matches its class name
.PP
This option can be specified multiple times with one of the
different kinds of FINDSPEC.
For example, the following:
.PP
\f[B]--find=S:10000\ --find=T:Byte[]\f[]
.PP
will find all the byte arrays that are at least 10000 bytes in
size.
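.PP
Square brackets are pattern characters in most shells, so a class
name such as Byte[] is safer when quoted.
The sketch below prints the two options from the example above.
The function name \f[I]find_args\f[] is a hypothetical helper and is
not part of mprof-report.
.PP
.nf
```shell
# find_args: hypothetical helper that prints --find options for objects of
# at least SIZE bytes whose class name partially matches NAME.
find_args() {
    echo "--find=S:$1 --find=T:$2"
}

# The quotes stop the shell from treating [] as a glob pattern.
find_args 10000 "Byte[]"
```
.fi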
.SS Saving a profiler report
.PP
By default mprof-report will print the summary data to the console.
To print it to a file, instead, use the option:
.PP
\f[B]--out=FILENAME\f[]
.SS Dealing with profiler slowness
.PP
If the profiler needs to collect lots of data, the execution of the
program will slow down significantly, usually becoming 10 to 20
times slower.
There are several ways to reduce the impact of the profiler on the
program execution.
.SS Collect less data
.PP
Collecting method enter/leave events can be very expensive,
especially in programs that perform many millions of tiny calls.
The profiler option \f[I]nocalls\f[] can be used to avoid
collecting this data, or it can be limited to only a few call
levels with the \f[I]calldepth\f[] option.
.PP
Object allocation information is expensive as well, though much
less than method enter/leave events.
If it's not needed, it can be skipped with the \f[I]noalloc\f[]
profiler option.
Note that when method enter/leave events are discarded, by default
stack traces are collected at each allocation and this can be
expensive as well.
The impact of stack trace information can be reduced by setting a
low value with the \f[I]maxframes\f[] option or by eliminating them
completely, by setting it to 0.
.PP
The other major source of data is the heapshot profiler option,
especially if the managed heap is big, since every object needs to
be inspected.
The \f[I]hsmode\f[] option can be used to reduce the frequency of
the heap shots.
.SS Reduce the timestamp overhead
.PP
On many operating systems or architectures what actually slows down
profiling is the function provided by the system to get timestamp
information.
The \f[I]time=fast\f[] profiler option can usually be used to speed
up this operation, but, depending on the system, time accounting
may have some level of approximation (though statistically the data
should still be fairly valuable).
.SS Use a statistical profiler instead
.PP
See the mono manpage for the use of a statistical (sampling)
profiler.
The \f[I]log\f[] profiler will be enhanced to provide sampling info
in the future.
.SS Dealing with the size of the data files
.PP
When collecting a lot of information about a profiled program, huge
data files can be generated.
There are a few ways to minimize the amount of data, for example by
not collecting some of the more space-consuming information, by
compressing the information on the fly or by just generating a
summary report.
.SS Reducing the amount of data
.PP
Method enter/leave events can be excluded completely with the
\f[I]nocalls\f[] option or they can be limited to just a few levels
of calls with the \f[I]calldepth\f[] option.
For example, the option:
.PP
\f[B]calldepth=10\f[]
.PP
will ignore the method events when there are more than 10 managed
stack frames.
This is very useful for programs that have deep recursion or for
programs that perform many millions of tiny calls deep enough in
the call stack.
The optimal number for the calldepth option depends on the program
and it needs to be balanced between providing enough profiling
information and allowing fast execution speed.
.PP
Note that by default, if method events are not recorded at all, the
profiler will collect stack trace information at events like
allocations.
To avoid gathering this data, use the \f[I]maxframes=0\f[] profiler
option.
.PP
Allocation events can be eliminated with the \f[I]noalloc\f[]
option.
.PP
Heap shot data can also be huge: by default it is collected at each
major collection.
To reduce the frequency, you can use the \f[I]hsmode\f[] profiler
option to collect for example every 5 collections (including major
and minor):
.PP
\f[B]hsmode=5gc\f[]
.PP
or when at least 5 seconds have passed since the last heap shot:
.PP
\f[B]hsmode=5000ms\f[]
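.PP
Because the time based mode is expressed in milliseconds, a value in
seconds has to be multiplied by 1000.
The following sketch does that conversion and prints a profiler
argument that uses both modes at once.
The function name \f[I]hsmode_seconds\f[] is a hypothetical helper and
the value of 20 collections is an arbitrary example.
.PP
.nf
```shell
# hsmode_seconds: hypothetical helper that converts a whole number of
# seconds into an hsmode=NUMms option (1 second = 1000 milliseconds).
hsmode_seconds() {
    echo "hsmode=$(( $1 * 1000 ))ms"
}

# Heap shot when 5 seconds have passed or every 20 collections,
# whichever condition is met first.
echo "--profile=log:heapshot,$(hsmode_seconds 5),hsmode=20gc"
```
.fi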
.SS Compressing the data
.PP
To reduce the amount of disk space used by the data, the data can
be compressed either after it has been generated with the gzip
command:
.PP
\f[B]gzip\ -9\ output.mlpd\f[]
.PP
or it can be compressed automatically by using the \f[I]zip\f[]
profiler option.
Note that in this case there could be a significant slowdown of the
profiled program.
.PP
The mprof-report program will transparently deal with either
compressed or uncompressed data files.
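.PP
Note that gzip replaces the original file with one that has a .gz
suffix.
The sketch below shows this on a placeholder file in a temporary
directory (the file content is not real profiler data) and checks the
result with gzip -t.
.PP
.nf
```shell
# Stand-in for a real profiler data file, created in a temporary directory.
dir=$(mktemp -d)
echo "placeholder profiler data" > "$dir/output.mlpd"

# gzip replaces output.mlpd with output.mlpd.gz.
gzip -9 "$dir/output.mlpd"

# gzip -t checks the integrity of the compressed file without extracting it.
gzip -t "$dir/output.mlpd.gz" && echo "compressed file is intact"
```
.fi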
.SS Generating only a summary report
.PP
Often it's enough to look at the profiler summary report to
diagnose an issue and in this case it's possible to avoid saving
the profiler data file to disk.
This can be accomplished with the \f[I]report\f[] profiler option,
which will basically send the data to the mprof-report program for
display.
.PP
To have more control of what summary information is reported (or to
use a completely different program to decode the profiler data),
the \f[I]output\f[] profiler option can be used, with \f[B]|\f[] as
the first character: the rest of the output name will be executed
as a program with the data fed in on the standard input.
.PP
For example, to print only the Monitor summary with stack trace
information, you could use it like this:
.PP
\f[B]output=|mprof-report\ --reports=monitor\ --traces\ -\f[]
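.PP
When such an option is typed in a shell, the whole \f[B]--profile\f[]
argument has to be quoted, otherwise the shell interprets the | as a
pipeline and splits the rest at the spaces.
The following sketch prints each argument on its own line to show that
the quoted string reaches mono as a single word.
The variable names are arbitrary and nothing is executed.
.PP
.nf
```shell
# The whole --profile argument is quoted so that the shell passes the "|"
# and the spaces to mono instead of creating a pipeline or splitting words.
arg="--profile=log:output=|mprof-report --reports=monitor --traces -"

# Print (do not run) the resulting command line, one argument per line.
for word in mono "$arg" program.exe; do
    echo "$word"
done
```
.fi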
.SH WEB SITE
http://www.mono-project.com/Profiler
.SH SEE ALSO
.PP
mono(1)
.SH AUTHORS
Paolo Molaro.