Performance Profiler reporting 64-bit vs 32-bit CPU time

Hi,

I am running Performance Profiler 7.4 on a Win7 x64 machine with 12GB RAM, a 10k RPM HDD and an i7-920 CPU. One of our solutions is quite large: a C# .NET 4 WinForms application that connects to a SQL Server backend (2005+). Memory usage for the frontend can exceed 1GB, and in some circumstances load times for some forms are in the order of 30s to a minute with a local database server.

I have recently compared loading our application as 32-bit and as 64-bit using Performance Profiler, and was startled that it reported CPU times for these large loads as approximately double in 64-bit compared to 32-bit. Timing by stopwatch without the profiler gave times that were approximately equal (i.e. 64-bit was close to or marginally slower than 32-bit).

Is there an obvious reason for Performance Profiler reporting CPU times as double for 64-bit? I'm at a bit of a loss.

TIA!

Comments

  • Hello,

    There are two likely reasons for the difference in timings that you are seeing when explicitly comparing a 32bit build under profile to a 64bit build under profile.

    The first is the performance of the profiler itself, due to the memory addressing of pointers. Pointers in 64bit are twice as large (more memory used, larger cache consumption), which can cause some deviation from 32bit if you are comparing absolute timing values.
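
    To put rough numbers on the pointer-size effect, here is a minimal Python sketch. The record layout (three pointers plus a 4-byte counter), the padding rule and the 64-byte cache line are illustrative assumptions, not the profiler's actual data structures.

```python
def node_size(pointer_bytes, n_pointers=3, payload_bytes=4):
    """Size in bytes of a hypothetical record, padded to pointer alignment."""
    raw = n_pointers * pointer_bytes + payload_bytes
    # Round up to the next multiple of the pointer size (simplified struct padding).
    return -(-raw // pointer_bytes) * pointer_bytes


CACHE_LINE = 64  # bytes; an assumed, typical x86 cache line size

size32 = node_size(4)  # 4-byte pointers in a 32-bit process
size64 = node_size(8)  # 8-byte pointers in a 64-bit process
nodes_per_line_32 = CACHE_LINE // size32
nodes_per_line_64 = CACHE_LINE // size64

print(f"32-bit: {size32} bytes per record, {nodes_per_line_32} records per cache line")
print(f"64-bit: {size64} bytes per record, {nodes_per_line_64} records per cache line")
```

    Under these assumptions a pointer-heavy record doubles in size, so half as many fit in each cache line. Real layouts will differ, so treat this as an order-of-magnitude illustration only.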

    The other applies when using a profiling mode of type 'Line Level timings...'. These modes mean the profiler has to make far more method calls. With the 32bit profiler we can explicitly use the __fastcall calling convention, as these methods are small. In comparison, the 64bit profiler only has access to one calling convention, which is similar to __fastcall but not quite the same (writes more to registers, writes to the stack). In real terms this can add a significant difference to timings, as we have to make a large number of these method calls when recording line level data.

    In short: if you are comparing 32/64bit builds and think there is a performance problem in one of them, look at percentages rather than absolute times, as they will be a better indication. Even better would be to do a quicker validation check by looking at a comparison using Sampling mode first, then if you think it is needed to delve deeper with the more intrusive/detailed profiling modes.
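
    As a rough illustration of comparing percentages instead of absolute times, here is a minimal Python sketch. It assumes you have copied per-method times out of each session by whatever means; the method names and numbers are made up, and the helper functions are hypothetical, not part of the profiler.

```python
def shares(times_ms):
    """Convert absolute per-method times into percentages of the session total."""
    total = sum(times_ms.values())
    return {name: 100.0 * t / total for name, t in times_ms.items()}


def share_shifts(session_a, session_b):
    """Return (method, percentage-point change) pairs, largest change first."""
    a, b = shares(session_a), shares(session_b)
    common = a.keys() & b.keys()
    return sorted(
        ((m, b[m] - a[m]) for m in common),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )


# Made-up numbers for illustration only (milliseconds per method).
x86 = {"LoadForm": 4000, "RunQuery": 5000, "BindGrid": 1000}
x64 = {"LoadForm": 8000, "RunQuery": 10000, "BindGrid": 6000}

shifts = share_shifts(x86, x64)
for method, delta in shifts:
    print(f"{method}: {delta:+.1f} percentage points")
```

    In this made-up data every method is slower in absolute terms, but only BindGrid takes a noticeably larger share of the total, which makes it the one worth a closer look.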

    Also, when comparing sessions remember there will be environmental conditions to consider, i.e. the load of the machine due to other processes.

    Hope that helps.
    Dene Boulton
    Red Gate
  • Hi Dene,

    Thanks for the reply. We had an issue in .NET 3.5 where our application was approximately 50% slower in 64-bit than in 32-bit without profiling, and we were looking at ways to isolate the issue. In Performance Profiler the performance loss appeared to be distributed quite evenly across the application, i.e. no obvious hotspots appeared in the 64-bit profile that the 32-bit one did not have. They might be hidden in there somewhere, but I'm not sure of the best way to drill into that.

    When we upgraded the solution to .NET 4 and upgraded our 3rd party controls, the issue seems to have been largely resolved, but we were looking for some real metric to confirm this.

    "Even better would be to do a quicker validation check by looking at a comparison using Sampling mode first, then if you think it is needed to delve deeper with the more intrusive/detailed profiling modes."

    Thanks... I was not aware of Sampling mode. I'll have a look.

    "Also when comparing sessions remember there will be environmental conditions to consider, i.e. the load of the machine due to other processes."

    I don't think this is a factor, as this is on a local dev machine with no outside influences: local databases, no other database-related applications loaded and no other major CPU-hogging applications running. It is possible that our antivirus might be causing some sort of problem.
  • Hello,

    If the performance difference was/is a uniform increase across the call-stacks then all you would see is the increase across all methods on the call-tree/methods-grid.
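
    One rough way to check whether a slowdown really is uniform is to look at the 64-bit/32-bit ratio per method and flag anything far from the typical ratio. The Python sketch below assumes per-method times taken from two sets of results; the method names, numbers and the 25% tolerance are made up for illustration, and the function is hypothetical rather than a profiler feature.

```python
from statistics import median


def slowdown_outliers(times_32, times_64, tolerance=0.25):
    """Return the median 64/32 ratio and the methods that deviate from it.

    A method is flagged when its ratio differs from the median ratio by
    more than `tolerance` (as a fraction of the median).
    Assumes at least one method appears in both sets with a non-zero 32-bit time.
    """
    ratios = {
        m: times_64[m] / times_32[m]
        for m in times_32.keys() & times_64.keys()
        if times_32[m] > 0
    }
    typical = median(ratios.values())
    outliers = {
        m: r for m, r in ratios.items() if abs(r - typical) / typical > tolerance
    }
    return typical, outliers


# Made-up numbers for illustration only (milliseconds per method).
times_32 = {"A": 100, "B": 200, "C": 50, "D": 400}
times_64 = {"A": 150, "B": 300, "C": 75, "D": 1200}

typical, outliers = slowdown_outliers(times_32, times_64)
print(f"typical slowdown: x{typical:.2f}")
for method, ratio in outliers.items():
    print(f"{method} deviates: x{ratio:.2f}")
```

    If the outlier set comes back empty, the increase is spread evenly and no single method stands out; otherwise the flagged methods are candidates for a closer look.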

    I would suggest selecting 'All Methods' in the display options (just below the timeline). This will then display the framework stacks, which is where you would likely see differences for what you describe.

    There is also quite a bit of functionality in the profiler that is exposed via a context menu. If you right-click on a call-tree node you get numerous options, one of which is to 'expand the most expensive stack trace'; this would be useful for drilling down faster.
    If you find a likely candidate, you could then search for its name using the 'Find Method' control (ctrl+f, Tools > Find...) when looking in another set of results.

    Regarding Sampling mode: this mode will always give you timings that are as close to "normal running conditions" as you can get when under profile, since it does not instrument the IL of your application.
    Dene Boulton
    Red Gate