Hunting for large heap fragmentation or frequent allocation

mswlogo · November 10, 2009 4:10PM

The ANTS memory profiler seems fine for helping find memory leaks.

But it seems completely useless for finding code that is doing excessive reallocations.

It would be really cool if it could work like the performance profiler and just prioritize which lines of code are doing the most reallocations (by count and/or by size).

The online help mentions this as a problem to be solved and explains how to detect that you have it, but there is not a hint on how to use the profiler to help FIND the source of the problem.

There are references to articles and I know how to improve my code but it would be really cool if there was a tool that could find hot spot allocations (NOT LEAKS).

Maybe I'm missing something and there is a way?

StephenC · November 11, 2009 4:59AM

Stack trace data isn't available in V5 of the memory profiler simply because of the massive overhead that recording it causes. For V5 we wanted to concentrate on making an extremely low-overhead profiler, which makes it in some ways more geared towards finding memory leaks, although not exclusively.

There is a section at the bottom of the Summary tab indicating when heap fragmentation is a likely cause of problems being observed.

Using the Class Reference graph in combination with filters is also often helpful for identifying objects that are referencing other classes, so that you can determine which objects should be reduced.

We are considering recording stack trace information for V6, although for a large percentage of people the overhead and impact on the application being profiled will be too high to make it a useful feature.

Stephen

mswlogo · November 12, 2009 11:10AM

We've been working on a large set of .NET applications that deals with very large data sets for almost 5 years now.

Every single memory issue we've run into (and we've had many) has been related to large heap fragmentation. What often looks like a leak is due to the large heap getting fragmented.

Every time the solution has been to allocate large pools of memory up front and manage reuse ourselves.

Once we do that "leaks" magically go away.

It sure would be nice to have a tool to help identify these sections of code.

Sometimes a "leak" finding tool helps, but it's not really looking at the root of the problem but rather at the symptom.

Andrew H · November 12, 2009 12:19PM

We do have plans to add improved support for diagnosing large object heap issues in a future version. Our experience is that it's a very common issue. It's quite complex too, as a fragmented heap does not always mean that an application will experience a problem - the total memory usage can stabilize with a large free pool for many workloads.

A fragmented heap is the fault of the garbage collector and not really a problem with the application. This means that there's no real way to point at a piece of application code and blame it for the problem - in fact, because the garbage collector can run at any time, it's quite possible for fragmentation to happen for different reasons in different runs of the same application. (Also, the rules for when an object ends up on the large object heap are sometimes unclear: most .NET applications end up with a few very small objects there for some reason or another - mostly things associated with loading assemblies.)

The usual pattern for a fragmentation issue is that a large short-lived object is created before a large long-lived object. This often happens in really unexpected places: for example, adding an object to a list can result in the list being reallocated, which will cause this problem if it happens at the wrong time - the list doesn't even have to be very large. To make things more confusing, the 'short-lived' object can already have been dereferenced at the time the 'long-lived' object is created as there's no guarantee that a garbage collection will have happened in the meantime.
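To make the list case concrete, here is a minimal C# sketch that counts how often a List replaces its backing array while items are appended. The class and method names are made up for illustration. The 85,000-byte large object threshold mentioned in the comments comes from the .NET documentation, not from this thread, and the exact growth pattern of List is an implementation detail that may differ between runtimes.

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthDemo
{
    // Counts how many times List<long> swaps in a new backing array while
    // 'count' items are appended one at a time. Each swap leaves the old
    // array behind as garbage; once the arrays are larger than roughly
    // 85,000 bytes, both the old and the new one live on the large object heap.
    public static int CountReallocations(int count)
    {
        var list = new List<long>();
        int reallocations = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // Capacity changed, so the backing array was reallocated.
                reallocations++;
                lastCapacity = list.Capacity;
            }
        }
        return reallocations;
    }
}
```

Calling `ListGrowthDemo.CountReallocations(100000)` on a typical runtime should report a handful of reallocations, the last few of which involve arrays big enough for the large object heap.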

What you can do with the current profiler to identify problematic objects is to set the filters to show only objects on the large object heap, and compare snapshots between an idle period before fragmentation has occurred and another idle period after causing fragmentation. New objects that are on the LOH are good possibilities for being the cause of fragmentation: you can use the object retention graph to associate them with your code.

We're hoping to add a feature in a future release that will make it possible to positively identify the objects that are responsible for each fragment of the heap, which should take the guesswork out of doing this comparison.

If you change the creation order around, for example by pre-allocating some objects or by setting the Capacity property for lists before adding to them, you'll fix the issue in many cases. Routines that process a large object and produce a different large object as a result are good candidates for this: if space is allocated for the result at the start instead of at the end then the processing space won't cause fragmentation (the more natural way to write this would be the reverse, which will always cause fragmentation).
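A rough C# sketch of both ideas follows: reserving list capacity up front, and allocating the long-lived result before the short-lived working space. `AllocationOrder`, `Process` and `BuildList` are hypothetical names, and the doubling transform is only a stand-in for real processing. Allocation order does not strictly guarantee where the garbage collector places each array, so treat this as an illustration of the pattern rather than a guaranteed fix.

```csharp
using System;
using System.Collections.Generic;

public static class AllocationOrder
{
    // Hypothetical processing routine: produces a new large array from the input.
    public static double[] Process(double[] input)
    {
        // Allocate the long-lived result first...
        var result = new double[input.Length];

        // ...and only then the short-lived scratch space, so that the
        // scratch buffer is not created ahead of the object that outlives it.
        var scratch = new double[input.Length];
        Array.Copy(input, scratch, input.Length);

        for (int i = 0; i < scratch.Length; i++)
        {
            result[i] = scratch[i] * 2.0; // stand-in for the real work
        }
        return result;
    }

    // Reserving capacity up front means the backing array is allocated once
    // and is not reallocated while the expected number of items is added.
    public static List<int> BuildList(int expectedCount)
    {
        var list = new List<int>(expectedCount); // same effect as setting Capacity
        for (int i = 0; i < expectedCount; i++)
        {
            list.Add(i);
        }
        return list;
    }
}
```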

Multi-threaded applications throw this out of the window, because it might not be possible to control allocation order (maybe pre-allocating a pool of results and re-using them would work in some cases).
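One way the pooling idea might look in C# is sketched below. `BufferPool`, `Rent` and `Return` are made-up names, not part of any Red Gate or .NET API, and the sketch assumes `ConcurrentBag` from `System.Collections.Concurrent` is available (.NET 4.0 or later). All buffers are the same size and are allocated once at startup, so threads reuse them instead of creating and dropping large arrays in an unpredictable order.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class BufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    // Allocates 'initialCount' buffers of 'bufferSize' bytes up front.
    public BufferPool(int bufferSize, int initialCount)
    {
        _bufferSize = bufferSize;
        for (int i = 0; i < initialCount; i++)
        {
            _buffers.Add(new byte[bufferSize]);
        }
    }

    // Hands out a pooled buffer, or allocates a new one if the pool is empty.
    public byte[] Rent()
    {
        byte[] buffer;
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    // Clears the buffer and puts it back for reuse by any thread.
    public void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != _bufferSize)
        {
            throw new ArgumentException("Buffer does not belong to this pool.", "buffer");
        }
        Array.Clear(buffer, 0, buffer.Length);
        _buffers.Add(buffer);
    }
}
```

Falling back to a fresh allocation when the pool is empty keeps callers from blocking, at the cost of letting the pool grow. A fixed-size pool that waits instead would be a reasonable alternative.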

Another possible technique is to make all objects as long-lived as possible. Fragments can't form if large objects are never dereferenced. In practice this is rather unwieldy: if fragments form between these very long lived objects this will guarantee that they will never be reclaimed, so new objects can only be allocated when nothing else is using the large object heap.

Copying existing large objects into new objects and then throwing the old objects away can also reduce fragmentation if it's done while the application is otherwise idle. This is really cheesy and can be tricky to get right (it might just end up shuffling the fragments around). This technique is interesting because a refined version of it is actually how .NET defragments the small object heaps.

A more permanent solution is to only use the large object heap for temporary objects with a well defined and most importantly short lifespan. We use this technique ourselves in the memory profiler to avoid fragmentation. The basic idea is to use lots of small arrays to store data (less than 1000 elements each) instead of a single big array. The problem is that this can be really hard to retrofit into an existing application - a good start might be to implement IList in this way. The memory profiler can help here: anything that shows up in the large object filter while the application is idle is a good candidate for this technique.
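As an illustration of the small-arrays idea, here is a minimal C# sketch of a list that stores its items in fixed-size chunks. It is not the memory profiler's actual implementation: the `ChunkedList` name and the chunk size of 512 are assumptions, and a full version would implement `IList<T>` with removal, enumeration and so on.

```csharp
using System;
using System.Collections.Generic;

public sealed class ChunkedList<T>
{
    // Assumed chunk size, chosen to stay below 1000 elements per array.
    private const int ChunkSize = 512;

    private readonly List<T[]> _chunks = new List<T[]>();
    private int _count;

    public int Count
    {
        get { return _count; }
    }

    // Appends an item, allocating one more small chunk only when needed.
    // Existing chunks are never copied or reallocated.
    public void Add(T item)
    {
        int chunkIndex = _count / ChunkSize;
        if (chunkIndex == _chunks.Count)
        {
            _chunks.Add(new T[ChunkSize]);
        }
        _chunks[chunkIndex][_count % ChunkSize] = item;
        _count++;
    }

    public T this[int index]
    {
        get
        {
            CheckIndex(index);
            return _chunks[index / ChunkSize][index % ChunkSize];
        }
        set
        {
            CheckIndex(index);
            _chunks[index / ChunkSize][index % ChunkSize] = value;
        }
    }

    private void CheckIndex(int index)
    {
        if (index < 0 || index >= _count)
        {
            throw new ArgumentOutOfRangeException("index");
        }
    }
}
```

Note that the outer list of chunk references is itself an array that grows, so for many millions of elements it could become a large object too. A second level of chunking would avoid that.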
