
Handling a GDI Resource Leak

A GDI leak (or simply the use of too many GDI objects) is one of the most common problems. It eventually causes rendering glitches, errors, and/or performance problems. This article describes how we debugged one.

In 2016, when most programs run in sandboxes from which even the most incompetent developer cannot harm the system, I was amazed to face the problem I describe in this article. Frankly speaking, I had hoped that this problem was gone forever together with the Win32 API. Nevertheless, I ran into it. Before that, I had only heard horror stories about it from older, more experienced developers.

The Problem

A leak of GDI objects, or the use of an enormous number of them.

Symptoms

  1. The GDI objects column on the Details tab of Task Manager shows the critical value of 10000 (if this column is absent, you can add it by right-clicking the table header and choosing Select Columns).
  2. When developing in C# or another language executed by the CLR, the following uninformative error occurs:
    Message: A generic error occurred in GDI+.
    Source: System.Drawing
    TargetSite: IntPtr GetHbitmap(System.Drawing.Color)
    Type: System.Runtime.InteropServices.ExternalException
    The error may not occur with certain settings or in certain system versions, but your application won’t be able to render a single object.
  3. When developing in C/C++, all GDI functions like Create%SOME_GDI_OBJECT% start returning NULL.
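
The counter that Task Manager displays can also be read programmatically, which is handy for logging it while you reproduce the problem. Below is a minimal Python sketch, assuming a Windows machine; it calls the Win32 function GetGuiResources through ctypes. The helper names and the 10000 constant are illustrative choices, not something taken from the project described here.

```python
import ctypes
import sys

GR_GDIOBJECTS = 0        # GetGuiResources flag: number of GDI objects
CRITICAL_COUNT = 10000   # the critical value seen in Task Manager


def current_gdi_object_count():
    """Return the GDI object count of this process, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.windll.kernel32
    user32 = ctypes.windll.user32
    kernel32.GetCurrentProcess.restype = ctypes.c_void_p
    user32.GetGuiResources.argtypes = [ctypes.c_void_p, ctypes.c_uint32]
    user32.GetGuiResources.restype = ctypes.c_uint32
    return user32.GetGuiResources(kernel32.GetCurrentProcess(), GR_GDIOBJECTS)


def quota_fraction(count, quota=CRITICAL_COUNT):
    """Fraction of the critical value that is already in use."""
    return count / quota


if __name__ == "__main__":
    count = current_gdi_object_count()
    if count is None:
        print("GDI counters are only available on Windows")
    else:
        print(f"GDI objects: {count} ({quota_fraction(count):.0%} of the critical value)")
```

Logging this value periodically shows whether the count climbs steadily (a leak) or plateaus (heavy but bounded usage).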

Why?

Windows does not allow more than 65536 GDI objects per session. This number is, in fact, impressive, and I can hardly imagine a normal scenario that requires so many objects. There is also a per-process limit of 10000 objects. It can be modified (by setting the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\GDIProcessHandleQuota value to a number from 256 to 65536), but Microsoft does not recommend increasing it. If you do it anyway, a single process will be able to freeze the system so badly that it cannot render even the error message. In that case, only a reboot revives the system.
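
To check whether someone has already changed the per-process limit on a given machine, the registry value can be read without modifying anything. The following Python sketch assumes Windows and uses the standard winreg module; the function name is hypothetical, and on other platforms (or if the value is missing) it simply returns None.

```python
import sys

QUOTA_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows"
QUOTA_VALUE = "GDIProcessHandleQuota"


def read_gdi_process_quota():
    """Return the configured per-process GDI quota, or None if it cannot be read."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, QUOTA_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, QUOTA_VALUE)
            return int(value)
    except OSError:
        # Key or value is missing, or access was denied.
        return None


if __name__ == "__main__":
    print("GDIProcessHandleQuota:", read_gdi_process_quota())
```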

How to fix?

If you live in the comfortable, managed CLR world, there is a high chance that you have an ordinary memory leak in your application. The problem is unpleasant, but it is quite a common case, and there are at least a dozen great tools for detecting it. Use any profiler to check whether the number of objects that wrap GDI resources (System.Drawing.Brush, Bitmap, Pen, Region, Graphics) keeps increasing. If it does, you can stop reading this article. If no leak of wrapper objects is detected, your code uses the GDI API directly, and there is a scenario in which the objects are not deleted.

What do others recommend?

The official Microsoft guidance and other articles on this subject recommend something like this:

Find every Create%SOME_GDI_OBJECT% call and check whether a corresponding DeleteObject (or ReleaseDC for HDC objects) exists. Even if such a DeleteObject exists, there may be a scenario in which it is never called.
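
A first pass of that manual review can be roughed out with a script that counts acquiring and releasing calls in a source file. The Python sketch below is only an illustration: the function lists are an assumed, deliberately incomplete selection of real GDI functions, and balanced counts prove nothing about a particular code path. An imbalance merely hints at which files to read first.

```python
import re
from collections import Counter

# Assumed, incomplete lists; extend them for the functions your code uses.
ACQUIRE_FUNCS = (
    "CreatePen", "CreateSolidBrush", "CreateBitmap", "CreateCompatibleBitmap",
    "CreateCompatibleDC", "CreateFontIndirect", "CreateRectRgn", "GetDC",
)
RELEASE_FUNCS = ("DeleteObject", "DeleteDC", "ReleaseDC")

# Match a function name followed by an opening parenthesis.
CALL_RE = re.compile(
    r"\b(" + "|".join(ACQUIRE_FUNCS + RELEASE_FUNCS) + r")\s*\("
)


def count_gdi_calls(source_text):
    """Return (acquire_count, release_count) for one source file's text."""
    counts = Counter(match.group(1) for match in CALL_RE.finditer(source_text))
    acquires = sum(counts[name] for name in ACQUIRE_FUNCS)
    releases = sum(counts[name] for name in RELEASE_FUNCS)
    return acquires, releases


sample = """
HPEN pen = CreatePen(PS_SOLID, 1, RGB(0, 0, 0));
HBRUSH brush = CreateSolidBrush(RGB(255, 0, 0));
DeleteObject(pen);
"""
print(count_gdi_calls(sample))  # two acquiring calls, one releasing call
```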

There is a slightly improved version of this method that contains an additional step:

Download the GDIView utility. It shows the exact number of GDI objects by type. Note that the sum of the per-type counts does not match the value in the last column, but we can turn a blind eye to this if it helps narrow down the search.


The project I’m working on has a code base of 9 million lines, and the third-party libraries contain approximately as many more. Hundreds of GDI function calls are spread over dozens of files. I wasted a lot of time and energy before I understood that an error-free manual analysis is impossible.

What can I offer?

If this method seems too long and tiresome to you, you have not yet gone through all the stages of despair with the previous one. You may try the previous steps first, but if they do not help, do not forget about this solution.

In pursuit of the leak, I asked myself: where are the leaking objects created? It was impossible to set breakpoints everywhere the API functions are called. Besides, I was not sure that the leak did not happen in the .NET Framework or in one of the third-party libraries we use. A few minutes of googling led me to the API Monitor utility, which can log and trace calls to all system functions. I easily found the list of all functions that create GDI objects, then located and selected them in API Monitor. Then I set breakpoints.


After that, I started the application under the Visual Studio debugger and selected the process in the Processes tree in API Monitor. The fifth breakpoint was hit immediately.

I realized that I would drown in this torrent and that I needed something else. I removed the breakpoints from the functions and decided to look at the log instead. It showed thousands of calls. It became clear that I would not be able to analyze them manually.


The task is to find the calls of GDI functions that create an object which is never deleted. The log featured everything I needed: the list of function calls in chronological order, their return values, and their parameters. Therefore, I needed to take the return value of each Create%SOME_GDI_OBJECT% function and find a DeleteObject call with this value as its argument. I selected all records in API Monitor, pasted them into a text file, and got something like a CSV file with TAB as the delimiter. I launched VS, intending to write a small parsing program, but before it finished loading, a better idea came to my mind: export the data into a database and write a query to find what I need. It was the right choice, since it allowed me to ask questions and get answers quickly.

There are many tools for importing CSV data into a database (for MySQL, MS SQL Server, and SQLite alike), so I won’t dwell on this subject.

I’ve got the following table:

CREATE TABLE apicalls (
  id int(11) DEFAULT NULL,
  `Time of Day` datetime DEFAULT NULL,
  Thread int(11) DEFAULT NULL,
  Module varchar(50) DEFAULT NULL,
  API varchar(200) DEFAULT NULL,
  `Return Value` varchar(50) DEFAULT NULL,
  Error varchar(100) DEFAULT NULL,
  Duration varchar(50) DEFAULT NULL
);
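
If no import tool is at hand, the load step is small enough to script. The sketch below uses Python with the built-in sqlite3 module rather than MySQL. It assumes the pasted log has exactly eight TAB-separated fields per line in the same order as the table columns; the snake_case column names and the sample row are made up for illustration.

```python
import csv
import io
import sqlite3

COLUMN_COUNT = 8  # id, time of day, thread, module, API, return value, error, duration


def load_api_log(tsv_text, connection):
    """Load TAB-delimited API Monitor rows into an apicalls table; return row count."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS apicalls ("
        "id INTEGER, time_of_day TEXT, thread INTEGER, module TEXT, "
        "api TEXT, return_value TEXT, error TEXT, duration TEXT)"
    )
    # QUOTE_NONE keeps quotation marks inside API parameters untouched.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [
        row for row in reader
        # Skip header lines and anything that is not a complete record.
        if len(row) == COLUMN_COUNT and row[0].strip().isdigit()
    ]
    connection.executemany("INSERT INTO apicalls VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    connection.commit()
    return len(rows)


# Hypothetical sample: a header line followed by one record.
sample_log = (
    "#\tTime of Day\tThread\tModule\tAPI\tReturn Value\tError\tDuration\n"
    "1\t12:25:01.100 PM\t1\tgdi32.dll\tCreatePen ( PS_SOLID, 1, 0 )\t0x0000000012300001\t\t0.0000012\n"
)
conn = sqlite3.connect(":memory:")
loaded = load_api_log(sample_log, conn)
```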

I wrote the following MySQL function to extract the handle of the deleted object from the text of the API call:

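-- Run with a non-default statement delimiter in the mysql client (e.g. DELIMITER $$).
CREATE FUNCTION getHandle(api varchar(1000))
RETURNS varchar(100) CHARSET utf8
DETERMINISTIC -- required by MySQL when binary logging is enabled
BEGIN
  DECLARE start int(11);
  DECLARE result varchar(100);
  -- ReleaseDC takes the HDC as its second parameter,
  -- e.g. 'ReleaseDC ( 0x0000000000010010, 0xffffffffd0010edf )'
  SET start := INSTR(api, ',');
  IF start = 0 THEN
    -- Single-parameter call such as DeleteObject: the handle follows '('
    SET start := INSTR(api, '(');
  END IF;
  -- Take everything up to the closing parenthesis
  SET result := SUBSTRING_INDEX(SUBSTR(api, start + 1), ')', 1);
  RETURN TRIM(result);
END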
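
To make the string handling easier to follow, here is the same extraction logic transcribed into Python as an illustration (the name get_handle is mine; it is not part of the original toolchain). It takes the text after the first comma if there is one, otherwise the text after the opening parenthesis, and cuts it at the closing parenthesis.

```python
def get_handle(api_call):
    """Extract the handle argument from a DeleteObject or ReleaseDC log entry."""
    # ReleaseDC has two parameters; the HDC is the one after the comma.
    start = api_call.find(",")
    if start == -1:
        # Single-parameter call: the handle follows the opening parenthesis.
        start = api_call.find("(")
    # Keep everything up to the closing parenthesis and trim the spaces.
    return api_call[start + 1:].split(")", 1)[0].strip()


print(get_handle("ReleaseDC ( 0x0000000000010010, 0xffffffffd0010edf )"))
print(get_handle("DeleteObject ( 0xffffffffd0010edf )"))
```

Both calls print 0xffffffffd0010edf, the value that is then compared with the return value of a Create call.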

And finally, I wrote a query for locating all the current objects:

SELECT creates.id, creates.handle chandle, creates.API, dels.API deletedApi
FROM (SELECT a.id, a.`Return Value` handle, a.API
      FROM apicalls a
      WHERE a.API LIKE 'Create%') creates
LEFT JOIN (SELECT d.id, d.API, getHandle(d.API) handle
           FROM apicalls d
           WHERE d.API LIKE 'DeleteObject%'
              OR d.API LIKE 'ReleaseDC%') dels
  ON dels.handle = creates.handle
 AND dels.id > creates.id -- handle values are reused, so only a later call can release this object
ORDER BY creates.id;

(Basically, it finds the Delete calls for every Create call; a Create call without a match has NULL in the deletedApi column.)

The result shows at once every Create call that has no matching Delete call.
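
The same matching can also be done without a database. The Python sketch below is a different technique from the join: it walks the log once in chronological order and keeps a dictionary of live handles, which also copes with Windows reusing a handle value after the object is deleted. The tuple layout, the function names, the treatment of NULL return values, and the sample data are all assumptions made for illustration.

```python
def get_handle(api_call):
    """Extract the handle argument from a DeleteObject or ReleaseDC log entry."""
    start = api_call.find(",")
    if start == -1:
        start = api_call.find("(")
    return api_call[start + 1:].split(")", 1)[0].strip()


def find_unreleased(calls):
    """Return (id, api) of Create calls whose handle was never released.

    calls: iterable of (id, api, return_value) tuples in chronological order.
    """
    live = {}  # handle value -> (id, api) of the call that created it
    for call_id, api, return_value in calls:
        name = api.split("(", 1)[0].strip()
        if name.startswith("Create"):
            # Assumption: a failed call is logged with a NULL or empty return value.
            if return_value not in ("", "NULL"):
                live[return_value] = (call_id, api)
        elif name in ("DeleteObject", "ReleaseDC"):
            live.pop(get_handle(api), None)
    return sorted(live.values())


# Hypothetical log excerpt: handle 0x01 is deleted and then reused.
sample_calls = [
    (1, "CreatePen ( PS_SOLID, 1, 0 )", "0x01"),
    (2, "CreateSolidBrush ( 0 )", "0x02"),
    (3, "DeleteObject ( 0x01 )", "TRUE"),
    (4, "CreatePen ( PS_SOLID, 1, 0 )", "0x01"),
]
leaks = find_unreleased(sample_calls)
print([call_id for call_id, _api in leaks])
```

For this sample the calls with ids 2 and 4 remain unreleased, while call 1 is correctly paired with the DeleteObject in call 3.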

So, one last question remained: how do I determine where in my code these functions are called from? Here one fancy trick helped me:

  1. Run the application in VS under the debugger.
  2. Find it in API Monitor and select it.
  3. Select the required API function and place a breakpoint on it.
  4. Keep clicking ‘Next’ until the function is called with the parameters in question (I really missed the conditional breakpoints from VS).
  5. When you reach the required call, switch to VS and click Break All.
  6. The VS debugger stops right where the leaking object is created, and all you need to do is find out why it is not deleted.

Note: The code is written for illustration purposes.

Summary:

The described algorithm is complicated and requires several tools, but it produced a result much faster than a blind search through a huge code base would have.

Here is a summary of all the steps:

  1. Search for memory leaks of GDI wrapper objects.
  2. If they exist, eliminate them and repeat step 1.
  3. If there are no such leaks, search the code for explicit calls to the GDI API functions.
  4. If there are only a few of them, look for a scenario in which an object is not deleted.
  5. If there are many of them or they are hard to trace, download API Monitor and set it up to log calls of the GDI functions.
  6. Run the application in VS under the debugger.
  7. Reproduce the leak once (this warms up the program so that objects created once and cached are not mistaken for leaks).
  8. Attach API Monitor to the process.
  9. Reproduce the leak again.
  10. Copy the log into a text file and import it into any database at hand (the scripts in this article are for MySQL, but they can easily be adapted to any relational database management system).
  11. Match the Create and Delete calls (the SQL script is given above in this article) and find the Create calls without Delete calls.
  12. Set a breakpoint in API Monitor on the required function.
  13. Keep clicking Continue until the function is called with the required parameters.
  14. When the function is called with the required parameters, click Break All in VS.
  15. Find out why this object is not deleted.

I hope that this article will be useful and will save you some time.

Last modified: September 23, 2021