Detecting Memory Bottleneck
Windows NT 4.0 has a virtual-memory system that combines physical memory, the file system cache, and disk into a flexible information storage and retrieval system. The system can store program code and data on disk until it is needed, and then move into physical memory. Code and data no longer in active use can be written back to disk. In this way, processes benefit from the combined space of memory and disk. But when a computer does not have enough memory, code and data must be written to and retrieved from the disk more frequently—a slow, resource-intensive process that can become a system bottleneck.
The best indicator of a memory bottleneck is a sustained, high rate of hard page faults. Hard page faults occur when the data a program needs is not found in its working set (the physical memory visible to the program) or elsewhere in physical memory, and must be retrieved from disk. Sustained hard page fault rates—over 5 per second—are a clear indicator of a memory bottleneck. To monitor hard fault rates and other indicators of memory performance, log the System, Memory, Logical Disk and Process objects for several days at an update interval of 60 seconds. Then use the following Performance Monitor counters, described in this chapter:
Object Counter
Memory|
Page Faults/sec
Memory|
Page Reads/sec
Memory|
Page Writes/sec
Memory|
Pages Input/sec
Memory|
Pages Output/sec
Memory|
Available bytes
Memory|
Nonpaged pool bytes
Process|
Page Faults/sec
Process|
Working set
Process|
Private Bytes
Process|
Page File Bytes
Windows NT 4.0 Workstation Memory Basics
Windows NT 4.0 has a flat, linear 32-bit memory. This means that each program can see 32 bits of address space or 4 gigabytes of virtual memory. The upper half of virtual memory is reserved for system code and data that is visible to the process only when it is running in privileged mode. The lower half is available to the program when it is running in user mode and to user-mode system services called by the program.
Note Windows NT versions prior to 3.51 included some 16-bit data structures that limited processes to 256 MB (64K pages) of virtual memory. These have been converted to 32-bit data structures, so 2 gigabytes of virtual memory is available to all processes.
Monitoring Windows 4.0 memory requires that you understand both the concepts used to discuss it and the Performance Monitor counters used to test it.
Terms and Concepts
The Windows NT Virtual Memory Manager controls how memory is allocated, reserved, committed, and paged. It includes sophisticated strategies for anticipating the code and data requirements of processes to minimizing disk access.
The code and data in physical memory are divided into units called pages. The size of a page varies with the processor platform. MIPS, Intel, and PowerPC platforms have 4096 bytes per page; DEC Alpha platforms have 8192 bytes per page.
The Virtual Memory Manager moves pages of code and data between disk and memory in a process called paging. Paging is essential to a virtual memory system, although excessive paging can monopolize processors and disks.
A page fault occurs when a program requests a page of code or data is not in its working set (the set of pages visible to the program in physical memory).
• A hard page fault occurs when the requested page must be retrieved from disk.
• A soft page fault occurs when then the requested page is found elsewhere in physical memory.
Soft page faults can be satisfied quickly and relatively easily by the Virtual Memory Manager, but hard faults cause paging, which can degrade performance.
Each page in memory is stored in a page frame. Before a page of code or data can be moved from disk into memory, the Virtual Memory Manager must find or create a free page frame or a frame filled with zeros. (Zero-filled pages are a requirement of the U.S. Government C2 security standard. Page frames must be filled with zeros to prevent the previous contents from being used by a new process.) To free a page frame, changes to a data page in the frame might need to be written to disk before the frame is reused. Code pages, which are typically not changed by a program, can be deleted.
When code or data paged into physical memory is used by a process, the system reserves space for that page on the disk paging file, Pagefile.sys, in case the page needs to be written back to disk. Pagefile.sys is a reserved block of disk space that is used to back up committed memory. It can be contiguous or fragmented. Because memory needs to be backed by the paging file, the size of the paging file limits the amount of data that can be stored in memory. By default, Pagefile.sys is set to the size of physical memory plus 12 MB, but you can change it. Increasing the size of the paging file often resolves virtual memory shortages.
In Windows NT 4.0 Workstation and Server, objects created and used by applications and the operating system are stored in memory pools. These pools are accessible only in privileged mode, the processing mode in which operating system components run, so application threads must be switched to privileged mode to see the objects stored in the pools.
• The paged pool holds objects that can be paged to disk.
• The nonpaged pool holds objects that never leave main memory, such as data structures used by interrupt routines or those which prevent multiprocessor conflicts within the operating system.
The initial size of the pools is based on the amount of physical memory available to Windows NT. Thereafter, the pool size is adjusted dynamically and varies widely, depending upon the applications and services that are running.
All virtual memory in Windows NT is either reserved, committed, or available:
• Reserved memory is a set of contiguous addresses that the Virtual Memory Manager sets aside for a process but doesn't count against the process's memory quota until it is used. When a process needs to write to memory, some of the reserved memory is committed to the process. If the process runs out of memory, available memory can be reserved and committed simultaneously.
• Memory is committed when the Virtual Memory Manager saves space for it in Pagefile.sys in case it needs to be written to disk. The amount of committed memory for a process is an indication of how much memory is it really using.
Committed memory is limited by the size of the paging file. The commit limit is the amount of memory that can be committed without expanding the paging file. If disk space is available, the paging file can expand, and the commit limit will be increased.
Memory that is neither reserved nor committed is available. Available memory includes free memory, zeroed memory (which is cleared and filled with zeros), and memory on the standby list, which has been removed from a process's working set but might be reclaimed.
Measuring Memory
There are many useful tools for measuring physical and virtual memory and memory use. Most provide current totals and peak values of memory counts for processes and threads from the time they were started.
About Windows NT
To see how much physical memory is available to Windows NT 4.0 Workstation, start Windows Explorer, and choose About Windows NT from the Help menu. Physical memory available to Windows is listed at the bottom.
Task Manager
The Performance tab of Task Manager has a total physical memory field, memory usage counts, and a virtual memory graph. The Processes tab lists all processes running on the computer. From it, you can select columns to show Page Faults—a running total of page faults for the process since its start—and Page Faults Delta (PF Delta)—the change in page-fault totals between updates. Click the Page Faults or PF Delta column heading to sort processes by total page faults or by changes in page faults. The following figure shows the Task Manager Processes tab displaying the page-fault columns.
Resource Kit Utilities
Process Explode (Pview.exe) and Process Viewer (Pviewer.exe) show accurate and detailed current memory counts without any setup. Process Explode shows it all on one screen. In Process Viewer, click Memory Detail to see the counts for each segment of the process's address space.
Page Fault Monitor (Pfmon.exe), a tool on the Windows NT Resource Kit 4.0 CD in the Performance Tools group (\Perftool\Meastool), produces a running list of hard and soft page faults generated by each function call in a running application. You can display the data, write it to a log file, or both.
You must run Page Fault Monitor from the command prompt window. Type pfmon /? to see a list of available switches. For more information, see Rktools.hlp.
Performance Monitor
Performance Monitor logs data over time and displays it in line graphs. This is, by far, the best presentation of data related to memory use. Page faults, page reads, and disk operations aren't smooth, continuous increases; they are quick spikes in the data. Any totals or averaging hides the patterns.
The following figure is a Performance Monitor graph of Process: Page Faults/sec for several processes during their startup, when paging is high. The graph shows an increased page fault rate first for one application (the thin, gray line) and, later, for the other (the thick, black line). The white line represents page faults for the Services process which are interspersed throughout the sample interval.
The following report shows the same data presented in a report of average values over the measured time.
Although both provide useful information, the report does not reveal the patterns evident in the graph.
In addition to the monitoring tools, some simulation tools help you test the capacity of your memory. Clearmem, a utility on the Windows NT Resource Kit 4.0 CD, lets you measure the minimum working set for a process. It allocates all available memory, references it so it doesn't get paged out, then releases it. This forces the Virtual Memory Manager to trim the working sets of other processes to a minimum so that you can see how many pages your process is actually touching.
Memory Counters
The counters on the Performance Monitor memory object provide information on memory from different perspectives. The following table is a quick reference guide to the most commonly used memory counters.
Memory Counter Description
Page Faults/sec
How often is data not found in a process's working set?
This includes both hard page faults, which require disk I/O and soft page faults where pages are found elsewhere in memory.
If requested code or data is repeatedly not found, the process's working set is probably too small because memory is limited.
Pages Input/sec
How many pages are being retrieved from disk to satisfy page faults?
Compare with Page Faults/sec to see how many faults are satisfied by reading from disk, and how many come from somewhere else.
Pages Output/sec
How many pages are being written to disk to free up space in the working set for faulted pages? Pages must be written if they were changed by the process.
A high rate indicates that most faulting is data pages and that memory is becoming scarce. If memory is available, changed pages are retained in a list in memory and written to disk in batches.
Pages/sec
Sum of Pages Input/sec and Pages Output/sec.
Page Reads/sec
How often is the system reading from disk because of page faults? How much is page faulting affecting the disk?
The primary indicator of a memory shortage. Some page reads are expected, but a sustained rate of 5 pages per second or more indicates a memory shortage.
Counts how often the disk is read, regardless of the number of pages per reads. The number of pages exceeds the number of reads when more than one page is read at a time.
Page Writes/sec
How often is the system writing to disk because of page faults?
Counts how often the disk is written to, regardless of the number of pages written.
The number of pages exceeds the number of writes when more than one page is written at a time.
This counter is another good indicator of the effect of paging on the disk.
Available bytes
How much memory is left for processes to allocate?
This is an instantaneous count, not an average.
Configuring Available Memory
You can reduce the amount of physical memory available to Windows NT on a computer with an Intel processor without changing its physical memory configuration. This lets you simulate and test the effects of low memory on your computer.
Use the MAXMEM parameter on the Boot.ini file to set the maximum physical memory available to Windows NT.
Note MAXMEM works only on Intel processor platforms. This information does not apply to RISC computers (DEC Alpha, MIPS, and Power PC) which store their boot options in ARC firmware.
Intel computers store boot options in a Boot.ini file. Add the MAXMEM parameter to a boot option line in the Boot.ini file. You can create multiple boot option lines and choose from among the alternates at boot time.
Warning Do not set the memory on Windows NT 4.0 Workstation or Server below 8 MB. If you do, Windows NT might be unable to boot.
1.
Open Boot.ini. Because this is a read-only file, you must change its properties before you can edit it. It looks something like this:
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT40
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINNT40="Windows NT Version 4.00"
c:\="MS-DOS"
2.
Copy the boot option line under [operating systems] and paste it just below the existing one. Within the quotes, type some text that will identify this option to you when you see it on the screen during bootup. For example:
multi(0)disk(0)rdisk(0)partition(1)\WINNT40="Windows NT Version 4.00,12Mb"
3.
Following the quotes, type a space, then /MAXMEM=n where n is the amount of memory, in megabytes, you want to be available to Windows NT. Do not set this below 8 MB, or Windows NT might be unable to boot.
multi(0)disk(0)rdisk(0)partition(1)\WINNT40="Windows NT Version 4.00,12Mb" /MAXMEM=12
4.
You can create multiple boot option lines. Make sure the timeout parameter is long enough to let you choose from among them.
5.
Reboot, choose the low memory option, then check About Windows NT or Task Manager to make sure the change was effective.
Top of page
Memory Bottlenecks and Paging
The first step in investigating a memory problem is to measure paging. Although soft page faults (in which pages not in the process's working set are found elsewhere in physical memory) interrupt the processor, their effect is likely to be negligible.
If you suspect a memory shortage, chart the following Performance Monitor counters:
• Memory: Page Faults/sec
• Memory: Pages Input/sec
• Memory: Page Reads/sec
These counters indicate how often processes must look beyond their working sets to find the code or data they need. They also indicate how often these requests require paging from disk and how many pages are moved with each disk transfer.
Note The graphs for this section were produced on a 33-MHz 486 computer with only 10-12 MB of memory available to Windows NT 4.0 Workstation. Applications used in these examples will not cause memory bottlenecks when run with the recommended minimum of 16 MB of memory.
Compare the lines for Page Faults/sec and Pages Input/sec to determine the proportion of hard page faults (Pages input/sec) to all page faults (Page faults/sec). Both counters are measured in number of pages per second, so you can compare them without conversions. The lines intersect when the Virtual Memory Manager is reading from disk to satisfy a page fault. Space between the lines indicate soft page faults, which indicate that the missing page was found elsewhere in memory.
Pages input/sec is the number of pages read to satisfy a page fault, and Page Reads is the number of reads required to retrieve those pages. When the lines meet, the Virtual Memory Manager is moving one page per read operation. The space between the lines indicates that the Virtual Memory Manager is reading more than one page per operation. (You can calculate the amount of space by dividing Pages Input/sec by Page Reads/sec to find the number of pages per read operation. Space between the lines corresponds to a value greater than 1.)
In this example, the total page fault rate, Page Faults/sec (the top line), averages 70 per second. Page faults can interrupt the processor, but only hard faults slow the system down significantly. The next line, Pages Input/sec, measures hard faults in the numbers of pages retrieved to satisfy the fault. In this example, the rate is 21.5 pages per second on average. Although hard faults represent only 31% of all pages faulted (21.5/70), 21.5 pages per second can produce enough disk activity to cause produce a disk bottleneck.
The area between the top two lines represents soft faults; that is, pages found elsewhere in memory, either in the file system cache, in transition to disk, in the working set of another process, or brought in from memory as zeros. In this case, there are many; approximately 69% of page faults are soft faults.
The bottom line, partially superimposed upon Pages Input/sec, is Page Reads/sec, the number of times the disk is read to satisfy a page fault. This is an important indicator of the type of paging that causes system bottlenecks. The average of 16 reads/sec is quite high. Although the line representing Pages Input/sec and Page Reads/sec are close to each other, at times they diverge, indicating multiple pages read during each read operation. On average, 21.5 pages/sec are read in 16 page reads/sec. This shows that the Virtual Memory Manager is reading about 1.3 pages during each read operation.
In this example, 69% of the pages faulted because they were not in the working set of the monitored process, were found elsewhere in memory, probably in the file system cache. Because it is much quicker to retrieve data from the cache than from disk, Windows NT uses all available physical memory as cache. However, the remaining 31% of faulted pages caused in average of 16 reads/sec from disk; enough to cause a disk bottleneck. The next step is to associate paging with disk use.
Paging and Disk Reads
The memory counters, Page Reads/sec and Pages Input/sec, are indirect indicators of disk activity due to paging. Use the counters on the Logical Disk object to show paging from the perspective of the disks.
Note To enable the physical or logical disks counters, you must first run the Diskperf utility. At the command prompt, type diskperf -y, then restart the computer. For fault tolerant disk configurations (FTDISK), type diskperf -ye, then restart the computer. For more information, see "Enabling the Disk Counters" in Chapter 14, "Detecting Disk Bottlenecks."
To investigate the effect of paging on your disks, add the following counters for the Logical Disk object to your memory charts and reports:
• % Disk Read Time: _Total
• Avg. Disk Read Queue Length: _Total
• Disk Reads/sec: _Total
• Avg. Disk Bytes/Read: _Total
The disk activity counters, % Disk Read Time and Avg. Disk Read Queue Length, indicate disk reading activity during periods of high paging, as measured by the memory counters. Because disk activity has many causes, the Disk Reads/sec counter is included. This lets you subtract the reads due to paging (Memory: Page Reads/sec), from all reads (Disk Reads/sec), to determine the proportion of read operations caused by paging.
The transfer rate (as represented by Avg. Disk Bytes/Read) multiplied by Disk Reads/sec yields the number of bytes read per second, another measure of disk activity.
In this graph, the thick black line running at 100% for most of the test interval is % Disk Read Time: _Total JBAs the following report shows, total disk read time for both disks actually exceeds 100%.
The white line, Avg. Disk Read Queue Length, shows that the high disk activity is producing a large queue to disk. The scale is multiplied by 10 so that you can see the line. The disk queue averages more than 2 and, at its maximum, exceeds 5. More than two items in the queue can affect performance.
The remaining lines, representing the memory counters, are scaled quite small so that they can fit in the graph. Their values are more evident in the following report.
Although this report displays the same information as the graph, it shows average values for all counters. In this example, the average rate of page faults—123 per second—is extremely high. But the Pages Input/sec counter shows that, on average, only 42 of them, or 34%, were retrieved from disk. The rest of the pages are found in memory.
The 42 pages retrieved per second on average required 32 reads from the disk per second from the disk. Even though two-thirds of the page faults were satisfied by memory, the remaining one-third was enough to consume the disk and produce a large queue. In general, a high-performance disk is capable of 40 I/Os per second. This disk was quite close to its physical maximum.
Logical Disk: Disk Reads/sec: _Total, a measure of all disk reads, at 32.778 per second, is within sampling error range of the 32.398 average Memory: Page Reads/sec. This shows that virtually all of the reading was done to find faulted pages. This confirms that paging is cause of the bottleneck. If this pattern were to persist over time, it would indicate a memory shortage, not a disk problem.
Paging and Disk Writes
Paging causes many reads from the disk, but it also causes some writing to disk. When the Virtual Memory Manager locates a faulted page, if memory is available, it simply adds the retrieved pages to the working set of the process. When memory is scarce, it deletes a page of the current working set for every new page it brings in. If data on the page has changed, it must write the data to disk before it frees up the page frame. If many of the faulted pages are data pages with changes, the writing can be significant, and Performance Monitor can measure it.
Tip If your application's page faults are causing disk writes, they are probably faulting data pages, not code pages. Reorganizing your application's data structures and the way the program references them can reduce page faults.
To limit the writes to disk, the Virtual Memory Manager maintains a modified page list, a list of changed pages that need to be written to disk. Periodically, the modified page writer, a thread in the System process, writes some of the pages out to free up space. As free space becomes scarce, the modified page thread is activated more often.
To measure writes to disk resulting from paging, chart:
• Memory: Page Writes/sec
• Memory: Pages Output/sec
• Logical Disk: Disk Writes/sec
• Logical Disk: Disk Write Bytes/sec
• Logical Disk: Avg. Disk Write Queue Length.
Memory: Page Writes/sec indicates how often changes to pages had to be written to back to disk to free up space in a working set.
Logical Disk: Disk Writes/sec represents all writing to disk, including writes not associated with paging, like writing the Performance Monitor log file, or updating system statistics. Comparing Disk Writes to Page Writes reveals the proportion of total disk writing that consists of writing pages from memory back to disk.
Comparing Memory: Pages Output/sec and Logical Disk: Disk Write Bytes/sec also indicates the proportion of disk writing activity that is directly related to paging, but it shows it in bytes, rather than time.
Pages output/sec is the number of pages written to disk to free up page frames.
Disk Write Bytes/sec is the number of bytes written to disk per second for all purposes.
To compare Pages Output/sec to Disk Write Bytes/sec, multiply the number of pages by the number of bytes per page. MIPS, Intel, and PowerPC processors have 4096 bytes per page; DEC Alpha processors have 8192 bytes per page.
This graph compares disk read time with disk write time while the system is paging. The black line represents time reading from both physical disks on the system; the white line represents time writing. In this example, there was far more reading than writing, but the amount of writing is not insignificant.
This graph of the same events shows how much of the writing time is attributable to paging. The thin black line represents all writes to disk per second; the heavy, white line represents disk-writes due to paging. The space between them represents disk-writes other than changed pages.
In this example, the curves are almost the same shape, but the there are twice as many disk writes as page writes. This indicates that the disk writes that didn't consist of writing changed pages were related to writing them, such as writing the Performance Monitor log and writing system records.
The following report shows another measure of writing. The report includes counts of writing in pages as well as time.
To compare the Disk Write Bytes/sec to Pages Output/sec, multiply the number of pages by the page size, in this case, 4096 bytes/page. In this example, 11.952 pages out of 14.452, or 82% of disk writing is directly attributable to paging.
The Paging File
It is useful to monitor the size of the paging file, Pagefile.sys, when investigating memory shortages. On systems that have relatively little excess memory, an increased demand for memory causes Windows NT to expand the paging file.
The paging file, Pagefile.sys, is a block of disk space reserved by the operating system to back up physical memory. When processes require more physical memory than is available, the Virtual Memory Manager frees up space in memory by writing less-frequently-referenced pages back to the paging file on disk. As demand for memory increases, the Virtual Memory Manager expands the paging file until it runs out of disk space or reaches the paging file reaches its maximum size.
Note Windows NT creates one paging file on the physical drive on which the operating system is installed. The default size is equal to physical memory plus 12 MB. You can change the size of the paging file, and you can create one additional paging files on each logical disk partition in your configuration.
Use the Control Panel System applet Performance tab. (Right-click My Computer, select Properties, then select the Performance tab.) The Virtual Memory box shows the current size of your paging files and lets you add new files or change the size of existing ones. To change the value, click the Change button to display the Virtual Memory window, and then click Set.
To observe the changing size of the paging file, chart Process: Page File Bytes for individual processes and Process: Page File Bytes: _Total, an instantaneous measure of total number of bytes in the paging file.
The data in the following graph was logged during a memory leak (when memory is allocated faster than it is freed). A testing tool, LeakyApp, was run to allocate as much memory as it could, and the thrashing at the end of the graph shows the system running out of virtual memory and repeatedly warning the user.
In this graph, the thin black line is Page Faults/sec. The next thick black line is Pages Output/sec. The thick white line is Available Bytes, and the thick black line at the bottom is Pages Input/sec.
The relatively low rate of pages input (those read from disk to satisfy page faults) and very high rate of pages output (those written back to disk to trim working sets and recover physical memory) reveals the strategy of the Virtual Memory Manager to conserve available bytes by writing pages from physical memory to the paging file. Even as physical memory is consumed by processes, the number of available bytes of memory rarely falls below 4 MB, though it might vary rapidly and considerably within its range. In this example, Available bytes never drops below 372,736 bytes, as indicated by the Min value on the value bar.
However, the paging file on disk (Pagefile.sys) fills up rapidly until it runs out of space on disk.
In this graph, the lines representing the private bytes LeakyApp allocated to itself, the size of the paging file used by LeakyApp, and the total number of used bytes in the paging file, are all superimposed upon each other. Although the values are not identical, they are quite similar.
The graph is evidence of the significant growth of the paging file during a memory shortage. The plateau on the graph indicates that the maximum size of the paging file was reached, and LeakyApp could no longer allocate additional memory.
To see the values of all data points of the Page File Bytes curve, export the data to a spreadsheet like Microsoft Excel. For details on exporting logged data, see "Exporting Data" in Performance Monitor Help.
Monitoring the Nonpaged Pool
The nonpaged pool is an area of kernel-mode operating system memory reserved for objects that cannot be paged to disk. The size of the nonpaged pool is adjusted by the Virtual Memory Manager based on the amount of physical memory in the computer and on the demand for pool space by applications and services. On workstations, the absolute size of the nonpaged pool is not relevant to performance monitoring. However pool leaks, which are characterized by continuous, unexplained growth in the nonpaged pool, are a concern. Pool leaks usually result from an application error. They can cause a memory shortage because space occupied by the pool is no longer available to other processes.
For more information about memory pools, see "Terms and Concepts" earlier in this chapter.
Note The size of the nonpaged pool usually is not a concern on workstations, but can be a factor on servers where the number of trusting account (user) domains depends on the size of the nonpaged pool. For more information, see "Number of Trusted Domains" in Windows NT Networking Guide, Chapter 2, "Network Security and Domain Planning."
Many performance monitoring tools monitor the paged and nonpaged memory pools. Process Explode (Pview.exe), Process Monitor (Pmon.exe) and Process Viewer (Pviewer.exe), tools included the Windows NT Resource Kit 4.0 CD, display the amount of space in the pools allocated to each process. Task Manager and Performance Monitor display the total size of each memory pool as well as the space allocated to each process. Although their display formats vary widely, all of the these tools collect their data from the same internal counters.
Important The internal counters that measure the size of the nonpaged pool for each process are not precise. The counter values are estimates which count duplicated object handles as well as space for the object. Also, because the process pool size counts are rounded to page size, pool space is overestimated when a process is using part of a page. Therefore, it is important to monitor changes in pool size for a process, not the absolute value. Total pool size counts are precise. Therefore, the sum of pool sizes for each process might not equal the value for the whole system.
In Task Manager, the current size of the nonpaged pool is listed in the Kernel Memory box on the Performance tab. On the Processes tab, you can add a column to monitor the size of the nonpaged pool allocated to each process. For more information, see "Task Manager" in Chapter 11, "Performance Monitoring Tools."
Performance Monitor lets you to log changes in pool size over time. Pool size changes slowly, so you might have to log for several hours or even days to catch a pool leak. Use these Performance Monitor counters to monitor the pool size:
• Memory: Pool Nonpaged Bytes
• Memory: Pool Nonpaged Allocations
• Process: Pool Nonpaged Bytes
The counters on the Memory object monitor the total size of the nonpaged pool and the number of allocations of pool space for the whole system. The counter on the Process object monitors nonpaged pool space allocated to each process.
To use Performance Monitor to monitor the nonpaged pool for leaks, follow these procedures:
• Record the size of the nonpaged pool when the system starts. Then log the Memory and Process objects for several days at a 60-second interval. You should be able to associate any increases in the size of the pool, as indicated by Nonpaged Pool Bytes, with the start of a process, as indicated by Process: % Processing Time. When processes are stopped, you should see a decrease in pool size.
• Set a Performance Monitor alarm to notify you when Nonpaged Pool Bytes increases by more than 10% from its value at system startup.
After the system is started, the nonpaged pool should increase in size only when a process is started. When the process ends, the nonpaged pool should return to its original size. Any other unexplained growth in the nonpaged pool is considered to be abnormal.
Pool leaks typically occur when an application creates objects, then fails to close their handles when it is done with them, or when an application repeatedly opens a file or other object unnecessarily. Each time the application attempts to open the object, more space is allocated for the same object in the nonpaged pool. The bytes allocated for the pool come from physical memory, and an unnecessarily large pool denies those bytes to the operating system and applications.
Windows NT dynamically adjusts the size of the paged and nonpaged memory pools for optimum performance. However, you can change the size of the paged and nonpaged pools on Windows NT Workstations and Servers by editing the Registry. Use Regedt32 or Regedit, tools installed with Windows NT, to edit the Registry. Editing the Registry might cause serious system problems that require reinstalling Windows NT. The values for this Registry entry are entered in bytes.
The registry parameters for paged and nonpaged pool are in:
Subtree
HKEY_LOCAL_MACHINE
Key
\System\CurrentControlSet\Control\SessionManager
\MemoryManagerment
Name
NonPagedPoolSize
PagedPoolSize
Type
REG_DWORD
Top of page
Examining Your Applications
High, sustained paging rates indicate a memory shortage, but not its cause. You might have insufficient physical memory to support the operating system, applications, and network services. However, you might have an application that is using memory inefficiently or leaking memory—that is, allocating memory, but not releasing it.
When the hard page fault rate on your system rises, investigate the memory use of your applications by using the following counters:
• Process: Private Bytes
• Process: Working Set
• Process: Page Faults/sec
• Process: Page file Bytes
The first step is to distinguish between a general memory shortage that is affecting all applications and a memory shortage caused by a particular application. Chart Process: Page Faults/sec for all processes.
The following graph shows a general memory shortage that is causing page faults in many processes.
In this example, Memory: Page Faults/sec (the tall white bar) represents all page faults for the system. The other bars represent page faults for each application or service running on the system. This graph demonstrates that no single application is causing a memory shortage. In this case, the high paging rate is best resolved by adding more physical memory.
In contrast, the following graph shows a single application, LeakyApp, a test tool, causing a high rate of page faults.
In this example, Memory: Page Faults/sec (the first tall bar) represents all page faults for the system. The tall white bar represents page faults for the test tool. The other bars, which are barely visible, represent the fault rates of other processes.
Although this memory shortage affects all system processes, it is attributable to a single application. Were it a real application instead of a test tool, a more thorough investigation would be in order. It would be prudent to consider replacing the application, moving it to another computer or, if it is your application, trying to improve it memory efficiency.
The standard performance monitoring tools are designed to determine that an application is using memory inefficiently, but not why. If you have an inefficient application, use the following tools for further diagnosis:
• Page Fault Monitor (Pfmon.exe), a utility on the Windows NT Resource Kit 4.0 CD in the Performance Tools group (\Perftool\Meastool), produces a running list of the hard and soft page faults generated by each function call in a running process. For more information, see Rktools.hlp.
• The Working Set Tuner analyzes the patterns of function calls in your application code and recommends a code organization that consumes the least possible physical memory. It requires some work from the developer, but has been demonstrated to improve memory efficiency by as much as 50%. The Working Set Tuner is part of the Win32 Software Development Kit.
The remainder of this section explains how to determine the effect of application memory use on your system.
Working Set
The working set of a process is the physical memory assigned to the process by the operating system. It contains the code and data pages recently referenced by the process. When a process needs code or data that is not in its working set, a page fault occurs, and the Virtual Memory Manager adds the new pages to the working set.
• When memory is plentiful, the Virtual Memory Manager leaves older pages in the working sets even when it adds new pages. This results in larger working sets.
• When memory becomes scarce, the Virtual Memory Manager recovers memory for the system by moving less recently referenced pages out of the working sets and by removing an older page for each new page added. Although trimming the working sets is necessary, it can cause more page faults.
One measure of application efficiency is how small its working set can get without causing a large number of page faults. In general, the more that data used in sequence is stored in sequence, the fewer pages the application needs and the smaller its working set can be.
To measure the working sets of processes, chart:
• Process: Working Set
• Process: Page Faults/sec
• Memory: Available Bytes
Working Set is an instantaneous counter that shows the current number of bytes in a process's working set. Process: Page Faults/sec is the rate of page faults for the process. The following graph demonstrates that the Virtual Memory Manager adjusts the size of a process's working set attempting to respond to a process's page fault rate by increasing its working set and then by trimming it.
In this graph, the white line represents the working set of a process; the black line represents the page-fault rate for the process. Notice that the vertical maximum has been increased to 200 so that the working set curve fits in the graph. The similar shapes of these curves reflect their cause and effect relationship of the Virtual Memory Manager responding to the page faults.
In this example, the overall page-fault rate is quite high, averaging over 12 page faults/sec. As the page-fault rate rises, the Virtual Memory Manager adds pages to the working set of the process to reduce the rate of the page faults. About midway through the graph, as the page-fault rate drops to near zero—probably because the process has much of what it needs—the Virtual Memory Manager begins to trim the working set, but a resurgence of page faults drives the size of the working set back up.
Determining the Minimum Working Set
When you improve the organization of code and data references in your program, you:
• reduce its minimum working set.
• reduce the amount of physical memory it consumes.
• improve its use of the file system cache (as described in Chapter 15, "Detecting Cache Bottlenecks").
To demonstrate one aspect of this improvement, measure the minimum working set of an application before and after you tune it.
To see the actual minimum working set of a process, you must reduce available memory to its minimum. This compels the Virtual Memory Manager to trim all but currently active pages from a process's working set.
Clearmem, a utility on the Windows NT Resource Kit 4.0 CD in the Performance Tools group (\PerfTool\MeasTool), determines the size of your computer's physical memory, allocates enough data to fill it, then references the data as quickly as possible. It also accesses files to clear the cache. This reduces memory available to other processes to a minimum. Then, Clearmem releases the allocated memory to restore normal system functions.
To find the minimum working set for a process
1.
Start the process, then start a Performance Monitor log to measure the Process object once per second.
2.
Run Clearmem It usually runs for less than a minute.
3.
Stop the log, change to Chart view, and in the Options menu, set Data From to the log file.
4.
Use the Time Window to advance the beginning time to a point when Clearmem was running. (The Clearmem process does not appear in the Add to Chart dialog box unless the process is active at beginning time of the Time Window). Chart Process: % Processor Time for Clearmem to show the duration of the Clearmem process. Then, enlarge the Time Window so that the display begins just after Clearmem started running and ends just before Clearmem stopped running.
5.
Chart Process: Working Set for the process you are testing.
6.
Read the minimum working set size from the Min field on the value bar.
In this graph, the thick, black line is the working set of Clearmem, the memory-consuming test tool. As Clearmem increases its working set (as represented by the sharply increasing curve) the working sets of all other processes are trimmed until they contain only pages currently being used and those most recently accessed. The other lines in the graph represent the working sets of other processes. The value bar shows that the minimum working set for Explorer in this test is 184,320 bytes.
Available Bytes
Available bytes is a measure of free memory. The Virtual Memory Manager continually adjusts the space used in physical memory and on disk to maintain a minimum number of available bytes for the operating system and processes. When available bytes are plentiful, the Virtual Memory Manager lets the working sets of processes grow, or keeps them stable by removing an old page for each new page added. When available bytes are few, the Virtual Memory Manager must trim the working sets of processes to maintain the minimum required.
The following graph records the start of a process. It shows the relationship between page faults, the size of the working set of a process, and available bytes. When a process is faulting many pages, the Virtual Memory Manager increases the process's working set to slow the fault rate. The memory added to the working set is taken from available bytes, which shrinks accordingly. When available bytes falls close to the minimum tolerated by the system, the Virtual Memory Manager trims the working sets to recover some available bytes. The smaller working set makes page faults more likely, requiring the Virtual Memory Manager adjustment cycle to begin again.
In this graph, which records the start of the Microsoft Word process, Winword.exe, the thick black line is Available Bytes, the white line is Working Set, and the thin black line is Page Faults/sec. The vertical maximum has been increased to 200 to accommodate the high values.
This example demonstrates the close association between the number of page faults and the increase in the working set. Note the inverse relationship between the size of the process's working set and available bytes for the system.
For the first third of the graph, the Virtual Memory Manager responds to page faults by dramatically increasing the size of the working set. Then, to recover available bytes, the working set is trimmed; thereafter, the Virtual Memory Manager responds to page faults by moving in just needed pages and by removing pages not recently referenced to keep the size of the working set to a minimum. At the end of the graph, an increase in the page fault rate drives the size of the working set back up.
Top of page
Resolving a Memory Bottleneck
Although more memory is the easy solution to a memory bottleneck, it isn't always the right solution.
• Monitor your applications and replace or correct those that leak memory or use it inefficiently.
• Localize your application's data references. Page Fault Monitor (Pfmon.exe), a tool on the Windows NT Resource Kit 4.0 CD, produces a running list of hard and soft page faults generated by each function call in a process.
• Localize your application's code references. The Working Set Tuner, included in the Win32 Software Development Kit, recommends an optimal organization of code functions to localize code page access.
• Increase the size of the paging file. In general, the bigger you can make it, the better it is. You can also have multiple paging files, though you can have only one on each logical drive. To change the size of the paging file, use the Control Panel System applet Performance tab. (Right-click My Computer, select Properties, then select the Performance tab.) The Virtual Memory box shows the current size of your paging files and lets you add new files or change the size of existing ones.
• Check the available space on your hard drives. If you have added memory, increase the size of your paging files. The paging file might need to expand to map the additional memory. If the space is not available, it might produce the symptoms of a memory bottleneck.
• Increase the size of your secondary memory cache, especially if you've just added memory. When you add memory, the secondary cache must map the larger memory space.
The amount of secondary cache a system supports depends upon the design of the motherboard. Many motherboards support several secondary cache configurations (from 64K–512K or 256K–1 MB). Increasing cache size usually requires removing the existing static ram (SRAM) chips, replacing them with new SRAM chips, and changing some jumpers. Doing so would be helpful anytime you have a working set larger than your current secondary cache.
• Remove unnecessary protocols and drivers. Even idle protocols use space in the paged and nonpaged memory pools.
If all else fails, add memory. After struggling with a memory bottleneck and its grueling effects, you will find the improved response of the entire system well worth the investment.
The best indicator of a memory bottleneck is a sustained, high rate of hard page faults. Hard page faults occur when the data a program needs is not found in its working set (the physical memory visible to the program) or elsewhere in physical memory, and must be retrieved from disk. Sustained hard page fault rates—over 5 per second—are a clear indicator of a memory bottleneck. To monitor hard fault rates and other indicators of memory performance, log the System, Memory, Logical Disk and Process objects for several days at an update interval of 60 seconds. Then use the following Performance Monitor counters, described in this chapter:
Object Counter
Memory|
Page Faults/sec
Memory|
Page Reads/sec
Memory|
Page Writes/sec
Memory|
Pages Input/sec
Memory|
Pages Output/sec
Memory|
Available bytes
Memory|
Nonpaged pool bytes
Process|
Page Faults/sec
Process|
Working set
Process|
Private Bytes
Process|
Page File Bytes
Windows NT 4.0 Workstation Memory Basics
Windows NT 4.0 has a flat, linear 32-bit memory. This means that each program can see 32 bits of address space or 4 gigabytes of virtual memory. The upper half of virtual memory is reserved for system code and data that is visible to the process only when it is running in privileged mode. The lower half is available to the program when it is running in user mode and to user-mode system services called by the program.
Note Windows NT versions prior to 3.51 included some 16-bit data structures that limited processes to 256 MB (64K pages) of virtual memory. These have been converted to 32-bit data structures, so 2 gigabytes of virtual memory is available to all processes.
Monitoring Windows 4.0 memory requires that you understand both the concepts used to discuss it and the Performance Monitor counters used to test it.
Terms and Concepts
The Windows NT Virtual Memory Manager controls how memory is allocated, reserved, committed, and paged. It includes sophisticated strategies for anticipating the code and data requirements of processes to minimizing disk access.
The code and data in physical memory are divided into units called pages. The size of a page varies with the processor platform. MIPS, Intel, and PowerPC platforms have 4096 bytes per page; DEC Alpha platforms have 8192 bytes per page.
The Virtual Memory Manager moves pages of code and data between disk and memory in a process called paging. Paging is essential to a virtual memory system, although excessive paging can monopolize processors and disks.
A page fault occurs when a program requests a page of code or data is not in its working set (the set of pages visible to the program in physical memory).
• A hard page fault occurs when the requested page must be retrieved from disk.
• A soft page fault occurs when then the requested page is found elsewhere in physical memory.
Soft page faults can be satisfied quickly and relatively easily by the Virtual Memory Manager, but hard faults cause paging, which can degrade performance.
Each page in memory is stored in a page frame. Before a page of code or data can be moved from disk into memory, the Virtual Memory Manager must find or create a free page frame or a frame filled with zeros. (Zero-filled pages are a requirement of the U.S. Government C2 security standard. Page frames must be filled with zeros to prevent the previous contents from being used by a new process.) To free a page frame, changes to a data page in the frame might need to be written to disk before the frame is reused. Code pages, which are typically not changed by a program, can be deleted.
When code or data paged into physical memory is used by a process, the system reserves space for that page on the disk paging file, Pagefile.sys, in case the page needs to be written back to disk. Pagefile.sys is a reserved block of disk space that is used to back up committed memory. It can be contiguous or fragmented. Because memory needs to be backed by the paging file, the size of the paging file limits the amount of data that can be stored in memory. By default, Pagefile.sys is set to the size of physical memory plus 12 MB, but you can change it. Increasing the size of the paging file often resolves virtual memory shortages.
In Windows NT 4.0 Workstation and Server, objects created and used by applications and the operating system are stored in memory pools. These pools are accessible only in privileged mode, the processing mode in which operating system components run, so application threads must be switched to privileged mode to see the objects stored in the pools.
• The paged pool holds objects that can be paged to disk.
• The nonpaged pool holds objects that never leave main memory, such as data structures used by interrupt routines or those which prevent multiprocessor conflicts within the operating system.
The initial size of the pools is based on the amount of physical memory available to Windows NT. Thereafter, the pool size is adjusted dynamically and varies widely, depending upon the applications and services that are running.
All virtual memory in Windows NT is either reserved, committed, or available:
• Reserved memory is a set of contiguous addresses that the Virtual Memory Manager sets aside for a process but doesn't count against the process's memory quota until it is used. When a process needs to write to memory, some of the reserved memory is committed to the process. If the process runs out of memory, available memory can be reserved and committed simultaneously.
• Memory is committed when the Virtual Memory Manager saves space for it in Pagefile.sys in case it needs to be written to disk. The amount of committed memory for a process is an indication of how much memory is it really using.
Committed memory is limited by the size of the paging file. The commit limit is the amount of memory that can be committed without expanding the paging file. If disk space is available, the paging file can expand, and the commit limit will be increased.
Memory that is neither reserved nor committed is available. Available memory includes free memory, zeroed memory (which is cleared and filled with zeros), and memory on the standby list, which has been removed from a process's working set but might be reclaimed.
Measuring Memory
There are many useful tools for measuring physical and virtual memory and memory use. Most provide current totals and peak values of memory counts for processes and threads from the time they were started.
About Windows NT
To see how much physical memory is available to Windows NT 4.0 Workstation, start Windows Explorer, and choose About Windows NT from the Help menu. Physical memory available to Windows is listed at the bottom.
Task Manager
The Performance tab of Task Manager has a total physical memory field, memory usage counts, and a virtual memory graph. The Processes tab lists all processes running on the computer. From it, you can select columns to show Page Faults—a running total of page faults for the process since its start—and Page Faults Delta (PF Delta)—the change in page-fault totals between updates. Click the Page Faults or PF Delta column heading to sort processes by total page faults or by changes in page faults. The following figure shows the Task Manager Processes tab displaying the page-fault columns.
Resource Kit Utilities
Process Explode (Pview.exe) and Process Viewer (Pviewer.exe) show accurate and detailed current memory counts without any setup. Process Explode shows it all on one screen. In Process Viewer, click Memory Detail to see the counts for each segment of the process's address space.
Page Fault Monitor (Pfmon.exe), a tool on the Windows NT Resource Kit 4.0 CD in the Performance Tools group (\Perftool\Meastool), produces a running list of hard and soft page faults generated by each function call in a running application. You can display the data, write it to a log file, or both.
You must run Page Fault Monitor from the command prompt window. Type pfmon /? to see a list of available switches. For more information, see Rktools.hlp.
Performance Monitor
Performance Monitor logs data over time and displays it in line graphs. This is, by far, the best presentation of data related to memory use. Page faults, page reads, and disk operations aren't smooth, continuous increases; they are quick spikes in the data. Any totals or averaging hides the patterns.
The following figure is a Performance Monitor graph of Process: Page Faults/sec for several processes during their startup, when paging is high. The graph shows an increased page fault rate first for one application (the thin, gray line) and, later, for the other (the thick, black line). The white line represents page faults for the Services process which are interspersed throughout the sample interval.
The following report shows the same data presented in a report of average values over the measured time.
Although both provide useful information, the report does not reveal the patterns evident in the graph.
In addition to the monitoring tools, some simulation tools help you test the capacity of your memory. Clearmem, a utility on the Windows NT Resource Kit 4.0 CD, lets you measure the minimum working set for a process. It allocates all available memory, references it so it doesn't get paged out, then releases it. This forces the Virtual Memory Manager to trim the working sets of other processes to a minimum so that you can see how many pages your process is actually touching.
Memory Counters
The counters on the Performance Monitor memory object provide information on memory from different perspectives. The following table is a quick reference guide to the most commonly used memory counters.
Memory Counter Description
Page Faults/sec
How often is data not found in a process's working set?
This includes both hard page faults, which require disk I/O and soft page faults where pages are found elsewhere in memory.
If requested code or data is repeatedly not found, the process's working set is probably too small because memory is limited.
Pages Input/sec
How many pages are being retrieved from disk to satisfy page faults?
Compare with Page Faults/sec to see how many faults are satisfied by reading from disk, and how many come from somewhere else.
Pages Output/sec
How many pages are being written to disk to free up space in the working set for faulted pages? Pages must be written if they were changed by the process.
A high rate indicates that most faulting is data pages and that memory is becoming scarce. If memory is available, changed pages are retained in a list in memory and written to disk in batches.
Pages/sec
Sum of Pages Input/sec and Pages Output/sec.
Page Reads/sec
How often is the system reading from disk because of page faults? How much is page faulting affecting the disk?
The primary indicator of a memory shortage. Some page reads are expected, but a sustained rate of 5 pages per second or more indicates a memory shortage.
Counts how often the disk is read, regardless of the number of pages per reads. The number of pages exceeds the number of reads when more than one page is read at a time.
Page Writes/sec
How often is the system writing to disk because of page faults?
Counts how often the disk is written to, regardless of the number of pages written.
The number of pages exceeds the number of writes when more than one page is written at a time.
This counter is another good indicator of the effect of paging on the disk.
Available bytes
How much memory is left for processes to allocate?
This is an instantaneous count, not an average.
Configuring Available Memory
You can reduce the amount of physical memory available to Windows NT on a computer with an Intel processor without changing its physical memory configuration. This lets you simulate and test the effects of low memory on your computer.
Use the MAXMEM parameter on the Boot.ini file to set the maximum physical memory available to Windows NT.
Note MAXMEM works only on Intel processor platforms. This information does not apply to RISC computers (DEC Alpha, MIPS, and Power PC) which store their boot options in ARC firmware.
Intel computers store boot options in a Boot.ini file. Add the MAXMEM parameter to a boot option line in the Boot.ini file. You can create multiple boot option lines and choose from among the alternates at boot time.
Warning Do not set the memory on Windows NT 4.0 Workstation or Server below 8 MB. If you do, Windows NT might be unable to boot.
1.
Open Boot.ini. Because this is a read-only file, you must change its properties before you can edit it. It looks something like this:
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT40
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINNT40="Windows NT Version 4.00"
c:\="MS-DOS"
2.
Copy the boot option line under [operating systems] and paste it just below the existing one. Within the quotes, type some text that will identify this option to you when you see it on the screen during bootup. For example:
multi(0)disk(0)rdisk(0)partition(1)\WINNT40="Windows NT Version 4.00,12Mb"
3.
Following the quotes, type a space, then /MAXMEM=n where n is the amount of memory, in megabytes, you want to be available to Windows NT. Do not set this below 8 MB, or Windows NT might be unable to boot.
multi(0)disk(0)rdisk(0)partition(1)\WINNT40="Windows NT Version 4.00,12Mb" /MAXMEM=12
4.
You can create multiple boot option lines. Make sure the timeout parameter is long enough to let you choose from among them.
5.
Reboot, choose the low memory option, then check About Windows NT or Task Manager to make sure the change was effective.
Top of page
Memory Bottlenecks and Paging
The first step in investigating a memory problem is to measure paging. Although soft page faults (in which pages not in the process's working set are found elsewhere in physical memory) interrupt the processor, their effect is likely to be negligible.
If you suspect a memory shortage, chart the following Performance Monitor counters:
• Memory: Page Faults/sec
• Memory: Pages Input/sec
• Memory: Page Reads/sec
These counters indicate how often processes must look beyond their working sets to find the code or data they need. They also indicate how often these requests require paging from disk and how many pages are moved with each disk transfer.
Note The graphs for this section were produced on a 33-MHz 486 computer with only 10-12 MB of memory available to Windows NT 4.0 Workstation. Applications used in these examples will not cause memory bottlenecks when run with the recommended minimum of 16 MB of memory.
Compare the lines for Page Faults/sec and Pages Input/sec to determine the proportion of hard page faults (Pages input/sec) to all page faults (Page faults/sec). Both counters are measured in number of pages per second, so you can compare them without conversions. The lines intersect when the Virtual Memory Manager is reading from disk to satisfy a page fault. Space between the lines indicate soft page faults, which indicate that the missing page was found elsewhere in memory.
Pages input/sec is the number of pages read to satisfy a page fault, and Page Reads is the number of reads required to retrieve those pages. When the lines meet, the Virtual Memory Manager is moving one page per read operation. The space between the lines indicates that the Virtual Memory Manager is reading more than one page per operation. (You can calculate the amount of space by dividing Pages Input/sec by Page Reads/sec to find the number of pages per read operation. Space between the lines corresponds to a value greater than 1.)
In this example, the total page fault rate, Page Faults/sec (the top line), averages 70 per second. Page faults can interrupt the processor, but only hard faults slow the system down significantly. The next line, Pages Input/sec, measures hard faults in the numbers of pages retrieved to satisfy the fault. In this example, the rate is 21.5 pages per second on average. Although hard faults represent only 31% of all pages faulted (21.5/70), 21.5 pages per second can produce enough disk activity to cause produce a disk bottleneck.
The area between the top two lines represents soft faults; that is, pages found elsewhere in memory, either in the file system cache, in transition to disk, in the working set of another process, or brought in from memory as zeros. In this case, there are many; approximately 69% of page faults are soft faults.
The bottom line, partially superimposed upon Pages Input/sec, is Page Reads/sec, the number of times the disk is read to satisfy a page fault. This is an important indicator of the type of paging that causes system bottlenecks. The average of 16 reads/sec is quite high. Although the line representing Pages Input/sec and Page Reads/sec are close to each other, at times they diverge, indicating multiple pages read during each read operation. On average, 21.5 pages/sec are read in 16 page reads/sec. This shows that the Virtual Memory Manager is reading about 1.3 pages during each read operation.
In this example, 69% of the pages faulted because they were not in the working set of the monitored process, were found elsewhere in memory, probably in the file system cache. Because it is much quicker to retrieve data from the cache than from disk, Windows NT uses all available physical memory as cache. However, the remaining 31% of faulted pages caused in average of 16 reads/sec from disk; enough to cause a disk bottleneck. The next step is to associate paging with disk use.
Paging and Disk Reads
The memory counters, Page Reads/sec and Pages Input/sec, are indirect indicators of disk activity due to paging. Use the counters on the Logical Disk object to show paging from the perspective of the disks.
Note To enable the physical or logical disks counters, you must first run the Diskperf utility. At the command prompt, type diskperf -y, then restart the computer. For fault tolerant disk configurations (FTDISK), type diskperf -ye, then restart the computer. For more information, see "Enabling the Disk Counters" in Chapter 14, "Detecting Disk Bottlenecks."
To investigate the effect of paging on your disks, add the following counters for the Logical Disk object to your memory charts and reports:
• % Disk Read Time: _Total
• Avg. Disk Read Queue Length: _Total
• Disk Reads/sec: _Total
• Avg. Disk Bytes/Read: _Total
The disk activity counters, % Disk Read Time and Avg. Disk Read Queue Length, indicate disk reading activity during periods of high paging, as measured by the memory counters. Because disk activity has many causes, the Disk Reads/sec counter is included. This lets you subtract the reads due to paging (Memory: Page Reads/sec), from all reads (Disk Reads/sec), to determine the proportion of read operations caused by paging.
The transfer rate (as represented by Avg. Disk Bytes/Read) multiplied by Disk Reads/sec yields the number of bytes read per second, another measure of disk activity.
In this graph, the thick black line running at 100% for most of the test interval is % Disk Read Time: _Total JBAs the following report shows, total disk read time for both disks actually exceeds 100%.
The white line, Avg. Disk Read Queue Length, shows that the high disk activity is producing a large queue to disk. The scale is multiplied by 10 so that you can see the line. The disk queue averages more than 2 and, at its maximum, exceeds 5. More than two items in the queue can affect performance.
The remaining lines, representing the memory counters, are scaled quite small so that they can fit in the graph. Their values are more evident in the following report.
Although this report displays the same information as the graph, it shows average values for all counters. In this example, the average rate of page faults—123 per second—is extremely high. But the Pages Input/sec counter shows that, on average, only 42 of them, or 34%, were retrieved from disk. The rest of the pages are found in memory.
The 42 pages retrieved per second on average required 32 reads from the disk per second from the disk. Even though two-thirds of the page faults were satisfied by memory, the remaining one-third was enough to consume the disk and produce a large queue. In general, a high-performance disk is capable of 40 I/Os per second. This disk was quite close to its physical maximum.
Logical Disk: Disk Reads/sec: _Total, a measure of all disk reads, at 32.778 per second, is within sampling error range of the 32.398 average Memory: Page Reads/sec. This shows that virtually all of the reading was done to find faulted pages. This confirms that paging is cause of the bottleneck. If this pattern were to persist over time, it would indicate a memory shortage, not a disk problem.
Paging and Disk Writes
Paging causes many reads from the disk, but it also causes some writing to disk. When the Virtual Memory Manager locates a faulted page, if memory is available, it simply adds the retrieved pages to the working set of the process. When memory is scarce, it deletes a page of the current working set for every new page it brings in. If data on the page has changed, it must write the data to disk before it frees up the page frame. If many of the faulted pages are data pages with changes, the writing can be significant, and Performance Monitor can measure it.
Tip If your application's page faults are causing disk writes, they are probably faulting data pages, not code pages. Reorganizing your application's data structures and the way the program references them can reduce page faults.
To limit the writes to disk, the Virtual Memory Manager maintains a modified page list, a list of changed pages that need to be written to disk. Periodically, the modified page writer, a thread in the System process, writes some of the pages out to free up space. As free space becomes scarce, the modified page thread is activated more often.
To measure writes to disk resulting from paging, chart:
• Memory: Page Writes/sec
• Memory: Pages Output/sec
• Logical Disk: Disk Writes/sec
• Logical Disk: Disk Write Bytes/sec
• Logical Disk: Avg. Disk Write Queue Length.
Memory: Page Writes/sec indicates how often changes to pages had to be written to back to disk to free up space in a working set.
Logical Disk: Disk Writes/sec represents all writing to disk, including writes not associated with paging, like writing the Performance Monitor log file, or updating system statistics. Comparing Disk Writes to Page Writes reveals the proportion of total disk writing that consists of writing pages from memory back to disk.
Comparing Memory: Pages Output/sec and Logical Disk: Disk Write Bytes/sec also indicates the proportion of disk writing activity that is directly related to paging, but it shows it in bytes, rather than time.
Pages output/sec is the number of pages written to disk to free up page frames.
Disk Write Bytes/sec is the number of bytes written to disk per second for all purposes.
To compare Pages Output/sec to Disk Write Bytes/sec, multiply the number of pages by the number of bytes per page. MIPS, Intel, and PowerPC processors have 4096 bytes per page; DEC Alpha processors have 8192 bytes per page.
This graph compares disk read time with disk write time while the system is paging. The black line represents time reading from both physical disks on the system; the white line represents time writing. In this example, there was far more reading than writing, but the amount of writing is not insignificant.
This graph of the same events shows how much of the writing time is attributable to paging. The thin black line represents all writes to disk per second; the heavy, white line represents disk-writes due to paging. The space between them represents disk-writes other than changed pages.
In this example, the curves are almost the same shape, but the there are twice as many disk writes as page writes. This indicates that the disk writes that didn't consist of writing changed pages were related to writing them, such as writing the Performance Monitor log and writing system records.
The following report shows another measure of writing. The report includes counts of writing in pages as well as time.
To compare the Disk Write Bytes/sec to Pages Output/sec, multiply the number of pages by the page size, in this case, 4096 bytes/page. In this example, 11.952 pages out of 14.452, or 82% of disk writing is directly attributable to paging.
The Paging File
It is useful to monitor the size of the paging file, Pagefile.sys, when investigating memory shortages. On systems that have relatively little excess memory, an increased demand for memory causes Windows NT to expand the paging file.
The paging file, Pagefile.sys, is a block of disk space reserved by the operating system to back up physical memory. When processes require more physical memory than is available, the Virtual Memory Manager frees up space in memory by writing less-frequently-referenced pages back to the paging file on disk. As demand for memory increases, the Virtual Memory Manager expands the paging file until it runs out of disk space or reaches the paging file reaches its maximum size.
Note Windows NT creates one paging file on the physical drive on which the operating system is installed. The default size is equal to physical memory plus 12 MB. You can change the size of the paging file, and you can create one additional paging files on each logical disk partition in your configuration.
Use the Control Panel System applet Performance tab. (Right-click My Computer, select Properties, then select the Performance tab.) The Virtual Memory box shows the current size of your paging files and lets you add new files or change the size of existing ones. To change the value, click the Change button to display the Virtual Memory window, and then click Set.
To observe the changing size of the paging file, chart Process: Page File Bytes for individual processes and Process: Page File Bytes: _Total, an instantaneous measure of total number of bytes in the paging file.
The data in the following graph was logged during a memory leak (when memory is allocated faster than it is freed). A testing tool, LeakyApp, was run to allocate as much memory as it could, and the thrashing at the end of the graph shows the system running out of virtual memory and repeatedly warning the user.
In this graph, the thin black line is Page Faults/sec. The next thick black line is Pages Output/sec. The thick white line is Available Bytes, and the thick black line at the bottom is Pages Input/sec.
The relatively low rate of pages input (those read from disk to satisfy page faults) and very high rate of pages output (those written back to disk to trim working sets and recover physical memory) reveals the strategy of the Virtual Memory Manager to conserve available bytes by writing pages from physical memory to the paging file. Even as physical memory is consumed by processes, the number of available bytes of memory rarely falls below 4 MB, though it might vary rapidly and considerably within its range. In this example, Available bytes never drops below 372,736 bytes, as indicated by the Min value on the value bar.
However, the paging file on disk (Pagefile.sys) fills up rapidly until it runs out of space on disk.
In this graph, the lines representing the private bytes LeakyApp allocated to itself, the size of the paging file used by LeakyApp, and the total number of used bytes in the paging file, are all superimposed upon each other. Although the values are not identical, they are quite similar.
The graph is evidence of the significant growth of the paging file during a memory shortage. The plateau on the graph indicates that the maximum size of the paging file was reached, and LeakyApp could no longer allocate additional memory.
To see the values of all data points of the Page File Bytes curve, export the data to a spreadsheet like Microsoft Excel. For details on exporting logged data, see "Exporting Data" in Performance Monitor Help.
Monitoring the Nonpaged Pool
The nonpaged pool is an area of kernel-mode operating system memory reserved for objects that cannot be paged to disk. The size of the nonpaged pool is adjusted by the Virtual Memory Manager based on the amount of physical memory in the computer and on the demand for pool space by applications and services. On workstations, the absolute size of the nonpaged pool is not relevant to performance monitoring. However pool leaks, which are characterized by continuous, unexplained growth in the nonpaged pool, are a concern. Pool leaks usually result from an application error. They can cause a memory shortage because space occupied by the pool is no longer available to other processes.
For more information about memory pools, see "Terms and Concepts" earlier in this chapter.
Note The size of the nonpaged pool usually is not a concern on workstations, but can be a factor on servers where the number of trusting account (user) domains depends on the size of the nonpaged pool. For more information, see "Number of Trusted Domains" in Windows NT Networking Guide, Chapter 2, "Network Security and Domain Planning."
Many performance monitoring tools monitor the paged and nonpaged memory pools. Process Explode (Pview.exe), Process Monitor (Pmon.exe) and Process Viewer (Pviewer.exe), tools included the Windows NT Resource Kit 4.0 CD, display the amount of space in the pools allocated to each process. Task Manager and Performance Monitor display the total size of each memory pool as well as the space allocated to each process. Although their display formats vary widely, all of the these tools collect their data from the same internal counters.
Important The internal counters that measure the size of the nonpaged pool for each process are not precise. The counter values are estimates which count duplicated object handles as well as space for the object. Also, because the process pool size counts are rounded to page size, pool space is overestimated when a process is using part of a page. Therefore, it is important to monitor changes in pool size for a process, not the absolute value. Total pool size counts are precise. Therefore, the sum of pool sizes for each process might not equal the value for the whole system.
In Task Manager, the current size of the nonpaged pool is listed in the Kernel Memory box on the Performance tab. On the Processes tab, you can add a column to monitor the size of the nonpaged pool allocated to each process. For more information, see "Task Manager" in Chapter 11, "Performance Monitoring Tools."
Performance Monitor lets you to log changes in pool size over time. Pool size changes slowly, so you might have to log for several hours or even days to catch a pool leak. Use these Performance Monitor counters to monitor the pool size:
• Memory: Pool Nonpaged Bytes
• Memory: Pool Nonpaged Allocations
• Process: Pool Nonpaged Bytes
The counters on the Memory object monitor the total size of the nonpaged pool and the number of allocations of pool space for the whole system. The counter on the Process object monitors nonpaged pool space allocated to each process.
To use Performance Monitor to monitor the nonpaged pool for leaks, follow these procedures:
• Record the size of the nonpaged pool when the system starts. Then log the Memory and Process objects for several days at a 60-second interval. You should be able to associate any increases in the size of the pool, as indicated by Nonpaged Pool Bytes, with the start of a process, as indicated by Process: % Processing Time. When processes are stopped, you should see a decrease in pool size.
• Set a Performance Monitor alarm to notify you when Nonpaged Pool Bytes increases by more than 10% from its value at system startup.
After the system is started, the nonpaged pool should increase in size only when a process is started. When the process ends, the nonpaged pool should return to its original size. Any other unexplained growth in the nonpaged pool is considered to be abnormal.
Pool leaks typically occur when an application creates objects, then fails to close their handles when it is done with them, or when an application repeatedly opens a file or other object unnecessarily. Each time the application attempts to open the object, more space is allocated for the same object in the nonpaged pool. The bytes allocated for the pool come from physical memory, and an unnecessarily large pool denies those bytes to the operating system and applications.
Windows NT dynamically adjusts the size of the paged and nonpaged memory pools for optimum performance. However, you can change the size of the paged and nonpaged pools on Windows NT Workstations and Servers by editing the Registry. Use Regedt32 or Regedit, tools installed with Windows NT, to edit the Registry. Editing the Registry might cause serious system problems that require reinstalling Windows NT. The values for this Registry entry are entered in bytes.
The registry parameters for paged and nonpaged pool are in:
Subtree
HKEY_LOCAL_MACHINE
Key
\System\CurrentControlSet\Control\SessionManager
\MemoryManagerment
Name
NonPagedPoolSize
PagedPoolSize
Type
REG_DWORD
Top of page
Examining Your Applications
High, sustained paging rates indicate a memory shortage, but not its cause. You might have insufficient physical memory to support the operating system, applications, and network services. However, you might have an application that is using memory inefficiently or leaking memory—that is, allocating memory, but not releasing it.
When the hard page fault rate on your system rises, investigate the memory use of your applications by using the following counters:
• Process: Private Bytes
• Process: Working Set
• Process: Page Faults/sec
• Process: Page file Bytes
The first step is to distinguish between a general memory shortage that is affecting all applications and a memory shortage caused by a particular application. Chart Process: Page Faults/sec for all processes.
The following graph shows a general memory shortage that is causing page faults in many processes.
In this example, Memory: Page Faults/sec (the tall white bar) represents all page faults for the system. The other bars represent page faults for each application or service running on the system. This graph demonstrates that no single application is causing a memory shortage. In this case, the high paging rate is best resolved by adding more physical memory.
In contrast, the following graph shows a single application, LeakyApp, a test tool, causing a high rate of page faults.
In this example, Memory: Page Faults/sec (the first tall bar) represents all page faults for the system. The tall white bar represents page faults for the test tool. The other bars, which are barely visible, represent the fault rates of other processes.
Although this memory shortage affects all system processes, it is attributable to a single application. Were it a real application instead of a test tool, a more thorough investigation would be in order. It would be prudent to consider replacing the application, moving it to another computer or, if it is your application, trying to improve it memory efficiency.
The standard performance monitoring tools are designed to determine that an application is using memory inefficiently, but not why. If you have an inefficient application, use the following tools for further diagnosis:
• Page Fault Monitor (Pfmon.exe), a utility on the Windows NT Resource Kit 4.0 CD in the Performance Tools group (\Perftool\Meastool), produces a running list of the hard and soft page faults generated by each function call in a running process. For more information, see Rktools.hlp.
• The Working Set Tuner analyzes the patterns of function calls in your application code and recommends a code organization that consumes the least possible physical memory. It requires some work from the developer, but has been demonstrated to improve memory efficiency by as much as 50%. The Working Set Tuner is part of the Win32 Software Development Kit.
The remainder of this section explains how to determine the effect of application memory use on your system.
Working Set
The working set of a process is the physical memory assigned to the process by the operating system. It contains the code and data pages recently referenced by the process. When a process needs code or data that is not in its working set, a page fault occurs, and the Virtual Memory Manager adds the new pages to the working set.
• When memory is plentiful, the Virtual Memory Manager leaves older pages in the working sets even when it adds new pages. This results in larger working sets.
• When memory becomes scarce, the Virtual Memory Manager recovers memory for the system by moving less recently referenced pages out of the working sets and by removing an older page for each new page added. Although trimming the working sets is necessary, it can cause more page faults.
One measure of application efficiency is how small its working set can get without causing a large number of page faults. In general, the more that data used in sequence is stored in sequence, the fewer pages the application needs and the smaller its working set can be.
To measure the working sets of processes, chart:
• Process: Working Set
• Process: Page Faults/sec
• Memory: Available Bytes
Working Set is an instantaneous counter that shows the current number of bytes in a process's working set. Process: Page Faults/sec is the rate of page faults for the process. The following graph demonstrates that the Virtual Memory Manager adjusts the size of a process's working set attempting to respond to a process's page fault rate by increasing its working set and then by trimming it.
In this graph, the white line represents the working set of a process; the black line represents the page-fault rate for the process. Notice that the vertical maximum has been increased to 200 so that the working set curve fits in the graph. The similar shapes of these curves reflect their cause and effect relationship of the Virtual Memory Manager responding to the page faults.
In this example, the overall page-fault rate is quite high, averaging over 12 page faults/sec. As the page-fault rate rises, the Virtual Memory Manager adds pages to the working set of the process to reduce the rate of the page faults. About midway through the graph, as the page-fault rate drops to near zero—probably because the process has much of what it needs—the Virtual Memory Manager begins to trim the working set, but a resurgence of page faults drives the size of the working set back up.
Determining the Minimum Working Set
When you improve the organization of code and data references in your program, you:
• reduce its minimum working set.
• reduce the amount of physical memory it consumes.
• improve its use of the file system cache (as described in Chapter 15, "Detecting Cache Bottlenecks").
To demonstrate one aspect of this improvement, measure the minimum working set of an application before and after you tune it.
To see the actual minimum working set of a process, you must reduce available memory to its minimum. This compels the Virtual Memory Manager to trim all but currently active pages from a process's working set.
Clearmem, a utility on the Windows NT Resource Kit 4.0 CD in the Performance Tools group (\PerfTool\MeasTool), determines the size of your computer's physical memory, allocates enough data to fill it, then references the data as quickly as possible. It also accesses files to clear the cache. This reduces memory available to other processes to a minimum. Then, Clearmem releases the allocated memory to restore normal system functions.
To find the minimum working set for a process
1.
Start the process, then start a Performance Monitor log to measure the Process object once per second.
2.
Run Clearmem It usually runs for less than a minute.
3.
Stop the log, change to Chart view, and in the Options menu, set Data From to the log file.
4.
Use the Time Window to advance the beginning time to a point when Clearmem was running. (The Clearmem process does not appear in the Add to Chart dialog box unless the process is active at beginning time of the Time Window). Chart Process: % Processor Time for Clearmem to show the duration of the Clearmem process. Then, enlarge the Time Window so that the display begins just after Clearmem started running and ends just before Clearmem stopped running.
5.
Chart Process: Working Set for the process you are testing.
6.
Read the minimum working set size from the Min field on the value bar.
In this graph, the thick, black line is the working set of Clearmem, the memory-consuming test tool. As Clearmem increases its working set (as represented by the sharply increasing curve) the working sets of all other processes are trimmed until they contain only pages currently being used and those most recently accessed. The other lines in the graph represent the working sets of other processes. The value bar shows that the minimum working set for Explorer in this test is 184,320 bytes.
Available Bytes
Available bytes is a measure of free memory. The Virtual Memory Manager continually adjusts the space used in physical memory and on disk to maintain a minimum number of available bytes for the operating system and processes. When available bytes are plentiful, the Virtual Memory Manager lets the working sets of processes grow, or keeps them stable by removing an old page for each new page added. When available bytes are few, the Virtual Memory Manager must trim the working sets of processes to maintain the minimum required.
The following graph records the start of a process. It shows the relationship between page faults, the size of the working set of a process, and available bytes. When a process is faulting many pages, the Virtual Memory Manager increases the process's working set to slow the fault rate. The memory added to the working set is taken from available bytes, which shrinks accordingly. When available bytes falls close to the minimum tolerated by the system, the Virtual Memory Manager trims the working sets to recover some available bytes. The smaller working set makes page faults more likely, requiring the Virtual Memory Manager adjustment cycle to begin again.
In this graph, which records the start of the Microsoft Word process, Winword.exe, the thick black line is Available Bytes, the white line is Working Set, and the thin black line is Page Faults/sec. The vertical maximum has been increased to 200 to accommodate the high values.
This example demonstrates the close association between the number of page faults and the increase in the working set. Note the inverse relationship between the size of the process's working set and available bytes for the system.
For the first third of the graph, the Virtual Memory Manager responds to page faults by dramatically increasing the size of the working set. Then, to recover available bytes, the working set is trimmed; thereafter, the Virtual Memory Manager responds to page faults by moving in just needed pages and by removing pages not recently referenced to keep the size of the working set to a minimum. At the end of the graph, an increase in the page fault rate drives the size of the working set back up.
Top of page
Resolving a Memory Bottleneck
Although more memory is the easy solution to a memory bottleneck, it isn't always the right solution.
• Monitor your applications and replace or correct those that leak memory or use it inefficiently.
• Localize your application's data references. Page Fault Monitor (Pfmon.exe), a tool on the Windows NT Resource Kit 4.0 CD, produces a running list of hard and soft page faults generated by each function call in a process.
• Localize your application's code references. The Working Set Tuner, included in the Win32 Software Development Kit, recommends an optimal organization of code functions to localize code page access.
• Increase the size of the paging file. In general, the bigger you can make it, the better it is. You can also have multiple paging files, though you can have only one on each logical drive. To change the size of the paging file, use the Control Panel System applet Performance tab. (Right-click My Computer, select Properties, then select the Performance tab.) The Virtual Memory box shows the current size of your paging files and lets you add new files or change the size of existing ones.
• Check the available space on your hard drives. If you have added memory, increase the size of your paging files. The paging file might need to expand to map the additional memory. If the space is not available, it might produce the symptoms of a memory bottleneck.
• Increase the size of your secondary memory cache, especially if you've just added memory. When you add memory, the secondary cache must map the larger memory space.
The amount of secondary cache a system supports depends upon the design of the motherboard. Many motherboards support several secondary cache configurations (from 64K–512K or 256K–1 MB). Increasing cache size usually requires removing the existing static ram (SRAM) chips, replacing them with new SRAM chips, and changing some jumpers. Doing so would be helpful anytime you have a working set larger than your current secondary cache.
• Remove unnecessary protocols and drivers. Even idle protocols use space in the paged and nonpaged memory pools.
If all else fails, add memory. After struggling with a memory bottleneck and its grueling effects, you will find the improved response of the entire system well worth the investment.
0 Comments:
Post a Comment
<< Home