Environments That Benefit From Fragmentation Elimination

From DBArtisan
Jump to: navigation, search

Go Up to Space Analyst Expert Guide

One important point must be understood regarding fragmentation: not every situation benefits from fragmentation removal. Before you invest a lot of time and effort into diagnosing and attempting the removal of fragmentation, you first need to see if jumping through hoops will actually improve your database's availability and performance.

At the highest level, the environments that benefit the most from fragmentation removal are read-intensive databases where large indexes are being frequently scanned. There are a couple of reasons why this is the case.

In most OLTP environments, data retrieval tends to be mostly selective, which negates most of the bad effects of external fragmentation. OLTP environments also often benefit from internal fragmentation because it's smart to leave room on index pages that are being added to and updated frequently. Free index page space helps avoid the dreaded page split where a page is divided in two due to the fact that incoming clustered index data can't logically fit on a needed page, so IBM DB2 makes room by creating two index pages out of one. Administrators oftentimes specify a low FILLFACTOR for their indexes to create internal fragmentation and hopefully avoid page splits.

Data warehousing or databases with many resource-intensive/scanning queries are another matter. These environments will likely benefit from fragmentation removal. One reason is that the indexes tend to be larger, and therefore are not cached by IBM DB2 as easily as small indexes. Remember that fragmentation has no effect on performance once the data is nestled safely in IBM DB2's memory caches.

Another reason why these environments benefit from fragmentation removal is the effect fragmentation has on IBM DB2's read ahead manager. The read ahead manager helps queries that perform large scans by scanning index pages (and data pages) that it believes will be read and placing them into memory before they are actually needed. Naturally, this process can reduce overall scan times because data read in memory is many times faster than when that same data must be retrieved from disk. However, fragmentation affects how well the read ahead manager works. The read ahead manager will dynamically adjust the size of I/O it performs based on the actual physical order of the data and index pages on the server. When little external fragmentation is present, the read ahead manager can work very efficiently because it can read larger blocks at a time, whereas excessive fragmentation causes the read ahead manager to read smaller blocks of data. The end result when fragmentation is present is less overall throughput.

SQL execution patterns are something else to examine. Queries that perform large index scans are the ones that should see performance improvements when fragmentation is removed in a database. In DBArtisan, you can visually set I/O statistics on in the ISQL facility and examine queries to determine the amount of physical and logical read activity. Queries that show physical and read ahead I/O may see their response times drop when fragmentation has been removed.