Defragmentation

What is defragmentation and why do I need it?

Defragmentation, also known as “defrag” or “defragging”, is the process of reorganizing the data stored on a hard drive so that related pieces of data are put back together, all lined up contiguously. You could say that defragmentation is like housecleaning for your server or PC: it picks up all of the pieces of data that are spread across your hard drive and puts them back together again.

Why is defragmentation important? Because fragmentation grows constantly on every computer, and if you don’t “clean house”, your servers and PCs suffer.

How Fragmentation Occurs

Disk fragmentation occurs when a file is broken up into pieces to fit on the disk. Because files are constantly being written, deleted and resized, fragmentation is a natural occurrence. When a file is spread out over several locations, it takes longer to read and write. But the effects of fragmentation are far more widespread.
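As a rough illustration (a toy model, not any particular file system), a first-fit cluster allocator in Python shows how ordinary write-and-delete churn leaves a later, larger file split across non-contiguous gaps:

```python
# Toy first-fit cluster allocator (illustrative only, not a real file system):
# create/delete churn forces a later, larger file into non-contiguous gaps.

disk = [None] * 24  # each slot represents one disk cluster

def allocate(name, clusters):
    """First-fit: fill the earliest free clusters, so a file may land in
    several separate gaps (fragments)."""
    placed = 0
    for i, slot in enumerate(disk):
        if slot is None and placed < clusters:
            disk[i] = name
            placed += 1
    if placed < clusters:
        raise RuntimeError("disk full")

def delete(name):
    for i, slot in enumerate(disk):
        if slot == name:
            disk[i] = None

def fragments(name):
    """Count contiguous runs of clusters belonging to `name`."""
    runs, prev = 0, False
    for slot in disk:
        cur = slot == name
        if cur and not prev:
            runs += 1
        prev = cur
    return runs

allocate("A", 6)   # clusters 0-5
allocate("B", 6)   # clusters 6-11
delete("A")        # leaves a 6-cluster hole at the front
allocate("C", 10)  # fills the hole, then spills past B
print(fragments("C"))  # 2 -- the new file is split into two pieces
```

Defragmentation is the reverse of this process: moving C’s clusters so they sit in one contiguous run again.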

Effects of Fragmentation on Computer Performance

Many users blame computer performance problems on the operating system or simply think their computer is “old”, when disk fragmentation is most often the real culprit. The weakest link in computer performance is the disk. It is at least 100,000 times slower than RAM and over 2 million times slower than the CPU, making it the primary bottleneck in computer performance. File fragmentation directly affects the access and write speed of that disk, steadily degrading computer performance to unviable levels. Because all computers suffer from fragmentation, this is a critical issue to resolve.

Problems caused by fragmentation include:

Performance:

  • Server and PC slowdowns and performance degradation
  • Slow backup times – backups even failing to complete within their backup window
  • Unnecessary I/O activity on SQL servers or slow SQL queries
  • Slow boot-up times
  • Increase in the time for each I/O operation or generation of unnecessary I/O activity
  • Inefficient disk caching
  • Slowdown in read and write for files
  • High level of disk thrashing (the constant writing and rewriting of small amounts of data)
  • Long virus scan times

System Reliability:

  • Crashes and system hangs
  • File corruption and data loss
  • Boot up failures
  • Aborted backup due to lengthy backup times
  • Errors in and conflict between applications
  • Hard drive failures
  • Compromised data security

Longevity, Power Usage, Virtualization, and SSD:

  • Premature Server or PC system failure
  • Wasted energy costs
  • Slower system performance and increased I/O overhead due to disk fragmentation compounded by server virtualization
  • Write performance degradations on SSDs due to free space fragmentation. Read about Write Amplification Factor (WAF) in Do SSDs degrade over time?

Performance Gains from Eliminating Fragmentation:

  • Better application performance
  • Reduced timeouts and crashes
  • Shorter backups
  • Faster data transfer rates
  • Increased throughput
  • Reduced latency
  • Extended hardware lifecycle
  • Increased VM density
  • Overall faster Server & PC speeds
  • Faster boot-up times
  • Faster anti-virus scans
  • Faster internet browsing speeds
  • Faster read & write times
  • Increased system stability
  • Reduced PC slows, lags & crashes
  • Reduced unnecessary I/O activity
  • Reduced file corruption and data loss
  • Lower power consumption and energy costs
  • Lower cloud compute costs

SQL Server Performance

One of the biggest hardware bottlenecks of any SQL Server is disk I/O. And anything that DBAs can do to reduce SQL Server’s use of disk I/O will help boost its performance. Some of the most common things DBAs do to reduce disk I/O bottlenecks include:

  • Tuning queries to minimize the amount of data returned.
  • Using fast disks and arrays.
  • Using lots of RAM, so more data is cached.
  • Frequent DBCC REINDEXing of data to remove logical database fragmentation.

Another less frequently used method to reduce overall disk I/O, but nonetheless important, is to perform physical defragmentation of SQL Server program files, database files, transaction logs, and backup files. Physical file fragmentation occurs in two different ways. First, individual files are broken into multiple pieces and scattered about a disk or an array (they are not contiguous on the disk). Second, free space on the disk or array consists of little pieces that are scattered about, instead of existing as fewer, larger free spaces. The first condition requires a disk’s head to make more physical moves to locate the physical pieces of the file than contiguous physical files. The more physically fragmented a file, the more work the disk drive has to do, and disk I/O performance is hurt. The second condition causes problems when data is being written to disk. It is faster to write contiguous data than noncontiguous data scattered over a drive or array. In addition, lots of empty spaces contribute to more physical file fragmentation.

Software Spotlight by Brad M. McGehee

If your SQL Server is highly transactional, with mostly INSERTS, UPDATES, and DELETES, physical disk fragmentation is less of an issue because few data pages are read, and writes are small. But if you are performing lots of SELECTS on your data, especially any form of a scan, then physical file fragmentation can become a performance issue as many data pages need to be read, causing the disk head to perform a lot of extra work.
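The extra head work described above has a simple cost model: one seek per fragment plus a transfer cost proportional to file size. The millisecond figures below are illustrative assumptions, not measurements of any particular drive:

```python
# Rough cost model for reading a file from a mechanical disk: one seek per
# fragment plus a transfer cost proportional to size. The millisecond figures
# are illustrative assumptions, not measurements of any particular drive.

SEEK_MS = 9.0             # assumed average seek + rotational delay
TRANSFER_MS_PER_MB = 8.0  # assumed sequential transfer cost

def read_time_ms(size_mb, n_fragments):
    """Estimated time to read a file split into n_fragments extents."""
    return n_fragments * SEEK_MS + size_mb * TRANSFER_MS_PER_MB

print(read_time_ms(100, 1))    # contiguous 100 MB file: 809.0 ms
print(read_time_ms(100, 500))  # same file in 500 pieces: 5300.0 ms
```

Under these assumptions the transfer cost is identical in both cases; the entire difference comes from the extra seeks, which is why scan-heavy SELECT workloads feel fragmentation the most.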

Fragmentation never stops. Although NTFS tries to minimize file fragmentation, it doesn’t do a very good job of it. Because of this, defragmentation needs to be done continually if you want optimal disk I/O performance.

Read more How do I get the most performance from my SQL Server?

SAN, NAS, and RAID

Fragmentation prevention offers significant benefits when implemented on modern hardware technologies such as RAID, NAS, SAN, and all-flash storage. SANs, NAS devices, corporate servers, and even high-end workstations and multimedia-centric desktops characteristically implement multiple physical disk drives in some form of fault-tolerant disk striping (RAID). Because the purpose of fault-tolerant disk striping is to offer redundancy, as well as improved disk performance by splitting the I/O load, it is a common misconception that fragmentation does not have a negative impact. It is also important to note that the interface (EIDE, SCSI, SATA, iSCSI, Fibre Channel, etc.) does not alter the relevance of defragmentation.

Regardless of the sophistication of the hardware installed, the SAN appears to Windows as one logical drive. The data may look well organized on the arrays, but to the OS it is still fragmented. Windows has fragmentation built into its very fabric. Open up the defrag utility on any running server or PC and see how many fragments currently exist and which file has the most. If you haven’t been running defrag, you will find files in thousands of pieces. So when Windows does a read, it has to logically find all those thousands of pieces, and that takes thousands of separate I/O operations to piece it all together before the file is fed to the user. That exerts a heavy toll on performance, which admittedly could be masked to some degree by the capability of the SAN hardware.

As the discussion below shows, these devices do suffer from fragmentation. This is attributed to the impact of fragmentation on the “logical” allocation of files and, to a varying degree, their “physical” distribution.

The file system driver, NTFS.sys, handles the logical location of files (what the operating system and a defragmenter affect). The actual writing is then passed to the fault-tolerant device driver (hardware or software RAID), which, according to its own procedures, handles the placement of files and the generation of parity information, finally passing the data to the disk device driver (provided by the drive manufacturer).

As noted, stripe sets are created, in part, for performance reasons. Access to the data on a stripe set is usually faster than access to the same data would be on a single disk, because the I/O load is spread across more than one disk. Therefore, an operating system can perform simultaneous seeks on more than one disk and can even have simultaneous reads or writes occurring.

Stripe sets work well in the following environments:

  1. When users need rapid access to large databases or other data structures.
  2. Storing program images, DLLs, or run-time libraries for rapid loading.
  3. Applications using asynchronous multi-threaded I/O.

Stripe sets are not well suited in the following situations:

  1. When programs make requests for small amounts of sequentially located data. For example, if a program requests 8K at a time, it might take eight separate I/O requests to read or write all the data in a 64K stripe, which is not a very good use of this storage mechanism.
  2. When programs make synchronous random requests for small amounts of data. This causes I/O bottlenecks because each request requires a separate seek operation. 16-bit single-threaded programs are very prone to this problem.
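The 64K-stripe arithmetic from point 1 can be expressed directly (a trivial Python sketch):

```python
def requests_per_stripe(stripe_kb, request_kb):
    """Number of separate I/O requests needed to cover one stripe when a
    program reads or writes in fixed-size chunks (ceiling division)."""
    return -(-stripe_kb // request_kb)

print(requests_per_stripe(64, 8))   # 8 -- the 8K-request case from point 1
print(requests_per_stripe(64, 64))  # 1 -- stripe-sized requests avoid the split
```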

It is quite obvious that RAID can exploit a well-written application that takes advantage of asynchronous multi-threaded I/O techniques. Physical members in the RAID environment are not read or written to directly by an application. Even the Windows file system sees the array as one single “logical” drive. This logical drive has logical cluster numbering (LCN) just like any other volume supported under Windows. As an application reads and writes to this logical environment (creating new files, extending existing ones, and deleting others), the files become fragmented. Because of this, fragmentation on this logical drive will have a substantial negative effect on performance.

When an I/O request is processed by the file system, a number of attributes must be checked, which costs valuable system time. If an application has to issue multiple “unnecessary” I/O requests, as in the case of fragmentation, not only is the processor kept busier than needed, but once each I/O request has been issued, the RAID hardware/software must process it and determine to which physical member to direct it. Intelligent RAID caching at this layer can mitigate the negative impact of physical fragmentation to varying degrees, but it will not remove the overhead that logical fragmentation imposes on the operating system.

To gauge the impact of fragmentation on a RAID system, use performance monitoring tools such as PerfMon and examine the Average Disk Queue Length, Split IO/Sec, and % Disk Time counters. Additional disk performance tuning information can be found in Microsoft’s online resources.

Download a Free Trial of new DymaxIO® fast data software now!

More about SAN Performance

As high-performing storage solutions based on block protocols (e.g. iSCSI, FC), SANs excel at optimizing block access. SANs work at a storage layer underneath the operating system’s file system, usually NTFS when discussing Microsoft Windows®. That means a SAN is unaware of “file” fragmentation and unable to solve this issue.


Fig 1.0: Diagram of Disk I/O as it travels from Operating System to SAN LUN.

With file fragmentation causing the host operating system to generate additional, unnecessary disk I/Os (more overhead on CPU and RAM), performance suffers. In most cases, because of the randomness of I/O requests due to fragmentation and concurrent data requests, the blocks that make up a file will be physically scattered in uneven stripes across a SAN LUN/aggregate. This causes even greater degradation in performance.

Fortunately, there are simple solutions to NTFS file system fragmentation: fragmentation prevention and defragmentation. Both approaches solve file fragmentation at the source, the local disk file system.

Learn more, read: SAN best practices.

Email Servers

The benefit of preventing fragmentation in an email server environment is no different than preventing fragmentation or defragmenting any other system. It simply takes less time and system resources to access a contiguous file than one broken into many individual pieces. This improves not only response time but also the reliability of the system. Thorough database maintenance requires a combination of disk defrag and the email server utilities (internal record/index defragmentation), to achieve optimum performance and response time.

The tools for Microsoft Exchange (ESE and EDB Utilities) deal with internal record fragmentation by rearranging the internal records/indexes on the fly when possible, and at times requiring a whole new copy of the database to be created and each record copied to the new file. Even if this copy is done to a freshly formatted volume or a defragmented volume with a free space chunk large enough to contain the entire database, it’s quite likely that this new copy will become fragmented.

Email servers are prone to fragmentation, whether they are Microsoft® Exchange, Lotus® Domino®, QUALCOMM® Eudora®, or others.

There are two types of volume-centric fragmentation to be concerned about: file fragmentation and free space fragmentation. File fragmentation concerns computer files that are not contiguous, but rather are broken into scattered parts. Free space fragmentation describes a condition in which unused space on a disk is scattered into many small parts rather than a small number of larger spaces. File fragmentation causes problems with accessing data stored in computer files, while free space fragmentation causes problems creating new data files or extending (adding to) old ones.

Taken together, the two types of fragmentation are commonly referred to as “disk” or “volume” fragmentation. It is important to note that, when talking about fragmentation, we are talking about the file as a container for data and not about the contents (data) of the file itself.
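To make the distinction concrete, here is a small Python sketch (purely illustrative) that summarizes free-space fragmentation in a cluster bitmap. The same total free space can exist as one large run or as many scattered holes, and only the latter makes creating and extending files expensive:

```python
def free_space_stats(bitmap):
    """Summarize free-space fragmentation in a cluster bitmap
    (True = cluster in use, False = free)."""
    runs, length = [], 0
    for used in bitmap:
        if not used:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)
    return {"free": sum(runs), "holes": len(runs), "largest": max(runs, default=0)}

# Same amount of free space, very different shapes:
clean = [True] * 8 + [False] * 8   # one 8-cluster hole
ragged = [True, False] * 8         # eight 1-cluster holes
print(free_space_stats(clean))     # {'free': 8, 'holes': 1, 'largest': 8}
print(free_space_stats(ragged))    # {'free': 8, 'holes': 8, 'largest': 1}
```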

Typically, email application databases such as Microsoft Exchange and Lotus Domino are made up of a large container file that is pre-allocated in size at the point of creation. As the database grows beyond that initial allocation, the file becomes fragmented.

People sometimes describe fragmentation as a condition in which a file has its records (internal contents) scattered about within the file, separated by numerous small gaps. This type of fragmentation may be a problem with the application which maintains the file; it is not inherent in the operating system or file structure.

Over a period of time, any popular email application server will experience this “internal” fragmentation of its database: records are removed, but the space they occupied within the database remains and is either reused for a new record or must be skipped over.

Let’s say you have 250,000 records represented in an email server database. If an individual record (e.g. a deleted email) is removed, its location is simply marked as deleted. In the course of doing business, hundreds, perhaps thousands, of records are added and deleted, and it doesn’t take long for the internal organization of a database file, its indexes, and other related files to become quite disorganized. The speed of locating a particular record or segment of information is directly related to the amount of time spent skipping over these holes or internal fragments.
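The skip-over cost can be modeled in a few lines of Python (a toy model, not how any real mail store is laid out): deleted records are only marked dead, so a sequential scan still visits every slot, live or not.

```python
# Toy model of internal database fragmentation: deletions leave dead slots
# in place, so scanning the file pays for every slot, not just live records.

records = [{"id": i, "deleted": False} for i in range(250_000)]

for r in records[::3]:      # churn: every third record is deleted in place
    r["deleted"] = True

def scan(records):
    """Collect live records; `visited` counts every slot touched,
    including the holes left by deletions."""
    visited, live = 0, []
    for r in records:
        visited += 1
        if not r["deleted"]:
            live.append(r)
    return live, visited

live, visited = scan(records)
print(len(live), visited)   # far fewer live records, yet all 250,000 slots are read
```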

It is important to state that Condusiv software does not, under any circumstances, restructure or alter the internal contents of any file. Altering or restructuring a file is a very dangerous thing to do as one would have to have a very intimate knowledge of a given file structure and be able to detect changes as the various databases evolved with new releases. Therefore, any holes or ‘records marked as deleted’ within the database, prior to defragmentation are still present.

It is strongly recommended to run Diskeeper® on a daily basis to ensure peak performance from your database and email servers.

Testing and White papers:

Test: Impact of Disk Fragmentation

While there is little dispute among IT professionals regarding the impact of disk fragmentation on system performance, technology expert Joe Kinsella put fragmentation to the test to answer two questions:

  1. What impact does fragmentation have on user and system activities?
  2. How quickly does fragmentation accumulate as a result of these activities?

The tests were documented in a white paper that outlines the results of the testing, draws conclusions, and makes recommendations regarding managing fragmentation across your infrastructure. Read the white paper The Impact of Disk Fragmentation by Joe Kinsella.

White paper: Identifying Common Reliability and Stability Problems Caused by File Fragmentation

In this white paper, we explain some of the most common reliability and downtime phenomena associated with fragmentation, and the technical reasons behind them. This includes a discussion of each of the most common occurrences documented by our R&D labs, customers (empirical results presented), as well as others, in recent years. This paper covers:

  • Introduction
  • An Overview of the Problem
  • Reliability and Stability Issues Traceable to File Fragmentation
    1. CRASHES AND SYSTEM HANGS
    2. SLOW BACK UP TIMES AND ABORTED BACKUP
    3. FILE CORRUPTION AND DATA LOSS
    4. BOOT UP ISSUES
    5. ERRORS IN PROGRAMS
    6. RAM USE AND CACHE PROBLEMS
    7. HARD DRIVE FAILURES
  • Contiguous Files = Greater Uptime

Read the full white paper Identifying Common Reliability and Stability Problems Caused by File Fragmentation

Test: Antivirus Software & Fragmentation


At the request of one of our enterprise customers, Condusiv Technologies did a study of the effects of fragmentation on virus scan time to verify and measure the magnitude of virus scanning speed improvement.

Test Environment:

We tested the top four anti-virus products that collectively represent about 90% of the U.S. volume license market for antivirus software:

  • Symantec Antivirus 2003
  • McAfee Pro 7.02
  • Trend Micro PC-cillin 10.03
  • Panda Titanium Antivirus 2004

When a defragmenter is not run regularly, systems build up significant levels of fragmentation. The requested test scenario was one where the levels of fragmentation would be consistent with a desktop that is not regularly defragmented, and the hardware would represent desktops purchased in the last 6 months, meaning a P4 processor or equivalent and 256 MB of RAM or greater. The mix of files on the test partition included MS Office and other typical popular applications and file types. The test cases were saved as binary images so that the tests could be repeated.

We set up two typical corporate desktop systems as follows:

Desktop #1:
Windows 2000 Professional SP3, 80 GB hard drive, 512 MB RAM, AMD Athlon 2700+. Test partition: 20 GB, 114,291 files. Test partition condition before: 342,283 excess fragments (average 3.99 fragments per file), 60% free space.
Desktop #2:
Windows XP Professional SP1, 80 GB hard drive, 512 MB RAM, Intel Pentium 4 2400. Test partition: 40 GB, 200,001 files. Test partition condition before: 1,460,850 excess fragments (average 8.30 fragments per file), 35% free space.

Test Procedure:

The four top anti-virus software packages were tested in turn, restoring the binary disk image each time. Only the manual virus scan of each product was run, with other system monitoring options turned off in order to minimize timing variables.

Before testing each antivirus product, the disk was restored from the binary image to the fragmented test state described above, and then the virus scan was run and the scan time recorded. The disk was then defragmented using Diskeeper’s “Maximum Performance” defragmentation method (the default setting). When the defragmentation was complete, the virus scan of the same product was run again, with the new time recorded.

Test Cases and Results:

| Test Case  | Antivirus Software Tested     | Scan Time Before Defragmentation* | Scan Time After Defragmentation | Time Saved | Improvement Percentage |
|------------|-------------------------------|-----------------------------------|---------------------------------|------------|------------------------|
| Desktop #1 | McAfee Pro 7.02               | 1:34:43                           | 0:52:34                         | 0:42:09    | 48.76%                 |
| Desktop #1 | Symantec Antivirus 2003       | 1:00:11                           | 0:35:31                         | 0:24:40    | 40.99%                 |
| Desktop #1 | Trend Micro PC-cillin 10.03   | 1:00:40                           | 0:31:05                         | 0:29:35    | 48.76%                 |
| Desktop #1 | Panda Titanium Antivirus 2004 | 1:11:35                           | 0:27:46                         | 0:43:49    | 61.21%                 |
| Desktop #2 | McAfee Pro 7.02               | 2:14:06                           | 1:15:56                         | 0:58:10    | 43.38%                 |
| Desktop #2 | Symantec Antivirus 2003       | 1:16:55                           | 0:59:13                         | 0:17:42    | 23.01%                 |
| Desktop #2 | Trend Micro PC-cillin 10.03   | 1:10:55                           | 0:45:34                         | 0:25:21    | 35.75%                 |
| Desktop #2 | Panda Titanium Antivirus 2004 | 1:39:03                           | 0:43:29                         | 0:55:34    | 56.47%                 |

* Times shown are in hours, minutes, and seconds.
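The Time Saved and Improvement Percentage columns follow directly from the two scan times. A small Python sketch reproduces the Desktop #1 Panda row:

```python
def seconds(hms):
    """Parse an h:mm:ss scan time into total seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def improvement(before, after):
    """Seconds saved and percentage improvement between two scan times."""
    b, a = seconds(before), seconds(after)
    return b - a, round(100 * (b - a) / b, 2)

# Desktop #1 running Panda Titanium Antivirus 2004, from the table above:
saved, pct = improvement("1:11:35", "0:27:46")
print(saved, pct)  # 2629 61.21 -- i.e. 0:43:49 saved, a 61.21% improvement
```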

Conclusion:

Antivirus Scans are significantly faster on desktop systems with regularly defragmented files and free space. As capacity to quickly respond to new anti-virus attacks is a critical component of any organization’s security plan, software that automatically keeps systems fragmentation-free should not be overlooked as the tool that makes fast antivirus scans possible.

Choosing a “Defragmenter”

It is well known throughout the IT field that fragmentation is a problem. It doesn’t go away on its own, hardware upgrades won’t handle it, and all systems suffer from it; left unchecked, accumulated fragmentation on a hard drive will slow a system to the point of crashing. The only effective, modern solution is preventing fragmentation. But which solution is best?

The first thing to know about a “defragmenter” is that it must be able to prevent fragmentation in real time. Real-time fragmentation prevention is critical for the maintenance of system speed and reliability at peak levels, regardless of how busy a system might be.

A “defragmenter” that ties up system resources while it defragments files becomes a performance liability itself, much like the one it is trying to handle. So another important attribute of the right “defragmenter” is that it works invisibly, without competing for active system resources.

Choose a trusted, reliable, and certified solution.

IntelliWrite® Fragmentation Prevention Technology

Historically, fragmentation has been addressed reactively, through the defragmentation process after it has already happened. In the “early days”, fragmentation was addressed by transferring files to clean hard drives. Then manual defrag programs were introduced. The next step was scheduled defragmenters with varying degrees of automation. Truly automatic defrag was finally achieved with the development of InvisiTasking® technology by Condusiv Technologies in 2007.

However, in spite of all the progress made with defrag methods, when fragmentation occurs the system first wastes precious I/O resources writing non-contiguous files to scattered free spaces across the disk, and then uses more I/O resources to defragment them. Clearly the best strategy is to prevent the problem before it happens and always work with a clean, fast disk.

Based on knowledge of the Windows file system, IntelliWrite technology controls the file system operation and prevents the fragmentation that would otherwise occur.

Key Features:

  • Significantly improves system performance above the levels achieved with automatic defragmentation alone.
  • The improvement is particularly significant for busy servers / virtual systems on which background/scheduled defragmentation has limited time slots in which to run. In extreme cases this can make the difference between being able to eradicate fragmentation or not.
  • Substantially prevents file fragmentation before it happens, up to 85% or more.
  • Can be enabled / disabled per individual volumes.
  • Can be run in coordination with automatic defragmentation (strongly recommended for optimal performance), or independently.
  • Supports NTFS and FAT file systems on Microsoft Windows operating systems.
  • Overall lower system resource usage and consequently lower energy consumption.

Key Benefits:

  • Prevents most fragmentation before it happens
  • Improves file write performance
  • Saves energy (5.4% in controlled tests) while improving performance
  • Compatible and interoperable with other storage management solutions

IntelliWrite drastically reduces the effects of fragmentation on any system running the Windows operating system by preventing fragmentation before it happens. This automatically translates into improved system speed, performance, reliability, stability, and longevity.

InvisiTasking® background processing

InvisiTasking was coined from “invisible” and “multitasking”. This technological breakthrough allows computers to do something that has never been done before: run at maximum peak performance, continuously, without interfering with system performance or resources, even when demand is at its highest.

InvisiTasking allows Condusiv’s software to eliminate fragmentation on the fly, in real time, so that fragmentation never has a chance to interfere with the system. Best of all, InvisiTasking does this automatically, without the need for any input from the user, regardless of the size of the network. Whether it’s just one PC or thousands, just install the software and it will take care of the rest!

It’s important to note that InvisiTasking is far more advanced than previous low-priority I/O approaches that do “I/O throttling” in an effort to reduce resource conflict. Through its advanced technology, InvisiTasking goes beyond I/O alone and addresses overall system resource usage with a proactive approach, checking that each operation takes place invisibly, with true transparency, while running in the background.

Solution to Eliminate Fragmentation and Speed up Computer Systems

Diskeeper® has been increasing PC and Server performance by eliminating and preventing fragmentation for millions of global customers for decades. Diskeeper also includes caching technology for faster-than-new computer performance.

All of Diskeeper’s features and functionality are now included in DymaxIO.

DymaxIO is the most cost-effective, easy, and indispensable solution for fast data, increased throughput, and accelerated I/O performance so systems and applications run at peak performance for as long as possible.

To learn more visit www.condusiv.com/dymaxio