|
FAT32 Performance Tradeoff: FAT32 Cluster Sizes and FAT Sizes
It is generally believed to
be a "rule" of cluster size selection that "smaller is better". As FAT16
partitions have gotten larger and slack waste has gone through the roof, the
push toward using FAT32 to reduce cluster sizes has been tremendous. While
FAT32 does allow the use of larger hard disks and greatly reduced cluster
sizes, there is an important performance consideration in using FAT32 that
is not often talked about. Now that huge hard disks with dozens of gigabytes
have made FAT32 essential for newer systems, you often don't have a
practical choice between FAT16 and FAT32 any more. However, the principles
in this page are still relevant, depending on how you are setting up your
hard disk. They can also help influence your decisions regarding how to
partition very large drives under FAT32.
Let's consider a partition
that is just under 2,048 MB, the largest that FAT16 can support. If this
partition is set up under FAT16, it will result in a file allocation table
with 65,526 clusters in it, with each cluster taking up 32 kiB of disk
space. The large cluster size will indeed result in a great deal of wasted
space on the disk in most cases. Therefore, often it will be recommended
that FAT32 be used on this volume, which will result in the cluster size
being knocked down from 32 kiB to 4 kiB. This will, in fact, reduce
slack on the disk by an enormous amount, often close to 90%, and potentially
free hundreds of megabytes of previously wasted disk space. It is usually
the right thing to do in this situation.
However, there is another
side to this; you don't get this reduced cluster size for free. Since each
cluster is smaller, there have to be more of them to cover the same amount
of disk. So instead of 65,526 clusters, we will now have 524,208. Furthermore,
the FAT entries in FAT32 are 32 bits wide (4 bytes), as opposed to
FAT16's 16-bit entries (2 bytes each). The end result is that the size of
the FAT is 16 times larger for FAT32 than it is for FAT16! The following
table summarizes:
| FAT Type | FAT16 | FAT32 |
| Cluster Size | 32 kiB | 4 kiB |
| Number of FAT Entries | 65,526 | 524,208 |
| Size of FAT | ~ 128 kiB | ~ 2 MiB |
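The arithmetic behind this table can be checked with a few lines of Python. This is only a rough sketch: the helper name `fat_size_bytes` is made up for illustration, and it simply counts one FAT entry per cluster, ignoring reserved entries and any extra copy of the FAT that a real volume may keep.

```python
KIB = 1024
MIB = 1024 * KIB

def fat_size_bytes(cluster_count: int, entry_bytes: int) -> int:
    """Approximate size of one FAT: one entry per cluster."""
    return cluster_count * entry_bytes

fat16_clusters = 65_526               # figure from the text, at 32 kiB per cluster
fat32_clusters = fat16_clusters * 8   # 4 kiB clusters are 8x smaller, so 8x as many

fat16_size = fat_size_bytes(fat16_clusters, 2)   # 16-bit entries
fat32_size = fat_size_bytes(fat32_clusters, 4)   # 32-bit entries

print(fat32_clusters)                              # number of FAT32 entries
print(round(fat16_size / KIB), "kiB for FAT16")
print(round(fat32_size / MIB, 1), "MiB for FAT32")
print(fat32_size // fat16_size, "times larger")
```

Eight times as many entries, each twice as wide, is where the factor of 16 comes from.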
Still worse, if we increase
the size of the FAT32 volume from 2 GiB to 8 GiB, the size of the
FAT increases from around 2 MiB to a rather hefty 8 MiB. The significance of
this is not the fact that the FAT32 volume will have to waste several
megabytes of space on the disk to hold the FAT (after all, it is saving far
more space than that by reducing slack a great deal). The real problem is
that the FAT is referred to a lot during normal use of the disk,
since it holds all the cluster pointers for every file in the volume. Having
the FAT greatly increase in size can negatively impact system speed.
Virtually every system
employs disk caching to hold in memory disk structures that are frequently
accessed, like the FAT. The disk cache employs part of main memory to hold
disk information that is needed regularly, to save having to read it from
the disk each time (which is very slow compared to the memory). When the FAT
is small, like the 128 kiB FAT used for FAT16, the entire FAT can be held in
memory easily, and any time we need to look up something in the FAT it is
there at our "fingertips". When the table increases in size to 8 MiB for
example, the system is forced to choose between two unsavory alternatives:
either use up a large amount of memory to hold the FAT, or don't hold it in
memory.
For this reason, it is
important to limit the size of the file allocation table to a
reasonably-sized number. In fact, in most cases it is a matter of finding a
balance between cluster size and FAT size. A good illustration of this is
the cluster size selections made by FAT32 itself. Since FAT32 can handle
around 268 million maximum clusters, the 4 kiB cluster size is technically
able to support a disk volume 1 TiB (1,024 GiB) in size. The
little problem here is that the FAT size would be astronomical--over 1
GB! (268 million times 4 bytes per entry).
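To make that figure concrete, here is the same calculation as a short Python sketch. It assumes exactly 2**28 clusters as a stand-in for the "around 268 million" quoted above, and the constant names are illustrative only.

```python
MAX_CLUSTERS = 2**28        # assumption: stand-in for "around 268 million"
CLUSTER_BYTES = 4 * 1024    # 4 kiB clusters
ENTRY_BYTES = 4             # 32-bit FAT entries

volume_bytes = MAX_CLUSTERS * CLUSTER_BYTES   # largest volume at this cluster size
fat_bytes = MAX_CLUSTERS * ENTRY_BYTES        # size of the FAT needed to map it

print(volume_bytes // 2**40, "TiB volume")
print(fat_bytes // 2**30, "GiB FAT")
```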
For this reason, FAT32 only
uses 4 kiB clusters for volumes up to 8 GiB in size, and then quickly
"upshifts" to larger clusters, as this table shows:
| Cluster Size | "Minimum" Partition Size (GiB) | "Maximum" Partition Size (GiB) |
| 4 kiB | 0.5 | 8 |
| 8 kiB | 8 | 16 |
| 16 kiB | 16 | 32 |
| 32 kiB | 32 | !? |
Note:
I am not really sure what the maximum partition size is for a FAT32
partition. :^) I have heard various different numbers, but nobody seems to
be able to produce an authoritative answer. "Officially" it is 2,048 GiB (2
TiB), but there are likely to be other limiting factors...
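The "upshifting" in the table can be written as a small lookup. This is a sketch based only on the table above, not on how any formatting tool is actually implemented; the function name `default_cluster_kib` is hypothetical, and treating a size that falls exactly on a boundary as belonging to the smaller cluster size is my assumption.

```python
# (largest partition size in GiB, cluster size in kiB), copied from the table
CLUSTER_STEPS = [(8, 4), (16, 8), (32, 16)]

def default_cluster_kib(partition_gib: float) -> int:
    """Return the cluster size (kiB) the table gives for a partition size."""
    if partition_gib < 0.5:
        raise ValueError("the table starts at 0.5 GiB")
    for max_gib, cluster_kib in CLUSTER_STEPS:
        # Assumption: a size exactly on a boundary keeps the smaller cluster.
        if partition_gib <= max_gib:
            return cluster_kib
    return 32  # everything above 32 GiB

for size in (2, 8, 12, 32, 60):
    print(size, "GiB ->", default_cluster_kib(size), "kiB clusters")
```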
As you can see, despite the
claims that FAT32 would solve large cluster problems for a long time, it
really hasn't. As soon as you hit 32 GiB partitions, you are back to the
dreaded 32 kiB clusters we all knew and hated in FAT16. Obviously 32 GiB is
a lot better than having this happen at 1 GiB, of course, but still, FAT32
is more of a temporary work-around than a permanent solution. When FAT32 was
first introduced, many people said "Yeah, but 32 GiB is huge. I'll
probably never have a disk that large and if I do, I won't care about a
little wasted disk space!" In fact, it took less than five years for
hard disk makers to move from selling 1 to 2 GB hard disks, to selling ones
32 GB in size or more! And those same people are finding they do care
about the wasted disk space, though perhaps less than they did when they
only had 1 GB. :^)
The table below is a little
exercise I did for fun, to show the size of the FAT (in MiB) as the
partition size increases, for various cluster sizes. You can see why FAT32
doesn't stick with 4 kiB clusters for very long--if it did, you'd need
enormous amounts of memory just to hold the table. (A 64 GiB partition, if
formatted with 4 kiB clusters, would result in a FAT that is 64 MiB in size,
which is about as much as the entire system memory in many newer PCs.) The
8 MiB entry in each of the first four rows shows what FAT32 will choose for a
partition of the given size; by going up in cluster size Microsoft keeps the
size of the FAT to no more than about 8 MiB through partitions of size 64 GiB:
| Partition Size | 4 kiB Clusters | 8 kiB Clusters | 16 kiB Clusters | 32 kiB Clusters |
| 8 GiB | 8 MiB | 4 MiB | 2 MiB | 1 MiB |
| 16 GiB | 16 MiB | 8 MiB | 4 MiB | 2 MiB |
| 32 GiB | 32 MiB | 16 MiB | 8 MiB | 4 MiB |
| 64 GiB | 64 MiB | 32 MiB | 16 MiB | 8 MiB |
| 2 TiB (2,048 GiB) | -- | 1,024 MiB | 512 MiB | 256 MiB |
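If you want to redo this exercise for other sizes, the table can be regenerated with a short Python sketch. It assumes 4 bytes per FAT entry and one entry per cluster, and uses 2**28 as a rough cluster limit to decide where to print "--"; the function name `fat_size_mib` is just an illustrative choice.

```python
MAX_CLUSTERS = 2**28  # assumption: rough FAT32 cluster limit (about 268 million)

def fat_size_mib(partition_gib: int, cluster_kib: int):
    """FAT size in MiB at 4 bytes per cluster, or None if too many clusters."""
    clusters = partition_gib * 1024 * 1024 // cluster_kib  # partition kiB / cluster kiB
    if clusters > MAX_CLUSTERS:
        return None
    return clusters * 4 // (1024 * 1024)

for partition_gib in (8, 16, 32, 64, 2048):
    cells = []
    for cluster_kib in (4, 8, 16, 32):
        size = fat_size_mib(partition_gib, cluster_kib)
        cells.append("--" if size is None else f"{size:,} MiB")
    print(f"{partition_gib:>5} GiB:", " | ".join(cells))
```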
The last entry, 2
terabytes, is for fun, to show how laughable I find it when people go on
about 2 TiB hard disks "being supported by FAT32". Technically they are, I
guess, if you want to deal with a FAT that is 256 MiB in size and go back to
having 40% of your disk wasted by slack. We had better hope our system
memory goes up by a factor of 1,000 before our hard disks do again. We had
also better hope that no other little "surprises" show up as disks get larger:
several did pop up when the 64 GiB barrier was reached, such as difficulties
with the FDISK program.
Really, what this page
shows is that the FAT file system is stretched beyond its limits even with
FAT32. To get both good performance and disk space efficiency for very
large volumes requires a clean break with the past and the use of a high
performance file system like NTFS. I should make clear that I am not
recommending against the use of FAT32 on Windows 9x/ME systems. With
modern drives of 50 to 100 GB, FAT16 is just too impractical, with its 2
GiB partition limit. At the same time, I want to make sure that people
realize that FAT32 has its own limitations. It is really more of a kludge
than a permanent solution to the problems of large partitions.
Using Partitioning with Hard Disks Over 2 GB
While partitioning can
be somewhat complicated, there is one aspect to it that is pretty clear-cut:
if you have a hard disk that is over 2 GB in size, and you are not using
FAT32, you must divide the disk into partitions such that each is no larger
than 2 GB, in order to access the entire disk. If you do not, you will not
be able to access more than the first 2 GB on the disk.
This was a big issue for
the users of the first Windows 95 version, which did not have FAT32. In some
cases a system would ship with say, a 6 GB hard disk, and it would be
necessary to split it into at least three partitions due to FAT16's 2 GB
limit. Even in that case, you would end up with three partitions with 32
kiB clusters, creating a lot of waste due to slack. To avoid this, many
people would segment a 6 GB disk into six or even more partitions, but this
introduces other issues.
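The partition counts in this example are just a ceiling division, as this small Python sketch shows. The helper name `partitions_needed` is hypothetical, and it uses "GB" as loosely as the paragraph above does.

```python
import math

def partitions_needed(disk_gb: float, max_partition_gb: float) -> int:
    """Fewest partitions that cover the whole disk at the given size limit."""
    return math.ceil(disk_gb / max_partition_gb)

print(partitions_needed(6, 2))  # partitions at the 2 GB FAT16 limit
print(partitions_needed(6, 1))  # partitions if each is held to 1 GB
```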
With the introduction and
widespread adoption of FAT32 in newer operating systems, the problems with
hard disks over 2 GB have been rendered largely moot. If you are running
Windows 95 OSR2, Windows 98 or Windows ME, you should use FAT32, which
will let you use the full size of a large hard disk in a single partition.
Of course, even with FAT32 you may want to use partitioning to reduce lost
space due to slack, as described in the next section. However, the need is
much reduced compared to using FAT16.
Using Partitioning to Reduce Slack
Since slack is dependent on
the cluster size used for the partition, and the cluster size is directly
linked to partition size, it is possible to dramatically improve the storage
efficiency of the hard disk simply by dividing it into multiple partitions.
The larger the current partitions are, and the more files on the disk, the
greater the opportunity for improvement. This applies both to FAT16 and
FAT32 partitions--on older systems that use FAT16 partitions for volumes
over 512 MiB, cluster sizes will be 16 kiB or 32 kiB, and slack will be
significant, and the same goes for FAT32 partitions that are 16 GiB or more.
Let's take the example of a
2 GB disk (usually called a 2.1 GB disk by hard disk manufacturers, since
they talk in decimal gigabytes). Let's say that we have 24,000 files on our
disk and each has an average amount of slack amounting to 60% of a cluster.
Now consider various partitioning alternatives; we can either keep the disk
in one piece or break it into smaller pieces. Here is the impact on
(approximate) slack space:
| FAT16 Cluster Size | Size of Each Partition | Number of Partitions | Typical Total Slack (All Partitions) | Typical Total Slack (% of Disk Space) |
| 2 kiB | 128 MiB | 16 | 28 MiB | 1.4% |
| 4 kiB | 256 MiB | 8 | 56 MiB | 2.8% |
| 8 kiB | 512 MiB | 4 | 112 MiB | 5.6% |
| 16 kiB | 1 GiB | 2 | 225 MiB | 11.2% |
| 32 kiB | 2 GiB | 1 | 450 MiB | 22.5% |
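For anyone who wants to try other file counts, the estimate behind this table can be sketched in Python. The assumptions are mine: the disk is treated as exactly 2 GiB, each partition holds a round 65,536 clusters, and the name `fat16_row` is purely illustrative. With those assumptions the percentages come out a shade lower than the table's, which appear to be rounded against a slightly smaller disk size.

```python
FILES = 24_000          # file count from the example
SLACK_FRACTION = 0.60   # average waste per file, as a fraction of one cluster
DISK_MIB = 2 * 1024     # assumption: treat the disk as exactly 2 GiB

def fat16_row(cluster_kib: int):
    """Return (partition MiB, partition count, slack MiB, slack %)."""
    partition_mib = cluster_kib * 64   # assumption: 65,536 clusters per partition
    partitions = DISK_MIB // partition_mib
    slack_mib = FILES * SLACK_FRACTION * cluster_kib / 1024
    return partition_mib, partitions, slack_mib, 100 * slack_mib / DISK_MIB

for cluster_kib in (2, 4, 8, 16, 32):
    size, count, slack, pct = fat16_row(cluster_kib)
    print(f"{cluster_kib:>2} kiB: {count:>2} x {size:>4} MiB, "
          f"slack {slack:5.0f} MiB ({pct:.1f}%)")
```

Note that the total slack depends only on the file count and the cluster size; the number of partitions matters only because it determines the cluster size.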
As you can see, putting a 2
GB disk in a single partition is inefficient; typically 20% or more of the
disk is going to be wasted, and you can cut that basically in half just by
going to a pair of partitions. You can cut it much further by dividing the
disk further. In fact, the best solution might seem to be just going to 128
MiB partitions, which drops slack down to a very small number. There's only
one problem with this: you have to use 16 partitions! Do you really want
your files spread over 16 disk volumes, from C: to R:? Most people don't. (I
cringe at the very thought. :^) )
With a larger disk and
FAT32, the example is not that much different, but the slack depends
entirely on how many more files you put on that larger disk. If you put the
same number of files on the larger disk using FAT32, slack (as a percentage
of disk space) decreases dramatically; if you put many more files on the
larger partitions then you "give back" some of the slack savings, though of
course you are still ahead of where you would have been using FAT16. Which
of these is most appropriate depends on how you are using your system. In
many cases, the large hard disks available today are used to hold big files,
such as video and audio, that were rarely seen on older PCs. In other
applications, a bigger disk just means many more small files.
Let's look at an example
where we have, say, a mythical 64 GiB (68.7 GB) hard disk and 96,000 files on
it. Here, I am looking at a disk 32 times the size of the previous example,
but have only increased the number of files by a factor of four. This means
slack, as a percentage, will be lower even for partitions of the same
cluster size. Here's how this looks under FAT32, assuming the same 60% end
cluster waste:
| FAT32 Cluster Size | Size of Each Partition | Number of Partitions | Typical Total Slack (All Partitions) | Typical Total Slack (% of Disk Space) |
| 4 kiB | 8 GiB | 8 | 225 MiB | 0.35% |
| 8 kiB | 16 GiB | 4 | 450 MiB | 0.7% |
| 16 kiB | 32 GiB | 2 | 900 MiB | 1.4% |
| 32 kiB | 64 GiB | 1 | 1,800 MiB | 2.8% |
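The slack figures for this larger disk follow from the same 60% rule of thumb. Here is a rough Python sketch, assuming the disk is exactly 64 GiB; the names `slack_mib` and `rows` are illustrative, and the percentages may differ from the table in the last digit because of rounding.

```python
def slack_mib(files: int, cluster_kib: int, slack_fraction: float = 0.60) -> float:
    """Total slack in MiB if each file wastes slack_fraction of one cluster."""
    return files * slack_fraction * cluster_kib / 1024

DISK_MIB = 64 * 1024   # assumption: treat the disk as exactly 64 GiB
FILES = 96_000         # file count from the example

rows = {}
for cluster_kib in (4, 8, 16, 32):
    wasted = slack_mib(FILES, cluster_kib)
    rows[cluster_kib] = (wasted, 100 * wasted / DISK_MIB)
    print(f"{cluster_kib:>2} kiB clusters: {wasted:7,.0f} MiB slack, "
          f"{rows[cluster_kib][1]:.2f}% of disk")
```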
As you can see, the total
amount of slack, in bytes, is higher, because we have more files. However,
the percentage of total disk space used up in slack is much lower because
the disk is so much bigger--that's the advantage of FAT32. As before, when
you increase the cluster size, you end up with bigger partitions, and more
slack. 32 kiB clusters means four times as much slack as 8 kiB clusters.
However, with the total slack still so low--2.8%--and with the huge size of
the disk being contemplated (64 GB) the entire matter is of arguable
importance. On a 2 GB disk, 450 MB is a big deal; on a 64 GB disk, 1.8 GB
really is not, at least to most people. While most people wouldn't put an
entire 64 GB hard disk in one partition (there are other reasons to avoid
doing this, not just slack), it just isn't the big deal it used to be.
The examples above show
that there is a tradeoff between saving slack and causing your disk to be
broken into a large number of small pieces. Which option makes the most
sense for you depends entirely on what you are doing with your disk, and on
your personal preferences. I cannot stand having my disk chopped into little
bits; on an older (FAT16) system I usually use 8 kiB or 16 kiB cluster-size
partitions and sacrifice some more storage for the sake of what I consider
"order". Others prefer the use of more, smaller partitions. You should also
bear in mind the space tradeoff in using multiple partitions, something the
"partitioning fanatics" (my pet name for people that like to chop a 1 GB
disk into eight 128 MB partitions) often don't realize.
On a FAT32 system with a
large hard disk, I usually go with partitions no more than 16 GiB,
sticking to 8 kiB clusters. The difference between 4 kiB and 8 kiB
clusters is not significant enough to warrant all the volumes needed to
divide a very large disk into 8 GiB units, in my estimation. On some
disks, such as backup volumes or ones where I will be storing large
multimedia files, I will use 32 kiB clusters and very large volumes. This
is an example of tailoring one's partition size and cluster size to the
type of data being stored on the partition. If the files being put on the
volume are very big, the cluster size becomes essentially irrelevant.
|