I’m constantly amazed when I encounter people who have no idea how to interpret their file sizes. Then I started writing this and realized that a lot of it is very confusing. Almost like it was deliberately planned, but I’m certainly not at liberty to tell you that.

Let’s start with the basics. One bit is the smallest element of memory possible. One bit can represent a one or a zero. Eight bits form a byte. I won’t go through the math but trust me, with eight bits you can represent 256 different combinations. To help out the visually oriented, I’ve asked a few of my geckos to help illustrate this.

[Illustration: a single bit, off]

[Illustration: a single bit, on]

[Illustration: eight bits = one byte]
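For anyone who does want the math I skipped, here is a minimal sketch in Python. Each bit doubles the number of possible patterns, so n bits give 2 to the power n combinations. The function name `combinations_for_bits` is just something I made up for illustration.

```python
from itertools import product


def combinations_for_bits(n_bits):
    """Number of distinct on/off patterns that n_bits can hold (2 ** n_bits)."""
    return 2 ** n_bits


# Spell out every pattern for three bits to confirm the formula on a small case.
three_bit_patterns = ["".join(bits) for bits in product("01", repeat=3)]

print(three_bit_patterns)        # eight patterns, from '000' up to '111'
print(combinations_for_bits(8))  # 256 patterns in one byte
```

Listing all 256 patterns for a full byte works the same way with `repeat=8`; it is just a longer list than anyone wants to read.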

First, a very brief introduction to the metric system and how technology keeps you confused. Kilo means 1,000, mega is 1,000,000, giga is 1,000,000,000 and tera is 1,000,000,000,000. Do you see a trend there? Each prefix is exactly 1,000 times larger than the previous one. Except we use a slightly different standard for memory.

When we talk about memory, a kilobyte is 1024 bytes. This is done to ensure your confusion. Just kidding, almost. It’s because 1024 is the power of two closest to 1000, and memory is addressed in bits. Remember, eight bits represent 256 addresses, nine bits represent 512 and ten bits represent 1024. That means that one megabyte is 1024 kilobytes, one gigabyte is 1024 megabytes and one terabyte is 1024 gigabytes. Unfortunately, my geckos refused to cooperate for the illustration.

[Illustration: Memory]

I pushed them too hard. Ever try to get over a thousand geckos in a single picture?
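Since the geckos are out, here is a rough Python sketch of the 1024-based units described above. The constant names and the `human_size` helper are my own inventions for illustration, not anything built into Python.

```python
# Memory-style units: each step up is 1024 times the previous one.
KILOBYTE = 1024             # 2 ** 10 bytes
MEGABYTE = 1024 * KILOBYTE  # 2 ** 20 bytes
GIGABYTE = 1024 * MEGABYTE  # 2 ** 30 bytes
TERABYTE = 1024 * GIGABYTE  # 2 ** 40 bytes


def human_size(num_bytes):
    """Turn a raw byte count into a short label using 1024-based units."""
    units = ["bytes", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number under 1024,
        # or at the largest unit we know about.
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


print(human_size(500))       # 500.0 bytes
print(human_size(1024))      # 1.0 KB
print(human_size(TERABYTE))  # 1.0 TB
```

Point it at any file size your computer reports in bytes and it should land close to what the file browser shows, assuming the browser also counts in 1024s.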

To make sure you’re totally confused, data rates are typically measured in bits per second while storage is measured in bytes. The convention is a small “b” for bits and a capital “B” for bytes, but of course the two get mixed up all the time. If you have a 6 Mbps (megabits per second) connection, you’re capable of downloading about 600 kilobytes per second. What happened to the rest? Dividing by eight bits per byte would give 750 kilobytes per second, but there’s overhead involved in sending the bits one at a time. A good rule of thumb is to budget ten bits per byte once that overhead is included.
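The rule of thumb is easy to try for yourself. This is only a sketch: the function name is made up, and the ten bits per byte figure is the rough estimate from above, not a measured value for any particular connection.

```python
def download_rate_bytes_per_sec(link_bits_per_sec, bits_per_byte=10):
    """Estimate usable bytes per second on a link rated in bits per second.

    bits_per_byte=10 is the rule of thumb (8 data bits plus overhead);
    pass 8 to see the no-overhead ideal.
    """
    return link_bits_per_sec / bits_per_byte


LINK_SPEED = 6_000_000  # a 6 Mbps connection, in bits per second

with_overhead = download_rate_bytes_per_sec(LINK_SPEED)
no_overhead = download_rate_bytes_per_sec(LINK_SPEED, bits_per_byte=8)

print(with_overhead)  # 600000.0 bytes per second
print(no_overhead)    # 750000.0 bytes per second
```

Real connections vary, so treat the result as a ballpark rather than a promise from your provider.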

Still meaningless, isn’t it? Let’s go slightly deeper. One byte can store one text character. Assuming an average of five characters per word, at 700 words, this post should require 3500 bytes or about 3.5 kilobytes of storage (700 words x 5 characters per word x 1 byte per character). It comes close to that if stored as a text file. Stored as a Word or .doc file, the number doubles because of the overhead involved in formatting, colors and metadata such as the author’s name and date of creation. Word also adds about 10 kilobytes for internal data, making this post about 17 kilobytes. Treat this post as one page, multiply the 3.5 kilobytes of plain text by 200 pages, and a typical book with no pictures comes to about 700 kilobytes, less than one megabyte.
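Here is the same arithmetic as a small Python sketch. All of the numbers are the rough assumptions from the paragraph above (five characters per word, one byte per character, Word doubling the size and adding about 10 kilobytes); the function names are hypothetical.

```python
CHARS_PER_WORD = 5   # assumed average
BYTES_PER_CHAR = 1   # true for plain English text in a simple encoding
KILOBYTE = 1024


def plain_text_bytes(words):
    """Estimated size of the words stored as a plain text file."""
    return words * CHARS_PER_WORD * BYTES_PER_CHAR


def word_doc_bytes(words, fixed_overhead=10 * KILOBYTE):
    """Rough Word estimate: double the text, plus about 10 KB of internal data."""
    return 2 * plain_text_bytes(words) + fixed_overhead


post_text = plain_text_bytes(700)  # 3500 bytes
post_doc = word_doc_bytes(700)     # 17240 bytes, roughly 17 kilobytes
book_text = 200 * post_text        # 700000 bytes, under one megabyte

print(post_text, post_doc, book_text)
```

Swap in your own word count to get a quick estimate for a document of your own.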

Don’t bother checking me. The actual numbers will usually be far less because of the various compression schemes used. Still, the number serves as a reference point: one megabyte equals one book. A one terabyte drive could store one thousand times one thousand books, or about one million books, as long as there were no pictures.

Pictures will rapidly drive the storage requirements up, but compression is almost always used to store pictures. Going back to the basics, pictures use three bytes for each pixel. My camera is twelve megapixels, so an uncompressed picture would be 36 megabytes. With compression, a subject far beyond what I intend to talk about here, the picture takes four megabytes. Almost the same as four entire books. Remember the saying, “a picture is worth a thousand words”? The thousand words are actually more memory efficient.
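The picture math can be checked the same way. This sketch uses round decimal numbers (one megapixel as 1,000,000 pixels, one megabyte as 1,000,000 bytes) to match the figures above, and the four megabyte JPEG size is just what my camera happens to produce, not a general rule.

```python
MEGAPIXELS = 12
BYTES_PER_PIXEL = 3  # one byte each for red, green and blue

# Uncompressed size: every pixel costs three bytes.
uncompressed_bytes = MEGAPIXELS * 1_000_000 * BYTES_PER_PIXEL

# Observed size of the compressed picture (an assumption from one camera).
jpeg_bytes = 4_000_000

compression_ratio = uncompressed_bytes / jpeg_bytes

# "A thousand words" at five characters per word, one byte per character.
thousand_words_bytes = 1000 * 5
pictures_worth = jpeg_bytes // thousand_words_bytes

print(uncompressed_bytes)  # 36000000
print(compression_ratio)   # 9.0
print(pictures_worth)      # 800
```

In other words, by these rough numbers one compressed photo takes as much room as about 800 sets of a thousand words.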

It also means that a one terabyte drive will hold about 250,000 high resolution JPEG pictures. Fortunately, most illustrations don’t require high resolution, allowing you to hold a reasonably sized picture in about one megabyte. Without compression, such as the JPG format provides, you’ll see massive files. BMP files typically have no compression, which explains why they are usually very large.
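To see where the drive capacity figures come from, here is a short Python sketch using the 1024-based units. The helper name `items_per_drive` is made up, and the one megabyte book and four megabyte photo are the rough reference sizes used above.

```python
MEGABYTE = 1024 ** 2
TERABYTE = 1024 ** 4


def items_per_drive(drive_bytes, item_bytes):
    """How many whole items of a given size fit on a drive."""
    return drive_bytes // item_bytes


books = items_per_drive(TERABYTE, MEGABYTE)       # one megabyte per book
photos = items_per_drive(TERABYTE, 4 * MEGABYTE)  # four megabytes per photo

print(books)   # 1048576, or about one million books
print(photos)  # 262144, or roughly 250,000 photos
```

A real drive holds somewhat less than this, since the file system needs some of the space for its own bookkeeping.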

By the time you get to audio or video files, there are so many options for compression, sample rates and resolution that it’s almost impossible to predict memory size except to say it starts high and goes up.

Recapping what I hope you’ve learned:

  • One bit is the smallest element of storage. Bits are usually used when describing connection speed, as in 6 Mbps (6,000,000 bits per second)
  • One byte is eight bits and usually how storage or data usage is measured
  • One kilobyte is 1024 bytes rather than 1000 bytes because of the way memory is organized
  • One megabyte is 1024 kilobytes
  • One gigabyte is 1024 megabytes
  • One terabyte is 1024 gigabytes
  • One picture may be worth a thousand words but usually requires more storage space than ten thousand words’ worth of text

Feel armed with slightly more knowledge now? Do you understand why your hard drive fills up so fast with only a few videos? Compression is always a factor in data storage, and if I run out of topics or get enough requests, I might try to explain compression and why some things can be compressed down to ten percent of their original size while others only compress by one percent.

© 2015 – 2019, Byron Seastrunk. All rights reserved.