About Bit Visualizer

written by Heechul Ryu on May 3, 2023

This is a post about Bit Visualizer. You may have come from here.

[Interactive Bit Visualizer widget: 32 bits labeled 2^31 down to 2^0, with octal (8^2 8^1 8^0) and hexadecimal (16^1 16^0) digit views, for values from 0 to 4,294,967,295.]

Is Binary a Native Number System to You?

Certainly not to me!

If you’re a human like me, the base-10 or decimal system is your comfort zone, but most modern computers work in binary (base-2).

When programming, we often work with high-level languages and tools that make binary representation unnecessary. However, there are still times when we need to work with binary numbers for more granular control or to understand how computers interpret numbers.

A Bit About Binary Numbers

Whether it’s text, image, or video, if it’s stored or transferred via computers (including your smartphone), the data is made of a series of bits.

A bit is either 0 or 1, so it can represent two different states. We often say 1 is switched on and 0 is switched off, but that’s only because we choose to interpret the two states that way. We can actually interpret the numbers in any way we want. Thus, depending on how we decide to interpret a chunk of bits, it becomes audio, video, image, text, or anything really.

Binary numbers are used to represent everything in computers, but having just 0 and 1 makes them difficult to read and make sense of.
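To make the "interpret it any way we want" idea concrete, here is a minimal Python sketch. The four sample bytes and the variable names are my own choices for illustration, not something taken from the Bit Visualizer itself.

```python
# The same four bytes, read in a few different ways.
data = bytes([0x42, 0x69, 0x74, 0x73])

# Raw bits, one group of 8 per byte.
as_bits = " ".join(f"{byte:08b}" for byte in data)

# Interpreted as ASCII text.
as_text = data.decode("ascii")

# Interpreted as one unsigned 32-bit integer (most significant byte first).
as_number = int.from_bytes(data, "big")

# Interpreted as four separate 8-bit values, e.g. color channels or samples.
as_values = tuple(data)

print(as_bits)
print(as_text)
print(as_number)
print(as_values)
```

Nothing in the bytes themselves says which reading is the "right" one. That decision lives in the program that consumes them.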

Hexadecimal and Octal

Wait a second, isn’t this about binary? Why are we talking about octal (base 8) and hex[adecimal] (base 16) and what are they?

Hexadecimal and octal are number systems that group binary digits together for easier reading and representation.

For example, instead of denoting each binary digit individually, we can group several of them into one digit. It’s sort of like compression.

Tools like hexdump and xxd have been doing this for a long time.

On macOS or Linux machines, you can try the following commands to see the results of converting numbers between binary, hexadecimal, and octal:

# printing character `1` (1 byte)

$ echo -n 1 | xxd -b  # for binary
00000000: 00110001                                               1

$ echo -n 1 | xxd     # for hex
00000000: 31                                       1

$ echo -n 1 | od      # for octal. by default, it's grouped by 2 bytes
0000000 000061        # `[0]00` is 0, `110` is 6, and `001` is 1 in octal, hence `[000]061`
0000001               # because the bits are grouped in threes from right to left within the segment

# printing character `11` (2 bytes)

$ echo -n 11 | xxd -b
00000000: 00110001 00110001                                      11

$ echo -n 11 | xxd
00000000: 3131                                     11

$ echo -n 11 | od  # note that `od` also actually groups into 2-byte ("short") by default
0000000 030461     #    0      3     0      4     6     1
0000002            # `[00]0` `011` `000` `1 00` `110` `001`

# printing character `111` (3 bytes)

$ echo -n 111 | xxd -b
00000000: 00110001 00110001 00110001                             111

$ echo -n 111 | xxd
00000000: 3131 31                                  111

$ echo -n 111 | od
0000000 030461 000061
0000003

# let's play a bit with `od` to print different octal values depending on how we would like to interpret the binaries

$ echo -n 111 | od -t o1  # or `od -b` to group per byte instead of 2-byte
0000000 061 061 061
0000003

$ echo -n 111 | od -t d1  # decode to decimal per byte
0000000   49   49   49    # char `1` is binary `110001` which is `49` in decimal
0000003

$ echo -n 111 | od -t d2  # decode to decimal per 2-byte
0000000  12593     49     # `00110001 00110001` is `12593` in decimal
0000003

$ echo -n 1111 | od -t d2 # clean cut by 2-byte to see it clearly
0000000  12593  12593
0000004
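The `od` results above can be reproduced in a few lines of Python. This is only a sketch: the helper name `two_byte_words` is hypothetical, and it assumes a little-endian machine (typical for x86 and ARM laptops), since `od` reads multi-byte units in the host's byte order.

```python
data = b"111"

# Per byte, like `od -t o1` and `od -t d1`.
octal_per_byte = [f"{byte:03o}" for byte in data]
decimal_per_byte = list(data)

def two_byte_words(raw, byteorder="little"):
    """Split raw bytes into 2-byte unsigned integers, padding an odd tail with a zero byte."""
    if len(raw) % 2:
        raw += b"\x00"
    return [int.from_bytes(raw[i:i + 2], byteorder) for i in range(0, len(raw), 2)]

# Per 2 bytes, like plain `od` (octal) and `od -t d2` (decimal).
words = two_byte_words(data)
octal_per_word = [f"{word:06o}" for word in words]

print(octal_per_byte)
print(decimal_per_byte)
print(octal_per_word)
print(words)
```

Passing `byteorder="big"` instead shows what the same bytes would look like on a big-endian machine.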

There is also https://hexed.it/ and https://cryptii.com/pipes/text-octal that you can play with.

For example, 00110001 in binary is 061 in octal (the character `1` from the output above). They are exactly the same number, just denoted in a different way.

Because binary has only two states per bit, binary numbers get long very quickly (to hold any reasonably complex state). That makes them hard to make sense of at first glance, and it is impractical to trace where each piece of information starts and stops.

But when it’s “zipped”, now the same number/data is magically a lot more digestible.
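The "zipping" can be sketched in Python as plain regrouping of the bit string. The function names `group_bits` and `regroup` are made up for this example. Groups are formed from the right, so the leftmost group gets padded with zeros when the length does not divide evenly.

```python
def group_bits(bits, size):
    """Split a bit string into groups of `size` bits, starting from the right."""
    padded_len = -(-len(bits) // size) * size  # round up to a multiple of size
    padded = bits.zfill(padded_len)
    return [padded[i:i + size] for i in range(0, padded_len, size)]

def regroup(bits, size):
    """Write each group of `size` bits as one digit (size 3 gives octal, 4 gives hex)."""
    digits = "0123456789ABCDEF"
    return "".join(digits[int(group, 2)] for group in group_bits(bits, size))

bits = "00110001"  # the character `1` from the `xxd -b` output
print(group_bits(bits, 3), regroup(bits, 3))
print(group_bits(bits, 4), regroup(bits, 4))
```

The regrouped digits match what `od` and `xxd` printed for the same byte, and all three notations still denote the same number.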

When Is Octal (base 8, 3-bit) Useful?

I actually don’t know much about where the octal system is used other than Unix file permissions.

On that note, I will say the octal one, 644 for rw-r--r--, is more helpful for telling apart the three values, 6, 4, and 4, than the binary counterpart, 110100100, although the binary actually lines up with rw-r--r-- when they are side by side.

Octal also seems to have been used more in the past.
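Here is a small Python sketch of how the octal, binary, and rwx forms line up. The helper `permission_string` is hypothetical and only looks at the lowest 9 permission bits.

```python
def permission_string(mode):
    """Render the lowest 9 bits of `mode` as an rwx string."""
    bits = f"{mode & 0o777:09b}"
    return "".join(flag if bit == "1" else "-" for flag, bit in zip("rwx" * 3, bits))

mode = 0o644
print(f"{mode:o}")              # octal form
print(f"{mode:09b}")            # binary form
print(permission_string(mode))  # rwx form
```

Each octal digit covers exactly one owner/group/other triple, which is why the octal form is so convenient here.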

When Is Hexadecimal (base 16, 4-bit) Useful?

Unlike octal, hex is used a lot. My guess at why it is so useful: not only does it compress 4 bits (between 0000 and 1111) into one hex digit (between 0 and F (15)) that represents 16 different states, but one hex digit is also a nibble, half of a byte. That allows it to represent a byte with just 2 digits. 256 different states with just two digits!

I also just discovered an article that aligns with this notion in a much more detailed way.

One of the most popular uses is to represent an RGB color with three bytes (6 hex digits) like #000000 or #FFFFFF. In this case, # just indicates that this is a hex value. Each byte is one of red, green, and blue.
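As a quick illustration, a hex color can be split back into its three bytes like this. The function name `parse_hex_color` is my own, and it only handles the 6-digit form.

```python
def parse_hex_color(color):
    """Split a '#RRGGBB' string into (red, green, blue), each from 0 to 255."""
    raw = bytes.fromhex(color.lstrip("#"))
    if len(raw) != 3:
        raise ValueError("expected exactly 6 hex digits")
    return tuple(raw)

print(parse_hex_color("#000000"))
print(parse_hex_color("#FFFFFF"))
print(parse_hex_color("#ff8000"))
```

Two hex digits per channel is exactly one byte per channel, so no bits are wasted or shared between channels.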

MAC addresses and IPv6 addresses also use hexadecimal, as mentioned here.

But for anything that requires looking at binary data, hex notation can be useful, especially when segmenting by 4 bits makes sense.

# here is a small bmp file that looks like the picture below, I just named it "chess.bmp"
$ ascii-image-converter chess.bmp -d 5,5 -n
@ @ @
 @ @
@ @ @
 @ @
@ @ @
# each @ being a black pixel and each space being a white pixel

# let's see it with hexadecimal via `xxd`
$ xxd chess.bmp
00000000: 424d 8600 0000 0000 0000 3600 0000 2800  BM........6...(.
00000010: 0000 0500 0000 fbff ffff 0100 1800 0000  ................
00000020: 0000 4b00 0000 130b 0000 130b 0000 0000  ..K.............
00000030: 0000 0000 0000 0000 00ff ffff 0000 00ff  ................
00000040: ffff 0000 0000 ffff ff00 0000 ffff ff00  ................
00000050: 0000 ffff ff00 0000 00ff ffff 0000 00ff  ................
00000060: ffff 0000 0000 ffff ff00 0000 ffff ff00  ................
00000070: 0000 ffff ff00 0000 00ff ffff 0000 00ff  ................
00000080: ffff 0000 0000                           ......
# see probably `ffffff` being a white pixel and `000000` being a black pixel?
# and maybe `00` in between mean the end of the line?
# ¯\_(ツ)_/¯ I don't know...  you tell me with https://en.wikipedia.org/wiki/BMP_file_format
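For the curious, here is a Python sketch that decodes the dump above, following the header layout described in the Wikipedia article linked in the comment. The bytes are copied from the `xxd` output. The field offsets and the rule that each row is padded to a multiple of 4 bytes are my reading of the BMP format, so treat this as an exploration rather than a full BMP parser. If that reading is right, the stray `00` after every fifth pixel is row padding.

```python
import struct

# Bytes copied from the `xxd chess.bmp` output above.
HEX_DUMP = """
424d 8600 0000 0000 0000 3600 0000 2800
0000 0500 0000 fbff ffff 0100 1800 0000
0000 4b00 0000 130b 0000 130b 0000 0000
0000 0000 0000 0000 00ff ffff 0000 00ff
ffff 0000 0000 ffff ff00 0000 ffff ff00
0000 ffff ff00 0000 00ff ffff 0000 00ff
ffff 0000 0000 ffff ff00 0000 ffff ff00
0000 ffff ff00 0000 00ff ffff 0000 00ff
ffff 0000 0000
"""
data = bytes.fromhex("".join(HEX_DUMP.split()))

# File header: magic, file size, 4 reserved bytes, offset of the pixel data.
magic, file_size, pixel_offset = struct.unpack_from("<2sI4xI", data, 0)
# Part of the info header: width, height, color planes, bits per pixel.
width, height, planes, bits_per_pixel = struct.unpack_from("<iiHH", data, 18)

bytes_per_pixel = bits_per_pixel // 8
# Each stored row is padded up to a multiple of 4 bytes.
row_size = (width * bits_per_pixel + 31) // 32 * 4

# A negative height means rows are stored top to bottom, as in this file.
art, paddings = [], []
for row_index in range(abs(height)):
    start = pixel_offset + row_index * row_size
    row = data[start:start + row_size]
    pixels = row[:width * bytes_per_pixel]
    paddings.append(row[width * bytes_per_pixel:])
    art.append("".join(
        "@" if pixels[i:i + bytes_per_pixel] == b"\x00" * bytes_per_pixel else " "
        for i in range(0, len(pixels), bytes_per_pixel)
    ))

print(magic, file_size, pixel_offset, width, height, bits_per_pixel, row_size)
print("\n".join(art))
print(paddings)
```

With 5 pixels of 3 bytes each, a row holds 15 bytes of color data, so one extra byte brings it up to 16.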

Why Not Other Base Systems?

The article I mentioned above should explain this very well, but let me also add my own reasoning.

Let’s First Start With Base 10

You can’t really group bits into base 10 because you can’t get the value 10 by raising 2 to any integer power: 2^3 = 8 and 2^4 = 16. Right? I don’t think there is a way to shoehorn a number system whose base is not a power of 2 onto a binary computer, and even if there is one, it will be inefficient. Actually, we convert to decimal all the time, but at a much higher level, for example when end users read the data or when high-level programming languages hide the internal binary representation. It is just not a direct digit-to-bits mapping like the power-of-2 number systems have.
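The power-of-2 condition is easy to check in code. This Python sketch uses a hypothetical helper, `bits_per_digit`, that returns how many whole bits one digit of a given base covers, or `None` when the base does not map onto whole bits.

```python
def bits_per_digit(base):
    """Return how many whole bits one digit covers, or None if base is not a power of 2."""
    if base > 1 and base & (base - 1) == 0:
        return base.bit_length() - 1
    return None

for base in (2, 4, 8, 10, 16, 32, 64):
    print(base, bits_per_digit(base))
```

Base 10 is the only one in that list without a whole number of bits per digit, which is why a decimal digit cannot be read off a fixed group of bits.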

Why Not Base 4 (2-bit)?

Base 4 (0 to 3), or the quaternary system, is a bit awkward for wide use. It has only 4 different states per digit and compresses to just half the length. Remember that even octal (base 8) is not used much?

Why Not Base 32 (5-bit) and Beyond?

You may have already noticed that the hexadecimal or base 16 system had to “borrow” 6 characters (A to F) from the alphabet to represent 10 to 15, since the regular Arabic numerals that we use to denote the base 10 system only have ten distinct digits.

With the base 32 system, we would need to borrow 16 more characters, and I don’t think we can map each character to a specific number in our heads as easily as the ones from hexadecimal.

And one base-32 digit compresses 5 bits, which is awkward for “regular” data since bits are not usually grouped in fives. Hexadecimal, on the other hand, can compress anything that is made up of nibbles (4 bits each).

The same issues apply to bases bigger than 32.

They Are Actually Being Used

However, base 32 (5-bit) and base 64 (6-bit) are actually being used because they are useful in a different sense.

Hexadecimal represents binary data in a form that is comfortable for humans since it’s close to decimal, along with the other good reasons we talked about. You can convert binary to hex and back, so you can transfer data, e.g. the content of a binary file, in hexadecimal form. However, 2 characters per byte is not the most efficient. With base 64 in particular, this becomes 4 characters per 3 bytes, where base 16 needs 6. That is one third fewer characters than base 16.

It’s a trade-off between understandability and portability, so base 64 is useful when the data has to be transferred using only printable characters.
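The size difference is easy to see with Python's standard `base64` module. The sample bytes here are arbitrary, chosen only so the lengths divide evenly.

```python
import base64

data = bytes(range(3))  # any 3 bytes will do

hex_text = data.hex()                              # 2 characters per byte
b64_text = base64.b64encode(data).decode("ascii")  # 4 characters per 3 bytes

# Base 32 lines up with bytes every 40 bits: 8 characters per 5 bytes.
b32_text = base64.b32encode(bytes(5)).decode("ascii")

print(hex_text, len(hex_text))
print(b64_text, len(b64_text))
print(b32_text, len(b32_text))
```

Both encodings are reversible. For example, `base64.b64decode(b64_text)` gives back the original bytes.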

Base 64 is actually quite interesting so I’m hoping to post about it in the future.

Bonus: What the Heck Is Octet?

It has nothing to do with octal (3-bit).

Octet means 8 bits. It’s basically another (stricter) way to say byte, because a byte does not necessarily have to mean 8 bits: non-8-bit bytes have existed in the past.

Enter Bit Visualizer

As I brushed up on these concepts this time, I asked myself: what if there were a tool that helps people better understand binary and its related concepts, such as octal and hexadecimal, in a way that is quick and interactive? Something to refresh your memory of these concepts and get a feel for how they work, without having to read and digest them all over again.

That’s why I created the Bit Visualizer and this website. You’re welcome, and enjoy!

Go to Bit Visualizer.