Dictionary based compression
Dictionary-based compression works by creating a dictionary(table of values) from which the text can be re-created. Below is a table of the value needed to compress the phrase 'An eye for an eye, a tooth for a tooth'.
Entry |
Text |
Binary |
1 |
An |
000 |
2 |
_eye |
001 |
3 |
_for |
010 |
4 |
_an |
011 |
5 |
, |
100 |
6 |
_a |
101 |
7 |
_tooth |
110 |
'An eye for an eye, a tooth for a tooth' is compressed as:
000 001 010 011 001 100 101 110 010 101 110
Savings from dictionary based compression
'An eye for an eye, a tooth for a tooth' is compressed as: 000 001 010 011 001 100 101 110 010 101 110
This has a total size of 33 bits or 4.125 bytes. The phrase has 38 characters, so in 8-bit ASCII would take up 38 bytes.
This is a saving of (38 - 4.125)/38 * 100 = 89.15%. The saving would actually be lower than this as the dictionary itself also has to be transmitted.
It may seem from this small example that the size of the dictionary would eat p mst of the savings, but applied to a larger text the size of the dictionary becomes far smaller than the size of the text being compressed.
Knowledge check
These questions will refer to using dictionary encoding to enocde the phrase 'Can you can a can like a canner can?' using the dictionary below
Entry |
Text |
Binary |
1 |
Can |
000 |
2 |
_you |
001 |
3 |
_can |
010 |
4 |
_a |
011 |
5 |
_like |
100 |
6 |
ner |
101 |
7 |
? |
110 |