The number of English letters (including capitals) and punctuation symbols is less than 128, hence you need 7 bits per symbol of English text.

The calculated entropy of the English language is 2.62 bits per letter on average. According to Shannon’s Source Coding Theorem, the best compression ratio you can obtain without distorting the original text (i.e. by lossless compression) is 7 / 2.62 ≈ 2.67.
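To get a feeling for what "bits per letter" means, here is a minimal sketch that estimates entropy from single-character frequencies of a sample. The function name `symbol_entropy` is my own, and this simple model looks at each character in isolation, so it need not reproduce the 2.62 figure quoted above; the result depends on the sample and on the model used.

```python
import math
from collections import Counter

def symbol_entropy(text: str) -> float:
    """Estimate entropy in bits per symbol from single-character frequencies.

    Assumption: every character is treated independently of its neighbours.
    """
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over all distinct symbols in the sample
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

if __name__ == "__main__":
    sample = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
    print(f"{symbol_entropy(sample):.2f} bits per symbol")
```

A sample with two equally likely symbols gives exactly 1 bit per symbol, and a sample consisting of one repeated symbol gives 0.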

In other words, on average you can’t losslessly compress English text by a factor of 3, but you can find a method that compresses it by a factor of 2.5.
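The arithmetic above can be sketched in a few lines. The helper name `max_lossless_ratio` is hypothetical; it simply divides the fixed-length code size by the entropy figure, using the numbers given in the text.

```python
import math

def max_lossless_ratio(alphabet_size: int, entropy_bits: float) -> float:
    """Upper bound on the average lossless compression ratio.

    alphabet_size: number of distinct symbols a fixed-length code must cover
    entropy_bits:  entropy of the source in bits per symbol
    """
    # A fixed-length code needs ceil(log2(alphabet_size)) bits per symbol.
    fixed_bits = math.ceil(math.log2(alphabet_size))
    return fixed_bits / entropy_bits

if __name__ == "__main__":
    ratio = max_lossless_ratio(128, 2.62)
    print(f"bound on compression ratio: {ratio:.2f}")
    print("factor 2.5 possible:", 2.5 < ratio)
    print("factor 3 possible:", 3.0 < ratio)
```

With 128 symbols and 2.62 bits per letter the bound falls between 2.5 and 3, which is exactly the statement above.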

 

According to the book “Video Compression Techniques” by Wolfgang Effelsberg and Ralf Steinmetz (published in 1998), Diatomic Encoding is a technique that determines the pairs of bytes occurring most frequently and encodes them with shorter codes. For ordinary English text, the most frequent pairs are the following (here ‘_’ denotes a blank for better readability):

E_, T_, TH, A_, S_, RE, IN, HE

Replacing the above pairs by special single bytes which are not present in English texts (e.g. Æ) reduces the data by more than 10%.
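The pair replacement can be sketched as follows. This is only an illustration under my own assumptions, not the book's implementation: the input is upper-case 7-bit ASCII, the eight pairs are mapped to the otherwise unused byte values 0x80 to 0x87 (the book does not prescribe these values), and the text is scanned greedily from left to right. The names `diatomic_encode` and `diatomic_decode` are hypothetical.

```python
# The eight frequent pairs listed above; '_' in the list is a blank here.
PAIRS = ["E ", "T ", "TH", "A ", "S ", "RE", "IN", "HE"]

# Assumption: byte values 0x80..0x87 never occur in 7-bit ASCII text,
# so they are free to serve as single-byte codes for the pairs.
CODES = {pair.encode("ascii"): 0x80 + i for i, pair in enumerate(PAIRS)}
DECODE = {code: pair for pair, code in CODES.items()}

def diatomic_encode(data: bytes) -> bytes:
    """Greedily replace each listed pair by its single-byte code."""
    if any(b >= 0x80 for b in data):
        raise ValueError("input must be 7-bit ASCII")
    out = bytearray()
    i = 0
    while i < len(data):
        pair = data[i:i + 2]
        if pair in CODES:
            out.append(CODES[pair])  # two input bytes become one
            i += 2
        else:
            out.append(data[i])      # copy the byte unchanged
            i += 1
    return bytes(out)

def diatomic_decode(data: bytes) -> bytes:
    """Expand every pair code back into its two original bytes."""
    out = bytearray()
    for b in data:
        out += DECODE.get(b, bytes([b]))
    return bytes(out)

if __name__ == "__main__":
    text = b"THE RAIN IN SPAIN"
    packed = diatomic_encode(text)
    assert diatomic_decode(packed) == text  # lossless round trip
    print(len(text), "bytes ->", len(packed), "bytes")
```

A short sample like this one is chosen to contain many of the listed pairs, so its saving is not representative; the figure of more than 10% quoted above refers to ordinary English text.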

 

Note: lossless coding is sometimes called entropy coding.

