Tuesday, August 08, 2006

Crypto by Steven Levy

I finished reading the book "Crypto: How the Code Rebels Beat the Government - Saving Privacy in the Digital Age", by Steven Levy. This was a great book and I would highly recommend it to anyone. I think even people who aren't interested in cryptography would find it interesting.

The book gives an account of the invention of public key cryptography and the people who were responsible for it. It starts with the creation of the Diffie-Hellman key exchange algorithm, followed by the invention of the RSA algorithm. Towards the end of the book, Phil Zimmermann's effort to create PGP is explained.

A common theme throughout the book is the US government's attempt to keep the knowledge of the cryptographic research community from being made public. For example, many of the researchers were threatened with jail time for publishing papers on the topic of cryptography. What was truly interesting about this book is the insight into the personalities behind the algorithms and why they believed so strongly that cryptography should be made available to the masses.

The book chronicles the battle, lasting more than twenty years, that the crypto researchers had with organizations such as the NSA, FBI and US Congress. I knew before reading this book that the US government was mostly unsupportive of cryptography. I never knew the lengths to which agencies such as the NSA went to prevent people from using encryption.

People such as Diffie, Hellman, Rivest, Shamir, Adleman, and Zimmermann went to great lengths to ensure that the public's privacy is maintained. I admire what these individuals were able to accomplish by refusing to compromise with the US government. For anyone interested in cryptography or a person's right to privacy, this book is a must-read.

AOL Search Results Published

AOL recently published the search results of random users over a three-month period. The AOL search results contain approximately 2 gigs of data. Here is the READ ME and one of the search result files released from AOL. In total there are 10 files of search results. The file I posted contains 3,558,412 lines of text, so it gives a good indication of typical online searches. I found it very interesting to see what other people are searching for online.

I performed some basic parsing of the file to determine the number of users who searched for certain key words. One thing that I found very surprising is the number of people using AOL who searched for common web sites through the search engine, instead of typing in the URL directly. For example, there were thousands of searches for google.com, myspace.com, yahoo.com, msn.com and aol.com. I don't understand why a person who knows the domain name for a site would use a search engine. Also, if I were already using AOL, why would I need to search for AOL?
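
For anyone who wants to try something similar, here is a minimal sketch of this kind of parsing in Python. It is not the exact script I used. It assumes each line of the file is tab-separated with the user ID in the first column and the query text in the second, and that the first line is a header starting with "AnonID"; check the READ ME before relying on that layout. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def count_users_by_keyword(lines, keywords):
    """Return {keyword: number of distinct users whose query contains it}.

    Assumes tab-separated rows with the user ID in column 0 and the
    query text in column 1 (an assumption about the file layout).
    Keywords are expected to be lowercase.
    """
    users = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed rows and the (assumed) header row.
        if len(fields) < 2 or fields[0] == "AnonID":
            continue
        user_id, query = fields[0], fields[1].lower()
        for keyword in keywords:
            if keyword in query:
                # A set counts each user once, however often they searched.
                users[keyword].add(user_id)
    return {keyword: len(users[keyword]) for keyword in keywords}


# Made-up sample rows in the assumed layout, not real data from the release.
sample = [
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n",
    "101\tgoogle.com\t2006-03-01 10:00:00\t\t\n",
    "101\tgoogle.com\t2006-03-02 11:00:00\t1\thttp://www.google.com\n",
    "202\twww.myspace.com\t2006-03-05 09:30:00\t\t\n",
    "303\tlinux kernel\t2006-04-01 12:00:00\t\t\n",
]

counts = count_users_by_keyword(sample, ["google.com", "myspace.com", "linux"])
print(counts)
```

To run it against a real file, pass an open file object instead of the sample list, since the function only needs something it can loop over line by line. Counting distinct users rather than raw lines matters here because the same person often repeats a search many times.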

Another thing I found interesting is the large number of people who searched for porn through AOL. These are some of the random results I found of people looking for porn: "old lady gives doctor handjob", "pics of my ex", "youngorgy", "sex poetry", "female escorts for couple", and "bathroom sex mpegs". That is only a small sample of the search results for porn. It is hard to believe that so many people search for such obscure subjects. It is probably safe to say that many of the people I come in contact with daily are also looking for such subjects when they use the Internet.

I found it very surprising that very few users search for technical topics with AOL. I found few results of people looking for information on programming languages or operating systems. For example, a very small percentage of AOL users searched for information on Linux. I would have thought that more people would have searched for computer-related topics through the search engine. I think much of it may be due to the user community of AOL. If, on the other hand, Google published their search results, I think there would be more searches for technical material.

A final note is that many of the people who search with AOL have very bad spelling. I noticed that many of the searches contain misspelled words.