How Google Works

"How Google Works" by David Carr, Baseline Magazine, July 7, 2006

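Google's compute system is, in a sense, a really big LISP machine, since MapReduce borrows the map and reduce primitives found in Lisp and other functional languages: http://en.wikipedia.org/wiki/MapReduce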

http://en.wikipedia.org/wiki/Google_File_System

Google MapReduce, GFS, & more from the source: Google Labs papers

"How Google Works: Operating at Extreme Scale", by Bill McColl

DIY Google

So if you want to make your own (initially small and cheap) version of Google (a Giggle?), how do you do it?

To get the compute power, Amazon EC2 is clearly the way to go.

To take advantage of massively parallel clusters, we need to write our applications so that they can be processed in parallel. Traditional approaches include Globus Toolkit, MPI, and JavaSpaces. But if we want to do it the Google way, Hadoop is the most interesting option because it is an open-source (Apache) implementation of MapReduce in Java.
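
To make the programming model concrete, here is a minimal sketch of a MapReduce-style word count in plain Python. It is not Hadoop's API: the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical and chosen for illustration, and a local thread pool stands in for the machines of a cluster. The point is only that the map and reduce steps are independent per input line and per key, which is what lets a real framework spread them across many nodes.

```python
from collections import defaultdict
from multiprocessing.dummy import Pool  # thread pool; a stand-in for cluster nodes


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(mapped_chunks):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for chunk in mapped_chunks:
        for key, value in chunk:
            groups[key].append(value)
    return groups


def reduce_counts(item):
    """Reduce step: collapse all values for one key into a single total."""
    key, values = item
    return key, sum(values)


def word_count(lines, workers=4):
    """Run map, shuffle, and reduce; map and reduce calls run in the pool."""
    with Pool(workers) as pool:
        mapped = pool.map(map_words, lines)        # one map task per line
        grouped = shuffle(mapped)                  # gather values per word
        return dict(pool.map(reduce_counts, grouped.items()))  # one reduce task per word


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(sample))
```

On a real cluster the input lines would be blocks of a distributed file and the shuffle would move data between machines, but the application author still writes only the two small functions for map and reduce.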

http://glinden.blogspot.com/2006/11/hadoop-on-amazon-ec2.html

http://www.skrenta.com/2007/03/how_to_beat_google_part_1.html