The Internet has been developing for many years, and a number of sites have stood out along the way. Most of them have existed for nearly ten years or longer, and over such a long period they have faced many technical challenges in addition to business ones. I picked some of the top-ranked sites on Alexa (ranking as of April 21, 2012) to see how they met, at the technical level, the challenges brought by business growth.


Google is currently ranked first on Alexa. It was born in 1997 as a research project. At that time the index was built once a month, and the resulting index was distributed across multiple servers (Index Servers) by sharding (shard by doc). The actual page data was distributed across multiple servers (Doc Servers) by sharding in the same way. When a user submits a request, the front-end server sends it to the Index Servers to get the hits from the inverted index, then extracts the page information (such as the page title and the text fragment matching the search keywords) from the relevant Doc Servers, and finally presents the results to the user.
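The request flow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Google's actual implementation: the class names (IndexServer, DocServer, FrontEnd), the modulo sharding rule, the shard count, and the snippet logic are all hypothetical and chosen for brevity.

```python
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical shard count, chosen only for illustration


def shard_of(doc_id: int) -> int:
    """Shard by doc: every document belongs to exactly one shard (assumed modulo rule)."""
    return doc_id % NUM_SHARDS


class IndexServer:
    """Holds the inverted index for the documents of one shard."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, terms):
        # Ids of documents in this shard that contain every query term.
        sets = [self.postings.get(t, set()) for t in terms]
        return set.intersection(*sets) if sets else set()


class DocServer:
    """Holds the page data (title and text) for the documents of one shard."""

    def __init__(self):
        self.docs = {}

    def add(self, doc_id, title, text):
        self.docs[doc_id] = (title, text)

    def summarize(self, doc_id, terms):
        title, text = self.docs[doc_id]
        words = text.split()
        # Position of the first query term in the text (0 if none is found).
        pos = next((i for i, w in enumerate(words) if w.lower() in terms), 0)
        snippet = " ".join(words[max(0, pos - 2): pos + 3])
        return {"doc_id": doc_id, "title": title, "snippet": snippet}


class FrontEnd:
    """Fans a query out to all Index Servers, then asks Doc Servers for page info."""

    def __init__(self):
        self.index_servers = [IndexServer() for _ in range(NUM_SHARDS)]
        self.doc_servers = [DocServer() for _ in range(NUM_SHARDS)]

    def ingest(self, doc_id, title, text):
        shard = shard_of(doc_id)
        self.index_servers[shard].add(doc_id, text)
        self.doc_servers[shard].add(doc_id, title, text)

    def query(self, q):
        terms = q.lower().split()
        hits = set()
        for server in self.index_servers:  # every shard may hold matching documents
            hits |= server.search(terms)
        # Each hit is fetched from the Doc Server that owns its shard.
        return [self.doc_servers[shard_of(d)].summarize(d, terms) for d in sorted(hits)]
```

Because the index is sharded by document rather than by term, a query has to visit every Index Server, but each server can answer independently and the front end only needs to merge the hit lists.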

As the index grew, this structure could keep storing the index and the page data by adding more Index Servers and Doc Servers, but it still ran into many other problems. So over the following ten-plus years, Google did a great deal to improve its architecture.

In 1999, Google added a Cache Cluster to cache index query results and document fragment information, and at the same time turned the Index Servers and Doc Servers into clusters through replication. These two changes improved the site's response speed, the volume of traffic it could support, and its availability. They also increased cost. Google's approach to hardware has always been to avoid expensive high-end machines and to ensure system reliability and high performance at the software level instead, and in the same year Google began using self-designed servers to reduce cost. In 2000, Google began designing its own data centers, using a variety of methods (such as replacing air-conditioning-based cooling with other methods) to optimize PUE (power usage effectiveness), and it also did a lot of work on its self-designed servers. In 2001, Google changed the index format and put the entire index into memory. The benefit of this change is that the response speed of the site, as well as the volume of traffic it can support, can be obtained by the

