Posts Tagged ‘bookmooch’

bookmooch illustration
[Illustration credit Andrice Arp, courtesy of BookMooch.com]

Following up on my previous post about Twitter’s database architecture and scaling, here’s one about BookMooch.com’s scaling troubles (BookMooch is a book-swapping site).

BookMooch had a search engine for books, built on a single table that indexed each word in a book’s title, author, etc. This design was very efficient for searching (one query per word). The problem was that adding new books to the index was very costly: the more popular a word was, the longer its entry took to update. It got to the point where disk writes took as much as 20 seconds out of their allotted minute.
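To make the trade-off concrete, here is a minimal sketch of that kind of single-table word index. It is my own guess at the shape, not BookMooch’s actual schema: the table `word_index`, its columns, and the helpers `add_book` and `search` are all hypothetical, and I’m assuming one row per word that holds the full list of matching book IDs.

```python
import sqlite3

# Hypothetical schema: one row per word, holding every matching book ID.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE word_index (word TEXT PRIMARY KEY, book_ids TEXT NOT NULL)"
)


def add_book(book_id, text):
    """Index a book by appending its ID to the row of each word it contains."""
    for word in set(text.lower().split()):
        row = conn.execute(
            "SELECT book_ids FROM word_index WHERE word = ?", (word,)
        ).fetchone()
        ids = row[0].split(",") if row else []
        ids.append(str(book_id))
        # The whole list is rewritten on every update, so the cost of the
        # write grows with how many books already contain the word.
        conn.execute(
            "INSERT OR REPLACE INTO word_index (word, book_ids) VALUES (?, ?)",
            (word, ",".join(ids)),
        )


def search(word):
    """Look up one word with a single query."""
    row = conn.execute(
        "SELECT book_ids FROM word_index WHERE word = ?", (word.lower(),)
    ).fetchone()
    return [int(i) for i in row[0].split(",")] if row else []


add_book(1, "The Hobbit")
add_book(2, "The Silmarillion")
print(search("the"))     # [1, 2]
print(search("hobbit"))  # [1]
```

Reads stay cheap because each word is a single primary-key lookup, but a word like “the” ends up with a huge row that has to be rewritten every time a new book mentions it.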

In short, the solution was to use two tables for the index instead of one. One table was optimized for queries (fast searches), the other for updates (adding new books quickly). The two tables would be synchronized periodically.
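Here is a rough sketch of how such a two-table setup might look. Again, this is an illustration under my own assumptions rather than BookMooch’s real implementation: the table names `search_index` and `pending_words` and the functions `queue_book`, `sync`, and `find` are made up, and I’m assuming searches read only the query-optimized table, so new books become searchable after the next sync.

```python
import sqlite3
from collections import defaultdict

db = sqlite3.connect(":memory:")
# Query-optimized table (hypothetical): one row per word.
db.execute(
    "CREATE TABLE search_index (word TEXT PRIMARY KEY, book_ids TEXT NOT NULL)"
)
# Update-optimized table (hypothetical): append-only, one row per (word, book).
db.execute("CREATE TABLE pending_words (word TEXT NOT NULL, book_id INTEGER NOT NULL)")


def queue_book(book_id, text):
    """Cheap write path: only appends small rows, never rewrites a big one."""
    rows = [(word, book_id) for word in set(text.lower().split())]
    db.executemany("INSERT INTO pending_words (word, book_id) VALUES (?, ?)", rows)


def sync():
    """Periodic job: fold pending rows into the search table, once per word."""
    grouped = defaultdict(list)
    for word, book_id in db.execute(
        "SELECT word, book_id FROM pending_words ORDER BY rowid"
    ):
        grouped[word].append(book_id)
    for word, new_ids in grouped.items():
        row = db.execute(
            "SELECT book_ids FROM search_index WHERE word = ?", (word,)
        ).fetchone()
        ids = [int(i) for i in row[0].split(",")] if row else []
        ids.extend(new_ids)
        db.execute(
            "INSERT OR REPLACE INTO search_index (word, book_ids) VALUES (?, ?)",
            (word, ",".join(map(str, ids))),
        )
    db.execute("DELETE FROM pending_words")
    db.commit()


def find(word):
    """Read path: still a single query against the search table."""
    row = db.execute(
        "SELECT book_ids FROM search_index WHERE word = ?", (word.lower(),)
    ).fetchone()
    return [int(i) for i in row[0].split(",")] if row else []


queue_book(1, "The Hobbit")
queue_book(2, "The Silmarillion")
before_sync = find("the")  # [] because nothing has been merged yet
sync()
after_sync = find("the")   # [1, 2]
```

The point of the split is that each popular word’s big row is rewritten once per sync instead of once per new book, at the price of new books showing up in search with a delay.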

IMHO, the guy in charge did nothing wrong with the initial design of the search engine. Some characteristics of a system are hard to predict, and even when you can predict them, it takes a lot of experience to optimize for them correctly in advance (like the problem in the second paragraph). As in the Twitter story, profiling saved the day, but don’t forget to monitor your logs!
