Wednesday, March 24, 2010

eBay Scalability

While I was discussing SQL with a friend a few days ago, it occurred to me that the design of some databases could be described as procedural rather than relational. These procedural databases do a few things well, but such a design becomes a hindrance when the time comes to add new features to the application. That is the negative view, but this article makes the case for just such a design.

Point 4 in the article states: "To scale at Digg they followed a set of practices very similar to those used at eBay. No joins, no foreign key constraints (to scale writes), primary key look-ups only, limited range queries, and joins were done in memory." (Note how closely this design mirrors the design used in most AS/400 RPG applications.)
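To make the quoted practices concrete, here is a minimal sketch of a join done in memory using only primary key look-ups. The tables, column names, and helper functions are my own hypothetical examples, not anything taken from Digg or eBay, and SQLite stands in for whatever store is actually used.

```python
import sqlite3

# Hypothetical schema: no foreign key constraint links stories to users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (user_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stories (story_id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO stories VALUES (?, ?, ?)",
    [(10, 1, "Sharding 101"), (11, 2, "Cache tips"), (12, 1, "No joins")],
)


def fetch_story(story_id):
    """Primary key look-up on stories; returns a row tuple or None."""
    return conn.execute(
        "SELECT story_id, user_id, title FROM stories WHERE story_id = ?",
        (story_id,),
    ).fetchone()


def fetch_user(user_id):
    """Primary key look-up on users; returns a row tuple or None."""
    return conn.execute(
        "SELECT user_id, name FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()


def stories_with_authors(story_ids):
    """Join stories to authors in application memory instead of in SQL."""
    stories = [s for s in map(fetch_story, story_ids) if s is not None]
    # One look-up per distinct author, held in a dict for the in-memory join.
    users = {uid: fetch_user(uid) for uid in {s[1] for s in stories}}
    result = []
    for _story_id, user_id, title in stories:
        author = users.get(user_id)
        # Without a foreign key, the application must tolerate a missing author.
        result.append((title, author[1] if author else None))
    return result
```

Calling stories_with_authors([10, 11, 12]) pairs each title with its author's name. Note that the check for a missing author is work the database would otherwise do through a foreign key constraint; that is part of the trade-off.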

The article's point 2 directs the designer to restrict what the user can do and not to aim for a generic database.  The article doesn't state that point 2 is a prerequisite to point 4, but I think it must be.  Such a design may sometimes be necessary, but electing to omit normalization from the database design reduces the potential for extending that database.

Both a relational design and the design used at Digg are legitimate but, as with all design decisions, there are trade-offs. The design can be generalized and thus extensible, or it can be optimized for performance and specialized to do fewer business functions, with limited potential for reuse without redesign.  To get both the generalized extensibility and the functionality optimized for critical performance requirements, consideration should be given to dividing the problem and solving it with two designs.

A quote from this article gives food for thought about when and how to address scalability:
"Scalability is about how resource usage changes as units of work grow in number or size.  Said another way, scalability is the shape of the price-performance curve, as opposed to its value at one point in that curve."
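One rough way to see the difference between the shape of the curve and its value at one point is to estimate how cost grows with work from a few measurements. The numbers below are invented for illustration, and growth_exponent is a hypothetical helper of my own, not something from the quoted article.

```python
import math


def growth_exponent(points):
    """Estimate k in cost ~ work**k from the first and last measurements.

    points is a list of (units_of_work, cost) pairs, ordered by work.
    This is the slope of the curve on a log-log plot.
    """
    (w0, c0), (w1, c1) = points[0], points[-1]
    return math.log(c1 / c0) / math.log(w1 / w0)


# Two made-up systems that cost exactly the same at 1000 units of work.
system_a = [(100, 10.0), (1000, 100.0), (10000, 1000.0)]   # cost grows linearly
system_b = [(100, 1.0), (1000, 100.0), (10000, 10000.0)]   # cost grows quadratically

exponent_a = growth_exponent(system_a)
exponent_b = growth_exponent(system_b)
```

Measured only at 1000 units, the two systems look identical. The exponents (about 1 for the first, about 2 for the second) describe the shape, and that shape is what the quote calls scalability.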