I keep hearing that one way to architect a scalable website is to avoid joins. How in the world do you do that when most data is relational?
My limited research has yielded these thoughts:
A) If your data is inherently relational then indeed use a relational database, i.e., use the right tool for the job.
B) Maintain a denormalized version of your data.
C) For data that can be modeled non-relationally, use NoSQL. Design the data model so that joins are not necessary.
D) If you must relate your data, then the application layer has to implement the joins itself, fetching the data sets one by one and relating the results in code.
E) Since manual joins at the application layer are very slow, try to do them offline (not while the user is waiting).
F) Use Map-Reduce.
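To make points B and D concrete, here is a minimal sketch of what I have in mind, assuming the two data sets have already been fetched by separate queries. The `users` and `comments` records, their field names, and both function names are hypothetical, made up purely for illustration.

```python
# Hypothetical data standing in for the results of two separate queries.
users = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]
comments = [
    {"id": 10, "user_id": 1, "body": "first"},
    {"id": 11, "user_id": 2, "body": "second"},
    {"id": 12, "user_id": 1, "body": "third"},
]


def join_comments_to_users(users, comments):
    """Point D: application-side join. Index one side by key, then probe it."""
    users_by_id = {u["id"]: u for u in users}
    joined = []
    for c in comments:
        user = users_by_id.get(c["user_id"])
        if user is None:
            continue  # inner-join semantics: drop comments with no matching user
        joined.append(
            {"comment_id": c["id"], "body": c["body"], "user_name": user["name"]}
        )
    return joined


def denormalize(users, comments):
    """Point B: precompute one record per user with the comments embedded."""
    docs = {u["id"]: {"name": u["name"], "comments": []} for u in users}
    for c in comments:
        if c["user_id"] in docs:
            docs[c["user_id"]]["comments"].append(c["body"])
    return docs


if __name__ == "__main__":
    print(join_comments_to_users(users, comments))
    print(denormalize(users, comments))
```

With a dictionary index the in-memory part is a single pass over each list, so I suspect the real cost of D is more in fetching the data sets than in relating them. The denormalized form would be built ahead of time (point E) and read back with a single lookup per user.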
Is this correct? Are there any other approaches?
High Scalability has excellent articles on this. Check out the Reddit one for how they handled joins: http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html
There is also an existing Stack Overflow question with a bunch of links in the answers covering similar material: Techniques for writing a scalable website