Here is a popular design, available on the internet, for a Tiny URL application:
The application is load balanced with the help of ZooKeeper, where each server is assigned a counter range registered with ZooKeeper.
Each application server has a locally available counter, which increments on every write request.
There is a cache available which (probably) gets updated with each write request.
Gaps in my understanding:
For every write request, we don't check whether a tiny URL already exists in the DB for the large URL, so we keep inserting on every write request (even though a tiny URL already exists for that particular large URL). Is that correct? If so, would there be a cleanup activity (removing redundant duplicate tiny URLs for the same large URL) during some planned downtime of the application each day?
What is the point of scaling if, for a counter range of 1 million (or more) values, there is just one server handling the requests? Wouldn't that be a problem? For example, if there is a large-scale write load, would vertical scaling be needed to avoid slowness?
Kindly correct me if I have got anything wrong here.
Design problems are open ended; keeping that in mind, here is my take on your questions.
It may be a requirement to allow users to have their own tiny URLs, even if they point to the same large URL. For example, every user might want to see stats on how many times their specific tiny URL was clicked; this is a typical usage for tiny URLs: put them into a blog/video/letter to get stats. In that case the duplicates are intentional, and no cleanup is needed.
Let me expand on "each server is assigned a counter range registered". This implies that generated IDs have the structure: X bits of service id + Y bits from the local counter. The X bits are assigned by ZooKeeper, and this is what makes each server responsible for one range.
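To make the ID structure concrete, here is a minimal sketch of the idea. The names (`IdGenerator`, `COUNTER_BITS`) and the 20-bit counter width are my own assumptions for illustration, and the prefix is passed in by hand rather than obtained from ZooKeeper.

```python
COUNTER_BITS = 20  # Y: assumed width of the local counter (about 1 million IDs)


class IdGenerator:
    """Builds IDs as (X-bit prefix) followed by (Y-bit local counter)."""

    def __init__(self, prefix: int, counter_bits: int = COUNTER_BITS):
        self.prefix = prefix              # X bits; would come from ZooKeeper
        self.counter_bits = counter_bits
        self.counter = 0                  # local, no coordination per write

    def next_id(self) -> int:
        if self.counter >= (1 << self.counter_bits):
            # Range used up: the server would have to ask for a new prefix.
            raise OverflowError("local counter range exhausted")
        new_id = (self.prefix << self.counter_bits) | self.counter
        self.counter += 1
        return new_id


gen = IdGenerator(prefix=3)
first = gen.next_id()   # prefix 3, counter 0
second = gen.next_id()  # prefix 3, counter 1
```

Because the prefix differs per server, two servers can never produce the same ID, even though each one only increments its own local counter.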
Several servers will be placed behind a load balancer. When a request comes to the load balancer, it will be sent to a randomly picked server. If the servers are overloaded, you can just add more servers behind the load balancer, each of which owns its own range. This allows the service as a whole to scale up and down, with no need for vertical scaling.
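As a rough simulation of this setup (not a real deployment), the sketch below uses an in-process `LoadBalancer` class and a simple counter as a stand-in for ZooKeeper handing out prefixes. All class and method names here are hypothetical.

```python
import itertools
import random

COUNTER_BITS = 20  # assumed size of each server's local counter


class Server:
    def __init__(self, prefix: int):
        self.prefix = prefix   # range owned by this server
        self.counter = 0

    def handle_write(self, long_url: str) -> int:
        # No lookup for an existing tiny URL: every write gets a fresh ID.
        new_id = (self.prefix << COUNTER_BITS) | self.counter
        self.counter += 1
        return new_id


class LoadBalancer:
    def __init__(self):
        self.servers = []
        self._prefixes = itertools.count()  # stand-in for ZooKeeper

    def add_server(self) -> Server:
        server = Server(next(self._prefixes))
        self.servers.append(server)
        return server

    def write(self, long_url: str) -> int:
        # Send the request to a randomly picked server.
        return random.choice(self.servers).handle_write(long_url)


lb = LoadBalancer()
lb.add_server()
lb.add_server()
ids = [lb.write("https://example.com/some/long/path") for _ in range(1000)]

lb.add_server()  # scale out under load: just one more prefix
ids += [lb.write("https://example.com/some/long/path") for _ in range(1000)]
```

Every ID in `ids` is distinct even though the servers never talk to each other on the write path, which is why adding servers is enough to absorb a heavier write load.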
The key to understanding this design is that those ranges are arbitrary. There is no need for them to be consecutive.