Load generation of millions of HTTP requests


I've heard that some load generators can generate millions of requests, but since TCP has only about 65,000 ports, how is this possible?



Answer 1 (score: 2):

I will assume you are talking about 1 million end users, because for some applications 1 million requests on the server can be generated by a much smaller number of users. A good example of that is this study: they had only 13,000 simultaneous connections to the server, yet that generated a workload of over a million messages per second.

It also depends on whether you mean 1 million concurrent requests, or 1 million total requests from a given number of users.

The second is easy to achieve: with X concurrent users you need to run your test for 1,000,000 / X iterations. For example, with 100 concurrent users you'd need 10,000 iterations, which is not a lot, really.
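The arithmetic above can be sketched as a tiny helper; the function name is illustrative, not part of any tool:

```python
# Sketch: total requests = concurrent users x iterations per user,
# so iterations = ceil(total / users).
def iterations_needed(total_requests: int, concurrent_users: int) -> int:
    """How many iterations each concurrent user must run."""
    return -(-total_requests // concurrent_users)  # ceiling division

# 100 concurrent users need 10,000 iterations for 1,000,000 total requests.
print(iterations_needed(1_000_000, 100))  # 10000
print(iterations_needed(1_000_000, 500))  # 2000
```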

One million concurrent requests is a more interesting topic. It is, as you pointed out, impossible to achieve from one machine, and ports are not the only reason why. That's where distributed load comes in handy: if you have N remote JMeter engines, each running X concurrent requests, you can deliver N * X concurrent requests.

So theoretically, if you were limited only by the number of ports, and assuming you had about 64K ports to use (ports 0-1023 are privileged, and some ports above 1023 will already be taken), you would need 16 or so remote JMeter engines, each running about 64K concurrent users, to deliver 1 million requests concurrently. In practice, though, ports are unlikely to be your biggest problem. You will more likely be limited by other parameters of the machine: memory, CPU, and as a result the number of threads. On a good machine you can currently run about 500-2000 concurrent JMeter threads (depending on the machine and the test). So to deliver 1 million concurrent requests, you would need 500-2000 remote JMeter engines...

Even before considering the complexity of managing such a test and analyzing its results, the question is: do you really need a full-scale test? No single server will ever serve 1 million concurrent users; it has to be a cluster. And if it is a cluster, you don't have to test against the full thing: you can scale it down to manageable proportions and extrapolate the results to the bigger cluster.

Edit: based on the comment, it seems you are talking about a parallel universe. Within JMeter and a typical web application, one user = one thread, and the web application's parallelism is also thread-based. If you abandon those limitations, and your web app supports asynchronous APIs (traditional or Quasar-like), then it's a different reality. In that case you can reach the point where the number of ports becomes the limit, but since ports are counted per network interface, you can add network interfaces to support the desired number of ports. On Linux these could be virtual interfaces, and given that most machines nowadays are virtual anyway, they could also be virtual network adapters. You would need 16 of them for 1 million ports.
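To make the asynchronous case concrete, here is a minimal sketch (not JMeter, and not real HTTP): one OS thread holding thousands of in-flight "requests" at once via an async API. The `simulated_request` coroutine is an assumed stand-in for a network call, with `asyncio.sleep` playing the role of network wait time:

```python
import asyncio
import time

# Stand-in for an asynchronous HTTP request: the coroutine "waits on
# the network" without tying up a thread.
async def simulated_request(delay: float) -> int:
    await asyncio.sleep(delay)
    return 1

async def main(n: int) -> int:
    # Hold n requests in flight concurrently on a single thread.
    results = await asyncio.gather(*(simulated_request(0.1) for _ in range(n)))
    return sum(results)

start = time.monotonic()
completed = asyncio.run(main(10_000))
elapsed = time.monotonic() - start
print(completed)       # 10000 concurrent "requests", one thread
print(elapsed < 5.0)   # far faster than 10,000 x 0.1 s run sequentially
```

Run sequentially, 10,000 such requests would take over 1,000 seconds; run concurrently they finish in roughly the time of one, which is the whole point of abandoning thread-per-user.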

Answer 2 (score: 0):

On a process basis, independent of tool, using a single load generator is not recommended. With one generator it is almost impossible to include a control group to check the quality of your test, or to detect degradation of virtual users caused by an overloaded generator rather than by the application.

Seek at least three generators: two for primary load and one for a control group running a single virtual user of each tested business process. If your control group and your global group degrade at the same rate, you can be assured that the cause is external: the application under test. On the other hand, if your two primary generators degrade in response time but your control group does not (or even gets faster), then you have a test design issue that merely looks like a degraded application. Just as you monitor your application/site under test for resource challenges, you need to do the same with your load generators.
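The control-group check described above can be sketched as a comparison of degradation rates. Everything here is illustrative: the function names, the 10% tolerance, and the sample response times are assumptions, not part of any tool:

```python
# Sketch: compare response-time degradation of the primary (global)
# group against a single-user control group.
def degradation(baseline_ms: float, current_ms: float) -> float:
    """Fractional slowdown relative to baseline."""
    return (current_ms - baseline_ms) / baseline_ms

def diagnose(primary: tuple, control: tuple, tolerance: float = 0.10) -> str:
    p = degradation(*primary)
    c = degradation(*control)
    if abs(p - c) <= tolerance:
        # Both groups slow down together: the cause is external.
        return "external: application under test"
    if p > c:
        # Primary degrades but the lone control user does not:
        # the generators themselves are overloaded.
        return "test design: overloaded generators"
    return "inconclusive"

# Both groups go from ~200 ms to ~400 ms -> blame the application.
print(diagnose(primary=(200, 400), control=(200, 410)))
# Primary doubles while the control user barely moves -> blame the test.
print(diagnose(primary=(200, 400), control=(200, 205)))
```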

There are many reasons why a million concurrent requests within the same request/response window is not plausible:

  1. All of the static components for a site that heavy would be served by a CDN solution distributed on a worldwide basis. The number of requests actually tunneling through the CDN to the origin servers should be quite small, especially when you consider that the preponderance of requests on a page are static components, third-party components, and the tracking widgets your marketing department will obsess over.
  2. Assuming this is an internet-facing application without any type of push technology synchronizing a large population, you are looking at a population of well over 100,000,000 online users all engaged in an activity at that moment. Humans, being chaotic instruments, are difficult to get to do anything at a single "NOW!!!" moment. Check your HTTP access logs and you will see how the load is actually spread out. The IP addresses can also provide an objective view of load within the average user session duration, or a five-minute block (whichever is easier to look at).
  3. I have some experience with very highly scaled internet sites. You never test the full load; you test a scaled load against a defined "module" or subset of the application architecture. Mind that a module in some cases is a group of ten 42U racks rolled from the production definition for upgrade to a new release and performance testing, and that module may be 5% of the infrastructure.

So, in the end, your 1,000,000 concurrent requests drop to 5% as you test a slice/module/pod: 50,000. Next, that amount is reduced by 80% to account for what is being caught or serviced by the CDN layer (and in practice the CDN share is probably higher than 80%). That gets you to 10,000 concurrent requests. Your HTTP logs can confirm the precise timing of requests occurring within a single request/response window, and the odds of all 10K arriving at exactly the same instant are quite low.
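The reduction above is just two multiplications; here it is as a sketch, where the 5% module share and 80% CDN offload are the text's illustrative figures, not universal constants:

```python
# Sketch: origin concurrency after scaling down to a module slice and
# subtracting the traffic the CDN absorbs.
def origin_concurrency(total: int, module_share: float, cdn_offload: float) -> int:
    return round(total * module_share * (1 - cdn_offload))

# 1,000,000 concurrent -> 5% module slice -> 80% served by the CDN:
print(origin_concurrency(1_000_000, 0.05, 0.80))  # 10000
```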