We have a use case where we need to implement rate limiting for our AWS REST API running on ECS. Initially we used an NLB, but then the requirement came to add rate limiting so that users cannot damage our AWS resources. The resource flow is:
WAF <--> ALB <--> ECS
We applied a rate limit in WAF, but it does not work for us because of its caveats. The caveats below are from the AWS docs on WAF:
The following caveats apply to AWS WAF rate-based rules:
AWS WAF checks the rate of requests every 30 seconds, and counts requests for the prior 5 minutes each time. Because of this, it's possible for an IP address to send requests at too high a rate for 30 seconds before AWS WAF detects and blocks it.
AWS WAF can block up to 10,000 IP addresses. If more than 10,000 IP addresses send high rates of requests at the same time, AWS WAF will only block 10,000 of them.
Our users submit requests in bulk. For example, 4K requests can be submitted within 10 seconds, and that bypasses the WAF rule.
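To illustrate why a short burst slips through, here is a minimal Python sketch of a rule that is only evaluated periodically. It is a simplified model based on the caveat quoted above, not AWS WAF's actual implementation; the function name `simulate_periodic_check`, the limit of 100, and the timings are assumptions chosen for illustration.

```python
def simulate_periodic_check(request_times, limit, check_interval=30, window=300):
    """Count requests allowed by a rule evaluated only every `check_interval` seconds.

    Simplified model (assumption): at each check, requests seen in the prior
    `window` seconds are counted, and the client is blocked while the count
    exceeds `limit`.
    """
    blocked = False
    allowed = 0
    seen = []                      # timestamps of all requests, blocked or not
    next_check = check_interval    # time of the next rule evaluation

    for t in sorted(request_times):
        # Run every evaluation that was due before this request arrived.
        while next_check <= t:
            recent = [s for s in seen if next_check - window < s <= next_check]
            blocked = len(recent) > limit
            next_check += check_interval
        seen.append(t)
        if not blocked:
            allowed += 1
    return allowed


# 4000 requests sent within 10 seconds: the burst ends before the first check.
burst = [i * 10 / 4000 for i in range(4000)]
# The same 4000 requests spread over 120 seconds: blocked after the first check.
sustained = [i * 120 / 4000 for i in range(4000)]

print(simulate_periodic_check(burst, limit=100))
print(simulate_periodic_check(sustained, limit=100))
```

In this model the whole burst is allowed because no evaluation happens during the 10 seconds it lasts, while the sustained traffic is cut off once the first check runs.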
We cannot use AWS API Gateway because our responses can exceed 10 MB, and its integration timeout is 30 seconds while our requests can sometimes take up to 1 minute.
Is there any other approach we can use here, apart from application-based rate limiting?
I suggest the following approach:
This way, if there is a spike in GET requests to your application, CloudFront will send the response directly from its cache. Those requests will hit neither the ALB nor your ECS service(s).
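The effect of a cache in front of the origin can be sketched with a toy model in Python. This is not CloudFront's API or behavior in detail; `TtlCache`, the `origin` function, the `/items` path, and the 60-second TTL are hypothetical names and values used only to show how repeated GETs for the same resource collapse into a single origin request.

```python
class TtlCache:
    """Toy stand-in for an edge cache: serves a stored response until its TTL expires."""

    def __init__(self, origin, ttl_seconds):
        self.origin = origin          # callable that produces the real response
        self.ttl = ttl_seconds
        self.store = {}               # path -> (expires_at, response)
        self.origin_hits = 0          # how many requests reached the origin

    def get(self, path, now):
        entry = self.store.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]           # cache hit: the origin is not contacted
        # Cache miss or expired entry: fetch from the origin and store the result.
        self.origin_hits += 1
        response = self.origin(path)
        self.store[path] = (now + self.ttl, response)
        return response


def origin(path):
    # Hypothetical origin standing in for the ALB + ECS service.
    return f"response for {path}"


cache = TtlCache(origin, ttl_seconds=60)
# 4000 GETs for the same path within 10 seconds.
for i in range(4000):
    cache.get("/items", now=i * 10 / 4000)

print(cache.origin_hits)  # prints 1
```

This only helps for responses that are cacheable; requests that must always reach the application would still need a limit somewhere else.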