I'm using serverless framework to deploy my functions to AWS Lambdas. The development is done in Kotlin using Java runtime.
To mitigate the cold start issue to some extent, I'm using Serverless Plugin Warmup
My handler function looks something like :
class MyHandler : RequestHandler<Map<String, Any>, ApiGatewayResponse> {
private val myService = MyService()
private val printStatement = LOG.debug("ColdStart") /* will be logged only during cold starts */
override fun handleRequest(input: Map<String, Any>, context: Context): ApiGatewayResponse {
val startTime = System.currentTimeMillis()
input.handleWarmup(LOG)?.let{ return it }
val customInput : Input = Input.fromMap(input) ?: return ApiResponseHelper.generateStandardErrorResponse(Messages.MALFORMED_REQUEST, 400)
LOG.info(customInput.routeKey)
val strategy = StrategyFactory.getStrategy(customInput.routeKey)
?: return ApiResponseHelper.generateStandardErrorResponse(Messages.REQUEST_NOT_RECOGNIZED, 404)
return try {
LOG.debug("Before Strategy : ${System.currentTimeMillis() - startTime}")
strategy.apply(customInput, myService)
}
catch(e : Exception){
ExceptionParser.exceptionHandler(e)
}
}
companion object {
private val LOG = LogManager.getLogger(MyHandler::class.java)
}
}
Since this handler acts as entry point to multiple endpoints, hence I'm using a StrategyFactory ( static/ object class ) to fetch the corresponding logic that should be called.
Fast forward to CloudWatch Logs :
START RequestId: 6e32e62b-2a04-4001-85ab-c0e61d48584d Version: $LATEST
2022-06-08T15:00:37.657+05:30 Picked up JAVA_TOOL_OPTIONS: -XX:+TieredCompilation -XX:TieredStopAtLevel=1
2022-06-08T15:00:39.586+05:30 Transforming org/apache/logging/log4j/core/lookup/JndiLookup (lambdainternal.CustomerClassLoader@433c675d)
2022-06-08T15:00:39.602+05:30 WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2022-06-08T15:00:39.602+05:30 2022-06-08 09:30:39 DEBUG MyHandler:20 - ColdStart
2022-06-08T15:00:39.680+05:30 2022-06-08 09:30:39 6e32e62b-2a04-4001-85ab-c0e61d48584d INFO MyHandler:140 - WarmUp - Lambda is warm!
2022-06-08T15:00:39.763+05:30 END RequestId: 6e32e62b-2a04-4001-85ab-c0e61d48584d
2022-06-08T15:00:39.763+05:30 REPORT RequestId: 6e32e62b-2a04-4001-85ab-c0e61d48584d Duration: 110.59 ms Billed Duration: 111 ms Memory Size: 512 MB Max Memory Used: 145 MB Init Duration: 2406.90 ms
After sometime I make a call to this lambda ( from another lambda and this was logged to the same log group right after the previous logs )
2022-06-08T15:08:29.071+05:30 START RequestId: ad24cf5e-17ea-4347-9931-be4eb8cbc47b Version: $LATEST
2022-06-08T15:08:31.981+05:30 2022-06-08 09:38:31 ad24cf5e-17ea-4347-9931-be4eb8cbc47b INFO MyHandler:30 - GET /posts/{id}
2022-06-08T15:08:32.019+05:30 2022-06-08 09:38:32 ad24cf5e-17ea-4347-9931-be4eb8cbc47b DEBUG MyHandler:36 - Before Strategy : 2944
2022-06-08T15:08:34.186+05:30 END RequestId: ad24cf5e-17ea-4347-9931-be4eb8cbc47b
2022-06-08T15:08:34.186+05:30 REPORT RequestId: ad24cf5e-17ea-4347-9931-be4eb8cbc47b Duration: 5112.63 ms Billed Duration: 5113 ms Memory Size: 512 MB Max Memory Used: 188 MB
Here we see that, REPORT has not InitDuration which means this calls is not a cold start call. Can be confirmed since LOG.debug("ColdStart")
also didn't execute with this call.
So, I could've argued that my implementation needs optimizing. But it was taking 3 seconds just to reach till strategy.apply(wdInput, nodeService)
( As seen in the logs ) which does not seems right.
However, right after this, I made a same call right after the above call and this time it just took 2ms to reach till strategy.apply(wdInput, nodeService)
. The logs look like :
2022-06-08T15:08:46.709+05:30 START RequestId: 7228e62b-e9cd-40f8-b27b-a609b9abca0e Version: $LATEST
2022-06-08T15:08:46.714+05:30 2022-06-08 09:38:46 7228e62b-e9cd-40f8-b27b-a609b9abca0e INFO MyHandler:30 - GET /shared/node/{id}/users
2022-06-08T15:08:46.714+05:30 2022-06-08 09:38:46 7228e62b-e9cd-40f8-b27b-a609b9abca0e DEBUG MyHandler:36 - Before Strategy : 2
2022-06-08T15:08:46.762+05:30 END RequestId: 7228e62b-e9cd-40f8-b27b-a609b9abca0e
2022-06-08T15:08:46.762+05:30 REPORT RequestId: 7228e62b-e9cd-40f8-b27b-a609b9abca0e Duration: 50.51 ms Billed Duration: 51 ms Memory Size: 512 MB Max Memory Used: 189 MB
It makes sense for warmup plug to take InitDuration of 2406ms since it's a cold start. However, the next call isn't a cold start and takes a duration of ~5000ms in total and the second call after this is faster than this by a factor of 100. ( Note that duration for warmup call was just 110ms ).
What is causing this and how this could be better handled?