How can I get an invoker Lambda to run a Cloud Custodian policy in multiple accounts in one run?


I have multiple c7n-org policies to be run in all regions in a list of accounts. Locally I can do this easily with:

    c7n-org run -c accounts.yml -s out --region all -u cost-control.yml

The goal is to have an AWS Lambda function run this daily against all accounts. Currently I have a child Lambda function for each policy in cost-control.yml and an invoker Lambda function that loops through the child functions and calls each one, passing it the ARN of the role to assume and the region. Because I call the child functions for every account and every region, each child function is invoked repeatedly with different parameters.
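For illustration, the fan-out described above might look roughly like the sketch below. The function names build_payloads and invoke_children and the payload keys role_arn and region are hypothetical; only the invoke call (FunctionName, InvocationType, Payload) follows the boto3 Lambda client.

```python
import json
from itertools import product


def build_payloads(role_arns, regions):
    """Build one payload per (role ARN, region) pair.

    The key names "role_arn" and "region" are assumptions for this sketch.
    """
    return [
        {"role_arn": arn, "region": region}
        for arn, region in product(role_arns, regions)
    ]


def invoke_children(lambda_client, function_names, payloads):
    """Invoke every child function once per payload, asynchronously.

    lambda_client is expected to behave like boto3.client("lambda").
    Returns the number of invocations that were sent.
    """
    sent = 0
    for name in function_names:
        for payload in payloads:
            lambda_client.invoke(
                FunctionName=name,
                InvocationType="Event",  # fire and forget
                Payload=json.dumps(payload).encode("utf-8"),
            )
            sent += 1
    return sent
```

In the invoker itself, lambda_client would be created with boto3.client("lambda"). Each child function then receives a different role and region on every call, which is the situation where state cached between invocations starts to matter.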

To get the region to change each time, I had to remove an if statement in handler.py (line 144) of the SDK that caches the config file, so that the new config with this invocation's parameters is read on subsequent invocations.

    # one time initialization for cold starts.
    global policy_config, policy_data
    if policy_config is None:
        with open(file) as f:
            policy_data = json.load(f)
        policy_config = init_config(policy_data)
        load_resources(StructureParser().get_resource_types(policy_data))

I removed the "if policy_config is None:" line and changed the filename to point to a new config file, which the custodian_policy.py Lambda code writes to tmp and which contains the parameters for this invocation.
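As a rough sketch of the per-invocation config step, something like the following could write a fresh config file for each call. The helper name write_invocation_config is hypothetical, and the key names "execution-options", "assume_role" and "region" are assumptions made for illustration; check them against the config file your deployed function actually reads.

```python
import copy
import json
import os
import tempfile


def write_invocation_config(base_config, role_arn, region, directory=None):
    """Write a copy of base_config with this invocation's role and region.

    The key names used below are assumptions for this sketch, not a
    confirmed description of the Cloud Custodian config format.
    Returns the path of the file that was written.
    """
    config = copy.deepcopy(base_config)  # leave the caller's dict untouched
    options = config.setdefault("execution-options", {})
    options["assume_role"] = role_arn
    options["region"] = region

    # mkstemp uses the system temp directory unless one is given.
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(config, f)
    return path
```

Writing to a unique file per invocation avoids one call overwriting the config of another, but it does not by itself clear anything the library has already cached in memory.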

In the log streams for each invocation of the child Lambdas, the roles are not assumed properly. The regions change as expected and Cloud Custodian runs the policy in the different regions, but it keeps the account from the first invocation. Each log stream shows the Lambda assuming the role from the first set of parameters sent by the invoker and then not changing the role on later calls, even though it receives the correct parameters.

I've tried changing init_config() in the Cloud Custodian SDK's handler.py to force it to change the account_id each time. I know I shouldn't be changing the SDK code, though, and there is probably a proper way to do this using the policies.

I've thought about going the Fargate route, which would be more like running it locally, but I'm not sure whether I would run into this issue there too.

Could anyone give me some pointers on how to get Cloud Custodian to assume different roles across many Lambda invocations?

Best answer

I found the answer in the local_session function in utils.py of the c7n SDK. It caches the session for up to 45 minutes, and the cache is keyed only by region. Because a warm Lambda container keeps that cache between invocations, the first account's session was reused on every later invocation within the same log stream.
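The effect can be reproduced with a simplified model of such a cache. The sketch below is not the library's code: cached_session and FakeFactory are hypothetical stand-ins, and the "session" is just a string so the example runs without AWS credentials.

```python
import threading
import time

CACHE = threading.local()  # stand-in for a module-level thread-local cache
TTL_SECONDS = 45 * 60


def cached_session(factory, region=None):
    """Return a cached session for the region if it is younger than the TTL."""
    key = region or getattr(factory, "region", "global")
    entry = getattr(CACHE, key, {})
    session, created = entry.get("session"), entry.get("time")
    now = time.time()
    if session is not None and created + TTL_SECONDS > now:
        return session  # note: the factory (and its account) is ignored here
    session = factory()
    setattr(CACHE, key, {"session": session, "time": now})
    return session


class FakeFactory:
    """Hypothetical session factory that only records account and region."""

    def __init__(self, account_id, region):
        self.account_id = account_id
        self.region = region

    def __call__(self):
        return f"session-for-{self.account_id}-{self.region}"


first = cached_session(FakeFactory("111111111111", "us-east-1"))
second = cached_session(FakeFactory("222222222222", "us-east-1"))
```

Here second is the session created for account 111111111111, because the lookup key contains only the region. A different region would miss the cache, which matches the observation that regions changed correctly while the account did not.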

By commenting out lines 324 and 325, I forced c7n to create a new session on every call using the account parameter passed in. The modified function looks like this:

    def local_session(factory, region=None):
        """Cache a session thread local for up to 45m"""
        factory_region = getattr(factory, 'region', 'global')
        if region:
            factory_region = region
        s = getattr(CONN_CACHE, factory_region, {}).get('session')
        t = getattr(CONN_CACHE, factory_region, {}).get('time')

        n = time.time()
        # Cache check disabled so that a new session is created on every call.
        # if s is not None and t + (60 * 45) > n:
        #     return s
        s = factory()

        setattr(CONN_CACHE, factory_region, {'session': s, 'time': n})
        return s
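Disabling the cache check means every call builds a new session. If that turns out to be too slow, one possible alternative is to keep a cache but include the account in the key. The sketch below shows that idea in isolation; session_for and _SESSIONS are hypothetical names, it is not part of c7n, and unlike the original it uses a plain dict rather than thread-local storage.

```python
import time

_SESSIONS = {}  # (account_id, region) -> {"session": ..., "time": ...}
TTL_SECONDS = 45 * 60


def session_for(account_id, region, factory, now=None):
    """Return a cached session for this account and region, or create one.

    factory is any zero-argument callable that builds a session.
    now can be passed in to make expiry easy to test.
    """
    now = time.time() if now is None else now
    key = (account_id, region)
    entry = _SESSIONS.get(key)
    if entry is not None and entry["time"] + TTL_SECONDS > now:
        return entry["session"]
    session = factory()
    _SESSIONS[key] = {"session": session, "time": now}
    return session
```

With this keying, repeated calls for the same account and region reuse a session for up to 45 minutes, while a call for a different account always gets its own.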