Service fabric code package value of CpuShares for 100% cpu usage(max)


I have 2 Service Fabric applications (one solution) with 3 services (3 code packages). I want to cap them at 40%, 20%, and 20% of maximum CPU, irrespective of the number of cores (the current machine has 4 logical cores). According to the content/blog above, if I specify CpuShares = 512, CpuShares = 256, and CpuShares = 256, the services should be limited to 40%, 20%, and 20% of CPU. However, that is not what happens: the respective services were only allowed 5%, 2%, and 2% of CPU usage.

From reading this post (https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-resource-governance) I assumed 1024 was the default (maximum) value for CpuShares. After a lot of trial and error I arrived at 10,000 as the maximum: if I apply any value greater than this, Service Fabric Explorer reports "Invalid Arg for CpuShare". So, treating 10,000 as 100% of allowed CPU usage, I changed the values to CpuShares = 4000, CpuShares = 2000, and CpuShares = 2000, and in testing the maximum CPU usage was indeed roughly 40%, 20%, and 20% (within about 5% variance).

The problem is that I could not find "10,000 = 100% CPU usage" in any of the documents. Can someone confirm whether this is correct, and if not, how I can restrict a service or code package to a specific CPU percentage? Please kindly help with this.
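For context, this is roughly how the CpuShares values were declared. The manifest fragment below is a sketch following the schema in the resource-governance docs; the names MyServicePkg and MyCode are placeholders, not my actual package names:

```xml
<!-- ApplicationManifest.xml (fragment); MyServicePkg and MyCode are placeholder names -->
<ServiceManifestImport>
  <ServiceManifestRef ServiceManifestName="MyServicePkg" ServiceManifestVersion="1.0.0" />
  <Policies>
    <!-- Reserve CPU capacity for the whole service package -->
    <ServicePackageResourceGovernancePolicy CpuCores="1" />
    <!-- Split that capacity among code packages via CpuShares -->
    <ResourceGovernancePolicy CodePackageRef="MyCode" CpuShares="4000" MemoryInMB="1024" />
  </Policies>
</ServiceManifestImport>
```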

BEST ANSWER

A common misconception about resource governance for services in SF is that it reserves an isolated slice of capacity for your service, the way Docker containers do. This is not true.

These metrics are just soft limits used to keep the services in the cluster balanced. It does not mean the resources are reserved exclusively for your services; other services can, and will, consume those resources if no limits are set.

For resource limit enforcement to work, all code packages within a service package should have memory limits specified.

The metrics are measures used to find a proper balance of services in the cluster and to place services on nodes with available resources. Assume you have services A, B and C, each with a specific resource requirement. Take Memory, since it is an easy value to reason about: A = 1000 MB, B = 512 MB, C = 512 MB. If a node has 2 GB of memory, all three can be placed on that node. Now assume service C instead needs 1000 MB: when C needs to be activated, it will avoid the node where A and B are activated, because that node does not have the capacity to host it, even if A and B are not actually consuming all of the resources they requested, and another node will be elected.

From the docs:

The Service Fabric runtime currently does not provide reservation for resources. When a process or a container is opened, the runtime sets the resource limits to the loads that were defined at creation time. Furthermore, the runtime rejects the opening of new service packages when the available resources would be exceeded.

Regarding your main question about the shares:

The same docs describe shares as a proportion of the CPU core capacity reserved for a service package, split between all the code packages activated on that node. If you define shares for each code package, their sum represents the total, and each code package gets a proportional slice of it.

How are these shares controlled?

Keep in mind that the following is not officially documented anywhere and I might be wrong in some aspects; I got this information from the source code and may have missed some details.

Assuming a Service Package with code packages A and B is activated, with the following share split:

  • Code Package A (CP.A) = 1500 shares
  • Code Package B (CP.B) = 500 shares

SF will:

  • Identify the CPU core capacity reserved for the service package:
    • The capacity is the number of reserved CPU cores divided by the total available cores
    • On a 4-core CPU: 1 reserved core = 25%
  • Sum the shares of all code packages to find how many shares represent 100% of the reserved (25%) capacity:
    • 1500 + 500 = 2000 total shares
    • CP.A should receive 3 fractions of 4 (1500/2000)
    • CP.B should receive 1 fraction of 4 (500/2000)
  • Convert each shares fraction to CPU Job cycles (see why below):
    • CP.A should receive 3/4 of 10,000 -> 7,500 cycles
    • CP.B should receive 1/4 of 10,000 -> 2,500 cycles
  • Multiply the cycles by the reserved fraction of CPU cores:
    • CP.A should receive 25% of 7,500 cycles -> 1,875 cycles
    • CP.B should receive 25% of 2,500 cycles -> 625 cycles

These limits are enforced with Job Objects. When a process (code package) is activated, it is assigned to a Job where these restrictions are set; whenever the process consumes more cycles than the restriction set on the Job, its threads are pre-empted and another process's thread is scheduled on the core. The code suggests that 10,000 represents all available cores, but the correct interpretation is the number of processor cycles that the threads in a Job Object can use during each scheduling interval. In a Job, 10,000 cycles is the length of each scheduling interval; a thread scheduled in the Job consumes x cycles of that interval, and a code package will only be allowed the full 10,000 cycles if you reserve all 4 cores.

The exact logic is in this bit of code:

    double shares = fractionOfSp * Constants::JobObjectCpuCyclesNumber;
    shares = shares * (numCoresAllocated / max(numAvailableCores_, systemCpuCores_));
    rg.CpuShares = static_cast<uint>(shares);

    //fractionOfSp -> Fraction of a Service Package
    //   - A service package represents 100% (Like a Pizza) of reserved core capacity
    //   - Each Code Package will have one or more fraction (A slice of the pizza)
    //Constants::JobObjectCpuCyclesNumber -> is a constant for 10000 cycles
    //numCoresAllocated -> How many cores you assigned to a service package

Some tricks:

  • The number of reserved cores also affects the result; you have to reserve at least 0.01% of a core for the shares to take any effect.
  • The shares are based on the CPU cores reserved for the Service Package, not on all CPU available on the node. If your node has 4 cores and you reserve 1 for a ServicePackage, you are sharing 25% of the node's capacity among its code packages.
  • If any of the code packages has zero or no shares, all code packages get the same fraction, regardless of the values you specify.
  • On Linux, it uses CpuQuota instead.
  • The maximum number of cycles in a Job is 10,000.

If you need more info, take a look at the source here

PS: I got a bit dizzy calculating all these numbers, I will probably review again later!