Why has time spent servicing softirqs increased on kernel 4.1 compared with kernel 3.4?


We found a performance degradation when uplifting Linux from kernel 3.4 to 4.1. It seems to be caused by the new kernel spending much more time servicing softirqs.

We ran a test using a traffic generator (IXIA) that generates GTP-C packets (over UDP) and a simple application that sends each received packet back, simply swapping the source and destination IP addresses. We got the following results with the mpstat utility:

  • kernel 3.4
# mpstat -P ALL 60 1
Linux 3.4.91-grsec (...)    12/22/17    _x86_64_    (20 CPU)

15:58:43     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
15:58:53     all    0.26    0.00    2.21    0.00    0.00    0.41    0.00    0.00   97.12
15:58:53       0    1.03    0.00    3.79    0.00    0.00   12.91    0.00    0.00   82.27
15:58:53       1    4.18    0.00   41.44    0.00    0.00    0.00    0.00    0.00   54.38
15:58:53       2    0.30    0.00    0.30    0.00    0.00    0.40    0.00    0.00   99.00
...

# mpstat -I ALL -P 0 60 1
Linux 3.4.91-grsec (...)    12/25/17    _x86_64_    (20 CPU)

10:53:16     CPU    intr/s
10:54:16       0      0.00

10:53:16     CPU        0/s        3/s        4/s        8/s        9/s       67/s       68/s       69/s       70/s       71/s       72/s       73/s       74/s       75/s       76/s       77/s       78/s       79/s       80/s       81/s       82/s       83/s       84/s      103/s      104/s      105/s      106/s      107/s      108/s      109/s      110/s      111/s      112/s      113/s      114/s      115/s      116/s      117/s      118/s      119/s      120/s      121/s      122/s      123/s      124/s      125/s      126/s      127/s      128/s      129/s      130/s      131/s      132/s      133/s      134/s      135/s      136/s      137/s      138/s      139/s      140/s      141/s      142/s      143/s      144/s      NMI/s      LOC/s      SPU/s      PMI/s      IWI/s      RTR/s      RES/s      CAL/s      TLB/s      TRM/s      THR/s      MCE/s      MCP/s      ERR/s      MIS/s
10:54:16       0       0.00       5.50       0.00       0.00       0.00       0.00      71.59       0.50       1.60       6.47       1.50      16.66       0.50       1.50       0.00      71.59       0.50       0.60       6.47       1.50      16.66       1.50       1.50     126.91   17445.01       9.93       1.87       1.28       1.25       1.27       4.82       0.83       1.37       1.58       7.45      14.38       3.45   18060.49       2.18       0.97       1.00       0.58       0.83       0.00      87.70   17503.72       7.62       1.22       0.75       0.67       0.98       2.98       0.53       1.87       1.58       1.50       6.62   18028.71       1.53       0.53       0.50       0.50       0.50       0.50       0.00       0.40     174.64       0.00       0.40       0.00       0.00     282.10       0.00       0.92       0.00       0.00       0.00       0.00       0.00       0.00

10:53:16     CPU       HI/s    TIMER/s   NET_TX/s   NET_RX/s    BLOCK/s BLOCK_IOPOLL/s  TASKLET/s    SCHED/s  HRTIMER/s      RCU/s
10:54:16       0       0.00     166.49       0.25   62813.75       0.00           0.00       0.00     298.20       0.87     130.00
...
  • kernel 4.1
# mpstat -P ALL 60 1
Linux 4.1.21 (...)  12/22/17    _x86_64_    (20 CPU)

16:19:12     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
16:19:22     all    0.28    0.00    2.23    0.00    0.00    2.65    0.00    0.00   94.84
16:19:22       0    1.40    0.00    2.59    0.00    0.00   52.79    0.00    0.00   43.21
16:19:22       1    3.66    0.00   42.47    0.00    0.00    0.00    0.00    0.00   53.87
16:19:22       2    0.10    0.00    0.20    0.00    0.00    0.00    0.00    0.00   99.70
...

# mpstat -I ALL -P 0 60 1
Linux 4.1.21 (...)  12/25/17    _x86_64_    (20 CPU)

10:35:45     CPU    intr/s
10:36:45       0      0.00

10:35:45     CPU        0/s        3/s        4/s        8/s        9/s       31/s       32/s       33/s       34/s       35/s       36/s       37/s       38/s       39/s       41/s       42/s       43/s       44/s       45/s       46/s       47/s       48/s       49/s       69/s       70/s       71/s       72/s       73/s       74/s       75/s       76/s       77/s       78/s       79/s       80/s       81/s       82/s       83/s       84/s       85/s       86/s       87/s       88/s       89/s       91/s       92/s       93/s       94/s       95/s       96/s       97/s       98/s       99/s      100/s      101/s      102/s      103/s      104/s      105/s      106/s      107/s      108/s      109/s      110/s      111/s      NMI/s      LOC/s      SPU/s      PMI/s      IWI/s      RTR/s      RES/s      CAL/s      TLB/s      TRM/s      THR/s      MCE/s      MCP/s      ERR/s      MIS/s
10:36:45       0       0.00       5.43       0.00       0.00       0.00       0.00      81.55       0.50       0.60       2.50       0.50      12.63       0.50       2.00       0.00      81.55       0.50       0.60       2.50       0.50      12.63       0.50       1.50     134.87   16949.32       3.25       2.62       3.55       1.32   17720.62       7.03       7.35       3.58       1.63       6.73       3.53       4.48       5.33       1.37       1.42       0.83      10.58       0.52       0.00      90.33   16160.05       3.72       2.07       0.73       0.82       2.53   17669.83       7.23       2.17       1.50       0.50       1.57       0.50       0.50       0.50       0.50       0.50      10.50       0.50       0.00       0.05    1005.90       0.00       0.05       0.00       0.00       4.05       0.00       1.85       0.00       0.00       0.00       0.02       0.00       0.00

10:35:45     CPU       HI/s    TIMER/s   NET_TX/s   NET_RX/s    BLOCK/s BLOCK_IOPOLL/s   TASKLET/s    SCHED/s  HRTIMER/s      RCU/s
10:36:45       0       0.00    1000.05       2.93   56777.58       0.00            0.00       0.57     285.15       0.00    1313.05
...

The test conditions:

  1. Traffic rate: 50000 fps
  2. The application is bound to the CPU 1 using taskset command
  3. All the interrupts (which are related to ethernet interfaces) are bound to CPU 0.
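For reference, the pinning in points 2 and 3 can be done along these lines (a sketch only: the application name "my_app" and the interface name "eth0" are placeholders, not taken from the original setup):

```shell
# Pin the echo application to CPU 1 ("my_app" is a placeholder name).
taskset -pc 1 "$(pidof my_app)"

# Route every eth0 interrupt to CPU 0: write a CPU bitmask
# (bit 0 = CPU 0) into each IRQ's smp_affinity file.
for irq in $(awk -F: '/eth0/ {gsub(/ /, "", $1); print $1}' /proc/interrupts); do
    echo 1 > "/proc/irq/$irq/smp_affinity"
done
```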

As we can see, "%soft" increased significantly on CPU 0: 12.91% -> 52.79%. However, NET_RX/s decreased: 62813.75 -> 56777.58.

We also tried profiling CPU 0 with the perf utility but, unfortunately, found no clue.

  • kernel 3.4
# perf stat -C 0 -e irq:softirq_entry,irq:softirq_exit,irq:softirq_raise -a sleep 60

 Performance counter stats for 'sleep 60':

           3690739 irq:softirq_entry                                            [100.00%]
           3691328 irq:softirq_exit                                             [100.00%]
           4366035 irq:softirq_raise                                           

      60.020019821 seconds time elapsed
# perf stat -C 0 -d -a sleep 60

 Performance counter stats for 'sleep 60':

      59978.442551 task-clock                #    0.999 CPUs utilized           [100.00%]
            138638 context-switches          #    0.002 M/sec                   [100.00%]
              4206 CPU-migrations            #    0.070 K/sec                   [100.00%]
             91404 page-faults               #    0.002 M/sec                  
       49824470562 cycles                    #    0.831 GHz                     [49.34%]
       32279104677 stalled-cycles-frontend   #   64.79% frontend cycles idle    [48.20%]
     &lt;not supported&gt; stalled-cycles-backend  
       41765280058 instructions              #    0.84  insns per cycle        
                                             #    0.77  stalled cycles per insn [62.35%]
        6939461584 branches                  #  115.699 M/sec                   [62.90%]
          69255081 branch-misses             #    1.00% of all branches         [61.90%]
       13778063320 L1-dcache-loads           #  229.717 M/sec                   [63.72%]
         757751494 L1-dcache-load-misses     #    5.50% of all L1-dcache hits   [63.85%]
         153774796 LLC-loads                 #    2.564 M/sec                   [50.09%]
     &lt;not supported&gt; LLC-load-misses         

      60.065977609 seconds time elapsed
  • kernel 4.1
# perf stat -C 0 -e irq:softirq_entry,irq:softirq_exit,irq:softirq_raise -a sleep 60

 Performance counter stats for 'CPU(s) 0':

           3540101      irq:softirq_entry                                             (100.00%)
           3540380      irq:softirq_exit                                              (100.00%)
           4365512      irq:softirq_raise                                           

      60.061923392 seconds time elapsed

# perf stat -C 0 -d -a sleep 60

 Performance counter stats for 'CPU(s) 0':

      60105.618981      task-clock (msec)         #    1.001 CPUs utilized            (100.00%)
            112358      context-switches          #    0.002 M/sec                    (100.00%)
              3042      cpu-migrations            #    0.051 K/sec                    (100.00%)
             23077      page-faults               #    0.384 K/sec                  
       69596616908      cycles                    #    1.158 GHz                      (44.44%)
       47063269010      stalled-cycles-frontend   #   67.62% frontend cycles idle     (44.43%)
     &lt;not supported&gt;      stalled-cycles-backend   
       47882140206      instructions              #    0.69  insns per cycle        
                                                  #    0.98  stalled cycles per insn  (55.54%)
        8644786229      branches                  #  143.827 M/sec                    (55.54%)
          82066359      branch-misses             #    0.95% of all branches          (55.55%)
       15150062571      L1-dcache-loads           #  252.057 M/sec                    (44.45%)
         891694267      L1-dcache-load-misses     #    5.89% of all L1-dcache hits    (22.23%)
         192155955      LLC-loads                 #    3.197 M/sec                    (22.23%)
            148469      LLC-load-misses           #    0.08% of all LL-cache hits     (33.33%)

      60.062744860 seconds time elapsed
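As a back-of-the-envelope check (our own calculation, not from the profiles themselves), dividing the measured cycles by the NET_RX softirq rate shows the per-softirq cost rising by roughly 50%; note the two runs report different effective clock rates (0.831 vs 1.158 GHz), so cycle counts are only roughly comparable:

```shell
# Cycles per NET_RX softirq, from the perf cycle counts (over 60 s)
# and the mpstat NET_RX/s rates quoted above.
awk 'BEGIN {
    printf "3.4: %.0f cycles per NET_RX softirq\n", 49824470562 / 60 / 62813.75
    printf "4.1: %.0f cycles per NET_RX softirq\n", 69596616908 / 60 / 56777.58
}'
```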

Perhaps someone has faced a similar issue. Any advice and suggestions will be greatly appreciated.

There are 2 answers below.

Answer 1:

Possibly the cause is the mitigations for the Spectre and Meltdown vulnerabilities. Try passing "nospectre_v2 nopti nopcid" as boot parameters to your kernel. Be aware that these changes are a potential security hole.
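A sketch of how to apply and verify this (assuming a Debian-style GRUB configuration; the file location and update command vary by distribution):

```shell
# Prepend the parameters to the kernel command line
# (Debian-style GRUB; adjust the path for your distribution).
sed -i 's/^GRUB_CMDLINE_LINUX="/&nospectre_v2 nopti nopcid /' /etc/default/grub
update-grub

# After a reboot, confirm the parameters took effect:
cat /proc/cmdline

# On kernels that expose it, check which mitigations remain active:
grep . /sys/devices/system/cpu/vulnerabilities/* 2>/dev/null
```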

Answer 2:

Had a similar issue when upgrading from 3.16 to 4.9 kernel (i.e. Debian Jessie to Debian Stretch).

Network performance for UDP dropped significantly.

We did not have any flash of genius that helped us find the culprit for this.

However, we did some tuning and came pretty close to the previous performance:

1) Using RSS. Assign CPU cores statically to the NIC's receive queues and stop the irqbalance service. RPS and RFS did not improve performance any further over what RSS did in our case, but they might be useful for you, depending on your hardware.
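A rough sketch of the static RSS affinity setup (the interface name "eth0" is a placeholder, and queue naming in /proc/interrupts varies by driver):

```shell
# Stop irqbalance so the static affinities are not overwritten.
systemctl stop irqbalance
systemctl disable irqbalance

# Inspect how many receive queues the NIC exposes.
ethtool -l eth0

# Spread the RX-queue IRQs across CPUs, one queue per core.
cpu=0
for irq in $(awk -F: '/eth0-.*[Rr]x/ {gsub(/ /, "", $1); print $1}' /proc/interrupts); do
    echo "$cpu" > "/proc/irq/$irq/smp_affinity_list"
    cpu=$((cpu + 1))
done
```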

2) Reduce the interval at which the scheduler becomes active. I simply used the parameters from this article.
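For illustration, scheduler tunables of this kind can be set via sysctl on 4.x-era kernels; the values below are examples only, not the article's exact numbers, and should be tuned for your own workload:

```shell
# Example CFS scheduler tunables (values are illustrative placeholders).
sysctl -w kernel.sched_min_granularity_ns=10000000
sysctl -w kernel.sched_wakeup_granularity_ns=15000000
sysctl -w kernel.sched_migration_cost_ns=5000000
```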

Some further literature: