I'm currently developing a product based on the MT7621A SoC, running kernel 5.0.19-rt10.
The board reboots unexpectedly. To debug this, I enabled some kernel options such as "Detect soft lockups" and "Detect hung tasks". I've captured several stack traces just before the reboots. They differ from one another, but all of them end in the function task_blocks_on_rt_mutex().
Here is an example:
[ 2370.270000] 000: Kernel bug detected[#1]:
[ 2370.270000] 000: CPU: 0 PID: 799 Comm: agsPoll0 Tainted: G O 5.0.19-rt10 #5
[ 2370.270000] 000: $ 0 :
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 00000001
[ 2370.270000] 000: 00000001
[ 2370.270000] 000: 00000001
[ 2370.270000] 000:
[ 2370.270000] 000: $ 4 :
[ 2370.270000] 000: 00000001
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 8fda6e40
[ 2370.270000] 000: 00000000
[ 2370.270000] 000:
[ 2370.270000] 000: $ 8 :
[ 2370.270000] 000: 8fda5be0
[ 2370.270000] 000: 8fda5be0
[ 2370.270000] 000: 00000001
[ 2370.270000] 000: ffffffff
[ 2370.270000] 000:
[ 2370.270000] 000: $12 :
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 00014b40
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 00000000
[ 2370.270000] 000:
[ 2370.270000] 000: $16 :
[ 2370.270000] 000: 8fda5be0
[ 2370.270000] 000: 8fda6e40
[ 2370.270000] 000: 81002a60
[ 2370.270000] 000: 8fc0bd30
[ 2370.270000] 000:
[ 2370.270000] 000: $20 :
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 8fda72ec
[ 2370.270000] 000: 8071e520
[ 2370.270000] 000: 00000000
[ 2370.270000] 000:
[ 2370.270000] 000: $24 :
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 80010580
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.270000] 000: $28 :
[ 2370.270000] 000: 8e722000
[ 2370.270000] 000: 8fc0bca8
[ 2370.270000] 000: 80720000
[ 2370.270000] 000: 8007a2a0
[ 2370.270000] 000:
[ 2370.270000] 000: Hi : 000227df
[ 2370.270000] 000: Lo : 19f38000
[ 2370.270000] 000: epc : 8007a2c0 task_blocks_on_rt_mutex+0x6c/0x300
[ 2370.270000] 000: ra : 8007a2a0 task_blocks_on_rt_mutex+0x4c/0x300
[ 2370.270000] 000: Status: 11000402
[ 2370.270000] 000: KERNEL
[ 2370.270000] 000: EXL
[ 2370.270000] 000:
[ 2370.270000] 000: Cause : 50800034 (ExcCode 0d)
[ 2370.270000] 000: PrId : 0001992f (MIPS 1004Kc)
[ 2370.270000] 000: Modules linked in:
[ 2370.270000] 000: prd_phy_controller(O)
[ 2370.270000] 000: csl_mng_module(O)
[ 2370.270000] 000: csl_packet_module(O)
[ 2370.270000] 000: mac(O)
[ 2370.270000] 000: alb_critical_lock(O)
[ 2370.270000] 000: phyif(O)
[ 2370.270000] 000: ttstools(O)
[ 2370.270000] 000: gpio_bb(O)
[ 2370.270000] 000: loopprotect(O)
[ 2370.270000] 000: sys_mng(O)
[ 2370.270000] 000: alb_workqueues(O)
[ 2370.270000] 000: iptable_filter
[ 2370.270000] 000: [last unloaded: nf_nat]
[ 2370.270000] 000:
[ 2370.270000] 000: Process agsPoll0 (pid: 799, threadinfo=ef6213bf, task=9369b08c, tls=00000000)
[ 2370.270000] 000: Stack :
[ 2370.270000] 000: 8100df20
[ 2370.270000] 000: 80055d30
[ 2370.270000] 000: 8fda608c
[ 2370.270000] 000: 80723c40
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 8fda72ec
[ 2370.270000] 000: 81002a60
[ 2370.270000] 000: 805ed164
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 81002a60
[ 2370.270000] 000: 8fda6e40
[ 2370.270000] 000: 8fc0bd30
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 8fda72ec
[ 2370.270000] 000: 8071e520
[ 2370.270000] 000: 805eb030
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.270000] 000: 00000009
[ 2370.270000] 000: 81002a60
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 805ed1e0
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 81002a60
[ 2370.270000] 000: 8fc0bd30
[ 2370.270000] 000: 00000000
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 8071e520
[ 2370.270000] 000: 8071e6e0
[ 2370.270000] 000: 80720000
[ 2370.270000] 000: 805eb2bc
[ 2370.270000] 000: 00000003
[ 2370.270000] 000: 8f4f0c40
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 81002f20
[ 2370.270000] 000: 8fc0bd30
[ 2370.270000] 000: 8fc3c9e0
[ 2370.270000] 000: 00000009
[ 2370.270000] 000: 8fc0bd3c
[ 2370.270000] 000: 8fc3ce2c
[ 2370.270000] 000: 00000000
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.270000] 000: ...
[ 2370.270000] 000:
[ 2370.270000] 000: Call Trace:
[ 2370.270000] 000: [<8007a2c0>] task_blocks_on_rt_mutex+0x6c/0x300
[ 2370.270000] 000: [<805eb030>] rt_spin_lock_slowlock_locked+0xbc/0x2fc
[ 2370.270000] 000: [<805eb2bc>] rt_spin_lock_slowlock+0x4c/0x70
[ 2370.270000] 000: [<805ed430>] rt_spin_lock+0x68/0x78
[ 2370.270000] 000: [<800149c8>] mips_cm_lock_other+0x13c/0x1e8
[ 2370.270000] 000: [<800104b0>] mips_smp_send_ipi_mask+0x13c/0x20c
[ 2370.270000] 000: [<80055cf8>] check_preempt_curr+0x98/0xb4
[ 2370.270000] 000: [<80055d30>] ttwu_do_wakeup.isra.93+0x1c/0x110
[ 2370.270000] 000: [<80056814>] try_to_wake_up+0x20c/0x46c
[ 2370.270000] 000: [<800806b0>] __handle_irq_event_percpu+0x7c/0x178
[ 2370.270000] 000: [<800807dc>] handle_irq_event_percpu+0x30/0x78
[ 2370.270000] 000: [<8008087c>] handle_irq_event+0x58/0xb0
[ 2370.270000] 000: [<800855e0>] handle_level_irq+0x10c/0x1f0
[ 2370.270000] 000: [<8007fc20>] generic_handle_irq+0x40/0x58
[ 2370.270000] 000: [<8031c980>] gic_handle_shared_int+0xcc/0x19c
[ 2370.270000] 000: [<8007fc20>] generic_handle_irq+0x40/0x58
[ 2370.270000] 000: [<805ed8b0>] do_IRQ+0x18/0x24
[ 2370.270000] 000: [<8031baf8>] plat_irq_dispatch+0x90/0x110
[ 2370.270000] 000: [<80006ca8>] except_vec_vi_end+0xb8/0xc4
[ 2370.270000] 000: [<805ed08c>] _raw_spin_unlock_irq+0x18/0x4c
[ 2370.270000] 000: [<805eae60>] __rt_mutex_slowlock+0x110/0x224
[ 2370.270000] 000: [<805eb378>] rt_mutex_slowlock_locked+0x98/0x284
[ 2370.270000] 000: [<805eb5cc>] rt_mutex_slowlock.constprop.25+0x68/0xc4
[ 2370.270000] 000: [<805eb790>] __rt_mutex_lock_state+0x90/0xbc
[ 2370.270000] 000: [<c074d938>] AGSPollLoopStats+0xa0/0x10c [mac]
[ 2370.270000] 000: [<8004d020>] kthread_worker_fn+0xc8/0x140
[ 2370.270000] 000: [<8004cee0>] kthread+0x118/0x148
[ 2370.270000] 000: [<800067ac>] ret_from_kernel_thread+0x14/0x1c
[ 2370.270000] 000:
[ 2370.270000] 000: Code:
[ 2370.270000] 000: 00000000
[ 2370.270000] 000: 2c420003
[ 2370.270000] 000: 38420001
[ 2370.270000] 000: <00020336>
[ 2370.270000] 000: ae710018
[ 2370.270000] 000: ae72001c
[ 2370.270000] 000: 8e220038
[ 2370.270000] 000: ae620024
[ 2370.270000] 000: 8e220160
[ 2370.270000] 000:
[ 2370.270000] 000:
[ 2370.830000] 000: ---[ end trace 0000000000000002 ]---
[ 2370.840000] 000: Kernel panic - not syncing: Fatal exception in interrupt
[ 2370.840000] 000: Rebooting in 10 seconds..
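To make the dump easier to read, here is a minimal sketch that decodes two values from it: the ExcCode field of the Cause register and the instruction word marked with <...> in the Code line. It assumes the standard MIPS32 encodings (ExcCode in Cause bits 6..2, SPECIAL-opcode trap instructions with a 10-bit code field in bits 15..6). The helper names are made up for illustration, and the note that trap code 12 corresponds to BRK_BUG in the MIPS kernel headers is my understanding rather than something shown in the log.

```python
# Hypothetical helpers for reading the oops above; the names are illustrative only.

# funct values of the MIPS32 SPECIAL-opcode conditional trap instructions
TRAP_FUNCTS = {
    0x30: "tge", 0x31: "tgeu", 0x32: "tlt",
    0x33: "tltu", 0x34: "teq", 0x36: "tne",
}


def cause_exccode(cause: int) -> int:
    """Return the ExcCode field (bits 6..2) of a MIPS32 Cause register value."""
    return (cause >> 2) & 0x1F


def decode_trap(word: int) -> dict:
    """Split a 32-bit instruction word into the fields of a conditional trap."""
    return {
        "opcode": (word >> 26) & 0x3F,  # 0 means SPECIAL
        "rs": (word >> 21) & 0x1F,      # first register operand
        "rt": (word >> 16) & 0x1F,      # second register operand
        "code": (word >> 6) & 0x3FF,    # software-defined trap code
        "mnemonic": TRAP_FUNCTS.get(word & 0x3F, "unknown"),
    }


if __name__ == "__main__":
    # Values copied from the "Cause" and "Code" lines of the dump.
    print(hex(cause_exccode(0x50800034)))
    print(decode_trap(0x00020336))
```

Under those assumptions, the Cause value gives ExcCode 0x0d (Trap exception), and the marked word decodes as tne with rs = $0, rt = $2 and code 12, i.e. a conditional trap that fires when $2 is non-zero. That would be consistent with the "Kernel bug detected" line coming from a BUG_ON()-style check inside task_blocks_on_rt_mutex().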
I also enabled the lock debugging options in the kernel, but with them enabled the system always gets stuck in kernel code that reports deadlocks, and I have not been able to debug that either.
Any suggestions on how to proceed with debugging these reboots?