Reboots on MT7621A SoC with kernel 5.0.19-rt10

I'm currently developing a product based on the MT7621A SoC, running kernel 5.0.19-rt10.

The board reboots unexpectedly. To debug this, I enabled some kernel options, such as "Detect soft lockups" and "Detect hung tasks". I get a stack trace just before each reboot. The traces differ from one another, but all of them end in the function task_blocks_on_rt_mutex().

Here is an example:

[ 2370.270000] 000: Kernel bug detected[#1]:
[ 2370.270000] 000: CPU: 0 PID: 799 Comm: agsPoll0 Tainted: G           O      5.0.19-rt10 #5
[ 2370.270000] 000: $ 0   :
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  00000001
[ 2370.270000] 000:  00000001
[ 2370.270000] 000:  00000001
[ 2370.270000] 000: 
[ 2370.270000] 000: $ 4   :
[ 2370.270000] 000:  00000001
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  8fda6e40
[ 2370.270000] 000:  00000000
[ 2370.270000] 000: 
[ 2370.270000] 000: $ 8   :
[ 2370.270000] 000:  8fda5be0
[ 2370.270000] 000:  8fda5be0
[ 2370.270000] 000:  00000001
[ 2370.270000] 000:  ffffffff
[ 2370.270000] 000: 
[ 2370.270000] 000: $12   :
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  00014b40
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  00000000
[ 2370.270000] 000: 
[ 2370.270000] 000: $16   :
[ 2370.270000] 000:  8fda5be0
[ 2370.270000] 000:  8fda6e40
[ 2370.270000] 000:  81002a60
[ 2370.270000] 000:  8fc0bd30
[ 2370.270000] 000: 
[ 2370.270000] 000: $20   :
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  8fda72ec
[ 2370.270000] 000:  8071e520
[ 2370.270000] 000:  00000000
[ 2370.270000] 000: 
[ 2370.270000] 000: $24   :
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  80010580
[ 2370.270000] 000:          
[ 2370.270000] 000:          
[ 2370.270000] 000: 
[ 2370.270000] 000: $28   :
[ 2370.270000] 000:  8e722000
[ 2370.270000] 000:  8fc0bca8
[ 2370.270000] 000:  80720000
[ 2370.270000] 000:  8007a2a0
[ 2370.270000] 000: 
[ 2370.270000] 000: Hi    : 000227df
[ 2370.270000] 000: Lo    : 19f38000
[ 2370.270000] 000: epc   : 8007a2c0 task_blocks_on_rt_mutex+0x6c/0x300
[ 2370.270000] 000: ra    : 8007a2a0 task_blocks_on_rt_mutex+0x4c/0x300
[ 2370.270000] 000: Status: 11000402
[ 2370.270000] 000: KERNEL 
[ 2370.270000] 000: EXL 
[ 2370.270000] 000: 
[ 2370.270000] 000: Cause : 50800034 (ExcCode 0d)
[ 2370.270000] 000: PrId  : 0001992f (MIPS 1004Kc)
[ 2370.270000] 000: Modules linked in:
[ 2370.270000] 000:  prd_phy_controller(O)
[ 2370.270000] 000:  csl_mng_module(O)
[ 2370.270000] 000:  csl_packet_module(O)
[ 2370.270000] 000:  mac(O)
[ 2370.270000] 000:  alb_critical_lock(O)
[ 2370.270000] 000:  phyif(O)
[ 2370.270000] 000:  ttstools(O)
[ 2370.270000] 000:  gpio_bb(O)
[ 2370.270000] 000:  loopprotect(O)

[ 2370.270000] 000:  sys_mng(O)
[ 2370.270000] 000:  alb_workqueues(O)
[ 2370.270000] 000:  iptable_filter
[ 2370.270000] 000:  [last unloaded: nf_nat]
[ 2370.270000] 000: 
[ 2370.270000] 000: Process agsPoll0 (pid: 799, threadinfo=ef6213bf, task=9369b08c, tls=00000000)
[ 2370.270000] 000: Stack :
[ 2370.270000] 000:  8100df20
[ 2370.270000] 000:  80055d30
[ 2370.270000] 000:  8fda608c
[ 2370.270000] 000:  80723c40
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  8fda72ec
[ 2370.270000] 000:  81002a60
[ 2370.270000] 000:  805ed164
[ 2370.270000] 000: 
[ 2370.270000] 000:        
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  81002a60
[ 2370.270000] 000:  8fda6e40
[ 2370.270000] 000:  8fc0bd30
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  8fda72ec
[ 2370.270000] 000:  8071e520
[ 2370.270000] 000:  805eb030
[ 2370.270000] 000: 
[ 2370.270000] 000:        
[ 2370.270000] 000:  00000009
[ 2370.270000] 000:  81002a60
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  805ed1e0
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  81002a60
[ 2370.270000] 000:  8fc0bd30
[ 2370.270000] 000:  00000000
[ 2370.270000] 000: 
[ 2370.270000] 000:        
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  8071e520
[ 2370.270000] 000:  8071e6e0
[ 2370.270000] 000:  80720000
[ 2370.270000] 000:  805eb2bc
[ 2370.270000] 000:  00000003
[ 2370.270000] 000:  8f4f0c40
[ 2370.270000] 000: 
[ 2370.270000] 000:        
[ 2370.270000] 000:  00000000
[ 2370.270000] 000:  81002f20
[ 2370.270000] 000:  8fc0bd30
[ 2370.270000] 000:  8fc3c9e0
[ 2370.270000] 000:  00000009
[ 2370.270000] 000:  8fc0bd3c
[ 2370.270000] 000:  8fc3ce2c
[ 2370.270000] 000:  00000000
[ 2370.270000] 000: 
[ 2370.270000] 000:        
[ 2370.270000] 000:  ...
[ 2370.270000] 000: 
[ 2370.270000] 000: Call Trace:
[ 2370.270000] 000: [<8007a2c0>] task_blocks_on_rt_mutex+0x6c/0x300
[ 2370.270000] 000: [<805eb030>] rt_spin_lock_slowlock_locked+0xbc/0x2fc
[ 2370.270000] 000: [<805eb2bc>] rt_spin_lock_slowlock+0x4c/0x70
[ 2370.270000] 000: [<805ed430>] rt_spin_lock+0x68/0x78
[ 2370.270000] 000: [<800149c8>] mips_cm_lock_other+0x13c/0x1e8
[ 2370.270000] 000: [<800104b0>] mips_smp_send_ipi_mask+0x13c/0x20c
[ 2370.270000] 000: [<80055cf8>] check_preempt_curr+0x98/0xb4
[ 2370.270000] 000: [<80055d30>] ttwu_do_wakeup.isra.93+0x1c/0x110
[ 2370.270000] 000: [<80056814>] try_to_wake_up+0x20c/0x46c
[ 2370.270000] 000: [<800806b0>] __handle_irq_event_percpu+0x7c/0x178
[ 2370.270000] 000: [<800807dc>] handle_irq_event_percpu+0x30/0x78
[ 2370.270000] 000: [<8008087c>] handle_irq_event+0x58/0xb0
[ 2370.270000] 000: [<800855e0>] handle_level_irq+0x10c/0x1f0
[ 2370.270000] 000: [<8007fc20>] generic_handle_irq+0x40/0x58
[ 2370.270000] 000: [<8031c980>] gic_handle_shared_int+0xcc/0x19c
[ 2370.270000] 000: [<8007fc20>] generic_handle_irq+0x40/0x58
[ 2370.270000] 000: [<805ed8b0>] do_IRQ+0x18/0x24
[ 2370.270000] 000: [<8031baf8>] plat_irq_dispatch+0x90/0x110
[ 2370.270000] 000: [<80006ca8>] except_vec_vi_end+0xb8/0xc4
[ 2370.270000] 000: [<805ed08c>] _raw_spin_unlock_irq+0x18/0x4c
[ 2370.270000] 000: [<805eae60>] __rt_mutex_slowlock+0x110/0x224
[ 2370.270000] 000: [<805eb378>] rt_mutex_slowlock_locked+0x98/0x284
[ 2370.270000] 000: [<805eb5cc>] rt_mutex_slowlock.constprop.25+0x68/0xc4
[ 2370.270000] 000: [<805eb790>] __rt_mutex_lock_state+0x90/0xbc
[ 2370.270000] 000: [<c074d938>] AGSPollLoopStats+0xa0/0x10c [mac]
[ 2370.270000] 000: [<8004d020>] kthread_worker_fn+0xc8/0x140
[ 2370.270000] 000: [<8004cee0>] kthread+0x118/0x148
[ 2370.270000] 000: [<800067ac>] ret_from_kernel_thread+0x14/0x1c
[ 2370.270000] 000: 
[ 2370.270000] 000: Code:
[ 2370.270000] 000:  00000000 
[ 2370.270000] 000:  2c420003 
[ 2370.270000] 000:  38420001 
[ 2370.270000] 000: <00020336>
[ 2370.270000] 000:  ae710018 
[ 2370.270000] 000:  ae72001c 
[ 2370.270000] 000:  8e220038 
[ 2370.270000] 000:  ae620024 
[ 2370.270000] 000:  8e220160 
[ 2370.270000] 000: 
[ 2370.270000] 000: 
[ 2370.830000] 000: ---[ end trace 0000000000000002 ]---
[ 2370.840000] 000: Kernel panic - not syncing: Fatal exception in interrupt
[ 2370.840000] 000: Rebooting in 10 seconds..
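
To make the dump easier to read, here is a minimal sketch that decodes two values from it: the Cause register and the instruction word marked with <...> in the Code section. It assumes the standard MIPS32 layout (ExcCode in bits 6..2 of Cause, and the conditional-trap encoding with a 10-bit code field). The helper names decode_cause and decode_trap_fields are made up for this example.

```python
def decode_cause(cause: int) -> int:
    """Return the ExcCode field (bits 6..2) of a MIPS32 CP0 Cause value."""
    return (cause >> 2) & 0x1F


def decode_trap_fields(word: int) -> dict:
    """Split a MIPS32 instruction word into the fields used by conditional traps."""
    return {
        "opcode": (word >> 26) & 0x3F,  # 0 means the SPECIAL opcode group
        "rs": (word >> 21) & 0x1F,      # first source register number
        "rt": (word >> 16) & 0x1F,      # second source register number
        "code": (word >> 6) & 0x3FF,    # 10-bit software-defined trap code
        "funct": word & 0x3F,           # 0x36 selects TNE (trap if not equal)
    }


# Values copied from the "Cause" and "Code" lines of the trace above.
exc_code = decode_cause(0x50800034)
fields = decode_trap_fields(0x00020336)

print(f"ExcCode = {exc_code:#04x}")
print(fields)
```

This gives ExcCode 0x0d, which is the Trap exception, and the faulting word decodes as TNE with rs=0, rt=2 and code 12. If I read the kernel's MIPS break codes correctly, 12 is BRK_BUG, which would be consistent with the "Kernel bug detected" line and with a BUG_ON() check firing at task_blocks_on_rt_mutex+0x6c. I have not confirmed this against my kernel build.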

I also enabled the lock debugging options in the kernel, but with them the system always gets stuck in kernel code that reports deadlocks, and I have not been able to debug that either.

Any suggestions on how to proceed with debugging these reboots?
