I would like to use Intel TSX to write lock-free code.
xbegin
my_inst1
my_inst2
xend
However, for some reason, one of the instructions inside the TSX region causes the transaction to abort.
I would like to know which instruction generates the fault and makes the transaction abort.
Is there any way to find out which instruction generated the fault?
My first attempt was to increment a global counter after each instruction in the TSX region. However, when the fault happens, the counter updates are rolled back as well, because an abort rolls back every write made inside the TSX region.
Is there any trick to debug TSX execution?
Use perf record (or another way of accessing HW perf counters) with an event like rtm_retired.aborted to catch any abort, and/or tx_mem.abort_conflict or tx_mem.abort_capacity to see whether either of those is the cause of the aborts. (You can record multiple events in one run, then see which ones fired in perf report.)
Also, tx_exec.misc1..3 might be relevant. (These event names are from perf list on my Skylake desktop.)
See also https://oprofile.sourceforge.io/docs/intel-skylake-events.php
You might need to tweak things to get a reasonable number of samples for an event that doesn't fire very often. I haven't tried this, but hopefully the counts should show up on the guilty instruction itself.
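As a rough sketch of the workflow described above: the script below records the three abort events in one run and then prints a report. The program name ./my_tsx_program is hypothetical, and using -c 1 (a sample on every event occurrence) is only one assumed way to get enough samples for a rarely firing event. Whether these events exist depends on your CPU, so check perf list first.

```shell
#!/bin/sh
# Hypothetical binary containing the xbegin/xend region; replace with your own.
PROG=./my_tsx_program
# Event names as mentioned in the answer; availability depends on the CPU.
EVENTS=rtm_retired.aborted,tx_mem.abort_conflict,tx_mem.abort_capacity

# Print the perf record command line used below.
# -c 1 sets the sample period to 1, i.e. one sample per event occurrence.
record_cmd() {
    echo "perf record -e $EVENTS -c 1 -- $PROG"
}

if command -v perf >/dev/null 2>&1 && [ -x "$PROG" ]; then
    # Record all three events in a single run.
    perf record -e "$EVENTS" -c 1 -- "$PROG"
    # Show which events fired and where the samples landed.
    perf report --stdio
else
    echo "would run: $(record_cmd)"
fi
```

If the samples do land on the guilty instruction, perf annotate on the hot symbol should show per-instruction counts.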
rtm_retired.aborted is a precise event; the others don't say so in perf list output.
Some of the RTM/TSX events are only for HLE (Hardware Lock Elision, where you put an extra prefix on a locked instruction).
Use perf list and search for "abort" in the output to find relevant events.
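For example, the search could be done with a small filter like the one below. The function name find_abort_events is made up for illustration, and the output depends entirely on which events your CPU and perf version expose.

```shell
#!/bin/sh
# Keep only lines that mention "abort" (case-insensitive).
find_abort_events() {
    grep -i 'abort'
}

if command -v perf >/dev/null 2>&1; then
    # perf list prints the events known for this machine; filter them.
    perf list 2>/dev/null | find_abort_events || true
fi
```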