Intel Pin multithreading instrumentation: how can I instrument only accesses to variables shared between threads?


I'm using Intel Pin to dynamically instrument multi-threaded programs for data race detection. I instrument memory read/write instructions to collect memory traces at runtime and then analyze the log. The trace collection is simple: it stores each memory access (time, thread id, address, etc.) in a buffer at runtime and writes the buffer out at the end.

// ip and addr are ADDRINT so they are not truncated on 64-bit targets.
VOID PIN_FAST_ANALYSIS_CALL RecordMemRead(ADDRINT ip, ADDRINT addr, THREADID tid){
    // One global lock serializes all threads while they append to the buffer.
    PIN_GetLock(&lock, tid+1);

    membuf[instCounter].tid = tid;
    membuf[instCounter].ip = ip;
    membuf[instCounter].addr = addr;
    membuf[instCounter].op = 'R';
    instCounter++;

    PIN_ReleaseLock(&lock);
}

VOID PIN_FAST_ANALYSIS_CALL RecordMemWrite(ADDRINT ip, ADDRINT addr, THREADID tid){
    // similar to RecordMemRead()
}

VOID Instruction(INS ins, VOID *v){
    if(INS_IsBranchOrCall(ins)) 
        return;
    if(INS_IsStackRead(ins))
        return;
    if(INS_IsStackWrite(ins))
        return;  

    if (INS_IsMemoryRead(ins)){
        INS_InsertPredicatedCall(ins, IPOINT_BEFORE, (AFUNPTR)RecordMemRead,
          IARG_FAST_ANALYSIS_CALL, IARG_INST_PTR, IARG_MEMORYREAD_EA,
          IARG_THREAD_ID, IARG_END);
    }

    // Not "else if": an instruction can both read and write memory,
    // and the write must be logged as well.
    if (INS_IsMemoryWrite(ins)){
        INS_InsertPredicatedCall(ins, IPOINT_BEFORE, (AFUNPTR)RecordMemWrite,
          IARG_FAST_ANALYSIS_CALL, IARG_INST_PTR, IARG_MEMORYWRITE_EA,
          IARG_THREAD_ID, IARG_END);
    }
}

My problem is the severe runtime overhead (200x to 500x). According to other work, trace collection should introduce less than 100x overhead. I tried to optimize it by skipping stack accesses, but that doesn't help much. Because I instrument at instruction granularity, a very large number of accesses are logged. So I think the only way to reduce the runtime overhead is to reduce the number of accesses collected, i.e. to record only accesses to variables shared between threads (the ones relevant to races).

Is there some way in Pin to figure out which accesses are to shared variables? Or are there other ways to reduce the runtime overhead?
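
To make the "shared accesses only" idea concrete, here is a minimal sketch of the kind of filter I have in mind. It is plain C++ with no Pin calls, and SharedAccessFilter and IsShared are hypothetical names of my own, not part of the Pin API. It assumes an address counts as shared once two different thread ids have touched it.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical helper for illustration only; none of these names come from Pin.
class SharedAccessFilter {
public:
    // Returns true if 'addr' has been touched by at least two distinct
    // threads, counting this access. Accesses by the first thread return
    // false until a second thread touches the same address.
    bool IsShared(std::uintptr_t addr, std::uint32_t tid) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = state_.find(addr);
        if (it == state_.end()) {
            // First time this address is seen: remember who touched it.
            state_.emplace(addr, State{tid, false});
            return false;
        }
        State &s = it->second;
        if (!s.shared && s.firstTid != tid) {
            s.shared = true;  // a second thread has arrived
        }
        return s.shared;
    }

    // Number of distinct addresses seen so far.
    std::size_t TrackedAddresses() {
        std::lock_guard<std::mutex> guard(mutex_);
        return state_.size();
    }

private:
    struct State {
        std::uint32_t firstTid;  // first thread that touched the address
        bool shared;             // set once a different thread touches it
    };
    std::mutex mutex_;
    std::unordered_map<std::uintptr_t, State> state_;
};
```

The analysis routine would call IsShared(addr, tid) and skip the log entry when it returns false. Two caveats with this sketch: accesses made before an address becomes shared are dropped, which could hide one half of a racy pair, and the map lookup has its own cost, so I have not established that it is a net win.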
