Why does Linux CFS not give the free CPU to another runnable process in the run queue on a Core2Duo?


I am working on a Core2Duo 2.20 GHz system running Ubuntu 12.04 with a 3.18.26 kernel.

I made some changes to the Linux kernel source code.

To record every process involved in a context switch (both the one that gets scheduled and the one that gets descheduled), I added the following print statement inside the context_switch function in kernel/sched/core.c.

trace_printk(KERN_INFO
    "**$$,context_switch,%d,%llu,%llu,%d,%llu,%llu\n",
    (int)(prev->pid),
    prev->se.vruntime,
    prev->se.sum_exec_runtime,
    (int)(next->pid),
    next->se.vruntime,
    next->se.sum_exec_runtime);

I am running two processes, P1 (with 100 threads: T0, T1, ..., T99) and P2, on the same CPU core. P2 has already been running for a long time, so its vruntime is high.

Inside P1, all 100 threads are created first; every thread except T0 is blocked (waiting on a semaphore).

  1. T0 performs some task, then sets a timer for 2000 ns and voluntarily releases the CPU. As no other thread is runnable, P2 gets scheduled.
  2. After 2000 ns the timer expires and wakes the next thread, T1, which preempts P2 immediately.
  3. T1 performs some task, then sets a timer for 2000 ns and voluntarily releases the CPU. As no other thread is runnable, P2 gets scheduled.
  4. After 2000 ns the timer expires and wakes the next thread, T2, which preempts P2 immediately.

This repeats, and threads T0, T1, ..., T99 execute in round-robin fashion.

So the execution sequence looks like this:

T0-P2-T1-P2-T2-P2-T3-......T99-P2-T0-P2.....

My experimental results show:

With a timer interval of 1800 ns, P2 gets 1450 ns on average.

With a timer interval of 2000 ns, P2 gets 1600 ns on average.

With a timer interval of 2500 ns, P2 gets 2050 ns on average.

With a timer interval of 3000 ns, P2 gets 2600 ns on average.

So I conclude that on my Core2Duo system the context switch time is around 350-450 ns. Am I right to say that?

Another observation: when I set the timer interval to 1600 ns or 1700 ns, P2 does not get scheduled between two threads even though the CPU is free. That means the CPU sits idle for around 1200-1300 ns although P2 is in the run queue, ready to run. Why is this happening?

Here is a snippet of my code:

// Program - P2
#define _GNU_SOURCE
#include <sched.h>

int main(int argc, char *argv[])
{
    // pin this process to CPU 1
    cpu_set_t my_set;
    CPU_ZERO(&my_set);
    CPU_SET(1, &my_set);
    sched_setaffinity(0, sizeof(cpu_set_t), &my_set);

    while (1) {
        // does some task
    }
}



// Program - P1
// timer handler awakening next thread
static void handler(int sig, siginfo_t *si, void *uc)
{
    thread_no++;
    ret = sem_post(&sem[(thread_no) % NUM_THREADS]);
    if (ret)
    {
        printf("Error in Sem Post\n");
    }
}

void *threadA(void *data_)
{
    int turn = (intptr_t)data_;

    // pin this thread to CPU 1 (same core as P2)
    cpu_set_t my_set;
    CPU_ZERO(&my_set);
    CPU_SET(1, &my_set);
    sched_setaffinity(0, sizeof(cpu_set_t), &my_set);

    while (1)
    {
        // block until the timer handler posts this thread's semaphore
        ret = sem_wait(&sem[turn]);
        if (ret)
        {
            printf("Error in Sem Wait\n");
        }

        // does some work here

        // arm a one-shot timer; its expiry wakes the next thread
        its.it_value.tv_sec = 0;
        its.it_value.tv_nsec = DELAY1;
        its.it_interval.tv_sec = 0;
        its.it_interval.tv_nsec = 0;

        ret = timer_settime(timerid, 0, &its, NULL);
        if (ret < 0)
            perror("timer_settime");
    }
}

int main(int argc, char *argv[])
{
    // SA_SIGINFO is required because the handler is installed via sa_sigaction
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sa.sa_sigaction = handler;
    sigemptyset(&sa.sa_mask);
    err = sigaction(SIG, &sa, NULL);
    if (0 != err) {
        printf("sigaction failed\n");
    }

    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIG;
    sev.sigev_value.sival_ptr = &timerid;
    ret = timer_create(CLOCKID, &sev, &timerid);
    if (ret < 0)
        perror("timer_create");

    // only T0's semaphore starts available; all other threads block
    sem_init(&sem[0], 0, 1);
    for (i = 1; i < NUM_THREADS; ++i)
    {
        sem_init(&sem[i], 0, 0);
    }
    data = 0;
    while (data < NUM_THREADS)
    {
        //create our threads
        err = pthread_create(&tid[data], NULL, threadA, (void *)(intptr_t)data);
        if (err != 0)
            printf("\n can't create thread :[%s]", strerror(err));
        data++;
    }

    // keep the process alive; returning from main would end all threads
    for (i = 0; i < NUM_THREADS; ++i)
        pthread_join(tid[i], NULL);
}

The kernel trace shows that the CPU is free and that there is enough time for a context switch from thread Ti to P2, yet P2 is not scheduled; later, a context switch happens directly between Ti and T(i+1). Why does Linux CFS not select the next process for scheduling in this case, when the timer duration is 1700 ns or less?
