Exchange data between threads on Intel Xeon Phi - BSP Model


I want to implement the Bulk Synchronous Parallel (BSP) model on an Intel Xeon Phi using only std::thread (no other libraries). I have organized the code into two classes: SuperStep and Worker. The SuperStep class contains a vector of Workers.

superstep.hpp

#include <functional>
#include <memory>
#include <vector>

#include "worker.hpp"

template <typename T>
class SuperStep {
    private:
        int nw;                                              // parallelism degree (number of workers)
        std::vector<T> input;
        std::vector<std::vector<T>> chunks;
        std::vector<std::unique_ptr<Worker<T>>> workers;     // vector of pointers
        std::vector<std::vector<T>> output;

    public:
        SuperStep(int n, std::vector<T> input);
        ~SuperStep();
        std::vector<T> get_input();
        int get_parallel_degree();
        template<typename F, typename ...Args>
        void computation(std::function<F(Args...)> b);
        void communication();
};

The Worker class is a thread wrapper:

worker.hpp

#include <functional>
#include <thread>
#include <vector>

// Forward declaration: Worker only stores a pointer to its SuperStep.
template <typename T>
class SuperStep;

template <typename T>
class Worker {
    private:
        int id;
        SuperStep<T> *ss;
        std::thread thread;
        std::vector<T> input;
        std::vector<T> output;

    public:
        Worker(int id, SuperStep<T> *s);
        Worker(const Worker&) = delete;
        Worker(Worker &&other);
        Worker& operator=(const Worker&) = delete;
        Worker& operator=(Worker&&) = delete;
        ~Worker();
        int get_id();
        std::vector<T> get_output();
        void set_input(std::vector<T>, int worker_index);
        template<typename F, typename ...Args>
        void work(std::function<F(Args...)> body);
};

When I run the computation phase of a SuperStep, each worker calls its work() function, which computes the worker's output from its input using the function "body" passed in as a std::function.

//Superstep Computation
template<typename T>
template<typename F,typename ...Args>
void SuperStep<T>::computation(std::function<F(Args...)> body)
{
    for (auto &w : workers)
        w->work(body);
}

//////////////////////////////////////////////////////////////////

//Worker work function
template<typename T>
template<typename F, typename ...Args>
void Worker<T>::work(std::function<F(Args...)> body)
{
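    // Launch the computation on this worker's own thread.
    // The thread must be joined before `output` is read by anyone else.
    thread = std::thread{[this, body]()
        {
            output = body(input);
        }
    };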
}

I can't understand how to implement the communication phase on a MIC processor like the Xeon Phi, that is, how to exchange the output of each worker with the other workers. Does communication consist of sending data to the processor (I don't know how), or of saving the outputs in a duplicated lockable vector (which seems more like a "shared memory" approach)?
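To make the second option concrete, here is a rough sketch of what a "lockable vector" exchange could look like with nothing but std::thread and std::mutex. It assumes all workers run in one process and therefore share memory. The names Mailbox, send, take and run_superstep are hypothetical and not part of the classes above, and the routing rule (element k goes to worker k % nw) is only a placeholder for whatever the real algorithm needs. Joining the threads plays the role of the BSP barrier.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: one mutex-protected inbox per worker.
template <typename T>
class Mailbox {
public:
    explicit Mailbox(std::size_t nw) : inboxes(nw), locks(nw) {}

    // May be called concurrently by any worker thread.
    void send(std::size_t dest, const T& value) {
        std::lock_guard<std::mutex> guard(locks[dest]);
        inboxes[dest].push_back(value);
    }

    // Empties and returns the inbox of worker `dest`.
    std::vector<T> take(std::size_t dest) {
        std::lock_guard<std::mutex> guard(locks[dest]);
        std::vector<T> out;
        out.swap(inboxes[dest]);
        return out;
    }

private:
    std::vector<std::vector<T>> inboxes;
    std::vector<std::mutex> locks;
};

// Hypothetical driver for one superstep: compute, communicate, barrier.
// Returns, for each worker, the data it received (its next input).
template <typename T>
std::vector<std::vector<T>> run_superstep(
    const std::vector<std::vector<T>>& inputs,
    const std::function<std::vector<T>(const std::vector<T>&)>& body)
{
    const std::size_t nw = inputs.size();
    Mailbox<T> mailbox(nw);
    std::vector<std::thread> threads;
    threads.reserve(nw);

    for (std::size_t id = 0; id < nw; ++id) {
        threads.emplace_back([&, id]() {
            // Computation phase: purely local work.
            std::vector<T> output = body(inputs[id]);
            // Communication phase: placeholder routing, element k -> worker k % nw.
            for (std::size_t k = 0; k < output.size(); ++k)
                mailbox.send(k % nw, output[k]);
        });
    }

    // Barrier: no inbox is read until every worker has finished sending.
    for (auto& t : threads)
        t.join();

    std::vector<std::vector<T>> next(nw);
    for (std::size_t id = 0; id < nw; ++id)
        next[id] = mailbox.take(id);
    return next;
}
```

In terms of the classes above, send() would be called from inside Worker::work(), and the vectors returned by take() would be handed to set_input() in SuperStep::communication() before the next superstep starts. The order of elements inside an inbox depends on thread scheduling, so code that needs a fixed order would have to tag or sort the messages.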

Thanks in advance!
