I'm new to the forum and hope you can help me with my question. Recently I developed an application that uses CUDA streams to overlap computation with data transfers. I ran it on an NVIDIA GPU (Maxwell architecture) and observed in the Visual Profiler that some HostToDevice transfers appear to occur at the same time. As far as I know, Maxwell GPUs have only 2 copy engines: one for HostToDevice transfers and the other for DeviceToHost transfers, right? With this in mind, I would expect that two HostToDevice transfers can't occur at the same time, yet the Visual Profiler shows this behaviour in my application. So my question is: on this architecture, is it possible for two HostToDevice (or two DeviceToHost) transfers to occur at the same time?
Thank you so much.
No, it's not possible.
Two transfers cannot occur at the same time in the same direction. This is arguably a property of PCI Express rather than anything to do with CUDA: while a PCI Express transaction is in progress in a given direction, no other transaction can take place in that direction. Either you are misinterpreting the output of the Visual Profiler, or the Visual Profiler has some sort of bug.
By hovering your mouse over a specific transfer in the Visual Profiler, you can get additional details about it in the window on the right-hand side of the display. This information should include the start and finish time of each transfer (as well as its size in bytes, etc.). I would start there, to see whether the Visual Profiler thinks the transfers are in the same direction and have the same start time.
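If there are many transfers to compare, the check can be done mechanically once the start and finish times have been copied out of the profiler. The following is only a rough sketch in Python: the record layout, the field names (`name`, `kind`, `start`, `end`) and the timing values are made up for illustration and are not a Visual Profiler export format.

```python
from itertools import combinations


def same_direction_overlaps(transfers):
    """Return pairs of transfer names that share a direction and overlap in time."""
    overlaps = []
    for a, b in combinations(transfers, 2):
        if a["kind"] != b["kind"]:
            continue  # opposite directions may legitimately overlap
        # Two intervals overlap when each one starts before the other ends.
        if a["start"] < b["end"] and b["start"] < a["end"]:
            overlaps.append((a["name"], b["name"]))
    return overlaps


# Hypothetical values, typed in by hand from the profiler's details pane (ms).
transfers = [
    {"name": "copy0", "kind": "HtoD", "start": 0.000, "end": 1.200},
    {"name": "copy1", "kind": "HtoD", "start": 1.205, "end": 2.400},
    {"name": "copy2", "kind": "DtoH", "start": 0.900, "end": 1.800},
]

print(same_direction_overlaps(transfers))
```

An empty result would mean the same-direction transfers only look simultaneous at the timeline's zoom level; a non-empty result would point to the exact pairs worth inspecting more closely.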