My setup:
- CUDA: 10.2
- Device: RTX 2080
- OS: Ubuntu 16.04

When I try to use nvprof, I find that it doesn't support devices with compute capability 7.2 and higher. It recommends using Nsight Compute or Nsight Systems instead, but I cannot launch either tool because the server has no graphical interface. How can I use Nsight Compute on a remote server? Also, is it possible to profile metrics in Nsight Compute?
How to profile a CUDA application on a device with compute capability 7.x? Is the metric "dram_read_throughput" valid in Nsight Compute?
Asked by fishmingee (3.5k views)
There is 1 best solution below.
For compute capability 7.5 and higher, the recommended tools are Nsight Compute and Nsight Systems. Each tool has its own documentation. There is an introductory blog post describing these "new" CUDA profiler tools, as well as tutorial blog posts on Nsight Systems and on Nsight Compute. The introductory blog post explains why there are two tools and how they relate to each other.
The metric dram_read_throughput is not valid in Nsight Compute. Its naming format indicates that it is an nvprof metric, and nvprof metric names generally cannot be used directly in Nsight Compute. To find out whether there is an "equivalent" Nsight Compute metric for a given nvprof metric, use the nvprof transition guide, in particular the metric comparison table. Studying that table shows that the Nsight Compute metric dram__bytes_read.sum.per_second is the equivalent of dram_read_throughput.
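If you translate metric names often, a small lookup can save trips to the comparison table. The following is only a sketch: the first entry is the mapping given above, the second is my assumption that the write-side metric follows the same naming pattern, and the names `NVPROF_TO_NSIGHT_COMPUTE` and `translate_metric` are made up for illustration. Check any entry against the transition guide before relying on it.

```python
# Partial nvprof -> Nsight Compute metric-name lookup (illustrative only).
# The first entry is the mapping stated in the answer above. The second is an
# assumption based on the same naming pattern; verify it in the transition guide.
NVPROF_TO_NSIGHT_COMPUTE = {
    "dram_read_throughput": "dram__bytes_read.sum.per_second",
    "dram_write_throughput": "dram__bytes_write.sum.per_second",
}


def translate_metric(nvprof_name):
    """Return the Nsight Compute name for an nvprof metric, or None if unknown."""
    return NVPROF_TO_NSIGHT_COMPUTE.get(nvprof_name)


if __name__ == "__main__":
    print(translate_metric("dram_read_throughput"))
```

Returning None for unknown names, rather than guessing, makes it obvious when the table still has to be consulted.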
For instructions on how to capture this metric in Nsight Compute, refer to the tutorial blog post already mentioned or to the documentation.

If you have the CUDA toolkit installed on the remote server, you should be able to run Nsight Compute in CLI (command-line interface) mode. That is described in both the documentation and the blog post mentioned above. Alternatively, you may be able to run the GUI in remote mode.
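As a rough sketch of the CLI route on a headless server, the script below builds a command line that asks Nsight Compute to collect one metric. It assumes the CLI executable is on PATH under the name `ncu` (newer toolkits) or `nv-nsight-cu-cli` (older ones, such as CUDA 10.2), and that the application is a hypothetical binary called `./my_app`. The helpers `find_cli` and `build_command` are illustrative names, not part of any NVIDIA tooling.

```python
import shutil
import subprocess

# Executable names to try, in order. Which one exists depends on the installed
# toolkit version, so this ordering is an assumption.
CLI_CANDIDATES = ("ncu", "nv-nsight-cu-cli")


def find_cli():
    """Return the path of the first Nsight Compute CLI found on PATH, else None."""
    for name in CLI_CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_command(cli, app, metrics, app_args=()):
    """Build an argv list that collects the given metrics while running app."""
    # --metrics takes a comma-separated list of Nsight Compute metric names.
    return [cli, "--metrics", ",".join(metrics), app, *app_args]


if __name__ == "__main__":
    cli = find_cli()
    if cli is None:
        print("Nsight Compute CLI not found on PATH")
    else:
        # "./my_app" is a placeholder for your own CUDA executable.
        cmd = build_command(cli, "./my_app", ["dram__bytes_read.sum.per_second"])
        subprocess.run(cmd, check=True)
```

The same invocation can of course be typed directly in a shell over SSH; wrapping it in a script is only convenient when you profile several applications or metric sets.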
Yes, it is possible to profile metrics in Nsight Compute; the metric discussion above already covers that.
I won't be able to use this question/answer to debug remote connection details or to handle follow-up questions about specific access cases or usage scenarios of the Nsight tools. Documentation and tutorials are already available. If you have another specific question, please ask a new question. To locate resources for Nsight Compute and Nsight Systems, I suggest simply googling those names. The first hits will usually be the landing pages, which link to all of the above resources plus additional ones such as video tutorials describing specific cases and advanced usage.
All of these tools are available on Windows as well, with similar user interfaces. Furthermore, these tools can and should be used for any GPU of compute capability 7.0 or higher.