My environment: CUDA 10.2, device RTX 2080, OS Ubuntu 16.04.

When I try to use nvprof, I find that it doesn't support devices with compute capability 7.2 and higher, and it recommends using Nsight Compute or Nsight Systems instead. However, I cannot launch either tool because the server has no graphical interface. How can I use Nsight Compute on a remote server? Also, is it possible to profile metrics in Nsight Compute?
How to profile a CUDA application on a device with compute capability 7.x? Is the metric "dram_read_throughput" valid in Nsight Compute?
Asked by fishmingee
For compute capability 7.5 and higher, the recommended tools are Nsight Compute and Nsight Systems. NVIDIA provides documentation for each tool. There is also an introductory blog describing these "new" CUDA profiler tools, along with a tutorial blog on Nsight Systems and a tutorial blog on Nsight Compute. The introductory blog describes why there are two tools and how they relate to each other.
`dram_read_throughput` is not valid in Nsight Compute. The naming format of that metric indicates that it is an nvprof metric, and nvprof metric names generally cannot be used directly in Nsight Compute. To find out whether there is an "equivalent" Nsight Compute metric for a given nvprof metric, use the nvprof transition guide, in particular its metric comparison table. By studying that table, you'll note that there is an Nsight Compute metric equivalent to `dram_read_throughput`, and it is named `dram__bytes_read.sum.per_second`. For instructions on how to capture this metric in Nsight Compute, refer to the Nsight Compute tutorial blog or the Nsight Compute documentation.

If you have the CUDA toolkit installed on the remote server, you should be able to run Nsight Compute in CLI (command-line interface) mode. That is described in both the Nsight Compute documentation and the tutorial blog. Alternatively, you may be able to run the GUI in remote mode, which NVIDIA also documents.
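As a rough illustration of the CLI route, here is a minimal Python sketch that translates an nvprof metric name and assembles a metric-collecting command line. Several things in it are assumptions rather than part of the answer: the helper names `find_cli` and `build_command`, the application path `./my_app`, and the executable names (newer Nsight Compute releases call the CLI `ncu`, while older ones, including the version bundled with CUDA 10.2, call it `nv-nsight-cu-cli`). Only the single metric mapping comes from the transition guide entry quoted above.

```python
import shutil

# nvprof metric name -> Nsight Compute metric name.
# Only this one entry comes from the answer; extend it by hand from the
# metric comparison table in the nvprof transition guide.
NVPROF_TO_NCU = {
    "dram_read_throughput": "dram__bytes_read.sum.per_second",
}

# Assumption: the CLI executable is `ncu` in newer releases and
# `nv-nsight-cu-cli` in older ones.
CLI_CANDIDATES = ("ncu", "nv-nsight-cu-cli")


def find_cli():
    """Return the path of the first Nsight Compute CLI found on PATH, or None."""
    for name in CLI_CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_command(cli, app_args, nvprof_metrics, report=None):
    """Build the argument list for a CLI run that collects the given metrics."""
    # Raises KeyError for any nvprof metric that has no mapping yet.
    metrics = [NVPROF_TO_NCU[m] for m in nvprof_metrics]
    cmd = [cli, "--metrics", ",".join(metrics)]
    if report is not None:
        # Write a report file that can be copied off the server and
        # opened later in the GUI on a machine with a display.
        cmd += ["-o", report]
    return cmd + list(app_args)


if __name__ == "__main__":
    cli = find_cli() or "ncu"
    # "./my_app" is a hypothetical application binary.
    print(" ".join(build_command(cli, ["./my_app"], ["dram_read_throughput"])))
```

The script only prints the command, so it can be inspected before being run over SSH on the headless server. Passing `report="some_name"` adds `-o`, which is one way to profile remotely and view the results elsewhere.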
Yes, it is possible to profile metrics in Nsight Compute; the points above already cover that.
I won't be able to use this question/answer to debug remote connection details, or to handle any other follow-up questions about specific access cases or usage scenarios of the Nsight tools. Documentation and tutorials are already available. If you have another specific question, please ask a new question. To locate resources for Nsight Compute and Nsight Systems, I suggest simply googling those names. Usually the first hits will be the landing pages for the two tools, which link to all of the above resources, plus additional resources such as video tutorials describing specific cases and advanced usage.
All of these tools are available on Windows as well, with similar user interfaces. Furthermore, these tools can/should be used for any GPU of compute capability 7.0 or higher.