The /NATVIS linker option can be used to embed debug visualizers into a PDB.
Given a PDB, is there a way to recover all embedded debug visualizers? I'm looking for a first-party tool (like DUMPBIN) or, failing that, a solution based on a first-party API (like DIA).
As noted in a comment, the ability to store .natvis files in a PDB is implemented by reusing the infrastructure for embedding arbitrary source files in a PDB.
The exercise thus comes down to parsing the respective tables in a PDB and filtering relevant entries. Thankfully, the parsing is already done for us by the Debug Interface Access SDK (DIA SDK) that ships with Visual Studio¹. What's left is navigating the reference documentation to discover applicable building blocks.
Strategy
The following steps solve the problem statement:
1. Construct an IDiaDataSource interface
2. Initiate an IDiaSession
3. Find the IDiaEnumInjectedSources table
4. Visit each IDiaInjectedSource row and extract relevant data
Build Environment
The DIA SDK ships with Visual Studio. Technically, that makes it a 3rd-party library, and the natural ordeal of setting things up is due. I covered the prerequisite steps here:
With the proposed changes applied, the following program should successfully compile and link:
Construct IDiaDataSource
This should be as simple as following the official example. However, it is not. The following program fails with a REGDB_E_CLASSNOTREG error code:
There isn't anything inherently wrong with this code. It follows the standard pattern for in-proc COM server activation. The issue is that the COM server isn't registered (on my machine, anyways²). The documentation lists "msdia80.dll" (VS 2005), and things apparently changed between then and "msdia140.dll" (VS 2015+), and what was right once is wrong now.
I didn't spend a whole bunch of time trying to register the COM server or investigating the use of side-by-side assembly manifests, or fooling about with Activation Contexts to trick the COM infrastructure into discovering "msdia140.dll".
Either of the above may well be more correct, though I settled for using an undocumented export of "diaguids.lib" instead:
This looks like a (homebrew) version of registration-free COM, which is good enough for now. The following program successfully executes:
This loads "msdia140.dll" from "C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\msdia140.dll" on my system, so there may be additional dependencies which I didn't investigate.
Initiate IDiaSession
The IDiaSession is at the center of the DIA SDK. It is the pivot point for all queries against the symbol store managed by the IDiaDataSource. Initiating a session is as simple as calling IDiaDataSource::openSession():
The code is using the Windows Implementation Libraries (WIL) for convenient resource management and error handling. A C++17 compiler is required due to the use of the filesystem library.
Find the source table
"Streams" in the PDB file format are represented as "tables" in the DIA SDK.
IDiaSession::getEnumTables() returns an iterator over all tables, where IDiaEnumTables::Next() returns a generic IDiaTable interface for each entry. A call to QueryInterface() allows us to discover the specific table type. We are interested in the IDiaEnumInjectedSources table specifically, so that's what the code is requesting. Since there can be at most one such table³, we can return early once identified:
Extract source data
With an IDiaEnumInjectedSources iterator, we can reuse the pattern above to discover an IDiaInjectedSource interface for each entry and extract the relevant information (file name and source code bytes):
This is rather straightforward. However, there are a few points worth mentioning:
- Injected source files can be compressed. IDiaInjectedSource::get_sourceCompression() returns a loosely specified value, where 0 means "no compression". Other values are possible but their meaning is specific to the tool responsible for generating the PDB. More work is required if you plan to interpret the source data.
- The IDiaInjectedSource interface also doesn't offer a way to identify the type of source it refers to. I dumped the IDiaPropertyStorage key/value pairs as well to make sure I wasn't overlooking something, but that didn't turn up anything useful either (at least for my test input). The file name extension thus serves as the only hint.
Full program
With everything covered, it would be a waste not to pull it all together into a program. The following compiles to a command line utility that takes a PDB file and an output directory as parameters and dumps all .natvis files found in the PDB:
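Leaving the DIA calls aside, the part of such a utility that decides which injected sources to write, and where, can be sketched in portable C++17. This is a minimal sketch under stated assumptions, not the original program: the helper names leaf_name, is_natvis, can_dump_verbatim, and output_path are made up for illustration, and it assumes the file name and the compression value have already been read from an IDiaInjectedSource row.

```cpp
#include <cstddef>
#include <cstdint>
#include <cwctype>
#include <filesystem>
#include <string>
#include <string_view>

// Hypothetical helper: returns the part of an injected source name after
// the last '/' or '\\'. Both separators are handled explicitly so the
// result does not depend on the host's path rules.
std::wstring_view leaf_name(std::wstring_view name) {
    const auto pos = name.find_last_of(L"/\\");
    return pos == std::wstring_view::npos ? name : name.substr(pos + 1);
}

// Hypothetical helper: true if the leaf name ends in ".natvis",
// compared case-insensitively. The extension is the only available hint.
bool is_natvis(std::wstring_view name) {
    constexpr std::wstring_view ext = L".natvis";
    const auto leaf = leaf_name(name);
    if (leaf.size() < ext.size()) return false;
    const auto tail = leaf.substr(leaf.size() - ext.size());
    for (std::size_t i = 0; i < ext.size(); ++i) {
        const auto lower = std::towlower(static_cast<std::wint_t>(tail[i]));
        if (lower != static_cast<std::wint_t>(ext[i])) return false;
    }
    return true;
}

// Hypothetical helper: only entries with compression value 0
// ("no compression") can be written out byte for byte.
bool can_dump_verbatim(std::wstring_view name, std::uint32_t compression) {
    return compression == 0 && is_natvis(name);
}

// Hypothetical helper: builds the destination from the leaf name only,
// dropping any directory components stored in the injected name.
std::filesystem::path output_path(const std::filesystem::path& out_dir,
                                  std::wstring_view name) {
    return out_dir / std::filesystem::path(std::wstring(leaf_name(name)));
}
```

In a dump loop, can_dump_verbatim(name, compression) would gate the write, and output_path(out_dir, name) would give the file to create. Entries with a non-zero compression value are skipped here rather than decoded, since their encoding depends on the tool that produced the PDB.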
1 It is included with the Desktop development with C++ workload in the Visual Studio Installer.
2 And someone else's machine, too.
3 Based on a comment in the official example code. Hopefully this statement is (still) true.