Extract Contents code from PDF / Word File

215 Views Asked by JackXandar At 27 July 2025 at 13:08

I have two big files, one MS Word and one PDF, which contain images, text fields, and tables.

I need to insert text into these files dynamically at specific locations. I tried the Bookmarks method in Word, but I can't use that method now. I also read the Word file into a byte array and wrote it out as a PDF, but the resulting file is corrupted. Here is the code:

    byte[] bytes = System.IO.File.ReadAllBytes("CDC.doc");
    FileStream fs = new FileStream("CDC.pdf", FileMode.OpenOrCreate);
    fs.Write(bytes, 0, bytes.Length);
    fs.Close();

Is there any way to convert these PDF/Word files into the underlying PDF code, so that I can then append data at specific locations in that code? Please advise. Thanks!

There is 1 best solution below

Alen Walker On 20 January 2017 at 22:03

If I understand you correctly, you would like to write code that replaces all placeholders in a Word document, which acts as a template, with your application data. For placeholders you can use Bookmarks, but Content Controls would be a better choice. You can use the Open XML SDK to parse such a template document and replace the Content Controls with data. This approach uses a free Microsoft library, but it is tedious.
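To give a rough idea of the Open XML SDK route, here is a minimal sketch. It assumes the template has been saved as .docx (the SDK does not read the older binary .doc format), that the DocumentFormat.OpenXml NuGet package is referenced, and that each Content Control has a Tag naming the value it should receive. The class name `TemplateFiller`, the method `FillContentControls`, and the tag names are made up for illustration. Note that the SDK only edits the Word file; it does not produce PDF output.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
public static class TemplateFiller
{
    // Sets the text of every Content Control whose Tag matches a key in
    // `values`. Returns the number of controls that were filled.
    public static int FillContentControls(string path, IDictionary<string, string> values)
    {
        int filled = 0;

        // Open the .docx for editing (second argument: isEditable).
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, true))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            // SdtElement is the common base class of block-, run- and
            // cell-level Content Controls.
            foreach (SdtElement sdt in body.Descendants<SdtElement>().ToList())
            {
                // Read the control's Tag; skip controls without one.
                string tag = sdt.GetFirstChild<SdtProperties>()?
                                .GetFirstChild<Tag>()?.Val?.Value;
                if (tag == null || !values.TryGetValue(tag, out string replacement))
                    continue;

                // Controls with no text node at all are skipped in this sketch.
                List<Text> texts = sdt.Descendants<Text>().ToList();
                if (texts.Count == 0)
                    continue;

                // Put the new value in the first text node (keeping its
                // formatting) and blank out any remaining nodes.
                texts[0].Text = replacement;
                foreach (Text extra in texts.Skip(1))
                    extra.Text = string.Empty;

                filled++;
            }

            // Write the modified XML back into the package.
            doc.MainDocumentPart.Document.Save();
        }

        return filled;
    }
}
```

A call would look something like `TemplateFiller.FillContentControls("CDC.docx", new Dictionary<string, string> { ["CustomerName"] = "Contoso" });`, where `CustomerName` is whatever Tag you assigned to the control in Word. Even this small sketch ignores headers, footers, images, and repeating table rows, which is where the tedium comes from.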

A much easier approach would be to use a ready-made library that works with templates containing placeholders, which are replaced with your real application data at runtime. In your C# application you can prepare the data (as C# data objects or XML) and merge it with the template. The output can be in docx, pdf, or xps format. You can check out some of the examples here.
