SQL Server 2012 - Fulltext search on top of a filetable - PDF not being searched

I'm getting my feet wet with handling a load of Office and PDF documents with SQL Server 2012's FILETABLE feature, and using fulltext search on top of that.

I've configured my SQL Server to support fulltext search and filestream, and I've created a FILETABLE, dumped 800+ documents of all sorts into the folder, and that all works nicely.

In order to be able to fulltext index MS Office documents, I've installed the MS Filter Pack 2.0, and to handle the PDF files, I've downloaded Adobe's iFilter for PDF and installed them all.

Now I've created a full text catalog:

CREATE FULLTEXT CATALOG DocumentCatalog
WITH ACCENT_SENSITIVITY = OFF

and then a full text index on the FILETABLE table:

CREATE FULLTEXT INDEX 
ON dbo.Documents(name, file_type, file_stream TYPE COLUMN file_type)
KEY INDEX [PK_Document]
ON DocumentCatalog

and that all seemed to work just fine. After a while, populating the 800+ documents I have, I can start doing searches:

SELECT 
    stream_id, name, file_type, cached_file_size, 
    file_stream.GetFileNamespacePath(1)
FROM 
    dbo.Documents
WHERE
    CONTAINS(*, 'Silverlight')

and stuff that is contained in MS Office documents (*.doc, *.docx, *.ppt, *.pptx, *.xls, *.xlsx) is found quite nicely - and quickly.

Unfortunately, none of the text in the PDF files seems to be found :-(

Any ideas why? I had no errors during setup, and all seems fine - I can see the .pdf file type in the Filters in SQL Server:

SELECT *
FROM sys.fulltext_document_types

returns:

.pdf    E8978DA6-047F-4E3D-9C78-CDBE46041603    
        C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin\PDFFilter.dll    
        11.0.1.36    Adobe Systems, Inc.

but somehow, those PDFs don't seem to be indexed. Can I somehow find out which files were in fact indexed, and whether or not there was an error during population? Where would I find this information?
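
For reference, a minimal sketch of one way to look at population results, assuming the table and catalog names used above (dbo.Documents, DocumentCatalog) and that it is run in the database holding the FILETABLE. It only reports counts, not the individual files that failed; per-document errors are normally written to the full-text crawl log (files named SQLFT*.LOG in the instance's LOG directory), so that is probably the place to look for the PDF failures themselves.

```sql
-- Sketch only: object and catalog names are taken from the question above.
-- Overall population counters for the full-text index on the FILETABLE.
SELECT
    OBJECTPROPERTYEX(OBJECT_ID('dbo.Documents'), 'TableFulltextDocsProcessed') AS docs_processed,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.Documents'), 'TableFulltextFailCount')     AS docs_failed,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.Documents'), 'TableFulltextItemCount')     AS docs_indexed,
    FULLTEXTCATALOGPROPERTY('DocumentCatalog', 'PopulateStatus')               AS populate_status; -- 0 = idle

-- Number of files per type in the FILETABLE, to compare against the counters above.
SELECT file_type, COUNT(*) AS files
FROM dbo.Documents
WHERE is_directory = 0
GROUP BY file_type;
```

If docs_failed is roughly equal to the number of .pdf rows, that would suggest the PDF filter is failing during the crawl rather than the files being skipped.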
