The documentation for sphinx-0.9.9-rc2 states:
The data to be indexed can generally come from very different sources: SQL databases, plain text files, HTML files, mailboxes, and so on.
However, I can't find any documentation on setting up a source other than SQL. The config file doesn't seem to indicate that the source can be anything but a database. Does anyone have any helpful links for setting up Sphinx with an HTML source?
Are you looking for Sphinx's xmlpipe feature (the older xmlpipe source type has been superseded by xmlpipe2)? I've tried it with XML files and it works just like the SQL sources do.
I haven't tried out Sphinx with vanilla HTML files, so I'm guessing you'll need to parse your HTML file and create XML files with the attributes/fields that you want indexed and feed them to Sphinx using xmlpipe.
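Here is a rough sketch of that idea in Python, not something I have run against a real index. It pulls the title and visible text out of each HTML page and prints an xmlpipe2 document stream. The field names `title` and `content`, the script layout, and the way document ids are assigned (position in the sorted file list) are all my own assumptions; in practice you would want ids that stay stable when files are added or removed.

```python
import sys
from html.parser import HTMLParser
from pathlib import Path
from xml.sax.saxutils import escape


class TextExtractor(HTMLParser):
    """Collects the <title> text and the visible text of an HTML page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.body_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif not self._skip_depth:
            self.body_parts.append(data)


def html_to_fields(html):
    """Return (title, content) with whitespace collapsed to single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    content = " ".join(" ".join(parser.body_parts).split())
    return title, content


def build_docset(pages):
    """pages: iterable of (doc_id, html) pairs. Returns xmlpipe2 XML as a string."""
    out = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        '<sphinx:field name="title"/>',    # assumed field name
        '<sphinx:field name="content"/>',  # assumed field name
        "</sphinx:schema>",
    ]
    for doc_id, html in pages:
        title, content = html_to_fields(html)
        out.append('<sphinx:document id="%d">' % doc_id)
        out.append("<title>%s</title>" % escape(title))
        out.append("<content>%s</content>" % escape(content))
        out.append("</sphinx:document>")
    out.append("</sphinx:docset>")
    return "\n".join(out) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical invocation: python html_to_xmlpipe2.py /path/to/html
    files = sorted(Path(sys.argv[1]).glob("*.html"))
    pages = ((i, f.read_text(encoding="utf-8", errors="replace"))
             for i, f in enumerate(files, start=1))
    sys.stdout.buffer.write(build_docset(pages).encode("utf-8"))
```

In sphinx.conf the source would then use `type = xmlpipe2` with `xmlpipe_command` set to something like `python /path/to/html_to_xmlpipe2.py /path/to/html` (the script name and paths are placeholders), and the indexer reads the XML from the command's standard output.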
You can see here and here for more.
HTH