I'm building a search function for an application with Lucene.NET and NHibernate.Search. To index the existing data I am using this method:
public void SynchronizeIndexForAllUsers()
{
    var fullTextSession = Search.CreateFullTextSession(m_session);
    var users = GetAll();
    foreach (var user in users)
    {
        if (!user.IsDeleted)
        {
            fullTextSession.Index(user);
        }
    }
}
I have marked the fields I want to index with the following attribute:
[Field(Index.Tokenized, Store = Store.Yes, Analyzer = typeof(StandardAnalyzer))]
public virtual string FirstName
{
    get { return m_firstName; }
    set { m_firstName = value; }
}
But when I then inspect the indexes in Luke, the fields still contain uppercase letters, commas, etc., which should have been removed by the StandardAnalyzer.
Does anyone know what I am doing wrong?
I had a similar problem to yours, but I was trying to use WhitespaceAnalyzer. Setting it in the Field attribute didn't work for me either.
I ended up setting it globally through my FluentNHibernate configuration. Take a look at NHibernate.Search.Environment.AnalyzerClass.
The funny thing is that it won't work for generic full-text queries (I think Lucene will use StandardAnalyzer there), but that's another story :).
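A minimal sketch of what such a global setting could look like, assuming the property is passed to the underlying NHibernate configuration via FluentNHibernate's ExposeConfiguration. The SessionFactoryBuilder class, the connectionString parameter, the SQL Server dialect and the User entity type are placeholders for illustration, not taken from the actual project:

```csharp
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using Lucene.Net.Analysis;
using NHibernate;

// Hypothetical helper class; the name and signature are assumptions.
public static class SessionFactoryBuilder
{
    public static ISessionFactory Build(string connectionString)
    {
        return Fluently.Configure()
            // Assumed database; swap in whatever the application really uses.
            .Database(MsSqlConfiguration.MsSql2008.ConnectionString(connectionString))
            // Assumes the fluent mappings live in the same assembly as a User entity.
            .Mappings(m => m.FluentMappings.AddFromAssemblyOf<User>())
            // Set the default analyzer for NHibernate.Search globally
            // instead of per field through the [Field] attribute.
            .ExposeConfiguration(cfg =>
                cfg.SetProperty(
                    NHibernate.Search.Environment.AnalyzerClass,
                    typeof(WhitespaceAnalyzer).AssemblyQualifiedName))
            .BuildSessionFactory();
    }
}
```

This only covers the analyzer property; any other NHibernate.Search settings the application needs (index directory, event listeners and so on) would have to be configured alongside it.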
Hope this helps.