Understanding CQ5 Lucene indexing rule

Information: I have provided an indexing configuration file to CQ5. To keep the property cq:template out of the index, I specified the following rule:

<index-rule nodeType="nt:base">
 <property nodeScopeIndex="false">cq:template</property>
</index-rule>

I rebuilt the index. The logs show that re-indexing completed properly.

The problem I am facing: When I execute the following SQL2 query, it gives me the same results as it would give without the above indexing rule:

SELECT s.[cq:template] FROM [nt:base] AS s WHERE s.[cq:template] like '/apps/geometrixx/templates/contentpage'

Answer:

Your rule actually omits all properties from the index except cq:template (and it excludes cq:template from the fulltext index because you defined nodeScopeIndex="false"). See the Jackrabbit documentation (reference 1 below) for more details.

When you define the element <property nodeScopeIndex="false">cq:template</property>, the system includes the property in the index. However, nodeScopeIndex="false" tells CRX/Jackrabbit not to include the property in the fulltext index. The property therefore remains available to all searches except those using contains(...) in SQL or jcr:contains(...) in XPath, which is why your LIKE query still returns the same results.
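
As a rough illustration of the difference (a sketch, not taken from the out-of-the-box configuration), the fragment below declares two properties in one rule. It assumes, per the Jackrabbit indexing documentation, that nodeScopeIndex defaults to true when the attribute is omitted; jcr:title is used here only as an example of a property left in the fulltext index.

```xml
<index-rule nodeType="nt:base">
  <!-- Indexed for property comparisons (=, LIKE), but kept out of the
       fulltext index, so contains(...) on the node will not see it. -->
  <property nodeScopeIndex="false">cq:template</property>
  <!-- Indexed for property comparisons and also included in the fulltext
       index (assumed default when nodeScopeIndex is not set). -->
  <property>jcr:title</property>
</index-rule>
```

With a rule like this, a comparison on s.[cq:template] should still be answered from the index, while a contains(...) search on the node would match text in jcr:title but not values of cq:template. Any property not listed in the rule would not be indexed at all for matching nodes.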

To avoid indexing a property entirely, omit it from the first index-rule whose nodeType/condition attributes match its node. It must be the first matching rule because the rules in the indexing_config.xml file are processed top down.
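
The top-down matching can be sketched as follows. This is a minimal illustration modeled on the condition syntax shown in the Jackrabbit documentation; the priority property and the use of nt:unstructured are hypothetical and are not part of the CQ5 configuration.

```xml
<!-- Rule 1: checked first. Matches nt:unstructured nodes whose
     (hypothetical) priority property equals 'high'. -->
<index-rule nodeType="nt:unstructured" condition="@priority = 'high'">
  <property>jcr:title</property>
</index-rule>

<!-- Rule 2: only consulted for nodes that rule 1 did not match. -->
<index-rule nodeType="nt:base">
  <property>jcr:title</property>
  <property nodeScopeIndex="false">cq:template</property>
</index-rule>
```

In this sketch, a high-priority nt:unstructured node is handled by rule 1 alone, so its cq:template property would not be indexed even though rule 2 lists it. Every other node falls through to rule 2.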

So to remove the cq:template property from the index in CQ5, do the following:

  1. Extract the out-of-the-box CQ5 version of indexing_config.xml (see reference 2 below for instructions).
  2. Remove the <property nodeScopeIndex="false">cq:template</property> element from <index-rule nodeType="nt:base">.
  3. Change the regular expression in the last rule, <property isRegexp="true">, from .*:.* to ^(?!cq:template).*:.*$

After you make the changes, the index-rule should look like this:

<index-rule nodeType="nt:base">
  <property nodeScopeIndex="false">analyticsProvider</property>
  <property nodeScopeIndex="false">analyticsSnippet</property>
  <property nodeScopeIndex="false">hideInNav</property>
  <property nodeScopeIndex="false">offTime</property>
  <property nodeScopeIndex="false">onTime</property>
  <property nodeScopeIndex="false">cq:allowedTemplates</property>
  <property nodeScopeIndex="false">cq:childrenOrder</property>
  <property nodeScopeIndex="false">cq:cugEnabled</property>
  <property nodeScopeIndex="false">cq:cugPrincipals</property>
  <property nodeScopeIndex="false">cq:cugRealm</property>
  <property nodeScopeIndex="false">cq:designPath</property>
  <property nodeScopeIndex="false">cq:isCancelledForChildren</property>
  <property nodeScopeIndex="false">cq:isDeep</property>
  <property nodeScopeIndex="false">cq:lastModified</property>
  <property nodeScopeIndex="false">cq:lastModifiedBy</property>
  <property nodeScopeIndex="false">cq:lastPublished</property>
  <property nodeScopeIndex="false">cq:lastPublishedBy</property>
  <property nodeScopeIndex="false">cq:lastReplicated</property>
  <property nodeScopeIndex="false">cq:lastReplicatedBy</property>
  <property nodeScopeIndex="false">cq:lastReplicationAction</property>
  <property nodeScopeIndex="false">cq:lastReplicationStatus</property>
  <property nodeScopeIndex="false">cq:lastRolledout</property>
  <property nodeScopeIndex="false">cq:lastRolledoutBy</property>
  <property nodeScopeIndex="false">cq:name</property>
  <property nodeScopeIndex="false">cq:parentPath</property>
  <property nodeScopeIndex="false">cq:segments</property>
  <property nodeScopeIndex="false">cq:siblingOrder</property>
  <property nodeScopeIndex="false">cq:trigger</property>
  <property nodeScopeIndex="false">cq:versionComment</property>
  <property nodeScopeIndex="false">jcr:createdBy</property>
  <property nodeScopeIndex="false">jcr:lastModifiedBy</property>
  <property nodeScopeIndex="false">sling:alias</property>
  <property nodeScopeIndex="false">sling:resourceType</property>
  <property nodeScopeIndex="false">sling:vanityPath</property>
  <property isRegexp="true">^(?!cq:template).*:.*$</property>
</index-rule>

Note of warning:

I'm not sure whether it is safe to remove cq:template from the search index, as the product code may rely on it. As a best practice, exclude only custom application properties. Also, any property that contains references to other content paths must stay in the fulltext index: when you move a page in CQ5 (AEM), the system runs a jcr:contains search to find where that page is referenced. If you exclude such properties with nodeScopeIndex="false", or by modifying the regular expression above to omit them, the reference search will miss them and you will end up with stale references to old paths.


References:

  1. Official indexing_config.xml reference: http://wiki.apache.org/jackrabbit/IndexingConfiguration
  2. Instructions on how to update indexing_config.xml in CQ5: http://helpx.adobe.com/experience-manager/kb/SearchIndexingConfig.html