Search is a key mechanism that members of your community use to locate content that interests them. You must install and configure search for your community. If you have difficulty using search, review how to troubleshoot search.
The Zimbra Community search provider uses a powerful search engine to furnish search results as well as index content sent from Zimbra products. The search index is self-updating and will add and remove content as it changes on your site. The search results are trimmed based on the user's roles. If for some reason the index becomes corrupt or has invalid data, you can manually delete and reindex the content on the site.
The primary reason to reindex your site is that search results are not what you expect or contain incorrect data.
Reasons this may occur:
- Corruption of current data when indexing the site
- Requirement to update the version of the search components
- Failure to synchronize permissions to the search index
- Search indexing tasks fail to update the search index
To reset the search index, perform two tasks: delete the existing index, then flag the content to be reindexed. To delete the existing index:
Stop the Tomcat service using the tray icon or through the Services MMC. If using the MMC, look for "Apache Tomcat."
Delete your existing index by renaming or deleting the Index folder. (Delete the whole folder, not just the content in it. Solr will recreate it in %ProgramFiles%\Apache Software Foundation\Tomcat 7.0\Solr\data\.)
Restart the indexing service by starting Tomcat.
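The three steps above can be scripted. The following is a minimal sketch, not an official tool. It assumes the Windows service is named `Tomcat7` (check the actual name in the Services MMC) and that the index folder is named `Index`; the function name `reset_index` and its parameters are illustrative. It renames the folder instead of deleting it, so the old index can still be restored.

```python
import subprocess
import time
from pathlib import Path


def reset_index(data_dir, service="Tomcat7", index_name="Index", run=subprocess.run):
    """Stop Tomcat, move the Solr index folder aside, then start Tomcat again.

    `service` and `index_name` are assumptions; adjust them to your install.
    `run` is injectable so the function can be exercised without a real service.
    Returns the new path of the old index, or None if no index folder was found.
    """
    index_dir = Path(data_dir) / index_name
    stamp = time.strftime("%Y%m%d-%H%M%S")
    moved_to = None

    # 1. Stop the Tomcat service so Solr releases the index files.
    run(["net", "stop", service], check=True)
    try:
        # 2. Rename the whole folder (not just its contents); Solr recreates it.
        if index_dir.exists():
            moved_to = index_dir.with_name(f"{index_name}.old-{stamp}")
            index_dir.rename(moved_to)
    finally:
        # 3. Start Tomcat again even if the rename failed.
        run(["net", "start", service], check=True)
    return moved_to
```

Run it from an elevated prompt, passing the Solr data folder mentioned above, for example `reset_index(r"C:\Program Files\Apache Software Foundation\Tomcat 7.0\Solr\data")`.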
To set the content to be reindexed, you need access to the database where the content is contained and permission to run SQL scripts against the database.
Each object type that maps to a Search Content Mapper can be reset. Here is the default script that resets all content:
/* Resets all default content types */
delete from cs_Search_Queue
exec te_SearchIndex_Update null, null, 0
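If you prefer to run the reset from a script rather than a query window, the same two statements can be sent through any Python DB-API connection. This is a sketch under assumptions: `conn` is a connection you have already opened to the community database with a SQL Server driver of your choice (not shown here), and `queue_full_reindex` is an illustrative name. The SQL text is the default script above, unchanged.

```python
RESET_STATEMENTS = (
    "delete from cs_Search_Queue",
    "exec te_SearchIndex_Update null, null, 0",
)


def queue_full_reindex(conn):
    """Run the default reset script on an open DB-API connection.

    Commits only if both statements succeed; otherwise rolls back and re-raises.
    """
    cursor = conn.cursor()
    try:
        for statement in RESET_STATEMENTS:
            cursor.execute(statement)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()
```

Running both statements in one transaction means a failure in the stored procedure does not leave the queue cleared without the content having been flagged.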
Because a reset should be needed only in rare cases, keep the following in mind before resetting the index on a live site:
- During the reindexing period, your search services will not be fully functional on your site.
- Searches will not return complete results until indexing has finished, but content that has already been indexed at the time of the search will be returned.
- The indexing service can be processor-intensive, so perform these operations during off hours if possible.
- Back up all data prior to performing any of these steps - the database as well as the search index.
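For the search-index half of that backup, one option is to copy the Solr data folder to a timestamped location while Tomcat is stopped, so the files do not change during the copy. A minimal sketch follows; the function name `backup_index` and the destination layout are illustrative, and the database itself should still be backed up with your usual SQL Server tooling.

```python
import shutil
import time
from pathlib import Path


def backup_index(data_dir, backup_root):
    """Copy the Solr data folder into a new timestamped folder under backup_root."""
    source = Path(data_dir)
    target = Path(backup_root) / f"solr-data-{time.strftime('%Y%m%d-%H%M%S')}"
    # copytree creates `target` and any missing parents, and raises if `target`
    # already exists, so an earlier backup is never overwritten.
    shutil.copytree(source, target)
    return target
```

The returned path can be logged alongside the database backup so the two can be restored as a matching pair.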
Error Indexing Attachment : [Filename=document.pdf ]System.ApplicationException: TextFilter error:
CommunityServer.Components.Search.TextFilterException: IFilter instance not found for file C:\Windows\TEMP\1000.69.1626.document.pdf
at CommunityServer.Components.Search.TextFilter.[unreadable]()
at CommunityServer.Components.Search.TextFilter.[unreadable]()
at CommunityServer.Components.Search.TextFilter..ctor(String file)
at CommunityServer.Search.MappingExtension.GetAttachmentText(PostAttachment attachment)
Search Indexing 9/23/2009 2:26:41 PM WEB1 500 Warning 1000
You do not need Windows Search to index "D:"; that location can be removed. The indexing task copies attachments to (and deletes them from) the Windows TEMP folder.
Download the Adobe PDF iFilter (http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611). Starting after Adobe 6.0, 32-bit servers only require that you install Adobe Reader to get the iFilter. But for 64-bit servers, you need a different download: http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025.
Install the Microsoft filter pack.
Install Windows Search (if you have not already done so).
To be prudent, recycle the application pool so you can be certain that the indexing task is refreshed.
For testing, flag posts that have attachments to be reindexed:
-- Forum threads
update te_SearchIndex_Contents set IsIndexed = 0
where ContentId in (
    select t.ContentId
    from te_forum_threads t
    join te_Attachments a on a.ContentId = t.ThreadId
    where IsRemote = 0 and a.ApplicationTypeId = 0 and a.ApplicationContentTypeId = 0
)

-- Forum replies
update te_SearchIndex_Contents set IsIndexed = 0
where ContentId in (
    select t.ContentId
    from te_Forum_ThreadReplies t
    join te_Attachments a on a.ContentId = t.ThreadId
    where IsRemote = 0 and a.ApplicationTypeId = 0 and a.ApplicationContentTypeId = 1
)

-- Blog posts
update te_SearchIndex_Contents set IsIndexed = 0
where ContentId in (
    select b.ContentId
    from te_Blog_Posts b
    join te_Attachments a on a.ContentId = b.PostId
    where IsRemote = 0 and a.ApplicationTypeId = 1 and a.ApplicationContentTypeId = 0
)

-- File gallery files
update te_SearchIndex_Contents set IsIndexed = 0
where ContentId in (
    select f.ContentId
    from te_FileGallery_Files f
    join te_Attachments a on a.ContentId = f.FileId
    where IsRemote = 0 and a.ApplicationTypeId = 13 and a.ApplicationContentTypeId = 0
)
To check the index for documents that have attachment text, you can issue the following query against your Solr instance:
(Note: There is a delay of almost 30 seconds before documents are committed to the Solr index, so be patient. Also, do a hard refresh in the browser so that it does not show cached results.)
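A check of this kind can also be run from a script, which sidesteps browser caching. The sketch below only builds and fetches a standard Solr select URL. The base URL and the attachment field name are placeholders you must supply, not values taken from the product, so look up the real field name in your Solr schema first. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request


def attachment_query_url(base_url, field, rows=10):
    """Build a Solr select URL matching documents that have any value in `field`."""
    params = {
        "q": f"{field}:[* TO *]",  # open-ended range: any value present in the field
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/select?{urllib.parse.urlencode(params)}"


def count_documents_with_field(base_url, field):
    """Return how many indexed documents currently have a value in `field`."""
    url = attachment_query_url(base_url, field, rows=0)
    with urllib.request.urlopen(url) as response:
        return json.load(response)["response"]["numFound"]
```

Because commits are delayed, wait about half a minute after the indexing task runs before calling `count_documents_with_field`, and compare the count before and after flagging content for reindexing.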