Search Indexing

Search is a key mechanism that members of your community use to locate content that interests them. You must install and configure search for your community. If you have difficulty using search, review how to Troubleshoot search errors.

Re-indexing

Community uses a Solr search engine to provide search results for content and conversations within the community. The search instance is self-updating: it adds and removes content as it changes on your site. Search results are trimmed based on the user's roles. If the index becomes corrupt or contains invalid data, you can manually delete the index and re-index the content on the site.

The primary reason to re-index your site is that the search results are not what you expect or contain incorrect data.

Reasons this may occur:

  • Corruption of current data when indexing the site
  • Requirement to update the version of the search components
  • Failure to synchronize permissions to the search index
  • Search indexing tasks fail to update the search index

Reset the index

There are two cores (indexes) within the Solr instance: "telligent-content" and "telligent-conversations". To delete one or both of the cores, perform the following tasks:

Delete the existing cores

  1. Stop the "Telligent Search" service using the Windows Services MMC.

  2. Navigate to the "telligent-content" core folder in the Solr home folder (for example, c:\Search\data\home\telligent-content\data\) and delete the "index" and "tlog" folders.

  3. If you are also re-indexing conversations, navigate to the "telligent-conversations" core folder in the Solr home folder (for example, c:\Search\data\home\telligent-conversations\data\) and delete the "index" and "tlog" folders.

  4. Restart the "Telligent Search" service.
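Steps 2 and 3 can be scripted if you reset the index regularly in a test environment. The following is a minimal Python sketch, not an official tool: the function name clear_core_data is hypothetical, it assumes the "Telligent Search" service has already been stopped, and the folder you pass in must match your own Solr home folder.

```python
import shutil
from pathlib import Path


def clear_core_data(core_data_dir, subfolders=("index", "tlog")):
    """Delete the "index" and "tlog" folders of one Solr core data folder.

    Assumption: the search service is stopped before this is called.
    Returns the list of folders that were actually removed.
    """
    removed = []
    for name in subfolders:
        target = Path(core_data_dir) / name
        # Skip folders that do not exist so the call is safe to repeat.
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(str(target))
    return removed
```

For the example layout in step 2, the call would be clear_core_data(r"c:\Search\data\home\telligent-content\data"). Other files and folders inside the core folder are left untouched.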

Set the content to be re-indexed

To set the content to be re-indexed, you need access to the database where the content is stored and permission to run SQL scripts against it.

Execute the following to reset all content:

exec te_SearchIndex_ReindexAllDefaultContentTypes

Execute the following to reset all conversations:

update dbo.te_ConversationMessages set IsIndexed = 0

Things to keep in mind

Resetting the index should only be done in rare cases. Before you reset the index on a live site, keep the following in mind:

  • During the re-indexing period, your search services will not be fully functional on your site.
  • Search will not return complete results until re-indexing has finished, but content that has already been indexed at the time of a search will be returned.
  • The indexing service can at times be CPU- and database-intensive, so consider performing these operations after hours.

Scenario: You have the following log for an IFilter index error

Error Indexing Attachment : [Filename=document.pdf ]
System.ApplicationException: TextFilter error:
CommunityServer.Components.Search.TextFilterException: IFilter instance not found for file C:\Windows\TEMP\1000.69.1626.document.pdf
   at CommunityServer.Components.Search.TextFilter.ᐁ()
   at CommunityServer.Components.Search.TextFilter.ᐁ()
   at CommunityServer.Components.Search.TextFilter..ctor(String file)
   at CommunityServer.Search.MappingExtension.GetAttachmentText(PostAttachment attachment)
Search Indexing 9/23/2009 2:26:41 PM WEB1 500 Warning 1000
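
When many entries like this accumulate, it can help to see which file types are affected before deciding which filters to install. The following is a minimal Python sketch under stated assumptions: the log has been exported as plain text, the message wording matches the entry above, and the temporary file paths contain no spaces. The function name missing_ifilter_extensions is hypothetical.

```python
import re
from collections import Counter
from pathlib import PureWindowsPath

# Matches the message text of the log entry shown above and captures the path.
_MISSING_IFILTER = re.compile(r"IFilter instance not found for file (\S+)")


def missing_ifilter_extensions(log_text):
    """Count file extensions named in 'IFilter instance not found' entries."""
    counts = Counter()
    for match in _MISSING_IFILTER.finditer(log_text):
        # PureWindowsPath parses Windows-style paths on any operating system.
        suffix = PureWindowsPath(match.group(1)).suffix.lower()
        counts[suffix] += 1
    return counts
```

Each extension in the result points at a file type for which no IFilter was found on the server.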

Solution: Install iFilter and re-index the files

You do not need to have Windows Search index "D:"; it can be removed. The indexing task copies attachments to (and deletes them from) the Windows TEMP folder.

  1. Download the Adobe PDF iFilter (http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611). For versions after Adobe 6.0, 32-bit servers only require Adobe Reader to be installed to get the iFilter. 64-bit servers need a different download: http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025.

  2. Install the Microsoft filter pack.

  3. Install Windows Search (if you have not already done so).

  4. To be prudent, recycle the application pool so that you can be certain the indexing task is refreshed.

  5. For testing, flag posts that have attachments to be re-indexed:

          -- Blog posts that have non-remote attachments (IsRemote = 0)
          UPDATE [dbo].[te_Blog_Posts] SET [IsIndexed] = 0
                WHERE PostID IN (SELECT ContentId FROM te_Attachments WHERE IsRemote = 0 AND ApplicationTypeId = 1);
          -- Forum threads that have non-remote attachments
          UPDATE [dbo].[te_Forum_Threads] SET [IsIndexed] = 0
                WHERE ThreadId IN (SELECT ContentId FROM te_Attachments WHERE IsRemote = 0 AND ApplicationTypeId = 0 AND ApplicationContentTypeId = 0);
          -- Forum thread replies that have non-remote attachments
          UPDATE [dbo].[te_Forum_ThreadReplies] SET [IsIndexed] = 0
                WHERE ThreadReplyId IN (SELECT ContentId FROM te_Attachments WHERE IsRemote = 0 AND ApplicationTypeId = 0 AND ApplicationContentTypeId = 1);
          -- All file gallery files (no attachment filter is applied)
          UPDATE [dbo].[te_FileGallery_Files] SET [IsIndexed] = 0;
    

    To check the index for documents that have attachment text, you can issue the following query against your Solr instance: 

    (Note: There is a delay of almost 30 seconds before documents are committed to the Solr index, so be patient. Also, do a hard refresh in the browser so that it does not show cached results.)

    http://[your Solr Server]:8080/solr/telligent-content/select/?q=attachmenttext:[*+TO+*]&sort=indexed_at+desc&fl=attachmentname,indexed_at&rows=100
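
If you run this check often, the URL can be built programmatically so that the query is always encoded correctly. The following is a minimal Python sketch, assuming the same port, core name, and field names as the URL above; the function name attachment_text_query_url is hypothetical, and the host name is whatever your Solr server is called.

```python
from urllib.parse import urlencode


def attachment_text_query_url(solr_host, port=8080, rows=100):
    """Build the select URL for documents that have attachment text."""
    params = {
        "q": "attachmenttext:[* TO *]",     # any document with the field populated
        "sort": "indexed_at desc",          # most recently indexed first
        "fl": "attachmentname,indexed_at",  # fields to return
        "rows": rows,
    }
    # urlencode percent-encodes the brackets and turns spaces into "+".
    query = urlencode(params)
    return f"http://{solr_host}:{port}/solr/telligent-content/select/?{query}"
```

The resulting URL is equivalent to the one above, with the special characters percent-encoded. Fetching it from a script rather than a browser also avoids the browser caching mentioned in the note.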