Troubleshooting dashes in search terms

Hi.

I have a Group named "ADSP-BF70x", and another one where that string appears in the Group name.

When I search for "adsp-bf70x", I get the two Groups in the results.

When I search for "adsp-bf70*", I get NO results.  I don't understand why.

When I search for "adsp bf70x", I get the two Groups in the results.

When I search for "adsp bf70*", I get the two Groups in the results.

I'm trying to understand the tokenization of the query, specifically...

  • Is a dash a token delimiter, or is it part of the token?
  • If the dash is a delimiter, then why doesn't "adsp-bf70*" work?
  • If the dash is part of the token, then why doesn't "adsp-bf70*" work?

If possible, it would be great to know what filters/tokenizers/stemmers you are using and their configuration.  It would allow me to research those directly.

Lastly, is it possible to reconfigure the filters/tokenizers/stemmers on a Telligent hosted instance, with or without PS support?

Thanks.

  • Is a dash a token delimiter, or is it part of the token?

    For the title field, which is where this value goes, the StandardTokenizer treats the dash as a delimiter and breaks the value into two tokens ("adsp" and "bf70x"), as you probably expected. Here is the analysis output:

    I PM'd you the 10.1 search schema and configuration files. Look at the title field and you will see it's a "text" field. You can then see how that field type is defined.

    When I search for "adsp-bf70*", I get NO results.  I don't understand why.

    It is probably not working the way you think it does once you add the wildcard. Not all analyzers run when you add a wildcard - see this article for more info. Adding a wildcard is sort of like saying "I know the pattern of the token I want". In this case, it's treating your search as if you are looking for a token that starts with "adsp-bf70" (a single token, not broken apart), and no such token is in the index - only "adsp" and "bf70x" are. That is also why "adsp bf70*" works: "adsp" matches as a plain term and "bf70*" matches the indexed token "bf70x". Here is the title part of the query when you issue that search:

      

    There is a field in the index, titlelookup, that is better suited to wildcard searches, but using it requires an explicit field search - e.g. titlelookup:ADSP-BF70x*
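
    To make the difference between the analyzed path and the wildcard path concrete, here is a rough Python sketch. It is a toy model, not the actual Solr/Lucene analysis chain or the real query: the analyze function only approximates the StandardTokenizer plus lowercasing, the second group name is made up, and titlelookup is not modeled at all.

```python
import re


def analyze(text):
    """Toy stand-in for an analyzed "text" field (assumption: lowercase,
    then split on anything that is not a letter or digit)."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def build_index(titles):
    """Map each indexed token to the set of document ids containing it."""
    index = {}
    for doc_id, title in enumerate(titles):
        for token in analyze(title):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, query):
    """AND together the whitespace-separated clauses of the query.

    A clause ending in '*' skips tokenization and is matched as a single
    prefix against indexed tokens. Any other clause is analyzed first.
    """
    result = None
    for clause in query.split():
        if clause.endswith("*"):
            # Wildcard path: the dash stays inside the prefix.
            prefix = clause[:-1].lower()
            hits = set()
            for token, ids in index.items():
                if token.startswith(prefix):
                    hits |= ids
        else:
            # Analyzed path: the dash splits the clause into tokens.
            hits = None
            for token in analyze(clause):
                ids = index.get(token, set())
                hits = ids if hits is None else hits & ids
            hits = hits or set()
        result = hits if result is None else result & hits
    return sorted(result or [])


# The second title is a hypothetical example of "another group where
# that string appears in the name".
index = build_index(["ADSP-BF70x", "ADSP-BF70x Blackfin Processors"])

for q in ["adsp-bf70x", "adsp-bf70*", "adsp bf70x", "adsp bf70*"]:
    print(q, "->", search(index, q))
```

    In this model only "adsp-bf70*" comes back empty, because no indexed token starts with "adsp-bf70"; the other three queries find both groups.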

    Lastly, is it possible to reconfigure the filters/tokenizers/stemmers on a Telligent hosted instance, with or without PS support?

    Not through the UI, but if you have a case or example of a recommended change, I would contact support and ask them to review it.