Endeca Search and Performance Impact - Record Search | Wildcard Search | Boolean Search | Phrase Search

Record Search:
Record search is an indexed feature, so each property enabled for record search increases the size of the Dgraph process. Enable record search only for the properties that the application actually needs to search.
Use the Endeca Set Selection feature: http://ravihonakamble.blogspot.com/2015/06/endeca-select-feature-aka-set-selection.html

Wildcard Search:
If wildcard search is enabled in the MDEX Engine, it increases the time and disk space required for indexing, even if users never issue a wildcard query. Check the business requirements for your Endeca application first to decide whether you need wildcard search at all.
Recommendations:
1) Avoid wildcard searches with only one non-wildcarded character, such as a*, since they are more expensive to process.
2) Parse each query and calculate its search term length to screen out very low-information queries, such as "a*". Avoid sending the MDEX Engine wildcard queries that contain fewer than 3 non-wildcarded characters.
3) Remove all non-searchable characters from each wildcard query before issuing it to the MDEX Engine.
4) Do not combine wildcard queries with quoted phrase searches.
Note:
If a search query contains only wildcards and punctuation, such as *.*, the MDEX Engine rejects it for performance reasons and returns no results.
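
The recommendations above can be applied in the application layer before a query ever reaches the MDEX Engine. The following is only a rough sketch in plain Python, not Endeca API code: the function name prepare_wildcard_query is hypothetical, and the set of "non-searchable" characters (anything other than letters, digits, underscore, whitespace and *) is an assumption that should be matched to your own search character configuration. Punctuation is replaced with a space here rather than deleted outright so that adjacent words do not merge.

```python
import re

# Threshold taken from recommendation 2 above.
MIN_LITERAL_CHARS = 3


def prepare_wildcard_query(raw):
    """Return a cleaned query string, or None if it should not be sent.

    Hypothetical helper: the character rules are assumptions, not
    the MDEX Engine's actual tokenization rules.
    """
    query = raw.strip()

    # Recommendation 4: do not mix wildcards with quoted phrases.
    if '"' in query:
        return None

    # Recommendation 3: replace characters assumed to be non-searchable
    # with a space, then collapse repeated whitespace.
    cleaned = re.sub(r"[^\w\s*]", " ", query)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()

    # Recommendations 1 and 2: every wildcarded term needs at least
    # MIN_LITERAL_CHARS characters that are not wildcards.
    for term in cleaned.split():
        if "*" in term and len(term.replace("*", "")) < MIN_LITERAL_CHARS:
            return None

    return cleaned or None
```

With these assumptions, a query such as a* or *.* is filtered out by the application, while comp* is passed through unchanged.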

Boolean Search:
The performance of Boolean search is a function of the number of terms and operators in the query and of the number of records associated with each term. As the number of records, terms, and operators increases, queries become more expensive. If you notice unexpected behavior while using Boolean search, start the Dgraph with the -v flag. This flag prints detailed output to stderr describing the running Boolean query process.
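
Since cost grows with the number of terms and operators, one option is to measure both in the application and decide what to do with unusually large queries. The sketch below is illustrative only: boolean_query_size is a hypothetical helper, it assumes the operators are written as AND, OR and NOT with parentheses for grouping, and it ignores any other operators your configuration may accept.

```python
import re

# Assumed operator keywords; adjust to the syntax your application exposes.
OPERATORS = {"AND", "OR", "NOT"}
PARENS = {"(", ")"}


def boolean_query_size(query):
    """Return (term_count, operator_count) for a Boolean query string."""
    # Split into parentheses and runs of non-space, non-parenthesis characters.
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    operators = sum(1 for t in tokens if t.upper() in OPERATORS)
    terms = sum(
        1 for t in tokens if t not in PARENS and t.upper() not in OPERATORS
    )
    return terms, operators
```

For example, boolean_query_size("(red OR blue) AND NOT shoes") counts 3 terms and 3 operators. Any limit applied to these counts is an application-level choice, not an MDEX Engine setting.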

Phrase Search:
The cost of a phrase search operation depends mostly on how frequently the query words appear in the data and on the number of words in the phrase. You can improve the performance of text searches that contain phrases by limiting the number of words in a phrase with the --phrase_max <num> flag for the Dgraph.
The default number is 10. If a phrase exceeds the maximum number of words, it is truncated to the maximum word count and a warning is logged.
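
To make the truncation behavior concrete, here is a small sketch that mimics it on the application side. It is not Dgraph code: truncate_phrase is a hypothetical helper, and splitting on whitespace is an assumption about how words are counted.

```python
# Mirrors the default value of --phrase_max described above.
PHRASE_MAX = 10


def truncate_phrase(phrase, max_words=PHRASE_MAX):
    """Return (phrase limited to max_words, whether it was truncated)."""
    words = phrase.split()  # assumption: words are whitespace-separated
    was_truncated = len(words) > max_words
    return " ".join(words[:max_words]), was_truncated
```

An application could use the second return value to tell the user that only the first words of a long phrase were searched.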
