Search Type: Like, Full Text or Combined?

42

23

What is the difference between the different search types?

  • Like
  • Full Text
  • Combined

I am specially interested in how the search behavior and performance changes for those settings.

PiTheNumber

Posted 2013-03-05T09:50:59.683

Reputation: 1 785

Answers

59

Everyone always complains about Magento search but I believe it can work really well if you spend time planning and configuring it properly.


Like

Keyword based search method, breaking your query into individual words. See the following from line 326 in class Mage_CatalogSearch_Model_Resource_Fulltext::prepareResult()

            $words = Mage::helper('core/string')->splitWords($queryText, true, $query->getMaxQueryWords());
            foreach ($words as $word) {
                $like[] = $helper->getCILike('s.data_index', $word, array('position' => 'any'));
            }
            if ($like) {
                $likeCond = '(' . join(' OR ', $like) . ')';
            }

You can see it splits each word in your search query and joins them together in LIKE statements - you end up with something like this:

WHERE `attribute` LIKE 'my' OR `attribute` LIKE 'search' OR `attribute` LIKE 'query'

This method might work for certain store setups where product names are simple and customers search for very specific items, but in my experience it is not a good choice.


Fulltext

Relevance based search - every search query is graded accoring to a score assigned based on MySQL's MATCH ... AGAINST query. You can see this in action in Mage_CatalogSearch_Model_Resource_Helper_Mysql4 line 44:

public function chooseFulltext($table, $alias, $select)
{
    $field = new Zend_Db_Expr('MATCH ('.$alias.'.data_index) AGAINST (:query IN BOOLEAN MODE)');
    $select->columns(array('relevance' => $field));
    return $field;
}

The database table Magento uses when performing fulltext searches is catalogsearch_fulltext. An example value:

EmCO0014e|Emma Certified|Emma Certified Organic Herbal Tonic Mist TRIAL/TRAVEL|Australian|Certified Organic|Palm Oil Free|Nut Free|Vegan Suitable|

These values are directly linked to the attributes you specify as 'Use in Quick Search' under Catalog > Attributes > Manage Attributes


Combine

Pretty self explanatory. Take a look at line 354 of Mage_CatalogSearch_Model_Resource_Fulltext:

        if ($likeCond != '' && $searchType == Mage_CatalogSearch_Model_Fulltext::SEARCH_TYPE_COMBINE) {
                $where .= ($where ? ' OR ' : '') . $likeCond;
        } elseif ($likeCond != '' && $searchType == Mage_CatalogSearch_Model_Fulltext::SEARCH_TYPE_LIKE) {
            $select->columns(array('relevance'  => new Zend_Db_Expr(0)));
            $where = $likeCond;
        }

You can see it just adds the LIKE results after FULLTEXT results have come back.


Things to note

  1. I strongly recommend using FULLTEXT for better performance and relevant results
  2. Go through each attribute and consider whether or not it is necessary to add it to the fulltext index ('Use in quick search')
  3. By default the product descriptions are included in the FULLTEXT indexing. I almost always remove the descriptions as they pollute the relevance scores with words used in irrelevant contexts.
  4. Make sure your meta attributes are used in the fulltext index. Populate them with meaningful content.
  5. MySQL FULLTEXT has some quirks - it has a list of ignored words which can be problematic if your product names are made up of these words!
  6. MySQL by default ignores FULLTEXT queries under 4 characters. Attributes with short values will be ignored unless you change this value.

You can work around points 5 & 6 by using the Combine method - the LIKE results should compensate for any words FULLTEXT ignored.

jharrison.au

Posted 2013-03-05T09:50:59.683

Reputation: 2 712

6

They're direct references to the type of query Magento will use. Personally I think Full Text search is more powerfull and performance is better, especially if LIKE is used with wildcards (%). A combination of both will probably be most accurate but might be overkill. I'd stick with the Full text.

But if you're looking for a powerfull search solution check out this project: https://code.google.com/p/magento-solr/. SOLR was build for searching large collections and while it might take some time to implement it should be worth it. Personally I've never used it in Magento before but in on other PHP projects it performed very well.

Full text documentation can be found here: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html LIKE documentation here: http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html

Sander Mangel

Posted 2013-03-05T09:50:59.683

Reputation: 33 577

Fulltext is more powerful, but needs setups in my.cnf if it's going to give results for words less than 5 or 4 characters, depending on it's default setup. Like search often is shipped broken and needs to have its logic changed from OR to AND so it doesn't spew too many irrelevant results. – Fiasco Labs – 2013-03-05T16:59:10.540

You're right about fulltext. Is always smart to either host on a VPS or with a company that offers Magento hosting. Options like the Fulltext config should be available then or at least you can set them yourself. Do you have any 3th party search suggestions @Fiasco Labs? – Sander Mangel – 2013-03-05T18:41:22.590

@FiascoLabs, how do you change the OR to AND? – Michael Yaeger – 2017-02-21T18:58:29.943

6

The "like" search will do the usual like match, using a like '%keyword%' query. One advantage of this type of search is that it will match partial words. It has serious drawbacks though:

  • will quickly become a performance issue
  • relevance is bad. There is actually no concept of "relevance" in like queries

Fulltext search will work using the MyISAM fulltext search (http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html). You should read about it, but in a nutshell:

  • performance is better
  • it will exclude stopwords (like "and", "with" etc)
  • It will assign a score by default to each result based on relevance

The drawback of fulltext is that it cannot do partial matches, i.e. a search for "pho" will not find "phone"

The "combined" search will use a "like" condition to match the results but will also take into consideration the fulltext search score for sorting them. This means that you will get more results (as like search will do partial matches too) and they will also be ordered better because of the fulltext score.

As other people said, if you are serious about search you should be using Solr. It is way faster and a lot more relevant. You would have to own Magento EE and install Solr yourself though.

Paul Grigoruta

Posted 2013-03-05T09:50:59.683

Reputation: 1 112

1Thanks for this answer. Why would I need Magento EE? – PiTheNumber – 2013-03-06T08:31:51.477

1

The Solr search is part of Magento Enterprise (not a feature of community). There are extensions that will implement search in community like http://www.magentocommerce.com/magento-connect/solr-bridge-search.html. I did not used them though.

– Paul Grigoruta – 2013-03-10T18:13:00.220

3

Problem with LIKE is it uses "%term%" by default. I have changed it to match " term%" (note the space before the term), so that it matches the beginnings of words. I've also chopped off the final 's' in a search term so that "cars" gives the same results as "car". Obviously that doesn't help with irregular nouns like "children," but it is a vast improvement anyway.

The biggest inexplicably nonsensical move by Magento is to use "OR" search instead of "AND". If you search "red cars" you will get everything red (including cars, dogs, forks, mountainsides, etc), and every kind of car (including red, blue, green, yellow, etc). With "AND", you get only items that contain "red" AND "car", which is how search should work!

Quoting from jharrison.au's answer, change this:

if ($like) {
                $likeCond = '(' . join(' OR ', $like) . ')';
            }

To this:

if ($like) {
                $likeCond = '(' . join(' AND ', $like) . ')';
            }

To get an immediate, massive boost to the relevance of your search results.

For the plural thing, you can chop off the final "s" of a word like this:

$words = Mage::helper('core/string')->splitWords($queryText, true, $query->getMaxQueryWords());
$words = array_walk($words,function(&$value, &$key) { 
    // use substr(...) instead of rtrim($value,'s') 
    // because rtrim will remove multiple esses
    $value = (substr($value,-1,1) === 's') ? substr($value,0,strlen($value - 1)) : $value;
});
foreach ($words as $word) {
       $like[] = $helper->getCILike('s.data_index', $word, array('position' => 'start')); // note I changed this to 'start'
}

In app/code/local/Mage/Core/Model/Resource/Helper/Abstract.php you can override the core file and change the public function escapeLikeValue($value, $options = array()) as a quick shortcut, although the advised way is to do the same thing in a module. But here it is.

Under if (isset($options['position'])) { there is a switch. I changed the cases for 'start' and 'end' to add spaces before and after the value:

 case 'start':
      $value = '% ' . $value . '%'; // added '% ' . before
      // $value = $value . '%'; // core way (bad way)
      break;
 case 'end':
      $value = '%' . $value . ' %'; // added . ' %' after
      // $value = '%' . $value; // core way (bad way)
      break;

For the spaces before/after to work, you probably also have to change the way the search index is built, as I did, so that it makes sure there is a space before and after every word. The default way to build the index is to separate each field by a '|' (pipe character), e.g. "blue|car|a very nice car" for indexing color, product type, product description. But my index has " blue | car | a very nice car ". You can even modify the building of the search index so that perhaps hyphenated words are replaced ("super-fast car" becomes " super fast car "), etc., etc.

Borrowing an examplen from jharrison.au's answer, where a default search index field would look like this:

EmCO0014e|Emma Certified|Emma Certified Organic Herbal Tonic Mist TRIAL/TRAVEL|Australian|Certified Organic|Palm Oil Free|Nut Free|Vegan Suitable|

Mine would look like this:

 EmCO0014e | Emma Certified | Emma Certified Organic Herbal Tonic Mist TRIAL / TRAVEL | Australian | Certified Organic | Palm Oil Free | Nut Free | Vegan Suitable | 

(note spaces before and after every "|" and "/", and a space before the very first word )

Buttle Butkus

Posted 2013-03-05T09:50:59.683

Reputation: 798

1You've mentioned to override app/code/local/Mage/Core/Model/Resource/Helper/Abstract.php. This file is not used anywhere other than the search functionality? – amitshree – 2015-12-04T11:46:11.753

1@amitshree Good question. I don't know off the top of my head. It could be used by any model resource for search results, I suppose. But this is really specifically for escaping LIKE in SQL search queries, and where else does Magento search but in the product search index? I made this change over 2 years ago in a production site and we've not found any bugs related to this at all. All it really does is make it so that a "start of word" must have a space before it, and "end of word" must have a space after. Separately, in the index, I make sure every word has spaces around it. – Buttle Butkus – 2015-12-04T21:28:00.330

2

None of the above, use the built-in Zend Lucene search engine by installing something like Blast Lucene Search or Extendeware Lucene Search. The relevance beats any of the MySQL offerings.

Yeah, I went through all the iterations on the accepted answer, but frankly, Optimized Stock Magento search still was highly lacking.

Lucene on the other hand, delivers and is already included in the Magento install, it just needs a Module to enable it.

Fiasco Labs

Posted 2013-03-05T09:50:59.683

Reputation: 5 998