Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 8 of 8
  1. #1
    New to the CF scene
    Join Date
    May 2011
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Fulltext - MATCH...AGAINST will match some words but not others...

    Hey everyone,

    I am having a little trouble with full text search within MySQL, whereby some words will return results, however others will not.

    Lets say I have the following table:

    Code:
    Row    | Title                        | Content
    ---------------------------------------------------------------------------------
    1        | Help topic 1              | Blah blah blahdity blah
    ---------------------------------------------------------------------------------
    2        | Another help topic      | Some other data here
    My query essentially looks like this:

    SELECT * FROM `table_name` WHERE (MATCH (Title) AGAINST ('search_text*' IN BOOLEAN MODE))

    The issue that I am having is that if enter 'topic' as my "search_text", both rows are returned, which is to be expected. If I use 'another' then an empty result is returned.

    After some reading, I found some information regarding full text that specifies words in more than 50% of the results are dropped from the set. This is not relevant in my case, as I am using boolean mode, however the table has more than 5 results, and the results are the same.

    I don't understand why this is, as it meets the min query length, and stop words should be ignored as I am using the IN BOOLEAN MODE...

    Could anyone shed any light as to why this is??

    Cheers
    Last edited by juzzbott; 05-06-2011 at 12:09 AM.

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    26,561
    Thanks
    80
    Thanked 4,496 Times in 4,460 Posts
    Ummm...I think you misread the documentation.

    http://dev.mysql.com/doc/refman/5.5/...t-boolean.html
    Boolean full-text searches have these characteristics:
    * They do not use the 50% threshold.
    * They do not automatically sort rows in order of decreasing relevance. You can see this from the preceding query result: The row with the highest relevance is the one that contains “MySQL” twice, but it is listed last, not first.
    * They can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow.
    * The minimum and maximum word length full-text parameters apply.
    * The stopword list applies
    Yes, "another" is in the stopword list.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    juzzbott (05-06-2011)

  • #3
    New to the CF scene
    Join Date
    May 2011
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Old Pedant View Post
    Ummm...I think you misread the documentation.

    http://dev.mysql.com/doc/refman/5.5/...t-boolean.html


    Yes, "another" is in the stopword list.
    Hi Old Pedant,

    Thanks for clarifying that for me, that would certainly explain it.

    Do you perhaps know of anyway to ignore the stopword list when executing a list. After some further reading of that page, it seems that appending the wildcard character (*), should ignore stopwords anyway:
    If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short (as determined from the ft_min_word_len setting) or a stopword. This occurs because the word is not seen as too short or a stopword, but as a prefix that must be present in the document in the form of a word that begins with the prefix.
    Unless it is matching 'another' as a whole word before applying the * wildcard??

    Cheers

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    26,561
    Thanks
    80
    Thanked 4,496 Times in 4,460 Posts
    Now that I don't know. But why not just try it?
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #5
    New to the CF scene
    Join Date
    May 2011
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Old Pedant View Post
    Now that I don't know. But why not just try it?
    I have, which I why am having trouble understanding it.

    Searching for topi* or topic* returns results, however hel*, help*, anoth* or another* does not. So it seems very flaky in the interpretation of the stopword list and the wildcard operator...

    Cheers

  • #6
    New to the CF scene
    Join Date
    May 2011
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts
    I found this page which lets me change the stopword file, which resolved my issue.
    http://w3guru.blogspot.com/2007/05/e...word-list.html

    The only problem is that it's a server wide configuration option, so anything on a shared environment cannot be fixed.

    While I agree with the idea behind this stopword list, and can definitely see it's uses, I think it's pretty bad that there is not a query option to ignore it.

  • #7
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    26,561
    Thanks
    80
    Thanked 4,496 Times in 4,460 Posts
    Quote Originally Posted by juzzbott View Post
    While I agree with the idea behind this stopword list, and can definitely see it's uses, I think it's pretty bad that there is not a query option to ignore it.
    Well, for the reason that things are as are they are, look here:
    http://www.codingforums.com/showpost...1&postcount=10

    Aren't you glad you aren't paying $20,000 per year to use MySQL? <grin/>
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #8
    New to the CF scene
    Join Date
    Jul 2011
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts
    MySQL's full-text search capability has few user-tunable parameters. Take them in order.
    But may be you just need to reinstall it .


  •  

    Tags for this Thread

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •