  • #1
    Regular Coder
    Join Date
    Sep 2006
    Thanked 1 Time in 1 Post

    Filtering words in search

    Okay, so I have a search script that searches my database using MySQL's "LIKE". I want to filter out words such as "a, the, is, as, I". I can't think of them all off the top of my head and was wondering if anyone has a list lying around somewhere?

  • #2
    New Coder
    Join Date
    Oct 2006
    Thanked 0 Times in 0 Posts
    $string = str_replace('badword', 'XXXXXXX', $string);


  • #3
    Regular Coder ralph l mayo
    Join Date
    Nov 2005
    Thanked 31 Times in 29 Posts
    Not directly an answer to the question at hand, but searching with LIKE does not scale, and you should at least take a look at fulltext indexing. Not only is it faster, but it also provides relevancy data about matches, and it handles the type of stuff you're talking about (stopwords) automatically by ignoring words that appear in too many rows of the indexed table.
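
    For anyone who wants to try the fulltext route, here is a minimal sketch of what such a query could look like from PHP. The table `articles`, its columns `id`, `title` and `body`, the index name and the function `searchArticles` are all made up for illustration, and it assumes a PDO connection to MySQL with a FULLTEXT index already created on the searched columns.

```php
<?php
// Hypothetical schema: table `articles` with columns `id`, `title`, `body`.
// One-time setup, run once against the database (names are assumptions):
//   ALTER TABLE articles ADD FULLTEXT INDEX ft_title_body (title, body);

// Runs a natural-language fulltext search and returns rows ordered by relevance.
function searchArticles(PDO $db, string $terms): array
{
    // The column list in MATCH() must be the same as the FULLTEXT index columns.
    // Two separate placeholders are used because a named placeholder cannot
    // always be reused within one prepared statement.
    $sql = 'SELECT id, title,
                   MATCH(title, body) AGAINST(:terms_score) AS relevance
            FROM articles
            WHERE MATCH(title, body) AGAINST(:terms_filter)
            ORDER BY relevance DESC
            LIMIT 20';

    $stmt = $db->prepare($sql);
    $stmt->execute([
        ':terms_score'  => $terms,
        ':terms_filter' => $terms,
    ]);

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

    The `relevance` column is the relevancy score mentioned above. Very short words and stopwords in the search string are skipped by MySQL itself, so no manual filtering is needed with this approach.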


    edit: from swish-e:

    a above according across actually adj after
    afterwards again against all almost alone along
    already also although always among amongst an and
    another any anyhow anyone anything anywhere are aren
    aren't around as at be became because become becomes
    becoming been before beforehand begin beginning behind
    being below beside besides between beyond billion both
    but by can can't cannot caption co could couldn
    couldn't did didn didn't do does doesn doesn't don
    don't down during each eg eight eighty either else
    elsewhere end ending enough etc even ever every
    everyone everything everywhere except few fifty first
    five for former formerly forty found four from
    further had has hasn hasn't have haven haven't
    he hence her here hereafter hereby herein hereupon
    hers herself him himself his how however hundred
    ie i.e. if in inc inc. indeed instead into is
    isn isn't it its itself last later latter latterly
    least less let like likely ll ltd made make
    makes many maybe me meantime meanwhile might million
    miss more moreover most mostly mr mrs much must
    my myself namely neither never nevertheless next nine
    ninety no nobody none nonetheless noone nor not
    nothing now nowhere of off often on once one
    only onto or others otherwise our ours
    ourselves out over overall own per perhaps rather
    re recent recently same seem seemed seeming seems
    seven seventy several she should shouldn shouldn't
    since six sixty so some somehow someone something
    sometime sometimes somewhere still stop such taking
    ten than that the their them themselves then
    thence there thereafter thereby therefore therein
    thereupon these they thirty this those though
    thousand three through throughout thru thus to
    together too toward towards trillion twenty two under
    unless unlike unlikely until up upon us used using
    ve very via was wasn we well were weren
    weren't what whatever when whence whenever where
    whereafter whereas whereby wherein whereupon wherever
    whether which while whither who whoever whole whom
    whomever whose why will with within without won
    would wouldn wouldn't yes yet you your yours
    yourself yourselves
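
    If you stay with LIKE for now, a list like the one above can be applied to the search string before the query is built. This is only a rough sketch: the function name `removeStopwords` is made up, and `$stopwords` holds just a handful of entries from the list to keep the example short.

```php
<?php
// Abbreviated stopword list for illustration; in practice, load the full list.
$stopwords = ['a', 'an', 'and', 'as', 'i', 'is', 'of', 'the', 'to'];

// Lowercases the query, splits it on whitespace and drops every stopword.
// Returns the remaining keywords in their original order.
function removeStopwords(string $query, array $stopwords): array
{
    // Flip the list so each lookup is a hash check instead of a linear scan.
    $lookup = array_flip($stopwords);

    $words = preg_split('/\s+/', strtolower(trim($query)), -1, PREG_SPLIT_NO_EMPTY);

    $kept = [];
    foreach ($words as $word) {
        if (!isset($lookup[$word])) {
            $kept[] = $word;
        }
    }

    return $kept;
}

$keywords = removeStopwords('The history of the Roman Empire', $stopwords);
// $keywords is now ['history', 'roman', 'empire']
```

    Each remaining keyword can then go into its own LIKE condition as '%keyword%', ideally passed as a bound parameter rather than concatenated into the SQL string.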
    Last edited by ralph l mayo; 01-03-2007 at 09:39 PM.

