The UHF of the film world.
Latest news

quietearth [General News 10.19.06]



For those framiliar with mysql, it offers the ability to search by regular expressions (posix style). I needed an advanced way of searching in php, and my backend was mysql, so this was the logical choice. Problem is, how do I build a whole mysql query based on the input? Here's the type of queries I wanted to be able to process:

1. exact word matches
2. sub-string matches (I was doing this with like "%WORD%")
3. exclude via sub-string match
4. exclude via exact word match


We could do more complex searches with the regexp functions, but this is pretty easy for now. Here is the mysql reference on regular expression queries for further reading.

A simple regexp query looks like:
select * from TABLE where ROW regexp '[[:<:]]bla[[:>:]]' and ROW regexp 'foo';
This will look for an exact match of the string "bla", meaning not as a sub-string, and then match the sub-string "foo" somewhere.

So first off, items 1 and 4 are exact word matches and I want to be able to do this by surrounding the word with quotes. Let's set our necessary variables and then do a match on quotes:

$newq = $query; # $query is the raw query string
$qlevel = 0;
$curquery = "select * from TABLE where "; # the beginning of the query
$doneg = 0;

preg_match_all("/\"([^\"]*)\"/i", $query, $m);
$c = count($m[0]);
for ($i = 0; $i < $c; $i++) {
     $temp = $m[1][$i]; # $temp is whats inside the quotes


Then I want to be able to exclude words, and the user should be able to do this by starting the word with a dash (-), and for exact word matches this has to be inside the quotes. The second match is to get rid of the - in front of the query.

     if (ereg("^-", $temp)) {
          $pc = preg_match("/-([^-]*)/i", $m[1][$i], $dm);
          if ($pc) {
               $temp = $dm[1];
               }
          $doneg++;
          }


Now we will set $temp to the posix compliant exact match, then build this part of the mysql query.

     $temp = "[[:<:]]".$temp."[[:>:]]";
     if ($qlevel) $curquery .= "and "; # are we nested?
     $curquery .= "ROW "; # the mysql row we are searching in
     if ($doneg) $curquery .= "not "; # if dash in front, do not
     $curquery .= "regexp ".quote_smart($temp)." ";
     $qlevel++;
     $doneg = 0;
     $newq = ereg_replace($m[0][$i], "", $newq);
     }


The variable $newq has the rest of the search string, minus everything in quotes, so whatever remains are sub-string search items falling under 2 and 3. Now we can go through what is left and basically do the same thing as above.

$s = preg_split("/\s+/", $newq, -1, PREG_SPLIT_NO_EMPTY); #whitespaces
for ($i = 0; $i < count($s); $i++) {
     if (ereg("^-", $s[$i])) { # exclude
          sscanf($s[$i], "-%s", $temp); # this is poor
          $s[$i] = $temp;
          $doneg++;
          }
     if ($qlevel) $curquery .= "and ";
     $curquery .= "ROW "; # the mysql row we are searching in
     if ($doneg) $curquery .= "not ";
     $curquery .= "regexp ".quote_smart($s[$i])." ";
     $qlevel++;
     $doneg = 0;
     }
# use $curquery here in database


The variable $curquery now contains our built mysql query. You will notice the use of quote_smart in here, this is a mysql best practice from php.net. It's the only mention of security anywhere in this code. You will need to run your own checking against the input to make sure there are no bad characters, mine only allows alpha-numerics and a few others. DO NOT use this code as is without first fixing that.

You might also like

avatar

Nanda Kumar (10 years ago) Reply

i want to find the exact word in a given string in MySql

avatar

quietearth (10 years ago) Reply

use the "posix compliant exact match" code I have listed above.

avatar

Anes (10 years ago) Reply

Friends I need an "Advanced Search Query" as we see in Advance search of Bharatmatrimony.com , jeevansathi.com etc...
if u have similar programs send me ,in my id i am really thankful to u.

avatar

Anonymous (9 years ago) Reply

A query of -"something" gives us:

select * from TABLE where ROW regexp '[[:<:]]something[[:>:]]' and ROW not regexp '[[:<:]]something[[:>:]]'

Clearly not a desirable result.

avatar

Nauman (9 years ago) Reply

http://devzone.zend.com/article/1304-Using-MySQL-Full-text-Searching

this link is for simple searching and fulltext searching

avatar

zahid (8 years ago) Reply

such a good topic I will use in my site

http://www.zahipedia.com

avatar

Unni Krishnan (7 years ago) Reply

"Friends I need an "Advanced Search Query" as we see in Advance search of Bharatmatrimony.com , jeevansathi.com etc..."

You can check this post, http://blog.webtechies.in/2009/11/29/advanced-search-with-multiple-entries-using-php-mysql/

Hope this may helped someone...

avatar

Search Engenie (7 years ago) Reply

Wouldn't it be more frugal to use Regex search in a FULLTEXT search capacity?

Description: http://en.wikipedia.org/wiki/Full_text_search
Query & Table example: http://jeremy.zawodny.com/blog/archives/000576.html


Leave a comment