tsearch2 and hyphenated terms

by reece@[EMAIL PROTECTED] (Reece Hart) Apr 10, 2008 at 10:17 PM

I'd like to use tsearch2 to index protein and gene names. Unfortunately,
such names are written inconsistently, sometimes with hyphens and
sometimes without. For example, MCL-1 and MCL1 are semantically
equivalent, but with the default parser, to_tsvector gives me this:

        unison@[EMAIL PROTECTED]> select to_tsvector('MCL1 MCL-1');
               to_tsvector
        -------------------------
         '-1':3 'mcl':2 'mcl1':1

For the purposes of indexing these names, I suspect I'd get the majority
of cases by removing a hyphen when it's followed by 1 or 2 chars from
[a-zA-Z0-9]. Does that require a custom parser?
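
One way to try the rule out without touching the parser is to normalize the
text before it ever reaches to_tsvector. The following is only a minimal
sketch in Python, not a tsearch2 feature: the function name
strip_short_suffix_hyphens is hypothetical, and the extra requirement that
the hyphen be preceded by an alphanumeric character is an assumption added
here so that a free-standing "-1" is left alone.

```python
import re

# A hyphen that sits between alphanumerics and is followed by exactly
# 1 or 2 characters from [A-Za-z0-9] before the token ends.
# Assumption (not in the original post): the hyphen must also be
# preceded by an alphanumeric, so a bare "-1" is not rewritten.
_SHORT_SUFFIX_HYPHEN = re.compile(
    r"(?<=[A-Za-z0-9])-(?=[A-Za-z0-9]{1,2}(?![A-Za-z0-9]))"
)


def strip_short_suffix_hyphens(text: str) -> str:
    """Drop hyphens that introduce a 1- or 2-character alphanumeric suffix."""
    return _SHORT_SUFFIX_HYPHEN.sub("", text)


if __name__ == "__main__":
    for sample in ("MCL1 MCL-1", "IL-2 receptor", "well-known", "HLA-DRB1"):
        print(repr(sample), "->", repr(strip_short_suffix_hyphens(sample)))
```

Under this sketch MCL-1 and MCL1 both become MCL1 before indexing, while
longer hyphenated words such as well-known pass through unchanged. The same
normalization would have to be applied to query strings as well, otherwise
a search for MCL-1 would still be split by the default parser.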

Thanks,
Reece

-- 
Reece Hart, http://harts.net/reece/,
GPG:0x25EC91A0


