
Re: [GENERAL] Fragments in tsearch2 headline

by teodor@[EMAIL PROTECTED] (Teodor Sigaev) May 24, 2008 at 07:57 AM

[moved to -hackers, because the discussion is about implementation details]

> I've ****ted the patch of Sushant Sinha for fragmented headlines to pg8.3.1
> (http://archives.postgresql.org/pgsql-general/2007-11/msg00508.php)
Thank you.

1 > diff -Nrub postgresql-8.3.1-orig/contrib/tsearch2/tsearch2.c
contrib/tsearch2 is now a compatibility layer for old applications, which don't
know about new features. So this part isn't needed.

2 Compiling the function (ts_headline_with_fragments) into core but using it
only from the contrib module looks very odd: the new feature could then be used
only through the compatibility layer for old releases :)

3 headline_with_fragments() is hardcoded to use the default parser, but what
happens when the configuration uses another parser, for example one for the
Japanese language?

4 I would prefer the signature ts_headline( [regconfig,] text, tsquery [,text] ),
and the function should accept 'NumFragments=>N' for the default parser. Other
parsers may use other options.

5 It just doesn't work correctly, because the new code doesn't take care of the
parser-specific types of lexemes:
contrib_regression=# select headline_with_fragments('english', 'wow asd-wow wow', 'asd', '');
      headline_with_fragments
----------------------------------
  ...wow asd-wow<b>asd</b>-wow wow
(1 row)
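
The duplicated "asd-wow" in that output is consistent with how the default
parser reports a hyphenated word: once as a whole and then again part by part.
The following is only a sketch in Python, not the patch's C code. The token
list is written by hand to imitate the default parser's output, and both
glue_* functions and the COMPOUND_TYPES set are hypothetical names introduced
for illustration.

```python
# Hand-written (type, text) pairs imitating what the default parser emits
# for 'wow asd-wow wow'. This list is an assumption for illustration.
TOKENS = [
    ("asciiword", "wow"),
    ("blank", " "),
    ("asciihword", "asd-wow"),    # the whole hyphenated word
    ("hword_asciipart", "asd"),   # ... followed by its parts
    ("blank", "-"),
    ("hword_asciipart", "wow"),
    ("blank", " "),
    ("asciiword", "wow"),
]

# Token types that repeat text also covered by the part tokens (assumed set).
COMPOUND_TYPES = {"asciihword", "hword", "numhword"}


def mark(text, query_words):
    """Wrap a token in <b>...</b> if it matches a query word."""
    return f"<b>{text}</b>" if text in query_words else text


def glue_naive(tokens, query_words):
    """Concatenate every token, ignoring its type."""
    return "".join(mark(text, query_words) for _type, text in tokens)


def glue_type_aware(tokens, query_words):
    """Skip compound tokens so their text is not emitted twice."""
    return "".join(
        mark(text, query_words)
        for tok_type, text in tokens
        if tok_type not in COMPOUND_TYPES
    )


print(glue_naive(TOKENS, {"asd"}))       # wow asd-wow<b>asd</b>-wow wow
print(glue_type_aware(TOKENS, {"asd"}))  # wow <b>asd</b>-wow wow
```

The first result reproduces the broken headline (without the leading "...");
the second shows what knowledge of the token types buys.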


So I am inclined to use the existing framework/infrastructure, although it may
be subject to change.

Some description:
1 ts_headline determines the correct parser to use
2 it calls hlparsetext to split the text into a structure suitable for both
goals: finding the best fragment(s) and concatenating those fragment(s) back
into the text representation
3 it calls the parser-specific method prsheadline, which works with the
preparsed text (parsing was done in hlparsetext). The method should mark the
needed words/parts/lexemes etc.
4 ts_headline glues the fragments into text and returns that.
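
A minimal sketch of these four steps, in Python rather than the C of the real
code. The function names follow the description above, but their bodies, the
Word structure, the PARSERS and CONFIG_TO_PARSER tables and the "Span" option
are assumptions made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    is_word: bool              # False for spaces and punctuation
    selected: bool = False     # set by the parser's headline method
    in_fragment: bool = False  # set by the parser's headline method


def hlparsetext(text):
    """Step 2: split text into pieces that can be glued back losslessly."""
    return [
        Word(piece, re.match(r"\w", piece) is not None)
        for piece in re.findall(r"\w+|\W+", text)
    ]


def default_prsheadline(words, query_words, options):
    """Step 3: mark hits and the tokens around them (hypothetical 'Span' option)."""
    span = int(options.get("Span", 2))
    for i, word in enumerate(words):
        if word.is_word and word.text.lower() in query_words:
            word.selected = True
            for j in range(max(0, i - span), min(len(words), i + span + 1)):
                words[j].in_fragment = True


# Step 1 lookup tables: configuration -> parser -> headline method (assumed).
CONFIG_TO_PARSER = {"english": "default"}
PARSERS = {"default": default_prsheadline}


def glue(words):
    """Step 4: turn the marked tokens back into text."""
    return "".join(
        f"<b>{w.text}</b>" if w.selected else w.text
        for w in words
        if w.in_fragment
    )


def ts_headline(config, text, query_words, options=None):
    prsheadline = PARSERS[CONFIG_TO_PARSER[config]]  # step 1
    words = hlparsetext(text)                        # step 2
    prsheadline(words, query_words, options or {})   # step 3
    return glue(words)                               # step 4


print(ts_headline("english", "one two three four five six seven", {"four"}))
# three <b>four</b> five
```

Only default_prsheadline decides what gets marked, so another parser can plug
in its own method and its own options without touching the other steps.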

We need a parser's headline method because only the parser knows all about its
lexemes.


-- 
Teodor Sigaev                                   E-mail: teodor@[EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@[EMAIL PROTECTED])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
 



