Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Data Bases > Pgsql Admin > Re: duplicate p...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 4 of 17 Topic 4908 of 5419
Post > Topic >>

Re: duplicate primary index in bayes db from SpamAssassin

by michael.monnerie@[EMAIL PROTECTED] (Michael Monnerie) Apr 15, 2008 at 08:26 AM

--nextPart18600185.XPXWiJ6995
Content-Type: text/plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Montag, 14. April 2008 Tom Lane wrote:
> What PG version is this?

I've looked a little more into history, and PostgreSQL was 8.1.5=20
originally from CD, then 8.1.9 and now 8.1.11.

I'm currently investigating a way to delete duplicates from the table.=20
Is there any chance to do that? A=20
SELECT * from bayes_token;
shows no dups, as there's a primary index... at least it works here.

This is the bayes_token table and its index:
CREATE TABLE bayes_token (
    id integer DEFAULT 0 NOT NULL,
    token bytea DEFAULT ''::bytea NOT NULL,
    spam_count integer DEFAULT 0 NOT NULL,
    ham_count integer DEFAULT 0 NOT NULL,
    atime integer DEFAULT 0 NOT NULL
);
ALTER TABLE ONLY bayes_token
    ADD CONSTRAINT bayes_token_pkey PRIMARY KEY (id, token);

Could I create a new db without indices, restore the data there, and=20
then run a "delete from.. where duplicates"? Sometimes there's 3x the=20
same content in the primary index, but I'd need to delete all except=20
the one with the highest spam_count.

I just found I even have a duplicate in bayes_vars:

COPY bayes_vars (id, username, spam_count, ham_count, token_count,=20
last_expire, last_atime_delta, last_expire_reduce, oldest_token_age,=20
newest_token_age) FROM stdin;
1       vscan   194393  517531  2602114 1206030039      2764800 15304  =20
1203265204      1206057801
1       vscan   194398  517535  2602799 1206030039      2764800 15304  =20
1203265204      1206064729
4       vscan   18305   25403   2042983 1208143427      1382400 13268  =20
1206501543      1208216540
\.

Definition:
CREATE TABLE bayes_vars (
    id serial NOT NULL,
    username character varying(200) DEFAULT ''::character varying NOT=20
NULL,
    spam_count integer DEFAULT 0 NOT NULL,
    ham_count integer DEFAULT 0 NOT NULL,
    token_count integer DEFAULT 0 NOT NULL,
    last_expire integer DEFAULT 0 NOT NULL,
    last_atime_delta integer DEFAULT 0 NOT NULL,
    last_expire_reduce integer DEFAULT 0 NOT NULL,
    oldest_token_age integer DEFAULT 2147483647 NOT NULL,
    newest_token_age integer DEFAULT 0 NOT NULL
);
ALTER TABLE ONLY bayes_vars
    ADD CONSTRAINT bayes_vars_pkey PRIMARY KEY (id);

But here, I could see the dups with SELECT, and even delete one record:

# select * from bayes_vars;
 id | username | spam_count | ham_count | token_count | last_expire |=20
last_atime_delta | last_expire_reduce | oldest_token_age |=20
newest_token_age
=2D---+----------+------------+-----------+-------------+-------------+----=
=2D-------------+--------------------+------------------+------------------
  1 | vscan    |     194393 |    517531 |     2602114 |  1206030039 |     
=
   =20
2764800 |              15304 |       1203265204 |       1206057801
  1 | vscan    |     194398 |    517535 |     2602799 |  1206030039 |     
=
   =20
2764800 |              15304 |       1203265204 |       1206064729
  4 | vscan    |      18375 |     25828 |     2050196 |  1208229525 |     
=
   =20
1382400 |              13268 |       1206501543 |       1208240610
(3 Zeilen)

bayes_pg_v1=3D# delete from bayes_vars where spam_count =3D194393;
DELETE 1
bayes_pg_v1=3D# select * from bayes_vars;
 id | username | spam_count | ham_count | token_count | last_expire |=20
last_atime_delta | last_expire_reduce | oldest_token_age |=20
newest_token_age
=2D---+----------+------------+-----------+-------------+-------------+----=
=2D-------------+--------------------+------------------+------------------
  1 | vscan    |     194398 |    517535 |     2602799 |  1206030039 |     
=
   =20
2764800 |              15304 |       1203265204 |       1206064729
  4 | vscan    |      18375 |     25829 |     2050215 |  1208229525 |     
=
   =20
1382400 |              13268 |       1206501543 |       1208240637
(2 Zeilen)

mfg zmi
=2D-=20
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0676/846 914 666                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc
| gpg --im****t"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: www.keyserver.net                   Key-ID: 1C1209B4

--nextPart18600185.XPXWiJ6995
Content-Type: application/pgp-signature; name=signature.asc 
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4-svn0 (GNU/Linux)

iD8DBQBIBEqdzhSR9xwSCbQRAsfeAJ4y1ElbYuiIy5e6VWJK55+VBLl2+QCfb6CB
gXGPhHm6GpycgFubsT2I+hQ=
=Hi+r
-----END PGP SIGNATURE-----

--nextPart18600185.XPXWiJ6995--
 




 17 Posts in Topic:
duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-14 17:57:11 
Re: duplicate primary index in bayes db from SpamAssassin
tgl@[EMAIL PROTECTED] (T  2008-04-14 12:00:01 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-14 21:31:10 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-15 08:26:31 
Re: duplicate primary index in bayes db from SpamAssassin
alvherre@[EMAIL PROTECTED  2008-04-14 12:03:25 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-14 22:10:42 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-15 20:27:30 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-16 01:56:39 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-17 09:45:50 
Re: duplicate primary index in bayes db from SpamAssassin
tgl@[EMAIL PROTECTED] (T  2008-04-17 11:41:31 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-18 01:20:24 
Re: duplicate primary index in bayes db from SpamAssassin
tgl@[EMAIL PROTECTED] (T  2008-04-17 20:15:23 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-04-18 01:28:14 
Re: duplicate primary index in bayes db from SpamAssassin
tgl@[EMAIL PROTECTED] (T  2008-04-30 16:25:49 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-05-07 11:03:53 
Re: duplicate primary index in bayes db from SpamAssassin
tgl@[EMAIL PROTECTED] (T  2008-05-07 10:32:54 
Re: duplicate primary index in bayes db from SpamAssassin
michael.monnerie@[EMAIL P  2008-05-07 16:42:38 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Tue Oct 14 10:23:21 CDT 2008.