Removing NONSEG mode

by Zdenek.Kotala@[EMAIL PROTECTED] (Zdenek Kotala) Apr 22, 2008 at 01:24 PM

The attached patch removes support for non-segmented mode, as discussed
during the last commit fest. Non-segmented mode is usable on only a couple
of filesystems (ZFS, XFS), and it is not safe on any OS, because every OS
supports more filesystems than those.

I added a segment-size option (--with-segsize) to the configure script so
that it is easy to compile with a different segment size (on most
filesystems, 1 TB is a safe value). As a bonus, I also added BLCKSZ to the
configure script (--with-blocksize). It is not important for this patch,
but it could be useful, e.g. for buildfarm testing with different BLCKSZ
values.

The patch requires running autoconf and autoheader.

		Zdenek

PS: --with-segsize=1/1024 sets the segment size to 1 MB, which is good for testing.




Index: configure.in
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/configure.in,v
retrieving revision 1.555
diff -c -r1.555 configure.in
*** configure.in	30 Mar 2008 04:08:14 -0000	1.555
--- configure.in	21 Apr 2008 15:19:59 -0000
***************
*** 220,233 ****
  #
  # Data file segmentation
  #
! PGAC_ARG_BOOL(enable, segmented-files, yes,
!               [  --disable-segmented-files disable data file segmentation (requires largefile support)])
  
  #
  # C compiler
  #
  
  # For historical reasons you can also use --with-CC to specify the C compiler
  # to use, although the standard way to do this is to set the CC environment
  # variable.
  PGAC_ARG_REQ(with, CC, [], [CC=$with_CC])
--- 220,287 ----
  #
  # Data file segmentation
  #
! AC_MSG_CHECKING([for default relation segment size])
! PGAC_ARG_REQ(with, segsize, [  --with-segsize=RELSEG_SIZE  change default relation segment size in GB [[1]]],
!              [default_segsize=$withval],
!              [default_segsize=1])
! AC_MSG_RESULT([${default_segsize}GB])
! AC_DEFINE_UNQUOTED([RELSEG_SIZE], 1024*1024*1024LL*${default_segsize}/BLCKSZ, [
!  RELSEG_SIZE is the maximum number of blocks allowed in one disk
!  file. Thus, the maximum size of a single file is RELSEG_SIZE * BLCKSZ;
!  relations bigger than that are divided into multiple files.
!  
!  RELSEG_SIZE * BLCKSZ must be less than your OS' limit on file size.
!  This is often 2 GB or 4GB in a 32-bit operating system, unless you
!  have large file support enabled.  By default, we make the limit 1
!  GB to avoid any possible integer-overflow problems within the OS.
!  A limit smaller than necessary only means we divide a large
!  relation into more chunks than necessary, so it seems best to err
!  in the direction of a small limit.  (Besides, a power-of-2 value
!  saves a few cycles in md.c.)
  
+  Changing RELSEG_SIZE requires an initdb.
+ ])
+ AC_SUBST(default_segsize)
+ 
+ #
+ # Block size
  #
+ AC_MSG_CHECKING([for default block size])
+ PGAC_ARG_REQ(with, blocksize, [  --with-blocksize=BLCKSZ change default block size (1,2,4,8,16,32 are allowed values). [[8]]],
+              [default_blocksize=$withval],
+              [default_blocksize=8])
+ case ${default_blocksize} in
+   1) default_blocksize=1024;;
+   2) default_blocksize=2048;;
+   4) default_blocksize=4096;;
+   8) default_blocksize=8192;;
+  16) default_blocksize=16384;;
+  32) default_blocksize=32768;;
+   *) AC_MSG_ERROR([Invalid block size. Allowed values are 1,2,4,8,16,32.])
+ esac
+ 
+ AC_MSG_RESULT([${default_blocksize}B])
+ AC_DEFINE_UNQUOTED([BLCKSZ], ${default_blocksize}, [
+  Size of a disk block --- this also limits the size of a tuple.  You
+  can set it bigger if you need bigger tuples (although TOAST should
+  reduce the need to have large tuples, since fields can be spread
+  across multiple tuples).
+  
+  BLCKSZ must be a power of 2.  The maximum possible value of BLCKSZ
+  is currently 2^15 (32768).  This is determined by the 15-bit widths
+  of the lp_off and lp_len fields in ItemIdData (see
+  include/storage/itemid.h).
+  
+  Changing BLCKSZ requires an initdb.
+ ]) 
+ AC_SUBST(default_blocksize)
+ 
+ 
  # C compiler
  #
  
  # For historical reasons you can also use --with-CC to specify the C compiler
+ 
  # to use, although the standard way to do this is to set the CC environment
  # variable.
  PGAC_ARG_REQ(with, CC, [], [CC=$with_CC])
***************
*** 1435,1443 ****
  
  # Check for largefile support (must be after AC_SYS_LARGEFILE)
  AC_CHECK_SIZEOF([off_t])
! 
! if test "$ac_cv_sizeof_off_t" -lt 8 -o "$enable_segmented_files" = "yes"; then 
!   AC_DEFINE([USE_SEGMENTED_FILES], 1, [Define to split data files into 1GB segments.]) 
  fi
  
  # SunOS doesn't handle negative byte comparisons properly with +/- return
--- 1489,1496 ----
  
  # Check for largefile support (must be after AC_SYS_LARGEFILE)
  AC_CHECK_SIZEOF([off_t])
! if test "$ac_cv_sizeof_off_t" -lt 8 -a "$default_segsize" != "1"; then 
!    AC_MSG_ERROR([Large file support is not enabled. Segment size cannot be larger than 1GB.]) 
  fi
  
  # SunOS doesn't handle negative byte comparisons properly with +/- return
Index: src/backend/storage/file/buffile.c
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/backend/storage/file/buffile.c,v
retrieving revision 1.30
diff -c -r1.30 buffile.c
*** src/backend/storage/file/buffile.c	10 Mar 2008 20:06:27 -0000	1.30
--- src/backend/storage/file/buffile.c	18 Apr 2008 08:13:45 -0000
***************
*** 38,45 ****
  #include "storage/buffile.h"
  
  /*
!  * We break BufFiles into gigabyte-sized segments, whether or not
!  * USE_SEGMENTED_FILES is defined.  The reason is that we'd like large
   * temporary BufFiles to be spread across multiple tablespaces when available.
   */
  #define MAX_PHYSICAL_FILESIZE	0x40000000
--- 38,44 ----
  #include "storage/buffile.h"
  
  /*
!  * We break BufFiles into gigabyte-sized segments. The reason is that we'd like large
   * temporary BufFiles to be spread across multiple tablespaces when available.
   */
  #define MAX_PHYSICAL_FILESIZE	0x40000000
Index: src/backend/storage/smgr/md.c
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/backend/storage/smgr/md.c,v
retrieving revision 1.137
diff -c -r1.137 md.c
*** src/backend/storage/smgr/md.c	18 Apr 2008 06:48:38 -0000	1.137
--- src/backend/storage/smgr/md.c	18 Apr 2008 08:12:02 -0000
***************
*** 89,106 ****
   *
   *	All MdfdVec objects are palloc'd in the MdCxt memory context.
   *
-  *	On platforms that support large files, USE_SEGMENTED_FILES can be
-  *	#undef'd to disable the segmentation logic.  In that case each
-  *	relation is a single operating-system file.
   */
  
  typedef struct _MdfdVec
  {
  	File		mdfd_vfd;		/* fd number in fd.c's pool */
  	BlockNumber mdfd_segno;		/* segment number, from 0 */
- #ifdef USE_SEGMENTED_FILES
  	struct _MdfdVec *mdfd_chain;	/* next segment, or NULL */
- #endif
  } MdfdVec;
  
  static MemoryContext MdCxt;		/* context for all md.c allocations */
--- 89,101 ----
***************
*** 162,171 ****
  static void register_unlink(RelFileNode rnode);
  static MdfdVec *_fdvec_alloc(void);
  
- #ifdef USE_SEGMENTED_FILES
  static MdfdVec *_mdfd_openseg(SMgrRelation reln, BlockNumber segno,
  			  int oflags);
- #endif
  static MdfdVec *_mdfd_getseg(SMgrRelation reln, BlockNumber blkno,
  			 bool isTemp, ExtensionBehavior behavior);
  static BlockNumber _mdnblocks(SMgrRelation reln, MdfdVec *seg);
--- 157,164 ----
***************
*** 258,266 ****
  
  	reln->md_fd->mdfd_vfd = fd;
  	reln->md_fd->mdfd_segno = 0;
- #ifdef USE_SEGMENTED_FILES
  	reln->md_fd->mdfd_chain = NULL;
- #endif
  }
  
  /*
--- 251,257 ----
***************
*** 344,350 ****
  							rnode.relNode)));
  	}
  
- #ifdef USE_SEGMENTED_FILES
  	/* Delete the additional segments, if any */
  	else
  	{
--- 335,340 ----
***************
*** 374,380 ****
  		}
  		pfree(segpath);
  	}
- #endif
  
  	pfree(path);
  
--- 364,369 ----
***************
*** 420,431 ****
  
  	v = _mdfd_getseg(reln, blocknum, isTemp, EXTENSION_CREATE);
  
- #ifdef USE_SEGMENTED_FILES
  	seekpos = (off_t) BLCKSZ * (blocknum % ((BlockNumber) RELSEG_SIZE));
  	Assert(seekpos < (off_t) BLCKSZ * RELSEG_SIZE);
- #else
- 	seekpos = (off_t) BLCKSZ * blocknum;
- #endif
  
  	/*
  	 * Note: because caller usually obtained blocknum by calling mdnblocks,
--- 409,416 ----
***************
*** 469,477 ****
  	if (!isTemp)
  		register_dirty_segment(reln, v);
  
- #ifdef USE_SEGMENTED_FILES
  	Assert(_mdnblocks(reln, v) <= ((BlockNumber) RELSEG_SIZE));
- #endif
  }
  
  /*
--- 454,460 ----
***************
*** 530,539 ****
  
  	mdfd->mdfd_vfd = fd;
  	mdfd->mdfd_segno = 0;
- #ifdef USE_SEGMENTED_FILES
  	mdfd->mdfd_chain = NULL;
  	Assert(_mdnblocks(reln, mdfd) <= ((BlockNumber) RELSEG_SIZE));
- #endif
  
  	return mdfd;
  }
--- 513,520 ----
***************
*** 552,558 ****
  
  	reln->md_fd = NULL;			/* prevent dangling pointer after error */
  
- #ifdef USE_SEGMENTED_FILES
  	while (v != NULL)
  	{
  		MdfdVec    *ov = v;
--- 533,538 ----
***************
*** 564,574 ****
  		v = v->mdfd_chain;
  		pfree(ov);
  	}
- #else
- 	if (v->mdfd_vfd >= 0)
- 		FileClose(v->mdfd_vfd);
- 	pfree(v);
- #endif
  }
  
  /*
--- 544,549 ----
***************
*** 583,594 ****
  
  	v = _mdfd_getseg(reln, blocknum, false, EXTENSION_FAIL);
  
- #ifdef USE_SEGMENTED_FILES
  	seekpos = (off_t) BLCKSZ * (blocknum % ((BlockNumber) RELSEG_SIZE));
  	Assert(seekpos < (off_t) BLCKSZ * RELSEG_SIZE);
- #else
- 	seekpos = (off_t) BLCKSZ * blocknum;
- #endif
  
  	if (FileSeek(v->mdfd_vfd, seekpos, SEEK_SET) != seekpos)
  		ereport(ERROR,
--- 558,565 ----
***************
*** 653,664 ****
  
  	v = _mdfd_getseg(reln, blocknum, isTemp, EXTENSION_FAIL);
  
- #ifdef USE_SEGMENTED_FILES
  	seekpos = (off_t) BLCKSZ * (blocknum % ((BlockNumber) RELSEG_SIZE));
  	Assert(seekpos < (off_t) BLCKSZ * RELSEG_SIZE);
- #else
- 	seekpos = (off_t) BLCKSZ * blocknum;
- #endif
  
  	if (FileSeek(v->mdfd_vfd, seekpos, SEEK_SET) != seekpos)
  		ereport(ERROR,
--- 624,631 ----
***************
*** 708,714 ****
  {
  	MdfdVec    *v = mdopen(reln, EXTENSION_FAIL);
  
- #ifdef USE_SEGMENTED_FILES
  	BlockNumber nblocks;
  	BlockNumber segno = 0;
  
--- 675,680 ----
***************
*** 764,772 ****
  
  		v = v->mdfd_chain;
  	}
- #else
- 	return _mdnblocks(reln, v);
- #endif
  }
  
  /*
--- 730,735 ----
***************
*** 777,786 ****
  {
  	MdfdVec    *v;
  	BlockNumber curnblk;
- 
- #ifdef USE_SEGMENTED_FILES
  	BlockNumber priorblocks;
- #endif
  
  	/*
  	 * NOTE: mdnblocks makes sure we have opened all active segments, so that
--- 740,746 ----
***************
*** 804,810 ****
  
  	v = mdopen(reln, EXTENSION_FAIL);
  
- #ifdef USE_SEGMENTED_FILES
  	priorblocks = 0;
  	while (v != NULL)
  	{
--- 764,769 ----
***************
*** 866,884 ****
  		}
  		priorblocks += RELSEG_SIZE;
  	}
- #else
- 	/* For unsegmented files, it's a lot easier */
- 	if (FileTruncate(v->mdfd_vfd, (off_t) nblocks * BLCKSZ) < 0)
- 		ereport(ERROR,
- 				(errcode_for_file_access(),
- 			  errmsg("could not truncate relation %u/%u/%u to %u blocks: %m",
- 					 reln->smgr_rnode.spcNode,
- 					 reln->smgr_rnode.dbNode,
- 					 reln->smgr_rnode.relNode,
- 					 nblocks)));
- 	if (!isTemp)
- 		register_dirty_segment(reln, v);
- #endif
  }
  
  /*
--- 825,830 ----
***************
*** 901,907 ****
  
  	v = mdopen(reln, EXTENSION_FAIL);
  
- #ifdef USE_SEGMENTED_FILES
  	while (v != NULL)
  	{
  		if (FileSync(v->mdfd_vfd) < 0)
--- 847,852 ----
***************
*** 914,928 ****
  					   reln->smgr_rnode.relNode)));
  		v = v->mdfd_chain;
  	}
- #else
- 	if (FileSync(v->mdfd_vfd) < 0)
- 		ereport(ERROR,
- 				(errcode_for_file_access(),
- 				 errmsg("could not fsync relation %u/%u/%u: %m",
- 						reln->smgr_rnode.spcNode,
- 						reln->smgr_rnode.dbNode,
- 						reln->smgr_rnode.relNode)));
- #endif
  }
  
  /*
--- 859,864 ----
***************
*** 1476,1483 ****
  	return (MdfdVec *) MemoryContextAlloc(MdCxt, sizeof(MdfdVec));
  }
  
- #ifdef USE_SEGMENTED_FILES
- 
  /*
   * Open the specified segment of the relation,
   * and make a MdfdVec object for it.  Returns NULL on failure.
--- 1412,1417 ----
***************
*** 1522,1528 ****
  	/* all done */
  	return v;
  }
- #endif   /* USE_SEGMENTED_FILES */
  
  /*
   *	_mdfd_getseg() -- Find the segment of the relation holding the
--- 1456,1461 ----
***************
*** 1538,1544 ****
  {
  	MdfdVec    *v = mdopen(reln, behavior);
  
- #ifdef USE_SEGMENTED_FILES
  	BlockNumber targetseg;
  	BlockNumber nextsegno;
  
--- 1471,1476 ----
***************
*** 1600,1607 ****
  		}
  		v = v->mdfd_chain;
  	}
- #endif
- 
  	return v;
  }
  
--- 1532,1537 ----
Index: src/include/pg_config_manual.h
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/include/pg_config_manual.h,v
retrieving revision 1.31
diff -c -r1.31 pg_config_manual.h
*** src/include/pg_config_manual.h	11 Apr 2008 22:54:23 -0000	1.31
--- src/include/pg_config_manual.h	21 Apr 2008 15:17:07 -0000
***************
*** 11,57 ****
   */
  
  /*
-  * Size of a disk block --- this also limits the size of a tuple.  You
-  * can set it bigger if you need bigger tuples (although TOAST should
-  * reduce the need to have large tuples, since fields can be spread
-  * across multiple tuples).
-  *
-  * BLCKSZ must be a power of 2.  The maximum possible value of BLCKSZ
-  * is currently 2^15 (32768).  This is determined by the 15-bit widths
-  * of the lp_off and lp_len fields in ItemIdData (see
-  * include/storage/itemid.h).
-  *
-  * Changing BLCKSZ requires an initdb.
-  */
- #define BLCKSZ	8192
- 
- /*
-  * RELSEG_SIZE is the maximum number of blocks allowed in one disk
-  * file when USE_SEGMENTED_FILES is defined.  Thus, the maximum size 
-  * of a single file is RELSEG_SIZE * BLCKSZ; relations bigger than that 
-  * are divided into multiple files.
-  *
-  * RELSEG_SIZE * BLCKSZ must be less than your OS' limit on file size.
-  * This is often 2 GB or 4GB in a 32-bit operating system, unless you
-  * have large file support enabled.  By default, we make the limit 1
-  * GB to avoid any possible integer-overflow problems within the OS.
-  * A limit smaller than necessary only means we divide a large
-  * relation into more chunks than necessary, so it seems best to err
-  * in the direction of a small limit.  (Besides, a power-of-2 value
-  * saves a few cycles in md.c.)
-  *
-  * When not using segmented files, RELSEG_SIZE is set to zero so that
-  * this behavior can be distinguished in pg_control.
-  *
-  * Changing RELSEG_SIZE requires an initdb.
-  */
- #ifdef USE_SEGMENTED_FILES
- #define RELSEG_SIZE (0x40000000 / BLCKSZ)
- #else
- #define RELSEG_SIZE 0
- #endif
- 
- /*
   * Size of a WAL file block.  This need have no particular relation to BLCKSZ.
   * XLOG_BLCKSZ must be a power of 2, and if your system supports O_DIRECT I/O,
   * XLOG_BLCKSZ must be a multiple of the alignment requirement for direct-I/O
--- 11,16 ----

 3 Posts in Topic:
Removing NONSEG mode
Zdenek.Kotala@[EMAIL PROT  2008-04-22 13:24:31 
Re: Removing NONSEG mode
alvherre@[EMAIL PROTECTED  2008-04-22 08:50:52 
Re: Removing NONSEG mode
tgl@[EMAIL PROTECTED] (T  2008-05-01 21:12:27 
