Talk About Network

Re: Simple disk-based sorting method

by Last Boy Scout <BadBill@[EMAIL PROTECTED] > Apr 15, 2007 at 09:06 PM

David Kelly wrote:
> Citizen Bob wrote:
>>>
>>> An easy way to solve your problem is to just open the source file and 
>>> bite off as much as you can sort in memory.
>>>
>>> Sort it. And write an output file.
>>>
>>> Take another bite, sort, and write to a different output file.
>>>
>>> Repeat until the entire original file has been bitten off and written
>>> to smaller (but internally sorted) temporary files.
>>>
>>> Now open all the temporary output files. Read the first record from
>>> each. Write to your final output file the one record which sorts
>>> first. Replenish it with the next record from the file it came from.
>>> Repeat until all records in the temporary files have been copied to
>>> the output file.
>>>
>>> If you end up with 1,000 output files and can not open all at once, 
>>> bite off as many as you can, say 20, and merge them into a single 
>>> file, and another 20. And then merge the merged files, generation 
>>> after generation, until you are down to a single file.
>>
>> I chose the Radix Sort and it works beautifully. I only have to open
>> 11 files at a time - the input or the output file and 10 temporary
>> files. I discover the digit value in the key and use it as an index
>> into an array of FILE pointers.
>>
>> FILE *fp_temp[10];
>> fprintf(fp_temp[key],"%s",record);
> 
> Whatever floats your boat. But I think what you describe is clumsy and 
> only barely works in the specific application you describe. Grow the 
> application and it will break as your solution is delicate.
> 
> You are counting on an even distribution of the data which is already 
> too large to handle in memory (doesn't matter what you use to sort in 
> memory). You are counting on being able to break it into only 10 
> segments based on a pattern, and that none of those segments will be too
> large to handle in memory.
> 
> Your first pass deals the data into 10 different files. Your 2nd pass 
> loads each file in memory to perform the actual sort. Then 3rd pass 
> concatenates the files into one.
> 
> What I suggested was to take as much as you can and sort it, then write
> a temporary file. Repeat until all data is now in multiple temporary
> files, the contents of each is sorted. Then open all of these files and
> your first data record will be the first data record from one of those
> files. Each temporary file is now ordered so your final sort only has to
> compare the current record from each.
>
> If you have more temporary files than you can open simultaneously then
> the procedure can recurse, merging temporary files until a single file
> remains.
> 
> If you like having names for things, splitting into multiple sorted 
> files and putting them back together is called a merge sort.
The thing about sorting numbers is that you can sort from highest to
lowest, write the file, and then just append to the end of the file.
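
For anyone following along, here is a rough sketch in C of the merge approach David Kelly describes above: bite off a chunk, sort it in memory, write it to a temporary file, then merge the temporary files by repeatedly taking the smallest current record. It is only an illustration under my own assumptions, not code from anyone in this thread. The function name external_sort, the CHUNK and MAX_RUNS limits, and the record format (one integer per line) are all made up for the example. It also stops with an error rather than doing the generation-after-generation merge when there are more temporary files than MAX_RUNS.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK    4   /* records per in-memory "bite" (tiny, for illustration) */
#define MAX_RUNS 16  /* most temporary files this sketch holds open at once */

/* qsort comparison for long values, ascending. */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Close the first n temporary files; tmpfile() files vanish when closed. */
static void close_runs(FILE **run, int n)
{
    for (int i = 0; i < n; i++)
        fclose(run[i]);
}

/* Sort the integers in `in` (one per line) into `out`.
 * Returns 0 on success, -1 if a temporary file could not be created
 * or more than MAX_RUNS temporary files would be needed. */
int external_sort(FILE *in, FILE *out)
{
    FILE *run[MAX_RUNS];   /* sorted temporary files */
    long  head[MAX_RUNS];  /* current record of each temporary file */
    int   live[MAX_RUNS];  /* nonzero while a file still has records */
    long  buf[CHUNK];
    int   nruns = 0;

    assert(in != NULL && out != NULL);

    /* Phase 1: bite off a chunk, sort it, write it to its own file. */
    for (;;) {
        size_t n = 0;
        while (n < CHUNK && fscanf(in, "%ld", &buf[n]) == 1)
            n++;
        if (n == 0)
            break;
        if (nruns == MAX_RUNS || (run[nruns] = tmpfile()) == NULL) {
            close_runs(run, nruns);
            return -1;
        }
        qsort(buf, n, sizeof buf[0], cmp_long);
        for (size_t i = 0; i < n; i++)
            fprintf(run[nruns], "%ld\n", buf[i]);
        rewind(run[nruns]);
        nruns++;
    }

    /* Phase 2: read the first record from each temporary file. */
    for (int i = 0; i < nruns; i++)
        live[i] = (fscanf(run[i], "%ld", &head[i]) == 1);

    /* Write the record that sorts first, then replenish it from the
     * file it came from, until every file is exhausted. */
    for (;;) {
        int best = -1;
        for (int i = 0; i < nruns; i++)
            if (live[i] && (best < 0 || head[i] < head[best]))
                best = i;
        if (best < 0)
            break;
        fprintf(out, "%ld\n", head[best]);
        live[best] = (fscanf(run[best], "%ld", &head[best]) == 1);
    }

    close_runs(run, nruns);
    return 0;
}
```

You would call it with two open streams, for example external_sort(stdin, stdout). With a real data set CHUNK would be as many records as fit in memory, and a linear scan over the current records is fine for a handful of files; a heap would only start to matter with many of them.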
 



