cdfit -- report how to fit a directory's items onto a CDR's worth of space
% cdfit ./mp3s
Considering fitting ./mp3s into 681,700,000b...
Optimal solution is 7 items:
[
58,338,055: Air
90,221,893: Brian Eno
18,908,463: Duran Duran
42,602,438: Laurie Anderson
14,547,579: Luscious Jackson
98,023,464: Massive Attack
359,057,250: Philip Glass
]
filling to 681,699,142b,
leaving 858b free and 14 items totalling 1,804,083,465b outside.
This utility says how to fit items from a given directory into a certain amount of space -- by default, a rule-of-thumb figure for the amount of space on a blank CDR.
Usage: cdfit [-number] [-tseconds]
[filespec]>
Examples:
cdfit
cdfit c:/data
cdfit -670,000,000
cdfit -670,000,000 ~/oggs
cdfit -650,000,000 -t120
cdfit -1,440,000 -t360 /user/mojojojo/journals /
Optional switch -number sets the size of the CD (or whatever) that you want to fit items into. Current default is 681,700,000 bytes. You may use commas or underscores in this number.
Optional switch -t<seconds> sets a time limit on how long this program will try searching for a match. When the time is up, the program stops searching and reports the best solution found within that time. You may use commas or underscores in this number. Note that this number is in seconds: -t10 is ten seconds, not ten minutes (which would be -t600, of course). This switch may not be supported under MSWin.
The parameter filespec provides the path to the items that you want to fit onto a CD. Otherwise defaults to the first existing item found from the list "c:/windows/desktop/to_burn", "c:/windows/desktop/mp3", "c:/windows/desktop/mp3s", "c:/windows/desktop/music", "."
(the latter, current directory, is a guaranteed fallback).
Note that at time of writing, you can only provide look at one directory -- i.e., you can't do cdfit mp3 mp3b
cdfit doesn't take into account per-file or per-directory overhead when imagining fitting things onto a CDR. You can work around this by just claiming that there's less space on a CDR than cdfit's default of 681,700,000 bytes.
cdfit can take exponentially longer time when given more items to fit. (Specifically, each item can make it take twice as long to run: if it takes 3 seconds to consider 15 items, it might take twice as long (6s) to consider 16 items, twice as long again (12s) to consider 17 items.
My experience is that cdfit runs best (i.e., in less than a minute on a decent machine, not taking exponential time with increased number of elements) when you have fewer than 30 items, where the typical item is larger than 1/10th the capacity of a CD. Having lots of little items (little relative to the size of the CD) will make this take quite a long time. When in doubt, you can always just abort the program.
This program has efficiency problems because this is a case of the "Knapsack Problem", which is NP-complete.
cdfit's algorithm iterates over all possible solutions, but prunes the search space when applicable. I.e., when it's considering fitting a 500M directory and a 300M directory and others into a 670M disk, it doesn't bother iterating over all possible solutions that start with having a 500M item and a 300M item, because it knows that none of those are going to work. So taking 2**N time is only a worst case -- bit it's one that we approach as we have many small items, the sum of whose sizes is just over the capacity of the CD.
cdfit's algorithm specifically traps the case where all items given will fit in the CD. This avoids a complete (and pointless) scan of the search space.
Mercifully, memory requirements don't increase much with the number of items to fit. We don't keep a list of all possible solutions in memory -- we keep only the best one so far, and the current one being considered.
As you can probably guess, I wrote this program because I had made mp3 backups of a lot of CDs of mine, and I wanted to burn those directories to CDRs, and I wanted to pack the CDRs as tightly as possible.
Copyright (c) 2002 Sean M. Burke. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The programs and documentation in this dist are distributed in the hope that they will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
Sean M. Burke, sburke@cpan.org