<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How To Think About Compression</title>
	<atom:link href="http://changelog.complete.org/archives/910-how-to-think-about-compression/feed" rel="self" type="application/rss+xml" />
	<link>http://changelog.complete.org/archives/910-how-to-think-about-compression</link>
	<description>Viewpoints on technology, society, and government</description>
	<lastBuildDate>Sun, 05 Feb 2012 03:11:59 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Erik Johansson</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-5590</link>
		<dc:creator>Erik Johansson</dc:creator>
		<pubDate>Sun, 14 Mar 2010 13:23:41 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-5590</guid>
		<description>pbzip2 wins the speed test for me since 8 cores is the minimal install these days, but if you do the per-file compression mentioned in &lt;a href=&quot;http://changelog.complete.org/archives/931-how-to-think-about-compression-part-2&quot; rel=&quot;nofollow&quot;&gt;part 2&lt;/a&gt;, the pbzip2 edge might not be so big. But pbzip2 is ridiculously fast on 8-core systems; I get 350MB/s, which is faster than I can write to disks.

Though I have a friend who told me that pbzip2 crashed on him, so it&#039;s not safe yet.</description>
		<content:encoded><![CDATA[<p>pbzip2 wins the speed test for me since 8 cores is the minimal install these days, but if you do the per-file compression mentioned in <a href="http://changelog.complete.org/archives/931-how-to-think-about-compression-part-2" rel="nofollow">part 2</a>, the pbzip2 edge might not be so big. But pbzip2 is ridiculously fast on 8-core systems; I get 350MB/s, which is faster than I can write to disks.</p>
<p>Though I have a friend who told me that pbzip2 crashed on him, so it&#8217;s not safe yet.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: system logs compressed bzip2 - openSUSE Forums</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-5535</link>
		<dc:creator>system logs compressed bzip2 - openSUSE Forums</dc:creator>
		<pubDate>Sat, 27 Feb 2010 00:48:15 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-5535</guid>
		<description>[...]  I know &quot;less&quot; will read them (based on: Less FAQ).   This blog got me think&#039;n about it: How To Think About Compression &#124; The Changelog  Thank [...]</description>
		<content:encoded><![CDATA[<p>[...]  I know &quot;less&quot; will read them (based on: Less FAQ).   This blog got me think&#39;n about it: How To Think About Compression | The Changelog  Thank [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Róża (rozie) 's status on Saturday, 01-Aug-09 12:48:12 UTC - Identi.ca</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-4214</link>
		<dc:creator>Róża (rozie) 's status on Saturday, 01-Aug-09 12:48:12 UTC - Identi.ca</dc:creator>
		<pubDate>Sat, 01 Aug 2009 12:48:23 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-4214</guid>
		<description>[...]  http://changelog.complete.org/archives/910-how-to-think-about-compression  [...]</description>
		<content:encoded><![CDATA[<p>[...]  <a href="http://changelog.complete.org/archives/910-how-to-think-about-compression" rel="nofollow">http://changelog.complete.org/archives/910-how-to-think-about-compression</a>  [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jari Aalto</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3419</link>
		<dc:creator>Jari Aalto</dc:creator>
		<pubDate>Sat, 21 Mar 2009 23:14:04 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3419</guid>
		<description>A test with http://rzip.samba.org/ would also be interesting.</description>
		<content:encoded><![CDATA[<p>A test with <a href="http://rzip.samba.org/" rel="nofollow">http://rzip.samba.org/</a> would also be interesting.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jari Aalto</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3418</link>
		<dc:creator>Jari Aalto</dc:creator>
		<pubDate>Sat, 21 Mar 2009 21:11:53 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3418</guid>
		<description>It would be interesting to know the results of
zip and pigz (parallel gzip) too, as well as decompression times.</description>
		<content:encoded><![CDATA[<p>It would be interesting to know the results of<br />
zip and pigz (parallel gzip) too, as well as decompression times.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Magda</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3396</link>
		<dc:creator>David Magda</dc:creator>
		<pubDate>Sat, 14 Mar 2009 14:34:30 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3396</guid>
		<description>You have parallel bzip2, how about parallel gzip (pigz)?

http://www.zlib.net/pigz/</description>
		<content:encoded><![CDATA[<p>You have parallel bzip2, how about parallel gzip (pigz)?</p>
<p><a href="http://www.zlib.net/pigz/" rel="nofollow">http://www.zlib.net/pigz/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: links for 2009-02-18 &#171; My Weblog</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3262</link>
		<dc:creator>links for 2009-02-18 &#171; My Weblog</dc:creator>
		<pubDate>Thu, 19 Feb 2009 04:08:15 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3262</guid>
		<description>[...] How To Think About Compression &#124; The Changelog (tags: linux) [...]</description>
		<content:encoded><![CDATA[<p>[...] How To Think About Compression | The Changelog (tags: linux) [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pseudonym</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3260</link>
		<dc:creator>Pseudonym</dc:creator>
		<pubDate>Wed, 18 Feb 2009 23:25:46 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3260</guid>
		<description>The main factor for me is that everyone has gzip and bzip2 installed (and most people have zip).  Far fewer people have 7-Zip.

There are tools which beat bzip2 simultaneously on both compression effectiveness and compression/decompression speed.  However, there is no tool which beats it simultaneously on both compression effectiveness and market penetration.  As such, it still hits a &quot;sweet spot&quot;.</description>
		<content:encoded><![CDATA[<p>The main factor for me is that everyone has gzip and bzip2 installed (and most people have zip).  Far fewer people have 7-Zip.</p>
<p>There are tools which beat bzip2 simultaneously on both compression effectiveness and compression/decompression speed.  However, there is no tool which beats it simultaneously on both compression effectiveness and market penetration.  As such, it still hits a &#8220;sweet spot&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Moshroum</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3259</link>
		<dc:creator>Moshroum</dc:creator>
		<pubDate>Wed, 18 Feb 2009 21:54:54 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3259</guid>
		<description>You know that the lzma sdk is something different from lzma-utils. The first one is the implementation taken from 7z, which stores the lzma stream without any further headers... which is somewhat suboptimal and isn&#039;t like the stuff you expect from a gzip-like utility.
lzma-utils (or better to say xz-utils now) has a better on-disk format with real headers and checksums and works like gzip or bzip2. Additionally, you can use the filters from 7zip. If you want to use lzma in a tool you should definitely try libxz instead of the lzma sdk.</description>
		<content:encoded><![CDATA[<p>You know that the lzma sdk is something different from lzma-utils. The first one is the implementation taken from 7z, which stores the lzma stream without any further headers&#8230; which is somewhat suboptimal and isn&#8217;t like the stuff you expect from a gzip-like utility.<br />
lzma-utils (or better to say xz-utils now) has a better on-disk format with real headers and checksums and works like gzip or bzip2. Additionally, you can use the filters from 7zip. If you want to use lzma in a tool you should definitely try libxz instead of the lzma sdk.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Linux: analisis de fornatos de compresión &#171; Marginalia2009&#8217;s Blog</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3251</link>
		<dc:creator>Linux: analisis de fornatos de compresión &#171; Marginalia2009&#8217;s Blog</dc:creator>
		<pubDate>Wed, 18 Feb 2009 12:58:29 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3251</guid>
		<description>[...] analysis of compression formats By marginalia2009  John Goerzen presents on his blog an in-depth comparison of the compression formats most widely used on Linux... Gzip, BZIP2 and LZMA; in terms of efficiency, time and compression, Gzip is followed by [...]</description>
		<content:encoded><![CDATA[<p>[...] analysis of compression formats By marginalia2009  John Goerzen presents on his blog an in-depth comparison of the compression formats most widely used on Linux&#8230; Gzip, BZIP2 and LZMA; in terms of efficiency, time and compression, Gzip is followed by [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3247</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Wed, 18 Feb 2009 02:43:57 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3247</guid>
		<description>That&#039;s an oversimplification, and an extreme one at that.  I&#039;ve shown that lzma -1 is faster than bzip2 -9.  I&#039;m sure that 7za, perhaps even with default options, can be made to be slower than bzip2 -- but it can be made to be faster, too.

I have no illusion of grandeur here.  Of course there are always more questions.  One thing I care about is how well various archive formats store all the data about POSIX filesystems -- hardlinks, symlinks, and sparse files.  Neither zip nor 7-zip do, which is a big issue for me.  tar, of course, can do this.

If you want to whip up data on those other things, by all means post a link to it here.  I&#039;d be interested.</description>
		<content:encoded><![CDATA[<p>That&#8217;s an oversimplification, and an extreme one at that.  I&#8217;ve shown that lzma -1 is faster than bzip2 -9.  I&#8217;m sure that 7za, perhaps even with default options, can be made to be slower than bzip2 &#8212; but it can be made to be faster, too.</p>
<p>I have no illusion of grandeur here.  Of course there are always more questions.  One thing I care about is how well various archive formats store all the data about POSIX filesystems &#8212; hardlinks, symlinks, and sparse files.  Neither zip nor 7-zip do, which is a big issue for me.  tar, of course, can do this.</p>
<p>If you want to whip up data on those other things, by all means post a link to it here.  I&#8217;d be interested.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: How To Think About Compression, Part 2 &#124; The Changelog</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3246</link>
		<dc:creator>How To Think About Compression, Part 2 &#124; The Changelog</dc:creator>
		<pubDate>Wed, 18 Feb 2009 02:22:17 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3246</guid>
		<description>[...] I posted part 1 of how to think about compression. If you haven&#8217;t read it already, take a look now, so this [...]</description>
		<content:encoded><![CDATA[<p>[...] I posted part 1 of how to think about compression. If you haven&#8217;t read it already, take a look now, so this [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nanashi</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3245</link>
		<dc:creator>nanashi</dc:creator>
		<pubDate>Wed, 18 Feb 2009 01:41:25 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3245</guid>
		<description>It also takes 10 times longer (nb: before brion even wrote a parallelized bzip2) and crashes more often.  

Another thing not mentioned is decompression time, are archives solid, will they play nice with rsync, and who knows how many other different things that may be important--but I guess that isn&#039;t important as long as I can make pretty graphs with a single data point with whatever factors I decide are the only ones that matter!</description>
		<content:encoded><![CDATA[<p>It also takes 10 times longer (nb: before brion even wrote a parallelized bzip2) and crashes more often.  </p>
<p>Another thing not mentioned is decompression time, are archives solid, will they play nice with rsync, and who knows how many other different things that may be important&#8211;but I guess that isn&#8217;t important as long as I can make pretty graphs with a single data point with whatever factors I decide are the only ones that matter!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick J</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3244</link>
		<dc:creator>Nick J</dc:creator>
		<pubDate>Wed, 18 Feb 2009 01:12:15 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3244</guid>
		<description>A link to a previous compression-related blog with a graph, using SQL text dumps as the input data: http://blog.nickj.org/2007/07/06/comparing-compression-options-for-text-input/
7zip was quite good, and RAR too.</description>
		<content:encoded><![CDATA[<p>A link to a previous compression-related blog with a graph, using SQL text dumps as the input data: <a href="http://blog.nickj.org/2007/07/06/comparing-compression-options-for-text-input/" rel="nofollow">http://blog.nickj.org/2007/07/06/comparing-compression-options-for-text-input/</a><br />
7zip was quite good, and RAR too.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: solrize</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3243</link>
		<dc:creator>solrize</dc:creator>
		<pubDate>Tue, 17 Feb 2009 23:15:15 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3243</guid>
		<description>for very large, repetitive xml files (wikipedia dumps), 7z gets stupendously better compression (like 10x better) than bzip2 or gzip-anything.</description>
		<content:encoded><![CDATA[<p>for very large, repetitive xml files (wikipedia dumps), 7z gets stupendously better compression (like 10x better) than bzip2 or gzip-anything.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ulrich Petri</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3242</link>
		<dc:creator>Ulrich Petri</dc:creator>
		<pubDate>Tue, 17 Feb 2009 23:03:08 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3242</guid>
		<description>IMHO &quot;backup&quot; and &quot;compression&quot; should never be used in the same sentence. There is far too much that can go wrong (bad sector, filesystem corruption, etc.) with a backup medium to add another layer of uncertainty.

And if you tell me you can not afford to back up uncompressed then maybe your data is not all that valuable to begin with.</description>
		<content:encoded><![CDATA[<p>IMHO &#8220;backup&#8221; and &#8220;compression&#8221; should never be used in the same sentence. There is far too much that can go wrong (bad sector, filesystem corruption, etc.) with a backup medium to add another layer of uncertainty.</p>
<p>And if you tell me you can not afford to back up uncompressed then maybe your data is not all that valuable to begin with.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Geoff Prewett</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3241</link>
		<dc:creator>Geoff Prewett</dc:creator>
		<pubDate>Tue, 17 Feb 2009 22:10:54 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3241</guid>
		<description>This is kind of anecdotal, but I made a Linux upgrade DVD for my company that boots into a custom Ubuntu LiveCD and basically untars the root directory onto the appropriate partition (with config mods afterwards).  I originally used bzip2 but made a mistake and used gzip for the second revision, and I noticed that the gzipped version installs substantially faster (20-33%).  Not sure why, but it might have something to do with the CPU usage.</description>
		<content:encoded><![CDATA[<p>This is kind of anecdotal, but I made a Linux upgrade DVD for my company that boots into a custom Ubuntu LiveCD and basically untars the root directory onto the appropriate partition (with config mods afterwards).  I originally used bzip2 but made a mistake and used gzip for the second revision, and I noticed that the gzipped version installs substantially faster (20-33%).  Not sure why, but it might have something to do with the CPU usage.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3240</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 21:35:59 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3240</guid>
		<description>Of course a distribution cloud would be helpful.  I don&#039;t have the time to devote weeks to generating and testing on massive data sets.

I back up /usr, and a lot of other people do too, because it is faster to restore from backup than to try to figure out how to make $PACKAGE_TOOL restore just /usr without touching /var or /etc, not to mention in the precise versions you had, some of which may have been locally-built.

As to use for backup safety, there is a grain of truth there, but it&#039;s overstated.  It is true that, with gzip, a bad bit at the beginning of a file could make you lose the entire rest of it.  That&#039;s not true with bzip2; your losses there are 900K at most.  And this is precisely why many modern backup programs that support software compression reset it on a fixed blocksize, or at worst, on each file.  Think of it more like zip than tar.gz, where the files are compressed before being put into the archive container.

I&#039;ll post a part 2 tonight that looks at performance of these tools when compressing individual files for writing back out to disk, as may happen for instance with rdup or other hardlink-based schemes.</description>
		<content:encoded><![CDATA[<p>Of course a distribution cloud would be helpful.  I don&#8217;t have the time to devote weeks to generating and testing on massive data sets.</p>
<p>I back up /usr, and a lot of other people do too, because it is faster to restore from backup than to try to figure out how to make $PACKAGE_TOOL restore just /usr without touching /var or /etc, not to mention in the precise versions you had, some of which may have been locally-built.</p>
<p>As to use for backup safety, there is a grain of truth there, but it&#8217;s overstated.  It is true that, with gzip, a bad bit at the beginning of a file could make you lose the entire rest of it.  That&#8217;s not true with bzip2; your losses there are 900K at most.  And this is precisely why many modern backup programs that support software compression reset it on a fixed blocksize, or at worst, on each file.  Think of it more like zip than tar.gz, where the files are compressed before being put into the archive container.</p>
<p>I&#8217;ll post a part 2 tonight that looks at performance of these tools when compressing individual files for writing back out to disk, as may happen for instance with rdup or other hardlink-based schemes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ken</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3239</link>
		<dc:creator>Ken</dc:creator>
		<pubDate>Tue, 17 Feb 2009 21:23:59 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3239</guid>
		<description>Your test suffers from the same problem as the Gimp tests, though perhaps to a lesser degree: it&#039;s not really a representative test.  First, because nobody backs up /usr -- just save the list of packages, so you can re-install later.  But also, in your use case of 50GB of photos a day, would anybody tar-and-compress these?  Photos don&#039;t tend to (losslessly) compress very well at all.  In that case cat might be smaller than any of these algorithms!

Graphing the data is cool, but (as all my science teachers told me) they should really show error bars, or a distribution cloud, or something.  For example, multiple points for each algorithm: &quot;10GB text files&quot;, &quot;10GB raw photos&quot;, &quot;10GB compressed audio&quot;, etc.  You&#039;re trying to show how it applies to a distribution of different file types, by running it on a distribution of different file types, but there&#039;s nothing to suggest that one distribution of inputs is the same as the other.  Is your /usr full of ELF binaries and already-compressed manpages and bitmaps, like mine is?  That&#039;s not representative of my ~ at all.

Finally, at least 3 different backup geeks have told me not to compress (or worse, encrypt) backups.  When your primary data is gone, the number one thing you want above all else is to make it really really easy to get a copy of your data back.  A single .tar.gz does not make it easy to get at one file you need, and if even one bit is bad, you can kiss the whole thing goodbye.

Nice analysis of compression, though.  Good to know the limitations of the tools available.</description>
		<content:encoded><![CDATA[<p>Your test suffers from the same problem as the Gimp tests, though perhaps to a lesser degree: it&#8217;s not really a representative test.  First, because nobody backs up /usr &#8212; just save the list of packages, so you can re-install later.  But also, in your use case of 50GB of photos a day, would anybody tar-and-compress these?  Photos don&#8217;t tend to (losslessly) compress very well at all.  In that case cat might be smaller than any of these algorithms!</p>
<p>Graphing the data is cool, but (as all my science teachers told me) they should really show error bars, or a distribution cloud, or something.  For example, multiple points for each algorithm: &#8220;10GB text files&#8221;, &#8220;10GB raw photos&#8221;, &#8220;10GB compressed audio&#8221;, etc.  You&#8217;re trying to show how it applies to a distribution of different file types, by running it on a distribution of different file types, but there&#8217;s nothing to suggest that one distribution of inputs is the same as the other.  Is your /usr full of ELF binaries and already-compressed manpages and bitmaps, like mine is?  That&#8217;s not representative of my ~ at all.</p>
<p>Finally, at least 3 different backup geeks have told me not to compress (or worse, encrypt) backups.  When your primary data is gone, the number one thing you want above all else is to make it really really easy to get a copy of your data back.  A single .tar.gz does not make it easy to get at one file you need, and if even one bit is bad, you can kiss the whole thing goodbye.</p>
<p>Nice analysis of compression, though.  Good to know the limitations of the tools available.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brandon Moore</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3238</link>
		<dc:creator>Brandon Moore</dc:creator>
		<pubDate>Tue, 17 Feb 2009 21:02:31 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3238</guid>
		<description>I also suggest checking out lzop. It can be faster than catting directly to disk, depending on your transfer rates and CPU.</description>
		<content:encoded><![CDATA[<p>I also suggest checking out lzop. It can be faster than catting directly to disk, depending on your transfer rates and CPU.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3237</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 20:45:08 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3237</guid>
		<description>Only with pbzip2.</description>
		<content:encoded><![CDATA[<p>Only with pbzip2.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Palmax</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3236</link>
		<dc:creator>Palmax</dc:creator>
		<pubDate>Tue, 17 Feb 2009 20:12:57 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3236</guid>
		<description>Did you use threads?</description>
		<content:encoded><![CDATA[<p>Did you use threads?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3235</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 20:03:15 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3235</guid>
		<description>That&#039;s not true.  The latest release was on Feb 3, just a few weeks ago.  See http://www.7-zip.org/sdk.html

I may look into xz though, as well as a newer LZMA.</description>
		<content:encoded><![CDATA[<p>That&#8217;s not true.  The latest release was on Feb 3, just a few weeks ago.  See <a href="http://www.7-zip.org/sdk.html" rel="nofollow">http://www.7-zip.org/sdk.html</a></p>
<p>I may look into xz though, as well as a newer LZMA.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mario</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3234</link>
		<dc:creator>mario</dc:creator>
		<pubDate>Tue, 17 Feb 2009 19:56:28 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3234</guid>
		<description>For backup purposes, an ultra-high compression ratio isn&#039;t really that relevant. However, data safety is.
Here none of the gzip, bzip2, lzma, or 7zip formats does you any good. I&#039;m actually thinking about using good old outdated zip or even rar, which actually COULD recover from corrupt sectors. (bzip2recover is a complete joke)
- Yes, that&#039;s still a possibility - even on the modern external HDDs, DVD-RAM and SD cards that I use for redundant backups.</description>
		<content:encoded><![CDATA[<p>For backup purposes, an ultra-high compression ratio isn&#8217;t really that relevant. However, data safety is.<br />
Here none of the gzip, bzip2, lzma, or 7zip formats does you any good. I&#8217;m actually thinking about using good old outdated zip or even rar, which actually COULD recover from corrupt sectors. (bzip2recover is a complete joke)<br />
- Yes, that&#8217;s still a possibility &#8211; even on the modern external HDDs, DVD-RAM and SD cards that I use for redundant backups.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Moshroum</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3233</link>
		<dc:creator>Moshroum</dc:creator>
		<pubDate>Tue, 17 Feb 2009 19:43:49 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3233</guid>
		<description>Maybe you didn&#039;t know it, but lzma utils are deprecated and all further development was put into xz-utils. Can you maybe add a test with it? The numbers have probably changed with xz now: http://tukaani.org/xz/ (yes, the basic algorithm is still lzma, but more finely tuned and with a better on-disk format)</description>
		<content:encoded><![CDATA[<p>Maybe you didn&#8217;t know it, but lzma utils are deprecated and all further development was put into xz-utils. Can you maybe add a test with it? The numbers have probably changed with xz now: <a href="http://tukaani.org/xz/" rel="nofollow">http://tukaani.org/xz/</a> (yes, the basic algorithm is still lzma, but more finely tuned and with a better on-disk format)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Comparación entre diferentes algoritmos de compresión en Linux (ENG)</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3232</link>
		<dc:creator>Comparación entre diferentes algoritmos de compresión en Linux (ENG)</dc:creator>
		<pubDate>Tue, 17 Feb 2009 16:55:01 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3232</guid>
		<description>[...] Comparison of different compression algorithms on Linux (ENG) changelog.complete.org/archives/910-how-to-think-about-compr... by sphericow a few seconds ago [...]</description>
		<content:encoded><![CDATA[<p>[...] Comparison of different compression algorithms on Linux (ENG) changelog.complete.org/archives/910-how-to-think-about-compr&#8230; by sphericow a few seconds ago [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Christopher Cashell</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3230</link>
		<dc:creator>Christopher Cashell</dc:creator>
		<pubDate>Tue, 17 Feb 2009 16:24:25 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3230</guid>
		<description>I&#039;m a huge fan of gzip, but I wouldn&#039;t quite go so far as to throw out bzip2.  Especially when you include pbzip2 in the mix, I think it definitely still has a place.

It&#039;s all a matter of using the appropriate compression depending on your requirements.  Some people have strict space requirements, some have strict time requirements, and some have strict resource constraint requirements.  Some have a combination, or other requirements beyond these.

The trick is to know what options are available, and make an informed decision.

Also note: I used to be much less of a fan of bzip2 than I am now.  The rapid rise in CPU cores coupled with pbzip2 has changed my mind, though.  On a lot of modern multi-core boxes, pbzip2 will outperform gzip in both compression time and compression level.  Yes, it will use more CPU resources, but that is very often a worthwhile tradeoff.  When and if implementations of the other compression algorithms make such effective use of parallelism, my opinion of bzip2 may drop again (although, my understanding is that bzip2&#039;s design lends itself better to parallelism than most other compression algorithms).</description>
		<content:encoded><![CDATA[<p>I&#8217;m a huge fan of gzip, but I wouldn&#8217;t quite go so far as to throw out bzip2.  Especially when you include pbzip2 in the mix, I think it definitely still has a place.</p>
<p>It&#8217;s all a matter of using the appropriate compression depending on your requirements.  Some people have strict space requirements, some have strict time requirements, and some have strict resource constraint requirements.  Some have a combination, or other requirements beyond these.</p>
<p>The trick is to know what options are available, and make an informed decision.</p>
<p>Also note: I used to be much less of a fan of bzip2 than I am now.  The rapid rise in CPU cores coupled with pbzip2 has changed my mind, though.  On a lot of modern multi-core boxes, pbzip2 will outperform gzip in both compression time and compression level.  Yes, it will use more CPU resources, but that is very often a worthwhile tradeoff.  When and if implementations of the other compression algorithms make such effective use of parallelism, my opinion of bzip2 may drop again (although, my understanding is that bzip2&#8217;s design lends itself better to parallelism than most other compression algorithms).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sven Mueller</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3229</link>
		<dc:creator>Sven Mueller</dc:creator>
		<pubDate>Tue, 17 Feb 2009 16:19:01 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3229</guid>
		<description>The (German) article on Wikipedia doesn&#039;t actually say that p7zip is better than lzma-utils. It just says that lzma-utils usually uses an older version of the lzma algorithm(s) than p7zip.</description>
		<content:encoded><![CDATA[<p>The (German) article on Wikipedia doesn&#8217;t actually say that p7zip is better than lzma-utils. It just says that lzma-utils usually uses an older version of the lzma algorithm(s) than p7zip.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Christopher Cashell</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3228</link>
		<dc:creator>Christopher Cashell</dc:creator>
		<pubDate>Tue, 17 Feb 2009 16:14:15 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3228</guid>
		<description>Remember that the makeup of your backups can make a huge difference here.  Video is a relatively specialized type of data that generally does best with specialized compression types (such as provided by the various video codecs).

At my company, we also generate very large backups.  However, our backups are largely database dumps, which equates to highly compressible text-oriented data.  This is an area that bzip2 excels at, and we&#039;ve made use of it at times with good results.</description>
		<content:encoded><![CDATA[<p>Remember that the makeup of your backups can make a huge difference here.  Video is a relatively specialized type of data that generally does best with specialized compression types (such as provided by the various video codecs).</p>
<p>At my company, we also generate very large backups.  However, our backups are largely database dumps, which equates to highly compressible text-oriented data.  This is an area that bzip2 excels at, and we&#8217;ve made use of it at times with good results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adrien Nader</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3227</link>
		<dc:creator>Adrien Nader</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:55:00 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3227</guid>
		<description>(p)7zip has some compression filters so it should provide better compression than pure lzma. However, it doesn&#039;t have many. IIRC it may currently only have BCJ/BCJ2 (they target binaries and may depend on/work better with Windows&#039;s PE format than Linux&#039;s ELF).
Filters should only appear with 7zip 5; the reason is that the author wants to get a stable 7zip before messing with filters. You could find some on 7zip&#039;s SourceForge forums.</description>
		<content:encoded><![CDATA[<p>(p)7zip has some compression filters so it should provide better compression than pure lzma. However, it doesn&#8217;t have many. IIRC it may currently only have BCJ/BCJ2 (they target binaries and may depend on/work better with Windows&#8217;s PE format than Linux&#8217;s ELF).<br />
Filters should only appear with 7zip 5; the reason is that the author wants to get a stable 7zip before messing with filters. You could find some on 7zip&#8217;s SourceForge forums.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3226</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:26:12 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3226</guid>
		<description>I actually did; see the 7za output on the charts.  There are a huge number of parameters to be tweaked there, and I chose one somewhat arbitrarily.

I am a bit puzzled why it would do better than lzma, since they are supposed to be the same algorithm, and even appear to be using the same SDK.

However, the p7zip tools may or may not be appropriate for general use because you might not be able to pipe into them.

You can use any pipable compressor with tar:

tar -cf - somedir &#124; lzma &gt; file.tar.lzma

same works with gzip or bzip2, and that&#039;s what we did before tar got -j.  That does not create a temporary file anywhere.</description>
		<content:encoded><![CDATA[<p>I actually did; see the 7za output on the charts.  There are a huge number of parameters to be tweaked there, and I chose one somewhat arbitrarily.</p>
<p>I am a bit puzzled why it would do better than lzma, since they are supposed to be the same algorithm, and even appear to be using the same SDK.</p>
<p>However, the p7zip tools may or may not be appropriate for general use because you might not be able to pipe into them.</p>
<p>You can use any pipable compressor with tar:</p>
<p>tar -cf - somedir | lzma > file.tar.lzma</p>
<p>same works with gzip or bzip2, and that&#8217;s what we did before tar got -j.  That does not create a temporary file anywhere.</p>
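<p>The same no-temporary-file pattern can be sketched in Python. This is only a rough illustration, assuming the standard tarfile and lzma modules; the function name tar_to_lzma and the paths are hypothetical, and FORMAT_ALONE is chosen here to produce the legacy .lzma container.</p>

```python
import lzma
import os
import tarfile


def tar_to_lzma(source_dir, archive_path):
    """Stream a tar of source_dir straight into an .lzma file (no temporary tar)."""
    # Store entries under the directory's own name, e.g. "somedir/...".
    top_level = os.path.basename(os.path.normpath(source_dir))
    # FORMAT_ALONE selects the legacy .lzma container rather than .xz.
    with lzma.open(archive_path, "wb", format=lzma.FORMAT_ALONE) as compressed:
        # Mode "w|" writes the tar as a non-seekable stream, much like a shell pipe.
        with tarfile.open(fileobj=compressed, mode="w|") as tar:
            tar.add(source_dir, arcname=top_level)
```

<p>Calling tar_to_lzma("somedir", "file.tar.lzma") would write the compressed archive directly, with the tar data passing through the compressor in memory rather than through an intermediate file.</p>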
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3225</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:24:22 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3225</guid>
		<description>I didn&#039;t test that due to time constraints, but the Gimp article I linked to did.</description>
		<content:encoded><![CDATA[<p>I didn&#8217;t test that due to time constraints, but the Gimp article I linked to did.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3224</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:18:51 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3224</guid>
		<description>Perhaps it is for a home user.  I can assure you it&#039;s not for a business, and probably less and less for home users too, considering that cheap DV cameras record at something like 30Mb/s.

A full backup at work easily exceeds 1TB, and our nightly incrementals are probably never less than 100GB.</description>
		<content:encoded><![CDATA[<p>Perhaps it is for a home user.  I can assure you it&#8217;s not for a business, and probably less and less for home users too, considering that cheap DV cameras record at something like 30Mb/s.</p>
<p>A full backup at work easily exceeds 1TB, and our nightly incrementals are probably never less than 100GB.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3223</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:17:33 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3223</guid>
		<description>I will for part 2, but for this part 1 you can see lzop included in the Practical Compressor Test I linked to.</description>
		<content:encoded><![CDATA[<p>I will for part 2, but for this part 1 you can see lzop included in the Practical Compressor Test I linked to.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Horst</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3222</link>
		<dc:creator>Horst</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:17:14 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3222</guid>
		<description>Hi,

according to http://de.wikipedia.org/wiki/Lempel-Ziv-Markow-Algorithmus#Portabilit.C3.A4t_der_Referenzimplementation (sorry, in German), p7zip is better than LZMA Utils. Can you also test p7zip?

Is there a possibility to use lzma like gzip or bzip2 as a parameter in tar (like tar zcf or tar jcf)? I do not want to create a temporary tar file and then run the compression tool over it...</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>according to <a href="http://de.wikipedia.org/wiki/Lempel-Ziv-Markow-Algorithmus#Portabilit.C3.A4t_der_Referenzimplementation" rel="nofollow">http://de.wikipedia.org/wiki/Lempel-Ziv-Markow-Algorithmus#Portabilit.C3.A4t_der_Referenzimplementation</a> (sorry, in German), p7zip is better than LZMA Utils. Can you also test p7zip?</p>
<p>Is there a possibility to use lzma like gzip or bzip2 as a parameter in tar (like tar zcf or tar jcf)? I do not want to create a temporary tar file and then run the compression tool over it&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3221</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:17:06 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3221</guid>
		<description>I have never had any issues with bzip2 corrupting data.  I believe it is backed by an extensive test suite.  I trust both it and gzip very strongly.</description>
		<content:encoded><![CDATA[<p>I have never had any issues with bzip2 corrupting data.  I believe it is backed by an extensive test suite.  I trust both it and gzip very strongly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: steve</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3220</link>
		<dc:creator>steve</dc:creator>
		<pubDate>Tue, 17 Feb 2009 14:00:05 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3220</guid>
		<description>It seems to me that there are several issues one must simultaneously consider when choosing a compression algorithm.

1.  How fast must compression be?
2.  How small must the compressed output be?
3.  How many computer resources does compression take?
4.  How fast must DEcompression be?
5.  How many computer resources does DEcompression take?

Looking at the three algorithms I am familiar with (gzip, bzip2, and lzma):
1. gzip is the clear winner in speed and computer resources required to compress/decompress, but is the clear loser in size
2. lzma is the clear winner in size, the clear loser in computer resources for compress/decompress, the clear loser in time to compress, and the second best in time to decompress
3.  bzip2 is not the clear winner in any category

Based on these, I would say that bzip2 is completely deprecated except for the following two cases:
1.  where size is important but resources for compression are extremely limited (even then I still might go with gzip unless size were really important), and
2. where size is paramount and I cannot be certain the decompression client has access to lzma tools</description>
		<content:encoded><![CDATA[<p>It seems to me that there are several issues one must simultaneously consider when choosing a compression algorithm.</p>
<p>1.  How fast must compression be?<br />
2.  How small must the compressed output be?<br />
3.  How many computer resources does compression take?<br />
4.  How fast must DEcompression be?<br />
5.  How many computer resources does DEcompression take?</p>
<p>Looking at the three algorithms I am familiar with (gzip, bzip2, and lzma):<br />
1. gzip is the clear winner in speed and computer resources required to compress/decompress, but is the clear loser in size<br />
2. lzma is the clear winner in size, the clear loser in computer resources for compress/decompress, the clear loser in time to compress, and the second best in time to decompress<br />
3.  bzip2 is not the clear winner in any category</p>
<p>Based on these, I would say that bzip2 is completely deprecated except for the following two cases:<br />
1.  where size is important but resources for compression are extremely limited (even then I still might go with gzip unless size were really important), and<br />
2. where size is paramount and I cannot be certain the decompression client has access to lzma tools</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Goerzen</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3219</link>
		<dc:creator>John Goerzen</dc:creator>
		<pubDate>Tue, 17 Feb 2009 13:58:31 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3219</guid>
		<description>See /usr/share/doc/p7zip/DOCS/MANUAL/switches/method.htm.  It&#039;s -mmt=on.</description>
		<content:encoded><![CDATA[<p>See /usr/share/doc/p7zip/DOCS/MANUAL/switches/method.htm.  It&#8217;s -mmt=on.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adrien Nader</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3218</link>
		<dc:creator>Adrien Nader</dc:creator>
		<pubDate>Tue, 17 Feb 2009 13:37:06 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3218</guid>
		<description>7zip can only use two threads for lzma compression currently. However it has an implementation of bzip2 which can use more threads.
Actually, the day before yesterday I was wondering with some people over irc which implementation would be faster/compress more. If you have time maybe you could benchmark that too. :)</description>
		<content:encoded><![CDATA[<p>7zip can only use two threads for lzma compression currently. However it has an implementation of bzip2 which can use more threads.<br />
Actually, the day before yesterday I was wondering with some people over irc which implementation would be faster/compress more. If you have time maybe you could benchmark that too. :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Julian Andres Klode</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3216</link>
		<dc:creator>Julian Andres Klode</dc:creator>
		<pubDate>Tue, 17 Feb 2009 11:45:46 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3216</guid>
		<description>It would also be nice to know how fast they decompress.</description>
		<content:encoded><![CDATA[<p>It would also be nice to know how fast they decompress.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Janos</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3215</link>
		<dc:creator>Janos</dc:creator>
		<pubDate>Tue, 17 Feb 2009 11:05:12 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3215</guid>
		<description>There&#039;s one point which might make me a minority, but I have  a pair of resource constrained PCs (apart from the wireless access points being even more so).  I have no problem running gzip on them, so I agree gzip still has its place :)

I had actually given up running lzma on one of them: a 400MHz K6 with 64M (+256M swap) simply could not finish a ~50MB file with lzma over the weekend.  Your article made me realise I should check the man page, and indeed, the default is lzma -7, which would use 83MB memory, so it explains why it simply didn&#039;t work for me.  OTOH, bzip2 -9v compresses it just fine, albeit slowly: about three times slower than gzip -9.  So, siding with jldugger, lzma is on the borderline of staying viable on these machines.

I know 64MB may seem very ancient, but there are still new embedded (phone, router, access point) platforms coming out with this much or less memory.  I&#039;m starting to be scared to see what runs on Amiga with 6MB of memory :)</description>
		<content:encoded><![CDATA[<p>There&#8217;s one point which might make me a minority, but I have  a pair of resource constrained PCs (apart from the wireless access points being even more so).  I have no problem running gzip on them, so I agree gzip still has its place :)</p>
<p>I had actually given up running lzma on one of them: a 400MHz K6 with 64M (+256M swap) simply could not finish a ~50MB file with lzma over the weekend.  Your article made me realise I should check the man page, and indeed, the default is lzma -7, which would use 83MB memory, so it explains why it simply didn&#8217;t work for me.  OTOH, bzip2 -9v compresses it just fine, albeit slowly: about three times slower than gzip -9.  So, siding with jldugger, lzma is on the borderline of staying viable on these machines.</p>
<p>I know 64MB may seem very ancient, but there are still new embedded (phone, router, access point) platforms coming out with this much or less memory.  I&#8217;m starting to be scared to see what runs on Amiga with 6MB of memory :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ulrik</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3214</link>
		<dc:creator>ulrik</dc:creator>
		<pubDate>Tue, 17 Feb 2009 10:57:46 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3214</guid>
		<description>1. Gzip solved the compression problem for all practical purposes. It will only be superseded when another algorithm reaches the same installed base.
2. However, the case-by-case basis of compression is very interesting. For example, things like squashfs: compress an Ubuntu live CD as well as possible, so that as much of Ubuntu as possible can be available on a 700M disk.</description>
		<content:encoded><![CDATA[<p>1. Gzip solved the compression problem for all practical purposes. It will only be superseded when another algorithm reaches the same installed base.<br />
2. However, the case-by-case basis of compression is very interesting. For example, things like squashfs: compress an Ubuntu live CD as well as possible, so that as much of Ubuntu as possible can be available on a 700M disk.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jldugger</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3212</link>
		<dc:creator>jldugger</dc:creator>
		<pubDate>Tue, 17 Feb 2009 08:35:48 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3212</guid>
		<description>One caveat to consider is the amount of RAM used in performing this backup.  LZMA likes big dictionaries,  which can consume a lot of RAM during compression.</description>
		<content:encoded><![CDATA[<p>One caveat to consider is the amount of RAM used in performing this backup.  LZMA likes big dictionaries,  which can consume a lot of RAM during compression.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ganesh Sittampalam</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3211</link>
		<dc:creator>Ganesh Sittampalam</dc:creator>
		<pubDate>Tue, 17 Feb 2009 08:35:42 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3211</guid>
		<description>I would say that the 50GB of backups per day scenario is an outlier, and for most other scenarios the absolute cost of the CPU time is quite small. I run short of hard drive space far more often than my computer fails to keep up with the compression I want it to do.</description>
		<content:encoded><![CDATA[<p>I would say that the 50GB of backups per day scenario is an outlier, and for most other scenarios the absolute cost of the CPU time is quite small. I run short of hard drive space far more often than my computer fails to keep up with the compression I want it to do.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Np237</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3210</link>
		<dc:creator>Np237</dc:creator>
		<pubDate>Tue, 17 Feb 2009 08:29:31 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3210</guid>
		<description>This is the exact reason why I’ve always been using gzip to compress data for backups. The explanation looks much better with nice charts, of course.</description>
		<content:encoded><![CDATA[<p>This is the exact reason why I’ve always been using gzip to compress data for backups. The explanation looks much better with nice charts, of course.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jldugger</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3209</link>
		<dc:creator>jldugger</dc:creator>
		<pubDate>Tue, 17 Feb 2009 08:19:09 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3209</guid>
		<description>The manpage suggests you check out the full manual in /usr/share/doc, which goes over the various options.  In particular, -mt will enable multithreading, but only for specific formats.</description>
		<content:encoded><![CDATA[<p>The manpage suggests you check out the full manual in /usr/share/doc, which goes over the various options.  In particular, -mt will enable multithreading, but only for specific formats.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3208</link>
		<dc:creator>Ian</dc:creator>
		<pubDate>Tue, 17 Feb 2009 07:26:52 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3208</guid>
		<description>&quot;7za already has a parallel LZMA implementation&quot;

Really? If so, it is undocumented, at least for Unix (no mention of it in the manpage, which seems identical to the 7zr one, and the HTML manual is Windows-dependent garbage).</description>
		<content:encoded><![CDATA[<p>&#8220;7za already has a parallel LZMA implementation&#8221;</p>
<p>Really? If so, it is undocumented, at least for Unix (no mention of it in the manpage, which seems identical to the 7zr one, and the HTML manual is Windows-dependent garbage).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aigars Mahinovs</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3206</link>
		<dc:creator>Aigars Mahinovs</dc:creator>
		<pubDate>Tue, 17 Feb 2009 04:51:21 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3206</guid>
		<description>Add lzop to your testing; it has a slightly lower compression ratio, but much higher compression speed.</description>
		<content:encoded><![CDATA[<p>Add lzop to your testing; it has a slightly lower compression ratio, but much higher compression speed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Omari</title>
		<link>http://changelog.complete.org/archives/910-how-to-think-about-compression/comment-page-1#comment-3205</link>
		<dc:creator>Omari</dc:creator>
		<pubDate>Tue, 17 Feb 2009 04:50:04 +0000</pubDate>
		<guid isPermaLink="false">http://changelog.complete.org/?p=910#comment-3205</guid>
		<description>I took a disk image of a Vista partition before wiping it off my new ThinkPad&#039;s hard drive (I don&#039;t plan to ever use Vista, but who knows, and it doesn&#039;t come with a disk). The compression time was about on par with what you describe, as bzip2 took much longer than gzip. What worried me was that bzip2 was corrupting the data. A bunzipped file was not identical to the file before it was bzipped. The bzip2 web page says this might be a memory error. I tried replicating this bug on my 2-year-old desktop with an old AMD Athlon 64, but the bzip2 of this 20GB file didn&#039;t finish after an hour and I just gave up.</description>
		<content:encoded><![CDATA[<p>I took a disk image of a Vista partition before wiping it off my new ThinkPad&#8217;s hard drive (I don&#8217;t plan to ever use Vista, but who knows, and it doesn&#8217;t come with a disk). The compression time was about on par with what you describe, as bzip2 took much longer than gzip. What worried me was that bzip2 was corrupting the data. A bunzipped file was not identical to the file before it was bzipped. The bzip2 web page says this might be a memory error. I tried replicating this bug on my 2-year-old desktop with an old AMD Athlon 64, but the bzip2 of this 20GB file didn&#8217;t finish after an hour and I just gave up.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

