hpodder to be multithreaded… done right.

I’ll be hacking on my hpodder program this weekend. hpodder is a full-featured podcast aggregator that runs on the command line, and it offers many features that other command-line podcatchers like bashpodder, and even GUI tools like iPodder, lack.

I originally envisioned hpodder to be something that I’d cron up and run in the background. But I have tended to run it in the foreground more than in the background. Some others have too, and the #1 requested hpodder feature is parallel downloads.

So I am working on that. I already have code working, in fact, that will parallelize both the hpodder update (downloading the feeds) and the hpodder download (downloading the actual episodes) commands.

Unlike iPodder, my code will make sure that no more than one thread is ever downloading from a given server at a given time. iPodder had the terribly annoying habit of pointing all of its threads at a single server, thus pounding it while also providing little benefit for someone with a pipe fatter than the server’s.
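One way to get the "one thread per server" behavior is to bucket the URLs by host and let a single worker drain each bucket. The following is only a rough Python sketch of that idea, not hpodder's actual Haskell code; `group_by_host`, `download_all`, and the `fetch` callback are hypothetical names made up for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def group_by_host(urls):
    """Bucket URLs by server so each server is handled by one worker."""
    queues = defaultdict(list)
    for url in urls:
        queues[urlsplit(url).netloc].append(url)
    return dict(queues)


def download_all(urls, fetch, max_workers=5):
    """Call fetch(url) for every URL without hitting one host concurrently.

    fetch is a caller-supplied function (hypothetical here). Each host's
    queue is drained serially by a single worker, while queues for
    different hosts run in parallel.
    """
    def drain(queue):
        return [fetch(url) for url in queue]

    queues = group_by_host(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(drain, queues.values()))
    return [result for batch in batches for result in batch]
```

Because a host's URLs never leave their own queue, extra threads only help when there are several distinct servers to talk to, which is exactly the case where parallel downloads pay off.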

Before all this multithreaded stuff could be written, I needed to write my own status bar code instead of just letting curl display its own status bar. (That wouldn’t work when there are 5 curls running at once.)

I decided that I would write some generic status bar code, rather than something specific to hpodder. I took the apt-get status bar as an example, and whipped one up in Haskell and added it to my MissingH Haskell library.

But a status bar just begged for another feature: a generalized progress tracker. Something that could keep track of where a task (and each of its sub-tasks) is, calculate ETA, estimated time remaining, speed, etc. So I wrote that and made the status bar use it.
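To make the tracker idea concrete, here is a minimal Python sketch of a task that reports progress to a parent task and derives speed and time remaining from elapsed time. It is an illustration under my own assumptions, not the MissingH API; the `Progress` class and its method names are hypothetical.

```python
import time


class Progress:
    """Track completed units of a task and derive speed and time remaining."""

    def __init__(self, total, clock=time.monotonic, parent=None):
        self.total = total          # units of work expected (e.g. bytes)
        self.done = 0               # units finished so far
        self.clock = clock          # injectable so tests can fake time
        self.parent = parent        # enclosing task, if this is a sub-task
        self.started = clock()

    def advance(self, amount):
        """Record finished work here and in every enclosing task."""
        self.done += amount
        if self.parent is not None:
            self.parent.advance(amount)

    def speed(self):
        """Average units per second since the tracker was created."""
        elapsed = self.clock() - self.started
        return self.done / elapsed if elapsed > 0 else 0.0

    def remaining_seconds(self):
        """Estimated seconds left, or None while no speed is known yet."""
        rate = self.speed()
        if rate == 0:
            return None
        return (self.total - self.done) / rate
```

The ETA is then just the current time plus `remaining_seconds()`. Averaging over the whole run keeps the sketch short; a real tracker might prefer a recent window so the estimate reacts to changes in speed.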

AND, a status bar begged for a generalized numeric formatter: something that could render 512 as 512, 2048 as 2K, 1048576 as 1M, etc. So I wrote that, and it’s general enough that it can render into both SI and “binary” units by default (and others that users may want).
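The formatter boils down to dividing by the unit base until the number fits, then attaching the matching suffix. A small Python sketch of that approach follows; the names `render_number`, `BINARY`, and `SI` are hypothetical and are not the functions in MissingH.

```python
# Each unit system is a (base, suffixes) pair; callers can supply their own.
BINARY = (1024, ["", "K", "M", "G", "T"])
SI = (1000, ["", "k", "M", "G", "T"])


def render_number(value, units=BINARY):
    """Render a non-negative number with the largest suffix that fits."""
    base, suffixes = units
    scaled = float(value)
    index = 0
    while scaled >= base and index < len(suffixes) - 1:
        scaled /= base
        index += 1
    # Keep one decimal place, but drop a trailing ".0" so 2048 gives "2K".
    text = f"{scaled:.1f}"
    if text.endswith(".0"):
        text = text[:-2]
    return text + suffixes[index]


print(render_number(512))       # 512
print(render_number(2048))      # 2K
print(render_number(1048576))   # 1M
print(render_number(1500, SI))  # 1.5k
```

Passing the unit table as an argument is what makes other systems easy to add: a new system is just another base and list of suffixes.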

Finally, I wrote a function to take a number of seconds and render it in something friendly like 23m5s like apt uses, and shoved that in MissingH as well.
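For illustration, the seconds-to-text conversion can be sketched in a few lines of Python. This is an assumption about the general approach, not the MissingH function itself, and `render_interval` is a hypothetical name.

```python
def render_interval(seconds):
    """Render a whole number of seconds compactly, e.g. 1385 -> '23m5s'."""
    seconds = int(seconds)
    parts = []
    for size, suffix in ((86400, "d"), (3600, "h"), (60, "m")):
        count, seconds = divmod(seconds, size)
        if count:
            parts.append(f"{count}{suffix}")
    # Show seconds when non-zero, or when nothing else was shown at all.
    if seconds or not parts:
        parts.append(f"{seconds}s")
    return "".join(parts)


print(render_interval(1385))  # 23m5s
```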

So now hpodder will have a status bar, and any other Haskell program can use the same status bar code in minutes because it’s all generic. Or if someone just needs to render a number in megabytes, they can do that.

I really enjoy it when a program needs a solution that is generic enough to put in a larger library. I try to put as much of my Haskell code in MissingH as I can, so as to make it useful to others (and my other programs).

9 thoughts on “hpodder to be multithreaded… done right.”

John, I have absolutely no idea what you are talking about in these threads…but the language that is used…like “cron up”, parallelize…??? Who makes this stuff up?

But why I really posted here: WHERE ARE THE PICTURES OF OUR GRANDSON!!!! :-) :-) :-)

John Goerzen says:

November 23, 2006 at 8:30 am

Not to worry; I’m planning to go through those this evening. We’ll probably get some more at a family gathering this afternoon. You wouldn’t want to get a partial batch, would you? ;-)

1. Anonymous says:
  
  November 23, 2006 at 8:45 am
  
  A batch would be fabulous. However, just one every once in a while would be great too! Babies change incredibly between 3 weeks and 8 weeks of age!
  M
  
  1. John Goerzen says:
    
    November 23, 2006 at 11:17 am
    
    I’m glad you’re excited about Jacob and getting photos of him!
    
    Don’t worry, we have been taking photos all along and they are all safe and you will see them.
    
    I just haven’t had time to work with them. It usually comes down to a decision like this: process photos or go to work, photos or sleep, photos or hold Jacob, photos or spend time with Terah, photos or keep the renovation moving, photos or pay bills, etc.
    
    We’re still short on sleep so when there is spare time, processing photos just isn’t up there. Given the volume that we have, it takes *hours* to go through all the backlog (which dates back to this spring). I don’t just upload them, but also categorize them and comment on them for future use. Most of them we haven’t even looked at yet.
    
    It’s like getting your film developed but not having time to go through your prints. We don’t even know what we have to pick from yet, in many cases.
    
    Yesterday was the first time I got to work on hobbies like hpodder in months. I need to be able to relax and do things like that sometimes too.
    
    So just remember that we are busy and exhausted over here. We’ll get you the photos, and we have them, but it may just take a bit.
    

I should also add: if we didn’t have the digital camera, I’d be looking at about 10 rolls of used but not-yet-processed film right now. So you actually get FASTER service this way ;-)

It’s amazing how things like that just grow on you, isn’t it? My podcast aggregator is, shall we say, less than fully featured. It still works quite well, just without much in the way of resilience to network problems.

But regarding getting sidetracked with general stuff, I was trying to hack together something like “mkcabal” and ended up writing a whole bunch of different prompt functions (http://brokenhut.no-ip.org/~dougal/darcs/joincabal/CommandPrompts.hs) for command line programs. It was fun though.

John,

You’re one heck of a programmer! You produce an awful lot of useful code.

But I’m afraid that your code in the MissingH library isn’t getting the attention it deserves. And I think that it could be solved with a bit of repackaging.

Initially I had the feeling towards MissingH that it was just a big heap of code and I couldn’t really pull myself to actually dig through the library and look what was in there. Thankfully I did just that some time ago when you made a new release. There’s lots of good stuff in there! But I’m afraid that there are more people like me who feel like I did, they’re intimidated by the size of MissingH, confused by the undescriptive name and don’t quite know what’s in there. I think your code would be more usable if you split up MissingH into a couple of libraries instead.

Yes, I know this is a bit of work, but I don’t think it would be that much. I would think it would be worth the effort. But I know you don’t like this kind of work, and ultimately it’s really up to you.

John Goerzen says:

November 24, 2006 at 8:58 am

Thanks, Josef, for the kind words!

I appreciate the comments. Do you have any suggestions for logical lines along which to split it up?

There is some code in it that is now obsoleted by GHC, since GHC itself has implemented some of the features. (And a few functions I’ve contributed directly to fptools from MissingH.)

I’ve been thinking, too, that the module structure could use some help, but haven’t been able to come up with a plan that sounds much better.

1. Josef Svenningsson says:
  
  November 25, 2006 at 9:49 am
  
  You’re welcome John.
  
  I’m glad you took my comments to heart. I saw your post on haskell-cafe and I think that is the right way to go. Working together with the Haskell community will ensure all the goodies in MissingH will find their proper use.
  
  I didn’t have anything specific in mind when I suggested that you should split up MissingH. But I’ll let you know via the email thread if I think of anything in particular.
  


The Changelog

Comments on family, technology, and society
