Thursday, July 29, 2010

Workshop on HPC for Phylogenetics

The "PAUP running - Do not touch!!!" sign should look familiar to anyone who's done phylogenetic analyses over the past two decades. Fortunately, the days of these signs - and the inevitable lab drama that results - are quickly becoming a thing of the past. As access to high-performance computing (HPC) expands, most modern phylogenetic analyses are being conducted remotely on shared community- or campus-wide resources. Even as access to these resources expands, however, expertise in using them to their full potential remains limited. For this reason, I'm excited to spread the news about a new workshop from the National Institute for Mathematical and Biological Synthesis (NIMBioS), titled “Fast, Free Phylogenies: HPC for Phylogenetics Tutorial.” This workshop, which takes place this October in Knoxville, TN, will bring together some of the most knowledgeable experts on HPC for phylogenetics with the goal of teaching others how best to use TeraGrid, CIPRES, iPlant, university clusters, and other free HPC resources. More details are available at the tutorial's webpage. Tuition is covered by NIMBioS, but enrollment is limited.

7 comments:

Brian O'Meara said...

Thanks for posting this. One thing people may want to note is that we're offering training in Unix and the command line in the time leading up to the workshop, so people shouldn't let lack of experience in that area scare them off. So even if (or especially if) you're in one of the labs that keep an old Mac around for menu-driven PAUP, you'll learn enough to comfortably use some of the available high-performance computing resources (and some of them require no command-line usage at all: see, for example, the CIPRES Portal).

Jeremy Brown said...

This sounds like a great workshop. Does anyone know if a similar workshop (NIMBioS or otherwise) exists that focuses on how to build and maintain a smallish cluster?

Brian O'Meara said...

@Jeremy: I just started up my lab, and there the decision was to buy into the university's existing cluster rather than build my own. This seems to be the trend lately. It makes sense on a few levels -- 1) cost savings (rather than one sysadmin and cooling system per 10 nodes, you get one of each for hundreds), 2) ease of setup (give them money, get access fairly quickly), and 3) the ability to get more than you paid for. Many clusters let you use available nodes other people have paid for if they're not using them -- for example, while I was at NESCent I ran 10 times as many jobs at once as I could have had NESCent kept its cluster in house rather than buy into Duke's. The downsides mainly depend on how the larger cluster is run -- is drive space limited, does it go down for maintenance, are you unable to install the programs you want, can jobs only run for 24 hours or less, do you only buy into it for a certain number of years (some clusters only count you as a member if you've bought in within the last 3 years, though you might maintain your own cluster far longer). I do also have a few Mac Pros, which can each function as several (expensive) cluster nodes if I need them to in the future.

As for courses, NIMBioS had one on HPC last year that was more focused on the admin end. NESCent put out a call for course suggestions a few months ago -- maybe you could get them to sponsor one?

Jeremy Brown said...

@Brian: Thanks for the advice. I'll be starting up my lab next fall and have been considering both routes for lab-based HPC. I'm concerned about the drawbacks you mentioned, so I was thinking of having a rather small lab-based cluster for when jobs need to be run right away (e.g., as deadlines approach). I was hoping a course on cluster construction and admin could help me figure out how much of a pain it would be to go that route. Maybe NESCent would be a good avenue. Thanks!

Poletarac said...

About two years ago, I was dragged into getting an inexpensive multi-core processor and running Linux. It's worked out very well. The person dragging me was a highly skilled postdoc, and doing everything on my own would have been much harder. Still, I think that it's relatively easy to get help around any campus if you run into a snag.

Anyway, I'm shopping around for a couple more processors and boxes. Anyone have any advice? I didn't want to mess around with putting together hardware previously (except HD), but I think I'm open to the idea.

I find that a machine with around 75% of the speed of the fastest configuration available is generally about 50% of the price, which appeals to my cheap tendencies.
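To put rough numbers on that rule of thumb, here is a minimal sketch in Python. The 75% and 50% figures are the approximate ones above, not measurements. The function name `speed_per_dollar` is made up for illustration. The throughput comparison assumes independent jobs (e.g., separate tree searches or bootstrap replicates) that split cleanly across machines.

```python
def speed_per_dollar(relative_speed, relative_price):
    """Speed per unit cost, relative to the fastest configuration (1.0, 1.0)."""
    return relative_speed / relative_price


# Fastest available configuration: the baseline for both speed and price.
top = speed_per_dollar(1.0, 1.0)

# Mid-range box: roughly 75% of the speed at roughly 50% of the price
# (approximate rule-of-thumb figures, not benchmarks).
mid = speed_per_dollar(0.75, 0.50)

# How much more speed each dollar buys with the mid-range box.
value_ratio = mid / top

# Two mid-range boxes cost about the same as one top box; with independent
# jobs, their combined throughput is simply the sum of their speeds.
two_box_throughput = 2 * 0.75

print(f"speed per dollar, mid-range vs. top: {value_ratio:.2f}x")
print(f"two mid-range boxes vs. one top box: {two_box_throughput:.2f}x throughput")
```

The catch is that a single long, serial run still finishes sooner on the faster machine, so the cheaper boxes win only when there are many jobs to spread around.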

P.S. This is what I have in mind, but updated to Summer 2010.

Poletarac said...

Also, the NESCent course is a great idea, Brian. Anyone else interested?

Jory Weintraub said...

For more on the NESCent course proposal process, please see http://www.nescent.org/courses/proposals/.

As Brian mentioned, the deadline for this year's call just passed, but we will issue another call next spring. We will be posting announcements about this on our website, via evoldir, and other places. If you have specific questions, please feel free to contact me directly (jory@nescent.org).