Distributed Load Management on the desktop
- From: Tim Cutts <tjrc (at) sanger.ac.uk>
- Date: Mon, 10 Apr 2006 17:25:37 +0100
On 10 Apr 2006, at 5:09 pm, Andrew D. Fant wrote:
I've been thinking about the problem of job submission and the tendency
of head nodes to become bottlenecks, both in bandwidth and cycles. I
also know that many of the batch management systems out there can be
installed on the desktop for direct submission. Sadly, some animals are
more equal than others and some installations are easier to configure
and support than others. Since I don't have control over desktop
installs and the technical support can vary in quality, making things
really simple is a good thing.
I'd appreciate feedback from people who have submission-only clients (or
whatever term your package uses) on heterogeneous desktop environments,
expressing how well it works with your clusters, how hard the
installation and configuration were, and how it was accepted by the user
community. If people don't want to post here, I'll accumulate the posts
and summarize, though I'd love to see some discussion here.
Even if users are submitting directly from their workstations, it
doesn't alleviate the head node problem much, and may make it worse.
1) The queue system will still have a single node somewhere which is
the master and actually performs the scheduling; all the submitting
clients will still have to contact this single node. Since the
submission comes over the network, there is actually slightly more
overhead this way than if they submit on the master node itself.
2) The likelihood is that, to make your administration easier, you
have filesystems such as home directories mounted over NFS or CIFS on
these desktops. Desktop submissions are likely to create a lot of
network filesystem traffic which doesn't exist if you use one or two
head nodes which have the data on physically attached storage.
As far as LSF is concerned, a client-only installation is quite easy
to do. I can't speak for other batch systems, but I expect the same
is true of SGE and PBS.
Of the two problems I outline above, I suspect that 1 will not be much
of a problem, since it's not really any worse than having a single
head node. But 2 could bite you hard, depending on how disciplined
your users are.
Here, we tend to treat desktop machines as fairly dumb terminals,
used for X sessions, WWW and e-mail (and office applications in the
case of Windows machines). We used to have a few Tru64 workstations
(about 10) working as submission-only hosts. They became very
awkward to maintain, and have steadily been replaced with the dumb
Linux terminals everyone else has. I think there's only one left
now, and I pretend it doesn't exist. :-)
If you encourage users to start doing real work on their local
processors with local data, it can easily become a management
nightmare for your desktop support folks, to say nothing of coming up
with a backup strategy for the machines' local data. Better to keep
the desktop machine dumb, so if it fails you just bin it (or at least
take it away to fix at leisure) and give them another one.
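One way to keep the desktop dumb and still spare users an interactive
login is to wrap the submit command in ssh, so the job is submitted on
the head node and nothing batch-related is installed locally. What
follows is only a minimal sketch under stated assumptions: the host name
"head1", the queue name "normal" and the helper build_remote_submit are
hypothetical, and it assumes LSF's bsub with its -q (queue) and -o
(output file) options is on the head node's PATH.

```python
import shlex


def build_remote_submit(head_node, queue, output_file, job_argv):
    """Return an argv list that runs bsub on head_node via ssh.

    head_node, queue and output_file are site-specific; the values used
    below are placeholders, not real names.
    """
    # ssh joins its arguments and hands them to the remote shell, so the
    # remote command is quoted here as one string.
    remote_command = shlex.join(
        ["bsub", "-q", queue, "-o", output_file, *job_argv]
    )
    return ["ssh", head_node, remote_command]


if __name__ == "__main__":
    # Hypothetical host, queue and job; only prints the command line.
    argv = build_remote_submit(
        "head1", "normal", "job.out", ["blastall", "-i", "query.fa"]
    )
    print(shlex.join(argv))
```

To actually submit, the returned list could be passed to
subprocess.run(argv, check=True). Note that the job still reads and
writes files as seen from the head node, so this does nothing about
problem 2 unless the data already lives there.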
Just my 0.02
Tim
_______________________________________________
Bioclusters maillist - Bioclusters (at) bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters