[Pw_forum] Trying to find reasonable parallelization parameters
Axel Kohlmeyer
akohlmey at cmm.chem.upenn.edu
Fri Jan 23 01:28:46 CET 2009
On Thu, 22 Jan 2009, J. J. Ramsey wrote:
JJR> Compiling is one thing. The cluster, however, has its own
JJR> mpirun.lsf script, and the number of CPUs to request is handled
JJR> through the queueing system. Also, I'm not sure how I'd make sure
same here, only that we don't waste money on overkill like LSF.
i would just make a copy of the script and adapt it to my
needs. scripts - unlike compiled binaries - have to be readable
to be executed. ;-)
JJR> that OpenMPI was set up so that mpiexec could access all the
JJR> compute nodes in the cluster. It's not like I'm compiling it on a
JJR> multi-core workstation.
as i was saying, i've rolled my own even on machines
that were set up by very paranoid people. every once
in a while i was even fixing bugs in the respective
mpirun scripts (or mpicc/mpif77). you basically have
to tighten a machine to the point where it would become
unusable (an administrator's dream and a user's nightmare)
to not have any options to replace mpi libs.
JJR> > please try adding
JJR> > the following settings to your mpirun commandline as it is.
JJR> >
JJR> > --mca btl_openib_use_srq 1
JJR> >
JJR> > and
JJR> >
JJR> > --mca mpi_leave_pinned 1
JJR>
JJR> Might I be able to use those with the stock MPI implementation on the cluster?
by all means give it a try. i just don't know when they got
added. they are for certain available in version 1.2.6.
it also depends on the openfabrics distribution,
but it looks like the one you have should be recent enough.
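
A minimal sketch of what "adding the settings to your mpirun commandline" could look like. The process count, the pw.x binary name, the `-in` input flag and the input file name are placeholders assumed for illustration; they are not taken from the original job script. The script only prints the command unless RUN=1 is set, so it can be checked before anything is submitted.

```shell
#!/bin/sh
# Sketch only: NP, PW_BIN and PW_INPUT are hypothetical placeholders,
# substitute the values from your own job script.
NP=16
PW_BIN=pw.x
PW_INPUT=scf.in

# The two MCA settings suggested in this thread.
MCA_OPTS="--mca btl_openib_use_srq 1 --mca mpi_leave_pinned 1"

CMD="mpirun -np $NP $MCA_OPTS $PW_BIN -in $PW_INPUT"

# Dry run by default: show the command, execute it only when RUN=1.
echo "$CMD"
if [ "${RUN:-0}" = "1" ]; then
    $CMD
fi
```

If it is unclear whether the installed Open MPI knows these parameters, `ompi_info --param btl openib` and `ompi_info --param mpi all` should list the ones it supports; grepping that output for `use_srq` and `leave_pinned` is one way to check before running.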
and you should also give -npernode 4 or -npernode 2 a try
and see if it results in faster or slower execution. it
will most certainly limit the load on the infiniband and
that can only help.
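
The -npernode comparison can be sketched as a small loop, again under assumptions: the process count, binary name, `-in` flag, input file and output file names are hypothetical placeholders. With a fixed -np, a smaller -npernode needs more nodes from the queueing system, so the allocation has to cover NP divided by the per-node count.

```shell
#!/bin/sh
# Sketch only: NP, PW_BIN and PW_INPUT are hypothetical placeholders.
NP=16
PW_BIN=pw.x
PW_INPUT=scf.in

# Try both per-node process counts suggested above.
for PPN in 2 4; do
    CMD="mpirun -np $NP -npernode $PPN $PW_BIN -in $PW_INPUT"
    # Dry run by default: show the command, execute it only when RUN=1.
    echo "$CMD"
    if [ "${RUN:-0}" = "1" ]; then
        # Keep one output file per setting so the runs can be compared.
        $CMD > "out.npernode$PPN"
    fi
done
```

Comparing the wall-clock times reported at the end of the two output files then shows which setting is faster for the given input.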
cheers,
axel.
p.s.: please note that it is customary on this mailing
list to sign e-mails with full name and affiliation.
--
=======================================================================
Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu http://www.cmm.upenn.edu
Center for Molecular Modeling -- University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.