[Pw_forum] PW taskgroups and a large run on a BG/P
David Farrell
davidfarrell2008 at u.northwestern.edu
Thu Feb 12 19:24:38 CET 2009
I pulled down the current CVS version, compiled as I did with the
previous snapshot and got the same behavior:
When I ran on 128 cores in vn mode with -ntg 4 -ndiag 121, I got a
Cholesky error.
When I ran on 128 cores in dual mode with -ntg 4 -ndiag 121, I got the
same Cholesky error.
When I ran on 128 cores in smp mode with -ntg 4 -ndiag 121, it ran fine.
I guess I have 2 options:
1) try larger systems in SMP mode with the CVS version, see how big I
can get before things blow up. I'll just have to deal with the extra
cost of the idle CPUs.
2) climb into the code with a debugger to see if I can see anything
going on (things I am interested in now are how much memory is
actually available to the code, how much it is using, if there is
something funny going on in the different modes). I'll probably have
to construct a smaller system that does the same thing first.
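For reference, here is a rough sketch of how the three modes differ in
memory per MPI rank and idle cores for the runs above. It is only
back-of-the-envelope arithmetic, not anything taken from PW itself. The
assumptions are: "128 cores" means 128 MPI ranks, a BG/P node has 4 cores
and 2 GiB of RAM (some installations have 4 GiB nodes), and vn/dual/smp
place 4/2/1 ranks on each node. The function name `layout` and the
constants are made up for illustration. It does not show that memory is
the cause of the Cholesky error; it only shows what changes between modes.

```python
import math

# Assumed BG/P node layout: 4 cores and 2 GiB of RAM per node.
# Adjust NODE_MEM_MIB if the machine has 4 GiB nodes.
CORES_PER_NODE = 4
NODE_MEM_MIB = 2048

# MPI ranks placed on each node in each execution mode.
RANKS_PER_NODE = {"vn": 4, "dual": 2, "smp": 1}


def layout(mpi_ranks, mode, ntg):
    """Summarise the per-node layout for a run (illustrative helper)."""
    rpn = RANKS_PER_NODE[mode]
    side = math.isqrt(mpi_ranks)  # side of the largest square process grid
    return {
        "nodes": math.ceil(mpi_ranks / rpn),
        "mem_per_rank_mib": NODE_MEM_MIB // rpn,
        # Cores with no MPI rank on them (idle unless threads are used).
        "idle_cores_per_node": CORES_PER_NODE - rpn,
        "ranks_per_task_group": mpi_ranks // ntg,
        "max_square_ndiag": side * side,
    }


if __name__ == "__main__":
    for mode in ("vn", "dual", "smp"):
        print(mode, layout(128, mode, ntg=4))
```

Under these assumptions the only thing that changes between the failing
and the working runs is the memory available to each rank and the number
of nodes used; the task-group size and the 11 x 11 diagonalization grid
(121 ranks) are the same in all three.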
I don't want to abandon PW/CP just yet, because this code has
demonstrated decent physics, and other codes would require me either to
develop PPs that give results I can be confident in or to do way too
much work to make them scalable. Unfortunately, I also need to get it
running on the BG/P, as I have a big allocation on that machine that
would otherwise be wasted.
Dave
David E. Farrell
Post-Doctoral Fellow
Department of Materials Science and Engineering
Northwestern University
email: d-farrell2 at northwestern.edu