[ale] high performance computing

Thu Jul 26 17:10:18 EDT 2012

  On 07/26/2012 05:07 PM, John Heim wrote:
> From: "Jeff Layton"<laytonjb at att.net>
> To: "Atlanta Linux Enthusiasts"<ale at ale.org>
>> There are lots of vendors to contact (including the one I work for :)  ).
>>
>> Can you say a little more about the application? 2D FFT's? 3D FFT's?
>> Is the code parallelized via MPI or OpenMP or both? Is the code written
>> with CUDA? How many cores or processes are used per run?
>>
>> Does he need to write additional code? Which compilers do they
>> use or like?
> I'm pretty sure the PI  hasn't written much of the code yet.  He's working
> with someone whose previous work was done on an SGI Altix running suse
> linux. I would assume that the two of them are working on an extension of
> the second guy's previous work.   Actually, the second guy is a gal.  Not
> that it matters but I think she'd appreciate being referred to as a she even
> on an anonymous email list.
>
> Anyway, the PI mentioned MPI in a meeting so I guess that's it. I never
> heard of CUDA until this moment. I'll have to ask him about that.  I know
> he's writing C++ but I don't know which compiler he is using. If I had to
> guess, I'd say Intel's. But I doubt anybody around here besides me knows how
> to install that and I know I didn't install it on that suse box.

The follow-up question is whether the FFTs are done locally on
each node or whether they are using an MPI-based FFT.

However, I think as a starting point, you'll want compute nodes
that have reasonably fast processors and lots of cache (as Jim
pointed out), but you also need tons of memory bandwidth (BW)
per core. FFTs love memory BW!!
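
If you want a rough number for memory BW per core when comparing
nodes, something like the following STREAM-style "triad" loop is a
minimal sketch. The function name, the array size, and the repeat
count are my own placeholders, and a real comparison should use the
actual STREAM benchmark with one copy pinned to each core.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// STREAM-style triad: a[i] = b[i] + s * c[i], timed on a single thread.
// Returns the best observed bandwidth in GB/s over `repeats` passes.
// The arrays are passed in by the caller; they should be much larger
// than the last-level cache, or the result measures cache, not memory.
double triad_bandwidth_gbs(std::vector<double>& a,
                           const std::vector<double>& b,
                           const std::vector<double>& c,
                           double s, int repeats) {
    assert(a.size() == b.size() && b.size() == c.size());
    const std::size_t n = a.size();
    double best_seconds = 1e300;  // sentinel: replaced by the fastest pass
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < n; ++i) {
            a[i] = b[i] + s * c[i];
        }
        const auto t1 = std::chrono::steady_clock::now();
        const double sec = std::chrono::duration<double>(t1 - t0).count();
        if (sec > 0.0 && sec < best_seconds) best_seconds = sec;
    }
    // Each pass reads two arrays and writes one: 3 * 8 bytes per element.
    const double bytes = 3.0 * sizeof(double) * static_cast<double>(n);
    return bytes / best_seconds / 1e9;
}
```

Calling it with three arrays of a few hundred MB each and comparing
the result across candidate nodes gives a first-order feel for the
per-core number. It says nothing about how the BW holds up once all
cores are busy, which is what matters for a full run.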

If the FFTs themselves are parallelized, then you will definitely
need InfiniBand. FFTs eat networks for breakfast (in fact, there
was a proposal from John Gustafson at Intel to make a 3D MPI
FFT the new benchmark for HPC since it pushes systems so
hard).
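
To put a rough number on why a parallel FFT is so hard on the
network, here is a back-of-the-envelope sketch. It assumes a slab
decomposition of an n x n x n grid of double-precision complex values
(16 bytes per point) and one all-to-all transpose per transform. Those
are my assumptions, not anything I know about their code, and the
names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Traffic for one global transpose of an n^3 grid split into slabs
// across `ranks` MPI processes (assumed layout, see the text above).
struct TransposeTraffic {
    std::uint64_t bytes_per_peer;       // block sent to each other rank
    std::uint64_t bytes_sent_per_rank;  // everything one rank sends
    std::uint64_t bytes_total;          // summed over all ranks
};

TransposeTraffic slab_transpose_traffic(std::uint64_t n,
                                        std::uint64_t ranks,
                                        std::uint64_t bytes_per_point = 16) {
    // A slab decomposition needs n to divide evenly among the ranks.
    assert(ranks > 0 && n % ranks == 0);
    const std::uint64_t points_per_rank = n * n * (n / ranks);
    // In an all-to-all, each rank's data is cut into `ranks` equal
    // blocks; one block stays local and the rest go over the network.
    const std::uint64_t points_per_peer = points_per_rank / ranks;
    TransposeTraffic t{};
    t.bytes_per_peer = points_per_peer * bytes_per_point;
    t.bytes_sent_per_rank = t.bytes_per_peer * (ranks - 1);
    t.bytes_total = t.bytes_sent_per_rank * ranks;
    return t;
}
```

For a 1024^3 grid on 64 ranks this works out to about 16.9 GB
crossing the network for a single transpose, with every rank talking
to every other rank at the same time. That pattern is what punishes a
slow interconnect.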

As Jim pointed out, IB isn't that expensive. For smaller systems,
10GigE is fairly close to IB in terms of cost (the switches get
expensive as you scale the system). I still prefer IB, though,
because for about the same money you get better performance
(Jim pointed this out as well).

CUDA is the toolkit for programming Nvidia GPUs to run
general-purpose code. Not a big deal right now, but it might be
interesting for them to develop the application on CPUs and then
replace the FFT calls with calls to cuFFT (the CUDA FFT library).
That may be down the road, and hopefully it would be an easy process.
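
One way to keep that swap easy is to have the application call the
FFT through a small interface instead of calling a library directly.
The sketch below is only an illustration: FftBackend, cpu_fft_forward,
and power_spectrum are hypothetical names, and the CPU backend is a
plain textbook radix-2 FFT standing in for whatever library they
actually use.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Complex = std::complex<double>;
// Any in-place forward FFT can be plugged in behind this signature.
using FftBackend = std::function<void(std::vector<Complex>&)>;

// Stand-in CPU backend: iterative radix-2 FFT, power-of-two sizes only.
void cpu_fft_forward(std::vector<Complex>& x) {
    const std::size_t n = x.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    // Reorder the input into bit-reversed index order.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(x[i], x[j]);
    }
    const double pi = std::acos(-1.0);
    // Combine butterflies of length 2, 4, 8, ... up to n.
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const Complex w_len =
            std::polar(1.0, -2.0 * pi / static_cast<double>(len));
        for (std::size_t start = 0; start < n; start += len) {
            Complex w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const Complex u = x[start + k];
                const Complex v = x[start + k + len / 2] * w;
                x[start + k] = u + v;
                x[start + k + len / 2] = u - v;
                w *= w_len;
            }
        }
    }
}

// Hypothetical application routine. It only sees the FftBackend type,
// so a GPU backend (for example, one wrapping a cuFFT plan) could be
// passed in later without touching this function.
std::vector<double> power_spectrum(std::vector<Complex> signal,
                                   const FftBackend& fft) {
    fft(signal);
    std::vector<double> power(signal.size());
    for (std::size_t i = 0; i < signal.size(); ++i) {
        power[i] = std::norm(signal[i]);  // squared magnitude |X[k]|^2
    }
    return power;
}
```

The application would call power_spectrum(data, cpu_fft_forward)
today. A later GPU version only needs another function with the same
signature that copies the data to the device, runs the transform, and
copies it back.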

If you want to talk off-line about more details, I'm more than glad
to help. Just let me know. Or we can continue to talk here if
others don't mind.

Thanks!

Jeff