[ale] Supporting Linux on super computers?

Leam Hall leamhall at gmail.com
Tue Jun 4 07:43:02 EDT 2024


Jim,

Can you talk a little about how to learn performance tuning and scaling data collection? Most of my work over the years has just been getting things to work, or fixing things after the developers did a "nothing major" change. Performance and throughput observation and tuning are important, but I've not really done a lot of it.

Is this an area where kernel parameters are used a lot? Are there kernel re-compiles? Is this a job space where dusting off my C would be useful? Are other programming languages heavily used?
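
For context on what "kernel parameters" usually means here: on Linux, runtime tunables are exposed as files under /proc/sys, and a sysctl name maps to a path by replacing dots with slashes. Below is a minimal sketch for reading a few of them. The helper name read_sysctl and the three tunables listed are illustrative choices only. Nothing in this thread says which parameters matter on a given cluster.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the value of a kernel tunable such as 'vm.swappiness', or None if absent."""
    # sysctl names map onto paths under /proc/sys: dots become slashes.
    # (Simple mapping; names that embed dots, e.g. some interface names, need more care.)
    path = Path(root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        # Missing on this kernel, or not readable without privileges.
        return None


if __name__ == "__main__":
    # Illustrative picks only; not a tuning recommendation.
    for name in ("vm.swappiness", "kernel.osrelease", "net.core.somaxconn"):
        print(f"{name} = {read_sysctl(name)}")
```

Reading these values needs no recompile. Changing them at runtime is done by writing to the same files (or with sysctl -w) as root.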

Thanks!

Leam

On 6/3/24 19:47, Jim Kinney wrote:
> To make it work there are tools that multiply a command across many nodes.
> Nodes are often PXE booted from a common point, then get an IP and name assigned.
> Depending on the maker and tools, the systems are usually batch processors with
> a manager like Slurm (or PBS if life hands out lemons 😞).
> 
> Monitoring tools are numerous. Some old ones still work. Some new ones have
> problems scaling.
> 
> The hard part is figuring out performance tuning and scaling data
> collection to not overrun system usage.
> 
> On Mon, Jun 3, 2024, 4:05 PM Leam Hall via Ale <ale at ale.org> wrote:
> 
>> For those of you who know, what's different about supporting Linux on
>> supercomputers?
>>
>> Thanks!
>>
>> Leam
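
As a rough illustration of the "multiply a command across many nodes" idea in the quoted reply, here is a minimal sketch that fans one command out over SSH in parallel and collects the results per node. It is not how any particular cluster tool works. The function name run_on_nodes, the build_argv hook, and the node names in the usage note are all hypothetical, and it assumes passwordless SSH to every node is already set up.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_on_nodes(nodes, command, build_argv=None, max_workers=16, timeout=30):
    """Run `command` for every node in parallel; return {node: (returncode, output)}."""
    if build_argv is None:
        # Assumption: key-based SSH works for every node; BatchMode avoids password prompts.
        def build_argv(node, cmd):
            return ["ssh", "-o", "BatchMode=yes", node, cmd]

    def run_one(node):
        try:
            proc = subprocess.run(
                build_argv(node, command),
                capture_output=True, text=True, timeout=timeout,
            )
            return node, (proc.returncode, proc.stdout.strip())
        except (subprocess.TimeoutExpired, OSError) as exc:
            # A dead or slow node should not stall the whole sweep.
            return node, (None, str(exc))

    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, nodes))
```

A call such as run_on_nodes(["node001", "node002"], "uptime") would return one entry per node. The max_workers cap is there so the sweep itself does not overload the head node.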

-- 
DevSecOps Engineer         (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)

