Story: Tough choices for supercomputing's legacy apps
Legacy HPC SW Porting Effort and Schedule
Andrew Jones recommends that 10%-20% of the high performance computing (HPC) hardware (HW) budget be redirected to software (SW) porting/re-engineering in order to allow the code to keep pace with changing HPC architectures.
This labor estimate is an order of magnitude too small for any HPC environment that is trying to port 1M+ source lines of code (SLOC) to a new HPC architecture. That is a fairly common size in HPC environments: two workhorse LANL codes, and various versions of Windows and Linux (if utilities are included), are each larger than this.
For the sake of argument, suppose we have to re-engineer 1M SLOC to port our current code to the next-generation HPC HW. Well-calibrated SW effort and schedule estimation models such as COCOMO tell us that if more than 30% of the code has to be changed, it is cheaper to start from scratch. Ten years ago, it took $100/line to write reliable, sustainable code from scratch. So the re-engineering cost of 1M SLOC is at least 1M x $100 = $100M, which is comparable to the cost of the largest supercomputers today. Thus, for a 1M SLOC HPC SW port, we need a SW budget that is *equal to or greater than the cost of the supercomputer* to which we are trying to port.
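To make the size of the gap concrete, here is a minimal sketch of the arithmetic in Python. The SLOC count and the $100/line figure come from the argument above; the $100M hardware budget is my own assumption (taken to equal the porting cost, per "comparable to the cost of the largest supercomputers"), and the function names are purely illustrative.

```python
def porting_cost(sloc: int, cost_per_line: int) -> int:
    """Cost in dollars of rewriting `sloc` lines at `cost_per_line` dollars each."""
    return sloc * cost_per_line


def redirected_budget(hw_budget: int, percent: int) -> int:
    """Dollars freed up by redirecting `percent` of the hardware budget to SW."""
    return hw_budget * percent // 100


SLOC = 1_000_000           # size of the legacy code base (from the text)
COST_PER_LINE = 100        # dollars per line, the ten-year-old figure (from the text)
HW_BUDGET = 100_000_000    # ASSUMPTION: supercomputer price, in dollars

cost = porting_cost(SLOC, COST_PER_LINE)
print(f"Estimated porting cost: ${cost:,}")

for percent in (10, 20):   # the recommended 10%-20% range
    available = redirected_budget(HW_BUDGET, percent)
    print(f"{percent}% of HW budget: ${available:,} "
          f"(porting cost is {cost / available:.0f}x larger)")
```

Under these assumptions the redirected budget covers only a tenth to a fifth of the port, which is the shortfall claimed above.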
Moreover, COCOMO predicts the time to re-engineer 1M SLOC is ~50 months, or essentially an entire HPC HW generation. Thus, a 1M SLOC HPC port can be optimized, at best, only for the *last* HPC HW architecture.
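For anyone who wants to check the schedule figure, here is a rough sketch using the Basic COCOMO (1981) equations, effort = a * KSLOC^b person-months and schedule = c * effort^d months. I am assuming the Basic model with its published coefficients for the three project modes; the estimate above may have come from a different, calibrated COCOMO variant, and this sketch ignores any reuse or adaptation adjustments.

```python
# Basic COCOMO (1981) coefficients per project mode: (a, b, c, d).
# ASSUMPTION: the Basic model stands in for whichever COCOMO variant was used.
BASIC_COCOMO = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(ksloc: float, mode: str) -> tuple[float, float]:
    """Return (effort in person-months, schedule in months) for `ksloc` thousand SLOC."""
    a, b, c, d = BASIC_COCOMO[mode]
    effort = a * ksloc ** b       # person-months
    schedule = c * effort ** d    # calendar months
    return effort, schedule


KSLOC = 1000  # 1M SLOC, as in the text

for mode in BASIC_COCOMO:
    effort, schedule = basic_cocomo(KSLOC, mode)
    print(f"{mode:>13}: {effort:8.0f} person-months, {schedule:4.1f} months")
```

All three modes land in the mid-50s of months for 1M SLOC, in the same range as the ~50 months quoted above, so the schedule conclusion does not hinge on which mode you pick.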
Of course, we can always solve these problems by forgoing SW reliability and sustainability ;->








