Tough choices for supercomputing's legacy apps

COMMENT

The future direction of supercomputing may seem dazzling, but it raises difficult questions about the survival of important software at the heart of today's facilities, says Andrew Jones.

The supercomputing community generally agrees that the future holds a number of software challenges. The first is the massive increase in the degree of concurrency required, which is heading towards billion-way.

Then there is the complex hierarchy of parallelism — from vector-like parallelism required at the local level, through multithreading and onwards to multi-level, massive parallel processing across many nodes. On top of those challenges comes the impending storm of verification, validation and resilience.

Programming issues
Evolving our applications and middleware to address these issues is going to be a difficult, but necessary, job over the coming years, as petascale computers become increasingly common for scientific use and in corporate high-performance computing (HPC) facilities. Technologies derived from those machines place many teraflops in the hands of individual researchers, but they raise the same programming issues.

As some parts of the community consider the prospect of hundreds of petaflops and exascale computing — which may only be a few years away — others are starting to ask whether some of their applications are ever going to make it.

Those who doubt it argue that some legacy applications are coded in ways, or rely on algorithms, that make evolution impossible. The effort of refactoring the code and developing new algorithms would exceed the effort of starting from scratch.

Others put it like this: "Don't let the code be the science." If you focus on the engineering challenge or the science, then the code constitutes an instrument. And as one instrument becomes incapable of addressing the scale of the problem required, move to a different instrument.

Technological inertia
Even outside the computational arena, in traditional experimental science or industry test facilities, some research groups have enormous inertia and have become as attached to their instrument for investigating a problem as to the problem itself.

This issue is seen in computational science, too. Many researchers become so attached to their code that its capabilities control their scientific direction, instead of their scientific ambitions controlling the code.

In industry, many companies, or their R&D departments, will claim to be experts in a specific area, but in reality are only specialists in one method. That is perhaps a harsh view, but it is painfully true in some cases, where the researcher comes up with all sorts of thin reasons against changing instruments.

Of course, it is not that simple. A widely used code will have collected a large amount of data related to validation, knowledge of which methods match physical reality in different parts of parameter space, regions of numerical stability and so on — so that the code embodies much of the science. Thus moving to a new code potentially throws away hugely valuable scientific knowledge.

So perhaps two classes of applications will slowly evolve: those that will never be able to exploit future high-end supercomputers to the full, but will remain in use as their successors develop to comparable scientific maturity; and those that with appropriate investment can operate in the exascale regime or petascale personal HPC arena.

Balancing act
That situation creates a difficult balancing act for researchers, developers and funding agencies or company heads. They have to continue to provide the essential investment in scaling, optimisation, algorithm evolution and scientific advances in existing codes so that they can be used on high-end and medium-term mid-scale HPC platforms and avoid a possibly lethal competitive gap opening.

At the same time, they must divert sufficient effort into the development of codes to enable the next step in science or engineering design by running on the most powerful supercomputers of the future. Both tracks of investment are necessary for short- and long-term survival.

In both cases, tools exist to help, but they will only help skilled HPC programmers, not replace the need for them. Thus, investment in people is equally critical.

Perhaps it is time to repeat the call for more balanced investment. For example, divert between 10 and 20 percent of hardware procurement money into people and software. Don't shy away at the last minute because your latest supercomputer might have 10 percent less hardware speed or, for the Top500 chasers, not make the Top5, Top50, or Top-Whatever.

Business-critical
An increase in supercomputing capability is sometimes business-critical. However, since that increase is usually an order of magnitude or more, diverting some hardware investment into people and software development, on top of your normal operations budget, is not only credible but also likely to lead to a greater overall computational improvement than hardware alone.
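To put a rough number on that trade-off, the sketch below works out how much faster the software must become to offset the hardware given up. It is a back-of-envelope illustration, not a procurement model: it assumes delivered hardware speed scales linearly with hardware spend, and the function name `break_even_software_speedup` is hypothetical.

```python
def break_even_software_speedup(diverted_fraction: float) -> float:
    """Software speedup needed to cancel out the hardware forgone.

    ASSUMPTION (not from the article): hardware speed is proportional to
    hardware spend, so diverting 10% of the budget costs 10% of the speed.
    """
    if not 0.0 <= diverted_fraction < 1.0:
        raise ValueError("diverted_fraction must be in [0, 1)")
    # Remaining hardware delivers (1 - f) of the speed; software must make up 1 / (1 - f).
    return 1.0 / (1.0 - diverted_fraction)


if __name__ == "__main__":
    for fraction in (0.10, 0.20):
        speedup = break_even_software_speedup(fraction)
        print(f"Divert {fraction:.0%} of hardware budget: "
              f"software must run {speedup:.2f}x faster to break even")
```

Under that linear assumption, diverting 10 to 20 percent of the budget breaks even once the software runs about 1.11 to 1.25 times faster, which is small next to an order-of-magnitude hardware step.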

You never know, it might even pay enormous dividends in the science. There are, after all, plenty of case studies around the world's supercomputer centres that show this benefit.

Or perhaps we delay the exascale hardware by a year or two and divert the money saved into software that can be used for science when it arrives?

As vice president of HPC at the Numerical Algorithms Group, Andrew Jones leads the company's HPC services and consulting business, providing expertise in parallel, scalable and robust software development. Jones is well known in the supercomputing community. He is a former head of HPC at the University of Manchester and has more than 10 years' experience in HPC as an end user.

Talkback

Andrew Jones recommends that 10%-20% of the high performance computing (HPC) hardware (HW) budget be redirected to software (SW) porting/re-engineering in order to allow the code to keep pace with changing HPC architectures.

This labor estimate is an order of magnitude too small for any HPC environment that is trying to port 1M+ source lines of code (SLOC) to a new HPC architecture. That is a fairly common size in HPC environments: two workhorse LANL codes, and various versions of Windows and Linux (if utilities are included), are each larger than this.

For the sake of argument, suppose we have to re-engineer 1M SLOC to port our current code to the next-generation HPC HW. Well-calibrated SW effort and schedule estimation models such as COCOMO tell us that if more than 30% of the code has to be changed, it is cheaper to start from scratch. Ten years ago, it took $100/line to write reliable, sustainable code from scratch. So the re-engineering cost of 1M SLOC is at least $100M, which is comparable to the cost of the largest supercomputers today. Thus, for a 1M SLOC HPC SW port, we need a SW budget that is *equal to or greater than the cost of the supercomputer* to which we are trying to port.

Moreover, COCOMO predicts that the time to re-engineer 1M SLOC is ~50 months, or essentially an entire HPC HW generation. Thus, a 1M SLOC HPC port can be optimized, at best, only for the *last* HPC HW architecture.

Of course, we can always solve these problems by forgoing SW reliability and sustainability ;->

jhorner 15 November, 2009 14:47
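
The arithmetic in the comment above can be reproduced with a short sketch. The comment does not say which COCOMO variant or parameters were used, so the code assumes the Basic COCOMO equations with Boehm's published "embedded"-mode coefficients; that choice, and the function names, are illustrative assumptions rather than the commenter's method. The $100/line figure is taken from the comment.

```python
def rewrite_cost_usd(sloc: int, cost_per_line_usd: float = 100.0) -> float:
    """Cost of writing `sloc` lines from scratch at the comment's $100/line figure."""
    return sloc * cost_per_line_usd


def basic_cocomo(ksloc: float, a: float, b: float, c: float, d: float):
    """Basic COCOMO: effort in person-months and schedule in calendar months.

    effort   = a * KSLOC ** b
    schedule = c * effort ** d
    """
    effort_pm = a * ksloc ** b
    schedule_months = c * effort_pm ** d
    return effort_pm, schedule_months


# ASSUMPTION: Basic COCOMO "embedded" mode coefficients (Boehm, 1981).
# The comment does not state which model or mode was used.
EMBEDDED = dict(a=3.6, b=1.20, c=2.5, d=0.32)

if __name__ == "__main__":
    sloc = 1_000_000
    cost = rewrite_cost_usd(sloc)
    effort, months = basic_cocomo(sloc / 1000, **EMBEDDED)
    print(f"Rewrite cost: ${cost / 1e6:.0f}M")
    print(f"Effort: {effort:,.0f} person-months, schedule: {months:.0f} months")
```

With these assumed coefficients the schedule comes out in the low-to-mid 50s of months, in line with the "~50 months" quoted in the comment. The 30% rewrite threshold mentioned there is not modelled.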