Are we taking supercomputing code seriously?

COMMENT

Is it right that much of the supercomputing code driving scientific research and engineering design is written by people who are not software professionals, asks Andrew Jones.

Someone remarked to me recently that the problem with scientific software is that most of it is written by amateurs. Harsh perhaps, but it got me thinking. The point behind the remark is that most of the software used for simulation in scientific research, especially on supercomputers, is written by scientists rather than by professional numerical software engineers.

By implication, this state of affairs might be responsible for much of what some people see as the mess we are in with respect to assurance of results from the models and portable performance of the codes. The same argument might also be extended to engineering packages and data modelling.

Immediate need
A scientist writing software for research is obviously focused on creating code that is good enough to get a useful result in a reasonable timeframe. It is hard for individual scientists and institutions to spend the extra effort in time and money to look beyond the immediate need and make sure the software meets certain standards.

Ideally, they should examine whether the implementation has been rigorously tested, has specified areas of assured validity, and allows for potential future use. For example, is the software portable — both in terms of performance and robustness — and extendable?

Once the researcher has a piece of code that gives a believable result for the parameter space of immediate interest, the focus switches back to using the code for science rather than adding engineering quality to the code.

The idea of building in comprehensive software engineering from the start, both in the code itself and in the development and testing process, will often be dismissed before it gets serious consideration. From a scientist's viewpoint, such an approach would look like designing the software, and then adding in the science at the last stage.

Rush to do science
Part of the problem is that in their rush to do science, scientists fail to spot the software for what it is: the analogue of the experimental instrument. Consequently, it needs to be treated with the same respect that a physical experiment would receive.

Any reputable physical experiment would ensure the instruments are appropriate to the job and have been tested. They would be checked for known error behaviour in the parameter regions of study, and chosen for their ability to give a satisfactory result within a useful timeframe and budget. Those same principles should apply to a software model.

Choose the right methods or algorithms to give scientifically valid predictions within a useful timeframe.

Make sure the model or implementation is tested for the use it will be put to. To spell it out: testing only a small part of parameter space is not good enough if the code is going to be used across a wider region. Quantify the error behaviour of both the method and its specific implementation.
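As a rough illustration of what testing across parameter space and quantifying error behaviour might look like in practice, here is a minimal sketch. It is not taken from any real research code. The trapezoidal rule, the test integrand exp(-k*x), the parameter k and all function names are assumptions chosen only because the exact answer is known in closed form.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def exact_integral(k):
    """Closed-form integral of exp(-k*x) over [0, 1]."""
    return (1.0 - math.exp(-k)) / k


def relative_error(k, n):
    """Relative error of the implementation for parameter k and resolution n."""
    approx = trapezoid(lambda x: math.exp(-k * x), 0.0, 1.0, n)
    return abs(approx - exact_integral(k)) / exact_integral(k)


def observed_order(k, n):
    """Estimate the convergence order from the error ratio when n doubles."""
    return math.log2(relative_error(k, n) / relative_error(k, 2 * n))


if __name__ == "__main__":
    # Sweep the whole parameter range the code will be used on, not one point.
    for k in (0.1, 1.0, 10.0, 50.0):
        print(f"k={k:5.1f}  rel_err(n=64)={relative_error(k, 64):.2e}  "
              f"order={observed_order(k, 64):.2f}")
```

The sweep checks two separate things. The observed order shows whether the implementation converges at the rate the method promises (second order for the trapezoidal rule). The relative error at fixed resolution shows where in parameter space that resolution stops being adequate. Here the error grows roughly with the square of k, so a resolution validated only at small k would be misleading at the upper end of the range.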

On the other side of the coin is the (very valid) need to balance the quality of the tool with using it to do science. A gold-standard code is just as useless as an untested write-once-and-use model, if the time taken to make that software perfect delivers it to the user too late for science to be done.

Just as in business, most science puts a value on time — a good enough result today is often worth more than an incredibly precise result next week — whether it is for paper publication, for informing a business decision, or a product design.

Programming goals
The trick then must be to ensure the scientist code developer understands the methods of numerical software engineering, as well as its issues. Software engineers on the team must equally understand that the code is just part of the science, and not usually a goal in its own right.

My colleague was right that too much of our scientific code base lacks solid numerical software engineering foundations. That potential weakness puts the correctness and performance of code at risk when major renovation of the code is required, for example to cope with the disruptive effect of multicore nodes or the very large degrees of parallelism on upcoming supercomputers.

However, we must also beware of the temptation to drive towards heavily engineered code throughout. Otherwise we run the risk that each piece of code gains a perceived value from historic investment that is hard to discard. And perhaps in some cases, what we need as much as renovation is to discard and restart.

As vice president of HPC at the Numerical Algorithms Group, Andrew Jones leads the company's HPC services and consulting business, providing expertise in parallel, scalable and robust software development. Jones is well known in the supercomputing community. He is a former head of HPC at the University of Manchester and has more than 10 years' experience in HPC as an end user.

Talkback

Our experience over 14 years with Software Carpentry (http://software-carpentry.org) has been that teaching scientists and engineers a few basic software development tools and techniques (such as version control, modular design, test-driven development, and so on) pays bigger dividends than anything else, and that without these skills, they're not able to take advantage of HPC, implement more sophisticated numerical algorithms, and so on. It's therefore ironic that NAG (the employer of this article's author) chose not to support the project...

gvwilson 7 March, 2011 11:44

Strange to see this comment posted a year after the article was written/posted (and a year after NAG's last contact with Greg).

Greg's project does look interesting and useful and I encourage readers to take a look and get involved if appropriate. It is true that NAG was unable to contribute funding when approached. As we noted to Greg at the time, our budget for collaborative activities of this kind was already fully committed for that year - supporting five graduate students, sponsoring two MSc programmes, and supporting several research projects with Universities, and providing training as part of a Doctoral Training Centre (cross-disciplinary graduate school).

I would also encourage readers to look at NAG's current student awards (http://www.nag.co.uk/about/student_awards/) which we are hoping to expand this year.

Andrew (article author)

Andrew_Jones 8 March, 2011 10:07