Are we taking supercomputing code seriously?

COMMENT

Is it right that much of the supercomputing code driving scientific research and engineering design is written by people who are not software professionals, asks Andrew Jones.

Someone remarked to me recently that the problem with scientific software is that most of it is written by amateurs. Harsh perhaps, but it got me thinking. The point behind the remark is that most of the software used for simulation in scientific research, especially on supercomputers, is written by scientists rather than by professional numerical software engineers.

By implication, this state of affairs might be responsible for much of what some people see as the mess we are in with respect to assurance of results from the models and portable performance of the codes. The same argument might also be extended to engineering packages and data modelling.

Immediate need
A scientist writing software for research is obviously focused on creating code that is good enough to get a useful result in a reasonable timeframe. It is hard for individual scientists and institutions to spend the extra effort in time and money to look beyond the immediate need and make sure the software meets certain standards.

Ideally, they should examine whether the implementation has been rigorously tested, has specified areas of assured validity, and allows for potential future use. For example, is the software portable — both in terms of performance and robustness — and extendable?

Once the researcher has a piece of code that gives a believable result for the parameter space of immediate interest, the focus switches back to using the code for science rather than adding engineering quality to the code.

The idea of building in comprehensive software engineering from the start in the code itself and in the development and testing process will often be dismissed before it gets serious consideration. From a scientist's viewpoint, such an approach would look like designing the software, and then adding in the science at the last stage.

Rush to do science
Part of the problem is that in their rush to do science, scientists fail to spot the software for what it is: the analogue of the experimental instrument. Consequently, it needs to be treated with the same respect that a physical experiment would receive.

Any reputable physical experiment would ensure the instruments are appropriate to the job and have been tested. They would be checked for known error behaviour in the parameter regions of study, and chosen for their ability to give a satisfactory result within a useful timeframe and budget. Those same principles should apply to a software model.

Choose the right methods or algorithms to give scientifically valid predictions within a useful timeframe .

Make sure the model or implementation is tested for the use it will be put to. To spell it out — it is not good enough just being tested for a small part of parameter space if it is going to be used across a wider region. Quantify the error behaviour both of the method and its specific implementation.

On the other side of the coin is the (very valid) need to balance the quality of the tool with using it to do science. A gold-standard code is just as useless as an untested write-once-and-use model, if the time taken to make that software perfect delivers it to the user too late for science to be done.

Just as in business, most science puts a value on time — a good enough result today is often worth more than an incredibly precise result next week — whether it is for paper publication, for informing a business decision, or a product design.

Programming goals
The trick then must be to ensure the scientist code developer understands the methods of numerical software engineering, as well as its issues. Software engineers on the team must equally understand that the code is just part of the science, and not usually a goal in its own right.

My colleague was right that too much of our scientific code base lacks solid numerical software engineering foundations. That potential weakness puts the correctness and performance of code at risk when major renovation of the code is required, such as the disruptive effect of multicore nodes, or very large degrees of parallelism on upcoming supercomputers.

However, we must also beware of the temptation to drive towards heavily engineered code throughout. Otherwise we run the risk that each piece of code gains a perceived value from historic investment that is hard to discard. And perhaps in some cases, what we need as much as renovation is to discard and restart.

As vice president of HPC at the Numerical Algorithms Group, Andrew Jones leads the company's HPC services and consulting business, providing expertise in parallel, scalable and robust software development. Jones is well known in the supercomputing community. He is a former head of HPC at the University of Manchester and has more than 10 years' experience in HPC as an end user.

Post your comment

In order to post a comment you need to be registered and logged in

Log in or create your ZDNet UK account below

Will not be displayed with your comment

By signing up for this service, you indicate that you agree to our Terms and Conditions and have read and understood our Privacy Policy. Questions about membership? Find the answers in the Community FAQ

ZDNet UK Live

nikeshoes998

Oracle signs Solaris deals with HP and Dell: Find the answers in the Community FAQ free shipping wholesale product... http://bit.ly/bcjQtY

mensapparel2010

Oracle signs Solaris deals with HP and Dell: Find the answers in the Community FAQ free shipping wholesale product... http://bit.ly/9GWZRh

womensapparel20

Oracle signs Solaris deals with HP and Dell: Find the answers in the Community FAQ free shipping wholesale product... http://bit.ly/bPLHL8

lisabarnes001

Oracle signs Solaris deals with HP and Dell: Find the answers in the Community FAQ free shipping wholesale product... http://bit.ly/bVw3F2

KC616

Oracle signs Solaris deals with HP and Dell: Find the answers in the Community FAQ free shipping wholesale product... http://bit.ly/cDUyaj

KC616

free shipping wholesale products: Read more »h handbags,NIKE shoes, jewelry, watches, and jacket and so on. We gua... http://bit.ly/cWcW1e

SpyScroll

Cyberwar defence plan is essential, says former CIA head: Michael Hayden, former head of the CIA and the National ... http://bit.ly/beLpKQ

Droid_News

SAP leads businesses into augmented reality http://bit.ly/9eMWYp | #Droid #Android

wholesalegurru

free shipping wholesale products: We mainly supply top mirror quality brand name products, such as wholesale handb... http://bit.ly/cWcW1e

CNSInstructor

Cyberwar defence plan is essential, says former CIA head: Michael Hayden, former head of the CIA and the N... http://bit.ly/9sn6ax #pdln4nx

AllAboutFashion

Oracle signs Solaris deals with HP and Dell http://bit.ly/9KVeqD

Droid_Phone

SAP leads businesses into augmented reality http://bit.ly/9eMWYp | #Droid #Android

AllAboutFashion

free shipping wholesale products http://bit.ly/c7cpX4

Droid_Phone

TalkTalk to sell mobile services via Vodafone deal http://bit.ly/bLVfxI | #Droid #Android

wholesalegurru

Oracle signs Solaris deals with HP and Dell: Find the answers in the Community FAQ free shipping wholesale product... http://bit.ly/cDUyaj

wholesalegurru

free shipping wholesale products: Read more »h handbags,NIKE shoes, jewelry, watches, and jacket and so on. We gua... http://bit.ly/cWcW1e

felixsprisci

DoJ joins whistleblower in Oracle fraud suit http://bit.ly/bMT3SJ

actatrudy

Update: free shipping wholesale products - ZDNet UK (... http://www.actahandbags.com/trends/free-shipping-wholesale-products-zdnet-uk-blog/

lisabarnes001

free shipping wholesale products: Read more »h handbags,NIKE shoes, jewelry, watches, and jacket and so on. We gua... http://bit.ly/bRvFgG

mensapparel2010

free shipping wholesale products: Read more »h handbags,NIKE shoes, jewelry, watches, and jacket and so on. We gua... http://bit.ly/9CXYG9

Featured white papers

The need for email archiving

Without an effective system for archiving emails, organisations can find themselves unable to recover vital business records, leaving them open..

Download now

Dell Data Storage Summary

This study was conducted in the United States amoung IT decision makers with involvement in data centre purchases at companies..

Download now

Datasheet: Infrastructure as a Service

'Infrastructure as a Service' gives enterprises the flexibility to subscribe to the compute power and storage they require today with 'pay..

Download now