Author
Listed:
- Christian Feld (Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH)
- Markus Geimer (Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH)
- Marc-André Hermanns (Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH)
- Pavel Saviankou (Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH)
- Anke Visser (Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH)
- Bernd Mohr (Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH)
Abstract
Software reliability is one of the cornerstones of any successful user experience. Software needs to build up the users’ trust in its fitness for a specific purpose. Software failures undermine this trust and add to user frustration, which ultimately leads users to abandon the software. Beyond user expectations of robustness, today’s scientific software is more than a temporary research prototype: it also forms the bedrock for successful scientific research in the future. A well-defined software engineering process that includes automated builds and tests is a key enabler for keeping software reliable in an agile scientific environment and should be of vital interest to any scientific software development team. While automated builds and deployment as well as systematic software testing have become common practice in industrial software development, they are rarely used for scientific software, including tools. Potential reasons are that (1) in contrast to computer scientists, domain scientists from other fields are usually never exposed to such techniques during their training, (2) building up the necessary infrastructure is often considered overhead that distracts from the real science, (3) interdisciplinary research teams are still rare, and (4) high-performance computing systems and their programming environments are less standardized, so that published recipes often cannot be applied without heavy modification. In this work, we present the various challenges we encountered while setting up an automated building and testing infrastructure for the Score-P, Scalasca, and Cube projects. We outline our current approaches, the alternatives that were considered, and the open issues that still need to be addressed to further increase software quality and thus, ultimately, improve the user experience.
Suggested Citation
Christian Feld & Markus Geimer & Marc-André Hermanns & Pavel Saviankou & Anke Visser & Bernd Mohr, 2021.
"Detecting Disaster Before It Strikes: On the Challenges of Automated Building and Testing in HPC Environments,"
Springer Books, in: Hartmut Mix & Christoph Niethammer & Huan Zhou & Wolfgang E. Nagel & Michael M. Resch (ed.), Tools for High Performance Computing 2018 / 2019, pages 3-26,
Springer.
Handle: RePEc:spr:sprchp:978-3-030-66057-4_1
DOI: 10.1007/978-3-030-66057-4_1