Accelerating High Arithmetic Intensity Storm Surge Model using CUDA

Munesh Singh Chauhan, Mayyada Hammoshi, Balaqis Abdallah Salim Al-Bahri

Abstract


GPUs (Graphics Processing Units) have opened the floodgates for high-performance computing, especially for programs with high arithmetic intensity. Parallel computing has been available for many years, but it has traditionally required large and expensive hardware in the form of supercomputers. GPUs are now reasonably priced and found in most computing devices, and they are increasingly being used to speed up legacy codes, especially FORTRAN codes with high arithmetic intensity. One such application is a weather simulation program, the storm surge model, which is converted here from FORTRAN into CUDA (Compute Unified Device Architecture) C. The pitfalls and challenges faced during the conversion are described, including data type equivalences, file I/O, multi-dimensional array handling, and equivalents for FORTRAN mathematical functions. The conversion is divided into two major stages: FORTRAN to C, and C to CUDA C. Each difficulty has been tested as a separate case study, with emphasis on both speed and accuracy of the results. Finally, parallelism is not a panacea: code segments with data dependences or file I/O are not suitable candidates for conversion.

Keywords: CUDA C, FORTRAN, GPUs, high-level arithmetic computations, parallel computing
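To make the multi-dimensional array issue concrete, the following is a minimal sketch of one common way to preserve FORTRAN's array layout when porting to C. It is not taken from the article. It assumes a two-dimensional FORTRAN array declared as A(nrows, ncols) with the default 1-based indices and column-major storage, and it maps that array onto a flat C buffer. The names fidx and fill_demo are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the article): flat offset of the FORTRAN
 * element A(i, j), where A is declared A(nrows, ncols), indices are
 * 1-based, and storage is column-major (the first index varies fastest). */
static size_t fidx(size_t i, size_t j, size_t nrows)
{
    assert(i >= 1 && i <= nrows && j >= 1);
    return (j - 1) * nrows + (i - 1);
}

/* Hypothetical demo: fill a flat buffer so that A(i, j) holds 10*i + j.
 * The caller must supply a buffer of at least nrows * ncols doubles. */
static void fill_demo(double *a, size_t nrows, size_t ncols)
{
    for (size_t j = 1; j <= ncols; ++j) {       /* outer loop over columns */
        for (size_t i = 1; i <= nrows; ++i) {   /* inner loop walks contiguous memory */
            a[fidx(i, j, nrows)] = 10.0 * (double)i + (double)j;
        }
    }
}
```

Keeping the column-major layout in a flat buffer means that data files written by the original FORTRAN program can be read without transposition, and the same flat buffer can later be copied to the GPU unchanged. A native C array a[i][j] is row-major, so a direct transliteration of the indices would silently address different elements.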

Cite this Article
Munesh Singh Chauhan. Accelerating High Arithmetic Intensity Storm Surge Model using CUDA. Recent Trends in Parallel Computing. 2016; 3(2): 9–21p.



