OpenMP Performance in Numerical Simulation of Dambreak Problem Using Shallow Water Equations

Numerical simulation of water surface waves is widely used to describe water flow and its impact on human life. For instance, numerical wave simulation is used to simulate tsunamis as part of early warning systems. Compared with the conventional laboratory approach, a numerical study of water flow reduces cost and saves time. The shallow water equations (SWE) are one of the mathematical models that can be used to describe water flow, and the finite volume method is a robust method for approximating them. The quality of the numerical result depends on the number of grid points: the more grid points are used, the smoother the solution that can be obtained. However, increasing the number of grid points also increases the computational cost. In this paper, parallel computing on the OpenMP platform is used to reduce the computational cost of the numerical simulation. For a simulation using 6400 grid points, the speedup and efficiency obtained are about four times and 51%, respectively. Moreover, for several numbers of cores from 2 to 8, the CPU time of the parallel computation is shown to decrease as the number of cores increases.


Introduction
The dynamical movement of surface waves can be modeled using various models. A simple mathematical model that describes wave movement dynamically is the shallow water equations (SWE). This model is widely used to describe fluid flow problems, such as flow in canals, rivers, and lakes, and it can be used to simulate tsunami phenomena as part of an early warning system (see [1,2,3] for more detail). SWE is a system of hyperbolic equations consisting of two equations (mass and momentum conservation). In one space dimension, SWE is given as follows,
$$\partial_t h + \partial_x (hu) = 0, \tag{1}$$
$$\partial_t (hu) + \partial_x \left( hu^2 + \tfrac{1}{2} g h^2 \right) = 0, \tag{2}$$
where $h(x, t)$ describes the water height, $u(x, t)$ describes the average velocity, $g$ is the gravitational coefficient, and $x$ and $t$ are space and time, respectively.
To solve (1)-(2) numerically, a robust method called the finite volume method (FVM) can be used [4,5,6]. FVM is widely used to approximate hyperbolic equations. Generally, there are two types of approach in FVM, the staggered grid and the collocated grid. The details of these two approaches can be found in [1,6,7,8]. As shown in [6] and [8], both the collocated and the staggered FVM satisfy the mathematical properties of the shallow water equations, i.e., they preserve the positivity of the water height, satisfy the well-balanced condition, etc.
Mathematically, the quality of the approximation depends on the size of the space step, or grid. This size is obtained by dividing the spatial domain into several discrete cells [9]. Increasing the number of grid points, however, makes approximating (1)-(2) computationally expensive. In the numerical scheme for (1)-(2), two equations (mass and momentum) are approximated at every grid point, so the execution time becomes long when a large number of grid points is used.
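To make the growth in cost concrete, a rough operation count can be sketched. It assumes an explicit scheme on the domain $[0, L] \times [0, T]$ whose time step is tied to the space step by a Courant-type stability condition; the work $W$, the largest wave speed $\lambda_{\max}$, and the Courant number $\nu$ are symbols introduced here only for illustration.

```latex
W \;\propto\; N_x N_t, \qquad
N_t = \frac{T}{\Delta t} = \frac{T\,\lambda_{\max}}{\nu\,\Delta x}
    = \frac{T\,\lambda_{\max}}{\nu L}\, N_x
\quad\Longrightarrow\quad
W \;\propto\; N_x^{2} .
```

Under these assumptions, doubling the number of grid points roughly quadruples the work of the simulation.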
Here, the computational cost can be reduced by applying a computer science technique called parallel computing. In this case, the computational tasks are distributed over several cores of a single computer. Several references [9,10,11,12,13] show the ability of parallel computing to tackle the computational cost of numerical approaches. The goal of this paper is to implement multi-core parallel computing in a collocated scheme for SWE. Moreover, numerical simulations of the dry and wet dam-break problems are used to investigate the performance of parallel computing.
The rest of this paper is organized as follows. Section 2 gives a brief introduction to the collocated FVM scheme with the HLLE flux for SWE. Section 3 presents the parallel algorithm of the numerical scheme. The numerical results and parallel performance are provided in Section 4, and the conclusion is given in Section 5.

Numerical Scheme
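For simplicity, SWE (1)-(2) can be rewritten in the following compact form,
$$\partial_t U + \partial_x F(U) = 0, \tag{3}$$
with the vector of conserved variables and the flux function
$$U = \begin{pmatrix} h \\ hu \end{pmatrix}, \tag{4}$$
$$F(U) = \begin{pmatrix} hu \\ hu^2 + \tfrac{1}{2} g h^2 \end{pmatrix}. \tag{5}$$
In FVM, the spatial and time domain is discretized into several control volumes. For instance, in Figure 1 a control volume $V_k$ is given at point $k$. This control volume is defined on $(x_{k-1/2}, x_{k+1/2}) \times (t^n, t^{n+1})$. Consider the computational domain $\Omega = [0, L] \times [0, T]$; then the space and time steps can be defined as
$$\Delta x = \frac{L}{N_x}, \qquad \Delta t = \frac{T}{N_t},$$
where $N_x$ and $N_t$ are the number of discrete points in space and time, respectively.
Let $U_k^n$, $k \in \mathbb{Z}$, $n \in \mathbb{N}$, be the discrete value of the solution of SWE (3); it can be written as the cell average
$$U_k^n \approx \frac{1}{\Delta x} \int_{x_{k-1/2}}^{x_{k+1/2}} U(x, t^n)\, dx.$$
Therefore, in the collocated FVM scheme, the discretization of SWE is given as
$$U_k^{n+1} = U_k^n - \frac{\Delta t}{\Delta x} \left( F^n_{k+1/2} - F^n_{k-1/2} \right), \tag{7}$$
where the flux $F^n_{k \pm 1/2}$ is approximated using the numerical flux called HLLE (Harten, Lax, van Leer and Einfeldt), given as
$$F^n_{k+1/2} = \frac{a_2 F(U_k^n) - a_1 F(U_{k+1}^n) + a_1 a_2 \left( U_{k+1}^n - U_k^n \right)}{a_2 - a_1},$$
where $F(U_k)$ is the flux function (5) evaluated at $U_k$. Meanwhile, the coefficients $a_1$ and $a_2$ are the lower and upper wave-speed bounds at the cell interface, with $a_1 \le 0 \le a_2$; they are built from the coefficients $\lambda_1$ and $\lambda_2$, which can be found in several references, for instance [4,14,15]. Inserting this numerical flux into the discretization (7) gives the fully discrete form (10). Note that the numerical form (10) is subject to a stability condition, given by
$$\Delta t \le \nu\, \frac{\Delta x}{\max_k \left( |u_k^n| + \sqrt{g h_k^n} \right)}, \tag{11}$$
where $0 < \nu \le 1$ is called the Courant number.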
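The update (7) with an HLL-type interface flux can be sketched in C++ as follows. This is a minimal sketch under stated assumptions, not the implementation used in this paper: the names `State`, `Flux`, `velocity`, `physical_flux`, `hlle_flux` and `step` are hypothetical, the value of `G` and the dry threshold `H_DRY` are assumed, and the wave-speed bounds `a1`, `a2` are a simple choice that may differ from the HLLE coefficients given in [4,14,15].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

const double G = 9.81;       // gravitational coefficient (assumed value)
const double H_DRY = 1e-12;  // heights below this count as dry (assumed threshold)

struct State { double h, hu; };           // conserved variables U = (h, hu)
struct Flux  { double mass, momentum; };  // the two components of F(U)

// Velocity u = hu / h, set to zero in (nearly) dry cells.
double velocity(const State& s) {
    return s.h > H_DRY ? s.hu / s.h : 0.0;
}

// Physical flux F(U) = (hu, hu^2 + g h^2 / 2).
Flux physical_flux(const State& s) {
    const double u = velocity(s);
    return {s.hu, s.hu * u + 0.5 * G * s.h * s.h};
}

// HLL-type interface flux between a left state l and a right state r.
Flux hlle_flux(const State& l, const State& r) {
    const double ul = velocity(l), ur = velocity(r);
    const double cl = std::sqrt(G * std::max(l.h, 0.0));
    const double cr = std::sqrt(G * std::max(r.h, 0.0));
    // Assumed wave-speed bounds, clipped so that a1 <= 0 <= a2.
    const double a1 = std::min({ul - cl, ur - cr, 0.0});
    const double a2 = std::max({ul + cl, ur + cr, 0.0});
    if (a2 - a1 < 1e-14) return {0.0, 0.0};  // both states dry and at rest
    const Flux fl = physical_flux(l), fr = physical_flux(r);
    return {(a2 * fl.mass - a1 * fr.mass + a1 * a2 * (r.h - l.h)) / (a2 - a1),
            (a2 * fl.momentum - a1 * fr.momentum + a1 * a2 * (r.hu - l.hu)) / (a2 - a1)};
}

// One explicit update of the interior cells; boundary cells are kept fixed.
std::vector<State> step(const std::vector<State>& U, double dt, double dx) {
    assert(U.size() >= 3 && dx > 0.0);
    std::vector<State> V = U;
    for (std::size_t k = 1; k + 1 < U.size(); ++k) {
        const Flux right = hlle_flux(U[k], U[k + 1]);
        const Flux left = hlle_flux(U[k - 1], U[k]);
        V[k].h = U[k].h - dt / dx * (right.mass - left.mass);
        V[k].hu = U[k].hu - dt / dx * (right.momentum - left.momentum);
    }
    return V;
}
```

Two properties are easy to check with this sketch: for identical left and right states the interface flux reduces to the physical flux, and a uniform state at rest is left unchanged by `step`.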

Parallel Architecture
Parallel computing is a computational procedure in which several computational tasks are carried out simultaneously. This type of computing can be done on a single multi-core computer or on multiple computers. One popular platform for multi-core parallel computing is OpenMP (Open Multi-Processing). It follows the shared-memory multiprocessing programming model and can be used from several programming languages such as C/C++ and Fortran.
For example, in [9], parallel computing on the OpenMP platform is shown to reduce the computational time for solving the 1D heat equation. Moreover, OpenMP is shown to be simple and straightforward to apply. The performance of OpenMP depends on the specification of the computer. In this paper, two parallel performance metrics are considered.
Here, the speedup and efficiency metrics are given. The speedup is obtained by
$$S = \frac{T_s}{T_p}, \tag{12}$$
and the efficiency by
$$E = \frac{S}{p} \times 100\%, \tag{13}$$
where $T_s$ and $T_p$ are the CPU times of the serial and parallel computations, respectively, and $p$ is the number of cores. In this paper, the numerical method (7) is computed in parallel. A numerical algorithm for computing (7) in parallel is shown in Figure 2. The algorithm is divided into two areas, serial and parallel. As shown in Figure 2, the initialization of $U_k^0$ and the evaluation of the CFL condition are done in serial, since these two processes are not suited to parallel computing. Meanwhile, parallel computing with OpenMP starts at the inner loop stage, which computes (7) by updating the water height and velocity variables. Note that the serial numerical algorithm is the same as in Figure 2, except that OpenMP is not applied in the parallel area.
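The split between the serial and parallel areas can be illustrated with a short C++ sketch. It is not the code used in this paper: the function names `flux` and `simulate`, the value of `G`, the simple HLL-type wave-speed bounds, and the transmissive boundary treatment are all assumptions made for illustration. The time step is computed in serial from a Courant-type condition, and the inner loop over cells carries an OpenMP `parallel for` directive, which is valid because each iteration reads only the old arrays and writes only its own entry of the new ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

const double G = 9.81;  // gravitational coefficient (assumed value)

// HLL-type flux for U = (h, q) with q = h*u; out[0] is the mass flux and
// out[1] the momentum flux. The bounds a1, a2 are an assumed simple choice.
static void flux(double hl, double ql, double hr, double qr, double out[2]) {
    const double ul = hl > 1e-12 ? ql / hl : 0.0;
    const double ur = hr > 1e-12 ? qr / hr : 0.0;
    const double cl = std::sqrt(G * std::max(hl, 0.0));
    const double cr = std::sqrt(G * std::max(hr, 0.0));
    const double a1 = std::min({ul - cl, ur - cr, 0.0});
    const double a2 = std::max({ul + cl, ur + cr, 0.0});
    if (a2 - a1 < 1e-14) { out[0] = 0.0; out[1] = 0.0; return; }
    const double fl0 = ql, fl1 = ql * ul + 0.5 * G * hl * hl;
    const double fr0 = qr, fr1 = qr * ur + 0.5 * G * hr * hr;
    out[0] = (a2 * fl0 - a1 * fr0 + a1 * a2 * (hr - hl)) / (a2 - a1);
    out[1] = (a2 * fl1 - a1 * fr1 + a1 * a2 * (qr - ql)) / (a2 - a1);
}

// Advance h and q on cells of width dx up to time t_end with Courant number nu.
void simulate(std::vector<double>& h, std::vector<double>& q,
              double dx, double t_end, double nu) {
    const int nx = static_cast<int>(h.size());
    assert(nx >= 3 && q.size() == h.size() && dx > 0.0 && nu > 0.0);
    std::vector<double> hn(h), qn(q);
    double t = 0.0;
    while (t < t_end) {
        // Serial area: time step from the Courant-type stability condition.
        double smax = 1e-12;
        for (int k = 0; k < nx; ++k) {
            const double u = h[k] > 1e-12 ? q[k] / h[k] : 0.0;
            smax = std::max(smax, std::fabs(u) + std::sqrt(G * std::max(h[k], 0.0)));
        }
        const double dt = std::min(nu * dx / smax, t_end - t);

        // Parallel area: iterations are independent, so the loop can be shared
        // among threads. fl and fr are declared inside and are thread-private.
        #pragma omp parallel for
        for (int k = 1; k < nx - 1; ++k) {
            double fl[2], fr[2];
            flux(h[k - 1], q[k - 1], h[k], q[k], fl);
            flux(h[k], q[k], h[k + 1], q[k + 1], fr);
            hn[k] = h[k] - dt / dx * (fr[0] - fl[0]);
            qn[k] = q[k] - dt / dx * (fr[1] - fl[1]);
        }
        // Boundary cells copy their neighbours (transmissive, an assumption).
        hn[0] = hn[1];           qn[0] = qn[1];
        hn[nx - 1] = hn[nx - 2]; qn[nx - 1] = qn[nx - 2];
        h.swap(hn);
        q.swap(qn);
        t += dt;
    }
}
```

With GCC, the directive takes effect when the code is compiled with `-fopenmp`; without that flag the pragma is ignored and the same code runs serially, which mirrors the remark that the serial algorithm is the parallel one with OpenMP switched off.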

Numerical Results and Parallel performances
To obtain the results of the numerical simulation and parallel implementation, a computer with the specification given in Table 1 is used.
Table 1. Computer specification.
Operating System    CentOS 6.5
Processors          AMD, 2 sockets @ 4 cores
RAM                 8 GB

Numerical Simulation of Dry and Wet Dambreak
The dam-break problem is very popular in the numerical simulation of SWE. This problem produces shock phenomena, and handling the resulting discontinuous solution is a big challenge for a numerical scheme [14]. Here, two problems are considered: dam-break on a dry bed and on a wet bed. For the dry-bed problem on the spatial domain [0, 1], the initial water height $h(x, 0)$ is piecewise constant, with water on the left of the dam wall and a dry bed, $h(x, 0) = 0$, on the right, and the initial velocity is zero, $u(x, 0) = 0$.
The difference between the dry-bed and wet-bed simulations lies on the right side of the dam wall (in this case at $x = 0.5$). Numerical results of the dam-break simulations, with the $h$ and $u$ profiles, are shown in Figure 3. As can be seen in Figure 3, both the dry-bed and the wet-bed simulations are well resolved. These results are similar to the analytical solution of the dry-bed dam-break problem provided by the SWASHES software, which can be found in [16]. In Figure 3 (left), the water height profile for the wet bed produces a shock near $x = 0.8$, caused by the different energy associated with the different water heights. This phenomenon satisfies the Rankine-Hugoniot relation [14].
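The two initial configurations can be set up with one small routine, sketched below in C++. The struct `Profile` and the function `dam_break_initial` are hypothetical names, and the water heights passed in are placeholders chosen by the caller, since the values used in the simulations are not restated here; only the domain [0, 1], the dam position $x = 0.5$, and the zero initial velocity are taken from the text.

```cpp
#include <cassert>
#include <vector>

struct Profile {
    std::vector<double> x;  // cell-centre coordinates
    std::vector<double> h;  // water height h(x, 0)
    std::vector<double> u;  // velocity u(x, 0)
};

// Piecewise-constant dam-break data on [0, 1] with the dam wall at x = 0.5.
// h_right = 0 gives the dry-bed case, h_right > 0 the wet-bed case.
Profile dam_break_initial(int nx, double h_left, double h_right) {
    assert(nx > 0 && h_left >= 0.0 && h_right >= 0.0);
    const double x_dam = 0.5;    // dam wall position
    const double dx = 1.0 / nx;  // uniform cells (assumed grid layout)
    Profile p;
    p.x.resize(nx);
    p.h.resize(nx);
    p.u.assign(nx, 0.0);         // water initially at rest
    for (int k = 0; k < nx; ++k) {
        p.x[k] = (k + 0.5) * dx;
        p.h[k] = (p.x[k] < x_dam) ? h_left : h_right;
    }
    return p;
}
```

For example, `dam_break_initial(200, 1.0, 0.0)` would describe a dry bed and `dam_break_initial(200, 1.0, 0.2)` a wet bed, with both heights being illustrative values only.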

Parallel Implementation
In this section, the performance of OpenMP in simulating the dam-break problems is given. First, the CPU times of both numerical simulations (wet and dry dam-break) are compared in Figure 4, for both serial and parallel computation. Several grid sizes are used to assess the OpenMP performance, in this case $N_x \in \{200, 400, 800, 1600, 3200, 6400\}$. In the parallel implementation, eight cores are used. Figure 4 shows a similar CPU time profile for both numerical simulations. Moreover, for both problems, the CPU time of the serial computation with $N_x = 3200$ is similar to that of the parallel computation with $N_x = 6400$. This indicates that the OpenMP platform is successfully applied and reduces the computational cost of the serial code.
The other parallel performance metrics, speedup and efficiency, are shown in Figure 5. These metrics show how fast and how efficient parallel computing is in reducing the computational cost. As shown in Figure 5 (left), the speedup of parallel computing for both problems reaches about four times that of serial computing. Moreover, since eight cores are used in this experiment, the efficiency of parallel computing is approximately 51%, as shown in Figure 5 (right). This means that each core delivers, on average, only about 51% of the ideal linear speedup. The reason, as can be seen in the parallel numerical algorithm (see Figure 2 for more detail), is that not all areas of the computation can be parallelized; some areas remain serial. These performance metrics are obtained from equations (12) and (13).
In addition, parallel simulations using several numbers of cores (2, 3, 4, 8) are also carried out. The results for the dry and wet dam-break problems can be seen in Figure 6.
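The reported figures can be cross-checked with a short calculation. It assumes the usual definitions of speedup $S$ and efficiency $E$, where $T_s$ and $T_p$ denote the serial and parallel CPU times and $p$ the number of cores; these symbols are introduced here for illustration.

```latex
S = \frac{T_s}{T_p}, \qquad E = \frac{S}{p} \times 100\%
\quad\Longrightarrow\quad
S = p \cdot \frac{E}{100\%} \approx 8 \times 0.51 = 4.08 .
```

An efficiency of about 51% on eight cores therefore corresponds to a speedup of about four, that is, a parallel run time of roughly one quarter of the serial run time.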
As shown in Figure 6, increasing the number of cores from 2 to 8 decreases the CPU time, because with more cores some tasks are executed faster than with fewer cores. This result holds for both problems. However, increasing the number of cores further does not guarantee that the CPU time keeps decreasing, since the efficiency factor becomes an obstacle in multi-core parallel programming.
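One way to illustrate this limit is Amdahl's law, which is not used in this paper and is assumed here only as an idealized model. If a fraction $f$ of the serial run time can be parallelized perfectly over $p$ cores and the rest stays serial, then, taking the rounded speedup of four on eight cores as input:

```latex
S(p) = \frac{1}{(1 - f) + f/p}, \qquad
S(8) = 4 \;\Longrightarrow\; \frac{1}{4} = 1 - \frac{7}{8} f
\;\Longrightarrow\; f = \frac{6}{7} \approx 0.86, \qquad
\lim_{p \to \infty} S(p) = \frac{1}{1 - f} = 7 .
```

Under this model, adding cores could not push the speedup beyond about seven, and thread management overhead, which the model ignores, may lower the attainable value further.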

Conclusion
The parallel computing performance of OpenMP in simulating the dry and wet dam-break problems with the shallow water equations has been investigated. The two numerical simulations of the dam-break problem have been well resolved. OpenMP is shown to reduce the CPU time satisfactorily for several grid sizes. The speedup of the parallel simulation reaches about four times that of serial computing. Moreover, the efficiency of the numerical simulation using eight cores is approximately 51% for $N_x = 6400$ grid points.