Codes Shared for Research and Education : AGATE: GPU-accelerated 3-D Chebyshev collocation methods for elliptic equations

Purpose

The package "Agate" provides a GPU-accelerated Chebyshev collocation method for solving a single elliptic equation with Dirichlet, Neumann, or Robin boundary conditions on a 3-D rectangular domain.
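
For readers new to Chebyshev collocation, the following is a minimal 1-D sketch of its basic building block: the Chebyshev-Gauss-Lobatto nodes and the dense first-derivative matrix built on them. It is not taken from Agate; the function names (cheb_nodes, cheb_diff_matrix, matvec) are hypothetical and the formulas are the standard textbook ones, so treat it as an illustration under those assumptions rather than the package's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Chebyshev-Gauss-Lobatto nodes x_j = cos(pi*j/N), j = 0..N, on [-1, 1].
std::vector<double> cheb_nodes(int N) {
    assert(N >= 1);
    const double pi = std::acos(-1.0);
    std::vector<double> x(N + 1);
    for (int j = 0; j <= N; ++j) x[j] = std::cos(pi * j / N);
    return x;
}

// Dense (N+1)x(N+1) first-derivative matrix in row-major order.
// Off-diagonal: D_ij = (c_i/c_j) * (-1)^(i+j) / (x_i - x_j),
// with c = 2 at the two end nodes and c = 1 elsewhere.
std::vector<double> cheb_diff_matrix(const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(n >= 2);
    std::vector<double> D(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        const double ci = (i == 0 || i == n - 1) ? 2.0 : 1.0;
        double row_sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;
            const double cj = (j == 0 || j == n - 1) ? 2.0 : 1.0;
            const double sign = ((i + j) % 2 == 0) ? 1.0 : -1.0;
            D[i * n + j] = sign * ci / (cj * (x[i] - x[j]));
            row_sum += D[i * n + j];
        }
        // Each row must sum to zero (the derivative of a constant is zero).
        D[i * n + i] = -row_sum;
    }
    return D;
}

// y = D * v for a row-major square matrix D.
std::vector<double> matvec(const std::vector<double>& D,
                           const std::vector<double>& v) {
    const std::size_t n = v.size();
    assert(D.size() == n * n);
    std::vector<double> y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            y[i] += D[i * n + j] * v[j];
    return y;
}
```

Applying matvec to the samples of $e^{x}$ at the nodes should reproduce $e^{x}$ to near machine precision even for small N, which is the spectral accuracy that motivates the method. A second-derivative matrix can be obtained by multiplying the first-derivative matrix by itself.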

Specifications

Name: Agate.
Author: Feng Chen.
Completion date: 05/30/2013.
Languages: CUDA C++, Fortran 90.
Required libraries: BLAS, LAPACK, CUBLAS.

Simple Example

Equation: \begin{equation} \begin{aligned} & \alpha u -(\beta_1 u_{xx} + \beta_2 u_{yy} + \beta_3 u_{zz}) = f, &&\quad \text{in } \Omega, \\ & au + b\frac{\partial u}{\partial \boldsymbol{n}} = c, && \quad \text{on } \partial \Omega. \end{aligned} \end{equation}
Parameters: \begin{equation} \Omega = (-1,1)^3 , \quad \alpha =2, \quad \beta_1=3, \quad \beta_2=4, \quad \beta_3 = 5, \quad a = 1, \quad b=1. \end{equation}
Exact solution and input functions: \begin{equation} u(x,y,z) = e^{x+y+z}. \end{equation} The input functions $f(x,y,z)$ and $c(x,y,z)$ are then computed by substituting this $u$ into the equation and the boundary condition.
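
As a worked check of these inputs (a sketch, assuming $\boldsymbol{n}$ denotes the outward unit normal): every second derivative of $u = e^{x+y+z}$ equals $u$ itself, so \begin{equation} f = (\alpha - \beta_1 - \beta_2 - \beta_3)\,u = (2 - 3 - 4 - 5)\,e^{x+y+z} = -10\,e^{x+y+z}. \end{equation} On the faces $x=1$, $y=1$, and $z=1$ the normal derivative is $+u$, while on $x=-1$, $y=-1$, and $z=-1$ it is $-u$. With $a = b = 1$ this gives \begin{equation} c = (a+b)\,u = 2\,e^{x+y+z} \ \text{ on } x=1,\ y=1,\ z=1, \qquad c = (a-b)\,u = 0 \ \text{ on } x=-1,\ y=-1,\ z=-1. \end{equation}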

Quick Start

Compiling and running:

cd ./Agate
make library
nvcc Agate_Main.cu -llibrary -llapack -lblas
./a.out

Output:

Nx = 2^5, Ny = 2^6, Nz = 2^7
Device 0: Tesla M2050 of version 2.0
intialization time: 0.0961988
cpu time bulk: 0.110826
cpu error bulk: 1.57341e-11
cpu error upx: 1.53673e-11
cpu error umx: 3.08875e-12
cpu error upy: 1.48668e-11
cpu error umy: 2.78527e-12
cpu error upz: 1.2097e-11
cpu error umz: 1.57017e-11
cpu error du: 2.71758e-10
gpu time HtD: 0.00597085
gpu time bulk: 0.00152022
gpu time BV: 0.000412896
gpu time diff: 0.00107117
gpu time DtH: 0.00265923
gpu error bulk: 1.57332e-11
gpu error upx: 1.53664e-11
gpu error umx: 3.08864e-12
gpu error upy: 1.48654e-11
gpu error umy: 2.78522e-12
gpu error upz: 1.2097e-11
gpu error umz: 1.57021e-11
gpu error du: 2.71349e-10
...

Test environment:

CPU: Intel(R) Xeon(R) CPU X5630 @2.53GHz.
GPU: Nvidia Tesla M2050.
OS: CentOS release 6.4 (Final).
Compiler: nvcc 4.2.9, gcc/gfortran 4.5.1.

References

Feng Chen and Jie Shen. A GPU parallelized spectral method for elliptic equations in rectangular domains. Journal of Computational Physics, Volume 250, pp. 555-564 (2013).

Code Highlight


//
// Mars_D_Invert performs the inversion in the frequency space
// on the device.
//
// Input:
//        dd: structure for static data
//        alpha, betax, betay, betaz: coefficients
//        d_f: right-hand side
//        wk: array passed through to Mars_Kernel
// Output:
//        d_f: solution
// Other:
//        lambda: eigenvalues
//        s: stiffness matrix
//        q: number of quadrature points
//
inline void Mars_D_Invert (Mars& dd, double alpha,
 double betax, double betay, double betaz, double* d_f, double* wk) {
 // 8x8x8 threads per block; ceiling division so the grid covers all points.
 dim3 T(8, 8, 8);
 dim3 B((dd.qx+7)/8, (dd.qy+7)/8, (dd.qz+7)/8);
 Mars_Kernel <<<B,T>>> (alpha, betax, betay, betaz,
  dd.qx, dd.qy, dd.qz, dd.lambdax, dd.lambday, dd.lambdaz, wk, d_f);
}
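
The body of Mars_Kernel is not shown here, so the following host-side C++ sketch only illustrates what a pointwise frequency-space inversion of this kind typically does, together with the ceiling division used in the launch configuration. The names blocks_for and invert_reference are hypothetical. Two further assumptions are made: the eigenvalues are those of the 1-D second-derivative operators, so the divisor is alpha - betax*lambdax - betay*lambday - betaz*lambdaz (the sign flips if they belong to the negated operator), and the data are stored with x as the fastest-varying index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of blocks of size `block` needed to cover n items (ceiling division).
// With block = 8 this matches the expression (q + 7) / 8 in the launch above.
inline int blocks_for(int n, int block) {
    assert(n >= 0 && block > 0);
    return (n + block - 1) / block;
}

// CPU reference for a pointwise inversion in frequency space.
// ASSUMPTION: lx, ly, lz hold eigenvalues of the 1-D second-derivative
// operators, and f is stored with x fastest: index = (k*qy + j)*qx + i.
// On entry f holds the transformed right-hand side; on exit, the solution.
void invert_reference(double alpha, double betax, double betay, double betaz,
                      int qx, int qy, int qz,
                      const std::vector<double>& lx,
                      const std::vector<double>& ly,
                      const std::vector<double>& lz,
                      std::vector<double>& f) {
    assert(lx.size() == static_cast<std::size_t>(qx));
    assert(ly.size() == static_cast<std::size_t>(qy));
    assert(lz.size() == static_cast<std::size_t>(qz));
    assert(f.size() == static_cast<std::size_t>(qx) * qy * qz);
    for (int k = 0; k < qz; ++k)
        for (int j = 0; j < qy; ++j)
            for (int i = 0; i < qx; ++i) {
                const std::size_t idx =
                    (static_cast<std::size_t>(k) * qy + j) * qx + i;
                f[idx] /= alpha - betax * lx[i] - betay * ly[j] - betaz * lz[k];
            }
}
```

Because the grid is rounded up to whole 8x8x8 blocks, a device kernel launched this way has to skip threads whose indices fall outside (qx, qy, qz) whenever a dimension is not a multiple of 8; the triple loop above plays the role of that bounds check on the host.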

Thursday, July 11, 2013
