Purpose
- The package "Agate" provides a GPU-accelerated Chebyshev collocation method for solving a single elliptic equation with Dirichlet, Neumann, or Robin boundary conditions on a 3-D rectangular domain.
Specifications
- Name: Agate.
- Author: Feng Chen.
- Completion date: 05/30/2013.
- Languages: CUDA C++, Fortran 90.
- Required libraries: BLAS, LAPACK, CUBLAS.
Simple Example
- Equation:
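$$\alpha u - (\beta_1 u_{xx} + \beta_2 u_{yy} + \beta_3 u_{zz}) = f \quad \text{in } \Omega, \qquad a u + b \frac{\partial u}{\partial n} = c \quad \text{on } \partial\Omega.$$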
- Parameters:
$$\Omega = (-1,1)^3, \quad \alpha = 2, \quad \beta_1 = 3, \quad \beta_2 = 4, \quad \beta_3 = 5, \quad a = 1, \quad b = 1.$$
- Exact solution and input functions:
$u(x,y,z) = e^{x+y+z}$; then $f(x,y,z)$ and $c(x,y,z)$ are calculated accordingly.
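As a worked instance of "calculated accordingly": every second derivative of $e^{x+y+z}$ equals $u$ itself, so $f = (\alpha - \beta_1 - \beta_2 - \beta_3)\,u = -10\,u$. On a face whose outward normal is $\pm e_i$ the normal derivative is $\pm u$, so $c = (a \pm b)\,u$, which gives $c = 2u$ on the faces $x=1$, $y=1$, $z=1$ and $c = 0$ on the opposite faces. The C++ sketch below encodes these formulas and checks the equation with central differences. All names in it (`u_exact`, `f_rhs`, `c_bc`, `pde_residual`, the `k...` constants) are illustrative assumptions and are not taken from the Agate source.

```cpp
#include <cassert>
#include <cmath>

// Parameters of the simple example (names are illustrative, not Agate's).
constexpr double kAlpha = 2.0, kBeta1 = 3.0, kBeta2 = 4.0, kBeta3 = 5.0;
constexpr double kA = 1.0, kB = 1.0;

// Exact solution u = exp(x + y + z).
inline double u_exact(double x, double y, double z) {
    return std::exp(x + y + z);
}

// Since u_xx = u_yy = u_zz = u, the right-hand side reduces to
// f = (alpha - beta1 - beta2 - beta3) * u.
inline double f_rhs(double x, double y, double z) {
    return (kAlpha - (kBeta1 + kBeta2 + kBeta3)) * u_exact(x, y, z);
}

// Boundary data on a face with outward normal +e_i (sign = +1) or
// -e_i (sign = -1): du/dn = sign * u, hence c = (a + sign * b) * u.
inline double c_bc(double x, double y, double z, int sign) {
    assert(sign == 1 || sign == -1);
    return (kA + sign * kB) * u_exact(x, y, z);
}

// Residual of the PDE at an interior point, with the second derivatives
// approximated by central differences of step h.
inline double pde_residual(double x, double y, double z, double h) {
    const double u0 = u_exact(x, y, z);
    const double uxx = (u_exact(x + h, y, z) - 2.0 * u0 + u_exact(x - h, y, z)) / (h * h);
    const double uyy = (u_exact(x, y + h, z) - 2.0 * u0 + u_exact(x, y - h, z)) / (h * h);
    const double uzz = (u_exact(x, y, z + h) - 2.0 * u0 + u_exact(x, y, z - h)) / (h * h);
    return kAlpha * u0 - (kBeta1 * uxx + kBeta2 * uyy + kBeta3 * uzz) - f_rhs(x, y, z);
}
```

Calling `pde_residual` at an interior point with a small step such as `1e-3` should return a value close to zero, limited only by the finite-difference truncation error.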
Quick Start
- Compiling and running:
    cd ./Agate
    make library
    nvcc Agate_Main.cu -llibrary -llapack -lblas
    ./a.out
- Output:
    Nx = 2^5, Ny = 2^6, Nz = 2^7
    Device 0: Tesla M2050 of version 2.0
    intialization time: 0.0961988
    cpu time bulk: 0.110826
    cpu error bulk: 1.57341e-11
    cpu error upx: 1.53673e-11
    cpu error umx: 3.08875e-12
    cpu error upy: 1.48668e-11
    cpu error umy: 2.78527e-12
    cpu error upz: 1.2097e-11
    cpu error umz: 1.57017e-11
    cpu error du: 2.71758e-10
    gpu time HtD: 0.00597085
    gpu time bulk: 0.00152022
    gpu time BV: 0.000412896
    gpu time diff: 0.00107117
    gpu time DtH: 0.00265923
    gpu error bulk: 1.57332e-11
    gpu error upx: 1.53664e-11
    gpu error umx: 3.08864e-12
    gpu error upy: 1.48654e-11
    gpu error umy: 2.78522e-12
    gpu error upz: 1.2097e-11
    gpu error umz: 1.57021e-11
    gpu error du: 2.71349e-10
    ...
- CPU: Intel(R) Xeon(R) CPU X5630 @2.53GHz.
- GPU: Nvidia Tesla M2050.
- OS: CentOS release 6.4 (Final).
- Compiler: nvcc 4.2.9, gcc/gfortran 4.5.1.
References
- Feng Chen and Jie Shen. A GPU parallelized spectral method for elliptic equations in rectangular domains. Journal of Computational Physics, 250:555-564, 2013.
Code Highlight
    //
    // Mars_D_Invert performs the inversion in the frequency space
    // on the device.
    //
    // Input:
    //   dd: structure for static data
    //   alpha, beta: coefficient
    //   d_f: right hand side
    // Output:
    //   d_f: solution
    // Other:
    //   lambda: eigenvalue
    //   s: stiffness matrix
    //   q: number of quadrature points
    //
    inline void Mars_D_Invert (Mars& dd, double alpha,
                               double betax, double betay, double betaz,
                               double* d_f, double* wk)
    {
      dim3 T(8, 8, 8);
      dim3 B((dd.qx+7)/8, (dd.qy+7)/8, (dd.qz+7)/8);
      Mars_Kernel <<<B,T>>> (alpha, betax, betay, betaz,
                             dd.qx, dd.qy, dd.qz,
                             dd.lambdax, dd.lambday, dd.lambdaz,
                             wk, d_f);
    }
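The launch configuration above uses 8x8x8 thread blocks and rounds the block count up in each direction, so the grid covers every point even when `q` is not a multiple of 8. The body of `Mars_Kernel` is not shown, so the host-side C++ sketch below is only a guess at what a pointwise inversion in frequency space could look like. It assumes that each transformed coefficient is divided by `alpha + betax*lambdax[i] + betay*lambday[j] + betaz*lambdaz[k]`, that the `lambda` arrays hold eigenvalues of the negated second-derivative operators, and that the data are stored with `k` as the fastest index. The names `blocks_for` and `invert_reference` are hypothetical, and the role of `wk` is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of blocks needed to cover q points with `threads` threads per block.
// This is the same ceiling division as (dd.qx+7)/8 in Mars_D_Invert.
inline int blocks_for(int q, int threads) {
    return (q + threads - 1) / threads;
}

// Hypothetical CPU reference for the pointwise inversion.
// ASSUMPTIONS (not confirmed by the source): the sign convention of the
// eigenvalues, the denominator formula, and the (i, j, k) memory layout.
inline void invert_reference(double alpha, double betax, double betay, double betaz,
                             const std::vector<double>& lambdax,
                             const std::vector<double>& lambday,
                             const std::vector<double>& lambdaz,
                             std::vector<double>& f) {
    const std::size_t qx = lambdax.size();
    const std::size_t qy = lambday.size();
    const std::size_t qz = lambdaz.size();
    assert(f.size() == qx * qy * qz);
    for (std::size_t i = 0; i < qx; ++i) {
        for (std::size_t j = 0; j < qy; ++j) {
            for (std::size_t k = 0; k < qz; ++k) {
                const double denom = alpha + betax * lambdax[i]
                                           + betay * lambday[j]
                                           + betaz * lambdaz[k];
                assert(denom != 0.0);  // the diagonal system must be invertible
                f[(i * qy + j) * qz + k] /= denom;  // overwrite RHS with solution
            }
        }
    }
}
```

For example, `blocks_for(32, 8)` gives 4 blocks and `blocks_for(33, 8)` gives 5; in the second case a kernel needs a bounds check so that the surplus threads in the last block do nothing. A reference loop of this kind is mainly useful for comparing device results against a CPU computation on small grids.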