example
The following commands show how to run ESPRESO with, for example, 18 MPI processes. ESPRESO creates one cluster per MPI process, so the number of clusters equals the number of MPI processes. The number of domains per cluster is specified in the DECOMPOSITION object. The domains are then processed by threads, whose number is set through an environment variable.
The environment variables for OpenMP parallelization have to be set before the run. In ESPRESO, OpenMP parallelization is over domains and MPI parallelization is over clusters. The number of domains does not have to equal the number of threads (a single thread can process many domains).
$ source env/threading.default 1
$ mpirun -np 18 espreso -c BLADE.ecf
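The relation between clusters, domains, and threads described above can be sketched with a few lines of shell arithmetic. This is only an illustration: the 18 MPI processes come from the example, while `domains_per_cluster` and `threads_per_process` are hypothetical values chosen for the sketch, not ESPRESO defaults.

```shell
#!/bin/sh
# Only the 18 MPI processes come from the example above.
mpi_processes=18        # ESPRESO creates one cluster per MPI process
domains_per_cluster=8   # hypothetical value of DOMAINS in the DECOMPOSITION object
threads_per_process=4   # hypothetical OpenMP thread count per MPI process

clusters=$mpi_processes
total_domains=$((clusters * domains_per_cluster))

# Ceiling division: the largest number of domains a single thread has to process.
domains_per_thread=$(( (domains_per_cluster + threads_per_process - 1) / threads_per_process ))

echo "clusters:               $clusters"
echo "domains in total:       $total_domains"
echo "domains per thread (max): $domains_per_thread"
```

With these assumed values, each thread handles two domains, which matches the remark that one thread can process more than one domain.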
In the ecf configuration file, the user can specify a mesh duplication parameter. This parameter duplicates the whole simulation, which allows several frequencies to be computed in parallel. For example, if mesh duplication is set to 2 and solving one frequency from the frequency interval uses 18 MPI processes, ESPRESO has to be run with 2 × 18 = 36 MPI processes.
$ mpirun -np 36 espreso -c BLADE.ecf 2
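The process count for a duplicated run can be derived rather than typed by hand. The sketch below assumes, as the command above suggests, that the trailing argument carries the mesh duplication value. The variable names are illustrative, and the launch command is only printed, not executed.

```shell
#!/bin/sh
processes_per_frequency=18   # MPI processes used to solve one frequency
mesh_duplication=2           # value of the mesh duplication parameter

# Each duplicate of the simulation needs its own full set of MPI processes.
total_processes=$((processes_per_frequency * mesh_duplication))

# Build the launch command as a string so the sketch has no side effects.
launch_cmd="mpirun -np $total_processes espreso -c BLADE.ecf $mesh_duplication"
echo "$launch_cmd"
```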
To use the GPU-accelerated version, the user has to:
- compile the ESPRESO solver with the CUDA solver option
$ ./waf configure --solver=cuda
- set the USE_SCHUR_COMPLEMENT parameter in the FETI solver object to TRUE.
The MPI processes are then distributed across the GPU accelerators. For example, if the compute node has 4 GPUs, then
$ mpirun -np 8 espreso -c BLADE.ecf
runs 2 MPI processes on each GPU.
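A quick check of how many MPI processes share each GPU can be written as follows. This is a minimal sketch that assumes the processes are spread evenly over the GPUs of one node. The variable names are illustrative, and the actual process-to-GPU mapping is done by ESPRESO itself.

```shell
#!/bin/sh
gpus_per_node=4    # GPUs available on the compute node
mpi_processes=8    # value passed to mpirun -np

# Warn when the processes cannot be spread evenly across the GPUs.
if [ $((mpi_processes % gpus_per_node)) -ne 0 ]; then
  echo "warning: $mpi_processes processes do not divide evenly over $gpus_per_node GPUs" >&2
fi

processes_per_gpu=$((mpi_processes / gpus_per_node))
echo "MPI processes per GPU: $processes_per_gpu"
```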
The following configuration file runs the solver with CUDA accelerators (the *.dat file can be found here):
# BLADE CONFIGURATION FILE
PHYSICS STRUCTURAL_MECHANICS_3D;
INPUT {
PATH BLADE_Q.dat;
FORMAT ANSYS_CDB;
DECOMPOSITION {
PARALLEL_DECOMPOSER METIS;
SEQUENTIAL_DECOMPOSER METIS;
DOMAINS 1;
}
}
STRUCTURAL_MECHANICS_3D {
LOAD_STEPS 1;
MATERIALS {
1 {
DENS 7850;
CP 1;
LINEAR_ELASTIC_PROPERTIES {
MODEL ISOTROPIC;
MIXY 0.3;
EX 2E11;
}
}
}
MATERIAL_SET {
ALL_ELEMENTS 1;
}
LOAD_STEPS_SETTINGS {
1 {
DURATION_TIME 1;
TYPE HARMONIC;
MODE LINEAR;
SOLVER FETI;
FETI {
METHOD TOTAL_FETI;
PRECONDITIONER DIRICHLET;
PRECISION 1E-9;
ITERATIVE_SOLVER GMRES;
REGULARIZATION ANALYTIC;
REGULARIZATION_VERSION FIX_POINTS;
MAX_ITERATIONS 500;
CONJUGATE_PROJECTOR CONJ_R;
NUM_DIRECTIONS 6;
USE_SCHUR_COMPLEMENT TRUE;
}
HARMONIC_SOLVER {
FREQUENCY_INTERVAL_TYPE LINEAR;
MIN_FREQUENCY 0;
MAX_FREQUENCY 1000;
NUM_SAMPLES 4;
DAMPING {
RAYLEIGH {
TYPE DIRECT;
DIRECT_DAMPING {
STIFFNESS 0; MASS 10;
}
}
}
}
DISPLACEMENT {
A1 { X 0; Y 0; Z 0; }
}
HARMONIC_FORCE {
B1 {
TYPE COMPONENTS;
MAGNITUDE { X 0; Y 0; Z 1; }
PHASE { X 0; Y 0; Z 0; }
}
}
}
}
}
OUTPUT {
PATH results;
STORE_RESULTS ALL;
RESULTS_STORE_FREQUENCY EVERY_SUBSTEP;
MONITORS_STORE_FREQUENCY EVERY_SUBSTEP;
MONITORING{
1 {
REGION B1;
STATISTICS AVG;
PROPERTY DISPLACEMENT_AMPLITUDE_Y;
}
}
}