Automatic Decomposition Tool (ADT) in FLOW-3D/MP
This Development Note highlights the Automatic Decomposition Tool (ADT), which will be a new feature in FLOW-3D/MP.
FLOW-3D/MP uses domain decomposition to solve flow problems in parallel on distributed memory computers. It employs the multi-block feature to decompose the domain. The task of creating the sub-domains/blocks is left to the user, and it is the user's responsibility to create the blocks so that the computational load is uniform on all processors.
Importance of Load Balancing
A FLOW-3D/MP simulation must be set up with at least as many mesh blocks as the number of processors the user plans to use. When the geometry is complex, a good load balance is difficult to achieve, and a poor balance results in poor performance. Mesh coordinates alone cannot identify the best way to divide such a domain, because the optimal division is based on the active cell count rather than the total cell count. Without a mechanism to actually count the active cells in a region, it is easy to overestimate or underestimate their number. Different users may also come up with different mesh arrangements. Because of this variation, the actual speedup seen by a user can differ significantly from the anticipated speedup indicated by the benchmarks.
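The difference between dividing by coordinates and dividing by active cells can be illustrated with a minimal Python sketch. It is not part of FLOW-3D/MP: the one-dimensional `cells` list, the `load_imbalance` metric (maximum load divided by mean load), and the `balanced_cut` helper are all hypothetical names chosen for this illustration.

```python
def load_imbalance(active_per_block):
    """Return max load / mean load; 1.0 means a perfectly balanced split."""
    mean = sum(active_per_block) / len(active_per_block)
    return max(active_per_block) / mean


def balanced_cut(cells):
    """Return the index at which the running active count first reaches half."""
    half = sum(cells) / 2
    running = 0
    for i, flag in enumerate(cells):
        running += flag
        if running >= half:
            return i + 1
    return len(cells)


# Hypothetical row of 16 cells: 1 = active, 0 = blocked by solid geometry.
cells = [1] * 6 + [0] * 6 + [1] * 2 + [0] * 2

# Split by coordinates alone: two blocks of 8 cells each.
by_coords = [sum(cells[:8]), sum(cells[8:])]

# Split by active-cell count instead.
cut = balanced_cut(cells)
by_active = [sum(cells[:cut]), sum(cells[cut:])]

print(by_coords, load_imbalance(by_coords))   # [6, 2] 1.5
print(by_active, load_imbalance(by_active))   # [4, 4] 1.0
```

In this toy case the two blocks have the same total cell count under the coordinate split, yet one processor would carry three times the active cells of the other.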
Automating the Process
A new Automatic Decomposition Tool (ADT) is being developed. Because ADT is an adaptation of the preprocessor, it can use the actual active-cell count to decompose the mesh. It takes into account the number of active cells in the domain and uses a recursive bisection algorithm to divide the domain into sub-domains with approximately equal numbers of active cells. ADT may switch the plane of division depending on the aspect ratio, thus minimizing communication. ADT can also handle a multi-block initial mesh as input, which allows the user to use multiple blocks to vary the mesh resolution in the domain and select only the region of interest.
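The general idea of recursive bisection on active cells can be sketched in Python as follows. This is an illustration only, not ADT's actual implementation: the two-dimensional `mask`, the box representation, and the rule of always cutting across the longest axis (a simple stand-in for switching the division plane by aspect ratio) are assumptions made for the example.

```python
def count_active(mask, box):
    """Count active cells (value 1) inside a half-open index box."""
    (i0, i1), (j0, j1) = box
    return sum(mask[i][j] for i in range(i0, i1) for j in range(j0, j1))


def bisect(mask, box, parts):
    """Recursively split box into `parts` boxes of similar active-cell count."""
    if parts == 1:
        return [box]
    (i0, i1), (j0, j1) = box
    # Assumed rule: cut across the longest axis to keep interfaces small.
    axis = 0 if (i1 - i0) >= (j1 - j0) else 1
    lo, hi = box[axis]
    if hi - lo < 2:
        raise ValueError("box is too small to split further")
    left_parts = parts // 2
    target = count_active(mask, box) * left_parts / parts
    best_cut, best_err = lo + 1, None
    for cut in range(lo + 1, hi):
        left = box[:axis] + ((lo, cut),) + box[axis + 1:]
        err = abs(count_active(mask, left) - target)
        if best_err is None or err < best_err:
            best_cut, best_err = cut, err
    left = box[:axis] + ((lo, best_cut),) + box[axis + 1:]
    right = box[:axis] + ((best_cut, hi),) + box[axis + 1:]
    return bisect(mask, left, left_parts) + bisect(mask, right, parts - left_parts)


# Hypothetical 4 x 8 mesh with a solid region in the upper-right corner.
mask = [[0 if (i < 2 and j >= 4) else 1 for j in range(8)] for i in range(4)]
blocks = bisect(mask, ((0, 4), (0, 8)), 4)
loads = [count_active(mask, b) for b in blocks]
print(blocks)
print(loads)  # each of the four blocks holds 6 of the 24 active cells
```

The blocks come out with unequal total sizes but equal active-cell counts, which is the property the load balance depends on.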
The principal goal of automating the domain decomposition process is to save the user time and effort in preparing a distributed simulation with FLOW-3D/MP. It also makes performance repeatable for the user. With ADT, the user generates an initial mesh in the normal way, considering accuracy, physics, and memory usage. ADT is then launched to obtain a new mesh intended for running FLOW-3D/MP.
A secondary benefit of ADT is that, in many cases, the efficiency of simulations with FLOW-3D/MP improves because the load balance among CPUs is better. The examples given below, from benchmark tests, show two typical problems facing FLOW-3D users and describe the benefits obtained through the use of ADT.
The two examples presented here show the initial user-generated mesh and the final mesh obtained after automatic decomposition of the input mesh for use with 8 processors. The load balance bar graphs show the load on each processor, measured as the total number of active cells assigned to that processor. In the spillway case, the simulation decomposed using ADT ran 76% faster than the manually decomposed simulation. For the foundry case, the speed-up resulting from use of ADT was smaller, but still significant: 16%.
Load Balance - Dam using 8 processors: Initial user-generated mesh (top left); Final mesh (top right)
Load Balance - Foundry using 8 processors: Initial user-generated mesh (top left); Final mesh (top right)