History for Using Sundance

I have spent a couple of days getting Sundance to run our test problem.

Notes:

  * Nice connection between the weak integral form of equations and the code
		that I have to write.

  * Large executables (70 MB) are generated even for small problems,
		presumably mostly from all the linked-in libraries and suchlike.

  * Got some results after only a couple of days working on it.

Procedure:

  To build on bioeng22 

* Trilinos 6.0.16: I used sundance-linux-mpi-opt but removed -lumfpack,
		-lamd and -enable-amesos-umfpack because earlier build attempts
		failed to find these libraries, even though they are on the system.

* Sundance 2.1.2: I used linux-mpi-opt.

Once I could run the test examples I tried 'mpirun -np 2',
with 'setenv P4_RSHCOMMAND ssh' set in the environment.

Initially I got an error, one that Karl also got with libmesh, on the Stokes2D.exe problem::

   p0_6750:  p4_error: Child process exited while making connection to remote process on localhost: 0

but that seems to depend on the problem as well, and in a new shell it
		had gone away completely.

Anyway I moved on.  There are no explicit time-stepping objects in Sundance,
		but following an old Sundance 1.0 example I implemented Crank-Nicolson
		and fully implicit schemes symbolically.
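
For reference, both schemes can be written as one weak form with a weighting parameter. This is only a sketch under my own assumptions: a heat equation with constant diffusivity, boundary terms omitted, and the symbols v (test function), u^{n+1} (unknown), u^n (previous solution), \kappa, \Delta t and \theta are notation chosen here, not Sundance names.

```latex
\int_\Omega v \,\bigl(u^{n+1} - u^{n}\bigr)\, d\Omega
  \;+\; \Delta t \,\kappa \int_\Omega
  \nabla v \cdot \nabla\bigl(\theta\, u^{n+1} + (1-\theta)\, u^{n}\bigr)\, d\Omega
  \;=\; 0
```

Setting \theta = 1/2 gives Crank-Nicolson and \theta = 1 gives the fully implicit scheme. Read against the eqn line printed in the results, delU[0] would play the role of v, U[0] of u^{n+1} and soln[0] of u^n, with the factor 0.02 standing for the product \Delta t \kappa and the two factors of 0.5 being \theta and 1-\theta.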

<a href="OCtimeStepHeat2D.cpp">OCtimeStepHeat2D.cpp</a>
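
The linked file needs Sundance to build. As a stand-alone illustration of the same two time-stepping schemes, here is a minimal sketch in plain Python. It is not the Sundance code: it swaps the 2D finite element discretisation for 1D finite differences with zero Dirichlet boundaries, and every name in it (solve_tridiagonal, theta_step, run) is made up for this example.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm. lower[i] multiplies x[i-1] and upper[i] multiplies
    x[i+1] in row i; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def theta_step(u, r, theta):
    """One step of (I + theta*r*A) u_new = (I - (1-theta)*r*A) u_old,
    where A = tridiag(-1, 2, -1) and r = kappa*dt/h**2.
    theta = 0.5 is Crank-Nicolson, theta = 1.0 is fully implicit."""
    n = len(u)
    rhs = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0       # zero Dirichlet boundary
        right = u[i + 1] if i < n - 1 else 0.0  # zero Dirichlet boundary
        rhs.append(u[i] - (1.0 - theta) * r * (2.0 * u[i] - left - right))
    off = [-theta * r] * n
    diag = [1.0 + 2.0 * theta * r] * n
    return solve_tridiagonal(off, diag, off, rhs)


def run(theta, cells=50, steps=10, dt=0.01, kappa=1.0):
    """Integrate u_t = kappa*u_xx on (0, 1) from u = sin(pi*x) and return
    the maximum nodal error against the exact solution."""
    h = 1.0 / cells
    r = kappa * dt / h ** 2
    x = [i * h for i in range(1, cells)]  # interior nodes only
    u = [math.sin(math.pi * xi) for xi in x]
    for _ in range(steps):
        u = theta_step(u, r, theta)
    decay = math.exp(-kappa * math.pi ** 2 * steps * dt)
    return max(abs(ui - decay * math.sin(math.pi * xi)) for ui, xi in zip(u, x))


if __name__ == "__main__":
    print("Crank-Nicolson error norm:", run(0.5))
    print("fully implicit error norm:", run(1.0))
```

With the same step size the Crank-Nicolson run should report the smaller error, since it is second-order accurate in time while the fully implicit scheme is first-order.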

Results:

  On bioeng22, which has 4 CPUs (2 x Dual Core AMD Opteron(tm) Processor 275,
	 2210.198 MHz, 1024 KB cache) and 8154836 kB RAM.

For one processor, 10 steps with a 200x200 grid (time about 2:33.48, memory VIRT 408M, RES 333M)::

  eqn = delU[0]*(U[0]-soln[0])+0.02*(0.5*(D[delU[0], x]*D[U[0]+soln[0], x])+0.5*(D[delU[0], y]*D[U[0]+soln[0], y]))
  u0 = {soln[0]}
  [9]
  error norm = 0.000493527

  [1]  eval vector flops: 2.928e+07
  [1]  quadrature flops: 3.39361e+08
  [1]  ref integration flops: 4.93449e+08
  [1]  cell jacobian batch flops: 1.94498e+09
  [1]  quadrature eval mediator: 1.93632e+09
  get mesh                                : 0.621094
  unbatched facet grabbing                : 3.50781
  batched facet grabbing                  : 0.0195312
  Expr symbolic ops                       : 0
  symbolic preprocessing                  : 0.00390625
  assembler ctor                          : 6.04688
  DOF map building                        : 6.02734
  assembly                                : 9.26562
  matrix config                           : 1.54297
  matrix graph determination              : 1.30469
  graph column processing                 : 0.765625
  batched dof lookup                      : 1.93359
  tmp graph flattening                    : 0.246094
  matrix allocation                       : 0.0429688
  Low-level vector operations             : 15.1914
  cell Jacobian grabbing                  : 0.0976562
  Symbolic Evaluation                     : 2.01953
  coord function evaluation               : 0.0703125
  integration                             : 2.95312
  ref integration                         : 1.90625
  jacobian factoring                      : 0.207031
  matrix insertion                        : 1.85547
  quadrature                              : 0.753906
  vector insertion                        : 0.347656
  linear solve                            : 108.938
  discrete function evaluation            : 1.46484
  building integral transformation matrices: 0.8125
  jacobian inversion                      : 0.425781
  Expr output                             : 0
  viz output                              : 8.76172
  unbatched dof lookup                    : 4.64844

For two processors (time about 2:02.55, memory VIRT 250&220M, RES 178M)::

  mpirun -np 2 ./OCtimeStepHeat2D.exe
  
  error norm = 0.000498438

  error norm = 0.000498438

  [1]  eval vector flops: 1.47864e+07
  [1]  quadrature flops: 1.71378e+08
  [1]  ref integration flops: 2.49196e+08
  [1]  cell jacobian batch flops: 9.82217e+08
  [1]  quadrature eval mediator: 9.77842e+08

For three processors (time about 1:27.45, memory VIRT 190&160&160M, RES 120&120&120M)::

  mpirun -np 3 ./OCtimeStepHeat2D.exe
  
  error norm = 0.000499121

  error norm = error norm = 0.000499121

  0.000499121

  [1]  eval vector flops: 9.8088e+06
  [1]  quadrature flops: 1.13686e+08
  [1]  ref integration flops: 1.65318e+08
  [1]  cell jacobian batch flops: 6.51572e+08
  [1]  quadrature eval mediator: 6.48667e+08
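
To put the three 200x200 timings side by side, the following small sketch converts the quoted wall-clock times to seconds and computes speedup and parallel efficiency relative to the one-processor run. The helper names (to_seconds, scaling_table) are invented for this example, and the inputs are the approximate single-run times quoted above, so the results are only indicative.

```python
def to_seconds(stamp):
    """Convert 'm:ss.ss' or 'h:mm:ss.ss' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60.0 + float(part)
    return seconds


# Wall-clock times quoted above for the 200x200 grid, keyed by processor count.
wall_clock = {1: "2:33.48", 2: "2:02.55", 3: "1:27.45"}


def scaling_table(times):
    """Return (procs, seconds, speedup, efficiency) rows, using the
    one-processor time as the baseline."""
    base = to_seconds(times[1])
    rows = []
    for procs in sorted(times):
        t = to_seconds(times[procs])
        speedup = base / t
        rows.append((procs, t, speedup, speedup / procs))
    return rows


if __name__ == "__main__":
    for procs, t, speedup, eff in scaling_table(wall_clock):
        print(f"{procs} proc: {t:7.2f} s  speedup {speedup:4.2f}  efficiency {eff:4.2f}")
```

On these numbers the speedup is roughly 1.25 on two processors and 1.76 on three, that is a parallel efficiency of about 63% and 59%.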

For one processor, 1 step with a 1000x1000 grid (time about 29:13.51, memory
	 VIRT 8113M, RES 7.5G)::


  error norm = 0.00176444

  [1]  eval vector flops: 1.92e+08
  [1]  quadrature flops: 1.014e+09
  [1]  ref integration flops: 1.30005e+09
  [1]  cell jacobian batch flops: 5.10001e+09
  [1]  quadrature eval mediator: 5.028e+09
  get mesh                                : 14.2266
  unbatched facet grabbing                : 86.4805
  batched facet grabbing                  : 0.441406
  Expr symbolic ops                       : 0.0429688
  symbolic preprocessing                  : 0.0625
  assembler ctor                          : 145.75
  DOF map building                        : 145.191
  assembly                                : 86.5898
  matrix config                           : 51.6289
  matrix graph determination              : 45.875
  graph column processing                 : 28.6523
  batched dof lookup                      : 5.07422
  tmp graph flattening                    : 7.93359
  matrix allocation                       : 0.96875
  Low-level vector operations             : 204.188
  cell Jacobian grabbing                  : 0.632812
  Symbolic Evaluation                     : 11.7148
  coord function evaluation               : 2.1875
  integration                             : 9.19531
  ref integration                         : 5.33594
  jacobian factoring                      : 1.25781
  matrix insertion                        : 9.19141
  quadrature                              : 2.60547
  vector insertion                        : 1.875
  linear solve                            : 1445.57
  discrete function evaluation            : 5.41797
  building integral transformation matrices: 2.24609
  jacobian inversion                      : 1.14844
  Expr output                             : 0.046875
  viz output                              : 21.9258
  unbatched dof lookup                    : 11.5977
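
In both one-processor timer listings the linear solve is by far the largest entry. The sketch below estimates its share of the run, assuming the timers are reported in seconds and that the quoted wall-clock time covers the same run. Both are my assumptions, and the names (runs, solve_fraction) are made up for this example. Several of the timers look nested (for instance DOF map building inside assembler ctor), so the share is taken against the wall-clock time rather than against the sum of the timers.

```python
def to_seconds(stamp):
    """Convert 'm:ss.ss' or 'h:mm:ss.ss' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60.0 + float(part)
    return seconds


# Figures copied from the two one-processor runs reported on this page.
runs = {
    "200x200, 10 steps": {"wall": "2:33.48", "linear_solve": 108.938},
    "1000x1000, 1 step": {"wall": "29:13.51", "linear_solve": 1445.57},
}


def solve_fraction(run):
    """Share of the wall-clock time spent in the 'linear solve' timer."""
    return run["linear_solve"] / to_seconds(run["wall"])


if __name__ == "__main__":
    for name, run in runs.items():
        print(f"{name}: linear solve is {solve_fraction(run):.0%} of wall-clock time")
```

Under those assumptions the linear solve accounts for about 71% of the 200x200 run and about 82% of the 1000x1000 run, so the solver and its tolerance are the first place to look when tuning.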

For two processors, 50 steps * 0.001s with a 1000x1000 grid and tolerance at 1e-02, so
	 60-70 iterations (time about 1:08:16.29, memory
	 VIRT 3337m&3311m, RES 3.2g&3.2g)::