Difference between revisions of "How to run ESTEL in parallel"


Revision as of 13:29, 22 August 2007


This article describes how to run parallel jobs in ESTEL on "simple" networks of workstations.

Note that the methodology differs slightly for real high performance facilities such as Blue Crystal or other Beowulf clusters. Therefore, there is a dedicated article for clusters.

Here, a network of workstations means a set of workstations that can "talk" to each other over an intranet or the Internet.

Prerequisites

Run a parallel job on one machine

Before running distributed parallel jobs, it is easier to get a parallel job working on a single machine first.

Using one process

Before running ESTEL in parallel, you need to start an mpd process. Details are given in the MPI article. Start it with:

mpd &

Before trying to run real parallel jobs, it is worth checking that the "parallel" library and ESTEL work together correctly. To do so, run an existing test case after adding the following keyword to its steering file:

PARALLEL PROCESSORS = 1

Using the keyword PARALLEL PROCESSORS forces ESTEL to use the parallel library instead of the paravoid library. Because only one processor is requested, no MPI calls are made.

If this does not work, stop here and try to understand what is going wrong. You can email the full error messages to JP Renaud, who will help if necessary.

Using multiple processes

If it works fine with one process, you can try with several. First make sure that the mpi_telemac.conf file contains enough entries for the number of processes you will request. As only one host is available to MPI, repeat its entry once per process. For instance, when asking for three processes, the mpi_telemac.conf file should contain:

# Number of processors :
3
#
# For each host :
#
#  hostname   number_of_processors_on_the_host
#
master 1
master 1
master 1
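
For larger process counts, the file above can be generated rather than typed by hand. The following is a minimal sketch in Python, assuming the layout shown above (a count line followed by one "hostname 1" entry per process). The function name make_mpi_conf and the default host name "master" are illustrative and are not part of ESTEL or TELEMAC.

```python
def make_mpi_conf(n_processes, hostname="master"):
    """Return the text of an mpi_telemac.conf file for a single host.

    Assumption: the layout mirrors the example in this article, with the
    host entry repeated once per requested process.
    """
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    lines = [
        "# Number of processors :",
        str(n_processes),
        "#",
        "# For each host :",
        "#",
        "#  hostname   number_of_processors_on_the_host",
        "#",
    ]
    # One entry per process, all pointing at the same host.
    lines.extend(f"{hostname} 1" for _ in range(n_processes))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the three-process file shown in this article.
    print(make_mpi_conf(3), end="")
```

Redirecting the output to mpi_telemac.conf gives a file with the same structure as the hand-written one.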

Now edit the steering file of your test case to ask for 3 processors:

PARALLEL PROCESSORS = 3
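
When switching a test case between process counts repeatedly, the steering file edit can be scripted. This is a hedged sketch in Python, assuming the keyword sits on its own line and uses "=" as in the example above. The helper name set_parallel_processors is hypothetical and not part of ESTEL.

```python
import re

# Assumption: the keyword starts a line and is followed by "=".
_KEYWORD = re.compile(r"^[ \t]*PARALLEL PROCESSORS[ \t]*=.*$", re.MULTILINE)


def set_parallel_processors(steering_text, n_processes):
    """Return steering file text with PARALLEL PROCESSORS set to n_processes.

    An existing keyword line is replaced; otherwise a new line is appended.
    """
    new_line = f"PARALLEL PROCESSORS = {n_processes}"
    if _KEYWORD.search(steering_text):
        return _KEYWORD.sub(new_line, steering_text)
    # Make sure the appended keyword starts on its own line.
    if steering_text and not steering_text.endswith("\n"):
        steering_text += "\n"
    return steering_text + new_line + "\n"
```

The function works on text only, so reading and writing the steering file itself is left to the caller.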

If ESTEL ran properly, you should have 3 result files in your directory. The meaning of these files is explained further down.

Run a parallel job on several machines

Note about the result files

Note about estel2d