Hi Eugene and Jody,
thanks for the ideas and the detailed answers. I will look into SysV and
mmap and work something out. I am not tied to PGAPack; there may be
other PGA libs too... But I guess MPI and SysV/mmap do not exclude each
other; I just have to keep track of what is running locally and what
is running remotely.
I will certainly remember your help. I am also thinking of making the
simulator open source once it is ready, as there is currently no
fast, distributed/parallel intraday forex simulator library available
that is capable of walk-forward optimization and the like. So others
will be able to take part in cashing in (or losing) those
On Tue, Apr 28, 2009 at 6:41 PM, Eugene Loh <Eugene.Loh_at_[hidden]> wrote:
> Barnabas Debreczeni wrote:
>> I am using PGAPack as a GA library, and it uses MPI to parallelize
>> optimization runs. This is how I got to Open MPI.
> Let me see if I understand the underlying premise. You want to parallelize,
> but there are some large shared tables. There are many different
> parallelization models. E.g., there are certainly *shared-memory* parallel
> programming models such as OpenMP (which is totally different from Open MPI,
> despite the similar names). But you are using MPI (which doesn't really do
> shared memory) since you're trying to leverage PGAPack, which is nice for
> handling genetic algorithms but basically forces you to use MPI. (I suspect
> most GA algorithms map reasonably well to MPI. Your interest in shared
> tables gives your situation a different twist.)
>> My problem is, I'd like to share that 2 GB table (computed once at the
>> beginning, and is read-only after) between processes so I don't have
>> to use up 16 gigs of memory.
>> How do you share data between processes locally?
> Are there shared-memory parallel GA packages that might make more sense to
> use here than PGAPack?
> If you want to stick with PGAPack/MPI, then you can set up shared memory
> among MPI processes by going outside of MPI. (You could use MPI calls to
> share data, including MPI_Get routines, but I'm guessing it's best just to
> add non-MPI code to do the sharing.) You can for example create a file that
> each process "mmap"s into its address space. There are also System V
> shared-memory calls like shmget/shmat/shmdt that allow you to share memory
> among processes.
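
A minimal sketch of the mmap approach described above, written in Python for brevity (the same idea applies to the C mmap call). The helper names `write_table` and `map_table` and the table layout (a flat array of doubles) are assumptions for illustration, not anything from PGAPack or Open MPI. One process writes the table to a file once; every process then maps that file read-only, so the pages are backed by the OS page cache instead of being copied into each process.

```python
import mmap
from array import array


def write_table(path, values):
    """Compute-once step: store the table as native-endian 64-bit floats."""
    with open(path, "wb") as f:
        array("d", values).tofile(f)


def map_table(path):
    """Map the table file read-only.

    Returns (mm, view): the mmap object and a memoryview that indexes it
    as doubles. Release the view before closing the mmap.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays valid after the
        # file object is closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return mm, memoryview(mm).cast("d")
```

Each worker process would call `map_table` on the same path after the table has been written, and index `view[i]` as if it were a local array.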
> The main point: while MPI allows communication (and therefore "data
> sharing") among processes, you might be better off with non-MPI mechanisms
> here like mmap or SysV shared memory.
>> Later I will need to use other hosts too in the calculation. Will the
>> slaves on other hosts need to calculate their own tables and go on from
>> there and share them locally, or can I share these tables on the
>> master host with them?
> I think this is a performance-vs-memory question. If your interconnect is
> fast enough or your performance requirement low enough and your memory
> constraints severe enough, then you can share common data among all your
> nodes. You'd probably want to use MPI calls to do so... possibly using
> one-sided MPI_Get routines depending on what sort of cluster you're running
> But, if your interconnect is not fast enough or your performance requirement
> high enough or your memory constraint not too severe, then just share within
> each node. And, I could imagine you might have enough memory per node (a
> few Gbytes) that this will be your scenario. So, just replicate your
> mmap/SysV solution on each node.
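
Replicating the table on each node needs some way for the processes on one node to agree on who computes it. The following is one possible sketch, in Python and using only the file system rather than MPI. The function name `get_or_build_table`, the `.building` suffix and the `build` callback are hypothetical, and `path` is assumed to be on node-local storage. The first process to create the temporary file exclusively builds the table and renames it into place; the others wait for the final file and then map it.

```python
import mmap
import os
import time
from array import array


def get_or_build_table(path, build, poll=0.05):
    """Return a read-only mmap of the table at `path`.

    `build` is a hypothetical callback returning the table as a sequence
    of floats. Normally only one process per node ends up calling it.
    """
    tmp = path + ".building"
    if not os.path.exists(path):
        try:
            # Exclusive create: succeeds for exactly one process at a time.
            fd = os.open(tmp, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            fd = None
        if fd is not None:
            with os.fdopen(fd, "wb") as f:
                array("d", build()).tofile(f)
            # Atomic rename: readers never see a half-written table.
            os.rename(tmp, path)
        else:
            # Another local process is building; wait for the final file.
            while not os.path.exists(path):
                time.sleep(poll)
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Two limitations of this sketch: a process that arrives just as the builder finishes may rebuild the table once more (the result is still consistent because of the rename), and a builder that crashes leaves the temporary file behind, so waiting processes would need a timeout in real use.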
> Short answer: you probably want to use non-MPI mechanisms to effect your
> shared memory.
> Most importantly, when your algorithm is successfully implemented and
> deployed and you're making millions of dollars, please remember us!