>I admit I am not that smart to understand well how to use MPI, but I did read some basic materials about it and understand how some simple problems are solved by MPI.
>But dealing with an array in my case, I am not certain about how to apply MPI to it. Are you saying to use send and receive to transfer the value computed for each element from child process to parent process?
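You can, but sending a separate message for each element typically
entails too much communication overhead. It is usually cheaper for
each process to send all of its results in a single message.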
>Do you allocate a copy of the array for each process?
You can, but typically that would entail excessive memory consumption.
Typically, one allocates only a portion of the array on each process.
E.g., if the array has 10,000 elements and you have four processes, the
first gets the first 2,500 elements, the second the next 2,500, and so on.
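As a rough sketch of that split, here is one way to work out which
block belongs to which process. This is plain C with no MPI calls, and
the name block_range is made up for illustration. It assumes that when
the length does not divide evenly, the leftover elements go one each to
the lowest-numbered processes; other conventions work just as well.

```c
#include <assert.h>

/* Hypothetical helper: compute the contiguous block of an n-element
 * array owned by process `rank` out of `nprocs` processes.
 * If n is not divisible by nprocs, the first (n % nprocs) ranks
 * each take one extra element. */
void block_range(int n, int nprocs, int rank, int *start, int *count)
{
    assert(n >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);

    int base  = n / nprocs;   /* elements every rank gets */
    int extra = n % nprocs;   /* leftover elements to hand out */

    *count = base + (rank < extra ? 1 : 0);
    /* Ranks below `extra` are each shifted by one element per
     * preceding rank; the rest are shifted by `extra` in total. */
    *start = rank * base + (rank < extra ? rank : extra);
}
```

With n = 10,000 and four processes, rank 0 gets start 0 and count 2,500,
rank 1 gets start 2,500, and so on. Each process then only needs to
allocate `count` elements.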
>Also I only need the loop that computes every element of the array to be parallelized.
If you only need the initial computation of array elements to be
parallelized, perhaps any of the above strategies could work. It
depends on how expensive the computation of each element is.
>Someone said that the parallel part begins with MPI_Init and ends with MPI_Finalize,
Well, usually all processes are launched in parallel, and each one
runs the whole program from the start. So, the parallel part begins
"immediately." Inter-process communication using MPI, however, must
take place between the MPI_Init and MPI_Finalize calls.
>and one can do any serial computations before and/or after these calls. But I have written some MPI programs and found that the parallel part is not restricted to the region between MPI_Init and MPI_Finalize; it is the whole program. If the rest of the code has to be wrapped so that only the process with ID 0 runs it, I have little idea how to apply that to my case, since the rest would be the parts before and after the loop in the function, plus all of main().
I don't understand your case very clearly. I will take a guess. You
could have all processes start and call MPI_Init. Then, slave processes
can go to sleep, waking occasionally to check if the master has sent a
signal to begin computation. The master does what it has to do and then
sends wake signals. Each slave computes its portion and sends that
portion back to the master. Each slave exits. The master gathers all
the pieces and resumes its computation. Does that sound right?