Thank you for your response. That makes it clear.
A related question. When I run a general program on a machine, say an Internet browser, or a media player to watch a movie by clicking on the icon of the avi file in the folder (nothing from the terminal), how many cores does it use? Does it also just run on one core in that case?
Well, this isn't really an MPI question, but in short, the answer depends on the operating system and the application. If the latter is a serial application (no threads), it will generally use at most one core at any given time. A more "complex" application with a graphical user interface, say, often involves several threads. So your media player may actually use one thread for the GUI and another thread for decoding the video. But I am just guessing here. The OS is free to distribute the threads over the cores as it sees fit.
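To make the thread picture a bit more concrete, here is a minimal Python sketch. It is not how any real media player works; the thread names "gui" and "decoder" and the function run_player_sketch are made up for illustration. It only shows that one process can contain several threads, each with its own kernel-level thread ID (this needs Python 3.8 or later for threading.get_native_id).

```python
import os
import threading


def run_player_sketch():
    """Start two threads and return the kernel thread ID of each one.

    The names "gui" and "decoder" are hypothetical stand-ins for the
    threads a media player might use.
    """
    native_ids = {}
    lock = threading.Lock()
    # The barrier keeps both threads alive at the same moment, so the
    # kernel cannot hand the same thread ID to both of them.
    barrier = threading.Barrier(2)

    def worker(name):
        with lock:
            native_ids[name] = threading.get_native_id()
        barrier.wait()

    threads = [
        threading.Thread(target=worker, args=(name,), name=name)
        for name in ("gui", "decoder")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return native_ids


if __name__ == "__main__":
    ids = run_player_sketch()
    print("process ID:", os.getpid())
    print("logical cores reported by the OS:", os.cpu_count())
    for name, tid in ids.items():
        print(f"thread {name!r} has kernel thread ID {tid}")
```

Which core each thread ends up on is decided by the OS scheduler and is not visible here. Also note that in standard CPython the global interpreter lock prevents these threads from executing Python bytecode in parallel, so the sketch shows that the threads exist as separately scheduled entities, not that they give a speed-up.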
Generally, how is the workload divided among the cores of a computer? Does every process that I start use a new core, or is the workload distributed over all the available cores?
Again, this depends on the OS and on the hardware, so there is no general answer to your question. However, with OpenMPI (and other implementations) you can map processes to sockets and cores explicitly. This is called process affinity, which serves to improve locality for certain MPI ranks. This is described in the mpirun manpage and on the OpenMPI website.
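As a rough illustration of what affinity means at the OS level, the following Python sketch restricts the calling process to a single core and then restores the original setting. It assumes Linux (os.sched_getaffinity and os.sched_setaffinity are not available on every platform), and the function name pin_to_one_core is made up for this example. With OpenMPI you would not normally do this inside your program; the binding is requested through mpirun options, and recent versions accept options such as --bind-to core and --report-bindings, but please check the manpage for your version.

```python
import os


def pin_to_one_core():
    """Pin the calling process to one core, then restore the old setting.

    Returns (allowed, pinned) as sets of core numbers, or None if the
    platform does not offer the affinity calls (assumption: Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None

    # PID 0 means "the calling process".
    allowed = os.sched_getaffinity(0)

    # Pick the lowest-numbered core the process is currently allowed to use.
    target = {min(allowed)}
    os.sched_setaffinity(0, target)
    pinned = os.sched_getaffinity(0)

    # Put the original affinity mask back so the rest of the program
    # is scheduled as before.
    os.sched_setaffinity(0, allowed)
    return allowed, pinned


if __name__ == "__main__":
    result = pin_to_one_core()
    if result is None:
        print("affinity calls are not available on this platform")
    else:
        allowed, pinned = result
        print("cores allowed before pinning:", sorted(allowed))
        print("cores allowed while pinned:", sorted(pinned))
```

While the process is pinned, the scheduler may only run it on that one core, which is the same idea mpirun applies per MPI rank when you ask it to bind processes.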
I hope this answers your original question.