Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] [Open MPI] #3351: JAVA scatter error
From: Siegmar Gross (Siegmar.Gross_at_[hidden])
Date: 2012-12-18 12:05:23


Hi

> >> 1. The datatypes passed to Scatter are not valid MPI datatypes
> >> (MPI.OBJECT). You need to construct a datatype that is specific to the
> >> MyData class, just like you would in C/C++. I think that this is the
> >> first error that you are seeing (i.e., that OMPI is trying to treat
> >> MPI.OBJECT as an MPI Datatype object, and failing, and therefore
> >> throwing a ClassCastException).
> >
> > Perhaps you are right and my small example program is not a valid MPI
> > program. The problem is that I couldn't find any good documentation or
> > example programs how to write a program which uses a structured data
> > type.
>
> In Java, that's probably true. Remember: there are no official MPI
> Java bindings. What is included in Open MPI is a research project
> from several years ago. We picked what appeared to be the best one,
> freshened it up a little, updated its build system to incorporate
> into ours, verified its basic functionality, and went with that.
>
> In C, there should be plenty of google-able examples about how to
> use Scatter (and friends). You might want to have a look at a few
> of those to get an idea how to use MPI_Scatter in general, and then
> apply that knowledge to a Java program.
>
> Make sense?

I know how to use MPI_Scatter and MPI_Scatterv in C, because I have
written some small, working example programs myself in the past. My
first Java program with MPI_Scatter was ColumnScatterMain.java, which
I sent to the list in early October and once more to you in December.

On October 10th I sent the program ColumnSendRecvMain.java to the list
(Subject: Datatype.Vector in mpijava in openmpi-1.9a1r27380), because
I thought, and still think, that building a column vector doesn't work
as expected. At the end of that email I wrote: "In my opinion
Datatype.Vector doesn't work as expected. mpiJava doesn't support
something similar to MPI_Type_create_resized so how can I use column_t
in a scatter operation? Will scatter automatically start with the next
element and not with the element following the extent of column_t?"

In my opinion Datatype.Vector must set the extent of the vector to the
size of the base datatype and not to the true extent, because mpiJava
doesn't provide a function to resize a datatype. Furthermore,
Datatype.Struct allows only a collection of elements of the same type,
so you must use a data object if you want to scatter or broadcast data
of different types in one operation.

We should forget ObjectScatterMain.java for the moment and concentrate
on ObjectBroadcastMain.java, which I sent to the list three days ago,
because it has the same problem.
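
To make the extent question concrete, here is a minimal sketch in plain
Java (no MPI calls) of where a scatter would start each successive
column_t. It assumes the usual MPI rule that element i of the send
buffer starts at i times the extent of the datatype, and that the
natural extent of a vector type is (count - 1) * stride + blocklength
base elements. The class and method names are made up for this
illustration, and P = 4, Q = 6 are taken from the matrix printed
further down.

```java
import java.util.Arrays;

public class VectorExtentSketch {
    /** Natural extent of Vector(count, blocklength, stride, base), in base elements. */
    public static int extentInElements(int count, int blocklength, int stride) {
        return (count - 1) * stride + blocklength;
    }

    /** First base element of each chunk when one datatype instance goes to each process. */
    public static int[] chunkStarts(int chunks, int extent) {
        int[] starts = new int[chunks];
        for (int i = 0; i < chunks; ++i) {
            starts[i] = i * extent;
        }
        return starts;
    }

    public static void main(String[] args) {
        final int P = 4, Q = 6;  // rows and columns of the matrix in the post

        // column_t = Vector(P, 1, Q, DOUBLE): P blocks of one element, Q apart
        int natural = extentInElements(P, 1, Q);
        System.out.println("natural extent: " + natural);

        // With the natural extent the second column would start at element 19
        System.out.println("starts, natural extent:  "
                + Arrays.toString(chunkStarts(Q, natural)));

        // With the extent resized to one base element the columns start at 0, 1, 2, ...
        System.out.println("starts, extent of one:   "
                + Arrays.toString(chunkStarts(Q, 1)));
    }
}
```

With the natural extent of 19 elements, every start after the first is
past the beginning of the next column (and mostly past the end of the
24-element matrix), which is why a resize facility or an extent of one
base element would be needed for a column scatter.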

1) ColumnSendRecvMain.java

I create a 2D matrix with the following declaration (Java books would
write "double[][] matrix", which is equivalent in my opinion, but I
like C notation)

double matrix[][] = new double[P][Q];

Next I create a column vector

column_t = Datatype.Vector (P, 1, Q, MPI.DOUBLE);
column_t.Commit ();

which I can use in a send/recv-operation

    if (mytid == 0)
    {
      /* send one column to each process: column i goes to rank i + 1 */
      for (i = 0; i < Q; ++i)
      {
        MPI.COMM_WORLD.Send (matrix, i, 1, column_t, i + 1, 0);
      }
    }
    else
    {
      /* receive the P elements of one column as contiguous doubles */
      MPI.COMM_WORLD.Recv (column, 0, P, MPI.DOUBLE, 0, 0);
    }

This example doesn't depend on the extent of column_t, because I set
the "offset" where every column starts (at least I think so :-) ).
Java deliberately hides memory layouts and addresses of data
structures from the user. That's why I think that all necessary
computations and transformations must be done in Datatype.Vector,
MPI.COMM_WORLD.Send, and MPI.COMM_WORLD.Recv. Unfortunately that
doesn't seem to be the case.

tyr java 125 mpiexec -np 7 -output-filename xx java ColumnSendRecvMain
tyr java 128 cat xx.1.0 xx.1.1

matrix:

      1.00 2.00 3.00 4.00 5.00 6.00
      7.00 8.00 9.00 10.00 11.00 12.00
     13.00 14.00 15.00 16.00 17.00 18.00
     19.00 20.00 21.00 22.00 23.00 24.00

Column of process 1

      0.00 3.00 7.00 0.00

I get the following output if I use "int" instead of "double".

tyr java 143 mpiexec -np 7 -output-filename xx java ColumnSendRecvIntMain
tyr java 144 cat xx.1.0 xx.1.1

matrix:

     1 2 3 4 5 6
     7 8 9 10 11 12
    13 14 15 16 17 18
    19 20 21 22 23 24

Column of process 1

99731135 1586 5 7

It is easy to see that process 1 doesn't get column 0. Your
suggestion to allocate enough memory for a matrix (without defining
a matrix) and to do all index computations yourself is, in my opinion,
not practical for a "normal" Java programmer (it's even hard for
most C programmers :-) ). Hopefully you have an idea how to solve
this problem so that all processes receive the correct column values.
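
For reference, the manual approach mentioned above could look roughly
like the following sketch in plain Java (no MPI calls). A Java
double[][] is an array of separately allocated row arrays, so the
sketch first copies it into one row-major double[] and then gathers a
column into a contiguous buffer. Whether this is what the bindings
expect is an assumption on my part; the class and method names are
made up for this illustration.

```java
import java.util.Arrays;

public class FlatMatrixSketch {
    /** Row-major index of element (row, col) in a flat array with q columns. */
    public static int idx(int row, int col, int q) {
        return row * q + col;
    }

    /** Copy a double[][] (separate row arrays in Java) into one contiguous array. */
    public static double[] flatten(double[][] m) {
        int p = m.length, q = m[0].length;
        double[] flat = new double[p * q];
        for (int i = 0; i < p; ++i) {
            System.arraycopy(m[i], 0, flat, i * q, q);
        }
        return flat;
    }

    /** Gather column col of a p x q row-major matrix into a contiguous buffer. */
    public static double[] column(double[] flat, int p, int q, int col) {
        double[] out = new double[p];
        for (int i = 0; i < p; ++i) {
            out[i] = flat[idx(i, col, q)];
        }
        return out;
    }

    public static void main(String[] args) {
        final int P = 4, Q = 6;
        double[][] matrix = new double[P][Q];
        for (int i = 0; i < P; ++i) {
            for (int j = 0; j < Q; ++j) {
                matrix[i][j] = i * Q + j + 1;  // 1.0 .. 24.0 as in the output above
            }
        }
        double[] flat = flatten(matrix);
        // Column 0 of the matrix above: prints [1.0, 7.0, 13.0, 19.0]
        System.out.println(Arrays.toString(column(flat, P, Q, 0)));
    }
}
```

A column packed this way is P contiguous doubles, so it could
presumably be sent with MPI.DOUBLE and a count of P without any derived
datatype, at the cost of one extra copy per column.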

2) ObjectBroadcastMain.java

As I said above, it is my understanding that I can send a Java object
when I use MPI.OBJECT and that the MPI implementation must perform all
necessary tasks. Your interface for derived datatypes provides only
methods for discontiguous data and no method to create an MPI.OBJECT,
so I have no idea what I would have to do to create one. The object
must be serializable so that you get the same values in a
heterogeneous environment.

tyr java 146 mpiexec -np 2 java ObjectBroadcastMain
Exception in thread "main" java.lang.ClassCastException:
  MyData cannot be cast to [Ljava.lang.Object;
        at mpi.Comm.Object_Serialize(Comm.java:207)
        at mpi.Comm.Send(Comm.java:292)
        at mpi.Intracomm.Bcast(Intracomm.java:202)
        at ObjectBroadcastMain.main(ObjectBroadcastMain.java:44)
...

At least you try to serialize my object, but I have no idea why
the error occurs. With Google I found:

[Ljava.lang.Object; is the name for Object[].class, the java.lang.Class
representing the class of array of Object.

Perhaps I must pass an array with just one element instead of a plain
Object. Success! Here is the output of the updated version, which I
have attached.

tyr java 157 mpiexec -np 2 java ObjectBroadcastMain

Process 0 running on tyr.informatik.hs-fulda.de.
  Age: 35
  Name: Smith
  Salary: 2545.75

Process 1 running on tyr.informatik.hs-fulda.de.
  Age: 35
  Name: Smith
  Salary: 2545.75
tyr java 158
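
The cast failure and the one-element-array fix can be reproduced
without MPI. The sketch below is plain Java and only mirrors what the
stack trace suggests: the buffer is cast to Object[] and then goes
through standard Java serialization. The MyData class here is a
hypothetical stand-in whose fields are guessed from the printed
output; it is not the attached program, and none of the method names
come from the mpiJava sources.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ObjectArraySketch {
    /** Hypothetical stand-in for MyData; fields guessed from the output above. */
    public static class MyData implements Serializable {
        private static final long serialVersionUID = 1L;
        public final int age;
        public final String name;
        public final double salary;

        public MyData(int age, String name, double salary) {
            this.age = age;
            this.name = name;
            this.salary = salary;
        }
    }

    /** Returns true if casting buf to Object[] throws a ClassCastException. */
    public static boolean castFails(Object buf) {
        try {
            Object[] unused = (Object[]) buf;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** Serialize an object array to bytes and read it back. */
    public static Object[] roundTrip(Object[] buf) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(buf);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Object[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MyData data = new MyData(35, "Smith", 2545.75);

        // A bare MyData is not an Object[], so the cast throws
        System.out.println("bare object cast fails: " + castFails(data));

        // Wrapping it in a one-element array makes the cast legal
        Object[] buffer = { data };
        System.out.println("one-element array cast fails: " + castFails(buffer));

        MyData copy = (MyData) roundTrip(buffer)[0];
        System.out.println(copy.age + " " + copy.name + " " + copy.salary);
    }
}
```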

Would it be possible for your serialization method to check whether
the buffer contains a single object or an array of objects, so that
you can cast to java.lang.Object or [Ljava.lang.Object; as
appropriate?
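
A rough sketch of the kind of check I mean, in plain Java. The helper
name asObjectArray is hypothetical and not part of mpiJava or Open
MPI; it only shows that an instanceof test is enough to tell the two
cases apart before casting.

```java
public class BufferNormalizeSketch {
    /**
     * Hypothetical helper: view a buffer as Object[].
     * An Object[] is returned unchanged; anything else (including a
     * primitive array such as int[], which is not an Object[]) is
     * wrapped in a new one-element array.
     */
    public static Object[] asObjectArray(Object buf) {
        if (buf instanceof Object[]) {
            return (Object[]) buf;
        }
        return new Object[] { buf };
    }

    public static void main(String[] args) {
        Object single = "one item";
        Object[] several = { "a", "b", "c" };

        System.out.println(asObjectArray(single).length);   // wrapped: 1 element
        System.out.println(asObjectArray(several).length);  // unchanged: 3 elements
    }
}
```

One caveat: this only helps on the sending side. On the receiving
side the deserialized object has to be stored where the caller can
see it, which an array slot allows and a bare object reference does
not, so requiring an array may be intentional.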

> > Therefore I stuck to the mpiJava specification which states
> > for derived datatypes in chapter 3.12 that the effect for MPI_Type_struct
> > can be achieved by using MPI.OBJECT as the buffer type and relying on
> > Java object serialization. "dataItem" is a serializable Java object and
> > I used MPI.OBJECT as buffer type. How can I create a valid MPI datatype
> > MPI.OBJECT so that I get a working example program?
>
> /me reads some Java implementation code...

Me too, because I had to know if you changed the mpiJava interface
in your implementation and which datatypes for parameters and return
values you used.

> It looks like they allow passing MPI.OBJECT as the datatype argument;
> sorry, I guess I was wrong about that.
>
> > MPI.COMM_WORLD.Scatter (dataItem, 0, 1, MPI.OBJECT,
> > objBuffer, 0, 1, MPI.OBJECT, 0);
>
> What I think you're running into here is that you're still using
> Scatter wrong, per my other point, below:
>
> >> 1. It looks like you're trying to Scatter a single object to N peers.
> >> That's invalid MPI -- you need to scatter (N*M) objects to N peers, where
> >> M is a positive integer value (e.g., 1 or 2). Are you trying to
> >> broadcast?
> >
> > It is the very first version of the program where I scatter one object
> > to the process itself (at this point it is not the normal application
> > area for scatter, but should nevertheless work). I didn't continue due
> > to the error. I get the same error when I broadcast my data item.
> >
> > tyr java 116 mpiexec -np 1 java -cp $DIRPREFIX_LOCAL/mpi_classfiles \
> > ObjectScatterMain
> > Exception in thread "main" java.lang.ClassCastException: MyData cannot
> > be cast to [Ljava.lang.Object;
> > at mpi.Intracomm.copyBuffer(Intracomm.java:119)
> > at mpi.Intracomm.Scatter(Intracomm.java:389)
> > at ObjectScatterMain.main(ObjectScatterMain.java:45)
>
> I don't know Java, but it looks like it's complaining about the type
> of dataItem, not the type of MPI.OBJECT. It says it can't cast
> dataItem to a Ljava.lang.Object -- which appears to be the type of
> the first argument to Scatter.
>
> Do you need to have MyData inherit from the Java base Object type,
> or some such?
>
> > "Broadcast" works if I have only a root process and it fails when I have
> > one more process.
>
> If I change MPI.COMM_WORLD.Scatter(...) to
>
> MPI.COMM_WORLD.Bcast(dataItem, 0, 1, MPI.OBJECT, 0);
>
> I get the same casting error.
>
> I'm sorry; I really don't know Java, and don't know how to fix this offhand.
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/

Thank you very much for all your help so far and for all the fruitful
discussions which normally resulted in a solution.

Kind regards

Siegmar