Hi Jing

Someone else took a look at this off-list a few years ago. It was mostly a problem with the build system (some flags are different) and header file names. I don't believe the port was ever completed though.

I have appended the results of that conversation; the last message contained a list of the issues. You would need to update that list against the trunk, of course, as the code has changed considerably since that discussion took place. Brian Barrett subsequently created a first cut at fixing some of these, but that appears to have been lost in the years since, and it wouldn't really be current anyway.

I would be happy to assist as I can.
Ralph

1. configure issues with "checking prefix for global symbol labels"

1a. VxWorks assembler (CCAS=asppc) generates a.out by default (vs.
conftest.o that we need subsequently)

there is this fragment to determine the way to assemble conftest.s:

if test "$CC" = "$CCAS" ; then
   ompi_assemble="$CCAS $CCASFLAGS -c conftest.s >conftest.out 2>&1"
else
   ompi_assemble="$CCAS $CCASFLAGS conftest.s >conftest.out 2>&1"
fi

The subsequent link fails because conftest.o does not exist:

  ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"

To work around the problem, I did not set CCAS. This gives me the first
invocation that includes the -c argument to CC=ccppc, generating
conftest.o output.


1b. linker fails because LDFLAGS are not passed

The same linker command line caused problems because only $CFLAGS, and
not $LDFLAGS, are passed to the linker:

  ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"

In my environment, I set CC/CFLAGS/LDFLAGS as follows:
CC=ccppc

CFLAGS=-ggdb3 -std=c99 -pedantic -mrtp -msoft-float -mstrict-align
-mregnames -fno-builtin -fexceptions

LDFLAGS=-mrtp -msoft-float -Wl,--start-group -Wl,--end-group
-L/amd/raptor/root/opt/WindRiver/vxworks-6.3/target/usr/lib/ppc/PPC32/sfcommon 

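The linker flags are not passed because the ompi_link command line does
not reference $LDFLAGS. Compiling a test program by hand with only
$CFLAGS fails at the link step: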

[xp-kcain1:build_vxworks]  ccppc -ggdb3 -std=c99 -pedantic -mrtp
-msoft-float -mstrict-align -mregnames -fno-builtin -fexceptions -o
hello hello.c
/amd/raptor/root/opt/WindRiver/gnu/3.4.4-vxworks-6.3/x86-linux2/bin/../lib/gcc/powerpc-wrs-vxworks/3.4.4/../../../../powerpc-wrs-vxworks/bin/ld: cannot find -lc_internal
collect2: ld returned 1 exit status


2. OPAL atomics asm.c:
int versus int32_t (refer to email with Brian Barrett)

3. OPAL event/event.c: sys/time.h and timercmp() macros not defined by
VxWorks
refer to workaround in event.c using #ifdef MCS_VXWORKS

4. OPAL event/event.c: pipe() syscall not found
workaround:

#ifdef HAVE_UNISTD_H
#include <unistd.h>
#ifdef MCS_VXWORKS
#include <ioLib.h>        /* for pipe() */
#endif
#endif

5. OPAL event/signal.c
static sig_atomic_t opal_evsigcaught[NSIG];
NSIG is not defined
but _NSIGS is

In Linux, NSIG is defined with -D__USE_MISC

So I added this code fragment to signal.c:

/* VxWorks signal.h defines _NSIGS, not NSIG */
#ifdef MCS_VXWORKS
#define NSIG (_NSIGS+1)
#endif


6. OPAL event/signal.c: no socketpair()

workaround: use pipe():

#ifdef HAVE_UNISTD_H
#include <unistd.h>
#ifdef MCS_VXWORKS
#include <ioLib.h>        /* for pipe() */
#endif
#endif

and later in void opal_evsignal_init(sigset_t *evsigmask)

#ifdef MCS_VXWORKS
   if (pipe(ev_signal_pair) == -1)
       event_err(1, "%s: pipe", __func__);
#else
   if (socketpair(AF_UNIX, SOCK_STREAM, 0, ev_signal_pair) == -1)
       event_err(1, "%s: socketpair", __func__);
#endif
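
The choice between pipe() and socketpair() in item 6 could also be hidden
behind one small helper so the #ifdef lives in a single place. A minimal
sketch; make_signal_pair is a hypothetical name, not something in the OPAL
tree, and the VxWorks branch is untested here:

```c
#include <unistd.h>        /* pipe(), read(), write() */
#ifdef MCS_VXWORKS
#include <ioLib.h>         /* pipe() on VxWorks, per item 4 */
#else
#include <sys/socket.h>    /* socketpair() */
#endif

/* Hypothetical helper: creates a connected descriptor pair.
 * Returns 0 on success and -1 on failure, like pipe() and socketpair(). */
static int make_signal_pair(int fds[2])
{
#ifdef MCS_VXWORKS
    /* A pipe is one-way: fds[0] is the read end, fds[1] the write end. */
    return pipe(fds);
#else
    /* A socket pair is two-way: either end can be read or written. */
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
#endif
}
```

One thing to check when swapping the two: a pipe is unidirectional, so any
code that writes to ev_signal_pair[0] or reads from ev_signal_pair[1]
works with socketpair() but would fail with pipe().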

7. OPAL util/basename.c: #if HAVE_DIRNAME problem

../../../opal/util/basename.c:23:5: warning: "HAVE_DIRNAME" is not defined
../../../opal/util/basename.c: In function `opal_dirname':

problem: HAVE_DIRNAME is not defined in opal_config.h, so #if
HAVE_DIRNAME triggers the preprocessor warning above (the undefined
macro is evaluated as 0)

workaround:
change #if HAVE_DIRNAME to #if defined(HAVE_DIRNAME)
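
The difference between the two forms in item 7 can be seen in isolation.
A minimal sketch; HAVE_DIRNAME_DEMO, PLAIN_IF_TAKEN and DEFINED_TAKEN are
hypothetical names used only for illustration:

```c
/* HAVE_DIRNAME_DEMO stands in for HAVE_DIRNAME and is deliberately left
 * undefined, as HAVE_DIRNAME is in the VxWorks build. */

/* In a plain #if, an undefined identifier is evaluated as 0. GCC's
 * -Wundef option is what reports it as "is not defined". */
#if HAVE_DIRNAME_DEMO
#define PLAIN_IF_TAKEN 1
#else
#define PLAIN_IF_TAKEN 0
#endif

/* defined() only asks whether the macro exists, so nothing is evaluated
 * and there is nothing to warn about. */
#if defined(HAVE_DIRNAME_DEMO)
#define DEFINED_TAKEN 1
#else
#define DEFINED_TAKEN 0
#endif
```

Both forms skip the guarded code here. They would only differ if the macro
were defined to 0, which autoconf's function checks do not do: they define
the macro to 1 or leave it undefined.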


8. OPAL util/basename.c: strncpy_s and _strdup
../../../opal/util/basename.c: In function `opal_dirname':
../../../opal/util/basename.c:153: error: implicit declaration of
function `strncpy_s'
../../../opal/util/basename.c:160: error: implicit declaration of
function `_strdup'

#ifdef MCS_VXWORKS
       strncpy( ret, filename, p - filename);
#else
               strncpy_s( ret, (p - filename + 1), filename, p - filename );
#endif
#ifdef MCS_VXWORKS
   return strdup(".");
#else
   return _strdup(".");
#endif
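
One caveat on the strncpy() branch in item 8: unlike strncpy_s(),
strncpy() does not write a terminating NUL when the source holds at least
as many characters as the count, which is the case when copying a prefix
of filename. Whether ret is already zero-filled is not visible in the
fragment above, so this is a sketch under that uncertainty; copy_prefix is
a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first n bytes of src into dst and always
 * NUL-terminate the result. The caller must guarantee that src holds at
 * least n bytes and that dst has room for n + 1 bytes. */
static char *copy_prefix(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
    dst[n] = '\0';
    return dst;
}
```

In opal_dirname this would be called as copy_prefix(ret, filename,
p - filename), assuming ret was allocated with at least one extra byte.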



9. opal/util/if.c: socket() prototype not found in vxworks headers

#ifdef HAVE_SYS_SOCKET_H
#include <sys/socket.h>
#ifdef MCS_VXWORKS
#include <sockLib.h>
#endif
#endif

10. opal/util/if.c: ioctl()

#ifdef HAVE_SYS_IOCTL_H
#include <sys/ioctl.h>
#ifdef MCS_VXWORKS
#include <ioLib.h>
#endif
#endif

11. opal/util/os_path.c: MAXPATHLEN change to PATH_MAX

#ifdef MCS_VXWORKS
   if (total_length > PATH_MAX) {  /* path length is too long - reject it */
       return(NULL);
#else
   if (total_length > MAXPATHLEN) {  /* path length is too long - reject it */
       return(NULL);
#endif
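
An alternative to duplicating the if statement in item 11 (and again in
item 13) is to pick the limit once with a feature test on the macros
themselves. A minimal sketch; MY_PATH_MAX and path_too_long are
hypothetical names, and the 1024 fallback is an assumption rather than a
value taken from any VxWorks header:

```c
#include <limits.h>   /* PATH_MAX, where the platform provides it */
#include <stddef.h>   /* size_t */

/* Prefer whichever limit the platform headers define. */
#if defined(MAXPATHLEN)
#define MY_PATH_MAX MAXPATHLEN
#elif defined(PATH_MAX)
#define MY_PATH_MAX PATH_MAX
#else
#define MY_PATH_MAX 1024   /* assumed fallback, not from any header */
#endif

/* Hypothetical helper: nonzero if a path of this length must be rejected. */
static int path_too_long(size_t total_length)
{
    return total_length > MY_PATH_MAX;
}
```

With this, the check in os_path.c and output.c reads the same on every
platform and needs no MCS_VXWORKS test.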


12. opal/util/output.c: gethostname()
include <hostLib.h>

13. opal/util/output.c: MAXPATHLEN
same fix as os_path.c above

14. opal/util/output.c: closelog/openlog/syslog
manually turned off HAVE_SYSLOG_H in opal_config.h
then got a patch from Jeff Squyres that avoids syslog

15. opal/util/opal_pty.c
complains about mismatched prototype of opal_openpty() between this
source file and opal_pty.h

workaround: manually edit build_vxworks_ppc/opal/include/opal_config.h,
use the following line (change 1 to 0):
#define OMPI_ENABLE_PTY_SUPPORT 0

16. opal/util/stacktrace.c
FPE_FLTINV not present in signal.h

workaround: edit opal_config.h to turn off
OMPI_WANT_PRETTY_PRINT_STACKTRACE (this can be explicitly configured out
but I don't want to reconfigure because I hacked #15 above)

17. opal/mca/base/mca_base_open.c
gethostname() -- same as opal/util/output.c, must include hostLib.h

18. opal_progress.c
from opal/event/event.h (that I modified earlier)
cannot find #include <sys/_timeradd.h>
It is in opal/event/compat/sys

workaround: change event.h to include the definitions that are present
in _timeradd.h instead of including it.
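
For reference, inlined fallbacks for item 18 usually look like the sketch
below. These are the classic BSD-style definitions, not a copy of what is
in opal/event/compat/sys/_timeradd.h. The <sys/time.h> include is what a
Linux host needs for struct timeval; item 3 reports that header is missing
on VxWorks, so it would have to be replaced by whichever VxWorks header
declares struct timeval (not verified here).

```c
#include <sys/time.h>   /* struct timeval on the host; differs on VxWorks */

/* Define each macro only if the platform has not already done so. */
#ifndef timeradd
#define timeradd(a, b, res)                                 \
    do {                                                    \
        (res)->tv_sec = (a)->tv_sec + (b)->tv_sec;          \
        (res)->tv_usec = (a)->tv_usec + (b)->tv_usec;       \
        if ((res)->tv_usec >= 1000000) {                    \
            (res)->tv_sec++;        /* carry into seconds */ \
            (res)->tv_usec -= 1000000;                      \
        }                                                   \
    } while (0)
#endif

#ifndef timersub
#define timersub(a, b, res)                                 \
    do {                                                    \
        (res)->tv_sec = (a)->tv_sec - (b)->tv_sec;          \
        (res)->tv_usec = (a)->tv_usec - (b)->tv_usec;       \
        if ((res)->tv_usec < 0) {                           \
            (res)->tv_sec--;        /* borrow from seconds */ \
            (res)->tv_usec += 1000000;                      \
        }                                                   \
    } while (0)
#endif
```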

19. Link errors for opal_wrapper
strcasecmp
strncasecmp

I rolled my own in mca_base_open.c (temporary fix, since we may come across this problem elsewhere in the code).
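
A replacement of the kind item 19 describes can be quite small. A minimal
sketch, not the code that was actually added to mca_base_open.c; the
compat_ names are hypothetical and chosen so they cannot clash with a libc
that does provide strcasecmp and strncasecmp:

```c
#include <ctype.h>    /* tolower() */
#include <stddef.h>   /* size_t */

/* Compare at most n characters, ignoring case. Returns <0, 0 or >0,
 * following the strncasecmp() convention. */
static int compat_strncasecmp(const char *a, const char *b, size_t n)
{
    for (; n > 0; a++, b++, n--) {
        /* Cast through unsigned char: tolower() is undefined for
         * negative values other than EOF. */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            return 0;   /* both strings ended together */
    }
    return 0;
}

/* Unbounded variant: the loop above stops at the terminating NUL. */
static int compat_strcasecmp(const char *a, const char *b)
{
    return compat_strncasecmp(a, b, (size_t)-1);
}
```

Keeping these in one shared source file rather than in mca_base_open.c
would let other callers pick them up if the same link error shows up
elsewhere.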

20. dss_internal.h uses a type 'uint'
Not sure if it's depending on something in the headers, or something it
defined on its own.

I changed it to be just like the header I found somewhere under Linux /usr/include:
#ifdef MCS_VXWORKS
typedef unsigned int uint;
#endif

21. struct iovec definition needed
orte/mca/iof/base/iof_base_fragment.h:45: warning: array type has
incomplete element type

#ifdef MCS_VXWORKS
#include <net/uio.h>
#endif

not sure if this is right, or if I should include something like
<netBufLib.h> or <ioLib.h>


22. iof_base_setup.c
struct termios not understood
can only find termios.h header in 'diab' area and I'm not using that
compiler.

a variable usepty is set to 0 already when OMPI_ENABLE_PTY_SUPPORT is 0.
So, why are we compiling this fragment of code at all? I hacked the file
so that the struct termios code will not get compiled.

23. oob_base_send/recv.c, oob_base_send/recv_nb.c. struct iovec not known.

#ifdef MCS_VXWORKS
#include <net/uio.h>
#endif

24. orte/mca/rmgr/base/rmgr_base_check_context.c:58: error:
`MAXHOSTNAMELEN' undeclared (first use in this function)

#ifdef MCS_VXWORKS
#define MAXHOSTNAMELEN 64
#endif

25. orte/mca/rmgr/base/rmgr_base_check_context.c:58:
gethostname()

#ifdef MCS_VXWORKS
#include <hostLib.h>
#endif

26. orte/mca/iof/proxy/iof_proxy.h:135: warning: array type has
incomplete element type
../../../../../orte/mca/iof/proxy/iof_proxy.h:135: error: field
`proxy_iov' has incomplete type

#ifdef MCS_VXWORKS
#include <net/uio.h>
#endif

27. /orte/mca/iof/svc/iof_svc.h:147: warning: array type has incomplete
element type
../../../../../orte/mca/iof/svc/iof_svc.h:147: error: field `svc_iov'
has incomplete type

#ifdef MCS_VXWORKS
#include <net/uio.h>
#endif

28. ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: warning: array
type has incomplete element type
../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: error: field `msg_iov'
has incomplete type
../../../../../orte/mca/oob/tcp/oob_tcp_msg.h: In function
`mca_oob_tcp_msg_iov_alloc':
../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:196: error: invalid
application of `sizeof' to incomplete type `iovec'


29. ../../../../../orte/mca/oob/tcp/oob_tcp.c:344: error: implicit
declaration of function `accept'
../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
`mca_oob_tcp_create_listen':
../../../../../orte/mca/oob/tcp/oob_tcp.c:383: error: implicit
declaration of function `socket'
../../../../../orte/mca/oob/tcp/oob_tcp.c:399: error: implicit
declaration of function `bind'
../../../../../orte/mca/oob/tcp/oob_tcp.c:407: error: implicit
declaration of function `getsockname'
../../../../../orte/mca/oob/tcp/oob_tcp.c:415: error: implicit
declaration of function `listen'
../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
`mca_oob_tcp_listen_thread':
../../../../../orte/mca/oob/tcp/oob_tcp.c:459: error: implicit
declaration of function `bzero'
../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
`mca_oob_tcp_recv_probe':
../../../../../orte/mca/oob/tcp/oob_tcp.c:696: error: implicit
declaration of function `send'
../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
`mca_oob_tcp_recv_handler':
../../../../../orte/mca/oob/tcp/oob_tcp.c:795: error: implicit
declaration of function `recv'
../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_init':
../../../../../orte/mca/oob/tcp/oob_tcp.c:1087: error: implicit
declaration of function `usleep'

This gets rid of most (except bzero and usleep)
#ifdef MCS_VXWORKS
#include <sockLib.h>
#endif

I am trying to reconfigure the package so that CFLAGS does not include
-pedantic. This is because $WIND_HOME/vxworks-6.3/target/h/string.h has
prototypes for bzero, but only when #if _EXTENSION_WRS is true. So will
turning off -ansi/-pedantic expose them? In my dreams?
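
For bzero specifically, there is a route that does not depend on how
_EXTENSION_WRS gets set: memset() is standard C and is declared by
<string.h> even under -ansi/-pedantic. A minimal sketch; compat_bzero is a
hypothetical name, and this does nothing for the usleep error:

```c
#include <stddef.h>   /* size_t */
#include <string.h>   /* memset(), available in strict ISO C mode */

/* Hypothetical stand-in for bzero(): zero n bytes starting at p. */
static void compat_bzero(void *p, size_t n)
{
    memset(p, 0, n);
}
```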

On Mar 17, 2010, at 9:54 PM, 张晶 wrote:

Hello all,

In order to add some real-time features to Open MPI for my research, I need a version of Open MPI that runs on VxWorks. But after going through the Open MPI website, I can't find any indication that it supports VxWorks.

Following the thread posted by Ralph Castain, http://www.open-mpi.org/community/lists/users/2006/06/1371.php, I read some papers about OpenRTE, such as "Creating a transparent, distributed, and resilient computing environment: the OpenRTE project" and "The Open Run-Time Environment (OpenRTE): A Transparent Multi-cluster Environment for High-Performance Computing", written by Ralph H. Castain, Jeffrey M. Squyres and others.

I now have a basic understanding of OpenRTE; however, there is too little documentation describing its implementation. I don't know where or how to begin the migration. Any advice would be appreciated.

Thanks

Jing Zhang
