Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-07-18 16:32:58


Crud.

The PathScale 3.0 compilers do not support thread-local storage (TLS).
This is what we've been fighting with in
https://svn.open-mpi.org/trac/ompi/ticket/1025. QLogic just told us
last week that their compiler does not support TLS. Even though OMPI
was not using TLS itself, glibc does, and something calls abort() deep
within pthread_exit(NULL). If you don't use the TLS glibc, everything
works fine, but the TLS glibc is the default on many Linux systems.

QLogic is looking into the problem and said they will get back to us
(I'm unwilling to do horrid LD_PRELOAD tricks to get the non-TLS
glibc, etc.).

I'm guessing that this change will guarantee that the PathScale 3.0
compilers do not work at all.

Is this change just to fix a memory leak? If so, could we add a
configure test to see if the compiler is broken w.r.t. TLS? (I know,
I know... :-( )
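
For illustration, here is a minimal sketch of the kind of program such a
configure test might compile and run. It is an assumption, not something
taken from ticket 1025 or verified against PathScale 3.0: the names
tsd_probe, probe_thread, probe_destructor and probe_key are hypothetical,
and it only exercises the path described above (a thread that holds
thread-specific data and then calls pthread_exit(NULL)).

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical probe state; none of these names exist in Open MPI. */
static pthread_key_t probe_key;
static int destructor_ran = 0;

/* Called by the pthread library when a thread holding a value exits. */
static void probe_destructor(void *value)
{
    free(value);
    destructor_ran = 1;
}

/* Store a thread-specific buffer, then leave via pthread_exit(NULL),
 * which is the call reported to end in abort() on the broken setup. */
static void *probe_thread(void *arg)
{
    void *buf = malloc(16);

    (void) arg;
    if (NULL == buf || 0 != pthread_setspecific(probe_key, buf)) {
        free(buf);
        return (void *) 1;      /* signal failure to the joiner */
    }
    pthread_exit(NULL);
    return NULL;                /* not reached */
}

/* Returns 0 if the thread exited cleanly and the destructor ran,
 * non-zero otherwise.  If the process aborts instead, the configure
 * run fails, which is also the answer we want. */
int tsd_probe(void)
{
    pthread_t tid;
    void *result = NULL;

    destructor_ran = 0;
    if (0 != pthread_key_create(&probe_key, probe_destructor)) return 1;
    if (0 != pthread_create(&tid, NULL, probe_thread, NULL)) {
        pthread_key_delete(probe_key);
        return 1;
    }
    if (0 != pthread_join(tid, &result)) return 1;
    pthread_key_delete(probe_key);

    return (NULL == result && destructor_ran) ? 0 : 1;
}
```

A configure check would wrap this in a main() that returns tsd_probe(),
link it with the thread library, and treat a non-zero exit status or a
crash as "compiler/glibc TLS combination is broken". Because it has to
run the program, it could not give an answer when cross-compiling.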

On Jul 18, 2007, at 4:25 PM, brbarret_at_[hidden] wrote:

> Author: brbarret
> Date: 2007-07-18 16:25:01 EDT (Wed, 18 Jul 2007)
> New Revision: 15494
> URL: https://svn.open-mpi.org/trac/ompi/changeset/15494
>
> Log:
> Use thread specific data and static buffers for the return type of
> opal_net_get_hostname() rather than malloc, because no one was freeing
> the buffer and the common use case was for printfs, where calling
> free is a pain.
>
> Text files modified:
> trunk/opal/runtime/opal_finalize.c | 3 +
> trunk/opal/runtime/opal_init.c | 6 +++
> trunk/opal/util/net.c | 68 +++++++++++++++++++++++++++++++++++++++
> trunk/opal/util/net.h | 28 +++++++++++++++
> 4 files changed, 103 insertions(+), 2 deletions(-)
>
> Modified: trunk/opal/runtime/opal_finalize.c
> ======================================================================
> ========
> --- trunk/opal/runtime/opal_finalize.c (original)
> +++ trunk/opal/runtime/opal_finalize.c 2007-07-18 16:25:01 EDT (Wed, 18 Jul 2007)
> @@ -25,6 +25,7 @@
> #include "opal/util/output.h"
> #include "opal/util/malloc.h"
> #include "opal/util/if.h"
> +#include "opal/util/net.h"
> #include "opal/util/keyval_parse.h"
> #include "opal/memoryhooks/memory.h"
> #include "opal/mca/base/base.h"
> @@ -53,6 +54,8 @@
> close when not opened internally */
> opal_iffinalize();
>
> + opal_net_finalize();
> +
> /* keyval lex-based parser */
> opal_util_keyval_parse_finalize();
>
>
> Modified: trunk/opal/runtime/opal_init.c
> ======================================================================
> ========
> --- trunk/opal/runtime/opal_init.c (original)
> +++ trunk/opal/runtime/opal_init.c 2007-07-18 16:25:01 EDT (Wed, 18 Jul 2007)
> @@ -28,6 +28,7 @@
> #include "opal/memoryhooks/memory.h"
> #include "opal/mca/base/base.h"
> #include "opal/runtime/opal.h"
> +#include "opal/util/net.h"
> #include "opal/mca/installdirs/base/base.h"
> #include "opal/mca/memory/base/base.h"
> #include "opal/mca/memcpy/base/base.h"
> @@ -165,6 +166,11 @@
> goto return_error;
> }
>
> + if (OPAL_SUCCESS != (ret = opal_net_init())) {
> + error = "opal_net_init";
> + goto return_error;
> + }
> +
> /* Setup the parameter system */
> if (OPAL_SUCCESS != (ret = mca_base_param_init())) {
> error = "mca_base_param_init";
>
> Modified: trunk/opal/util/net.c
> ======================================================================
> ========
> --- trunk/opal/util/net.c (original)
> +++ trunk/opal/util/net.c 2007-07-18 16:25:01 EDT (Wed, 18 Jul 2007)
> @@ -74,9 +74,62 @@
> #include "opal/util/output.h"
> #include "opal/util/strncpy.h"
> #include "opal/constants.h"
> +#include "opal/threads/tsd.h"
>
> #ifdef HAVE_STRUCT_SOCKADDR_IN
>
> +#if OPAL_WANT_IPV6
> +static opal_tsd_key_t hostname_tsd_key;
> +
> +
> +static void
> +hostname_cleanup(void *value)
> +{
> + opal_output(0, "cleaning up buffer: 0x%lx", value);
> + if (NULL != value) free(value);
> +}
> +
> +
> +static char*
> +get_hostname_buffer(void)
> +{
> + void *buffer;
> + int ret;
> +
> + ret = opal_tsd_getspecific(hostname_tsd_key, &buffer);
> + if (OPAL_SUCCESS != ret) return NULL;
> +
> + if (NULL == buffer) {
> + opal_output(0, "getting a buffer");
> + buffer = (void*) malloc((NI_MAXHOST + 1) * sizeof(char));
> + ret = opal_tsd_setspecific(hostname_tsd_key, buffer);
> + }
> +
> + opal_output(0, "returning buffer: 0x%lx", buffer);
> +
> + return (char*) buffer;
> +}
> +#endif
> +
> +
> +int
> +opal_net_init()
> +{
> +#if OPAL_WANT_IPV6
> + return opal_tsd_key_create(&hostname_tsd_key, hostname_cleanup);
> +#else
> + return OPAL_SUCCESS;
> +#endif
> +}
> +
> +
> +int
> +opal_net_finalize()
> +{
> + return OPAL_SUCCESS;
> +}
> +
> +
> /* convert a CIDR prefixlen to netmask (in network byte order) */
> uint32_t
> opal_net_prefix2netmask(uint32_t prefixlen)
> @@ -225,7 +278,7 @@
> opal_net_get_hostname(struct sockaddr *addr)
> {
> #if OPAL_WANT_IPV6
> - char *name = (char *)malloc((NI_MAXHOST + 1) * sizeof(char));
> + char *name = get_hostname_buffer();
> int error;
> socklen_t addrlen;
>
> @@ -297,6 +350,19 @@
>
> #else /* HAVE_STRUCT_SOCKADDR_IN */
>
> +int
> +opal_net_init()
> +{
> + return OPAL_SUCCESS;
> +}
> +
> +
> +int
> +opal_net_finalize()
> +{
> + return OPAL_SUCCESS;
> +}
> +
>
> uint32_t
> opal_net_prefix2netmask(uint32_t prefixlen)
>
> Modified: trunk/opal/util/net.h
> ======================================================================
> ========
> --- trunk/opal/util/net.h (original)
> +++ trunk/opal/util/net.h 2007-07-18 16:25:01 EDT (Wed, 18 Jul 2007)
> @@ -35,6 +35,31 @@
>
> BEGIN_C_DECLS
>
> +/**
> + * Intiailize the network helper subsystem
> + *
> + * Initialize the network helper subsystem. Should be called exactly
> + * once for any process that will use any function in the network
> + * helper subsystem.
> + *
> + * @retval OPAL_SUCCESS Success
> + * @retval OPAL_ERR_TEMP_OUT_OF_RESOURCE Not enough memory for static
> + * buffer creation
> + */
> +OPAL_DECLSPEC int opal_net_init(void);
> +
> +
> +/**
> + * Finalize the network helper subsystem
> + *
> + * Finalize the network helper subsystem. Should be called exactly
> + * once for any process that will use any function in the network
> + * helper subsystem.
> + *
> + * @retval OPAL_SUCCESS Success
> + */
> +OPAL_DECLSPEC int opal_net_finalize(void);
> +
>
> /**
> * Calculate netmask in network byte order from CIDR notation
> @@ -90,7 +115,8 @@
> * Get string version of address
> *
> * Return the un-resolved address in a string format. The string will
> - * be created with malloc and the user must free the string.
> + * be returned in a per-thread static buffer and should not be freed
> + * by the user.
> *
> * @param addr struct sockaddr of address
> * @return literal representation of \c addr

-- 
Jeff Squyres
Cisco Systems