Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] "Open MPI"-based MPI library used by K computer
From: Rayson Ho (raysonlogin_at_[hidden])
Date: 2012-09-20 02:57:59


I found this paper recently, "MPI Library and Low-Level Communication
on the K computer", available at:

http://www.fujitsu.com/downloads/MAG/vol48-3/paper11.pdf

What are the criteria for adding papers to the "Open MPI Publications" page?

Rayson

==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/

On Fri, Nov 18, 2011 at 5:32 AM, George Bosilca <bosilca_at_[hidden]> wrote:
> Dear Yuki and Takahiro,
>
> Thanks for the bug report and for the patch. I pushed a [nearly identical] patch to the trunk in https://svn.open-mpi.org/trac/ompi/changeset/25488. A special version for the 1.4 branch has been prepared and attached to ticket #2916 (https://svn.open-mpi.org/trac/ompi/ticket/2916).
>
> Thanks,
> george.
>
>
> On Nov 14, 2011, at 02:27 , Y.MATSUMOTO wrote:
>
>> Dear Open MPI community,
>>
>> I'm a member of the MPI library development team at Fujitsu.
>> Takahiro Kawashima, who sent mail earlier, is my colleague.
>> We are starting to feed back our fixes.
>>
>> First, we fixed a problem with MPI_LB/MPI_UB and data packing.
>>
>> A program crashes when it meets all of the following conditions:
>> a: The datatype being sent is a contiguous derived type.
>> b: Either or both of MPI_LB and MPI_UB are used in the datatype.
>> c: The size of the datatype is smaller than its extent (the datatype has a gap).
>> d: The send count is greater than 1.
>> e: The total size of the data is greater than the "eager limit".
>>
>> The attached C program reproduces this problem.
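
The five conditions above can be written down as a small predicate. This is only an illustrative sketch in plain C, not Open MPI code: the `type_desc` struct, both function names, and the eager-limit value used in the usage note are hypothetical, and the real attached reproducer is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of a derived datatype.
 * These names are illustrative only; they are not Open MPI structures. */
typedef struct {
    long   lb;          /* lower bound marker (as set with MPI_LB), in bytes */
    long   ub;          /* upper bound marker (as set with MPI_UB), in bytes */
    size_t size;        /* bytes of real data in one element */
    bool   contiguous;  /* condition a: contiguous derived type */
    bool   uses_lb_ub;  /* condition b: MPI_LB and/or MPI_UB is used */
} type_desc;

/* Extent is the span from the lower bound to the upper bound. */
static long type_extent(const type_desc *t)
{
    assert(t->ub >= t->lb);
    return t->ub - t->lb;
}

/* Returns true when a send of `count` elements meets conditions a to e.
 * `eager_limit` is supplied by the caller; its value is an assumption. */
static bool matches_reported_conditions(const type_desc *t, int count,
                                        size_t eager_limit)
{
    bool has_gap     = t->size < (size_t)type_extent(t);         /* c */
    bool multi_count = count > 1;                                /* d */
    bool above_eager = t->size * (size_t)count > eager_limit;    /* e */
    return t->contiguous && t->uses_lb_ub &&
           has_gap && multi_count && above_eager;
}
```

For example, a type with 8 bytes of data and a 16-byte extent, sent with a count of 1024 against an assumed eager limit of 4096 bytes, satisfies all five conditions; the same type sent with a count of 1 does not.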
>>
>> An invalid memory access occurs because "done" takes an unintended
>> value and "max_allowed" becomes negative at the following place in
>> "ompi/datatype/datatype_pack.c" (in version 1.4.3).
>>
>>
>> (ompi/datatype/datatype_pack.c, lines 188-198)
>>     packed_buffer = (unsigned char *) iov[iov_count].iov_base;
>>     done = pConv->bConverted - i * pData->size;  /* partial data from last pack */
>>     if( done != 0 ) {  /* still some data to copy from the last time */
>>         done = pData->size - done;
>>         OMPI_DDT_SAFEGUARD_POINTER( user_memory, done, pConv->pBaseBuf, pData, pConv->count );
>>         MEMCPY_CSUM( packed_buffer, user_memory, done, pConv );
>>         packed_buffer += done;
>>         max_allowed -= done;
>>         total_bytes_converted += done;
>>         user_memory += (extent - pData->size + done);
>>     }
>>
>> This code assumes that "done" is the size of the partial data left over from the last pack.
>> However, when the program crashes, "done" equals the total size of all transmitted data.
>> That makes "max_allowed" negative.
>>
>> We modified the code as follows, and it passed our test suite.
>> But we are not sure this fix is correct. Can anyone review it?
>> The patch (against the Open MPI 1.4 branch) is attached to this mail.
>>
>> - if( done != 0 ) { /* still some data to copy from the last time */
>> + if( (done + max_allowed) >= pData->size ) { /* still some data to copy from the last time */
>>
>> Best regards,
>>
>> Yuki MATSUMOTO
>> MPI development team,
>> Fujitsu
>>
>> (2011/06/28 10:58), Takahiro Kawashima wrote:
>>> Dear Open MPI community,
>>>
>>> I'm a member of the MPI library development team at Fujitsu. Shinji
>>> Sumimoto, whose name appears in Jeff's blog, is one of our bosses.
>>>
>>> As Rayson and Jeff noted, the K computer, the world's most powerful HPC
>>> system, developed by RIKEN and Fujitsu, uses Open MPI as the base of its
>>> MPI library. We at Fujitsu are pleased to announce that, and we offer
>>> special thanks to the Open MPI community.
>>> We are sorry for the late announcement!
>>>
>>> Our MPI library is based on the Open MPI 1.4 series, and has a new
>>> point-to-point component (BTL) and new topology-aware collective
>>> communication algorithms (COLL). It is also adapted to our runtime
>>> environment (ESS, PLM, GRPCOMM, etc.).
>>>
>>> The K computer connects 68,544 nodes through our custom interconnect.
>>> Its runtime environment is our proprietary one, so we don't use orted.
>>> We cannot disclose the start-up time yet because of disclosure restrictions, sorry.
>>>
>>> We are surprised by the extensibility of Open MPI, and have shown that
>>> Open MPI scales to the 68,000-process level! We are delighted to
>>> use such great open-source software.
>>>
>>> We cannot disclose the details of our technology yet because of our
>>> contract with RIKEN AICS. However, we plan to feed back our improvements
>>> and bug fixes. We can contribute some bug fixes soon, but contribution
>>> of our improvements will come next year, with the Open MPI agreement.
>>>
>>> Best regards,
>>>
>>> MPI development team,
>>> Fujitsu
>>>
>>>
>>>> I got more information:
>>>>
>>>> http://blogs.cisco.com/performance/open-mpi-powers-8-petaflops/
>>>>
>>>> Short version: yes, Open MPI is used on K and was used to power the 8PF runs.
>>>>
>>>> w00t!
>>>>
>>>>
>>>>
>>>> On Jun 24, 2011, at 7:16 PM, Jeff Squyres wrote:
>>>>
>>>>> w00t!
>>>>>
>>>>> OMPI powers 8 petaflops!
>>>>> (at least I'm guessing that -- does anyone know if that's true?)
>>>>>
>>>>>
>>>>> On Jun 24, 2011, at 7:03 PM, Rayson Ho wrote:
>>>>>
>>>>>> Interesting... page 11:
>>>>>>
>>>>>> http://www.fujitsu.com/downloads/TC/sc10/programming-on-k-computer.pdf
>>>>>>
>>>>>> Open MPI based:
>>>>>>
>>>>>> * Open Standard, Open Source, Multi-Platform including PC Cluster.
>>>>>> * Adding extension to Open MPI for "Tofu" interconnect
>>>>>>
>>>>>> Rayson
>>>>>> http://blogs.scalablelogic.com/