| Abstract |
Message-passing within an SMP node is usually accomplished
using a 2-copy mechanism. The source process copies the message to a
shared-memory segment, then the destination process copies the message to
its memory space. Latencies as low as 1 microsecond can be achieved, but the
maximum throughput is limited to half the memory copy rate because each
message is copied twice.
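
The 2-copy path can be sketched in a few lines of Python. This is a minimal illustration, not the implementation discussed in the talk: the function name `two_copy_transfer` is hypothetical, and for brevity both "processes" run in one interpreter, with the destination attaching to the segment by name as a separate process would.

```python
from multiprocessing import shared_memory
from typing import Tuple


def two_copy_transfer(message: bytes) -> Tuple[bytes, int]:
    """Move `message` through a shared-memory segment.

    Returns the received bytes and the number of copies performed.
    (Hypothetical helper for illustration only.)
    """
    copies = 0
    # Segment that both sides can map; size must be at least 1 byte.
    segment = shared_memory.SharedMemory(create=True, size=max(len(message), 1))
    try:
        # Copy 1: the source writes its message into the shared segment.
        segment.buf[:len(message)] = message
        copies += 1

        # The destination attaches to the same segment by name.
        peer = shared_memory.SharedMemory(name=segment.name)
        try:
            # Copy 2: the destination copies the payload into its own memory.
            received = bytes(peer.buf[:len(message)])
            copies += 1
        finally:
            peer.close()
    finally:
        segment.close()
        segment.unlink()
    return received, copies
```

Every byte crosses the memory system twice here, which is the source of the throughput limit described above.
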
This talk will focus on a 1-copy mechanism in which a Linux kernel module
transfers the message data directly between two user processes. The module
provides the basic communication primitives needed to support a full MPI
implementation, including get, put, sync, broadcast, and gather.
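
The expected gain from removing one copy can be estimated with a rough model. The sketch below assumes that the memory copy rate is the only bottleneck and that copies do not overlap; the function name and the 4.0 GB/s figure are illustrative assumptions, not measurements from this work.

```python
def effective_throughput(copy_rate: float, copies: int) -> float:
    """Upper bound on message throughput when each byte is copied `copies` times.

    Assumes serialized copies and no other bottleneck (illustrative model).
    """
    if copies < 1:
        raise ValueError("a message must be copied at least once")
    return copy_rate / copies


# Illustrative copy rate only, not a measured value.
COPY_RATE_GB_S = 4.0

two_copy = effective_throughput(COPY_RATE_GB_S, 2)  # via shared-memory segment
one_copy = effective_throughput(COPY_RATE_GB_S, 1)  # direct process-to-process copy
```

Under these assumptions the 1-copy bound is exactly twice the 2-copy bound, whatever the underlying copy rate is.
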
Beyond the doubling of throughput that comes from removing one copy,
optimized memory copy routines can be used on some systems to increase
performance by as much as an additional factor of two. Future directions for
this work will be discussed,
including research into 0-copy mechanisms that remain within the
message-passing paradigm.
|