Address Translation Mechanisms In Network Interfaces
Abstract
Abstractions that provide sender-based communication, such as HP Hamlyn and Berkeley Active Messages, are powerful enough to support minimal messaging. The sender specifies the source and destination addresses as offsets in message segments for every message. In theory, message segments can be defined to cover the entire application address space. However, in current Hamlyn [8,56] and Active Messages [29,26,10] implementations, the address translation structures are either absent or limited in their reach. For example, the latest prototype Hamlyn implementation [8] is built on hardware identical to ours (Myrinet), yet message buffers must be pinned in main memory. This limits the coverage of the mechanism to the amount of application data that can be wired in physical memory. Similarly, the Active Messages implementation on Myrinet uses a single-copy approach through an intermediate shared user/kernel buffer, from which the NI pulls message data or into which it pushes message data.

Arizona ADCs [14] have been designed to optimize stream traffic. In Section 3, we discussed why this design cannot fully support minimal messaging. The base Cornell UNet [54] architecture supports an abstraction similar to ADCs and therefore has the same limitations. The original UNet paper [54] also discusses a direct-access UNet architecture whose communication segments can encompass the entire user address space, but that architecture is restricted to future NI designs. Recent work [2] attempted to incorporate address translation mechanisms into existing NIs. Nevertheless, the ADC abstraction has not changed, and therefore these designs are unable to move data to its final destination without extra copying.

Mitsubishi DART [36] is a commercially available NI that comes close to properly supporting minimal messaging. The DART core has been designed to support ADCs, including sophisticated address translation support. Moreover, it defines an interface to a separate coprocessor that processes messages.
Presumably, it will be used to support the hybrid deposit model [35], an abstraction similar to Active Messages, in which the data destination is a function of the receiver’s and the sender’s state. Unfortunately, the address translation is geared towards ADCs. As a result, the translation structures are not flexible enough to efficiently support minimal messaging. The host CPU is always interrupted to handle misses, and the message is blocked until the miss can be resolved. Furthermore, there are no provisions for a fallback action. Thus, the design requires fast kernel interfaces to degrade gracefully once the translation limits are exceeded.

[FIGURE 6. Myrinet best-case throughput and latency. Two panels plotted against Message Size (bytes): Throughput (Bandwidth, MB/sec) and Latency (Round Trip Time, μsecs). Curves: Intermediate, Minimal. Intermediate uses the shared user/kernel buffer for the message data. Minimal directly accesses the application data structures.]

[FIGURE 7. Myrinet latency vs. buffer range. Two panels (512 bytes and 2048 bytes) plotting Latency (μsecs) against Buffer Range (# Pages). Curves: Intermediate, Minimal (Single Copy Fallback), Minimal (Single Copy Fallback) with Threshold = 16384. Intermediate uses the intermediate shared user/kernel buffer for the message data. Minimal directly accesses the application data structures. Minimal with Threshold = 16384 services only one miss for every 16384 bytes transferred.]

Designs with a network coprocessor, like Meiko CS [1] and Intel Paragon [24], can support minimal messaging using the microprocessor’s address translation hardware and a separate DMA engine. Nonetheless, address translation mechanisms implemented for CPUs (TLBs) are not always appropriate for NIs. There are two potential problems. First, the reach of a CPU TLB is very small, typically a few dozen pages.
Message operations can span a range of addresses much wider than a TLB can cover. Moreover, the data transfers compete with other memory accesses (kernel instructions, kernel data, I/O addresses), effectively making a TLB miss the common case for any message operation. Second, data transfers from/to the user address space require the CPU to switch the hardware context to the appropriate process. This operation can have significant overhead depending on the coprocessor’s architecture (e.g., the number of CPU hardware contexts, and TLB and/or cache flushing). Alternatively, the coprocessor can access page tables in software, which makes the coprocessor TLB useless for minimal messaging.

In designs that support remote memory accesses, memory pages in the sender’s address space are associated with memory pages in the receiver’s address space. Memory accesses on the sender are captured by the NI and forwarded to the associated page on the receiver. Page associations are either direct (the sender knows the remote physical address) or indirect (through global network addresses). Examples of this approach include Princeton SHRIMP [4], Forth Telegraphos II [28], DEC Memory Channel [18] and Tandem TNet [22]. SHRIMP and Telegraphos II use direct page associations. The Memory Channel and TNet use indirect page associations. A common characteristic of these designs is their inability to handle misses in the translation structures. Therefore, the translations must be in place before messaging operations begin, which requires communication pages to be locked in memory. Moreover, changing the reach of the translation mechanisms requires expensive system calls. SHRIMP’s prototype NI can hold up to 32K associations between pages. Thus, minimal messaging is supported for up to 128 MB of application data from every sender. Similarly, the Memory Channel supports up to ~50K pages.
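
The reach figures quoted above follow from simple arithmetic: reach is the number of translation entries multiplied by the page size. The sketch below is illustrative only. It assumes a 4 KB page size (which is consistent with 32K associations covering 128 MB, but is not stated in the text), and the helper names `reach_bytes` and `pages_spanned` as well as the 48-entry TLB are hypothetical. Other systems may use a different page size.

```python
PAGE_SIZE = 4096  # bytes; assumed page size, not stated in the text


def reach_bytes(entries: int, page_size: int = PAGE_SIZE) -> int:
    """Amount of memory covered by a translation structure with `entries` page entries."""
    return entries * page_size


def pages_spanned(addr: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Number of distinct pages touched by a buffer of `length` bytes starting at `addr`."""
    if length <= 0:
        return 0
    first_page = addr // page_size
    last_page = (addr + length - 1) // page_size
    return last_page - first_page + 1


if __name__ == "__main__":
    mib = 1024 * 1024
    # 32K page associations, as quoted for the SHRIMP prototype NI
    print("32K associations:", reach_bytes(32 * 1024) // mib, "MiB")
    # A hypothetical 48-entry CPU TLB standing in for "a few dozen pages"
    print("48-entry TLB:", reach_bytes(48) // 1024, "KiB")
    # A small unaligned message can still need more than one translation
    print("pages for 200 bytes at offset 4000:", pages_spanned(4000, 200))
```

Under these assumptions, a structure with a few dozen entries covers only a few hundred kilobytes, which is why message buffers scattered over a large address range tend to miss in a CPU-style TLB.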
In TNet, remote memory operations are supported in a 32-bit window to a node’s physical memory.

A class of designs supports minimal messaging by using the CPU in kernel mode to instruct the NI to move the data to the appropriate place. Such approaches include page remapping in the kernel (implemented in Solaris 2.6 TCP [25]), Washington’s in-kernel emulation of the remote memory access model [51] and other VM manipulations [7]. In these systems, minimal messaging is achieved if the NI can directly access the main memory. However, the kernel is involved in every transfer, and thus user-level messaging is not supported.

Princeton User-level DMA (UDMA) [4] avoids OS intervention and supports minimal messaging when it is used both to send and to receive messages. UDMA is used in SHRIMP to initiate DMA transfers. In this case, it supports minimal messaging on the sender, but on the receiver it suffers from the same problems that we have discussed for SHRIMP. Nevertheless, it can be used on both the sender and the receiver (without SHRIMP’s support for remote memory accesses). Used this way, it supports minimal messaging in a manner that shares common features with the design that allows user-controlled mappings (Section 4.3). In the common case, both avoid kernel intervention for data transfers, and in both the CPU is in the critical path for message operations. Unlike our design, UDMA requires hardware support in NIs to capture transfer requests.

Cray T3E [47] combines remote memory accesses with an approach similar to UDMA. It supports minimal messaging through special NI registers. The CPU initiates transfers directly from remote memory to the NI registers on the local node. It subsequently initiates the transfer of the data from the NI registers to local memory. The CPU must be involved once for every 64 bytes transferred (the maximum message size supported).
T3E includes extensive hardware support for address translation in the form of complete page tables that describe global communication segments. However, the page tables must always have valid translations, and therefore the communication pages are wired in memory.
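
To make the per-transfer CPU cost of the T3E approach concrete, the following sketch models the two-step copy described above: data moves from remote memory into NI registers, and then from the NI registers into local memory, in units of at most 64 bytes. It is a toy model under stated assumptions, not the T3E programming interface. The names `cpu_involvements` and `chunked_get` are hypothetical, and plain Python byte strings stand in for remote memory, NI registers and local memory.

```python
T3E_MAX_MESSAGE = 64  # bytes per transfer, as quoted in the text


def cpu_involvements(nbytes: int, chunk: int = T3E_MAX_MESSAGE) -> int:
    """Number of times the CPU must step in to move `nbytes` (ceiling division)."""
    return -(-nbytes // chunk)


def chunked_get(remote: bytes, chunk: int = T3E_MAX_MESSAGE):
    """Toy model: copy `remote` to local memory through a `chunk`-byte register window."""
    local = bytearray()
    steps = 0
    for offset in range(0, len(remote), chunk):
        # Step 1: CPU initiates remote memory -> NI registers
        ni_registers = remote[offset:offset + chunk]
        # Step 2: CPU initiates NI registers -> local memory
        local += ni_registers
        steps += 1
    return bytes(local), steps


if __name__ == "__main__":
    data = bytes(range(200))
    copy, steps = chunked_get(data)
    print("copied correctly:", copy == data)
    print("CPU involvements for 200 bytes:", steps)
    print("CPU involvements for 2048 bytes:", cpu_involvements(2048))
```

In this model the number of CPU involvements grows linearly with message size, which illustrates why the CPU remains on the critical path for large transfers.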
Publication date: 1998