US5959995A - Asynchronous packet switching - Google Patents

Asynchronous packet switching

Info

Publication number
US5959995A
US5959995A
Authority
US
United States
Prior art keywords
packet
frame
mover
node
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US08/605,677
Inventor
Thomas M. Wicki
Patrick J. Helland
Takeshi Shimizu
Wolf-Dietrich Weber
Winfried W. Wilcke
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to US08/605,677 priority Critical patent/US5959995A/en
Assigned to HAL COMPUTER SYSTEMS, INC. reassignment HAL COMPUTER SYSTEMS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HELLAND, PATRICK J., SHIMIZU, TAKESHI, WEBER, WOLF-DIETRICH, WICKI, THOMAS M., WILCKE, WINFRIED
Priority to DE69735740T priority patent/DE69735740T2/en
Priority to PCT/US1997/002943 priority patent/WO1997031464A1/en
Priority to EP97906765A priority patent/EP0823165B1/en
Priority to JP53040697A priority patent/JP3816531B2/en
Assigned to FUJITSU, LTD. reassignment FUJITSU, LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HAL COMPUTER SYSTEMS, INC.
Application granted granted Critical
Publication of US5959995A publication Critical patent/US5959995A/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/34 Source routing

Definitions

  • the invention relates to data transmission on a network, and more particularly to asynchronous packet switching data transmission in a multiprocessor environment.
  • a multiprocessor system includes several processors connected to one or more memories.
  • the interconnect can take one of several forms, for example a shared bus, a cross-bar, or the like.
  • the interconnect must support fast access (low latency) and high bandwidth.
  • Existing interconnects suffer either from limited bandwidth (as in shared bus interconnects), scalability problems (as in cross-bar interconnects), or excessive latency (as in general networks).
  • a multi-node system comprises a plurality of nodes coupled to each other.
  • the nodes communicate with one another by point to point packets.
  • Each node includes a packet mover and a frame mover.
  • the packet mover provides a packet to the frame mover, provides an acknowledgment in response to receiving a packet from one of the other packet movers, and resends the packet to the frame mover if an acknowledgment is not received from one of the other packet movers in a predetermined amount of time.
  • Each packet indicates a destination node.
  • the frame mover converts the packet into a frame and generates a route to the destination node. If the frame is defective, it is discarded and the packet mover eventually retransmits the packet.
  • the frame mover provides source routing and multiple routes to nodes.
  • the interconnect may be of a flexible topology. Packets have a bounded finite life.
  • the frame mover selects a preselected route to a destination node, generates a frame that includes said preselected route, and provides the frame to the plurality of routers for communication to the destination node.
  • the route includes a sequence of route steps through some of the plurality of routers for communicating the frame therebetween.
  • the frame mover includes a routing table for storing a plurality of preselected routes to the destination node and includes a controller for selecting one of the plurality of preselected routes for inclusion in the frame.
  • FIG. 1 is a block diagram illustrating a multi-processor system in accordance with the present invention.
  • FIG. 2 is a block diagram illustrating protocol layers of the multi-processor system of FIG. 1.
  • FIG. 3 is a pictorial diagram illustrating a frame and a packet.
  • FIG. 4 is a block diagram illustrating the fast frame mover.
  • FIG. 5 is a block diagram illustrating the selection of a route of a frame.
  • FIG. 6 is a block diagram illustrating the modification of routing information in the frame header while the frame is being communicated through the interconnect.
  • FIG. 7 is a diagram illustrating a half mesh link.
  • FIG. 8 is a block diagram illustrating different clock domains for a pair of receivers and transmitters of a mesh link.
  • FIG. 9 is a block diagram illustrating a fault tolerant interconnect in a second embodiment of the present invention.
  • FIG. 10 is a block diagram illustrating virtual cut-through routing.
  • FIG. 11 is a block diagram illustrating a reliable packet mover of the multiprocessor system of FIG. 1.
  • FIG. 12a is a flowchart illustrating the operation of transmitting packets by the reliable packet mover.
  • FIG. 12b is a flowchart illustrating the operation of processing acknowledgments and retransmission of packets by the reliable packet mover.
  • FIG. 13 is a flowchart illustrating the operation of receiving packets by the reliable packet mover.
  • FIG. 14 is a flowchart illustrating the operation of checking the pending packet buffer.
  • the multiprocessor system 100 includes a plurality of processor nodes 102 each coupled by a mesh link 120 to an interconnect 104.
  • Each processor node 102 includes a processor 106, a coherence control unit 110, and a local memory 112.
  • the coherence control unit 110 includes a reliable packet mover (RPM) 114 and a fast frame mover (FFM) 116.
  • the reliable packet mover 114 provides reliable end to end data communication between processor nodes 102.
  • the fast frame mover 116 routes data from a source processor node 102 to a destination processor node 102. For each processor node 102, at least one route to every destination processor node 102 is stored in the fast frame mover 116.
  • a method for determining the topology of the interconnect 104 and areas of failure therein is described in U.S. patent application Ser. No. 08/605,676 entitled "SYSTEM AND METHOD FOR DYNAMIC NETWORK TOPOLOGY EXPLORATION" filed Feb. 22, 1996, by Thomas M. Wicki, Patrick J. Helland, Wolf-Dietrich Weber, and Winfried W. Wilcke now U.S. Pat. No. 5,740,346, the subject matter of which is incorporated herein by reference.
  • the coherence control unit 110 may be coupled to a cache memory, which is coupled to the processor 106.
  • the interconnect 104 includes a plurality of routers 118 interconnected by mesh links 120.
  • the plurality of processor nodes 102 are coupled to the routers 118 by mesh links 120. More than one processor node 102 may be coupled to the same router 118.
  • the routers 118 preferably are cross bar switches. In the specific implementation described herein for illustrative purposes, the routers 118 have 6 ports.
  • the interconnect 104 may include only one router 118, and in a system including two processor nodes 102, no router 118 need be included. An example of a router is described in U.S. patent application Ser. No. 08/603,926, cited below.
  • the interconnect 104 uses a packet based protocol in which all communication is directly processor node 102 to processor node 102.
  • the interconnect 104 need not provide multicast or broadcast. All data transferred is parcelled into packets, which are described below in conjunction with FIG. 3.
  • the multi-processor system 100 is a shared memory system that provides nonuniform memory access times.
  • the processor 106 may access the local memory 112 of other processor nodes 102.
  • the access time to the local memory 112 of a first processor node 102 is less than the access time to the memory of another processor node 102.
  • the latency of the network is a measurement of the time required to provide a requesting processor node 102 with the requested data as measured from the time at which the memory request is transmitted. In other words, latency indicates how long it takes for the requesting node to receive the data after it is requested.
  • the bandwidth of the link between the coherence control unit 110 and the local memory 112 preferably is substantially equal to the bandwidth of the link between the coherence control unit 110 and the interconnect 104. Bandwidth depends both on the rate at which data can be received or provided and on the width of the path.
  • the multiprocessor system 100 preferably is a distributed memory system. More specifically, the system 100 has a memory architecture that is physically distributed but the local memories 112 are logically shared.
  • a processor node 102, e.g. node A, may request access to a memory location that this node 102 treats as being local but that is actually physically located in a different local memory 112 coupled to a different processor node 102, e.g. node B.
  • the coherence control unit 110 of the requesting node or source node (node A) identifies the location of the memory and the data stored at that location is quickly retrieved.
  • the multiprocessor system 100 may also include input/output (I/O) nodes 103, which do not have processing capability. For clarity, only one I/O node 103 is shown. Such a node 103 may be a bus converter to interface with a bus, such as a PCI bus or an S bus. Such I/O nodes 103 may function as source or destination nodes 102 as described herein. Thus, in the description herein of communicating and processing data, when a processor node 102 is described, an I/O node 103 may be also used.
  • the protocol layers include an interconnect service manager (ISM) layer 202, a reliable packet mover (RPM) layer 204, a fast frame mover (FFM) layer 206, and a physical layer 208.
  • the coherence control unit 110 provides the functions of the interconnect service manager layer 202 which are controllable by software executed by the processor 106, the reliable packet mover layer 204, the fast frame mover layer 206, and a portion of the physical layer 208.
  • the layers allow for a more efficient division of the functions of the system 100 and for independent development and testing of portions of the system.
  • the interconnect service manager layer 202 communicates with point to point messages to assure coherence.
  • the interconnect service manager layer 202 of a first processor node 102 sends data to or requests data from another processor node 102
  • the interconnect service manager layer 202 of the first processor node 102 sends commands to the reliable packet mover 114 that inform the reliable packet mover 114 of the data to be sent or requested and the source or destination of the data.
  • the source processor node 102 sends data to a destination processor node 102 and does not determine the path through the interconnect 104 or use any information regarding the path.
  • the reliable packet mover layer 204 provides reliable delivery of packets 302 (see FIG. 3) between the processor nodes 102 by using the fast frame mover layer 206 to communicate packets 302.
  • the reliable packet mover layer 204 provides end-to-end data integrity.
  • the reliable packet mover 114 sends data and monitors for an acknowledgment signal indicating that the data was received. If it is not acknowledged within a time out period, the reliable packet mover 114 resends the data. This preferably is hardware implemented, not software implemented.
  • the reliable packet mover layer 204 resends data that is lost or corrupted during transmission.
  • the reliable packet mover layer 204 suppresses duplicate packets and reorders data packets that are received out of order.
  • the reliable packet mover layer 204 provides node-to-node flow control to avoid overrunning a transmit packet buffer 1106 (FIG. 11) of the destination processor node 102.
  • communication is processor node 102 to processor node 102 and is not multicast or broadcast. If a packet 302 is being sent to more than one processor node 102, the interconnect service manager layer 202 sends separate copies of the packet 302 to each destination processor node 102.
  • the communication is point to point communication between directly connected elements (e.g., processor nodes 102 and routers 118).
  • the frames 300 (see FIG. 3) are sent from a source processor node 102 through a router 118 in the interconnect 104 to other routers 118 and then to a destination processor node 102.
  • the fast frame mover layer 206 provides flow control on each step between neighbor elements (routers 118 and processor nodes 102).
  • the fast frame mover layer 206 also provides the route to connect these steps together thereby transmitting frames from one node to another.
  • the fast frame mover layer 206 performs simple integrity checking on only the portion of the frame 300 that it uses, but performs no error correction.
  • if an error occurs, the fast frame mover layer 206 discards the frame 300, and, at a later time, the sender resends the data.
  • the fast frame mover layer 206 provides mesh link flow control to avoid overrun of the direct neighbors connected to the other end of the mesh link 120.
  • the fast frame mover layer 206 is stream-lined for low latency by not performing error detection for each frame 300 and by dropping bad frames 300.
  • the physical layer 208 includes the cabling, connectors, and the like of the interconnect 104 and the interface to the processor nodes 102.
  • a frame 300 is a unit of data transfer used by the fast frame mover 116.
  • the frame 300 includes a frame header 304 and a packet 302, which is a frame body.
  • the frame header 304 includes routing information 318, flow control information 320, and priority information 322.
  • the routing information 318 includes a sequence of the routers 118 that are to process the frame and control the routing thereof.
  • the flow control information 320 includes information regarding the capacity of the next down stream routers 118 or processor nodes 102 and enables controlling or halting flow of data.
  • the priority information 322 includes a priority level of the frame 300.
  • the frame header 304 is preferably one 68-bit word in size (8½ bytes).
  • the frame body (packet 302) preferably is 2 to 18 (68-bit) words in size.
  • the packet 302 includes a packet header 306 and packet data 308.
  • the packet header 306 includes packet header descriptors 310, a priority acknowledgment request 324, and error detection code (EDC) 312.
  • the packet header 306 is preferably two 68-bit words, each word being 64 bits (8 bytes) of data and 4 bits of EDC.
  • a packet 302 may have no packet data 308.
  • the packet data 308 is of variable length, preferably 0 to 128 bytes of data (0 to 16 words).
  • an acknowledgment packet (described below) may include only a packet header 306 and EDC 312.
  • the packet data 308 may be data.
  • the packet header descriptors 310 include information indicating the destination processor node 102.
  • the reliable packet mover 114 adds the EDC 312 to the packet 302 when the reliable packet mover 114 processes the packet 302.
  • the EDC 312 preferably is a byte (8 bits) for every 16 bytes of the packet data 308.
  • the EDC 312 is stored as 4 bits for each 8 bytes of the packet data 308 and is checked 8 bits per 16 bytes or 2 words at a time.
  • the priority acknowledgment request 324 is a request to the destination processor node 102 to send an immediate acknowledgment that the packet 302 has been received.
  • the reliable packet mover 114 generates the packet header 306 that includes the sequence number of the packet 302.
  • the sequence number is an identifier and an indicator of the order of packets 302 sent from a source processor node 102 to a destination processor node 102. Sequence numbers are generated for each source-destination node pair.
  • the fast frame mover 116 does not examine or modify the frame body (packet 302).
  • the fast frame mover 116 creates the frame header 304 upon receipt of the packet 302.
  • the routers 118, which are part of the fast frame mover layer 206, modify the frame header 304 as the frame 300 is communicated through the interconnect 104 as described below in conjunction with FIGS. 5-6.
  • the fast frame mover 116 in the destination processor node 102 discards the frame header 304 when transferring the packet 302 to the reliable packet mover 114 of the destination processor node 102.
  • Referring to FIG. 4, there is shown a block diagram illustrating the fast frame mover 116, which includes a transmitting circuit 402 and a receiving circuit 404.
  • although each fast frame mover 116 includes both a transmitting circuit 402 and a receiving circuit 404, for clarity, only one transmitting circuit 402 and one receiving circuit 404 are shown.
  • the transmitting circuit 402 includes a routing table 406, a random number generator 407, and a FFM transmit controller 408.
  • the receiving circuit 404 includes a buffer manager 410, a buffer 412, and a FFM receiver controller 414.
  • the routing table 406 stores at least one route through the interconnect 104 to each destination processor node 102. As the frame 300 is communicated along the route, each router 118 in the route modifies the frame header 304 by removing a routing step from the route.
  • the fast frame mover 116 and the routers 118 maintain flow control over the mesh links 120.
  • the buffer manager 410 of the receiving circuit 404 monitors the status of the buffer 412 and sends status information over the mesh link 120 to the next upstream neighbor which can be either a router 118 or a processor node 102.
  • each router 118 monitors the status of buffers (not shown) therein and sends status information over the mesh link 120 to the next upstream neighbor, which can be, as above, either a router 118 or a processor node 102.
  • the fast frame mover 116 of the source processor node 102 or the router 118 may then slow or stop sending frames 300 to the next downstream neighbor (either a processor node 102 or a router 118) until space in the buffer 412 is available.
  • flow control is described in U.S. patent application Ser. No. 08/603,913 entitled "A FLOW CONTROL PROTOCOL SYSTEM AND METHOD", filed on Feb. 22, 1996, the subject matter of which is incorporated herein by reference.
  • the routers 118 perform error detection to the extent that allows the router 118 to operate. For example, the router 118 determines whether the next link in the router 118 exists. For example, if the router 118 has six ports and the frame header 304 indicates that the frame 300 is to be provided to a non existent port, such as port 0, the router 118 discards the frame 300.
  • Referring to FIG. 5, there is shown a block diagram illustrating the selection of a route of a frame 300.
  • Referring to FIG. 6, there is shown a block diagram illustrating the modification of the routing information 318 in the frame header 304 while the frame 300 is being communicated through the interconnect 104.
  • the choice of the route depends only on the source processor node 102 and the destination processor node 102 and a random number to pick one of a plurality of routes.
  • the route preferably is not based on the size or type of the frame 300.
  • the route through the interconnect 104 is deterministic.
  • the route is selected from the routing table 406, which stores a table of predetermined routes. Once the route is selected and the frame 300 is provided to the interconnect 104, the path is predetermined. The frame 300 follows this route to the destination processor node 102 or is discarded during the route in case of an error. Deterministic routing provides several advantages. First, the routers 118 can quickly process the frame 300, because the frame 300 defines its immediate destination without any determination by the router 118. Second, the lifetime of the frame 300 within the interconnect 104 is bounded. The frame 300 is communicated by the pre-selected route, which is of finite length.
  • the reliable packet mover 114 uses finite length sequence numbers, which reduces the size of the packet header 306. In most cases, this also eliminates stale packets 302 from the interconnect 104.
  • the pre-selected route may follow any path through the interconnect 104. The loading of the interconnect 104 may be distributed as appropriate.
  • the route includes a sequence of directly coupled routers 118 between the source processor node 102 and the destination processor node 102.
  • the route does not require a particular topology. In fact, any topology may be used in which the link between two directly coupled routers 118 is uniquely defined.
  • the fast frame mover 116 receives a packet 302 from the reliable packet mover 114 that is to be sent to a pre-specified destination node, say processor node B for example, which is indicated in the packet 302.
  • the fast frame mover 116 retrieves a random number from the random number generator 407.
  • the fast frame mover 116 uses this random number to select one of a plurality of memory locations 504 in a probability distribution table 502.
  • Each memory location 504 stores one of a plurality of pre-specified routes from the source processor node 102 (e.g., node A) to the destination processor node 102 (e.g., node B).
  • the fast frame mover 116 then extracts from the selected memory location 504 the pre-specified route stored therein.
  • the probability distribution table 502 preferably is made according to a pre-specified probability distribution which biases the route selection. For example, the fast frame mover 116 may generate the probability distribution by storing in each of a predetermined number of memory locations 504 one of the routes stored in the routing table 406. The probability distribution 502 is determined by the frequency that each of the routes is stored in the memory locations 504.
  • the fast frame mover 116 creates a frame header 304 that includes such selected pre-specified route and prepends this frame header 304 to the packet 302 to generate a frame 300.
  • the frame header 304 includes the routing information 318 which specifies the predetermined path from the source processor node 102 through the interconnect 104 to the destination processor node 102.
  • a route includes a series of route steps. Each route step defines the port of a router 118 from which the router 118 sends the frame 300. Each route step can be variable in size. For example, for a six port router 118, three bits define the port. In a 12 port router 118, four bits define the port. Accordingly, one route may include routers 118 of various sizes. Of course, the routes may include different numbers of route steps. In this instance, the route includes route steps of different sizes.
  • the routing path is link 3, link 5, link 2, and the destination processor node 102.
  • Each link in the routing path removes the code for the next link from the frame header 304, shifts the routing path in the frame header 304, and backfills the frame header 304 with a nonexistent processor node number, say 0.
  • the link then provides the frame 300, which has a modified frame header 304, to the next link.
  • link #3 provides the frame 300 through port 5 and removes link 5 from the frame header 304.
  • the last link in the route provides the frame 300 to the destination processor node 102.
  • a mesh link 120 includes a pair of unidirectional data paths. This pair provides greater bandwidth than a shared medium that is switched between sending and receiving and eliminates the dependency on propagation delay that occurs for such a shared medium. For high speed systems multiple bits of information may be on the mesh link 120 at a time. The pair provides the ability for a router 118 to communicate in both directions simultaneously.
  • the mesh link 120 provides point to point electrical connection.
  • the mesh link 120 preferably is not a bus.
  • Each uni-directional data path is a half mesh-link including data lines 702 and control lines, in particular, a clock line 704, a data/status indication line 706, a frame envelope line 708, and voltage reference lines 710.
  • the data lines 702 provide a path for communicating frames 300. Buffer status information is multiplexed on the same data lines 702 when no frame 300 is sent.
  • the bandwidth of the interconnect 104 depends on the number of data lines 702.
  • the data lines 702 preferably are 34 lines for communicating one half of a word of the frame 300 per clock edge.
  • the clock line 704 provides a communication path for the clock of the processor node 102 that is providing the frame 300 through the interconnect 104.
  • the clock line 704 preferably is a full differential single clock on two lines.
  • the data/status indication line 706 provides a signal indicative of whether the signal on the data lines 702 is data or status.
  • for flow control of the mesh link 120 as described above, the data/status indication line 706 indicates that status information of the buffer 412 is being communicated over the data lines 702.
  • the data/status indication line 706 preferably has a single line.
  • the frame envelope line 708 provides a frame envelope signal indicative of the beginning of the frame. In particular, the frame envelope signal indicates the beginning of the frame header 304 and stays active during the transmission of the frame. The frame envelope signal becomes inactive at the end of the frame or a sufficient time before the end to allow frames to be transmitted back to back.
  • the frame envelope line 708 preferably has a single line.
  • the voltage reference lines 710 provide a voltage reference to the router 118 or a processor node 102, to allow small signal swings on all data and control lines 702, 706, 708, which may be single-wire differential.
  • the voltage reference lines 710 preferably are 5 lines.
  • Each mesh link 120 preferably has 43 lines in each direction, or a total of 86 lines. This allows 34 bits to be transmitted in parallel over the mesh link 120. A word thus is transferred in two transfer cycles, equal to one clock cycle latched at both edges.
  • Each processor node 102 and each router 118 has an internal clock generator 802 for providing a clock signal.
  • the clock generators 802 preferably provide clock signals that are substantially equal.
  • the clock is provided on the mesh link 120 to the next neighbor (either the destination processor node 102 or a router 118), which uses this clock to accept the data. More specifically, this clock is used to latch the data into a First-In-First-Out (FIFO) buffer 804 in the destination processor node 102 or in the router 118.
  • the destination processor node 102 or the router 118 uses its own internal clock generator 802 to read the data from the FIFO buffer 804.
  • this arrangement allows the destination processor node 102 or the router 118 to accept data that is based on a clock that has a frequency drift and a phase shift relative to its own internal clock.
  • This clocking eliminates the need for global synchronization of all clocks.
  • the clock domain is a plesiosynchronous clock domain.
  • the clock is provided on the mesh link 120 with the data on the data line 702.
  • One example of clocking is in U.S. patent application Ser. No. 08/223,575, entitled "DATA SYNCHRONIZER SYSTEM AND METHOD", filed Apr. 6, 1994, the subject matter of which is incorporated herein by reference.
  • the interconnect 104 includes at least two sub-meshes 902 that provide redundant paths between processor nodes 102 for providing fault tolerance.
  • the fast frame movers 116 dynamically either reroute around routers 118 or mesh links 120 that are nonfunctional or have been removed or use another sub-mesh 902.
  • Each sub-mesh 902 is coupled by a mesh link 120 to every processor node 102.
  • Each sub-mesh 902 is preferably similar to a non fault tolerant mesh.
  • Each processor node 102 is coupled by a mesh link 120 to a router 118 which is coupled by separate mesh links 120 to each of the sub-meshes 902.
  • Each router 118 includes a counter (not shown) that is incremented each time the router 118 discards a frame 300. Periodically the multiprocessor system 100 reads the counter to determine whether the router 118 or mesh links 120 connected to it are likely to have a defect. If such a determination is made, the multiprocessor system 100 eliminates the router 118 or, in a fault redundant system, the sub-mesh 902 from the predetermined routes. For instance, the processor node 102 may delete this route from the probability distribution table 502 for selecting routes from the routing table 406. A processor node 102 may count the number of retransmissions of a packet 302 that are required for each destination processor node 102 and, if the count is above a predetermined threshold, determine whether a router 118 in the path has a high defect count.
  • Referring to FIG. 10, there is shown a block diagram illustrating virtual cut-through routing, in which the beginning of a frame 300 may be sent to the next router 118 or processor node 102 in the route even if the end of the frame 300 has not been received yet.
  • a packet 302 is partitioned into a plurality of segments 1002, say 7 for example.
  • the segments 1002 preferably are of different sizes.
  • the source processor node 102 selects the route for sending the packet to the destination processor node 102.
  • the source processor node 102 provides the frame 300 to the first router 118 in the route.
  • the first router 118 in the route determines the next mesh link 120 to send the frame 300 and starts sending the frame 300 if the recipient has buffer resources available and the output port is available.
  • the frame 300 may span many routers 118 and mesh links 120, including the destination processor node 102. As shown in FIG. 10, the first segment 1002 of the frame 300 has been received at the destination processor node 102 and the second through sixth segments 1002 are at different routers 118 and mesh links 120 in the route. The source processor node 102 has not yet sent the seventh segment 1002.
  • the latency of the virtual cut-through routing typically does not include buffering in the intermediate routers 118. In contrast, in store-and-forward routing, the entire message is stored before forwarding. In such routing, the latency includes the buffering.
  • each reliable packet mover 114 includes a transmitting circuit 1102 and a receiving circuit 1104. Although each reliable packet mover 114 includes both a transmitting circuit 1102 and a receiving circuit 1104, for clarity, only one transmitting circuit 1102 and one receiving circuit 1104 are shown.
  • the transmitting circuit 1102 includes a transmit packet buffer 1106, a RPM transmit controller 1108, a partner information table 1110, and a time out circuit 1112 for controlling the retransmission of lost or corrupted data.
  • the transmit packet buffer 1106 stores the packets 302 that have been transmitted but not acknowledged.
  • the transmit packet buffer 1106 is smaller in size than in software implemented systems because the smaller latency in the system 100, in combination with virtual cut through routing, makes out-of-order reception of packets 302 less common and because the interconnect service manager layer 202 holds packets 302 if the transmit packet buffer 1106 is full.
  • the partner information table 1110 stores, for each destination processor node 102, the sequence number of the next packet 302 that is to be sent, and that is expected to be acknowledged from that destination processor node 102.
  • the RPM transmit controller 1108 controls the operation of the transmitting circuit 1102.
  • the time out circuit 1112 provides a time count for controlling the retransmission of lost or corrupted data.
  • the receiving circuit 1104 includes a pending packet buffer 1114, a RPM receiver controller 1116, and a partner information table 1118.
  • the pending packet buffer 1114 stores packets 302 that have been received out of sequence.
  • the pending packet buffer 1114 is smaller in size than in software implemented systems because the smaller latency in the system 100 makes out-of-order reception of packets 302 less common.
  • the RPM receiver controller 1116 controls the operation of the receiving circuit 1104.
  • the partner information table 1118 stores, for each source processor node 102, the sequence number of the next expected packet 302 from that source processor node 102.
  • the reliable packet mover 114 generates the packet header 306 that includes the sequence number of the packet 302. Sequence numbers are used to inform the destination processor node 102 of the sequence of the packets 302. The destination node only processes the packets 302 in sequence. Upon receipt of an in order packet 302, the destination processor node sends an acknowledgment back to the source processor node 102 informing same of the receipt of the packet 302. If the source processor node 102 does not get an acknowledgment within a predetermined time, the source processor node 102 retransmits the packet 302 using the same sequence number.
  • the coherence control unit 110 provides the data and an identification of the destination processor node 102 to the reliable packet mover 114, which converts the data into packets 302 and assigns a sequence number to each packet 302.
  • Each transmitted packet 302 is stored in the transmit packet buffer 1106 in the source processor node 102. If 1201 it has capacity, the transmit packet buffer 1106 accepts the data and the reliable packet mover 114 transmits the packet 302. If not, the interconnect service manager layer 202 stops sending packets 302 and waits.
  • the transmitting circuit 1102 retrieves 1202 a sequence number from the partner information table 1110 corresponding to the destination processor node 102.
  • the transmitting circuit 1102 adds 1206 the retrieved sequence number to the packet header 306 of the packet 302 and performs 1207 an error detection.
  • the transmitting circuit 1102 sends 1208 the packet 302 to the fast frame mover 116 for transmission as described above.
  • the transmitting circuit 1102 also stores 1210 the packet 302 in the transmit packet buffer 1106 with a mark bit for that packet 302 that is not set, until an acknowledgment is received that the packet was received.
  • the sequence number in the partner information table 1110 is incremented 1204 for the next packet 302 transmission. Because the sequence numbers are finite, they eventually will wrap around. Accordingly, the sequence number space is sufficiently large so that no packets 302 with the same sequence number are in the system 100 at the same time.
  • the acknowledgment packet is a control packet from the receiving circuit 1104 to the source processor node 102 that indicates that the packet 302 was received and passed error detection.
  • the acknowledgment packet includes a destination node number, the sequence number of the received packet 302, and EDC, preferably 16 bytes.
  • the acknowledgment packet may be appended to another packet 302 that is being sent to the source processor node 102. This reduces traffic in the interconnect 104.
  • the acknowledgment packet itself is not acknowledged and does not include its own sequence number.
  • the sequence number in an acknowledgment packet implicitly acknowledges all prior packets, i.e. all packets with numbers that are less than the sequence number in the acknowledgment or adjusted because of the wrap around noted above. This allows the receiving circuit 1104 to delay the acknowledgment of packets 302 and to reduce the traffic of acknowledgment packets by using a single acknowledgment packet to acknowledge more than one packet 302.
  • the transmitting circuit 1102 determines 1212 whether an acknowledgment packet is received. If so, the transmitting circuit 1102 deletes 1214 the packets 302 corresponding to the received acknowledgment from the transmit packet buffer 1106. This deletion includes all prior packets 302 in the transmit packet buffer 1106 for the source-destination processor node 102 pair. These packets 302 have a sequence number less than or equal to the sequence number in the acknowledgment packet or sequence numbers that are appropriately adjusted to account for the wrap around.
  • the packet 302 is resent if an acknowledgment packet is not received after a specified time-out period. Specifically, if an acknowledgment packet is not received 1212, the transmitting circuit 1102 determines 1216 whether the time out circuit 1112 has timed out. If not, the transmitting circuit 1102 continues to determine 1212 whether an acknowledgment is received.
  • the transmitting circuit 1102 checks 1218 each packet 302 stored in the transmit packet buffer 1106 to determine if a mark bit is set for that packet 302. If the bit is not set, the transmitting circuit 1102 sets 1220 the mark bit for that packet 302. This allows a packet 302 to wait between one and two time out periods before being resent. For packets 302 with the mark bit set 1218, the transmitting circuit 1102 retrieves 1222 the packet 302 from the transmit packet buffer 1106 and retransmits 1224 the packet 302. A limited or maximum number of retransmissions are sent so that a defective interconnect 104 can be detected.
  • the transmitting circuit 1102 determines 1225 if the packet 302 has been resent a predetermined number of times. If it has been, the transmitting circuit 1102 informs 1227 the interconnect service manager layer 202 of such number of retransmissions and the layer 202 then may reroute packets 302 between that source-destination node pair. If the number of retransmissions has not reached the maximum, then upon reaching 1226 the last packet 302, the transmitting circuit 1102 continues to determine 1212 whether an acknowledgment packet is received as described above.
  • the receiving circuit 1104 of the reliable packet mover 114 provides the packet 302 to the interconnect service manager layer 202 by sequence number order.
  • the receiving circuit 1104 receives 1302 a packet 302 from the interconnect 104. If the pending packet buffer 1114 is full, the receiving circuit 1104 discards the packet 302. Alternatively, the receiving circuit 1104 may discard the latest packet 302 stored in the pending packet buffer 1114. Of course other packets 302 may be discarded from the pending packet buffer 1114 since this addresses performance and not correctness.
  • the receiving circuit 1104 performs 1316 error detection on the packet 302.
  • if the packet 302 fails 1318 error detection, the packet 302 is discarded 1310 and the receiving circuit 1104 continues to receive 1302 packets 302. On the other hand, if the packet 302 does not fail 1318 error detection, the receiving circuit 1104 extracts 1304 the sequence number and source node number from the packet header 306. The receiving circuit 1104 reads 1306 the next expected sequence number for the source processor node from the partner information table 1118, and compares 1308 the next expected sequence number to the extracted sequence number. If the extracted sequence number is less than the expected sequence number, the packet 302 already has been processed by the receiving circuit 1104 and is a duplicate. Again the wrap around of sequence numbers is appropriately accounted for. The packet 302 is discarded 1310 and the receiving circuit 1104 continues to receive 1302 packets 302.
  • the receiving circuit 1104 determines 1312 whether the extracted sequence number is equal to the expected sequence number. If there is not a match, the received packet 302 is out of sequence.
  • the receiving circuit 1104 stores 1314 the packet 302 in the pending packet buffer 1114 and the receiving circuit 1104 continues to receive 1302 packets 302.
  • the receiving circuit 1104 determines whether the next expected sequence number matches 1312 the extracted sequence number. If the next expected sequence number matches 1312 the extracted sequence number, the receiving circuit 1104 provides 1320 an acknowledgment to the interconnect 104. Because the received packet 302 is the expected packet 302, the receiving circuit 1104 increments 1322 the partner information table 1118 for the corresponding source processor node. The receiving circuit 1104 provides 1324 the packet 302 to the interconnect service manager layer 202 for processing and checks 1326 the pending packet buffer 1114 for the packet 302 next in the sequence.
  • FIG. 14 there is shown a flowchart illustrating the operation of checking 1326 the pending packet buffer 1114.
  • the receiving circuit 1104 checks 1402 the pending packet buffer 1114 for the packet 302 next in the sequence. If the next expected packet 302 is in the pending packet buffer 1114, the receiving circuit 1104 also sends an acknowledgment and increments 1406 the sequence number. The receiving circuit 1104 provides 1408 that packet 302 to the interconnect service manager layer 202. The receiving circuit 1104 continues checking the pending packet buffer 1114 for the next expected packet 302 until such packet 302 is not found. The receiving circuit 1104 continues to monitor 1302 (FIG. 13) for received packets 302. An illustrative sketch of this receive-side processing is given after this list.
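The following is a minimal, illustrative sketch (not part of the patent disclosure) of the receive-side processing of FIGS. 13 and 14: duplicate suppression, holding out-of-sequence packets in a pending packet buffer, and acknowledging and delivering in-sequence packets. The Python names, the buffer capacity, the 16-bit sequence-number space, and the modular comparison used for wrap-around are assumptions for illustration only.

```python
from dataclasses import dataclass

SEQ_SPACE = 1 << 16                    # assumed size of the finite sequence-number space

@dataclass
class Packet:                          # assumed minimal view of a received packet 302
    source: int
    seq: int
    body: bytes = b""
    def passes_edc(self) -> bool:      # stand-in for the error detection check (1316)
        return True

def send_ack(source, seq):             # stub: acknowledgment packet back to the source node (1320)
    print(f"ack seq {seq} -> node {source}")

def deliver_to_ism(pkt):               # stub: hand the packet to the interconnect service manager (1324)
    print(f"deliver seq {pkt.seq} from node {pkt.source}")

def seq_precedes(a, b):
    """True when sequence number a comes before b, allowing for wrap-around."""
    return a != b and (b - a) % SEQ_SPACE < SEQ_SPACE // 2

class ReceivingCircuit:
    def __init__(self, capacity=64):
        self.expected = {}             # partner information table 1118: source -> next expected seq
        self.pending = {}              # pending packet buffer 1114: (source, seq) -> packet
        self.capacity = capacity       # assumed buffer capacity

    def receive(self, pkt):                                  # FIGS. 13 and 14
        if len(self.pending) >= self.capacity:
            return                                           # pending packet buffer full: discard (1302)
        if not pkt.passes_edc():
            return                                           # failed error detection (1318): discard (1310)
        expected = self.expected.setdefault(pkt.source, 0)   # read the partner information table (1306)
        if seq_precedes(pkt.seq, expected):
            return                                           # duplicate already processed: discard (1310)
        if pkt.seq != expected:
            self.pending[(pkt.source, pkt.seq)] = pkt        # out of sequence: hold in pending buffer (1314)
            return
        self.accept(pkt)                                     # in sequence: acknowledge and deliver (1320-1324)
        self.drain(pkt.source)                               # check the pending packet buffer (1326, FIG. 14)

    def accept(self, pkt):
        send_ack(pkt.source, pkt.seq)
        self.expected[pkt.source] = (pkt.seq + 1) % SEQ_SPACE
        deliver_to_ism(pkt)

    def drain(self, source):
        while (source, self.expected[source]) in self.pending:
            self.accept(self.pending.pop((source, self.expected[source])))
```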

Abstract

A multiprocessor system includes a plurality of nodes and an interconnect that includes routers. Each node includes a reliable packet mover and a fast frame mover. The reliable packet mover provides packets to the fast frame mover which adds routing information to the packet to form a frame. The route to each node is predetermined. The frame is provided to the routers which delete the route from the routing information. If the frame is lost while being routed, the router discards the frame. If the packet is received at a destination node, the reliable packet mover in that node sends an acknowledgment to the source node if the packet passes an error detection test. The reliable packet mover in the source node resends the packet if it does not receive an acknowledgment in a predetermined time. The fast frame mover randomly selects the route from a plurality of predetermined routes to the destination node according to a probability distribution.

Description

CROSS-REFERENCES TO RELATED APPLICATIONS
The subject matter of this application is related to the subject matter of the following applications:
application Ser. No. 08/605,676, entitled "SYSTEM AND METHOD FOR DYNAMIC NETWORK TOPOLOGY EXPLORATION" filed on Feb. 22, 1996, by Thomas M. Wicki, Patrick J. Helland, Wolf-Dietrich Weber, and Winfried W. Wilcke now U.S. Pat. No. 5,740,346;
application Ser. No. 08/603,926, entitled "LOW LATENCY, HIGH CLOCK FREQUENCY PLESIOASYNCHRONOUS PACKET-BASED CROSSBAR SWITCHING CHIP SYSTEM AND METHOD" filed on Feb. 22, 1996, by Thomas M. Wicki, Jeffrey D. Larson, Albert Mu, and Raghu Sastry now U.S. Pat. No. 5,838,684;
application Ser. No. 08/603,880, entitled "METHOD AND APPARATUS FOR COORDINATING ACCESS TO AN OUTPUT OF A ROUTING DEVICE IN A PACKET SWITCHING NETWORK" filed on Feb. 22, 1996, by Jeffrey D. Larson, Albert Mu, and Thomas M. Wicki now U.S. Pat. No. 5,892,766;
application Ser. No. 08/604,920, entitled "CROSSBAR SWITCH AND METHOD WITH REDUCED VOLTAGE SWING AND NO INTERNAL BLOCKING DATA PATH" filed on Feb. 22, 1996, by Albert Mu and Jeffrey D. Larson;
application Ser. No. 08/603,913, entitled "A FLOW CONTROL PROTOCOL SYSTEM AND METHOD" filed on Feb. 22, 1996, by Thomas M. Wicki, Patrick J. Helland, Jeffrey D. Larson, Albert Mu, Raghu Sastry, and Richard L. Schober, Jr.;
application Ser. No. 08/603,911, entitled "INTERCONNECT FAULT DETECTION AND LOCALIZATION METHOD AND APPARATUS" filed on Feb. 22, 1996, by Raghu Sastry, Jeffrey D. Larson, Albert Mu, John R. Slice, Richard L. Schober, Jr., and Thomas M. Wicki now U.S. Pat. No. 5,768,300;
application Ser. No. 08/603,923, entitled, "METHOD AND APPARATUS FOR DETECTION OF ERRORS IN MULTIPLE-WORD COMMUNICATIONS" filed on Feb. 22, 1996, by Thomas M. Wicki, Patrick J. Helland, and Takeshi Shimizu;
U.S. Pat. No. 5,615,161, entitled "CLOCKED SENSE AMPLIFIER WITH POSITIVE SOURCE FEEDBACK" issued on Mar. 25, 1996, by Albert Mu;
all of the above applications are incorporated herein by reference in their entirety.
FIELD OF THE INVENTION
The invention relates to data transmission on a network, and more particularly to asynchronous packet switching data transmission in a multiprocessor environment.
BACKGROUND OF THE INVENTION
A multiprocessor system includes several processors connected to one or more memories. The interconnect can take one of several forms, for example a shared bus, a cross-bar, or the like. The interconnect must support fast access (low latency) and high bandwidth. Existing interconnects suffer either from limited bandwidth (as in shared bus interconnects), scalability problems (as in cross-bar interconnects), or excessive latency (as in general networks).
It is desirable to have a multiprocessor system that allows low latency and high bandwidth access to all of memory. In addition the available bandwidth should increase (scale) when additional processors/memories are added.
SUMMARY OF THE INVENTION
In the present invention, a multi-node system comprises a plurality of nodes coupled to each other. The nodes communicate with one another by point to point packets. Each node includes a packet mover and a frame mover. The packet mover provides a packet to the frame mover, provides an acknowledgment in response to receiving a packet from one of the other packet movers, and resends the packet to the frame mover if an acknowledgment is not received from one of the other packet movers in a predetermined amount of time. Each packet indicates a destination node. The frame mover converts the packet into a frame and generates a route to the destination node. If the frame is defective, it is discarded and the packet mover eventually retransmits the packet. The frame mover provides source routing and multiple routes to nodes. The interconnect may be of a flexible topology. Packets have a bounded finite life.
The frame mover selects a preselected route to a destination node, generates a frame that includes said preselected route, and provides the frame to the plurality of routers for communication to the destination node. The route includes a sequence of route steps through some of the plurality of routers for communicating the frame therebetween. The frame mover includes a routing table for storing a plurality of preselected routes to the destination node and includes a controller for selecting one of the plurality of preselected routes for inclusion in the frame.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating a multi-processor system in accordance with the present invention.
FIG. 2 is a block diagram illustrating protocol layers of the multi-processor system of FIG. 1.
FIG. 3 is a pictorial diagram illustrating a frame and a packet.
FIG. 4 is a block diagram illustrating the fast frame mover.
FIG. 5 is a block diagram illustrating the selection of a route of a frame.
FIG. 6 is a block diagram illustrating the modification of routing information in the frame header while the frame is being communicated through the interconnect.
FIG. 7 is a diagram illustrating a half mesh link.
FIG. 8 is a block diagram illustrating different clock domains for a pair of receivers and transmitters of a mesh link.
FIG. 9 is a block diagram illustrating a fault tolerant interconnect in a second embodiment of the present invention.
FIG. 10 is a block diagram illustrating virtual cut-through routing.
FIG. 11 is a block diagram illustrating a reliable packet mover of the multiprocessor system of FIG. 1.
FIG. 12a is a flowchart illustrating the operation of transmitting packets by the reliable packet mover.
FIG. 12b is a flowchart illustrating the operation of processing acknowledgments and retransmission of packets by the reliable packet mover.
FIG. 13 is a flowchart illustrating the operation of receiving packets by the reliable packet mover.
FIG. 14 is a flowchart illustrating the operation of checking the pending packet buffer.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
A preferred embodiment of the present invention is now described with reference to the Figures where like reference numbers indicate identical or functionally similar elements. Also the digits that are not the two least significant digits of each reference number correspond to the figure in which the reference number is first used.
Referring to FIG. 1, there is shown a block diagram illustrating a multiprocessor system 100 in accordance with the present invention. The multiprocessor system 100 includes a plurality of processor nodes 102 each coupled by a mesh link 120 to an interconnect 104. Each processor node 102 includes a processor 106, a coherence control unit 110, and a local memory 112. The coherence control unit 110 includes a reliable packet mover (RPM) 114 and a fast frame mover (FFM) 116.
The reliable packet mover 114 provides reliable end to end data communication between processor nodes 102. The fast frame mover 116 routes data from a source processor node 102 to a destination processor node 102. For each processor node 102, at least one route to every destination processor node 102 is stored in the fast frame mover 116. A method for determining the topology of the interconnect 104 and areas of failure therein is described in U.S. patent application Ser. No. 08/605,676 entitled "SYSTEM AND METHOD FOR DYNAMIC NETWORK TOPOLOGY EXPLORATION" filed Feb. 22, 1996, by Thomas M. Wicki, Patrick J. Helland, Wolf-Dietrich Weber, and Winfried W. Wilcke now U.S. Pat. No. 5,740,346, the subject matter of which is incorporated herein by reference.
Other architectures of the processor node 102 may be used. For example, the coherence control unit 110 may be coupled to a cache memory, which is coupled to the processor 106.
The interconnect 104 includes a plurality of routers 118 interconnected by mesh links 120. The plurality of processor nodes 102 are coupled to the routers 118 by mesh links 120. More than one processor node 102 may be coupled to the same router 118. The routers 118 preferably are cross bar switches. In the specific implementation described herein for illustrative purposes, the routers 118 have 6 ports. Of course, in a system with a few processor nodes 102, the interconnect 104 may include only one router 118, and in a system including two processor nodes 102, no router 118 need be included. An example of a router is described in U.S. patent application Ser. No. 08/603,926, entitled "LOW LATENCY, HIGH CLOCK FREQUENCY PLESIOSYNCHRONOUS PACKET-BASED CROSSBAR SWITCHING CHIP SYSTEM AND METHOD", filed Feb. 22, 1996, by Thomas M. Wicki, Jeffrey D. Larson, Albert Mu, and Raghu Sastry now U.S. Pat. No. 5,838,684, the subject matter of which is incorporated herein by reference.
The interconnect 104 uses a packet based protocol in which all communication is directly processor node 102 to processor node 102. The interconnect 104 need not provide multicast or broadcast. All data transferred is parcelled into packets, which are described below in conjunction with FIG. 3.
The multi-processor system 100 is a shared memory system that provides nonuniform memory access times. The processor 106 may access the local memory 112 of other processor nodes 102. The access time to the local memory 112 of a first processor node 102 is less than the access time to the memory of another processor node 102. By writing software that allows a processor 106 to make higher use of the local memory 112, the latency is reduced. The latency of the network is a measurement of the time required to provide a requesting processor node 102 with the requested data as measured from the time at which the memory request is transmitted. In other words, latency indicates how long it takes for the requesting node to receive the data after it is requested.
The bandwidth of the link between the coherence control unit 110 and the local memory 112 preferably is substantially equal to the bandwidth of the link between the coherence control unit 110 and the interconnect 104. Bandwidth depends both on the rate at which data can be received or provided and on the width of the path.
The multiprocessor system 100 preferably is a distributed memory system. More specifically, the system 100 has a memory architecture that is physically distributed but the local memories 112 are logically shared. For example, a processor node 102, e.g. node A, may request access to a memory location that this node 102 treats as being local but that is actually physically located in a different local memory 112 that is coupled to a different processor node 102, e.g. node B. The coherence control unit 110 of the requesting node or source node (node A) identifies the location of the memory and the data stored at that location is quickly retrieved.
The multiprocessor system 100 may also include input/output (I/O) nodes 103, which do not have processing capability. For clarity, only one I/O node 103 is shown. Such a node 103 may be a bus converter to interface with a bus, such as a PCI bus or an S bus. Such I/O nodes 103 may function as source or destination nodes 102 as described herein. Thus, in the description herein of communicating and processing data, when a processor node 102 is described, an I/O node 103 may be also used.
Referring to FIG. 2, there is shown a block diagram illustrating the protocol layers of the processor nodes 102 and the interconnect 104. The protocol layers include an interconnect service manager (ISM) layer 202, a reliable packet mover (RPM) layer 204, a fast frame mover (FFM) layer 206, and a physical layer 208. The coherence control unit 110 provides the functions of the interconnect service manager layer 202 which are controllable by software executed by the processor 106, the reliable packet mover layer 204, the fast frame mover layer 206, and a portion of the physical layer 208. The layers allow for a more efficient division of the functions of the system 100 and for independent development and testing of portions of the system. The interconnect service manager layer 202 communicates with point to point messages to assure coherence. When the interconnect service manager layer 202 of a first processor node 102 sends data to or requests data from another processor node 102, the interconnect service manager layer 202 of the first processor node 102 sends commands to the reliable packet mover 114 that inform the reliable packet mover 114 of the data to be sent or requested and the source or destination of the data.
At the reliable packet mover layer 204, the source processor node 102 sends data to a destination processor node 102 and does not determine the path through the interconnect 104 or use any information regarding the path. The reliable packet mover layer 204 provides reliable delivery of packets 302 (see FIG. 3) between the processor nodes 102 by using the fast frame mover layer 206 to communicate packets 302. The reliable packet mover layer 204 provides end-to-end data integrity. At the reliable packet mover layer 204, the reliable packet mover 114 sends data and monitors for an acknowledgment signal indicating that the data was received. If it is not acknowledged within a time out period, the reliable packet mover 114 resends the data. This preferably is hardware implemented, not software implemented. Thus, the reliable packet mover layer 204 resends data that is lost or corrupted during transmission. The reliable packet mover layer 204 suppresses duplicate packets and reorders data packets that are received out of order. The reliable packet mover layer 204 provides node-to-node flow control to avoid overrunning a transmit packet buffer 1106 (FIG. 11) of the destination processor node 102. At the reliable packet mover layer 204, communication is processor node 102 to processor node 102 and is not multicast or broadcast. If a packet 302 is being sent to more than one processor node 102, the interconnect service manager layer 202 sends separate copies of the packet 302 to each destination processor node 102.
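As an illustration of the reliable packet mover behavior just described (detailed for FIGS. 12a-12b in the Definitions above), the sketch below models the transmit side: per-destination sequence numbers, a transmit packet buffer of unacknowledged packets, implicit cumulative acknowledgments, and a mark bit so that a packet waits between one and two time-out periods before being resent. This is only a sketch; the dictionary-based buffers, the sequence-number space, the retransmission limit, and the helper names are assumptions rather than values taken from the patent.

```python
SEQ_SPACE = 1 << 16            # assumed size of the finite sequence-number space
MAX_RESENDS = 8                # assumed limit before the interconnect is reported as suspect

def notify_ism(dest, seq):     # stub: report excessive retransmissions to the ISM layer (1227)
    print(f"route to node {dest} suspect after packet {seq}")

class TransmittingCircuit:
    def __init__(self, frame_mover):
        self.frame_mover = frame_mover   # object with a transmit(dest, packet) method (assumed interface)
        self.next_seq = {}               # partner information table 1110: destination -> next sequence number
        self.unacked = {}                # transmit packet buffer 1106: (dest, seq) -> [packet, mark bit, resend count]

    def send(self, dest, packet):                          # FIG. 12a
        seq = self.next_seq.get(dest, 0)                   # retrieve the sequence number (1202)
        self.next_seq[dest] = (seq + 1) % SEQ_SPACE        # increment for the next packet (1204)
        packet["seq"] = seq                                # add the sequence number to the header (1206)
        # the error detection code would be added to the packet here (1207)
        self.frame_mover.transmit(dest, packet)            # hand the packet to the fast frame mover (1208)
        self.unacked[(dest, seq)] = [packet, False, 0]     # store with the mark bit not set (1210)

    def on_ack(self, dest, acked_seq):                     # FIG. 12b, acknowledgment received (1212-1214)
        # the acknowledgment implicitly covers all earlier packets to that destination
        for d, s in list(self.unacked):
            if d == dest and (acked_seq - s) % SEQ_SPACE < SEQ_SPACE // 2:
                del self.unacked[(d, s)]

    def on_timeout(self):                                  # FIG. 12b, time out circuit 1112 fires (1216-1227)
        for (dest, seq), entry in self.unacked.items():
            packet, marked, resends = entry
            if not marked:
                entry[1] = True                            # first timeout only sets the mark bit (1220)
            elif resends >= MAX_RESENDS:
                notify_ism(dest, seq)                      # too many retransmissions (1225, 1227)
            else:
                self.frame_mover.transmit(dest, packet)    # retransmit the marked packet (1222-1224)
                entry[2] = resends + 1

class _PrintLink:                # stand-in for the fast frame mover interface (assumed)
    def transmit(self, dest, packet):
        print(f"frame to node {dest}: {packet}")

tx = TransmittingCircuit(_PrintLink())
tx.send("B", {"data": b"hello"})
```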
At the fast frame mover layer 206, the communication is point to point communication between directly connected elements (e.g., processor nodes 102 and routers 118). The frames 300 (see FIG. 3) are sent from a source processor node 102 through a router 118 in the interconnect 104 to other routers 118 and then to a destination processor node 102. The fast frame mover layer 206 provides flow control on each step between neighbor elements (routers 118 and processor nodes 102). The fast frame mover layer 206 also provides the route that connects these steps together, thereby transmitting frames from one node to another. The fast frame mover layer 206 performs simple integrity checking on only the portion of the frame 300 that it uses, but no error correction. If an error occurs, the fast frame mover layer 206 discards the frame 300, and, at a later time, the sender resends the data. The fast frame mover layer 206 provides mesh link flow control to avoid overrun of the direct neighbors connected to the other end of the mesh link 120. The fast frame mover layer 206 is streamlined for low latency by not performing full error detection on each frame 300 and by dropping bad frames 300.
The physical layer 208 includes the cabling, connectors, and the like of the interconnect 104 and the interface to the processor nodes 102.
Referring to FIG. 3, there is shown a diagram illustrating a frame 300 and a packet 302. A frame 300 is a unit of data transfer used by the fast frame mover 116. The frame 300 includes a frame header 304 and a packet 302, which is the frame body. The frame header 304 includes routing information 318, flow control information 320, and priority information 322. The routing information 318 includes a sequence of the routers 118 that are to process the frame and that control the routing thereof. The flow control information 320 includes information regarding the capacity of the next downstream routers 118 or processor nodes 102 and enables controlling or halting the flow of data. The priority information 322 includes a priority level of the frame 300. The frame header 304 is preferably one 68-bit word in size (8.5 bytes). The frame body (packet 302) preferably is 2 to 18 (68-bit) words in size.
The packet 302 includes a packet header 306 and packet data 308. The packet header 306 includes packet header descriptors 310, a priority acknowledgment request 324, and error detection code (EDC) 312. The packet header 306 is preferably two 68-bit words, each word being 64 bits (8 bytes) of data and 4 bits of EDC. A packet 302 may have no packet data 308. The packet data 308 is of variable length, preferably 0 to 128 bytes of data (0 to 16 words). For example, an acknowledgment packet (described below) may include only a packet header 306 and EDC 312. The packet header descriptors 310 include information indicating the destination processor node 102. As described below in conjunction with FIG. 12, the reliable packet mover 114 adds the EDC 312 to the packet 302 when the reliable packet mover 114 processes the packet 302. The EDC 312 preferably is a byte (8 bits) for every 16 bytes of the packet data 308. The EDC 312 is stored as 4 bits for each 8 bytes of the packet data 308 and is checked 8 bits per 16 bytes, or 2 words at a time. The priority acknowledgment request 324 is a request to the destination processor node 102 to send an immediate acknowledgment that the packet 302 has been received.
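For illustration only, the following sketch models the EDC layout described above: 4 check bits are kept per 8-byte word and are verified 8 bits (two words, or 16 bytes) at a time. The text does not specify the EDC algorithm, so a simple XOR fold to 4 bits is assumed here purely as a placeholder.

def edc_nibble(word8: bytes) -> int:
    # Fold an 8-byte word down to a hypothetical 4-bit check value.
    assert len(word8) == 8
    x = 0
    for b in word8:
        x ^= b                    # fold the bytes to 8 bits
    return (x ^ (x >> 4)) & 0xF   # fold 8 bits to 4 bits

def add_edc(packet_data: bytes) -> list:
    # One 4-bit EDC value per 8-byte word of packet data.
    return [edc_nibble(packet_data[i:i + 8]) for i in range(0, len(packet_data), 8)]

def check_edc(packet_data: bytes, edc: list) -> bool:
    # Check 8 EDC bits (two words, 16 bytes) at a time, as described above.
    for i in range(0, len(edc), 2):
        if add_edc(packet_data[i * 8:(i + 2) * 8]) != edc[i:i + 2]:
            return False
    return True

data = bytes(range(32))           # 32 bytes = 4 words = 2 check steps
assert check_edc(data, add_edc(data))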
The reliable packet mover 114 generates the packet header 306 that includes the sequence number of the packet 302. The sequence number is an identifier and an indicator of the order of packets 302 sent from a source processor node 102 to a destination processor node 102. Sequence numbers are generated for each source-destination node pair. The fast frame mover 116 does not examine or modify the frame body (packet 302). The fast frame mover 116 creates the frame header 304 upon receipt of the packet 302. The routers 118, which are part of the fast frame mover layer 206, modify the frame header 304 as the frame 300 is communicated through the interconnect 104, as described below in conjunction with FIGS. 5-6. The fast frame mover 116 in the destination processor node 102 discards the frame header 304 when transferring the packet 302 to the reliable packet mover 114 of the destination processor node 102.
Referring to FIG. 4, there is shown a block diagram illustrating the fast frame mover 116, which includes a transmitting circuit 402 and a receiving circuit 404. Although each fast frame mover 116 includes both a transmitting circuit 402 and a receiving circuit 404, for clarity, only one transmitting circuit 402 and one receiving circuit 404 are shown. The transmitting circuit 402 includes a routing table 406, a random number generator 407, and a FFM transmit controller 408. The receiving circuit 404 includes a buffer manager 410, a buffer 412, and a FFM receiver controller 414. The routing table 406 stores at least one route through the interconnect 104 to each destination processor node 102. As the frame 300 is communicated along the route, each router 118 in the route modifies the frame header 304 by removing a routing step from the route.
The fast frame mover 116 and the routers 118 maintain flow control over the mesh links 120. The buffer manager 410 of the receiving circuit 404 monitors the status of the buffer 412 and sends status information over the mesh link 120 to the next upstream neighbor, which can be either a router 118 or a processor node 102. Likewise, each router 118 monitors the status of buffers (not shown) therein and sends status information over the mesh link 120 to the next upstream neighbor, which can be, as above, either a router 118 or a processor node 102. The fast frame mover 116 of the source processor node 102 or the router 118 may then slow or stop sending frames 300 to the next downstream neighbor (either a processor node 102 or a router 118) until space in the buffer 412 is available. One such implementation of flow control is described in U.S. patent application Ser. No. 08/603,913 entitled "A FLOW CONTROL PROTOCOL SYSTEM AND METHOD", filed on Feb. 22, 1996, the subject matter of which is incorporated herein by reference.
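As a rough illustration of this link-level flow control, the sketch below has the receiving side report its free buffer space and the sending side stall while no space is reported. The status encoding and stall condition are assumptions for illustration; the actual protocol is the one described in the referenced application.

from collections import deque

class ReceiveBuffer:
    # Stand-in for the buffer 412 watched by the buffer manager 410.
    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = deque()

    def status(self):
        # Free slots, reported upstream when no frame is being sent.
        return self.capacity - len(self.frames)

class UpstreamSender:
    # Stand-in for the fast frame mover or router feeding this mesh link.
    def __init__(self, downstream):
        self.downstream = downstream

    def try_send(self, frame):
        if self.downstream.status() == 0:
            return False          # stall until the neighbor reports free space
        self.downstream.frames.append(frame)
        return True

rx = ReceiveBuffer(capacity=2)
tx = UpstreamSender(rx)
print([tx.try_send("frame%d" % i) for i in range(4)])   # [True, True, False, False]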
The routers 118 perform error detection only to the extent needed to allow the router 118 to operate. For example, the router 118 determines whether the next link specified for the router 118 exists. For example, if the router 118 has six ports and the frame header 304 indicates that the frame 300 is to be provided to a nonexistent port, such as port 0, the router 118 discards the frame 300.
Referring to FIG. 5, there is shown a block diagram illustrating the selection of a route of a frame 300. Referring to FIG. 6, there is shown a block diagram illustrating the modification of the routing information 318 in the frame header 304 while the frame 300 is being communicated through the interconnect 104. The choice of the route depends only on the source processor node 102, the destination processor node 102, and a random number used to pick one of a plurality of routes. The route preferably is not based on the size or type of the frame 300.
The route through the interconnect 104 is deterministic. The choice of route is selected from the routing table 406, which stores a table of predetermined routes. Once the route is selected and the frame 300 is provided to the interconnect 104, the path is predetermined. The frame 300 follows this route to the destination processor node 102 or is discarded during the route in case of an error. Deterministic routing provides several advantages. First, the routers 118 can quickly process the frame 300, because the frame 300 defines the immediate destination of the frame 300 without any determination by the router 118. Second, the lifetime of the frame 300 within the interconnect 104 is bounded. The frame 300 is communicated by the pre-selected route, which is of finite length. This allows the reliable packet mover 114 to use finite length sequence numbers, which reduces the size of the packet header 306. In most cases, this also eliminates stale packets 302 from the interconnect 104. Third, the pre-selected route may follow any path through the interconnect 104. The loading of the interconnect 104 may be distributed as appropriate.
The route includes a sequence of directly coupled routers 118 between the source processor node 102 and the destination processor node 102. The route does not require a particular topology. In fact, any topology may be used in which the link between two directly coupled routers 118 is uniquely defined.
Referring specifically to FIGS. 5-6, the fast frame mover 116 receives a packet 302 from the reliable packet mover 114 that is to be sent to a pre-specified destination node, say processor node B for example, which is indicated in the packet 302. The fast frame mover 116 retrieves a random number from the random number generator 407. The fast frame mover 116 uses this random number to select one of a plurality of memory locations 504 in a probability distribution table 502. Each memory location 504 stores one of a plurality of pre-specified routes from the source processor node 102 (e.g., node A) to the destination processor node 102 (e.g., node B). The fast frame mover 116 then extracts from the selected memory location 504 the pre-specified route stored therein. The probability distribution table 502 preferably is made according to a pre-specified probability distribution which biases the route selection. For example, the fast frame mover 116 may generate the probability distribution by storing, in each of a predetermined number of memory locations 504, one of the routes stored in the routing table 406. The probability distribution is then determined by the frequency with which each of the routes is stored in the memory locations 504. The fast frame mover 116 creates a frame header 304 that includes the selected pre-specified route and prepends this frame header 304 to the packet 302 to generate a frame 300.
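A minimal sketch of this biased route selection is given below: the routes for one source-destination pair are replicated across the slots of a probability distribution table in proportion to their desired weight, and a random number indexes one slot. The table size, weights, and route encoding are illustrative assumptions only.

import random

def build_distribution_table(routes_with_weights, slots=16):
    # Fill the slots so each route appears in proportion to its weight.
    total = sum(w for _, w in routes_with_weights)
    table = []
    for route, weight in routes_with_weights:
        table += [route] * round(slots * weight / total)
    return table[:slots]

def select_route(table):
    # Index the table with a random number, as the fast frame mover does.
    return table[random.randrange(len(table))]

routes = [(("port 3", "port 5", "port 2"), 3),   # favored route
          (("port 1", "port 4", "port 2"), 1)]   # alternate route
print(select_route(build_distribution_table(routes)))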
Refer now specifically to FIG. 6. As described above, the frame header 304 includes the routing information 318 which specifies the predetermined path from the source processor node 102 through the interconnect 104 to the destination processor node 102. As noted above, a route includes a series of route steps. Each route step defines the port of a router 118 from which the router 118 sends the frame 300. Each route step can be variable in size. For example, for a six port router 118, three bits define the port. In a 12 port router 118, four bits define the port. Accordingly, one route may include routers 118 of various sizes. Of course, the routes may include different numbers of route steps. In this instance, the route includes route steps of different sizes. In FIG. 6, at link 1, the routing path is link 3, link 5, link 2, and the destination processor node 102. Each link in the routing path removes the code for the next link from the frame header 304, shifts the routing path in the frame header 304, and backfills the frame header 304 with a nonexistent processor node number, say 0. The link then provides the frame 300, which has a modified frame header 304, to the next link. For example, link #3 provides the frame 300 through port 5 and removes link 5 from the frame header 304. Of course, the last link in the route provides the frame 300 to the destination processor node 102.
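The shift-and-backfill handling of route steps can be sketched as follows; the backfill value 0 follows the example above, while the uniform step width and list length are assumptions made purely for illustration.

def route_step(routing_info):
    # Consume one route step: the leading step names this router's output
    # port, the remaining steps shift forward, and the tail is backfilled
    # with the nonexistent value 0.
    port = routing_info[0]
    return port, routing_info[1:] + [0]

route = [3, 5, 2, 0, 0]            # link 3, then link 5, then link 2
while route[0] != 0:
    port, route = route_step(route)
    print("forward on port", port, "remaining route", route)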
Referring to FIG. 7, there is shown a diagram illustrating a half of a mesh-link 120. A mesh link 120 includes a pair of unidirectional data paths. This pair provides greater bandwidth than a shared medium that must be switched between sending and receiving, and eliminates the dependency on propagation delay that such a shared medium incurs. For high speed systems, multiple bits of information may be on the mesh link 120 at a time. The pair provides the ability for a router 118 to communicate in both directions simultaneously. The mesh link 120 provides a point to point electrical connection. The mesh link 120 preferably is not a bus. Each unidirectional data path is a half mesh-link including data lines 702 and control lines, in particular, a clock line 704, a data/status indication line 706, a frame envelope line 708, and voltage reference lines 710. The data lines 702 provide a path for communicating frames 300. Buffer status information is multiplexed on the same data lines 702 when no frame 300 is sent. The bandwidth of the interconnect 104 depends on the number of data lines 702. The data lines 702 preferably are 34 lines for communicating one half of a word of the frame 300 per clock edge. The clock line 704 provides a communication path for the clock of the processor node 102 that is providing the frame 300 through the interconnect 104. The clock line 704 preferably is a fully differential single clock on two lines. The data/status indication line 706 provides a signal indicative of whether the signal on the data lines 702 is data or status. For example, for flow control of the mesh link 120 as described above in conjunction with FIG. 4, the data/status indication line 706 indicates that status information of the buffer 412 is being communicated over the data lines 702. The data/status indication line 706 preferably is a single line. The frame envelope line 708 provides a frame envelope signal indicative of the beginning of the frame. In particular, the frame envelope signal indicates the beginning of the frame header 304 and stays active during the transmission of the frame. The frame envelope signal becomes inactive at the end of the frame, or a sufficient time before the end, to allow frames to be transmitted back to back. The frame envelope line 708 preferably is a single line. The voltage reference lines 710 provide a voltage reference to the router 118 or a processor node 102 to allow small signal swings on all data and control lines 702, 706, 708, which may be single-wire differential. The voltage reference lines 710 preferably are 5 lines. Each mesh link 120 preferably has 43 lines in each direction, or a total of 86 lines. This allows 34 bits to be transmitted in parallel over the mesh link 120. A word thus is transferred in two transfer cycles, equal to one clock cycle latched at both edges.
Referring to FIG. 8, there is shown a block diagram illustrating different clock domains for a pair of receivers and transmitters of a mesh link 120. Each processor node 102 and each router 118 has an internal clock generator 802 for providing a clock signal. The clock generators 802 preferably provide clock signals that are substantially equal. The clock is provided on the mesh link 120 to the next neighbor (either the destination processor node 102 or a router 118), which uses this clock to accept the data. More specifically, this clock is used to latch the data into a First-In-First-Out (FIFO) buffer 804 in the destination processor node 102 or in the router 118. The destination processor node 102 or the router 118 uses its own internal clock generator 802 to read the data from the FIFO buffer 804. This allows the destination processor node 102 or the router 118 to accept data that is based on a clock that has a frequency drift and a phase shift relative to the clock of the destination processor node 102 or the router 118. This clocking eliminates the need for global synchronization of all clocks. The clock domain is a plesiosynchronous clock domain. The clock is provided on the mesh link 120 with the data on the data lines 702. One example of clocking is in U.S. patent application Ser. No. 08/223,575, entitled "DATA SYNCHRONIZER SYSTEM AND METHOD", filed Apr. 6, 1994, the subject matter of which is incorporated herein by reference.
Referring to FIG. 9, there is shown a block diagram illustrating a fault tolerant interconnect in a second embodiment of the present invention. The interconnect 104 includes at least two sub-meshes 902 that provide redundant paths between processor nodes 102 for providing fault tolerance. The fast frame movers 116 dynamically either reroute around routers 118 or mesh links 120 that are nonfunctional or have been removed or use another sub-mesh 902.
Each sub-mesh 902 is coupled by a mesh link 120 to every processor node 102. Each sub-mesh 902 is preferably similar to a non-fault-tolerant mesh. Each processor node 102 is coupled by a mesh link 120 to a router 118, which is coupled by separate mesh links 120 to each of the sub-meshes 902.
Each router 118 includes a counter (not shown) that is incremented each time the router 118 discards a frame 300. Periodically, the multiprocessor system 100 reads the counter to determine whether the router 118 or the mesh links 120 connected to it are likely to have a defect. If such a determination is made, the multiprocessor system 100 eliminates the router 118, or the sub-mesh 902 in a fault redundant system, from the predetermined routes. For instance, the processor node 102 may delete the corresponding route from the probability distribution table 502 used for selecting routes from the routing table 406. A processor node 102 may also count the number of retransmissions of a packet 302 required for each destination processor node 102 and, if the count is above a predetermined threshold, determine whether a router 118 in the path has a high defect count.
Referring to FIG. 10, there is shown a block diagram illustrating virtual cut-through routing, in which the beginning of a frame 300 may be sent to the next router 118 or processor node 102 in the route even if the end of the frame 300 has not been received yet. More specifically, a packet 302 is partitioned into a plurality of segments 1002, say seven for example. The segments 1002 preferably are of different sizes. As described above, the source processor node 102 selects the route for sending the packet to the destination processor node 102. The source processor node 102 provides the frame 300 to the first router 118 in the route. Upon receipt of the frame header 304, the first router 118 in the route determines the next mesh link 120 on which to send the frame 300 and starts sending the frame 300 if the recipient has buffer resources available and the output port is available. The frame 300 may span many routers 118 and mesh links 120, up to and including the destination processor node 102. As shown in FIG. 10, the first segment 1002 of the frame 300 has been received at the destination processor node 102 and the second through sixth segments 1002 are at different routers 118 and mesh links 120 in the route. The source processor node 102 has not yet sent the seventh segment 1002. The latency of virtual cut-through routing typically does not include buffering in the intermediate routers 118. In contrast, in store-and-forward routing, the entire message is stored before forwarding, so the latency includes the buffering.
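The latency advantage of virtual cut-through over store-and-forward can be illustrated with a back-of-the-envelope calculation; the per-segment time, hop count, and segment count below are assumptions for illustration, not figures from the description.

def store_and_forward_latency(hops, segments, t_seg):
    # Each hop buffers the whole frame before forwarding it.
    return hops * segments * t_seg

def cut_through_latency(hops, segments, t_seg):
    # The head of the frame advances one hop per segment time and the tail
    # follows without whole-frame buffering at intermediate routers.
    return (hops + segments - 1) * t_seg

hops, segments, t_seg = 4, 7, 1.0
print(store_and_forward_latency(hops, segments, t_seg))    # 28.0
print(cut_through_latency(hops, segments, t_seg))          # 10.0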
Referring to FIG. 11, there is shown a block diagram illustrating a reliable packet mover 114, which includes a transmitting circuit 1102 and a receiving circuit 1104. Although each reliable packet mover 114 includes both a transmitting circuit 1102 and a receiving circuit 1104, for clarity, only one transmitting circuit 1102 and one receiving circuit 1104 are shown. The transmitting circuit 1102 includes a transmit packet buffer 1106, a RPM transmit controller 1108, a partner information table 1110, and a time out circuit 1112 for controlling the retransmission of lost or corrupted data. The transmit packet buffer 1106 stores the packets 302 that have been transmitted but not acknowledged. The transmit packet buffer 1106 is smaller in size than in software implemented systems because the smaller latency in the system 100, in combination with virtual cut-through routing, makes out-of-order reception of packets 302 less common and because the interconnect service manager layer 202 holds packets 302 if the transmit packet buffer 1106 is full. The partner information table 1110 stores, for each destination processor node 102, the sequence number of the next packet 302 to be sent and the sequence number that is expected to be acknowledged by that destination processor node 102. The RPM transmit controller 1108 controls the operation of the transmitting circuit 1102. The time out circuit 1112 provides a time count for controlling the retransmission of lost or corrupted data.
The receiving circuit 1104 includes a pending packet buffer 1114, a RPM receiver controller 1116, and a partner information table 1118. The pending packet buffer 1114 stores packets 302 that have been received out of sequence. The pending packet buffer 1114 is smaller in size than in software implemented systems because the smaller latency in the system 100 makes out-of-order reception of packets 302 less common. The RPM receiver controller 1116 controls the operation of the receiving circuit 1104. The partner information table 1118 stores, for each source processor node 102, the sequence number of the next expected packet 302 from that source processor node 102.
The reliable packet mover 114 generates the packet header 306 that includes the sequence number of the packet 302. Sequence numbers are used to inform the destination processor node 102 of the sequence of the packets 302. The destination node processes the packets 302 only in sequence. Upon receipt of an in-order packet 302, the destination processor node sends an acknowledgment back to the source processor node 102 informing it of the receipt of the packet 302. If the source processor node 102 does not get an acknowledgment within a predetermined time, the source processor node 102 retransmits the packet 302 using the same sequence number.
Referring to FIG. 12a, there is shown a flowchart illustrating the operation of the transmitting circuit 1102 of the reliable packet mover 114. To transmit data, the coherence control unit 110 provides the data and an identification of the destination processor node 102 to the reliable packet mover 114, which converts the data into packets 302 and assigns a sequence number to each packet 302. Each transmitted packet 302 is stored in the transmit packet buffer 1106 in the source processor node 102. If 1201 it has capacity, the transmit packet buffer 1106 accepts the data and the reliable packet mover 114 transmits the packet 302. If not, the interconnect service manager layer 202 stops sending packets 302 and waits.
When the reliable packet mover 114 is to transmit a packet 302, the transmitting circuit 1102 retrieves 1202 a sequence number from the partner information table 1110 corresponding to the destination processor node 102. The transmitting circuit 1102 adds 1206 the retrieved sequence number to the packet header 306 of the packet 302 and performs 1207 error detection. The transmitting circuit 1102 sends 1208 the packet 302 to the fast frame mover 116 for transmission as described above. The transmitting circuit 1102 also stores 1210 the packet 302 in the transmit packet buffer 1106, with a mark bit for that packet 302 that is not set, until an acknowledgment is received that the packet was received. The sequence number in the partner information table 1110 is incremented 1204 for the next packet 302 transmission. Because the sequence numbers are finite, they eventually will wrap around. Accordingly, the sequence number space is sufficiently large so that no two packets 302 with the same sequence number are in the system 100 at the same time.
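A compact sketch of these transmit steps follows: the per-destination sequence number is looked up, stamped on the packet, the packet is handed to the fast frame mover and kept in the transmit packet buffer with its mark bit unset, and the counter is incremented. The 16-bit sequence number space is an illustrative assumption.

SEQ_MOD = 1 << 16                  # assumed size of the sequence number space

class TransmitCircuit:
    def __init__(self):
        self.next_seq = {}         # stand-in for partner information table 1110
        self.tx_buffer = {}        # stand-in for transmit packet buffer 1106

    def send(self, dest, payload, frame_mover):
        seq = self.next_seq.get(dest, 0)             # step 1202
        packet = {"dest": dest, "seq": seq, "data": payload}
        frame_mover(packet)                          # step 1208
        self.tx_buffer[(dest, seq)] = {"packet": packet, "mark": False}  # step 1210
        self.next_seq[dest] = (seq + 1) % SEQ_MOD    # step 1204
        return seq

sent = []
tx = TransmitCircuit()
tx.send("node B", b"hello", sent.append)
tx.send("node B", b"world", sent.append)
print([p["seq"] for p in sent])    # [0, 1]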
Referring to FIG. 12b, there is shown a flowchart illustrating the processing of acknowledgment packets and the retransmission of packets by the reliable packet mover, if packets have been sent. The acknowledgment packet is a control packet from the receiving circuit 1104 to the source processor node 102 that indicates that the packet 302 was received and passed error detection. The acknowledgment packet includes a destination node number, the sequence number of the received packet 302, and EDC, and is preferably 16 bytes. The acknowledgment packet may be appended to another packet 302 that is being sent to the source processor node 102. This reduces traffic in the interconnect 104. The acknowledgment packet itself is not acknowledged and does not include its own sequence number. The sequence number in an acknowledgment packet implicitly acknowledges all prior packets, i.e., all packets with sequence numbers less than the sequence number in the acknowledgment, appropriately adjusted for the wrap around noted above. This allows the receiving circuit 1104 to delay the acknowledgment of packets 302 and to reduce the traffic of acknowledgment packets by using a single acknowledgment packet to acknowledge more than one packet 302.
The transmitting circuit 1102 determines 1212 whether an acknowledgment packet is received. If so, the transmitting circuit 1102 deletes 1214 the packets 302 corresponding to the received acknowledgment from the transmit packet buffer 1106. This deletion includes all prior packets 302 in the transmit packet buffer 1106 for the source-destination processor node 102 pair. These packets 302 have a sequence number less than or equal to the sequence number in the acknowledgment packet or sequence numbers that are appropriately adjusted to account for the wrap around.
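The implicit acknowledgment of all prior packets, adjusted for wrap around, can be sketched as below; the window-style "is at or before" comparison and the 16-bit sequence space are assumptions about how the adjustment might be done.

SEQ_MOD = 1 << 16                  # assumed sequence number space

def seq_not_after(a, b):
    # True if sequence number a is at or before b, modulo wrap around.
    return ((b - a) % SEQ_MOD) < SEQ_MOD // 2

def process_ack(tx_buffer, dest, acked_seq):
    # Delete the acknowledged packet and all prior packets for this pair.
    for (d, seq) in list(tx_buffer):
        if d == dest and seq_not_after(seq, acked_seq):
            del tx_buffer[(d, seq)]

buffer = {("node B", s): {"mark": False} for s in (0, 1, 2, 3)}
process_ack(buffer, "node B", 2)          # acknowledges packets 0, 1 and 2
print(sorted(seq for _, seq in buffer))   # [3]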
The packet 302 is resent if an acknowledgment packet is not received after a specified time-out period. Specifically, if an acknowledgment packet is not received 1212, the transmitting circuit 1102 determines 1216 whether the time out circuit 1112 has timed out. If not, the transmitting circuit 1102 continues to determine 1212 whether an acknowledgment is received.
On the other hand, if the time out circuit 1112 has timed out, the transmitting circuit 1102 checks 1218 each packet 302 stored in the transmit packet buffer 1106 to determine if a mark bit is set for that packet 302. If the bit is not set, the transmitting circuit 1102 sets 1220 the mark bit for that packet 302. This allows a packet 302 between one and two time out periods before being resent. For packets 302 with the mark bit set 1218, the transmitting circuit 1102 retrieves 1222 the packet 302 from the transmit packet buffer 1106 and retransmits 1224 the packet 302. To allow a determination that the interconnect 104 is defective, only a limited or maximum number of retransmissions are sent. In particular, the transmitting circuit 1102 determines 1225 if the packet 302 has been resent a predetermined number of times. If it has been, the transmitting circuit 1102 informs 1227 the interconnect service manager layer 202 of such number of retransmissions, and the layer 202 may then reroute packets 302 between that source-destination node pair. If the number of retransmissions has not reached the maximum, then upon reaching 1226 the last packet 302, the transmitting circuit 1102 continues to determine 1212 whether an acknowledgment packet is received, as described above.
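The two-pass mark-bit handling and the retransmission limit can be sketched as follows; the retry limit of three is an illustrative assumption rather than a value from the description.

MAX_RETRIES = 3                    # assumed maximum number of retransmissions

def on_timeout(tx_buffer, retransmit, notify_ism):
    for key, entry in tx_buffer.items():
        if not entry["mark"]:
            entry["mark"] = True             # first time out: mark only (step 1220)
            continue
        if entry.setdefault("retries", 0) >= MAX_RETRIES:
            notify_ism(key)                  # report persistent failure (step 1227)
            continue
        entry["retries"] += 1
        retransmit(entry["packet"])          # resend marked packet (steps 1222-1224)

buffer = {("node B", 0): {"mark": False, "packet": "pkt0"}}
resent = []
on_timeout(buffer, resent.append, print)     # first time out: mark only
on_timeout(buffer, resent.append, print)     # second time out: resend
print(resent)                                # ['pkt0']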
Referring to FIG. 13, there is shown a flowchart illustrating the operation of the receiving circuit 1104 of the reliable packet mover 114. The receiving circuit 1104 of the reliable packet mover 114 provides the packets 302 to the interconnect service manager layer 202 in sequence number order. The receiving circuit 1104 receives 1302 a packet 302 from the interconnect 104. If the pending packet buffer 1114 is full, the receiving circuit 1104 discards the packet 302. Alternatively, the receiving circuit 1104 may discard the latest packet 302 stored in the pending packet buffer 1114. Of course, other packets 302 may instead be discarded from the pending packet buffer 1114, since this choice affects performance and not correctness. The receiving circuit 1104 performs 1316 error detection on the packet 302. If the packet 302 fails 1318 error detection, the packet 302 is discarded 1310 and the receiving circuit 1104 continues to receive 1302 packets 302. On the other hand, if the packet 302 does not fail 1318 error detection, the receiving circuit 1104 extracts 1304 the sequence number and source node number from the packet header 306. The receiving circuit 1104 reads 1306 the next expected sequence number for the source processor node from the partner information table 1118 and compares 1308 the next expected sequence number to the extracted sequence number. If the extracted sequence number is less than the expected sequence number, the packet 302 has already been processed by the receiving circuit 1104 and is a duplicate. Again, the wrap around of sequence numbers is appropriately accounted for. The packet 302 is discarded 1310 and the receiving circuit 1104 continues to receive 1302 packets 302.
If the extracted sequence number is not less than the expected sequence number, the receiving circuit 1104 determines 1312 whether the extracted sequence number is equal to the expected sequence number. If there is not a match, the received packet 302 is out of sequence. The receiving circuit 1104 stores 1314 the packet 302 in the pending packet buffer 1114 and continues to receive 1302 packets 302.
On the other hand, if the next expected sequence number matches 1312 the extracted sequence number, the receiving circuit 1104 provides 1320 an acknowledgment to the interconnect 104. Because the received packet 302 is the expected packet 302, the receiving circuit 1104 increments 1322 the expected sequence number in the partner information table 1118 for the corresponding source processor node. The receiving circuit 1104 provides 1324 the packet 302 to the interconnect service manager layer 202 for processing and checks 1326 the pending packet buffer 1114 for the packet 302 next in the sequence.
Referring to FIG. 14, there is shown a flowchart illustrating the operation of checking 1326 the pending packet buffer 1114. The receiving circuit 1104 checks 1402 the pending packet buffer 1114 for the packet 302 next in the sequence. If the next expected packet 302 is in the pending packet buffer 1114, the receiving circuit 1104 also sends 1404 an acknowledgment and increments 1406 the sequence number. The receiving circuit 1104 provides 1408 that packet 302 to the interconnect service manager layer 202. The receiving circuit 1104 continues checking the pending packet buffer 1114 for the next expected packet 302 until such a packet 302 is not found. The receiving circuit 1104 then continues to monitor 1302 (FIG. 13) for received packets 302.
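The receive path of FIGS. 13-14 can be summarized by the sketch below: duplicates are dropped, out-of-order packets wait in the pending packet buffer, and each in-order packet is acknowledged, delivered, and followed by a drain of the pending buffer. The wrap-around comparison and the per-packet acknowledgment are simplifying assumptions.

SEQ_MOD = 1 << 16                  # assumed sequence number space

def seq_before(a, b):
    # True if sequence number a is strictly before b, modulo wrap around.
    return a != b and ((b - a) % SEQ_MOD) < SEQ_MOD // 2

class ReceiveCircuit:
    def __init__(self, deliver, acknowledge):
        self.expected = {}         # stand-in for partner information table 1118
        self.pending = {}          # stand-in for pending packet buffer 1114
        self.deliver = deliver     # hand the packet to the ISM layer
        self.acknowledge = acknowledge

    def receive(self, src, seq, payload):
        expected = self.expected.get(src, 0)
        if seq_before(seq, expected):
            return                               # duplicate: discard (step 1310)
        if seq != expected:
            self.pending[(src, seq)] = payload   # out of order: buffer (step 1314)
            return
        while True:                              # in order: acknowledge, deliver, drain
            self.acknowledge(src, expected)
            self.deliver(src, payload)
            expected = (expected + 1) % SEQ_MOD
            self.expected[src] = expected
            if (src, expected) not in self.pending:
                break
            payload = self.pending.pop((src, expected))

delivered, acks = [], []
rx = ReceiveCircuit(lambda s, p: delivered.append(p), lambda s, n: acks.append(n))
rx.receive("node A", 1, "second")    # arrives early, held in the pending buffer
rx.receive("node A", 0, "first")     # triggers delivery of both packets
print(delivered, acks)               # ['first', 'second'] [0, 1]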
The above description is included to illustrate the operation of the preferred embodiments and is not meant to limit the scope of the invention. The scope of the invention is to be limited only by the following claims. From the above discussion, many variations will be apparent to one skilled in the art that would yet be encompassed by the spirit and scope of the invention.

Claims (12)

We claim:
1. A system for communicating data in packets, each packet included in a frame, the system comprising:
a plurality of routers disposed to transmit and receive frames among one another; and
a plurality of nodes, each node including a frame mover disposed to transmit frames to and receive frames from the routers and including a packet mover disposed to supply packets to and receive packets from the frame mover, the packet mover of one of the nodes supplying a packet to the frame mover of said one of the nodes, with the packet indicating a destination node among the plurality of nodes, the frame mover of said one of the nodes selecting a predetermined route through the routers to the destination node, including the packet and the route in the frame, and transmitting the frame to the routers;
the routers transmitting the frame to the frame mover of the destination node in response to the route included in the frame;
the frame mover of the destination node receiving the frame and supplying the packet included therein to the packet mover of the destination node, the packet mover of the destination node checking the packet for errors, discarding the packet in response to detecting an error, and transmitting an acknowledgment for the packet to said one of the nodes in response to detecting no errors; and
the packet mover of said one of the nodes retransmitting the packet to the destination node in response to receiving no acknowledgment for the packet within a predetermined period of time.
2. A system for transmitting data in packets, each packet included in a frame, the system comprising:
a plurality of mesh links each for carrying frames during transmission;
a plurality of routers linked in a network by the mesh links which carry frames between the routers; and
a plurality of nodes each linked to the plurality of routers by at least one of the mesh links, one of the nodes being a source node for selecting a predetermined route to another of the nodes being a destination node, the predetermined route from a routing table having a plurality of predetermined routes to the destination node, with said predetermined route indicating a sequence of the mesh links linking said one of the nodes to said another of the nodes, generating a frame that includes said predetermined route and a packet, and transmitting the frame along the mesh link that begins the sequence to one of the routers, each router linked by the sequence transmitting the frame along the sequence to said another of the nodes in response to the route included in the frame, said another of the nodes extracting the packet from the frame, checking the packet for errors, and processing the packet in response to detecting no errors.
3. The system of claim 2, wherein the nodes include:
a controller for selecting one of the plurality of predetermined routes for inclusion in the frame.
4. The system of claim 3 wherein the controller selects said one of the plurality of predetermined routes for inclusion in the frame according to a probability distribution weighting a likelihood of selection of each of the predetermined routes.
5. A method for sending data in packets between nodes of a communication system, each node having a buffer, each packet including an identifier distinguishable from identifiers of other packets in the system, the method comprising the steps of:
a) sending a copy of a packet including the identifier of the packet from a first node to a second node;
b) storing the packet of step a) in the buffer of the first node;
c) receiving an acknowledgment including a second identifier at the first node from another of the nodes;
d) removing the packet from the buffer of the first node in response to the second identifier matching the identifier of the packet; and
e) resending each packet in the buffer of the first node in response to not receiving the acknowledgment after a predetermined time.
6. The method of claim 5, wherein the identifier is a number and the step of removing the packet from storage includes the step of removing each packet from storage having an identifier less than or equal to the second identifier.
7. The system of claim 2, wherein each node has an identifier, and each predetermined route comprises the sequence of routers linked by the sequence of the mesh links, and further includes the identifier of said another of the nodes, for identifying the sequence of the mesh links in each predetermined route.
8. The system of claim 2, wherein:
said one of the nodes includes the predetermined route in a beginning portion of the frame; and
at least one of the routers linked by the sequence of the mesh links receives the beginning portion of the frame and prior to receiving an end portion of the frame transmits said beginning portion along the sequence of the mesh links to said another of the nodes in response to the route included in the beginning portion of the frame.
9. The method of claim 5, wherein steps a) and b) are performed a plurality of times without performing steps c), d), and e).
10. The method of claim 5, wherein steps c), d), and e) are performed a plurality of times without performing steps a) and b).
11. A system for communicating data in packets, each packet including a sequence number and being included in a frame, the system comprising:
a plurality of routers disposed to transmit and receive frames among one another; and
a plurality of nodes, each node including a frame mover disposed to transmit frames to and receive frames from the routers and including a packet mover disposed to supply packets to and receive packets from the frame mover, each packet mover including:
a table for storing the sequence number of the next packet expected from each other packet mover;
a buffer for storing each packet received from another of the packet movers having a sequence number that follows the sequence number in the table corresponding to said another of the packet movers; and
a controller for comparing the sequence number of a packet received from one of the other packet movers to the stored sequence number in the table corresponding to said one of the other packet movers, storing the packet in the buffer in response to the sequence number of the packet following said stored sequence number, processing the packet and incrementing said stored sequence number in response to the sequence number of the packet matching said stored sequence number, and removing and processing another packet stored in the buffer in response to the incremented sequence number matching the sequence number of said another packet and then again incrementing the incremented sequence number;
the packet mover of one of the nodes supplying a packet to the frame mover of said one of the nodes, with the packet indicating a destination node among the plurality of nodes, the frame mover of said one of the nodes selecting a predetermined route through the routers to the destination node, including the packet and the route in the frame, and transmitting the frame to the routers;
the routers transmitting the frame to the frame mover of the destination node in response to the route included in the frame;
the frame mover of the destination node receiving the frame and supplying the packet included therein to the packet mover of the destination node, the packet mover of the destination node checking the packet for errors, discarding the packet in response to detecting an error, and transmitting an acknowledgment for the packet to said one of the nodes in response to detecting no errors; and
the packet mover of said one of the nodes retransmitting the packet to the destination node in response to receiving no acknowledgment for the packet within a predetermined period of time.
12. A method for sending data in packets between nodes of a communication system using a plurality of mark bits, each mark bit having a set state and an unset state, and each node having a buffer, and each packet including an identifier distinguishable from identifiers of other packets in the system, the method comprising:
a) sending a copy of a packet including the identifier of the packet from a first node to a second node;
b) storing the packet of step a) in the buffer of the first node, including:
associating a mark bit with the packet;
and placing the mark bit in the unset state;
c) receiving an acknowledgment including a second identifier at the first node from another of the nodes;
d) removing the packet from the buffer of the first node if the second identifier matches the identifier of the packet; and
e) resending each packet in the buffer of the first node after a predetermined time, including:
resending each packet in the buffer of the first node having a mark bit in the set state; and
placing into the set state the mark bit of each packet in the buffer of the first node having a mark bit in the unset state.
US08/605,677 1996-02-22 1996-02-22 Asynchronous packet switching Expired - Lifetime US5959995A (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US08/605,677 US5959995A (en) 1996-02-22 1996-02-22 Asynchronous packet switching
DE69735740T DE69735740T2 (en) 1996-02-22 1997-02-20 ASYNCHRONE PACKAGE TRANSMISSION
PCT/US1997/002943 WO1997031464A1 (en) 1996-02-22 1997-02-20 Asynchronous packet switching
EP97906765A EP0823165B1 (en) 1996-02-22 1997-02-20 Asynchronous packet switching
JP53040697A JP3816531B2 (en) 1996-02-22 1997-02-20 Asynchronous packet switching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/605,677 US5959995A (en) 1996-02-22 1996-02-22 Asynchronous packet switching

Publications (1)

Publication Number Publication Date
US5959995A true US5959995A (en) 1999-09-28

Family

ID=24424724

Family Applications (1)

Application Number Title Priority Date Filing Date
US08/605,677 Expired - Lifetime US5959995A (en) 1996-02-22 1996-02-22 Asynchronous packet switching

Country Status (5)

Country Link
US (1) US5959995A (en)
EP (1) EP0823165B1 (en)
JP (1) JP3816531B2 (en)
DE (1) DE69735740T2 (en)
WO (1) WO1997031464A1 (en)

Cited By (82)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000013455A1 (en) * 1998-08-27 2000-03-09 Intel Corporation Method and apparatus for input/output link retry, failure and recovery in a computer network
US6205498B1 (en) * 1998-04-01 2001-03-20 Microsoft Corporation Method and system for message transfer session management
US6252851B1 (en) * 1997-03-27 2001-06-26 Massachusetts Institute Of Technology Method for regulating TCP flow over heterogeneous networks
US6304569B1 (en) * 1997-03-27 2001-10-16 Siemens Aktiengesellschaft Method for the reception of message cells from low-priority connections from only one of a number of redundant transmission paths
US20020027912A1 (en) * 2000-08-11 2002-03-07 Peter Galicki Pull transfers and transfer receipt confirmation in a datapipe routing bridge
US6393023B1 (en) * 1998-05-08 2002-05-21 Fujitsu Limited System and method for acknowledging receipt of messages within a packet based communication network
US20020071431A1 (en) * 2000-12-13 2002-06-13 Chakravarthy Kosaraju Method and an apparatus for a re-configurable processor
US6415312B1 (en) * 1999-01-29 2002-07-02 International Business Machines Corporation Reliable multicast for small groups
US20020106991A1 (en) * 2001-02-05 2002-08-08 Tantivy Communications, Inc. Link-aware transmission control protocol
US20020118692A1 (en) * 2001-01-04 2002-08-29 Oberman Stuart F. Ensuring proper packet ordering in a cut-through and early-forwarding network switch
US6499066B1 (en) 1999-09-27 2002-12-24 International Business Machines Corporation Method and apparatus for using fibre channel test extended link service commands for interprocess communication
US20030037224A1 (en) * 2001-08-16 2003-02-20 Newisys, Inc. Computer system partitioning using data transfer routing mechanism
US6529932B1 (en) 1998-04-01 2003-03-04 Microsoft Corporation Method and system for distributed transaction processing with asynchronous message delivery
US6609165B1 (en) 1999-09-27 2003-08-19 International Business Machines Corporation Method and apparatus for using fibre channel extended link service commands in a point-to-point configuration
US6683850B1 (en) * 1997-08-29 2004-01-27 Intel Corporation Method and apparatus for controlling the flow of data between servers
US6683876B1 (en) * 1996-09-23 2004-01-27 Silicon Graphics, Inc. Packet switched router architecture for providing multiple simultaneous communications
US6687766B1 (en) 1998-10-14 2004-02-03 International Business Machines Corporation Method and apparatus for a fibre channel control unit to execute search commands locally
US20040044877A1 (en) * 2002-05-28 2004-03-04 Mark Myers Computer node to mesh interface for highly scalable parallel processing system
US20040047311A1 (en) * 2002-09-09 2004-03-11 Nokia Corporation Phase shifted time slice transmission to improve handover
US20040081394A1 (en) * 2001-01-31 2004-04-29 Giora Biran Providing control information to a management processor of a communications switch
US20040113306A1 (en) * 1999-05-19 2004-06-17 Rapacki Alan R Manufacturing conduits for use in placing a target vessel in fluid communication with a source of blood
US6763418B1 (en) 2001-09-07 2004-07-13 Agilent Technologies, Inc. Request bus arbitration
US6771659B1 (en) * 2000-01-21 2004-08-03 Nokia Mobile Phones Ltd. Method and apparatus for a selective acknowledgement scheme in a modified unacknowledge mode for use over a communications link
GB2398650A (en) * 2003-02-21 2004-08-25 Picochip Designs Ltd Communications in a processor array
US20040210646A1 (en) * 2003-04-17 2004-10-21 Hitachi, Ltd. Information processing system
US6810031B1 (en) 2000-02-29 2004-10-26 Celox Networks, Inc. Method and device for distributing bandwidth
US6826645B2 (en) 2000-12-13 2004-11-30 Intel Corporation Apparatus and a method to provide higher bandwidth or processing power on a bus
US6839794B1 (en) 2001-10-12 2005-01-04 Agilent Technologies, Inc. Method and system to map a service level associated with a packet to one of a number of data streams at an interconnect device
US20050138197A1 (en) * 2003-12-19 2005-06-23 Venables Bradley D. Queue state mirroring
US6920106B1 (en) 2001-09-07 2005-07-19 Agilent Technologies, Inc. Speculative loading of buffers within a port of a network device
US20050157650A1 (en) * 2002-02-14 2005-07-21 Nokia Corporation Clock-based time slicing
US6922749B1 (en) 2001-10-12 2005-07-26 Agilent Technologies, Inc. Apparatus and methodology for an input port of a switch that supports cut-through operation within the switch
US20050208942A1 (en) * 2004-03-19 2005-09-22 Nokia Corporation Advanced handover in phased-shifted and time-sliced networks
US6950394B1 (en) 2001-09-07 2005-09-27 Agilent Technologies, Inc. Methods and systems to transfer information using an alternative routing associated with a communication network
US6985484B1 (en) * 1997-01-09 2006-01-10 Silicon Graphics, Inc. Packetized data transmissions in a switched router architecture
US7016996B1 (en) 2002-04-15 2006-03-21 Schober Richard L Method and apparatus to detect a timeout condition for a data item within a process
US20060101234A1 (en) * 2004-11-05 2006-05-11 Hannum David P Systems and methods of balancing crossbar bandwidth
US7054330B1 (en) 2001-09-07 2006-05-30 Chou Norman C Mask-based round robin arbitration
US7075937B1 (en) * 1998-05-19 2006-07-11 Canon Kabushiki Kaisha Method and device for sending data, method and device for receiving data
US7095753B1 (en) * 2000-09-19 2006-08-22 Bbn Technologies Corp. Digital network processor-based multi-protocol flow control
US20060251010A1 (en) * 2005-03-30 2006-11-09 At&T Corp. Loss tolerant transmission control protocol
US20060266273A1 (en) * 2005-03-14 2006-11-30 Todd Westberg System and method of modular vehicle gauge system and illumination
US7209476B1 (en) 2001-10-12 2007-04-24 Avago Technologies General Ip (Singapore) Pte. Ltd. Method and apparatus for input/output port mirroring for networking system bring-up and debug
US7237016B1 (en) 2001-09-07 2007-06-26 Palau Acquisition Corporation (Delaware) Method and system to manage resource requests utilizing link-list queues within an arbiter associated with an interconnect device
US20070198900A1 (en) * 2006-02-23 2007-08-23 Samsung Electronics Co., Ltd. Network intermediate device and method thereof
US7290277B1 (en) * 2002-01-24 2007-10-30 Avago Technologies General Ip Pte Ltd Control of authentication data residing in a network device
US20080084864A1 (en) * 2006-10-06 2008-04-10 Charles Jens Archer Method and Apparatus for Routing Data in an Inter-Nodal Communications Lattice of a Massively Parallel Computer System by Semi-Randomly Varying Routing Policies for Different Packets
US20080084865A1 (en) * 2006-10-06 2008-04-10 Charles Jens Archer Method and Apparatus for Routing Data in an Inter-Nodal Communications Lattice of a Massively Parallel Computer System by Routing Through Transporter Nodes
US20080084827A1 (en) * 2006-10-06 2008-04-10 Charles Jens Archer Method and Apparatus for Routing Data in an Inter-Nodal Communications Lattice of a Massively Parallel Computer System by Dynamic Global Mapping of Contended Links
US20080184214A1 (en) * 2007-01-30 2008-07-31 Charles Jens Archer Routing Performance Analysis and Optimization Within a Massively Parallel Computer
US20090154486A1 (en) * 2007-12-13 2009-06-18 Archer Charles J Tracking Network Contention
US20090198864A1 (en) * 2008-02-05 2009-08-06 Alaxala Networks Corporation Network switch and method of switching in network
US7574597B1 (en) 2001-10-19 2009-08-11 Bbn Technologies Corp. Encoding of signals to facilitate traffic analysis
US20090228602A1 (en) * 2008-03-04 2009-09-10 Timothy James Speight Method and apparatus for managing transmission of tcp data segments
US7889654B2 (en) 2005-03-30 2011-02-15 At&T Intellectual Property Ii, L.P. Loss tolerant transmission control protocol
US20110069612A1 (en) * 2009-03-12 2011-03-24 Takao Yamaguchi Best path selecting device, best path selecting method, and program
US20110173349A1 (en) * 2009-11-13 2011-07-14 International Business Machines Corporation I/o routing in a multidimensional torus network
US7993356B2 (en) 1998-02-13 2011-08-09 Medtronic, Inc. Delivering a conduit into a heart wall to place a coronary vessel in communication with a heart chamber and removing tissue from the vessel or heart wall to facilitate such communication
US20110307628A1 (en) * 2010-03-17 2011-12-15 Nec Corporation Communication system, node, control server, communication method and program
US8301823B2 (en) 2009-07-07 2012-10-30 Panasonic Corporation Bus controller arranged between a bus master and a networked communication bus in order to control the transmission route of a packet that flows through the communication bus, and simulation program to design such a bus controller
US8463312B2 (en) 2009-06-05 2013-06-11 Mindspeed Technologies U.K., Limited Method and device in a communication network
US8512360B2 (en) 1998-02-13 2013-08-20 Medtronic, Inc. Conduits for use in placing a target vessel in fluid communication with source of blood
US8559998B2 (en) 2007-11-05 2013-10-15 Mindspeed Technologies U.K., Limited Power control
WO2014051746A1 (en) * 2012-09-29 2014-04-03 Intel Corporation Techniques for resilient communication
US8712469B2 (en) 2011-05-16 2014-04-29 Mindspeed Technologies U.K., Limited Accessing a base station
US8798630B2 (en) 2009-10-05 2014-08-05 Intel Corporation Femtocell base station
US8849340B2 (en) 2009-05-07 2014-09-30 Intel Corporation Methods and devices for reducing interference in an uplink
US8862076B2 (en) 2009-06-05 2014-10-14 Intel Corporation Method and device in a communication network
US8891371B2 (en) 2010-11-30 2014-11-18 International Business Machines Corporation Data communications in a parallel active messaging interface of a parallel computer
US8904148B2 (en) 2000-12-19 2014-12-02 Intel Corporation Processor architecture with switch matrices for transferring data along buses
US8930962B2 (en) 2012-02-22 2015-01-06 International Business Machines Corporation Processing unexpected messages at a compute node of a parallel computer
US8949328B2 (en) 2011-07-13 2015-02-03 International Business Machines Corporation Performing collective operations in a distributed processing system
US9042434B2 (en) 2011-04-05 2015-05-26 Intel Corporation Filter
US9075747B2 (en) 2010-05-27 2015-07-07 Panasonic Intellectual Property Management Co., Ltd. Bus controller and control unit that outputs instruction to the bus controller
US9107136B2 (en) 2010-08-16 2015-08-11 Intel Corporation Femtocell access control
US9164944B2 (en) 2010-01-25 2015-10-20 Panasonic Intellectual Property Management Co., Ltd. Semiconductor system, relay apparatus, and chip circuit
US9225545B2 (en) 2008-04-01 2015-12-29 International Business Machines Corporation Determining a path for network traffic between nodes in a parallel computer
US9264371B2 (en) 2011-10-14 2016-02-16 Panasonic Intellectual Property Management Co., Ltd. Router, method for controlling the router, and computer program
US20170288814A1 (en) * 2014-10-09 2017-10-05 Hewlett Packard Enterprise Development Lp A transmitter that does not resend a packet despite receipt of a message to resend the packet
US9860183B2 (en) 2015-09-25 2018-01-02 Fsa Technologies, Inc. Data redirection in a bifurcated communication trunk system and method
US9954760B2 (en) 2010-01-29 2018-04-24 International Business Machines Corporation I/O routing in a multidimensional torus network
US10856302B2 (en) 2011-04-05 2020-12-01 Intel Corporation Multimode base station

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2779019B1 (en) * 1998-05-19 2003-01-24 Canon Kk METHOD AND DEVICE FOR TRANSMITTING DATA, METHOD AND DEVICE FOR RECEIVING DATA
JP5204603B2 (en) * 2008-09-29 2013-06-05 株式会社日立製作所 Quadruple computer system and duplex ring network
US11741050B2 (en) 2021-01-29 2023-08-29 Salesforce, Inc. Cloud storage class-based variable cache availability

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4058838A (en) * 1976-11-10 1977-11-15 International Telephone And Telegraph Corporation Packet-switched facsimile communications system
EP0282198A2 (en) * 1987-03-13 1988-09-14 Nortel Networks Corporation Communications system and components and methods for use therein
US5115433A (en) * 1989-07-18 1992-05-19 Metricom, Inc. Method and system for routing packets in a packet communication network
EP0602693A2 (en) * 1992-12-18 1994-06-22 ALCATEL BELL Naamloze Vennootschap ATM switching node and ATM switching element having routing logic means
US5351237A (en) * 1992-06-05 1994-09-27 Nec Corporation Network system comprising a plurality of lans connected to an ISDN via a plurality of routers, each capable of automatically creating a table for storing router information
EP0658028A2 (en) * 1993-11-30 1995-06-14 AT&T Corp. Retransmission protocol for wireless communications
US5524116A (en) * 1992-02-14 1996-06-04 At&T Corp. Packet framer
US5544161A (en) * 1995-03-28 1996-08-06 Bell Atlantic Network Services, Inc. ATM packet demultiplexer for use in full service network having distributed architecture
US5590122A (en) * 1994-12-22 1996-12-31 Emc Corporation Method and apparatus for reordering frames

US6907490B2 (en) 2000-12-13 2005-06-14 Intel Corporation Method and an apparatus for a re-configurable processor
US8904148B2 (en) 2000-12-19 2014-12-02 Intel Corporation Processor architecture with switch matrices for transferring data along buses
US20020118692A1 (en) * 2001-01-04 2002-08-29 Oberman Stuart F. Ensuring proper packet ordering in a cut-through and early-forwarding network switch
US20040081394A1 (en) * 2001-01-31 2004-04-29 Giora Biran Providing control information to a management processor of a communications switch
US7184401B2 (en) * 2001-02-05 2007-02-27 Interdigital Technology Corporation Link-aware transmission control protocol
US20070147245A1 (en) * 2001-02-05 2007-06-28 Interdigital Technology Corporation Link-aware transmission control protocol
US20020106991A1 (en) * 2001-02-05 2002-08-08 Tantivy Communications, Inc. Link-aware transmission control protocol
US7672241B2 (en) 2001-02-05 2010-03-02 Ipr Licensing, Inc. Link-aware transmission control protocol
AU2002324671B2 (en) * 2001-08-16 2008-09-25 Sanmina Corporation Computer system partitioning using data transfer routing mechanism
EP1442385A1 (en) * 2001-08-16 2004-08-04 Newisys Inc. Computer system partitioning using data transfer routing mechanism
US20030037224A1 (en) * 2001-08-16 2003-02-20 Newisys, Inc. Computer system partitioning using data transfer routing mechanism
US7921188B2 (en) 2001-08-16 2011-04-05 Newisys, Inc. Computer system partitioning using data transfer routing mechanism
EP1442385A4 (en) * 2001-08-16 2006-01-18 Newisys Inc Computer system partitioning using data transfer routing mechanism
US6950394B1 (en) 2001-09-07 2005-09-27 Agilent Technologies, Inc. Methods and systems to transfer information using an alternative routing associated with a communication network
US6920106B1 (en) 2001-09-07 2005-07-19 Agilent Technologies, Inc. Speculative loading of buffers within a port of a network device
US7054330B1 (en) 2001-09-07 2006-05-30 Chou Norman C Mask-based round robin arbitration
US6763418B1 (en) 2001-09-07 2004-07-13 Agilent Technologies, Inc. Request bus arbitration
US7237016B1 (en) 2001-09-07 2007-06-26 Palau Acquisition Corporation (Delaware) Method and system to manage resource requests utilizing link-list queues within an arbiter associated with an interconnect device
US6839794B1 (en) 2001-10-12 2005-01-04 Agilent Technologies, Inc. Method and system to map a service level associated with a packet to one of a number of data streams at an interconnect device
US7209476B1 (en) 2001-10-12 2007-04-24 Avago Technologies General Ip (Singapore) Pte. Ltd. Method and apparatus for input/output port mirroring for networking system bring-up and debug
US6922749B1 (en) 2001-10-12 2005-07-26 Agilent Technologies, Inc. Apparatus and methodology for an input port of a switch that supports cut-through operation within the switch
US7574597B1 (en) 2001-10-19 2009-08-11 Bbn Technologies Corp. Encoding of signals to facilitate traffic analysis
US7290277B1 (en) * 2002-01-24 2007-10-30 Avago Technologies General Ip Pte Ltd Control of authentication data residing in a network device
US20050157650A1 (en) * 2002-02-14 2005-07-21 Nokia Corporation Clock-based time slicing
US7016996B1 (en) 2002-04-15 2006-03-21 Schober Richard L Method and apparatus to detect a timeout condition for a data item within a process
US20040044877A1 (en) * 2002-05-28 2004-03-04 Mark Myers Computer node to mesh interface for highly scalable parallel processing system
US7058034B2 (en) * 2002-09-09 2006-06-06 Nokia Corporation Phase shifted time slice transmission to improve handover
US20040047311A1 (en) * 2002-09-09 2004-03-11 Nokia Corporation Phase shifted time slice transmission to improve handover
US20070083791A1 (en) * 2003-02-12 2007-04-12 Gajinder Panesar Communications in a processor array
US7987340B2 (en) 2003-02-21 2011-07-26 Gajinder Panesar Communications in a processor array
GB2398650B (en) * 2003-02-21 2006-09-20 Picochip Designs Ltd Communications in a processor array
GB2398650A (en) * 2003-02-21 2004-08-25 Picochip Designs Ltd Communications in a processor array
US20040210646A1 (en) * 2003-04-17 2004-10-21 Hitachi, Ltd. Information processing system
US7814222B2 (en) * 2003-12-19 2010-10-12 Nortel Networks Limited Queue state mirroring
US20050138197A1 (en) * 2003-12-19 2005-06-23 Venables Bradley D. Queue state mirroring
US7660583B2 (en) 2004-03-19 2010-02-09 Nokia Corporation Advanced handover in phased-shifted and time-sliced networks
US20050208942A1 (en) * 2004-03-19 2005-09-22 Nokia Corporation Advanced handover in phased-shifted and time-sliced networks
US7600023B2 (en) * 2004-11-05 2009-10-06 Hewlett-Packard Development Company, L.P. Systems and methods of balancing crossbar bandwidth
US20060101234A1 (en) * 2004-11-05 2006-05-11 Hannum David P Systems and methods of balancing crossbar bandwidth
US20060266273A1 (en) * 2005-03-14 2006-11-30 Todd Westberg System and method of modular vehicle gauge system and illumination
US8537675B2 (en) 2005-03-30 2013-09-17 At&T Intellectual Property I, L.P. Loss tolerant transmission control protocol
US7889654B2 (en) 2005-03-30 2011-02-15 At&T Intellectual Property Ii, L.P. Loss tolerant transmission control protocol
US20110099437A1 (en) * 2005-03-30 2011-04-28 AT&T INTELLECTUAL PROPERTY II, L.P. (fka AT&T Corp.) Loss Tolerant Transmission Control Protocol
US7366132B2 (en) * 2005-03-30 2008-04-29 At&T Corp. Loss tolerant transmission control protocol
US20060251010A1 (en) * 2005-03-30 2006-11-09 At&T Corp. Loss tolerant transmission control protocol
US8418045B2 (en) * 2006-02-23 2013-04-09 Samsung Electronics Co., Ltd. Network intermediate device and method thereof
US20070198900A1 (en) * 2006-02-23 2007-08-23 Samsung Electronics Co., Ltd. Network intermediate device and method thereof
US20080084827A1 (en) * 2006-10-06 2008-04-10 Charles Jens Archer Method and Apparatus for Routing Data in an Inter-Nodal Communications Lattice of a Massively Parallel Computer System by Dynamic Global Mapping of Contended Links
US7835284B2 (en) 2006-10-06 2010-11-16 International Business Machines Corporation Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by routing through transporter nodes
US20080084864A1 (en) * 2006-10-06 2008-04-10 Charles Jens Archer Method and Apparatus for Routing Data in an Inter-Nodal Communications Lattice of a Massively Parallel Computer System by Semi-Randomly Varying Routing Policies for Different Packets
US20080084865A1 (en) * 2006-10-06 2008-04-10 Charles Jens Archer Method and Apparatus for Routing Data in an Inter-Nodal Communications Lattice of a Massively Parallel Computer System by Routing Through Transporter Nodes
US8031614B2 (en) 2006-10-06 2011-10-04 International Business Machines Corporation Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamic global mapping of contended links
US7839786B2 (en) * 2006-10-06 2010-11-23 International Business Machines Corporation Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by semi-randomly varying routing policies for different packets
US20080184214A1 (en) * 2007-01-30 2008-07-31 Charles Jens Archer Routing Performance Analysis and Optimization Within a Massively Parallel Computer
US8423987B2 (en) 2007-01-30 2013-04-16 International Business Machines Corporation Routing performance analysis and optimization within a massively parallel computer
US8559998B2 (en) 2007-11-05 2013-10-15 Mindspeed Technologies U.K., Limited Power control
US20090154486A1 (en) * 2007-12-13 2009-06-18 Archer Charles J Tracking Network Contention
US8055879B2 (en) 2007-12-13 2011-11-08 International Business Machines Corporation Tracking network contention
US8095721B2 (en) * 2008-02-05 2012-01-10 Alaxala Networks Corporation Network switch and method of switching in network
US20090198864A1 (en) * 2008-02-05 2009-08-06 Alaxala Networks Corporation Network switch and method of switching in network
US20110289234A1 (en) * 2008-03-04 2011-11-24 Sony Corporation Method and apparatus for managing transmission of tcp data segments
US8589586B2 (en) * 2008-03-04 2013-11-19 Sony Corporation Method and apparatus for managing transmission of TCP data segments
US8015313B2 (en) * 2008-03-04 2011-09-06 Sony Corporation Method and apparatus for managing transmission of TCP data segments
US8301799B2 (en) * 2008-03-04 2012-10-30 Sony Corporation Method and apparatus for managing transmission of TCP data segments
US8301685B2 (en) * 2008-03-04 2012-10-30 Sony Corporation Method and apparatus for managing transmission of TCP data segments
US20120278502A1 (en) * 2008-03-04 2012-11-01 Sony Corporation Method and apparatus for managing transmission of tcp data segments
US20090228602A1 (en) * 2008-03-04 2009-09-10 Timothy James Speight Method and apparatus for managing transmission of tcp data segments
US20110122816A1 (en) * 2008-03-04 2011-05-26 Sony Corporation Method and apparatus for managing transmission of tcp data segments
US9225545B2 (en) 2008-04-01 2015-12-29 International Business Machines Corporation Determining a path for network traffic between nodes in a parallel computer
US8213298B2 (en) 2009-03-12 2012-07-03 Panasonic Corporation Best path selecting device, best path selecting method, and program
US20110069612A1 (en) * 2009-03-12 2011-03-24 Takao Yamaguchi Best path selecting device, best path selecting method, and program
US8849340B2 (en) 2009-05-07 2014-09-30 Intel Corporation Methods and devices for reducing interference in an uplink
US8463312B2 (en) 2009-06-05 2013-06-11 Mindspeed Technologies U.K., Limited Method and device in a communication network
US8892154B2 (en) 2009-06-05 2014-11-18 Intel Corporation Method and device in a communication network
US9807771B2 (en) 2009-06-05 2017-10-31 Intel Corporation Method and device in a communication network
US8862076B2 (en) 2009-06-05 2014-10-14 Intel Corporation Method and device in a communication network
US8301823B2 (en) 2009-07-07 2012-10-30 Panasonic Corporation Bus controller arranged between a bus master and a networked communication bus in order to control the transmission route of a packet that flows through the communication bus, and simulation program to design such a bus controller
US8798630B2 (en) 2009-10-05 2014-08-05 Intel Corporation Femtocell base station
US20110173349A1 (en) * 2009-11-13 2011-07-14 International Business Machines Corporation I/o routing in a multidimensional torus network
US9565094B2 (en) * 2009-11-13 2017-02-07 International Business Machines Corporation I/O routing in a multidimensional torus network
US9164944B2 (en) 2010-01-25 2015-10-20 Panasonic Intellectual Property Management Co., Ltd. Semiconductor system, relay apparatus, and chip circuit
US10348609B2 (en) 2010-01-29 2019-07-09 International Business Machines Corporation I/O routing in a multidimensional torus network
US10601697B2 (en) 2010-01-29 2020-03-24 International Business Machines Corporation I/O routing in a multidimensional torus network
US9954760B2 (en) 2010-01-29 2018-04-24 International Business Machines Corporation I/O routing in a multidimensional torus network
US10979337B2 (en) 2010-01-29 2021-04-13 International Business Machines Corporation I/O routing in a multidimensional torus network
US20110307628A1 (en) * 2010-03-17 2011-12-15 Nec Corporation Communication system, node, control server, communication method and program
US9075747B2 (en) 2010-05-27 2015-07-07 Panasonic Intellectual Property Management Co., Ltd. Bus controller and control unit that outputs instruction to the bus controller
US9107136B2 (en) 2010-08-16 2015-08-11 Intel Corporation Femtocell access control
US8949453B2 (en) 2010-11-30 2015-02-03 International Business Machines Corporation Data communications in a parallel active messaging interface of a parallel computer
US8891371B2 (en) 2010-11-30 2014-11-18 International Business Machines Corporation Data communications in a parallel active messaging interface of a parallel computer
US10856302B2 (en) 2011-04-05 2020-12-01 Intel Corporation Multimode base station
US9042434B2 (en) 2011-04-05 2015-05-26 Intel Corporation Filter
US8712469B2 (en) 2011-05-16 2014-04-29 Mindspeed Technologies U.K., Limited Accessing a base station
US9122840B2 (en) 2011-07-13 2015-09-01 International Business Machines Corporation Performing collective operations in a distributed processing system
US8949328B2 (en) 2011-07-13 2015-02-03 International Business Machines Corporation Performing collective operations in a distributed processing system
US9264371B2 (en) 2011-10-14 2016-02-16 Panasonic Intellectual Property Management Co., Ltd. Router, method for controlling the router, and computer program
US8930962B2 (en) 2012-02-22 2015-01-06 International Business Machines Corporation Processing unexpected messages at a compute node of a parallel computer
CN104583962B (en) * 2012-09-29 2018-05-08 英特尔公司 Technology for elasticity communication
US8990662B2 (en) 2012-09-29 2015-03-24 Intel Corporation Techniques for resilient communication
CN104583962A (en) * 2012-09-29 2015-04-29 英特尔公司 Techniques for resilient communication
WO2014051746A1 (en) * 2012-09-29 2014-04-03 Intel Corporation Techniques for resilient communication
US20170288814A1 (en) * 2014-10-09 2017-10-05 Hewlett Packard Enterprise Development Lp A transmitter that does not resend a packet despite receipt of a message to resend the packet
US10789115B2 (en) * 2014-10-09 2020-09-29 Hewlett Packard Enterprise Development Lp Transmitter that does not resend a packet despite receipt of a message to resend the packet
US9860183B2 (en) 2015-09-25 2018-01-02 Fsa Technologies, Inc. Data redirection in a bifurcated communication trunk system and method
US9900258B2 (en) 2015-09-25 2018-02-20 Fsa Technologies, Inc. Multi-trunk data flow regulation system and method

Also Published As

Publication number Publication date
EP0823165A1 (en) 1998-02-11
DE69735740D1 (en) 2006-06-01
JPH11511634A (en) 1999-10-05
JP3816531B2 (en) 2006-08-30
DE69735740T2 (en) 2006-09-14
WO1997031464A1 (en) 1997-08-28
EP0823165B1 (en) 2006-04-26

Similar Documents

Publication Publication Date Title
US5959995A (en) Asynchronous packet switching
US6393023B1 (en) System and method for acknowledging receipt of messages within a packet based communication network
US6952419B1 (en) High performance transmission link and interconnect
JP3739798B2 (en) System and method for dynamic network topology exploration
CA2564363C (en) Method and apparatus for group communication with end-to-end reliability
JP3816529B2 (en) Interconnect failure detection and location method and apparatus
US6661773B1 (en) Method for detection of stale cells following route changes in a data communication
KR0170500B1 (en) Multiprocessor system
JP2825120B2 (en) Method and communication network for multicast transmission
US6545981B1 (en) System and method for implementing error detection and recovery in a system area network
JP2540930B2 (en) Congestion control device
US7876751B2 (en) Reliable link layer packet retry
US6003064A (en) System and method for controlling data transmission between network elements
Sanders et al. The Xpress transfer protocol (XTP)—a tutorial
US20090028172A1 (en) Speculative forwarding in a high-radix router
US20090059928A1 (en) Communication apparatus, communication system, absent packet detecting method and absent packet detecting program
EP1139602A1 (en) Method and device for multicasting
US6339796B1 (en) System for logical connection resynchronization
US20080107116A1 (en) Large scale multi-processor system with a link-level interconnect providing in-order packet delivery
US6230283B1 (en) Logical connection resynchronization
CA3221912A1 (en) Method for distributing multipath flows in a direct interconnect network
US6237111B1 (en) Method for logical connection resynchronization
Kumar et al. Adaptive fault tolerant routing in interconnection networks: a review
WO2024022243A1 (en) Data transmission method, network device, computer device, and storage medium
JPH06252895A (en) Data transmission system

Legal Events

Date Code Title Description
AS Assignment

Owner name: HAL COMPUTER SYSTEMS, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WICKI, THOMAS M.;HELLAND, PATRICK J.;SHIMIZU, TAKESHI;AND OTHERS;REEL/FRAME:008224/0995;SIGNING DATES FROM 19960222 TO 19961030

AS Assignment

Owner name: FUJITSU, LTD., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HAL COMPUTER SYSTEMS, INC.;REEL/FRAME:008638/0065

Effective date: 19970719

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

CC Certificate of correction
CC Certificate of correction
FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12