US20040039895A1 - Memory shared between processing threads - Google Patents

Publication number
US20040039895A1
Authority
US
United States
Prior art keywords
stack
datum
pointer
command
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/644,337
Inventor
Gilbert Wolrich
Matthew Adiletta
William Wheeler
Daniel Cutter
Debra Bernstein
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp
Priority to US10/644,337
Publication of US20040039895A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003: Arrangements for executing specific machine instructions
    • G06F9/3004: Arrangements for executing specific machine instructions to perform operations on memory

Definitions

  • the invention relates to memory shared between processing threads.
  • a computer thread is a sequence or stream of computer instructions that performs a task.
  • a computer thread is associated with a set of resources or a context.
  • a method includes pushing a datum onto a stack by a first processor and popping the datum off the stack by the second processor.
  • FIG. 1 is a block diagram of a system employing a hardware-based multi-threaded processor.
  • FIG. 2 is a block diagram of a MicroEngine employed in the hardware-based multi-threaded processor of FIG. 1.
  • FIG. 3 is a block diagram showing instruction sets of two threads that are executed on the MicroEngines of FIGS. 1 and 2.
  • FIG. 4 is a simplified block diagram of the system of FIG. 1 showing selected sub-systems of the processor including a stack module.
  • FIG. 5A is a block diagram showing the memory components of the stack module of FIG. 4.
  • FIG. 5B is a block diagram showing the memory components of an alternate implementation of the stack module of FIG. 4.
  • FIG. 6A is a flow chart of the process of popping a datum from the memory components of FIG. 5A.
  • FIG. 6B is a block diagram showing the memory components of FIG. 5A after the popping process of FIG. 6A.
  • FIG. 7A is a flow chart of the process of pushing a datum on the memory components of FIG. 6B.
  • FIG. 7B is a block diagram showing the memory components of FIG. 6B after the pushing process of FIG. 7A.
  • FIG. 8 is a block diagram showing memory components used to implement two stacks in one stack module.
  • a system 10 includes a parallel, hardware-based multithreaded processor 12.
  • the hardware-based multithreaded processor 12 is coupled to a bus 14, a memory system 16 and a second bus 18.
  • the bus 14 complies with the Peripheral Component Interconnect Interface, revision 2.1, issued Jun. 1, 1995 (PCI).
  • the system 10 is especially useful for tasks that can be broken into parallel subtasks or functions.
  • hardware-based multithreaded processor 12 is useful for tasks that are bandwidth oriented rather than latency oriented.
  • the hardware-based multithreaded processor 12 has multiple MicroEngines 22 each with multiple hardware controlled threads that can be simultaneously active and independently work on a task.
  • the hardware-based multithreaded processor 12 also includes a central controller 20 that assists in loading microcode control for other resources of the hardware-based multithreaded processor 12 and performs other general-purpose computer type functions such as handling protocols, exceptions, and extra support for packet processing where the MicroEngines pass the packets off for more detailed processing such as in boundary conditions.
  • the processor 20 is a StrongArm (TM) (StrongArm is a trademark of ARM Limited, United Kingdom) based architecture.
  • the general-purpose microprocessor 20 has an operating system. Through the operating system, the processor 20 can call functions to operate on MicroEngines 22a-22f.
  • the processor 20 can use any supported operating system, preferably a real-time operating system.
  • the hardware-based multithreaded processor 12 also includes a plurality of functional MicroEngines 22a-22f.
  • Functional MicroEngines (MicroEngines) 22a-22f each maintain a plurality of program counters in hardware and states associated with the program counters. Effectively, a corresponding plurality of sets of threads can be simultaneously active on each of the MicroEngines 22a-22f while only one is actually operating at any one time.
  • each MicroEngine 22a-22f has capabilities for processing four hardware threads.
  • the six MicroEngines 22a-22f operate with shared resources including memory system 16 and bus interfaces 24 and 28.
  • the memory system 16 includes a Synchronous Dynamic Random Access Memory (SDRAM) controller 26a and a Static Random Access Memory (SRAM) controller 26b.
  • SDRAM memory 16a and SDRAM controller 26a are typically used for processing large volumes of data, e.g., processing of network payloads from network packets.
  • the SRAM controller 26b and SRAM memory 16b are used in a networking implementation for low latency, fast access tasks, e.g., accessing look-up tables, memory for the core processor 20, and so forth.
  • the six MicroEngines 22a-22f access either the SDRAM 16a or SRAM 16b based on characteristics of the data. Thus, low latency, low bandwidth data is stored in and fetched from SRAM, whereas higher bandwidth data, for which latency is not as important, is stored in and fetched from SDRAM.
  • the MicroEngines 22a-22f can execute memory reference instructions to either the SDRAM controller 26a or SRAM controller 26b.
  • Advantages of hardware multithreading can be explained by SRAM or SDRAM memory accesses.
  • As an example, an SRAM access requested by a Thread_0, from a MicroEngine, will cause the SRAM controller 26b to initiate an access to the SRAM memory 16b.
  • the SRAM controller controls arbitration for the SRAM bus, accesses the SRAM 16b, fetches the data from the SRAM 16b, and returns data to a requesting MicroEngine 22a-22f.
  • During an SRAM access, if the MicroEngine, e.g., 22a, had only a single thread that could operate, that MicroEngine would be dormant until data was returned from the SRAM.
  • Thus, another thread, e.g., Thread_1, can function while the first thread, e.g., Thread_0, is awaiting the read data to return.
  • During execution, Thread_1 may access the SDRAM memory 16a.
  • While Thread_1 operates on the SDRAM unit, and Thread_0 is operating on the SRAM unit, a new thread, e.g., Thread_2, can now operate in the MicroEngine 22a.
  • Thread_2 can operate for a certain amount of time until it needs to access memory or perform some other long latency operation, such as making an access to a bus interface. Therefore, simultaneously, the processor 12 can have a bus operation, SRAM operation and SDRAM operation all being completed or operated upon by one MicroEngine 22a and have one more thread available to process more work in the data path.
  • the hardware context swapping also synchronizes completion of tasks. For example, two threads could hit the same shared resource, e.g., SRAM.
  • Each one of these separate functional units, e.g., the FBUS interface 28, the SRAM controller 26b, and the SDRAM controller 26a, reports back a flag signaling completion of an operation when it completes a requested task from one of the MicroEngine thread contexts.
  • When the MicroEngine receives the flag, the MicroEngine can determine which thread to turn on.
  • One example of an application for the hardware-based multithreaded processor 12 is as a network processor.
  • the hardware-based multithreaded processor 12 interfaces to network devices such as a media access controller device, e.g., a 10/100BaseT Octal MAC 13a or a Gigabit Ethernet device 13b.
  • the Gigabit Ethernet device 13b complies with the IEEE 802.3z standard, approved in June 1998.
  • the hardware-based multithreaded processor 12 can interface to any type of communication device or interface that receives/sends large amounts of data.
  • Communication system 10 functioning in a networking application could receive a plurality of network packets from the devices 13a, 13b and process those packets in a parallel manner. With the hardware-based multithreaded processor 12, each network packet can be independently processed.
  • Another example for use of processor 12 is as a print engine for a PostScript processor or as a processor for a storage subsystem, i.e., RAID disk storage.
  • a further use is as a matching engine.
  • the advent of electronic trading requires the use of electronic matching engines to match orders between buyers and sellers. These and other parallel types of tasks can be accomplished on the system 10 .
  • the processor 12 includes a bus interface 28 that couples the processor to the second bus 18.
  • Bus interface 28 in one embodiment couples the processor 12 to the so-called FBUS 18 (FIFO bus).
  • the FBUS interface 28 is responsible for controlling and interfacing the processor 12 to the FBUS 18.
  • the FBUS 18 is a 64-bit wide FIFO bus, used to interface to Media Access Controller (MAC) devices.
  • the processor 12 includes a second interface, e.g., a PCI bus interface 24, that couples other system components that reside on the PCI bus 14 to the processor 12.
  • the PCI bus interface 24 provides a high-speed data path 24a to memory 16, e.g., the SDRAM memory 16a. Through that path, data can be moved quickly from the SDRAM 16a through the PCI bus 14 via direct memory access (DMA) transfers.
  • the hardware based multithreaded processor 12 supports image transfers.
  • the hardware based multithreaded processor 12 can employ a plurality of DMA channels so if one target of a DMA transfer is busy, another one of the DMA channels can take over the PCI bus to deliver information to another target to maintain high processor 12 efficiency.
  • the PCI bus interface 24 supports target and master operations.
  • Target operations are operations where slave devices on bus 14 access SDRAMs through reads and writes that are serviced as a slave to target operation.
  • In master operations, the processor core 20 sends data directly to or receives data directly from the PCI interface 24.
  • Each of the functional units is coupled to one or more internal buses.
  • the internal buses are dual, 32 bit buses (i.e., one bus for read and one for write).
  • the hardware-based multithreaded processor 12 also is constructed such that the sum of the bandwidths of the internal buses in the processor 12 exceeds the bandwidth of external buses coupled to the processor 12 .
  • the processor 12 includes an internal core processor bus 32 , e.g., an ASB bus (Advanced System Bus) that couples the processor core 20 to the memory controller 26 a , 26 c and to an ASB translator 30 described below.
  • the ASB bus is a subset of the so-called AMBA bus that is used with the Strong Arm processor core.
  • the processor 12 also includes a private bus 34 that couples the MicroEngine units to SRAM controller 26 b , ASB translator 30 and FBUS interface 28 .
  • a memory bus 38 couples the memory controller 26 a , 26 b to the bus interfaces 24 and 28 and memory system 16 including flashrom 16 c used for boot operations and so forth.
  • the MicroEngine includes a control store 70, which, in one implementation, includes a RAM of, here, 1,024 words of 32 bits.
  • the RAM stores a microprogram.
  • the microprogram is loadable by the core processor 20 .
  • the MicroEngine 22f also includes controller logic 72.
  • the controller logic includes an instruction decoder 73 and program counter (PC) units 72a-72d.
  • the four micro program counters 72a-72d are maintained in hardware.
  • the MicroEngine 22f also includes context event switching logic 74.
  • Context event logic 74 receives messages (e.g., SEQ_#_EVENT_RESPONSE; FBI_EVENT_RESPONSE; SRAM_EVENT_RESPONSE; SDRAM_EVENT_RESPONSE; and ASB_EVENT_RESPONSE) from each one of the shared resources, e.g., SRAM 26b, SDRAM 26a, or processor core 20, control and status registers, and so forth. These messages provide information on whether a requested function has completed. Based on whether or not a function requested by a thread has completed and signaled completion, the thread needs to wait for that completion signal, and if the thread is enabled to operate, then the thread is placed on an available thread list (not shown).
  • the MicroEngine 22f can have a maximum of, e.g., 4 threads available.
  • In addition to event signals that are local to an executing thread, the MicroEngines 22 employ signaling states that are global. With signaling states, an executing thread can broadcast a signal state to all MicroEngines 22. Any and all threads in the MicroEngines can branch on these signaling states, e.g., a Receive Request Available signal. These signaling states can be used to determine availability of a resource or whether a resource is due for servicing.
  • the context event logic 74 has arbitration for the four (4) threads. In one embodiment, the arbitration is a round robin mechanism. Other techniques could be used including priority queuing or weighted fair queuing.
  • the MicroEngine 22f also includes an execution box (EBOX) data path 76 that includes an arithmetic logic unit 76a and general-purpose register set 76b.
  • the arithmetic logic unit 76a performs arithmetic and logical functions as well as shift functions.
  • the register set 76b has a relatively large number of general-purpose registers. As will be described in FIG. 6, in this implementation there are 64 general-purpose registers in a first bank, Bank A, and 64 in a second bank, Bank B.
  • the general-purpose registers are windowed as will be described so that they are relatively and absolutely addressable.
  • the MicroEngine 22f also includes a write transfer register 78 and a read transfer register 80. These registers are also windowed so that they are relatively and absolutely addressable.
  • Write transfer register 78 is where write data to a resource is located.
  • Read transfer register 80 is for return data from a shared resource. Subsequent to or concurrent with data arrival, an event signal from the respective shared resource, e.g., the SRAM controller 26b, SDRAM controller 26a or core processor 20, will be provided to context event arbiter 74, which will then alert the thread that the data is available or has been sent.
  • Both transfer register banks 78 and 80 are connected to the execution box (EBOX) 76 through a data path.
  • the read transfer register has 64 registers and the write transfer register has 64 registers.
  • processor 12 has processing threads 41 and 42 executing in MicroEngines 22a and 22b respectively. In other instances, the threads 41 and 42 may be executed on the same MicroEngine. The processing threads may or may not share data between them. For example, in FIG. 3, processing thread 41 receives data 43 and processes it to produce data 44. Processing thread 42 receives and processes the data 44 to produce output data 45. Threads 41 and 42 are concurrently active.
  • Because the MicroEngines 22a and 22b share SDRAM 16a and SRAM 16b (memory), one MicroEngine 22a may need to designate sections of memory for its exclusive use.
  • the SDRAM memory is divided into memory segments, referred to as buffers.
  • the memory locations in a buffer share a common address prefix, or pointer.
  • the pointer is used by the processor as an identifier for a buffer.
  • Pointers to buffers that are not currently in use by a processing thread are managed by pushing the pointers onto a free memory stack.
  • a thread can allocate a buffer for use by the thread by popping a pointer off the stack, and using the pointer to access the corresponding buffer.
  • When the thread no longer needs the buffer, the thread pushes the pointer to the buffer onto the stack to make the buffer available to other threads.
  • the threads 41 and 42 have processor instruction sets 46, 47 that respectively include a “PUSH” 46a and a “POP” 47a instruction. When either the “PUSH” or the “POP” instruction is executed, the instruction is transmitted to a logical stack module 56 (FIG. 4).
  • a section of the processor 9 and SRAM 16 b provide the logical stack module 56 .
  • the logical stack module is implemented as a linked list of SRAM addresses. Each SRAM address on the linked list contains the address of the next item on the list. As a result, if you have the address of the first item on the list, you can read the contents of that address to find the address of the next item on the list, and so on. Additionally, each address on the linked list is associated with a corresponding memory buffer.
  • the stack module 56 is used to implement a linked list of memory buffers. While in use, the linked list allows the stack to increase or decrease in size as needed.
  • the stack module 56 includes control logic 51 on the SRAM unit 26 b .
  • the control logic 51 performs the necessary operations on the stack while SRAM 16 b stores the contents of the stack.
  • One of SRAM registers 50 is used to store the address of the first SRAM location on the stack. The address is also a pointer to the first buffer on the stack.
  • Thread 41 and thread 42 may be implemented as two operating system threads which execute “PUSH” and “POP” operating system commands to allocate memory from a shared memory pool.
  • the operating system commands may include calls to a library of functions written in the “C” programming language.
  • the equivalents of the control logic 51 , the SRAM registers 50 and SRAM 16 B are implemented using software within the operating system.
  • the software may be stored in a hard disk, a floppy disk, computer memory, or other computer readable medium.
  • SRAM register Q1 stores an address (0xC5) of the first item on the stack 60.
  • the SRAM location (0xC5) of the first item on the stack 60 is used to store the SRAM address (0xA1) of the second item on the stack 60.
  • the SRAM location (0xA1) of the second item on the stack 60 is used to store the address of the third item on the stack 60, etc.
  • the SRAM location (0xE9) of the last item on the stack stores a pre-determined invalid address (0x00), which indicates the end of the linked list.
  • the addresses of the items (0xC5, 0xA1, and 0xE9) on the stack 60 are pointers to stack buffers 61a, 61b, 61c contained within SDRAM 16a.
  • a pointer to a buffer is pushed onto the stack by thread 41 , so that the buffer is available for use by other processing threads.
  • a buffer is popped by thread 42 to allocate the buffer for use by thread 42 .
  • the pointers are used as an address base to access memory locations in the buffers.
  • In addition to stack buffers 61a-c, SDRAM 16a also contains processing buffer 62, which is allocated to thread 41.
  • the pointer to processing buffer 62 is not on the stack because it is not available for allocation by other threads. Thread 41 may later push a pointer to the processing buffer 62 onto the stack when it no longer needs the buffer 62.
  • Although the stack will be discussed with reference to the buffer management scheme above, it can be used without buffers.
  • the SRAM locations 0xC5, 0xA1, and 0xE9 may, respectively, contain data 70a, 70b, and 70c in addition to an address to the next item on the list.
  • Such a scheme may be used to store smaller units of data 70a-c on the stack.
  • the control logic would assign a memory location within the SRAM for storing the unit of data (datum) that is to be pushed onto the stack.
  • the datum pushed onto the stack may be text, numerical data, or even an address or pointer to another memory location.
  • Referring to FIG. 6A, to pop a datum off the stack stored in SRAM register Q1, thread 42 executes 101 the instruction “POP #1”.
  • the pop instruction is part of the instruction set of the MicroEngines 22.
  • the pop instruction is transmitted to control logic 51 over bus 55 for stack processing.
  • Control logic 51 decodes 102 the pop instruction.
  • the control logic also determines 103 the register that contains a pointer to the stack that is referred to in the instruction based on the argument of the pop instruction. Since the argument to the pop instruction is “#1”, the corresponding register is Q1.
  • the control logic 51 returns 104 the contents of the Q1 register to the context of processing thread 42.
  • the stack of FIG. 5A would return “0xC5”.
  • Processing thread 42 receives 107 the contents of the Q1 register, which is “0xC5”, and uses 108 the received content to access data from the corresponding stack buffer 61b by appending a suffix to the content.
  • Control logic 27 reads 105 the content (0xA1) of the address (0xC5) stored in the Q1 register. Control logic 27 stores 106 the read content (0xA1) in the Q1 register to indicate that 0xC5 has been removed from the stack and 0xA1 is now the item at the top of the stack.
  • the register Q1 now contains the address 0xA1, which was previously the address of the second item on the stack. Additionally, the location that was previously stack buffer 61b (in FIG. 5A) is now processing buffer 65, which is used by thread 42. Thus, thread 42 has removed stack buffer 61b from the stack 60 and allocated the buffer 61b for its own use.
  • Thread 41 pushes processing buffer 62 (shown in FIG. 6B) onto the stack by executing 201 the instruction “PUSH #1 0x01”.
  • the argument 0x01 is a pointer to the buffer 62 because it is a prefix that is common to the address space of the locations in the buffer.
  • the push instruction is transmitted to control logic 51 over the bus 55.
  • Upon receiving the push instruction, the control logic 51 decodes 202 the instruction and determines 203 the SRAM register corresponding to the instruction, based on the second argument of the push instruction. Since the second argument is “#1”, the corresponding register is Q1. The control logic 51 determines the address to be pushed from the third argument (0x01) of the push instruction. The control logic determines 205 the content of the Q1 register by reading the value of the register location. The value 0xA1 is the content of the Q1 register in the stack of FIG. 6B. The control logic stores 206 the content (0xA1) of the Q1 register in the SRAM location whose address is the push address (0x01). The control logic then stores 207 the push address (0x01) in the Q1 register.
  • the SRAM register Q1 contains the address of the first location on the stack, which is now 0x01.
  • the address of the first location on the stack is also the address of stack buffer 61d, which was previously a processing buffer 62 used by thread 41.
  • the location 0xA1, which was previously the first item on the stack, is now the second item on the stack.
  • thread 41 adds stack buffer 61d onto the stack to make it available for allocation to other threads.
  • Thread 42 can later allocate the stack buffer 61d for its own use by popping it off the stack, as previously described for FIG. 6A.
  • a second stack 60b may be implemented in the same stack module by using a second SRAM control register to store the address of the first element in the second stack 60b.
  • the second stack may be used to manage a separate set of memory buffers, for example, within SRAM 16b or SDRAM 16a.
  • a first stack 60a has the address of the first element on the stack 60a stored in SRAM register Q1.
  • a second stack 60b has the address of its first element stored in register Q6.
  • the first stack 60a is identical to the stack 60 in FIG. 7B.
  • the second stack 60 b is similar to previously described stacks.
  • Although stack 60 (shown in FIG. 5A) stores the pointer to the first element in a register Q1, the linked list in SRAM 16b and the buffers in SDRAM 16a, any of the stack module elements could be stored in any memory location. For example, they could all be stored in SRAM 16b or SDRAM 16a.
  • For example, a short pointer is a prefix to more addresses and is, therefore, a pointer to a larger address buffer.
  • the stack may be used to manage resources other than buffers.
  • One possible application of the stack might be to store pointers to the contexts of active threads that are not currently operating.
  • When MicroEngine 22a temporarily sets aside a first active thread to process a second active thread, it stores the context of the first active thread in a memory buffer and pushes a pointer to that buffer on the stack. Any MicroEngine can resume the processing of the first active thread by popping the pointer to the memory buffer containing the context of the first thread and loading that context.
  • the stack can be used to manage the processing of multiple concurrent active threads by multiple processing engines.
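
The push and pop sequences described for FIGS. 5A to 7B can be traced with a short software model. The following is a minimal sketch in Python, not the hardware implementation: the class name StackModule, its methods, the use of dictionaries for the SRAM and the registers, the behaviour of a pop on an empty stack, and the address 0xB2 used for the second stack are all assumptions made for illustration. The addresses 0xC5, 0xA1, 0xE9, 0x01 and the end marker 0x00 are the ones used in the text.

```python
END_OF_LIST = 0x00  # the pre-determined invalid address that terminates the list


class StackModule:
    """Toy model: a register holds each stack's head; SRAM holds the links."""

    def __init__(self):
        self.registers = {}  # register number (1 for Q1, 6 for Q6) -> head address
        self.sram = {}       # SRAM address -> address of the next item on the list

    def push(self, reg, address):
        """PUSH #reg address: store the old head at `address`, then make it the head."""
        self.sram[address] = self.registers.get(reg, END_OF_LIST)
        self.registers[reg] = address

    def pop(self, reg):
        """POP #reg: return the head address and advance the register to the next item."""
        head = self.registers.get(reg, END_OF_LIST)
        if head == END_OF_LIST:
            return None  # assumption: the text does not say what an empty pop returns
        self.registers[reg] = self.sram[head]
        return head

    def items(self, reg):
        """Walk the linked list from the head to the end marker (inspection helper)."""
        found = []
        address = self.registers.get(reg, END_OF_LIST)
        while address != END_OF_LIST:
            found.append(address)
            address = self.sram[address]
        return found


module = StackModule()
for address in (0xE9, 0xA1, 0xC5):   # build the FIG. 5A stack, last item first
    module.push(1, address)
fig_5a = module.items(1)             # [0xC5, 0xA1, 0xE9]

popped = module.pop(1)               # thread 42 executes POP #1 and receives 0xC5
after_pop = module.items(1)          # [0xA1, 0xE9]

module.push(1, 0x01)                 # thread 41 executes PUSH #1 0x01
after_push = module.items(1)         # [0x01, 0xA1, 0xE9]

module.push(6, 0xB2)                 # second stack headed by Q6; 0xB2 is a made-up address
```

Each operation touches only the head register and one SRAM location, which is why the stack can grow or shrink without moving any other item. The sketch runs in a single thread of control; it does not model how the control logic orders commands arriving from different threads.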

Abstract

A method includes pushing a datum onto a stack by a first processor and popping the datum off the stack by a second processor.

Description

    BACKGROUND
  • The invention relates to memory shared between processing threads. [0001]
  • A computer thread is a sequence or stream of computer instructions that performs a task. A computer thread is associated with a set of resources or a context. [0002]
  • SUMMARY
  • In one general aspect of the invention, a method includes pushing a datum onto a stack by a first processor and popping the datum off the stack by the second processor. [0003]
  • Advantages and other features of the invention will become apparent from the following description and from the claims.[0004]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram of a system employing a hardware-based multi-threaded processor. [0005]
  • FIG. 2 is a block diagram of a MicroEngine employed in the hardware-based multi-threaded processor of FIG. 1. [0006]
  • FIG. 3 is a block diagram showing instruction sets of two threads that are executed on the MicroEngines of FIGS. 1 and 2. [0007]
  • FIG. 4 is a simplified block diagram of the system of FIG. 1 showing selected sub-systems of the processor including a stack module. [0008]
  • FIG. 5A is a block diagram showing the memory components of the stack module of FIG. 4. [0009]
  • FIG. 5B is a block diagram showing the memory components of an alternate implementation of the stack module of FIG. 4. [0010]
  • FIG. 6A is a flow chart of the process of popping a datum from the memory components of FIG. 5A. [0011]
  • FIG. 6B is a block diagram showing the memory components of FIG. 5A after the popping process of FIG. 6A. [0012]
  • FIG. 7A is a flow chart of the process of pushing a datum on the memory components of FIG. 6B. [0013]
  • FIG. 7B is a block diagram showing the memory components of FIG. 6B after the pushing process of FIG. 7A. [0014]
  • FIG. 8 is a block diagram showing memory components used to implement two stacks in one stack module.[0015]
  • DETAILED DESCRIPTION
  • Referring to FIG. 1, a system 10 includes a parallel, hardware-based multithreaded processor 12. The hardware-based multithreaded processor 12 is coupled to a bus 14, a memory system 16 and a second bus 18. The bus 14 complies with the Peripheral Component Interconnect Interface, revision 2.1, issued Jun. 1, 1995 (PCI). The system 10 is especially useful for tasks that can be broken into parallel subtasks or functions. Specifically, the hardware-based multithreaded processor 12 is useful for tasks that are bandwidth oriented rather than latency oriented. The hardware-based multithreaded processor 12 has multiple MicroEngines 22, each with multiple hardware controlled threads that can be simultaneously active and independently work on a task. [0016]
  • The hardware-based multithreaded processor 12 also includes a central controller 20 that assists in loading microcode control for other resources of the hardware-based multithreaded processor 12 and performs other general-purpose computer type functions such as handling protocols, exceptions, and extra support for packet processing where the MicroEngines pass the packets off for more detailed processing such as in boundary conditions. In one embodiment, the processor 20 is a StrongArm (TM) (StrongArm is a trademark of ARM Limited, United Kingdom) based architecture. The general-purpose microprocessor 20 has an operating system. Through the operating system, the processor 20 can call functions to operate on MicroEngines 22a-22f. The processor 20 can use any supported operating system, preferably a real-time operating system. For the core processor implemented as a StrongArm architecture, operating systems such as Microsoft NT real-time, VXWorks, and μC/OS, a freeware operating system available over the Internet at http://www.ucos-ii.com/, can be used. [0017]
  • The hardware-based multithreaded processor 12 also includes a plurality of functional MicroEngines 22a-22f. Functional MicroEngines (MicroEngines) 22a-22f each maintain a plurality of program counters in hardware and states associated with the program counters. Effectively, a corresponding plurality of sets of threads can be simultaneously active on each of the MicroEngines 22a-22f while only one is actually operating at any one time. [0018]
  • In one embodiment, there are six MicroEngines 22a-22f as shown. Each MicroEngine 22a-22f has capabilities for processing four hardware threads. The six MicroEngines 22a-22f operate with shared resources including memory system 16 and bus interfaces 24 and 28. The memory system 16 includes a Synchronous Dynamic Random Access Memory (SDRAM) controller 26a and a Static Random Access Memory (SRAM) controller 26b. SDRAM memory 16a and SDRAM controller 26a are typically used for processing large volumes of data, e.g., processing of network payloads from network packets. The SRAM controller 26b and SRAM memory 16b are used in a networking implementation for low latency, fast access tasks, e.g., accessing look-up tables, memory for the core processor 20, and so forth. [0019]
  • [0020] The six MicroEngines 22 a-22 f access either the SDRAM 16 a or SRAM 16 b based on characteristics of the data. Thus, low latency, low bandwidth data is stored in and fetched from SRAM, whereas higher bandwidth data for which latency is not as important is stored in and fetched from SDRAM. The MicroEngines 22 a-22 f can execute memory reference instructions to either the SDRAM controller 26 a or SRAM controller 26 b.
  • [0021] Advantages of hardware multithreading can be explained by SRAM or SDRAM memory accesses. As an example, an SRAM access requested by a Thread_0, from a MicroEngine, will cause the SRAM controller 26 b to initiate an access to the SRAM memory 16 b. The SRAM controller controls arbitration for the SRAM bus, accesses the SRAM 16 b, fetches the data from the SRAM 16 b, and returns data to a requesting MicroEngine 22 a-22 f. During an SRAM access, if the MicroEngine, e.g., 22 a, had only a single thread that could operate, that MicroEngine would be dormant until data was returned from the SRAM. Hardware context swapping within each of the MicroEngines 22 a-22 f enables other contexts with unique program counters to execute in that same MicroEngine. Thus, another thread, e.g., Thread_1, can function while the first thread, e.g., Thread_0, is awaiting the read data to return. During execution, Thread_1 may access the SDRAM memory 16 a. While Thread_1 operates on the SDRAM unit, and Thread_0 is operating on the SRAM unit, a new thread, e.g., Thread_2, can now operate in the MicroEngine 22 a. Thread_2 can operate for a certain amount of time until it needs to access memory or perform some other long latency operation, such as making an access to a bus interface. Therefore, simultaneously, the processor 12 can have a bus operation, SRAM operation and SDRAM operation all being completed or operated upon by one MicroEngine 22 a and have one more thread available to process more work in the data path.
  • [0022] The hardware context swapping also synchronizes completion of tasks. For example, two threads could hit the same shared resource, e.g., SRAM. Each one of these separate functional units, e.g., the FBUS interface 28, the SRAM controller 26 b, and the SDRAM controller 26 a, when it completes a requested task from one of the MicroEngine thread contexts, reports back a flag signaling completion of an operation. When the MicroEngine receives the flag, the MicroEngine can determine which thread to turn on.
  • [0023] One example of an application for the hardware-based multithreaded processor 12 is as a network processor. As a network processor, the hardware-based multithreaded processor 12 interfaces to network devices such as a media access controller device, e.g., a 10/100BaseT Octal MAC 13 a or a Gigabit Ethernet device 13 b. The Gigabit Ethernet device 13 b complies with the IEEE 802.3z standard, approved in June 1998. In general, as a network processor, the hardware-based multithreaded processor 12 can interface to any type of communication device or interface that receives/sends large amounts of data. Communication system 10 functioning in a networking application could receive a plurality of network packets from the devices 13 a, 13 b and process those packets in a parallel manner. With the hardware-based multithreaded processor 12, each network packet can be independently processed.
  • [0024] Another example for use of processor 12 is as a print engine for a postscript processor or as a processor for a storage subsystem, e.g., RAID disk storage. A further use is as a matching engine. In the securities industry, for example, the advent of electronic trading requires the use of electronic matching engines to match orders between buyers and sellers. These and other parallel types of tasks can be accomplished on the system 10.
  • [0025] The processor 12 includes a bus interface 28 that couples the processor to the second bus 18. Bus interface 28 in one embodiment couples the processor 12 to the so-called FBUS 18 (FIFO bus). The FBUS interface 28 is responsible for controlling and interfacing the processor 12 to the FBUS 18. The FBUS 18 is a 64-bit wide FIFO bus, used to interface to Media Access Controller (MAC) devices.
  • [0026] The processor 12 includes a second interface, e.g., a PCI bus interface 24, that couples other system components that reside on the PCI bus 14 to the processor 12. The PCI bus interface 24 provides a high-speed data path 24 a to memory 16, e.g., the SDRAM memory 16 a. Through that path data can be moved quickly from the SDRAM 16 a through the PCI bus 14, via direct memory access (DMA) transfers. The hardware-based multithreaded processor 12 supports image transfers. The hardware-based multithreaded processor 12 can employ a plurality of DMA channels so if one target of a DMA transfer is busy, another one of the DMA channels can take over the PCI bus to deliver information to another target to maintain high processor 12 efficiency. Additionally, the PCI bus interface 24 supports target and master operations. Target operations are operations where slave devices on bus 14 access SDRAMs through reads and writes that are serviced as a slave to target operation. In master operations, the processor core 20 sends data directly to or receives data directly from the PCI interface 24.
  • [0027] Each of the functional units is coupled to one or more internal buses. As described below, the internal buses are dual, 32-bit buses (i.e., one bus for read and one for write). The hardware-based multithreaded processor 12 also is constructed such that the sum of the bandwidths of the internal buses in the processor 12 exceeds the bandwidth of external buses coupled to the processor 12. The processor 12 includes an internal core processor bus 32, e.g., an ASB bus (Advanced System Bus), that couples the processor core 20 to the memory controller 26 a, 26 c and to an ASB translator 30 described below. The ASB bus is a subset of the so-called AMBA bus that is used with the StrongArm processor core. The processor 12 also includes a private bus 34 that couples the MicroEngine units to SRAM controller 26 b, ASB translator 30 and FBUS interface 28. A memory bus 38 couples the memory controller 26 a, 26 b to the bus interfaces 24 and 28 and memory system 16 including flash ROM 16 c used for boot operations and so forth.
  • [0028] Referring to FIG. 2, an exemplary one of the MicroEngines 22 a-22 f, e.g., MicroEngine 22 f, is shown. The MicroEngine includes a control store 70, which, in one implementation, includes a RAM of, here, 1,024 words of 32 bits. The RAM stores a microprogram. The microprogram is loadable by the core processor 20. The MicroEngine 22 f also includes controller logic 72. The controller logic includes an instruction decoder 73 and program counter (PC) units 72 a-72 d. The four micro program counters 72 a-72 d are maintained in hardware. The MicroEngine 22 f also includes context event switching logic 74. Context event logic 74 receives messages (e.g., SEQ_#_EVENT_RESPONSE; FBI_EVENT_RESPONSE; SRAM_EVENT_RESPONSE; SDRAM_EVENT_RESPONSE; and ASB_EVENT_RESPONSE) from each one of the shared resources, e.g., SRAM 26 b, SDRAM 26 a, or processor core 20, control and status registers, and so forth. These messages provide information on whether a requested function has completed. Based on whether or not a function requested by a thread has completed and signaled completion, the thread needs to wait for that completion signal, and if the thread is enabled to operate, then the thread is placed on an available thread list (not shown). The MicroEngine 22 f can have a maximum of, e.g., 4 threads available.
  • [0029] In addition to event signals that are local to an executing thread, the MicroEngines 22 employ signaling states that are global. With signaling states, an executing thread can broadcast a signal state to all MicroEngines 22. One example is the Receive Request Available signal. Any and all threads in the MicroEngines can branch on these signaling states. These signaling states can be used to determine availability of a resource or whether a resource is due for servicing.
  • [0030] The context event logic 74 has arbitration for the four (4) threads. In one embodiment, the arbitration is a round robin mechanism. Other techniques could be used including priority queuing or weighted fair queuing. The MicroEngine 22 f also includes an execution box (EBOX) data path 76 that includes an arithmetic logic unit 76 a and general-purpose register set 76 b. The arithmetic logic unit 76 a performs arithmetic and logical functions as well as shift functions. The register set 76 b has a relatively large number of general-purpose registers. As will be described in FIG. 6, in this implementation there are 64 general-purpose registers in a first bank, Bank A, and 64 in a second bank, Bank B. The general-purpose registers are windowed as will be described so that they are relatively and absolutely addressable.
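  • The round robin arbitration among the four thread contexts can be illustrated in software. The following is a minimal sketch, assuming each context exposes a single ready flag; the function name `round_robin_pick` and its parameters are hypothetical and do not appear in the specification, which describes a hardware arbiter.

```python
def round_robin_pick(ready, last_granted):
    """Return the index of the next ready thread after last_granted, or None.

    ready        -- list of booleans, one per hardware thread context
    last_granted -- index of the thread that ran most recently
    """
    n = len(ready)
    # Start the search just after the last granted thread and wrap around,
    # so every ready thread is eventually served before any is served twice.
    for offset in range(1, n + 1):
        candidate = (last_granted + offset) % n
        if ready[candidate]:
            return candidate
    return None  # no thread is ready to run


# Four thread contexts; threads 0 and 2 have received their completion signals.
ready = [True, False, True, False]
first = round_robin_pick(ready, last_granted=0)       # selects thread 2
second = round_robin_pick(ready, last_granted=first)  # wraps around to thread 0
```

  • Priority queuing or weighted fair queuing, mentioned above as alternatives, would replace only the search order inside the loop.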
  • [0031] The MicroEngine 22 f also includes a write transfer register 78 and a read transfer register 80. These registers are also windowed so that they are relatively and absolutely addressable. Write transfer register 78 is where write data to a resource is located. Similarly, read transfer register 80 is for return data from a shared resource. Subsequent to or concurrent with data arrival, an event signal from the respective shared resource, e.g., the SRAM controller 26 b, SDRAM controller 26 a or core processor 20, will be provided to context event arbiter 74, which will then alert the thread that the data is available or has been sent. Both transfer register banks 78 and 80 are connected to the execution box (EBOX) 76 through a data path. In one implementation, the read transfer register has 64 registers and the write transfer register has 64 registers.
  • [0032] Referring to FIG. 3, processor 12 has processing threads 41 and 42 executing in MicroEngines 22 a and 22 b respectively. In other instances, the threads 41 and 42 may be executed on the same MicroEngine. The processing threads may or may not share data between them. For example, in FIG. 3, processing thread 41 receives data 43 and processes it to produce data 44. Processing thread 42 receives and processes the data 44 to produce output data 45. Threads 41 and 42 are concurrently active.
  • [0033] Because the MicroEngines 22 a and 22 b share SDRAM 16 a and SRAM 16 b (memory), one MicroEngine 22 a may need to designate sections of memory for its exclusive use. To facilitate efficient allocation of memory sections, the SDRAM memory is divided into memory segments, referred to as buffers. The memory locations in a buffer share a common address prefix, or pointer. The pointer is used by the processor as an identifier for a buffer.
  • [0034] Pointers to buffers that are not currently in use by a processing thread are managed by pushing the pointers onto a free memory stack. A thread can allocate a buffer for use by the thread by popping a pointer off the stack, and using the pointer to access the corresponding buffer. When a processing thread no longer needs a buffer that is allocated to the processing thread, the thread pushes the pointer to the buffer onto the stack to make the buffer available to other threads.
  • [0035] The threads 41 and 42 have processor instruction sets 46, 47 that respectively include a “PUSH” 46 a and a “POP” 47 a instruction. Upon executing either the “PUSH” or the “POP” instruction, the instruction is transmitted to a logical stack module 56 (FIG. 4).
  • [0036] Referring to FIG. 4, a section of the processor 9 and SRAM 16 b provide the logical stack module 56. The logical stack module is implemented as a linked list of SRAM addresses. Each SRAM address on the linked list contains the address of the next item on the list. As a result, if you have the address of the first item on the list, you can read the contents of that address to find the address of the next item on the list, and so on. Additionally, each address on the linked list is associated with a corresponding memory buffer. Thus the stack module 56 is used to implement a linked list of memory buffers. While in use, the linked list allows the stack to increase or decrease in size as needed.
  • [0037] The stack module 56 includes control logic 51 on the SRAM unit 26 b. The control logic 51 performs the necessary operations on the stack while SRAM 16 b stores the contents of the stack. One of SRAM registers 50 is used to store the address of the first SRAM location on the stack. The address is also a pointer to the first buffer on the stack.
  • [0038] Although the different components of the stack module 56 and the threads will be explained using an example that uses hardware threads and stack modules, the stack can also be implemented in operating system software threads using software modules. Thread 41 and thread 42 may be implemented as two operating system threads which execute “PUSH” and “POP” operating system commands to allocate memory from a shared memory pool. The operating system commands may include calls to a library of functions written in the “C” programming language. In the operating system example, the equivalents of the control logic 51, the SRAM registers 50 and SRAM 16 b are implemented using software within the operating system. The software may be stored in a hard disk, a floppy disk, computer memory, or other computer readable medium.
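  • The software variant described above, in which operating system threads allocate from a shared pool through push and pop commands, might look roughly as follows. This is a minimal sketch, assuming Python's `threading` module stands in for the operating system threads and a mutex plays the serializing role that the control logic 51 plays in hardware. The class `FreeBufferStack`, its methods, and the `worker` function are illustrative names that do not appear in the specification.

```python
import threading


class FreeBufferStack:
    """Linked-list stack of free buffer pointers shared between threads."""

    def __init__(self, pointers):
        self._next = {}                # software stand-in for the SRAM locations
        self._head = None              # software stand-in for the head register
        self._lock = threading.Lock()  # assumption: serializes push and pop
        for pointer in pointers:
            self.push(pointer)

    def push(self, pointer):
        """Return a buffer to the pool."""
        with self._lock:
            self._next[pointer] = self._head
            self._head = pointer

    def pop(self):
        """Allocate a buffer, or return None if the pool is empty."""
        with self._lock:
            pointer = self._head
            if pointer is None:
                return None
            self._head = self._next.pop(pointer)
            return pointer


pool = FreeBufferStack([0xE9, 0xA1, 0xC5])
held = []
held_lock = threading.Lock()


def worker():
    # Each thread allocates one buffer for its own exclusive use.
    pointer = pool.pop()
    if pointer is not None:
        with held_lock:
            held.append(pointer)


threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

  • Because every pop happens under the lock, no two threads can receive the same pointer, which is the property the shared free memory stack is meant to provide.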
  • [0039] Referring to FIG. 5A, SRAM register Q1 stores an address (0xC5) of the first item on the stack 60. The SRAM location (0xC5) of the first item on the stack 60 is used to store the SRAM address (0xA1) of the second item on the stack 60. The SRAM location (0xA1) of the second item on the stack 60 is used to store the address of the third item on the stack 60, etc. The SRAM location (0xE9) of the last item on the stack stores a pre-determined invalid address (0x00), which indicates the end of the linked list.
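  • The layout of FIG. 5A can be modeled in a few lines. The following is a minimal sketch, assuming a dictionary stands in for the SRAM locations; the names `sram`, `q1`, `INVALID`, and `walk_stack` are illustrative and not taken from the specification.

```python
INVALID = 0x00  # pre-determined invalid address marking the end of the list

# Each SRAM location on the stack holds the address of the next item.
sram = {0xC5: 0xA1, 0xA1: 0xE9, 0xE9: INVALID}
q1 = 0xC5  # SRAM register Q1: address of the first item on the stack


def walk_stack(head, memory):
    """Follow next-addresses from head until the invalid address is reached."""
    items = []
    addr = head
    while addr != INVALID:
        items.append(addr)
        addr = memory[addr]
    return items


order = walk_stack(q1, sram)
print([hex(a) for a in order])  # prints ['0xc5', '0xa1', '0xe9']
```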
  • [0040] Additionally, the addresses of the items (0xC5, 0xA1, and 0xE9) on the stack 60 are pointers to stack buffers 61 a, 61 b, 61 c contained within SDRAM 16 a. A pointer to a buffer is pushed onto the stack by thread 41, so that the buffer is available for use by other processing threads. A buffer is popped by thread 42 to allocate the buffer for use by thread 42. The pointers are used as an address base to access memory locations in the buffers.
  • [0041] In addition to stack buffers 61 a-c, SDRAM 16 a also contains processing buffer 62, which is allocated to thread 41. The pointer to processing buffer 62 is not on the stack because it is not available for allocation by other threads. Thread 41 may later push a pointer to the processing buffer 62 onto the stack when it no longer needs the buffer 62.
  • [0042] Although the stack will be discussed with reference to the buffer management scheme above, it can be used without buffers. Referring to FIG. 5B, the SRAM locations 0xC5, 0xA1, and 0xE9 may, respectively, contain data 70 a, 70 b, and 70 c in addition to an address to the next item on the list. Such a scheme may be used to store smaller units of data 70 a-c on the stack. In such a scheme, the control logic would assign a memory location within the SRAM for storing the unit of data (datum) that is to be pushed onto the stack. The datum pushed onto the stack may be text, numerical data, or even an address or pointer to another memory location.
  • [0043] Referring to FIG. 6A, to pop a datum off the stack stored in SRAM register Q1, thread 42 executes 101 the instruction “POP #1”. The pop instruction is part of the instruction set of the MicroEngines 22. The pop instruction is transmitted to control logic 51 over bus 55 for stack processing. Control logic 51 decodes 102 the pop instruction. The control logic also determines 103 the register that contains a pointer to the stack that is referred to in the instruction based on the argument of the pop instruction. Since the argument to the pop instruction is “#1”, the corresponding register is Q1. The control logic 51 returns 104 the contents of the Q1 register to the context of processing thread 42. The stack of FIG. 5A would return “0xC5”. Processing thread 42 receives 107 the contents of the Q1 register, which is “0xC5”, and uses 108 the received content to access data from the corresponding stack buffer 61 b by appending a suffix to the content.
  • [0044] Control logic 51 reads 105 the content (0xA1) of the address (0xC5) stored in the Q1 register. Control logic 51 stores 106 the read content (0xA1) in the Q1 register to indicate that 0xC5 has been removed from the stack and 0xA1 is now the item at the top of the stack.
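  • The pop sequence of FIG. 6A (steps 103 through 106) can be sketched in software as follows. This is an illustrative model only, assuming dictionaries stand in for the SRAM registers and SRAM locations; the function `pop` and the empty-stack behavior are assumptions, since the text does not describe popping from an empty stack.

```python
INVALID = 0x00  # pre-determined invalid address marking the end of the list


def pop(registers, sram, reg):
    """Return the pointer at the top of the stack and advance the head register."""
    top = registers[reg]        # steps 103-104: select the register, return its content
    if top == INVALID:
        return None             # assumed behavior for an empty stack
    registers[reg] = sram[top]  # steps 105-106: read the next address, store it in the register
    return top


# State of FIG. 5A: Q1 holds 0xC5, which links to 0xA1, which links to 0xE9.
registers = {1: 0xC5}
sram = {0xC5: 0xA1, 0xA1: 0xE9, 0xE9: INVALID}

popped = pop(registers, sram, 1)  # models the instruction "POP #1"
```

  • After the call, `popped` is 0xC5 and the model's Q1 register holds 0xA1, matching the state described for FIG. 6B.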
  • [0045] Referring to FIG. 6B, the state of the stack after the operations of FIG. 6A will be described. As shown, the register Q1 now contains the address 0xA1, which was previously the address of the second item on the stack. Additionally, the location that was previously stack buffer 61 b (in FIG. 5A) is now processing buffer 65, which is used by thread 42. Thus, thread 42 has removed stack buffer 61 b from the stack 60 and allocated the buffer 61 b for its own use.
  • [0046] Referring to FIG. 7A, the process of adding a buffer to the stack will be described. Thread 41 pushes processing buffer 62 (shown in FIG. 6B) onto the stack by executing 201 the instruction “PUSH #1 0x01”. The argument 0x01 is a pointer to the buffer 62 because it is a prefix that is common to the address space of the locations in the buffer. The push instruction is transmitted to control logic 51 over the bus 55.
  • [0047] Upon receiving the push instruction, the control logic 51 decodes 202 the instruction and determines 203 the SRAM register corresponding to the instruction, based on the second argument of the push instruction. Since the second argument is “#1”, the corresponding register is Q1. The control logic 51 determines the address to be pushed from the third argument (0x01) of the push instruction. The control logic determines 205 the content of the Q1 register by reading the value of the register location. The value 0xA1 is the content of the Q1 register in the stack of FIG. 6B. The control logic stores 206 the content (0xA1) of the Q1 register in the SRAM location whose address is the push address (0x01). The control logic then stores 207 the push address (0x01) in the Q1 register.
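  • The push sequence of FIG. 7A reduces to two stores once the register and push address are known. The following is a minimal sketch under the same modeling assumption as before (dictionaries standing in for the SRAM registers and SRAM locations); the function name `push` is illustrative.

```python
def push(registers, sram, reg, address):
    """Link a new location in front of the current head of the stack."""
    current_head = registers[reg]  # step 205: read the content of the register
    sram[address] = current_head   # step 206: store the old head at the push address
    registers[reg] = address       # step 207: the push address becomes the new head


# State of FIG. 6B: Q1 holds 0xA1, which links to 0xE9, the last item.
registers = {1: 0xA1}
sram = {0xA1: 0xE9, 0xE9: 0x00}

push(registers, sram, 1, 0x01)  # models the instruction "PUSH #1 0x01"
```

  • The order of the two stores matters: writing the register first would lose the old head address before it could be saved at the new location.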
  • [0048] Referring to FIG. 7B, the contents of the stack after the operations of FIG. 7A will be described. As shown, the SRAM register Q1 contains the address of the first location on the stack, which is now 0x01. The address of the first location on the stack is also the address of stack buffer 61 d, which was previously a processing buffer 62 used by thread 41. The location 0xA1, which was previously the first item on the stack, is now the second item on the stack. Thus, thread 41 adds stack buffer 61 d onto the stack to make it available for allocation to other threads. Thread 42 can later allocate the stack buffer 61 d for its own use by popping it off the stack, as previously described for FIG. 6A.
  • [0049] Referring to FIG. 8, a second stack 60 b (shown in phantom) may be implemented in the same stack module by using a second SRAM control register to store the address of the first element in the second stack 60 b. The second stack may be used to manage a separate set of memory buffers, for example, within SRAM 16 b or SDRAM 16 a. A first stack 60 a has the address of the first element on the stack 60 a stored in SRAM register Q1. Additionally, a second stack 60 b has the address of its first element stored in register Q6. The first stack 60 a is identical to the stack 60 in FIG. 7B. The second stack 60 b is similar to previously described stacks.
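  • Two stacks sharing one stack module, as in FIG. 8, differ only in which head register an instruction names. The sketch below assumes a single dictionary of link locations and a dictionary of head registers indexed by number; the class `StackModule` is illustrative, and the addresses 0x40 and 0x80 used for the second stack are hypothetical values chosen for the example.

```python
INVALID = 0x00  # pre-determined invalid address marking the end of a list


class StackModule:
    """One set of link locations shared by several head registers."""

    def __init__(self):
        self.sram = {}       # address -> address of the next item
        self.registers = {}  # register number -> address of the first item

    def push(self, reg, address):
        self.sram[address] = self.registers.get(reg, INVALID)
        self.registers[reg] = address

    def pop(self, reg):
        top = self.registers.get(reg, INVALID)
        if top != INVALID:
            self.registers[reg] = self.sram[top]
        return top


module = StackModule()
for addr in (0xE9, 0xA1, 0x01):  # first stack 60 a, as in FIG. 7B
    module.push(1, addr)         # head kept in register Q1
for addr in (0x40, 0x80):        # second stack 60 b, hypothetical addresses
    module.push(6, addr)         # head kept in register Q6

a = module.pop(1)  # pops from the first stack only
b = module.pop(6)  # pops from the second stack only
```

  • The two lists never interfere as long as no address is pushed onto both stacks at once, since each location holds a single next-address.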
  • [0050] Other embodiments are within the scope of the following claims. Although the stack 60 (shown in FIG. 5A) stores the pointer to the first element in a register Q1, the linked list in SRAM 16 b, and the buffers in SDRAM 16 a, any of the stack module elements could be stored in any memory location. For example, they could all be stored in SRAM 16 b or SDRAM 16 a.
  • [0051] Other embodiments may implement the stack in a continuous address space, instead of using a linked list. The size of the buffers may be varied by using pointers (address prefixes) of varying length. For example, a short pointer is a prefix to more addresses and is, therefore, a pointer to a larger address buffer.
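  • The relationship between prefix length and buffer size can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a 16-bit address space purely for illustration (the specification does not state an address width); `ADDRESS_BITS`, `buffer_range`, and `address_in_buffer` are hypothetical names.

```python
ADDRESS_BITS = 16  # assumed address width, chosen only for this example


def buffer_range(prefix, prefix_bits):
    """Return (first_address, size) of the buffer identified by an address prefix."""
    suffix_bits = ADDRESS_BITS - prefix_bits
    return prefix << suffix_bits, 1 << suffix_bits


def address_in_buffer(prefix, prefix_bits, suffix):
    """Form a full address by appending a suffix to the buffer pointer."""
    suffix_bits = ADDRESS_BITS - prefix_bits
    if not 0 <= suffix < (1 << suffix_bits):
        raise ValueError("suffix does not fit in the buffer")
    return (prefix << suffix_bits) | suffix


short_pointer = buffer_range(0x01, 4)  # 4-bit prefix leaves 12 suffix bits
long_pointer = buffer_range(0x01, 8)   # 8-bit prefix leaves 8 suffix bits
```

  • Under these assumptions the 4-bit pointer identifies a 4,096-location buffer starting at 0x1000, while the 8-bit pointer identifies a 256-location buffer starting at 0x0100, which illustrates why a shorter pointer names a larger buffer.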
  • [0052] Alternatively, the stack may be used to manage resources other than buffers. One possible application of the stack might be to store pointers to the contexts of active threads that are not currently operating. When MicroEngine 22 a temporarily sets aside a first active thread to process a second active thread, it stores the context of the first active thread in a memory buffer and pushes a pointer to that buffer on the stack. Any MicroEngine can resume the processing of the first active thread by popping the pointer to the memory buffer containing the context of the first thread and loading that context. Thus the stack can be used to manage the processing of multiple concurrent active threads by multiple processing engines.

Claims (30)

What is claimed is:
1. A method comprising:
pushing a datum onto a stack by a first processing thread; and
popping the datum off the stack by a second processing thread.
2. The method of claim 1 wherein the pushing comprises:
executing a push command on the first processing thread, the push command having at least one argument,
determining a pointer to a current stack datum,
determining a location associated with an argument of the push command,
storing the determined pointer at the determined location,
producing a pointer associated with the determined location as the pointer to the current stack datum.
3. The method of claim 2 wherein determining a location comprises:
decoding the push command.
4. The method of claim 2 wherein determining a location comprises:
storing an argument of the pop command in a location associated with the argument of the push command.
5. The method of claim 2 wherein said push command is at least one of a processor instruction, and an operating system call.
6. The method of claim 1 wherein popping comprises:
executing a pop command by the second processing thread,
determining a pointer to a current stack datum,
returning the determined pointer to the second processing thread,
retrieving a pointer to a previous stack datum from a location associated with the pointer to the current stack datum, and
assigning the retrieved pointer as the pointer to the current stack datum.
7. The method of claim 6 wherein the location associated with the pointer to the current stack datum is the location that has an address equal to the value of the pointer to the current stack datum.
8. The method of claim 6 wherein the location associated with the pointer to the current stack datum is the location that has an address equal to the sum of an offset and the value of the pointer to the current stack datum.
9. The method of claim 6 wherein the pop command is at least one of a processor instruction or an operating system call.
10. The method of claim 1 further comprising:
storing data in a memory buffer that is accessible using a buffer pointer having the datum that is pushed onto the stack.
11. The method of claim 1 further comprising:
using the popped datum as a buffer pointer to access information stored in a memory buffer.
12. The method of claim 1 further comprising:
a third processing thread pushing a second datum onto the stack.
13. The method of claim 1 further comprising:
a third processing thread popping a second datum off the stack.
14. A system comprising:
a stack module that stores data by pushing it onto the stack and processing threads can retrieve information by popping the information off the stack,
a first processing thread having a first command set, including at least one command for pushing data onto the stack, and
a second processing thread having a second command set, including at least one command for popping the data off the stack.
15. The system of claim 14 wherein the first and second processing threads are executed on a single processing engine.
16. The system of claim 14 wherein the first and second processing threads are executed on separate processing engines.
17. The system of claim 16 wherein the separate processing engines are implemented on the same integrated circuit.
18. The system of claim 14 wherein the stack module and the processing threads are on the same integrated circuit.
19. The system of claim 14 where the first and second command sets are at least one of a processor instruction set and an operating system instruction set.
20. The system of claim 14 further comprising a bus interface for communicating between at least one of the processing threads and the stack module.
21. A stack module comprising:
control logic that responds to commands from at least two processing threads, the control logic storing datum on a stack structure in response to a push command and retrieving datum from the stack in response to a pop command.
22. The stack module of claim 21 further comprising a stack pointer associated with the most recently stored datum on the stack.
23. The stack module of claim 22 further comprising a memory location associated with a first datum on the stack, the second memory location including:
a pointer associated with a second datum which was stored on the stack prior to said first datum.
24. The stack module of claim 22 further comprising a second stack pointer associated with the most recently stored datum on a second stack.
25. The stack module of claim 22 wherein the stack pointer is a register on a processor.
26. The stack module of claim 23 wherein said memory location includes SRAM memory.
27. The stack module of claim 21 wherein the commands are processor instructions.
28. The stack module of claim 21 wherein the commands are operating system instructions.
29. An article comprising a computer-readable medium which stores computer logic, the computer logic comprising:
a stack module configured to store data from a first processing thread by pushing the data onto a stack and to retrieve the data for a second processing thread by popping the data off the stack, the stack module being responsive to a first processing thread command to store data on the stack and a second processing thread command to retrieve data from the stack.
30. An article comprising a computer-readable medium which stores computer-executable instructions, the instructions causing a processor to:
store data from a first processing thread by executing an instruction to push the data onto the stack; and
retrieve the data for a second processing thread by executing an instruction to pop the data from the stack for use by the second thread.
US10/644,337 2000-01-05 2003-08-20 Memory shared between processing threads Abandoned US20040039895A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/644,337 US20040039895A1 (en) 2000-01-05 2003-08-20 Memory shared between processing threads

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/479,377 US6631462B1 (en) 2000-01-05 2000-01-05 Memory shared between processing threads
US10/644,337 US20040039895A1 (en) 2000-01-05 2003-08-20 Memory shared between processing threads

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/479,377 Continuation US6631462B1 (en) 2000-01-05 2000-01-05 Memory shared between processing threads

Publications (1)

Publication Number Publication Date
US20040039895A1 true US20040039895A1 (en) 2004-02-26

Family

ID=23903749

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/479,377 Expired - Lifetime US6631462B1 (en) 2000-01-05 2000-01-05 Memory shared between processing threads
US10/644,337 Abandoned US20040039895A1 (en) 2000-01-05 2003-08-20 Memory shared between processing threads

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/479,377 Expired - Lifetime US6631462B1 (en) 2000-01-05 2000-01-05 Memory shared between processing threads

Country Status (10)

Country Link
US (2) US6631462B1 (en)
EP (1) EP1247168B1 (en)
CN (1) CN1253784C (en)
AT (1) ATE280972T1 (en)
AU (1) AU2280101A (en)
DE (1) DE60015395T2 (en)
HK (1) HK1046180B (en)
SG (1) SG149673A1 (en)
TW (1) TWI222011B (en)
WO (1) WO2001050247A2 (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030041216A1 (en) * 2001-08-27 2003-02-27 Rosenbluth Mark B. Mechanism for providing early coherency detection to enable high performance memory updates in a latency sensitive multithreaded environment
US20030067934A1 (en) * 2001-09-28 2003-04-10 Hooper Donald F. Multiprotocol decapsulation/encapsulation control structure and packet protocol conversion method
US20030105899A1 (en) * 2001-08-27 2003-06-05 Rosenbluth Mark B. Multiprocessor infrastructure for providing flexible bandwidth allocation via multiple instantiations of separate data buses, control buses and support mechanisms
US20030115426A1 (en) * 2001-12-17 2003-06-19 Rosenbluth Mark B. Congestion management for high speed queuing
US20030145155A1 (en) * 2002-01-25 2003-07-31 Gilbert Wolrich Data transfer mechanism
US20030231635A1 (en) * 2002-06-18 2003-12-18 Kalkunte Suresh S. Scheduling system for transmission of cells to ATM virtual circuits and DSL ports
US20040073778A1 (en) * 1999-08-31 2004-04-15 Adiletta Matthew J. Parallel processor architecture
US20040071152A1 (en) * 1999-12-29 2004-04-15 Intel Corporation, A Delaware Corporation Method and apparatus for gigabit packet assignment for multithreaded packet processing
US20040085901A1 (en) * 2002-11-05 2004-05-06 Hooper Donald F. Flow control in a network environment
US20040186921A1 (en) * 1999-12-27 2004-09-23 Intel Corporation, A California Corporation Memory mapping in a multi-engine processor
US20050132132A1 (en) * 2001-08-27 2005-06-16 Rosenbluth Mark B. Software controlled content addressable memory in a general purpose execution datapath
US20050144413A1 (en) * 2003-12-30 2005-06-30 Chen-Chi Kuo Method and apparatus utilizing non-uniformly distributed DRAM configurations and to detect in-range memory address matches
US20050210517A1 (en) * 2004-02-27 2005-09-22 Yukiyoshi Hirose Information processing system, network system situation presenting method and computer program
US20050216710A1 (en) * 2002-01-17 2005-09-29 Wilkinson Hugh M Iii Parallel processor with functional pipeline providing programming engines by supporting multiple contexts and critical section
US20050244411A1 (en) * 1999-01-25 2005-11-03 Biogen Idec Ma Inc. BAFF, inhibitors thereof and their use in the modulation of B-cell response and treatment of autoimmune disorders
US20060069882A1 (en) * 1999-08-31 2006-03-30 Intel Corporation, A Delaware Corporation Memory controller for processor having multiple programmable units
US20060067348A1 (en) * 2004-09-30 2006-03-30 Sanjeev Jain System and method for efficient memory access of queue control data structures
US20060140203A1 (en) * 2004-12-28 2006-06-29 Sanjeev Jain System and method for packet queuing
US20060155959A1 (en) * 2004-12-21 2006-07-13 Sanjeev Jain Method and apparatus to provide efficient communication between processing elements in a processor unit
US20070234009A1 (en) * 2000-08-31 2007-10-04 Intel Corporation Processor having a dedicated hash unit integrated within
USRE41849E1 (en) 1999-12-22 2010-10-19 Intel Corporation Parallel multi-threaded processing
US7895239B2 (en) 2002-01-04 2011-02-22 Intel Corporation Queue arrays in network devices
US7991983B2 (en) 1999-09-01 2011-08-02 Intel Corporation Register set used in multithreaded parallel processor architecture
US20120167115A1 (en) * 2010-12-22 2012-06-28 Lsi Corporation System and method for synchronous inter-thread communication
US8413149B2 (en) 2004-02-27 2013-04-02 Sony Corporation Priority based processor reservations
US20150032986A1 (en) * 2013-07-29 2015-01-29 Ralph Moore Memory block management systems and methods
US10725997B1 (en) * 2012-06-18 2020-07-28 EMC IP Holding Company LLC Method and systems for concurrent collection and generation of shared data
US10901887B2 (en) * 2018-05-17 2021-01-26 International Business Machines Corporation Buffered freepointer management memory system

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7299470B2 (en) * 2001-09-13 2007-11-20 International Business Machines Corporation Method and system for regulating communication traffic using a limiter thread
US7275247B2 (en) * 2002-09-19 2007-09-25 International Business Machines Corporation Method and apparatus for handling threads in a data processing system
US7233335B2 (en) * 2003-04-21 2007-06-19 Nividia Corporation System and method for reserving and managing memory spaces in a memory resource
US7418540B2 (en) * 2004-04-28 2008-08-26 Intel Corporation Memory controller with command queue look-ahead
CN100349142C (en) * 2004-05-25 2007-11-14 中国科学院计算技术研究所 Remote page access method for use in shared virtual memory system and network interface card
US7277990B2 (en) 2004-09-30 2007-10-02 Sanjeev Jain Method and apparatus providing efficient queue descriptor memory access
US7418543B2 (en) 2004-12-21 2008-08-26 Intel Corporation Processor having content addressable memory with command ordering
US7467256B2 (en) 2004-12-28 2008-12-16 Intel Corporation Processor having content addressable memory for block-based queue structures
DE102005026721A1 (en) * 2005-06-09 2007-01-11 Rohde & Schwarz Gmbh & Co. Kg Method for memory management of digital computing devices
US7853951B2 (en) * 2005-07-25 2010-12-14 Intel Corporation Lock sequencing to reorder and grant lock requests from multiple program threads
US20070044103A1 (en) * 2005-07-25 2007-02-22 Mark Rosenbluth Inter-thread communication of lock protected data
US20070124728A1 (en) * 2005-11-28 2007-05-31 Mark Rosenbluth Passing work between threads
US20070157030A1 (en) * 2005-12-30 2007-07-05 Feghali Wajdi K Cryptographic system component
US8051163B2 (en) * 2006-05-11 2011-11-01 Computer Associates Think, Inc. Synthetic transactions based on system history and load
US8407451B2 (en) * 2007-02-06 2013-03-26 International Business Machines Corporation Method and apparatus for enabling resource allocation identification at the instruction level in a processor system
US8031612B2 (en) 2008-09-11 2011-10-04 Intel Corporation Altering operation of a network interface controller based on network traffic
US9483272B2 (en) 2014-09-30 2016-11-01 Freescale Semiconductor, Inc. Systems and methods for managing return stacks in a multi-threaded data processing system
CN105868014A (en) * 2016-04-08 2016-08-17 京信通信技术(广州)有限公司 Memory optimization queuing method and system

Citations (99)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3373408A (en) * 1965-04-16 1968-03-12 Rca Corp Computer capable of switching between programs without storage and retrieval of the contents of operation registers
US3792441A (en) * 1972-03-08 1974-02-12 Burroughs Corp Micro-program having an overlay micro-instruction
US3889243A (en) * 1973-10-18 1975-06-10 Ibm Stack mechanism for a data processor
US3940745A (en) * 1973-06-05 1976-02-24 Ing. C. Olivetti & C., S.P.A. Data processing unit having a plurality of hardware circuits for processing data at different priority levels
US4016548A (en) * 1975-04-11 1977-04-05 Sperry Rand Corporation Communication multiplexer module
US4032899A (en) * 1975-05-05 1977-06-28 International Business Machines Corporation Apparatus and method for switching of data
US4075691A (en) * 1975-11-06 1978-02-21 Bunker Ramo Corporation Communication control unit
US4514807A (en) * 1980-05-21 1985-04-30 Tatsuo Nogi Parallel computer
US4745544A (en) * 1985-12-12 1988-05-17 Texas Instruments Incorporated Master/slave sequencing processor with forced I/O
US4831358A (en) * 1982-12-21 1989-05-16 Texas Instruments Incorporated Communications system employing control line minimization
US4991112A (en) * 1987-12-23 1991-02-05 U.S. Philips Corporation Graphics system with graphics controller and DRAM controller
US5115507A (en) * 1987-12-23 1992-05-19 U.S. Philips Corp. System for management of the priorities of access to a memory and its application
US5390329A (en) * 1990-06-11 1995-02-14 Cray Research, Inc. Responding to service requests using minimal system-side context in a multiprocessor environment
US5392391A (en) * 1991-10-18 1995-02-21 Lsi Logic Corporation High performance graphics applications controller
US5392411A (en) * 1992-02-03 1995-02-21 Matsushita Electric Industrial Co., Ltd. Dual-array register file with overlapping window registers
US5392412A (en) * 1991-10-03 1995-02-21 Standard Microsystems Corporation Data communication controller for use with a single-port data packet buffer
US5404484A (en) * 1992-09-16 1995-04-04 Hewlett-Packard Company Cache system for reducing memory latency times
US5404482A (en) * 1990-06-29 1995-04-04 Digital Equipment Corporation Processor and method for preventing access to a locked memory block by recording a lock in a content addressable memory with outstanding cache fills
US5432918A (en) * 1990-06-29 1995-07-11 Digital Equipment Corporation Method and apparatus for ordering read and write operations using conflict bits in a write queue
US5517648A (en) * 1993-04-30 1996-05-14 Zenith Data Systems Corporation Symmetric multiprocessing system with unified environment and distributed system functions
US5542070A (en) * 1993-05-20 1996-07-30 Ag Communication Systems Corporation Method for rapid development of software systems
US5542088A (en) * 1994-04-29 1996-07-30 Intergraph Corporation Method and apparatus for enabling control of task execution
US5592622A (en) * 1995-05-10 1997-01-07 3Com Corporation Network intermediate system with message passing architecture
US5613071A (en) * 1995-07-14 1997-03-18 Intel Corporation Method and apparatus for providing remote memory access in a distributed memory multiprocessor system
US5613136A (en) * 1991-12-04 1997-03-18 University Of Iowa Research Foundation Locality manager having memory and independent code, bus interface logic, and synchronization components for a processing element for intercommunication in a latency tolerant multiple processor
US5617327A (en) * 1993-07-30 1997-04-01 Xilinx, Inc. Method for entering state flow diagrams using schematic editor programs
US5623489A (en) * 1991-09-26 1997-04-22 Ipc Information Systems, Inc. Channel allocation system for distributed digital switching network
US5627829A (en) * 1993-10-07 1997-05-06 Gleeson; Bryan J. Method for reducing unnecessary traffic over a computer network
US5630641A (en) * 1993-07-29 1997-05-20 Aisin Seiki Kabushiki Kaisha Sunroof device for vehicle
US5644623A (en) * 1994-03-01 1997-07-01 Safco Technologies, Inc. Automated quality assessment system for cellular networks by using DTMF signals
US5649157A (en) * 1995-03-30 1997-07-15 Hewlett-Packard Co. Memory controller with priority queues
US5717898A (en) * 1991-10-11 1998-02-10 Intel Corporation Cache coherency mechanism for multiprocessor computer systems
US5721870A (en) * 1994-05-25 1998-02-24 Nec Corporation Lock control for a shared main storage data processing system
US5740402A (en) * 1993-12-15 1998-04-14 Silicon Graphics, Inc. Conflict resolution in interleaved memory systems with multiple parallel accesses
US5742782A (en) * 1994-04-15 1998-04-21 Hitachi, Ltd. Processing apparatus for executing a plurality of VLIW threads in parallel
US5742822A (en) * 1994-12-19 1998-04-21 Nec Corporation Multithreaded processor which dynamically discriminates a parallel execution and a sequential execution of threads
US5742587A (en) * 1997-02-28 1998-04-21 Lanart Corporation Load balancing port switching hub
US5745913A (en) * 1996-08-05 1998-04-28 Exponential Technology, Inc. Multi-processor DRAM controller that prioritizes row-miss requests to stale banks
US5751987A (en) * 1990-03-16 1998-05-12 Texas Instruments Incorporated Distributed processing memory chip with embedded logic having both data memory and broadcast memory
US5761522A (en) * 1995-05-24 1998-06-02 Fuji Xerox Co., Ltd. Program control system programmable to selectively execute a plurality of programs
US5761507A (en) * 1996-03-05 1998-06-02 International Business Machines Corporation Client/server architecture supporting concurrent servers within a server with a transaction manager providing server/connection decoupling
US5764915A (en) * 1996-03-08 1998-06-09 International Business Machines Corporation Object-oriented communication interface for network protocol access using the selected newly created protocol interface object and newly created protocol layer objects in the protocol stack
US5781774A (en) * 1994-06-29 1998-07-14 Intel Corporation Processor having operating modes for an upgradeable multiprocessor computer system
US5784712A (en) * 1995-03-01 1998-07-21 Unisys Corporation Method and apparatus for locally generating addressing information for a memory access
US5784649A (en) * 1996-03-13 1998-07-21 Diamond Multimedia Systems, Inc. Multi-threaded FIFO pool buffer and bus transfer control system
US5860158A (en) * 1996-11-15 1999-01-12 Samsung Electronics Company, Ltd. Cache control unit with a cache request transaction-oriented protocol
US5886992A (en) * 1995-04-14 1999-03-23 Valtion Teknillinen Tutkimuskeskus Frame synchronized ring system and method
US5887134A (en) * 1997-06-30 1999-03-23 Sun Microsystems System and method for preserving message order while employing both programmed I/O and DMA operations
US5890208A (en) * 1996-03-30 1999-03-30 Samsung Electronics Co., Ltd. Command executing method for CD-ROM disk drive
US5892979A (en) * 1994-07-20 1999-04-06 Fujitsu Limited Queue control apparatus including memory to save data received when capacity of queue is less than a predetermined threshold
US5905876A (en) * 1996-12-16 1999-05-18 Intel Corporation Queue ordering for memory and I/O transactions in a multiple concurrent transaction computer system
US5905889A (en) * 1997-03-20 1999-05-18 International Business Machines Corporation Resource management system using next available integer from an integer pool and returning the integer thereto as the next available integer upon completion of use
US5915123A (en) * 1997-10-31 1999-06-22 Silicon Spice Method and apparatus for controlling configuration memory contexts of processing elements in a network of multiple context processing elements
US5918235A (en) * 1997-04-04 1999-06-29 Hewlett-Packard Company Object surrogate with active computation and probablistic counter
US5928736A (en) * 1996-09-09 1999-07-27 Raytheon Company Composite structure having integrated aperture and method for its preparation
US6012151A (en) * 1996-06-28 2000-01-04 Fujitsu Limited Information processing apparatus and distributed processing control method
US6014729A (en) * 1997-09-29 2000-01-11 Firstpass, Inc. Shared memory arbitration apparatus and method
US6023742A (en) * 1996-07-18 2000-02-08 University Of Washington Reconfigurable computing architecture for providing pipelined data paths
US6058168A (en) * 1995-12-29 2000-05-02 Tixi.Com Gmbh Telecommunication Systems Method and microcomputer system for the automatic, secure and direct transmission of data
US6061710A (en) * 1997-10-29 2000-05-09 International Business Machines Corporation Multithreaded processor incorporating a thread latch register for interrupt service new pending threads
US6067585A (en) * 1997-06-23 2000-05-23 Compaq Computer Corporation Adaptive interface controller that can operate with segments of different protocol and transmission rates in a single integrated device
US6073215A (en) * 1998-08-03 2000-06-06 Motorola, Inc. Data processing system having a data prefetch mechanism and method therefor
US6072781A (en) * 1996-10-22 2000-06-06 International Business Machines Corporation Multi-tasking adapter for parallel network applications
US6079008A (en) * 1998-04-03 2000-06-20 Patton Electronics Co. Multiple thread multiple data predictive coded parallel processing system and method
US6085294A (en) * 1997-10-24 2000-07-04 Compaq Computer Corporation Distributed data dependency stall mechanism
US6085215A (en) * 1993-03-26 2000-07-04 Cabletron Systems, Inc. Scheduling mechanism using predetermined limited execution time processing threads in a communication network
US6092127A (en) * 1998-05-15 2000-07-18 Hewlett-Packard Company Dynamic allocation and reallocation of buffers in links of chained DMA operations by receiving notification of buffer full and maintaining a queue of buffers available
US6092158A (en) * 1997-06-13 2000-07-18 Intel Corporation Method and apparatus for arbitrating between command streams
US6170051B1 (en) * 1997-08-01 2001-01-02 Micron Technology, Inc. Apparatus and method for program level parallelism in a VLIW processor
US6182177B1 (en) * 1997-06-13 2001-01-30 Intel Corporation Method and apparatus for maintaining one or more queues of elements such as commands using one or more token queues
US6195676B1 (en) * 1989-12-29 2001-02-27 Silicon Graphics, Inc. Method and apparatus for user side scheduling in a multiprocessor operating system program that implements distributive scheduling of processes
US6199133B1 (en) * 1996-03-29 2001-03-06 Compaq Computer Corporation Management communication bus for networking devices
US6201807B1 (en) * 1996-02-27 2001-03-13 Lucent Technologies Real-time hardware method and apparatus for reducing queue processing
US6212611B1 (en) * 1998-11-03 2001-04-03 Intel Corporation Method and apparatus for providing a pipelined memory controller
US6212542B1 (en) * 1996-12-16 2001-04-03 International Business Machines Corporation Method and system for executing a program within a multiscalar processor by processing linked thread descriptors
US6216220B1 (en) * 1998-04-08 2001-04-10 Hyundai Electronics Industries Co., Ltd. Multithreaded data processing method with long latency subinstructions
US6223274B1 (en) * 1997-11-19 2001-04-24 Interuniversitair Micro-Elecktronica Centrum (Imec) Power-and speed-efficient data storage/transfer architecture models and design methodologies for programmable or reusable multi-media processors
US6223238B1 (en) * 1998-03-31 2001-04-24 Micron Electronics, Inc. Method of peer-to-peer mastering over a computer bus
US6223207B1 (en) * 1995-04-24 2001-04-24 Microsoft Corporation Input/output completion port queue data structures and methods for using same
US6223279B1 (en) * 1991-04-30 2001-04-24 Kabushiki Kaisha Toshiba Single chip microcomputer having a dedicated address bus and dedicated data bus for transferring register bank data to and from an on-line RAM
US6247025B1 (en) * 1997-07-17 2001-06-12 International Business Machines Corporation Locking and unlocking mechanism for controlling concurrent access to objects
US6338078B1 (en) * 1998-12-17 2002-01-08 International Business Machines Corporation System and method for sequencing packets for multiprocessor parallelization in a computer network system
US6345334B1 (en) * 1998-01-07 2002-02-05 Nec Corporation High speed semiconductor memory device capable of changing data sequence for burst transmission
US6347344B1 (en) * 1998-10-14 2002-02-12 Hitachi, Ltd. Integrated multimedia system with local processor, data transfer switch, processing modules, fixed functional unit, data streamer, interface unit and multiplexer, all integrated on multimedia processor
US6356692B1 (en) * 1999-02-04 2002-03-12 Hitachi, Ltd. Optical module, transmitter, receiver, optical switch, optical communication unit, add-and-drop multiplexing unit, and method for manufacturing the optical module
US6360262B1 (en) * 1997-11-24 2002-03-19 International Business Machines Corporation Mapping web server objects to TCP/IP ports
US6366998B1 (en) * 1998-10-14 2002-04-02 Conexant Systems, Inc. Reconfigurable functional units for implementing a hybrid VLIW-SIMD programming model
US6373848B1 (en) * 1998-07-28 2002-04-16 International Business Machines Corporation Architecture for a multi-port adapter with a single media access control (MAC)
US6389449B1 (en) * 1998-12-16 2002-05-14 Clearwater Networks, Inc. Interstream control and communications for multi-streaming digital processors
US6393483B1 (en) * 1997-06-30 2002-05-21 Adaptec, Inc. Method and apparatus for network interface card load balancing and port aggregation
US6529983B1 (en) * 1999-11-03 2003-03-04 Cisco Technology, Inc. Group and virtual locking mechanism for inter processor synchronization
US6532509B1 (en) * 1999-12-22 2003-03-11 Intel Corporation Arbitrating command requests in a parallel multi-threaded processing system
US6535878B1 (en) * 1997-05-02 2003-03-18 Roxio, Inc. Method and system for providing on-line interactivity over a server-client network
US6552826B2 (en) * 1997-02-21 2003-04-22 Worldquest Network, Inc. Facsimile network
US6560667B1 (en) * 1999-12-28 2003-05-06 Intel Corporation Handling contiguous memory references in a multi-queue system
US6577542B2 (en) * 1999-12-28 2003-06-10 Intel Corporation Scratchpad memory
US6584522B1 (en) * 1999-12-30 2003-06-24 Intel Corporation Communication between processors
US6681300B2 (en) * 1999-12-28 2004-01-20 Intel Corporation Read lock miss control and queue management
US6694380B1 (en) * 1999-12-27 2004-02-17 Intel Corporation Mapping requests from a processing unit that uses memory-mapped input-output space

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5625812A (en) * 1994-11-14 1997-04-29 David; Michael M. Method of data structure extraction for computer systems operating under the ANSI-92 SQL2 outer join protocol
JPH1091443A (en) 1996-05-22 1998-04-10 Seiko Epson Corp Information processing circuit, microcomputer and electronic equipment
US5946487A (en) * 1996-06-10 1999-08-31 Lsi Logic Corporation Object-oriented multi-media architecture
KR100417398B1 (en) * 1996-09-11 2004-04-03 엘지전자 주식회사 Method for processing instruction block repeat of dsp
US5898885A (en) * 1997-03-31 1999-04-27 International Business Machines Corporation Method and system for executing a non-native stack-based instruction within a computer system
GB2327784B (en) * 1997-07-28 2002-04-03 Microapl Ltd A method of carrying out computer operations
WO1999009469A1 (en) * 1997-08-18 1999-02-25 Koninklijke Philips Electronics N.V. Stack oriented data processing device

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050244411A1 (en) * 1999-01-25 2005-11-03 Biogen Idec Ma Inc. BAFF, inhibitors thereof and their use in the modulation of B-cell response and treatment of autoimmune disorders
US20040073778A1 (en) * 1999-08-31 2004-04-15 Adiletta Matthew J. Parallel processor architecture
US20060069882A1 (en) * 1999-08-31 2006-03-30 Intel Corporation, A Delaware Corporation Memory controller for processor having multiple programmable units
US8316191B2 (en) 1999-08-31 2012-11-20 Intel Corporation Memory controllers for processor having multiple programmable units
US7991983B2 (en) 1999-09-01 2011-08-02 Intel Corporation Register set used in multithreaded parallel processor architecture
USRE41849E1 (en) 1999-12-22 2010-10-19 Intel Corporation Parallel multi-threaded processing
US9824038B2 (en) 1999-12-27 2017-11-21 Intel Corporation Memory mapping in a processor having multiple programmable units
US8738886B2 (en) 1999-12-27 2014-05-27 Intel Corporation Memory mapping in a processor having multiple programmable units
US20040186921A1 (en) * 1999-12-27 2004-09-23 Intel Corporation, A California Corporation Memory mapping in a multi-engine processor
US9128818B2 (en) 1999-12-27 2015-09-08 Intel Corporation Memory mapping in a processor having multiple programmable units
US9824037B2 (en) 1999-12-27 2017-11-21 Intel Corporation Memory mapping in a processor having multiple programmable units
US9830285B2 (en) 1999-12-27 2017-11-28 Intel Corporation Memory mapping in a processor having multiple programmable units
US9830284B2 (en) 1999-12-27 2017-11-28 Intel Corporation Memory mapping in a processor having multiple programmable units
US20040071152A1 (en) * 1999-12-29 2004-04-15 Intel Corporation, A Delaware Corporation Method and apparatus for gigabit packet assignment for multithreaded packet processing
US7751402B2 (en) 1999-12-29 2010-07-06 Intel Corporation Method and apparatus for gigabit packet assignment for multithreaded packet processing
US7681018B2 (en) 2000-08-31 2010-03-16 Intel Corporation Method and apparatus for providing large register address space while maximizing cycletime performance for a multi-threaded register file set
US20070234009A1 (en) * 2000-08-31 2007-10-04 Intel Corporation Processor having a dedicated hash unit integrated within
US7743235B2 (en) 2000-08-31 2010-06-22 Intel Corporation Processor having a dedicated hash unit integrated within
US20050132132A1 (en) * 2001-08-27 2005-06-16 Rosenbluth Mark B. Software controlled content addressable memory in a general purpose execution datapath
US20030105899A1 (en) * 2001-08-27 2003-06-05 Rosenbluth Mark B. Multiprocessor infrastructure for providing flexible bandwidth allocation via multiple instantiations of separate data buses, control buses and support mechanisms
US20030041216A1 (en) * 2001-08-27 2003-02-27 Rosenbluth Mark B. Mechanism for providing early coherency detection to enable high performance memory updates in a latency sensitive multithreaded environment
US20030067934A1 (en) * 2001-09-28 2003-04-10 Hooper Donald F. Multiprotocol decapsulation/encapsulation control structure and packet protocol conversion method
US20030115426A1 (en) * 2001-12-17 2003-06-19 Rosenbluth Mark B. Congestion management for high speed queuing
US8380923B2 (en) 2002-01-04 2013-02-19 Intel Corporation Queue arrays in network devices
US7895239B2 (en) 2002-01-04 2011-02-22 Intel Corporation Queue arrays in network devices
US20050216710A1 (en) * 2002-01-17 2005-09-29 Wilkinson Hugh M Iii Parallel processor with functional pipeline providing programming engines by supporting multiple contexts and critical section
US20030145155A1 (en) * 2002-01-25 2003-07-31 Gilbert Wolrich Data transfer mechanism
US20030231635A1 (en) * 2002-06-18 2003-12-18 Kalkunte Suresh S. Scheduling system for transmission of cells to ATM virtual circuits and DSL ports
US20040085901A1 (en) * 2002-11-05 2004-05-06 Hooper Donald F. Flow control in a network environment
US20050144413A1 (en) * 2003-12-30 2005-06-30 Chen-Chi Kuo Method and apparatus utilizing non-uniformly distributed DRAM configurations and to detect in-range memory address matches
US8413149B2 (en) 2004-02-27 2013-04-02 Sony Corporation Priority based processor reservations
US20050210517A1 (en) * 2004-02-27 2005-09-22 Yukiyoshi Hirose Information processing system, network system situation presenting method and computer program
US20060067348A1 (en) * 2004-09-30 2006-03-30 Sanjeev Jain System and method for efficient memory access of queue control data structures
US20060155959A1 (en) * 2004-12-21 2006-07-13 Sanjeev Jain Method and apparatus to provide efficient communication between processing elements in a processor unit
US20060140203A1 (en) * 2004-12-28 2006-06-29 Sanjeev Jain System and method for packet queuing
US8819700B2 (en) * 2010-12-22 2014-08-26 Lsi Corporation System and method for synchronous inter-thread communication
US20120167115A1 (en) * 2010-12-22 2012-06-28 Lsi Corporation System and method for synchronous inter-thread communication
US10725997B1 (en) * 2012-06-18 2020-07-28 EMC IP Holding Company LLC Method and systems for concurrent collection and generation of shared data
US20150032986A1 (en) * 2013-07-29 2015-01-29 Ralph Moore Memory block management systems and methods
US9424027B2 (en) * 2013-07-29 2016-08-23 Ralph Moore Message management system for information transfer within a multitasking system
US10901887B2 (en) * 2018-05-17 2021-01-26 International Business Machines Corporation Buffered freepointer management memory system

Also Published As

Publication number Publication date
EP1247168A2 (en) 2002-10-09
CN1253784C (en) 2006-04-26
AU2280101A (en) 2001-07-16
US6631462B1 (en) 2003-10-07
WO2001050247A3 (en) 2002-01-31
HK1046180A1 (en) 2002-12-27
ATE280972T1 (en) 2004-11-15
SG149673A1 (en) 2009-02-27
DE60015395T2 (en) 2005-11-10
EP1247168B1 (en) 2004-10-27
HK1046180B (en) 2005-05-13
CN1451114A (en) 2003-10-22
WO2001050247A2 (en) 2001-07-12
TWI222011B (en) 2004-10-11
DE60015395D1 (en) 2004-12-02

Similar Documents

Publication number Publication date Title
US6631462B1 (en) Memory shared between processing threads
US6560667B1 (en) Handling contiguous memory references in a multi-queue system
US9824037B2 (en) Memory mapping in a processor having multiple programmable units
US7546444B1 (en) Register set used in multithreaded parallel processor architecture
US6532509B1 (en) Arbitrating command requests in a parallel multi-threaded processing system
US7111296B2 (en) Thread signaling in multi-threaded processor
US6324624B1 (en) Read lock miss control and queue management
EP1236088B9 (en) Register set used in multithreaded parallel processor architecture
US6671827B2 (en) Journaling for parallel hardware threads in multithreaded processor
US7610451B2 (en) Data transfer mechanism using unidirectional pull bus and push bus
US7191309B1 (en) Double shift instruction for micro engine used in multithreaded parallel processor architecture
JP2742245B2 (en) Parallel computer

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION