Creating a NIC Driver
This section introduces the NIC driver and explains how to create a project and implement its functionalities.
Topics:
- Creating a Project
- Using the Sample Driver
- General Information about NIC Drivers
- Real-Time Requirements
- Memory Allocation and Deallocation Restrictions
- Implementing the NIC Driver Functionality
Creating a Project
To create a project in Visual Studio:
- Create a new project.
- Select C or C++ as the template type.
- Select Finish to create the project.
- Add #include <Rtnd.h> to the header file.
Note: The declarations and definitions needed by the driver's source code are in Rtnd.h. A driver must NOT include
Using the Sample Driver
wRTOS SDK provides source code and project files for a NIC driver sample (see NL2NICDriverSample). You can use this sample as a reference when implementing driver functions in your project. The sample provides a basic structure for a NIC driver. You can find it in the directory %public%\Public Documents\IntervalZero\MaxRT\wRTOS SDK\1.0\Samples or through the Start menu.
General Information about NIC Drivers
NIC drivers are RTDLL modules loaded by the Network Link Layer (NL2) at startup.
The NL2 uses the NIC driver's services by calling the functions exported by the driver. These functions are described in the NIC Driver API Functions section.
After loading a driver, the first function called by the NL2 is RtndInitDriver. This function initializes the driver and passes pointers to NL2 callback functions that the driver can call.
Some of the functions exported by drivers are always called from the main thread of the NL2 process, while others may be called from any thread of the NL2 process or from the application process. This information is specified in the Remarks section of each API function.
To prevent driver and hardware state corruption, the NL2 prevents an application process from terminating while it's executing a driver’s exported function.
Real-Time Requirements
NL2 drivers are designed for real-time applications, but not all driver functions require real-time performance. This section outlines the meaningful real-time requirements for a driver and identifies the critical parts of a driver that must adhere to them.
Topics in this section:
- Concepts
- Requirements
Concepts
Real-time is a multifaceted concept that varies based on context. If we take the strongest definition, then no RTSS application can be considered truly real-time, because it interacts with shared hardware resources, such as CPU caches, SDRAM, and PCI buses, which are also used by components beyond its control, including Windows processes, GPUs, SMIs, and other RTSS applications.
For clarity, we define two levels of real-time behavior:
- Shutdown-safe (1st level): a code segment is considered shutdown-safe if it is guaranteed to never make an SRI (System Request Interrupt) to Windows.
- Deterministic (2nd level): a code segment is considered deterministic if it is guaranteed to always execute in a predictable, bounded amount of time (unless it is preempted by a higher-priority thread that becomes runnable on the same processor after the code has started to execute).
Please note that deterministic code is always shutdown-safe, but not all shutdown-safe code is deterministic.
Applying the above concepts to a driver, we can derive the following:
- Being shutdown-safe depends only on the software design of the driver.
- Being deterministic depends not only on the software design of the driver but also on the underlying hardware. If the code for sending a frame needs to wait on some shared NIC hardware resources, then that code can't be deterministic.
Requirements
We distinguish two categories of functions in an NL2 driver:
- The entry point functions which the NL2 calls. Each function has different real-time requirements depending on its role and the thread it is called from. Entry point functions called from the NL2 main thread have no real-time requirements at all, because the NL2 main thread is not designed to be deterministic or shutdown-safe. Please refer to Real-Time Network Driver API Functions for other entry-point functions.
- The chain of functions that execute from the point when the ISR (Interrupt Service Routine) of a driver is called for a Transmit/Receive event to the point when the driver acknowledges the interrupt and calls the right NL2 notification function (namely RTND_CALLBACKS.NotifyTxInterrupt and RTND_CALLBACKS.NotifyRxInterrupt), which normally happens in the IST (Interrupt Service Thread):
- The code in that chain of functions MUST be shutdown-safe.
- The code in that chain of functions SHOULD be deterministic.
The first requirement is to ensure the hardware remains usable during system shutdown.
The second requirement is to avoid preventing an NL2 application from achieving deterministic behavior simply because it needs to send/receive Ethernet frames. Driver developers must try to fulfill the second requirement, unless the hardware doesn't allow it.
Note: For drivers that implement the EnableInterruptFromSpecifiedMessage function (in MSI-X multi-vector mode), the requirements on ISR/IST also apply to EnableInterruptFromSpecifiedMessage.
Note: Drivers that control a virtual (software) device follow the same requirements as above, but the affected chain of functions starts when the Transmit/Receive event occurs.
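The shape of a shutdown-safe IST pass can be sketched as below. This is a hypothetical illustration, not driver code: the cause bits, the acknowledge step, and the two Notify functions stand in for hardware-specific registers and the real RTND_CALLBACKS.NotifyRxInterrupt / RTND_CALLBACKS.NotifyTxInterrupt callbacks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interrupt cause bits; real values come from the NIC's datasheet. */
#define CAUSE_RX 0x1u
#define CAUSE_TX 0x2u

/* Counters standing in for RTND_CALLBACKS.NotifyRxInterrupt/NotifyTxInterrupt. */
static int g_rxNotified, g_txNotified;
static void NotifyRxInterrupt(void) { g_rxNotified++; }
static void NotifyTxInterrupt(void) { g_txNotified++; }

/* One IST pass: everything in this chain must be shutdown-safe (no calls
 * into Windows) and should be deterministic (no unbounded loops or waits). */
static void HandleTxRxInterrupt(uint32_t cause)
{
    /* 1. Acknowledge the interrupt in hardware (register write, omitted). */
    /* 2. Notify the NL2 of the events that occurred. */
    if (cause & CAUSE_RX)
        NotifyRxInterrupt();
    if (cause & CAUSE_TX)
        NotifyTxInterrupt();
}
```

Note that the handler performs a fixed amount of work per invocation; whether it is truly deterministic still depends on the latency of the underlying register accesses.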
Memory Allocation and Deallocation Restrictions
The NL2 imposes strict restrictions on when a NIC driver is allowed to allocate memory.
Drivers can dynamically allocate/deallocate two types of memory:
| Memory type | Description |
|---|---|
| Local memory | This type of memory contains the following: |
| Contiguous physical memory | This memory type is used when the driver allocates memory that needs to be shared with the NIC hardware, typically for Transmit/Receive buffers and DMA descriptor rings. |
Note: Each NIC Driver (RTND) entry point function has different memory allocation requirements.
Implementing the NIC Driver Functionality
In this section, we will introduce the different features that a NIC Driver must implement.
Topics in this section:
- Initialize the Driver and Register the Callback Function Pointers
- Probe an Interface on the PCI Bus
- Return the Capabilities of an Interface and Its Queues
- Configure an Interface and Its Queues
- Start an Interface
- Return the MAC Address of an Interface
- Return the Features Supported by an Interface
- Monitor the Link Status of an Interface
- Read the Current Time from an Interface’s Local Clock and Cross-timestamp it with the CPU Clock
- Adjust the Current Time of an Interface’s Local Clock
- Adjust the Rate of an Interface’s Local Clock
- Controlling Special Functions of a Device
- Configure the Credit-Based Shaper on a Transmit Queue
- Monitor the Egress Timestamps Captured by an Interface
- Configure the Operating Mode of an Interface
- Configure the Multicast Hash Filter of an Interface
- Configure the Interrupt Moderation
- Configure the EtherType Hardware Dispatcher
- Configure the PCP Hardware Dispatcher
- Configure the UDP Port Hardware Dispatcher
- Allocate Frame Buffers
- Get Buffers for Transmit
- Transmit Frames
- Restart a Transmit Queue
- Process Transmit Interrupts
- Receive Frames
- Restart a Receive Queue
- Process Receive Interrupts
- Real-time Interrupt Masking
Initialize the Driver and Register the Callback Function Pointers
After the NL2 loads the NIC driver, it calls RtndInitDriver to provide the driver with a structure containing function pointers to the callbacks exposed by the NL2.
Note: This set of exported callback functions is not specific to an interface or a driver. The NL2 provides the same functions to all drivers.
The RtndInitDriver function also allows the driver to perform any required global initialization.
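A minimal sketch of this initialization step is shown below. The structure layout, member names, and return convention are simplified stand-ins for the real declarations in Rtnd.h; only the pattern (save the shared callback table, do global init) is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the RTND_CALLBACKS table provided by the NL2. */
typedef struct {
    void (*NotifyLinkStatusChange)(void *iface);
    void (*NotifyRxInterrupt)(void *iface, unsigned queue);
    void (*NotifyTxInterrupt)(void *iface, unsigned queue);
} RTND_CALLBACKS_SKETCH;

/* The same callback table is shared by all drivers, so one global copy
 * per driver is enough. */
static const RTND_CALLBACKS_SKETCH *g_nl2Callbacks;

/* Sketch of RtndInitDriver: record the NL2 callback pointers for later
 * use and perform any required global initialization. */
static int RtndInitDriver_Sketch(const RTND_CALLBACKS_SKETCH *callbacks)
{
    if (callbacks == NULL)
        return -1;              /* reject an invalid callback table */
    g_nl2Callbacks = callbacks;
    /* ...global driver initialization would go here... */
    return 0;
}
```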
Probe an Interface on the PCI Bus
The Probe operation involves searching for the interface on the PCI Bus and reading the PCI Configuration Space to identify the device and ensure it is supported.
The NL2 calls RtndManageInterface to request a Probe operation. If the driver finds a supported device, it must allocate a context for that interface, record the information from the PCI Configuration Space, and return a number or a pointer (ULONG_PTR type) to identify that interface.
Note: The driver is not allowed to write to any NIC register in this function. It must NOT start the interface.
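The Probe pattern can be sketched as follows. The function name, parameters, and the example vendor ID are assumptions for illustration; the real entry point is RtndManageInterface, which receives its PCI information from the NL2. The important points from the text are preserved: identify the device, allocate a context, return a ULONG_PTR-style identifier, and touch no NIC register.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-interface context recording PCI Configuration Space info. */
typedef struct {
    uint16_t VendorId;
    uint16_t DeviceId;
} IFACE_CONTEXT;

#define SUPPORTED_VENDOR 0x8086 /* example vendor ID, assumption */

/* Probe sketch: if the device is supported, allocate a context and return
 * it as an opaque identifier. No NIC register is written here, and the
 * interface is NOT started. */
static uintptr_t ProbeInterface(uint16_t vendorId, uint16_t deviceId)
{
    if (vendorId != SUPPORTED_VENDOR)
        return 0;               /* unsupported device */
    IFACE_CONTEXT *ctx = malloc(sizeof(*ctx));
    if (ctx == NULL)
        return 0;
    ctx->VendorId = vendorId;   /* record PCI Configuration Space info */
    ctx->DeviceId = deviceId;
    return (uintptr_t)ctx;      /* identifies the interface from now on */
}
```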
Return the Capabilities of an Interface and Its Queues
After a successful call to RtndManageInterface, the NL2 calls RtndQueryInterfaceCapability to retrieve the interface's capabilities and queues. The NL2 calls RtndQueryInterfaceCapability once for each capability type it requires.
Configure an Interface and Its Queues
After retrieving the interface's capabilities and queues, the NL2 configures them by calling RtndSetInterface. The NL2 calls RtndSetInterface once for each setting it requires to configure.
The NL2 always does this operation before calling RtndStartInterface to start the interface.
Note: If the NL2 does not explicitly configure a setting, the driver must use the default value.
Start an Interface
After configuring the interface and its queues, the NL2 calls RtndStartInterface to start the hardware. The driver must use the settings configured in the previous calls to RtndSetInterface.
In the RtndStartInterface function, the driver is expected to allocate receive buffers for all of the interface's enabled receive queues. This way, the NIC is ready to accept incoming frames as soon as the function returns. The driver must use RTND_CALLBACKS.CreateRxBuffers to allocate the receive buffers. See Allocate Frame Buffers and Receive Frames.
The driver is also expected to allocate Transmit Buffers for all enabled Transmit Queues of the interface. This is done so that when the function returns, the NL2 can call RtndGetTxBuffers to get Transmit Buffers and transmit frames. The driver must use RTND_CALLBACKS.CreateTxBuffers to allocate the Transmit Buffers. See Allocate Frame Buffers and Get Buffers for Transmit.
Return the MAC Address of an Interface
After the interface is started, the NL2 calls RtndQueryMacAddress to query its MAC Address.
Return the Features Supported by an Interface
In addition to capabilities, the driver reports features to the NL2.
Capabilities describe the static options and ranges that the NL2 can set before starting an interface, while features describe the functionalities that can be used by the NL2 after the interface is started.
The NL2 calls RtndQueryInterfaceFeature to retrieve the features supported by an interface. This function is called once for each feature type defined by the NL2.
Monitor the Link Status of an Interface
After the interface is started, the driver is responsible for monitoring its link status.
In general, NICs use hardware interrupts to inform the driver of a link status change, but the driver can also implement a polling operation to keep track of the link status.
When the driver detects a link status change, it must call RTND_CALLBACKS.NotifyLinkStatusChange to inform the NL2. The driver is not expected to immediately read the new link status (see below).
Note: The callback function must be called from within the NL2 process.
Inside the callback, the NL2 notes that a link status change happened and wakes up its main thread. When the main thread wakes up, the NL2 calls RtndQueryLinkStatus. At this point, the driver is expected to read the NIC registers to retrieve and return the interface's current link status.
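The deferred flow above can be sketched with two small routines. The flag and the fake status register are illustration-only stand-ins: in a real driver, the first routine runs in the IST and calls RTND_CALLBACKS.NotifyLinkStatusChange, and the second is the RtndQueryLinkStatus entry point, which is the only place the hardware is actually read.

```c
#include <assert.h>

/* Illustration-only state: a pending flag standing in for the NL2 wakeup,
 * and a fake register standing in for the NIC's link status register. */
static int g_linkChangePending;
static int g_fakeLinkUpRegister = 1; /* pretend the link is up */

/* Called from the driver's IST when a link-change interrupt fires.
 * Note: no register read here; the driver only notifies the NL2. */
static void OnLinkChangeInterrupt(void)
{
    g_linkChangePending = 1; /* RTND_CALLBACKS.NotifyLinkStatusChange() */
}

/* Called later by the NL2 main thread (RtndQueryLinkStatus): only now
 * does the driver read the hardware to get the current link status. */
static int QueryLinkStatus(void)
{
    g_linkChangePending = 0;
    return g_fakeLinkUpRegister;
}
```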
Read the Current Time from an Interface’s Local Clock and Cross-timestamp it with the CPU Clock
Some interfaces have one or multiple onboard hardware clocks that are used for ingress and egress timestamping of Ethernet frames.
If the Local Clocks of an interface satisfy the following requirements, the driver should set the RTND_CAPABILITY_CLOCK.ReadClockSupported field to TRUE, in which case it must expose the RtndReadClock function:
- Requirement 1: for each clock, the hardware exposes a clock counter that increments at a well-known frequency which the driver can read at any time.
- Requirement 2: the clock counter has a sufficiently large width so that it takes at least 100 years to overflow when running freely (without adjustment from the software). A width of 32 bits for a counter in seconds, or a width of 64 bits for a counter in nanoseconds, is sufficient to satisfy this requirement.
If one of the above requirements is not satisfied, the driver must not advertise support for the Read Clock functionality, as the NL2 or the user application may not behave properly due to incorrect assumptions about the hardware.
When the NL2 calls RtndReadClock, the driver is expected to read the current value of the specified Local Clock and get a cross-timestamp from the CPU clock. This cross-timestamp consists of a triplet of timestamps taken as close as possible to each other, in the following order:
- CPU Clock time
- NIC Clock time
- CPU Clock time (again)
That collection of timestamps, as well as other information about the performed operation, are returned in a RTND_READ_CLOCK_RESULT structure.
Reading the CPU Clock is a very fast and predictable operation. For example, on x64 CPUs, it’s one machine instruction (rdtsc) and has no noticeable latency. However, reading the NIC Clock requires exchanging messages on the PCI-Express bus, which takes a non-negligible amount of time (up to several microseconds). This amount of time is not fixed. It depends on the interfering traffic on the PCI-Express bus, which varies over time. The CPU Clock is read twice to give an accurate range instead of an approximate single value.
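The software cross-timestamping triplet can be sketched as below. The structure is a simplified stand-in for RTND_READ_CLOCK_RESULT, and the clock-reader parameters are assumptions made so the sketch stays self-contained; a real driver would use QueryPerformanceCounter (or rdtsc) for the CPU clock and a NIC register read for the Local Clock.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the software-method fields of RTND_READ_CLOCK_RESULT. */
typedef struct {
    uint64_t QpcPrior;  /* CPU clock just before the NIC read */
    uint64_t NicTimeNs; /* NIC Local Clock value */
    uint64_t QpcAfter;  /* CPU clock just after the NIC read */
} CROSS_TIMESTAMP;

/* Take the three timestamps as close together as possible, in order. */
static CROSS_TIMESTAMP SoftwareCrossTimestamp(uint64_t (*readCpu)(void),
                                              uint64_t (*readNic)(void))
{
    CROSS_TIMESTAMP ts;
    ts.QpcPrior  = readCpu(); /* fast, predictable */
    ts.NicTimeNs = readNic(); /* slow, variable: PCI-Express round trip */
    ts.QpcAfter  = readCpu(); /* bounds the moment the NIC time was taken */
    return ts;
}

/* Fake clocks for illustration only. */
static uint64_t g_cpu, g_nic;
static uint64_t FakeCpu(void) { return g_cpu += 10; }
static uint64_t FakeNic(void) { return g_nic += 1000; }
```

The [QpcPrior, QpcAfter] pair is exactly the range mentioned above: the NIC time is known to have been captured somewhere inside that CPU-clock interval.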
Some hardware implements the PTM technology (Precision Time Measurement), which performs the cross-timestamping operation entirely in hardware. The NIC uses special PCI-Express messages to request the CPU Clock time. Those messages provide information that allows the NIC to measure the time it took for the CPU to answer. With that information, the NIC can compensate for the travel time of the messages on the PCI-Express bus, and get an extremely accurate cross-timestamp.
If the underlying hardware supports PTM, the driver should indicate it to the NL2 by setting the RTND_CAPABILITY_CLOCK.PtmSupported field to TRUE. In addition, the driver should try to enable PTM at startup time, in the RtndStartInterface function. If it succeeds, it should set the RTND_CAPABILITY_CLOCK.PtmEnabled field to TRUE.
When the NL2 calls the RtndReadClock function, it indicates in the Flags parameter whether PTM should be used. If PTM is requested, the driver must try to use it. In case of failure, it must fall back to the software cross-timestamping method. If PTM is not requested, the driver must use the software cross-timestamping method without trying to use PTM. If the driver used the PTM method, it MUST set the RTND_READ_CLOCK_RESULT_FLAG_PTM flag in the RTND_READ_CLOCK_RESULT.Flags field and it MUST store the PTM Master time, in nanoseconds, in the RTND_READ_CLOCK_RESULT.PtmMasterTimeNs field.
If the driver used the software method, it MUST clear the RTND_READ_CLOCK_RESULT_FLAG_PTM flag in the RTND_READ_CLOCK_RESULT.Flags field and it MUST store the values of QPC prior and after reading the NIC time in the RTND_READ_CLOCK_RESULT.QpcPrior and RTND_READ_CLOCK_RESULT.QpcAfter fields.
Adjust the Current Time of an Interface’s Local Clock
If the Local Clocks of an interface satisfy the following requirements, then the driver should set the RTND_CAPABILITY_CLOCK.AdjustClockTimeSupported field to TRUE, in which case it must expose the RtndAdjustClockTime function:
- Requirement 1: for each clock, the hardware exposes some registers to force the value of the clock counter to a given arbitrary value, and/or some registers to apply a given time offset.
- Requirement 2: the clock counter's maximum value is less than 2^48 seconds, which means that the range of values the counter can take is 0 to something less than or equal to 2^48-1 seconds and 999,999,999 nanoseconds.
- Requirement 3: the clock counter wraps around to 0 after reaching its maximum value.
If one of the above requirements is not satisfied then the driver should not advertise support for the Adjust Clock Time functionality.
Local Clocks generally expose two methods to adjust their current time:
- Method 1 (preferred, if available): the driver provides the offset directly to the hardware and lets the hardware apply the offset. This is called the fine time adjustment method. It is very precise but generally works with small offset values only. If the hardware supports this method, the driver must properly populate the RTND_FEATURE_CLOCK.ClockTimeFineOffsetMax and RTND_FEATURE_CLOCK.ClockTimeFineOffsetMin fields.
- Method 2: the driver reads the current counter value, adds the offset to it, and writes the new counter value. This method is called the coarse time adjustment method. It is less precise than the fine time adjustment method because reading and writing hardware registers takes a non-negligible and variable amount of time.
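The coarse method is a plain read-modify-write of the counter, as in this sketch (the counter accessors are fakes standing in for NIC register reads/writes; the real entry point is RtndAdjustClockTime):

```c
#include <assert.h>
#include <stdint.h>

/* Fake counter standing in for the NIC's clock register. */
static int64_t g_fakeCounterNs;
static int64_t ReadCounterNs(void)        { return g_fakeCounterNs; }
static void    WriteCounterNs(int64_t v)  { g_fakeCounterNs = v; }

/* Coarse time adjustment: read the counter, add the offset, write it back.
 * On real hardware this is less precise than the fine method because the
 * read and the write are separated by a variable amount of time. */
static void CoarseAdjustClockTime(int64_t offsetNs)
{
    WriteCounterNs(ReadCounterNs() + offsetNs);
}
```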
Adjust the Rate of an Interface’s Local Clock
If the hardware allows its clock frequency to be modified, the driver must set the RTND_CAPABILITY_CLOCK.SetClockRateSupported field to TRUE and expose the RtndSetClockRate function.
Controlling Special Functions of a Device
A driver can expose special functions that are not covered by the NL2 API. This can be useful in the following cases:
- To control a hardware-specific function.
- To control a function made for debug purposes only.
- To control a function that is not supported by the NL2.
If your driver must expose special functions, it must implement function RtndControlDeviceSpecialFunction, define the different function codes and their associated data structures, and share them with the application. Please refer to the NL2NICDriverSample for an example of how to do that.
When an application calls Rtnl2ControlDeviceSpecialFunction, the NL2 switches to the context of its own process and calls the RtndControlDeviceSpecialFunction function of the driver, without interpreting the passed function code or the function data.
Configure the Credit-Based Shaper on a Transmit Queue
When an interface supports the Credit-Based Shaper (CBS), the driver must export the RtndSetCbsParams function, which allows the NL2 to configure the bandwidth (called IdleSlope in the IEEE standard) allocated to the CBS on a given Transmit Queue.
The NL2 will use the RtndSetCbsParams only on Transmit Queues where the CBS has been enabled (see the RTND_SETTING_ID_TX_QUEUE_SCHEDULING setting).
At startup, the driver must set the bandwidth used by each Transmit Queue with CBS enabled to 100%. This means the queue's bandwidth is not limited until the NL2 explicitly sets the desired bandwidth by calling RtndSetCbsParams.
The NL2 can call this function at any time after the interface starts, and the driver must apply the new settings immediately before the function returns.
Notes:
- If the driver claims support for the CBS on an interface, it must support it on at least one of its Transmit Queues. It doesn’t have to support it in all the Transmit Queues. If the NL2 requests to enable the CBS on a Transmit Queue that doesn’t support it, or on more Transmit Queues than the hardware supports, the driver must fail the RtndStartInterface function.
- Depending on the hardware, the granularity of the bandwidth setting can be greater than 1 Kbps. In that case, the driver must silently round the passed value up. For example, if the granularity is 16 Kbps and the requested value is 12500 Kbps, the driver must set the bandwidth to 12512 Kbps.
- If the requested bandwidth is greater than the current link speed, the driver must perform as if the requested bandwidth were equal to the link speed. The driver must make that adjustment automatically whenever the link's speed changes. For example, if the requested bandwidth is 150000 Kbps, the driver must restrict the actual bandwidth to 100000 Kbps on a 100 Mbps link but revert it to 150000 Kbps if the link speed changes to 1 Gbps.
- The value of the bandwidth that the NL2 passes in RTND_CBS_PARAMS.RawBandwidthKbps is the raw occupied bandwidth on the wire. It accounts for the 14-byte Ethernet header, the 4-byte VLAN Tag (if any), the payload, the 4-byte FCS, AND the 20-byte gap after the frame (the minimum space a sender must insert between the FCS of a frame and the Ethernet header of the next frame).
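The rounding and clamping rules in the notes above can be sketched as a small helper. The function name and the clamp-then-round order are assumptions (the real entry point is RtndSetCbsParams, which receives the value in RTND_CBS_PARAMS.RawBandwidthKbps):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the bandwidth the driver would actually
 * program, given the requested value, the hardware granularity, and the
 * current link speed (all in Kbps). */
static uint32_t CbsEffectiveBandwidthKbps(uint32_t requestedKbps,
                                          uint32_t granularityKbps,
                                          uint32_t linkSpeedKbps)
{
    /* Clamp to the current link speed (re-applied on every speed change). */
    if (requestedKbps > linkSpeedKbps)
        requestedKbps = linkSpeedKbps;
    /* Silently round up to the hardware granularity. */
    if (granularityKbps > 1) {
        uint32_t rem = requestedKbps % granularityKbps;
        if (rem != 0)
            requestedKbps += granularityKbps - rem;
    }
    return requestedKbps;
}
```

With a 16 Kbps granularity, a request of 12500 Kbps yields 12512 Kbps, matching the example in the notes.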
Monitor the Egress Timestamps Captured by an Interface
When an interface supports Egress Timestamping, the driver is responsible for monitoring the availability of those Egress Timestamps.
In general, NICs use hardware interrupts to inform the driver when a new Egress Timestamp is available, but the driver can also implement a polling operation to monitor Egress Timestamps.
When the driver detects the availability of a new Egress Timestamp, it must call RTND_CALLBACKS.NotifyEgressTimestamp to inform the NL2. The driver is not expected to extract the Egress Timestamp immediately (see below).
Note: The callback function must be called from within the NL2 process.
Inside the callback, the NL2 notes that a new Egress Timestamp is available and wakes up its main thread. When the main thread wakes up, the NL2 calls RtndExtractLastTxTimestamp. At this point, the driver is expected to read the NIC registers to extract the last Egress Timestamp for a given Transmit Queue.
Extracting an Egress Timestamp has 2 side effects:
- Marks the current Egress Timestamp as invalid.
- Allows the hardware to override the Egress Timestamp with a new one.
Configure the Operating Mode of an Interface
Some NICs can alter their nominal operating mode, described by a set of TRUE or FALSE settings.
The NL2 supports two of these settings:
- Promiscuous mode: When this setting is enabled, the hardware does not drop a packet because its destination MAC address is unknown. If the interface supports this setting, the driver must export the RtndSetPromiscuousMode function.
- PassBadFrames mode: When this setting is enabled, the hardware does not drop a packet because it has Layer 1 or Layer 2 errors. If the interface supports this setting, the driver must export the RtndSetPassBadFramesMode function.
By default, at startup, the driver must set the interface to operate in nominal mode. All the above settings must be turned off.
Note: Support for the above settings is optional. The NL2 will never try to modify a setting not supported by the underlying driver. See RTND_FEATURE_INTERFACE_MODES.
Configure the Multicast Hash Filter of an Interface
Most NIC hardware provides a hash table to filter out undesired Multicast frames early on the reception path.
The NL2 maintains an interface-specific linked list of RTND_MULTICAST_ENTRY structures which contain the Multicast MAC addresses that should pass the Multicast Hash Filter. Every time the content of this list is modified, the NL2 calls the RtndSetMulticastFilter function and provides a pointer to the head of the new linked list of RTND_MULTICAST_ENTRY structures. The driver is responsible for updating the hardware hash table accordingly.
The NL2 can also request the driver to let ALL Multicast frames pass. Instead of providing a list of RTND_MULTICAST_ENTRY structures, the NL2 sets the bPassAllMulticast parameter of the RtndSetMulticastFilter function to TRUE.
By default, at startup, the driver must clear the Multicast Hash Filter to prohibit the reception of all Multicast frames.
Note: If the hardware lacks a Multicast Hash Filter, the driver should allow all Multicast addresses to pass (same result as a theoretical Multicast Hash Filter of size 0).
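Rebuilding the hardware hash table from the NL2's linked list can be sketched as below. The entry layout is a simplified stand-in for RTND_MULTICAST_ENTRY, and the hash function is a deliberately naive assumption; real NICs typically hash a CRC of the MAC address, per the datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for RTND_MULTICAST_ENTRY (assumed layout). */
typedef struct _MC_ENTRY {
    struct _MC_ENTRY *Next;
    uint8_t MacAddress[6];
} MC_ENTRY;

/* Hypothetical hash: fold the address bytes into a 6-bit bucket
 * (64-entry table). Real hardware defines its own hash. */
static unsigned HashMacAddress(const uint8_t mac[6])
{
    unsigned h = 0;
    for (int i = 0; i < 6; i++)
        h ^= mac[i];
    return h & 0x3F;
}

/* Rebuild the 64-bit hash-table bitmap from the NL2's linked list.
 * With bPassAllMulticast, every Multicast frame must pass. */
static uint64_t BuildMulticastHashTable(const MC_ENTRY *head, int passAll)
{
    if (passAll)
        return ~0ULL;          /* let every Multicast frame pass */
    uint64_t table = 0;        /* empty list => all Multicast blocked */
    for (; head != NULL; head = head->Next)
        table |= 1ULL << HashMacAddress(head->MacAddress);
    return table;              /* driver then writes this to the NIC */
}
```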
Configure the Interrupt Moderation
Some NICs support a feature to slow down the rate of interrupts generated by the hardware. This is done by specifying an interval in nanoseconds. When the feature is enabled, no more than one interrupt is generated per interval.
In MSI-X mode, there is one interval value per MSI-X Message. In Line-Based and MSI mode, there is a single interval value for the whole interface.
The NL2 configures interrupt moderation by calling RtndSetInterruptModeration.
By default, at startup, interrupt moderation must be turned off by the driver.
Note: Support for interrupt moderation is optional. When it’s supported, the interval value can't exceed a given limit. The NL2 will never try to use a non-supported feature or a non-supported value. See RTND_FEATURE_INTERRUPT.
Configure the EtherType Hardware Dispatcher
The NL2 views the EtherType Hardware Dispatcher as a fixed-size table. Each entry within this table has the following structure:
- Enabled
- EtherType
- RxQueueIndex
When Enabled is FALSE, the entry is inactive. When Enabled is TRUE, the entry is active, and the hardware forwards all incoming frames with the EtherType into the Receive Queue specified by RxQueueIndex.
Incoming frames that don't match any Hardware Dispatcher rule, neither EtherType, nor PCP, nor UdpPort, are forwarded to Receive Queue 0.
The NL2 guarantees that it never enables different entries with the same EtherType value. It also guarantees that the RxQueueIndex value in each enabled entry corresponds to an existing and enabled Receive Queue.
By default, at startup, the driver must disable all the entries of all Hardware Dispatchers, including the EtherType, PCP, and UDP Port. This means that all received frames go to Receive Queue 0.
The NL2 uses two functions provided by the driver to configure the EtherType Hardware Dispatcher:
- RtndSetDispatcherEtherTypeEntry: updates the content of an EtherType Hardware Dispatcher table entry.
- RtndGetDispatcherEtherTypeEntry: reads the content of an EtherType Hardware Dispatcher table entry.
Note: The driver is responsible for maintaining a copy of the EtherType Hardware Dispatcher table in memory. The RtndGetDispatcherEtherTypeEntry function must NOT access any hardware register; it must read and return the content saved in memory.
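The shadow-table pattern from the note above can be sketched as follows. The entry layout mirrors the three fields described in this section, but the table size, function names, and return convention are assumptions; the real entry points are RtndSetDispatcherEtherTypeEntry and RtndGetDispatcherEtherTypeEntry (the same pattern applies to the PCP and UDP Port dispatchers).

```c
#include <assert.h>
#include <stdint.h>

/* Simplified dispatcher entry (Enabled, EtherType, RxQueueIndex). */
typedef struct {
    int      Enabled;
    uint16_t EtherType;
    uint32_t RxQueueIndex;
} ETYPE_ENTRY;

#define ETYPE_TABLE_SIZE 8  /* size is hardware-specific; 8 is an assumption */

/* In-memory copy of the hardware table, maintained by the driver. */
static ETYPE_ENTRY g_shadowTable[ETYPE_TABLE_SIZE];

/* Set: update the shadow copy, then program the hardware. */
static int SetEtherTypeEntry(unsigned index, const ETYPE_ENTRY *entry)
{
    if (index >= ETYPE_TABLE_SIZE)
        return -1;
    g_shadowTable[index] = *entry;
    /* ...hardware dispatcher register write would go here... */
    return 0;
}

/* Get: must NOT access any hardware register; return the saved copy. */
static int GetEtherTypeEntry(unsigned index, ETYPE_ENTRY *entry)
{
    if (index >= ETYPE_TABLE_SIZE)
        return -1;
    *entry = g_shadowTable[index];
    return 0;
}
```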
Configure the PCP Hardware Dispatcher
The NL2 views the PCP Hardware Dispatcher as a fixed-size table. Each entry within this table has the following structure:
- Enabled
- Pcp
- RxQueueIndex
When Enabled is FALSE, the entry is inactive. When Enabled is TRUE, the entry is active, and the hardware forwards all incoming frames with a VLAN Tag with the specified Pcp into the Receive Queue specified by RxQueueIndex.
Incoming frames that don't match any Hardware Dispatcher rule, neither EtherType nor PCP nor UdpPort, are forwarded to Receive Queue 0.
The NL2 guarantees that it never enables different entries with the same Pcp value. It also guarantees that the RxQueueIndex value in each enabled entry corresponds to an existing and enabled Receive Queue.
By default, at startup, the driver must disable all the entries of all Hardware Dispatchers, including the EtherType, PCP, and UDP Port. This means that all received frames go to Receive Queue 0.
The NL2 uses two functions provided by the driver to configure the PCP Hardware Dispatcher:
- RtndSetDispatcherPcpEntry: updates the content of a PCP Hardware Dispatcher table entry.
- RtndGetDispatcherPcpEntry: reads the content of a PCP Hardware Dispatcher table entry.
Note: The driver is responsible for maintaining a copy of the PCP Hardware Dispatcher table in memory. The RtndGetDispatcherPcpEntry function must NOT access any hardware register; it must read and return the content saved in memory.
Configure the UDP Port Hardware Dispatcher
The NL2 views the UDP Port Hardware Dispatcher as a fixed-size table. Each entry within this table has the following structure:
- Enabled
- UdpPort
- RxQueueIndex
When Enabled is FALSE, the entry is inactive. When Enabled is TRUE, the entry is active, and the hardware forwards all incoming UDP frames with the UdpPort into the Receive Queue specified by RxQueueIndex.
Incoming frames that don't match any Hardware Dispatcher rule, neither EtherType nor PCP nor UdpPort, are forwarded to Receive Queue 0.
The NL2 guarantees that it never enables different entries with the same UdpPort value. It also guarantees that the RxQueueIndex value in each enabled entry corresponds to an existing and enabled Receive Queue.
By default, at startup, the driver must disable all the entries of all Hardware Dispatchers, including the EtherType, PCP, and UDP Port. This means that all received frames go to Receive Queue 0.
The NL2 uses two functions provided by the driver to configure the UDP Port Hardware Dispatcher:
- RtndSetDispatcherUdpPortEntry: updates the content of a UDP Port Hardware Dispatcher table entry.
- RtndGetDispatcherUdpPortEntry: reads the content of a UDP Port Hardware Dispatcher table entry.
Note: The driver is responsible for maintaining a copy of the UDP Port Hardware Dispatcher table in memory. The RtndGetDispatcherUdpPortEntry function must NOT access any hardware register; it must read and return the content saved in memory.
Allocate Frame Buffers
Frame buffers hold the content of an Ethernet frame and can be accessed by the hardware. Transmit frame buffers are meant to be used by Transmit Queues, while receive frame buffers are meant to be used by Receive Queues.
As specific hardware may have specific requirements in terms of memory alignment and memory location used by the frame buffers, the driver allocates and frees those buffers.
All buffers are allocated at the driver's initiative in the RtndStartInterface function. The workflow is as follows:
- Within the RtndStartInterface function, the driver calls RTND_CALLBACKS.CreateTxBuffers or RTND_CALLBACKS.CreateRxBuffers.
- Within the CreateTxBuffers or CreateRxBuffers function, the NL2 allocates the buffer headers and calls RtndAllocateTxFrameDataBuffers or RtndAllocateRxFrameDataBuffers to allocate the frame buffers.
- When the driver returns, the NL2 completes its buffer header management.
- When the NL2 returns, the driver can use the allocated NL2 Buffers.
Note: The NL2 will never call RtndAllocateTxFrameDataBuffers or RtndAllocateRxFrameDataBuffers outside of the CreateTxBuffers or CreateRxBuffers functions.
All Buffers are freed at the driver's initiative, either by RtndStartInterface or RtndStopInterface. The workflow is as follows:
- Within RtndStartInterface or RtndStopInterface, the driver calls RTND_CALLBACKS.DestroyTxBuffers or RTND_CALLBACKS.DestroyRxBuffers.
- Within the DestroyTxBuffers or DestroyRxBuffers function, the NL2 marks the buffer as pending free. If all buffers of the same set are marked as pending free, the NL2 calls RtndFreeTxFrameDataBuffers or RtndFreeRxFrameDataBuffers to free all the frame buffers of the set in one shot.
- When the driver returns, the NL2 frees the associated buffer headers.
- When the NL2 returns, it indicates to the driver that the resources associated with the NL2 Buffers have been freed. It should not reference them again.
Note: The NL2 will never call RtndFreeTxFrameDataBuffers or RtndFreeRxFrameDataBuffers outside of the DestroyTxBuffers or DestroyRxBuffers functions.
Get Buffers for Transmit
At startup, in the RtndStartInterface function, the driver must allocate Transmit Buffers for its Transmit Queues and manage them in pools of available Transmit Buffers. There is one pool for each enabled Transmit Queue.
When RtndStartInterface returns, the driver owns all the allocated Transmit Buffers. Before it can transmit (see Transmit Frames), the NL2 calls RtndGetTxBuffers to get one or more Transmit Buffers from the driver. Ownership of those buffers is then transferred to the NL2, which can use them to prepare the content of Ethernet frames and submit them to the driver for transmission. Whenever a Transmit Buffer is submitted, its ownership is transferred back to the driver; whenever a Transmit Buffer is extracted, its ownership is transferred to the NL2.
The NL2 transfers all the Transmit Buffers it owns back to the driver before the RtndStopInterface function is called. This is done either by submitting the buffers for transmission with RtndSubmitTxBuffer, or by returning them with RtndReturnTxBuffers. The difference is that RtndReturnTxBuffers does not perform transmission.
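The ownership accounting above can be sketched with simple counters. The pool size and function names are assumptions for illustration; the real transfers happen through RtndGetTxBuffers, RtndSubmitTxBuffer, and RtndReturnTxBuffers.

```c
#include <assert.h>

/* Ownership-tracking sketch for one Transmit Queue's buffer pool. */
static int g_driverOwned = 8; /* after RtndStartInterface, driver owns all */
static int g_nl2Owned;

/* RtndGetTxBuffers-style operation: transfer up to n buffers to the NL2.
 * Returns how many buffers were actually handed out. */
static int GetTxBuffers(int n)
{
    if (n > g_driverOwned)
        n = g_driverOwned;   /* can only hand out what the driver owns */
    g_driverOwned -= n;
    g_nl2Owned += n;
    return n;
}

/* RtndSubmitTxBuffer or RtndReturnTxBuffers: either way, ownership of the
 * buffer goes back to the driver (only the former also transmits). */
static int GiveBackTxBuffer(void)
{
    if (g_nl2Owned == 0)
        return -1;
    g_nl2Owned--;
    g_driverOwned++;
    return 0;
}
```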
Transmit Frames
Transmit operations may be executed in the context of the application process to meet the user applications' high throughput and low-latency requirements. The NL2 provides the required locking mechanisms to prevent race conditions (see below).
For each enabled Transmit Queue of each interface, the driver must support the following operations:
- Attach the Transmit Queue to the current process using RtndAttachTxQueue. This operation allows the driver to allocate memory and open handles to kernel objects in the context of the user application process before being requested to send frames from that user application process. This operation must not access any register of the NIC.
- Insert one buffer in the DMA ring using RtndSubmitTxBuffer.
- Trigger the transmission of the buffer(s) inserted in the step above using RtndApplyTxBuffers.
- Extract one buffer from the DMA ring using RtndExtractTxBuffer.
- Detach the Transmit Queue from the current process using RtndDetachTxQueue. This operation is to release the potential resources allocated by RtndAttachTxQueue. This operation must not access any register of the NIC. Please note that the user application process may be terminated before it calls RtndDetachTxQueue. This should not be an issue as all resources allocated in RtndAttachTxQueue (memory and handles) are automatically released by the real-time subsystem as soon as the user application process terminates.
Although a given Transmit Queue can be used by multiple processes simultaneously, the NL2 guarantees the following workflow is always followed for a given process:
- Call RtndAttachTxQueue.
- Call RtndSubmitTxBuffer, RtndApplyTxBuffers, and RtndExtractTxBuffer as many times as needed, in any order. The NL2 guarantees these three functions are never simultaneously called by two different threads of any attached processes.
- Either call RtndDetachTxQueue or get terminated. After calling RtndDetachTxQueue, the calling process is no longer allowed to call RtndSubmitTxBuffer, RtndApplyTxBuffers, and RtndExtractTxBuffer until it calls RtndAttachTxQueue again (thus returning to step 1 above).
Based on the above workflow, the following rules are derived:
- The driver MUST be prepared for RtndAttachTxQueue and RtndDetachTxQueue to be called multiple times from multiple processes (potentially simultaneously).
- The driver MUST NOT assume that the Transmit DMA ring is empty when RtndAttachTxQueue or RtndDetachTxQueue is called.
- The driver does NOT have to implement its own lock to protect the accesses to the Transmit DMA ring. The RtndSubmitTxBuffer, RtndApplyTxBuffers, and RtndExtractTxBuffer functions are guaranteed not to be called simultaneously.
- When RtndSubmitTxBuffer, RtndApplyTxBuffers, RtndExtractTxBuffer or RtndDetachTxQueue is called, the driver can assume that the calling process has previously called RtndAttachTxQueue.
IMPORTANT: Although the driver must allow calls to RtndSubmitTxBuffer, RtndApplyTxBuffers, and RtndExtractTxBuffer from multiple processes for the same queue, real-time applications optimized for low latency call those functions from a single thread only, always running on the same processor. The driver should be implemented so that the best performance is achieved when RtndSubmitTxBuffer, RtndApplyTxBuffers, and RtndExtractTxBuffer are always called from the same processor.
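The split between submitting and applying buffers can be modeled with a minimal ring sketch. The ring layout and names below are hypothetical; the point is that RtndSubmitTxBuffer only stages descriptors, while a single RtndApplyTxBuffers call publishes every staged buffer with one batched doorbell write, keeping register accesses off the per-frame fast path:

```c
/* Hypothetical Transmit DMA ring model. Real descriptor formats and
 * doorbell registers are NIC-specific. */
#define RING_SIZE 8

typedef struct {
    void    *desc[RING_SIZE];
    unsigned tail;     /* next free descriptor slot */
    unsigned hw_tail;  /* value last written to the (simulated) doorbell */
} tx_ring;

/* Models RtndSubmitTxBuffer: stage one buffer in the ring; no NIC
 * register access happens here. */
static void ring_submit(tx_ring *r, void *buf) {
    r->desc[r->tail % RING_SIZE] = buf;
    r->tail++;
}

/* Models RtndApplyTxBuffers: one doorbell write covers every buffer
 * staged since the previous apply. */
static void ring_apply(tx_ring *r) {
    r->hw_tail = r->tail;  /* real hardware: write tail to the doorbell register */
}
```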
Restart a Transmit Queue
Some NICs can stop a Transmit Queue and restart it later without affecting other queues. This is useful, as it allows all buffers submitted to the queue to be canceled without waiting for them to be consumed, which can take time or, when the link is down, never complete.
NICs that can stop and restart a Transmit Queue should set RTND_CAPABILITY_TX.DynamicStopStartTxQueueSupported to TRUE and implement these functions:
A call to RtndStopTxQueue is expected to stop the hardware queue and cancel any buffer that was already submitted but not yet consumed. A call to RtndStartTxQueue is expected to restore the hardware queue to the state it was in just after the interface was started.
The NL2 guarantees that the following functions are called only between a call to RtndStartTxQueue and RtndStopTxQueue:
The NL2 also guarantees that it returns to the driver, by calling RtndReturnTxBuffers, all the Transmit Buffers that it owns before calling RtndStopTxQueue.
Process Transmit Interrupts
By default, at startup, the driver must disable all Transmit interrupts on all Transmit Queues of all interfaces.
After startup, the NL2 may dynamically request the driver to enable Transmit interrupts on some of its Transmit Queues. This is done by the RtndEnableTxInterrupts function, which enables or disables the Transmit interrupt. This function is always called in the context of the NL2 process, but the NL2 doesn’t guarantee it is always called from the same thread.
Note: The driver MUST be prepared to receive a call to RtndEnableTxInterrupts from the NL2 process while it is executing RtndSubmitTxBuffer, RtndApplyTxBuffers, or RtndExtractTxBuffer from a user process, for the same Transmit Queue.
A Transmit interrupt eventually triggers the execution of the associated IST in the context of the NL2 process. Depending on the hardware and the configured interrupt mode (MSI, MSI-X), the driver may be unable to distinguish which Transmit Queue caused the interrupt. In that case, the driver must call RTND_CALLBACKS.NotifyTxInterrupt once for each Transmit Queue that has Transmit interrupts enabled and that may be the cause of the interrupt. If the driver knows which Transmit Queue caused the interrupt, it must call RTND_CALLBACKS.NotifyTxInterrupt for that queue only.
Note: The NL2 tolerates calls to RTND_CALLBACKS.NotifyTxInterrupt for a Transmit Queue which it requested to disable Transmit interrupts. The driver should avoid doing this as it unnecessarily consumes CPU resources.
Even though the NL2 does NOT have any locking mechanism to prevent an IST from being executed at the same time as another function of the driver, this should not be an issue because the IST should be designed to be as simple as:
- Read the interrupt status register.
- Acknowledge the interrupts.
- If it's a Transmit interrupt:
Call RTND_CALLBACKS.NotifyTxInterrupt for each Transmit Queue that may have caused this interrupt and which has Transmit interrupts enabled.
- If it's a Receive interrupt:
For each Receive Queue which may have caused this interrupt, and which has Receive interrupts enabled, call RTND_CALLBACKS.NotifyRxInterrupt.
- If it's a Link Status Change interrupt:
Call RTND_CALLBACKS.NotifyLinkStatusChange.
- If it's an Egress Timestamp Available interrupt:
Call RTND_CALLBACKS.NotifyEgressTimestamp.
If a particular driver needs to access queue-specific registers during the IST, it must have locking mechanisms to avoid race conditions with other driver functions, especially RtndSubmitTxBuffer, RtndApplyTxBuffers, and RtndExtractTxBuffer.
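The IST structure listed above can be sketched in C. The status bits, the simulated status register, and the per-queue enable checks below are hypothetical stand-ins: real bit layouts are NIC-specific, and the Notify* callbacks come from the RTND_CALLBACKS structure received in RtndInitDriver (modeled here as counters so the flow can be exercised):

```c
#include <stdbool.h>

/* Hypothetical interrupt cause bits. */
#define IRQ_TX        0x1u
#define IRQ_RX        0x2u
#define IRQ_LINK      0x4u
#define IRQ_TIMESTAMP 0x8u

static unsigned status_register;  /* simulated interrupt status register */
static int tx_notifications, rx_notifications;
static int link_notifications, ts_notifications;

/* Read the interrupt status register, then acknowledge the interrupts. */
static unsigned read_and_ack_interrupt_status(void) {
    unsigned status = status_register;
    status_register = 0;
    return status;
}

/* Placeholders for the driver's per-queue interrupt-enable bookkeeping. */
static bool tx_irq_enabled(int queue) { (void)queue; return true; }
static bool rx_irq_enabled(int queue) { (void)queue; return true; }

/* The IST body, following the steps listed above. */
static void interrupt_service_thread(int num_tx_queues, int num_rx_queues) {
    unsigned status = read_and_ack_interrupt_status();

    if (status & IRQ_TX)                        /* Transmit interrupt */
        for (int q = 0; q < num_tx_queues; q++)
            if (tx_irq_enabled(q))
                tx_notifications++;             /* RTND_CALLBACKS.NotifyTxInterrupt */

    if (status & IRQ_RX)                        /* Receive interrupt */
        for (int q = 0; q < num_rx_queues; q++)
            if (rx_irq_enabled(q))
                rx_notifications++;             /* RTND_CALLBACKS.NotifyRxInterrupt */

    if (status & IRQ_LINK)                      /* Link Status Change interrupt */
        link_notifications++;                   /* RTND_CALLBACKS.NotifyLinkStatusChange */

    if (status & IRQ_TIMESTAMP)                 /* Egress Timestamp Available */
        ts_notifications++;                     /* RTND_CALLBACKS.NotifyEgressTimestamp */
}
```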
Receive Frames
Like transmit operations, the receive operations presented in this section must be very fast to reach the user applications' high throughput and/or low-latency requirements. To achieve this, they may be executed in the user application context. The NL2 provides the required locking mechanisms to avoid race conditions (see below).
For each enabled Receive Queue of each interface, the driver must support the following operations:
- Attach the Receive Queue to the current process using RtndAttachRxQueue. This operation allows the driver to allocate memory or open handles to kernel objects in the context of the user application process before being requested to receive frames from that user application process. This operation must not access any register of the NIC.
- Insert one buffer in the DMA ring using RtndSubmitRxBuffer.
- Trigger the fetching of the buffer(s) inserted in the step above using RtndApplyRxBuffers.
- Extract one buffer from the DMA ring using RtndExtractRxBuffer.
- Detach the Receive Queue from the current process using RtndDetachRxQueue. This operation is to release the potential resources allocated by RtndAttachRxQueue. This operation must not access any register of the NIC. Please note that the user application process may be terminated before it can call RtndDetachRxQueue. This should not be an issue as all resources allocated in RtndAttachRxQueue (memory and handles) are automatically released by the real-time subsystem as soon as the user application process terminates.
For each process (NL2 or user application) using the Receive Queue, the NL2 guarantees the following workflow is always followed:
- Call RtndAttachRxQueue.
- Call RtndSubmitRxBuffer, RtndApplyRxBuffers, and RtndExtractRxBuffer as many times as needed, in any order. The NL2 guarantees that two different threads of any attached processes never call these three functions simultaneously.
- Either call RtndDetachRxQueue or get terminated. After calling RtndDetachRxQueue, the calling process is no longer allowed to call RtndSubmitRxBuffer, RtndApplyRxBuffers, and RtndExtractRxBuffer until it calls RtndAttachRxQueue again (thus returning to step 1 above).
From the driver's perspective, when a user application process is terminated, the effect is the same as calling RtndDetachRxQueue.
Based on the above workflow, the following rules are derived:
- The driver MUST be prepared for RtndAttachRxQueue and RtndDetachRxQueue to be called multiple times from multiple processes (potentially simultaneously).
- The driver MUST NOT assume that the Receive DMA ring is empty when RtndAttachRxQueue or RtndDetachRxQueue is called.
- The driver does NOT have to implement a lock to protect accesses to the Receive DMA ring. The RtndSubmitRxBuffer, RtndApplyRxBuffers, and RtndExtractRxBuffer functions are guaranteed not to be called simultaneously.
- When RtndSubmitRxBuffer, RtndApplyRxBuffers, RtndExtractRxBuffer, or RtndDetachRxQueue is called, the driver can assume that the calling process has previously called RtndAttachRxQueue.
Note: Although the above guarantees should be sufficient, the driver may use additional custom locks to protect its data from race conditions. However, this should be avoided as much as possible, as it degrades the overall application performance.
IMPORTANT: Although the driver must allow calls to RtndSubmitRxBuffer, RtndApplyRxBuffers, and RtndExtractRxBuffer from multiple processes for the same queue, real-time applications optimized for low latency call those functions from a single thread only, always running on the same processor. The driver should be implemented so that the best performance is achieved when RtndSubmitRxBuffer, RtndApplyRxBuffers, and RtndExtractRxBuffer are always called from the same processor.
Restart a Receive Queue
Some NICs can stop and restart a Receive Queue without affecting other queues. NICs that support this feature should set RTND_CAPABILITY_RX.DynamicStopStartRxQueueSupported to TRUE and implement these functions:
A call to RtndStopRxQueue is expected to stop the hardware queue and prevent it from filling any further buffers. A call to RtndStartRxQueue is expected to restore the hardware queue to the state it was in just after the interface was started.
The NL2 guarantees that the following functions are called only between a call to RtndStartRxQueue and RtndStopRxQueue:
The NL2 also guarantees that it submits to the driver, by calling RtndSubmitRxBuffer, all the Receive Buffers it owns before calling RtndStopRxQueue.
Process Receive Interrupts
By default, at startup, the driver must disable all Receive interrupts on all Receive Queues of all interfaces.
After startup, the NL2 may dynamically request the driver to enable Receive interrupts on some of its Receive Queues. This is done by the RtndEnableRxInterrupts function, which enables or disables the Receive interrupt. This function is always called in the context of the NL2 process, but the NL2 doesn’t guarantee that it is always called from the same thread.
Note: The driver MUST be prepared to receive a call to RtndEnableRxInterrupts from the NL2 process while it is executing RtndSubmitRxBuffer, RtndApplyRxBuffers, or RtndExtractRxBuffer from a user process, for the same Receive Queue.
The occurrence of a Receive interrupt eventually triggers the execution of the associated IST in the context of the NL2 process. Depending on the hardware and the configured interrupt mode (MSI, MSI-X), the driver may not be able to distinguish which Receive Queue caused the interrupt. In that case, the driver must call RTND_CALLBACKS.NotifyRxInterrupt once for each Receive Queue that has Receive interrupts enabled and that may be the cause of the interrupt. If the driver knows which Receive Queue caused the interrupt, it must call RTND_CALLBACKS.NotifyRxInterrupt for that queue only.
Note: The NL2 tolerates calls to RTND_CALLBACKS.NotifyRxInterrupt for a Receive Queue which it requested to disable Receive interrupts. The driver should avoid doing this as it unnecessarily consumes CPU resources.
Though the NL2 does NOT have any locking mechanism to prevent an IST from being executed at the same time as another function of the driver, this should not be an issue because the IST should be designed to be as simple as:
- Read the interrupt status register.
- Acknowledge the interrupts.
- If it's a Transmit interrupt:
For each Transmit Queue which may have caused this interrupt, and which has Transmit interrupts enabled, call RTND_CALLBACKS.NotifyTxInterrupt.
- If it's a Receive interrupt:
For each Receive Queue which may have caused this interrupt, and which has Receive interrupts enabled, call RTND_CALLBACKS.NotifyRxInterrupt.
- If it's a Link Status Change interrupt:
Call RTND_CALLBACKS.NotifyLinkStatusChange.
- If it's an Egress Timestamp Available interrupt:
Call RTND_CALLBACKS.NotifyEgressTimestamp.
If a driver needs to access queue-specific registers during the IST, it must have locking mechanisms to avoid race conditions with other driver functions, especially RtndSubmitRxBuffer, RtndApplyRxBuffers, and RtndExtractRxBuffer.
Real-time Interrupt Masking
As described in the previous sections, the NL2 may request the driver to dynamically enable or disable interrupts on specific Tx and Rx Queues. These requests to the driver are not expected to be processed in real-time.
The Subsystem must also control when a device is allowed to send interrupts to the processor, for these reasons:
- To prevent the device from re-sending interrupts before the previously sent interrupts are handled by the processor.
- To prevent the device from interrupting the processor while it executes a high-priority thread (with a priority higher than the priority of the IST).
Subsystem requests to enable or disable interrupts can occur several thousand times per second and must be processed in real-time to avoid impacting system performance.
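A masking request on this fast path should therefore reduce to a single register write, with no allocation or locking. A minimal sketch, with a simulated mask register standing in for the NIC-specific MMIO access:

```c
#include <stdbool.h>

static unsigned irq_mask;  /* simulated interrupt mask/enable register */

/* Enable or disable the given interrupt cause bits with one register
 * write; cheap enough to be called thousands of times per second by the
 * Subsystem's real-time masking requests. */
static void set_device_interrupts(bool enable, unsigned bits) {
    if (enable)
        irq_mask |= bits;   /* real hardware: write an interrupt-enable register */
    else
        irq_mask &= ~bits;  /* real hardware: write an interrupt-mask register */
}
```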