LastUpdate: 2026-05-19, Author: HAO022
This blog post is part of a series of foundational technical research articles on AI Infrastructure.
Concept
Basic Concept: A Memory Window (MW) is an RDMA resource, allocated by the user, that enables a remote node to access a local memory region. (Because it serves remote access only, an MW has an R_KEY but no L_KEY.) Each MW is bound to an already registered Memory Region (MR). Compared to an MR, an MW provides more flexible control over remote access permissions. An MW can be roughly understood as a subset of an MR: a single MR can have multiple MWs carved out of it, and each MW can have its own permission set. The relationship is illustrated in the following diagram:

MR/MW Permission Relationship: When binding a Memory Window, a Consumer can request any combination of remote access rights for the Window. However, if the associated Region does not have local write access enabled and the Consumer requests remote write or remote atomic access for the Window, the Channel Interface must return an error either at bind time or access time.
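The permission rule above can be sketched as a small check. This is a minimal model, not libibverbs code: the `DEMO_*` constants and `demo_mw_access_allowed` are hypothetical names introduced here, and a real program would use the `IBV_ACCESS_*` flags from `<infiniband/verbs.h>`. The sketch also assumes the error is reported at bind time, although the rule allows it to surface at access time instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ibv_access_flags bits; a real program
 * would use the IBV_ACCESS_* constants from <infiniband/verbs.h>. */
enum demo_access {
    DEMO_LOCAL_WRITE   = 1u << 0,
    DEMO_REMOTE_WRITE  = 1u << 1,
    DEMO_REMOTE_READ   = 1u << 2,
    DEMO_REMOTE_ATOMIC = 1u << 3,
};

/* Returns true when the rights requested for the window are compatible
 * with the region: remote write or remote atomic access on the MW
 * requires local write access on the MR. */
bool demo_mw_access_allowed(uint32_t mr_access, uint32_t mw_access)
{
    const uint32_t known = DEMO_LOCAL_WRITE | DEMO_REMOTE_WRITE |
                           DEMO_REMOTE_READ | DEMO_REMOTE_ATOMIC;
    /* Precondition of this sketch: only the four modelled bits are used. */
    assert((mr_access & ~known) == 0 && (mw_access & ~known) == 0);

    const uint32_t needs_local_write = DEMO_REMOTE_WRITE | DEMO_REMOTE_ATOMIC;
    if ((mw_access & needs_local_write) && !(mr_access & DEMO_LOCAL_WRITE))
        return false;
    return true;
}
```

For example, `demo_mw_access_allowed(DEMO_REMOTE_READ, DEMO_REMOTE_ATOMIC)` is rejected because the region lacks local write access, while a read-only window on the same region is accepted.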
Use Cases:
- To grant and revoke remote access rights to a registered region dynamically, avoiding the performance penalty of repeatedly deregistering and re-registering the MR.
- To grant different remote access rights within the same registered memory region to different remote agents.
Implementation
User-space Implementation: libibverbs API:
struct ibv_mw *ibv_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type);

struct ibv_mw {
    uint32_t rkey;
    enum ibv_mw_type type;
    ...
};
Parameters and Return Values:
- ibv_mw_type: For IBV_MW_TYPE_1, the MW can only be bound to an MR via the ibv_bind_mw method. For IBV_MW_TYPE_2, binding occurs via ibv_post_send.
- rkey: The Remote Key generated by the HCA.
The kernel driver implementation is straightforward: it assembles the parameters and dispatches the command to the hardware.
Kernel Implementation:
mlx5_ib_alloc_mw constructs the MKey Context and submits it to the HCA.
1. MKey Context construction:
- `mkc.free`
- `mkc.pd`
- `mkc.umr_en`
- `mkc.en_rinval`
2. `mlx5_ib_create_mkey` submits the request and returns the `rkey`.
Memory Window Binding, 1:
// Bind a memory window to a region
int ibv_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind);

struct ibv_mw_bind {
    uint64_t wr_id;                    /* User defined WR ID */
    unsigned int send_flags;           /* Use ibv_send_flags */
    struct ibv_mw_bind_info bind_info; /* MW bind information */
};

struct ibv_mw_bind_info {
    struct ibv_mr *mr;            /* The MR to bind the MW to */
    uint64_t addr;                /* The address the MW should start at */
    uint64_t length;              /* The length (in bytes) the MW should span */
    unsigned int mw_access_flags; /* Access flags to the MW. Use ibv_access_flags */
};
mw_access_flags:
- IBV_ACCESS_REMOTE_WRITE
- IBV_ACCESS_REMOTE_READ
- IBV_ACCESS_REMOTE_ATOMIC
- IBV_ACCESS_ZERO_BASED
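int mlx5_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind)
{
    ...
    // Initialize WR
    struct ibv_send_wr wr = {};
    wr.opcode = IBV_WR_BIND_MW;
    wr.wr_id = mw_bind->wr_id;
    wr.send_flags = mw_bind->send_flags;
    wr.bind_mw.bind_info = mw_bind->bind_info;
    wr.bind_mw.mw = mw;
    // Each bind uses a fresh rkey: the key tag (low 8 bits) is incremented
    wr.bind_mw.rkey = ibv_inc_rkey(mw->rkey);
    // Submit bind memory window request via post-send
    _mlx5_post_send(qp, &wr, ...);
    // On success, store the new rkey back into the MW
    mw->rkey = wr.bind_mw.rkey;
    ...
}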
Memory Window Binding, 2: A more flexible approach involves assembling the WQE manually and submitting the request via ibv_post_send.
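When the bind request is assembled by hand, the caller also has to supply the rkey the window will carry after the bind. libibverbs provides the `ibv_inc_rkey` helper for this, which bumps the low 8 bits (the key tag) of an existing rkey. The function below is an illustrative stand-alone version of that computation under a hypothetical name, `demo_inc_rkey`, so the arithmetic can be read without the library; real code should call `ibv_inc_rkey` itself.

```c
#include <stdint.h>

/* Illustrative re-implementation (hypothetical name): replace the low
 * 8 bits of an rkey, the key tag, with tag + 1, wrapping from 0xff back
 * to 0x00, while leaving the upper 24 bits unchanged. */
uint32_t demo_inc_rkey(uint32_t rkey)
{
    const uint32_t tag_mask = 0xffu;
    uint32_t new_tag = (rkey + 1u) & tag_mask;

    return (rkey & ~tag_mask) | new_tag;
}
```

Because only the tag changes, an rkey handed out by an earlier bind no longer matches after the next bind, which is what makes revoking a remote peer's access cheap.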

Observations
- Must the MW addr fall within the MR's range? Yes. A Memory Window (MW) must be bound within an already registered Memory Region (MR). It cannot exist independently, nor can it extend beyond the boundaries of the MR. The MR is the fundamental unit for memory registration and address translation (Virtual Address → Physical Address), and the RDMA NIC hardware relies on the MR to validate and access memory. The MW itself performs no memory registration; its function is solely to refine access control and restrict the accessible range on top of the MR. This mechanism is primarily used for the dynamic granting and revocation of remote access permissions: registering and deregistering MRs frequently incurs significant performance overhead, whereas MW bind operations are relatively lightweight.
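The containment requirement described above can be written as a simple range check. This is a sketch under stated assumptions: `struct demo_region` and `demo_window_fits` are hypothetical names for illustration, and in practice the check is enforced by the provider and the HCA rather than by application code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a registered region. */
struct demo_region {
    uint64_t addr;   /* start address of the MR */
    uint64_t length; /* length of the MR in bytes */
};

/* Returns true when [addr, addr + length) lies entirely inside the region.
 * Uses subtraction so that addr + length is never computed and therefore
 * cannot overflow. */
bool demo_window_fits(const struct demo_region *mr, uint64_t addr, uint64_t length)
{
    assert(mr != NULL);

    if (addr < mr->addr)
        return false;            /* window starts before the region */

    uint64_t offset = addr - mr->addr;
    if (offset > mr->length)
        return false;            /* window starts past the region's end */

    return length <= mr->length - offset;
}
```

With a region starting at 0x1000 and spanning 0x1000 bytes, a window at 0x1800 of length 0x800 fits, while the same window with length 0x801 runs past the end of the region and is rejected.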
