MCTP stands for Management Component Transport Protocol. Although it is designed for management traffic inside a server, it is also designed as a network protocol. As such, the Linux implementation makes use of the socket interface and the kernel plumbing for dealing with network protocols. To support a new transport mechanism, you write a device driver specific to that transport that fulfills the struct net_device contract, which includes implementing the functions in struct net_device_ops.
Before I write a full implementation, I want to write something that only echoes a packet back to the sender. This mimics the behavior of the mctp-echo server in the user tools, and can make use of the mctp-req echo client.
A network driver moves traffic from the operating system to the network device or in the opposite direction. My proof-of-concept code is going to do half of each operation: it is going to take a packet from the operating system, munge it, and send it back to the operating system. In order to perform this operation, I have to implement the .ndo_start_xmit function from struct net_device_ops. Here’s my implementation:
static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
{
    struct mctp_hdr *hdr;
    struct control_work_data *control_write_data;
    u8 orig_dest;
    u8 orig_src;

    /* Log the header fields and dump the packet as it arrives. */
    printk(KERN_INFO "mctp_pcc_tx called\n");
    hdr = mctp_hdr(skb);
    printk(KERN_INFO "Version: %x\n", hdr->ver);
    printk(KERN_INFO "Source: %x\n", hdr->src);
    printk(KERN_INFO "Destination: %x\n", hdr->dest);
    printk(KERN_INFO "flags_seq_tag: %x\n", hdr->flags_seq_tag);
    pkt_hex_dump(skb);

    /* Swap source and destination so the reply goes back to the sender. */
    orig_dest = hdr->dest;
    orig_src = hdr->src;
    __mctp_cb(skb);
    hdr = mctp_hdr(skb);
    hdr->src = orig_dest;
    hdr->dest = orig_src;
    hdr->flags_seq_tag += 8;

    /*
     * This is a single static block instead of a kmalloc'ed block.
     * kmalloc(sizeof(struct control_work_data), GFP_KERNEL) was triggering
     * a check during interrupt context and dumping the stack.
     */
    control_write_data = &control_write_data_static;
    control_write_data->skb = skb;
    INIT_WORK(&control_write_data->work, temp_echo_handler);
    init_waitqueue_head(&control_write_data->wq);
    schedule_work(&control_write_data->work);
    return NETDEV_TX_OK;
}
A few observations.
- I am assuming that this code is supposed to consume and free the sk_buff that is handed in. Instead of doing that, I reuse it for the return journey. I’d like to confirm that this is correct, but I have not gotten a double-free message, so I think I am right.
- I swap the source and destination values, as that is where the response message needs to go. The original destination value lets me set a routing rule to ensure that the packet gets sent to the new device driver. The recvfrom call in the mctp-req program looks for a packet from this address.
- There is a flag in the header that marks a packet as a request or a response. Unless the reply is marked as a response, the packet does not get delivered back to the mctp-req process. That is what this line does: hdr->flags_seq_tag += 8; It should be an explicit bitwise operation rather than an addition, but this proves the concept.
- Since .ndo_start_xmit is called in interrupt context, we want to exit as quickly as possible. The essential work here is scheduling the work that cannot be done in interrupt context. That work is done in the function attached to the control_write_data block, which also carries the sk_buff. I could have put more of this work into the callback function, but nothing I did here would block.
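The header changes described above can be sketched in user space. This is only an illustration, not the driver code: the demo_ names are hypothetical, and the bit layout of flags_seq_tag is an assumption taken from my reading of the MCTP base specification, where bit 3 (value 8) is the tag-owner bit, set on a request and clear on a response. Under that assumption, the explicit form of the operation is a mask that clears the bit.

```c
#include <stdint.h>

/* Hypothetical user-space mirror of the 4-byte MCTP transport header. */
struct demo_mctp_hdr {
    uint8_t ver;
    uint8_t dest;
    uint8_t src;
    uint8_t flags_seq_tag;
};

/*
 * Assumed layout of flags_seq_tag (not confirmed by the original post):
 * SOM (bit 7), EOM (bit 6), sequence (bits 5:4), tag owner (bit 3),
 * message tag (bits 2:0).
 */
#define DEMO_FLAG_SOM 0x80u
#define DEMO_FLAG_EOM 0x40u
#define DEMO_FLAG_TO  0x08u
#define DEMO_TAG_MASK 0x07u

/* Turn a request header into a reply header in place. */
static void demo_make_reply(struct demo_mctp_hdr *hdr)
{
    uint8_t orig_dest = hdr->dest;

    /* The reply goes back to whoever sent the request. */
    hdr->dest = hdr->src;
    hdr->src = orig_dest;

    /* Clear the tag-owner bit explicitly instead of adding 8. */
    hdr->flags_seq_tag &= (uint8_t)~DEMO_FLAG_TO;
}
```

For a single-packet request whose flags_seq_tag is 0xC9 (SOM, EOM, tag owner, tag 1), the mask yields 0xC1, while adding 8 yields 0xD1. Bit 3 ends up clear in both cases, but the addition gets there by carrying into the sequence field, which would explain why the shortcut works here.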
The workqueue function that sends the packet back to the kernel looks like this:
void temp_echo_handler(struct work_struct *work)
{
    struct control_work_data *my_data =
        container_of(work, struct control_work_data, work);

    pr_info("%s\n", __func__);
    if (my_data->skb != NULL) {
        pr_info("%s calling netif_rx\n", __func__);
        netif_rx(my_data->skb);
    }
    pr_info("%s after netif_rx\n", __func__);
}
Aside from the logging, the function calls netif_rx(my_data->skb), which sends the packet back into the kernel’s network processing code. The packet ends up on a queue to be delivered to the appropriate waiting socket.
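The handler only receives a pointer to the embedded work member, and container_of is what gets it back to the enclosing structure that holds the sk_buff. Here is a minimal user-space sketch of that pointer arithmetic. The demo_ types are hypothetical stand-ins for struct work_struct and struct control_work_data, and the macro is a simplified version of the kernel's, without its type checking.

```c
#include <stddef.h>

/* Simplified user-space version of the kernel's container_of macro. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct demo_work {
    int pending;
};

/* Hypothetical stand-in for struct control_work_data. */
struct demo_control_work_data {
    void *skb;              /* payload pointer carried with the work item */
    struct demo_work work;  /* embedded member handed to the handler */
};

/* Recover the enclosing structure from a pointer to its embedded member. */
static struct demo_control_work_data *demo_from_work(struct demo_work *work)
{
    return demo_container_of(work, struct demo_control_work_data, work);
}
```

Given a struct demo_control_work_data d, calling demo_from_work(&d.work) returns &d, so the handler can reach d.skb even though it was only passed the address of d.work.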
To set up a test for this code, I run the following shell script:
#!/bin/sh
insmod mctp_pcc.ko
mctp route add 9 via mctpipcc2d
mctp address add 10 dev mctpipcc2d
mctp link set mctpipcc2d up
With this run, I can see the MCTP device using the mctp tool:
# mctp link
dev lo index 1 address 0x00:00:00:00:00:00 net 1 mtu 65536 up
dev mctpipcc2d index 9 address 0x(no-addr) net 1 mtu 68 up
To avoid fooling myself, I need to ensure that the mctp-echo server is not running, or it will respond for me.
A test run looks like this:
# ./obj/mctp-req eid 9
req: sending to (net 1, eid 9), type 1
-> sent
req: message from (net 1, eid 9) type 1 len 1: 0x00..
I added the "-> sent" message for debugging, but otherwise this is the vanilla code linked above.