I hope you are doing well. I am working with eBPF and tc on the egress side to add a PPPoE header to forwarded and locally generated packets. Because of GSO/TSO, I observe packets larger than the MTU, with values such as skb->gso_size = 1452, skb->gso_segs = 3, and skb->len = 4410. My concern is whether the MAC and PPPoE headers that I insert in the tc program will be properly included in each fragment generated by GSO/TSO.
Since fragmentation happens at the IP layer and PPPoE operates at the link layer, I understand that each fragment should be a complete link-layer frame. However, I'm wondering whether GSO will replicate the MAC and PPPoE headers in each fragment. If so, how should I handle the length field of the PPPoE header for each fragment? Or is there a specific approach I should use to ensure that each fragment is processed correctly?
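To make the length question concrete, here is a small user-space C sketch of the arithmetic only. It assumes the observed packet is Ethernet + IPv4 + TCP with no IP or TCP options, which matches the reported numbers (3 × 1452 + 14 + 20 + 20 = 4410). The helper names are hypothetical, and the sketch says nothing about whether the kernel rewrites the PPPoE header during segmentation; it only shows what value each resulting frame would need to carry.

```c
#include <assert.h>
#include <stdint.h>

/* Header sizes assumed for illustration: IPv4 and TCP without options. */
enum {
    ETH_HDR_LEN   = 14, /* Ethernet header */
    PPPOE_HDR_LEN = 6,  /* ver/type, code, session_id, length */
    PPP_PROTO_LEN = 2,  /* PPP protocol field, counted in the PPPoE length */
    IPV4_HDR_LEN  = 20,
    TCP_HDR_LEN   = 20,
};

/* Payload bytes carried by segment `idx` (0-based) of a GSO packet. */
uint32_t seg_payload_len(uint32_t total_payload, uint32_t gso_size, uint32_t idx)
{
    assert(gso_size > 0);
    uint32_t start = idx * gso_size;
    if (start >= total_payload)
        return 0;
    uint32_t left = total_payload - start;
    return left < gso_size ? left : gso_size;
}

/* PPPoE length value one segment needs: PPP protocol field + IP packet. */
uint16_t pppoe_len_for_segment(uint32_t seg_payload)
{
    return (uint16_t)(PPP_PROTO_LEN + IPV4_HDR_LEN + TCP_HDR_LEN + seg_payload);
}

/* Frame size (without FCS) of one segment after PPPoE encapsulation. */
uint32_t frame_len_for_segment(uint32_t seg_payload)
{
    return ETH_HDR_LEN + PPPOE_HDR_LEN + pppoe_len_for_segment(seg_payload);
}
```

With the values above, the TCP payload is 4410 - 54 = 4356 bytes, each of the three segments carries 1452 bytes, and each frame would need a PPPoE length of 1494 rather than the 4398 that results from using the full skb->len.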
Here’s the eBPF code I’m working with:
SEC("tc")
int pppoe_egress(struct __sk_buff *skb) {
#define BPF_LOG_TOPIC "pppoe_egress"
    void *data_end = (void *)(long)skb->data_end;
    void *data = (void *)(long)skb->data;

    u32 pkt_sz = skb->len - 14;
    if (pkt_sz > pppoe_mtu) {
        bpf_log_info("egress package too large size: %u", pkt_sz);
        return TC_ACT_SHOT;
    }

    struct ethhdr *eth = (struct ethhdr *)(data);
    if ((void *)(eth + 1) > data_end) {
        bpf_log_info("package size smaller than ethhdr");
        return TC_ACT_SHOT;
    }

    if (eth->h_proto != ETH_IPV4 && eth->h_proto != ETH_IPV6) {
        bpf_log_info("egress eth proto is error: %x", eth->h_proto);
        return TC_ACT_PIPE;
    }

    u32 offset = 14;
    u8 protocol = 0;
    u16 mss_value = 0;
    u16 ppp_proto = ETH_PPP_IPV4;
    // DECAP support since Linux kernel 6.3
    u64 adj_room_flag = BPF_F_ADJ_ROOM_ENCAP_L3_IPV4;

    if (eth->h_proto == ETH_IPV6) {
        ppp_proto = ETH_PPP_IPV6;
        adj_room_flag = BPF_F_ADJ_ROOM_ENCAP_L3_IPV6;
        struct ipv6hdr *iph6;
        if (VALIDATE_READ_DATA(skb, &iph6, offset, sizeof(*iph6))) {
            return TC_ACT_SHOT;
        }
        protocol = iph6->nexthdr;
        offset = offset + 40;
        mss_value = pppoe_mtu - 40 - 20;
    } else {
        struct iphdr *iph;
        if (VALIDATE_READ_DATA(skb, &iph, offset, sizeof(*iph))) {
            return TC_ACT_SHOT;
        }
        protocol = iph->protocol;
        offset = offset + (iph->ihl * 4);
        mss_value = pppoe_mtu - (iph->ihl * 4) - 20;
    }

    if (protocol == IPPROTO_TCP) {
        mss_clamp(skb, offset, mss_value);
    }

    u16 l2_proto = bpf_htons(0x8864);
    bpf_skb_store_bytes(skb, 12, &l2_proto, sizeof(u16), 0);

    int result = bpf_skb_adjust_room(skb, 8, BPF_ADJ_ROOM_MAC, adj_room_flag);
    if (result) {
        bpf_log_info("egress adjust room error %d", result);
        return TC_ACT_SHOT;
    }

    struct pppoe_header pppoe = {
        .version_and_type = 0x11,
        .code = 0x00,
        .session_id = bpf_htons(session_id),
        .length = bpf_htons(pkt_sz + 2),
        .protocol = ppp_proto,
    };
    bpf_skb_store_bytes(skb, sizeof(struct ethhdr), &pppoe, sizeof(struct pppoe_header), 0);

    return TC_ACT_PIPE;
#undef BPF_LOG_TOPIC
}
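One thing the code above makes visible: the size check at the top compares the whole skb->len against pppoe_mtu. For the GSO packet described earlier, 4410 - 14 = 4396 would exceed the MTU and take the TC_ACT_SHOT branch, even though each resulting segment would fit. The following is a plain C model of a per-segment check, not eBPF code, so that it can be compiled and run in user space. The function name and parameters are hypothetical, and it assumes pppoe_mtu is 1492 and that the IP and TCP headers are repeated in every segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a packet fits the PPPoE MTU.
 *   skb_len   - total length, including the 14-byte Ethernet header
 *   gso_size  - payload bytes per segment, 0 for a non-GSO packet
 *   l3l4_len  - IP + TCP header bytes assumed to repeat in every segment
 *   pppoe_mtu - largest IP packet the PPPoE session can carry
 */
bool fits_pppoe_mtu(uint32_t skb_len, uint32_t gso_size,
                    uint32_t l3l4_len, uint32_t pppoe_mtu)
{
    const uint32_t eth_len = 14;

    assert(skb_len >= eth_len);

    /* Non-GSO packet: the whole IP packet must fit. */
    if (gso_size == 0)
        return skb_len - eth_len <= pppoe_mtu;

    /* GSO packet: only the size of each future segment matters. */
    return gso_size + l3l4_len <= pppoe_mtu;
}
```

Under these assumptions, the packet from the question (gso_size 1452, 40 bytes of IP and TCP headers) fits exactly, since 1452 + 40 = 1492.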
Any suggestions for handling this properly with tc in such a case? Thank you so much for your help!
Best regards