TL;DR
A developer built an experimental Linux kernel module that lets the USB4/Thunderbolt ports on consumer AMD mini PCs present themselves as InfiniBand devices, enabling high-speed RDMA communication. If it proves stable, it could make AI clustering at home far more affordable.
A developer has created an experimental Linux kernel module that enables ordinary USB4/Thunderbolt ports on AMD mini PCs to emulate InfiniBand devices, achieving high-speed RDMA communication suitable for AI workloads. This development could significantly lower the barrier for high-performance AI clustering at home.
The module makes USB4/Thunderbolt ports appear to Linux as InfiniBand devices, enabling RDMA (Remote Direct Memory Access) over consumer hardware. The developer tested it on 128 GB Strix Halo mini PCs and reported roughly 95 Gb/s of bidirectional throughput with one-way latency of about 7 microseconds. These figures are presented as comparable to enterprise-grade InfiniBand and well ahead of conventional Ethernet and soft-RoCE configurations.
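To put the two reported figures in perspective, a common first-order model treats the time to send one message as a fixed latency plus the payload size divided by the bandwidth. The Python sketch below applies that model. The 47.5 Gb/s per-direction rate is an assumption (half of the roughly 95 Gb/s bidirectional figure, assuming a symmetric link), and `transfer_time_s` is a hypothetical helper, not part of the project.

```python
def transfer_time_s(payload_bytes: int, gbit_per_s: float, latency_us: float) -> float:
    """Estimate one-way message time as latency plus serialization time."""
    bits = payload_bytes * 8
    return latency_us * 1e-6 + bits / (gbit_per_s * 1e9)


# Reported one-way latency from the article.
LATENCY_US = 7.0
# ASSUMPTION: per-direction rate is half of the ~95 Gb/s bidirectional figure.
PER_DIRECTION_GBPS = 47.5

for size in (4 * 1024, 1024**2, 1024**3):
    t = transfer_time_s(size, PER_DIRECTION_GBPS, LATENCY_US)
    print(f"{size:>12d} bytes: {t * 1e6:,.1f} us")
```

Under this model, small messages are dominated by the fixed latency, while large tensors are dominated by bandwidth. Real transfers would also include protocol and software overheads that the model ignores.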
The tests included tensor-parallel inference and FSDP workloads across two consumer boxes, with large reductions in step time. For example, a Gemma 3 27B LoRA FSDP step dropped from 1,359 seconds over Ethernet to 126 seconds over USB4 RDMA, more than a tenfold reduction. The setup relied on experimental kernel modules loaded on Linux, and the developer stresses that this is research code, not production-ready software, with no guarantee of stability or support.
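The size of that improvement follows directly from the two reported step times. A minimal Python sketch of the arithmetic, using only the numbers quoted above (the variable names are illustrative):

```python
# Step times reported for a Gemma 3 27B LoRA FSDP step.
ethernet_step_s = 1359.0
usb4_rdma_step_s = 126.0

# Speedup factor and relative reduction in step time.
speedup = ethernet_step_s / usb4_rdma_step_s
reduction_pct = (1 - usb4_rdma_step_s / ethernet_step_s) * 100

print(f"speedup: {speedup:.1f}x, step time reduced by {reduction_pct:.1f}%")
# prints "speedup: 10.8x, step time reduced by 90.7%"
```

This compares a single reported step on one model and one pair of machines, so it should not be read as a general speedup figure.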
Why It Matters
The work demonstrates that high-performance, low-cost AI clustering may be possible on consumer hardware. If it proves scalable and stable, it could open up AI training and inference setups that previously depended on expensive enterprise networking. It also shows how far existing consumer interfaces such as Thunderbolt and USB4 can be pushed, which may influence future hardware and software designs.

Background
InfiniBand is a high-speed networking technology widely used in data centers for AI training and HPC workloads, but it is costly and complex for individual users. Software approaches such as soft-RoCE, which emulates RDMA over ordinary Ethernet, aim to approximate it, yet they remain limited in throughput and latency.
This project builds on the concept of RDMA, which allows direct memory access between computers, reducing latency and increasing throughput. The developer’s approach leverages the USB4/Thunderbolt ports found on many AMD mini PCs, typically used for peripherals, to emulate InfiniBand devices. The effort aligns with broader trends of making high-performance computing more accessible outside traditional data centers, although it is still in experimental stages.
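On Linux, RDMA devices registered with the kernel's InfiniBand subsystem are normally exposed under /sys/class/infiniband. Assuming the experimental module registers its device the same way (the source does not confirm this), a short Python check along these lines would show whether any RDMA device is visible. `list_rdma_devices` is a hypothetical helper name used only for illustration.

```python
from pathlib import Path


def list_rdma_devices(sysfs_root: str = "/sys/class/infiniband") -> list[str]:
    """Return the names of RDMA devices listed under the given sysfs directory."""
    root = Path(sysfs_root)
    # The directory is absent when no RDMA driver has registered a device.
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


devices = list_rdma_devices()
if devices:
    print("RDMA devices found:", ", ".join(devices))
else:
    print("No RDMA devices visible under /sys/class/infiniband")
```

An empty result only means that no device is registered at that moment. It says nothing about whether the hardware could support one.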
“This is experimental research code that makes consumer USB4/Thunderbolt ports behave like InfiniBand devices, enabling high-speed RDMA for AI workloads.”
— Developer behind the project
“If scalable and stable, this could open new avenues for affordable AI clustering at home, bypassing expensive enterprise networking gear.”
— Researcher or observer

What Remains Unclear
It is not yet clear whether this approach can be stabilized for regular use, whether it will scale to larger clusters, or whether hardware limitations will prevent broader adoption. The project remains experimental, and performance may vary across hardware and configurations. Further testing and development are needed to assess its practical viability.

What’s Next
Next steps include ongoing testing for stability, scalability, and compatibility with various hardware setups. The developer plans to refine the kernel modules and possibly seek community feedback or collaboration. If successful, future work could involve formalizing the software, improving robustness, and exploring integration with existing AI frameworks.

Key Questions
Can this technology be used in production now?
No, this is experimental research code and is not suitable for production environments. It is intended for testing and development purposes only.
What hardware is needed to try this setup?
The setup targets AMD mini PCs with USB4/Thunderbolt ports; the developer tested on 128 GB Strix Halo mini PCs. It also depends on custom Linux kernel modules and specific software configurations.
How does this compare to traditional InfiniBand or Ethernet for AI workloads?
Preliminary tests show roughly 95 Gb/s of bidirectional throughput with one-way latency of about 7 microseconds, well ahead of the Ethernet and soft-RoCE configurations tested. However, these are experimental results and may not reflect real-world stability or scalability.
Is this approach likely to become widely available?
Uncertain. As an experimental project, it requires further development, testing, and community validation before it could be considered for wider use.
Source: Hacker News