We have a piece of physical hardware that includes a set of up to 8 modules, each of which is a SoC combining an FPGA and an ARM core running Linux. Currently, all of these modules are connected to a single PC over Ethernet, and on the module, all of the communication is handled using the conventional TCP/IP stack in Linux, with data from the FPGA being routed through the software layer.
This system works well and is reliable, and having the system use normal networking protocols and standards has proven to have real benefits. The problem with the current system is that we're producing data from the FPGA at a much higher data rate than can be handled by the processor (both because the processor is not fast enough, and because the Ethernet hardware is 1 Gb/s).
We're proposing to add a second data stream from the FPGA which is passed over a dedicated SFP+ optical link. Obviously, there are many ways in which this could be implemented. Our current thinking is that, at the FPGA level, we simply pack the data into Ethernet frames and fire it out over the optical link at 10 Gb/s. Each module's optical link is then plugged into one of the 8 lanes available on a dual-QSFP+ commodity networking card.
Flow control is then handled over the existing software TCP link at the application layer, i.e. data is only sent if the QSFP+ NIC has sufficient space in its buffers.
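For concreteness, the application-layer flow control we have in mind is essentially a credit scheme over the TCP control link. This is just a sketch with illustrative names, not a real API:

```python
# Sketch of credit-based flow control over the existing TCP control link.
# The receiving PC periodically reports freed buffer space back to the
# module; the module only transmits on the 10G link while it holds
# unused credits. All names here are illustrative.

class CreditFlowControl:
    def __init__(self, buffer_frames: int):
        self.credits = buffer_frames          # frames the receiver can accept

    def grant(self, freed_frames: int) -> None:
        """Receiver freed buffer space; reported back over the TCP link."""
        self.credits += freed_frames

    def try_send(self, n_frames: int) -> bool:
        """Sender side: transmit only if the receiver has room."""
        if self.credits >= n_frames:
            self.credits -= n_frames
            return True
        return False                          # hold off until more credit arrives


fc = CreditFlowControl(buffer_frames=1024)
assert fc.try_send(1000) is True
assert fc.try_send(100) is False    # only 24 credits remain
fc.grant(200)
assert fc.try_send(100) is True
```

The point of the sketch is only that the sender never has more frames in flight than the receiver has advertised buffer space for, which is the property the "don't lose packets" question below depends on.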
We can detect loss of data, but we'd hope this would happen vanishingly rarely.
The question, then, is what the pitfalls of such an architecture might be:
Are Ethernet frames passed one-per-link in the case of QSFP+ (i.e. each frame is not striped across the 4 links), so that each of the 4 QSFP+ links can be driven from a different module?
Can I just receive the raw Ethernet frames on the receiving side, under either Linux or Windows?
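On Linux, at least, this appears to be possible with an `AF_PACKET` raw socket (which needs root or `CAP_NET_RAW`); on Windows one would typically go through a capture driver such as Npcap instead. A minimal sketch of what we imagine the receive side looking like, with a frame parser that can be exercised on a hand-built frame:

```python
import socket
import struct

def parse_ethernet_header(frame: bytes):
    """Split a raw layer-2 frame into (dst_mac, src_mac, ethertype, payload)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst, src, ethertype, frame[14:]

def capture(interface: str = "eth0"):
    # AF_PACKET delivers whole layer-2 frames; ETH_P_ALL (0x0003) matches
    # every ethertype. Requires root or CAP_NET_RAW, Linux only.
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(0x0003))
    sock.bind((interface, 0))
    while True:
        frame = sock.recv(65535)
        dst, src, ethertype, payload = parse_ethernet_header(frame)
        print(f"{src.hex()} -> {dst.hex()} "
              f"type=0x{ethertype:04x} {len(payload)} bytes")

# Exercise the parser on a synthetic broadcast frame using the
# "local experimental" ethertype 0x88B5:
dst = b"\xff" * 6
src = bytes.fromhex("021122334455")
frame = struct.pack("!6s6sH", dst, src, 0x88B5) + b"payload"
assert parse_ethernet_header(frame) == (dst, src, 0x88B5, b"payload")
```

Presumably the FPGA side would tag its frames with a fixed ethertype like the experimental 0x88B5 above so the receiver can filter them cheaply, but whether that is the right approach is part of the question.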
Is flow control based on buffer space good enough to reliably avoid losing packets (assuming the architecture described)?
Are certain bits/manufacturers of kit better suited to this kind of thing?