Beyond POSIX - Adventures in Alternative Networking APIs

What You'll Learn

1. Hear about alternative networking APIs.

2. Learn about Aeron, an asynchronous messaging system, and what it has to offer in terms of performance.

Over the last 10 years it has become increasingly apparent that the venerable BSD sockets API is not sufficient for the performance demands of modern high-performance systems, nor sufficient to drive the latest and fastest networking hardware. This talk is a survey of some of the alternative APIs available on various platforms. We will look at some of the APIs in fine detail and discuss some of the implementation pitfalls. We will also look at how the architecture of a higher-level application (in this case Aeron Messaging) can affect how effectively these APIs can be used.

What is the work that you are doing today? 

I started working as an independent consultant about a year and a half ago.  Previously I worked for a company called LMAX Exchange for about 10 years, and my primary focus is on high-performance distributed systems.

Around half my time is spent as a core contributor to the Aeron project, which is a high-speed messaging bus. It is an implementation of a layer 4 protocol that has mechanisms for persistent storage, and it also has a system for clustered applications that uses Raft-style consensus, which is quite interesting. I tend to do a lot of consulting around that. The majority of the customers tend to be in capital markets: financial exchanges, market makers, trading platforms, brokers, that type of thing. The languages are a mix. Java and C tend to be the ones that crop up the most, with occasional use of C++ and C#.

What are your goals for the talk?

My talk is about alternative networking APIs, about stepping away from the bog standard. I use BSD sockets in C as the canonical example, but I talk about additional ones like DPDK, AF_PACKET and io_uring. It's a very quick overview of a few of these. What I'm hoping people will get out of it depends on the audience. For some it might just be ideas on how to better design applications, because I show how some of these tools achieve improved performance, but to get there you often have to design your software in a way that takes advantage of them. I will talk a little bit about that. I also talk about the specifics of some of these tools and APIs. Hopefully, some in the audience will say, "OK, if I structure my software better, I could make better use of the network infrastructure", whereas others might find that a specific API can be used in a particular system to gain a performance advantage. So hopefully there are a couple of different levels at which attendees can gain a little more knowledge, or something useful they can apply to their own systems, and come away with a slightly broader perspective on the ways you can solve certain distributed computing problems.

Do people have to completely change their view of network programming? Are these APIs different from just read and write to a stream? 

Not completely changed, but certainly the biggest shift in getting decent performance out of the networking infrastructure is ensuring your application is asynchronous. If you have a synchronous application, very RPC-style, that's a difficult transition. That's the most significant transition I see. There are a lot of systems written that way [synchronous], and getting them to be fast involves making them asynchronous, which is a big mental shift for a lot of people. It's often very hard to change the application logic to work like that. For the systems I have worked on that are already asynchronous, moving to some of these APIs doesn't require much heavy lifting. It's often just a slight shift in API usage. To get some really interesting gains you have to go to lower levels, and there's extra work you have to do, especially if you're dealing with raw packets. You are no longer dealing with the abstraction of a UDP socket or TCP socket. You don't have ports and socket buffers to work with. There are some lower-level concepts you might have to be aware of, and extra implementation work you have to do, but the shift in effort is far less than moving from a synchronous system to an asynchronous one.
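
To make the synchronous versus asynchronous distinction concrete, here is a minimal sketch in Java of a non-blocking UDP receive that is polled rather than blocked on. It uses only the standard java.nio DatagramChannel API. The class and method names (NonBlockingUdpPoller, pollOnce, sendAndPoll) are hypothetical names chosen for illustration, and this is not how Aeron or any of the lower-level APIs mentioned in the talk are implemented.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: poll a non-blocking UDP channel instead of blocking in receive().
final class NonBlockingUdpPoller {

    // Polls the channel once. Returns the payload as text, or null if no datagram was waiting.
    static String pollOnce(DatagramChannel channel, ByteBuffer buffer) throws IOException {
        buffer.clear();
        if (channel.receive(buffer) == null) {
            return null; // nothing available; the caller is free to do other work
        }
        buffer.flip();
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }

    // Demo helper: sends one datagram over loopback, then polls until it arrives or the timeout expires.
    static String sendAndPoll(String message, long timeoutMs) throws IOException {
        try (DatagramChannel receiver = DatagramChannel.open();
             DatagramChannel sender = DatagramChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            receiver.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            receiver.configureBlocking(false);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(ByteBuffer.wrap(payload), receiver.getLocalAddress());

            ByteBuffer buffer = ByteBuffer.allocate(1500);
            long deadlineNs = System.nanoTime() + timeoutMs * 1_000_000L;
            while (deadlineNs - System.nanoTime() > 0) {
                String received = pollOnce(receiver, buffer);
                if (received != null) {
                    return received;
                }
                // A real event loop would service timers and other channels here.
                Thread.onSpinWait(); // requires Java 9 or later
            }
            return null; // timed out without receiving anything
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendAndPoll("ping", 1000));
    }
}
```

The point of the sketch is that the receive call never parks the thread: each pass through the loop either yields a message or returns immediately, so one thread can interleave network polling with other work.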

Java devs are used to using synchronous APIs. Has Java any good asynchronous APIs? 

Aeron is written in Java and C, and there are APIs for a bunch of languages: C++, C, C#, Java and a few others. Aeron is primarily asynchronous messaging. You can do that across a network or even just between threads and processes. It's got a very fast IPC mechanism. It's kind of a multicast style: a single writer publishes to multiple readers, each of which gets a copy of a given message. We've written some queue libraries, which are part of a little commons-style library that comes with Aeron. And I'm trying to think of some others. There are a lot of things like Rx (Reactive Extensions). That's another interesting type of tool that fits into this. In terms of networking, there are things like Netty, which has a couple of wrappers around some of the APIs I talk about, like io_uring, for example. Those are available via Netty, but they're not available via the standard Java library. So there's a lot around. The standard library is OK. Java's not natively asynchronous, like Erlang or Pony are, but there are still plenty of libraries that can get you pretty far. And I've worked with a lot of companies. LMAX, the exchange, when I was there, and some of the other customers I work with who are building the faster systems, are asynchronous already and just use messaging tools to make that work for themselves.

A lot of it is less about the language. It's a lot more about modeling, about understanding the model that you use to build these sorts of systems. You really need to get your head around state machines. If you want to do an operation that spans a number of distributed calls, you're essentially building up some state. You emit an event, and when the response event comes back, you use it to potentially transition to another state. You also have to factor in things like time-outs and represent them as events. That's the mental model to use when building complex logic over multiple distributed calls.
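
The state machine model described here can be sketched in a few lines of Java. The scenario is an assumption made for illustration: an order that needs a risk check reply and then an exchange acknowledgement. All names (OrderRequest, State, Event, onEvent, onTick) are hypothetical and not taken from Aeron or any other library. The part worth noticing is that a time-out is delivered through the same onEvent path as a network reply.

```java
// Hypothetical illustration: one operation spanning two distributed calls, modelled as a state machine.
final class OrderRequest {
    enum State { AWAITING_RISK_CHECK, AWAITING_EXCHANGE_ACK, COMPLETED, FAILED }
    enum Event { RISK_APPROVED, RISK_REJECTED, EXCHANGE_ACKED, TIMEOUT }

    private final long timeoutMs;
    private long deadlineMs;
    private State state = State.AWAITING_RISK_CHECK;

    OrderRequest(long nowMs, long timeoutMs) {
        this.timeoutMs = timeoutMs;
        this.deadlineMs = nowMs + timeoutMs;
    }

    State state() {
        return state;
    }

    // Called by the event loop whenever a reply (or a time-out) is delivered.
    void onEvent(Event event, long nowMs) {
        switch (state) {
            case AWAITING_RISK_CHECK:
                if (event == Event.RISK_APPROVED) {
                    state = State.AWAITING_EXCHANGE_ACK;
                    deadlineMs = nowMs + timeoutMs; // restart the clock for the next call
                } else if (event == Event.RISK_REJECTED || event == Event.TIMEOUT) {
                    state = State.FAILED;
                }
                break;
            case AWAITING_EXCHANGE_ACK:
                if (event == Event.EXCHANGE_ACKED) {
                    state = State.COMPLETED;
                } else if (event == Event.TIMEOUT) {
                    state = State.FAILED;
                }
                break;
            default:
                break; // terminal states ignore late or duplicate events
        }
    }

    // Called periodically by the event loop; converts elapsed time into a TIMEOUT event.
    void onTick(long nowMs) {
        boolean terminal = state == State.COMPLETED || state == State.FAILED;
        if (!terminal && nowMs >= deadlineMs) {
            onEvent(Event.TIMEOUT, nowMs);
        }
    }

    public static void main(String[] args) {
        OrderRequest request = new OrderRequest(0, 100);
        request.onEvent(Event.RISK_APPROVED, 10);
        request.onTick(200); // no acknowledgement arrived in time
        System.out.println(request.state()); // prints FAILED
    }
}
```

Passing the current time in as a parameter, rather than reading a clock inside the class, keeps the state machine deterministic and easy to test.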


Michael Barker

Software Engineer & Independent Consultant at Ephemeris Consulting Ltd
Michael Barker is an independent software consultant, specialising in distributed systems, concurrency and high-performance systems. He has spent 20+ years working in a variety of industries and has spent most of the last decade working on low-latency financial exchanges. Michael is a...

Wednesday May 26 / 11:10AM EDT (40 minutes)

Track: Mechanical Sympathy - Developing for Modern Hardware
Topics: Hardware Considerations, Networking