Design of zmq.rs - the native ZeroMQ stack in Rust (1)


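This is an introduction to the design of the zmq.rs project. Since it is meant to be posted to the rust-dev / zeromq-dev mailing lists, I wrote it in English; a Chinese version will follow later.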

This is the first time I have formally written a blog post in English. I would like to use this first post to describe the whole design I have made for the zmq.rs project, as of the time of writing.

zmq.rs is a project that (re)implements the whole ZeroMQ stack in the Rust programming language. It all started on the zeromq-dev mailing list, when someone made such a proposal. There were some voices supporting the idea, but little progress was made. After following the development of the Rust programming language for a few months, I think it is time to start this project, now that most I/O operations have a timeout version and sockets are closable enough to "interrupt" a busy-waiting Task.

I'm a very "old" programmer, but still pretty new to Rust. Therefore, for the very first design, I tried to keep it as small as possible, and learned a lot from the standard libzmq implementation. There were actually a few blog posts in Chinese where I mostly recorded my learning process while trying to set up the very basic part of zmq.rs. However, the key problem of the design was already there back then:

How to correctly make use of Task in Rust.

At this point there are two major differences between C++ and Rust: 1. Rust doesn't have a working select() interface over file descriptors yet, and 2. Rust offers a lightweight task model, with ready-to-use cooperative concurrency candy, through the optional libgreen. As a heavy user of Gevent, it is natural for me to use hundreds of Rust tasks to achieve what select() does for asynchronous concurrency. It sounds ideal, but libgreen is optional. That is to say, with this design, a normal network application built with zmq.rs may use up to hundreds of operating system threads under the 1:1 model of libnative.

That doesn't sound very pleasant. Let's look at the other option: fully copy libzmq and implement whatever is missing in Rust. That would be quite fine for the 1:1 model, and the code could be quite readable, at least as readable as the C++ version. But what happens if the user switches to libgreen? Only a very few cooperative tasks would be created, executing event-driven code. That is really weird, because such coroutines are supposed to be driven by the event loop, not to run it: they are designed this way so that developers get a more convenient synchronous coding environment and can forget about the event loop. Under the hood, powered by libuv, libgreen schedules all the tasks cooperatively. To implement the select() interface, we would have to somehow expose a part of libuv's interface to the front, where we would again handle the scheduling explicitly on our own.

To my knowledge this might be the most reasonable way to have select() on libgreen. However, to me it creates a huge disharmony between libnative and libgreen, and it feels like a bad idea to unify 1:1 with M:N. I'll leave the discussion to future select() authors after Rust 1.0, and go for the former cooperative design, which requires less extra coding and seems to be more Rustic.
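
The cooperative design replaces select() with one task per blocking source, each forwarding into a shared channel. Here is a minimal sketch of that idea in present-day Rust. It assumes std::thread and std::sync::mpsc in place of the 2014 Task and channel APIs, and the names merge() and demo() are hypothetical, not part of zmq.rs. Note that std threads are 1:1, so this sketch carries exactly the thread-per-source cost discussed above.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

/// Merges several blocking sources into one receiver by dedicating a
/// thread to each source. Every item is tagged with its source index.
/// (Hypothetical helper for illustration only.)
pub fn merge<T: Send + 'static>(sources: Vec<Receiver<T>>) -> Receiver<(usize, T)> {
    let (tx, rx) = channel();
    for (index, source) in sources.into_iter().enumerate() {
        let tx = tx.clone();
        thread::spawn(move || {
            // Blocks on this one source only; ends when the source closes.
            for item in source {
                if tx.send((index, item)).is_err() {
                    break; // the merged receiver was dropped
                }
            }
        });
    }
    rx
}

/// Feeds two sources and returns everything seen on the merged channel.
pub fn demo() -> Vec<(usize, &'static str)> {
    let (a_tx, a_rx) = channel();
    let (b_tx, b_rx) = channel();
    let merged = merge(vec![a_rx, b_rx]);
    a_tx.send("from a").unwrap();
    b_tx.send("from b").unwrap();
    // Closing both sources lets the forwarding threads finish,
    // which in turn closes the merged channel.
    drop(a_tx);
    drop(b_tx);
    let mut events: Vec<_> = merged.iter().collect();
    events.sort(); // arrival order across threads is not deterministic
    events
}
```

The consumer only ever blocks on the single merged receiver, which is what makes the per-source tasks a substitute for select().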

Till now, I only have some very rough code written, waiting to be heavily discussed and changed. Using zmq.rs now looks like this:

let ctx = Context::new();

let mut rep = ctx.socket(zmq::REP);
rep.bind("tcp://127.0.0.1:12345");

let mut req = ctx.socket(zmq::REQ);
req.connect("tcp://127.0.0.1:12345");

let mut msg_sent = box super::Msg::new(4);
msg_sent.data.push_all([65u8, 66u8, 67u8, 68u8]);

req.msg_send(msg_sent);
println!(">>> {}", rep.msg_recv());

I'm still working on it, and the major part under active development is the socket. At the time of writing, ~~only~~ bind() and msg_recv() are partially working, together with connect() and msg_send(), so you can try it out against some other ZeroMQ implementation. The following text focuses on the design of the socket in zmq.rs.

~~There is some relational concurrency to take care of anyway, so select() is inevitable for a ZeroMQ socket instance. I made an abstraction, a trait named Endpoint, that hides any source that may bring changes to the socket instance behind a Rust channel. Yeah, I know it is a bad name that may confuse people; please suggest a better one. Each endpoint encapsulates a different task that communicates with the socket through a channel, while the socket instance, running in an individual task, simply selects over all the Receiver ends of the channels and calls in_event() of the corresponding endpoint instance on the arrival of a new message.~~

~~The Endpoint trait looks like this:~~

pub trait Endpoint {
    fn get_chan<'a>(&'a self) -> &'a Receiver<ZmqResult<SocketMessage>>;

    fn in_event(&mut self, msg: ZmqResult<SocketMessage>, socket: &mut SocketBase);

    ...
}

~~So after an Endpoint is registered to the socket, the socket will start polling on the Receiver offered by get_chan() for incoming messages, and call in_event() with each message received.~~

~~For example, TcpListener is an endpoint, feeding the socket with freshly connected TcpStreams. It has two parts: TcpListener, which implements Endpoint, holds the Receiver and lives in the socket task, while a private InnerTcpListener lives in a task of its own and holds the Sender end. InnerTcpListener listens on a TCP port forever, and sends any connected client back through the channel. On the other end, the socket instance detects an incoming message and passes it to TcpListener.in_event() in the same task; TcpListener then processes the new TcpStream and causes further changes. Thanks to the move semantics of Rust, there will be zero memory copy throughout all channels.~~

~~Because the socket instance needs to busy-wait on all its registered endpoints all the time, it is not possible for a user task to directly own such a socket instance: user code shouldn't block. Therefore I created a socket interface object to communicate with and operate the actual socket instance. Interestingly, the interface object is also an endpoint: similarly, it generates messages from user code and commits changes to the socket itself.~~

~~Let's take a look at a more complicated endpoint, the stream engine object which wraps e.g. a TcpStream.~~

Alright then, I didn't expect the repository to grow faster than this blog. After #10 the socket task design changed, because I realised I don't want the task to take care of incoming and outgoing messages: they should be connected directly to the socket object in user tasks.

The current design of the socket is much simpler. The user creates and owns socket objects. Socket objects connect to stream engines directly through duplex channels; a stream engine encapsulates the ZeroMQ connection to a peer, and each engine runs in a separate task. Receiving and sending messages are therefore done directly over these channels, and different types of sockets, like REP or REQ, may implement their own strategy for when to send or receive and which peer to use.
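
To make the duplex channel concrete, here is a minimal sketch in present-day Rust. It assumes std::thread and std::sync::mpsc in place of the 2014 Task and channel APIs. The Peer field layout, spawn_echo_engine() and the echo behaviour are hypothetical stand-ins; a real stream engine would speak the ZeroMQ wire protocol to a remote peer instead of echoing.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Socket-side handle of a duplex channel to one stream engine.
/// (Hypothetical layout; zmq.rs only calls this concept "Peer".)
pub struct Peer {
    to_engine: Sender<Vec<u8>>,
    from_engine: Receiver<Vec<u8>>,
}

/// Spawns a stand-in "stream engine" thread that echoes every message
/// back, and returns the socket-side end of the duplex channel.
pub fn spawn_echo_engine() -> Peer {
    let (to_engine, engine_rx) = channel::<Vec<u8>>();
    let (engine_tx, from_engine) = channel::<Vec<u8>>();
    thread::spawn(move || {
        // The loop ends once the socket side drops its Sender.
        for msg in engine_rx {
            if engine_tx.send(msg).is_err() {
                break; // the socket side is gone
            }
        }
    });
    Peer { to_engine, from_engine }
}

impl Peer {
    /// Hands a message to the engine; false if the engine has exited.
    pub fn send(&self, msg: Vec<u8>) -> bool {
        self.to_engine.send(msg).is_ok()
    }

    /// Blocks for the next message; None if the engine has exited.
    pub fn recv(&self) -> Option<Vec<u8>> {
        self.from_engine.recv().ok()
    }
}
```

Because the user-owned socket holds the Peer directly, sending and receiving need no intermediate socket task; a REQ or REP strategy would only decide which Peer to call.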

Stream engines are created by factories like TcpListener or TcpConnecter, which are spawned through the socket interfaces bind() and connect(). The factories live in separate tasks, and feed the socket object with stream engine channels (internally called Peer) through a special fan-in channel. Because we don't have full control over the task that owns the socket object, we cannot listen on the receiver end of the channel all the time. Receiving is done on demand: whenever recv() or send() is called, the socket first tries to collect (internally, sync_until()) all the Peers that have arrived since the last sync_until(), and then resumes the requested job.
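
The fan-in channel and the on-demand collection might look roughly like this in present-day Rust. This is a sketch under assumptions: it uses std::sync::mpsc, reduces Peer to a bare id, and the names spawn_factory() and collect_new_peers() are hypothetical; only the non-blocking half of sync_until() is shown.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Stand-in for the channel pair to one stream engine; here just an id.
pub struct Peer {
    pub id: usize,
}

pub struct Socket {
    new_peers: Receiver<Peer>, // receiving end of the fan-in channel
    factory_tx: Sender<Peer>,  // cloned once per factory
    peers: Vec<Peer>,
}

impl Socket {
    pub fn new() -> Socket {
        let (factory_tx, new_peers) = channel();
        Socket { new_peers, factory_tx, peers: Vec::new() }
    }

    /// Plays the role of bind()/connect(): starts a factory thread that
    /// reports one freshly created Peer through the fan-in channel.
    pub fn spawn_factory(&self, id: usize) -> thread::JoinHandle<()> {
        let tx = self.factory_tx.clone();
        thread::spawn(move || {
            let _ = tx.send(Peer { id });
        })
    }

    /// Called at the start of send()/recv(): drains whatever arrived
    /// since the last call, without blocking. Returns how many were added.
    pub fn collect_new_peers(&mut self) -> usize {
        let mut added = 0;
        while let Ok(peer) = self.new_peers.try_recv() {
            self.peers.push(peer);
            added += 1;
        }
        added
    }

    pub fn peer_count(&self) -> usize {
        self.peers.len()
    }
}
```

The user task never waits on the fan-in channel unless it is already inside a socket call, which is why the socket can be owned by code we do not control.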

sync_until() may block on a condition given by the specific socket implementation. For example, the first send() on a REQ socket must block until at least one Peer is available, so in the REQ implementation we simply call sync_until() with a closure that returns true only if there are Peers available. Similarly, further send() calls on the same REQ within one multi-part message sequence have to sync_until() the same Peer is still alive, so that the parts of a multi-part message won't be sent to different ends.
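
A blocking sync_until() driven by a closure could be sketched as follows in present-day Rust. The signature, the bool return value, send_first() and the demo function are assumptions for illustration and do not reproduce the actual zmq.rs code; Peer is again reduced to an id.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;
use std::time::Duration;

/// Stand-in for the channel pair to one stream engine; here just an id.
pub struct Peer {
    pub id: usize,
}

pub struct Socket {
    new_peers: Receiver<Peer>,
    peers: Vec<Peer>,
}

impl Socket {
    /// Returns the socket plus the Sender that factories would use.
    pub fn new() -> (Socket, Sender<Peer>) {
        let (tx, rx) = channel();
        (Socket { new_peers: rx, peers: Vec::new() }, tx)
    }

    /// Collects new Peers, then blocks until `done` is satisfied.
    /// Returns false if every factory is gone before that happens.
    pub fn sync_until<F>(&mut self, done: F) -> bool
    where
        F: Fn(&[Peer]) -> bool,
    {
        // First take everything that is already queued, without blocking.
        while let Ok(peer) = self.new_peers.try_recv() {
            self.peers.push(peer);
        }
        // Then block for more Peers until the condition holds.
        while !done(&self.peers[..]) {
            match self.new_peers.recv() {
                Ok(peer) => self.peers.push(peer),
                Err(_) => return false,
            }
        }
        true
    }

    /// REQ-like first send: waits for at least one Peer and picks it.
    pub fn send_first(&mut self) -> Option<usize> {
        if self.sync_until(|peers| !peers.is_empty()) {
            Some(self.peers[0].id)
        } else {
            None
        }
    }
}

/// A factory delivers its Peer 50 ms late; send_first() waits for it.
pub fn demo_late_peer() -> Option<usize> {
    let (mut socket, factory_tx) = Socket::new();
    let factory = thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        factory_tx.send(Peer { id: 7 }).unwrap();
    });
    let chosen = socket.send_first();
    factory.join().unwrap();
    chosen
}
```

Each socket type supplies its own closure, so the blocking policy stays in the REQ or REP implementation while the collection logic is shared.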


fantix (author) · September 17, 2014

Haha, I guessed right, libnative and libgreen are indeed splitting up: https://github.com/rust-lang/rfcs/pull/230


lerencao · November 28, 2014

Honestly, the article is hard to read :)


fantix (author) · November 28, 2014

Thanks for being honest :)


刘昆 · January 20, 2015

Wow, hats off to the expert. You've made it inside zmq!


fantix (author) · January 21, 2015

[facepalm] I just got myself a prefix, that's all
