Working for Red Hat is awesome. Not only can you work on amazing things, you also get the tools you need to do just that. We wanted to test Eclipse Hono (yes, again) and see how far we could scale it, and of course which limits and issues we would encounter on the way. So we took the current development version of Hono (0.7) from Eclipse IoT, backed by EnMasse 0.21, and ran it on an OpenShift 3.9 cluster.
Note: This blog post presents an intermediate result of the whole test, as it is still ongoing. Want to know more? We submitted a talk for EclipseCon Europe about this scale test. With a bit of luck we can show you more in person at the end of October in Ludwigsburg.
From the full test cluster, we received an allocation of 16 nodes with a bit of storage (mostly HDDs). Each node has Intel Xeon E5-2620 CPUs with 2×6 cores (24 threads), and the nodes have a mix of 64 GB and 128 GB of RAM. 12 nodes were assigned to the IoT cluster, running Eclipse Hono, EnMasse and OpenShift. The remaining 4 nodes made up the simulation cluster for generating the IoT workload. For the simulation cluster, we also deployed OpenShift, simply to re-use the same features (scaling, deploying, building) that we used for the IoT cluster. Both clusters use a single-master setup. For the IoT cluster, we went with GlusterFS as the storage provider, as we wanted to have dynamic provisioning for the broker deployments. Everything is connected by a 1 Gbit Ethernet link. In the IoT cluster, we allocated 3 nodes for infrastructure-only purposes (like the Docker registry and the OpenShift router). This left 8 general-purpose compute nodes that Hono could make use of.
The focus of this test was on telemetry data using HTTP as a transport. For this we simulated devices, each sending one message per second. In the context of IoT, you have a bigger number of senders (devices), but they send smaller payloads and send less frequently than, e.g., a cloud-side enterprise system might. It is also unlikely that an IoT device would send once per second over HTTP, but “per second” is easier to reason about. And, at least in theory, you could trade 1,000 devices sending once per second for 10,000 devices sending once every 10 seconds.
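To make that trade-off concrete, here is a minimal sketch of the arithmetic. The function name aggregate_rate is made up for illustration and is not part of Hono or the simulator.

```python
def aggregate_rate(devices: int, interval_seconds: float) -> float:
    """Messages per second produced by a fleet in which every device
    sends one message per `interval_seconds`."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return devices / interval_seconds


# Both fleets from the text put the same aggregate load on the system.
print(aggregate_rate(1_000, 1), aggregate_rate(10_000, 10))  # 1000.0 1000.0
```

This only compares message rates. It says nothing about connection handling or other per-device costs, which is why the equivalence holds "at least in theory".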
The simulator cluster consisted of three main components: an InfluxDB to store some metrics, a “consumer” deployment and an “HTTP simulator” deployment. The consumer consumed directly from the EnMasse Qpid Dispatch Router instance via AMQP 1.0, as fast as possible. Each HTTP simulator pod tries to simulate 2,000 devices with a message rate of 1 message per second per device. If the HTTP adapter stalls, the simulator waits for outstanding requests to complete. For the HTTP client, we used the Vert.x Web Client, as it turned out to be the best performing Java HTTP client (aside from having a nice API). So scaling up by a single pod means that we increase the IoT workload by 2,000 devices (meaning 2,000 additional messages per second).
To the max
As a first exercise, we tried out a few configurations to see how far we could get. In the end, we were able to saturate the Ethernet port of our (initially) two ingress nodes, so we decided to re-allocate one node from Eclipse Hono to the OpenShift infrastructure. That gave us 3 ingress nodes and 8 compute nodes. This reduced the capacity available for Hono and made us run into a message processing limit. However, it seemed better to run into a limit with Hono than into a limit of network throughput. Adding another ingress node would be a simple task. And if we could improve Hono during the test, we would actually see more throughput, as the third node gives us some reserve in network throughput.
The final setup handled around 80,000 devices sending 1 message per second each. There was a bit of room above that, but our DNS round-robin “load balancer” was not optimal, so we kept that reserve for further testing.
Note: This number may be quite different on other machines and in other environments. We simply used it as a baseline for further testing.
The first automated scenario we ran was a simple scale-up test. For that we scaled down all producers and consumers and then slowly started to scale up the producers. After adding a new pod, we waited until the message flow had settled. If the failure rate was too high, the scenario scaled up an additional protocol adapter. Otherwise, it scaled up another producer and continued.
The acceptable failure rate for this test was 2% of the messages over the last 3 minutes. A “failure” here is actually a rejection of the message at the current point in time. Devices may re-try at a later time to submit their data. For telemetry data, it may be fine to drop some information (with QoS 0) every now and then. Alternatively, you can use QoS 1, notice that the current request was rejected, and re-try at a later time. In any case, if Hono responds with a 503 (Service Unavailable) status, then the adapter cannot handle any more requests at the moment, which leads to an increased failure rate in the simulator.
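The scale-up rule described above can be sketched as a small decision function. This is an illustration only, not the code the test scenario actually ran: the names ClusterState and next_step are hypothetical, the starting point of one adapter is an assumption, and the failure rates fed in at the end are made-up sample values rather than measurements.

```python
from dataclasses import dataclass

# 2% of the messages over the last 3 minutes, as stated in the text.
FAILURE_THRESHOLD = 0.02


@dataclass
class ClusterState:
    producers: int = 0  # simulator pods generating load
    adapters: int = 1   # protocol adapter pods (starting value is an assumption)


def next_step(state: ClusterState, failure_rate: float) -> ClusterState:
    """Decide the next scaling action once the message flow has settled."""
    if failure_rate >= FAILURE_THRESHOLD:
        # Too many rejections: compensate with one more protocol adapter.
        return ClusterState(state.producers, state.adapters + 1)
    # Failure rate is acceptable: add more load.
    return ClusterState(state.producers + 1, state.adapters)


state = ClusterState()
for observed in [0.0, 0.01, 0.05, 0.01]:  # made-up failure rates
    state = next_step(state, observed)
print(state)  # ClusterState(producers=3, adapters=2)
```

In a real run, each step would also have to wait for the message flow to settle before reading the failure rate, and stop when no more adapters can be scheduled.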
So let’s have a quick look at the results of this test:
This chart shows the scale-up of the simulator pods and the accompanying scale-up of the Eclipse Hono protocol adapter pods. You can also see the number of messages each protocol adapter instance processes. It looks like, once we push a few messages into the system, this evens out at around 5,000 msgs/s. That means each additional Hono HTTP adapter instance can serve 5,000 more messages/s, which is 5,000 devices sending one message per second, or 50,000 devices sending one message every 10 seconds. And each time we fire up a new instance, the whole system can handle 5,000 msgs/s more.
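As a rough sizing aid, the two figures from this test (2,000 devices per simulator pod and roughly 5,000 msgs/s per HTTP adapter instance) can be combined into a back-of-the-envelope estimate. The function name adapters_needed is made up for illustration, and the estimate assumes the linear behavior seen in the chart continues to hold, which other hardware or environments may not reproduce.

```python
# Figures taken from this test; expect different numbers elsewhere.
DEVICES_PER_SIMULATOR_POD = 2_000  # each device sends 1 msg/s
MSGS_PER_ADAPTER = 5_000           # approximate throughput per adapter instance


def adapters_needed(simulator_pods: int) -> int:
    """Estimate how many HTTP adapter instances a given load requires."""
    load = simulator_pods * DEVICES_PER_SIMULATOR_POD  # msgs/s
    # Integer ceiling division: a partially used adapter still counts.
    return -(-load // MSGS_PER_ADAPTER)


print(adapters_needed(5))  # 2
print(adapters_needed(6))  # 3
```

Five simulator pods produce 10,000 msgs/s, which fits exactly into two adapters. A sixth pod pushes the load to 12,000 msgs/s and requires a third.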
In the second chart we can see the failure rate:
Now, the rule for the test was that the failure rate has to be below 2% in order for the test to continue scaling up. The failure rate is a moving average over 3 minutes, so it keeps declining for a while after the system has recovered. What the test didn’t do well was to wait a bit longer and see if the failure rate declined even more. For that reason, this behavior has been changed in succeeding tests. The scenario now waits a bit longer before recording the final result of the current step.
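To illustrate why waiting longer matters with a 3-minute moving average, here is a small sketch of a sliding-window failure rate. How the simulator actually computes its average is not described here, so this is just one plausible way (failed messages divided by sent messages within the window). The class name FailureRateWindow and all sample numbers are made up.

```python
from collections import deque


class FailureRateWindow:
    """Failure rate over a sliding time window (default: 3 minutes)."""

    def __init__(self, window_seconds: float = 180.0):
        self.window_seconds = window_seconds
        self._samples = deque()  # entries of (timestamp, sent, failed)

    def record(self, timestamp: float, sent: int, failed: int) -> None:
        self._samples.append((timestamp, sent, failed))
        # Drop samples that have fallen out of the window.
        while self._samples and self._samples[0][0] <= timestamp - self.window_seconds:
            self._samples.popleft()

    def rate(self) -> float:
        sent = sum(s for _, s, _ in self._samples)
        failed = sum(f for _, _, f in self._samples)
        return failed / sent if sent else 0.0


window = FailureRateWindow()
window.record(0, sent=1_000, failed=100)  # a burst of rejections
window.record(60, sent=1_000, failed=0)   # system has already recovered
window.record(120, sent=1_000, failed=0)
print(round(window.rate(), 4))  # 0.0333

window.record(180, sent=1_000, failed=0)  # the burst leaves the window
print(window.rate())  # 0.0
```

Two minutes after the burst, the average still reports about 3.3% although no message has failed since. Only once the burst drops out of the window does the rate reflect the current state.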
As you can see, the failure rate stays below that “magic” 2% line, which was exactly the requirement. The exception is of course the last entry, where the test ended because there were no more resources to scale up in order for the scenario to compensate.
Yes it scales
Does Eclipse Hono scale? With charts and numbers, there is always room for interpretation. But to me, it definitely looks that way. When we increase the IoT workload, we can compensate by scaling up protocol adapters in a linear way, settling around 5,000 msgs/s per protocol adapter instance and keeping that figure until the end of the test, when we ran out of computing resources.
This post was originally published on Jens’s private blog. Thanks, Jens, for sharing this with us!