2019-04-17

Testing metrics thoughts and examples: how to turn lights on and off through MQTT with pytest-play

In this article I'll share some personal thoughts about test metrics and talk about some technologies and tools, playing with a real example: how to turn lights on and off through MQTT while collecting test metrics.

By the way, the considerations contained in this article are valid for any system, technology, test strategy and test tool, so you can easily integrate your existing automated tests with statsd with a couple of lines of code in any language.

I will use the pytest-play tool in this example so that even non-programmers should be able to play with automation while collecting metrics, because this tool is based on YAML (this way no classes, functions, threads, imports, no compilation, etc) and, if Docker is already installed, no installation is needed. You'll need only a bit of command line knowledge and traces of Python expressions like variables["count"] > 0.

Anyway... yes, you can drive telematics/IoT devices with MQTT using pytest-play collecting and visualizing metrics thanks to:
  • statsd, a "Daemon for easy but powerful stats aggregation"
  • Graphite, a statsd compatible "Make it easy to store and graph metrics" solution
or any other statsd capable monitoring engine.

In our example we will see step by step how to:
  • send a command to a device through MQTT (e.g., turn on a fridge light)
  • make assertions against the expected asynchronous response sent back by the device through MQTT (e.g., report light on/off status. In our case we expect a light on status)
  • collect key externally observable performance metrics in a JUnit compatible report file and optionally feed an external statsd-capable metrics/monitoring engine (e.g., track how much time was needed for a command/feedback round trip on the MQTT broker)
using MQTT and pytest-play, using YAML files.

Why test metrics?

"Because we can" (cit. Big Band TheorySeries 01 Episode 09 - The Cooper-Hofstadter Polarization):
Sheldon: Someone in Sezchuan province, China is using his computer to turn our lights on and off.
Penny: Huh, well that’s handy. Um, here's a question... why?!
All together: Because we can!
If the "Because we can" answer doesn't convince your boss, there are several advantages that let you react proactively before something of not expected happens. And to be proactive you need knowledge of you system under test thanks to measurable metrics that let you:
  • know how your system behaves (and confirm where bottlenecks are located)
    • in standard conditions
    • under load
    • under stress
    • long running
    • peak response
    • with a big fat database
    • simulating a small percentage of bad requests
    • or any other sensible scenario that needs to be covered
  • know under which conditions your users will perceive
    • no performance deterioration
    • a performance deterioration
    • a critical performance deterioration
    • system stuck
  • understand how much time is available before the first/critical/blocking performance deterioration is met, considering user/application growth trends
so that you can be proactive and:
  • keep your stakeholders informed with very valuable information
  • improve your system performance before something bad happens
Ouch! The effects of a bad release in action
In addition you can:
  • anticipate test automation failures due to timeouts; maybe you have already experienced a test that always passes until one day it starts sporadically exceeding your maximum timeout
  • choose timeouts more carefully if there are no specific requirements
  • avoid false alarms like a generic "today the system seems slower". If there is a confirmed problem you might say instead: "compared to previous measurements, the system response is 0.7 s slower today. Systematically."
  • find corner cases. You might notice that the average response time is always pretty much the same, or only slightly higher, because there is a particular scenario that systematically produces a hard-to-discover response time peak compared to similar requests, which might create some integration problems if other components are not robust
  • avoid retesting response times of previous versions against the current build, because everything has already been tracked
What should you measure? Everything valuable for you:
  • API response times
  • time needed for an asynchronous observable effect to happen
  • metrics from a user/business perspective (e.g., what is more important for users: API response times, browser first paint, or when she/he can start using a web page?)
  • metadata (browser, versions, etc). Metadata formats not compatible with statsd might be tracked on custom JUnit XML reports
  • pass/skip/error/etc rates
  • deploys
  • etc

Some information about statsd/Graphite and MQTT

statsd/Graphite

There are very interesting readings about statsd and the "measure everything" approach:

If you are not familiar with statsd and Graphite you can install it (root/root by default):

docker run -d \
 --name graphite \
 --restart=always \
 -p 80:80 \
 -p 2003-2004:2003-2004 \
 -p 2023-2024:2023-2024 \
 -p 8125:8125/udp \
 -p 8126:8126 \
 graphiteapp/graphite-statsd

and play with it sending fake metrics using nc:
echo -n "my.metric:320|ms" | nc -u -w0 127.0.0.1 8125
you'll find new metric aggregations available:
stats.timers.$KEY.mean
stats.timers.$KEY.mean_$PCT
stats.timers.$KEY.upper_$PCT
stats.timers.$KEY.sum_$PCT
...
where:

  • $KEY is my.metric in this example (so metric keys are hierarchical for a better organization!)
  • $PCT is the percentile (e.g., stats.timers.my.metric.upper_90)
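The nc one-liner above can also be reproduced from code. Here is a minimal Python sketch (assuming a statsd daemon listening on UDP port 8125; the metric key and value are just examples) that emits the same plain-text "key:value|ms" datagram:

```python
import socket


def send_timer(key, value_ms, host="127.0.0.1", port=8125):
    """Send a statsd timing metric as a single UDP datagram ("<key>:<value>|ms")."""
    payload = "{}:{}|ms".format(key, value_ms).encode()
    # statsd is fire-and-forget: one UDP packet, no response expected
    socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(payload, (host, port))


send_timer("my.metric", 320)
```

Since the protocol is UDP based, sending metrics never blocks your tests, even when the statsd daemon is down.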
More info, options, configurations and metric types here:

What is MQTT?

From http://mqtt.org/:
MQTT is a machine-to-machine (M2M)/"Internet of Things" connectivity protocol.
It was designed as an extremely lightweight publish/subscribe messaging transport.
It is useful for connections with remote locations where a small code footprint is required and/or network bandwidth is at a premium.
MQTT is the de facto standard for smarthome/IoT/telematics/embedded device communications, even on low performance embedded devices, and it is available on many cloud infrastructures.

Every actor can publish a message on a certain topic and every actor can subscribe to a set of topics, so you receive every message of interest.

Topics are hierarchical so that you can subscribe to a very specific or wide range of topics coming from devices or sensors (e.g., /house1/room1/temp, /house1/room1/humidity or all messages related to /house1/room1/ etc).
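To make the topic hierarchy concrete, here is a simplified Python sketch of how MQTT topic filters match: + matches exactly one level, # matches everything from that level on (real brokers implement additional rules, e.g., for $-prefixed topics):

```python
def topic_matches(pattern, topic):
    """Simplified MQTT topic filter matching ('+' = one level, '#' = the rest)."""
    p_segments = pattern.split("/")
    t_segments = topic.split("/")
    for i, seg in enumerate(p_segments):
        if seg == "#":
            return True  # multi-level wildcard matches everything from here on
        if i >= len(t_segments):
            return False  # topic has fewer levels than the filter
        if seg not in ("+", t_segments[i]):
            return False  # literal segment mismatch
    return len(p_segments) == len(t_segments)


topic_matches("house1/+/temp", "house1/room1/temp")  # matches
topic_matches("foo/bar/#", "foo/bar/baz/qux")        # matches
```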

For example in a telematics application every device will listen to any command or configuration sent by a server component through a MQTT broker (e.g., project1/DEVICE_SN/cmd);
server will be notified for any device response or communication subscribing to a particular topic (e.g., project1/DEVICE_SN/data).
So:
  • you send commands to a particular device publishing messages on foo/bar/DEVICE_SN/cmd 
  • you expect responses subscribing to foo/bar/DEVICE_SN/data.
If you are not familiar with MQTT you can install the mosquitto utility and play with the mosquitto_sub and mosquitto_pub commands using the public broker iot.eclipse.org. For example you can publish a message for a given topic:
$ mosquitto_pub -t foo/bar -h iot.eclipse.org -m "hello pytest-play!"
and see the response, assuming that you previously subscribed to foo/bar (here we see all messages sent with mosquitto_pub on our topics of interest):
$ mosquitto_sub -t foo/bar/# -h iot.eclipse.org -v

Prerequisites

pytest-play is multi platform because it is based on Python (installation might differ between operating systems).
Using Docker instead, no local Python setup is required: install Docker and you are ready to start playing with pytest-play.
As a user you should be confident with a shell and command line options.

Steps

And now let's start with our example.

Create a new folder project

Create a new folder (e.g., fridge) and enter it.

Create a variables file

Create a env.yml file with the following contents:
pytest-play:
  mqtt_host: YOUR_MQTT_HOST
  mqtt_port: 20602
  mqtt_endpoint: foo/bar
You can have one or more configuration files defining variables for your convenience. Typically you have one configuration file for each target environment (e.g., dev.yml, alpha.yml, etc).

We will use this file later for passing variables thanks to the --variables env.yml command line option, so you can switch environments by passing different files.
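The $variable substitution we will rely on below can be pictured with Python's string.Template (this is just an illustration of the idea; pytest-play's actual templating engine may differ):

```python
from string import Template

# variables as they would come from env.yml plus a test_data entry
variables = {
    "mqtt_endpoint": "foo/bar",
    "device_serial_number": "8931087315095410996",
}
# "$name" placeholders are replaced with the corresponding variable values
endpoint = Template("$mqtt_endpoint/$device_serial_number/cmd").substitute(variables)
# endpoint is now "foo/bar/8931087315095410996/cmd"
```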

Create the YML script file

Create a YAML file called test_light_on.yml inside the fridge folder (or any subfolder). Note well: the *.yml extension and test_ prefix matter, otherwise the file won't be considered executable at the time of writing.

If you need to simulate a command or simulate a device activity you need just one command inside your YAML file:
- comment: send light turn ON command
  provider: mqtt
  type: publish
  host: "$mqtt_host"
  port: "$mqtt_port"
  endpoint: "$mqtt_endpoint/$device_serial_number/cmd"
  payload: '{"Payload":"244,1"}'
where 244 stands for the internal ModBUS registry reference for the fridge light and 1 stands for ON (and 0 for OFF).

But... wait a moment. Until now we are only sending a payload to a MQTT broker, resolving the mqtt_host variable for a given endpoint, and nothing more... pretty much the same business you can do with mosquitto_pub, right? You are right! That's why we are about to implement something more:
  • subscribe to our target topic where the expected response will come and store every single received message to a messages variable (it will contain an array of response payload strings)
  • add an asynchronous waiter waiting for the expected device response
  • once the expected response has arrived, make some assertions
  • track testing metrics
  • enable support for parametrized scenarios with decoupled test data
  • Jenkins/CI capabilities (not covered in this article, see http://davidemoro.blogspot.com/2018/03/test-automation-python-pytest-jenkins.html)
Put the following contents inside the test_light_on.yml file and save:
markers:
  - light_on
test_data:
  - device_serial_number: 8931087315095410996
  - device_serial_number: 8931087315095410997
---
- comment: subscribe to device data and store messages to messages variable once received (non blocking subscribe)
  provider: mqtt
  type: subscribe
  host: "$mqtt_host"
  port: "$mqtt_port"
  topic: "$mqtt_endpoint/$device_serial_number"
  name: "messages"
- comment: send light turn ON command
  provider: mqtt
  type: publish
  host: "$mqtt_host"
  port: "$mqtt_port"
  endpoint: "$mqtt_endpoint/$device_serial_number/cmd"
  payload: '{"Payload":"244,1"}'
- comment: start tracking response time (stored in response_time variable)
  provider: metrics
  type: record_elapsed_start
  name: response_time
- comment: wait for a device response
  provider: python
  type: while
  timeout: 12
  expression: 'len(variables["messages"]) == 0'
  poll: 0.1
  sub_commands: []
- comment: store elapsed response time in response_time variable
  provider: metrics
  type: record_elapsed_stop
  name: response_time
- comment: assert that status light response was sent by the device
  provider: python
  type: assert
  expression: 'loads(variables["messages"][0])["measure_id"] == [488]'
- comment: assert that status light response was sent by the device with status ON
  provider: python
  type: assert
  expression: 'loads(variables["messages"][0])["bin_value"] == [1]'
Let's comment command by command and section by section the above YAML configuration.

Metadata, markers and decoupled test data

First of all the --- delimiter splits an optional metadata document from the scenario itself. The metadata section in our example contains:
markers:
  - light_on
You can mark your scripts with one or more markers so that you can select which scenarios will run from the command line, using marker expressions like -m light_off or something like -m "light_off and not slow", assuming that you have some scripts marked with a slow marker.

Decoupled test data and parametrization

Assume that you have 2 or more real devices providing different firmware versions always ready to be tested.

In such a case we want to define our scenario once and have it executed multiple times thanks to parametrization. Our scenario will be executed for each item defined in the test_data array in the metadata section. In our example it will be executed twice:
test_data:
  - device_serial_number: 8931087315095410996
  - device_serial_number: 8931087315095410997
If you want you can track different metrics for different serial numbers so that you are able to compare different firmware versions.

Subscribe to topics where we expect a device response

As stated in the official play_mqtt documentation https://github.com/davidemoro/play_mqtt
you can subscribe to one or more topics using the mqtt provider and type: subscribe. You have to provide the host where the MQTT broker lives (e.g., iot.eclipse.org), the port and, obviously, the topic you want to subscribe to (e.g., foo/bar/$device_serial_number/data/light, where $device_serial_number will be replaced with what you define in environment configuration files or in each test_data section).
- comment: subscribe to device data and store messages to messages variable once received (non blocking subscribe)
  provider: mqtt
  type: subscribe
  host: "$mqtt_host"
  port: "$mqtt_port"
  topic: "$mqtt_endpoint/$device_serial_number"
  name: "messages"
This is a non blocking call, so while the flow continues it will collect in the background every message published on the topics of our interest, storing them in a messages variable.

messages is an array containing all matching messages coming from MQTT, and you can access its value in expressions with variables["messages"].

Publish a command

This is self explaining (you can send any payload, even dynamic/parametrized payloads):
- comment: send light turn ON command
  provider: mqtt
  type: publish
  host: "$mqtt_host"
  port: "$mqtt_port"
  endpoint: "$mqtt_endpoint/$device_serial_number/cmd"
  payload: '{"Payload":"244,1"}'
where 244 is the internal reference and 1 stands for ON.

Track time metrics

This command lets you start tracking time from now until a record_elapsed_stop is executed:
- comment: start tracking response time (stored in response_time variable)
  provider: metrics
  type: record_elapsed_start
  name: response_time
... <one or more commands or asynchronous waiters here>
- comment: store elapsed response time in response_time variable
  provider: metrics
  type: record_elapsed_stop
  name: response_time
The time metric will be available under a variable name called in our example response_time (from name: response_time). For a full set of metrics related commands and options see https://github.com/pytest-dev/pytest-play.
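Under the hood the idea is simply a monotonic start/stop stopwatch that stores the elapsed seconds in a named variable. A rough Python equivalent (an illustration only, not pytest-play's actual implementation):

```python
import time


class ElapsedRecorder:
    """Store elapsed seconds between start(name) and stop(name) in `variables`."""

    def __init__(self):
        self._started = {}
        self.variables = {}

    def start(self, name):
        # monotonic clock is immune to system clock adjustments
        self._started[name] = time.monotonic()

    def stop(self, name):
        self.variables[name] = time.monotonic() - self._started[name]


recorder = ElapsedRecorder()
recorder.start("response_time")
# ... send the MQTT command and wait for the device response here ...
recorder.stop("response_time")
```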

You can record key metrics of any type for several reasons:
  • make assertions about some expected timings
  • report key performance metrics or properties in custom JUnit XML reports (in conjunction with the command line option --junit-xml results.xml, for example, so that you have a historical trend of metrics for each past or present test execution)
  • report key performance metrics on statsd capable third party systems (in conjunction with the command line option --stats-d [--stats-prefix play --stats-host http://myserver.com --stats-port 3000])

While

Here we are waiting until a message response is collected and stored in the messages variable (do you remember the already discussed MQTT subscribe command in charge of collecting/storing messages of interest?):
- comment: wait for a device response
  provider: python
  type: while
  timeout: 12
  expression: 'len(variables["messages"]) == 0'
  poll: 0.1
  sub_commands: []
You can specify a timeout (e.g., timeout: 12), a poll time (how many seconds to wait between while iterations, in this case poll: 0.1) and an optional list of while sub commands (not needed for this example).

When the expression returns a true-ish value, the while command exits.
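The while command behaves like a generic poll-with-timeout loop. In plain Python the same logic looks roughly like this (a sketch, not pytest-play internals):

```python
import time


def wait_until(condition, timeout=12, poll=0.1):
    """Poll `condition` every `poll` seconds until truthy or `timeout` expires."""
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll)


messages = []  # filled asynchronously by the MQTT subscriber
# wait_until(lambda: len(messages) > 0, timeout=12, poll=0.1)
```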

Does your device publish different kinds of data on the same topic? Modify the while expression restricting it to the messages of your interest, for example:
- comment: [4] wait for the expected device response
  provider: python
  type: while
  timeout: 12
  expression: 'len([item for item in variables["messages"] if loads(item)["measure_id"] == [124]]) == 0'
  poll: 0.1
  sub_commands: []
In the above example we are iterating over our array obtaining only the entries with a given measure_id, where loads is a builtin JSON parser (Python's json.loads).
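The filter expression can be tried in any Python shell. Assuming two collected raw payloads (made-up examples), only the matching one survives:

```python
from json import loads

# hypothetical raw payloads collected by the subscribe command
variables = {"messages": [
    '{"measure_id": [488], "bin_value": [1]}',
    '{"measure_id": [124], "bin_value": [0]}',
]}

# keep only the messages with the expected measure_id, as in the while expression
matching = [item for item in variables["messages"]
            if loads(item)["measure_id"] == [124]]
# the while loop would exit here because len(matching) > 0
```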

Assertions

And now it's assertions time:
- comment: assert that status light response was sent by the device
  provider: python
  type: assert
  expression: 'loads(variables["messages"][0])["measure_id"] == [488]'
- comment: assert that status light response was sent by the device with status ON
  provider: python
  type: assert
  expression: 'loads(variables["messages"][0])["bin_value"] == [1]'
Remember that the messages variable is an array of string messages? We take the first message (with variables["messages"][0] you get the first raw payload), parse the JSON payload so that assertions are simpler (in our case loads(variables["messages"][0]), for sake of completeness), obtaining a dictionary, and then assert that we have the expected values under certain dictionary keys.

As you can see pytest-play is not 100% codeless by design because it requires a very basic Python expressions knowledge, for example:
  • variables["something"] == 0
  • variables["something"] != 5
  • not variables["something"]
  • variables["a_boolean"] is True
  • variables["a_boolean"] is False
  • variables["something"] == "yet another value"
  • variables["response"]["status"] == "OK" and not variables["response"]["error_message"]
  • "VALUE" in variables["another_value"]
  • len([item for item in variables["mylist"] if item > 0]) == 0
  • variables["a_string"].startswith("foo")
One-line protected Python-based expressions let you express any kind of waiters/assertions without having to extend the framework's command syntax by introducing an exotic YAML-based meta language that would never be able to express all the possible use cases. The basic idea behind Python expressions is that even for non programmers it is easier to learn the basics of Python assertions than to figure out how to express assertions in an obscure meta language.

pytest-play is not related to MQTT only: it lets you write actions and assertions against a real browser with Selenium, API/REST, websockets and more.

So if you have to automate a task for a device simulator, a device driver, some simple API calls with assertions, asynchronously wait for a condition to be met with timeouts, interact with browsers, perform cross technology actions (e.g., publish a MQTT message and poll an HTTP response until something happens) with decoupled test data parametrization... pytest-play can help, even if you are not a programmer, because you don't have to deal with imports, function or class definitions, and it is always available if you have Docker installed.


And now you can show off with shining metrics!

Run your scenario

And finally, assuming that you are already inside your project folder, let's run our scenario using Docker (remember --network="host" if you want to send metrics to a server listening on localhost):
docker run --rm -it -v $(pwd):/src --network="host" davidemoro/pytest-play --variables env.yml --junit-xml results.xml --stats-d --stats-prefix play test_light_on.yml
The previous command will run our scenario printing the results and, if there is a statsd server listening on localhost, metrics will be collected and you will be able to create live dashboards like the following one:
statsd/Graphite response time dashboard
and metrics are stored in the results.xml file too:
<?xml version="1.0" encoding="utf-8"?><testsuite errors="0" failures="0" name="pytest" skipped="0" tests="1" time="10.664"><testcase classname="test_on.yml" file="test_on.yml" name="test_on.yml[test_data0]" time="10.477"><properties><property name="response_time" value="7.850502967834473"/></properties><system-out>...

Sum up

This was a very long article and we talked about a lot of technologies and tools. So if you are not yet familiar with some tools or technologies it's time to read some documentation and play with some hello world examples:

Any feedback is welcome!

Do you like pytest-play?

Let's get in touch for any suggestion, contribution or comments. Contributions will be very appreciated too!
