Remote checking Home Control Units

Those of you who are aware of the Home Control project (Home Control 2019) will know that the software/hardware combination of a Raspberry Pi (running Jessie, Node-Red, SQLite and Mosquitto) and ESP12 (running the software described in the Home Control project) is proving very reliable – but there are occasions when you may wish to be absolutely sure that units are actually doing something.  One way to do that is simply to put timeouts on every response you expect… but that makes for complicated coding. A simpler way is to regularly ask each unit for the status of one of the outputs. It doesn’t matter what the response is, as long as there is a response.

As it happens I’m having an issue right now – the heating controller part of my system back home in the UK is working perfectly except for an ongoing issue with Plusnet broadband. They deny it is their fault, but the broadband fails perhaps once a day, sometimes more, sometimes once every two days; eventually it will fail. Most of the time the TP-Link TD-9980 router reconnects within a minute or so – but for reasons I don’t yet understand, occasionally it just will not reconnect. The solution to that was simple and two-fold. First, a standard timer turns the router off for a minute in the early hours every day. That’s the “belt and braces” approach. In addition, the Raspberry Pi polls Google every few minutes. If it fails totally over a 15 minute period to contact Google, an ESP8266 unit will reboot the router.
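The "poll Google, reboot after 15 minutes of failure" logic can be sketched in a few lines of Python. This is only an illustration of the idea, not the flow actually running on the Pi: the names `can_reach` and `OutageTracker` are made up for the example, and probing with a TCP connection to port 443 is an assumption about what "contacting Google" means.

```python
import socket
import time


def can_reach(host="www.google.com", port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds (assumed probe)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


class OutageTracker:
    """Tracks how long every probe has been failing (hypothetical helper)."""

    def __init__(self, limit_seconds=15 * 60, now=None):
        self.limit_seconds = limit_seconds
        self.last_success = time.monotonic() if now is None else now

    def record(self, ok, now=None):
        """Record one probe result; return True when a router reboot is due."""
        now = time.monotonic() if now is None else now
        if ok:
            self.last_success = now
            return False
        if now - self.last_success >= self.limit_seconds:
            # Restart the window so the router gets time to come back up
            # before another reboot is requested.
            self.last_success = now
            return True
        return False
```

A polling loop would call `tracker.record(can_reach())` every few minutes and, whenever it returns True, send whatever command the router-switching ESP8266 expects; that command is not shown here, so it is left out of the sketch.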

All of this worked perfectly for several days and then… the heating relay ESP8266 refused to log back in after one of these “cuts”. I can only put it down to being right on the edge of signal. The reason I say that is that late last year we had horrendous problems with WIFI here in Spain, some kind of attack. It went away, but during that time, over several days, I was losing WIFI constantly, so I made damned sure the ESP units would reliably log back in after a temporary failure of MQTT, WIFI or both. And yet just occasionally this particular unit will not log back in – a power cycle fixes the problem but that means I need to know about it and ask someone to pull the plug for a moment.

So – I’ve implemented this for the two most important devices in the building – the main heating temperature sensor – and the ESP that controls the relay to turn the heating on and off.

watchdog

Timeout Node

What you see above is as follows: two standard Node-Red “inject” nodes fire out requests to MQTT every 5 minutes, asking the relay and temperature units for the state of output 0.

Which command is irrelevant as long as it is a query which returns a result and otherwise has no effect on anything. So, for example, with a topic of holly1/toesp and a payload of {out0?}, the system is expecting a response of OFF or ON (irrelevant here) from topic holly1/fromesp/out0.
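To make the topic and payload pattern concrete, here is a small Python helper that builds the request and the topic to listen on. It is a sketch only: `status_query` is a made-up name, and the idea that other outputs follow the same `{outN?}` pattern is an assumption, since only output 0 is mentioned here.

```python
def status_query(unit, output=0):
    """Return (request_topic, payload, response_topic) for one unit.

    Only the output-0 case comes from the article; other output numbers
    are assumed to follow the same naming pattern.
    """
    request_topic = f"{unit}/toesp"
    payload = f"{{out{output}?}}"  # doubled braces produce literal { and }
    response_topic = f"{unit}/fromesp/out{output}"
    return request_topic, payload, response_topic


request_topic, payload, response_topic = status_query("holly1")
print(request_topic, payload, response_topic)
```

Whatever arrives on the response topic, ON or OFF, counts as proof of life; the value itself is never inspected.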

The two incoming MQTT nodes (left, purple) pick up the responses and feed them into my “timeout” nodes (node-red-contrib-timeout).

As you’ll see below, the operation is simple – the timeout node doesn’t care what is incoming as long as something is. If it gets nothing for 60 minutes it will time out and send a payload out – in this case, thanks to the standard email node (green), an email to me.

During that time, normally, the MQTT nodes would have responded every 5 minutes – i.e. 12 times – each response topping the timeout node back up to 3600 seconds, so it cannot run down even if most of the messages were missing (and there is no reason why any should be missing). In this case I’ve not put messages for “safe state” or “warning” – so nothing goes out unless the timeout actually runs down to zero.
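The top-up behaviour described above can be modelled in a few lines of Python. This is not the code of node-red-contrib-timeout, just a minimal sketch of the countdown idea; the class name `TimeoutWatch` and its methods are invented for the example, and the “safe”, “warning” and auto-restart features are deliberately not modelled.

```python
class TimeoutWatch:
    """Countdown that any incoming message refills (illustrative model only)."""

    def __init__(self, timeout_seconds=3600, now=0.0):
        self.timeout_seconds = timeout_seconds
        self.deadline = now + timeout_seconds
        self.alerted = False

    def message_received(self, now):
        """Any message, whatever its content, tops the countdown back up."""
        self.deadline = now + self.timeout_seconds
        self.alerted = False

    def tick(self, now):
        """Return True exactly once when the countdown has run down to zero."""
        if not self.alerted and now >= self.deadline:
            self.alerted = True
            return True
        return False
```

In this model `tick` plays the part of the node's internal timer, and a True result is the point at which the email would go out. A single reply anywhere inside the window is enough to push the deadline a full hour ahead, which is why occasional lost messages do not trigger an alert.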


So while this does not solve the problem, it does mean that I’m alerted to the issue and can ask someone to disconnect and reconnect the relay unit. With what has been 100% reliability of Raspberry Pi 2 boards (with a decent Samsung microSD or similar) I can be pretty sure I’ll get an email if something goes wrong.

I hope this proves useful to others.

13 thoughts on “Remote checking Home Control Units”

  1. Cannot answer anything in depth until Thursday. On the Bay of Biscay ferry with the world’s worst WIFI connection. Sorry. Flag this up then.

  2. Hello Pete and thank you for your timeout node!
    This is exactly what I needed to check whether some of my RF sensors were switched off for some reason.
    But I’d like to suggest a minor improvement: it always sends the “Safe” message on the first event from the sensor after a redeploy, so I get a lot of “Safe” messages after every redeploy.
    Could you please add an option not to send the first “Safe” message?
    Thank you!

      1. Thanks, I thought about such option
        But I also like the ability to send a message like “Sensor is back to work!” and such message definitely should be set in “safe” message field

        1. Erm, ok in which case I need clarification- you don’t want that message going out – but you do? Sorry but I’m not following….

          1. I do not want that message going out right after deploying the application (on the first message from the sensor). But it’s OK if that message is sent after a warning/unsafe message.

            1. Ok, I could see that – I consider that a bug… or rather I did. Version 1.0.6 is uploaded with that fixed. Well, it’ll probably show up in half an hour. Hope that does the job for you.

              1. Thanks a lot! It works!
                Also, I do not want to sound pushy, but is it possible to add to the generated message some new field describing the reason of it?
                E.g. state=”safe”/”warning”/”unsafe”
                In this case, the message could be analyzed by switch/function node, not only sent to the user.
                Thank you again!

                1. You caught me with 10 minutes spare – it could be a little while before I’m working on that again..

                  Pete.

                  1. Sorry, but one more question about timeout node 🙂
                    What is the “Auto-restart” parameter needed for?
                    As far as I understand – to notify about unsafe state periodically, but right after unsafe state notification, I receive “safe” notification because the timer is restarted. Is it correct?

  3. Hi Vladimir

    It has its uses I’m sure – it’s not something I’ve used, though the MQTT broker I use (Mosquitto) supports it. For my own purposes I prefer the idea of regular polling to make sure units are actually sitting there listening, but I’m sure LWT would do the job just as well. I would have great difficulty using it without entering a payload like “argggghhh”.

    1. The LWT is good for just tracking that something went offline. In larger setups monitoring is a must, so alerts fetched from the LWT let you know that something is offline and may need to be investigated.

Comments are closed.