From Zero to Star Trek

Or controlling your IOT by voice.

2 days ago I was nowhere with this – and today I “stand on the cusp” of having full voice control, with just one minor issue bugging me. As this subject has received so much attention, I thought I’d better move people on to the matter at hand, as most of it is resolved.

If you don’t have an Android device – or home control or MQTT – probably best to read no further.

Having defined my audience…. we’ll move on.  I started looking for some means to divert Google Now and the general speech recognition of Google to actually controlling something as against finding things on the web.  I put up an article in this very blog – and the number of responses was amazing. Clearly this is of interest.

So, the basis of this is that you have some kind of home control and that you are running or thinking of running Node-Red on some device (PC, Raspberry Pi, whatever) to control stuff. Much experience has led me to believe this is definitely the way forward, having the very reliable Node-Red as the central point to control everything. We’re talking I/O from ESP, Arduino and a host of other hardware gadgets and talking to them and Blynk, websockets, Email, Pushbullet, database, you name it.

So I can control stuff (as you’ll see in other items in the blog) using websockets, Blynk, Nextion serial displays and a host of other stuff – but up until now, controlling by voice was not on the cards.

It is now!

Starting with an HTC ONE M8 phone in my case, I added the following items:

  • Autovoice
  • Tasker
  • MQTT Publisher Plugin

There is little initial setup to do – just load these onto your phone. The first two are not free (but are dirt cheap) – but Tasker is SO useful you’ll want it anyway if you don’t already have it – what a bargain.

So the general idea here is to have a program trap the return information from Google. Let’s say I were to say “execute lamp on”. I could make that whole phrase the thing to search for, but that’s a bit restricting, so instead you can set Tasker+Autovoice to look for just “execute”. I picked that word as it is easily recognised and I can’t imagine wanting to search for “execute” otherwise.

So depending how you go into this – you might say “Ok Google, Execute lamp on” or you may just press the little microphone button then say “execute lamp on”.

So what I wanted to achieve and DID achieve thanks to perseverance and a lot of help from you guys – was to get “execute” trapped – and then get “lamp on” passed to Node-Red via MQTT for further processing.

Before we get into the phone bit let me tell you about the Node-Red bit.  Many years ago I designed a package to let others write simple text-based adventures and though that may seem a little wasted now it helped me make some good decisions when it comes to how Node-Red handles the text.

Google likes phrases that make sense, whereas we want the minimal. I wrote some code for Node-Red which you’ll see below (and this is just the start – trust me). Let’s say I want to turn the heating up, down, to a specific temperature or just to auto. So there are FOUR variations here and one of them involves a variable number. That could get messy.

In addition I’d like to be able to speak in English for the sake of my good lady who might have trouble remembering limited sets of instructions.

So what’s needed is the ability to say “execute set the heating to 15 degrees”.  What the computer wants is “heating 15” – preferably as 2 arguments.. the easiest way to do that is to separate the LAST argument off.

Here’s what I’ve done – you’ll no doubt immediately think of variations.

In a function in Node-Red..

// Strip filler words: "turn the salt lamp off" becomes "salt lamp off".
// The \b stops "to " from eating the end of words such as "auto".
msg.payload = msg.payload.replace(/\bthe /g, '');
msg.payload = msg.payload.replace(/\bto /g, '');
msg.payload = msg.payload.replace(/\bdegrees /g, '');
msg.payload = msg.payload.replace(/\bdegrees/g, '');
msg.payload = msg.payload.replace(/\bturn /g, '');

// Split compound commands on " and " so each part is handled separately
var incoming = msg.payload.split(" and ");

// Messages for the second and third outputs
var msg2 = {
    payload : "",
    topic : ""
};

var msg3 = {
    payload : "",
    topic : ""
};

for (var a = 0; a < incoming.length; a++)
{
    // Everything before the last space is the device, the last word is the argument
    var lastIndex = incoming[a].lastIndexOf(" ");
    var leftbit = incoming[a].substring(0, lastIndex);
    var rightbit = incoming[a].substring(lastIndex + 1, incoming[a].length);

    switch (leftbit)
    {
        case "salt lamp" :
        case "salt":
            msg.payload = "Turning the salt lamp " + rightbit;   // spoken reply
            msg2.topic = "dfgdfg";                               // placeholder topic
            msg2.payload = "dfgdfgdfg";                          // placeholder payload
            msg3.topic = "orvibo/ACCF238D6DE6";
            msg3.payload = "salt " + rightbit;
            node.send([msg, msg2, msg3]);
            break;

        default :
            msg.payload = "I do not understand " + leftbit + "?";
            node.send([msg, null, null]);
            break;
    }
}

So – given the example “execute turn the salt lamp off”… Node-Red never sees “execute”; it only gets “turn the salt lamp off”. I have a function to control an Orvibo socket (dealt with elsewhere in the blog) and, as it happens, it is expecting “salt on” or “salt off” as a message.

In the function above I have 3 outputs – one to a speech synth (Ivona – again dealt with earlier), the lamp and MQTT. Normally I’d just do everything by MQTT so don’t take this too literally.

First things first – remove unnecessary words, for example “a”, “the”, “turn”, “degrees” etc. You could do this the other way round (i.e. keep only the words you recognise), but then you have to cater for numbers etc.

Then split the sentence on “ and “ and loop through the parts. For each part split off the LAST word as this is your argument “on”, “off” or a number.

And so now we can handle “execute turn the salt lamp on and the heating to 22 degrees” (I’ve only implemented the salt lamp above, but heating etc. is trivial to add).

That comes down to two sentences… “salt lamp on” (or “salt on”) and “heating 22”
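
The strip, split and peel-off-the-last-word steps can also be sketched as a standalone function, which makes them easy to try outside Node-Red. This is only an illustration, not the code from the function node: the name parseCommand, the word-boundary regex and the extra filler words “set” and “a” are assumptions.

```javascript
// Filler words to drop; "set" and "a" are additions for illustration.
const FILLER = /\b(?:the|to|turn|set|a|degrees)\b/g;

// Turn a spoken phrase into a list of { target, arg } pairs.
function parseCommand(phrase) {
    return phrase
        .toLowerCase()
        .split(" and ")                       // one entry per sub-command
        .map(function (part) {
            // Blank out filler words, then split on the remaining whitespace.
            const words = part.replace(FILLER, " ").trim().split(/\s+/);
            const arg = words.pop();          // the last word is the argument
            return { target: words.join(" "), arg: arg };
        });
}

console.log(parseCommand("turn the salt lamp on and the heating to 22 degrees"));
```

For the example sentence this gives two entries, “salt lamp” with “on” and “heating” with “22”, ready for a SWITCH on the target.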

These are then easy to parse in a SWITCH statement, passing any voice (to come out of your speakers somewhere), any direct control – and MQTT messages.

Absolutely works a treat and if it has no idea what “turn boiler off” means it will say “I do not understand boiler”  - similarly one could provide witty responses to invalid parameters “you must be kidding – heating at 5 degrees??”.
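
A minimal sketch of how such a parameter check might look for the heating case. The function name checkHeating, the shape of the returned object and the 10 to 30 degree limits are all assumptions for illustration; none of this is implemented in the function above yet.

```javascript
// Assumed limits for illustration; pick whatever suits your house.
const MIN_TEMP = 10;
const MAX_TEMP = 30;

// Check the argument that followed "heating" and build a spoken reply.
// Returns { ok, speech, setting }.
function checkHeating(arg) {
    if (arg === "up" || arg === "down" || arg === "auto") {
        return { ok: true, speech: "Heating " + arg, setting: arg };
    }
    const temp = Number(arg);
    // Number("") is 0, so reject empty input explicitly.
    if (arg.trim() === "" || !Number.isFinite(temp)) {
        return { ok: false, speech: "I do not understand heating " + arg + "?", setting: null };
    }
    if (temp < MIN_TEMP || temp > MAX_TEMP) {
        return { ok: false, speech: "You must be kidding, heating at " + temp + " degrees??", setting: null };
    }
    return { ok: true, speech: "Setting the heating to " + temp + " degrees", setting: temp };
}

console.log(checkHeating("5").speech);
console.log(checkHeating("22").speech);
```

The speech text would go to the voice output, and the setting would only be passed on to MQTT when ok is true.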

So all of that works – and it is just ITCHING for further development, which by the time you read this it will likely be getting – I’ve already stopped twice to get to the code you see above as I kept thinking of better ways. And please don’t say “you can do this all in Tasker” – I LIKE working in Node-Red.

You may by now be asking what the catch is? There isn’t other than one tiny item bugging me…. I hope to get responses as to how to fix it.

Autovoice/Tasker is listening to what I have to say and sending all but the command word to MQTT – reliably – from power up – no problems.

However and some of you have touched on this – depending on what I say after “execute” it may just pass on the material to MQTT and hence to Node-Red – OR it may first display something on the phone or even Google may have a bash at responding – usually giving up after the first word or so.

I’ve seen your GREAT responses… so before we start, re: Tasker etc.. yes, I’m running Tasker in the foreground, yes I have “Autovoice Google Now integration” turned on. “Do Google Now Search” is disabled.

In other words, I’ve done pretty much all of the things that are SUPPOSED to stop Chrome+Google sneaking in and saying or displaying something. The blocking does happen, but occasionally just a tad late, as if Tasker was not quite getting in there quickly enough to stop things.

Ok, so having installed AutoVoice, Tasker and the MQTT plugin – here is what I’ve done. Some of you might want to follow my lead, and hopefully someone else will say “OH WELL THAT’S WHY YOU’RE HAVING A PROBLEM” - whoever you are, I am sitting here waiting to hear from you.

Open up TASKER… Select a new Event – plugin – Autovoice

After Autovoice you want the RECOGNISED option at the bottom of the menu (was just offscreen on my phone)

After recognise, you want “speak filter” – and in there say “command” or whatever clear, unambiguous word or phrase you want to use. When done – tick “contains all” and tick that box at the top. You are now done with the recognition bit.

So then you should be asked for a new TASK – if not, ask the people who designed Tasker because it worked for me (I’m doing all of this from scratch now as I build up this blog entry). Give the task a name – I chose to call it MQTT. Call it FRED if you like. Then you need an ACTION for that task.

Select the MQTT plugin, which just appears by magic (assuming you installed it). Fill in your MQTT details – you need address and port, user and pass (if you have a user and pass of course – some don’t, but then I would not be using this out of the house if you don’t) and I used the TOPIC “voice”. The payload is

%avcommnofilter

What happens there is that gets replaced by all the text in your phrase EXCEPT the keyword or phrase.
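
To make that concrete, here is a rough imitation of the substitution for a single-word keyword. withoutKeyword is a hypothetical helper written for illustration; AutoVoice does this internally and its actual matching rules may differ.

```javascript
// Illustration only: return the recognised phrase minus the keyword.
function withoutKeyword(phrase, keyword) {
    const words = phrase.trim().split(/\s+/);
    const kept = words.filter(function (w) {
        return w.toLowerCase() !== keyword.toLowerCase();
    });
    return kept.join(" ");
}

console.log(withoutKeyword("command turn light on", "command")); // turn light on
```

So with “command” as the keyword, the MQTT payload for “command turn light on” would be “turn light on”.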

Hit Accept and if you’ve not messed up that’s it.

I then chose to add another ACTION (not task) – a BEEP. Under PROFILES, click on the green MQTT next to the profile, then hit + and select ALERT, then BEEP, and make it 8kHz and 1000ms or less (but not 1ms, as you won’t hear anything).

 

Back out of all of that, press the mic button, say “command turn light on” – and back in your Node-Red you should have some text coming in on MQTT. You might want to test that with an MQTT client listening for “voice”…

Now – given the above – does anyone know how to make changes so that Google does not try to interfere with this? It never stops the command working, but I could do without seeing a brief flash of screen or hearing a brief unwanted word.

SO – I THEN took the phone (which has good mics) – put it on the wall and from the other side of the room tried “command turn salt lamp on” – and it worked. GRANTED it’s not that big a room – and the wife is in the USA right now and the TV was off and we don’t have dogs.

Star Trek here we come. Once someone gets me out of my final misery and stops the occasional Googleisms, I’m adding heating control (3 fixed – up, down, auto and one variable – ie 22).   You could do the same with lamps (on, off, up, down, auto) etc… and if you make it so big that it slows down your Raspberry Pi – get something faster!!! The sky is the limit. I do this on a Raspberry Pi 2 with a fast SD – no problem – that and 100 other jobs.

So as an alternative to the BEEP – you could setup Pushbullet and have Node-Red send you a message when it gets a valid command – that way you know for SURE that something has happened.  And that is in fact what I have now done – having experimented with lots of MQTT clients for Android all of which seem less than reliable to put it mildly.
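
A small sketch of building such a confirmation inside a Node-Red function. It assumes the notification node takes its title from msg.topic and its body from msg.payload; buildConfirmation and the wording of the message are made up for illustration.

```javascript
// Build a notification message for a command that was recognised.
// Assumption: the notification node reads the title from topic and the body from payload.
function buildConfirmation(target, arg) {
    return {
        topic: "Voice command",
        payload: "Done: " + target + " " + arg + " at " + new Date().toLocaleTimeString()
    };
}

console.log(buildConfirmation("salt lamp", "on"));
```

In a function node the returned object would be sent on an extra output wired to the notification node, and only from the branches of the SWITCH that matched a real command.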

If you are interested - make sure you SUBSCRIBE to this subject even if you don't wish to comment - to make sure you get regular updates!
