Updated 21/DEC/2015
This entry is now DEAD. Google have changed the goalposts and removed the API access for their language translation and text to speech system so that the base program behind this work – a program called Normit – is now dead in the water. I’m doing an update blog without the Google service.
This could be a good day: my WIFI problems have disappeared (for now anyway) all on their own and I’ve given up on the Orange Pi PC until someone comes up with a solution for the missing sound (I had some other minor issues – life’s too short). I have had a couple of successes recently – one of them being… getting the Raspberry Pi 2 to SPEAK. Specifically giving Node-Red the power of speech.
If you’re not using Node-red – or rather if you don’t have NodeJS on your Pi, then this won’t be of much interest! (I did get Node-Red running on the Orange Pi but no sound) – if you are, read on as this is a winner.
So first things first – this is an example of what I started running on my Raspberry Pi 2… this particular bunch of injects just sits on the Node-Red desktop and can be accessed at the press of a button. The purpose of this, for someone living in Spain but not having a clue what to say to anyone ringing me up in Spanish to inform me of a delivery, is to give me a starting point for talking to them.
So I press one of the buttons on the left – and the Pi talks in clear Spanish through the speaker. My next application is more serious – it tells me when my ESP8266 boards have logged in by extracting their ID names from the incoming MQTT response (parser – trivial) and blasting out the text as speech. I was having WIFI issues for a while and it helped to know when the units were in trouble and continuously logging back in.
In this instance the responses play back in English. So how do we get from a non-talking Raspberry Pi to this?
First things first: you need to go look for normit (something like npm install normit -g) and get that running on the Pi at the command line. What this neat little package does is take some text, send it off to Google Translate, get an mpeg back and play that. You’ll also want mpg123, which I believe is apt-get install mpg123 (you may need to use sudo if you are not root).
So the test is this..
normit en en "hello there"
That will respond with the phrase “hello there” onscreen. To change that to Spanish:
normit en es "hello there"
and you’ll get the Spanish version.
If you’ve installed mpg123 correctly and your sound is working…
normit en en "Hello there" -t
will play the sound back through the speaker. There’s a slight delay depending on the length of the sentence, which I believe should be no longer than 100 characters.
That in itself is quite useful.
In Node-Red there is a node called EXEC (under advanced – and thanks to the Node-Red guys for helping me out with that one). It will run command line stuff (you can see what’s coming next, can’t you…)
Grab the EXEC node and fill in as follows:
So now if you pass a message such as en en 'hello there' -t in your msg.payload to this node via, say, an inject node – you’ll get speech. EASY!! And thanks to the Node-Red guys for pointing this out to me. Don’t try “spawn” mode, and don’t worry, it won’t hold up Node-Red – I can prove that by sending 2 messages quickly – it will play two at once (I’ve a fix for that lower down). Ignore the outputs of the EXEC node, you don’t need them.
So that in itself is ok, but you might want to take that a little further to make actual use more flexible and easier. I did. I also have a package called moment on my Pi which makes nice time and date formatting.
So in my case the next step was to add a function to this..
The contents of the function are as follows – you might fancy something different.
var moment = context.global.moment;
var lang = 'en en "';
var timeadd = "";
var dateadd = "";
if (msg.topic.indexOf("time") != -1)
    timeadd = 'Time:, ' + moment().format("h:mm a ") + '.';
if (msg.topic.indexOf("spanish") != -1)
    lang = 'en es "';
if (msg.topic.indexOf("date") != -1)
    dateadd = 'Date:, ' + moment().format("dddd,,, MMMM Do YYYY") + '.';
msg.payload = lang + dateadd + timeadd + msg.payload + '" -t';
return msg;
So now, in the simplest case the input to that function might simply be…. hello world. The function adds on the technical bits and quotes.
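The function above drops msg.payload straight into a double-quoted command line, so a stray quote in the text would end the string early. As a rough precaution, here is a minimal sketch of the same idea with the text cleaned up first. The helper name buildSpeechPayload is hypothetical, the time and date options are left out for brevity, and the list of stripped characters is an assumption rather than a complete shell-safety measure.

```javascript
// Hypothetical helper, not part of the original flow.
// Builds the string the exec node appends to "normit".
function buildSpeechPayload(topic, text) {
    topic = topic || "";    // tolerate a message that has no topic
    // Assumption: removing these characters is enough for simple phrases.
    // They would otherwise close the quoted string early or be
    // interpreted by the shell inside double quotes.
    var safeText = String(text).replace(/["`$\\]/g, "");
    var lang = topic.indexOf("spanish") !== -1 ? "en es" : "en en";
    return lang + ' "' + safeText + '" -t';
}

console.log(buildSpeechPayload("spanish", "hello world"));
```

In a function node the equivalent would be something like msg.payload = buildSpeechPayload(msg.topic, msg.payload); before returning msg.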
Would it not be nice to package that up – seems like that’s what I’ve done, doesn’t it. Create a SUBFLOW in Node-Red and dump those two items (the function and the exec node) in there. Edit the name of the subflow to “speech” and add an input to that function. This really is a lot easier done than discussed. I thought it was going to be mega-hard but it’s not. Think of it as a visual macro.
Lo and behold you have a drop-in box you can use for speech in English or Spanish (or, clearly any language you like).
Heading back to my Spanish example here’s one of the input INJECT boxes I showed you at the top…
I’ve put the text I want to say in the payload (one sentence please, for some reason Google doesn’t like a full stop in the middle of something – you could of course without too much effort fix that – split the text up and fire out one after another – getting the delay between the parts right could be fun though) – and used the topic to add in options such as “spanish” “time” and “date” – I’m sure you’ll think of others – if you come up with winners do write in.
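The splitting idea mentioned above could be sketched along these lines. This is only an illustration: splitForSpeech is a hypothetical name, the 100-character default is the limit guessed at earlier in this post, and the delay between parts is deliberately left unsolved.

```javascript
// Hypothetical helper: break text into pieces that each fit the length limit.
function splitForSpeech(text, maxLen) {
    maxLen = maxLen || 100;    // assumed limit, see the note on sentence length above
    // Split at sentence-ending punctuation and drop the punctuation itself,
    // since a full stop in the middle of the text seemed to cause trouble.
    var sentences = text.split(/[.!?]+/)
        .map(function (s) { return s.trim(); })
        .filter(function (s) { return s.length > 0; });
    var chunks = [];
    sentences.forEach(function (sentence) {
        var current = "";
        // Break over-long sentences at word boundaries.
        sentence.split(/\s+/).forEach(function (word) {
            var candidate = current ? current + " " + word : word;
            if (candidate.length > maxLen && current) {
                chunks.push(current);
                current = word;
            } else {
                current = candidate;
            }
        });
        if (current) chunks.push(current);
    });
    return chunks;
}

console.log(splitForSpeech("Hello there. Your parcel has arrived!"));
```

Each returned piece could then be sent as its own message. Note that splitting on every full stop also breaks up things like decimal numbers, so this is a starting point rather than a finished parser.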
So now, from a simple command line tool we have a great drop-in for Node-Red projects to get verbal feedback. And the sound, though with a strong American accent – really is very good.
Add in speech recognition and a little logic and you have your own Hal-9000 – except you could miss out the bit about not letting people back into the spaceship.
The only issue with this is that as this stands, there is no way to stop the unit from sending messages simultaneously – fine for my Spanish stuff, not so good for multiple simultaneous logins of my little ESP8266 boards. Now, we can’t EASILY tell if a process is finished, but we can schedule messages, so that messages could be put in a queue and that queue checked, say, once every 3 or 4 seconds. If your messages are not ultra-time-critical, then you could use that solution.
Here’s a variation of my flow…
Note the difference. In this version, an inject node is used to send a blank message every 3 seconds. If a message comes in, it is put in a queue; if a blank message comes in, the queue is checked and speech sent out if there is anything there, using the wonderfully elegant combination of arr.push and arr.shift – see the code. The only exception is an urgent message, which is sent immediately.
The code:
if (typeof context.arr == "undefined" || !(context.arr instanceof Array)) {
    context.arr = [];
}
if (msg.payload !== "")
{
    var moment = context.global.moment;
    var lang = 'en en "';
    var timeadd = "";
    var dateadd = "";
    if (msg.topic.indexOf("time") != -1)
        timeadd = 'Time:, ' + moment().format("h:mm a ") + '.';
    if (msg.topic.indexOf("spanish") != -1)
        lang = 'en es "';
    if (msg.topic.indexOf("date") != -1)
        dateadd = 'Date:, ' + moment().format("dddd,,, MMMM Do YYYY") + '.';
    msg.payload = lang + dateadd + timeadd + msg.payload + '" -t';
    // append new value to the array OR play it now if urgent
    if (msg.topic.indexOf("urgent") != -1)
        return msg;
    else
        context.arr.push(msg.payload);
}
else
{
    if (context.arr.length)
    {
        msg.payload = context.arr.shift();
        return msg;
    }
}
And just set the inject node to repeat every 3 seconds (well, that works for me) and send a blank message, OR add the word “urgent” into the topic (lower case, as that is what the code looks for) and it goes out straight away. SIMPLES!!!
And if you think THAT’s good – check out THIS version that ALSO handles .mp3 files!!! Simply put a file path, say /usr/audio/alert02.mp3, into the payload (the option of “urgent” still remains) and you can add your own favourite Star Trek BEEPS!
Here’s what it looks like…
If you need beeps….. http://www.trekcore.com/audio/
And here’s the code – I’ve also added an ALERT option to play back both the .mp3 and the voice file (note I’ve embedded a specific file in there as I’m out of fields – I may write a node for all of this yet):
var frompush = 0;
if (typeof context.arr == "undefined" || !(context.arr instanceof Array)) context.arr = [];
if (msg.payload == "") if (context.arr.length) { frompush = 1; msg = context.arr.shift(); }
if (msg.payload !== "")
{
    // if not urgent just push but not recursively
    if ((msg.topic.indexOf("urgent") == -1) && (frompush == 0)) { context.arr.push(msg); return; }
    if (msg.payload.indexOf(".mp3") != -1) return [null, msg];
    if (msg.topic.indexOf("alert") != -1)
    {
        var msg2 = {
            payload : "",
            topic : ""
        };
        msg2.topic = msg.topic;
        msg2.payload = '/usr/audio/alert02.mp3';
        node.send([null, msg2]);
    }
    var moment = context.global.moment;
    var lang = 'en en \'';
    var timeadd = "";
    var dateadd = "";
    if (msg.topic.indexOf("time") != -1)
        timeadd = moment().format("h:mm a ") + ', ';
    if (msg.topic.indexOf("spanish") != -1)
        lang = 'en es \'';
    if (msg.topic.indexOf("date") != -1)
        dateadd = moment().format("dddd, MMMM Do YYYY") + ', ';
    msg.payload = lang + dateadd + timeadd + msg.payload + '\' -t';
    return [msg, null];
}
And all of that was fine – until I started to experiment with multiple messages – at which point it all went to hell. For reasons I’m not yet sure of, pushing an entire message object into an array and shifting it back out doesn’t work that well. I made a slight change (originally shown in red): the topic and payload are now pushed and shifted as two separate entries, and that fixed the problem.
var frompush = 0;
if (typeof context.arr == "undefined" || !(context.arr instanceof Array)) context.arr = [];
if (msg.payload == "") if (context.arr.length) { frompush = 1; msg.topic = context.arr.shift(); msg.payload = context.arr.shift(); }
if (msg.payload !== "")
{
    // if not urgent just push but not recursively
    if ((msg.topic.indexOf("urgent") == -1) && (frompush == 0)) { context.arr.push(msg.topic); context.arr.push(msg.payload); return; }
    if (msg.payload.indexOf(".mp3") != -1) return [null, msg];
    if (msg.topic.indexOf("alert") != -1)
    {
        var msg2 = {
            payload : "",
            topic : ""
        };
        msg2.topic = msg.topic;
        msg2.payload = '/usr/audio/alert02.mp3';
        node.send([null, msg2]);
    }
    var moment = context.global.moment;
    var lang = 'en en \'';
    var timeadd = "";
    var dateadd = "";
    if (msg.topic.indexOf("time") != -1)
        timeadd = moment().format("h:mm a ") + ', ';
    if (msg.topic.indexOf("spanish") != -1)
        lang = 'en es \'';
    if (msg.topic.indexOf("date") != -1)
        dateadd = moment().format("dddd, MMMM Do YYYY") + ', ';
    msg.payload = lang + dateadd + timeadd + msg.payload + '\' -t';
    return [msg, null];
}
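The fix above stores topic and payload as two consecutive array entries, which relies on them always being pushed and shifted in pairs. A possible variation, offered as a sketch and not as the cause of or cure for the original problem, is to queue a small fresh object holding copies of the two strings. The names queue, enqueue and dequeue are hypothetical, and in a function node the array would live in context as before.

```javascript
// Hypothetical queue helpers; the names are illustrative only.
var queue = [];

function enqueue(msg) {
    // Store copies of the two strings rather than the message object itself,
    // so later changes to msg cannot alter what is waiting in the queue.
    queue.push({ topic: msg.topic, payload: msg.payload });
}

function dequeue() {
    // Returns the oldest queued item, or undefined when the queue is empty.
    return queue.shift();
}

var incoming = { topic: "spanish", payload: "hello world" };
enqueue(incoming);
incoming.payload = "changed afterwards";
console.log(dequeue().payload);
```

Because JavaScript arrays hold references to objects, pushing the message itself would have let the later assignment change the queued copy too; copying the strings avoids that.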
Now, if someone comes up with a way to detect the end of speech instead of using a timer, I’d be most grateful for the code. A fixed timer is OK but not ideal. Another possibility is to look at the number of characters in the string and arrange a time delay based on that, as clearly longer sentences take longer to speak. It would also be good to fire off a sentence before the previous one is finished, as it takes time to download the mpg file.
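One way the end of speech might be detected, sketched here in plain Node.js rather than as a drop-in function node: run the player as a child process and start the next phrase only when that process exits. The createSpeechQueue and playWithNormit names are hypothetical, and the normit command with its -t flag is simply taken from earlier in this post. execFile from Node's child_process module calls its callback once the command has finished or failed to start.

```javascript
var execFile = require("child_process").execFile;

// Hypothetical helper: plays queued phrases strictly one after another.
// play(text, done) must call done() once playback has finished.
function createSpeechQueue(play) {
    var pending = [];
    var busy = false;

    function next() {
        if (busy || pending.length === 0) return;
        busy = true;
        var text = pending.shift();
        play(text, function () {
            busy = false;
            next();    // move on to whatever is waiting
        });
    }

    return {
        say: function (text) { pending.push(text); next(); },
        size: function () { return pending.length; }
    };
}

// One possible play function (assumption: normit is installed and on the PATH).
// Passing the arguments as an array avoids shell quoting problems.
function playWithNormit(text, done) {
    execFile("normit", ["en", "en", text, "-t"], function () {
        done();    // called on exit, whether or not playback succeeded
    });
}
```

Usage would be along the lines of var speech = createSpeechQueue(playWithNormit); followed by speech.say("hello there"); for each phrase. Inside a flow, the exec node's own outputs may offer a similar signal, though that is untested in this sketch.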
Hi Pete, you seem to have writing functions all sorted. Did you find any developers resources on the net that you can share?
In my function, I am trying to find out the mqtt client that sent the msg. There are examples on handling the msg.payload, but I can’t find a msg.client or msg.owner. (The MQTT protocol should know where a message has come from so I don’t have to put it in the payload also…)
I can’t find out a list of properties of the msg object, Or indeed any other objects. I am hoping you may know where to look.
Thanks.
Just a quick thought without trying it – have the debug node print the entire message not just payload – if it’s there, it’ll print it.
With some thought the “topic based fair queue” mode of the stock NR Delay Node could be used to buffer the input messages and release them on a schedule.
Edit: The “Limit Rate To” mode looks more appropriate. The Delay Node is quite a powerful thing!