Ivona Again - Scargill's Tech Blog

A recurring theme in software development is re-inventing the wheel because you don’t know about or don’t understand what is available already. Well, sometimes it is for me anyway.

And so it was that my PERFECTLY working, well-appreciated but totally unnecessary workflow for Ivona speech on Node-Red went in the bin.

But first:

So (here come the links) – Ivona is an online service from Amazon, free within reason – with REALLY nice quality sound. The Ivona NODE is a free node that lets you talk to Ivona in Node-Red easily using, say, a Raspberry Pi (which in its latest incarnation remains the fastest and easiest to use SBC in anywhere near that price range) – as easily as passing a text string to the node – and out pops high quality speech. Don’t like the idea of relying on an external service? Read on…

Yesterday we started working on a better caching mechanism to reduce calls to the Ivona speech API – and this morning – I took a LONG look at the instructions and….. it’s already in.

So here is the MK 2 explanation of how to use the Ivona Node-Red node – this time with the benefit of having actually read the instructions.

It is THIS easy.

Yup.. all that buffering work for nothing. But I’m making assumptions – let’s start from scratch:

We’re talking about getting high quality speech on, say a Raspberry Pi – using Node-Red. In my case Node-Red is the central control for my home control… and speech is part of that – I want to know when devices are logging in etc.

There are speech synths for the Pi and other SBCs – and not to put too fine a point on it – most of them sound like someone being strangled. Ivona on the other hand provides REALLY nice speech in a range of languages. Google used to have a good API but they got greedy.

Installing:

From my Pi script…….

sudo apt-get install -y mpg123

and in the .node-red directory wherever that is on your system…

npm install node-red-contrib-ivona

Oh, Err:

In my original version, having put the speech into a first-in/first-out buffer, I passed the output to MPG123 – a program (accessed by an EXEC function) that plays MP3 files. Why? Because the Ivona node takes in speech and dumps an MP3 file on your disk/SD with the recording in it.

All fine and good but what about overlapping messages – and what about constant use of the API? Well, to cut a LONG story short, it is all handled in the node itself. If you play the same message twice, it just uses the file it made last time. The node successfully queues messages as well (all that work…). If, like me, you didn’t read the instructions, the file went into the /tmp directory and you probably thought you’d have to develop your own file system.

And now:

So – let me show you the setup for the Ivona Node – and then I’ll go through it piece by piece.

See installation above, and see my original blog for more details, including how to put in the credentials that Ivona gives you. That is a one-off task and free.

So… in the message area above – just leave that as it is… the “moustache” system refers to using braces around stuff. It can take a little grasping but you don’t have to here… In this case – the message entry simply means – if you fire a msg.payload like “Hello there” into the node – then it will be used. You’ll do that with, for example, an INJECT node (see above) sending TEXT in the payload.

Voice: Well that’s simple enough – dropdown box – pick a voice! For UK users, Brian is good.

Exec: Here you put in the name (and path if needed) of a program to play the file. I definitely recommend installing MPG123 as it is easy and reliable. {{{file}}} simply means – use the file generated by Ivona node.

File: At the start, to the left of {{{ put the name of the directory where you want to store files – I made one called “recordings” under my /home/pi directory. The directory should exist before you use it. Don’t forget the slash at the end of the directory.

Then you see a number of items which together take the name of the speaker, the language and the actual text and make these into a filename. I suggest you leave them as-is. In this case..

/home/pi/recordings/{{voice_name}}-{{lang}}-{{slug}}.mp3
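As a rough illustration of how that template turns into a file name, here is a minimal sketch. The `slugify` and `renderTemplate` helpers are hypothetical stand-ins written just for this example, and the `lang` value is an assumption; the real node uses the mustache library and its own slug logic, which may well differ in detail.

```javascript
// Hypothetical helper: turn free text into a filename-safe slug.
function slugify(text) {
    return text
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, "-") // runs of anything else become one dash
        .replace(/^-+|-+$/g, "");    // trim dashes from both ends
}

// Hypothetical helper: replace {{name}} placeholders with values.
// Unknown names are replaced with an empty string.
function renderTemplate(template, values) {
    return template.replace(/\{\{(\w+)\}\}/g, function (match, key) {
        var known = Object.prototype.hasOwnProperty.call(values, key);
        return known ? String(values[key]) : "";
    });
}

var template = "/home/pi/recordings/{{voice_name}}-{{lang}}-{{slug}}.mp3";
var file = renderTemplate(template, {
    voice_name: "brian",
    lang: "en-gb",                 // assumed value, for illustration only
    slug: slugify("Hello there")
});
console.log(file); // /home/pi/recordings/brian-en-gb-hello-there.mp3
```

The point is simply that the same voice and the same text always produce the same file name, which is what makes the re-use of recordings possible.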

That’s it – when you fire speech at the node it should play it (assuming you’ve set the audio output to the right place – check your audio first – I didn’t and got ZILCH – then I realised it was set to go out of HDMI and the monitor I was using had no speakers!!!)

The sound will stay in that file. The next time you go for an identical piece of text with the same speaker (Brian in this case) – it will merely play back the file you already have instead of going off to get more. In a closed system with fixed messages, eventually the Ivona service won’t be needed.
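The idea behind that re-use can be sketched in a few lines. This is not the node’s actual code, just my guess at the general shape: `getCachedSpeech` is a made-up name and `fetchSpeech` is a hypothetical stand-in for the online request.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Return the path of the audio for this file name, calling fetchSpeech
// (a hypothetical stand-in for the online request) only on a cache miss.
function getCachedSpeech(file, text, fetchSpeech) {
    if (!fs.existsSync(file)) {
        fs.writeFileSync(file, fetchSpeech(text));
    }
    return file;
}

// Demonstration with a fake service that counts how often it is called.
var calls = 0;
function fakeService(text) {
    calls++;
    return Buffer.from("fake audio for " + text);
}

// Use a throwaway directory so the demo does not touch real recordings.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), "recordings-"));
var file = path.join(dir, "brian--hello-there.mp3");
getCachedSpeech(file, "Hello there", fakeService);
getCachedSpeech(file, "Hello there", fakeService);
console.log(calls); // 1: the second request was served from the file
```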

This offers up a possibility for sound effects. Let’s say you send “Alert” to Ivona in the payload. In my case that will generate “brian--alert.mp3”. Now, let’s say you have a nice alert sound effect… by simply giving it the file name “brian--alert.mp3”, that file will be used instead of Brian saying “alert” – so you can have a complete library of MP3 effect files used alongside your recorded speech!!! All without any special mechanisms.

Fire multiple messages and they are all played in sequence without you having to worry about overlaps.
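For the curious, playing things strictly one after another needs very little code. The sketch below shows one common way to do it and is not taken from the node itself; `createPlayQueue` and the fake player are names I have made up for the example.

```javascript
// Build a queue that hands files to play(file, done) one at a time.
// The next file only starts once the previous one has called done().
function createPlayQueue(play) {
    var pending = [];
    var busy = false;
    function next() {
        if (busy || pending.length === 0) return;
        busy = true;
        play(pending.shift(), function () {
            busy = false;
            next();
        });
    }
    return function enqueue(file) {
        pending.push(file);
        next();
    };
}

// Demonstration with a fake player that we finish by hand.
var log = [];
var finishCurrent = null;
var enqueue = createPlayQueue(function (file, done) {
    log.push("start " + file);
    finishCurrent = function () {
        log.push("end " + file);
        done();
    };
});
enqueue("a.mp3");
enqueue("b.mp3");  // queued: a.mp3 is still "playing"
finishCurrent();   // a.mp3 ends, so b.mp3 starts
finishCurrent();   // b.mp3 ends
console.log(log.join(", "));
// start a.mp3, end a.mp3, start b.mp3, end b.mp3
```

In real use the `play` function might run mpg123 through Node’s `child_process.execFile` and call `done` from its callback, though I have not checked how the node does it internally.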

All in all an excellent node – just a shame some of us didn’t read the instructions the first time around!!

Update May 22: I have now implemented an improvement to the Ivona node and sent this back to the author – time will tell if he implements it.

So it would be nice to be able to control the VOICE dynamically. For example – you might wish to have your gadgets TALK.

If you want Ivona to work via MQTT the problem is you ONLY have TOPIC and PAYLOAD. The latter is obviously the speech.

The changes I’ve made allow you to strap an MQTT input node to the Ivona node – with the topic subscription ivona/#

This means that Ivona will listen to any message sent by MQTT to ivona/ – but if you put a name after the topic – for example ivona/brian – that will change the voice temporarily!!

In the Ivona .JS file is a line as follows..

text = mustache.render(node.message, msg);

After that line, add this code (and then restart node-red of course)

// Dynamic voice control using topic: ivona/XXX where XXX is a voice, e.g. brian.

node.voice = config.voice; // reset to the default voice, otherwise any change would be permanent
var topic = (msg.topic || "").toLowerCase(); // guard against messages with no topic
if (topic.substring(0, 6) == "ivona/") {
    var wanted = topic.substring(6); // the voice name after "ivona/"
    for (var x in voices) {
        if (voices[x].name.toLowerCase() == wanted) node.voice = x;
    }
}
// end of modification
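If you want to try the topic handling above without touching the node, the same idea can be pulled out into a stand-alone function. This is only a sketch: `voiceFromTopic` and the list of voice names are hypothetical, whereas the real node looks names up in its own `voices` table.

```javascript
// Return the voice named in a topic such as "ivona/brian",
// or defaultVoice when the topic is missing or names no known voice.
function voiceFromTopic(topic, knownVoices, defaultVoice) {
    var prefix = "ivona/";
    var t = String(topic || "").toLowerCase();
    if (t.indexOf(prefix) !== 0) return defaultVoice;
    var wanted = t.substring(prefix.length);
    for (var i = 0; i < knownVoices.length; i++) {
        if (knownVoices[i].toLowerCase() === wanted) return knownVoices[i];
    }
    return defaultVoice;
}

var voiceList = ["Brian", "Emma"]; // illustrative list only
console.log(voiceFromTopic("ivona/brian", voiceList, "Emma")); // Brian
console.log(voiceFromTopic("ivona/", voiceList, "Emma"));      // Emma
console.log(voiceFromTopic(undefined, voiceList, "Emma"));     // Emma
```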

Simples! And so here we see MQTT-SPY in action.

Or with Brian talking…

And so what about sound effects? Unless you want the words “alert 1” coming out for example, you need to ensure that pre-recorded sound effect files always use the same name. As the default in my example is “emma” – ensure you do NOT send a voice when you want a sound effect.

To make this easy and compatible with the above MQTT – I used “--” as a separator so that for example you might say…

Topic: ivona/brian

Payload: alert 1--This is an alert

The first part is automatically sent to Ivona with no topic, hence the default voice and hence the need for a file called emma--alert-1.mp3, whereas the second part is sent with the topic.

And here is the code for the sub-flow:

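// Split "effect--speech" payloads into two messages: the effect first
// (no topic, so the default voice), then the speech with the original topic.
var ar = msg.payload.split("--");
if (ar.length > 1)
{
    var tTopic = msg.topic;
    msg.payload = ar[0];
    msg.topic = "";
    node.send(msg);
    msg.topic = tTopic;
    msg.payload = ar[1];
}
node.send(msg);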
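Here is the same splitting logic as a stand-alone function you can run outside Node-Red. It is a sketch under a couple of assumptions: the separator is two plain hyphens, and `splitEffect` is a name invented for this example. It builds two fresh message objects rather than re-using one.

```javascript
// Split "effect--speech" into an effect message (no topic, so the default
// voice is used) followed by a speech message that keeps the original topic.
// Payloads without the separator are passed through unchanged.
function splitEffect(msg, separator) {
    var parts = String(msg.payload).split(separator);
    if (parts.length < 2) return [msg];
    return [
        { topic: "", payload: parts[0] },
        { topic: msg.topic, payload: parts.slice(1).join(separator) }
    ];
}

var out = splitEffect(
    { topic: "ivona/brian", payload: "alert 1--This is an alert" },
    "--"
);
console.log(JSON.stringify(out));
// [{"topic":"","payload":"alert 1"},{"topic":"ivona/brian","payload":"This is an alert"}]
```

Inside a function node, something like `return [splitEffect(msg, "--")];` should send the messages in order on the first output, since Node-Red treats an array inside the returned array as a sequence of messages.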

Here is a VERY handy list of Star Trek alert mp3 files: http://www.lcarscom.net/sounds.htm – just rename the files to have dashes instead of any spaces – and prefix with emma--
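If you have a lot of files to rename, a tiny helper can work out the new names. This is a sketch only: `effectFileName` and the example file name are made up, and it assumes the prefix separator is two plain hyphens and that the node’s slugs are lower case.

```javascript
// Work out the name an effect file needs so it is picked up in place of speech:
// spaces become dashes and the default voice is put on the front.
function effectFileName(original, voice) {
    var slug = original.trim().toLowerCase().replace(/\s+/g, "-");
    return voice + "--" + slug;
}

console.log(effectFileName("Red Alert.mp3", "emma")); // emma--red-alert.mp3
```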

5 thoughts on “Ivona Again”

Peter Baldwin says:

January 29, 2017 at 5:42 pm

I have a feeling they have removed the ability to get a free API key – unless I’m blind…
1. says:
  
  January 29, 2017 at 6:02 pm
  
  It DOES look like you can no longer sign up for it…. OK GUYS we need to find another one – and not one that sounds like a tired robot please. IDEAS???
  1. Peter Baldwin says:
    
    April 2, 2017 at 11:19 pm
    
    Ivona appears to have been replaced with “Amazon Polly” which seems similar and is available free for a year. I’ve tried signing up and have various API ids, secret keys etc, and have tried putting them into the Ivona node but can’t get it to work. I don’t get an error, but it seems to create empty mp3 files in the directory.
    1. says:
      
      April 3, 2017 at 7:08 am
      
      I do hope Ivona continues for free in SOME phone as it is so good – however all is not lost entirely – as I realised while working on Debian for the mobile phone… worst case, take an old Android phone, put Tasker on it and MQTT – making sure the MQTT guy keeps his promise to make it work reliably on power up… and have incoming MQTT messages speak on the phone – the speech quality is just fine. That’s one cheap solution. Sadly the other speech systems for the likes of Pi are, to put it mildly, atrocious.
john macrae says:

May 22, 2016 at 3:10 pm

Fine Job, Peter. I have used Ivona with Node red to announce when my internet goes off and back on with messages such as ‘Dont even bother!’. Its interesting that there is a slight bug in Ivona that tries to pronounce the apostrophe in Don’t – so its best to leave it out
