Usable speech recognition

I've been using my Homey for about a month now, mostly just to control the lights in the house. And while it's great when it works, most of the time it fails.

I've created a flow that speaks back what it hears. If I make sure there is no background noise, stand at most 2 meters away, and articulate very well:

- 50% of the time it recognizes the right words
- 40% of the time it's close. I think the system should recognize similar words: if I configure 'livingroom on' and it recognizes 'livingstone on' (just an example, I'm using Dutch speech), it should still activate the flow without me configuring all the different ways Homey can almost-recognize the speech
- 10% of the time it just spins and nothing happens.

As soon as I am around the corner, more than 3 meters away, or the cat decides to walk by, the percentages become 10 / 10 / 80, which just isn't usable.

I have no experience with the competition (Google / Amazon), but I assume they are performing better?

What are the plans to improve this? 
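
The "close enough" matching suggested above could be sketched roughly like this. It is only an illustration using Python's standard `difflib`, not how Homey or the Athom cloud actually matches flows. The trigger phrases, the `best_trigger` helper and the 0.7 threshold are all assumptions made up for the example.

```python
from difflib import SequenceMatcher

# Hypothetical trigger phrases; a real setup would use the phrases configured in the flows.
TRIGGERS = ["livingroom on", "livingroom off", "kitchen on"]


def best_trigger(heard, triggers=TRIGGERS, threshold=0.7):
    """Return the trigger most similar to the recognized text, or None if nothing is close enough."""
    heard = heard.lower().strip()
    # Score every trigger by character-level similarity (0.0 = unrelated, 1.0 = identical).
    scored = [(SequenceMatcher(None, heard, t).ratio(), t) for t in triggers]
    score, trigger = max(scored)
    # The threshold is an assumed value and would need tuning on real transcripts.
    return trigger if score >= threshold else None


print(best_trigger("livingstone on"))   # near miss, still mapped to a trigger
print(best_trigger("play some music"))  # unrelated text, no trigger
```

With this threshold the near miss 'livingstone on' is mapped to 'livingroom on', while unrelated text returns `None`. A threshold that is too low would start firing the wrong flows, so it would have to be tuned on real (Dutch) recognition results.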

Comments

  • ZperX Member
    edited November 2016
    I have submitted the issue: https://github.com/athombv/homey/issues/920
    I am planning to make a review video. Could you also make a video? Otherwise we keep getting slapped down by the enthusiasts: "It works for me, why are you complaining?"
    I can confirm that both Alexa and Google work as advertised (in English).
  • It's indeed not really as advertised, and I would love an official comment. If voice recognition is not improved a whole lot, we'll just have to learn to do without. That is not nice and not honest, but it is better to know and deal with the facts than to keep on trying.

    Here is what Homey understood when I asked it to switch the office light off.

    [Attached image: 2.jpg]

    Lol
  • I can confirm this type of behavior. But I have also been digging into several forum items from the past. Somewhere I found a suggestion that the "app load" might be too high, or that an app might have crashed and be using too many system resources.

    I cleaned up some of the apps that I had only tested and did not really need, and the loadavg dropped quite a bit. It looks like the speech recognition is behaving better now. 
    It is just a suggestion..... but it never hurts to try  ;)
  • Yep... I'm curious what Athom has to say about this. I also have issues with the speech recognition. I tried almost every spot in my living room to test whether Homey performs better, but there was not a big difference.

    When I'm more than 2 meters away from Homey, I barely see the feedback LED responding to the sound of my voice. I would like to be able to adjust the volume/sensitivity of the mic, but I don't know if that is technically possible because of the noise cancellation.
  • The memory issue was about speech output, not about input, afaik... Raw recorded speech is sent to the Athom cloud. There Athom uses several services for recognition, and I suspect one of these is really bad compared to the other services...
  • I always had this issue, even with a wiped-clean Homey. Therefore it is not a memory issue... But the memory issue should be sorted out too.

    @Mathijs That is a good one, the engine used the urban dictionary.
  • The memory issue was about speech output, not about input, afaik... Raw recorded speech is sent to the Athom cloud. There Athom uses several services for recognition, and I suspect one of these is really bad compared to the other services...
    Ah... you could be right, I have read many, many different postings. It is probably a coincidence that the recognition seems better.
  • Same here.
  • The main issue is that if Athom is indeed using external engines to recognize speech, they are leeching on the investments of others, who will not appreciate this. I am running a similar project, and the moment it got some volume we were cut off and asked to pay a sum per 'transaction'. As Athom does not sell a monthly service, that would make it impossible to use these services. That would explain why it used to work well and does not work well anymore.

    Now clearly I have no idea how these things work behind the scenes. I just know that the videos Athom has online about this show an idea that I am unable to recreate, and that high-end voice recognition in serious volume without any cost just does not exist. Again, I would honestly prefer an honest 'sorry, won't happen' over the silence we have now. It would mean I can move my Homey from my desk to a place where connections are better. 

    Right now my Homey is like a deaf device next to my $50 Echo. That one is magical in recognizing voice. I do understand Athom and Amazon have different resources, and I back Athom with the two Homeys I have. I just ask for some information on what to expect.

  • honey Member
    edited November 2016
    Google speech to text api:

    Pricing table:

    0-60 minutes: free
    61 minutes - 1 million minutes: $0.006 per 15 seconds*

    Well, if Homey did not go through the Athom servers but connected directly to Google, then each Homey would probably stay within those free 60 minutes. I think Google is the best bet because of the wide range of microphone types supported; it works in noisy environments, supports many languages and accents, and is a reliable and fast service.
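
To put rough numbers on the free tier quoted above, here is a small back-of-the-envelope sketch. It uses only the two prices from the table. The assumptions are that every voice command is billed as a separate request rounded up to a full 15-second increment, that the free 60 minutes apply per month, and that a typical command takes about 5 seconds. The `monthly_cost` helper and its defaults are made up for illustration.

```python
import math

FREE_SECONDS = 60 * 60        # 0-60 minutes are free
USD_PER_INCREMENT = 0.006     # price per 15-second increment above the free tier
INCREMENT_SECONDS = 15


def monthly_cost(commands_per_day, seconds_per_command=5, days=30):
    """Estimated monthly cost in USD for one Homey talking to the API directly."""
    # Assumption: each request is billed separately, rounded up to 15 seconds.
    increments_per_command = math.ceil(seconds_per_command / INCREMENT_SECONDS)
    billed_seconds = commands_per_day * days * increments_per_command * INCREMENT_SECONDS
    # Only the part above the free tier is paid for.
    paid_seconds = max(0, billed_seconds - FREE_SECONDS)
    return paid_seconds / INCREMENT_SECONDS * USD_PER_INCREMENT


print(monthly_cost(8))    # 240 short commands a month fit in the free tier
print(monthly_cost(20))   # 600 short commands a month
```

Under these assumptions the free tier covers 240 short commands a month (60 minutes divided into 15-second increments), so about 8 commands a day would cost nothing and 20 a day would come to roughly $2.16 a month.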

    I believe Amazon's service is also accessible for other devices. But that depends a lot on microphone tuning.

    Ivona text-to-speech is not a free service either. But man, that works! The best quality I have ever heard.
    They have probably done a special deal, but Ivona costs around $30-45/voice. Their cloud service (which is not used by Homey, just for comparison): $1000/month (prepaid) + $0.003/unit for usage above 250k units/month.

    PS: I hadn't realized until now that Ivona is an Amazon company.

  • honey Member
    edited November 2016
    Microsoft:
    Short form recognition: 15 sec per call, $4 per 1000 calls
    Long form recognition: 2 min per call; 0-10 hours at $9 per hour, 10-100 hours at $7.50 per hour, over 100 hours at $5.50 per hour
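
For comparison, the Microsoft prices above can be turned into a similar rough calculation. The listing does not say whether the hourly tiers are marginal or apply to the whole volume, so this sketch assumes marginal tiers (each hour is charged at the rate of the tier it falls into). The function names are made up for the example.

```python
SHORT_FORM_USD_PER_CALL = 4 / 1000   # $4 per 1000 calls, up to 15 sec per call

# (upper bound of the tier in hours, price per hour in USD)
LONG_FORM_TIERS = [(10, 9.00), (100, 7.50), (float("inf"), 5.50)]


def short_form_cost(calls):
    """Cost in USD for a number of short form recognition calls."""
    return calls * SHORT_FORM_USD_PER_CALL


def long_form_cost(hours):
    """Cost in USD for long form recognition, assuming marginal tier pricing."""
    cost, lower = 0.0, 0.0
    for upper, rate in LONG_FORM_TIERS:
        if hours > lower:
            # Charge only the hours that fall inside this tier.
            cost += (min(hours, upper) - lower) * rate
        lower = upper
    return cost


print(short_form_cost(600))   # 600 short voice commands
print(long_form_cost(20))     # 10 hours at $9.00 plus 10 hours at $7.50
```

For typical short voice commands only the short form price matters: 600 commands would be about $2.40, with no free tier mentioned in the listing.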

  • jjtbsomhorst Member
    edited November 2016
    I have to agree with you all. The speech recognition sucks sometimes. This morning I gave up asking for lights after 5 tries. Luckily there is the app. But still, you say 'lamp San' and Homey thinks you said 'lang haar'... or 'stamp aan'...
  • Fire69 Member
    edited November 2016
    This has been discussed many times before...
    The problem is not the speech recognition service they use, it's the hardware (or lack of software optimization of the hardware (read: microphones))

    Whether you use your phone or Homey itself, the same recognition service is used.
    And if you use your phone, speech is recognized much, much better than with Homey.

    msmits said:
    When I'm more than 2 meters away from Homey, I barely see the feedback LED responding to the sound of my voice.
    I think this proves my point.  I see exactly the same behaviour.
    Homey just doesn't hear you when you're 'too far' away.

    jjtbsomhorst said:
    I have to agree with you all. The speech recognition sucks sometimes. This morning I gave up asking for lights after 5 tries. Luckily there is the app. But still, you say 'lamp San' and Homey thinks you said 'lang haar'... or 'stamp aan'...
    The problem with this example is that the recognition software tries to make a logical sentence from your speech.
    When you use very short commands, it often fails at doing this.
    Try saying 'Doe de lamp van Sam aan' ("Turn on Sam's lamp").  That will probably work better.
  • jjtbsomhorst Member
    edited November 2016
    I have to disagree on this, @Fire69. We also have a flow that is triggered when we say 'Netflix'. This flow is triggered every time, no matter where we are. So I think it depends on both the sentence and the combination of letters used, or something else.

    Fact is that this needs to be addressed by @Athom very soon, because they are now talking about going retail (in their newsletter).
  • I wonder, if I have my phone listening when I'm four meters away, would the recognition still be that good...
  • For me speech does not work 90% of the time. Personally I backed Athom because of the speech recognition feature, because for all the other functionality there are plenty of other, cheaper devices. But I must admit that I am now hooked on my Homey, even though the speech recognition sucks. I would also prefer an honest answer from Athom. I think that when people buy a Homey in a shop they will return it very quickly, because this functionality is not working and for many people it is one of the hottest features.

  • This is the unique selling point because other companies are doing it so why can't homey do it?

  • I agree with you all. Athom should respond to this, but I have noticed for a while now that Athom does not respond much on the forum. The new newsletter also does not contain much news for us. It looks more like an advertisement of how good they are; nothing is mentioned about the big problems they have, the solutions they have in mind, or a timetable.

    Athom has had a communication problem since the start. @emile admitted that and promised to do better, and for a while they did a little, writing a weekly software status. But now I only see the same behaviour again: don't tell, don't respond, don't mention, only "working hard on it". That is not enough for us, the people who are spending hours every week to get Homey working. If they don't want to send that kind of information to the normal user (if they have any), they could write a newsletter for the forum users, or a weekly status report on their 10 biggest issues and the progress on them. Not pointing to GitHub or Slack, just listing the biggest problems and their status, like:
    - 433 range BAD (problem known, working on a solution, estimated release week 40)
    - Speech recognition (problem not known, could be hardware, currently no solution available)
    - Slow speech (etc.)

    I would also like to hear the truth from them. Are they able to fix it? Is it hardware, and if so, are they working on a Homey v2.0 that can really do as promised? And if so, do we all get that at a reduced price for all our testing effort? Then we could use the Homey v1 in our bedroom, where the distance is so short that it does recognise our speech.

    I know they work hard and have a potentially beautiful product, but right now too much is too unstable. And they seem to focus on too many things.

    And what is the status of Athom's financial situation, and what about the current sales of Homeys? Do they have enough money to keep developing, or should we be afraid that they will run out of money? But I can hardly expect an honest response to that one :-)

    I really like the Homey and am willing to give more of my time to test it, but they should start communicating better and be more transparent.

  • jjtbsomhorst said:
    I have to disagree on this, @Fire69. We also have a flow that is triggered when we say 'Netflix'. This flow is triggered every time, no matter where we are. So I think it depends on both the sentence and the combination of letters used, or something else.

    Fact is that this needs to be addressed by @Athom very soon, because they are now talking about going retail (in their newsletter).
    Netflix is a pretty specific term, and this is one of the reasons why they use an online recognition service instead of doing it offline on Homey itself.  The online service can easily adapt to currently trending words; offline they would have to regularly update their dictionary.
  • Agreed, but still, 'light aan' or 'lamp aan' is one of their default terms. It should not misinterpret them so often if what you say is true, I guess. But hey, let's see what Athom has to say about it.
  • Reading the newsletter makes me very sad, but also afraid.
    They will start delivering the orders from their web store in a couple of days.
    They are also getting ready to sell Homey in shops.

    Sad because of the feeling that they don't listen to us here, although they write that they love the community.
    Afraid that once Homey is sold for real, the refunds and the extra costs will be the beginning of the end of Athom.

    The two things that made me become a first-batch backer were speech and better KaKu handling.
    We know where we stand on this.
  • msmits Member
    edited November 2016
    This was an example. I said: "Gaat het nog regenen?" ("Is it still going to rain?"). This was the result.
    I had contact with Athom about this, and they said it can be caused by my voice. Quote: For "some" voices it's difficult for Homey to recognize. But when my gf talks to Homey it's exactly the same....

    They also told me that they're still working on the voice recognition, and that the system is "self learning", so the more people talk to Homey, the better it gets.

    Now I'm running a flow which mutes the TV when Homey is listening. It's a little bit better, but still not acceptable in my opinion. 

  • Of course we can't see the source code, but most issues point to the same thing: Athom has no experience with signal processing. Voice, IR, the 433 MHz receiver, the Z-Wave range issue. How to handle noise and physical signals is clearly beyond their knowledge. They should hire a specialist or outsource it. Since Homey is all about wireless signals and voice, there is no way around it.
  • @TheoDeKoning
    I agree with you. 

    I hope that @emile or @JeroenVollenbrock or @annemarie or someone else from @Athom
    will respond to our concerns and give us more inside information about the current situation. 

    I hope they respond very soon.

  • Not everything is about coding. This is where the coding meets the physical world. I am sure they are doing their best to learn about signal processing, but they can't just accumulate 10-15 years of experience overnight, or even in 2 years. 
  • techniman Member
    edited November 2016
    Fire69 said:
    Whether you use your phone or Homey itself, the same recognition service is used.
    And if you use your phone, speech is recognized much, much better than with Homey.
    I read somewhere on the forum that the reason is that the data sent from the phone to Homey is plain text,
    i.e. the phone's own speech recognition software is used, e.g. Google or Siri,
    and that's why it's a) better and b) faster.

    Personally I gave up on speech recognition and spoken replies through Homey.
    Because it works so badly, the WAF of the Homey is also very low (to the point where she said "I'm going to throw that paperweight out").

    One workaround I can think of is to use Tasker + AutoVoice to trigger a Homey flow.
    But that kind of defeats the purpose of Homey...

  • techniman said:
    Fire69 said:
    Whether you use your phone or Homey itself, the same recognition service is used.
    And if you use your phone, speech is recognized much, much better than with Homey.
    I read somewhere on the forum that the reason is that the data sent from the phone to Homey is plain text,
    i.e. the phone's own speech recognition software is used, e.g. Google or Siri,
    and that's why it's a) better and b) faster.

    That is incorrect and it has been confirmed by Athom on Slack that the same service is used on phone and Homey.
  • Fire69 said:

    That is incorrect and it has been confirmed by Athom on Slack that the same service is used on phone and Homey.

    Now that is VERY scary. It means that indeed the problem is hardware and that makes a fix nearly impossible. 

    Let's hope that at the meeting this month the developers can show that voice recognition does work for them. If it does not work for them, I think we bought a defective device that should be repaired. That happens all the time, is not a big deal, and I'm not worried. I am sure Athom will make it work as they advertised it. If they are able to throw in Spotify....

    The fact they do not respond at all however is not a good sign. 

  • Mathijs said:
    Fire69 said:

    That is incorrect and it has been confirmed by Athom on Slack that the same service is used on phone and Homey.

    Now that is VERY scary. It means that indeed the problem is hardware and that makes a fix nearly impossible. 

    Let's hope that at the meeting this month the developers can show that voice recognition does work for them. If it does not work for them, I think we bought a defective device that should be repaired. That happens all the time, is not a big deal, and I'm not worried. I am sure Athom will make it work as they advertised it. If they are able to throw in Spotify....

    The fact they do not respond at all however is not a good sign. 

    No that is not scary at all.
    It means that they have problems with their echo cancellation and noise removal, and not with the speech recognition. Homey's mic is on par with, if not better than, the mics in most phones. The problem is that you're not 10-20cm away from Homey's microphone.
  • KoenMartens said:

    No that is not scary at all.
    It means that they have problems with their echo cancellation and noise removal, and not with the speech recognition. Homey's mic is on par with, if not better than, the mics in most phones. The problem is that you're not 10-20cm away from Homey's microphone.
    Yep :)
    And it has 2 high quality (according to Athom :) ) microphones, so that should not be the problem :)