[TROUBLESHOOT] Z-wave ProcessSendData lockup

TedTolboom Member
edited December 2016 in Questions & Help
An attempt to create a generic troubleshooting guide for the issue where Homey's Z-wave controller gets into a lockup state, shown in Settings > Z-wave > Stuff for geeks as currentProcess: ProcessSendData.
Main hypothesis:
Z-wave devices connected to Homey with their default parameters, together with the appliances connected to them, start sending data reports to Homey too frequently; this basically DDoS-es Homey and results in a lockup of the Z-wave controller.
Statements:
  1. this is an issue not related to 1 specific brand (e.g. Greenwave) or type (e.g. power switch) of Z-wave devices.
    I’ve observed this issue with my Greenwave power nodes, Neo Power plugs as well as with my Fibaro Motion sensor 
  2. Besides the Z-wave device itself, the appliance connected to it (if applicable) also plays a relevant role,
    specifically appliances with a slightly fluctuating standby power consumption 
  3. Different data types can cause this issue (e.g. power consumption / illumination reports etc) 
To start troubleshooting this issue (Z-wave lock-up with currentProcess: ProcessSendData): 
  1. Start looking into your insights data 
  2. Look for devices showing erratic data logging (reports arriving more often than the expected reporting interval; notice the jittering during the night)

  3. Do not completely trust the graphs in Insights. Download the data as logged and view it in a text editor (mine showed reports every 9 seconds)

  4. Keep in mind that Homey's parser only records data with a different timestamp AND a different value:
    if your device reported a value that did not change, it might not show up in your graph / data
  5. If you have found a suspicious device, start looking at the settings of the device (e.g. the polling interval) and make adjustments if needed
  6. Some things to keep in mind: not all parameters (including their default values) are implemented in the current apps yet. Most apps implement the polling interval or the reporting interval / threshold,
    but, for instance, the relative power increase that triggers a report to Homey is missing in most apps
  7. So if you have a suspicious device, please report it back in this topic.
    Please add: device name + settings used + data as logged
  8. Complete your investigation through all your devices
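
To make steps 3 and 4 concrete, here is a minimal Python sketch that computes the time between consecutive logged reports and flags a device that reports far more often than its configured interval. The export format (one `timestamp,value` pair per line with ISO timestamps), the sample values, and the helper names `parse_log`, `report_intervals` and `is_suspicious` are assumptions for illustration, not Homey's actual export format or API; adapt the parsing to the file you download.

```python
from datetime import datetime
from statistics import median


def parse_log(text):
    """Parse 'timestamp,value' lines (assumed format) into (datetime, float) tuples."""
    rows = []
    for line in text.strip().splitlines():
        stamp, value = line.split(",")
        rows.append((datetime.fromisoformat(stamp.strip()), float(value)))
    return rows


def report_intervals(rows):
    """Return the seconds between each pair of consecutive logged reports."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(rows, rows[1:])]


def is_suspicious(rows, expected_interval_s, tolerance=0.5):
    """Flag a device whose median interval is well below the expected one."""
    intervals = report_intervals(rows)
    if not intervals:
        return False
    return median(intervals) < expected_interval_s * tolerance


# Made-up sample data: a device logging a new value every 9 seconds.
SAMPLE = """\
2016-12-01T02:00:00,1.2
2016-12-01T02:00:09,1.3
2016-12-01T02:00:18,1.2
2016-12-01T02:00:27,1.4
"""

rows = parse_log(SAMPLE)
print(report_intervals(rows))                        # [9.0, 9.0, 9.0]
print(is_suspicious(rows, expected_interval_s=300))  # True
```

Because only reports with a changed value end up in the logged data, the real report rate can be higher than what this computes; treat the result as a lower bound.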

For some specific issues found, including their mitigation actions, see the second post in this topic.

With my current setup (3x Greenwave powernode 6, 2x Greenwave powernode 1, 2x Fibaro Motion Sensor (non-plus) and 2x Neo power plugs), I had several lockups per day... https://github.com/athombv/com.greenwavesystems/issues/18

Based on the above troubleshooting steps and solution directions, I was able to get the Z-wave network stable again (no lockups yet)

Please contribute to this Troubleshooting guide by providing feedback in this topic.

Contributors:
@Phuturist @caseda

Comments

  • TedTolboom Member
    edited December 2016

    Some specific issues found, including their mitigation action 
    (will be updated based on additional feedback in this topic)

    1. Greenwave power nodes: 
      Association group 3 reports on power value changes. 
      In the current app, Homey’s ID is added to this group by default, and parameter 0 is set to 10% (power increase) by default.
      Solution: remove Homey’s ID from association group 3. A pull request has been made to include this parameter in the app and increase the default value
    2. NEO power plugs:
      Similar issues as with the Greenwaves. The default power value change is set to 30%, but there is no separate association group.
      Solution: add 6,1,80 to the expert parameters (setting the power value change to 80%) 
      A pull request has been made on the NEO power plug app to include this parameter and increase the default value; expect the updated app to be released to the app store this week

    3. Other devices with a similar “power value change” parameter: 
      Aeotec: only 5% power change (did not have the time to look into a solution, check the manual)
      Fibaro wall plug: the default setting is 80% power change, which should not lead to issues

    4. Fibaro Motion Sensor illumination reports
      In addition, I also had issues where my Fibaro Motion Sensor (non-plus) started to send erratic (too frequent) reports on illumination changes (not linked to the illumination report interval and illumination report threshold settings I applied). I do not understand this issue (https://github.com/athombv/com.fibaro/issues/70) yet…
      Mitigation: setting the illumination report threshold to 0 (lux); now my motion sensor reports every 300 seconds (good enough for me)
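
To illustrate why a low relative power value change threshold is risky, the following Python sketch counts how many reports a plug would send for a slightly fluctuating standby load at the 10%, 30% and 80% settings mentioned above. It assumes the device compares each reading with the last value it reported; the actual firmware logic may differ, and the wattage series and the function name `count_reports` are made up for illustration.

```python
def count_reports(readings_w, threshold_pct):
    """Count the reports sent when a reading differs from the last reported
    value by at least threshold_pct percent (assumed device behaviour)."""
    reports = 0
    last_reported = readings_w[0]
    for watts in readings_w[1:]:
        if last_reported == 0:
            # Avoid dividing by zero: any non-zero reading counts as a change.
            changed = watts != 0
        else:
            change_pct = abs(watts - last_reported) / last_reported * 100
            changed = change_pct >= threshold_pct
        if changed:
            reports += 1
            last_reported = watts
    return reports


# Hypothetical standby consumption in watts, jittering between 1.0 and 1.4 W.
standby = [1.0, 1.2, 1.0, 1.4, 1.0, 1.2, 1.0, 1.4, 1.0]

for pct in (10, 30, 80):
    print(f"{pct}% threshold: {count_reports(standby, pct)} reports")
# 10% threshold: 8 reports
# 30% threshold: 1 reports
# 80% threshold: 0 reports
```

With this made-up series, every single reading triggers a report at 10%, while the 80% setting stays silent, which matches the observation that small standby fluctuations are enough to flood the controller.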
  • I assume these are workarounds and the troubles go much deeper. It can be a nice solution for now, but I think it has to be investigated. As far as I know, Athom is looking into it and Emile is doing his best to find a solution for it.
  • MHubert Member
    edited December 2016
    TL;DR (great initiative though) 

    Isn't this a case which Athom should solve? Hanging Homeys and/or crashing Z-wave chips should be their highest priority and communication topic! 

    Taking away the issue(s)/symptoms could result in Athom not receiving any crash reports anymore and not being able to find the bug/root cause. 

  • @kasteleman @MHubert See the referenced GitHub issues with the details I already shared to enable Athom to find the bug / root cause. I also invited @Emile to access my Homey to support this root cause finding.

    While awaiting their feedback, I do like to have an operational Z-wave network and avoid having to continuously monitor its state... These steps (yes, too long) helped me, and in the meantime others, to get a stable Z-wave network again... Use it to your own advantage.
  • @kasteleman : A Z-wave message with every 10% energy change seems like a design flaw of the sensor :p If you connect a mobile phone charger, it will constantly send messages. Even at 100% it reports every change from 1 to 2 watt...
  • That could be, but I have not yet seen it crash my Z-wave controller. I have only seen that polling a Greenwave powernode 6 crashes my non-Homey Z-wave controller (Æon Labs Z-Stick, non Z-wave Plus), after which I have to unplug the controller.
  • Eternity Member
    edited December 2016
    Checking all associations of my Z-wave devices, I saw that both my Fibaro relays had group 3 set to 1:


    All other Z-wave devices had Group 1 set to 1 (Greenwave powernodes and Aeotec sensors). I changed it, so now Group 1 has 1 and Groups 2 and 3 are empty. That's the way it should be, I presume. 

    This is what the Fibaro manual states:
  • Perhaps we should fill some wiki-pages on https://github.com/athombv/com.fibaro and https://github.com/athombv/com.greenwavesystems to collect the correct settings?


  • TedTolboom Member
    edited December 2016
    Eternity said:
    Checking all associations of my Z-wave devices, I saw that both my Fibaro relays had group 3 set to 1:
    ...
    All other Z-wave devices had Group 1 set to 1 (Greenwave powernodes and Aeotec sensors). I changed it, so now Group 1 has 1 and Groups 2 and 3 are empty. That's the way it should be, I presume. 
    @Eternity association groups do not have a standard definition; they even vary between devices from a single manufacturer.

    Based on the manual section you referenced, Homey's ID should be in group 3 (as per the default setting). Group 1 and group 2 enable triggering other devices once switch 1 (S1) or switch 2 (S2) is activated.

  • Eternity said:
    Checking all associations of my Z-wave devices, I saw that both my Fibaro relays had group 3 set to 1:


    All other Z-wave devices had Group 1 set to 1 (Greenwave powernodes and Aeotec sensors). I changed it, so now Group 1 has 1 and Groups 2 and 3 are empty. That's the way it should be, I presume. 

    This is what the Fibaro manual states:
    Uhmmm, as I read it, you should assign group 3 to Homey (the main controller), and only Homey, because it is the controller. To the other groups you can assign other Z-wave devices!
  • Thanks for your advice, guys!
  • IMHO, it would help to discover and troubleshoot these kinds of problems if there were an activity log available with (msec) timestamps, so you can see which device is doing what when the system hangs (or to troubleshoot why your flow doesn't work).
    Insights isn't suitable for that.
    I think the community can help Athom troubleshoot issues a lot faster this way:
    community happy, and Athom gets more time to develop new features.

    just my 2 cents.

    anyway, great job for all people involved troubleshooting and fixing things this way!
  • Too bad, after 21 hours (still a personal record :smile:  )
     I have the ProcessSendData issue again and Z-wave is not working anymore.
    This was with the default polling intervals.

    Going to reboot Homey and change the poll interval to see if this helps.
  • Emile Administrator, Athom
    In Homey v1.0.4 (up for experimental release this week) the SendData process should never hang again. If it occurs, please report back!
  • Thanks Emile!!!
  • Da_JoJo Member
    edited December 2016
    That would be the most valuable thing, @Emile.
    I was almost at the point of buying a new Vera, as that is the only thing close to a Homey and it actually works great.
    I have Z-wave throughout my whole house and I just need it to work, even though that would mean a lot of options not working or not being available. I cannot have a controller that does not respond to commands; it would mean I cannot turn on lights, open my window coverings, open the door, or even hear the doorbell when someone is at the door, nor use my alarm. It is just plain ridiculous that it is not working. There is also this thing with flows: when you update a device, you have to redo all flows using this device. Also, not responding to alarm frames and to simple things like a basic switch, thermostat and other unrecognized Z-wave devices is a no-go when advertising that it supports Z-wave. It just plain does not. I hope you guys understand and fix this ASAP as first priority, otherwise the whole thing is doomed to fail.
  • Emile said:
    In Homey v1.0.4 (up for experimental release this week) the SendData process should never hang again. If it occurs, please report back!
    I am experiencing a lot of ProcessSendData issues again since Fibaro v1.4.3 - 24.04.17.
    Homey is running the current stable version and regularly becomes unresponsive (offline in the app) for 10+ minutes. Restarting the micro controller or restarting Homey only solves it for the first 30 minutes. 
  • caseda Member
    edited May 2017
    @Jeroenvano
    Is it really hanging? Or is it just processing lots and lots of data?

    And restarting the micro controller doesn't restart the Z-wave chip, only the 433 and normal 868 chips, so restarting it shouldn't make any difference to Homey hanging if it really is the Z-wave.
  • @caseda it looks to be sending a lot of data, but it is hard to check as Homey goes offline a lot lately. 

    My other Z-wave devices have no issues other than a long delay in executing (sometimes 10+ minutes) 

    If I had to put money on it, I would say it is the latest Fibaro update, as I have been experiencing this since that last app update 
  • caseda Member
    The only thing the last update does is fix an issue for the smoke sensor plus; the rest of the app is untouched.
    It probably is a domino effect that was triggered by an app that takes long to load (the Fibaro app).

    But to see what is really causing it I will need your Z-wave log; there you should be able to see how much data is being received by Homey (because that is what "processSendData" is saying: processing the data that was sent by other devices).

    And it might already be fixed in 1.3, since there is a fix in there for what looks almost like the issue you are describing.
  • We could maybe try the same trick here:
    Disable all your Z-wave apps. Don't delete them, just stop them. PTP (pull the plug) and wait for about 15 minutes before plugging it back in. After Homey is really done with everything, start the Z-wave apps one by one again.
  • The devices with a lot of send data are the Fibaro motion sensors. I have sent the Fibaro log to @caseda in a PM.
  • caseda Member
    Hmm, a combination of the security issue with no ack in between, that is a first for me :p but I guess I haven't seen all logs from everyone.

    It almost looks like Homey still keeps a lot of send data on its hard drive (swap data) when it was too busy and was then rebooted,
    and the 15 min PTP trick @Rocodamelshe suggests exceeds the time limit within which that data can still be used.

    definitely try that
  • @Jeroenvano can you forward the log (or post it here)...
    I do recognise the security issue and no_acknowledgements from issues I've created on 1.2.2 (assume you are on that release)... and I would take the bet on your statement that it is Fibaro app version related...
  • @TedTolboom I've sent you a PM with the log. 
    Homey Firmware Version: 1.2.2

    Let me know if you need to know more or if you want to chat on slack

  • @Jeroenvano check https://github.com/athombv/homey/issues/1468
    I had similar security related issues on 1.2.2... what worked (at least temporarily) is to manually wake up the sensors (the security transmission will then succeed)

    I also suspected the Fibaro app version... so I checked whether an older version of the Fibaro app (one of my test versions) resolved the Z-wave issues I was experiencing... and it didn't... hence my bet

    With 1.3.0-RC5 I did not see these security commands in the log; yet there are some other issues that need to be resolved first
  • @TedTolboom Yes exactly the same, good to see I am not the only one :)

    Manual wake-up did work, but as you described not always. One thing for sure is that communication with the sensors is unreliable, it comes and goes. I have 3 sensors and it is not always related to a specific sensor. When the ProcessSendData issues and security issues are active on Homey, the flows with all zwave devices act slow, sometimes up to 20 minutes late. 