Network Ports disappearing #5588

Closed
sundeepgoel72 opened this issue Jan 5, 2019 · 37 comments
Labels
waiting for feedback Waiting on additional info. If it's not received, the issue may be closed.

Comments

@sundeepgoel72

sundeepgoel72 commented Jan 5, 2019

Platform

  • Hardware: ESP-12
  • Core Version: 2_5_0_BETA2
  • Development Env: Arduino IDE (v1.8.8)
  • Operating System: Windows

Settings in IDE

  • Module: Generic ESP8266
  • Flash Mode: DIO
  • Flash Size: 1MB
  • lwip Variant: v2 Lower Memory
  • Reset Method: ck
  • Flash Frequency: 40Mhz
  • CPU Frequency: 80Mhz
  • Upload Using: OTA
  • Upload Speed: NA

Problem Description

I have multiple ESP modules with OTA configured, and all of their network ports appear fine in the Bonjour browser. In the IDE, however, all network ports appear on startup, but soon all except one or two disappear. There is no pattern to the ones that remain; it appears to be random.

Seems to be an IDE / mDNS issue rather than code / sketch related.

MCVE Sketch

//------------------------------------------------------------------------------

#include <ESP8266WiFiMulti.h>
#include <ESP8266WiFi.h>

#include <ESP8266mDNS.h>
#include <WiFiUdp.h>
#include <ArduinoOTA.h>

ESP8266WiFiMulti WiFiMulti;
WiFiClient espClient;

void setup() 
{
  Serial.begin(115200);
  delay(1000);
  WiFiMulti.addAP("SG-GF","xxx");
  Serial.println();
  Serial.print("Connecting to Wifi: ");
  Serial.println(WiFi.SSID());  

  while(WiFiMulti.run() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }

  if(WiFiMulti.run() == WL_CONNECTED) {
    Serial.println("");
    Serial.print("WiFi connected. ");
    Serial.print("IP address: ");
    Serial.println(WiFi.localIP());
  }

  ArduinoOTA.setHostname("myesp8266Roaming2");

  ArduinoOTA.onStart([]() {
    String type;
    if (ArduinoOTA.getCommand() == U_FLASH) {
      type = "sketch";
    } else { // U_SPIFFS
      type = "filesystem";
    }

    Serial.println("Start updating " + type);
  });
  ArduinoOTA.onEnd([]() {
    Serial.println("\nEnd");
  });
  ArduinoOTA.onProgress([](unsigned int progress, unsigned int total) {
    Serial.printf("Progress: %u%%\r", (progress / (total / 100)));
  });
  ArduinoOTA.onError([](ota_error_t error) {
    Serial.printf("Error[%u]: ", error);
    if (error == OTA_AUTH_ERROR) {
      Serial.println("Auth Failed");
    } else if (error == OTA_BEGIN_ERROR) {
      Serial.println("Begin Failed");
    } else if (error == OTA_CONNECT_ERROR) {
      Serial.println("Connect Failed");
    } else if (error == OTA_RECEIVE_ERROR) {
      Serial.println("Receive Failed");
    } else if (error == OTA_END_ERROR) {
      Serial.println("End Failed");
    }
  });
  ArduinoOTA.begin();
  
}

void loop() 
{
  if (WiFiMulti.run() != WL_CONNECTED)
  {
    Serial.print("Reconnecting to WiFi ");
  }
  ArduinoOTA.handle();
}

Debug Messages

Serial output on startup ---------------------
[MDNSResponder] _parseQuery: Possible race-condition for host domain detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for service domain myesp8266Roaming.arduino.tcp detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for host domain detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for service domain myesp8266Roaming.arduino.tcp detected while probing.

@devyte
Collaborator

devyte commented Jan 5, 2019

@sundeepgoel72 you ignored the issue template. Which core version? What is your platform? What is the MCVE sketch to reproduce? Etc.
Please edit your post and add the required fields.

CC @LaborEtArs .

devyte added the "waiting for feedback" label Jan 5, 2019
@sundeepgoel72
Author

@sundeepgoel72 you ignored the issue template. Which core version? What is your platform? What is the MCVE sketch to reproduce? Etc.
Please edit your post and add the required fields.

CC @LaborEtArs .

Apologies for missing that, it's my first post here. I have updated the issue based on the template.

@devyte
Collaborator

devyte commented Jan 5, 2019

The request is for an MCVE sketch, where M means Minimal. Could you please reduce? e.g.: remove thingspeak, sensors, etc and provide as small a sketch as possible that shows your probe issue.

What happens if you remove that delay(50) in the loop for all devices?

Side note: your timing calc millis() > nextTx won't work as expected on rollover. I suggest using the periodic polledTimeout instead. There are examples and device tests you can check out for usage.
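
To illustrate the rollover remark: the sketch below is not the polledTimeout helper recommended above. It is the plain unsigned-subtraction idiom, written as portable C++ so it can be checked off-device. The function names and the `nextTx` parameter are hypothetical; on the ESP8266 the `now` argument would come from `millis()`.

```cpp
#include <cstdint>

// Rollover-safe check: true once `interval` ms have passed since `start`.
// Unsigned subtraction wraps modulo 2^32, so the elapsed time stays correct
// even after the millisecond counter wraps back to zero.
bool intervalElapsed(std::uint32_t now, std::uint32_t start, std::uint32_t interval) {
    return static_cast<std::uint32_t>(now - start) >= interval;
}

// The absolute-deadline comparison, shown only for contrast.
// If nextTx = start + interval wraps past zero, `now > nextTx` is true
// immediately, so the check fires far too early.
bool naiveDeadlineReached(std::uint32_t now, std::uint32_t nextTx) {
    return now > nextTx;
}
```

With `start = 0xFFFFFF00` and `interval = 0x200`, the deadline wraps to `0x100`. At `now = 0xFFFFFF80` only 128 ms have passed, yet the naive comparison already reports the deadline as reached, while the subtraction form waits until the counter itself has wrapped to `0x100`.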

@sundeepgoel72
Author

sundeepgoel72 commented Jan 5, 2019

The request is for an MCVE sketch, where M means Minimal. Could you please reduce? e.g.: remove thingspeak, sensors, etc and provide as small a sketch as possible that shows your probe issue.

What happens if you remove that delay(50) in the loop for all devices?

Side note: your timing calc millis() > nextTx won't work as expected on rollover. I suggest using the periodic polledTimeout instead. There are examples and device tests you can check out for usage.

Hi, I have updated the issue with a minimal sketch. The same issue persists.

Serial output
[MDNSResponder] _parseQuery: Possible race-condition for host domain detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for service domain myesp8266Roaming2.arduino.tcp detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for host domain detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for service domain myesp8266Roaming2.arduino.tcp detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for host domain detected while probing.
[MDNSResponder] _parseQuery: Possible race-condition for service domain myesp8266Roaming2.arduino.tcp detected while probing.

@LaborEtArs
Contributor

When using the mDNSResponder you should add MDNS.update() in the loop.
The race condition might be because two or more devices in your network are using the same hostname (or are trying to do so). I can't see the probe callback in your MCVE code... how do you resolve hostname conflicts?
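
One common way to avoid the same-hostname case raised here is to derive each device's name from a per-chip identifier. The following is a minimal sketch under assumptions: `makeHostname` is a hypothetical helper, the `esp8266` prefix is only an example, and on a real device the ID would typically come from `ESP.getChipId()`, with the result passed to `ArduinoOTA.setHostname(name.c_str())`. It is written as portable C++ so it can be compiled off-device.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: builds "<prefix>-<6 hex digits>" from the low 24 bits
// of a chip ID, so two devices flashed with the same image get different names.
std::string makeHostname(const std::string& prefix, std::uint32_t chipId) {
    char suffix[9];
    std::snprintf(suffix, sizeof(suffix), "%06x",
                  static_cast<unsigned>(chipId & 0xFFFFFFu));
    return prefix + "-" + suffix;
}
```

This only helps when the conflict really is a duplicate name; it does nothing for discovery problems on the IDE side.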

@devyte
Collaborator

devyte commented Jan 5, 2019

@LaborEtArs the ArduinoOTA.handle() call internally calls MDNS.update().

@devyte
Collaborator

devyte commented Jan 5, 2019

From the MCVE:

ArduinoOTA.setHostname("myesp8266Roaming2");

Are all ESPs setting the same hostname?

@sundeepgoel72
Author

From the MCVE:

ArduinoOTA.setHostname("myesp8266Roaming2");

Are all ESPs setting the same hostname?

No, all have different hostnames.

@LaborEtArs
Contributor

The race conditions are detected when some other network member queries for our domain name while we're not yet done probing for this name. In your case, this might be an 'old' user of your device asking for the IP address (or someone trying to update a DNS cache) while the device is restarting (and the mDNS responder is probing for the hostname). However, this shouldn't lead to a problem IF the tiebreak is handled successfully.
You see these two messages because I forgot to remove two non-DEBUG_EX_...-encapsulated messages for exactly this situation... sometimes even errors may have their use....
Unfortunately I wasn't able to create a really reproducible test case for the tiebreak code; maybe you succeeded now ;-)
It would be great if you could enable the debug messages in the tiebreaking code blocks to see what happens there.
LEAmDNS_Control.cpp, lines 345ff.
LEAmDNS_Control.cpp, lines 435ff.
LEAmDNS_Control.cpp, lines 510ff.
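
For readers unfamiliar with the tiebreak mentioned above, here is a simplified sketch of the rule as described in RFC 6762, section 8.2. It is not the LEAmDNS code: `ProbeRecord`, `compareRecords` and `weWinTiebreak` are hypothetical names, and only a single record per side is compared (the RFC compares sorted lists of records). Records are ordered by class, then type, then the raw rdata bytes as unsigned values; the host whose data is lexicographically later wins and keeps the name.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one proposed resource record.
struct ProbeRecord {
    std::uint16_t rrClass;            // class with the cache-flush bit cleared
    std::uint16_t rrType;             // e.g. 1 for an A record
    std::vector<std::uint8_t> rdata;  // raw rdata bytes
};

// Returns <0 if a is lexicographically earlier, 0 if equal, >0 if later.
int compareRecords(const ProbeRecord& a, const ProbeRecord& b) {
    if (a.rrClass != b.rrClass) return a.rrClass < b.rrClass ? -1 : 1;
    if (a.rrType != b.rrType) return a.rrType < b.rrType ? -1 : 1;
    if (a.rdata == b.rdata) return 0;
    // Byte-wise unsigned comparison; a strict prefix counts as earlier.
    return std::lexicographical_compare(a.rdata.begin(), a.rdata.end(),
                                        b.rdata.begin(), b.rdata.end()) ? -1 : 1;
}

// We keep probing with our name only if our data is strictly later.
bool weWinTiebreak(const ProbeRecord& ours, const ProbeRecord& theirs) {
    return compareRecords(ours, theirs) > 0;
}
```

Using the RFC's own example, an A record for 169.254.99.200 loses against 169.254.200.50, because the third byte 99 is lower than 200. Identical data is not a conflict at all, so neither side "wins".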

@reaper7
Contributor

reaper7 commented Jan 12, 2019

I have the same problem on win 10.
I tried different configs:
Arduino IDE 1.8.7/1.8.8
Core 2.4.2 stable/2.5.0b2/2.5.0dev

Bonjour browser shows all my esp devices
but on arduino ide they appear and disappear at random.
I rarely see all of them, most often appear individually.
The sketch seems irrelevant, the same is happening in my own sketch
but also on external projects like sonoff-tasmota, espurna and any others.

Devices appear frequently when I use the "compile" button,
but when I try to flash it turns out that the device is no longer visible.

This is very annoying, considering the time devoted to compilation.
I do not know the process of OTA device detection/refresh in IDE
but maybe it should be done more often?

[screenshot: ota_problem]

@d-a-v
Collaborator

d-a-v commented Jan 12, 2019

This is very annoying, considering the time devoted to compilation.

While waiting for new mDNS to be debugged, you can start arduino OTA by hand:

# This script will push an OTA update to the ESP
# use it like: python espota.py -i <ESP_IP_address> -I <Host_IP_address> -p <ESP_port> -P <Host_port> [-a password] -f <sketch.bin>

@samlof

samlof commented Jan 13, 2019

Yeah same problem. I can take the esp8266-id.local address from Bonjour Browser and use that with espota.py and I can upload thru net.
Doesn't show up in Ports. Using Arduino ide 1.8.8 with 2.5.0-beta2, with nodemcu v2. Using the BasicOTA example sketch.
I tried the 2.4.0 core but that didn't work either. No idea whether the problem is here or with the Arduino IDE.

Edit: I slept thru the night and waking up, it does now show up on arduino ide too. Just had computer on sleep mode, so no rebooting.

Edit2: disappeared from Arduino Ide again as I tried a new chip. However vMicro shows it and works with Visual Studio just as expected. I'm thinking it's a problem with Arduino IDE and not the esp8266 core?

@LaborEtArs
Contributor

If the problem exists with the v2.4.x core as well as with the v2.5.x core, the reason can't be the new mDNS responder, as it was only introduced with 2.5. At least it can't be the only reason...
Has someone tried the old responder in the v2.5 core?
However, the new mDNS responder did have a bug that caused the ESP to get lost in ZeroConf in some configurations after the first host domain timeout (TTL, 120s) was reached. The PR for a fix is running right now...

@pfeerick
Contributor

I may (or may not!) be suffering the same issue. I'm using both the Arduino IDE (at times, when I'm lazy) and PlatformIO. The Arduino IDE is using 2.5.0 beta2, and PlatformIO is still using core 2_4_2. I've noticed of late that only one or occasionally two/three mDNS discovery devices are showing up in the Arduino IDE, although that hasn't really bothered me, as it was just one more reason to be using PlatformIO as it is seeing them all fine! :-/

I found out at one point that disabling and re-enabling my ethernet connection seemed to wake the Arduino IDE up (sometimes, but doesn't seem to of late). This is when both PlatformIO and zeroconf browser (as I found bonjourBrowser a bit buggy at times) could see all the devices just fine.

A side note - I noticed that the three devices not visible to the Arduino IDE have 'board' set to PLATFORMIO_OAK rather than ESP8266_OAK... related? Relevant?

[screenshots: arduino_ide_ports, platform_io, zeroconf_browser]

@devyte
Collaborator

devyte commented Jan 19, 2019

There was a relevant fix to mdns merged recently. Please retest with latest git.

@sundeepgoel72
Author

There was a relevant fix to mdns merged recently. Please retest with latest git.

Which is the right library to use? I presume this means the issue was / is in the mDNS library as opposed to the IDE itself ...

@LaborEtArs
Contributor

As the issue appeared when using core v2.4.x as well as when using core 2.5.0, it might not have been (completely) solved by the update of core 2.5.0 ...
In addition, there are a few minor issues left, for which the PR is running.
But: On my system, all the ESPs stay visible in the IDE (using Eclipse) and the Bonjour browser now :-)

@sundeepgoel72
Author

sundeepgoel72 commented Jan 21, 2019

As the issue appeared when using core v2.4.x as well as when using core 2.5.0, it might not have been (completely) solved by the update of core 2.5.0 ...

I have 2.5.0-beta2 installed. The IDE is not giving an upgrade option. Am I on the right version?

@pfeerick
Contributor

pfeerick commented Jan 22, 2019

I take it you found the update option in the board manager then (I was about to reply to your 'how to upgrade' question, which you've since edited ;) )? The latest 'released' version is 2.5.0-beta2, but changes have been made since then, so you should consider installing the bleeding-edge git version (install instructions here).

@devyte
Collaborator

devyte commented Jan 23, 2019

2.5.0-beta1 had a complete MDNS responder rewrite. There have been recent fixes to mdns merged after beta2 that could be relevant. Please retest with latest git, installation instructions are on readthedocs. Or you can wait until beta3.

@pfeerick
Contributor

pfeerick commented Jan 24, 2019

I'm still not convinced recent changes to MDNS are the culprit here... after all, if other browsers such as PlatformIO and zeroconfbrowser can see the devices, then isn't MDNS working fine?

There are at least two relevant issues open on the Arduino GitHub issue tracker. 6695 suggests that if there are any issues on their end, they come from updating the jmdns library. There is also the recently opened 8408, which builds on the first issue and mentions what appears to be a new mDNS discovery system in the Arduino IDE betas... which I'm about to have a look at.

But more relevantly to this issue... I'll just leave some screenshots to show some inconsistent behaviour based on IDE version and opening of the ports menu... I have one device that is running ArduinoOTA, first built against 2.5.0 beta2 whilst on Windows, and then against the current git version of the ESP8266 core when on Linux.

Windows 10, Arduino IDE 1.8.8, ESP8266 v2.5.0 beta2, first open of ports menu
[screenshot: ports_188]

Windows 10, Arduino IDE 1.8.8, ESP8266 v2.5.0 beta2, second open of ports menu (haven't closed the IDE, or changed ANYTHING else)
[screenshot: ports_188_2nd run]

Ok.... so let's try v1.8.1, as there was mention that v1.8.1 predates some updates to jmdns and that discovery was working better there. Windows 10, Arduino IDE 1.8.1, ESP8266 v2.5.0 beta2, first open of ports menu (no change on subsequent opens of menu).

[screenshot: ports_181]

Whilst PlatformIO the whole time was showing... guess what... all four (actually five, but ignore Oak2 - it just woke up from deep sleep in time to make an appearance for the screenshot).
[screenshot from 2019-01-24 19-35-27]

So what about against the latest git version, and throw Linux into the mix to see if it's OS specific...

Ubuntu 18.10, Arduino IDE 1.8.8, ESP8266 git version, first open of ports menu
[screenshot: linux_188_1st open of menu]

Ubuntu 18.10, Arduino IDE 1.8.8, ESP8266 git version, second open of ports menu
[screenshot: linux_188_2nd open of menu]

That's it... I give up on mDNS working properly with the Arduino IDE unless they've fixed it in the beta version!

@pfeerick
Contributor

Woohoo! 1.9.0 beta of the Arduino IDE seems to have some... issues with duplication of entries, but at least mDNS discovery seems rock solid so far in my brief 5 minutes of attempting to break it... all devices are being discovered, consistently, and upload was working just fine. This is again against 2.5.0 beta2 as that's what my Windows 10 environment is currently on.

[screenshot]

@pieman64

@pfeerick I have Windows and Ubuntu systems, and my main Windows machine has always struggled with port discovery. After a reboot it works for a while but not for long. I switched to HTTP OTA updates rather than local OTA updates.

Will check out 1.9.0 beta though before I discard local OTA altogether.

@pfeerick
Contributor

pfeerick commented Jan 24, 2019

@pieman64 Yeah, I've found over the years that Linux is much more stable with networking stuff. On the OTA/mDNS front, the mDNS discovery that PlatformIO uses seems to be stable and working on Windows, which is why I point the finger at the Arduino IDE. Especially since its mDNS discovery is just as bad on Linux (well, almost as bad... sometimes the full list sticks around, or starts to reappear after a few minutes), whilst PlatformIO is working just fine there also - and has a refresh button!!

@reaper7
Contributor

reaper7 commented Jan 25, 2019

The latest changes look very promising:
after a long time working with Arduino 1.8.8, all my esp82xx OTA devices are still visible :)

@reaper7
Contributor

reaper7 commented Feb 21, 2019

Looks like the "disappearing network ports" problem is back since the SDK was reverted from pre-3.0 to 2.2.1 :(

I recompiled and uploaded the same espurna project to 2 of my 5 devices
and I do not see these two devices in the IDE or in the zeroConfServiceBrowser tool.
The other 3 devices, compiled on pre-3.0, are still visible.

@d-a-v
Collaborator

d-a-v commented Feb 21, 2019

sdk-pre-3 is still available in the Tools menu when using the Generic ESP8266 board.

@VinceW31

I've just had the same problem as the OP. I'm pretty sure it's #include <ESP8266WiFiMulti.h> that's the problem.
In my case I'd just upgraded my sketch to include WiFiMulti functionality (to choose between 2 sets of WiFi credentials, whichever was available) and uploaded it via OTA when all the problems started: network ports going missing in the Arduino IDE and intermittent connections to all my ESP devices.
Rebooting the router got them back temporarily, and I managed to revert all my devices to the old sketch without WiFiMulti functionality. All is back to normal now. The original post above doesn't need WiFiMulti functionality anyway because it's only got one set of WiFi credentials to use.

@reaper7
Contributor

reaper7 commented Jul 4, 2019

For a long time OTA worked on my devices(espurna project) properly if I compiled on sdk pre-3.
All the time all devices were available in ide,
but in the last week, all of them have disappeared after cyclic update
and they only appeared for some time after their reset.

I decided to go back in this repository before commit f9009b8
and re-compiled projects for two of my six devices.
These two devices do not disappear.

I think the disappearing OTA port may be caused by something in mDNS.

@d-a-v
Collaborator

d-a-v commented Jul 4, 2019

@reaper7
Can you try and replace a.ifUP() by true and check if it solves your issue ?

@reaper7
Contributor

reaper7 commented Jul 4, 2019

I will check, of course...
but we will have to wait for the observation results :)

@reaper7
Contributor

reaper7 commented Jul 6, 2019

so, this is not this line, something else must cause this problem,
but I'm convinced that it's mDNS

I introduced your suggestion on Thursday (yesterday I did not have the opportunity to check the effects)
but this morning none of the devices were available in the ide...
...so I went back to commit 961b558, I've compiled images for all devices again,
and until now everything is visible.
I will send the next report tomorrow.

@reaper7
Contributor

reaper7 commented Jul 8, 2019

all devices are still visible for OTA if the project is compiled before making mDNS changes (f9009b8)

@hreintke
Contributor

hreintke commented Jul 8, 2019

@reaper7
I am trying to reproduce this issue in my environment but have not succeeded yet.
Can you describe your steps to reproduce?

@d-a-v
Collaborator

d-a-v commented Jul 8, 2019

Please check this comment,

@hreintke it seems it doesn't work on android. We have at least two PRs about that not merged because they do not follow standards. It appears that android has an issue with 192.168..

@reaper7
Contributor

reaper7 commented Jul 9, 2019

@d-a-v - it looks like the udp fix has solved the problem

@devyte
Collaborator

devyte commented Feb 2, 2020

Several comments say this is resolved, and a whole lot of critical fixes have been merged since this was opened. Closing.
If the problem is still valid, please open a new issue and fill out the required info, including a MCVE.

devyte closed this as completed Feb 2, 2020

10 participants