ESP is not Responsive in AP_STA mode while STA is still connecting with OTA Enabled #5915

Closed
Lan-Hekary opened this issue Mar 25, 2019 · 24 comments
Labels: component: MDNS, type: bug, waiting for feedback

Comments

@Lan-Hekary
Contributor

Lan-Hekary commented Mar 25, 2019

Basic Infos

  • [X] This issue complies with the issue POLICY doc.
  • [X] I have read the documentation at readthedocs and the issue is not addressed there.
  • [X] I have tested that the issue is present in current master branch (aka latest git).
  • [X] I have searched the issue tracker for a similar issue.
  • [X] If there is a stack dump, I have decoded it.
  • [X] I have filled out all fields below.

Platform

  • Hardware: [ESP-12E(NodeMCU)]
  • Core Version: [7a2e935]
  • Development Env: [Platformio]
  • Operating System: [Windows]

Settings in IDE

  • Module: [Nodemcu]
  • Flash Mode: [qio]
  • Flash Size: [4MB]
  • lwip Variant: [v2 Lower Memory]
  • Reset Method: [nodemcu]
  • Flash Frequency: [80MHz]
  • CPU Frequency: [160MHz]
  • Upload Using: [SERIAL]
  • Upload Speed: [921600] (serial upload only)

Problem Description

The ESP in AP_STA mode does not respond to pings or web server requests while the STA interface is not connected (AP not found, or a temporary disconnect from the router).

This happens only while OTA is enabled. It works normally if I turn OTA off.

I think it's related to the mDNS module, and I think it's related to issue #5866 as well.
The problem appeared after the last repo update I made.

MCVE Sketch

#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <ESP8266mDNS.h>
#include <WiFiUdp.h>
#include <ArduinoOTA.h>
#include <ESP8266WebServer.h>

#ifndef STASSID
#define STASSID "STA"
#define STAPSK "12345678"
#endif

ESP8266WebServer server(80);

IPAddress apIP(192, 168, 4, 1);
const char* ssid = "AP";
const char* password = "12345678";

void handleRoot() {
  server.send(200, "text/html", "<h1>You are connected</h1>");
}

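void setup() {
  Serial.begin(115200);
  Serial.println("Booting");

  // Run the softAP and the station interface at the same time.
  WiFi.mode(WIFI_AP_STA);
  WiFi.softAPConfig(apIP, apIP, IPAddress(255, 255, 255, 0));
  WiFi.softAP(ssid, password);

  // Start connecting to the router; setup() does not wait for the result.
  WiFi.begin(STASSID, STAPSK);

  // The problem only shows up when OTA is enabled.
  ArduinoOTA.begin();

  server.on("/", handleRoot);
  server.begin();

  Serial.println("Ready");
  Serial.print("IP address: ");
  // STA address; it may not be assigned yet at this point.
  Serial.println(WiFi.localIP());
  Serial.println("HTTP server started");
}

void loop() {
  ArduinoOTA.handle();
  server.handleClient();
}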
@devyte
Collaborator

devyte commented Mar 25, 2019

Using a different device, e.g. your phone, which channel do you see the esp's AP on? Which channel was the router on, the one the esp STA tries to connect to?

@Lan-Hekary
Contributor Author

I tried it on 2 different NodeMCUs, from a laptop and an Android smartphone.
I first noticed it while my device was searching for a station that does not exist (I have a WiFi Manager portal to change the SSID settings).
After that I tried it with a valid SSID configuration, but it only started responding after the station got connected.

I tracked the issue to PR #5894, and reverting the file fixed it.

@devyte
Collaborator

devyte commented Mar 25, 2019

You did not answer my question.

Ref #5894, how are you pinging?

@devyte
Collaborator

devyte commented Mar 25, 2019

@Lan-Hekary said:

I first noticed it while my device was searching for a station that does not exist

The ESP has only one radio shared between AP and STA interfaces. If the station is looking for your router, it will drag the softap along on its search. If you're pinging the softap at the time, you'll see the ping fail, because the softap dropped out of the channel. Sometimes the device that was connected to the softap from which you're pinging will disconnect from it and connect elsewhere. The behavior depends on timing, so you won't always reproduce exactly the same way.

@Lan-Hekary
Contributor Author

The ESP AP is on channel 1 at first.
The router is on channel 9.
After the fix, the ESP channel changed to 9.

@Lan-Hekary
Contributor Author

I am using the ping command from CMD on Windows.

@Lan-Hekary
Contributor Author

@Lan-Hekary said:

I first noticed it while my device was searching for a station that does not exist

The ESP has only one radio shared between AP and STA interfaces. If the station is looking for your router, it will drag the softap along on its search. If you're pinging the softap at the time, you'll see the ping fail, because the softap dropped out of the channel. Sometimes the device that was connected to the softap from which you're pinging will disconnect from it and connect elsewhere. The behavior depends on timing, so you won't always reproduce exactly the same way.

Yes, you're right. But if the STA I am looking for does not exist, the channel won't change.
I also have code that works: I can ping and communicate with the ESP while it is scanning for the STA.
I only lose connectivity during the connection phase, and that is not the issue I am addressing.
I am talking about before the connection phase, in the case of trying to connect to an SSID that does not exist.

@devyte
Collaborator

devyte commented Mar 26, 2019

I am using the Ping command from CMD on windows ..

Pinging the IP or hostname.local?

if the STA I am looking for does not exist, the channel won't change

I suppose you meant "if the ssid the STA is looking for". If so, that is not true. The STA interface scans the channels looking for the ssid you specified, and connects once it finds it. During that scan it drags the softap along with it. If the ssid exists, it could find it quickly, e.g. if your router is on a low channel. Or the config could be stored on flash and connection could be done shortly after boot even before begin() (I don't see use of persistent(false) in your code) by using prior information, in which case the channel won't change. In contrast, if the (new) ssid doesn't exist, the esp will channel hop across all channels before giving up, and that won't be quick, so you'll likely notice it.

@Lan-Hekary
Contributor Author

I am pinging 192.168.4.1.
My original code uses a different approach: I disabled auto reconnect, and every 1 min I call a method that checks the WiFi status and connects if not connected.

So basically I am scanning only once every 1 min, and the problem is still there, but only after the commit I mentioned.

The problem was not there before.

I was able to connect to the AP, ping it, and browse the web server without problems. Only once every 1 min would I miss one ping packet and the web server would hang for a second, but otherwise it was fine.

Now, after the commit, all pings time out and there is no communication at all.

I want to investigate the source of the problem in this commit, #5894.
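
For readers following along, the once-a-minute reconnect check described in this comment could be sketched roughly as below. This is not the commenter's actual code: `PeriodicCheck` is a hypothetical helper, written in plain C++ so it can be tried off-device, and it assumes a `millis()`-style 32-bit millisecond counter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the ESP8266 core): reports when a
// periodic task is due, given a millis()-style 32-bit millisecond counter.
class PeriodicCheck {
public:
    explicit PeriodicCheck(uint32_t intervalMs) : intervalMs_(intervalMs) {
        assert(intervalMs > 0);
    }

    // Returns true on the first call and then at most once per interval.
    // Unsigned subtraction keeps the comparison correct when the counter
    // wraps around.
    bool due(uint32_t nowMs) {
        if (!started_ ||
            static_cast<uint32_t>(nowMs - lastMs_) >= intervalMs_) {
            started_ = true;
            lastMs_ = nowMs;
            return true;
        }
        return false;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastMs_ = 0;
    bool started_ = false;
};
```

In a sketch, one might call `WiFi.setAutoReconnect(false)` in `setup()` and then, in `loop()`, call `WiFi.begin(STASSID, STAPSK)` only when `due(millis())` returns true and `WiFi.status()` is not `WL_CONNECTED`. That keeps the scan, and the channel hopping that comes with it, limited to one attempt per interval.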

@Lan-Hekary
Contributor Author

I can upload to the ESP using OTA if I use the IP address 192.168.4.1, but not hostname.local.
This is mainly because the mDNS server does not work in AP mode while the station is not connected.
But forcing it causes a serious problem.
I am trying different approaches now.

@Lan-Hekary
Contributor Author

I have a new discovery on the issue.
I re-applied the changes from PR #5894 and did some experiments.
At first the problem was there, but once I added a 50 ms delay in the loop the problem went away.

I suspect that this modification overloads the mDNS service somehow.
The new problem is that it didn't help with the original problem (the OTA one): I can't upload with OTA through hostname.local.
But I can now upload using the IP (only if there is a 50 ms delay in the loop; otherwise I can't do anything).

@devyte
Collaborator

devyte commented Mar 26, 2019

Which core version? You didn't fill out that info in the issue template

@Lan-Hekary
Contributor Author

The latest commit:
7a2e935

@Lan-Hekary
Contributor Author

I opened Wireshark and noticed that the ESP is flooding the network with queries for itself!
It's overloading the network, and this is the source of this issue.
The 50 ms delay I put in the loop gives the AP network room to breathe and process the rest of the requests.

The flooding stops once the ESP connects to a station, but it starts again once I disconnect.
If there is no station broadcasting the SSID I want to connect to, it just keeps flooding the network with queries.
The flooding happens only in AP_STA mode with no connection, and it stops once I switch to AP-only mode.

@zacharydrew
Contributor

I wonder if it doesn't realize the query is coming from itself? There are different methods for getting the local AP IP address and the local Station IP address.
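
To make the idea concrete: on the ESP8266 core the station address comes from `WiFi.localIP()` and the softAP address from `WiFi.softAPIP()`, so a "did this packet come from me?" test would have to compare against both. The sketch below is a plain C++ illustration of that comparison only. `Ipv4`, `kUnset` and `isOwnAddress` are hypothetical names, and nothing here is taken from the mDNS responder's actual code.

```cpp
#include <array>
#include <cstdint>

// Minimal IPv4 stand-in so the example runs without the Arduino core.
using Ipv4 = std::array<uint8_t, 4>;

// 0.0.0.0, i.e. an interface that has no address yet.
constexpr Ipv4 kUnset{{0, 0, 0, 0}};

// Hypothetical check: does `source` match one of this device's own
// interface addresses? An unset source never matches, so an unconnected
// STA interface (0.0.0.0) cannot produce a false positive.
bool isOwnAddress(const Ipv4& source, const Ipv4& staIp, const Ipv4& apIp) {
    if (source == kUnset) {
        return false;
    }
    return source == staIp || source == apIp;
}
```

With the station unconnected, a packet from 192.168.4.1 is still recognized as the device's own because of the softAP address, while a client at 192.168.4.2 is not.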

@devyte
Collaborator

devyte commented Mar 26, 2019

CC @LaborEtArs

@Lan-Hekary
Contributor Author

Still no fix.

@d-a-v
Collaborator

d-a-v commented Apr 24, 2019

Is there a proposed fix for this?
If we revert #5894 then we have #5866.
@LaborEtArs @hreintke since you know the mDNS code pretty well, what would be the fix to prevent the ESP from talking to itself?

@hreintke
Contributor

@Lan-Hekary: I will try to reproduce this in my config, but can you provide me with one or more of your Wireshark logs? That may give me a first idea.

@Lan-Hekary
Contributor Author

@hreintke this was captured after updating to the latest commit, 968d6fc.
It is a Wireshark log file covering a brief moment; the ESP is flooding the interface with mDNS queries.
There is a small delay in the main loop (16 ms) to account for processing, but it's the same with no delay.

mdns.zip

image

@Lan-Hekary
Contributor Author

Still no fix, guys. Can you at least revert this commit and reopen the mDNS issue, to focus the effort on fixing the original problem (OTA update in AP mode)?

@LaborEtArs
Contributor

@Lan-Hekary Well, there is (kind of) a solution, but it is not committed yet, as it is not really a good one...
The problem is that while the AP is online, the ESP may not be connected to any AP as a station, so the STA IP address isn't set; this leads to the flooding, as the send fails and is repeated instantly. So the first step of the solution is to check the 'fromIPAddress' in 'MDNSResponder::_sendMDNSMessage_Multicast' in 'LEAmDNS_Transfer.cpp' (~line 115) and return a (faked) 'true' if it is NOT set: "if (!fromIPAddress.isSet()) return true".
Unfortunately there is yet another (currently unsolved) problem: every time the ESP tries to connect to an AP (in STA/AP_STA mode), the mDNS responder is reset and starts its probing/advertising process again...
The only workaround today is to disable STA mode while using AP mode.
The final solution would be to create separate mDNS responder instances for each used interface, but currently I haven't got time to implement this....
Another solution would be to go back to the legacy responder, if none of the new features are needed, only the AP_STA mode.
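
The failure mode described above (a send that fails because the source address is unset, retried on every loop pass) and the proposed guard can be illustrated with a toy model. This is a sketch in plain C++, not the LEAmDNS code: `FakeAddress`, `ToyResponder` and `attemptsAfter` are hypothetical names, and the only behavior borrowed from the comment is "return a faked true when the from-address is not set".

```cpp
#include <cstdint>

// Stand-in for the core's IPAddress; only "is it set?" is modeled.
struct FakeAddress {
    uint32_t raw = 0;
    bool isSet() const { return raw != 0; }
};

// Toy responder that wants to send one announcement and retries on
// every loop pass until a send reports success.
struct ToyResponder {
    bool guardUnsetSource;
    int sendAttempts = 0;
    bool announcePending = true;

    explicit ToyResponder(bool guard) : guardUnsetSource(guard) {}

    // Models a multicast send that fails while the source address is unset.
    bool sendMulticast(const FakeAddress& from) {
        if (guardUnsetSource && !from.isSet()) {
            return true;  // (faked) success: nothing is actually sent
        }
        ++sendAttempts;
        return from.isSet();
    }

    // One pass of the main loop.
    void update(const FakeAddress& from) {
        if (announcePending && sendMulticast(from)) {
            announcePending = false;
        }
    }
};

// Runs `loops` passes and returns how many sends were attempted.
int attemptsAfter(int loops, bool guard, FakeAddress from) {
    ToyResponder responder(guard);
    for (int i = 0; i < loops; ++i) {
        responder.update(from);
    }
    return responder.sendAttempts;
}
```

In this model, an unset address without the guard produces one send attempt per loop pass, which is the flooding pattern; with the guard, no attempts are made. The model also shows a likely cost of the guard: the announcement is marked as done without ever being sent, so it is not repeated once an address appears later.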

@Lan-Hekary
Contributor Author

Lan-Hekary commented Jul 5, 2019

The final solution would be to create separate mDNS responder instances for each used interface, but currently I haven't got time to implement this....
PR #6224 fixed it, but I can't get it to work in AP-only mode. Are there any solutions?
I will close this issue, but I think someone other than me should investigate this before committing to a final solution in the future.

edit from maintainer: see #5866 (comment)

d-a-v added the 4 - Done and waiting for feedback labels and removed the 4 - Done label on Jul 5, 2019
@Johnnysod

I'm a bit of a novice here, programming a Sonoff Basic.
I have the same problem and wondered about having 2 modes.
If the button is pressed the device could go to OTA mode, and to AP mode if not.
A bit messy though.

7 participants