Android Issue post 3.8

Since the 3.8 update (still present in 3.8.2), a Nexus 5X running Android Oreo is virtually unusable for internet access when connected via WiFi (5 GHz). The WiFi connection is established, but the Android device regularly reports no internet connection.

I have tried with and without DNSSEC and with and without forwarding, which seems to change the severity, but not drastically.

Note that for the 3.7 series I had to add the following to get reasonable behaviour with the Nexus 5X; otherwise the device kept jumping from 5 GHz to 2.4 GHz. When 2.4 GHz was disabled, it kept disassociating completely. With disassoc_low_ack set to 0 it was fine, though.

option disassoc_low_ack 0

Other devices are also having issues since 3.8, but none as bad as this. Wired devices don’t seem to be as badly affected.

I’m running the latest nightly of LineageOS 14.1 and have no problems. Can you downgrade to Nougat?

I’m just turning off Wifi and dropping to 4G at the moment. Investigating other devices that are having issues like the Playstation 4. It’s not limited to Android.

I have the same issue on a Nexus 5X after the September security update :frowning: Since the upgrade to 3.8 on the Omnia happened at almost the same time, I am not sure which of the two is the cause. Other devices don’t have this issue, although they see some drops too. So I am waiting for the October update for the Nexus and the next update for the Omnia, fingers crossed that one of them solves my problem.

The November fix does not solve this, but I found out that if you reboot the phone it stays stable for a couple of days.

Neither my wife nor I are experiencing this with our Google Pixels on Android 8.0.0 with the October 5th patch. 3.8.2 on the Omnia.

Turris Omnia 3.8.2 and Nexus 5X with stock 8.0.0 here. I confirm that the WiFi connection becomes unusable after a while, despite trying various combinations of DNSSEC and forwarding.
I am experimenting with a specific DNS assignment as a workaround for the Nexus, as described in https://forum.test.turris.cz/t/yamaha-receiver-problem-with-knot-dns/1304/23. I will have to wait some more hours/days to confirm that the promising results are not only due to a recent reboot of the smartphone.

I am on Turris Omnia 3.8.2, Nexus 5x and 8.0.0 with no issues here. Happy to review configs to see if there is something different in my settings if anyone has a starting point for me to look at

A few days down the road, I can confirm that the workaround of assigning a more traditional DNS resolver than the Turris Omnia’s Knot keeps WiFi stable on my Android 8.0.0 device. I noticed that an iPad on iOS 10.3.3 is also exhibiting erratic external DNS resolution behavior. That device is much more usable than the Android was, however.

I have dnsmasq handling the resolution of the local lan hostnames on port 5454, typically nas.home.mydomain.net

root@turris:~# cat /etc/kresd/custom.conf
local lan_rule = policy.add(policy.suffix(policy.STUB('127.0.0.1@5454'), policy.todnames({'lan', 'home.mydomain.net'})))
policy.del(lan_rule.id)
table.insert(policy.rules, 1, lan_rule)

Other than that, in foris/config/dns my DNS settings are pretty dull:

  • Forwarding: activated
  • Disable DNSSEC: deactivated
  • Enable DHCP clients in DNS: activated

I tried combinations of the above without success. Note that the Turris Omnia has IPv6 connectivity through 6rd.

Without being 100% sure of it, I observed that when my iOS device fails to get a DNS resolution for a domain on the internet, local resolution still works.

Any hint or observation is appreciated.

For almost all people, DNS issues starting around 3.8 were worked around by disabling either forwarding or DNSSEC validation. The exceptions are various smart (internet-of-things) devices, and those should be worked around soon as well.

Can you see if there are some particular domains that fail? I don’t expect it to matter at all which device in the network did the DNS query. (I can’t give too much time to this, during this month at least, but I’ll try a bit.)
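One way to look for particular failing domains is to resolve a fixed list repeatedly from an affected client and count the failures per name. The following is only a minimal sketch under assumptions: `probe_domains` and its parameters are hypothetical names made up for illustration, and it goes through the client’s system resolver via `socket.getaddrinfo`, so a local cache on the client may hide some failures.

```python
import socket
import time


def probe_domains(domains, resolve=socket.getaddrinfo, rounds=3, pause=0.0):
    """Resolve every domain `rounds` times and count failures per domain.

    `resolve` is injectable so the function can be exercised without a
    network; by default it uses the system resolver.
    """
    failures = {domain: 0 for domain in domains}
    for round_no in range(rounds):
        for domain in domains:
            try:
                resolve(domain, 80)
            except OSError:
                # socket.gaierror (resolution failure) is a subclass of OSError
                failures[domain] += 1
        if pause and round_no < rounds - 1:
            time.sleep(pause)  # spread the rounds out over time
    return failures
```

For example, `probe_domains(["nas.home.mydomain.net", "example.com"], rounds=20, pause=30.0)` would run for roughly ten minutes and return a dictionary of failure counts; a name that fails far more often than the others is a candidate for a closer look in the resolver log.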

I think I observed an occurrence where one tab was failing to load on the iPad while another one was loading properly, but not knowing how the iOS resolver works, i.e. whether there is a cache, I cannot ascertain what happened.

In another occurrence, I was able to capture Knot’s log when the resolution was stalled.

https://pastebin.com/E0LLzNTH

There is a first error on line 105, and subsequent ones from line 125 on. Unfortunately, I have not yet found a way to reproduce the issue systematically, or at least reliably.

Thank you

Oh, log. Interesting. First of all www.mathieu.agopian.info is a non-existent domain (but its parent exists and serves HTTP).

Besides that, there are some strange issues. You configured forwarding to two servers; the log shows a period where they repeatedly fail to answer, and when they do answer, they miss some mandatory DNSSEC records. (I’m quite confident about this interpretation, but I can’t e.g. access the servers…)

In your place I would either disable forwarding or at least add some other provider so knot-resolver can choose in such cases. Our association runs public DNS servers, but there are many others as well. In case similar issues happened without forwarding, the logs would be more interesting to me.

I was able to decorrelate, at least for now, the Android issue from the DNS one (thanks @vcunat for the hints on forwarding, I’ll confirm over the next days whether the DNS issues are completely gone).

Regarding Android, I noticed a bug in /etc/resolver/dhcp_host_domain_ng.py but this is not the root cause, or perhaps even not related… Still digging.

As I could not find it on Github, find below a proposed patch:

230c230
<             log("DHCP delete hostname [%s,%s]" % (hostname, ipv4), LOG_INFO)
---
>             log("DHCP delete hostname [%s,%s]" % (op, hostname, ipv4), LOG_INFO)
233c233
<             log("DHCP remove old hostname [%s,%s]" % (hostname, ipv4), LOG_INFO)
---
>             log("DHCP remove old hostname [%s,%s]" % (op, hostname, ipv4), LOG_INFO)
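The two variants in the diff differ only in how many values are passed to a format string with two `%s` placeholders. A small sketch of why that matters: with the `%` operator, three values for two placeholders raise a `TypeError` instead of producing a log line. The helper `format_log` and the sample values are hypothetical and not taken from the script.

```python
def format_log(template, args):
    """Return the formatted message, or the TypeError text if args don't fit."""
    try:
        return template % args
    except TypeError as exc:
        return "TypeError: %s" % exc


TEMPLATE = "DHCP delete hostname [%s,%s]"

# Placeholder values, for illustration only.
two_args = format_log(TEMPLATE, ("nexus5x", "192.168.1.50"))
three_args = format_log(TEMPLATE, ("del", "nexus5x", "192.168.1.50"))

print(two_args)    # formats normally
print(three_args)  # reports the TypeError instead of a log line
```

So whichever side of the diff carries the extra `op` argument is the one that fails at runtime, unless a third `%s` is added to the format string.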

That is fixed already, but not in a release yet.

The issue disappears on my Nexus 5X with the Android 8.1 beta…

I upgraded to 8.1, but I am still experiencing the issue.
I noticed that my phone was only getting IPv6.
I logged into Omnia and the IPv4 lease was still active to my phone.
I found this old reference to the problem:

Tickets were opened in Openwrt.
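The observation above (the phone holding only IPv6 while its IPv4 lease is still active on the router) can be checked by reading the DHCP lease file on the router. This is a rough sketch under assumptions: it assumes the leases are written by dnsmasq in its usual one-line-per-lease format (expiry timestamp, MAC, IPv4 address, hostname, client ID), commonly found at /tmp/dhcp.leases on OpenWrt-based systems. The names `parse_leases` and `active_lease_for` are made up for illustration.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Lease(NamedTuple):
    expires: int   # Unix timestamp of lease expiry
    mac: str
    ip: str
    hostname: str


def parse_leases(text: str) -> List[Lease]:
    """Parse IPv4 lease lines, skipping anything that doesn't fit the format."""
    leases = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        try:
            expires = int(parts[0])
            ipaddress.IPv4Address(parts[2])  # raises ValueError if not IPv4
        except ValueError:
            continue
        leases.append(Lease(expires, parts[1].lower(), parts[2], parts[3]))
    return leases


def active_lease_for(text: str, mac: str, now: int) -> Optional[Lease]:
    """Return the unexpired IPv4 lease for `mac`, if there is one."""
    for lease in parse_leases(text):
        # Assumption: an expiry of 0 marks a lease that never expires.
        if lease.mac == mac.lower() and (lease.expires == 0 or lease.expires > now):
            return lease
    return None
```

On the router this could be called as `active_lease_for(open("/tmp/dhcp.leases").read(), "<phone MAC>", int(time.time()))` after `import time`; a returned lease while the phone shows no IPv4 address would match the behaviour described.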

The upstream OpenWrt ticket indeed sounds like my experience, possibly combined with various DNS issues we’ve had with the Turris software.

https://dev.openwrt.org/ticket/20854

Further upstream:

https://bugzilla.kernel.org/show_bug.cgi?id=188201

And more reports here:

I have an ASUS Zenbook and a Chromebook Flip also connected to the Turris via 5 GHz only (I disabled 2.4 GHz to test this) and have only had issues relating to DNS. When those are worked around (disable DNSSEC, use forwarding), these clients don’t seem to have issues (well, maybe questionable range).

I think the workaround here is to give up trying to get the Nexus 5X to talk on 5 GHz until I can find out from upstream if there was ever a resolution to this issue.

I’m not sure if all these problems are the same bug or not.
When my Nexus 5X doesn’t connect, it won’t connect to either band.
Today my Nexus didn’t connect again. I rebooted the Omnia and it still didn’t connect. I rebooted the phone and it connected.
Maybe it is a client-side issue.