Forum - It's not MTU or MSS, so what is it? :: SixXS

It's not MTU or MSS, so what is it?

Shadow Hawkins on Sunday, 06 December 2015 07:29:09

My hair's turning grey at a fast pace. I have a SixXS heartbeat client on the other end of a PPPoE line and even though I am convinced that I did it all right, as soon as traffic volume increases (e.g. running `find /` over SSH), the connection freezes, then a few more packets flow after a minute or so, and so on. This is typically an MTU problem, except the MTU is configured just fine. It's a classic PPPoE link, so one would assume an MTU of 1492, and indeed this is what `tracepath` confirms:

% tracepath debian.org
 1?: [LOCALHOST]                                         pmtu 1500
 1:  atom.mtvic                                            1.762ms
 1:  atom.mtvic                                            1.435ms
 2:  atom.mtvic                                            1.300ms pmtu 1492
 2:  no reply
 3:  S-SCE1-ADX.cpcak3-r2.tranzpeer.net                   34.865ms asymm  5
[...]
21:  senfter.debian.org                                  298.407ms reached
     Resume: pmtu 1492 hops 21 back 23

So, I set the tunnel MTU to 1472 at first, but then the problems started. I then tried 1280, but the problems persist. I am using the iptables `--clamp-mss-to-pmtu` hack, because I really don't trust the provider here, but this doesn't have an effect either. I've seen this problem here and there and I could never figure it out. What else could be going on? Thanks for any insights, martin
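For readers following the numbers, here is a minimal sketch of the arithmetic behind the values in this post. It assumes a proto-41 (IPv6-in-IPv4) tunnel with a 20-byte IPv4 header without options, and a 20-byte TCP header; the function names are made up for illustration.

```python
# Sketch of the MTU/MSS arithmetic for an IPv6-in-IPv4 (proto-41) tunnel.
# Assumptions: IPv4 header without options, TCP header without options
# unless tcp_options is given. Function names are illustrative only.

IPV4_HEADER = 20     # bytes added by the proto-41 encapsulation
IPV6_HEADER = 40     # fixed IPv6 header
TCP_HEADER = 20      # TCP header without options
IPV6_MIN_MTU = 1280  # smallest MTU every IPv6 link must support


def tunnel_mtu(link_mtu: int) -> int:
    """Largest IPv6 packet that fits into one IPv4 packet on the link."""
    return link_mtu - IPV4_HEADER


def tcp_mss_v6(mtu: int, tcp_options: int = 0) -> int:
    """TCP payload bytes per segment for a given IPv6 MTU."""
    return mtu - IPV6_HEADER - TCP_HEADER - tcp_options


if __name__ == "__main__":
    for mtu in (tunnel_mtu(1492), IPV6_MIN_MTU):
        print(f"tunnel MTU {mtu}: MSS {tcp_mss_v6(mtu)}, "
              f"with 12 bytes of TCP timestamps {tcp_mss_v6(mtu, 12)}")
```

Under these assumptions a 1492-byte PPPoE link gives a tunnel MTU of 1472. With the tunnel at 1280 and TCP timestamps enabled, each segment carries 1208 bytes, which is the `Len=1208` seen in the capture later in this thread.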

Jeroen Massar (SixXS Staff) on Sunday, 06 December 2015 13:57:14

> the connection freezes

Wireshark the connection and it will become a lot clearer what happens.

> This is typically an MTU problem, except the MTU is configured just fine.

You make that statement, but you do not provide any details at all.

> It's a classic PPPoE link, so one would assume an MTU of 1492,

As you state it yourself: 'assume' is the wrong answer.

> and indeed this is what `tracepath` confirms:

That 1492 only covers the first hop, and it is what you configured. Check that it is really the right value: it can be a lot of other things, though 1492 is indeed 'common' on PPPoE.

> 21: senfter.debian.org 298.407ms reached

What does that host have to do with your tunnel? Note also that I recall you previously reporting other weird connectivity issues with similar symptoms. It could just be that something on the path is causing totally unrelated problems.

> So, I set the tunnel MTU to 1472 at first, but then the problems started. I then tried 1280, but the problems persist.

Guessing the value will not solve anything. All the hops on the path have to allow it and be configured properly.

> I am using the iptables `--clamp-mss-to-pmtu` hack

That only works when both sides are configured correctly and then set the right MSS. If your MTU is misconfigured, it does not help at all, which is why it is a hack. Also, figuring out which hop is ignoring the PMTU is much more important, noting that even HTTP is moving to UDP, and UDP has no MSS. The MSS hack only works with TCP, and only if all the nodes involved are properly configured.

> What else could be going on?

Lots of things. You will need to provide a lot more detail about all the hops involved before anybody can answer that.

Shadow Hawkins on Tuesday, 08 December 2015 07:24:45

Jeroen Massar wrote:

> the connection freezes
>
> Wireshark the connection and it will become a lot clearer what happens.

You mean the underlying proto-41 link, right?

> and indeed this is what `tracepath` confirms:
>
> That 1492 only covers the first hop, and it is what you configured. Check that it is really the right value: it can be a lot of other things, though 1492 is indeed 'common' on PPPoE.

It's my understanding that tracepath prints the lowest MTU across the whole path at the bottom, where it also says 1492.

> 21: senfter.debian.org 298.407ms reached
>
> What does that host have to do with your tunnel?

Nothing, it was just the trace target.

> Note also that I recall you previously reporting other weird connectivity issues with similar symptoms. It could just be that something on the path is causing totally unrelated problems.

This is a different tunnel in a different country with different devices and a different provider. As I said, I've seen this in various places and it's giving me grey hair because I can't figure it out.

> So, I set the tunnel MTU to 1472 at first, but then the problems started. I then tried 1280, but the problems persist.
>
> Guessing the value will not solve anything. All the hops on the path have to allow it and be configured properly.

If the provider had a problem with an MTU as low as 1280, tracepath would show that, don't you think? Note that I am connected directly to the PPP peer, so if there's a problem, it's between the provider's PPP peer and nzwlg01, which I find a bit hard to believe. Not that my provider doesn't screw up (oh my), but it'd be really hard to screw up a link using a hard-coded 1280 MTU like that, no?

> Also, figuring out which hop is ignoring the PMTU is much more important, noting that even HTTP is moving to UDP, and UDP has no MSS. The MSS hack only works with TCP, and only if all the nodes involved are properly configured.

> What else could be going on?

> Lots of things. You will need to provide a lot more detail about all the hops involved before anybody can answer that.

Hm, I wish I knew what I'm looking for. :(

Jeroen Massar (SixXS Staff) on Tuesday, 08 December 2015 10:57:08

> You mean the underlying proto-41 link, right?

On the IPv4 level, yes. And without ignoring things like ICMP, as there might just be an error message coming back from something.
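A capture along those lines needs a filter that keeps both the tunnel packets and any ICMP errors. The sketch below only builds a `tcpdump` command line; the interface name `ppp0` and the output file name are assumptions, not values from this thread.

```python
# Sketch: build a tcpdump command that records the IPv4 side of the tunnel
# (IP protocol 41) together with ICMP, so error messages are not lost.
# The interface and file names used below are hypothetical.
import shlex


def capture_command(interface: str, outfile: str) -> list[str]:
    """Return an argv list for capturing proto-41 and ICMP traffic."""
    capture_filter = "ip proto 41 or icmp"  # pcap filter expression
    return ["tcpdump", "-n", "-i", interface, "-s", "0", "-w", outfile,
            capture_filter]


if __name__ == "__main__":
    # Print the command instead of running it; capturing needs privileges.
    print(shlex.join(capture_command("ppp0", "tunnel.pcap")))
```

The resulting file can then be opened in Wireshark, which decodes the encapsulated IPv6 inside the proto-41 packets.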

> It's my understanding that tracepath prints the lowest MTU across the whole path at the bottom, where it also says 1492.

But if a hop is misconfigured (e.g. 1492 instead of 1280), this fails, as tracepath relies solely on ICMP. Tracepath is a good indicator of MTU issues and shows what the MTU is estimated to be, but it is not foolproof.

> Nothing, it was just the trace target.

But it is important: your tunnel goes from your endpoint to the tunnel endpoint (e.g. a PoP); it does not go to the random host you selected. Those paths will thus be different and might have different properties.

> This is a different tunnel in a different country with different devices and a different provider.

Completely different? That is, all hardware is different? The whole path is different? What is the same? I am starting to think of hardware issues seeing the type of corrupted packets you are showing in the other message.

> If the provider had a problem with an MTU as low as 1280, tracepath would show that, don't you think?

Not necessarily. If something is misconfigured in either the inbound or the outbound direction, tracepath won't catch that. You need to look from both sides of the link. Also, there are still hosts on the Internet that, for instance, rate-limit ICMP, which horribly breaks things.
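One way to check a suspected MTU from each side is to send echo requests sized to fill it exactly. The following sketch only computes the payload sizes; it assumes an 8-byte ICMP/ICMPv6 echo header and IP headers without options or extension headers.

```python
# Sketch: ping payload sizes that produce packets of exactly a given MTU.
# Assumptions: 8-byte ICMP echo header, no IPv4 options, no IPv6
# extension headers.

ICMP_HEADER = 8
IPV4_HEADER = 20
IPV6_HEADER = 40


def ping_payload(mtu: int, ip_header: int) -> int:
    """Payload size so that IP header + ICMP header + payload == mtu."""
    return mtu - ip_header - ICMP_HEADER


if __name__ == "__main__":
    print("IPv4 over a 1492-byte link:", ping_payload(1492, IPV4_HEADER))
    print("IPv6 over a 1472-byte tunnel:", ping_payload(1472, IPV6_HEADER))
    print("IPv6 over a 1280-byte tunnel:", ping_payload(1280, IPV6_HEADER))
```

On Linux, the iputils `ping` takes such a size with `-s`, and `-M do` forbids fragmentation for IPv4. Running the test from both ends helps, since a problem may affect only one direction.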

> Note that I am connected directly to the PPP peer, so if there's a problem, it's between the provider's PPP peer and nzwlg01, which I find a bit hard to believe.

Tracepath/route? :) More details please.

> Not that my provider doesn't screw up (oh my), but it'd be really hard to screw up a link using a hard-coded 1280 MTU like that, no?

I guess you have never been a DOCSIS user, where sometimes only packets <512 bytes go through? :) Hardware and links can have soooo many problems, it is not funny. That is why networking people have jobs....

> Hm, I wish I knew what I'm looking for. :(

I am thinking of hardware issues.... could just be a weird cable somewhere or humidity affecting things.

Shadow Hawkins on Tuesday, 08 December 2015 10:41:48

Jeroen Massar wrote:

> the connection freezes
>
> Wireshark the connection and it will become a lot clearer what happens.

So I did, and boy, this looks really weird. Once the data start flowing, messages like this appear (this is inbound traffic):

18.614704 2001:470:b46d::1 -> 2001:4428:29c::1 TCP 144 22 > 44073 [PSH, ACK] Seq=1761 Ack=1188 Win=31360 Len=36 TSval=853594417 TSecr=19149217[Reassembly error, protocol TCP: Dissector writer didn't bother saying what the error was]

and then after a while tshark reports a slew of the following: hundreds of the second line, and every now and then a packet similar to the first line.

19.980644 2001:470:b46d::1 -> 2001:4428:29c::1 TCP 1316 [TCP Previous segment not captured] 22 > 44073 [ACK] Seq=35021 Ack=2368 Win=34176 Len=1208 TSval=853594758 TSecr=19149357[Reassembly error, protocol TCP: Dissector writer didn't bother saying what the error was]
19.981579 3829:ff84:ca15:887a:7cc5:2ce7:6080:0 -> 4d8:635:2001:470:b46d:: IPv6 1336 [Malformed Packet]

To me, this looks like the IPv6 encapsulated in these packets is corrupt, and it's inbound. I've uploaded a full pcap file of an outbound SSH connection running `find /` on a remote host to here: pcap. The find output stops at around 00:11, shortly after the data transfer starts. At around 03:30 it briefly resumes (I think so anyway, see the TCP window update) and then stops again. Eventually the command will terminate, but I didn't see the need to capture beyond this second freeze.

Jeroen Massar (SixXS Staff) on Tuesday, 08 December 2015 11:09:46

> on a remote host to here: pcap

Packet 85 shows that that packet arrived out of order, as the previous segment was not captured. Packet 86 is a previously unseen, but acked, packet. That is rather weird, don't you think? That means that your host did ack it, but your pcap did not see it. What system is this? :) Also, where are you doing the pcapping?

Packet 103 is indeed completely corrupt. That packet SHOULD never have been forwarded. Fun detail: that packet does NOT come from sixxsd. Check the previous packets, where the ID is always 0x4200, as that is what sixxsd sets it to; packet 103 arrives with the ID set to 0x0000. Also, the destination address is mangled by then, so sixxsd would not be able to send it anywhere.

You will also see that the whole packet is "offset", as the destination IPv6 address contains the real source address! Magically, there is a correct "60" directly after the proto-41 packet. The payload length of 16896, though, is completely broken. At 0x40 of the packet there is the source address and from 0x50 there is the destination address, still intact. The rest seems plausible for an IPv6 packet. Actually, if you check, there is a '60' at 0x38, and if you retarget Wireshark there, the rest parses perfectly as an IPv6 packet. Something thus inserted bytes 0x24 - 0x38 into the packet. That corruption MUST have happened between sixxsd (the PoP) and your side of the IPv4 endpoint, on the IPv4 layer.

Instead of testing SSH, I would suggest doing a plain-text ncat between two hosts of some large file with line numbers, where each line holds, say, 1000 times AAAAAAAA...., the next line BBBBB..., and so on. That way you will see the proper order in Wireshark and can see where the packet goes wrong.
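The "retarget Wireshark" step can also be approximated in code: scan the bytes that follow the IPv4 header for an offset at which an IPv6 header is self-consistent. This is only a sketch under simple assumptions (version nibble 6, and a payload length field that matches the remaining bytes); `demo_packet` builds synthetic data and is not taken from the capture.

```python
# Sketch: find offsets in a buffer where a plausible IPv6 header starts.
# A header is considered plausible when the version nibble is 6 and the
# payload length field equals the number of bytes after the 40-byte header.
import struct

IPV6_HEADER_LEN = 40


def plausible_ipv6_offsets(buf: bytes) -> list[int]:
    """Return all offsets at which buf could hold a consistent IPv6 packet."""
    hits = []
    for off in range(len(buf) - IPV6_HEADER_LEN + 1):
        first_word, payload_len = struct.unpack_from("!IH", buf, off)
        version = first_word >> 28
        if version == 6 and payload_len == len(buf) - off - IPV6_HEADER_LEN:
            hits.append(off)
    return hits


def demo_packet(junk_len: int = 20) -> bytes:
    """Synthetic IPv6 packet with junk_len stray bytes in front (made up)."""
    payload = b"x" * 32
    header = struct.pack("!IHBB", 6 << 28, len(payload), 6, 64)
    header += bytes(16) + bytes(16)  # zeroed source and destination address
    return bytes(junk_len) + header + payload


if __name__ == "__main__":
    print(plausible_ipv6_offsets(demo_packet(20)))
```

The default of 20 stray bytes mirrors the 0x24 to 0x38 range named above. As a side note, 16896 is 0x4200 in hexadecimal, the same value as the ID mentioned above; the thread does not establish whether that is more than a coincidence.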

Shadow Hawkins on Wednesday, 09 December 2015 04:50:12

Jeroen Massar wrote:

> That is rather weird, don't you think? That means that your host did ack it, but your pcap did not see it. What system is this? :) Also, where are you doing the pcapping?

It's a Raspberry Pi and while it's working fine otherwise, I am ready to investigate it for memory corruption. You're right, some of those packets could not have possibly come from sixxsd.

> Instead of testing SSH, I would suggest doing a plain-text ncat between two hosts of some large file with line numbers, where each line holds, say, 1000 times AAAAAAAA...., the next line BBBBB..., and so on. That way you will see the proper order in Wireshark and can see where the packet goes wrong.

I'll engage in further testing, once I return from my business trip. Thanks a lot for your help.
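For that further testing, a file of the kind suggested earlier in the thread (numbered lines, each filled with one repeated letter) could be generated roughly as follows. The exact line format, the file name and the line count are assumptions; the thread only asks for line numbers and about 1000 repeated characters per line.

```python
# Sketch: generate a plain-text test file with numbered lines, each filled
# with a single repeated letter (A, B, C, ... wrapping around after Z).
# Line format, file name and sizes are illustrative assumptions.
import string


def pattern_lines(count: int, width: int = 1000):
    """Yield count lines such as '00000000 AAAA...' and '00000001 BBBB...'."""
    letters = string.ascii_uppercase
    for n in range(count):
        yield f"{n:08d} {letters[n % len(letters)] * width}\n"


def write_pattern(path: str, count: int, width: int = 1000) -> None:
    """Write the pattern to a file that can then be sent with ncat."""
    with open(path, "w", encoding="ascii") as fh:
        fh.writelines(pattern_lines(count, width))


if __name__ == "__main__":
    write_pattern("pattern.txt", 10000)  # roughly 10 MB of text
```

Because every line carries its own number and letter, a missing, reordered or shifted chunk stands out directly in the packet bytes shown by Wireshark.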
