SixXS::Sunset 2017-06-06

Request: uptime for PoPs
[nl] Shadow Hawkins on Saturday, 12 March 2011 20:14:41
[admin edit: moved to subforum + updated title to reflect content] What I'd really like to see is some uptime stats for the PoPs. I'm curious what the relative 'quality' of the PoPs is.
SixXS: What can be done better?
[ch] Jeroen Massar SixXS Staff on Tuesday, 09 September 2008 20:34:07
What kind of "uptime" statistics would you mean? I do hope that you realize that "pure uptime" doesn't mean service availability which is a much more useful metric. But there is a problem there: what exactly do you count as "service availability"? That the PoP is reachable over IPv4 and/or IPv6, and from where? And even when it is reachable, what is the throughput? Thus what are the metrics? Since the beginning of SixXS we of course have had: Latency and Traffic statistics. These can indicate a bit of this, especially as both IPv4 and IPv6 reachability between all the PoPs is measured. If a PoP is not reachable, you can see that there too.
The most important question is of course "for what purpose", which leads me to think that you are wanting 24/7 uptime and have other such requirements, which basically means: What are the requirements for the SLA you are defining? And, if you want an SLA, then you really are at the wrong place with SixXS. This is a freely provided community service; we try to do our best, and so do the various providers providing the actual PoPs, but we are not going to guarantee anything (uptime, bandwidth, response time, or any of those things), simply because we can't.
Fortunately the PoPs are of awesome quality and thanks to the ISPs providing them they can definitely claim 99.99% "uptime" (reachable + working etc). Then again, remember that even 99.99% of a year still allows about 53 minutes of downtime per year (99% would allow almost 4 days), and actually some of the PoPs haven't been "down" in the last four years (the actual kernel uptime of one of them is 974 days).
If you require IPv6 connectivity though, and have a list of requirements for an SLA, don't hesitate to forward them to us, as we can definitely bring these to the attention of a number of ISPs who can provide you those services directly. See also the FAQ for a large number of ISPs who can provide you with connectivity and are very willing to provide a service with an SLA.
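For reference, the availability percentages mentioned in this post can be turned into a yearly downtime budget with a few lines of arithmetic. The sketch below is only an illustration and assumes a 365-day year; the function names are made up for this example. With that assumption, 99% works out to roughly 3.65 days per year and 99.99% to roughly 53 minutes per year.

```python
# Yearly downtime implied by an availability percentage.
# Assumption: a 365-day year; leap years are ignored for simplicity.
SECONDS_PER_YEAR = 365 * 24 * 3600


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Return the downtime per year (in seconds) that a given availability permits."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return SECONDS_PER_YEAR * (100.0 - availability_percent) / 100.0


def describe(availability_percent: float) -> str:
    """Format the permitted downtime in the largest sensible unit."""
    seconds = allowed_downtime_seconds(availability_percent)
    if seconds >= 86400:
        return f"{availability_percent}%: {seconds / 86400:.2f} days per year"
    if seconds >= 3600:
        return f"{availability_percent}%: {seconds / 3600:.2f} hours per year"
    return f"{availability_percent}%: {seconds / 60:.2f} minutes per year"


if __name__ == "__main__":
    for level in (99.0, 99.9, 99.99):
        print(describe(level))
```

Note that this only converts a percentage into a time budget; it says nothing about how "available" is defined or from where it is measured, which is the harder question raised above.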
SixXS: What can be done better?
[gb] Shadow Hawkins on Saturday, 13 September 2008 19:49:29
It seems that the graphs are a bit out of date, as I do not see the noosl01 PoP there. Any chance that it could be added? :-)
SixXS: What can be done better?
[ch] Jeroen Massar SixXS Staff on Sunday, 14 September 2008 13:18:23
Traffic was not there indeed, should pop up soon. Latency was there already.
SixXS: What can be done better?
[gb] Shadow Hawkins on Sunday, 14 September 2008 16:12:10
Cheers!
SixXS: What can be done better?
[gb] Shadow Hawkins on Sunday, 02 November 2008 10:06:30
I still don't see it on the graphs at http://www.sixxs.net/misc/traffic/.
SixXS: What can be done better?
[nl] Shadow Hawkins on Sunday, 14 September 2008 18:37:29
Hi Jeroen, thanks for your reply.
I do hope that you realize that "pure uptime" doesn't mean service availability which is a much more useful metric. [...] Thus what are the metrics? Since the beginning of SixXS we of course have had: Latency and Traffic statistics.
Those are good questions. I think the latency / loss stats are what I am looking for. It takes some time selecting the right graphs and interpreting the values, but different people will have different considerations, so that is okay. I was looking for a single 'goodness' metric, but as you say it does not exist. Still, some prominent graphs on the PoP pages would be nice. What about this idea: graphs on every PoP page with the traffic and averaged latency and loss to all other PoPs? Another idea would be to use the ticketing system also for 'known problem' issues, which would 1) avoid all of those "pop is down", "yes we know" tickets and 2) provide a searchable history of problems.
The most important question is of course "for what purpose", which leads me to think that you are wanting 24/7 uptime and have other such requirements, which basically means: What are the requirements for the SLA you are defining?
I notice that you often bring up the subject of SLAs in tickets and forum posts dealing with PoP problems. So often in fact that honestly it begins to feel like an excuse to hide behind. I am sure that is not intentional, but it feels like you are confusing requests for transparency with availability demands. IPv6 through nlams05 has always been almost as fast and as stable as my IPv4 connection, which is why I may have been depending on it a bit too much. With the recent downtime, however, I am wondering about things like "is this kind of downtime expected?", "how are other PoPs doing?", "should I change to another PoP?", "should I reconsider publishing AAAA DNS records?" I just want to know what to expect from my IPv6 connection. That does not translate into a 24/7 uptime SLA. Regards, Joost
SixXS: What can be done better?
[ch] Jeroen Massar SixXS Staff on Sunday, 14 September 2008 19:47:54
What about this idea: graphs on every PoP page with the
traffic and averaged latency and loss to all other PoPs?
Linked from the PoP page, including them though would clutter those pages too much, especially when there are multiple PoPs from one ISP.
Another idea would be to use the ticketing system also for
'known problem' issues
These are reported (look through the archive). When the PoP is completely down, though, the big yellow box above the ticket page and the mention on the news page should be sufficient: as that implies, we really know that there is a problem and people are working on it, since they get notified directly.
I notice that you often bring up the subject of SLAs in
tickets and forum posts dealing with PoP problems. So often in fact
that honestly it begins to feel like an excuse to hide behind.
No, it is not an 'excuse' in any way, it is about people demanding certain requirements. There is nothing to demand as there is no SLA. Not between the user and SixXS, nor between SixXS and the PoPs. Everything is provided on a best effort basis. Which is why we mention that if one requires an SLA that we can help you in the right direction for getting one.
I am sure that is not intentional, but it feels like you are confusing
requests for transparency with availability demands.
What level of transparency do you want? A direct hotline for answering your questions? Not possible unfortunately, unless we hire several people who can do shifts and handle those questions 24/7. Note also that the website contains almost every tidbit of information that you possibly would want to know; that is concerning SixXS: the who, the what and the why. It is all there.
IPv6 through nlams05 has always been almost as fast and as
stable as my IPv4 connection, which is why I may have been
depending on it a bit too much.
Like a lot of people do, as generally it simply works. And that is the whole idea of SixXS: when it runs it runs, and it runs great. Unfortunately, when there is a problem, it can't be guaranteed to be resolved quickly. Do also note that we ourselves use the PoPs; as such, we really do also feel the pain when something goes wrong.
With the recent downtime, however, I am wondering about things like
"is this kind of downtime expected?",
No.
"how are other PoPs doing?"
Exactly the same. For instance nlams04 also had issues a year or so ago. These issues are gone now, all resolved. Just have a little patience.
"should I change to another PoP?",
If all users of nlams05 move to another PoP, what do you think that does to the load of the other PoP? Does that resolve the problem? Not really.
"should I reconsider publishing AAAA DNS records?"
That is something you can consider of course. Unfortunately the routes are statically assigned to most of the PoPs, thus the upstream router doesn't notice that the PoP is down (for that matter, we didn't even design the system with the expectation that PoPs would be down, e.g. it breaks the whole idea of the credit system), and thus one doesn't even get ICMP unreachables in these cases.
I just want to know what to expect from my IPv6 connection.
That it is run on a best-effort basis, it might be fixed in a few minutes, it might also be fixed next week. We sincerely target that everything runs like a charm, unfortunately computers are still computers and things break once in a while. See also: PoP Down and Ticket Tracker. And no, we can't repeat those words mentioned there all over the site, we do actually expect people to read up a little bit.
SixXS: What can be done better?
[nl] Shadow Hawkins on Monday, 15 September 2008 14:42:51
What about this idea: graphs on every PoP page with the
traffic and averaged latency and loss to all other PoPs?
Linked from the PoP page, including them though would clutter
those pages too much, especially when there are multiple PoPs
from one ISP.
Would that also be the case if it would be one graph averaged over the other PoPs? The latency and loss graphs for nlams05 to the other PoPs are all wildly different, and the traffic graphs do not show anything after May. Additionally, although uptime is not service availability, it is certainly a part of it and far more easily measured.
I notice that you often bring up the subject of SLAs in
tickets and forum posts dealing with PoP problems. So often in fact
that honestly it begins to feel like an excuse to hide behind.
No, it is not an 'excuse' in any way, it is about people demanding
certain requirements. There is nothing to demand as there is no
SLA. Not between the user and SixXS, nor between SixXS and the PoPs.
Everything is provided on a best effort basis. Which is why we
mention that if one requires an SLA that we can help you in the
right direction for getting one.
I don't know what abusive mail the SixXS operators get in their private mailboxes, but I have not seen any tickets or forum posts with a demanding tone. I guess we are less abusive when our names are placed above each post. Anyway, this thread is about making things better, so expect some constructive criticism.
I am sure that is not intentional, but it feels like you are confusing
requests for transparency with availability demands.
What level of transparency do you want? A direct hotline for
answering your questions? Not possible unfortunately, unless we
hire several people who can do shifts and handle those questions 24/7.
Not at all! The question-answer latency is so low on ticket, forum and e-mail communication with the SixXS staff that I expect a hotline would only be slower. :-) What I mean is the difference between the silence surrounding the current unavailability of nlams05 and (for example) the excellent reporting in (for example) ticket 454811.
With the recent downtime, however, I am wondering about things like
[...]
These were not meant as direct questions to the SixXS staff, but thanks for answering them. You mention the following:
"how are other PoPs doing?"
Exactly the same. For instance nlams04 also had issues a year or so ago.
I guess this is exactly the point. If you are referring to the problem mentioned in ticket 432920, then this proves my point. Great ticket history with updates! Currently, nlams05 appears to be up again (it has been flapping), so where is the history of this event? (Oh, it is down again.) If one wants to form an opinion on the SixXS 'service levels' (if you will), the performance history is covered by the graphs but the problem history is scattered. Current problems are mentioned on the front page and the PoP page, but only if the PoP is down now. For example, now that nlams05 is flapping, I sometimes see status=Up. There is no log of problems and their duration on the news pages. The ticket history contains some problems, but not all. That is why I was suggesting to use the ticket system also for known PoP problems.
I just want to know what to expect from my IPv6 connection.
That it is run on a best-effort basis, it might be fixed in a few
minutes, it might also be fixed next week. We sincerely target that
everything runs like a charm, unfortunately computers are still
computers and things break once in a while.
I don't think anyone is doubting the effort and professionalism of the SixXS staff and the PoP administrators in maintaining the service. I see that I have allowed myself to widen the scope of my initial request (for uptime stats) to a general discussion on transparency. I am sorry for that. - Joost
SixXS: What can be done better?
[ch] Jeroen Massar SixXS Staff on Wednesday, 17 September 2008 14:53:12
Would that also be the case if it would be one graph averaged
over the other PoPs?
I don't exactly understand what you mean here. Can you describe it a bit better?
The latency and loss graphs for nlams05 to the other PoPs are all wildly
different, and the traffic graphs do not show anything after May.
Some of the graphs were not collected for some period because of problems in gathering the data.
Additionally, although uptime is not service availability, it is
certainly a part of it and far more easily measured.
"This host has been up for 999999 days" which tells you that there where no security patches applied and most likely also that there was no usage of the box at all as nobody cared.
What I mean is the difference between the silence surrounding the
current unavailability of nlams05 and (for example) the excellent
reporting in (for example) ticket 454811.
In that case there was information available, in some others there is not. Simple, nothing more to it.
I guess this is exactly the point. If you are referring to the
problem mentioned in ticket 432920, then this proves my point.
Great ticket history with updates!
There were no updates in that ticket. It is the same as the current issue with nlams05: It is down. No further details given. See also the great FAQ entry: FAQ: PoP's that are marked down that says the same thing.
SixXS: What can be done better?
[nl] Shadow Hawkins on Sunday, 28 September 2008 10:50:26
Would that also be the case if it would be one graph averaged
over the other PoPs?
I don't exactly understand what you mean here. Can you describe
it a bit better?
The latency graphs are measured between two destinations. If we would have one graph averaged over multiple destinations (for example, all other PoPs or a couple of well-known destinations), it would be easier to compare connectivity of the different PoPs.
"This host has been up for 999999 days" which tells you that there
were no security patches applied and most likely also that there
was no usage of the box at all as nobody cared.
Sorry, I guess I should have said 'downtime' instead of 'uptime'. I did not mean uptime in the sense of seconds since last boot.
What I mean is the difference between the silence surrounding the
current unavailability of nlams05 and (for example) the excellent
reporting in (for example) ticket 454811.
In that case there was information available, in some others there
is not. Simple, nothing more to it.
Not quite true. There *is* information available: the PoP is down. This information (on the news page) is not retained after the problem is fixed, while a ticket is. That means that a ticket is useful for keeping a history.
I guess this is exactly the point. If you are referring to the
problem mentioned in ticket 432920, then this proves my point.
Great ticket history with updates!
There were no updates in that ticket. It is the same as the
current issue with nlams05: It is down. No further details given.
There is a log in that ticket, is what I mean. Updates about ordering hardware and such. These are the details I was referring to. What is the cause of the problem with nlams05? "No further details given" means that SixXS does not know the cause of the problem, I think (as I expect you would give out details if you had them). If a PoP owner does not tell SixXS what causes the problem, that says something about that owner.
See also the great FAQ entry: FAQ: PoP's that are marked down
that says the same thing.
Indeed, that is what it says. What I am saying is that the communication of PoP problems could be done better. SixXS fixes problems as fast as they can, surely. According to the FAQ "[...] when the PoP is marked as down, the problem is out of our hands." But this does not mean that the PoP owner cannot be asked what the cause of the problem is. This information can be relayed to the SixXS users. If the PoP owner declines to give any information, just say so. The ticket system would be the right communication channel for this.
SixXS: What can be done better?
[ch] Jeroen Massar SixXS Staff on Sunday, 28 September 2008 11:18:28
The latency graphs are measured between two destinations.
If we would have one graph averaged over multiple destinations
(for example, all other PoPs or a couple of well-known destinations),
it would be easier to compare connectivity of the different PoPs.
Which would not tell a thing, just that maybe there is one of the links to a PoP where the latency went up extremely high, which doesn't tell a thing about anything.
Sorry, I guess I should have said 'downtime' instead of 'uptime'.
I did not mean uptime in the sense of seconds since last boot.
How exactly do you propose that we measure this downtime? And no, we are not going to spread bad information about organizations who are providing a perfectly fine service for a long time. There are enough people who already do that for us. Just have patience.
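One possible answer to the "how would downtime be measured" question, purely as a sketch: derive it from periodic reachability probes, similar in spirit to the latency measurements between PoPs. Everything below is an assumption for illustration (the sample format, the 300-second probe interval, the function names); it is not how SixXS collects its statistics. A failed probe is treated as "down" until the next successful probe, which also captures flapping as several short intervals.

```python
from typing import List, Tuple

# Hypothetical sample format: (unix timestamp in seconds, probe succeeded?)
Sample = Tuple[int, bool]


def downtime_intervals(samples: List[Sample]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs for the periods in which probes failed."""
    intervals: List[Tuple[int, int]] = []
    down_since = None
    last_seen = None
    for timestamp, reachable in sorted(samples):
        if not reachable and down_since is None:
            down_since = timestamp            # an outage starts here
        elif reachable and down_since is not None:
            intervals.append((down_since, timestamp))  # outage ended
            down_since = None
        last_seen = timestamp
    # Still down at the final sample: count it up to that sample only.
    if down_since is not None and last_seen > down_since:
        intervals.append((down_since, last_seen))
    return intervals


def availability(samples: List[Sample]) -> float:
    """Fraction of the observed window during which probes succeeded."""
    times = [t for t, _ in samples]
    window = max(times) - min(times)
    if window <= 0:
        raise ValueError("need samples spanning a non-zero time window")
    down = sum(end - start for start, end in downtime_intervals(samples))
    return 1.0 - down / window


if __name__ == "__main__":
    # Invented data: one probe every 300 seconds, with two short outages.
    probes = [(0, True), (300, False), (600, False),
              (900, True), (1200, False), (1500, True)]
    print(downtime_intervals(probes))
    print(availability(probes))
```

The result depends entirely on where the probes are sent from and what counts as a success, so a number produced this way is a view from one vantage point rather than a statement about the PoP as a whole.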
SixXS: What can be done better?
[nl] Shadow Hawkins on Sunday, 28 September 2008 12:26:46
Which would not tell a thing, just that maybe there is one of
the links to a PoP where the latency went up extremely high,
which doesn't tell a thing about anything.
Would the averaging not dampen the effect of one bad link? That was the idea anyway: to use the average of a number of destinations to make the measure fairer.
And no, we are not going to spread bad information about
organizations who are providing a perfectly fine service
for a long time. There are enough people who already do that
for us.
Why would information about problems be bad information? Transparency is not bad publicity. On the contrary, being upfront about problems is a good thing. What could possibly be a good reason for not informing SixXS users?
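The averaging question from this exchange (does combining several destinations dampen one bad link, or does one bad link dominate the result?) can be illustrated with a small calculation. The latency values and destination labels below are invented for the example and are not SixXS measurements. A plain mean is pulled strongly towards a single outlier, while a median largely ignores it; which of the two is the "fairer" summary is a judgment call.

```python
from statistics import mean, median
from typing import Dict


def summarize_latency(latency_ms: Dict[str, float]) -> Dict[str, float]:
    """Summarize per-destination latencies (milliseconds) from one PoP."""
    if not latency_ms:
        raise ValueError("need at least one destination")
    values = list(latency_ms.values())
    return {
        "mean": mean(values),      # sensitive to a single bad link
        "median": median(values),  # robust against a single bad link
        "worst": max(values),      # keeps the bad link visible
    }


# Invented example: four healthy destinations and one congested link.
SAMPLE_LATENCY_MS = {
    "dest_a": 12.0,
    "dest_b": 15.0,
    "dest_c": 14.0,
    "dest_d": 13.0,
    "dest_e": 400.0,
}

if __name__ == "__main__":
    for name, value in summarize_latency(SAMPLE_LATENCY_MS).items():
        print(f"{name}: {value:.1f} ms")
```

With these made-up numbers the mean lands at 90.8 ms although four of the five destinations are below 16 ms, whereas the median stays at 14 ms. Reporting the worst value alongside the median keeps the problem link from being hidden.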

Static Sunset Edition of SixXS
©2001-2017 SixXS - IPv6 Deployment & Tunnel Broker