Astraweb/US Sunday PM (San Jose Time) Wobbliness Back (No Vengeance... Yet)



Beck38
08-19-2013, 02:53 AM
For close to a half-dozen years, Astraweb/US was haunted by a Sunday PM (usually around 5-11pm San Jose/Server time) wobbliness that, at its height (7-10pm or so), virtually shut down the 'service' both incoming (posting) and outgoing (downloading).

Then, out of the blue, it 'disappeared' about a year or so ago. Never saw a hint of it. Then, in the last 2-3 weeks, it all of a sudden started its 'routine' back up again. Not anywhere near the level of years past, but more than a hint of slowdowns, connection 'resets', and tons of 'header checks' and the like. Again, not to the level it had for years, but for anyone who actually 'watched' their transfers, it was there.

Hopefully it won't continue to rise to the level it was at in years past. It's not quite bad enough yet to actually 'avoid' that block of time every Sunday evening (again, San Jose/Pacific Time), but after being gone so long, one would hope it's not going to 'ramp up' to where it was in the past.

UPDATE: okay, maybe some vengeance; the last couple of weeks it was hinting at but never quite getting to a complete shutdown, but tonight (9/18/13 @ 2030hrs Pacific Time) it has pretty much done so, going on about 15+ minutes at this point. Complete and utter shutdown.

UPDATE (after 'event'): The 'heart' of the event was right around 2hrs in length; it appeared the entire 'front end' of the Astra/US plant was hosed. As far as I could tell, the entire path to that front end (virtually all Level3 transmission links) was just fine; during the complete 'event', Giganews as well as others were up and running fine.

Hopefully, this is not going to be a total repeat of the years past but whatever they were doing in those years (that they stopped doing for about a year) is back.

Beck38
08-20-2013, 07:45 PM
A couple of days later, I can see where the throughput (both upload and download) is occasionally 'wobbly', in that it slows down in both directions every couple of hours (or thereabouts) to about 1/3rd of its speed (at least the maximum my connection has) but does 'rebound' to 'full speed' for a few (x) hours.

Well, they may be trying to 'tweak' something, who knows. If I had 'Google Fiber' or some more 'reasonable' connection I might be able to tell more, but I don't. The occasional 'stalls' (complete slowdowns to zero in both directions, happening for 10-20 seconds every couple of hours) still seem to be there, as they've never completely gone away over the past x years I've been on Astraweb. It's a pretty small price to pay for the low price, and 99% of the time it works just fine, I guess; I still have my Giganews account and a few other block accounts, but unless things really go off the tracks I don't pay them much mind.

Again, I'm sure other folks don't see much of what I do; it's the transmission engineer part of me that comes out when I see those 'wobbles'!
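
For anyone who wants to put numbers on those wobbles, here is a minimal sketch of how logged transfer rates could be sorted into 'full speed', 'slow' and 'stall'. Everything in it is an assumption for illustration: the function name, the 50% cut-off, the 30 Mb/s connection and the sample rates are made up, not measurements from Astraweb.

```python
from typing import List


def classify_samples(rates_mbps: List[float], full_speed_mbps: float,
                     slow_fraction: float = 0.5) -> List[str]:
    """Label each throughput sample as 'stall', 'slow' or 'full'.

    slow_fraction is an assumed cut-off: anything below that share of
    full speed counts as 'slow' (a drop to about 1/3 speed qualifies).
    """
    if full_speed_mbps <= 0:
        raise ValueError("full_speed_mbps must be positive")
    labels = []
    for rate in rates_mbps:
        if rate <= 0:
            labels.append("stall")   # transfer dropped to zero
        elif rate < slow_fraction * full_speed_mbps:
            labels.append("slow")    # running, but well under full speed
        else:
            labels.append("full")
    return labels


# Made-up samples (Mb/s) on a hypothetical 30 Mb/s connection.
samples = [30.0, 29.5, 10.0, 0.0, 30.0]
labels = classify_samples(samples, full_speed_mbps=30.0)
print(labels)
```

Counting how many consecutive 'stall' or 'slow' labels show up, and at what time of day, would be the next step in spotting a recurring pattern.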

piercerseth
08-28-2013, 04:09 AM
Beck, you've said before you're on a vpn iirc, hence the level3 transit? Have you scrutinized the connect without it? Running direct over comca$t's backbone and I haven't seen any kind of appreciable issue with them. At least downstream. Even on Sunday aka "holy shit everybody grab these tv shows at once." Granted I'm only a few hops from the equinix datacenter, but so are you.

Beck38
08-28-2013, 02:51 PM
No; I actually engineered/re-engineered most if not all of the level3 transmission systems on the west coast some 15 years ago, although obviously they've been upgraded probably at least 2-3 times since, and I have the ability built into my system to check things out both through the vpn and without at the same time. This last weekend, after a lot of obvious things that were done a week ago during the weekdays, there were no slowdowns or wackiness this last Sunday (the 25th), maybe they saw what was beginning to happen again and 'fixed' it last week.

Comcast changed up its Pacific Northwest backbone routing significantly about 2-3 years ago, streamlining things quite a bit, and hasn't changed it since. They had a lot of traffic going through Portland, OR, apparently from the time Verizon FIOS was built in the Beaverton/Washington County area. After Verizon sold off those properties (and the FIOS area in Snohomish County, WA, both ex-GTE/Northwest), it took a couple of years and then they changed up their routing scheme to go more directly to the SF/California hub.

So for now (at least this last Sunday) it seems to have gone away again; we'll see if it stays away this coming Sunday. If so, they've probably 'fixed' it 'permanently' once again.

DngrMs
09-04-2013, 07:21 PM
Do you suppose this type of observed behaviour affects peering as well or just endpoint connections?

Beck38
09-05-2013, 01:08 AM
Be specific, what kind of peering are you talking about? If you're talking base traffic from one carrier to another (say, Comcast to Level3, or AT&T to Level3), then it affects most everything. Most.

If you're talking about 'special' traffic (that for which the carrier is getting paid 'extra' to carry on 'dedicated' tunnels and such, say Netflix), then probably not. Probably. This type of traffic is relatively new; there was very little of it 'back in the day' when I was working for a living. Only transoceanic submarine cables had dedicated 'waves' (upwards of 1-10Gb/s each), but now there are entire cable systems (submarine and transcontinental) where the entire cable is nothing BUT multiple waves (light frequencies) dedicated to a specific customer.

Those customers are on completely separate systems next to the 'bulk' traffic that us peons ride on.

DngrMs
09-05-2013, 06:49 PM
Astraweb's peering with other newsgroup providers.

Beck38
09-05-2013, 07:45 PM
Very to extremely doubtful that there are any such 'dedicated' linkages between them in that regard. The basic reason is that the traffic (although very large by any measure) is simply not 'time or accuracy dependent', meaning that if one posts to Astraweb/US, it's not a case of 'absolutely positively has to be at Giganews in x microseconds'. Or even at Astraweb/EU in any such hurry. A few hours in either case would be 'just fine', even though the two plants (Astra/US and Giganews/US) are only a couple hundred miles apart.

The type of wave peering I described, like that used to connect major financial markets (like US Wall St. and, say, London), is very expensive and requires multi-year (actually multi-decade) contracts to set up and maintain. The reliance on computer-generated trading systems, and the micro-second trading that they feed on, has, in the case of north Atlantic traffic, actually spurred the construction of entire specialized cables to carry this traffic.

Usenet is nowhere near having the need, or the funds, to jump on that bandwagon. I monitor lag time between most of the major usenet servers, even on an hourly basis at times, and on a fair percentage of days pretty much continually. 'On a Good Day' things run pretty well: a message posted on x will find its way to every other server within a few minutes' time, or at least in an hour or two. Sometimes even those physically 'close' (like Astra/US and Giganews/US, or those populating the 'usenet corridor' in eastern Virginia on the DC Beltway, say Blocknews and Usenet-News) lose a bit of 'cohesion', where it may take a day or two to 'catch up' from one to another 100%.
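
To make the 'lag time' idea concrete, here is a rough sketch of the arithmetic only. It assumes you already know when an article was posted and when each server first had it (for example by polling each server for the message-ID with an NNTP STAT or HEAD command and noting the first success). The function name, server labels and timestamps are all hypothetical, and this is not a description of the tooling actually used for the monitoring mentioned above.

```python
from datetime import datetime, timezone
from typing import Dict


def propagation_lag_minutes(posted_at: datetime,
                            first_seen: Dict[str, datetime]) -> Dict[str, float]:
    """Minutes between posting and first appearance, per server."""
    return {server: (seen - posted_at).total_seconds() / 60.0
            for server, seen in first_seen.items()}


# Made-up timestamps for illustration; not real observations.
posted = datetime(2013, 9, 5, 12, 30, tzinfo=timezone.utc)
seen = {
    "server-a": datetime(2013, 9, 5, 12, 33, tzinfo=timezone.utc),
    "server-b": datetime(2013, 9, 5, 14, 30, tzinfo=timezone.utc),
}

lags = propagation_lag_minutes(posted, seen)
slowest = max(lags, key=lags.get)  # server with the largest lag
print(lags, slowest)
```

Keeping all timestamps in UTC avoids the Pacific/Eastern/server-time juggling when comparing plants in different places.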

The electronic bucket brigade that is Usenet wasn't ever designed to be an 'instant' lock between plants. Folks who get bent that it may take a day or two for all the 'parts' of a huge binary posting to propagate from A to B need to step back a bit. Those plants already have a pretty large bill for their traffic; they don't need to add to it by getting any kind of 'Cadillac' service (or is it 'Lexus' these days?).

Of course, some of this is fueled by the thought that a takedown might be issued for that WWE fight that started posting an hour ago. Yep, western civilization is crumbling because some (idiot) posted something so plainly obvious that some other (idiot) just absolutely NEEDS to have. Usenet is thrown into a tailspin. Yawn.

UPDATE: Just as a kind of 'funny', Giganews (both US and EU plants) are currently several hours 'behind' on a rather large amount of parts posted through Astraweb; both Astra plants (US and EU) are 100% (again, there may be some s/w 'linkage' making that so), but starting at around 0530 Pacific time this morning Giganews started 'skipping' parts. I've seen this lots before. It may take them 1-2 days to 'catch up'. Just fer fun, I'm checking Blocknews (it will take an hour or so) but I'll bet they have no problem. So much for the 'high priced' spread.

UPDATE2: A bit more information; for whatever reason, both Usenet-News and Giganews have changed their peering (I've noticed this before when things kinda went 'off the tracks') and they're taking their major peering/feeds from some Russian outfit named '!goblin.stu.neva.ru', and both have the 'skips' starting right at the same time (0540 Pacific, or 0840 Eastern). Blocknews apparently has started 'stripping off' their paths for some reason, so folks using them can't tell what the path was (or is). This may be an anti-DMCA measure, who knows.

So it appears that the peering change is causing the 'skips'. Why the change? And why several server plants changed (all at the same time?)???!!??

It's a Russian plot. The last time this happened a few months back, the plants involved changed it back (off the Russians) in a couple days.
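
For anyone who wants to check the feeds themselves: the Path header is a list of hosts separated by '!', with the most recent relay on the left, so the second entry is normally the peer that handed the article to the server you are reading from. Below is a minimal sketch of tallying that entry over a batch of headers. The helper names and the sample paths are made up for illustration (only goblin.stu.neva.ru comes from the observation above), and it will tell you nothing on a server that strips its paths.

```python
from collections import Counter
from typing import List, Optional


def path_hops(path_header: str) -> List[str]:
    """Split a Path header into hosts, newest relay first."""
    value = path_header.strip()
    if value.lower().startswith("path:"):
        value = value[len("path:"):]
    # Empty pieces (e.g. from '!!') are dropped.
    return [hop.strip() for hop in value.split("!") if hop.strip()]


def upstream_feed(path_header: str) -> Optional[str]:
    """Second entry in the path: the peer that fed the reading server."""
    hops = path_hops(path_header)
    return hops[1] if len(hops) > 1 else None


# Hypothetical headers; the .example hosts are placeholders.
sample_paths = [
    "Path: reader.example!goblin.stu.neva.ru!poster.example!not-for-mail",
    "Path: reader.example!goblin.stu.neva.ru!other.example!not-for-mail",
    "Path: reader.example!peer.example!poster.example!not-for-mail",
]

feed_counts = Counter(upstream_feed(p) for p in sample_paths)
print(feed_counts.most_common())
```

Running the same tally on headers from before and after the 'skips' start would show whether the dominant feed really changed at that moment.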

DngrMs
09-08-2013, 10:47 AM
^ thanks for the explanation.

Beck38
09-08-2013, 02:55 PM
In the thread I started on this, Giganews et al. started the 'skipping' for about 12+ hours the other day. Blocknews caught things up pretty quickly (about 12 hours later it had backfilled 100% as far as I could tell) but Giganews was lagging far behind in doing so.

These Euro folks posting all this 'mini-parts' silliness (large multi-gig postings, but the individual parts are around 250KB each; do the math and it's literally tens of thousands of message IDs) are gumming up the works in tracking down these kinds of 'off the tracks' events. It takes literally hours of d/l'ing headers to try and see what's going on. They already have the postings encrypted and the only way to get the code is by private websites, so why the wacky part size is beyond me.

I may be able to 'look back' into Giganews sometime tomorrow to see if the 'fade' from Sept 5 has been backfilled. Interestingly, this time (maybe because the 'event' was fairly short in duration) there hasn't been much if any screaming on any of the major newsgroups. Or maybe folks have learned that even with GN they need a secondary source (like Blocknews) if they are going to get hyper-sensitive about getting stuff quickly.
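
To put rough numbers on the 'do the math' remark, a quick sketch. The 250KB part size is from the observation above; the 4 GB posting size (the description only says 'multi-gig'), the 750KB comparison size and the function name are assumptions for illustration. Decimal units are used and encoding overhead is ignored.

```python
import math


def article_count(post_bytes: int, part_bytes: int) -> int:
    """Number of articles (one message-ID each) needed for a posting."""
    if part_bytes <= 0:
        raise ValueError("part_bytes must be positive")
    return math.ceil(post_bytes / part_bytes)


POST_BYTES = 4 * 1000**3      # assumption: a 4 GB posting
MINI_PART = 250 * 1000        # ~250KB parts, as described
LARGER_PART = 750 * 1000      # hypothetical larger part size, for comparison

mini = article_count(POST_BYTES, MINI_PART)
larger = article_count(POST_BYTES, LARGER_PART)
print(mini, larger)
```

Under those assumptions the mini-parts posting needs three times as many headers as the larger part size, which is where the hours of header downloading come from.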

Beck38
09-09-2013, 01:51 AM
Well, the Sunday evening Astraweb wobble is back tonight, maybe not as bad as it has been in the past, but still there.

Once again, will wait and see what time it goes away, and if it comes back again next Sunday.

UPDATE: Actually, around 2000hrs (8pm) Pacific time, it basically came to a full halt. It started back up again around 2115, we'll see how it goes until morning.

piercerseth
09-09-2013, 02:18 AM
BrBa, Newsroom, Boardwalk, et al here we go :)

Vestibule
09-09-2013, 02:54 AM
Boardwalk Empire!!!!! Yes! I didn't know it started again tonight.... here we go indeed!

piercerseth
09-11-2013, 01:55 AM
Hey Beck38, what's your crystal ball say about this afternoon/evening? Astraweb has been all out of whack it would seem.

EDIT: Slow headers, seems to have remedied itself.

Beck38
09-11-2013, 05:56 AM
Astra/US has been running about as straight and normal as it almost always has, and has been since about 2300 Sunday (Pacific time). If you're talking about Astra/EU, no data there. It's on the other side of the planet from me.

Beck38
09-16-2013, 06:31 PM
Just a bit of fyi, Astra/US ran about as perfectly normal as ever through this last weekend and Sunday PM, no blips of anything; so whatever it was may (once again) be cured or .... Will of course notice a week from now, as my 24/7 'schedule' continues unabated.

Beck38
09-30-2013, 03:28 AM
Well, this Sunday PM (San Jose Time) it (Astra/US) fell flat right about 7pm, and as of the time stamp on this message (8:30pm) it's still dead.

This is on the inbound (posting) side; outbound (leeching) is running at about 1/3rd speed, figuring that nothing else in the transmission path is going wonky. To me, it looks like some server-side work (or process) is pulling a large amount of CPU cycles.