Thread: The END of RETENTION LIMITATIONS?

  1. #1
    Member
    Join Date
    Mar 2006
    Posts
    1,244
    For the past month or so, I've been rummaging through some old NZBs, going further and further back, particularly on x264 and Blu-ray postings some 500 days (or more) old.

    What seems to be happening is that, even with usenet adding some 5TB or so per day, retention now reaches back some 500 days (I use Astraweb; Giganews appears not to go quite as far back, maybe 350+ days or so). With the major providers able to keep adding that 5TB of disk space per day, we may be entering an era where anything posted to usenet (text OR binary) is there 'forever'.

    To 'keep up' with that 5TB+ per day costs around $300/day or thereabouts in disk additions. Simple arithmetic says a provider would need around 800+ subs just to pay for the hardware upgrades, and say double that to pay for ongoing staff to maintain things 'hands on'. I'm a bit out of date on current costs for the large-scale internet connections (say OC48 or OC192, but that's easily looked up) needed to provide the user interconnections, but...

    The upshot is that everything posted to usenet since somewhere around, say, August 2008 may well be there 'forever' (again, on the major server plants).

    All a plant may need, again, is some 3000 or so subscribers to maintain 'equilibrium', so to speak (pay all the bills, keep adding hardware to the plant, maintain interconnections).

    Just some musings, early in the morning for me, on something that has kinda become super-obvious as I sit here d/l'ing yet another posting from around 500 days back (from today) or thereabouts. I know that the text groups have been 'virtually complete' back several years for quite a while, but I'm talking about the binary groups as well.

    One of the things I've noticed is that (some, most?) of the 'major' indexing sites are finding it hard to 'keep up' with the retention of the majors (GN and Astra). Certainly Newzbin is, but with them 'taking down' stuff (listings) it's hard to tell.

    One of (the?) biggest pluses of the P2P folks has always been that there are/were so many individual folks 'on the network', that anything 'on' the network was there virtually forever (as long as someone, somewhere, retained the file).

    Usenet may be on the verge, or already past, that point as well.

    Roughly 2000TB of retention per year (at 5TB/day) is simply not that big of a deal: about $200K per year hardware-wise (at a simple $100/TB), or thereabouts.
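
    For anyone who wants to re-run the numbers, here is a rough Python sketch of the arithmetic in this post. The constants are the figures quoted above; the function names and the 30-day month are assumptions made only for illustration. Note that $300/day and $100/TB do not quite agree with each other (5TB at $100/TB is $500/day), so the sketch prints both.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Function names and the 30-day month are illustrative assumptions.

DAILY_FEED_TB = 5          # "5TB or so per day"
DISK_COST_PER_DAY = 300    # "$300/day or thereabouts"
COST_PER_TB = 100          # "simple $100/TB"
HARDWARE_SUBS = 800        # "800+ subs" to cover hardware alone


def yearly_volume_tb(daily_tb, days=365):
    """Terabytes added over one year at a constant daily feed."""
    return daily_tb * days


def yearly_disk_cost(daily_tb, cost_per_tb, days=365):
    """Dollars spent on new disk per year at a flat price per TB."""
    return daily_tb * days * cost_per_tb


def implied_monthly_fee(daily_cost, subscribers, days_per_month=30):
    """Monthly amount each subscriber must contribute to cover daily_cost."""
    return daily_cost * days_per_month / subscribers


print(yearly_volume_tb(DAILY_FEED_TB))                         # 1825 TB, i.e. "roughly 2000TB"
print(yearly_disk_cost(DAILY_FEED_TB, COST_PER_TB))            # 182500 dollars at $100/TB
print(DISK_COST_PER_DAY * 365)                                 # 109500 dollars at $300/day
print(implied_monthly_fee(DISK_COST_PER_DAY, HARDWARE_SUBS))   # 11.25 dollars per sub per month
```

    Either way the yearly hardware bill lands somewhere between roughly $110K and $180K, which is the same ballpark as the $200K figure.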

    Something to think about, remembering back to when a couple weeks to a couple months of retention was a 'big deal'.
    Last edited by Beck38; 01-29-2010 at 03:56 PM.

  2. Newsgroups   -   #2
    Rart's Avatar Hold The Line
    Join Date
    Jul 2009
    Posts
    3,826
    This was a great point you brought up.

    Whenever I checked the retention data, it always seemed like GN and Astraweb were increasing retention at exactly the same rate (1 day for every day), and it never stopped (and as such Astraweb is always just a tiny bit behind, probably due to starting later or something).

    It would appear that the growth and development of storage space (and its feasibility/cost) is outpacing the rate at which content is added to usenet.

    The wonders of modern technology.
    Last edited by Rart; 01-29-2010 at 07:26 PM.

  3. Newsgroups   -   #3
    Member
    Join Date
    Mar 2006
    Posts
    1,244
    Quote Originally Posted by Rart View Post
    It would appear that the growth and development of storage space (and its feasibility/cost) is outpacing the rate at which content is added to usenet.
    I think most folks are wondering what the next 'step' in magnetic data storage is after perpendicular drives (the last 'great leap forward'). Unless I've missed some news release somewhere, 2TB has been the 'plateau' for a couple years now, but the pricing continues to fall even on those (with or without 'green' types).

    I kinda wonder exactly how many subscribers the 'big boys' have, though. Certainly greater (MUCH greater) than the number needed in my small attempt at cost analysis.

    I think it's over. Maybe 1 Jan 2009 was the last day anything anywhere 'rolled off' usenet. Anywhere.

  4. Newsgroups   -   #4
    tesco's Avatar woowoo
    Join Date
    Aug 2003
    Location
    Canadia
    Posts
    21,669
    Quote Originally Posted by Beck38 View Post
    ...
    They store data in more than one place.
    The big guys have separate server farms in different cities, and they can't run off of one source per article. The articles (especially the newest ~7 days?) would have to be on many machines for optimization/load balancing. Not to mention backups...

    edit: Maybe backup is unneeded, if content goes missing it can be looked up in some sort of master 'index' then redownloaded from other servers (same way it gets the data in the first place).

  5. Newsgroups   -   #5
    Member
    Join Date
    Mar 2006
    Posts
    1,244
    Quote Originally Posted by tesco View Post
    They store data in more than one place.
    The big guys have separate server farms in different cities, and they can't run off of one source per article. The articles (especially the newest ~7 days?) would have to be on many machines for optimization/load balancing. Not to mention backups...
    Actually, no they don't. The trend over the last few years has been to consolidate (costing is the driver), and I know that both Astraweb and Giganews are a single plant (other than European and, with GN, Asian). I've been to where GN is (Hampton Roads, VA), and I used to live a few blocks from Astraweb (Santa Clara, CA). When GN was HQ'd in Phoenix several years ago, I was there as well. WAY back when GN started out in Austin, TX, I lived there too (small world, or I've lived all over the place; I did TONS of traveling while building the 'fiber planet' for more clients than I can remember).

    The 'machines' are multiple, to an extent, for the load balancing you mention. But short of a close nuclear weapons detonation, they are pretty solid.

  6. Newsgroups   -   #6
    iLOVENZB's Avatar FST Crew BT Rep: +1
    Join Date
    Sep 2008
    Location
    Land gurt by sea
    Posts
    8,331
    Quote Originally Posted by tesco View Post
    ...
    edit: Maybe backup is unneeded, if content goes missing it can be looked up in some sort of master 'index' then redownloaded from other servers (same way it gets the data in the first place).
    Imagine if it was a RAID backup.

    I remember reading that Usenet wasn't on just one server; if it were, it would be very easy to shut down, not to mention what would happen if a server died, etc.

    This is why Usenet is near impossible to bring down.
    "Computer games don't affect kids; I mean if Pac-Man affected us as kids, we'd all be running around in darkened rooms, munching magic pills and listening to repetitive electronic music"

  7. Newsgroups   -   #7
    tesco's Avatar woowoo
    Join Date
    Aug 2003
    Location
    Canadia
    Posts
    21,669
    Quote Originally Posted by iLOVENZB View Post
    Quote Originally Posted by tesco View Post
    ...
    edit: Maybe backup is unneeded, if content goes missing it can be looked up in some sort of master 'index' then redownloaded from other servers (same way it gets the data in the first place).
    Imagine if it was a RAID backup.

    I remember reading that Usenet wasn't on just one server; if it were, it would be very easy to shut down, not to mention what would happen if a server died, etc.

    This is why Usenet is near impossible to bring down.
    Yes I know.
    The question is whether within one host/serverfarm there are multiple copies of the same article or just a single one.

    Quote Originally Posted by Beck38 View Post
    Quote Originally Posted by tesco View Post
    They store data in more than one place.
    The big guys have separate server farms in different cities, and they can't run off of one source per article. The articles (especially the newest ~7 days?) would have to be on many machines for optimization/load balancing. Not to mention backups...
    Actually, no they don't. The trend over the last few years has been to consolidate (costing is the driver), and I know that both Astraweb and Giganews are a single plant (other than European and with GN, Asian).

    The 'machines' are multiple, to an extent, for the load balancing you mention. But short of a close nuclear weapons detonation, they are pretty solid.
    Elaborate a bit, I'm really interested in this. What's the server setup like?
    Last edited by tesco; 01-30-2010 at 02:10 AM. Reason: Automerged Doublepost

  8. Newsgroups   -   #8
    Member
    Join Date
    Mar 2006
    Posts
    1,244
    This is a pretty good picture:

    http://usenet-news.net/index1.php?url=home

    although most colo (co-location) setups are basically chain-link fencing with the rack mounts within each 'separate' area. Lots of keys and high-security locks abound.

    I've got several 'bankers boxes' full of pictures, from 'back in the day' when everything was 'film'. Just about the time things started changing to digital (2002-3) was when I retired from the 'rat race'.

    Most everyone rents/leases space in a colo facility, even back then. Even folks as big as GN or Astra only have 2-3 rows of racks. Only the VERY big corporations run enough to have their 'own' facilities, like Microsoft.

    The last big facility I was involved in designing/building was for Qwest in south Seattle, about 15,000 sq ft. Everybody worth anything in town had a big chunk of space there, including MS, Boeing, Comcast, you name it.

    But usenet suppliers? Small operations. Run the numbers: one of those 19" racks in the usenet-news pictures can hold 300+ HD's, so at 2TB each that's 600TB right there. Just replicate that a few times, and the amount becomes mind-numbing.

    Oh, a couple of added items, since a lot (uh, the whole?) of what the picture shows might be a bit unknown.

    The largest drive assembly I think I've seen ('tripped over' while skimming mr. internet) is a 100+ drive box, 19" wide by maybe 20 RU's (rack units, 1.75" each). That would be about 3' high or thereabouts, so one could fit 3 of them in a 'standard' 7-9' rack. The 600TB would be about 1/3rd of a year of the total usenet (5TB a day, 600/5 = 120 days), so it would take 3 racks to hold about a year (120x3 = 360). Six racks, two years. VERY small space needed.

    Now, back 'in the day' when I was 'gainfully employed', and drives were at best say 500GB, it would take 4 times as much space. STILL pretty small. Heck, the backup battery plant would be larger, floor space wise!
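
    The rack arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the inputs are the figures from this post (300 drives per rack, 2TB per drive, 5TB per day of feed). The last line shows the 'four times as much space' case by using a drive one quarter the size.

```python
import math

# Figures taken from the post; helper names are illustrative assumptions.
DRIVES_PER_RACK = 300   # three 100-drive boxes per rack
DRIVE_TB = 2            # 2TB drives
DAILY_FEED_TB = 5       # "5TB a day"


def rack_capacity_tb(drives, drive_tb):
    """Raw capacity of one rack, ignoring RAID or filesystem overhead."""
    return drives * drive_tb


def days_per_rack(capacity_tb, daily_tb):
    """How many days of full feed one rack can hold."""
    return capacity_tb / daily_tb


def racks_for_days(days, drives, drive_tb, daily_tb):
    """Whole racks needed to retain the given number of days."""
    per_rack = days_per_rack(rack_capacity_tb(drives, drive_tb), daily_tb)
    return math.ceil(days / per_rack)


print(rack_capacity_tb(DRIVES_PER_RACK, DRIVE_TB))                    # 600 TB per rack
print(days_per_rack(600, DAILY_FEED_TB))                              # 120.0 days per rack
print(racks_for_days(360, DRIVES_PER_RACK, DRIVE_TB, DAILY_FEED_TB))  # 3 racks for ~1 year
print(racks_for_days(720, DRIVES_PER_RACK, DRIVE_TB, DAILY_FEED_TB))  # 6 racks for ~2 years
print(racks_for_days(360, DRIVES_PER_RACK, 0.5, DAILY_FEED_TB))       # 12 racks with quarter-size drives
```

    These are raw numbers with no allowance for redundancy or spares, so a real plant would presumably need somewhat more than this.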
    Last edited by Beck38; 01-30-2010 at 04:13 AM.

  9. Newsgroups   -   #9
    Even though I'm pretty sure I've downloaded all I want (and more) in the last two years, it is nice to think that it will always be there, just in case.

    As a mere user from a non-technical background, it's very interesting to read about the infrastructure.

  10. Newsgroups   -   #10
    Member
    Join Date
    Mar 2006
    Posts
    1,244
    The last month or so, I've been going back and 'reviewing' some of the first HiDef nzb's I collected right after I had built my HTPC. Then I found that the player s/w (both commercial and PD) had various limitations, and I kinda put things on hold (which is why the 'old' nzb's), until I finally got a Popcorn Hour STB.

    Meanwhile, I've moved over to Astra, and in going back, have gotten many things upwards of 450 to almost 500 days back. I'm d/l'ing something now that's from 1 Nov 2008, and I've done other 'stuff' back to around Aug08.

    Then it kinda hits a wall, 'file not found', etc.
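
    To put a number on 'how far back', here is a small Python sketch that counts the days between a posting date and 'today'. The reference date of 30 Jan 2010 is an assumption taken from the edit stamps elsewhere in this thread, and 1 Aug 2008 is only a stand-in for 'around Aug08'; the function name is made up for illustration.

```python
from datetime import date

# Assumption: "today" is taken from the thread's edit timestamps (30 Jan 2010).
ASSUMED_TODAY = date(2010, 1, 30)


def retention_days(posted, today=ASSUMED_TODAY):
    """Number of days between a posting date and the reference date."""
    return (today - posted).days


print(retention_days(date(2008, 11, 1)))  # 455 days for the 1 Nov 2008 posting
print(retention_days(date(2008, 8, 1)))   # 547 days if "Aug08" means 1 Aug 2008
```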

    I wonder how far back GN is, though; again, these are binary groups (x264 and the like). Would be interesting to find out. But I think we've hit a 'paradigm shift', maybe even bigger than most/many have realized.
