-
Sep 7th, 2001, 09:53 AM
#1
Thread Starter
Retired VBF Adm1nistrator
To map out the entire internet
I was just thinking that it wouldn't actually be all that difficult to map out the internet, if you used a distributed computing approach to the problem.
Consider if you will, a client application.
The client application scans the hdd for URLs, or starts off at a given URL, and then scans that URL for links, and follows those links, and scans those pages for links ... so on and so forth.
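To make the "scan that URL for links" step concrete, here is a rough sketch using only the Python standard library. The names LinkCollector and extract_links are made up for illustration, and it assumes the page source has already been downloaded into a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) links from html, in first-seen order, no duplicates."""
    parser = LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links
```

Each link returned would then be fed back in as a new page to scan.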
It could also try a few smart things too.
E.g. if it was given a URL www.host.com/somedir/somefile.htm, it could try to get a directory listing of /somedir/, and then it could also try to get the index page from /.
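A minimal sketch of that guessing step, assuming the URL carries a scheme such as http:// (a bare www.host.com/... would not parse this way). The function name extra_candidates is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def extra_candidates(url):
    """Return the containing directory URL and the site root URL for a page URL."""
    parts = urlsplit(url)
    directory = posixpath.dirname(parts.path)
    if not directory.endswith("/"):
        directory += "/"
    candidates = []
    for path in (directory, "/"):
        # Query string and fragment are dropped on purpose.
        candidate = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
        if candidate != url and candidate not in candidates:
            candidates.append(candidate)
    return candidates
```

Whether the server actually returns a directory listing or an index page for these is up to the server; the client can only try them.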
The application follows these links until it is a certain depth down from the starting point, or until it has run out of memory or something. Then every so often it connects to a central server, and uploads a compressed version of its findings.
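Roughly, the depth-limited walk and the compressed upload payload could look like the sketch below. Everything here is an assumption for illustration: get_links is whatever function fetches a page and returns its links, max_urls is a crude stand-in for "running out of memory", and the JSON-plus-zlib payload is just one possible format. The actual upload to the central server is left out.

```python
import json
import zlib
from collections import deque


def crawl(start_url, get_links, max_depth=3, max_urls=10000):
    """Breadth-first walk from start_url; returns {url: depth from start}."""
    depths = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        depth = depths[url]
        if depth >= max_depth:
            continue  # record pages at the depth limit, but don't follow their links
        for link in get_links(url):
            if link not in depths and len(depths) < max_urls:
                depths[link] = depth + 1
                queue.append(link)
    return depths


def pack_findings(depths):
    """Compress the sorted list of discovered URLs for upload."""
    payload = json.dumps(sorted(depths)).encode("utf-8")
    return zlib.compress(payload)


def unpack_findings(blob):
    """Server-side inverse of pack_findings."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

The visited dictionary also keeps the client from looping forever when pages link back to each other.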
Now, if you took maybe 10 computers, and gave the 10 computers directory sites to start off on, and told those people to tell other people etc, then you'd end up getting a lot of data very fast.
One thing though, there are lots of free webspace providers with thousands upon thousands of users these days.
So if the app came across *.geocities.com, it could just record that geocities.com was a valid host.
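Something like this sketch would do the collapsing. The provider set and the name record_key are made up for the example; a real list of webspace providers would have to come from somewhere.

```python
from urllib.parse import urlsplit

# Hypothetical list of free webspace providers to collapse.
FREE_HOSTS = {"geocities.com"}


def record_key(url, providers=FREE_HOSTS):
    """Return the provider domain for URLs on a known provider, else the URL itself."""
    host = (urlsplit(url).hostname or "").lower()
    for provider in providers:
        # Match the domain itself or any subdomain, but not lookalike names.
        if host == provider or host.endswith("." + provider):
            return provider
    return url
```

So every page under the provider gets recorded once as the bare host, and the client doesn't waste time walking thousands of user pages.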
Then we make another special app that is designed for webspace providers. The app would use the site's search engine to scan for pages contained on the site. Or if the site had a directory then even better.
So it's not all that implausible.
If ya got a lot of people running this app, which I might add, would sit in your system tray and not bother you, then you'd get a lot of data very fast.
Whaddaya think?
Microsoft MVP : Visual Developer - Visual Basic [2004-2005]
-
Sep 7th, 2001, 12:37 PM
#2
Member
Sure, go buy a bunch of top-secret supercooled supercomputers that have 256 quantum processors each and get about 500 backbones. Then I guess it could work.
-
Sep 7th, 2001, 12:40 PM
#3
Monday Morning Lunatic
I don't think number of processors makes much difference to quantum computers, since the processor itself can be any arbitrary size and problems are solved almost infinitely quickly.
I refuse to tie my hands behind my back and hear somebody say "Bend Over, Boy, Because You Have It Coming To You".
-- Linus Torvalds
-
Sep 7th, 2001, 12:43 PM
#4
Member
But the rest is true.
-
Sep 7th, 2001, 12:45 PM
#5
Monday Morning Lunatic
Yep. Lots of backbones = a good browsing experience for arksie
-
Sep 7th, 2001, 01:21 PM
#6
Junior Member
what would be the net effect of mapping out the net?
in watermelon sugar the deeds were done and done again as my life is done in watermelon sugar.
-
Sep 7th, 2001, 08:19 PM
#7
Lively Member
It won't work that way. At least 75% of the content accessible via the web is either hidden or not indexed.
The only possible way is in the future: when a new medium like the web, but a more advanced one, is created, every document/object created for it will be indexed at some central server.
In that way, you will have a near 100% accurate map of the web.
But for now, good night..see it in your dreams.
-
Sep 8th, 2001, 08:43 AM
#8
Thread Starter
Retired VBF Adm1nistrator
No I think this could work.
The fact that you're using distributed applications to do it means that you could get people surfing on 56k modems, people on dialup ISDN lines, or people sitting on an E3 connection.
Anyway, you wouldn't be downloading entire pages, just the source.
And the fact that a lot of stuff on the web isn't viewable to everyone, well, it's viewable to some people. And if those people are running this app, and they can view that stuff, they've probably got it in their favourites or in cache. So then it would be indexed too.
I mean look at any distributed computing application.
The likes of SETI@Home, Intel/UD's cancer cure app, the "Help VB World help a good cause" app.
They're basically brute-forcing billions upon billions of combinations of things. So why couldn't we?
Oh yes, and the net effect of mapping the net ... well, there is none really. It was just an idea.