-
Nov 17th, 2012, 12:28 AM
#1
Thread Starter
Member
Download files attached to a website
Can anyone help me with a VB program to download files linked from a website? The idea is that the user enters the URL of the website, the program lists the files, and the user clicks a button to save all of them to disk. The file types that I want it to show and download are:
- Image Files
- CSS Files
- HTML Files
- JS Files
If anyone can help at all it would be greatly appreciated.
Thanks,
Benedict3578
-
Nov 17th, 2012, 12:38 AM
#2
Re: Download files attached to a website
Most web sites won't provide direct access to style sheets and script files so, at most, you'll be able to get HTML files and images. Many web sites won't even have HTML files because their pages are dynamic and generated by PHP or ASP.NET or the like. I think that you might want to think about this a bit more to determine whether it's viable and worthwhile.
-
Nov 17th, 2012, 12:47 AM
#3
Thread Starter
Member
Re: Download files attached to a website
 Originally Posted by jmcilhinney
Most web sites won't provide direct access to style sheets and script files so, at most, you'll be able to get HTML files and images. Many web sites won't even have HTML files because their pages are dynamic and generated by PHP or ASP.NET or the like. I think that you might want to think about this a bit more to determine whether it's viable and worthwhile.
Just using the Safari web inspector I can list and view over 20 different .js and four .css files on this website. I have also tried on many other websites and I seem to be able to list the .js and .css files, and it appears that I can see all of them as I have tested it on my own server. The main interest is in image files so even if the program could only access those it would be fine.
-
Nov 17th, 2012, 12:58 AM
#4
Re: Download files attached to a website
You have to be able to access the style sheets and scripts in order for them to be used in the browser but I think that what you'll find is happening there is that the browser is loading a page and then checking its header to find style sheets and scripts. It's not actually going to the web server and browsing because any decent admin will have turned directory browsing off. That means that you can specify the URL of a site and download the default page but from there you can only crawl the pages. It's not like an FTP site where you can get a file or folder listing with a single command.
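To make the "load a page, then read what it references" step concrete, here is a minimal sketch. It is written in Python with only the standard library so that it runs as-is, not in VB; the class and function names (`ResourceCollector`, `find_resources`) are made up for illustration. It assumes the page's HTML is already in a string and simply collects the URLs named in `<link rel="stylesheet">`, `<script src>`, `<img src>` and `<a href>` tags, which is roughly what a browser inspects to decide what else to request.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collects URLs referenced by <link>, <script>, <img> and <a> tags."""

    def __init__(self):
        super().__init__()
        self.resources = []  # (kind, url) pairs in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel:
            self._add("css", attrs.get("href"))
        elif tag == "script":
            # Inline scripts have no src attribute and are skipped.
            self._add("js", attrs.get("src"))
        elif tag == "img":
            self._add("image", attrs.get("src"))
        elif tag == "a":
            self._add("link", attrs.get("href"))

    def _add(self, kind, url):
        if url:
            self.resources.append((kind, url))


def find_resources(html):
    """Return the (kind, url) pairs found in an HTML string."""
    parser = ResourceCollector()
    parser.feed(html)
    parser.close()
    return parser.resources


if __name__ == "__main__":
    sample = """<html><head>
    <link rel="stylesheet" href="css/site.css">
    <script src="/js/app.js"></script>
    </head><body>
    <img src="images/logo.png" alt="logo">
    <a href="about.html">About</a>
    </body></html>"""
    for kind, url in find_resources(sample):
        print(kind, url)
```

Note that this only sees what the HTML itself names. Files pulled in later by scripts or by `url(...)` rules inside a style sheet would need extra handling.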
-
Nov 17th, 2012, 01:01 AM
#5
Lively Member
Re: Download files attached to a website
just download internet download manager :P
-
Nov 17th, 2012, 01:06 AM
#6
Thread Starter
Member
Re: Download files attached to a website
 Originally Posted by jmcilhinney
You have to be able to access the style sheets and scripts in order for them to be used in the browser but I think that what you'll find is happening there is that the browser is loading a page and then checking its header to find style sheets and scripts. It's not actually going to the web server and browsing because any decent admin will have turned directory browsing off. That means that you can specify the URL of a site and download the default page but from there you can only crawl the pages. It's not like an FTP site where you can get a file or folder listing with a single command.
That is all I need though. If my program can look through the headers and find the files listed in the header even if they are not all the files in the server then that is fine. The only files needed are the ones that the HTML file requires to load.
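One detail worth sketching for this approach: the references found in a page are often relative (`css/site.css`, `../img/logo.png`), so they have to be resolved against the page's own URL before anything can be downloaded. The sketch below is Python, standard library only, and is an illustration under assumptions, not a finished design: the `classify` helper and the extension table are hypothetical, and the table only covers a few common extensions for the four file types listed in the first post.

```python
import posixpath
from urllib.parse import urljoin, urlparse

# Hypothetical extension table for the four wanted file types.
KINDS = {
    ".png": "image", ".jpg": "image", ".jpeg": "image", ".gif": "image",
    ".css": "css",
    ".htm": "html", ".html": "html",
    ".js": "js",
}


def classify(page_url, reference):
    """Resolve a reference against the page URL and guess its file type.

    Returns (absolute_url, kind), where kind is "other" for anything
    whose extension is not in KINDS (for example dynamic .php pages).
    """
    absolute = urljoin(page_url, reference)
    # Look only at the path, so query strings like ?v=2 are ignored.
    extension = posixpath.splitext(urlparse(absolute).path)[1].lower()
    return absolute, KINDS.get(extension, "other")


if __name__ == "__main__":
    page = "http://example.com/blog/index.html"
    for ref in ["style.css", "/js/app.js", "../img/logo.PNG?v=2", "page.php"]:
        print(classify(page, ref))
```

Guessing by extension is a simplification. A stricter version would look at the `Content-Type` the server returns for each URL.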
-
Nov 17th, 2012, 01:08 AM
#7
Thread Starter
Member
Re: Download files attached to a website
 Originally Posted by nosewey
just download internet download manager :P
The point is that I need to develop my own program to do it because the download part is only step one in what will be a complex program. I can already download the files but the point is that my program has to be able to do it.
-
Nov 17th, 2012, 01:22 AM
#8
Re: Download files attached to a website
In that case, you can use a WebClient to download a file from a URL. If you provide the domain then it will download the default document, e.g. Index.htm or Default.aspx. You can then use the HTML Agility Pack to load the document into a DOM and examine its contents from there.
http://htmlagilitypack.codeplex.com/
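The final "save everything to disk" step can be sketched as follows, assuming the list of absolute URLs has already been built. This is Python with the standard library only, not VB: `fetch_bytes` stands in for the role WebClient plays in the suggestion above, and `save_all` is a hypothetical helper name. Files are saved flat into one folder under the last segment of the URL path, so two URLs ending in the same file name would overwrite each other; a real program would want to handle that, along with download errors.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def fetch_bytes(url):
    """Download a URL and return the raw response body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_all(urls, dest_dir, fetch=fetch_bytes):
    """Download each URL into dest_dir and return the saved file paths.

    The fetch parameter is injectable so the function can be tried out
    without network access.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        # A bare domain has no file name, so fall back to index.html.
        name = posixpath.basename(urlparse(url).path) or "index.html"
        target = os.path.join(dest_dir, name)
        with open(target, "wb") as handle:
            handle.write(fetch(url))
        saved.append(target)
    return saved
```

A call such as `save_all(["http://example.com/", "http://example.com/css/site.css"], "downloaded")` would write `index.html` and `site.css` into a `downloaded` folder.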
-
Nov 17th, 2012, 01:26 AM
#9
Thread Starter
Member
Re: Download files attached to a website
 Originally Posted by jmcilhinney
In that case, you can use a WebClient to download a file from a URL. If you provide the domain then it will download the default document, e.g. Index.htm or Default.aspx. You can then use the HTML Agility Pack to load the document into a DOM and examine its contents from there.
http://htmlagilitypack.codeplex.com/
Awesome. Thanks for the help!