-
Jul 7th, 2014, 03:30 PM
#1
Thread Starter
Junior Member
Should I use VB to do this?
Hey guys, I've been using VB for years on and off. I still have VB 6.0, ancient right? Until now I've just been using it for stand-alone applications that do math or utilities, usually, like a piston speed calculator or a phone number storage app. Fun small stuff.
But now, I want to interact with the internet. I want to write something that will look at web pages, find strings, and highlight or click them. I also want to interact with eBay auctions and craigslist. I want to search online for car parts, for instance, but I don't have the time to sit and wander around 50 states and 500 counties looking for a part.
So, is VB6 right for the job? Or is there a much easier language I should be using instead?
-
Jul 7th, 2014, 04:15 PM
#2
Re: Should I use VB to do this?
VB6 can probably do the job, and your familiarity with it is a strong reason to choose it. However, it did predate the real boom of the internet, so there may be somewhat easier alternatives in VB.NET. I'm not sure about that, though, as I never did any internet anything when I was working in VB6, so I'm not familiar with what tools there are.
Web pages, in general, are just text, which can be parsed with some difficulty. Web APIs will exist for some sites, and where they are available, they will be FAR easier to work with than getting back a page of HTML and parsing it for components.
My usual boring signature: Nothing
-
Jul 7th, 2014, 04:31 PM
#3
Thread Starter
Junior Member
Re: Should I use VB to do this?
Is there some kind of API for internet explorer or firefox I can use to make browsing easier?
I'll look into VB.NET. I have heard of it but have no clue what it is yet...
-
Jul 7th, 2014, 04:47 PM
#4
Re: Should I use VB to do this?
Both VB6 and VB.NET have a WebBrowser control: you give it a URL and it'll render the page for you. From there you can get the source and pull it apart. But if you're going to do that, you might want to look at the WebClient (http://msdn.microsoft.com/en-us/libr...webclient.aspx) class in .NET. You feed it a URL and it downloads the HTML to you as a stream, from which you can then pull out what you need. You can also use it to send data via GET (querystring) or POST and get the results back. It's possible to do this in VB6 too, but I've noticed it's a lot easier in .NET. This can be handy if you find the sites you want to interact with have some kind of API or web service you can tap into.
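A minimal sketch of the WebClient route, assuming a console project and a placeholder URL (example.com is only a stand-in; use a page or API you are permitted to fetch). It uses DownloadString rather than a stream just to keep the example short:

```vbnet
Imports System
Imports System.Net

Module DownloadSketch
    Sub Main()
        ' Hypothetical URL for illustration only.
        Dim url As String = "http://example.com/"

        ' WebClient is disposable, so wrap it in a Using block.
        Using client As New WebClient()
            ' DownloadString returns the response body as one String.
            Dim html As String = client.DownloadString(url)

            ' Plain string search, much like InStr in VB6 (but zero-based, -1 if not found).
            Dim pos As Integer = html.IndexOf("<title>", StringComparison.OrdinalIgnoreCase)
            Console.WriteLine("Downloaded " & html.Length & " characters; <title> found at index " & pos)
        End Using
    End Sub
End Module
```

If you want the stream form instead, WebClient.OpenRead returns a Stream you can wrap in a StreamReader.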
-tg
-
Jul 7th, 2014, 05:36 PM
#5
Hyperactive Member
Re: Should I use VB to do this?
VB.net is a little better because of its OOP support.
Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde
-
Jul 7th, 2014, 07:28 PM
#6
Re: Should I use VB to do this?
I'm not sure this is practical no matter what you choose to write it in.
Many sites prohibit that sort of activity anyway, and even if you squeak by without detection (which can get your whole subnet banned) you have to contend with shifting sands. Sites aren't static. Things get reorganized at the site map level, pages themselves get reorganized. Many pages are partially built dynamically at the client, especially when there is data involved instead of simple prose. Logons, captchas, etc. can be a minefield - as they are intended to be.
Web scraping just isn't looked upon kindly. At a minimum it is seen as a way to bypass the advertising that pays for many sites.
-
Jul 7th, 2014, 07:50 PM
#7
Thread Starter
Junior Member
Re: Should I use VB to do this?
I realize that there is a dynamic element to the design. My process would also be dynamic: instead of hard-coding the names of pages/elements or their locations, I think it more feasible to have the user actually make the first move, and the software would watch what I click or do. Then it would just repeat the last action(s) for that one-time use. Since it's my personal computer, I don't see why I couldn't subclass the web browser software itself. I appreciate the warning but I have to at least try.
I'll have to give it a shot. Even if it just speeds me up a little it would be worth it. Mainly I would be highlighting key words on pages, and perhaps a little decision making. My big fear, overall, is that I would just be re-inventing the wheel. I would not want to spend 80 hours programming an inefficient series of home-made cardboard steps that does the job a few lines of well-thought out code would accomplish from an experienced programmer.
Last edited by Kingtal0n; Jul 7th, 2014 at 07:58 PM.
-
Jul 7th, 2014, 08:25 PM
#8
Re: Should I use VB to do this?
Originally Posted by Kingtal0n
I would not want to spend 80 hours programming an inefficient series of home-made cardboard steps that does the job a few lines of well-thought out code would accomplish from an experienced programmer.
This seems pretty unlikely to me. More likely, you may find that dilettante is very much right to the point that what you can accomplish doesn't save you anything. However, that's not a reason not to give it a try, because you don't learn squat if you don't try.
-
Jul 8th, 2014, 08:51 AM
#9
Re: Should I use VB to do this?
You would be re-inventing the wheel for sites where an API is available. Also keep in mind that what you are doing *could* be against the Terms of Use (depending on the site), so make sure to check the site's acceptable use policy. I think you will find you are spending more time learning and researching than anything, especially because dealing with web stuff is different (and less documented) than dealing with just forms and data being entered. You need to have some understanding of HTML and a basic understanding of the programming language in order to do what you are trying to do.
Ultimately, VB.NET can do what you are asking, but it is up to you to put it all together. Google is your friend.
-
Jul 8th, 2014, 11:47 AM
#10
Thread Starter
Junior Member
Re: Should I use VB to do this?
Thank you everyone for your help and support. If I get anywhere with it, I'll be sure to have some problems/questions along the way as usual.
I know eBay has an API for sure. I think I will start there since it's "designed" for what I am trying to accomplish to some extent.
I literally have no idea where to get "VB.NET" to get me started. I thought it was free but I see microsoft.com selling many versions of it for $300+. I found the "Express" version but it says "free trial", which puts me off.
http://www.visualstudio.com/en-US/pr...dio-express-vs
Last edited by Kingtal0n; Jul 8th, 2014 at 11:56 AM.
-
Jul 8th, 2014, 08:31 PM
#11
Re: Should I use VB to do this?
I too had the same problem locating the full version. However, at the bottom of the page, in the Pricing section of links, there is one titled "How to buy Visual Studio".
-
Jul 8th, 2014, 09:29 PM
#12
Re: Should I use VB to do this?
Express is still free... it's just that after the "trial" you have to register it... but it's still free.
-tg
-
Jul 9th, 2014, 10:11 AM
#13
Re: Should I use VB to do this?
Express is probably the way to go to start out. It's remarkably full featured with little left out. I think the set of application templates is much reduced (for example, you don't get a Package and Deployment project, or whatever it is called), and there aren't all the profiling and code analysis tools, but the language is all there, as are the debugging tools.
-
Jul 9th, 2014, 03:08 PM
#14
Thread Starter
Junior Member
Re: Should I use VB to do this?
-
Jul 9th, 2014, 04:22 PM
#15
Re: Should I use VB to do this?
The API viewer was a database of the most commonly used Win32 declarations in VB6. The sheer amount of classes in the .Net Framework makes you far less dependent on the Win32 API in a .Net language hence you will not find such a thing in VB.Net. That being said, the MSDN library is going to be your best friend throughout this time. Every public class in the .Net Framework is documented in the MSDN library. In the case where you need a Win32 API declaration, I've found that PInvoke.Net is quite adequate though you will occasionally run into declarations that are slightly incorrect.
-
Jul 9th, 2014, 09:01 PM
#16
Thread Starter
Junior Member
Re: Should I use VB to do this?
wow, thank you, you opened my eyes. Should I be looking at this?
http://msdn.microsoft.com/en-us/library/gg145018.aspx
am I on the right track?
-
Jul 10th, 2014, 05:04 AM
#17
Re: Should I use VB to do this?
Originally Posted by Kingtal0n
It's a good place to start. That namespace has all the classes you need to write internet applications.
-
Aug 10th, 2014, 12:41 PM
#18
Thread Starter
Junior Member
Re: Should I use VB to do this?
So I have my first unusual question. I've managed to make a web browser, and I am good at pulling clicks (location X, Y coords) and finding out what is under the mouse, etc.
But why do certain events not show up in the default WebBrowser1_ event list? For instance, the KeyDown event is not there. But I can add it in: I just type WebBrowser1_KeyDown etc. and it works fine.
1. How can I find a list of all the "hidden" events that can occur?
2. (working on this now) My next step is to grab the info I need during a mouse/key down event and record it for use later (to go back and repeat those actions like we talked about). Once I have the class name of the object the user clicked on, can I just use a vb_click event (I forgot what it's called) to repeat the click in the web browser? Or do I need to use some other method to click (.InvokeMember("click"), which I tried, doesn't work... yet)?
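On question 1, one way to see every event the control declares, including ones hidden from the editor, is reflection. This is only a sketch, assuming a console project that references System.Windows.Forms. Note that an event being listed does not mean the control actually raises it, so check the MSDN page for each event before relying on it.

```vbnet
Imports System
Imports System.ComponentModel
Imports System.Reflection
Imports System.Windows.Forms

Module ListEventsSketch
    Sub Main()
        ' GetEvents returns the public events of WebBrowser, inherited ones included.
        For Each ev As EventInfo In GetType(WebBrowser).GetEvents()
            ' Events marked EditorBrowsable(Never) are the ones the editor hides.
            Dim attr As EditorBrowsableAttribute = TryCast(
                Attribute.GetCustomAttribute(ev, GetType(EditorBrowsableAttribute)),
                EditorBrowsableAttribute)
            Dim hidden As Boolean =
                attr IsNot Nothing AndAlso attr.State = EditorBrowsableState.Never

            Console.WriteLine(ev.Name & If(hidden, "  (hidden from the editor)", ""))
        Next
    End Sub
End Module
```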
-
Aug 10th, 2014, 01:25 PM
#19
Thread Starter
Junior Member
Re: Should I use VB to do this?
For #2, I was able to generate a click in Google by using element.Name, but on eBay there was no element.Name, so I tried the same thing with element.Id and it worked.
Code:
web.Document.GetElementById(TextBox3.Text).Focus()
web.Document.GetElementById(TextBox3.Text).InvokeMember("click")
web.Document.GetElementById(TextBox6.Text).Focus()
web.Document.GetElementById(TextBox6.Text).InvokeMember("click")
where TextBox3 holds the Name that a mousedown event recorded for the element I clicked, and TextBox6 holds the Id of the same element.
So this GetElementById function seems to be similar to getchildbyclass (IIRC?).
Still curious about #1!
Last edited by Kingtal0n; Aug 10th, 2014 at 01:43 PM.
-
Aug 11th, 2014, 05:32 PM
#20
Thread Starter
Junior Member
Re: Should I use VB to do this?
So here is where I am at now.
I am trying to read text from the website. And so far I am able to, using such code as:
Code:
'MouseDown event on website
Dim poop As Point
Dim zap As Point
Dim elem As HtmlElement
Dim elem2 As HtmlElement
On Error Resume Next
poop = New Point(MousePosition.X, MousePosition.Y)
zap = web.PointToClient(poop)
elem = web.Document.GetElementFromPoint(zap)
elem2 = elem.NextSibling
TextBox2.Text = elem.InnerText
TextBox3.Text = elem2.InnerText
So I am able to read the text when I click on the element with text, or one nearby.
Next step, I want to find these elements without clicking on them. Is there an array for elements on a website, so I can cycle through them looking for what I want? (for instance, to find the price of something, look for the "$" sign if it is there.)
I am good with the InStr function and cutting up text. But I cannot seem to find a way to pull ALL of the text off a website, only pieces directly from each element.
So I guess my questions now are:
1. is there a way to find the text from the entire website all at once
2. are elements broken up into an array?
3. I think I need a do while ... loop in here somewhere. There must be a way to find out how many elements are on a page first, then do while the number of elements is < that number, search each one for desired text, ... loop.
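For what it's worth, the three questions above can be sketched as follows. This assumes the WebBrowser control has already finished loading the page and that the function is called on the UI thread; the module and function names are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

Module DomWalkSketch
    ' Returns the inner text of every element whose text contains marker.
    Public Function FindText(ByVal browser As WebBrowser, ByVal marker As String) As List(Of String)
        Dim hits As New List(Of String)
        If browser.Document Is Nothing Then Return hits

        ' 1. All of the page text at once: the body element's InnerText.
        If browser.Document.Body IsNot Nothing Then
            Dim pageText As String = browser.Document.Body.InnerText
            If pageText IsNot Nothing Then
                Console.WriteLine("Page text length: " & pageText.Length)
            End If
        End If

        ' 2. Document.All is a collection of every element on the page.
        Console.WriteLine("Element count: " & browser.Document.All.Count)

        ' 3. For Each walks the collection, so no manual counter is needed.
        For Each el As HtmlElement In browser.Document.All
            Dim text As String = el.InnerText
            ' InnerText can be Nothing for elements without text.
            If text IsNot Nothing AndAlso text.Contains(marker) Then
                hits.Add(text)
            End If
        Next
        Return hits
    End Function
End Module
```

One thing to expect: a parent element's InnerText includes the text of its children, so the same "$" text will show up several times (body, the containing div, the span itself). Filtering on elements that have no children is one way to cut that down.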
-
Aug 11th, 2014, 07:06 PM
#21
Thread Starter
Junior Member
Re: Should I use VB to do this?
pages are really different from one site to the next. In any case, I have this basic idea, hope I am on the right track:
Code:
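Private Sub findelements(ByVal initialelem As HtmlElement)
    ' Collect every element that shares the clicked element's tag name.
    Dim findelements2 As HtmlElementCollection = web.Document.GetElementsByTagName(initialelem.TagName)
    For Each curElement As HtmlElement In findelements2
        ' InnerText can be Nothing for elements without text, so guard before calling Contains.
        If curElement.InnerText IsNot Nothing AndAlso curElement.InnerText.Contains("$") Then
            TextBox8.Text += curElement.InnerText & vbCrLf
            TextBox8.Text = Trim(TextBox8.Text)
        End If
    Next
End Sub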
Now, this, when I use the right tag name, gives me what I want. prices for this example. The theory here is, as long as they keep using the $ sign, I should be able to re-find the prices wherever they go, despite any changes to the HTML itself. What else can change? The tag name. So this is where dynamics come into play; let the user click to the price and lock in that HTML's tag name (save to file or registry for that site). then find all other "inner text" from elements with the same tag name that also have "$" in them.
Next problem: The program hangs ("slight freeze") while searching up text. So I thought, I would use a low priority thread for it.
Code:
Public zbar As Threading.Thread
zbar = New Threading.Thread(AddressOf findelements(elem))
zbar.IsBackground = True
zbar.Start()
But it will not allow me to pass anything to the function/sub along with calling it. It gives me an error and says the thread needs to be a "method" without parentheses.
So then I tried to call the mouse cursor function from the new thread, and got a cross-thread error.
Not sure about this one. But I am a huge fan of giving software time to "think" and do other things.
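Both errors above have a common workaround, sketched here under some assumptions: AddressOf cannot carry arguments, so a lambda captures them instead, and controls may only be touched on the thread that created them, so results go back through Invoke. The class, field, and method names are made up for illustration. As far as I know the WebBrowser DOM objects (HtmlElement and friends) also belong to the UI thread, so this sketch copies the text out first and only does the string searching in the background.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Threading
Imports System.Windows.Forms

Public Class ScanForm
    Inherits Form

    ' Hypothetical results box standing in for TextBox8.
    Private ReadOnly resultsBox As New TextBox()

    Public Sub New()
        resultsBox.Multiline = True
        resultsBox.Dock = DockStyle.Fill
        Controls.Add(resultsBox)
    End Sub

    ' Call this on the UI thread, after the form is shown, with text
    ' already copied out of the page.
    Public Sub StartScan(ByVal texts As List(Of String))
        ' The lambda closes over texts, which is how the argument reaches the thread.
        Dim worker As New Thread(New ThreadStart(Sub() ScanTexts(texts)))
        worker.IsBackground = True
        worker.Start()
    End Sub

    ' Runs on the background thread.
    Private Sub ScanTexts(ByVal texts As List(Of String))
        Dim hits As New List(Of String)
        For Each t As String In texts
            If t IsNot Nothing AndAlso t.Contains("$") Then hits.Add(t)
        Next

        ' Marshal the UI update back to the thread that owns the control.
        Dim joined As String = String.Join(Environment.NewLine, hits.ToArray())
        resultsBox.Invoke(New MethodInvoker(Sub() resultsBox.Text = joined))
    End Sub
End Class
```

The BackgroundWorker component is another option for the same pattern, since its RunWorkerCompleted event is raised back on the UI thread for you.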
-
Aug 12th, 2014, 02:35 AM
#22
Thread Starter
Junior Member
Re: Should I use VB to do this?
I am having a lot of fun, really knee deep here at 4am. Tons of questions I am working on.
1. When I mousedown in some sites, there is an "individualElement.InnerText" that I desire, but no .Id or seemingly any way to identify the element later (say, if I close the program and re-load it). How do I re-find that same element again? What data do I save?
Potential solution: record the location of the click, then simulate a click in the exact same spot later. Make the click spot user recordable, and changeable for different sites.
2. I'm trying to store my data in an array but having a difficult time with it. Maybe I should be using a "database" file instead?
Since I know I eventually need to save all my settings into a file anyway, I suppose I had better start looking at that...
3. Sometimes my function
Code:
For Each individualElement As HtmlElement In findelements2
    If individualElement.InnerText.Contains(lookingfor) Then
        arise.Add(individualElement.InnerText)
        arise2.Add(individualElement.Id)
    End If
Next
will find what I am "lookingfor" and save the data into a collection. But when I check the data, it's empty. It does this a lot.
4. Is there any other way to locate elements besides their "ID", "tag names", or "from a point"?
Tag names seem excessively repetitive, IDs sometimes do not exist (it seems), and clicking every time I want some text seems unnecessary.
Last edited by Kingtal0n; Aug 12th, 2014 at 03:01 AM.
-
Aug 12th, 2014, 07:35 AM
#23
Re: Should I use VB to do this?
1. You need to know what you are looking for. If there is inner text in the HTML that you can see, you have to loop through the tags and pull it. This will depend on the HTML.
2. Where you store the data is irrelevant. An array or a database should not make a difference; your deciding factor should be what you want to do with the data. Does it need to be queryable? Do you need to store it for future use?
3. Have you waited for the page to finish loading? Is there some type of 'dynamic pull' from a database that you need to wait for? That happens to me: if I try pulling data before it's ready, I get empty strings. Depending on the HTML element, you may not always get a value back for ID or inner text. You should add a watch on arise.Add and see the value being stored.
4. Tag names are repetitive but probably the best way I have seen to go about it. It's basically a process of elimination. Knowing this, you find the commonality that the field will always have and narrow down to it.
You should probably organize your elements better. Why use two lists when you can use one list of a class to help you? For example:
Code:
Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim listOfElements As New List(Of HTMLElementData)
        listOfElements.Add(New HTMLElementData With {.elementID = "ELEID", .elementInnertext = "INNERTEXT VALUE"})
        For Each addedHTMLElement In listOfElements
            MessageBox.Show("Element ID: " & addedHTMLElement.elementID & vbNewLine & _
                            "Innertext: " & addedHTMLElement.elementInnertext)
        Next addedHTMLElement
    End Sub
End Class
Public Class HTMLElementData
    Public elementInnertext As String
    Public elementID As String
End Class
This way you can organize your data more effectively. You could also turn those class variables into properties if you wanted to.
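On point 3 above, here is a rough sketch of waiting for the page before reading it, using the WebBrowser's DocumentCompleted event. The form, the field names, and the "span" tag are assumptions for illustration; substitute whatever tag your target elements actually use.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

Public Class LoadAwareForm
    Inherits Form

    ' WithEvents lets the Handles clause below attach the handler.
    Private WithEvents web As New WebBrowser()
    Private ReadOnly found As New List(Of String)

    Public Sub New()
        web.Dock = DockStyle.Fill
        Controls.Add(web)
    End Sub

    Private Sub web_DocumentCompleted(ByVal sender As Object,
                                      ByVal e As WebBrowserDocumentCompletedEventArgs) Handles web.DocumentCompleted
        ' The event can fire once per frame; wait until the whole page reports Complete.
        If web.ReadyState <> WebBrowserReadyState.Complete Then Return

        found.Clear()
        For Each el As HtmlElement In web.Document.GetElementsByTagName("span")
            Dim text As String = el.InnerText
            ' Skip elements with no text instead of storing empty entries.
            If String.IsNullOrEmpty(text) Then Continue For
            If text.Contains("$") Then found.Add(text)
        Next
    End Sub
End Class
```

Pages that fill in their data with script after the load finishes can still come back empty at this point, so a short retry (for example with a Timer) may be needed on those.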
-
Aug 12th, 2014, 11:17 PM
#24
Thread Starter
Junior Member
Re: Should I use VB to do this?
I am completely new to Express and internet programming, so anything you show me will not be a waste of time.
I'd really like to save my data arrays to disk; that's my next target.
Last edited by Kingtal0n; Aug 12th, 2014 at 11:20 PM.
-
Aug 13th, 2014, 07:12 AM
#25
Re: Should I use VB to do this?
It is illegal to use bots like this with most of the sites you have mentioned.
For example:
eBay User Agreement
In connection with using or accessing the Services you will not:
...
* use any robot, spider, scraper or other automated means to access our Services for any purpose;
Why this thread wasn't locked with a reprimand 20 posts ago escapes me. Webscraping is a crime unless an individual site explicitly permits it.
-
Aug 13th, 2014, 09:35 AM
#26
Re: Should I use VB to do this?
Moderator Action: Close thread due to AUP violation.
@Kingtal0n, webscraping is fine so long as it does not violate the other website's user agreement, AUP, terms of use, or any other variation.
In the case of eBay, this is in the user agreement:
use any robot, spider, scraper or other automated means to access our Services for any purpose;
And in the case of craigslist, this is in the terms of service:
You agree not to use or provide software (except for general purpose web browsers and email clients, or software expressly licensed by us) or services that interact or interoperate with CL, e.g. for downloading, uploading, posting, flagging, emailing, search, or mobile use. Robots, spiders, scripts, scrapers, crawlers, etc. are prohibited, as are misleading, unsolicited, unlawful, and/or spam postings/email. You agree not to collect users' personal and/or contact information ("PI").
@dilettante, in regards to this quote:
Why this thread wasn't locked with a reprimand 20 posts ago escapes me.
Moderators and admins work on a volunteer basis. While we make every effort to nip questions like these in the bud, occasionally, because of the enthusiastic work of members (such as yourself), answers will be posted before we (the moderators) are able to get to the thread.
Last edited by dday9; Aug 13th, 2014 at 09:41 AM.