Thread: Screen Scraping Windows Applications

  1. #1

    Thread Starter
    New Member
    Join Date
    Mar 2006
    Posts
    1

    Screen Scraping Windows Applications

    Hello,

    Using VS 2003 and VB.NET in a Windows application, how would I go about scraping information from a specific window when multiple windows from various apps could be open? I need to identify the proper window, scrape three fields of information off the screen, and place them into text boxes in my own application, which would be open at the time.

    1. my application would be open (windows app)
    2. user clicks 'get data' button
    3. program somehow iterates through all open windows and selects the window that I need to scrape
    4. the data is scraped from this window (3 separate fields) and these fields are placed into 3 separate text boxes in my app.

    I believe the window I'm scraping the info from would belong to another Windows app. It may end up being some kind of terminal emulation program or something similar.

    Anyone have any ideas?

    Thanks in advance for any help you can offer me.

    Scott

  2. #2
    New Member
    Join Date
    Mar 2006
    Posts
    7

    Re: Screen Scraping Windows Applications

    Windows screen scraping is a VERY difficult endeavor. You can use tools like AutoIt to identify the right window by its title or by text in the window. Actually getting data off the screen depends on what the app is. I don't envy you the task.
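
    As a rough illustration of identifying a window by its title from VB.NET itself, the sketch below walks the running processes and returns the main window handle of the first one whose title contains a given fragment. The names WindowFinder, TitleMatches and FindMainWindowByTitle are made up for this example, and it only sees each process's main window, not child windows or secondary top-level windows.

    ```vbnet
    Imports System
    Imports System.Diagnostics

    Public Module WindowFinder
        'Case-insensitive "contains" test; hypothetical helper for this sketch.
        Public Function TitleMatches(ByVal title As String, ByVal fragment As String) As Boolean
            If title Is Nothing OrElse fragment Is Nothing Then
                Return False
            End If
            Return title.ToLower().IndexOf(fragment.ToLower()) >= 0
        End Function

        'Returns the main window handle of the first process whose title
        'contains the fragment, or IntPtr.Zero if none is found.
        Public Function FindMainWindowByTitle(ByVal fragment As String) As IntPtr
            For Each proc As Process In Process.GetProcesses()
                Try
                    If Not proc.MainWindowHandle.Equals(IntPtr.Zero) _
                       AndAlso TitleMatches(proc.MainWindowTitle, fragment) Then
                        Return proc.MainWindowHandle
                    End If
                Catch ex As Exception
                    'Some processes exit or deny access while being inspected; skip them.
                End Try
            Next
            Return IntPtr.Zero
        End Function
    End Module
    ```

    A call such as FindMainWindowByTitle("Notepad") returns IntPtr.Zero when no matching window is open, so the 'get data' button handler can report that case to the user.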

  3. #3
    PowerPoster
    Join Date
    Aug 2005
    Location
    College Station, TX
    Posts
    4,521

    Re: Screen Scraping Windows Applications

    I once found a Microsoft sample app on regular expressions that included a screen scraping example. I modified it a little, so I no longer have the original project, and I'm not sure it even works. I tried searching for it again but couldn't find it. Here is the assembly information if you want to search for it:
    Code:
    <Assembly: AssemblyTitle("VB.NET How-To: Use Regular Expressions")> 
    <Assembly: AssemblyDescription("Microsoft Visual Basic .NET How-To: Use Regular Expressions")> 
    <Assembly: AssemblyCompany("Microsoft Corporation")> 
    <Assembly: AssemblyProduct("Microsoft Visual Basic .NET How To: 2002")> 
    <Assembly: AssemblyCopyright("Copyright © 2002 Microsoft Corporation.  All rights reserved.")> 
    <Assembly: CLSCompliant(True)>
    **Note - I believe it was meant for scraping downloaded HTML code...

    **EDIT - This might be it, but I haven't downloaded it to check...

  4. #4
    Super Moderator Shaggy Hiker
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,106

    Re: Screen Scraping Windows Applications

    Actually, this isn't all that hard, but it sure can be tricky. You will need to use the following API calls:

    VB Code:
    'Win32 declarations (ANSI entry points). Handles are declared As Integer,
    'which assumes a 32-bit process; IntPtr is the safer choice on 64-bit.
    Declare Ansi Function FindWindow Lib "user32" Alias "FindWindowA" (ByVal lpClassName As String, ByVal lpWindowName As String) As Integer
    Declare Ansi Function GetWindowText Lib "user32" Alias "GetWindowTextA" (ByVal hWnd As Integer, ByVal lpString As System.Text.StringBuilder, ByVal nMaxCount As Integer) As Integer
    Declare Ansi Function FindWindowEx Lib "user32" Alias "FindWindowExA" (ByVal hWndParent As Integer, ByVal hWndChildAfter As Integer, ByVal lpClassName As String, ByVal lpWindowName As String) As Integer
    'uTimeout is a 32-bit UINT, so it is declared As Integer (Long is 64 bits in VB.NET).
    'lpdwResult is passed ByVal here; pass 0 to ignore the result.
    Declare Ansi Function SendMessageTimeout Lib "user32" Alias "SendMessageTimeoutA" (ByVal hWnd As Integer, ByVal msg As Integer, ByVal buffSize As Integer, ByVal lParam As System.Text.StringBuilder, ByVal fuFlags As Integer, ByVal uTimeout As Integer, ByVal lpdwResult As Integer) As Integer

    Also, here's a snippet showing how I go after a specific window in a program:

    VB Code:
    'Assumes class-level fields declared elsewhere: p3Handle and p3TextHandle
    '(Integer) and mErrorMessage (String).
    Private Function FindProg() As Boolean
        Dim lClientHandle As Integer
        Dim lH As Integer
        Dim lH2 As Integer
        Dim tWin1 As Integer
        Dim x As Integer
        Dim st1 As String
        Dim stBuilder As New System.Text.StringBuilder(256)

        Try
            'Get the top level window using the window caption.
            p3Handle = FindWindow(vbNullString, "PITTag3 - [New Tag Session]")
            If p3Handle = 0 Then
                'This is another possibility.
                p3Handle = FindWindow(vbNullString, "PITTag3")
                If p3Handle = 0 Then
                    'This is an attempt to find the registered class.
                    p3Handle = FindWindow("ThunderRT6MDIForm", vbNullString)
                    If p3Handle <> 0 Then
                        x = GetWindowText(p3Handle, stBuilder, stBuilder.Capacity + 1)
                        'Check the caption to confirm this is the right window.
                        st1 = stBuilder.ToString
                        'StartsWith avoids the exception Substring throws on short captions.
                        If Not st1.StartsWith("PITTag3") Then
                            mErrorMessage = "Some other program is blocking the visibility of P3."
                            Return False
                        End If
                    End If
                End If
            End If
            'A handle is valid when it is non-zero (it is not guaranteed to be positive).
            If p3Handle <> 0 Then
                'The rest of these lines drill down through the many layers of client windows
                'to get the handle of the base window.
                lClientHandle = FindWindowEx(p3Handle, 0, "MDIClient", vbNullString)
                lH2 = FindWindowEx(lClientHandle, 0, "ThunderRT6FormDC", vbNullString)
                If lH2 = 0 Then
                    mErrorMessage = "There is no session open. Try this again after opening a session."
                    Return False
                End If
                lH = FindWindowEx(lH2, 0, vbNullString, vbNullString)
                lH2 = FindWindowEx(lH, 0, vbNullString, vbNullString)
                lH2 = FindWindowEx(lH, lH2, vbNullString, vbNullString)
                'Step through the sibling combo boxes; the target is the sixth one.
                tWin1 = FindWindowEx(lH2, 0, "ATLfpOCXComboBox30", vbNullString)
                For x = 1 To 3
                    tWin1 = FindWindowEx(lH2, tWin1, "ATLfpOCXComboBox30", vbNullString)
                Next
                'This is the proper window, so use it.
                p3TextHandle = FindWindowEx(lH2, tWin1, "ATLfpOCXComboBox30", vbNullString)

                If p3TextHandle = 0 Then
                    mErrorMessage = "Unable to find the correct textbox. This is hard to explain, but if you get this message, there is still a bug remaining to be squashed."
                    Return False
                Else
                    Return True
                End If
            Else
                mErrorMessage = "P3 could not be located. It may not be running, or some unforeseen situation has arisen."
                Return False
            End If
        Catch ex As Exception
            mErrorMessage = ex.Message
            Return False
        End Try

    End Function
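
    Once FindProg has returned True, the text still has to be read from the control whose handle is in p3TextHandle. The following is only a sketch of one way that might be done, sending WM_GETTEXT through SendMessageTimeout. It redeclares the API with IntPtr handles so that it stands alone, and WindowTextReader and ReadWindowText are names invented for the example. Whether a custom control such as ATLfpOCXComboBox30 answers WM_GETTEXT is an assumption that would need checking against the real program.

    ```vbnet
    Imports System
    Imports System.Runtime.InteropServices
    Imports System.Text

    Public Module WindowTextReader
        Private Const WM_GETTEXT As Integer = &HD
        Private Const SMTO_ABORTIFHUNG As Integer = &H2

        <DllImport("user32.dll", CharSet:=CharSet.Auto, SetLastError:=True)> _
        Private Function SendMessageTimeout(ByVal hWnd As IntPtr, ByVal msg As Integer, _
                ByVal wParam As IntPtr, ByVal lParam As StringBuilder, _
                ByVal fuFlags As Integer, ByVal uTimeout As Integer, _
                ByRef lpdwResult As IntPtr) As IntPtr
        End Function

        'Asks the window for its text, giving up after one second or if the
        'target program is hung. Returns Nothing when the call fails.
        Public Function ReadWindowText(ByVal hWnd As IntPtr, ByVal maxChars As Integer) As String
            Dim buffer As New StringBuilder(maxChars)
            Dim copied As IntPtr = IntPtr.Zero
            Dim ok As IntPtr = SendMessageTimeout(hWnd, WM_GETTEXT, New IntPtr(maxChars), _
                                                  buffer, SMTO_ABORTIFHUNG, 1000, copied)
            If ok.Equals(IntPtr.Zero) Then
                Return Nothing
            End If
            Return buffer.ToString()
        End Function
    End Module
    ```

    With the Integer handle from FindProg, a call might look like ReadWindowText(New IntPtr(p3TextHandle), 256), with the result assigned to a text box after checking it for Nothing.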

    Look up the API calls in MSDN for more information about what they do. Then use Spy++ to inspect the window you are after. Spy++ has a little bullseye tool that shows information about a specific window, but I generally found it more useful to look at the list of all running programs. There might be more than one registered window with the same class name and window name, but you will find that the one you want is always in the same position relative to the others. The FindProg function shows how I did that for a specific window in a specific program.
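
    If Spy++ is not at hand, the same parent/child layout can be printed from code. This is a minimal sketch, assuming a console or debug output is available; ChildWindowLister and DumpChildren are hypothetical names, and the API declarations are repeated with IntPtr handles so the block is self-contained. It walks each level of child windows with FindWindowEx and prints the sibling index, class name and caption, which is the information needed to work out a fixed path like the one in FindProg.

    ```vbnet
    Imports System
    Imports System.Runtime.InteropServices
    Imports System.Text

    Public Module ChildWindowLister
        <DllImport("user32.dll", CharSet:=CharSet.Auto, SetLastError:=True)> _
        Private Function FindWindowEx(ByVal hWndParent As IntPtr, ByVal hWndChildAfter As IntPtr, _
                ByVal lpClassName As String, ByVal lpWindowName As String) As IntPtr
        End Function

        <DllImport("user32.dll", CharSet:=CharSet.Auto, SetLastError:=True)> _
        Private Function GetClassName(ByVal hWnd As IntPtr, ByVal lpClassName As StringBuilder, _
                ByVal nMaxCount As Integer) As Integer
        End Function

        <DllImport("user32.dll", CharSet:=CharSet.Auto, SetLastError:=True)> _
        Private Function GetWindowText(ByVal hWnd As IntPtr, ByVal lpString As StringBuilder, _
                ByVal nMaxCount As Integer) As Integer
        End Function

        'Recursively prints every child of parent, indented by nesting depth.
        Public Sub DumpChildren(ByVal parent As IntPtr, ByVal depth As Integer)
            Dim index As Integer = 0
            Dim child As IntPtr = FindWindowEx(parent, IntPtr.Zero, Nothing, Nothing)
            While Not child.Equals(IntPtr.Zero)
                Dim className As New StringBuilder(256)
                Dim caption As New StringBuilder(256)
                GetClassName(child, className, className.Capacity)
                'For controls owned by another process the caption may come back empty.
                GetWindowText(child, caption, caption.Capacity)
                Console.WriteLine("{0}[{1}] {2} ""{3}"" (0x{4})", New String(" "c, depth * 2), _
                                  index, className.ToString(), caption.ToString(), _
                                  child.ToInt64().ToString("X"))
                DumpChildren(child, depth + 1)
                'Passing the current child as hWndChildAfter moves to the next sibling.
                child = FindWindowEx(parent, child, Nothing, Nothing)
                index += 1
            End While
        End Sub
    End Module
    ```

    Calling DumpChildren(New IntPtr(p3Handle), 0) after the top-level window has been found would list the layers that FindProg drills through, so the sibling counts can be compared across several runs of the target program.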
