Thread: Filtering html to get results

  1. #1

    Thread Starter
    Lively Member
    Join Date
    Jul 2013
    Posts
    108

    Filtering html to get results

    I'm working on .NET 6+ WinForms.
    I'm trying to filter results out of a big HTML file by some criteria. I don't know if what I'm asking is achievable, but it costs nothing to ask. I'm playing a browser game that has a big 100x100 map, and each cell of the map holds an island. Each island has a cave (mine) that produces one of several materials, though I'm only interested in one specific material, plus a carpentry and 16 slots. I would like to find the islands with no more than 7 occupied slots and, among these, the ones with the highest carpentry and mine levels. Each island's HTML is structured like this:
    Code:
    <div id="tile_12_10" class="islandTile island2" style="left: 240px; top: 1320px; display: block;" title="Nyeos [71:92]">
       <div id="marking_12_10" class=""></div>
       <div id="wonder_12_10" class="wonder wonder1"><span class="ikaeasy-resource ikaeasy-resource-wonder ikaeasy-t-20 ikaeasy-l--20 ikaeasy-d-n">5</span></div>
       <div id="tradegood_12_10" class="tradegood tradegood2"><span class="ikaeasy-resource ikaeasy-resource-mine ikaeasy-l--20 ikaeasy-d-n">16</span></div>
       <div id="cities_12_10" class="cities">16</div>
       <div id="piracy_12_10" class=""></div>
       <div id="helios_12_10" class=""></div>
       <div id="magnify_12_10"></div>
       <div id="owner_12_10" class="ownerState "></div>
       <a href="javascript:ikariam.getScreen().clickIsland('tile_12_10');" id="linkurl_12_10" class="linkurl"></a>
       <div class="ikaeasy-resource ikaeasy-resource-wood ikaeasy-b-70 ikaeasy-l-55 ikaeasy-d-n">
          <div class="ikaeasy-resource-icon ikaeasy-w-24 ikaeasy-h-20" style="background-image: url(skin/resources/icon_wood.png);"></div>
          <span>21</span>
       </div>
    </div>
    The values I'm interested in are these. For the cave, as I said, I only want islands with a specific material, which the HTML gives in the class tradegood tradegood2: that "2" is the material the island produces. The cave level is in the span with class ikaeasy-resource ikaeasy-resource-mine ikaeasy-l--20 ikaeasy-d-n, which contains the level "16". For the wood (carpentry), the div with class ikaeasy-resource ikaeasy-resource-wood ikaeasy-b-70 ikaeasy-l-55 ikaeasy-d-n contains the level "21". For the occupied slots, the div cities_12_10 with class="cities" contains "16", which means the island has no free slots.

    Would it be possible to get all the islands that have that specific material and no more than 7 occupied slots, sorted by highest cave and carpentry levels? Something like:
    Island [71:92] mine : lvl 16 carpentry : lvl 20 free slots: 9
    p.s. Tiles with class="oceanTile " are pure ocean, so there is no island in them.
    p.s.2 Since the id (<div id="tile_12_10") changes for every island, I'm adding starts-with(@id, 'tile_') to the XPath in the first For Each.
    I'm using the HTML Agility Pack library and this is what I've been able to write so far:

    Code:
    Imports System.IO
    Imports HtmlAgilityPack
    
    Public Class Form1
        Public Class Island
            Public Property Coord As String
            Public Property MineLevel As Integer
            Public Property CarpentryLevel As Integer
            Public Property FreeSlots As Integer
        End Class
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
            htmlDoc.Load("pastebin.html")
    
            ' Get all islandTile elements that have a trade good type of 2 (tradegood2)
            Dim islands As New List(Of Island)()
            For Each tile As HtmlNode In htmlDoc.DocumentNode.SelectNodes("//div[contains(@class, 'islandTile') and not(contains(@class, 'oceanTile')) and starts-with(@id, 'tile_')]")
    
                Dim tradeGoodTypeNode As HtmlNode = tile.SelectSingleNode(".//div[contains(@class, 'tradegood')]/@class")
                Dim tradeGoodType As String
                If tradeGoodTypeNode IsNot Nothing Then
                    tradeGoodTypeNode.GetAttributeValue("class", "")
    
                    ' Get the mine and carpentry levels
                    Dim mineLevel As Integer
                    Integer.TryParse(tile.SelectSingleNode(".//div[contains(@class, 'tradegood2')]/span")?.InnerText, mineLevel)
                    Dim carpentryLevel As Integer
                    Integer.TryParse(tile.SelectSingleNode(".//div[contains(@class, 'ikaeasy-resource-wood')]/span")?.InnerText, carpentryLevel)
    
                    ' Get the number of occupied slots
                    Dim occupiedSlots As Integer
                    Integer.TryParse(tile.SelectSingleNode(".//div[contains(@class, 'cities')]/text()")?.InnerText, occupiedSlots)
    
                    ' If the island has no more than 7 occupied slots, add it to the list
                    If occupiedSlots <= 7 Then
                        islands.Add(New Island() With {
                            .Coord = tile.GetAttributeValue("id", ""),
                            .MineLevel = mineLevel,
                            .CarpentryLevel = carpentryLevel,
                            .FreeSlots = 16 - occupiedSlots
                        })
                    End If
                End If
    
            Next
    
            ' Sort islands by mine and carpentry level
            islands = islands.OrderByDescending(Function(x) x.MineLevel).ThenByDescending(Function(x) x.CarpentryLevel).ToList()
    
            ' Print island information
            For Each island As Island In islands
                Console.WriteLine($"Island {island.Coord} mine : lvl {island.MineLevel} carpentry : lvl {island.CarpentryLevel} free slots: {island.FreeSlots}")
            Next
        End Sub
    End Class
    but nothing happens when I press the button...

    I pasted the entire HTML into a pastebin file in case it helps. Much appreciated.
    Last edited by matty95srk; Feb 27th, 2023 at 08:27 AM.

  2. #2
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,102

    Re: Filtering html to get results

    Nothing at all? The computer doesn't melt?

    First off, will this ever work? From the HTML, it looks feasible. Of course, if this would allow you to cheat at the game in any way, then it is quite possible that the publishers of the game don't want you doing this. If that's the case, then the HTML will change as soon as they figure out that you ARE doing this. They'd have many options, some subtle, some obvious. On the other hand, if the publishers of the game don't care, then it might work. In fact, they might have some alternative solution that would work better for you, such as an API you could use. In general, we don't endorse hacking a site, though in this case, it doesn't seem like the publishers have taken even the slightest measure to hide the information, so it seems likely that they don't care.

    You did say that nothing happens. Did you put a breakpoint in the code and it wasn't reached? Or are you saying that you just didn't see any output on the console, which appears to be the only visible result of the button? If that's the case, then do the other thing. Nothing will show up in the console if the list of islands is empty. If you don't think it SHOULD be empty, but it IS empty, then you need to step through the code to see WHY it is empty. There are a variety of possible reasons, and nobody is better positioned to figure out which one it is than you.
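
    As a minimal sketch of that kind of check, assuming the same HtmlAgilityPack package used in the first post: `CountIslandTiles` is a hypothetical helper, not something from the original code. It reports how many tiles the XPath actually matches, which separates "the selector found nothing" from "the filters rejected everything". By default, `SelectNodes` returns Nothing rather than an empty collection when there is no match, so the helper guards against that.

```vb
Imports HtmlAgilityPack

Module TileDiagnostics
    ' Hypothetical helper: returns how many island tiles the XPath matches.
    ' SelectNodes returns Nothing (not an empty collection) by default
    ' when no node matches, so that case is mapped to 0 here.
    Public Function CountIslandTiles(html As String) As Integer
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)
        Dim tiles As HtmlNodeCollection = doc.DocumentNode.SelectNodes(
            "//div[contains(@class, 'islandTile') and starts-with(@id, 'tile_')]")
        Return If(tiles Is Nothing, 0, tiles.Count)
    End Function
End Module
```

    Printing the result with Debug.Print before the loop, with a breakpoint on the same line, should show whether the list is empty because nothing matched or because every tile was filtered out.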
    My usual boring signature: Nothing

  3. #3
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    9,017

    Re: Filtering html to get results

    I don't think he is trying to hack the game, Shaggy. Nothing in what he says implies he is trying to change the game state. He is only reading HTML to get information from it about the game state, which might be considered a form of cheating. However, what you must understand is that in "gamer culture" cheating is usually understood as manipulating the game state. Trust me, game developers know how determined the human being is to win at all costs, and they would have taken steps to not expose data if they thought it would compromise the integrity of their game. The fact that their HTML is available like that for anyone to decipher tells me that they aren't concerned with people seeing what's in it.

    There are also cases where the developers don't care if users cheat. This is typical of single player games because in these cases it only affects the individual player. Cheating in a multiplayer game ruins the experience for everyone against their will, but in single player scenarios you are ruining your own experience of your own free will. I suspect that might be the case here.

  4. #4
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,102

    Re: Filtering html to get results

    That was my thinking about it. If it's that readily available, then they must expect people to be gathering it. After all, that's some very clear and explicit HTML. It's almost documented. However, considering how clear the HTML is, it seems possible that they also offer the data up via API, and if so, then that's certainly the way to go. That seems less likely with a game, though, as game developers likely don't think like that.

  5. #5

    Thread Starter
    Lively Member
    Join Date
    Jul 2013
    Posts
    108

    Re: Filtering html to get results

    Hacking? Not even close, guys. How can you hack a game just by downloading part of its HTML? Maybe that's possible, but it's beyond my skills.
    As I said, it's just a tool to speed up the search for an island with higher cave and carpentry levels. That's it.
    By the way, I solved it myself with:

    Code:
    Imports System.IO
    Imports HtmlAgilityPack
    
    Public Class Form1
        Public Class Island
            Public Property Coord As String
            Public Property MineLevel As Integer
            Public Property CarpentryLevel As Integer
            Public Property FreeSlots As Integer
        End Class
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
            htmlDoc.Load("kappa.html")
            ' Get all islandTile elements that have material type 2 (tradegood2)
            Dim islands As New List(Of Island)()
    
            For Each tile As HtmlNode In htmlDoc.DocumentNode.SelectNodes("//div[contains(@class, 'islandTile') and not(contains(@class, 'oceanTile')) and starts-with(@id, 'tile_')]")
    
                ' Check if this island produces the specific material you're interested in
                Dim tradeGoodType As String = tile.SelectSingleNode(".//div[contains(@class, 'tradegood')]/@class")?.GetAttributeValue("class", "")
                If tradeGoodType <> "tradegood tradegood2" Then
                    Continue For
                End If
    
                ' Get the levels of mine and carpentry
                Dim mineLevel As Integer
                Integer.TryParse(tile.SelectSingleNode(".//div[contains(@class, 'tradegood2')]/span")?.InnerText, mineLevel)
                Dim carpentryLevel As Integer
                Integer.TryParse(tile.SelectSingleNode(".//div[contains(@class, 'ikaeasy-resource-wood')]/span")?.InnerText, carpentryLevel)
    
                ' Get the number of occupied slots
                Dim occupiedSlots As Integer
                Integer.TryParse(tile.SelectSingleNode(".//div[contains(@class, 'cities')]/text()")?.InnerText, occupiedSlots)
    
                ' Check if the island meets the criteria and add it to the list
                If occupiedSlots <= 7 Then
                    Dim island As New Island With {
                        .Coord = tile.GetAttributeValue("title", ""),
                        .MineLevel = mineLevel,
                        .CarpentryLevel = carpentryLevel,
                        .FreeSlots = 16 - occupiedSlots
                    }
                    islands.Add(island)
                End If
            Next
    
            ' Sort the islands by cave level (descending) and then by carpentry level (descending)
            islands = islands.OrderByDescending(Function(i) i.MineLevel).ThenByDescending(Function(i) i.CarpentryLevel).ToList()
    
            ' Print the results
            For Each island In islands
                Debug.Print("Island " & island.Coord & " mine: lvl " & island.MineLevel & " carpentry: lvl " & island.CarpentryLevel & " free slots: " & island.FreeSlots)
            Next
        End Sub
    End Class

  6. #6
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,102

    Re: Filtering html to get results

    Not hacking, exactly. Game cheats.

    The concern with parsing HTML is largely that it keeps on changing. Web developers that I've known seem unable to resist changing the HTML around seemingly just for fun. Add a <div>, change a <p>, alter a class name, and so forth. I'm not sure what the motivation is. It might just be that there are so MANY ways to skin that cat that web devs always have a new way they want to try.
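
    One way to make such changes visible instead of silent, sketched under assumptions: `ReadLevel` is a hypothetical helper, not part of the code posted above. It returns no value when the expected node is missing or is not a number, so a changed layout is not quietly reported as level 0 the way a bare Integer.TryParse would report it.

```vb
Imports HtmlAgilityPack

Module SafeLevels
    ' Hypothetical helper: reads an integer from the first node matching
    ' xpath below tile. Returns Nothing (no value) when the node is missing
    ' or its text is not a number, instead of defaulting to 0.
    Public Function ReadLevel(tile As HtmlNode, xpath As String) As Integer?
        Dim node As HtmlNode = tile.SelectSingleNode(xpath)
        If node Is Nothing Then Return Nothing
        Dim value As Integer
        If Integer.TryParse(node.InnerText.Trim(), value) Then Return value
        Return Nothing
    End Function
End Module
```

    A caller could then check HasValue on the result and log or skip any tile where a level could not be read, which points straight at the selector that stopped matching.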

    What's really nice about the HTML you are looking at is that it seems to be unusually clear and convenient. There are IDs on most things, and those IDs clearly follow a consistent, transparent format.
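
    That regularity can be used directly. The sketch below assumes every tile id looks like the "tile_12_10" in the sample and every title like "Nyeos [71:92]"; only that one sample is shown in the thread, so both patterns are guesses. `CoordFromTitle` and `IsTileId` are hypothetical helper names.

```vb
Imports System.Text.RegularExpressions

Module TileNames
    ' Hypothetical helper: pulls "71:92" out of a title such as "Nyeos [71:92]".
    ' Returns an empty string when the title has no [x:y] part.
    Public Function CoordFromTitle(title As String) As String
        Dim m As Match = Regex.Match(title, "\[(\d+):(\d+)\]")
        Return If(m.Success, $"{m.Groups(1).Value}:{m.Groups(2).Value}", "")
    End Function

    ' Hypothetical helper: True when an id has the shape "tile_<number>_<number>".
    Public Function IsTileId(id As String) As Boolean
        Return Regex.IsMatch(id, "^tile_\d+_\d+$")
    End Function
End Module
```

    With the title attribute read through GetAttributeValue("title", ""), CoordFromTitle would give the coordinates for the "Island [71:92] ..." output line requested in the first post.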
