
Thread: How can I remove text clones from the hosts file?

  1. #1

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Question How can I remove text clones from the hosts file?

    I have a program which writes websites to the hosts file from a .txt file which has a list of websites, so that next time a website is searched, the connection is closed.

    It works great! However, because the program writes the websites to the hosts file on every run, the file keeps growing. I have a large list of websites, so if I keep writing them every time the application is run, too much storage will be used up.

    So I thought that removing the added lines until there are no more clones would be a good idea, and I tried this code:

    Code:
    Dim text As String = File.ReadAllText("C:\Windows\System32\drivers\etc\hosts")
            Dim index As Integer = text.IndexOf(line)
            If index >= 2 Then
                Dim delLine As Integer = 10
                Dim lines As List(Of String) = System.IO.File.ReadAllLines("C:\Windows\System32\drivers\etc\hosts").ToList
                lines.RemoveAt(delLine - 1) ' index starts at 0 
                MessageBox.Show("Clone Removed.")
            End If

    However, it doesn't work. The MessageBox shows up when the file has over 2 copies of the black-listed website, but it doesn't remove those new entries.

    How can I remove text clones from the hosts file (or a .txt file) so that only one copy of the original text is stored, and not just text being written all the time?

  2. #2
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    I see that you deleted this question over at Stack Overflow. Is it still relevant?

  3. #3

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    Yes, it is still unsolved. I meant to post on VBForums but thought Stack Overflow would be better.

  4. #4
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    vb.net Code:
    Dim filePath = "file path here"

    File.WriteAllLines(filePath, File.ReadLines(filePath).Distinct())
    Done!

  5. #5

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    It's still not removing any cloned text. This is what my hosts file looks like:


    Code:
    127.0.0.1       www.badwebsite.com
    
    127.0.0.1       www.badwebsite.com
    127.0.0.1       www.badwebsite.com

  6. #6
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    If you need to only remove duplicates from part of the file, do something like this:
    vb.net Code:
    'Assumes the file has at least 20 lines; with fewer, the sections below overlap.
    Dim allLines = File.ReadAllLines(filePath)
    Dim lineCount = allLines.Length

    'The first 10 lines are not touched.
    Dim section1Lines = allLines.Take(10).ToArray()

    'Remove duplicates from the middle of the file.
    Dim section2Lines = allLines.Skip(10).Take(lineCount - 20).Distinct().ToArray()

    'The last 10 lines are not touched.
    Dim section3Lines = allLines.Skip(lineCount - 10).ToArray()

    File.WriteAllLines(filePath, section1Lines.Concat(section2Lines).Concat(section3Lines))

  7. #7
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: How can I remove text clones from the hosts file?

    Rather than writing the duplicates to the file in the first place, check if they are in there before writing them.

    To check you can use the first two lines you did above, followed by something like: If index = -1 Then writeToFile(line)

    Alternatively you can use ReadAllLines (as you did above) and something like: If Not lines.Contains(line) Then writeToFile(line)
    (this will take slightly longer to run)
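
    A minimal sketch of that check-before-write idea, assuming the entry is compared as a whole line and must match exactly (spacing and case included). AppendIfMissing is a hypothetical helper name, not something from the original program, and writing to the real hosts file normally requires running with administrator rights.

    ```vbnet
    Imports System.IO
    Imports System.Linq

    Module HostsWriter
        ' Hypothetical helper: appends entry to the file only when no existing
        ' line is exactly equal to it. Returns True if the entry was written.
        Function AppendIfMissing(filePath As String, entry As String) As Boolean
            ' ReadAllLines reads the whole file and closes it before returning,
            ' so the file is free to be written to afterwards.
            Dim lines = If(File.Exists(filePath), File.ReadAllLines(filePath), New String() {})

            If lines.Contains(entry) Then Return False

            ' Assumes the file already ends with a line break; otherwise the
            ' new entry would be joined onto the last existing line.
            File.AppendAllLines(filePath, {entry})
            Return True
        End Function
    End Module
    ```

    Calling AppendIfMissing(filePath, line) for each website in the list would then leave existing entries alone instead of adding another copy on every run.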

  8. #8

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    I only want to remove 'line', not just text from random places in the file.

  9. #9
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    I had this first up:
    vb.net Code:
    File.WriteAllLines(filePath, File.ReadAllLines(filePath).Distinct())
    and then changed it to this:
    vb.net Code:
    File.WriteAllLines(filePath, File.ReadLines(filePath).Distinct())
    but that second one actually throws an exception because the file is still being read when it tries to write. The first is tried and tested and works as it should. If it doesn't work for you then you did it wrong.

  10. #10
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    Quote Originally Posted by Modulus View Post
    I only want to remove 'line', not just text from random places in the file.
    That doesn't make any sense.

  11. #11

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    Quote Originally Posted by jmcilhinney View Post
    That doesn't make any sense.
    'line' is the piece of text that is added to the hosts file. So, for example, line could be: 127.0.0.1 test.com

    So, in a simple way of explaining it, I want to remove 127.0.0.1 test.com

  12. #12
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: How can I remove text clones from the hosts file?

    How about instead of removing the copy that is currently in the file (so you can add it again), you simply avoid adding the duplicate copy of it?

    For suggestions on how to do that, see my previous post.

  13. #13

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    Could you leave me some code? I've read your post, and I understand what you're getting at. Some code would be really good thanks.

  14. #14
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    So you're saying, without actually saying, that you have some text in a variable and you want to remove lines matching that text from the file? In that case:
    vb.net Code:
    Dim filePath = "file path here"

    File.WriteAllLines(filePath, File.ReadAllLines(filePath).Where(Function(s) s <> line))
    If you want to, say, remove all but the first instance of matching lines:
    vb.net Code:
    Dim filePath = "file path here"
    Dim allLines = File.ReadAllLines(filePath)
    Dim firstMatchIndex = Array.IndexOf(allLines, line)
    Dim section1Lines = allLines.Take(firstMatchIndex + 1)
    Dim section2Lines = allLines.Skip(firstMatchIndex + 1).Where(Function(s) s <> line)

    File.WriteAllLines(filePath, section1Lines.Concat(section2Lines))
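
    The snippets above compare lines exactly, so an entry that differs only in spacing or letter case is not treated as a duplicate. The following is a rough sketch of a more tolerant comparison, not a tested replacement. It assumes that entries are separated by spaces or tabs, that host names can be compared ignoring case, and that blank lines and comment lines (starting with #) should be kept untouched. DedupeEntries is a hypothetical name.

    ```vbnet
    Imports System
    Imports System.Collections.Generic
    Imports Microsoft.VisualBasic

    Module HostsDedupe
        ' Hypothetical helper: keeps the first occurrence of each entry, comparing
        ' entries after collapsing runs of spaces/tabs and ignoring case.
        Function DedupeEntries(lines As IEnumerable(Of String)) As List(Of String)
            Dim seen As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
            Dim result As New List(Of String)
            Dim separators = New Char() {" "c, ControlChars.Tab}

            For Each current In lines
                Dim trimmed = current.Trim()

                ' Blank lines and comments are passed through unchanged.
                If trimmed.Length = 0 OrElse trimmed.StartsWith("#") Then
                    result.Add(current)
                    Continue For
                End If

                ' Build a normalized key so "127.0.0.1    a.com" matches "127.0.0.1 a.com".
                Dim parts = trimmed.Split(separators, StringSplitOptions.RemoveEmptyEntries)
                Dim key = String.Join(" ", parts)

                ' HashSet.Add returns False when the key has been seen before.
                If seen.Add(key) Then result.Add(current)
            Next

            Return result
        End Function
    End Module
    ```

    It could be combined with the pattern already shown, for example File.WriteAllLines(filePath, DedupeEntries(File.ReadAllLines(filePath))).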

  15. #15
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    Quote Originally Posted by Modulus View Post
    Could you leave me some code? I've read your post, and I understand what you're getting at. Some code would be really good thanks.
    Or, you make an attempt and then we can help you fix your code if it doesn't work? You're allowed to try and fail. It's part of learning.

  16. #16
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: How can I remove text clones from the hosts file?

    Indeed it is... and all of the code you need (except for adding the new line to the file, which I assume you can do already) is either in post #7, or the parts of post #1 it mentions.

  17. #17

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    Still not deleting. Maybe there's a way to set a limit for the amount of times certain text is added?

  18. #18
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: How can I remove text clones from the hosts file?

    What code did you use?

    Were there any errors? (or anything else we should know)

  19. #19

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    Nope, no errors. I used this code:


    Code:
    If index >= 2 Then
                MessageBox.Show("Test.")
                Dim filePath = "C:\Windows\System32\drivers\etc\hosts"
                Dim allLines = File.ReadAllLines(filePath)
                Dim firstMatchIndex = Array.IndexOf(allLines, line)
                Dim section1Lines = allLines.Take(firstMatchIndex + 1)
                Dim section2Lines = allLines.Skip(firstMatchIndex + 1).Where(Function(s) s <> line)
    
                File.WriteAllLines(filePath, section1Lines.Concat(section2Lines))
    End If

  20. #20
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    Quote Originally Posted by Modulus View Post
    Still not deleting.
    Then you did it wrong. We can't know what you did wrong if we don't know what you did. Please put some thought into your posts.

  21. #21

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    I'm pasting in the code you've told me to!

  22. #22
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    Quote Originally Posted by Modulus View Post
    Nope, no errors. I used this code:


    Code:
    If index >= 2 Then
                MessageBox.Show("Test.")
                Dim filePath = "C:\Windows\System32\drivers\etc\hosts"
                Dim allLines = File.ReadAllLines(filePath)
                Dim firstMatchIndex = Array.IndexOf(allLines, line)
                Dim section1Lines = allLines.Take(firstMatchIndex + 1)
                Dim section2Lines = allLines.Skip(firstMatchIndex + 1).Where(Function(s) s <> line)
    
                File.WriteAllLines(filePath, section1Lines.Concat(section2Lines))
    End If
    And did you actually debug that? What was the value of 'index'? I just tested the code I posted in post #14 and it worked perfectly for me. I copied the data you posted in post #5 and the first code snippet deleted all three matching lines while the second code snippet left the first behind and deleted the last two.

  23. #23

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    My bad, I only tried the second code sample you provided, not the first of #14. Will try now.

  24. #24
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    Quote Originally Posted by Modulus View Post
    My bad, I only tried the second code sample you provided, not the first of #14. Will try now.
    Whether you should use either of them or something else similar depends on exactly what you want. They were intended as examples that you can modify as required. As I said, the first one will remove every line from the file that matches the value you specify, while the second will remove all but the first instance. If you don't want either of those things exactly then you'll need to make modifications as required.

    I should also point out that, while I have been answering the question that you seemed to be asking, I have to agree with si that prevention is generally better than cure, i.e. it would generally be better to not add duplicates at all rather than try to remove them after the fact. Is there any reason that you can't do that?

  25. #25

    Thread Starter
    Member
    Join Date
    Jun 2018
    Posts
    51

    Re: How can I remove text clones from the hosts file?

    I probably could, but I'm just not 100% sure how to.

  26. #26
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: How can I remove text clones from the hosts file?

    Who says that you have to be 100% sure about something to make an attempt to do it? si has given you pretty much all the information you need. Take some time to give it some thought and effort. The fact that you don't know immediately exactly what code to write is no reason to not bother and wait for someone else to write the code for you. The best way to learn is to do. We can help you if you run into an actual issue but if you don't try then you'll never know. We're more inclined to help those who help themselves.
