Remove Garbage Characters
I have attached a file that contains just numbers, but when I load the file into the application and show its contents in a messagebox, I see garbage characters that were never in the file. I have also attached a picture of the messagebox. How can I remove these?
Re: Remove Garbage Characters
The characters aren't garbage; the file has been saved using UTF-8 encoding. In UTF-8 the first 128 characters (which include the digits and the basic English alphabet) are encoded exactly as in US-ASCII, while all other Unicode code points are encoded as sequences of two or more bytes.
Anyway, the first three bytes (EF BB BF, the byte order mark) are used to identify a UTF-8 encoded text file in Windows.
I guess in your case you can just ignore them (the data being just numbers and spaces), so if you are reading the whole file into a string at once, simply use Mid$(strFile, 4) to get rid of them. If you save the file again and the file originally came from another application, you will probably need to keep those three bytes at the beginning of the file.
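A minimal sketch of the Mid$ approach that only strips the three characters when they are really there. StripUtf8Bom is a hypothetical helper name, and the sketch assumes the file was read with VB6's ANSI file functions on a single-byte code page such as Windows-1252:

```vb
' Hypothetical helper (not part of VB6): removes a leading UTF-8 byte order
' mark from text that was read with the native ANSI file functions.
Public Function StripUtf8Bom(ByVal strFile As String) As String
    Dim strBom As String
    ' The bytes EF BB BF as they appear after ANSI-to-Unicode conversion.
    ' Assumption: a single-byte ANSI code page such as Windows-1252.
    strBom = Chr$(&HEF) & Chr$(&HBB) & Chr$(&HBF)
    If Left$(strFile, 3) = strBom Then
        StripUtf8Bom = Mid$(strFile, 4)   ' drop the three marker characters
    Else
        StripUtf8Bom = strFile            ' no marker found, return unchanged
    End If
End Function
```

Calling StripUtf8Bom(strFile) instead of Mid$(strFile, 4) means a file saved without the marker does not lose its first three digits.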
Re: Remove Garbage Characters
I see. The file is saved in UTF-8 because I was testing it in different formats. I have tried saving it as Unicode, but then I get "Runtime Error 62: Input Past End Of File". This is the code I use to read the file:
Dim fileNum As Integer
fileNum = FreeFile
Open App.Path & "\file.txt" For Input As #fileNum
FileCaption.Caption = Replace$(Input(LOF(fileNum), fileNum), vbCrLf, "")
Close #fileNum
Re: Remove Garbage Characters
The problem is that VB6's native file methods are not Unicode aware. One of the easiest ways to work around the issue is to read the file into a byte array:
Code:
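Dim fileNum As Integer, bytData() As Byte
fileNum = FreeFile
' Open in binary mode so that VB6 performs no ANSI conversion on the bytes
Open App.Path & "\file.txt" For Binary Access Read As #fileNum
' Size the buffer to the file length (assumes the file is not empty)
ReDim bytData(0 To LOF(fileNum) - 1)
Get #fileNum, , bytData
Close #fileNum
' A byte array converts to a String byte for byte, so UTF-16 data stays intact
FileCaption.Caption = Replace(CStr(bytData), vbCrLf, vbNullString)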
However, any non-ANSI characters will not be displayed in this case, because VB6's native controls are not Unicode aware either. When the file is loaded this way, though, bytData and any string copied from bytData hold the true Unicode UTF-16 information.
Note that I removed the dollar sign from Replace because it is a special case: it always returns a String datatype, unlike Mid and many others.
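A file saved as "Unicode" normally starts with its own two-byte marker (FF FE), which ends up as the first character of the string when the bytes are copied. A minimal sketch of skipping it follows. Utf16BytesToString is a hypothetical helper name, and the sketch assumes a non-empty byte array holding UTF-16 little-endian data:

```vb
' Hypothetical helper (not part of VB6): copies UTF-16 LE bytes into a String
' and drops the leading byte order mark (bytes FF FE, character U+FEFF).
Public Function Utf16BytesToString(bytData() As Byte) As String
    Dim strText As String
    strText = bytData                     ' byte-for-byte copy, no conversion
    If Len(strText) > 0 Then
        ' AscW returns a signed Integer and &HFEFF is also an Integer literal,
        ' so both sides of the comparison evaluate to -257.
        If AscW(strText) = &HFEFF Then strText = Mid$(strText, 2)
    End If
    Utf16BytesToString = strText
End Function
```

With this, the last line of the code above could become FileCaption.Caption = Replace(Utf16BytesToString(bytData), vbCrLf, vbNullString).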
Re: Remove Garbage Characters
I get a Type Mismatch error, and it highlights the last line.
Re: Remove Garbage Characters
Change bytData() As String to bytData() As Byte
Re: Remove Garbage Characters
Quote:
Originally Posted by Merri
Change bytData() As String to bytData() As Byte
Merri, where is bytData() As String?
Re: Remove Garbage Characters
It was in post #4. I have since updated the code there, so it is no longer present and the code should work when pasted.