-
Oct 10th, 2000, 07:36 AM
#1
Thread Starter
Member
Hello friends...
I have an urgent problem: I'm trying to import a very large text file of 54+ MB(!!) into a String variable, but the machine crashes every time...
Does anyone know what the problem is? A String variable should normally be able to hold up to about 2 billion characters (according to the VB Help)...
PS. It works fine with large files of 10 to 20 MB.
PPS. The machines that need to process those files are Pentium IIIs with 128 MB of RAM...
Here's the code I use in my program:
====================================
Public Function ImportFile(iMessage As Integer, Optional ByVal fLatestFile As Boolean = False, _
Optional strFileName As String) As String
Open sFileName For Input As #1
ImportFile = Input(lLength, #1) ' -> Crash!
Close
====================================
lLength contains a value such as 55000000.
Is it crashing because it runs out of memory?
Please help me if you know a solution for this...
Thank You
Bart
-
Oct 10th, 2000, 07:47 AM
#2
transcendental analytic
Try Binary
Code:
Open sFileName For Binary As #1
Get #1, , ImportFile
Close #1
Also, if that doesn't work, a byte array takes half the space:
Code:
Dim buffer() As Byte
Open sFileName For Binary As #1
ReDim buffer(LOF(1) - 1) ' size the array first; Get fills exactly this many bytes
Get #1, , buffer
Close #1
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Oct 10th, 2000, 09:02 AM
#3
Thread Starter
Member
Well, it doesn't import anything... the string (ImportFile) is empty. Possibly because the Get statement expects the data to be formatted as some sort of record structure. The files that I need to import contain just one long string...
So I need to use this code again...
ImportFile = Input(lLength, #1)
-> Crash
pfff
I really don't know how to solve this one...
-
Oct 10th, 2000, 09:18 AM
#4
transcendental analytic
let's try again
Code:
Open sFileName For Binary As #1
ImportFile = Space(LOF(1))
Get #1, , ImportFile
Close #1
-
Oct 10th, 2000, 10:03 AM
#5
Thread Starter
Member
Oh my...
nope, doesn't work either.
The systems crash here
-> Importfile=Space(lof(1))
Do You have another suggestion?
By the way, thanks for the help so far.
-
Oct 10th, 2000, 10:22 AM
#6
Frenzied Member
hey KEDAMAN! nice to see you again man!!!
Balip, I'm just guessing, but try this API thingy... use only the ReadFile API to read the file into a byte array.
I'm not sure if it works for you. You don't need the other code, but it may be handy sometime.
Code:
Const MOVEFILE_REPLACE_EXISTING = &H1
Const FILE_ATTRIBUTE_TEMPORARY = &H100
Const FILE_BEGIN = 0
Const FILE_SHARE_READ = &H1
Const FILE_SHARE_WRITE = &H2
Const CREATE_NEW = 1
Const OPEN_EXISTING = 3
Const GENERIC_READ = &H80000000
Const GENERIC_WRITE = &H40000000
Private Declare Function SetVolumeLabel Lib "kernel32" Alias "SetVolumeLabelA" (ByVal lpRootPathName As String, ByVal lpVolumeName As String) As Long
Private Declare Function WriteFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToWrite As Long, lpNumberOfBytesWritten As Long, ByVal lpOverlapped As Any) As Long
Private Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As Any) As Long
Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileA" (ByVal lpFileName As String, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Any, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long
Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
Private Declare Function SetFilePointer Lib "kernel32" (ByVal hFile As Long, ByVal lDistanceToMove As Long, lpDistanceToMoveHigh As Long, ByVal dwMoveMethod As Long) As Long
Private Declare Function SetFileAttributes Lib "kernel32" Alias "SetFileAttributesA" (ByVal lpFileName As String, ByVal dwFileAttributes As Long) As Long
Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile As Long, lpFileSizeHigh As Long) As Long
Private Declare Function GetTempFileName Lib "kernel32" Alias "GetTempFileNameA" (ByVal lpszPath As String, ByVal lpPrefixString As String, ByVal wUnique As Long, ByVal lpTempFileName As String) As Long
Private Declare Function MoveFileEx Lib "kernel32" Alias "MoveFileExA" (ByVal lpExistingFileName As String, ByVal lpNewFileName As String, ByVal dwFlags As Long) As Long
Private Declare Function DeleteFile Lib "kernel32" Alias "DeleteFileA" (ByVal lpFileName As String) As Long
Private Sub Form_Load()
'KPD-Team 1998
'URL: http://www.allapi.net/
'E-Mail: [email protected]
Dim sSave As String, hOrgFile As Long, hNewFile As Long, bBytes() As Byte
Dim sTemp As String, nSize As Long, Ret As Long
'Ask for a new volume label
sSave = InputBox("Please enter a new volume label for drive C:\" + vbCrLf + " (if you don't want to change it, leave the textbox blank)")
If sSave <> "" Then
SetVolumeLabel "C:\", sSave
End If
'Create a buffer
sTemp = String(260, 0)
'Get a temporary filename
GetTempFileName "C:\", "KPD", 0, sTemp
'Remove all the unnecessary chr$(0)'s
sTemp = Left$(sTemp, InStr(1, sTemp, Chr$(0)) - 1)
'Set the file attributes
SetFileAttributes sTemp, FILE_ATTRIBUTE_TEMPORARY
'Open the files
hNewFile = CreateFile(sTemp, GENERIC_WRITE, FILE_SHARE_READ Or FILE_SHARE_WRITE, ByVal 0&, OPEN_EXISTING, 0, 0)
hOrgFile = CreateFile("c:\config.sys", GENERIC_READ, FILE_SHARE_READ Or FILE_SHARE_WRITE, ByVal 0&, OPEN_EXISTING, 0, 0)
'Get the file size
nSize = GetFileSize(hOrgFile, 0)
'Set the file pointer
SetFilePointer hOrgFile, Int(nSize / 2), 0, FILE_BEGIN
'Create an array of bytes
ReDim bBytes(1 To nSize - Int(nSize / 2)) As Byte
'Read from the file
ReadFile hOrgFile, bBytes(1), UBound(bBytes), Ret, ByVal 0&
'Check for errors
If Ret <> UBound(bBytes) Then MsgBox "Error reading file ..."
'Write to the file
WriteFile hNewFile, bBytes(1), UBound(bBytes), Ret, ByVal 0&
'Check for errors
If Ret <> UBound(bBytes) Then MsgBox "Error writing file ..."
'Close the files
CloseHandle hOrgFile
CloseHandle hNewFile
'Move the file
MoveFileEx sTemp, "C:\KPDTEST.TST", MOVEFILE_REPLACE_EXISTING
'Delete the file
DeleteFile "C:\KPDTEST.TST"
Unload Me
End Sub
Jop - validweb.nl
Alcohol doesn't solve any problems, but then again, neither does milk.
-
Oct 10th, 2000, 10:50 AM
#7
transcendental analytic
Hi there Jop! 
Balip, maybe you actually ran out of memory and there was not enough hard disk space to swap to either?!? Did you get any error messages or did it simply crash VB? Did Jop's API code work?
-
Oct 11th, 2000, 02:08 AM
#8
Thread Starter
Member
Goooooooooood Mor-ning Jop and Kedaman!!! ;-)
Well guys... I tried Kedaman's code this evil morning, it worked (didn't crash, OLE!) but what do I have to do with the result in the byte array?
I need to parse the content of the file as a whole, so I need one string that I can search in... can I convert the byte array some way or another so I get what I want? (Pleaszzzzze say yes?!)
Jop, it could be that I ran out of memory...
The systems crash when VB is putting the imported data from the file into the string. This takes a very long time and the hard disk lights are flashing heavily... then, from one moment to the next, the system hangs (when I'm in debug mode; if I make an EXE of it the application just crashes but the system doesn't).
I really think the systems I need to work with here are up to date: 128MB, 1+ GB of free hard disk space, Pentium III etc.
Do you guys have other ideas???? The people who hired me are going to kill me if they find out that my backup program can't parse their extremely large text files...
-
Oct 11th, 2000, 02:35 AM
#9
Frenzied Member
Hmm... I tried importing a 25 MB file in a string, Open file for binary....
It didn't actually crash my system, but it just took too long (5+ minutes), so I decided to CTRL+ALT+DELETE/kill my app
But I have an idea, why don't you sell your prog as a Winamp plugin? it sounded great hehe just ship a 100 MB file, read it and winamp is acting cool 
No, but seriously now.
You need it for a backup program? Why would it need a 50 MB TEXT file? Can't you split it in small parts?
If you really need that big file, read it in small chunks...
Or use C++/Assembler.
< Jop's wondering how progs like mediaplayer open their very big MPG files without even slowing down >
< Jop suddenly remembers, Buffering! >
< Jop still doesn't know why the harddisk isn't that busy then >
I'm kinda wondering now, how the hell do they do that?
Have fun anyway 
-
Oct 11th, 2000, 04:10 AM
#10
Thread Starter
Member
-
Oct 11th, 2000, 09:19 AM
#11
Frenzied Member
You *can* read it into a byte array? Then this code works for me:
Code:
'After loading in Byte Array B()
Dim x As Long, s As String
For x = LBound(B) To UBound(B)
    s = s & Chr$(B(x)) ' Chr$ turns the byte value into its character
Next x
Hope that helps!!!
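Appending one character per loop iteration reallocates the string every time, which gets very slow for millions of bytes. As a minimal sketch (assuming the file holds single-byte ANSI text and that the byte array was filled by a binary Get as earlier in the thread; `BytesToString` is a hypothetical helper name, not something from this thread), `StrConv` with `vbUnicode` converts the whole array in one call:

```vb
' Hypothetical helper: convert an ANSI byte array to a VB String in one call.
' Assumes b() is a dynamic array that has already been dimensioned and filled.
Public Function BytesToString(b() As Byte) As String
    ' vbUnicode widens each ANSI byte to a 2-byte character using the
    ' system code page, so the result needs about twice the array's memory.
    BytesToString = StrConv(b, vbUnicode)
End Function
```

It would be called as `sText = BytesToString(B)`. Note that the resulting string still has to fit in memory next to the array, so this does not remove the memory limit discussed above.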
-
Oct 12th, 2000, 04:54 AM
#12
transcendental analytic
-
Oct 12th, 2000, 10:10 AM
#13
Thread Starter
Member
Well Guys,
I made it work 
Apparently VB can't read one string of more than 40,000,000 characters in one go (pff)
So I opened the file in MultiEdit (very good text editor!) and saved it under the same name... When I opened the file again, MultiEdit had added some line feeds/line breaks to it and guess what... VB can import the file!
In VB I used this code to import it, because the system keeps crashing when you try to import the data as a whole like I did before (see the code I typed before)...
==========
Public Function ImportFile(...) as String
...
Open sFileName For Binary As #1
On Error GoTo ErrorHandler
While Not EOF(1)
    Line Input #1, strTmp
    strTmpData = strTmpData & strTmp
    If Len(strTmpData) >= 2000000 Then
        If Len(strFileData) >= 20000000 Then
            ' Flush the medium buffer first, so the current chunk is not lost
            ImportFile = ImportFile & strFileData
            strFileData = ""
        End If
        strFileData = strFileData & strTmpData
        strTmpData = ""
    End If
Wend
If strTmpData <> "" Then
    strFileData = strFileData & strTmpData
End If
ImportFile = ImportFile & strFileData
strFileData = ""
strTmpData = ""
strTmp = ""
Close
...
End Function
============
Maybe you guys know something to speed it up? Right now it takes 10 to 15 minutes to import the data...
OK, this worked, but now I'm suffering from the fact that the processing of the string (search processes = InStr()) slowed down so much that you could write an MP3 song by hand and still be finished before it is!
But, lucky me, their normal system (PDMAIN = IBM/Oracle) also crashed due to the heavy load of data from those files (they grew again to +/- 65MB!) and maybe they are going to cut those files into pieces or remove a lot of data from them...
So, thanks for all your help, someday my program will run like it was meant to
Bye,
Bart
-
Oct 12th, 2000, 11:39 AM
#14
transcendental analytic
balip, don't use Line Input. If you want my advice, use Binary and get it in chunks instead.
Did you try out StrConv?
-
Oct 13th, 2000, 06:35 AM
#15
Thread Starter
Member
-
Oct 13th, 2000, 08:25 AM
#16
Frenzied Member
To replace the buffer:
Code:
Importfile=StrConv(Importfile, vbFromUnicode)
Does that help?
hehe and Line Input does just what it says: it reads the file line by line, so it's a bit slow.
And the Binary thing gets X bytes at a time.
-
Oct 13th, 2000, 10:29 AM
#17
transcendental analytic
Code:
Dim buffer As String, Importfile As String, chunksize As Long
chunksize = 65536
buffer = Space(chunksize)
Open strFile For Binary As #1
' Read full chunks while more than one chunk remains
Do While LOF(1) - Loc(1) > chunksize
    Get #1, , buffer
    Importfile = Importfile & buffer
Loop
' Read whatever is left (at most one chunk)
buffer = Space(LOF(1) - Loc(1))
If Len(buffer) Then
    Get #1, , buffer
    Importfile = Importfile & buffer
End If
Close #1
Binary chunk reading is a bit more complicated than that: you read small parts, in this case 65536 bytes per chunk, and then add them up in Importfile. That avoids one huge buffer, which may be what caused your problem. As Jop explained, Line Input is much slower, because it has to check each byte for a linefeed to know where to cut off the read, and it has to format the data for the variable that you read into. Binary reading just gets the raw data from the file directly into the variable's memory.
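One thing the loop above still does is grow Importfile with `&` on every chunk, which copies the whole string each time. A possible variation, sketched here under the assumption of single-byte ANSI text (`ReadFileChunked` and its parameter names are hypothetical), allocates the result once and copies each chunk into place with the `Mid$` statement:

```vb
' Sketch: read a file in fixed-size chunks into a preallocated String.
' The result still needs about 2 bytes per character of memory.
Public Function ReadFileChunked(ByVal sPath As String, _
        Optional ByVal lChunk As Long = 65536) As String
    Dim ff As Integer
    Dim sResult As String, sBuf As String
    Dim lPos As Long, lLeft As Long

    ff = FreeFile
    Open sPath For Binary Access Read As #ff
    lLeft = LOF(ff)
    sResult = Space$(lLeft)          ' one allocation for the whole result
    lPos = 1
    Do While lLeft > 0
        If lLeft < lChunk Then lChunk = lLeft
        If Len(sBuf) <> lChunk Then sBuf = Space$(lChunk)
        Get #ff, , sBuf              ' reads Len(sBuf) bytes
        Mid$(sResult, lPos, lChunk) = sBuf   ' copy in place, no regrowing
        lPos = lPos + lChunk
        lLeft = lLeft - lChunk
    Loop
    Close #ff
    ReadFileChunked = sResult
End Function
```

This only reduces the copying; it does not reduce the peak memory needed for the final string, so on a 128 MB machine a 55 MB file may still be too large to hold as one String.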
-
Oct 14th, 2000, 05:33 PM
#18
New Member
Hi,
interesting reading...
I ran into a problem with a little app and am now searching for quicker ways to open a text file, split every line, search and replace some Swedish characters, and finally put it in a text box.
Right now it takes minutes to open a file of 89 kB....
Any suggestions?
-
Oct 14th, 2000, 05:59 PM
#19
transcendental analytic
Open the file in Binary, get the data into one string, use Replace to replace the words, then put it in the textbox. Why do you need to split them anyway?
-
Oct 14th, 2000, 06:31 PM
#20
New Member
I'm trying to open it binary, but I only get the first 65536 characters, so it seems like the buffersize can't be bigger.
I'm using this code:
////
ff = FreeFile
Open strFile For Binary As #ff
buffer = Space(LOF(ff)) ' use the same file number that was opened
Text2.Text = LOF(ff) 'just for control use
Get #ff, , buffer
Close #ff
Text1.Text = buffer
////
In Text2 I get a value > 65536.
Why do I want to split every line?
Well, it's an app that converts text files...
Original file example:
Name: Doe John
Phone: 555-1234
--
Name: Doe Jane
Phone: 555-7890
--
And I need it to look like this:
John;Doe;555-1234
Jane;Doe;555-7890
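For illustration only, the conversion described above could be sketched like this in VB6. It assumes the whole file is already in one string, that every record has exactly the "Name: Last First" and "Phone: ..." lines shown, and that lines end in CR/LF or LF; `ConvertRecords` is a hypothetical name:

```vb
' Sketch: turn "Name: Doe John" / "Phone: 555-1234" records
' into "John;Doe;555-1234" lines.
Public Function ConvertRecords(ByVal sText As String) As String
    Dim vLines As Variant, i As Long, lSp As Long
    Dim sLine As String, sFirst As String, sLast As String, sOut As String

    ' Normalise line ends, then split once
    vLines = Split(Replace(sText, vbCrLf, vbLf), vbLf)
    For i = LBound(vLines) To UBound(vLines)
        sLine = Trim$(vLines(i))
        If Left$(sLine, 5) = "Name:" Then
            sLine = Trim$(Mid$(sLine, 6))
            lSp = InStr(sLine, " ")
            If lSp > 0 Then
                sLast = Left$(sLine, lSp - 1)
                sFirst = Trim$(Mid$(sLine, lSp + 1))
            Else
                sLast = sLine
                sFirst = ""
            End If
        ElseIf Left$(sLine, 6) = "Phone:" Then
            ' A phone line closes the record; "--" separators are ignored
            sOut = sOut & sFirst & ";" & sLast & ";" & _
                   Trim$(Mid$(sLine, 7)) & vbCrLf
        End If
    Next i
    ConvertRecords = sOut
End Function
```

For a file of 89 kB the repeated `&` on sOut should not matter much; for much larger inputs the preallocation ideas from earlier in the thread would apply.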
-
Oct 14th, 2000, 06:43 PM
#21
transcendental analytic
Well, replace the linefeed with ; instead then.
No, the textbox can't contain more than 65536 bytes, and no null chars are allowed either. You do the split after replacing.
-
Oct 14th, 2000, 07:36 PM
#22
New Member
OK, then I have to use a RichTextBox.
I now show the file after the replace but before the split, and it opens in seconds.
My only excuse is that in Sweden it's 2:30 am right now...
Thanx for the help kedaman
-
Oct 14th, 2000, 07:44 PM
#23
transcendental analytic
Ah, well it's almost 4 AM here in Finland, how does one actually stay awake?
-
Oct 14th, 2000, 07:56 PM
#24
New Member
-
Oct 15th, 2000, 12:01 AM
#25
Hyperactive Member
Better late than never?
I know this thread is approaching the end, but it is
interesting to me so I thought I'd look into it next week
during any spare time. It may amount to nothing, but
in case I find a solution along the path I am going to
investigate, I will let you know.
I am interested in researching "Memory Mapped Files". To
quote Dan Appleman from his fine book, "There is really no
difference between a file and memory. Ah, I know what
you’re thinking: Surely such a statement is the product of
a hallucination. We all know that these are two different
things.
All of the material presented here is copyrighted by either
Desaware or Macmillan. No part of this material may be used
or reproduced in any fashion (except in brief quotations
used in critical articles and reviews) without prior
consent."
So you see, there may in fact be no need to open the file
and read the string at all. Since the file exists on disk,
and you may wish to open a file up to a size larger than
your available RAM (including swap) then it makes sense to
try this set of API calls.
If anyone can beat me to it, then post here so I don't
run around re-inventing the wheel.
Thanks
-
Oct 15th, 2000, 05:06 AM
#26
transcendental analytic
Paul, you're right, variables are just connected to RAM by the Virtual Machine and to the harddisk by virtual memory, but I've never heard about Memory Mapped Files.
Hmm, also I think it actually reads the file into memory every time you read the swap file, and that's why it's called "swap", or isn't it?
[Edited by kedaman on 10-15-2000 at 06:09 AM]
-
Oct 15th, 2000, 05:14 AM
#27
Hyperactive Member
OK, done what I wanted to do
I know I said "next week when I get some spare time", but
there were no good movies on TV so I had a little tinker
with the Memory Mapped Files.
Where I think they will be useful (in general) is:
Speed: Less than 100ms to map the file as memory.
Sharing: Different applications can share the memory
(which happens to be stored in a file).
Where I think it could help those applications where large
files are involved is that you would not need to load the
entire file into memory. This is a bad idea in general
unless you know that your file is always less than a
certain limit. But even then, what if your file is 1GB in
size. Will you ensure you have enough memory (Real and
virtual) to load the file? I think not. I also seriously
doubt that your user would enjoy waiting for a 1GB file to
be read if all he wanted to do was to search for a sequence
of characters.
So, if the only reason balip (or whoever else is
interested) loads the file into memory (in a string) is to
use InStr, then I suspect that a far far better idea is to
use the memory mapped file idea, write or find a dll that
allows you to find a byte sequence in a given memory range
(in fact I am 100% certain this will be built in to
win32API - will take a look soon). Another function needed
is to extract a string from between two memory addresses
(like Mid does already).
Even if you don't do all this, you need to consider using a
byte array for your project instead of building a string.
This is simply because a string of say 1000 characters is
stored internally as unicode which makes it 2000 bytes.
You immediately save 50% of your storage space by using a
byte array instead of a string. The only sacrifice is that
you have less access to some pre-defined VB string handling
code.
If anyone is interested in the sample project I wrote then
email me at [email protected]. This sample is very
simple and uses the api calls to tell VB that a disk file
is REALLY some memory belonging to my application.
The lessons I followed to learn this technique are from Dan
Appleman's Win32 API book which I think every VB developer
must have (or have access to).
Regards
-
Oct 15th, 2000, 05:59 AM
#28
Hyperactive Member
No - Only reads as it needs
I am fairly certain of this fact. The memory mapped file
is only read if someone (that would be me) tries to read
some of the contents of the memory. Using CopyMemory is
the way to go about accessing the memory (and hence the
file is read).
I might continue to look if there is a better approach than
this to the question. If the file needs to be parsed
several times from start to finish then this approach while
it will not need much memory, might end up slower than
another approach.
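As a rough sketch of the API calls involved (an assumption about how such code might look, not the sample project mentioned in this thread), a read-only mapping in VB6 could be written as below. `ReadMappedText` is a hypothetical helper; the Declare statements use the standard kernel32 names with pointer arguments passed as `ByVal Long`, and the file is assumed to be ANSI text smaller than 2 GB:

```vb
Option Explicit
Private Const GENERIC_READ As Long = &H80000000
Private Const FILE_SHARE_READ As Long = &H1
Private Const OPEN_EXISTING As Long = 3
Private Const PAGE_READONLY As Long = &H2
Private Const FILE_MAP_READ As Long = &H4
Private Const INVALID_HANDLE_VALUE As Long = -1

Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileA" (ByVal lpFileName As String, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long
Private Declare Function CreateFileMapping Lib "kernel32" Alias "CreateFileMappingA" (ByVal hFile As Long, ByVal lpAttributes As Long, ByVal flProtect As Long, ByVal dwMaximumSizeHigh As Long, ByVal dwMaximumSizeLow As Long, ByVal lpName As String) As Long
Private Declare Function MapViewOfFile Lib "kernel32" (ByVal hFileMappingObject As Long, ByVal dwDesiredAccess As Long, ByVal dwFileOffsetHigh As Long, ByVal dwFileOffsetLow As Long, ByVal dwNumberOfBytesToMap As Long) As Long
Private Declare Function UnmapViewOfFile Lib "kernel32" (ByVal lpBaseAddress As Long) As Long
Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)

' Returns up to lCount bytes starting at 0-based byte offset lOffset as a String.
Public Function ReadMappedText(ByVal sPath As String, ByVal lOffset As Long, _
        ByVal lCount As Long) As String
    Dim hFile As Long, hMap As Long, pView As Long, pSrc As Long
    Dim lSize As Long, b() As Byte

    lSize = FileLen(sPath)           ' full length: the last byte is lSize - 1
    If lOffset < 0 Or lCount <= 0 Or lOffset >= lSize Then Exit Function
    If lCount > lSize - lOffset Then lCount = lSize - lOffset

    hFile = CreateFile(sPath, GENERIC_READ, FILE_SHARE_READ, 0&, OPEN_EXISTING, 0&, 0&)
    If hFile = INVALID_HANDLE_VALUE Then Exit Function
    hMap = CreateFileMapping(hFile, 0&, PAGE_READONLY, 0&, 0&, vbNullString)
    If hMap <> 0 Then
        pView = MapViewOfFile(hMap, FILE_MAP_READ, 0&, 0&, 0&) ' 0 = map whole file
        If pView <> 0 Then
            ReDim b(0 To lCount - 1)
            pSrc = pView + lOffset   ' assumes the sum stays within a signed Long
            CopyMemory b(0), ByVal pSrc, lCount   ' pages are read on first access
            ReadMappedText = StrConv(b, vbUnicode)
            UnmapViewOfFile pView
        End If
        CloseHandle hMap
    End If
    CloseHandle hFile
End Function
```

Only the pages touched by CopyMemory should actually be read from disk, which matches the behaviour described above; a real implementation would keep the mapping open between calls instead of remapping every time.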
Cheers
-
Oct 15th, 2000, 06:57 PM
#29
Hyperactive Member
Some results from my testing
OK. I made some big claims about speed and how to load large files into memory or alternatives.
Now I have made a dll (only a VB one mind you) and I have
tested it against InStr.
InStr blows it away as far as speed to find text at the end
of the 8MB file I was using as my sample. I used a VB-
World html page which I then appended to itself multiple
times. Then I added a text string to the end of the 8MB
file.
The purpose was to determine if the overhead in time to
load the file in as a string (not to mention memory limits
or anything else) was worth it.
In my tests, I found that with the 8MB string, InStr
returned the position within 600ms to 1300ms (it varied - I
guess the heap conditions affect it quite a bit). My DLL
which could possibly be tweaked quite a bit, managed a
consistent 1500ms . The tweaking might reduce it to under
1000ms if I am lucky.
So, for each InStr, I would lose by up to 900ms or a factor
of 2.5. I suspect that the factor is the way to measure
the relative difference.
The time taken for me to load the 8MB file in as a string
was around 12s (12000ms). So to save time overall using
InStr, I would need somewhere between about 13 and 60 InStr
calls in the code (12000ms divided by the 200ms to 900ms
saved per call).
Now, if I can find an easy way (and I think the FoldString
API does this) to convert a byte array into a VB String
(not a simple translation I fear), then this payoff will be
a great deal lower.
Time to extract a portion of the string using my DLL or MID
were negligible until I started trying for huge return
strings (like 2MB or so). When trying for 4MB, the built
in Mid performed in < 80ms whereas my DLL did it in
5000ms. However the reason here is again in the conversion
to a VB String which can be sped up greatly once I find the
DLL.
More tests should be performed using the same techniques I
use except instead of using the Memory Mapped File, use
normal VB code to binary load a chunk of the file at a
time. I would assume that this would not be as fast but it
is yet to be seen.
Conclusion so far for me is that it is worth investigating
further because of the possibility of a programmer wanting
to deal with a string larger than the available RAM. As
soon as this limit is reached, any programmer would need to
look for another way of dealing with "Strings".
Anyhow - I'm off to do more research.
Cheers
P.S. If this load of drivel is too boring for the thread, let me know and I'll stop posting
-
Oct 15th, 2000, 08:27 PM
#30
Hyperactive Member
Ha Ha
Well I should have read properly the previous posts because
the guys (kedaman and others) pointed out the StrConv
utility which I glossed over. I figured it couldn't be
that easy to convert a byte array to a Unicode String..hehe
So now the modified code runs like this:
Whereas it was taking 12000ms to load the 8MB string, it
now takes 2200ms.
So this means I'd now have to perform only about 3 to 11
InStr commands on the loaded string (2200ms divided by the
200ms to 900ms saved per call) instead of using my DLL in
order to start winning in the time stakes. Much more
acceptable.
Also, the modified DLL version of Mid takes about 250ms
instead of the 5000ms it used to... StrConv really rocks.
By implementing a different method in my DLL (using Instr
but on a chunk of the source string at a time), I get
speeds of between 1000ms to 1300ms (instead of the
consistent 1500ms). The variation is due to my using InStr
again since it has proven its worth to me (hehe). So now,
the number of InStr operations needed to beat the Memory
Mapped File and byte array is about 3 at best (2200ms /
700ms), and at worst InStr never catches up, because the
two search times overlap.
Conclusion:
By not loading the file into memory at all and only loading
chunks of it at a time, there is potential to cut load
times for applications loading huge strings from a file by
a very large fraction of the original time. Instead of
minutes of wait time just to load the string, you are able
to load only the parts you need for the operation you are
performing. For example, suppose the String you want
happens to be inside the first 1KB of a 8MB string? The
memory mapped method will only need to access the disk for
the first chunk of data (in my case I use 64000 bytes per
chunk).
I also found that if the chunk size was increased, (640kb
for example) the time for the dll to find the string
reduced drastically as well. This is to be expected of
course.
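The chunk-at-a-time search described above can be sketched with plain VB file I/O instead of a memory mapped file. This is only an illustration under stated assumptions (single-byte ANSI text, case-sensitive search; `FindInFile` is a hypothetical name). The one subtle point is keeping the last Len(sFind) - 1 characters of each window so a match that straddles two chunks is not missed:

```vb
' Sketch: find the first occurrence of sFind in a file without loading it all.
' Returns the 1-based character position, or 0 if not found (like InStr).
Public Function FindInFile(ByVal sPath As String, ByVal sFind As String, _
        Optional ByVal lChunk As Long = 64000) As Long
    Dim ff As Integer
    Dim sBuf As String, sTail As String, sWin As String
    Dim lLeft As Long, lBase As Long, lHit As Long, lKeep As Long

    If Len(sFind) = 0 Then Exit Function
    ff = FreeFile
    Open sPath For Binary Access Read As #ff
    lLeft = LOF(ff)
    lBase = 0                        ' characters of the file that precede sWin
    Do While lLeft > 0
        If lLeft < lChunk Then lChunk = lLeft
        sBuf = Space$(lChunk)
        Get #ff, , sBuf
        lLeft = lLeft - lChunk
        sWin = sTail & sBuf          ' carried-over tail plus the new chunk
        lHit = InStr(1, sWin, sFind, vbBinaryCompare)
        If lHit > 0 Then
            FindInFile = lBase + lHit
            Exit Do
        End If
        ' Keep just enough of the end to catch a match across the boundary
        lKeep = Len(sFind) - 1
        If lKeep > Len(sWin) Then lKeep = Len(sWin)
        sTail = Right$(sWin, lKeep)
        lBase = lBase + Len(sWin) - lKeep
    Loop
    Close #ff
End Function
```

If the text being searched for sits in the first chunk, only that chunk is read from disk, which is the saving the conclusion above points at.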
I'll put together a new version of my sample for those that
showed an interest.
Regards
-
Oct 16th, 2000, 05:43 AM
#31
Thread Starter
Member
-
Oct 16th, 2000, 06:07 AM
#32
Fanatic Member
We have been over the whole chunk reading thing already. Check it out here
Iain, that's with an i by the way!
-
Oct 16th, 2000, 10:11 AM
#33
transcendental analytic
Code:
Private Function FindStr(String1 As String) As Long
' Simple method to find the position of one sequence of bytes
' (representing a string) in myBytes. Returns the 0-based index, or -1.
' need to discover an existing dll in win32api that does this already
' just pass the dll the two byte arrays and the two array lengths,
' and it should return the position. THIS MUST ALREADY EXIST I AM SURE
Dim b() As Byte
Dim c As Long, d As Long
Dim found As Boolean
FindStr = -1
If Len(String1) = 0 Then Exit Function
' Convert the search string to ANSI bytes
ReDim b(Len(String1) - 1)
For c = 1 To Len(String1)
    b(c - 1) = Asc(Mid(String1, c, 1))
Next
' Stop early enough that myBytes(c + d) never runs past the end
For c = 0 To UBound(myBytes) - UBound(b)
    found = True
    For d = 0 To UBound(b)
        If myBytes(c + d) <> b(d) Then
            found = False
            Exit For
        End If
    Next
    If found Then Exit For
Next
If found Then FindStr = c
End Function
Paul, I had a look at your project and it's amazing! Anyway, I'm not sure: did you use InStr or not? I once tried to do this myself, a faster version of InStr using byte arrays; alas, it's slow. So in this case did you replace the byte array searching with InStr again? By converting back to Unicode with StrConv (this function's the best thing I know next to CopyMemory) and then doing the search with InStr, you'll save much time. But I asked around, and I'm sure there is a faster function than InStr; take a look at the Like operator, it compares really fast!
-
Oct 16th, 2000, 01:43 PM
#34
Hyperactive Member
Thanks
Thanks kedaman and balip, I hope my experimentation with
the API and whatnot was of use. I certainly enjoyed
researching the question because it has taught me several
things I didn't know before.
To Iain,
It is irrelevant if "We have been over the whole chunk
reading thing already" because as you will no doubt agree,
learning by doing is far more permanent than learning by
someone else doing. So I have learned some things by
trying out some ideas I had which, I might add, is the whole point
of answering questions... One of the things I learned
about was Memory Mapped Files so for just that one thing, I
feel my time was worthwhile (for me)...
I did ask in an earlier post if anyone already knew about
this stuff to save me re-inventing the wheel too... 
Anyhow..off to work I go..
Cheers
[Edited by PaulLewis on 10-16-2000 at 03:54 PM]
-
Oct 19th, 2000, 06:35 AM
#35
Thread Starter
Member
Yepididoo, here I am again...
Say Paul, I used Your/Kedaman's fast way of importing data by using Binary reading...
It's indeed a lot faster (about 40%) but, one -big- problem,
it just doesn't copy all the characters that are in the datafile(??)
I used this code
Code:
hFile = CreateFile(strMapFile, GENERIC_READ Or GENERIC_WRITE, 0, ByVal 0, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, 0)
'Importing data into String
lMaxSize = FileLen(strMapFile) - 1
ReDim myBytes(lMaxSize - 1) As Byte
CopyMemory myBytes(0), ByVal hAddress, lMaxSize
ImportFile = StrConv(myBytes, vbUnicode)
I do not get any error messages, and I already tried changing lMaxSize to a value 1000 characters larger than what the FileLen() function says the file is.
Result: the string (after conversion) is again as large as it was before I changed lMaxSize...
When I look at the characters at the end of the string by using the Right() function, they are a cut-off piece of data...
I don't know how I can make it read until the end of the file.
Do you know a solution, or do I need to keep using Line Input, because that works great right now... a little bit slower, but hey, it works...
------
I've got also another question...
I'm using this code
Code:
ImportFile = Replace(ImportFile, Chr(13), "", , , vbBinaryCompare)
ImportFile = Replace(ImportFile, Chr(10), "", , , vbBinaryCompare)
to replace the LineFeed etc. but these functions are sooooo
sllooooooowwwwwww, You just can't imagine how slow they are!
Do You know a way of replacing these Chr()'s in a binary array?
Thank You!
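One way to drop the CR and LF characters while the data is still a byte array is to compact the array in a single pass and convert it afterwards. This is a sketch only: `StripCrLf` is a hypothetical name, it assumes a dynamic ANSI byte array such as myBytes, and it modifies the array that is passed in:

```vb
' Sketch: remove Chr(13) and Chr(10) from a byte array in one pass,
' then convert what is left to a String.
Public Function StripCrLf(ByRef b() As Byte) As String
    Dim i As Long, j As Long

    j = LBound(b)                    ' next free slot for a kept byte
    For i = LBound(b) To UBound(b)
        If b(i) <> 13 And b(i) <> 10 Then
            b(j) = b(i)
            j = j + 1
        End If
    Next i

    If j = LBound(b) Then
        StripCrLf = ""               ' nothing but line breaks
    Else
        ReDim Preserve b(LBound(b) To j - 1)   ' cut off the unused tail
        StripCrLf = StrConv(b, vbUnicode)
    End If
End Function
```

Each byte is touched once, so the cost grows linearly with the file size, and only one string is built at the end.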