To decode a file I have to work with nibbles, so I need to do some bit shifting, ANDing and ORing. Because the files are large, around 10 MB, speed really counts. I thought it would be faster if I could work on nibbles directly, but no such data type exists in VB. Could there be some way around this? Btw, I do the bit shifting by multiplying or dividing by powers of 2, as there's no shift operator as far as I know.
Last edited by krtxmrtz; Apr 21st, 2005 at 05:24 AM.
Lottery is a tax on people who are bad at maths
If only mosquitoes sucked fat instead of blood...
To do is to be (Descartes). To be is to do (Sartre). To be do be do (Sinatra)
Does your data allow you to use a "brute-force" method? That is, if you only have a limited number of values in your data set you can hard-code the processing for each value.
Does this make sense?
This world is not my home. I'm just passing through.
The number of values is limited, of course, but it can be different according to the file I'm working with. But I'm afraid I don't know what you mean by that hard-coding, can you set an example?
I was just thinking that if there was some kind of relationship between the lower and upper nybbles then you could know the upper nybble just by looking at the lower nybble. It sounds like this isn't going to work for you though. Since it's not going to be applicable I can't give an example. Sorry.
Those are the masks used to pull just the lower order or high order nibble out of the byte.
lngLowOrder = lngValue And 15 ' lngLowOrder now has only the low order nibble
lngHighOrder = (lngValue And 240) \ 16 ' lngHighOrder now has the high order nibble, shifted down (\ is integer division, so no floating-point divide is involved)
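For comparison, here is a minimal C sketch of the same masking, using the shift operators that VB6 lacks. The helper names `low_nibble`, `high_nibble` and `join_nibbles` are made up for illustration and do not come from the thread.

```c
#include <stdint.h>

/* Hypothetical helpers, named for illustration only. */

/* Low-order nibble: keep bits 0-3 (mask 15). */
static uint8_t low_nibble(uint8_t b)
{
    return (uint8_t)(b & 0x0F);
}

/* High-order nibble: keep bits 4-7 (mask 240) and shift them down 4 places. */
static uint8_t high_nibble(uint8_t b)
{
    return (uint8_t)((b & 0xF0) >> 4);
}

/* Rebuild a byte from a high and a low nibble. */
static uint8_t join_nibbles(uint8_t high, uint8_t low)
{
    return (uint8_t)(((high & 0x0F) << 4) | (low & 0x0F));
}
```

Shifting right by 4 gives the same result as the integer division by 16 in the VB lines above.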
*** Read the sticky in the DB forum about how to get your question answered quickly!! ***
Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".
Yes, that's exactly how I'm doing it, but the aim of this post was rather to make sure there wasn't some trick to achieve the same result but faster. The thing is I'm trying to optimize in terms of speed this file decoding I've mentioned in the first post, and one of the possibilities was better manipulation of nibbles.
That is the best you are going to get out of a language like VB.
If the nibble work you need to do can be done in ASM (or maybe C) then a routine in that language might be better.
But the last thing you want to do is start calling a function (expensive to call) just to work a byte - it would have to do all the work before coming back to VB (or at least a large part of it).
A number of years ago I used Microsoft C 6.0 in an IDE running on DOS. But I don't know how I'd call a sub written in C from VB. Would I have to make a DLL or something?
Yes, make a DLL with C. It is likely to be 10-100 times faster to do string parsing in C than in VB.
Well, what I've got to do is not exactly string parsing but the nibble operations mentioned above.
Now I come to think of it, I know how to make a dll in the IDE of VB6 but how would I go about it in c? Do I have to install the VC part of Visual Studio or is there a more straightforward way in the old DOS-C itself?
Anything resembling traversing a ginormous array of bytes is considered string-parsing (an old C nomenclature).
I am not an expert making C or C++ DLLs but I have done it a few times in the past. I searched around and found plenty of tutorials and followed them. They always had me using VS C++ 6. I recommend you go that route also. C++ is very fast - I consider it the same speed as C when all you are doing is string-parsing.
Nobody knows what software they want until after you've delivered what they originally asked for.
Don't solve problems which don't exist.
"If I had eight hours to cut down a tree, I'd spend six hours sharpening my axe." --- Abraham Lincoln (1809-1865)
They always had me using VS C++ 6. I recommend you go that route also.
OK, I'll have a go at it. Thanks.
But the last thing you want to do is start calling a function (expensive to call) just to work a byte - it would have to do all the work before coming back to VB (or at least a large part of it).
I'm sure you know this, krtxmrtz, but szlamany makes an excellent point. Do not make a function that accepts data byte by byte; rather, give the DLL your entire buffer, or at least a kilobyte at a time...
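A rough sketch of what "give the DLL your entire buffer" could look like on the C side: a single call walks every 3-byte group, so the cost of the call is paid once per file rather than once per byte. The function name `decode_buffer` and its parameters are assumptions for illustration; the packing layout is the one krtxmrtz describes in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bulk routine (name and signature are assumptions).
   packed   : n_groups * 3 bytes of packed data
   out      : room for n_groups * 2 decoded 12-bit values
   returns  : the largest decoded value */
static uint16_t decode_buffer(const uint8_t *packed, size_t n_groups, uint16_t *out)
{
    uint16_t max = 0;

    for (size_t g = 0; g < n_groups; g++) {
        const uint8_t *b = packed + 3 * g;   /* b[0], b[1], b[2] of this group */

        /* High nibble of b[2] holds bits 8-11 of the first value. */
        uint16_t v0 = (uint16_t)(((b[2] & 0xF0) << 4) | b[0]);
        /* Low nibble of b[2] holds bits 8-11 of the second value. */
        uint16_t v1 = (uint16_t)(((b[2] & 0x0F) << 8) | b[1]);

        out[2 * g]     = v0;
        out[2 * g + 1] = v1;

        if (v0 > max) max = v0;
        if (v1 > max) max = v1;
    }
    return max;
}
```

The caller passes the whole array once; nothing inside the loop crosses the VB/DLL boundary.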
Yes I was aware of this but it's one of those things you so easily tend to overlook so, thanks for reminding me of it.
Yes, make a DLL with C. It is likely to be 10-100 times faster to do string parsing in C than in VB.
Hi,
Just curious as to WHY "C" is faster than VB. I know it's true, I've just never learned the why. Any insight or any links to info on this would be appreciated.
I'm curious too, I don't understand why VB compilers couldn't be designed to be as efficient as C compilers. What's the difference?
I don't use C, but I imagine it's how the language can look at a spot in memory.
Back in our mainframe BASIC days we wrote little PEEK and POKE functions - pass the function the "location" of a string and the offset into that string you wanted to touch and it would pull a byte, word or longword from it.
In VB (I assume) the only way to do that is to LEFT/MID/RIGHT the string, which makes a new string (probably an 8 byte memory structure just to hold the byte, word or longword) and then some other FUNCTION call (ASC?) to convert it to a value.
That type of requirement to get at a memory location to play with bytes is expensive.
I would choose to develop the routines needed in this thread in ASM myself - but from what I hear C can embed ASM and ASM-like calls directly.
Ahhhhh... So it's this kind of overhead that does it !
Thank you - Very well put.
Yes - what seems so routine in higher level languages - a simple call to a VB function, or a SUB/Function in a module you write yourself causes all kinds of memory framing action so that "new variable" space can be created - arguments passed (by ref less expensive - by value more expensive).
It's so common on this forum for suggestions to use SPLIT to find things in a string. SPLIT creates an array of new strings - that's expensive.
It's also common in the VB world to use REDIM with PRESERVE - that's extremely expensive as well, most likely moving the entire array to a new larger or smaller location.
Could you possibly post some of the code you're using?
This is the core of it.
What this does is, it reads a file where a matrix of values z=f(x,y) is
stored. Now, these values have a 12-bit precision and in order to save
some space, they are packed in 3 bytes -call them b0, b1 and b2- in
this fashion:
The first 12-bit value has its 8 low order bits in b0 and its 4 high order
bits in the high order nibble of b2. The second 12-bit value has its 8 low
order bits in b1 and its 4 high order bits in the low order nibble of b2.
I hope this is clear at least from the attached drawing (sorry about this large size, I didn't mean to make it like that)
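Since the drawing may not be visible, the layout described above can also be written down as a small C sketch that unpacks one 3-byte group. The name `unpack_triad` is hypothetical; only the bit layout is taken from the description.

```c
#include <stdint.h>

/* Hypothetical helper: split one packed group b[0], b[1], b[2]
   into the two 12-bit values it carries. */
static void unpack_triad(const uint8_t b[3], uint16_t *first, uint16_t *second)
{
    /* First value: high nibble of b[2] -> bits 8-11, b[0] -> bits 0-7. */
    *first = (uint16_t)(((b[2] & 0xF0) << 4) | b[0]);

    /* Second value: low nibble of b[2] -> bits 8-11, b[1] -> bits 0-7. */
    *second = (uint16_t)(((b[2] & 0x0F) << 8) | b[1]);
}
```

For example, the bytes 34h, 78h, 15h unpack to 134h and 578h: the shared byte 15h contributes its 1 to the first value and its 5 to the second.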
VB Code:
Const HeaderSize = 2048
Dim nSize As Long
Dim maxval As Single
Dim DataByte() As Byte
Dim Val12bit() As Integer
Dim NormdData() As Single
Dim GreyLevel() As Byte

Public Sub Read_the_file()
    'The following sub reads 2048 bytes of header data
    'I don't include it for it's irrelevant
    Get_Header
    'This sub is pretty fast, (possibly) no need to optimize here
    Get_Rest
    Process_the_Data
End Sub

Private Sub Get_Rest()
    Dim ff As Integer
    ff = FreeFile
    Open FileName For Binary As #ff
    nSize = LOF(ff)
    'HeaderSize has been defined elsewhere
    'DataByte is grouped in 3 consecutive bytes
    'in which two 12-bit values are packed
    ReDim DataByte(0 To 2, 0 To (nSize - HeaderSize) \ 3 - 1)
    Seek #ff, HeaderSize + 1
    Get #ff, , DataByte
    Close #ff
End Sub

Public Sub Process_the_Data()
    Dim i As Long
    Dim j As Long
    Dim uLim As Long
    uLim = 2 * (nSize - HeaderSize) \ 3 - 1
    ReDim Val12bit(uLim)
    ReDim GreyLevel(uLim)
    ReDim NormdData(uLim)
    maxval = 0
    For i = 0 To (nSize - HeaderSize) \ 3 - 1
        j = 2 * i
        'The first 12-bit value is built from the high order nibble of
        'the 3rd DataByte followed by the entire first DataByte
        Val12bit(j) = (DataByte(2, i) And &HF0) * 16 Or DataByte(0, i)
        'The second 12-bit value is built from the low order nibble of
        'the 3rd DataByte followed by the entire second DataByte
        Val12bit(j + 1) = (DataByte(2, i) And &HF) * 256 Or DataByte(1, i)
        'The Val12bit values are scaled here and their
        'maximum is sought for posterior normalization
        NormdData(j) = Val12bit(j) / 1000
        If maxval < NormdData(j) Then maxval = NormdData(j)
        NormdData(j + 1) = Val12bit(j + 1) / 1000
        If maxval < NormdData(j + 1) Then maxval = NormdData(j + 1)
    Next
    'The data represent the values of a function f(x,y)
    'Although they have 12-bit precision, it's convenient
    'to plot them on a picturebox in a gray scale
    'In the following loop they are compressed to 8 bit
    For i = 0 To (nSize - HeaderSize) \ 3 - 1
        j = 2 * i
        'Maximum value should be black and 0 (minimum) white
        'I have left out the actual code for plotting to a picturebox
        'For that I'm using a class someone provided some time
        'ago in this forum with DIB related functions, faster than
        'SetPixel
    Next
End Sub
Last edited by krtxmrtz; Apr 15th, 2005 at 08:22 AM.
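The grey-level step is left out of the posted code, so the following C sketch is only a guess at one way to do it: a linear mapping in which 0 becomes white (255) and the maximum becomes black (0), as the comments in the code require. The function name `grey_level` and the choice of a linear scale are assumptions.

```c
#include <stdint.h>

/* Hypothetical mapping, not the poster's omitted code:
   value 0 -> 255 (white), value == max -> 0 (black), linear in between. */
static uint8_t grey_level(uint16_t value, uint16_t max)
{
    if (max == 0)
        return 255;          /* all-zero data: draw everything white */
    if (value > max)
        value = max;         /* clamp defensively */

    /* Widen to 32 bits before multiplying so 4095 * 255 cannot overflow. */
    return (uint8_t)(255u - (uint32_t)value * 255u / max);
}
```

Working from the integer 12-bit values directly avoids the intermediate Single array; whether that matters for speed would need to be measured.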
By the way...
If I am to write one or two DLLs in C I'd like someone to recommend a tutorial, but not just a C tutorial as I more or less know the syntax and structure of the language. What I'd like is some push to get blitz started in the VC environment of Visual Studio.
The largest difference here is that I took away the NormdData array. I don't know if you need it later on, in which case you can simply remove the comment. Some other improvements you might want to think about are removing the array bounds check, the integer overflow check, and the floating-point error check. Of course, those optimisations will not be noticeable until you compile the program.
I made some changes to your code, but since I can't test it, it's hard to say whether they will make any noticeable difference (even though it should be faster).
Well, I'll stick this into my own code and see what comes out from the tests. In the meantime I've been playing around and I'm wondering if anything is to be gained by using a 1D rather than 2D DataByte array, but that'll take me some time to implement. I'll keep you updated...
'Is the following calculation really correct?
' uLim = 2 * (nSize - HeaderSize) \ 3 - 1
'Should it not be:
' uLim = 2 * ((nSize - HeaderSize) \ 3 - 1)
I think it's correct because if you call
N = nSize - HeaderSize
then, this N is the total number of data bytes. For every 3 of these, 2 Val12bit values are built so, first you divide by 3 to get the number of triads,
N\3 triads
from each triad you get 2 Val12bit values so, the number of those is
2*N\3
and finally you've got to redim from 0 to 2*N\3 -1
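To check the arithmetic, here is a small C sketch comparing the two candidate upper bounds with the number of values the loop actually writes. In VB, `*` binds tighter than `\`, which in turn binds tighter than `-`, so `2 * N \ 3 - 1` is evaluated as `((2 * N) \ 3) - 1`. The function names are made up for illustration.

```c
/* n_bytes is N, the number of data bytes after the header. */

/* Number of 12-bit values the decoding loop writes: 2 per complete triad. */
static long values_decoded(long n_bytes)
{
    return 2 * (n_bytes / 3);
}

/* Upper bound as written in the code: VB's 2 * N \ 3 - 1. */
static long ubound_as_written(long n_bytes)
{
    return (2 * n_bytes) / 3 - 1;
}

/* Upper bound proposed in the quoted comment: 2 * (N \ 3 - 1). */
static long ubound_alternative(long n_bytes)
{
    return 2 * (n_bytes / 3 - 1);
}
```

With N = 12 the loop writes 8 values (indices 0 to 7); the expression as written gives 7, which fits exactly, while the alternative gives 6, which is one index short. When N is not a multiple of 3 (say N = 14) the expression as written gives 8, leaving one unused element but never too few.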
This is one reason why I find this feature confusing, starting arrays at 0 rather than at 1, yet I've been using 0-based arrays recently, I suppose it's more convenient for me to adjust to what most people do.
I've been playing around and I'm wondering if anything is to be gained by using a 1D rather than 2D DataByte array
You might notice that it's running faster in the IDE but I doubt very much that it would make any difference when you've compiled it to native code since the machine code doesn't know the difference between a 1D and a 2D array
That means that the compiler will probably create the same native code regardless of how many dimensions your array has.
To answer the question why C is faster than VB, C has no overhead of Classes, COM, goofy data structures that represent a String.
In VB 6, a string is a pointer to a structure with a size variable and then another pointer to the actual array of bytes. Also, you cannot just talk about the byte at position 6. You have to call functions that call other functions and make new data structures, and the list goes on and on.
In C, you start with an array of bytes. You can talk about each byte in the array as an integer offset to the array. It is the fastest way possible to traverse an array byte by byte. No goofy VB overhead.
For pure string-parsing, nothing beats C, not even assembler.
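As a minimal illustration of the "array of bytes plus integer offset" idea described above, the following C sketch walks a buffer and inspects each byte in place, with no temporary strings and no function call per byte. The function `count_high_nibbles` is a made-up example, not something from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: count the bytes whose high-order nibble is non-zero. */
static size_t count_high_nibbles(const uint8_t *buf, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        /* buf[i] reads the byte at (base address + i) directly. */
        if (buf[i] & 0xF0)
            count++;
    }
    return count;
}
```

Each iteration is an indexed load, a mask and a compare, which is about as little work per byte as a loop can do.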
I don't know if this helps you or not, but here is one way you can create your C++ dll from VC++
let's assume dll is called MyDll.dll
Create a new Win32 dll
Add the following code to the cpp file
Code:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
/******************************************************************************
Declare Sub ProcessData Lib "MyDll.dll" (DataByte As Byte, Val12bit As Integer, NormdData As Single, GreyLevel As Byte, ByVal BufferSize As Long)
(VB Integer is 16 bits and matches C short; VB Byte is unsigned and matches unsigned char)
******************************************************************************/
__declspec( dllexport ) void __stdcall ProcessData (unsigned char DataByte[][3], short Val12bit[], float NormdData[], unsigned char GreyLevel[], long BufferSize)
{
    //OK do your magic
    return;
}
Notice that the rows and columns are reversed in C++
Then from VB declare your new sub
For pure string-parsing, nothing beats C, not even assembler.
I find this quite surprising, not to say hard to believe. If C is so efficient at string parsing you'd only have to take the compiled C code and convert it to assembler, on the basis that each machine instruction corresponds to an actual assembler instruction... But my assembler days belong to the past and to another computer architecture, so I may be overlooking some of the facts.
If you take a string parsing routine written in C, say a for loop using integer offsets to the base address of the string (array of bytes), and then decompile it to assembler, you will get about a 1-1 ratio of C code to assembler code. In other words, you may not get 1-line to 1-line correspondence, but you would not be able to reduce the assembler any further. C can be extremely close to assembler in terms of efficiency.
I don't know if this helps you or not, but here is one way you can create your C++ dll from VC++
Well, I used to write C code for DOS a few years ago so I hope I can still work out the magic as you say, but finding my way in the VC++ IDE without any background and coming from VB is not so straightforward, so your post came in like the answer to my prayers... Thanks a lot.
Oh I see what you mean, C is as efficient as assembler. I thought you meant assembler was less efficient than C, silly me !!!
In many cases a C compiler will produce faster code than you will produce writing in assembly language yourself. This is because the compiler knows a lot of tricks that you don't. I've actually tested it.
If you need help translating the magic I can help too.
This is what I've tried so far (I have made a few changes)
Code:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
/************************************************************************************************************************
Declare Sub ProcessData Lib "MyDll.dll" (DataByte As Byte, Val12bit As Integer, Max12bit As Long, ByVal BufferSize As Long)
*************************************************************************************************************************/
__declspec( dllexport ) void __stdcall ProcessData (unsigned char DataByte[][3], short Val12bit[], long *Max12bit, long BufferSize)
{
    long i;
    long j;
    /* Max12bit is passed ByRef from VB, so it arrives here as a pointer */
    *Max12bit = 0;
    /* Assumes BufferSize is the number of 3-byte groups; use <= only if it is the last index */
    for (i = 0; i < BufferSize; i++)
    {
        j = 2 * i;
        /* unsigned char keeps bytes above 127 from being sign-extended */
        Val12bit[j] = (DataByte[i][2] & 240) * 16 | DataByte[i][0];
        Val12bit[j + 1] = (DataByte[i][2] & 15) * 256 | DataByte[i][1];
        if (*Max12bit < Val12bit[j])
            *Max12bit = Val12bit[j];
        if (*Max12bit < Val12bit[j + 1])
            *Max12bit = Val12bit[j + 1];
    }
    return;
}
But when I try to add a reference to the dll an error message shows up, probably I didn't add this MyDll.def file in the correct way: I just added a text file with the text you posted.
After creating the def file, add it to your project
Project\Add to Project\Files
Files of Type\Definition Files (.def)
This should add the following to
Project\Settings\Link\Project Options
/def:".\MyDll.def"
It's been a while since I did this but if you want to expose a function in a DLL don't you need to include windows.h? As I see it there is no reason at all to include stdio.h or math.h since you don't use any functions in those libraries.
Well, I doubt very much that he will need any functions from the stdio.h library since that's an old C library used for IO in the console world (which BTW is superseded by iostream.h in C++).
Anyway thanks for the info about not needing the windows.h library to expose a Windows DLL function (I wanted to give you a rating for that, but I wasn't allowed to since I rated you earlier and I haven't rated enough people to give you a new one), but I do still believe that windows.h is more valuable than stdio.h in the Windows world.
Last edited by Joacim Andersson; Apr 17th, 2005 at 07:08 PM.
After creating the def file, add it to your project
Project\Add to Project\Files
Files of Type\Definition Files (.def)
This should add the following to
Project\Settings\Link\Project Options
/def:".\MyDll.def"
I have to make sure what exactly the error was but at the moment I'm sitting in front of a computer where VC++ is not installed. I think it was something like "It was not possible to add a reference to this library" when I tried to add the reference in the VB project/references menu.
As I said, I had to put the
EXPORTS
ProcessData
lines in a text file. But when I tried to add it to the project there were no def files among the different types available (to be sure, I'd had a couple of beers before that) so I added it as a cpp code file.