As part of another project I am working on, I made the following string functions:
IsNumeric
Upcase
Lowercase
Trim
RTrim
LTrim
right
left
mid
I was wondering if you guys could take the time to look at my code and let me know whether it is optimized, or whether there is a better way to do something, so I can fix it. Included is a sample console program just to show how each function works.
Thanks
Last edited by Technocrat; Mar 25th, 2002 at 02:01 PM.
MSVS 6, .NET & .NET 2003 Pro
I HATE MSDN with .NET & .NET 2003!!!
Check out my sites:
http://www.filthyhands.com
http://www.techno-coding.com
As a suggestion, limit redundant compares, and don't use external string functions like strlen(). By using the ternary ?: operator you cut down the comparison operations by close to 50%. I used the IsNumeric() function as an example.
This assumes a zero-terminated (sz) string as the argument.
Code:
#define _ISDIGIT 0x10

BOOL IsNumeric(char *chString)
{
    char *buf;

    buf = chString;
    while ((*buf & _ISDIGIT) ? 1 : (*buf == '.'))
    {
        buf++;
    }
    return (BOOL) (*buf == 0x00); /* Return - if we got to the end then result is TRUE */
}
I think I need to understand (*buf & _ISDIGIT) better. I see it works but I am not sure why. Are you masking the pointer here? If you are, why are you masking it with 0x10? How does that come back as true or false?
Last edited by Technocrat; Mar 25th, 2002 at 03:26 PM.
It isn't necessarily better. Sometimes more efficient code is hard to understand, which is definitely a bad thing.
& _ISDIGIT returns a nonzero value for any char that is an ASCII digit. *buf is the VALUE that buf currently points at.
In C, 0 is False, everything else is True. BOOL is not a real boolean type; in the Windows headers it is a typedef for int, in other words, a number.
What the code does:
Checks if the character is a digit. If it fails the digit test, check for a '.'. Return the value of the test to control the loop.
buf++ just moves to the next character. At string end the character is '\0', the null character. It is neither a digit nor a '.', so it fails the test and you exit the loop. If you hit a non-number you exit the loop early.
If you went all the way through the string to the end, then return (*buf==0x00) is True. If not, it's False because you exited the loop early.
While this kind of stuff can be fun & instructive (maybe), you shouldn't re-invent the wheel. All of this stuff has been done to death, so consider looking around for algorithms. Especially if you are getting money for your code.
A really great place is the library. Check out Knuth's 'The Art of Computer Programming'. About 40% of the questions in all the forums here were answered really well by this guy. In 1968.
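In the spirit of not re-inventing the wheel, here is a minimal sketch of the same loop built on the standard isdigit() from <ctype.h> instead of a hand-rolled mask. The function name is_numeric_std is made up for this illustration and is not from the thread. It keeps the rule described above (digits and '.' only), so it also accepts an empty string and inputs like "1.2.3".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical name, not from the thread.
   Returns 1 when every character of s is a decimal digit or '.',
   0 otherwise. No stricter validation is attempted. */
int is_numeric_std(const char *s)
{
    assert(s != NULL);

    /* The cast avoids undefined behavior for negative char values.
       isdigit('\0') is 0, so the loop always stops at the terminator. */
    while (isdigit((unsigned char)*s) || *s == '.')
        s++;

    /* Reaching the terminator means no other character was seen. */
    return *s == '\0';
}
```

Calling is_numeric_std("123.45") yields 1 and is_numeric_std("12a") yields 0.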
Originally posted by jim mcnamara:
It isn't necessarily better. Sometimes more efficient code is hard to understand, which is definitely a bad thing.
Yeah I found that out a long time ago. But this time I am really looking for the most efficient code I can.
& _ISDIGIT returns a nonzero value for any char that is an ASCII digit. *buf is the VALUE that buf currently points at.
In C, 0 is False, everything else is True. BOOL is not a real boolean type; in the Windows headers it is a typedef for int, in other words, a number.
What the code does:
Checks if the character is a digit. If it fails the digit test, check for a '.'. Return the value of the test to control the loop.
buf++ just moves to the next character. At string end the character is '\0', the null character. It is neither a digit nor a '.', so it fails the test and you exit the loop. If you hit a non-number you exit the loop early.
If you went all the way through the string to the end, then return (*buf==0x00) is True. If not, it's False because you exited the loop early.
While this kind of stuff can be fun & instructive (maybe), you shouldn't re-invent the wheel. All of this stuff has been done to death, so consider looking around for algorithms. Especially if you are getting money for your code.
I figured that out by playing with the code. The only thing I really don't understand is what happens here: (*buf & _ISDIGIT). What exactly happens at this point?
A really great place is the library. Check out Knuth's 'The Art of Computer Programming'. About 40% of the questions in all the forums here were answered really well by this guy. In 1968.
Hmm I will check it out.
Also thanks for your help & input.
There's a flaw (bug) in this code. Even though it impressed me so much, it is not quite accurate!!
According to your code, this is a number:
<=>:;
This code is very good by the way. After studying it for a while, I figured out what was going on!
_ISDIGIT ==> 0x10 ==> 16 ==> 0001 0000 ==> MASK
So when you do (*buf & _ISDIGIT) you are asking whether that one bit (the one worth 16) is set in the character.
Breakdown:
Code:
our mask is: 0001 0000
and the numbers are in the 0x3? range...
A BITWISE & operator compares two values bit by bit;
if either of the two bits is 0, the result bit is 0.
EG:
a: 1100 0011 ==> C3 ==> 195
b: 0101 0111 ==> 57 ==> 87
==============
ans: 0100 0011 ==> 43 ==> 67
which equals: 67 as illustrated.
Now with numbers, they are all in the range of HEX: 0x3?
but so are :;<=>? (colon, semi-colon, less than, equal, greater than, and question mark)
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
so now when you get their actual bits (binary representation),
you get:
0 ==> 110000
1 ==> 110001
2 ==> 110010
3 ==> 110011
4 ==> 110100
5 ==> 110101
6 ==> 110110
7 ==> 110111
8 ==> 111000
9 ==> 111001
: ==> 111010
; ==> 111011
< ==> 111100
= ==> 111101
> ==> 111110
? ==> 111111
|--------------|
|     11????   |
| AND 010000   |
|--------------|
|     010000   |
|--------------|
Notice how ALL of them have the second bit from the left (the
16s bit) set to 1. Now if you mask any of these values with
010000 it will ALWAYS return 010000, which is 0x10 in hex and
16 in decimal, and DEFINITELY not a zero value.
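The breakdown above can be checked with a small sketch. It isolates the loop condition from the posted IsNumeric() so that single characters can be tested. The names DIGIT_MASK, passes_mask_test and is_numeric_masked are made up for this illustration; the logic is meant to match the posted code, flaw included.

```c
#include <assert.h>
#include <stddef.h>

#define DIGIT_MASK 0x10  /* same value as _ISDIGIT in the posted code */

/* The loop condition of the posted IsNumeric(), for one character:
   nonzero if bit 4 (value 16) is set, otherwise true only for '.'. */
int passes_mask_test(char c)
{
    return (c & DIGIT_MASK) ? 1 : (c == '.');
}

/* The posted algorithm, rewritten around passes_mask_test(). */
int is_numeric_masked(const char *s)
{
    assert(s != NULL);

    /* '\0' has no bits set and is not '.', so the loop ends there. */
    while (passes_mask_test(*s))
        s++;

    return *s == '\0';
}
```

With this, is_numeric_masked("123.45") is true as intended, but so is is_numeric_masked("<=>:;"), because every character from 0x30 to 0x3F has the 0x10 bit set.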
Phew!! There goes my day. This was the best thing I've stumbled onto for a very long time, but unfortunately I can't use it in my code... I have a function that checks the IsNum like this:
Even though it's quite redundant. But if your data doesn't contain any of those "valid number" characters, then the above formula is the best.
By the way, I'm working on an expression evaluator that will convert a mathematical expression into a numeric value. And the first thing it does is make sure the string passed is numbers...
Also, for everything else, use CALCULATOR (Programs > Accessories > Calc) and view it in SCIENTIFIC MODE ( View > Scientific ), then you can do all of the comparisons that way :P
Wow Jim, that's some real crazy stuff... but I can't believe they fill in a whole char array for the mere purpose of checking the values!!
You see, I had already thought of that for the simple task of checking whether the string passed is all numbers. But since I will already be doing an if/else group for each subgroup of the expression (i.e., switch (currChar) { case '+', '-', '/', '*', '^' ... }), I really don't need to add overhead to my function.
By the way, I notice that Borland are MACRO people and that HP is all functions!!
Anyway, macros are very cool if used right, as is anything else!!
But see how Borland has to fill in an array of 256 chars to find out what the value of the arg is... it's a very cool idea and it would speed things up by thousands, but if you are making a dll, the dll gets called then destroyed after each call, so that step would slow things down.
Who would do such an inefficient thing as destroying a dll after each call to it? And if he does, there's so much overhead that a 256 byte array doesn't matter.
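The 256-entry table idea discussed above can be sketched roughly as follows. This is only an illustration and not Borland's or HP's actual <ctype.h>; the names CT_DIGIT, ct_table, ct_init and is_numeric_table are made up here. The point is that the flag bit is tested on the table entry for the character, not on the character code itself, and that the table is filled once and then reused.

```c
#include <assert.h>
#include <stddef.h>

#define CT_DIGIT 0x10  /* hypothetical flag bit stored in the table */

/* One flag byte per possible unsigned char value; starts out all zero. */
static unsigned char ct_table[256];
static int ct_ready = 0;

/* Fill the table once. C guarantees '0'..'9' are contiguous. */
static void ct_init(void)
{
    int c;

    for (c = '0'; c <= '9'; c++)
        ct_table[c] = CT_DIGIT;
    ct_ready = 1;
}

/* Same rule as the thread's IsNumeric(): digits and '.' only. */
int is_numeric_table(const char *s)
{
    assert(s != NULL);

    if (!ct_ready)
        ct_init();

    /* Look the character up, then test the flag on the table entry. */
    while ((ct_table[(unsigned char)*s] & CT_DIGIT) || *s == '.')
        s++;

    return *s == '\0';
}
```

Because the lookup goes through the table, characters such as ':' or 'P' no longer pass, even though their character codes have the 0x10 bit set.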
BTW "& 0x10" would also include:
0x10 to 0x1F (unprintable)
0x50 to 0x5F (P to _)
0x70 to 0x7F (p to DEL)
0x90 to 0x9F (unprintable)
0xB0 to 0xBF (diacriticae)
0xD0 to 0xDF ------||------
0xF0 to 0xFF ------||------
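The list above can be verified by brute force. The sketch below counts how many byte values in a range have the 0x10 bit set; count_mask_hits is a made-up helper name for this illustration. Over the full range 0 to 255 exactly half of the values (128) pass the "& 0x10" test, in alternating blocks of 16.

```c
#include <assert.h>

/* Hypothetical helper: counts the values in [lo, hi] whose bit 4
   (value 0x10) is set, i.e. the values the "& 0x10" test accepts. */
int count_mask_hits(int lo, int hi)
{
    int c;
    int hits = 0;

    assert(lo >= 0 && hi <= 255 && lo <= hi);

    for (c = lo; c <= hi; c++) {
        if (c & 0x10)
            hits++;
    }
    return hits;
}
```

For example, count_mask_hits(0x50, 0x5F) is 16 (P to _), while count_mask_hits(0x40, 0x4F) is 0.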
All the buzzt CornedBee
"Writing specifications is like writing a novel. Writing code is like writing poetry."
- Anonymous, published by Raymond Chen
Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.