Using a Client>Server>Client Architecture, I have built a chat system, and would like to share it with you all!
I'd like to ask your help to see where potential problems (including abuses and exploits from clients) could lie.
Couple notes... The way im sending the user's info seems hackish to me... I assume you can't begin communicating between the computers until .Accept ID is called, correct? basically, im relying on command messages (ie "/ver")being passed too and from the server, to "validate" the client, but, say someone wanted to use my source to build their own client, they could pretty easily get around these checks, no? IE I think now how the code is, if the client just ignores the commands, and treat them like common chat messages, the server could crash/behave wrong etc. I need a way to DEMAND that the client respond in a certain way, and if it doesn't, disconnect. I keep on thinking of a timer, but seems a little hackish to me. Here is the code... please rip it to shreds, constructively, so that we all may learn!
For testing, you need to have a server running (server.exe), and at least 2 chat clients (client.exe), the server having port 10101 naked to the clients.
Tell me what errors/problems/vulnerabilities you find, and if you have the nack, suggest solutions.
Please don't post here about errors you have created yourself within the code... I am not responsible for damage caused by your recoding of the program. As far as I am aware... there are no major errors, and the program does not change your computer in any way.
REQUIRES: Microsoft winsock control, obviously.
Tested in XP, check for me if itll work in other Windows OS's too!
Last edited by EntityReborn; Sep 22nd, 2008 at 05:34 PM.
If I have helped you out, be a pal and rate the helpful post!
Here, I have rebuilt the whole package, and therefore, some things are not up to date/included yet, such as the authorization.
I have worked on a PM system, which is about half done, and have made better use of modules, as well as buffered data clientside as well as serverside. Enjoy!
One of the things that can get harder to live with as your program grows in size and complexity is change. Even small things can become painful because you may end up making a ton of picky little changes sprinkled all through the code.
This is one of the reasons why the encapsulation offered by Class and UserControl modules is important. You try to carefully define an interface (properties, methods, events, enums) that gives you the control you need over the "black box" and then inside that box you are free to make changes that do not impact every other line of code in the program... as long as you don't change the interface.
I bring this up because in looking through just your Client program's logic I see a lot of code in a module that is constantly reaching back inside your forms, or at least back into frmClient.
A lot of this is meant to add functionality to the Winsock control. Some of it is actually updating other controls on the form.
The result is a little inside-out and slightly twisted. The form updates probably should be events raised back to the Form module in most cases. The Winsock control clearly belongs over where the code that deals with it lives. I think you want a UserControl here, not a Winsock on the Form and a module that fiddles with it. Put an instance of the UserControl on the Form, and handle raised UserControl events in the Form to do your GUI updates.
Message formatting
It also looks like you have a message format now that has both a length and a delimiter. This is unusual. Then the same delimiter seems to be used within messages, in between fields in the message. Very odd, but maybe I am misinterpreting things.
We can step back and look at analogous situations with data files.
The common CSV format is pretty ancient, and goes back to early Basic days in the 1960s. The idea was simple: accept multi-valued user input by letting the user type commas between the values, and signal "end of line" by pressing the CR (now Enter) key. Soon this was extended to disk I/O as well.
Very early on the problem of data with commas came along. Ok, allow quotee around string values. Wow, what about quotes in the quoted data? Ok, allow quote escaping, i.e. "Tom said, ""Look at that!"""
But even in the beginning there were two delimiters: a comma (field) and a CR (line, or record). Your program seems to be using vbCrLf for both purposes. Hmm. Is this why the length prefix is there?
To get around this quoting and escaping programmers have often looked to other schemes when manual user-entry of the data isn't involved.
VB6 has some handy constants for now-seldom-used characters that can be bent to such a purpose: vbNullChar, vbFormFeed, vbVerticalTab, and vbTab. The last one is now useful because the Tab key normally causes the focus to advance to the next control in the Form's tab-order instead of being entered as a character. The others are clumsy to enter at all or have side efects if allowed in a Textbox, etc.
In simple cases you might choose one of these as your line/record/message delimiter and the other as your field delimiter.
You might just as easily choose other hard to enter values, like Chr$(&HFF). They aren't as convenient as the constants though, and so in a VB6 program you may pay a penalty for using them. The compiler does not optimize these sorts of expressions to an inline constant value.
What about binary?
But what if you have to pass binary data? Data that might contain any or all possible 8-bit patterns? You can no longer use a delimiter-based scheme without escaping occurrences of your delimiters.
I see some of this in your Client program. It seems to be replacing vbCrLf with <CrLf> and vice versa in places.
Back to modularity
This takes me back to the point I opened with here: better modularity can isolate the impact of design changes later on. I hesitate to suggest changes that might require lots of fiddly little changes to your code.
But about binary...
Flipping back to the issue of binary data, there is another option: one alternative people use is "run length" encoding. This is how VB manages String variables, at least the variable length String type we take for granted most of the time.
A simple example might be to use a decimal character string prefix that represents the length of the sequence of bytes that follows:
14This is a test
5Hello
The problem is, how do you knoow the length... of the length?
1616 bags of corn.
One answer might be to go back to delimiters again:
16;16 bags of corn.
Another might be to require a fixed-length length (10 here):
000000001616 bags of corn.
You might even go to hex, say the 8 digits of a Long:
0000001016 bags of corn.
0000000DThis is a test
Beyond the message itself, we usually still have fields to deal with. This can get ugly fast if you have to look at the raw data:
0000002F00000004This00000004That0000000FThe Other Thing
Here we have a 55 byte message with only 23 bytes of payload! Not very efficient, but I've seen worse schemes. At least the contents of the 3 fields can contain any byte you need with no escaping. Even empty field values can be passed (length 00000000).
There are better options. It is just a question of choosing one. A first approximation might be a binary length field. You just need to decide what precision is required. 32 bits? 16 bits? 8 bits?
Using 32 bits (and representing the values as X's here for illustration) we get:
XXXXXXXXThisXXXXThatXXXXThe Other Thing
Using 32 bits for the message length and 8 bits for field lengths gets us:
XXXXXThisXThatXThe Other Thing
But maybe sometimes a field must be longer than 255 bytes. What then? Well you can get more exotic, and have each field start with a "length length" byte and then the length:
XXXXXXX1XThis1XThat2XXThe Other Thing
Ok, pretty ugly, I admit. But versatile! And who will be looking at it?
Lots of talk here. What's the point?
Well this takes me back to modularity and encapsulation again.
Let's say we design a Class interface for a Message class:
Methods:
Function DeSerialize(Buffer As InboundBuffer) As Boolean
Function Serialize() As String
Properties:
Let Field(FieldID as FieldIDEnum, Value As String)
Get Field(FieldID As FieldIDEnum) As String
Get Fields() As String()
Get GoodMessage() As Boolean
Enums:
FieldIDEnum
Now you have a Message Class you can create instances of, and only inside that Class do you need to put in logic that knows anything about how your messages are formatted.
It is dependent on a companion Class InboundBuffer, which knows how to manage the data taken from GetData calls in your DataArrival event:
Methods:
Sub Append (Segment As String)
Sub Compact()
Function PeekChars(Length As Long) As String
Function ReadChars(Length As Long) As String
Properties:
Get CharsLeft() As Long
Get HasData() As Boolean
Again, only this Class needs to know how it manages things internally. Maybe the actual buffer is an array of Strings. Maybe one big String that gets concatenated. Maybe it's a Byte array. Who cares?
Now you can have code in a UserControl that has a Winsock named wsk like:
Code:
Private Buffer As InboundBuffer
Private Msg As Message
Public Event MessageReceived(Msg As Message)
:
{initialize objects}
:
Private Sub wsk_DataArrival(ByVal BytesTotal As Long)
Dim strSegment As String
wsk.GetData strSegment
Buffer.Append strSegment
If Buffer.HasData Then
Do While Msg.DeSerialize(Buffer)
If Msg.GoodMessage Then
'Process fields in Msg.
RaiseEvent MessageReceived(Msg)
Else
'Error, badly formatted Msg. Take action.
End If
Loop
Buffer.Compact
End If
End Sub
Back in your Form, the handler for MessageReceived can use the Field() property to get the message fields, or even the Fields() property to retreive them all in a String array, etc.
The InboundBuffer Class can maintain a Cursor internally pointing to the next character or Byte to be processed as a Message, and a Length internally to keep track of the total number of characters buffered.
InboundBuffer.CharsLeft returns the number of chars currently between Cursor and Length. InboundBuffer.PeekChars can return characters starting at Cursor without moving Cursor, while InboundBuffer.ReadChars does the same and advances the Cursor. These would be used in Message.DeSerialize to try to first see if there are enough characters buffered to contain the message length prefix, then get the message length prefix, then see if there are enough characters buffered to contain a complete message, and then finally extract the complete message text.
Then Message.DeSerialize would parse apart the message string into fields. These might be stored into a String array, a Collection, whatever.
InboundBuffer.Compact just "slides everything over" so that the leftover data at Cursor is now the first data in the buffer, and Length also gets adjusted to point to where new data will be Appended.
The reverse would be done for output.
Create a new instance of Message. Set each Field. Then call a method of your UserControl, passing the Message as a parameter.. which does:
wsk.SendData Msg.Serialize
In Message.Serialize you string all of the fields together with the length prefixes (or even your delimiters, etc. escaping special characters) and return one String or Byte array to be transmitted.
Ok so what you want to say here, would be to have my usercontrol, Winsock as a withevents variable there, along with other variables like userpassword, username, etc (I have since added more functionality), so that the winsock can reference directly to the classes that modify it. Have events raised that pertain to different actions (ie event UpdateText(Message as string) to print to the Chat window) within the main form, and have a seperate class that will do blackbox magic in terms of serializing the data sent and received. In the case of the server, this Usercontrol will be Indexed.
Message Formatting
Yes, Im aware of the duality. Here is my reasoning: You mention How to know the length of the length? The Delimiter allows me to split the first set of data(length and data), making that easy. then I can move on to the command and parameters.
Basically, after stripping the command (/IDENTIFY, etc), I split the data into an array, making it easier to deal with. For instance, /PRIVMSG 0<nullchar>Hello! will tell the server I want to send a private message. The server strips the command, makes an array, and the first part of that array is the usernumber, the second is the message. It then acts appropriately to send that message. Basically, this is just keeping data seperate. Where the length identifier comes in is just making sure the whole stream gets finished. After all this, the final message would be like 17<CrLf>/PRIVMSG 0<nullchar>Hello!<CrLf>. Because the length identifier uses CrLf, and any CrLf bytes in the message are converted to a string, this works out pretty well, and so far in all my tests (you will notice in my client, if i hadn't removed it, within one of the serialization procedures, there is a part that deals if <null>, <nonull> etc was included in the message the user typed, before sending it. this was me simulating 'broken' and 'appended' streams), everything went very well. I have since made a few changes, which may have broken it again, but that remains to be seen.
Bottom line: This method seems to be working pretty well. Agreed, I can use a different delimiter, and may just do that.
What about binary?
I've thought about this, even in my tasks in the raw client project file i've mentioned it, but its a question mark. This is meant to be a chat program. Sending data twice seems to be a waste to me, besides textual chatting. The final goal for this program is more for a small to medium office situation, not a major protocol chat program. Those who use MSN are well aware of how slow file transfers are. For the time being, I'll leave this out. If I do implement this, it would probably be in the form of 10<CrLf>/DATA 28sj<CrLf>, changing CrLfs where appropriate, as I do now, if that will work. I have not figured out how to use the native GetData variable type yet (data).
Lots of talk here. What's the point? Not going to comment until I can wrap my head around all that you said, and try some things. Unfortunately, a lot of what you said means rewriting a good chunk of my code :P
There is one problem with your code, I think... What happens if packets are appended together?
Here is how I'm working on the buffering and header management.
1. Message comes in at DataArrival in the usercontrol, and gets sent to bufferclass.
2. Bufferclass adds it to bufferstring
3. Bufferclass processes the bufferstring, and if it is not a complete message (doesn't end with NullChar, which delimits one packet from another) then exits, the partial message still in the bufferstring. If it is complete, it raises an event in the usercontrol for each packet it contains.
4. Event in usercontrol sends message to be de-headered
5. if message length matches what the header says it should be, another event is raised in the UserControl. If not, the message is debug.printed and the header processor returns vbNullString
6. The event forwards the processing where it needs to go, based on content.
Hey! New Update!
Redid too many things serverside to mention here, may have actually rewritten it from scratch, I can't remember
I have removed Private Messageing for the time being, as well as boiled down the server UI to a form that will never show (due to usercontrols that are needed), while adding some savable options. Also Redid the directory structure of the project files, for saner browsing.
Rip it up!
If I have helped you out, be a pal and rate the helpful post!