Thread: [RESOLVED] Converting from an XML model object to more useful class

  1. #1

    Thread Starter
    Hyperactive Member
    Join Date
    Apr 2007
    Location
    cobwebbed to PC
    Posts
    311

    [RESOLVED] Converting from an XML model object to more useful class

    Hi folks,

    TLDR: I have a class MessyA, produced by a deserializer, that models real data. I want a class SmartA, implementing IA, that is easier for the user to work with. What is the best way to move from MessyA to SmartA (and vice versa for serialization)?

    I am working on a C# DLL that serializes two different types of XML document: a configuration and a profile. I have used the XSD.exe tool to create classes that model the XML of each type as an object (types Configuration and Profile), and I have serialization and deserialization working well.

    A problem arises, however: the deserialized objects of these class types are not very useful for presenting to a consumer.

    Now I know that the classes created by XSD.exe are marked partial, so currently my implementation adds another file containing another partial part of the same classes, which adds functionality. For example, the property:

    c# Code:
    public string Format;

    Can better be accessed by the now added property:

    c# Code:
    public Format FormatValue;

    Format is an enum of the available formats, and the FormatValue property's getter converts from the Format string.
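
A minimal sketch of the partial-class approach described above. The enum members are hypothetical (the thread does not list the real formats), and the first partial part only stands in for what XSD.exe would generate.

```csharp
using System;

// Hypothetical enum members; the real format names are not given in the thread.
public enum Format { Unknown, Text, Binary }

// Stand-in for the part of the class that XSD.exe generates.
public partial class Profile
{
    public string Format;
}

// Hand-written partial part that adds the typed accessor.
public partial class Profile
{
    public Format FormatValue
    {
        get
        {
            Format parsed;
            // Inside this class the simple name "Format" in an expression means
            // the string field, so the enum needs the global:: qualifier.
            return Enum.TryParse(this.Format, true, out parsed)
                ? parsed
                : global::Format.Unknown;
        }
        set { this.Format = value.ToString(); }
    }
}
```

The need for `global::` in the getter is a small taste of the name clash discussed further down: the string member and the enum type are competing for the same identifier.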

    This leaves the user seeing both Format and FormatValue. When this solution is applied to several other properties, things get messy quickly. It also means the XML-ness is exposed. My end plan was to be able to add other serial formats easily. To that end, my XmlSerializer class already implements an IDataSerializer interface, but it returns generic Object types because the Profile and Configuration types are currently tied specifically to the XML serializer.

    So I thought to introduce interfaces for both the configuration and the profile: IConfiguration and IProfile. But implementing them in the partial part of the existing Configuration and Profile classes would mean a name clash (e.g. the presentable Format Format property in the interface implementation would clash with the string Format property of the XML model). So it seems I need not only to deserialize but also to convert the resulting model classes to more friendly classes.

    Now I can do this easily enough by implementing some sort of converter and essentially just copying over the bits of the model I want to the new friendlier class... but copying back and forth (deserialization and serialization) like that seems like a code smell.

    Thus my question finally is this: is there a recommended practice or oft-used method for moving from the model object type to a type that is more presentable?

    Thank you
    Last edited by wolf99; Aug 18th, 2017 at 05:25 AM.
    Thanks

  2. #2
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,543

    Re: Converting from an XML model object to more useful class

    Give the properties your friendly names, then mark them with an XmlAttribute (or XmlElement) serialization attribute to map them to the appropriate unfriendly XML document item.
    Example:
    Code:
    <Serializable>
    Public Class Class1
    
        <Xml.Serialization.XmlElement(ElementName:="Bar")>
        Public Property Foo As String
    
    
    End Class
    This creates a class with a single Property. From the code, the property is called Foo... but when it's serialized to XML, the element is called "Bar".
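
For readers following along in C#, a rough equivalent of the VB example above, with a small helper to show the renamed element in the output. `RenameDemo` and `ToXml` are made-up names for illustration. XmlSerializer does not need the Serializable attribute, so it is left out here.

```csharp
using System.IO;
using System.Xml.Serialization;

public class Class1
{
    // Called Foo in code, written as <Bar> in the XML.
    [XmlElement(ElementName = "Bar")]
    public string Foo { get; set; }
}

// Hypothetical helper, only here to make the mapping visible.
public static class RenameDemo
{
    public static string ToXml(Class1 value)
    {
        var serializer = new XmlSerializer(typeof(Class1));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }
}
```

Calling `RenameDemo.ToXml(new Class1 { Foo = "hello" })` should give a document whose root contains a `<Bar>hello</Bar>` element.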

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  3. #3

    Thread Starter
    Hyperactive Member
    Join Date
    Apr 2007
    Location
    cobwebbed to PC
    Posts
    311

    Re: Converting from an XML model object to more useful class

    @techgnome: Damn it, I should have thought of this! Thanks!

    I think to completely solve my problem (it's not just the names, but the types that differ also) I can solve this in two steps:

    1) make the friendly names map to the unfriendly XML names
    2) make the friendly names not map to XML

    IIRC one can also mark the XML model names as private but still have them serialize (I'll have to double check that), so I can hide the unfriendly stuff from the user a bit more now too.
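
One way the two-step plan could look, as a sketch under assumptions: `ProfileXml`, `DataFormat` and `ProfileXmlDemo` are hypothetical names, and the enum members are invented. Note that XmlSerializer only serializes public fields and public read/write properties, so in this sketch the string-typed member has to stay public rather than private.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical enum members.
public enum DataFormat { Unknown, Text, Binary }

public class ProfileXml
{
    // Step 1: this member carries the raw string and is written as <Format>.
    [XmlElement("Format")]
    public string FormatXml
    {
        get { return Format.ToString(); }
        set
        {
            DataFormat parsed;
            Format = Enum.TryParse(value, true, out parsed) ? parsed : DataFormat.Unknown;
        }
    }

    // Step 2: the friendly, typed property is kept out of the XML.
    [XmlIgnore]
    public DataFormat Format { get; set; }
}

public static class ProfileXmlDemo
{
    public static string Serialize(ProfileXml profile)
    {
        var serializer = new XmlSerializer(typeof(ProfileXml));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, profile);
            return writer.ToString();
        }
    }

    public static ProfileXml Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(ProfileXml));
        using (var reader = new StringReader(xml))
        {
            return (ProfileXml)serializer.Deserialize(reader);
        }
    }
}
```

The consumer works with `Format` as an enum, while the document still contains a plain `<Format>` string element.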

  4. #4
    You don't want to know.
    Join Date
    Aug 2010
    Posts
    4,578

    Re: Converting from an XML model object to more useful class

    OK, short answer first, then a tougher and longer answer.

    It's not always a smell to deserialize, then convert the deserialized type to a more appropriate type. Sometimes the serialized format (like your XML) is laid out in a way that leads to clunky and bad C# types. You have two choices to handle that situation:
    1) Use a serialization library to parse to an intermediate type, then post-process the intermediate type into more convenient types.
    2) Write a custom serialization framework that parses directly to convenient types.
    It's usually harder and more error-prone to write your own parser, especially if the format is ever expected to change. So it's usually more maintainable to stick to "use the easy parser, then write some integration code".

    It is a smell in the sense of "you'd rather not have to do it", but it's an excusable smell in the sense of, "If you don't fully control your input stream, no solution smells like roses."
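
The "easy parser, then integration code" option could be sketched roughly like this, using the MessyA, SmartA and IA names from the original question. `FormatKind`, its members and `AConverter` are assumptions made up for the example.

```csharp
using System;

// Hypothetical enum members.
public enum FormatKind { Unknown, Text, Binary }

// Stand-in for the type the deserializer produces.
public class MessyA
{
    public string Format;
}

// What the consumer sees.
public interface IA
{
    FormatKind Format { get; }
}

public class SmartA : IA
{
    public FormatKind Format { get; set; }
}

// Integration code: copies between the XML model and the friendly type.
public static class AConverter
{
    public static SmartA ToSmart(MessyA messy)
    {
        if (messy == null) throw new ArgumentNullException("messy");
        FormatKind parsed;
        bool ok = Enum.TryParse(messy.Format, true, out parsed);
        return new SmartA { Format = ok ? parsed : FormatKind.Unknown };
    }

    public static MessyA ToMessy(IA smart)
    {
        if (smart == null) throw new ArgumentNullException("smart");
        return new MessyA { Format = smart.Format.ToString() };
    }
}
```

Keeping the copying in one converter class means the consumer only ever sees IA, and the XML model can be regenerated without touching the friendly types.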

    Longer answer:

    You are parsing XML that is a stream of two different kinds of objects. Let's say Peas and Carrots. (Part of the reason people hate Algebra is it's easier to lose track of A, B, Q, and C than Dogs, Cats, Lemons, and Lizards.)

    In easy-to-parse XML formats, you make sure not to mix your Peas with your Carrots:
    Code:
    <Plate>
        <Peas>
            <Pea id="1" />
            <Pea id="2" />
        </Peas>
        <Carrots>
            <Carrot id="1" />
        </Carrots>
    </Plate>
    This leads to a very natural class hierarchy (shown in VB.NET here; the C# version has the same shape):
    Code:
    Public Class Plate
        Public Property Peas() As Pea()
        Public Property Carrots() As Carrot()
    End Class
    But for some reason people also like to write "cute" XML that mixes peas and carrots. Worse, sometimes they like to use the same tag for both, because hey, they're vegetables and they want to be "extensible":
    Code:
    <Plate>
        <Vegetable type="Pea" id="1" />
        <Vegetable type="Carrot" id="2" />
        <Vegetable type="Pea" id="3" />
    </Plate>
    This is what I think you're describing, and techgnome's advice won't work for this. I urge you to change your XML format, if you can, to look more like the first example and less like the last. .NET's XML serialization framework is not sophisticated enough for rules like, "Please convert Vegetable tags with the attribute type equal to "Pea" into Pea objects". You have to write things like that yourself, and the less you have to write the better off you are.
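
To give an idea of what "write things like that yourself" means for the mixed Vegetable format, here is a sketch that uses LINQ to XML instead of XmlSerializer. The class names and `PlateParser` are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public abstract class Vegetable { public int Id { get; set; } }
public class Pea : Vegetable { }
public class Carrot : Vegetable { }

public static class PlateParser
{
    // Reads every <Vegetable> under the root and picks a class from its type attribute.
    public static List<Vegetable> Parse(string xml)
    {
        return XDocument.Parse(xml)
            .Root
            .Elements("Vegetable")
            .Select(element => ToVegetable(element))
            .ToList();
    }

    private static Vegetable ToVegetable(XElement element)
    {
        int id = (int)element.Attribute("id");
        string type = (string)element.Attribute("type");
        switch (type)
        {
            case "Pea": return new Pea { Id = id };
            case "Carrot": return new Carrot { Id = id };
            // Every new vegetable needs a new case here.
            default: throw new FormatException("Unknown vegetable type: " + type);
        }
    }
}
```

The switch statement is exactly the piece of code that has to be found and updated whenever a new vegetable is added.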

    "But what about adding new vegetables later?"

    It's just as easy to add a "Corn" element to the first file as it is to add a "Vegetable" element with attribute type set to "Corn" to the second. But the second also requires you to update an Enum, right? And you need to find the code that converts Vegetables to "better objects" and update it with a new case for Corn...

    Or you could just add the element, add the Corn class, and let .NET do the work.

    Optional Appendix

    "But wait," you might say, "Right now I know a Configuration comes before the Profile it represents! How will I tell which is which if they are separate?" Use their identifier:
    Code:
    <File>
        <Profiles>
            <Profile id="1" configId="2" ... />
            ...
        </Profiles>
        <Configurations>
            <Configuration id="2" ... /> <!-- corresponds to Profile 1 -->
            ...
        </Configurations>
    </File>
    You can see this in the .vbproj and .csproj file formats: elements that need to reference other elements use IDs. If you have a given Profile, and access to the File, you can get the Configuration that goes with it via a simple LINQ query:
    Code:
    Dim myProfile = myFile.Profiles(0)
    Dim possibleConfig = myFile.Configurations.Where(
            Function (config) config.Id = myProfile.ConfigId
        ).FirstOrDefault()
    
    If possibleConfig IsNot Nothing Then
        ' Found!
    End If
    You can use Dictionary types to speed that up if you do it frequently, but it's not required.
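
The Dictionary variant might look something like the following. It is written in C# (the thread starter's language) rather than VB, and `ProfileEntry`, `ConfigurationEntry` and `ConfigLookup` are placeholder names, not types from the thread.

```csharp
using System.Collections.Generic;
using System.Linq;

// Placeholder model types carrying only the ids used for the lookup.
public class ConfigurationEntry { public int Id { get; set; } }
public class ProfileEntry { public int Id { get; set; } public int ConfigId { get; set; } }

public static class ConfigLookup
{
    // Build the index once; ToDictionary throws if two configurations share an id.
    public static Dictionary<int, ConfigurationEntry> IndexById(IEnumerable<ConfigurationEntry> configs)
    {
        return configs.ToDictionary(c => c.Id);
    }

    // Returns null when the profile refers to a configuration that is not present.
    public static ConfigurationEntry FindFor(ProfileEntry profile, Dictionary<int, ConfigurationEntry> byId)
    {
        ConfigurationEntry found;
        return byId.TryGetValue(profile.ConfigId, out found) ? found : null;
    }
}
```

Each lookup is then a hash lookup instead of a scan over all configurations, which only starts to matter when the file is large or the lookup runs often.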

    The "friendlier" your XML is to the limited capabilities of the .NET XML serialization framework, the less work you have to do on your end. It's generally easier to update the file format than code. So do it, if you can.
    This answer is wrong. You should be using TableAdapter and Dictionaries instead.

  5. #5
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,543

    Re: Converting from an XML model object to more useful class

    Quote Originally Posted by wolf99
    it's not just the names, but the types that differ also
    That concerns me because there shouldn't be changes in types... strings are strings... dates are dates, figs are figs, and numbers are numbers. So if something is coming in the file as a date, it should be a date from the start... it shouldn't be changing its type in any way. That's just asking for more trouble.
    If you think you're having to jump through hoops like that just to do something as simple as (de)serialization ... then Sitten's right, you've got a major design flaw and the whole thing needs to be redone. Personally, if I can help it, I don't let my XML drive my code/objects, but let my objects design my XML.

    -tg

  6. #6
    You don't want to know.
    Join Date
    Aug 2010
    Posts
    4,578

    Re: Converting from an XML model object to more useful class

    Quote Originally Posted by techgnome
    Personally, if I can help it, I don't let my XML drive my code/objects, but let my objects design my XML.

    -tg
    Yeah, this is the best takeaway.

    Serialization frameworks tend to be primitive. It's hard to write a tool that can say "I can parse any kind of XML" because in XML, there are always at least 10 different ways to express even a simple object. When you pick .NET's serialization framework, you're mostly committing to an obvious and straightforward XML schema where every property is an element. You can force some to be attributes, but IMO this is more an aesthetic concern than a technical one, and if everything goes right you should never see your serialized stuff outside of testing!

    When you step away from "the most obvious way to make XML", you tend to have to write your own parser.

  7. #7

    Thread Starter
    Hyperactive Member
    Join Date
    Apr 2007
    Location
    cobwebbed to PC
    Posts
    311

    Re: Converting from an XML model object to more useful class

    Thanks for all the info folks.

    I realised why I hadn't used the XMLAttribute avenue to begin with - I had been trying to keep the classes generated by XSD.exe untouched, so as to make it easy to regenerate in future if need be. This idea of changing the attributes of members generated as part of these classes obviously necessitates work to the generated class (as required by part 1 of my great master plan above) as opposed to work to another class or to the other partial class parts. I guess this is letting my XML drive my object design...

    I got halfway through writing a spiel explaining why this presented problems and then thought, well, I should just move away from using XSD.exe generation and write the class myself, driving the XML from the object, and things would be a lot easier.

    This still leaves a problem of naming types, though. For example, the ReadProtocol class property is of a complex type, because it has attributes. The complex type is also named ReadProtocol, which works fine. But the value of the element is an enum, a simple type in XSD, and this is also called ReadProtocol. That means there are two types with the same name, which makes the schema invalid.

    The XML looks something like:

    xml Code:
    <ReadProtocol Length="10" Variable="False">FooReadProtocolEnumMember</ReadProtocol>

    I guess the best thing to do is, like @SittenSpynne said, to not use attributes but instead:

    xml Code:
    <ReadProtocol>FooReadProtocolEnumMember</ReadProtocol>
    <ReadLength>10</ReadLength>
    <ReadLengthIsVariable>False</ReadLengthIsVariable>

    Thinking as I write: this seems overly verbose, but it actually matches what I have ended up forcing my deserialized data into anyway. Does all this seem more logical?

  8. #8
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,543

    Re: Converting from an XML model object to more useful class

    Nope... that's not right either... what if you have two?
    Code:
    <ReadProtocol>FooReadProtocolEnumMember</ReadProtocol>
    <ReadLength>10</ReadLength>
    <ReadLengthIsVariable>False</ReadLengthIsVariable>
    <ReadProtocol>FooReadProtocolEnumMember</ReadProtocol>
    <ReadLength>10</ReadLength>
    <ReadLengthIsVariable>False</ReadLengthIsVariable>
    ???
    Uh... no. Since the ReadLength and ReadLengthIsVariable belong to ReadProtocol, they should either be attributes (I'd argue this is OK as it's metadata describing the property) OR child elements of the ReadProtocol node:
    Code:
    <ReadProtocol>FooReadProtocolEnumMember
        <ReadLength>10</ReadLength>
        <ReadLengthIsVariable>False</ReadLengthIsVariable>
    </ReadProtocol>
    I'm not a huge fan of this because the value (FooReadProtocolEnumMember) can get lost in there... but that's easily solved by:
    Code:
    <ReadProtocol>
        <Name>FooReadProtocolEnumMember</Name>
        <ReadLength>10</ReadLength>
        <ReadLengthIsVariable>False</ReadLengthIsVariable>
    </ReadProtocol>
    Now things are more easily identifiable... This is probably what I'd start with.

    Now if you want more than one:
    Code:
    <ReadProtocols>
        <ReadProtocol>
            <Name>FooReadProtocolEnumMember</Name>
            <ReadLength>10</ReadLength>
            <ReadLengthIsVariable>False</ReadLengthIsVariable>
        </ReadProtocol>
        <ReadProtocol>
            <Name>BarReadProtocolEnumMember</Name>
            <ReadLength>10</ReadLength>
            <ReadLengthIsVariable>False</ReadLengthIsVariable>
        </ReadProtocol>
        <ReadProtocol>
            <Name>FooBarReadProtocolEnumMember</Name>
            <ReadLength>10</ReadLength>
            <ReadLengthIsVariable>False</ReadLengthIsVariable>
        </ReadProtocol>
    </ReadProtocols>
    It all ties up nice and neat.

    -tg

  9. #9

    Thread Starter
    Hyperactive Member
    Join Date
    Apr 2007
    Location
    cobwebbed to PC
    Posts
    311

    Re: Converting from an XML model object to more useful class

    Thanks for the info techgnome; that sounds logical enough in the general case...

    However, in this specific case a command may only ever have 0 or 1 ReadProtocol; that is, the <ReadProtocol> element has maxOccurs="1" and minOccurs="0". Which is to say that a given command may either be readable or not. If it is readable, then it is read using a certain protocol. A command cannot be read using multiple protocols, as a command with a different protocol would be a different command. So perhaps your starting problem is sort of a non-problem in this case?

    Nevertheless, to follow it down to:

    xml Code:
    <ReadProtocol>
      <Name>FooReadProtocolEnumMember</Name>
      <ReadLength>10</ReadLength>
      <ReadLengthIsVariable>False</ReadLengthIsVariable>
    </ReadProtocol>

    This brings an issue with friendly names for types. A <ReadProtocol> should probably map to a class ReadProtocol type. The type that collects possible read protocols for Name should probably be enum ReadProtocol. Now we have a type clash in the XSD (ReadProtocol simple type for the enum and same for complex type for the element with nested elements) and also for the C# (ReadProtocol for the enum and same for the class).

    The <ReadProtocol> complex type could possibly be anonymous (in the case where there is a maximum of 1 read protocol), but that looks bad, does not work if there are to be multiple read protocols, and still does not solve the clash in C#. The alternative is for the Name type to be enum ReadProtocolName, but again that starts to seem ... clunky, and as such seems like a code smell maybe?

    PS: I cannot just use Protocol for either the class or the enum type name to avoid this clash, because that would then clash with solving the same problem for the WriteProtocol which obviously has a different set of protocol names and so a different enum.
    Last edited by wolf99; Aug 24th, 2017 at 04:42 AM.

  10. #10
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,543

    Re: Converting from an XML model object to more useful class

    Now you're catching on to the problems with letting the XML drive the design of the classes and objects rather than the other way around. Eventually no one should be looking at the XML anyways; it isn't for human consumption. But the code is. So that's the important part. Solve the code/class side of things and you'll find that the XML falls into place, and 10 minutes later you'll find you don't really care what the XML even looks like, just that it works.
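
As an illustration of designing the classes first and letting the XML follow, here is a sketch for the single ReadProtocol case. It borrows the ReadProtocolName idea from the post above; the enum members and the `ReadProtocolXml` helper are assumptions, not code from the thread.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical members, named after the placeholder values used in the thread.
public enum ReadProtocolName { FooReadProtocolEnumMember, BarReadProtocolEnumMember }

public class ReadProtocol
{
    public ReadProtocolName Name { get; set; }
    public int ReadLength { get; set; }
    public bool ReadLengthIsVariable { get; set; }
}

public static class ReadProtocolXml
{
    public static string Serialize(ReadProtocol protocol)
    {
        var serializer = new XmlSerializer(typeof(ReadProtocol));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, protocol);
            return writer.ToString();
        }
    }

    public static ReadProtocol Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(ReadProtocol));
        using (var reader = new StringReader(xml))
        {
            return (ReadProtocol)serializer.Deserialize(reader);
        }
    }
}
```

With no serialization attributes at all, the element names come straight from the property names. XmlSerializer writes booleans in lower case (`false`), so the output differs slightly from the hand-written `False` in the samples above.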

    -tg
