I have to apply the [Serializable()] attribute to all my classes. Is there any way to make classes serializable globally, instead of applying the attribute to each class individually?
No, there isn't a way of applying this globally - you'd have to visit each type and add the attribute.
However: applying this globally is a really, really bad idea. Knowing exactly what you're serializing, when, and why is really important - whether this is for session-state, primary persistence, cache, or any other use-case. Statements like
I have to apply [Serializable()] attribute for all classes
tell me that you are not currently in control of what you are storing.
Additionally, since [Serializable] maps (usually) to BinaryFormatter, it is important to know that there are a lot of ways (when using BinaryFormatter) in which it is possible to accidentally drag unexpected parts of your model into the serialized data. The most notorious of these is "events", but: there are others.
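For illustration, a minimal sketch (with a hypothetical Order type) of the event problem: the compiler-generated backing field of a field-like event is serialized along with everything else, unless you exclude it explicitly.

using System;

[Serializable]
public class Order
{
    public decimal Total { get; set; }

    // Without [field: NonSerialized], every subscriber of this event (forms,
    // services, whole object graphs) is dragged into the BinaryFormatter payload.
    [field: NonSerialized]
    public event EventHandler TotalChanged;

    protected void OnTotalChanged()
    {
        var handler = TotalChanged;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}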
When I see this type of question, what I envisage is that you're using types from your main data model as the thing that you are putting into session-state, but frankly: this is a mistake - and leads to questions like this. Instead, the far more manageable approach is to create a separate model that exists purely for this purpose:
it only has the data that you need to have available in session
it is marked [Serializable] if your provider needs that - or whatever other metadata is needed for the sole purpose for which it exists
it does not have any events
it doesn't involve any tooling like ORM contexts, database connections etc
ideally it is immutable (to avoid confusion over what happens if you make changes locally, which can otherwise sometimes behave differently for in-memory vs persisted storage)
just plain simple basic objects - very easy to reason about
can be iterated separately to your main domain objects, so you don't have any unexpected breaks because you changed something innocent-looking in your domain model and it broke the serializer
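To make the shape of such a model concrete, here is a minimal sketch (hypothetical names) of a dedicated session model along those lines: immutable, event-free, and carrying only what the session needs.

using System;

[Serializable]
public sealed class CustomerSessionModel
{
    public CustomerSessionModel(int id, string displayName)
    {
        Id = id;
        DisplayName = displayName;
    }

    public int Id { get; }
    public string DisplayName { get; }
}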
For my specific context I control the target classes. They were auto-generated based on XSDs and have huge overlaps because they represent different versions of the same class.
Each version is a huge C# class of over 5,000 lines.
Support can't be dropped for old versions. This means we always need to be able to map the domain class to several different versions and back again. There are always small but breaking changes from version to version. More than 90% of the target class is always the same, even if the code is duplicated for each version.
Currently there is one big mapping for each format, which is a horror. There is so. much. duplicated. code. Furthermore, developers tend to make updates only where they need them and skip everything else, so individual versions drift out of sync: one version gets updated to do something that the other versions don't. This is also not ideal.
So my question to you is: What strategy can you use for this kind of mapping?
Given the size of your classes, and having to maintain multiple versions, I'd suggest serializing and deserializing. Assuming that the versions otherwise approximate one another, Json.NET doing JsonConvert.DeserializeObject<TargetClass>(JsonConvert.SerializeObject(sourceClass)) should solve it, though I've not worked with models this large, so I have no idea how performant it would be.
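For example, a sketch of that round-trip using hypothetical OrderV1/OrderV2 stand-ins for the generated classes (members are matched by name, so anything that exists in only one version is dropped or left at its default):

using Newtonsoft.Json;

public class OrderV1 { public string Number { get; set; } public int Quantity { get; set; } }
public class OrderV2 { public string Number { get; set; } public int Quantity { get; set; } public string Currency { get; set; } }

public static class VersionMapper
{
    public static OrderV2 MapToV2(OrderV1 source)
    {
        // Serialize the source version and rehydrate it as the target version.
        string json = JsonConvert.SerializeObject(source);
        return JsonConvert.DeserializeObject<OrderV2>(json);
    }
}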
Alternatively, you could use a T4 template (if you're not on .NET Core, anyway) to generate the mapping using reflection into a common method or whatever.
As far as preventing the developer problem: interfaces and base classes that define as much of this centrally as possible, plus code reviews to ensure that developers make changes at the lowest layer they possibly can.
You can do some tricky things with inheritance and using alias directives, I'm pretty sure.
Something dumb like
using OldVersion = path.to.the.class.CantRenameThis;
class CantRenameThis : OldVersion
We ended up with a solution that achieved the main targets:
Decent compile-time safety to spot mapping errors
De-duplication of code
No messing with the auto-generated code
We did this by exploiting the fact that the auto-generated classes are generated as partial, which means we can extend them.
We ended up creating hierarchies of interfaces/classes looking like this:
ClassV1 implements IClassVerySpecificV1
ClassV2 implements IClassVerySpecificV2
IClassVerySpecificV1 implements SpecificA, SpecificB, SpecificC and IClassBasic
IClassVerySpecificV2 implements SpecificB, SpecificC, SpecificD and IClassBasic
A mapper would then look like:
ClassV1Mapper requires a SpecificAMapper, SpecificBMapper, SpecificCMapper and ClassBasicMapper
ClassV2Mapper requires a SpecificBMapper, SpecificCMapper, SpecificDMapper and ClassBasicMapper
This way we could map 90% of everything by just throwing everything that belongs to IClassBasic into a ClassBasicMapper.
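A condensed sketch (hypothetical names; the real classes are the generated 5,000-line monsters) of how the partial classes, interfaces and shared mapper fit together:

public interface IClassBasic { string Name { get; set; } }
public interface ISpecificA { int LegacyCode { get; set; } }

// Stand-ins for the generated halves of the classes.
public partial class ClassV1 { public string Name { get; set; } public int LegacyCode { get; set; } }
public partial class ClassV2 { public string Name { get; set; } }

// Hand-written halves: they only attach the interfaces; no generated code is touched.
public partial class ClassV1 : IClassBasic, ISpecificA { }
public partial class ClassV2 : IClassBasic { }

public class ClassBasicMapper
{
    // The shared ~90% is mapped once here, for every version.
    public void Map(IClassBasic source, IClassBasic target)
    {
        target.Name = source.Name;
    }
}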
We did run into some issues however:
As you can already guess, we end up with a LOT of interfaces. More than you want.
Sometimes a field exists across versions, but has different (enum) values. Our domain model would have the superset, with an attribute specifying which values were valid for which versions.
Using Protobuf-net, I want to know what properties of an object have been updated at the end of a merge operation so that I can notify interested code to update other components that may relate to those updated properties.
I noticed that there are a few different types of properties/methods I can add that will help me serialize selectively (Specified and ShouldSerialize). I also noticed in MemberSpecifiedDecorator that the 'read' method will set the Specified property to true when it reads. However, even if I add Specified properties for each field, I'd have to check each one (and update code when new properties were added).
My current plan is to create a custom SerializationContext.Context object, and then detect that during the deserialization process and update a list of members. However, there are quite a few places in the code I would need to touch to do that, and I'd rather use an existing mechanism if possible.
It would be much more desirable to get a list of updated member information. I realize that walking down an object graph could produce many members, but in my use case I'm not merging complex objects, just simple POCOs with value-type properties.
Getting a delta log isn't an inbuilt feature, partly because of the complexity when it comes to complex models, as you note. The Specified trick would work, although this isn't the purpose it was designed for - but to avoid adding complexity to your own code, that would be something best handled via reflection, perhaps using the Expression API for performance. Another approach might be to use a ProtoReader to know in advance which fields will be touched, but that demands an understanding of the field-number/member map (which can be queried via RuntimeTypeModel).
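As a rough illustration of that reflection-based take on the Specified trick (this is not a protobuf-net API, just a hypothetical helper over the conventional bool FooSpecified companion properties):

using System;
using System.Linq;
using System.Reflection;

public static class DeltaHelper
{
    // Returns the member names whose *Specified companion was set to true
    // during the merge; a cached/Expression-compiled variant would be
    // preferable on hot paths.
    public static string[] GetUpdatedMembers(object obj)
    {
        return obj.GetType()
                  .GetProperties(BindingFlags.Public | BindingFlags.Instance)
                  .Where(p => p.PropertyType == typeof(bool)
                           && p.Name.EndsWith("Specified", StringComparison.Ordinal)
                           && (bool)p.GetValue(obj, null))
                  .Select(p => p.Name.Substring(0, p.Name.Length - "Specified".Length))
                  .ToArray();
    }
}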
Are you using hand-crafted models? Or are you using protogen? Yet another option would be to have code in the setters that logs changes somewhere. I don't think protogen currently emits partial method hooks, but it possibly could.
But let me turn this around: it isn't a feature that is built in right now, and it is somewhat limited due to complexity anyway, but: what would a "good" API for this look like to you?
As a side note: this isn't really a common feature in serializers - you'd have very similar challenges in any mainstream serializer that I can think of.
I need to work on an application that consists of two major parts:
The business logic part with specific business classes (e.g. Book, Library, Author, ...)
A generic part that can show Books, Libraries, ... in data grids, map them to a database, and so on.
The generic part uses reflection to get the data out of the business classes without the need to write specific data-grid or database logic in the business classes. This works fine and allows us to add new business classes (e.g. LibraryMember) without the need to adjust the data grid and database logic.
However, over the years, code was added to the business classes that also makes use of reflection to get things done in the business classes. E.g. if the Author of a Book is changed, observers are called to tell the Author itself that it should add this book to its collection of books written by him (Author.Books). In these observers, not only are the instances passed, but also information that is directly derived from the reflection (the FieldInfo is added to the observer call so that the caller knows that the field "Author" of the book has changed).
I can clearly see advantages in using reflection in these generic modules (like the data grid or database interface), but it seems to me that using reflection in the business classes is a bad idea. After all, shouldn't the application work without relying on reflection as much as possible? Or is the use of reflection the 'normal way of working' in the 21st century?
Is it good practice to use reflection in your business logic?
EDIT: Some clarification on the remark of Kirk:
Imagine that Author implements an observer on Book.
Book calls all its observers whenever some field of Book changes (like Title, Year, #Pages, Author, ...). The 'FieldInfo' of the changed field is passed in the observer.
The Author-observer then uses this FieldInfo to decide whether it is interested in this change. In this case, if FieldInfo is for the field Author of Book, the Author-Observer will update its own vector of Books.
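A minimal illustration (hypothetical code, not the actual application) of the mechanism described above:

using System.Collections.Generic;
using System.Reflection;

public interface IBookObserver
{
    void FieldChanged(Book book, FieldInfo field);
}

public class Book
{
    private Author author;
    public List<IBookObserver> Observers { get; } = new List<IBookObserver>();
    public Author Author { get { return author; } }

    public void SetAuthor(Author value)
    {
        author = value;
        // The FieldInfo of the changed field is handed to every observer.
        FieldInfo changed = typeof(Book).GetField("author", BindingFlags.NonPublic | BindingFlags.Instance);
        foreach (var observer in Observers) observer.FieldChanged(this, changed);
    }
}

public class Author : IBookObserver
{
    public List<Book> Books { get; } = new List<Book>();

    public void FieldChanged(Book book, FieldInfo field)
    {
        // React only to changes of Book's "author" field; this string-based
        // check is exactly the kind of fragility discussed in the answers below.
        if (field.Name == "author" && !Books.Contains(book))
            Books.Add(book);
    }
}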
The main danger with reflection is that the flexibility can escalate into disorganized, unmaintainable code, particularly when changes are made by more junior devs who may not fully understand the reflection code, or who are so enamored of it that they use it to solve every problem, even when simpler tools would suffice.
My observation has been that over-generalization leads to over-complication. It gets worse when the actual boundary cases turn out to not be accommodated by the generalized design, requiring hacks to fit in the new features on schedule, transmuting flexibility into complexity.
I avoid using reflection. Yes, it makes your program more flexible. But this flexibility comes at a high price: There is no compile-time checking of field names or types or whatever information you're collecting through reflection.
Like many things, it depends on what you're doing. If the nature of your logic is that you NEVER compare the field names (or whatever) found to a constant value, then using reflection is probably a good thing. But if you use reflection to find field names, and then loop through them searching for the fields named "Author" and "Title", you've just created a more-complex simulation of an object with two named fields. And what if you search for "Author" when the field is actually called "AuthorName", or you intend to search for "Author" and accidentally type "Auhtor"? Now you have errors that won't show up until runtime instead of being flagged at compile time.
With hard-coded field names, your IDE can tell you every place that a certain field is used. With reflection ... not so easy to tell. Maybe you can do a text search on the name, but if field names are passed around as variables, it can get very difficult.
I'm working on a system now where the original authors loved reflection and similar techniques. There are all sorts of places where they need to create an instance of a class and instead of just saying "new" and the class, they create a token that they look up in a table to get the class name. What does this gain? Yes, we could change the table to map that token to a different name. And this gains us ... what? When was the last time that you said, "Oh, every place that my program creates an instance of Customer, I want to change to create an instance of NewKindOfCustomer." If you have changes to a class, you change the class, not create a new class but keep the old one around for nostalgia.
To take a similar issue, I make a regular practice of building data entry screens on the fly by asking the database for a list of field names, types, and sizes, and then laying it out from there. This gives me the advantage of using the same program for all the simpler data entry screens -- just pass in the table name as a parameter -- and if a field is added or deleted, zero code change is required. But this only works as long as I don't care what the fields are. Once I start having validations or side effects specific to this screen, the system is more trouble than it's worth, and I'm better off to fall back to more explicit coding.
Based on your edit, it sounds like you are using reflection purely as a mechanism for identifying fields. This is as opposed to dynamic behavior such as looking up the fields, which should be avoided when possible (since such lookups usually use strings which ruin static type safety). Using FieldInfo to provide an identifier for a field is fairly harmless, though it does expose some internals (the info class) in a way that is not entirely ideal.
I tend not to use reflection where I can help it. By using interfaces and coding against them, I can do a lot of the things that some would use reflection for.
But I'm a big fan of "if it works, it works."
Also, by using reflection you probably have something that can adapt fairly easily.
I.e. the only objection most would have is fairly religious ... and if your performance is fine and the code is maintainable and clear ... who cares?
Edit: based on your edit, I would indeed use interfaces to achieve what you want. Unless I misunderstand you.
I think it is a good idea to stay away from reflection when possible, but don't be afraid to resort to it when it provides a better or more flexible solution to your problem. The performance hit for anything but tight-loop operations is likely to be minimal in the overall scheme of an application or Web Form request.
Just a good article to share about reflection -
http://www.simple-talk.com/dotnet/.net-framework/a-defense-of-reflection-in-.net/
I tend to use interfaces in my business layer and leave the reflection to my presentation layer. This is not an absolute but rather a guideline.
When using simple DTOs in various scenarios I have frequently run into the same kind of problem and I always wondered whether there's a better way to deal with it.
The thing is, I have a business object, e.g. Asset, which has a bunch of properties, child objects and calculated fields, some of them expensive to calculate in terms of time, some of them huge in terms of data amount. I need to use a different flavor of this object in various screens in the UI, e.g.
in a tree where there is a hierarchy displayed and I don't need much more than the display name
in a grid where I'm showing just a couple of properties
in a detail pane where there's a big subset of available information, but still some of it (like mapped objects) is shown only on demand
To be able to achieve optimal performance with this scenario, I have always created different DTOs for each context, only containing the subset of information that is actually used in that context. While being a resource-optimal solution, this leads to a couple of problems:
I have a class explosion with huge number of DTO classes
I have quite a hard time coming up with different names for the same thing like AssetDtoForGridInTheOverviewScreenInTheUpperPaneAboveTheSplitter, not to mention maintaining them later
I am frequently repeating myself in the transformation methods, because there are properties that are used by most of the DTOs but not by all of them (therefore I can't put them into any superclass and reuse the transformation logic)
The technology I'm using is ASP.NET SOAP web services and C# (.NET 3.5), but I think this could be a language-agnostic problem. Any ideas are welcome.
This is a known problem with DTOs. It's described in this otherwise mediocre article on MSDN. To paraphrase: DTO is the most versatile n-tier data access pattern, but it also requires the most work.
You can address some of your issues with mapping by using convention-based mapping, such as AutoMapper.
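For instance, a small sketch with AutoMapper and hypothetical Asset/AssetGridDto types; properties with matching names are copied by convention, so the per-screen transformation methods shrink considerably (API shown as in recent AutoMapper versions):

using AutoMapper;

public class Asset { public int Id { get; set; } public string DisplayName { get; set; } public decimal Value { get; set; } }
public class AssetGridDto { public int Id { get; set; } public string DisplayName { get; set; } }

public static class AssetMapping
{
    // One configuration per scenario; matching property names need no explicit rules.
    private static readonly IMapper Mapper =
        new MapperConfiguration(cfg => cfg.CreateMap<Asset, AssetGridDto>()).CreateMapper();

    public static AssetGridDto ToGridDto(Asset asset)
    {
        return Mapper.Map<AssetGridDto>(asset);
    }
}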
When it comes to class explosion, could it be that you are using too flat data structures?
This can be difficult to tell because DTOs naturally include a great deal of semantic repetition that turns out to not be logical repetition at all. For example, even if you have semantically similar types, if one is a ViewModel and the other is a Domain Object, they may share semantic structure, but have vastly different responsibilities.
If, on the other hand, you have a lot of repetition in the same application layer (e.g. UI), you may be violating the DRY principle. In this case, it may often help to encapsulate related data in what starts out as a flat data structure into a separate class. In most UI frameworks I'm aware of, you can still databind a flat display to a hierarchically structured class.
The problem of class explosion is inherent to the DTO approach, there probably isn't much you can do about that. Be careful not to mix your view-model with your DTO model. Your DTO's should only be used to get the data from your data tier to your front end and not for presentation.
With the advent of .NET 3.5 you can choose to implement some basic, more coarse-grained DTOs and replace your ViewModel with an anonymous type that you create dynamically off your DTOs. I found this to be a very flexible solution.
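A small sketch of that idea, with a hypothetical coarse-grained AssetDto: each screen projects only the members it needs into an anonymous type instead of getting its own named class.

using System.Collections.Generic;
using System.Linq;

public class AssetDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Value { get; set; }
}

public static class AssetViews
{
    // The tree only needs an id and a display name.
    public static IEnumerable<object> TreeNodes(IEnumerable<AssetDto> assets)
    {
        return assets.Select(a => new { a.Id, a.Name }).ToList();
    }
}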
Regarding your naming conventions, it is probably useful to group your DTO's into scenarios and put them in a corresponding namespace. For example Solution.AssetManagement.Asset and Solution.AssetReporting.Asset
One advantage that comes to mind is that if you use POCO classes for ORM mapping, you can easily switch from one ORM to another, provided both support POCO.
Having an ORM with no POCO support (e.g. one where mappings are done with attributes, like the DataObjects.Net ORM) is not an issue for me; even with POCO-supporting ORMs and their generated proxy entities, you have to be aware that entities are actually DAO objects bound to some context/session, so, for example, serializing them is a problem.
POCO is all about loose coupling and testability.
When you are doing POCO, you can test your domain model (if you're doing DDD, for example) in isolation. You don't have to bother about how it is persisted, and you don't need to stub contexts/sessions to test your domain.
Another advantage is that there are fewer leaky abstractions, because persistence concerns are not pushed into the domain layer. So you are enforcing the SRP.
The third advantage I can see is that with POCO your domain model is easier to evolve and more flexible. You can add new features more easily than if it were coupled to the persistence.
I use POCO when I'm doing DDD, for example, but for some kinds of application you don't need DDD (if you're building small data-driven applications), so the concerns are not the same.
Hope this helps
None. Period. All the advantages people like throwing around are advantages that are not important in the big picture. I'd rather have a strong base class for entity objects that actually holds a lot of integrated code (like raising property change events when properties change) than write all that stuff myself. Note that I DID write a (at that time commercially available) ORM for .NET before "LINQ" or "ObjectSpaces" even existed. I've used O/R mappers for something like 15 years now, and never found a case where POCO was really worth the possible trouble.
That said, attributes MAY be bad for other reasons. I rather prefer the Fluent NHibernate approach these days - having started my own (now retired) mapper with attributes, then moved to XML based files.
The "POCO gets me nothing" theme mostly comes from the point that Entities ARE SIMPLY NOT NORMAL OBJECTS. They have a lot of additional functionality as well as limitations (like query speed etc.) that the user should please be aware of anyway. ORM's, despite LINQ, are not replacable anyway - noit if you start using their really interesting higher features. So, at the end you get POCO and still are suck with a base class and different semantics left and right.
I find that most proponents of POCO (as in: "must have", not "would be nice") normally have NOT thought their arguments through to the end. You get all kinds of pretty crappy thoughts, pretty much on the level of "stored procedures are faster than dynamic SQL" - stuff that simply does not hold true. Things like:
"I want to have them in cases where they do not need saving to the database" (use a separate object pool, never commit),
"I may want to have my own functionality in a base class" (the ORM should allow abstract entity classes without functionality, so put your OWN base class below the one of the ORM),
"I may want to replace the ORM with another one" (so never use any higher functionality, hope the ORM API is compatible, and then you STILL may have to rewrite large parts).
In general, POCO people also overlook the huge amount of work it actually takes to get it RIGHT - with stuff like transactional object updates etc., there is a TON of code in the base class. Some of the .NET interfaces are horrific to implement at a POCO level, though a lot easier if you can tie into the ORM.
Take the post of Thomas Jaskula here:
POCO is all about loose coupling and testability.
That assumes you can test databinding without having it? Testability is mock-framework stuff, and there are REALLY powerful ones that can even "redirect" method calls.
When you are doing POCO, you can test your domain model (if you're doing DDD, for example) in isolation. You don't have to bother about how it is persisted, and you don't need to stub contexts/sessions to test your domain.
Actually not true. Persistence should be part of any domain model test, as the domain model is there to be persisted. You can always test non-persistent scenarios by just not committing the changes, but a lot of the tests will involve persistence and the failure of it (e.g. invoices with invalid or missing data are not valid to be written to disk).
Another advantage is that there are fewer leaky abstractions, because persistence concerns are not pushed into the domain layer. So you are enforcing the SRP.
Actually, no. A proper domain model will never have persistence methods in the entities - that's a crap ORM to start with (user.Save()). OTOH the base class will do things like validation (IDataErrorInfo) and handle property update events on persistent fields, and in general save you a ton of time.
As I said before, some of the functionality you SHOULD have is really hard to implement with plain fields as the data store - like the ability to put an entity into an update mode, make some changes, then roll them back. Not needed? Tell that to Microsoft, who use it, when available, in their data grids (you can change some properties, then hit Escape to roll back the changes).
The third advantage I can see is that with POCO your domain model is easier to evolve and more flexible. You can add new features more easily than if it were coupled to the persistence.
Non-argument. You cannot play around adding fields to a persisted class without handling the persistence, and you can add non-persistent features (methods) to a non-POCO class just as easily as to a POCO class.
In general, my non-POCO base class did the following:
Handle property updates and IDataErrorInfo - without the user writing a line of code for fields and items the ORM could handle.
Handle object status information (New, Updated etc.). This is IMHO intrinsic information that also is pretty often pushed down to the user interface. Note that this is not a "save" method, but simply an EntityStatus property.
And it contained a number of overridable methods that the entity could use to extend the behavior WITHOUT implementing a (public) interface - so the methods were really private to the entity. It also had some internal properties, such as access to the "object manager" responsible for the entity, which was also the place to ask for other entities (submit queries), which was sometimes needed.
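As a rough sketch (not the actual code being described) of what such a non-POCO base class might look like: change notification, a simple EntityStatus, and an overridable hook, so entities don't re-implement this plumbing themselves (IDataErrorInfo omitted for brevity).

using System.ComponentModel;

public enum EntityStatus { New, Unchanged, Updated, Deleted }

public abstract class EntityBase : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    public EntityStatus Status { get; private set; }

    protected EntityBase()
    {
        Status = EntityStatus.New;
    }

    // Entities route field writes through here to get change tracking and
    // property-changed events without writing any plumbing themselves.
    protected void SetField<T>(ref T field, T value, string propertyName)
    {
        if (Equals(field, value)) return;
        field = value;
        if (Status == EntityStatus.Unchanged) Status = EntityStatus.Updated;
        OnPropertyChanged(propertyName);
    }

    // Overridable hook in the spirit of the "private to the entity" methods mentioned above.
    protected virtual void OnPropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null) handler(this, new PropertyChangedEventArgs(propertyName));
    }
}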
POCO support in an ORM is all about separation of concerns, following the Single Responsibility Principle. With POCO support, an ORM can talk directly to a domain model without the need to "muddy" the domain with data-access specific code. This ensures the domain model is designed to solve only domain-related problems and not data-access problems.
Aside from this, POCO support can make it easier to test the behaviour of objects in isolation, without the need for a database, mapping information, or even references to the ORM assemblies. The ability to have "stand-alone" objects can make development significantly easier, because the objects are simple to instantiate and easy to predict.
Additionally, because POCO objects are not tied to a data-source, you can treat them the same, regardless of whether they have been loaded from your primary database, an alternative database, a flat file, or any other process. Although this may not seem immediately beneficial, treating your objects the same regardless of source can make behaviour easy to predict and to work with.
I chose NHibernate for my most recent ORM because of the support for POCO objects, something it handles very well. It suits the Domain-Driven Design approach the project follows and has enabled great separation between the database and the domain.
Being able to switch ORM tools is not a real argument for POCO support. Although your classes may not have any direct dependencies on the ORM, their behaviour and shape will be restricted by the ORM tool and the database it is mapping to. Changing your ORM is as significant a change as changing your database provider. There will always be features in one ORM that are not available in another and your domain classes will reflect the availability or absence of features.
In NHibernate, you are required to mark all public or protected class members as virtual to enable support for lazy-loading. This restriction, though not significantly changing my domain layer, has had an impact on its design.
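As a minimal illustration of that constraint (hypothetical entities, mapping omitted): members are declared virtual so the lazy-loading proxy can override them.

public class Author
{
    public virtual int Id { get; protected set; }
    public virtual string Name { get; set; }
}

public class Book
{
    public virtual int Id { get; protected set; }
    public virtual string Title { get; set; }

    // Virtual so NHibernate can substitute a lazy proxy for the association.
    public virtual Author Author { get; set; }
}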