I wondered whether UpdateModel is considered an "expensive" operation (due to Reflection lookup of the model properties), especially when seen in the context of a larger web application (think StackOverflow)?
I don't want to engage in premature optimization but I consider it a design choice to use UpdateModel which is why I'd like to know early whether it is advisable or not to go with it. The other (tedious) choice is writing my own UpdateModel method for various domain objects with fixed properties.
Thank you!
You are smart not to want to engage in premature optimization, especially since this "optimization" would favor the processor's time over yours, which is far more expensive.
The primary rule of optimization is to optimize the slow stuff first. So consider how often you actually update a model versus selecting from your database backend. I'm guessing it's 1/10 as often or less. Now consider the cost of selecting from the database backend versus the cost of reflection. The cost of reflection is measured in milliseconds. The cost of selecting from the database backend can be measured in seconds at worst. My experience is that POSTs are rarely very slow, and when they are it's usually the database at fault rather than the reflection. I think you're likely to spend most of your optimization time on GETs.
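For context, here is a minimal sketch of the two approaches being weighed, assuming a hypothetical Product model in a plain ASP.NET MVC controller (none of these names come from the original question):

```csharp
using System.Web.Mvc;

// Hypothetical model used only for illustration.
public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class ProductsController : Controller
{
    [HttpPost]
    public ActionResult Edit(FormCollection form)
    {
        var product = new Product();

        // Option 1: reflection-based binding via UpdateModel.
        UpdateModel(product, new[] { "Name", "Price" });

        // Option 2: hand-written mapping with fixed properties (no reflection).
        // product.Name = form["Name"];
        // product.Price = decimal.Parse(form["Price"]);

        // ... persist the product, then redirect ...
        return RedirectToAction("Index");
    }
}
```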
Compared to network latency, database calls and general IO, the UpdateModel() call is trivial and I wouldn't bother with it.
I think UpdateModel is a bit of a shortcut that causes a huge amount of coupling between the view and the model.
I choose not to use "built-in" models (like passing LINQ-created objects from the database directly to the view) because I want the option to replace my model with something more sophisticated, or even just another database provider. It is very tempting to use LINQ to SQL (or ADO.NET Entities) for fast prototyping, though.
What I tend to do is create my MVC application, then expose a 'service' layer which is then connected to a 'model' (which is an OO view of my domain). That way I can easily create a web service layer, swap databases, write new workflows etc without concern.
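As a rough illustration of that layering (all names here are hypothetical, not from the original answer), the controller talks only to a service interface, and the service talks to the domain and data access behind it:

```csharp
using System.Web.Mvc;

// Hypothetical domain object and abstractions, for illustration only.
public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public interface IOrderRepository
{
    Order FindById(int id);
    void Save(Order order);
}

public interface IOrderService
{
    Order GetOrder(int id);
    void PlaceOrder(Order order);
}

public class OrderService : IOrderService
{
    private readonly IOrderRepository _repository;

    public OrderService(IOrderRepository repository)
    {
        _repository = repository;
    }

    public Order GetOrder(int id)
    {
        return _repository.FindById(id);
    }

    public void PlaceOrder(Order order)
    {
        // Workflow / domain rules live here, not in the controller
        // and not in the database access code.
        _repository.Save(order);
    }
}

// The MVC controller depends only on the service interface,
// so the database (or the whole model) can be swapped later.
public class OrdersController : Controller
{
    private readonly IOrderService _orders;

    public OrdersController(IOrderService orders)
    {
        _orders = orders;
    }
}
```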
(and make sure you write your tests and use DI - it saves a lot of hassle!)
Rob
Related
I've inherited a project that is massive in scale and a bit of a labyrinth. The traffic is substantial enough to want to optimize the data access, so I've begun converting some WCF services that are invoked by JavaScript to Web API.
Unfortunately, the database's primary keys (not auto-incrementing) are also managed by a custom ORM, which queries a MySQL function that returns the next set of IDs to be used. The ORM then caches them and serves them up to the application. The database itself is an ever-growing 2 TB of data, which would make downtime significant.
I had planned on using Dapper, as I've enjoyed its ease and performance in the past, but weaning this database off of this custom ORM seems daunting and prone to error.
Questions:
Does anyone have any advice for tackling a large scale project?
Should I focus more on migrating the database into an entirely new data structure? (it needs significant normalization, too!)
My humble opinion:
A rule of thumb when you deal with legacy code is: if something works, keep it that way. If necessary, make a change or an improvement. The main reasons are:
Redesign by itself adds almost nothing when what you really want is to add business value to the system.
The system, good or bad, works. No matter how careful you are, you can always break something with a structural change.
Besides, the reasons to change (adding a feature, fixing a bug, improving the design, or optimizing resource usage) depend a lot on the company's plans. My humble experience tells me that time and budget are very important, and although we always want to redesign (or in some cases, rewrite from scratch), what matters most is the business objectives and the added value.
In your case, maybe it's not necessary to change the whole ORM. Since the IDs are cached, a better approach might be to modify the PKs, adding the identity property to them (with the proper starting value on each table). After that, you can delete the particular part of the code that gets the next IDs.
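If the keys do become identity (AUTO_INCREMENT) columns, the Dapper side stays very small. A minimal sketch, assuming a hypothetical Orders table and MySQL's LAST_INSERT_ID() (none of these names come from the question):

```csharp
using Dapper;
using MySql.Data.MySqlClient;

public class OrderRepository
{
    private readonly string _connectionString;

    public OrderRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public long Insert(string customerName)
    {
        using (var db = new MySqlConnection(_connectionString))
        {
            // Keep one open connection so LAST_INSERT_ID() sees our INSERT.
            db.Open();

            // The database hands out the key, so there is no call to the
            // custom ID-dispensing function or the ORM's ID cache.
            db.Execute(
                "INSERT INTO Orders (CustomerName) VALUES (@CustomerName);",
                new { CustomerName = customerName });

            return db.ExecuteScalar<long>("SELECT LAST_INSERT_ID();");
        }
    }
}
```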
In some cases, an unnormalized database has its reasons. I've seen cases in which data is copied into tables to avoid joins that hurt performance. I'm talking about millions of records...
Reasons to change the ORM: maybe it's inefficient, or it doesn't release unmanaged resources (in which case a better approach is to implement the IDisposable interface). If it works, a better approach might be to use a new ORM only where you need to create new functionality. If the project needs refactoring for optimization purposes, the change needs to be applied to the bottlenecks, not the entire codebase.
There is a lot of discussion about this topic. Good resources are "Working Effectively with Legacy Code" by Michael Feathers and "Getting Started with DDD When Surrounded by Legacy Systems" by Eric Evans.
Greetings!
I'm currently working with a system that has inherited a DAL using .NET's strongly typed datasets. I had never worked with them before this, but I'm finding that I have a strong aversion to using them. Compared to a POCO-based DAL, they seem clunky and difficult to manage, and the resulting objects are highly coupled with database-specific concerns (e.g., accessing objects through tables and rows, getting desired data by key values, etc.). Isn't the entire purpose of a DAL to abstract this away from the logic layers?
There has been some discussion about rewriting and/or re-factoring parts of the database layer. I, personally, would like to see these datasets removed, but I'm having a hard time convincing some of my colleagues, who are used to using them.
What are some of the pros and cons of using strongly typed datasets vs. a POCO-based DAL? Am I justified in my aversion to strongly typed datasets, or is the community consensus that they aren't a problem? Are there any other solutions that I'm missing?
Although I also agree that there are benefits to using an ORM framework like NHibernate, I think a library of that complexity would be a hard sell to my colleagues. If anyone can provide a compelling enough argument in this direction, I would like to hear it.
Strongly typed datasets were an easy way to do a designer-based approach to database access. They could be generated from the database and are fairly easy to update. They also have the benefit of enforcing data types.
You can think of them as a transitional stage between raw ADO.NET with DataSets, DataTables, and DataAdapters, and Entity Framework. I would try presenting Entity Framework, using the designer and database-first code generation, to your colleagues as a replacement for the current method. It should be a familiar pattern and will allow them to transition more easily. It should also be the lowest amount of additional work to retrofit the existing code.
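To make the contrast concrete, here is a hedged sketch of what the EF side could look like; the entity and context names are placeholders, and the typed-dataset call shown in the comment uses designer-generated names that will differ in your project:

```csharp
using System.Data.Entity;
using System.Linq;

// POCO entity; with database-first, the designer generates classes shaped
// like this from the existing schema. All names here are placeholders.
public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
}

public static class CustomerQueries
{
    public static string GetCustomerName(int id)
    {
        // Typed-dataset equivalent, using designer-generated names (for contrast):
        //   var table = new CustomersTableAdapter().GetDataByCustomerId(id);
        //   return table[0].Name;

        using (var context = new ShopContext())
        {
            return context.Customers
                          .Where(c => c.CustomerId == id)
                          .Select(c => c.Name)
                          .Single();
        }
    }
}
```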
You can use that introduction to work on their comfort level and later start to introduce POCOs, LINQ, and separation of concerns in new projects. Remember that, typically, the faster the pace of change (and/or the higher the workload), the greater the resistance. If you can present new methodologies in bite-sized pieces and as safe-feeling proofs of concept, you'll be much better received. Change is risk, so managing both perceptions and potential work expansion due to unknowns is important.
In my last question I posted some sample code on how I was trying to achieve separation of concerns. I received some ok advice, but I still just don't "get it" and can't figure out how to design my app to properly separate concerns without designing the next space shuttle.
The site I am working on (slowly converting from old ASP section by section) is moderately sized with several different sections including a store (with ~100 orders per day) and gets a decent amount of traffic (~300k uniques/month). I am the primary developer and there might be at most 2-3 devs that will also work on the system.
With this in mind, I am not sure I need full enterprise-level architecture (correct me if I am wrong), but since I will be working on this code for the next few years, I want it to perform well and also be easy to extend as needed. I am learning C# and trying to incorporate best practices from the beginning. The old ASP site was a spaghetti mess and I want to avoid that this time around.
My current stab at doing this ended up being a bunch of DTOs with services that validate and make calls to a DAL layer to persist. It was not intentional, but I think the way it is set up now is a perfect anemic domain model. I have been trying to combat this by turning my BLL into domain objects and only using the DTOs to transfer data between the DAL and BOs, but it is just not working. I also had all my DTOs/BLLs split up according to the database tables / functionality (e.g., for a YouTube-style app, I have separate DTO/BLL/DAL classes for segments, videos, files, comments, etc.).
From what I have been reading, I need to be at least using repositories and probably interfaces as well. This is great, but I am unsure how to move forward. Please help!
From what I can see you have four points that need addressing:
(1) "With this in mind, I am not sure I need full enterprise level architecture"
Let's deal with the high-level fluff first. It depends on what you mean by "full enterprise level architecture", but the short answer is "yes": you need to address many aspects of the system (and the context of the system will determine which are the main ones). If nothing else, the key ones would be Change and Supportability. You need to structure the application in a way that supports changes in the future: logical and physical separation of concerns (Dependency Injection is great for the latter), modular design, etc.
(2) "How to properly separate concerns in my architecture without designing a spacecraft?"
I like this approach (it's an article I wrote that distilled everything I had learnt up to that point) - but here's the gist:
Looking at this you'll have a minimum of six assemblies - and that's not huge. If you can break your system down (separate concerns) into these large buckets it should go a long way to giving what you need.
(3) Detail
Separating concerns into different layers and classes is great, but you need to go further than that if you want to be able to deal effectively with change. Dependency Injection (DI) is a key tool to use here. When I learnt DI it was a hand-rolled affair (as shown in the previous link), but there are lots of frameworks for it now. If you're new to DI (and you work in .NET), the article will step you through the basics.
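A minimal hand-rolled sketch of what that looks like in code (all names are made up for illustration; a DI container just automates the wiring shown at the bottom):

```csharp
// The consumer depends on an abstraction, not a concrete data access class.
public interface IVideoRepository
{
    Video GetById(int id);
}

public class SqlVideoRepository : IVideoRepository
{
    public Video GetById(int id)
    {
        // The ADO.NET / ORM call would live here.
        return new Video { Id = id };
    }
}

public class VideoService
{
    private readonly IVideoRepository _videos;

    // The dependency is supplied from the outside, so it can be swapped
    // (e.g. with a fake in unit tests) without touching this class.
    public VideoService(IVideoRepository videos)
    {
        _videos = videos;
    }

    public Video GetVideo(int id)
    {
        return _videos.GetById(id);
    }
}

public class Video
{
    public int Id { get; set; }
}

// Composition root (this is what a DI framework automates):
// var service = new VideoService(new SqlVideoRepository());
```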
(4) How to move forward
Get a simple vertical slice (UI all the way to the DB) working using DI, etc. As you do this you'll also be building the bones of the framework (sub-systems and major plumbing) that your system will use.
Having got that working start on a second slice; it's at this point that you should uncover any places where you're inadvertently not reusing things you should be - this is the time to change those before you build slices 3,4 and 5 - before there's too much rework.
Updates for Comments:
Do you think I should completely drop web forms and take up MVC from scratch, or just go with what I know for now?
I have no idea, but for the answer to be 'yes' you'd need to be able to answer the following questions with 'yes':
We have the required skills and experience to use and support MVC.
We have time to make the change (there is clear benefit in making this change).
We know MVC is better suited for our needs.
Making this change does not put successful delivery at risk.
...do I need to move to projects and set up each of these layers as a separate project?
Yes. Projects map 1-to-1 with assemblies, so to get the benefits of loose coupling you'll definitely want to separate things that way, and be careful how you set references.
when you refer to POCOs, are you meaning just DTOs or rich domain objects?
DTO, not Rich Domain Object. BUT, people seem to use the terms POCO and DTO interchangeably when strictly speaking they aren't - if you're from the Martin Fowler school of thought. In his view a DTO would be a bunch of POCOs (or other objects) parcelled together for sending "across the wire", so that you only make one call to some external system and not lots of calls.
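To illustrate the distinction (hypothetical names, not from the question):

```csharp
using System.Collections.Generic;

// POCO: a plain class representing one business concept.
public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// DTO in the Fowler sense: several POCOs parcelled together so a remote
// caller can fetch everything in one round trip instead of many calls.
public class CustomerOrdersDto
{
    public Customer Customer { get; set; }
    public List<Order> RecentOrders { get; set; }
}
```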
Everyone says I should not expose my data structures to my UI, but I say why not?
Managing dependencies. What you don't want is for your UI to reference the physical data structure, because as soon as that changes (and it will) you'll be (to use the technical term) screwed. This is the whole point of layering. What you want is to have the UI depend on abstractions, not implementations. In the 5-Layer Architecture the POCOs are safe to use for that because they are an abstract / logical definition of 'some thing' (a business concept), so they should only change if there is a business reason; in that sense they are fairly stable and safer to depend on.
If you are in the process of rewriting your eCommerce site, you should at least consider replacing it with a standard package.
There are many more such packages available today. So although the decision to build the original site may have been correct, it is possible that building a custom app is no longer the correct decision.
There are several eCommerce platforms listed here: Good e-commerce platform for Java or .NET
It should cost much less than the wages of 2-3 developers.
I believe that the best way to save your application state is to a traditional relational database, whose table structure for the most part represents the data model of our system, plus metadata.
However, other guys on my team think that today it's best to simply serialize the entire object graph to a binary or XML file.
No need to say (but I'll still say it) that World War 3 is going on between us over this issue.
Personally I hate serialization because:
1. The saved data is tied to your development platform (C# in my case). No other platform, like Java or C++, can use this data.
2. The entire object graph (including the whole inheritance chain) is saved, not only the data we need.
3. Changing the data model might cause severe backward-compatibility issues when trying to load old states.
4. Sharing parts of the data between applications is problematic.
I would like to hear your opinion about that.
You didn't say what kind of data it is -- much depends on your performance, simultaneity, installation, security, and availability/centralization requirements.
If this data is very large (e.g. many instances of the objects in question), a database can help performance via its indexing capabilities. Otherwise it probably hurts performance, or is indistinguishable.
If your app is being run by multiple users simultaneously, and they may want to write this data, a database helps because you can rely on transactions to ensure data integrity. With file-based persistence you have to handle that yourself. If the data is single-user or single-instance, a database is very likely overkill.
If your app has its own soup-to-nuts installation, using a database places an additional burden on the user, who must set up and maintain (apply patches etc.) the database server. If the database can be guaranteed to be available and is handled by someone else, this is less of an issue.
What are the security requirements for the data? If the data is centralized, with multiple users (either simultaneous or sequential), you may need to manage security and permissions on the data. Without seeing the data it's hard to say whether it would be easier to manage with file-based persistence or a database.
If the data is local-only, many of the above questions about the data have answers pointing toward file-based persistence. If you need centralized access, the answers generally point toward a database.
My guess is that you probably don't need a database, based solely on the fact that you're asking about it mainly from a programming-convenience perspective and not a data-requirements perspective. Serialization, especially in .NET, is highly customizable and can be easily tailored to persist only the essential pieces you need. There are well-known best practices for versioning this data as well, so I'm not sure there's an advantage on the database side from that perspective.
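For example (a sketch only, with made-up names), opt-in contracts such as DataContractSerializer let you persist just the essential members rather than the whole graph:

```csharp
using System.IO;
using System.Runtime.Serialization;

// Opt-in serialization: only the members we mark are persisted,
// not the entire object graph. Names are illustrative.
[DataContract]
public class GameState
{
    [DataMember]
    public int Level { get; set; }

    [DataMember]
    public int Score { get; set; }

    // Not marked, so it is never written to the saved state.
    public object RuntimeCache { get; set; }
}

public static class StateStore
{
    public static void Save(GameState state, string path)
    {
        var serializer = new DataContractSerializer(typeof(GameState));
        using (var stream = File.Create(path))
        {
            serializer.WriteObject(stream, state);
        }
    }

    public static GameState Load(string path)
    {
        var serializer = new DataContractSerializer(typeof(GameState));
        using (var stream = File.OpenRead(path))
        {
            return (GameState)serializer.ReadObject(stream);
        }
    }
}
```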
About cross-platform concerns: If you do not know for certain that cross-platform functionality will be required in the future, do not build for it now. It's almost certainly easier overall to solve that problem when the time comes (migration etc.) than to constrain your development now. More often than not, YAGNI.
About sharing data between parts of the application: That should be architected into the application itself, e.g. into the classes that access the data. Don't overload the persistence mechanism to also be a data conduit between parts of the application; if you overload it that way, you're turning the persisted state into a cross-object contract instead of properly treating it as an extension of the private state of the object.
It depends on what you want to serialize, of course. In some cases serialization is ridiculously easy.
(I once wrote a kind of timeline program in Java, where you could draw, drag around, and resize objects. When you were done you could save it to a file (like myTimeline.til). At that moment hundreds of objects were saved: their position on the canvas, their size, their colors, their inner texts, their special effects, ... You could then of course open myTimeline.til and keep working. All this took only a few lines of code (I just made all the classes and their dependencies serializable), and my coding time was less than 5 minutes; I was astonished myself! (It was the first time I had ever used serialization.) Working on a timeline, you could also 'Save As' for different versions, and the .til files were very easy to back up and mail. I think in my particular case it would have been a bit silly to use a database. But that's of course for document-like structures only, like Word documents, to name one.)
So my first point: there are certainly scenarios in which a database wouldn't be the best solution. Serialization was not invented by developers just because they were bored.
As for your points:
1. Not true if you use XML serialization or SOAP.
2. Not quite relevant anymore.
3. Only if you are not careful; there are plenty of 'best practices' for that.
4. Only if you want it to be problematic; see point 1.
Of course, besides speed of implementation, serialization has other important advantages, like not needing a database at all in some cases!
See this Stackoverflow posting for a commentary on the applicability of XML vs. the applicability of a database management system. It discusses an issue that's quite similar to the subject of the debate in your team.
You have some good points. I pretty much agree with you, but I'll play the devil's advocate.
1. Well, you could always write a converter in C# to extract the data later if needed.
2. That's a weak point, because disk space is cheap and the amount of extra bytes we'll use costs far less than the time we'll waste trying to get this all to work your way.
3. That's the way of the world. Burn the bridges and require upgrades. Convert the data, or make a tool to do that, and then no longer support the old version's way of doing it.
4. Not if the C# program hands off the data to the other applications. Other applications should not be accessing the data that belongs to this application directly, should they?
For transfer and offline storage, serialization is fine; but for active use, some kind of database is far preferable.
Typically (as you say), without a database, you need to deserialize the entire stream to perform any query, which makes it hard to scale. Add the inherent issues with threading etc, and you're asking for pain.
Some of your other pain points about serialization aren't necessarily true, as long as you pick wisely. Obviously, BinaryFormatter is a bad choice for portability and versioning, but "protocol buffers" (Google's serialization format) has implementations for Java, C++, C#, and a lot of other languages, and is designed to be version tolerant.
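For instance, with protobuf-net (one of the .NET implementations; the types below are made up for illustration), members are identified by stable numeric tags, which is what makes adding fields in later versions safe:

```csharp
using System.IO;
using ProtoBuf; // protobuf-net NuGet package

// Contract-based, version-tolerant serialization: members are identified
// by numeric tags rather than by the .NET type layout. Names are illustrative.
[ProtoContract]
public class Person
{
    [ProtoMember(1)]
    public int Id { get; set; }

    [ProtoMember(2)]
    public string Name { get; set; }

    // A field added in a later version; older readers simply skip unknown tags.
    [ProtoMember(3)]
    public string Email { get; set; }
}

public static class PersonStore
{
    public static void Save(Person person, string path)
    {
        using (var stream = File.Create(path))
        {
            Serializer.Serialize(stream, person);
        }
    }

    public static Person Load(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            return Serializer.Deserialize<Person>(stream);
        }
    }
}
```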
Just make sure you have a component that handles saving/loading state with a clean interface to the rest of your application. Then whatever choice you make for persistence can easily be revisited later.
Serializing an object graph to a file might be a good quick and dirty initial solution that is very quick to implement.
But if you start to run into issues that make a database a better choice you can plug in a new version with little or no impact on the rest of the application.
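A hedged sketch of that idea, with made-up names: hide the persistence choice behind a small interface so the quick file-based version can later be replaced by a database-backed one with little impact elsewhere.

```csharp
using System.IO;
using System.Xml.Serialization;

// All names are illustrative.
public class ApplicationState
{
    public int Version { get; set; }
    public string CurrentUser { get; set; }
}

// Small persistence seam: the rest of the app only sees this interface.
public interface IApplicationStateStore
{
    void Save(ApplicationState state);
    ApplicationState Load();
}

// Quick-and-dirty first version: serialize the object graph to a file.
public class FileStateStore : IApplicationStateStore
{
    private readonly string _path;

    public FileStateStore(string path)
    {
        _path = path;
    }

    public void Save(ApplicationState state)
    {
        var serializer = new XmlSerializer(typeof(ApplicationState));
        using (var stream = File.Create(_path))
        {
            serializer.Serialize(stream, state);
        }
    }

    public ApplicationState Load()
    {
        var serializer = new XmlSerializer(typeof(ApplicationState));
        using (var stream = File.OpenRead(_path))
        {
            return (ApplicationState)serializer.Deserialize(stream);
        }
    }
}

// If a database becomes the better choice later, add e.g. a DatabaseStateStore
// implementing the same interface and swap it in where the store is constructed.
```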
Yes, probably true. The downside is that you must retrieve the whole object graph, which is like retrieving all rows from a table. If it's big, that will be a drawback. But if it isn't so big (and my hobby projects are not), maybe it's a perfect match?
Are there any performance implications with using the provider pattern?
Does it rely on reflection for each instantiation or anything?
Yes, the provider model usually involves a small amount of reflection, and therefore there is going to be a little bit of a performance hit. However, it is only in the instantiation of the provider object; once the object is instantiated, it is accessed as normal (usually via an interface). The performance versus a hard-coded model should show very little difference, and the gain you get from the programming perspective far outweighs any performance penalty, assuming the provider actually may change one day. If not, just hard-code it.
Providers are instantiated once per app domain. Although newing up an object via reflection is slower than doing it inline, it is still very, very fast. I would say there is no performance concern for most business apps.
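For reference, a rough sketch of the pattern being described (hypothetical names and config key): the type name comes from configuration, reflection is used once to create the instance, and every call after that goes through an ordinary interface reference.

```csharp
using System;
using System.Configuration;

// Hedged sketch of a hand-rolled provider model; names are illustrative.
public interface ILogProvider
{
    void Write(string message);
}

public static class LogProviderFactory
{
    // One reflective instantiation, then the instance is reused
    // for the lifetime of the app domain.
    private static readonly Lazy<ILogProvider> _provider =
        new Lazy<ILogProvider>(Create);

    public static ILogProvider Provider
    {
        get { return _provider.Value; }
    }

    private static ILogProvider Create()
    {
        // e.g. <add key="logProviderType" value="MyApp.FileLogProvider, MyApp" />
        string typeName = ConfigurationManager.AppSettings["logProviderType"];
        Type providerType = Type.GetType(typeName, throwOnError: true);
        return (ILogProvider)Activator.CreateInstance(providerType);
    }
}

// After the single reflective call, every use is a plain interface call:
// LogProviderFactory.Provider.Write("hello");
```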