Is my idea for an object persistence library useful?

Is my idea for an object persistence library useful? - c#

First, I apologize if this is not an appropriate venue to ask this question, but I wasn't really sure where else to get input from.
I have created an early version of a .NET object persistence library. Its features are:
A very simple interface for persistence of POCOs.
The main thing: support for just about every conceivable storage medium. This would be everything from plain text files on the local filesystem, to embedded systems like SQLite, any standard SQL server (MySQL, postgres, Oracle, SQL Server, whatever), to various NoSQL databases (Mongo, Couch, Redis, whatever). Drivers could be written for nearly anything, so for instance you could fairly easily write a driver where the actual backing store could be a web-service.
When I first had this idea I was convinced it was totally awesome. I quickly created an initial prototype. Now, I'm at the 'hard part' where I am debating issues like connection pooling, thread safety, and debating whether to try to support IQueryable for LINQ, etc. And I'm taking a harder look at whether it is worthwhile to develop this library beyond my own requirements for it.
Here is a basic example of usage:
var to1 = new TestObject { id = "fignewton", number = 100, FruitType = FruitType.Apple };
ObjectStore db = new SQLiteObjectStore("d:/objstore.sqlite");
db.Write(to1);
var readback = db.Read<TestObject>("fignewton");
var readmultiple = db.ReadObjects<TestObject>(collectionOfKeys);
The querying interface that works right now looks like:
var appleQuery = new Query<TestObject>().Eq("FruitType", FruitType.Apple).Gt("number",50);
var results = db.Find<TestObject>(appleQuery);
I am also working on an alternative query interface that lets you just pass in something very like a SQL WHERE clause. And obviously, in the NET world it would be great to support IQueryable / expression trees.
Because the library supports many storage mediums with disparate capabilities, it uses attributes to help the system make the best use of each driver.
[TableName("AttributeTest")]
[CompositeIndex("AutoProperty","CreatedOn")]
public class ComplexTypesObject
{
[Id]
public string id;
[QueryableIndexed]
public FruitType FruitType;
public SimpleTypesObject EmbeddedObject;
public string[] Array;
public int AutoProperty { get; set; }
public DateTime CreatedOn = DateTime.Now;
}
All of the attributes are optional, and are basically all about performance. In a simple case you don't need any of them.
In a SQL environment, the system will by default take care of creating tables and indexes for you, though there is a DbaSafe option that will prevent the system from executing DDLs.
It is also fun to be able to migrate your data from, say, a SQL engine to MongoDB in one line of code. Or to a zip file. And back again.
OK, The Question:
The root question is "Is this useful?" Is it worth taking the time to really polish, make thread-safe or connection pooled, write a better query interface, and upload somewhere?
Is there another library already out there that already does something like this, NAMELY, providing a single interface that works across multiple data sources (beyond just different varieties of SQL)?
Is it solving a problem that needs to be solved, or has someone else already solved it better?
If I proceed, how do you go about trying to make your project visible?
Obviously this isn't a replacement for ORMs (and it can co-exist with ORMs, and coexist with your traditional SQL server). I guess its main use cases are for simple persistence where an ORM is overkill, or for NoSQL type scenarios and where a document-store type interface is preferable.

My advice: Write it for your own requirements and then open-source it. You'll soon find out if there's a market for it. And, as a bonus, you'll find that other people will tell you which bits need polishing; there's a very high chance they'll polish it for you.

Ben, I think it's awesome. At the very least post it to CodePlex and share with the rest of the world. I'm quite sure there are developers out there who can use an object persistence framework (or help polish it up).

For what its worth I think its a great idea.
But more importantly, you've chosen a project (in my opinion) that will undoubtedly improve your code construction and design chops. It is often quite difficult to find projects that both add value while improving your skills.
At least complete it to your initial requirents and then open source it. Anything after that it is a bonus!

While I think the idea is intriguing, and could be useful, I am not sure what long-term value it may hold. Given the considerable advances with EF v4 recently, including things like Code-Only, true POCO support, etc. achieving what you are talking about is actually not that difficult with EF. I am a true believer in Code-Only these days, as it is simple, powerful, and best of all, compile-time checked.
The idea about supporting any kind of data store is intriguing, and something that is worth looking into. However, I think it might be more useful, and reach a considerably broader audience, if you implemented store providers for EF v4, rather than trying to reinvent the wheel that Microsoft has now spent years on. Small projects more often than not grow...and things like pooling, thread safety, LINQ/IQueryable support, etc. become more important...little by little, over time.
By developing EF data store providers for things like SqLite, MongoDB, Xml files or flat files, etc. you add to the capabilities of an existing, familiar, accessible framework, without requiring people to learn an additional one.

Related

How to handle multiple mongodb Collections in an API for .net

I'm trying to make a backend for my .Net webApp using mongoDB for this purpose.
I'm new to mongoDb and quite frankly I feel lost in all the documentation.
Until now I've followed the Microsoft guide on how to make the first steps in building an "onedimensional" api.
I could potentially build everything using only one collection, but I feel like this will quite hard to handle down the road.
That's why I thought it would be wise to split everything into smaller collections.
The Api is written with C#.
My code so far
appsetting:
"FantaTrainerDatabaseSettings": {
//"UsersCollectionName": "Users",
"FantaTrainerCollectionName": "Trainers",
//"TeamsCollectionName": "Teams",
"SoccerPlayersCollectionName": "SoccerPlayers",
"ConnectionString": "mongodb://localhost:27017",
"DatabaseName": "FantaTrainerDb"
}
}
The Startup.cs file has this method where the Controllers are instanciated:
public void ConfigureServices(IServiceCollection services)
{
services.Configure<FantaTrainerDatabaseSettings>(
Configuration.GetSection(nameof(FantaTrainerDatabaseSettings)));
services.AddSingleton<IFantaTrainerDatabaseSettings>(sp =>
sp.GetRequiredService<IOptions<FantaTrainerDatabaseSettings>>().Value);
services.AddSingleton<FantaTrainerService>();
services.AddControllers();
}
What I tried was
public void ConfigureServices(IServiceCollection services)
{
services.Configure<FantaTrainerDatabaseSettings>(
Configuration.GetSection(nameof(FantaTrainerDatabaseSettings)));
services.AddSingleton<IFantaTrainerDatabaseSettings>(sp =>
sp.GetRequiredService<IOptions<FantaTrainerDatabaseSettings>>().Value);
services.AddSingleton<FantaTrainerService>();
services.Configure<SoccerPlayersDatabaseSettings>(
Configuration.GetSection(nameof(SoccerPlayersDatabaseSettings)));
services.AddSingleton<ISoccerPlayersDatabaseSettings>(sp =>
sp.GetRequiredService<IOptions<SoccerPlayersDatabaseSettings>>().Value);
services.AddSingleton<SoccerPlayersService>();
services.AddControllers();
}
But as I supposed It doesn't work this way..
I'm not going to paste all the other code since it's more or less a copy-paste from the microsoft guide, with renamed variables. But let me know if you need more details.
To make it short, I don't get where I need to put the reference to access the other collections.
Do I need like a big controller class that handles all the controllers, or do I need to make the ConfigureServices() dynamic and figure out a way to handle the different collections?
Is there a right way to do that?
Let me know if you need further details or maybe reformulate the question to make it clearer what the problem is

data & collection modelling:
don't put everything in one collection. even though you can. it get's extremely difficult to query and update deeply nested entities as your app grows in complexity due to the c# mongodb driver having limitations on what it can do (without jumping through hoops).
i usually have one collection per logical entity. such as Book,Author,Publisher and have references between them when needed to define relationships. some may say that modelling your entities like a relational db beats the purpose of a nosql db but i don't agree with that becuase it helps with ease of development and maintaining your app in the longrun. you do lose a bit of performance doing lookups/joins but it won't be much worse than mysql/sql-server (with the proper use if indexes).
also keep in mind that mongodb has a hard limit on how big a single document can get, which is 16mb in size. so it would be a bad idea to embed millions of entities inside an array field of an entity. my personal rule of thumb is; if there's going to be more than a few hundred entities embeded in a field, store it in it's own collection. even if it's going to be less than a few hundred, something may get it's own collection if it's a complex (enough) entity.
sometimes, you'll be duplicating data to make queries fast. for ex: you could choose to embed a list of author names inside of the publisher entity. but that has the downside of you having to manually update that embeded list when there's a change to a name of one of the authors. the more places you duplicate in, the more work you have to do to keep data consistent.
in the end, it all really depends on how your app's views/ui/api is going be querying your data. when doing complex apps, you will have to choose the right balance of embedded vs. referenced.
app architecture & layering:
following microsoft tutorials is fine for just getting a lay of the land but they are just too basic when you need to figure out how to build complex systems.
my suggestion is to find some open source projects on github and study how people do things when building real-world apps. but choose wisely what you look at, because there's tendancy in the industry to over-engineer and over-complicate things due to trends/ hype of certain patterns, frameworks and technologies. a couple of such things would be: dependency injection, mocking, ddd, etc. i can suggest the following youtube videos if you'd like to cut through the crap and get to the heart of what really matters.
Core Principles Of API Design
Responsibilities Of a Controller
Dependency Injection? No Thank you!
Mocking? No Thank you!
Interfaces? No Thank you!
you might also find this mongodb web-api starter template interesting where i try to simplify things as much as possible in order to increase ease of development, readability and maintainability.

Benefits of LINQ over functional method chaines [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
There was a discussion on Kotlin Slack about a possibility of adding code trees to support things like C# LINQ.
In C# LINQ has many applications, but I want to focus on only one (because others a already presumably covered by Kotlin syntax): composing SQL queries to remote databases.
Prerequisites:
We have a data schema of an SQL database expressed somehow in the code so that static tools (or the type system) could check the correctness (at least the naming) of an SQL query
We have to generate the queries as strings
We want a syntax close to SQL or Java streams
Question: what do expression trees add to the syntax that is so crucial for the task at hand? How good an SQL builder can be without them?

How good a query dsl can be without expression trees?
As JINQ has shown you can get very far by analyzing the bytecode to understand the intent of developer and thus translate predicate to SQL. So in principle expression trees are not essential to build a great looking query dsl:
val alices = database.customerStream().where { it.name == "Alice" }
Even without a hackery such as bytecode analysis it's possible to get a decent query dsl with code generation. Querydsl and JOOQ are great examples. With a little bit of Kotlin wrapping code you can then write
val alices = db.findAll(QCustomer.customer, { it.name.eq("Alice") })
How do expression trees help building query dsl
Expression tree is a structure representing some code that resolves to a value. By having such structure generated by compiler one does not need bytecode analysis to understand what's the supposed to do. Given the example
val alices = database.customerStream().where { it.name == "Alice" }
The argument to where function would be an expression that we can inspect in runtime and translate it to SQL or other query language. Because the expression trees represent code you don't need to switch between Kotlin and SQL paradigm to write queries. The query code expressed using linq/jinq look pretty much the same regardless if they are executed in memory using POCO/POJO or in the database engine using its query language. The compiler can also do more type checking. Furthermore it's very easy to replace the underlying database with in memory representation to make running tests much faster.
Further reading:
What are Expression Trees and how do you use them and why would you use them?
Practical use of expression trees

JOOQ and Querydsl:
The typical solution to ORM has been to employ either a DSL or an embedded DSL from the application logic. While great advances have been made with these schemes, culminating in JOOQ and Querydsl, there are still many caveats to such a system:
many of the paradigms the people writing these queries are used to (namely type safety) are either missing or different in key ways
the exact semantics are non-obvious: in the previous answer it is suggested that we use an extension method eq to perform a db-native equality filter. It is highly probably that a new developer will mistakenly use equals instead of eq.
this second point is compounded by the difficulty in testing: using a live connector with fake data is a nutoriously difficult problem, so depending on the testing procedure, the incorrect code jooqDB.where { it.name.equals("alice") } may not be discovered until much further down the development pipeline.
Jinq
Jinq is not a first party data connector. While I think the importance of this is largely psychosomatic, it is important none the less. Many projects are going to use the tools suggested by the vendors, and all of the major DB providers have Java connectors, so it is likely that most developers will simply use them.
While I have not used Jinq, it is my belief that another reason Jinq has not seen wide-spread adoption is largely because it's attempting to use a much tougher domain to solve the problem: building queries from AST's is much easier than building queries from byte code for the same reason that building the back-end of a compiler is easier than building a transcompiler. While I cannot help but tip my hat to the Jinq team for doing such an amazing job, I also cannot help but think they are hampered by their tools: building queries out of bytecode is hard. By definition, java bytecode is committed to running on the JVM, trying to retrofit that commitment for another interpreter is a very hard problem.
My current work does not permit me to use a traditional database, but if I was to switch projects, knowing that I would need a great deal of data exposure in my DAL, I would likely retreat from Kotlin and Java back to .net, largely because of Linq, rather than investigate Jinq. "Linq from Kotlin" might well change my mind.
Support from DB vendors:
The LINQ-to-SQL and LINQ-to-mongo database connectors have seen wide-spread adoption in the .net community. This is because they are first party, high quality, and act in a reasonably straightforward manner: compiling an AST to SQL (or the mongo-query-language) is at least conceptually straight forward. Many of the traditional caveats of ORM's apply, but the vendors (Microsoft and Mongo) continue to address these problems.
If Kotlin supported runtime code trees in a similar vein to Linq, and if Kotlin continues to gain traction at its current rate, then I believe the MongoDB and the Hibernate teams would be quick to start retrofitting their existing LINQ-to-X connectors to support Kotlin clients, and eventually even the bigger slower companies like Microsoft and IBM would begin to support the same flow.
Linq from Kotlin
What’s more, the exact roles the Kotlin-unique concepts of a "receiver type" and the aggressive implementation of inline might play in the Linq space is interesting. Linq-from-Kotlin might well be more effective than LINQ-from-C#.
where C# has
someInterface
.where(customer -> customer.Name.StartsWith("A") && ComplexFunctionCallDoneByMongoDriver(customer))
.select(customer -> new ResultObject(customer.Name, customer.Age, OtherContext()))
Kotlin might be able to make advances:
someInterface
.filter { it.name startsWith "A" && inlinedComplexFunctionCallDoneOnDB(it) }
//inlined methods would have their AST exposed -> can be run on the DB process instead of on the driver.
.map { ResultObject(name, age, otherContext()) }
//uses a reciever type, no need to specify "customer ->"
//otherContext() might also be inlined
This is off the top of my head, I suspect that smarter brains than mine can put these tools to better use.
Other uses:
Its worth mentioning, the assumption made about the applications of runtime-code-AST's is false:
other [runtime AST problem domains] [are] already presumably covered by Kotlin syntax
The reason I brought this up in the first place was because I was annoyed with Kotlin's null-safety feature and its interaction with Mockito: Having spent some time researching the issue, there is no Mocking framework designed for Kotlin, only java frameworks that can be used from Kotlin, with some pain.
Some currently unsolved problems in both the java domain and the Kotlin domain:
Mocking frameworks, as above. With access to the AST all of the clever but bizarre tricks around argument-operation-order employed by Mockito become obsolete. Other more traditional mocking frameworks gain a much more intuitive and type-safe front-end.
binding expressions, for UI frameworks or otherwise, often devolve to strings. Consider a UI framework where developers could write notifyOfUIElementChange { this.model.displayName } instead of notifyOfUIElementChange("model.displayName"). The latter suffers from the crippling problem of being stale if somebody renames the property.
I'm very excited to see what the ControlsFX guys or Thomas Mikula might do with a feature such as this.
similar to Kotlin specific Linq: I suspect Kotlin’s applications here might provide for a number of tools I'm not aware of. But I'm very confident that they do exist.
I really like Linq, and I cannot help but think that with Kotlin's focus on industry problems, a Linq-from-Kotlin module would be a perfect fit and make a number of peoples lives, including mine, a fair bit easier.

Which serialization to use for my c# objects to save them in a SQL database

I'm looking for some advice, it may be that there is no hard and fast answer but any thoughts are appreciated. The application is very typical. Its a c# code, currently using VS2010. This code is used amongst other things for providing data from a model to end users. The model itself is based on some known data at fixed points (a "cube") various parameters and settings. In standard programming practice the user accesses the data via public "get" functions which in turn rely on private member variables such as the known data and the settings. So far so standard. Now I want to save the class providing this data to the users into an sql database - primarily so all users can share the same data (or more precisely model generated data).
Of course I could just take each member variable of the class and write these into the db using sql database and reinstantiate the class from these. But I dont want to miss out on all the goodies .net & c# has to offer. So what I'm thinking of doing is serializing the object and using linq to sql to squirt this into the db. The linq to sql section is straightforward, but I'm a total newbie when it comes to serialization and I'm a bit concerned/confused about it. It seems the obvious thing is to format the object into xml and write this into the database as a column in the table with sql datatype "xml". But that leaves me with a few questions.
To my understanding the standard XMLserializer will only write the public members of the class into the xml. That looks like a non-starter to me since my class design is predicated on keeping much of the class private (writing classes with all public members is outside of my experience - who does that ?). This has led me to consider the DataContractSerializer in which you can opt-in variables for serialization. However this seems to have some WCF dependencies and I'm wondering what are the potential drawbacks of using it. Additionally there is the SoapFormatter, which seems to be more prevalent for web applications and also JSON. I'm not really considering these, but maybe I should ? Maybe there are better ways of solving the problem ? Perhaps a bit verbose but hopefully all the context can help so people can shoot me some advice.

I have had requirements similar to yours and have done quite a bit of research in this area. I did a number of proof-of-concept projects using XMLSerialization, JSON, BinraryFormatter and not to forget some home grown hacks. I had almost decided to go with JSON (JSON.NET), until I found protobuf-net! It is fast, small in size, version independent, supports inheritance and easy to implement without much changes to your code. Recommend heavily.

If you store an object as XML, it will be very hard to use from the database. For example, if you store customer objects as XML, how would you write the following?
select * from Customers where LastName = 'Obama'
It can be done, but it's not easy.
How to map objects to a database is a subject of some controversy. Frameworks that are easy to get started with can become overly complex in the application's later life. Since most applications spend more time in maintenance than in initial development, I'd use the simplest model that works.
Plain ADO.NET or Dapper are good contenders. You'll write a bit more boilerplate code, but the decrease in complexity more than makes up for that.

Applicable design patterns

I must develop a simple web application to produce reports. I have a single table "contract" and i must return very simple aggregated values : number of documents produced in a time range, average number of pages for documents and so on . The table gets filled by a batch application, users will have roles that will allow them to see only a part of the reports (if they may be called so ).
My purpose is :
develop a class, which generates the so called reports, opened to future extension (adding new methods to generate new reports for different roles must be easy )
decouple the web graphic interface from the database access
I'm evaluating various patterns : decorator, visitor, ... but being the return data so simple i cannot evaluate which apply or even if its the case to use one. Moreover i must do it i less than 5 days. It can be done if i make a so called "smart gui" but as told at point 1, i don't want to get troubles when new roles or method will be added.
thank you for your answers.
I'm sorry, i realize i haven't provided too much infos. I live in a Dilbert world. at the moment i've got the following info : db will be oracle (the concrete db doesn't exist yet) , so no EF, maybe linqtodataset (but i'm new to linq). About new features of the application,due to pravious experiences, the only thing i wish is not to be obliged to propagate changes over the whole application, even if it's simple. that are the reasons i've thougth to design patterns (note i've said "if it's the case" in my question) .
I'll KISS it and then will refactor it if needed , as suggested by ladislav mrnka, but i still appreciate any suggestion on how to keep opened to extension the data gathering class

KISS - keep it simple and stupid. You have five days. Create working application and if you have time refactor it to some better solution.

The road to good code is not paved with design patterns.
Good code is code that is readable, maintainable, robust, compatible and future-proof.
Don’t get me wrong: Design patterns are a great thing because they help categorise, and thus teach, the experience that earlier generations of programmers have accrued. Each design pattern once solved a problem in a way that was not only novel and creative, but also good. The corrolary is not that a design pattern is necessarily good code when applied to any other problem.
Good code requires experience and insight. This includes experience with design patterns, and insight into their applicability and their downsides and pitfalls.
That said, my recommendation in your specific case is to learn about the recommended practice regarding web interfaces, database access, etc. Most C# programmers write web applications in ASP.NET; tend to use LINQ-to-Entities or LINQ-to-SQL for database access; and use Windows Forms or WPF for a desktop GUI. Each of these may or may not fulfill the requirements of your particular project. Only you can tell.

How about you use strategy pattern for the retrieving data? And use interfaces like following to keep it extendable at all times.
IReportFilter: Report filter/criteria set
IReportParams: Gets report parameters
IReportData: Gets the report data in a result set
IReportFormat: Report formatting
IReportRender: Renders the report
Just thinking out loud.

Looking for the most painless non-RDBMS storage method in C#

I'm writing a simple program that will run entirely client-side. (Desktop programming? do people still do that?) and I need a simple way to store trivial amounts of data in a structured form, but really don't see any need to use a database system. What's more, some of the data needs to be serialized and passed around to different users, like some kind of "file" or perhaps a "document". (has anyone ever done that before?)
So, I've looked at using .Net DataSets, LINQ, direct XML manipulation, and they all seem like they would get the job done, but I would like to know before I dive into any of them if there's one method that is generally regarded as easier to code than others. As I said, the amount of data to be stored is trivial, even if one hundred people all used the same machine we're not talking about more than 10 MB, so performance is not as large a concern as is codeability/maintainability. Thank you all in advance!

Sounds like Linq-to-XML is a good option for this.
Link 1
Link 2
Tons of info out there on this.

Without knowing anything else about your app, the .Net DataSets would likely be your easiest option because WriteXml and ReadXml already exist.

Any serialization API should do fine here. I would recommend something that is contract based (not BinaryFormatter, which is type-based) as that will keep it usable over time (as your assembly changes).
So I would build a basic object model (DTO) and use any of:
XmlSerializer
DataContractSerializer
protobuf-net (you all knew it was coming...)
OO, simple, and easy. And easy to use for passing fragments of the data (either between users of to a central server).

I would choose an embedded database. Using something like sqlite doesn't seem to be an overkill for me. You may even try its c# port (http://code.google.com/p/csharp-sqlite/).

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.