Create PARTITION on MySql Table created via Code First implementation on EF6 - c#

I am using EF6 to create and populate a table whose primary key is a ServerTime (DateTime) column.
The table is very large. To speed up access times, and to get the benefit of having the table split into smaller partition files instead of one massive .ibd file when I perform an external query such as this:
SELECT
gbpusd.servertime,
gbpusd.orderbook
FROM
gbpusd
WHERE
gbpusd.servertime BETWEEN '2014-12-23 23:48:08.183000' AND '2015-03-23 23:48:08.183000'
I would like the table to be automatically partitioned by servertime during Code First creation.
I already know the raw MySql syntax for partitioning a table by range.
My current solution is to create and populate the database via EF6 Code First, and then manually execute the partitioning via a raw MySQL query. Another solution is to use plain old ADO.NET directly after the Code First creation, but I would rather have everything streamlined inside the EF6 code.
What I need to know is how I can accomplish the same thing via the Code First implementation (assuming it is even possible).
Much thanks.
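One way to keep the partitioning step inside EF6, rather than in a separate ADO.NET call, is to run the DDL from a Code First migration. A rough sketch, assuming MySQL RANGE COLUMNS partitioning works with your provider; the migration name, partition names and range boundaries below are only placeholders:
using System.Data.Entity.Migrations;

public partial class PartitionGbpusdByServerTime : DbMigration
{
    public override void Up()
    {
        // Raw MySQL DDL executed as part of the migration.
        Sql(@"ALTER TABLE gbpusd
              PARTITION BY RANGE COLUMNS (servertime) (
                  PARTITION p2014 VALUES LESS THAN ('2015-01-01'),
                  PARTITION p2015 VALUES LESS THAN ('2016-01-01'),
                  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
              );");
    }

    public override void Down()
    {
        Sql("ALTER TABLE gbpusd REMOVE PARTITIONING;");
    }
}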

Is this only about speeding things up? Where do you need to consume this data? Assuming you have a service layer or manager that works with this entity, you can always query it through your repository with a lambda expression.
Example:
using (var dataBaseContext = new your_DbContext())
{
    var repoServer = new Repository<gbpusd>(dataBaseContext);

    // Specify whatever date range you like, and use the returned data wherever you need it.
    var searchedGbpusd = repoServer
        .SearchFor(i => i.servertime >= startDate && i.servertime <= endDate)
        .ToList();
}
Hope it will help.
Thanks
Fahad

You can just create your entity using the EF6 Code First approach. Then, as I suggested, search the data through your repository, fill the table with it, and run that activity as frequently as you need, for example when user load is off-peak. – Shaikh Muhammad Fahad

Related

Trying to get an UPSERT working on a set of data using dapper

I'm trying to get an upsert working on a collection of IDs (not the primary key - that's an identity int column) on a table using dapper. This doesn't need to be a dapper function, just including in case that helps.
I'm wondering if it's possible (either through straight SQL or using a dapper function) to run an upsert on a collection of IDs (specifically an IEnumerable of ints).
I really only need a simple example to get me started, so an example would be:
I have three objects of type Foo:
{ "ExternalID" : 1010101, "DescriptorString" : "I am a descriptive string", "OtherStuff" : "This is some other stuff" }
{ "ExternalID" : 1010122, "DescriptorString" : "I am a descriptive string123", "OtherStuff" : "This is some other stuff123" }
{ "ExternalID" : 1033333, "DescriptorString" : "I am a descriptive string555", "OtherStuff" : "This is some other stuff555" }
I have a table called Bar, with those same column names (where only 1033333 exists):
Table Bar
ID | ExternalID | DescriptorString               | OtherStuff
1  | 1033333    | "I am a descriptive string555" | "This is some other stuff555"
Well, since you said that this didn't need to be dapper-based ;-), I will say that the fastest and cleanest way to get this data upserted is to use Table-Valued Parameters (TVPs) which were introduced in SQL Server 2008. You need to create a User-Defined Table Type (one time) to define the structure, and then you can use it in either ad hoc queries or pass to a stored procedure. But this way you don't need to export to a file just to import, nor do you need to convert it to XML just to convert it back to a table.
Rather than copy/paste a large code block, I have noted three links below where I have posted the code to do this (all here on S.O.). The first two links are the full code (SQL and C#) to accomplish this (the 2nd link being the most analogous to what you are trying to do). Each is a slight variation on the theme (which shows the flexibility of using TVPs). The third is another variation, not the full code, showing only the differences from one of the first two in order to fit that particular situation.

In all 3 cases, the data is streamed from the app into SQL Server. There is no creating of any additional collection or external file; you use what you currently have and only need to duplicate the values of a single row at a time to be sent over. And on the SQL Server side, it all comes through as a populated Table Variable. This is far more efficient than taking data you already have in memory and converting it to a file (takes time and disk space), or to XML (takes CPU and memory), or to a DataTable (for SqlBulkCopy; takes CPU and memory), or something else, only to rely on an external factor such as the filesystem (the files will need to be cleaned up, right?) or to need to parse out of XML.
How can I insert 10 million records in the shortest time possible?
Pass Dictionary<string,int> to Stored Procedure T-SQL
Storing a Dictionary<int,string> or KeyValuePair in a database
Now, there are some issues with the MERGE command (see Use Caution with SQL Server's MERGE Statement) that might be a reason to avoid using it. So, I have posted the "upsert" code that I have been using for years to an answer on DBA.StackExchange:
How to avoid using Merge query when upserting multiple data using xml parameter?
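For orientation only, a minimal sketch of the TVP approach (this is not the code from the links above; the dbo.FooTableType type, the Bar table name, and the column sizes are assumptions, and the upsert deliberately avoids MERGE):
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public class Foo
{
    public int ExternalID { get; set; }
    public string DescriptorString { get; set; }
    public string OtherStuff { get; set; }
}

public static class FooUpserter
{
    // Streams each Foo as a row of the table type; nothing is buffered into a DataTable.
    private static IEnumerable<SqlDataRecord> ToRecords(IEnumerable<Foo> foos)
    {
        var meta = new[]
        {
            new SqlMetaData("ExternalID", SqlDbType.Int),
            new SqlMetaData("DescriptorString", SqlDbType.NVarChar, 200),
            new SqlMetaData("OtherStuff", SqlDbType.NVarChar, 200)
        };

        foreach (var foo in foos)
        {
            var record = new SqlDataRecord(meta);
            record.SetInt32(0, foo.ExternalID);
            record.SetString(1, foo.DescriptorString);
            record.SetString(2, foo.OtherStuff);
            yield return record;
        }
    }

    // Assumes: CREATE TYPE dbo.FooTableType AS TABLE
    //          (ExternalID INT, DescriptorString NVARCHAR(200), OtherStuff NVARCHAR(200));
    public static void Upsert(string connectionString, ICollection<Foo> foos)
    {
        if (foos.Count == 0) return;   // an empty TVP stream is not allowed

        const string sql = @"
UPDATE b
   SET b.DescriptorString = t.DescriptorString,
       b.OtherStuff       = t.OtherStuff
  FROM dbo.Bar b
  JOIN @Foos t ON t.ExternalID = b.ExternalID;

INSERT INTO dbo.Bar (ExternalID, DescriptorString, OtherStuff)
SELECT t.ExternalID, t.DescriptorString, t.OtherStuff
  FROM @Foos t
 WHERE NOT EXISTS (SELECT 1 FROM dbo.Bar b WHERE b.ExternalID = t.ExternalID);";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            var p = cmd.Parameters.Add("@Foos", SqlDbType.Structured);
            p.TypeName = "dbo.FooTableType";
            p.Value = ToRecords(foos);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}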

Working with a large amount of data

I have a C# application and Entity Framework as the ORM.
I have a database with an Images table. The table has Id, TimeStamp and Data columns.
This table can hold a LOT of entities, and the Data column contains a large byte array.
I need to take the first entity starting from some date, or the first 5, for example.
var result = Images.OrderBy(img => img.TimeStamp).FirstOrDefault(img => img.TimeStamp > someDate);
throws an OutOfMemoryException.
Is there some way around that?
Should I use a stored procedure or something else?
If Images is already a queried object, then when you OrderBy it, it accesses the whole set. I'll assume it isn't, and that it is directly your DbSet or an EF IQueryable (so you are querying with LINQ to Entities rather than LINQ to Objects, and the ordering is done in the query sent to the database, not on the whole returned set).
Unless you need change tracking detection, use AsNoTracking on your DbSet (in this case, Context.Images.AsNoTracking().OrderBy(...)). That should lower the memory requirements by a lot (change tracking detection requires more than twice the memory).
Also, if you are using an ORM with large blob data and want to work with the original entity all the time, it might be wise to store the blob in its own table (with just an id and the data) and access it only when you need it, keeping a reference to that id on the table/entity where you are doing your operations. (You could also use a Select to project the query onto a new entity without the blob field.)
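To illustrate both suggestions together, a rough sketch (the context name, the projected columns, and the cut-off date are assumptions based on the question):
// someDate is whatever cut-off you need.
var someDate = new DateTime(2015, 1, 1);

using (var ctx = new ImagesDbContext())          // hypothetical DbContext exposing DbSet<Image> Images
{
    var result = ctx.Images
        .AsNoTracking()                                   // no change-tracking copies kept in memory
        .Where(img => img.TimeStamp > someDate)
        .OrderBy(img => img.TimeStamp)
        .Select(img => new { img.Id, img.TimeStamp })     // project away the large Data blob
        .Take(5)                                          // or .FirstOrDefault() instead of Take/ToList
        .ToList();
}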
If you need to access the image data for the returned rows all the time, and there's not enough memory in the system for it, then tough luck.

Retrieve just some columns using an ORM

I'm using Entity Framework and SQL Server 2008 with the Database First approach.
My problem is :
I have some tables that hold a great many columns (~100), and when I try to retrieve a lot of rows it takes significant time to return the results, even when I sometimes need only 3 or 4 columns from the table.
I spent half a day on Stack Overflow trying to find a way to solve this problem, and I came up with two solutions:
Using stored procedures to retrieve data with the columns I want.
Edit the .edmx (xml) and the .cs files to remove the columns that I won't use.
My problem again is:
If I use stored procedures to retrieve the data with the columns that I want, Entity Framework loses its benefit and I might as well use ADO.NET instead and call the stored procedures directly...
I can't take the second solution, because every time I make a change in the database I'm obliged to regenerate the .edmx file, and I lose the changes I made before :'(
Is there a way to do this somehow in Entity Framework? Is that even possible?
I know that other ORMs exist like NHibernate or Dapper, but I don't know if they can offer this feature without causing a lot of pain.
You don't have to return every column each time. You can specify which columns you need.
var query = from t in db.Table
            select new { t.Column1, t.Column2, t.Column3 };
Normally if you project the data into a different POCO it will do this automatically in EF / L2S etc:
var slim = from row in db.Customers
           select new CustomerViewModel { Name = row.Name, Id = row.Id };
I would expect that to only read 2 columns.
For tools like dapper: since you control the SQL, only specify columns you want - don't use *
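As an illustration, a minimal Dapper sketch along those lines (the connection string, the Customers table, and the CustomerViewModel type are assumptions):
using System.Collections.Generic;
using System.Data.SqlClient;
using Dapper;

public class CustomerViewModel
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CustomerQueries
{
    public static IEnumerable<CustomerViewModel> GetActiveCustomers(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            // Only the two columns named in the SQL are sent over the wire and mapped.
            return conn.Query<CustomerViewModel>(
                "SELECT Id, Name FROM Customers WHERE IsActive = 1");
        }
    }
}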
You can create a second project with a code-first DbContext, POCO's and maps that return the subset of columns that you require.
This is a case of cut and paste code but it will get you what you need.
You can just create classes and project the data into them but I'm not sure you can make updates using this method. You can use anonymous types within a single method but you'll need actual classes to pass around between methods.
Another option would be to move to a code first development.

LINQ to SQL: Attaching a collection of objects from an XML file to the database

I am developing an HRM application that imports and exports XML data from a database. The application receives exported XML data for the employee entries. I imported the XML file using LINQ to XML and converted it into the respective objects. Now I want to attach (update) the employee objects.
I tried to use:
// linqoper is a helper class that imports the XML data and converts it into an IEnumerable of Employee objects.
var emp = linqoper.importxml("filename.xml");
using (EmployeeDataContext db = new EmployeeDataContext())
{
    db.Employee.AttachAll(emp);
    db.SubmitChanges();
}
But I got this error:
“An entity can only be attached as modified without original state if it declares a version member or does not have an update check policy.”
I also have the option of retrieving each employee and assigning the values from the XML data to it, in this format:
// Import the IEnumerable of Employee objects.
var employees = linqoper.importxml("filename.xml");
using (EmployeeDataContext db = new EmployeeDataContext())
{
    foreach (var empobj in employees)
    {
        Employee emp = db.Employee.Single(m => m.Id == empobj.Id);
        emp.FirstName = empobj.FirstName;
        emp.BirthDate = empobj.BirthDate;
        // ... continue for the remaining properties
    }
    db.SubmitChanges();
}
But the problem with the above is that I have to iterate through all of the employee objects, which is very tedious.
So is there any other way I could attach (update) the employee entities in the database using LINQ to SQL?
I have seen some similar links on SO, but none of them seems to help.
https://stackoverflow.com/questions/898267/linq-to-sql-attach-refresh-entity-object
When LINQ to SQL saves changes to the database, it has to know which properties of the object have been changed. It also checks whether a potentially conflicting update has been made to the database in the meantime (optimistic concurrency).
To handle those cases, LINQ to SQL needs two copies of the object when attaching: one with the original values (as present in the DB) and one with the new, changed values. There is also a more advanced mechanism involving a version member, which is mapped to a rowversion column.
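For illustration, if the Employee mapping declared such a version member (or used UpdateCheck.Never on its columns), the imported objects could be attached as modified directly; a rough sketch, not the original poster's code:
using (var db = new EmployeeDataContext())
{
    foreach (var emp in linqoper.importxml("filename.xml"))
    {
        // Attach as modified; requires a version member (a rowversion column mapped
        // with IsVersion=true) or UpdateCheck.Never on the entity's columns.
        db.Employee.Attach(emp, true);
    }
    db.SubmitChanges();
}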
The LINQ to SQL way to update a set of data is to first read all the data from the database, then update the objects retrieved from the database, and finally call SubmitChanges(). That would be my first approach in your situation.
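A sketch of that first approach, reading all affected employees in one query and then copying the imported values across (the Employee property names are taken from the question):
var imported = linqoper.importxml("filename.xml").ToList();
var ids = imported.Select(e => e.Id).ToList();

using (var db = new EmployeeDataContext())
{
    // One round trip: load every employee whose Id appears in the XML.
    var existing = db.Employee
        .Where(e => ids.Contains(e.Id))
        .ToDictionary(e => e.Id);

    foreach (var source in imported)
    {
        Employee target;
        if (existing.TryGetValue(source.Id, out target))
        {
            target.FirstName = source.FirstName;
            target.BirthDate = source.BirthDate;
            // ... copy the remaining properties
        }
    }

    db.SubmitChanges();   // only changed properties are sent as UPDATEs
}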
If you experience performance problems, then it's time to go outside LINQ to SQL's toolbox. A solution with better performance is to load the new data into a separate staging table (for best performance, use bulk insert). Then run a SQL command or stored procedure that does the actual merging of data. The SQL MERGE statement is excellent for this kind of update.
LINQ to SQL is a proper ORM, but if you want to take control of create/update/delete into your own hands, you can try one of the simpler ORMs that just provide ways to do CRUD operations. I can recommend one, http://crystalmapper.codeplex.com: it is simple yet powerful.
Why CrystalMapper?
I built it for a large financial transaction system with lots of insert and update operations. What I needed was speed and control over inserts/updates while serving complex business scenarios, hitting multiple tables for a single transaction.
When I put it to use in a social text-processing platform, it served very well there too.

Programming pattern using typed datasets in VS 2008

I'm currently doing the following to use typed datasets in vs2008:
Right click on "app_code" add new dataset, name it tableDS.
Open tableDS, right click, add "table adapter"
In the wizard, choose a pre defined connection string, "use SQL statements"
select * from tablename and next + next to finish. (I generate one table adapter for each table in my DB)
In my code I do the following to get a row of data when I only need one:
cpcDS.tbl_cpcRow tr = (cpcDS.tbl_cpcRow)(new cpcDSTableAdapters.tbl_cpcTableAdapter()).GetData().Select("cpcID = " + cpcID)[0];
I believe this will get the entire table from the database and do the filtering in .NET (i.e. not optimal). Is there any way I can get the table adapter to filter the result set on the database instead (i.e. what I want is to send select * from tbl_cpc where cpcID = 1 to the database)?
And as a side note, I think this is a fairly OK design pattern for getting data from a database in VS 2008. It's fairly easy to code with, read and maintain. But I would like to know if there are any better design patterns out there. I use the datasets for read/update/insert and delete.
A bit of a shift, but you ask about different patterns - how about LINQ? Since you are using VS2008, it is possible (although not guaranteed) that you might also be able to use .NET 3.5.
A LINQ-to-SQL data-context provides much more managed access to data (filtered, etc). Is this an option? I'm not sure I'd go "Entity Framework" at the moment, though (see here).
Edit per request:
to get a row from the data-context, you simply need to specify the "predicate" - in this case, a primary key match:
int id = ... // the primary key we want to look for
using (var ctx = new MydataContext())
{
    SomeType record = ctx.SomeTable.Single(x => x.SomeColumn == id);
    // ... etc
    // ctx.SubmitChanges(); // to commit any updates
}
The use of Single above is deliberate - this particular usage [Single(predicate)] allows the data-context to make full use of local in-memory data - i.e. if the predicate is just on the primary key columns, it might not have to touch the database at all if the data-context has already seen that record.
However, LINQ is very flexible; you can also use "query syntax" - for example, a slightly different (list) query:
var myOrders = from row in ctx.Orders
               where row.CustomerID == id && row.IsActive
               orderby row.OrderDate
               select row;
etc
There are two potential problems with using typed datasets.
One is testability. It's fairly hard work to set up the objects you want to use in a unit test when using typed datasets.
The other is maintainability. Using typed datasets is typically a symptom of a deeper problem: I'm guessing that all your business rules live outside the datasets, and a fair few of them take datasets as input and output some aggregated values based on them. This leads to business logic leaking all over the place, and though it will all be hunky-dory for the first 6 months, it will start to bite you after a while. Such a use of DataSets is fundamentally non-object-oriented.
That being said, it's perfectly possible to have a sensible architecture using datasets, but it doesn't come naturally. An ORM will be harder to set up initially, but will lend itself nicely to writing maintainable and testable code, so you don't have to look back on the mess you made 6 months from now.
You can add a query with a WHERE clause to the TableAdapter for the table you're interested in.
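For example (a sketch; GetDataByCpcID is whatever name you give the new query in the designer, with select * from tbl_cpc where cpcID = @cpcID as its statement):
var adapter = new cpcDSTableAdapters.tbl_cpcTableAdapter();

// The WHERE clause is applied by the database; only matching rows come back.
cpcDS.tbl_cpcDataTable rows = adapter.GetDataByCpcID(cpcID);
cpcDS.tbl_cpcRow tr = rows.Count > 0 ? rows[0] : null;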
LINQ is nice, but it's really just shortcut syntax for what the OP is already doing.
Typed Datasets make perfect sense unless your data model is very complex. Then writing your own ORM would be the best choice. I'm a little confused as to why Andreas thinks typed datasets are hard to maintain. The only annoying thing about them is that the insert, update, and delete commands are removed whenever the select command is changed.
Also, the speed advantage of creating a typed dataset versus your own ORM lets you focus on the app itself and not the data access code.
