I have a query that looks like this:
using (MyDC TheDC = new MyDC())
{
foreach (MyObject TheObject in TheListOfMyObjects)
{
DBTable TheTable = new DBTable();
TheTable.Prop1 = TheObject.Prop1;
.....
TheDC.DBTables.InsertOnSubmit(TheTable);
}
TheDC.SubmitChanges();
}
This query basically inserts a list into the database using linq-to-sql. Now I've read online that L2S does NOT support bulk operations.
Does my query insert the elements one at a time, or all of them in a single write?
Thanks for the clarification.
I modified the code from the following link to be more efficient and used it in my application. It is quite convenient because you can just put it in a partial class on top of your current autogenerated class. Instead of InsertOnSubmit add entities to a list, and instead of SubmitChanges call YourDataContext.BulkInsertAll(list).
http://www.codeproject.com/Tips/297582/Using-bulk-insert-with-your-linq-to-sql-datacontex
partial void OnCreated()
{
CommandTimeout = 5 * 60;
}
public void BulkInsertAll<T>(IEnumerable<T> entities)
{
using( var conn = new SqlConnection(Connection.ConnectionString))
{
conn.Open();
Type t = typeof(T);
var tableAttribute = (TableAttribute)t.GetCustomAttributes(
typeof(TableAttribute), false).Single();
var bulkCopy = new SqlBulkCopy(conn)
{
DestinationTableName = tableAttribute.Name
};
var properties = t.GetProperties().Where(EventTypeFilter).ToArray();
var table = new DataTable();
foreach (var property in properties)
{
Type propertyType = property.PropertyType;
if (propertyType.IsGenericType &&
propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
{
propertyType = Nullable.GetUnderlyingType(propertyType);
}
table.Columns.Add(new DataColumn(property.Name, propertyType));
}
foreach (var entity in entities)
{
table.Rows.Add(
properties.Select(
property => property.GetValue(entity, null) ?? DBNull.Value
).ToArray());
}
bulkCopy.WriteToServer(table);
}
}
private bool EventTypeFilter(System.Reflection.PropertyInfo p)
{
var attribute = Attribute.GetCustomAttribute(p,
typeof(AssociationAttribute)) as AssociationAttribute;
if (attribute == null) return true;
if (attribute.IsForeignKey == false) return true;
return false;
}
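With the extension above in place, the loop from the question can be rewritten roughly like this (a sketch reusing the question's names; BulkInsertAll is the method added above):
using (MyDC TheDC = new MyDC())
{
    var rows = new List<DBTable>();
    foreach (MyObject TheObject in TheListOfMyObjects)
    {
        DBTable TheTable = new DBTable();
        TheTable.Prop1 = TheObject.Prop1;
        rows.Add(TheTable);        // instead of TheDC.DBTables.InsertOnSubmit(TheTable)
    }
    TheDC.BulkInsertAll(rows);     // instead of TheDC.SubmitChanges()
}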
The term "bulk insert" usually refers to the SQL Server specific, ultra-fast, bcp-based SqlBulkCopy implementation, which is built on top of IRowsetFastLoad.
Linq-2-SQL does not implement insert using this mechanism, under any conditions.
If you need to bulk load data into SQL Server and need it to be fast, I would recommend hand coding using SqlBulkCopy.
Linq-2-SQL will attempt some optimisations to speed up multiple inserts; however, it will still fall short of many micro ORMs (even though no micro ORM I know of implements SqlBulkCopy).
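A minimal hand-coded SqlBulkCopy sketch of the kind recommended above might look like this; the destination table, column name and connection string are illustrative and would have to match your own schema:
var table = new DataTable();
table.Columns.Add("Prop1", typeof(string));
foreach (var obj in TheListOfMyObjects)
{
    table.Rows.Add(obj.Prop1);
}
using (var conn = new SqlConnection(connectionString))
using (var bulkCopy = new SqlBulkCopy(conn) { DestinationTableName = "dbo.DBTable" })
{
    conn.Open();
    bulkCopy.WriteToServer(table); // one bulk load instead of one INSERT per row
}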
It will generate a single insert statement for every record, but will send them all to the server in a single batch and run in a single transaction.
That is what the SubmitChanges() outside the loop does.
If you moved it inside, then every iteration through the loop would make a round trip to the server for the INSERT and run in its own transaction.
I don't believe there is any way to fire off a SQL BULK INSERT.
LINQ Single Insert from List:
int i = 0;
foreach (IPAPM_SRVC_NTTN_NODE_MAP item in ipapmList)
{
    ++i;
    // Recreate the context every 50 rows so the change tracker does not grow unbounded.
    if (i % 50 == 0)
    {
        ipdb.Dispose();
        ipdb = null;
        ipdb = new IPDB();
        // .NET Core equivalent:
        //ipdb.ChangeTracker.AutoDetectChangesEnabled = false;
        ipdb.Configuration.AutoDetectChangesEnabled = false;
    }
    ipdb.IPAPM_SRVC_NTTN_NODE_MAP.Add(item);
    ipdb.SaveChanges(); // one INSERT (and one round trip) per item
}
I would suggest you take a look at N.EntityFramework.Extensions. It is a basic bulk-extension framework for EF 6 that is available on NuGet; the source code is available on GitHub under the MIT license.
Install-Package N.EntityFramework.Extensions
https://www.nuget.org/packages/N.EntityFramework.Extensions
Once you install it, you can use the BulkInsert() method directly on the DbContext instance. It supports BulkDelete, BulkInsert, BulkMerge, and more.
BulkInsert()
var dbcontext = new MyDbContext();
var orders = new List<Order>();
for(int i=0; i<10000; i++)
{
orders.Add(new Order { OrderDate = DateTime.UtcNow, TotalPrice = 2.99 });
}
dbcontext.BulkInsert(orders);
Related
I am converting data from SQL into a list of PersonModel. My question is: is there a faster way (less code) to get this done without using any framework / Dapper? LINQ is ALLOWED.
(The code works fine right now, but maybe it can be done in a simpler way.)
This is the code I have right now:
public List<PersonModel> GetPerson_All()
{
var people = new List<PersonModel>();
//Get the connectionString from appconfig
using (var connection = new SqlConnection(GlobalConfig.CnnString(db)))
{
connection.Open();
//Using the stored procedure in the database.
using (var command = new SqlCommand("dbo.spPeople_GetAll", connection))
{
using (var reader = command.ExecuteReader())
{
//With a while loop, going through each row to put all the data in the PersonModel class.
while (reader.Read())
{
var person = new PersonModel();
person.Id = (int)reader["Id"];
person.FirstName = (string) reader["FirstName"];
person.LastName = (string)reader["LastName"];
person.EmailAdress = (string)reader["EmailAddress"];
person.CellphoneNumber = (string)reader["CellphoneNumber"];
//Add the data into a list of PersonModel
people.Add(person);
}
}
}
}
return people;
}
With Dapper you can put all the data immediately into a list. Is something like this possible without Dapper?
public List<PersonModel> GetPerson_All()
{
List<PersonModel> output;
using (var connection = new SqlConnection(GlobalConfig.CnnString(db)))
{
output = connection.Query<PersonModel>("dbo.spPeople_GetAll").ToList();
}
return output;
}
No, there is no more optimal way to do this than to use a mapper library like Dapper or Entity Framework: that is the entire reason such libraries exist.
If you have repeating code blocks like this but don't want to use external libraries, you can refactor this into a method that executes a statement, iterates over the result, and instantiates and fills the objects.
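A minimal sketch of that refactoring (the helper name and the mapping delegate are illustrative; GlobalConfig.CnnString(db) is reused from the question, and the stored procedure is assumed to take no parameters):
private List<T> ExecuteProcedure<T>(string procedureName, Func<IDataReader, T> map)
{
    var results = new List<T>();
    using (var connection = new SqlConnection(GlobalConfig.CnnString(db)))
    using (var command = new SqlCommand(procedureName, connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                results.Add(map(reader));
            }
        }
    }
    return results;
}
GetPerson_All then shrinks to a single call:
public List<PersonModel> GetPerson_All()
{
    return ExecuteProcedure("dbo.spPeople_GetAll", reader => new PersonModel
    {
        Id = (int)reader["Id"],
        FirstName = (string)reader["FirstName"],
        LastName = (string)reader["LastName"],
        EmailAdress = (string)reader["EmailAddress"],
        CellphoneNumber = (string)reader["CellphoneNumber"]
    });
}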
Is it possible to update objects with Entity Framework, without grabbing them first?
Example: here, I have a function that provides a primary key to locate the objects, pulls them, then updates them. I would like to eliminate having to pull the objects first and simply run an UPDATE query, removing the need for the SELECT query being generated.
public async Task<int> UpdateChecks(long? acctId, string payorname, string checkaccountnumber, string checkroutingnumber, string checkaccounttype)
{
using (var max = new Max(_max.ConnectionString))
{
var payments = await
max.payments.Where(
w =>
w.maindatabaseid == acctId && (w.paymentstatus == "PENDING" || w.paymentstatus == "HOLD")).ToListAsync();
payments.AsParallel().ForAll(payment =>
{
payment.payorname = payorname;
payment.checkaccountnumber = checkaccountnumber;
payment.checkroutingnumber = checkroutingnumber;
payment.checkaccounttype = checkaccounttype;
payment.paymentmethod = "CHECK";
payment.paymentstatus = "HOLD";
});
await max.SaveChangesAsync();
return payments.Count;
}
}
You can use the Attach() method to attach an entity you already know exists; after you modify it, calling SaveChanges() will issue the appropriate UPDATE. Here is some sample code from the MSDN article on the topic:
var existingBlog = new Blog { BlogId = 1, Name = "ADO.NET Blog" };
using (var context = new BloggingContext())
{
context.Entry(existingBlog).State = EntityState.Unchanged;
// Do some more work...
context.SaveChanges();
}
Note that this is general EF logic, not related to any specific database implementation.
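If the primary key values are already known, the same idea can be applied to the question's method so that no SELECT is issued at all. A sketch, assuming the entity type is called payment, its key is paymentid, and the IDs are supplied by the caller (all three names are assumptions, not part of the original code):
public async Task UpdateChecksByIds(IEnumerable<long> knownPaymentIds)
{
    using (var max = new Max(_max.ConnectionString))
    {
        foreach (var id in knownPaymentIds)
        {
            // Attach a stub with only the key set, then mark just the columns that change.
            var stub = new payment { paymentid = id };   // hypothetical entity type and key
            max.payments.Attach(stub);
            stub.paymentmethod = "CHECK";
            stub.paymentstatus = "HOLD";
            max.Entry(stub).Property(p => p.paymentmethod).IsModified = true;
            max.Entry(stub).Property(p => p.paymentstatus).IsModified = true;
        }
        await max.SaveChangesAsync(); // generates UPDATEs only, no SELECT
    }
}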
Now I'm using Dapper + Dapper.Extensions. And yes, it's easy and awesome. But I ran into a problem: Dapper.Extensions has only an Insert method and no InsertUpdateOnDuplicateKey. I want to add such a method, but I don't see a good way to do it:
I want to make this method generic like Insert
I can't get the cached list of properties for a particular type, because I don't want to use reflection directly to build raw SQL
One possible way would be to fork it on GitHub, but I want to keep the change in my project only. Does anybody know how to extend it? I understand this feature ("INSERT ... ON DUPLICATE KEY UPDATE") is supported only in MySQL, but I can't find extension points in DapperExtensions to add this functionality from outside.
Update: this is my fork https://github.com/MaximTkachenko/Dapper-Extensions/commits/master
This piece of code has helped me enormously in MySQL-related projects; I definitely owe you one.
I do a lot of database-related development on both MySQL and MS SQL. I also try to share as much code as possible between my projects.
MS SQL has no direct equivalent for "ON DUPLICATE KEY UPDATE", so I was previously unable to use this extension when working with MS SQL.
While migrating a web application (that leans heavily on this Dapper.Extensions tweak) from MySQL to MS SQL, I finally decided to do something about it.
This code uses the "IF EXISTS => UPDATE ELSE INSERT" approach that basically does the same as "ON DUPLICATE KEY UPDATE" on MySQL.
Please note: the snippet assumes that you are taking care of transactions outside this method. Alternatively, you could prepend "BEGIN TRAN" and append "COMMIT" to the generated SQL string.
public static class SqlGeneratorExt
{
public static string InsertUpdateOnDuplicateKey(this ISqlGenerator generator, IClassMapper classMap, bool hasIdentityKeyWithValue = false)
{
var columns = classMap.Properties.Where(p => !(p.Ignored || p.IsReadOnly || (p.KeyType == KeyType.Identity && !hasIdentityKeyWithValue))).ToList();
var keys = columns.Where(c => c.KeyType != KeyType.NotAKey).Select(p => $"{generator.GetColumnName(classMap, p, false)}=@{p.Name}");
var nonkeycolumns = classMap.Properties.Where(p => !(p.Ignored || p.IsReadOnly) && p.KeyType == KeyType.NotAKey).ToList();
if (!columns.Any())
{
throw new ArgumentException("No columns were mapped.");
}
var tablename = generator.GetTableName(classMap);
var columnNames = columns.Select(p => generator.GetColumnName(classMap, p, false));
var parameters = columns.Select(p => generator.Configuration.Dialect.ParameterPrefix + p.Name);
var valuesSetters = nonkeycolumns.Select(p => $"{generator.GetColumnName(classMap, p, false)}=@{p.Name}").ToList();
var where = keys.AppendStrings(seperator: " and ");
var sqlbuilder = new StringBuilder();
sqlbuilder.AppendLine($"IF EXISTS (select * from {tablename} WITH (UPDLOCK, HOLDLOCK) WHERE ({where})) ");
sqlbuilder.AppendLine(valuesSetters.Any() ? $"UPDATE {tablename} SET {valuesSetters.AppendStrings()} WHERE ({where}) " : "SELECT 0 ");
sqlbuilder.AppendLine($"ELSE INSERT INTO {tablename} ({columnNames.AppendStrings()}) VALUES ({parameters.AppendStrings()}) ");
return sqlbuilder.ToString();
}
}
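Executing the generated statement with Dapper can then be done the same way the MySQL version further down does it; a rough sketch (the extension class name and signature here are illustrative, and the transaction is assumed to be managed by the caller, as noted above):
public static class DapperImplementorSqlServerExt
{
    public static void InsertUpdateOnDuplicateKey<T>(this IDapperImplementor implementor,
        IDbConnection connection, IEnumerable<T> entities,
        IDbTransaction transaction = null, int? commandTimeout = null) where T : class
    {
        IClassMapper classMap = implementor.SqlGenerator.Configuration.GetMap<T>();
        string sql = implementor.SqlGenerator.InsertUpdateOnDuplicateKey(classMap);
        // Dapper executes the statement once per entity, binding its properties as parameters.
        connection.Execute(sql, entities, transaction, commandTimeout, CommandType.Text);
    }
}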
Actually, I closed my pull request and removed my fork because:
I see some open pull requests created in 2014
I found a way "inject" my code in Dapper.Extensions.
To restate my problem: I want to create more generic queries for Dapper.Extensions, which means I need access to the mapping cache for entities, the SqlGenerator, etc. So here is my approach. I want to add the ability to do INSERT ... ON DUPLICATE KEY UPDATE for MySQL. I created an extension method for ISqlGenerator:
public static class SqlGeneratorExt
{
public static string InsertUpdateOnDuplicateKey(this ISqlGenerator generator, IClassMapper classMap)
{
var columns = classMap.Properties.Where(p => !(p.Ignored || p.IsReadOnly || p.KeyType == KeyType.Identity));
if (!columns.Any())
{
throw new ArgumentException("No columns were mapped.");
}
var columnNames = columns.Select(p => generator.GetColumnName(classMap, p, false));
var parameters = columns.Select(p => generator.Configuration.Dialect.ParameterPrefix + p.Name);
var valuesSetters = columns.Select(p => string.Format("{0}=VALUES({1})", generator.GetColumnName(classMap, p, false), p.Name));
string sql = string.Format("INSERT INTO {0} ({1}) VALUES ({2}) ON DUPLICATE KEY UPDATE {3}",
generator.GetTableName(classMap),
columnNames.AppendStrings(),
parameters.AppendStrings(),
valuesSetters.AppendStrings());
return sql;
}
}
One more extension method for IDapperImplementor
public static class DapperImplementorExt
{
public static void InsertUpdateOnDuplicateKey<T>(this IDapperImplementor implementor, IDbConnection connection, IEnumerable<T> entities, int? commandTimeout = null) where T : class
{
IClassMapper classMap = implementor.SqlGenerator.Configuration.GetMap<T>();
var properties = classMap.Properties.Where(p => p.KeyType != KeyType.NotAKey);
string emptyGuidString = Guid.Empty.ToString();
foreach (var e in entities)
{
foreach (var column in properties)
{
if (column.KeyType == KeyType.Guid)
{
object value = column.PropertyInfo.GetValue(e, null);
string stringValue = value.ToString();
if (!string.IsNullOrEmpty(stringValue) && stringValue != emptyGuidString)
{
continue;
}
Guid comb = implementor.SqlGenerator.Configuration.GetNextGuid();
column.PropertyInfo.SetValue(e, comb, null);
}
}
}
string sql = implementor.SqlGenerator.InsertUpdateOnDuplicateKey(classMap);
connection.Execute(sql, entities, null, commandTimeout, CommandType.Text);
}
}
Now I can create a new class derived from the Database class to use my own SQL:
public class Db : Database
{
private readonly IDapperImplementor _dapperIml;
public Db(IDbConnection connection, ISqlGenerator sqlGenerator) : base(connection, sqlGenerator)
{
_dapperIml = new DapperImplementor(sqlGenerator);
}
public void InsertUpdateOnDuplicateKey<T>(IEnumerable<T> entities, int? commandTimeout) where T : class
{
_dapperIml.InsertUpdateOnDuplicateKey(Connection, entities, commandTimeout);
}
}
Yeah, it's necessary to create another DapperImplementor instance because the DapperImplementor instance from the base class is private :(. So now I can use my Db class to call my own generic SQL queries as well as the native queries from Dapper.Extensions. Examples of using the Database class instead of the IDbConnection extensions can be found here.
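For reference, wiring this up might look roughly like the following; the configuration types (DapperExtensionsConfiguration, AutoClassMapper<>, MySqlDialect, SqlGeneratorImpl) are the stock Dapper.Extensions ones, while the connection string and entity list are placeholders:
var config = new DapperExtensionsConfiguration(typeof(AutoClassMapper<>),
                                               new List<Assembly>(),
                                               new MySqlDialect());
using (IDbConnection connection = new MySqlConnection(connectionString))
{
    connection.Open();
    var db = new Db(connection, new SqlGeneratorImpl(config));
    db.InsertUpdateOnDuplicateKey(entities, commandTimeout: null);
}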
I'm trying to use the Dapper ORM with the following simple query:
var sqlString = new StringBuilder();
sqlString.Append("select a.acct AccountNumber,");
sqlString.Append(" b.first_name FirstName,");
sqlString.Append(" b.last_name LastName,");
sqlString.Append(" a.rr RrNumber,");
sqlString.Append(" c.addr1 AddressLine1,");
sqlString.Append(" c.addr2 AddressLine2,");
sqlString.Append(" c.addr3 AddressLine3,");
sqlString.Append(" c.addr4 AddressLine4,");
sqlString.Append(" c.addr5 AddressLine5,");
sqlString.Append(" c.addr6 AddressLine6,");
sqlString.Append(" c.addr7 AddressLine7,");
sqlString.Append(" c.addr8 AddressLine8 ");
sqlString.Append("from (pub.mfclac as a left join pub.mfcl as b on a.client=b.client) ");
sqlString.Append("left join pub.mfclad as c on a.client=c.client ");
sqlString.Append("where a.acct = '#ZYX'");
var connection = new OdbcConnection(_connectionString);
var result = connection.Query(sqlString.ToString(),
new
{
ZYX = accountNumber
});
However, when I execute this with an accountNumber known to exist, Dapper returns nothing. So I tried removing the quotes to verify that the parameter is in fact being replaced with the account number; however, the error returned from the server indicates a syntax error around "@ZYX", which means Dapper is not replacing the parameter with its given value. Any ideas why this is happening? From the limited documentation out there, this should 'just work'.
Edit1
Couldn't get this to work. Using string.Format to insert the parameter as a workaround.
There are two issues here; firstly (although you note this in your question) where a.acct = '@ZYX', under SQL rules, does not make use of any parameter - it looks to match the literal string that happens to include an @ sign. For SQL-Server (see note below), the correct usage would be where a.acct = @ZYX.
However! Since you are using OdbcConnection, named parameters do not apply. If you are actually connecting to something like SQL-Server, I would strongly recommend using the pure ADO.NET clients, which have better features and performance than ODBC. However, if ODBC is your only option: it does not use named parameters. Until a few days ago, this would have represented a major problem, but as per Passing query parameters in Dapper using OleDb, the code (but not yet the NuGet package) now supports ODBC. If you build from source (or wait for the next release), you should be able to use:
...
where a.acct = ?
in your command, and:
var result = connection.Query(sqlString.ToString(),
new {
anythingYouLike = accountNumber
});
Note that the name (anythingYouLike) is not used by ODBC, so can be... anything you like. In a more complex scenario, for example:
.Execute(sql, new { id = 123, name = "abc", when = DateTime.Now });
dapper uses some knowledge of how anonymous types are implemented to understand the original order of the values, so that they are added to the command in the correct sequence (id, name, when).
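To make the ordering concrete, here is a small, made-up illustration (table and column names are invented): the first ? receives "abc", the second the timestamp, and the third 123, purely because of the order in which the anonymous-type members are declared, regardless of their names.
connection.Execute(
    "update Person set Name = ?, ModifiedOn = ? where Id = ?",
    new { name = "abc", when = DateTime.Now, id = 123 });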
One final observation:
Which means Dapper is not replacing the parameter with its given value.
Dapper never replaces parameters with their given value. That is simply not the correct way to parameterize sql: the parameters are usually sent separately, ensuring:
there is no SQL injection risk
maximum query plan re-use
no issues of formatting
Note that some ADO.NET / ODBC providers could theoretically choose to implement things internally via replacement - but that is separate to dapper.
I landed here from a duplicate question: Dapper must declare the scalar variable
Error: Must declare the scalar variable "@Name".
I created queries dynamically with this piece of code:
public static bool Insert<T>(T entity)
{
var tableName = entity.GetType().CustomAttributes.FirstOrDefault(x => x.AttributeType.Name == nameof(TableAttribute))?.ConstructorArguments?.FirstOrDefault().Value as string;
if (string.IsNullOrEmpty(tableName))
throw new Exception($"Cannot save {entity.GetType().Name}. Database models should have [Table(\"tablename\")] attribute.");
DBSchema.TryGetValue(tableName.ToLower(), out var fields);
using (var con = new SqlConnection(ConnectionString))
{
con.Open();
var sql = $"INSERT INTO [{tableName}] (";
foreach (var field in fields.Where(x => x != "id"))
{
sql += $"[{field}]"+",";
}
sql = sql.TrimEnd(',');
sql += ")";
sql += " VALUES (";
foreach (var field in fields.Where(x => x != "id"))
{
sql += "#"+field + ",";
}
sql = sql.TrimEnd(',');
sql += ")";
var affectedRows = con.Execute(sql, entity);
return affectedRows > 0;
}
}
And I got the same error when my model looked like this:
[Table("Users")]
public class User
{
public string Name;
public string Age;
}
I changed it to this:
[Table("Users")]
public class User
{
public string Name { get; set; }
public string Age { get; set; }
}
And it solved the problem for me: Dapper binds parameters from public properties, not from fields.
I've stored 30,000 SimpleObjects in my database:
class SimpleObject
{
public int Id { get; set; }
}
I want to run a query on DB4O that finds all SimpleObjects with any of the specified IDs:
public IEnumerable<SimpleObject> GetMatches(int[] matchingIds)
{
// OH NOOOOOOES! This activates all 30,000 SimpleObjects. TOO SLOW!
var query = from SimpleObject simple in db
join id in matchingIds on simple.Id equals id
select simple;
return query.ToArray();
}
How do I write this query so that DB4O doesn't activate all 30,000 objects?
I am not an expert on this, and it might be good to post on the DB4O forums about it, but I think I have a solution. It involves not using LINQ and using SODA.
This is what I did: I created a quick project that populates the database with 30,000 SimpleObjects based on your post's definition. I then wrote a query to grab all the SimpleObjects from the database:
var simpleObjects = db.Query<SimpleObject>(typeof(SimpleObject));
When I wrapped a StopWatch around it, that run took about 740 milliseconds. I then used your code to search for 100 random numbers between 0 and 2999. The response was 772 ms, so based on that number I am assuming that it is pulling all the objects out of the database. I am not sure how to verify that, but later I think I proved it with performance.
I then went lower. From my understanding, the LINQ provider from the DB4O team just translates into SODA, so I figured I would write a SODA query to test. What I found was that using SODA against a property is bad for performance: it took 19,902 ms to execute. Here is the code:
private SimpleObject[] GetSimpleObjectUsingSodaAgainstAProperty(int[] matchingIds, IObjectContainer db)
{
SimpleObject[] returnValue = new SimpleObject[matchingIds.Length];
for (int counter = 0; counter < matchingIds.Length; counter++)
{
var query = db.Query();
query.Constrain(typeof(SimpleObject));
query.Descend("Id").Constrain(matchingIds[counter]);
IObjectSet queryResult = query.Execute();
if (queryResult.Count == 1)
returnValue[counter] = (SimpleObject)queryResult[0];
}
return returnValue;
}
So, thinking about why this would be so bad, I decided not to use an auto-implemented property and to define the backing field myself, because properties are actually methods, not values:
public class SimpleObject
{
private int _id;
public int Id {
get
{ return _id; }
set
{ _id = value; }
}
}
I then rewrote the query to use the _id private field instead of the property. The performance was much better at about 91 ms. Here is that code:
private SimpleObject[] GetSimpleObjectUsingSodaAgainstAField(int[] matchingIds, IObjectContainer db)
{
SimpleObject[] returnValue = new SimpleObject[matchingIds.Length];
for (int counter = 0; counter < matchingIds.Length; counter++)
{
var query = db.Query();
query.Constrain(typeof(SimpleObject));
query.Descend("_id").Constrain(matchingIds[counter]);
IObjectSet queryResult = query.Execute();
if (queryResult.Count == 1)
returnValue[counter] = (SimpleObject)queryResult[0];
}
return returnValue;
}
Just to make sure it was not a fluke, I ran the test several times and received similar results. I then added another 60,000 records for a total of 90,000, and these were the performance differences:
GetAll: 2450 ms
GetWithOriginalCode: 2694 ms
GetWithSODAandProperty: 75373 ms
GetWithSODAandField: 77 ms
Hope that helps. I know that it does not really explain why, but it might help with the how. Also, the code for the SODA field query would not be hard to wrap into something more generic.
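For example, a rough generic version of the field-based SODA query might look like this (untested sketch; the caller supplies the field name):
private T[] GetByFieldValues<T>(IObjectContainer db, string fieldName, int[] matchingIds)
{
    // Same approach as GetSimpleObjectUsingSodaAgainstAField, just parameterized by type and field.
    T[] returnValue = new T[matchingIds.Length];
    for (int counter = 0; counter < matchingIds.Length; counter++)
    {
        IQuery query = db.Query();
        query.Constrain(typeof(T));
        query.Descend(fieldName).Constrain(matchingIds[counter]);
        IObjectSet queryResult = query.Execute();
        if (queryResult.Count == 1)
            returnValue[counter] = (T)queryResult[0];
    }
    return returnValue;
}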
If you try to run this query using LINQ, it'll run unoptimized (that means db4o is going to retrieve all objects of type SimpleObject and delegate the rest to LINQ to Objects).
The best approach would be to run n queries (since the Id field is indexed, each query should run fast) and aggregate the results, as suggested by Mark Hall.
You can even use LINQ for this, something like:
IList<SimpleObject> objs = new List<SimpleObject>();
foreach (var tbf in ids)
{
    var result = (from SimpleObject o in db
                  where o.Id == tbf
                  select o).ToList();
    if (result.Count == 1)
    {
        objs.Add(result[0]);
    }
}
Best
I haven't done much with db4o LINQ, but you can use the DiagnosticToConsole (or ToTrace) listener and add it via IConfiguration.Diagnostic().AddListener(). This will show you whether the query is optimized.
You don't give a lot of details, but is the Id property on SimpleObject indexed?
Once diagnostics are turned on, you might try the query like so...
from SimpleObject simple in db
where matchingIds.Contains(simple.Id)
select simple
See if that gives you a different query plan.
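Turning the diagnostics on is only a couple of lines against the configuration; a sketch assuming the classic Db4oFactory API (the file name is illustrative):
IConfiguration config = Db4oFactory.NewConfiguration();
config.Diagnostic().AddListener(new DiagnosticToConsole());
using (IObjectContainer db = Db4oFactory.OpenFile(config, "objects.db4o"))
{
    // Run the query here; unoptimized queries will be reported on the console.
}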
You could 'build' a dynamic LINQ query. For example, the API could look like this:
First parameter: an expression that tells which property to search on.
The other parameters: the IDs, or whatever you're searching for.
var result1 = db.ObjectByID((SimpleObject t) => t.Id, 42, 77);
var result2 = db.ObjectByID((SimpleObject t) => t.Id, myIDList);
var result3 = db.ObjectByID((OtherObject t) => t.Name, "gamlerhart","db4o");
The implementation builds a dynamic query like this:
var result = from SimpleObject t in db
where t.Id == 42 || t.Id == 77 ... || t.Id == N
select t
Since everything is combined with OR, it can be evaluated directly on the indexes. It doesn't need activation. Example implementation:
public static class ContainerExtensions{
public static IDb4oLinqQuery<TObjectType> ObjectByID<TObjectType, TIdType>(this IObjectContainer db,
Expression<Func<TObjectType, TIdType>> idPath,
params TIdType[] ids)
{
if(0==ids.Length)
{
return db.Cast<TObjectType>().Where(o=>false);
}
var orCondition = BuildOrChain(ids, idPath);
var whereClause = Expression.Lambda(orCondition, idPath.Parameters.ToArray());
return db.Cast<TObjectType>().Where((Expression<Func<TObjectType, bool>>) whereClause);
}
private static BinaryExpression BuildOrChain<TIdType, TObjectType>(TIdType[] ids, Expression<Func<TObjectType, TIdType>> idPath)
{
var body = idPath.Body;
var currentExpression = Expression.Equal(body, Expression.Constant(ids.First()));
foreach (var id in ids.Skip(1))
{
currentExpression = Expression.OrElse(currentExpression, Expression.Equal(body, Expression.Constant(id)));
}
return currentExpression;
}
}