How can I quickly take objects from the second set of objects which don't exist in the first set of objects? - c#

I have records in two databases. This is the entity in the first database:
public class PersonInDatabaseOne
{
public string Name { get; set; }
public string Surname { get; set; }
}
This is the entity in the second database:
public class PersonInDatabaseTwo
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
How can I get the records from the second database which don't exist in the first database (that is, no record in the first database has the same first name and last name)? I currently have something like the following, but it is far too slow:
List<PersonInDatabaseOne> peopleInDatabaseOne = new List<PersonInDatabaseOne>();
// Here I generate the objects, but in reality I load them from the database:
for (int i = 0; i < 100000; i++)
{
peopleInDatabaseOne.Add(new PersonInDatabaseOne { Name = "aaa" + i, Surname = "aaa" + i });
}
List<PersonInDatabaseTwo> peopleInDatabaseTwo = new List<PersonInDatabaseTwo>();
// Here I generate the objects, but in reality I load them from the database:
for (int i = 0; i < 10000; i++)
{
peopleInDatabaseTwo.Add(new PersonInDatabaseTwo { FirstName = "aaa" + i, LastName = "aaa" + i });
}
for (int i = 0; i < 10000; i++)
{
peopleInDatabaseTwo.Add(new PersonInDatabaseTwo { FirstName = "bbb" + i, LastName = "bbb" + i });
}
List<PersonInDatabaseTwo> peopleInDatabaseTwoWhichNotExistInDatabaseOne = new List<PersonInDatabaseTwo>();
// BELOW CODE IS VERY SLOW:
foreach (PersonInDatabaseTwo personInDatabaseTwo in peopleInDatabaseTwo)
{
if (!peopleInDatabaseOne.Any(x => x.Name == personInDatabaseTwo.FirstName && x.Surname == personInDatabaseTwo.LastName))
{
peopleInDatabaseTwoWhichNotExistInDatabaseOne.Add(personInDatabaseTwo);
}
};

The fastest way depends on the number of entities and on which indexes you already have.
If there are only a few entities, what you already have performs better, because scanning a small set several times costs less than building HashSet objects.
If all of your entities fit in memory, the best way is to build a HashSet out of them and use Except, which is detailed nicely by @alex.feigin.
If you can't afford to load all entities into memory, divide them into batches based on the comparison key, load one batch at a time, and apply the HashSet method to each batch. Note that the batches must be split by comparison key, not by record count. For example, load all entities with names starting with 'A', then 'B', and so on.
If you already have an index on the comparison key (in your case, FirstName and LastName) in one of the databases, you can retrieve a sorted list from that database and run a binary search (http://en.wikipedia.org/wiki/Binary_search_algorithm) on it for each comparison. See https://msdn.microsoft.com/en-us/library/w4e7fxsh(v=vs.110).aspx
If you already have an index on the comparison key in both databases, you can do this in O(n) and in a scalable way (any number of records): walk both sorted lists in step, once, and collect the differences. See https://stackoverflow.com/a/161535/187996 for more details.
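The last case (both sides already sorted) can be sketched as a single merge-style pass. This is a minimal illustration, not the linked answer's code: the class name SortedDifference and the tuple shape are assumptions, and it only works if the database sort order matches the ordinal comparison used here (database collations often differ, so check that before relying on it).

```csharp
using System;
using System.Collections.Generic;

public static class SortedDifference
{
    // Returns the keys of `second` that do not occur in `first`.
    // Assumption: both lists are already sorted by (First, Last) in ordinal order.
    public static List<(string First, string Last)> Except(
        IReadOnlyList<(string First, string Last)> first,
        IReadOnlyList<(string First, string Last)> second)
    {
        var result = new List<(string First, string Last)>();
        int i = 0;
        foreach (var candidate in second)
        {
            // Skip every key in `first` that sorts before the candidate.
            while (i < first.Count && Compare(first[i], candidate) < 0) i++;

            // No equal key at the current position means the candidate is missing from `first`.
            if (i == first.Count || Compare(first[i], candidate) != 0)
                result.Add(candidate);
        }
        return result;
    }

    private static int Compare((string First, string Last) a, (string First, string Last) b)
    {
        int c = string.CompareOrdinal(a.First, b.First);
        return c != 0 ? c : string.CompareOrdinal(a.Last, b.Last);
    }
}
```

Each list is read once, so the work is proportional to the combined length, and neither side needs a hash table in memory.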

Edit: with respect to the comments - using a real model and a dictionary instead of a simple set:
Try hashing your list into a Dictionary that holds your people objects. As the key, use a Tuple instead of comparing name1==name2 && lname1==lname2.
This could then look like this:
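// Some people1 (first database) and people2 (second database) lists of models already exist:
var sw = Stopwatch.StartNew();
// Keys to remove come from the first database, which uses Name/Surname.
var removeThese = people1.Select(x=>Tuple.Create(x.Name,x.Surname));
// The second database uses FirstName/LastName; ToDictionary throws if two people share a key.
var dic2 = people2.ToDictionary(x=>Tuple.Create(x.FirstName,x.LastName),x=>x);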
var result = dic2.Keys.Except(removeThese).Select(x=>dic2[x]).ToList();
Console.WriteLine(sw.Elapsed);
I hope this helps.
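A variant of the same idea, as a minimal sketch: put the keys of the first list into a HashSet of value tuples and filter the second list against it. The class name PeopleDiff and the method name OnlyInSecond are made up for illustration; the two model classes are copied from the question. Unlike ToDictionary, this does not throw when two people in the second list share a name.

```csharp
using System.Collections.Generic;
using System.Linq;

public class PersonInDatabaseOne
{
    public string Name { get; set; }
    public string Surname { get; set; }
}

public class PersonInDatabaseTwo
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class PeopleDiff
{
    public static List<PersonInDatabaseTwo> OnlyInSecond(
        IEnumerable<PersonInDatabaseOne> one,
        IEnumerable<PersonInDatabaseTwo> two)
    {
        // Build the set once. Value tuples compare component by component,
        // so ("aaa1", "aaa1") from either list is the same key.
        var known = new HashSet<(string, string)>(one.Select(p => (p.Name, p.Surname)));

        // Each Contains call is a hash lookup instead of a scan of the whole first list.
        return two.Where(p => !known.Contains((p.FirstName, p.LastName))).ToList();
    }
}
```

With the sizes from the question, this replaces roughly 20,000 scans of a 100,000-element list with 20,000 hash lookups.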

Related

How to turn a string into a 2d string array

As the title suggests, I am looking for guidance on how to turn a string (csvData) into a 2D string array by splitting it twice, with ';' and ',' respectively.
Currently I am at the stage where I am able to split it once into rows and turn it into an array, but I cannot figure out how to instead create a 2D array where the columns divided by ',' are also separate.
string[] Sep = csvData.Split(';').Select(csvData => csvData.Replace(" ","")).Where(csvData => !string.IsNullOrEmpty(csvData)).ToArray();
I have tried various things like :
string[,] Sep = csvData.Split(';',',').Select(csvData => csvData.Replace(" ","")).Where(csvData => !string.IsNullOrEmpty(csvData)).ToArray();
naively thinking that C# would understand what I tried to achieve, but since I am here it's obvious that I got the error "cannot implicitly convert type string[] to string [*,*]".
Note that I have not coded for a while, so if my thinking is completely wrong and you do not understand what I am trying to convey with this question, I apologize in advance.
Thanks!
In a strongly-typed language like C#, the compiler makes no assumptions about what you intend to do with your data. You must make your intent explicit through your code. Something like this should work:
string csvData = "A,B;C,D";
string[][] sep = csvData.Split(';') // Returns string[] {"A,B","C,D"}
.Select(str => str.Split(',')) // Returns IEnumerable<string[]> {{"A","B"},{"C","D"}}
.ToArray(); // Returns string[][] {{"A","B"},{"C","D"}}
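The snippet above produces a jagged array (string[][]), while the question asks for a rectangular one (string[,]). If that is really required, the jagged result can be copied into a rectangular array, roughly as in this sketch. The class name CsvGrid is made up for illustration, and the code assumes that short rows should simply be padded with null.

```csharp
using System;
using System.Linq;

public static class CsvGrid
{
    public static string[,] ToRectangular(string csvData)
    {
        // Rows are separated by ';', values by ','. Empty rows (e.g. after a trailing ';') are dropped.
        string[][] rows = csvData
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(row => row.Split(','))
            .ToArray();

        // The widest row decides the number of columns.
        int columns = rows.Length == 0 ? 0 : rows.Max(r => r.Length);
        var grid = new string[rows.Length, columns];

        for (int r = 0; r < rows.Length; r++)
            for (int c = 0; c < rows[r].Length; c++)
                grid[r, c] = rows[r][c].Trim(); // cells missing from short rows stay null

        return grid;
    }
}
```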
Rows are separated by semicolon, columns by comma?
Splitting by ';' gives you an array of rows. Split a row by ',' gives you an array of values.
If your data has a consistent schema, as in each csv you process has the same columns, you could define a class to represent the entity to make the data easier to work with.
Let's say it's customer data:
John,Smith,8675309,johnsmith@gmail.com;
You could make a class with those properties:
public class Customer
{
public string FirstName { get; set; }
public string LastName { get; set; }
public string Phone { get; set; }
public string Email { get; set; }
}
Then:
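// Drop the empty entry that a trailing ';' would otherwise produce.
var rows = csvData.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
List<Customer> customers = new();
foreach(var row in rows)
{
// Split the row into its fields; indexing the row string itself would yield single characters.
var fields = row.Split(',');
customers.Add(new()
{
FirstName = fields[0],
LastName = fields[1],
Phone = fields[2],
Email = fields[3]
});
}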
Now you have a list of customers to do whatever it is you do with customers.
Here is an answer to present a few alternative ideas and things you can do with C# - more for educational/academic purposes than anything else. These days, to consume a CSV we'd use a CSV library.
If your data is definitely regularly formed, you can get away with just one Split. The following code splits on either char to make one long array. It then stands to reason that every 4 elements is a new customer, the data of the customer being given by n+0, n+1, n+2 and n+3. Because we know how many data items we will consume, dividing it by 4 gives us the number of customers, so we can presize our 2D array:
var bits = data.Split(';',',');
var twoD = new string[bits.Length/4,4];
for(int x = 0; x < bits.Length; x+=4){
twoD[x/4,0] = bits[x+0];
twoD[x/4,1] = bits[x+1];
twoD[x/4,2] = bits[x+2];
twoD[x/4,3] = bits[x+3];
}
I don't think I'd use 2D arrays though - and I commend the other answer advising to create a class to hold the related data; you can use this same technique:
var custs = new List<Customer>();
for(int x = 0; x < bits.Length;){
custs.Add(new()
{
FirstName = bits[x++],
LastName = bits[x++],
Phone = bits[x++],
Email = bits[x++]
});
}
Here we aren't incrementing x in the loop header; every time a bit of info is assigned x is bumped up by 1 in the loop body. We could have kept the same approach as before, jumping it by 4 - just demoing another approach that lends itself well here.
I mentioned that these days we probably wouldn't really read a csv manually and split it ourselves - if the data itself contains a comma or a semicolon, it wrecks the file structure.
There are a boatload of libraries that read CSV files. CsvHelper is a popular one, and you'd use it like:
using var reader = new StreamReader("path\\to\\file.csv");
using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);
var custs = csv.GetRecords<Customer>().ToList();
...
Your file would have a header line with column names that match your property names in C#. If it doesn't, you can use attributes on the properties to tell CsvHelper which column should be mapped to which property - https://joshclose.github.io/CsvHelper/getting-started/
Here's the simplest way I know to produce a 2d array by splitting a string.
string csvData = "A,B,C;D,E,F,G";
var temporary =
csvData
.Split(';')
.SelectMany((xs, i) => xs.Split(',').Select((x, j) => new { x, i, j }))
.ToArray();
int max_i = temporary.Max(x => x.i);
int max_j = temporary.Max(x => x.j);
string[,] array = new string[max_i + 1, max_j + 1];
foreach (var t in temporary)
{
array[t.i, t.j] = t.x;
}
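I purposely chose csvData to be missing a value.
temporary holds one entry per value together with its row index i and column index j: (A, 0, 0), (B, 0, 1), (C, 0, 2), (D, 1, 0), (E, 1, 1), (F, 1, 2), (G, 1, 3).
The final array has 2 rows and 4 columns: the first row is A, B, C, null and the second row is D, E, F, G.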

Trying to get NetSuite Country list with enumeration value linked to code and name

I am implementing an integration with NetSuite in C#. In the external system I need to populate a list of countries that will match NetSuite's country list.
The NetSuite Web Service provides an enumeration called Country:
public enum Country {
_afghanistan,
_alandIslands,
_albania,
_algeria,
...
You can also get a list of country names and codes (albeit in a not so straightforward way) from the web service. (See: http://suiteweekly.com/2015/07/netsuite-get-all-country-list/)
Which gives you access to values like this:
Afghanistan, AF
Aland Islands, AX
Albania, AL
Algeria, DZ
American Samoa, AS
...
But, as you can see, there is no way to link the two together. (I tried to match by index but that didn't work and sounds scary anyway)
NetSuite's "help" files have a list. But this is static, and I really want a dynamic solution that updates as NetSuite updates, because we know countries will change, even if not that often.
[Screenshot: Country enumerations from the NetSuite help docs]
The only solutions I have found online are people who have provided static data that maps the two sets of data. (ex. suiteweekly.com /2015/07/netsuite-complete-country-list-in-netsuite/)
I cannot (don't want to) believe that this is the only solution.
Anyone else have experience with this that has a better solution?
NetSuite, if you are reading, come on guys, give a programmer a break.
The best solution I have come up with is to leverage the apparent relationship between the country name and the enumeration key to forge a link between the two. I am sure others could improve on this solution, but what I would really like to see is a solution that isn't a hack like this, one that relies on an explicit connection rather than on an apparent pattern. Or better yet, NetSuite should just provide the data all together in one place.
For example you can see the apparent relationship here:
_alandIslands -> Aland Islands
With a little code I can try to forge a match.
I first get the enumeration keys into an array, and I create a list of objects of type NetSuiteCountry that will hold my results.
var CountryEnumKeys = Enum.GetNames(typeof(Country));
var countries = new List<NetSuiteCountry>();
I then loop through the list of country names and codes that I got using the referenced code above (not shown here).
For each country name, I strip all non-word characters with Regex.Replace, prepend an underscore (_), and convert the string to lowercase. Finally, I try to find a match between the enumeration key (also converted to lowercase) and the matcher string that was created. If a match is found, I save all the data together in the countries list.
UPDATE: Based on the comments I have added additional code/hacks to try to deal with the anomalies without hard-coding exceptions. Hopefully these updates will catch any future updates to the country list as well, but no promises. As of this writing it was able to handle all the known anomalies. In my case I needed to ignore Deprecated countries so those aren't included.
foreach (RecordRef baseRef in baseRefList)
{
var name = baseRef.name;
//Skip Deprecated countries
if (name.EndsWith("(Deprecated)")) continue;
//Use the name to try to find an enum key match and only add a country if found.
var enumMatcher = $"_{Regex.Replace(name, @"\W", "").ToLower()}";
//Compares Ignoring Case and Diacritic characters
var enumMatch = CountryEnumKeys.FirstOrDefault(e => string.Compare(e, enumMatcher, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) == 0);
//Then try by Enum starts with Name but only one.
if (enumMatch == null)
{
var matches = CountryEnumKeys.Where(e => e.ToLower().StartsWith(enumMatcher));
if (matches.Count() == 1)
{
Debug.Write($"- Country Match Hack 1 : ");
enumMatch = matches.First();
}
}
//Then try by Name starts with Enum but only one.
if (enumMatch == null)
{
var matches = CountryEnumKeys.Where(e => enumMatcher.StartsWith(e.ToLower()));
if (matches.Count() == 1)
{
Debug.Write($"- Country Match Hack 2 : ");
enumMatch = matches.First();
}
}
//Finally try by first half Enum and Name match but again only one.
if (enumMatch == null)
{
var matches = CountryEnumKeys.Where(e => e.ToLower().StartsWith(enumMatcher.Substring(0, (enumMatcher.Length/2))));
if (matches.Count() == 1)
{
Debug.Write($"- Country Match Hack 3 : ");
enumMatch = matches.First();
}
}
if (enumMatch != null)
{
var enumIndex = Array.IndexOf(CountryEnumKeys, enumMatch);
if (enumIndex >= 0)
{
var country = (Country) enumIndex;
var nsCountry = new NetSuiteCountry
{
Name = baseRef.name,
Code = baseRef.internalId,
EnumKey = country.ToString(),
Country = country
};
Debug.WriteLine($"[{nsCountry.Name}] as [{nsCountry.EnumKey}]");
countries.Add(nsCountry);
}
}
else
{
Debug.WriteLine($"Could not find Country match for: [{name}] as [{enumMatcher}]");
}
}
Here is my NetSuiteCountry class:
public class NetSuiteCountry
{
public string Name { get; set; }
public string Code { get; set; }
public string EnumKey { get; set; }
public Country Country { get; set; }
}
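One caveat about the loop above: casting the array index to Country only gives the right member if the enum values run 0, 1, 2, ... in declaration order. A sketch of a lookup by name that does not depend on this follows. DemoCountry is a hypothetical stand-in for the generated NetSuite enum, with deliberately non-sequential values, and EnumLookup.FromKey is an illustrative helper, not part of any NetSuite API.

```csharp
using System;

// Hypothetical stand-in for the generated NetSuite Country enum.
// The values are non-sequential on purpose, to show that index != value in general.
public enum DemoCountry
{
    _afghanistan = 10,
    _alandIslands = 20,
    _albania = 30
}

public static class EnumLookup
{
    // Resolves an enum key such as "_alandIslands" by name, ignoring case.
    // Returns null when the key is not a defined member.
    public static DemoCountry? FromKey(string enumKey)
    {
        // TryParse also accepts numeric strings, so IsDefined filters those out.
        return Enum.TryParse(enumKey, ignoreCase: true, out DemoCountry value)
               && Enum.IsDefined(typeof(DemoCountry), value)
            ? value
            : (DemoCountry?)null;
    }
}
```

In the loop, the matched key could then be resolved with this kind of lookup instead of Array.IndexOf followed by a cast.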
Let me start off with a disclaimer that I'm not a coder, and this is the first day I've tried to look at a C# program.
I need something similar for a JavaScript project, where I need the complete list of NetSuite country names, codes and their numeric values. When reading the help, it seemed like the only way was through web services.
I downloaded the sample application for webservices from Netsuite and a version of Visual Studio and I was able to edit the sample program provided to create a list of all of the country names and country codes (ex. Canada, CA).
I started out doing something similar to the previous poster to get the list of country names:
string[] countryList = Enum.GetNames(typeof(Country));
foreach (string s in countryList)
{
_out.writeLn(s);
}
But I later got rid of this and started a new technique. I created a class similar to the previous answer:
public class NS_Country
{
public string countryCode { get; set; }
public string countryName { get; set; }
public string countryEnum { get; set; }
public string countryNumericID { get; set; }
}
Here is the new code for getting the list of country names, codes and IDs. I realize that it's not very efficient; as I mentioned before, I'm not really a coder and this is my first attempt with C#, with lots of Google and cutting/pasting ;D.
_out.writeLn(" Attempting to get Country list.");
// Create a list for the NS_Country objects
List<NS_Country> CountryList = new List<NS_Country>();
// Create a new GetSelectValueFieldDescription object to use in a getSelectValue search
GetSelectValueFieldDescription countryDesc = new GetSelectValueFieldDescription();
countryDesc.recordType = RecordType.customer;
countryDesc.recordTypeSpecified = true;
countryDesc.sublist = "addressbooklist";
countryDesc.field = "country";
// Create a GetSelectValueResult object to hold the results of the search
GetSelectValueResult myResult = _service.getSelectValue(countryDesc, 0);
BaseRef[] baseRef = myResult.baseRefList;
foreach (BaseRef nsCountryRef in baseRef)
{
// Didn't know how to do this more efficiently
// Get the type for the BaseRef object, get the property for "internalId",
// then finally get its value as a string and assign it to myCountryCode
string myCountryCode = nsCountryRef.GetType().GetProperty("internalId").GetValue(nsCountryRef).ToString();
// Create a new NS_Country object
NS_Country countryToAdd = new NS_Country
{
countryCode = myCountryCode,
countryName = nsCountryRef.name,
// Call to a function to get the enum value based on the name
countryEnum = getCountryEnum(nsCountryRef.name)
};
try
{
// If the country enum was verified in the Countries enum
if (!String.IsNullOrEmpty(countryToAdd.countryEnum))
{
int countryEnumIndex = (int)Enum.Parse(typeof(Country), countryToAdd.countryEnum);
Debug.WriteLine("Enum: " + countryToAdd.countryEnum + ", Enum Index: " + countryEnumIndex);
_out.writeLn("ID: " + countryToAdd.countryCode + ", Name: " + countryToAdd.countryName + ", Enum: " + countryToAdd.countryEnum);
}
}
// There was a problem locating the country enum that was not handled
catch (Exception ex)
{
Debug.WriteLine("Enum: " + countryToAdd.countryEnum + ", Enum Index Not Found");
_out.writeLn("ID: " + countryToAdd.countryCode + ", Name: " + countryToAdd.countryName + ", Enum: Not Found");
}
// Add the countryToAdd object to the CountryList
CountryList.Add(countryToAdd);
}
// Create a JSON - I need this for my javascript
var javaScriptSerializer = new System.Web.Script.Serialization.JavaScriptSerializer();
string jsonString = javaScriptSerializer.Serialize(CountryList);
Debug.WriteLine(jsonString);
In order to get the enum values, I created a function called getCountryEnum:
static string getCountryEnum(string countryName)
{
// Create a dictionary for looking up the exceptions that can't be converted
// Don't know what Netsuite was thinking with these ones ;D
Dictionary<string, string> dictExceptions = new Dictionary<string, string>()
{
{"Congo, Democratic Republic of", "_congoDemocraticPeoplesRepublic"},
{"Myanmar (Burma)", "_myanmar"},
{"Wallis and Futuna", "_wallisAndFutunaIslands"}
};
// Replace "'s" in the country names with "s"
string countryName2 = Regex.Replace(countryName, @"'s", "s");
// Call a function that replaces accented characters with non-accented equivalent
countryName2 = RemoveDiacritics(countryName2);
countryName2 = Regex.Replace(countryName2, @"\W", " ");
string[] separators = {" ","'"}; // "'" required to deal with country names like "Cote d'Ivoire"
string[] words = countryName2.Split(separators, StringSplitOptions.RemoveEmptyEntries);
for (var i = 0; i < words.Length; i++)
{
string word = words[i];
if (i == 0)
{
words[i] = char.ToLower(word[0]) + word.Substring(1);
}
else
{
words[i] = char.ToUpper(word[0]) + word.Substring(1);
}
}
string countryEnum2 = "_" + String.Join("", words);
// return an empty string if the country name contains Deprecated
bool b = countryName.Contains("Deprecated");
if (b)
{
return String.Empty;
}
else
{
// test to see if the country name was one of the exceptions
string test;
bool isExceptionCountry = dictExceptions.TryGetValue(countryName, out test);
if (isExceptionCountry == true)
{
return dictExceptions[countryName];
}
else
{
return countryEnum2;
}
}
}
In the above I used a function, RemoveDiacritics I found here. I will repost the referenced function below:
static string RemoveDiacritics(string text)
{
string formD = text.Normalize(NormalizationForm.FormD);
StringBuilder sb = new StringBuilder();
foreach (char ch in formD)
{
UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
if (uc != UnicodeCategory.NonSpacingMark)
{
sb.Append(ch);
}
}
return sb.ToString().Normalize(NormalizationForm.FormC);
}
Here are the tricky cases to test any solution you develop with:
// Test tricky names
Debug.WriteLine(getCountryEnum("Curaçao"));
Debug.WriteLine(getCountryEnum("Saint Barthélemy"));
Debug.WriteLine(getCountryEnum("Croatia/Hrvatska"));
Debug.WriteLine(getCountryEnum("Korea, Democratic People's Republic"));
Debug.WriteLine(getCountryEnum("US Minor Outlying Islands"));
Debug.WriteLine(getCountryEnum("Cote d'Ivoire"));
Debug.WriteLine(getCountryEnum("Heard and McDonald Islands"));
// Enums that fail
Debug.WriteLine(getCountryEnum("Congo, Democratic Republic of")); // _congoDemocraticPeoplesRepublic added to exceptions
Debug.WriteLine(getCountryEnum("Myanmar (Burma)")); // _myanmar added to exceptions
Debug.WriteLine(getCountryEnum("Netherlands Antilles (Deprecated)")); // Skip Deprecated
Debug.WriteLine(getCountryEnum("Serbia and Montenegro (Deprecated)")); // Skip Deprecated
Debug.WriteLine(getCountryEnum("Wallis and Futuna")); // _wallisAndFutunaIslands added to exceptions
For my purposes I wanted a JSON object that had all the values for countries (Name, Code, Enum, Value). I'll include it here in case anyone is searching for it. The numeric values are useful when you have a 3rd party HTML form that has to forward the information to a NetSuite online form.
Here is a link to the JSON object on Pastebin.
My apologies for the lack of programming knowledge (I only really do a bit of JavaScript); hopefully this additional information will be useful for someone.

Slow performance in getting model from list model using enumerable linq

I decided to load database records into a List<> model and use Enumerable LINQ to get records from it. It has 141,856 records in it. What we found instead is that it is pretty slow.
So, any suggestion or recommendation on making it run very quickly?
public class Geography
{
public string Zipcode { get; set; }
public string City { get; set; }
public string State { get; set; }
}
var geography = new List<Geography>();
geography.Add(new Geography() { Zipcode = "32245", City = "Jacksonville", State = "Florida" });
geography.Add(new Geography() { Zipcode = "00001", City = "Atlanta", State = "Georgia" });
var result = geography.Where(x => (string.Equals(x.Zipcode, "32245", StringComparison.InvariantCultureIgnoreCase))).FirstOrDefault();
We have 86,000 vehicles in inventory and want to use parallel tasks to get it done quickly, but it becomes very slow when geography is being looked up.
await Task.WhenAll(vehicleInventoryRecords.Select(async inventory =>
{
var result = geography.Where(x => (string.Equals(x.Zipcode, inventory.Zipcode, StringComparison.InvariantCultureIgnoreCase))).FirstOrDefault();
}));
Use a Dictionary<string, Geography> to store the geography data. Looking up data in a dictionary by key is an O(1) operation on average, while for a list it is O(n).
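A minimal sketch of that suggestion, assuming ZIP codes are non-null and that the first row should win if a ZIP code appears more than once. The class name GeographyIndex is made up for illustration; Geography is the class from the question.

```csharp
using System;
using System.Collections.Generic;

public class Geography
{
    public string Zipcode { get; set; }
    public string City { get; set; }
    public string State { get; set; }
}

public static class GeographyIndex
{
    // Builds the index once; afterwards each lookup is a hash lookup, not a list scan.
    public static Dictionary<string, Geography> Build(IEnumerable<Geography> rows)
    {
        // The comparer makes key lookups case-insensitive, like the original string.Equals call.
        var byZip = new Dictionary<string, Geography>(StringComparer.OrdinalIgnoreCase);
        foreach (var row in rows)
        {
            // Keep the first row per ZIP code; ToDictionary would throw on a duplicate key.
            if (!byZip.ContainsKey(row.Zipcode))
                byZip[row.Zipcode] = row;
        }
        return byZip;
    }
}
```

A lookup is then `byZip.TryGetValue(inventory.Zipcode, out var match)`, which returns false (and leaves match null) when the ZIP code is unknown.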
You haven't mentioned if your ZIP codes are unique, so I'll assume they aren't. If they are - look at Giorgi's answer and skip to part 2 of my answer.
1. Use lookups
Since you're looking up your geography list multiple times by the same property, you should group the values by Zipcode. You can do this easily by using ToLookup - this will create a Lookup object. It is similar to a Dictionary, except that each key can map to multiple values. Passing StringComparer.InvariantCultureIgnoreCase as the second argument to ToLookup makes it case-insensitive.
var geography = new List<Geography>();
geography.Add(new Geography { Zipcode = "32245", City = "Jacksonville", State = "Florida" });
geography.Add(new Geography { Zipcode = "00001", City = "Atlanta", State = "Georgia" });
var geographyLookup = geography.ToLookup(x => x.Zipcode, StringComparer.InvariantCultureIgnoreCase);
var result = geographyLookup["32245"].FirstOrDefault();
This should increase your performance considerably.
2. Parallelize with PLINQ
The way you parallelize your lookups is questionable. Luckily, .NET has PLINQ. You can use AsParallel and a parallel Select to iterate over your vehicleInventoryRecords in parallel like this:
var results = vehicleInventoryRecords.AsParallel().Select(x => geographyLookup[x.Zipcode].FirstOrDefault());
Using Parallel.ForEach is another good option.
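A sketch of the Parallel.ForEach variant follows. VehicleInventory with its Vin property is a hypothetical stand-in for the real inventory record, and ParallelGeo.Resolve is an illustrative name. The lookup is only read after it has been built, and the results go into a ConcurrentDictionary because several threads write at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class Geography
{
    public string Zipcode { get; set; }
    public string City { get; set; }
    public string State { get; set; }
}

// Hypothetical inventory record; only Zipcode appears in the question.
public class VehicleInventory
{
    public string Vin { get; set; }
    public string Zipcode { get; set; }
}

public static class ParallelGeo
{
    public static ConcurrentDictionary<string, Geography> Resolve(
        IEnumerable<VehicleInventory> vehicles,
        ILookup<string, Geography> geographyLookup)
    {
        var geographyByVin = new ConcurrentDictionary<string, Geography>();

        Parallel.ForEach(vehicles, vehicle =>
        {
            // The indexer returns an empty sequence for unknown keys, so FirstOrDefault yields null.
            var match = geographyLookup[vehicle.Zipcode].FirstOrDefault();
            if (match != null)
                geographyByVin[vehicle.Vin] = match;
        });

        return geographyByVin;
    }
}
```

Once the lookup itself is a hash lookup, the per-item work is tiny, so it is worth measuring whether parallelism still helps at all for 86,000 records.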

Remove duplicates from array of objects

I have a class called Customer that has several string properties like
firstName, lastName, email, etc.
I read in the customer information from a csv file that creates an array of the class:
Customer[] customers
I need to remove the duplicate customers having the same email address, leaving only 1 customer record for each particular email address.
I have done this using 2 loops but it takes nearly 5 minutes as there are usually 50,000+ customer records. Once I am done removing the duplicates, I need to write the customer information to another csv file (no help needed here).
If I did a Distinct in a loop how would I remove the other string variables that are a part of the class for that particular customer as well?
Thanks,
Andrew
With Linq, you can do this in O(n) time (single level loop) with a GroupBy
var uniquePersons = persons.GroupBy(p => p.Email)
.Select(grp => grp.First())
.ToArray();
Update
A bit on the O(n) behavior of GroupBy.
In the LINQ implementation of GroupBy (Enumerable.cs), the IEnumerable is iterated only once to create the grouping. A hash of the key provided (e.g. "Email" here) is used to find unique keys, and the elements are added to the grouping that corresponds to each key.
Please see the GetGrouping code, and some older posts for reference:
What's the asymptotic complexity of GroupBy operation?
What guarantees are there on the run-time complexity (Big-O) of LINQ methods?
Select is likewise O(n), making the above code O(n) overall.
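To make the single pass explicit, the same deduplication can be written by hand with a HashSet. This is a sketch of the idea, not how GroupBy is implemented internally. The Customer class mirrors the one described in the question, and treating emails case-insensitively (and null as empty) is an assumption that differs from the GroupBy call above.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

public static class CustomerDedup
{
    // Keeps the first customer seen for each email address, in input order.
    public static List<Customer> ByEmail(IEnumerable<Customer> customers)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var unique = new List<Customer>();

        foreach (var customer in customers)
        {
            // HashSet.Add returns false when the key is already present,
            // so each customer costs one hash operation on average.
            if (seen.Add(customer.Email ?? string.Empty))
                unique.Add(customer);
        }
        return unique;
    }
}
```

Every element is visited once and every hash operation is O(1) on average, which is where the overall O(n) comes from.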
Update 2
To handle empty/null values.
So, if there are instances where the value of Email is null or empty, the simple GroupBy will take just one of those objects from null & empty each.
One quick way to include all those objects with null/empty value is to use some unique keys at the run time for those objects, like
var tempEmailIndex = 0;
var uniqueNullAndEmpty = persons
.GroupBy(p => string.IsNullOrEmpty(p.Email)
? (++tempEmailIndex).ToString() : p.Email)
.Select(grp => grp.First())
.ToArray();
I'd do it like this:
public class Person {
public Person(string eMail, string Name) {
this.eMail = eMail;
this.Name = Name;
}
public string eMail { get; set; }
public string Name { get; set; }
}
public class eMailKeyedCollection : System.Collections.ObjectModel.KeyedCollection<string, Person> {
protected override string GetKeyForItem(Person item) {
return item.eMail;
}
}
public void testIt() {
var testArr = new Person[5];
testArr[0] = new Person("Jon@Mullen.com", "Jon Mullen");
testArr[1] = new Person("Jane@Cullen.com", "Jane Cullen");
testArr[2] = new Person("Jon@Cullen.com", "Jon Cullen");
testArr[3] = new Person("John@Mullen.com", "John Mullen");
testArr[4] = new Person("Jon@Mullen.com", "Test Other"); //same eMail as index 0...
var targetList = new eMailKeyedCollection();
foreach (var p in testArr) {
if (!targetList.Contains(p.eMail))
targetList.Add(p);
}
}
If the item is found in the collection, you could easily pick (and eventually modify) it with:
if (!targetList.Contains(p.eMail))
targetList.Add(p);
else {
var currentPerson=targetList[p.eMail];
//modify Name, Address whatever...
}

Insert Multiple Values and Return Multiple Values

I've just started using Dapper and I've run into the following problem.
I want to insert a bunch of records, and return the inserted records alongside the auto-incremented id.
Using Postgres, I want to run the equivalent of this query:
INSERT INTO players (name)
VALUES ('Player1'), ('Player2'), ('Player3'), ('Player4'), ('Player5')
RETURNING id, name;
Using Dapper to run this query on a list of players and serialise back into a list of players (with the ids) I thought I could do this:
public class Player
{
public int Id { get; set; }
public string Name { get; set; }
}
var players = new List<Player> { new Player { Name = "Player1" }, new Player { Name = "Player2" }, new Player { Name = "Player3" }, new Player { Name = "Player4" }, new Player { Name = "Player5" }};
connection.Query<Player>("INSERT INTO players (name) VALUES (@Name) \r\n" +
"RETURNING id, name, tag;",
players);
This throws the following error (it's a list of players each with a name):
Parameter '@Name' referenced in SQL but not found in parameter list
I believe that Query() may not support lists of parameters, so I tried connection.Execute() instead. Execute works, but obviously it doesn't return back the inserted players with their Ids.
It is worth noting that I can do an INSERT and RETURNING like this when I insert only one value.
Does anyone know how I can do INSERT and RETURNING for multiple values like this with Dapper?
Update
I have this (somewhat dirty) solution:
var sb = new StringBuilder();
sb.Append("INSERT INTO players (name) VALUES \r\n");
var parameters = new ExpandoObject() as IDictionary<string, object>;
var values = new List<string>();
for (int i = 0; i < players.Count; i++)
{
var p = players[i];
values.Add($"(@Name{i})");
parameters[$"Name{i}"] = p.Name;
}
sb.Append(string.Join(", \r\n", values));
sb.Append(" \r\nRETURNING id, name, tag;");
// parameters = { Name0 = "Player1", Name1 = "Player2", ... etc }
var ret = connection.Query<Player>(sb.ToString(), parameters);
So building an ExpandoObject with properties from my Players and then passing that into Dapper Query(). It works, but it seems pretty dirty. Any suggestions on how to improve this?
Firstly, it should be noted that passing a List<Player> to the Execute method as the outermost parameter is essentially the same as:
foreach(var player in players)
connection.Execute(
"INSERT INTO players (name) VALUES (@Name) \r\n" +
"RETURNING id, name, tag;", player);
Dapper just unrolls it for you (unless it is a very specific async scenario where it can pipeline the commands). Dapper does support list-parameter expansion, but this is for leaf-level values, and was constructed for in (...) usage, so the syntax would not come out quite as you want; as an example:
DateTime dateStart = ...
int[] custIds = ...
var orders = conn.Query<Order>(@"
select * from Order
where OrderDate >= @dateStart and CustomerId in @custIds",
new { dateStart, custIds }).AsList();
which becomes the SQL:
select * from Order
where OrderDate >= @dateStart and CustomerId in (@custIds0, @custIds1, ...)
(depending on the number of items in the array)
Your expected usage is one that has been suggested and discussed quite a bit recently; at the current time it isn't supported - the loop unrolling only works for Execute, however, it is looking increasingly likely that we will add something here. The tricky bit is in deciding what the correct behavior is, and whether it is expected that this would essentially concatenate the results of multiple separate operations.
However; to do what you want via LINQ:
var results = players.SelectMany(
player => connection.Query<Player>("...", player)).AsList();
This is the same "unroll the loop and concatenate the results" behavior, except it should work.
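If a single round trip matters more than simplicity, the multi-row statement from the question's own workaround can be built by a small helper, sketched below. MultiRowInsert.Build is a made-up name, the table and column names are taken from the question, and nothing here calls Dapper. Passing the resulting dictionary as the parameter object to connection.Query<Player> is assumed to work the same way the question's ExpandoObject did.

```csharp
using System;
using System.Collections.Generic;

public static class MultiRowInsert
{
    // Builds "INSERT ... VALUES (@Name0), (@Name1), ... RETURNING id, name;"
    // together with the matching parameter values.
    public static (string Sql, Dictionary<string, object> Parameters) Build(IReadOnlyList<string> names)
    {
        if (names.Count == 0)
            throw new ArgumentException("At least one name is required.", nameof(names));

        var parameters = new Dictionary<string, object>();
        var rows = new List<string>();

        for (int i = 0; i < names.Count; i++)
        {
            // Only the parameter name is concatenated into the SQL; the value stays a bound parameter.
            rows.Add($"(@Name{i})");
            parameters[$"Name{i}"] = names[i];
        }

        string sql = "INSERT INTO players (name) VALUES " + string.Join(", ", rows) + " RETURNING id, name;";
        return (sql, parameters);
    }
}
```

Because values are never spliced into the SQL text, this keeps the injection safety of the parameterized single-row version.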
