Create a DataTable containing only unique values with LINQ in C#

I have a DataTable dt_Candidates
Candidate | First Name | Last Name
--------------------|----------------|---------------
John, Kennedy | John | Kennedy
Richard, Nixon | Richard | Nixon
Eleanor, Roosevelt | Eleanor | Roosevelt
Jack, Black | Jack | Black
Richard, Nixon | Richard | Nixon
Without nested loops, and preferably using LINQ, I want to create a DataTable containing ONLY the unique values, like this one called dt_Candidates2:
Candidate | First Name | Last Name
--------------------|----------------|---------------
John, Kennedy | John | Kennedy
Eleanor, Roosevelt | Eleanor | Roosevelt
Jack, Black | Jack | Black
And a list or an array called RejectedCandidates containing the distinct duplicates
RejectedCandidates = {"Richard, Nixon"}

As noted, I don't think this really needs LINQ. It can go something like this:
DataTable dt = new DataTable();
dt.Columns.Add("Candidate");
dt.Columns.Add("First");
dt.Columns.Add("Last");
dt.PrimaryKey = new[] { dt.Columns["Candidate"] }; // means that dt.Rows.Find() will work
while (...) {
    string candidate = ...
    if (dt.Rows.Find(candidate) != null)
        RejectList.Add(...);
    else
        dt.Rows.Add(...);
}
Avoid using LINQ's .Any on a DataTable for this. Not only is it a pain to get going, because it needs casting steps or extension methods such as AsEnumerable(), it will also loop over the rows to find the info you seek; the built-in PrimaryKey mechanism maintains an index, so its lookups are much faster than a linear scan.
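The PrimaryKey approach above leaves its loop source and row values elided. A minimal sketch of one way to fill it in follows; the class name PrimaryKeyDedup, the Split method, and the string[] input shape are assumptions made for illustration, not part of the original answer. Since the question wants duplicated candidates removed entirely, the sketch also removes the first occurrence once a duplicate shows up.

```csharp
using System.Collections.Generic;
using System.Data;

public static class PrimaryKeyDedup
{
    // Hypothetical helper: each input row is { candidate, first, last }.
    // Returns the rows seen exactly once plus the set of duplicated candidate names.
    public static (DataTable Unique, HashSet<string> Rejected) Split(IEnumerable<string[]> rows)
    {
        var dt = new DataTable();
        dt.Columns.Add("Candidate");
        dt.Columns.Add("First");
        dt.Columns.Add("Last");
        dt.PrimaryKey = new[] { dt.Columns["Candidate"] }; // enables dt.Rows.Find(key)

        var rejected = new HashSet<string>();
        foreach (string[] r in rows)
        {
            if (rejected.Contains(r[0]))
                continue; // third or later occurrence of an already rejected candidate

            DataRow existing = dt.Rows.Find(r[0]);
            if (existing != null)
            {
                rejected.Add(r[0]);
                dt.Rows.Remove(existing); // drop the first occurrence as well
            }
            else
            {
                dt.Rows.Add(r[0], r[1], r[2]);
            }
        }
        return (dt, rejected);
    }
}
```

Called with the five rows from the question, this should leave three rows in the table and "Richard, Nixon" in the rejected set.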

var dt = new DataTable
{
    Columns = { "Candidate", "First Name", "Last Name" },
    Rows =
    {
        new object[] { "John, Kennedy", "John", "Kennedy" },
        new object[] { "Richard, Nixon", "Richard", "Nixon" },
        new object[] { "Eleanor, Roosevelt", "Eleanor", "Roosevelt" },
        new object[] { "Jack, Black", "Jack", "Black" },
        new object[] { "Richard, Nixon", "Richard", "Nixon" },
    }
};
You can use grouping (GroupBy) to find the duplicates, filter them out, and then create a new DataTable using the DataTableExtensions.CopyToDataTable extension method:
var dt2 = dt.AsEnumerable()
    .GroupBy(r => r["Candidate"])
    .Where(g => g.Count() == 1)
    .Select(g => g.First())
    .CopyToDataTable();
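The grouping query above yields only the unique rows; the question also asks for RejectedCandidates. A sketch of how both could come from the same grouping is below. The class name CandidateGrouping and the method Split are made up for this example, and on .NET Framework it assumes a reference to System.Data.DataSetExtensions for AsEnumerable and CopyToDataTable.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class CandidateGrouping
{
    // Hypothetical helper: splits a table into rows whose "Candidate" value
    // occurs exactly once and the list of candidate names that occur more than once.
    public static (DataTable Unique, List<string> Rejected) Split(DataTable dt)
    {
        var groups = dt.AsEnumerable()
            .GroupBy(r => r.Field<string>("Candidate"))
            .ToList();

        List<DataRow> uniqueRows = groups
            .Where(g => g.Count() == 1)
            .Select(g => g.First())
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an empty clone.
        DataTable unique = uniqueRows.Count > 0 ? uniqueRows.CopyToDataTable() : dt.Clone();

        List<string> rejected = groups
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();

        return (unique, rejected);
    }
}
```

Materializing the groups once with ToList() avoids grouping the table twice.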

Related

Find the first free id

One of my small database management projects (written in Delphi) used SQL queries to find the first free id in a MySQL table.
Example: I have to find the first free id (hole) in a table like this:
| id | Col1 |
|------|------|
| 5101 | ABC |
| 5102 | BCD |
| 5103 | CDE |
| 5105 | EFG | 🡔 first missing id
| 5106 | GHI |
| 5108 | ILM |
The code should find the first free id, 5104.
Here's how I did it in SQL (in the old project):
SELECT
    MIN((doc.id + 1)) AS nextID
FROM (doc
    LEFT JOIN doc doc1
        ON (((doc.id + 1) = doc1.id)))
WHERE (ISNULL(doc1.id) AND (doc.id > 5000))
Now that I am rewriting the project in C#, I need to convert the SQL statement into a LINQ query (the project uses Devart dotConnect for MySQL with Entity Framework).
Starting from here:
DC db = new DC();
var nums = db.Documentos.OrderBy(x => x.Id);
From Can LINQ be used to find gaps in a sorted list?:
var strings = new string[] { "7", "13", "8", "12", "10", "11", "14" };
var list = strings.Select(s => int.Parse(s)).OrderBy(n => n).ToList();
var result = Enumerable.Range(list.Min(), list.Count).Except(list).First(); // 9
Basically, parse and order the list. Then create a run of sequential numbers starting at the minimum, with as many entries as the list has. Remove the values that are present in the list, and grab the first one left over. That's the first missing number. If the list has no gaps, nothing is left over and First() throws.
This can give you all gaps within your table
var nums = (new List<int> { 1, 2, 3, 25, 4, 5, 6, 7, 8, 12, 15, 21, 22, 23 }).AsQueryable();
nums
    .OrderBy(x => x)
    .GroupJoin(nums, n => n + 1, ni => ni, (o, i) => new { o, i })
    .Where(t => !t.i.Any()) // keep the numbers whose successor (o + 1) is missing
    .Dump(); // Dump() is LINQPad's output helper
Another method (similar to what you're using now).
Assume you have an array of integers (or another type of collection) like this:
var myIDs = new int[] { 5101, 5113, 5102, 5103, 5110, 5104, 5105, 5116, 5106, 5107, 5108, 5112, 5114, 5115 };
If it's not already ordered, then OrderBy() it:
myIDs = myIDs.OrderBy(n => n).ToArray();
Find the first number whose successor is missing (n + 1 is less than the next number) and add 1 to get the free id:
int result = myIDs.Where((n, i) => (i < myIDs.Length - 1) && (n + 1 < myIDs[i + 1])).Select(n => n + 1).FirstOrDefault();
If none of the members of this collection satisfy the condition, take the last one and add 1:
result = result == default ? myIDs.Last() + 1 : result;
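For comparison, the SQL self-join from the question can be mirrored quite directly in LINQ to Objects: keep every id whose successor is absent, then take the smallest successor. The sketch below is an in-memory version only; the class name FreeId, the method FirstFree, and the floor parameter (standing in for the `doc.id > 5000` condition) are assumptions for illustration, and whether an equivalent query translates through the Devart provider is not verified here.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FreeId
{
    // Hypothetical helper mirroring: SELECT MIN(id + 1) ... WHERE successor is missing AND id > floor.
    // Returns null when no id above the floor exists.
    public static int? FirstFree(IEnumerable<int> ids, int floor)
    {
        var present = new HashSet<int>(ids);
        return present
            .Where(id => id > floor && !present.Contains(id + 1)) // ids with no successor
            .Select(id => (int?)(id + 1))                         // candidate free ids
            .Min();                                               // smallest one, or null if none
    }
}
```

With the ids from the question's table (5101, 5102, 5103, 5105, 5106, 5108) and a floor of 5000, the candidates are 5104, 5107 and 5109, so the result is 5104. Like the SQL, a table without holes yields the id after the largest one.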

Sort list alphabetically by all columns in C# using linq (if possible)

I have a list of cities with a city name and a city country, and I want to use LINQ to sort by all the columns, matching a filter.
For example:
CITY_NAME | CITY COUNTRY
-------------+---------------
Buenos Aires | Argentina
Asuncion | Paraguay
Sydney | Australia
Abadeh | Iran
Acero | Bolivia
I want a list of the entries where any column .StartsWith the letter "A", with the CITY_COUNTRY matches first and then the CITY_NAME matches, each in alphabetical order.
The result should be:
Buenos Aires | Argentina
Sydney | Australia
Abadeh | Iran
Acero | Bolivia
Asuncion | Paraguay
This:
.OrderBy(city => city.CITY_COUNTRY).ThenBy(city => city.CITY_NAME)
doesn't work, since it orders by country first and then by name, so I get a result like:
Buenos Aires | Argentina
Sydney | Australia
Acero | Bolivia
Abadeh | Iran
Asuncion | Paraguay
which is wrong, since Abadeh | Iran matches better than Acero | Bolivia.
I tried to be as clear as I could.
Thanks
I switched from lambda syntax to query comprehension to make it easier to cache the StartsWith results:
var ans = from city in Cities
          let countrysw = city.CITY_COUNTRY.StartsWith("A")
          let citysw = city.CITY_NAME.StartsWith("A")
          where countrysw || citysw
          orderby citysw, (countrysw ? city.CITY_COUNTRY : city.CITY_NAME)
          select city;
Basically you test the country and city for starts with matches, and sort those matches by country match first (false sorts before true) then by the matching name.
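To try that query against the sample data, here is a self-contained sketch. The City class, the CitySorter class, and the MatchAndSort method are assumptions for illustration; the property names follow the question's CITY_NAME and CITY_COUNTRY, and an ordinal StartsWith is used so the prefix test does not depend on the current culture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class City
{
    public string CITY_NAME { get; set; }
    public string CITY_COUNTRY { get; set; }
}

public static class CitySorter
{
    // Hypothetical helper: keeps rows where either column starts with the prefix,
    // country-only matches first, each group ordered by the matching value.
    public static List<City> MatchAndSort(IEnumerable<City> cities, string prefix)
    {
        var ans = from city in cities
                  let countrysw = city.CITY_COUNTRY.StartsWith(prefix, StringComparison.Ordinal)
                  let citysw = city.CITY_NAME.StartsWith(prefix, StringComparison.Ordinal)
                  where countrysw || citysw
                  // false sorts before true, so rows whose city does not match come first
                  orderby citysw, (countrysw ? city.CITY_COUNTRY : city.CITY_NAME)
                  select city;
        return ans.ToList();
    }
}
```

For the five rows in the question and the prefix "A", this should give Buenos Aires, Sydney, Abadeh, Acero, Asuncion, which is the order the question asks for.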
The ThenBy only makes a difference if you have two cities in the same country, since it sorts the countries, then sorts the cities within those countries.
For example, if you also had Perth, Australia in your list, then you would end up with:
Buenos Aires | Argentina
Perth | Australia
Sydney | Australia
Acero | Bolivia
Abadeh | Iran
Asuncion | Paraguay
If you want to sort by whichever of the city or country values comes first, then you could try something like this:
.OrderBy(city => string.Compare(city.CITY_COUNTRY, city.CITY_NAME) < 0 ? city.CITY_COUNTRY : city.CITY_NAME)
I think that will give you the list you expect.
I suspect you need to project a new column that contains either City or Country (depending on whether it matches the "A" prefix or not) and then order by that new column. One possible approach would be something like:
using System;
using System.Collections.Generic;
using System.Linq;
namespace TestConsole
{
    public class Program
    {
        public class CountryAndCity
        {
            public string Country { get; set; }
            public string City { get; set; }
        }
        static void Main(string[] args)
        {
            var cities = new List<CountryAndCity>
            {
                new CountryAndCity() {Country = "Australia", City = "Sydney"},
                new CountryAndCity() {Country = "Argentina", City = "Buenos Aires"},
                new CountryAndCity() {Country = "Paraguay", City = "Asuncion"},
                new CountryAndCity() {Country = "Iran", City = "Abadeh"}
            };
            // The important bit starts here
            var results = cities
                .Where(z => z.Country.StartsWith("A") || z.City.StartsWith("A")) // this line is optional (only needed if you want to remove those that don't start with A)
                .Select(z =>
                    new
                    {
                        OriginalData = z,
                        // the matching value, or a placeholder that sorts last when nothing matches
                        Match = z.Country.StartsWith("A") ? z.Country : z.City.StartsWith("A") ? z.City : "ZZZZZZ"
                    })
                .OrderBy(z => z.Match)
                .Select(z => z.OriginalData);
            // The important bit ends here
            Console.WriteLine(string.Join("\r\n", results.Select(z => $"{z.Country}-{z.City}")));
            Console.ReadLine();
        }
    }
}

How to find the distinct values of a column and find the sum of those same values of another column?

In my C# WinForms application, I have a DataTable keeping track of fruit sales, like this:
/*TableA*/
Name | QuantitySold
Apple | 5
Orange | 10
Apple | 3
Grape | 2
Banana | 6
Orange | 7
Apple | 2
Grape | 2
Now I want to filter them by the same fruit names AND get the sums of each of those fruits sold at the same time, creating a new resultant DataTable, which should look like
/*TableB*/
Name | TotalSold
Apple | 10
Orange | 17
Grape | 4
Banana | 6
How could I achieve this?
I have found the count of distinct fruit names by
int distinctCt = TableA.AsEnumerable()
.Select(row=>row.Field<string>("Name"))
.Distinct().Count();
But I realized this won't go anywhere from here.
Can someone please give me an idea on how to do this?
Use GroupBy and Sum:
var nameGroups = TableA.AsEnumerable().GroupBy(r => r.Field<string>("Name"));
var TableB = TableA.Clone();
TableB.Columns["QuantitySold"].ColumnName = "TotalSold";
foreach (var g in nameGroups)
{
    TableB.Rows.Add(g.Key, g.Sum(r => r.Field<int>("QuantitySold")));
}
You need to use GroupBy, try this:
var x = TableA.AsEnumerable()
    .GroupBy(r => r.Field<string>("Name"))
    .Select(sm => new
    {
        Name = sm.First().Field<string>("Name"),
        QuantitySold = sm.Sum(q => q.Field<int>("QuantitySold"))
    });
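The projection above produces an anonymous-type sequence rather than the DataTable the question asks for. One way to finish the job is sketched below; the class name FruitTotals and the method Summarize are invented for this example, and it assumes QuantitySold is stored as an int column.

```csharp
using System.Data;
using System.Linq;

public static class FruitTotals
{
    // Hypothetical helper: builds TableB (Name, TotalSold) from TableA (Name, QuantitySold).
    public static DataTable Summarize(DataTable tableA)
    {
        var tableB = new DataTable();
        tableB.Columns.Add("Name", typeof(string));
        tableB.Columns.Add("TotalSold", typeof(int));

        var totals = tableA.AsEnumerable()
            .GroupBy(r => r.Field<string>("Name"))
            .Select(g => new { Name = g.Key, TotalSold = g.Sum(r => r.Field<int>("QuantitySold")) });

        // GroupBy yields groups in order of first appearance, so rows keep TableA's fruit order.
        foreach (var t in totals)
            tableB.Rows.Add(t.Name, t.TotalSold);

        return tableB;
    }
}
```

For the TableA in the question this should give Apple 10, Orange 17, Grape 4, Banana 6.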

how to get contain count based on sharepoint list with linq?

i have a sharepoint list, lets call it students.
name | surname | username
------|---------|----------
test | test1 | test11
test2 | test2 | test22
test3 | test3 | test33
i keep the student names based on manager, adding to sharepoint list with;
String.Join(",", ListBox2.Items.Cast<ListItem>().Select(i => i.Text).ToArray());
and i have another list lets call it manages
manager | students
--------|---------------
man1 | test11,test22
man2 | test33,test11
so what i need is each student's manager count, in counter table;
studentuName | count
-----------------|---------
test11 | 2
test22 | 1
test33 | 1
i call them as a list (there will be much more better ways for calling them, i'm just giving example)
List<string> students (has value "test11", "test22", "test33")
List<string> manages (has value "test11,test22" , "test33,test11")
so how can i get that, how many manager each student have , with linq ?
thank you
Edit
With @Servy's answer i can get
List<string> managers = new List<string> { "a,b", "a,b,c,d", "a,c", "c,d,f", "a,f,c,b" };
var query = managers.SelectMany(manager => manager.Select(student => new { manager, student }));
var finalQuery = query.GroupBy(pair => pair.student).Select(group => new { Student = group.Key, Count = group.Count() });
it also returns the count of the comma "," characters, is there any way to avoid that?
and is there any way to merge them into a single query?
First we'll transform the manager list from a single valued manager with a multi-valued list of students to where each "row" has a single manager and a single student. We'll do this by creating additional "rows" for each student in that value of the list for managers.
var query = managers.SelectMany(manager =>
    manager.students.Select(student => new { manager, student }));
Now we can just group these items by student and count the size of the group:
var finalQuery = query.GroupBy(pair => pair.student)
    .Select(group => new { Student = group.Key, Count = group.Count() });
(You can combine those into one query.)
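When the students are stored as one comma-separated string per manager, as in the question's manages list, calling Select directly on the string enumerates its characters, which is why the comma gets counted. A sketch of a single combined query that splits each string first is below. The class name ManagerCounts and the method CountPerStudent are assumptions for illustration, and it assumes a student is not listed twice for the same manager.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ManagerCounts
{
    // Hypothetical helper: each input string is one manager's comma-separated student list.
    // Returns how many managers each student username appears under.
    public static Dictionary<string, int> CountPerStudent(IEnumerable<string> manages)
    {
        return manages
            .SelectMany(list => list.Split(',').Select(s => s.Trim())) // one entry per student
            .Where(s => s.Length > 0)                                  // ignore empty fragments
            .GroupBy(student => student)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For "test11,test22" and "test33,test11" this should give test11 = 2, test22 = 1, test33 = 1.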
Try this. I think it is what you are trying to accomplish:
List<string> Students = new List<string>() { "Test1", "Test2" };
List<string> Manager = new List<string>(){"Test1","Test1","Test3"};
var counter = Manager.Count(m => m == Students[0]);
Console.WriteLine(counter);
Console.ReadLine();
This will allow you to create a loop that goes through each student in the list and gets the count of associated managers.

Search on all fields of an entity

I'm trying to implement an "omnibox"-type search over a customer database where a single query should attempt to match any properties of a customer.
Here's some sample data to illustrate what I'm trying to achieve:
FirstName | LastName | PhoneNumber | ZipCode | ...
--------------------------------------------------
Mary | Jane | 12345 | 98765 | ...
Jane | Fonda | 54321 | 66666 | ...
Billy | Kid | 23455 | 12345 | ...
If the query was "Jane", I'd expect row #1 to be returned as well as row #2.
A query for 12345 would yield rows #1 and #3.
Right now, my code looks pretty much like this:
IEnumerable<Customer> searchResult = context.Customer.Where(
c => c.FirstName == query ||
c.LastName == query ||
c.PhoneNumber == query ||
c.ZipCode == query
// and so forth. Fugly, huh?
);
This obviously works. It smells like really bad practice to me, though, since any change in the Entity (removal of properties, introduction of new properties) would break stuff.
So: is there some LINQ-foo that will search across all properties of whatever Entity I throw at it?
First, find all properties of the Customer class that have the same type as query:
var stringProperties = typeof(Customer).GetProperties().Where(prop =>
    prop.PropertyType == query.GetType());
Then find all customers in the context that have at least one property whose value equals query. Reflection calls can't be translated to SQL, so the filter has to run in memory, and Equals is needed because == on an object compares references:
context.Customer.AsEnumerable().Where(customer =>
    stringProperties.Any(prop =>
        query.Equals(prop.GetValue(customer, null))));
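A self-contained, in-memory sketch of the same reflection idea follows. The Customer class here is a stand-in with only the four string properties shown in the question, and the OmniSearch class and Search method are invented for this example. It compares values with string.Equals and runs entirely on objects already loaded, so on a large table it would pull every row into memory first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string PhoneNumber { get; set; }
    public string ZipCode { get; set; }
}

public static class OmniSearch
{
    // Hypothetical helper: returns the items with at least one public string
    // property whose value equals the query exactly.
    public static List<T> Search<T>(IEnumerable<T> items, string query)
    {
        PropertyInfo[] stringProps = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.PropertyType == typeof(string)
                        && p.CanRead
                        && p.GetIndexParameters().Length == 0) // skip indexers
            .ToArray();

        return items
            .Where(item => stringProps.Any(p =>
                string.Equals((string)p.GetValue(item, null), query, StringComparison.Ordinal)))
            .ToList();
    }
}
```

Because the property list is computed from the type, adding or removing string properties on the entity needs no change to the search code, which is the concern raised in the question.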
