Select list elements contained in another list in linq


I have a string with "|" separators:
string s = "item1|item2|item3|item4";
and a list of objects that each have a name and a value:
//object
List<ItemObject> itemList = new List<ItemObject>();
itemList.Add(new ItemObject{Name="item0",Value=0});
itemList.Add(new ItemObject{Name="item1",Value=1});
//class
public class ItemObject {
public string Name {get;set;}
public int Value {get;set;}
}
How could the following code be done in one line in linq?
var newList = new List<object>();
foreach (var item in s.Split("|"))
{
newList.Add(itemList.FirstOrDefault(x => x.Name == item));
}
// Result: newList
// {Name="item1",Value=1}

I would suggest splitting the string first, so that it isn't split again on every iteration:
List<ItemObject> newList = s
.Split("|")
.SelectMany(x => itemList.Where(i => i.Name == x))
.ToList();
Or even better:
List<ItemObject> newList = s
.Split("|") // a second argument can also be passed: StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries
.Distinct() // remove possible duplicates; a comparer can also be specified, e.g. StringComparer.CurrentCulture
.SelectMany(x => itemList
.Where(i => string.Equals(i.Name, x))) // string.Equals is preferable; a comparison can be passed as the third argument, e.g. StringComparison.CurrentCulture
.ToList();

Try this:
var newList = itemList.Where(item => s.Split('|').Contains(item.Name));
This solution also avoids populating newList with nulls for items that are not present. You may also want to consider a stricter string equality check.
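One way to read "a stricter string equality check" is to split the string once into a HashSet<string> with an explicit comparer. The following is only a sketch under that assumption: it uses top-level statements (C# 9 or later) and a tuple list as a stand-in for the question's ItemObject list, and the "ITEM2" entry is made up to show the effect of the comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string s = "item1|item2|item3|item4";

// Stand-in for the question's itemList, using tuples instead of ItemObject.
var itemList = new List<(string Name, int Value)>
{
    ("item0", 0),
    ("item1", 1),
    ("ITEM2", 2), // differs from "item2" only by case
};

// Split once, up front; the comparer decides how strict the name match is.
var wanted = new HashSet<string>(s.Split('|'), StringComparer.Ordinal);

var newList = itemList.Where(item => wanted.Contains(item.Name)).ToList();

Console.WriteLine(string.Join(", ", newList.Select(i => i.Name))); // item1
```

Swapping in StringComparer.OrdinalIgnoreCase would make "ITEM2" match as well; which comparer is right depends on the data.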

string s = "item1|item2|item3|item4";
I don't see a need for splitting this string s. So you could simply do
var newList = itemList.Where(i => s.Contains(i.Name));
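To avoid false matches on partial names (for example, item1 matching inside item10), you can wrap both the string and the name in delimiters: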
s = "|" + s + "|";
var newList = itemList.Where(o => s.Contains("|" + o.Name + '|')).ToList();
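The difference between the two variants in this answer shows up as soon as one name is a prefix of another. The sketch below uses made-up input ("item10|item2") and a plain list of names rather than the question's objects, purely to illustrate the behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string s = "item10|item2";
var names = new List<string> { "item1", "item2", "item10" };

// Plain substring search: "item1" is found inside "item10".
var loose = names.Where(n => s.Contains(n)).ToList();

// Delimiter-wrapped search: only whole entries match.
string wrapped = "|" + s + "|";
var strict = names.Where(n => wrapped.Contains("|" + n + "|")).ToList();

Console.WriteLine(string.Join(", ", loose));  // item1, item2, item10
Console.WriteLine(string.Join(", ", strict)); // item2, item10
```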

List<object> newList = itemList.Where(item => s.Split("|").Contains(item.Name)).ToList<object>();

Related

How to avoid two embedded cycles in linq query C#

var listOfIds = new List<string>();
IEnumerable<Info> allItems = Enumerable.Empty<Info>(); // placeholder for the real source
foreach (var id in collectionIds)
{
listOfIds.AddRange(allItems
.Where(p => p.Data.FirstOrDefault(m => m.Key == "myId").Value == id)
.Select(x => x.Id));
}
I would like to avoid AddRange and use only Add in this case, and perhaps use FirstOrDefault in place of Where so that the final Select is not needed.
Is this possible and if yes how?

Assuming your original code gives you the correct data, specifically that you are OK with:
Only checking whether the first item in p.Data with the key "myId" contains a matching value; and
p.Data always containing at least one such element.
Then this code will give you the same output:
var listOfIds = allItems
.Where(p => collectionIds.Contains(p.Data.First(m => m.Key == "myId").Value))
.Select(p => p.Id) // project to the ids, as the original loop did
.ToList();
However, if you really do care that any value in p.Data matches, then this would be more appropriate:
var listOfIds = allItems
.Where(p => p.Data.Any(m => m.Key == "myId" &&
collectionIds.Contains(m.Value)))
.Select(p => p.Id) // project to the ids, as the original loop did
.ToList();
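To see how "first match" and "any match" can differ, here is a small sketch. The shape of Info is not given in the question, so the tuple type, the sample data, and the projection to Id are all assumptions made for illustration (top-level statements, C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var collectionIds = new HashSet<string> { "a", "b" };

// Hypothetical stand-in for Info: an Id plus an array of key/value pairs.
var allItems = new List<(string Id, (string Key, string Value)[] Data)>
{
    ("1", new[] { ("myId", "a") }),
    ("2", new[] { ("myId", "x"), ("myId", "b") }),
    ("3", new[] { ("other", "a"), ("myId", "z") }),
};

// First-match semantics: only the first "myId" entry of each item is checked.
var firstOnly = allItems
    .Where(p => collectionIds.Contains(p.Data.First(m => m.Key == "myId").Value))
    .Select(p => p.Id)
    .ToList();

// Any-match semantics: every "myId" entry of each item is checked.
var anyMatch = allItems
    .Where(p => p.Data.Any(m => m.Key == "myId" && collectionIds.Contains(m.Value)))
    .Select(p => p.Id)
    .ToList();

Console.WriteLine(string.Join(", ", firstOnly)); // 1
Console.WriteLine(string.Join(", ", anyMatch));  // 1, 2
```

Item "2" is only picked up by the second query because its matching value sits behind a non-matching "myId" entry.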

How about this approach:
var listOfIds = new List<string>();
IEnumerable<Info> allItems = Enumerable.Empty<Info>(); // placeholder for the real source
var groupedAllItems = allItems.GroupBy(x => x.Data.FirstOrDefault(m => m.Key == "myId")?.Value ?? "MyIdNotFound");
//collectionIds should be of type HashSet<string> for the contains to be fast
listOfIds.AddRange(groupedAllItems.Where(x => collectionIds.Contains(x.Key)).SelectMany(x => x).Select(x => x.Id));

Using C# Lambda to split string and search value

I have a string with the following value:
0:12211,90:33221,23:09011
In each pair, the first value (before the : (colon)) is an employee id, the second value after is a payroll id.
So if I want to get the payroll id for employee id 23, right now I have to do:
var arrayValues = mystring.Split(',');
and then for each of the arrayValues do the same:
var employeeData = arrayValue.Split(':');
That way I will get the key and the value.
Is there a way to get the Payroll ID by a given employee id using lambda?
If the employee id is not in the string, then by default it should return the payroll id for employee id 0 (zero).

Using a Linq pipeline and anonymous objects:
"0:12211,90:33221,23:09011"
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
.Where(x=> x.employeeId == "23")
Results in this:
{
employeeId = "23",
payrollId = "09011"
}
These three lines represent your data processing and projection logic:
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
You can then add any filtering logic with Where after the second Select.

You can try something like this:
"0:12211,90:33221,23:09011"
.Split(new char[] { ',' })
.Select(c => {
var pair = c.Split(new char[] { ':' });
return new KeyValuePair<string, string>(pair[0], pair[1]);
})
.ToList();
Be aware that you still have to validate the data (for example, that every pair actually contains a colon).
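What "validation" means here depends on the data, but as a rough sketch one could drop malformed pairs before building a lookup and fall back to employee id 0 as the question asks. The malformed entries in the input below ("bad", ":5", and the trailing comma) and the helper name Lookup are made up for illustration.

```csharp
using System;
using System.Linq;

string input = "0:12211,90:33221,bad,23:09011,:5,";

var pairs = input
    .Split(',', StringSplitOptions.RemoveEmptyEntries)
    .Select(part => part.Split(':'))
    // Keep only well-formed "key:value" pairs with a non-empty key.
    .Where(kv => kv.Length == 2 && kv[0].Length > 0)
    .ToDictionary(kv => kv[0], kv => kv[1]);

// Fall back to employee id "0" when the requested id is missing.
string Lookup(string employeeId) =>
    pairs.TryGetValue(employeeId, out var payrollId) ? payrollId : pairs["0"];

Console.WriteLine(Lookup("23")); // 09011
Console.WriteLine(Lookup("42")); // 12211
```

Note that ToDictionary throws on duplicate keys and that the fallback itself assumes an entry for "0" exists, so real input may need further checks.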

If I were you, I'd use a dictionary. Especially if you're going to do more than one lookup.
Dictionary<int, int> employeeIDToPayrollID = "0:12211,90:33221,23:09011"
.Split(',') //Split on comma into ["0:12211", "90:33221", "23:09011"]
.Select(x => x.Split(':')) //Split each string on colon into [ ["0", "12211"]... ]
.ToDictionary(x => int.Parse(x[0]), x => int.Parse(x[1]));
Now you just have to write employeeIDToPayrollID[0] to get 12211 back. Note that int.Parse will throw an exception if your IDs aren't integers, and that parsing drops leading zeros (so "09011" becomes 9011). You can remove those calls if you want a Dictionary<string, string> instead.

You can use string.Split along with string.Substring.
var result =
str.Split(',')
.Where(s => s.Substring(0,s.IndexOf(":",StringComparison.Ordinal)) == "23")
.Select(s => s.Substring(s.IndexOf(":",StringComparison.Ordinal) + 1))
.FirstOrDefault();
If this logic will be used more than once, I'd put it in a method:
public string GetPayrollIdByEmployeeId(string source, string employeeId){
return source.Split(',')
.Where(s => s.Substring(0, s.IndexOf(":", StringComparison.Ordinal)) == employeeId)
.Select(s => s.Substring(s.IndexOf(":", StringComparison.Ordinal) + 1))
.FirstOrDefault();
}

Assuming you have more than three pairs in the string (how long is that string, anyway?) you can convert it to a Dictionary and use that going forward.
First, split on the comma and then on the colon and put in a Dictionary:
var empInfo = src.Split(',').Select(p => p.Split(':')).ToDictionary(pa => pa[0], pa => pa[1]);
Now, you can write a function to lookup payroll IDs from employee IDs:
string LookupPayrollID(Dictionary<string, string> empInfo, string empID) => empInfo.TryGetValue(empID, out var prID) ? prID : empInfo["0"];
And you can call it to get the answer:
var emp23prid = LookupPayrollID(empInfo, "23");
var emp32prid = LookupPayrollID(empInfo, "32");
If you just have three employees in the string, creating a Dictionary is probably overkill and a simpler answer may be appropriate, such as searching the string.

LINQ Query to find string of multidimensional array with most duplicates

I have written a function that fills a jagged array (FileCheck[][]) with the results of matching file names against multiple regex strings.
FileCheck[0] // This string[] contains all the filenames
FileCheck[1] // This string[] is "0" or "1" depending on whether a Regex match was found.
FileCheck[2] // This string[] contains the index of the first matching Regex.
foreach (string File in InputFolder)
{
int j = 0;
FileCheck[0][k] = Path.GetFileName(File);
Console.WriteLine(FileCheck[0][k]);
foreach (Regex Filemask in Filemasks)
{
if (string.IsNullOrEmpty(FileCheck[1][k]) || FileCheck[1][k] == "0")
{
if (Filemask.IsMatch(FileCheck[0][k]))
{
FileCheck[1][k] = "1";
FileCheck[2][k] = j.ToString(); // This is the Index of the Regex thats Valid
}
else
{
FileCheck[1][k] = "0";
}
j++;
}
Console.WriteLine(FileCheck[1][k]);
}
k++;
}
Console.ReadLine();
// I need the Index of the Regex with the most valid hits
I'm trying to write a function that gives me the string of the RegexIndex that has the most duplicates.
This is what I tried, but it did not work :( (I only get the count of the string with the most duplicates, not the string itself)
// I need the Index of the Regex with the most valid hits
var LINQ = Enumerable.Range(0, FileCheck[0].GetLength(0))
.Where(x => FileCheck[1][x] == "1")
.GroupBy(x => FileCheck[2][x])
.OrderByDescending(x => x.Count())
.First().ToList();
Console.WriteLine(LINQ[1]);
Example Data
string[][] FileCheck = new string[3][];
FileCheck[0] = new string[]{ "1.csv", "TestValid1.txt", "TestValid2.txt", "2.xml", "TestAlsoValid.xml", "TestValid3.txt"};
FileCheck[1] = new string[]{ "0","1","1","0","1","1"};
FileCheck[2] = new string[]{ null, "3", "3", null,"1","2"};
In this example I need as result of the Linq query:
string result = "3";

With your current code, substituting 'ToList()' with 'Key' would do the trick.
var LINQ = Enumerable.Range(0, FileCheck[0].GetLength(0))
.Where(x => FileCheck[1][x] == "1")
.GroupBy(x => FileCheck[2][x])
.OrderByDescending(x => x.Count())
.First().Key;
Since the index is null for values that are not found, you could also filter out null values and skip looking at the FileCheck[1] array. For example:
var maxOccurringIndex = FileCheck[2].Where(ind => ind != null)
.GroupBy(ind=>ind)
.OrderByDescending(x => x.Count())
.First().Key;
However, just a suggestion, you can use classes instead of a nested array, e.g.:
class FileCheckInfo
{
public string File{get;set;}
public bool Match => Index.HasValue;
public int? Index{get;set;}
public override string ToString() => $"{File} [{(Match ? Index.ToString() : "no match")}]";
}
Assuming InputFolder is an enumerable of string and Filemasks an enumerable of 'Regex', an array can be filled with:
FileCheckInfo[] FileCheck = InputFolder.Select(f=>
new FileCheckInfo{
File = f,
Index = Filemasks.Select((rx,ind) => new {ind, IsMatch = rx.IsMatch(f)}).FirstOrDefault(r=>r.IsMatch)?.ind
}).ToArray();
Getting the max occurring would be much the same:
var maxOccurringIndex = FileCheck.Where(f=>f.Match).GroupBy(f=>f.Index).OrderByDescending(gr=>gr.Count()).First().Key;
Edit: PS, the above all assumes you need to reuse the results. If you only have to find the maximum occurrence, you're much better off with an approach such as the one Martin suggested!
If the goal is only to get the max occurrence, you can use:
var maxOccurringIndex = Filemasks.Select((rx,ind) => new {ind, Count = InputFolder.Count(f=>rx.IsMatch(f))})
.OrderByDescending(m=>m.Count).FirstOrDefault()?.ind;

Your question and code seem very convoluted. I am guessing that you have a list of file names and another list of file masks (regular expressions) and you want to find the file mask that matches the most file names. Here is a way to do that:
var fileNames = new[] { "1.csv", "TestValid1.txt", "TestValid2.txt", "2.xml", "TestAlsoValid.xml", "TestValid3.txt" };
var fileMasks = new[] { @"\.txt$", @"\.xml$", "valid" };
var fileMaskWithMostMatches = fileMasks
.Select(
fileMask => new {
FileMask = fileMask,
FileNamesMatched = fileNames.Count(
fileName => Regex.Match(
fileName,
fileMask,
RegexOptions.IgnoreCase | RegexOptions.CultureInvariant
)
.Success
)
}
)
.OrderByDescending(x => x.FileNamesMatched)
.First()
.FileMask;
With the sample data, the value of fileMaskWithMostMatches is "valid".
Note that the Regex class does some caching of regular expressions, but if you have many regular expressions it will be more efficient to create them outside the implied fileNames.Count for-each loop, to avoid recreating the same regular expression again and again (creating a regular expression may take a non-trivial amount of time depending on its complexity).
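A minimal sketch of that suggestion, assuming the same sample data as above: each Regex is constructed once up front and then reused for every file name. The variable names regexes and best are only illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var fileNames = new[] { "1.csv", "TestValid1.txt", "TestValid2.txt", "2.xml", "TestAlsoValid.xml", "TestValid3.txt" };
var fileMasks = new[] { @"\.txt$", @"\.xml$", "valid" };

// Build each Regex once, before any file name is tested against it.
var regexes = fileMasks
    .Select(mask => new Regex(mask, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant))
    .ToList();

// Count matches per mask and pick the mask with the highest count.
var best = regexes
    .Select(rx => new { Mask = rx.ToString(), Count = fileNames.Count(name => rx.IsMatch(name)) })
    .OrderByDescending(x => x.Count)
    .First();

Console.WriteLine($"{best.Mask}: {best.Count}"); // valid: 4
```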

As an alternative to Martin's answer, here's a simpler version of your existing LINQ query that gives the desired result:
var LINQ = FileCheck[2]
.ToLookup(x => x) // Makes a lookup table
.OrderByDescending(x => x.Count()) // Sorts by count, descending
.Select(x => x.Key) // Extract the key
.FirstOrDefault(x => x != null); // Return the first non null key
// or null if none found.

Isn't this much easier?
string result = FileCheck[2]
.Where(x => x != null)
.GroupBy(x => x)
.OrderByDescending(x => x.Count())
.FirstOrDefault()?.Key; // ?. avoids a NullReferenceException when there are no non-null entries

LINQ search through a list of string arrays for a particular string

I have a list of string arrays:
List<String[]> listOfStringArrays = something;
I need to select all objects from a collection that have a value which is equal to the string at the 0th index of any string array in the list.
For example, if I just had a simple list of strings, declared as:
List<String> listOfStrings = something;
I would just do:
var query = someCollection.Where(x => listOfStrings.Contains(x.id_num))
But obviously it's not as simple with a list of string arrays.
I know that I can easily just iterate through the list of string arrays and create a simple list of strings with the 0th value, like this:
List<String[]> listOfStringArrays = something;
List<String> listOfValues = new List<String>();
foreach (string[] s in listOfStringArrays)
listOfValues.Add(s[0]);
var query = someCollection.Where(x => listOfValues.Contains(x.id_num));
But would really like to avoid this and am trying to write it as a one liner without introducing extra lists and loops.

You can put it all into one query:
someCollection.Where(x => listOfStringArrays.Select(y => y[0]).Contains(x.id_num));
But it will iterate over listOfStringArrays over and over again.
I would rather go with a HashSet<string> to make it faster:
var set = new HashSet<string>(listOfStringArrays.Select(y => y[0]));
someCollection.Where(x => set.Contains(x.id_num));

Try the following
var query = someCollection.Where(s => listOfStringArrays.Any(a => a[0] == s.id_num));

var firsts = listOfStringArrays.Select(x => x[0]);
var query = someCollection.Where(x => firsts.Contains(x.id_num));
This will project each array to its first element, and then match from there.
As a one-liner:
var query = someCollection.Where(x => listOfStringArrays.Select(y => y[0]).Contains(x.id_num));

It should be simply:
List<String[]> newListOfStrings = listOfStringArrays.Where(x => x[0].Contains(identifier)).ToList();
The final ToList() is needed here because the variable is declared as List<String[]> rather than var.

Modifying an IEnumerable type

I have an IEnumerable of strings that I get from the code below. The variable groups is an enumerable that holds some string values. Say there are 4 values in groups and the value in the second position is just an empty string "". How can I move it to the 4th position, i.e. the end? I do not want to sort or otherwise change the order; I just want to move the empty "" value, wherever it occurs, to the last position.
List<Item> Items = somefunction();
var groups = Items.Select(g => g.Category).Distinct();

Simply order the results by their string value:
List<Item> Items = somefunction();
var groups = Items.Select(g => g.Category).Distinct().OrderByDescending(s => s);
Edit (following OP edit):
List<Item> Items = somefunction();
var groups = Items.Select(g => g.Category).Distinct();
groups = groups.Where(s => !String.IsNullOrEmpty(s))
.Concat(groups.Where(s => String.IsNullOrEmpty(s)));

You can't directly modify the IEnumerable<> instance, but you can create a new one:
var list = groups.Where(x => x != "").Concat(groups.Where(x => x == ""));
Note that in this query, groups is iterated twice. This is usually not a good practice for a deferred IEnumerable<>, so you should call ToList() after the Distinct() to eagerly evaluate your LINQ query:
var groups = Items.Select(g => g.Category).Distinct().ToList();
EDIT:
On second thought, there's a much easier way to do this:
var groups = Items.Select(g => g.Category).Distinct().OrderBy(x => x == "");
Note that this doesn't touch the order of the non-empty elements since OrderBy is stable.
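A quick way to convince yourself of this is to run the query on a small in-memory array. The sample values below are made up, and the sketch relies on the LINQ to Objects behaviour of OrderBy and Distinct.

```csharp
using System;
using System.Linq;

var categories = new[] { "b", "", "a", "c", "" };

// false sorts before true, so non-empty strings keep their relative order
// and the empty string moves to the end.
var groups = categories.Distinct().OrderBy(x => x == "").ToList();

Console.WriteLine(string.Join(",", groups.Select(g => g == "" ? "(empty)" : g)));
// b,a,c,(empty)
```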

var groups = Items.Select(g => g.Category).Distinct().OrderByDescending(s =>s);

I don't like my query but it should do the job. It selects all items which are not empty and unions it with the items which are empty.
var groups = Items.Select(g => g.Category).Distinct()
.Where(s => !string.IsNullOrEmpty(s))
.Union(Items.Select(g => g.Category).Distinct()
.Where(s => string.IsNullOrEmpty(s)));

Try something like
var temp = groups.Where(item => !String.IsNullOrEmpty(item)).ToList();
while (temp.Count < groups.Count()) temp.Add("");
