How to split an array into chunks by a rule? - c#

I can split an array into smaller chunks.
public class Item{
public string Name {get; set;}
public bool IsUnique {get;set;}
}
public static void Main()
{
Random r = new Random();
var source = new[] {
new Item { Name = "Item-1", IsUnique = true},
new Item { Name = "Item-2", IsUnique = true},
new Item { Name = "Item-3", IsUnique = true},
new Item { Name = "Item-4"},
new Item { Name = "Item-5"},
new Item { Name = "Item-6"},
new Item { Name = "Item-7"},
new Item { Name = "Item-8"},
new Item { Name = "Item-9"}
};
var chunkSize = 3;
var result = source
.OrderBy(a => r.Next())
.Select((x, i) => new { Index = i, Item = x })
.GroupBy(s => s.Index / chunkSize)
.Select(g => g.ToList())
.ToList();
foreach(var item in result)
{
Console.WriteLine("Chunk: "+ (result.IndexOf(item)+1));
Console.WriteLine("-----------------------------------");
foreach(var x in item)
{
Console.WriteLine(x.Item.Name);
}
Console.WriteLine();
}
}
The result is like this:
Chunk: 1
-----------------------------------
Item-2
Item-3
Item-8
Chunk: 2
-----------------------------------
Item-5
Item-9
Item-7
Chunk: 3
-----------------------------------
Item-6
Item-4
Item-1
But if the IsUnique property of an item is true, it must not share a chunk with another unique item. In the example above, Chunk 1 contains both Item-2 and Item-3.
Can I do this using LINQ?
UPDATE:
If the chunk size is 3, only 3 items may have IsUnique = true.

Split your source array into two groups: the items that are unique and the rest. Then iterate over each element of the unique collection and take chunkSize - 1 items from the non-unique collection. Take a look at this code:
var unique = source.Where(x => x.IsUnique);
var nonUnique = source.Where(x => !x.IsUnique)
.OrderBy(x => r.Next())
.ToList();
var result = unique.Aggregate(
(list: new List<List<Item>>(), items: nonUnique),
(c, n) =>
{
var next = c.items.Take(chunkSize - 1).ToList();
next.Add(n);
c.items.RemoveRange(0, chunkSize - 1);
c.list.Add(next.OrderBy(x => r.Next()).ToList());
return (c.list, c.items);
}).list;
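The same idea can be sketched as a standalone helper. This is only an illustration under a few assumptions: the names UniqueChunker and Split are hypothetical, and the loop takes whatever non-unique items are still available instead of calling RemoveRange, which throws when fewer than chunkSize - 1 items remain. As in the Aggregate version, non-unique items left over after the last unique item are not placed in any chunk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string Name { get; set; }
    public bool IsUnique { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class UniqueChunker
{
    // Builds one chunk per unique item: each chunk holds exactly one unique
    // item plus up to (chunkSize - 1) non-unique items, in random order.
    public static List<List<Item>> Split(IEnumerable<Item> source, int chunkSize, Random random)
    {
        var items = source.ToList();
        var unique = items.Where(x => x.IsUnique).ToList();
        // Shuffle the non-unique items once and consume them front to back.
        var rest = new Queue<Item>(items.Where(x => !x.IsUnique).OrderBy(_ => random.Next()));

        var chunks = new List<List<Item>>();
        foreach (var u in unique)
        {
            var chunk = new List<Item> { u };
            // Take what is available instead of assuming enough items remain.
            while (chunk.Count < chunkSize && rest.Count > 0)
            {
                chunk.Add(rest.Dequeue());
            }
            // Shuffle inside the chunk so the unique item is not always first.
            chunks.Add(chunk.OrderBy(_ => random.Next()).ToList());
        }
        return chunks;
    }
}
```

With the nine sample items from the question, UniqueChunker.Split(source, 3, new Random()) yields three chunks of three items, each containing exactly one unique item.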

Related

Remove duplicate rows from a list based on selected columns?

I have a list of a class, and the class has two columns. I want to remove duplicate rows from that list based on specific columns: remove duplicates by the first column only, by the second column only, or by both. For this I am using the following code. Is there a better way to do this? In the future the class will have 20-25 columns, and I would then have to add 20-25 if statements to this function.
public List<ContactTemp> RemoveDupliacacse(List<ContactTemp> ContactTempList, List<string> objcolumn)
{
List<ContactTemp> ContactTempListRemobdup = new List<ContactTemp>();
if (objcolumn.Contains("CITY"))
{
ContactTempListRemobdup = ContactTempList.GroupBy(s => s.City).Select(group => group.First()).ToList();
}
if (objcolumn.Contains("STATE"))
{
ContactTempListRemobdup = ContactTempList.GroupBy(s => s.State).Select(group => group.First()).ToList();
}
return ContactTempListRemobdup;
}
I think your class looks like this:
public class ContactTemp
{
public string CITY { get; set; }
public int STATE { get; set; }
}
The list "ContactTempList" will contain duplicates. You want to find and remove the items in this list where both CITY and STATE are duplicated.
The following returns one item for each "type", like a Distinct (so if you have A, A, B, C it will return A, B, C):
List<ContactTemp> noDups = ContactTempList.GroupBy(d => new {d.CITY,d.STATE} )
.Select(d => d.First())
.ToList();
If you want only the elements that don't have a duplicate (so if you have A, A, B, C it will return B, C):
List<ContactTemp> noDups = ContactTempList.GroupBy(d => new {d.CITY,d.STATE} )
.Where(d => d.Count() == 1)
.Select(d => d.First())
.ToList();
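You can achieve it via reflection with the same signature, i.e. for an arbitrary number of columns. Note that GetProperty is case-sensitive by default, so the column names must match the property names exactly: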
public List<ContactTemp> RemoveDupliacacse(List<ContactTemp> ContactTempList,
List<string> objcolumn)
{
var type = typeof(ContactTemp);
foreach (var column in objcolumn)
{
var property = type.GetProperty(column);
ContactTempList = ContactTempList.GroupBy(x => property.GetValue(x))
.Select(x => x.First()).ToList();
}
return ContactTempList;
}
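The question passes column names such as "CITY" while the class in the question exposes City and State, so a plain GetProperty(column) call would return null for them. The sketch below is one way to handle that, assuming a ContactTemp with two string properties; the names ContactDeduper and RemoveDuplicates are hypothetical. It resolves names with BindingFlags.IgnoreCase and fails early on an unknown column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Assumed shape of the class from the question.
public class ContactTemp
{
    public string City { get; set; }
    public string State { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class ContactDeduper
{
    // Removes duplicates column by column, resolving each column name to a
    // public instance property without regard to case.
    public static List<ContactTemp> RemoveDuplicates(List<ContactTemp> contacts, IEnumerable<string> columns)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase;
        IEnumerable<ContactTemp> result = contacts;
        foreach (var column in columns)
        {
            // Fail early instead of hitting a NullReferenceException inside GroupBy.
            var property = typeof(ContactTemp).GetProperty(column, flags)
                ?? throw new ArgumentException($"Unknown column '{column}'.", nameof(columns));
            // Keep the first row for each distinct value of this column.
            result = result.GroupBy(x => property.GetValue(x)).Select(g => g.First()).ToList();
        }
        return result.ToList();
    }
}
```

As with the answer above, the columns are applied one after another, so passing two columns is stricter than de-duplicating on the combination of both.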
How about something like this?
public static List<ContactTemp> RemoveDupliacacse(
List<ContactTemp> ContactTempList,
IEnumerable<Func<ContactTemp, object>> columnSelectors)
{
IEnumerable<ContactTemp> ContactTempListRemobdup = ContactTempList;
foreach(var selector in columnSelectors)
{
ContactTempListRemobdup = ContactTempListRemobdup
.GroupBy(s => selector(s))
.Select(group => group.First());
}
return ContactTempListRemobdup.ToList();
}
You can use it like this:
RemoveDupliacacse(list, new List<Func<ContactTemp, object>> {
(ContactTemp contact) => contact.State, (ContactTemp contact) => contact.City })
As you may already know, when you select multiple columns, the method removes duplicates for each column. Please check the following examples:
var list = new List<ContactTemp> {
new ContactTemp { City = "1", State = "1" },
new ContactTemp { City = "1", State = "2" },
new ContactTemp { City = "2", State = "1" },
new ContactTemp { City = "2", State = "2" }
};
foreach (var contact in RemoveDupliacacse(
list,
new List<Func<ContactTemp, object>> {
(ContactTemp contact) => contact.State,
(ContactTemp contact) => contact.City }))
{
Console.WriteLine($"City:{contact.City}, State:{contact.State}");
}
// This will output:
// City:1, State:1
// If you want to check for duplicates of the combination of the selected columns,
// you can do it like this:
foreach (var contact in RemoveDupliacacse(
list,
new List<Func<ContactTemp, object>> {
(ContactTemp contact) => new { contact.State, contact.City } }))
{
Console.WriteLine($"City:{contact.City}, State:{contact.State}");
}
// This will output:
// City:1, State:1
// City:1, State:2
// City:2, State:1
// City:2, State:2

Best algorithm to determine added and removed items when comparing two collections

I am looking for the best algorithm to compare 2 collections and determine which element got added and which element got removed.
private string GetInvolvementLogging(ICollection<UserInvolvement> newInvolvement, ICollection<UserInvolvement> oldInvolvement)
{
//I defined the new and old dictionaries so you know what useful data is inside UserInvolvement.
//Both are Dictionary<int, int>, because Involvement is just an enum flag (an integer). UserId is also an integer.
var newDict = newInvolvement.ToDictionary(x => x.UserID, x => x.Involvement);
var oldDict = oldInvolvement.ToDictionary(x => x.UserID, x => x.Involvement);
//I Want to compare new to old -> and get 2 dictionaries: added and removed.
var usersAdded = new Dictionary<int, Involvement>();
var usersRemoved = new Dictionary<int, Involvement>();
//What is the best algorithm to accomplish this?
return GetInvolvementLogging(usersAdded, usersRemoved);
}
private string GetInvolvementLogging(Dictionary<int, Involvement> usersAdded, Dictionary<int, Involvement> usersRemoved)
{
//TODO: generate a string based on those dictionaries.
return "Change in userinvolvement: ";
}
Added elements are only in newDict; removed elements are only in oldDict:
var intersection = newDict.Keys.Intersect(oldDict.Keys);
var added = newDict.Keys.Except(intersection);
var removed = oldDict.Keys.Except(intersection);
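As a small illustration of the set operations above: removing the intersection from a key set gives the same result as removing the other key set directly, so the intermediate Intersect call can be dropped. The sketch below assumes int keys as in the question; the names KeyDiff and Compare are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class KeyDiff
{
    // Returns the keys present only in `current` (added) and the keys
    // present only in `previous` (removed). Values are not compared.
    public static (List<int> Added, List<int> Removed) Compare<TValue>(
        IReadOnlyDictionary<int, TValue> current,
        IReadOnlyDictionary<int, TValue> previous)
    {
        var added = current.Keys.Except(previous.Keys).ToList();
        var removed = previous.Keys.Except(current.Keys).ToList();
        return (added, removed);
    }
}
```

For new keys {1, 2, 4} and old keys {2, 3, 4}, this reports key 1 as added and key 3 as removed. Keys present on both sides still need a separate value comparison if changed values matter.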
EDIT
I modified your base function; the dictionaries are not needed.
Example UserInvolvement implementation:
class UserInvolvement
{
public int UserId;
public string Name;
public string OtherInfo;
public override bool Equals(object obj)
{
if (obj == null)
{
return false;
}
UserInvolvement p = obj as UserInvolvement;
if ((System.Object)p == null)
{
return false;
}
return (UserId == p.UserId) && (Name == p.Name) && (OtherInfo == p.OtherInfo);
}
public override string ToString()
{
return $"{UserId} - {Name} - {OtherInfo}";
}
}
And example function:
private static string GetInvolvementLogging(ICollection<UserInvolvement> newInvolvement,
ICollection<UserInvolvement> oldInvolvement)
{
var intersection = newInvolvement.Select(x => x.UserId).Intersect(oldInvolvement.Select(x => x.UserId));
var addedIds = newInvolvement.Select(x => x.UserId).Except(intersection);
var removedIds = oldInvolvement.Select(x => x.UserId).Except(intersection);
List<UserInvolvement> modifiedUI = new List<UserInvolvement>();
foreach (var i in intersection)
{
var ni = newInvolvement.First(a => a.UserId == i);
var oi = oldInvolvement.First(a => a.UserId == i);
if (!ni.Equals(oi))
{
modifiedUI.Add(ni);
}
}
List<UserInvolvement> addedUI = newInvolvement.Where(x => addedIds.Contains(x.UserId)).Select(w => w).ToList();
List<UserInvolvement> removedUI = oldInvolvement.Where(x => removedIds.Contains(x.UserId)).Select(w => w).ToList();
StringBuilder sb = new StringBuilder();
sb.AppendLine("Added");
foreach (var added in addedUI)
{
sb.AppendLine(added.ToString());
}
sb.AppendLine("Removed");
foreach (var removed in removedUI)
{
sb.AppendLine(removed.ToString());
}
sb.AppendLine("Modified");
foreach (var modified in modifiedUI)
{
sb.AppendLine(modified.ToString());
}
return sb.ToString();
}
And my test function:
static void Main(string[] args)
{
List<UserInvolvement> newUI = new List<UserInvolvement>()
{
new UserInvolvement()
{
UserId = 1,
Name = "AAA",
OtherInfo = "QQQ"
},
new UserInvolvement()
{
UserId = 2,
Name = "BBB",
OtherInfo = "123"
},
new UserInvolvement()
{
UserId = 4,
Name = "DDD",
OtherInfo = "123ert"
}
};
List<UserInvolvement> oldUI = new List<UserInvolvement>()
{
new UserInvolvement()
{
UserId = 2,
Name = "BBBC",
OtherInfo = "123"
},
new UserInvolvement()
{
UserId = 3,
Name = "CCC",
OtherInfo = "QQ44"
},
new UserInvolvement()
{
UserId = 4,
Name = "DDD",
OtherInfo = "123ert"
}
};
string resp = GetInvolvementLogging(newUI, oldUI);
Console.WriteLine(resp);
Console.ReadKey();
Console.WriteLine("CU");
}
Result is:
Added
1 - AAA - QQQ
Removed
3 - CCC - QQ44
Modified
2 - BBB - 123
You could try with Linq:
var usersAdded = newDict.Except(oldDict);
var usersRemoved = oldDict.Except(newDict);
If you need dictionaries as a result, you can convert with ToDictionary:
var usersAdded = newDict.Except(oldDict).ToDictionary(x => x.Key, x => x.Value);
var usersRemoved = oldDict.Except(newDict).ToDictionary(x => x.Key, x => x.Value);
I think the best algorithm would be:
foreach (var newItem in newDict)
if (!oldDict.ContainsKey(newItem.Key) || oldDict[newItem.Key]!=newItem.Value)
usersAdded.Add(newItem.Key, newItem.Value);
foreach (var oldItem in oldDict)
if (!newDict.ContainsKey(oldItem.Key) || newDict[oldItem.Key]!=oldItem.Value)
usersRemoved.Add(oldItem.Key, oldItem.Value);
Finally, this is my implementation of GetInvolvementLogging
(the implementation of the string builder method is irrelevant to my question here):
private string GetInvolvementLogging(ICollection<UserInvolvement> newInvolvement, ICollection<UserInvolvement> oldInvolvement)
{
//I defined the new and old dictionary's to focus on the relevant data inside UserInvolvement.
var newDict = newInvolvement.ToDictionary(x => x.UserID, x => (Involvement)x.Involvement);
var oldDict = oldInvolvement.ToDictionary(x => x.UserID, x => (Involvement)x.Involvement);
var intersection = newDict.Keys.Intersect(oldDict.Keys); //These are the id's of the users that were and remain involved.
var usersAdded = newDict.Keys.Except(intersection);
var usersRemoved = oldDict.Keys.Except(intersection);
var addedInvolvement = newDict.Where(x => usersAdded.Contains(x.Key)).ToDictionary(x => x.Key, x => x.Value);
var removedInvolvement = oldDict.Where(x => usersRemoved.Contains(x.Key)).ToDictionary(x => x.Key, x => x.Value);
//Check if the already involved users have a changed involvement.
foreach(var userId in intersection)
{
var newInvolvementFlags = newDict[userId];
var oldInvolvementFlags = oldDict[userId];
if ((int)newInvolvementFlags != (int)oldInvolvementFlags)
{
var xor = newInvolvementFlags ^ oldInvolvementFlags;
var added = newInvolvementFlags & xor;
var removed = oldInvolvementFlags & xor;
if (added != 0)
{
addedInvolvement.Add(userId, added);
}
if (removed != 0)
{
removedInvolvement.Add(userId, removed);
}
}
}
return GetInvolvementLogging(addedInvolvement, removedInvolvement);
}
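The XOR step in the loop above can be checked in isolation. This is a minimal sketch with a made-up [Flags] enum: the member names Reader, Editor and Owner are assumptions for illustration only, since the real Involvement values are not shown in the question.

```csharp
using System;

// Hypothetical flag values; the real Involvement enum is not shown in the question.
[Flags]
public enum Involvement
{
    None = 0,
    Reader = 1,
    Editor = 2,
    Owner = 4
}

// Hypothetical helper, not part of the original answer.
public static class InvolvementDiff
{
    // Splits the difference between two flag sets into the flags that were
    // added and the flags that were removed.
    public static (Involvement Added, Involvement Removed) Compare(Involvement newFlags, Involvement oldFlags)
    {
        var changed = newFlags ^ oldFlags;                  // bits that differ between the two sets
        return (newFlags & changed, oldFlags & changed);    // differing bits set in new / set in old
    }
}
```

Going from Reader | Editor to Editor | Owner, the XOR is Reader | Owner, so Owner is reported as added and Reader as removed, while the unchanged Editor flag drops out.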

Linq Lookup to parse a CSV text line

Issue
I asked this question a while back, and the requirements have changed a bit since then.
Now it is possible to have a file with lines as follows:
Bryar22053;ADDPWN;Bryar.Suarez#company.com;ACTIVE
Nicole49927;ADDPWN;Nicole.Acosta#company.com;ACTIVE
Rashad58323;ADDPWN;Rashad.Everett#company.com;ACTIVE
Take the first line. The first value, Bryar22053, is skipped and the same lookup is used:
var columnCount = dataRow.Skip(1).Count();
var modular = 0;
// Simple Enum
var rightsFileType = new RightsFileType();
if (columnCount % 2 == 0)
{
rightsFileType = RightsFileType.WithoutStatus;
modular = 2;
}
else if (columnCount % 3 == 0)
{
rightsFileType = RightsFileType.WithStatus;
modular = 3;
}
var lookup = dataRow.Skip(1).Select((data, index) => new
{
lookup = index % modular,
index,
data
}).ToLookup(d => d.lookup);
The lookup object now has three groups:
> ? lookup[0].ToList()
> Count = 1
> [0]: { lookup = 0, index = 0, data = "ADDPWN" }
> ? lookup[1].ToList()
> Count = 1
> [0]: { lookup = 1, index = 1, data = "Bryar.Suarez#company.com" }
> ? lookup[2].ToList()
> Count = 1
> [0]: { lookup = 2, index = 2, data = "ACTIVE" }
In the original case, where a line was just System1,User1,System2,User2..., the lookup would have two groups and the following code would work:
List<RightObjectRetrieved> rights;
rights = lookup[0].Join(lookup[1], system => system.index + 1, username => username.index, (system, username) => new
{
system = system.data,
useraname = username.data
}).Where(d => !string.IsNullOrEmpty(d.system)).Select(d => new RightObjectRetrieved {UserIdentifier = userIdentifier, SystemIdentifer = d.system, Username = d.useraname, RightType = rightsFileType}).ToList();
// rights => Key = System Identifier, Value = Username
But with the third 'status' value, as in System1,User1,Status1,System2,User2,Status2..., I'm having trouble writing the Join so that I get all three. Please help.
Edit
Here is what I have for raw data:
// Method has parameter localReadLine (string) that has this:
// Bryar22053;ADDPWN;Bryar.Suarez#company.com;ACTIVE
// Data line
var dataRow = localReadLine.Split(new[] { ToolSettings.RightsSeperator }, StringSplitOptions.None);
// Trim each element
Array.ForEach(dataRow, x => dataRow[Array.IndexOf(dataRow, x)] = x.Trim());
Tried (failed) so far
rights = lookup[0].Join(lookup[1], system => system.index + 1, username => username.index, status => status.index, (system, username, status) => new
{
system = system.data,
useraname = username.data,
status = status.data
}).Where(d => !string.IsNullOrEmpty(d.system)).Select(d => new RightObjectRetrieved {UserIdentifier = userIdentifier, SystemIdentifer = d.system, Username = d.useraname, RightType = rightsFileType}).ToList();
And
rights = lookup[0].Join(lookup[1], system => system.index + 1, username => username.index, (system, username) => new
{
system = system.data,
useraname = username.data
}).Join(lookup[2], status => status.index, (status) => new
{
status = status.data
}).Where(d => !string.IsNullOrEmpty(d.system)).Select(d => new RightObjectRetrieved {UserIdentifier = userIdentifier, SystemIdentifer = d.system, Username = d.useraname, RightType = rightsFileType, Status = ParseStatus(status)}).ToList();
I think you need to split up your implementation a little.
Let's declare a class that will hold the data:
class Data
{
public string System { get; set; }
public string Username { get; set; }
public string Status { get; set; }
}
Now, let's define a couple of parsing functions to parse a line.
The first one will parse a line which includes status:
// Explicit delegate type: assigning a lambda to var only compiles from C# 10 onwards.
Func<IEnumerable<string>, List<Data>> withStatus = line => line
.Select((token, index) => new { Value = token, Index = index })
.Aggregate(
new List<Data>(),
(list, token) =>
{
if( token.Index % 3 == 0 )
{
list.Add(new Data { System = token.Value });
return list;
}
var data = list.Last();
if( token.Index % 3 == 1 )
data.Username = token.Value;
else
data.Status = token.Value;
return list;
});
The second one will parse a line which doesn't include status:
// Explicit delegate type: assigning a lambda to var only compiles from C# 10 onwards.
Func<IEnumerable<string>, List<Data>> withoutStatus = line => line
.Select((token, index) => new { Value = token, Index = index })
.Aggregate(new List<Data>(),
(list, token) =>
{
if( token.Index % 2 == 0)
list.Add(new Data { System = token.Value });
else
list.Last().Username = token.Value;
return list;
});
With all that in place, you'll need the following:
Determine the modulus
Iterate the lines of the file and parse each line
Group and aggregate the results
The remaining code would look like this:
var lines = File.ReadAllLines(path); // path is a placeholder for the input file (StreamReader has no ReadAllLines); mind the system resources here!
var parser = lines.First().Split(';').Length % 2 == 0 ? withoutStatus : withStatus;
var data = lines.Skip(1) // skip the header
.Select(line =>
{
var parts = line.Split(';');
return new
{
UserId = parts.First(),
Data = parser(parts.Skip(1))
};
})
.GroupBy(x => x.UserId)
.ToDictionary(g => g.Key, g => g.SelectMany(x => x.Data));
Now you have a Dictionary<string, IEnumerable<Data>> which maps each user id to its records.
Of course, a more elegant solution would be to move each parsing function into its own class and put those classes behind a common interface, in case more info needs to be added in the future. Still, the code above should work and give you an idea of what to do.
If you want to use joins:
var result = lookup[0]
.Join(lookup[1],
system => system.index,
username => username.index - 1,
(system, username) => new {system = system.data, username = username.data, system.index})
.Join(lookup[2],
d => d.index,
status => status.index - 2,
(d, status) => new {d.system, d.username, status = status.data})
.ToList();
Another option is to group the tokens by record and select the data from each group (this looks more readable from my point of view):
var result = dataRow
.Skip(1)
.Select((data, index) => new {data, record = index / 3})
.GroupBy(r => r.record)
.Select(r =>
{
var tokens = r.ToArray();
return new
{
system = tokens[0].data,
username = tokens[1].data,
status = tokens[2].data
};
})
.ToList();
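On .NET 6 or later, the index / 3 grouping above can also be written with Enumerable.Chunk. The following is a sketch under assumptions, not a drop-in replacement: the names RightsLineParser and Parse are hypothetical, the separator is taken to be ';' as in the sample lines, and a trailing record with fewer than three tokens is silently dropped.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class RightsLineParser
{
    // Splits "UserId;System1;User1;Status1;System2;..." into one tuple per record.
    public static List<(string SystemId, string Username, string Status)> Parse(string line, char separator = ';')
    {
        return line.Split(separator)
            .Select(token => token.Trim())
            .Skip(1)                                // skip the leading user identifier
            .Chunk(3)                               // Enumerable.Chunk requires .NET 6 or later
            .Where(record => record.Length == 3)    // ignore an incomplete trailing record
            .Select(record => (record[0], record[1], record[2]))
            .ToList();
    }
}
```

For the sample line Bryar22053;ADDPWN;Bryar.Suarez#company.com;ACTIVE this returns a single record with SystemId "ADDPWN", the e-mail address as Username, and Status "ACTIVE".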

Filtering and grouping items in a .NET List<>

List<MailingList> myGroup = lst.GroupBy(t => new {t.userId, t.userName,t.email,t.reportTypeId})
.Select(g => new MailingList
{
userId = g.Key.userId,
Acrynom = g.SelectMany(t => t.Acrynom).ToArray(),
userName = g.Key.userName,
email = g.Key.email,
reportTypeId = g.Key.reportTypeId
}).ToList();
foreach (var mailingList in myGroup.Distinct())
{
StringBuilder AcrynomsList1 = new StringBuilder();
foreach (var item in mailingList.Acrynom)
{
if (AcrynomsList1.Length > 0)
{
AcrynomsList1.Append(", ");
}
AcrynomsList1.Append(item);
}
}
What I want to achieve is to filter and group myGroup by reportTypeId. reportTypeId can be either 1 or 2, so I want one variable, StringBuilder AcrynomsList1, for reportTypeId = 1 and another variable, StringBuilder AcrynomsList2, for reportTypeId = 2.
My current StringBuilder variable AcrynomsList1 contains the values for both reportTypeId 1 and 2.
This should give you what you want:
var list = lst.GroupBy(x => x.reportTypeId)
.Select(x => new
{
reportTypeId = x.Key,
Acronyms = x.SelectMany(t => t.Acrynom).ToArray()
}).ToList();
var acronymList1 = string.Join(", ", list[0].Acronyms);
var acronymList2 = string.Join(", ", list[1].Acronyms);
You can group it straight into a string without using a StringBuilder:
class Program
{
static void Main(string[] args)
{
var list = new List<MailingList>();
var grouped = list
.GroupBy(m => m.ReportTypeID)
.Select(g => new
{
ReportTypeID = g.Key,
Items = string.Join(", ", g.Where(s => !string.IsNullOrEmpty(s.Acronym)).Select(m => m.Acronym))
});
}
}
class MailingList
{
public int ReportTypeID { get; set; }
public string Acronym { get; set; }
}
The GroupBy extension method returns a sequence of groups of MailingList items, each of which has a Key property exposing the value you grouped by. Each key appears only once, so you don't need the Distinct call.
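If the two lists need to be looked up by report type instead of by position in the grouped result, the groups can be turned into a dictionary keyed on the report type id. This is a minimal sketch that reuses the simplified MailingList shape from the answer above (one Acronym string per item); the names AcronymGrouping and ByReportType are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified shape, as in the answer above.
public class MailingList
{
    public int ReportTypeID { get; set; }
    public string Acronym { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class AcronymGrouping
{
    // Maps each report type id to a comma-separated list of its acronyms.
    public static Dictionary<int, string> ByReportType(IEnumerable<MailingList> items)
    {
        return items
            .Where(m => !string.IsNullOrEmpty(m.Acronym))   // skip empty acronyms
            .GroupBy(m => m.ReportTypeID)
            .ToDictionary(g => g.Key, g => string.Join(", ", g.Select(m => m.Acronym)));
    }
}
```

A caller can then use byType.TryGetValue(1, out var acronymList1) and byType.TryGetValue(2, out var acronymList2), which also covers the case where one of the report types has no rows.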

how to get a SUM in Linq?

I have a List of a class which contains two integers, id and count.
Now I want to write the following LINQ query:
get the sum of the count for each id.
There can be several items with the same id, so they should be summed up, e.g.:
id=1, count=12
id=2, count=1
id=1, count=2
should be:
id=1 -> sum 14
id=2 -> sum 1
How can I do this?
Group the items by Id and then sum the Counts in each group:
var result = items.GroupBy(x => x.Id)
.Select(g => new { Id = g.Key, Sum = g.Sum(x => x.Count) });
Try this (it returns only the sums, without the ids):
var sums = items
.GroupBy(x => x.id)
.Select(n => n.Sum(m => m.count));
The following program...
struct Item {
public int Id;
public int Count;
}
class Program {
static void Main(string[] args) {
var items = new [] {
new Item { Id = 1, Count = 12 },
new Item { Id = 2, Count = 1 },
new Item { Id = 1, Count = 2 }
};
var results =
from item in items
group item by item.Id
into g
select new { Id = g.Key, Count = g.Sum(item => item.Count) };
foreach (var result in results) {
Console.Write(result.Id);
Console.Write("\t");
Console.WriteLine(result.Count);
}
}
}
...prints:
1 14
2 1
