Can only enumerate once over IEnumerable - C#

Given the following code (an xUnit test):
[Fact]
public void SetFilePathTest()
{
    // Arrange
    IBlobRepository blobRepository = null;
    IEnumerable<Photo> photos = new List<Photo>()
    {
        new Photo()
        {
            File = "1.jpg"
        },
        new Photo()
        {
            File = "1.jpg"
        }
    };
    IEnumerable<CloudBlockBlob> blobs = new List<CloudBlockBlob>()
    {
        new CloudBlockBlob(new Uri("https://blabla.net/media/photos/1.jpg")),
        new CloudBlockBlob(new Uri("https://blabla.net/media/photos/2.jpg"))
    };

    // Act
    photos = blobRepository.SetFilePath2(photos, blobs);

    // Assert
    Assert.Equal(2, photos.Count());
    Assert.Equal(2, photos.Count());
}
Here is the SetFilePath2 method:
public static IEnumerable<T> SetFilePath2<T>(this IBlobRepository blobRepository, IEnumerable<T> entities, IEnumerable<CloudBlockBlob> blobs) where T : BlobEntityBase
{
    var firstBlob = blobs.FirstOrDefault();
    if (firstBlob is null == false)
    {
        var prefixLength = firstBlob.Parent.Prefix.Length;
        return entities
            .Join(blobs, x => x.File, y => y.Name.Substring(prefixLength), (entity, blob) => (entity, blob))
            .Select(x =>
            {
                x.entity.File = x.blob.Uri.AbsoluteUri;
                return x.entity;
            });
    }
    else
    {
        return Enumerable.Empty<T>();
    }
}
As you can see, I assert the very same thing twice, but only the first assert succeeds. When I step through with the debugger, I can only enumerate the collection once; at the second Assert it yields no items.
Can anyone explain why this happens? I really don't see anything in this code that explains this behavior.

SetFilePath2 returns a deferred (lazily evaluated) query, so every time you call .Count() you effectively re-run blobRepository.SetFilePath2(photos, blobs).Count(). On top of that, the Select mutates each entity while enumerating: it overwrites entity.File with the blob's absolute URI. After the first enumeration, File no longer matches the join key, so the second enumeration produces no pairs. I would recommend creating a new object in the Select statement if you don't mean to alter the original value. That's why you get different results.
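A minimal sketch of the two usual fixes, assuming the Photo type from the question: either cache the materialized result, or have the projection build a new object instead of mutating the source.

// Option 1: enumerate the deferred query once and cache it.
var result = blobRepository.SetFilePath2(photos, blobs).ToList();
Assert.Equal(2, result.Count);
Assert.Equal(2, result.Count); // counts the cached list; no re-enumeration

// Option 2: a hypothetical Photo-specific variant of SetFilePath2 that
// projects into new objects, leaving the source entities (and their join keys) untouched:
return entities
    .Join(blobs, x => x.File, y => y.Name.Substring(prefixLength), (entity, blob) => (entity, blob))
    .Select(x => new Photo { File = x.blob.Uri.AbsoluteUri });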

Related

IEnumerable failed to set element

I have a ViewModel that combines elements from different tables, which I assign to it via queries.
My problem is that this doesn't work with an IEnumerable (in GetAll() below): RoomCode keeps coming back null. For a single item (in GetDeviceID() below) it works fine.
public IEnumerable<DeviceViewModel> GetAll()
{
    var result = deviceRepository.GetAll().Select(x => x.ToViewModel<DeviceViewModel>());
    for (int i = 0; i < result.Count(); i++)
    {
        int? deviceID = result.ElementAt(i).DeviceId;
        result.ElementAt(i).RoomCode = deviceRepository.GetRoomCode(deviceID);
    }
    return result;
}

public DeviceViewModel GetDeviceID(int deviceID)
{
    var result = new DeviceViewModel();
    var device = deviceRepository.Find(deviceID);
    if (device != null)
    {
        result = device.ToViewModel<DeviceViewModel>();
        result.RoomCode = deviceRepository.GetRoomCode(deviceID);
    }
    else
    {
        throw new BaseException(ErrorMessages.DEVICE_LIST_EMPTY);
    }
    return result;
}

public string GetRoomCode(int? deviceID)
{
    string roomCode;
    var roomDevice = dbContext.Set<RoomDevice>().FirstOrDefault(x => x.DeviceId == deviceID && x.IsActive == true);
    if (roomDevice != null)
    {
        var room = dbContext.Set<Room>().Find(roomDevice.RoomId);
        roomCode = room.RoomCode;
    }
    else
    {
        roomCode = "";
    }
    return roomCode;
}
First, you need to materialize the query into a collection in local memory. Otherwise, each ElementAt(i) re-executes the query against the database and hands back a fresh, temporary object, discarding any change you made to the previous one.
var result = deviceRepository.GetAll()
    .Select(x => x.ToViewModel<DeviceViewModel>())
    .ToList(); // this will materialize the query to a list in memory

// Now modifications of elements in the result IEnumerable will be persisted.
You can then go on with the rest of the code.
Second (and probably optional), for clarity I also recommend using foreach to enumerate the elements; that's the idiomatic C# way to loop through an IEnumerable:
foreach (var element in result)
{
    int? deviceID = element.DeviceId;
    element.RoomCode = deviceRepository.GetRoomCode(deviceID);
}

Optimizing LINQ routines

I run a build system. In simplified terms, the data model is: I have Configurations, and each configuration has 0..n Builds.
Builds produce artifacts, and some of these are stored on the server. I am writing a rule of sorts that sums the bytes produced by each configuration's builds and checks whether that total is too large.
The routine currently looks like this:
private void CalculateExtendedDiskUsage(IEnumerable<Configuration> allConfigurations)
{
    var sw = new Stopwatch();
    sw.Start();

    // Lets take only confs that have been updated within last 7 days
    var items = allConfigurations.AsParallel().Where(x =>
        x.artifact_cleanup_type != null && x.build_cleanup_type != null &&
        x.updated_date > DateTime.UtcNow.AddDays(-7)
    ).ToList();

    using (var ctx = new LocalEntities())
    {
        Debug.WriteLine("Context: " + sw.Elapsed);
        var allBuilds = ctx.Builds;
        var ruleResult = new List<Notification>();
        foreach (var configuration in items)
        {
            // all builds for current configuration
            var configurationBuilds = allBuilds.Where(x => x.configuration_id == configuration.configuration_id)
                .OrderByDescending(z => z.build_date);
            Debug.WriteLine("Filter conf builds: " + sw.Elapsed);

            // Since I don't know which builds/artifacts have been cleaned up, calculate it manually
            if (configuration.build_cleanup_count != null)
            {
                var buildCleanupCount = "30"; // default
                if (configuration.build_cleanup_type.Equals("ReserveBuildsByDays"))
                {
                    var buildLastCleanupDate = DateTime.UtcNow.AddDays(-int.Parse(buildCleanupCount));
                    configurationBuilds = configurationBuilds.Where(x => x.build_date > buildLastCleanupDate)
                        .OrderByDescending(z => z.build_date);
                }
                if (configuration.build_cleanup_type.Equals("ReserveBuildsByCount"))
                {
                    var buildLastCleanupCount = int.Parse(buildCleanupCount);
                    configurationBuilds =
                        configurationBuilds.Take(buildLastCleanupCount).OrderByDescending(z => z.build_date);
                }
            }

            if (configuration.artifact_cleanup_count != null)
            {
                // skipped, similar to previous block
            }
            Debug.WriteLine("Done cleanup: " + sw.Elapsed);

            const int maxDiscAllocationPerConfiguration = 1000000000; // 1GB
            // Sum all disc usage per configuration
            var confDiscSizePerConfiguration = configurationBuilds
                .GroupBy(c => new { c.configuration_id })
                .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
                .Select(groupedBuilds =>
                    new
                    {
                        configurationId = groupedBuilds.FirstOrDefault().configuration_id,
                        configurationPath = groupedBuilds.FirstOrDefault().configuration_path,
                        Total = groupedBuilds.Sum(c => c.artifact_dir_size),
                        Average = groupedBuilds.Average(c => c.artifact_dir_size)
                    }).ToList();
            Debug.WriteLine("Done db query: " + sw.Elapsed);

            ruleResult.AddRange(confDiscSizePerConfiguration.Select(iter => new Notification
            {
                ConfigurationId = iter.configurationId,
                CreatedDate = DateTime.UtcNow,
                RuleType = (int)RulesEnum.TooMuchDisc,
                ConfigrationPath = iter.configurationPath
            }));
            Debug.WriteLine("Finished loop: " + sw.Elapsed);
        }

        // find owners and insert...
    }
}
This does exactly what I want, but I wonder whether I can make it faster. Currently I see:
Context: 00:00:00.0609067
// first round
Filter conf builds: 00:00:00.0636291
Done cleanup: 00:00:00.0644505
Done db query: 00:00:00.3050122
Finished loop: 00:00:00.3062711
// avg round
Filter conf builds: 00:00:00.0001707
Done cleanup: 00:00:00.0006343
Done db query: 00:00:00.0760567
Finished loop: 00:00:00.0773370
The SQL generated by .ToList() looks very messy. (Everything used in the WHERE clause is covered by an index in the DB.)
I am testing with 200 configurations, which adds up to 00:00:18.6326722. I have a total of ~8k items that need to be processed daily, so the whole routine takes more than 10 minutes to complete.
From googling around, it seems that Entity Framework is not very good at parallel processing. Knowing that, I still decided to give the async/await approach a try (first time I've tried it, so sorry for any nonsense).
Basically if I move all the processing out of scope like:
foreach (var configuration in items)
{
    var confDiscSizePerConfiguration = await GetData(configuration, allBuilds);
    ruleResult.AddRange(confDiscSizePerConfiguration.Select(iter => new Notification
    {
        // ... skipped
    }
And:
private async Task<List<Tmp>> GetData(Configuration configuration, IQueryable<Build> allBuilds)
{
    var configurationBuilds = allBuilds.Where(x => x.configuration_id == configuration.configuration_id)
        .OrderByDescending(z => z.build_date);
    // ..skipped

    var confDiscSizePerConfiguration = configurationBuilds
        .GroupBy(c => new { c.configuration_id })
        .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
        .Select(groupedBuilds =>
            new Tmp
            {
                ConfigurationId = groupedBuilds.FirstOrDefault().configuration_id,
                ConfigurationPath = groupedBuilds.FirstOrDefault().configuration_path,
                Total = groupedBuilds.Sum(c => c.artifact_dir_size),
                Average = groupedBuilds.Average(c => c.artifact_dir_size)
            }).ToListAsync();
    return await confDiscSizePerConfiguration;
}
This, for some reason, drops the execution time for 200 items from 18 to 13 seconds. Anyway, from what I understand, since I await each .ToListAsync(), it is still processed sequentially; is that correct?
The "can't process in parallel" claim starts to show up when I replace foreach (var configuration in items) with Parallel.ForEach(items, async configuration =>. Doing this change results in:
A second operation started on this context before a previous asynchronous operation completed. Use 'await' to ensure that any asynchronous operations have completed before calling another method on this context. Any instance members are not guaranteed to be thread safe.
It was a bit confusing to me at first, as I await practically everywhere the compiler allows it, but possibly the work gets started too fast.
I tried to overcome this by being less greedy and adding new ParallelOptions { MaxDegreeOfParallelism = 4 } to that parallel loop; my naive assumption was that since the default connection pool size is 100 and all I want to use is 4, it should be plenty. But it still fails.
I have also tried creating new DbContexts inside the GetData method, but it still fails. If I remember correctly (I can't test right now), I got:
Underlying connection failed to open
What are my options for making this routine faster?
Before going parallel, it is worth optimizing the query itself. Here are some suggestions that might improve your times:
1) Use the group Key when working with GroupBy. This can fix the complex, nested SQL: it instructs LINQ to reuse the keys defined in GROUP BY rather than creating a sub-select.
var confDiscSizePerConfiguration = configurationBuilds
    .GroupBy(c => new { ConfigurationId = c.configuration_id, ConfigurationPath = c.configuration_path })
    .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
    .Select(groupedBuilds =>
        new
        {
            configurationId = groupedBuilds.Key.ConfigurationId,
            configurationPath = groupedBuilds.Key.ConfigurationPath,
            Total = groupedBuilds.Sum(c => c.artifact_dir_size),
            Average = groupedBuilds.Average(c => c.artifact_dir_size)
        })
    .ToList();
2) It seems you are bitten by the N+1 problem: you execute one SQL query to get all configurations and then N more to get the build information. In total that is ~8k small queries where 2 bigger queries would suffice. If memory is not a constraint, fetch all build data into memory and optimize for fast lookup using ToLookup.
var allBuilds = ctx.Builds.ToLookup(x => x.configuration_id);
Later you can look up a configuration's builds with:
var configurationBuilds = allBuilds[configuration.configuration_id].OrderByDescending(z => z.build_date);
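A sketch of the consuming side under this approach (the AsNoTracking() call is my addition, to skip change-tracking overhead on a read-only query; everything else reuses the question's names):

var allBuilds = ctx.Builds.AsNoTracking().ToLookup(x => x.configuration_id); // single round trip

foreach (var configuration in items)
{
    // in-memory lookup; no SQL is issued per configuration anymore
    var configurationBuilds = allBuilds[configuration.configuration_id]
        .OrderByDescending(z => z.build_date)
        .ToList();
    // ... apply the cleanup filters and the disc-size check in memory
}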
3) You are applying OrderBy to configurationBuilds multiple times. Filtering does not change the record order, so you can safely remove the extra OrderBy calls:
...
configurationBuilds = configurationBuilds.Where(x => x.build_date > buildLastCleanupDate);
...
configurationBuilds = configurationBuilds.Take(buildLastCleanupCount);
...
4) There is no point in doing a GroupBy, since the builds are already filtered down to a single configuration.
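Combining 3) and 4), the aggregation inside the loop collapses to a plain Sum over the already-filtered builds; a sketch using the question's variables:

// configurationBuilds already contains only this configuration's builds
var total = configurationBuilds.Sum(c => c.artifact_dir_size);
if (total > maxDiscAllocationPerConfiguration)
{
    ruleResult.Add(new Notification
    {
        ConfigurationId = configuration.configuration_id,
        ConfigrationPath = configuration.configuration_path,
        CreatedDate = DateTime.UtcNow,
        RuleType = (int)RulesEnum.TooMuchDisc
    });
}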
UPDATE:
I took it one step further and wrote code that retrieves the same results as your code with a single request. It should be more performant and use less memory.
private void CalculateExtendedDiskUsage()
{
    using (var ctx = new LocalEntities())
    {
        var ruleResult = ctx.Configurations
            .Where(x => x.build_cleanup_count != null &&
                (
                    (x.build_cleanup_type == "ReserveBuildsByDays" && ctx.Builds.Where(y => y.configuration_id == x.configuration_id).Where(y => y.build_date > buildLastCleanupDate).Sum(y => y.artifact_dir_size) > maxDiscAllocationPerConfiguration) ||
                    (x.build_cleanup_type == "ReserveBuildsByCount" && ctx.Builds.Where(y => y.configuration_id == x.configuration_id).OrderByDescending(y => y.build_date).Take(buildCleanupCount).Sum(y => y.artifact_dir_size) > maxDiscAllocationPerConfiguration)
                )
            )
            .Select(x => new Notification
            {
                ConfigurationId = x.configuration_id,
                ConfigrationPath = x.configuration_path,
                CreatedDate = DateTime.UtcNow,
                RuleType = (int)RulesEnum.TooMuchDisc
            })
            .ToList();
    }
}
First, if you are going to go the Parallel.ForEach route, make a new context in every iteration. But really you need to write a query that gets all the needed data in one trip. To speed up EF you can also disable change tracking and/or proxy creation on the context when you are only reading data.
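For example, with EF6's DbContext that read-only tuning looks roughly like this (AutoDetectChangesEnabled and ProxyCreationEnabled live on DbContext.Configuration; AsNoTracking() is the per-query variant; someConfigurationId is a hypothetical local):

using (var ctx = new LocalEntities())
{
    // read-only usage: skip change tracking and dynamic proxy creation
    ctx.Configuration.AutoDetectChangesEnabled = false;
    ctx.Configuration.ProxyCreationEnabled = false;

    // or opt out per query:
    var builds = ctx.Builds.AsNoTracking()
        .Where(b => b.configuration_id == someConfigurationId) // hypothetical filter
        .ToList();
}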
There are a lot of places for optimizations...
There are places where you should add .ToArray() to avoid querying the server multiple times...
I did a lot of refactoring, but I'm unable to verify it due to a lack of information.
Maybe this can lead you to a better solution...
private void CalculateExtendedDiskUsage(IEnumerable<Configuration> allConfigurations)
{
    var sw = new Stopwatch();
    sw.Start();

    using (var ctx = new LocalEntities())
    {
        Debug.WriteLine("Context: " + sw.Elapsed);
        var allBuilds = ctx.Builds;
        var ruleResult = GetRulesResult(sw, allConfigurations, allBuilds); // Clean Code!!!
        // find owners and insert...
    }
}
private static IEnumerable<Notification> GetRulesResult(Stopwatch sw, IEnumerable<Configuration> allConfigurations, IEnumerable<Build> allBuilds)
{
    // Lets take only confs that have been updated within last 7 days
    var ruleResult = allConfigurations
        .AsParallel() // Check if you really need this right here...
        .Where(IsConfigElegible) // Clean Code!!!
        .SelectMany(x => CreateNotifications(sw, allBuilds, x))
        .ToArray();
    Debug.WriteLine("Finished loop: " + sw.Elapsed);
    return ruleResult;
}
private static bool IsConfigElegible(Configuration x)
{
    return x.artifact_cleanup_type != null &&
           x.build_cleanup_type != null &&
           x.updated_date > DateTime.UtcNow.AddDays(-7);
}
private static IEnumerable<Notification> CreateNotifications(Stopwatch sw, IEnumerable<Build> allBuilds, Configuration configuration)
{
    // all builds for current configuration
    var configurationBuilds = allBuilds
        .Where(x => x.configuration_id == configuration.configuration_id);
    // .OrderByDescending(z => z.build_date); <<< You should order only when needed (ideally at the end)
    Debug.WriteLine("Filter conf builds: " + sw.Elapsed);

    configurationBuilds = BuildCleanup(configuration, configurationBuilds); // Clean Code!!!
    configurationBuilds = ArtifactCleanup(configuration, configurationBuilds); // Clean Code!!!
    Debug.WriteLine("Done cleanup: " + sw.Elapsed);

    const int maxDiscAllocationPerConfiguration = 1000000000; // 1GB

    // Sum all disc usage per configuration
    var confDiscSizePerConfiguration = configurationBuilds
        .OrderByDescending(z => z.build_date) // I think you can move this even later (or drop it entirely)
        .GroupBy(c => c.configuration_id) // No need to create a new object, just use the property
        .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
        .Select(CreateSumPerConfiguration);
    Debug.WriteLine("Done db query: " + sw.Elapsed);

    // Extracting to a variable to be able to return it as the function result
    var notifications = confDiscSizePerConfiguration
        .Select(CreateNotification);
    return notifications;
}
private static IEnumerable<Build> BuildCleanup(Configuration configuration, IEnumerable<Build> builds)
{
    // Since I don't know which builds/artifacts have been cleaned up, calculate it manually
    if (configuration.build_cleanup_count == null) return builds;

    const int buildCleanupCount = 30; // Why 'string' if you always need it as an integer?
    builds = GetDiscartBelow(configuration, buildCleanupCount, builds); // Clean Code (almost)
    builds = GetDiscartAbove(configuration, buildCleanupCount, builds); // Clean Code (almost)
    return builds;
}
private static IEnumerable<Build> ArtifactCleanup(Configuration configuration, IEnumerable<Build> configurationBuilds)
{
    if (configuration.artifact_cleanup_count != null)
    {
        // skipped, similar to previous block
    }
    return configurationBuilds;
}
private static SumPerConfiguration CreateSumPerConfiguration(IGrouping<object, Build> groupedBuilds)
{
    var firstBuild = groupedBuilds.First();
    return new SumPerConfiguration
    {
        configurationId = firstBuild.configuration_id,
        configurationPath = firstBuild.configuration_path,
        Total = groupedBuilds.Sum(c => c.artifact_dir_size),
        Average = groupedBuilds.Average(c => c.artifact_dir_size)
    };
}
private static IEnumerable<Build> GetDiscartBelow(Configuration configuration,
    int buildCleanupCount,
    IEnumerable<Build> configurationBuilds)
{
    if (!configuration.build_cleanup_type.Equals("ReserveBuildsByDays"))
        return configurationBuilds;

    var buildLastCleanupDate = DateTime.UtcNow.AddDays(-buildCleanupCount);
    var result = configurationBuilds
        .Where(x => x.build_date > buildLastCleanupDate);
    return result;
}
private static IEnumerable<Build> GetDiscartAbove(Configuration configuration,
    int buildLastCleanupCount,
    IEnumerable<Build> configurationBuilds)
{
    if (!configuration.build_cleanup_type.Equals("ReserveBuildsByCount"))
        return configurationBuilds;

    var result = configurationBuilds
        .Take(buildLastCleanupCount);
    return result;
}
private static Notification CreateNotification(SumPerConfiguration iter)
{
    return new Notification
    {
        ConfigurationId = iter.configurationId,
        CreatedDate = DateTime.UtcNow,
        RuleType = (int)RulesEnum.TooMuchDisc,
        ConfigrationPath = iter.configurationPath
    };
}
internal class SumPerConfiguration
{
    public object configurationId { get; set; }
    public object configurationPath { get; set; } // I used 'object' because I don't know your data types
    public int Total { get; set; }
    public double Average { get; set; }
}

C# Intersection and Union not working correctly

I am using C# 4.0 in VS 2010 and trying to produce either an intersection or a union of n sets of objects.
The following works correctly:
IEnumerable<String> t1 = new List<string>() { "one", "two", "three" };
IEnumerable<String> t2 = new List<string>() { "three", "four", "five" };
List<String> tInt = t1.Intersect(t2).ToList<String>();
List<String> tUnion = t1.Union(t2).ToList<String>();
// this also works
t1 = t1.Union(t2);
// as does this (but not at the same time!)
t1 = t1.Intersect(t2);
However, the following doesn't work. (These are code snippets.)
My class is:
public class ICD10
{
    public string ICD10Code { get; set; }
    public string ICD10CodeSearchTitle { get; set; }
}
In the following:
IEnumerable<ICD10Codes> codes = Enumerable.Empty<ICD10Codes>();
IEnumerable<ICD10Codes> codesTemp;
List<List<String>> terms;
// I create terms here ----
// and then ...
foreach (List<string> item in terms)
{
    // the following line produces the correct results
    codesTemp = dataContextCommonCodes.ICD10Codes.Where(e => item.Any(k => e.ICD10CodeSearchTitle.Contains(k)));
    if (codes.Count() == 0)
    {
        codes = codesTemp;
    }
    else if (intersectionRequired)
    {
        codes = codes.Intersect(codesTemp, new ICD10Comparer());
    }
    else
    {
        codes = codes.Union(codesTemp, new ICD10Comparer());
    }
}
return codes;
The above only ever returns the results of the last item searched.
I also added my own comparer just in case, but this made no difference:
public class ICD10Comparer : IEqualityComparer<ICD10Codes>
{
    public bool Equals(ICD10Codes Code1, ICD10Codes Code2)
    {
        if (Code1.ICD10Code == Code2.ICD10Code) { return true; }
        return false;
    }

    public int GetHashCode(ICD10Codes Code1)
    {
        return Code1.ICD10Code.GetHashCode();
    }
}
I am certain I am overlooking something obvious - I just cannot see what it is!
This code: return codes; returns a deferred enumerable. None of the queries has actually been executed to fill the set (although some queries do get executed each time through the loop by the Count() call).
This deferred execution is a problem because of the closure issue: at the return, item is bound to the last loop iteration.
Resolve this by forcing the queries to execute in each loop iteration:
if (codes.Count() == 0)
{
    codes = codesTemp.ToList();
}
else if (intersectionRequired)
{
    codes = codes.Intersect(codesTemp, new ICD10Comparer()).ToList();
}
else
{
    codes = codes.Union(codesTemp, new ICD10Comparer()).ToList();
}
If you are using your own comparer, you should take a look at the correct implementation of the GetHashCode function, because the LINQ set operators use it too. See:
http://msdn.microsoft.com/en-us/library/system.object.gethashcode(v=vs.80).aspx
As a diagnostic, you could try changing the hash function to "return 0" to see if that is the problem. ICD10Code.GetHashCode may return different values for objects you consider equal if it is a class object.
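For example, a throwaway diagnostic comparer (using the ICD10Codes entity from the question) forces every item into the same hash bucket, so Equals alone decides equality; if results change with it, the original GetHashCode was the problem:

public class ICD10DiagnosticComparer : IEqualityComparer<ICD10Codes>
{
    public bool Equals(ICD10Codes a, ICD10Codes b)
    {
        return a.ICD10Code == b.ICD10Code;
    }

    public int GetHashCode(ICD10Codes c)
    {
        return 0; // all items collide; only Equals matters now
    }
}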
Your problem is definitely not connected to the Intersect or Union LINQ extension methods. I've just tested the following:
var t1 = new List<ICD10>()
{
    new ICD10() { ICD10Code = "123" },
    new ICD10() { ICD10Code = "234" },
    new ICD10() { ICD10Code = "345" }
};
var t2 = new List<ICD10>()
{
    new ICD10() { ICD10Code = "234" },
    new ICD10() { ICD10Code = "456" }
};

// returns a list with just one element - the one with ICD10Code == "234"
var results = t1.Intersect(t2, new ICD10Comparer()).ToList();
// returns a list with 4 elements
var results2 = t1.Union(t2, new ICD10Comparer()).ToList();
Using your ICD10 and ICD10Comparer class declarations, everything works just fine! You have to search for the bug elsewhere in your custom code, because LINQ works correctly here.

Can I conditionally create an IEnumerable with LINQ?

I have the following code:
List<Obj> coll = new List<Obj>();
if (cond1) coll.Add(new Obj { /*...*/ });
if (cond2) coll.Add(new Obj { /*...*/ });
if (cond3) coll.Add(new Obj { /*...*/ });
Is there a way to use LINQ or collection initializers for that?
EDIT:
The reason I want to use a collection initializer here is that I have an object tree which I initialize entirely with initializers and LINQ. This spot is the only one which doesn't follow that principle.
var myobj = new MyBigObj
{
    Prop1 = from .. select ..,
    Prop2 = from .. select ..,
    ...
    Prop3 = new MySmallerObj
    {
        PropSmall1 = from .. select ..,
        PropSmall2 = from .. select ..,
        ...
    }
};
And now this simply doesn't fit in my scheme:
List<Obj> coll = new List<Obj>();
if (cond1) coll.Add(new Obj { /*...*/ });
if (cond2) coll.Add(new Obj { /*...*/ });
if (cond3) coll.Add(new Obj { /*...*/ });
myobj.Prop4 = coll;
Sure I could put this code in a separate function that returns IEnumerable and call that.. :)
EDIT2:
It looks like I have to write an extension method which I would call like this:
new Obj[0]
    .ConditionalConcat(cond1, x => new Obj { /*...*/ })
    .ConditionalConcat(cond2, x => new Obj { /*...*/ })
    .ConditionalConcat(cond3, x => new Obj { /*...*/ })
One fairly horrible option:
var conditions = new[] { cond1, cond2, cond3 };
var values = new[] {
    new Obj { ... }, // First value
    new Obj { ... }, // Second value
    new Obj { ... }  // Third value
};
var list = conditions.Zip(values, (condition, value) => new { condition, value })
                     .Where(pair => pair.condition)
                     .Select(pair => pair.value)
                     .ToList();
It's not exactly simpler than the original code though ;) (And also it unconditionally creates all the values - it's only conditionally including them in the collection.)
EDIT: An alternative which only constructs the values when it needs to:
var conditions = new[] { cond1, cond2, cond3 };
var valueProviders = new Func<Obj>[] {
    () => new Obj { ... }, // First value
    () => new Obj { ... }, // Second value
    () => new Obj { ... }  // Third value
};
var list = conditions.Zip(valueProviders,
                          (condition, provider) => new { condition, provider })
                     .Where(pair => pair.condition)
                     .Select(pair => pair.provider())
                     .ToList();
EDIT: Given your requested syntax, this is a fairly easy option:
new List<Obj>()
    .ConditionalConcat(cond1, () => new Obj { /*...*/ })
    .ConditionalConcat(cond2, () => new Obj { /*...*/ })
    .ConditionalConcat(cond3, () => new Obj { /*...*/ })
with an extension method:
public static List<T> ConditionalConcat<T>(this List<T> source,
                                           bool condition,
                                           Func<T> provider)
{
    if (condition)
    {
        source.Add(provider()); // invoke the provider to create the value
    }
    return source;
}
If your conditions depend on a single status object (or something that can be reduced to one), you can create a method using yield, like the following:
IEnumerable<Obj> GetElements(MyStatus currentStatus)
{
    if (currentStatus.Prop1 == "Foo")
        yield return new Obj { ... };
    if (currentStatus.IsSomething())
        yield return new Obj { ... };
    if (currentStatus.Items.Any())
        yield return new Obj { ... };
    // etc...
    yield break;
}
In this way, you separate the IEnumerable<Obj> generation logic from the consumer logic.
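For instance, it would slot into the question's object-tree scheme like this (a sketch; MyBigObj and Prop4 are from the question, currentStatus is whatever status object drives the conditions):

var myobj = new MyBigObj
{
    // ... other initializers ...
    Prop4 = GetElements(currentStatus).ToList()
};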
Old question, but here is another approach using the ternary operator ?:, .Concat(), and Enumerable.Empty<T>():
var range1 = Enumerable.Range(1,10);
var range2 = Enumerable.Range(100,10);
var range3 = Enumerable.Range(1000,10);
var flag1 = true;
var flag2 = false;
var flag3 = true;
var sumOfCollections = (flag1 ? range1 : Enumerable.Empty<int>())
    .Concat(flag2 ? range2 : Enumerable.Empty<int>())
    .Concat(flag3 ? range3 : Enumerable.Empty<int>());
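A quick sanity check of what this yields with the flags above (flag2 is false, so range2 is skipped):

Console.WriteLine(string.Join(",", sumOfCollections.Take(12)));
// prints: 1,2,3,4,5,6,7,8,9,10,1000,1001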
Though this is an old question, here is another option that solves it in a fairly clear way, without extension methods or other helpers.
Assuming the conditions and the initial collection of objects to be created have the same size, I used the indexed Where overload, so objects are not added conditionally but rather filtered out; with funcs/lambdas we also get laziness if we want it.
The actual creation of the objects is not relevant here, so I just box ints (you could replace that with real creation, e.g. getting items from another collection by index). The list manipulation at the end is only there to get the ints back; the values collection already has 2 elements, so all of that could be dropped (except maybe the Select with the func call, if you use the lazy variant).
Here is all the code for running a sample test right in Visual Studio:
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace UnitTests
{
    [TestClass]
    public class Tests
    {
        [TestMethod]
        public void Test()
        {
            var conds = new[] { true, false, true };
            var values = conds.Select((c, i) => new Func<object>(() => i)).Where((f, i) => conds[i]);
            var list = values.Select(f => f()).Cast<int>().ToList();
            Assert.AreEqual(list.Count, 2);
        }
    }
}
UPD.
Here are also lazy and non-lazy one-liners with the "getting an object" step:
var lazy1line = conds.Select((c, i) => new Func<object>(() => (DayOfWeek)i)).Where((f, i) => conds[i]).Select(f => f());
var simple1line = conds.Select((c, i) => (DayOfWeek)i).Where((f, i) => conds[i]);
Assert.AreEqual(lazy1line.Count(), simple1line.Count());

LINQ, SelectMany with multiple possible outcomes

I have a situation where I have lists of objects that have to be merged. Each object in the lists has a property that explains how it should be treated in the merge. So assume the following:
enum Cascade {
    Full,
    Unique,
    Right,
    Left
}

class Note {
    int Id { get; set; }
    Cascade Cascade { get; set; }
    // lots of other data.
}

var list1 = new List<Note>{
    new Note {
        Id = 1,
        Cascade = Cascade.Full,
        // data
    },
    new Note {
        Id = 2,
        Cascade = Cascade.Right,
        // data
    }
};
var list2 = new List<Note>{
    new Note {
        Id = 1,
        Cascade = Cascade.Left,
        // data
    }
};
var list3 = new List<Note>{
    new Note {
        Id = 1,
        Cascade = Cascade.Unique,
        // data similar to list1.Note[0]
    }
};
So then, I'll have a method ...
Composite(this IList<IList<Note>> notes) {
    return new List<Note> {
        notes.SelectMany(g => g).Where(g => g.Cascade == Cascade.All).ToList()
        // Here is the problem...
        .SelectMany(g => g).Where(g => g.Cascade == Cascade.Right)
        .Select( // I want to do a _LastOrDefault_ )
        // continuing for the other cascades.
    }
}
This is where I get lost. I need to do multiple SelectMany statements, but I don't know how. This is the expected behavior:
Cascade.Full
The Note will be in the final collection no matter what.
Cascade.Unique
The Note will be in the final collection one time, ignoring any duplicates.
Cascade.Left
The Note will be in the final collection, First instances superseding subsequent instances. (So then, Notes 1, 2, 3 are identical. Note 1 gets pushed through)
Cascade.Right
The Note will be in the final collection, the last instance superseding duplicates. (So if Notes 1, 2, 3 are identical, Note 3 gets pushed through.)
I think you should decompose the problem into smaller parts. For example, you can implement the cascade rules for an individual list in a separate extension method. Here's my untested take at it:
public static IEnumerable<Note> ApplyCascades(this IEnumerable<Note> notes)
{
    var uniques = new HashSet<Note>();
    Note rightToYield = null;
    bool leftYielded = false; // tracked across the whole list, so only the first Left note wins
    foreach (var n in notes)
    {
        if (n.Cascade == Cascade.Full) yield return n;
        if (n.Cascade == Cascade.Left && !leftYielded)
        {
            yield return n;
            leftYielded = true;
        }
        if (n.Cascade == Cascade.Right)
        {
            rightToYield = n; // remember only the last Right note
        }
        if (n.Cascade == Cascade.Unique && !uniques.Contains(n))
        {
            yield return n;
            uniques.Add(n);
        }
    }
    if (rightToYield != null) yield return rightToYield;
}
This method would allow you to implement the original extension method something like this:
List<Note> Composite(IList<IList<Note>> notes)
{
    var result = from list in notes
                 from note in list.ApplyCascades()
                 select note;
    return result.ToList();
}
