C# - match ID to filename using functional principles

I'm rewriting an old C# project from scratch trying to figure out how it can be improved with the use of functional design. So far I've stuck with a couple of principles (except where the GUI is concerned):
Set every variable in every class as readonly and assign it a value only once.
Use immutable collections
Don't write code with side effects.
Now I'm trying to create a function which, given a folder, using yield return, enumerates a list of objects, one for each file in the given folder. Each object contains a unique ID, starting from firstAssignedID, and a filename.
Thing is, I'm not sure how to approach the problem at all. Is what I've just described even the right way of thinking about it? My code so far is a half-baked, incomplete mess. Is it possible to use a lambda here? Would that help, or is there a better way?
The FileObject class simply contains a string fileName and an int id, and the FileObject constructor simply and naively creates an instance given those two values.
public IEnumerable<FileObject> EnumerateImagesInPath(string folderPath, int firstAssignedID)
{
foreach (string path in Directory.EnumerateFiles(folderPath)
{
yield return new FileObject(Path.GetFileName(imagePath) , );
}
}

The most functional way of doing what you want is this:
IEnumerable<FileObject> EnumerateImagesInPath(string path, int firstAssignedID) =>
Enumerable.Zip(
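// Range takes (start, count); subtracting keeps start + count - 1 within Int32 (assumes firstAssignedID >= 0)
Enumerable.Range(firstAssignedID, Int32.MaxValue - firstAssignedID),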
Directory.EnumerateFiles(path),
FileObject.New);
With a FileObject type defined like so:
public class FileObject
{
public readonly int Id;
public readonly string Filename;
FileObject(int id, string fileName)
{
Id = id;
Filename = fileName;
}
public static FileObject New(int id, string fileName) =>
new FileObject(id, fileName);
}
It doesn't use yield, but that doesn't matter because Enumerable.Range and Enumerable.Zip do, so it's a lazy function just like your original example.
I use Enumerable.Range to create a lazy list of integers counting up from firstAssignedID. This is zipped together with the enumerable of files in the directory. FileObject.New(id, path) is invoked as part of the zip computation.
There is no in-place state modification as in the accepted answer (firstAssignedID++), and the whole function can be represented as an expression.
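To try the Zip approach without touching the file system, here is a minimal sketch that numbers an in-memory list of names. NumberedFile and ZipSketch are hypothetical names used only for this illustration, and the code assumes firstId is not negative so that the Range count stays in bounds.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for FileObject, used only in this sketch.
public sealed class NumberedFile
{
    public int Id { get; }
    public string FileName { get; }
    public NumberedFile(int id, string fileName) { Id = id; FileName = fileName; }
}

public static class ZipSketch
{
    // Lazily pairs each name with an ID starting at firstId.
    // Assumes firstId >= 0; Range throws if start + count - 1 exceeds Int32.MaxValue.
    public static IEnumerable<NumberedFile> Number(IEnumerable<string> fileNames, int firstId) =>
        Enumerable.Range(firstId, int.MaxValue - firstId)
                  .Zip(fileNames, (id, name) => new NumberedFile(id, name));
}
```

Zip stops as soon as the shorter sequence ends, so the long Range is never fully enumerated.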
The other way of achieving your goal is to use the fold pattern. It is the most common way of aggregating state in functional programming. This is how to define it for IEnumerable:
public static class EnumerableExt
{
public static S Fold<S, T>(this IEnumerable<T> self, S state, Func<S, T, S> folder) =>
self.Any()
? Fold(self.Skip(1), folder(state, self.First()), folder)
: state;
}
You should be able to see that it's a recursive function that runs a delegate (folder) on the head of the list if there is one, then uses the result as the new state when calling Fold recursively. If it reaches the end of the list, the aggregate state is returned.
You may notice that this implementation of EnumerableExt.Fold can blow up the stack in C# (because of a lack of tail-call optimisation). So a better way of implementing the Fold function is to do so imperatively:
public static S Fold<S, T>(this IEnumerable<T> self, S state, Func<S, T, S> folder)
{
foreach(var x in self)
{
state = folder(state, x);
}
return state;
}
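For comparison, the BCL already ships a left fold under the name Enumerable.Aggregate, which takes a seed and a function in the same order as the Fold above. A minimal sketch follows; the class and method names are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSketch
{
    // Counts items by adding 1 to the running state for each element.
    public static int CountWithAggregate<T>(IEnumerable<T> items) =>
        items.Aggregate(0, (state, item) => state + 1);

    // Sums items by adding each element to the running state.
    public static int SumWithAggregate(IEnumerable<int> items) =>
        items.Aggregate(0, (state, item) => state + item);
}
```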
There is a dual to Fold known as FoldBack (sometimes they're called 'fold left' and 'fold right'). FoldBack essentially aggregates from the tail of the list to the head, where Fold is from the head to the tail.
public static S FoldBack<S, T>(this IEnumerable<T> self, S state, Func<S, T, S> folder)
{
foreach(var x in self.Reverse()) // Note the Reverse()
{
state = folder(state, x);
}
return state;
}
Fold is very flexible. For example, you could implement Count for an enumerable in terms of Fold like so:
public static int Count<T>(this IEnumerable<T> self) =>
self.Fold(0, (state, item) => state + 1);
Or Sum like so:
public static int Sum(this IEnumerable<int> self) =>
self.Fold(0, (state, item) => state + item);
Or most of the IEnumerable API!
public static bool Any<T>(this IEnumerable<T> self) =>
self.Fold(false, (state, item) => true);
public static bool Exists<T>(this IEnumerable<T> self, Func<T, bool> predicate) =>
self.Fold(false, (state, item) => state || predicate(item));
public static bool ForAll<T>(this IEnumerable<T> self, Func<T, bool> predicate) =>
self.Fold(true, (state, item) => state && predicate(item));
public static IEnumerable<R> Select<T, R>(this IEnumerable<T> self, Func<T, R> map) =>
self.FoldBack(Enumerable.Empty<R>(), (state, item) => map(item).Cons(state));
public static IEnumerable<T> Where<T>(this IEnumerable<T> self, Func<T, bool> predicate) =>
self.FoldBack(Enumerable.Empty<T>(), (state, item) =>
predicate(item)
? item.Cons(state)
: state);
It's very powerful, and allows the aggregation of state for a collection (so this allows us to do firstAssignedId++ without an imperative in-place state modification).
Our FileObject example is a little more complex than Count or Sum, because we need to maintain two pieces of state: the aggregate ID and the resulting IEnumerable<FileObject>. So our state is a Tuple<int, IEnumerable<FileObject>>:
IEnumerable<FileObject> FoldImagesInPath(string folderPath, int firstAssignedID) =>
Directory.EnumerateFiles(folderPath)
.Fold(
Tuple.Create(firstAssignedID, Enumerable.Empty<FileObject>()),
(state, path) => Tuple.Create(state.Item1 + 1, FileObject.New(state.Item1, path).Cons(state.Item2)))
.Item2;
You can make this even more declarative by providing some extension and static methods for Tuple<int, IEnumerable<FileObject>>:
public static class FileObjectsState
{
// Creates a tuple with state ID of zero (Item1) and an empty FileObject enumerable (Item2)
public static readonly Tuple<int, IEnumerable<FileObject>> Zero =
Tuple.Create(0, Enumerable.Empty<FileObject>());
// Returns a new tuple with the ID (Item1) set to the supplied argument
public static Tuple<int, IEnumerable<FileObject>> SetId(this Tuple<int, IEnumerable<FileObject>> self, int id) =>
Tuple.Create(id, self.Item2);
// Returns the important part of the result, the enumerable of FileObjects
public static IEnumerable<FileObject> Result(this Tuple<int, IEnumerable<FileObject>> self) =>
self.Item2;
// Adds a new path to the aggregate state and increases the ID by one.
public static Tuple<int, IEnumerable<FileObject>> Add(this Tuple<int, IEnumerable<FileObject>> self, string path) =>
Tuple.Create(self.Item1 + 1, FileObject.New(self.Item1, path).Cons(self.Item2));
}
The extension methods capture the operations on the aggregate state and make the resulting fold computation very clear:
IEnumerable<FileObject> FoldImagesInPath(string folderPath, int firstAssignedID) =>
Directory.EnumerateFiles(folderPath)
.Fold(
FileObjectsState.Zero.SetId(firstAssignedID),
FileObjectsState.Add)
.Result();
Obviously using Fold for the use-case you provided is overkill, and that's why I used Zip instead. But the more general problem you were struggling with (functional aggregate state) is what Fold is for.
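There is one more extension method I used in the examples above, Cons, which prepends an item to a sequence (so the Fold version above yields the FileObjects in reverse enumeration order):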
public static IEnumerable<T> Cons<T>(this T x, IEnumerable<T> xs)
{
yield return x;
foreach(var a in xs)
{
yield return a;
}
}
More info on cons can be found here
If you want to learn more about using functional techniques in C#, please check my library: language-ext. It will give you a ton of stuff that the C# BCL is missing.

Using yield seems unnecessary:
public IEnumerable<FileObject> EnumerateImagesInPath(string folderPath, int firstAssignedID)
{
foreach (FileObject File in Directory.EnumerateFiles(folderPath)
.Select(FileName => new FileObject(FileName, firstAssignedID++)))
{
yield return File;
}
}
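If the goal is to keep Select but avoid mutating firstAssignedID inside the lambda, one option is the Select overload that also passes the element's zero-based index. The sketch below works on an in-memory list of names and returns KeyValuePair<int, string> instead of FileObject so that it stays self-contained; IndexedSelectSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexedSelectSketch
{
    // The ID is computed from the index, so no captured variable is modified.
    public static IEnumerable<KeyValuePair<int, string>> Number(
        IEnumerable<string> fileNames, int firstId) =>
        fileNames.Select((name, index) =>
            new KeyValuePair<int, string>(firstId + index, name));
}
```

Because nothing is mutated, enumerating the result twice produces the same IDs both times.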

Related

LINQ: Customize the count method

I'm trying to implement a custom LINQ Count() method. Basically, what I'm trying to achieve is to filter out all elements that have the property IsDeleted set to true before calling Count. So I created an extension class and added these methods:
public static int Count2<T>(this IEnumerable<T> source, Func<T, bool> selector)
where T : Model
{
return source.Where(x => !x.IsDeleted).Count(selector);
}
public static int Count2<T>(this IQueryable<T> source, Expression<Func<T, bool>> selector)
where T : Model
{
return source.Where(x => !x.IsDeleted).Count(selector);
}
public static int Count2<T>(this IEnumerable<T> source)
where T : Model
{
return source.Count(x => !x.IsDeleted);
}
public static int Count2<T>(this IQueryable<T> source)
where T : Model
{
return source.Count(x => !x.IsDeleted);
}
This works just fine for local collections, but when executing a command such as this:
ListOfModels.Sum(x => x.PropertyThatIsAList.Count2())
where ListOfModels is an IQueryable, i.e. it has to be executed in the database, it gives me this error:
The LINQ expression 'Sum()' could not be translated and will be evaluated locally.
I looked around on the web and saw some answers saying I have to implement IQueryProvider, but I think there is no need to go down such a complicated path, since Sum() and Count() are translatable; I only need to count conditionally. Is it possible, and if it is, can anyone give me a clue on how to do it?

Instead of customizing every LINQ method, I suggest you use an extension method like Validate():
public static IEnumerable<T> Validate<T>(this IEnumerable<T> list) where T: IDeleteable
{
return list.Where(w => !w.IsDeleted);
}
That IDeleteable interface is like this:
public interface IDeleteable
{
bool IsDeleted { get; set; }
}
Then call it before the other methods.
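Since the question is about IQueryable, the same idea can be sketched on IQueryable<T> so that the filter stays part of the query expression. ISoftDeletable, NotDeleted and SampleRow are hypothetical names for this illustration; whether a given provider translates the filter is an assumption that has to be checked against that provider. A custom method called inside another lambda (as in the Sum example from the question) is still opaque to the provider, so in that position the condition generally has to be written inline, for example x.PropertyThatIsAList.Count(y => !y.IsDeleted).

```csharp
using System;
using System.Linq;

// Hypothetical marker interface for soft-deleted entities.
public interface ISoftDeletable
{
    bool IsDeleted { get; }
}

public static class SoftDeleteExtensions
{
    // Returns IQueryable<T>, so the Where clause is added to the expression tree
    // instead of switching to in-memory LINQ to Objects.
    public static IQueryable<T> NotDeleted<T>(this IQueryable<T> source)
        where T : class, ISoftDeletable =>
        source.Where(x => !x.IsDeleted);
}

// Sample type used only to exercise the sketch in memory.
public sealed class SampleRow : ISoftDeletable
{
    public bool IsDeleted { get; set; }
}
```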

C# Continuation Monad Implementation

I have been working on allowing function chaining. I have created a class called Continuation which takes a value and a function from a => b. This allows me to use fmap and bind to chain these together. I have also used Lazy to allow calls to be deferred where possible.
Is this class really the continuation monad, or is it something else? I am finding it hard to find good literature which is not in Haskell.
Any comments on how to improve or correct this are also welcome.
using NUnit.Framework;
using System;
namespace Monads
{
public class Continuation<Input, Output>{
public Continuation(Input value, Func<Input,Output> function){
this.value = new Lazy<Input>( () => value);
this.function = function;
}
public Continuation(Lazy<Input> value, Func<Input,Output> function){
this.value = value;
this.function = function;
}
public Continuation<Output, Result> FMap<Result>(Func<Output, Result> map){
return new Continuation<Output, Result>(new Lazy<Output>( () => Run() ), x => map(x));
}
public Continuation<Output,Result> Bind<Result>(Func<Output, Continuation<Output, Result>> f){
return f(Run());
}
public Output Run(){
return function(value.Value);
}
private Func<Input, Output> function;
private Lazy<Input> value;
}
public static class ContinuationExtension{
public static Continuation<A,B> Unit<A,B>(this Func<A,B> f, A value){
return new Continuation<A, B>(value,f);
}
public static Continuation<A,B> Unit<A,B>(this A value,Func<A,B> f){
return new Continuation<A, B>(value,f);
}
}
[TestFixture]
public class MonadTests
{
public Continuation<int,int> Wrapped(int value){
return new Continuation<int,int>(value, x => x * 10);
}
[Test]
public void ContinuationMonadTests()
{
var number = 42;
var result = number.Unit(x => x + 8).FMap(x => x * 2).Bind(Wrapped).Run();
Console.WriteLine(result);
}
}
}

This is not the continuation monad. You are much closer to the Haskell Monad instance for functions.
You aren't getting anything that you couldn't get just from using Lazy<>. Since you have provided the input when you build an instance of your class, you aren't building functions, you are building values that are determined by a computation that hasn't been evaluated yet. Lazy<> delays the evaluation of computation until the value is needed.
Let's put together something like the Haskell Monad instance for functions in C#. LINQ syntax has established the convention for monads in C#. They should have:
a Select extension method analogous to a Haskell Functor's fmap
a SelectMany extension method analogous to Haskell's Monad's >>=
an additional SelectMany that LINQ syntax uses. This takes an additional function that combines the value from two steps together.
There's no convention for what the analog of a Monad's return should be called; we'll call ours Constant. Unfortunately, Constant won't be very convenient because C#'s type inference won't be able to figure out the types.
public static class Function
{
public static Func<TIn, TOut> Constant<TIn, TOut>(TOut result)
{
return x => result;
}
public static Func<TIn, TOut> Select<TIn, TMid, TOut>(
this Func<TIn, TMid> func,
Func<TMid, TOut> proj)
{
return x => proj(func(x));
}
public static Func<TIn, TOut> SelectMany<TIn, TMid, TOut>(
this Func<TIn, TMid> func,
Func<TMid, Func<TIn, TOut>> proj)
{
return x => proj(func(x))(x);
}
public static Func<TIn, TOut> SelectMany<TIn, TMid1, TMid2, TOut>(
this Func<TIn, TMid1> func,
Func<TMid1, Func<TIn, TMid2>> proj1,
Func<TMid1, TMid2, TOut> proj2)
{
return x => {
var mid1 = func(x);
var mid2 = proj1(mid1)(x);
return proj2(mid1, mid2);
};
}
}
Note that defining these extension methods only lets you interact with something like it's a Monad, it doesn't let you write code that's generic over the specific Monad being used. There's a sketch of how to do that in the second half of this answer.
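To see what these methods buy you, here is a small sketch that uses LINQ query syntax over plain Func values. It repeats only the three-argument SelectMany, since that is the overload a query with two from clauses is translated to. FuncMonad and FuncMonadDemo are names chosen for this illustration.

```csharp
using System;

public static class FuncMonad
{
    // Same shape as the three-argument SelectMany above: both steps read the same input x.
    public static Func<TIn, TOut> SelectMany<TIn, TMid1, TMid2, TOut>(
        this Func<TIn, TMid1> func,
        Func<TMid1, Func<TIn, TMid2>> proj1,
        Func<TMid1, TMid2, TOut> proj2) =>
        x =>
        {
            var mid1 = func(x);
            var mid2 = proj1(mid1)(x);
            return proj2(mid1, mid2);
        };
}

public static class FuncMonadDemo
{
    public static Func<int, int> Build()
    {
        Func<int, int> doubled = x => x * 2;
        Func<int, int> plusTen = x => x + 10;

        // Translated by the compiler to doubled.SelectMany(a => plusTen, (a, b) => a + b).
        return from a in doubled
               from b in plusTen
               select a + b;
    }
}
```

Calling FuncMonadDemo.Build()(3) feeds 3 to both functions (giving 6 and 13) and returns 19.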

This might be a bit opinion-based, but I'll give you my two cents anyway.
Let's have a look at your class and its instances:
It holds a value and a function, and you tried to make it all lazy.
From a theoretical point of view I can see no difference from Lazy<T> at first glance:
You can certainly convert one of your Continuation<Input,Output> instances to just a Lazy<Output>.
The same is true in reverse: given some lazy value a, you can make an instance with just
new Continuation(a, x => x)
So to me it seems that you have just reinvented Lazy (which is a monad; in Haskell you would call it Identity).
The Cont monad is not really easy to grasp, but it is more closely related to .NET events or .NET observables. The data structure itself would look like
Func<Func<Input,Output>, Output>
where you pass a continuation Func<Input,Output> in to some internal calculation, and the structure then calls it once it has calculated a value of type Input, to get the final result.
This might be a bit cryptic, but one .NET application is the async workflows of F#, which in some sense served as the model for C#'s async/await behaviour.
I have some material on GitHub that I used for a talk on a simplified version of this monad in C#; maybe you'll find it interesting.

I have created a very comprehensive introduction to the continuation monad, which you can find here: Discovering the Continuation Monad in C#
You can also find a .NET Fiddle here
I repeat it in summary here.
Starting from an initial function
int Square(int x ){return (x * x);}
Use Callback and remove return type
public static void Square(int x, Action<int> callback)
{
callback(x * x);
}
Curry the Callback
public static Action<Action<int>> Square(int x)
{
return (callback) => { callback(x * x); };
}
Generalize the returned Continuation
public static Func<Func<int,T>,T> Square<T>(int x)
{
return (callback) => callback(x * x); // the lambda must return the callback's result of type T
}
Extract the continuation structure, also known as the return method of the monad. That is: give me a value and I will give you a monad for this value
//((U→ T) → T)
delegate T Cont<U, T>(Func<U, T> f);
public static Cont<U, T> ToContinuation<U, T>(this U x)
{
return (callback) => callback(x);
}
square.ToContinuation<Func<int, int>, int>()
Add the monadic bind method and thus complete the monad. That is: give me a monad and a function that produces a second monad, and I will combine them into a new monad
((A→ T) → T)→( A→((B→ T) → T))→ ((B→ T) → T)
public static Cont<V, Answer> Bind<T, U, V, Answer>(
this Cont<T, Answer> m,
Func<T, Cont<U, Answer>> k,
Func<T, U, V> selector)
{
return (Func<V, Answer> c) =>
m(t => k(t)(y => c(selector(t, y))));
}
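To make the steps above concrete, here is a compact sketch that puts the delegate, a return function and a bind together and runs a small chain. It uses the plain two-argument form of bind rather than the selector variant shown above, and all names (ContSketch, Return, Run, the TAnswer type parameter) are chosen for this illustration only.

```csharp
using System;

// (T -> TAnswer) -> TAnswer: a computation that hands its result to a callback.
public delegate TAnswer Cont<T, TAnswer>(Func<T, TAnswer> callback);

public static class ContSketch
{
    // "Return": wrap a plain value so it is passed straight to the callback.
    public static Cont<T, TAnswer> Return<T, TAnswer>(T value) =>
        callback => callback(value);

    // "Bind": run m, feed its result to next, and pass the final callback along.
    public static Cont<U, TAnswer> Bind<T, U, TAnswer>(
        this Cont<T, TAnswer> m, Func<T, Cont<U, TAnswer>> next) =>
        callback => m(t => next(t)(callback));

    // The Square example in continuation-passing form.
    public static Cont<int, TAnswer> Square<TAnswer>(int x) =>
        callback => callback(x * x);

    // Builds 3 -> square -> add one, then supplies the final callback.
    public static string Run() =>
        Return<int, string>(3)
            .Bind(x => Square<string>(x))
            .Bind(x => Return<int, string>(x + 1))
            (x => "result = " + x);
}
```

Nothing runs until the final callback is supplied; Run() then evaluates 3 * 3 + 1 and returns "result = 10".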

How to find the max id of any table.

I would like something like this:
public int NumberStudent()
{
int i = 0;
if (db.Tbl_Student.ToList().Count() > 0)
i = db. Tbl_Student.Max(d => d.id);
return i;
}
However, I would like to use it on any table:
public int FindMaxId(string TableName)
{
int i =0;
if ('db.'+TableName+'.ToList().Count() > 0' )
i = db. TableName.Max(d => d.id);
return i ;
}
I know it is wrong, but I'm not sure how to do it.

You can use the IEnumerable/IQueryable extension method DefaultIfEmpty for this.
var maxId = db.Tbl_Student.Select(x => x.Id).DefaultIfEmpty(0).Max();
In general, if you do Q.DefaultIfEmpty(D), it means:
If Q isn't empty, give me Q; otherwise, give me [ D ].
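The behaviour on an empty source is easy to check in memory. The sketch below uses LINQ to Objects on plain integers, so it says nothing about how a particular database provider translates the call; MaxIdSketch and its method names are made up for this illustration. The second method shows the nullable-cast alternative, which yields null for an empty source and falls back to 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaxIdSketch
{
    // An empty sequence becomes the single-element sequence { 0 } before Max runs.
    public static int MaxOrZero(IEnumerable<int> ids) =>
        ids.DefaultIfEmpty(0).Max();

    // Max over int? returns null for an empty sequence instead of throwing.
    public static int MaxOrZeroViaNullable(IEnumerable<int> ids) =>
        ids.Max(id => (int?)id) ?? 0;
}
```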

Below I have written a simple wrapper around the existing Max extension method that allows you to provide an empty source (the table you were talking about).
Instead of throwing an exception, it will just return the default value of zero.
Original
public static class Extensions
{
public static int MaxId<TSource>(this IEnumerable<TSource> source, Func<TSource, int> selector)
{
if (source.Any())
{
return source.Max(selector);
}
return 0;
}
}
This was my attempt, which as noted by Timothy is actually quite inferior. This is because the sequence will be enumerated twice. Once when calling Any to check if the source sequence has any elements, and again when calling Max.
Improved
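public static class Extensions
{
// Expression<...> (from System.Linq.Expressions) keeps Select on IQueryable, so the provider can translate it;
// a plain Func would bind to Enumerable.Select and run in memory.
public static int MaxId<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, int>> selector)
{
return source.Select(selector).DefaultIfEmpty(0).Max();
}
}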
This implementation uses Timothy's approach. By calling DefaultIfEmpty, we are making use of deferred execution and the sequence will only be enumerated when calling Max. In addition we are now using IQueryable instead of IEnumerable which means we don't have to enumerate the source before calling this method. As Scott said, should you need it you can create an overload that uses IEnumerable too.
In order to use the extension method, you just need to provide a delegate that returns the id of the source type, exactly the same way you would for Max.
public class Program
{
// static, so that the static Main method can reach it
static readonly YourContext context = new YourContext();
public static int MaxStudentId()
{
return context.Student.MaxId(s => s.Id);
}
public static void Main(string[] args)
{
Console.WriteLine("Max student id: {0}", MaxStudentId());
}
}
public static class Extensions
{
// Expression<...> requires: using System.Linq.Expressions;
public static int MaxId<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, int>> selector)
{
return source.Select(selector).DefaultIfEmpty(0).Max();
}
}

db.Tbl_Student.Aggregate(0, (maxId, s) => Math.Max(maxId, s.Id))
or
db.Tbl_Student.Max(s => (int?)s.Id) ?? 0

How to sort a TrackableCollection?

I have an Entity Framework 'TrackableCollection' and I want to sort it by an attribute. I have tried treating it like an IEnumerable and calling
TrackableCollection<Something>.OrderBy(q=>q.SomeValue);
but it throws an exception: "Cannot implicitly convert type IOrderedEnumerable to TrackableCollection".
Anyone know how to sort a TrackableCollection?

The code example Shiv Kumar refers to does not work. It doesn't compile, and even after you fix things up (like adding generics in a lot of places), it works but is buggy, since the code calls collection.Move, which causes an "Index must be within the bounds of the List" exception in certain cases.
The code below works correctly. The authors of STE (Self-Tracking Entities) should have implemented this themselves. This is the correct code:
public static class Extensions
{
public static void Sort<T>(this TrackableCollection<T> collection, Comparison<T> comparison)
{
var comparer = new Comparer<T>(comparison);
List<T> sorted = collection.OrderBy(x=>x, comparer) .ToList();
collection.Clear();
for (int i = 0; i < sorted.Count(); i++)
collection.Add(sorted[i]);
}
}
class Comparer<T> : IComparer<T>
{
private Comparison<T> comparison;
public Comparer(Comparison<T> comparison)
{
this.comparison = comparison;
}
public int Compare(T x, T y)
{
return comparison.Invoke(x, y);
}
}
You use this code as in the previous example:
YourTrackableCollectionName.Sort((x, y) => x.YourFieldName.CompareTo(y.YourFieldName));

This is a simplified version of Ofer Zelig's code that uses a keySelector func (like LINQ's OrderBy) instead of an explicit Comparison delegate. Since it only uses Clear() and Add(), it can be called on any Collection<T> object (like an ObservableCollection or a TrackableCollection).
public static void Sort<TSource, TKey>(this Collection<TSource> source, Func<TSource, TKey> keySelector)
{
var sorted = source.OrderBy(keySelector).ToList();
source.Clear();
foreach (var item in sorted)
source.Add(item);
}
It's used like this:
list.Sort(person => person.Name);
If you're using this on a TrackableCollection (STE) you might want to make sure you haven't started tracking changes before sorting the list.

LINQ identity function

Just a little niggle about LINQ syntax. I'm flattening an IEnumerable<IEnumerable<T>> with SelectMany(x => x).
My problem is with the lambda expression x => x. It looks a bit ugly. Is there some static 'identity function' object that I can use instead of x => x? Something like SelectMany(IdentityFunction)?

Unless I misunderstand the question, the following seems to work fine for me in C# 4:
public static class Defines
{
public static T Identity<T>(T pValue)
{
return pValue;
}
...
You can then do the following in your example:
var result =
enumerableOfEnumerables
.SelectMany(Defines.Identity);
As well as use Defines.Identity anywhere you would use a lambda that looks like x => x.

Note: this answer was correct for C# 3, but at some point (C# 4? C# 5?) type inference improved so that the IdentityFunction method shown below can be used easily.
No, there isn't. It would have to be generic, to start with:
public static Func<T, T> IdentityFunction<T>()
{
return x => x;
}
But then type inference wouldn't work, so you'd have to do:
SelectMany(Helpers.IdentityFunction<Foo>())
which is a lot uglier than x => x.
Another possibility is that you wrap this in an extension method:
public static IEnumerable<T> Flatten<T>
(this IEnumerable<IEnumerable<T>> source)
{
return source.SelectMany(x => x);
}
Unfortunately with generic variance the way it is, that may well fall foul of various cases in C# 3... it wouldn't be applicable to List<List<string>> for example. You could make it more generic:
public static IEnumerable<TElement> Flatten<TElement, TWrapper>
(this IEnumerable<TWrapper> source) where TWrapper : IEnumerable<TElement>
{
return source.SelectMany(x => x);
}
But again, you've then got type inference problems, I suspect...
EDIT: To respond to the comments... yes, C# 4 makes this easier. Or rather, it makes the first Flatten method more useful than it is in C# 3. Here's an example which works in C# 4, but doesn't work in C# 3 because the compiler can't convert from List<List<string>> to IEnumerable<IEnumerable<string>>:
using System;
using System.Collections.Generic;
using System.Linq;
public static class Extensions
{
public static IEnumerable<T> Flatten<T>
(this IEnumerable<IEnumerable<T>> source)
{
return source.SelectMany(x => x);
}
}
class Test
{
static void Main()
{
List<List<string>> strings = new List<List<string>>
{
new List<string> { "x", "y", "z" },
new List<string> { "0", "1", "2" }
};
foreach (string x in strings.Flatten())
{
Console.WriteLine(x);
}
}
}

With C# 6.0 and if you reference FSharp.Core you can do:
using static Microsoft.FSharp.Core.Operators;
And then you're free to do:
SelectMany(Identity)

With C# 6.0 things are getting better. We can define the identity function in the way suggested by @Sahuagin:
static class Functions
{
public static T It<T>(T item) => item;
}
And then use it in SelectMany via a using static directive:
using static Functions;
...
var result = enumerableOfEnumerables.SelectMany(It);
I think it looks very concise this way. I also find the identity function useful when building dictionaries:
class P
{
P(int id, string name) // Sad. We are not getting primary constructors in C# 6.0
{
ID = id;
Name = name;
}
int ID { get; }
string Name { get; }
static void Main(string[] args)
{
var items = new[] { new P(1, "Jack"), new P(2, "Jill"), new P(3, "Peter") };
var dict = items.ToDictionary(x => x.ID, It);
}
}

This may work in the way you want. I realize Jon posted a version of this solution, but he has a second type parameter which is only necessary if the resulting sequence type is different from the source sequence type.
public static IEnumerable<T> Flatten<T>(this IEnumerable<T> source)
where T : IEnumerable<T>
{
return source.SelectMany(item => item);
}

You can get close to what you need. Instead of a regular static function, consider an extension method for your IEnumerable<T>, as if the identity function is of the collection, not the type (a collection can generate the identity function of its items):
public static Func<T, T> IdentityFunction<T>(this IEnumerable<T> enumerable)
{
return x => x;
}
with this, you don't have to specify the type again, and write:
IEnumerable<IEnumerable<T>> deepList = ... ;
var flat = deepList.SelectMany(deepList.IdentityFunction());
This does feel a bit abusive though, and I'd probably go with x=>x. Also, you cannot use it fluently (in chaining), so it will not always be useful.

I'd go with a simple class with a single static property and add as many as required down the line
internal class IdentityFunction<TSource>
{
public static Func<TSource, TSource> Instance
{
get { return x => x; }
}
}
SelectMany(IdentityFunction<Foo>.Instance)
