Split string extension with generic type?

I would like to create a Split extension that would allow me to split any string into a strongly-typed list. I have a head start, but since I am going to reuse this in many projects, I would like to get input from the community (and so you can add it to your own toolbox ;)). Any ideas from here?
public static class Converters
{
public static IEnumerable<T> Split<T>(this string source, char delimiter)
{
var type = typeof(T);
//SPLIT TO INTEGER LIST
if (type == typeof(int))
{
return source.Split(delimiter).Select(n => int.Parse(n)) as IEnumerable<T>;
}
//SPLIT TO FLOAT LIST
if (type == typeof(float))
{
return source.Split(delimiter).Select(n => float.Parse(n)) as IEnumerable<T>;
}
//SPLIT TO DOUBLE LIST
if (type == typeof(double))
{
return source.Split(delimiter).Select(n => double.Parse(n)) as IEnumerable<T>;
}
//SPLIT TO DECIMAL LIST
if (type == typeof(decimal))
{
return source.Split(delimiter).Select(n => decimal.Parse(n)) as IEnumerable<T>;
}
//SPLIT TO DATE LIST
if (type == typeof(DateTime))
{
return source.Split(delimiter).Select(n => DateTime.Parse(n)) as IEnumerable<T>;
}
//USE DEFAULT SPLIT IF NO SPECIAL CASE DEFINED
return source.Split(delimiter) as IEnumerable<T>;
}
}

I'd add a parameter for the conversion function:
public static IEnumerable<T> Split<T>(this string source, Func<string, T> converter, params char[] delimiters)
{
return source.Split(delimiters).Select(converter);
}
And you can call it as
IEnumerable<int> ints = "1,2,3".Split<int>(int.Parse, ',');
I would also consider renaming it to avoid confusion with the String.Split instance method, since sharing the name complicates overload resolution and this method behaves differently from the built-in overloads.
EDIT: If you don't want to specify the conversion function, you could use type converters:
public static IEnumerable<T> SplitConvert<T>(this string str, params char[] delimiters)
{
var converter = TypeDescriptor.GetConverter(typeof(T));
if (converter.CanConvertFrom(typeof(string)))
{
return str.Split(delimiters).Select(s => (T)converter.ConvertFromString(s));
}
else throw new InvalidOperationException("Cannot convert type");
}
This allows the conversion to be extended to other types rather than relying on a pre-defined list.

Although I agree with Lee’s suggestion, I personally don’t think it’s worth defining a new extension method for something that may trivially be achieved with standard LINQ operations:
IEnumerable<int> ints = "1,2,3".Split(',').Select(int.Parse);

public static IEnumerable<T> Split<T>
(this string source, char delimiter,Converter<string,T> func)
{
return source.Split(delimiter).Select(n => func(n));
}
Example:
"...".Split<int>(',',p=>int.Parse(p))
Or you can use Convert.ChangeType without defining a conversion function:
public static IEnumerable<T> Split<T>(this string source, char delimiter)
{
return source.Split(delimiter).Select(n => (T)Convert.ChangeType(n, typeof(T)));
}

I don't like this method. Parsing data types from strings (a sort of deserialization) is a very type- and content- sensitive process when you're dealing with data types more complex than an int. For example, DateTime.Parse parses the date using the current culture, so your method is not going to provide consistent or reliable output for dates across systems. It also tries to parse the date at all costs so it might skip through what might be considered bad input in some situations.
The goal of splitting any string into a strongly typed list cannot really be accomplished with a single method that uses hard-coded conversions, especially if your goal is broad usability, even if you update it repeatedly with new conversions. The best way to go about it is just to write "1,2,3".Split(',').Select(x => whatever), like Douglas suggested above. That also makes it very clear what sort of conversion is taking place.
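To make the culture point concrete, here is a minimal sketch of the plain Split/Select approach with the culture and format spelled out at the call site. The class and method names (SplitExamples, ParseInvariantDoubles, ParseIsoDates) and the "yyyy-MM-dd" format are assumptions for illustration, not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helpers, named for illustration only.
public static class SplitExamples
{
    // Parses "1.5, 2.25,3" the same way on every machine by pinning the
    // invariant culture instead of relying on the current one.
    public static List<double> ParseInvariantDoubles(string csv)
    {
        return csv.Split(',')
                  .Select(s => double.Parse(s.Trim(), NumberStyles.Float, CultureInfo.InvariantCulture))
                  .ToList();
    }

    // ParseExact rejects anything that is not exactly yyyy-MM-dd (an assumed
    // format), rather than guessing at the input the way DateTime.Parse does.
    public static List<DateTime> ParseIsoDates(string csv)
    {
        return csv.Split(',')
                  .Select(s => DateTime.ParseExact(s.Trim(), "yyyy-MM-dd",
                                                   CultureInfo.InvariantCulture, DateTimeStyles.None))
                  .ToList();
    }
}
```

Both helpers throw FormatException on bad input, which is usually what you want when the data is supposed to be well-formed.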

Related

Implicit Conversion to IEnumerable<T>

I have a class that is used to hold values loaded from a configuration file. To make this class easier to use, I have set up many implicit conversions to some basic types.
One of the types I would like to convert to is IEnumerable<T>. For example, if the programmer has a value like this in the config file
a = "1,2,3,4"
In C# code, he can write
IEnumerable<int> a = Config.a;
Ideally what I would like to write is this
public static implicit operator IEnumerable<T>(ConfigurationValue val)
{
string value = val;
string[] parts = value.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
List<T> convertedTypes = new List<T>();
foreach (string part in parts)
{
T converted = Convert.ChangeType(part.Trim(), typeof(T));
convertedTypes.Add(converted);
}
return convertedTypes;
}
But this is giving me syntax errors that T is undefined. Is there no way to define such a conversion or is there a special syntax for it?
Also for the record I am using C# 4.0 in the .Net Framework 4.0

"But this is giving me syntax errors that T is undefined. Is there no way to define such a conversion or is there a special syntax for it?"
You're trying to declare a generic operator - and that's not supported in C#. (I don't know whether it's supported in IL.) The same is true for constructors, properties, events and finalizers.
Basically, only methods and types can be generic.
EDIT: As noted in comments, I'd write a generic method instead. User-defined conversions - particularly implicit ones - should be used very sparingly, IMO.

Instead of a generic operator (which is impossible, as @Jon stated) you can create an extension method:
public static IEnumerable<T> AsEnumerable<T>(this string value)
{
if (String.IsNullOrEmpty(value))
yield break;
var parts = value.Split(new[] {','}, StringSplitOptions.RemoveEmptyEntries);
foreach (string part in parts)
yield return (T)Convert.ChangeType(part.Trim(), typeof(T));
}
And use it like this:
IEnumerable<int> a = Config.a.AsEnumerable<int>();

As Jon Skeet mentioned generic operators aren't supported, but you could make it a generic extension method instead.
public static class ConfigurationExtensions
{
public static IEnumerable<T> GetValues<T>(this ConfigurationValue val)
{
string value = val.Value;
string[] parts = value.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
List<T> convertedTypes = new List<T>();
foreach (string part in parts)
{
T converted = (T)Convert.ChangeType(part.Trim(), typeof(T));
convertedTypes.Add(converted);
}
return convertedTypes;
}
}

C# how to sort a list without implementing IComparable manually?

I have a fairly complex scenario and I need to ensure items I have in a list are sorted.
Firstly the items in the list are based on a struct that contains a sub struct.
For example:
public struct topLevelItem
{
public custStruct subLevelItem;
}
public struct custStruct
{
public string DeliveryTime;
}
Now I have a list comprised of topLevelItems for example:
var items = new List<topLevelItem>();
I need a way to sort on DeliveryTime, ascending. What also adds to the complexity is that the DeliveryTime field is a string. Since these structs are part of a reusable API, I can't change that field to a DateTime, nor can I implement IComparable on the topLevelItem struct.
Any ideas how this can be done?
Thank you

Create a new type that implements IComparer and use an instance of it to compare the objects.
public class topLevelItemComparer : IComparer<topLevelItem>
{
public int Compare(topLevelItem a, topLevelItem b)
{
// Compare and return here.
}
}
You can then call Sort() like this:
var items = new List<topLevelItem>();
// Fill the List
items.Sort(new topLevelItemComparer());

It sounds like you need to get canonicalized date sorting even though your date is represented as a string, yes? Well, you can use LINQ's OrderBy operator, but you will have to parse the string into a date to achieve correct results:
items = items.OrderBy(item => DateTime.Parse(item.subLevelItem.DeliveryTime))
.ToList();
Update:
I've added this in for completeness - a real example of how I use ParseExact with Invariant culture:
var returnMessagesSorted = returnMessages.OrderBy((item => DateTime.ParseExact(item.EnvelopeInfo.DeliveryTime, ISDSFunctions.GetSolutionDateTimeFormat(), CultureInfo.InvariantCulture)));
return returnMessagesSorted.ToList();
You can always implement a separate IComparer class, it's not fun, but it works well:
public class TopLevelItemComparer : IComparer<topLevelItem>
{
public int Compare( topLevelItem x, topLevelItem y )
{
return DateTime.Parse(x.subLevelItem.DeliveryTime).CompareTo(
DateTime.Parse(y.subLevelItem.DeliveryTime) );
}
}
items.Sort( new TopLevelItemComparer() );
Be aware that most Sort() methods in the .NET framework accept an IComparer or IComparer<T> which allows you to redefine the comparison semantics for any type. Normally, you just use Comparer<T>.Default - or use an overload that essentially supplies this for you.

Using LINQ:
items = items.OrderBy(item => item.subLevelItem.DeliveryTime).ToList();

If you want to perform an in-place sort then you can use the Sort overload that takes a Comparison<T> argument and pass an anonymous function/lambda:
items.Sort((x, y) => DateTime.Parse(x.subLevelItem.DeliveryTime).CompareTo(
DateTime.Parse(y.subLevelItem.DeliveryTime)));
If you prefer to create a new sorted sequence rather than an in-place sort then LINQ's OrderBy is probably the way to go, as others have already mentioned.

Having had this problem before I once implemented a LambdaComparer that did the compare based on an arbitrary lambda expression. Not exact code but something along these lines:
public class LambdaComparer<T> : IComparer<T>
{
private Func<T,T,int> _func;
public LambdaComparer(Func<T,T,int> function)
{
_func = function;
}
public int Compare(T x, T y)
{
return _func(x,y);
}
}
Big advantage of this is you get a nice reusable chunk of code.
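For what it's worth, on .NET 4.5 and later the framework ships the same idea as Comparer<T>.Create, which wraps a Comparison<T> delegate in an IComparer<T>. Below is a rough sketch applied to the structs from the question. The DeliveryTimeSorting class and the "yyyy-MM-dd HH:mm" format are assumptions, since the question never states the real format.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Structs as given in the question.
public struct custStruct { public string DeliveryTime; }
public struct topLevelItem { public custStruct subLevelItem; }

// Hypothetical holder class for a reusable comparer.
public static class DeliveryTimeSorting
{
    // Assumed format; replace with whatever the API actually emits.
    private const string Format = "yyyy-MM-dd HH:mm";

    // Comparer<T>.Create turns a lambda into an IComparer<T>.
    public static readonly IComparer<topLevelItem> ByDeliveryTime =
        Comparer<topLevelItem>.Create((x, y) => Parse(x).CompareTo(Parse(y)));

    private static DateTime Parse(topLevelItem item)
    {
        return DateTime.ParseExact(item.subLevelItem.DeliveryTime, Format,
                                   CultureInfo.InvariantCulture);
    }
}
```

Usage would be items.Sort(DeliveryTimeSorting.ByDeliveryTime); the same instance can also be passed to OrderBy as its comparer argument if you sort on the whole item.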

To sort the items list itself:
Comparison<topLevelItem> itemComparison = (x, y) => {
DateTime dx;
DateTime dy;
bool xParsed = DateTime.TryParse(x.subLevelItem.DeliveryTime, out dx);
bool yParsed = DateTime.TryParse(y.subLevelItem.DeliveryTime, out dy);
if (xParsed && yParsed)
return dx.CompareTo(dy);
else if (xParsed)
return -1; // or 1, if you want invalid strings to come first
else if (yParsed)
return 1; // or -1, if you want invalid strings to come first
else
// simple string comparison
return x.subLevelItem.DeliveryTime.CompareTo(y.subLevelItem.DeliveryTime);
};
items.Sort(itemComparison);
This approach has the advantage of:
Sorting the list in place (that is, if you actually want the list sorted in-place)
Sorting by actual DateTime values, rather than strings, BUT...
Not throwing an exception if a string does not represent a valid DateTime (basically, all the invalid strings will end up on one side of the list)

C# matching two text files, case sensitive issue

What I have is two files, sourcecolumns.txt and destcolumns.txt. What I need to do is compare source to dest and if the dest doesn't contain the source value, write it out to a new file. The code below works except I have case sensitive issues like this:
source: CPI
dest: Cpi
These don't match because of the capital letters, so I get incorrect output. Any help is always welcome!
string[] sourcelinestotal =
File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
string[] destlinestotal =
File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt");
foreach (string sline in sourcelinestotal)
{
if (destlinestotal.Contains(sline))
{
}
else
{
File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline);
}
}

You could do this using an extension method for IEnumerable<string> like:
public static class EnumerableExtensions
{
public static bool Contains( this IEnumerable<string> source, string value, StringComparison comparison )
{
if (source == null)
{
return false; // nothing is a member of the empty set
}
return source.Any( s => string.Equals( s, value, comparison ) );
}
}
then change
if (destlinestotal.Contains( sline ))
to
if (destlinestotal.Contains( sline, StringComparison.OrdinalIgnoreCase ))
However, if the sets are large and/or you are going to do this very often, the way you're going about it is very inefficient. Essentially, you're doing an O(n^2) operation: for each line in the source you compare it with, potentially, all lines in the destination. It would be better to create a HashSet from the destination columns with a case-insensitive comparer and then iterate through your source columns, checking whether each one exists in the HashSet of destination columns. This would be an O(n) algorithm. Note that Contains on the HashSet will use the comparer you provide in the constructor.
string[] sourcelinestotal =
File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
HashSet<string> destlinestotal =
new HashSet<string>(
File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt"),
StringComparer.OrdinalIgnoreCase
);
foreach (string sline in sourcelinestotal)
{
if (!destlinestotal.Contains(sline))
{
File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline);
}
}
In retrospect, I actually prefer this solution over simply writing your own case insensitive contains for IEnumerable<string> unless you need the method for something else. There's actually less code (of your own) to maintain by using the HashSet implementation.
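If you only need the list of missing names, LINQ's Except with a comparer gets you the same set difference in one call (it builds a hash set internally). A small sketch follows; ColumnDiff and MissingFrom are made-up names, and note that Except returns distinct values, so a name repeated in the source appears once in the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class ColumnDiff
{
    // Returns the source entries that have no case-insensitive match in dest.
    public static List<string> MissingFrom(IEnumerable<string> source, IEnumerable<string> dest)
    {
        return source.Except(dest, StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

You could then pass the result of MissingFrom(sourcelinestotal, destlinestotal) to File.WriteAllLines, which writes each missing name on its own line.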

Use an extension method for your Contains. A brilliant example was found here on Stack Overflow. The code isn't mine, but I'll post it below.
public static bool Contains(this string source, string toCheck, StringComparison comp)
{
return source.IndexOf(toCheck, comp) >= 0;
}
string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);

If you do not need case sensitivity, convert your lines to upper case using string.ToUpper before comparison.

Casting C# out parameters?

Is it possible to cast out param arguments in C#? I have:
Dictionary<string,object> dict; // but I know all values are strings
string key, value;
Roughly speaking (and if I didn't have static typing) I want to do:
dict.TryGetValue(key, out value);
but this obviously won't compile because it "cannot convert from 'out string' to 'out object'".
The workaround I'm using is:
object valueAsObject;
dict.TryGetValue(key, out valueAsObject);
value = (string) valueAsObject;
but that seems rather awkward.
Is there any kind of language feature to let me cast an out param in the method call, so it does this switcheroo for me? I can't figure out any syntax that'll help, and I can't seem to find anything with google.

I don't know if it is a great idea, but you could add a generic extension method:
static bool TryGetTypedValue<TKey, TValue, TActual>(
this IDictionary<TKey, TValue> data,
TKey key,
out TActual value) where TActual : TValue
{
if (data.TryGetValue(key, out TValue tmp))
{
value = (TActual)tmp;
return true;
}
value = default(TActual);
return false;
}
static void Main()
{
Dictionary<string,object> dict
= new Dictionary<string,object>();
dict.Add("abc","def");
string key = "abc", value;
dict.TryGetTypedValue(key, out value);
}

I spy with my little eye an old post that was still active a month ago...
Here's what you do:
public static class DictionaryExtensions
{
public static bool TryGetValueAs<Key, Value, ValueAs>(this IDictionary<Key, Value> dictionary, Key key, out ValueAs valueAs) where ValueAs : Value
{
if(dictionary.TryGetValue(key, out Value value))
{
valueAs = (ValueAs)value;
return true;
}
valueAs = default;
return false;
}
}
And because compilers are great, you can just call it like this:
dict.TryGetValueAs(key, out bool valueAs); // All generic types are filled in implicitly! :D
But say you're not creating a blackboard AI and just need to call this operation the one time. You can simply do a quicksedoodle inliner like this:
var valueAs = dict.TryGetValue(key, out var value) ? (bool)value : default;
I know these answers have been given already, but they must be pretty old because there is no cool hip modern inlining going on to condense these methods to the size we really want: no more than 1 line.
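One caveat with the inline cast is that it throws InvalidCastException if the stored value is not of the expected type. A variation using a type pattern (C# 7) avoids that; this is only a sketch for the Dictionary<string, object> case from the question, and the names DictionaryLookup and TryGetString are invented here.

```csharp
using System.Collections.Generic;

// Hypothetical helper, named for illustration only.
public static class DictionaryLookup
{
    // Returns true only when the key exists AND the stored value is a string.
    public static bool TryGetString(IDictionary<string, object> dict, string key, out string value)
    {
        if (dict.TryGetValue(key, out var raw) && raw is string s)
        {
            value = s;
            return true;
        }
        value = null;
        return false;
    }
}
```

The same condition works inline in an if statement when you don't want a helper: if (dict.TryGetValue(key, out var raw) && raw is string s) { ... }.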

I used Marc's extension method but added a bit to it.
My problem with the original was that in some cases my dictionary would contain an Int64 whereas I would expect an Int32. In other cases the dictionary would contain a string (for example "42") while I would like to get it as an int.
There is no way to handle conversion in Marc's method so I added the ability to pass in a delegate to a conversion method:
internal static bool TryGetTypedValue<TKey, TValue, TActual>(
this IDictionary<TKey, TValue> data,
TKey key,
out TActual value, Func<TValue, TActual> converter = null) where TActual : TValue
{
TValue tmp;
if (data.TryGetValue(key, out tmp))
{
if (converter != null)
{
value = converter(tmp);
return true;
}
if (tmp is TActual)
{
value = (TActual) tmp;
return true;
}
value = default(TActual);
return false;
}
value = default(TActual);
return false;
}
Which you can call like this:
int limit;
myParameters.TryGetTypedValue("limitValue", out limit, Convert.ToInt32);

No, there is no way around that. The out parameter must have a variable that matches exactly.
Using a string reference is not safe, as the dictionary can contain other things than strings. However if you had a dictionary of strings and tried to use an object variable in the TryGetValue call, that won't work either even though that would be safe. The variable type has to match exactly.

If you know all values are strings use Dictionary<string, string> instead. The out parameter type is set by the type of the second generic type parameter. Since yours is currently object, it will return an object when retrieving from the dictionary. If you change it to string, it will return strings.

No, you can't. The code inside the method is directly modifying the variable passed to it, it is not passed a copy of the content of the variable.

It is possible by using the Unsafe.As<TFrom, TTo>(ref TFrom source) method to do the cast inline.
var dict = new Dictionary<string, int>
{
["one"] = 1,
["two"] = 2,
["three"] = 3,
};
long result = 0;
dict.TryGetValue("two", out Unsafe.As<long, int>(ref result));
Depending on which platform you are on, this may require you to add a reference to System.Runtime.CompilerServices.Unsafe.

How to determine if a string is a number in C#

I am working on a tool where I need to convert string values to their proper object types. E.g. convert a string like "2008-11-20T16:33:21Z" to a DateTime value. Numeric values like "42" and "42.42" must be converted to an Int32 value and a Double value respectively.
What is the best and most efficient approach to detect if a string is an integer or a number? Are Int32.TryParse or Double.TryParse the way to go?

Int32.TryParse and Double.TryParse have the benefit of actually returning the number.
Something like Regex.IsMatch(s, @"^\d+$") has the drawback that you still have to parse the string again to get the value out.
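As a rough illustration of chaining TryParse calls for the cases in the question, the sketch below tries the narrowest type first and falls back to the original string. ValueDetector and Detect are hypothetical names, and pinning the invariant culture is an assumption about the input (a "." decimal separator and ISO-style dates).

```csharp
using System;
using System.Globalization;

// Hypothetical helper, named for illustration only.
public static class ValueDetector
{
    // Tries int, then double, then DateTime; returns the input unchanged
    // when none of them accept it.
    public static object Detect(string text)
    {
        if (int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out int i))
            return i;
        if (double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out double d))
            return d;
        // RoundtripKind keeps a trailing "Z" as UTC instead of converting to local time.
        if (DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind, out DateTime dt))
            return dt;
        return text;
    }
}
```

The order matters: "42" also parses as a double, so the integer check has to come first. Integers too large for Int32 fall through to double here; add a long.TryParse step if that distinction matters to you.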

In terms of efficiency, yes, TryParse is generally the preferred route.
If you can know (for example, by reflection) the target type in advance - but don't want to have to use a big switch block, you might be interested in using TypeConverter - for example:
DateTime foo = new DateTime(2008, 11, 20);
TypeConverter converter = TypeDescriptor.GetConverter(foo);
string s = converter.ConvertToInvariantString(foo);
object val = converter.ConvertFromInvariantString(s);

I would recommend the .TryParse() personally. That's what I use anyhow. That's if your data is going to be wrong now and again. If you're certain the incoming strings will be able to convert to integers or doubles without a hitch, the .Parse() is faster.

Keeping the idea of a converter to skip a switch block, you could use the concept of Duck Typing. Basically, you want to turn a string to X, so you make a method that will call X.TryParse(string, out X x) if X has TryParse on it, otherwise you just don't bother (Or I suppose you could throw an error). How would you do this? Reflection and Generics.
Basically you would have a method that would take in a type and use reflection to see if it has TryParse on it. If you find such a method you then call it and return whatever TryParse managed to get. This works well with just about any value type like say Decimal or DateTime.
public static class ConvertFromString
{
public static T? ConvertTo<T>(this String numberToConvert) where T : struct
{
T? returnValue = null;
MethodInfo neededInfo = GetCorrectMethodInfo(typeof(T));
if (neededInfo != null && !String.IsNullOrEmpty(numberToConvert))
{
T output = default(T);
object[] paramsArray = new object[2] { numberToConvert, output };
returnValue = new T();
object returnedValue = neededInfo.Invoke(returnValue.Value, paramsArray);
if (returnedValue is Boolean && (Boolean)returnedValue)
{
returnValue = (T)paramsArray[1];
}
else
{
returnValue = null;
}
}
return returnValue;
}
}
Where GetCorrectMethodInfo would look something like this:
private static MethodInfo GetCorrectMethodInfo(Type typeToCheck)
{
MethodInfo returnValue = someCache.Get(typeToCheck.FullName);
if(returnValue == null)
{
Type[] paramTypes = new Type[2] { typeof(string), typeToCheck.MakeByRefType() };
returnValue = typeToCheck.GetMethod("TryParse", paramTypes);
if (returnValue != null)
{
someCache.Add(typeToCheck.FullName, returnValue);
}
}
return returnValue;
}
And use would be:
decimal? converted = someString.ConvertTo<decimal>();
