Advanced string formatter with pattern support in C#

I would like to have a flexible template that can translate cases similar to:
WHnnn => WH001, WH002, WH003... (nnn is just a number indicated 3 digits)
INVyyyyMMdd => INV20220228
ORDERyyyyMMdd-nnn => ORDER20220228-007
I know that I can use the following code to achieve a specific template:
string.Format("INV{0:yyyyMMdd}", DateTime.Now)
This gives the same result as case 2 above, but it's not flexible: the customer may customize their own template, as long as I can understand and support it, as in the third case above.
I know even for the third case, I can do something like this:
string.Format("ORDER{0:yyyyMMdd}-{1:d3}", DateTime.Now, 124)
But that's clumsy, as I would like the template (input) to be just like this:
ORDERyyyyMMdd-nnn
The requirement is to support all the supported patterns by string.Format in C#, but the template can be any combination of those patterns.

I would probably use a custom formatter for this case.
Create a new class that contains the date/time and the number and implements the IFormattable interface.
One tip: use an internal format such as INV{nnn} or INV[nnn], where only the part inside {} or [] is replaced with the value.
Otherwise there could be unwanted replacements: for example, Inv contains an 'n', so you could get output such as I7v.
In your examples the N in INV is upper case, but will that still be the case after every customisation?
Code (simplified version):
internal sealed class InvoiceNumberInfo : IFormattable
{
private static readonly Regex formatMatcher = new Regex(@"^(?<before>.*?)\[(?<code>\w+?)\](?<after>.*)$");
private readonly DateTime date;
private readonly int number;
public InvoiceNumberInfo(DateTime date, int number)
{
this.date = date;
this.number = number;
}
public string ToString(string format, IFormatProvider formatProvider)
{
var output = format;
while (true)
{
var match = formatMatcher.Match(output);
if (!match.Success)
{
return output;
}
output = match.Groups["before"].Value + FormatValue(match.Groups["code"].Value) + match.Groups["after"].Value;
}
}
private string FormatValue(string code)
{
if (code[0] == 'n')
{
var numberFormat = "D" + code.Length.ToString(CultureInfo.InvariantCulture);
return this.number.ToString(numberFormat);
}
else
{
return this.date.ToString(code);
}
}
}
Use:
internal static class Program
{
public static void Main(string[] args)
{
if (args.Length == 0)
{
Console.WriteLine("No format to display");
return;
}
var inv = new InvoiceNumberInfo(DateTime.Now, number: 7);
foreach (string format in args)
{
Console.WriteLine("Format: '{0}' is formatted as {1:" + format + "}", format, inv);
}
}
}
And output:
Format: 'WH[nnn]' is formatted as WH007
Format: 'INV[yyyyMMdd]' is formatted as INV20220227
Format: 'ORDER[yyyyMMdd]-[nnn]' is formatted as ORDER20220227-007
NOTE This is only a simplified version, proof of concept. Use proper error checking for your code.
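The bracket-token idea above can also be sketched with Regex.Replace and a match evaluator instead of a loop. This is only an illustration under a few assumptions: the names BracketTemplate and Format are invented for this example, a token consisting only of 'n' characters is taken to mean the zero-padded number, and every other token is handed to DateTime.ToString as a custom format string.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answer above.
public static class BracketTemplate
{
    // Matches one [token]; the token is a run of word characters, e.g. nnn or yyyyMMdd.
    private static readonly Regex Token = new Regex(@"\[(\w+)\]");

    public static string Format(string template, DateTime date, int number)
    {
        return Token.Replace(template, match =>
        {
            string code = match.Groups[1].Value;

            // Assumption: a token made only of 'n' characters is the zero-padded number.
            if (code.Trim('n').Length == 0)
            {
                return number.ToString("D" + code.Length, CultureInfo.InvariantCulture);
            }

            // Anything else is passed through as a DateTime format string.
            return date.ToString(code, CultureInfo.InvariantCulture);
        });
    }
}
```

Called as BracketTemplate.Format("ORDER[yyyyMMdd]-[nnn]", DateTime.Now, 7), this replaces each token in a single pass, so text produced by one replacement is never scanned again for further tokens. Tokens containing characters other than letters, digits, or underscores (for example [yyyy-MM-dd]) are not matched by this pattern and are left as they are.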

Related

DateTime format string for invalid Format string

I am trying to format Datetime string using
date.ToString(format)
If user feeds in wrong format, e.g. "YYYY MM DDr" I would like to know whether I can convert datetime using that format, rather than returning
2015 04 DDr
since
DateTime.ToString(format)
always returns a valid String.
For example, is there any method that throws an exception on a failed conversion, so that I can catch it and decide not to display my output string instead of displaying something like
2015 04 DDr
If you assume that all the letters inserted in your format are either separators or letters that should be converted to a DatePart, you can check if after converting the date you still have non separator chars that were not converted, as follows:
public static class DateTimeExtension
{
public static string ToStringExt(this DateTime p_Date, String format)
{
char[] separators = { ' ', '/', '-' };
String stringDate = p_Date.ToString(format);
foreach (char dateChar in format)
{
if (stringDate.Contains(dateChar) && !separators.Contains(dateChar))
{
throw new FormatException("Format Error");
}
}
return stringDate;
}
}
Edited after @Vladimir Mezentsev's observation:
This code assumes that you are converting only to Numbers, if you are doing something that will convert to Day strings like Tuesday, the logic may fail. To address this scenario the code would get a little more complicated but can also be achieved with something like this:
public static string ToStringExt(this DateTime p_Date, String format)
{
foreach (string dateFormatPart in getFormatStrings(format))
{
if (p_Date.ToString(dateFormatPart) == dateFormatPart)
{
throw new FormatException("Format Error");
}
}
return p_Date.ToString(format);
}
private static IEnumerable<string> getFormatStrings(String format)
{
char[] separators = { ' ', '/', '-' };
StringBuilder builder = new StringBuilder();
char previous = format[0];
foreach (char c in format)
{
if (separators.Contains(c) || c != previous)
{
string formatPart = builder.ToString();
if (!String.IsNullOrEmpty(formatPart))
{
yield return formatPart;
builder.Clear();
}
}
if(!separators.Contains(c))
{
builder.Append(c);
}
previous = c;
}
if (builder.Length > 0)
yield return builder.ToString();
}
Have a look at https://msdn.microsoft.com/en-us/library/8kb3ddd4(v=vs.110).aspx. Particularly this part...
If you're wanting to validate the string used to format the DateTime object, then you'll probably have to write your own using the link provided to know what formats are acceptable, and treat any other characters as errors.
Strictly speaking there is no invalid format, because a string produced with a given format can be parsed back with exactly the same format. And even if the round trip loses no part of the date, that alone cannot decide whether the format is valid or invalid.
You should carefully consider what is appropriate for the user; you might even let them construct the format from predefined blocks, or show a sample conversion and ask for confirmation.
For specific formats you can create some extension method where you can apply your business rules and throw exceptions when you need it.
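The "predefined blocks" idea could be enforced with a whitelist check before calling ToString. The sketch below is a minimal illustration: DateFormatValidator and IsAllowed are names invented for this example, and the sets of allowed blocks and separators are assumptions that each application would choose for itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the allowed blocks and separators are assumptions.
public static class DateFormatValidator
{
    private static readonly HashSet<string> AllowedBlocks = new HashSet<string>
    {
        "yyyy", "yy", "MM", "dd", "HH", "mm", "ss"
    };

    private static readonly HashSet<char> Separators = new HashSet<char>
    {
        ' ', '/', '-', ':', '.'
    };

    public static bool IsAllowed(string format)
    {
        if (string.IsNullOrEmpty(format)) return false;

        int i = 0;
        while (i < format.Length)
        {
            char c = format[i];
            if (Separators.Contains(c)) { i++; continue; }

            // Collect a run of the same character, e.g. "yyyy".
            int start = i;
            while (i < format.Length && format[i] == c) i++;

            // Reject any run that is not one of the allowed blocks.
            if (!AllowedBlocks.Contains(format.Substring(start, i - start))) return false;
        }
        return true;
    }
}
```

With these sets, "yyyy MM dd" is accepted, while "YYYY MM DDr" is rejected at the first run ("YYYY"), so the caller can throw or show an error before any output is produced.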

Enum and string match

I'm essentially trying to read an XML file. One of the values has a suffix, e.g. "30d", which means '30 days'. So I'm trying to convert this to DateTime.Now.AddDays(30). To read this field in the XML, I decided to use an enum:
enum DurationType { Min = "m", Hours = "h", Days = "d" }
Now I'm not exactly sure how exactly to approach this efficiently (I'm a little daft when it comes to enums). Should I separate the suffix, in this case "d", out of the string first, then try and match it in the enum using a switch statement?
I guess if you dumb down my question, it'd be: What's the best way to get from 30d, to DateTime.Now.AddDays(30) ?
You could make an ExtensionMethod to parse the string and return the DateTime you want
Something like:
public static DateTime AddDuration(this DateTime datetime, string str)
{
int value = 0;
int multiplier = str.EndsWith("d") ? 1440 : str.EndsWith("h") ? 60 : 1;
if (int.TryParse(str.TrimEnd(new char[]{'m','h','d'}), out value))
{
return datetime.AddMinutes(value * multiplier);
}
return datetime;
}
Usage:
var date = DateTime.Now.AddDuration("2d");
This seems like a good place to use regular expressions; specifically, capture groups.
Below is a working example:
using System;
using System.Text.RegularExpressions;
namespace RegexCaptureGroups
{
class Program
{
// Below is a breakdown of this regular expression:
// First, one or more digits followed by "d" or "D" to represent days.
// Second, one or more digits followed by "h" or "H" to represent hours.
// Third, one or more digits followed by "m" or "M" to represent minutes.
// Each component can be separated by any number of spaces, or none.
private static readonly Regex DurationRegex = new Regex(@"((?<Days>\d+)d)?\s*((?<Hours>\d+)h)?\s*((?<Minutes>\d+)m)?", RegexOptions.IgnoreCase);
public static TimeSpan ParseDuration(string input)
{
var match = DurationRegex.Match(input);
var days = match.Groups["Days"].Value;
var hours = match.Groups["Hours"].Value;
var minutes = match.Groups["Minutes"].Value;
int daysAsInt32, hoursAsInt32, minutesAsInt32;
if (!int.TryParse(days, out daysAsInt32))
daysAsInt32 = 0;
if (!int.TryParse(hours, out hoursAsInt32))
hoursAsInt32 = 0;
if (!int.TryParse(minutes, out minutesAsInt32))
minutesAsInt32 = 0;
return new TimeSpan(daysAsInt32, hoursAsInt32, minutesAsInt32, 0);
}
static void Main(string[] args)
{
Console.WriteLine(ParseDuration("30d"));
Console.WriteLine(ParseDuration("12h"));
Console.WriteLine(ParseDuration("20m"));
Console.WriteLine(ParseDuration("1d 12h"));
Console.WriteLine(ParseDuration("5d 30m"));
Console.WriteLine(ParseDuration("1d 12h 20m"));
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
}
}
EDIT: Below is an alternative, slightly more condensed version of the above, though I'm not sure which one I prefer more. I'm usually not a fan of overly dense code.
I adjusted the regular expression to put a limit of 10 digits on each number. This allows me to safely use the int.Parse function, because I know that the input consists of at least one digit and at most ten (unless it didn't capture at all, in which case it would be empty string: hence, the purpose of the ParseInt32ZeroIfNullOrEmpty method).
// Below is a breakdown of this regular expression:
// First, one to ten digits followed by "d" or "D" to represent days.
// Second, one to ten digits followed by "h" or "H" to represent hours.
// Third, one to ten digits followed by "m" or "M" to represent minutes.
// Each component can be separated by any number of spaces, or none.
private static readonly Regex DurationRegex = new Regex(@"((?<Days>\d{1,10})d)?\s*((?<Hours>\d{1,10})h)?\s*((?<Minutes>\d{1,10})m)?", RegexOptions.IgnoreCase);
private static int ParseInt32ZeroIfNullOrEmpty(string input)
{
return string.IsNullOrEmpty(input) ? 0 : int.Parse(input);
}
public static TimeSpan ParseDuration(string input)
{
var match = DurationRegex.Match(input);
return new TimeSpan(
ParseInt32ZeroIfNullOrEmpty(match.Groups["Days"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Hours"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Minutes"].Value),
0);
}
EDIT: Just to take this one more step, I've added another version below, which handles days, hours, minutes, seconds, and milliseconds, with a variety of abbreviations for each. I split the regular expression into multiple lines for readability. Note, I also had to adjust the expression by using (\b|(?=[^a-z])) at the end of each component: this is because the "ms" unit was being captured as the "m" unit. The special syntax of "?=" used with "[^a-z]" indicates to match the character but not to "consume" it.
// Below is a breakdown of this regular expression:
// First, one to ten digits followed by "d", "dy", "dys", "day", or "days".
// Second, one to ten digits followed by "h", "hr", "hrs", "hour", or "hours".
// Third, one to ten digits followed by "m", "min", "minute", or "minutes".
// Fourth, one to ten digits followed by "s", "sec", "second", or "seconds".
// Fifth, one to ten digits followed by "ms", "msec", "millisec", "millisecond", or "milliseconds".
// Each component may be separated by any number of spaces, or none.
// The expression is case-insensitive.
private static readonly Regex DurationRegex = new Regex(@"
((?<Days>\d{1,10})(d|dy|dys|day|days)(\b|(?=[^a-z])))?\s*
((?<Hours>\d{1,10})(h|hr|hrs|hour|hours)(\b|(?=[^a-z])))?\s*
((?<Minutes>\d{1,10})(m|min|minute|minutes)(\b|(?=[^a-z])))?\s*
((?<Seconds>\d{1,10})(s|sec|second|seconds)(\b|(?=[^a-z])))?\s*
((?<Milliseconds>\d{1,10})(ms|msec|millisec|millisecond|milliseconds)(\b|(?=[^a-z])))?",
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
private static int ParseInt32ZeroIfNullOrEmpty(string input)
{
return string.IsNullOrEmpty(input) ? 0 : int.Parse(input);
}
public static TimeSpan ParseDuration(string input)
{
var match = DurationRegex.Match(input);
return new TimeSpan(
ParseInt32ZeroIfNullOrEmpty(match.Groups["Days"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Hours"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Minutes"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Seconds"].Value),
ParseInt32ZeroIfNullOrEmpty(match.Groups["Milliseconds"].Value));
}
update:
Don't vote for this. I'm leaving it simply because it's an alternative approach. Instead look at sa_ddam213 and Dr. Wily's Apprentice's answers.
Should I separate the suffix, in this case "d", out of the string
first, then try and match it in the enum using a switch statement?
Yes.
For a fully working example:
private void button1_Click( object sender, EventArgs e ) {
String value = "30d";
Duration d = (Duration)Enum.Parse(typeof(Duration), value.Substring(value.Length - 1, 1).ToUpper());
DateTime result = d.From(new DateTime(), value);
MessageBox.Show(result.ToString());
}
enum Duration { D, W, M, Y };
static class DurationExtensions {
public static DateTime From( this Duration duration, DateTime dateTime, Int32 period ) {
switch (duration)
{
case Duration.D: return dateTime.AddDays(period);
case Duration.W: return dateTime.AddDays((period*7));
case Duration.M: return dateTime.AddMonths(period);
case Duration.Y: return dateTime.AddYears(period);
default: throw new ArgumentOutOfRangeException("duration");
}
}
public static DateTime From( this Duration duration, DateTime dateTime, String fullValue ) {
Int32 period = Convert.ToInt32(fullValue.ToUpper().Replace(duration.ToString(), String.Empty));
return From(duration, dateTime, period);
}
}
I really don't see how using an enum helps here.
Here's how I might approach it.
string s = "30d";
DateTime result = DateTime.Now;
int typeIndex = s.IndexOfAny(new char[] { 'd', 'w', 'm' });
if (typeIndex > 0)
{
int value = int.Parse(s.Substring(0, typeIndex));
switch (s[typeIndex])
{
case 'd':
result = DateTime.Now.AddDays(value);
break;
case 'w':
result = DateTime.Now.AddDays(value * 7);
break;
case 'm':
result = DateTime.Now.AddMonths(value);
break;
}
}
Depending on the reliability of your input data, you might need to use int.TryParse() instead of int.Parse(). Otherwise, this should be all you need.
Note: I've also written a sscanf() replacement for .NET that would handle this quite easily. You can see the code for that in the article A sscanf() Replacement for .NET.
Try the following code, assuming that values like "30d" are in a string 'val'.
DateTime ConvertValue(string val) {
if (val.Length > 0) {
// Everything except the last character is the numeric amount.
int prefix = Convert.ToInt32(val.Remove(val.Length - 1));
switch (val[val.Length - 1]) {
case 'd': return DateTime.Now.AddDays(prefix);
case 'm': return DateTime.Now.AddMonths(prefix);
// etc.
}
}
throw new ArgumentException("string in unexpected format.");
}
A console application example:
enum DurationType
{
[DisplayName("m")]
Min = 1,
[DisplayName("h")]
Hours = 1 * 60,
[DisplayName("d")]
Days = 1 * 60 * 24
}
internal class Program
{
private static void Main(string[] args)
{
string input1 = "10h";
string input2 = "1d10h3m";
var x = GetOffsetFromDate(DateTime.Now, input1);
var y = GetOffsetFromDate(DateTime.Now, input2);
}
private static Dictionary<string, DurationType> suffixDictionary
{
get
{
return Enum
.GetValues(typeof (DurationType))
.Cast<DurationType>()
.ToDictionary(duration => duration.GetDisplayName(), duration => duration);
}
}
public static DateTime GetOffsetFromDate(DateTime date, string input)
{
MatchCollection matches = Regex.Matches(input, @"(\d+)([a-zA-Z]+)");
foreach (Match match in matches)
{
int numberPart = Int32.Parse(match.Groups[1].Value);
string suffix = match.Groups[2].Value;
// Multiply the parsed amount by the number of minutes per unit.
date = date.AddMinutes(numberPart * (int)suffixDictionary[suffix]);
}
return date;
}
}
[AttributeUsage(AttributeTargets.Field)]
public class DisplayNameAttribute : Attribute
{
public DisplayNameAttribute(String name)
{
this.name = name;
}
protected String name;
public String Name { get { return this.name; } }
}
public static class ExtensionClass
{
public static string GetDisplayName<TValue>(this TValue value) where TValue : struct, IConvertible
{
FieldInfo fi = typeof(TValue).GetField(value.ToString());
DisplayNameAttribute attribute = (DisplayNameAttribute)fi.GetCustomAttributes(typeof(DisplayNameAttribute), false).FirstOrDefault();
if (attribute != null)
return attribute.Name;
return value.ToString();
}
}
Uses an attribute to define your suffix, uses the enum value to define your offset.
Requires:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.RegularExpressions;
It may be considered a hack to use the enum integer value but this example will still let you parse out all the Enums (for any other use like switch case) with little tweaks.
Enums can't be backed with non-numeric types, so string-based enums are out. It's possible you may be overthinking it. Without knowing any more about the problem, the most straightforward solution seems to be splitting off the last character, converting the rest to an int, and then handling each final char as a separate case.
I'd suggest using a regexp to strip the number first and then executing the Enum.Parse method to evaluate the value of the enum. Then you can use a switch (see Corylulu's answer) to get the right offset, based on the parsed number and enum value.
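The regexp-plus-Enum.Parse suggestion might look roughly like the sketch below. The names DurationUnit, DurationParser and TryAddDuration are invented for this example, and it assumes the only units are m, h and d, each preceded by one to nine ASCII digits.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical names; M = minutes, H = hours, D = days.
public enum DurationUnit { M, H, D }

public static class DurationParser
{
    // One to nine ASCII digits followed by exactly one unit letter, e.g. "30d".
    private static readonly Regex Pattern =
        new Regex(@"^([0-9]{1,9})([mhd])$", RegexOptions.IgnoreCase);

    public static bool TryAddDuration(DateTime start, string input, out DateTime result)
    {
        result = start;
        Match match = Pattern.Match(input ?? string.Empty);
        if (!match.Success) return false;

        // At most nine digits, so int.Parse cannot overflow here.
        int amount = int.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);

        // ignoreCase: true lets "d" map to DurationUnit.D.
        var unit = (DurationUnit)Enum.Parse(typeof(DurationUnit), match.Groups[2].Value, true);

        switch (unit)
        {
            case DurationUnit.M: result = start.AddMinutes(amount); break;
            case DurationUnit.H: result = start.AddHours(amount); break;
            case DurationUnit.D: result = start.AddDays(amount); break;
        }
        return true;
    }
}
```

A call such as DurationParser.TryAddDuration(DateTime.Now, "30d", out var due) returns false for input it does not recognise instead of throwing. Very large amounts can still push the result outside the DateTime range, in which case the Add methods throw ArgumentOutOfRangeException.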

How to detect if string is currency in c#

Usually when I need to convert a currency string (like 1200,55 zł or $1,249) to a decimal value, I do it like this:
if (currencyString.Contains("zł")) {
decimal value = Decimal.Parse(currencyString.Trim(), NumberStyles.Number | NumberStyles.AllowCurrencySymbol);
}
Is there a way to check if string is currency without checking for specific currency?
If you just do the conversion (NumberStyles.Number already includes AllowThousands and AllowDecimalPoint, so no extra flags are needed), then if the string contains the wrong currency symbol for the current culture the parse will fail, in this case by raising an exception. If it contains no currency symbol the parse will still work.
You can therefore use TryParse to allow for this and test for failure.
If your input can be any currency, you can use the overload of TryParse that takes an IFormatProvider argument, with which you can specify the culture-specific parsing information about the string. So if the parse fails for the default UI culture, you can loop through each of your supported cultures and try again. When you find one that works, you've got both your number and the type of currency it is (Zloty, US Dollar, Euro, Rouble etc.)
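The loop over supported cultures could be sketched as follows. CurrencyParser and TryParseCurrency are names invented for this example, and the list of cultures is supplied by the caller because which cultures to accept is an application decision.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper illustrating the "try each culture" loop.
public static class CurrencyParser
{
    public static bool TryParseCurrency(
        string text,
        IEnumerable<CultureInfo> cultures,
        out decimal value,
        out CultureInfo matched)
    {
        string trimmed = (text ?? string.Empty).Trim();
        foreach (CultureInfo culture in cultures)
        {
            // NumberStyles.Currency allows the culture's currency symbol,
            // thousands separators and a decimal point.
            if (decimal.TryParse(trimmed, NumberStyles.Currency, culture, out value))
            {
                matched = culture;
                return true;
            }
        }

        value = 0m;
        matched = null;
        return false;
    }
}
```

A caller might pass, for example, new[] { CultureInfo.GetCultureInfo("en-US"), CultureInfo.GetCultureInfo("pl-PL") }. The first culture that parses the text wins, so the order of the list matters, and a plain number without any symbol also parses successfully. A match therefore means "this is a valid amount under that culture", not "this string contained a currency symbol".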
As I understand it, it's better to do:
decimal value = -1;
if (Decimal.TryParse(dataToCheck.Trim(), NumberStyles.Number |
NumberStyles.AllowCurrencySymbol, currentCulture, out value))
{
// do something
}
See Jeff Atwood's description of TryParse. It doesn't throw an exception, and it is much faster than Parse when the input is invalid.
To check if a string is a currency amount that would be used for entering wages - I used this:
public bool TestIfWages(string wages)
{
Regex regex = new Regex(@"^\d*\.?\d?\d?$");
bool y = regex.IsMatch(wages);
return y;
}
You might try searching the string for what you think is a currency symbol, then looking it up in a dictionary to see if it really is a currency symbol. I would just look at the beginning of the string and the end of the string and pick out anything that's not a digit, then that's what you look up. (If there's stuff at both ends then I think you can assume it's not a currency.)
The advantage to this approach is that you only have to scan the string once, and you don't have to test separately for each currency.
Here's an example of what I had in mind, although it could probably use some refinement:
class Program
{
private static ISet<string> _currencySymbols = new HashSet<string>() { "$", "zł", "€", "£" };
private static bool StringIsCurrency(string str)
{
// Scan the beginning of the string until you get to the first digit
for (int i = 0; i < str.Length; i++)
{
if (char.IsDigit(str[i]))
{
if (i == 0)
{
break;
}
else
{
return StringIsCurrencySymbol(str.Substring(0, i).TrimEnd());
}
}
}
// Scan the end of the string until you get to the last digit
for (int i = 0, pos = str.Length - 1; i < str.Length; i++, pos--)
{
if (char.IsDigit(str[pos]))
{
if (i == 0)
{
break;
}
else
{
return StringIsCurrencySymbol(str.Substring(pos + 1, str.Length - pos - 1).TrimStart());
}
}
}
// No currency symbol found
return false;
}
private static bool StringIsCurrencySymbol(string symbol)
{
return _currencySymbols.Contains(symbol);
}
static void Main(string[] args)
{
Test("$1000.00");
Test("500 zł");
Test("987");
Test("book");
Test("20 €");
Test("99£");
}
private static void Test(string testString)
{
Console.WriteLine(testString + ": " + StringIsCurrency(testString));
}
}

String tokens in .NET

I am writing a app in .NET which will generate random text based on some input. So if I have text like "I love your {lovely|nice|great} dress" I want to choose randomly from lovely/nice/great and use that in text. Any suggestions in C# or VB.NET are welcome.
You could do it using a regex to make a replacement for each {...}. The Regex.Replace function can take a MatchEvaluator which can do the logic for selecting a random value from the choices:
Random random = new Random();
string s = "I love your {lovely|nice|great} dress";
s = Regex.Replace(s, @"\{(.*?)\}", match => {
string[] options = match.Groups[1].Value.Split('|');
int index = random.Next(options.Length);
return options[index];
});
Console.WriteLine(s);
Example output:
I love your lovely dress
Update: Translated to VB.NET automatically using .NET Reflector:
Dim random As New Random
Dim s As String = "I love your {lovely|nice|great} dress"
s = Regex.Replace(s, "\{(.*?)\}", Function (ByVal match As Match)
Dim options As String() = match.Groups.Item(1).Value.Split(New Char() { "|"c })
Dim index As Integer = random.Next(options.Length)
Return options(index)
End Function)
This may be a bit of an abuse of the custom formatting functionality available through the ICustomFormatter and IFormatProvider interfaces, but you could do something like this:
public class ListSelectionFormatter : IFormatProvider, ICustomFormatter
{
#region IFormatProvider Members
public object GetFormat(Type formatType)
{
if (typeof(ICustomFormatter).IsAssignableFrom(formatType))
return this;
else
return null;
}
#endregion
#region ICustomFormatter Members
public string Format(string format, object arg, IFormatProvider formatProvider)
{
string[] values = format.Split('|');
if (values == null || values.Length == 0)
throw new FormatException("The format is invalid. At least one value must be specified.");
if (arg is int)
return values[(int)arg];
else if (arg is Random)
return values[(arg as Random).Next(values.Length)];
else if (arg is ISelectionPicker)
return (arg as ISelectionPicker).Pick(values);
else
throw new FormatException("The argument is invalid.");
}
#endregion
}
public interface ISelectionPicker
{
string Pick(string[] values);
}
public class RandomSelectionPicker : ISelectionPicker
{
Random rng = new Random();
public string Pick(string[] values)
{
// use whatever logic is desired here to choose the correct value
return values[rng.Next(values.Length)];
}
}
class Stuff
{
public static void DoStuff()
{
RandomSelectionPicker picker = new RandomSelectionPicker();
string result = string.Format(new ListSelectionFormatter(), "I am feeling {0:funky|great|lousy}. I should eat {1:a banana|cereal|cardboard}.", picker, picker);
}
}
String.Format("static text {0} more text {1}", randomChoice0, randomChoice1);
Write a simple parser that gets the information in the braces, splits it with string.Split, picks a random index into that array, and builds up the string again.
Use a StringBuilder to build the result, because repeated string concatenation performs poorly.
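A minimal sketch of such a hand-written parser is shown below. The names Spinner and Spin are invented for this example, and it assumes groups are not nested and that an unmatched brace should simply be copied through unchanged.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating the parser described above.
public static class Spinner
{
    public static string Spin(string template, Random random)
    {
        var result = new StringBuilder(template.Length);
        int pos = 0;

        while (pos < template.Length)
        {
            int open = template.IndexOf('{', pos);
            int close = open < 0 ? -1 : template.IndexOf('}', open + 1);

            if (close < 0)
            {
                // No complete {...} group left: copy the rest unchanged.
                result.Append(template, pos, template.Length - pos);
                break;
            }

            // Copy the literal text before the group.
            result.Append(template, pos, open - pos);

            // Split the group's content on '|' and pick one option at random.
            string[] options = template.Substring(open + 1, close - open - 1).Split('|');
            result.Append(options[random.Next(options.Length)]);

            pos = close + 1;
        }

        return result.ToString();
    }
}
```

For example, Spinner.Spin("I love your {lovely|nice|great} dress", new Random()) returns the sentence with one of the three words substituted. Passing the Random instance in lets the caller reuse a single generator across calls.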

Parsing formatted string

I am trying to create a generic formatter/parser combination.
Example scenario:
I have a string for string.Format(), e.g. var format = "{0}-{1}"
I have an array of object (string) for the input, e.g. var arr = new[] { "asdf", "qwer" }
I am formatting the array using the format string, e.g. var res = string.Format(format, arr)
What I am trying to do is to revert back the formatted string back into the array of object (string). Something like (pseudo code):
var arr2 = string.Unformat(format, res)
// when: res = "asdf-qwer"
// arr2 should be equal to arr
Anyone have experience doing something like this? I'm thinking about using regular expressions (modify the original format string, and then pass it to Regex.Matches to get the array) and run it for each placeholder in the format string. Is this feasible or is there any other more efficient solution?
While the comments about lost information are valid, sometimes you just want to get the string values out of a string with known formatting.
One method is this blog post written by a friend of mine. He implemented an extension method called string[] ParseExact(), akin to DateTime.ParseExact(). Data is returned as an array of strings, but if you can live with that, it is terribly handy.
public static class StringExtensions
{
public static string[] ParseExact(
this string data,
string format)
{
return ParseExact(data, format, false);
}
public static string[] ParseExact(
this string data,
string format,
bool ignoreCase)
{
string[] values;
if (TryParseExact(data, format, out values, ignoreCase))
return values;
else
throw new ArgumentException("Format not compatible with value.");
}
public static bool TryExtract(
this string data,
string format,
out string[] values)
{
return TryParseExact(data, format, out values, false);
}
public static bool TryParseExact(
this string data,
string format,
out string[] values,
bool ignoreCase)
{
int tokenCount = 0;
format = Regex.Escape(format).Replace("\\{", "{");
for (tokenCount = 0; ; tokenCount++)
{
string token = string.Format("{{{0}}}", tokenCount);
if (!format.Contains(token)) break;
format = format.Replace(token,
string.Format("(?'group{0}'.*)", tokenCount));
}
RegexOptions options =
ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
Match match = new Regex(format, options).Match(data);
if (tokenCount != (match.Groups.Count - 1))
{
values = new string[] { };
return false;
}
else
{
values = new string[tokenCount];
for (int index = 0; index < tokenCount; index++)
values[index] =
match.Groups[string.Format("group{0}", index)].Value;
return true;
}
}
}
You can't unformat because information is lost. String.Format is a "destructive" algorithm, which means you can't (always) go back.
Create a new class with members that keep track of the "{0}-{1}" and the { "asdf", "qwer" }, override ToString(), and modify your code a little.
Note that string is sealed in C#, so the class cannot inherit from it; it has to be a separate wrapper type, which means modifying your code a little more.
IMO, that's the best way to do this.
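Because string is sealed in C#, the tracking idea can only be realised as a separate wrapper type. A minimal sketch, with FormattedText as a name invented for this example:

```csharp
using System;

// Hypothetical wrapper that remembers the format string and its arguments,
// so nothing has to be recovered from the formatted output later.
public sealed class FormattedText
{
    public string Format { get; }
    public object[] Args { get; }

    public FormattedText(string format, params object[] args)
    {
        Format = format ?? throw new ArgumentNullException(nameof(format));
        Args = args ?? new object[0];
    }

    // Produces the same text that string.Format would have produced directly.
    public override string ToString() => string.Format(Format, Args);
}
```

With var text = new FormattedText("{0}-{1}", "asdf", "qwer"), text.ToString() yields the formatted string while text.Args still holds the original values, so no "unformat" step is needed as long as the object itself is passed around rather than the string.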
It's simply not possible in the generic case. Some information will be "lost" (string boundaries) in the Format method. Assume:
String.Format("{0}-{1}", "hello-world", "stack-overflow");
How would you "Unformat" it?
Assuming "-" is not in the original strings, can you not just use Split?
var arr2 = formattedString.Split('-');
Note that this only applies to the presented example with an assumption. Any reverse algorithm is dependent on the kind of formatting employed; an inverse operation may not even be possible, as noted by the other answers.
A simple solution might be to
replace all format tokens with (.*)
escape all other special characters in format
make the regex match non-greedy
This would resolve the ambiguities to the shortest possible match.
(I'm not good at RegEx, so please correct me, folks :))
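The three steps above can be sketched as follows. Unformatter and Unformat are names invented for this example, and it assumes plain placeholders such as {0} only: format specifiers like {0:d3} and escaped braces like {{ are not handled.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the non-greedy regex approach.
public static class Unformatter
{
    // Returns the captured values in placeholder order, or null if the
    // text does not fit the format.
    public static string[] Unformat(string format, string formatted)
    {
        // Escape regex metacharacters in the literal parts.
        // Regex.Escape turns "{" into "\{" and leaves "}" alone.
        string pattern = Regex.Escape(format);

        // Turn each escaped placeholder such as \{0} into a lazy named group g0.
        pattern = Regex.Replace(pattern, @"\\\{([0-9]{1,6})\}", "(?<g${1}>.*?)");

        var regex = new Regex("^" + pattern + "$", RegexOptions.Singleline);
        Match match = regex.Match(formatted);
        if (!match.Success) return null;

        return regex.GetGroupNames()
            .Where(name => name.StartsWith("g", StringComparison.Ordinal))
            .OrderBy(name => int.Parse(name.Substring(1)))
            .Select(name => match.Groups[name].Value)
            .ToArray();
    }
}
```

Unformatter.Unformat("{0}-{1}", "asdf-qwer") gives the two original values. For ambiguous input such as "hello-world-stack-overflow", the lazy groups assign the shortest possible text to the earlier placeholder, so the result is "hello" and "world-stack-overflow", which may not be what was originally formatted.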
After formatting, you can put the resulting string and the array of objects into a dictionary with the string as key:
Dictionary<string,string []> unFormatLookup = new Dictionary<string,string []>();
...
var arr = new string [] {"asdf", "qwer" };
var res = string.Format(format, arr);
unFormatLookup.Add(res,arr);
and in Unformat method, you can simply pass a string and look up that string and return the array used:
string [] Unformat(string res)
{
string [] arr;
unFormatLookup.TryGetValue(res,out arr); //you can also check the return value of TryGetValue and throw an exception if the input string is not in.
return arr;
}
