parsing of a string containing an array - c#

I'd like to convert a string containing a nested array of strings into an array of depth one.
Example:
StringToArray("[a, b, [c, [d, e]], f, [g, h], i]") == ["a", "b", "[c, [d, e]]", "f", "[g, h]", "i"]
It seems quite simple. But I come from a functional background and I'm not that familiar with the .NET Framework standard libraries, so every time (I have started from scratch about 3 times) I end up with just plain ugly code. My latest implementation is here. As you can see, it's ugly as hell.
So, what's the C# way to do this?

@ojlovecd has a good answer, using regular expressions.
However, his answer is overly complicated, so here's my similar, simpler answer.
public string[] StringToArray(string input) {
var pattern = new Regex(@"
\[
(?:
\s*
(?<results>(?:
(?(open) [^\[\]]+ | [^\[\],]+ )
|(?<open>\[)
|(?<-open>\])
)+)
(?(open)(?!))
,?
)*
\]
", RegexOptions.IgnorePatternWhitespace);
// Find the first match:
var result = pattern.Match(input);
if (result.Success) {
// Extract the captured values:
var captures = result.Groups["results"].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
return captures;
}
// Not a match
return null;
}
Using this code, you will see that StringToArray("[a, b, [c, [d, e]], f, [g, h], i]") will return the following array: ["a", "b", "[c, [d, e]]", "f", "[g, h]", "i"].
For more information on the balanced groups that I used for matching balanced braces, take a look at Microsoft's documentation.
Update:
As per the comments, if you also want to balance quotes, here's a possible modification. (Note that in a C# verbatim string, " is escaped as "".) I also added descriptions of the pattern to help clarify it:
var pattern = new Regex(@"
\[
(?:
\s*
(?<results>(?: # Capture everything into 'results'
(?(open) # If 'open' Then
[^\[\]]+ # Capture everything but brackets
| # Else (not open):
(?: # Capture either:
[^\[\],'""]+ # Unimportant characters
| # Or
['""][^'""]*?['""] # Anything between quotes
)
) # End If
|(?<open>\[) # Open bracket
|(?<-open>\]) # Close bracket
)+)
(?(open)(?!)) # Fail while there's an unbalanced 'open'
,?
)*
\]
", RegexOptions.IgnorePatternWhitespace);

Regex can solve your problem:
static string[] StringToArray(string str)
{
Regex reg = new Regex(@"^\[(.*)\]$");
Match match = reg.Match(str);
if (!match.Success)
return null;
str = match.Groups[1].Value;
List<string> list = new List<string>();
reg = new Regex(@"\[[^\[\]]*(((?'Open'\[)[^\[\]]*)+((?'-Open'\])[^\[\]]*)+)*(?(Open)(?!))\]");
Dictionary<string, string> dic = new Dictionary<string, string>();
int index = 0;
str = reg.Replace(str, m =>
{
string temp = "ojlovecd" + (index++).ToString();
dic.Add(temp, m.Value);
return temp;
});
string[] result = str.Split(',');
for (int i = 0; i < result.Length; i++)
{
string s = result[i].Trim();
if (dic.ContainsKey(s))
result[i] = dic[s].Trim();
else
result[i] = s;
}
return result;
}

Honestly, I would just write this method in an F# assembly, as it's probably much easier. If you look at the JavaScriptSerializer implementation in C# (with a decompiler like dotPeek or Reflector), you can see how messy the array parsing code is for a similar array in JSON. Granted, this has to handle a much more varied array of tokens, but you get the idea.
Here is their DeserializeList implementation. It is uglier than normal because it's dotPeek's decompiled version, not the original. DeserializeInternal would recurse down to the child list.
private IList DeserializeList(int depth)
{
IList list = (IList) new ArrayList();
char? nullable1 = this._s.MoveNext();
if (((int) nullable1.GetValueOrDefault() != 91 ? 1 : (!nullable1.HasValue ? 1 : 0)) != 0)
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayStart));
bool flag = false;
char? nextNonEmptyChar;
char? nullable2;
do
{
char? nullable3 = nextNonEmptyChar = this._s.GetNextNonEmptyChar();
if ((nullable3.HasValue ? new int?((int) nullable3.GetValueOrDefault()) : new int?()).HasValue)
{
char? nullable4 = nextNonEmptyChar;
if (((int) nullable4.GetValueOrDefault() != 93 ? 1 : (!nullable4.HasValue ? 1 : 0)) != 0)
{
this._s.MovePrev();
object obj = this.DeserializeInternal(depth);
list.Add(obj);
flag = false;
nextNonEmptyChar = this._s.GetNextNonEmptyChar();
char? nullable5 = nextNonEmptyChar;
if (((int) nullable5.GetValueOrDefault() != 93 ? 0 : (nullable5.HasValue ? 1 : 0)) == 0)
{
flag = true;
nullable2 = nextNonEmptyChar;
}
else
goto label_8;
}
else
goto label_8;
}
else
goto label_8;
}
while (((int) nullable2.GetValueOrDefault() != 44 ? 1 : (!nullable2.HasValue ? 1 : 0)) == 0);
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayExpectComma));
label_8:
if (flag)
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayExtraComma));
char? nullable6 = nextNonEmptyChar;
if (((int) nullable6.GetValueOrDefault() != 93 ? 1 : (!nullable6.HasValue ? 1 : 0)) != 0)
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayEnd));
else
return list;
}
Recursive parsing is just not handled as well in C# as it is in F#, though.

There is no real "standard" way of doing this. Note that the implementation can get pretty messy if you want to consider all possibilities. I would recommend something recursive like:
private static IEnumerable<object> StringToArray2(string input)
{
var characters = input.GetEnumerator();
return InternalStringToArray2(characters);
}
private static IEnumerable<object> InternalStringToArray2(IEnumerator<char> characters)
{
StringBuilder valueBuilder = new StringBuilder();
while (characters.MoveNext())
{
char current = characters.Current;
switch (current)
{
case '[':
yield return InternalStringToArray2(characters);
break;
case ']':
yield return valueBuilder.ToString();
valueBuilder.Clear();
yield break;
case ',':
yield return valueBuilder.ToString();
valueBuilder.Clear();
break;
default:
valueBuilder.Append(current);
break;
}
}
}
Although you're not restricted to recursion and can always fall back to a single method like:
private static IEnumerable<object> StringToArray1(string input)
{
Stack<List<object>> levelEntries = new Stack<List<object>>();
List<object> current = null;
StringBuilder currentLineBuilder = new StringBuilder();
foreach (char nextChar in input)
{
switch (nextChar)
{
case '[':
levelEntries.Push(current);
current = new List<object>();
break;
case ']':
current.Add(currentLineBuilder.ToString());
currentLineBuilder.Clear();
var last = current;
if (levelEntries.Peek() != null)
{
current = levelEntries.Pop();
current.Add(last);
}
break;
case ',':
current.Add(currentLineBuilder.ToString());
currentLineBuilder.Clear();
break;
default:
currentLineBuilder.Append(nextChar);
break;
}
}
return current;
}
Whatever smells good to you
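Neither method above returns the flat string[] shown in the question; both build nested object sequences instead. As a rough sketch only, a depth counter can split on the top-level commas and keep nested arrays as unparsed strings. The class name TopLevelSplitter and the choice to throw FormatException on malformed input are assumptions made for this illustration, not part of any answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TopLevelSplitter
{
    // Splits "[a, b, [c, [d, e]], f]" into its top-level elements,
    // keeping nested arrays as unparsed strings.
    public static string[] StringToArray(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        string s = input.Trim();
        if (s.Length < 2 || s[0] != '[' || s[s.Length - 1] != ']')
            throw new FormatException("Expected a string wrapped in [ ].");

        var items = new List<string>();
        var current = new StringBuilder();
        int depth = 0;

        // Walk the characters between the outer brackets.
        for (int i = 1; i < s.Length - 1; i++)
        {
            char ch = s[i];
            if (ch == '[') depth++;
            else if (ch == ']') depth--;
            if (depth < 0) throw new FormatException("Unbalanced ']'.");

            if (ch == ',' && depth == 0)
            {
                // A comma outside any nested bracket ends the current element.
                items.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(ch);
            }
        }
        if (depth != 0) throw new FormatException("Unbalanced '['.");

        // Add the last element; "[]" yields an empty array.
        string last = current.ToString().Trim();
        if (last.Length > 0 || items.Count > 0) items.Add(last);
        return items.ToArray();
    }
}
```

Calling TopLevelSplitter.StringToArray("[a, b, [c, [d, e]], f, [g, h], i]") gives the six elements from the question, with "[c, [d, e]]" and "[g, h]" left as strings. Quoted elements containing commas or brackets are not handled by this sketch.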

using System;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.VisualBasic.FileIO; //Microsoft.VisualBasic.dll
using System.IO;
public class Sample {
static void Main(){
string data = "[a, b, [c, [d, e]], f, [g, h], i]";
string[] fields = StringToArray(data);
//check print
foreach(var item in fields){
Console.WriteLine("\"{0}\"",item);
}
}
static string[] StringToArray(string data){
string[] fields = null;
Regex innerPat = new Regex(@"\[\s*(.+)\s*\]");
string innerStr = innerPat.Matches(data)[0].Groups[1].Value;
StringBuilder wk = new StringBuilder();
var balance = 0;
for(var i = 0;i<innerStr.Length;++i){
char ch = innerStr[i];
switch(ch){
case '[':
if(balance == 0){
wk.Append('"');
}
wk.Append(ch);
++balance;
continue;
case ']':
wk.Append(ch);
--balance;
if(balance == 0){
wk.Append('"');
}
continue;
default:
wk.Append(ch);
break;
}
}
var reader = new StringReader(wk.ToString());
using(var csvReader = new TextFieldParser(reader)){
csvReader.SetDelimiters(new string[] {","});
csvReader.HasFieldsEnclosedInQuotes = true;
fields = csvReader.ReadFields();
}
return fields;
}
}

Related

Variant delimiting behavior depending on brackets wrapping string in C#

I have the following (incorrect) code:
this.MetadataProperties = new HashSet<string>(SplitMetadataProperties(metadataProperties));
private static List<string> SplitMetadataProperties(string properties)
{
// Speed up accesses and splits
char[] propertiesArray = properties.ToCharArray();
// Look for a delimiter
int bracketCount = 0;
int start = 0;
List<string> result = new List<string>();
for (int i = 0; i < propertiesArray.Length; ++i)
{
switch (properties[i])
{
case '[':
// Column open
bracketCount++;
break;
case ']':
// Column close
bracketCount--;
break;
case '|':
// Delimiter
if (bracketCount != 0)
{
// Treat this as a normal character, since it's not actually a delimiter. It's a part of a column.
break;
}
if (i > start)
{
// It's not empty, add it
int propertiesArrayLength = propertiesArray.Length - start;
result.Add(new string(propertiesArray, start, propertiesArrayLength));
}
// This is a delimiter. Split off this property and move to the next one
start = i + 1;
break;
}
// Add last item if needed
if (start < propertiesArray.Length)
{
int propertiesArrayLength = propertiesArray.Length - start;
result.Add(new string(propertiesArray, start, propertiesArrayLength));
}
}
return result;
}
Here's the desired behavior. Suppose we have:
string properties = "foo||||bar";
Desired result is:
expectedHashSet[0] = "foo";
expectedHashSet[1] = "bar";
However, right now I'm getting this:
"foo||||bar"
"|||bar"
"||bar"
"|bar"
"bar"
Note that if you have:
string properties = "[foo||||bar]";
then the desired result is instead the same exact string:
"[foo||||bar]";
"|" and "[" / "]" cause special behavior, hence the switch. I think the issue lies somewhere in the length calculations or related computation, but I'm not quite sure the exact change(s) I need. Any suggestions? Thanks!
I'd recommend the regular expressions solution here:
using System.Text.RegularExpressions;
And
private static IEnumerable<string> SplitMetadataProperties(string properties)
{
string pattern = @"(\[.+\])|[^\|\s]+";
foreach (Match match in Regex.Matches(properties, pattern))
yield return match.Value;
}
Now all you need is to define your HashSet
this.MetadataProperties = new HashSet<string>(SplitMetadataProperties(metadataProperties));
For the input:
[foo||||bar]
foo||||bar
[foo||||bar]
foo||||bar
foo|||bar
[foo||||bar]
[foo||||bar]
The output was:
[foo||||bar]
foo
bar
Pattern explanation:
Find either a string that starts with [ and ends with ] and has at least one character in between, whatever it is; or a string made of any characters except the delimiter (|) and whitespace.
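To make the explanation concrete, here is a small sketch that runs the same pattern and collects the matches. The class name PatternDemo and the method name Tokens are made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Same pattern as in the answer above.
    private const string Pattern = @"(\[.+\])|[^\|\s]+";

    // Returns every match of the pattern, in order.
    public static List<string> Tokens(string input)
    {
        var tokens = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
            tokens.Add(m.Value);
        return tokens;
    }
}
```

Tokens("foo||||bar") yields "foo" and "bar", and Tokens("[foo||||bar]") yields the whole bracketed string. One thing to watch: because .+ is greedy, an input with several bracketed parts such as "[a]|b|[c]" comes back as a single token spanning from the first [ to the last ].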
What you need is a state machine to parse the input. In this case, the state is either inside [ ] or outside. I normally like to create an IEnumerable state machine to return each parsed field, which can then be further processed as needed.
So, an extension method to handle your parsing:
public static class StringExt {
public static IEnumerable<string> SplitMetadataProperties(this string s) {
var sb = new StringBuilder();
bool inQuote = false;
foreach (var ch in s) {
switch (ch) {
case '[':
inQuote = true;
sb.Append(ch);
break;
case ']':
inQuote = false;
sb.Append(ch);
break;
case '|':
if (inQuote)
sb.Append(ch);
else {
if (sb.Length > 0) {
yield return sb.ToString();
sb.Clear();
}
}
break;
default:
sb.Append(ch);
break;
}
}
if (sb.Length > 0)
yield return sb.ToString();
}
}
And, now you can parse your input and process as needed:
this.MetadataProperties = new HashSet<string>(metadataProperties.SplitMetadataProperties());
NOTE: I would normally use an extension method (one that is included in the latest .NET):
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source) => new HashSet<T>(source);
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source, IEqualityComparer<T> cmp) => new HashSet<T>(source, cmp);
So you would have
this.MetadataProperties = metadataProperties.SplitMetadataProperties().ToHashSet();
You want a list or array? For array:
string properties = "foo||||bar";
if (!(properties.StartsWith("[") && properties.EndsWith("]")))
{
var expectedHashSet = properties.Split(new char[] { '|'}, StringSplitOptions.RemoveEmptyEntries);
}
For a List simply use .ToList().
I think I now understand what you are asking. This one is actually based on Aly El-Haddad's reply, only made a slight addition for edge cases.
void Main()
{
string properties = "foo||||bar[as|is]ak|yak[yet|another]ll";
foreach (var t in SplitMetadataProperties(properties))
{
Console.WriteLine(t);
}
}
private static IEnumerable<string> SplitMetadataProperties(string properties)
{
foreach (Match m in Regex.Matches(properties, @"\[.+?\]"))
{
properties = properties.Replace(m.Value, "|" + m.Value + "|");
}
string pattern = @"(\[.+?\])|[^\|\s]+";
foreach (Match match in Regex.Matches(properties, pattern))
{
yield return match.Value;
}
}
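For comparison, the index-based loop from the question can also be repaired directly. The sketch below assumes the two problems are the ones visible in the posted code: the substring length is computed to the end of the input instead of up to the delimiter, and the "add last item" block sits inside the for loop so it runs on every character. The class name MetadataSplitter is made up for this illustration.

```csharp
using System.Collections.Generic;

public static class MetadataSplitter
{
    public static List<string> SplitMetadataProperties(string properties)
    {
        var result = new List<string>();
        int bracketCount = 0;
        int start = 0;

        for (int i = 0; i < properties.Length; i++)
        {
            switch (properties[i])
            {
                case '[':
                    bracketCount++;
                    break;
                case ']':
                    bracketCount--;
                    break;
                case '|':
                    if (bracketCount != 0)
                        break; // inside brackets: an ordinary character
                    if (i > start)
                        // The item runs from start up to (not including) i.
                        result.Add(properties.Substring(start, i - start));
                    start = i + 1;
                    break;
            }
        }

        // Add the trailing item once, after the loop has finished.
        if (start < properties.Length)
            result.Add(properties.Substring(start));

        return result;
    }
}
```

With this version "foo||||bar" produces "foo" and "bar", while "[foo||||bar]" comes back unchanged as a single item.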

Find two strings in list with a regular expression

I need to find two strings within a list that are made up of the characters of another string, in any order. To make it clear, an example could be a list of animals like:
lion
dog
bear
cat
And a given string is: oodilgn.
The answer here would be: lion and dog
Each character from the string will be used only once.
Is there a regular expression that will allow me to do this?
You could try to put the given string between []. These brackets will allow choosing - in any order - from these letters only. This may not be a perfect solution, but it will catch the majority of your list.
For example, you could write oodilgn as [oodilgn], then add a minimum number of letters to be found - let's say 3 - by using the curly brackets {}. The full regex will be like this:
[oodilgn]{3,}
This code basically says: find any word that has three of the letters that are located between brackets in any order.
Demo: https://regex101.com/r/MCWHjQ/2
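A character class cannot enforce that each letter is used only once, so it will also accept words that reuse letters. As an alternative that is not regex-based, letter counting handles that rule directly. This is only a sketch; the names LetterPool, FindPair and TryTake are invented for the illustration, and it returns the first pair found or null.

```csharp
using System.Collections.Generic;

public static class LetterPool
{
    // Counts how many times each character occurs in the given letters.
    private static Dictionary<char, int> Count(string letters)
    {
        var pool = new Dictionary<char, int>();
        foreach (char c in letters)
        {
            int n;
            pool.TryGetValue(c, out n);
            pool[c] = n + 1;
        }
        return pool;
    }

    // Removes each letter of word from the pool; false if one is missing.
    private static bool TryTake(Dictionary<char, int> pool, string word)
    {
        foreach (char c in word)
        {
            int n;
            if (!pool.TryGetValue(c, out n) || n == 0) return false;
            pool[c] = n - 1;
        }
        return true;
    }

    // Returns the first two words that can both be built from the letters,
    // using each letter at most once, or null if there is no such pair.
    public static string[] FindPair(string letters, IList<string> words)
    {
        for (int i = 0; i < words.Count - 1; i++)
        {
            for (int j = i + 1; j < words.Count; j++)
            {
                var pool = Count(letters); // fresh pool for every pair
                if (TryTake(pool, words[i]) && TryTake(pool, words[j]))
                    return new[] { words[i], words[j] };
            }
        }
        return null;
    }
}
```

For the letters "oodilgn" and the list lion, dog, bear, cat, FindPair returns lion and dog. Like the answers here, it does not require that every letter of the input be consumed.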
Here is an example algorithm that does the job. I have assumed that the two words together don't need to use all the letters from the text; otherwise, enable the additional check that is commented out. Also, I return the first two suitable answers.
Here is how you call it in the outside function, Main or else:
static void Main(string[] args)
{
var text = "oodilgn";
var listOfWords = new List<string> { "lion", "dog", "bear", "cat" };
ExtractWordsWithSameLetters(text, listOfWords);
}
Here is the function with the algorithm. All string manipulations are done entirely with regex.
public static void ExtractWordsWithSameLetters(string text, List<string> listOfWords)
{
string firstWord = null;
string secondWord = null;
for (var i = 0; i < listOfWords.Count - 1; i++)
{
var textCopy = text;
var firstWordIsMatched = true;
foreach (var letter in listOfWords[i])
{
var pattern = $"(.*?)({letter})(.*?)";
var regex = new Regex(pattern);
if (regex.IsMatch(textCopy)) // check the remaining letters, so each one is used only once
{
textCopy = regex.Replace(textCopy, "$1*$3", 1);
}
else
{
firstWordIsMatched = false;
break;
}
}
if (!firstWordIsMatched)
{
continue;
}
firstWord = listOfWords[i];
for (var j = i + 1; j < listOfWords.Count; j++)
{
var secondWordIsMatched = true;
foreach (var letter in listOfWords[j])
{
var pattern = $"(.*?)({letter})(.*?)";
var regex = new Regex(pattern);
if (regex.IsMatch(textCopy)) // check the remaining letters, so each one is used only once
{
textCopy = regex.Replace(textCopy, "$1*$3", 1);
}
else
{
secondWordIsMatched = false;
break;
}
}
if (secondWordIsMatched)
{
secondWord = listOfWords[j];
break;
}
}
if (secondWord == null)
{
firstWord = null;
}
else
{
//if (textCopy.ToCharArray().Any(l => l != '*'))
//{
// break;
//}
break;
}
}
if (firstWord != null)
{
Console.WriteLine($"{firstWord} { secondWord}");
}
}
The function is far from optimised but does what you want. If you want to return the results rather than print them, create an array, put firstWord and secondWord in it, and use a return type of string[], or add two parameters with ref or out. In those cases you will need to check the result in the calling function.
Please try this out:
Regex r=new Regex("^[oodilgn]+$");
var list=new List<String>(){"lion","dog","fish","god"};
var output=list.Where(x=>r.IsMatch(x));
Result:
output=["lion","dog","god"];

C# string.split() separate string by uppercase

I've been using the Split() method to split strings, but this only works if you pass string.Split() a specific separator character. Is there any way to split a string wherever it sees an uppercase letter?
Is it possible to get the individual words from an unseparated string like:
DeleteSensorFromTemplate
And the resulting string should look like:
Delete Sensor From Template
Use Regex.Split:
string[] split = Regex.Split(str, @"(?<!^)(?=[A-Z])");
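The split itself returns an array, so one more step is needed to get the spaced string from the question. A minimal sketch follows; the class name CamelCase and method name ToWords are made up for this example.

```csharp
using System.Text.RegularExpressions;

public static class CamelCase
{
    public static string ToWords(string input)
    {
        // (?<!^)    : not at the very start of the string
        // (?=[A-Z]) : the next character is an uppercase ASCII letter
        // Both parts are zero-width, so no characters are consumed by the split.
        string[] parts = Regex.Split(input, @"(?<!^)(?=[A-Z])");
        return string.Join(" ", parts);
    }
}
```

ToWords("DeleteSensorFromTemplate") returns "Delete Sensor From Template". Runs of capitals are split letter by letter, so "UPCCodes" becomes "U P C Codes".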
Another way with regex:
public static string SplitCamelCase(string input)
{
return System.Text.RegularExpressions.Regex.Replace(input, "([A-Z])", " $1", System.Text.RegularExpressions.RegexOptions.Compiled).Trim();
}
If you do not like RegEx and you really just want to insert the missing spaces, this will do the job too:
public static string InsertSpaceBeforeUpperCase(this string str)
{
var sb = new StringBuilder();
char previousChar = char.MinValue; // Unicode '\0'
foreach (char c in str)
{
if (char.IsUpper(c))
{
// If not the first character and previous character is not a space, insert a space before uppercase
if (sb.Length != 0 && previousChar != ' ')
{
sb.Append(' ');
}
}
sb.Append(c);
previousChar = c;
}
return sb.ToString();
}
I had some fun with this one and came up with a function that splits by case, as well as groups together caps (it assumes title case for whatever follows) and digits.
Examples:
Input -> "TodayIUpdated32UPCCodes"
Output -> "Today I Updated 32 UPC Codes"
Code (please excuse the funky symbols I use)...
public static string[] SplitByCase(this string s) { // extension methods must be static, in a static class
var ʀ = new List<string>();
var ᴛ = new StringBuilder();
var previous = SplitByCaseModes.None;
foreach(var ɪ in s) {
SplitByCaseModes mode_ɪ;
if(string.IsNullOrWhiteSpace(ɪ.ToString())) {
mode_ɪ = SplitByCaseModes.WhiteSpace;
} else if("0123456789".Contains(ɪ)) {
mode_ɪ = SplitByCaseModes.Digit;
} else if(ɪ == ɪ.ToString().ToUpper()[0]) {
mode_ɪ = SplitByCaseModes.UpperCase;
} else {
mode_ɪ = SplitByCaseModes.LowerCase;
}
if((previous == SplitByCaseModes.None) || (previous == mode_ɪ)) {
ᴛ.Append(ɪ);
} else if((previous == SplitByCaseModes.UpperCase) && (mode_ɪ == SplitByCaseModes.LowerCase)) {
if(ᴛ.Length > 1) {
ʀ.Add(ᴛ.ToString().Substring(0, ᴛ.Length - 1));
ᴛ.Remove(0, ᴛ.Length - 1);
}
ᴛ.Append(ɪ);
} else {
ʀ.Add(ᴛ.ToString());
ᴛ.Clear();
ᴛ.Append(ɪ);
}
previous = mode_ɪ;
}
if(ᴛ.Length != 0) ʀ.Add(ᴛ.ToString());
return ʀ.ToArray();
}
private enum SplitByCaseModes { None, WhiteSpace, Digit, UpperCase, LowerCase }
Here's another different way if you don't want to be using string builders or RegEx, which are totally acceptable answers. I just want to offer a different solution:
string Split(string input)
{
string result = "";
for (int i = 0; i < input.Length; i++)
{
if (char.IsUpper(input[i]))
{
result += ' ';
}
result += input[i];
}
return result.Trim();
}

Remove commas between the [] brackets in a string using C#

I want to remove only the commas between the square brackets [], not every comma in the string.
Here my string is,
string result= "a,b,c,[c,d,e],f,g,[h,i,j]";
Expected output:
a,b,c,[cde],f,g,[hij]
Thanks in advance.
As I've written, you need a simple state machine (inside brackets, outside brackets)... Then for each character, you analyze it and if necessary you change the state of the state machine and decide if you need to output it or not.
public static string RemoveCommas(string str)
{
int bracketLevel = 0;
var sb = new StringBuilder(str.Length);
foreach (char ch in str)
{
switch (ch) {
case '[':
bracketLevel++;
sb.Append(ch);
break;
case ']':
if (bracketLevel > 0) {
bracketLevel--;
}
sb.Append(ch);
break;
case ',':
if (bracketLevel == 0) {
sb.Append(ch);
}
break;
default:
sb.Append(ch);
break;
}
}
return sb.ToString();
}
Use it like:
string result = "a,b,c,[c,d,e],f,g,[h,i,j]";
Console.WriteLine(RemoveCommas(result));
Note that to "save" the state of the state machine I'm using an int, so that it works with recursive brackets, like a,b,[c,d,[e,f]g,h]i,j
Just as an interesting exercise, it can be done with a slower LINQ expression:
string result2 = result.Aggregate(new
{
BracketLevel = 0,
Result = string.Empty,
}, (state, ch) => new {
BracketLevel = ch == '[' ?
state.BracketLevel + 1 :
ch == ']' && state.BracketLevel > 0 ?
state.BracketLevel - 1 :
state.BracketLevel,
Result = ch != ',' || state.BracketLevel == 0 ? state.Result + ch : state.Result
}).Result;
In the end the code is very similar... There is a state that is carried along (the BracketLevel) plus the string (Result) that is being built. Please don't use it; it is only written as an amusing piece of LINQ.
Regex approach, if you don't have nested brackets:
string stringValue = "a,b,c,[c,d,e],f,g,[h,i,j]";
var result = Regex.Replace(stringValue, @",(?![^\]]*(?:\[|$))", string.Empty);
You can try this:
var output = new string(result
.Where((s, index) => s != ',' ||
IsOutside(result.Substring(0, index)))
.ToArray()
);
//output: a,b,c,[cde],f,g,[hij]
And
private static bool IsOutside(string value)
{
return value.Count(i => i == '[') <= value.Count(i => i == ']');
}
But remember this is not an efficient way of doing this job.

compare the characters in two strings

In C#, how do I compare the characters in two strings?
For example, let's say I have these two strings
"bc3231dsc" and "bc3462dsc"
How do I programmatically figure out that the strings
both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always be 9 characters long.
2. The strings are not case sensitive.
The string class has 2 methods (StartsWith and EndsWith) that you can use.
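As a quick illustration of those two methods, the sketch below checks whether two strings share a known prefix and suffix, ignoring case as the question requires. The class name AffixCheck and method name SharesAffixes are hypothetical. Note that this only helps when the prefix and suffix are already known; it does not discover them.

```csharp
using System;

public static class AffixCheck
{
    // True when both strings start with prefix and end with suffix,
    // compared without regard to case.
    public static bool SharesAffixes(string a, string b, string prefix, string suffix)
    {
        return a.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            && b.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            && a.EndsWith(suffix, StringComparison.OrdinalIgnoreCase)
            && b.EndsWith(suffix, StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, SharesAffixes("bc3231dsc", "BC3462DSC", "bc3", "dsc") is true.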
After reading your question and the answers already given, I think some constraints are missing that may be obvious to you but not to the community. But maybe we can do a little guesswork:
You'll have a bunch of string pairs that should be compared.
The two strings in each pair are of the same length, or you are only interested in comparing the characters read simultaneously from left to right.
You want some kind of enumeration that tells you where each block starts and how long it is.
Because a string is just an enumeration of chars, you could use LINQ here to get an idea of the matching characters, like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this we can proceed and find out how long each block of consecutive true or false flags is, using this method:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
Currently I don't know whether an existing LINQ function provides the same functionality. From what I have read, it should be possible with SelectMany() (because in theory you can accomplish any LINQ task with this method), but as an ad hoc implementation the above was easier (for me).
These functions could then be used in a way something like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
Which would then give you this output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is the use of Tuple, because it leads to the ugly property names Item1 and Item2, which are far from instantly readable. If you really want, you could introduce your own simple class that holds two integers and has descriptive property names. Also, the information about whether each block is shared by both strings or differs is currently lost, but it should be fairly simple to add this information to the tuple or your own class.
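One possible shape for such a class is sketched here, with the match flag included. The names Block, BlockFinder and Find are invented for the illustration, the loop replaces the Zip/Pack pair with a single pass, and the comparison ignores case to follow condition 2 of the question.

```csharp
using System;
using System.Collections.Generic;

public sealed class Block
{
    public int Start { get; }
    public int Length { get; }
    public bool IsMatch { get; } // true when both strings agree in this block

    public Block(int start, int length, bool isMatch)
    {
        Start = start;
        Length = length;
        IsMatch = isMatch;
    }
}

public static class BlockFinder
{
    private static bool Same(char a, char b)
    {
        return char.ToUpperInvariant(a) == char.ToUpperInvariant(b);
    }

    // Groups positions 0..n-1 into runs where the two strings
    // either agree or disagree; n is the shorter length.
    public static List<Block> Find(string first, string second)
    {
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));

        int n = Math.Min(first.Length, second.Length);
        var blocks = new List<Block>();
        int start = 0;
        for (int i = 1; i <= n; i++)
        {
            bool runMatches = Same(first[start], second[start]);
            // Close the run at the end of input or when the flag flips.
            if (i == n || Same(first[i], second[i]) != runMatches)
            {
                blocks.Add(new Block(start, i - start, runMatches));
                start = i;
            }
        }
        return blocks;
    }
}
```

For "bc3231dsc" and "bc3462dsc" this yields three blocks: a matching block at 0 of length 3, a differing block at 3 of length 3, and a matching block at 6 of length 3.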
static void Main(string[] args)
{
string test1 = "bc3231dsc";
string test2 = "bc3462dsc";
string firstMatch = GetMatch(test1, test2, false);
string lastMatch = GetMatch(test1, test2, true);
string center1 = test1.Substring(firstMatch.Length, test1.Length - (firstMatch.Length + lastMatch.Length));
string center2 = test2.Substring(firstMatch.Length, test2.Length - (firstMatch.Length + lastMatch.Length));
}
public static string GetMatch(string fist, string second, bool isReverse)
{
if (isReverse)
{
fist = ReverseString(fist);
second = ReverseString(second);
}
StringBuilder builder = new StringBuilder();
char[] ar1 = fist.ToArray();
for (int i = 0; i < ar1.Length; i++)
{
if (fist.Length > i + 1 && ar1[i].Equals(second[i]))
{
builder.Append(ar1[i]);
}
else
{
break;
}
}
if (isReverse)
{
return ReverseString(builder.ToString());
}
return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
Pseudocode of what you need:
int stringpos = 0
string resultstart = ""
while not end of string (either of the two)
{
if string1.substr(stringpos, 1) == string2.substr(stringpos, 1)
resultstart = resultstart + string1.substr(stringpos, 1)
else
exit while
stringpos = stringpos + 1
}
resultstart now has your start string. You can do the same going backwards.
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts with bc3 and ends with dsc, with anything in between except a newline. By changing .*? to \d+, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
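As a sketch of that variation, the pattern below restricts the middle to digits, captures it in a named group, and ignores case. The class name CenterExtractor, the method name Center and the group name center are all made up for this example, and the prefix and suffix are hard-coded as in the answer.

```csharp
using System.Text.RegularExpressions;

public static class CenterExtractor
{
    // bc3, then one or more digits captured as "center", then dsc.
    private static readonly Regex Pattern =
        new Regex(@"^bc3(?<center>\d+)dsc$", RegexOptions.IgnoreCase);

    // Returns the digits between the fixed prefix and suffix,
    // or null when the input does not have that shape.
    public static string Center(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Groups["center"].Value : null;
    }
}
```

Center("bc3231dsc") returns "231", Center("BC3462DSC") returns "462", and Center("bc3xyzdsc") returns null.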
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
For the updated conditions:
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
for(int i=len-1;i>=l_pos && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){ // stop at l_pos so the two matches cannot overlap
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}
