How to find the matching pair of braces in a string? - c#

Suppose I have a string "(paid for) + (8 working hours) + (company rules)". I want to check whether this complete string is surrounded with parentheses, that is, whether it already looks like this: "((paid for) + (8 working hours) + (company rules))". If it is already surrounded with parentheses, I will leave it as it is; otherwise I will wrap the complete string in parentheses so that the output is: "((paid for) + (8 working hours) + (company rules))". Counting the number of parentheses is not enough to solve this problem.
Can anyone please suggest a solution?

A Stack is a good idea, but since you want to see whether the complete string is surrounded with parens, I suggest you push the index of each opening paren you encounter onto the Stack. That way, each time you pop an item off the stack, check whether it's 0, meaning the opening paren that corresponds to this closing paren was at the beginning of the string. The result of this check for the last closing paren will tell you whether you need to add parens.
Example:
String s = "((paid for) + (8 working hours) + (company rules))";
var stack = new Stack<int>();
bool isSurroundedByParens = false;
for (int i = 0; i < s.Length; i++) {
switch (s[i]) {
case '(':
stack.Push(i);
isSurroundedByParens = false;
break;
case ')':
int index = stack.Any() ? stack.Pop() : -1;
isSurroundedByParens = (index == 0);
break;
default:
isSurroundedByParens = false;
break;
}
}
if (!isSurroundedByParens) {
// surround with parens
}

Use a stack: when you find a ( bracket, push it, and when you see a ), pop the stack.
When the string has been parsed completely, the stack should be empty. This ensures that no brackets are missing.
In your case, if the stack becomes empty partway through the string, then there are no surrounding brackets for the entire string.
For example, for the input string:
(paid for) + (8 working hours) + (company rules)
the first ( is pushed, and when the parser encounters the first ) it pops the stack. Now check whether there is more of the string left to parse while the stack is empty. If so, the entire string is not enclosed in brackets.
Whereas for the string:
((paid for) + (8 working hours) + (company rules))
the stack will not be empty until the last ) appears.
Hope this helps...
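The idea above can be sketched without an actual Stack, because only the nesting depth matters for this check. This is a minimal sketch, not code from any of the answers: the names ParenCheck, IsSurrounded and EnsureSurrounded are made up for illustration, and it assumes only ( and ) need to be tracked.

```csharp
using System;

public static class ParenCheck
{
    // Returns true when the '(' at index 0 is matched by the ')' at the last index.
    public static bool IsSurrounded(string text)
    {
        if (text.Length < 2 || text[0] != '(' || text[text.Length - 1] != ')')
            return false;

        int depth = 0; // plays the role of the stack size
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '(') depth++;
            else if (text[i] == ')') depth--;

            // An unmatched ')' makes the depth negative.
            if (depth < 0) return false;
            // Depth reaching zero before the end means the first '(' closed early.
            if (depth == 0 && i < text.Length - 1) return false;
        }
        return depth == 0;
    }

    // Wraps the text in parentheses only when it is not already surrounded.
    public static string EnsureSurrounded(string text)
    {
        return IsSurrounded(text) ? text : "(" + text + ")";
    }
}
```

With this sketch, ParenCheck.EnsureSurrounded("(paid for) + (8 working hours) + (company rules)") adds the outer pair, while a string that is already surrounded is returned unchanged.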

Finds the closing bracket index
public int FindClosingBracketIndex(string text, char openedBracket = '{', char closedBracket = '}')
{
int index = text.IndexOf(openedBracket);
int bracketCount = 1;
var textArray = text.ToCharArray();
for (int i = index + 1; i < textArray.Length; i++)
{
if (textArray[i] == openedBracket)
{
bracketCount++;
}
else if (textArray[i] == closedBracket)
{
bracketCount--;
}
if (bracketCount == 0)
{
index = i;
break;
}
}
return index;
}

Tests
static void Main()
{
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded(""));
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded("("));
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded(")"));
Console.WriteLine("Expected: {0}, Is: {1}", true, IsSurrounded("()"));
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded("(()"));
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded("())"));
Console.WriteLine("Expected: {0}, Is: {1}", true, IsSurrounded("(.(..)..(..)..)"));
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded("(..)..(..)"));
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded("(..)..(..)..)"));
Console.WriteLine("Expected: {0}, Is: {1}", false, IsSurrounded("(.(..)..(..)"));
}
Method
Very fast
No stack
No loop through entire string
If the first opening parenthesis is closed before any other parenthesis opens, the result can't be true. The same applies, scanning backwards, to the last closing parenthesis.
static bool IsSurrounded(string text)
{
if (text.Length < 2 || text.First() != '(' || text.Last() != ')')
return false;
for (var i = 1; i < text.Length - 1; i++)
{
if (text[i] == ')')
return false;
if (text[i] == '(')
break;
}
for (var i = text.Length - 2; i > 0; i--)
{
if (text[i] == '(')
return false;
if (text[i] == ')')
break;
}
return true;
}
Limitations
Should not be used when groups are nested more deeply, such as ((..)) + ((..)); the method returns true for that input even though the string is not surrounded.

To ensure there are parentheses you could simply add them:
text = "(" + text + ")";
Otherwise the suggested stack by Botz3000:
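string text = "(paid for)";
Stack<int> parenthesis = new Stack<int>();
int last = -1; // index of the '(' matching the most recent ')'; -1 means none
for (int i = 0; i < text.Length; i++)
{
    if (text[i] == '(')
        parenthesis.Push(i);
    else if (text[i] == ')')
    {
        // Guard against an unmatched ')' so Pop() does not throw on an empty stack.
        last = parenthesis.Count > 0 ? parenthesis.Pop() : -1;
    }
}
if (last == 0 && text.EndsWith(")"))
{
    // The final ')' matches the '(' at index 0, so the string is surrounded.
}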

You can check for the right number of parentheses by using something like a stack. Count up for each opening brace and count down for each closing brace. The same number of opening and closing braces means they match. If you ever encounter a closing brace while your count is zero, that's a mismatch. If you want to know whether your string is completely enclosed by parentheses, check that all of them match, then check that the string starts with an opening one and that the count does not drop back to zero before the final character.
static void BraceMatch(string text)
{
int level = 0;
foreach (char c in text)
{
if (c == '(')
{
// opening brace detected
level++;
}
if (c == ')')
{
level--;
if (level < 0)
{
// closing brace detected, without a corresponding opening brace
throw new ApplicationException("Opening brace missing.");
}
}
}
if (level > 0)
{
// more open than closing braces
throw new ApplicationException("Closing brace missing.");
}
}

Related

C#, Finding equal amount of brackets

I've made my own way to check whether the number of () and "" is even. For example, "H(ell)o" is correct. However, the problem I face is: what if the first bracket is ) and the other is (, for example "H)ell(o"? That should be incorrect. So my question is: how would I check whether the first bracket in any word is an opening one?
EDIT:
public static Boolean ArTinkaSintakse(char[] simboliai)
{
int openingBracketsAmount = 0;
int closingBracketsAmount = 0;
int quotationMarkAmount = 0;
for (int i = 0; i < simboliai.Length; i++)
{
if (openingBracketsAmount == 0 && simboliai[i] == ')')
break;
else if (simboliai[i] == '\"')
quotationMarkAmount++;
else if (simboliai[i] == '(')
openingBracketsAmount++;
else if (simboliai[i] == ')')
closingBracketsAmount++;
}
int bracketAmount = openingBracketsAmount + closingBracketsAmount;
if (quotationMarkAmount % 2 == 0 && bracketAmount % 2 == 0)
return true;
else
return false;
}
Add a check for if (openingBracketsAmount < closingBracketsAmount). If that's ever true, you know that the brackets are unbalanced.
Add an if statement that breaks out of the loop if the first bracket encountered is a closing bracket, like so:
for (int i = 0; i < word.Length; i++)
{
    if ((openingBracketsAmount == 0) && (word[i] == ')'))
    {
        // Log error...
        break;
    }
    // Rest of your loop...
}
This way, as soon as openingBracketsAmount has been incremented, the condition of this if statement can never be true again.
I would approach the problem recursively.
Create a Dictionary<char, char> to keep track of which opening character goes with which closing one.
Then I would implement:
bool findMatches(string input, out int lastIndex) {}
The method will scan until a member of the Dictionary keys is found. Then it will recursively call itself with the remainder of the string. If the recursive call comes back false, return that. If true, check if the lastIndex character (or the one after; I always need to write the code to check fenceposts) is the matching bracket you want. If it is, and you're not at the end of the string, return the value of a recursive call with the rest of the string after that matching bracket. If you are at the end, return true with that character's index. If that character isn't the matching bracket/quote, pass the remainder of the string (including that last character) to another recursive call.
Continue until you reach the end of the string (returning true if you aren't matching a bracket or quote, false otherwise). Either way with a lastIndex of the last character.
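A simplified version of the recursive idea might look like the sketch below. It is not the answerer's findMatches method: it uses a ref position instead of an out index, it only handles the three bracket pairs (quotes are left out), and the names BracketMatcher, IsBalanced and Scan are assumptions made for illustration.

```csharp
using System.Collections.Generic;

public static class BracketMatcher
{
    // Maps each opening character to the closing character it requires.
    private static readonly Dictionary<char, char> Pairs = new Dictionary<char, char>
    {
        { '(', ')' }, { '[', ']' }, { '{', '}' }
    };

    public static bool IsBalanced(string input)
    {
        int pos = 0;
        // '\0' is used as "no closer expected" at the top level
        // (assumes the input itself contains no '\0' characters).
        return Scan(input, ref pos, '\0') && pos == input.Length;
    }

    // Scans until the expected closer is found (left unconsumed at pos)
    // or the input ends. Recurses once for every opening bracket.
    private static bool Scan(string s, ref int pos, char closer)
    {
        while (pos < s.Length)
        {
            char c = s[pos];
            if (c == closer)
                return true;                 // caller consumes the closer
            if (Pairs.TryGetValue(c, out char expected))
            {
                pos++;                       // consume the opener
                if (!Scan(s, ref pos, expected))
                    return false;
                pos++;                       // consume the matching closer
            }
            else if (Pairs.ContainsValue(c))
                return false;                // a closer nobody is waiting for
            else
                pos++;                       // ordinary character
        }
        // Reaching the end is only fine when no closer is pending.
        return closer == '\0';
    }
}
```

For the inputs in the question, BracketMatcher.IsBalanced("H(ell)o") is true and BracketMatcher.IsBalanced("H)ell(o") is false, because the ) is seen while no opener is pending.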
Probably you need a stack-based approach to validate such expressions.
Please try the following code:
public static bool IsValid(string s)
{
var pairs = new List<Tuple<char, char>>
{
new Tuple<char, char>('(', ')'),
new Tuple<char, char>('{', '}'),
new Tuple<char, char>('[', ']'),
};
var openTags = new HashSet<char>();
var closeTags = new Dictionary<char, char>(pairs.Count);
foreach (var p in pairs)
{
openTags.Add(p.Item1);
closeTags.Add(p.Item2, p.Item1);
}
// Remove quoted parts
Regex r = new Regex("\".*?\"", RegexOptions.Compiled | RegexOptions.Multiline);
s = r.Replace(s, string.Empty);
var opened = new Stack<char>();
foreach (var ch in s)
{
if (ch == '"')
{
// Unpaired quote char
return false;
}
if (openTags.Contains(ch))
{
// This is a legal open tag
opened.Push(ch);
}
else if (closeTags.TryGetValue(ch, out var openTag) && (opened.Count == 0 || openTag != opened.Pop()))
{
// This is an illegal close tag or an unbalanced legal close tag
return false;
}
}
return opened.Count == 0; // any tag still open means the input is unbalanced
}

Function to return the acronym of a string

How can I write a function which given an input string, passes back the acronym for the string using only If/Then/Else, simple String functions, and Looping syntax (not use the Split( ) function or its equivalent)?
String s_input, s_acronym
s_input = "Mothers against drunk driving"
s_acronym = f_get_acronym(s_input)
print "acronym = " + s_acronym
/* acronym = MADD */
My code is below; I'm just looking to see whether there is a better solution.
static string f_get_acronym(string s_input)
{
string s_acronym = "";
for (int i = 0; i < s_input.Length; i++)
{
// Take a non-space character when it starts the string or follows a space.
// Checking i == 0 first avoids reading s_input[-1] when the input starts with a space.
if (s_input[i] != ' ' && (i == 0 || s_input[i - 1] == ' '))
{
s_acronym += s_input[i];
}
}
return s_acronym.ToUpper();
}
Regex is the way to go in C#. I know you only wanted simple functions, but I want to put this here so that any future readers are pointed down the right path. ;)
// \w* (rather than \w+) so that one-letter words are matched as well
var regex = new Regex(@"(?:\s*(?<first>\w))\w*");
var matches = regex.Matches("Mothers against drunk driving");
foreach (Match match in matches)
{
Console.Write(match.Groups["first"].Value);
}
private static string f_get_acronym(string s_input)
{
if (string.IsNullOrWhiteSpace(s_input))
return string.Empty;
string accr = string.Empty;
accr += s_input[0];
while (s_input.Length > 0)
{
int index = s_input.IndexOf(' ');
if (index > 0)
{
s_input = s_input.Substring(index + 1);
if (s_input.Length == 0)
break;
accr += s_input[0];
}
else
{
break;
}
}
return accr.ToUpper();
}
Keep it simple:
public static string Acronym(string input)
{
string result = string.Empty;
char last = ' ';
foreach(var c in input)
{
if(char.IsWhiteSpace(last) && !char.IsWhiteSpace(c))
{
result += c;
}
last = c;
}
return result.ToUpper();
}
Best practice says you should use a StringBuilder when adding to a string in a loop though. Don't know how long your acronyms are going to be.
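As a sketch of that StringBuilder suggestion, the same loop could be written as follows. The class name AcronymBuilder and the choice to upper-case each character as it is appended are assumptions for illustration; runs of several spaces are skipped.

```csharp
using System.Text;

public static class AcronymBuilder
{
    public static string Acronym(string input)
    {
        var result = new StringBuilder();
        bool previousWasSpace = true; // treat the start of the string like a space

        foreach (char c in input)
        {
            bool isSpace = char.IsWhiteSpace(c);
            // The first non-space character after whitespace starts a word.
            if (previousWasSpace && !isSpace)
                result.Append(char.ToUpperInvariant(c));
            previousWasSpace = isSpace;
        }
        return result.ToString();
    }
}
```

AcronymBuilder.Acronym("Mothers against drunk driving") returns "MADD".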
Your best way to do so would be to set up a loop to loop over every letter. If it is the first letter in the string, OR the first letter after a space, add that letter to a temp string, which is returned at the end.
eg (basic c++)
string toAcronym(string sentence)
{
string acronym = "";
bool wasSpace = true;
for (int i = 0; i < sentence.length(); i++)
{
if (wasSpace == true)
{
if (sentence[i] != ' ')
{
wasSpace = false;
acronym += toupper(sentence[i]);
}
else
{
wasSpace = true;
}
}
else if (sentence[i] == ' ')
{
wasSpace = true;
}
}
return acronym;
}
This could be further improved by checking to make sure the letter to add to the acronym is a letter, and not a number / symbol. OR...
If it is the first letter in the string, add it to the acronym. Then, run a loop for "find next of" a space. Then, add the next character. Continuously loop the "find next of" space until it returns null / eof / end of string, then return.

Parse comma separated string with a complication in C#

I know how to get substrings from a comma-separated string, but here's a complication: what if a substring contains a comma?
If a substring contains a comma, a new line or double quotes, the entire substring is enclosed in double quotes.
If a substring contains a double quote, the double quote is escaped with another double quote.
Worst case scenario would be if I have something like this:
first,"second, second","""third"" third","""fourth"", fourth"
In this case substrings are:
first
second, second
"third" third
"fourth", fourth
second, second is enclosed in double quotes; I don't want those double quotes in a list/array.
"third" third is enclosed in double quotes because it contains double quotes, and those are escaped with additional double quotes. Again, I don't want the enclosing double quotes in a list/array, and I don't want the double quotes that escape double quotes, but I do want the original double quotes that are part of the substring.
One way using TextFieldParser:
using (var reader = new StringReader("first,\"second, second\",\"\"\"third\"\" third\",\"\"\"fourth\"\", fourth\""))
using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader))
{
parser.Delimiters = new[] { "," };
parser.HasFieldsEnclosedInQuotes = true;
while (!parser.EndOfData)
{
foreach (var field in parser.ReadFields())
Console.WriteLine(field);
}
}
Output:
first
second, second
"third" third
"fourth", fourth
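Try this (note that it only splits on the "," sequence, so the unquoted first field stays attached to the second one and the doubled quotes are not unescaped):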
string input = "first,\"second, second\",\"\"\"third\"\" third\",\"\"\"fourth\"\", fourth\"";
string[] output = input.Split(new string[] {"\",\""}, StringSplitOptions.RemoveEmptyEntries);
I would suggest you to construct a small state machine for this problem. You would have states like:
Out - before the first field is reached
InQuoted - you were Out and " arrived; now you're in and the field is quoted
InQuotedMaybeOut - you were InQuoted and " arrived; now you wait for the next character to figure whether it is another " or something else; if else, then select the next valid state (character could be space, new line, comma, so you decide the next state); otherwise, if " arrived, you push " to the output and step back to InQuoted
In - after Out, when any character has arrived except , and ", you are automatically inside a new field which is not quoted.
This will certainly read CSV correctly. You can also make the separator configurable, so that you support TSV or semicolon-separated format.
Also keep in mind one very important case in CSV format: Quoted field may contain new line! Another special case to keep an eye on: empty field (like: ,,).
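The states listed above could be turned into code roughly as follows. This is only a sketch under some assumptions: it parses a single record held in one string (a line break inside a quoted field is kept as part of the field, but line breaks between records are not treated specially), and the names CsvLine and Parse are illustrative.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvLine
{
    private enum State { Out, In, InQuoted, InQuotedMaybeOut }

    public static List<string> Parse(string input, char separator = ',')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        var state = State.Out;

        foreach (char c in input)
        {
            switch (state)
            {
                case State.Out: // at the start of a field
                    if (c == '"') state = State.InQuoted;
                    else if (c == separator) fields.Add(string.Empty); // empty field
                    else { current.Append(c); state = State.In; }
                    break;

                case State.In: // inside an unquoted field
                    if (c == separator) { fields.Add(current.ToString()); current.Clear(); state = State.Out; }
                    else current.Append(c);
                    break;

                case State.InQuoted: // inside a quoted field
                    if (c == '"') state = State.InQuotedMaybeOut;
                    else current.Append(c);
                    break;

                case State.InQuotedMaybeOut: // saw a quote inside a quoted field
                    if (c == '"') { current.Append('"'); state = State.InQuoted; } // escaped quote
                    else if (c == separator) { fields.Add(current.ToString()); current.Clear(); state = State.Out; }
                    else { current.Append(c); state = State.In; }
                    break;
            }
        }

        // Whatever is buffered at the end of the input is the last field.
        fields.Add(current.ToString());
        return fields;
    }
}
```

For the string from the question, this yields the four fields first / second, second / "third" third / "fourth", fourth. Passing '\t' or ';' as the separator covers the TSV and semicolon-separated variants mentioned above.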
This is not the most elegant solution but it might help you. I would loop through the characters and do an odd-even count of the quotes. For example you have a bool that is true if you have encountered an odd number of quotes and false for an even number of quotes.
Any comma encountered while this bool value is true should not be considered as a separator. If you know it is a separator you can do several things with that information. Below I replaced the delimiter with something more manageable (not very efficient though):
bool odd = false;
char replacementDelimiter = '|'; // Or some very unlikely character
char[] chars = str.ToCharArray(); // strings are immutable, so edit a copy
for (int i = 0; i < chars.Length; ++i)
{
    if (chars[i] == '"')
        odd = !odd;
    else if (chars[i] == ',' && !odd)
        chars[i] = replacementDelimiter; // a comma outside quotes is a real separator
}
string[] commaSeparatedTokens = new string(chars).Split(replacementDelimiter);
At this point you should have an array of strings that are separated on the commas that you have intended. From here on it will be simpler to handle the quotes.
I hope this can help you.
Mini parser
using System;
using System.Collections.Generic;
using System.Text;
namespace ConsoleApp
{
class Program
{
private static IEnumerable<string> Parse(string input)
{
if (string.IsNullOrWhiteSpace(input))
{
// empty string => nothing to do
yield break;
}
int count = input.Length;
StringBuilder sb = new StringBuilder();
int j;
for (int i = 0; i < count; i++)
{
char c = input[i];
if (c == ',')
{
yield return sb.ToString();
sb.Clear();
}
else if (c == '"')
{
// begin quoted string
sb.Clear();
for (j = i + 1; j < count; j++)
{
if (input[j] == '"')
{
// quote
if (j < count - 1 && input[j + 1] == '"')
{
// double quote
sb.Append('"');
j++;
}
else
{
break;
}
}
else
{
sb.Append(input[j]);
}
}
yield return sb.ToString();
// clear buffer and skip to next comma
sb.Clear();
for (i = j + 1; i < count && input[i] != ','; i++) ;
}
else
{
sb.Append(c);
}
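}
// Emit a trailing unquoted field, which the loop above never yields.
if (sb.Length > 0)
{
yield return sb.ToString();
}
}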
[STAThread]
static void Main(string[] args)
{
foreach (string str in Parse("first,\"second, second\",\"\"\"third\"\" third\",\"\"\"fourth\"\", fourth\""))
{
Console.WriteLine(str);
}
Console.WriteLine();
Console.WriteLine("Press any key to continue...");
Console.ReadKey();
}
}
}
Result
first
second, second
"third" third
"fourth", fourth
Thank you for your answers, but before I got to see them I wrote this solution, it's not pretty but it works for me.
string line = "first,\"second, second\",\"\"\"third\"\" third\",\"\"\"fourth\"\", fourth\"";
var substringArray = new List<string>();
string substring = null;
var doubleQuotesCount = 0;
for (var i = 0; i < line.Length; i++)
{
if (line[i] == ',' && (doubleQuotesCount % 2) == 0)
{
substringArray.Add(substring);
substring = null;
doubleQuotesCount = 0;
continue;
}
else
{
if (line[i] == '"')
doubleQuotesCount++;
substring += line[i];
//If it is a last character
if (i == line.Length - 1)
{
substringArray.Add(substring);
substring = null;
doubleQuotesCount = 0;
}
}
}
for(var i = 0; i < substringArray.Count; i++)
{
if (substringArray[i] != null)
{
//remove first double quote
if (substringArray[i][0] == '"')
{
substringArray[i] = substringArray[i].Substring(1);
}
//remove last double quote
if (substringArray[i][substringArray[i].Length - 1] == '"')
{
substringArray[i] = substringArray[i].Remove(substringArray[i].Length - 1);
}
//Replace double double quotes with single double quote
substringArray[i] = substringArray[i].Replace("\"\"", "\"");
}
}

How to find the sub-string (0,91) in a long string?

I wrote this program in C#:
static void Main(string[] args)
{
int i;
string ss = "fc7600109177";
// I want to found (0,91) in ss string
for (i=0; i<= ss.Length; i++)
if (((char)ss[i] == '0') && (((char)ss[i+1] + (char)ss[i+2]) == "91" ))
Console.WriteLine(" found");
}
What's wrong in this program and how can I find (0,91)?
First of all, you don't have to cast ss[i] or the others to char; they are already char.
Second, you try to concatenate two chars (ss[i+1] and ss[i+2]) in your if condition and then check equality with a string. This is wrong: adding two char values yields an int, which cannot be compared with a string. Change it to:
if ((ss[i] == '0') && (ss[i + 1] == '9') && (ss[i + 2] == '1'))
    Console.WriteLine("found");
Note that the loop condition must then be i < ss.Length - 2; otherwise ss[i + 2] reads past the end of the string.
Third, and I think most important, don't write code like that. You can easily use the String.Contains method, which does exactly what you want.
Returns a value indicating whether the specified String object occurs
within this string.
string ss = "fc7600109177";
bool found = ss.Contains("091");
Using "Contains" returns only true or false, and "IndexOf" returns the
location of the string, but I want to find the location of "091" in ss.
And if "091" repeats, as in ss = "763091d44a0914", how can I find the second "091"?
Here is how you can find all the indexes in your string:
string chars = "091";
string ss = "763091d44a0914";
List<int> indexes = new List<int>();
foreach ( Match match in Regex.Matches(ss, chars) )
{
indexes.Add(match.Index);
}
for (int i = 0; i < indexes.Count; i++)
{
Console.WriteLine("{0}. match in index {1}", i+1, indexes[i]);
}
Output will be:
1. match in index 3
2. match in index 10
Use String.Contains() for this purpose:
if (ss.Contains("091"))
{
    Console.WriteLine(" found");
}
If you want to know where "091" starts in the string, you can use:
var pos = ss.IndexOf("091");
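To locate every occurrence rather than just the first, IndexOf can be called repeatedly with a start index. The following is a minimal sketch; the names SubstringSearch and FindAll are made up for illustration, and ordinal comparison is an assumption that suits strings of hex-like characters such as the one in the question.

```csharp
using System;
using System.Collections.Generic;

public static class SubstringSearch
{
    // Returns the start index of every occurrence of pattern in text,
    // including overlapping ones.
    public static List<int> FindAll(string text, string pattern)
    {
        var positions = new List<int>();
        if (string.IsNullOrEmpty(pattern))
            return positions;

        int index = text.IndexOf(pattern, StringComparison.Ordinal);
        while (index >= 0)
        {
            positions.Add(index);
            // Continue the search one character after the previous hit.
            index = text.IndexOf(pattern, index + 1, StringComparison.Ordinal);
        }
        return positions;
    }
}
```

SubstringSearch.FindAll("763091d44a0914", "091") returns the indexes 3 and 10.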

Parsing strings recursively

I am trying to extract information out of a string - a fortran formatting string to be specific. The string is formatted like:
F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'
with formatting fields delimited by "," and formatting groups inside brackets, with the number in front of the brackets indicating how many consecutive times the formatting pattern is repeated. So, the string above expands to:
F8.3, I5, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 'XXX'
I am trying to make something in C# that will expand a string that conforms to that pattern. I have started going about it with lots of switch and if statements, but am wondering whether I am going about it the wrong way.
I was basically wondering whether some Regex wizard thinks that regular expressions can do this in one fell swoop? I know nothing about regular expressions, but if this could solve my problem I am considering putting in some time to learn how to use them... on the other hand, if regular expressions can't sort this out then I'd rather spend my time looking at another method.
This has to be doable with Regex :)
I've expanded my previous example and it tests nicely with your example.
// regex to match the inner most patterns of n(X) and capture the values of n and X.
private static readonly Regex matcher = new Regex(@"(\d+)\(([^(]*?)\)", RegexOptions.None);
// create new string by repeating X n times, separated with ','
private static string Join(Match m)
{
var n = Convert.ToInt32(m.Groups[1].Value); // get value of n
var x = m.Groups[2].Value; // get value of X
return String.Join(",", Enumerable.Repeat(x, n));
}
// expand the string by recursively replacing the innermost values of n(X).
private static string Expand(string text)
{
var s = matcher.Replace(text, Join);
return (matcher.IsMatch(s)) ? Expand(s) : s;
}
// parse a string for occurrences of the n(X) pattern and expand them.
// return the string as a tokenized array.
public static string[] Parse(string text)
{
// Check that the number of parentheses is even.
if (text.Sum(c => (c == '(' || c == ')') ? 1 : 0) % 2 == 1)
throw new ArgumentException("The string contains an odd number of parentheses.");
return Expand(text).Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
}
I would suggest using a recursive method like the example below (not tested):
ResultData Parse(String value, ref Int32 index)
{
ResultData result = new ResultData();
Int32 startIndex = index; // Used to get substrings
while (index < value.Length)
{
Char current = value[index];
if (current == '(')
{
index++;
result.Add(Parse(value, ref index));
startIndex = index;
continue;
}
if (current == ')')
{
// Push last result
index++;
return result;
}
// Process all other chars here
index++; // advance past this character so the loop terminates
}
// We can't find the closing bracket
throw new Exception("String is not valid");
}
You may need to modify some parts of the code, but I used this method when writing a simple compiler. It's not complete, just an example.
Personally, I would suggest using a recursive function instead. Every time you hit an opening parenthesis, call the function again to parse that part. I'm not sure if you can use a regex to match a recursive data structure.
(Edit: Removed incorrect regex)
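The recursive idea could be sketched like this. It is one possible reading of the suggestion, not the answerer's code: the names FormatExpander, Expand, ParseGroup and Flush are invented here, a group without a leading number is assumed to repeat once, and quoted literals that themselves contain commas or brackets are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormatExpander
{
    public static List<string> Expand(string format)
    {
        int pos = 0;
        List<string> fields = ParseGroup(format, ref pos);
        if (pos < format.Length)
            throw new FormatException("Unmatched ')' at index " + pos);
        return fields;
    }

    // Reads fields until the ')' that closes the current group (left
    // unconsumed at pos) or until the end of the input.
    private static List<string> ParseGroup(string s, ref int pos)
    {
        var fields = new List<string>();
        var token = new StringBuilder();

        while (pos < s.Length)
        {
            char c = s[pos];
            if (c == '(')
            {
                // Text collected so far is the repeat count; none means 1.
                string countText = token.ToString().Trim();
                int repeat = countText.Length == 0 ? 1 : int.Parse(countText);
                token.Clear();

                pos++; // skip '('
                List<string> inner = ParseGroup(s, ref pos); // recurse into the group
                if (pos >= s.Length)
                    throw new FormatException("Missing ')'");
                pos++; // skip ')'

                for (int i = 0; i < repeat; i++)
                    fields.AddRange(inner);
            }
            else if (c == ')')
            {
                break; // the caller consumes the ')'
            }
            else if (c == ',')
            {
                Flush(token, fields);
                pos++;
            }
            else
            {
                token.Append(c);
                pos++;
            }
        }

        Flush(token, fields);
        return fields;
    }

    // Adds the pending token (if any) to the field list.
    private static void Flush(StringBuilder token, List<string> fields)
    {
        string text = token.ToString().Trim();
        if (text.Length > 0)
            fields.Add(text);
        token.Clear();
    }
}
```

For the format string in the question, string.Join(", ", FormatExpander.Expand("F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'")) gives F8.3, I5, then 5X, A20, F10.3, A20, F10.3 three times, then 'XXX'.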
Ended up rewriting this today. It turns out that this can be done in one single method:
private static string ExpandBrackets(string Format)
{
int maxLevel = CountNesting(Format);
for (int currentLevel = maxLevel; currentLevel > 0; currentLevel--)
{
int level = 0;
int start = 0;
int end = 0;
for (int i = 0; i < Format.Length; i++)
{
char thisChar = Format[i];
switch (Format[i])
{
case '(':
level++;
if (level == currentLevel)
{
string group = string.Empty;
int repeat = 0;
/// Isolate the number of repeats if any
/// If there are 0 repeats the set to 1 so group will be replaced by itself with the brackets removed
for (int j = i - 1; j >= 0; j--)
{
char c = Format[j];
if (c == ',')
{
start = j + 1;
break;
}
if (char.IsDigit(c))
repeat = int.Parse(c + (repeat != 0 ? repeat.ToString() : string.Empty));
else
throw new Exception("Non-numeric character " + c + " found in front of the brackets");
}
if (repeat == 0)
repeat = 1;
/// Isolate the format group
/// Parse until the first closing bracket. Level is decremented as this effectively takes us down one level
for (int j = i + 1; j < Format.Length; j++)
{
char c = Format[j];
if (c == ')')
{
level--;
end = j;
break;
}
group += c;
}
/// Substitute the expanded group for the original group in the format string
/// If the group is empty then just remove it from the string
if (string.IsNullOrEmpty(group))
{
Format = Format.Remove(start - 1, end - start + 2);
i = start;
}
else
{
string repeatedGroup = RepeatString(group, repeat);
Format = Format.Remove(start, end - start + 1).Insert(start, repeatedGroup);
i = start + repeatedGroup.Length - 1;
}
}
break;
case ')':
level--;
break;
}
}
}
return Format;
}
CountNesting() returns the highest level of bracket nesting in the format statement, but could be passed in as a parameter to the method. RepeatString() just repeats a string the specified number of times and substitutes it for the bracketed group in the format string.
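The two helpers are not shown in the answer. Based only on the description above, they might look something like the following sketch. Both bodies are guesses: in particular, joining the repeats with "," is an assumption made so that the expanded fields stay comma-separated, and the class name FormatHelpers is invented.

```csharp
using System;
using System.Linq;

public static class FormatHelpers
{
    // Returns the deepest level of bracket nesting found in the format string.
    public static int CountNesting(string format)
    {
        int level = 0;
        int max = 0;
        foreach (char c in format)
        {
            if (c == '(')
            {
                level++;
                max = Math.Max(max, level);
            }
            else if (c == ')')
            {
                level--;
            }
        }
        return max;
    }

    // Repeats the group the given number of times, separated by commas (assumed).
    public static string RepeatString(string group, int repeat)
    {
        return string.Join(",", Enumerable.Repeat(group, repeat));
    }
}
```

With these definitions, FormatHelpers.CountNesting("F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'") is 2 and FormatHelpers.RepeatString("A20,F10.3", 2) is "A20,F10.3,A20,F10.3".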
