How can I find all permutations of spinning text in C#?

I have a spinning text: {T1{M1|{A1|B1}|M2}F1|{X1|X2}}
My question is: how can I find all permutations in C#? For this input the expected results are:
T1M1F1
T1M2F1
T1A1F1
T1B1F1
X1
X2
Any suggestions?
Edit:
Thank you for your help, but M1, A1, etc. are only placeholders.
With real words, the input could look like this:
{my name is james vick and i am a {member|user|visitor} on this {forum|website|site} and i am loving it | i am admin and i am a {supervisor|admin|moderator} on this {forum|website|site} and i am loving it}.
my name is james vick and i am a {member|user|visitor} on this {forum|website|site} and i am loving it => 3 * 3 => 9 permutations
i am admin and i am a {supervisor|admin|moderator} on this {forum|website|site} and i am loving it => 3 * 3 => 9 permutations
Result : 18 permutations

A method to generate all permutations of spinnable strings
I've implemented a simple method to solve this problem.
It takes an ArrayList argument containing one or more spinnable text strings.
I use it to generate all the permutations of multiple spinnable strings.
It also supports optional blocks, surrounded by "[ ]" brackets.
E.g.:
If you have a single string object in the ArrayList with this content:
{A | {B1 | B2 } [B optional] }
It populates the ArrayList with all the "extracted" permutations.
Contents after invoking the method:
A
B1
B1 B optional
B2
B2 B optional
You can also pass multiple strings as argument to generate permutations for all of them:
E.g.:
Input:
ArrayList with two strings:
{A1 | A2}
{B1 | B2}
Contents after invocation:
A1
A2
B1
B2
This implementation works by always finding the innermost bracket pair in the first spinnable section and extracting it. I repeat this until all the special {} and [] characters are removed.
private void ExtractVersions(ArrayList list)
{
ArrayList IndicesToRemove = new ArrayList();
for (int i = 0; i < list.Count; i++)
{
string s = list[i].ToString();
int firstIndexOfCurlyClosing = s.IndexOf('}');
int firstIndexOfBracketClosing = s.IndexOf(']');
if ((firstIndexOfCurlyClosing > -1) || (firstIndexOfBracketClosing > -1))
{
char type = ' ';
int endi = -1;
int starti = -1;
if ((firstIndexOfBracketClosing == -1) && (firstIndexOfCurlyClosing > -1))
{ // Only Curly
endi = firstIndexOfCurlyClosing;
type = '{';
}
else
{
if ((firstIndexOfBracketClosing > -1) && (firstIndexOfCurlyClosing == -1))
{ // Only bracket
endi = firstIndexOfBracketClosing;
type = '[';
}
else
{
// Both
endi = Math.Min(firstIndexOfBracketClosing, firstIndexOfCurlyClosing);
type = s[endi];
if (type == ']')
{
type = '[';
}
else
{
type = '{';
}
}
}
starti = s.Substring(0, endi).LastIndexOf(type);
if (starti == -1)
{
throw new Exception("Brackets are not valid.");
}
// start index, end index and type found. -> make changes
if (type == '[')
{
// Add two new lines, one with the optional part, one without it
list.Add(s.Remove(starti, endi - starti+1));
list.Add(s.Remove(starti, 1).Remove(endi-1, 1));
IndicesToRemove.Add(i);
}
else
if (type == '{')
{
// Add one new line per alternative. This must be an innermost bracket pair.
string alternatives = s.Substring(starti + 1, endi - starti - 1);
foreach(string alt in alternatives.Split('|'))
{
list.Add(s.Remove(starti,endi-starti+1).Insert(starti,alt));
}
IndicesToRemove.Add(i);
}
} // End of if( >-1 && >-1)
} // End of for loop
for (int i = IndicesToRemove.Count-1; i >= 0; i--)
{
list.RemoveAt((int)IndicesToRemove[i]);
}
}
I hope I've helped.
Maybe it is not the simplest or best implementation, but it works well for me. Feedback and votes are welcome!

In my opinion, you should proceed like this:
All nested choice lists i.e. between { } should be "flattened" to a single choice list. Like in your example:
{M1|{A1|B1}|M2} -> {M1|A1|B1|M2}
Use recursion to generate all possible combinations. For example, starting from an empty array, first place T1 since it is the only option. Then from the nested list {M1|A1|B1|M2} choose each element in turn and place it in the next position, and then finally place F1. Repeat until all possibilities are exhausted.
This is just a rough hint, you need to fill in the rest of the details.
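One way to fill in those details is sketched below. Note that it is a recursive-descent variant rather than the flattening step described above: each {...} group is expanded recursively and combined with the text around it, so a nested group does not have to be a whole alternative on its own. The class and method names (SpinExpander, Expand, ParseSequence, ParseGroup) are made up for this illustration, and a '|' outside any braces is treated as an error by assumption.

```csharp
using System;
using System.Collections.Generic;

public static class SpinExpander
{
    // Returns every variant of a spin text such as "{T1{M1|{A1|B1}|M2}F1|{X1|X2}}".
    public static List<string> Expand(string text)
    {
        int pos = 0;
        List<string> result = ParseSequence(text, ref pos);
        if (pos != text.Length)
            throw new FormatException("Unbalanced '}' or stray '|' at index " + pos);
        return result;
    }

    // Parses literals and groups until '|', '}' or the end of the input.
    private static List<string> ParseSequence(string text, ref int pos)
    {
        var variants = new List<string> { "" };
        while (pos < text.Length && text[pos] != '|' && text[pos] != '}')
        {
            List<string> piece;
            if (text[pos] == '{')
            {
                piece = ParseGroup(text, ref pos);
            }
            else
            {
                // Plain text: read up to the next special character.
                int start = pos;
                while (pos < text.Length && "{|}".IndexOf(text[pos]) < 0) pos++;
                piece = new List<string> { text.Substring(start, pos - start) };
            }

            // Combine everything built so far with every variant of this piece.
            var combined = new List<string>(variants.Count * piece.Count);
            foreach (string prefix in variants)
                foreach (string suffix in piece)
                    combined.Add(prefix + suffix);
            variants = combined;
        }
        return variants;
    }

    // Parses "{alt|alt|...}" and returns the variants of all alternatives.
    private static List<string> ParseGroup(string text, ref int pos)
    {
        pos++; // skip '{'
        var variants = new List<string>();
        while (true)
        {
            variants.AddRange(ParseSequence(text, ref pos));
            if (pos >= text.Length)
                throw new FormatException("Missing '}'");
            if (text[pos] == '}') { pos++; return variants; }
            pos++; // skip '|'
        }
    }
}
```

Calling SpinExpander.Expand("{T1{M1|{A1|B1}|M2}F1|{X1|X2}}") yields the six strings listed in the question. Whitespace around '|' is kept as part of the alternatives, so trim the results if that matters for your texts.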


Matching Unicode characters in a regular expression

I retrieve strings from a website using the HttpClient class. The webserver sends them in UTF-8 encoding. The strings have the form abc | a and I'd like to remove the pipe, the space and the character after the space from them, if they are at the end of the string.
sText = Regex.Replace (sText, @"\| .$", "");
works as expected. Now, in some cases, the pipe and the space is followed by another character, for example a smiley. The string has then the form abc | 😉. The regular expression above does not work and I have to use
sText = Regex.Replace (sText, @"\| ..$", "");
instead (two dots).
I'm quite sure it has something to do with the encoding and with the fact that the smiley uses more bytes in UTF-8 than a latin character - and the fact that c# doesn't know the encoding. The smiley is just one character, even if it uses more bytes, so after telling c# the correct encoding (or converting the string), the first regular expression should work in both cases.
How can this be done?
Like it was suggested in the comments, this problem is hard to solve using Regex. What you call "looks like one item" is actually a grapheme cluster. The corresponding .NET term is a "text element" that can be parsed and iterated through using StringInfo.GetTextElementEnumerator.
A possible solution based on text elements can be quite simple: extract the last three text elements from the input string and check that the first is a pipe and the second is a space; the third can be anything. An implementation of this approach follows.
void Main()
{
var inputs = new[] {
"abc | a",
"abc | ab", // The only one that shouldn't be trimmed
"abc | 😉",
"abc | " + "\uD83D\uDD75\u200D\u2642\uFE0F" // "man-detective" (on Windows)
};
foreach (var input in inputs)
{
var res = TrimTrailingTextElement(input);
Console.WriteLine("Input : " + input);
Console.WriteLine("Result: " + res);
Console.WriteLine();
}
}
string TrimTrailingTextElement(string input)
{
// A circular buffer for storing the last 3 text elements
var lastThreeElementIdxs = new int[3] { -1, -1, -1 };
// Get enumerator of text elements in the input string
var enumerator = StringInfo.GetTextElementEnumerator(input);
// Iterate through the entire input string,
// at each step save to the buffer the current element index
var i = -1;
while (enumerator.MoveNext())
{
i = (i + 1) % 3;
lastThreeElementIdxs[i] = enumerator.ElementIndex;
}
// The buffer index must be positive for a non-empty input
if (i >= 0)
{
// Extract indexes of the last 3 elements
// from the circular buffer
var i1 = lastThreeElementIdxs[(i + 1) % 3];
var i2 = lastThreeElementIdxs[(i + 2) % 3];
var i3 = lastThreeElementIdxs[i];
if (i1 >= 0 && i2 >= 0 && i3 >= 0 && // All 3 indexes must be initialized
i3 - i2 == 1 && i2 - i1 == 1 && // The 1 and 2 elements must be 1 char long
input[i1] == '|' && // The 1 element must be a pipe
input[i2] == ' ') // The 2 element must be a space
{
return input.Substring(0, i1);
}
}
return input;
}

c# optimizing cycle working with big numbers

I have this code that finds numbers in a given range that contain only the digits 3 and 5 and are palindromes (symmetrical, 3553 for example). The problem is that the numbers are between 1 and 10^18, so there are cases in which I have to work with big numbers, and using BigInteger makes the program way too slow. Is there a way to fix this? Here's my code:
namespace Lucky_numbers
{
class Program
{
static void Main(string[] args)
{
string startString = Console.ReadLine();
string finishString = Console.ReadLine();
BigInteger start = BigInteger.Parse(startString);
BigInteger finish = BigInteger.Parse(finishString);
int numbersFound = 0;
for (BigInteger i = start; i <= finish; i++)
{
if (Lucky(i.ToString()))
{
if (Polyndrome(i.ToString()))
{
numbersFound++;
}
}
}
}
static bool Lucky(string number)
{
if (number.Contains("1") || number.Contains("2") || number.Contains("4") || number.Contains("6") || number.Contains("7") || number.Contains("8") || number.Contains("9") || number.Contains("0"))
{
return false;
}
else
{
return true;
}
}
static bool Polyndrome(string number)
{
bool symetrical = true;
int middle = number.Length / 2;
int rightIndex = number.Length - 1;
for (int leftIndex = 0; leftIndex <= middle; leftIndex++)
{
if (number[leftIndex] != number[rightIndex])
{
symetrical = false;
break;
}
rightIndex--;
}
return symetrical;
}
}
}
Edit: Turns out it's not BigInteger, it's my shitty implementation.
You could use ulong:
Size: Unsigned 64-bit integer
Range: 0 to 18,446,744,073,709,551,615
But I would guess that BigInteger is not the problem here. I think you should write an algorithm that generates palindromes directly instead of the brute-force increment-and-check solution.
Bonus
Here is a palindrome generator I wrote in 5 minutes. I think it will be much faster than your approach. Could you test it and tell me how much faster it is? I'm curious about that.
public class PalyndromeGenerator
{
private List<string> _results;
private bool _isGenerated;
private int _length;
private char[] _characters;
private int _middle;
private char[] _currentItem;
public PalyndromeGenerator(int length, params char[] characters)
{
if (length <= 0)
throw new ArgumentException("length");
if (characters == null)
throw new ArgumentNullException("characters");
if (characters.Length == 0)
throw new ArgumentException("characters");
_length = length;
_characters = characters;
}
public List<string> Results
{
get
{
if (!_isGenerated)
throw new InvalidOperationException();
return _results.ToList();
}
}
public void Generate()
{
_middle = (int)Math.Ceiling(_length / 2.0) - 1;
_results = new List<string>((int)Math.Pow(_characters.Length, _middle + 1));
_currentItem = new char[_length];
GeneratePosition(0);
_isGenerated = true;
}
private void GeneratePosition(int position)
{
if(position == _middle)
{
for (int i = 0; i < _characters.Length; i++)
{
_currentItem[position] = _characters[i];
_currentItem[_length - position - 1] = _characters[i];
_results.Add(new string(_currentItem));
}
}
else
{
for(int i = 0; i < _characters.Length; i++)
{
_currentItem[position] = _characters[i];
_currentItem[_length - position - 1] = _characters[i];
GeneratePosition(position + 1);
}
}
}
}
Usage:
var generator = new PalyndromeGenerator(6, '3', '5');
generator.Generate();
var items = generator.Results.Select(x => ulong.Parse(x)).ToList();
Strange riddle, but it can be simplified if I understand the requirement.
I would first map these numbers to binary, as there are only two possible
"lucky" digits, then generate the numbers by counting in binary until
I have completed nine bits. Reflect each value to get the full number, then
convert 0 to 3 and 1 to 5.
Example: 1101
Reflect it: 10111101 --> 53555535
Do this from 0 all the way to 111111111.
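A minimal sketch of this counting idea follows. It goes slightly beyond the nine-bit example: it enumerates every total length from 1 to 18 digits, so that odd lengths and shorter numbers are covered too, and it uses the bits of a counter to choose the free half of the digits. The names LuckyPalindromes and Count are invented for the illustration, and it assumes the range fits in a ulong, which holds for the 1 to 10^18 limit from the question.

```csharp
using System;

public static class LuckyPalindromes
{
    // Counts palindromes made only of the digits 3 and 5 within [start, finish].
    public static int Count(ulong start, ulong finish)
    {
        int found = 0;
        // 10^18 itself is not "lucky", so 18 digits is the longest candidate.
        for (int length = 1; length <= 18; length++)
        {
            int half = (length + 1) / 2;           // digits that can be chosen freely
            var digits = new char[length];
            for (int bits = 0; bits < (1 << half); bits++)
            {
                for (int i = 0; i < half; i++)
                {
                    // Bit i decides the digit: 0 -> '3', 1 -> '5'.
                    char d = ((bits >> i) & 1) == 0 ? '3' : '5';
                    digits[i] = d;
                    digits[length - 1 - i] = d;    // mirror onto the other side
                }
                ulong value = ulong.Parse(new string(digits));
                if (value >= start && value <= finish) found++;
            }
        }
        return found;
    }
}
```

Only 2044 candidates exist in the whole range, so the loop finishes immediately regardless of how far apart start and finish are.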
Declare start and finish to be static inside the class.
Change the method Lucky to:
static bool Lucky(string number)
{
return !(number.Contains("1") || number.Contains("2") || number.Contains("4") || number.Contains("6") || number.Contains("7") || number.Contains("8") || number.Contains("9") || number.Contains("0"));
}
Also, you can use the Task Parallel Library to parallelize the computation.
Instead of using a regular for loop, you could use Parallel.For.
Look at the problem a different way: how many strings of up to 9 characters (using only '3' and '5') can you make? For each string there are 2 palindromes you can make (one repeating the last character, one not).
e.g.
3 -> 3, 33
5 -> 5, 55
33 -> 333, 3333
35 -> 353, 3553
53 -> 535, 5335
...
The only suggestion I have is to use a 3rd party library like intx, or some unmanaged code. The intx author reports that it can work faster than BigInteger in some situations: "System.Numerics.BigInteger class was introduced in .NET 4.0 so I was interested in performance of this solution. I did some tests (grab test code from GitHub) and it appears that BigInteger has performance in general comparable with IntX on standard operations but starts losing when FHT comes into play (when multiplying really big integers, for example)."
Since the number has to be symmetrical, you only need to check the first half of the number. You don't need to check 18 digits, you only have to check to 9 digits and then swap the order of the characters and add them to the back as a string.
One thing I can think of is if you are only going to count integers that are containing 3 or 5 you don't need to traverse the entire list of numbers between your beginning & ending range.
Instead, look at your character set as either '3' or '5'. Then you can simply go through the allowed permutations of half of the number itself, leaving the other half to be mirrored to create a palindrome.
There are some rules to this method which would help, such as :
if the starting number's left-most digit was greater than 5 there is no need to attempt for that specific number of digits.
if both numbers fall on the same amount of digits but left-most digits do not traverse / include 5 or 3, no need to process.
Developing some set of rules such as this may help other than attempting to check every possible permutation.
So, for example, your Lucky function would become something more along the lines of :
static bool Lucky(string number)
{
if((number[0] != '3') && (number[0] != '5'))
{
return false;
} //and you could continue this for the entire string
...
}

How to read and parse specific integer values from a text file and add them to a ListBox in C#?

I'd like to know how to read and parse specific integer values from a text file and add them to a ListBox in C#. For example, I have a text file MyText.txt like this:
<>
101
192
-
399
~
99
128
-
366
~
101
192
-
403
~
And I want to parse each integer value between '-' and '~' and add it as an item to the list box, for example:
#listBox1
399
366
403
Notice that each line of value separated by Carriage Return and Line Feed. And by the way, it is a data transmitted through RS-232 Serial Communication from microcontroller. Sorry, I'm just new in c# programming. Thanks in advance.
Here's a way to do it with LINQ:
bool keep = false;
listBox1.Items.AddRange(
    File.ReadLines("MyText.txt")
        .Where(l =>
        {
            if (l == "-") keep = true;
            else if (l == "~") keep = false;
            else return keep;
            return false;
        })
        .ToArray());
You could use regular expressions like so:
var s = System.Text.RegularExpressions.Regex.Matches(stringtomatch, @"(?<=-\s*)[0-9]+\b(?=\s*~)");
The regex looks for a number. A lookbehind requires a dash (-) followed by optional whitespace before it, the pattern then matches all the digits up to the next non-word character, and a lookahead requires optional whitespace and a tilde (~) after it. Because lookarounds are not part of the match, only the number is returned (not the whitespace and symbols).
So this method returns a collection of matches. You could then use it like so:
for (int i = 0; i < s.Count; i++)
{
listBox1.Items.Add(s[i]);
}
EDIT:
typo in the regex and updated the loop (for some reason, foreach doesn't work with the MatchCollection).
you can try running this test script:
var stringtomatch = " asdjasdk jh kjh asd\n-\n123123\n~\nasdasd";
var s = System.Text.RegularExpressions.Regex.Matches(stringtomatch, @"(?<=-\s*)[0-9]+\b(?=\s*~)");
Console.WriteLine(stringtomatch);
for (int i = 0; i < s.Count; i++)
{
listBox1.Items.Add(s[i]);
}
Try
List<Int32> values = new List<Int32>();
bool open = false;
String[] lines = File.ReadAllLines(fileName);
foreach (String line in lines)
{
    if ((!open) && (line == "-"))
    {
        open = true;
    }
    else if ((open) && (line == "~"))
    {
        open = false;
    }
    else if (open)
    {
        Int32 v;
        if (Int32.TryParse(line, out v))
        {
            values.Add(v);
        }
    }
}
// AddRange expects an object[]; Cast<object>() needs System.Linq
listBox1.Items.AddRange(values.Cast<object>().ToArray());
This is an easy piece of code that reads a file, converts to integers (although you could stay with strings) and handles lists. You should start with some basic .NET/C# tutorials.
Edit: To add the values to the list box you can also use values.ForEach(v => listBox1.Items.Add(v.ToString())); if you use .NET 3.5. Otherwise write a foreach loop yourself.

Parsing a chemical formula from a string in C#? [duplicate]

I am trying to parse a chemical formula (in the format, for example: Al2O3 or O3 or C or C11H22O12) in C# from a string. It works fine unless there is only one atom of a particular element (e.g. the oxygen atom in H2O). How can I fix that problem, and in addition, is there a better way to parse a chemical formula string than I am doing?
ChemicalElement is a class representing a chemical element. It has properties AtomicNumber (int), Name (string), Symbol (string).
ChemicalFormulaComponent is a class representing a chemical element and atom count (e.g. part of a formula). It has properties Element (ChemicalElement), AtomCount (int).
The rest should be clear enough to understand (I hope) but please let me know with a comment if I can clarify anything, before you answer.
Here is my current code:
/// <summary>
/// Parses a chemical formula from a string.
/// </summary>
/// <param name="chemicalFormula">The string to parse.</param>
/// <exception cref="FormatException">The chemical formula was in an invalid format.</exception>
public static Collection<ChemicalFormulaComponent> FormulaFromString(string chemicalFormula)
{
Collection<ChemicalFormulaComponent> formula = new Collection<ChemicalFormulaComponent>();
string nameBuffer = string.Empty;
int countBuffer = 0;
for (int i = 0; i < chemicalFormula.Length; i++)
{
char c = chemicalFormula[i];
if (!char.IsLetterOrDigit(c) || !char.IsUpper(chemicalFormula, 0))
{
throw new FormatException("Input string was in an incorrect format.");
}
else if (char.IsUpper(c))
{
// Add the chemical element and its atom count
if (countBuffer > 0)
{
formula.Add(new ChemicalFormulaComponent(ChemicalElement.ElementFromSymbol(nameBuffer), countBuffer));
// Reset
nameBuffer = string.Empty;
countBuffer = 0;
}
nameBuffer += c;
}
else if (char.IsLower(c))
{
nameBuffer += c;
}
else if (char.IsDigit(c))
{
if (countBuffer == 0)
{
countBuffer = c - '0';
}
else
{
countBuffer = (countBuffer * 10) + (c - '0');
}
}
}
return formula;
}
I rewrote your parser using regular expressions. Regular expressions fit the bill perfectly for what you're doing. Hope this helps.
public static void Main(string[] args)
{
var testCases = new List<string>
{
"C11H22O12",
"Al2O3",
"O3",
"C",
"H2O"
};
foreach (string testCase in testCases)
{
Console.WriteLine("Testing {0}", testCase);
var formula = FormulaFromString(testCase);
foreach (var element in formula)
{
Console.WriteLine("{0} : {1}", element.Element, element.AtomCount);
}
Console.WriteLine();
}
/* Produced the following output
Testing C11H22O12
C : 11
H : 22
O : 12
Testing Al2O3
Al : 2
O : 3
Testing O3
O : 3
Testing C
C : 1
Testing H2O
H : 2
O : 1
*/
}
private static Collection<ChemicalFormulaComponent> FormulaFromString(string chemicalFormula)
{
Collection<ChemicalFormulaComponent> formula = new Collection<ChemicalFormulaComponent>();
string elementRegex = "([A-Z][a-z]*)([0-9]*)";
string validateRegex = "^(" + elementRegex + ")+$";
if (!Regex.IsMatch(chemicalFormula, validateRegex))
throw new FormatException("Input string was in an incorrect format.");
foreach (Match match in Regex.Matches(chemicalFormula, elementRegex))
{
string name = match.Groups[1].Value;
int count =
match.Groups[2].Value != "" ?
int.Parse(match.Groups[2].Value) :
1;
formula.Add(new ChemicalFormulaComponent(ChemicalElement.ElementFromSymbol(name), count));
}
return formula;
}
The problem with your method is here:
// Add the chemical element and its atom count
if (countBuffer > 0)
When the element has no number, countBuffer will be 0. I think this will work:
// Add the chemical element and its atom count
if (countBuffer > 0 || nameBuffer != String.Empty)
This will work for formulas like HO2.
I also believe that your method never adds the last element of the chemical formula to the formula collection.
You should add the contents of the buffers to the collection before returning the result (countBuffer is still 0 for an element without a number, so you may want to store 1 in that case), like this:
formula.Add(new ChemicalFormulaComponent(ChemicalElement.ElementFromSymbol(nameBuffer), countBuffer));
return formula;
}
First of all: I haven't used a parser generator in .NET, but I'm pretty sure you could find something appropriate. This would allow you to write the grammar of chemical formulas in a far more readable form. See for example this question for a first start.
If you want to keep your approach: is it possible that you never add your last element, whether it has a number or not? You might want to run your loop with i <= chemicalFormula.Length and, when i == chemicalFormula.Length, also add what you have buffered to your formula. You then also have to remove your if (countBuffer > 0) condition, because countBuffer can actually be zero!
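The loop-based fix described here can be sketched as follows. This is only an illustration under a few assumptions: ChemicalElement and ChemicalFormulaComponent are not available, so plain (symbol, count) pairs stand in for them, and SimpleFormulaParser, Parse and the hasDigits flag are names invented for the example. The loop runs one step past the end of the string so the last element is flushed, and an element without a number is stored with a count of 1.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleFormulaParser
{
    // Returns (symbol, count) pairs; a missing count means one atom.
    public static List<KeyValuePair<string, int>> Parse(string formula)
    {
        var result = new List<KeyValuePair<string, int>>();
        if (string.IsNullOrEmpty(formula) || !char.IsUpper(formula[0]))
            throw new FormatException("Input string was in an incorrect format.");

        string name = string.Empty;
        int count = 0;
        bool hasDigits = false;

        // Run one step past the end so the last element is flushed as well.
        for (int i = 0; i <= formula.Length; i++)
        {
            bool atEnd = i == formula.Length;
            char c = atEnd ? '\0' : formula[i];

            if (atEnd || char.IsUpper(c))
            {
                // A new element starts (or the input ends): store the previous one.
                if (name.Length > 0)
                    result.Add(new KeyValuePair<string, int>(name, hasDigits ? count : 1));
                name = atEnd ? string.Empty : c.ToString();
                count = 0;
                hasDigits = false;
            }
            else if (char.IsLower(c) && !hasDigits)
            {
                name += c;
            }
            else if (c >= '0' && c <= '9')
            {
                count = count * 10 + (c - '0');
                hasDigits = true;
            }
            else
            {
                throw new FormatException("Input string was in an incorrect format.");
            }
        }
        return result;
    }
}
```

With this version SimpleFormulaParser.Parse("H2O") returns H with count 2 followed by O with count 1.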
Regex should work fine for simple formulas. If you want to split something like:
(Zn2(Ca(BrO4))K(Pb)2Rb)3
it might be easier to use the parser for it (because of compound nesting). Any parser should be capable of handling it.
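To illustrate why nesting is easier with a parser, here is a small hand-written recursive-descent sketch (not generated by any parser tool). It is an assumption-laden example: NestedFormulaParser and its helper methods are invented names, and instead of building ChemicalFormulaComponent objects it simply returns the total atom count per element symbol.

```csharp
using System;
using System.Collections.Generic;

public static class NestedFormulaParser
{
    // Returns the total atom count per element symbol.
    public static Dictionary<string, int> Parse(string formula)
    {
        int pos = 0;
        Dictionary<string, int> totals = ParseGroup(formula, ref pos);
        if (pos != formula.Length)
            throw new FormatException("Unexpected ')' at index " + pos);
        return totals;
    }

    // Parses atoms and parenthesised groups until ')' or the end of the input.
    private static Dictionary<string, int> ParseGroup(string s, ref int pos)
    {
        var totals = new Dictionary<string, int>();
        while (pos < s.Length && s[pos] != ')')
        {
            if (s[pos] == '(')
            {
                pos++; // skip '('
                Dictionary<string, int> inner = ParseGroup(s, ref pos);
                if (pos >= s.Length)
                    throw new FormatException("Missing ')'");
                pos++; // skip ')'
                int multiplier = ParseCount(s, ref pos);
                foreach (KeyValuePair<string, int> pair in inner)
                    Add(totals, pair.Key, pair.Value * multiplier);
            }
            else if (char.IsUpper(s[pos]))
            {
                // Element symbol: one uppercase letter plus optional lowercase letters.
                int start = pos++;
                while (pos < s.Length && char.IsLower(s[pos])) pos++;
                string symbol = s.Substring(start, pos - start);
                Add(totals, symbol, ParseCount(s, ref pos));
            }
            else
            {
                throw new FormatException("Unexpected character at index " + pos);
            }
        }
        return totals;
    }

    // Reads an optional run of digits; a missing number means 1.
    private static int ParseCount(string s, ref int pos)
    {
        int start = pos;
        while (pos < s.Length && s[pos] >= '0' && s[pos] <= '9') pos++;
        return pos == start ? 1 : int.Parse(s.Substring(start, pos - start));
    }

    private static void Add(Dictionary<string, int> totals, string symbol, int count)
    {
        int current;
        totals.TryGetValue(symbol, out current); // current stays 0 if the symbol is new
        totals[symbol] = current + count;
    }
}
```

For the formula above, (Zn2(Ca(BrO4))K(Pb)2Rb)3, the sketch reports Zn 6, Ca 3, Br 3, O 12, K 3, Pb 6 and Rb 3.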
I spotted this problem a few days ago and thought it would be a good example of how one can write a grammar for a parser, so I included a simple chemical formula grammar in my NLT suite. The key rules are, for the lexer:
"(" -> LPAREN;
")" -> RPAREN;
/[0-9]+/ -> NUM, Convert.ToInt32($text);
/[A-Z][a-z]*/ -> ATOM;
and for the parser:
comp -> e:elem { e };
elem -> LPAREN e:elem RPAREN n:NUM? { new Element(e,$(n : 1)) }
| e:elem++ { new Element(e,1) }
| a:ATOM n:NUM? { new Element(a,$(n : 1)) }
;

Is there a better way than String.Replace to remove backspaces from a string?

I have a string read from another source such as "\b\bfoo\bx". In this case, it would translate to the word "fox" as the first 2 \b's are ignored, and the last 'o' is erased, and then replaced with 'x'. Also another case would be "patt\b\b\b\b\b\b\b\b\b\bfoo" should be translated to "foo"
I have come up with something using String.Replace, but it is complex and I am worried it is not working correctly, also it is creating a lot of new string objects which I would like to avoid.
Any ideas?
Probably the easiest is to just iterate over the entire string. Given your inputs, the following code does the trick in 1-pass
public string ReplaceBackspace(string hasBackspace)
{
    if (string.IsNullOrEmpty(hasBackspace))
        return hasBackspace;
    StringBuilder result = new StringBuilder(hasBackspace.Length);
    foreach (char c in hasBackspace)
    {
        if (c == '\b')
        {
            if (result.Length > 0)
                result.Length--;
        }
        else
        {
            result.Append(c);
        }
    }
    return result.ToString();
}
The way I would do it is low-tech, but easy to understand.
Create a stack of characters. Then iterate through the string from beginning to end. If the character is a normal character (non-slash), push it onto the stack. If it is a slash, and the next character is a 'b', pop the top of the stack. If the stack is empty, ignore it.
At the end, pop each character in turn, add it to a StringBuilder, and reverse the result.
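A minimal sketch of the stack idea is shown below. It assumes the input contains real backspace characters (U+0008, written '\b' in C#) as in the question, rather than the two-character sequence of a slash followed by 'b'. The names BackspaceStack and Apply are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class BackspaceStack
{
    // Applies backspace characters to the text that precedes them.
    public static string Apply(string input)
    {
        var stack = new Stack<char>();
        foreach (char c in input)
        {
            if (c != '\b')
                stack.Push(c);          // normal character: keep it
            else if (stack.Count > 0)
                stack.Pop();            // backspace: erase the previous character
            // a backspace on an empty stack is ignored
        }

        char[] chars = stack.ToArray(); // top of the stack comes first
        Array.Reverse(chars);           // restore the original order
        return new string(chars);
    }
}
```

Stack<char>.ToArray() returns the items in pop order, which is why the array is reversed before the string is built.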
Regular expressions version:
var data = @"patt\b\b\b\b\b\b\b\b\b\bfoo";
var regex = new Regex(@"(^|[^\\b])\\b");
while (regex.IsMatch(data))
{
    data = regex.Replace(data, "");
}
Optimized version (and this one works with backspace '\b' and not with string "\b"):
var data = "patt\b\b\b\b\b\b\b\b\b\bfoo";
var regex = new Regex(@"[^\x08]\x08", RegexOptions.Compiled);
while (data.Contains('\b'))
{
    data = regex.Replace(data.TrimStart('\b'), "");
}
public static string ProcessBackspaces(string source)
{
    char[] buffer = new char[source.Length];
    int idx = 0;
    foreach (char c in source)
    {
        if (c != '\b')
        {
            buffer[idx] = c;
            idx++;
        }
        else if (idx > 0)
        {
            idx--;
        }
    }
    return new string(buffer, 0, idx);
}
EDIT
I've done a quick, rough benchmark of the code posted in answers so far (processing the two example strings from the question, one million times each):
ANSWER | TIME (ms)
------------------------|-----------
Luke (this one) | 318
Alexander Taran | 567
Robert Paulson | 683
Markus Nigbur | 2100
Kamarey (new version) | 7075
Kamarey (old version) | 30902
You could iterate through the string backward, making a character array as you go. Every time you hit a backspace, increment a counter, and every time you hit a normal character, skip it if your counter is non-zero and decrement the counter.
I'm not sure what the best C# data structure is to manage this and then be able to get the string in the right order afterward quickly. StringBuilder has an Insert method but I don't know if it will be performant to keep inserting characters at the start or not. You could put the characters in a stack and hit ToArray() at the end -- that might or might not be faster.
String myString = "patt\b\b\b\b\b\b\b\b\b\bfoo";
List<char> chars = myString.ToCharArray().ToList();
int delCount = 0;
for (int i = chars.Count - 1; i >= 0; i--)
{
    if (chars[i] == '\b')
    {
        // Remember that one more preceding character has to be erased
        delCount++;
        chars.RemoveAt(i);
    }
    else if (delCount > 0)
    {
        // This character is erased by a later backspace
        chars.RemoveAt(i);
        delCount--;
    }
}
string result = new string(chars.ToArray());
I'd go like this (code is not tested):
char[] result = new char[input.Length];
int r = 0;
for (int i = 0; i < input.Length; i++)
{
    if (input[i] == '\b')
    {
        if (r > 0) r--; // erase the previous character, if any
    }
    else
    {
        result[r++] = input[i];
    }
}
string resultString = new string(result, 0, r);
Create a StringBuilder and copy over everything but backspace chars.
