I need to build an automatic code generator driven by a user-configurable string with predefined keys, and I cannot find a good way to do it.
For example, a string
OT-{CustomCode}-{Date}-{##}
could generate codes
OT-C0001-20100420-01
OT-C0001-20100420-02
I thought of using Regex.Replace(), but I would have problems if a customer's code were {##}.
Any help is welcome! (And sorry for my English.)
You can use string.Format():
string generated = string.Format("OT-{0}-{1}-{2}", code, date, num);
The {0}, {1} and {2} items are placeholders; each is replaced by the argument at that position after the format string.
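As a small sketch of how the placeholders can also carry format specifiers, the example below pads the counter and formats the date inside the format string itself. The class and method names (FormatDemo, Build) are made up for illustration, and the invariant culture is an assumption so that the date always uses the Gregorian calendar.

```csharp
using System;
using System.Globalization;

public static class FormatDemo
{
    // Hypothetical helper: builds a code such as OT-C0001-20100420-01.
    public static string Build(string customerCode, DateTime date, int sequence)
    {
        // {1:yyyyMMdd} formats the date; {2:00} zero-pads the number to two digits.
        return string.Format(CultureInfo.InvariantCulture,
            "OT-{0}-{1:yyyyMMdd}-{2:00}", customerCode, date, sequence);
    }
}
```

Calling FormatDemo.Build("C0001", new DateTime(2010, 4, 20), 1) should give OT-C0001-20100420-01. This only covers a fixed layout; it does not by itself handle a user-configurable template.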
Do you mean that an auto-generated code definition such as:
Foo {##} , Bar {Date}
and that will produce:
Foo 01 , Bar 20100420
Foo 02 , Bar 20100420
Is that right?
I think Regex.Replace() is a good solution. For the ## problem you can do something like this:
private void Generate()
{
Regex doubleSharpRegEx = new Regex("{#+}");
string customString = "Foo {####}";
string[] generatedCodes = new string[3];
for (int i = 0; i < generatedCodes.Length; i++)
{
string newString = doubleSharpRegEx.Replace(customString,
match =>
{
// Calculate zero padding for format
// remove brackets
string zeroPadding = match.Value.Substring(1, match.Value.Length - 2);
// replace # with zero
zeroPadding = zeroPadding.Replace('#', '0');
return string.Format("{0:" + zeroPadding + "}", i);
});
generatedCodes[i] = newString;
}
}
And the array generatedCodes contains:
Foo 0000
Foo 0001
Foo 0002
EDIT:
Lambda expressions require C# 3.0 (.NET Framework 3.5) or later.
If you need a solution for 2.0, you only have to replace the lambda expression with a delegate (making i available to the delegate method, e.g. as a class member).
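A possible shape for the 2.0 variant is sketched below. It uses an anonymous method (the delegate keyword), which can capture a local variable or parameter, so under that assumption the counter does not have to become a class member. SharpPadding and Fill are hypothetical names, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SharpPadding
{
    // Matches a run of '#' inside braces, e.g. {##} or {####}.
    private static readonly Regex Sharps = new Regex(@"\{(#+)\}");

    // Hypothetical helper: replaces each {#...#} token with the zero-padded value.
    public static string Fill(string template, int value)
    {
        // Anonymous method instead of a lambda; it captures 'value'.
        MatchEvaluator evaluator = delegate(Match match)
        {
            // One zero per '#': "####" becomes the format "0000".
            string zeros = new string('0', match.Groups[1].Length);
            return value.ToString(zeros);
        };
        return Sharps.Replace(template, evaluator);
    }
}
```

For example, SharpPadding.Fill("Foo {####}", 2) should return Foo 0002.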
EDIT 2:
You can combine the two answers, for example with the following code:
private void Generate2()
{
Regex customCodeRegex = new Regex("{CustomCode}");
Regex dateRegex = new Regex("{Date}");
Regex doubleSharpRegex = new Regex("{#+}");
string customString = "Foo-{##}-{Date}-{CustomCode}-{####}";
string newString = customCodeRegex.Replace(customString, "{0}");
newString = dateRegex.Replace(newString, "{1}");
newString = doubleSharpRegex.Replace(newString,
match =>
{
string zeroPadding = match.Value.Substring(1, match.Value.Length - 2);
zeroPadding = zeroPadding.Replace('#', '0');
return "{2:" + zeroPadding + "}";
});
string customCode = "C001";
string date = DateTime.Today.ToString("yyyyMMdd");
string[] generatedCodes = new string[3];
for (int i = 0; i < generatedCodes.Length; i++)
{
generatedCodes[i] = string.Format(newString, customCode, date, i);
}
}
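Regarding the original worry about a customer code that is itself {##}: one way around it, sketched here under my own assumptions, is to expand every token in a single Regex.Replace pass. The replacement text is not scanned again, so a value that looks like a token is inserted literally. CodeTemplate and Expand are illustrative names only.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class CodeTemplate
{
    // Matches {CustomCode}, {Date}, or a run of '#' in braces such as {##}.
    private static readonly Regex Token = new Regex(@"\{(CustomCode|Date|#+)\}");

    // Hypothetical helper: expands all known tokens in one pass over the template.
    public static string Expand(string template, string customCode, DateTime date, int sequence)
    {
        return Token.Replace(template, m =>
        {
            string key = m.Groups[1].Value;
            if (key == "CustomCode") return customCode;
            if (key == "Date") return date.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
            // One '#' per digit: {##} uses the "D2" format, i.e. two zero-padded digits.
            return sequence.ToString("D" + key.Length, CultureInfo.InvariantCulture);
        });
    }
}
```

With the template OT-{CustomCode}-{Date}-{##}, the code C0001, the date 2010-04-20 and the sequence 1, this should produce OT-C0001-20100420-01. Unknown tokens are left untouched because the pattern does not match them.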
The StringBuilder class provides an efficient replace:
string code = "C0001";
DateTime date = DateTime.Now;
int count = 1;
String formatString = "OT-{CustomCode}-{Date}-{##}";
StringBuilder sb = new StringBuilder(formatString);
sb.Replace("{CustomCode}", code);
sb.Replace("{Date}", date.ToString("yyyyMMdd"));
sb.Replace("{##}", count.ToString("00")); // Replace takes strings, so format the number first
string result = sb.ToString();
But this is more useful if you're doing multiple replacements of the same tokens. It looks like you need String.Format, as suggested by Elisha.
I have some strings containing code for emoji icons, like :grinning:, :kissing_heart:, or :bouquet:. I'd like to process them to remove the emoji codes.
For example, given:
Hello:grinning: , how are you?:kissing_heart: Are you fine?:bouquet:
I want to get this:
Hello , how are you? Are you fine?
I know I can use this code:
richTextBox2.Text = richTextBox1.Text.Replace(":kissing_heart:", "").Replace(":bouquet:", "").Replace(":grinning:", "").ToString();
However, there are 856 different emoji icons I have to remove (which, using this method, would take 856 calls to Replace()). Is there any other way to accomplish this?
You can use Regex to match any text between two colons (:anything:). With the Replace overload that takes a function, you can add further validation.
string pattern = @":(.*?):";
string input = "Hello:grinning: , how are you?:kissing_heart: Are you fine?:bouquet: Are you super fan, for example. :words not to replace:";
string output = Regex.Replace(input, pattern, (m) =>
{
if (m.ToString().Split(' ').Count() > 1) // more than 1 word and other validations that will help preventing parsing the user text
{
return m.ToString();
}
return String.Empty;
}); // "Hello , how are you? Are you fine? Are you super fan, for example. :words not to replace:"
If you don't want to use the Replace overload that takes a lambda expression, you can use \w, as @yorye-nathan mentioned, to match only word characters.
string pattern = @":(\w*):";
string input = "Hello:grinning: , how are you?:kissing_heart: Are you fine?:bouquet: Are you super fan, for example. :words not to replace:";
string output = Regex.Replace(input, pattern, String.Empty); // "Hello , how are you? Are you fine? Are you super fan, for example. :words not to replace:"
I would solve it this way:
string Text = "Hello:grinning: , how are you?:kissing_heart: Are you fine?:bouquet:";
List<string> Emoj = new List<string>() { ":kissing_heart:", ":bouquet:", ":grinning:" };
Emoj.ForEach(x => Text = Text.Replace(x, string.Empty));
UPDATE - referring to Detail's comment
Another approach: replace only the emoji that actually occur in the text
List<string> Emoj = new List<string>() { ":kissing_heart:", ":bouquet:", ":grinning:" };
var Matches = Regex.Matches(Text, @":(\w*):").Cast<Match>().Select(x => x.Value);
Emoj.Intersect(Matches).ToList().ForEach(x => Text = Text.Replace(x, string.Empty));
But I'm not sure it makes a big difference for such short chat strings, and it's more important to have code that's easy to read and maintain. The OP's question was about reducing the redundant Text.Replace().Replace() chain, not about finding the most efficient solution.
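Another option, sketched here as an assumption rather than something from the answers above, is to build one alternation pattern from the known emoji names and remove them all in a single Regex.Replace call. EmojiStripper, BuildPattern and Remove are hypothetical names, and the names are expected without the surrounding colons.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EmojiStripper
{
    // Hypothetical helper: builds :(?:grinning|kissing_heart|...): from the names.
    // Assumes the list is not empty; an empty list would match a bare "::".
    public static Regex BuildPattern(IEnumerable<string> emojiNames)
    {
        // Escape each name so it is matched literally.
        string alternation = string.Join("|", emojiNames.Select(name => Regex.Escape(name)));
        return new Regex(":(?:" + alternation + "):");
    }

    // Removes every known emoji code; other :text: sequences are left alone.
    public static string Remove(string input, Regex pattern)
    {
        return pattern.Replace(input, string.Empty);
    }
}
```

The pattern can be built once at startup and reused. Because only known names match, user text such as :words not to replace: survives, and an input like Tricky test:123:grinning: should come out as Tricky test:123.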
I would use a combination of some of the techniques already suggested. First, I'd store the 800+ emoji strings in a database and load them at runtime. Use a HashSet to hold them in memory, so that we have an O(1) lookup time (very fast). Use Regex to pull all potential pattern matches out of the input, then compare each one to our hashed emoji, removing the valid ones and leaving any non-emoji patterns the user has entered themselves...
public class Program
{
//hashset for in memory representation of emoji,
//lookups are O(1), so very fast
private HashSet<string> _emoji = null;
public Program(IEnumerable<string> emojiFromDb)
{
//load emoji from datastore (db/file,etc)
//into memory at startup
_emoji = new HashSet<string>(emojiFromDb);
}
public string RemoveEmoji(string input)
{
//pattern to search for
string pattern = @":(\w*):";
string output = input;
//use regex to find all potential patterns in the input
MatchCollection matches = Regex.Matches(input, pattern);
//only do this if we actually find the
//pattern in the input string...
if (matches.Count > 0)
{
//refine this to a distinct list of unique patterns
IEnumerable<string> distinct =
matches.Cast<Match>().Select(m => m.Value).Distinct();
//then check each one against the hashset, only removing
//registered emoji. This allows non-emoji versions
//of the pattern to survive...
foreach (string match in distinct)
if (_emoji.Contains(match))
output = output.Replace(match, string.Empty);
}
return output;
}
}
public class MainClass
{
static void Main(string[] args)
{
var program = new Program(new string[] { ":grinning:", ":kissing_heart:", ":bouquet:" });
string output = program.RemoveEmoji("Hello:grinning: :imadethis:, how are you?:kissing_heart: Are you fine?:bouquet: This is:a:strange:thing :to type:, but valid :nonetheless:");
Console.WriteLine(output);
}
}
Which results in:
Hello :imadethis:, how are you? Are you fine? This is:a:strange:thing :to type:,
but valid :nonetheless:
You do not have to replace all 856 emojis. You only have to replace those that appear in the string. So have a look at:
Finding a substring using C# with a twist
Basically, you extract all tokens, i.e. the strings between : and :, and then replace those with string.Empty.
If you are concerned that the search will return strings that are not emojis, such as :some other text:, then you could use a hash table lookup to make sure that replacing the found token is appropriate.
Finally got around to writing something up. I'm combining a couple of previously mentioned ideas with the fact that we should only loop over the string once. Given those requirements, this sounds like the perfect job for LINQ.
You should probably cache the HashSet. Other than that, this has O(n) performance and only goes over the list once. It would be interesting to benchmark, but this could very well be the most efficient solution.
The approach is pretty straightforward.
First, load all emoji into a HashSet so we can look them up quickly.
Split the string at each : with input.Split(':').
Decide whether to keep the current element.
If the last element was a match, keep the current element.
If the last element was not a match, check whether the current element matches.
If it does, ignore it. (This effectively removes the substring from the output.)
If it doesn't, append : back and keep it.
Rebuild the string with a StringBuilder.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
static class Program
{
static void Main(string[] args)
{
ISet<string> emojiList = new HashSet<string>(new[] { "kissing_heart", "bouquet", "grinning" });
Console.WriteLine("Hello:grinning: , ho:w: a::re you?:kissing_heart:kissing_heart: Are you fine?:bouquet:".RemoveEmoji(':', emojiList));
Console.ReadLine();
}
public static string RemoveEmoji(this string input, char delimiter, ISet<string> emojiList)
{
StringBuilder sb = new StringBuilder();
input.Split(delimiter).Aggregate(true, (prev, curr) =>
{
if (prev)
{
sb.Append(curr);
return false;
}
if (emojiList.Contains(curr))
{
return true;
}
sb.Append(delimiter);
sb.Append(curr);
return false;
});
return sb.ToString();
}
}
}
Edit: I did something cool using the Rx library, but then realized Aggregate is the IEnumerable counterpart of Scan in Rx, thus simplifying the code even more.
If efficiency is a concern and to avoid processing "false positives", consider rewriting the string using a StringBuilder while skipping the special emoji tokens:
static HashSet<string> emojis = new HashSet<string>()
{
"grinning",
"kissing_heart",
"bouquet"
};
static string RemoveEmojis(string input)
{
StringBuilder sb = new StringBuilder();
int length = input.Length;
int startIndex = 0;
int colonIndex = input.IndexOf(':');
if (colonIndex < 0) return input; //No colons at all, so there is nothing to remove
while (colonIndex >= 0 && startIndex < length)
{
//Keep normal text
int substringLength = colonIndex - startIndex;
if (substringLength > 0)
sb.Append(input.Substring(startIndex, substringLength));
//Advance the feed and get the next colon
startIndex = colonIndex + 1;
colonIndex = input.IndexOf(':', startIndex);
if (colonIndex < 0) //No more colons, so no more emojis
{
//Don't forget that first colon we found
sb.Append(':');
//Add the rest of the text
sb.Append(input.Substring(startIndex));
break;
}
else //Possible emoji, let's check
{
string token = input.Substring(startIndex, colonIndex - startIndex);
if (emojis.Contains(token)) //It's a match, so we skip this text
{
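//Advance the feed
startIndex = colonIndex + 1;
colonIndex = input.IndexOf(':', startIndex);
//No more colons: keep the remaining text, because the loop ends here
if (colonIndex < 0)
sb.Append(input.Substring(startIndex));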
}
else //No match, so we keep the normal text
{
//Don't forget the colon
sb.Append(':');
//Instead of doing another substring next loop, let's just use the one we already have
sb.Append(token);
startIndex = colonIndex;
}
}
}
return sb.ToString();
}
static void Main(string[] args)
{
List<string> inputs = new List<string>()
{
"Hello:grinning: , how are you?:kissing_heart: Are you fine?:bouquet:",
"Tricky test:123:grinning:",
"Hello:grinning: :imadethis:, how are you?:kissing_heart: Are you fine?:bouquet: This is:a:strange:thing :to type:, but valid :nonetheless:"
};
foreach (string input in inputs)
{
Console.WriteLine("In <- " + input);
Console.WriteLine("Out -> " + RemoveEmojis(input));
Console.WriteLine();
}
Console.WriteLine("\r\n\r\nPress enter to exit...");
Console.ReadLine();
}
Outputs:
In <- Hello:grinning: , how are you?:kissing_heart: Are you fine?:bouquet:
Out -> Hello , how are you? Are you fine?
In <- Tricky test:123:grinning:
Out -> Tricky test:123
In <- Hello:grinning: :imadethis:, how are you?:kissing_heart: Are you fine?:bouquet: This is:a:strange:thing :to type:, but valid :nonetheless:
Out -> Hello :imadethis:, how are you? Are you fine? This is:a:strange:thing :to type:, but valid :nonetheless:
Use the code below; I think it will solve your problem.
string s = "Hello:grinning: , how are you?:kissing_heart: Are you fine?:bouquet:";
string rmv = ""; string remove = "";
int i = 0; int k = 0;
A:
rmv = "";
for (i = k; i < s.Length; i++)
{
if (Convert.ToString(s[i]) == ":")
{
for (int j = i + 1; j < s.Length; j++)
{
if (Convert.ToString(s[j]) != ":")
{
rmv += s[j];
}
else
{
remove += rmv + ",";
i = j;
k = j + 1;
goto A;
}
}
}
}
string[] str = remove.Split(',');
for (int x = 0; x < str.Length-1; x++)
{
s = s.Replace(Convert.ToString(":" + str[x] + ":"), "");
}
Console.WriteLine(s);
Console.ReadKey();
I'd use an extension method like this:
public static class Helper
{
public static string MyReplace(this string dirty, char separator)
{
string newText = "";
bool replace = false;
for (int i = 0; i < dirty.Length; i++)
{
if(dirty[i] == separator) { replace = !replace ; continue;}
if(replace ) continue;
newText += dirty[i];
}
return newText;
}
}
Usage:
richTextBox2.Text = richTextBox2.Text.MyReplace(':');
This method should perform better than the Regex-based one. Note that it removes everything between each pair of separators, not only known emoji names.
I would split the text with the ':' and then build the string excluding the found emoji names.
const char marker = ':';
var textSections = text.Split(marker);
var emojiRemovedText = string.Empty;
var notMatchedCount = 0;
textSections.ToList().ForEach(section =>
{
if (emojiNames.Contains(section))
{
notMatchedCount = 0;
}
else
{
if (notMatchedCount++ > 0)
{
emojiRemovedText += marker.ToString();
}
emojiRemovedText += section;
}
});
I hope you can shed some light on this.
I have this variable:
string TestString = "KESRNAN FOREST S BV";
I want to replace the S that stands alone, so I tried the following:
public static string FixStreetName(string streetName)
{
string result = "";
string stringToCheck = streetName.ToUpper();
// result = stringToCheck.Replace(StreetDirection(stringToCheck), "").Replace(StreetType(stringToCheck),"").Trim();
result = stringToCheck.Replace("S", "").Replace("BV", "").Trim();
return result;
}
But this replaces every S in the string. Any ideas?
Use regular expressions; \b denotes a word boundary. Here is an example on C# Pad:
string x = "KESRNAN FOREST S BV";
var result = System.Text.RegularExpressions.Regex.Replace(x, @"\bS\b", "");
Console.WriteLine(result);
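Note that replacing only the S leaves the spaces around it behind. The sketch below is one possible extension, not part of the answer above: it removes several standalone words at once together with the whitespace in front of them, then trims the result. StreetNames and RemoveWords are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StreetNames
{
    // Hypothetical helper: removes each listed word when it stands alone.
    public static string RemoveWords(string input, params string[] words)
    {
        if (words.Length == 0) return input;
        // Escape each word so it is matched literally, then join them as S|BV.
        string alternation = string.Join("|", words.Select(w => Regex.Escape(w)));
        // \s* also consumes the whitespace before the word; \b keeps FOREST intact.
        string pattern = @"\s*\b(?:" + alternation + @")\b";
        return Regex.Replace(input, pattern, "").Trim();
    }
}
```

For the string from the question, StreetNames.RemoveWords("KESRNAN FOREST S BV", "S", "BV") should return KESRNAN FOREST.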
If you can easily identify certain "delimiter" characters, one possibility is to 1. split your input string into several parts using string.Split; then 2. pick the parts that you want, and finally 3. "glue" them back together using string.Join:
var partsToExclude = new string[] { "S", "BV" };
/* 1. */ var parts = stringToCheck.Split(' ');
/* 2. */ var selectedParts = parts.Where(part => !partsToExclude.Contains(part));
/* 3. */ return string.Join(" ", selectedParts.ToArray());
Using Regex:
string input = "S KESRNAN FOREST S BV S";
string result = Regex.Replace(input, @"\bS\b", "");
As you can see, the standalone S is followed by a space. In other words, the string you want to replace is "S ". Note that this also strips the final S from any word that ends in S and is followed by a space.
Try this:
string TestString = "KESRNAN FOREST S BV";
string replacement = TestString.Replace("S ", "");
Another way of doing what you want:
using System;
namespace ConsoleApplication2
{
class Program
{
static void Main(string[] args)
{
string testString = "S KESRNAN S FOREST BV S";
// deleting S in middle of string
for (int i = 1; i < testString.Length-1; i++)
{
if (testString[i]=='S'&&testString[i-1]==' '&&testString[i+1]==' ')
testString=testString.Remove(i,2);
}
// deleting S at the beginning of the string
if (testString.StartsWith("S "))
testString = testString.Remove(0, 2);
// deleting S at the end of string
if (testString.EndsWith(" S"))
testString = testString.Remove(testString.Length-2, 2);
Console.WriteLine(testString);
}
}
}
I have a string that looks something like this:
"PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N||||||||||";
I am trying to remove any separating '|' characters after the 30th '|' in the string so that the output string looks like this:
"PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N";
I am trying to do it using as little code as possible, but not having much luck. Any help or ideas would be great.
You can use the TrimEnd method
string text = "stuff||||N||||||||||";
string result = text.TrimEnd('|'); //Result is stuff||||N
Brute force but only a little bit of code:
string s2 = string.Join("|", s1.Split('|').Take(31));
If you need any other processing of this kind of data (it looks like a kind of nested CSV) then string.Split() is useful to know.
string str = "PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N||||||||||";
int c = 0;
int after = 30;
StringBuilder newStr = new StringBuilder();
for (int i = 0; i < str.Length; i++) {
if (str[i] == '|') {
// Keep only the first 30 pipes; any later pipe is dropped
if (after != c) {
newStr.Append(str[i]);
c++;
}
} else {
newStr.Append(str[i]);
}
}
results in
newStr == "PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N";
A regex should do the trick:
var regex = new Regex(@"^([^\|]*\|){0,30}[^\|]*");
var match = regex.Match(input);
if(match.Success)
{
var val = match.Value;
}
If what you really want is that everything after the 30th chunk loses its '|', then try:
var chunks = input.Split('|');
var output = String.Join("|", chunks.Take(30)) + String.Concat(chunks.Skip(30));
That said, I think it sounds like what you're really looking for is probably something like:
var output = input.TrimEnd('|');
// Get the indexes of all the | characters.
int[] pipeIndexes = Enumerable.Range(0, s.Length).Where(i => s[i] == '|').ToArray();
// If there are more than thirty pipes:
if (pipeIndexes.Length > 30)
{
// The former part of the string remains intact.
string formerPart = s.Substring(0, pipeIndexes[30]);
// The latter part needs to have all | characters removed.
string latterPart = s.Substring(pipeIndexes[30]).Replace("|", "");
s = formerPart + latterPart;
}
I have a string 731478718861993983 and I want to get 73-1478-7188-6199-3983 using C#. How can I format it like this?
Thanks.
By using regex:
public static string FormatTest1(string num)
{
string formatPattern = @"(\d{2})(\d{4})(\d{4})(\d{4})(\d{4})";
return Regex.Replace(num, formatPattern, "$1-$2-$3-$4-$5");
}
// test
string test = FormatTest1("731478718861993983");
// test result: 73-1478-7188-6199-3983
If you're dealing with a long number, you can use a NumberFormatInfo to format it:
First, define your NumberFormatInfo (you may want additional parameters, these are the basic 3):
NumberFormatInfo format = new NumberFormatInfo();
format.NumberGroupSeparator = "-";
format.NumberGroupSizes = new[] { 4 };
format.NumberDecimalDigits = 0;
Next, you can use it on your numbers:
long number = 731478718861993983;
string formatted = number.ToString("n", format);
Console.WriteLine(formatted);
After all, .NET has very good globalization support - you're better served using it!
string s = "731478718861993983";
var newString = string.Format("{0:##-####-####-####-####}", Convert.ToInt64(s));
LINQ-only one-liner:
var str = "731478718861993983";
var result =
new string(
str.ToCharArray().
Reverse(). // So that it will go over string right-to-left
Select((c, i) => new { @char = c, group = i / 4}). // Keep group number
Reverse(). // Restore original order
GroupBy(t => t.group). // Now do the actual grouping
Aggregate("", (s, grouping) => s + "-" + new string( // Append each group to the accumulated string
grouping.
Select(gr => gr.@char).
ToArray())).
ToArray()).
Trim('-');
This can handle strings of arbitrary length.
Simple (and naive) extension method :
class Program
{
static void Main(string[] args)
{
Console.WriteLine("731478718861993983".InsertChar("-", 4));
}
}
static class Ext
{
public static string InsertChar(this string str, string c, int i)
{
for (int j = str.Length - i; j > 0; j -= i) // j > 0 avoids a leading separator when the length is a multiple of i
{
str = str.Insert(j, c);
}
return str;
}
}
If you're dealing strictly with a string, you can make a simple Regex.Replace, to capture each group of 4 digits:
string str = "731478718861993983";
str = Regex.Replace(str, "(?!^).{4}", "-$0" ,RegexOptions.RightToLeft);
Console.WriteLine(str);
Note the use of RegexOptions.RightToLeft, to start capturing from the right (so "12345" will be replaced to 1-2345, and not -12345), and the use of (?!^) to avoid adding a dash in the beginning.
You may want to capture only digits - a possible pattern then may be @"\B\d{4}".
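For comparison, here is a regex-free sketch of the same right-to-left grouping, so that "12345" becomes 1-2345 rather than -12345. It is only an illustration under my own assumptions; DigitGrouping and GroupFromRight are hypothetical names, and the input is not checked for non-digit characters.

```csharp
using System;
using System.Text;

public static class DigitGrouping
{
    // Hypothetical helper: groups characters from the right, e.g. "12345" -> "1-2345".
    public static string GroupFromRight(string digits, int groupSize, char separator)
    {
        if (digits == null) throw new ArgumentNullException(nameof(digits));
        if (groupSize <= 0) throw new ArgumentOutOfRangeException(nameof(groupSize));

        var sb = new StringBuilder(digits.Length + digits.Length / groupSize);
        // Length of the leading group, which may be shorter than the others.
        int first = digits.Length % groupSize;
        if (first == 0) first = Math.Min(groupSize, digits.Length);
        sb.Append(digits, 0, first);
        // Every remaining group is full-sized and preceded by the separator.
        for (int i = first; i < digits.Length; i += groupSize)
        {
            sb.Append(separator);
            sb.Append(digits, i, groupSize);
        }
        return sb.ToString();
    }
}
```

DigitGrouping.GroupFromRight("731478718861993983", 4, '-') should return 73-1478-7188-6199-3983.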
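string myString = "731478718861993983";
// Insert returns a new string, so the result must be assigned back
myString = myString.Insert(2, "-");
myString = myString.Insert(7, "-");
myString = myString.Insert(12, "-");
myString = myString.Insert(17, "-");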
My first thought is:
String s = "731478718861993983";
s = s.Insert(2,"-");
s = s.Insert(7,"-");
s = s.Insert(12,"-");
s = s.Insert(17,"-");
(Insert uses a zero-based index, so the dashes go at positions 2, 7, 12 and 17.)
But there is probably an easier way to do this...
If the position of "-" is always the same then you can try
string s = "731478718861993983";
s = s.Insert(2, "-");
s = s.Insert(7, "-");
s = s.Insert(12, "-");
s = s.Insert(17, "-");
Here's how I'd do it. A numeric format string cannot be applied to a string directly, so the value has to be parsed into a number first.
string numbers = "731478718861993983";
string formattedNumbers = String.Format("{0:##-####-####-####-####}", long.Parse(numbers));
Edit: amended the code, since you said in your original question that the numbers are held as a string.
If I have the string "freq1" or "freq12" and so on, how can I strip out freq and get the number by itself?
string foo = "freq12";
string fooPart = foo.Substring(4); // "12"
int fooNumber = int.Parse(fooPart); // 12
If the "freq" part is not constant, then you can use regular expressions:
using System.Text.RegularExpressions;
string pattern = #"([A-Za-z]+)(\d+)";
string foo = "freq12";
Match match = Regex.Match(foo, pattern);
string fooPart = match.Groups[1].Value;
int fooNumber = int.Parse(match.Groups[2].Value);
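The snippet above assumes the input always matches. A more defensive sketch, under my own assumptions, checks Match.Success and uses int.TryParse so that bad input does not throw. FreqParser and TrySplit are hypothetical names, and the pattern is anchored so the whole string must be letters followed by ASCII digits.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class FreqParser
{
    // Letters followed by ASCII digits, covering the whole string.
    private static readonly Regex Pattern = new Regex(@"\A([A-Za-z]+)([0-9]+)\z");

    // Hypothetical helper: returns false instead of throwing on bad input.
    public static bool TrySplit(string input, out string prefix, out int number)
    {
        prefix = null;
        number = 0;
        if (input == null) return false;

        Match match = Pattern.Match(input);
        if (!match.Success) return false;

        prefix = match.Groups[1].Value;
        // TryParse also rejects values that do not fit in an int.
        return int.TryParse(match.Groups[2].Value, NumberStyles.None,
            CultureInfo.InvariantCulture, out number);
    }
}
```

For "freq12" this should yield the prefix freq and the number 12, while "freq" or "12" alone return false.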
Is it always going to be the text freq that precedes the number within the string? If so, your solution is very simple:
var str = "freq12";
var num = int.Parse(str.Substring(4));
Edit: Here's a more generic method in the case that the first part of the string isn't always "freq".
var str = "freq12";
int splitIndex;
for(splitIndex = 0; splitIndex < str.Length; splitIndex++)
{
if (char.IsDigit(str[splitIndex]))
break;
}
if (splitIndex == str.Length)
throw new InvalidOperationException("The input string does not contain a numeric part.");
var textPart = str.Substring(0, splitIndex);
var numPart = int.Parse(str.Substring(splitIndex));
In the given example, textPart should evaluate to freq and numPart to 12. Let me know if this still isn't what you want.
Try something like this:
String oldString = "freq1";
String newString = oldString.Replace("freq", String.Empty);
If you know that the word "freq" will always be there, then you can do something like:
string number = "freq1".Replace("freq","");
That will result in "1".