C# string analysis

I have a string such as " :)text :)text:) :-) word :-( " that I need to append to a textbox (or somewhere else), with one condition:
Instead of ':)', ':-(', etc., I need to call a function that inserts a specific symbol.
I think there is a solution using a finite-state machine, but I don't know how to implement it. Any advice is welcome.
Update: " :)text :)text:) :-) word :-( " => when we meet ':)' we call the function Smile(":)"), and it displays an image in the textbox.
Update: I like the idea with delegates and Regex.Replace. When it meets ':)', can I pass ':)' to the delegate as a parameter, and a different parameter when it meets ':('?
Update: I found a solution that converts the string to chars and compares each symbol against ':)'; if they are equal, it calls smile(':)').

You can use Regex.Replace with a delegate, where you can process the matched input, or you can simply use the string.Replace method.
Updated:
You can do something like this:
string text = "aaa :) bbb :( ccc";
string replaced = Regex.Replace(text, @":\)|:\(", match => match.Value == ":)" ? "case1" : "case2");
The replaced variable will have the value "aaa case1 bbb case2 ccc" after execution.

It seems that you want to replace portions of the string with those symbols, right? No need to build that yourself, just use string.Replace. Something like this:
string text = " :)text :)text:) :-) word :-( ";
text = text.Replace(":)", "☺").Replace(":(", "☹"); // similar for others
textbox.Text += text;
Note that this is not the most efficient code ever produced, but if this is for something like a chat program, you'll probably never know the difference.

You could just create a dictionary with these specific characters as the key and pull the value.
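The dictionary idea can be combined with the Regex.Replace delegate from the earlier answer. The following is only a sketch: the class name EmoticonReplacer and the symbol mapping are made up for illustration, and the values could just as well be delegates (for example a Smile function) instead of strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EmoticonReplacer
{
    // Hypothetical mapping from emoticon text to the symbol that replaces it.
    private static readonly Dictionary<string, string> Symbols = new Dictionary<string, string>
    {
        [":)"] = "☺",
        [":-)"] = "☺",
        [":("] = "☹",
        [":-("] = "☹",
    };

    // One alternation built from the escaped keys, longest keys first.
    private static readonly Regex Pattern = new Regex(
        string.Join("|", Symbols.Keys.OrderByDescending(k => k.Length).Select(Regex.Escape)));

    // The delegate receives each match and looks its text up in the dictionary.
    public static string Replace(string text) =>
        Pattern.Replace(text, m => Symbols[m.Value]);

    public static void Main()
    {
        Console.WriteLine(Replace(" :)text :)text:) :-) word :-( "));
    }
}
```

Adding a new emoticon then only means adding a dictionary entry; the pattern is rebuilt from the keys.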

Related

How to remove a pattern from a string using Regex

I want to find paths from a string and remove them, e.g.:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)..."
I'd like a regex to find the pattern '[some path]'!MyUDF, and remove '[path]'.
Thanks.
Edit:
Example input:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)";
Expected output: "MyUDF(param1, param2,...) + MyUDF(param3, param4,...)"
where MyUDF is a function name, so it consists of only letters
input=Regex.Replace(input,"'[^']+'!(?=MyUDF)","");
If the path is followed by ! and some other word, you can use
input=Regex.Replace(input,@"'[^']+'!(?=\w+)","");
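As a quick check of the lookahead approach, here is a small sketch. It assumes the '!' separator should disappear together with the quoted path (as in the expected output) and that the function name consists only of letters; the names PathStripper and StripPaths are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathStripper
{
    // Removes a single-quoted prefix and the '!' after it, but only when a
    // letters-only function name and an opening parenthesis come next.
    public static string StripPaths(string formula) =>
        Regex.Replace(formula, @"'[^']+'!(?=[A-Za-z]+\()", "");

    public static void Main()
    {
        string input = @"'c:\a\b\c'!MyUDF(param1, param2) + 'c:\a\b\c'!MyUDF(param3, param4)";
        Console.WriteLine(StripPaths(input));
        // MyUDF(param1, param2) + MyUDF(param3, param4)
    }
}
```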
Alright, if the ! is always in the string as you suggest, this Regex !(.*)?\( will get you what you want. Here is a Regex 101 to prove it.
To use it, you might do something like this:
var result = Regex.Match(myString, @"!(.*)?\(");
The feature you want, if you are dealing with file paths, is in System.IO.Path.
There are many methods there, but that is one of its specific purposes.

Regex in C# - remove quotes and escaped quotes from a value after another value

I am using HighCharts and am generating script from C# and there's an unfortunate thing where they use inline functions for formatters and events. Unfortunately, I can't output JSON like that from any serializer I know of. In other words, they want something like this:
"labels":{"formatter": function() { return Highcharts.numberFormat(this.value, 0); }}
And with my serializers available to me, I can only get here:
"labels":{"formatter":"function() { return Highcharts.numberFormat(this.value, 0); }"}
These are used for click events as well as formatters, and I absolutely need them.
So I'm thinking regex, but it's been years and years and also I was never a regex wizard.
What kind of Regex replace can I use on the final serialized string to replace any quoted value that starts with function() with the unquoted version of itself? Also, the function itself may have " in it, in which case the quoted string might have \" in it, which would need to also be replaced back down to ".
I'm assuming I can use a variant of the first answer here:
Finding quoted strings with escaped quotes in C# using a regular expression
but I can't seem to make it happen. Please help me for the love of god.
I've put more sweat into this, and I've come up with
serialized = Regex.Replace(serialized, @"""function\(\)[^""\\]*(?:\\.[^""\\]*)*""", "function()$1");
However, my end result is always:
formatter:function()$1
This tells me I'm matching the proper stuff, but my capture isn't working right. Now I feel like I'm probably being an idiot with some C# specific regex situation.
Update: Yes, I was being an idiot. I didn't have a capture around what I really wanted.
serialized = Regex.Replace(serialized, @"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1");
that gets my match, but in a case like this:
"formatter":"function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }"
it returns:
"formatter":function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }
and I need to get those nasty backslashes out of there. Now I think I'm truly stuck.
Regexp for match
"function\(\) (?<code>.*)"
Replace expression
function() ${code}
Try this : http://regexr.com?30jpf
What it does :
Finds double quotes JUST before a function declaration and immediately after it.
Regex :
(")(?=function()).+(?<=\})(")
Replace groups 1 & 3 with nothing :
3 capturing groups:
group 1: (")
group 2: ()
group 3: (")
string serialized = JsonSerializer.Serialize(chartDefinition);
serialized = Regex.Replace(serialized, @"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1").Replace("\\\"", "\"");
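One thing to watch with a trailing .Replace on the whole string is that it also unescapes quotes inside ordinary string values. A possible variation, sketched below, does the unescaping inside a match delegate so that only the function bodies are touched. FunctionUnquoter and Unquote are illustrative names, and only \" is unescaped here; other escape sequences are left as they are.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FunctionUnquoter
{
    // Matches a quoted value that starts with function() and may contain
    // backslash-escaped characters; group 1 is everything after "function()".
    private static readonly Regex QuotedFunction = new Regex(
        @"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""");

    // Drops the surrounding quotes and turns \" back into " within the match only.
    public static string Unquote(string json) =>
        QuotedFunction.Replace(json, m =>
            "function()" + m.Groups[1].Value.Replace("\\\"", "\""));

    public static void Main()
    {
        string json = "{\"title\":\"say \\\"hi\\\"\",\"formatter\":\"function() { alert(\\\"hi!\\\"); }\"}";
        Console.WriteLine(Unquote(json));
        // {"title":"say \"hi\"","formatter":function() { alert("hi!"); }}
    }
}
```

The escaped quotes in the title value stay escaped, so the rest of the JSON remains valid.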

Multiline string literal in C#

Is there an easy way to create a multiline string literal in C#?
Here's what I have now:
string query = "SELECT foo, bar"
+ " FROM table"
+ " WHERE id = 42";
I know PHP has
<<<BLOCK
BLOCK;
Does C# have something similar?
You can use the @ symbol in front of a string to form a verbatim string literal:
string query = @"SELECT foo, bar
FROM table
WHERE id = 42";
You also do not have to escape special characters when you use this method, except for double quotes as shown in Jon Skeet's answer.
It's called a verbatim string literal in C#, and it's just a matter of putting @ before the literal. Not only does this allow multiple lines, but it also turns off escaping. So for example you can do:
string query = @"SELECT foo, bar
FROM table
WHERE name = 'a\b'";
This includes the line breaks (using whatever line break your source has them as) into the string, however. For SQL, that's not only harmless but probably improves the readability anywhere you see the string - but in other places it may not be required, in which case you'd either need to not use a multi-line verbatim string literal to start with, or remove them from the resulting string.
The only bit of escaping is that if you want a double quote, you have to add an extra double quote symbol:
string quote = @"Jon said, ""This will work,"" - and it did!";
As a side-note, with C# 6.0 you can now combine interpolated strings with the verbatim string literal:
string camlCondition = $@"
<Where>
<Contains>
<FieldRef Name='Resource'/>
<Value Type='Text'>{(string)parameter}</Value>
</Contains>
</Where>";
The problem I find with using a verbatim string literal is that it can make your code look a bit "weird": to avoid getting extra spaces in the string itself, every line after the first has to be completely left-aligned:
var someString = @"The
quick
brown
fox...";
Yuck.
So the solution I like to use, which keeps everything nicely aligned with the rest of your code is:
var someString = String.Join(
Environment.NewLine,
"The",
"quick",
"brown",
"fox...");
And of course, if you just want to logically split up lines of an SQL statement like you are and don't actually need a new line, you can always just substitute Environment.NewLine for " ".
One other gotcha to watch for is the use of string literals in string.Format. In that case you need to escape the curly braces '{' and '}' by doubling them.
// this would throw a FormatException
string.Format(@"<script> function test(x)
{ return x * {0} } </script>", aMagicValue);
// this contrived example would work
string.Format(@"<script> function test(x)
{{ return x * {0} }} </script>", aMagicValue);
Why do people keep confusing strings with string literals? The accepted answer is a great answer to a different question; not to this one.
I know this is an old topic, but I came here with possibly the same question as the OP, and it is frustrating to see how people keep misreading it. Or maybe I am misreading it, I don't know.
Roughly speaking, a string is a region of computer memory that, during the execution of a program, contains a sequence of bytes that can be mapped to text characters. A string literal, on the other hand, is a piece of source code, not yet compiled, that represents the value used to initialize a string later on, during the execution of the program in which it appears.
In C#, the statement...
string query = "SELECT foo, bar"
+ " FROM table"
+ " WHERE id = 42";
... does not produce a three-line string but a one liner; the concatenation of three strings (each initialized from a different literal) none of which contains a new-line modifier.
What the OP seems to be asking (at least what I would be asking with those words) is not how to introduce, in the compiled string, line breaks that mimic those found in the source code, but how to break up a long, single line of text in the source code for clarity, without introducing breaks in the compiled string and without spending extra execution time joining the multiple substrings that come from the source code. Like the trailing backslashes within a multiline string literal in JavaScript or C++.
Suggesting the use of verbatim strings, never mind StringBuilders, String.Joins or even nested functions with string reversals and whatnot, makes me think that people are not really understanding the question. Or maybe I do not understand it.
As far as I know, C# does not (at least in the paleolithic version I am still using, from the previous decade) have a feature to cleanly produce multiline string literals that can be resolved during compilation rather than execution.
Maybe current versions do support it, but I thought I'd share the difference I perceive between strings and string literals.
UPDATE:
(From MeowCat2012's comment) You can. The "+" approach by OP is the best. According to spec the optimization is guaranteed: http://stackoverflow.com/a/288802/9399618
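A small way to see that the "+" form is resolved at compile time is to declare the result as const, since a const must be initialized with a constant expression. This is only an illustrative sketch; the class name ConstantConcat is made up.

```csharp
using System;

public static class ConstantConcat
{
    // A const requires a compile-time constant, so this declaration compiles
    // only because the three literals are folded into a single string.
    public const string Query = "SELECT foo, bar"
        + " FROM table"
        + " WHERE id = 42";

    public static void Main()
    {
        Console.WriteLine(Query);                // SELECT foo, bar FROM table WHERE id = 42
        Console.WriteLine(Query.Contains("\n")); // False
    }
}
```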
Add multiple lines: use @
string query = @"SELECT foo, bar
FROM table
WHERE id = 42";
Add string values to the middle: use $
string text = "beer";
string query = $"SELECT foo {text} bar ";
Multiple-line string with values added to the middle: use $@
string text = "Customer";
string query = $@"SELECT foo, bar
FROM {text}Table
WHERE id = 42";
You can use @ and "".
string source = @"{
""items"":[
{
""itemId"":0,
""name"":""item0""
},
{
""itemId"":1,
""name"":""item1""
}
]
}";
Since C# 11 (2022), you can use raw string literals.
Raw string literals make it easier to use " characters without having to write escape sequences.
Solution for OP:
string query1 = """
SELECT foo, bar
FROM table
WHERE id = 42
""";
string query2 = """
SELECT foo, bar
FROM table
WHERE id = 42
and name = 'zoo'
and type = 'oversized "jumbo" grand'
""";
More details about Raw String Literals
See the Raw String Literals GitHub Issue for full details; and Blog article C# 11 Preview Updates – Raw string literals, UTF-8 and more!
I haven't seen this, so I will post it here (if you are interested in passing a string you can do this as well.) The idea is that you can break the string up on multiple lines and add your own content (also on multiple lines) in any way you wish. Here "tableName" can be passed into the string.
private string createTableQuery = "";
void createTable(string tableName)
{
createTableQuery = @"CREATE TABLE IF NOT EXISTS
[" + tableName + @"] (
[ID] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
[Key] NVARCHAR(2048) NULL,
[Value] VARCHAR(2048) NULL
)";
}
Yes, you can split a string out onto multiple lines without introducing newlines into the actual string, but it ain't pretty:
string s = $@"This string{
string.Empty} contains no newlines{
string.Empty} even though it is spread onto{
string.Empty} multiple lines.";
The trick is to introduce code that evaluates to empty, and that code may contain newlines without affecting the output. I adapted this approach from this answer to a similar question.
There is apparently some confusion as to what the question is, but there are two hints that what we want here is a string literal not containing any newline characters, whose definition spans multiple lines. (in the comments he says so, and "here's what I have" shows code that does not create a string with newlines in it)
This unit test shows the intent:
[TestMethod]
public void StringLiteralDoesNotContainSpaces()
{
string query = "hi"
+ "there";
Assert.AreEqual("hithere", query);
}
Change the above definition of query so that it is one string literal, instead of the concatenation of two string literals which may or may not be optimized into one by the compiler.
The C++ approach would be to end each line with a backslash, causing the newline character to be escaped and not appear in the output. Unfortunately, there is still then the issue that each line after the first must be left aligned in order to not add additional whitespace to the result.
There is only one option that does not rely on compiler optimizations that might not happen, which is to put your definition on one line. If you want to rely on compiler optimizations, the + you already have is great; you don't have to left-align the string, you don't get newlines in the result, and it's just one operation, no function calls, to expect optimization on.
If you don't want spaces/newlines, string addition seems to work:
var myString = String.Format(
"hello " +
"world" +
" i am {0}" +
" and I like {1}.",
animalType,
animalPreferenceType
);
// hello world i am a pony and I like other ponies.
You can run the above here if you like.
using System;
namespace Demo {
class Program {
static void Main(string[] args) {
string str = @"Welcome User,
Kindly wait for the image to
load";
Console.WriteLine(str);
}
}
}
Output
Welcome User,
Kindly wait for the image to
load

Extract substring from string with Regex

Imagine that users are inserting strings in several computers.
On one computer, the pattern in the configuration will extract some characters of that string, lets say position 4 to 5.
On another computer, the extract pattern will return other characters, for instance, last 3 positions of the string.
These configurations (the Regex patterns) are different for each computer, and should be available for change by the administrator, without having to change the source code.
Some examples:
Original_String Return_Value
User1 - abcd78defg123 78
User2 - abcd78defg123 78g1
User3 - mm127788abcd 12
User4 - 123456pp12asd ppsd
Can it be done with Regex?
Thanks.
Why do you want to use regex for this? What is wrong with:
string foo = s.Substring(4,2);
string bar = s.Substring(s.Length-3,3);
(you can wrap those up to do a bit of bounds-checking on the length easily enough)
If you really want, you could wrap it up in a Func<string,string> to put somewhere - not sure I'd bother, though:
Func<string, string> get4and5 = s => s.Substring(4, 2);
Func<string,string> getLast3 = s => s.Substring(s.Length - 3, 3);
string value = "abcd78defg123";
string foo = getLast3(value);
string bar = get4and5(value);
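The bounds-checking wrapper mentioned above could look roughly like this. It is a sketch under the assumption that out-of-range requests should be clamped rather than throw; SafeSubstring, Slice and Last are hypothetical names.

```csharp
using System;

public static class SafeSubstring
{
    // Returns up to `length` characters starting at `start`, clamped to the
    // string bounds instead of throwing ArgumentOutOfRangeException.
    public static string Slice(string s, int start, int length)
    {
        if (string.IsNullOrEmpty(s) || start >= s.Length || length <= 0) return "";
        start = Math.Max(start, 0);
        return s.Substring(start, Math.Min(length, s.Length - start));
    }

    // Returns the last `count` characters, or the whole string if it is shorter.
    public static string Last(string s, int count) =>
        Slice(s, Math.Max((s?.Length ?? 0) - count, 0), count);

    public static void Main()
    {
        Console.WriteLine(Slice("abcd78defg123", 4, 2)); // 78
        Console.WriteLine(Last("abcd78defg123", 3));     // 123
        Console.WriteLine(Last("ab", 3));                // ab
    }
}
```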
If you really want to use regex:
^....(..)
And:
.*(...)$
To have a regex capture values for further use you typically use (). In .NET regular expressions, as in most flavors, capturing groups are written with (), while [] denotes a character class.
Example
User4 - 123456pp12asd ppsd
is the most interesting, in that you have two separate capture areas here. Is there some default rule for how to join them together, or would you want to be able to specify how to build the result?
Perhaps something like
r/......(..)...(..)/\1\2/ for ppsd
r/......(..)...(..)/\2-\1/ for sd-pp
Do you want to run a regex to get the captures and handle them yourself, or do you want to run more advanced manipulation commands?
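If the simplest joining rule is enough (concatenate every capture group in order), the per-computer configuration can be just the pattern string. The sketch below assumes that rule; ConfiguredExtractor and Extract are invented names, and the patterns shown are only examples of what an administrator might configure.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ConfiguredExtractor
{
    // Applies a configured pattern and concatenates all capture groups in
    // order; returns "" when the pattern does not match.
    public static string Extract(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success) return "";
        // Groups[0] is the whole match, so skip it.
        return string.Concat(m.Groups.Cast<Group>().Skip(1).Select(g => g.Value));
    }

    public static void Main()
    {
        Console.WriteLine(Extract("abcd78defg123", @"^.{4}(..)"));         // 78
        Console.WriteLine(Extract("abcd78defg123", @"^.{4}(..).{3}(..)")); // 78g1
        Console.WriteLine(Extract("123456pp12asd", @"^.{6}(..).{3}(..)")); // ppsd
    }
}
```

A rule such as "sd-pp" would need a configurable replacement template in addition to the pattern.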
I'm not sure what you are hoping to get by using RegEx. RegEx is used for pattern matching. If you want to extract based on position, just use substring.
It seems to me that Regex really isn't the solution here. To return a section of a string beginning at position pos (starting at 0) and of length length, you simply call the Substring function as such:
string section = str.Substring(pos, length)
Grouping. You could match on /^.{3}(.{2})/ and then look at group $1 for example.
The question is why? Normal string handling i.e. actual substring methods are going to be faster and clearer in intent.

Easiest way to convert a URL to a hyperlink in a C# string?

I am consuming the Twitter API and want to convert all URLs to hyperlinks.
What is the most effective way you've come up with to do this?
from
string myString = "This is my tweet check it out http://tinyurl.com/blah";
to
This is my tweet check it out <a href="http://tinyurl.com/blah">http://tinyurl.com/blah</a>
Regular expressions are probably your friend for this kind of task:
Regex r = new Regex(@"(https?://[^\s]+)");
myString = r.Replace(myString, "<a href=\"$1\">$1</a>");
The regular expression for matching URLs might need a bit of work.
I did this exact same thing with jQuery while consuming the JSON API. Here is the linkify function:
String.prototype.linkify = function() {
return this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/, function(m) {
return m.link(m);
});
};
This is actually an ugly problem. URLs can contain (and end with) punctuation, so it can be difficult to determine where a URL actually ends, when it's embedded in normal text. For example:
http://example.com/.
is a valid URL, but it could just as easily be the end of a sentence:
I buy all my witty T-shirts from http://example.com/.
You can't simply parse until a space is found, because then you'll keep the period as part of the URL. You also can't simply parse until a period or a space is found, because periods are extremely common in URLs.
Yes, regex is your friend here, but constructing the appropriate regex is the hard part.
Check out this as well: Expanding URLs with Regex in .NET.
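One common heuristic for the trailing-punctuation problem is to match greedily up to whitespace and then move sentence punctuation back out of the link. The sketch below assumes that heuristic is acceptable (a URL that really ends in a period would lose it); Linkifier, Linkify and the list of trailing characters are illustrative choices, and the surrounding text is not HTML-encoded here.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class Linkifier
{
    // Greedy match up to whitespace, angle brackets or a double quote.
    private static readonly Regex Url = new Regex(@"https?://[^\s<>""]+");

    // Characters treated as sentence punctuation when they end a match.
    private static readonly char[] Trailing = { '.', ',', ';', ':', '!', '?', ')' };

    public static string Linkify(string text) =>
        Url.Replace(text, m =>
        {
            string url = m.Value.TrimEnd(Trailing);      // URL without trailing punctuation
            string rest = m.Value.Substring(url.Length); // punctuation put back after the link
            string safe = WebUtility.HtmlEncode(url);
            return "<a href=\"" + safe + "\">" + safe + "</a>" + rest;
        });

    public static void Main()
    {
        Console.WriteLine(Linkify("I buy all my witty T-shirts from http://example.com/."));
        // I buy all my witty T-shirts from <a href="http://example.com/">http://example.com/</a>.
    }
}
```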
You can add more control by using a MatchEvaluator delegate with the regular expression.
Suppose I have this string:
find more on http://www.stackoverflow.com
Now try this code:
private void ModifyString()
{
string input = "find more on http://www.authorcode.com ";
Regex regx = new Regex(@"\b((http|https|ftp|mailto)://)?(www.)+[\w-]+(/[\w- ./?%&=]*)?");
string result = regx.Replace(input, new MatchEvaluator(ReplaceURl));
}
static string ReplaceURl(Match m)
{
string x = m.ToString();
x = "<a href=\"" + x + "\">" + x + "</a>";
return x;
}
/cheer for RedWolves
from: this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/, function(m){...
see: /[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/
There's the code for the addresses "anyprotocol"://"anysubdomain/domain"."anydomainextension and address",
and it's a perfect example for other uses of string manipulation. you can slice and dice at will with .replace and insert proper "a href"s where needed.
I used jQuery to change the attributes of these links to "target=_blank" easily in my content-loading logic even though the .link method doesn't let you customize them.
I personally love tacking on a custom method to the string object for on the fly string-filtering (the String.prototype.linkify declaration), but I'm not sure how that would play out in a large-scale environment where you'd have to organize 10+ custom linkify-like functions. I think you'd definitely have to do something else with your code structure at that point.
Maybe a vet will stumble along here and enlighten us.
