Simple Regex c#, Replace and Split Question - c#

I am trying to remove " from a string using Regex.
I am receiving a string into a method, I would like to take the string and split it up into the words that are in the string.
My Code is below, hopefully you can see what I am doing.
The problem I am having is telling Regex that " is the character I would like to remove. I have tried numerous ways and searched Google for an answer before resorting to asking here.
search_string looks like this: blah="blah" la="la" ta="ta" and in the end I want just the blah blah la la ta ta.
public blahblah blahblah(blah blah, string search_string)
{
Regex r = new Regex(@"/"+");
string s3 = r.Replace(search_string, @" ");
Regex r2 = new Regex(" ");
Regex r3 = new Regex("=");
string[] new_Split = { };
string[] split_String = r2.Split(s3);
foreach (string match in split_String)
{
new_Split = r3.Split(match);
}
//do blahblah stuff with new_Split[1] .. etc
// new_Split[0] should be blah and new_Split[1] should
// be blah with out "", not "blah"
return blah_Found;
}

Just use:
myString = myString.Replace( "\"", String.Empty );
[Update]
The String.Empty or "" is not a space char. You wrote this
blah="blah" la="la" ta="ta"
you want to convert to
blah blah la la ta ta
So you have white spaces anyway. If you want this:
blahblahlalatata
you need to remove them too:
myString = myString.Replace( "\"", String.Empty ).Replace( " ", String.Empty );
for '=' do it again, and so on...
You need to be more precise in your questions.

As a quick thought, and maybe barking up entirely the wrong tree, but wouldn't you want something like
Regex r = new Regex("(\".*\")");
i.e., a regular expression of ".*"

This is one way to do it.
It searches for anything of the form SomeWord="somethingelse"
and replaces it with SomeWord somethingelse
// The lazy (.+?) stops at the first closing quote, so several pairs on one line are handled separately.
var regex = new Regex(@"(\w+)=\""(.+?)\""");
var result = regex.Replace("bla=\"bla\"", "$1 $2");

I can't help you with Regex.
Anyway if you only need to remove = and " and split words you could try:
string[] arr = s
.Replace("="," ")
.Replace("\""," ")
.Split(new string[1] {" "}, StringSplitOptions.RemoveEmptyEntries);

I did it in 2 passes
string input = "blah=\"blah\" la=\"la\" ta=\"ta\"";
//replace " and = with a space
string output = Regex.Replace(input, "[\"=]", " ");
//condense the spaces
output = Regex.Replace(output, @"\s+", " ");
EDIT:
Treating " and = differently as per comment.
string input = "blah=\"blah\" la=\"la\" ta=\"ta\"";
//remove " and replace = with a space
string output = Regex.Replace(input, "\"", String.Empty);
output = Regex.Replace(output, "=", " ");
Clearly regex is a bit overkill here.
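If the goal is to get the words out as a list rather than as one cleaned-up string, another option is to match each name="value" pair directly. The sketch below is only one way to do it; the class name PairParser and the method ToWords are made up for illustration, and it assumes every value is wrapped in double quotes and contains no embedded quotes.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PairParser
{
    // Hypothetical helper: returns name, value, name, value, ... in input order.
    public static List<string> ToWords(string input)
    {
        var words = new List<string>();
        // Pattern: a word, '=', then a double-quoted value with no quotes inside it.
        foreach (Match m in Regex.Matches(input, @"(\w+)=""([^""]*)"""))
        {
            words.Add(m.Groups[1].Value); // the name before '='
            words.Add(m.Groups[2].Value); // the value without its quotes
        }
        return words;
    }
}
```

Calling PairParser.ToWords on the question's sample input and joining the result with spaces gives blah blah la la ta ta. Text that does not fit the name="value" shape is silently skipped, which may or may not be what you want.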

Related

Retrieving specific characters from string separated by a delimiter

I want to retrieve characters separated by a specific delimiter.
Example :
Here, I want to access the string between the " " delimiters. But I want the 2nd set of characters between "".
abc"def"ghi"jklm // Output : ghi
"hello" yes "world" // output : world
How can I get that?
I know we can use split. But sometimes the string might not start with " character.
Can anyone please help me with this?
You can just find the first quote, and use your approach from there:
var firstQuote = str.IndexOf('"');
var startsWithQuote = str.Substring(firstQuote);
string valueStr = "abc\"def\"ghi\"jklm";
var result = valueStr.Split('"')[2];
Console.WriteLine(result);
https://dotnetfiddle.net/T3fMof
Obviously check for the array elements before accessing them
You can use regular expressions to match them:
var test = "abc\"def\"ghi\"jklm";
var test2 = "\"hello\" yes \"world\"";
var match1 = Regex.Matches(test, ".+\"(.+)\"");
var match2 = Regex.Matches(test2, ".+\"(.+)\"");
Console.WriteLine("Match1: " + match1[0].Groups[1].Captures[0]);
Console.WriteLine("Match2: " + match2[0].Groups[1].Captures[0]);
// Match1: ghi
// Match2: world

How to replace a word in a string

It is a very basic question, but I am not sure why it is not working. I have code where 'And' can be written in any of the ways 'And', 'and', etc., and I want to replace it with ','
I tried this:
and.Replace("and".ToUpper(),",");
but this is not working, any other way to do this or make it work?
You should check out the Regex class
http://msdn.microsoft.com/en-us/library/xwewhkd1.aspx
using System.Text.RegularExpressions;
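// Verbatim string (@) so that \b reaches the regex as a word boundary, not a backspace character.
Regex re = new Regex(@"\band\b", RegexOptions.IgnoreCase);
string and = "This is my input string with and string in between.";
// Replace returns a new string, so assign the result.
and = re.Replace(and, ",");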
words = words.Replace("AND", ",")
.Replace("and", ",");
Or use RegEx.
The Replace method returns a string where the replacement is visible. It does not modify the original string. You should try something along the lines of
and = and.Replace("and",",");
You can do this for all variations of "and" you may encounter, or as other answers have suggested, you could use a regex.
Take care with words that contain "and", as in "this is sand and sea". The word "sand" must not be affected by the replacement.
string and = "this is sand and sea";
// Add any other delimiters that may appear next to your "and".
// This substitution is not universal and will miss something like " and, ".
string[] delimiters = new string[] { " " };
// This results in: "this is sand , sea"
and = string.Join(" ",
and.Split(delimiters,
StringSplitOptions.RemoveEmptyEntries)
.Select(s => s.Length == 3 && s.ToUpper().Equals("AND")
? ","
: s));
I would also add something like this:
and = and.Replace(" , ", ", ");
So, the output:
this is sand, sea
Try the static Regex.Replace() method:
and = System.Text.RegularExpressions.Regex.Replace(and,"(?i)and",",");
The "(?i)" causes the following text search to be case-insensitive.
http://msdn.microsoft.com/en-us/library/yd1hzczs.aspx
http://msdn.microsoft.com/en-us/library/xwewhkd1(v=vs.100).aspx
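Note that a bare "(?i)and" also matches the "and" inside words such as "sand". A minimal sketch that combines case-insensitive matching with word boundaries is shown here; the class and method names are invented for the example, and swallowing the whitespace before "and" is a choice made so the comma sits against the preceding word.

```csharp
using System.Text.RegularExpressions;

public static class AndReplacer
{
    // Hypothetical helper: replaces the standalone word "and" (any casing) with a comma.
    public static string ReplaceAnd(string input)
    {
        // \s* eats any whitespace before the word; \b...\b keeps "sand" or "android" untouched.
        return Regex.Replace(input, @"\s*\band\b", ",", RegexOptions.IgnoreCase);
    }
}
```

With this, AndReplacer.ReplaceAnd("this is sand AND sea") yields "this is sand, sea".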

Remove space from String

I have a lot of strings that look like this:
current      affairs
and I want the string to be:
current affairs
I tried Trim(), but it doesn't do the job.
Regex can do the job
string_text = Regex.Replace(string_text, @"\s+", " ");
You can use regular expressions for this, see Regex.Replace:
var normalizedString = Regex.Replace(myString, " +", " ");
If you want all types of whitespace, use @"\s+" instead of " +" which just deals with spaces.
var normalizedString = Regex.Replace(myString, @"\s+", " ");
Use a regular expression.
yourString = Regex.Replace(yourString, @"\s+", " ");
You can use a regex:
public string RemoveMultipleSpaces(string s)
{
    // Collapse each run of whitespace in the parameter s into a single space.
    return Regex.Replace(s, @"\s+", " ");
}
After:
string s = "current      affairs ";
s = RemoveMultipleSpaces(s);
Using Regex is the way to go here:
input = System.Text.RegularExpressions.Regex.Replace(input, @"\s+", " ");
This collapses every run of whitespace characters, including tabs and newlines, into a single space.
Alternatively, split the string on spaces, drop the empty entries, and join the words back together with a single space.
string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
text = string.Join(" ", words);
// text now has exactly one space between words
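The regex answers above leave a single space behind if the input starts or ends with whitespace. A small sketch that also trims the ends follows; the names Whitespace and Normalize are illustrative only.

```csharp
using System.Text.RegularExpressions;

public static class Whitespace
{
    // Hypothetical helper: collapses whitespace runs to one space, then trims both ends.
    public static string Normalize(string s)
    {
        return Regex.Replace(s, @"\s+", " ").Trim();
    }
}
```

For example, Whitespace.Normalize("current    affairs  ") returns "current affairs" with no trailing space.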

Regex to remove string from string

Is there a regex pattern that can remove .zip.ytu from the string below?
werfds_tyer.abc.zip.ytu_20111223170226_20111222.20111222
Here is an answer using regex as the OP asked.
To use regex, put the text to remove in a group ( ) and then replace the match with string.Empty:
string text = @"werfds_tyer.abc.zip.ytu_20111223170226_20111222.20111222";
string pattern = @"(\.zip\.ytu)";
Console.WriteLine( Regex.Replace(text, pattern, string.Empty ));
// Outputs
// werfds_tyer.abc_20111223170226_20111222.20111222
Just use String.Replace():
text = text.Replace(".zip.ytu", "");
You don't need regex for exact matches.
txt = txt.Replace(".zip.ytu", "");
Why don't you simply do above?
I don't really know what ".zip.ytu" is, but if you are not matching that exact text, you could use something like this:
string txt = "werfds_tyer.abc.zip.ytu_20111223170226_20111222.20111222";
Regex mRegex = new Regex(@"^([^.]*\.[^.]*)\.[^.]*\.[^_]*(_.*)$");
Match mMatch = mRegex.Match(txt);
string new_txt = mRegex.Replace(txt, mMatch.Groups[1].ToString() + mMatch.Groups[2].ToString());
use string.Replace:
txt = txt.Replace(".zip.ytu", "");
Here is the method I use for more complex replacements. See http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace(v=vs.110).aspx for a regular expression replace. I added the code below as well.
string input = "This is   text with   far  too   much   " +
"whitespace.";
string pattern = "\\s+";
string replacement = " ";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
Console.WriteLine("Original String: {0}", input);
Console.WriteLine("Replacement String: {0}", result);

Remove formatting on string literal

Given the c# code:
string foo = @"
    abcde
    fghijk";
I am trying to remove all formatting, including whitespaces between the lines.
So far the code
foo = foo.Replace("\n","").Replace("\r", "");
works, but the whitespace between lines 2 and 3 is still kept.
I assume a regular expression is the only solution?
Thanks.
I'm assuming you want to keep multiple lines; if not, I'd choose CAbbott's answer.
var fooNoWhiteSpace = string.Join(
Environment.NewLine,
foo.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
.Select(fooline => fooline.Trim())
);
What this does is split the string into lines (foo.Split),
trim whitespace from the start and end of each line (.Select(fooline => fooline.Trim())),
then combine them back together with a new line in between (string.Join).
You could use a regular expression:
foo = Regex.Replace(foo, @"\s+", "");
How about this?
string input = @"
    abcde
    fghijk";
string output = "";
string[] parts = input.Split('\n');
foreach (var part in parts)
{
// If you want everything on one line... else just + "\n" to it
output += part.Trim();
}
This should remove everything.
If the whitespace is all spaces, you could use
foo = foo.Replace(" ", "");
For any other whitespace that may be in there, do the same. Example:
foo = foo.Replace("\t", "");
Just add a Replace(" ", ""). You're dealing with a string literal, which means all the whitespace is part of the string.
Try something like this:
string test = @"
    abcde
    fghijk";
EDIT: Added code to filter out only whitespace.
string newString = new string(test.Where(c => Char.IsWhiteSpace(c) == false).ToArray());
Produces the following: abcdefghijk
I've written something similar to George Duckett's answer, but put my logic into a string extension method so it is easier for others to read and consume:
public static class Extensions
{
public static string RemoveTabbing(this string fmt)
{
return string.Join(
System.Environment.NewLine,
fmt.Split(new string[] { System.Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
.Select(fooline => fooline.Trim()));
}
}
You can then call it like this:
string foo = @"
    abcde
    fghijk".RemoveTabbing();
I hope that helps someone
