Split null or space on the first of a string?

Split null or space on the first of a string? - c#

I'm getting a string " abc df fd";
I want to split a space or null of string. As result is "abc df fd" That such I want;
private string _senselist;
public string senselist
{
get
{
return _senselist;
}
set
{
_senselist = value.Replace("\t", "").Replace(" "," ").Split(,1);
}
}

To remove spaces from the begining and the ending of a string, you can use Trim() method.
string data = " abc df fd";
string trimed = data.Trim(); // "abc df fd"
On your code add Trim at the end, instead of Split
_senselist = value.Replace("\t", "").Replace(" "," ").Trim();
As #andrei-rînea recommended you can also check TrimStart(' ') and TrimEnd(' ').

Related

Escaping reverse backSlash

I want to display data in this format yyyy/mm/dd
How I can do it? When I use
String.Format("{0:yyyy/MM/dd}", Model.RequiredDate)
I get yyyy.mm.dd

You can escape it like;
String.Format("{0:yyyy'/'MM'/'dd}", Model.RequiredDate)
I strongly suspect your CurrentCulture's DateSeparator is ., that's why / format specifier replace itself to it.

What about this option?
string DateToString(DateTime date, string seperator)
{
return date.ToString("yyyy") + seperator + date.ToString("MM") + seperator + date.ToString("dd");
}
But as Soner Gönül said, you can also escape it like this:
String.Format("{0:yyyy'/'MM'/'dd}", Model.RequiredDate)
EDIT:
What about this two options?
string DateToString(DateTime date, string seperator, string first, string second, string third)
{
return date.ToString(first) + seperator + date.ToString(second) + seperator + date.ToString(third);
}
string DateToString(DateTime date, string seperator, string[] format)
{
string result = "";
foreach (string s in format)
{
s += (date.ToString(s) + seperator);
}
return result;
}

Automatic Editing of a string to remove spaces C#

When I copy a string from an excel data set into the text box, the string has HUGE spaces in between each item in the string.
I currently have if (textBox1.Text.Contains(" ") == true) to detect the spaces in the string.
What would I use to delete those spaces?
Bonus Question: I do still need one space inbetween each item in the string, how would I add that and still delete the massive spaces?
private void radioGenerateScript_CheckedChanged(object sender, EventArgs e)
{
hexData.Cells.Copy();
textBox1.Clear();
textBox1.Paste();
if (textBox1.Text.Contains(" ") == true)
{
}
}
private void radioWriteScript_CheckedChanged(object sender, EventArgs e)
{
string waveForm = textBox1.Text;
System.IO.File.WriteAllText("E:/Scripts/Test.us1", waveForm);
}

If you want to remove all kinds of whitespaces use:
textBox1.Text = Regex.Replace(textBox1.Text, #"\s+", "");
\s matches all whitespaces (spaces, tabs and new lines).

textBox1.Text = Regex.Replace(textBox1.Text, " +", " ");
It seems that you have tabs as separators, so the following is better (as Alexei suggested):
textBox1.Text = Regex.Replace(textBox1.Text, #"\s+", " ");

textBox1.Text = textBox1.Text.Replace(" ", "");
If you want to keep some spaces then use Split and string.Join
var words = textBox1.Text.Split(new [] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
textBox1.Text = string.Join(" ", words);

Regex find all occurrences of a pattern in a string

I have a problem finding all occurences of a pattern in a string.
Check this string :
string msg= "=?windows-1258?B?UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz?= =?windows-1258?B?IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=?=";
I want to return the 2 occurrences (in order to later decode them):
=?windows-1258?B?UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz?=
and
=?windows-1258?B?IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=?="
With the following regex code, it returns only 1 occurrence: the full string.
var charSetOccurences = new Regex(#"=\?.*\?B\?.*\?=", RegexOptions.IgnoreCase);
var charSetMatches = charSetOccurences.Matches(input);
foreach (Match match in charSetMatches)
{
charSet = match.Groups[0].Value.Replace("=?", "").Replace("?B?", "").Replace("?b?", "");
}
Do you know what I'm missing?

When regexp parser sees the .* character sequence, it matches everything up to the end of the string and goes back, char by char, (greedy match). So, to avoid the problem, you can use a non-greedy match or explicitly define the characters that can appear at a string.
"=\?[a-zA-Z0-9?=-]*\?B\?[a-zA-Z0-9?=-]*\?="

A non-regex way:
string msg= "=?windows-1258?B?UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz?= =?windows-1258?B?IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=?=";
string[] charSetOccurences = msg.Split(new string[]{ " " }, StringSplitOptions.None);
foreach (string s in charSetOccurences)
{
string charSet = s.Replace("=?", "").Replace("?B?", "").Replace("?b?", "");
Console.WriteLine(charSet);
}
See an ideone.
And if you still want to use regex, you should make the .* lazy by adding a ?. This was already mentioned by the previous users, but it seems you are not getting matches?
string msg= "=?windows-1258?B?UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz?= =?windows-1258?B?IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=?=";
var charSetOccurences = new Regex(#"=\?.*?\?B\?.*?\?=", RegexOptions.IgnoreCase);
var charSetMatches = charSetOccurences.Matches(msg);
foreach (Match match in charSetMatches)
{
string charSet = match.Groups[0].Value.Replace("=?", "").Replace("?B?", "").Replace("?b?", "");
Console.WriteLine(charSet);
}
See another ideone.
The output is the same in both cases:
windows-1258UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz?=
windows-1258IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=
EDIT: As per update, see an all in one solution for your problem
string msg= "=?windows-1258?B?UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz?= =?windows-1258?B?IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=?=";
var charSetOccurences = new Regex(#"=\?.*?\?[BQ]\?.*?\?=", RegexOptions.IgnoreCase);
MatchCollection matches = charSetOccurences.Matches(msg);
foreach (Match match in matches)
{
string[] encoding = match.Groups[0].Value.Split(new string[]{ "?" }, StringSplitOptions.None);
string charSet = encoding[1];
string encodeType = encoding[2];
string encodedString = encoding[3];
Console.WriteLine("Charset: " + charSet);
Console.WriteLine("Encoding type: " + encodeType);
Console.WriteLine("Encoded String: " + encodedString + "\n");
}
Returns:
Charset: windows-1258
Encoding type: B
Encoded String: UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz
Charset: windows-1258
Encoding type: B
Encoded String: IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=
See this.
Or since we already had the regex, we can use:
string msg= "=?windows-1258?B?UkU6IFRyIDogUGxhbiBkZSBjb250aW51aXTpIGQnYWN0aXZpdOkgZGVz?= =?windows-1258?B?IHNlcnZldXJzIFdlYiBHb1ZveWFnZXN=?=";
var charSetOccurences = new Regex(#"=\?(.*?)\?([BQ])\?(.*?)\?=", RegexOptions.IgnoreCase);
MatchCollection matches = charSetOccurences.Matches(msg);
foreach (Match match in matches)
{
Console.WriteLine("Charset: " + match.Groups[1].Value);
Console.WriteLine("Encoding type: " + match.Groups[2].Value);
Console.WriteLine("Encoded String: " + match.Groups[3].Value + "\n");
}
Returns the same output.

.* is greedy and will match everything from the first ? to the last ?B?.
You need to use either a non-greedy match
=\?.*?\?B\?.*?\?=
or exclude ? from your list of characters
=\?[^?]*\?B\?[^?]*\?=

The best way to split a string without a separator

I have string:
MONEY-ID123456:MONEY-STAT43:MONEY-PAYetr-1232832938
From the string above you can see that it is separated by colon (:), but in the actual environment, it does not have a standard layout.
The standard is the fields name, example MONEY-ID, and MONEY-STAT.
How I can I split it the right way? And get the value from after the fields name?

Something like that should work:
string s = "MONEY-ID123456:MONEY-STAT43:MONEY-PAYetr-1232832938";
Regex regex = new Regex(#"MONEY-ID(?<moneyId>.*?)\:MONEY-STAT(?<moneyStat>.*?)\:MONEY-PAYetr-(?<moneyPaetr>.*?)$"); Match match = regex.Match(s);
if (match.Success)
{
Console.WriteLine("Money ID: " + match.Groups["moneyId"].Value);
Console.WriteLine("Money Stat: " + match.Groups["moneyStat"].Value);
Console.WriteLine("Money Paetr: " + match.Groups["moneyPaetr"].Value);
}
Console.WriteLine("hit <enter>");
Console.ReadLine();
UPDATE
Answering additional question, if we're not sure in format, then something like the following could be used:
string s = "MONEY-ID123456:MONEY-STAT43:MONEY-PAYetr-1232832938";
var itemsToExtract = new List<string> { "MONEY-STAT", "MONEY-PAYetr-", "MONEY-ID", };
string regexFormat = #"{0}(?<{1}>[\d]*?)[^\w]";//sample - MONEY-ID(?<moneyId>.*?)\:
foreach (var item in itemsToExtract)
{
string input = s + ":";// quick barbarian fix of lack of my knowledge of regex. Sorry
var match = Regex.Match(input, string.Format(regexFormat, item, "match"));
if (match.Success)
{
Console.WriteLine("Value of {0} is:{1}", item, match.Groups["match"]);
}
}
Console.WriteLine("hit <enter>");
Console.ReadLine();

As Andre said, I would personally go with regular expressions.
Use groups of something like,
"MONEY-ID(?<moneyid>.*)MONEY-STAT(?<moneystat>.*)MONEY-PAYetr(?<moneypay>.*)"
See this post for how to extract the groups.
Probably followed by a private method that trims off illegal characters in the matched group (e.g. : or -).

Check this out:
string regex = #"^(?i:money-id)(?<moneyid>.*)(?i:money-stat)(?<moneystat>.*)(?i:money-pay)(?<moneypay>.*)$";
string input = "MONEY-ID123456:MONEY-STAT43:MONEY-PAYetr-1232832938";
Match regexMatch = Regex.Match(input, regex);
string moneyID = regexMatch.Groups["moneyid"].Captures[0].Value.Trim();
string moneyStat = regexMatch.Groups["moneystat"].Captures[0].Value.Trim();
string moneyPay = regexMatch.Groups["moneypay"].Captures[0].Value.Trim();

Try
string data = "MONEY-ID123456:MONEY-STAT43:MONEY-PAYetr-1232832938";
data = data.Replace("MONEY-", ";");
string[] myArray = data.Split(';');
foreach (string s in myArray)
{
if (!string.IsNullOrEmpty(s))
{
if (s.StartsWith("ID"))
{
}
else if (s.StartsWith("STAT"))
{
}
else if (s.StartsWith("PAYetr"))
{
}
}
}
results in
ID123456:
STAT43:
PAYetr-1232832938

For example, using regular expressions,
(?<=MONEY-ID)(\d)*
It will extract
123456
from your string.

RegEx -- getting rid of double whitespaces?

I have an app that goes in, replaces "invalid" chars (as defined by my Regex) with a blankspace. I want it so that if there are 2 or more blank spaces in the filename, to trim one. For example:
Deal A & B.txt after my app runs, would be renamed to Deal A   B.txt (3 spaces b/w A and B). What i want is really this: Deal A B.txt (one space between A and B).
I'm trying to determine how to do this--i suppose my app will have to run through all filenames at least once to replace invalid chars and then run through filenames again to get rid of extraneous whitespace.
Can anybody help me with this?
Here is my code currently for replacing the invalid chars:
public partial class CleanNames : Form
{
public CleanNames()
{
InitializeComponent();
}
public void Sanitizer(List<string> paths)
{
string regPattern = (#"[~#&$!%+{}]+");
string replacement = " ";
Regex regExPattern = new Regex(regPattern);
StreamWriter errors = new StreamWriter(#"S:\Testing\Errors.txt", true);
var filesCount = new Dictionary<string, int>();
dataGridView1.Rows.Clear();
try
{
foreach (string files2 in paths)
{
string filenameOnly = System.IO.Path.GetFileName(files2);
string pathOnly = System.IO.Path.GetDirectoryName(files2);
string sanitizedFileName = regExPattern.Replace(filenameOnly, replacement);
string sanitized = System.IO.Path.Combine(pathOnly, sanitizedFileName);
if (!System.IO.File.Exists(sanitized))
{
DataGridViewRow clean = new DataGridViewRow();
clean.CreateCells(dataGridView1);
clean.Cells[0].Value = pathOnly;
clean.Cells[1].Value = filenameOnly;
clean.Cells[2].Value = sanitizedFileName;
dataGridView1.Rows.Add(clean);
System.IO.File.Move(files2, sanitized);
}
else
{
if (filesCount.ContainsKey(sanitized))
{
filesCount[sanitized]++;
}
else
{
filesCount.Add(sanitized, 1);
}
string newFileName = String.Format("{0}{1}{2}",
System.IO.Path.GetFileNameWithoutExtension(sanitized),
filesCount[sanitized].ToString(),
System.IO.Path.GetExtension(sanitized));
string newFilePath = System.IO.Path.Combine(System.IO.Path.GetDirectoryName(sanitized), newFileName);
System.IO.File.Move(files2, newFilePath);
sanitized = newFileName;
DataGridViewRow clean = new DataGridViewRow();
clean.CreateCells(dataGridView1);
clean.Cells[0].Value = pathOnly;
clean.Cells[1].Value = filenameOnly;
clean.Cells[2].Value = newFileName;
dataGridView1.Rows.Add(clean);
}
}
}
catch (Exception e)
{
errors.Write(e);
}
}
private void SanitizeFileNames_Load(object sender, EventArgs e)
{ }
private void dataGridView1_CellContentClick(object sender, DataGridViewCellEventArgs e)
{
}
private void button1_Click(object sender, EventArgs e)
{
Application.Exit();
}
}
The problem is, that not all files after a rename will have the same amount of blankspaces. As in, i could have Deal A&B.txt which after a rename would become Deal A B.txt (1 space b/w A and B--this is fine). But i will also have files that are like: Deal A & B & C.txt which after a rename is: Deal A   B   C.txt (3 spaces between A,B and C--not acceptable).
Does anybody have any ideas/code for how to accomplish this?

Do the local equivalent of:
s/\s+/ /g;

Just add a space to your regPattern. Any collection of invalid characters and spaces will be replaced with a single space. You may waste a little bit of time replacing a space with a space, but on the other hand you won't need a second string manipulation call.

Does this help?
var regex = new System.Text.RegularExpressions.Regex("\\s{2,}");
var result = regex.Replace("Some text with a lot of spaces, and 2\t\ttabs.", " ");
Console.WriteLine(result);
output is:
Some text with a lot of spaces, and 2 tabs.
It just replaces any sequence of 2 or more whitespace characters with a single space...
Edit:
To clarify, I would just perform this regex right after your existing one:
public void Sanitizer(List<string> paths)
{
string regPattern = (#"[~#&$!%+{}]+");
string replacement = " ";
Regex regExPattern = new Regex(regPattern);
Regex regExPattern2 = new Regex(#"\s{2,}");
and:
foreach (string files2 in paths)
{
string filenameOnly = System.IO.Path.GetFileName(files2);
string pathOnly = System.IO.Path.GetDirectoryName(files2);
string sanitizedFileName = regExPattern.Replace(filenameOnly, replacement);
sanitizedFileName = regExPattern2.Replace(sanitizedFileName, replacement); // clean up whitespace
string sanitized = System.IO.Path.Combine(pathOnly, sanitizedFileName);
I hope that makes more sense.

you can perform another regex replace after your first one
#" +" -> " "

As Fosco said, with formatting:
while (mystring.Contains(" ")) mystring = mystring.Replace(" "," ");
// || || |

After you're done sanitizing it your way, simply replace 2 spaces with 1 space, while 2 spaces exist in the string.
while (mystring.Contains(" ")) mystring = mystring.Replace(" "," ");
I think that's the right syntax...

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Split null or space on the first of a string? - c#

I'm getting a string " abc df fd"; I want to split a space or null of string. As result is "abc df fd" That such I want; private string _senselist; public string senselist { get { return _senselist; } set { _senselist = value.Replace("\t", "").Replace(" "," ").Split(,1); } }

Related

Escaping reverse backSlash

Automatic Editing of a string to remove spaces C#

Regex find all occurrences of a pattern in a string

The best way to split a string without a separator

RegEx -- getting rid of double whitespaces?

Categories

Resources