I have a string that looks something like this:
"PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N||||||||||";
I am trying to remove any separating '|' characters after the 30th '|' in the string so that the output string looks like this:
"PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N";
I am trying to do it using as little code as possible, but not having much luck. Any help or ideas would be great.
You can use the TrimEnd method
string text = "stuff||||N||||||||||";
string result = text.TrimEnd('|'); //Result is stuff||||N
Brute force but only a little bit of code:
string s2 = string.Join("|", s1.Split('|').Take(31));
If you need any other processing of this kind of data (it looks like a kind of nested CSV) then string.Split() is useful to know.
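As a rough illustration of that nested splitting, the sketch below splits a record on '|' and then each field on '^'. The names PipeFields and Parse are made up for this example, and it assumes '^' is the only inner separator that matters, which may not hold for the real data format.

```csharp
using System;

static class PipeFields
{
    // Splits a pipe-delimited record into fields, then splits each field
    // into '^'-delimited components. Empty fields are kept so that field
    // positions stay stable.
    public static string[][] Parse(string record)
    {
        string[] fields = record.Split('|');
        string[][] result = new string[fields.Length][];
        for (int i = 0; i < fields.Length; i++)
        {
            result[i] = fields[i].Split('^');
        }
        return result;
    }
}
```

For example, PipeFields.Parse("PID||000000|SOMEONE^FIRSTNAME")[3][1] gives "FIRSTNAME".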
string str = "PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N||||||||||";
int c = 0;
int after = 30;
StringBuilder newStr = new StringBuilder();
for (int i = 0; i < str.Length; i++)
{
    if (str[i] == '|')
    {
        // Keep only the first 30 pipes; drop any that follow.
        if (after != c)
        {
            newStr.Append(str[i]);
            c++;
        }
    }
    else
    {
        newStr.Append(str[i]);
    }
}
results in
newStr == "PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N";
A regex should do the trick:
var regex = new Regex(@"^([^\|]*\|){0,30}[^\|]*");
var match = regex.Match(input);
if(match.Success)
{
var val = match.Value;
}
If what you really want is for everything after the 30th '|' to lose its '|' separators, then try:
var chunks = input.Split('|');
var output = String.Join("|", chunks.Take(31)) + String.Concat(chunks.Skip(31));
That said, I think it sounds like what you're really looking for is probably something like:
var output = input.TrimEnd('|');
// Get the indexes of all the | characters.
int[] pipeIndexes = Enumerable.Range(0, s.Length).Where(i => s[i] == '|').ToArray();
// If there are more than thirty pipes:
if (pipeIndexes.Length > 30)
{
// The former part of the string remains intact.
string formerPart = s.Substring(0, pipeIndexes[30]);
// The latter part needs to have all | characters removed.
string latterPart = s.Substring(pipeIndexes[30]).Replace("|", "");
s = formerPart + latterPart;
}
I have to write a program which parses a string for words starting with '#' and return the words along with the # symbol.
I have tried something like:
char[] delim = { '#' };
string[] strArr = commenttext.Split(delim);
return strArr;
But it returns all the words without '#' in an array.
I need something pretty straightforward. No LINQ-like things.
If the string is "abc #ert #xyz" then I should get back #ert and #xyz.
If you define "word" as "separated by spaces" then this would work:
string[] strArr = commenttext.Split(' ')
.Where(w => w.StartsWith("#"))
.ToArray();
If you need something more complex, a Regular Expression might be more appropriate.
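A regular-expression version might look like the sketch below. It assumes a "word" is a '#' followed by one or more word characters (\w), so it also picks up a '#' in the middle of a token. The names HashWords and Find are invented for this example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class HashWords
{
    // Collects every '#' followed by one or more word characters,
    // keeping the '#' in the returned value. No LINQ is used.
    public static List<string> Find(string text)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(text, @"#\w+"))
        {
            found.Add(m.Value);
        }
        return found;
    }
}
```

With the input "abc #ert #xyz" this returns #ert and #xyz.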
"I need something pretty straightforward. No LINQ-like things."
The non-Linq equivalent would be:
var words = commenttext.Split(' ');
List<string> temp = new List<string>();
foreach(string w in words)
{
if(w.StartsWith("#"))
temp.Add(w);
}
string[] strArr = temp.ToArray();
If you're against using Linq, which you should not be unless you're required to use older .NET versions, an approach along these lines would suit your needs.
string[] words = commenttext.Split(' ');
for (int i = 0; i < words.Length; i++)
{
    string word = words[i];
    if (word.StartsWith("#"))
    {
        // save in array / list
    }
}
const string test = "#Amir abcdef #Stack #C# mnop xyz";
var hashWords = test.Split(' ').Where(m => m.StartsWith("#")).ToList();
foreach (var b in hashWords)
{
    // Print each word together with its leading '#'.
    Console.WriteLine(b);
}
Console.ReadKey();
In C# what would be the best way of splitting this sort of string?
%%x%%a,b,c,d
So that I end up with the value between the %% AND another variable containing everything right of the second %%
i.e. var x = "x"; var y = "a,b,c,d"
Where a,b,c.. could be an infinite comma-separated list. I need to extract the list and the value between the two double-percentage signs.
(To combat the infinite part, I thought perhaps separating the string out to %%x%% and a,b,c,d. At that point I can just use something like this to get x.)
var tag = "%%";
var startTag = tag;
int startIndex = s.IndexOf(startTag) + startTag.Length;
int endIndex = s.IndexOf(tag, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
Would the best approach be to use regex, or to use lots of IndexOf and Substring calls to do the extracting based on the static %% characters?
Given that what you want is "x,a,b,c,d" the Split() function is actually pretty powerful and regex would be overkill for this.
Here's an example:
string test = "%%x%%a,b,c,d";
string[] result = test.Split(new char[] { '%', ',' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in result) {
Console.WriteLine(s);
}
Basically we ask it to split by both '%' and ',' and ignore empty results (e.g. the result between "%%"). Here's the result:
x
a
b
c
d
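The Split call above flattens the tag and the list into one array. If the two need to stay in separate variables, as the question asks, a sketch along these lines may help. It assumes the string starts with a %%tag%% pair. TagSplitter and TrySplit are hypothetical names for this example.

```csharp
using System;

static class TagSplitter
{
    // Splits "%%x%%a,b,c,d" into the tag ("x") and the remainder ("a,b,c,d").
    // The count of 3 stops splitting after the second "%%", so any later
    // "%%" stays inside the remainder.
    public static bool TrySplit(string s, out string tag, out string rest)
    {
        tag = string.Empty;
        rest = string.Empty;
        string[] parts = s.Split(new[] { "%%" }, 3, StringSplitOptions.None);
        if (parts.Length != 3 || parts[0].Length != 0)
        {
            return false; // not in the expected "%%tag%%rest" shape
        }
        tag = parts[1];
        rest = parts[2];
        return true;
    }
}
```

After a successful call, rest.Split(',') yields the individual list items.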
To Extract X:
If %% is always at the start, then:
string s = "%%x%%a,b,c,d,h";
s = s.Substring(2,s.LastIndexOf("%%")-2);
//Console.WriteLine(s);
Otherwise:
string s = "v,u,m,n,%%x%%a,b,c,d,h";
s = s.Substring(s.IndexOf("%%")+2,s.LastIndexOf("%%")-s.IndexOf("%%")-2);
//Console.WriteLine(s);
If you need to get them all at once then use this;
string s = "m,n,%%x%%a,b,c,d";
var myList = s.ToArray()
.Where(c=> (c != '%' && c!=','))
.Select(c=>c).ToList();
This'll let you do it all in one go:
string pattern = "^%%(.+?)%%(?:(.+?)(?:,|$))*$";
string input = "%%x%%a,b,c,d";
Match match = Regex.Match(input, pattern);
if (match.Success)
{
// "x"
string first = match.Groups[1].Value;
// { "a", "b", "c", "d" }
string[] repeated = match.Groups[2].Captures.Cast<Capture>()
.Select(c => c.Value).ToArray();
}
You can use char.IsLetter to get a list of all the letters:
string test = "%%x%%a,b,c,d";
var l = test.Where(c => char.IsLetter(c)).ToArray();
var output = string.Join(", ", l.OrderBy(c => c));
Since you want the value between the %% and everything after in separate variables and you don't need to parse the CSV, I think a RegEx solution would be your best choice.
var inputString = @"%%x%%a,b,c,d";
var regExPattern = @"^%%(?<x>.+)%%(?<csv>.+)$";
var match = Regex.Match(inputString, regExPattern);
foreach (var item in match.Groups)
{
Console.WriteLine(item);
}
The pattern has 2 named groups called x and csv, so rather than just looping, you can easily reference them by name and assign them to values:
var x = match.Groups["x"].Value;
var y = match.Groups["csv"].Value;
I am having an output in string format like following :
"ABCDED 0000A1.txt PQRSNT 12345"
I want to retrieve the substring(s) containing .txt in the above string, e.g. for the above it should return 0000A1.txt.
Thanks
You can either split the string at whitespace boundaries like it's already been suggested or repeatedly match the same regex like this:
var input = "ABCDED 0000A1.txt PQRSNT 12345 THE.txt FOO";
var match = Regex.Match (input, @"\b([\w\d]+\.txt)\b");
while (match.Success) {
Console.WriteLine ("TEST: {0}", match.Value);
match = match.NextMatch ();
}
Split will work if spaces are the separator. If you use other separators, you can add them as needed.
string input = "ABCDED 0000A1.txt PQRSNT 12345";
string filename = input.Split(' ').FirstOrDefault(f => System.IO.Path.HasExtension(f));
filename = "0000A1.txt", and this will work for any extension.
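If only .txt tokens are wanted, and there may be more than one, a plain-loop variant could look like this. Note that it swaps the Path-based check for a simple EndsWith test and assumes tokens are separated by spaces or tabs. TxtTokens and Find are made-up names.

```csharp
using System;
using System.Collections.Generic;

static class TxtTokens
{
    // Returns every whitespace-separated token that ends with ".txt"
    // (case-insensitive), in the order they appear in the input.
    public static List<string> Find(string input)
    {
        var result = new List<string>();
        string[] tokens = input.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            if (token.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
            {
                result.Add(token);
            }
        }
        return result;
    }
}
```

For "ABCDED 0000A1.txt PQRSNT 12345 THE.txt FOO" this returns 0000A1.txt and THE.txt.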
You may use c#, regex and pattern, match :)
Here is the code, plug it in try. Please comment.
string test = "afdkljfljalf dkfjd.txt lkjdfjdl";
string ffile = Regex.Match(test, @"([a-z0-9]+\.txt)").Groups[1].Value;
Console.WriteLine(ffile);
Reference: regexp
I did something like this:
string subString = "";
char period = '.';
char[] chArString;
int iSubStrIndex = 0;
if (myString != null)
{
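    // ToCharArray allocates the array itself, so no separate "new" is needed.
    chArString = myString.ToCharArray();
    for (int i = 0; i < myString.Length; i++)
    {
        // Remember the position of the last period seen.
        if (chArString[i] == period)
            iSubStrIndex = i;
    }
    // Everything from the last period onward.
    subString = myString.Substring(iSubStrIndex);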
}
Hope that helps.
First split your string in array using
char[] whitespace = new char[] { ' ', '\t' };
string[] ssizes = myStr.Split(whitespace);
Then find .txt in array...
// Find the first element ending with .txt.
//
string value1 = Array.Find(ssizes,
element => element.EndsWith(".txt", StringComparison.Ordinal));
Now your value1 will have the "0000A1.txt"
Happy coding.
I have a string 731478718861993983 and I want to get this 73-1478-7188-6199-3983 using C#. How can I format it like this ?
Thanks.
By using regex:
public static string FormatTest1(string num)
{
string formatPattern = @"(\d{2})(\d{4})(\d{4})(\d{4})(\d{4})";
return Regex.Replace(num, formatPattern, "$1-$2-$3-$4-$5");
}
// test
string test = FormatTest1("731478718861993983");
// test result: 73-1478-7188-6199-3983
If you're dealing with a long number, you can use a NumberFormatInfo to format it:
First, define your NumberFormatInfo (you may want additional parameters, these are the basic 3):
NumberFormatInfo format = new NumberFormatInfo();
format.NumberGroupSeparator = "-";
format.NumberGroupSizes = new[] { 4 };
format.NumberDecimalDigits = 0;
Next, you can use it on your numbers:
long number = 731478718861993983;
string formatted = number.ToString("n", format);
Console.WriteLine(formatted);
After all, .Net has very good globalization support - you're better served using it!
string s = "731478718861993983";
var newString = string.Format("{0:##-####-####-####-####}", Convert.ToInt64(s));
LINQ-only one-liner:
var str = "731478718861993983";
var result =
    new string(
        str.ToCharArray().
            Reverse(). // So that it will go over string right-to-left
            Select((c, i) => new { @char = c, group = i / 4 }). // Keep group number
            Reverse(). // Restore original order
            GroupBy(t => t.group). // Now do the actual grouping
            Aggregate("", (s, grouping) => s + "-" + new string( // Append each group to the accumulator
                grouping.
                    Select(gr => gr.@char).
                    ToArray())).
            ToArray()).
        Trim('-');
This can handle strings of arbitrary lengths.
Simple (and naive) extension method :
class Program
{
static void Main(string[] args)
{
Console.WriteLine("731478718861993983".InsertChar("-", 4));
}
}
static class Ext
{
public static string InsertChar(this string str, string c, int i)
{
for (int j = str.Length - i; j > 0; j -= i) // j > 0 avoids a leading separator when the length is a multiple of i
{
str = str.Insert(j, c);
}
return str;
}
}
If you're dealing strictly with a string, you can make a simple Regex.Replace, to capture each group of 4 digits:
string str = "731478718861993983";
str = Regex.Replace(str, "(?!^).{4}", "-$0" ,RegexOptions.RightToLeft);
Console.WriteLine(str);
Note the use of RegexOptions.RightToLeft, to start capturing from the right (so "12345" will be replaced to 1-2345, and not -12345), and the use of (?!^) to avoid adding a dash in the beginning.
You may want to capture only digits - a possible pattern then may be @"\B\d{4}".
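The same right-to-left grouping can also be sketched with a plain loop, which avoids the regex and works for strings too long to parse as a number. DigitGrouping and GroupFromRight are hypothetical names, and the sketch does not check that the input contains only digits.

```csharp
using System.Text;

static class DigitGrouping
{
    // Inserts a separator so that groups of "size" characters are counted
    // from the right; any shorter group ends up at the front.
    public static string GroupFromRight(string s, int size, char separator)
    {
        if (string.IsNullOrEmpty(s) || size <= 0)
        {
            return s;
        }
        var sb = new StringBuilder(s.Length + s.Length / size);
        int lead = s.Length % size; // length of the short leading group, 0 if none
        for (int i = 0; i < s.Length; i++)
        {
            // A new group starts wherever the offset past the leading
            // group is a multiple of the group size (never at index 0).
            if (i > 0 && (i - lead) % size == 0)
            {
                sb.Append(separator);
            }
            sb.Append(s[i]);
        }
        return sb.ToString();
    }
}
```

DigitGrouping.GroupFromRight("731478718861993983", 4, '-') gives 73-1478-7188-6199-3983, and "12345" becomes 1-2345.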
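string myString = "731478718861993983";
// Strings are immutable, so Insert returns a new string that must be assigned back.
myString = myString.Insert(2, "-");
myString = myString.Insert(7, "-");
myString = myString.Insert(12, "-");
myString = myString.Insert(17, "-");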
My first thought is:
String s = "731478718861993983";
s = s.Insert(2,"-");
s = s.Insert(7,"-");
s = s.Insert(12,"-");
s = s.Insert(17,"-");
(String.Insert takes a zero-based index, and each position accounts for the dashes already inserted)
but there is probably some easier way to do this...
If the position of "-" is always the same then you can try
string s = "731478718861993983";
s = s.Insert(2, "-");
s = s.Insert(7, "-");
s = s.Insert(12, "-");
s = s.Insert(17, "-");
Here's how I'd do it. A format string like this only works on numeric types, not on strings, so the string is parsed to a long first.
string numbers = "731478718861993983";
string formattedNumbers = String.Format("{0:##-####-####-####-####}", long.Parse(numbers));
Edit: amended code, since you said they were held as a string in your original question
I have a string 01-India. I want to split on '-' and get only the code 01. How can I do this? I'm a .NET newbie. The Split function returns an array. Since I need only one string, how can this be done? Is there an ingenious way to do it using Split only, or do I have to use Substring?
Other possibility is
string xy = "01-India";
string xz = xy.Split('-')[0];
You can search for the first occurrence of - and then use the Substring method to cut the piece out.
var result = input.Substring(0, input.IndexOf('-'));
string str = "01-India";
string prefix = null;
int pos = str.IndexOf('-');
if (pos != -1)
prefix = str.Substring(0, pos);
var str = "01-India";
var hyphenIndex = str.IndexOf("-");
var start = str.Substring(0, hyphenIndex);
or you can use regular expression if it is a more complicated string pattern.
Something like this?
var s = "01-India";
var result = s.Substring(0, s.IndexOf("-"));
Since you don't want to use arrays, you could do an IndexOf('-') and then a substring.
string s = "01-India";
int index = s.IndexOf('-');
string code = s.Substring(0, index);
Or, for added fun, you could use String.Remove.
string s = "01-India";
int index = s.IndexOf('-');
string code = s.Remove(index);
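One caveat with both variants: IndexOf returns -1 when there is no '-', and Substring(0, -1) or Remove(-1) then throws an ArgumentOutOfRangeException. A small guard, sketched here with the invented names CodePrefix and Before, returns the whole string in that case. Whether that is the right fallback depends on your data.

```csharp
static class CodePrefix
{
    // Returns the part of s before the first separator,
    // or s unchanged when the separator does not occur.
    public static string Before(string s, char separator)
    {
        int index = s.IndexOf(separator);
        return index < 0 ? s : s.Substring(0, index);
    }
}
```

CodePrefix.Before("01-India", '-') gives "01", and CodePrefix.Before("India", '-') gives "India".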
string value = "01-India";
string part1 = value.Split('-')[0];