In C# what would be the best way of splitting this sort of string?
%%x%%a,b,c,d
So that I end up with the value between the %% AND another variable containing everything right of the second %%
i.e. var x = "x"; var y = "a,b,c,d"
Where a,b,c.. could be an infinite comma seperated list. I need to extract the list and the value between the two double-percentage signs.
(To combat the infinite part, I thought perhaps seperating the string out to: %%x%% and a,b,c,d. At this point I can just use something like this to get X.
var tag = "%%";
var startTag = tag;
int startIndex = s.IndexOf(startTag) + startTag.Length;
int endIndex = s.IndexOf(tag, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
Would the best approach be to use regex or use lots of indexOf and substring to do the extracting based on te static %% characters?
Given that what you want is "x,a,b,c,d" the Split() function is actually pretty powerful and regex would be overkill for this.
Here's an example:
string test = "%%x%%a,b,c,d";
string[] result = test.Split(new char[] { '%', ',' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in result) {
Console.WriteLine(s);
}
Basicly we ask it to split by both '%' and ',' and ignore empty results (eg. the result between "%%"). Here's the result:
x
a
b
c
d
To Extract X:
If %% is always at the start then;
string s = "%%x%%a,b,c,d,h";
s = s.Substring(2,s.LastIndexOf("%%")-2);
//Console.WriteLine(s);
Else;
string s = "v,u,m,n,%%x%%a,b,c,d,h";
s = s.Substring(s.IndexOf("%%")+2,s.LastIndexOf("%%")-s.IndexOf("%%")-2);
//Console.WriteLine(s);
If you need to get them all at once then use this;
string s = "m,n,%%x%%a,b,c,d";
var myList = s.ToArray()
.Where(c=> (c != '%' && c!=','))
.Select(c=>c).ToList();
This'll let you do it all in one go:
string pattern = "^%%(.+?)%%(?:(.+?)(?:,|$))*$";
string input = "%%x%%a,b,c,d";
Match match = Regex.Match(input, pattern);
if (match.Success)
{
// "x"
string first = match.Groups[1].Value;
// { "a", "b", "c", "d" }
string[] repeated = match.Groups[2].Captures.Cast<Capture>()
.Select(c => c.Value).ToArray();
}
You can use the char.IsLetter to get all the list of letter
string test = "%%x%%a,b,c,d";
var l = test.Where(c => char.IsLetter(c)).ToArray();
var output = string.Join(", ", l.OrderBy(c => c));
Since you want the value between the %% and everything after in separate variables and you don't need to parse the CSV, I think a RegEx solution would be your best choice.
var inputString = #"%%x%%a,b,c,d";
var regExPattern = #"^%%(?<x>.+)%%(?<csv>.+)$";
var match = Regex.Match(inputString, regExPattern);
foreach (var item in match.Groups)
{
Console.WriteLine(item);
}
The pattern has 2 named groups called x and csv, so rather than just looping, you can easily reference them by name and assign them to values:
var x = match.Groups["x"];
var y = match.Groups["csv"];
Related
I have to write a program which parses a string for words starting with '#' and return the words along with the # symbol.
I have tried something like:
char[] delim = { '#' };
string[] strArr = commenttext.Split(delim);
return strArr;
But it returns all the words without '#' in an array.
I need something pretty straight forward.No LINQ like things
If the string is "abc #ert #xyz" then I should get back #ert and #xyz.
If you define "word" as "separated by spaces" then this would work:
string[] strArr = commenttext.Split(' ')
.Where(w => w.StartsWith("#"))
.ToArray();
If you need something more complex, a Regular Expression might be more appropriate.
I need something pretty straight forward.No LINQ like things>
The non-Linq equivalent would be:
var words = commenttext.Split(' ');
List<string> temp = new List<string>();
foreach(string w in words)
{
if(w.StartsWith("#"))
temp.Add(w);
}
string[] strArr = temp.ToArray();
If you're against using Linq, which you should not be unless you're required to use older .NET versions, an approach along these lines would suit your needs.
string[] words = commenttext.Split(delimiter);
for (int i = 0; i < words.Length; i++)
{
string word = words[i];
if (word.StartsWith(delimiter))
{
// save in array / list
}
}
const string test = "#Amir abcdef #Stack #C# mnop xyz";
var splited = test.Split(' ').Where(m => m.StartsWith("#")).ToList();
foreach (var b in splited)
{
Console.WriteLine(b.Substring(1, b.Length - 1));
}
Console.ReadKey();
I am having strings like below
<ad nameId="\862094\"></ad>
or comma seprated like below
<ad nameId="\862593\"></ad>,<ad nameId="\862094\"></ad>,<ad nameId="\865599\"></ad>
How to extract nameId value and store in single string like below
string extractedValues ="862094";
or in case of comma seprated string above
string extractedMultipleValues ="862593,862094,865599";
This is what I have started trying with but not sure
string myString = "<ad nameId="\862593\"></ad>,<ad nameId="\862094\"></ad>,<ad
nameId="\865599\"></ad>";
string[] myStringArray = myString .Split(',');
foreach (string str in myStringArray )
{
xd.LoadXml(str);
chkStringVal = xd.SelectSingleNode("/ad/#nameId").Value;
}
Search for:
<ad nameId="\\(\d*)\\"><\/ad>
Replace with:
$1
Note that you must search globally. Example: http://www.regex101.com/r/pL2lX1
Please see code below to extract all numbers in your example:
string value = #"<ad nameId=""\862093\""></ad>,<ad nameId=""\862094\""></ad>,<ad nameId=""\865599\""></ad>";
var matches = Regex.Matches(value, #"(\\\d*\\)", RegexOptions.RightToLeft);
foreach (Group item in matches)
{
string yourMatchNumber = item.Value;
}
Try like this;
string s = #"<ad nameId=""\862094\""></ad>";
if (!(s.Contains(",")))
{
string extractedValues = s.Substring(s.IndexOf("\\") + 1, s.LastIndexOf("\\") - s.IndexOf("\\") - 1);
}
else
{
string[] array = s.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
string extractedMultipleValues = "";
for (int i = 0; i < array.Length; i++)
{
extractedMultipleValues += array[i].Substring(array[i].IndexOf("\\") + 1, array[i].LastIndexOf("\\") - array[i].IndexOf("\\") - 1) + ",";
}
Console.WriteLine(extractedMultipleValues.Substring(0, extractedMultipleValues.Length -1));
}
mhasan, here goes an example of what you need(well almost)
EDITED: complete code (it's a little tricky)
(Sorry for the image but i have some troubles with tags in the editor, i can send the code by email if you want :) )
A little explanation about the code, it replaces all ocurrences of parsePattern in the given string, so if the given string has multiple tags separated by "," the final result will be the numbers separated by "," stored in parse variable....
Hope it helps
I have a string that looks something like this:
"PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N||||||||||";
I am trying to remove any separating '|' characters after the 30th '|' in the string so that the output string looks like this:
"PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N";
I am trying to do it using as little code as possible, but not having much luck. Any help or ideas would be great.
You can use the TrimEnd method
string text = "stuff||||N||||||||||";
string result = text.TrimEnd('|'); //Result is stuff||||N
Brute force but only a little bit of code:
string s2 = string.Join("|", s1.Split('|').Take(31));
If you need any other processing of this kind of data (it looks like a kind of nested CSV) then string.Split() is useful to know.
string str = "PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N||||||||||";
int c = 0;
int after = 30;
StringBuilder newStr = new StringBuilder();
for(int i = 0;i < str.length; i++){
if(str[i] == '|'){
if(after != c){
newStr.append(str[i]);
c++;
}
}else{
newStr.append(str[i]);
}
}
results in
newStr == "PID||000000|Z123345|23345|SOMEONE^FIRSTNAME^^^MISS^||150|F|1111||1 DREYFUS CLOSE^SOUTH CITY^COUNTY^^POST CODE^^^||0123 45678910^PRN^PH^^^^0123 45678910^^~^^CP^^^^^^~^NET^^^^^^^||||1A|||||A||||||||N";
A regex should do the trick:
var regex = new Regex(#"^([^\|]*\|){0,30}[^\|]*");
var match = regex.Match(input);
if(match.Success)
{
var val = match.Value;
}
If what you really want is that everything after the 30th chunk loses its '|', then try:
var chunks = input.Split('|');
var output = String.Join('|',chunks.Take(30)) + String.Concat(chunks.Skip(30));
That said, I think it sounds like what you're really looking for is probably something like:
var output = input.TrimEnd('|');
// Get the indexes of all the | characters.
int[] pipeIndexes = Enumerable.Range(0, s.Length).Where(i => s[i] == '|').ToArray();
// If there are more than thirty pipes:
if (pipeIndexes.Length > 30)
{
// The former part of the string remains intact.
string formerPart = s.Substring(0, pipeIndexes[30]);
// The latter part needs to have all | characters removed.
string latterPart = s.Substring(pipeIndexes[30]).Replace("|", "");
s = formerPart + latterPart;
}
I am having an output in string format like following :
"ABCDED 0000A1.txt PQRSNT 12345"
I want to retreieve substring(s) having .txt in above string. e.g. For above it should return 0000A1.txt.
Thanks
You can either split the string at whitespace boundaries like it's already been suggested or repeatedly match the same regex like this:
var input = "ABCDED 0000A1.txt PQRSNT 12345 THE.txt FOO";
var match = Regex.Match (input, #"\b([\w\d]+\.txt)\b");
while (match.Success) {
Console.WriteLine ("TEST: {0}", match.Value);
match = match.NextMatch ();
}
Split will work if it the spaces are the seperator. if you use oter seperators you can add as needed
string input = "ABCDED 0000A1.txt PQRSNT 12345";
string filename = input.Split(' ').FirstOrDefault(f => System.IO.Path.HasExtension(f));
filname = "0000A1.txt" and this will work for any extension
You may use c#, regex and pattern, match :)
Here is the code, plug it in try. Please comment.
string test = "afdkljfljalf dkfjd.txt lkjdfjdl";
string ffile = Regex.Match(test, #"\([a-z0-9])+.txt").Groups[1].Value;
Console.WriteLine(ffile);
Reference: regexp
I did something like this:
string subString = "";
char period = '.';
char[] chArString;
int iSubStrIndex = 0;
if (myString != null)
{
chArString = new char[myString.Length];
chArString = myString.ToCharArray();
for (int i = 0; i < myString.Length; i ++)
{
if (chArString[i] == period)
iSubStrIndex = i;
}
substring = myString.Substring(iSubStrIndex);
}
Hope that helps.
First split your string in array using
char[] whitespace = new char[] { ' ', '\t' };
string[] ssizes = myStr.Split(whitespace);
Then find .txt in array...
// Find first element starting with .txt.
//
string value1 = Array.Find(array1,
element => element.Contains(".txt", StringComparison.Ordinal));
Now your value1 will have the "0000A1.txt"
Happy coding.
I have a string that looks like this:
var expression = #"Args("token1") + Args("token2")";
I want to retrieve a collection of strings that are enclosed in Args("") in the expression.
How would I do this in C# or VB.NET?
Regex:
string expression = "Args(\"token1\") + Args(\"token2\")";
Regex r = new Regex("Args\\(\"([^\"]+)\"\\)");
List<string> tokens = new List<string>();
foreach (var match in r.Matches(expression)) {
string s = match.ToString();
int start = s.IndexOf('\"');
int end = s.LastIndexOf('\"');
tokens.add(s.Substring(start + 1, end - start - 1));
}
Non-regex (this assumes that the string in the correct format!):
string expression = "Args(\"token1\") + Args(\"token2\")";
List<string> tokens = new List<string>();
int index;
while (!String.IsNullOrEmpty(expression) && (index = expression.IndexOf("Args(\"")) >= 0) {
int start = expression.IndexOf('\"', index);
string s = expression.Substring(start + 1);
int end = s.IndexOf("\")");
tokens.Add(s.Substring(0, end));
expression = s.Substring(end + 2);
}
There is another regular expression method for accomplishing this, using lookahead and lookbehind assertions:
Regex regex = new Regex("(?<=Args\\(\").*?(?=\"\\))");
string input = "Args(\"token1\") + Args(\"token2\")";
MatchCollection matches = regex.Matches(input);
foreach (var match in matches)
{
Console.WriteLine(match.ToString());
}
This strips away the Args sections of the string, giving just the tokens.
If you want token1 and token2, you can use following regex
input=#"Args(""token1"") + Args(""token2"")"
MatchCollection matches = Regex.Matches(input,#"Args\(""([^""]+)""\)");
Sorry, If this is not what you are looking for.
if your collection looks like this:
IList<String> expression = new List<String> { "token1", "token2" };
var collection = expression.Select(s => Args(s));
As long as Args returns the same type as the queried collection type this should work okay
you can then iterate over the collection like so
foreach (var s in collection)
{
Console.WriteLine(s);
}