I am parsing a set of coordinates from an XML file. Each node will have coordinates like:
-82.5,34.1,0.000 -82.6,34.2,0.000
In the code below, the raw_coords variable is already assigned the above value, and I am trying to split it into the array lnglatset, which does look okay.
string[] lnglatset = raw_coords.Split(' ');//will yield like [0]=-82.00,34.00,00000 // Will need to get rid of the last set of zeros
foreach (string lnglat in lnglatset)
{
Console.WriteLine(lnglat);//-82.5,34.1,0.000; looks fine
}
From the above, the final value needed would be:
coords = "-82.5 34.1, -82.6 34.2";//note the space between lng/lat
But how do I remove the junk 0.000 values from each element of the array and put a space, instead of a comma, between the lng and lat values in each element? I tried calling Remove() on lnglat, but that was not allowed inside the foreach loop. Thanks!
You can take all parts except the last one using the Take method:
var parts = raw_coords.Split(' ')
.Select(x => x.Split(','))
.Select(x => string.Join(" ", x.Take(x.Length - 1)));
var result = string.Join(", ", parts); // ", " matches the requested "-82.5 34.1, -82.6 34.2"
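As a rough, self-contained sketch of the same idea (the class and method names here are hypothetical, not from the question), the conversion can be wrapped in a helper that drops the third value of every set and joins the pairs with ", ":

```csharp
using System;
using System.Linq;

public static class CoordsDemo
{
    // Hypothetical helper: "lng,lat,alt lng,lat,alt" -> "lng lat, lng lat".
    public static string ToLngLatPairs(string rawCoords)
    {
        var pairs = rawCoords
            // One entry per "lng,lat,alt" set; ignore repeated spaces.
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            // Break each set into its comma-separated parts.
            .Select(set => set.Split(','))
            // Keep only lng and lat, separated by a space.
            .Select(parts => string.Join(" ", parts.Take(2)));

        return string.Join(", ", pairs);
    }

    public static void Main()
    {
        Console.WriteLine(ToLngLatPairs("-82.5,34.1,0.000 -82.6,34.2,0.000"));
        // prints: -82.5 34.1, -82.6 34.2
    }
}
```

Taking the first two parts, rather than filtering out zeros, keeps a genuine coordinate of 0 intact.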
In a single line :
String result = String.Join(" ", raw_coords.Split(' ', ',')
.Select(i => double.Parse(i))
.Where(i => i != 0).Select( i => i.ToString()));
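It removes each 0.000 element and replaces the commas with spaces. Note that it also drops any genuine coordinate equal to 0, and the result has no comma between the pairs.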
You can't modify the iteration variable inside a foreach loop. Instead, you can just skip the last member when splitting each coordinate set:
lnglat.Split(',').Take(2).ToArray()
As others mentioned, you cannot modify the iteration variable in a foreach. I learned that the hard way and ended up using a simple for loop instead of foreach:
for (int i = 0; i < lnglatset.Length; i++)
{
// lnglatset[i] can be reassigned here
}
You can use IEnumerable.Last() extension method from System.Linq.
string lastItemOfSplit = aString.Split(new char[] {@"\"[0], "/"[0]}).Last();
Related
I have the following text in a file:
"SHOP_ORDER001","SHOP_ORDER002","SHOP_ORDER003","SHOP_ORDER004","SHOP_ORDER005"
Now I am getting the values by reading the file and assigning them to an array with Split:
String orderValue = "";
string[] orderArray;
orderValue = File.ReadAllText(@"C:\File.txt");
orderArray = orderValue.Split(',');
But I am getting the values as :
I need the Values in Array as "ORDER001","ORDER002","ORDER003"
The \" you see is just added by the debugger visualizer for strings (the quote is a special character and needs to be escaped to avoid confusion). Don't worry, the backslashes are not in your orderArray.
In case you want to remove quotes too so that your array will be:
SHOP_ORDER001
SHOP_ORDER002
...
Just use this (with LINQ):
var orderArray = orderValue.Split(',').Select(x => x.Trim('"'));
By the way String.Split isn't very robust unless you're sure each field will never contain a comma.
EDIT
To answer the point you added in the comments if you need to remove SHOP_ just write this:
var orderArray = orderValue.Split(',')
.Select(x => x.Trim('"').Substring("SHOP_".Length));
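A slightly more defensive sketch of the same approach, assuming the file content has already been read into a string. The helper name ParseOrders is hypothetical; it only strips the prefix when the field really starts with it, so a short or unprefixed field does not make Substring throw:

```csharp
using System;
using System.Linq;

public static class OrderDemo
{
    // Hypothetical helper: split on commas, drop surrounding whitespace and
    // quotes, then remove the prefix only where it is actually present.
    public static string[] ParseOrders(string orderValue, string prefix)
    {
        return orderValue
            .Split(',')
            .Select(x => x.Trim().Trim('"'))
            .Select(x => x.StartsWith(prefix, StringComparison.Ordinal)
                ? x.Substring(prefix.Length)
                : x)
            .ToArray();
    }

    public static void Main()
    {
        string text = "\"SHOP_ORDER001\",\"SHOP_ORDER002\",\"SHOP_ORDER003\"";
        Console.WriteLine(string.Join(" ", ParseOrders(text, "SHOP_")));
        // prints: ORDER001 ORDER002 ORDER003
    }
}
```

The extra Trim() is there because File.ReadAllText may return a trailing newline after the last field.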
Use this regex:
var res = Regex.Matches(orderValue, @"(?<=""SHOP_)[^""]+?(?="")");
You could use this:
string[] result = Regex.Split(orderValue, "(?:^\"SHOP_)|(?:\",\"SHOP_)|(?:\"$)");
However you will have to skip the first and last items in the resulting array as they will always be empty strings.
Silly question but why don't you just do
.Replace("SHOP_", "");
How can I remove a whole line from a text file if the first word matches to a variable I have?
What I'm currently trying is:
List<string> lineList = File.ReadAllLines(dir + "textFile.txt").ToList();
lineList = lineList.Where(x => x.IndexOf(user) <= 0).ToList();
File.WriteAllLines(dir + "textFile.txt", lineList.ToArray());
But I can't get it to remove.
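The only mistake is that you are checking IndexOf against <= 0 instead of comparing it with 0.
-1 is returned when the string does not contain the searched-for string.
<= 0 means the line either starts with the value or does not contain it.
== 0 means the line starts with the value. This is the case you want to detect, so to remove those lines, keep only the lines where IndexOf(user) != 0.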
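To make the three IndexOf outcomes concrete, here is a small sketch with made-up sample lines. KeepLine is a hypothetical helper; because Where keeps the elements for which the predicate is true, the predicate has to be "does not start with the user":

```csharp
using System;
using System.Linq;

public static class IndexOfDemo
{
    // Hypothetical helper: true when the line should be kept,
    // i.e. when it does NOT start with the given user name.
    public static bool KeepLine(string line, string user)
    {
        return line.IndexOf(user, StringComparison.Ordinal) != 0;
    }

    public static void Main()
    {
        string[] lines = { "bob 42", "alice bob", "carol 7" };

        // IndexOf gives 0 (starts with), 6 (contains later), -1 (absent).
        foreach (var line in lines)
            Console.WriteLine(line + " -> " + line.IndexOf("bob", StringComparison.Ordinal));

        // Only "bob 42" is dropped.
        var kept = lines.Where(l => KeepLine(l, "bob")).ToArray();
        Console.WriteLine(string.Join(" | ", kept));
    }
}
```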
This method will read the file line-by-line instead of all at once. Also note that this implementation is case-sensitive.
It also assumes you aren't subjected to leading spaces.
using (var writer = new StreamWriter("temp.file"))
{
//here I only write back what doesn't match
foreach(var line in File.ReadLines("file").Where(x => !x.StartsWith(user)))
writer.WriteLine(line); // ReadLines strips the line terminator, so this does not double-space
}
File.Delete("file"); // File.Move throws if the destination already exists
File.Move("temp.file", "file");
You were pretty close, String.StartsWith handles that nicely:
// nb: if you are case SENSITIVE remove the second argument to ll.StartsWith
File.WriteAllLines(
path,
File.ReadAllLines(path)
.Where(ll => !ll.StartsWith(user, StringComparison.OrdinalIgnoreCase))); // keep only the lines that do NOT match
For really large files that may not perform well. Instead:
// Write our new data to a temp file and read the old file On The Fly
var temp = Path.GetTempFileName();
try
{
File.WriteAllLines(
temp,
File.ReadLines(path)
.Where(
ll => !ll.StartsWith(user, StringComparison.OrdinalIgnoreCase))); // keep only the lines that do NOT match
File.Copy(temp, path, true);
}
finally
{
File.Delete(temp);
}
Another issue noted was that both IndexOf and StartsWith will treat ABC and ABCDEF as matches if the user is ABC:
var matcher = new Regex(
@"^" + Regex.Escape(user) + @"\b", // <-- matches the first "word"
RegexOptions.CaseInsensitive);
File.WriteAllLines(
path,
File.ReadAllLines(path)
.Where(ll => !matcher.IsMatch(ll))); // keep only the lines that do NOT match
Use `!= 0` instead of `<= 0`: IndexOf returns 0 when the line starts with the value, so `!= 0` keeps every other line.
Let's say I have an array (or list) of items:
A[] = [a,b,c,d,e]
If I want to print them out so each item is separated by a comma (or any other delimiter), I generally have to do this:
for(int i=0; i < A.Count; i++)
{
Console.Write(A[i]);
if (i != A.Count-1)
Console.Write(",");
}
So, my output looks like:
a,b,c,d,e
Is there a better or neater way to achieve this?
I like to use a foreach loop, but that prints a comma after the last element as well, which is undesirable.
Console.WriteLine(string.Join(",", A));
You are looking for String.Join():
var list = String.Join(",", A);
String.Join Method (String, String[])
Concatenates all the elements of a string array, using the specified separator between each element.
public static string Join(
string separator,
params string[] value
)
Is there a better or neater way to achieve this? I like to use a foreach loop, but that prints a comma after the last element as well, which is undesirable.
As others have said, Join does the right thing. But here's another way to think about the problem that might help you in the future. Instead of thinking of the problem as put a comma after every element except the last element -- which you correctly note makes it difficult to work with the "foreach" loop -- think of the problem as put a comma before every element except the first element. Now it is easy to do with a foreach loop!
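A minimal sketch of that "delimiter before every element except the first" idea, written with a foreach loop. The helper name JoinWithForeach is made up for illustration; in real code string.Join already does this:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class JoinDemo
{
    // Hypothetical helper: emit the delimiter BEFORE each item except the first.
    public static string JoinWithForeach(IEnumerable<string> items, string delimiter)
    {
        var sb = new StringBuilder();
        bool first = true;
        foreach (var item in items)
        {
            if (!first)
                sb.Append(delimiter); // skipped only for the first item
            sb.Append(item);
            first = false;
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinWithForeach(new[] { "a", "b", "c", "d", "e" }, ","));
        // prints: a,b,c,d,e
    }
}
```

Because the loop never needs to know which element is last, it works on any IEnumerable, including sequences whose length is not known up front.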
For about a million more ways to solve a similar problem see:
Eric Lippert's challenge "comma-quibbling", best answer?
And the original blog post:
http://blogs.msdn.com/b/ericlippert/archive/2009/04/15/comma-quibbling.aspx
Use the string.Join method, very handy.
String.Join(",", my_array)
Use:
String.Join(",", arrayOfStrings);
string separator = String.Empty;
for(int i=0; i < A.Length; i++)
{
Console.Write(separator);
Console.Write(A[i]);
separator = ",";
}
using System;
using System.Linq;
public class Program
{
public static void Main()
{
string[] values = new string[]{"banana", "papaya", "melon"};
var result = values.Aggregate((x,y) => x + ", " + y);
Console.WriteLine(result);
}
}
I have the following string, from which I would like to retrieve some values:
============================
Control 127232:
map #;-
============================
Control 127235:
map $;NULL
============================
Control 127236:
I want to take only the Control numbers. Is there a way to extract them from the string above into an array like [127232, 127235, 127236]?
One way of achieving this is with regular expressions, which does introduce some complexity but will give the answer you want with a little LINQ for good measure.
Start with a regular expression to capture, within a group, the data you want:
var regex = new Regex(@"Control\s+(\d+):");
This will look for the literal string "Control" followed by one or more whitespace characters, followed by one or more numbers (within a capture group) followed by a literal string ":".
Then capture matches from your input using the regular expression defined above:
var matches = regex.Matches(inputString);
Then, using a bit of LINQ you can turn this to an array
var arr = matches.OfType<Match>()
.Select(m => long.Parse(m.Groups[1].Value))
.ToArray();
Now arr is an array of longs containing just the numbers.
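Put together, the steps above might look like the following self-contained sketch. The class and method names are hypothetical, and the sample input is abbreviated from the question:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ControlDemo
{
    // Hypothetical helper: collect the number captured after each "Control".
    public static long[] ExtractControlNumbers(string input)
    {
        return Regex.Matches(input, @"Control\s+(\d+):")
            .Cast<Match>()                               // MatchCollection -> IEnumerable<Match>
            .Select(m => long.Parse(m.Groups[1].Value))  // group 1 holds the digits
            .ToArray();
    }

    public static void Main()
    {
        string input =
            "====\nControl 127232:\nmap #;-\n" +
            "====\nControl 127235:\nmap $;NULL\n" +
            "====\nControl 127236:";

        Console.WriteLine(string.Join(", ", ExtractControlNumbers(input)));
        // prints: 127232, 127235, 127236
    }
}
```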
Live example here: http://rextester.com/rundotnet?code=ZCMH97137
Try this (assuming your string is named s and the lines are separated by \n):
List<string> ret = new List<string>();
foreach (string t in s.Split('\n').Where(p => p.StartsWith("Control")))
ret.Add(t.Replace("Control ", "").Replace(":", ""));
ret.Add(...) part is not elegant, but works...
EDITED:
If you want an array use string[] arr = ret.ToArray();
SYNOPSIS:
I see you're new to this, so I'll try to explain:
s.Split('\n') creates a string[] (every line in your string)
.Where(...) part extracts from the array only strings starting with Control
foreach part navigates through returned array taking one string at a time
t.Replace(..) cuts unwanted string out
ret.Add(...) finally adds searched items into returning list
Off the top of my head try this (it's quick and dirty), assuming the text you want to search is in the variable 'text':
List<string> numbers = System.Text.RegularExpressions.Regex.Split(text, "[^\\d+]").ToList();
numbers.RemoveAll(item => item == "");
The first line splits out all the numbers into separate items in a list, but it also produces lots of empty strings. The second line removes the empty strings, leaving you with a list of the three numbers. If you want to convert that back to an array, just add the following line to the end:
var numberArray = numbers.ToArray();
Yes, there is a way. I can't recall a built-in shortcut for it, but the string can be parsed to extract these values. The algorithm is as follows:
Find the word "Control" in the string and where it ends
Find the group of digits after the word
Extract the number with int.Parse or int.TryParse
If the end of the string has not been reached, go back to step one
Implementing this algorithm is almost trivial.
This is simplest implementation (your string is str):
int i, number, index = 0;
while ((index = str.IndexOf(':', index)) != -1)
{
i = index - 1;
while (i >= 0 && char.IsDigit(str[i])) i--;
if (++i < index)
{
number = int.Parse(str.Substring(i, index - i));
Console.WriteLine("Number: " + number);
}
index ++;
}
Using LINQ for such a little operation is doubtful.
I am working on an application which imports thousands of lines where every line has a format like this:
|* 9070183020 |04.02.2011 |107222 |M/S SUNNY MEDICOS |GHAZIABAD | 32,768.00 |
I am using the following Regex to split the lines to the data I need:
Regex lineSplitter = new Regex(@"(?:^\|\*|\|)\s*(.*?)\s+(?=\|)");
string[] columns = lineSplitter.Split(data);
foreach (string c in columns)
Console.Write("[" + c + "] ");
This is giving me the following result:
[] [9070183020] [] [04.02.2011] [] [107222] [] [M/S SUNNY MEDICOS] [] [GHAZIABAD] [] [32,768.00] [|]
Now I have two questions.
1. How do I remove the empty results. I know I can use:
string[] columns = lineSplitter.Split(data).Where(s => !string.IsNullOrEmpty(s)).ToArray();
but is there any built in method to remove the empty results?
2. How can I remove the last pipe?
Thanks for any help.
Regards,
Yogesh.
EDIT:
I think my question was a little misunderstood. It was never about how I can do it. It was only about how can I do it by changing the Regex in the above code.
I know that I can do it in many ways. I have already done it with the code mentioned above with a Where clause and with an alternate way which is also (more than two times) faster:
Regex regex = new Regex(@"(^\|\*\s*)|(\s*\|\s*)");
data = regex.Replace(data, "|");
string[] columns = data.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
Secondly, as a test case, my system can parse 92k+ such lines in less than 1.5 seconds in the original method and in less than 700 milliseconds in the second method, where I will never find more than a couple of thousand in real cases, so I don't think I need to think about the speed here. In my opinion thinking about speed in this case is Premature optimization.
I have found the answer to my first question: it cannot be done with Split as there is no such option built in.
Still looking for answer to my second question.
Regex lineSplitter = new Regex(@"[\s*\*]*\|[\s*\*]*");
var columns = lineSplitter.Split(data).Where(s => s != String.Empty);
or you could simply do:
string[] columns = data.Split(new char[] {'|'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string c in columns) this.textBox1.Text += "[" + c.Trim(' ', '*') + "] " + "\r\n";
And no, there is no option to remove empty entries for RegEx.Split as is for String.Split.
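Since Regex.Split has no such option, one possible workaround is a small wrapper that filters the result. This is only a sketch: SplitNonEmpty is a hypothetical helper, and the pattern in Main is an assumption chosen for the sample line in the question (it consumes each pipe, an optional "*" after it, and the surrounding whitespace):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    // Hypothetical helper: Regex.Split followed by removal of empty entries,
    // mimicking StringSplitOptions.RemoveEmptyEntries.
    public static string[] SplitNonEmpty(string input, string pattern)
    {
        return Regex.Split(input, pattern)
            .Where(s => s.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        string data = "|* 9070183020 |04.02.2011 |107222 |M/S SUNNY MEDICOS |GHAZIABAD | 32,768.00 |";

        // Assumed delimiter pattern: whitespace, a pipe, optional '*', whitespace.
        foreach (string c in SplitNonEmpty(data, @"\s*\|\*?\s*"))
            Console.Write("[" + c + "] ");
        // prints: [9070183020] [04.02.2011] [107222] [M/S SUNNY MEDICOS] [GHAZIABAD] [32,768.00]
    }
}
```

Because the pattern has no capture groups, Regex.Split does not inject captured text into the result, and the leading and trailing pipes only produce empty strings that the filter removes.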
You can also use matches.
I think this may work as an equivalent to remove empty strings:
string[] splitter = Regex.Split(textvalue, @"\s").Where(s => s != String.Empty).ToArray<string>();
Don't use a regex at all in your case.
It doesn't seem you need one and regexes are much slower (and have a much higher overhead) than directly using the string functions.
So use something like:
char[] splitChars = new char[] {'|'}; // arrays cannot be declared const in C#
string[] splitData = data.Split(splitChars, StringSplitOptions.RemoveEmptyEntries);
As an alternative to splitting, which is always going to cause trouble when your delimiters are also present at the beginning and end of the input, you can try matching the contents within the pipes:
foreach (Match token in Regex.Matches(input, @"\|\*?\s*(\S[^|]*?)\s*(?=\|)"))
{
Console.WriteLine("[{0}]", token.Groups[1].Value);
}
// Prints the following:
// [9070183020]
// [04.02.2011]
// [107222]
// [M/S SUNNY MEDICOS]
// [GHAZIABAD]
// [32,768.00]
I might have the wrong idea here, but you just want to split the data string using the '|' character as a delimiter? In that case you could try:
string[] result = data.Split(new[] { "|" }, StringSplitOptions.RemoveEmptyEntries).Select(d => d.Trim()).ToArray();
This will return all the fields, without surrounding spaces and with empty fields removed. You can do what you like in the Select part to format the results, e.g.
.Select(d => "[" + d.Trim() + "]").ToArray();
Based on @Jaroslav Jandek's great answer, I wrote an extension method. I am putting it here in case it saves you some time.
/// <summary>
/// String.Split with RemoveEmptyEntries option for clean up empty entries from result
/// </summary>
/// <param name="s">Value to parse</param>
/// <param name="separator">The separator</param>
/// <param name="index">Hint: pass -1 to get Last item</param>
/// <param name="wholeResult">Get array of split value</param>
/// <returns></returns>
public static object CleanSplit(this string s, char separator, int index, bool wholeResult = false)
{
if (string.IsNullOrWhiteSpace(s)) return "";
var split = s.Split(new char[] { separator }, StringSplitOptions.RemoveEmptyEntries);
if (wholeResult) return split;
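if (split.Length == 0) return ""; // nothing left after removing empty entries
if (index == -1) return split.Last();
if (index >= 0 && index < split.Length) return split[index]; // guard against out-of-range indexes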
return "";
}
1. How do I remove the empty results?
You can use LINQ to remove all entries that are equal to string.Empty:
string[] columns = lineSplitter.Split(data);
columns = columns.Where(c => !c.Equals(string.Empty)).ToArray();
2. How can I remove the last pipe?
You can use LINQ here to remove all the entries equal to the character you want to remove:
columns = columns.Where(c => !c.Equals("|")).ToArray();
How about this:
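Assuming we have a line:
string line1 = "|* 9070183020 |04.02.2011 |107222 |M/S SUNNY MEDICOS |GHAZIABAD | 32,768.00 |";
we can get the required result with:
string[] columns = line1.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < columns.Length; i++) // a for loop, because the foreach variable cannot be reassigned
columns[i] = columns[i].Replace("*", "").Trim();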
This will give following result:
[9070183020] [04.02.2011] [107222] [M/S SUNNY MEDICOS] [GHAZIABAD] [32,768.00]
Use this solution:
string stringwithDelemeterNoEmptyValues= string.Join(",", stringwithDelemeterWithEmptyValues.Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries));