Regex to select all commas up to a specific character - c#

I am having a terrible time with regular expressions. It's terrible for me to admit, but I just don't use them enough to be really good when I need to be. Because of the way our application runs, I have the contents of a .csv file pulled out into a string. I need to insert a new row above and below what already exists. The number of columns can change depending on the report. What I would like to do is grab all the commas, without any other characters (including whitespace), up to the first \r\n in the string. That way I have all the columns, and I can insert a blank row at the top and populate the columns with what I need. Here is an example of the .csv text:
"Date, Account Code, Description, Amount\r\n23-Apr-13,12345,Account1,$12345\r\n"
What I would like the regex to grab:
",,," or ",,,\r\n"
I just cannot seem to get this. Thank you.

You don't need a regex for this.
string firstLine = file.ReadLines().First();
int numCommas = firstLine.Count(c => c == ',');
string commaString = new String(',', numCommas);
If you don't have access to the file.ReadLines() method, you can take the first line from the string instead:
string firstline = test.Substring(0, test.IndexOf(Environment.NewLine));

You actually don't need to complicate your code with regular expressions to accomplish what you want: to count the columns.
Here's an extremely simple method:
String textline = csvtext.Substring(0, csvtext.IndexOfAny(Environment.NewLine.ToCharArray()));
int columns = textline.Split(',').Length;
Now the columns variable has your total number of columns.
The first line grabs just the first line out of the CSV text. The second line splits that text into an array separated by commas (,), and returns the total number.
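Putting the pieces from the answers above together, here is a minimal sketch that builds the blank row directly from the CSV text already held in a string. The names CsvRows, BlankRowFor, and csvText are hypothetical, and the sketch assumes the header line has no quoted fields with embedded commas.

```csharp
using System;
using System.Linq;

public static class CsvRows
{
    // Returns a row of empty columns (commas only) matching the first line of csvText.
    // Assumption: the header has no quoted fields containing commas.
    public static string BlankRowFor(string csvText)
    {
        // Find the end of the first line; accept "\r\n" or a bare "\n".
        int end = csvText.IndexOfAny(new[] { '\r', '\n' });
        string firstLine = end >= 0 ? csvText.Substring(0, end) : csvText;

        // One comma per column separator in the header.
        int numCommas = firstLine.Count(c => c == ',');
        return new string(',', numCommas) + "\r\n";
    }
}
```

With the sample text from the question, CsvRows.BlankRowFor(csvText) should give ",,,\r\n", which can then be prepended or appended to the original string.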

You can use the regex below
(?<=[\d\w\s])(\r|\n|,)(?=[\d\w\s\W])
to match commas and newline characters.
You can then call Regex.Replace("inputstring", "regexpattern", "replacechar", RegexOptions.IgnoreCase).
This can also be done with plain string operations:
string[] strs = inputstr.Split(new string[] { "\n", "\r", "," }, StringSplitOptions.RemoveEmptyEntries);
foreach (string str in strs)
{
    // do your part
}

Related

Find each instance and replace with unique value in string

I have some text in a string as below:
{121:SOMETHING1}}{4:
.
.
.
{121:SOMETHING2}}{4:
.
.
.
{121:SOMETHING3}}{4:
I want to sequentially find and replace the value between 121: and the first }. I can successfully do this using the following code in C#
var rx = @"121:(?<value>[\s\S]+?)}";
string temp = tag121;
string stringToChange = //as above
for (var m = Regex.Match(stringToChange , rx); m.Success; m = m.NextMatch())
{
temp = generateUniqueValue();
stringToChange= Regex.Replace(stringToChange, m.Value, temp);
}
However, if the value between 121: and } is exactly the same for all the tags (e.g. just SOMETHING on every line instead of SOMETHING1, SOMETHING2, etc.), this code does not work. It ends up setting one value for all the lines instead of a unique value for each.
When every line in the string is {121:SOMETHING}}{4:, the first pass of the loop already replaces every SOMETHING in the string with the result of the first call to generateUniqueValue(). You can see this by printing stringToChange at the end of each loop iteration (a good way to debug regex code in general).
You need to consider a new approach, or look again at what you are trying to achieve:
Is it acceptable for at least two lines in the input to have the same value?
Should lines with the same value be replaced with different unique values?
If the answer to both questions is yes, one approach I suggest is to replace line by line. It is probably not optimal, though.
Split the string into lines, for example with String.Split('\n').
Loop over the resulting array with foreach.
Find the part to replace with the regex {121:([^}]+)}; group 1 of the match is the string you need to replace.
After the loop, join the array back into a single string.
Reference:
Match.Groups
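As an alternative to the line-by-line steps above (a different technique, not the one the answer describes), Regex.Replace accepts a MatchEvaluator that is invoked once per match, so identical input values still receive distinct replacements. This is a sketch only: TagReplacer, ReplaceEach, and makeValue are hypothetical names, and makeValue stands in for the asker's generateUniqueValue().

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagReplacer
{
    // Replaces the value after "{121:" in each tag with a fresh value from makeValue.
    // The evaluator runs once per match, so repeated input values get separate outputs.
    public static string ReplaceEach(string input, Func<string> makeValue)
    {
        return Regex.Replace(
            input,
            @"(?<=\{121:)[^}]+",      // text between "{121:" and the first "}"
            match => makeValue());    // hypothetical generator, called per match
    }
}
```

For example, passing a generator that returns "ID1", "ID2", ... turns two identical {121:SOMETHING} tags into {121:ID1} and {121:ID2}.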

Split string returning an extra string at the end of the returned array

I am trying to split a string using ; as a delimiter.
My output is weird. Why is there an empty string at the end of the returned array?
string emails = "bitebari@gmail.com;abcd@gmail.com;";
string[] splittedEmails = emails.TrimEnd().Split(';');
foreach (var email in splittedEmails)
{
Console.WriteLine("Value is :" + email);
}
The console output looks like this:
Value is: bitebari@gmail.com
Value is: abcd@gmail.com
Value is:
The string.Split method doesn't remove empty entries by default, but you can tell it to by passing a StringSplitOptions value. Try calling it with StringSplitOptions.RemoveEmptyEntries:
string[] splittedEmails = emails.Split(';', StringSplitOptions.RemoveEmptyEntries);
Alternatively, pass ';' to TrimEnd. Without an argument it only trims trailing whitespace, so your string keeps the ; at the end and the split produces the empty entry. With the argument it looks like this:
string[] splittedEmails = emails.TrimEnd(';').Split(';');
Both of the solutions above work, it really comes to preference as the performance difference shouldn't be that high.
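The two options are not quite equivalent when the input has empty entries in the middle. The sketch below wraps each variant in a small helper so the results can be compared; the class and method names are hypothetical and exist only for illustration.

```csharp
using System;

public static class SplitDemo
{
    // Default behaviour: a trailing delimiter yields a trailing empty entry.
    public static string[] SplitKeepEmpty(string s) => s.Split(';');

    // Option 1: ask Split to drop every empty entry, wherever it occurs.
    public static string[] SplitDropEmpty(string s) =>
        s.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

    // Option 2: strip trailing delimiters first; empty entries in the middle remain.
    public static string[] SplitTrimmed(string s) => s.TrimEnd(';').Split(';');
}
```

For "a;b;" the first helper returns three entries and the other two return two. For "a;;b;" the RemoveEmptyEntries variant returns two entries, while the TrimEnd variant returns three because the empty middle entry is kept.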
Edit
This behavior is considered to be 'standard' at least in C#, let me quote the MSDN for this one.
This behavior makes it easier for formats like comma separated values (CSV) files representing tabular data. Consecutive commas represent a blank column.
You can pass an optional StringSplitOptions.RemoveEmptyEntries parameter to exclude any empty strings in the returned array. For more complicated processing of the returned collection, you can use LINQ to manipulate the result sequence.
So there is no special case for a trailing delimiter.

How to write data on multiple lines BUT within the same cell of csv?

I want to create one csv file using C#.
I have some data which I want to write on multiple lines BUT within the same cell.
For example
If I have following three sentences,
Sample sentence 1. This is second sample sentence. and this is third sentence.
I want to write all these three sentences within single cell of csv file but I want three of them on separate line.
My expected output is :
Sample sentence 1.
This is second sample sentence.
and this is third sentence.
Currently I am trying to achieve this by using \n character between two sentences but when I do it this way, all three sentences go on separate row.
Can anyone please tell me how to solve this problem?
To quote Wikipedia:
Fields with embedded line breaks must be enclosed within double-quote characters.
For example:
1997,Ford,E350,"Go get one now
they are going fast"
You might be able to change which tokens your application uses to parse rows from the csv file, but according to the definition of the csv file format, lines in the file are used to represent rows in the resulting tabular data.
string input = "...";
var r = input.Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries) // split by dot (have you any better approaches?)
.Select(s => String.Format("\"{0}.\"", s.Trim())); // escape each with quotes
input = String.Join(Environment.NewLine, r); // break by line break
Always wrap your description in double quotes ("\""), e.g.: CsvRow.Add("\"" + ID + " : " + description + "\"");
string filePath = @"d:\pandey.csv";
StringBuilder outString=new StringBuilder();
outString.Append("\""); //double quote
outString.Append("Line 1, Hi I am line 1 of same cell\n");
outString.Append("Line 2 Hi I am line 2 of same cell\n");
outString.Append("Line 3 Hi I am line 3 of same cell\n");
outString.Append("Line 4 Hi I am line 4 of same cell");
outString.Append("\""); //double quote
File.WriteAllText(filePath, outString.ToString());
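The snippets above wrap the cell in double quotes by hand. A slightly more general sketch is a helper that quotes a field only when needed and doubles any embedded double quotes, which is the convention described in RFC 4180. CsvField and Escape are hypothetical names, not part of any library.

```csharp
using System;

public static class CsvField
{
    // Quotes a field when it contains a comma, a double quote, or a line break,
    // and doubles any embedded double quotes. Other fields are returned unchanged.
    public static string Escape(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes)
        {
            return field;
        }
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }
}
```

A multi-line cell can then be written as CsvField.Escape("Sample sentence 1.\nThis is second sample sentence."), and the line break stays inside the quoted field.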
CSV stands for "comma separated values", which is just a description of a text file. If you want to import it into Excel and then have the sentences on separate lines within one cell, you should try:
Environment.NewLine
instead of a simple \n

StringReader formatted string

I have followed this WordPress tutorial, which works great. I have used a ListView, and when I try to format the string it doesn't recognise the \t (but does recognise \n). It also won't recognise String.Format etc.
Is there any way that I can format the string using tabs or something similar?
Cheers
EDIT
for( i = 0; i < lstView.Items.Count;i++)
{
name = lstView.Items[i].Text;
state = lstView.Items[i].SubItems[1].Text;
country = lstView.Items[i].SubItems[2].Text;
line += name + "\t" + state + "\t" + country + "\n";
}
StringReader reader = new StringReader(line);
When line is used for printing, the string is joined together, so the \t doesn't work. The \n for a new line does work, though. Does anyone know a way that I can format the string without using spaces?
The result is like this
NameStateCountry
LongernameStateCountry
anotherNameAnotherStateAnotherCountry
Whereas I would like them lined up (like in a table), with name in one column, state in another column, and country in the third.
Any suggestions greatly appreciated.
Well, it is a bit odd that the tabs are lost, but on the other hand, tabs will probably be problematic if the individual string elements (name, state etc.) vary in length.
What you could do instead is use string.Format() and use fixed column widths.
To get nice visual output this would include a parse step to determine the correct column width.
When this is done, use something like this to use spaces instead of tabs.
string line = string.Format("{0,-20}{1,-20}{2,-20}", "name", "state", "country");
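The "parse step" mentioned above can be sketched as follows: measure the longest cell in each column first, then pad every cell to that width. TableFormatter, Align, and the gap parameter are hypothetical names, and the sketch assumes at least one row, the same number of cells in every row, and a fixed-width font for the output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TableFormatter
{
    // Pads every cell to the width of the longest cell in its column, plus a gap.
    // Assumptions: rows is non-empty and every row has the same number of cells.
    public static List<string> Align(IList<string[]> rows, int gap = 2)
    {
        int columnCount = rows[0].Length;

        // First pass: the widest cell per column decides that column's width.
        int[] widths = Enumerable.Range(0, columnCount)
            .Select(c => rows.Max(r => r[c].Length) + gap)
            .ToArray();

        // Second pass: pad each cell, then trim the padding after the last column.
        return rows
            .Select(r => string.Concat(r.Select((cell, c) => cell.PadRight(widths[c]))).TrimEnd())
            .ToList();
    }
}
```

Each returned string is one aligned line, so the rows collected from the ListView loop could be passed in as string arrays instead of being concatenated with tabs.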
EDIT: Saw that you did not want to use spaces.
In this case, you will probably need to handle this in the printing algorithm itself. You could still separate items with tabs, then for each line split it on tabs, creating an array of items (columns) per line.
For each item, print it using Graphics.DrawString() with a suitable X-position offset.
See the documentation for Graphics.DrawString.

Order the lines in a file by the last character on the line

Can you please help me with this:
I want to build a method in C# which will order the lines of a lot of files by the following rule:
every line contains strings, and the last character in every line is an int.
I want to order the lines in each file by this last character, the int.
Thanks
To order ascending by the last character, interpreted as an integer you could do:
var orderedLines= File.ReadAllLines(@"test.txt")
.OrderBy(line => Convert.ToInt32(line[line.Length-1]))
.ToList();
Edit:
With the clarification in your comment - integer following a space character, can be more than one digit:
var orderedLines= File.ReadAllLines(@"test.txt")
.OrderBy(line => Convert.ToInt32(line.Substring(line.LastIndexOf(" ")+1,
line.Length - line.LastIndexOf(" ")-1)))
.ToList();
You could do something like this, where filename is the name of your file:
// Replace with the actual name of your file
string fileName = "MyFile.txt";
// Read the contents of the file into memory
string[] lines = File.ReadAllLines(fileName);
// Sort the contents of the file based on the number after the last space in each line
var orderedLines = lines.OrderBy(x => Int32.Parse(x.Substring(x.LastIndexOf(' '))));
// Write the lines back to the file
File.WriteAllText(fileName, string.Join(Environment.NewLine, orderedLines));
This is just a rough outline; hopefully it's helpful.
File.WriteAllLines(
pathToWriteTo,
File.ReadLines(pathToReadFrom)
.OrderBy(s => Convert.ToInt32(s.Split(' ').Last()))
);
If the file is large, this could be inefficient, because sorting this way requires reading the entire file into memory.
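The sorting step shared by the answers above can be isolated from the file I/O, which makes it easier to test. This is a minimal sketch: LineSorter and SortByTrailingInt are hypothetical names, and it assumes every line ends with an integer that follows the last space (or is the whole line); otherwise int.Parse throws.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineSorter
{
    // Orders lines by the integer that follows the last space on each line.
    // Assumption: every line ends with such an integer; int.Parse throws otherwise.
    public static List<string> SortByTrailingInt(IEnumerable<string> lines)
    {
        return lines
            .OrderBy(line => int.Parse(line.Substring(line.LastIndexOf(' ') + 1)))
            .ToList();
    }
}
```

It could be combined with the file calls already shown, for example File.WriteAllLines(pathToWriteTo, LineSorter.SortByTrailingInt(File.ReadLines(pathToReadFrom))).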
Assuming you want more than single digit integers and that you have a separation character between the filename and the rest (we'll call it 'splitChar') which can be any character at all:
from string str in File.ReadAllLines(fileName)
let split = str.Split(splitChar)
orderby Int32.Parse(split[split.Count()-1])
select str
will get you a sequence of strings in order of the integer value of the last grouping (separated by the split character).
Maybe one of these links can help you by sorting it the natural way:
Natural Sorting in C#
Sorting for Humans : Natural Sort Order
