I have looked but failed to find a tutorial that would address my question. Perhaps I didn't word my searches correctly. In any event, I am here.
I have taken over a handyman company and he had about 150 customers. He had a program that he bought that produces records for his customers. I have the records, but he wouldn't sell me the program as it is commercial and he's afraid of going to prison for selling something like that... whatever... They are written in a text file. The format appears to be:
string name
string last_job_description
int job_codes
string job_address
string comments
The text file looks like this
*Henderson*
*Cleaned gutters, fed the lawn, added mulch to the front flower beds,
cut the back yard, edged the side walk, pressure washed the driveway*
*04 34 32 1 18 99 32 22 43 72 11 18*
*123 Anywhere ave*
*Ms.always pays cash no tip. Mr. gives you a check but always tips*
Alright.. My question is in C# I want to write a program to edit these records, add new customers and delete some I may lose, moves, or dies... But the 2nd entry is broken over two lines, sometimes 3 and 4 lines. They all start and end with *. So, how do I read the 2 to 4 lines and get them into the last_job_description string variable? I can write the class, I can read lines, I can trim away the asterisks. I can find nothing on reading multiple lines into a single variable.
Let's do it right!
First define the customer model:
public class Customer
{
public string Name { get; set; }
public string LastJobDescription { get; set; }
public List<int> JobCodes { get; set; }
public string JobAddress { get; set; }
public string Comments { get; set; }
}
Then, we need a collection of customers:
var customers = new List<Customer>();
Fill the collection with data from the file:
string text = File.ReadAllText("customers.txt");
string pattern = #"(?<= ^ \*) .+? (?= \* \r? $)";
var options = RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled
| RegexOptions.Singleline | RegexOptions.Multiline;
var matches = Regex.Matches(text, pattern, options);
for (int i = 0; i < matches.Count; i += 5)
{
var customer = new Customer
{
Name = matches[i].Value,
LastJobDescription = matches[i + 1].Value,
JobCodes = matches[i + 2].Value.Split().Select(s => int.Parse(s)).ToList(),
JobAddress = matches[i + 3].Value,
Comments = matches[i + 4].Value
};
customers.Add(customer);
}
I'm using a regular expression that allows to have the * character in the middle of the lines.
Now we can comfortably work with this collection.
Examples of usage.
Remove the first customer:
customers.RemoveAt(0);
Add a comment to the latest client:
customers.Last().Comments += " Very generous.";
Find the first record for a client by the name of Henderson and add the code of the job performed:
customers.Find(c => c.Name == "Henderson").JobCodes.Add(42);
Add new customer:
var customer = new Customer
{
Name = "Chuck Norris",
LastJobDescription= "Saved the world.",
JobCodes = new List<int>() { 1 },
JobAddress = "UN",
Comments = "Nice guy!"
};
customers.Add(customer);
And so on.
To save data to a file, use the following:
var sb = new StringBuilder();
foreach (var customer in customers)
{
sb.Append('*').Append(customer.Name).Append('*').AppendLine();
sb.Append('*').Append(customer.LastJobDescription).Append('*').AppendLine();
sb.Append('*').Append(string.Join(" ", customer.JobCodes)).Append('*').AppendLine();
sb.Append('*').Append(customer.JobAddress).Append('*').AppendLine();
sb.Append('*').Append(customer.Comments).Append('*').AppendLine();
}
File.WriteAllText("customers.txt", sb.ToString());
You probably need a graphical user interface. If so, I suggest you to ask a new question where you specify what you use: WinForms, WPF, Web-application or something else.
if you want to read all the lines in a file into a single variable then you need to do this
var txt = File.ReadAllText(.. your file location here ..);
or you can do
var lines = File.ReadAllLines(.. your file location here); and then you can iterate through each line and remove the blanks of leave as it is.
But based on your question the first line is what you're really after
when you should read last_job_description read second line. if it starts with * it means that this line is job_codes otherwise append it to pervous read line. do this fo every lines until you find job_codes.
using (var fileReader = new StreamReader("someFile"))
{
// read some pervous data here
var last_job_description = fileReader.ReadLine();
var nextLine = fileReader.ReadLine();
while (nextLine.StartsWith("*") == false)
{
last_job_description += nextLine;
nextLine = fileReader.ReadLine();
}
//you have job_codes in nextline, so parse it now and read next data
}
or you may even use the fact that each set of data starts and ENDS!!! with * so you may create function that reads each "set" and returns it as singleline string, no matter how much lines it really was. let's assume that reader variable you pass to this function is the same StreamReader as in upper code
function string readSingleValue(StreamReader fileReader){
var result = "";
do
{
result += fileReader.ReadLine().Trim();
} while (result.EndsWith("*") == false);
return result;
}
I think your problem is little complex. What i am trying to say is it doesn't mention to a specific problem. It is contained at least 3 different programming aspects that you need to know or learn to reach your goal. The steps that you must take are described below.
unfortunately you need to consider your current file as huge string type.
Then you are able to process this string and separate different parts.
Next move is defining a neat, reliable and robust XML file which can hold your data.
Fill your xml with your manipulated string.
If you have a xml file you can simply update it.
There is not built-in method that will parse your file exactly like you want, so you have to get your hands dirty.
Here's an example of a working code for your specific case. I assumed you wanted your job_codes in an array since they're multiple integers separated by spaces.
Each time we encounter a line starting with '*', we increase a counter that'll tell us to work with the next of your properties (name, last job description, etc...).
Each line is appended to the current property.
And obviously, we remove the stars from the beginning and ending of every line.
string name = String.Empty;
string last_job_description = String.Empty;
int[] job_codes = null;
string job_address = String.Empty;
string comments = String.Empty;
int numberOfPropertiesRead = 0;
var lines = File.ReadAllLines("C:/yourfile.txt");
for (int i = 0; i < lines.Count(); i++)
{
var line = lines[i];
bool newProp = line.StartsWith("*");
bool endOfProp = line.EndsWith("*");
if (newProp)
{
numberOfPropertiesRead++;
line = line.Substring(1);
}
if (endOfProp)
line = line.Substring(0, line.Length - 1);
switch (numberOfPropertiesRead)
{
case 1: name += line; break;
case 2: last_job_description += line; break;
case 3:
job_codes = line.Split(' ').Select(el => Int32.Parse(el)).ToArray();
break;
case 4: job_address += line; break;
case 5: comments += line; break;
default:
throw new ArgumentException("Wow, that's too many properties dude.");
}
}
Console.WriteLine("name: " + name);
Console.WriteLine("last_job_description: " + last_job_description);
foreach (int job_code in job_codes)
Console.Write(job_code + " ");
Console.WriteLine();
Console.WriteLine("job_address: " + job_address);
Console.WriteLine("comments: " + comments);
Console.ReadLine();
Related
In my C# program (I'm new to C# so I hope that I'm doing things correctly), I'm trying to read in all of the lines from a text file, which will look something along the lines of this, but with more entries (these are fictional people so don't worry about privacy):
Logan Babbleton ID #: 0000011 108 Crest Circle Mr. Logan M. Babbleton
Pittsburgh PA 15668 SSN: XXX-XX-XXXX
Current Program(s): Bachelor of Science in Cybersecurity
Mr. Carter J. Bairn ID #: 0000012 21340 North Drive Mr. Carter Joseph Bairn
Pittsburgh PA 15668 SSN: XXX-XX-XXXX
Current Program(s): Bachelor of Science in Computer Science
I have these lines read into an array, concentrationArray and want to find the lines that contain the word "Current", split them at the "(s): " in "Program(s): " and print the words that follow. I've done this earlier in my program, but splitting at an ID instead, like this:
nameLine = nameIDLine.Split(new string[] { "ID" }, StringSplitOptions.None)[1];
However, whenever I attempt to do this, I get an error that my index is out of the bounds of my split array (not my concentrationArray). Here's what I currently have:
for (int i = 0; i < concentrationArray.Length; i++)
{
if (concentrationArray[i].Contains("Current"))
{
lstTest.Items.Add(concentrationArray[i].Split(new string[] { "(s): " }, StringSplitOptions.None)[1]);
}
}
Where I'm confused is that if I change the index to 0 instead of 1, it will print everything out perfectly, but it will print out the first half, instead of the second half, which is what I want. What am I doing wrong? Any feedback is greatly appreciated since I'm fairly new at C# and would love to learn what I can. Thanks!
Edit - The only thing that I could think of was that maybe sometimes there wasn't anything after the string that I used to separate each element, but when I checked my text file, I found that was not the case and there is always something following the string used to separate.
You should check the result of split before trying to read at index 1.
If your line doesn't contain a "(s): " your code will crash with the exception given
for (int i = 0; i < concentrationArray.Length; i++)
{
if (concentrationArray[i].Contains("Current"))
{
string[] result = concentrationArray[i].Split(new string[] { "(s): " }, StringSplitOptions.None);
if(result.Length > 1)
lstTest.Items.Add(result[1]);
else
Console.WriteLine($"Line {i} has no (s): followeed by a space");
}
}
To complete the answer, if you always use index 0 then there is no error because when no separator is present in the input string then the output is an array with a single element containing the whole unsplitted string
If the line will always starts with
Current Program(s):
then why don't you just replace it with empty string like this:
concentrationArray[i].Replace("Current Program(s): ", "")
It is perhaps a little easier to understand and more reusable if you separate the concerns. It will also be easier to test. An example might be...
var allLines = File.ReadLines(#"C:\your\file\path\data.txt");
var currentPrograms = ExtractCurrentPrograms(allLines);
if (currentPrograms.Any())
{
lstTest.Items.AddRange(currentPrograms);
}
...
private static IEnumerable<string> ExtractCurrentPrograms(IEnumerable<string> lines)
{
const string targetPhrase = "Current Program(s):";
foreach (var line in lines.Where(l => !string.IsNullOrWhiteSpace(l)))
{
var index = line.IndexOf(targetPhrase);
if (index >= 0)
{
var programIndex = index + targetPhrase.Length;
var text = line.Substring(programIndex).Trim();
if (!string.IsNullOrWhiteSpace(text))
{
yield return text;
}
}
}
}
Here is a bit different approach
List<string> test = new List<string>();
string pattern = "Current Program(s):";
string[] allLines = File.ReadAllLines(#"C:\Users\xyz\Source\demo.txt");
foreach (var line in allLines)
{
if (line.Contains(pattern))
{
test.Add(line.Substring(line.IndexOf(pattern) + pattern.Length));
}
}
or
string pattern = "Current Program(s):";
lstTest.Items.AddRange(File.ReadLines(#"C:\Users\ODuritsyn\Source\demo.xml")
.Where(line => line.Contains(pattern))
.Select(line => line.Substring(line.IndexOf(pattern) + pattern.Length)));
I am reading data from excel file(which is actually a comma separated csv file) columns line-by-line, this file gets send by an external entity.Among the columns to be read is the time, which is in 00.00 format, so a split method is used read all the different columns, however the file sometimes comes with extra columns(commas between the elements) so the split elements are now always correct. Below is the code used to read and split the different columns, this elements will be saved in the database.
public void SaveFineDetails()
{
List<string> erroredFines = new List<string>();
try
{
log.Debug("Start : SaveFineDetails() - Saving Downloaded files fines..");
if (!this.FileLines.Any())
{
log.Info(string.Format("End : SaveFineDetails() - DataFile was Empty"));
return;
}
using (RAC_TrafficFinesContext db = new RAC_TrafficFinesContext())
{
this.FileLines.RemoveAt(0);
this.FileLines.RemoveAt(FileLines.Count - 1);
int itemCnt = 0;
int errorCnt = 0;
int duplicateCnt = 0;
int count = 0;
foreach (var line in this.FileLines)
{
count++;
log.DebugFormat("Inserting {0} of {1} Fines..", count.ToString(), FileLines.Count.ToString());
string[] bits = line.Split(',');
int bitsLength = bits.Length;
if (bitsLength == 9)
{
string fineNumber = bits[0].Trim();
string vehicleRegistration = bits[1];
string offenceDateString = bits[2];
string offenceTimeString = bits[3];
int trafficDepartmentId = this.TrafficDepartments.Where(tf => tf.DepartmentName.Trim().Equals(bits[4], StringComparison.InvariantCultureIgnoreCase)).Select(tf => tf.DepartmentID).FirstOrDefault();
string proxy = bits[5];
decimal fineAmount = GetFineAmount(bits[6]);
DateTime fineCreatedDate = DateTime.Now;
DateTime offenceDate = GetOffenceDate(offenceDateString, offenceTimeString);
string username = Constants.CancomFTPServiceUser;
bool isAartoFine = bits[7] == "1" ? true : false;
string fineStatus = "Sent";
try
{
var dupCheck = db.GetTrafficFineByNumber(fineNumber);
if (dupCheck != null)
{
duplicateCnt++;
string ExportFileName = (base.FileName == null) ? string.Empty : base.FileName;
DateTime FileDate = DateTime.Now;
db.CreateDuplicateFine(ExportFileName, FileDate, fineNumber);
}
else
{
var adminFee = db.GetAdminFee();
db.UploadFTPFineData(fineNumber, fineAmount, vehicleRegistration, offenceDate, offenceDateString, offenceTimeString, trafficDepartmentId, proxy, false, "Imported", username, adminFee, isAartoFine, dupCheck != null, fineStatus);
}
itemCnt++;
}
catch
{
errorCnt++;
}
}
else
{
erroredFines.Add(line);
continue;
}
}
Now the problem is, this file doesn't always come with 9 elements as we expect, for example on this image, the lines are not the same(ignore first line, its headers)
On first line FM is supposed to be part of 36DXGP instead of being two separated elements. This means the columns are now extra. Now this brings us to the issue at hand, which is the time element, beacuse of extra coma, the time is now something else, is now read as 20161216, so the split on the time element is not working at all. So what I did was, read the incorrect line, check its length, if the length is not 9 then, add it to the error list and continue.
But my continue key word doesn't seem to work, it gets into the else part and then goes back to read the very same error line.
I have checked answers on Break vs Continue and they provide good example on how continue works, I introduced the else because the format on this examples did not work for me(well the else did not made any difference neither). Here is the sample data,
NOTE the first line to be read starts with 96
H,1789,,,,,,,,
96/17259/801/035415,FM,36DXGP,20161216,17.39,city hall-cape town,Makofane,200,0,0
MA/80/034808/730,CA230721,20170117,17.43,malmesbury,PATEL,200,0,0,
what is it that I am doing so wrong here
I have found a way to solve my problem, there was an issue with the length of the line because of the trailing comma which caused an empty element, I then got rid of this empty element with this code and determined the new length
bits = bits.Where(x => !string.IsNullOrEmpty(x)).ToArray();
int length = bits.Length
All is well now
I suggest you use the following overload for performance and readability reasons:
line.Split(new char[] {','}, StringSplitOptions.RemoveEmptyEntries)l
is there any way to read text files with fixed columns in C # without using regex and substring?
I want to read a file with fixed columns and transfer the column to an excel file (.xlsx)
example 1
POPULACAO
MUNICIPIO UF CENSO 2010
AC 78.507
AC 15.100
Rio Branco AC 336.038
Sena Madureira AC 38.029
example 2
POPULACAO
MUNICIPIO UF CENSO 2010
AC 78.507
Epitaciolândia AC 15.100
Rio Branco AC 336.038
Sena Madureira AC 38.029
remembering that I have a case as in the second example where a column is blank, I can get the columns and the values using regex and / or substring, but if it appears as a file in Example 2, with the regex line of the file is ignored, so does substring.
Assuming you mean "fixed columns" extremely literally, and every single non-terminal column is exactly the same width, each column is separated by exactly one space, yes, you can get away with using neither regex or substring. If that's the case - and bear in mind that's also suggesting that every single person in the database has a name that's exactly four letters long - then you can just read the file in by lines. Id would be line[0].ToString(), name would be new string(new char[] { line[2], line[3], line[4], line[5]), etc.
Or, for any given value:
var str = new StringBuilder();
for (int i = firstIndex; i < lastIndex; i++)
{
str.Append(line[i]);
}
But this is basically just performing the exact function of Substring. Substring isn't your problem - handling empty values in the first (city) column is. So, for any given line, you need to check whether the line is empty:
foreach (line in yourLines)
{
if (line.Substring(cityStartIndex, cityEndIndex).IsNullOrWhitespace) == "")
{
continue;
}
}
Alternately, if you're sure the city name will always be at the very first index of the line:
foreach (line in yourLines)
{
if (line[0] == ' ') { continue; }
}
And if the value you got from the city cell was valid, you'd store that value and continue on to using Substring with the indices of the rest of the values in the row.
If for whatever reason you don't want to use a regular expression or Substring(), you have a couple of other options:
String.Split, e.g. var columns = line.Split(' ');
String.Chars, using the known widths of each column to build your output;
Why not just use string.Split()?
Something like:
using (StreamReader stream = new StreamReader(file)) {
while (!stream.EndOfStream) {
string line = stream.ReadLine();
if (string.IsNullOrWhitespace(line))
continue;
string[] fields = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
int ID = -1, age = -1;
string name = null, training = null;
ID = int.Parse(fields[0]);
if (fields.Length > 1)
name = fields[1];
if (fields.Length > 2)
age = int.Parse(fields[2]);
if (fields.Length > 3)
training = fields[3];
// do stuff
}
}
Only downside to this is that it will allow fields of arbitrary length. And spaces in fields will break the fields.
As for regular expressions being ignored in the last case, try something like:
Match m = Regex.Match(line, #"^(.{2}) (.{4}) (.{2})( +.+?)?$");
First - define a variable for each column in the file. Then go through the file line by line and assign each column to the correct variable. Substitute the correct start positions and lengths. This should be enough information to get you started parsing your file.
private string id;
private string name;
private string age;
private string training;
while((line = file.ReadLine()) != null)
{
id = line.Substring(0, 3)
name = line.Substring(3, 10)
age = line.Substring(12, 2)
training = line.Substring(14, 10)
...
if (string.IsNullOrWhiteSpace(name))
{
// ignore this line if the name is blank
}
else
{
// do something useful
}
counter++;
}
I'm running three counters, one to return the total amount of chars, one to return the number of '|' chars in my .txt file (total). And one to read how many separate lines are in my text file. I'm assuming my counters are wrong, I'm not sure. In my text file there are some extra '|' chars, but that is a bug I need to fix later...
The Message Boxes show
"Lines = 8"
"Entries = 8"
"Total Chars = 0"
Not sure if it helps but the .txt file is compiled using a streamwriter, and I have a datagridview saved to a string to create the output. Everything seems okay with those functions.
Here is a copy of the text file I'm reading
Matthew|Walker|MXW320|114282353|True|True|True
Audrey|Walker|AXW420|114282354|True|True|True
John|Doe|JXD020|111222333|True|True|False
||||||
And here is the code.
private void btnLoadList_Click(object sender, EventArgs e)
{
var loadDialog = new OpenFileDialog
{
InitialDirectory = Convert.ToString(Environment.SpecialFolder.MyDocuments),
Filter = "Text (*.txt)|*.txt",
FilterIndex = 1
};
if (loadDialog.ShowDialog() != DialogResult.OK) return;
using (new StreamReader(loadDialog.FileName))
{
var lines = File.ReadAllLines(loadDialog.FileName);//Array of all the lines in the text file
foreach (var assocStringer in lines)//For each assocStringer in lines (Runs 1 cycle for each line in the text file loaded)
{
var entries = assocStringer.Split('|'); // split the line into pieces (e.g. an array of "Matthew", "Walker", etc.)
var obj = (Associate) _bindingSource.AddNew();
if (obj == null) continue;
obj.FirstName = entries[0];
obj.LastName = entries[1];
obj.AssocId = entries[2];
obj.AssocRfid = entries[3];
obj.CanDoDiverts = entries[4];
obj.CanDoMhe = entries[5];
obj.CanDoLoading = entries[6];
}
}
}
Hope you guys find the bug(s) here. Sorry if the formatting is sloppy I'm self-taught, no classes. Any extra advice is welcomed, be as honest and harsh as need be, no feelings will be hurt.
In summary
Why is this program not reading the correct values from the text file I'm using?
Not totally sure I get exactly what you're trying to do, so correct me if I'm off, but if you're just trying to get the line count, pipe (|) count and character count for the file the following should get you that.
var lines = File.ReadAllLines(load_dialog.FileName);
int lineCount = lines.Count();
int totalChars = 0;
int totalPipes = 0; // number of "|" chars
foreach (var s in lines)
{
var entries = s.Split('|'); // split the line into pieces (e.g. an array of "Matthew", "Walker", etc.)
totalChars += s.Length; // add the number of chars on this line to the total
totalPipes = totalPipes + entries.Count() - 1; // there is always one more entry than pipes
}
All the Split() is doing is breaking the full line into an array of the individual fields in the string. Since you only seem to care about the number of pipes and not the fields, I'm not doing much with it other than determining the number of pipes by taking the number of fields and subtracting one (since you don't have a trailing pipe on each line).
For example a string contains the following (the string is variable):
http://www.google.comhttp://www.google.com
What would be the most efficient way of removing the duplicate url here - e.g. output would be:
http://www.google.com
I assume that input contains only urls.
string input = "http://www.google.comhttp://www.google.com";
// this will get you distinct URLs but without "http://" at the beginning
IEnumerable<string> distinctAddresses = input
.Split(new[] {"http://"}, StringSplitOptions.RemoveEmptyEntries)
.Distinct();
StringBuilder output = new StringBuilder();
foreach (string distinctAddress in distinctAddresses)
{
// when building the output, insert "http://" before each address so
// that it resembles the original
output.Append("http://");
output.Append(distinctAddress);
}
Console.WriteLine(output);
Efficiency has various definitions: code size, total execution time, CPU usage, space usage, time to write the code, etc. If you want to be "efficient", you should know which one of these you're trying for.
I'd do something like this:
string url = "http://www.google.comhttp://www.google.com";
if (url.Length % 2 == 0)
{
string secondHalf = url.Substring(url.Length / 2);
if (url.StartsWith(secondHalf))
{
url = secondHalf;
}
}
Depending on the kinds of duplicates you need to remove, this may or may not work for you.
collect strings into list and use distinct, if your string has http address you can apply regex http:.+?(?=((http:)|($)) with RegexOptions.SingleLine
var distinctList = list.Distinct(StringComparer.CurrentCultureIgnoreCase).ToList();
Given you don't know the length of the string, you don't know if something is double and you don't know what is double:
string yourprimarystring = "http://www.google.comhttp://www.google.com";
int firstCharacter;
string temp;
for(int i = 0; i <= yourprimarystring.length; i++)
{
for(int j = 0; j <= yourprimarystring.length; j++)
{
string search = yourprimarystring.substring(i,j);
firstCharacter = yourprimaryString.IndexOf(search);
if(firstCharacter != -1)
{
temp = yourprimarystring.substring(0,firstCharacter) + yourprimarystring.substring(firstCharacter + j - i,yourprimarystring.length)
yourprimarystring = temp;
}
}
This itterates through all your elements, takes all out from first to last letter and searches for them like this:
ABCDA - searches for A finds A exludes A, thats the problem, you need to specify how long the duplication needs to be if you want to make it variable, but maybe my code helps you.