Difficulty using OrderBy in a foreach loop in C#

I have been trying to order my files by the substring at the end of their names, which ends with a number indicating each file's position relative to the rest (example: fs-1632_1.txt --> fs-1632_2.txt).
I am currently able to get the numbers and turn them into ints; I just have problems getting the OrderBy method to work correctly. I am mostly working off of this example of OrderBy.
internal class Data
{
public string Name { get; set; }
public double Number { get; set; }
}
private void OrderByEx1(List<FileInfo> files)
{
int num = 0;
int index_num = 0;
string file_num = "";
string file_name = "";
foreach (FileInfo file in files)
{
file_name = file.FullName;
file_name = Path.GetFileNameWithoutExtension(file_name);
index_num = file_name.LastIndexOf("_") + 1;
file_num = file_name.Substring(index_num);
num = Int32.Parse(file_num);
Data[] set = {new Data {Name = file_name, Number = num }};
}
IEnumerable<Data> query = set.OrderBy(data => data.Number);
foreach (Data file_s in query)
MessageBox.Show($"{file_s.Name} {file_s.Number}");
}

No need for the foreach-loop. You could use this safe LINQ approach:
files = files
.Select(f => new { File = f, Name = Path.GetFileNameWithoutExtension(f.Name) })
.Select(x => new
{
x.File,
x.Name,
Token = x.Name.Substring(x.Name.LastIndexOf("_", StringComparison.Ordinal) + 1)
})
.Select(x => new
{
x.File,
x.Name,
x.Token,
IsInt = int.TryParse(x.Token, out int number),
ParsedNumber = number
})
.OrderByDescending(x => x.IsInt)
.ThenBy(x => x.ParsedNumber)
.Select(x => x.File)
.ToList();
If there is no number, or it can't be parsed to an int, the file will be listed at the bottom.
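To illustrate with a few hypothetical names (same ordering logic as above, minus the FileInfo plumbing), a sketch like this puts the numberless name last:
var names = new[] { "fs-1632_10.txt", "notes.txt", "fs-1632_2.txt" };
var ordered = names
    .Select(n => Path.GetFileNameWithoutExtension(n))
    .Select(n => new
    {
        Name = n,
        Token = n.Substring(n.LastIndexOf("_", StringComparison.Ordinal) + 1)
    })
    .Select(x => new
    {
        x.Name,
        IsInt = int.TryParse(x.Token, out int number),
        ParsedNumber = number
    })
    .OrderByDescending(x => x.IsInt)
    .ThenBy(x => x.ParsedNumber)
    .Select(x => x.Name)
    .ToList();
// ordered: "fs-1632_2", "fs-1632_10", "notes"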

You are declaring a Data array named set, adding a single element to it, and then restarting the loop, forgetting what you loaded in the previous iteration. The ordering is executed only when you exit the loop, but at that point the set array contains just a single element, the last one.
You need to add each Data item to a list and then order that list:
List<Data> dataFiles = new List<Data>();
foreach (FileInfo file in files)
{
file_name = file.FullName;
file_name = Path.GetFileNameWithoutExtension(file_name);
index_num = file_name.LastIndexOf("_") + 1;
file_num = file_name.Substring(index_num);
num = Int32.Parse(file_num);
dataFiles.Add(new Data {Name = file_name, Number = num });
}
// If you don't need the query var you can just order directly in the for loop
// IEnumerable<Data> query = dataFiles.OrderBy(data => data.Number);
foreach (Data file_s in dataFiles.OrderBy(data => data.Number))
{
MessageBox.Show(file_s.Name + " " + file_s.Number);
}

Related

binary search in a sorted list in c#

I am retrieving client id/drum id pairs from a file and storing them in a list,
then taking the client id and storing it in another list.
I need to display the client id that the user specifies (input_id) on a DataGrid.
I need to get all the occurrences of this specific id using binary search.
The file is already sorted.
I first need to find the occurrences of input_id in id_list.
The question is: how to find all the occurrences of input_id in the sorted list id_list using binary search?
using(StreamReader sr= new StreamReader(path))
{
List<string> id_list = new List<string>();
List<string> all_list= new List<string>();
List<int> indexes = new List<int>();
string line = sr.ReadLine();
line = sr.ReadLine();
while (line != null)
{
all_list.Add(line);
string[] break1 = line.Split('/');
id_list.Add(break1[0]);
line = sr.ReadLine();
}
}
string input_id = textBox1.Text;
Data in the file:
client id/drum id
-----------------
123/321
231/3213
321/213123 ...
If the requirement was to use binary search I would create a custom class with a comparer, and then find an element and loop forward/backward to get any other elements. Like:
static void Main(string[] args)
{
var path = @"file path...";
// read all the Ids from the file.
var id_list = File.ReadLines(path).Select(x => new Drum
{
ClientId = x.Split('/').First(),
DrumId = x.Split('/').Last()
}).OrderBy(o => o.ClientId).ToList();
var find = new Drum { ClientId = "231" };
var index = id_list.BinarySearch(find, new DrumComparer());
if (index >= 0)
{
List<Drum> matches = new List<Drum>();
matches.Add(id_list[index]);
//get previous matches
for (int i = index - 1; i >= 0; i--)
{
if (id_list[i].ClientId == find.ClientId)
matches.Add(id_list[i]);
else
break;
}
//get forward matches
for (int i = index + 1; i < id_list.Count; i++)
{
if (id_list[i].ClientId == find.ClientId)
matches.Add(id_list[i]);
else
break;
}
}
}
public class Drum
{
public string DrumId { get; set; }
public string ClientId { get; set; }
}
public class DrumComparer : Comparer<Drum>
{
public override int Compare(Drum x, Drum y) =>
x.ClientId.CompareTo(y.ClientId);
}
If I understand your question right, then this should be a simple Where statement.
// read all the Ids from the file.
var Id_list = File.ReadLines(path).Select(x => new {
ClientId = x.Split('/').First(),
DrumId = x.Split('/').Last()
}).ToList();
var foundIds = Id_list.Where(x => x.ClientId == input_id);

How to Get Groups of Files from GetFiles()

I have to process files everyday. The files are named like so:
fg1a.mmddyyyy
fg1b.mmddyyyy
fg1c.mmddyyyy
fg2a.mmddyyyy
fg2b.mmddyyyy
fg2c.mmddyyyy
fg2d.mmddyyyy
If the entire file group is there for a particular date, I can process it. If it isn't there, I should not process it. I may have several partial file groups that run over several days. So when I have fg1a.12062017, fg1b.12062017 and fg1c.12062017, I can process that group (fg1) only.
Here is my code so far. It doesn't work because I can't figure out how to get only the full groups to add to the processing file list.
fileList = Directory.GetFiles(@"c:\temp\");
string[] fileGroup1 = { "FG1A", "FG1B", "FG1C" }; // THIS IS A FULL GROUP
string[] fileGroup2 = { "FG2A", "FG2B", "FG2C", "FG2D" };
List<string> fileDates = new List<string>();
List<string> procFileList;
// get a list of file dates
foreach (string fn in fileList)
{
string dateString = fn.Substring(fn.IndexOf('.'), 9);
if (!fileDates.Contains(dateString))
{
fileDates.Add(dateString);
}
}
bool allFiles = true;
foreach (string fg in fileGroup1)
{
foreach (string fd in fileDates)
{
string finder = fg + fd;
bool foundIt = false;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
foundIt = true;
}
}
if (!foundIt)
{
allFiles = false;
}
else
{
foreach (string fn in fileList)
{
procFileList.Add(fn);
}
}
}
}
foreach (string fg in fileGroup2)
{
foreach (string fd in fileDates)
{
string finder = fg + fd;
bool foundIt = false;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
foundIt = true;
}
}
if (!foundIt)
{
allFiles = false;
}
else
{
foreach (string fn in fileList)
{
procFileList.Add(fn);
}
}
}
}
Any help or advice would be greatly appreciated.
Because it can sometimes get messy dealing with multiple lists, groupings, and parsing file names, I would start by creating a class that represents a FileGroupItem. This class would have a Parse method that takes in a file path, and then has properties that represent the group part and date part of the file name, as well as the full path to the file:
public class FileGroupItem
{
public string DatePart { get; set; }
public string GroupName { get; set; }
public string FilePath { get; set; }
public static FileGroupItem Parse(string filePath)
{
if (string.IsNullOrWhiteSpace(filePath)) return null;
// Split the file name on the '.' character to get the group and date parts
var fileParts = Path.GetFileName(filePath).Split('.');
if (fileParts.Length != 2) return null;
return new FileGroupItem
{
GroupName = fileParts[0],
DatePart = fileParts[1],
FilePath = filePath
};
}
}
Then, in my main code, I would create a list of the file group definitions, and then populate a list of FileGroupItems from the directory we're scanning. After that, we can determine whether any file group definition is complete by comparing its items (in a case-insensitive way) to the actual FileGroupItems we found in the directory (after first grouping the FileGroupItems by their DatePart). If the intersection of these two lists has the same number of items as the file group definition, then it's complete and we can process that group.
Maybe it will make more sense in code:
private static void Main()
{
var scanDirectory = @"f:\public\temp\";
var processedDirectory = @"f:\public\temp2\";
// The lists that define a complete group
var fileGroupDefinitions = new List<List<string>>
{
new List<string> {"FG1A", "FG1B", "FG1C"},
new List<string> {"FG2A", "FG2B", "FG2C", "FG2D"}
};
// Populate a list of FileGroupItems from the files
// in our directory, and group them on the DatePart
var fileGroups = Directory.EnumerateFiles(scanDirectory)
.Select(FileGroupItem.Parse)
.GroupBy(f => f.DatePart);
// Now go through each group and compare the items
// for that date with our file group definitions
foreach (var fileGroup in fileGroups)
{
foreach (var fileGroupDefinition in fileGroupDefinitions)
{
// Get the intersection of the group definition and this file group
var intersection = fileGroup
.Where(f => fileGroupDefinition.Contains(
f.GroupName, StringComparer.OrdinalIgnoreCase))
.ToList();
// If all the items in the definition are there, then process the files
if (intersection.Count == fileGroupDefinition.Count)
{
foreach (var fileGroupItem in intersection)
{
Console.WriteLine($"Processing file: {fileGroupItem.FilePath}");
// Move the file to the processed directory
File.Move(fileGroupItem.FilePath,
Path.Combine(processedDirectory,
Path.GetFileName(fileGroupItem.FilePath)));
}
}
}
}
Console.WriteLine("\nDone!\nPress any key to exit...");
Console.ReadKey();
}
I think you could simplify your algorithm so you just have file groups as a prefix and a number of files to expect: fg1 is 3 files for a given date.
I think your code to find the distinct dates present is a good idea, though you should use a hash set rather than a list if you occasionally expect a large number of dates ("Valentine's Day?" - Ed).
Then you just need to work on the other loop that does the checking. An algorithm like this:
//make a new Dictionary<string,int> for the filegroup prefixes and their counts
//eg myDict["fg1"] = 3; myDict["fg2"] = 4;
//list the files in the directory, into an array of fileinfo objects
//see the DirectoryInfo.GetFiles method
//foreach string d in the list of dates
//foreach string fgKey in myDict.Keys - the list of group prefixes
//use a bit of Linq to get all the fileinfos with a
//name starting with the group and ending with the date
var grplist = myfileinfos.Where(fi => fi.Name.StartsWith(fgKey) && fi.Name.EndsWith(d));
//if the grplist.Count == the filegroup count ( myDict[fgKey] )
//then send every file in grplist for processing
//remember that grplist is a collection of fileinfo objects,
//if your processing method takes a string filename, use fileinfo.Fullname
Putting your file groupings into one dictionary will make things a lot easier than having them as x separate arrays
I haven't written all the code for you, but I've comment-sketched the algorithm, and I've put in some of the more awkward bits like the Linq, the dictionary declaration and how to fill it. Have a go at fleshing it out with code, and ask any questions in a comment on this post.
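For what it's worth, here is a minimal sketch of that outline (the prefixes, counts and folder path are assumptions taken from the question, not tested code):
// file group prefixes and how many files complete each group
var myDict = new Dictionary<string, int> { { "fg1", 3 }, { "fg2", 4 } };
// list the files in the directory as FileInfo objects
var myfileinfos = new DirectoryInfo(@"c:\temp\").GetFiles();
// distinct dates present, e.g. ".12062017" (a HashSet keeps them unique)
var dates = new HashSet<string>(myfileinfos.Select(fi => Path.GetExtension(fi.Name)));
var procFileList = new List<string>();
foreach (string d in dates)
{
    foreach (string fgKey in myDict.Keys)
    {
        var grplist = myfileinfos
            .Where(fi => fi.Name.StartsWith(fgKey, StringComparison.OrdinalIgnoreCase)
                      && fi.Name.EndsWith(d, StringComparison.OrdinalIgnoreCase))
            .ToList();
        if (grplist.Count == myDict[fgKey])   // the group is complete for this date
            procFileList.AddRange(grplist.Select(fi => fi.FullName));
    }
}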
First, create an array of the groups to make processing easier:
var fileGroups = new[] {
new[] { "FG1A", "FG1B", "FG1C" },
new[] { "FG2A", "FG2B", "FG2C", "FG2D" }
};
Then you can convert the array into a Dictionary to map each name back to its group:
var fileGroupMap = fileGroups.SelectMany(g => g.Select(f => new { key = f, group = g })).ToDictionary(g => g.key, g => g.group);
Then, preprocess the files you get from the directory:
var fileList = from fname in Directory.GetFiles(...)
select new {
fname,
fdate = Path.GetExtension(fname),
ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
};
Now you can take your fileList and group by date and group, and then filter to just completed groups:
var profFileList = (from file in fileList
group file by new { file.fdate, fgroup = fileGroupMap[file.ffilename] } into fng
where fng.Key.fgroup.All(f => fng.Select(fn => fn.ffilename).Contains(f))
from fn in fng
select fn.fname).ToList();
Since you didn't preserve the groups, I flattened the groups at the end of the query into just a list of files to be processed. If you needed, you could keep them in groups and process the groups instead.
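If you did need to keep them grouped, a hedged variation of the same query (same assumptions as above) could simply skip the final flattening:
var completedGroups = (from file in fileList
                       group file by new { file.fdate, fgroup = fileGroupMap[file.ffilename] } into fng
                       where fng.Key.fgroup.All(f => fng.Select(fn => fn.ffilename).Contains(f))
                       select fng.Select(fn => fn.fname).ToList()).ToList();
foreach (var completedGroup in completedGroups)
{
    // process one complete group (one date, one group definition) at a time
}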
Note: If a file exists that belongs to no group, you will get an error from the lookup in fileGroupMap. If that is a possibility, you can filter the fileList to just known names as follows:
var fileList = from fname in Directory.GetFiles(...)
let ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
where fileGroupMap.Keys.Contains(ffilename)
select new {
fname,
fdate = Path.GetExtension(fname),
ffilename
};
Also note that having a name in multiple groups will cause an error in the creation of fileGroupMap. If that is a possibility, the queries would become more complex and have to be written differently.
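If duplicate names across groups were a real possibility, one way around the ToDictionary exception (an assumption on my part, not part of the answer above) is a Lookup, at the cost of then having to consider several candidate groups per file name:
var fileGroupLookup = fileGroups
    .SelectMany(g => g.Select(f => new { key = f, group = g }))
    .ToLookup(x => x.key, x => x.group);
// a name now maps to zero or more group definitions
var candidateGroups = fileGroupLookup["FG1A"];   // IEnumerable<string[]>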
Here is a simple class
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string[] filenames = { "fg1a.12012017", "fg1b.12012017", "fg1c.12012017", "fg2a.12012017", "fg2b.12012017", "fg2c.12012017", "fg2d.12012017" };
new SplitFileName(filenames);
List<List<SplitFileName>> results = SplitFileName.GetGroups();
}
}
public class SplitFileName
{
public static List<SplitFileName> names = new List<SplitFileName>();
string filename { get; set; }
string prefix { get; set; }
string letter { get; set; }
DateTime date { get; set; }
public SplitFileName() { }
public SplitFileName(string[] splitNames)
{
foreach(string name in splitNames)
{
SplitFileName splitName = new SplitFileName();
names.Add(splitName);
splitName.filename = name;
string[] splitArray = name.Split(new char[] { '.' });
splitName.date = DateTime.ParseExact(splitArray[1],"MMddyyyy", System.Globalization.CultureInfo.InvariantCulture);
splitName.prefix = splitArray[0].Substring(0, splitArray[0].Length - 1);
splitName.letter = splitArray[0].Substring(splitArray[0].Length - 1,1);
}
}
public static List<List<SplitFileName>> GetGroups()
{
return names.OrderBy(x => x.letter).GroupBy(x => new { date = x.date, prefix = x.prefix })
.Where(x => string.Join(",",x.Select(y => y.letter)) == "a,b,c,d")
.Select(x => x.ToList())
.ToList();
}
}
}
With everyone's help, I solved it too. This is what I'm going with because it's the most maintainable for me but the solutions were so smart!!! Thanks everyone for your help.
private void CheckFiles()
{
var fileGroups = new[] {
new [] { "FG1A", "FG1B", "FG1C", "FG1D" },
new[] { "FG2A", "FG2B", "FG2C", "FG2D", "FG2E" } };
List<string> fileDates = new List<string>();
List<string> pfiles = new List<string>();
// get a list of file dates
foreach (string fn in fileList)
{
string dateString = fn.Substring(fn.IndexOf('.'), 9);
if (!fileDates.Contains(dateString))
{
fileDates.Add(dateString);
}
}
// check if a date has all the files
foreach (string fd in fileDates)
{
int fgCount = 0;
// for each file group
foreach (Array masterfg in fileGroups)
{
foreach (string fg in masterfg)
{
// see if all the files are there
bool foundIt = false;
string finder = fg + fd;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
pfiles.Add(fn);
}
}
fgCount++;
}
if (fgCount == pfiles.Count())
{
foreach (string fn in pfiles)
{
procFileList.Add(fn);
}
pfiles.Clear();
}
else
{
pfiles.Clear();
}
}
}
return;
}

PDF Multi-level break and print

I am trying to print documents by simplex/duplex and then by envelope type (pressure seal or regular).
I have Boolean fields for Simplex and for PressureSeal in my Record class.
All pressure seal documents are simplex; then there are regular simplex and duplex documents.
I can currently print the pressure seal documents separately from the regular simplex ones. I need to be able to create the regular duplex documents.
I have some lines commented out that caused all documents to be duplicated.
So, I am looking for something that works like so:
if (Simplex)
    if (pressureseal)
        create output file
    else
        create regular simplex output file
else
    create duplex output file
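For reference, that three-way split can also be expressed as a single grouping on both flags; this is only a sketch built on the Record fields described above (Simplex and PressureSeal), not the full output logic:
var byOutputType = NoticeParser.records
    .GroupBy(x => new { x.Simplex, x.PressureSeal });
foreach (var outputGroup in byOutputType)
{
    string outputType = !outputGroup.Key.Simplex ? "Duplex"
                      : outputGroup.Key.PressureSeal ? "PressureSeal"
                      : "RegularSimplex";
    // hand outputGroup.ToList() to the existing split/print logic for this type
}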
Here is my existing code
#region Mark Records By Splits
//splits - 3,7,13
var splits = Properties.Settings.Default.Splits.Split(',');
Dictionary<int, int> splitRanges = new Dictionary<int, int>();
int lastSplit = 0;
foreach (var split in splits)
{
// Attempt to convert split into an integer and skip it if we can't.
int splitNum;
if (!int.TryParse(split, out splitNum))
continue;
splitRanges.Add(lastSplit, splitNum);
lastSplit = Math.Max(lastSplit, splitNum + 1);
}
// Assign record splits.
foreach (var range in splitRanges)
{
var recordsInRange = NoticeParser.records
.Where(x => x.Sheets >= range.Key && x.Sheets <= range.Value)
.ToList();
recordsInRange.ForEach(x => x.Split = string.Format("{0}-{1}", range.Key, range.Value));
}
var unassignedRecords = NoticeParser.records.Where(x => x.Sheets >= lastSplit).ToList();
unassignedRecords.ForEach(x => x.Split = string.Format("{0}up", lastSplit));
#endregion
#region Sort out Pressure Seal records
var recordsGroupedByPressureSeal = NoticeParser.records
.GroupBy(x=>x.PressureSeal);
//var recordsGroupedBySimplex = NoticeParser.records.GroupBy(x => x.Simplex);
#endregion
int fCount = 0;
int nsCount = 0;
//foreach (var simdupGroup in recordsGroupedBySimplex)
//{
// var recordsGroupedBySimDup = simdupGroup.GroupBy(x => x.Split).OrderBy(x => x.Key).ToDictionary(x => x.Key, x => x.ToList());
foreach (var pressureGroup in recordsGroupedByPressureSeal)
{
var recordsGroupedBySplit = pressureGroup.GroupBy(x => x.Split).OrderBy(x => x.Key).ToDictionary(x => x.Key, x => x.ToList());
foreach (var recordsInSplit in recordsGroupedBySplit.Values)
{
string processingExecutable = Path.Combine(Properties.Settings.Default.RootFolder, Properties.Settings.Default.ProcessingExecutable);
string toProcessingFile = string.Format(Properties.Settings.Default.OutputFolder + "{0}_" + "toBCC.txt", fCount);
string fromProcessingFile = string.Format(Properties.Settings.Default.OutputFolder + "IBC_LN_Sort_FromBCC.txt");
// If a sortation executable is specified, run it.
if (recordsInSplit.Count >= Properties.Settings.Default.MinimumSortationCount &&
File.Exists(processingExecutable))
{
// log.Info("Sorting records...");
var processedRecords = recordsInSplit.ProcessAddresses<Record, RecordMap>(
processingExecutable,
toProcessingFile,
fromProcessingFile);
// Update records with the sortation fields.
recordsInSplit.UpdateAddresses(processedRecords);
}
else
{
toProcessingFile = string.Format(Properties.Settings.Default.OutputFolder + "{0}_no_sort_toBCC.txt", nsCount);
fromProcessingFile = string.Format(Properties.Settings.Default.OutputFolder + "IBC_LN_NoSort_FromBCC.txt");
//var processedRecords = recordsInSplit.ProcessAddresses<Record, RecordMap>(
// processingExecutable,
// toProcessingFile,
// fromProcessingFile);
// Update records with the sortation fields.
// recordsInSplit.UpdateAddresses(processedRecords);
// If not sorted, provide our own sequence number.
int sequence = 1;
recordsInSplit.ForEach(x => x.SequenceNumber = sequence++);
recordsInSplit.ForEach(x => x.TrayNumber = 1);
nsCount++;
}
fCount++;
}
}
//}
NoticeWriter noticeWriter = new NoticeWriter(noticeParser.reader);
#region Print by PressureSeal or Regular
//foreach (var simdupGroup in recordsGroupedBySimplex)
//{
// string printType = null;
// if (simdupGroup.Key)
// printType = "Simplex";
// else
// printType = "Duplex";
foreach (var splitGroup in recordsGroupedByPressureSeal)
{
string envType = ""; // envelope type
if (splitGroup.Key)
envType = "PressureSeal";
else
envType = "Regular";
var recordsGroupedBySplit = splitGroup.GroupBy(x => x.Split).OrderBy(x => x.Key).ToDictionary(x => x.Key, x => x.ToList());
foreach (var recordsInSplit in recordsGroupedBySplit)
{
string outputName = string.Format("IBC_Daily_Notices_{0}_{1}",envType, /*printType,*/ recordsInSplit.Key);
noticeWriter.WriteOutputFiles(Properties.Settings.Default.OutputFolder, outputName, recordsInSplit.Value, Properties.Settings.Default.RecordsPerBatch);
}
}
//}
#endregion

Sort a List in which each element contains 2 Values

I have a text file that contains values in the format Time|ID:
180|1
60 |2
120|3
Now I want to sort them by Time. The Output also should be:
60 |2
120|3
180|1
How can I solve this problem? With this:
var path = @"C:\Users\admin\Desktop\test.txt";
List<string> list = File.ReadAllLines(path).ToList();
list.Sort();
for (var i = 0; i < list.Count; i++)
{
Console.WriteLine(list[i]);
}
I had no success...
3 steps are necessary to do the job:
1) split by the separator
2) convert to int because in a string comparison a 6 comes after a 1 or 10
3) use OrderBy to sort your collection
Here is a linq solution in one line doing all 3 steps:
list = list.OrderBy(x => Convert.ToInt32(x.Split('|')[0])).ToList();
Explanation
x => lambda expression, x denotes a single element in your list
x.Split('|')[0] splits each string and takes only the first part of it (time)
Convert.ToInt32(.. converts the time into a number so that the ordering will be done in the way you desire
list.OrderBy( sorts your collection
EDIT:
Just to understand why you got that result in the first place, here is an example of comparing numbers in string representation using the CompareTo method:
int res = "6".CompareTo("10");
res will have the value of 1 (meaning that 6 is larger than 10 or 6 follows 10)
According to the documentation->remarks:
The CompareTo method was designed primarily for use in sorting or alphabetizing operations.
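A quick way to see the difference in a sort (hypothetical values):
var asStrings = new List<string> { "180", "60", "120" };
asStrings.Sort();                                        // string comparison
// asStrings is now "120", "180", "60"
var asNumbers = asStrings.OrderBy(s => Convert.ToInt32(s)).ToList();
// asNumbers is now "60", "120", "180"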
You should parse each line of the file content and get values as numbers.
string[] lines = File.ReadAllLines("path");
// ID, time
var dict = new Dictionary<int, int>();
// Processing each line of the file content
foreach (var line in lines)
{
string[] splitted = line.Split('|');
int time = Convert.ToInt32(splitted[0]);
int ID = Convert.ToInt32(splitted[1]);
// Key = ID, Value = Time
dict.Add(ID, time);
}
var orderedListByID = dict.OrderBy(x => x.Key).ToList();
var orderedListByTime = dict.OrderBy(x => x.Value).ToList();
Note that I use your ID as the dictionary key, assuming that the ID is unique.
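If the IDs could repeat, a hedged alternative (again just a sketch) is to keep the pairs in a plain list instead of a dictionary:
var pairs = lines
    .Select(line => line.Split('|'))
    .Select(p => new { Time = Convert.ToInt32(p[0]), ID = Convert.ToInt32(p[1]) })
    .OrderBy(p => p.Time)
    .ToList();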
Short code version
// Key = ID Value = Time
var orderedListByID = lines.Select(x => x.Split('|')).ToDictionary(x => Convert.ToInt32(x[1]), x => Convert.ToInt32(x[0])).OrderBy(x => x.Key).ToList();
var orderedListByTime = lines.Select(x => x.Split('|')).ToDictionary(x => Convert.ToInt32(x[1]), x => Convert.ToInt32(x[0])).OrderBy(x => x.Value).ToList();
You need to convert them to numbers first. Sorting by string won't give you meaningful results.
var times = list.Select(l => l.Split('|')[0]).Select(Int32.Parse);
var ids = list.Select(l => l.Split('|')[1]).Select(Int32.Parse);
var pairs = times.Zip(ids, (t, id) => new { Time = t, Id = id })
.OrderBy(x => x.Time)
.ToList();
Thank you all, this is my Solution:
var path = @"C:\Users\admin\Desktop\test.txt";
List<string> list = File.ReadAllLines(path).ToList();
list = list.OrderBy(x => Convert.ToInt32(x.Split('|')[0])).ToList();
for(var i = 0; i < list.Count; i++)
{
Console.WriteLine(list[i]);
}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class TestClass {
public static void main(String[] args) {
List <LineItem> myList = new ArrayList<LineItem>();
myList.add(LineItem.getLineItem(500, 30));
myList.add(LineItem.getLineItem(300, 20));
myList.add(LineItem.getLineItem(900, 100));
System.out.println(myList);
Collections.sort(myList);
System.out.println("list after sort");
System.out.println(myList);
}
}
class LineItem implements Comparable<LineItem>{
int time;
int id ;
@Override
public String toString() {
return ""+ time + "|"+ id + " ";
}
@Override
public int compareTo(LineItem o) {
return this.time-o.time;
}
public static LineItem getLineItem( int time, int id ){
LineItem l = new LineItem();
l.time=time;
l.id=id;
return l;
}
}

How to make a custom parser for a text file

I set up four columns using a DataTable and I want those columns to retrieve values from a text file. I used a regex to remove a particular line from the text file.
My objective is to show the text file in a grid using the DataTable, so first I am trying to create the DataTable and remove the line (shown in the program) using the regex.
Here I post my full code.
namespace class
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
StreamReader sreader = File.OpenText(@"C:\FareSearchRegex.txt");
string line;
DataTable dt = new DataTable();
DataRow dr;
dt.Columns.Add("PTC");
dt.Columns.Add("CUR");
dt.Columns.Add("TAX");
dt.Columns.Add("FARE BASIS");
while ((line = sreader.ReadLine()) != null)
{
var pattern = "---------- RECOMMENDATION 1 OF 3 IN GROUP 1 (USD 168.90)----------";
var result = Regex.Replace(line,pattern," ");
dt.Rows.Add(line);
}
}
}
class Class1
{
string PTC;
string CUR;
float TAX;
public string gsPTC
{
get{ return PTC; }
set{ PTC = value; }
}
public string gsCUR
{
get{ return CUR; }
set{ CUR = value; }
}
public float gsTAX
{
get{ return TAX; }
set{ TAX = value; }
}
}
}
If your format is strict (e.g. always 4 columns) and you want to remove only this complete line, I don't see any reason to use regex:
var rows = File.ReadLines(@"C:\FareSearchRegex.txt")
.Where(l => l != "---------- RECOMMENDATION 1 OF 3 IN GROUP 1 (USD 168.90)----------")
.Select(l => new { line = l, items = l.Split(','), row = dt.Rows.Add() });
foreach (var x in rows)
x.row.ItemArray = x.items;
(assumed that the fields are separated by comma)
Edit: This works with your pastebin:
string header = " PTC CUR TAX FARE BASIS";
bool takeNextLine = false;
foreach (string line in File.ReadLines(@"C:\FareSearchRegex.txt"))
{
if (line.StartsWith(header))
takeNextLine = true;
else if (takeNextLine)
{
var tokens = line.Split(new[] { @" " }, StringSplitOptions.RemoveEmptyEntries);
dt.Rows.Add().ItemArray = tokens.Where((t, i) => i != 2).ToArray();
takeNextLine = false;
}
}
(since you have an empty column which you want to exclude from the result, I've used the clumsy and possibly error-prone(?) query Where((t, i) => i != 2))
To parse the file you'll need to:
1) Split the text of the file into data chunks. A chunk, in your case, can be identified by the header PTC CUR TAX FARE BASIS and by the TOTAL line. To split the text you'll need to tokenize the input as follows: (i) define a regular expression to match the headers; (ii) define a regular expression to match the Total lines (footers). Using (i) and (ii) you can join them by their order of appearance and determine the extent of each chunk (see the line with (x, y) => new { StartIndex = x.Match.Index, EndIndex = y.Match.Index + y.Match.Length } below). Use the String.Substring method to separate the chunks.
2) Extract the data from each individual chunk. Knowing that the data is split by lines, you just have to iterate through all lines in a chunk (ignoring header and footer) and process each line.
This code should help:
string file = @"C:\FareSearchRegex.txt";
string text = File.ReadAllText(file);
var headerRegex = new Regex(@"^(\)>)?\s+PTC\s+CUR\s+TAX\s+FARE BASIS$", RegexOptions.IgnoreCase | RegexOptions.Multiline);
var totalRegex = new Regex(@"^\s+TOTAL[\w\s.]+?$", RegexOptions.IgnoreCase | RegexOptions.Multiline);
var lineRegex = new Regex(@"^(?<Num>\d+)?\s+(?<PTC>[A-Z]+)\s+\d+\s(?<Cur>[A-Z]{3})\s+[\d.]+\s+(?<Tax>[\d.]+)", RegexOptions.IgnoreCase | RegexOptions.Multiline);
var dataIndices =
headerRegex.Matches(text).Cast<Match>()
.Select((m, index) => new{ Index = index, Match = m })
.Join(totalRegex.Matches(text).Cast<Match>().Select((m, index) => new{ Index = index, Match = m }),
x => x.Index,
x => x.Index,
(x, y) => new{ StartIndex = x.Match.Index, EndIndex = y.Match.Index + y.Match.Length });
var items = dataIndices
.Aggregate(new List<string>(), (list, x) =>
{
var item = text.Substring(x.StartIndex, x.EndIndex - x.StartIndex);
list.Add(item);
return list;
});
var result = items.SelectMany(x =>
{
var lines = x.Split(new string[]{Environment.NewLine, "\r", "\n"}, StringSplitOptions.RemoveEmptyEntries);
return lines.Skip(1) //Skip header
.Take(lines.Length - 2) // Ignore footer
.Select(line =>
{
var match = lineRegex.Match(line);
return new
{
Ptc = match.Groups["PTC"].Value,
Cur = match.Groups["Cur"].Value,
Tax = Convert.ToDouble(match.Groups["Tax"].Value)
};
});
});
