How to validate multiple filenames inside a string array in C#

A string array named "files" stores 10 file paths.
For example, here are 2 of them:
[0] C:\\Users\\17\\Documents\\FS\\D\\mdN_2903.dat
[1] C:\\Users\\17\\Documents\\FS\\D\\mdNBP_29032.dat
I want to validate them by file name (the part before the number and the .dat extension) as well as by the number of files, in C#/.NET.
I tried it like this, but the check did not pass even though the files are present:
if(Array.Exists(files,e => e.Contains("mdN_") && e.Contains("mdNBP_") && e.count()==10)) { // All 10 files are present }
Is there any other way to implement this in C#? Please explain.

Without testing it: if your file array always has mdN and mdNBP as its first two entries, in that order, and holds 10 files in total, you can do something like this:
if (files[0].Contains("mdN") && files[1].Contains("mdNBP") && files.Length == 10)
{
    Console.WriteLine("all good");
}
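You can also use LINQ (this needs using System.Linq;). This is just one example, and it can be done in different ways. Here the order does not matter; it only checks that the array contains the mentioned names. Each name gets its own Any call, because a single element cannot be both files at once, and the trailing underscore keeps "mdN_" from matching the mdNBP files:
if (files.Length == 10 && files.Any(e => e.Contains("mdN_")) && files.Any(e => e.Contains("mdNBP_")))
{
    Console.WriteLine("all good");
}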
If you have more files, you can loop over the file array and check each entry against a list of what it should contain.
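Checking the array against a list of expected names might look roughly like the sketch below. The helper names `HasAllExpected` and `PrefixOf`, the sample paths, and the rule "the part of the file name before the last underscore must match exactly" are assumptions for illustration, not something from the question. Comparing the whole prefix avoids `mdN` accidentally matching `mdNBP`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: true when the array holds exactly expectedCount paths
// and every expected prefix (the text before the last '_') occurs at least once.
static bool HasAllExpected(string[] paths, IEnumerable<string> expectedPrefixes, int expectedCount)
{
    if (paths.Length != expectedCount) return false;

    var found = new HashSet<string>(paths.Select(p => PrefixOf(p)));
    return expectedPrefixes.All(prefix => found.Contains(prefix));
}

// Reduces "C:\...\mdNBP_29032.dat" to "mdNBP".
static string PrefixOf(string path)
{
    // Split on both separators so Windows-style paths also work on other platforms.
    string name = Path.GetFileNameWithoutExtension(path.Split('\\', '/').Last());
    int underscore = name.LastIndexOf('_');
    return underscore < 0 ? name : name.Substring(0, underscore);
}

// Made-up sample: two files instead of the ten in the question.
string[] sampleFiles =
{
    @"C:\Users\17\Documents\FS\D\mdN_2903.dat",
    @"C:\Users\17\Documents\FS\D\mdNBP_29032.dat",
};

Console.WriteLine(HasAllExpected(sampleFiles, new[] { "mdN", "mdNBP" }, 2) ? "all good" : "files missing");
```

For the real case you would pass the full list of ten expected prefixes and 10 as the count.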

Related

C# search large text file quickly

I have browsed a few related articles, but haven't quite found a solution that fits my query.
In a large plain-text File (~150MB, ~1.800.000 lines) I quickly want to find specific lines that have certain features using C#.
Each line has 132 characters, every one has a region-, a section-, a subsection code and an ident.
The combination of these 4 characteristics is unique.
Depending on the section code, the exact location of the other parts may differ.
Essentially, I want to retrieve up to ~50 elements with one method, that ideally takes less than a second.
The code I have so far works, but is way too slow for my purposes (~29 seconds of execution for 30 entries):
//icaoCode is always 2 char long
public static List<Waypoint> Retrieve(List<(string ident, string icaoCode, char sectionCode, char subSectionCode)> wpData)
{
    List<Waypoint> result = new List<Waypoint>();
    using StreamReader reader = new StreamReader(dataFile);
    while (!reader.EndOfStream)
    {
        string data = reader.ReadLine();
        if (data.Length != 132) continue;
        foreach(var x in wpData)
        {
            int subsPos = (x.sectionCode, x.subSectionCode) switch
            {
                ('P', 'N') => 5,
                ('P', _) => 12,
                (_, _) => 5
            };
            if (data[4].Equals(x.sectionCode) && data[subsPos].Equals(x.subSectionCode))
            {
                //IsNdb() and others look at the sectionCode and subSectionCode to determine data type
                if (IsNdb(data) && data[13..17].Trim() == x.ident && data[19..21] == x.icaoCode) result.Add(ArincHelper.LoadNdbEntry(data));
                else if (IsVhf(data) && data[13..17].Trim() == x.ident && data[19..21] == x.icaoCode) result.Add(ArincHelper.LoadVhfEntry(data));
                else if (IsTacan(data) && data[13..17].Trim() == x.ident) result.Add(ArincHelper.LoadTacanEntry(data));
                else if (IsIls(data) && data[13..17].Trim() == x.ident && data[10..12] == x.icaoCode) result.Add(ArincHelper.LoadIlsEntry(data));
                else if (IsAirport(data) && data[6..10] == x.ident && data[10..12] == x.icaoCode) result.Add(ArincHelper.LoadAirportEntry(data));
                else if (IsRunway(data) && (data[6..10] + data[15..18].Trim()) == x.ident && data[10..12] == x.icaoCode) result.Add(ArincHelper.LoadRunwayEntryAsWaypoint(data));
                else if (IsWaypoint(data) && data[13..18].Trim() == x.ident && data[19..21] == x.icaoCode) result.Add(ArincHelper.LoadWaypointEntry(data));
            }
        }
    }
    reader.Close();
    return result;
}
IsNdb() and the other identifying Methods all look like this:
private static bool IsNdb(string data) => (data[4], data[5]) == ('D', 'B') || (data[4], data[5]) == ('P', 'N');
Some Example data lines would be:
SEURPNEBBREB OP EB004020HOM N50561940E004353360 E0010 WGEBRUSSELS 169641609
SEURP EDDFEDAFRA 0FL100131Y N50015990E008341364E002300364250FFM ED05000 MWGE FRANKFURT/MAIN 331502006
SEURD CHA ED011535VDHB N49551597E009022334CHA N49551597E009022334E0020005292 249WGECHARLIE 867432005
SEURP LFFKLFCFK404 LF0 W F N46262560W000480430 E0000 WGE FK404 331071909
I would like to avoid loading the whole file into memory, as this takes ~400MB of RAM, although it is possible of course.
Thank you in advance for your help.
Edit:
The current solution converts this data file into an SQLite DB, which is then used.
This however takes ~3h of converting the file into the DB, which I want to avoid, as the data file is regularly swapped out.
This is why I would like to give this text parsing a try.
As @mjwills suggests, there are better-suited tools for the job; I would keep this data in a database. If I were going to try to make your current code faster, I would try the following: read chunks of the data into an array, process each chunk in a parallel for loop, and exit the loop when you have enough elements. Below is some pseudocode to get you started. I can't write complete code because I don't have your objects or file.
List<Waypoint> result = new List<Waypoint>();
var max = 1800000; // set this to the maximum number of rows in your file
var allLines = new string[max];
var dataFile = ""; // path to your data file
using (StreamReader sr = File.OpenText(dataFile))
{
    int x = 0;
    while (!sr.EndOfStream)
    {
        allLines[x] = sr.ReadLine();
        x += 1;
        if (x % 5000 == 0)
        {
            var start = x - 5000;
            // process only the 5000 lines that were just read: [start, x)
            Parallel.For(start, x, i =>
            {
                //do your process on allLines[i] here and exit if you have enough elements also set
                //a flag to exit the while loop
            });
        }
        //you would have to write some code to handle the last group of records that are less than 5k
    }
}
You could use Gigantor for this, which should be many times faster. Gigantor is a C# library for fast regex searching of gigantic files. It is available as either source code or a NuGet package.
Gigantor's search benchmark searches 5 GBytes in about 3 seconds finding a total of 105160 matches.
You would just need to convert your parsing code to a regular expression instead.
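As a rough idea of what "convert your parsing code to a regular expression" could mean, here is a sketch that uses only the standard System.Text.RegularExpressions classes; it does not show Gigantor's own API. The column positions come from the question's code (section code at index 4, ident at 13..17, ICAO code at 19..21). The helpers `BuildPattern` and `MakeLine`, the restriction to section code 'D', and the synthetic test line are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: pattern for one navaid record (section code 'D').
// Columns: 0-3 anything, 4 = 'D', 5-12 anything, 13-16 = ident (space padded),
// 17-18 anything, 19-20 = ICAO code.
static Regex BuildPattern(string ident, string icaoCode)
{
    string paddedIdent = Regex.Escape(ident.PadRight(4));
    return new Regex(
        "^.{4}D.{8}" + paddedIdent + ".{2}" + Regex.Escape(icaoCode),
        RegexOptions.Compiled);
}

// Builds a made-up 132-character record; only the columns named above are filled in.
static string MakeLine(string ident, string icaoCode)
{
    char[] record = new string(' ', 132).ToCharArray();
    "SEUR".CopyTo(0, record, 0, 4);
    record[4] = 'D';
    ident.CopyTo(0, record, 13, ident.Length);
    icaoCode.CopyTo(0, record, 19, icaoCode.Length);
    return new string(record);
}

Regex pattern = BuildPattern("CHA", "ED");
Console.WriteLine(pattern.IsMatch(MakeLine("CHA", "ED"))); // True
Console.WriteLine(pattern.IsMatch(MakeLine("FFM", "ED"))); // False
```

Each record type in the question uses different columns, so you would need one such pattern per type.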

C# Validate Xdocument File

I need to validate a selected XML file using XDocument, without an XSD.
I have a file named "Checker" and the file to check.
For example, I need to compare the hierarchy and the number of elements with each name against the checker file.
If the "checker" file has 3 pages, I need to check that the selected file has no more than that.
Thanks!
I tried with an array, but it gets too complicated, like this:
XElement pageElement = metadataFile.Root.Element("Pages");
int cntPage = ((IEnumerable<XElement>)pageElement.Elements()).Count();
if (cntPage < 1 || cntPage > 3) errorDetails += "Number of Pages wrong!!";
Elements() already returns IEnumerable<XElement>, so the explicit cast on the second line of your code is unnecessary:
int cntPage = pageElement.Elements().Count();
Which style to use is a matter of preference here, but the entire code snippet can be rewritten as follows:
int cntPage = metadataFile.Root
    .Element("Pages")
    .Elements()
    .Count();
if (cntPage < 1 || cntPage > 3)
    errorDetails += "Number of Pages wrong!!";
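The comparison against a checker file that the question describes could be sketched along these lines. This is only one possible reading of the requirement: it counts elements per hierarchy path in both documents and reports any path that occurs more often in the selected file than in the checker. The helper names `FindExtraElements` and `CountByPath` and the sample XML are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: lists hierarchy paths that occur more often in the
// candidate document than in the checker document.
static List<string> FindExtraElements(XDocument checker, XDocument candidate)
{
    Dictionary<string, int> allowed = CountByPath(checker);
    var errors = new List<string>();
    foreach (KeyValuePair<string, int> entry in CountByPath(candidate))
    {
        allowed.TryGetValue(entry.Key, out int max); // max stays 0 for paths the checker lacks
        if (entry.Value > max)
            errors.Add($"{entry.Key}: found {entry.Value}, allowed {max}");
    }
    return errors;
}

// Counts elements per hierarchy path such as "/Root/Pages/Page".
static Dictionary<string, int> CountByPath(XDocument doc)
{
    return doc.Descendants()
        .GroupBy(e => "/" + string.Join("/", e.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName)))
        .ToDictionary(g => g.Key, g => g.Count());
}

XDocument checkerDoc = XDocument.Parse("<Root><Pages><Page/><Page/><Page/></Pages></Root>");
XDocument selectedDoc = XDocument.Parse("<Root><Pages><Page/><Page/><Page/><Page/></Pages></Root>");
foreach (string error in FindExtraElements(checkerDoc, selectedDoc))
    Console.WriteLine(error); // /Root/Pages/Page: found 4, allowed 3
```

Note that the counts are per path across the whole document, not per individual parent element, so this is a coarse check.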

How to read integers from a text file to array

So this is what I would like to do. I am kind of all over the place with this but I hope you can bear with me. This is a very new concept to me.
1) In my program I wish to create an array of 50 integers to hold the data that comes from the file.
My program must get the path to the user's Documents folder.
2) The name of the file will be "grades.txt". Code this file name right in your program. No user input is required to get the file name.
3) Create a StreamReader object, using this path. This will open the file.
Write a loop that reads data from the file, until it discovers the end of the file.
4) As each integer value is read in, I display it, and store it in the array.
5) Using the concepts of partially filled arrays, write a method that takes the array as a parameter and calculates and returns the average value of the integers stored in the array
Output the average.
So right now I am having a very hard time figuring out how to get the numbers saved in the grades.txt file, save them to an array, and display them. I try to split the integers and save them as that but it doesn't seem to work.
This is the code that I have so far:
class Program
{
    const int SIZE = 50;
    static void Main()
    {
        // This line of code gets the path to the My Documents Folder
        int zero = 0;
        int counter = 0;
        int n, m;
        StreamReader myFile;
        myFile = new StreamReader("C:/grades.txt");
        string inputNum = myFile.ReadLine();
        do
        {
            Console.Write("The test scores are listed as follows:");
            string[] splitNum = myFile.Split();
            n = int.Parse(splitNum[0]);
            {
                if (n != zero)
                {
                    Console.Write("{0}", n);
                    counter++;
                }
            }
        } while (counter < SIZE && inputNum != null);
        // now we can use the full path to get the document
        Console.ReadLine();
    }
}
This is the grades.txt file:
88
90
78
65
50
83
75
23
60
94
To read the file, you need something like this:
var scores = new List<int>();
// using disposes the reader (and closes the file) when the block ends
using (StreamReader reader = new StreamReader("C:/grades.txt"))
{
    while (!reader.EndOfStream)
    {
        int score;
        if (int.TryParse(reader.ReadLine(), out score) && score != 0)
            scores.Add(score);
    }
}
You can get the number of scores from the scores.Count property.
1) In my program I wish to create an array of 50 integers to hold the data that comes from the file.
See Arrays Tutorial (C#).
2) My program must get the path to the user's Documents folder. The name of the file will be "grades.txt". Code this file name right in your program. No user input is required to get the file name.
Use these two:
Environment.GetFolderPath Method (Environment.SpecialFolder)
Path.Combine()
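Put together, those two calls might be used like this. It is a minimal sketch; the helper name `GetGradesPath` is made up for illustration.

```csharp
using System;
using System.IO;

// Builds "<Documents folder>/grades.txt" for the current user.
static string GetGradesPath()
{
    string documents = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
    return Path.Combine(documents, "grades.txt");
}

Console.WriteLine(GetGradesPath());
```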
3) Create a StreamReader object, using this path. This will open the file. Write a loop that reads data from the file, until it discovers the end of the file.
See StreamReader.EndOfStream (it is a property, not a method).
4) As each integer value is read in, I display it, and store it in the array.
If there is only one score per line, you don't need to do any Split() calls. Use your counter variable to know where in the Array to store the value.
5) Using the concepts of partially filled arrays, write a method that takes the array as a parameter and calculates and returns the average value of the integers stored in the array. Output the average.
See Methods (C# Programming Guide).
You'd pass the Array and how many values are stored in it (the counter variable).
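Steps 3 to 5 could be sketched as below, assuming one integer per line as in the grades.txt shown in the question. The method names `ReadScores` and `Average` are made up, and the demo reads from a StringReader with the question's sample values so that it runs without a file; in the real program you would pass a StreamReader opened on the Documents path instead.

```csharp
using System;
using System.IO;

const int Size = 50;

// Reads one integer per line into scores and returns how many slots were filled.
static int ReadScores(TextReader reader, int[] scores)
{
    int count = 0;
    while (count < scores.Length)
    {
        var line = reader.ReadLine();
        if (line == null) break; // end of file
        if (int.TryParse(line, out int score))
        {
            Console.WriteLine(score);  // display each value as it is read
            scores[count++] = score;   // store it in the next free slot
        }
    }
    return count;
}

// Averages only the first count entries of the partially filled array.
static double Average(int[] scores, int count)
{
    if (count == 0) return 0;
    int sum = 0;
    for (int i = 0; i < count; i++) sum += scores[i];
    return (double)sum / count;
}

int[] grades = new int[Size];
using TextReader input = new StringReader("88\n90\n78\n65\n50\n83\n75\n23\n60\n94");
int filled = ReadScores(input, grades);
Console.WriteLine("Average: " + Average(grades, filled));
```

Passing the count alongside the array is what keeps the 40 unused slots out of the average.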

Using c# GetFiles Length but only count the files with certain amount of chars in filename

So I'm using the simple
ImgFilesCount = ImgDirInfo.GetFiles("*.jpg").Length;
to figure out how many files are in a dir. But I need it to only count files that have exactly 26 characters in the file name. I tried
ImgFilesCount = ImgDirInfo.GetFiles("?????????????????????????.jpg").Length;
But it didn't work. Is the only option to do a foreach loop, check each filename, and increment the counter? I have a feeling LINQ can probably do this with a .Where statement, but I don't know any LINQ.
Maybe
int count = ImgDirInfo.EnumerateFiles("*.jpg").Count(f => f.Name.Length == 26);
EnumerateFiles is more efficient since it doesn't need to load all files into memory before it starts processing.
When you use EnumerateFiles, you can start enumerating the collection of FileInfo objects before the whole collection is returned.
When you use GetFiles, you must wait for the whole array of FileInfo objects to be returned before you can access the array.
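To try the EnumerateFiles approach without touching real data, a small self-contained check could look like this. The helper name `CountByNameLength`, the temporary directory, and the file names are made up for the demo.

```csharp
using System;
using System.IO;
using System.Linq;

// Counts .jpg files in a directory whose name (including ".jpg") has the given length.
static int CountByNameLength(DirectoryInfo dir, int nameLength)
{
    return dir.EnumerateFiles("*.jpg").Count(f => f.Name.Length == nameLength);
}

// Throwaway demo directory with made-up file names.
DirectoryInfo demoDir = Directory.CreateDirectory(Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()));
File.WriteAllText(Path.Combine(demoDir.FullName, new string('a', 22) + ".jpg"), ""); // 26 characters
File.WriteAllText(Path.Combine(demoDir.FullName, "short.jpg"), "");                  // 9 characters

Console.WriteLine(CountByNameLength(demoDir, 26)); // 1
demoDir.Delete(recursive: true);
```

FileInfo.Name includes the extension, so 26 here means 22 characters plus ".jpg". If the 26 characters should exclude the extension, compare against Path.GetFileNameWithoutExtension(f.Name).Length instead.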
ImgFilesCount = ImgDirInfo.GetFiles("*.jpg")
    .Where(file => file.Name.Length == 26)
    .Count();
Something like this?
string[] files = Directory
    .EnumerateFiles(@"c:\Users\x074\Downloads", "*.jpg", SearchOption.AllDirectories)
    .Where(path => Path.GetFileName(path).Length > 20)
    .ToArray();

Read specific data from text file

I have a text file as follows (it has more than a hundred thousand lines):
Header
AGROUP1
ADATA1|0000
ADATA2|0001
ADATA3|0002
D0000|TNE
D0001|TNE
D0002|TNE
AGROUP2
ADATA1|0000
ADATA2|0001
ADATA3|0002
D0000|TNE
D0001|TNE
D0002|TNE
AGROUP3
ADATA1|0000
ADATA2|0001
ADATA3|0002
D0000|TNE
D0001|TNE
D0002|TNE
In fact, it has more than a hundred thousand lines.
I need to read data based on group
For example in a method:
public void ReadData(string strGroup)
{
    if(strGroup == "AGROUP2")
    //Read from the text file starting from line "AGROUP2" to "AGROUP3"(i.e lines under AGROUP2)
}
What I have tried is:
public IEnumerable<string> ReadData(string strGroup)
{
    bool start = false;
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        if (line == strGroup && line.Length == 5)
            start = true;
        else if (line.Length == 5)
            start = false;
        if(start)
            yield return line;
    }
}
It works fine, but performance-wise it takes a long time, since my text file is very large and there is an if condition on every line in the method.
Is there a better way to do this?
If there is anything you know about the structure of the file that might help, you could use that:
If the list is sorted, you might know when to stop parsing.
If the list contains jump tables or an index, you could skip lines.
If the groups have a specific number of lines, you can skip those.
If not, you're destined to search from top to bottom, and you will only be able to increase the speed using technical tricks:
Read batches of lines instead of single lines.
Try to avoid creating many tiny objects (strings) in your code that might choke the garbage collector.
If you need to do a lot of random access (going back and forth throughout the file), you might consider indexing or splitting the file first.
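The "know when to stop parsing" idea can be sketched as follows, assuming each group appears only once and that group headers can be recognised by a simple test (here: the line starts with "AGROUP", as in the sample data). The method name `ReadGroup` and the `isHeader` callback are made up for illustration. Unlike the loop in the question, this stops reading as soon as the next header after the wanted group is reached instead of scanning to the end of the file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Lazily yields the lines below the requested group header and stops reading
// as soon as the following header is reached.
static IEnumerable<string> ReadGroup(TextReader reader, string group, Func<string, bool> isHeader)
{
    bool inGroup = false;
    while (true)
    {
        var line = reader.ReadLine();
        if (line == null) yield break;      // end of file
        if (isHeader(line))
        {
            if (inGroup) yield break;       // next group starts: nothing more to read
            inGroup = line == group;
        }
        else if (inGroup)
        {
            yield return line;
        }
    }
}

string sampleText = "Header\nAGROUP1\nADATA1|0000\nAGROUP2\nADATA2|0001\nD0001|TNE\nAGROUP3\nADATA3|0002\n";
foreach (string row in ReadGroup(new StringReader(sampleText), "AGROUP2",
                                 l => l.StartsWith("AGROUP", StringComparison.Ordinal)))
{
    Console.WriteLine(row);
}
// prints:
// ADATA2|0001
// D0001|TNE
```

This only helps when the wanted group is not the last one in the file; for repeated lookups, an index of group offsets would be the next step.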
What if you use a bash command to cut the huge file into smaller ones, each with AGROUP# as its first line? I think bash commands are more optimized.
