How to merge files order by filename

I want to create a new PDF file using iTextSharp. My code works, but the pages in the new file are not ordered by page number.
Could you let me know how to modify the following code so that the new PDF file is ordered by file name (page number)?
foreach (var file in Directory.GetFiles(path))
{
reader = new PdfReader(file);
for (int i = 0; i < reader.NumberOfPages; i++)
{
page = pdf.GetImportedPage(reader, i + 1);
pdf.AddPage(page);
}
pdf.FreeReader(reader); reader.Close();
}
The folder contains 102 files, named Page_1, Page_2, Page_3, and so on.
I expected the output pages to be in order.
Thank you very much in advance.

Since you need to order numerically (alphabetically, Page_15 sorts before Page_2), you need to extract the number from the file name, for example:
// Needs some work but you get the idea
private int GetNumberFromFilename(string filename)
{
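// Split the file name only, so that '_' or '.' in the directory path cannot shift the parts
var baseName = Path.GetFileName(filename);
var parts = baseName.Split('_','.');
if(parts.Length > 1 && int.TryParse(parts[1], out var number))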
{
return number;
}
return 0;
}
Now you can get your ordered files like this:
var files = Directory.GetFiles(path)
.Where(f => Path.GetFileName(f).StartsWith("Page_")) //filter out non matching files
.OrderBy(GetNumberFromFilename);
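As a minimal sketch of the numeric ordering described above, the following example sorts a list of names instead of reading a real directory. The class `PageOrder` and its method names are hypothetical, and it assumes every file is named `Page_<number>` with an optional extension.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PageOrder
{
    // Returns the number after "Page_", or int.MaxValue when the name does not match,
    // so that unexpected files sort last instead of first.
    public static int GetPageNumber(string path)
    {
        var name = Path.GetFileNameWithoutExtension(path);
        const string prefix = "Page_";
        if (name.StartsWith(prefix, StringComparison.OrdinalIgnoreCase) &&
            int.TryParse(name.Substring(prefix.Length), out var number))
        {
            return number;
        }
        return int.MaxValue;
    }

    // Orders paths by page number; LINQ's OrderBy is a stable sort.
    public static List<string> SortByPageNumber(IEnumerable<string> paths)
    {
        return paths.OrderBy(GetPageNumber).ToList();
    }
}
```

The result of `PageOrder.SortByPageNumber(Directory.GetFiles(path))` could then be passed to the `foreach` loop from the question.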

Related

How to query a file's language property in C#? [duplicate]

So, I followed a tutorial to "upload" files to a local path using ASP.NET Core.
This is the code:
public IActionResult About(IList<IFormFile> files)
{
foreach (var file in files)
{
var filename = ContentDispositionHeaderValue
.Parse(file.ContentDisposition)
.FileName
.Trim('"');
filename = hostingEnv.WebRootPath + $@"\{filename}";
using (FileStream fs = System.IO.File.Create(filename))
{
file.CopyTo(fs);
fs.Flush();
}
}
return View();
}
I want to read the extended properties of a file (file metadata), such as:
name,
author,
date posted,
etc.
and sort the files using this data. Is there a way to do this using IFormFile?

If you want to access more file metadata than the .NET Framework provides out of the box, I guess you need to use a third-party library.
Otherwise you need to write your own COM wrapper to access those details.
See this link for a pure C# sample.
Here is an example of how to read the properties of a file:
Add a reference to Shell32.dll from the "Windows/System32" folder to your project.
List<string> arrHeaders = new List<string>();
List<Tuple<int, string, string>> attributes = new List<Tuple<int, string, string>>();
Shell32.Shell shell = new Shell32.Shell();
var strFileName = @"C:\Users\Admin\Google Drive\image.jpg";
Shell32.Folder objFolder = shell.NameSpace(System.IO.Path.GetDirectoryName(strFileName));
Shell32.FolderItem folderItem = objFolder.ParseName(System.IO.Path.GetFileName(strFileName));
for (int i = 0; i < short.MaxValue; i++)
{
string header = objFolder.GetDetailsOf(null, i);
if (String.IsNullOrEmpty(header))
break;
arrHeaders.Add(header);
}
// The attributes list below will contain a tuple with attribute index, name and value
// Once you know the index of the attribute you want to get,
// you can get it directly without looping, like this:
var Authors = objFolder.GetDetailsOf(folderItem, 20);
for (int i = 0; i < arrHeaders.Count; i++)
{
var attrName = arrHeaders[i];
var attrValue = objFolder.GetDetailsOf(folderItem, i);
var attrIdx = i;
attributes.Add(new Tuple<int, string, string>(attrIdx, attrName, attrValue));
Debug.WriteLine("{0}\t{1}: {2}", i, attrName, attrValue);
}
Console.ReadLine();
You can extend this code with custom classes and then sort according to your needs.
There are many paid libraries out there, but there is a free one called WindowsApiCodePack.
I think it supports accessing image metadata, for example:
ShellObject picture = ShellObject.FromParsingName(file);
var camera = picture.Properties.GetProperty(SystemProperties.System.Photo.CameraModel);
newItem.CameraModel = GetValue(camera, String.Empty, String.Empty);
var company = picture.Properties.GetProperty(SystemProperties.System.Photo.CameraManufacturer);
newItem.CameraMaker = GetValue(company, String.Empty, String.Empty);

Can multiple zip file entries be active using ZipOutputStream class?

I am trying to use DotNetZip open source library for creating large zip files.
I need to write part of each data row of the data table to each stream writer (see the code below). Another limitation is that I can't do this in memory, because the contents are large (several gigabytes per entry).
The problem I have is that despite writing to each stream separately, the output is all written to the last entry only. The first entry is blank. Does anybody have any idea how to fix this issue?
static void Main(string fileName)
{
var dt = CreateDataTable();
var streamWriters = new StreamWriter[2];
using (var zipOutputStream = new ZipOutputStream(File.Create(fileName)))
{
for (var i = 0; i < 2; i++)
{
var entryName = "file" + i + ".txt";
zipOutputStream.PutNextEntry(entryName);
streamWriters[i] = new StreamWriter(zipOutputStream, Encoding.UTF8);
}
WriteContents(streamWriters[0], streamWriters[1], dt);
zipOutputStream.Close();
}
}
private DataTable CreateDataTable()
{
var dt = new DataTable();
dt.Columns.AddRange(new DataColumn[] { new DataColumn("col1"), new DataColumn("col2"), new DataColumn("col3"), new DataColumn("col4") });
for (int i = 0; i < 100000; i++)
{
var row = dt.NewRow();
for (int j = 0; j < 4; j++)
{
row[j] = j * 1;
}
dt.Rows.Add(row);
}
return dt;
}
private void WriteContents(StreamWriter writer1, StreamWriter writer2, DataTable dt)
{
foreach (DataRow dataRow in dt.Rows)
{
writer1.WriteLine(dataRow[0] + ", " + dataRow[1]);
writer2.WriteLine(dataRow[2] + ", " + dataRow[3]);
}
}
Expected results:
Both file0.txt and file1.txt need to be written.
Actual results:
Only file1.txt is written, and it contains all the content. file0.txt is blank.

It seems to be the expected behaviour according to the docs:
If you don't call Write() between two calls to PutNextEntry(), the first entry is inserted into the zip file as a file of zero size. This may be what you want.
So to me it seems that it is not possible to do what you want through the current API.
Also, as a zip file is a continuous sequence of zip entries, it is probably physically impossible to create entries in parallel, as you would have to know the size of each entry before starting a new one.
Perhaps you could just create separate archives and then combine them (if I am not mistaken, there was a simple API to do that).
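One way to work within the one-entry-at-a-time constraint is to finish each entry before starting the next, enumerating the rows once per entry. The sketch below illustrates that idea with the built-in `System.IO.Compression.ZipArchive` rather than DotNetZip, so it is a swapped-in technique and not the library from the question. The class `SequentialZipWriter` and the `IEnumerable<string[]>` row shape are assumptions for illustration, and the row source must be cheap to enumerate twice.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class SequentialZipWriter
{
    // Writes two entries, completing the first before opening the second.
    // rows is enumerated once per entry, so no entry is buffered in memory.
    public static void Write(Stream output, IEnumerable<string[]> rows)
    {
        using (var archive = new ZipArchive(output, ZipArchiveMode.Create, leaveOpen: true))
        {
            WriteEntry(archive, "file0.txt", rows, 0, 1);
            WriteEntry(archive, "file1.txt", rows, 2, 3);
        }
    }

    // Streams two columns of every row into a single entry.
    private static void WriteEntry(ZipArchive archive, string name,
        IEnumerable<string[]> rows, int first, int second)
    {
        var entry = archive.CreateEntry(name);
        using (var writer = new StreamWriter(entry.Open(), Encoding.UTF8))
        {
            foreach (var row in rows)
            {
                writer.WriteLine(row[first] + ", " + row[second]);
            }
        }
    }
}
```

The trade-off is a second pass over the data in exchange for never holding more than one row in memory.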

How to count lines

How do I count the lines in each log file and create a new log file from each count?
Below are my log files:
DDD.CGLOG
ID|AFP|DATE|FOLDER
1|DDD|20181204|B
2|DDD|20181104|B
3|DDD|20181004|B
FFF.CGLOG
ID|AFP|DATE|FOLDER
1|FFF|20181204|B
2|FFF|20181104|B
WWW.CGLOG
ID|AFP|DATE|FOLDER
1|WWW|20181204|B
I want to count the lines and create new log files as below:
DDD_QTY.Log
AFP|QTY
DDD|3
FFF_QTY.Log
AFP|QTY
FFF|2
WWW_QTY.Log
AFP|QTY
WWW|1
Below is what I have tried. I have managed to get the count from each log file inside the folder; now I just need to write the count into a new log file named after the existing log file.
string[] ori_Files = Directory.GetFiles(@"F:\Work\FLP Code\test", "*.CGLOG*", SearchOption.TopDirectoryOnly);
foreach (var file in ori_Files)
{
using (StreamReader file1 = new StreamReader(file))
{
string line;
int count = 0;
while ((line = file1.ReadLine()) != null)
{
Console.WriteLine(line);
count++;
}
Console.WriteLine(count);
}
}
Console.ReadLine();

Since you only want to count lines, you can keep it simple, assuming your file name dictates the AFP value:
static long CountLinesInFile(string fileName,string outputfile)
{
var afp = Path.GetFileNameWithoutExtension(fileName);
var lineCount = File.ReadAllLines(fileName).Length;
File.WriteAllText(outputfile,$"AFP|QTY{Environment.NewLine}{afp}|{lineCount -1}");
return lineCount-1;
}
Please note that the result is one less than the number of lines, because the header line is not counted (as in your example). If the file name differs from the AFP term, you can use a regex to parse the AFP term from any line other than the header line. Example regex for parsing the AFP term:
new Regex(@"^[0-9]+\|(?<AFP>[a-zA-Z]+)\|[0-9]+\|[a-zA-Z]+$")
Update
In case your file is pretty large (say 15-20 GB, considering it is a log file), a better approach would be:
static long CountLinesInFile(string fileName,string outputFileName)
{
var afp = Path.GetFileNameWithoutExtension(fileName);
long count = 0;
int query = (int)Convert.ToByte('\n');
int previous = query; // an empty file has no unterminated last line
using (var stream = File.OpenRead(fileName))
{
int current;
while ((current = stream.ReadByte()) != -1)
{
if (current == query)
{
count++;
}
previous = current;
}
}
// Count a final line that has no trailing newline
if (previous != query)
{
count++;
}
// Exclude the header line, as in the first version
if (count > 0)
{
count--;
}
using (System.IO.StreamWriter file = new System.IO.StreamWriter(outputFileName, true))
{
file.WriteLine($"AFP|QTY{Environment.NewLine}{afp}|{count}");
}
return count;
}
Update 2
To invoke the method for all files in a given folder, you can make use of DirectoryInfo.GetFiles, for example:
DirectoryInfo d = new DirectoryInfo(@"E:\TestFolder");
FileInfo[] Files = d.GetFiles("*.txt");
foreach(FileInfo file in Files )
{
CountLinesInFile(file.FullName,$"{file.FullName}.processed");
}

A simple two-liner:
static void CountLines(string path,string outfile)
{
var count = File.ReadLines(path).Count();
File.WriteAllText(outfile, $"AFP|QTY{Environment.NewLine}DDD|{count}");
}
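Putting the pieces from these answers together, here is a hedged sketch that streams a log file, skips the header, and writes a `<name>_QTY.Log` file next to it. The class `QtyLog` and its method names are hypothetical, and it assumes the AFP value equals the file name without its extension, as in the question's sample data.

```csharp
using System;
using System.IO;
using System.Linq;

public static class QtyLog
{
    // Counts data rows (every line except the header) without loading the whole file.
    public static int CountDataRows(string logPath)
    {
        return File.ReadLines(logPath).Skip(1).Count();
    }

    // Writes "<name>_QTY.Log" beside the source file and returns the new file's path.
    public static string WriteQtyLog(string logPath)
    {
        var afp = Path.GetFileNameWithoutExtension(logPath);
        var folder = Path.GetDirectoryName(logPath) ?? string.Empty;
        var outputPath = Path.Combine(folder, afp + "_QTY.Log");
        var count = CountDataRows(logPath);
        File.WriteAllLines(outputPath, new[] { "AFP|QTY", afp + "|" + count });
        return outputPath;
    }
}
```

Calling `QtyLog.WriteQtyLog(file)` inside the question's `foreach (var file in ori_Files)` loop would produce one output file per log.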

How do I filter Directory.GetFiles() by a numeric range when file names are listed in numeric order?

I want to filter which files are returned from the Directory.GetFiles() function. The files in the directory are all text files named with 6-digit numbers in incremental order (for example: "200501.txt", "200502.txt", "200503.txt", and so on). I would like to enter a "Starting Invoice Number" and an "Ending Invoice Number" through 2 text box controls to return only the files within that range.
The current code is as follows:
using (var fbd = new FolderBrowserDialog())
{
DialogResult result = fbd.ShowDialog();
if (result == DialogResult.OK && !string.IsNullOrWhiteSpace(fbd.SelectedPath))
{
string[] fileDir = Directory.GetFiles(fbd.SelectedPath);
string[] files = fileDir;
foreach (string loopfile in files)
{
int counter = 0;
string line;
//Gets invoice number from text file name
//This strips all unnecessary strings out of the directory and file name
//need to change substring 32 to depending directory using
string loopfileName = loopfile.Substring(32);
string InvoiceNumberLong = Path.GetFileName(loopfile);
string InvoiceNumber = InvoiceNumberLong.Substring(0,(InvoiceNumberLong.Length - 4)).ToString();
var controlCount = new List<string>();
var EndCount = new List<string>();
//Read through text file line by line to find all instances of "control" and "------" string
//adds all line position of these strings to lists
System.IO.StreamReader file = new System.IO.StreamReader(loopfile);
while ((line = file.ReadLine()) != null)
{
if (line.Contains("Control"))
{
controlCount.Add(counter.ToString());
}
if (line.Contains("------"))
{
EndCount.Add(counter.ToString());
}
counter++;
}
}
}
}
Thank you in advance!

You can't use the built-in filter that the GetFiles method provides, as it can only filter by wildcards. You can do it with some LINQ, where min and max are the parsed starting and ending invoice numbers (both inclusive):
var files = Directory.EnumerateFiles(path, "*.txt")
.Where(d => int.TryParse(Path.GetFileNameWithoutExtension(d), out var value) && value >= min && value <= max);
Note: This uses the C# 7 "out var" syntax, but it can be converted for earlier versions by declaring the variable before the call.
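To make the range filter easy to try without a real folder, here is a small sketch that applies the same idea to any sequence of paths. The class `InvoiceFilter` and the parameter names `first` and `last` are hypothetical, and the bounds are treated as inclusive, which is an assumption about what "within that range" means.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class InvoiceFilter
{
    // Keeps only paths whose file name (without extension) parses as a number in [first, last].
    public static List<string> InRange(IEnumerable<string> paths, int first, int last)
    {
        return paths
            .Where(p => int.TryParse(Path.GetFileNameWithoutExtension(p), out var n)
                        && n >= first && n <= last)
            .ToList();
    }
}
```

In the form, the two text box values would first be parsed with `int.TryParse`, and the result of `Directory.EnumerateFiles(fbd.SelectedPath, "*.txt")` passed as `paths`.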

How do I read the version number of an .exe inside a zip archive?

I have a zip file that contains an exe, and I want to get the version number of the exe file without having to extract it physically.
I know how to read the contents of a zip file, and have code that will read a text file in it, but I can't find out how to get the version of an exe.

Add a reference to the Shell32.dll library. Then you'll likely find what you are looking for with this:
Shell shell = new Shell();
var folder = shell.NameSpace( <path_to_your_zip> );
// Just get the names of the properties
List<string> arrHeaders = new List<string>();
for (int i = 0; i < short.MaxValue; i++)
{
string header = folder.GetDetailsOf(null, i);
if (String.IsNullOrEmpty(header))
break;
arrHeaders.Add(header);
}
// Loop all files inside the zip and output their properties to console
foreach (Shell32.FolderItem2 item in folder.Items())
{
for (int i = 0; i < arrHeaders.Count; i++)
{
Console.WriteLine("{0}\t{1}: {2}", i, arrHeaders[i], folder.GetDetailsOf(item, i));
}
}
EDIT: It seems that this is not possible without actually extracting the file from the package. Something like this is fairly simple, but it will take time if the file is large and/or heavily compressed.
Shell s = new Shell();
var folder = s.NameSpace( <path_to_your_zip> );
foreach (FolderItem2 item in folder.Items())
{
string oItemName = Path.GetFileName(item.Path);
try
{
string oTargetFile = Path.Combine(Path.GetTempPath(), oItemName);
if (File.Exists(oTargetFile))
File.Delete(oTargetFile);
Folder target = s.NameSpace(Path.GetTempPath());
target.CopyHere(item, 4);
var info = FileVersionInfo.GetVersionInfo(oTargetFile);
if (File.Exists(oTargetFile))
File.Delete(oTargetFile);
Console.WriteLine(oItemName + "'s version is: " + info.FileVersion);
}
catch (Exception e)
{
Console.WriteLine(oItemName + ": Unable to obtain version info.\n" + e.Message);
}
}
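If a Shell32 COM reference is not an option, the same extract-then-inspect approach can be sketched with `System.IO.Compression` and `FileVersionInfo`. This is an alternative technique, not the code from the answer above. The class `ZipVersionReader` is hypothetical, and the sketch assumes the entry name inside the archive is already known.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

public static class ZipVersionReader
{
    // Extracts one entry to a temporary file, reads its version resource, then deletes the copy.
    // Returns null when the entry does not exist; the result may also be null or empty
    // when the extracted file carries no version information.
    public static string GetFileVersion(string zipPath, string entryName)
    {
        using (var archive = ZipFile.OpenRead(zipPath))
        {
            var entry = archive.GetEntry(entryName);
            if (entry == null)
            {
                return null;
            }
            var tempFile = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
            try
            {
                entry.ExtractToFile(tempFile, overwrite: true);
                return FileVersionInfo.GetVersionInfo(tempFile).FileVersion;
            }
            finally
            {
                // File.Delete does not throw if the file was never created.
                File.Delete(tempFile);
            }
        }
    }
}
```

Only the requested entry is extracted, so the cost scales with the size of that one file, not the whole archive.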
