C# : How to save a zip file every X files - c#

I have a program written in C# which should save a zip file every n records (like 500).
My idea was to use the modulo operator (%) and write the file whenever the result is zero. That works, but what if I have 520 records? I should write 500 files into the first zip and then 20 files into the second one.
Here is the code:
using (ZipFile zip = new ZipFile())
{
zip.CompressionLevel = Ionic.Zlib.CompressionLevel.Level8;
zip.CompressionMethod = CompressionMethod.Deflate;
int indexrow = 0;
foreach(DataRow row in in_dt.Rows)
{
zip.AddFile(row["Path"].ToString(),"prova123");
if(indexrow % 500 == 0)
{
using (var myZipFile = new FileStream("c:\\tmp\\partial_"+indexrow.ToString()+".zip", FileMode.Create))
{
zip.Save(myZipFile);
}
indexrow = indexrow++;
}
}
}
}
in_dt is a DataTable that contains all the file paths on the filesystem.
The zip object comes from the DotNetZip library.

I'd use LINQ for this problem:
// Define the group size
const int GROUP_SIZE = 500;
// Select a new object type that encapsulates the base item
// and a new property called "Grouping" that will group the
// objects based on their index relative to the group size
var groups = in_dt
    .AsEnumerable()
    .Select(
        (item, index) => new {
            Item = item,
            Index = index,
            // Integer division already truncates, so Math.Floor is not needed
            Grouping = index / GROUP_SIZE
        }
    )
    .GroupBy(item => item.Grouping);
// Loop through the groups
foreach (var group in groups) {
// Generate a zip file for each group of files
}
For files 0 through 499, the Grouping property is 0.
For files 500 through 519 (the remaining 20 of 520), the Grouping property is 1.

What you probably want to do is something like this:
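void zipFiles(string[] Files, int MaxFilesInZip)
{
    int Parts = Files.Length / MaxFilesInZip;     // number of full zips
    int Remaining = Files.Length % MaxFilesInZip; // files left for the last zip
    for (int i = 0; i < Parts; i++)
    {
        // New zip
        for (int u = 0; u < MaxFilesInZip; u++)
        {
            // Add Files[i * MaxFilesInZip + u]
        }
    }
    if (Remaining > 0)
    {
        // New zip
        // Add the last 'Remaining' files, starting at Files[Parts * MaxFilesInZip]
    }
}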
This way, if you call zipFiles with 520 files and MaxFilesInZip set to 250, you get two zip files of 250 files each and one zip with the remaining 20. Parts comes from integer division, which already rounds down in C#, so no explicit Floor or Ceiling call should be needed.
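As a rough sketch of the batching idea in the answers above, the split can be done with plain index arithmetic. The helper name SplitIntoBatches and the generated file names are made up for illustration, and the actual zip creation is left as a comment because it depends on the zip library in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the file paths held in in_dt.
List<string> paths = Enumerable.Range(0, 520).Select(i => $"file{i}.txt").ToList();
const int maxFilesInZip = 500;

// Split the list into consecutive batches; the last batch holds the remainder.
List<List<string>> batches = SplitIntoBatches(paths, maxFilesInZip);

for (int b = 0; b < batches.Count; b++)
{
    // A real program would create partial_{b}.zip here and add batches[b] to it.
    Console.WriteLine($"partial_{b}.zip: {batches[b].Count} files");
}

static List<List<T>> SplitIntoBatches<T>(IReadOnlyList<T> items, int batchSize)
{
    if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
    var result = new List<List<T>>();
    for (int start = 0; start < items.Count; start += batchSize)
    {
        // The final batch may be smaller than batchSize.
        int size = Math.Min(batchSize, items.Count - start);
        var batch = new List<T>(size);
        for (int i = 0; i < size; i++) batch.Add(items[start + i]);
        result.Add(batch);
    }
    return result;
}
```

With 520 paths this yields one batch of 500 and one of 20. On .NET 6 or later, Enumerable.Chunk does the same split without a custom helper.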

Related

Count how many files starts with the same first characters c#

I want to write a function that counts how many groups of files in a selected folder start with the same 10 characters.
For example, if the folder contains files named File1, File2 and File3, count will be 1, because all 3 files start with the same characters "File". If the folder contains
File1,File2,File3,Docs1,Docs2,pdfs1,pdfs2,pdfs3,pdfs4
count will be 3, because there are 3 unique values of fileName.Substring(0, 4).
I've tried something like this, but it gives the total number of files in the folder.
int count = 0;
foreach (string file in Directory.GetFiles(folderLocation))
{
string fileName = Path.GetFileName(file);
if (fileName.Substring(0, 10) == fileName.Substring(0, 10))
{
count++;
}
}
Any idea how to count this?
You can query the directory with the help of LINQ (as written, this returns the size of the largest group of files that share a prefix):
using System.IO;
using System.Linq;
...
int n = 10;
int count = Directory
.EnumerateFiles(folderLocation, "*.*")
.Select(file => Path.GetFileNameWithoutExtension(file))
.Select(file => file.Length > n ? file.Substring(0, n) : file)
.GroupBy(name => name, StringComparer.OrdinalIgnoreCase)
.OrderByDescending(group => group.Count())
.FirstOrDefault()
?.Count() ?? 0;
You could instantiate a list of strings of files with a unique name, and check if each file is in that list or not:
int count = 0;
int length = 10; // number of leading characters to compare
List<string> list = new List<string>();
foreach (string file in Directory.GetFiles(folderLocation))
{
    bool inKnown = false;
    string fileName = Path.GetFileName(file);
    // Use the whole name when it is shorter than the prefix length
    string prefix = fileName.Length > length ? fileName.Substring(0, length) : fileName;
    foreach (string s in list)
    {
        if (s == prefix)
        {
            inKnown = true;
            break;
        }
    }
    if (!inKnown)
    {
        // First time this prefix is seen: count it and remember it
        count++;
        list.Add(prefix);
    }
}
The limitation here is that you are asking if the first ten characters are the same, but your examples given showed the first 4, so just adjust the length variable according to how many characters you would like to check for.
@acornTime gave me the idea. His solution didn't work, but this did. Thanks for the help!
List<string> list = new List<string>();
foreach (string file in Directory.GetFiles(folderLocation))
{
string fileName = Path.GetFileName(file);
list.Add(fileName.Substring(0, 10));
}
list = list.Distinct().ToList();
//count how many items are in list
int count = list.Count;
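One thing to watch in the snippet above is that Substring(0, 10) throws when a file name is shorter than 10 characters. A possible variation, sketched here with a HashSet and a made-up helper name (CountDistinctPrefixes), uses the whole name in that case. The file names are hard-coded only to keep the example self-contained, and the case-insensitive comparison is an assumption rather than something the question asked for.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical file names standing in for Directory.GetFiles(folderLocation).
var names = new List<string>
{
    "File1", "File2", "File3", "Docs1", "Docs2",
    "pdfs1", "pdfs2", "pdfs3", "pdfs4"
};

int count = CountDistinctPrefixes(names, 4);
Console.WriteLine(count); // 3

static int CountDistinctPrefixes(IEnumerable<string> fileNames, int prefixLength)
{
    // A HashSet keeps each prefix once; the comparison here ignores case.
    var prefixes = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    foreach (string name in fileNames)
    {
        // Names shorter than prefixLength are used whole instead of throwing.
        prefixes.Add(name.Length > prefixLength ? name.Substring(0, prefixLength) : name);
    }
    return prefixes.Count;
}
```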

How to iterate through data and create a new text file every nth entries

I'm making a list of lines that need to be added to a .txt file (with tab delimitation). The text file needs to have a maximum of 500 entries plus a header.
Right now, I have this code, which is successfully iterating through my list and creating the text file with the header. If the file already exists, it appends the lines in my list without adding the header.
I can't quite figure out how to make a new file, add the header and add each line after my first file surpasses 500 entries.
Can you help me split the output into 500-line files, each with the header? Thank you.
This is the code I have so far:
var tab = new StringBuilder();
foreach (var line in textlinestoadd)
{
tab.AppendLine(line.ToString());
}
if (!File.Exists(textcsvpath))
{
string textheader = "Vendor\tDate\tInvoice\tPO\tTax\tTotal\tAcount\tType\tJobs\tClass" + Environment.NewLine;
File.WriteAllText(textcsvpath, textheader);
}
File.AppendAllLines(textcsvpath, textlinestoadd);
This seems like a good practice opportunity, so I will leave the code part as an exercise!
The basic idea is simple: whenever you have written 500 lines, reset and start writing to a new file.
Here is some high-level pseudocode:
Initialize StringBuilder sb
For each line do
    Add line to sb
    increment line count
    if line count == 500 then
        save to file
        reset sb
        reset line count
        update filename = next file
    end if
End For
// writes the last chunk if # of lines is not a multiple of 500
if line count is not 0 then
    save to file
end if
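For reference, the pseudocode above could translate to C# roughly as follows. This is only a sketch: the helper name WriteInChunks, the baseName.N.txt naming scheme and the temporary output folder are assumptions made for the example, and here the 500-line limit counts data lines only, not the header.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical inputs: 1200 data lines, written under a temporary folder.
var lines = new List<string>();
for (int i = 0; i < 1200; i++) lines.Add($"Vendor{i}\t2020-01-01");
string outputDir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(outputDir);

List<string> written = WriteInChunks(lines, "Vendor\tDate", outputDir, "invoices", 500);
foreach (string writtenFile in written) Console.WriteLine(writtenFile);

static List<string> WriteInChunks(IEnumerable<string> data, string header,
    string directory, string baseName, int maxLines)
{
    var files = new List<string>();
    var sb = new StringBuilder();
    int lineCount = 0;

    void Flush()
    {
        // Each file starts with the header, followed by the buffered lines.
        string filePath = Path.Combine(directory, $"{baseName}.{files.Count}.txt");
        File.WriteAllText(filePath, header + Environment.NewLine + sb.ToString());
        files.Add(filePath);
        sb.Clear();
        lineCount = 0;
    }

    foreach (string line in data)
    {
        sb.AppendLine(line);
        lineCount++;
        if (lineCount == maxLines) Flush();
    }
    // Write the last chunk if the total is not a multiple of maxLines.
    if (lineCount > 0) Flush();
    return files;
}
```

With 1200 lines and a limit of 500, this writes three files holding 500, 500 and 200 data lines, each preceded by the header.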
I'd try something like this.
var tab = new StringBuilder();
int lineCount = 0;
string textheader = "Vendor\tDate\tInvoice\tPO\tTax\tTotal\tAcount\tType\tJobs\tClass" + Environment.NewLine;
if (File.Exists(textcsvpath)) {
// ReadAllLines opens and closes the file itself; an extra open FileStream would keep it locked
string[] fileContent = File.ReadAllLines(textcsvpath);
lineCount = fileContent.Length - 1; // assume the first line is the header
}
foreach (var line in textlinestoadd)
{
tab.AppendLine(line.ToString());
lineCount++;
if (lineCount > 0 && lineCount % 500 == 0)
{
if (!File.Exists(textcsvpath))
{
File.WriteAllText(textcsvpath, textheader);
}
File.AppendAllText(textcsvpath, tab.ToString());
tab.Clear();
textcsvpath = "some-new-file-name";
}
}
if (tab.Length > 0) // skip the final write when nothing is left to flush
{
if (!File.Exists(textcsvpath))
{
File.WriteAllText(textcsvpath, textheader);
}
File.AppendAllText(textcsvpath, tab.ToString());
}
You'll need to do something to determine the new file name as you add a new file.
I'd do something like this:
const int limit = 500;
int iteration = 0;
string textHeader = "Vendor\tDate\tInvoice\tPO\tTax\tTotal\tAcount\tType\tJobs\tClass" + Environment.NewLine;
while(iteration * limit < textLinesToAdd.Count())
{
string fullPath = Path.Combine(filePath, $"{fileName}.{iteration}.{extension}");
IEnumerable<string> linesToAdd = textLinesToAdd.Skip(iteration++ * limit).Take(limit);
// No File.Create call here: WriteAllText creates the file, and File.Create would leave an open handle on it
File.WriteAllText(fullPath, textHeader);
File.AppendAllLines(fullPath, linesToAdd);
}
Define that filename as foo and the extension as bar, and you'll get a sequence of files called foo.0.bar, foo.1.bar, foo.2.bar and so on.
I'm assuming we want to create a file with the specified name, and then have some integer placed between the name and extension that increments every time a new file is created.
One way to do this would be to have a method that takes in a filePath string, a list of lines to write, a header string, and the maximum number of lines allowed per file. Then it could parse the directory of the file path, looking for a pattern related to the file name.
It would determine what the latest file name should be based on the contents of the directory and the number of lines in the last file that matches our pattern, then would write to that file until it was full, and then continue creating new files until the lines were all written.
Here's a sample class that can do that, where I added some helper methods to get a file's number, increment that number in the name, get the latest file from a directory, and write lines to the file. It also implements IComparer<string> so that we can pass it to OrderByDescending to easily sort the files we're interested in.
public class FileWriterHelper : IComparer<string>
{
public int Compare(string x, string y)
{
// Compare null
if (x == null) return y == null ? 0 : 1;
if (y == null) return -1;
// Compare count of parts split on '.'
var xParts = x.Split('.');
var yParts = y.Split('.');
if (xParts.Length < 3) return yParts.Length < 3 ? 0 : -1;
if (yParts.Length < 3) return 1;
// Compare numeric portion
int xNum, yNum;
if (int.TryParse(xParts[1], out xNum) &&
int.TryParse(yParts[1], out yNum))
{
return xNum.CompareTo(yNum);
}
// Unknown values
return string.Compare(x, y, StringComparison.Ordinal);
}
private static int? GetFileNumber(string fileName)
{
if (string.IsNullOrWhiteSpace(fileName)) return null;
var fileParts = fileName.Split('.');
int fileNum;
if (fileParts.Length < 3 || !int.TryParse(fileParts[1], out fileNum)) return null;
return fileNum;
}
private static string IncrementNumber(string fileName)
{
var number = GetFileNumber(fileName).GetValueOrDefault() + 1;
var fileParts = fileName.Split('.');
return $"{fileParts[0]}.{number}.{fileParts[fileParts.Length - 1]}";
}
private static string GetLatestFile(string filePath, int maxLines)
{
var fileDir = Path.GetDirectoryName(filePath);
var fileName = Path.GetFileNameWithoutExtension(filePath);
var fileExt = Path.GetExtension(filePath);
var latest = Directory.GetFiles(fileDir, $"{fileName}*{fileExt}")
.OrderByDescending(f => f, new FileWriterHelper())
.FirstOrDefault() ?? filePath;
return File.Exists(latest) && File.ReadAllLines(latest).Length >= maxLines
? Path.Combine(fileDir, IncrementNumber(Path.GetFileName(latest)))
: latest;
}
public static void WriteLinesToFile(string filePath, string header,
List<string> lines, int maxFileLines)
{
while ((lines?.Count ?? 0) > 0 && maxFileLines > 0)
{
var latestFile = GetLatestFile(filePath, maxFileLines);
if (!File.Exists(latestFile)) File.CreateText(latestFile).Close();
var lineCount = File.ReadAllLines(latestFile).Length;
if (lineCount == 0 && header != null)
{
File.WriteAllText(latestFile, string.Concat(header, Environment.NewLine));
lineCount = 1;
}
var numLinesToWrite = maxFileLines - lineCount;
File.AppendAllLines(latestFile, lines.Take(numLinesToWrite));
lines = lines.Skip(numLinesToWrite).ToList();
}
}
}
That was a bit of work, but now to use it is really simple:
private static void Main()
{
// Generate 5000 lines to write
var fileLines = Enumerable.Range(0, 5000).Select(i => $"Line number {i}").ToList();
// File path with base file name
var filePath = @"f:\public\temp\temp.csv";
// This should create 10 files
FileWriterHelper.WriteLinesToFile(filePath,
"HEADER: This should be the first line in each file.", fileLines, 500);
Console.WriteLine("\nDone! Press any key to exit...");
Console.ReadKey();
}
If you run that once, it will create 10 files (because of the number of lines we're generating and the max number of lines per file we specified). And if you run it again, it will create 10 more, since we're using the same path and file name pattern, it recognizes the previous files that were in the location.
I'm sure it could use some work, but hopefully it's a start!

Save List<string> and List<double> to a .txt file's specific column

I have a C# WinForm application that has many List<string> and List<double>. I need to create a new .txt file and save each List<> to a specific column of the text file.
I tried the WriteAllLines function, but it writes only one List<>.
I also tried creating an Excel file so that I could specify which column each List<> is saved to, but I'm having a hard time saving the temporary Excel file as a .txt file.
I know this code can save an existing Excel file as a PDF, but a similar function that saves the workbook as a text file doesn't seem to exist.
NewExcelWorkBook.ExportAsFixedFormat(Excel.XlFixedFormatType.xlTypePDF, "Holdings in BE Import Format.txt", Excel.XlFixedFormatQuality.xlQualityStandard, true,
false, 1, 10, true);
Please tell me a way to write multiple List<> objects to specific columns of a .txt file, or tell me how to save the Excel file as a .txt file. Skipping the temporary Excel file would be ideal, but that solution is acceptable if writing to the .txt file directly is hard.
Thank you so much!
If you want your first column to be of fixed size of 20 characters, you can try the following:
List<string> stringList = new List<string>
{
"ABCDEF",
"DEF",
"GHIAAAAAAAAAAAAAA",
"SOMETHNG LONGER THAN 20 characters",
};
List<double> doubleList = new List<double>
{
1d,
2,
3,
4
};
List<string> combined = new List<string>();
int count = stringList.Count >= doubleList.Count ? stringList.Count : doubleList.Count;
for (int i = 0; i < count; i++)
{
string firstColumn = stringList.Count <= i ? "" : stringList[i];
string secondColumn = doubleList.Count <= i ? "" : doubleList[i].ToString();
if (firstColumn.Length > 20)
{
//truncate rest of the values
firstColumn = firstColumn.Substring(0, 20);
}
else
{
firstColumn = firstColumn + new string(' ', 20 - firstColumn.Length);
}
combined.Add(string.Format("{0} {1}", firstColumn, secondColumn));
}
File.WriteAllLines("yourFilePath.csv", combined);
The output file would look like this:
ABCDEF               1
DEF                  2
GHIAAAAAAAAAAAAAA    3
SOMETHNG LONGER THAN 4
List<string> list1 = new List<string>();
List<int> list2 = new List<int>();
//...
string separator = "\t";
using (StreamWriter writer = new StreamWriter(fileName)){
for (int i = 0; i<Math.Max(list1.Count, list2.Count); i++){
var element1 = i < list1.Count ? list1[i] : "";
var element2 = i < list2.Count ? list2[i].ToString() : "";
writer.Write(element1);
writer.Write(separator);
writer.WriteLine(element2);
}
}
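The fixed-width variant can also be written a little more compactly with PadRight. The sketch below assumes a 20-character first column and uses made-up names (BuildRows, names, values); it returns the rows instead of writing a file so that the result is easy to inspect, and the list can then be passed to File.WriteAllLines.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

var names = new List<string> { "ABCDEF", "DEF", "SOMETHNG LONGER THAN 20 characters" };
var values = new List<double> { 1d, 2.5 };

List<string> rows = BuildRows(names, values, 20);
foreach (string row in rows) Console.WriteLine(row);

static List<string> BuildRows(IReadOnlyList<string> first, IReadOnlyList<double> second, int width)
{
    var result = new List<string>();
    int count = Math.Max(first.Count, second.Count);
    for (int i = 0; i < count; i++)
    {
        // Missing entries become empty cells when the lists differ in length.
        string text = i < first.Count ? first[i] : "";
        if (text.Length > width) text = text.Substring(0, width); // truncate long values
        string number = i < second.Count
            ? second[i].ToString(CultureInfo.InvariantCulture)
            : "";
        // PadRight fills the first column with spaces up to the fixed width.
        result.Add(text.PadRight(width) + " " + number);
    }
    return result;
}
```

The invariant culture is used so that the decimal separator does not change with the machine's regional settings; whether that is wanted depends on who reads the file.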

Selecting entries according to running total

I would like to select from a list of files only so many files that their total size does not exceed a threshold (i.e. the amount of free space on the target drive).
I understand that I could do this by adding up file sizes in a loop until I hit the threshold and then use that number to select files from the list. However, is it possible to do that with a LINQ-query instead?
This could work (files is a List<FileInfo>):
var availableSpace = DriveInfo.GetDrives()
.First(d => d.Name == @"C:\").AvailableFreeSpace;
long usedSpace = 0;
var availableFiles = files
.TakeWhile(f => (usedSpace += f.Length) < availableSpace);
foreach (FileInfo file in availableFiles)
{
Console.WriteLine(file.Name);
}
You can achieve that by using a closure:
var directory = new DirectoryInfo(@"c:\temp");
var files = directory.GetFiles();
long maxTotalSize = 2000000;
long aggregatedSize = 0;
var result = files.TakeWhile(fileInfo =>
{
aggregatedSize += fileInfo.Length;
return aggregatedSize <= maxTotalSize;
});
There's a caveat, though: the variable aggregatedSize may get modified after you have left the scope where it was defined.
You could wrap that in an extension method, though; that keeps the captured variable local to the method:
public static IEnumerable<FileInfo> GetWithMaxAggregatedSize(this IEnumerable<FileInfo> files, long maxTotalSize)
{
long aggregatedSize = 0;
return files.TakeWhile(fileInfo =>
{
aggregatedSize += fileInfo.Length;
return aggregatedSize <= maxTotalSize;
});
}
You finally use the method like this:
var directory = new DirectoryInfo(@"c:\temp");
var files = directory.GetFiles().GetWithMaxAggregatedSize(2000000);
EDIT: I replaced the Where-method with the TakeWhile-method. The TakeWhile-extension will stop once the threshold has been reached, while the Where-extension will continue. Credits for bringing up the TakeWhile-extension go to Tim Schmelter.
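One more detail on the TakeWhile versions: the running total lives in a captured variable that is never reset, so enumerating the same result a second time continues from the previous total. A possible alternative, sketched below with a made-up helper name (TakeWhileTotalAtMost), is an iterator whose total is a local and therefore starts at zero on every enumeration. Plain numbers stand in for FileInfo.Length to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical file sizes in bytes, standing in for FileInfo.Length values.
long[] sizes = { 400, 300, 500, 100 };

IEnumerable<long> selected = TakeWhileTotalAtMost(sizes, s => s, 1000);
Console.WriteLine(string.Join(", ", selected)); // 400, 300
Console.WriteLine(selected.Count());            // 2 (a second enumeration gives the same items)

static IEnumerable<T> TakeWhileTotalAtMost<T>(IEnumerable<T> items, Func<T, long> sizeOf, long maxTotal)
{
    // The running total is a local of the iterator, so it starts at zero
    // every time the returned sequence is enumerated.
    long total = 0;
    foreach (T item in items)
    {
        total += sizeOf(item);
        if (total > maxTotal) yield break; // stop at the first item that does not fit
        yield return item;
    }
}
```

With a list of FileInfo objects the call would look like TakeWhileTotalAtMost(files, f => f.Length, availableSpace).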

Reading a file and mapping values

I found an implementation of a parallel coordinates application in c#. What I am trying to achieve is that I want to be able to read a CSV file and map the values and Labels onto the coordinates. The method mapping the values is assigning the values manually. Instead, I want those values to be read from the CSV file.
Here is the current method:
public void DataBind()
{
IList<DemoInfo> infos = new List<DemoInfo>();
for (int i = 0; i < ObjectsCount; i++)
{
var x = new DemoInfo();
x.X = m_Random.NextDouble() * 400 - 100;
x.Y = m_Random.NextDouble() * 500 - 100;
x.Z = m_Random.NextDouble() * 600 - 300;
x.V = m_Random.NextDouble() * 800 - 100;
x.K = 1.0;
//x.M = i % 2 == 0 ? 1.0 : -20.0;
x.M = i;
x.Tag = i + 1;
infos.Add(x);
}
var dataSource = new MultiDimensionalDataSource<DemoInfo>(infos, 6);
dataSource.MapDimension(0, info => info.X);
dataSource.MapDimension(1, info => info.Y);
dataSource.MapDimension(2, info => info.Z);
dataSource.MapDimension(3, info => info.V);
dataSource.MapDimension(4, info => info.K);
dataSource.MapDimension(5, info => info.M);
//dataSource.MapDimensionToOpacity(0, 0.5);
dataSource.MapTag(info => info.Tag);
dataSource.Labels[0] = "X";
dataSource.Labels[1] = "Y";
dataSource.Labels[2] = "Z";
dataSource.Labels[3] = "V";
dataSource.Labels[4] = "K";
dataSource.Labels[5] = "M";
dataSource.HelperAxisLabel = "Helper axis";
DataSource = dataSource;
}
Here is some of the data in the CSV File:
SWW Institutions Undergradutes Postgraduates
University College 2085 250
Metropolitan University 4715 1135
Would really appreciate your help !!
Thanks.
I am not sure how your CSV file maps to the DemoInfo class. Also, the example below is based on a CSV file, but your sample data looks like a TSV file. If it is a TSV file, just replace ',' with '\t'. Something else to watch out for is whether any strings contain your delimiter, such as a SWW Institutions value like "University, Madison".
You can open the file to read the lines of text and split the line based on your delimiter.
using (var sr = File.OpenText(path))
{
var line = string.Empty;
while ((line = sr.ReadLine()) != null)
{
var dataPoints = line.Split(',');
// Create Your Data Mappings Here
// dataPoints[0]...
}
}
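To make the mapping step a bit more concrete, here is a minimal sketch that assumes the file is tab-separated with one header row, as the sample data suggests. The helper name ParseRows and the tuple fields are invented for the example; how the parsed values should be assigned to DemoInfo properties and dimensions is not clear from the question, so that part is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical tab-separated content mirroring the sample rows in the question.
string[] lines =
{
    "SWW Institutions\tUndergraduates\tPostgraduates",
    "University College\t2085\t250",
    "Metropolitan University\t4715\t1135"
};

var rows = ParseRows(lines);
foreach (var row in rows)
    Console.WriteLine($"{row.Name}: {row.Undergraduates} / {row.Postgraduates}");

static List<(string Name, double Undergraduates, double Postgraduates)> ParseRows(
    IEnumerable<string> tsvLines)
{
    var result = new List<(string Name, double Undergraduates, double Postgraduates)>();
    bool isHeader = true;
    foreach (string line in tsvLines)
    {
        if (isHeader) { isHeader = false; continue; } // skip the label row
        string[] parts = line.Split('\t');
        if (parts.Length < 3) continue; // ignore malformed rows
        // Only keep rows where both numeric columns parse cleanly.
        if (double.TryParse(parts[1], NumberStyles.Float, CultureInfo.InvariantCulture, out double under) &&
            double.TryParse(parts[2], NumberStyles.Float, CultureInfo.InvariantCulture, out double post))
        {
            result.Add((parts[0], under, post));
        }
    }
    return result;
}
```

Reading from disk would only change the input, for example File.ReadLines(path) in place of the hard-coded array. The header labels could be kept and assigned to dataSource.Labels instead of being skipped.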
