I am creating a file encryption application using the Advanced Encryption Standard (AES) algorithm, but I find that my application is very slow, especially when encrypting a large amount of data. For example, with an 80 MB file, 30 minutes had already passed and my application was still not done encrypting it.
I am using ECB (Electronic Code Book) mode in my encryption algorithm.
How can I speed up my application when encrypting a large amount of data? I did some research and found this: http://en.wikipedia.org/wiki/Speedup, but I'm not sure it is the answer to my problem. Or would it be effective if I used a BackgroundWorker? By the way, I am using Visual Studio 2008 to develop my project.
Here is my code for encrypting a file:
private void cmdEncrypt_Click(object sender, EventArgs e)
{
AESECB aes = new AESECB();
if (txtFileSource.Text == "")
{
MessageBox.Show("Please Select a File", "Message", MessageBoxButtons.OK, MessageBoxIcon.Warning);
return;
}
if (txtSecretKey.Text == "")
{
MessageBox.Show("Please Enter the password", "Message", MessageBoxButtons.OK, MessageBoxIcon.Warning);
return;
}
// Create the FileInfo only after the path has been validated; an empty path
// would make the FileInfo constructor throw before the message box is shown.
FileInfo fInfo = new FileInfo(txtFileSource.Text);
if (!fInfo.Exists)
{
MessageBox.Show("File Not Found!", "Message", MessageBoxButtons.OK, MessageBoxIcon.Warning);
return;
}
byte[] bytePadding = aes.filePadding(txtFileSource.Text);
byte[] fByte = aes.getFileByte(txtFileSource.Text);
int farrLength = (bytePadding.Length + fByte.Length);
byte[] newFbyte = new byte[farrLength];
byte[] encryptedFByte = new byte[farrLength];
int counterBytePadding = 0;
for (int i = 0; i < farrLength; i++)
{
if (i < fByte.Length)
{
newFbyte[i] = fByte[i];
}
else
{
newFbyte[i] = bytePadding[counterBytePadding];
counterBytePadding++;
}
}
int plainFileBlock = newFbyte.Length / 16;
progressBar1.Maximum = plainFileBlock-1;
progressBar1.Visible = true;
int counter = 0;
int counter2 = 0;
for (int j = 0; j < plainFileBlock; j++)
{
byte[] encfbyte = aes.fileEncrypt(txtSecretKey.Text, newFbyte, counter);
for (int k = 0; k < 16; k++)
{
encryptedFByte[counter2] = encfbyte[k];
counter2++;
}
progressBar1.Value = j;
counter = counter + 16;
}
progressBar1.Visible = false;
int bytesToRead = encryptedFByte.Length;
string newPath = txtFileSource.Text + ".aesenc";
using (FileStream newFile = new FileStream(newPath, FileMode.Create, FileAccess.Write))
{
newFile.Write(encryptedFByte, 0, bytesToRead);
}
MessageBox.Show("Encryption Done!", "Message", MessageBoxButtons.OK, MessageBoxIcon.Information);
}
It's all very well being "confident in your codes", but all good programmers measure.
This is simple, do this:
using System.Diagnostics;
var startTime = DateTime.Now;
//some code here
Debug.WriteLine("This took {0}", DateTime.Now.Subtract(startTime));
Then look at your output window in VS (View->Output).
By wrapping different parts of the method with these two lines you will identify the slow bit.
My suspicions are on the loop where you copy 80 MB byte by byte. Try Array.Resize here.
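For scale, here is a minimal sketch of the bulk-copy alternative, reusing fByte and bytePadding from the question (Buffer.BlockCopy performs one native copy per call instead of tens of millions of element assignments):
byte[] newFbyte = new byte[fByte.Length + bytePadding.Length];
// Two bulk copies replace the per-byte padding loop entirely.
Buffer.BlockCopy(fByte, 0, newFbyte, 0, fByte.Length);
Buffer.BlockCopy(bytePadding, 0, newFbyte, fByte.Length, bytePadding.Length);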
As suggested by weston, do measure your code in order to identify what is slow. That said, it is probably the file-encryption for loop that is slowing you down. If so, you can definitely speed up the process by using more than one CPU. Look up the parallel constructs in C#: "Parallel.For".
Here is a simple example:
http://www.dotnetcurry.com/ShowArticle.aspx?ID=608
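And here is a minimal sketch applied to the encryption loop from the question. It assumes aes.fileEncrypt is thread-safe (the AESECB internals aren't shown in the question, so that needs checking) and that, as in the posted code, it encrypts the 16-byte block starting at the given offset:
using System.Threading.Tasks;
// In ECB mode every 16-byte block is independent, so blocks can be
// encrypted concurrently and written straight to their final positions.
string key = txtSecretKey.Text; // read the control once, on the UI thread
Parallel.For(0, plainFileBlock, j =>
{
    byte[] encfbyte = aes.fileEncrypt(key, newFbyte, j * 16);
    Buffer.BlockCopy(encfbyte, 0, encryptedFByte, j * 16, 16);
});
Note that the progress bar can no longer be updated directly from inside this loop; UI updates would have to be marshalled back to the UI thread.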
I'm using this, slightly modified, to copy large files from a file share with the ability to continue copying if the download was disrupted. It runs in a BackgroundWorker and reports progress. This works fine, but I'd like the ability to write the current MD5 hash to disk (the running total, not one per block) each time a block of file data is written to disk, WITH MINIMAL ADDITIONAL OVERHEAD. If a partial file is discovered, I'd like to read the MD5 hash from the file, and if it is identical to that of the partial file, continue copying. When the file has been copied completely, the MD5 hash in the file should be that of the completely copied file. I'd like to use it later to determine that the files in source and destination are identical. Thanks for any help!
This is my current copy method:
public static bool CopyFile(List<CopyObjects> FileList, FSObjToCopy job, BackgroundWorker BW)
{
Stopwatch sw = new Stopwatch();
long RestartPosition = 0;
bool Retry = false;
int BYTES_TO_READ = (0x200000);
foreach (CopyObjects co in FileList)
{
FileInfo fi = co.file;
FileInfo fo = null;
if (fi.Directory.FullName.StartsWith($@"{Test_Updater_Core.ServerName}\{Test_Updater_Core.ServerTemplateRoot}"))
{
if (File.Exists(fi.FullName.Replace($@"{Test_Updater_Core.ServerName}\{Test_Updater_Core.ServerTemplateRoot}", $@"{Test_Updater_Core.USBStore_Drive.driveInfo.Name.Replace("\\", "")}\{Test_Updater_Core.UsbTemplateRoot}")))
{
fi = new FileInfo(fi.FullName.Replace($@"{Test_Updater_Core.ServerName}\{Test_Updater_Core.ServerTemplateRoot}", $@"{Test_Updater_Core.USBStore_Drive.driveInfo.Name.Replace("\\", "")}\{Test_Updater_Core.UsbTemplateRoot}"));
co.destination = co.destination.Replace($@"{Test_Updater_Core.USBStore_Drive.driveInfo.Name.Replace("\\", "")}\{Test_Updater_Core.UsbTemplateRoot}", $@"{Test_Updater_Core.LocalInstallDrive}\{Test_Updater_Core.LocalTemplateRoot}");
fo = new FileInfo($"{fi.FullName.Replace($@"{Test_Updater_Core.USBStore_Drive.driveInfo.Name.Replace("\\", "")}\{Test_Updater_Core.UsbTemplateRoot}", $@"{Test_Updater_Core.LocalInstallDrive}\{Test_Updater_Core.LocalTemplateRoot}")}{Test_Updater_Core.TempFileExtension}");
}
}
}
//If a clean cancellation was requested, we do it here, otherwise the BackgroundWorker will be killed
if (BW.CancellationPending)
{
job.Status = FSObjToCopy._Status.Complete;
return false;
}
//If a pause is requested, we loop here until resume or termination has been signaled
while (job.PauseBackgroundWorker == true)
{
Thread.Sleep(100);
if (BW.CancellationPending)
{
job.Status = FSObjToCopy._Status.Complete;
return false;
}
Application.DoEvents();
}
if (fo == null)
fo = new FileInfo($"{fi.FullName.Replace(job.Source, co.destination)}{Test_Updater_Core.TempFileExtension}");
if (fo.Exists)
{
Retry = true;
RestartPosition = fo.Length - BYTES_TO_READ;
}
else
{
RestartPosition = 0;
Retry = false;
}
if (RestartPosition <= 0)
{
Retry = false;
}
sw.Start();
try
{
// Read source files into file streams
FileStream source = new FileStream(fi.FullName, FileMode.Open, FileAccess.Read);
// Additional way to write to file stream
FileStream dest = new FileStream(fo.FullName, FileMode.OpenOrCreate, FileAccess.Write);
// Actual read file length
int destLength = 0;
// If the length of each read is less than the length of the source file, read in chunks
if (BYTES_TO_READ < source.Length)
{
byte[] buffer = new byte[BYTES_TO_READ];
long copied = 0;
if (Retry)
{
source.Seek(RestartPosition, SeekOrigin.Begin);
dest.Seek(RestartPosition, SeekOrigin.Begin);
copied = RestartPosition; // keep the byte counter in sync with the seek, or the tail copy miscounts
Retry = false;
}
while (copied <= source.Length - BYTES_TO_READ)
{
destLength = source.Read(buffer, 0, BYTES_TO_READ);
source.Flush();
dest.Write(buffer, 0, BYTES_TO_READ);
dest.Flush();
// Current position of flow
dest.Position = source.Position;
copied += BYTES_TO_READ;
job.CopiedSoFar += BYTES_TO_READ;
if (sw.ElapsedMilliseconds > 250)
{
job.PercComplete = (int)(float)((float)job.CopiedSoFar / (float)job.TotalFileSize * 100);
sw.Restart();
sw.Start();
job.ProgressCell.Value = job.PercComplete;
BW.ReportProgress(job.PercComplete < 100 ? job.PercComplete : 99);
}
if (BW.CancellationPending)
{
job.Status = FSObjToCopy._Status.Complete;
return false;
}
while (job.PauseBackgroundWorker == true)
{
Thread.Sleep(100);
if (BW.CancellationPending)
{
job.Status = FSObjToCopy._Status.Complete;
return false;
}
Application.DoEvents();
}
}
int left = (int)(source.Length - copied);
destLength = source.Read(buffer, 0, left);
source.Flush();
dest.Write(buffer, 0, left);
dest.Flush();
job.CopiedSoFar += left;
}
else
{
// If the file length of each copy is longer than that of the source file, the actual file length is copied directly.
byte[] buffer = new byte[source.Length];
source.Read(buffer, 0, buffer.Length);
source.Flush();
dest.Write(buffer, 0, buffer.Length);
dest.Flush();
job.CopiedSoFar += source.Length;
job.PercComplete = (int)(float)((float)job.CopiedSoFar / (float)job.TotalFileSize * 100);
job.ProgressCell.Value = job.PercComplete;
BW.ReportProgress(job.PercComplete < 100 ? job.PercComplete : 99);
}
source.Close();
dest.Close();
fo.LastWriteTimeUtc = fi.LastWriteTimeUtc;
if (File.Exists(fo.FullName))
{
if (File.Exists(fo.FullName.Replace($"{Test_Updater_Core.TempFileExtension}", "")))
{
File.Delete(fo.FullName.Replace($"{Test_Updater_Core.TempFileExtension}", ""));
}
File.Move(fo.FullName, fo.FullName.Replace($"{Test_Updater_Core.TempFileExtension}", ""));
}
job.ProgressCell.Value = job.PercComplete;
BW.ReportProgress(job.PercComplete);
}
catch (Exception ex)
{
MessageBox.Show($"There was an error copying:{Environment.NewLine}{fi}{Environment.NewLine}to:" +
$"{Environment.NewLine}{fo}{Environment.NewLine}The error is: {Environment.NewLine}{ex.Message}",
"Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
job.Status = FSObjToCopy._Status.Error;
return false;
}
finally
{
sw.Stop();
}
}
return true;
}
I decided to create checksum files on the server, each containing a series of checksums. As I copy the file, I add the checksums to an internal list and compare them against the server list. If at some point they do not match, I go back to the point where they were identical and resume from there. At the end of the copy job, I write the checksums from the internal list to disk, under the same name as on the server. If I later want to check the integrity of a file, I can compare the server file to the local file and verify the checksums.
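A minimal sketch of that per-block bookkeeping, assuming one MD5 checksum per copied block and a sidecar file named after the original (localChecksums and the .md5list extension are illustrative, not from the original code):
using System.Security.Cryptography;
// One hasher and one list for the whole job.
var md5 = MD5.Create();
var localChecksums = new List<string>();
// Inside the copy loop, right after dest.Write(buffer, 0, destLength):
localChecksums.Add(BitConverter.ToString(md5.ComputeHash(buffer, 0, destLength)));
// After the copy completes, persist the list next to the finished file.
File.WriteAllLines(fo.FullName + ".md5list", localChecksums);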
I am wondering why the load time for the file is so long. I would appreciate it if you would take the time to look where it says
if (ReadType == 1)
where around 12,000 items are loaded.
It takes nearly 12 seconds to load a file with such a short structure, which I don't think is right. I'm new to C# and could use any pointers. Attached below are the code and the file structure.
(A video of the issue and a screenshot of the loaded file structure were attached to the original post.)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace StringEditor
{
public class ItemStr
{
public int a_index;
public byte[] a_name { get; set; }
public byte[] a_descr1 { get; set; }
}
}
private void tsbOpen_Click(object sender, EventArgs e)
{
OpenFileDialog ofd = new OpenFileDialog();
ofd.Filter = "String|*.lod";
if (ofd.ShowDialog() != DialogResult.OK)
return;
if (!ofd.FileName.Contains("strItem") && !ofd.FileName.Contains("strSkill")) //check to see if user isn't opening the right files if not return;
return;
else if (ofd.FileName.Contains("strItem"))
ReadType = 1;
else if (ofd.FileName.Contains("strSkill"))
ReadType = 2;
FileStream fs = new FileStream(ofd.FileName, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
if (ReadType == 1)
{
int max = br.ReadInt32();
int max1 = br.ReadInt32();
for (int i = 0; br.BaseStream.Position < br.BaseStream.Length; i++)
{
ItemStr itemstr = new ItemStr();
itemstr.a_index = br.ReadInt32();
itemstr.a_name = br.ReadBytes(br.ReadInt32());
itemstr.a_descr1 = br.ReadBytes(br.ReadInt32());
itemStringList.Add(itemstr);
listBox1.Items.Add(itemstr.a_index.ToString() + " - " + Encoding.GetEncoding(ISO).GetString(itemstr.a_name));
}
EnableFields();
}
// Note: the streams must stay open here; the ReadType == 2 branch below still reads from them.
if (ReadType == 2)
{
int max = br.ReadInt32();
int max1 = br.ReadInt32();
for (int i = 0; i < max; i++)
{
skillStr skillStr = new skillStr();
skillStr.a_index = br.ReadInt32();
skillStr.a_name = br.ReadString();
skillStr.a_tool_tip = br.ReadString();
skillStr.a_descr1 = br.ReadString();
skillStringList.Add(skillStr);
string test = skillStr.a_index + "- " + skillStr.a_name;
listBox1.Items.Add(test);
}
EnableFields();
}
fs.Close();
br.Close();
}
I wrote a small test on my core i5 machine. New form, one button, one listbox:
private void button1_Click(object sender, EventArgs e)
{
for (int i = 0; i < 30000; i++)
listBox1.Items.Add(i.ToString());
}
(I wrote it guessing at the index numbers in your screenshot). Clicked go. Had to wait 11 seconds before the UI became usable again.
I modified it to this:
private void button1_Click(object sender, EventArgs e)
{
listBox1.BeginUpdate();
for (int i = 0; i < 30000; i++)
listBox1.Items.Add(i.ToString());
listBox1.EndUpdate();
}
And there was a barely perceptible delay before it was usable again.
The majority of the problem isn't reading the file; it's having the listbox refresh itself thousands of times as you add items one by one. Use BeginUpdate/EndUpdate to signal that you're loading a large number of items...
...but then again, ask yourself what a user is REALLY going to do with tens of thousands of items in a listbox. As a UI/UX guideline, avoid loading more than about 20 to 30 items into a list; beyond that it becomes unnavigable, especially at the quantities you're loading. Consider a type-to-search box instead (see the sketch below): a one-pixel jump of the scroll bar is going to move through more items than can fit vertically in your list!
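By way of illustration, a sketch of that type-to-search idea, reusing itemStringList, ItemStr, and the ISO encoding constant from the question (txtSearch is a hypothetical TextBox):
private void txtSearch_TextChanged(object sender, EventArgs e)
{
    // Rebuild the list with only the matching rows, repainting once.
    listBox1.BeginUpdate();
    listBox1.Items.Clear();
    foreach (ItemStr item in itemStringList)
    {
        string text = item.a_index + " - " + Encoding.GetEncoding(ISO).GetString(item.a_name);
        if (text.IndexOf(txtSearch.Text, StringComparison.OrdinalIgnoreCase) >= 0)
            listBox1.Items.Add(text);
    }
    listBox1.EndUpdate();
}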
If you're loading a lot of data from a file (or anywhere) into a list box, consider using the VirtualList approach - an example can be found here:
ListView.VirtualMode Property
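A minimal sketch of the virtual approach, assuming a ListView in place of the ListBox (the WinForms ListBox has no virtual mode) and reusing itemStringList and ISO from the question:
// Rows are materialized on demand as they scroll into view,
// so load time no longer grows with the item count.
listView1.VirtualMode = true;
listView1.VirtualListSize = itemStringList.Count;
listView1.RetrieveVirtualItem += (s, e) =>
{
    ItemStr item = itemStringList[e.ItemIndex];
    e.Item = new ListViewItem(item.a_index + " - " + Encoding.GetEncoding(ISO).GetString(item.a_name));
};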
You probably also want to consider performing the load in a background thread so that the user doesn't experience the apparent "hanging" delay that loading a lot of data can produce.
ListBox.BeginUpdate() and ListBox.EndUpdate() fixed my problem. Thanks for the help, guys.
I am trying to do the following:
Read file contents into byte array
convert byte array into Base64 String
find all sequences of repeating characters that are longer than 8 in length
place the found repeating patterns in a list
Here is where I am currently having some issues... I am reading a 1 MB file using this loop:
void bkg_DoWork(object sender, DoWorkEventArgs e)
{
try
{
Byte[] bytes = File.ReadAllBytes(this.txt_Filename.Text);
string file = Convert.ToBase64String(bytes);
char lastchar = '\0';
int count = 0;
List<RepeatingPattern> patterns = new List<RepeatingPattern>();
this.Invoke((MethodInvoker)delegate
{
this.pb_Progress.Maximum = file.Length;
this.pb_Progress.Value = 0;
this.lbl_Progress.Text = "Progress: Read file contents read... Looking for patterns! 0% Done...";
});
for (int i = 0; i < file.Length; i++)
{
this.Invoke((MethodInvoker)delegate
{
this.pb_Progress.Value += 1;
this.lbl_Progress.Text = "Progress: Looking for patterns! " + (int)Decimal.Truncate((decimal)((double)i / file.Length) * 100) + "% Done...";
});
if (file[i] == lastchar)
    count += 1;
else
{
    // Create and add a pattern to the list if the run is longer than its
    // compressed form would be to save space... 8 chars, e.g. [$a,#$]
    if (count > 8)
    {
        RepeatingPattern ptn = new RepeatingPattern(lastchar, count);
        if (!patterns.Contains(ptn))
            patterns.Add(ptn);
    }
    count = 1; // the current character starts a new run of length 1
    lastchar = file[i];
}
}
// Flush the final run; otherwise a repeat that reaches the end of the string is lost.
if (count > 8)
{
    RepeatingPattern lastRun = new RepeatingPattern(lastchar, count);
    if (!patterns.Contains(lastRun))
        patterns.Add(lastRun);
}
e.Result = patterns;
}
catch (Exception ex)
{
e.Result = ex;
}
}
However, when using this loop I find that the process is VERY slow; the 1 MB file above takes about a minute to get through. In this day and age, that feels like a long time for such a small file. Is there a more efficient way to do what I want and find the repeating patterns?
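One observation, offered as a sketch rather than a definitive answer: the Invoke at the top of the loop marshals to the UI thread once per character, well over a million times for 1 MB of Base64, and that alone can account for most of the minute. Throttling the UI updates while leaving the pattern detection untouched would look something like this:
for (int i = 0; i < file.Length; i++)
{
    // Report progress at most once every 10,000 characters.
    if (i % 10000 == 0)
    {
        int progress = i; // stable copy for the delegate
        this.Invoke((MethodInvoker)delegate
        {
            this.pb_Progress.Value = progress;
            this.lbl_Progress.Text = "Progress: Looking for patterns! " + (100L * progress / file.Length) + "% Done...";
        });
    }
    // ... pattern detection exactly as in the question ...
}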
While looking at memory-mapped files in C#, I had some difficulty identifying how to search a file quickly, forward and in reverse. My goal is to rewrite the following function in that language, but I could find nothing like the find and rfind methods used below. Is there a way in C# to quickly search a memory-mapped file for a particular substring?
#! /usr/bin/env python3
import mmap
import pathlib
# noinspection PyUnboundLocalVariable
def drop_last_line(path):
with path.open('r+b') as file:
with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as search:
for next_line in b'\r\n', b'\r', b'\n':
if search.find(next_line) >= 0:
break
else:
raise ValueError('cannot find any line delimiters')
end_1st = search.rfind(next_line)
end_2nd = search.rfind(next_line, 0, end_1st - 1)
file.truncate(0 if end_2nd < 0 else end_2nd + len(next_line))
Do you know of any way to memory-map an entire file in C# and then treat it as a byte array?
Yes, it's quite easy to map an entire file into a view and then read it into a single byte array, as the following code shows:
static void Main(string[] args)
{
var sourceFile = new FileInfo(@"C:\Users\Micky\Downloads\20180112.zip");
int length = (int)sourceFile.Length; // length of target file
// Create the memory-mapped file.
using (var mmf = MemoryMappedFile.CreateFromFile(sourceFile.FullName,
FileMode.Open,
"ImgA"))
{
var buffer = new byte[length]; // allocate a buffer with the same size as the file
using (var accessor = mmf.CreateViewAccessor())
{
var read = accessor.ReadArray(0, buffer, 0, length); // read the whole thing
}
// let's try searching for a known byte sequence. Change this to suit your file
var target = new byte[] {71, 213, 62, 204,231};
var foundAt = IndexOf(buffer, target);
}
}
I couldn't seem to find any byte-searching method in Marshal or Array, but you can use this search algorithm, courtesy of Social MSDN, as a start:
private static int IndexOf2(byte[] input, byte[] pattern)
{
    byte firstByte = pattern[0];
    int index = -1;
    // Check every occurrence of the first byte, not just the first one.
    while ((index = Array.IndexOf(input, firstByte, index + 1)) >= 0)
    {
        bool found = true;
        for (int i = 1; i < pattern.Length; i++)
        {
            if (index + i >= input.Length || pattern[i] != input[index + i])
            {
                found = false;
                break;
            }
        }
        if (found)
            return index;
    }
    return -1;
}
...or even this more verbose example (also courtesy of Social MSDN, same link):
public static int IndexOf(byte[] arrayToSearchThrough, byte[] patternToFind)
{
if (patternToFind.Length > arrayToSearchThrough.Length)
return -1;
for (int i = 0; i <= arrayToSearchThrough.Length - patternToFind.Length; i++) // <= so a match at the very end is still found
{
bool found = true;
for (int j = 0; j < patternToFind.Length; j++)
{
if (arrayToSearchThrough[i + j] != patternToFind[j])
{
found = false;
break;
}
}
if (found)
{
return i;
}
}
return -1;
}
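There is no built-in reverse counterpart for byte arrays either, but the scan is symmetric; here is a sketch in the same style as the example above:
public static int LastIndexOf(byte[] arrayToSearchThrough, byte[] patternToFind)
{
    // Walk backwards from the last position where the pattern could still fit.
    for (int i = arrayToSearchThrough.Length - patternToFind.Length; i >= 0; i--)
    {
        bool found = true;
        for (int j = 0; j < patternToFind.Length; j++)
        {
            if (arrayToSearchThrough[i + j] != patternToFind[j])
            {
                found = false;
                break;
            }
        }
        if (found)
        {
            return i;
        }
    }
    return -1;
}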
The user specifies a filename and a block size. The original file is split into blocks of the user's block size (except the last block). For each block, a SHA-256 hash is calculated and written to the console.
The program has two threads: the first reads the original file and puts each block's byte array into a queue; the second removes the byte arrays from the queue and calculates their hashes.
After the first iteration, the memory is not released until the program completes.
On later iterations memory is allocated and released normally.
So, during the next read of a part array, I get an OutOfMemoryException.
How can I manage memory correctly to avoid the leak?
class Encryption
{
static FileInfo originalFile;
static long partSize = 0;
static long lastPartSize = 0;
static long numParts = 0;
static int lastPartNumber = 0;
static string[] hash;
static Queue<byte[]> partQueue = new Queue<byte[]>();
public Encryption(string _filename, long _partSize)
{
try
{
originalFile = new FileInfo(_filename);
partSize = _partSize;
numParts = originalFile.Length / partSize;
lastPartSize = originalFile.Length % partSize;
if (lastPartSize != 0)
{
numParts++;
}
else if (lastPartSize == 0)
{
lastPartSize = partSize;
}
lastPartNumber = (int)numParts - 1;
hash = new string[numParts];
}
catch (FileNotFoundException fe)
{
Console.WriteLine("Error: {0}\nStackTrace: {1}", fe.Message, fe.StackTrace);
return;
}
catch (Exception e)
{
Console.WriteLine("Error: {0}\nStackTrace: {1}", fe.Message, fe.StackTrace);
}
}
private void readFromFile()
{
try
{
using (FileStream fs = new FileStream(originalFile.FullName, FileMode.Open, FileAccess.Read))
{
for (int i = 0; i < numParts; i++)
{
long len = 0;
if (i == lastPartNumber)
{
len = lastPartSize;
}
else
{
len = partSize;
}
byte[] part = new byte[len];
fs.Read(part, 0, (int)len);
partQueue.Enqueue(part);
part = null;
}
}
}
catch(Exception e)
{
Console.WriteLine("Error: {0}\nStackTrace: {1}", fe.Message, fe.StackTrace);
}
}
private static void hashToArray()
{
try
{
SHA256Managed sha256HashString = new SHA256Managed();
int numPart = 0;
while (numPart < numParts)
{
long len = 0;
if (numPart == lastPartNumber)
{
len = lastPartSize;
}
else
{
len = partSize;
}
// Format the digest bytes; calling ToString() on a byte[] would just yield "System.Byte[]".
hash[numPart] = BitConverter.ToString(sha256HashString.ComputeHash(partQueue.Dequeue()));
numPart++;
}
}
catch (Exception e)
{
Console.WriteLine("Error: {0}\nStackTrace: {1}", fe.Message, fe.StackTrace);
}
}
private void hashWrite()
{
try
{
Console.WriteLine("\nResult:\n");
for (int i = 0; i < numParts; i++)
{
Console.WriteLine("{0} : {1}", i, hash[i]);
}
}
catch(Exception e)
{
Console.WriteLine("Error: {0}\nStackTrace: {1}", fe.Message, fe.StackTrace);
}
}
public void threadsControl()
{
try
{
Thread readingThread = new Thread(readFromFile);
Thread calculateThread = new Thread(hashToArray);
readingThread.Start();
calculateThread.Start();
readingThread.Join();
calculateThread.Join();
hashWrite();
}
catch (Exception e)
{
Console.WriteLine("Error: {0}\nStackTrace: {1}", fe.Message, fe.StackTrace);
}
}
}
You should read some books about .NET internals before writing code like this. Your picture of the .NET memory model is wrong, and that is why you are getting this error. An OutOfMemoryException occurs very rarely if you are careful with your resources, especially when dealing with arrays.
You should know that the .NET runtime has two heaps for reference objects, the basic one and the Large Object Heap, and the most important difference between them is that the LOH is not compacted even after garbage collection.
You should know that arrays of 85,000 bytes or more, which your file-part buffers almost certainly are, go straight to the LOH, so memory is consumed very quickly. Also, you should know that this line:
part = null;
does not free the memory immediately. Even worse, this line does nothing at all, because the queue still holds a reference to the part of the file you've read. This is why your memory runs out. You could try to fix it by forcing the GC after each hash computation, but that solution is highly discouraged.
You should rewrite your algorithm (which is a very simple case of the producer/consumer pattern) so that it never stores the whole file contents in memory at once. This is quite easy: move the part variable out to a static field and read each file part into it in turn, introduce an EventWaitHandle (or one of its child classes) in place of the queue, and compute each hash right after reading the corresponding part of the file.
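For illustration, here is a sketch of that bounded producer/consumer handshake. It uses BlockingCollection rather than the EventWaitHandle suggested above (a different primitive, same effect): the reader blocks once it gets a few blocks ahead of the hasher, so memory stays at a handful of blocks instead of the whole file.
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;
class BlockHasher
{
    static void Main(string[] args)
    {
        string path = args[0];
        int partSize = int.Parse(args[1]);
        // Bounded capacity is the whole point: Add blocks while 4 parts are pending.
        var queue = new BlockingCollection<byte[]>(boundedCapacity: 4);
        var reader = Task.Run(() =>
        {
            using (var fs = File.OpenRead(path))
            {
                var buffer = new byte[partSize];
                int read;
                while ((read = fs.Read(buffer, 0, partSize)) > 0)
                {
                    var part = new byte[read];
                    Array.Copy(buffer, part, read);
                    queue.Add(part); // blocks while the queue is full
                }
            }
            queue.CompleteAdding();
        });
        using (var sha256 = SHA256.Create())
        {
            int i = 0;
            foreach (var part in queue.GetConsumingEnumerable())
                Console.WriteLine("{0} : {1}", i++, BitConverter.ToString(sha256.ComputeHash(part)));
        }
        reader.Wait();
    }
}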
I recommend you start with the basics of threading in C# by reading the great series by Joe Albahari, and only after that try to implement solutions like this. Good luck with your projects.