I want to write a method that opens a file (or creates it if it does not exist) from different threads.
What should the FileAccess and FileShare flags be in this case? I tried both FileAccess.Read/Write and FileShare.Read/Write but don't see any difference. I used the following code to test; it looks fine, but I'm not sure about the flags (the last two arguments).
Can anybody clarify whether I should use FileAccess.ReadWrite or FileAccess.Read, and FileShare.ReadWrite or FileShare.Read?
class Program
{
    static void Main(string[] args)
    {
        Task first = Task.Run(() => AccessFile());
        Task second = Task.Run(() => AccessFile());
        Task third = Task.Run(() => AccessFile());
        Task fourth = Task.Run(() => AccessFile());
        Task.WaitAll(first, second, third, fourth);

        Task[] tasks = new Task[100];
        for (int i = 0; i < 100; i++)
        {
            tasks[i] = Task.Run(() => AccessFile());
        }
        Task.WaitAll(tasks);
    }

    public static void AccessFile()
    {
        string path = @"c:\temp\test.txt";
        // FileShare.Write gives access violation
        using (FileStream fs = File.Open(path, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite))
        {
            byte[] by = new byte[100];
            fs.Read(by, 0, 100);
            Console.WriteLine(Encoding.ASCII.GetString(by));
        }
    }
}
If multiple threads need to write to the file, then you need an exclusive lock on it. If some threads read while others write, you could use a ReaderWriterLockSlim.
The FileShare parameter indicates how other threads/processes may access the file while the current handle is open. ReadWrite means other threads will be able to both read from and write to the file, which means that if you also write through the current handle you will probably corrupt the contents.
Since you are only reading, you should specify just that (FileAccess.Read). And since you are OK with others reading at the same time, say that too (FileShare.Read).
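A minimal sketch of AccessFile with the suggested flags (same body as above, only the last two arguments change):

public static void AccessFile()
{
    string path = @"c:\temp\test.txt";
    using (FileStream fs = File.Open(path, FileMode.OpenOrCreate, FileAccess.Read, FileShare.Read))
    {
        byte[] by = new byte[100];
        int read = fs.Read(by, 0, by.Length); // capture how many bytes were actually read
        Console.WriteLine(Encoding.ASCII.GetString(by, 0, read));
    }
}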
And here's some nicer test code:
ParallelEnumerable.Range(0, 100000).ForAll(_ => AccessFile());
I started a small project for fun, and I liked it so much that I started to expand it. I needed a simple file explorer (like the Windows one) just to see and open files, but now that I'm expanding it I have a problem with multiple copies. I want to copy some files from directory A and paste them into B, and then, while that is still running, copy some files from directory C and paste them into D. If the copy A -> B is in progress, the copy C -> D should be paused, and when the copy A -> B finishes, the second copy can start. For now, I can copy files from A to B. Is there anything not too complex I can try?
I'm using a new form to display the progress bar, file name, file count, and size when starting a copy, and I'm using a BackgroundWorker.
I am assuming you are just calling File.Move or File.Copy; those don't give you the ability to pause the actual operation, so you will have to write your own Move/Copy operations.
E.g., to copy the file you could do the following:
public void CopyFile(string sourceFileName, string destFileName, bool overwrite)
{
    var outputFileMode = overwrite ? FileMode.Create : FileMode.CreateNew;
    using (var inputStream = new FileStream(sourceFileName, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (var outputStream = new FileStream(destFileName, outputFileMode, FileAccess.Write, FileShare.None))
    {
        const int bufferSize = 16384; // 16 KB
        var buffer = new byte[bufferSize];
        int bytesRead;
        do
        {
            bytesRead = inputStream.Read(buffer, 0, bufferSize);
            outputStream.Write(buffer, 0, bytesRead);
        } while (bytesRead == bufferSize);
    }
}
Now that you have this code, you can simply add a while loop with a condition to "pause" the copy, e.g.:
while (_pause)
{
    Thread.Sleep(100);
}
This while loop would go inside the do loop from the code above.
Here is the complete idea:
public void CopyFile(string sourceFileName, string destFileName, bool overwrite)
{
    var outputFileMode = overwrite ? FileMode.Create : FileMode.CreateNew;
    using (var inputStream = new FileStream(sourceFileName, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (var outputStream = new FileStream(destFileName, outputFileMode, FileAccess.Write, FileShare.None))
    {
        const int bufferSize = 16384; // 16 KB
        var buffer = new byte[bufferSize];
        int bytesRead;
        do
        {
            // spin here until _pause becomes false again
            while (_pause)
            {
                Thread.Sleep(100);
            }
            bytesRead = inputStream.Read(buffer, 0, bufferSize);
            outputStream.Write(buffer, 0, bytesRead);
        } while (bytesRead == bufferSize);
    }
}
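For completeness, a minimal sketch of the _pause field the loop relies on (the field name comes from the code above; marking it volatile and the two helper method names are my assumptions):

// volatile (assumption): so the background copy loop reliably sees
// changes made from the UI thread
private volatile bool _pause;

public void PauseCopy() => _pause = true;   // hypothetical helper
public void ResumeCopy() => _pause = false; // hypothetical helper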
You could use objects to represent different operations, something like
public interface IFileOperation
{
    // note: the progress callback needs to accept a value, so this is
    // Action<double>; the usage below passes p => CurrentProgress = p
    void Execute(Action<double> reportProgress, CancellationToken cancel);
}
You could then create a queue of multiple operations, and create a task on another thread to process each item
private CancellationTokenSource cts = new CancellationTokenSource();

public double CurrentProgress { get; private set; }

public void CancelCurrentOperation() => cts.Cancel();

public BlockingCollection<IFileOperation> Queue = new BlockingCollection<IFileOperation>(new ConcurrentQueue<IFileOperation>());

public void RunOnWorkerThread()
{
    foreach (var op in Queue.GetConsumingEnumerable())
    {
        cts = new CancellationTokenSource();
        CurrentProgress = 0;
        op.Execute(p => CurrentProgress = p, cts.Token);
    }
}
This will run file operations one at a time on a background thread, while allowing new operations to be added from the main thread. To report progress you would need a non-modal progress bar, i.e. instead of showing a dialog you should add a progress bar control somewhere in your UI; otherwise you would not be able to add new operations without cancelling the current one. You will also need some way to connect the progress bar to the currently running operation, for example by running a timer on the main thread that updates the property the progress bar is bound to. You can run the method either as a LongRunning task or on a dedicated thread.
You could, if you wish, add pause/resume methods to the file operation; the answer by Rand Random shows how to copy files manually, so I will skip that here (see the sketch below for what an operation might look like). You could also create a UI that shows a list of all queued file operations and allows removing queued tasks. You could even, with some more work, run multiple operations in parallel and show separate progress for each one.
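For illustration, a hypothetical IFileOperation implementation that wraps the manual copy loop from the previous answer (all names here are illustrative, not from the question):

public class CopyFileOperation : IFileOperation
{
    private readonly string _source;
    private readonly string _dest;

    public CopyFileOperation(string source, string dest)
    {
        _source = source;
        _dest = dest;
    }

    public void Execute(Action<double> reportProgress, CancellationToken cancel)
    {
        var totalBytes = new FileInfo(_source).Length;
        long copied = 0;
        using (var input = new FileStream(_source, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var output = new FileStream(_dest, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            var buffer = new byte[16384];
            int bytesRead;
            while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                cancel.ThrowIfCancellationRequested(); // honor cancellation between chunks
                output.Write(buffer, 0, bytesRead);
                copied += bytesRead;
                reportProgress(totalBytes == 0 ? 1.0 : (double)copied / totalBytes);
            }
        }
    }
}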
I want to write lines to a StreamWriter asynchronously, but I don't want to await each write:
for (int i = 0; i < 1000; i++)
{
    sw.WriteLineAsync(i.ToString());
}
But I got an error saying that I invoked WriteLineAsync while a previous call was still in progress.
What can I do to fix that?
I also want to close this StreamWriter after the loop.
How can I make sure I don't close it before all the data has been written? Or will all the data sent with WriteLineAsync be written before the stream closes?
If you want to write the lines asynchronously, you should use await; otherwise the file might be corrupted and you might run into stream errors like "the stream is already in use". In short, you should synchronize the write actions.
So, I provided an example:
private async Task WriteToFileAsAsync()
{
    string file = @"sample.txt";
    using (FileStream stream = new FileStream(file, FileMode.Create, FileAccess.ReadWrite))
    {
        using (StreamWriter streamWriter = new StreamWriter(stream))
        {
            for (int i = 0; i < 1000; i++)
            {
                await streamWriter.WriteLineAsync(i.ToString());
            }
        }
    }
}
Also, the StreamWriter is closed and disposed automatically by the using blocks.
EDIT
If you want to perform the write action separately from the main thread, don't use the async methods; just create a separate Task and let it run on another thread.
private void WriteToFile()
{
    string file = @"sample.txt";
    using (FileStream stream = new FileStream(file, FileMode.Create, FileAccess.ReadWrite))
    {
        using (StreamWriter streamWriter = new StreamWriter(stream))
        {
            for (int i = 0; i < 1000; i++)
            {
                streamWriter.WriteLine(i.ToString());
            }
        }
    }
}
Then call it like this:
Task.Factory.StartNew(WriteToFile);
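If you also need to know when all the data has been written and the file closed (the last part of the question), keep the returned Task and wait on it; a minimal sketch:

Task writeTask = Task.Factory.StartNew(WriteToFile);
// ... do other work on the main thread ...
writeTask.Wait(); // once this returns, the using blocks have flushed and closed the file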
In your particular context there are two main issues with doing what you would like to do.
Technically speaking, if you call an async method, you need to await it sooner or later; hence you can collect the tasks and await them later on. However, the WriteLineAsync method is not atomic, so calling it while performing other operations on the stream can corrupt the stream itself.
If you don't want to await, then don't call the Async method.
If you want the stream closed after use, wrap it in a using block:
using (StreamWriter writer = new StreamWriter("temp.txt"))
{
    for (int i = 0; i < 1000; i++)
    {
        writer.WriteLine(i.ToString());
    }
}
If you want it to be asynchronous, then you have to await the Async version; otherwise you risk corrupting the file.
I'm using FileSystemWatcher to catch every created, changed, deleted, and renamed change for any file in a folder.
On these changes I need to compute a simple checksum of the contents of each file. Simply put, I'm opening a FileStream and passing it to the MD5 class:
private byte[] calculateChecksum(string frl)
{
    using (FileStream stream = File.Open(frl, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        return this.md5.ComputeHash(stream);
    }
}
The problem is the number of files I need to handle. For example, imagine 200 files were created over time in a folder, and then I copy all of them and paste them into the same folder. That action will cause 200 events and 200 calculateChecksum() runs.
How could I solve this kind of problem?
In the FileSystemWatcher handler, put tasks into a queue that is processed by some worker. The worker can process checksum-calculation tasks at a targeted speed and/or frequency. One worker will probably be better, because many readers can slow down the HDD with many read seeks.
Try reading about BlockingCollection:
https://msdn.microsoft.com/ru-ru/library/dd997371(v=vs.110).aspx
and the Producer-Consumer Dataflow pattern:
https://msdn.microsoft.com/ru-ru/library/hh228601(v=vs.110).aspx
var workerCount = 2;
BlockingCollection<String>[] filesQueues = new BlockingCollection<String>[workerCount];

for (int i = 0; i < workerCount; i++)
{
    filesQueues[i] = new BlockingCollection<String>(500);

    int index = i; // copy the loop variable for the closure below

    // Worker
    Task.Run(() =>
    {
        while (!filesQueues[index].IsCompleted)
        {
            string url = null;
            try
            {
                url = filesQueues[index].Take();
            }
            catch (InvalidOperationException) { } // thrown when the queue is completed

            if (!string.IsNullOrWhiteSpace(url))
            {
                calculateChecksum(url);
            }
        }
    });
}

// inside of FileSystemWatcher handler
var queueIndex = hash(fileName) % workerCount;
// Warning!!
// Add blocks if the queue already holds its BoundedCapacity (500) items
filesQueues[queueIndex].Add(fileName);
// call CompleteAdding() on each queue only when shutting down, not after every Add:
// filesQueues[queueIndex].CompleteAdding();
Also you can make multiple consumers: just call Take or TryTake concurrently - each item will only be consumed by a single consumer. But take into account that in that case the same file can end up being processed by different workers over time, and multiple HDD readers can slow down the HDD.
UPD: in the case of multiple workers, it is better to make multiple BlockingCollections and push each file into the queue chosen by index, as the code above does.
I've sketched a producer-consumer pattern to solve this, and I've tried to use a thread pool to smooth out the big amount of work, sharing a BlockingCollection.
BlockingCollection & ThreadPool:
private BlockingCollection<Index.ResourceIndexDocument> documents;

this.pool = new SmartThreadPool(SmartThreadPool.DefaultIdleTimeout, 4);
this.documents = new BlockingCollection<Index.ResourceIndexDocument>();
As you can see, I've created a thread pool with concurrency set to 4. So only 4 threads will work at the same time, regardless of whether there are x > 4 work units to handle in the pool.
Producer:
public void warn(string channel, string frl)
{
    this.pool.QueueWorkItem<string, string>(
        (file) => this.files.Add(file),
        channel,
        frl
    );
}
Consumer:
Task.Factory.StartNew(() =>
{
    Index.ResourceIndexDocument document = null;
    while (this.documents.TryTake(out document, TimeSpan.FromSeconds(1)))
    {
        IEnumerable<Index.ResourceIndexDocument> documents = this.documents.Take(this.documents.Count);
        Index.IndexEngine.Instance.index(documents);
    }
},
TaskCreationOptions.LongRunning);
I have many files to download, so I'm trying to use the power of the new async features, as below.
var streamTasks = urls.Select(async url => (await WebRequest.CreateHttp(url).GetResponseAsync()).GetResponseStream()).ToList();
var streams = await Task.WhenAll(streamTasks);

foreach (var stream in streams)
{
    using (var fileStream = new FileStream("blabla", FileMode.Create))
    {
        await stream.CopyToAsync(fileStream);
    }
}
What I am afraid of with this code is big memory usage: if there are 1000 files of 2 MB each, will this code load 1000 * 2 MB of streams into memory?
I may be missing something, or I may be totally right. If I'm not missing anything, is awaiting every request and consuming its stream one at a time the best approach?
Both options can be problematic. Downloading only one file at a time doesn't scale and takes time, while downloading all files at once could be too much of a load (and there is no need to wait for all of them to download before you process them).
I prefer to always cap such an operation with a configurable size. A simple way to do so is to use an AsyncLock (which utilizes SemaphoreSlim); a sketch of that approach follows. A more robust way is to use TPL Dataflow with a MaxDegreeOfParallelism, shown after it.
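A minimal sketch of the SemaphoreSlim approach (DownloadFileAsync is a hypothetical helper standing in for the download-and-save logic):

var throttler = new SemaphoreSlim(10); // allow at most 10 concurrent downloads
var downloadTasks = urls.Select(async url =>
{
    await throttler.WaitAsync();
    try
    {
        await DownloadFileAsync(url); // assumption: downloads and saves one file
    }
    finally
    {
        throttler.Release();
    }
}).ToList();
await Task.WhenAll(downloadTasks);

And the Dataflow version: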
var block = new ActionBlock<string>(async url => // the lambda must be async, since it awaits
{
    var stream = (await WebRequest.CreateHttp(url).GetResponseAsync()).GetResponseStream();
    using (var fileStream = new FileStream("blabla", FileMode.Create))
    {
        await stream.CopyToAsync(fileStream);
    }
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 100 });
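Feeding the block and waiting for it to drain would look something like this (a usage sketch, assuming urls is the collection from the question):

foreach (var url in urls)
{
    block.Post(url);
}

block.Complete();
await block.Completion;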
Your code will load the streams into memory whether you use async or not. Async only handles the I/O part, by returning control to the caller until the response stream arrives.
The choice you have to make doesn't concern async, but rather how your program should read a big stream of input.
If I were you, I would think about how to split the workload into chunks. You might read the response streams in parallel batches, save each stream to a different destination (for example a file), and release it from memory before moving on.
This is my own answer: it implements the chunking idea from Yuval Itzchakov. Please provide feedback on this implementation.
foreach (var chunk in urls.Batch(5))
{
    var streamTasks = chunk
        .Select(async url => await WebRequest.CreateHttp(url).GetResponseAsync())
        .Select(async responseTask => (await responseTask).GetResponseStream());

    var streams = await Task.WhenAll(streamTasks);

    foreach (var stream in streams)
    {
        // note: the response streams themselves are not disposed here
        using (var fileStream = new FileStream("blabla", FileMode.Create))
        {
            await stream.CopyToAsync(fileStream);
        }
    }
}
Batch is an extension method, as simple as this:
public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int chunksize)
{
    // note: this re-enumerates the source once per chunk, which is fine
    // for small collections but quadratic for large ones
    while (source.Any())
    {
        yield return source.Take(chunksize);
        source = source.Skip(chunksize);
    }
}
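If that re-enumeration matters for your input size, a single-pass variant is possible; a sketch (my own alternative, not part of the original answer):

public static IEnumerable<IEnumerable<T>> BatchSinglePass<T>(this IEnumerable<T> source, int chunksize)
{
    var chunk = new List<T>(chunksize);
    foreach (var item in source)
    {
        chunk.Add(item);
        if (chunk.Count == chunksize)
        {
            yield return chunk;
            chunk = new List<T>(chunksize); // fresh buffer for the next chunk
        }
    }
    if (chunk.Count > 0)
        yield return chunk; // emit the final partial chunk
}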
I have the following code that reads from a file through the first stream, does some interpretation of the content, and writes the result to a second file. The problem I'm facing is that with a big file the WPF GUI freezes. I tried to put the reading and writing actions in:
Application.Current.Dispatcher.BeginInvoke(DispatcherPriority.Normal, new Action(() =>
{
    // Here
}));
That is, around the following code:
using (StreamReader streamReader = new StreamReader(File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)))
using (StreamWriter streamWriter = new StreamWriter(File.Open("Compressed_" + splitFilePath[splitFilePath.Length - 1], FileMode.Create, FileAccess.Write, FileShare.ReadWrite)))
{
    // Here are the interpretations of the code
    while ((dataSize = streamReader.ReadBlock(buffer, 0, BufferSize)) > 0)
    {
        streamWriter.Write(.....);
    }
}
Can anyone help me?
Thanks
You need to move the reading and writing onto a background thread if you want to avoid blocking the UI.
This can be done via Task.Factory.StartNew:
var task = Task.Factory.StartNew(() =>
{
    using (StreamReader streamReader //.. Your code
});
This will, by default, run on a ThreadPool thread. If you need to update your user interface when this completes, you can use a continuation on the UI thread:
task.ContinueWith(t =>
{
    // Update UI here
}, TaskScheduler.FromCurrentSynchronizationContext());
You need to understand that even with BeginInvoke, your code is executed on the SAME UI dispatcher thread, thus freezing your GUI. Try using tasks to run your logic in the background.
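A minimal sketch of that idea (DoHeavyFileWork and progressBar are illustrative names, not from the question):

Task.Run(() =>
{
    // the reading/writing loop from the question goes here, off the UI thread
    DoHeavyFileWork();

    // any UI update must be marshalled back to the dispatcher:
    Application.Current.Dispatcher.Invoke(() => progressBar.Value = 100);
});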