I've been researching this and haven't found an answer yet.
Is it possible to insert data into, or delete data from, a file without rewriting the whole thing? I know there's File.AppendAllText(Path, "Content"); for appending, but what about deleting?
For example, say we have a file named "Things.CB" whose content is:
-1
-2
-3
-4
-5
-6
-7
-8
-9
-10
I want to delete 7 and 4.
I open the file with my program and read these numbers into a List<string>.
After calling RemoveAt() on the list, I have to serialize the data and save it again with a BinaryWriter or a StreamWriter.
So in this process we read the whole file, deserialize it, and then serialize it again just to write it back.
I want to know whether it's possible to just open the file, find the position of the text, delete/insert there, and save, without serializing or reading everything into lists/arrays/etc.
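Here is a minimal sketch of that read-into-a-list-and-rewrite process, using the example file above (names and values are just the ones from the example):

using System.Collections.Generic;
using System.IO;

class RemoveLinesExample
{
    static void Main()
    {
        // Read every line, drop the ones I no longer want,
        // then rewrite the whole file - exactly the round trip I'd like to avoid.
        var lines = new List<string>(File.ReadAllLines("Things.CB"));
        lines.RemoveAll(line => line == "-7" || line == "-4");
        File.WriteAllLines("Things.CB", lines);
    }
}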
Depending on your OS, if you are on FAT/NTFS you could use Microsoft API functions to play with the FAT/NTFS structure for a certain file.
Consider that you have three parts to your file...1,2,3. You want to delete Part 2. So you would manipulate the FAT so that the end of Part 1 now points to the start of Part 3 - effectively dropping Part 2 from the FAT and making it appear deleted. Then you have not moved any data, simply changed the various clusters and position markers for the file in the FAT.
You would use the same technique for inserting data... Simply adjust the 'pointers' stored in the FAT (file index) so that your new data is in the position you want in the file. Without moving any of your file contents.
These API functions are commonly used by defragmenting programs (use that term for Google searches) and have full access to the file structures, although I'm not entirely sure they are flexible enough to let you skirt around data you want to delete without moving the other file contents (they should be). Going to a lower level would require C/C++ and could become extremely dangerous (back up everything) and hardware specific. You can access the APIs from C#/VB.NET, although it is a bit tedious; something like VB6 would be surprisingly quicker for developing around the API functions, although it's clunky for general coding.
This will not work over networks, so will only work on drives physically managed by your OS. This may also not work if you want to delete very tiny bits of data, as the granularity of the FAT management functions may not go that small.
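To give an idea of the plumbing involved, here is a hedged sketch of the P/Invoke declarations a C# program would need just to reach that defragmentation API (the FSCTL_MOVE_FILE value and MOVE_FILE_DATA layout come from the Windows SDK; actually repointing clusters as described would take considerably more work, needs administrative rights, and risks corrupting the volume, so treat this purely as a starting point):

using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

static class DefragNative
{
    // Control code used by defragmenters to relocate a file's clusters.
    internal const uint FSCTL_MOVE_FILE = 0x00090074;

    // Input buffer for FSCTL_MOVE_FILE (layout from winioctl.h).
    [StructLayout(LayoutKind.Sequential)]
    internal struct MOVE_FILE_DATA
    {
        public IntPtr FileHandle;   // open handle to the file being moved
        public long StartingVcn;    // first virtual cluster to move
        public long StartingLcn;    // destination logical cluster on the volume
        public uint ClusterCount;   // how many clusters to move
    }

    // DeviceIoControl is the entry point for all FSCTL_* calls; it must be
    // issued against a volume handle opened with administrative rights.
    [DllImport("kernel32.dll", SetLastError = true)]
    internal static extern bool DeviceIoControl(
        SafeFileHandle hVolume,
        uint dwIoControlCode,
        ref MOVE_FILE_DATA lpInBuffer,
        int nInBufferSize,
        IntPtr lpOutBuffer,
        int nOutBufferSize,
        out uint lpBytesReturned,
        IntPtr lpOverlapped);
}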
Almost all file transfer software (NetSupport, Radmin, PcAnywhere, ...), as well as the various code I have used in my application, slows down badly when you send a lot of small files (< 1 KB each), such as a game folder that contains a huge number of files.
For example, on a LAN (Ethernet, Cat 5 cables), when I send a single file, say a video, the transfer rate is between 2 MB/s and 9 MB/s,
but when I send a game folder with a lot of files, the transfer rate is about 300-800 KB/s.
I guess that's because of the way a file is sent:
Send file info [file_path, file_size].
Send file bytes [loop until end of the file].
End transfer [ensure it was received completely].
But when you use the regular Windows copy-paste on a shared folder on the network, the transfer rate for a folder is always as fast as for a single file.
So I'm trying to develop a file transfer application using a WCF service (C# 4.0) that would use the maximum speed available on the LAN, and I'm thinking about this approach:
Get all files from the folder.
if (FileSize < 1 MB)
{
    Create additional thread to send;
    SendFile(FilePath);
}
else
{
    Wait for the large file to be sent; // FileSize > 1 MB
}
void SendFile(string path) // a regular single-file send
{
    SendFileInfo;
    Open socket and wait for the server application to connect;
    SendFileBytes;
    Dispose;
}
But I'm confused about using more than one socket for a file transfer, because that will use more ports and more time (the delay of listening and accepting).
So is it a good idea to do this?
I need an explanation of whether it's possible, how to do it, and whether there is a protocol better suited to this than TCP.
Thanks in advance.
It should be noted that you won't ever achieve 100% LAN speed usage - I hope you're not expecting that - there are too many factors involved.
In response to your comment as well, you can't reach the level the OS uses to transfer files, because you're a lot further away from the bare metal than Windows is. I believe file copying in Windows happens only a layer or two above the drivers themselves (possibly even within the filesystem driver) - in a WCF service you're a lot further away!
The simplest thing for you to do will be to package multiple files into archives and transmit them that way, then at the receiving end unpack the complete package into the target folder. Sure, some of those files might already be compressed and so won't benefit - but in general you should see a big improvement. For rock-solid compression that preserves directory structure, I'd consider using SharpZipLib.
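As a minimal sketch of that packaging step with SharpZipLib's FastZip helper (paths are placeholders, and the exact overloads may differ slightly between SharpZipLib versions):

using ICSharpCode.SharpZipLib.Zip;

class PackAndSend
{
    static void Main()
    {
        // Pack the whole source folder (recursively, preserving structure)
        // into one archive, then transmit that single file instead of
        // thousands of tiny ones.
        var zipper = new FastZip();
        zipper.CreateZip(@"C:\temp\outgoing.zip",  // archive to create
                         @"C:\games\MyGame",       // folder to pack
                         true,                     // recurse into sub-folders
                         null);                    // no filter - include everything

        // ...transmit C:\temp\outgoing.zip, then on the receiving side:
        // new FastZip().ExtractZip(@"C:\temp\incoming.zip", @"C:\games\MyGame", null);
    }
}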
A system that uses compression intelligently (probably medium-level, low CPU usage but which will work well on 'compressible' files) might match or possibly outperform OS copying. Windows doesn't use this method because it's hopeless for fault-tolerance. In the OS, a transfer halted half way through a file will still leave any successful files in place. If the transfer itself is compressed and interrupted, everything is lost and has to be started again.
Beyond that, you can consider the following:
Get it working using compression by default first, before trying any enhancements. In some cases (depending on the size and number of files) you might be able to simply compress the whole folder and transmit it in one go. Beyond a certain size, however, this might take too long, so you'll want to create a series of smaller zips.
Write the compressed file to a temporary location on disk as it's being received; don't buffer the whole thing in memory. Delete the file once you've unpacked it into the target folder.
Consider adding the ability to mark certain file types as able to be sent 'naked', i.e. uncompressed. That way you can exclude .zip, .avi, etc. files from the compression process. That said, a folder with a million 1 KB zip files will clearly benefit from being packed into one single archive - so perhaps give yourself the ability to set a minimum size below which even those files will still be packed into a compressed folder (or perhaps a file count/size-on-disk ratio for the folder itself, including sub-folders); a small sketch of that decision follows this list.
Beyond this advice you will need to play around to get the best results.
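Here is that sketch - a rough 'send naked or pack it' check (the extension list and the 1 MB threshold are purely illustrative):

using System;
using System.IO;
using System.Linq;

static class TransferPolicy
{
    // Already-compressed formats that usually gain nothing from zipping.
    static readonly string[] AlreadyCompressed = { ".zip", ".7z", ".avi", ".mp4", ".jpg" };

    // Files below this size are cheaper to batch into an archive than to
    // send one by one, even if their format is already compressed.
    const long MinNakedSize = 1024 * 1024; // 1 MB, illustrative

    public static bool SendNaked(FileInfo file)
    {
        bool compressedType = AlreadyCompressed.Contains(file.Extension.ToLowerInvariant());
        return compressedType && file.Length >= MinNakedSize;
    }
}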
Perhaps an easy solution would be to gather all the files together into one big stream (like zipping them, but just appending, to keep it fast) and send that single stream. This would give more speed, but it will use some CPU on both devices, and you need a good way to separate the files again in the stream.
Using more ports would, as far as I know, only be a disadvantage, since the different streams would compete with each other and the speed would go down.
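One simple way to separate the files again on the other side is to length-prefix each one: write the relative path and the byte count, then the raw bytes. A rough sketch of the sending side (the receiver just reads the same fields back in order; names here are illustrative):

using System.IO;
using System.Text;

static class Framing
{
    // Appends every file under 'folder' to one outgoing stream as
    // [relative path][length][raw bytes], so the receiver can split them apart.
    public static void WriteAll(string folder, Stream output)
    {
        // Writer is intentionally not disposed so the underlying stream stays open.
        var writer = new BinaryWriter(output, Encoding.UTF8);

        foreach (var path in Directory.GetFiles(folder, "*", SearchOption.AllDirectories))
        {
            string relative = path.Substring(folder.Length).TrimStart(Path.DirectorySeparatorChar);

            writer.Write(relative);                  // length-prefixed string
            writer.Write(new FileInfo(path).Length); // 8-byte payload size
            writer.Flush();

            using (var source = File.OpenRead(path))
                source.CopyTo(output);               // raw file bytes
        }
        writer.Flush();
    }
}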
All,
I'm making a training kit whose content was given to us as 2 VOB files that my software needs to automatically merge into 1. We'll be getting up to 10-15 VOB files from this vendor, and our requirement is to end up with a single file.
Is merging these files as easy as opening byte streams and combining them?
Thanks!
If the specifications of the files match, it should be possible to use the header from the first file and copy the remaining files, minus their headers, into one file. But the specifications need to match exactly on everything from encoding type and parameters to the number of audio channels.
If so, then all you need to do is read all the files and skip the first xxx bytes of every file except the first one (see the sketch below).
It won't work if the VOB-files are encrypted (DVD encryption).
Note: This is a job specialized tools do well. They are optimized and (more or less) bug free. So if you can, use them (i.e. from the command line).
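If the streams really do match, the concatenation described above might look like this (headerSize is deliberately left as a parameter; how many bytes to skip depends on the actual container and is not something this sketch can know):

using System.IO;

static class VobMerge
{
    // Copies the first file whole, then each remaining file minus its
    // first 'headerSize' bytes, into a single output file.
    public static void Merge(string[] inputs, string outputPath, int headerSize)
    {
        using (var output = File.Create(outputPath))
        {
            for (int i = 0; i < inputs.Length; i++)
            {
                using (var input = File.OpenRead(inputs[i]))
                {
                    if (i > 0)
                        input.Seek(headerSize, SeekOrigin.Begin); // skip the duplicate header
                    input.CopyTo(output);
                }
            }
        }
    }
}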
No, it is not a simple merge. Otherwise the old DOS command type 1.VOB 2.VOB > Final.VOB would have done the job.
Unless this is a learning exercise, just use any VOB-merging tool to merge the two.
A lot of this is probably going to depend on whether the VOB files have the same resolution and bit rate, and whether a lot of other encoding parameters are the same. If they use the exact same encoding parameters, simply concatenating the files will probably work. My experience with DVDs is that files from a DVD work fine when this is done. However, my first guess is that it wouldn't work if there were any format differences between the files.
When the Xbox 360 console formats a 1 GB USB device, it adds 978 MB of data to it in just 20 seconds. I can see the files on the USB stick and they really are that size.
When I copy a file of the same size in Windows, it takes 6 minutes.
Maybe it is because Windows reads and writes, but the 360 just writes?
Is there a way to create large files like that on a USB device with that kind of performance? The files can be blank, of course. I need this write performance for my application.
Most of the command-line tools I have tried have not shown any noticeable performance gains.
It would appear that the 360 is allocating space for the file and writing some data to it, but is otherwise leaving the rest of the file filled with whatever data was there originally (so-called "garbage data"). When you copy a file of the same size to the drive, you are writing all 978 MB of it, which is a different scenario and is why it takes so much longer.
Most likely the 360 is not sending 978 MB of data to the USB stick, but is instead creating an empty file of size 978 MB - yours takes longer because rather than simply sending a few KB to alter the file system information, you are actually sending 978 MB of data to the device.
You can do something similar (create an empty file of fixed size) on Windows with fsutil or the Sysinternals "contig" tool: see Quickly create large file on a windows system? - try this, and you'll see that it can take much less than 20 seconds (I would guess that the 360 is sending some data as well as reserving space for more). Note that one of the answers shows how to use the Windows API to do the same thing, as well as a Python script.
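For the simple 'blank file of a given size' case, a sketch using nothing more than FileStream.SetLength - this only asks the file system to record the new length rather than writing 978 MB of payload, although how fast it completes still depends on the file system and the stick:

using System.IO;

class Preallocate
{
    static void Main()
    {
        // Create a 978 MB file by extending its length; no payload data is written here.
        const long size = 978L * 1024 * 1024;
        using (var fs = new FileStream(@"E:\placeholder.bin", FileMode.Create,
                                       FileAccess.Write, FileShare.None))
        {
            fs.SetLength(size);
        }
    }
}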
Could it be that the 360 is just doing some direct filesystem header manipulation? If a blank file is fine for you maybe you could try that?
It all depends on the throughput of the USB drive. You will need a high-end USB stick, such as one from this list.
In a project I am doing I want to give users the option of 'securely' deleting a file - as in, overwriting it with random bits or 0's. Is there an easy-ish way of doing this in C#.NET? And how effective would it be?
You could invoke sysinternals SDelete to do this for you. This uses the defragmentation API to handle all those tricky edge cases.
Using the defragmentation API, SDelete can determine precisely which clusters on a disk are occupied by data belonging to compressed, sparse and encrypted files.
If you want to repackage that logic in a more convenient form, the API is described here.
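A rough sketch of shelling out to SDelete from C# (this assumes sdelete.exe is on the PATH; -p sets the number of overwrite passes, and on first run you may also need the Sysinternals -accepteula switch):

using System.Diagnostics;

static class SecureDelete
{
    // Runs Sysinternals SDelete against a single file and waits for it to finish.
    public static int Run(string path, int passes)
    {
        var psi = new ProcessStartInfo
        {
            FileName = "sdelete.exe",
            Arguments = string.Format("-p {0} \"{1}\"", passes, path),
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (var process = Process.Start(psi))
        {
            process.WaitForExit();
            return process.ExitCode; // 0 normally indicates success
        }
    }
}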
You can't securely delete a file on a journaling filesystem. The only non-journaling filesystem still in heavy use is FAT32. On any other system, the only way to securely delete is to shred the entire hard drive.
EDIT
The reason secure delete doesn't work is that the data used to overwrite a file might not be stored in the same location as the data it is overwriting.
It seems Microsoft does provide a secure delete tool, but it does not appear to be something that you can use as a drop in replacement.
The only good way to prevent recovery of deleted files, short of shredding the disk, would be to encrypt the file before it is written to disk.
It wouldn't be secure at all. Instead you may wish to look at alternative solutions like encryption.
One solution would be to encrypt the contents of the data file. A new key would be used each time the file is updated. When you want to "securely delete" the data simply "lose" the encryption key and delete the file. The file will still be on the disk physically but without the encryption key recovery would be impossible.
Here is a more detailed explanation of why "secure" overwriting of files is poor security:
Without a low-level tool (outside the .NET runtime) you have no access to the physical disk location. Take a FileStream on NTFS: when you "open a file for write access", you have no guarantee that the "updated" copy (in this case the random-101010 version) will be stored in the same place, thus overwriting the original file. In fact, most of the time this is what happens:
1) File x.dat is stored starting at cluster 8493489
2) You open file x.dat for write access. What the OS returns to you is merely a pointer to a file stream abstracted not just by the OS but by the underlying file system and device drivers (hardware RAID, for example) and sometimes the physical disk itself (SSD). You update the contents of the file with random 1s and 0s and close the FileStream.
3) The OS may (and likely will) write the new data to another cluster (say cluster 4384939). It will then merely update the MFT, indicating that file x is now stored at cluster 4384939.
To the end user it looks like only one copy of the file exists and it now has random data in it however the original data still exists on the disk.
Instead you should consider encrypting the contents of the file with a different key each time the file is saved. When the user wants the file "deleted", delete the key and the file. The physical file may remain, but without the encryption key recovery would be impossible.
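A minimal sketch of that idea using the framework's AES support (key handling is deliberately simplified here; in practice the per-save key would live in a secure store, and "losing" it is what makes the data unrecoverable):

using System;
using System.IO;
using System.Security.Cryptography;

static class CryptoShred
{
    // Encrypts plaintext to 'path' with a freshly generated key/IV and hands the
    // key material back to the caller. Destroying that key later is what makes
    // the file unrecoverable.
    public static byte[] SaveEncrypted(string path, byte[] plaintext, out byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var file = File.Create(path))
        using (var crypto = new CryptoStream(file, aes.CreateEncryptor(), CryptoStreamMode.Write))
        {
            crypto.Write(plaintext, 0, plaintext.Length);
            iv = aes.IV;
            return aes.Key;
        }
    }

    // "Secure delete": forget the key, then remove the file normally.
    // Any ciphertext left on disk (or recovered later) is useless without it.
    public static void Shred(string path, byte[] key, byte[] iv)
    {
        Array.Clear(key, 0, key.Length);
        Array.Clear(iv, 0, iv.Length);
        File.Delete(path);
    }
}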
Gutmann erasing implementation
I'd first try simply opening the file and overwriting its contents as I would normally do it - pretty trivial in C# (a rough sketch follows below). However, I don't know how secure that would be. For one thing, I'm quite certain it would not work on flash drives and SSDs, which use sophisticated wear-leveling algorithms. I don't know what would work there; perhaps it would need to be done at the driver level, perhaps it would be impossible at all. On normal drives I just don't know what Windows would do. Perhaps it would retain the old data as well.
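Roughly, that trivial version would look like this (and, as said, on SSDs and wear-leveled flash it very likely never touches the original physical blocks):

using System;
using System.IO;
using System.Security.Cryptography;

static class NaiveOverwrite
{
    // Overwrites the file's contents with random bytes, flushes, then deletes it.
    public static void OverwriteAndDelete(string path)
    {
        long length = new FileInfo(path).Length;
        var buffer = new byte[64 * 1024];

        using (var rng = RandomNumberGenerator.Create())
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Write, FileShare.None))
        {
            long written = 0;
            while (written < length)
            {
                rng.GetBytes(buffer);
                int chunk = (int)Math.Min(buffer.Length, length - written);
                fs.Write(buffer, 0, chunk);
                written += chunk;
            }
            fs.Flush(true); // ask the OS to push the data to the device
        }
        File.Delete(path);
    }
}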
In my application, the user selects a big file (>100 MB) on their drive. I want the program to take the selected file and chop it up into archived parts that are 100 MB or less. How can this be done? What libraries and file format should I use? Could you give me some sample code? After the first 100 MB archived part is created, I am going to upload it to a server, then upload the next 100 MB part, and so on until the upload is finished. After that, from another computer, I will download all these archived parts and join them back into the original file. Is this possible with the 7-Zip libraries, for example? Thanks!
UPDATE: From the first answer, I think I'm going to use SevenZipSharp, and I believe I now understand how to split a file into 100 MB archived parts, but I still have two questions:
Is it possible to create the first 100 MB archived part and upload it before creating the next 100 MB part?
How do you extract a file with SevenZipSharp from a multi-part (split) archive?
UPDATE #2: I was just playing around with the 7-Zip GUI, creating multi-volume/split archives, and I found that selecting the first one and extracting from it will extract the whole file from all of the split parts. This leads me to believe that the paths to the subsequent parts are included in the first one (or that it simply looks for consecutively numbered files). However, I'm not sure whether this would work directly from the console, but I will try that now and see if it solves question #2 from the first update.
Take a look at SevenZipSharp; you can use it to create your split 7z files, do whatever you want to upload them, then extract them on the server side.
To split the archive, look at the SevenZipCompressor.CustomParameters member, passing in "v100m". (You can find more parameters in the 7-zip.chm help file that ships with 7-Zip.)
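A rough outline of what that looks like (treat this as a sketch rather than verified code: the CustomParameters name/value split is how the library usually accepts 7-Zip switches, but the exact syntax - some versions expose a VolumeSize property instead - and the SetLibraryPath step vary between SevenZipSharp versions):

using SevenZip;

class SplitArchive
{
    static void Main()
    {
        // Point SevenZipSharp at the native 7-Zip library first (path is illustrative;
        // the exact helper for this differs between versions of the wrapper).
        SevenZipBase.SetLibraryPath(@"C:\Program Files\7-Zip\7z.dll");

        var compressor = new SevenZipCompressor();
        compressor.ArchiveFormat = OutArchiveFormat.SevenZip;

        // "v100m" is 7-Zip's volume switch: split the archive into 100 MB parts.
        compressor.CustomParameters.Add("v", "100m");

        compressor.CompressFiles(@"C:\temp\big.7z", @"C:\data\bigfile.bin");
        // Expect big.7z.001, big.7z.002, ... which can then be uploaded one by one.
    }
}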
You can split the data into 100MB "packets" first, and then pass each packet into the compressor in turn, pretending that they are just separate files.
However, this sort of compression is usually stream-based. As long as the library you are using will do its I/O via a Stream-derived class, it would be pretty simple to implement your own Stream that "packetises" the data any way you like on the fly - as data is passed into your Write() method you write it to a file. When you exceed 100MB in that file, you simply close that file and open a new one, and continue writing.
Either of these approaches would allow you to easily upload one "packet" while continuing to compress the next.
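Here's a rough sketch of such a "packetising" stream: everything written to it is spooled into numbered part files, rolling over to a new part whenever the current one reaches the limit (file naming and the 100 MB default are just illustrative):

using System;
using System.IO;

// Write-only stream that splits whatever is written into fixed-size part files:
// output.part000, output.part001, ... each at most 'partSize' bytes.
class SplittingStream : Stream
{
    private readonly string _baseName;
    private readonly long _partSize;
    private FileStream _current;
    private int _partIndex;

    public SplittingStream(string baseName, long partSize = 100L * 1024 * 1024)
    {
        _baseName = baseName;
        _partSize = partSize;
        OpenNextPart();
    }

    private void OpenNextPart()
    {
        _current?.Dispose();
        _current = File.Create($"{_baseName}.part{_partIndex++:D3}");
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            if (_current.Length >= _partSize)
                OpenNextPart(); // current part is full; roll over to the next one

            int space = (int)Math.Min(count, _partSize - _current.Length);
            _current.Write(buffer, offset, space);
            offset += space;
            count -= space;
        }
    }

    public override void Flush() => _current.Flush();

    protected override void Dispose(bool disposing)
    {
        if (disposing) _current?.Dispose();
        base.Dispose(disposing);
    }

    // The compressor only needs a writable stream; everything else is unsupported.
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}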
edit
Just to be clear - Decompression is just the reverse sequence of the above, so once you've got the compression code working, decompression will be easy.