Advice on appropriate data structure - C#

Background
I have two pieces of data:
machineNumber, which is just an ID for a machine.
eventString, which is an entry in a log.
The same log entry can occur multiple times on one machine and can occur on multiple machines. For example:
machineNumber    eventString
1                LogExample1
2                LogExample1
1                LogExample1
4                LogExample3
3                LogExample2
What I want to do is store this data temporarily in some sort of data structure so I can format it into the following: eventString, NumberOfMachinesAffected, TotalNumberOfInstances, before storing it as a CSV file.
With the above example it would be formatted like LogExample1, 2, 3.
Problem
I'm wondering if someone can recommend an efficient method to store the data before formatting it. I need to be able to iterate over it to count the total number of occurrences and the total number of machines affected for each eventString.
Requested Code
I was asked to include the code. I don't think it pertains to the problem as it is purely a design question.
namespace ConfigLogAnalyser
{
/// <summary>
/// Interaction logic for MainWindow.xaml
/// </summary>
public partial class MainWindow : Window
{
public String fileName;
public MainWindow()
{
InitializeComponent();
}
private void MenuItem_Click(object sender, RoutedEventArgs e)
{
Microsoft.Win32.OpenFileDialog openFileDialog = new Microsoft.Win32.OpenFileDialog();
openFileDialog.Filter = "Text files(*.txt) | *.txt";
openFileDialog.InitialDirectory = "D:\\LogFiles"; //Testing only. Remove
if (openFileDialog.ShowDialog() == true)
{
//ensure it is a text file
fileName = openFileDialog.FileName;
if(!ProcessLogFile(fileName))
{
MessageBox.Show("Issue reading file: " + fileName);
}
}
}
//to be removed
private bool ProcessLogFile(string fileName)
{
if (!ReadLogFile(fileName))
{
return false;
}
return true;
}
//Why does this need to be a bool
private bool ReadLogFile(string fileName)
{
const Int32 BufferSize = 1024; //Changing buffer size will affect performance.
using (var fileStream = File.OpenRead(fileName))
using (var streamReader = new StreamReader(fileStream, Encoding.UTF8, true, BufferSize))
{
String line;
while ((line = streamReader.ReadLine()) != null)
{
ProcessLine(line);
}
}
return true;
}
private void ProcessLine(string line)
{
/*Process Line -
*
* Possibly use a multimap to store each logEntry of interest and a pair <machineId, NoOfOccurrences>
* Problem: if an occurrence happens twice on the same machine, how do I make sure two isn't added to the number of machines?
*
*/
throw new NotImplementedException();
}
}
}
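One way the concern in the ProcessLine comment could be addressed, sketched here purely as an illustration, is a dictionary keyed by eventString that tracks machine IDs in a HashSet alongside a running occurrence count. TryParseLine is a hypothetical helper, since the question does not show how a line is split:
// Sketch only: one dictionary tracks which machines have logged each eventString
// (a HashSet ignores duplicates), a second tracks the total number of instances.
private readonly Dictionary<string, HashSet<int>> machinesPerEvent = new Dictionary<string, HashSet<int>>();
private readonly Dictionary<string, int> occurrencesPerEvent = new Dictionary<string, int>();

private void ProcessLine(string line)
{
    int machineNumber;
    string eventString;
    if (!TryParseLine(line, out machineNumber, out eventString)) // hypothetical parser
        return;

    if (!machinesPerEvent.ContainsKey(eventString))
    {
        machinesPerEvent[eventString] = new HashSet<int>();
        occurrencesPerEvent[eventString] = 0;
    }

    machinesPerEvent[eventString].Add(machineNumber); // adding the same machine twice has no effect
    occurrencesPerEvent[eventString]++;               // every line counts towards total instances
}
Writing the CSV afterwards is then just a matter of iterating the keys and pairing each one with machinesPerEvent[key].Count and occurrencesPerEvent[key].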

I recommend creating your own class to store the event information:
class EventInfo
{
public int MachineID { get; set; }
public string LogMessage { get; set; }
public DateTime EventTime { get; set; }
}
And then just create a list of EventInfo:
List<EventInfo> events = new List<EventInfo>();
C#'s List has quite good performance, and, in addition, using LINQ you can easily manipulate the data.
For example:
events.Where(item => item.MachineID == 1).Select(item => item.LogMessage);
This code selects all the event messages related to the machine with ID = 1.
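For the aggregation the question actually asks for (eventString, NumberOfMachinesAffected, TotalNumberOfInstances), a minimal LINQ sketch over that same list could look like the following; the output file name is an assumption:
// Group by log message, count distinct machines and total occurrences,
// then write one CSV line per eventString.
var csvLines = events
    .GroupBy(e => e.LogMessage)
    .Select(g => string.Format("{0}, {1}, {2}",
        g.Key,                                         // eventString
        g.Select(e => e.MachineID).Distinct().Count(), // machines affected
        g.Count()));                                   // total instances
File.WriteAllLines("summary.csv", csvLines);
With the sample data above, the LogExample1 group yields "LogExample1, 2, 3".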

Related

How to handle and perform actions fast on websocket data in c#?

I am connecting to a third-party data feed provider's server using a websocket.
For the websocket connection my code is:
this.websocket = new WebSocket("wss://socket.polygon.io/stocks", sslProtocols: SslProtocols.Tls12 | SslProtocols.Tls11 | SslProtocols.Tls);
So when the connection is established, we receive nearly 70,000 to 100,000 records every minute. We then split those responses up and store each record in its own symbol's file. For example, if we receive data for AAPL, we store that data in AAPL's file, and the same for FB, MSFT, IBM, QQQ, and so on. We have a total of 10,000 files which we need to handle at a time, storing the live records in each accordingly.
public static string tempFile = @"D:\TempFileForLiveMarket\tempFileStoreLiveSymbols.txt";
public static System.IO.StreamWriter w;
private void websocket_MessageReceived(object sender, MessageReceivedEventArgs e)
{
using (w = System.IO.File.AppendText(tempFile))
{
Log(e.Message, w);
}
using (System.IO.StreamReader r = System.IO.File.OpenText(tempFile))
{
DumpLog(r);
}
}
public static void Log(string responseMessage, System.IO.TextWriter w)
{
w.WriteLine(responseMessage);
}
public static void DumpLog(System.IO.StreamReader r)
{
string line;
while ((line = r.ReadLine()) != null)
{
WriteRecord(line);
}
}
public static void WriteRecord(string data)
{
List<LiveData> ld = JsonConvert.DeserializeObject<List<LiveData>>(data);
var filterData = ld.Where(x => symbolList.Contains(x.sym));
List<string> fileLines = new List<string>();
foreach (var item in filterData)
{
var fileName = @"D:\SymbolsData\" + item.sym + "_day_Aggregate.txt";
fileLines = File.ReadAllLines(fileName).AsParallel().Skip(1).ToList();
if (fileLines.Count > 1)
{
var lastLine = fileLines.Last();
if (!lastLine.Contains(item.sym))
{
fileLines.RemoveAt(fileLines.Count - 1);
}
}
fileLines.Add(item.sym + "," + item.s + "," + item.p + "-----");
System.IO.File.WriteAllLines(fileName, fileLines);
}
}
So, when the websocket connection is established and we perform these actions on live market data across our 10,000 files, it becomes slower, and after a few minutes the websocket connection closes with a message like the one below:
Websocket Error
Received an unexpected EOF or 0 bytes from the transport stream.
Connection Closed...
I am performing this whole process because in the next phase I need to perform technical analysis on the live price of each and every symbol. So how can I handle this situation? How can I make the process faster? And how can I stop the connection from being closed?
After Edit
I replaced the stream writer and temp file with a StringBuilder as follows:
public static StringBuilder sb = new StringBuilder();
public static System.IO.StringWriter sw;
private void websocket_MessageReceived(object sender, MessageReceivedEventArgs e)
{
sw = new System.IO.StringWriter(sb);
sw.WriteLine(e.Message);
Reader();
}
public static void Reader()
{
System.IO.StringReader _sr = new System.IO.StringReader(sb.ToString());
while (_sr.Peek() > -1)
{
WriteRecord(sb.ToString());
}
sb.Remove(0, sb.Length);
}
public static void WriteRecord(string data)
{
List<LiveData> ld = JsonConvert.DeserializeObject<List<LiveData>>(data);
var filterData = ld.Where(x => symbolList.Contains(x.sym)); // same filter as in the original WriteRecord
foreach (var item in filterData)
{
var fileName = @"D:\SymbolsData\" + item.sym + "_day_Aggregate.txt";
var fileLines = File.ReadAllLines(fileName).AsParallel().Skip(1).ToList();
fileLines.RemoveAt(fileLines.Count - 1);
fileLines.Add(item.sym + "," + item.s + "," + item.p);
System.IO.File.WriteAllLines(fileName, fileLines);
}
}
It looks like you append each message to the tempFile, but then you process the entire tempFile. This means you are constantly re-processing the old data plus the new record, so yes: it will gradually take longer and longer and longer until it takes so long that the other end gets bored of waiting, and cuts you off. My advice: don't do that.
There are also a lot of things you could do more efficiently in the actual processing of each record, but that is irrelevant compared to the overhead of constantly re-processing everything.
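As a concrete illustration of that advice (a sketch only, reusing the question's LiveData, symbolList and per-symbol file layout, not the answerer's code), each message can be handled on its own and appended to the relevant symbol file, with no shared temp file to re-read:
private void websocket_MessageReceived(object sender, MessageReceivedEventArgs e)
{
    // Deserialize and process only the message that just arrived.
    List<LiveData> records = JsonConvert.DeserializeObject<List<LiveData>>(e.Message);
    foreach (var item in records.Where(x => symbolList.Contains(x.sym)))
    {
        var fileName = @"D:\SymbolsData\" + item.sym + "_day_Aggregate.txt";
        // Append the new record; never re-read or rewrite the existing file contents.
        System.IO.File.AppendAllText(fileName, item.sym + "," + item.s + "," + item.p + Environment.NewLine);
    }
}
If per-message file I/O still proves too slow, buffering records per symbol and flushing them on a timer is a further option, but the key point is to stop re-processing old data.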

Parsing performance of row data from files to SQL Server database

I have the PAF raw data in several files (list of all addresses in the UK).
My goal is to create a PostCode lookup in our software.
I have created a new database but there is no need to understand it for the moment.
Let's take one file; its extension is ".c01" and it can be opened with a text editor. The data in this file is in the following format:
0000000123A
With (according to the developer guide) 8 chars for the KEY and 50 chars for the NAME.
This file contains 2,449,652 rows (it's a small one!).
I created a parsing class for this:
private class SerializedBuilding
{
public int Key
{
get; set;
}
public string Name
{
get; set;
}
public bool isValid = false;
public Building ToBuilding()
{
Building b = new Building();
b.BuildingKey = Key;
b.BuildingName = Name;
return b;
}
private readonly int KEYLENGTH = 8;
private readonly int NAMELENGTH = 50;
public SerializedBuilding(String line)
{
string KeyStr = null;
string Name = null;
try
{
KeyStr = line.Substring(0, KEYLENGTH);
}
catch (Exception e)
{
Console.WriteLine("erreur parsing key line " + line);
return;
}
try
{
Name = line.Substring(KEYLENGTH - 1, NAMELENGTH);
}
catch (Exception e)
{
Console.WriteLine("erreur parsing name line " + line);
return;
}
int value;
if (!Int32.TryParse(KeyStr, out value))
return;
if (value == 0 || value == 99999999)
return;
this.Name = Name;
this.Key = value;
this.isValid = true;
}
}
I use this method to read the file
public void start()
{
AddressDataContext d = new AddressDataContext();
Count = 0;
string line;
// Read the file and display it line by line.
System.IO.StreamReader file =
new System.IO.StreamReader(filename);
SerializedBuilding sb = null;
Console.WriteLine("Number of line detected : " + File.ReadLines(filename).Count());
while ((line = file.ReadLine()) != null)
{
sb = new SerializedBuilding(line);
if (sb.isValid)
{
d.Buildings.InsertOnSubmit(sb.ToBuilding());
if (Count % 100 == 0)
d.SubmitChanges();
}
Count++;
}
d.SubmitChanges();
file.Close();
Console.WriteLine("building added");
}
I use Linq to SQL classes to insert data to my database. The connection string is the default one.
This seems to work; I have added 67,200 lines. It just crashed, but my questions are not about that.
My estimates:
33,647,015 rows to parse
Time needed for execution: 13 hours
It's a one-time job (it just needs to be run on my SQL Server and on the client server later), so I don't really care much about performance, but I think it would be interesting to know how it can be improved.
My questions are:
Are ReadLine() and Substring() the most efficient ways to read these huge files?
Can the performance be improved by modifying the connection string?
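One commonly used way to speed up this kind of load (a sketch only; the table name, column names and connection string are assumptions, not from the question) is to buffer the parsed rows into a DataTable and push them with SqlBulkCopy instead of inserting row by row through LINQ to SQL:
using System.Data;
using System.Data.SqlClient;
using System.IO;

public void BulkLoad(string filename, string connectionString)
{
    var table = new DataTable();
    table.Columns.Add("BuildingKey", typeof(int));
    table.Columns.Add("BuildingName", typeof(string));

    // Reuse the existing SerializedBuilding parser, but buffer rows in memory
    // instead of submitting them one by one.
    foreach (var line in File.ReadLines(filename))
    {
        var sb = new SerializedBuilding(line);
        if (sb.isValid)
            table.Rows.Add(sb.Key, sb.Name);
    }

    using (var bulk = new SqlBulkCopy(connectionString))
    {
        bulk.DestinationTableName = "Buildings"; // assumed table name
        bulk.BatchSize = 10000;
        bulk.WriteToServer(table);
    }
}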

Pack files into one, to later programmatically unpack them [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
Is it possible to take all files and folders in a directory and pack them into a single package file, so that I may transfer this package over network and then unpack all files and folders from the package?
I tried looking into ZIP files with C#, because I'm aiming for the same idea, but the actual methods for it only come with .NET 3.5 (I believe). I also want the program to be very lightweight, meaning I don't want external modules lying around that have to be shipped along with it if I wish to unzip/unpack a single file.
How can I accomplish this?
Just use a BinaryWriter/Reader and your own format. Something like this:
using (var fs = File.Create(...))
using (var bw = new BinaryWriter(fs))
{
foreach (var file in Directory.GetFiles(...))
{
bw.Write(true); // means that a file will follow
bw.Write(Path.GetFileName(file));
var data = File.ReadAllBytes(file);
bw.Write(data.Length);
bw.Write(data);
}
bw.Write(false); // means end of file
}
So basically you write a bool that means whether there is a next file, the name and contents of each file, one after the other. Reading is the exact opposite. BinaryWriter/Reader take care of everything (it knows how long each string and byte array is, you will read back exactly what you wrote).
What this solution lacks: not an industry standard (but quite simple), doesn't store any additional metadata (you can add creation time, etc.), doesn't use a checksum (you can add an SHA1 hash after the contents), doesn't use compression (you said you don't need it), doesn't handle big files well (the problematic part is that it reads an entire file into a byte array and writes that, should work pretty well under 100 MB), doesn't handle multi-level directory hierarchies (can be added of course).
EDIT: The BinaryR/W know about string lengths, but not about byte array lengths. I added a length field before the byte array so that it can be read back exactly as it was written.
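For completeness, a minimal sketch of the matching read side of that format; packagePath and outputDirectory are placeholders, not from the answer:
using (var fs = File.OpenRead(packagePath))
using (var br = new BinaryReader(fs))
{
    // Keep reading entries until the final 'false' flag is hit.
    while (br.ReadBoolean())
    {
        string name = br.ReadString();       // file name written by bw.Write(string)
        int length = br.ReadInt32();         // the length field mentioned in the EDIT
        byte[] data = br.ReadBytes(length);  // exactly that many content bytes
        File.WriteAllBytes(Path.Combine(outputDirectory, name), data);
    }
}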
Take a look at ziplib, it's free, open source and can be used in all .NET versions: http://www.icsharpcode.net/opensource/sharpziplib/
What I suggest is to consider the advantages of using an external library, so you can avoid a lot of trouble. A full zip class could be a big undertaking to write yourself. Take a look at this: http://dotnetzip.codeplex.com/ - it's simple, stable and lightweight.
By the way, if you really don't want external libraries and data compression isn't mandatory for your project, you can manage it somewhat like this (please consider it a sample written in less than an hour ;-) ):
usage:
//to pack
Packer.SimplePack sp = new Packer.SimplePack(@"c:\filename.pack");
sp.PackFolderContent(@"c:\yourfolder");
sp.Save();
//to unpack
Packer.SimplePack sp = new Packer.SimplePack(@"c:\filename.pack");
sp.Open();
Here is SimplePack:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Packer
{
public class SimplePack
{
public class Header
{
public Int32 TotalEntries { get; set; }
public Int64[] EntriesSize
{
get
{
return EntriesSizeList.ToArray();
}
}
private List<Int64> EntriesSizeList { get; set; }
public Header()
{
TotalEntries = 0;
EntriesSizeList = new List<Int64>();
}
public void AddEntrySize(Int64 newSize)
{
EntriesSizeList.Add(newSize);
}
}
public class Item
{
public Byte[] RawData { get; set; }
public String Name { get; set; }
public String RelativeUri { get; set; }
public Int64 ItemSize
{
get
{
Int64 retVal = 4; //Name.Length;
retVal += Name.Length;
retVal += 4; //RelativeUri.Length
retVal += RelativeUri.Length;
retVal += RawData.Length;
return retVal;
}
}
public Byte[] SerializedData
{
get
{
List<Byte> retVal = new List<Byte>();
retVal.AddRange(BitConverter.GetBytes(Name.Length));
retVal.AddRange(Encoding.Default.GetBytes(Name));
retVal.AddRange(BitConverter.GetBytes(RelativeUri.Length));
retVal.AddRange(Encoding.Default.GetBytes(RelativeUri));
retVal.AddRange(RawData);
return retVal.ToArray();
}
}
public Item()
{
RawData = new Byte[0];
Name = String.Empty;
RelativeUri = String.Empty;
}
public Item(Byte[] serializedItem)
{
Int32 cursor = 0;
Int32 nl = BitConverter.ToInt32(serializedItem, cursor);
cursor += 4;
Name = Encoding.Default.GetString(serializedItem, cursor, nl);
cursor += nl;
Int32 rl = BitConverter.ToInt32(serializedItem, cursor);
cursor += 4;
RelativeUri = Encoding.Default.GetString(serializedItem, cursor, rl);
cursor += rl;
RawData = new Byte[serializedItem.Length - cursor];
for (int i = cursor; i < serializedItem.Length; i++)
{
RawData[i - cursor] = serializedItem[i];
}
}
}
public FileInfo PackedFile { get; private set; }
public List<Item> Data { get; private set; }
public Header FileHeaderDefinition { get; private set; }
public SimplePack(String fileName)
{
PackedFile = new FileInfo(fileName);
FileHeaderDefinition = new Header();
Data = new List<Item>();
}
public Boolean PackFolderContent(String folderFullName)
{
Boolean retVal = false;
DirectoryInfo di = new DirectoryInfo(folderFullName);
//Think about setting up strong checks and errors trapping
if (di.Exists)
{
FileInfo[] files = di.GetFiles("*", SearchOption.AllDirectories);
foreach (FileInfo fi in files)
{
Item it = setItem(fi, di.FullName);
if (it != null)
{
Data.Add(it);
FileHeaderDefinition.TotalEntries++;
FileHeaderDefinition.AddEntrySize(it.ItemSize);
}
}
}
//although it isn't checked
retVal = true;
return retVal;
}
private Item setItem(FileInfo sourceFile, String packedRoot)
{
if (sourceFile.Exists)
{
Item retVal = new Item();
retVal.Name = sourceFile.Name;
retVal.RelativeUri = sourceFile.FullName.Substring(packedRoot.Length).Replace("\\", "/");
retVal.RawData = File.ReadAllBytes(sourceFile.FullName);
return retVal;
}
else
{
return null;
}
}
public void Save()
{
if (PackedFile.Exists)
{
PackedFile.Delete();
System.Threading.Thread.Sleep(100);
}
using (FileStream fs = new FileStream(PackedFile.FullName, FileMode.CreateNew, FileAccess.Write))
{
//Writing Header
//4 bytes
fs.Write(BitConverter.GetBytes(FileHeaderDefinition.TotalEntries), 0, 4);
//8 bytes foreach size
foreach (Int64 size in FileHeaderDefinition.EntriesSize)
{
fs.Write(BitConverter.GetBytes(size), 0, 8);
}
foreach (Item it in Data)
{
fs.Write(it.SerializedData, 0, it.SerializedData.Length);
}
fs.Close();
}
}
public void Open()
{
if (PackedFile.Exists)
{
using (FileStream fs = new FileStream(PackedFile.FullName, FileMode.Open, FileAccess.Read))
{
Byte[] readBuffer = new Byte[4];
fs.Read(readBuffer, 0, readBuffer.Length);
FileHeaderDefinition.TotalEntries = BitConverter.ToInt32(readBuffer, 0);
for (Int32 i = 0; i < FileHeaderDefinition.TotalEntries; i++)
{
readBuffer = new Byte[8];
fs.Read(readBuffer, 0, readBuffer.Length);
FileHeaderDefinition.AddEntrySize(BitConverter.ToInt64(readBuffer, 0));
}
foreach (Int64 size in FileHeaderDefinition.EntriesSize)
{
readBuffer = new Byte[size];
fs.Read(readBuffer, 0, readBuffer.Length);
Data.Add(new Item(readBuffer));
}
fs.Close();
}
}
}
}
}

Save progress to a text file C#

I have a problem here... so here's what I want to do:
I have a program that saves information about user progress, e.g. Calls, Answered Calls... The user runs this program every day and saves the information to a text file. The problem is that when the user hits the Save button it adds new stats for that day, but I want that data to be modified if the user saves twice on the same day.
What I want to do is create a new file in which to save the last time saved; if the dates are different, append to the file, else modify the existing entry for that day.
What I did so far is:
string input3 = string.Format("{0:yyyy-MM-dd}", DateTime.Now);
StreamWriter t,tw;
if(File.Exists(filename))
{
tw=File.AppendText(filename);
t = new StreamWriter("lasttimesaved.txt");
t.WriteLine(input3);
}
else
{
tw=new StreamWriter(filename);
t = new StreamWriter("lasttimesaved.txt");
t.WriteLine(input3);
}
tw.WriteLine();
tw.Write("Stats for Name ");
tw.Write(input);
tw.Write("_");
tw.WriteLine(input3);
tw.WriteLine();
tw.Write("Total Calls: "); tw.WriteLine(calls);
tw.Write("Total Answered: "); tw.WriteLine(answ);
tw.Close();
The only thing I don't know how to do now is to add, above all of that, a check to see whether the user has already saved today's info, and then modify the existing data.
it's like:
try
{
using (StreamReader sr = new StreamReader("lasttimesaved.txt"))
{
String line = sr.ReadToEnd();
}
}
catch (Exception e)
{
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
if(String.Compare(input3,line) == 0)
{
// that's where I need to modify the existing data.
}
else
{
// do the code above
}
Can anyone help me modify the currently recorded data without losing previous records?
In the text file it looks like this:
Stats for Name_2013-11-26
Total Calls: 25
Total Answered: 17
Stats for Name_2013-11-27
Total Calls: 32
Total Answered: 15
Stats for Name_2013-11-28
Total Calls: 27
Total Answered: 13
I would say use XML; it will still be readable and modifiable without code, and you get a neat way to modify the file with code.
With XML you can easily query the file to see if today's date is already mentioned in the file; if so, you can edit that node, and if not, you can easily append one.
To append nodes to an XML file I would look at this link:
C#, XML, adding new nodes
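A minimal sketch of that XML idea using LINQ to XML (System.Xml.Linq); the file name, element names and attribute names here are assumptions, not from the linked answer:
string path = "stats.xml";
var doc = File.Exists(path)
    ? XDocument.Load(path)
    : new XDocument(new XElement("Days"));
string today = DateTime.Today.ToString("yyyy-MM-dd");

// Find today's node; update it if it exists, otherwise append a new one.
var day = doc.Root.Elements("Day").FirstOrDefault(d => (string)d.Attribute("date") == today);
if (day == null)
{
    day = new XElement("Day", new XAttribute("date", today));
    doc.Root.Add(day);
}
day.SetElementValue("TotalCalls", calls);
day.SetElementValue("TotalAnswered", answ);
doc.Save(path);
Here calls and answ are the same values the question already writes to its text file.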
Hope this helps. Use it like this:
void main()
{
var uw = new UserInformationWriter(@"C:\temp\stats.txt");
var user = new UserInfomration { Calls = "111", Answered = "110" };
uw.Save(user);
}
Here are the classes:
public class UserInformationWriter
{
public string CentralFile { get; set; }
public UserInformationWriter(string centraFile)
{
CentralFile = centraFile;
}
public void Save(UserInfomration newUserInformation)
{
try
{
var streamReader = new StreamReader(CentralFile);
var sourceInformation = streamReader.ReadToEnd();
streamReader.Close();
var userCollection = (List<UserInfomration>)(sourceInformation.ToUserInfomation());
var checkItem = ShouldModify(userCollection);
if (checkItem.Item1)
userCollection.Remove(checkItem.Item2);
newUserInformation.DateTime = DateTime.Today;
userCollection.Add(newUserInformation);
File.Delete(CentralFile);
foreach (var userInfomration in userCollection)
WriteToFile(userInfomration);
}
catch (Exception) { }
}
private Tuple<bool, UserInfomration> ShouldModify(IEnumerable<UserInfomration> userInfomations)
{
try
{
foreach (var userInfomration in userInfomations)
if (userInfomration.DateTime == DateTime.Today)
return new Tuple<bool, UserInfomration>(true, userInfomration);
}
catch (Exception) { }
return new Tuple<bool, UserInfomration>(false, null);
}
private void WriteToFile(UserInfomration newUserInformation)
{
using (var tw = new StreamWriter(CentralFile, true))
{
tw.WriteLine("*Stats for Name_{0}", newUserInformation.DateTime.ToShortDateString());
tw.WriteLine();
tw.WriteLine("*Total Calls: {0}", newUserInformation.Calls);
tw.WriteLine("*Total Answered: {0}#", newUserInformation.Answered);
tw.WriteLine();
}
}
}
public class UserInfomration
{
public DateTime DateTime { get; set; }
public string Calls { get; set; }
public string Answered { get; set; }
}
public static class StringExtension
{
private const string CallText = "TotalCalls:";
private const string AnsweredText = "TotalAnswered:";
private const string StatsForName = "StatsforName_";
private const char ClassSeperator = '#';
private const char ItemSeperator = '*';
public static IEnumerable<UserInfomration> ToUserInfomation(this string input)
{
var splited = input.RemoveUnneededStuff().Split(ClassSeperator);
splited = splited.Where(x => !string.IsNullOrEmpty(x)).ToArray();
var userInformationResult = new List<UserInfomration>();
foreach (var item in splited)
{
if (string.IsNullOrEmpty(item)) continue;
var splitedInformation = item.Split(ItemSeperator);
splitedInformation = splitedInformation.Where(x => !string.IsNullOrEmpty(x)).ToArray();
var userInformation = new UserInfomration
{
DateTime = ConvertStringToDateTime(splitedInformation[0]),
Calls = splitedInformation[1].Substring(CallText.Length),
Answered = splitedInformation[2].Substring(AnsweredText.Length)
};
userInformationResult.Add(userInformation);
}
return userInformationResult;
}
private static DateTime ConvertStringToDateTime(string input)
{
var date = input.Substring(StatsForName.Length);
return DateTime.ParseExact(date, "dd.MM.yyyy", CultureInfo.InvariantCulture);
}
private static string RemoveUnneededStuff(this string input)
{
input = input.Replace("\n", String.Empty);
input = input.Replace("\r", String.Empty);
input = input.Replace("\t", String.Empty);
return input.Replace(" ", string.Empty);
}
}
Let me know if you need help or if I understood you wrong.

EntityTooSmall in CompleteMultipartUploadResponse

using .NET SDK v.1.5.21.0
I'm trying to upload a large file (63Mb) and I'm following the example at:
http://docs.aws.amazon.com/AmazonS3/latest/dev/LLuploadFileDotNet.html
But using a helper instead of the whole code, and using jQuery File Upload
https://github.com/blueimp/jQuery-File-Upload/blob/master/basic-plus.html
what I have is:
string bucket = "mybucket";
long totalSize = long.Parse(context.Request.Headers["X-File-Size"]),
maxChunkSize = long.Parse(context.Request.Headers["X-File-MaxChunkSize"]),
uploadedBytes = long.Parse(context.Request.Headers["X-File-UloadedBytes"]),
partNumber = uploadedBytes / maxChunkSize + 1,
fileSize = partNumber * inputStream.Length;
bool lastPart = inputStream.Length < maxChunkSize;
// http://docs.aws.amazon.com/AmazonS3/latest/dev/LLuploadFileDotNet.html
if (partNumber == 1) // initialize upload
{
iView.Utilities.Amazon_S3.S3MultipartUpload.InitializePartToCloud(fileName, bucket);
}
try
{
// upload part
iView.Utilities.Amazon_S3.S3MultipartUpload.UploadPartToCloud(fs, fileName, bucket, (int)partNumber, uploadedBytes, maxChunkSize);
if (lastPart)
// wrap it up and go home
iView.Utilities.Amazon_S3.S3MultipartUpload.CompletePartToCloud(fileName, bucket);
}
catch (System.Exception ex)
{
// Houston, we have a problem!
//Console.WriteLine("Exception occurred: {0}", exception.Message);
iView.Utilities.Amazon_S3.S3MultipartUpload.AbortPartToCloud(fileName, bucket);
}
and
public static class S3MultipartUpload
{
private static string accessKey = System.Configuration.ConfigurationManager.AppSettings["AWSAccessKey"];
private static string secretAccessKey = System.Configuration.ConfigurationManager.AppSettings["AWSSecretKey"];
private static AmazonS3 client = Amazon.AWSClientFactory.CreateAmazonS3Client(accessKey, secretAccessKey);
public static InitiateMultipartUploadResponse initResponse;
public static List<UploadPartResponse> uploadResponses;
public static void InitializePartToCloud(string destinationFilename, string destinationBucket)
{
// 1. Initialize.
uploadResponses = new List<UploadPartResponse>();
InitiateMultipartUploadRequest initRequest =
new InitiateMultipartUploadRequest()
.WithBucketName(destinationBucket)
.WithKey(destinationFilename.TrimStart('/'));
initResponse = client.InitiateMultipartUpload(initRequest);
}
public static void UploadPartToCloud(Stream fileStream, string destinationFilename, string destinationBucket, int partNumber, long uploadedBytes, long maxChunkedBytes)
{
// 2. Upload Parts.
UploadPartRequest request = new UploadPartRequest()
.WithBucketName(destinationBucket)
.WithKey(destinationFilename.TrimStart('/'))
.WithUploadId(initResponse.UploadId)
.WithPartNumber(partNumber)
.WithPartSize(maxChunkedBytes)
.WithFilePosition(uploadedBytes)
.WithInputStream(fileStream) as UploadPartRequest;
uploadResponses.Add(client.UploadPart(request));
}
public static void CompletePartToCloud(string destinationFilename, string destinationBucket)
{
// Step 3: complete.
CompleteMultipartUploadRequest compRequest =
new CompleteMultipartUploadRequest()
.WithBucketName(destinationBucket)
.WithKey(destinationFilename.TrimStart('/'))
.WithUploadId(initResponse.UploadId)
.WithPartETags(uploadResponses);
CompleteMultipartUploadResponse completeUploadResponse =
client.CompleteMultipartUpload(compRequest);
}
public static void AbortPartToCloud(string destinationFilename, string destinationBucket)
{
// abort.
client.AbortMultipartUpload(new AbortMultipartUploadRequest()
.WithBucketName(destinationBucket)
.WithKey(destinationFilename.TrimStart('/'))
.WithUploadId(initResponse.UploadId));
}
}
My maxChunkSize is 6 MB (6 * (1024*1024)), as I have read that the minimum is 5 MB...
Why am I getting the "Your proposed upload is smaller than the minimum allowed size" exception? What am I doing wrong?
The error is:
<Error>
<Code>EntityTooSmall</Code>
<Message>Your proposed upload is smaller than the minimum allowed size</Message>
<ETag>d41d8cd98f00b204e9800998ecf8427e</ETag>
<MinSizeAllowed>5242880</MinSizeAllowed>
<ProposedSize>0</ProposedSize>
<RequestId>C70E7A23C87CE5FC</RequestId>
<HostId>pmhuMXdRBSaCDxsQTHzucV5eUNcDORvKY0L4ZLMRBz7Ch1DeMh7BtQ6mmfBCLPM2</HostId>
<PartNumber>1</PartNumber>
</Error>
How can I get ProposedSize if I'm passing the stream and stream length?
Here is a working solution for the latest Amazon SDK (as of today: v.1.5.37.0)
Amazon S3 Multipart Upload works like:
Initialize the request using client.InitiateMultipartUpload(initRequest)
Send chunks of the file (loop until the end) using client.UploadPart(request)
Complete the request using client.CompleteMultipartUpload(compRequest)
If anything goes wrong, remember to dispose the client and request, as well as fire the abort command using client.AbortMultipartUpload(abortMultipartUploadRequest)
I keep the client in Session as we need it for each chunk upload; I also keep hold of the ETags, which are later used to complete the process.
You can see an example and a simple way of doing this in the Amazon docs themselves. I ended up with a class that does everything, plus I have integrated it with the lovely jQuery File Upload plugin (handler code below as well).
The S3MultipartUpload class is as follows:
public class S3MultipartUpload : IDisposable
{
string accessKey = System.Configuration.ConfigurationManager.AppSettings.Get("AWSAccessKey");
string secretAccessKey = System.Configuration.ConfigurationManager.AppSettings.Get("AWSSecretKey");
AmazonS3 client;
public string OriginalFilename { get; set; }
public string DestinationFilename { get; set; }
public string DestinationBucket { get; set; }
public InitiateMultipartUploadResponse initResponse;
public List<PartETag> uploadPartETags;
public string UploadId { get; private set; }
public S3MultipartUpload(string destinationFilename, string destinationBucket)
{
if (client == null)
{
System.Net.WebRequest.DefaultWebProxy = null; // disable proxy to make upload quicker
client = Amazon.AWSClientFactory.CreateAmazonS3Client(accessKey, secretAccessKey, new AmazonS3Config()
{
RegionEndpoint = Amazon.RegionEndpoint.EUWest1,
CommunicationProtocol = Protocol.HTTP
});
this.OriginalFilename = destinationFilename.TrimStart('/');
this.DestinationFilename = string.Format("{0:yyyy}{0:MM}{0:dd}{0:HH}{0:mm}{0:ss}{0:fffff}_{1}", DateTime.UtcNow, this.OriginalFilename);
this.DestinationBucket = destinationBucket;
this.InitializePartToCloud();
}
}
private void InitializePartToCloud()
{
// 1. Initialize.
uploadPartETags = new List<PartETag>();
InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest();
initRequest.BucketName = this.DestinationBucket;
initRequest.Key = this.DestinationFilename;
// make it public
initRequest.AddHeader("x-amz-acl", "public-read");
initResponse = client.InitiateMultipartUpload(initRequest);
}
public void UploadPartToCloud(Stream fileStream, long uploadedBytes, long maxChunkedBytes)
{
int partNumber = uploadPartETags.Count() + 1; // current part
// 2. Upload Parts.
UploadPartRequest request = new UploadPartRequest();
request.BucketName = this.DestinationBucket;
request.Key = this.DestinationFilename;
request.UploadId = initResponse.UploadId;
request.PartNumber = partNumber;
request.PartSize = fileStream.Length;
//request.FilePosition = uploadedBytes // remove this line?
request.InputStream = fileStream; // as UploadPartRequest;
var up = client.UploadPart(request);
uploadPartETags.Add(new PartETag() { ETag = up.ETag, PartNumber = partNumber });
}
public string CompletePartToCloud()
{
// Step 3: complete.
CompleteMultipartUploadRequest compRequest = new CompleteMultipartUploadRequest();
compRequest.BucketName = this.DestinationBucket;
compRequest.Key = this.DestinationFilename;
compRequest.UploadId = initResponse.UploadId;
compRequest.PartETags = uploadPartETags;
string r = "Something went badly wrong";
using (CompleteMultipartUploadResponse completeUploadResponse = client.CompleteMultipartUpload(compRequest))
r = completeUploadResponse.ResponseXml;
return r;
}
public void AbortPartToCloud()
{
// abort.
client.AbortMultipartUpload(new AbortMultipartUploadRequest()
{
BucketName = this.DestinationBucket,
Key = this.DestinationFilename,
UploadId = initResponse.UploadId
});
}
public void Dispose()
{
if (client != null) client.Dispose();
if (initResponse != null) initResponse.Dispose();
}
}
I use DestinationFilename as the destination file so I can avoid the same name, but I keep the OriginalFilename as I needed later.
Using jQuery File Upload Plugin, all works inside a Generic Handler, and the process is something like this:
// Upload partial file
private void UploadPartialFile(string fileName, HttpContext context, List<FilesStatus> statuses)
{
if (context.Request.Files.Count != 1)
throw new HttpRequestValidationException("Attempt to upload chunked file containing more than one fragment per request");
var inputStream = context.Request.Files[0].InputStream;
string contentRange = context.Request.Headers["Content-Range"]; // "bytes 0-6291455/14130271"
int fileSize = int.Parse(contentRange.Split('/')[1]),
maxChunkSize = int.Parse(context.Request.Headers["X-Max-Chunk-Size"]),
uploadedBytes = int.Parse(contentRange.Replace("bytes ", "").Split('-')[0]);
iView.Utilities.AWS.S3MultipartUpload s3Upload = null;
try
{
// ######################################################################################
// 1. Initialize Amazon S3 Client
if (uploadedBytes == 0)
{
HttpContext.Current.Session["s3-upload"] = new iView.Utilities.AWS.S3MultipartUpload(fileName, awsBucket);
s3Upload = (iView.Utilities.AWS.S3MultipartUpload)HttpContext.Current.Session["s3-upload"];
string msg = System.String.Format("Upload started: {0} ({1:N0} MB)", s3Upload.DestinationFilename, (fileSize / 1048576));
this.Log(msg);
}
// cast current session object
if (s3Upload == null)
s3Upload = (iView.Utilities.AWS.S3MultipartUpload)HttpContext.Current.Session["s3-upload"];
// ######################################################################################
// 2. Send Chunks
s3Upload.UploadPartToCloud(inputStream, uploadedBytes, maxChunkSize);
// ######################################################################################
// 3. Complete Upload
if (uploadedBytes + maxChunkSize > fileSize)
{
string completeRequest = s3Upload.CompletePartToCloud();
this.Log(completeRequest); // log S3 response
s3Upload.Dispose(); // dispose all objects
HttpContext.Current.Session["s3-upload"] = null; // we don't need this anymore
}
}
catch (System.Exception ex)
{
if (ex.InnerException != null)
while (ex.InnerException != null)
ex = ex.InnerException;
this.Log(string.Format("{0}\n\n{1}", ex.Message, ex.StackTrace)); // log error
s3Upload.AbortPartToCloud(); // abort current upload
s3Upload.Dispose(); // dispose all objects
statuses.Add(new FilesStatus(ex.Message));
return;
}
statuses.Add(new FilesStatus(s3Upload.DestinationFilename, fileSize, ""));
}
Keep in mind that to have a Session object inside a Generic Handler, you need to implement IRequiresSessionState so your handler will look like:
public class UploadHandlerSimple : IHttpHandler, IRequiresSessionState
Inside fileupload.js (under _initXHRData) I have added an extra header called X-Max-Chunk-Size so I can pass this to Amazon and calculate if it's the last part of the uploaded file.
Feel free to comment and make smart edits for everyone to use.
I guess you didn't set the content-length of the part inside the UploadPartToCloud() function.
