I have a working solution for uploading CSV files. Currently, I use IFormCollection so a user can upload multiple CSV files from a view.
Each CSV file is saved as a temp file as follows:
List<string> fileLocations = new List<string>();
foreach (var formFile in files)
{
    var filePath = Path.GetTempFileName();
    if (formFile.Length > 0)
    {
        using (var stream = new FileStream(filePath, FileMode.Create))
        {
            await formFile.CopyToAsync(stream);
        }
    }
    fileLocations.Add(filePath);
}
I send the list of file locations to another method (just below). I loop through the file locations and stream the data from the temp files; I then use a DataTable and SqlBulkCopy to insert the data. I currently upload between 50 and 200 files at a time and each file is around 330 KB. Inserting a hundred files (around 30-35 MB of data) takes around 6 minutes.
public void SplitCsvData(string fileLocation, Guid uid)
{
    MetaDataModel MetaDatas;
    List<RawDataModel> RawDatas;
    List<string> listRows = new List<string>();
    using (var reader = new StreamReader(File.OpenRead(fileLocation)))
    {
        while (!reader.EndOfStream)
        {
            listRows.Add(reader.ReadLine());
        }
    }
    var metaData = new List<string>();
    var rawData = new List<string>();
    foreach (var row in listRows)
    {
        // Rows whose first field is not numeric are metadata
        var rowName = row.Split(',')[0];
        bool parsed = int.TryParse(rowName, out int result);
        if (!parsed)
        {
            metaData.Add(row);
        }
        else
        {
            rawData.Add(row);
        }
    }
    //Assigns the vertical header name and value to the object by splitting the string
    RawDatas = GetRawData.SplitRawData(rawData);
    SaveRawData(RawDatas);
    MetaDatas = GetMetaData.SplitMetaData(metaData);
    SaveMetaData(MetaDatas);
}
This code then passes the objects on to create the DataTable and insert the data.
private DataTable CreateRawDataTable
{
    get
    {
        var dt = new DataTable();
        dt.Columns.Add("Id", typeof(int));
        dt.Columns.Add("SerialNumber", typeof(string));
        dt.Columns.Add("ReadingNumber", typeof(int));
        dt.Columns.Add("ReadingDate", typeof(string));
        dt.Columns.Add("ReadingTime", typeof(string));
        dt.Columns.Add("RunTime", typeof(string));
        dt.Columns.Add("Temperature", typeof(double));
        dt.Columns.Add("ProjectGuid", typeof(Guid));
        dt.Columns.Add("CombineDateTime", typeof(string));
        return dt;
    }
}
public void SaveRawData(List<RawDataModel> data)
{
    DataTable dt = CreateRawDataTable;
    var count = data.Count;
    for (var i = 0; i < count; i++) // start at 0, or the first row is skipped
    {
        DataRow row = dt.NewRow();
        row["Id"] = data[i].Id;
        row["ProjectGuid"] = data[i].ProjectGuid;
        row["SerialNumber"] = data[i].SerialNumber;
        row["ReadingNumber"] = data[i].ReadingNumber;
        row["ReadingDate"] = data[i].ReadingDate;
        row["ReadingTime"] = data[i].ReadingTime;
        row["CombineDateTime"] = data[i].CombineDateTime;
        row["RunTime"] = data[i].RunTime;
        row["Temperature"] = data[i].Temperature;
        dt.Rows.Add(row);
    }
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (SqlTransaction tr = conn.BeginTransaction())
        {
            using (var sqlBulk = new SqlBulkCopy(conn, SqlBulkCopyOptions.Default, tr))
            {
                sqlBulk.BatchSize = 1000;
                sqlBulk.DestinationTableName = "RawData";
                sqlBulk.WriteToServer(dt);
            }
            tr.Commit();
        }
    }
}
Is there another or a better way to do this to improve performance, so that the time to upload is reduced? It can take a long time, and I am seeing an ever-increasing use of memory, up to around 500 MB.
TIA
You can improve performance by removing the DataTable and reading from the input stream directly.
SqlBulkCopy has a WriteToServer overload that accepts an IDataReader instead of an entire DataTable.
CsvHelper can parse CSV files using a StreamReader as input. It provides CsvDataReader as an IDataReader implementation on top of the CSV data. This allows reading directly from the input stream and writing to SqlBulkCopy.
The following method will read from an IFormFile, parse the stream using CsvHelper and use the CSV's header fields to configure a SqlBulkCopy instance:
public async Task ToTable(IFormFile file, string table)
{
    using (var stream = file.OpenReadStream())
    using (var tx = new StreamReader(stream))
    using (var reader = new CsvReader(tx)) // newer CsvHelper versions require a culture: new CsvReader(tx, CultureInfo.InvariantCulture)
    using (var rd = new CsvDataReader(reader))
    {
        var headers = reader.Context.HeaderRecord;
        var bcp = new SqlBulkCopy(_connection)
        {
            DestinationTableName = table
        };
        // Assume the file headers and table fields have the same names
        foreach (var header in headers)
        {
            bcp.ColumnMappings.Add(header, header);
        }
        await bcp.WriteToServerAsync(rd);
    }
}
This way nothing is ever written to a temp file or cached in memory. The uploaded files are parsed and written to the database directly.
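For completeness, here is a minimal sketch of how this method might be called from a controller action (the action name and return value are illustrative, not from the answer):

[HttpPost]
public async Task<IActionResult> UploadCsvFiles(ICollection<IFormFile> files)
{
    foreach (var file in files)
    {
        // Stream each upload straight into the destination table
        await ToTable(file, "RawData");
    }
    return Ok();
}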
In addition to #Panagiotis's answer, why don't you interleave your file processing with the file upload? Wrap your file-processing logic in an async method, change the loop to a Parallel.ForEach, and process each file as it arrives instead of waiting for all of them:
private static readonly object listLock = new Object(); // only once at class level

List<string> fileLocations = new List<string>();
// Note: Parallel.ForEach does not await async lambdas, so the work inside
// the loop body is done synchronously here.
Parallel.ForEach(files, (formFile) =>
{
    var filePath = Path.GetTempFileName();
    if (formFile.Length > 0)
    {
        using (var stream = new FileStream(filePath, FileMode.Create))
        {
            formFile.CopyTo(stream);
        }
        ProcessFileInToDbAsync(filePath).GetAwaiter().GetResult();
    }
    // Added lock for thread safety of the List
    lock (listLock)
    {
        fileLocations.Add(filePath);
    }
});
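If you'd rather keep the copies asynchronous, a Task.WhenAll variant avoids blocking thread-pool threads. This is a sketch assuming the same ProcessFileInToDbAsync method as above, with a ConcurrentBag replacing the lock:

// requires System.Linq and System.Collections.Concurrent
var fileLocations = new ConcurrentBag<string>();
await Task.WhenAll(files.Select(async formFile =>
{
    var filePath = Path.GetTempFileName();
    if (formFile.Length > 0)
    {
        using (var stream = new FileStream(filePath, FileMode.Create))
        {
            await formFile.CopyToAsync(stream);
        }
        await ProcessFileInToDbAsync(filePath); // assumed to exist, as in the answer
    }
    fileLocations.Add(filePath);
}));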
Thanks to #Panagiotis Kanavos, I was able to work out what to do. Firstly, the way I was calling the methods was leaving them in memory. The CSV file I have is in two parts: vertical metadata and then the usual horizontal information, so I needed to split them into two. Saving them as tmp files was also causing an overhead. It has gone from taking 5-6 minutes to now taking a minute, which for 100 files containing 8,500 rows isn't bad, I suppose.
Calling the method:
public async Task<IActionResult> UploadCsvFiles(ICollection<IFormFile> files, IFormCollection fc)
{
    foreach (var f in files)
    {
        var getData = new GetData(_configuration);
        await getData.SplitCsvData(f, uid);
    }
    return whatever;
}
This is the method doing the splitting:
public async Task SplitCsvData(IFormFile file, string uid)
{
    var m = new List<string>();
    var r = new List<string>();
    using (var stream = file.OpenReadStream())
    using (var reader = new StreamReader(stream))
    {
        while (!reader.EndOfStream)
        {
            var line = reader.ReadLine();
            var header = line.Split(',')[0];
            bool parsed = int.TryParse(header, out int result);
            if (!parsed)
            {
                m.Add(line);
            }
            else
            {
                r.Add(line);
            }
        }
    }
    //TODO: Validation
    //This splits the list into the metadata model. This is just a single object with static fields.
    var metaData = SplitCsvMetaData.SplitMetaData(m, uid);
    DataTable dtm = CreateMetaData(metaData);
    var serialNumber = metaData.LoggerId;
    await SaveMetaData("MetaData", dtm);

    var lrd = new List<RawDataModel>();
    foreach (string row in r)
    {
        var fields = row.Split(','); // split once per row instead of once per field
        lrd.Add(new RawDataModel
        {
            Id = 0,
            SerialNumber = serialNumber,
            ReadingNumber = Convert.ToInt32(fields[0]),
            ReadingDate = Convert.ToDateTime(fields[1]).ToString("yyyy-MM-dd"),
            ReadingTime = Convert.ToDateTime(fields[2]).ToString("HH:mm:ss"),
            RunTime = fields[3],
            Temperature = Convert.ToDouble(fields[4]),
            ProjectGuid = uid,
            CombineDateTime = Convert.ToDateTime(fields[1] + " " + fields[2]).ToString("yyyy-MM-dd HH:mm:ss")
        });
    }
    await SaveRawData("RawData", lrd);
}
I then use a DataTable for the metadata (which takes 20 seconds for 100 files), as I map the field names to the columns.
public async Task SaveMetaData(string table, DataTable dt)
{
    using (SqlBulkCopy sqlBulk = new SqlBulkCopy(_configuration.GetConnectionString("DefaultConnection"), SqlBulkCopyOptions.Default))
    {
        sqlBulk.DestinationTableName = table;
        await sqlBulk.WriteToServerAsync(dt);
    }
}
I then use FastMember for the larger raw-data part, which is more like a traditional CSV.
public async Task SaveRawData(string table, IEnumerable<RawDataModel> lrd)
{
    using (SqlBulkCopy sqlBulk = new SqlBulkCopy(_configuration.GetConnectionString("DefaultConnection"), SqlBulkCopyOptions.Default))
    using (var reader = ObjectReader.Create(lrd, "Id", "SerialNumber", "ReadingNumber", "ReadingDate", "ReadingTime", "RunTime", "Temperature", "ProjectGuid", "CombineDateTime"))
    {
        sqlBulk.DestinationTableName = table;
        await sqlBulk.WriteToServerAsync(reader);
    }
}
I am sure this can be improved on, but for now, this works really well.
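One further tweak worth trying (an untested sketch, not from the original post): SqlBulkCopy's EnableStreaming property tells it to stream rows from the IDataReader rather than buffering them, which can help memory use on large loads.

using (var sqlBulk = new SqlBulkCopy(_configuration.GetConnectionString("DefaultConnection"), SqlBulkCopyOptions.Default)
{
    DestinationTableName = table,
    EnableStreaming = true, // stream rows from the IDataReader instead of buffering
    BatchSize = 5000
})
using (var reader = ObjectReader.Create(lrd, "Id", "SerialNumber", "ReadingNumber", "ReadingDate", "ReadingTime", "RunTime", "Temperature", "ProjectGuid", "CombineDateTime"))
{
    await sqlBulk.WriteToServerAsync(reader);
}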
I have a set of instances of the Data class that I want to compare.
Each instance has an unknown number of items in its Files property.
I want to compare each instance of Data to the others and set FoundDifference to true if a version difference is found between two files with the same Name value.
Is there a simple algorithm to accomplish this?
Here is a sample setup of how the objects might look.
In this example you'd want FoundDifference set to true on everything except f1, f21, and f31.
class Data
{
    public string DC { get; set; }
    public List<File> Files { get; set; } = new List<File>(); // initialized so Add below doesn't throw
}

class File
{
    public string Name { get; set; }
    public string Version { get; set; }
    public bool FoundDifference { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        Data d1 = new Data();
        d1.DC = "DC1";
        File f1 = new File();
        f1.Name = "File1";
        f1.Version = "1";
        d1.Files.Add(f1);
        File f2 = new File();
        f2.Name = "File2";
        f2.Version = "1";
        d1.Files.Add(f2);
        File f3 = new File();
        f3.Name = "File3";
        f3.Version = "1";
        d1.Files.Add(f3);
        //Another
        Data d2 = new Data();
        d2.DC = "DC2";
        File f21 = new File();
        f21.Name = "File1";
        f21.Version = "1";
        d2.Files.Add(f21);
        File f22 = new File();
        f22.Name = "File2";
        f22.Version = "2";
        d2.Files.Add(f22);
        File f23 = new File();
        f23.Name = "File3";
        f23.Version = "1";
        d2.Files.Add(f23);
        //Another
        Data d3 = new Data();
        d3.DC = "DC3";
        File f31 = new File();
        f31.Name = "File1";
        f31.Version = "1";
        d3.Files.Add(f31);
        File f32 = new File();
        f32.Name = "File2";
        f32.Version = "2";
        d3.Files.Add(f32);
        File f33 = new File();
        f33.Name = "File3";
        f33.Version = "5";
        d3.Files.Add(f33);
        // How can I change all Files' FoundDifference prop to true if the Name is the same and a difference in Version is found?
        Console.ReadLine();
    }
}
I'd handle that by using a Dictionary<string, List<File>> to keep track of the files from each Data, like this: iterate all the files in all the Data instances; for each file, look up the file name in the dictionary and, if not found, create a new list and add it. Then check whether that list has any files with a different version; if one is found, set all the flags. Finally, add the file to the list.
public void SetDifferences(IEnumerable<Data> datas)
{
    var fileLookup = new Dictionary<string, List<File>>();
    foreach (var file in datas.SelectMany(d => d.Files))
    {
        if (!fileLookup.TryGetValue(file.Name, out var fileList))
        {
            fileList = new List<File>();
            fileLookup.Add(file.Name, fileList);
        }
        // Any previously seen file with the same name but a different version?
        if (fileList.Any(f => f.Version != file.Version))
        {
            foreach (var other in fileList)
            {
                other.FoundDifference = true;
            }
            file.FoundDifference = true;
        }
        fileList.Add(file);
    }
}
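An equivalent approach, if you prefer LINQ (a sketch using the same Data and File classes): group all files by name and flag every group that contains more than one distinct version.

public void SetDifferencesLinq(IEnumerable<Data> datas)
{
    var groupsWithDifferences = datas
        .SelectMany(d => d.Files)
        .GroupBy(f => f.Name)
        .Where(g => g.Select(f => f.Version).Distinct().Count() > 1);

    foreach (var group in groupsWithDifferences)
    {
        foreach (var file in group)
        {
            file.FoundDifference = true;
        }
    }
}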
I currently work on a Windows Forms application. I have 3 lists of data and I want to add every list to a column of a datagrid. Is there a way I can do this?
XDocument doc = XDocument.Load(Globals.pathNotifFile);
var dates = doc.Descendants("Date");
var hours = doc.Descendants("Time");
var message = doc.Descendants("Message");
var hoursCollection = new List<String>();
var dateCollection = new List<String>();
var messageCollection = new List<String>();
foreach (var date in dates)
{
    dateCollection.Add(date.Value);
}
foreach (var hour in hours)
{
    hoursCollection.Add(hour.Value);
}
foreach (var messages in message)
{
    messageCollection.Add(messages.Value);
}
return Tuple.Create(hoursCollection, dateCollection, messageCollection);
}
The easiest way to accomplish this task is to build one object which contains your three datapoints. For example:
public class MyGridDateTime
{
    public string Hour { get; set; }
    public string Date { get; set; }
    public string Message { get; set; }
}

public void InitializeGrid()
{
    List<MyGridDateTime> list = new List<MyGridDateTime>();
    int i = 0;
    foreach (string hour in hoursCollection)
    {
        list.Add(new MyGridDateTime { Hour = hour, Date = dateCollection[i], Message = messageCollection[i] });
        i++;
    }
    grid.DataSource = list;
}
Note this only works if all of your lists contain the same amount of data. Otherwise you need to guard the lookups and fall back to string.Empty instead of getting an exception when the lists are not the same size.
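A guarded variant might look like this (a sketch; the collection names are taken from the question's code):

public void InitializeGridSafe()
{
    var list = new List<MyGridDateTime>();
    int count = Math.Max(hoursCollection.Count,
                Math.Max(dateCollection.Count, messageCollection.Count));
    for (int i = 0; i < count; i++)
    {
        list.Add(new MyGridDateTime
        {
            // Fall back to string.Empty when a list is shorter than the others
            Hour = i < hoursCollection.Count ? hoursCollection[i] : string.Empty,
            Date = i < dateCollection.Count ? dateCollection[i] : string.Empty,
            Message = i < messageCollection.Count ? messageCollection[i] : string.Empty
        });
    }
    grid.DataSource = list;
}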
I've got a List of Document
public class Document
{
    public string[] fullFilePath;
    public bool isPatch;
    public string destPath;

    public Document() { }

    public Document(string[] fullFilePath, bool isPatch, string destPath)
    {
        this.fullFilePath = fullFilePath;
        this.isPatch = isPatch;
        this.destPath = destPath;
    }
}
The fullFilePath should be a List or an array of paths.
For example:
Document 1
---> C:\1.pdf
---> C:\2.pdf
Document 2
---> C:\1.pdf
---> C:\2.pdf
---> C:\3.pdf
etc.
My problem: if I use a string array, all Documents end up with null in fullFilePath.
If I use a List for fullFilePath, all Documents get the same entries as the last Document.
Here is how the List is filled:
int docCount = -1;
int i = 0;
List<Document> Documents = new List<Document>();
string[] sourceFiles = new string[1];
foreach (string file in filesCollected)
{
    string bc;
    string bcValue;
    if (Settings.Default.barcodeEngine == "Leadtools")
    {
        bc = BarcodeReader.ReadBarcodeSymbology(file);
        bcValue = "PatchCode";
    }
    else
    {
        bc = BarcodeReader.ReadBacrodes(file);
        bcValue = "009";
    }
    if (bc == bcValue)
    {
        if (Documents.Count > 0)
        {
            Array.Clear(sourceFiles, 0, sourceFiles.Length);
            Array.Resize<string>(ref sourceFiles, 1);
            i = 0;
        }
        sourceFiles[i] = file;
        i++;
        Array.Resize<string>(ref sourceFiles, i + 1);
        Documents.Add(new Document(sourceFiles, true, ""));
        docCount++;
    }
    else
    {
        if (Documents.Count > 0)
        {
            sourceFiles[i] = file;
            i++;
            Array.Resize<string>(ref sourceFiles, i + 1);
            Documents[docCount].fullFilePath = sourceFiles;
        }
    }
}
You are using the same instance of the array for every document. The instance is updated with a new list of files on every inner-loop iteration, but an array is a reference to an area of memory (an oversimplification, I know, but enough for the purpose of this answer), and if you change the content of that area of memory you change it for every document.
You need to create a new instance of the source-files array for every new document you add to your documents list. Moreover, when you are not certain of the number of elements that will end up in the array, it is a lot better to use a generic List and remove all the code that handles resizing of the array.
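To make the pitfall concrete, here is a minimal sketch using the original array-based Document class; both documents see the mutation because they share the same array instance:

var shared = new[] { "C:\\1.pdf" };
var docA = new Document(shared, true, "");
var docB = new Document(shared, true, "");
shared[0] = "C:\\other.pdf";
// docA.fullFilePath[0] and docB.fullFilePath[0] are now both "C:\\other.pdf"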
First, change the class definition:
public class Document
{
    public List<string> fullFilePath;
    public bool isPatch;
    public string destPath;

    public Document() { }

    public Document(List<string> fullFilePath, bool isPatch, string destPath)
    {
        this.fullFilePath = fullFilePath;
        this.isPatch = isPatch;
        this.destPath = destPath;
    }
}
And now change your inner loop to:
foreach (string file in filesCollected)
{
    string bc;
    string bcValue;
    ....
    if (bc == bcValue)
    {
        List<string> files = new List<string>();
        files.Add(file);
        Documents.Add(new Document(files, true, ""));
        docCount++;
    }
    else
    {
        Documents[docCount].fullFilePath.Add(file);
    }
}
Notice that when I need to add a new Document, I build a new List<string>, add the current file, and pass everything to the constructor (in reality this should be moved directly inside the constructor of the Document class). When you just want to add a new file, you can add it directly to the public fullFilePath field.
Moving the handling of the files inside the Document class, the code could be rewritten as:
public class Document
{
    public List<string> fullFilePath;
    public bool isPatch;
    public string destPath;

    public Document()
    {
        // Every constructor initializes the list internally
        fullFilePath = new List<string>();
    }

    public Document(string aFile, bool isPatch, string destPath)
    {
        // Every constructor initializes the list internally
        fullFilePath = new List<string>();
        this.fullFilePath.Add(aFile);
        this.isPatch = isPatch;
        this.destPath = destPath;
    }

    public void AddFile(string aFile)
    {
        this.fullFilePath.Add(aFile);
    }
}
Of course, now in your calling code you pass only the new file or call AddFile, without the need to check for the list initialization.
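With that refactoring, the inner loop might shrink to something like this (a sketch based on the code above):

if (bc == bcValue)
{
    Documents.Add(new Document(file, true, ""));
    docCount++;
}
else if (Documents.Count > 0)
{
    Documents[docCount].AddFile(file);
}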
The issue should be here:
string[] sourceFiles = new string[1];
If you move this line of code into your foreach, you should solve the problem: the current code always uses the same variable, and therefore the same array reference, for every document.
int docCount = -1;
int i = 0;
List<Document> Documents = new List<Document>();
foreach (string file in filesCollected)
{
    string[] sourceFiles = new string[1]; // a fresh array per iteration
    string bc;
    string bcValue;
    if (Settings.Default.barcodeEngine == "Leadtools")
    {
        bc = BarcodeReader.ReadBarcodeSymbology(file);
        bcValue = "PatchCode";
    }
    else
    {
        bc = BarcodeReader.ReadBacrodes(file);
        bcValue = "009";
    }
    if (bc == bcValue)
    {
        i = 0; // reset for every new document, since sourceFiles is now created fresh
        sourceFiles[i] = file;
        i++;
        Array.Resize<string>(ref sourceFiles, i + 1);
        Documents.Add(new Document(sourceFiles, true, ""));
        docCount++;
    }
    else
    {
        if (Documents.Count > 0)
        {
            // Grow the current document's own array and append the file to it
            sourceFiles = Documents[docCount].fullFilePath;
            sourceFiles[sourceFiles.Length - 1] = file;
            Array.Resize<string>(ref sourceFiles, sourceFiles.Length + 1);
            Documents[docCount].fullFilePath = sourceFiles;
        }
    }
}
I have my code as below
string[] keys = { "myCustomUserControl.ascx", "myCustomUserControl.ascx.cs", "myCustomUserControl.ascx.designer.cs" };
string customUserControlName = CommonDataCalls.GetCustomUserControlName(keys);
UserControl objUserControl = (UserControl)this.LoadControl("~/UserControls/" + customUserControlName);
userControlPlaceHolder.Controls.Add(objUserControl);
The definition of GetCustomUserControlName is as below
public string GetCustomUserControlName(string[] keys)
{
    try
    {
        string userControlsPhysicalPath = System.Web.HttpContext.Current.Server.MapPath("~/UserControls/");
        DataTable objDataTable = new DataTable();
        foreach (string key in keys)
        {
            objRequestVO.addObject("ACA_KEY", key);
            CResponseVO objResponseVO = (CResponseVO)objGateway.ExecuteBusinessService(CConstant.ADMIN, CConstant.ASSEMBLY_INFO, CConstant.SELECT, objRequestVO);
            DataSet objDataSet = (DataSet)objResponseVO.getObject("RES_DS");
            cUserTrce objGeneral = new cUserTrce();
            if (!objGeneral.IsNullOrEmptyDataset(objDataSet))
            {
                if (objDataTable.Rows.Count == 0)
                {
                    objDataTable = objDataSet.Tables[0].Clone();
                }
                objDataTable.Rows.Add(objDataSet.Tables[0].Rows[0].ItemArray);
            }
        }
        if (objDataTable != null && objDataTable.Rows.Count == 3)
        {
            string containerName = "usercontrols";
            foreach (DataRow dr in objDataTable.Rows)
            {
                string userControlFileBlobUrl = dr["ACA_ASSEMBLY_PATH"].ToString();
                string userControlFileName = dr["ACA_CLASS_NAME"].ToString();
                Storage.Blob blobHandler = new Storage.Blob();
                Stream blobstream = blobHandler.GetBlob(userControlFileBlobUrl, containerName);
                if (!File.Exists(userControlsPhysicalPath + userControlFileName))
                {
                    MemoryStream ms = (MemoryStream)blobstream;
                    using (FileStream outStream = File.OpenWrite(userControlsPhysicalPath + userControlFileName))
                    {
                        ms.WriteTo(outStream);
                        outStream.Flush();
                    }
                }
            }
            string customUserControlName = (from DataRow row in objDataTable.Rows
                                            where row["ACA_KEY"].ToString() == keys[0]
                                            select row["ACA_CLASS_NAME"].ToString()).First();
            return customUserControlName;
        }
        else
        {
            return null;
        }
    }
    catch
    {
        // Swallows all exceptions and returns null; consider logging here
        return null;
    }
}
The method basically copies the user controls to the virtual path at run time.
In the aspx.cs page I try to load the control dynamically.
I can see the file getting copied to the virtual path, but LoadControl gives me an exception saying "Could not load type 'myCustomUserControl'".
I am using an Azure web role.
What is wrong here?
I solved the bug. I am putting it here for anyone to refer to.
It's a one-word change (explained here):
http://blog.kjeldby.dk/2008/11/dynamic-compilation-in-a-web-application/
Change
CodeBehind="myCustomUserControl.ascx.cs"
to
CodeFile="myCustomUserControl.ascx.cs"
and it will start working.
Thanks to #Roopesh & #Kristoffer Brinch Kjeldby.
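In context, the change sits in the .ascx file's @Control directive; a sketch (the Inherits namespace here is illustrative, not from the question):

<%@ Control Language="C#" AutoEventWireup="true" CodeBehind="myCustomUserControl.ascx.cs" Inherits="MyApp.myCustomUserControl" %>

becomes

<%@ Control Language="C#" AutoEventWireup="true" CodeFile="myCustomUserControl.ascx.cs" Inherits="MyApp.myCustomUserControl" %>

The reason this helps: CodeBehind is only a design-time reference and expects the type to exist in a precompiled, deployed assembly, while CodeFile tells ASP.NET to compile the source file on the fly, which is what a control copied to the site at runtime needs.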