This is an ASP.NET API project with an Angular front end. I'm trying to export data from a database table into an Excel file. So far I can export the whole table, but I'm struggling to export only two or three selected fields.
[HttpGet("download")]
public IActionResult DownloadExcel(string field)
{
    string dbFileName = "DbTableName.xlsx";
    // MIME type for .xlsx files
    const string fileType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    FileInfo file = new FileInfo(dbFileName);
    byte[] fileContents;
    using (ExcelPackage package = new ExcelPackage(file))
    {
        IList<UserTable> userList = _context.UserTable.ToList();
        ExcelWorksheet worksheet = package.Workbook.Worksheets.Add("DbTableName");
        int totalUserRows = userList.Count;
        // serialize the workbook so it can be returned as a file
        fileContents = package.GetAsByteArray();
    }
    return File(fileContents, fileType, dbFileName);
}
There's no need to write a long if ... else if ... else if chain to map each field name.
A nicer way is to:
Take a field list (IList<string>) as the parameter.
Build the list of fields to export by intersecting it with the property names of UserTable.
Use reflection to read the value of each selected property.
Implementation
[HttpGet("download")]
public IActionResult DownloadExcel(IList<string> fields)
{
    // get the required field list: keep only names that are real UserTable properties
    var userType = typeof(UserTable);
    fields = userType.GetProperties().Select(p => p.Name).Intersect(fields).ToList();
    if (fields.Count == 0) { return BadRequest(); }
    // resolve each PropertyInfo once instead of once per cell
    var props = fields.Select(f => userType.GetProperty(f)).ToList();
    using (ExcelPackage package = new ExcelPackage())
    {
        IList<UserTable> userList = _context.UserTable.ToList();
        ExcelWorksheet worksheet = package.Workbook.Worksheets.Add("DbTableName");
        // generate header line: use [DisplayName] when present, otherwise the property name
        for (var c = 0; c < props.Count; c++)
        {
            var displayName = props[c].GetCustomAttribute<DisplayNameAttribute>()?.DisplayName;
            worksheet.Cells[1, c + 1].Value = string.IsNullOrEmpty(displayName) ? props[c].Name : displayName;
        }
        // generate row lines, starting at row 2 because the first row is the header
        for (var r = 0; r < userList.Count; r++)
        {
            for (var c = 0; c < props.Count; c++)
            {
                worksheet.Cells[r + 2, c + 1].Value = props[c].GetValue(userList[r]);
            }
        }
        var stream = new MemoryStream(package.GetAsByteArray());
        return new FileStreamResult(stream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
    }
}
You can configure the display name using the DisplayNameAttribute:
public class UserTable
{
public int Id{get;set;}
[DisplayName("First Name")]
public string fName { get; set; }
[DisplayName("Last Name")]
public string lName { get; set; }
[DisplayName("Gender")]
public string gender { get; set; }
}
You can add properties to UserTable as you like without hard-coding them in the DownloadExcel method.
Demo:
Passing the field list fields[0]=fName&fields[1]=lName&fields[2]=Non-Exist generates an Excel file with only the First Name and Last Name columns; the non-existent field is dropped by the intersect.
[Update]
To export all the fields, we can treat a missing fields parameter as a request for everything. That is, when fields is null or fields.Count == 0, we export all the fields:
[HttpGet("download")]
public IActionResult DownloadExcel(IList<string> fields)
{
// get the required field list
var userType = typeof(UserTable);
var pis= userType.GetProperties().Select(p => p.Name);
if(fields?.Count >0){
fields = pis.Intersect(fields).ToList();
} else{
fields = pis.ToList();
}
using (ExcelPackage package = new ExcelPackage()){
....
}
}
If you want to use a DataTable, you can pick the columns you need like this:
string[] selectedColumns = new[] { "Column1","Column2"};
DataTable dt= new DataView(fromDataTable).ToTable(false, selectedColumns);
Or, if you are working with a list, you can use LINQ to select particular columns:
var xyz = from a in prod.Categories
where a.CatName.EndsWith("A")
select new { CatName=a.CatName, CatID=a.CatID, CatQty = a.CatQty};
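As a rough illustration of the DataView.ToTable approach mentioned above, here is a minimal, self-contained sketch. The class name, method names, and sample rows are hypothetical and exist only for this example; the column names are borrowed from the UserTable class earlier in the thread.

```csharp
using System;
using System.Data;

public static class ColumnSelectionSketch
{
    // Returns a copy of 'source' containing only the named columns, in the given order.
    // DataView.ToTable throws an ArgumentException if a column name does not exist.
    public static DataTable SelectColumns(DataTable source, params string[] columnNames)
    {
        // distinct: false keeps duplicate rows in the result
        return new DataView(source).ToTable(false, columnNames);
    }

    // Builds a small in-memory table (hypothetical sample data).
    public static DataTable BuildSample()
    {
        var table = new DataTable("Users");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("fName", typeof(string));
        table.Columns.Add("lName", typeof(string));
        table.Columns.Add("gender", typeof(string));
        table.Rows.Add(1, "Ram", "Kumar", "M");
        table.Rows.Add(2, "Sam", "Lee", "F");
        return table;
    }
}
```

Calling ColumnSelectionSketch.SelectColumns(table, "fName", "lName") yields a new table with just those two columns, which could then be written to the worksheet instead of the full table.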
I have a working solution for uploading a CSV file. Currently, I use the IFormCollection for a user to upload multiple CSV files from a view.
The CSV files are saved as a temp file as follows:
List<string> fileLocations = new List<string>();
foreach (var formFile in files)
{
var filePath = Path.GetTempFileName();
if (formFile.Length > 0)
{
using (var stream = new FileStream(filePath, FileMode.Create))
{
await formFile.CopyToAsync(stream);
}
}
fileLocations.Add(filePath);
}
I send the list of file locations to another method (just below). It loops through the file locations and streams the data from the temp files; I then use a DataTable and SqlBulkCopy to insert the data. I currently upload between 50 and 200 files at a time, and each file is around 330 KB. Inserting a hundred files (around 30-35 MB in total) takes around 6 minutes.
public void SplitCsvData(string fileLocation, Guid uid)
{
MetaDataModel MetaDatas;
List<RawDataModel> RawDatas;
List<string> listRows = new List<string>();
// dispose the reader (and the underlying file handle) once all lines are read
using (var reader = new StreamReader(File.OpenRead(fileLocation)))
{
    while (!reader.EndOfStream)
    {
        listRows.Add(reader.ReadLine());
    }
}
var metaData = new List<string>();
var rawData = new List<string>();
foreach (var row in listRows)
{
var rowName = row.Split(',')[0];
bool parsed = int.TryParse(rowName, out int result);
if (parsed == false)
{
metaData.Add(row);
}
else
{
rawData.Add(row);
}
}
//Assigns the vertical header name and value to the object by splitting string
RawDatas = GetRawData.SplitRawData(rawData);
SaveRawData(RawDatas);
MetaDatas = GetMetaData.SplitRawData(rawData);
SaveRawData(RawDatas);
}
This code then passes the objects to the following methods, which create the DataTable and insert the data.
private DataTable CreateRawDataTable
{
get
{
var dt = new DataTable();
dt.Columns.Add("Id", typeof(int));
dt.Columns.Add("SerialNumber", typeof(string));
dt.Columns.Add("ReadingNumber", typeof(int));
dt.Columns.Add("ReadingDate", typeof(string));
dt.Columns.Add("ReadingTime", typeof(string));
dt.Columns.Add("RunTime", typeof(string));
dt.Columns.Add("Temperature", typeof(double));
dt.Columns.Add("ProjectGuid", typeof(Guid));
dt.Columns.Add("CombineDateTime", typeof(string));
return dt;
}
}
public void SaveRawData(List<RawDataModel> data)
{
DataTable dt = CreateRawDataTable;
var count = data.Count;
for (var i = 1; i < count; i++)
{
DataRow row = dt.NewRow();
row["Id"] = data[i].Id;
row["ProjectGuid"] = data[i].ProjectGuid;
row["SerialNumber"] = data[i].SerialNumber;
row["ReadingNumber"] = data[i].ReadingNumber;
row["ReadingDate"] = data[i].ReadingDate;
row["ReadingTime"] = data[i].ReadingTime;
row["CombineDateTime"] = data[i].CombineDateTime;
row["RunTime"] = data[i].RunTime;
row["Temperature"] = data[i].Temperature;
dt.Rows.Add(row);
}
using (var conn = new SqlConnection(connectionString))
{
conn.Open();
using (SqlTransaction tr = conn.BeginTransaction())
{
using (var sqlBulk = new SqlBulkCopy(conn, SqlBulkCopyOptions.Default, tr))
{
sqlBulk.BatchSize = 1000;
sqlBulk.DestinationTableName = "RawData";
sqlBulk.WriteToServer(dt);
}
tr.Commit();
}
}
}
Is there another or better way to do this that improves performance and reduces the upload time? It can take a long time, and I am seeing memory use grow steadily to around 500 MB.
TIA
You can improve performance by removing the DataTable and reading from the input stream directly.
SqlBulkCopy has a WriteToServer overload that accepts an IDataReader instead of an entire DataTable.
CsvHelper can parse CSV files using a StreamReader as input. It provides CsvDataReader as an IDataReader implementation on top of the CSV data. This allows reading directly from the input stream and writing to SqlBulkCopy.
The following method reads from an IFormFile, parses the stream using CsvHelper, and uses the CSV's fields to configure a SqlBulkCopy instance:
public async Task ToTable(IFormFile file, string table)
{
    using (var stream = file.OpenReadStream())
    using (var tx = new StreamReader(stream))
    // Note: newer CsvHelper releases also expect a CultureInfo argument here,
    // and may expose the header as reader.HeaderRecord; check your version.
    using (var reader = new CsvReader(tx))
    using (var rd = new CsvDataReader(reader))
    using (var bcp = new SqlBulkCopy(_connection) { DestinationTableName = table })
    {
        var headers = reader.Context.HeaderRecord;
        //Assume the file headers and table fields have the same names
        foreach (var header in headers)
        {
            bcp.ColumnMappings.Add(header, header);
        }
        // rows are pulled from the CSV reader as they are sent to the server
        await bcp.WriteToServerAsync(rd);
    }
}
This way nothing is ever written to a temp table or cached in memory. The uploaded files are parsed and written to the database directly.
In addition to @Panagiotis's answer, why not interleave the file processing with the file upload? Wrap the file processing logic in an async method and start one task per file, so each file is processed as soon as it has been copied instead of waiting for all of them. Note that Parallel.ForEach does not await async lambdas, so the sketch below uses Task.WhenAll instead:
// one task per uploaded file; each task returns the temp file it wrote
var tasks = files.Select(async formFile =>
{
    var filePath = Path.GetTempFileName();
    if (formFile.Length > 0)
    {
        using (var stream = new FileStream(filePath, FileMode.Create))
        {
            await formFile.CopyToAsync(stream);
        }
        await ProcessFileInToDbAsync(filePath);
    }
    return filePath;
});
// wait for every file; no lock is needed because each task returns its own result
List<string> fileLocations = (await Task.WhenAll(tasks)).ToList();
Thanks to @Panagiotis Kanavos, I was able to work out what to do. Firstly, the way I was calling the methods was keeping them in memory. The CSV file I have is in two parts: vertical metadata and then the usual horizontal data, so I needed to split them into two. Saving them as temp files was also adding overhead. It has gone from taking 5-6 minutes to taking about a minute, which for 100 files containing 8,500 rows isn't bad I suppose.
Calling the method:
public async Task<IActionResult> UploadCsvFiles(ICollection<IFormFile> files, IFormCollection fc)
{
foreach (var f in files)
{
var getData = new GetData(_configuration);
await getData.SplitCsvData(f, uid);
}
return whatever;
}
This is the method doing the splitting:
public async Task SplitCsvData(IFormFile file, string uid)
{
var data = string.Empty;
var m = new List<string>();
var r = new List<string>();
var records = new List<string>();
using (var stream = file.OpenReadStream())
using (var reader = new StreamReader(stream))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var header = line.Split(',')[0].ToString();
bool parsed = int.TryParse(header, out int result);
if (!parsed)
{
m.Add(line);
}
else
{
r.Add(line);
}
}
}
//TODO: Validation
//This splits the list into the Meta data model. This is just a single object, with static fields.
var metaData = SplitCsvMetaData.SplitMetaData(m, uid);
DataTable dtm = CreateMetaData(metaData);
var serialNumber = metaData.LoggerId;
await SaveMetaData("MetaData", dtm);
//
var lrd = new List<RawDataModel>();
foreach (string row in r)
{
    // split each row once instead of once per field
    var cols = row.Split(',');
    lrd.Add(new RawDataModel
    {
        Id = 0,
        SerialNumber = serialNumber,
        ReadingNumber = Convert.ToInt32(cols[0]),
        ReadingDate = Convert.ToDateTime(cols[1]).ToString("yyyy-MM-dd"),
        ReadingTime = Convert.ToDateTime(cols[2]).ToString("HH:mm:ss"),
        RunTime = cols[3],
        Temperature = Convert.ToDouble(cols[4]),
        ProjectGuid = uid,
        CombineDateTime = Convert.ToDateTime(cols[1] + " " + cols[2]).ToString("yyyy-MM-dd HH:mm:ss")
    });
}
await SaveRawData("RawData", lrd);
}
I then use a DataTable for the metadata (which takes 20 seconds for 100 files), as I map the field names to the columns.
public async Task SaveMetaData(string table, DataTable dt)
{
using (SqlBulkCopy sqlBulk = new SqlBulkCopy(_configuration.GetConnectionString("DefaultConnection"), SqlBulkCopyOptions.Default))
{
sqlBulk.DestinationTableName = table;
await sqlBulk.WriteToServerAsync(dt);
}
}
I then use FastMember for the raw data, which is the bulk of the data and is more like a traditional CSV.
public async Task SaveRawData(string table, IEnumerable<RawDataModel> lrd)
{
using (SqlBulkCopy sqlBulk = new SqlBulkCopy(_configuration.GetConnectionString("DefaultConnection"), SqlBulkCopyOptions.Default))
using (var reader = ObjectReader.Create(lrd, "Id","SerialNumber", "ReadingNumber", "ReadingDate", "ReadingTime", "RunTime", "Temperature", "ProjectGuid", "CombineDateTime"))
{
sqlBulk.DestinationTableName = table;
await sqlBulk.WriteToServerAsync(reader);
}
}
I am sure this can be improved on, but for now, this works really well.
I am passing a Stream from a CSV file from my controller to my business layer for processing. The stream reaches the method fine, but as soon as I create my TextFieldParser and pass in the stream, the data is gone and nothing is processed.
public CsvRecordReportModel processCsvStream(Stream dataStream, RecordSource recordSource, string fileName)
{
//Create instance of the report.
var report = new CsvRecordReportModel();
report.InsertedRecordCount = 0;
report.FileName = fileName;
using (TextFieldParser csvParser = new TextFieldParser(dataStream))
{
csvParser.CommentTokens = new string[] {"#"};
csvParser.SetDelimiters(new string[] {","});
csvParser.HasFieldsEnclosedInQuotes = true;
// Skip the row with the column names
csvParser.ReadLine();
while (!csvParser.EndOfData)
{
//Do stuff
}
}
return report;
}
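The question does not show what happens to the stream before it reaches this method. One possible cause, and this is only an assumption, is that the stream was already read to the end upstream, so any new reader starts at the end and sees no data. The sketch below uses a plain StreamReader rather than TextFieldParser to show that effect and the usual remedy of resetting Position on a seekable stream; the class and method names are hypothetical.

```csharp
using System.IO;
using System.Text;

public static class StreamRewindSketch
{
    // Reads the first line from 'stream'.
    // When 'rewind' is true and the stream supports seeking, it is reset to the start first.
    public static string ReadFirstLine(Stream stream, bool rewind)
    {
        if (rewind && stream.CanSeek)
        {
            stream.Position = 0;
        }
        // leaveOpen: true so the caller still owns the stream after this reader is disposed
        using (var reader = new StreamReader(stream, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            // returns null when the stream has no more data
            return reader.ReadLine();
        }
    }
}
```

If the stream is not seekable (CanSeek is false), rewinding is not an option, and the data would need to be buffered or read only once.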
I have an ASP.NET Web API method, GetPersonDataStream, that returns an HttpResponseMessage and streams each Person object as JSON. When I look at the result, the data comes out as two separate objects with no comma between them, which is not the structure I need.
Actual streamed data:
{"Name":"Ram","Age":30}{"Name":"Sam","Age":32}
But I want it streamed as proper JSON:
{"response": [ {"Name":"Ram","Age":30}, {"Name":"Sam","Age":32} ]}
Is there a way to achieve this? Below is the code I use to stream the data. The number of records will be in the millions, and I don't want to create all the objects up front and then stream them, because that may lead to a System.OutOfMemoryException. So is there a way to edit or construct the output while streaming it? If so, how can I achieve it?
CODE:
[HttpGet]
[Route("GetPersonDataStream")]
public HttpResponseMessage GetPersonDataStream()
{
List<Person> ps = new List<Person>();
Person p1 = new Person();
p1.Name = "Ram";
p1.Age = 30;
Person p2 = new Person();
p2.Name = "Sam";
p2.Age = 32;
ps.Add(p1);
ps.Add(p2);
var response = this.Request.CreateResponse(HttpStatusCode.OK);
response.Content =
new PushStreamContent((stream, content, context) =>
{
foreach (var item in ps)
{
//var result = _clmmgr.GetApprovedCCRDetail(item.ccr_id, liccrDetails);
var serializer = new JsonSerializer();
using (var writer = new StreamWriter(stream))
{
serializer.Serialize(writer, item);
stream.Flush();
}
}
});
return response;
}
public class Person
{
public string Name {get;set;}
public int Age { get; set; }
}
With Json.NET and its JsonTextWriter, you can wrap all the items in a JSON object with an array and still stream the result without building everything in memory first.
response.Content =
new PushStreamContent((stream, content, context) =>
{
using (var sw = new StreamWriter(stream))
using (var jsonWriter = new JsonTextWriter(sw))
{
jsonWriter.WriteStartObject();
{
jsonWriter.WritePropertyName("response");
jsonWriter.WriteStartArray();
{
foreach (var item in ps)
{
var jObject = JObject.FromObject(item);
jObject.WriteTo(jsonWriter);
}
}
jsonWriter.WriteEndArray();
}
jsonWriter.WriteEndObject();
}
});
In DocumentDB documentation examples, I find insertion of C# objects.
// Create the Andersen family document.
Family AndersenFamily = new Family
{
Id = "AndersenFamily",
LastName = "Andersen",
Parents = new Parent[] {
new Parent { FirstName = "Thomas" },
new Parent { FirstName = "Mary Kay"}
},
IsRegistered = true
};
await client.CreateDocumentAsync(documentCollection.DocumentsLink, AndersenFamily);
In my case, I'm receiving JSON strings from the client application and would like to insert them into DocumentDB without deserializing them. I could not find any examples of doing something similar.
Any help is sincerely appreciated.
Thanks
Copied from the published .NET sample code:
private static async Task UseStreams(string colSelfLink)
{
var dir = new DirectoryInfo(@".\Data");
var files = dir.EnumerateFiles("*.json");
foreach (var file in files)
{
using (var fileStream = new FileStream(file.FullName, FileMode.Open, FileAccess.Read))
{
Document doc = await client.CreateDocumentAsync(colSelfLink, Resource.LoadFrom<Document>(fileStream));
Console.WriteLine("Created Document: {0}", doc);
}
}
//Read one of the documents created above directly into a JSON string
Document readDoc = client.CreateDocumentQuery(colSelfLink).Where(d => d.Id == "JSON1").AsEnumerable().First();
string content = JsonConvert.SerializeObject(readDoc);
//Update a document with some JSON text.
//Here we're replacing a previously created document with some new text and even introducing a new property, Status=Cancelled
using (var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes("{\"id\": \"JSON1\",\"PurchaseOrderNumber\": \"PO18009186470\",\"Status\": \"Cancelled\"}")))
{
await client.ReplaceDocumentAsync(readDoc.SelfLink, Resource.LoadFrom<Document>(memoryStream));
}
}