I am using LinqToExcel to read the contents of an Excel file.
With a header-mapping class like the following, I can map the properties of my class to columns in the Excel file:
public class Transaction
{
[ExcelColumn("Trans Id")]
public string TradeNumber { get; set; }
[ExcelColumn("Trans Version")]
public string TransVersion { get; set; }
}
However, the incoming file sometimes has a different header: sometimes it is "Trans Id" and sometimes "Trans ID". The program cannot map the column when the header is "Trans ID".
Is there a way to make LinqToExcel compare column names case-insensitively?
Or is there a place where I can override LinqToExcel's comparison method?
Thanks!
I tried to use the
public void AddTransformation<TSheetData>(Expression<Func<TSheetData, object>> property, Func<string, object> transformation);
part of the library, but that only deals with the value, not the column name.
Not sure if this is the best solution for it, but it worked for me. I tried to find similar ways around it, but if you're unable to control the column names, you can read the actual header names from the file first and map to whatever spelling is there:
//Get the Header information
//Worksheet title
//List of Columns (can narrow down if you always know the placement)
ExcelQueryFactory HeaderInfo = new ExcelQueryFactory("FILE NAME.xlsx");
List<string> worksheetName = HeaderInfo.GetWorksheetNames().ToList();
IEnumerable<string> columnNames = HeaderInfo.GetColumnNames(worksheetName[0].ToString());
//Get those values that you're looking for. Pulling in the unedited Excel column name
string TradeNumber_HeaderName = columnNames.Where(a => a.ToUpper().Trim() == "TRANS ID" || a.ToUpper().Trim() == "TRANSID").FirstOrDefault() ?? "Trans ID";
string TransVersion_HeaderName = columnNames.Where(a => a.ToUpper().Trim() == "TRANS VERSION").FirstOrDefault() ?? "Trans Version";
//Whatever your new connection is now, add the mappings to it; this will use those column names dynamically.
ExcelQueryFactory ExcelConn = ...
ExcelConn.AddMapping<Transaction>(x => x.TradeNumber, TradeNumber_HeaderName);
ExcelConn.AddMapping<Transaction>(x => x.TransVersion, TransVersion_HeaderName);
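If more header variants show up over time, the lookup above could be pulled into a small helper. This is only a sketch: HeaderResolver and Resolve are names made up for illustration (they are not part of LinqToExcel), and it assumes you pass in the header list returned by GetColumnNames.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of LinqToExcel.
public static class HeaderResolver
{
    // Returns the header exactly as it is spelled in the file when its trimmed
    // text equals any candidate, ignoring case; otherwise returns the fallback.
    public static string Resolve(IEnumerable<string> actualHeaders, string fallback, params string[] candidates)
    {
        return actualHeaders.FirstOrDefault(header =>
                   candidates.Any(candidate =>
                       string.Equals(header.Trim(), candidate, StringComparison.OrdinalIgnoreCase)))
               ?? fallback;
    }
}
```

The string it returns would then be passed as the column argument to AddMapping, for example Resolve(columnNames, "Trans ID", "Trans Id", "TransId").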
You could define the mapping yourself with:
var excelFile = new ExcelQueryFactory(pathToExcelFile);
excelFile.AddMapping("TradeNumber", "Trans ID"); // first argument is the property name, second is the column header in the file
This is just a suggestion, you would have to create a mapping for each scenario...ughh.
Let us know if the AddMapping works for you.
I am working on a project in which I need to read an Excel file and validate the dataset.
For example, say there is a column called "Date Of Birth" in the Excel sheet; I need to check whether each value is a valid date, because the user might enter a number or just letters in that column. I can't ask for Excel validation to be added to the file, so there is no validation in the Excel file and users can put anything in any column.
The other issue is that the Excel headers are not constant; they can change over time.
Because of that, I can't create model classes for the Excel file.
So I use "DocumentFormat.OpenXML" to read the Excel file, and I use ExpandoObject to store the dataset because, as I said, the headers might change.
I created a basic class to hold the basic info of a cell:
public class CellDetail
{
public string CellHeading { get; set; } = string.Empty;
public string CellValue { get; set; } = string.Empty;
public string CellReference { get; set; } = string.Empty;
}
"CellHeading" is the column name, "CellValue" is the value of the cell, and "CellReference" is the reference of the cell, for example B12 (column "B", row 12).
I was able to read the Excel sheet and create the dataset. Finally, I created JSON from this dataset:
private ExpandoObject ConvertCellToExpandoObject(SpreadsheetDocument spreadsheetDocument, Cell cell)
{
var cellDetail = _excelFileService.GetCellValue(spreadsheetDocument, cell);
dynamic item = new ExpandoObject();
item.Id = cellDetail.CellReference;
item.CellDetail = new CellDetail { CellHeading = cellDetail.CellHeading, CellValue = cellDetail.CellValue, CellReference = cellDetail.CellReference };
return item;
}
public string ExcelDataSetToJSON()
{
List<ExpandoObject> allCellsInOneRow = new List<ExpandoObject>();
List<List<ExpandoObject>> excelDataSet = new();
SpreadsheetDocument spreadsheetDocument;
var sheet = _excelFileService.GetSheetDataBySheetName(FILEPATH, SHEETNAME);
var rowList = sheet.Elements<Row>();
using (spreadsheetDocument = SpreadsheetDocument.Open(FILEPATH, false))
{
foreach (var row in rowList)
{
var cellList = row.Elements<Cell>().Take(6);
foreach (Cell cell in cellList)
{
var cellDetailInExpando = ConvertCellToExpandoObject(spreadsheetDocument, cell);
allCellsInOneRow.Add(cellDetailInExpando);
}
excelDataSet.Add(allCellsInOneRow);
allCellsInOneRow = new List<ExpandoObject>();
}
}
return Newtonsoft.Json.JsonConvert.SerializeObject(excelDataSet);
}
The "ExcelDataSetToJSON" method returns JSON. It looks like this
So now I need to validate this JSON with JSON schema validation. I haven't created a JSON schema yet. I saw a NuGet package named "Json.NET Schema" that is used to validate JSON, and I need to implement something like that.
I mainly need to check these things in the JSON:
value should be a number,
value should be a string,
minimum & maximum range,
value should be a valid date,
value should be one of the given options.
How do I implement my own custom schema and do these validations?
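As a starting point before committing to a schema library, the checks listed above could be written as plain predicates over CellValue. This is a minimal sketch, not the Json.NET Schema approach: CellRules and its method names are invented for illustration, the "yyyy-MM-dd" date format is an assumption, and it assumes the cell value arrives as text (if a date comes back as an Excel serial number, it would need converting first).

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical rule helpers; names and the default date format are assumptions.
public static class CellRules
{
    // True when the text parses as a number and lies within [min, max].
    public static bool IsNumberInRange(string value, decimal min, decimal max) =>
        decimal.TryParse(value, NumberStyles.Number, CultureInfo.InvariantCulture, out var n)
        && n >= min && n <= max;

    // True when the text is a real calendar date in the given exact format.
    public static bool IsDate(string value, string format = "yyyy-MM-dd") =>
        DateTime.TryParseExact(value, format, CultureInfo.InvariantCulture, DateTimeStyles.None, out _);

    // True when the text equals one of the allowed options, ignoring case.
    public static bool IsOneOf(string value, params string[] options) =>
        options.Contains(value, StringComparer.OrdinalIgnoreCase);
}
```

Since the headers change over time, which rule applies to which CellHeading would have to come from configuration rather than from a model class.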
I want to be able to filter out a CSV file and perform data validation on the filtered data. I imagine for loops, but the file has 2 million cells and it would take a long time. I am using Lumenworks CSVReader for accessing the file using C#.
I found this method csvfile.Where<> but I have no idea what to put in the parameters. Sorry I am still new to coding as well.
[EDIT] This is my code for loading the file. Thanks for all the help!
//Creating C# table from CSV data
var csvTable = new DataTable();
var csvReader = new CsvReader(new StreamReader(System.IO.File.OpenRead(filePath[0])), true);
csvTable.Load(csvReader);
//grabs header from the CSV data table
string[] headers = csvReader.GetFieldHeaders(); //this method gets the headers of the CSV file
string filteredData[] = csvReader.Where // this is where I would want to implement the where method, or some sort of way to filter the data
//I can access the rows and columns with this
csvTable.Rows[0][0]
csvTable.Columns[0][0]
//After filtering (maybe even multiple filters) I want to add up all the filtered data (assuming they are integers)
var dataToValidate = 0;
foreach (var data in filteredData) {
dataToValidate += data;
}
if (dataToValidate == 123)
//data is validated
I would read some of the documentation for the package you are using:
https://github.com/phatcher/CsvReader
https://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader
To specifically answer the filtering question, so it only contains the data you are searching for consider the following:
var filteredData = new List<List<string>>();
using (CsvReader csv = new CsvReader(new StreamReader(System.IO.File.OpenRead(filePath[0])), true))
{
string searchTerm = "foo";
while (csv.ReadNextRecord())
{
var row = new List<string>();
for (int i = 0; i < csv.FieldCount; i++)
{
if (csv[i].Contains(searchTerm))
{
row.Add(csv[i]);
}
}
if (row.Count > 0) filteredData.Add(row); // skip rows with no matching cells
}
}
This will give you a list of a list of string that you can enumerate over to do your validation
int dataToValidate = 0;
foreach (var row in filteredData)
{
foreach (var data in row)
{
// do the thing
}
}
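For the "add up the filtered data" step from the question, the inner loop could look something like the sketch below. FilteredDataValidator and SumIntegers are made-up names, and it assumes cells that do not parse as integers should simply be skipped rather than treated as errors.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class FilteredDataValidator
{
    // Adds up every cell that parses as an integer; non-numeric cells are skipped.
    public static int SumIntegers(IEnumerable<IEnumerable<string>> filteredData)
    {
        int total = 0;
        foreach (var row in filteredData)
        {
            foreach (var cell in row)
            {
                if (int.TryParse(cell, out int value))
                {
                    total += value;
                }
            }
        }
        return total;
    }
}
```

The result can then be compared against the expected total, e.g. SumIntegers(filteredData) == 123.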
--- Old Answer ---
Without seeing the code you are using to load the file, it might be a bit difficult to give you a full answer. ~2 million cells may be slow no matter what.
Your .Where comes from System.Linq
https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where?view=net-6.0
A simple example using .Where
//Read the file and return a list of records that match the where clause
public List<CsvFileStructure> ReadCSV()
{
List<CsvFileStructure> data = File.ReadLines(@"C:\Users\Public\Documents\test.csv")
.Select(line => line.Split(','))
// tokens[x] where x is the column number, assumes ID is column 0 and there is no header row
.Select(tokens => new CsvFileStructure { Id = long.Parse(tokens[0]), Value = tokens[1] })
// Where filters based on whatever you are looking for in the CSV
.Where(csvFileStructure => csvFileStructure.Id == 1)
.ToList();
return data;
}
// Map of your data structure
public class CsvFileStructure
{
public long Id { get; set; }
public string Name { get; set; }
public string Value { get; set; }
}
Modified from this answer:
https://stackoverflow.com/a/10332737/7366061
There is no csvreader.Where method. The "where" is part of Linq in C#. The link below shows an example of computing columns in a csv file using Linq:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/how-to-compute-column-values-in-a-csv-text-file-linq
Hi, I have a text file that contains 3 columns, something like this:
contract1;pdf1;63
contract1;pdf2;5
contract1;pdf3;2
contract1;pdf4;00
contract2;pdf1;2
contract2;pdf2;30
contract2;pdf3;5
contract2;pdf4;80
Now I want to write that information to another text file, ordered so that, within each contract, the records whose last column is "2" or "5" come first, something like this:
contract1;pdf3;2
contract1;pdf2;5
contract1;pdf1;63
contract1;pdf4;00
contract2;pdf1;2
contract2;pdf3;5
contract2;pdf2;30
contract2;pdf4;80
How can I do this?
Thanks
You can use LINQ to group and sort the lines after reading, then put them back together:
var output = File.ReadAllLines(@"path-to-file")
.Select(s => s.Split(';'))
.GroupBy(s => s[0])
// OrderBy is stable, so everything other than "2" and "5" keeps its original order
.SelectMany(sg => sg.OrderBy(s => s[2] == "2" ? 0 : s[2] == "5" ? 1 : 2).Select(s => String.Join(";", s)));
Then just write them to a file.
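Here is a self-contained sketch of the group-then-sort idea. It assumes "2" should come first, "5" second, and everything else should keep its original order within each contract, which is what the expected output in the question shows. ContractSorter and Reorder are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class ContractSorter
{
    // Groups lines by the first field, then within each group puts "2" first,
    // "5" second, and leaves the rest in their original order (OrderBy is stable).
    public static List<string> Reorder(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split(';'))
            .GroupBy(fields => fields[0])
            .SelectMany(group => group
                .OrderBy(fields => fields[2] == "2" ? 0 : fields[2] == "5" ? 1 : 2)
                .Select(fields => string.Join(";", fields)))
            .ToList();
    }
}
```

Writing the result out is then a single call such as File.WriteAllLines(outputPath, ContractSorter.Reorder(File.ReadLines(inputPath))), where both paths are placeholders.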
I'm not going to write your program for you, but I would recommend this library for reading and writing delimited files:
https://joshclose.github.io/CsvHelper/getting-started/
When you new up the reader make sure to specify your semi-colon delimiter:
using (var reader = new StreamReader("path\\to\\input_file.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
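csv.Configuration.Delimiter = ";"; // in newer CsvHelper versions, set this on a CsvConfiguration passed to the CsvReader constructor instead
var records = csv.GetRecords<Row>().ToList(); // materialize before the reader is disposed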
// manipulate the data as needed here
}
Your "Row" class (choose a more appropriate name for clarity) will specify the schema of the flat file. It sounds like you don't have headers? If not, you can specify the column index of each property with the Index attribute.
public class Row
{
[Index(1)]
public string MyValue1 { get; set; }
[Index(2)]
public string MyValue2 { get; set; }
}
After reading the data in, you can manipulate it as needed. If the output format is different from the input format, you should convert the input class into an output class. You can use the Automapper library if you would like. However, for a simple project I would suggest just converting the input class into the output class manually.
Lastly, write the data back out:
using (var writer = new StreamWriter("path\\to\\output_file.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteRecords(records);
}
I have a table on my Database where, aside from other columns (one of which is a UniqueIdentifier) I also have one column where I have a JSON array string with values like this (formatted):
[
{
"AttributeId": "fe153d69-8ac1-6e0c-8793-ff0000804eb3",
"AttributeValueId": "64163d69-8ac1-6e0c-8793-ff0000804eb3"
},
{
"AttributeId": "00163d69-8ac1-6e0c-8793-ff0000804eb3",
"AttributeValueId": "67163d69-8ac1-6e0c-8793-ff0000804eb3"
}
]
I then have this AttributeValuePair class, which allows me to read this data in code:
public class AttributeValuePair
{
public AttributeValuePair();
public Guid AttributeId { get; set; }
public Guid AttributeValueId { get; set; }
}
Whenever I get a list of items from this table, I want to be able to filter the resulting list by a single AttributeValueId and get only the items where it matches, regardless of the value of any other attributes.
Since, in code, I must read this attribute collection into a List<AttributeValuePair>, how can I use LINQ to get the items where a particular AttributeValueId is present?
List<AttributeValuePair> attributeValuePairs = serializer.Deserialize<List<AttributeValuePair>>(item.Variant);
I've been lost at it for two hours already and can't seem to find an escape from this one.
EDIT
To be clearer about the problem: what I'm trying to do is, from a List<ProductVariation>, get the possible values for the attribute "Portions" when the attribute "Days" has the specified value. I'm having a lot of trouble using the serializer to build the LINQ statement.
//This code is wrong, I know, but I'm trying to show what I want
result = model.ProductVariations.Find(x, new {serializer.Deserialize<List<AttributeValuePair>>(item.Variant).Where(valuePair => valuePair.AttributeId == attributeId)});
Can you try
attributeValuePairs.Where(valuePair => valuePair.AttributeId == new Guid("SomeValue"));
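To filter the outer list rather than the pairs themselves, the same predicate can be wrapped in Any. The sketch below is only an illustration under a few assumptions: it works on the raw JSON strings instead of ProductVariation objects, it uses System.Text.Json instead of the serializer from the question, and VariantFilter and WithAttributeValue are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public class AttributeValuePair
{
    public Guid AttributeId { get; set; }
    public Guid AttributeValueId { get; set; }
}

// Hypothetical helper for illustration only.
public static class VariantFilter
{
    // Keeps the JSON strings in which at least one pair has the wanted AttributeValueId,
    // regardless of what the other pairs contain.
    public static List<string> WithAttributeValue(IEnumerable<string> variants, Guid attributeValueId)
    {
        return variants
            .Where(json => JsonSerializer.Deserialize<List<AttributeValuePair>>(json)
                .Any(pair => pair.AttributeValueId == attributeValueId))
            .ToList();
    }
}
```

With real ProductVariation items, the lambda would deserialize x.Variant instead of taking the JSON string directly.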
The answer to this question was actually a lot simpler than previously expected:
public string SelectedVariation(string mealsAttribute, string portionsAttribute, string product)
{
Guid productId = new Guid(product);
CatalogManager catalogManager = CatalogManager.GetManager();
EcommerceManager ecommerceManager = EcommerceManager.GetManager();
RegisterOrderAccountFormModel model = new RegisterOrderAccountFormModel();
model.Product = catalogManager.GetProduct(productId);
List<ProductVariation> productVariationsCollection = catalogManager.GetProductVariations(productId).ToList();
//This is the really interesting part for the answer:
return productVariationsCollection.Where(x => x.Variant.ToLower().Contains(mealsAttribute.ToLower()) && x.Variant.ToLower().Contains(portionsAttribute.ToLower())).FirstOrDefault().Id.ToString();
}
I'm new to using Dynamic Objects in C#. I am reading a CSV file very similarly to the code found here: http://my.safaribooksonline.com/book/programming/csharp/9780321637208/csharp-4dot0-features/ch08lev1sec3
I can reference the data I need with a static name, however I can not find the correct syntax to reference using a dynamic name at run time.
For example I have:
var records = from r in myDynamicClass.Records select r;
foreach(dynamic rec in records)
{
Console.WriteLine(rec.SomeColumn);
}
And this works fine if you know the "SomeColumn" name. I would prefer to have the column name as a string and be able to make the same type of reference at run time.
Since one has to create the class which inherits from DynamicObject, simply add an indexer to the class to achieve one's result via strings.
The following example uses the same properties found in the book example, the properties which holds the individual line data that has the column names. Below is the indexer on that class to achieve the result:
public class myDynamicClassDataLine : System.Dynamic.DynamicObject
{
string[] _lineContent; // Actual line data
List<string> _headers; // Associated headers (properties)
public string this[string indexer]
{
get
{
string result = string.Empty;
int index = _headers.IndexOf(indexer);
if (index >= 0 && index < _lineContent.Length)
result = _lineContent[index];
return result;
}
}
}
Then access the data such as
var csv =
@",,SomeColumn,,,
ab,cd,ef,,,"; // Ef is the "SomeColumn"
var data = new myDynamicClass(csv); // This holds multiple myDynamicClassDataLine items
Console.WriteLine (data.OfType<dynamic>().First()["SomeColumn"]); // "ef" is the output.
You will need to use reflection. To get the names you would use:
List<string> columnNames = new List<string>(records.GetType().GetProperties().Select(i => i.Name));
You can then loop through your results and output the values for each column like so:
foreach(dynamic rec in records)
{
foreach (string prop in columnNames)
Console.Write(rec.GetType().GetProperty(prop).GetValue(rec, null));
}
Try this
string column = "SomeColumn";
var result = rec.GetType().GetProperty(column).GetValue(rec, null);