I have a PNG file to which I want to add the properties
Pixels per unit, X axis
Pixels per unit, Y axis
Unit specifier: meters
These properties are explained in the PNG specification: http://www.w3.org/TR/PNG-Chunks.html
I have programmatically read the properties of the .png file to check whether these properties exist, so that I can set their values, but I could not find them in the file.
(Refer to pixel-per-unit.JPG.)
How can I add these properties to the .png file?
Try using the Pngcs library (you need to rename the downloaded DLL to "pngcs.dll").
I needed to add some custom text properties, but you can easily do much more.
Here is my implementation for adding custom text properties:
using Hjg.Pngcs; // https://code.google.com/p/pngcs/
using Hjg.Pngcs.Chunks;
using System;
using System.Collections.Generic;
using System.IO;

namespace MarkerGenerator.Utils
{
    class PngUtils
    {
        public string getMetadata(string file, string key)
        {
            PngReader pngr = FileHelper.CreatePngReader(file);
            string data = pngr.GetMetadata().GetTxtForKey(key);
            pngr.End();
            return data;
        }

        public static void addMetadata(String origFilename, Dictionary<string, string> data)
        {
            String destFilename = "tmp.png";
            PngReader pngr = FileHelper.CreatePngReader(origFilename); // or you can use the constructor
            PngWriter pngw = FileHelper.CreatePngWriter(destFilename, pngr.ImgInfo, true); // idem
            int chunkBehav = ChunkCopyBehaviour.COPY_ALL_SAFE; // copy all 'safe' chunks
            pngw.CopyChunksFirst(pngr, chunkBehav); // copy some metadata from the reader
            foreach (string key in data.Keys)
            {
                PngChunk chunk = pngw.GetMetadata().SetText(key, data[key]);
                chunk.Priority = true;
            }
            int channels = pngr.ImgInfo.Channels;
            if (channels < 3)
                throw new Exception("This example works only with RGB/RGBA images");
            for (int row = 0; row < pngr.ImgInfo.Rows; row++)
            {
                ImageLine l1 = pngr.ReadRowInt(row); // format: RGBRGB... or RGBARGBA...
                pngw.WriteRow(l1, row);
            }
            pngw.CopyChunksLast(pngr, chunkBehav); // metadata can also appear after the image pixels
            pngw.End(); // don't forget this
            pngr.End();
            File.Delete(origFilename);
            File.Move(destFilename, origFilename);
        }

        public static void addMetadata(String origFilename, string key, string value)
        {
            Dictionary<string, string> data = new Dictionary<string, string>();
            data.Add(key, value);
            addMetadata(origFilename, data);
        }
    }
}
I think you are looking for SetPropertyItem. You can find the property IDs here.
You would use the property ID to get and then set the property item for your metadata.
EDIT
The three IDs that you need (I think) are:
0x5111 - Pixel Per Unit X
0x5112 - Pixel Per Unit Y
0x5110 - Pixel Unit
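A minimal sketch of what that could look like with System.Drawing follows. PropertyItem has no public constructor, so the usual trick is to materialize an uninitialized instance; the helper below is my own, and whether GDI+ actually persists these values into the PNG pHYs chunk on save is worth verifying for your case:

using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.Serialization;

static class PngPhysSketch
{
    // PropertyItem has no public constructor; create an uninitialized instance.
    static PropertyItem NewItem(int id, short type, byte[] value)
    {
        var item = (PropertyItem)FormatterServices.GetUninitializedObject(typeof(PropertyItem));
        item.Id = id;
        item.Type = type;
        item.Len = value.Length;
        item.Value = value;
        return item;
    }

    public static void SetPixelsPerUnit(string srcPath, string destPath, uint ppuX, uint ppuY)
    {
        using (var bmp = new Bitmap(srcPath))
        {
            bmp.SetPropertyItem(NewItem(0x5111, 4, BitConverter.GetBytes(ppuX))); // Pixel Per Unit X (type 4 = 32-bit unsigned)
            bmp.SetPropertyItem(NewItem(0x5112, 4, BitConverter.GetBytes(ppuY))); // Pixel Per Unit Y
            bmp.SetPropertyItem(NewItem(0x5110, 1, new byte[] { 1 }));            // Pixel Unit (type 1 = byte); 1 should mean meters, as in pHYs
            bmp.Save(destPath, ImageFormat.Png);
        }
    }
}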
I am trying to read the content of an Arrow file, but I was not able to find the functions to get the actual data out of it, nor any useful example of reading the data. For example here.
The code example for writing and reading in C#:
// Write
var recordBatch = new Apache.Arrow.RecordBatch.Builder(memoryAllocator)
    .Append("Column A", false, col => col.Int32(array => array.AppendRange(Enumerable.Range(5, 15))))
    .Build();

using (var stream = File.OpenWrite(filePath))
using (var writer = new Apache.Arrow.Ipc.ArrowFileWriter(stream, recordBatch.Schema, true))
{
    await writer.WriteRecordBatchAsync(recordBatch);
    await writer.WriteEndAsync();
}

// Read
var reader = Apache.Arrow.Ipc.ArrowFileReader.FromFile(filePath);
var readBatch = await reader.ReadNextRecordBatchAsync();
var col = readBatch.Column(0);
By debugging the code, I can see the values in the col's Values property, but I have no way of accessing this information in the code.
Am I missing something, or is there a different approach to reading the data?
The Apache.Arrow package does not do any compute today. It will read in the file and give you access to the raw buffers of data. This is sufficient for a number of intermediary tasks (e.g. services that shuttle data around or aggregate data files). So if you want to do a lot of operations on the data, you may want some kind of dataframe library.
One such library is the Microsoft.Data.Analysis library, which has a DataFrame type that can be created from an Arrow RecordBatch. There is some explanation and examples of the library in this blog post.
I haven't worked with that library much but I was able to put together a short example of reading an Arrow file and printing the data:
using System;
using System.IO;
using System.Threading.Tasks;
using Apache.Arrow.Ipc;
using Microsoft.Data.Analysis;

namespace DataframeExperiment
{
    class Program
    {
        static async Task AsyncMain()
        {
            using (var stream = File.OpenRead("/tmp/test.arrow"))
            using (var reader = new ArrowFileReader(stream))
            {
                var recordBatch = await reader.ReadNextRecordBatchAsync();
                Console.WriteLine("Read record batch with {0} column(s)", recordBatch.ColumnCount);
                var dataframe = DataFrame.FromArrowRecordBatch(recordBatch);
                var columnX = dataframe["x"];
                foreach (var value in columnX)
                {
                    Console.WriteLine(value);
                }
            }
        }

        static void Main(string[] args)
        {
            AsyncMain().Wait();
        }
    }
}
I created the test file with a small Python script:

import pyarrow as pa
import pyarrow.ipc as ipc

tab = pa.Table.from_pydict({'x': [1, 2, 3], 'y': ['x', 'y', 'z']})
with ipc.RecordBatchFileWriter('/tmp/test.arrow', schema=tab.schema) as writer:
    writer.write_table(tab)
You could presumably also create the test file using C# with Apache.Arrow's array builders.
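For instance, a sketch of the equivalent writer in C# might look like the following. I have not verified this against every package version; in particular the Int64/String column-builder methods are assumed to mirror the Int32 one shown in the question:

var memoryAllocator = new Apache.Arrow.Memory.NativeMemoryAllocator();
var batch = new Apache.Arrow.RecordBatch.Builder(memoryAllocator)
    .Append("x", false, col => col.Int64(array => array.AppendRange(new long[] { 1, 2, 3 })))
    .Append("y", false, col => col.String(array => array.AppendRange(new[] { "x", "y", "z" })))
    .Build();

using (var stream = File.OpenWrite("/tmp/test.arrow"))
using (var writer = new Apache.Arrow.Ipc.ArrowFileWriter(stream, batch.Schema))
{
    await writer.WriteRecordBatchAsync(batch);
    await writer.WriteEndAsync();
}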
Update (Using Apache.Arrow directly)
On the other hand, if you want to use Apache.Arrow directly and still get access to the data, you can use typed arrays (e.g. Int32Array, Int64Array). You will first need to determine the type of your array somehow, either through prior knowledge of the schema or via as/is-style checks or pattern matching.
Here is an example using Apache.Arrow alone:
using System;
using System.IO;
using System.Threading.Tasks;
using Apache.Arrow;
using Apache.Arrow.Ipc;

namespace ArrayValuesExperiment
{
    class Program
    {
        static async Task AsyncMain()
        {
            using (var stream = File.OpenRead("/tmp/test.arrow"))
            using (var reader = new ArrowFileReader(stream))
            {
                var recordBatch = await reader.ReadNextRecordBatchAsync();
                // Here I am relying on the fact that I know column
                // 0 is an int64 array.
                var columnX = (Int64Array)recordBatch.Column(0);
                for (int i = 0; i < columnX.Values.Length; i++)
                {
                    Console.WriteLine(columnX.Values[i]);
                }
            }
        }

        static void Main(string[] args)
        {
            AsyncMain().Wait();
        }
    }
}
Adding to the second approach proposed by Pace, a utility function like the one below can be used to get the values:
private static dynamic GetArrayData(IArrowArray array)
{
    return array switch
    {
        Int32Array int32Array => int32Array.Values.ToArray(),
        Int16Array int16Array => int16Array.Values.ToArray(),
        StringArray stringArray => stringArray.Values.ToArray(),
        FloatArray floatArray => floatArray.Values.ToArray(),
        Int64Array int64Array => int64Array.Values.ToArray(),
        DoubleArray doubleArray => doubleArray.Values.ToArray(),
        Time32Array time32Array => time32Array.Values.ToArray(),
        Time64Array time64Array => time64Array.Values.ToArray(),
        BooleanArray booleanArray => booleanArray.Values.ToArray(),
        Date32Array date32Array => date32Array.Values.ToArray(),
        Date64Array date64Array => date64Array.Values.ToArray(),
        Int8Array int8Array => int8Array.Values.ToArray(),
        UInt16Array uint16Array => uint16Array.Values.ToArray(),
        UInt8Array uint8Array => uint8Array.Values.ToArray(),
        UInt64Array uint64Array => uint64Array.Values.ToArray(),
        _ => throw new NotImplementedException(),
    };
}
Then iterate over the recordBatch:

object[,] results = new object[recordBatch.Length, recordBatch.ColumnCount];
var col = 0;
foreach (var array in recordBatch.Arrays)
{
    var row = 0;
    foreach (var data in GetArrayData(array))
    {
        results[row++, col] = data;
    }
    col++;
}
return results;
Worth noting, however, that a StringArray's Values property returns the raw UTF-8 bytes, so you need to convert back to a string, for example using:
System.Text.Encoding.UTF8.GetString(stringArray.Values)
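Alternatively, StringArray also exposes a per-element accessor that handles the offsets and UTF-8 decoding for you, which avoids decoding the whole buffer at once (a small sketch, assuming column 1 holds the strings):

var stringArray = (StringArray)recordBatch.Column(1); // assumption: column 1 is the string column
for (int i = 0; i < stringArray.Length; i++)
{
    Console.WriteLine(stringArray.GetString(i)); // decodes the UTF-8 slice for row i
}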
using System;
using System.Collections.Generic;
using System.IO;
using OfficeOpenXml;

namespace Project
{
    public class CreateExcel
    {
        public static void GenerateExcel(List<string> headerList, List<string> dataList, FileInfo filePath)
        {
            using (ExcelPackage excel = new ExcelPackage())
            {
                excel.Workbook.Worksheets.Add("Worksheet1");
                // Determine the header range (e.g. A1:D1)
                string headerRange = "A1:" + Char.ConvertFromUtf32(headerList.Count + 64) + "1";
                // Target a worksheet
                var worksheet = excel.Workbook.Worksheets["Worksheet1"];
                // Populate header row data
                worksheet.Cells[headerRange].LoadFromCollection(headerList);
                worksheet.Cells[2, 1].LoadFromCollection(dataList, false);
                excel.SaveAs(filePath);
            }
        }
    }
}
I would like to create an .xlsx file with this function, but headerRange gets the value "A1:^1" (my headerList has 30 elements), and of course I get this error: System.Exception: 'Invalid Address format ^1'.
How do I set the headerRange correctly?
Use LoadFromArrays instead:

var values = new List<object[]> {
    headerList.ToArray(),
    dataList.ToArray()
};
worksheet.Cells["A1"].LoadFromArrays(values);
LoadFromCollection loads data from a strongly typed collection, using reflection to create a different column for each property.
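As for why the original range broke: Char.ConvertFromUtf32(headerList.Count + 64) only yields valid column letters up to 26 columns ('Z'); column 30 lands on '^'. If you do need an address string for an arbitrary column count, the letter arithmetic has to carry into two-letter names. A small sketch of such a helper (my own, not part of EPPlus):

static string ColumnName(int index)
{
    // Converts a 1-based column index to an Excel column name (1 -> "A", 27 -> "AA", 30 -> "AD").
    var name = string.Empty;
    while (index > 0)
    {
        index--; // shift to 0-based so the base-26 arithmetic carries past 'Z'
        name = (char)('A' + index % 26) + name;
        index /= 26;
    }
    return name;
}

// e.g. for 30 headers: "A1:" + ColumnName(30) + "1" == "A1:AD1"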
We use the FlatFile library (https://github.com/forcewake/FlatFile) in some of our applications to parse files delimited with a separator (";"), and it has worked for a long time without problems.
Yesterday we hit a problem with files that have multiple empty fields at the end of a row.
I replicated the problem with a short console application so you can verify it easily:
using FlatFile.Delimited;
using FlatFile.Delimited.Attributes;
using FlatFile.Delimited.Implementation;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace FlatFileTester
{
    class Program
    {
        static void Main(string[] args)
        {
            var layout = GetLayout();
            var factory = new DelimitedFileEngineFactory();
            using (MemoryStream ms = new MemoryStream())
            using (FileStream file = new FileStream(@"D:\shared\dotnet\FlatFileTester\test.csv", FileMode.Open, FileAccess.Read))
            {
                byte[] bytes = new byte[file.Length];
                file.Read(bytes, 0, (int)file.Length);
                ms.Write(bytes, 0, (int)file.Length);
                var flatFile = factory.GetEngine(layout);
                ms.Position = 0;
                List<TestObject> records = flatFile.Read<TestObject>(ms).ToList();
                foreach (var record in records)
                {
                    Console.WriteLine(string.Format("Id=\"{0}\" - DescriptionA=\"{1}\" - DescriptionB=\"{2}\" - DescriptionC=\"{3}\"", record.Id, record.DescriptionA, record.DescriptionB, record.DescriptionC));
                }
            }
            Console.ReadLine();
        }

        public static IDelimitedLayout<TestObject> GetLayout()
        {
            IDelimitedLayout<TestObject> layout = new DelimitedLayout<TestObject>()
                .WithDelimiter(";")
                .WithQuote("\"")
                .WithMember(x => x.Id)
                .WithMember(x => x.DescriptionA)
                .WithMember(x => x.DescriptionB)
                .WithMember(x => x.DescriptionC);
            return layout;
        }
    }

    [DelimitedFile(Delimiter = ";", Quotes = "\"")]
    public class TestObject
    {
        [DelimitedField(1)]
        public int Id { get; set; }

        [DelimitedField(2)]
        public string DescriptionA { get; set; }

        [DelimitedField(3)]
        public string DescriptionB { get; set; }

        [DelimitedField(4)]
        public string DescriptionC { get; set; }
    }
}
This is an example of file:
1;desc1;desc1;desc1
2;desc2;desc2;desc2
3;desc3;;desc3
4;desc4;desc4;
5;desc5;;
So the first four rows are parsed as expected:
all fields have values in the first and second rows
empty string for the third field of the third row
empty string for the fourth field of the fourth row
For the fifth row we expect empty strings in the third and fourth fields, like this:
Id=5
DescriptionA="desc5"
DescriptionB=""
DescriptionC=""
Instead, we receive this:
Id=5
DescriptionA="desc5"
DescriptionB=";" // --> THE SEPARATOR!!!
DescriptionC=""
We can't tell whether this is a configuration problem, a bug in the library, or some other problem in our code.
Has anyone had a similar experience with this library, or can anyone spot a problem in the code above, not linked to the library, that could cause the error?
I took a look and debugged the source code of the open-source library: https://github.com/forcewake/FlatFile.
It seems there is a bug: in this particular case, in which there are two empty fields at the end of a row, the field before the last one of the row is affected.
I opened an issue for this library, hoping some contributor can find time to investigate and, if confirmed, fix it: https://github.com/forcewake/FlatFile/issues/80
For now we decided to fix the wrong values in the list, something like:

string separator = ";";
//...
//...
//...
records.ForEach(x => {
    x.DescriptionC = x.DescriptionC.Replace(separator, "");
});
In our case, anyway, it makes no sense for a field's value to be the separator character itself...
...even if a proper fix in the library would be better.
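If more than one field can be affected, a generalization of the snippet above is to scrub every string property whose parsed value is exactly the separator. This helper is hypothetical (plain reflection, not part of FlatFile):

using System.Collections.Generic;
using System.Linq;

static class FlatFileWorkaround
{
    // Clears any string property whose parsed value is exactly the separator,
    // generalizing the single-field fix above to all fields of the record type.
    public static void ScrubSeparatorArtifacts<T>(IEnumerable<T> records, string separator)
    {
        var stringProps = typeof(T).GetProperties()
            .Where(p => p.PropertyType == typeof(string) && p.CanRead && p.CanWrite)
            .ToList();
        foreach (var record in records)
            foreach (var prop in stringProps)
                if ((string)prop.GetValue(record) == separator)
                    prop.SetValue(record, string.Empty);
    }
}

// Usage: FlatFileWorkaround.ScrubSeparatorArtifacts(records, ";");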
I am completely new to programming and am trying to get the complete row data from a CSV file based on a column value in C#. Example data is as follows:
Mat_No;Device;Mat_Des;Dispo_lvl;Plnt;MS;IPDS;TM;Scope;Dev_Cat
1111;BLB A601;BLB A601;T2;PW01;10;;OP_ELE;LED;
2222;ALP A0001;ALP A0001;T2;PW01;10;;OP_ELE;LED;
If the user enters a Mat_No, he gets the full row data for that particular number.
I have two files, program.cs and filling.cs.
overViewArea.cs contains the following code for reading the CSV file; I don't know how to access the read values from the program.cs file and display them in the console:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace TSDB
{
    class fillData
    {
        public static fillData readCsv()
        {
            fillData getData = new fillData();
            using (var reader = new StreamReader(@"myfile.csv"))
            {
                List<string> headerList = null;
                while (!reader.EndOfStream)
                {
                    var line = reader.ReadLine();
                    if (headerList == null)
                    {
                        headerList = line.Split(';').ToList();
                    }
                    else
                    {
                        var values = line.Split(';');
                        for (int i = 0; i < headerList.Count; i++)
                        {
                            Console.Write(headerList[i] + "=" + values[i] + ";");
                        }
                        Console.WriteLine();
                    }
                }
            }
            return getData; // was "return fillData;", which does not compile
        }
    }
}
Program.cs has the following code:

class Program
{
    static void Main(string[] args)
    {
        fillData data = fillData.readCsv();
        Console.ReadLine();
    }
}
First, please, do not reinvent the wheel: there are many CSV readers available; just use one of them. If you have to write your own routine (say, for a student project), I suggest extracting a method. Try using the File class instead of Stream/StreamReader:
// Simple: quotation has not been implemented
// Disclaimer: demo only, do not roll your own CSV readers
public static IEnumerable<string[]> ReadCsvSimple(string file, char delimiter) {
    return File
        .ReadLines(file)
        .Where(line => !string.IsNullOrEmpty(line)) // skip empty lines if any
        .Select(line => line.Split(delimiter));
}
Having this routine implemented, you can use Linq to query the data, e.g.
If user enters a Mat_No he gets the full row data of that particular number.

Console.WriteLine("Mat No, please?");
string Mat_No_To_Filter = Console.ReadLine();

var result = ReadCsvSimple(@"myfile.csv", ';')
    .Skip(1)
    .Where(record => record[0] == Mat_No_To_Filter);

foreach (var items in result)
    Console.WriteLine(string.Join(";", items));
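If you also want the row back in the question's Header=Value form, a short sketch building on ReadCsvSimple above (reusing Mat_No_To_Filter from the previous snippet):

var lines = ReadCsvSimple(@"myfile.csv", ';').ToList();
var headers = lines.First();

// Find the first row whose Mat_No (column 0) matches, then pair each value with its header.
var match = lines.Skip(1).FirstOrDefault(r => r[0] == Mat_No_To_Filter);
if (match != null)
    Console.WriteLine(string.Join(";", headers.Zip(match, (h, v) => h + "=" + v)));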
I am working on programmatically creating a package with a data flow task containing a Script Component as a source. I have been able to create the package and the data flow task and add a Script Component. However, the Script Component appears to default to a transform.
Does anyone know how to make it a source?
Here is my class with the single method I'm working on:
using System;
using System.Collections.Generic;
using System.IO;
using DynamicPackageCreator.Models;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime;
// Alias to prevent ambiguity
using dtsColumnDataType = Microsoft.SqlServer.Dts.Runtime.Wrapper.DataType;

namespace DynamicPackageCreator
{
    public class DtsClient
    {
        public void CreatePackageWithDataFlowAndScriptSource(string filePath, string dataFlowName, string sourceName, List<OutputDefinition> outputDefinitions)
        {
            // Create the package
            Package pkg = new Package();
            pkg.Name = Path.GetFileNameWithoutExtension(filePath);

            // Create the data flow task
            Executable e = pkg.Executables.Add("STOCK:PipelineTask");
            TaskHost thMainPipe = e as TaskHost;
            thMainPipe.Name = dataFlowName;
            MainPipe dataFlowTask = thMainPipe.InnerObject as MainPipe;

            // Create the source component
            IDTSComponentMetaData100 sourceComponent = dataFlowTask.ComponentMetaDataCollection.New();
            sourceComponent.Name = sourceName;
            sourceComponent.ComponentClassID = SsisComponentType.ScriptComponent.GetComponentClassId();

            // Get the design-time instance of the component
            CManagedComponentWrapper srcDesignTime = sourceComponent.Instantiate();

            // Initialize the component
            srcDesignTime.ProvideComponentProperties();

            int lastOutputId = 0;

            // Add metadata
            foreach (var outputDefinition in outputDefinitions)
            {
                var output = srcDesignTime.InsertOutput(DTSInsertPlacement.IP_AFTER, lastOutputId);
                output.Name = outputDefinition.OutputName;
                lastOutputId = output.ID;

                var outputColumnCollection = output.OutputColumnCollection;
                foreach (var outputColumnDefinition in outputDefinition.OutputColumnDefinitions)
                {
                    var outputColumn = outputColumnCollection.New();
                    outputColumn.Name = outputColumnDefinition.ColumnName;
                    outputColumn.SetDataTypeProperties(dtsColumnDataType.DT_WSTR, outputColumnDefinition.ColumnSize, 0, 0, 0);
                }
            }

            // Reinitialize the metadata
            srcDesignTime.ReinitializeMetaData();

            // Save the package
            Application app = new Application();
            app.SaveToXml(filePath, pkg, null);
        }
    }
}
The OutputDefinition class is a custom class I created for holding the definitions used when creating the outputs.
So, the solution to this issue is to remove all inputs from the component. By default the component has an "Input 0" and an "Output 0", which corresponds to the transform script component type. A source has no inputs, and a destination has no outputs.
To remove the default input and output, add:

sourceComponent.OutputCollection.RemoveAll();
sourceComponent.InputCollection.RemoveAll();
Here:

// ...
// Initialize the component
srcDesignTime.ProvideComponentProperties();

// Remove the default input and output
sourceComponent.OutputCollection.RemoveAll();
sourceComponent.InputCollection.RemoveAll();

int lastOutputId = 0;
// ...
// ...