HDF5 Example code - c#

Using HDF5DotNet, can anyone point me at example code, which will open an hdf5 file, extract the contents of a dataset, and print the contents to standard output?
So far I have the following:
H5.Open();
var h5 = H5F.open("example.h5", H5F.OpenMode.ACC_RDONLY);
var dataset = H5D.open(h5, "/Timings/aaPCBTimes");
var space = H5D.getSpace(dataset);
var size = H5S.getSimpleExtentDims(space);
Then it gets a bit confusing.
I actually want to do some processing on the contents of the dataset but I think once I have dump to standard output I can work it out from there.
UPDATE: I've hacked around this sufficiently to solve my own problem. I failed to realise that a dataset is a multidimensional array; I thought it was more like a database table. In the unlikely event anyone is interested:
double[,] dataArray = new double[size[0], 6];
var wrapArray = new H5Array<double>(dataArray);
var dataType = H5D.getType(dataset);
H5D.read(dataset, dataType, wrapArray);
Console.WriteLine(dataArray[0, 0]);
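Once the values are in a double[,], dumping all of them to standard output is ordinary C# and needs no further HDF5 calls. The following is a minimal sketch, assuming the array has already been filled by H5D.read as above; ArrayDump, FormatRows and Print are hypothetical names and not part of HDF5DotNet.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ArrayDump
{
    // Turns each row of a 2-D array into one tab-separated line.
    public static List<string> FormatRows(double[,] data)
    {
        var lines = new List<string>();
        int rows = data.GetLength(0);   // size of the first dimension
        int cols = data.GetLength(1);   // size of the second dimension
        for (int r = 0; r < rows; r++)
        {
            var cells = new string[cols];
            for (int c = 0; c < cols; c++)
                cells[c] = data[r, c].ToString(CultureInfo.InvariantCulture);
            lines.Add(string.Join("\t", cells));
        }
        return lines;
    }

    // Writes every row to standard output.
    public static void Print(double[,] data)
    {
        foreach (string line in FormatRows(data))
            Console.WriteLine(line);
    }
}
```

With the snippet from the update, the call would be ArrayDump.Print(dataArray). Using GetLength for both dimensions avoids hard-coding the column count.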

Try this:
using System;
using HDF5DotNet;
namespace CSharpExample1
{
class Program
{
// Callback function used with H5G.iterate
static int myFunction(H5GroupId id, string objectName, Object param)
{
Console.WriteLine("The object name is {0}", objectName);
Console.WriteLine("The object parameter is {0}", param);
return 0;
}
static void Main(string[] args)
{
try
{
// We will write and read an int array of this length.
const int DATA_ARRAY_LENGTH = 12;
// Rank is the number of dimensions of the data array.
const int RANK = 1;
// Create an HDF5 file.
// The enumeration type H5F.CreateMode provides only the legal
// creation modes. Missing H5Fcreate parameters are provided
// with default values.
H5FileId fileId = H5F.create("myCSharp.h5",
H5F.CreateMode.ACC_TRUNC);
// Create a HDF5 group.
H5GroupId groupId = H5G.create(fileId, "/cSharpGroup", 0);
H5GroupId subGroup = H5G.create(groupId, "mySubGroup", 0);
// Demonstrate getObjectInfo
ObjectInfo info = H5G.getObjectInfo(fileId, "/cSharpGroup", true);
Console.WriteLine("cSharpGroup header size is {0}", info.headerSize);
Console.WriteLine("cSharpGroup nlinks is {0}", info.nHardLinks);
Console.WriteLine("cSharpGroup fileno is {0} {1}",
info.fileNumber[0], info.fileNumber[1]);
Console.WriteLine("cSharpGroup objno is {0} {1}",
info.objectNumber[0], info.objectNumber[1]);
Console.WriteLine("cSharpGroup type is {0}", info.objectType);
H5G.close(subGroup);
// Prepare to create a data space for writing a 1-dimensional
// signed integer array.
ulong[] dims = new ulong[RANK];
dims[0] = DATA_ARRAY_LENGTH;
// Put descending ramp data in an array so that we can
// write it to the file.
int[] dset_data = new int[DATA_ARRAY_LENGTH];
for (int i = 0; i < DATA_ARRAY_LENGTH; i++)
dset_data[i] = DATA_ARRAY_LENGTH - i;
// Create a data space to accommodate our 1-dimensional array.
// The resulting H5DataSpaceId will be used to create the
// data set.
H5DataSpaceId spaceId = H5S.create_simple(RANK, dims);
// Create a copy of a standard data type. We will use the
// resulting H5DataTypeId to create the data set. We could
// have used the H5T.H5Type data directly in the call to
// H5D.create, but this demonstrates the use of H5T.copy
// and the use of a H5DataTypeId in H5D.create.
H5DataTypeId typeId = H5T.copy(H5T.H5Type.NATIVE_INT);
// Find the size of the type
uint typeSize = H5T.getSize(typeId);
Console.WriteLine("typeSize is {0}", typeSize);
// Set the order to big endian
H5T.setOrder(typeId, H5T.Order.BE);
// Set the order to little endian
H5T.setOrder(typeId, H5T.Order.LE);
// Create the data set.
H5DataSetId dataSetId = H5D.create(fileId, "/csharpExample",
typeId, spaceId);
// Write the integer data to the data set.
H5D.write(dataSetId, new H5DataTypeId(H5T.H5Type.NATIVE_INT),
new H5Array<int>(dset_data));
// If we were writing a single value it might look like this.
// int singleValue = 100;
// H5D.writeScalar(dataSetId, new H5DataTypeId(H5T.H5Type.NATIVE_INT),
// ref singleValue);
// Create an integer array to receive the read data.
int[] readDataBack = new int[DATA_ARRAY_LENGTH];
// Read the integer data back from the data set
H5D.read(dataSetId, new H5DataTypeId(H5T.H5Type.NATIVE_INT),
new H5Array<int>(readDataBack));
// Echo the data
for(int i=0;i<DATA_ARRAY_LENGTH;i++)
{
Console.WriteLine(readDataBack[i]);
}
// Close all the open resources.
H5D.close(dataSetId);
// Reopen and close the data sets to show that we can.
dataSetId = H5D.open(fileId, "/csharpExample");
H5D.close(dataSetId);
dataSetId = H5D.open(groupId, "/csharpExample");
H5D.close(dataSetId);
H5S.close(spaceId);
H5T.close(typeId);
H5G.close(groupId);
//int x = 10;
//H5T.enumInsert<int>(typeId, "myString", ref x);
//H5G.close(groupId);
H5GIterateDelegate myDelegate;
myDelegate = myFunction;
int x = 9;
int index = H5G.iterate(fileId, "/cSharpGroup",
myDelegate, x, 0);
// Reopen the group id to show that we can.
groupId = H5G.open(fileId, "/cSharpGroup");
H5G.close(groupId);
H5F.close(fileId);
// Reopen and reclose the file.
H5FileId openId = H5F.open("myCSharp.h5",
H5F.OpenMode.ACC_RDONLY);
H5F.close(openId);
}
// This catches all the HDF exception classes. Because each call
// generates a unique exception, different exceptions can be handled
// separately. For example, to catch open errors we could have used
// catch (H5FopenException openException).
catch (HDFException e)
{
Console.WriteLine(e.Message);
}
Console.WriteLine("Processing complete!");
Console.ReadLine();
}
}
}

So, your start was awesome. I've created some extensions which should help you out. Using these in your code, you should be able to do things in a way that makes more sense in an object-oriented language, such as (in your case):
H5.Open();
var h5FileId = H5F.open("example.h5", H5F.OpenMode.ACC_RDONLY);
double[,] dataArray = h5FileId.Read2DArray<double>("/Timings/aaPCBTimes");
// or more generically...
T[,] dataArray = h5FileId.Read2DArray<T>("/Timings/aaPCBTimes");
Here are the incomplete extensions, I'll look into adding them into the HDF5Net...
using System;
using System.Collections.Generic;
using System.Linq;
using HDF5DotNet;
public static class HdfExtensions
{
// thank you http://stackoverflow.com/questions/4133377/splitting-a-string-number-every-nth-character-number
public static IEnumerable<String> SplitInParts(this String s, Int32 partLength)
{
if (s == null)
throw new ArgumentNullException("s");
if (partLength <= 0)
throw new ArgumentException("Part length has to be positive.", "partLength");
for (var i = 0; i < s.Length; i += partLength)
yield return s.Substring(i, Math.Min(partLength, s.Length - i));
}
public static T[] Read1DArray<T>(this H5FileId fileId, string dataSetName)
{
var dataset = H5D.open(fileId, dataSetName);
var space = H5D.getSpace(dataset);
var dims = H5S.getSimpleExtentDims(space);
var dataType = H5D.getType(dataset);
if (typeof(T) == typeof(string))
{
int stringLength = H5T.getSize(dataType);
byte[] buffer = new byte[dims[0] * stringLength];
H5D.read(dataset, dataType, new H5Array<byte>(buffer));
string stuff = System.Text.ASCIIEncoding.ASCII.GetString(buffer);
return stuff.SplitInParts(stringLength).Select(ss => (T)(object)ss).ToArray();
}
T[] dataArray = new T[dims[0]];
var wrapArray = new H5Array<T>(dataArray);
H5D.read(dataset, dataType, wrapArray);
return dataArray;
}
public static T[,] Read2DArray<T>(this H5FileId fileId, string dataSetName)
{
var dataset = H5D.open(fileId, dataSetName);
var space = H5D.getSpace(dataset);
var dims = H5S.getSimpleExtentDims(space);
var dataType = H5D.getType(dataset);
if (typeof(T) == typeof(string))
{
// this will also need a string hack...
}
T[,] dataArray = new T[dims[0], dims[1]];
var wrapArray = new H5Array<T>(dataArray);
H5D.read(dataset, dataType, wrapArray);
return dataArray;
}
}

Here is a working sample:
using System.Collections.Generic;
using System;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using HDF5DotNet;
namespace HDF5Test
{
public class HDFTester
{
static int myFunction(H5GroupId id, string objectName, Object param)
{
Console.WriteLine("The object name is {0}", objectName);
Console.WriteLine("The object parameter is {0}", param);
return 0;
}
public static void runTest()
{
try
{
// We will write and read an int array of this length.
const int DATA_ARRAY_LENGTH = 12;
// Rank is the number of dimensions of the data array.
const int RANK = 1;
// Create an HDF5 file.
// The enumeration type H5F.CreateMode provides only the legal
// creation modes. Missing H5Fcreate parameters are provided
// with default values.
H5FileId fileId = H5F.create("myCSharp.h5",
H5F.CreateMode.ACC_TRUNC);
// Create a HDF5 group.
H5GroupId groupId = H5G.create(fileId, "/cSharpGroup");
H5GroupId subGroup = H5G.create(groupId, "mySubGroup");
// Demonstrate getObjectInfo
ObjectInfo info = H5G.getObjectInfo(fileId, "/cSharpGroup", true);
Console.WriteLine("cSharpGroup header size is {0}", info.headerSize);
Console.WriteLine("cSharpGroup nlinks is {0}", info.nHardLinks);
Console.WriteLine("cSharpGroup fileno is {0} {1}",
info.fileNumber[0], info.fileNumber[1]);
Console.WriteLine("cSharpGroup objno is {0} {1}",
info.objectNumber[0], info.objectNumber[1]);
Console.WriteLine("cSharpGroup type is {0}", info.objectType);
H5G.close(subGroup);
// Prepare to create a data space for writing a 1-dimensional
// signed integer array.
long[] dims = new long[RANK];
dims[0] = DATA_ARRAY_LENGTH;
// Put descending ramp data in an array so that we can
// write it to the file.
int[] dset_data = new int[DATA_ARRAY_LENGTH];
for (int i = 0; i < DATA_ARRAY_LENGTH; i++)
dset_data[i] = DATA_ARRAY_LENGTH - i;
// Create a data space to accommodate our 1-dimensional array.
// The resulting H5DataSpaceId will be used to create the
// data set.
H5DataSpaceId spaceId = H5S.create_simple(RANK, dims);
// Create a copy of a standard data type. We will use the
// resulting H5DataTypeId to create the data set. We could
// have used the H5T.H5Type data directly in the call to
// H5D.create, but this demonstrates the use of H5T.copy
// and the use of a H5DataTypeId in H5D.create.
H5DataTypeId typeId = H5T.copy(H5T.H5Type.NATIVE_INT);
// Find the size of the type
int typeSize = H5T.getSize(typeId);
Console.WriteLine("typeSize is {0}", typeSize);
// Set the order to big endian
H5T.setOrder(typeId, H5T.Order.BE);
// Set the order to little endian
H5T.setOrder(typeId, H5T.Order.LE);
// Create the data set.
H5DataSetId dataSetId = H5D.create(fileId, "/csharpExample",
typeId, spaceId);
// Write the integer data to the data set.
H5D.write(dataSetId, new H5DataTypeId(H5T.H5Type.NATIVE_INT),
new H5Array<int>(dset_data));
// If we were writing a single value it might look like this.
// int singleValue = 100;
// H5D.writeScalar(dataSetId, new H5DataTypeId(H5T.H5Type.NATIVE_INT),
// ref singleValue);
// Create an integer array to receive the read data.
int[] readDataBack = new int[DATA_ARRAY_LENGTH];
// Read the integer data back from the data set
H5D.read(dataSetId, new H5DataTypeId(H5T.H5Type.NATIVE_INT),
new H5Array<int>(readDataBack));
// Echo the data
for (int i = 0; i < DATA_ARRAY_LENGTH; i++)
{
Console.WriteLine(readDataBack[i]);
}
// Close all the open resources.
H5D.close(dataSetId);
// Reopen and close the data sets to show that we can.
dataSetId = H5D.open(fileId, "/csharpExample");
H5D.close(dataSetId);
dataSetId = H5D.open(groupId, "/csharpExample");
H5D.close(dataSetId);
H5S.close(spaceId);
H5T.close(typeId);
H5G.close(groupId);
//int x = 10;
//H5T.enumInsert<int>(typeId, "myString", ref x);
//H5G.close(groupId);
H5GIterateCallback myDelegate;
myDelegate = myFunction;
int x = 9;
int start = 0;
int index = H5G.iterate(fileId, "/cSharpGroup",myDelegate, x, ref start);
// Reopen the group id to show that we can.
groupId = H5G.open(fileId, "/cSharpGroup");
H5G.close(groupId);
H5F.close(fileId);
// Reopen and reclose the file.
H5FileId openId = H5F.open("myCSharp.h5",
H5F.OpenMode.ACC_RDONLY);
H5F.close(openId);
}
// This catches all the HDF exception classes. Because each call
// generates a unique exception, different exceptions can be handled
// separately. For example, to catch open errors we could have used
// catch (H5FopenException openException).
catch (HDFException e)
{
Console.WriteLine(e.Message);
}
Console.WriteLine("Processing complete!");
Console.ReadLine();
}
}
}

I know this is old, but for anyone who still needs to work with HDF5 files: I have a C# wrapper on GitHub that encapsulates most of the operations (based on original work by another person).
There are many examples in the unit test project.

Related

How can I access multi-element List data stored in a public class?

My first question on SO:
I created this public class, so that I can store three elements in a list:
public class myMultiElementList
{
public string Role {get;set;}
public string Country {get;set;}
public int Commonality {get;set;}
}
In my main class, I then created a new list using this process:
var EmployeeRolesCountry = new List<myMultiElementList>();
var rc1 = new myMultiElementList();
rc1.Role = token.Trim();
rc1.Country = country.Trim();
rc1.Commonality = 1;
EmployeeRolesCountry.Add(rc1);
I've added data to EmployeeRolesCountry and have validated that it has 472 entries. However, when I try to retrieve them as below, my foreach loop only retrieves the final entry added to the list, 472 times...
foreach (myMultiElementList tmpClass in EmployeeRolesCountry)
{
string d1Value = tmpClass.Role;
Console.WriteLine(d1Value);
string d2Value = tmpClass.Country;
Console.WriteLine(d2Value);
int d3Value = tmpClass.Commonality;
Console.WriteLine(d3Value);
}
This was the most promising of the potential solutions I found on here, so any pointers greatly appreciated.
EDIT: adding data to EmployeeRolesCountry
/*
Before this starts, data is taken in via a csvReader function and parsed
All of the process below is concerned with two fields in the csv
One is simply the Country. No processing necessary
The other is bio, and this itself needs to be parsed and cleansed several times to take roles out
To keep things making sense, I've taken much of the cleansing out
*/
private void File_upload_Click(object sender, EventArgs e)
{
int pos = 0;
var EmployeeRolesCountry = new List<myMultiElementList>();
var rc1 = new myMultiElementList();
int a = 0;
delimiter = ".";
string token;
foreach (var line in records.Take(100))
{
var fields = line.ToList();
string bio = fields[5];
string country = fields[4];
int role_count = Regex.Matches(bio, delimiter).Count;
a = bio.Length;
for (var i = 0; i < role_count; i++)
{
//here I take first role, by parsing on delimiter, then push back EmployeeRolesCountry with result
pos = bio.IndexOf('.');
if (pos != -1)
{
token = bio.Substring(0, pos);
string original_token = token;
rc1.Role = token.Trim();
rc1.Country = country.Trim();
rc1.Commonality = 1;
EmployeeRolesCountry.Add(rc1);
a = original_token.Length;
bio = bio.Remove(0, a + 1);
}
}
}
}
EDIT:
When grouped by multiple properties, this is how we iterate through the grouped items:
var employeesGroupedByRoleAndCountry = EmployeeRolesCountry.GroupBy(x => new { x.Role, x.Country });
employeesGroupedByRoleAndCountry.ToList().ForEach
(
(countryAndRole) =>
{
Console.WriteLine("Group {0}/{1}", countryAndRole.Key.Country, countryAndRole.Key.Role);
countryAndRole.ToList().ForEach
(
(multiElement) => Console.WriteLine(" : {0}", multiElement.Commonality)
);
}
);
__ ORIGINAL POST __
You are instantiating rc1 only once (outside the loop) and adding the same instance to the list every time.
Please make sure that you do
var rc1 = new myMultiElementList();
inside the loop where you are adding the elements, and not outside.
All references are the same in your case:
var obj = new myObj();
for (int i = 0; i < 5; i++)
{
obj.Prop1 = "Prop" + i;
list.Add(obj);
}
Now the list has 5 elements, all pointing to obj (the same instance, the same object in memory), and when you do
obj.Prop1 = "Prop" + 5
you update that one object, and every reference in the list points to it. So you are not getting 472 copies of the last item; you are getting the same instance 472 times.
The solution is simple. Create a new instance every time you add to your list:
for (int i = 0; i < 5; i++)
{
var obj = new myObj();
obj.Prop1 = "Prop" + i;
list.Add(obj);
}
Hope this helps.
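To see the difference directly, here is a small self-contained sketch that builds one list each way. Item and ReferenceDemo are hypothetical stand-ins for your own class and are not taken from the question.

```csharp
using System;
using System.Collections.Generic;

public class Item
{
    public string Prop1 { get; set; }
}

public static class ReferenceDemo
{
    // Adds the SAME instance five times; every slot shows the last value written.
    public static List<Item> SharedInstance()
    {
        var list = new List<Item>();
        var obj = new Item();
        for (int i = 0; i < 5; i++)
        {
            obj.Prop1 = "Prop" + i;
            list.Add(obj);
        }
        return list;
    }

    // Creates a new instance on every iteration; each slot keeps its own value.
    public static List<Item> FreshInstances()
    {
        var list = new List<Item>();
        for (int i = 0; i < 5; i++)
        {
            var obj = new Item { Prop1 = "Prop" + i };
            list.Add(obj);
        }
        return list;
    }
}
```

In the first list, object.ReferenceEquals(list[0], list[4]) is true and every Prop1 reads "Prop4"; in the second list the elements are distinct objects holding "Prop0" through "Prop4".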

How to make a compound datatype with HDF5DOTNET?

I have problems when I write a struct with arrays in it into an HDF5 dataset. Firstly, the Windows Form doesn't start when the code contains this line:
H5T.insert(typeStruct, "string", 0, H5T.create_array(new H5DataTypeId(H5T.H5Type.C_S1), dims2));
The form at least starts without the line, so I think there's something wrong with how I define the compound datatype. I've looked into manuals and many examples, but I still can't fix the problem. Could I get an example of using compound datatypes to write a struct with multiple arrays in C#?
using HDF5DotNet;
using System.Globalization;
using System.IO;
using System.Runtime.InteropServices;
using System.Reflection;
namespace WindowsFormsApplication1
{
public unsafe partial class Form1 : Form
{
public unsafe struct struct_TR
{
public string[] arr_currentLong;
public struct_TR(byte size_currentTime)
{
arr_currentLong = new string[size_currentTime];
}
}
public Form1()
{
InitializeComponent();
long ARRAY_SIZE = 255;
struct_TR structMade = new struct_TR(255);
for (int i = 0; i < 255; i++)
{
structMade.arr_currentLong[i] = i.ToString();
}
string currentPath = Path.GetDirectoryName(Application.ExecutablePath);
Directory.SetCurrentDirectory(currentPath);
H5FileId fileId = H5F.create(@"weights.h5", H5F.CreateMode.ACC_TRUNC);
long[] dims1 = { 1 };
long[] dims2 = { 1, ARRAY_SIZE };
H5DataSpaceId myDataSpace = H5S.create_simple(1, dims1);
H5DataTypeId string_type = H5T.copy(H5T.H5Type.C_S1);
H5DataTypeId array_tid1 = H5T.create_array(string_type, dims2);
H5DataTypeId typeStruct = H5T.create(H5T.CreateClass.COMPOUND, Marshal.SizeOf(typeof(struct_TR)));
H5T.insert(typeStruct, "string", 0, H5T.create_array(new H5DataTypeId(H5T.H5Type.C_S1), dims2));
H5DataSetId myDataSet = H5D.create(fileId, "/dset", typeStruct, myDataSpace);
H5D.writeScalar<struct_TR>(myDataSet, typeStruct, ref structMade);
}
}
}
The only way I know to save structs with arrays is to give the array a constant length.
So for example this is a struct with an array of length 4.
[StructLayout(LayoutKind.Sequential)]
public struct Responses
{
public Int64 MCID;
public int PanelIdx;
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
public short[] ResponseValues;
}
Here an array of 4 structs containing an array is created:
responseList = new Responses[4] {
new Responses() { MCID=1,PanelIdx=5,ResponseValues=new short[4]{ 1,2,3,4} },
new Responses() { MCID=2,PanelIdx=6,ResponseValues=new short[4]{ 5,6,7,8}},
new Responses() { MCID=3,PanelIdx=7,ResponseValues=new short[4]{ 1,2,3,4}},
new Responses() { MCID=4,PanelIdx=8,ResponseValues=new short[4]{ 5,6,7,8}}
};
The following lines of code write an array of structs to a HDF5 file:
string filename = "testArrayCompounds.H5";
var fileId =H5F.create(filename, H5F.ACC_TRUNC);
var status = WriteCompounds(fileId, "/test", responseList);
H5F.close(fileId);
The WriteCompounds method looks like this:
public static int WriteCompounds<T>(hid_t groupId, string name, IEnumerable<T> list) //where T : struct
{
Type type = typeof(T);
var size = Marshal.SizeOf(type);
var cnt = list.Count();
var typeId = CreateType(type);
var log10 = (int)Math.Log10(cnt);
ulong pow = (ulong)Math.Pow(10, log10);
ulong c_s = Math.Min(1000, pow);
ulong[] chunk_size = new ulong[] { c_s };
ulong[] dims = new ulong[] { (ulong)cnt };
long dcpl = 0;
if (!list.Any() || log10 == 0) { }
else
{
dcpl = CreateProperty(chunk_size);
}
// Create dataspace. Setting maximum size to NULL sets the maximum
// size to be the current size.
var spaceId = H5S.create_simple(dims.Length, dims, null);
// Create the dataset and write the compound data to it.
var datasetId = H5D.create(groupId, name, typeId, spaceId, H5P.DEFAULT, dcpl);
IntPtr p = Marshal.AllocHGlobal(size * (int)dims[0]);
var ms = new MemoryStream();
BinaryWriter writer = new BinaryWriter(ms);
foreach (var strct in list)
writer.Write(getBytes(strct));
var bytes = ms.ToArray();
GCHandle hnd = GCHandle.Alloc(bytes, GCHandleType.Pinned);
var statusId = H5D.write(datasetId, typeId, spaceId, H5S.ALL,
H5P.DEFAULT, hnd.AddrOfPinnedObject());
hnd.Free();
/*
* Close and release resources.
*/
H5D.close(datasetId);
H5S.close(spaceId);
H5T.close(typeId);
H5P.close(dcpl);
Marshal.FreeHGlobal(p);
return statusId;
}
Three additional helper functions are needed; two are shown here:
private static long CreateType(Type t)
{
var size = Marshal.SizeOf(t);
var float_size = Marshal.SizeOf(typeof(float));
var int_size = Marshal.SizeOf(typeof(int));
var typeId = H5T.create(H5T.class_t.COMPOUND, new IntPtr(size));
var compoundInfo = Hdf5.GetCompoundInfo(t);
foreach (var cmp in compoundInfo)
{
H5T.insert(typeId, cmp.name, Marshal.OffsetOf(t, cmp.name), cmp.datatype);
}
return typeId;
}
private static long CreateProperty(ulong[] chunk_size)
{
var dcpl = H5P.create(H5P.DATASET_CREATE);
H5P.set_layout(dcpl, H5D.layout_t.CHUNKED);
H5P.set_chunk(dcpl, chunk_size.Length, chunk_size);
H5P.set_deflate(dcpl, 6);
return dcpl;
}
I also have a ReadCompounds method to read the HDF5 file. It and the Hdf5.GetCompoundInfo method used in CreateType are quite long, so I won't show them here.
So that's quite a lot of code just for writing some structs.
I have made a library called HDF5DotnetTools that allows you to read and write classes and structs much more easily. There you can also find the ReadCompounds and GetCompoundInfo methods.
The unit tests of HDF5DotnetTools also contain examples of how to write classes with arrays.
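The getBytes helper called in WriteCompounds is not shown in this answer. As an illustration only, here is a minimal sketch of what such a helper could look like using the standard Marshal API; StructBytes.GetBytes is a hypothetical name, and the library's real implementation may differ. The Responses struct is repeated so the snippet stands on its own.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Responses
{
    public long MCID;
    public int PanelIdx;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
    public short[] ResponseValues;
}

public static class StructBytes
{
    // Hypothetical stand-in for the getBytes helper that the answer does not show.
    // Copies the marshaled (unmanaged) layout of a struct into a byte array.
    public static byte[] GetBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] bytes = new byte[size];          // zero-filled, so padding bytes stay 0
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            // Marshal the struct straight into the pinned buffer.
            Marshal.StructureToPtr(value, handle.AddrOfPinnedObject(), false);
        }
        finally
        {
            handle.Free();
        }
        return bytes;
    }
}
```

The fixed SizeConst is what makes this work: the marshaler needs to know the array length up front to compute the struct size, which is why the array inside the struct has to have a constant length.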

Counting/sorting characters in a text file

I am trying to write a program that reads a text file, sorts it by character, and keeps track of how many times each character appears in the document. This is what I have so far.
class Program
{
static void Main(string[] args)
{
CharFrequency[] Charfreq = new CharFrequency[128];
try
{
string line;
System.IO.StreamReader file = new System.IO.StreamReader(@"C:\Users\User\Documents\Visual Studio 2013\Projects\Array_Project\wap.txt");
while ((line = file.ReadLine()) != null)
{
int ch = file.Read();
if (Charfreq.Contains(ch))
{
}
}
file.Close();
Console.ReadLine();
}
catch (Exception e)
{
Console.WriteLine("The process failed: {0}", e.ToString());
}
}
}
My question is, what should go in the if statement here?
I also have a Charfrequency class, which I'll include here in case it is helpful/necessary that I include it (and yes, it is necessary that I use an array versus a list or arraylist).
public class CharFrequency
{
private char m_character;
private long m_count;
public CharFrequency(char ch)
{
Character = ch;
Count = 0;
}
public CharFrequency(char ch, long charCount)
{
Character = ch;
Count = charCount;
}
public char Character
{
set
{
m_character = value;
}
get
{
return m_character;
}
}
public long Count
{
get
{
return m_count;
}
set
{
if (value < 0)
value = 0;
m_count = value;
}
}
public void Increment()
{
m_count++;
}
public override bool Equals(object obj)
{
bool equal = false;
CharFrequency cf = new CharFrequency('\0', 0);
cf = (CharFrequency)obj;
if (this.Character == cf.Character)
equal = true;
return equal;
}
public override int GetHashCode()
{
return m_character.GetHashCode();
}
public override string ToString()
{
String s = String.Format("'{0}' ({1}) = {2}", m_character, (byte)m_character, m_count);
return s;
}
}
Have a look at this post.
https://codereview.stackexchange.com/questions/63872/counting-the-number-of-character-occurrences
It uses LINQ to achieve your goal
You shouldn't use Contains.
First you need to initialize your Charfreq array:
CharFrequency[] Charfreq = new CharFrequency[128];
for (int i = 0; i < Charfreq.Length; i++)
{
Charfreq[i] = new CharFrequency((char)i);
}
try
Then you can read the file one character at a time:
int ch;
// -1 means that there are no more characters to read,
// otherwise ch is the char read
while ((ch = file.Read()) != -1)
{
CharFrequency cf = new CharFrequency((char)ch);
// This works because CharFrequency overrides the
// Equals method, and the Equals method compares only
// the Character property of CharFrequency
int ix = Array.IndexOf(Charfreq, cf);
// if there is the "right" charfrequency
if (ix != -1)
{
Charfreq[ix].Increment();
}
}
Note that this isn't the way I would write the program. These are the minimum changes needed to make your program work.
As a side note, this program only counts the frequency of ASCII characters (characters with code <= 127).
Also, in your Equals method,
CharFrequency cf = new CharFrequency('\0', 0);
cf = (CharFrequency)obj;
contains a useless initialization.
CharFrequency cf = (CharFrequency)obj;
is enough; otherwise you are creating a CharFrequency just to discard it on the next line.
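Since the array has one slot per ASCII code, the character code itself can serve as the array index, which avoids searching the array for every character read. The sketch below shows that idea with a plain long[] instead of CharFrequency so that it is self-contained; AsciiCounter is a hypothetical name, and the same indexing would work with Charfreq[ch].Increment().

```csharp
using System;
using System.IO;

public static class AsciiCounter
{
    // Counts characters with codes 0..127; the array index is the character code.
    public static long[] Count(TextReader reader)
    {
        long[] counts = new long[128];
        int ch;
        // Read() returns -1 when there are no more characters.
        while ((ch = reader.Read()) != -1)
        {
            if (ch < counts.Length)   // ignore anything outside the ASCII range
                counts[ch]++;
        }
        return counts;
    }
}
```

A StreamReader is a TextReader, so the file from the question can be passed in directly, and counts['a'] then holds the number of times 'a' occurred. The array is already ordered by character code, so no separate sorting step is needed.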
A dictionary is well suited for a task like this. You didn't say which character set and encoding the file was in. So, because Unicode is so common, let's assume the Unicode character set and UTF-8 encoding. (After all, it is the default for .NET, Java, JavaScript, HTML, XML,….) If that's not the case then read the file using the applicable encoding and fix your code because you currently are using UTF-8 in your StreamReader.
Next comes iterating across the "characters". And then incrementing the count for a "character" in the dictionary as it is seen in the text.
Unicode does have a few complex features. One is combining characters, where a base character can be overlaid with diacritics etc. Users view such combinations as one "character", or, as Unicode calls them, graphemes. Thankfully, .NET gives us the StringInfo class, which iterates over them as "text elements."
So, if you think about it, using an array would be quite difficult. You'd have to build your own dictionary on top of your array.
The example below uses a Dictionary and is runnable using a LINQPad script. After it creates the dictionary, it orders and dumps it with a nice display.
var path = Path.GetTempFileName();
// Get some text we know is encoded in UTF-8 to simplify the code below
// and contains combining codepoints as a matter of example.
using (var web = new WebClient())
{
web.DownloadFile("http://superuser.com/questions/52671/which-unicode-characters-do-smilies-like-%D9%A9-%CC%AE%CC%AE%CC%83-%CC%83%DB%B6-consist-of", path);
}
// since the question asks to analyze a file
var content = File.ReadAllText(path, Encoding.UTF8);
var frequency = new Dictionary<String, int>();
var itor = System.Globalization.StringInfo.GetTextElementEnumerator(content);
while (itor.MoveNext())
{
var element = (String)itor.Current;
if (!frequency.ContainsKey(element))
{
frequency.Add(element, 0);
}
frequency[element]++;
}
var histogram = frequency
.OrderByDescending(f => f.Value)
// jazz it up with the list of codepoints in each text element
.Select(pair =>
{
var bytes = Encoding.UTF32.GetBytes(pair.Key);
var codepoints = new UInt32[bytes.Length/4];
Buffer.BlockCopy(bytes, 0, codepoints, 0, bytes.Length);
return new {
Count = pair.Value,
textElement = pair.Key,
codepoints = codepoints.Select(cp => String.Format("U+{0:X4}", cp) ) };
});
histogram.Dump(); // For use in LINQPad

how do i pass a single char value to a string variable after increment from a linQ record data

internal char SubMeasurement = 'a';
internal string GetLast;
private void CreateSub()
{
SFCDataContext SFC = new SFCDataContext();
try
{
var CheckRecordSub = SFC.Systems_SettingsMeasurements.Where(r => r.RelationData == txtNO.Text)
.Select(t => new { CODE = t.No });
int count = 0; int total = 0;
string[] row = new string[CheckRecordSub.Count()];
foreach (var r in CheckRecordSub)
{
row[count] = r.CODE;
GetLast = r.CODE;
count++;
total = count;
}
if (txtNO.Text == GetLast)
{
MessageBox.Show(SubMeasurement.ToString()); // <-- Msg Box doesn't work
}
else
{
SubMeasurement = Convert.ToChar(GetLast);
SubMeasurement++; // <-- Error
MessageBox.Show(SubMeasurement.ToString()); // <-- Msg Box doesn't work
}
}
catch (Exception) { }
}
I have record data. When a user picks the "Sub" option instead of "Header", the process should look up the last sub record, put it in a char, increment it, and then place it back into a string variable. In this code I just use a MessageBox to check that I get the last record and that it is incremented. If it is a Header, it should just take the default value of SubMeasurement, 'a', as a start. But it is not working that way. Please help.
If the increment on the char is what fails for you, go through the character's integer code instead:
int asciiCode = ((int)SubMeasurement) + 1;
SubMeasurement = (char)asciiCode;
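One more thing worth checking, stated here as an assumption because the question does not show the stored values: Convert.ToChar(string) throws a FormatException unless the string is exactly one character long. If GetLast holds a longer code, the conversion fails before the increment is reached, and the empty catch (Exception) { } hides the error, which would explain why no message box appears. A minimal sketch that takes the last character of the string instead; SubCode and Next are hypothetical names:

```csharp
using System;

public static class SubCode
{
    // Returns the character that follows the last character of 'last',
    // or 'fallback' when 'last' is null or empty.
    public static char Next(string last, char fallback)
    {
        if (string.IsNullOrEmpty(last))
            return fallback;
        char c = last[last.Length - 1];
        // char + int yields an int, so cast the sum back to char.
        return (char)(c + 1);
    }
}
```

In the question's code this would be called as SubMeasurement = SubCode.Next(GetLast, 'a'). Logging the exception in the catch block instead of swallowing it would also show which line actually fails.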

Excel ExcelDNA C# / Try to copy Bloomberg BDH() behavior (writing Array after a web request)

I want to copy the behavior of Bloomberg's BDH() function.
BDH makes a web request and writes an array to the worksheet (but does not return it as an array result). During the web request, the function returns "#N/A Requesting".
When the web request has finished, BDH() writes the array result into the worksheet.
In ExcelDNA, I have succeeded in writing to the worksheet from a thread.
If you use the code below in a .dna file, the result of
=WriteArray(2;2)
will be
Line 1 > #N/A Requesting Data (0,1)
Line 2 > (1,0) (1,1)
The remaining issue is to replace #N/A Requesting Data with the value and to copy the formula.
When you uncomment //xlActiveCellType.InvokeMember("FormulaR1C1Local", you get close to the result, but the behavior is still not right.
File .dna
<DnaLibrary Language="CS" RuntimeVersion="v4.0">
<![CDATA[
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Threading;
using ExcelDna.Integration;
public static class WriteForXL
{
public static object[,] MakeArray(int rows, int columns)
{
if (rows == 0 && columns == 0)
{
rows = 1;
columns = 1;
}
object[,] result = new string[rows, columns];
for (int i = 0; i < rows; i++)
{
for (int j = 0; j < columns; j++)
{
result[i, j] = string.Format("({0},{1})", i, j);
}
}
return result;
}
public static object WriteArray(int rows, int columns)
{
if (ExcelDnaUtil.IsInFunctionWizard())
return "Waiting for click on wizard ok button to calculate.";
object[,] result = MakeArray(rows, columns);
var xlApp = ExcelDnaUtil.Application;
Type xlAppType = xlApp.GetType();
object caller = xlAppType.InvokeMember("ActiveCell", BindingFlags.GetProperty, null, xlApp, null);
object formula = xlAppType.InvokeMember("FormulaR1C1Local", BindingFlags.GetProperty, null, caller, null);
ObjectForThread q = new ObjectForThread() { xlRef = caller, value = result, FormulaR1C1Local = formula };
Thread t = new Thread(WriteFromThread);
t.Start(q);
return "#N/A Requesting Data";
}
private static void WriteFromThread(Object o)
{
ObjectForThread q = (ObjectForThread) o;
Type xlActiveCellType = q.xlRef.GetType();
try
{
for (int i = 0; i < q.value.GetLength(0); i++)
{
for (int j = 0; j < q.value.GetLength(1); j++)
{
if (i == 0 && j == 0)
continue;
Object cellBelow = xlActiveCellType.InvokeMember("Offset", BindingFlags.GetProperty, null, q.xlRef, new object[] { i, j });
xlActiveCellType.InvokeMember("Value", BindingFlags.SetProperty, null, cellBelow, new[] { Type.Missing, q.value[i, j] });
}
}
}
catch(Exception e)
{
}
finally
{
//xlActiveCellType.InvokeMember("Value", BindingFlags.SetProperty, null, q.xlRef, new[] { Type.Missing, q.value[0, 0] });
//xlActiveCellType.InvokeMember("FormulaR1C1Local", BindingFlags.SetProperty, null, q.xlRef, new [] { q.FormulaR1C1Local });
}
}
public class ObjectForThread
{
public object xlRef { get; set; }
public object[,] value { get; set; }
public object FormulaR1C1Local { get; set; }
}
}
]]>
</DnaLibrary>
To Govert:
BDH has become a standard in the finance industry. People do not know how to manipulate an array (or even how to use Ctrl+Shift+Enter).
BDH is the function that made Bloomberg so popular (to the disadvantage of Reuters).
However, I will consider using your method or RTD.
Thanks for all your work on Excel-DNA.
I presume you have tried the Excel-DNA ArrayResizer sample, which carefully avoids many of the issues you are running into. I'd like to understand what you see as the disadvantages of the array-formula-writing approach.
Now, about your function:
Firstly, you can't safely pass the 'caller' Range COM object to another thread. Instead, pass a string with the address and get the COM object on the other thread (by calling ExcelDnaUtil.Application on the worker thread). Most of the time you'll get lucky, though.
A better way is for the worker thread to get Excel to run a macro on the main thread by calling Application.Run. The Excel-DNA ArrayResizer sample shows how this can be done.
Secondly, you almost certainly don't want the ActiveCell, but rather Application.Caller. The ActiveCell might well have nothing to do with the cell where the formula is running from.
Next - Excel will recalculate your function every time you set the Formula again - hence putting you in an endless loop when you enable the Formula set in your finally clause. You cannot set both the Value and the Formula for a cell - if a cell has a Formula then Excel will use the formula to calculate the Value. If you set the Value, the Formula gets removed.
It's not clear what you want to actually leave in the [0,0] cell - IIRC Bloomberg modifies the formula there in a way that makes it remember how large a range was written to. You could try to add some parameters to your function that tell your function whether to recalculate or whether to return an actual value as its result.
Finally, you might want to reconsider whether the Bloomberg BDH function is a good example for what you want to do. It breaks the dependency calculation of your sheet, which has implications both for performance and for maintaining consistency of the spreadsheet model.
My issue was:
writing a dynamic array
retrieving the data asynchronously via a web service
After discussing with Govert, I chose to return the result as an array rather than copy the Bloomberg functions (which write an array but return a single value).
Finally, to solve my issue, I used http://excel-dna.net/2011/01/30/resizing-excel-udf-result-arrays/
and reshaped the resize() function.
This code is not RTD.
The code below works in a .dna file
<DnaLibrary RuntimeVersion="v4.0" Language="C#">
<![CDATA[
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Threading;
using System.ComponentModel;
using ExcelDna.Integration;
public static class ResizeTest
{
public static object[,] MakeArray(int rows, int columns)
{
object[,] result = new string[rows, columns];
for (int i = 0; i < rows; i++)
{
for (int j = 0; j < columns; j++)
{
result[i,j] = string.Format("({0},{1})", i, j);
}
}
return result;
}
public static object MakeArrayAndResize()
{
// Call Resize via Excel - so if the Resize add-in is not part of this code, it should still work.
return XlCall.Excel(XlCall.xlUDF, "Resize", null);
}
}
public class Resizer
{
static Queue<ExcelReference> ResizeJobs = new Queue<ExcelReference>();
static Dictionary<string, object> JobIsDone = new Dictionary<string, object>();
// This function will run in the UDF context.
// Needs extra protection to allow multithreaded use.
public static object Resize(object args)
{
ExcelReference caller = XlCall.Excel(XlCall.xlfCaller) as ExcelReference;
if (caller == null)
return ExcelError.ExcelErrorNA;
if (!JobIsDone.ContainsKey(GetHashcode(caller)))
{
BackgroundWorker(caller);
return ExcelError.ExcelErrorNA;
}
else
{
// Size is already OK - just return result
object[,] array = (object[,])JobIsDone[GetHashcode(caller)];
JobIsDone.Remove(GetHashcode(caller));
return array;
}
}
/// <summary>
/// Simulate WebServiceRequest
/// </summary>
/// <param name="caller">The cell that called the UDF.</param>
static void BackgroundWorker(ExcelReference caller)
{
BackgroundWorker bw = new BackgroundWorker();
bw.DoWork += (sender, args) =>
{
Thread.Sleep(3000);
};
bw.RunWorkerCompleted += (sender, args) =>
{
// The request
Random r = new Random();
object[,] array = ResizeTest.MakeArray(r.Next(10), r.Next(10));
JobIsDone[GetHashcode(caller)] = array;
int rows = array.GetLength(0);
int columns = array.GetLength(1);
EnqueueResize(caller, rows, columns);
AsyncRunMacro("DoResizing");
};
bw.RunWorkerAsync();
}
static string GetHashcode(ExcelReference caller)
{
return caller.SheetId + ":L" + caller.RowFirst + "C" + caller.ColumnFirst;
}
static void EnqueueResize(ExcelReference caller, int rows, int columns)
{
ExcelReference target = new ExcelReference(caller.RowFirst, caller.RowFirst + rows - 1, caller.ColumnFirst, caller.ColumnFirst + columns - 1, caller.SheetId);
ResizeJobs.Enqueue(target);
}
public static void DoResizing()
{
while (ResizeJobs.Count > 0)
{
DoResize(ResizeJobs.Dequeue());
}
}
static void DoResize(ExcelReference target)
{
try
{
// Turn off screen updating; it is turned back on in the finally block
XlCall.Excel(XlCall.xlcEcho, false);
// Get the formula in the first cell of the target
string formula = (string)XlCall.Excel(XlCall.xlfGetCell, 41, target);
ExcelReference firstCell = new ExcelReference(target.RowFirst, target.RowFirst, target.ColumnFirst, target.ColumnFirst, target.SheetId);
bool isFormulaArray = (bool)XlCall.Excel(XlCall.xlfGetCell, 49, target);
if (isFormulaArray)
{
object oldSelectionOnActiveSheet = XlCall.Excel(XlCall.xlfSelection);
object oldActiveCell = XlCall.Excel(XlCall.xlfActiveCell);
// Remember old selection and select the first cell of the target
string firstCellSheet = (string)XlCall.Excel(XlCall.xlSheetNm, firstCell);
XlCall.Excel(XlCall.xlcWorkbookSelect, new object[] {firstCellSheet});
object oldSelectionOnArraySheet = XlCall.Excel(XlCall.xlfSelection);
XlCall.Excel(XlCall.xlcFormulaGoto, firstCell);
// Extend the selection to the whole array and clear
XlCall.Excel(XlCall.xlcSelectSpecial, 6);
ExcelReference oldArray = (ExcelReference)XlCall.Excel(XlCall.xlfSelection);
oldArray.SetValue(ExcelEmpty.Value);
XlCall.Excel(XlCall.xlcSelect, oldSelectionOnArraySheet);
XlCall.Excel(XlCall.xlcFormulaGoto, oldSelectionOnActiveSheet);
}
// Get the formula and convert to R1C1 mode
bool isR1C1Mode = (bool)XlCall.Excel(XlCall.xlfGetWorkspace, 4);
string formulaR1C1 = formula;
if (!isR1C1Mode)
{
// Convert the A1-style formula to R1C1 style
formulaR1C1 = (string)XlCall.Excel(XlCall.xlfFormulaConvert, formula, true, false, ExcelMissing.Value, firstCell);
}
// Set the formula into the whole target (must be R1C1-style references)
object ignoredResult;
XlCall.XlReturn retval = XlCall.TryExcel(XlCall.xlcFormulaArray, out ignoredResult, formulaR1C1, target);
if (retval != XlCall.XlReturn.XlReturnSuccess)
{
// TODO: Consider what to do now!?
// Might have failed due to array in the way.
firstCell.SetValue("'" + formula);
}
}
finally
{
XlCall.Excel(XlCall.xlcEcho, true);
}
}
// Most of this from the newsgroup: http://groups.google.com/group/exceldna/browse_thread/thread/a72c9b9f49523fc9/4577cd6840c7f195
private static readonly TimeSpan BackoffTime = TimeSpan.FromSeconds(1);
static void AsyncRunMacro(string macroName)
{
// Do this on a new thread....
Thread newThread = new Thread( delegate ()
{
while(true)
{
try
{
RunMacro(macroName);
break;
}
catch(COMException cex)
{
if(IsRetry(cex))
{
Thread.Sleep(BackoffTime);
continue;
}
// TODO: Handle unexpected error
return;
}
catch(Exception ex)
{
// TODO: Handle unexpected error
return;
}
}
});
newThread.Start();
}
static void RunMacro(string macroName)
{
object xlApp = null;
try
{
xlApp = ExcelDnaUtil.Application;
xlApp.GetType().InvokeMember("Run", BindingFlags.InvokeMethod, null, xlApp, new object[] {macroName});
}
catch (TargetInvocationException tie)
{
throw tie.InnerException;
}
finally
{
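// xlApp is still null if ExcelDnaUtil.Application threw; ReleaseComObject(null) would throw too.
if (xlApp != null) Marshal.ReleaseComObject(xlApp);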
}
}
const uint RPC_E_SERVERCALL_RETRYLATER = 0x8001010A;
const uint VBA_E_IGNORE = 0x800AC472;
static bool IsRetry(COMException e)
{
uint errorCode = (uint)e.ErrorCode;
switch(errorCode)
{
case RPC_E_SERVERCALL_RETRYLATER:
case VBA_E_IGNORE:
return true;
default:
return false;
}
}
}
]]>
</DnaLibrary>
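The comment on Resize says it needs extra protection for multithreaded use: JobIsDone is written from the BackgroundWorker completion handler and read from the UDF. Here is a minimal sketch of one way to add that protection, assuming .NET 4's ConcurrentDictionary is available. PendingResults and its method names are hypothetical and not part of Excel-DNA.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper (not part of Excel-DNA): a thread-safe stand-in for the
// JobIsDone dictionary, keyed by the same string that GetHashcode builds.
public static class PendingResults
{
    private static readonly ConcurrentDictionary<string, object[,]> Results =
        new ConcurrentDictionary<string, object[,]>();

    // Called from the worker thread once the data has arrived.
    public static void Store(string key, object[,] value)
    {
        Results[key] = value;
    }

    // Called from the UDF: fetches and removes the result in one atomic step.
    // Returns false when no result is waiting for this key.
    public static bool TryTake(string key, out object[,] value)
    {
        return Results.TryRemove(key, out value);
    }
}
```

With something like this, the ContainsKey / indexer / Remove sequence in Resize could collapse into a single TryTake call. The ResizeJobs queue is shared between threads in the same way and would need similar care.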
I think you need to implement the request as an RTD server. Normal user-defined functions will not update asynchronously.
Then you can hide the call to the RTD server behind a user-defined function, which can be done with Excel-DNA.
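A rough sketch of what that could look like, assuming a version of Excel-DNA that ships the ExcelRtdServer base class and the XlCall.RTD helper. The signatures are as I recall them, so check them against your installed version. The ProgId, class names, and the SlowData function are made up for illustration, and Thread.Sleep stands in for the real web service call.

```csharp
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Threading;
using ExcelDna.Integration;
using ExcelDna.Integration.Rtd;

// Hypothetical RTD server; the ProgId and class name are invented for this sketch.
[ComVisible(true)]
[ProgId("Sketch.SlowDataServer")]
public class SlowDataServer : ExcelRtdServer
{
    protected override object ConnectData(Topic topic, IList<string> topicInfo, ref bool newValues)
    {
        string key = topicInfo[0];
        // Stand-in for the web service request: wait on a pool thread, then push the value.
        ThreadPool.QueueUserWorkItem(delegate
        {
            Thread.Sleep(3000);
            topic.UpdateValue("Result for " + key);
        });
        // Shown in the cell until the first update arrives.
        return "#N/A Requesting Data";
    }
}

public static class SlowDataFunctions
{
    // The UDF that users type into a cell; it only forwards to the RTD server.
    public static object SlowData(string key)
    {
        return XlCall.RTD("Sketch.SlowDataServer", null, key);
    }
}
```

As far as I know, an RTD topic delivers one value to the calling cell, so this covers the asynchronous update but not writing a variable-sized array.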
So in the end you used an array formula, right? As you said, users are not familiar with array formulas and do not know Ctrl+Shift+Enter, so I think array formulas are a big problem for them.
I have the same issue and am trying to build a prototype for it. See
https://github.com/kchen0723/ExcelAsync.git
