Reading a binary file of unknown size in C#

I have just switched from C++ to C#. I have already done a task in C++ and now have to translate it to C#.
I am running into some problems.
I have to find the frequency of symbols in a binary file (it is taken as the sole argument, so I don't know its size/length). These frequencies will then be used to create a Huffman tree.
My C++ code to do that is below.
My structure is like this:
struct Node
{
unsigned int symbol;
int freq;
struct Node * next, * left, * right;
};
Node * tree;
And this is how I read the file:
FILE * fp;
fp = fopen(argv, "rb");
ch = fgetc(fp);
while (fread( & ch, sizeof(ch), 1, fp)) {
create_frequency(ch);
}
fclose(fp);
Could anyone please help me translate this to C# (especially the binary file read procedure that creates the symbol frequencies and stores them in a linked list)? Thanks for the help.
Edit: I tried to write the code according to what Henk Holterman explained below, but there are still errors:
error CS1501: No overload for method 'Open' takes '1' arguments
/usr/lib/mono/2.0/mscorlib.dll (Location of the symbol related to previous error)
shekhar_c#.cs(22,32): error CS0825: The contextual keyword 'var' may only appear within a local variable declaration
Compilation failed: 2 error(s), 0 warnings
And my code to do this is:
static void Main(string[] args)
{
// using provides exception-safe closing
using (var fp = System.IO.File.Open(args))
{
int b; // note: not a byte
while ((b = fp.Readbyte()) >= 0)
{
byte ch = (byte) b;
// now use the byte in 'ch'
//create_frequency(ch);
}
}
}
And the line corresponding to the two errors is:
using (var fp = System.IO.File.Open(args))
Could someone please help me? I am a beginner in C#.

string fileName = ...
using (var fp = System.IO.File.OpenRead(fileName)) // using provides exception-safe closing
{
int b; // note: not a byte
while ((b = fp.ReadByte()) >= 0)
{
byte ch = (byte) b;
// now use the byte in 'ch'
create_frequency(ch);
}
}
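For completeness, here is a minimal sketch of a whole program built around the loop above. The class name SymbolFrequency, the Count helper and the long[256] table are assumptions for illustration; the question's create_frequency and linked list are not reproduced. It also sidesteps the two compiler errors quoted in the question: args is a string[], so the path is args[0]; File.OpenRead takes just the path, whereas File.Open also needs a FileMode; and the variable is declared as FileStream rather than var, which older compilers reject.

```csharp
using System;
using System.IO;

static class SymbolFrequency
{
    // Counts how often each byte value (0-255) occurs in the stream.
    public static long[] Count(Stream input)
    {
        long[] freq = new long[256];
        int b; // int, not byte: ReadByte returns -1 at end of stream
        while ((b = input.ReadByte()) >= 0)
        {
            freq[b]++;
        }
        return freq;
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Expected exactly one argument: the file to read.");
            return;
        }

        // using closes the file even if an exception is thrown
        using (FileStream fp = File.OpenRead(args[0]))
        {
            long[] freq = Count(fp);
            for (int symbol = 0; symbol < freq.Length; symbol++)
            {
                if (freq[symbol] > 0)
                    Console.WriteLine("{0}: {1}", symbol, freq[symbol]);
            }
        }
    }
}
```

The file size never has to be known up front, because the loop simply runs until ReadByte reports the end of the stream. The resulting table can then be turned into Huffman tree nodes.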

Related

Marshalling C array in C# - Simple HelloWorld

Building off of my marshalling helloworld question, I'm running into issues marshalling an array allocated in C to C#. I've spent hours researching where I might be going wrong, but everything I've tried ends up with errors such as AccessViolationException.
The function that handles creating an array in C is below.
__declspec(dllexport) int __cdecl import_csv(char *path, struct human ***persons, int *numPersons)
{
int res;
FILE *csv;
char line[1024];
struct human **humans;
csv = fopen(path, "r");
if (csv == NULL) {
return errno;
}
*numPersons = 0; // init to sane value
/*
* All I'm trying to do for now is get more than one working.
* Starting with 2 seems reasonable. My test CSV file only has 2 lines.
*/
humans = calloc(2, sizeof(struct human *));
if (humans == NULL)
return ENOMEM;
while (fgets(line, 1024, csv)) {
char *tmp = strdup(line);
struct human *person;
humans[*numPersons] = calloc(1, sizeof(*person));
person = humans[*numPersons]; // easier to work with
if (person == NULL) {
return ENOMEM;
}
person->contact = calloc(1, sizeof(*(person->contact)));
if (person->contact == NULL) {
return ENOMEM;
}
res = parse_human(line, person);
if (res != 0) {
return res;
}
(*numPersons)++;
}
(*persons) = humans;
fclose(csv);
return 0;
}
The C# code:
IntPtr humansPtr = IntPtr.Zero;
int numHumans = 0;
HelloLibrary.import_csv(args[0], ref humansPtr, ref numHumans);
HelloLibrary.human[] humans = new HelloLibrary.human[numHumans];
IntPtr[] ptrs = new IntPtr[numHumans];
IntPtr aIndex = (IntPtr)Marshal.PtrToStructure(humansPtr, typeof(IntPtr));
// Populate the array of IntPtr
for (int i = 0; i < numHumans; i++)
{
ptrs[i] = new IntPtr(aIndex.ToInt64() +
(Marshal.SizeOf(typeof(IntPtr)) * i));
}
// Marshal the array of human structs
for (int i = 0; i < numHumans; i++)
{
humans[i] = (HelloLibrary.human)Marshal.PtrToStructure(
ptrs[i],
typeof(HelloLibrary.human));
}
// Use the marshalled data
foreach (HelloLibrary.human human in humans)
{
Console.WriteLine("first:'{0}'", human.first);
Console.WriteLine("last:'{0}'", human.last);
HelloLibrary.contact_info contact = (HelloLibrary.contact_info)Marshal.
PtrToStructure(human.contact, typeof(HelloLibrary.contact_info));
Console.WriteLine("cell:'{0}'", contact.cell);
Console.WriteLine("home:'{0}'", contact.home);
}
The first human struct gets marshalled fine. I get the access violation exceptions after the first one. I feel like I'm missing something with marshalling structs with struct pointers inside them. I hope I have some simple mistake I'm overlooking. Do you see anything wrong with this code?
See this GitHub gist for full source.
// Populate the array of IntPtr
This is where you went wrong. You are getting back a pointer to an array of pointers. You got the first one right by actually reading the pointer value from the array. But your for() loop then just adds 4 (or 8) to that first pointer value instead of reading the remaining pointers from the array. Fix:
IntPtr[] ptrs = new IntPtr[numHumans];
// Populate the array of IntPtr
for (int i = 0; i < numHumans; i++)
{
ptrs[i] = (IntPtr)Marshal.PtrToStructure(humansPtr, typeof(IntPtr));
humansPtr = new IntPtr(humansPtr.ToInt64() + IntPtr.Size);
}
Or much more cleanly since marshaling arrays of simple types is already supported:
IntPtr[] ptrs = new IntPtr[numHumans];
Marshal.Copy(humansPtr, ptrs, 0, numHumans);
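The Marshal.Copy version can be tried without any native library. The following is a minimal sketch that fakes the C side with Marshal.AllocHGlobal; the class name PointerArrayDemo, the helper ReadPointerArray and the two int payloads are made up for illustration and stand in for the array of struct human pointers.

```csharp
using System;
using System.Runtime.InteropServices;

static class PointerArrayDemo
{
    // Reads 'count' pointers from the native array that starts at 'arrayPtr'.
    public static IntPtr[] ReadPointerArray(IntPtr arrayPtr, int count)
    {
        IntPtr[] ptrs = new IntPtr[count];
        Marshal.Copy(arrayPtr, ptrs, 0, count);
        return ptrs;
    }

    static void Main()
    {
        // Simulate what the C function hands back: one block holding two
        // pointers, each pointing at a separately allocated value.
        IntPtr first = Marshal.AllocHGlobal(sizeof(int));
        IntPtr second = Marshal.AllocHGlobal(sizeof(int));
        IntPtr array = Marshal.AllocHGlobal(2 * IntPtr.Size);
        try
        {
            Marshal.WriteInt32(first, 111);
            Marshal.WriteInt32(second, 222);
            Marshal.WriteIntPtr(array, 0, first);
            Marshal.WriteIntPtr(array, IntPtr.Size, second);

            // Each element is a pointer stored in the array, not an offset
            // computed from the first element.
            foreach (IntPtr p in ReadPointerArray(array, 2))
                Console.WriteLine(Marshal.ReadInt32(p));
        }
        finally
        {
            Marshal.FreeHGlobal(array);
            Marshal.FreeHGlobal(second);
            Marshal.FreeHGlobal(first);
        }
    }
}
```

Because the two payloads come from separate allocations, they are generally not adjacent in memory, which is exactly why adding IntPtr.Size to the first pointer value does not find the second one.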
I found the bug by using Debug + Windows + Memory + Memory 1. I put humansPtr in the Address field, switched to the 4-byte integer view and observed that the C code was doing it correctly. I then quickly found that ptrs[] did not contain the values I saw in the Memory window.
Not sure why you are writing code like this, other than as a mental exercise. It is not the correct way to go about it; for example, you are completely ignoring the need to release the memory again, which is very nontrivial. Parsing CSV files in C# is quite simple and just as fast as doing it in C, since it is I/O bound, not execute-bound. You'll easily avoid these almost-impossible-to-debug bugs and get lots of help from the .NET Framework.

How do I marshal a pointer to a series of null-terminated strings in C#?

I need some help with the following. I've got a c++ API (no access to source) and I'm struggling with the methods returning char* attributes, or returned structures containing char* attributes. According to the API's documentation the return value is as follows:
Return Values
If the function succeeds, the return value is a pointer to a series of null-terminated strings, one for each project on the host system, ending with a second null character. The following example shows the buffer contents with <null> representing the terminating null character:
project1<null>project2<null>project3<null><null>
If the function fails, the return value is NULL
The problem I'm having is that the returned pointer in C# only contains the first value... project1 in this case. How can I get the full list to be able to loop through them on the managed side?
Here's the c# code:
[DllImport("vmdsapi.dll", EntryPoint = "DSGetProjectList", CallingConvention = CallingConvention.Cdecl)]
public static extern IntPtr DSGetProjectList();
Calling method:
IntPtr ptrProjectList = DSAPI.DSGetProjectList();
string strProjectList = Marshal.PtrToStringAnsi(ptrProjectList).ToString();
strProjectList only contains the first item.
Here's the info from the API's header file...
DllImport char *DSGetProjectList dsproto((void));
Here's some sample code from a c++ console app which I've used for testing purposes...
char *a;
a = DSGetProjectList( );
while( *a ) {
printf("a=%s\n", a);
a += 1 + strlen(a);
}
Each iteration correctly displays every project in the list.
The problem is that when converting the C++ char* to a C# string using Marshal.PtrToStringAnsi, it stops at the first null character.
You shouldn't convert the char* directly to a string.
You could copy the char* represented by an IntPtr to a byte[] using Marshal.Copy and then extract as many strings as necessary (see Matthew Watson's answer for extracting strings from a managed array), but you'll need to get the multi-string size first.
As leppie suggests, you can also extract the first string using Marshal.PtrToStringAnsi, then increment the pointer by that string's size and extract the next string, and so on. You stop when it extracts an empty string (from the last NULL character).
Something like:
IntPtr ptrProjectList = DSAPI.DSGetProjectList();
List<string> data = new List<string>();
if (ptrProjectList != IntPtr.Zero) // NULL means the call failed
{
while (true)
{
string buffer = Marshal.PtrToStringAnsi(ptrProjectList);
if (buffer.Length == 0)
break; // empty string: reached the final null
data.Add(buffer);
// skip the string and its terminating null (assumes one byte per character)
ptrProjectList = IntPtr.Add(ptrProjectList, buffer.Length + 1);
}
}
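The first option, copying the bytes and splitting on the managed side, could look roughly like the sketch below. The class name MultiStringBytes and its Read method are hypothetical, the sample buffer stands in for what DSGetProjectList returns, and the strings are assumed to be plain ASCII; a different code page would need a different Encoding.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class MultiStringBytes
{
    // Reads a double-null-terminated list of single-byte strings starting at 'p'.
    public static string[] Read(IntPtr p)
    {
        if (p == IntPtr.Zero || Marshal.ReadByte(p) == 0)
            return new string[0]; // failed call or empty list

        // Find the offset of the null that ends the last string,
        // i.e. the first of the two consecutive nulls.
        int length = 0;
        while (Marshal.ReadByte(p, length) != 0 || Marshal.ReadByte(p, length + 1) != 0)
            length++;

        byte[] bytes = new byte[length];
        Marshal.Copy(p, bytes, 0, length);

        // Assumption: the strings are plain ASCII.
        return Encoding.ASCII.GetString(bytes).Split('\0');
    }

    static void Main()
    {
        // Stand-in for the buffer the native function would return.
        byte[] sample = Encoding.ASCII.GetBytes("project1\0project2\0project3\0\0");
        IntPtr native = Marshal.AllocHGlobal(sample.Length);
        try
        {
            Marshal.Copy(sample, 0, native, sample.Length);
            foreach (string name in Read(native))
                Console.WriteLine(name);
        }
        finally
        {
            Marshal.FreeHGlobal(native);
        }
    }
}
```

Working on bytes avoids relying on the character count of a decoded string matching its byte count, which only holds for single-byte encodings.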
This kind of string is called a Multi-String, and it's quite common in the Windows API.
Marshaling them is fiddly. What you have to do is marshal it as a char[] array, rather than a string, and then convert the char[] array to a set of strings.
See here for an example solution. I've copied the relevant code into this answer, but it is copied from the link I gave:
static List<string> MultiStringToList(char[] multistring)
{
var stringList = new List<string>();
int i = 0;
while (i < multistring.Length)
{
int j = i;
if (multistring[j++] == '\0')
break;
while (j < multistring.Length)
{
if (multistring[j++] == '\0')
{
stringList.Add(new string(multistring, i, j - i - 1));
i = j;
break;
}
}
}
return stringList;
}

Porting iostream input code from C++ to C#

This is C++ code for reading a trace of main memory addresses for a cache memory simulation:
char hex[20];
ifstream infile;
infile.open(filename,ios::in);
if(!infile) {
cout<<"Error! File not found...";
exit(0);
}
int set, tag, found;
while(!infile.eof()) { //Reading each address from trace file
if(base!=10) {
infile>>hex;
address = changebase(hex, base);
} else {
infile>>address;
}
set = (address / block_size) % no_set;
tag = address / (block_size * no_set);
}
I have converted this to C# code:
char[] hex = new char[20];
FileStream infile=new FileStream(filename, FileMode.Open);
if (infile == null) {
Console.Write("Error! File not found...");
Environment.Exit(0);
}
int set;
int tag;
int found;
while (!infile.CanRead) { //Reading each address from trace file
if (#base != 10) {
infile >> hex;
address = changebase(hex, #base);
} else {
infile >> address;
}
set = (address / block_size) % no_set;
tag = address / (block_size * no_set);
}
The problem is on the line infile >> hex;
C# gives a syntax error, as the shift right operator cannot be applied to these operands.
Why is this not working? I'm making a small cache hit and miss calculation project.
To expand on what Eric means:
C++ is quite flexible in the operators that can be overloaded. It has become an "idiom" that the bitshift operators << and >> also be used for input and output. This actually makes kind of sense as it is a logical construct and the eye registers some kind of "flow" between objects.
In C#, you don't overload those operators. What Eric means is, you need to say explicitly, on a stream object, to write (or indeed, read) something. This means calling the methods directly.
In essence you're doing the same thing - the operator overloading is just a nice shortcut, but at the end of the day some method is going to be called - be it a nice decorative "operator overload" or a plain old function call with a name.
So, in C++ we might write:
std::cout << "Hello" << std::endl;
Whereas in C# we'd write:
Console.WriteLine("Hello");
If we ignore the fact that std::cout could potentially be different from the console window (this is illustrative), the concept is exactly the same.
To expand on the idea of the operators, you'll also have probably come across things such as stringstream, a class that acts like a stream for strings. It's really quite useful:
std::stringstream ss;
int age = 25;
ss << "So you must be " << age << " years old.";
In C#, we achieve this with the StringBuilder class:
StringBuilder sb = new StringBuilder();
int age = 25;
sb.Append("So you must be ").Append(age).Append(" years old");
They both do exactly the same thing. We could also do:
sb.AppendFormat("So you must be {0} years old", age);
This is more akin (in my opinion) to the more C-like sprintf methods, or more recently, boost's format library.
C# does not use the bizarre C++ convention that bitshifting also means stream manipulation. You'll have to actually call methods for I/O.
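Applied to the trace-reading loop from the question, "calling methods" might look like the sketch below. The names TraceReader, ReadAddresses, blockSize and setCount are made up for illustration, and Convert.ToInt32(string, int) is swapped in for the question's changebase function; it only accepts bases 2, 8, 10 and 16. A StringReader is used so the example runs without a trace file; a StreamReader over the real file can be passed in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TraceReader
{
    // Parses whitespace-separated addresses from 'reader' in the given base.
    public static List<int> ReadAddresses(TextReader reader, int numberBase)
    {
        var addresses = new List<int>();
        string line;
        // ReadLine returns null at the end of the input, which replaces
        // the C++ eof() test.
        while ((line = reader.ReadLine()) != null)
        {
            // A null separator array means "split on any whitespace".
            string[] tokens = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
            foreach (string token in tokens)
                addresses.Add(Convert.ToInt32(token, numberBase));
        }
        return addresses;
    }

    static void Main()
    {
        const int blockSize = 16; // hypothetical cache parameters
        const int setCount = 4;

        using (var reader = new StringReader("1A 2F\n100"))
        {
            foreach (int address in ReadAddresses(reader, 16))
            {
                int set = (address / blockSize) % setCount;
                int tag = address / (blockSize * setCount);
                Console.WriteLine("address={0} set={1} tag={2}", address, set, tag);
            }
        }
    }
}
```

Note also that the question's C# condition while (!infile.CanRead) does not test for end of file; CanRead only says whether the stream supports reading at all.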

Convert C# function to VB, char value overflow error

I need the following function converted to VB.NET, but I'm not sure how to handle the statement
res = (uint)((h * 0x21) + c);
Complete function:
private static uint convert(string input)
{
uint res = 0;
foreach (int c in input)
res = (uint)((res * 0x21) + c);
return res;
}
I created the following, but I get an overflow error:
Private Shared Function convert(ByVal input As String) As UInteger
Dim res As UInteger = 0
For Each c In input
res = CUInt((res * &H21) + Asc(c)) ' also tried AscW
Next
Return res
End Function
What am I missing? Can someone explain the details?
Your code is correct. The calculation is overflowing after just a few characters since res increases exponentially with each iteration (and it’s not the conversion on the character that’s causing the overflow, it’s the unsigned integer that overflows).
C# by default allows integer operations to overflow – VB doesn’t. You can disable the overflow check in the project settings of VB, though. However I would try not to rely on this. Is there a reason this particular C# has to be ported? After all, you can effortlessly mix C# and VB libraries.
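The difference can be seen from the C# side alone, because C# can opt in to overflow checks with the checked keyword. This is a small sketch, with the class name OverflowDemo and the two method names made up for illustration; Hash mirrors the wrap-around behaviour the original function relies on, and HashChecked behaves like VB with overflow checks enabled.

```csharp
using System;

static class OverflowDemo
{
    // Wraps around silently on overflow, like the original C# function.
    public static uint Hash(string input)
    {
        uint res = 0;
        foreach (char c in input)
            res = unchecked((uint)(res * 0x21 + c));
        return res;
    }

    // Throws OverflowException as soon as the value no longer fits in a uint.
    public static uint HashChecked(string input)
    {
        uint res = 0;
        foreach (char c in input)
            res = checked((uint)(res * 0x21 + c));
        return res;
    }

    static void Main()
    {
        Console.WriteLine(Hash("a fairly long input string"));
        try
        {
            Console.WriteLine(HashChecked("a fairly long input string"));
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```

For short inputs both methods return the same value; they only diverge once the running value exceeds the range of a 32-bit unsigned integer.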
Here is a useful online converter: http://www.developerfusion.com/tools/convert/csharp-to-vb/

Cannot implicitly convert type 'char*' to 'bool'

The program below is from the book "Cracking the coding interview", by Gayle Laakmann McDowell.
The original code is written in C.
Here is the original code:
void reverse(char *str) {
char * end = str;
char tmp;
if (str) {
while (*end) {
++end;
}
--end;
while (str < end) {
tmp = *str;
*str++ = *end;
*end-- = tmp;
}
}
}
I am trying to convert it to C#. After researching via Google and playing with the code, below is what I have. I am a beginner and really stuck. I am not getting the value I am expecting. Can someone tell me what I am doing wrong?
class Program
{
unsafe void reverse(char *str)
{
char* end = str;
char tmp;
if (str) // Cannot implicitly convert type 'char*' to 'bool'
{
while(*end) // Cannot implicitly convert type 'char*' to 'bool'
{
++end;
}
--end;
while(str < end)
{
tmp = *str;
*str += *end;
*end -= tmp;
}
}
}
public static void Main(string[] args)
{
}
}
I can't really remember if this ever worked in C#, but I am quite certain it should not work now.
To start off by answering your question: there is no automatic cast between pointers and bool. You need to write
if(str != null)
Secondly, you can't convert char to bool either. Moreover, C# strings are not defined by a terminating character, so you can't implement the loop this way. Normally, you would write:
while(*end != '\0') // not correct, for illustration only
But a C# string stores its length explicitly and may contain '\0' anywhere, so there is no magic termination char to rely on. So you will need to take an int param for length.
Going back to the big picture, this sort of code seems like a terribly inappropriate place to start learning C#. It's way too low level, few C# programmers deal with pointers, chars and unsafe contexts on a daily basis.
... and if you must know how to fix your current code, here's a working program:
unsafe public static void Main(string[] args)
{
// work on a char[] copy: strings are immutable and must not be modified in place
char[] chars = "Hello, World".ToCharArray();
fixed (char* chr = chars)
{
reverse(chr, chars.Length);
}
Console.WriteLine(new string(chars));
}
// static, so that it can be called from the static Main
unsafe static void reverse(char* str, int length)
{
if (str != null)
{
char* end = str + length - 1; // points at the last character
while (str < end)
{
char tmp = *str;
*str = *end;
*end = tmp;
--end;
++str;
}
}
}
Edit: removed a couple of .Dump() calls, as I was trying it out in LINQPad :)
C# is not like C in that you cannot use an integer value as an implicit bool. You need to manually convert it. One example:
if (str != null)
while (*end != 0)
A word of warning: if you are migrating from C, there are a few things that can trip you up in a program like this. The main one is that char is 2 bytes: strings and chars are UTF-16 encoded. The closest C# equivalent of C's char is byte. Of course, you should use string rather than C-strings.
Another thing: If you got your char* by converting a normal string to a char*, forget your entire code. This is not C. Strings are not null-terminated.
Unless this is homework, you would much rather be doing something like this:
string foo = "Hello, World!";
string bar = new string(foo.Reverse().ToArray()); // needs using System.Linq; bar now holds "!dlroW ,olleH"
As you discovered, you can use pointers in C#, if you use the unsafe keyword. But you should do that only when really necessary and when you really know what you're doing. You certainly shouldn't use pointers if you're just beginning with the language.
Now, to your actual question: you are given a string of characters and you want to reverse it. Strings in C# are represented as the string class. And string is immutable, so you can't modify it. But you can convert between a string and an array of characters (char[]) and you can modify that. And you can reverse an array by using the static method Array.Reverse(). So, one way to write your method would be:
string Reverse(string str)
{
if (str == null)
return null;
char[] array = str.ToCharArray(); // convert the string to array
Array.Reverse(array); // reverse the array
string result = new string(array); // create a new string out of the array
return result; // and return it
}
If you wanted to write the code that actually does the reversing, you can do that too (as an exercise, I wouldn't do it in production code):
string Reverse(string str)
{
if (str == null)
return null;
char[] array = str.ToCharArray();
// use indexes instead of pointers
int start = 0;
int end = array.Length - 1;
while (start < end)
{
char tmp = array[start];
array[start] = array[end];
array[end] = tmp;
start++;
end--;
}
return new string(array);
}
Try making it more explicit what you are checking in the conditions:
if(str != null)
while(*end != '\0')
You might also want to watch out for that character swapping code: it looks like you've got a +/- in there. If that's supposed to update your pointer positions, I'd suggest making those separate operations.
