Free the memory of a local array of strings in a method - C#

I have recently encountered a problem with the memory my program uses. The cause is an array of strings I use in a method. More specifically, the program reads an integer array from an external file. Here is my code:
class Program
{
    static void Main(string[] args)
    {
        int[] a = loadData();
        for (int i = 0; i < a.Length; i++)
        {
            Console.WriteLine(a[i]);
        }
        Console.ReadKey();
    }

    private static int[] loadData()
    {
        string[] lines = System.IO.File.ReadAllLines(@"F:\data.txt");
        int[] a = new int[lines.Length];
        for (int i = 0; i < lines.Length; i++)
        {
            string[] temp = lines[i].Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
            a[i] = Convert.ToInt32(temp[0]);
        }
        return a;
    }
}
The file data.txt is about 7.4 MB and has 574,285 lines. But when I run the program, the memory shown in Task Manager is 41.6 MB. It seems that the memory of the string array I read in loadData() (the string[] lines) is not freed. How can I free it, since it is never used afterwards?

You can call GC.Collect() after setting lines to null, but I suggest you look at all the answers here, here and here. Calling GC.Collect() is something you rarely want to do. The point of using a language such as C# is that it manages memory for you. If you want granular control over the memory you read in, you could create a C++ DLL and call into that from your C# code.
Instead of reading the entire file into a string array, you could read it line by line and perform the operations that you need to on that line. That would probably be more efficient as well.
What problem does the 40MB of used memory cause? How often do you read the data? Would it be worth caching it for future use (assuming the 7MB is tolerable).
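To illustrate the line-by-line suggestion above, here is a minimal sketch of the same loadData() logic rewritten with File.ReadLines, which streams the file lazily instead of materializing a string[] of every line; only one line is held in memory at a time. The temp-file setup in Main is just scaffolding so the sketch is self-contained.

```csharp
using System;
using System.IO;
using System.Linq;

class Demo
{
    // Streams the file instead of loading all lines into a string[] first.
    // File.ReadLines returns a lazy IEnumerable<string>, so the big array
    // of raw lines from the question never exists.
    static int[] LoadData(string path)
    {
        return File.ReadLines(path)
                   .Select(line => line.Split(new[] { ',' },
                                              StringSplitOptions.RemoveEmptyEntries))
                   .Where(parts => parts.Length > 0)
                   .Select(parts => Convert.ToInt32(parts[0]))
                   .ToArray();
    }

    static void Main()
    {
        // scaffolding: write a small sample file to parse
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "1,x", "2,y", "3,z" });

        int[] a = LoadData(path);
        Console.WriteLine(string.Join(",", a)); // prints 1,2,3

        File.Delete(path);
    }
}
```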


C# are field reads guaranteed to be reliable (fresh) when using multithreading?

Background
My colleague thinks reads in multithreaded C# are reliable and will always give you the current, fresh value of a field, but I've always used locks because I was sure I'd experienced problems otherwise.
I spent some time googling and reading articles, but I must not have been giving Google the right search terms, because I didn't find exactly what I was after.
So I wrote the below program without locks in an attempt to prove why that's bad.
Question
Assuming the below is a valid test, the results show that the reads aren't reliable/fresh.
Can someone explain what causes this (reordering, staleness, or something else)?
And can you link me to the official Microsoft documentation explaining why this happens and what the recommended solution is?
If the below isn't a valid test, what would be?
Program
If there are two threads, one calling SetA and the other calling SetB, and the reads are unreliable without locks, then Foo's field "c" will intermittently be false.
using System;
using System.Threading.Tasks;

namespace SetASetBTestAB
{
    class Program
    {
        class Foo
        {
            public bool a;
            public bool b;
            public bool c;

            public void SetA()
            {
                a = true;
                TestAB();
            }

            public void SetB()
            {
                b = true;
                TestAB();
            }

            public void TestAB()
            {
                if (a && b)
                {
                    c = true;
                }
            }
        }

        static void Main(string[] args)
        {
            int timesCWasFalse = 0;
            for (int i = 0; i < 100000; i++)
            {
                var f = new Foo();
                var t1 = Task.Run(() => f.SetA());
                var t2 = Task.Run(() => f.SetB());
                Task.WaitAll(t1, t2);
                if (!f.c)
                {
                    timesCWasFalse++;
                }
            }
            Console.WriteLine($"timesCWasFalse: {timesCWasFalse}");
            Console.WriteLine("Finished. Press Enter to exit");
            Console.ReadLine();
        }
    }
}
Output
Release mode. Intel Core i7 6700HQ:
Run 1: timesCWasFalse: 8
Run 2: timesCWasFalse: 10
Of course it is not fresh. The average CPU nowadays has 3 layers of cache between each core's registers and the RAM, and it can take quite some time for a write to one cache to propagate to all of them.
And then there is the JIT compiler. Part of its job is dead-code detection, and one of the first things it will do is cut out "useless" variables. For example, this code tries to force an OOM exception by running into the 2 GiB limit on 32-bit systems:
using System;

namespace OOM_32_forced
{
    class Program
    {
        static void Main(string[] args)
        {
            // each short is 2 bytes and Int32.MaxValue is about 2^31,
            // so this requires just under 2^32 bytes, or 4 GiB,
            // well above the 2 GiB limit of a 32-bit process.
            short[] Array = new short[Int32.MaxValue];
            /* need to actually access that array;
               otherwise the JIT compiler and its optimisations will just
               skip the array definition and creation */
            foreach (short value in Array)
                Console.WriteLine(value);
        }
    }
}
The thing is that if you cut out the output, there is a decent chance the JIT will remove the variable Array, including its instantiation. The JIT has a decent chance of reducing this program to doing nothing at all at runtime.
volatile first of all prevents the JIT from doing any optimisations on that value. And it might even have some effect on how the CPU processes it.
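As a minimal sketch of what the answer describes, the two flags from the question could be declared volatile; each read then becomes a fresh, ordered load that the JIT cannot cache in a register. Note this is only an illustration of the keyword's effect: volatile does not make the question's two-flag TestAB race atomic, which still needs a lock or Interlocked.

```csharp
using System;
using System.Threading;

class Foo
{
    // volatile gives acquire/release semantics on each read and write and
    // stops the JIT from hoisting the field into a register. It does NOT
    // make compound operations on several fields atomic.
    private volatile bool a;
    private volatile bool b;

    public void SetA() { a = true; }
    public void SetB() { b = true; }
    public bool BothSet() => a && b; // two fresh, ordered loads
}

class Program
{
    static void Main()
    {
        var f = new Foo();
        var t1 = new Thread(f.SetA);
        var t2 = new Thread(f.SetB);
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        // After both threads have completed, both flags are visible.
        Console.WriteLine(f.BothSet()); // prints True
    }
}
```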

GPU global memory calculation

In the worst case, does this sample allocate testCnt * xArray.Length storage in the GPU global memory? How to make sure just one copy of the array is transferred to the device? The GpuManaged attribute seems to serve this purpose but it doesn't solve our unexpected memory consumption.
void Worker(int ix, byte[] array)
{
    // process array - only read access
}

void Run()
{
    var xArray = new byte[100];
    var testCnt = 10;
    Gpu.Default.For(0, testCnt, ix => Worker(ix, xArray));
}
EDIT
The main question in a more precise form:
Does each worker thread get a fresh copy of xArray or is there only one copy of xArray for all threads?
Your sample code should allocate 100 bytes of memory on the GPU and 100 bytes of memory on the CPU.
(.Net adds a bit of overhead, but we can ignore that)
Since you're using implicit memory, some resources need to be allocated to track that memory, (basically where it lives: CPU/GPU).
You're probably seeing bigger memory consumption on the CPU side, I assume.
The reason is possibly kernel compilation happening on the fly.
AleaGPU has to compile your IL code into LLVM IR, which is fed into the CUDA compiler and in turn converted into PTX.
This happens the first time you run a kernel.
All of the resources and unmanaged DLLs are loaded into memory.
That's possibly what you're seeing.
testCnt has no effect on the amount of memory being allocated.
EDIT
One suggestion is to use memory in an explicit way.
It's faster and more efficient:
private static void Run()
{
    var input = Gpu.Default.AllocateDevice<byte>(100);
    var deviceptr = input.Ptr;
    Gpu.Default.For(0, input.Length, i => Worker(i, deviceptr));
    Console.WriteLine(string.Join(", ", Gpu.CopyToHost(input)));
}

private static void Worker(int ix, deviceptr<byte> array)
{
    array[ix] = 10;
}
Try using explicit memory:
static void Worker(int ix, byte[] array)
{
    // you must write something back; note, I changed your Worker
    // function to static!
    array[ix] += 1;
}

void Run()
{
    var gpu = Gpu.Default;
    var hostArray = new byte[100];
    // set your host array
    var deviceArray = gpu.Allocate<byte>(100);
    // deviceArray is of type byte[], but deviceArray.Length == 0,
    // while Gpu.ArrayGetLength(deviceArray) == 100
    Gpu.Copy(hostArray, deviceArray);
    var testCnt = 10;
    gpu.For(0, testCnt, ix => Worker(ix, deviceArray));
    // you must copy memory back
    Gpu.Copy(deviceArray, hostArray);
    // check your result in hostArray
    Gpu.Free(deviceArray);
}

C# "A heap has been corrupted" using sam-ba.dll

I'm writing a C# program that makes a call to the AT91Boot_Scan function in sam-ba.dll. In the documentation for this DLL, the signature for this function is void AT91Boot_Scan(char *pDevList). The purpose of this function is to scan and return a list of connected devices.
Problem: My current problem is that every time I call this function from C#, the code in the DLL throws an "A heap has been corrupted" exception.
Aside: From what I understand from reading the documentation, the char *pDevList parameter is a pointer to an array of buffers that the function can use to store the device names. However, when calling the method from C#, IntelliSense reports that the signature for this function is actually void AT91Boot_Scan(ref byte pDevList)
I was somewhat confused by this. A single byte isn't long enough to be a pointer; we would need 4 bytes on 32-bit and 8 bytes on 64-bit. If it's the ref keyword that makes this parameter a pointer, then which byte should I be passing in: the first byte of my array of buffers, or the first byte of the first buffer?
Code: The C# method I've written that calls this function is as follows.
/// <summary>
/// Returns a string array containing the names of connected devices
/// </summary>
/// <returns></returns>
private string[] LoadDeviceList()
{
    const int MAX_NUM_DEVICES = 10;
    const int BYTES_PER_DEVICE_NAME = 100;
    SAMBADLL samba = new SAMBADLL();
    string[] deviceNames = new string[MAX_NUM_DEVICES];
    try
    {
        unsafe
        {
            // Allocate an array (of size MAX_NUM_DEVICES) of pointers
            byte** deviceList = stackalloc byte*[MAX_NUM_DEVICES];
            for (int n = 0; n < MAX_NUM_DEVICES; n++)
            {
                // Allocate a buffer of size 100 for the device name
                byte* deviceNameBuffer = stackalloc byte[BYTES_PER_DEVICE_NAME];
                // Assign the buffer to a pointer in the deviceList
                deviceList[n] = deviceNameBuffer;
            }
            // Create a pointer to the deviceList
            byte* pointerToStartOfList = *deviceList;
            // Call the function. "A heap has been corrupted" is thrown here.
            samba.AT91Boot_Scan(ref *pointerToStartOfList);
            // Read back out the names by converting the bytes to strings
            for (int n = 0; n < MAX_NUM_DEVICES; n++)
            {
                byte[] nameAsBytes = new byte[BYTES_PER_DEVICE_NAME];
                Marshal.Copy((IntPtr)deviceList[n], nameAsBytes, 0, BYTES_PER_DEVICE_NAME);
                string nameAsString = System.Text.Encoding.UTF8.GetString(nameAsBytes);
                deviceNames[n] = nameAsString;
            }
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
    return deviceNames;
}
My attempt at a solution: I noticed that the line byte* pointerToStartOfList = *deviceList; wasn't correctly assigning the pointer for deviceList to pointerToStartOfList. The address was always off by 0x64.
I thought that if I hard-coded a 0x64 offset, the two addresses would match and all would be fine: pointerToStartOfList += 0x64;
However, despite forcing the addresses to match, I was still getting an "A heap has been corrupted" error.
My thoughts: I think I'm either not creating the array of buffers correctly, or not passing its pointer correctly.
In the end I could not get sam-ba.dll to work. I tried writing a C++ wrapper around the DLL, but even then it still threw the "A heap has been corrupted" error. My final solution was to embed the SAM-BA executable, sam-ba.exe, along with all of its dependencies, inside my C# program.
Then, whenever I needed it, I would run sam-ba.exe in command-line mode and pass it the relevant arguments. Section 5.1 of the SAM-BA documentation provides instructions on running sam-ba.exe in command-line mode, e.g.:
SAM-BA.exe \usb\ARM0 AT91SAM9G25-EK myCommand.tcl
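The embed-and-shell-out workaround above could be sketched with System.Diagnostics.Process. The executable path and the port/board/script arguments are taken from the example command line in the answer and are illustrative only; consult section 5.1 of the SAM-BA documentation for the real syntax.

```csharp
using System;
using System.Diagnostics;

class SamBaRunner
{
    // Builds the start info for driving sam-ba.exe in command-line mode
    // instead of P/Invoking sam-ba.dll. All values are illustrative.
    static ProcessStartInfo BuildStartInfo(string port, string board, string script)
    {
        return new ProcessStartInfo
        {
            FileName = "SAM-BA.exe",
            Arguments = $"{port} {board} {script}",
            UseShellExecute = false,        // needed to redirect output
            RedirectStandardOutput = true,
        };
    }

    static void Main()
    {
        var psi = BuildStartInfo(@"\usb\ARM0", "AT91SAM9G25-EK", "myCommand.tcl");
        // Print the command instead of running it, since sam-ba.exe
        // is not present in this sketch.
        Console.WriteLine($"{psi.FileName} {psi.Arguments}");
        // In real use:
        // using (var p = Process.Start(psi)) { p.WaitForExit(); }
    }
}
```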

Perform operations while streamreader is open or copy stream locally, close the stream and then perform operations?

Which of the following approaches is better? That is, is it better to copy the stream's data locally, close the stream, and then do whatever operations are needed on the data, or to perform the operations while the stream is open? Assume that the input from the stream is huge.
First method:
public static int calculateSum(string filePath)
{
    int sum = 0;
    var list = new List<int>();
    using (StreamReader sr = new StreamReader(filePath))
    {
        while (!sr.EndOfStream)
        {
            list.Add(int.Parse(sr.ReadLine()));
        }
    }
    foreach (int item in list)
        sum += item;
    return sum;
}
Second method:
public static int calculateSum(string filePath)
{
    int sum = 0;
    using (StreamReader sr = new StreamReader(filePath))
    {
        while (!sr.EndOfStream)
        {
            sum += int.Parse(sr.ReadLine());
        }
    }
    return sum;
}
If the file is modified often, then read the data in and then work with it. If it is not accessed often, then you are fine to read the file one line at a time and work with each line separately.
In general, if you can do it in a single pass, then do it in a single pass. You indicate that the input is huge, so it might not all fit into memory. If that's the case, then your first option isn't even possible.
Of course, there are exceptions to every rule of thumb. But you don't indicate that there's anything special about the file or the access pattern (other processes wanting to access it, for example) that prevents you from keeping it open longer than absolutely necessary to copy the data.
I don't know if your example is a real-world scenario or if you're just using the sum thing as a placeholder for more complex processing. In any case, if you're processing a file line-by-line, you can save yourself a lot of trouble by using File.ReadLines:
int sum = 0;
foreach (var line in File.ReadLines(filePath))
{
    sum += int.Parse(line);
}
This does not read the entire file into memory at once. Rather, it uses an enumerator to present one line at a time, and only reads as much as it must to maintain a relatively small (probably four kilobyte) buffer.
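The same single-pass idea can be compressed into one LINQ expression over the lazy enumerator; it still never loads the whole file into memory. A minimal runnable sketch (the temp file stands in for the caller's filePath):

```csharp
using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main()
    {
        // scaffolding: a small sample file of one integer per line
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "10", "20", "12" });

        // One pass, one line in memory at a time: File.ReadLines is lazy
        // and Sum consumes it as it streams.
        int sum = File.ReadLines(path).Sum(int.Parse);

        Console.WriteLine(sum); // prints 42
        File.Delete(path);
    }
}
```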

C# StringWriter is faster than C++ ofstream (via pinvoke)?

I'm just a beginner at C++, so I might be doing something wrong, but anyway: I created a C++ DLL and call it from my WPF project.
C++ code:
extern "C" __declspec(dllexport) double writeTxt()
{
    ofstream mf("c:\\cpp.txt");
    for (int i = 0; i < 999; i++)
    {
        mf << "xLine: \n";
    }
    mf.close();
    return 1;
}
Calling code from C#:
[DllImport(@"C:\Users\neo\Documents\visual studio 2010\Projects\TestDll\Debug\TestDll.dll",
    CallingConvention = CallingConvention.Cdecl)]
public static extern double writeTxt();
Now I'm trying to compare the execution time with this C# function:
double writeTxtCs()
{
    StreamWriter sw = new StreamWriter(@"c:\cs.txt");
    for (int i = 0; i < 999; i++)
    {
        sw.WriteLine("Line: " + i);
    }
    sw.Close();
    return 0;
}
But the C# function is about twice as fast as the C++ function.
I tested it like this:
private void Window_Loaded(object sender, RoutedEventArgs e)
{
    long[] arr = new long[100];
    Stopwatch sw = new Stopwatch();
    for (int i = 0; i < 99; i++)
    {
        sw.Start();
        //double xxx = writeTxt();
        double xxx = writeTxtCs();
        arr[i] = sw.ElapsedMilliseconds;
        sw.Reset();
    }
    MessageBox.Show(arr.Average().ToString());
    Close();
}
When running the C# function I normally get ~0.65 ms, and when running the C++ function I get ~1.1 ms.
My question is: am I doing something wrong, or is C# really faster than C++ in this scenario?
All other answers have valid points. In addition to those:
You are testing against "Debug" build of your C++ DLL and that might be degrading C++ performance more than how it affects C#'s performance. Try unleashing optimizations on both and see how it works out for you.
Nevertheless, I/O doesn't have much to do with the language; it's more about the runtime and the OS.
You're not testing C++ vs. C#. You're testing [C++ plus libraries] vs. [C# plus libraries].
In order to find out why ofstream is slower than StreamWriter you'd need to profile the code or look into the internals.
Anyway, single milliseconds are a very small amount of time for a computer. I'd repeat the test with 1000 times the load to make timing jitter irrelevant.
It could be a buffering issue, i.e. the C# and C++ file writing guts might be buffering data in a different way, which would result in performance differences.
I recommend you use an operation which is purely CPU bound for benchmarking, rather than something IO bound (such as writing to a hard drive). For instance, see how fast each function can count from 0 to MAX_INT.
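The CPU-bound suggestion above might look like the following sketch: time a pure counting loop with Stopwatch so the measurement is not dominated by disk buffering. The iteration counts are arbitrary, and the warm-up call matters because the first invocation includes JIT compilation.

```csharp
using System;
using System.Diagnostics;

class Bench
{
    // A purely CPU-bound workload: no I/O, so the timing reflects
    // compute rather than file-system buffering behaviour.
    static long CountTo(long max)
    {
        long n = 0;
        for (long i = 0; i < max; i++)
            n++;
        return n;
    }

    static void Main()
    {
        // warm-up so the JIT compiles CountTo before we measure it
        CountTo(1_000_000);

        var sw = Stopwatch.StartNew();
        long result = CountTo(100_000_000);
        sw.Stop();

        Console.WriteLine(result); // prints 100000000
        // Elapsed time is machine-dependent, so don't assert on it.
        Console.WriteLine(sw.ElapsedMilliseconds >= 0); // prints True
    }
}
```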
That's not really a reliable benchmark. You're doing file I/O there, which is highly dependent on the implementations of ofstream and StreamWriter. Also, is it actually a problem?
