C# repeat data, reset on boundary

I'm trying to write some repeated data into a byte array, it looks like this:
byte[] bytes = Encoding.ASCII.GetBytes("UNKNOWN");
int count = 0;
for (int i = 0; i < several_MB_worth_of_bytes; i++)
{
output[i] = bytes[count];
count++;
if (count >= bytes.Length) count = 0;
}
This works. However, I need to reset the count variable once I've written exactly 1 MB worth of bytes, so that the next byte after the 1 MB boundary will be the first 'U' in the string. This needs to happen on every MB boundary.
I can't quite figure out the best way of handling the reset. I've taken a look at the ByteSize library for .NET, but I'm still not sure how that's going to help me.

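You could check whether the next index is a multiple of 1 MB inside the same if statement where you are resetting the counter. The check has to use i + 1 because it runs after the current byte has been written; testing i itself would reset one byte too late (and would also reset right after the first byte, since 0 is a multiple of everything):
if (count >= bytes.Length || (i + 1) % (1024 * 1024) == 0) count = 0;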
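An alternative that avoids the counter altogether is to derive the pattern index from the position inside the current block. This is only a sketch: the PatternFill class, the Fill method and the blockSize parameter are names made up for illustration, and blockSize would be 1024 * 1024 for the 1 MB case.

```csharp
using System;
using System.Text;

static class PatternFill
{
    // Fills output with the repeating pattern, restarting the pattern
    // at every multiple of blockSize (hypothetical helper, not a library API).
    public static void Fill(byte[] output, byte[] pattern, int blockSize)
    {
        for (int i = 0; i < output.Length; i++)
        {
            int offsetInBlock = i % blockSize;              // position inside the current block
            output[i] = pattern[offsetInBlock % pattern.Length];
        }
    }
}
```

With a block size of 10 and the pattern "UNKNOWN", index 9 holds 'K' (the third letter of the second repetition) and index 10 starts again at 'U'.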

Related

Copying pointers data to byte array and writing to MemoryStream results to tons of allocation in LOH

I have WPF app and using ffmpeg library. I have a video recording preview using SDL2. For SDL pixel format its used PixelFormat_UYVY, so every frame is converted to YUV420P.
The conversion is done with:
MemoryStream ms = null;
using (ms = new MemoryStream())
{
int shift = 0;
byte* yuv_factor;
for (uint i = 0; i < 3; i++)
{
shift = (i == 0 ? 0 : 1);
yuv_factor = frame->data[i];
for (int j = 0; j < (frame->height >> shift); j++)
{
byte[] frameData = new byte[frame->width >> shift];
Marshal.Copy((IntPtr)yuv_factor, frameData, 0, frameData.Length);
yuv_factor += frame->linesize[i];
ms.Write(frameData, 0, frameData.Length);
}
}
}
return ms.ToArray();
Then this byte[] is simply cast to IntPtr and passed to SDL.
sdl.Preview((IntPtr)data);
The problem is that I can see tons of GC Pressure and a lot of System.Byte[] allocations in LOH. Is there a way to fix that?

I would suggest you begin by analyzing the number of allocations in your code:
The MemoryStream is essentially backed by an expanding byte array under the hood, so every time it grows it allocates a new, larger array. There are recyclable versions of it (for example Microsoft's RecyclableMemoryStream), which might help as well.
As already suggested in the comments, renting an array from the ArrayPool might be an easy way to reduce allocations very quickly, especially for the frameData array, which is currently allocated once per row.
The ToArray() call at the end of your method also creates a new array and copies the stream contents into it.
I would try to target these three spots first and then reevaluate.
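As a rough sketch of the ArrayPool idea, the per-row byte[] and the MemoryStream can be replaced by copying each row straight into one rented buffer. The PlaneCopy class and its CopyPlane method are hypothetical names, and the example uses plain spans instead of the ffmpeg frame pointers so that it stands alone; in the real code the source span would wrap frame->data[i] and the stride would be frame->linesize[i].

```csharp
using System;
using System.Buffers;

static class PlaneCopy
{
    // Copies height rows of width bytes from a strided source into a
    // tightly packed destination. Returns the number of bytes written.
    public static int CopyPlane(ReadOnlySpan<byte> src, int stride,
                                int width, int height, Span<byte> dst)
    {
        for (int row = 0; row < height; row++)
        {
            src.Slice(row * stride, width).CopyTo(dst.Slice(row * width, width));
        }
        return width * height;
    }

    public static byte[] Demo()
    {
        // Two rows of two useful bytes each, padded to a stride of four.
        byte[] source = { 1, 2, 9, 9, 3, 4, 9, 9 };

        // Rent may hand back a larger array than requested, so track the real length.
        byte[] rented = ArrayPool<byte>.Shared.Rent(4);
        try
        {
            int written = CopyPlane(source, 4, 2, 2, rented);
            return rented.AsSpan(0, written).ToArray(); // copy only for the demo
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

In a preview loop the rented buffer would be kept for the lifetime of the stream and reused for every frame, rather than rented and returned per frame as in this demo.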

Why is this C# function behaving like I'm using pointers?

So I've got this code I'm writing in C#. It's supposed to reverse the order of bits in a bit array (not invert them). Through heavy use of breakpoints and watches, I've determined that the function is somehow modifying both the input parameter array array and the array I copied that into in an attempt to make the function NOT change the input array, tempArray.
static BitArray reverseBits(BitArray array)
{
BitArray tempArray = array;
int length = tempArray.Length;
int mid = length / 2;
for (int i = 0; i < mid; i++)
{
bool tempBit = tempArray[i];
tempArray[i] = tempArray[length - 1 - i]; //the problem seems to be happening
tempArray[length - 1 - i] = tempBit; //somewhere in this vicinity
}
return tempArray;
}
I have no idea why it's behaving like this. Granted, pointers were never my strong suit, but I do try to avoid them whenever possible and they don't seem to be used much at all in c#, which is why I'm puzzled about this behavior.
TL;DR: if you pass my function 00000001, you'll be returned 10000000 from the function AND the array that was passed from the outside will be changed to that as well
P.S. This is for an FFT-related task; that's why I'm bothering with the bit reversal at all.

I believe you want to create a new instance of a BitArray like this:
BitArray tempArray = new BitArray(array);
This should create a new instance of a BitArray instead of creating another variable referencing the original array.

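You haven't copied the array, you've just assigned it to another variable.
BitArray is a class, i.e. a reference type, so both variables hold a reference to the same object (similar to pointers in C/etc).
If you want to copy the array, use the copy constructor (new BitArray(array)) or Clone(). Note that the CopyTo method copies the bits into a bool[], byte[] or int[], not into another BitArray.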
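Putting the answers above together, a minimal sketch of a non-mutating version of the function could look like this. The BitReversal class name is made up; the only change to the logic from the question is the copy constructor on the first line of the method.

```csharp
using System;
using System.Collections;

static class BitReversal
{
    // Returns a new BitArray with the bit order reversed; the input is left untouched.
    public static BitArray ReverseBits(BitArray source)
    {
        var result = new BitArray(source); // copy constructor: a separate object
        int length = result.Length;
        for (int i = 0; i < length / 2; i++)
        {
            bool tmp = result[i];
            result[i] = result[length - 1 - i];
            result[length - 1 - i] = tmp;
        }
        return result;
    }
}
```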

Maybe this similar byte-wise function could help you:
/// <summary>
/// Reverse bit order in each byte (8 bits) of a BitArray
/// (change endian bit order)
/// </summary>
public static void BytewiseReverse(BitArray bitArr)
{
int byteCount = bitArr.Length / 8;
for (int i = 0; i < byteCount; i++)
{
for (int j = 0; j < 4; j++)
{
bool temp = bitArr[i * 8 + 7 - j];
bitArr[i * 8 + 7 - j] = bitArr[i * 8 + j];
bitArr[i * 8 + j] = temp;
}
}
}

What defines the capacity of a memory stream

I was calculating the size of an object (a List that is being populated) using the following code:
long myObjectSize = 0;
System.IO.MemoryStream memoryStreamObject = new System.IO.MemoryStream();
System.Runtime.Serialization.Formatters.Binary.BinaryFormatter binaryBuffer = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
binaryBuffer.Serialize(memoryStreamObject, myListObject);
myObjectSize = memoryStreamObject.Position;
At the initial point, the capacity of memoryStreamObject was 1024.
Later (after adding more elements to the list) it was shown as 2048.
It seems to keep increasing as the stream content grows. What, then, is the purpose of the capacity in this scenario?

This is caused by the internal implementation of the MemoryStream. The Capacity property is the size of the internal buffer. This makes sense if the MemoryStream is created with a fixed-size buffer. But in your case the MemoryStream can grow, and the actual implementation doubles the size of the buffer if the buffer is too small.
Code of MemoryStream:
private bool EnsureCapacity(int value)
{
if (value < 0)
{
throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
}
if (value > this._capacity)
{
int num = value;
if (num < 256)
{
num = 256;
}
if (num < this._capacity * 2)
{
num = this._capacity * 2;
}
if (this._capacity * 2 > 2147483591)
{
num = ((value > 2147483591) ? value : 2147483591);
}
this.Capacity = num;
return true;
}
return false;
}
And somewhere in Write
int num = this._position + count;
// snip
if (num > this._capacity && this.EnsureCapacity(num))

The purpose of the capacity for a memory stream and for lists is that the underlying data structure is really an array, and arrays cannot be dynamically resized.
So to start with you use an array with a small(ish) size, but once you add enough data so that the array is no longer big enough you need to create a new array, copy over all the data from the old array to the new array, and then switch to using the new array from now on.
This create+copy takes time, and the bigger the array, the longer it takes. If you resized the array to be just big enough every time, you would effectively do this on every write to the memory stream or every time you add a new element to the list.
Instead you have a capacity, saying "you can use up to this value before having to resize" to reduce the number of create+copy cycles you have to perform.
For instance, if you were to write one byte at a time to this array, and not have this capacity concept, every extra byte would mean one full cycle of create+copy of the entire array. Instead, with the last screenshot in your question, you can write one byte at a time, 520 more times before this create+copy cycle has to be performed.
So this is a performance optimization.
An additional bonus: repeatedly allocating slightly larger memory blocks will eventually fragment memory, so that you risk getting "out of memory" exceptions. Reducing the number of such allocations also helps to stave off this scenario.
A typical method to calculate this capacity is by just doubling it every time.
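To see the effect, a small sketch like the following records every distinct Capacity value while writing one byte at a time. CapacityDemo and ObserveCapacities are names invented for this example. On the runtimes I'm aware of the values grow by doubling (256, 512, 1024, ...), but the exact numbers are an implementation detail, so treat them as an observation rather than a guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CapacityDemo
{
    // Writes bytesToWrite single bytes and returns each distinct Capacity seen.
    public static List<int> ObserveCapacities(int bytesToWrite)
    {
        var seen = new List<int>();
        using (var ms = new MemoryStream())
        {
            int last = ms.Capacity;
            seen.Add(last);
            for (int i = 0; i < bytesToWrite; i++)
            {
                ms.WriteByte(0);
                if (ms.Capacity != last)
                {
                    // The internal buffer was replaced by a bigger one.
                    last = ms.Capacity;
                    seen.Add(last);
                }
            }
        }
        return seen;
    }
}
```

Writing 3000 bytes this way triggers only a handful of reallocations instead of 3000, which is the whole point of having a capacity larger than the current length.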

CodePointAt equivalent in c#

I have this code in Java and it works fine:
String a = "ABC";
System.out.println(a.length());
for (int n = 0; n < a.length(); n++)
System.out.println(a.codePointAt(n));
The output as expected is
3
65
66
67
I am a little confused about a.length(), because it is supposed to return the length in chars, but String must store every < 256 char in 16 bits or whatever a Unicode character would need.
But the question is: how can I do the same in C#?
I need to scan a string and act depending on some Unicode characters found.
The real code I need to translate is
String str = this.getString();
int cp;
boolean escaping = false;
for (int n = 0; n < len; n++)
{
//===================================================
cp = str.codePointAt(n); //LOOKING FOR SOME EQUIVALENT IN C#
//===================================================
if (!escaping)
{
....
//Closing all braces below.
Thanks in advance.
How much I love Java :). I just need to deliver a Windows app that is a client of a Java / Linux app server.

The exact translation would be this:
string a = "ABC⤶"; //Let's throw in a rare unicode char
Console.WriteLine(a.Length);
for (int n = 0; n < a.Length; n++)
Console.WriteLine((int)a[n]); //a[n] returns a char, which we can cast to an integer
//final result : 4 65 66 67 10550
In C# you don't need codePointAt at all; you can get the Unicode number directly by casting the character to an int (in an assignment, the conversion is implicit). So you can get your cp simply by doing
cp = (int)str[n];
How much I love C# :)
However, this is valid only for Unicode values up to U+FFFF. Surrogate pairs are handled as two separate chars when you break the string down, so they won't be printed as one value. If you really need to handle full UTF-32 code points, you can refer to this answer, which basically uses
int cp = Char.ConvertToUtf32(a, n);
and advances the loop by two when Char.IsSurrogatePair() is true (because such a character is encoded as two chars).
Your translation would then become
string a = "ABC\U0001F01C";
Console.WriteLine(a.Count(x => !char.IsHighSurrogate(x))); // needs using System.Linq;
for (var i = 0; i < a.Length; i += char.IsSurrogatePair(a, i) ? 2 : 1)
Console.WriteLine(char.ConvertToUtf32(a, i));
Please note the change from a.Length to a little bit of LINQ for the count, because a surrogate pair is counted as two chars. We simply count how many chars are not high surrogates to get the number of actual characters.

The following code prints the code point of each character in a string, treating a surrogate pair as a single code point:
var s = "\uD834\uDD61";
for (var i = 0; i < s.Length; i += char.IsSurrogatePair(s, i) ? 2 : 1)
{
var codepoint = char.ConvertToUtf32(s, i);
Console.WriteLine("U+{0:X4}", codepoint);
}
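If both the count and the values are needed, the same loop can be wrapped in a small helper that collects the code points into a list. This is only a sketch; CodePoints and Of are names chosen for illustration. Note that char.ConvertToUtf32(string, int) throws an ArgumentException on an unpaired surrogate, which this version does not try to handle.

```csharp
using System;
using System.Collections.Generic;

static class CodePoints
{
    // Returns the Unicode code points of s, one entry per character,
    // with each surrogate pair collapsed into a single value.
    public static List<int> Of(string s)
    {
        var result = new List<int>();
        for (int i = 0; i < s.Length; i += char.IsSurrogatePair(s, i) ? 2 : 1)
        {
            result.Add(char.ConvertToUtf32(s, i));
        }
        return result;
    }
}
```

The Count of the returned list then plays the role of the character count, and the entries replace the individual codePointAt calls.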

MemoryMappedViewAccessor.ReadArray<> throws IndexOutOfRangeException

I'm trying to read a C-style Unicode string from a memory-mapped file, and ReadArray threw an IndexOutOfRangeException. I worked around it by copying char by char, but I'd like to use ReadArray, which is more readable.
MemoryMappedFile file = MemoryMappedFile.OpenExisting("some name");
MemoryMappedViewAccessor view = file.CreateViewAccessor();
int len = (int)view.ReadUInt64(0); // Length of string + 1 is stored.
char[] buffer = new char[len];
//view.ReadArray<char>(0, buffer, sizeof(UInt64), len); // EXCEPTION
for (int i = 0; i < len; i++) // char by char, works fine.
buffer[i] = view.ReadChar(sizeof(UInt64) + sizeof(char) * i);
Tried to find a short example showing how to use ReadArray<> but I couldn't.

In ReadArray, the first parameter is the position to read from in the view, and the third is the offset within the destination array:
public int ReadArray<T>(
long position,
T[] array,
int offset,
int count
)
So:
view.ReadArray<char>(0, buffer, sizeof(UInt64), len);
is saying to fill the array at indexes from sizeof(UInt64) to sizeof(UInt64) + len - 1, which will always overflow the usable index values (assuming sizeof(UInt64) is greater than 0 :-)).
Try:
view.ReadArray<char>(sizeof(UInt64), buffer, 0, len);
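For a short end-to-end example, the sketch below writes a length-prefixed, null-terminated string into an anonymous memory-mapped file and reads it back with ReadArray using the corrected argument order. MappedString and RoundTrip are hypothetical names, and the layout (a UInt64 holding length + 1, followed by the chars) is taken from the question.

```csharp
using System;
using System.IO.MemoryMappedFiles;

static class MappedString
{
    // Stores text as [UInt64 length + 1][chars...][\0] and reads it back.
    public static string RoundTrip(string text)
    {
        char[] chars = (text + "\0").ToCharArray();
        long capacity = sizeof(ulong) + sizeof(char) * (long)chars.Length;

        // A null map name creates an anonymous mapping private to this process.
        using (var file = MemoryMappedFile.CreateNew(null, capacity))
        using (var view = file.CreateViewAccessor())
        {
            view.Write(0, (ulong)chars.Length);
            view.WriteArray(sizeof(ulong), chars, 0, chars.Length);

            int len = (int)view.ReadUInt64(0);
            var buffer = new char[len];
            // position in the view first, then the array, then the offset inside the array
            view.ReadArray(sizeof(ulong), buffer, 0, len);
            return new string(buffer, 0, len - 1); // drop the trailing \0
        }
    }
}
```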

In ReadArray, parameters 1 and 3 should be swapped.
IntelliSense in VS 2010 describes ReadArray<>'s parameters incorrectly.
(This may vary with the language/locale of VS.)
