I'm having trouble copying arrays into a jagged array. My goal is to take an array of type char, split the "words" into separate arrays (I already have a working function for that), and put them into a jagged array.
static char[][] split_string(char[] str)
{
int length = count_words(str);
char[][] sentence = new char[length][];
int start = 0;
int end = 0;
int word = 0;
int i = -1;
while (i != str.Length)
{
i++;
if (str[i]==' ')
{
end = i-1;
char[] aux_array = substring(str, start, end);
//issue
aux_array.CopyTo(sentence[word], 0);
//alternative (not working either)
/*
for(int j=0; j<aux_array.Length;j++)
{
sentence[word][j] = aux_array[j];
}
*/
while (str[i]==' ')
{
i++;
}
word++;
start = i;
}
}
return sentence;
}
For information:
substring is of the form: substring(array, int, int) -> array
count_words is of the form: count_words(array) -> int
My goal is to take an array of type char, separate the "words" into separate arrays (I use an already working function for it) and want to put them into an array.
Then just put them into the array:
//...
sentence[word] = substring(str, start, end);
Note that the elements of a jagged array are null by default and you never allocated them, so CopyTo throws an ArgumentNullException (and the commented-out loop would throw a NullReferenceException). If you really need a copy of the returned array, the easiest way is to use the Array.Clone method, like this:
sentence[word] = (char[])substring(str, start, end).Clone();
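As a minimal sketch of the point above: the rows of a freshly created jagged array are null until something is assigned to them, and assigning the returned array directly is enough. The Substring helper here is a hypothetical stand-in for the asker's substring function, which was not shown.

```csharp
using System;

static class JaggedDemo
{
    // Hypothetical stand-in for the asker's substring helper:
    // copies str[start..end] (end inclusive) into a new array.
    public static char[] Substring(char[] str, int start, int end)
    {
        var result = new char[end - start + 1];
        Array.Copy(str, start, result, 0, result.Length);
        return result;
    }

    public static void Main()
    {
        char[][] sentence = new char[2][];
        Console.WriteLine(sentence[0] == null); // True: rows start out null

        char[] str = "hi you".ToCharArray();
        sentence[0] = Substring(str, 0, 1); // assign the row directly
        sentence[1] = Substring(str, 3, 5);

        Console.WriteLine(new string(sentence[0])); // hi
        Console.WriteLine(new string(sentence[1])); // you
    }
}
```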
It is easier to work with strings than with raw char arrays, but I assume you chose char arrays on purpose.
One way to simplify your code is to build the char array as you go instead of preallocating it. .NET arrays have fixed size but List<T> allows you to grow a collection of items.
You can also change your function into an iterator block to simplify it further. When a word is complete you yield return it to the caller.
IEnumerable<char[]> SplitString(char[] str) {
var word = new List<char>();
foreach (var ch in str) {
if (ch == ' ') {
if (word.Count > 0) {
yield return word.ToArray();
word = new List<char>();
}
}
else
word.Add(ch);
}
if (word.Count > 0)
yield return word.ToArray();
}
This function will not return an array so if you want an array of arrays (jagged array) you need to use ToArray():
var str = "The quick brown fox jumps over the lazy dog".ToCharArray();
var result = SplitString(str).ToArray();
This code will correctly handle multiple spaces and spaces in the beginning and end of the source string.
I have the following example string:
0 0 1 2.33 4
2.1 2 11 2
There are many ways to convert it to an array, but I need the fastest one, because files can contain 1 billion elements.
The string can contain an arbitrary number of spaces between numbers.
This is what I'm trying:
static void Main()
{
string str = "\n\n\n 1 2 3 \r 2322.2 3 4 \n 0 0 ";
byte[] byteArray = Encoding.ASCII.GetBytes(str);
MemoryStream stream = new MemoryStream(byteArray);
var values = ReadNumbers(stream);
}
public static IEnumerable<object> ReadNumbers(Stream st)
{
var buffer = new StringBuilder();
using (var sr = new StreamReader(st))
{
while (!sr.EndOfStream)
{
char digit = (char)sr.Read();
if (!char.IsDigit(digit) && digit != '.')
{
if (buffer.Length == 0) continue;
double ret = double.Parse(buffer.ToString() , culture);
buffer.Clear();
yield return ret;
}
else
{
buffer.Append(digit);
}
}
if (buffer.Length != 0)
{
double ret = double.Parse(buffer.ToString() , culture);
buffer.Clear();
yield return ret;
}
}
}
There are a few things you can do to improve the performance of your code. First, you can use the Split method to split the string into an array of strings, where each element of the array is a number in the string. This will be faster than reading each character of the string one at a time and checking if it is a digit.
Next, you can use double.TryParse to parse each element of the array into a double, rather than using double.Parse and catching any potential exceptions. TryParse will be faster because it does not throw an exception if the string is not a valid double.
Here is an example of how you could implement this:
public static IEnumerable<double> ReadNumbers(string str)
{
string[] parts = str.Split(new[] {' ', '\n', '\r', '\t'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string part in parts)
{
if (double.TryParse(part, NumberStyles.Any, CultureInfo.InvariantCulture, out double value))
{
yield return value;
}
}
}
I'd suggest the simplest solution first, and hunt for nanoseconds only if there really is a problem with that code.
var doubles = myInput.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Select(x => double.Parse(x, whateverCulture));
Do that for every line in your file, not for the entire file at once, as reading such a huge file in one go may exhaust your memory.
This is pretty easy to understand. Afterwards, run a benchmark with your data and see whether parsing really affects performance. Chances are, however, that the actual bottleneck is reading that huge file, which is essentially an I/O matter.
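A minimal sketch of the line-by-line approach might look like the following. It assumes the numbers use the invariant culture (a '.' decimal separator) and that every token is a valid number. The LineParser class name is made up for illustration, and a StringReader stands in for the file.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

static class LineParser
{
    // Reads one line at a time, so only a single line is held in memory.
    public static IEnumerable<double> ReadNumbers(TextReader reader)
    {
        var separators = new[] { ' ', '\t' };
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (var part in line.Split(separators, StringSplitOptions.RemoveEmptyEntries))
            {
                // Throws FormatException on a malformed token.
                yield return double.Parse(part, CultureInfo.InvariantCulture);
            }
        }
    }

    public static void Main()
    {
        // With a real file you could pass File.OpenText(path) instead.
        using var reader = new StringReader("0 0 1 2.33 4\n2.1 2   11 2");
        foreach (var d in ReadNumbers(reader))
            Console.WriteLine(d.ToString(CultureInfo.InvariantCulture));
    }
}
```

Because the method is an iterator, the numbers are produced lazily while the caller enumerates them.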
You can improve your solution by trying to avoid creating many objects on the heap. Especially buffer.ToString() is called repeatedly and creates new strings. You can use a ReadOnlySpan<char> struct to slice the string and at the same time avoid heap allocations. A span provides pointers into the original string without making copies of it or parts of it when slicing.
Also do not return the doubles as object, as this will box them. I.e., it will store them on the heap. See: Boxing and Unboxing (C# Programming Guide). If you prefer your solution over mine, use an IEnumerable<double> as return type of your method.
The use of ReadOnlySpan, however, has the disadvantage that it cannot be used in iterator methods (at least before C# 13). The reason is that a ReadOnlySpan must live on the stack, but an iterator method wraps its state in a class. If you try, you will get the Compiler Error CS4013:
Instance of type cannot be used inside a nested function, query expression, iterator block or async method
Therefore, we must either store the numbers in a collection or consume them in-place. Since I don't know what you want to do with the numbers, I use the former approach:
public static List<double> ReadNumbers(string input)
{
ReadOnlySpan<char> inputSpan = input.AsSpan();
int start = 0;
bool isNumber = false;
var list = new List<double>(); // Improve by passing the expected maximum length.
int i;
for (i = 0; i < inputSpan.Length; i++) {
char c = inputSpan[i];
bool isDigit = Char.IsDigit(c);
if (isDigit && !isNumber) {
start = i;
isNumber = true;
} else if (isNumber && !isDigit && c != '.') {
isNumber = false;
if (Double.TryParse(inputSpan[start..i], CultureInfo.InvariantCulture, out double d)) {
list.Add(d);
}
}
}
if (isNumber) {
if (Double.TryParse(inputSpan[start..i], CultureInfo.InvariantCulture, out double d)) {
list.Add(d);
}
}
return list;
}
inputSpan[start..i] creates a slice as ReadOnlySpan<char>.
Test
string str = "\n\n\n 1 2 3 \r 2322.2 3 4 \n 0 0 ";
foreach (double d in ReadNumbers(str)) {
Console.WriteLine(d);
}
But whenever you are asking for speed, you must run benchmarks to compare the different approaches. Very often what seems a superior solution may fail in the benchmark.
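As a rough illustration of such a measurement, here is a sketch using Stopwatch. The input is synthetic (an assumption, not the asker's data), the QuickTiming class is made up for this example, and a single timed run like this is only a first impression; a dedicated benchmarking library gives more reliable numbers.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.Linq;

static class QuickTiming
{
    // Times a single run of the given action; crude, but enough for a first comparison.
    public static TimeSpan Time(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Synthetic input: 200,000 numbers separated by single spaces.
        string input = string.Join(" ",
            Enumerable.Range(0, 200_000)
                      .Select(n => (n / 10.0).ToString(CultureInfo.InvariantCulture)));

        double sum = 0;
        TimeSpan elapsed = Time(() =>
        {
            foreach (var part in input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
                sum += double.Parse(part, CultureInfo.InvariantCulture);
        });

        Console.WriteLine($"Split + Parse: {elapsed.TotalMilliseconds} ms (sum = {sum})");
    }
}
```

To compare approaches, the same Time helper can wrap each candidate in turn, ideally after a warm-up run so that JIT compilation is not counted.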
See also: All About Span: Exploring a New .NET Mainstay
I have an array:
string[] arr = new string[2];
arr[0] = "a=01";
arr[1] = "b=02";
How can I take those numbers out and make a new array to store them? What I am expecting is:
int[] newArr = new int[2];
Inside newArr there are 2 elements, one is '01' and the other is '02', both taken from arr.
Another way besides Substring to get the desired result is to use String.Split on the = character. This is assuming the string will always have the format of letters and numbers, separated by a =, with no other = characters in the input string.
int[] newArr = new int[arr.Length];
for (var i = 0; i < arr.Length; i++)
{
// Split the array item on the `=` character.
// This results in an array of two items ("a" and "01" for the first item)
var tmp = arr[i].Split('=');
// If there are fewer than 2 items in the array, there was not a =
// character to split on, so continue to the next item.
if (tmp.Length < 2)
{
continue;
}
// Try to parse the second item in the tmp array (which is the number
// in the provided example input) as an Int32.
int num;
if (Int32.TryParse(tmp[1], out num))
{
// If the parse is successful, assign the int to the corresponding
// index of the new array.
newArr[i] = num;
}
}
This can be shortened with LINQ and a lambda expression, as in the other answer:
var newArr = arr.Select(x => Int32.Parse(x.Split('=')[1])).ToArray();
Though doing it with Int32.Parse can result in an exception if the provided string is not an integer. This also assumes that there is a = character, with only numbers to the right of it.
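If the exception is a concern, one possible sketch keeps the LINQ style but uses TryParse and simply skips malformed items. The SafeExtract class and the choice to skip bad input silently are assumptions for illustration; depending on your needs you may prefer to report such items instead.

```csharp
using System;
using System.Linq;

static class SafeExtract
{
    // Returns the integers found after '=' and skips items that do not parse.
    public static int[] ExtractNumbers(string[] items)
    {
        return items
            .Select(x => x.Split('='))
            .Where(parts => parts.Length == 2)
            .Select(parts => int.TryParse(parts[1], out int n) ? (int?)n : null)
            .Where(n => n.HasValue)
            .Select(n => n.Value)
            .ToArray();
    }

    public static void Main()
    {
        var arr = new[] { "a=01", "b=02", "oops", "c=x" };
        Console.WriteLine(string.Join(",", ExtractNumbers(arr))); // 1,2
    }
}
```

Note that the resulting array can be shorter than the input, so indexes no longer line up with the original array.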
Take a substring and then parse as int.
var newArr = arr.Select(x=>Int32.Parse(x.Substring(2))).ToArray();
As other answers have noted, it's quite compact to use linq. PM100 wrote:
var newArr = arr.Select(x=>Int32.Parse(x.Substring(2))).ToArray();
You asked what x was. That LINQ statement is conceptually equivalent to something like:
List<int> nums = new List<int>();
foreach(string x in arr)
nums.Add(Int32.Parse(x.Substring(2)));
var newArr = nums.ToArray();
It's not exactly the same (internally LINQ probably doesn't use a List), but it embodies the same concept: for each element (called x) in the string array, cut the start off it, parse the result as an int, add it to a collection, and finally convert the collection to an array.
Sometimes I think LINQ is overused; here, efficiency could probably be gained by declaring an int array the same size as the string array and filling it directly, rather than adding to a List or other collection that is later turned into an int array. Proponents of either style can easily be found. LINQ is compact and makes relatively trivial work of longhand constructs such as loops within loops within loops. Though it is not necessarily easy to work out for those unfamiliar with reading it, it does bring a certain self-documenting aspect to code, because it uses English words like Any, Where and Distinct. These convey a concept more quickly than a loop that exits early when a test returns true (Any) or one that builds a dictionary/hashset from all elements and returns it (Distinct).
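For comparison, the direct-fill variant mentioned above could be sketched like this. It assumes every item has the form of one letter, an equals sign, then an integer, as in the question; the DirectFill class name is made up for illustration.

```csharp
using System;

static class DirectFill
{
    // Assumes every item looks like "<one letter>=<integer>".
    public static int[] Extract(string[] arr)
    {
        var result = new int[arr.Length]; // same size as the input, no intermediate list
        for (int i = 0; i < arr.Length; i++)
            result[i] = int.Parse(arr[i].Substring(2));
        return result;
    }

    public static void Main()
    {
        int[] newArr = Extract(new[] { "a=01", "b=02" });
        Console.WriteLine(string.Join(",", newArr)); // 1,2
    }
}
```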
I don't understand the problem. I was trying to do a simple operation:
for(i = x.Length-1, j = 0 ; i >= 0 ; i--, j++)
{
backx[j] = x[i];
}
Both are declared:
String x;
String backx;
What is the problem ? It says the error in the title...
If there is a problem - is there another way to do that?
The result (As the name 'backx' hints) is that backx will contain the string X backwards.
P.S. x is not empty - it contains a substring from another string.
Strings are immutable: you can retrieve the character at a certain position, but you cannot change the character to a new one directly.
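Instead you'll have to build a new string with the change. There are several ways to do this, but StringBuilder does the job in a similar fashion to what you already have (provided backx already holds at least as many characters as x, since the indexer cannot grow the builder):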
StringBuilder sb = new StringBuilder(backx);
sb[j] = x[i];
backx = sb.ToString();
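Put together for the reversal in the question, a minimal sketch could use Append instead of the indexer, so the builder does not need any initial content. The BuilderReverse class name is made up for illustration.

```csharp
using System;
using System.Text;

static class BuilderReverse
{
    public static string Reverse(string x)
    {
        // Pre-size the builder; Append adds characters at the end,
        // so no existing index has to be overwritten.
        var sb = new StringBuilder(x.Length);
        for (int i = x.Length - 1; i >= 0; i--)
            sb.Append(x[i]);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Reverse("abcd")); // dcba
    }
}
```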
EDIT: If you take a look at the string public facing API, you'll see this indexer:
public char this[int index] { get; }
This shows that you can "get" a value, but because no "set" is available, you cannot assign values to that indexer.
EDITx2: If you're looking for a way to reverse a string, there are a few different ways, but here's one example with an explanation as to how it works: http://www.dotnetperls.com/reverse-string
String is immutable in .NET - this is why you get the error.
You can get a reverse string with LINQ:
string x = "abcd";
string backx = new string(x.Reverse().ToArray());
Console.WriteLine(backx); // output: "dcba"
Strings are immutable. You have to convert the string to a char array, and then you will be able to modify it.
Or you can use StringBuilder.
For example:
char[] wordArray = word.ToCharArray();
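The full round trip (string to char array, modify, back to string) can be sketched as follows; the ReplaceAt helper is a hypothetical name used only for this example.

```csharp
using System;

static class CharArrayEdit
{
    public static string ReplaceAt(string word, int index, char replacement)
    {
        char[] wordArray = word.ToCharArray(); // mutable copy of the characters
        wordArray[index] = replacement;        // arrays allow assignment by index
        return new string(wordArray);          // build a new string from the result
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceAt("cat", 0, 'b')); // bat
    }
}
```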
In C# strings are immutable. You cannot "set" the Xth character to whatever you want. If you want to construct a new string, or be able to "edit" a string, use e.g. the StringBuilder class.
Strings are immutable in C#. You can read more about it here: http://msdn.microsoft.com/en-us/library/362314fe.aspx
Both of your variables are strings, but you are treating them as if they were arrays (which, in a sense, they are). Reading a character from a string through the indexer is perfectly valid, but you cannot assign to it that way.
Since you are trying to reverse a string, do take a look at this post. It has lot of information.
public static string ReverseName( string theName)
{
string revName = string.Empty;
foreach (char a in theName)
{
revName = a + revName;
}
return revName;
}
This is simple and does not involve arrays directly.
The code below simply swaps the index of each char in the string which enables you to only have to iterate half way through the original string which is pretty efficient if you're dealing with a lot of characters. The result is the original string reversed. I tested this with a string consisting of 100 characters and it executed in 0.0000021 seconds.
private string ReverseString(string testString)
{
int j = testString.Length - 1;
char[] charArray = new char[testString.Length];
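for (int i = 0; i <= j; i++)
{
// When i == j (the middle of an odd-length string) both lines write the same character,
// so the middle character must not be skipped.
charArray[i] = testString[j];
charArray[j] = testString[i];
j--;
}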
return new string(charArray);
}
If you need to replace the character at, e.g., index 2 of a string, use one of the following (it is ugly, but it works and is easy to maintain). Keep in mind that Replace(char, char) would replace every occurrence of that character and returns a new string rather than changing the original, so Remove and Insert are used here and the result is assigned back.
V1 - you know what you want to put there. In pseudocode: string[2] = 'R';
row3String = row3String.Remove(2, 1).Insert(2, "R");
V2 - you need to put either 'R' or 'Y' there. In pseudocode: string[2] = 'R' if it was 'Y', otherwise 'Y'.
row3String = row3String.Remove(2, 1).Insert(2, row3String[2] == 'Y' ? "R" : "Y");
How can I remove an item from a simple array just once? For example, a char array contains these letters:
a,b,d,a
I would like to remove the letter "a" one time, then the result would be:
a,b,d
Removing an item from an array is not really possible. The size of an array is fixed once allocated, so there is no way to remove an element per se. You can overwrite or clear an element, but the size of the array won't change.
If you want to actually remove an element and change the size of the collection, then you should use List<char> instead of char[]. Then you can use the RemoveAt API:
List<char> list = ...;
list.RemoveAt(3);
If your goal is to just skip the first 'a', then remove the second, you could use something like:
int first = Array.IndexOf(theArray, 'a');
if (first != -1)
{
int second = Array.IndexOf(theArray, 'a', first+1);
if (second != -1)
{
// Take(second) keeps the elements before index second; Skip(second + 1) keeps those after it.
theArray = theArray.Take(second).Concat(theArray.Skip(second + 1)).ToArray();
}
}
If you just need to remove any of the 'a' characters (since you specified that the order is not relevant), you could use:
int index = Array.IndexOf(theArray, 'a');
if (index != -1)
{
// Take(index) keeps the elements before index; Skip(index + 1) keeps those after it.
theArray = theArray.Take(index).Concat(theArray.Skip(index + 1)).ToArray();
}
Note that these don't actually remove the item from the array - they create a new array with that element missing from the newly created array. Since arrays are not designed to change in total length once created, this is typically the best alternative.
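The same "build a new, shorter array" idea can also be written without LINQ. The following is a sketch using Array.Copy; the RemoveFirst helper and the ArrayRemoval class are names made up for this example.

```csharp
using System;

static class ArrayRemoval
{
    // Returns a new array without the first occurrence of value;
    // returns the original array unchanged if value is not present.
    public static char[] RemoveFirst(char[] source, char value)
    {
        int index = Array.IndexOf(source, value);
        if (index == -1) return source;

        var result = new char[source.Length - 1];
        // Copy the elements before the removed one...
        Array.Copy(source, 0, result, 0, index);
        // ...then the elements after it, shifted left by one.
        Array.Copy(source, index + 1, result, index, source.Length - index - 1);
        return result;
    }

    public static void Main()
    {
        char[] letters = { 'a', 'b', 'd', 'a' };
        Console.WriteLine(new string(RemoveFirst(letters, 'a'))); // bda
    }
}
```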
If you will be doing this frequently, you may want to use a collection type that does allow simple removal of elements. Switching from an array to a List<char>, for example, makes removal far simpler, as List<T> supports simple APIs such as List<T>.Remove.
You can use LINQ by trying the following (note that this removes every matching element, not just one):
var someArray= new string[3];
someArray[0] = "a";
someArray[1] = "b";
someArray[2] = "c";
someArray= someArray.Where(sa => !sa.Equals("a")).ToArray();
Please note: this method does not remove the element from the array; it creates a new array that excludes the element. This may have an effect on performance.
You might consider using a list or collection.
static void Main(string[] args)
{
char[] arr = "aababde".ToArray();
arr = RemoveCharacter(arr, 'a');
arr = RemoveCharacter(arr, 'b');
arr = RemoveCharacter(arr, 'd');
arr = RemoveCharacter(arr, 'z');
arr = RemoveCharacter(arr, 'a');
//result is 'a' 'b' 'e'
}
static char[] RemoveCharacter(char[] array, char c)
{
List<char> list = array.ToList();
list.Remove(c);
return list.ToArray();
}
The program below is from the book "Cracking the coding interview", by Gayle Laakmann McDowell.
The original code is written in C.
Here is the original code:
void reverse(char *str) {
char * end = str;
char tmp;
if (str) {
while (*end) {
++end;
}
--end;
while (str < end) {
tmp = *str;
*str++ = *end;
*end-- = tmp;
}
}
}
I am trying to convert it to C#. After researching via Google and playing with the code, below is what I have. I am a beginner and really stuck. I am not getting the value I am expecting. Can someone tell me what I am doing wrong?
class Program
{
unsafe void reverse(char *str)
{
char* end = str;
char tmp;
if (str) // Cannot implicitly convert type 'char*' to 'bool'
{
while(*end) // Cannot implicitly convert type 'char*' to 'bool'
{
++end;
}
--end;
while(str < end)
{
tmp = *str;
*str += *end;
*end -= tmp;
}
}
}
public static void Main(string[] args)
{
}
}
I can't really remember if this ever worked in C#, but I am quite certain it should not work now.
To start off by answering your question: there is no implicit conversion between pointers and bool. You need to write
if(str != null)
Secondly, you can't convert char to bool. Moreover, C# strings are not defined by a terminating character the way C strings are, so this loop cannot be relied on. In C you would write:
while(*end != '\0') // not correct, for illustration only
But a C# string stores its length and may contain '\0' characters anywhere, so '\0' does not reliably mark the end. You will need to take an int parameter for the length.
Going back to the big picture, this sort of code seems like a terribly inappropriate place to start learning C#. It's way too low level, few C# programmers deal with pointers, chars and unsafe contexts on a daily basis.
... and if you must know how to fix your current code, here's a working program:
unsafe public static void Main(string[] args)
{
    // Work on a char[] copy: strings are immutable, so writing through a pointer into one is unsafe.
    char[] chars = "Hello, World".ToCharArray();
    fixed (char* chr = chars)
    {
        reverse(chr, chars.Length);
    }
    Console.WriteLine(new string(chars));
}
// static, so that it can be called from the static Main
unsafe static void reverse(char* str, int length)
{
    if (str != null) // pointers must be compared to null explicitly
    {
        char* end = str + length - 1; // last character; the length replaces the '\0' search
        while (str < end)
        {
            char tmp = *str;
            *str = *end;
            *end = tmp;
            --end;
            ++str;
        }
    }
}
Edit: removed a couple of .Dump() calls, as I was trying it out in LINQPad :)
C# is not like C in that you cannot use an integer or pointer value as an implicit bool. You need to write the comparison explicitly. One example:
if (str != null)
while (*end != 0)
A word of warning: if you are migrating from C, there are a few things that can trip you up in a program like this. The main one is that a C# char is 2 bytes: strings and chars are UTF-16 encoded. The C# equivalent of C's char is byte. Of course, you should use string rather than C-style strings.
Another thing: If you got your char* by converting a normal string to a char*, forget your entire code. This is not C. Strings are not null-terminated.
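The size difference is easy to check for yourself. This small sketch prints the sizes of char and byte and the number of bytes a short ASCII-only string occupies in UTF-16 and in ASCII; the CharSizeDemo class name is made up for illustration.

```csharp
using System;
using System.Text;

static class CharSizeDemo
{
    public static void Main()
    {
        Console.WriteLine(sizeof(char)); // 2
        Console.WriteLine(sizeof(byte)); // 1

        string s = "abc";
        Console.WriteLine(s.Length);                         // 3 (UTF-16 code units)
        Console.WriteLine(Encoding.Unicode.GetByteCount(s)); // 6 (UTF-16 bytes)
        Console.WriteLine(Encoding.ASCII.GetByteCount(s));   // 3 (one byte per character, as in C)
    }
}
```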
Unless this is homework, you would much rather be doing something like this (Reverse and ToArray come from System.Linq):
string foo = "Hello, World!";
string bar = new string(foo.Reverse().ToArray()); // bar now holds "!dlroW ,olleH"
As you discovered, you can use pointers in C#, if you use the unsafe keyword. But you should do that only when really necessary and when you really know what you're doing. You certainly shouldn't use pointers if you're just beginning with the language.
Now, to your actual question: you are given a string of characters and you want to reverse it. Strings in C# are represented as the string class. And string is immutable, so you can't modify it. But you can convert between a string and an array of characters (char[]) and you can modify that. And you can reverse an array by using the static method Array.Reverse(). So, one way to write your method would be:
string Reverse(string str)
{
if (str == null)
return null;
char[] array = str.ToCharArray(); // convert the string to array
Array.Reverse(array); // reverse the array
string result = new string(array); // create a new string out of the array
return result; // and return it
}
If you wanted to write the code that actually does the reversing, you can do that too (as an exercise, I wouldn't do it in production code):
string Reverse(string str)
{
if (str == null)
return null;
char[] array = str.ToCharArray();
// use indexes instead of pointers
int start = 0;
int end = array.Length - 1;
while (start < end)
{
char tmp = array[start];
array[start] = array[end];
array[end] = tmp;
start++;
end--;
}
return new string(array);
}
Try making it more explicit what you are checking in the conditions:
if(str != null)
while(*end != '\0')
You might also want to watch out for that character swapping code: it looks like you've got += and -= in there, which change the characters rather than swap them. If that was meant to update your pointer positions, I'd suggest making those separate operations.