Passing a vector from C# to Python

I use Python.NET to call Python libraries from C#. I am working on a text classification problem: I use FastText to index the texts and obtain their vectors, and scikit-learn to train the classifier (KNN). I ran into many problems during the implementation and solved all of them except one.
After computing the vectors of the texts on which I train the KNN, I save them to a separate text file and load them again when needed.
string loadKnowVec = File.ReadAllText("vectorKnowClass.txt", Encoding.Default);
string[] splitKnowVec = loadKnowVec.Split('\r');
splitKnowVec = splitKnowVec.Where(x => x != "").ToArray();
for (int i = 0; i < splitKnowVec.Length; i++)
{
keyValues_vector.Add(float.Parse(splitKnowVec[i], NumberFormatInfo.InvariantInfo), 1);
}
dynamic X_vec = np.array(keyValues_vector.Keys.ToArray()).reshape(-1, 1);
dynamic y_tag = np.array(keyValues_vector.Values.ToArray());
dynamic neigh = KNN(n_neighbors: 3);
dynamic KnnFit = neigh.fit(X_vec, y_tag);
string predict = neigh.predict("0.00889");
MessageBox.Show("Most likely this is: "+predict);
While training the classifier, I found that what arrives in Python from C# is not a sequence of float values but a single System.Single[] object:
Python.Runtime.PythonException: "TypeError : float() argument must be a string or a number, not 'Single[]'"
At this point the value stored in dynamic X_vec is "System.Single[]" (I think that is exactly the problem).
At first I tried to set the values of X_vec manually, but the error and the stored value were the same.
My next idea was to change the array type using NumPy, but that did not help either:
dynamic Xx = np.array(X_vec, dtype: "float");
dynamic yY = np.array(y_tag, dtype: "int");
Next, I tried creating an empty array in advance and loading specific values into it before changing the data type, but this did not work either.
Perhaps I do not understand how the Visual Studio 2019 IDE and the Python interpreter interact.

I struggled with this issue for a couple of days, and each time I came back to the conclusion that I should read the Python.NET documentation.
In the end I found a solution, and it turned out to be quite simple: X_vec has to be built from a List<float> rather than from a float[]:
List<float> vectors = keyValues_vector.Keys.ToList();
List<int> classTag = keyValues_vector.Values.ToList();
dynamic a = np.array(vectors);
dynamic X_vec = a.reshape(-1, 1);
dynamic y_tag = np.array(classTag);
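The loading step that feeds these lists can be sketched in plain C#, without any Python.NET calls. This is only an illustration under assumptions: the names VectorLoader and ParseVectors are hypothetical, and the file is assumed to hold one number per line with '.' as the decimal separator.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class VectorLoader
{
    // Hypothetical helper: splits the saved text on line breaks and parses each
    // non-empty entry as a float using the invariant culture ('.' as separator).
    public static List<float> ParseVectors(string text)
    {
        var result = new List<float>();
        string[] parts = text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            result.Add(float.Parse(part.Trim(), CultureInfo.InvariantCulture));
        }
        return result;
    }
}
```

The returned List<float> could then be handed to np.array in the same way as the vectors list above, for example after VectorLoader.ParseVectors(File.ReadAllText("vectorKnowClass.txt")).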

Related

How to convert String to One Int

I have the following problem in C# (working in VS Community 2015).
First off, I am fairly new to C#, so excuse me if this question has an easy fix.
I have a contact sensor giving me a string of numbers (a length measurement). I read them with the SerialPort methods and cut them down to the digits I need with Substring (the beginning of the string, "SR00002", is useless to me).
I end up with a string like "000.3422" or "012.2345". Now I want to convert that string to one solid int variable that I can work with, for example to subtract values from it.
Example: I want to calculate 012.234 - 000.3422 (or with , instead of . but I could change that beforehand).
I already tried Parse and Convert.ToInt32 (while iterating through the string), but the end result is always a string.
string b = serialPort2.ReadLine();
string[] b1 = Regex.Split(b, "SR,00,002,");
string b2 = b1[1].Substring(1);
foreach (char c in b2)
{
Convert.ToInt32(c);
}
textBox2.Text = b2 + b2.GetType();
I know that once b2 is an int it cannot be printed in the TextBox directly, but I'll take care of that later.
When everything is converted correctly, I'll move the conversion into its own method =)
The GetType call is just for testing and, as said, shows only System.String (which I don't want). Help would be much appreciated. I also used the search function and Google but couldn't find anything helpful. I wish any possible helpers a nice day. Kind regards, Chris.
Use int.Parse:
int.Parse("123")
You need to assign the converted values to a new variable or array of int (or whatever numeric type you want).
int[] numbers = new int[b2.Length];
for (int i = 0; i < b2.Length; i++)
{
// Note: Convert.ToInt32(char) returns the character code (48 for '0'), not the digit value.
numbers[i] = Convert.ToInt32(b2[i]);
}
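Since the readings contain a fractional part, an int cannot hold them; a decimal (or double) is the closer fit. Below is a minimal sketch of that idea. The class name SensorValue is hypothetical, and it assumes the reading has already been cut down to a form like "012.2345".

```csharp
using System;
using System.Globalization;

public static class SensorValue
{
    // Hypothetical helper: parses a reading such as "012.2345" into a decimal.
    // A comma is normalised to a dot first, so both separators are accepted.
    public static decimal Parse(string reading)
    {
        string normalised = reading.Trim().Replace(',', '.');
        return decimal.Parse(normalised, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

With this, SensorValue.Parse("012.2345") - SensorValue.Parse("000.3422") evaluates to 11.8923, and the result can be shown in the text box via ToString(CultureInfo.InvariantCulture).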

Equivalent to string * 10 from VB6 in C#

I have imported an unmanaged .dll into my project. There is no documentation left,
and the original working code is in VB6, so I am trying to make the C# code as close to the VB6 code as possible.
PROBLEM
I don't know how to convert following code to C#...
Dim ATQ As String * 10
Dim Uid As String * 10
Dim MultiTag As String * 10
NOTE
Q: Some users asked whether I really need fixed-length strings.
A: I already tried plain string in C#, but the results are never written back to these variables. So I think the signature of the DllImport function might be wrong, and I want to mirror what VB6 did because I do not know what the right signature should be.
TRIAL & ERROR
I tried all of this, but it does not work (still no results written to these variables):
Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString ATQ = new Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString(10);
Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString Uid = new Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString(10);
Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString MultiTag = new Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString(10);
You can use Microsoft.VisualBasic.Compatibility:
using VB6 = Microsoft.VisualBasic.Compatibility.VB6;
var ATQ = new VB6.FixedLengthString(10);
var Uid = new VB6.FixedLengthString(10);
var MultiTag = new VB6.FixedLengthString(10);
But it is marked as obsolete and specifically not supported for 64-bit processes, so write your own type that replicates the functionality: truncate long values on assignment and pad short values on the right with spaces. An "uninitialised" value, like the ones above, is filled with nulls.
Sample code from LINQPad (which I can't get to accept using Microsoft.VisualBasic.Compatibility, I think because it is marked obsolete, but I have no proof of that):
var U = new Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString(5);
var S = new Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString(5,"Test");
var L = new Microsoft.VisualBasic.Compatibility.VB6.FixedLengthString(5,"Testing");
Func<string,string> p0=(s)=>"\""+s.Replace("\0","\\0")+"\"";
p0(U.Value).Dump();
p0(S.Value).Dump();
p0(L.Value).Dump();
U.Value="Test";
p0(U.Value).Dump();
U.Value="Testing";
p0(U.Value).Dump();
which has this output:
"\0\0\0\0\0"
"Test "
"Testi"
"Test "
"Testi"
string ATQ;
string Uid;
string MultiTag;
One difference is that, in VB6, I believe the String * 10 syntax may create fixed-size strings. If that's the case, then the padding behavior may be different.
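The truncate-and-pad behaviour described above could be replicated with a small class along these lines. This is a sketch, not the Microsoft.VisualBasic.Compatibility type: the name FixedString is hypothetical, and it only mirrors the behaviour shown in the sample output (NUL-filled when uninitialised, space-padded, truncated).

```csharp
using System;

// Hypothetical stand-in for VB6's "String * N".
public sealed class FixedString
{
    private readonly int _length;
    private string _value;

    public FixedString(int length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        _length = length;
        _value = new string('\0', length); // "uninitialised" content is all NUL characters
    }

    public FixedString(int length, string initial) : this(length)
    {
        Value = initial;
    }

    public string Value
    {
        get { return _value; }
        set
        {
            string s = value ?? string.Empty;
            // Truncate long values, pad short ones on the right with spaces.
            _value = s.Length >= _length ? s.Substring(0, _length) : s.PadRight(_length, ' ');
        }
    }

    public override string ToString()
    {
        return _value;
    }
}
```

Whether such a type helps with the DllImport call is a separate matter, since the DLL's real signature is unknown here. For native functions that write into a caller-supplied buffer, a StringBuilder with a preset capacity is a commonly used parameter type.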

Word Interop Delete results in Wrong Parameter

I have the pleasure of writing some code that moves things around in an Office XP environment. I've referenced the Office XP interop assemblies and written code to search and replace text. That works fine. Now I need to delete text around a bookmark and I keep getting exceptions thrown at me.
Here is some of the code:
object units = WdUnits.wdLine;
object lines = 2;
object extend = WdMovementType.wdExtend;
object bookmarkName = "Bank1";
var bm = doc.Bookmarks;
var bm1 = doc.Bookmarks.get_Item(bookmarkName);
var ra = bm1.Range;
ra.Delete(ref units, ref lines);
The last line is where I get a "Wrong Parameter" exception. Looking at the definition on MSDN, I think I'm right, but obviously I'm not. I hope you can help me out here.
Update: OK, I see. With the Delete method on the Range object I can only use wdWord as a parameter. I'd like to change my question now: what I want to do is delete two lines starting from the bookmark. How would I do this?
Range objects in Word are not "line oriented": they don't allow line operations, only paragraph operations. Selections, however, do allow line operations. The current selection is not a property of the Word document but of the Word application object. Here is some VBA code that does essentially what you are trying to do; I think you can easily port it to C#:
Dim rng As Range
Dim doc As Document
Set doc = ActiveDocument
Set rng = doc.Bookmarks("BM").Range
Dim s As Long, e As Long
rng.Select
s = Application.Selection.Start
e = Application.Selection.Next(wdLine, 1).End
Application.Selection.SetRange s, e
Application.Selection.Delete
OK, I found a way to do what I had to do. Here is the code:
if (doc.Bookmarks.Exists("Bank1"))
{
object bookmarkName = "Bank1";
object units = WdUnits.wdLine;
object lines = 2;
object extend = WdMovementType.wdExtend;
doc.Bookmarks.get_Item(bookmarkName).Select();
app.Selection.MoveDown(units, lines, extend);
app.Selection.Delete();
}

Using a large static array in C# (Silverlight on Windows Phone 7)

I have a question that's so simple I cannot believe I can't answer it myself. But, there you go.
I have a large-ish static list (of cities, latitudes and longitudes) that I want to use in my Windows Phone 7 Silverlight application. There are around 10,000 of them. I'd like to embed this data statically in my application and access it in an array (I need to cycle through the whole list in code pretty regularly).
What is going to be my most effective means of storing this? I'm a bit of an old school sort, so I reckoned the fastest way to do it would be:
public struct City
{
public string name;
public double lat;
public double lon;
};
and then...
private City[] cc = new City[10000];
public CityDists()
{
cc[2].name = "Lae, Papua New Guinea"; cc[2].lat = 123; cc[2].lon = 123;
cc[3].name = "Rabaul, Papua New Guinea"; cc[3].lat = 123; cc[3].lon = 123;
cc[4].name = "Angmagssalik, Greenland"; cc[4].lat = 123; cc[4].lon = 123;
cc[5].name = "Angissoq, Greenland"; cc[5].lat = 123; cc[5].lon = 123;
...
However, this bums out with an "out of memory" error before the code actually gets to run (I'm assuming the code itself ended up being too much to load into memory).
Everything I read online tells me to use an XML resource or file and then to deserialise that into instances of a class. But can that really be as fast as using a struct? Won't the XML take ages to parse?
I think I'm capable of writing the code here - I'm just not sure what the best approach is to start with. I'm interested in speed of load and (more importantly) run time access more than anything.
Any help very much appreciated - first question here so I hope I haven't done anything boneheaded.
Chris
10,000 structs shouldn't run out of memory, but just to make sure, I would first try turning your struct into a class such that it uses the heap instead of the stack. There is a strong possibility that doing that will fix your out of memory errors.
An XML file stored in isolated storage might be a good way to go if your data is going to be updated even every once in a while. You could pull the cities from a web service and serialize those classes to the Application Store in isolated storage whenever they get updated.
Also, I notice in the code samples that the cc array is not declared static. If you have a few instances of CityDists, then that could also be bogging down memory as the array is getting re-created every time a new CityDists class is created. Try declaring your array as static and initializing it in the static constructor:
private static City[] cc = new City[10000];
static CityDists()
{
cc[2].name = "Lae, Papua New Guinea"; cc[2].lat = 123; cc[2].lon = 123;
cc[3].name = "Rabaul, Papua New Guinea"; cc[3].lat = 123; cc[3].lon = 123;
cc[4].name = "Angmagssalik, Greenland"; cc[4].lat = 123; cc[4].lon = 123;
cc[5].name = "Angissoq, Greenland"; cc[5].lat = 123; cc[5].lon = 123;
...
If loading an xml doc from the xap works for you..
Here's a project I posted demonstrating loading of an xml doc from the XAP via XDocument/LINQ and databinding to a listbox for reference.
binding a Linq datasource to a listbox
If you want to avoid the XML parsing and memory overhead, you could store your data in a plain text file and parse the entries with the .NET string functions, e.g. String.Split().
You could also load the file partially to keep memory consumption low: load only k out of n lines of the file, and when you need a record outside the currently loaded segment, load the appropriate segment. You could either do it the old-school way or use the serialization support in .NET.
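The plain-text approach can be sketched roughly as follows. The file format (one city per line, fields separated by '|') and the names CityRecord and CityLoader are assumptions made for illustration; '|' is chosen because the city names in the question contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public struct CityRecord
{
    public string Name;
    public double Lat;
    public double Lon;
}

public static class CityLoader
{
    // Reads one city per line in the assumed format "name|lat|lon".
    public static CityRecord[] Load(TextReader reader)
    {
        var cities = new List<CityRecord>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) continue; // skip blank lines
            string[] fields = line.Split('|');
            if (fields.Length != 3)
            {
                throw new FormatException("Expected name|lat|lon: " + line);
            }
            cities.Add(new CityRecord
            {
                Name = fields[0],
                Lat = double.Parse(fields[1], CultureInfo.InvariantCulture),
                Lon = double.Parse(fields[2], CultureInfo.InvariantCulture)
            });
        }
        return cities.ToArray();
    }
}
```

Load accepts any TextReader, so the same code works with a StreamReader over a resource stream or a StringReader in a test. The result is a plain array that can be cached in a static field and iterated as often as needed.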
Using a file such as XML or a simple delimited file would be a better approach as others have pointed out. However can I also suggest another change to improve the use of memory.
Something like this (although the actual loading should be done using an external file):-
public struct City
{
public string name;
public string country;
public double lat;
public double lon;
}
private static City[] cc = new City[10000];
static CityDists()
{
string[] countries = new string[500];
// Replace following with loading from a "countries" file.
countries[0] = "Papua New Guinea";
countries[1] = "Greenland";
// Replace following with loading from a "cities" file.
cc[2].name = "Lae"; cc[2].country = countries[0]; cc[2].lat = 123; cc[2].lon = 123;
cc[3].name = "Rabaul"; cc[3].country = countries[0]; cc[3].lat = 123; cc[3].lon = 123;
cc[4].name = "Angmagssalik"; cc[4].country = countries[1]; cc[4].lat = 123; cc[4].lon = 123;
cc[5].name = "Angissoq"; cc[5].country = countries[1]; cc[5].lat = 123; cc[5].lon = 123;
}
This increases the size of the structure slightly but significantly reduces the memory used by duplicate country names.
I hear your frustration. Run your code without the debugger, it should work fine. I'm loading 2 arrays in under 3 seconds, each with over 100,000 elements. Debugger reports "Out of Memory", which is simply not the case.
Oh and you are correct about the efficiency. Loading the same information from an XML file was taking over 30 seconds on the phone.
I don't know who was responding to your question but they really should stick to marketing.

How to force c# binary int division to return a double?

How to force double x = 3 / 2; to return 1.5 in x without the D suffix or casting? Is there any kind of operator overload that can be done? Or some compiler option?
Amazingly, it's not so simple to add the casting or suffix for the following reason:
Business users need to write and debug their own formulas. Presently C# is getting used like a DSL (domain specific language) in that these users aren't computer science engineers. So all they know is how to edit and create a few types of classes to hold their "business rules" which are generally just math formulas.
But they always assume that double x = 3 / 2; will return x = 1.5
however in C# that returns 1.
A. they always forget this, waste time debugging, call me for support and we fix it.
B. they think it's very ugly and hurts the readability of their business rules.
As you know, DSL's need to be more like natural language.
Yes. We are planning to move to Boo and build a DSL based on it but that's down the road.
Is there a simple solution to make double x = 3 / 2; return 1.5 by something external to the class so it's invisible to the users?
Thanks!
Wayne
No, there's no solution that can make 3 / 2 return 1.5.
The only workaround, given your constraints, is to discourage users from using integer literals in formulas. Encourage them to use constants or, if they really need literals, to write them with a decimal point.
never say never...
The (double)3/2 solution looks nice...
but it failed for 4+5/6
try this:
donated to the public domain to be used freely by SymbolicComputation.com.
It's alpha but you can try it out, I've only run it on a few tests, my site and software should be up soon.
It uses Microsoft's Roslyn, it'll put a 'd' after every number if all goes well. Roslyn is alpha too, but it will parse a fair bit of C#.
public static String AddDSuffixesToEquation(String inEquation)
{
SyntaxNode syntaxNode = EquationToSyntaxNode(inEquation);
List<SyntaxNode> branches = syntaxNode.DescendentNodesAndSelf().ToList();
List<Int32> numericBranchIndexes = new List<int>();
List<SyntaxNode> replacements = new List<SyntaxNode>();
SyntaxNode replacement;
String lStr;
Int32 L;
for (L = 0; L < branches.Count; L++)
{
if (branches[L].Kind == SyntaxKind.NumericLiteralExpression)
{
numericBranchIndexes.Add(L);
lStr = branches[L].ToString() + "d";
replacement = EquationToSyntaxNode(lStr);
replacements.Add(replacement);
}
}
replacement = EquationToSyntaxNode(inEquation);
List<SyntaxNode> replaceMeBranches;
for (L = numericBranchIndexes.Count - 1; L >= 0; L--)
{
replaceMeBranches = replacement.DescendentNodesAndSelf().ToList();
replacement = replacement.ReplaceNode(replaceMeBranches[numericBranchIndexes[L]],replacements[L]);
}
return replacement.ToString();
}
public static SyntaxNode EquationToSyntaxNode(String inEquation)
{
SyntaxTree tree = EquationToSyntaxTree(inEquation);
return EquationSyntaxTreeToEquationSyntaxNode(tree);
}
public static SyntaxTree EquationToSyntaxTree(String inEquation)
{
return SyntaxTree.ParseCompilationUnit("using System; class Calc { public static object Eval() { return " + inEquation + "; } }");
}
public static SyntaxNode EquationSyntaxTreeToEquationSyntaxNode(SyntaxTree syntaxTree)
{
SyntaxNode syntaxNode = syntaxTree.Root.DescendentNodes().First(x => x.Kind == SyntaxKind.ReturnStatement);
return syntaxNode.ChildNodes().First();
}
simple, if I'm not mistaken:
double x = 3D / 2D;
One solution would be writing a method that does this for them and teach them to use it. Your method would always take in doubles and the answer will always have the correct number of decimals.
I'm not entirely sure, but I believe you can get a double using 3.0/2.0
But if you consider .0 just another way of suffixing, then it's not the answer either :-)
Maybe you can try RPN Expression Parser Class for now or bcParser? These are very small expression parsing libraries.
I like strong, statically typed languages for my own work, but I don't think they're suited for beginners who have no interest in becoming professionals.
So I'd have to say that, unfortunately, your choice of C# might not have been the best for that audience.
Boo seems to be statically typed too. Have you thought about embedding a JavaScript engine, Python, or some other dynamically typed engine? These usually are not that hard to plug into an existing application and you have the benefit of lots of existing documentation.
Perhaps an extension method on Int32?
Preprocess formulas before passing them to the C# compiler. Do something like:
formula = Regex.Replace(formula, @"(^|[\^\s\+\*\/-])(\d+)(?![DF\.])", "$1$2D");
to convert integer literals to double literals.
Alternately, you could use a simple state machine to track whether or not you're in a string literal or comment rather than blindly replacing, but for simple formulas I think a regex will suffice.
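A self-contained variant of the regex idea is sketched below. It is not the pattern from the answer: it swaps in lookbehind and lookahead assertions so that digits inside identifiers, decimal numbers, and already-suffixed literals are left alone. The names FormulaPreprocessor and AddDoubleSuffix are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormulaPreprocessor
{
    // A run of digits that is neither preceded nor followed by a word
    // character or a dot, i.e. a bare integer literal such as 3 or 42.
    private static readonly Regex IntegerLiteral = new Regex(@"(?<![\w.])\d+(?![\w.])");

    // Appends the D suffix to every bare integer literal in the formula.
    public static string AddDoubleSuffix(string formula)
    {
        return IntegerLiteral.Replace(formula, m => m.Value + "D");
    }
}
```

For example, AddDoubleSuffix("4+5/6") yields "4D+5D/6D", while "1.5*x2" is returned unchanged. As with any blind replacement, digits inside string literals or comments and integers that must stay integers (such as array indexes) would also be rewritten, so this only suits simple formulas.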
Try doing it like this:
double result = (double) 3 / 2;
result = 1.5
