I am trying to read an endless stream of XML fragments from a FIFO and convert it to an IObservable<T> using XmlReader on Linux.
My sample code below works on .NET Core 2, but XmlReader.ReadToFollowing never returns false (it stays blocked), even after all resources have been released.
How do I fix this so that OnCompleted is called?
using System;
using System.IO;
using System.Reactive.Concurrency;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using System.Xml;
namespace ConsoleApp
{
internal class Program
{
private static readonly XmlReaderSettings XmlReaderSettings =
new XmlReaderSettings {ConformanceLevel = ConformanceLevel.Fragment /*Async = true, CloseInput = true*/};
private static void Main(string[] args)
{
var fifoPath = "/tmp/fifo";
using (var fifoStream = new FileStream(fifoPath, FileMode.Open))
using (var fifoReader = new StreamReader(fifoStream))
using (var xmlReader = XmlReader.Create(fifoReader, XmlReaderSettings))
{
var disposable = GetObservable(xmlReader)
.SubscribeOn(new EventLoopScheduler())
.Subscribe(Console.WriteLine, Console.WriteLine,() => Console.WriteLine("OnCompleted called."));
Console.ReadLine();
disposable.Dispose();
}
Console.ReadLine();
}
private static IObservable<string> GetObservable(XmlReader xmlReader)
{
return Observable.Create<string>(o =>
{
while (xmlReader.ReadToFollowing("item"))
{
// Actually, parse item element and return it.
o.OnNext("OnNext item.");
}
o.OnCompleted();
Console.WriteLine("OnCompleted.");
return Disposable.Empty;
});
}
}
}
Repro steps
1. Make fifo. mkfifo /tmp/fifo
2. Run sample code.
3. Simulate endless xml. echo "<item/><item/><item/>" > /tmp/fifo
4. Press any key. "OnCompleted" is not shown.
Related
I am trying to read the content of an arrow file but I was not able to find the functions to get the actual data from it. I am not able to find any useful example to read the data too. For example here.
The code example for writing and reading in C#:
// Write
var recordBatch = new Apache.Arrow.RecordBatch.Builder(memoryAllocator)
.Append("Column A", false, col => col.Int32(array => array.AppendRange(Enumerable.Range(5, 15))))
.Build();
using (var stream = File.OpenWrite(filePath))
using (var writer = new Apache.Arrow.Ipc.ArrowFileWriter(stream, recordBatch.Schema, true))
{
await writer.WriteRecordBatchAsync(recordBatch);
await writer.WriteEndAsync();
}
// Read
var reader = Apache.Arrow.Ipc.ArrowFileReader.FromFile(filePath);
var readBatch = await reader.ReadNextRecordBatchAsync();
var col = readBatch.Column(0);
By debugging the code, I can see the values in the col Values property but I have no way of accessing this information in the code.
Am I missing anything or is there a different approach to read the data?
The Apache.Arrow package does not do any compute today. It will read in the file and you will have access to the raw buffers of data. This is sufficient for a number of intermediary tasks (e.g. services that shuttle data to and from or aggregate data files). So if you want to do a lot of operations on the data you may want some kind of dataframe library.
One such library is the Microsoft.Data.Analysis library which has added a DataFrame type which can be created from an Arrow RecordBatch. There is some explanation and examples of the library in this blog post.
I haven't worked with that library much but I was able to put together a short example of reading an Arrow file and printing the data:
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;
using Apache.Arrow.Ipc;
using Microsoft.Data.Analysis;
namespace DataframeExperiment
{
class Program
{
static async Task AsyncMain()
{
using (var stream = File.OpenRead("/tmp/test.arrow"))
using (var reader = new ArrowFileReader(stream))
{
var recordBatch = await reader.ReadNextRecordBatchAsync();
Console.WriteLine("Read record batch with {0} column(s)", recordBatch.ColumnCount);
var dataframe = DataFrame.FromArrowRecordBatch(recordBatch);
var columnX = dataframe["x"];
foreach (var value in columnX)
{
Console.WriteLine(value);
}
}
}
static void Main(string[] args)
{
AsyncMain().Wait();
}
}
}
I created the test file with a small python script:
import pyarrow as pa
import pyarrow.ipc as ipc
tab = pa.Table.from_pydict({'x': [1, 2, 3], 'y': ['x', 'y', 'z']})
with ipc.RecordBatchFileWriter('/tmp/test.arrow', schema=tab.schema) as writer:
writer.write_table(tab)
You could presumably also create the test file using C# with Apache.Arrow's array builders.
Update (Using Apache.Arrow directly)
On the other hand, if you want to use Apache.Arrow directly, and still get access to the data, then you can use typed arrays (e.g. Int32Array, Int64Array). You will first need to determine the type of your array somehow (either through prior knowledge of the schema or as / is style checks or pattern matching).
Here is an example using Apache.Arrow alone:
using System;
using System.IO;
using System.Threading.Tasks;
using Apache.Arrow;
using Apache.Arrow.Ipc;
namespace ArrayValuesExperiment
{
class Program
{
static async Task AsyncMain()
{
using (var stream = File.OpenRead("/tmp/test.arrow"))
using (var reader = new ArrowFileReader(stream))
{
var recordBatch = await reader.ReadNextRecordBatchAsync();
// Here I am relying on the fact that I know column
// 0 is an int64 array.
var columnX = (Int64Array) recordBatch.Column(0);
for (int i = 0; i < columnX.Values.Length; i++)
{
Console.WriteLine(columnX.Values[i]);
}
}
}
static void Main(string[] args)
{
AsyncMain().Wait();
}
}
}
Adding to the second approach proposed by Pace, a utility function like the one below can be used to get the values:
private static dynamic GetArrayData(IArrowArray array)
{
return array switch
{
Int32Array int32array => int32array.Values.ToArray(),
Int16Array int16array => int16array.Values.ToArray(),
StringArray stringArray => stringArray.Values.ToArray(),
FloatArray floatArray => floatArray.Values.ToArray(),
Int64Array int64Array => int64Array.Values.ToArray(),
DoubleArray doubleArray => doubleArray.Values.ToArray(),
Time32Array time32Array => time32Array.Values.ToArray(),
Time64Array time64Array => time64Array.Values.ToArray(),
BooleanArray booleanArray => booleanArray.Values.ToArray(),
Date32Array date32Array => date32Array.Values.ToArray(),
Date64Array date64Array => date64Array.Values.ToArray(),
Int8Array int8Array => int8Array.Values.ToArray(),
UInt16Array uint16Array => uint16Array.Values.ToArray(),
UInt8Array uInt8Array => uInt8Array.Values.ToArray(),
UInt64Array uInt64Array => uInt64Array.Values.ToArray(),
_ => throw new NotImplementedException(),
};
}
then iterate over the recordBatch as
object[,] results = new Object[recordBatch.Length, recordBatch.ColumnCount];
var col = 0;
foreach (var array in recordBatch.Arrays)
{
var row = 0;
foreach (var data in GetArrayData(array))
{
results[row++, col] = data;
}
col++;
}
return results;
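Note, however, that StringArray.Values returns the raw bytes of all the strings concatenated, not the strings themselves. Arrow stores strings as UTF-8, so decode them with Encoding.UTF8 rather than Encoding.Unicode, or let the array decode an individual value for you, for example using
stringArray.GetString(index)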
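To illustrate why the bytes need decoding: Arrow's variable-length string layout keeps every value back to back in one UTF-8 byte buffer, plus an offsets buffer with one more entry than there are strings. The sketch below works on plain arrays rather than Apache.Arrow types. ArrowStringLayout and DecodeStrings are hypothetical names I made up for the illustration, and null handling (the validity bitmap) is left out.

```csharp
using System;
using System.Text;

// Hypothetical helper: mimics the Arrow string layout with plain arrays.
public static class ArrowStringLayout
{
    // Decodes strings stored as one UTF-8 byte buffer plus offsets,
    // where string i spans bytes [offsets[i], offsets[i + 1]).
    public static string[] DecodeStrings(byte[] values, int[] offsets)
    {
        if (offsets.Length == 0)
        {
            return Array.Empty<string>();
        }

        var result = new string[offsets.Length - 1];
        for (int i = 0; i < result.Length; i++)
        {
            int start = offsets[i];
            int length = offsets[i + 1] - start;
            result[i] = Encoding.UTF8.GetString(values, start, length);
        }
        return result;
    }
}
```

Decoding the whole buffer in one call would run all the strings together, which is why the offsets matter. Decoding it as UTF-16 (Encoding.Unicode) would garble the text.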
I am trying to access the macros inside of an Access database (accdb).
I tried using:
using Microsoft.Office.Interop.Access.Dao;
...
DBEngine dbe = new DBEngine();
Database ac = dbe.OpenDatabase(fileName);
I found a container["Scripts"] that had a document["Macro1"] which is my target. I am struggling to access the contents of the document. I also question if the Microsoft.Office.Interop.Access.Dao is the best reference for what I am trying to achieve.
What is the best way to view the content of the macros and modules?
You can skip the DAO part; it's not needed in this case. Macros are project specific, so in order to get them all, you would need to loop through your projects. In my example, I just have one project.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Office.Interop.Access;
namespace Sandbox48
{
public class Program
{
public static void Main(string[] args)
{
Microsoft.Office.Interop.Access.Application oAccess = null;
string savePath = @"C:\macros\";
oAccess = new Microsoft.Office.Interop.Access.Application();
// Open a database in exclusive mode:
oAccess.OpenCurrentDatabase(
@"", //filepath
true //Exclusive
);
var allMacros = oAccess.CurrentProject.AllMacros;
foreach(var macro in allMacros)
{
var fullMacro = (AccessObject)macro;
Console.WriteLine(fullMacro.Name);
oAccess.SaveAsText(AcObjectType.acMacro, fullMacro.FullName, $"{savePath}{ fullMacro.Name}.txt");
}
Console.Read();
}
}
}
I know that most of the time, the Deserialize method of XmlSerializer will complain if there's something wrong (for example, if there is a typo). However, I've found an example where it doesn't complain, when I would have expected it to; and I'd like to know if there's a way of being told about the problem.
The example code below contains three things: a good example which works as expected, an example which would complain (commented out), and an example which does not complain. The last one is the case where I want to be told that something is wrong.
Note: I appreciate that one possible route would be XSD validation; but that really feels like a sledgehammer to crack what seems like a simpler problem. For example, if I was writing a deserializer which had unexpected data that it didn't know what to do with, I'd make my code complain about it.
I've used NUnit (NuGet package) for assertions; but you don't really need it, just comment out the Assert lines - you can see what I'm expecting.
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Serialization;
using NUnit.Framework;
public static class Program
{
public static void Main()
{
string goodExampleXml = @"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weather>Sunny</Weather></Weathers></Example>";
var goodExample = Load(goodExampleXml);
Assert.That(goodExample, Is.Not.Null);
Assert.That(goodExample.Weathers, Is.Not.Null);
Assert.That(goodExample.Weathers, Has.Length.EqualTo(1));
Assert.That(goodExample.Weathers.First(), Is.EqualTo(Weather.Sunny));
string badExampleXmlWhichWillComplainXml = @"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weather>Suny</Weather></Weathers></Example>";
// var badExampleWhichWillComplain = Load(badExampleXmlWhichWillComplainXml); // this would complain, quite rightly, so I've commented it out
string badExampleXmlWhichWillNotComplain = @"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weathe>Sunny</Weathe></Weathers></Example>";
var badExample = Load(badExampleXmlWhichWillNotComplain);
Assert.That(badExample, Is.Not.Null);
Assert.That(badExample.Weathers, Is.Not.Null);
// clearly, the following two assertions will fail because I mis-typed the tag name; but I want to know there has been a problem before this point.
Assert.That(badExample.Weathers, Has.Length.EqualTo(1));
Assert.That(badExample.Weathers.First(), Is.EqualTo(Weather.Sunny));
}
private static Example Load(string serialized)
{
byte[] byteArray = Encoding.UTF8.GetBytes(serialized);
var xmlSerializer = new XmlSerializer(typeof(Example));
using var stream = new MemoryStream(byteArray, false);
return (Example)xmlSerializer.Deserialize(stream);
}
}
public enum Weather
{
Sunny,
Cloudy,
Rainy,
Windy,
Stormy,
Snowy,
}
public class Example
{
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Performance", "CA1819:PropertiesShouldNotReturnArrays", Justification = "Serialized XML")]
[XmlArray("Weathers")]
[XmlArrayItem("Weather")]
public Weather[] Weathers { get; set; }
}
Having looked at Microsoft's published source code for XmlSerializer, it became apparent that there are events that you can subscribe to (which is what I was hoping for); but they aren't exposed on the XmlSerializer itself... you have to pass a struct containing them to the Deserialize method.
So I've been able to modify the code from the question to have an event handler which gets called when an unknown node is encountered (which is exactly what I was after). You need one extra using, over the ones given in the question...
using System.Xml;
and then here is the modified "Load" method...
private static Example Load(string serialized)
{
XmlDeserializationEvents events = new XmlDeserializationEvents();
events.OnUnknownNode = (sender, e) => System.Diagnostics.Debug.WriteLine("Unknown Node: " + e.Name);
var xmlSerializer = new XmlSerializer(typeof(Example));
using var reader = XmlReader.Create(new StringReader(serialized));
return (Example)xmlSerializer.Deserialize(reader, events);
}
So now I just need to do something more valuable than just write a line to the Debug output.
Note that more events are available, as described on the XmlDeserializationEvents page, and I'll probably pay attention to each of them.
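One way to do something more valuable with the event is to collect every report and fail once deserialization has finished. This is only a sketch of how that could be wired up, not a tested production helper. StrictXml, its Deserialize<T> method, the Settings sample type and the message format are all hypothetical names of mine. It subscribes to OnUnknownNode only; the other events on XmlDeserializationEvents can be hooked up the same way.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical helper: turns "unknown node" reports into an exception.
public static class StrictXml
{
    public static T Deserialize<T>(string xml)
    {
        var problems = new List<string>();
        var events = new XmlDeserializationEvents
        {
            // Record each unknown node instead of silently skipping it.
            OnUnknownNode = (sender, e) =>
                problems.Add($"{e.NodeType} '{e.Name}' at line {e.LineNumber}, position {e.LinePosition}")
        };

        var serializer = new XmlSerializer(typeof(T));
        using var reader = XmlReader.Create(new StringReader(xml));
        var result = (T)serializer.Deserialize(reader, events);

        if (problems.Count > 0)
        {
            throw new InvalidDataException("Unexpected XML content: " + string.Join("; ", problems));
        }
        return result;
    }
}

// Hypothetical sample type, used only to demonstrate the helper.
public class Settings
{
    public string Name { get; set; }
}
```

With this in place, StrictXml.Deserialize<Settings>("<Settings><Nam>a</Nam></Settings>") should throw because of the mistyped tag, while the correctly spelled document loads as usual. I chose InvalidDataException so the failure is easy to tell apart from the InvalidOperationException that XmlSerializer itself throws for malformed input.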
I tested the following and it works:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string xml = @"<?xml version=""1.0"" encoding=""utf-8"" ?>
<Example>
<Weathers>Sunny</Weathers>
<Weathers>Cloudy</Weathers>
<Weathers>Rainy</Weathers>
<Weathers>Windy</Weathers>
<Weathers>Stormy</Weathers>
<Weathers>Snowy</Weathers>
</Example>";
StringReader sReader = new StringReader(xml);
XmlReader reader = XmlReader.Create(sReader);
XmlSerializer serializer = new XmlSerializer(typeof(Example));
Example example = (Example)serializer.Deserialize(reader);
}
}
public enum Weather
{
Sunny,
Cloudy,
Rainy,
Windy,
Stormy,
Snowy,
}
public class Example
{
[XmlElement("Weathers")]
public Weather[] Weathers { get; set; }
}
}
A C# Windows application would like to load vector drawings that are stored in loose XAML files without allowing arbitrary code execution.
I am already loading such drawings from resources in linked assemblies over which I have control. However, I would like to also support loading loose XAML files. I imagine you can use XAML access control to limit the objects that can be instantiated in such XAML? Ideally, I would limit the loader to instantiating only the drawing primitives that are in the files we know about. It's ok that it would reject a file that has new drawing primitives in it that we have not whitelisted.
Is this a standard thing already supported by an API? Because I could not find it. Otherwise, does anyone have an example or beginnings of an example? This is for a free open source project and any help getting started would probably cut down the research I need to do by a lot.
The following seems to do a pretty decent job of whitelisting specific types in a XAML load:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Windows.Controls;
using System.Windows.Media;
using System.Xaml;
using System.Xml;
namespace TestXamlLoading
{
internal class SchemaContext : XamlSchemaContext
{
// map from XAML element name to required namespace (currently always the same)
private static readonly Dictionary<string, string> AllowedTypes = new Dictionary<string, string>();
static SchemaContext()
{
// questionable: <Image> is used in some drawing XAML, should review it
foreach (string name in new[]
{
"Canvas", "Compound", "Ellipse", "GradientStop", "GradientStopCollection", "Group", "Line",
"LinearGradientBrush", "MatrixTransform", "Path", "PathGeometry", "Polygon",
"RadialGradientBrush", "Rectangle", "RotateTransform", "ScaleTransform", "SkewTransform", "TextBlock",
"TransformGroup", "TranslateTransform"
})
{
AllowedTypes[name] = "http://schemas.microsoft.com/winfx/2006/xaml/presentation";
}
}
public SchemaContext(IEnumerable<Assembly> referenceAssemblies, XamlSchemaContextSettings settings) : base(
referenceAssemblies, settings)
{
// no code
}
protected override XamlType GetXamlType(string xamlNamespace, string name, params XamlType[] typeArguments)
{
if (!AllowedTypes.TryGetValue(name, out string requiredNamespace) || xamlNamespace != requiredNamespace)
{
throw new Exception($"disallowed instantiation of '{xamlNamespace}' '{name}' from XAML");
}
return base.GetXamlType(xamlNamespace, name, typeArguments);
}
}
internal class Program
{
[STAThreadAttribute]
private static void Main(string[] args)
{
bool shouldFail = TestLoad("..\\..\\..\\badfile.xaml");
Debug.Assert(!shouldFail);
bool shouldSucceed = TestLoad("..\\..\\..\\goodfile.xaml");
Debug.Assert(shouldSucceed);
}
private static bool TestLoad(string path)
{
Stream inputStream = new FileStream(path, FileMode.Open);
XmlReader xmlReader = new XmlTextReader(inputStream);
Assembly[] referenceAssemblies =
{
// these are two separate assemblies which contain all the types we allow
Assembly.GetAssembly(typeof(Canvas)),
Assembly.GetAssembly(typeof(TransformGroup))
};
XamlSchemaContextSettings settings = new XamlSchemaContextSettings();
XamlSchemaContext schemaContext = new SchemaContext(referenceAssemblies, settings);
try
{
XamlReader reader = new XamlXmlReader(xmlReader, schemaContext);
Canvas canvas = (Canvas) System.Windows.Markup.XamlReader.Load(reader);
}
catch (Exception e)
{
Debug.WriteLine(e);
return false;
}
return true;
}
}
}
I have a piece of software that generates code for a C# project based on user actions. I would like to create a GUI to automatically compile the solution so I don't have to load up Visual Studio just to trigger a recompile.
I've been looking for a chance to play with Roslyn a bit and decided to try and use Roslyn instead of msbuild to do this. Unfortunately, I can't seem to find any good resources on using Roslyn in this fashion.
Can anyone point me in the right direction?
You can load the solution by using Roslyn.Services.Workspace.LoadSolution. Once you have done so, you need to go through each of the projects in dependency order, get the Compilation for the project and call Emit on it.
You can get the compilations in dependency order with code like below. (Yes, I know that having to cast to IHaveWorkspaceServices sucks. It'll be better in the next public release, I promise).
using Roslyn.Services;
using Roslyn.Services.Host;
using System;
using System.Collections.Generic;
using System.IO;
class Program
{
static void Main(string[] args)
{
var solution = Solution.Create(SolutionId.CreateNewId()).AddCSharpProject("Foo", "Foo").Solution;
var workspaceServices = (IHaveWorkspaceServices)solution;
var projectDependencyService = workspaceServices.WorkspaceServices.GetService<IProjectDependencyService>();
var assemblies = new List<Stream>();
foreach (var projectId in projectDependencyService.GetDependencyGraph(solution).GetTopologicallySortedProjects())
{
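// No using block here: the stream is kept in the list, so disposing it
// at this point would leave the list holding closed streams.
var stream = new MemoryStream();
solution.GetProject(projectId).GetCompilation().Emit(stream);
assemblies.Add(stream);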
}
}
}
Note1: LoadSolution still does use msbuild under the covers to parse the .csproj files and determine the files/references/compiler options.
Note2: As Roslyn is not yet language complete, there will likely be projects that don't compile successfully when you attempt this.
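For anyone unsure what "dependency order" means here: each project has to be compiled after every project it references, which is a topological sort of the project graph. The sketch below shows the idea with plain strings. It is not Roslyn's implementation, and DependencyOrder and Sort are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: depth-first topological sort over project names.
public static class DependencyOrder
{
    // dependencies maps a project name to the projects it references.
    // The result lists every project after all of its references.
    public static List<string> Sort(IReadOnlyDictionary<string, string[]> dependencies)
    {
        var ordered = new List<string>();
        var done = new HashSet<string>();
        var inProgress = new HashSet<string>();

        void Visit(string project)
        {
            if (done.Contains(project)) return;
            if (!inProgress.Add(project))
            {
                throw new InvalidOperationException("Circular reference involving " + project);
            }
            if (dependencies.TryGetValue(project, out var references))
            {
                foreach (var reference in references) Visit(reference);
            }
            inProgress.Remove(project);
            done.Add(project);
            ordered.Add(project);
        }

        // Visit in a fixed order so the output is deterministic.
        foreach (var project in dependencies.Keys.OrderBy(k => k, StringComparer.Ordinal))
        {
            Visit(project);
        }
        return ordered;
    }
}
```

Given App referencing Core and Data, and Data referencing Core, this puts Core first, then Data, then App, which is the order in which Emit would have to be called.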
I also wanted to compile a full solution on the fly. Building from Kevin Pilch-Bisson's answer and Josh E's comment, I wrote code to compile itself and write it to files.
Software Used
Visual Studio Community 2015 Update 1
Microsoft.CodeAnalysis v1.1.0.0 (Installed using Package Manager Console with command Install-Package Microsoft.CodeAnalysis).
Code
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Emit;
using Microsoft.CodeAnalysis.MSBuild;
namespace Roslyn.TryItOut
{
class Program
{
static void Main(string[] args)
{
string solutionUrl = "C:\\Dev\\Roslyn.TryItOut\\Roslyn.TryItOut.sln";
string outputDir = "C:\\Dev\\Roslyn.TryItOut\\output";
if (!Directory.Exists(outputDir))
{
Directory.CreateDirectory(outputDir);
}
bool success = CompileSolution(solutionUrl, outputDir);
if (success)
{
Console.WriteLine("Compilation completed successfully.");
Console.WriteLine("Output directory:");
Console.WriteLine(outputDir);
}
else
{
Console.WriteLine("Compilation failed.");
}
Console.WriteLine("Press the any key to exit.");
Console.ReadKey();
}
private static bool CompileSolution(string solutionUrl, string outputDir)
{
bool success = true;
MSBuildWorkspace workspace = MSBuildWorkspace.Create();
Solution solution = workspace.OpenSolutionAsync(solutionUrl).Result;
ProjectDependencyGraph projectGraph = solution.GetProjectDependencyGraph();
Dictionary<string, Stream> assemblies = new Dictionary<string, Stream>();
foreach (ProjectId projectId in projectGraph.GetTopologicallySortedProjects())
{
Compilation projectCompilation = solution.GetProject(projectId).GetCompilationAsync().Result;
if (null != projectCompilation && !string.IsNullOrEmpty(projectCompilation.AssemblyName))
{
using (var stream = new MemoryStream())
{
EmitResult result = projectCompilation.Emit(stream);
if (result.Success)
{
string fileName = string.Format("{0}.dll", projectCompilation.AssemblyName);
using (FileStream file = File.Create(outputDir + '\\' + fileName))
{
stream.Seek(0, SeekOrigin.Begin);
stream.CopyTo(file);
}
}
else
{
success = false;
}
}
}
else
{
success = false;
}
}
return success;
}
}
}