How to persuade GetProto to spit out proto3 format - c#

Using the excellent protobuf-net by Marc Gravell, we are able to maintain our types in C# and then export them to .proto files for conversion into all the languages needed by our clients.
However, we would like to use the proto3 format, which is much simpler and less error-prone than proto2, which seems to be the default.
After looking around the net we found this encouraging post from the author that seems to indicate that there is proto3 support: https://github.com/mgravell/protobuf-net/issues/187
However we have not found any documentation for ProtobufNet, and so it is a bit difficult to know how to pull this off. So the question is, how can we have GetProto generate proto3 compatible output for our decorated C# types?

In the current versions there is an optional parameter (technically an overload) that defines the schema version. I think it might even default to proto3.
So... just update? Or worst case: update and specify the optional parameter to GetProto.
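A minimal sketch of the "specify the optional parameter" route. It assumes a protobuf-net version whose Serializer.GetProto<T> has an overload taking a ProtoSyntax value (check your installed version); Customer and SchemaExport are hypothetical names used only for illustration.

```csharp
using ProtoBuf;
using ProtoBuf.Meta;

// Hypothetical contract type; any [ProtoContract] class would do.
[ProtoContract]
public class Customer
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}

public static class SchemaExport
{
    // Asks protobuf-net for the schema text, explicitly requesting proto3
    // instead of relying on whatever the default happens to be.
    public static string BuildProto3Schema()
    {
        return Serializer.GetProto<Customer>(ProtoSyntax.Proto3);
    }
}
```

Calling System.Console.WriteLine(SchemaExport.BuildProto3Schema()) should then print a schema that declares proto3 syntax, which you can save as a .proto file for the other languages.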

Related

How to pass experimental_allow_proto3_optional for C# <proto/> definitions to enable optional in proto3?

I've looked everywhere for this. The C# grpc people don't know how to do it, and point to the grpc/grpc people for the tooling, but you're not allowed to ask questions there. I guess I could phrase this as a feature request, but that feels like cheating. (Please add documentation to show how...)
How does one pass the parameter for this to C# grpc in the <proto> definition so that we can use the optional keyword?
Thanks!
As of January 2021, the only (admittedly hacky) way around this is to make your proto filename (or a directory name) contain the string test_proto3_optional, as pointed out by the protobuf documentation:
If you try to run protoc on a file with proto3 optional fields, you will get an error because the feature is still experimental. [...] There are two options for getting around this error:
Pass --experimental_allow_proto3_optional to protoc.
Make your filename (or a directory name) contain the string test_proto3_optional. This indicates that the proto file is specifically for testing proto3 optional support, so the check is suppressed.
For more information see #977 (grpc-dotnet), #19164 (AspNetCore.Docs) and #23686 (grpc) issues.
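To make the filename workaround concrete, here is a sketch of what such a file might look like. The file name, message and field names are hypothetical; the only part taken from the quoted documentation is that the name contains test_proto3_optional.

```protobuf
// File: Protos/test_proto3_optional_person.proto  (hypothetical name;
// the "test_proto3_optional" substring is what suppresses the check)
syntax = "proto3";

message Person {
  string name = 1;
  // Field with explicit presence, which would otherwise trigger the
  // "experimental" error on affected protoc versions.
  optional int32 age = 2;
}
```

The Protobuf item in the .csproj then simply points at the renamed file (for example Include="Protos\test_proto3_optional_person.proto"), with no extra attribute, since it is the filename itself that suppresses the check.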

Alternative to XML Documentation Comments in C#

When asking around for the conventions of documentation comments in C# code, the answer always leads to using XML comments. Microsoft recommends this approach themselves as well. https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/xmldoc/recommended-tags-for-documentation-comments
/// <summary>
/// This is an XML comment.
/// </summary>
void Foo();
However, when inspecting Microsoft's code, such as ASP.NET Core, comments instead look like this.
//
// Summary:
// A builder for Microsoft.AspNetCore.Hosting.IWebHost.
public interface IWebHostBuilder
Does the included doc generation tool work with this convention, or is there a documentation generation tool that uses this convention instead of XML? Why does Microsoft use this convention in their code instead of the XML comments they recommend themselves?
Why does Microsoft use this convention in their code instead of the XML comments they recommend themselves?
C# documentation comments provide a precise syntax for encoding many types of content and references, such as to types, parameters, URLs, and other documentation files. They use XML to accomplish this, and so inherit XML's verbosity. Remember that XML comments go way back to C# version 1, when it was a much more verbose language than it is today.
To avoid the readability problems with XML, Visual Studio displays the comments in a simplified, plain text format. You couldn't run this format back through a compiler. For example, if a comment has the term customerId, it may be ambiguous as to whether it refers to a method parameter or a class field. The ambiguity occurs infrequently enough to not be a problem for a human.
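A small sketch of the customerId ambiguity described above. OrderService and its members are hypothetical; the point is only that the XML form says which customerId is meant, while a plain-text rendering would show the same word twice.

```csharp
public class OrderService
{
    /// <summary>The customer whose orders were loaded most recently.</summary>
    public int customerId;

    /// <summary>
    /// Loads orders for <paramref name="customerId"/> and remembers the value
    /// in the field <see cref="OrderService.customerId"/>.
    /// </summary>
    /// <param name="customerId">Identifier of the customer to load.</param>
    /// <returns>The identifier that was stored in the field.</returns>
    public int LoadOrders(int customerId)
    {
        // The parameter shadows the field, so "this." is needed for the field.
        this.customerId = customerId;
        return this.customerId;
    }
}
```

In the XML, paramref and see cref are distinct tags, so a tool can link each mention to the right symbol. Flattened to plain text, both mentions would just read "customerId", which is fine for a human reader but not something a compiler could parse back.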
Ideally, there'd be a single format that was well-defined for compiler input, with good readability and without boilerplate. There is an issue open to modernize the comment syntax, but unfortunately, it hasn't gone anywhere in 3 years.

C# Alias an Attribute (such as Inline Hinting)

I've been wanting for a while to shorten the (no-using-pollution) "inline" attribute from the absurdity that is:
[System.Runtime.CompilerServices.MethodImpl(System.Runtime.CompilerServices.MethodImplOptions.AggressiveInlining)]
to, well, [InlineHint] or [MyCompilerSvcs.InlineHint] or similar- something both quicker to type and quicker to mentally parse.
Is there any way to actually do that? Right now the only "sane" options I can see are to either add using System.Runtime.CompilerServices; (which is a pain when dealing with code behind of an ASP.NET website), adding more specific using aliases (even worse), or to keep it long-form in an accessible location for copy-paste.
Providing this question from 2009 isn't too outdated, using seems to be the only way to shorten how large the attribute reference is (as nobody suggested a more keyword-like variant for large, multifile projects). This related question was from 2010, and also suggests a using trick.
From 2015, there was this question, but it was in reference to the resulting decorations. Since what I'm interested in is the compiler directives themselves (and a performance-based one at that!) I doubt a runtime IL Emit could do this, and a "code bridge" doesn't quite naturally extend to compiler services in my mind.
Currently targeting .NET 4.5, but newer versions are not forbidden.
And before "The compiler does inlining automatically!": it only does so for methods of 32 bytes of IL or less, and the inline hint overrides the size restriction. There are also other options which could be useful to have more accessible, such as NoOptimization, NoInlining, and Synchronized, all of which I would very much like to access without typing absurdly long attributes or adding using statements.
You can write a Roslyn-based tool to do that. This lets you apply an attribute with a name of your choice (some short name such as AggInline), and the tool will emit the actual AggressiveInlining attribute and the required using directives. You can see the ImmutableObjectGraph tool as an example of how to do something like that in Roslyn.
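For comparison, this is roughly what the plain using-alias trick mentioned in the question looks like without any extra tooling. It is not the Roslyn approach from the answer, just a sketch of the baseline; the alias names Inline and InlineOptions and the MathHelpers class are made up for illustration.

```csharp
// Aliases are per file, which is exactly the drawback the question raises.
using Inline = System.Runtime.CompilerServices.MethodImplAttribute;
using InlineOptions = System.Runtime.CompilerServices.MethodImplOptions;

public static class MathHelpers
{
    // Equivalent to the long-form [MethodImpl(MethodImplOptions.AggressiveInlining)].
    [Inline(InlineOptions.AggressiveInlining)]
    public static int Square(int x)
    {
        return x * x;
    }
}
```

The attribute still has to be given its options argument, so this only shortens the namespace noise; a generated-code approach is what would get you down to a bare [InlineHint].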

Proto2 vs. Proto3 in C#

I have to send messages to another team using the proto2 version of Google Protocol Buffers. They are using Java and C++ on Linux. I'm using C# on Windows.
Jon Skeet's protobuf-csharp-port (https://github.com/jskeet/protobuf-csharp-port) supports proto2. If I understand correctly, Google has taken this code and folded an updated version of it into the main protobuf project (https://github.com/google/protobuf/tree/master/csharp). But it no longer supports proto2 for C#, only proto3.
I'm not sure which project I should use. It seems like the new one will be better supported (performance, support for proto3 if the other team ever upgrades). But I would have to convert the .proto file that I was given from proto2 to proto3 and risk any issues that come with that.
I've read that for the most part, the messages for proto2 and proto3 are compatible. I have no experience with Protocol Buffers, but the .proto file I'm working with looks pretty vanilla, no default values or oneof or nested anything. So it seems like I could just delete their "required" and "optional" keywords and use the new library, treating this as a proto3 file.
In your opinion, is it worth the hassle to use the newer library? Is there a list of proto features that would make the proto2 and proto3 messages incompatible?
If the other team has any required fields and you send messages to them without specifying those fields (or even explicitly specifying the default value, for primitives) then the other end will fail to receive the messages - they won't validate.
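To illustrate the required-field problem, here is a sketch of the same hypothetical message written both ways. Reading, sensor_id and label are invented names, and the two schemas are meant as two separate files shown together.

```protobuf
// --- their_schema.proto (what the other team compiles, proto2) ---
syntax = "proto2";

message Reading {
  required int32 sensor_id = 1;  // receiver rejects messages lacking this
  optional string label = 2;
}

// --- my_schema.proto (the "just delete the keywords" conversion) ---
// syntax = "proto3";
//
// message Reading {
//   int32 sensor_id = 1;  // a value of 0 is the default and is not written
//   string label = 2;
// }
```

With the proto3 version, a Reading whose sensor_id is 0 is serialized without field 1 at all, so the proto2 side sees a missing required field and fails validation, even though you set the value explicitly.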
There are various differences between proto2 and proto3 - some are listed on the releases page:
The following are the main new features in language version 3:
Removal of field presence logic for primitive value fields, removal of required fields, and removal of default values. This makes proto3 significantly easier to implement with open struct representations, as in languages like Android Java, Objective C, or Go.
Removal of unknown fields.
Removal of extensions, which are instead replaced by a new standard type called Any.
Fix semantics for unknown enum values.
Addition of maps.
Addition of a small set of standard types for representation of time, dynamic data, etc.
A well-defined encoding in JSON as an alternative to binary proto encoding.
The removal of unknown fields could be a significant issue for you - if the other team expects to be able to send you a message with some fields your code is unaware of, and expects you to return a message to them with those fields preserved, proto3 could pose problems for you.
If you can use proto3, I'd suggest using the proto3 version, partly as it will have proper support whereas the proto2 version is basically in maintenance mode. There are significant differences between the two, primarily in terms of mutability - the generated message classes in the proto3 codebase are mutable, which is great for immediate usability, but can pose challenges in other areas.

Which language idioms/paradigms/features make it hard to add support for "type providers"?

F# 3.0 has added type providers.
I wonder if it is possible to add this language feature to other languages running on the CLR like C# or if this feature only works well in a more functional/less OO programming style?
As Tomas says, it is theoretically straightforward to add this kind of feature to any statically-typed language (though still a lot of grunt-work).
I am not a meta-programming expert, but @SK-logic asks why not a general compile-time meta-programming system instead, and I shall try to answer. I don't think you can easily achieve what you can do with F# type providers using meta-programming, because F# type providers can be lazy and dynamically interactive at design-time. Let's give an example that Don has demoed in one of his earlier videos: a Freebase type provider. Freebase is kind of like a schematized, programmable Wikipedia; it has data on everything. So you can end up writing code along the lines of
for e in Freebase.Science.``Chemical Elements`` do
    printfn "%d: %s - %s" e.``Atomic number`` e.Name e.Discoverer.Name
or whatnot (I don't have the exact code offhand), but just as easily write code that gets information about baseball statistics, or when famous actors have been in drug rehab facilities, or a zillion other types of information available through Freebase.
From an implementation point of view, it is infeasible to generate a schema for all of Freebase and bring it into .NET a priori; you can't just do one compile-time step at the beginning to set all this up. You can do this for small data sources, and in fact many other type providers use this strategy, e.g. a SQL type provider gets pointed at a database, and generates .NET types for all the types in that database. But this strategy does not work for large cloud data stores like Freebase, because there are too many interrelated types: if you tried to generate .NET metadata for all of Freebase, you'd find that there are so many millions of types (one of which is ChemicalElement with AtomicNumber and Discoverer and Name and many other fields, but there are literally millions of such types) that you need more memory than is available to a 32-bit .NET process just to represent the entire type schema.
So the F# type-provider strategy is an API architecture that allows type providers to supply information on-demand, running at design-time within the IDE. Until you type e.g. Freebase.Science., the type provider does not need to know about the entities under the science categories, but once you do press . after Science, then the type provider can go and query the APIs to learn one more level of the overall schema, to know what categories exist under Science, one of which is ChemicalElements. And then as you try to "dot into" one of those, it will discover that elements have atomic numbers and what-not. So the type provider lazily fetches just enough of the overall schema to deal with the exact code the user happens to be typing into the editor at that moment in time. As a result, the user still has the freedom to explore any part of the universe of information, but any one source code file or interactive session will only explore a tiny fraction of what is available. When it comes time to compile/codegen, the compiler need only generate enough code to accommodate exactly the bits that the user has actually used in their code, rather than the potentially huge runtime bits needed to afford the possibility of talking to the whole data store.
(Maybe you can do that with some of today's meta-programming facilities now, I don't know, but the ones I learned about in school a long while back could not have easily handled this.)
As Brian and Tomas point out, there's nothing particularly "functional" about this feature. It's just a particularly slick way to provide metadata to the compiler.
The C# design team has been kicking around ideas like this for a long time. There was a proposal a few years before I joined the C# team for a feature that was going to be called "type blueprints" (or something like that) whereby a combination of XML documents, XML schema and custom code that proffered up type metadata could be used by the C# compiler. I don't recall the details and it never came to fruition, obviously. (Though it did influence the design and the implementation of the Visual Studio Tools for Office document format, which I was working on at the time.)
In any event, we have no plans on the immediate horizon for adding such a feature to C#, but we are watching with great interest to see if it does a good job of solving customer problems in F#.
(As always, Eric's musings about possible future features of unannounced and entirely hypothetical products are for entertainment purposes only.)
I don't see any technical reason why something like type providers couldn't be added to C# or similar languages. The only family of languages that makes it difficult to add type providers (in a similar way as in F#) is dynamically typed languages.
F# type providers rely on the fact that the type information generated by the provider propagates nicely through the program, and the editor can use it to show useful IntelliSense. In dynamically typed languages, this would require more elaborate IDE support (and "type providers" for dynamic languages reduce to just IDE or IntelliSense support).
Why are they implemented directly as a feature of F#? I think the meta-programming system would have to be really complex (note that the types are not actually generated) to support this. The other things that could be done using it wouldn't contribute to the F# language that much (they would only make it too complex, which is a bad thing). However, you could get a similar thing if you had some sort of compiler extensibility.
In fact, I think this is how the C# team will add something like type providers in the future (they talked about compiler extensibility for some time now).
