JSON serialize error via RecyclableMemoryStream - C#

I use code like this to replace the default JSON serialization method:
private static readonly RecyclableMemoryStreamManager _recyclableMemoryStreamManager =
    new RecyclableMemoryStreamManager(blockSize: 128 * 1024, largeBufferMultiple: 1024 * 1024, maximumBufferSize: 128 * 1024 * 1024);

private ByteArrayContent Serialize(object content, JsonSerializerSettings serializerSettings, Encoding encoding, string mediaType)
{
    var jsonSerializer = Newtonsoft.Json.JsonSerializer.Create(serializerSettings);
    using (var memoryStream = _recyclableMemoryStreamManager.GetStream())
    {
        using (var textWriter = new StreamWriter(memoryStream, encoding, 1024, true))
        {
            using (var jsonTextWriter = new JsonTextWriter(textWriter) { CloseOutput = false })
            {
                jsonSerializer.Serialize(jsonTextWriter, content);
                jsonTextWriter.Flush();
                var arraySegment = new ArraySegment<byte>(memoryStream.GetBuffer(), 0, (int)memoryStream.Length);
                var resContent = new ByteArrayContent(arraySegment.Array, arraySegment.Offset, arraySegment.Count);
                resContent.Headers.ContentType = new MediaTypeHeaderValue(mediaType);
                return resContent;
            }
        }
    }
}
But sometimes the HTTP response contains JSON with a syntax error: garbage appears after the valid document, for example:
{
"code": 0,
"msg": null,
"data": [
// ....
]
}
')","foo":"","bar":"baz","flag":0,')","foo":"","bar":"baz","flag":0,
')","foo":"","bar":"baz","flag":0,')","foo":"","bar":"baz","flag":0,
How can I fix this?
I think it may be a buffer-reuse error. Maybe I can change the settings of RecyclableMemoryStreamManager?
_recyclableMemoryStreamManager.AggressiveBufferReturn = true;

The buffer from GetBuffer() is only well-defined for the lifetime of the stream; you dispose the stream when the method exits the using block for memoryStream, which means those buffers are now up for grabs for re-use.
You may wish to use StreamContent instead; it accepts a Stream of the payload and disposes it when sent, which gives you the exact semantics you want here. Note: don't dispose memoryStream yourself - remove that using (perhaps adding a catch block that does memoryStream?.Dispose(); throw;).
Note also that GetBuffer() is not necessarily the optimal API for RecyclableMemoryStream, since it may use multiple discontiguous buffers internally; there should be a ReadOnlySequence<byte> GetReadOnlySequence() API which allows that usage - however, this still has the same lifetime limitations impacting buffer re-use, so: it wouldn't change anything here.
Untested, but for consideration:
private HttpContent Serialize(object content, JsonSerializerSettings serializerSettings, Encoding encoding, string mediaType)
{
    var jsonSerializer = JsonSerializer.Create(serializerSettings);
    var memoryStream = _recyclableMemoryStreamManager.GetStream();
    try
    {
        using (var textWriter = new StreamWriter(memoryStream, encoding, 1024, true))
        {
            using var jsonTextWriter = new JsonTextWriter(textWriter) { CloseOutput = false };
            jsonSerializer.Serialize(jsonTextWriter, content);
            jsonTextWriter.Flush();
        }
        memoryStream.Position = 0; // rewind
        var resContent = new StreamContent(memoryStream);
        resContent.Headers.ContentType = new MediaTypeHeaderValue(mediaType);
        return resContent;
    }
    catch
    {
        memoryStream?.Dispose();
        throw;
    }
}
However, I would expect it would be better to serialize directly to the output via the inbuilt JSON media encoder, rather than using an intermediate buffer.
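One way to serialize straight to the transport without any intermediate buffer is a custom HttpContent that writes the JSON on demand when the request is sent. The sketch below is my own illustration, not a library type; the NewtonsoftJsonContent name and shape are made up:

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;

var content = new NewtonsoftJsonContent(new { a = 1 }, null, "application/json");
Console.WriteLine(await content.ReadAsStringAsync()); // {"a":1}

// A content type that serializes straight to the transport stream when sent,
// so no buffer has to outlive this object.
public class NewtonsoftJsonContent : HttpContent
{
    private readonly object _value;
    private readonly JsonSerializer _serializer;

    public NewtonsoftJsonContent(object value, JsonSerializerSettings settings, string mediaType)
    {
        _value = value;
        _serializer = JsonSerializer.Create(settings);
        Headers.ContentType = new MediaTypeHeaderValue(mediaType);
    }

    // Called by HttpClient when the request body is actually written out.
    protected override Task SerializeToStreamAsync(Stream stream, TransportContext context)
    {
        using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true))
        using (var jsonWriter = new JsonTextWriter(textWriter) { CloseOutput = false })
        {
            _serializer.Serialize(jsonWriter, _value);
        }
        return Task.CompletedTask;
    }

    protected override bool TryComputeLength(out long length)
    {
        length = -1; // not known up front; HttpClient falls back to chunked transfer
        return false;
    }
}
```

The same effect is available out of the box via the Web API client's ObjectContent/JsonMediaTypeFormatter, if that package is already a dependency.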

Related

Invalid data while decoding a byte[] using gzip

I have a message payload and one of the properties is "contents", which contains gzip-compressed data. I've extracted the "contents" property from the message and published it to another queue. That next queue then needs to decompress the byte[] and extract the original string.
I get an "Invalid data while decoding" exception when decompressing the byte[] received from the previous queue. I've actually taken the gzip-compressed data and run it through an online gzip decompressor to check that it is valid, and it is. I'm not sure why my code is unable to decompress the byte[] even though an online gzip decompressor can do it without issue.
Code for how I am compressing and sending the original message (I am using a dynamic object because I am extracting other properties which are not shown in this snippet):
dynamic messageObject = new ExpandoObject();
messageObject.PreviousWorkerResult = message.PreviousWorkersResults;
messageObject.Contents = GzipHelper.Compress(Encoding.UTF8.GetBytes(serializedMessage));
var serializedExpandoObject = JsonConvert.SerializeObject(messageObject, new JsonSerializerSettings
{
    NullValueHandling = NullValueHandling.Ignore,
    ContractResolver = new CustomCamelCasePropertyNamesContractResolver(),
});
body = Encoding.UTF8.GetBytes(serializedExpandoObject);
_bus.Advanced.Publish(exchange,
    queueName,
    mandatory: true,
    messageProperties: properties,
    body: body);
Code for extracting the gzipped compressed data property "contents":
dynamic data = JsonConvert.DeserializeObject(message.Payload);
string contents = data.contents;
var properties = new MessageProperties()
{
    ContentType = "gzip"
};
await messageQueuePublisher.PublishAsync(contents, dto.SendTo, properties);
Code for the receiving end to decompress. This is where the exception is being thrown on the GzipHelper.Decompress(body):
if (properties.ContentType == "gzip")
{
    var decompressed = GzipHelper.Decompress(body);
    jsonMessage = Encoding.UTF8.GetString(decompressed);
}
Code for the Compress and Decompress in GzipHelper:
public static byte[] Compress(byte[] data)
{
    using (var memory = new MemoryStream())
    {
        using (var gzip = new GZipStream(memory, CompressionLevel.Optimal))
        {
            gzip.Write(data, 0, data.Length);
        }
        return memory.ToArray();
    }
}

public static byte[] Decompress(byte[] data)
{
    try
    {
        using (var compressedStream = new MemoryStream(data))
        using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
        using (var resultStream = new MemoryStream())
        {
            zipStream.CopyTo(resultStream);
            return resultStream.ToArray();
        }
    }
    catch (Exception e)
    {
        throw new Exception(e.ToString());
    }
}
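For what it's worth, the Compress/Decompress pair round-trips correctly when the compressed bytes reach Decompress unchanged, which suggests the bytes are being altered somewhere between the two queues (for example by a string conversion). A minimal self-contained check, using the same method bodies in a hypothetical harness:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

// Same bodies as GzipHelper above, inlined so this compiles standalone.
static byte[] Compress(byte[] data)
{
    using (var memory = new MemoryStream())
    {
        using (var gzip = new GZipStream(memory, CompressionLevel.Optimal))
        {
            gzip.Write(data, 0, data.Length);
        }
        return memory.ToArray();
    }
}

static byte[] Decompress(byte[] data)
{
    using (var compressedStream = new MemoryStream(data))
    using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (var resultStream = new MemoryStream())
    {
        zipStream.CopyTo(resultStream);
        return resultStream.ToArray();
    }
}

var original = Encoding.UTF8.GetBytes("{\"hello\":\"gzip\"}");
var roundTripped = Decompress(Compress(original));
// The round trip only succeeds if the exact compressed bytes reach Decompress.
Console.WriteLine(roundTripped.SequenceEqual(original)); // True
```

If this passes locally but the queued message fails, comparing the byte[] published with the byte[] received is the next step.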

Value already read, or no value when trying to read from a Stream

I've been trying this for a long time but it keeps giving me an error. I have an array of bytes that should represent an NBT document. I would like to convert this into a C# object with a library: fNbt.
Here is my code:
byte[] buffer = Convert.FromBase64String(value);
byte[] decompressed;
using (var inputStream = new MemoryStream(buffer))
{
    using var outputStream = new MemoryStream();
    using (var gzip = new GZipStream(inputStream, CompressionMode.Decompress, leaveOpen: true))
    {
        gzip.CopyTo(outputStream);
    }
    fNbt.NbtReader reader = new fNbt.NbtReader(outputStream, true);
    var output = reader.ReadValueAs<AuctionItem>(); // Error: Value already read, or no value to read.
    return output;
}
When I try this, it works:
decompressed = outputStream.ToArray();
outputStream.Seek(0, SeekOrigin.Begin);
outputStream.Read(new byte[1000], 0, decompressed.Count() - 1);
But when I try this, it doesn't:
outputStream.Seek(0, SeekOrigin.Begin);
fNbt.NbtReader reader = new fNbt.NbtReader(outputStream, true);
reader.ReadValueAs<AuctionItem>();
NbtReader, like most stream readers, begins reading from the current position of whatever stream you give it. Since you have just finished writing to outputStream, that position is the stream's end, which means at that point there's nothing left to read.
The solution is to seek the outputStream back to the beginning before reading from it:
outputStream.Seek(0, SeekOrigin.Begin); // <-- seek to the beginning
// Do the read
fNbt.NbtReader reader = new fNbt.NbtReader(outputStream, true);
var output = reader.ReadValueAs<AuctionItem>(); // No error anymore
return output;
The solution is as follows: NbtReader.ReadValueAs does not consider an NbtCompound or NbtList a value. I made this little reader, but it is not done yet (I will update the code once it is done).
public static T ReadValueAs<T>(string value) where T : new()
{
    byte[] buffer = Convert.FromBase64String(value);
    using (var inputStream = new MemoryStream(buffer))
    {
        using var outputStream = new MemoryStream();
        using (var gzip = new GZipStream(inputStream, CompressionMode.Decompress, leaveOpen: true))
        {
            gzip.CopyTo(outputStream);
        }
        outputStream.Seek(0, SeekOrigin.Begin);
        return new EasyNbt.NbtReader(outputStream).ReadValueAs<T>();
    }
}
This is the NbtReader:
private MemoryStream MemStream { get; set; }

public NbtReader(MemoryStream memStream)
{
    MemStream = memStream;
}

public T ReadValueAs<T>() where T : new()
{
    return ReadTagAs<T>(new fNbt.NbtReader(MemStream, true).ReadAsTag());
}

private T ReadTagAs<T>(fNbt.NbtTag nbtTag)
{
    // Reads to the root and adds to T...
}

JToken.WriteToAsync does not write to JsonWriter

I'm trying to create a middleware that changes the request in a certain way. I am able to read it and change the content, but I cannot figure out how to correctly set up the stream writers to create a new body. When I call normalized.WriteToAsync(jsonWriter), the MemoryStream remains empty, and consequently I receive the "A non-empty request body is required." exception. What am I missing here? This is what I have so far:
public async Task Invoke(HttpContext context)
{
    if (context.Request.ContentType == "application/json" && context.Request.ContentLength > 0)
    {
        using var scope = _logger.BeginScope("NormalizeJson");
        try
        {
            using var requestReader = new HttpRequestStreamReader(context.Request.Body, Encoding.UTF8);
            using var jsonReader = new JsonTextReader(requestReader);
            var json = await JToken.LoadAsync(jsonReader);
            var normalized = _normalize.Visit(json); // <-- Modify json and return JToken

            // Create new Body
            var memoryStream = new MemoryStream();
            var requestWriter = new StreamWriter(memoryStream);
            var jsonWriter = new JsonTextWriter(requestWriter);
            await normalized.WriteToAsync(jsonWriter); // <-- At this point the MemoryStream still has length 0.
            var content = new StreamContent(memoryStream.Rewind()); // <-- Use helper extension to Seek.Begin = 0
            context.Request.Body = await content.ReadAsStreamAsync();
        }
        catch (Exception e)
        {
            _logger.Scope().Exceptions.Push(e);
        }
    }
    await _next(context);
}
Demo for LINQPad etc.:
async Task Main()
{
    var token = JToken.FromObject(new User { Name = "Bob" });
    var memoryStream = new MemoryStream();
    var requestWriter = new StreamWriter(memoryStream);
    var jsonWriter = new JsonTextWriter(requestWriter);
    await token.WriteToAsync(jsonWriter);
    memoryStream.Length.Dump(); // <-- MemoryStream.Length = 0
}

public class User
{
    public string Name { get; set; }
}
You need to properly flush and close your JsonTextWriter and StreamWriter in order to fully populate the memoryStream, like so:
var memoryStream = new MemoryStream();
// StreamWriter implements IAsyncDisposable.
// Leave the underlying stream open.
await using (var requestWriter = new StreamWriter(memoryStream, leaveOpen: true))
{
    var jsonWriter = new JsonTextWriter(requestWriter); // But JsonTextWriter does not implement IAsyncDisposable, only IDisposable!
    try
    {
        await token.WriteToAsync(jsonWriter);
    }
    finally
    {
        await jsonWriter.CloseAsync();
    }
}
Demo fiddle #1 here.
Or, since you're writing to a MemoryStream, there's really no need to use async at all, and instead you can do:
var memoryStream = new MemoryStream();
using (var requestWriter = new StreamWriter(memoryStream, leaveOpen: true)) // Leave the underlying stream open
using (var jsonWriter = new JsonTextWriter(requestWriter))
{
    token.WriteTo(jsonWriter);
}
Demo fiddle #2 here.
Notes:
Note the use of await using for the StreamWriter. This syntax guarantees that the StreamWriter will be flushed and closed asynchronously, and can be used on any object that implements IAsyncDisposable. (This only really matters if you were writing to a file stream or other non-memory stream.)
It seems that neither JsonTextWriter nor the base class JsonWriter implement IAsyncDisposable, so I had to asynchronously close the JSON writer manually rather than via a using statement. The outer await using should ensure that the underlying StreamWriter is not left open in the event of an exception.
JSON RFC 8259 specifies that implementations MUST NOT add a byte order mark (U+FEFF) to the beginning of a network-transmitted JSON text. Thus, when constructing a StreamWriter, it is recommended to pass an encoding such as new UTF8Encoding(false) that does not prepend a BOM. Alternatively, if you just want UTF-8, the StreamWriter constructors default to UTF-8 encoding without a BOM when you do not specify an encoding yourself, as is shown in the code above.
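The BOM behavior is easy to see directly. A small sketch using only the BCL (the WriteJson helper name is made up):

```csharp
using System;
using System.IO;
using System.Text;

// Returns the raw bytes produced by writing "{}" with the given encoding
// (null means "use the StreamWriter default", which is UTF-8 without a BOM).
static byte[] WriteJson(Encoding encoding)
{
    var ms = new MemoryStream();
    using (var writer = encoding == null
        ? new StreamWriter(ms, leaveOpen: true)
        : new StreamWriter(ms, encoding, 1024, leaveOpen: true))
    {
        writer.Write("{}");
    }
    return ms.ToArray();
}

Console.WriteLine(BitConverter.ToString(WriteJson(null)));                    // 7B-7D
Console.WriteLine(BitConverter.ToString(WriteJson(Encoding.UTF8)));           // EF-BB-BF-7B-7D (BOM!)
Console.WriteLine(BitConverter.ToString(WriteJson(new UTF8Encoding(false)))); // 7B-7D
```

Note that the static Encoding.UTF8 instance does emit a BOM, which is why passing it explicitly is the surprising case.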

JSON.NET: Minify/Format content without re-parsing

Is it possible to minify/format a JSON string using the Newtonsoft JSON.NET library without forcing the system to reparse the code? This is what I have for my methods:
public async Task<string> Minify(string json)
{
    // TODO: Some way to do this without a re-parse?
    var jsonObj = await JsonOpener.GetJsonFromString(json);
    return jsonObj.ToString(Formatting.None);
}

public async Task<string> Beautify(string json)
{
    // TODO: Some way to do this without a re-parse?
    var jsonObj = await JsonOpener.GetJsonFromString(json);
    return FormatJson(jsonObj);
}

private string FormatJson(JToken input)
{
    // We could just do input.ToString(Formatting.Indented), but this allows us
    // to take advantage of JsonTextWriter's formatting options.
    using (var stringWriter = new StringWriter(new StringBuilder()))
    {
        using (var jsonWriter = new JsonTextWriter(stringWriter))
        {
            // Configures indentation character and indentation width
            // (e.g., "indent each level using 2 spaces", or "use tabs")
            ConfigureWriter(jsonWriter);
            var serializer = new JsonSerializer();
            serializer.Serialize(jsonWriter, input);
            return stringWriter.ToString();
        }
    }
}
This code works just fine on small blocks of JSON, but it starts to get bogged down with large blocks of content. If I could just strip out the whitespace without having to go through the parser, it would be much faster, I'd imagine.
If I have to reinvent the wheel and strip out all whitespace or whatnot myself, I will, but I don't know if there any gotchas that come into play.
For that matter, is there another library better suited to this?
EDIT: My bad, JSON does not support comments natively.
Yes, you can do this using Json.Net. Just connect a JsonTextReader directly to a JsonTextWriter. That way you are reusing the tokenizer logic of the reader and the formatting logic of the writer, but you skip the step of converting the tokens into an intermediate object representation and back (which is the time-consuming part).
Here is how I would break it into helper methods to make it super easy and flexible to use:
public static string Minify(string json)
{
    return ReformatJson(json, Formatting.None);
}

public static string Beautify(string json)
{
    return ReformatJson(json, Formatting.Indented);
}

public static string ReformatJson(string json, Formatting formatting)
{
    using (StringReader stringReader = new StringReader(json))
    using (StringWriter stringWriter = new StringWriter())
    {
        ReformatJson(stringReader, stringWriter, formatting);
        return stringWriter.ToString();
    }
}

public static void ReformatJson(TextReader textReader, TextWriter textWriter, Formatting formatting)
{
    using (JsonReader jsonReader = new JsonTextReader(textReader))
    using (JsonWriter jsonWriter = new JsonTextWriter(textWriter))
    {
        jsonWriter.Formatting = formatting;
        jsonWriter.WriteToken(jsonReader);
    }
}
Here is a short demo: https://dotnetfiddle.net/RevZNU
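To make the token-copy behavior concrete, here is a condensed standalone version of the same technique with a sample input (the Reformat name and the input string are just for illustration):

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

// Same idea as ReformatJson above: pipe the reader straight into the writer,
// so tokens are copied without building a JToken tree in between.
static string Reformat(string json, Formatting formatting)
{
    using (var stringReader = new StringReader(json))
    using (var stringWriter = new StringWriter())
    using (var jsonReader = new JsonTextReader(stringReader))
    using (var jsonWriter = new JsonTextWriter(stringWriter) { Formatting = formatting })
    {
        jsonWriter.WriteToken(jsonReader);
        return stringWriter.ToString();
    }
}

var ugly = "{ \"name\" : \"test\",   \"values\" : [ 1, 2, 3 ] }";
Console.WriteLine(Reformat(ugly, Formatting.None)); // {"name":"test","values":[1,2,3]}
```

Because no intermediate JToken is materialized, memory stays flat even for large documents.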
With this setup you could easily add additional overloads that work on streams, too, if you needed it. For example:
public static void Minify(Stream inputStream, Stream outputStream, Encoding encoding = null)
{
    ReformatJson(inputStream, outputStream, Formatting.None, encoding);
}

public static void Beautify(Stream inputStream, Stream outputStream, Encoding encoding = null)
{
    ReformatJson(inputStream, outputStream, Formatting.Indented, encoding);
}

public static void ReformatJson(Stream inputStream, Stream outputStream, Formatting formatting, Encoding encoding = null)
{
    if (encoding == null)
        encoding = new UTF8Encoding(false);
    const int bufferSize = 1024;
    using (StreamReader streamReader = new StreamReader(inputStream, encoding, true, bufferSize, true))
    using (StreamWriter streamWriter = new StreamWriter(outputStream, encoding, bufferSize, true))
    {
        ReformatJson(streamReader, streamWriter, formatting);
    }
}

Serializing to a MemoryStream causes an OutOfMemoryException, but serializing to a FileStream does not. Can anyone tell me why?

I'm using Newtonsoft Json.Net to serialize objects as json. I continually run into an OutOfMemoryException when I try to serialize to a MemoryStream, but not when I serialize to a FileStream. Could someone explain why this might be happening? These are the two methods that I am using to serialize.
Throws an OutOfMemoryException
private static MemoryStream _serializeJson<T>(T obj)
{
    try
    {
        var stream = new MemoryStream();
        var streamWriter = new StreamWriter(stream);
        var jsonWriter = new JsonTextWriter(streamWriter);
        var serializer = new JsonSerializer();
        serializer.ContractResolver = new CamelCasePropertyNamesContractResolver();
        serializer.Formatting = Formatting.Indented;
        serializer.Serialize(jsonWriter, obj);
        streamWriter.Flush();
        stream.Position = 0;
        return stream;
    }
    catch (Exception e)
    {
        //Logger.WriteError(e.ToString());
        Console.WriteLine(e.ToString());
        return null;
    }
}
Doesn't throw an OutOfMemoryException
private static void _serializeJsonToFile<T>(T obj, string path)
{
    try
    {
        using (FileStream fs = File.Open(path, FileMode.Create, FileAccess.ReadWrite))
        using (StreamWriter sw = new StreamWriter(fs))
        using (JsonWriter jw = new JsonTextWriter(sw))
        {
            jw.Formatting = Formatting.Indented;
            JsonSerializer serializer = new JsonSerializer();
            serializer.ContractResolver = new CamelCasePropertyNamesContractResolver();
            serializer.Serialize(jw, obj);
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.ToString());
    }
}
P.S. some might ask why I would want to return a stream instead of simply serializing to the file stream. This is because I want to keep serialization in one class and file handling in another, so I'm passing the memory stream to a WriteFile method in another class later.
You are getting OutOfMemoryExceptions because the memory stream is very aggressive about its growth: every time it needs to resize, it doubles its internal buffer.
// The code from MemoryStream: http://referencesource.microsoft.com/mscorlib/system/io/memorystream.cs.html#1416df83d2368912
private bool EnsureCapacity(int value) {
    // Check for overflow
    if (value < 0)
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
    if (value > _capacity) {
        int newCapacity = value;
        if (newCapacity < 256)
            newCapacity = 256;
        // We are ok with this overflowing since the next statement will deal
        // with the cases where _capacity*2 overflows.
        if (newCapacity < _capacity * 2)
            newCapacity = _capacity * 2;
        // We want to expand the array up to Array.MaxArrayLengthOneDimensional
        // And we want to give the user the value that they asked for
        if ((uint)(_capacity * 2) > Array.MaxByteArrayLength)
            newCapacity = value > Array.MaxByteArrayLength ? value : Array.MaxByteArrayLength;
        Capacity = newCapacity;
        return true;
    }
    return false;
}
With a 17.8 MB file, the worst case is a 35.6 MB byte array being allocated. The old byte arrays that were discarded during the resizing process can also cause memory fragmentation depending on how long they live; this can easily make your program throw an OOM error before you reach the 32-bit memory limit.
Writing directly to a FileStream does not require any large buffers to be created in memory so it uses much less space.
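The doubling in EnsureCapacity above is easy to observe directly. A small sketch; the capacity values in the comments follow from the reference-source logic quoted above:

```csharp
using System;
using System.IO;

var ms = new MemoryStream();
ms.WriteByte(1);                 // needs 1 byte  -> minimum capacity of 256
Console.WriteLine(ms.Capacity);  // 256
ms.Write(new byte[256], 0, 256); // needs 257     -> doubles to 512
Console.WriteLine(ms.Capacity);  // 512
ms.Write(new byte[512], 0, 512); // needs 769     -> doubles to 1024
Console.WriteLine(ms.Capacity);  // 1024
```

Each resize allocates a fresh array of double the size and copies the old contents over, so peak memory during a resize is roughly triple the data actually stored.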
There is a way to separate the logic of the saving from the serializing: just pass the stream in to the function instead of creating it in the function itself.
private static void _serializeJson<T>(T obj, Stream stream)
{
    try
    {
        using (var streamWriter = new StreamWriter(stream, Encoding.UTF8, 1024, true))
        using (var jsonWriter = new JsonTextWriter(streamWriter))
        {
            var serializer = new JsonSerializer();
            serializer.ContractResolver = new CamelCasePropertyNamesContractResolver();
            serializer.Formatting = Formatting.Indented;
            serializer.Serialize(jsonWriter, obj);
        }
    }
    catch (Exception e)
    {
        //Logger.WriteError(e.ToString());
        Console.WriteLine(e.ToString());
    }
}
I also dispose of the StreamWriter that is created; the constructor I used has a leaveOpen flag, which prevents the underlying stream from being closed when the StreamWriter is disposed.
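With the stream passed in, the caller keeps ownership and chooses the destination. A hypothetical caller, with the refactored method inlined so the snippet compiles on its own:

```csharp
using System;
using System.IO;
using System.Text;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Same refactored method as above (renamed here only because it is standalone).
static void SerializeJson<T>(T obj, Stream stream)
{
    using (var streamWriter = new StreamWriter(stream, Encoding.UTF8, 1024, true))
    using (var jsonWriter = new JsonTextWriter(streamWriter))
    {
        var serializer = new JsonSerializer
        {
            ContractResolver = new CamelCasePropertyNamesContractResolver(),
            Formatting = Formatting.Indented
        };
        serializer.Serialize(jsonWriter, obj);
    }
}

// File handling stays with the caller; serialization knows nothing about files.
var path = Path.Combine(Path.GetTempPath(), "output.json");
using (var fs = File.Create(path))
{
    SerializeJson(new { Name = "test" }, fs);
}
Console.WriteLine(File.ReadAllText(path));
```

The same call works unchanged against a MemoryStream, a NetworkStream, or anything else.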