I have the following code to deserialize JSON objects sent over a stream.
When running this code against a network stream, I get an error every time jsonTextReader.LinePosition crosses 1000, even for an object that had previously been deserialized successfully; i.e. all the JSON messages being sent are correctly formatted and match the schema.
Weirdly, when using a MemoryStream, the same code works perfectly irrespective of jsonTextReader.LinePosition.
Has anyone else experienced a similar issue when LinePosition goes beyond 1000?
var Serializer = new JsonSerializer()
{
    NullValueHandling = NullValueHandling.Ignore
};
var streamReader = new StreamReader(stream, new UTF8Encoding());
var jsonTextReader = new JsonTextReader(streamReader)
{
    CloseInput = false,
    SupportMultipleContent = true
};
MyObject message = null; // declared here so the sample compiles
while (true)
{
    if (jsonTextReader.Read())
    {
        message = Serializer.Deserialize<MyObject>(jsonTextReader);
    }
}
PS: I can't move the if (jsonTextReader.Read()) into the while condition, because that part is done in a library function. In short, I can't change the flow of the code.
Update
@Tewr suggested that it could be some max length restricted to 1000/1024 on the source stream. But when I run the following, it works fine.
while (true)
{
    if (jsonTextReader.Read())
    {
        message = Serializer.Deserialize<MyObject>(jsonTextReader);
    }
    else
    {
        jsonTextReader = new JsonTextReader(streamReader);
    }
}
But the problem is, I can't do this because, as mentioned, part of the code runs in a library. And it doesn't entirely resolve the problem (what if a single JSON object string is longer than 1024 characters?).
Now, when creating a new JsonTextReader, everything is reset, including jsonTextReader.LinePosition. So not letting jsonTextReader.LinePosition hit 1000 makes things work. In that case, what is the relation between jsonTextReader.LinePosition and the NetworkStream/Socket?
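For what it's worth, one difference I'm aware of between the two stream types: a single Read on a NetworkStream can return fewer bytes than requested (it returns as soon as some data arrives), whereas a MemoryStream hands back everything it holds; a StreamReader also buffers its input in small chunks (1 KB by default, if I remember correctly), which may be why the 1000/1024 boundary shows up. A minimal sketch of the NetworkStream behaviour (networkStream is assumed to be an already-connected stream):
// Illustration only: a single Read on a NetworkStream may return fewer
// bytes than requested, so callers must loop until they have enough data.
var buffer = new byte[4096];
int total = 0;
while (total < buffer.Length)
{
    int read = networkStream.Read(buffer, total, buffer.Length - total);
    if (read == 0) break; // the remote side closed the connection
    total += read;        // read can be anywhere from 1 up to the count requested
}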
Here are some of the error messages I get:
Unexpected character encountered while parsing number: T. Path 'timeStamp', line 1, position 1026.
for
{"version":"35","timestamp":"2018-05-14T07:52:23.3347793Z","message":{"type":"msgType","value":{"id":"44","params":["AA","AC","BD"]}}}
and
Invalid JavaScript property identifier character: ". Path 'message.indicator', line 1, position 1023.
for
{"version":"35","timestamp":"2018-05-14T07:52:23.3347793Z","message":{"type":"msgType","value":{"id":"44","params":["AA","AC","BD"]},"indicator":{"data":"normal","indication":"no"}}}
Testing with MemoryStream:
var str = "{\"version\":\"35\",\"timestamp\":\"2018-05-14T07:52:23.3347793Z\",\"message\":{\"type\":\"msgType\",\"value\":{\"id\":\"44\",\"params\":[\"AA\",\"AC\",\"BD\"]},\"indicator\":{\"data\":\"normal\",\"indication\":\"no\"}}}";
str += str;
str += str;
str += str;
var ms = new MemoryStream();
var bytes = Encoding.UTF8.GetBytes(str); // string has no GetBytes(); encode explicitly
ms.Write(bytes, 0, bytes.Length);
ms.Position = 0;
Console.WriteLine(string.Format("StringLength:{0}", str.Length));
var streamReader = new StreamReader(ms);
var jsonTextReader = new JsonTextReader(streamReader)
{
    CloseInput = false,
    SupportMultipleContent = true
};
and then use the jsonTextReader as before.
I'm using .NET Core 3.1 to create a RESTful API.
Here I'm trying to modify the response body to filter out some values based on a corporate use case that I have.
My problem is that, at first, I found that the CanRead value of context.HttpContext.Response.Body is false, and thus the stream is unreadable. I searched around and found this question and its answers, which
basically convert a stream that can't seek to one that can
so I applied the answer with a little modification to fit my use case:
Stream originalBody = context.HttpContext.Response.Body;
try
{
    using (var memStream = new MemoryStream())
    {
        context.HttpContext.Response.Body = memStream;

        memStream.Position = 0;
        string responseBody = new StreamReader(memStream).ReadToEnd();

        memStream.Position = 0;
        memStream.CopyTo(originalBody);

        string response_body = new StreamReader(originalBody).ReadToEnd();

        PagedResponse<List<UserPhoneNumber>> deserialized_body;
        deserialized_body = JsonConvert.DeserializeObject<PagedResponse<List<UserPhoneNumber>>>(response_body);

        // rest of code logic
    }
}
finally
{
    context.HttpContext.Response.Body = originalBody;
}
But when debugging, I found that memStream.Length is always 0, and therefore the string read back is always empty ("").
Even so, after this executes the response is still returned successfully (thanks to the finally block).
I can't seem to understand why this is happening. Is this an outdated method? What am I doing wrong?
Thank you in advance.
The using block is closing the stream. Try:
string body = new StreamReader(Request.Body).ReadToEnd();
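For reference, the usual shape of this pattern (a sketch only, assuming inline ASP.NET Core middleware rather than a result filter; the key point is that the rest of the pipeline has to run and write into the swapped-in stream before you read it):
// Sketch: swap in a MemoryStream, let the rest of the pipeline write the
// response, then rewind, read, and copy the bytes back to the real body.
app.Use(async (context, next) =>
{
    Stream originalBody = context.Response.Body;
    try
    {
        using (var memStream = new MemoryStream())
        {
            context.Response.Body = memStream;

            await next(); // the response gets written here, not before

            memStream.Position = 0;
            string responseBody = new StreamReader(memStream).ReadToEnd();
            // ... inspect/filter responseBody here ...

            memStream.Position = 0;
            await memStream.CopyToAsync(originalBody);
        }
    }
    finally
    {
        context.Response.Body = originalBody;
    }
});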
I have sent the contents of a text file to a Kafka producer after reading the file into a string. Now I want to consume that data and write it back to a text file. How do I consume it?
var fileName = @"D:\kafka_examples\new2.txt";
var options = new KafkaOptions(new Uri("http://localhost:9092"),
new Uri("http://localhost:9092"));
var router = new BrokerRouter(options);
var consumer = new KafkaNet.Consumer(new ConsumerOptions("Hello-Kafka",
new BrokerRouter(options)));
var text="";
//Consume returns a blocking IEnumerable (ie: never ending stream)
if (File.Exists(fileName))
{
File.Delete(fileName);
}
foreach (var message in consumer.Consume())
{
    Console.WriteLine("Response: P{0},O{1} : {2}",
        message.Meta.PartitionId, message.Meta.Offset,
        text = Encoding.UTF8.GetString(message.Value));
    using (StreamWriter sw = File.CreateText(fileName))
    {
        sw.WriteLine(text);
    }
}
I tried this, but the data is not being written to the given text file. All the messages are coming through, but I want only the last message.
There is no concept of the "latest" message in a stream; they are infinite.
But what you could do is look up the current latest offset when your code starts, subtract one (or the number of lines in the file), seek the consumer group to that position, and break out of the loop after reading just that many messages.
i.e. (pseudocode):
var filename = ...
var lines = linesInFile(filename)
var consumer = ... // (with a consumer group id)
var latestOffset = seekToEnd(consumer, -1 * lines) // second param is the delta offset from the end
var i = lines;
foreach (var message in consumer.Consume()) {
    ... // write the message to the file
    i--;
    if (i <= 0) break;
}
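A rough sketch of how that might look with the kafka-net client from the question (hedged: GetTopicOffsetAsync, SetOffsetPosition, and OffsetPosition are the offset APIs in the kafka-net versions I have used, but check yours; this also needs using System.Linq):
// Sketch only: look up the latest offset per partition, rewind by one
// message, then consume exactly one message and stop.
var offsets = consumer.GetTopicOffsetAsync("Hello-Kafka").Result;

// Position the consumer one message before the current end of each partition.
consumer.SetOffsetPosition(offsets
    .Select(o => new OffsetPosition(o.PartitionId, Math.Max(0, o.Offsets.Max() - 1)))
    .ToArray());

foreach (var message in consumer.Consume())
{
    // Overwrite the file so it only ever contains the last message.
    File.WriteAllText(fileName, Encoding.UTF8.GetString(message.Value));
    break;
}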
Also, Kafka is not an HTTP service. Remove http:// and the duplicate localhost address from your code.
I'm working on PDF to text file conversion using the Google Cloud Vision API.
I got some initial code help from their side; image to text conversion works fine with the JSON key I got through registration and activation.
Here is the code I got for PDF to text conversion:
private static object DetectDocument(string gcsSourceUri,
    string gcsDestinationBucketName, string gcsDestinationPrefixName)
{
    var client = ImageAnnotatorClient.Create();

    var asyncRequest = new AsyncAnnotateFileRequest
    {
        InputConfig = new InputConfig
        {
            GcsSource = new GcsSource
            {
                Uri = gcsSourceUri
            },
            // Supported mime_types are: 'application/pdf' and 'image/tiff'
            MimeType = "application/pdf"
        },
        OutputConfig = new OutputConfig
        {
            // How many pages should be grouped into each json output file.
            BatchSize = 2,
            GcsDestination = new GcsDestination
            {
                Uri = $"gs://{gcsDestinationBucketName}/{gcsDestinationPrefixName}"
            }
        }
    };

    asyncRequest.Features.Add(new Feature
    {
        Type = Feature.Types.Type.DocumentTextDetection
    });

    List<AsyncAnnotateFileRequest> requests =
        new List<AsyncAnnotateFileRequest>();
    requests.Add(asyncRequest);

    var operation = client.AsyncBatchAnnotateFiles(requests);

    Console.WriteLine("Waiting for the operation to finish");
    operation.PollUntilCompleted();

    // Once the request has completed and the output has been
    // written to GCS, we can list all the output files.
    var storageClient = StorageClient.Create();

    // List objects with the given prefix.
    var blobList = storageClient.ListObjects(gcsDestinationBucketName,
        gcsDestinationPrefixName);
    Console.WriteLine("Output files:");
    foreach (var blob in blobList)
    {
        Console.WriteLine(blob.Name);
    }

    // Process the first output file from GCS.
    // Select the first JSON file from the objects in the list.
    var output = blobList.Where(x => x.Name.Contains(".json")).First();

    var jsonString = "";
    using (var stream = new MemoryStream())
    {
        storageClient.DownloadObject(output, stream);
        jsonString = System.Text.Encoding.UTF8.GetString(stream.ToArray());
    }

    var response = JsonParser.Default
        .Parse<AnnotateFileResponse>(jsonString);

    // The actual response for the first page of the input file.
    var firstPageResponses = response.Responses[0];
    var annotation = firstPageResponses.FullTextAnnotation;

    // Here we print the full text from the first page.
    // The response contains more information:
    // annotation/pages/blocks/paragraphs/words/symbols
    // including confidence scores and bounding boxes
    Console.WriteLine($"Full text: \n {annotation.Text}");

    return 0;
}
This function requires 3 parameters:
string gcsSourceUri,
string gcsDestinationBucketName,
string gcsDestinationPrefixName
I don't understand which values I should set for those 3 params. I have never worked with a third-party API before, so it's a little bit confusing for me.
Suppose you own a GCS bucket named 'giri_bucket' and you put a PDF, 'test.pdf', at the root of the bucket. If you wanted to write the results of the operation to the same bucket, you could set the arguments to be:
gcsSourceUri: 'gs://giri_bucket/test.pdf'
gcsDestinationBucketName: 'giri_bucket'
gcsDestinationPrefixName: 'async_test'
When the operation completes, there will be 1 or more output files in your GCS bucket at giri_bucket/async_test.
If you want, you could even write your output to a different bucket. You just need to make sure your gcsDestinationBucketName + gcsDestinationPrefixName is unique.
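Putting that together, the call would look something like this (using the sample values above):
// Example invocation with the sample bucket and prefix from this answer.
DetectDocument(
    "gs://giri_bucket/test.pdf",  // gcsSourceUri
    "giri_bucket",                // gcsDestinationBucketName
    "async_test");                // gcsDestinationPrefixName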
You can read more about the request format in the docs: AsyncAnnotateFileRequest
I'm writing a Windows service to listen for and process messages from MSMQ. The listener has various error-handling steps, but if all else fails, I want to save the body of the message to a text file so that I can look at it. However, I can't seem to extract the content of my messages when this condition is hit. The following code is a simple representation of the sections in question, and it always produces an empty text file even though I know the message I'm testing with is not empty. However, if I comment out the initial attempt to deserialize the XML, the fail-safe does work and produces a text file with the message body. So I think the problem is something to do with where the deserialization attempt leaves the underlying stream? Just to clarify: when the message contains valid XML that can be deserialized, the service works fine and the fail-safe never comes into action.
MyClass myClass = null;
try
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyClass));
    // Comment the following out and the fail-safe works.
    // Let this run and fail and the text file below is always empty.
    myClass = (MyClass)serializer.Deserialize(m.BodyStream);
}
catch (Exception ex)
{
    // intentionally swallowed; fall through to the fail-safe below
}
if (myClass == null)
{
    string filePath = @"D:\path\file.txt";
    m.Formatter = new ActiveXMessageFormatter();
    StreamReader reader = new StreamReader(m.BodyStream);
    File.WriteAllText(filePath, reader.ReadToEnd());
}
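One thing I've been experimenting with, given that suspicion: rewinding the stream before the fail-safe read, since the failed Deserialize call has already read m.BodyStream forward. A sketch (assuming BodyStream is seekable, which it normally is for MSMQ messages):
if (myClass == null)
{
    string filePath = @"D:\path\file.txt";
    // Rewind the stream that the failed Deserialize attempt consumed.
    if (m.BodyStream.CanSeek)
    {
        m.BodyStream.Position = 0;
    }
    using (var reader = new StreamReader(m.BodyStream))
    {
        File.WriteAllText(filePath, reader.ReadToEnd());
    }
}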
Depending on the formatter you are using:
For Windows binary messages:
File.WriteAllText(<path>, (new UTF8Encoding()).GetString((byte[])msg.Body)); // for binary
For XML messages try this:
msg.Formatter = new XmlMessageFormatter(new String[] { "System.String, mscorlib" });
var text = msg.Body.ToString();
// write to file..
If the message is neither binary nor XML, use the native formatter:
msg.Formatter = new ActiveXMessageFormatter();
var reader = new StreamReader(msg.BodyStream);
var msgBody = reader.ReadToEnd();
// write to file..
I need to construct a JObject that has a single property that could potentially contain a very large amount of text. I have this text being read from a stream, but I can't figure out how to write it to a single JToken.
Here's what I've tried so far:
using (var stream = new MemoryStream())
{
    using (var streamWriter = new StreamWriter(stream))
    {
        // write a lot of random text to the stream
        var docSize = 1024 * 1024;
        var rnd = new Random();
        for (int i = 0; i < docSize; i++)
        {
            var c = (char)rnd.Next('A', 'Z');
            streamWriter.Write(c);
        }
        streamWriter.Flush();
        stream.Seek(0, SeekOrigin.Begin);

        // read from the stream and write a token
        using (var streamReader = new StreamReader(stream))
        using (var jTokenWriter = new JTokenWriter())
        {
            const int blockSize = 1024;
            var buffer = new char[blockSize];
            while (!streamReader.EndOfStream)
            {
                var charsRead = streamReader.Read(buffer, 0, blockSize);
                var str = new string(buffer, 0, charsRead);
                jTokenWriter.WriteValue(str);
            }

            // add the token to an object
            var doc = new JObject();
            doc.Add("Text", jTokenWriter.Token);

            // spit out the json for debugging
            var json = doc.ToString(Formatting.Indented);
            Debug.WriteLine(json);
        }
    }
}
This is just a proof of concept. Of course, in reality, I will be getting the stream from elsewhere (a FileStream, for example). The data could potentially be very large - hundreds of megabytes. So just working with strings is out of the question.
This example doesn't work. Only the last block read is left in the token. How can I write a value to the token and have it append to what was previously written instead of replacing it?
Is there a more efficient way to do this?
To clarify - the text being written is not already in json format. It is closer to human readable text. It will need to go through the same escaping and formatting that would occur if you wrote a plain string value.
After much research, I believe that the answer is "It can't be done".
Really, I think a single JValue of a very large string is something to avoid. I instead broke it up into smaller values stored in a JArray.
If I am wrong, please post a better answer. Thanks.
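For anyone curious, here is a sketch of the JArray workaround I ended up with (the property name and block size are just illustrative):
// Read the stream in fixed-size chunks and store each chunk as its own
// string value in a JArray, so no single string ever holds the whole text.
using (var streamReader = new StreamReader(stream))
{
    var chunks = new JArray();
    const int blockSize = 1024;
    var buffer = new char[blockSize];
    int charsRead;
    while ((charsRead = streamReader.Read(buffer, 0, blockSize)) > 0)
    {
        chunks.Add(new string(buffer, 0, charsRead));
    }

    var doc = new JObject();
    doc.Add("TextChunks", chunks);
}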