I need to get the content of a Microsoft Word (.docx) file from Amazon S3. I am able to get the object, but the result is not what I want: it looks like a Word file opened in Notepad. Reading a .txt file the same way works perfectly, so I think the problem is the content type.
I would like to ask two questions:
Is it possible to get the content of the document as it appears in the file on Amazon, and how do I modify my code to achieve that?
Is it possible to get the content with formatting (colors, bold text, etc.)? If so, I would appreciate some clues.
My Code:
public static string ReadObjectData(string keyName)
{
    string responseBody = "";

    //using (IAmazonS3 client = new AmazonS3Client(RegionEndpoint.USEast1))
    using (IAmazonS3 client = new Amazon.S3.AmazonS3Client("key", "secretKey", Amazon.RegionEndpoint.EUCentral1))
    {
        GetObjectRequest request = new GetObjectRequest
        {
            BucketName = "bucketName",
            Key = keyName
        };

        using (GetObjectResponse response = client.GetObject(request))
        using (Stream responseStream = response.ResponseStream)
        using (StreamReader reader = new StreamReader(responseStream))
        {
            responseBody = reader.ReadToEnd();
        }
    }

    return responseBody;
}
The correct Content-Type for a .docx file is application/vnd.openxmlformats-officedocument.wordprocessingml.document.
An incorrect Content-Type may cause a web browser to render the document incorrectly, but that isn't the problem here. Setting it correctly will have no impact on the bytes that actually end up in responseBody when you read the object from code.
A .docx file isn't plain text, so you need a library that understands the internals of the .docx format.
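For example, a minimal sketch using the Open XML SDK (the DocumentFormat.OpenXml NuGet package) might look like the following; the method name and the buffering into a MemoryStream are my own assumptions, not part of the original code:

// Sketch only: extracts the plain text of a .docx stored in S3 using the Open XML SDK.
// WordprocessingDocument needs a seekable stream, so the response is buffered first.
using System.IO;
using Amazon.S3;
using Amazon.S3.Model;
using DocumentFormat.OpenXml.Packaging;

public static string ReadDocxText(IAmazonS3 client, string bucketName, string keyName)
{
    var request = new GetObjectRequest { BucketName = bucketName, Key = keyName };

    using (GetObjectResponse response = client.GetObject(request))
    using (var memory = new MemoryStream())
    {
        response.ResponseStream.CopyTo(memory);
        memory.Position = 0;

        using (var doc = WordprocessingDocument.Open(memory, false))
        {
            // InnerText concatenates the text runs; formatting (colors, bold) is not preserved here.
            return doc.MainDocumentPart.Document.Body.InnerText;
        }
    }
}

Getting the content with formatting intact is a different job: you would have to walk the document parts and runs (or use a commercial library) rather than just pull out the text.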
I understand your question about getting the object with the right content type, and I think Michael's answer covers how to resolve the problem.
I would just like to add some additional information about storing objects in an S3 bucket. The content type can be set in the metadata when the object is added to the bucket.
If you are storing the objects yourself and retrieving them later, set the content type (Content-Type) in the metadata so that you get the correct content type back when you read the object.
This is the better approach if you are both adding and retrieving the objects.
doc:  application/msword
docx: application/vnd.openxmlformats-officedocument.wordprocessingml.document
If you are reading objects added by someone else, you can ask them to add the content type information, or you need to derive it yourself as mentioned in Michael's answer.
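For example (a sketch only: the bucket name, key, and file path below are placeholders, and client is the same IAmazonS3 client as in the question), the content type can be set directly on the PutObjectRequest when the file is uploaded:

// Sketch: set the Content-Type at upload time so GetObject returns it later.
var putRequest = new PutObjectRequest
{
    BucketName = "bucketName",
    Key = "document.docx",
    FilePath = @"C:\temp\document.docx",
    ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
};
client.PutObject(putRequest);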
I need to save a list of ParseObjects to disk in C#, and was wondering if anyone had any recommendations on how to achieve this.
I am sure the Parse .NET SDK has a way of converting ParseObjects to/from JSON, since that is the format used in transit to the Parse server, but I haven't been able to find any public methods to do this :(
Hopefully someone has an answer for what should be an easy question! :)
If you are using Newtonsoft.Json, you can do this:
var jsonString = JsonConvert.SerializeObject(yourObject);

using (StreamWriter writer = new StreamWriter("SerializedObject.json"))
{
    writer.Write(jsonString);
}
To read the JSON file back, you can do this:
using (StreamReader reader = new StreamReader("SerializedObject.json"))
{
    string jsonString = reader.ReadToEnd();
    YourObject objectFromJson = JsonConvert.DeserializeObject<YourObject>(jsonString);
}
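Since the question mentions a list of ParseObjects, here is a hedged sketch of the same idea for a whole list. ParseObject itself may not round-trip cleanly through Newtonsoft, so one option is to copy the fields you care about into a plain DTO first; the DTO shape, field names, and file name here are illustrative only:

using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Illustrative DTO; copy the ParseObject fields you care about into it before serializing.
public class GameScoreDto
{
    public string ObjectId { get; set; }
    public int Score { get; set; }
}

// Serialize a whole list and write it to disk.
var dtos = new List<GameScoreDto>
{
    new GameScoreDto { ObjectId = "abc123", Score = 42 }
};
File.WriteAllText("SerializedObjects.json", JsonConvert.SerializeObject(dtos));

// Read it back later.
var restored = JsonConvert.DeserializeObject<List<GameScoreDto>>(
    File.ReadAllText("SerializedObjects.json"));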
I think you can use the ToString() method for this. Override ToString() in your class and put the serialization code there.
For more information you can look at the link below:
https://msdn.microsoft.com/en-us/library/system.object.tostring(v=vs.110).aspx
I'm working with C# in VS2012 and have installed the Json.NET package to handle deserialization of a JSON string that's stored in an external file (1.json). As a newbie, I've come across a situation in which I want to extract the score and average score from a single JSON string; see below:
{"LEVEL": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
"score": 1,
"average score": 2 }
The output I get from the debugger once I step through the process shows that the stream only picks up the first part of the json file (everything from the first opening square bracket to the closing square bracket) so I'm unable to obtain the score and average score. Here is what I have at the moment to try to extract this information...
using (var sr = new StreamReader(File.OpenRead(filename)))
{
    levelData = sr.ReadLine();
    var stats = JsonConvert.DeserializeObject<Dictionary<string, dynamic>>(levelData);
}
Can anyone provide any advice as to how I can extract this information? Any help would be greatly appreciated.
The problem is that you're reading the file and deserializing the data line by line. You can't do that with JSON, because it is a single structure that spans the whole document (like XML).
Instead you should deserialize the whole file:
var json = File.ReadAllText(filename);
var stats = JsonConvert.DeserializeObject<Dictionary<string, dynamic>>(json);
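If you prefer a typed class over the dynamic dictionary, a small sketch like this also works; the class and property names are illustrative, and the [JsonProperty] attribute maps the "average score" key (which contains a space) to a normal C# property:

using System;
using System.IO;
using Newtonsoft.Json;

public class LevelStats
{
    [JsonProperty("LEVEL")]
    public int[] Level { get; set; }

    [JsonProperty("score")]
    public int Score { get; set; }

    [JsonProperty("average score")]
    public int AverageScore { get; set; }
}

// Usage: read the whole file and deserialize into the typed class.
var stats = JsonConvert.DeserializeObject<LevelStats>(File.ReadAllText(filename));
Console.WriteLine("score: " + stats.Score + ", average score: " + stats.AverageScore);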
This is more of a theoretical question, but I am curious what the difference between these two methods of reading a file is and why I would want to choose one over the other.
I am parsing a JSON configuration file (from local disk). Here is one method of doing it:
// Uses JSON.NET Serializer + StreamReader
using (var sr = new StreamReader(file))
{
    var jtr = new JsonTextReader(sr);
    var jsonSerializer = new JsonSerializer();
    return jsonSerializer.Deserialize<Configuration>(jtr);
}
...and, the second option...
// Reads the entire file and deserializes.
var json = File.ReadAllText(file);
return JsonConvert.DeserializeObject<JsonVmrConfigurationProvider>(json);
Is one any better than the other? Is there a case where one or the other should be used?
Again, this is more theoretical, but, I realized I don't really know the answer to it, and a search online didn't produce results that satisfied me. I could see the second being bad if the file was large (it isn't) since it's being read into memory in one shot. Any other reasons?
By reading the Json.NET source you can see that deserialization from a string ultimately reaches:
public static object DeserializeObject(string value, Type type, JsonSerializerSettings settings)
{
    ValidationUtils.ArgumentNotNull(value, "value");

    JsonSerializer jsonSerializer = JsonSerializer.CreateDefault(settings);

    // by default DeserializeObject should check for additional content
    if (!jsonSerializer.IsCheckAdditionalContentSet())
        jsonSerializer.CheckAdditionalContent = true;

    using (var reader = new JsonTextReader(new StringReader(value)))
    {
        return jsonSerializer.Deserialize(reader, type);
    }
}
That is, it just creates a JsonTextReader over a StringReader.
So the difference effectively comes down to how huge files are handled.
To temper my previous comment: JsonTextReader overrides JsonReader.Close() and, among other things, disposes the underlying reader when CloseInput is true.
CloseInput should be true by default here, since the StringReader is not explicitly disposed in the fragment above.
With File.ReadAllText(), the entire JSON needs to be loaded into memory first before deserializing it. Using a StreamReader, the file is read and deserialized incrementally. So if your file is huge, you might want to use the StreamReader to avoid loading the whole file into memory. If your JSON file is small (most cases), it doesn't really make a difference.