Why is http content different when sent from c# vs java?

I have an xml file that I need to send to a REST server as a post. When I read the exact same file from c# and java the bytes do not match when they arrive at the server. The java ones fail with a 500 Internal Server Error while the c# one works perfectly. The server is c#.
The file in c# is read as follows:
using (ms = new MemoryStream())
{
string fullPath = @"c:\pathtofile\datalast.xml";
using (FileStream outStream = File.OpenRead(fullPath))
{
outStream.CopyTo(ms);
outStream.Flush();
}
ms.Position = 0;
var xmlDoc = new XmlDocument();
xmlDoc.Load(ms);
content = xmlDoc.OuterXml;
}
content is then sent to a call that uses an HttpWebResponse
The java (Android) code reads the file like this:
FileInputStream fis = app.openFileInput(DATA_LAST_FILE_NAME);
byte[] buffer = new byte[1024];
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int len;
while ((len = fis.read(buffer)) != -1)
{
outputStream.write(buffer, 0, len);
}
outputStream.close();
fis.close();
ByteArrayEntity data = new ByteArrayEntity(buffer);
data.setContentType("application/xml");
post.setEntity(data);
HttpResponse response = request.execute(post);
For the most part the arrays generated are identical. The only difference seems to be in the first 3 bytes. The c# byte array's first 3 values are:
239,187,191
The java ones are:
-17,-69,-65
What is happening here? What should I do?
Thanks,
\ ^ / i l l

Look at what you're doing here:
FileInputStream fis = app.openFileInput(DATA_LAST_FILE_NAME);
byte[] buffer = new byte[1024];
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int len;
while ((len = fis.read(buffer)) != -1)
{
outputStream.write(buffer, 0, len);
}
outputStream.close();
fis.close();
ByteArrayEntity data = new ByteArrayEntity(buffer);
You're creating the ByteArrayEntity from the buffer that you've used when reading the data. It's almost certainly not the right length (it will always be length 1024), and it may well not have all the data either: after the loop, the buffer only holds whatever the most recent reads left in it, not the whole file.
You should be using the ByteArrayOutputStream you've been writing into, e.g.
ByteArrayEntity data = new ByteArrayEntity(outputStream.toByteArray());
(You should be closing fis in a finally block, by the way.)
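A minimal sketch of both points (take the bytes from the ByteArrayOutputStream, and close the input in a finally block), written in plain Java rather than against the Android API. The names StreamBytes and readFully are hypothetical, and a ByteArrayInputStream stands in for the FileInputStream returned by openFileInput:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamBytes {
    // Reads every byte from the stream and closes it, even if a read fails.
    static byte[] readFully(InputStream in) throws IOException {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int len;
            while ((len = in.read(buffer)) != -1) {
                // Copy only the bytes actually read in this pass.
                out.write(buffer, 0, len);
            }
            // The accumulated data, not the 1024-byte scratch buffer.
            return out.toByteArray();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file: 3000 bytes, so it spans several buffer fills.
        byte[] source = new byte[3000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        byte[] copy = readFully(new ByteArrayInputStream(source));
        System.out.println(copy.length); // prints 3000
    }
}
```

In the question's code, the result of such a method is what would be passed to new ByteArrayEntity(...).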
EDIT: The values you've printed to the console are indeed just showing the differences between signed and unsigned representations. They have nothing to do with the reason the Java code is failing, which is due to the above problem, I believe. You should look at what's being sent over the wire in Wireshark - that'll show you what's really going on.

Take a look at this: http://en.wikipedia.org/wiki/Byte_order_mark
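EDIT: Java and C# print these bytes differently because Java's byte type is signed (-128 to 127) while C#'s byte type is unsigned (0 to 255). The binary values are the same: -17, -69, -65 and 239, 187, 191 are both 0xEF, 0xBB, 0xBF, which is the UTF-8 byte order mark.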
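To see that the two sets of numbers are the same bytes, a small Java sketch can mask each signed byte with 0xFF to get its unsigned value. The class and method names here (BomBytes, toUnsigned, startsWithUtf8Bom) are made up for illustration:

```java
final class BomBytes {
    // The UTF-8 byte order mark: 0xEF 0xBB 0xBF.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Converts a signed Java byte (-128..127) to its unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Returns true if the data begins with the three BOM bytes.
    static boolean startsWithUtf8Bom(byte[] data) {
        if (data.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (data[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (byte b : UTF8_BOM) {
            System.out.println(b + " -> " + toUnsigned(b));
        }
        // prints:
        // -17 -> 239
        // -69 -> 187
        // -65 -> 191
    }
}
```

A check like startsWithUtf8Bom can also help when comparing what each client actually sends, since the C# code goes through XmlDocument and sends OuterXml while the Java code sends the raw file bytes.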

Related

File API seems to always write corrupt files when used in a loop, except for the last file

I know the title is long, but it describes the problem exactly. I didn't know how else to explain it because this is totally out there.
I have a utility written in C# targeting .NET Core 2.1 that downloads and decrypts (AES encryption) files originally uploaded by our clients from our encrypted store, so they can be reprocessed through some of our services in the case that they fail. This utility is run via CLI using database IDs for the files as arguments, for example download.bat 101 102 103 would download 3 files with the corresponding IDs. I'm receiving byte data through a message queue (really not much more than a TCP socket) which describes a .TIF image.
I have a good reason to believe that the byte data is not ever corrupted on the server. That reason is when I run the utility with only one ID parameter, such as download.bat 101, then it works just fine. Furthermore, when I run it with multiple IDs, the last file that is downloaded by the utility is always intact, but the rest are always corrupted.
This odd behavior has persisted across two different implementations for writing the byte data to a file. Those implementations are below.
File.WriteAllBytes implementation:
private static void WriteMessageContents(FileServiceResponseEnvelope envelope, string destination, byte[] encryptionKey, byte[] macInitialVector)
{
using (var inputStream = new MemoryStream(envelope.Payload))
using (var outputStream = new MemoryStream(envelope.Payload.Length))
{
var sha512 = YellowAesEncryptor.DecryptStream(inputStream, outputStream, encryptionKey, macInitialVector, 0);
File.WriteAllBytes(destination, outputStream.ToArray());
_logger.LogStatement($"Finished writing [{envelope.Payload.Length} bytes] to [{destination}].", LogLevel.Debug);
}
}
FileStream implementation:
private static void WriteMessageContents(FileServiceResponseEnvelope envelope, string destination, byte[] encryptionKey, byte[] macInitialVector)
{
using (var inputStream = new MemoryStream(envelope.Payload))
using (var outputStream = new MemoryStream(envelope.Payload.Length))
{
var sha512 = YellowAesEncryptor.DecryptStream(inputStream, outputStream, encryptionKey, macInitialVector, 0);
using (FileStream fs = new FileStream(destination, FileMode.Create))
{
var bytes = outputStream.ToArray();
fs.Write(bytes, 0, envelope.Payload.Length);
_logger.LogStatement($"File byte content: [{string.Join(", ", bytes.Take(16))}]", LogLevel.Trace);
fs.Flush();
}
_logger.LogStatement($"Finished writing [{envelope.Payload.Length} bytes] to [{destination}].", LogLevel.Debug);
}
}
This method is called from a foreach loop which first receives the messages I described earlier and then feeds their payloads to the above method:
using (var requestSocket = new RequestSocket(fileServiceEndpoint))
{
// Envelopes is constructed beforehand
foreach (var envelope in envelopes)
{
var timer = Stopwatch.StartNew();
requestSocket.SendMoreFrame(messageTypeBytes);
requestSocket.SendMoreFrame(SerializationHelper.SerializeObjectToBuffer(envelope));
if (!requestSocket.TrySendFrame(_timeout, signedPayloadBytes, signedPayloadBytes.Length))
{
var message = $"Timeout exceeded while processing [{envelope.ActionType}] request.";
_logger.LogStatement(message, LogLevel.Error);
throw new Exception(message);
}
var responseReceived = requestSocket.TryReceiveFrameBytes(_timeout, out byte[] responseBytes);
...
var responseEnvelope = SerializationHelper.DeserializeObject<FileServiceResponseEnvelope>(responseBytes);
...
_logger.LogStatement($"Received response with payload of [{responseEnvelope.Payload.Length} bytes].", LogLevel.Info);
var destDir = downloadDetails.GetDestinationPath(responseEnvelope.FileId);
if (!Directory.Exists(destDir))
Directory.CreateDirectory(destDir);
var dest = Path.Combine(destDir, idsToFileNames[responseEnvelope.FileId]);
WriteMessageContents(responseEnvelope, dest, encryptionKey, macInitialVector);
}
}
I also know that TIFs have a very specific header, which looks something like this in raw bytes:
[73, 73, 42, 0, 8, 0, 0, 0, 20, 0...
It always begins with "II" (73, 73) or "MM" (77, 77) followed by 42 (probably a Hitchhiker's reference). I analyzed the bytes written by the utility. The last file always has a header that resembles this one. The rest are always random bytes; seemingly jumbled or mis-ordered image binary data. Any insight on this would be greatly appreciated because I can't wrap my mind around what I would even need to do to diagnose this.
UPDATE
I was able to figure out this problem with the help of elgonzo in the comments. Sometimes it isn't a direct answer that helps, but someone picking your brain until you look in the right place.

All right, as I suspected this was a dumb mistake (I had severe doubts that the File API was simply this flawed for so long). I just needed help thinking through it. There was an additional bit of code which I didn't post that was biting me, when I was retrieving the metadata for the file so that I could then request the file from our storage box.
byte[] encryptionKey = null;
byte[] macInitialVector = null;
...
using (var conn = new SqlConnection(ConnectionString))
using (var cmd = new SqlCommand(uploadedFileQuery, conn))
{
conn.Open();
var reader = cmd.ExecuteReader();
while (reader.Read())
{
FileServiceMessageEnvelope readAllEnvelope = null;
var originalFileName = reader["UploadedFileClientName"].ToString();
var fileId = Convert.ToInt64(reader["UploadedFileId"].ToString());
//var originalFileExtension = originalFileName.Substring(originalFileName.IndexOf('.'));
//_logger.LogStatement($"Scooped extension: {originalFileExtension}", LogLevel.Trace);
envelopes.Add(readAllEnvelope = new FileServiceMessageEnvelope
{
ActionType = FileServiceActionTypeEnum.ReadAll,
FileType = FileTypeEnum.UploadedFile,
FileName = reader["UploadedFileServerName"].ToString(),
FileId = fileId,
WorkerAuthorization = null,
BinaryTimestamp = DateTime.Now.ToBinary(),
Position = 0,
Count = Convert.ToInt32(reader["UploadedFileSize"]),
SignerFqdn = _messengerConfig.FullyQualifiedDomainName
});
readAllEnvelope.SignMessage(_messengerConfig.PrivateKeyBytes, _messengerConfig.PrivateKeyPassword);
signedPayload = new SecureMessage { Payload = new byte[0] };
signedPayload.SignMessage(_messengerConfig.PrivateKeyBytes, _messengerConfig.PrivateKeyPassword);
signedPayloadBytes = SerializationHelper.SerializeObjectToBuffer(signedPayload);
encryptionKey = (byte[])reader["UploadedFileEncryptionKey"];
macInitialVector = (byte[])reader["UploadedFileEncryptionMacInitialVector"];
}
conn.Close();
}
Eagle-eyed observers might realize that I have not properly coupled the encryptionKey and macInitialVector to the correct record, since each file has a unique key and vector. This means I was using the key for one of the files to decrypt all of them which is why they were all corrupt except for one file -- they were not properly decrypted. I solved this issue by coupling them together with the ID in a simple POCO and retrieving the appropriate key and vector for each file upon decryption.

How to use NAudio/SoundTouch to stream MP3 in ASP.NET MVC 5?

I am completely new to working with audio. I eventually want to stream an MP3 to a web page and allow the user to alter the tempo. I got an HTML5 audio element set up and it can stream an MP3 fine. I can import the MP3 into NAudio.AudioFileReader and stream that to the page and that also works fine using the following code:
string fn = Server.MapPath("~/Uploads/Music/" + filename);
AudioFileReader reader = new AudioFileReader(fn);
MemoryStream outputStream = new MemoryStream();
using (NAudio.Wave.WaveFileWriter waveFileWriter = new WaveFileWriter(outputStream, reader.WaveFormat))
{
byte[] bytes = new byte[reader.Length];
reader.Position = 0;
reader.Read(bytes, 0, (int)reader.Length);
waveFileWriter.Write(bytes, 0, bytes.Length);
waveFileWriter.Flush();
}
return File(outputStream.ToArray(), "audio/mp3");
I'm not even sure if this is the proper way to do this, but I modified some code I found online and this does work. However, when looking at the NAudio Varispeed demo which integrates the SoundTouch library and trying to incorporate it, it no longer works.
I modified my code like this:
string fn = Server.MapPath("~/Uploads/Music/" + filename);
AudioFileReader reader = new AudioFileReader(fn);
bool useTempo = true;
VarispeedSampleProvider speedControl = new VarispeedSampleProvider(reader, 100, new SoundTouchProfile(useTempo, false));
MemoryStream outputStream = new MemoryStream();
using (NAudio.Wave.WaveFileWriter waveFileWriter = new WaveFileWriter(outputStream, reader.WaveFormat))
{
byte[] bytes = new byte[reader.Length];
speedControl.Read(bytes.Select(b => (float)Convert.ToDouble(b)).ToArray(), 0, (int)reader.Length);
waveFileWriter.Write(bytes, 0, bytes.Length);
waveFileWriter.Flush();
}
return File(outputStream.ToArray(), "audio/mp3");
It builds and appears like it's working but when I hit play, I get no audio.
What am I doing wrong here? Is this not even a good way to accomplish what I want?

You are reading into a temporary array (created by ToArray), so the audio you read is lost.
Instead, declare a float[], read into that, and then write the contents of that into the waveFileWriter.
Also, it is very important to use the return value from Read which will indicate the number of samples actually written into the array.

C#: how to read a line from a stream and then start reading it from beginning?

I need to read the first line from a stream to determine the file's encoding, and then recreate the stream with that encoding.
The following code does not work correctly:
var r = response.GetResponseStream();
var sr = new StreamReader(r);
string firstLine = sr.ReadLine();
string encoding = GetEncodingFromFirstLine(firstLine);
string text = new StreamReader(r, Encoding.GetEncoding(encoding)).ReadToEnd();
The text variable doesn't contain the whole text. For some reason the first line and several lines after it are skipped.
I tried everything: closing the StreamReader, resetting it, calling a separate GetResponseStream... but nothing worked.
I can't get the response stream again as I'm getting this file from the internet, and redownloading it again would be bad performance wise.
Update
Here's what GetEncodingFromFirstLine() looks like:
public static string GetEncodingFromFirstLine(string line)
{
int encodingIndex = line.IndexOf("encoding=");
if (encodingIndex == -1)
{
return "utf-8";
}
return line.Substring(encodingIndex + "encoding=".Length).Replace("\"", "").Replace("'", "").Replace("?", "").Replace(">", "");
}
...
// true
Assert.AreEqual("windows-1251", GetEncodingFromFirstLine(@"<?xml version=""1.0"" encoding=""windows-1251""?>"));
** Update 2 **
I'm working with XML files, and the text variable is parsed as XML:
var feedItems = XElement.Parse(text);

Well you're asking it to detect the encoding... and that requires it to read data. That's reading it from the underlying stream, and you're then creating another StreamReader around the same stream.
I suggest you:
1. Get the response stream
2. Retrieve all the data into a byte array (or MemoryStream)
3. Detect the encoding (which should be performed on bytes, not text - currently you're already assuming UTF-8 by creating a StreamReader)
4. Create a MemoryStream around the byte array, and a StreamReader around that
It's not clear what your GetEncodingFromFirstLine method does... or what this file really is. More information may make it easier to help you.
EDIT: If this is to load some XML, don't reinvent the wheel. Just give the stream to one of the existing XML-parsing classes, which will perform the appropriate detection for you.

You need to change the current position in the stream to the beginning.
r.Position = 0;
string text = new StreamReader(r, Encoding.GetEncoding(encoding)).ReadToEnd();

I found the answer to my question here:
How can I read an Http response stream twice in C#?
Stream responseStream = CopyAndClose(resp.GetResponseStream());
// Do something with the stream
responseStream.Position = 0;
// Do something with the stream again
private static Stream CopyAndClose(Stream inputStream)
{
const int readSize = 256;
byte[] buffer = new byte[readSize];
MemoryStream ms = new MemoryStream();
int count = inputStream.Read(buffer, 0, readSize);
while (count > 0)
{
ms.Write(buffer, 0, count);
count = inputStream.Read(buffer, 0, readSize);
}
ms.Position = 0;
inputStream.Close();
return ms;
}

Out of Memory using TextWriter Stream with HttpWebRequest

Merged with How to free up memory after base64 convert.
Thanks for your great suggestions to an OOM (out of memory) problem I'm seeing in code intended to stream files for web services. [I hope it is OK to start another thread which provides a bit more detail.] From the suggestions, I shrunk the buffer size used to read from the file, and it looks like memory consumption is better, but I'm still seeing an OOM problem, and I'm seeing this problem with files sizes as small as 5MB. I potentially want to deal with files ten times larger.
My problem seems now to be with the use of TextWriter.
I create a request as follows [with a few edits to shrink the code]:
HttpWebRequest oRequest = (HttpWebRequest)WebRequest.Create(new Uri(strURL));
oRequest.Method = httpMethod;
oRequest.ContentType = "application/atom+xml";
oRequest.Headers["Authorization"] = getAuthHeader();
oRequest.ContentLength = strHead.Length + strTail.Length + longContentSize;
oRequest.SendChunked = true;
using (TextWriter tw = new StreamWriter(oRequest.GetRequestStream()))
{
tw.Write(strHead);
using (FileStream fileStream = new FileStream(strPath, FileMode.Open,
FileAccess.Read, System.IO.FileShare.ReadWrite))
{
StreamEncode(fileStream, tw);
}
tw.Write(strTail);
}
.....
Which calls into the routine:
public void StreamEncode(FileStream inputStream, TextWriter tw)
{
// For Base64 there are 4 bytes output for every 3 bytes of input
byte[] base64Block = new byte[9000];
int bytesRead = 0;
string base64String = null;
do
{
// read one block from the input stream
bytesRead = inputStream.Read(base64Block, 0, base64Block.Length);
// encode the base64 string
base64String = Convert.ToBase64String(base64Block, 0, bytesRead);
// write the string
tw.Write(base64String);
} while (bytesRead !=0 );
}
Should I use something other than TextWriter because of the potential large content? It seems very convenient for being able to create the whole payload of the request.
Is this totally the wrong approach? I want to be able to support very large files.

Modifying XMP data with C#

I'm using C# in ASP.NET version 2. I'm trying to open an image file, read (and change) the XMP header, and close it back up again. I can't upgrade ASP, so WIC is out, and I just can't figure out how to get this working.
Here's what I have so far:
Bitmap bmp = new Bitmap(Server.MapPath(imageFile));
MemoryStream ms = new MemoryStream();
StreamReader sr = new StreamReader(Server.MapPath(imageFile));
*[stuff with find and replace here]*
byte[] data = ToByteArray(sr.ReadToEnd());
ms = new MemoryStream(data);
originalImage = System.Drawing.Image.FromStream(ms);
Any suggestions?

How about this kinda thing?
byte[] data = File.ReadAllBytes(path);
... find & replace bit here ...
File.WriteAllBytes(path, data);
Also, I really recommend against using System.Drawing.Bitmap in an ASP.NET process, as it leaks memory and will crash or randomly fail every now and again (even MS admit this).
Here's the bit from MS about why System.Drawing.Bitmap isn't stable:
http://msdn.microsoft.com/en-us/library/system.drawing.aspx
"Caution:
Classes within the System.Drawing namespace are not supported for use within a Windows or ASP.NET service. Attempting to use these classes from within one of these application types may produce unexpected problems, such as diminished service performance and run-time exceptions."

Part 1 of the XMP spec 2012, page 10 specifically talks about how to edit a file in place without needing to understand the surrounding format (although they do suggest this as a last resort). The embedded XMP packet looks like this:
<?xpacket begin="■" id="W5M0MpCehiHzreSzNTczkc9d"?>
... the serialized XMP as described above: ...
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf= ...>
...
</rdf:RDF>
</x:xmpmeta>
... XML whitespace as padding ...
<?xpacket end="w"?>
In this example, ‘■’ represents the Unicode “zero width non-breaking space character” (U+FEFF) used as a byte-order marker.
Part 3 of the XMP spec (2010, page 12) also gives specific byte patterns (UTF-8, UTF-16, big/little endian) to look for when scanning the bytes. This would complement Chris' answer about reading the file in as a giant byte stream.

You can use the following functions to read/write the binary data:
public byte[] GetBinaryData(string path, int bufferSize)
{
MemoryStream ms = new MemoryStream();
using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Read))
{
int bytesRead;
byte[] buffer = new byte[bufferSize];
while((bytesRead = fs.Read(buffer,0,bufferSize))>0)
{
ms.Write(buffer,0,bytesRead);
}
}
return(ms.ToArray());
}
public void SaveBinaryData(string path, byte[] data, int bufferSize)
{
using (FileStream fs = File.Open(path, FileMode.Create, FileAccess.Write))
{
int totalBytesSaved = 0;
while (totalBytesSaved<data.Length)
{
int remainingBytes = Math.Min(bufferSize, data.Length - totalBytesSaved);
fs.Write(data, totalBytesSaved, remainingBytes);
totalBytesSaved += remainingBytes;
}
}
}
However, loading entire images to memory would use quite a bit of RAM. I don't know much about XMP headers, but if possible you should:
1. Load only the headers in memory
2. Manipulate the headers in memory
3. Write the headers to a new file
4. Copy the remaining data from the original file
