How to determine if an HTTP response is complete - c#

I am working on building a simple proxy which will log certain requests which are passed through it. The proxy does not need to interfere with the traffic being passed through it (at this point in the project) and so I am trying to do as little parsing of the raw request/response as possible during the process (the request and response are pushed off to a queue to be logged outside of the proxy).
My sample works fine, except that I cannot reliably tell when the "response" is complete, so I have connections left open for longer than needed. The relevant code is below:
var request = getRequest(url);
byte[] buffer;
int bytesRead = 1;
var dataSent = false;
var timeoutTicks = DateTime.Now.AddMinutes(1).Ticks;
Console.WriteLine(" Sending data to address: {0}", url);
Console.WriteLine(" Waiting for response from host...");
using (var outboundStream = request.GetStream()) {
while (request.Connected && (DateTime.Now.Ticks < timeoutTicks)) {
while (outboundStream.DataAvailable) {
dataSent = true;
buffer = new byte[OUTPUT_BUFFER_SIZE];
bytesRead = outboundStream.Read(buffer, 0, OUTPUT_BUFFER_SIZE);
if (bytesRead > 0) { _clientSocket.Send(buffer, bytesRead, SocketFlags.None); }
Console.WriteLine(" pushed {0} bytes to requesting host...", _backBuffer.Length);
}
if (request.Connected) { Thread.Sleep(0); }
}
}
Console.WriteLine(" Finished with response from host...");
Console.WriteLine(" Disconnecting socket");
_clientSocket.Shutdown(SocketShutdown.Both);
My question is whether there is an easy way to tell that the response is complete without parsing headers. Given that this response could be anything (encoded, encrypted, gzip'ed etc), I don't want to have to decode the actual response to get the length and determine if I can disconnect my socket.

As David pointed out, connections should remain open for a period of time. You should not close connections unless the client side does that (or if the keep alive interval expires).
Changing to HTTP/1.0 will not work since you are a server and it's the client that will specify HTTP/1.1 in the request. Sure, you can send an error message with HTTP/1.0 as version and hope that the client changes to 1.0, but it seems inefficient.
HTTP messages look like this:
REQUEST LINE
HEADERS
(empty line)
BODY
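The only reliable way to know when a response is done is to look at its framing headers, which usually means the Content-Length header. Simply search for "Content-Length:" in the received buffer and extract everything up to the linefeed (but trim the found value before converting it to an int). Note that a response sent with Transfer-Encoding: chunked has no Content-Length and ends with a zero-length chunk instead.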
The other alternative is to use the parser in my webserver to get all headers. It should be quite easy to use just the parser and nothing more from the library.
Update: There is a better parser here: HttpParser.cs
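The header search described above could look roughly like the sketch below. It is not taken from either of the linked parsers; the class and method names (HttpHeaderSketch, FindContentLength) are made up for illustration, and it assumes the whole header block has already arrived in the buffer.

```csharp
using System;
using System.Text;

public static class HttpHeaderSketch
{
    // Returns the Content-Length value found in the header block of `buffer`,
    // or -1 if the headers are incomplete or carry no such header.
    public static long FindContentLength(byte[] buffer, int count)
    {
        // The ASCII decode is only used to locate header text; any non-ASCII
        // body bytes simply turn into '?' and are never looked at.
        string text = Encoding.ASCII.GetString(buffer, 0, count);
        int headerEnd = text.IndexOf("\r\n\r\n", StringComparison.Ordinal);
        if (headerEnd < 0) return -1; // blank line not received yet

        string[] lines = text.Substring(0, headerEnd)
                             .Split(new[] { "\r\n" }, StringSplitOptions.None);
        foreach (string line in lines)
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue; // status line or malformed header
            string name = line.Substring(0, colon).Trim();
            if (!name.Equals("Content-Length", StringComparison.OrdinalIgnoreCase)) continue;

            long value;
            if (long.TryParse(line.Substring(colon + 1).Trim(), out value)) return value;
        }
        return -1;
    }
}
```

With the value in hand, the proxy knows the response is complete once it has forwarded the header block, the blank line, and that many body bytes. A result of -1 means some other rule applies (chunked encoding, or the server closing the connection).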

If you make a HTTP/1.0 request instead of 1.1, the server should close the connection as soon as it's through since it doesn't need to keep the connection open for another request.
Other than that, you really need to parse the content length header in the response to get the best value.

Using blocking IO and multiple threads might be your answer. Specifically
using(var response = request.GetResponse())
using(var stream = response.GetResponseStream())
using(var reader = new StreamReader(stream))
data = reader.ReadToEnd();
This is for textual data, however binary handling is similar.
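For the binary case, a minimal sketch might copy the response stream into a MemoryStream instead of going through a StreamReader. The helper name ReadAllBytes and its class are hypothetical; the only framework call it relies on is Stream.CopyTo.

```csharp
using System.IO;

public static class StreamSketch
{
    // Blocks until `source` reports end-of-stream, then returns every byte read.
    public static byte[] ReadAllBytes(Stream source)
    {
        using (var memory = new MemoryStream())
        {
            source.CopyTo(memory); // loops over Read() until it returns 0
            return memory.ToArray();
        }
    }
}
```

Passed the stream from response.GetResponseStream(), this returns once the response body has been fully read. On a raw NetworkStream it would only return when the other side closes the connection, which is the problem the question started with.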

Related

TcpListener + TcpClient - wait for client to read data before closing

I am building a simple HTTP server for PDF files with TcpClient. It works well, however the TcpClient closes before the browser has finished downloading the PDF. How can I force TcpClient to wait until the remote client has received everything that was written before closing?
//pdf is byte[]
TcpListener server = new TcpListener(address, port);
server.Start();
TcpClient client = server.AcceptTcpClient(); //Wait for connection
var ns = client.GetStream();
string headers;
using (var writer = new StringWriter())
{
writer.WriteLine("HTTP/1.1 200 OK");
//writer.WriteLine("Accept: text/html");
writer.WriteLine("Content-type: application/pdf");
writer.WriteLine("Content-length: " + pdf.Length);
writer.WriteLine();
headers = writer.ToString();
}
var bytes = Encoding.UTF8.GetBytes(headers);
ns.Write(bytes, 0, bytes.Length);
ns.Write(pdf, 0, pdf.Length);
Thread.Sleep(TimeSpan.FromSeconds(10)); //Adding this line fixes the problem....
client.Close();
server.Stop();
Can I replace that ugly 'Thread.Sleep' hack?
EDIT: The code below works, based on the answers:
TcpListener Server = null;
public void StartServer()
{
Server = new TcpListener(IPAddress.Any, Port);
Server.Start();
Server.BeginAcceptTcpClient(AcceptClientCallback, null);
}
void AcceptClientCallback(IAsyncResult result)
{
var client = Server.EndAcceptTcpClient(result);
var ns = client.GetStream();
string headers;
byte[] pdf = //get pdf
using (var writer = new StringWriter())
{
writer.WriteLine("HTTP/1.1 200 OK");
//writer.WriteLine("Accept: text/html");
writer.WriteLine("Content-type: application/pdf");
writer.WriteLine("Content-length: " + pdf.Length);
writer.WriteLine();
headers = writer.ToString();
}
var bytes = Encoding.UTF8.GetBytes(headers);
ns.Write(bytes, 0, bytes.Length);
ns.Write(pdf, 0, pdf.Length);
client.Client.Shutdown(SocketShutdown.Send);
byte[] buffer = new byte[1024];
int byteCount;
while ((byteCount = ns.Read(buffer, 0, buffer.Length)) > 0)
{
}
client.Close();
Server.Stop();
}
The main issue in your code is that your server (the file host) neglected to read from the socket it's writing the file to, and so has no way to detect, never mind wait for, the client shutting down the connection.
The code could be way better, but at a minimum you could probably get it to work by adding something like this just before your client.Close(); statement:
// Indicate the end of the bytes being sent
client.Client.Shutdown(SocketShutdown.Send);
// arbitrarily-sized buffer...most likely nothing will ever be written to it
byte[] buffer = new byte[4096];
int byteCount;
while ((byteCount = ns.Read(buffer, 0, buffer.Length)) > 0)
{
// ignore any data read here
}
When an endpoint initiates a graceful closure (e.g. by calling Socket.Shutdown(SocketShutdown.Send);), that will allow the network layer to identify the end of the stream of data. Once the other endpoint has read all of the remaining bytes that the remote endpoint has sent, the next read operation will complete with a byte length of zero. That's that other endpoint's signal that the end-of-stream has been reached, and that it's time to close the connection.
Either endpoint can initiate the graceful closure with the "send" shutdown reason. The other endpoint can acknowledge it once it's finished sending whatever it wants to send by using the "both" shutdown reason, at which point both endpoints can close their sockets (or streams or listeners or whatever other higher-level abstraction they might be using to wrap the socket).
Of course, in a properly-implemented protocol, you'd know in advance whether any data would ever actually be sent by the remote endpoint. If you know none ever will be, you could get away with a zero-length buffer, and if you know some data might be sent back from the client, then you'd actually do something with that data (as opposed to the empty loop body above).
In any case, the above is strictly a kludge, to get the already-kludged code you posted to work. Please don't mistake it for something intended to be seen in production-quality code.
All that said, the code you posted is a long way from being all that good. You aren't just implementing a basic TCP connection, but apparently are trying to reimplement the HTTP protocol. There's no point in doing that, as .NET already has HTTP server functionality built in (see e.g. System.Net.HttpListener). If you do intend to reinvent the HTTP server, you need a lot more code than the code you posted. The lack of error-handling alone is a major flaw, and will cause all kinds of headaches.
If you intend to write low-level network code, you should do a lot more research and experimentation. One very good resource is the Winsock Programmer’s FAQ. Its primary focus is, of course, programmers targeting the Winsock API. But there is a wealth of general-purpose information there as well, and in any case all of the various socket APIs are very similar, as they are all based on the same low-level concepts.
You may also want to review various existing Stack Overflow Q&A. Here are a couple of ones closely related to your specific problem:
How to correctly use TPL with TcpClient?
Send a large file over tcp connection
Do be careful though. There's almost as much bad advice out there as good. There's no shortage of people going around acting like they are experts in network programming when they aren't, so take everything you read with a grain of salt (including my advice above!).
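For reference, the graceful-close sequence described in this answer (write, shut down the send side, read until zero bytes, then close) can be wrapped in one helper. This is only a sketch under the same assumptions as the kludge itself: the helper name SendThenDrain is invented, there is no error handling, and it assumes the peer really does close its end.

```csharp
using System.Net.Sockets;

public static class GracefulCloseSketch
{
    // Sends `payload`, signals end-of-data, then waits for the peer to close.
    // Returns how many bytes the peer sent back while draining (normally 0).
    public static int SendThenDrain(TcpClient client, byte[] payload)
    {
        NetworkStream ns = client.GetStream();
        ns.Write(payload, 0, payload.Length);

        // Tell the peer that no more bytes will follow.
        client.Client.Shutdown(SocketShutdown.Send);

        // A zero-length read means the peer has closed its side as well.
        var scratch = new byte[4096];
        int total = 0;
        int n;
        while ((n = ns.Read(scratch, 0, scratch.Length)) > 0)
        {
            total += n; // anything read here is ignored
        }

        client.Close();
        return total;
    }
}
```

A browser using keep-alive may hold the connection open for a while, so real code would probably also set client.ReceiveTimeout and handle the resulting IOException.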

C#, clear HttpListenerContext.Response.OutputStream

I have a web-service using HttpListener.
I have noticed this thing:
HttpListenerContext context = listener.GetContext();
...
context.Response.StatusCode = 200;
context.Response.OutputStream.Write(buffer, 0, bufferSize);
context.Response.StatusCode = 500;
context.Response.OutputStream.Close();
In this case the client receives status code 200, so once I have written some data to the output network stream I can't change the status code because, I suppose, it has already been written to the response stream.
What I want: after I have started writing a response to the output stream, in some cases I want to "abort and reset" the response, clear the output stream (so the client won't receive any data in the HTTP response body), and change the status code.
I have no idea how to clear the output stream and change the status code. The two lines below don't help; they throw exceptions.
context.Response.OutputStream.SetLength(0);
context.Response.OutputStream.Position = 0;
I suppose that the program only writes the buffered data to the network device after I call context.Response.OutputStream.Close(), and that until then the data is stored in RAM and we can reset it, can't we?
EDIT: It seems that writing to context.Response.OutputStream sometimes takes too much time, from 100 to 1000 ms. That's why I would like to just interrupt the write, if that's possible.
You could use a MemoryStream to cache the answer and, once you are sure it is complete, set the status to 200 and return it (e.g. with Stream.CopyTo).
You can't "clear" the OutputStream, since it isn't stored (for long), instead it is sent right away to the client, so you can't edit it anymore.
Apart from that, HTTP does not offer a way to gracefully say "DATADATADATA... oh forget that, this was wrong, use the Status Code 500 instead.". You only can try to kill the TCP connection (TCP RST instead of TCP FIN) and hope that the client will handle failing to continue reading on the connection in a suitable way, after it probably already started to process the data you've already sent.
Try context.Response.Abort() before closing, this won't allow you to set a status code, but will at least communicate that something went wrong.
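The MemoryStream approach from this answer might be sketched as follows. The names BufferedResponseSketch and Render are hypothetical, and treating any exception as a 500 is an assumption made to keep the example short.

```csharp
using System;
using System.IO;

public static class BufferedResponseSketch
{
    // Runs `render` against an in-memory buffer. On success returns 200 and the
    // buffered body; if `render` throws, returns 500 and an empty body.
    public static int Render(Action<Stream> render, out byte[] body)
    {
        using (var buffer = new MemoryStream())
        {
            try
            {
                render(buffer);
                body = buffer.ToArray();
                return 200;
            }
            catch (Exception)
            {
                // Nothing has reached the client yet, so the status can still change.
                body = new byte[0];
                return 500;
            }
        }
    }
}
```

The caller would then assign the returned value to context.Response.StatusCode, set context.Response.ContentLength64 to body.Length, and only after that write body to context.Response.OutputStream. The cost is that the whole response sits in memory before the first byte is sent.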

Connecting to a TCP server unable to get response

I have a TCP client that successfully connects to an external Server and I am sending a request to the server which is received successfully.
However when I try and receive the response it just stalls and never receives it.
var message = "Hello World!";
var port = 9999; //Changed for question
var ip = "100.100.100.100"; //Changed for question
var tcpclnt = new TcpClient();
Console.WriteLine("Connecting.....");
await tcpclnt.ConnectAsync(ip, port);
Console.WriteLine("Connected");
Stream stm = tcpclnt.GetStream();
byte[] ba = Encoding.ASCII.GetBytes(message);
Console.WriteLine("Transmitting {0}", message);
await stm.WriteAsync(ba, 0, ba.Length);
byte[] bb = new byte[100];
var bytesRecieved = await stm.ReadAsync(bb, 0, 100);
var response = new StringBuilder();
foreach (byte t in bb)
response.Append(Convert.ToChar(t));
Console.WriteLine("Received {0}", response);
tcpclnt.Close();
I have the correct port open on my computer and on my router.
There are many things wrong with your code, but one of the biggies is that the other side has no way of knowing when you're done transmitting.
TCP is a stream-based protocol, not a message-based one. That means that there's no 1:1 mapping between sends on one side, and reads on the other. It's like writing and reading a file stream, not like sending e-mail.
This among other things means that unless you're actually working with streamed data, you'll need some way to distinguish messages in the stream from each other. For example, by adding \r\n to each of your strings - but that's something the client and the server have to agree on. You'll have to also show the server code if you want more specific help.
As for the other mistakes:
You can't ignore the return value of ReadAsync - it tells you how much data was actually read.
You can't just wave-away the encoding issues - both sides have to explicitly use an (agreed upon) encoding for writing and reading. In your case, this would mean using Encoding.ASCII.GetString instead of Convert.ToChar.
Related to the first point, you can't just call ReadAsync once and expect to get the whole message (or even expect to get exactly one message). For anything realistic, you'll need a loop, reading data as long as there is some.
No handling of graceful shutdown. Not important if you're only doing HTTP-style "request -> response", but problematic otherwise.
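Several of these points (use the ReadAsync return value, loop until a full message has arrived, decode with an agreed encoding) can be illustrated together. The sketch below assumes a "\r\n" terminator and ASCII text, which the real client and server would have to agree on; ReadMessageAsync is an invented name, and bytes that arrive after the terminator are simply dropped to keep it short.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class FramingSketch
{
    // Reads from `stream` until a "\r\n" terminator arrives and returns the text
    // before it. Returns null if the stream ends before a terminator is seen.
    public static async Task<string> ReadMessageAsync(Stream stream)
    {
        var received = new MemoryStream();
        var chunk = new byte[256];
        while (true)
        {
            int n = await stream.ReadAsync(chunk, 0, chunk.Length);
            if (n == 0) return null; // peer closed before sending a full message

            // Only the first n bytes of the chunk are valid data.
            received.Write(chunk, 0, n);

            string text = Encoding.ASCII.GetString(received.ToArray());
            int end = text.IndexOf("\r\n", StringComparison.Ordinal);
            if (end >= 0) return text.Substring(0, end);
        }
    }
}
```

The sending side would need to append "\r\n" to each message for this to terminate.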

POST request using sockets C#

I'm trying to make an auction sniper for a site. To place a bid you need to send 4 parameters(and cookies of course) to /auction/place_bid. I need to use sockets, not HttpWebRequest. Here's the code:
string request1 = "POST /auction/place_bid HTTP/1.1\r\nHost: *host here*\r\nConnection: Keep-Alive\r\nUser-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)\r\nAccept: /*\r\nContent-Type: application/x-www-form-urlencoded; charset=UTF-8\r\nX-Requested-With: XMLHttpRequest\r\n" + cookies +"\r\n";
string request3 = "token=" + token + "&aid=" + aid + "&bidReq=" + ptzReq + "&recaptcha_challenge_field=" + rcf + "&recaptcha_response_field=" + rrf+"\r\n\r\n";
string request2 = "Content-Length: " + (Encoding.UTF8.GetByteCount(request1+request3)+23).ToString() + "\r\n";
byte[] dataSent = Encoding.UTF8.GetBytes(request1+request2+request3);
byte[] dataReceived = new byte[10000];
Socket socket = ConnectSocket(server, 80);
if (socket == null)
{
return null;
}
socket.Send(dataSent, dataSent.Length, 0);
int bytes = 0;
string page = "";
do
{
bytes = socket.Receive(dataReceived, dataReceived.Length, 0);
page = page + Encoding.ASCII.GetString(dataReceived, 0, bytes);
}
while (bytes > 0);
return page;
When I try to receive the webpage, Visual Studio says that "Operation on an unblocked socket cannot be completed immediately"; when I add
socket.Blocking = true;
My application stops responding and after ~1 minute it returns the page, but it's empty! When I make a GET request it works perfectly. I hope you can help me. By the way, this is my first time using sockets, so my code is pretty bad, sorry about that.
*I'm using a ConnectSocket class, which was given as an example at msdn (The link leads to Russian MSDN, sorry, I didn't find the same article in English, but you'll understand the code anyway)
The Content-Length header should indicate the size of the content. You're setting it to the total size of your headers and content.
(Encoding.UTF8.GetByteCount(request1+request3)+23).ToString()
Since the content part of your message is just request3, the server is patiently waiting for ByteCount(request1)+23 more bytes of content which you never send.
Try this instead:
"Content-Length: " + Encoding.UTF8.GetByteCount(request3).ToString() + "\r\n"
Another issue looks like your loop:
do
{
bytes = socket.Receive(dataReceived, dataReceived.Length, 0);
page = page + Encoding.ASCII.GetString(dataReceived, 0, bytes);
}
while (bytes > 0);
A non-blocking socket never waits for data: if nothing has arrived yet, Receive() throws a SocketException (WSAEWOULDBLOCK), which is the "cannot be completed immediately" error you saw. With socket.Blocking = true, Receive() returns 0 only once the server has closed the connection, and because you asked for Connection: Keep-Alive the loop sits there until the server gives up on the connection.
Ideally, you should keep calling Receive() until you have received the end of the headers, look up the Content-Length header in them, then read Content-Length more bytes and stop.
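To make the Content-Length fix concrete, here is one way the request bytes could be assembled. It is a sketch rather than a drop-in replacement: PostRequestSketch.Build is an invented helper, cookies and the other headers from the question are left out, and Connection: close is my own substitution so that a "read until Receive() returns 0" loop ends when the server is done. Note the blank line between headers and body, and that Content-Length counts only the body bytes.

```csharp
using System;
using System.Text;

public static class PostRequestSketch
{
    // Builds a minimal HTTP/1.1 POST with a form-encoded body.
    public static byte[] Build(string host, string path, string formBody)
    {
        byte[] body = Encoding.UTF8.GetBytes(formBody);
        string headers =
            "POST " + path + " HTTP/1.1\r\n" +
            "Host: " + host + "\r\n" +
            "Content-Type: application/x-www-form-urlencoded; charset=UTF-8\r\n" +
            "Content-Length: " + body.Length + "\r\n" + // body bytes only
            "Connection: close\r\n" +                    // assumption, see lead-in
            "\r\n";                                      // blank line ends the headers
        byte[] head = Encoding.ASCII.GetBytes(headers);

        var request = new byte[head.Length + body.Length];
        Buffer.BlockCopy(head, 0, request, 0, head.Length);
        Buffer.BlockCopy(body, 0, request, head.Length, body.Length);
        return request;
    }
}
```

The result can be passed straight to socket.Send(request).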
Since you're using sockets, you really have to re-implement the HTTP protocol.
As people have already pointed out: HttpWebRequest is not the cause of your performance issues. Switching to a socket implementation will not affect anything.
The fact is that HttpWebRequest can do zillions of stupid things if it wants to, and it will still be faster than the time it takes to get stuff from the webserver.
Switching to a socket implementation might speed things up if you have good knowledge of sockets AND the HTTP protocol. You clearly do not have that, so I would recommend that you go back to HttpWebRequest again.
You might want to use WebClient if you are going to fetch lots of pages from the same webserver since it will keep the connection alive.
Update
I don't need a lot of connections, I need to make 1 request a time, and it should be as fast as it possible
Well. Then it doesn't really matter which implementation you use. The network latency will ALWAYS be a lot larger than the actual HTTP client implementation. Building a HTTP request doesn't take very much resources, parsing a response doesn't do that either.

The operation has timed out with WebClient.DownloadFile and correct url's

I am batch uploading products to a database.
I am downloading the images from their URLs to the site, to be used for the products.
The code I have written works fine for the first 25 iterations (always that number for some reason), but then throws a System.Net.WebException: "The operation has timed out".
if (!File.Exists(localFilename))
{
using (WebClient Client = new WebClient())
{
Client.DownloadFile(remoteFilename, localFilename);
}
}
I checked the remote url it was requesting and it is a valid image url that returns an image.
Also, when I step through it with the debugger, I don't get the timeout error.
HELP! ;)
If I were in your shoes, here's a few possibilities I'd investigate:
if you're running this code from multiple threads, you may be bumping up against the System.Net.ServicePointManager.DefaultConnectionLimit property. Try increasing it to 50-100 when you start up your app. note that I don't think this is your problem, but trying this is easier than the other stuff below. :-)
another possibility is that you're swamping the server. This is usually hard to do with a single-threaded client, but is possible since multiple other clients may be hitting the server also. But because the problem always happens at #25, this seems unlikely since you'd expect to see more variation.
you may be running into a problem with keepalive HTTP connections backing up between your client and the server. this also seems unlikely.
the hard cutoff of 25 makes me think that this may be a proxy or firewall limit, either on your end or the server's, where >25 connections made from one client IP to one server (or proxy) will get throttled.
My money is on the last one, since it always breaks at a nice round number of requests, and stepping through in the debugger (aka slower!) doesn't trigger the problem.
To test all this, I'd start with the easy thing: stick in a delay (Thread.Sleep) before each HTTP call, and see if the problem goes away. If it does, reduce the delay until the problem comes back. If it doesn't, increase the delay up to a large number (e.g. 10 seconds) until the problem goes away. If it doesn't go away with a 10 second delay, that's truly a mystery and I'd need more info to diagnose.
If it does go away with a delay, then you need to figure out why-- and whether the limit is permanent (e.g. server's firewall which you can't change) or something you can change. To get more info, you'll want to time the requests (e.g. check DateTime.Now before and after each call) to see if you see a pattern. If the timings are all consistent and suddenly get huge, that suggests a network/firewall/proxy throttling. If the timings gradually increase, that suggests a server you're gradually overloading and lengthening its request queue.
In addition to timing the requests, I'd set the timeout of your webclient calls to be longer, so you can figure out if the timeout is infinite or just a bit longer than the default. To do this, you'll need an alternative to the WebClient class, since it doesn't support a timeout. This thread on MSDN Forums has a reasonable alternative code sample.
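One commonly used workaround, separate from the linked forum sample, is to subclass WebClient and set the timeout on the request it creates. The class name TimeoutWebClient is made up; the sketch relies on the protected WebClient.GetWebRequest method and the WebRequest.Timeout property, and is written with .NET Framework in mind.

```csharp
using System;
using System.Net;

public class TimeoutWebClient : WebClient
{
    // Timeout applied to every request this client creates, in milliseconds.
    public int TimeoutMilliseconds { get; set; }

    public TimeoutWebClient(int timeoutMilliseconds)
    {
        TimeoutMilliseconds = timeoutMilliseconds;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request != null)
        {
            request.Timeout = TimeoutMilliseconds;
        }
        return request;
    }
}
```

Swapping new WebClient() for new TimeoutWebClient(120000) in the download loop would then show whether the stalled request ever completes or simply hangs.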
An alternative to adding timing in your code is to use Fiddler:
download fiddler and start it up.
set your webclient code's Proxy property to point to the fiddler proxy (localhost:8888)
run your app and look at fiddler.
It seems that WebClient is not closing the Response object it uses when it is done. In your case this leaves many responses open at the same time, and with a limit of 25 connections on the remote server you get the timeout exception. When you debug, the responses opened earlier get closed due to their inner timeout, etc.
(I inspected WebClient with Reflector and couldn't find an instruction for closing the response.)
I propose that you use HttpWebRequest & HttpWebResponse so that you can clean up objects after each download:
HttpWebRequest request;
HttpWebResponse response = null;
try
{
FileStream fs;
Stream s;
byte[] read;
int count;
read = new byte[256];
request = (HttpWebRequest)WebRequest.Create(remoteFilename);
request.Timeout = 30000;
request.AllowWriteStreamBuffering = false;
response = (HttpWebResponse)request.GetResponse();
s = response.GetResponseStream();
fs = new FileStream(localFilename, FileMode.Create);
while((count = s.Read(read, 0, read.Length)) > 0)
{
// the loop condition performs the read; a second Read here would drop data
fs.Write(read, 0, count);
}
fs.Close();
s.Close();
}
catch (System.Net.WebException)
{
//....
}finally
{
//Close Response
if (response != null)
response.Close();
}
Here's a slightly simplified version of manji's answer:
private static void DownloadFile(Uri remoteUri, string localPath)
{
var request = (HttpWebRequest)WebRequest.Create(remoteUri);
request.Timeout = 30000;
request.AllowWriteStreamBuffering = false;
using (var response = (HttpWebResponse)request.GetResponse())
using (var s = response.GetResponseStream())
using (var fs = new FileStream(localPath, FileMode.Create))
{
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = s.Read(buffer, 0, buffer.Length)) > 0)
{
// the loop condition performs the read; a second Read here would drop data
fs.Write(buffer, 0, bytesRead);
}
}
}
I had the same problem and solved it by adding these lines to the configuration file app.config:
<system.net>
<connectionManagement>
<add address="*" maxconnection="100" />
</connectionManagement>
</system.net>
