I'm trying to download the zip file from this link with C#:
http://dl.opensubtitles.org/en/download/sub/4860863
I've tried:
string ResponseText;
HttpWebRequest m = (HttpWebRequest)WebRequest.Create(o.link);
m.Method = WebRequestMethods.Http.Get;
using (HttpWebResponse response = (HttpWebResponse)m.GetResponse())
{
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
ResponseText = reader.ReadToEnd();
// ResponseText = HttpUtility.HtmlDecode(ResponseText);
XmlTextReader xmlr = new XmlTextReader(new StringReader(ResponseText));
}
}
and
WebRequest request = WebRequest.Create(o.link);
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
{
string contentType = response.ContentType;
// TODO: examine the content type and decide how to name your file
string filename = "test.zip";
// Download the file
using (Stream file = File.OpenWrite(filename))
{
// Remark: if the file is very big read it in chunks
// to avoid loading it into memory
byte[] buffer = new byte[response.ContentLength];
stream.Read(buffer, 0, buffer.Length);
file.Write(buffer, 0, buffer.Length);
}
}
But both return something weird, nothing that looks like the file I need.
I think the link is PHP-generated, but I'm not sure.
The OpenSubtitles API is not an option for me.
Many thanks.
The Content-Type of the response looks fine to me for your link:
Request URL:http://dl.opensubtitles.org/en/download/sub/4860863
Request Method:GET
Status Code:200 OK
Request Headers:
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Connection:keep-alive
Cookie:PHPSESSID=gk86hdrce96pu06kuajtue45a6; ts=1372177758
Host:dl.opensubtitles.org
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36
Response Headers:
Accept-Ranges:bytes
Age:0
Cache-Control:must-revalidate, post-check=0, pre-check=0
Connection:keep-alive
Content-Disposition:attachment; filename="the.dark.knight.(2008).dut.1cd.(4860863).zip"
Content-Length:48473
Content-Transfer-Encoding:Binary
Content-Type:application/zip
Date:Tue, 25 Jun 2013 16:29:45 GMT
Expires:Mon, 1 Apr 2006 01:23:45 GMT
Pragma:public
Set-Cookie:ts=1372177785; expires=Thu, 25-Jul-2013 16:29:45 GMT; path=/
X-Cache:MISS
X-Cache-Backend:web1
I checked your code and tested it with the link: a manual download produced a 48,473-byte file, while your code produced 48,564 bytes with zeros after offset 0xDC2, and comparing the two in a hex editor showed many differing parts. We may need to send more request headers with the request.
OK, I can resolve it now: set the cookie and read the stream in smaller chunks:
private void button1_Click(object sender, EventArgs e) {
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri("http://dl.opensubtitles.org/en/download/sub/4860863"));
//request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36";
//request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*//*;q=0.8";
//request.Headers["Accept-Encoding"] = "gzip,deflate,sdch";
request.Headers["Cookie"] = "PHPSESSID=gk86hdrce96pu06kuajtue45a6; ts=1372177758";
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream()) {
string contentType = response.ContentType;
// TODO: examine the content type and decide how to name your file
string filename = "test.zip";
// Download the file
using (Stream file = File.OpenWrite(filename)) {
// ReadFully already consumes the whole response stream, so no further Read is needed.
byte[] buffer = ReadFully(stream, 256);
file.Write(buffer, 0, buffer.Length);
}
}
}
/// <summary>
/// Reads data from a stream until the end is reached. The
/// data is returned as a byte array. An IOException is
/// thrown if any of the underlying IO calls fail.
/// </summary>
/// <param name="stream">The stream to read data from</param>
/// <param name="initialLength">The initial buffer length</param>
public static byte[] ReadFully(Stream stream, int initialLength) {
// If we've been passed an unhelpful initial length, just
// use 32K.
if (initialLength < 1) {
initialLength = 32768;
}
byte[] buffer = new byte[initialLength];
int read = 0;
int chunk;
while ((chunk = stream.Read(buffer, read, buffer.Length - read)) > 0) {
read += chunk;
// If we've reached the end of our buffer, check to see if there's
// any more information
if (read == buffer.Length) {
int nextByte = stream.ReadByte();
// End of stream? If so, we're done
if (nextByte == -1) {
return buffer;
}
// Nope. Resize the buffer, put in the byte we've just
// read, and continue
byte[] newBuffer = new byte[buffer.Length * 2];
Array.Copy(buffer, newBuffer, buffer.Length);
newBuffer[read] = (byte)nextByte;
buffer = newBuffer;
read++;
}
}
// Buffer is now too big. Shrink it.
byte[] ret = new byte[read];
Array.Copy(buffer, ret, read);
return ret;
}
EDIT: You don't need to set the Cookie at all; you'll get a different file, but a valid one. I assume the server adds extra info to the file when you revisit it.
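As a side note, on .NET 4 and later the same chunked copy can be written with Stream.CopyTo, which reads and writes in internal buffers so the whole response never has to fit into one pre-sized array. A minimal sketch using the same URL and file name as above:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri("http://dl.opensubtitles.org/en/download/sub/4860863"));
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (Stream file = File.Create("test.zip")) // File.Create truncates any existing file
{
    // CopyTo copies the stream in small chunks internally.
    stream.CopyTo(file);
}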
I created a bot with the Bot Framework and now I'm trying to use the Custom Speech service instead of the Bing Speech-to-Text service, which works fine. I have tried various ways to resolve the problem, but I get error 400 and I don't know how to solve it.
The method where I would like to get the text from a Stream of WAV PCM audio:
public static async Task<string> CustomSpeechToTextStream(Stream audioStream)
{
audioStream.Seek(0, SeekOrigin.Begin);
var customSpeechUrl = "https://westus.stt.speech.microsoft.com/speech/recognition/interactive/cognitiveservices/v1?cid=<MyEndPointId>";
string token = GetToken();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(customSpeechUrl);
request.SendChunked = true;
//request.Accept = @"application/json;text/xml";
request.Method = "POST";
request.ProtocolVersion = HttpVersion.Version11;
request.ContentType = "audio/wav; codec=\"audio/pcm\"; samplerate=16000";
request.Headers["Authorization"] = "Bearer " + token;
byte[] buffer = null;
int bytesRead = 0;
using (Stream requestStream = request.GetRequestStream())
{
// Read 1024 raw bytes from the input audio file.
buffer = new Byte[checked((uint)Math.Min(1024, (int)audioStream.Length))];
while ((bytesRead = audioStream.Read(buffer, 0, buffer.Length)) != 0)
{
requestStream.Write(buffer, 0, bytesRead);
}
requestStream.Flush();
}
string responseString = string.Empty;
// Get the response from the service.
using (WebResponse response = request.GetResponse()) // Here i get the error
{
using (StreamReader sr = new StreamReader(response.GetResponseStream()))
{
responseString = sr.ReadToEnd();
}
}
dynamic deserializedResponse = Newtonsoft.Json.JsonConvert.DeserializeObject(responseString);
if (deserializedResponse.RecognitionStatus == "Success")
{
return deserializedResponse.DisplayText;
}
else
{
return null;
}
}
At using (WebResponse response = request.GetResponse()) {} I get an exception (error 400).
Am I doing the HttpWebRequest the right way?
I read on the internet that the problem might be the audio file... but then why doesn't the Bing Speech service return this error with the same Stream?
In my case the problem was that I had a WAV audio stream that didn't have the file header that CRIS (Custom Speech Service) needs. The solution is to create a temporary WAV file, read it back, and copy it into a Stream to send as an array to CRIS:
byte[] buffer = null;
int bytesRead = 0;
using (Stream requestStream = request.GetRequestStream())
{
buffer = new Byte[checked((uint)Math.Min(1024, (int)audioStream.Length))];
while ((bytesRead = audioStream.Read(buffer, 0, buffer.Length)) != 0)
{
requestStream.Write(buffer, 0, bytesRead);
}
requestStream.Flush();
}
or copy it into a MemoryStream and send it as an array:
using (Stream requestStream = request.GetRequestStream())
{
requestStream.Write(audioStream.ToArray(), 0, audioStream.ToArray().Length);
requestStream.Flush();
}
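If writing a temporary WAV file is undesirable, the missing RIFF/WAV header can also be prepended in memory before sending the buffer. A minimal sketch, assuming 16 kHz, 16-bit, mono PCM (these format parameters are assumptions; adjust them to your actual audio):
// Prepends a standard RIFF/WAV header to raw PCM samples.
static byte[] AddWavHeader(byte[] pcm, int sampleRate = 16000, short bitsPerSample = 16, short channels = 1)
{
    using (var ms = new MemoryStream())
    using (var w = new BinaryWriter(ms))
    {
        int byteRate = sampleRate * channels * bitsPerSample / 8;
        short blockAlign = (short)(channels * bitsPerSample / 8);

        w.Write(Encoding.ASCII.GetBytes("RIFF"));
        w.Write(36 + pcm.Length);                 // total chunk size
        w.Write(Encoding.ASCII.GetBytes("WAVE"));
        w.Write(Encoding.ASCII.GetBytes("fmt "));
        w.Write(16);                              // fmt chunk size
        w.Write((short)1);                        // 1 = PCM
        w.Write(channels);
        w.Write(sampleRate);
        w.Write(byteRate);
        w.Write(blockAlign);
        w.Write(bitsPerSample);
        w.Write(Encoding.ASCII.GetBytes("data"));
        w.Write(pcm.Length);                      // data chunk size
        w.Write(pcm);
        return ms.ToArray();
    }
}
The returned array can then be written to the request stream exactly as in the snippets above.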
I've recently written a C# function that does a multipart form POST for uploading files. To track progress, I write the form data to the request stream 4096 bytes at a time and invoke a callback with each write. However, it seems that the request does not even get sent until GetResponseAsync() is called.
If that is the case, is reporting every 4096 bytes written to the request stream an accurate measure of upload progress?
If not, how can I accurately report progress? WebClient is out of the question for me; this is in a PCL Xamarin project.
private async Task<string> PostFormAsync (string postUrl, string contentType, byte[] formData)
{
try {
HttpWebRequest request = WebRequest.Create (postUrl) as HttpWebRequest;
request.Method = "POST";
request.ContentType = contentType;
request.Headers ["Cookie"] = Constants.Cookie;
byte[] buffer = new byte[4096];
int count = 0;
int length = 0;
using (Stream requestStream = await request.GetRequestStreamAsync ()) {
using (Stream inputStream = new MemoryStream (formData)) {
while ((count = await inputStream.ReadAsync (buffer, 0, buffer.Length)) > 0) {
await requestStream.WriteAsync (buffer, 0, count);
length += count;
int written = length; // capture a snapshot for the callback
Device.BeginInvokeOnMainThread (() => {
// use floating-point division, otherwise integer division truncates Progress to 0
_progressBar.Progress = (double)written / formData.Length;
});
}
}
}
_progressBar.Progress = 0;
WebResponse resp = await request.GetResponseAsync ();
using (Stream stream = resp.GetResponseStream ()) {
StreamReader respReader = new StreamReader (stream);
return respReader.ReadToEnd ();
}
} catch (Exception e) {
Debug.WriteLine (e.ToString ());
return String.Empty;
}
}
Please note that I am asking about monitoring the progress of an upload, 4096 bytes at a time, not a download.
I ended up accomplishing this by setting the WebRequest's AllowWriteStreamBuffering property to false and its SendChunked property to true.
HOWEVER, Xamarin PCL (Profile 78) does not let you access these properties of HttpWebRequest, so I had to instantiate my HttpWebRequest and return it from a dependency service in my platform-specific project (only tested on iOS).
public class WebDependency : IWebDependency
{
public HttpWebRequest GetWebRequest(string uri)
{
var request = WebRequest.Create (uri) as HttpWebRequest;
request.SendChunked = true;
request.AllowWriteStreamBuffering = false;
return request;
}
}
And then to instantiate my web request -
HttpWebRequest request = DependencyService.Get<IWebDependency>().GetWebRequest(uri);
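The post doesn't show IWebDependency itself; its shape follows from the implementation above. A hypothetical sketch of the interface and the registration the DependencyService lookup relies on:
// Shared (PCL) project: the interface that DependencyService.Get<IWebDependency>() resolves.
public interface IWebDependency
{
    HttpWebRequest GetWebRequest(string uri);
}

// Platform project (e.g. iOS): register the implementation so the lookup can find it.
// [assembly: Xamarin.Forms.Dependency(typeof(WebDependency))]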
I have the following code:
string url = "https://myurl.com/is/here/example?param1=100¶m2=200";
string post = "POST " + url + " HTTP/1.1\r\n" +
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n" +
"Accept-Encoding: gzip, deflate\r\n" +
"Accept-Language: en-US,en;q=0.5\r\n" +
"Connection: keep-alive\r\n" +
"Host: myurl.com\r\n" +
"User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0\r\n" +
"Content-type: application/json\r\n" +
"X-HTTP-Method-Override: GET\r\n" +
"X-MY-CUSTOM-HEADER: headervalue\r\n\r\n";
TcpClient tcp = new TcpClient("myurl.com", 443);
string returnData = string.Empty;
using (SslStream stream = new SslStream(tcp.GetStream()))
{
//Authenticate here...
byte[] data = Encoding.ASCII.GetBytes(post);
stream.Write(data, 0, data.Length);
stream.Flush();
stream.ReadByte();
data = new byte[4096];
var n = stream.Read(data, 0, data.Length);
using (MemoryStream ms = new MemoryStream(data, 0, n))
{
ms.ReadByte();
ms.ReadByte();
using (DeflateStream df = new DeflateStream(ms, CompressionMode.Decompress))
using (StreamReader rd = new StreamReader(df))
{
returnData = rd.ReadToEnd();
}
}
}
tcp.Close();
However, the response is always empty. Is the post string correct? What am I missing here?
Edit:
I'm using SslStream and it seems to be retrieving data. However, is there a way to just read until all the data has been received, instead of a single fixed-size Read?
It appears that you're declaring the content length to be the length of your URL (when expressed as UTF-8), rather than the length of the POST body.
If the content is not as long as the Content-Length header indicates, I would expect the remote server to wait for the rest of the content.
You're also connecting to port 443 and your URL starts with 'https', but you don't appear to be making any attempt to do the SSL negotiation.
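For illustration, a minimal sketch of the missing pieces: authenticate the SslStream before writing, then read until the stream ends instead of doing a single fixed-size Read. It reuses the post string from the question; note that with Connection: keep-alive the server may hold the socket open, so either send Connection: close or parse Content-Length from the response headers instead:
using (TcpClient tcp = new TcpClient("myurl.com", 443))
using (SslStream stream = new SslStream(tcp.GetStream()))
{
    // TLS handshake; the host name must match the server certificate.
    stream.AuthenticateAsClient("myurl.com");

    byte[] data = Encoding.ASCII.GetBytes(post);
    stream.Write(data, 0, data.Length);
    stream.Flush();

    // Keep reading until the server closes the connection.
    using (MemoryStream ms = new MemoryStream())
    {
        byte[] chunk = new byte[4096];
        int n;
        while ((n = stream.Read(chunk, 0, chunk.Length)) > 0)
        {
            ms.Write(chunk, 0, n);
        }
        // The body may still be gzip/deflate-compressed, as requested by Accept-Encoding.
        byte[] rawResponse = ms.ToArray();
    }
}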
I'm passing the stream this way:
StreamReader sr = new StreamReader(openFileDialog1.FileName);
byte[] fileStream = Utility.ReadFully(sr.BaseStream);
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri(baseAddress));
request.Method = "POST";
request.ContentType = "application/octet-stream";
Stream serverStream = request.GetRequestStream();
serverStream.Write(fileStream, 0, fileStream.Length);
serverStream.Close();
HttpWebResponse response2 = (HttpWebResponse)request.GetResponse();
if (response2.StatusCode == HttpStatusCode.OK)
{
MessageBox.Show(Utility.ReadResponse(response2));
}
-------------------------------------------------------------------------
public static byte[] ReadFully(Stream input)
{
byte[] buffer = new byte[16 * 1024];
using (MemoryStream ms = new MemoryStream())
{
if (input != null)
{
int read;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
ms.Write(buffer, 0, read);
}
}
return ms.ToArray();
}
}
Then handling it on the server:
public bool UploadPhotoStream(string someStringParam, Stream fileData)
{
string filePath = string.Format("{0}/{1}", "sdfgsdf87s7df8sd", "24asd54s4454d5f4g");
ProductPhoto newphoto = new ProductPhoto();
newphoto.FileSizeBytes = fileData.Length / 1024 / 1024;
newphoto.FileLocation = filePath;
...
}
Now I'm getting NotSupportedException when calling fileData.Length. I know it happens because the stream is closed. But how can I re-open it? Or what should I do so that when I pass the stream to the service I can still get its length?
Why don't you pass a Content-Length header? Your server can check the header and know exactly how many bytes of content are being sent. How you read the header depends on which HTTP framework you are using: ASP.NET Web API, classic WCF Web API, HttpListener, etc.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri(baseAddress));
request.Method = "POST";
request.ContentType = "application/octet-stream";
request.ContentLength = new FileInfo(openFileDialog1.FileName).Length;
Without a Content-Length header, an HTTP server can never know how many bytes are left to read. All it knows is that there is a stream, and it will read until there is no more data. This is also how your browser can display a progress bar when downloading something: it computes bytesDownloaded / Content-Length.
According to this post: https://stackoverflow.com/a/8239268/1160036
You can access the header like this from your web method.
long dataLength = long.Parse(HttpContext.Current.Request.Headers["Content-Length"]);
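Putting it together on the server side, a hypothetical sketch of the upload method that takes the size from the header instead of Stream.Length (the types and the MB calculation mirror the question):
public bool UploadPhotoStream(string someStringParam, Stream fileData)
{
    // The incoming stream is forward-only, so Stream.Length is not supported;
    // read the size from the Content-Length request header instead.
    long contentLength = long.Parse(HttpContext.Current.Request.Headers["Content-Length"]);

    string filePath = string.Format("{0}/{1}", "sdfgsdf87s7df8sd", "24asd54s4454d5f4g");
    ProductPhoto newphoto = new ProductPhoto();
    newphoto.FileSizeBytes = contentLength / 1024 / 1024;
    newphoto.FileLocation = filePath;
    // ... save fileData and newphoto as before ...
    return true;
}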
I need to download a text file from the internet using C#. The file size can be quite large and the information I need is always within the first 1000 bytes. Is this possible?
Stolen from here.
string GetWebPageContent(string url)
{
HttpWebRequest request;
const int bytesToGet = 1000;
request = WebRequest.Create(url) as HttpWebRequest;
//get first 1000 bytes
request.AddRange(0, bytesToGet - 1);
// the following is one alternative; adapt it to your needs
using (WebResponse response = request.GetResponse())
{
using (Stream stream = response.GetResponseStream())
{
byte[] buffer = new byte[bytesToGet];
int read = stream.Read(buffer, 0, bytesToGet);
Array.Resize(ref buffer, read);
return Encoding.ASCII.GetString(buffer);
}
}
}
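Note that a range request only limits the transfer when the server honours the Range header; if it doesn't, the whole file is sent, but the method above still reads at most the first 1000 bytes. Example usage (the URL is illustrative):
string head = GetWebPageContent("http://example.com/largefile.txt");
Console.WriteLine(head);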
(Edited as requested in the comments... ;) )
I did this as an answer to your newer question. You could add the Range header too if you want, but I left it out.
string GetWebPageContent(string url)
{
HttpWebRequest request;
const int bytesToGet = 1000;
request = WebRequest.Create(url) as HttpWebRequest;
var buffer = new char[bytesToGet];
int read = 0;
using (WebResponse response = request.GetResponse())
{
using (StreamReader sr = new StreamReader(response.GetResponseStream()))
{
// Read may return fewer characters than requested.
read = sr.Read(buffer, 0, bytesToGet);
}
}
return new string(buffer, 0, read);
}