Locking with asynchronous HttpWebRequest - C#

I have an object that downloads a file from a server, saves it into Isolated Storage asynchronously, and provides a GetData method to retrieve the data. Would I use

IsolatedStorageFile storageObj; //initialized in the constructor

lock (storageObj)
{
    //save code
}

in the response callback, and

lock (storageObj)
{
    //load code
}

in the GetData method?
Edit: I'll give some context here.

The app (for Windows Phone) needs to download and cache multiple files from a server, so I've created a type that takes two strings (a URI and a filename), requests the data from the given URI, and saves it. The same object also has the GetData method. Here's the code (simplified a bit):
public class ServerData : INotifyPropertyChanged
{
    public readonly string ServerUri;
    public readonly string Filename;

    public event PropertyChangedEventHandler PropertyChanged;

    IsolatedStorageFile appStorage;
    DownloadState _downloadStatus = DownloadState.NotStarted;

    public DownloadState DownloadStatus
    {
        protected set
        {
            if (_downloadStatus == value) return;
            _downloadStatus = value;
            OnPropertyChanged(new PropertyChangedEventArgs("DownloadStatus"));
        }
        get { return _downloadStatus; }
    }

    public ServerData(string serverUri, string filename)
    {
        ServerUri = serverUri;
        Filename = filename;
        appStorage = IsolatedStorageFile.GetUserStoreForApplication();
    }

    protected virtual void OnPropertyChanged(PropertyChangedEventArgs args)
    {
        if (PropertyChanged != null)
            PropertyChanged(this, args);
    }

    public void RequestDataFromServer()
    {
        DownloadStatus = DownloadState.Downloading;

        // This first bit adds a random unused query to the Uri,
        // so Silverlight won't cache the request.
        Random rand = new Random();
        StringBuilder uriText = new StringBuilder(ServerUri);
        uriText.AppendFormat("?YouHaveGotToBeKiddingMeHack={0}",
            rand.Next().ToString());
        Uri uri = new Uri(uriText.ToString(), UriKind.Absolute);

        HttpWebRequest serverRequest = (HttpWebRequest)WebRequest.Create(uri);
        // ServerRequestUpdateState is a small helper that carries the
        // request/response pair across the async callback.
        ServerRequestUpdateState serverState = new ServerRequestUpdateState();
        serverState.AsyncRequest = serverRequest;
        serverRequest.BeginGetResponse(new AsyncCallback(RequestResponse),
            serverState);
    }

    void RequestResponse(IAsyncResult asyncResult)
    {
        var serverState = (ServerRequestUpdateState)asyncResult.AsyncState;
        var serverRequest = (HttpWebRequest)serverState.AsyncRequest;
        Stream serverStream;
        try
        {
            // End the async request.
            serverState.AsyncResponse =
                (HttpWebResponse)serverRequest.EndGetResponse(asyncResult);
            serverStream = serverState.AsyncResponse.GetResponseStream();
            Save(serverStream);
            serverStream.Dispose();
        }
        catch (WebException)
        {
            DownloadStatus = DownloadState.Error;
            return; // don't fall through and report FileReady after a failure
        }

        Deployment.Current.Dispatcher.BeginInvoke(() =>
        {
            DownloadStatus = DownloadState.FileReady;
        });
    }

    void Save(Stream streamToSave)
    {
        StreamReader reader = null;
        IsolatedStorageFileStream file;
        StreamWriter writer = null;

        reader = new StreamReader(streamToSave);
        lock (appStorage)
        {
            file = appStorage.OpenFile(Filename, FileMode.Create);
            writer = new StreamWriter(file);
            writer.Write(reader.ReadToEnd());
            reader.Dispose();
            writer.Dispose();
        }
    }

    public XDocument GetData()
    {
        XDocument xml = null;
        lock (appStorage)
        {
            if (appStorage.FileExists(Filename))
            {
                var file = appStorage.OpenFile(Filename, FileMode.Open);
                xml = XDocument.Load(file);
                file.Dispose();
            }
        }

        if (xml != null)
            return xml;
        else
            return new XDocument();
    }
}

Your question doesn't provide an awful lot of context, and with the amount of information given people could be inclined to simply tell you yes, maybe with small but pertinent additions.
Practice generally sees locking occur on a dedicated object instance, staying away from locking on this, since that locks down the whole instance of the current object, which is scarcely, if ever, the intent. In your case we don't rightly know the fullest extent of things, but I hardly think locking your storage instance is the way to go.
Also, since you mention client and server interaction, it isn't quite as straightforward.
Depending on the load and many other factors, you might want to allow many concurrent reads of the file yet only a single write at any one time on the client that is downloading. For this purpose I would recommend the ReaderWriterLockSlim class, which exposes EnterReadLock/TryEnterReadLock, EnterWriteLock/TryEnterWriteLock and the corresponding release methods.
For more detailed information on this class, see its MSDN documentation.
Also, remember to use try/finally when coding within the scope of such a lock, always releasing the lock in the finally block.
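Applied to your Save/GetData pair, that might look like the sketch below (the lock field name is illustrative, and the plain Enter/Exit variants are used for brevity; the Try* variants additionally accept a timeout):

ReaderWriterLockSlim storageLock = new ReaderWriterLockSlim();

void Save(Stream streamToSave)
{
    storageLock.EnterWriteLock();
    try
    {
        //save code: write the downloaded stream to isolated storage
    }
    finally
    {
        storageLock.ExitWriteLock();
    }
}

public XDocument GetData()
{
    storageLock.EnterReadLock();
    try
    {
        XDocument xml = null;
        //load code: read the cached file if it exists
        return xml ?? new XDocument();
    }
    finally
    {
        storageLock.ExitReadLock();
    }
}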

What class contains this code? That matters, because it's important whether it's created more than once. If it's created only once in the process's lifetime, you can do this; if not, you should lock on a static object instance.
I believe it's good practice to create a separate object that's used only for the purpose of locking; the usual reason given is that any other code holding a reference to your storage object could lock on it as well, which invites deadlocks. E.g.:
IsolatedStorageFile storageObj; //initialized in the constructor
static readonly object storageObjLock = new object();
...
// in some method
lock (storageObjLock)
{
    //save code
}

Related

Abandoned memory in posting image data to server

The app shows high memory consumption while posting image data to the server, and the memory is never released. reportModel in the following source code holds a base64 string of the image data. Here is a snapshot of the source code:
public async Task<FaultReportResponseModel> ReportFault(ReportFaultRequestModel reportModel)
{
    try
    {
        App.IsConnectedToInternet(true);
        reportModel.Token = App.WebOpsToken;
        //var httpContent = CreateHttpContent(reportModel);
        var jsonBody = JsonConvert.SerializeObject(reportModel);
        _log.Trace("ReportFault api jsonBody length: {0}", jsonBody.Length);
        var content = new StringContent(jsonBody, Encoding.UTF8, "application/json");
        AddAuthorizationHeader();

        string serviceURL;
        if (reportModel.IssueType == IssueTypes.CantFind)
        {
            serviceURL = Constants.CantFindSvcURL;
        }
        else
        {
            serviceURL = Constants.ReportFaultSvcURL;
        }

        //var url = string.Format("{0}{1}", Constants.DataSVCBaseURL, serviceURL);
        var url = GetURLStringForService(serviceURL, ServiceType.WebOpsData);
        var response = await _restClient.PostAsync(url, content);
        var responseStr = await response.Content.ReadAsStringAsync();
        var parsedResponse = JsonConvert.DeserializeObject<FaultReportResponseModel>(responseStr);
        _log.Trace("Uploaded fault text: {0}", parsedResponse.OK);
        content.Dispose();
        return parsedResponse;
    }
    catch (Exception ex)
    {
        _log.Trace("Exception: {0}", ex.Message);
    }
    return null;
}
A snapshot of the memory footprint shows that the JSON serialization is taking memory that never gets released. Because of this abandoned memory, the app crashes after a few cycles of image upload.
What I tried:
Used stream content to POST to the server. In this case, it shows the memory problem in the stream; the problem pointer changed, but the problem stayed the same.
On the Internet I found that it may be because of the Large Object Heap, so I tried to invoke the GC manually, but there was no change in the memory footprint.
Any help or pointer to get out of this problem would be helpful.
You are creating large blocks of memory on the LOH. This is likely not a memory leak as such, though it definitely isn't optimal in high-throughput applications.
Assuming you want to keep using Json.NET for serialization, you can achieve this with JsonTextWriter and serialize directly to a stream (ideally the HttpClient NetworkStream). Note that System.Text.Json also has very efficient methods for serializing to a stream.
To get access to the underlying NetworkStream in HttpClient, you could create a derived HttpContent class.
Example:
public class SerializedStreamedContent<T> : HttpContent
{
    private readonly T _value;

    public SerializedStreamedContent(T value) => _value = value;

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context)
    {
        try
        {
            using var writer = new StreamWriter(stream, leaveOpen: true);
            using var jsonWriter = new JsonTextWriter(writer);
            var ser = new JsonSerializer();
            ser.Serialize(jsonWriter, _value);
            jsonWriter.Flush();
            return Task.CompletedTask;
        }
        catch (Exception e)
        {
            return Task.FromException(e);
        }
    }

    protected override bool TryComputeLength(out long length)
    {
        length = -1;
        return false;
    }
}
Note 1: This is not intended to be a complete solution, just an example. There are many considerations you will need to weigh up when using this approach.
Note 2: In .NET 5 there is a JsonContent class that does all this for you with the System.Text.Json implementation (and more).
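For context, posting with this content type might look like the following sketch (client, url and reportModel are assumed to exist as in the question, inside an async method):

// stream the model straight onto the request body instead of
// building one large JSON string first
var content = new SerializedStreamedContent<ReportFaultRequestModel>(reportModel);
var response = await client.PostAsync(url, content);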

Finding a memory leak

I have an issue with the following code. I create a memory stream in the GetDb function, and the return value is used in a using block. For some unknown reason, if I dump my objects I see that the MemoryStream is still around at the end of the Main method. This causes a massive leak for me. Any idea how I can clean up this buffer?
I have verified that the Dispose method has been called on the MemoryStream, but the object seems to stay around. I used the diagnostic tools of Visual Studio 2017 for this task.
class Program
{
    static void Main(string[] args)
    {
        List<CsvProduct> products;
        using (var s = GetDb())
        {
            products = Utf8Json.JsonSerializer.Deserialize<List<CsvProduct>>(s).ToList();
        }
    }

    public static Stream GetDb()
    {
        var filepath = Path.Combine("c:/users/tom/Downloads", "productdb.zip");
        using (var archive = ZipFile.OpenRead(filepath))
        {
            var data = archive.Entries.Single(e => e.FullName == "productdb.json");
            using (var s = data.Open())
            {
                var ms = new MemoryStream();
                s.CopyTo(ms);
                ms.Seek(0, SeekOrigin.Begin);
                return (Stream)ms;
            }
        }
    }
}
For some unknown reason if I dump my objects I see that the MemoryStream is still around at the end of the Main method.
That isn't particularly abnormal; GC happens separately.
This cause me a massive leak.
That isn't a leak, it is just memory usage.
Any idea how I can clean this buffer ?
I would probably just not use a MemoryStream, and instead return something that wraps the live uncompressing stream (from s = data.Open()). The problem, though, is that you can't just return s, as archive would still be disposed upon leaving the method. So if I needed to solve this, I would create a custom Stream that wraps an inner stream and disposes a second object when disposed, i.e.
class MyStream : Stream
{
    private readonly Stream _source;
    private readonly IDisposable _parent;

    public MyStream(Stream source, IDisposable parent)
    {
        _source = source;
        _parent = parent;
    }

    // not shown: implement all Stream members by proxying to _source

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _source.Dispose();
            _parent.Dispose();
        }
        base.Dispose(disposing);
    }
}
then have:
public static Stream GetDb()
{
    var filepath = Path.Combine("c:/users/tom/Downloads", "productdb.zip");
    var archive = ZipFile.OpenRead(filepath);
    var data = archive.Entries.Single(e => e.FullName == "productdb.json");
    var s = data.Open();
    return new MyStream(s, archive);
}
(could be improved slightly to make sure that archive is disposed if an exception happens before we return with success)
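For instance, a guarded version of GetDb might look like this sketch of that improvement:

public static Stream GetDb()
{
    var filepath = Path.Combine("c:/users/tom/Downloads", "productdb.zip");
    var archive = ZipFile.OpenRead(filepath);
    try
    {
        var data = archive.Entries.Single(e => e.FullName == "productdb.json");
        return new MyStream(data.Open(), archive);
    }
    catch
    {
        // nothing owns the archive yet, so dispose it here before rethrowing
        archive.Dispose();
        throw;
    }
}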

Using public static object for locking thread shared resources

Let's consider this example. I have two classes:
Main_Reader -- reads from the file
public class Main_Reader
{
    public static object tloc = new object();

    public void Readfile(object mydocpath1)
    {
        lock (tloc)
        {
            string mydocpath = (string)mydocpath1;
            StringBuilder sb = new StringBuilder();
            using (StreamReader sr = new StreamReader(mydocpath))
            {
                String line;
                // Read and display lines from the file until the end of
                // the file is reached.
                while ((line = sr.ReadLine()) != null)
                {
                    sb.AppendLine(line);
                }
            }
            string allines = sb.ToString();
        }
    }
}
MainWriter -- writes the file
public class MainWriter
{
    public void Writefile(object mydocpath1)
    {
        lock (Main_Reader.tloc)
        {
            string mydocpath = (string)mydocpath1;
            // Compose a string that consists of three lines.
            string lines = "First line.\r\nSecond line.\r\nThird line.";
            // Write the string to a file.
            System.IO.StreamWriter file = new System.IO.StreamWriter(mydocpath);
            file.WriteLine(lines);
            file.Close();
            Thread.Sleep(10000);
            MessageBox.Show("Done----- " + Thread.CurrentThread.ManagedThreadId.ToString());
        }
    }
}
In Main, I have instantiated both classes and kick off the two methods on two threads.
public string mydocpath = "E:\\testlist.txt"; // mydocpath is the shared resource
MainWriter mwr = new MainWriter();
Main_Reader mrw = new Main_Reader();

private void button1_Click(object sender, EventArgs e)
{
    Thread t2 = new Thread(new ParameterizedThreadStart(mwr.Writefile));
    t2.Start(mydocpath);
    Thread t1 = new Thread(new ParameterizedThreadStart(mrw.Readfile));
    t1.Start(mydocpath);
    MessageBox.Show("Read kick off----------");
}
To make this thread safe, I am using a public static field:
public static object tloc = new object(); // in class Main_Reader
My question is: is this a good approach?
Because I read in one of the MSDN forums:
avoid locking on a public type
Is there another approach for making this thread safe?
I believe the MSDN statement matters when you share your code with other people: you never know whether they will use the locks properly, and then your threads might get blocked.
The solution is probably to put both thread bodies into the same class.
On the other hand, since you're dealing with files, the filesystem has a locking mechanism of its own: you won't be allowed to write into a file that is being read, or read a file that is being written, unless it was opened with sharing enabled. In a case like this, I would perform the reading and the writing in the same thread.
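For example, one way to follow both suggestions is to keep the lock private and put both operations in one class (a sketch; FileStore and its members are illustrative names, not from the question):

public class FileStore
{
    // private lock object: no outside code can ever take this lock
    private static readonly object _sync = new object();

    public void Write(string path, string contents)
    {
        lock (_sync)
        {
            File.WriteAllText(path, contents);
        }
    }

    public string Read(string path)
    {
        lock (_sync)
        {
            return File.ReadAllText(path);
        }
    }
}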

Retrieving partial content using multiple HTTP requests to fetch data via parallel tasks

I am trying to be as thorough as I can in this post, as it is very important for me, though the issue is very simple, and only by reading the title of this question you can get the idea...
The question is: with healthy bandwidth (30 Mb VDSL) available, how is it possible to issue multiple HttpWebRequests for a single piece of data / file, so that each request downloads only a portion of the data, and when all instances have completed, all parts are joined back into one piece?
Code:
What I have working so far uses the same idea, except that each task = one HttpWebRequest = a different file, so the speedup is pure task parallelism rather than acceleration of one download using multiple tasks/threads, as in my question. See the code below.
The next part is only a more detailed explanation and background on the subject... if you don't mind reading.
I am still on a similar project that differs from this one (in the question) in that it tries to fetch as many different data sources as possible, one per separated task (different downloads/files). The speedup was gained because each task does not have to wait for the former one to complete before it gets a chance to be executed.
What I am trying to do in this current question (having almost everything ready in the code below) is actually to target the same URL for the same data, so this time the speedup to gain is for the single-task current download: implementing the same idea as in the code below, only this time letting SmartWebClient target the same URL using multiple instances. Then (only theory for now) it will request partial content of the data, with one request per instance.
The last issue is that I need to "put the puzzle back into one piece"... another problem I need to find out about...
As you can see in this code, what I have not yet gotten to work on is only the data parsing/processing, which I find very easy using HtmlAgilityPack, so no problem there.
Current code, main entry:
var urlList = new urlsForExtraction().urlsConcrDict();
var htmlDictionary = new ConcurrentDictionary<string, string>();
Parallel.ForEach(
    urlList.Values,
    new ParallelOptions { MaxDegreeOfParallelism = 20 },
    url => Download(url, htmlDictionary)
);

foreach (var pair in htmlDictionary)
{
    ///Process(pair);
    MessageBox.Show(pair.Value);
}
public class urlsForExtraction
{
    const string URL_Dollar = "";
    const string URL_UpdateUsersTimeOut = "";

    public ConcurrentDictionary<string, string> urlsConcrDict()
    {
        // TODO: find the syntax to extract the field names, so it would be
        // possible to iterate on each instead of specifying them one by one
        ConcurrentDictionary<string, string> retDict = new ConcurrentDictionary<string, string>();
        retDict.TryAdd("URL_Dollar", "Any.Url.com");
        retDict.TryAdd("URL_UpdateUserstbl", "http://bing.com");
        return retDict;
    }
}
/// <summary>
/// Second-stage class: consumes the dictionary of urls for extraction,
/// then downloads each in a Parallel.ForEach using SmartWebClient (Download).
/// </summary>
public class InitConcurentHtmDictExtrct
{
    private void Download(string url, ConcurrentDictionary<string, string> htmlDictionary)
    {
        using (var webClient = new SmartWebClient())
        {
            webClient.Encoding = Encoding.GetEncoding("UTF-8");
            webClient.Proxy = null;
            htmlDictionary.TryAdd(url, webClient.DownloadString(url));
        }
    }

    private ConcurrentDictionary<string, string> htmlDictionary;

    public ConcurrentDictionary<string, string> LoopOnUrlsVia_SmartWC(Dictionary<string, string> urlList)
    {
        htmlDictionary = new ConcurrentDictionary<string, string>();
        Parallel.ForEach(
            urlList.Values,
            new ParallelOptions { MaxDegreeOfParallelism = 20 },
            url => Download(url, htmlDictionary)
        );
        return htmlDictionary;
    }
}
/// <summary>
/// The extraction process, done via HtmlAgilityPack:
/// easy collection of information within a given html document
/// by referencing element attributes.
/// </summary>
public class Results
{
    public struct ExtracionParameters
    {
        public string FileNameToSave;
        public string directoryPath;
        public string htmlElementType;
    }

    public enum Extraction
    {
        ById, ByClassName, ByElementName
    }

    public void ExtractHtmlDict(ConcurrentDictionary<string, string> htmlResults, Extraction by)
    {
        // helps with easy element extraction from the page
        HtmlAttribute htAgPcAttrbs;
        HtmlDocument HtmlAgPCDoc = new HtmlDocument();

        /// will hold the name + content of each document part that was eventually
        /// extracted; from this container the result page can then be built
        Dictionary<string, HtmlDocument> dictResults = new Dictionary<string, HtmlDocument>();

        foreach (KeyValuePair<string, string> htmlPair in htmlResults)
        {
            Process(htmlPair);
        }
    }

    private static void Process(KeyValuePair<string, string> pair)
    {
        // do the html processing
    }
}
public class SmartWebClient : WebClient
{
    private readonly int maxConcurentConnectionCount;

    public SmartWebClient(int maxConcurentConnectionCount = 20)
    {
        this.Proxy = null;
        this.Encoding = Encoding.GetEncoding("UTF-8");
        this.maxConcurentConnectionCount = maxConcurentConnectionCount;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var httpWebRequest = (HttpWebRequest)base.GetWebRequest(address);
        if (httpWebRequest == null)
        {
            return null;
        }
        if (maxConcurentConnectionCount != 0)
        {
            httpWebRequest.ServicePoint.ConnectionLimit = maxConcurentConnectionCount;
        }
        return httpWebRequest;
    }
}
This allows me to take advantage of the good bandwidth, but I am far from the desired solution, and I would really appreciate any clue on where to start.
If the server supports what Wikipedia calls byte serving, you can multiplex a file download by spawning multiple requests with a specific Range header value (using the AddRange method; see also How to download the data from the server discontinuously?). Most serious HTTP servers do support byte ranges.
Here is some sample code that implements a parallel download of a file using byte ranges:
public static void ParallelDownloadFile(string uri, string filePath, int chunkSize)
{
    if (uri == null)
        throw new ArgumentNullException("uri");

    // determine the file size first
    long size = GetFileSize(uri);

    using (FileStream file = new FileStream(filePath, FileMode.Create, FileAccess.Write, FileShare.Write))
    {
        file.SetLength(size); // set the length first
        object syncObject = new object(); // synchronizes file writes
        Parallel.ForEach(LongRange(0, 1 + size / chunkSize), (start) =>
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
            request.AddRange(start * chunkSize, start * chunkSize + chunkSize - 1);
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();
            lock (syncObject)
            {
                using (Stream stream = response.GetResponseStream())
                {
                    file.Seek(start * chunkSize, SeekOrigin.Begin);
                    stream.CopyTo(file);
                }
            }
        });
    }
}

public static long GetFileSize(string uri)
{
    if (uri == null)
        throw new ArgumentNullException("uri");

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    request.Method = "HEAD";
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    return response.ContentLength;
}

private static IEnumerable<long> LongRange(long start, long count)
{
    long i = 0;
    while (true)
    {
        if (i >= count)
        {
            yield break;
        }
        yield return start + i;
        i++;
    }
}
And sample usage:
private static void TestParallelDownload()
{
    string uri = "http://localhost/welcome.png";
    string fileName = Path.GetFileName(uri);
    ParallelDownloadFile(uri, fileName, 10000);
}
PS: I'd be curious to know if it's really more interesting to do this parallel thing rather than to just use WebClient.DownloadFile... Maybe in slow network scenarios?

How to overcome OutOfMemoryException pulling large xml documents from an API?

I am pulling 1M+ records from an API. The pull works ok, but I'm getting an out of memory exception when attempting to ReadToEnd into a string variable.
Here's the code:
XDocument xmlDoc = new XDocument();
HttpWebRequest client = (HttpWebRequest)WebRequest.Create(uri);
client.Timeout = 2100000;//35 minutes
WebResponse apiResponse = client.GetResponse();
Stream receivedStream = apiResponse.GetResponseStream();
StreamReader reader = new StreamReader(receivedStream);
string s = reader.ReadToEnd();
Stack trace:
at System.Text.StringBuilder.ToString()
at System.IO.StreamReader.ReadToEnd()
at MyApplication.DataBuilder.getDataFromAPICall(String uri) in
c:\Users\RDESLONDE\Documents\Projects\MyApplication\MyApplication\DataBuilder.cs:line 578
at MyApplication.DataBuilder.GetDataFromAPIAsXDoc(String uri) in
c:\Users\RDESLONDE\Documents\Projects\MyApplication\MyApplication\DataBuilder.cs:line 543
What can I do to work around this?
It sounds like your file is too big for your environment. Loading the DOM for a large file can be problematic, especially in a 32-bit process (you haven't indicated whether this is the case).
You can combine the speed and memory efficiency of XmlReader with the convenience of XElement/XNode etc. by using an XStreamingElement to save the transformed content after processing. This is much more memory-efficient for large files.
Here's an example in pseudo-code:
// use an XStreamingElement for writing
var st = new XStreamingElement("root");
using (var xr = new XmlTextReader(stream))
{
    while (xr.Read())
    {
        // whatever you're interested in
        if (xr.NodeType == XmlNodeType.Element)
        {
            var node = XNode.ReadFrom(xr) as XElement;
            if (node != null)
            {
                ProcessNode(node);
                st.Add(node);
            }
        }
    }
}
st.Save(outstream); // or st.WriteTo(xmlwriter);
XmlReader is the way to go when memory is an issue. It is also fastest.
Unfortunately, you didn't show your code, but it sounds like the entire file is being loaded into memory. That's what you need to avoid.
It's best if you can use a stream to process the file without loading the entire thing into memory.
class MyXmlDocument : IDisposable
{
    private bool _disposed = false;
    private XmlDocument _xmldoc;

    public XmlDocument xmldoc
    {
        get { return _xmldoc; }
    }

    public MyXmlDocument()
    {
        _xmldoc = new XmlDocument();
    }

    ~MyXmlDocument()
    {
        this.Dispose();
    }

    // Public implementation of Dispose pattern callable by consumers.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Protected implementation of Dispose pattern.
    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return;
        }
        if (disposing)
        {
            // TODO: dispose managed state (managed objects).
            this._xmldoc = null;
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        // TODO: free unmanaged resources (unmanaged objects) and override a finalizer below.
        // TODO: set large fields to null.
        _disposed = true;
    }
}
You can use this and then write the code like:

using (MyXmlDocument doc = new MyXmlDocument())
{
    doc.xmldoc.Load(new StreamReader(file));
}
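If the processing can be done record by record, a streaming alternative along these lines avoids materializing the document at all (a sketch; the "record" element name is illustrative, and receivedStream is the response stream from the question):

using (var reader = XmlReader.Create(receivedStream))
{
    while (reader.Read())
    {
        // surface one element at a time, so only a single record
        // is ever held in memory
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
        {
            var element = (XElement)XNode.ReadFrom(reader);
            // process element here
        }
    }
}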
