Memory Leak From WebBrowser BSTR - c#

I am experiencing a leak of some sort when using the WebBrowser object. I am still searching all over the place for answers. I've seen some similar questions on this forum as well, but I can't see how to apply those findings in my case.
After a page loads, the DocumentCompleted event fires and I parse the HTML on the page:
void PageScrollTimerTick(object sender, EventArgs e)
{
    String pageSrc = webBrowser1.Document.Body.InnerHtml;
    // Check if we need to stop scrolling..
    if (m_iLastFramePageLength == pageSrc.Length)
    {
        m_iLastFramePageLength = 0;
        m_scrollTimer.Tick -= PageScrollTimerTick;
        m_scrollTimer.Enabled = false;
        parsePage();
        nextPage();
    }
    else
    {
        m_iLastFramePageLength = pageSrc.Length;
        webBrowser1.Document.Window.ScrollTo(0, webBrowser1.Document.Body.ScrollRectangle.Height);
    }
}
The Leak:
As I type this, I wonder: why these functions? I have six different functions that do very similar tasks. I think they have problems because they are executed from a timer, which probably runs on a different thread. Am I close? How can I resolve this? Perhaps with Invoke() on the WebBrowser control?
doParse():
List<String> doSomeExtractions()
{
    List<String> retVal = new List<String>();
    foreach (HtmlElement div in webBrowser1.Document.GetElementsByTagName("div"))
    {
        String szClassName = div.GetAttribute("classname");
        switch (szClassName)
        {
            case "someDivClass":
                {
                    if (div.InnerHtml.Contains("<b>"))
                    {
                        retVal.Add(div.InnerHtml);
                    }
                    break;
                }
            default:
                {
                    break;
                }
        }
    }
    return retVal;
}
moveNext():
// Store data, navigate to next page.
webBrowser1.DocumentCompleted += this.scrapeData;
webBrowser1.Navigate("about:blank");

Related

Office DocumentBeforeSave event only fires for a few seconds after binding

I'm working on a feature that creates a backup each time an open Word document is saved.
I'm using the code below to hook into the Word process and bind events to it; Word is opened as a separate process.
officeApplication = (Application)Marshal.GetActiveObject("Word.Application");
officeApplication.DocumentBeforeSave += new ApplicationEvents4_DocumentBeforeSaveEventHandler(App_BeforeSaveDocument);
I do my work in App_BeforeSaveDocument.
I get officeApplication correctly and the event binding succeeds: when I click Save in Word, the event fires as expected.
The problem is that a few seconds later (maybe 30 s) the event stops firing, whether I click Save, Save As, or close the document.
Are there any suggestions?
After a lot of research, I still can't find the reason, so I decided to work around it with a trick.
First, start a thread in the event handler:
static void App_BeforeSaveDocument(Microsoft.Office.Interop.Word.Document document, ref bool saveAsUI, ref bool cancel)
{
    if (th != null)
        th.Abort();
    th = new Thread(backupOnSave);
    th.IsBackground = true;
    // backupOnSave casts its argument to Application, so pass the owning application
    th.Start(document.Application);
}
Then run an infinite loop in the thread:
internal static void backupOnSave(object obj)
{
    try
    {
        Application app = obj as Application;
        if (app == null || app.ActiveDocument == null)
        {
            return;
        }
        Microsoft.Office.Interop.Word.Document document = app.ActiveDocument;
        if (!tempData.ContainsKey(document.FullName))
            return;
        var loopTicks = 2000;
        while (true)
        {
            Thread.Sleep(loopTicks);
            if (document.Saved)
            {
                if (!tempData.ContainsKey(document.FullName))
                    break;
                var p = tempData[document.FullName];
                var f = new FileInfo(p.FileFullName);
                if (f.LastWriteTime != p.LastWriteTime) // changed, should create new backup
                {
                    BackupFile(p, f);
                    p.LastWriteTime = f.LastWriteTime;
                }
            }
        }
    }
    catch (Exception ex)
    {
        log.write(ex);
    }
}
And it works fine. Remember to abort the thread when the document is closed or an exception occurs.
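Thread.Abort is not available on newer runtimes (on .NET Core and .NET 5+ it throws PlatformNotSupportedException), so a cooperative stop signal is a common alternative to aborting the polling thread. The following is only a minimal sketch of that idea, not the code used above: BackupPoller, checkOnce and interval are hypothetical names, and checkOnce stands in for the document.Saved / LastWriteTime check from the loop.

```csharp
using System;
using System.Threading;

// Hypothetical helper: polls on a background thread until cancellation is requested.
public static class BackupPoller
{
    public static Thread Start(Action checkOnce, TimeSpan interval, CancellationToken token)
    {
        var thread = new Thread(() =>
        {
            // WaitOne returns false when the interval elapses and
            // true as soon as the token is cancelled, which ends the loop.
            while (!token.WaitHandle.WaitOne(interval))
            {
                checkOnce();
            }
        });
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}
```

The caller would keep a CancellationTokenSource next to the thread and call Cancel() on it when the document is closed, instead of calling th.Abort().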

C# String comparison not working

I'm having this weird problem within the application I'm currently working on.
string searchText = "onMouseOver=\"CallList_onMouseOver(this);\" id=\"";
List<int> searchOrders = AllIndexesOf(scraper.clientBrowser.DocumentText, searchText);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < searchOrders.Count; i++)
{
    string order = scraper.clientBrowser.DocumentText.Substring(searchOrders[i] + searchText.Length, 6);
    scraper.clientBrowser.Document.GetElementById(order).InvokeMember("Click");
    for (int j = 0; j < scraper.clientBrowser.Document.Window.Frames.Count; j++)
    {
        if (scraper.clientBrowser.Document.Window.Frames[j].Document != null && scraper.clientBrowser.Document.Window.Frames[j].Document.Body != null)
        {
            string orderText = scraper.clientBrowser.Document.Window.Frames[j].Document.Body.InnerText ?? "Nope";
            //MessageBox.Show(j + Environment.NewLine + orderText);
            if (!orderText.Contains("Nope"))
            {
                sb.AppendLine(orderText + Environment.NewLine);
            }
        }
    }
}
Clipboard.SetText(sb.ToString());
The thing is, whenever I uncomment the MessageBox.Show, I can clearly see that orderText is filled with a value other than "Nope", the StringBuilder gets filled, and the correct text is copied.
However, if I comment out the MessageBox.Show, the outcome of this loop is always "Nope". I'm stuck here; I have no idea what could cause something like this.
The scraper.clientBrowser is a System.Windows.Forms.WebBrowser.
Update:
Solved the issue by waiting for the document to be loaded, created this mechanism:
public bool DocumentLoaded
{
    get { return documentLoaded; }
    set { documentLoaded = value; }
}
private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    this.DocumentLoaded = true;
    this.clientBrowser = sender as WebBrowser;
}
void clientBrowser_Navigating(object sender, WebBrowserNavigatingEventArgs e)
{
    this.DocumentLoaded = false;
}
Then in the class I'm using:
while (!scraper.DocumentLoaded)
{
    System.Threading.Thread.Sleep(100);
}
It sounds like you need to ensure that the page is fully loaded; there might be a race condition. I would suggest wiring up the WebBrowser.DocumentCompleted event and then attempting your scraping logic.
Update
I overlooked this initially, but it certainly has something to do with your issue: the line where you invoke a click, scraper.clientBrowser.Document.GetElementById(order).InvokeMember("Click");. This is done inside the loop, which will more than likely manipulate the DOM, will it not? I suggest approaching this problem entirely differently. What exactly are you trying to achieve (not how you're trying to do it)?
With this alone, I would suggest that you refer to this SO Q/A and look at how they're waiting for the click to finish.
I can only guess one thing here:
When you uncomment MessageBox.Show, the clientBrowser uses the time while the message box is displayed to finish loading the page. By the time you press OK, the page has finished loading, so you get the result. When you comment it out, you don't wait for the page to load, so the result is different.

get item[i] from a listview using threads

I have a function like this:
foreach (ListViewItem item in getListViewItems(listView2)) //for proxy
{
    if (reader.Peek() == -1)
    {
        break;
    }
    lock (reader)
    {
        line = reader.ReadLine();
    }
    //proxy code
    List<string> mylist = new List<string>();
    if (item != null)
    {
        for (int s = 0; s < 3; s++)
        {
            if (item.SubItems[s].Text != null)
            {
                mylist.Add(item.SubItems[s].Text);
            }
            else
            {
                mylist.Add("");
            }
        }
    }
    else
    {
        break;
    }
    //end proxy code
    //some other code including the threadpool
}
and the delegate code:
private delegate ListView.ListViewItemCollection GetItems(ListView lstview);
private ListView.ListViewItemCollection getListViewItems(ListView lstview)
{
    ListView.ListViewItemCollection temp = new ListView.ListViewItemCollection(new ListView());
    if (!lstview.InvokeRequired)
    {
        foreach (ListViewItem item in lstview.CheckedItems)
        {
            temp.Add((ListViewItem)item.Clone());
        }
        return temp;
    }
    else
    {
        return (ListView.ListViewItemCollection)this.Invoke(new GetItems(getListViewItems), new object[] { lstview });
    }
}
EDIT:
I want to replace that foreach loop in the main function with a conditional function call:
if (reader.Peek() == -1)
{
    break;
}
lock (reader)
{
    line = reader.ReadLine();
}
if (use_proxy == true)
{
    mylist2 = get_current_proxy();
}
//some other code including the threadpool
private List<string> get_current_proxy()
{
    //what shall I add here?
}
How can I make that function do the same as the foreach loop, but using a for loop? I mean getting the proxies one by one ...
I see multiple questions revolving around an idea of scraping a website for emails then spamming. You have very cool tools for that already, no need for a new one.
Anyway - I don't understand your question, and it seems that I'm not the only one here, but the thing you'll have to KNOW before anything else is:
Anything in Windows that runs in multiple threads ultimately has to be synchronized when you call Invoke(), which HAS TO wait until the call passes through ONE thread: the one that runs the message loop. So you can try to read from or write to the ListView from multiple threads, but for each read/write you'll have to Invoke() (you probably tried it directly and BAAAAM), and every Invoke() has only ONE hole to go through, so all your threads will have to wait their turn.
Next: using a ListView as a CONTAINER for your data is so BAD I can't even comment any further. Consider something like a
class MyData
{
    public string Name;
    public string URL;
    // ...
}
and
List<MyData> _myData;
to hold your data. You'll be able to access it from multiple threads, if you take care of some low-key sync issues.
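The sync issues mentioned above can be handled with a single lock around the list. This is a minimal sketch under assumptions, not a complete design: MyDataStore and its method names are hypothetical, and TryTakeNext is just one possible way to hand out entries (for example proxies) one at a time to worker threads.

```csharp
using System.Collections.Generic;

// Plain data record, mirroring the MyData class suggested above.
public class MyData
{
    public string Name;
    public string URL;
}

// Hypothetical wrapper: every access to the list goes through one lock.
public class MyDataStore
{
    private readonly object _sync = new object();
    private readonly List<MyData> _items = new List<MyData>();

    public void Add(MyData item)
    {
        lock (_sync) { _items.Add(item); }
    }

    // Returns a copy so callers can enumerate without holding the lock.
    public List<MyData> Snapshot()
    {
        lock (_sync) { return new List<MyData>(_items); }
    }

    // Removes and returns the first item; returns false when the store is empty.
    public bool TryTakeNext(out MyData item)
    {
        lock (_sync)
        {
            if (_items.Count == 0)
            {
                item = null;
                return false;
            }
            item = _items[0];
            _items.RemoveAt(0);
            return true;
        }
    }
}
```

The UI thread would fill the store once from the ListView, and worker threads would then call TryTakeNext without touching any control.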
Lastly, how come you ask us questions about .net C# programming if you don't even know the syntax. Well, it's rhetorical, ...

Asynchronous Call Best Practice

I am developing a Windows application with VS.NET 2010 and C# Windows Forms. This app has a user control that queries a service (WCF hosted in a Windows service) and needs to do this without blocking the UI. The user control contains a grid that shows the results. I think my situation is a very common one. My question is: what can be done in C# to make the following code run more smoothly and with better error handling? I am using MethodInvoker so I can avoid writing two separate methods for this call - wait - fill scenario.
public void LoadData()
{
    StartWaitProgress(0);
    ThreadPool.QueueUserWorkItem(x =>
    {
        try
        {
            MyDocMail[] mails;
            var history = Program.NoxProxy.GetDocumentHistory(out mails, Program.MySessionId, docId);
            this.Invoke(new MethodInvoker(delegate()
            {
                this.SuspendLayout();
                gridVersions.Rows.Clear();
                foreach (var item in history)
                {
                    gridVersions.Rows.Add();
                    int RowIndex = gridVersions.RowCount - 1;
                    DataGridViewRow demoRow = gridVersions.Rows[RowIndex];
                    demoRow.Tag = item.Id;
                    if (gridVersions.RowCount == 1)
                    {
                        demoRow.Cells[0].Value = Properties.Resources.Document_16;
                    }
                    demoRow.Cells[1].Value = item.Title;
                    demoRow.Cells[2].Value = item.Size.GetFileSize();
                    demoRow.Cells[3].Value = item.LastModified;
                    demoRow.Cells[4].Value = item.CheckoutBy;
                    demoRow.Cells[5].Value = item.Cotegory;
                }
                gridEmails.Rows.Clear();
                foreach (var item in mails)
                {
                    gridEmails.Rows.Add();
                    int RowIndex = gridEmails.RowCount - 1;
                    DataGridViewRow demoRow = gridEmails.Rows[RowIndex];
                    demoRow.Tag = item.Id;
                    demoRow.Cells[1].Value = item.From;
                    demoRow.Cells[2].Value = item.To;
                    demoRow.Cells[3].Value = item.Date;
                }
                this.ResumeLayout();
            }));
        }
        catch (Exception ex)
        {
            Program.PopError(ex);
            this.Invoke(new MethodInvoker(delegate() { this.Close(); }));
        }
        finally { this.Invoke(new MethodInvoker(delegate() { StopWaitProgress(); })); }
    });
}
There's nothing wrong with your solution, although you can accomplish it more easily with BackgroundWorker.
BackgroundWorker captures exceptions thrown on the worker thread, raises its completion and progress events back on the UI thread so you don't have to call Invoke yourself, and helps with progress reporting and cancellation. More examples here.
P.S. Future versions of C# may make this even easier - check out the Async CTP.
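As a rough sketch of the BackgroundWorker approach suggested above: DoWork runs on a thread-pool thread, and RunWorkerCompleted is raised afterwards with either the result or the exception thrown by DoWork. When RunWorkerAsync is called from the UI thread of a Windows Forms application, the completion handler runs on that UI thread, so no explicit Invoke is needed there. WorkerSketch.Run, work and onDone are hypothetical names; in the question's code, work would wrap the GetDocumentHistory call and onDone would fill the grids and call StopWaitProgress.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper wrapping the call - wait - fill pattern.
public static class WorkerSketch
{
    // Runs 'work' in the background, then passes its result or its error to 'onDone'.
    public static BackgroundWorker Run<T>(Func<T> work, Action<T, Exception> onDone)
    {
        var worker = new BackgroundWorker();

        // Executed on a thread-pool thread; do not touch controls here.
        worker.DoWork += (sender, e) => { e.Result = work(); };

        // Raised once DoWork has returned or thrown.
        worker.RunWorkerCompleted += (sender, e) =>
        {
            if (e.Error != null)
            {
                // Reading e.Result here would rethrow, so report the error instead.
                onDone(default(T), e.Error);
            }
            else
            {
                onDone((T)e.Result, null);
            }
        };

        worker.RunWorkerAsync();
        return worker;
    }
}
```

This keeps the error path and the success path in one callback, which replaces the try / catch / finally with three separate Invoke calls in the original.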

Closing or Hiding forms causes a cross thread error

I am baffled by this simple task I do over and over again.
I have an array of child forms. The array is initialized in another form's constructor:
frmChildren = new ChildGUI[20];
When the user requests to see a child form, I do this:
if (frmChildren[nb] == null)
{
    frmChildren[nb] = new ChildGUI();
    frmChildren[nb].MdiParent = this.MdiParent;
}
frmChildren[nb].Show();
So far this works. In the background I can download new content for these forms. When a download is finished, I fire a ChildChange event. Here is where it stops working.
I simply want to close/hide any open forms and then regenerate a new set with frmChildren = new ChildGUI[20];. Here is one of many attempts:
for (int i = 0; i < frmChildren.Length; i++)
{
    if (frmChildren[i] != null)
    {
        //frmChildren[i].BeginInvoke(new EventHandler(delegate
        //{
        frmChildren[i].Close();
        //}));
    }
}
frmChildren = new ChildGUI[20];
I get a cross-thread exception on the .Close(). Notice I've already tried doing an invoke, but doing so bypasses the != null check for some reason. I think it may have something to do with the garbage collector. Does anybody have any input?
The problem is that your anonymous method is capturing i - so by the time it's actually invoked in the UI thread, you've got a different value of i, which may be null. Try this:
for (int i = 0; i < frmChildren.Length; i++)
{
    ChildGUI control = frmChildren[i];
    if (control != null)
    {
        control.BeginInvoke(new EventHandler(delegate
        {
            control.Close();
        }));
    }
}
frmChildren = new ChildGUI[20];
See Eric Lippert's blog post for why introducing a new variable within the loop fixes the problem.
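The capture behaviour described above can be reproduced without any forms. In this small sketch (CaptureDemo and its method names are made up for illustration), the first method captures the for loop variable itself, while the second captures a fresh copy declared inside the loop body:

```csharp
using System;
using System.Collections.Generic;

public static class CaptureDemo
{
    // Every lambda captures the same variable i, so all of them
    // see its final value once the loop has finished.
    public static List<int> CaptureLoopVariable(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            actions.Add(() => i);
        }
        var results = new List<int>();
        foreach (var action in actions) results.Add(action());
        return results; // CaptureLoopVariable(3) returns [3, 3, 3]
    }

    // Each iteration declares its own 'copy', so each lambda
    // keeps the value it saw when it was created.
    public static List<int> CaptureCopy(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            actions.Add(() => copy);
        }
        var results = new List<int>();
        foreach (var action in actions) results.Add(action());
        return results; // CaptureCopy(3) returns [0, 1, 2]
    }
}
```

Since C# 5 the foreach loop variable is scoped per iteration, so on current compilers the extra copy is only needed for for loops; on the older compilers this answer was written for, foreach needed it as well.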
EDIT: If you want to use a foreach loop, it would look like this:
foreach (ChildGUI control in frmChildren)
{
    // Create a "new" variable to be captured
    ChildGUI copy = control;
    if (copy != null)
    {
        copy.BeginInvoke(new EventHandler(delegate
        {
            copy.Close();
        }));
    }
}
frmChildren = new ChildGUI[20];
Just as an aside, you can use the fact that you just want to call a void method to make the code slightly simpler. As this no longer uses an anonymous method, you can do away with the "inner" variable:
foreach (ChildGUI control in frmChildren)
{
    if (control != null)
    {
        control.BeginInvoke(new MethodInvoker(control.Close));
    }
}
frmChildren = new ChildGUI[20];
