I have some legacy code which is pretty basic. The code extracts files from a ZIP archive, deserializes the contents of each file from XML into objects, and does something with those objects.
The ZIP file is around 90 MB. The problem is that this code takes around 3 seconds on my local machine (1.5 sec to extract and around 1.3 sec to deserialize all files), but when I publish it to a Windows server with IIS 6.1, the same action on the same file takes around 28 seconds: 14 sec to extract and 13 sec to deserialize.
The server is a VPS with 8 cores and 16 GB RAM.
Does anyone have any ideas?
public List<FileNameStream> UnzipFilesTest(List<string> files, string zippedPathAndFile)
{
    //var result = new Dictionary<string, MemoryStream>();
    var unzipedFiles = new List<FileNameStream>();
    string file1 = System.Web.Hosting.HostingEnvironment.MapPath(zippedPathAndFile);
    if (File.Exists(file1))
    {
        using (MemoryStream data = new MemoryStream())
        {
            using (Ionic.Zip.ZipFile zipFile = Ionic.Zip.ZipFile.Read(file1))
            {
                zipFile.ParallelDeflateThreshold = -1;
                foreach (ZipEntry e in zipFile)
                {
                    if (files.Contains(e.FileName, StringComparer.OrdinalIgnoreCase))
                    {
                        e.Extract(data);
                        unzipedFiles.Add(new FileNameStream() { FileContent = Encoding.UTF8.GetString(data.ToArray()), FileName = e.FileName }); //(e.FileName, data);
                    }
                }
            }
        }
    }
    return unzipedFiles;
}
Optimizing the foreach loop using a Parallel.ForEach loop will schedule the work of unzipping the files across multiple threads, which generally speeds things up (up to the number of available cores). I am not saying it isn't a hardware, network, firewall, or antivirus issue on the server - but it isn't wise to throw hardware at a software problem.
Here is an MSDN link that may prove useful.
Your code would look something like:
// Note: data and unzipedFiles are shared, so they would need per-entry streams
// and a thread-safe collection (or locking) to be used safely from parallel code.
Parallel.ForEach(zipFile, (e) =>
{
    if (files.Contains(e.FileName, StringComparer.OrdinalIgnoreCase))
    {
        e.Extract(data);
        unzipedFiles.Add(new FileNameStream() { FileContent = Encoding.UTF8.GetString(data.ToArray()), FileName = e.FileName }); //(e.FileName, data);
    }
});
It was something in the VPS itself. After 7 days of research, the hosting provider's staff offered to migrate us to a new machine, and everything seems to be in order now.
I need to process N files at a time, so I've stored all the file information in a Dictionary with FileName, Size, and SequenceNo. Now I have to select 5 files from that Dictionary and process them; as soon as the process for any file completes, another file should be picked from the dictionary.
For example:
If I have 10 files in the dictionary, I select the first 5 files (File 1, File 2, File 3, File 4, File 5) and process them. When the process for File 3 completes, the process for File 6 should start.
Please help me.
Thank you.
Thanks, @netmage. I finally found my answer with the use of ConcurrentBag, so I'll post the answer to my own question.
The System.Collections.Concurrent namespace provides several thread-safe collection classes. I have used one of them: ConcurrentBag.
Unlike List, a ConcurrentBag allows modification while we are iterating over it. It is also thread-safe and allows concurrent access.
I have implemented the following code as the solution to my problem.
I declared a ConcurrentBag object, FileData, as a global field.
ConcurrentBag<string[]> FileData = new ConcurrentBag<string[]>();
I created a function to get the file information and store it in FileData.
private void GetFileInfoIntoBag(string DirectoryPath)
{
    var files = Directory.GetFiles(DirectoryPath, "*", SearchOption.AllDirectories);
    foreach (var file in files)
    {
        FileInfo f1 = new FileInfo(file);
        string[] fileData = new string[4];
        fileData[0] = f1.Name;
        fileData[1] = GetFileSize.ToActualFileSize(f1.Length, 2);
        fileData[2] = Convert.ToString(i); // i is a class-level sequence counter
        fileData[3] = f1.FullName;
        i++;
        FileData.Add(fileData);
    }
}
Then, for the upload process, I created N tasks as required and implemented the upload logic inside them.
private void Upload_Click(object sender, EventArgs e)
{
    List<Task> tskCopy = new List<Task>();
    for (int i = 0; i < N; i++)
    {
        tskCopy.Add(Task.Run(() =>
        {
            while (FileData.Count > 0)
            {
                string[] file;
                FileData.TryTake(out file);
                if (file != null && file.Length > 3)
                {
                    /* Upload logic */
                    GC.Collect();
                }
            }
        }));
    }
    Task.WaitAll(tskCopy.ToArray());
    MessageBox.Show("Upload Completed Successfully");
}
Thank you all for your support.
Apparently, you wish to process your files in a specific order, at most five at a time.
So far, information about your files is stored sequentially in a List<T>.
One straightforward way to move across the list is to store the index of the next element to access in an int variable, e.g. nextFileIndex. You initialize it to 0.
When starting to process one of your files, you take the information from your list:
MyFileInfo currentFile = null;
lock (myFiles)
{
    if (nextFileIndex < myFiles.Count)
    {
        currentFile = myFiles[nextFileIndex++];
    }
}
You start five "processes" like that in the beginning, and whenever one of them has ended, you start a new one.
Now, for these "processes" to run in parallel (it seems like that is what you intend), please read about multithreading, e.g. the Task Parallel Library that is part of .NET. My suggestion would be to create five tasks that each grab the next file as long as nextFileIndex has not exceeded the maximum index in the list, and use something like Task.WaitAll to wait until none of the tasks has anything left to do.
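A minimal sketch of that idea (MyFileInfo, ProcessFile, and the pre-filled myFiles list are placeholders for your own type, work, and data; requires System.Linq and System.Threading.Tasks):
// Five workers claim the next index under a lock and process until the list is exhausted.
int nextFileIndex = 0;

Task[] workers = Enumerable.Range(0, 5).Select(_ => Task.Run(() =>
{
    while (true)
    {
        MyFileInfo currentFile = null;
        lock (myFiles)
        {
            if (nextFileIndex < myFiles.Count)
            {
                currentFile = myFiles[nextFileIndex++];
            }
        }
        if (currentFile == null)
        {
            break;                  // no more files to claim
        }
        ProcessFile(currentFile);   // your actual per-file work (placeholder)
    }
})).ToArray();

Task.WaitAll(workers);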
Be aware of multi-threading issues.
I have a FileSystemWatcher for a local directory, and it's working fine. I want to implement the same for FTP. Is there any way I can achieve this? I have checked many solutions, but they aren't clear.
Logic: I want to get files from FTP that are newer than some timestamp.
Problem faced: getting all files from FTP and then filtering the result hurts performance (I used FtpWebRequest).
Is there a right way to do this? (WinSCP is on hold; I can't use it now.)
FileSystemWatcher oFsWatcher = new FileSystemWatcher();
OFSWatchers.Add(oFsWatcher);
oFsWatcher.Path = sFilePath;
oFsWatcher.Filter = string.IsNullOrWhiteSpace(sFileFilter) ? "*.*" : sFileFilter;
oFsWatcher.NotifyFilter = NotifyFilters.FileName;
oFsWatcher.EnableRaisingEvents = true;
oFsWatcher.IncludeSubdirectories = bIncludeSubdirectories;
oFsWatcher.Created += new FileSystemEventHandler(OFsWatcher_Created);
You cannot use the FileSystemWatcher or any other way, because the FTP protocol does not have any API to notify a client about changes in the remote directory.
All you can do is to periodically iterate the remote tree and find changes.
It's actually rather easy to implement if you use an FTP client library that supports recursive listing of a remote tree. Unfortunately, the built-in .NET FTP client, FtpWebRequest, does not. But with the WinSCP .NET assembly, for example, you can use the Session.EnumerateRemoteFiles method.
See the article Watching for changes in SFTP/FTP server:
// Setup session options
SessionOptions sessionOptions = new SessionOptions
{
Protocol = Protocol.Ftp,
HostName = "example.com",
UserName = "user",
Password = "password",
};
using (Session session = new Session())
{
// Connect
session.Open(sessionOptions);
List<string> prevFiles = null;
while (true)
{
// Collect file list
List<string> files =
session.EnumerateRemoteFiles(
"/remote/path", "*.*", EnumerationOptions.AllDirectories)
.Select(fileInfo => fileInfo.FullName)
.ToList();
if (prevFiles == null)
{
// In the first round, just print number of files found
Console.WriteLine("Found {0} files", files.Count);
}
else
{
// Then look for differences against the previous list
IEnumerable<string> added = files.Except(prevFiles);
if (added.Any())
{
Console.WriteLine("Added files:");
foreach (string path in added)
{
Console.WriteLine(path);
}
}
IEnumerable<string> removed = prevFiles.Except(files);
if (removed.Any())
{
Console.WriteLine("Removed files:");
foreach (string path in removed)
{
Console.WriteLine(path);
}
}
}
prevFiles = files;
Console.WriteLine("Sleeping 10s...");
Thread.Sleep(10000);
}
}
(I'm the author of WinSCP)
Though if you actually just want to download the changes, it's way easier: just use Session.SynchronizeDirectories in a loop.
while (true)
{
SynchronizationResult result =
session.SynchronizeDirectories(
SynchronizationMode.Local, "/remote/path", @"C:\local\path", true);
result.Check();
// You can inspect result.Downloads for a list for updated files
Console.WriteLine("Sleeping 10s...");
Thread.Sleep(10000);
}
This will update even modified files, not only new files.
Though using the WinSCP .NET assembly from a web application might be problematic. If you do not want to use a 3rd-party library, you have to live with the limitations of FtpWebRequest. For an example of how to recursively list a remote directory tree with FtpWebRequest, see my answer to List names of files in FTP directory and its subdirectories.
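Very roughly, such a recursive listing with FtpWebRequest looks like the sketch below. It assumes a Unix-style LIST output (a leading "d" marks a directory) and does only naive parsing of the lines, so treat it as an illustration rather than production code:
// Sketch: recursively collect file URLs under "url" (e.g. "ftp://example.com/dir/").
// Requires System.Net, System.IO, System.Collections.Generic.
static void ListFtpDirectory(string url, NetworkCredential credentials, List<string> files)
{
    var request = (FtpWebRequest)WebRequest.Create(url);
    request.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
    request.Credentials = credentials;

    var lines = new List<string>();
    using (var response = (FtpWebResponse)request.GetResponse())
    using (var reader = new StreamReader(response.GetResponseStream()))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
    }

    foreach (string line in lines)
    {
        // Crude parsing: assume the name is the last space-separated token.
        string name = line.Substring(line.LastIndexOf(' ') + 1);
        if (name == "." || name == "..")
        {
            continue;
        }
        if (line.StartsWith("d")) // directory entry in a Unix-style listing
        {
            ListFtpDirectory(url + name + "/", credentials, files);
        }
        else
        {
            files.Add(url + name);
        }
    }
}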
You have edited your question to say that you have performance problems with the solutions I've suggested. Though you have already asked a new question that covers this:
Get FTP file details based on datetime in C#
Unless you have access to the OS which hosts the service, it will be a bit harder.
FileSystemWatcher places a hook on the filesystem, which notifies your application as soon as something happens.
The FTP command specification does not have such a hook; besides that, FTP is always initiated by the client.
Therefore, to implement such logic, you should periodically perform an NLST to list the FTP directory contents and track the changes (or the modification timestamps, via MDTM) yourself.
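A rough sketch of that polling approach with FtpWebRequest (the base URL and credentials are placeholders; ListDirectory issues NLST and GetDateTimestamp issues MDTM):
// Sketch: take a snapshot of the directory (names + MDTM timestamps) and compare
// it with the previous snapshot in your polling loop to detect new or changed files.
// Requires System, System.Net, System.IO, System.Collections.Generic.
static Dictionary<string, DateTime> PollFtpDirectory(string baseUrl, NetworkCredential credentials)
{
    // NLST: plain name listing of the directory
    var listRequest = (FtpWebRequest)WebRequest.Create(baseUrl);
    listRequest.Method = WebRequestMethods.Ftp.ListDirectory;
    listRequest.Credentials = credentials;

    var names = new List<string>();
    using (var response = (FtpWebResponse)listRequest.GetResponse())
    using (var reader = new StreamReader(response.GetResponseStream()))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            names.Add(line);
        }
    }

    // MDTM: ask for each file's last-modified timestamp
    var snapshot = new Dictionary<string, DateTime>();
    foreach (string name in names)
    {
        var mdtmRequest = (FtpWebRequest)WebRequest.Create(baseUrl + name);
        mdtmRequest.Method = WebRequestMethods.Ftp.GetDateTimestamp;
        mdtmRequest.Credentials = credentials;
        using (var response = (FtpWebResponse)mdtmRequest.GetResponse())
        {
            snapshot[name] = response.LastModified;
        }
    }

    return snapshot; // diff against the previous snapshot to find changes
}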
More info:
FTP return codes
FTP
I found an alternative solution for my functionality.
Explanation:
I download the files from FTP (read permission required) with the same folder structure.
So every time the job/service runs, I check whether the same file (full path) already exists in the physical path. If it does not exist, it can be considered a new file, and I can take some action for it and download it as well.
It's just an alternative solution.
Code Changes:
private static void GetFiles()
{
using (FtpClient conn = new FtpClient())
{
string ftpPath = "ftp://myftp/";
string downloadFileName = @"C:\temp\FTPTest\";
downloadFileName += "\\";
conn.Host = ftpPath;
//conn.Credentials = new NetworkCredential("ftptest", "ftptest");
conn.Connect();
//Get all directories
foreach (FtpListItem item in conn.GetListing(conn.GetWorkingDirectory(),
FtpListOption.Modify | FtpListOption.Recursive))
{
// if this is a file
if (item.Type == FtpFileSystemObjectType.File)
{
string localFilePath = downloadFileName + item.FullName;
//Only newly created files will be downloaded.
if (!File.Exists(localFilePath))
{
conn.DownloadFile(localFilePath, item.FullName);
//Do any action here.
Console.WriteLine(item.FullName);
}
}
}
}
}
I've been developing a 3D viewer in XNA and Win-forms. I use a simple concept to save and read data to and from XML files to rebuild the 3D environment.
Now I'm trying to load a larger XML file, and an exception is raised during the generation of my 3D environment:
System.OutOfMemoryException
I do not understand why, because the file is not very heavy; it's about 80 MB. Is it normal to see an exception raised when working with an XML file of around 80 MB?
Development Environment used
Microsoft Windows 7 (x64) bits
Visual Studio 2013 professional Update 3
16.0 GB (15.9GB usable) Ram
Intel i7-2670QM CPU @ 2.20GHz
Keep in mind, this is a 32-bit application (XNA implies 32-bit).
I did do a test with ANTS memory profiler to check how the garbage collector works:
I can't post an image because of reputation...
The results indicate the class Vector3[] as the culprit:
Microsoft.Xna.Framework.Vector3[] grows from 0 bytes to 305,218,872 bytes.
Code:
Read the XML file and load objects (basic Model, Mesh, MeshPart) with their properties into memory. This is needed to display the models in the 3D world.
Create a specialized class that contains an object of type basic model/mesh/meshPart. This class contains extra info on the model/mesh/meshPart and has a link to the database. I then load the new objects of this class into 3 lists (a model, a mesh, and a meshPart list).
I then iterate through these lists throughout the entire project to do what is needed.
The problem comes in when:
I add models.
Save to my XML file.
Load that file again!!!
So I assume reading the XML file is the problem...
or could it be that the lists with the objects get too big?
Reading file:
foreach (string directory in Directory.GetDirectories(tempPath))
{
string directoryName = Path.GetFileName(directory); //Get the directory name.
if (directoryName == "Models") //If it is the Models directory name.
{
foreach (string filePath in Directory.GetFiles(directory))
{
if (Path.GetExtension(filePath) == ".xml") //If a .xml file.
{
Model_Data data = Model_Storage.Load(filePath);
modelList.Add(data);
}
}
}
}
public static Model_Data Load(string FilePath)
{
Model_Data data = Serialization.XNA_XML_Deserialize<Model_Data>(FilePath);
data = XNB_Path_Load(data, FilePath);
data = Texture_Custom_Load(data, FilePath);
data = Linked_Files_Load(data, FilePath);
return data;
}
public static T XNA_XML_Deserialize<T>(string LoadPath)
{
using (XmlReader reader = XmlReader.Create(LoadPath))
{
return IntermediateSerializer.Deserialize<T>(reader, null);
}
}
This is how the constructor of the specialized class looks for model:
public Spec_Model(Basic_Model _Model, int _ID_3Dem = -1)
{
bModel = _Model; //object of basic-model
ID_3Dem = _ID_3Dem;
Model_Vertices = Generate_Vertices();
Model_Hash = Generate_Hash();
State = ModelState.New;
}
Here I save the specialized class objects for future use:
FAC_Model fac_Model = new FAC_Model(Model, ID_3Dem);
lst_FAC_Models.Add(fac_Model);
foreach (Basic_Mesh Mesh in Model.Mesh_Collection)
{
FAC_Mesh fac_Mesh = new FAC_Mesh(Mesh, Model.Unique_Identifier);
lst_FAC_Mesh.Add(fac_Mesh);
int meshPartCounter = 0;
foreach (Basic_Mesh_Part_Properties Mesh_Part in Mesh.Properties.Mesh_Part_Properties)
{
meshPartCounter++;
FAC_MeshPart fac_MeshPart = new FAC_MeshPart(Mesh_Part,Model.Unique_Identifier,Mesh.Properties.Unique_Identifier, meshPartCounter.AsStr());
lst_FAC_MeshPart.Add(fac_MeshPart);
}
}
There are no problems when loading a smaller XML file, up to 60 MB.
In the file that I'm trying to load:
Models = 18
Meshes = +-3200
MeshParts = +- 4000
Any advice?
This is a C#/VSTO program. I've been working on a data capture project. The scope is basically 'process Excel files sent by a variety of third-party companies.' Practically, this means:
Locate columns that contain the data I want through a search method.
Grab data out of the workbooks
Clean the data, run some calculations, etc
Output cleaned data into new workbook
The program I have written works great for small-to-medium data sets: ~25 workbooks with a combined total of ~1,000 rows of relevant data. I'm grabbing 7 columns of data out of these workbooks. One edge case I have, though, is that occasionally I need to run a much larger data set: ~50 workbooks with a combined total of ~8,000 rows of relevant data (and possibly another ~2,000 rows of duplicate data that I also have to remove).
I am currently putting a list of the files through a Parallel.ForEach loop, inside which I open a new Excel.Application() to process each file with multiple ActiveSheets. The parallel process runs much faster on the smaller data set than going through each file sequentially. But on the larger data set, I seem to hit a wall.
I start getting the message Microsoft Excel is waiting for another application to complete an OLE action, and eventually it just fails. Switching back to a sequential foreach does allow the program to finish, but it just grinds along, going from 1-3 minutes for a parallel medium-sized data set to 20+ minutes for a sequential large data set. If I set ParallelOptions.MaxDegreeOfParallelism to 10, it completes the cycle but still takes 15 minutes. If I set it to 15, it fails. I also really don't like messing with TPL settings if I don't have to. I've also tried inserting a Thread.Sleep to manually slow things down, but that only made the failure happen further out.
At the end of each loop iteration, I close the workbook, quit the application, then call ReleaseComObject on the Excel objects, followed by GC.Collect and GC.WaitForPendingFinalizers.
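Concretely, that cleanup looks roughly like this (a sketch: workbook and excelApp stand for the interop objects created inside the loop body, and the doubled Collect/Wait pair is the commonly recommended variation rather than something I claim is required):
// Per-iteration cleanup sketch; Marshal is System.Runtime.InteropServices.Marshal.
workbook.Close(SaveChanges: false);
excelApp.Quit();

Marshal.ReleaseComObject(workbook);
Marshal.ReleaseComObject(excelApp);
workbook = null;
excelApp = null;

// Running collect/wait twice lets finalizers from the first pass release their
// RCWs before the second pass collects them.
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();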
My ideas at the moment are:
Split the list in half and run them separately
Open some number of new Excel.Application() instances in parallel, but run a list of files sequentially inside each Excel instance (so kind of like #1, but using a different path)
Separate the list by file size, run a small set of very large files independently/sequentially, and run the rest as I have been
Things I am hoping to get some help with:
Suggestions on making really sure my memory is getting cleared (maybe Process.Id is getting twisted up in all the opening and closing?)
Suggestions on ordering a parallel process - I'm wondering whether throwing the 'big' guys in first would make the longer-running process more stable.
I have been looking at: http://reedcopsey.com/2010/01/26/parallelism-in-net-part-5-partitioning-of-work/ and he says "With prior knowledge about your work, it may be possible to partition data more meaningfully than the default Partitioner." But I'm having a hard time knowing what kind of partitioning, if any, makes sense.
Really appreciate any insights!
UPDATE
So as a general rule I test against Excel 2010, as we have both 2010 and 2013 in use here. I ran it against 2013 and it works fine - run time about 4 minutes, which is about what I would expect. Before I just abandon 2010 compatibility, any other ideas? The 2010 machine is a 64-bit machine with 64-bit Office, and the 2013 machine is a 64-bit machine with 32-bit Office. Would that matter at all?
A few years ago I worked with Excel files and automation. Back then I had problems with zombie processes in Task Manager: although our program had ended and I thought I had quit Excel properly, the processes were not exiting.
The solution was not something I liked, but it was effective. I can summarize it like this:
1) Never use two dots consecutively, like:
workBook.ActiveSheet.PageSetup
Instead, use variables; when you are done, release and null them.
For example, instead of doing this:
m_currentWorkBook.ActiveSheet.PageSetup.LeftFooter = str.ToString();
follow the practices in this function (it adds a barcode to the Excel footer):
private bool SetBarcode(string text)
{
Excel._Worksheet sheet;
sheet = (Excel._Worksheet)m_currentWorkbook.ActiveSheet;
try
{
StringBuilder str = new StringBuilder();
str.Append(#"&""IDAutomationHC39M,Regular""&22(");
str.Append(text);
str.Append(")");
Excel.PageSetup setup;
setup = sheet.PageSetup;
try
{
setup.LeftFooter = str.ToString();
}
finally
{
RemoveReference(setup);
setup = null;
}
}
finally
{
RemoveReference(sheet);
sheet = null;
}
return true;
}
Here is the RemoveReference function (putting null in this function did not work)
private void RemoveReference(object o)
{
try
{
System.Runtime.InteropServices.Marshal.ReleaseComObject(o);
}
catch
{ }
finally
{
o = null;
}
}
If you follow this pattern EVERYWHERE, it guarantees no leaks, no zombie processes, etc.
2) In order to create Excel files you can use the Excel application; however, to get data out of Excel, I suggest using OleDb. You can approach Excel like a database and get data from it with SQL queries, DataTables, etc.
Sample code (instead of filling a DataSet, you can use a DataReader for better memory performance):
private List<DataTable> getMovieTables()
{
List<DataTable> movieTables = new List<DataTable>();
var connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelFilePath + ";Extended Properties=\"Excel 12.0;IMEX=1;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text\"";
using (var conn = new OleDbConnection(connectionString))
{
conn.Open();
DataRowCollection sheets = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" }).Rows;
foreach (DataRow sheet in sheets)
{
using (var cmd = conn.CreateCommand())
{
cmd.CommandText = "SELECT * FROM [" + sheet["TABLE_NAME"].ToString() + "] ";
var adapter = new OleDbDataAdapter(cmd);
var ds = new DataSet();
try
{
adapter.Fill(ds);
movieTables.Add(ds.Tables[0]);
}
catch (Exception ex)
{
//Debug.WriteLine(ex.ToString());
continue;
}
}
}
}
return movieTables;
}
As an alternative to the solution proposed by @Mustafa Düman, I recommend using version 4 beta of EPPlus. I have used it without problems in several projects.
Pros:
Fast
No memory leaks (I can't say the same for versions < 4)
Does not require Office to be installed on the machine where you use it
Cons:
Can be used only for .xlsx files (Excel 2007/2010)
I tested it with the following code on 20 Excel files of around 12.5 MB each (over 50k records in each file), and I think it's enough to mention that it didn't crash :)
Console.Write("Path: ");
var path = Console.ReadLine();
var dirInfo = new DirectoryInfo(path);
while (string.IsNullOrWhiteSpace(path) || !dirInfo.Exists)
{
Console.WriteLine("Invalid path");
Console.Write("Path: ");
path = Console.ReadLine();
dirInfo = new DirectoryInfo(path);
}
string[] files = null;
try
{
files = Directory.GetFiles(path, "*.xlsx", SearchOption.AllDirectories);
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
Console.ReadLine();
return;
}
Console.WriteLine("{0} files found.", files.Length);
if (files.Length == 0)
{
Console.ReadLine();
return;
}
int succeded = 0;
int failed = 0;
Action<string> LoadToDataSet = (filePath) =>
{
try
{
FileInfo fileInfo = new FileInfo(filePath);
using (ExcelPackage excel = new ExcelPackage(fileInfo))
using (DataSet dataSet = new DataSet())
{
int workSheetCount = excel.Workbook.Worksheets.Count;
for (int i = 1; i <= workSheetCount; i++)
{
var worksheet = excel.Workbook.Worksheets[i];
var dimension = worksheet.Dimension;
if (dimension == null)
continue;
bool hasData = dimension.End.Row >= 1;
if (!hasData)
continue;
DataTable dataTable = new DataTable();
//add columns
foreach (var firstRowCell in worksheet.Cells[1, 1, 1, dimension.End.Column])
dataTable.Columns.Add(firstRowCell.Start.Address);
for (int j = 0; j < dimension.End.Row; j++)
dataTable.Rows.Add(worksheet.Cells[j + 1, 1, j + 1, dimension.End.Column].Select(erb => erb.Value).ToArray());
dataSet.Tables.Add(dataTable);
}
dataSet.Clear();
dataSet.Tables.Clear();
}
Interlocked.Increment(ref succeded);
}
catch (Exception)
{
Interlocked.Increment(ref failed);
}
};
Stopwatch sw = new Stopwatch();
sw.Start();
files.AsParallel().ForAll(LoadToDataSet);
sw.Stop();
Console.WriteLine("{0} succeded, {1} failed in {2} seconds", succeded, failed, sw.Elapsed.TotalSeconds);
Console.ReadLine();
I can't seem to find a solution to this issue. I'm trying to get my Compact Framework application on Windows Mobile 6 to have the ability to move a file on its local filesystem to another system.
Here are the solutions I'm aware of:
FTP - The problem with that is most of the APIs are way too expensive to use.
HTTP PUT - As far as I have been able to find, I can't use anonymous PUT with IIS7, and that's the web server the system is running. (An extreme workaround for this would be to use a different web server to PUT the file, and have that other system transfer it to the IIS system).
Windows share - I would need authentication on the shares, and I haven't seen a way to pass this authentication through Windows Mobile.
The last resort would be to require that the devices be cradled to transfer these files, but I'd really like to be able to have these files be transferred wirelessly.
FTP: define "too expensive". Do you mean performance or byte overhead or dollar cost? Here's a free one with source.
HTTP: IIS7 certainly supports hosting web services or custom IHttpHandlers. You could use either for a data upload pretty easily.
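For instance, a bare-bones IHttpHandler along these lines would accept an upload (the upload.ashx mapping, the name query-string parameter, and the ~/App_Data target are assumptions for the sake of the sketch):
// Sketch of a minimal upload handler for IIS7; map it in web.config, e.g. to upload.ashx.
using System.IO;
using System.Web;

public class UploadHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        // The device PUTs/POSTs the file body to http://server/upload.ashx?name=data.bin
        string name = Path.GetFileName(context.Request.QueryString["name"] ?? "upload.bin");
        string target = Path.Combine(context.Server.MapPath("~/App_Data"), name);

        using (FileStream output = File.Create(target))
        {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = context.Request.InputStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
        }
        context.Response.StatusCode = 201; // Created
    }
}
On the device side, a plain HttpWebRequest with the file written to GetRequestStream() should be enough to talk to it.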
A Windows share simply requires that you P/Invoke the WNet APIs to map the share, but it's not terribly complex.
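A hedged sketch of that mapping follows; this is the desktop-style import from mpr.dll, while on Windows Mobile the WNet functions are exported from coredll.dll (and the available variant may differ), so verify the exact P/Invoke for your platform:
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public class NETRESOURCE
{
    public int dwScope;
    public int dwType;          // 1 = RESOURCETYPE_DISK
    public int dwDisplayType;
    public int dwUsage;
    public string lpLocalName;  // null for a deviceless (UNC-only) connection
    public string lpRemoteName; // e.g. @"\\server\share"
    public string lpComment;
    public string lpProvider;
}

public static class ShareMapper
{
    [DllImport("mpr.dll", CharSet = CharSet.Auto)]
    private static extern int WNetAddConnection2(NETRESOURCE netResource,
        string password, string username, int flags);

    public static void Connect(string uncPath, string user, string password)
    {
        var resource = new NETRESOURCE { dwType = 1, lpRemoteName = uncPath };
        int result = WNetAddConnection2(resource, password, user, 0);
        if (result != 0) // 0 == NO_ERROR
        {
            throw new System.ComponentModel.Win32Exception(result);
        }
    }
}
Once the connection is established, ordinary File.Copy to the UNC path works against the mapped share.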
I ended up just passing information to a web server via a PHP script.
The options provided above just didn't work out for my situation.
Here's the gist of it. I've got some code in there with progress bars and various checks and handlers unrelated to simply sending a file, but I'm sure you can pick through it. I've removed my authentication code from both the C# and the PHP, but it shouldn't be too hard to roll your own, if necessary.
in C#:
/*
* Here's the short+sweet about how I'm doing this
 * 1) Copy the file from the mobile device to the web server by querying the PHP script with parameters for each line
* 2) PHP script checks 1) If we got the whole data file 2) If this is a duplicate data file
* 3) If it is a duplicate, or we didn't get the whole thing, it goes away. The mobile
 * device will hang on to its data file in the first case (if it's a duplicate it deletes it)
* to be tried again later
* 4) The server will then process the data files using a scheduled task/cron job at an appropriate time
*/
private void process_attempts()
{
Uri CheckUrl = new Uri("http://path/to/php/script?action=check");
WebRequest checkReq = WebRequest.Create(CheckUrl);
try
{
WebResponse CheckResp = checkReq.GetResponse();
CheckResp.Close();
}
catch
{
MessageBox.Show("Error! Connection not available. Please make sure you are online.");
this.Invoke(new Close(closeme));
}
StreamReader dataReader = File.OpenText(datafile);
String line = null;
line = dataReader.ReadLine();
while (line != null)
{
Uri Url = new Uri("http://path/to/php/script?action=process&line=" + line);
WebRequest WebReq = WebRequest.Create(Url);
try
{
WebResponse Resp = WebReq.GetResponse();
Resp.Close();
}
catch
{
MessageBox.Show("Error! Connection not available. Please make sure you are online.");
this.Invoke(new Close(closeme));
return;
}
try
{
process_bar.Invoke(new SetInt(SetBarValue), new object[] { processed });
}
catch { }
process_num.Invoke(new SetString(SetNumValue), new object[] { processed + "/" + attempts });
processed++;
line = dataReader.ReadLine();
}
dataReader.Close();
Uri Url2 = new Uri("http://path/to/php/script?action=finalize&lines=" + attempts);
Boolean finalized = false;
WebRequest WebReq2 = WebRequest.Create(Url2);
try
{
WebResponse Resp = WebReq2.GetResponse();
Resp.Close();
finalized = true;
}
catch
{
MessageBox.Show("Error! Connection not available. Please make sure you are online.");
this.Invoke(new Close(closeme));
finalized = false;
}
MessageBox.Show("Done!");
this.Invoke(new Close(closeme));
}
In PHP (thoroughly commented for your benefit!):
<?php
//Get the GET'd values from the C#
//The current line being processed
$line = $_GET['line'];
//Which action we are doing
$action = $_GET['action'];
//# of lines in the source file
$totalLines = $_GET['lines'];
//If we are processing the line, open the data file, and append this new line and a newline.
if($action == "process"){
$dataFile = "tempdata/SOME_KIND_OF_UNIQUE_FILENAME.dat";
//open the file
$fh = fopen($dataFile, 'a');
//Write the line, and a newline to the file
fwrite($fh, $line."\r\n");
//Close the file
fclose($fh);
//Exit the script
exit();
}
//If we are done processing the original file from the C# application, make sure the number of lines in the new file matches that in the
//file we are transferring. An expansion of this could be to compare some kind of hash function value of both files...
if($action == "finalize"){
$dataFile = "tempdata/SOME_KIND_OF_UNIQUE_FILENAME.dat";
//Count the number of lines in the new file
$lines = count(file($dataFile));
//If the new file and the old file have the same number of lines...
if($lines == $totalLines){
//File has the matching number of lines, good enough for me over TCP.
//We should move or rename this file.
}else{
//File does NOT have the same number of lines as the source file.
}
exit();
}
if($action == "check"){
//If a file with this unique file name already exists, delete it.
$dataFile = "tempdata/SOME_KIND_OF_UNIQUE_FILENAME.dat";
if(file_exists($dataFile)){
unlink($dataFile);
}
}
?>