C#: Opening and processing a very large GeoJSON file

I'm trying to process a 25GB GeoJSON file using GeoJSON.net.
The accepted answer here works on a small test file but throws out-of-memory exceptions with the large 25GB file.
There isn't much information about how to process the FeatureCollection, so I'm just looping through its features.
Can anyone advise what I'm doing wrong?
CODE
try
{
JsonSerializer serializer = new JsonSerializer();
using (FileStream s = File.Open(jsonFile, FileMode.Open))
using (StreamReader sr = new StreamReader(s, Encoding.UTF8))
using (JsonReader reader = new JsonTextReader(sr))
{
while (reader.Read())
{
// deserialize only when there's "{" character in the stream
if (reader.TokenType == JsonToken.StartObject)
{
FeatureCollection FC = serializer.Deserialize<FeatureCollection>(reader);
// Errors Here
foreach (var Feature in FC.Features)
{
if (Feature.Properties.ContainsKey("place"))
{
foreach (var p in Feature.Properties)
{
var Key = p.Key;
var Value = p.Value;
Console.WriteLine("Tags K: {0} Value: {1} ", Key, Value);
}
}
}
}
}
}
}
catch (Exception e)
{
Console.WriteLine("Err: " + e.Message);
}
The second answer on the same page doesn't do anything; the loop body is never reached.
BTW, JsonReaderExtensions is in a separate file copied from that page.
Regex regex = new Regex(@"^\[\d+\]$");
using (FileStream s = File.Open(jsonFile, FileMode.Open))
using (StreamReader sr = new StreamReader(s, Encoding.UTF8))
using (JsonReader reader = new JsonTextReader(sr))
{
IEnumerable<FeatureCollection> objects = reader.SelectTokensWithRegex<FeatureCollection>(regex);
foreach (var Feature in objects)
{
Console.WriteLine("Hello");
// Doesn't get here
}
}
Update:
I think the problem is with GeoJSON.net rather than Newtonsoft.Json, as I've used the same method as above to open bigger JSON files using
dynamic jsonFeatures = serializer.Deserialize<ExpandoObject>(reader);
Following the comment by @Martin Costello, I've come up with opening the file with a standard StreamReader, reading it line by line, and converting the filtered lines back into valid GeoJSON.
I'm sure there must be a better way to do this?
string Start = @"{
""type"": ""FeatureCollection"",
""name"": ""points"",
""crs"": { ""type"": ""name"", ""properties"": { ""name"": ""urn:ogc:def:crs:OGC:1.3:CRS84"" } },
""features"": [";
String End = @"]
}
";
try
{
string line;
StreamReader INfile = new StreamReader(jsonFile);
while ((line = INfile.ReadLine()) != null)
{
// for debugging
CurrentLine = line;
// Filter only the line with "place"
if (Regex.Match(line, @"\bplace\b", RegexOptions.IgnoreCase).Success)
{
// rebuild to valid GeoJSON
string json = Start + line + End;
FeatureCollection FC = JsonConvert.DeserializeObject<FeatureCollection>(json);
foreach (var Feature in FC.Features)
{
foreach (var p in Feature.Properties)
{
var Key = p.Key;
var Value = p.Value;
switch (Key.ToLower())
{
// Do Stuff
}
}
}
}
}
}
catch (Exception e)
{
Console.OutputEncoding = System.Text.Encoding.UTF8;
Console.WriteLine("Err: " + e.Message + "\n" + CurrentLine);
}
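One way to avoid materializing the whole FeatureCollection is to walk the token stream down to the top-level "features" array and load one feature at a time. The following is only a minimal sketch, assuming Newtonsoft.Json is referenced; GeoJsonStreaming and ReadFeatures are hypothetical names, and each feature is loaded as a JObject rather than a GeoJSON.net Feature so that the sketch does not rely on GeoJSON.net's converters.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of GeoJSON.net or Newtonsoft.Json.
public static class GeoJsonStreaming
{
    // Lazily yields each element of the root object's "features" array,
    // so only one feature is held in memory at a time.
    public static IEnumerable<JObject> ReadFeatures(TextReader input)
    {
        using (var reader = new JsonTextReader(input))
        {
            while (reader.Read())
            {
                // Property names of the root object are reported at depth 1.
                if (reader.TokenType == JsonToken.PropertyName
                    && reader.Depth == 1
                    && (string)reader.Value == "features")
                {
                    reader.Read(); // advance to the start of the array
                    if (reader.TokenType != JsonToken.StartArray)
                    {
                        yield break; // "features" is not an array; nothing to read
                    }

                    // Each StartObject inside the array is one feature.
                    while (reader.Read() && reader.TokenType == JsonToken.StartObject)
                    {
                        yield return JObject.Load(reader);
                    }
                    yield break;
                }
            }
        }
    }
}
```

With this, the "place" filter from the question could be applied per feature, for example by checking whether feature["properties"] has a "place" entry before printing its tags. The caller would pass a StreamReader opened on the 25GB file.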

Related

C# : Editing/saving/Sending a docx document

I've been struggling with a lot of problems. Using OpenXML on an ASP.NET Core server, I want to create a new docx document based on a template. Once this document is fully saved, I want to send it to my client so they can download it directly. Here's my code:
public IActionResult Post([FromBody] Consultant consultant)
{
using (Stream templateStream = new MemoryStream(Properties.Resources.templateDossierTech))
using (WordprocessingDocument template =
WordprocessingDocument.Open(templateStream, false))
{
string fileName = environment.WebRootPath + @"\Resources\"+ consultant.FirstName + "_" + consultant.LastName + ".docx";
WordprocessingDocument dossierTechniqueDocument =
WordprocessingDocument.Create(fileName,
WordprocessingDocumentType.Document);
foreach (var part in template.Parts)
{
dossierTechniqueDocument.AddPart(part.OpenXmlPart, part.RelationshipId);
}
var body = dossierTechniqueDocument.MainDocumentPart.Document.Body;
var paras = body.Elements();
foreach (var para in paras)
{
foreach (var run in para.Elements())
{
foreach (var text in run.Elements())
{
if (text.InnerText.Contains("{{prenom}}"))
{
var t = new Text(text.InnerText.Replace("{{prenom}}", consultant.FirstName));
run.RemoveAllChildren<Text>();
run.AppendChild(t);
}
}
}
}
dossierTechniqueDocument.MainDocumentPart.Document.Save();
dossierTechniqueDocument.Close();
var cd = new System.Net.Mime.ContentDisposition
{
FileName = consultant.FirstName + "_" + consultant.LastName + ".docx",
Inline = true
};
Response.Headers.Add("Content-Disposition", cd.ToString());
Response.Headers.Add("X-Content-Type-Options", "nosniff");
return File(System.IO.File.ReadAllBytes(fileName),"application/vnd.openxmlformats-officedocument.wordprocessingml.document","Dossier Technique");
}
}
At first glance, it looks like the file is saved correctly, but when I try to open it in Word, it says the file is corrupted for some reason.
I get the same problem when I try to send it. Once it's sent, my client doesn't download it (Ajax query).
Does anyone have any idea how to fix this?
Here is the function which creates a document from a template:
static void GenerateDocumentFromTemplate(string inputPath, string outputPath)
{
MemoryStream documentStream;
using (Stream stream = File.OpenRead(inputPath))
{
documentStream = new MemoryStream((int)stream.Length);
CopyStream(stream, documentStream);
documentStream.Position = 0L;
}
using (WordprocessingDocument template = WordprocessingDocument.Open(documentStream, true))
{
template.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
MainDocumentPart mainPart = template.MainDocumentPart;
mainPart.DocumentSettingsPart.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate",
new Uri(inputPath, UriKind.Absolute));
mainPart.Document.Save();
}
File.WriteAllBytes(outputPath, documentStream.ToArray());
}

Convert .XYZ to .csv using c#

Hi, I am using this method to replace " " with ",", but it fails when I try to use it on data that has 32 million lines. Does anyone know how to modify it to make it run?
List<String> lines = new List<String>();
//loop through each line of file and replace " " sign with ","
using (StreamReader sr = new StreamReader(inputfile))
{
int id = 1;
int i = File.ReadAllLines(inputfile).Count();
while (sr.Peek() >= 0)
{
//Out of memory issue
string fileLine = sr.ReadLine();
//do something with line
string ttt = fileLine.Replace(" ", ", ");
//Debug.WriteLine(ttt);
lines.Add(ttt);
//lines.Add(id++, 'ID');
}
using (StreamWriter writer = new StreamWriter(outputfile, false))
{
foreach (String line in lines)
{
writer.WriteLine(line+","+id);
id++;
}
}
}
//change extension to .csv
FileInfo f = new FileInfo(outputfile);
f.MoveTo(Path.ChangeExtension(outputfile, ".csv"));
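In general, I am trying to convert a big .XYZ file to .csv format and add an incremental field at the end. To be honest, I am using C# for the first time in my life :) Can you help me?
Your code holds the whole file in memory twice: File.ReadAllLines(inputfile) loads every line just to count them, and the lines list collects every converted line before anything is written. You could instead write each line as soon as you read it, as follows: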
using (StreamReader sr = new StreamReader(inputfile))
{
using (StreamWriter writer = new StreamWriter(outputfile, false))
{
int id = 1;
while (sr.Peek() >= 0)
{
string fileLine = sr.ReadLine();
//do something with line
string ttt = fileLine.Replace(" ", ", ");
writer.WriteLine(ttt + "," + id);
id++;
}
}
}

Unable to delete a record from a flat file c#

I am trying to delete a specific line from a text file using C#:
public static void DeleteProject(int id)
{
var fileloc = WebConfigurationManager.AppSettings["FFProject-Manager"];
var fileloc2 = WebConfigurationManager.AppSettings["TempFile"];
string line = null;
using (StreamReader reader = new StreamReader(fileloc))
{
using (StreamWriter writer = new StreamWriter(fileloc2))
{
while ((line = reader.ReadLine()) != null)
{
string uidCompare = Convert.ToString(line.Split(',')[0]);
string uidToCompare = Convert.ToString(id);
if (String.Compare(uidCompare, uidToCompare) == 0)
continue;
writer.WriteLine(line);
}
}
}
File.Delete(fileloc);
File.Move(fileloc2, fileloc);
}
}
But when I execute it, I get this error:
The process cannot access the file 'C:\Flat_Files\Project-Manager.txt' because it is being used by another process.
Where am I going wrong?

How can I remove the oldest lines in a file when using a FileStream and StreamWriter?

Based on Prakash's answer here, I thought I'd try something like this to remove the oldest lines in a file prior to adding a new line to it:
private ExceptionLoggingService()
{
_fileStream = File.OpenWrite(GetExecutionFolder() + "\\Application.log");
_streamWriter = new StreamWriter(_fileStream);
}
public void WriteLog(string message)
{
const int MAX_LINES_DESIRED = 1000;
StringBuilder formattedMessage = new StringBuilder();
formattedMessage.AppendLine("Date: " + DateTime.Now.ToString());
formattedMessage.AppendLine("Message: " + message);
// First, remove the earliest lines from the file if it's grown too much
List<string> logList = File.ReadAllLines(_fileStream).ToList();
while (logList.Count > MAX_LINES_DESIRED)
{
logList.RemoveAt(0);
}
File.WriteAllLines(_fileStream, logList.ToArray());
_streamWriter.WriteLine(formattedMessage.ToString());
_streamWriter.Flush();
}
...but in my version of .NET (Compact Framework, Windows CE C# project in VS 2008), neither ReadAllLines() nor WriteAllLines() are available.
What is the ReadAllLines/WriteAllLines-challenged way of accomplishing the same thing?
UPDATE
This is doubtless kludgy, but it seems like it should work, and I'm going to test it out. I moved the "shorten the log file" code from the WriteLog() method to the constructor:
private ExceptionLoggingService()
{
const int MAX_LINES_DESIRED = 1000;
string uriPath = GetExecutionFolder() + "\\Application.log";
string localPath = new Uri(uriPath).LocalPath;
if (!File.Exists(localPath))
{
File.Create(localPath);
}
_fileStream = File.OpenWrite(localPath);
// First, remove the earliest lines from the file if it's grown too much
StreamReader reader = new StreamReader(_fileStream);
List<String> logList = new List<String>();
while (!reader.EndOfStream)
{
logList.Add(reader.ReadLine());
}
while (logList.Count > MAX_LINES_DESIRED)
{
logList.RemoveAt(0);
}
if (logList.Count > MAX_LINES_DESIRED)
{
_fileStream.Close();
File.Delete(GetExecutionFolder() + "\\Application.log");
File.Create(GetExecutionFolder() + "\\Application.log");
_fileStream = File.OpenWrite(GetExecutionFolder() + "\\Application.log");
}
_streamWriter = new StreamWriter(_fileStream);
foreach (String s in logList)
{
_streamWriter.WriteLine(s);
_streamWriter.Flush();
}
}
public void WriteLog(string message)
{
StringBuilder formattedMessage = new StringBuilder();
formattedMessage.AppendLine("Date: " + DateTime.Now.ToString());
formattedMessage.AppendLine("Message: " + message);
_streamWriter.WriteLine(formattedMessage.ToString());
_streamWriter.Flush();
}
ReadAllLines and WriteAllLines are just hiding a loop from you. Just do:
StreamReader reader = new StreamReader(_fileStream);
List<String> logList = new List<String>();
while (!reader.EndOfStream)
logList.Add(reader.ReadLine());
Note that this is nearly identical to the implementation of File.ReadAllLines (from MSDN Reference Source)
String line;
List<String> lines = new List<String>();
using (StreamReader sr = new StreamReader(path, encoding))
while ((line = sr.ReadLine()) != null)
lines.Add(line);
return lines.ToArray();
WriteAllLines is similar:
using (StreamWriter writer = new StreamWriter(path, false)) // Don't append!
{
foreach (String line in logList)
{
writer.WriteLine(line);
}
}
I would write simple extension methods for this that do the job lazily, without loading the whole file into memory.
Usage would be something like this:
outfile.MyWriteLines(infile.MyReadLines().Skip(1));
public static class Extensions
{
public static IEnumerable<string> MyReadLines(this FileStream f)
{
var sr = new StreamReader(f);
var line = sr.ReadLine();
while (line != null)
{
yield return line;
line = sr.ReadLine();
}
}
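public static void MyWriteLines(this FileStream f, IEnumerable<string> lines)
{
var sw = new StreamWriter(f);
foreach(var line in lines)
{
sw.WriteLine(line);
}
// Flush so buffered lines actually reach the stream; the caller still owns and closes f.
sw.Flush();
}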
}
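For the original goal of keeping only the newest lines, the same streaming idea can be combined with a bounded queue, so that no more than the desired number of lines is buffered. This is a minimal sketch under a couple of assumptions: LogTrimmer and CopyLastLines are hypothetical names, and Queue<T> is assumed to be available on your framework version.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration only.
public static class LogTrimmer
{
    // Copies only the last maxLines lines of input to output.
    // At most maxLines + 1 lines are buffered at any moment.
    public static void CopyLastLines(TextReader input, TextWriter output, int maxLines)
    {
        var window = new Queue<string>();
        string line;
        while ((line = input.ReadLine()) != null)
        {
            window.Enqueue(line);
            if (window.Count > maxLines)
            {
                window.Dequeue(); // drop the oldest line
            }
        }

        foreach (string kept in window)
        {
            output.WriteLine(kept);
        }
        output.Flush();
    }
}
```

In practice you would probably read from the existing log with a StreamReader, write to a temporary file with a StreamWriter, and then replace the log with the temporary file, since reading and rewriting the same open stream is awkward.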

appending textfile using streamwriter

I have the following code, which runs in sequence.
I always create a new text file at the beginning. Then, for the 2nd and 3rd portions of the code, I just need to append to the text file "checksnapshot". How do I do that using StreamWriter?
//1
using (StreamReader sr = new StreamReader("C:\\Work\\labtoolssnapshot.txt")) ;
{
string contents = sr.ReadToEnd();
using (StreamWriter sw = new StreamWriter("C:\\Work\\checksnapshot.properties"))
{
if (contents.Contains(args[0]))
{
sw.WriteLine("SASE= 1");
}
else
{
sw.WriteLine("SASE= 0");
}
}
}
//2
using (StreamReader sr = new StreamReader("C:\\Work\\analyzercommonsnapshot.txt")) ;
{
string contents = sr.ReadToEnd();
using (StreamWriter sw = new StreamWriter("C:\\Work\\checksnapshot.properties"))
{
if (contents.Contains(args[0]))
{
sw.WriteLine("Analyzer= 1");
}
else
{
sw.WriteLine("Analyzer= 0");
}
}
}
//3
using (StreamReader sr = new StreamReader("C:\\Work\\mobilesnapshot.txt")) ;
{
string contents = sr.ReadToEnd();
using (StreamWriter sw = new StreamWriter("C:\\Work\\checksnapshot.properties"))
{
if (contents.Contains(args[0]))
{
sw.WriteLine("mobile= 1");
}
else
{
sw.WriteLine("mobile= 0");
}
}
}
What about doing this,
new StreamWriter("C:\\Work\\checksnapshot.properties",true)
true means append if the file exists; if it does not exist, a new file is created.
By using the proper constructor of StreamWriter:
new StreamWriter(someFile, true)
will open someFile and append.
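As a rough sketch of how the three steps might look with that constructor: the first write passes false to start a fresh file, and the later ones pass true to append. SnapshotWriter and WriteFlag are hypothetical names introduced only for this illustration.

```csharp
using System.IO;

// Hypothetical helper for illustration only.
public static class SnapshotWriter
{
    // Writes one "key= value" line.
    // append == false creates or overwrites the file; append == true adds to its end.
    public static void WriteFlag(string path, string key, bool found, bool append)
    {
        using (var sw = new StreamWriter(path, append))
        {
            sw.WriteLine(key + "= " + (found ? "1" : "0"));
        }
    }
}
```

The first step would call WriteFlag(outFile, "SASE", contents.Contains(args[0]), false), and the second and third would pass true as the last argument.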
I don't know why your code doesn't work, but why don't you use the built-in methods:
string contents = File.ReadAllText("C:\\Work\\labtoolssnapshot.txt");
string contents2 = File.ReadAllText("C:\\Work\\analyzercommonsnapshot.txt");
string contents3 = File.ReadAllText("C:\\Work\\mobilesnapshot.txt");
string outFile = "C:\\Work\\checksnapshot.properties";
//1: WriteAllText creates or overwrites the file
if (contents.Contains(args[0]))
{
File.WriteAllText(outFile,"SASE=1" + Environment.NewLine);
}
else
{
File.WriteAllText(outFile,"SASE=0" + Environment.NewLine);
}
//2: AppendAllText adds to the end; the explicit newline keeps each entry on its own line
if (contents2.Contains(args[0]))
{
File.AppendAllText(outFile,"Analyzer= 1" + Environment.NewLine);
}
else
{
File.AppendAllText(outFile,"Analyzer= 0" + Environment.NewLine);
}
//3
if (contents3.Contains(args[0]))
{
File.AppendAllText(outFile,"mobile= 1" + Environment.NewLine);
}
else
{
File.AppendAllText(outFile,"mobile= 0" + Environment.NewLine);
}
Or, even lazier:
var contents = File.ReadAllText("C:\\Work\\labtoolssnapshot.txt");
var contents2 = File.ReadAllText("C:\\Work\\analyzercommonsnapshot.txt");
var contents3 = File.ReadAllText("C:\\Work\\mobilesnapshot.txt");
var outFile = "C:\\Work\\checksnapshot.properties";
File.WriteAllText(outFile, string.Format(
@"SASE= {0}
Analyzer= {1}
mobile= {2}
",
contents.Contains(args[0]) ? "1" : "0",
contents2.Contains(args[0]) ? "1" : "0",
contents3.Contains(args[0]) ? "1" : "0"
));
Open a FileStream in append mode and wrap it in a StreamWriter:
using (FileStream fs = new FileStream("C:\\Work\\checksnapshot.properties", FileMode.Append, FileAccess.Write))
using (StreamWriter writer = new StreamWriter(fs))
{
writer.Write(whatever);
}
Note: I've only used this in .NET 4.
