Linq to XML parses files in a folder - c#

So I have this code building with no errors, but I need to alter how it opens the XML documents. Right now it can open a single XML document; what I need it to do is open a folder on my C: drive and parse all the XML files in that folder. Any help?
static void Main(string[] args)
{
XDocument doc = XDocument.Load(@"c:\.cfg"); //Change here
var query = from x in doc.Descendants("X")
select new
{
Max1 = x.Attribute("Max").Value,
Min2 = x.Attribute("Min").Value
};
foreach (var x in query) ;
Console.WriteLine("X");
var query2 = from x in doc.Descendants("Y")
select new
{
Max3 = x.Attribute("Max").Value,
Min4 = x.Attribute("Min").Value
};
foreach (var x in query2)
Console.WriteLine("Y");
var query3 = from x in doc.Descendants("ZA")
select new
{
Max5 = x.Attribute("Max").Value,
Min6 = x.Attribute("Min").Value
};
foreach (var x in query3)
Console.WriteLine("Z");
}

You should loop through Directory.EnumerateFiles(@"C:\Something", "*.xml").
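
A minimal sketch of that loop, assuming the folder C:\Something from the line above and the element names ("X", "Y", "ZA") and Max/Min attributes from the question. FolderParser and DescribeLimits are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

public static class FolderParser
{
    // Hypothetical helper: yields one text line per element called elementName.
    public static IEnumerable<string> DescribeLimits(XDocument doc, string elementName)
    {
        foreach (var element in doc.Descendants(elementName))
        {
            // The (string) cast gives null instead of throwing when an attribute is missing.
            var max = (string)element.Attribute("Max");
            var min = (string)element.Attribute("Min");
            yield return string.Format("{0}: Max = {1} | Min = {2}", elementName, max, min);
        }
    }

    public static void Main()
    {
        // Assumed folder; replace it with the real location of the XML files.
        foreach (var file in Directory.EnumerateFiles(@"C:\Something", "*.xml"))
        {
            var doc = XDocument.Load(file);
            foreach (var name in new[] { "X", "Y", "ZA" })
                foreach (var line in DescribeLimits(doc, name))
                    Console.WriteLine(line);
        }
    }
}
```

Keeping the per-document work in its own method means the same code runs unchanged for one file or for a whole folder.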

... A slightly more "declarative" manner:
// Program.cs
class Program
{
static void Main(string[] args)
{
const string path = @"C:\stuff";
Parallel.ForEach(Directory.EnumerateFiles(path, "*.xml"), x => Walk(XDocument.Load(x)));
}
static IEnumerable<Calib> MapItem(IEnumerable<XElement> elements)
{
return elements.Select(x => new Calib
{
Max = x.Attribute("Max").Value,
Min = x.Attribute("Min").Value
});
}
static void Walk(XDocument doc)
{
var xitems = MapItem(doc.Descendants("XaxisCalib"));
xitems.Iter(x => Console.WriteLine("(XaxisCalib) X: Min = {0} | Max = {1}", x.Min, x.Max));
var yitems = MapItem(doc.Descendants("YAxisCalib"));
yitems.Iter(x => Console.WriteLine("(YaxisCalib) Y: Min = {0} | Max = {1}", x.Min, x.Max));
var zitems = MapItem(doc.Descendants("ZAxisCalib"));
zitems.Iter(x => Console.WriteLine("(ZaxisCalib) Z: Min = {0} | Max = {1}", x.Min, x.Max));
}
}
// Exts.cs
public static class Exts
{
public static void Iter<T>(this IEnumerable<T> source, Action<T> action)
{
foreach (var item in source)
{
action(item);
}
}
}
// Calib.cs
public class Calib
{
public string Max { get; set; }
public string Min { get; set; }
}

Rather than just writing the values out to the console, you could create a new XML document from the values in the files and do whatever you want with that (generate an Excel spreadsheet?):
var fileData = new XElement("root",
    from file in new System.IO.DirectoryInfo(@"C:\Something").GetFiles()
    where file.Extension.Equals(".xml", StringComparison.CurrentCultureIgnoreCase)
    let doc = XElement.Load(file.FullName)
    select new XElement("File",
        new XAttribute("Path", file.FullName),
        new XElement("XAxisCalibs",
            from x in doc.Descendants("XAxisCalib")
            select new XElement("XAxisCalib",
                new XAttribute("Max", x.Attribute("Max").Value),
                new XAttribute("Min", x.Attribute("Min").Value)
            )
        ),
        new XElement("YAxisCalibs",
            from y in doc.Descendants("YAxisCalib")
            select new XElement("YAxisCalib",
                new XAttribute("Max", y.Attribute("Max").Value),
                new XAttribute("Min", y.Attribute("Min").Value)
            )
        ),
        new XElement("ZAxisCalibs",
            from z in doc.Descendants("ZAxisCalib")
            select new XElement("ZAxisCalib",
                new XAttribute("Max", z.Attribute("Max").Value),
                new XAttribute("Min", z.Attribute("Min").Value)
            )
        )
    )
);
Granted, since this is completely declarative and one long statement, it is a bit tricky to debug if necessary.


Elastic Search MoreLikeThis Query Never Returns Results

I must be doing something fundamentally wrong here. I'm trying to get a "More Like This" query working in a search engine project we have that uses Elastic Search. The idea is that the CMS can write tags (like categories) to the page in a Meta tag or something, and we would read those into Elastic and use them to drive a "more like this" search based upon an input document id.
So if the input document has tags of catfish, chicken, goat I would expect Elastic Search to find other documents that share those tags and not return ones for racecar and airplane.
I've built a proof of concept console app by:
Getting a local Elastic Search 6.6.1 instance running in Docker by following the instructions on https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html
Creating a new .NET Framework 4.6.1 Console App
Adding the NuGet packages for NEST 6.5.0 and ElasticSearch.Net 6.5.0
Then I created a new elastic index that contains objects (Type "MyThing") that have a "Tags" property. This tag is a random comma-delimited set of words from a set of possible values. I've inserted anywhere from 100 to 5000 items in the index in testing. I've tried more and fewer possible words in the set.
No matter what I try the MoreLikeThis query never returns anything, and I don't understand why.
Query that isn't returning results:
var result = EsClient.Search<MyThing>(s => s
.Index(DEFAULT_INDEX)
.Query(esQuery =>
{
var mainQuery = esQuery
.MoreLikeThis(mlt => mlt
.Include(true)
.Fields(f => f.Field(ff => ff.Tags, 5))
.Like(l => l.Document(d => d.Id(id)))
);
return mainQuery;
}
));
Full "program.cs" source:
using Nest;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Test_MoreLikeThis_ES6
{
class Program
{
public class MyThing
{
public string Tags { get; set; }
}
const string ELASTIC_SERVER = "http://localhost:9200";
const string DEFAULT_INDEX = "my_index";
const int NUM_RECORDS = 1000;
private static Uri es_node = new Uri(ELASTIC_SERVER);
private static ConnectionSettings settings = new ConnectionSettings(es_node).DefaultIndex(DEFAULT_INDEX);
private static ElasticClient EsClient = new ElasticClient(settings);
private static Random rnd = new Random();
static void Main(string[] args)
{
Console.WriteLine("Rebuild index? (y):");
var answer = Console.ReadLine().ToLower();
if (answer == "y")
{
RebuildIndex();
for (int i = 0; i < NUM_RECORDS; i++)
{
AddToIndex();
}
}
Console.WriteLine("");
Console.WriteLine("Getting a Thing...");
var aThingId = GetARandomThingId();
Console.WriteLine("");
Console.WriteLine("Looking for something similar to document with id " + aThingId);
Console.WriteLine("");
Console.WriteLine("");
GetMoreLikeAThing(aThingId);
}
private static string GetARandomThingId()
{
var firstdocQuery = EsClient
.Search<MyThing>(s =>
s.Size(1)
.Query(q => {
return q.FunctionScore(fs => fs.Functions(fn => fn.RandomScore(rs => rs.Seed(DateTime.Now.Ticks).Field("_seq_no"))));
})
);
if (!firstdocQuery.IsValid || firstdocQuery.Hits.Count == 0) return null;
var hit = firstdocQuery.Hits.First();
Console.WriteLine("Found a thing with id '" + hit.Id + "' and tags: " + hit.Source.Tags);
return hit.Id;
}
private static void GetMoreLikeAThing(string id)
{
var result = EsClient.Search<MyThing>(s => s
.Index(DEFAULT_INDEX)
.Query(esQuery =>
{
var mainQuery = esQuery
.MoreLikeThis(mlt => mlt
.Include(true)
.Fields(f => f.Field(ff => ff.Tags, 5))
.Like(l => l.Document(d => d.Id(id)))
);
return mainQuery;
}
));
if (result.IsValid)
{
if (result.Hits.Count > 0)
{
Console.WriteLine("These things are similar:");
foreach (var hit in result.Hits)
{
Console.WriteLine(" " + hit.Id + " : " + hit.Source.Tags);
}
}
else
{
Console.WriteLine("No similar things found.");
}
}
else
{
Console.WriteLine("There was an error running the ES query.");
}
Console.WriteLine("");
Console.WriteLine("Enter (y) to get another thing, or anything else to exit");
var y = Console.ReadLine().ToLower();
if (y == "y")
{
var aThingId = GetARandomThingId();
GetMoreLikeAThing(aThingId);
}
Console.WriteLine("");
Console.WriteLine("Any key to exit...");
Console.ReadKey();
}
private static void RebuildIndex()
{
var existsResponse = EsClient.IndexExists(DEFAULT_INDEX);
if (existsResponse.Exists) //delete existing mapping (and data)
{
EsClient.DeleteIndex(DEFAULT_INDEX);
}
var rebuildResponse = EsClient.CreateIndex(DEFAULT_INDEX, c => c.Settings(s => s.NumberOfReplicas(1).NumberOfShards(5)));
var response2 = EsClient.Map<MyThing>(m => m.AutoMap());
}
private static void AddToIndex()
{
var myThing = new MyThing();
var tags = new List<string> {
"catfish",
"tractor",
"racecar",
"airplane",
"chicken",
"goat",
"pig",
"horse",
"goose",
"duck"
};
var randNum = rnd.Next(0, tags.Count);
//get randNum random tags
var rand = tags.OrderBy(o => Guid.NewGuid().ToString()).Take(randNum);
myThing.Tags = string.Join(", ", rand);
var ir = new IndexRequest<MyThing>(myThing);
var indexResponse = EsClient.Index(ir);
Console.WriteLine("Index response: " + indexResponse.Id + " : " + string.Join(" " , myThing.Tags));
}
}
}
The issue here is that the default min_term_freq value of 2 will never be satisfied for any of the terms of the prototype document, because each document contains each tag (term) only once. If you drop min_term_freq to 1, you'll get results. You might also want to set min_doc_freq to 1, and combine the query with one that excludes the prototype document.
Here's an example to play with
const string ELASTIC_SERVER = "http://localhost:9200";
const string DEFAULT_INDEX = "my_index";
const int NUM_RECORDS = 1000;
private static readonly Random _random = new Random();
private static readonly IReadOnlyList<string> Tags =
new List<string>
{
"catfish",
"tractor",
"racecar",
"airplane",
"chicken",
"goat",
"pig",
"horse",
"goose",
"duck"
};
private static ElasticClient _client;
private static void Main()
{
var pool = new SingleNodeConnectionPool(new Uri(ELASTIC_SERVER));
var settings = new ConnectionSettings(pool)
.DefaultIndex(DEFAULT_INDEX);
_client = new ElasticClient(settings);
Console.WriteLine("Rebuild index? (y):");
var answer = Console.ReadLine().ToLower();
if (answer == "y")
{
RebuildIndex();
AddToIndex();
}
Console.WriteLine();
Console.WriteLine("Getting a Thing...");
var aThingId = GetARandomThingId();
Console.WriteLine();
Console.WriteLine("Looking for something similar to document with id " + aThingId);
Console.WriteLine();
Console.WriteLine();
GetMoreLikeAThing(aThingId);
}
public class MyThing
{
public List<string> Tags { get; set; }
}
private static string GetARandomThingId()
{
var firstdocQuery = _client
.Search<MyThing>(s =>
s.Size(1)
.Query(q => q
.FunctionScore(fs => fs
.Functions(fn => fn
.RandomScore(rs => rs
.Seed(DateTime.Now.Ticks)
.Field("_seq_no")
)
)
)
)
);
if (!firstdocQuery.IsValid || firstdocQuery.Hits.Count == 0) return null;
var hit = firstdocQuery.Hits.First();
Console.WriteLine($"Found a thing with id '{hit.Id}' and tags: {string.Join(", ", hit.Source.Tags)}");
return hit.Id;
}
private static void GetMoreLikeAThing(string id)
{
var result = _client.Search<MyThing>(s => s
.Index(DEFAULT_INDEX)
.Query(esQuery => esQuery
.MoreLikeThis(mlt => mlt
.Include(true)
.Fields(f => f.Field(ff => ff.Tags))
.Like(l => l.Document(d => d.Id(id)))
.MinTermFrequency(1)
.MinDocumentFrequency(1)
) && !esQuery
.Ids(ids => ids
.Values(id)
)
)
);
if (result.IsValid)
{
if (result.Hits.Count > 0)
{
Console.WriteLine("These things are similar:");
foreach (var hit in result.Hits)
{
Console.WriteLine($" {hit.Id}: {string.Join(", ", hit.Source.Tags)}");
}
}
else
{
Console.WriteLine("No similar things found.");
}
}
else
{
Console.WriteLine("There was an error running the ES query.");
}
Console.WriteLine();
Console.WriteLine("Enter (y) to get another thing, or anything else to exit");
var y = Console.ReadLine().ToLower();
if (y == "y")
{
var aThingId = GetARandomThingId();
GetMoreLikeAThing(aThingId);
}
Console.WriteLine();
Console.WriteLine("Any key to exit...");
}
private static void RebuildIndex()
{
var existsResponse = _client.IndexExists(DEFAULT_INDEX);
if (existsResponse.Exists) //delete existing mapping (and data)
{
_client.DeleteIndex(DEFAULT_INDEX);
}
var rebuildResponse = _client.CreateIndex(DEFAULT_INDEX, c => c
.Settings(s => s
.NumberOfShards(1)
)
.Mappings(m => m
.Map<MyThing>(mm => mm.AutoMap())
)
);
}
private static void AddToIndex()
{
var bulkAllObservable = _client.BulkAll(GetMyThings(), b => b
.RefreshOnCompleted()
.Size(1000));
var waitHandle = new ManualResetEvent(false);
Exception exception = null;
var bulkAllObserver = new BulkAllObserver(
onNext: r =>
{
Console.WriteLine($"Indexed page {r.Page}");
},
onError: e =>
{
exception = e;
waitHandle.Set();
},
onCompleted: () => waitHandle.Set());
bulkAllObservable.Subscribe(bulkAllObserver);
waitHandle.WaitOne();
if (exception != null)
{
throw exception;
}
}
private static IEnumerable<MyThing> GetMyThings()
{
for (int i = 0; i < NUM_RECORDS; i++)
{
var randomTags = Tags.OrderBy(o => Guid.NewGuid().ToString())
.Take(_random.Next(0, Tags.Count))
.OrderBy(t => t)
.ToList();
yield return new MyThing { Tags = randomTags };
}
}
And here's an example output
Found a thing with id 'Ugg9LGkBPK3n91HQD1d5' and tags: airplane, goat
These things are similar:
4wg9LGkBPK3n91HQD1l5: airplane, goat
9Ag9LGkBPK3n91HQD1l5: airplane, goat
Vgg9LGkBPK3n91HQD1d5: airplane, goat, goose
sQg9LGkBPK3n91HQD1d5: airplane, duck, goat
lQg9LGkBPK3n91HQD1h5: airplane, catfish, goat
9gg9LGkBPK3n91HQD1l5: airplane, catfish, goat
FQg9LGkBPK3n91HQD1p5: airplane, goat, goose
Jwg9LGkBPK3n91HQD1p5: airplane, goat, goose
Fwg9LGkBPK3n91HQD1d5: airplane, duck, goat, tractor
Kwg9LGkBPK3n91HQD1d5: airplane, goat, goose, horse

Get the positions of unique elements in a string[]

I have an xml file that I am accessing to create a report of time spent on a project. I'm returning the unique dates to a label created dynamically on a winform and would like to compile the time spent on a project for each unique date. I have been able to return all of the projects under each date or only one project. Currently I'm stuck on only returning one project. Can anyone please help me?? This is what the data should look like if it's correct.
04/11/15
26820 2.25
27111 8.00
04/12/15
26820 8.00
04/13/15
01det 4.33
26820 1.33
27225 4.25
etc.
This is how I'm retrieving the data
string[] weekDateString = elementDateWeekstring();
string[] uniqueDates = null;
string[] weeklyJobNumber = elementJobNumWeek();
string[] weeklyTicks = elementTicksWeek();
This is how I'm getting the unique dates.
IEnumerable<string> distinctWeekDateIE = weekDateString.Distinct();
foreach (string d in distinctWeekDateIE)
{
uniqueDates = distinctWeekDateIE.ToArray();
}
And this is how I'm creating the labels.
try
{
int dateCount;
dateCount = uniqueDates.Length;
Label[] lblDate = new Label[dateCount];
int htDate = 1;
int padDate = 10;
for (int i = 0; i < dateCount; i++ )
{
lblDate[i] = new Label();
lblDate[i].Name = uniqueDates[i].Trim('\r');
lblDate[i].Text = uniqueDates[i];
lblDate[i].TabIndex = i;
lblDate[i].Bounds = new Rectangle(18, 275 + padDate + htDate, 75, 22);
targetForm.Controls.Add(lblDate[i]);
htDate += 22;
foreach (string x in uniqueDates)
{
int[] posJobNumber;
posJobNumber = weekDateString.Select((b, a) => b == uniqueDates[i].ToString() ? a : -1).Where(a => a != -1).ToArray();
for (int pjn = 0; pjn < posJobNumber.Length; pjn++)
{
if (x.Equals(lblDate[i].Text))
{
Label lblJobNum = new Label();
int htJobNum = 1;
int padJobNum = 10;
lblJobNum.Name = weeklyJobNumber[i];
lblJobNum.Text = weeklyJobNumber[i];
lblJobNum.Bounds = new Rectangle(100, 295 + padJobNum + htJobNum, 75, 22);
targetForm.Controls.Add(lblJobNum);
htJobNum += 22;
htDate += 22;
padJobNum += 22;
}
}
}
}
}
I've been stuck on this for about 3 months. Can anyone explain why I'm not able to properly retrieve the job numbers that are associated with a particular date? I don't believe these are being returned as dates, just as strings that look like dates.
I really appreciate any help I can get. I'm just completely baffled. Thank you for any responses in advance. I truly appreciate the assistance.
EDIT: @Sayka - Here is the xml sample.
<?xml version="1.0" encoding="utf-8"?>
<Form1>
<Name Key="4/21/2014 6:51:17 AM">
<Date>4/21/2014</Date>
<JobNum>26820</JobNum>
<RevNum>00000</RevNum>
<Task>Modeling Secondary</Task>
<Start>06:51 AM</Start>
<End>04:27 PM</End>
<TotalTime>345945089017</TotalTime>
</Name>
<Name Key="4/22/2014 5:44:22 AM">
<Date>4/22/2014</Date>
<JobNum>26820</JobNum>
<RevNum>00000</RevNum>
<Task>Modeling Secondary</Task>
<Start>05:44 AM</Start>
<End>06:56 AM</End>
<TotalTime>43514201221</TotalTime>
</Name>
<Name Key="4/22/2014 6:57:02 AM">
<Date>4/22/2014</Date>
<JobNum>02e-n-g</JobNum>
<RevNum>00000</RevNum>
<Task>NET Eng</Task>
<Start>06:57 AM</Start>
<End>07:16 AM</End>
<TotalTime>11706118875</TotalTime>
</Name>
....
</Form1>
This is how I'm getting the information out of the xml file and returning a string[].
public static string[] elementDateWeekstring()
{
//string datetxtWeek = "";
XmlDocument xmldoc = new XmlDocument();
fileExistsWeek(xmldoc);
XmlNodeList nodeDate = xmldoc.GetElementsByTagName("Date");
int countTicks = 0;
string[] dateTxtWeek = new string[nodeDate.Count];
for (int i = 0; i < nodeDate.Count; i++)
{
dateTxtWeek[i] = nodeDate[i].InnerText;
countTicks++;
}
return dateTxtWeek;
}
Job number and Ticks are returned in a similar fashion. I've been able to reuse these snippets throughout the code. This is a one-dimensional XML file? It will always return a position for a job number that corresponds to a date or Ticks. I will never have more or fewer of any one element.
You can use Linq-to-XML to parse the XML file, and then use Linq-to-objects to group (and order) the data by job date and order each group by job name.
The code to parse the XML file is like so:
var doc = XDocument.Load(filename);
var jobs = doc.Descendants("Name");
// Extract the date, job number, and total time from each "Name" element:
var data = jobs.Select(job => new
{
Date = (DateTime)job.Element("Date"),
Number = (string)job.Element("JobNum"),
Duration = TimeSpan.FromTicks((long)job.Element("TotalTime"))
});
The code to group and order the jobs by date and order the groups by job name is:
var result =
data.GroupBy(job => job.Date).OrderBy(g => g.Key)
.Select(g => new
{
Date = g.Key,
Jobs = g.OrderBy(item => item.Number)
});
Then you can access the data by iterating over each group in result and then iterate over each job in the group, like so:
foreach (var jobsOnDate in result)
{
Console.WriteLine("{0:d}", jobsOnDate.Date);
foreach (var job in jobsOnDate.Jobs)
Console.WriteLine(" {0} {1:hh\\:mm}", job.Number, job.Duration);
}
Putting this all together in a sample compilable console application (substitute the filename for the XML file as appropriate):
using System;
using System.Linq;
using System.Xml.Linq;
namespace ConsoleApplication2
{
class Program
{
private static void Main()
{
string filename = @"d:\test\test.xml"; // Substitute your own filename here.
// Open XML file and get a collection of each "Name" element.
var doc = XDocument.Load(filename);
var jobs = doc.Descendants("Name");
// Extract the date, job number, and total time from each "Name" element:
var data = jobs.Select(job => new
{
Date = (DateTime)job.Element("Date"),
Number = (string)job.Element("JobNum"),
Duration = TimeSpan.FromTicks((long)job.Element("TotalTime"))
});
// Group the jobs by date, and order the groups by job name:
var result =
data.GroupBy(job => job.Date).OrderBy(g => g.Key)
.Select(g => new
{
Date = g.Key,
Jobs = g.OrderBy(item => item.Number)
});
// Print out the results:
foreach (var jobsOnDate in result)
{
Console.WriteLine("{0:d}", jobsOnDate.Date);
foreach (var job in jobsOnDate.Jobs)
Console.WriteLine(" {0} {1:hh\\:mm}", job.Number, job.Duration);
}
}
}
}
The output is like this: (screenshot not preserved)
Create a new Windows Forms project.
Make the form larger.
Add the code below.
Set XML_FILE_NAME to the location of your XML file.
Namespaces
using System.Xml;
using System.IO;
Form Code
public partial class Form1 : Form
{
const string XML_FILE_NAME = "D:\\emps.txt";
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
prepareDataGrid();
List<JOBS> jobsList = prepareXML(XML_FILE_NAME);
for (int i = 0; i < jobsList.Count; i++)
{
addDateRow(jobsList[i].jobDate.ToString("M'/'d'/'yyyy"));
for (int j = 0; j < jobsList[i].jobDetailsList.Count; j++)
dgv.Rows.Add(new string[] {
jobsList[i].jobDetailsList[j].JobNumber,
jobsList[i].jobDetailsList[j].JobHours
});
}
}
DataGridView dgv;
void prepareDataGrid()
{
dgv = new DataGridView();
dgv.BackgroundColor = Color.White;
dgv.GridColor = Color.White;
dgv.DefaultCellStyle.SelectionBackColor = Color.White;
dgv.DefaultCellStyle.SelectionForeColor = Color.Black;
dgv.DefaultCellStyle.ForeColor = Color.Black;
dgv.DefaultCellStyle.BackColor = Color.White;
dgv.DefaultCellStyle.Alignment = DataGridViewContentAlignment.MiddleRight;
dgv.Width = 600;
dgv.Dock = DockStyle.Left;
this.BackColor = Color.White;
dgv.Columns.Add("Col1", "Col1");
dgv.Columns.Add("Col2", "Col2");
dgv.Columns[0].Width = 110;
dgv.Columns[1].Width = 40;
dgv.DefaultCellStyle.Font = new System.Drawing.Font("Segoe UI", 10);
dgv.RowHeadersVisible = dgv.ColumnHeadersVisible = false;
dgv.AllowUserToAddRows =
dgv.AllowUserToDeleteRows =
dgv.AllowUserToOrderColumns =
dgv.AllowUserToResizeColumns =
dgv.AllowUserToResizeRows =
!(dgv.ReadOnly = true);
Controls.Add(dgv);
}
void addJobRow(string jobNum, string jobHours)
{
dgv.Rows.Add(new string[] {jobNum, jobHours });
}
void addDateRow(string date)
{
dgv.Rows.Add(new string[] { date, ""});
dgv.Rows[dgv.Rows.Count - 1].DefaultCellStyle.SelectionForeColor =
dgv.Rows[dgv.Rows.Count - 1].DefaultCellStyle.ForeColor = Color.Firebrick;
dgv.Rows[dgv.Rows.Count - 1].DefaultCellStyle.Font = new Font("Segoe UI Light", 13.5F);
dgv.Rows[dgv.Rows.Count - 1].DefaultCellStyle.Alignment = DataGridViewContentAlignment.MiddleLeft;
dgv.Rows[dgv.Rows.Count - 1].Height = 25;
}
List<JOBS> prepareXML(string fileName)
{
string xmlContent = "";
using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
using (StreamReader sr = new StreamReader(fs)) xmlContent = sr.ReadToEnd();
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlContent);
List<JOBS> jobsList = new List<JOBS>();
XmlNode form1Node = doc.ChildNodes[1];
for (int i = 0; i < form1Node.ChildNodes.Count; i++)
{
XmlNode dateNode = form1Node.ChildNodes[i].ChildNodes[0].ChildNodes[0],
jobNumNode = form1Node.ChildNodes[i].ChildNodes[1].ChildNodes[0],
timeTicksNode = form1Node.ChildNodes[i].ChildNodes[6].ChildNodes[0];
bool foundDate = false;
for (int j = 0; j < jobsList.Count; j++) if (jobsList[j].compareDate(dateNode.Value))
{
jobsList[j].addJob(jobNumNode.Value, Math.Round(TimeSpan.FromTicks(
(long)Convert.ToDouble(timeTicksNode.Value)).TotalHours, 2).ToString());
foundDate = true;
break;
}
if (!foundDate)
{
JOBS job = new JOBS(dateNode.Value);
string jbnum = jobNumNode.Value;
string tbtck = timeTicksNode.Value;
long tktk = Convert.ToInt64(tbtck);
double tkdb = TimeSpan.FromTicks(tktk).TotalHours;
job.addJob(jobNumNode.Value, Math.Round(TimeSpan.FromTicks(
Convert.ToInt64(timeTicksNode.Value)).TotalHours, 2).ToString());
jobsList.Add(job);
}
}
// OrderByDescending returns a new sequence (requires using System.Linq), so return its result.
return jobsList.OrderByDescending(x => x.jobDate).ToList();
}
class JOBS
{
public DateTime jobDate;
public List<JobDetails> jobDetailsList = new List<JobDetails>();
public void addJob(string jobNumber, string jobHours)
{
jobDetailsList.Add(new JobDetails() { JobHours = jobHours, JobNumber = jobNumber });
}
public JOBS(string dateString)
{
jobDate = getDateFromString(dateString);
}
public JOBS() { }
public bool compareDate(string dateString)
{
return getDateFromString(dateString) == jobDate;
}
private DateTime getDateFromString(string dateString)
{
string[] vals = dateString.Split('/');
return new DateTime(Convert.ToInt32(vals[2]), Convert.ToInt32(vals[0]), Convert.ToInt32(vals[1]));
}
}
class JobDetails
{
public string JobNumber { get; set; }
public string JobHours { get; set; }
}
}

Writing a LINQ result into a text file

I am trying to write the output of a LINQ query into a text file. For that I am using an extension method.
This is my LINQ query:
var group =
from c in census_data
group c by c.state into g
join s in state_gdp on g.FirstOrDefault().state equals s.state
orderby s.gdp descending
select new
{
State = g.Key,
Count = g.Count(),
SavingsBalance = g.Average(x => x.savingsBalanceDouble),
GDP = s.gdp
};
This is my extension method:
public static class CSVWriter
{
public static void write(this Enumerable e, string file)
{
using (System.IO.StreamWriter f = new System.IO.StreamWriter(file))
{
foreach (var i in e)
{
f.WriteLine(i);
}
}
}
}
However I am getting an error that says System.Linq.Enumerable does not have a getEnumerator method.
A possible solution can look like this:
var result =
from c in census_data
group c by c.state into g
join s in state_gdp on g.FirstOrDefault().state equals s.state
orderby s.gdp descending
select new
{
State = g.Key,
Count = g.Count(),
SavingsBalance = g.Average(x => x.savingsBalanceDouble),
GDP = s.gdp
};
var buffer = new StringBuilder();
buffer.AppendLine("#key,name,sum,gdp");
result.ToList().ForEach(item => buffer.AppendLine(String.Format("{0},{1},{2},{3}", item.State, item.Count, item.SavingsBalance, item.GDP)));
File.WriteAllText("d:\\temp\\file.csv", buffer.ToString());
You need to change Enumerable to IEnumerable. Since you are creating an anonymous object, your solution will pass the list of anonymous objects (IEnumerable<anonymous>) to the function that writes the data to the file, but you will not be able to format the output as desired.
One possible solution would be to put the lines you want to write to the file to a string buffer and then write the text at once using the System.IO.File.WriteAllText method:
// test data
var data = new List<Int32> { 1, 20, 30, 40, 50, 70 };
// create a list of anonymous objects
var result = data.Select (d => new
{
Count = d,
State = String.Format("Item {0}", d),
SavingBalance = d * 10
});
// create the output text buffer
var buffer = new StringBuilder();
// add header line
buffer.AppendLine("#key,name,sum");
// add each result line
result.ToList().ForEach(item => buffer.AppendLine(String.Format("{0},{1},{2}", item.Count, item.State, item.SavingBalance)));
// write to file
File.WriteAllText("d:\\temp\\file.csv", buffer.ToString());
The output is:
#key,name,sum
1,Item 1,10
20,Item 20,200
30,Item 30,300
40,Item 40,400
50,Item 50,500
70,Item 70,700
The solution which @aravol and @StephneKennedy mentioned will look like this:
public static class CSVWriter
{
public static void write<T>(this IEnumerable<T> e, string file)
{
using (System.IO.StreamWriter f = new System.IO.StreamWriter(file))
{
foreach (var i in e)
{
f.WriteLine(i);
}
}
}
}
and can be used like this:
result.write<object>(file);
As already stated, the problem with this solution is that you can not format the output, because you are using the Object.ToString method and you can't format it (the default output looks something like { key = value, key = value, ... }).
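One way around that formatting limit without introducing a typed class is to let the caller pass the formatting as a delegate. This is only a sketch: CsvWriterSketch and WriteLines are hypothetical names, not part of the original answers.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvWriterSketch
{
    // Hypothetical extension: the caller supplies the line format,
    // so the element type T can stay anonymous.
    public static void WriteLines<T>(this IEnumerable<T> source, string file, Func<T, string> format)
    {
        using (var writer = new StreamWriter(file))
        {
            foreach (var item in source)
            {
                writer.WriteLine(format(item));
            }
        }
    }
}
```

Because T is inferred at the call site, the lambda can read the anonymous type's properties directly, for example result.WriteLines(file, item => item.State + "," + item.Count).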
If you still want to transfer the result to another method, then create a typed class and create an object for every result entry (and then transfer the list). An example typed class can look like this:
public class Placeholder
{
public String Name { get; set; }
public Int32 Index { get; set; }
public Double Sum { get; set; }
}
Then change your LINQ query to create a new Placeholder object instead of an anonymous object:
// test data
var data = new List<Int32> { 1, 20, 30, 40, 50, 70 };
var result = data.Select (d => new Placeholder
{
Index = d,
Name = String.Format("Item {0}", d),
Sum = d * 10.0
}).ToList();
result.write<Placeholder>("d:\\temp\\file.csv");
And your extension method can directly use the write(this IEnumerable<Placeholder>...) or cast every object to use the class properties:
public static class CSVWriter
{
public static void write<T>(this IEnumerable<T> e, string file)
{
using (System.IO.StreamWriter f = new System.IO.StreamWriter(file))
{
foreach (var i in e)
{
// A type parameter cannot be cast directly to Placeholder, so go through object first.
f.WriteLine(((Placeholder)(object)i).Sum);
}
}
}
}

Read a file 2 by 2 lines using Linq

I'm trying to read a simple TXT file using LINQ, but my difficulty is reading the file two lines at a time. For this I wrote a simple function, but I believe it should be possible to read the TXT file in pairs of lines with LINQ...
My code to read the text lines is:
private struct Test
{
public string Line1, Line2;
};
static List<Test> teste_func(string[] args)
{
List<Test> exemplo = new List<Test>();
var lines = File.ReadAllLines(args[0]).Where(x => x.StartsWith("1") || x.StartsWith("7")).ToArray();
for(int i=0;i<lines.Length;i++)
{
Test aux = new Test();
aux.Line1 = lines[i];
i+=1;
aux.Line2 = lines[i];
exemplo.Add(aux);
}
return exemplo;
}
Before I create this function, I tried to do this:
var lines = File.ReadAllLines(args[0]).Where(x=>x.StartsWith("1") || x.StartsWith("7")).Select(x =>
new Test
{
Line1 = x.Substring(0, 10),
Line2 = x.Substring(0, 10)
});
But obviously this takes the file line by line and creates a new struct for each line...
So, how can I get the lines two at a time with LINQ?
--- Edit
Maybe it's possible to create a new 'LINQ' function to do that?
Func<T> Get2Lines<T>(this Func<T> obj....) { ... }
Something like this?
public static IEnumerable<B> MapPairs<A, B>(this IEnumerable<A> sequence,
Func<A, A, B> mapper)
{
var enumerator = sequence.GetEnumerator();
while (enumerator.MoveNext())
{
var first = enumerator.Current;
if (enumerator.MoveNext())
{
var second = enumerator.Current;
yield return mapper(first, second);
}
else
{
//What should we do with left over?
}
}
}
Then
File.ReadAllLines(...)
.Where(...)
.MapPairs((a1,a2) => new Test() { Line1 = a1, Line2 = a2 })
.ToList();
File.ReadLines("example.txt")
.Where(x => x.StartsWith("1") || x.StartsWith("7"))
.Select((l, i) => new {Index = i, Line = l})
.GroupBy(o => o.Index / 2, o => o.Line)
.Select(g => new Test(g));
public struct Test
{
public Test(IEnumerable<string> src)
{
var tmp = src.ToArray();
Line1 = tmp.Length > 0 ? tmp[0] : null;
Line2 = tmp.Length > 1 ? tmp[1] : null;
}
public string Line1 { get; set; }
public string Line2 { get; set; }
}
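
On .NET 6 or later there is also a built-in option, Enumerable.Chunk, which splits a sequence into arrays of a given size. A minimal sketch, assuming that runtime and the same "1"/"7" filter as in the question; PairReader and ReadPairs are hypothetical names, and a tuple stands in for the Test struct.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairReader
{
    // Pairs consecutive matching lines; a trailing unpaired line gets a null Line2.
    public static List<(string Line1, string Line2)> ReadPairs(IEnumerable<string> lines)
    {
        return lines
            .Where(x => x.StartsWith("1") || x.StartsWith("7"))
            .Chunk(2) // Enumerable.Chunk is available from .NET 6
            .Select(pair => (Line1: pair[0], Line2: pair.Length > 1 ? pair[1] : null))
            .ToList();
    }
}
```

It could be called as PairReader.ReadPairs(File.ReadLines("example.txt")). As with the other answers, how to treat a leftover single line is a design choice; this sketch keeps it with a null second line.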

Determining value jumps in List<T>

I have a class:
public class ShipmentInformation
{
public string OuterNo { get; set; }
public long Start { get; set; }
public long End { get; set; }
}
I have a List<ShipmentInformation> variable called Results.
I then do:
List<ShipmentInformation> FinalResults = new List<ShipmentInformation>();
var OuterNumbers = Results.GroupBy(x => x.OuterNo);
foreach(var item in OuterNumbers)
{
var orderedData = item.OrderBy(x => x.Start);
ShipmentInformation shipment = new ShipmentInformation();
shipment.OuterNo = item.Key;
shipment.Start = orderedData.First().Start;
shipment.End = orderedData.Last().End;
FinalResults.Add(shipment);
}
The issue I have now is that within each grouped item I have various ShipmentInformation objects, but the Start numbers may not increase steadily by x, where x can be 300 or 200 based on an incoming parameter. To illustrate, I could have:
Start = 1, End = 300
Start = 301, End = 600
Start = 601, End = 900
Start = 1201, End = 1500
Start = 1501, End = 1800
Because I have this jump I cannot use the above loop to create an instance of ShipmentInformation and take the first and last item in orderedData to use their data to populate that instance.
I would like some way of identifying a jump by 300 or 200 and creating an instance of ShipmentInformation to add to FinalResults for each run where the data is sequential.
Using the above example I would have 2 instances of ShipmentInformation, one with a Start of 1 and an End of 900 and another with a Start of 1201 and an End of 1800.
Try the following:
private static IEnumerable<ShipmentInformation> Compress(IEnumerable<ShipmentInformation> shipments)
{
var orderedData = shipments.OrderBy(s => s.OuterNo).ThenBy(s => s.Start);
using (var enumerator = orderedData.GetEnumerator())
{
ShipmentInformation compressed = null;
while (enumerator.MoveNext())
{
var current = enumerator.Current;
if (compressed == null)
{
compressed = current;
continue;
}
if (compressed.OuterNo != current.OuterNo || compressed.End < current.Start - 1)
{
yield return compressed;
compressed = current;
continue;
}
compressed.End = current.End;
}
if (compressed != null)
{
yield return compressed;
}
}
}
Useable like so:
// Compress already orders by OuterNo and Start, so it can take the whole list at once.
var finalResults = Compress(Results).ToList();
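Note that the method above extends the End of the original objects in place. If the input list must stay untouched, a variant that copies into new objects could look like the sketch below. ShipmentMerger and Merge are hypothetical names, and the class from the question is repeated only to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ShipmentInformation
{
    public string OuterNo { get; set; }
    public long Start { get; set; }
    public long End { get; set; }
}

public static class ShipmentMerger
{
    // Hypothetical helper: merges contiguous ranges per OuterNo into new objects.
    public static List<ShipmentInformation> Merge(IEnumerable<ShipmentInformation> shipments)
    {
        var merged = new List<ShipmentInformation>();
        foreach (var current in shipments.OrderBy(s => s.OuterNo).ThenBy(s => s.Start))
        {
            var last = merged.Count > 0 ? merged[merged.Count - 1] : null;
            if (last != null && last.OuterNo == current.OuterNo && current.Start <= last.End + 1)
            {
                // Contiguous or overlapping: extend the range that is still open.
                last.End = Math.Max(last.End, current.End);
            }
            else
            {
                // First item, new OuterNo, or a jump: start a new range from a copy.
                merged.Add(new ShipmentInformation
                {
                    OuterNo = current.OuterNo,
                    Start = current.Start,
                    End = current.End
                });
            }
        }
        return merged;
    }
}
```

With the five ranges from the question under a single OuterNo, this produces two items: 1 to 900 and 1201 to 1800.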
If you want something that probably has terrible performance and is impossible to understand, but only uses out-of-the box LINQ, I think this might do it.
var orderedData = item.OrderBy(x => x.Start);
var merged = orderedData
    .SelectMany(x =>
        Enumerable
            // Enumerable.Range takes int arguments, so the long values are cast.
            .Range((int)x.Start, (int)(1 + x.End - x.Start))
            .Select(n => new { time = n, info = x }))
    .Select((x, i) => new { index = i, time = x.time, info = x.info })
    // time - index stays constant within a run of consecutive numbers.
    .GroupBy(t => t.time - t.index)
    .Select(g => new ShipmentInformation {
        OuterNo = g.First().info.OuterNo,
        Start = g.First().info.Start,
        End = g.Last().info.End });
My brain hurts.
(Edit for clarity: this just replaces what goes inside your foreach loop. You can make it even more horrible by putting this inside a Select statement to replace the foreach loop, like in rich's answer.)
How about this?
List<ShipmentInfo> si = new List<ShipmentInfo>();
si.Add(new ShipmentInfo(orderedData.First()));
for (int index = 1; index < orderedData.Count(); ++index)
{
if (orderedData.ElementAt(index).Start ==
(si.ElementAt(si.Count() - 1).End + 1))
{
si[si.Count() - 1].End = orderedData.ElementAt(index).End;
}
else
{
si.Add(new ShipmentInfo(orderedData.ElementAt(index)));
}
}
FinalResults.AddRange(si);
Another LINQ solution would be to use the Except extension method.
EDIT: Rewritten in C#, includes composing the missing points back into Ranges:
class Program
{
static void Main(string[] args)
{
Range[] l_ranges = new Range[] {
new Range() { Start = 10, End = 19 },
new Range() { Start = 20, End = 29 },
new Range() { Start = 40, End = 49 },
new Range() { Start = 50, End = 59 }
};
var l_flattenedRanges =
from l_range in l_ranges
from l_point in Enumerable.Range(l_range.Start, 1 + l_range.End - l_range.Start)
select l_point;
var l_min = 0;
var l_max = l_flattenedRanges.Max();
var l_allPoints =
Enumerable.Range(l_min, 1 + l_max - l_min);
var l_missingPoints =
l_allPoints.Except(l_flattenedRanges);
var l_lastRange = new Range() { Start = l_missingPoints.Min(), End = l_missingPoints.Min() };
var l_missingRanges = new List<Range>();
l_missingPoints.ToList<int>().ForEach(delegate(int i)
{
if (i > l_lastRange.End + 1)
{
l_missingRanges.Add(l_lastRange);
l_lastRange = new Range() { Start = i, End = i };
}
else
{
l_lastRange.End = i;
}
});
l_missingRanges.Add(l_lastRange);
foreach (Range l_missingRange in l_missingRanges) {
Console.WriteLine("Start = " + l_missingRange.Start + " End = " + l_missingRange.End);
}
Console.ReadKey(true);
}
}
class Range
{
public int Start { get; set; }
public int End { get; set; }
}
