Guarantee a duplicate record is not inserted? - C#

I have some code that inserts a record, and I want to first delete any existing records with matching tuples. This code is called rapidly from a number of executables:
public void AddMemberEligibility(long memberId, string internalContractKey, int planSponsorId, int vendorId, string vendorContractKey) {
    using (IDocumentSession session = Global.DocumentStore.OpenSession()) {
        var existingMember = session.Query<MemberEligibility>().FirstOrDefault(x => x.VendorId == vendorId
            && x.MemberId == memberId && x.PlanSponsorId == planSponsorId);
        if (existingMember != null) {
            session.Delete<MemberEligibility>(existingMember);
            session.SaveChanges();
        }
        Eligibility elig = new Eligibility() {
            InternalContractKey = internalContractKey,
            MemberId = memberId,
            PlanSponsorId = planSponsorId,
            VendorId = vendorId
        };
        session.Store(elig);
        session.SaveChanges();
    }
}
This doesn't seem to be enough to protect against duplicates. Any suggestions?

A hash collection would fix this problem nicely enough.
It calls hashCode() on the input in its put and contains functions to keep the collection organized, then equals() to resolve colliding hash codes. That combination makes put and contains typically O(1), though if, say, all the hash codes are the same, contains degrades toward O(log n).
Most likely a concurrent hash collection would be preferable. If you are in Java (which this answer assumes, though the question is C#), you can use a concurrent hash set, e.g. one backed by ConcurrentHashMap.
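For the C# side, the closest equivalent is ConcurrentDictionary. A minimal sketch follows, with placeholder values standing in for the question's parameters; note that an in-process set cannot deduplicate across the multiple executables the question mentions, so this only helps within one process:
using System.Collections.Concurrent;

// Placeholder values standing in for the question's parameters.
long memberId = 42; int vendorId = 7; int planSponsorId = 3;

var seen = new ConcurrentDictionary<string, byte>();
string key = $"{memberId}_{vendorId}_{planSponsorId}";

// TryAdd is atomic: it returns false if another thread already added this key.
if (seen.TryAdd(key, 0))
{
    // First writer wins: safe to insert the record.
}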

What I ended up doing, after taking Oren Eini's advice on the Raven Google group, was to use the Unique Constraints Bundle.
My DTO now looks something like this:
using Raven.Client.UniqueConstraints;

public class MemberEligibility {
    [UniqueConstraint]
    public string EligibilityKey { get { return $"{MemberId}_{VendorId}_{PlanSponsorId}_{VendorContractKey}"; } }
    public long MemberId { get; set; }
    public int VendorId { get; set; }
    public int PlanSponsorId { get; set; }
    public string VendorContractKey { get; set; }
    // other fields
}
and my add/update looks like this:
public void AddMemberEligibility(long memberId, int planSponsorId, int vendorId, string vendorContractKey, ...) {
    using (IDocumentSession session = Global.DocumentStore.OpenSession()) {
        MemberEligibility elig = new MemberEligibility() {
            MemberId = memberId,
            PlanSponsorId = planSponsorId,
            VendorId = vendorId,
            VendorContractKey = vendorContractKey,
            //other stuff
        };
        var existing = session.LoadByUniqueConstraint<MemberEligibility>(x => x.EligibilityKey, elig.EligibilityKey);
        if (existing != null) {
            // set some fields
        } else {
            session.Store(elig);
        }
        session.SaveChanges();
    }
}
At this point I'm not 100% certain this is the solution I'll push to production, but it works. Keep in mind session.SaveChanges() will throw an exception if there's already a document with the same [UniqueConstraint] property in the store. Also I started with this property typed as a Tuple<...>, but Raven's serializer couldn't figure out how to work with it, so I settled on a string for now.
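Since SaveChanges throws when another writer got there first, the race can be absorbed with a catch-and-retry; a minimal sketch (the concrete exception type depends on the client and bundle version, so this catches broadly for illustration):
try
{
    session.SaveChanges();
}
catch (Exception)
{
    // Assumed handling: another writer stored a document with the same
    // EligibilityKey first. Reload it via LoadByUniqueConstraint and take
    // the "set some fields" update path instead of storing a new document.
}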


rearrange a list of objects by type field in C#

I have an incoming list of alerts and I use a MapFunction as:
private static BPAlerts MapToAlerts(List<IntakeAlert> intakeAlerts)
{
    // Make sure that there are alerts
    if (intakeAlerts.IsNullOrEmpty()) return new BPAlerts { AllAlerts = new List<BPAlert>(), OverviewAlerts = new List<BPAlert>() };
    // All alerts
    var alerts = new BPAlerts
    {
        AllAlerts = intakeAlerts.Select(
            alert => new BPAlert
            {
                AlertTypeId = alert.AlertTypeId ?? 8100,
                IsOverview = alert.IsOverviewAlert.GetValueOrDefault(),
                Text = alert.AlertText,
                Title = alert.AlertTitle,
                Type = alert.AlertTypeId == 8106 ? "warning" : "report",
                Severity = alert.AlertSeverity.GetValueOrDefault(),
                Position = alert.Position.GetValueOrDefault()
            }).OrderBy(a => a.Position).ToList()
    };
    // Alerts displayed on the overview page
    alerts.OverviewAlerts =
        alerts.AllAlerts
            .Where(a => a.IsOverview && !string.IsNullOrEmpty(a.Title))
            .Take(3)
            .ToList();
    return alerts;
}
The BPAlerts type contains two lists:
public class BPAlerts
{
    public List<BPAlert> AllAlerts { get; set; }
    public List<BPAlert> OverviewAlerts { get; set; }
}
And the BPAlert type is defined as:
public class BPAlert
{
    public short AlertTypeId { get; set; }
    public string Type { get; set; }
    public int Severity { get; set; }
    public bool IsOverview { get; set; }
    public string Title { get; set; }
    public string Text { get; set; }
    public int Position { get; set; }
    public string Id { get; internal set; } = Guid.NewGuid().ToString();
}
I want the MapToAlerts function to return an alerts object whose OverviewAlerts are sorted by BPAlert type, in the following order where present:
Confirmed Out of Business - 8106
Bankruptcy - 8105
Lack of Licensing - 8111
Investigations - 8109
Government Actions - 8103
Pattern of Complaints - 8104
Customer Reviews - 8112
Accreditation - 8110
Misuse of BBB Name - 8101
Advisory - 8107
Advertising Review - 8102
Solution #1 Order values array
I would just define the order of those ids in some kind of collection; it can be an array:
var orderArray = new int[]
{
    8106, // Confirmed Out of Business
    8105, // Bankruptcy
    8111, // Lack of Licensing
    8109, // Investigations
    8103, // Government Actions
    8104, // Pattern of Complaints
    8112, // Customer Reviews
    8110, // Accreditation
    8101, // Misuse of BBB Name
    8107, // Advisory
    8102, // Advertising Review
};
Then iterate through the array, incrementing an order value, and check whether the current element matches the type id whose order I'm evaluating:
for (int orderValue = 0; orderValue < orderArray.Length; orderValue++)
{
    if (alertTypeId == orderArray[orderValue])
    {
        return orderValue;
    }
}
If it is not found in the array, return the highest value possible:
return int.MaxValue;
The whole method would look like this, evaluating the order for an alert type id:
public int GetAlertTypeIdOrder(short alertTypeId)
{
    var orderArray = new int[]
    {
        8106, // Confirmed Out of Business
        8105, // Bankruptcy
        8111, // Lack of Licensing
        8109, // Investigations
        8103, // Government Actions
        8104, // Pattern of Complaints
        8112, // Customer Reviews
        8110, // Accreditation
        8101, // Misuse of BBB Name
        8107, // Advisory
        8102, // Advertising Review
    };
    for (int orderValue = 0; orderValue < orderArray.Length; orderValue++)
    {
        if (alertTypeId == orderArray[orderValue])
        {
            return orderValue;
        }
    }
    return int.MaxValue;
}
Usage:
var sortedAlerts = alerts
    .AllAlerts
    .OrderBy(a => GetAlertTypeIdOrder(a.AlertTypeId))
    .ToList();
Use OrderByDescending instead if you want the reverse order.
Solution #2 Order values dictionary
You could achieve better performance by removing the redundancy of repeatedly creating the array that stores the order values. A better idea is to store the order rules in a dictionary. The code below creates an array too, but it is called once to build the dictionary, which is then passed around.
public Dictionary<int, int> GetOrderRules()
{
    var alertTypeIds = new int[]
    {
        8106, // Confirmed Out of Business
        8105, // Bankruptcy
        8111, // Lack of Licensing
        8109, // Investigations
        8103, // Government Actions
        8104, // Pattern of Complaints
        8112, // Customer Reviews
        8110, // Accreditation
        8101, // Misuse of BBB Name
        8107, // Advisory
        8102, // Advertising Review
    };
    var orderRules = new Dictionary<int, int>();
    for (int orderValue = 0; orderValue < alertTypeIds.Length; orderValue++)
    {
        orderRules.Add(alertTypeIds[orderValue], orderValue);
    }
    return orderRules;
}
The GetAlertIdOrder() method then looks different, but keeps the idea from the previous solution:
public int GetAlertIdOrder(short alertTypeId, IDictionary<int, int> orderRules)
{
    if (orderRules.TryGetValue(alertTypeId, out int orderValue))
    {
        return orderValue;
    }
    else
    {
        return int.MaxValue;
    }
}
Usage:
var orderRules = GetOrderRules();
var sortedAlerts = alerts
    .AllAlerts
    .OrderBy(a => GetAlertIdOrder(a.AlertTypeId, orderRules))
    .ToList();
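Wiring this back into the question's MapToAlerts, the overview selection might then look like the following sketch, reusing orderRules from above together with the question's own filter rules:
var orderRules = GetOrderRules();
alerts.OverviewAlerts = alerts.AllAlerts
    .Where(a => a.IsOverview && !string.IsNullOrEmpty(a.Title))
    .OrderBy(a => GetAlertIdOrder(a.AlertTypeId, orderRules))
    .ThenBy(a => a.Position) // stable tie-break within the same alert type
    .Take(3)
    .ToList();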
(a) I wouldn't mix sorting with the mapper; let the mapper just do its thing (this is separation of concerns), aka no ordering/sorting. IMHO you'll always end up with way too much voodoo in the mapper that is hard to understand. You're already on this path with the above code.
(b) If "OverviewAlerts" is a subset of AllAlerts (aka AllAlerts is the superset), then hydrate AllAlerts and create a read-only "get" property where you filter AllAlerts down to your subset by its rules. Optionally, consider an AllAlertsSorted get property. This way you allow your consumers to choose whether they want raw or sorted, since there is a cost to sorting.
public class BPAlerts
{
    public List<BPAlert> AllAlerts { get; set; }

    public List<BPAlert> OverviewAlerts
    {
        get
        {
            return null == this.AllAlerts
                ? null
                : this.AllAlerts
                    .Where(a => a.IsOverview && !string.IsNullOrEmpty(a.Title)) // filter rules from the question; adjust as needed
                    .Take(3)
                    .ToList();
        }
    }

    public List<BPAlert> AllAlertsSorted
    {
        get
        {
            return null == this.AllAlerts
                ? null
                : this.AllAlerts.OrderBy(a => a.Position).ToList();
        }
    }
}
If you do the read-only properties, then you have simpler LINQ operations like
OrderBy(x => x.PropertyAbc).ThenByDescending(x => x.PropertyDef);
99% of my mapping code looks like this. I don't even throw an error on null input; I just return null.
public static class MyObjectMapper {
    public static ICollection<MyOtherObject> ConvertToMyOtherObject(ICollection<MyObject> inputItems) {
        ICollection<MyOtherObject> returnItems = null;
        if (null != inputItems) {
            returnItems = new List<MyOtherObject>();
            foreach (MyObject inputItem in inputItems) {
                MyOtherObject moo = new MyOtherObject();
                /* map here */
                returnItems.Add(moo);
            }
        }
        return returnItems;
    }
}

Replace in MongoDB With C#

I am trying to replace a collection of objects of type Game in my collection "Games".
I want to replace these objects with entirely new objects. I have researched MongoDB a bit and I see that 'UpdateMany' will replace fields with new values, but that's not exactly what I want: I wish to replace the entire object.
For reference, this is my Game class:
public class Game
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public string Developer { get; set; }
    public int ProjectId { get; set; }

    public Game()
    {
        this.Id = Guid.NewGuid();
    }
}
This is the method I am using to attempt a bulk replace. I pass in a ProjectId, and for every Game object whose ProjectId is equal to the argument, I want to replace the object with a new Game object:
public static void ReplaceGame(int ProjectId, IMongoDatabase Database)
{
    IMongoCollection<Game> gameCollection = Database.GetCollection<Game>("Game");
    List<Game> gameCollectionBeforeReplacement = gameCollection.Find(g => true).ToList();
    if (gameCollectionBeforeReplacement.Count == 0)
    {
        Console.WriteLine("No Games in Collection...");
        return;
    }
    var filter = Builders<Game>.Filter.Eq(g => g.ProjectId, ProjectId);
    foreach (Game game in gameCollection.AsQueryable())
        gameCollection.ReplaceOneAsync(filter, new Game() { Title = "REPLACEMENT TITLE" });
}
Not only does this take an excessive amount of time (I suspect because of the .AsQueryable() call), it also doesn't work. I am wondering how I can actually replace all instances picked up by my filter with new Game objects.
Consider the following code:
public virtual ReplaceOneResult ReplaceOne(TDocument replacement, int projId)
{
    var filter = Builders<TDocument>.Filter.Eq(x => x.ProjectId, projId);
    var result = Collection.ReplaceOne(filter, replacement, new UpdateOptions() { IsUpsert = false }, _cancellationToken);
    return result;
}
You will find that ReplaceOneResult has a property that tells you the matched count. This makes it possible for you to keep executing the ReplaceOne call until the matched count equals 0. When this happens, you know all documents in your collection that had the corresponding project id have been replaced.
Example:
var result = ReplaceOne(new Game() { Title = "REPLACEMENT TITLE" }, 12);
while (result.MatchedCount > 0)
    result = ReplaceOne(new Game() { Title = "REPLACEMENT TITLE" }, 12);
This makes it so that you don't need the call to the database before you start replacing.
However, if you wish to insert the same values for every existing game, I would suggest an UpdateMany operation instead; there you can use $set to specify all required values. The code above is simply not performant, since it goes to the database for every single replace call.
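For completeness, a sketch of that UpdateMany alternative (gameCollection and projectId as in the question; the fields being set are illustrative):
var filter = Builders<Game>.Filter.Eq(g => g.ProjectId, projectId);
var update = Builders<Game>.Update
    .Set(g => g.Title, "REPLACEMENT TITLE")
    .Set(g => g.Developer, string.Empty);

// One round trip updates every matching document.
UpdateResult result = gameCollection.UpdateMany(filter, update);
Console.WriteLine($"Matched {result.MatchedCount}, modified {result.ModifiedCount}");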

MongoDB Concurrent writing/fetching from multiple processes causes bulk write operation error

I'm currently implementing a MongoDB database for caching.
I've made a very generic client, with the save method working like this:
public virtual void SaveAndOverwriteExistingCollection<T>(string collectionKey, T[] data)
{
    if (data == null || !data.Any())
        return;

    var collection = Connector.MongoDatabase.GetCollection<T>(collectionKey.ToString());
    var filter = new FilterDefinitionBuilder<T>().Empty;
    var operations = new List<WriteModel<T>>
    {
        new DeleteManyModel<T>(filter),
    };
    operations.AddRange(data.Select(t => new InsertOneModel<T>(t)));

    try
    {
        collection.BulkWrite(operations, new BulkWriteOptions { IsOrdered = true });
    }
    catch (MongoBulkWriteException mongoBulkWriteException)
    {
        throw mongoBulkWriteException;
    }
}
Our other clients call this method in a similar way:
public Person[] Get(bool bypassCache = false)
{
    Person[] people = null;
    if (!bypassCache)
        people = base.Get<Person>(DefaultCollectionKeys.People.CreateCollectionKey());
    if (people.SafeAny())
        return people;

    people = Client<IPeopleService>.Invoke(s => s.Get());
    base.SaveAndOverwriteExistingCollection(DefaultCollectionKeys.People.CreateCollectionKey(), people);
    return people;
}
After we've persisted data to the backend we reload the cache from MongoDB by calling our Get methods, passing the argument true. So we reload all of the data.
This works fine for most use cases, but since we are using a web-garden solution (multiple processes) for the same application, it leads to concurrency issues. If I save and reload the cache while another user is reloading the page, it sometimes throws an E11000 duplicate key error:
Command createIndexes failed: E11000 duplicate key error collection:
cache.Person index: Id_1_Name_1_Email_1 dup
key: { : 1, : "John Doe", : "foo@bar.com" }.
Since this is a web garden with multiple IIS processes running, locking won't do much good. And since bulk writes should be thread-safe, I'm a bit puzzled. I've looked into upserting the data, but changing our clients to be type-specific and updating each field would take too long and feels like unnecessary work, so I'm looking for a very generic solution.
UPDATE
Removed the Insert and Delete. Changed it to a collection of ReplaceOneModel. Currently experiencing issues with only the last element in a collection being persisted.
public virtual void SaveAndOverwriteExistingCollection<T>(string collectionKey, T[] data)
{
    if (data == null || !data.Any())
        return;

    var collection = Connector.MongoDatabase.GetCollection<T>(collectionKey.ToString());
    var filter = new FilterDefinitionBuilder<T>().Empty;
    var operations = new List<WriteModel<T>>();
    operations.AddRange(data.Select(t => new ReplaceOneModel<T>(filter, t) { IsUpsert = true }));

    try
    {
        collection.BulkWrite(operations, new BulkWriteOptions { IsOrdered = true });
    }
    catch (MongoBulkWriteException mongoBulkWriteException)
    {
        throw mongoBulkWriteException;
    }
}
I just passed in a collection of 811 items, and only the last one can be found in the collection after this method has executed.
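The empty filter looks like the culprit here: every ReplaceOneModel matches the first document in the collection, so each replacement overwrites the previous one, which is consistent with only one of 811 items surviving. A sketch of building a per-document filter instead (the key selector is hypothetical; in practice it would target _id or the fields of the unique index):
// Hypothetical: filter each replace down to its own document instead of
// letting every operation match the whole collection.
var operations = data
    .Select(t => (WriteModel<T>)new ReplaceOneModel<T>(
        Builders<T>.Filter.Eq("_id", GetDocumentId(t)), // GetDocumentId is an assumed helper
        t)
    { IsUpsert = true })
    .ToList();
collection.BulkWrite(operations, new BulkWriteOptions { IsOrdered = true });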
Example of a DTO being persisted:
public class TranslationSetting
{
    [BsonId(IdGenerator = typeof(GuidGenerator))]
    public object ObjectId { get; set; }

    public string LanguageCode { get; set; }
    public string SettingKey { get; set; }
    public string Text { get; set; }
}
With this index:
string TranslationSettings()
{
    var indexBuilder = new IndexKeysDefinitionBuilder<TranslationSetting>()
        .Ascending(_ => _.SettingKey)
        .Ascending(_ => _.LanguageCode);

    return MongoDBClient.CreateIndex(DefaultCollectionKeys.TranslationSettings, indexBuilder);
}

How to convert from 'System.Linq.IQueryable<System.Collections.Generic.List<Model.Record>>' to 'List<Record>'

How does one retrieve the list in a model?
This is what I'm trying:
private void cbxPlayers_SelectedValueChanged(object sender, EventArgs e)
{
    List<Record> records = new List<Record>();
    string selectedPlayer = cbxPlayers.SelectedItem.ToString();
    using (ProgressRecordContext context = new ProgressRecordContext())
    {
        records = (from Player in context.Players
                   where Player.Name == selectedPlayer
                   select Player.Records).ToList<Record>();
    }
}
That doesn't work, however; what am I missing?
These are the models in case they're needed:
public class Player
{
    [Key][DatabaseGenerated(DatabaseGeneratedOption.None)]
    public int AccountNumberId { get; set; }
    public string Name { get; set; }
    public virtual List<Record> Records { get; set; }
}

public class Record
{
    public int RecordId { get; set; }
    public int AccountNumberId { get; set; }
    public double Level { get; set; }
    public int Economy { get; set; }
    public int Fleet { get; set; }
    public int Technology { get; set; }
    public int Experience { get; set; }
    public DateTime TimeStamp { get; set; }
    public virtual Player Player { get; set; }
}
EDIT: Here are the error messages:
Error 1: 'System.Linq.IQueryable<System.Collections.Generic.List<Model.Record>>' does not contain a definition for 'ToList' and the best extension method overload 'System.Linq.ParallelEnumerable.ToList<TSource>(System.Linq.ParallelQuery<TSource>)' has some invalid arguments
Error 2: Instance argument: cannot convert from 'System.Linq.IQueryable<System.Collections.Generic.List<Model.Record>>' to 'System.Linq.ParallelQuery<System.Collections.Generic.List<Model.Record>>'
EDIT:
I see that I probably wasn't very clear with what I was trying to do. I eventually worked out a way to do what I wanted and here it is:
private void cbxPlayers_SelectedValueChanged(object sender, EventArgs e)
{
    lstvRecords.Items.Clear();
    if (cbxPlayers.SelectedIndex == -1)
    {
        return;
    }
    string selectedPlayer = cbxPlayers.SelectedItem.ToString();
    using (ProgressRecordContext context = new ProgressRecordContext())
    {
        var records = from Player in context.Players
                      from Record in context.Records
                      where Player.Name == selectedPlayer &&
                            Player.AccountNumberId == Record.AccountNumberId
                      select new
                      {
                          Level = Record.Level,
                          Economy = Record.Economy,
                          Fleet = Record.Fleet,
                          Technology = Record.Technology,
                          Experience = Record.Experience,
                          TimeStamp = Record.TimeStamp
                      };
        foreach (var element in records)
        {
            string[] elements = { element.Level.ToString(),
                                  element.Economy.ToString(),
                                  element.Fleet.ToString(),
                                  element.Technology.ToString(),
                                  element.Experience.ToString(),
                                  element.TimeStamp.ToString() };
            ListViewItem lvi = new ListViewItem(elements);
            lstvRecords.Items.Add(lvi);
        }
    }
}
Is there a better way to write that query or is the way that I've done it correct?
No idea why you're getting ParallelQuery - unless you've got some wacky usings in your source file.
In any case, you appear to have an enumerable of enumerables - try SelectMany (note you need using System.Linq; for this to work as an extension method, too):
records = (from Player in context.Players
           where Player.Name == selectedPlayer
           select Player.Records).SelectMany(r => r).ToList();
Also - unless you intend to add/remove to/from that list, you should just use an array, i.e. use .ToArray().
As pointed out by @Tim S (+1) - if you expect only a single player here then you should be using SingleOrDefault() to get the single player - whose Records you then turn into an array/list.
Your problem is that Player.Records is a List<Record>, and your query gets you an IEnumerable<List<Record>> (i.e. 0 to many players' records), so .ToList() gives you a List<List<Record>>. If there are multiple players with the same name and you want to collect the records from all of them, use Andras Zoltan's solution. If you want to ensure (by throwing an exception if there are 0 or more than 1 results) that exactly one player has the given name, and only his records are returned, use one of these solutions (the key change being .Single() - also take a look at SingleOrDefault to see if it fits your needs better):
//I prefer this solution for its conciseness and clarity.
records = context.Players.Single(Player => Player.Name == selectedPlayer).Records;
//if you'd like to use the LINQ query format, I'd recommend this.
records = (from Player in context.Players
           where Player.Name == selectedPlayer
           select Player).Single().Records;
//this is more similar to your original query.
records = (from Player in context.Players
           where Player.Name == selectedPlayer
           select Player.Records).Single().ToList();
If you change
List<Record> records = new List<Record>();
to
var records = new List<List<Record>>();
Does it work? If a Player has a list of Records, it looks like your query is returning a List of a List of Records.
Edit:
There, fixed the return list... either way this is probably not the solution you're looking for, just highlighting what the problem is.
You could try refactoring your query
records = context.Players.First(player => player.Name == selectedPlayer).Records.ToList();

C# Best way to store strings from an input file for manipulation and use?

I've got a file of blocks of strings, each of which ends with a certain keyword. I currently have a stream reader set up which adds each line of the file to a list until the line contains the keyword indicating the end of the current block:
listName.Add(lineFromFile);
Each block contains information, e.g. Book bookName, Author authorName, Journal journalName, etc., so each block is essentially a single item (book, journal, conference, etc.).
Now, with around 50 or so blocks of information (items), I need some way to store the information so I can manipulate it: store the author(s), title, pages, etc., and know which information goes with which item.
While typing this I've come up with the idea of storing each item as an object of a class called 'Item'; however, with potentially several authors, I'm not sure how to achieve this, as I was thinking of using a counter to name a variable, e.g.
int i = 0;
String Author[i] = "blahblah";
i++;
But as far as I know that's not allowed? So my question is basically: what would be the simplest/easiest way to store each item so that I can manipulate the strings and use each item later?
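For what it's worth, C# won't let you create variable names with a counter at runtime, but a List<string> gives the growing, indexed collection the snippet above reaches for; a minimal sketch:
using System.Collections.Generic;

var authors = new List<string>();
authors.Add("blahblah");        // index 0
authors.Add("another author");  // index 1; the list grows as needed
string first = authors[0];      // indexed access, with authors.Count for the size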
@yamen here's an example of the file:
Author Bond, james
Author Smith John A
Year 1994
Title For beginners
Book Accounting
Editor Smith Joe
Editor Doe John
Publisher The University of Chicago Press
City Florida, USA
Pages 15-23
End
Author Faux, M
Author Sedge, M
Author McDreamy, L
Author Simbha, D
Year 2000
Title Medical advances in the modern world
Journal Canadian Journal of medicine
Volume 25
Pages 1-26
Issue 2
End
Author McFadden, B
Author Goodrem, G
Title Shape shifting dinosaurs
Conference Ted Vancouver
City Vancouver, Canada
Year 2012
Pages 2-6
End
Update in lieu of your sample
How to parse the string is beyond the scope of this answer - you might want to have a go at that yourself, and then ask another SO question (I suggest reading the golden rules of SO: https://meta.stackexchange.com/questions/128548/what-stack-overflow-is-not).
So I'll present the solution assuming that you have a single string representing the full block of book/journal information (this data looks like citations). The main change from my original answer is that you have multiple authors. Also you might want to consider whether you want to transform the authors' names back to [first name/initial] [middle names] [surname].
I present two solutions - one using Dictionary and one using Linq. The Linq solution is a one-liner.
Define an Info class to store the item:
public class Info
{
    public string Title { get; private set; }
    public string BookOrJournal { get; private set; }
    public IEnumerable<string> Authors { get; private set; }
    // more members for pages, year etc.

    public Info(string stringFromFile)
    {
        Title = /* read title from stringFromFile */;
        BookOrJournal = /* read book or journal name from stringFromFile */;
        Authors = /* read authors from stringFromFile */;
    }
}
Note that the stringFromFile should be one block, including newlines, of citation information.
Now a dictionary to store each info by author:
Dictionary<string, List<Info>> infoByAuthor =
    new Dictionary<string, List<Info>>(StringComparer.OrdinalIgnoreCase);
Note the OrdinalIgnoreCase comparer - to handle situations where an author's name is printed in a different case.
Given a List<string> that you're adding to as per your listName.Add, this simple loop will do the trick:
List<Info> tempList;
Info tempInfo;
foreach (var line in listName)
{
    if (string.IsNullOrWhiteSpace(line))
        continue;
    tempInfo = new Info(line);
    foreach (var author in tempInfo.Authors)
    {
        if (!infoByAuthor.TryGetValue(author, out tempList))
            infoByAuthor[author] = tempList = new List<Info>();
        tempList.Add(tempInfo);
    }
}
Now you can iterate through the dictionary, and each KeyValuePair<string, List<Info>> will have a Key equal to the author name and the Value will be the list of Info objects that have that author. Note that the casing of the AuthorName will be preserved from the file even though you're grouping case-insensitively such that two items with "jon skeet" and "Jon Skeet" will be grouped into the same list, but their original cases will be preserved on the Info.
Also, the code is written to ensure that only one Info instance is created per citation; this is preferable for many reasons (memory, centralised updates, etc.).
Alternatively, with Linq, you can simply do this:
var grouped = listName.Where(s => !string.IsNullOrWhiteSpace(s))
    .Select(s => new Info(s))
    .SelectMany(i => i.Authors.Select(a => new KeyValuePair<string, Info>(a, i)))
    .GroupBy(kvp => kvp.Key, kvp => kvp.Value, StringComparer.OrdinalIgnoreCase);
Now you have enumerable of groups, where the Key is the Author Name and the inner enumerable is all the Info objects with that author name. The same case-preserving behaviour regarding 'the two Skeets' will be observed here, too.
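Consuming the grouped result is then a pair of loops; a sketch:
foreach (var group in grouped)
{
    Console.WriteLine($"Author: {group.Key}");
    foreach (Info info in group)
        Console.WriteLine($"  {info.Title} ({info.BookOrJournal})");
}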
Here is the complete code for this problem.
It is written with a simple, straightforward approach. It can be optimized: there's no error checking, and the AddData method could be written more generically using reflection. But it does the job in an elegant way.
using System;
using System.Collections.Generic;
using System.IO;

namespace MultiItemDict
{
    class MultiDict<TKey, TValue> // no (collection) base class
    {
        private Dictionary<TKey, List<TValue>> _data = new Dictionary<TKey, List<TValue>>();

        public void Add(TKey k, TValue v)
        {
            // can be optimized a little with TryGetValue; this is for clarity
            if (_data.ContainsKey(k))
                _data[k].Add(v);
            else
                _data.Add(k, new List<TValue>() { v });
        }

        public List<TValue> GetValues(TKey key)
        {
            if (_data.ContainsKey(key))
                return _data[key];
            else
                return new List<TValue>();
        }
    }

    class BookItem
    {
        public BookItem()
        {
            Authors = new List<string>();
            Editors = new List<string>();
        }

        public int? Year { get; set; }
        public string Title { get; set; }
        public string Book { get; set; }
        public List<string> Authors { get; private set; }
        public List<string> Editors { get; private set; }
        public string Publisher { get; set; }
        public string City { get; set; }
        public int? StartPage { get; set; }
        public int? EndPage { get; set; }
        public int? Issue { get; set; }
        public string Conference { get; set; }
        public string Journal { get; set; }
        public int? Volume { get; set; }

        internal void AddPropertyByText(string line)
        {
            string keyword = GetKeyWord(line);
            string data = GetData(line);
            AddData(keyword, data);
        }

        private void AddData(string keyword, string data)
        {
            if (keyword == null)
                return;
            // Map the keywords to the properties (can be done in a more generic way by reflection)
            switch (keyword)
            {
                case "Year":
                    this.Year = int.Parse(data);
                    break;
                case "Title":
                    this.Title = data;
                    break;
                case "Book":
                    this.Book = data;
                    break;
                case "Author":
                    this.Authors.Add(data);
                    break;
                case "Editor":
                    this.Editors.Add(data);
                    break;
                case "Publisher":
                    this.Publisher = data;
                    break;
                case "City":
                    this.City = data;
                    break;
                case "Journal":
                    this.Journal = data;
                    break;
                case "Volume":
                    this.Volume = int.Parse(data);
                    break;
                case "Pages":
                    this.StartPage = GetStartPage(data);
                    this.EndPage = GetEndPage(data);
                    break;
                case "Issue":
                    this.Issue = int.Parse(data);
                    break;
                case "Conference":
                    this.Conference = data;
                    break;
            }
        }

        private int GetStartPage(string data)
        {
            string[] pages = data.Split('-');
            return int.Parse(pages[0]);
        }

        private int GetEndPage(string data)
        {
            string[] pages = data.Split('-');
            return int.Parse(pages[1]);
        }

        private string GetKeyWord(string line)
        {
            string[] words = line.Split(' ');
            if (words.Length == 0)
                return null;
            else
                return words[0];
        }

        private string GetData(string line)
        {
            string[] words = line.Split(' ');
            if (words.Length < 2)
                return null;
            else
                return line.Substring(words[0].Length + 1);
        }
    }

    class Program
    {
        public static BookItem ReadBookItem(StreamReader streamReader)
        {
            string line = streamReader.ReadLine();
            if (line == null)
                return null;
            BookItem book = new BookItem();
            while (line != "End")
            {
                book.AddPropertyByText(line);
                line = streamReader.ReadLine();
            }
            return book;
        }

        public static List<BookItem> ReadBooks(string fileName)
        {
            List<BookItem> books = new List<BookItem>();
            using (StreamReader streamReader = new StreamReader(fileName))
            {
                BookItem book;
                while ((book = ReadBookItem(streamReader)) != null)
                {
                    books.Add(book);
                }
            }
            return books;
        }

        static void Main(string[] args)
        {
            string fileName = "../../Data.txt";
            List<BookItem> bookList = ReadBooks(fileName);

            MultiDict<string, BookItem> booksByAuthor = new MultiDict<string, BookItem>();
            bookList.ForEach(bk =>
                bk.Authors.ForEach(a => booksByAuthor.Add(a, bk))
            );

            string author = "Bond, james";
            Console.WriteLine("Books by: " + author);
            foreach (BookItem book in booksByAuthor.GetValues(author))
            {
                Console.WriteLine("  Title : " + book.Title);
            }
            Console.WriteLine("");
            Console.WriteLine("Click to continue");
            Console.ReadKey();
        }
    }
}
I also want to mention that all the parsing can be avoided if you represent the data in XML.
The data then looks like:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfBookItem>
  <BookItem>
    <Year>1994</Year>
    <Title>For beginners</Title>
    <Book>Accounting</Book>
    <Authors>
      <string>Bond, james</string>
      <string>Smith John A</string>
    </Authors>
    <Editors>
      <string>Smith Joe</string>
      <string>Doe John</string>
    </Editors>
    <Publisher>The University of Chicago Press</Publisher>
    <City>Florida, USA</City>
    <StartPage>15</StartPage>
    <EndPage>23</EndPage>
  </BookItem>
  <BookItem>
    <Year>2000</Year>
    <Title>Medical advances in the modern world</Title>
    <Authors>
      <string>Faux, M</string>
      <string>Sedge, M</string>
      <string>McDreamy, L</string>
      <string>Simbha, D</string>
    </Authors>
    <StartPage>1</StartPage>
    <EndPage>26</EndPage>
    <Issue>2</Issue>
    <Journal>Canadian Journal of medicine</Journal>
    <Volume>25</Volume>
  </BookItem>
  <BookItem>
    <Year>2012</Year>
    <Title>Shape shifting dinosaurs</Title>
    <Authors>
      <string>McFadden, B</string>
      <string>Goodrem, G</string>
    </Authors>
    <City>Vancouver, Canada</City>
    <StartPage>2</StartPage>
    <EndPage>6</EndPage>
    <Conference>Ted Vancouver</Conference>
  </BookItem>
</ArrayOfBookItem>
And the code for reading it:
var serializer = new XmlSerializer(typeof(List<BookItem>));
using (FileStream stream =
    new FileStream(@"../../Data.xml", FileMode.Open,
        FileAccess.Read, FileShare.Read))
{
    List<BookItem> books1 = (List<BookItem>)serializer.Deserialize(stream);
}
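Producing that XML file in the first place is the mirror image; a sketch assuming the List<BookItem> read by the program above (XmlSerializer lives in System.Xml.Serialization):
using System.IO;
using System.Xml.Serialization;

var serializer = new XmlSerializer(typeof(List<BookItem>));
using (var stream = new FileStream(@"../../Data.xml", FileMode.Create, FileAccess.Write))
{
    serializer.Serialize(stream, bookList); // bookList is the List<BookItem> from Main
}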
You should create a class Book
public class Book
{
    public string Name { get; set; }
    public string Author { get; set; }
    public string Journal { get; set; }
}
and maintain a List<Book>
var books = new List<Book>();
books.Add(new Book { Name = "BookName", Author = "Some Author", Journal = "Journal" });
I would use a multi-value dictionary for this:
public struct BookInfo
{
    public string Title;
    public string Journal;
}
Then create a dictionary object keyed by author name, where each author maps to the list of their works:
var dict = new Dictionary<string, List<BookInfo>>();
This way, if you do run into multiple authors, the data will be grouped by author, which makes writing future code to work with this data easy. Printing out a list of all books under some author is dead easy and does not require a cumbersome search.
You can use a class with simple attributes like these:
class Book {
    string Title;
    int PageCount;
}
You can either initialize an array (Book[] books = new Book[myFile.LineCount];) or maintain a List<Book>; the array makes it easy to map line numbers to items (books[34] means the 34th book, from the 34th line).
But basically a System.Data.DataTable may be better suited, because you have rows that contain multiple columns. With DataTable, you can access individual rows and access their columns by name.
Example:
DataTable dt = new DataTable();
dt.Columns.Add("bookName");
DataRow dr = dt.NewRow();
dr["bookName"] = "The Lost Island";
dt.Rows.Add(dr);
// You can access the last row this way:
var lastName = dt.Rows[dt.Rows.Count - 1]["bookName"];
One more good thing about a DataTable is that you can use grouping and summing on its rows like on an ordinary SQL table.
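That grouping/summing point can be seen with DataTable.Compute; a small sketch with an assumed numeric column added to the table above:
dt.Columns.Add("pages", typeof(int));
DataRow row = dt.NewRow();
row["bookName"] = "The Lost Island";
row["pages"] = 212; // illustrative value
dt.Rows.Add(row);

// SUM over the rows matching a filter expression.
object total = dt.Compute("SUM(pages)", "bookName = 'The Lost Island'");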
Edit: Initially my answer used structs but as @AndrasZoltan pointed out, it may be better to use classes when you're not sure what the application will evolve into.
You are well on your way to inventing the relational database. Conveniently, these are already available. In addition to solving the problem of storing relationships between entities, they also handle concurrency issues and are supported by modelling techniques founded in provable mathematics.
Parsers are a subject unto themselves. Since SQL is out of the question, this being a contrived university assignment, I do have some observations.
The easy way is with a regex. However this is extremely inefficient and a poor solution for large input files.
In the absence of regexes, String.IndexOf() and String.Split() are your friends.
If your assessor can't cope with SQL then LINQ is going to be quite a shock, but I really really like Zoltan's LINQ solution, it's just plain elegant.
It's not quite clear what you need without a better example of the file or how you want to use the data, but it sounds like you need to parse the string and put it into an entity. The following is an example using the fields you mentioned above.
public IList<Entry> ParseEntryFile(string fileName)
{
    ...
    var entries = new List<Entry>();
    foreach (var line in file)
    {
        var entry = new Entry();
        ...
        entries.Add(entry);
    }
    return entries;
}

public class Entry
{
    public Book BookEntry { get; set; }
    public Author AuthorEntry { get; set; }
    public Journal JournalEntry { get; set; }
}

public class Book
{
    public string Name { get; set; }
    ...
}

public class Author
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}
...
You can create a class for each item:
class BookItem
{
    public string Name { get; set; }
    public string Author { get; set; }
}
Read the data from each line into an instance of this class and store them in a temporary list:
var books = new List<BookItem>();
while (NotEndOfFile())
{
    BookItem book = ReadBookItem(...);
    books.Add(book);
}
After you have this list you can create Multi Value Dictionaries and have quick access to any item by any key. For example to find a book by its author:
var booksByAuthor = new MultiDict<string, BookItem>();
add the items to the Dictionary:
books.ForEach(bk => booksByAuthor.Add(bk.Author, bk));
and then you can iterate on it:
string authorName = "author1";
Console.WriteLine("Books by: " + authorName);
foreach (BookItem bk1 in booksByAuthor.GetValues(authorName))
{
    Console.WriteLine("Book: " + bk1.Name);
}
I got the basic Multi Item Dictionary from here:
Multi Value Dictionary?
This is my implementation:
class MultiDict<TKey, TValue> // no (collection) base class
{
    private Dictionary<TKey, List<TValue>> _data = new Dictionary<TKey, List<TValue>>();

    public void Add(TKey k, TValue v)
    {
        // can be optimized a little with TryGetValue; this is for clarity
        if (_data.ContainsKey(k))
            _data[k].Add(v);
        else
            _data.Add(k, new List<TValue>() { v });
    }

    // more members

    public List<TValue> GetValues(TKey key)
    {
        if (_data.ContainsKey(key))
            return _data[key];
        else
            return new List<TValue>();
    }
}
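The TryGetValue optimization mentioned in the Add comment would collapse the double lookup into one:
public void Add(TKey k, TValue v)
{
    if (!_data.TryGetValue(k, out List<TValue> values))
        _data[k] = values = new List<TValue>();
    values.Add(v);
}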
