I'm making an app which needs to loop through Steam games.
Reading libraryfolders.vdf, I need to loop through the entries, find the first value of each pair, and save it as a string.
"libraryfolders"
{
"0"
{
"path" "D:\\Steam"
"label" ""
"contentid" "-1387328137801257092942"
"totalsize" "0"
"update_clean_bytes_tally" "42563526469"
"time_last_update_corruption" "1663765126"
"apps"
{
"730" "31892201109"
"4560" "9665045969"
"9200" "22815860246"
"11020" "776953234"
"34010" "11967809445"
"34270" "1583765638"
for example, it would record:
730
4560
9200
11020
34010
34270
I'm already using System.Text.Json in the program. Is there any way I could loop through and just get the first value using System.Text.Json, or would I need to do something different, as VDF doesn't separate the values with colons or commas?
That is not JSON, that is the KeyValues format developed by Valve. You can read more about the format here:
https://developer.valvesoftware.com/wiki/KeyValues
There are existing Stack Overflow questions about converting a VDF file to JSON, and they mention libraries already developed for reading VDF, which can help you out.
VDF to JSON in C#
If you want a very quick and dirty way to read the file without needing any external library, I would probably use a regular expression and do something like this:
string pattern = "\"apps\"\\s+{\\s+(\"(\\d+)\"\\s+\"\\d+\"\\s+)+\\s+}";
string libraryPath = @"C:\Program Files (x86)\Steam\steamapps\libraryfolders.vdf";
string input = File.ReadAllText(libraryPath);
List<string> indexes = Regex.Matches(input, pattern, RegexOptions.Singleline)
.Cast<Match>().ToList()
.Select(m => m.Groups[2].Captures).ToList()
.SelectMany(c => c.Cast<Capture>())
.Select(c => c.Value).ToList();
foreach(string s in indexes)
{
Debug.WriteLine(s);
}
See the regular expression explanation here:
https://regex101.com/r/bQSt79/1
It basically captures each occurrence of "apps" { } in group 0, does a repeating capture of the pairs of numbers between the curly brackets in group 1, and also captures the leftmost number of each pair in group 2. In most regex engines a repeated group only keeps its last occurrence, but .NET stores every occurrence in the group's Captures collection, so we can still access all the values.
The rest of the code takes each match, the 2nd group of each match, the captures of each group, and the values of those captures, and puts them in a list of strings. Then a foreach prints each of those strings to the debug log.
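As an alternative to one large pattern, the same IDs can be collected by scanning the file line by line. This is only a minimal sketch, written as C# top-level statements (C# 9 or later). ReadAppIds is a hypothetical helper name, and it assumes the layout shown in the question: one key or pair per line, braces on their own lines, and no nested blocks inside "apps".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the keys (app IDs) inside every "apps" { ... } block.
// Assumes one token or pair per line and braces on their own lines.
static List<string> ReadAppIds(IEnumerable<string> lines)
{
    var ids = new List<string>();
    bool sawAppsKey = false; // the previous non-empty line was "apps"
    bool insideApps = false; // currently between the { and } of an apps block

    foreach (string raw in lines)
    {
        string line = raw.Trim();
        if (line.Length == 0) continue;

        if (insideApps)
        {
            if (line == "}") { insideApps = false; continue; }
            // A pair looks like "730" "31892201109"; keep only the first token.
            Match m = Regex.Match(line, "^\"(\\d+)\"");
            if (m.Success) ids.Add(m.Groups[1].Value);
        }
        else if (sawAppsKey && line == "{")
        {
            insideApps = true;
            sawAppsKey = false;
        }
        else
        {
            sawAppsKey = line == "\"apps\"";
        }
    }
    return ids;
}

// Shortened sample based on the question's file; real code would pass
// File.ReadLines(libraryPath) instead.
string[] sample =
{
    "\"libraryfolders\"", "{", "\"0\"", "{",
    "\"path\" \"D:\\\\Steam\"",
    "\"apps\"", "{",
    "\"730\" \"31892201109\"",
    "\"4560\" \"9665045969\"",
    "}", "}", "}"
};

List<string> appIds = ReadAppIds(sample);
Console.WriteLine(string.Join(",", appIds)); // 730,4560
```

For anything beyond this simple layout, one of the existing VDF libraries mentioned above is likely the safer choice.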
Related
I have a file that is formatted this way --
{2000}000000012199{3100}123456789*{3320}110009558*{3400}9876
54321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX
78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLAS
TX 73920**
Basically, the number in curly brackets denotes field, followed by the value for that field. For example, {2000} is the field for "Amount", and the value for it is 121.99 (implied decimal). {3100} is the field for "AccountNumber" and the value for it is 123456789*.
I am trying to figure out a way to split the file into "records" and each record would contain the record type (the value in the curly brackets) and record value, but I don't see how.
How do I do this without a loop going through each character in the input?
A different way to look at it.... The { character is a record delimiter, and the } character is a field delimiter. You can just use Split().
var input = @"{2000}000000012199{3100}123456789*{3320}110009558*{3400}987654321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**";
var rows = input.Split( new [] {"{"} , StringSplitOptions.RemoveEmptyEntries);
foreach (var row in rows)
{
var fields = row.Split(new [] { "}"}, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine("{0} = {1}", fields[0], fields[1]);
}
Output:
2000 = 000000012199
3100 = 123456789*
3320 = 110009558*
3400 = 987654321*
3600 = CTR
4200 = D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**
5000 = D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**
This regular expression should get you going:
Match a literal {
Match 1 or more digits ("a number")
Match a literal }
Match all characters that are not an opening {
\{\d+\}[^{]+
It assumes that the values themselves cannot contain an opening curly brace. If they can, you need to be more clever, e.g. @"\{\d+\}(?:\\{|[^{])+", which also accepts a backslash-escaped brace inside a value (there are likely better ways).
Create a Regex instance and have it match against the text. Each "field" will be a separate match.
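var text = @"{123}abc{456}xyz";
var regex = new Regex(@"\{\d+\}[^{]+", RegexOptions.Compiled);
// Declare the loop variable as Match: on .NET Framework, MatchCollection only enumerates as object.
foreach (Match match in regex.Matches(text)) {
Console.WriteLine(match.Groups[0].Value);
}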
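To get the record type and the record value separately, as the question asks, the same pattern can be given two named groups. The following is a sketch only, written as C# top-level statements; the group names field and value, and the shortened sample text, are illustrative choices rather than part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Shortened sample taken from the question's data.
string text = "{2000}000000012199{3100}123456789*{3600}CTR";

// Same pattern as above, with named groups around the number and the value.
var fieldRegex = new Regex(@"\{(?<field>\d+)\}(?<value>[^{]+)");

var records = new List<KeyValuePair<string, string>>();
foreach (Match m in fieldRegex.Matches(text))
{
    records.Add(new KeyValuePair<string, string>(
        m.Groups["field"].Value, m.Groups["value"].Value));
}

foreach (var record in records)
{
    Console.WriteLine($"{record.Key} = {record.Value}");
}
// 2000 = 000000012199
// 3100 = 123456789*
// 3600 = CTR
```

A list of pairs is used instead of a dictionary so that a field number appearing more than once in a file would not throw.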
This doesn't fully answer the question, but it was getting too long to be a comment, so I'm leaving it here in Community Wiki mode. It does, at least, present a better strategy that may lead to a solution:
The main thing to understand here is it's rare — like, REALLY rare — to genuinely encounter a whole new kind of a file format for which an existing parser doesn't already exist. Even custom applications with custom file types will still typically build the basic structure of their file around a generic format like JSON or XML, or sometimes an industry-specific format like HL7 or MARC.
The strategy you should follow, then, is to first determine exactly what you're dealing with. Look at the software that generates the file; is there an existing SDK, reference, or package for the format? Or look at the industry surrounding this data; is there a special set of formats related to that industry?
Once you know this, you will almost always find an existing parser ready and waiting, and it's usually as easy as adding a NuGet package. These parsers are genuinely faster, need less code, and will be less susceptible to bugs (because most will have already been found by someone else). It's just an all-around better way to address the issue.
Now what I see in the question isn't something I recognize, so it's just possible you genuinely do have a custom format for which you'll need to write a parser from scratch... but even so, it doesn't seem like we're to that point yet.
Here is how to do it in LINQ, without a regex:
string x = "{2000}000000012199{3100}123456789*{3320}110009558*{3400}987654321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**";
var result =
    x.Split('{', StringSplitOptions.RemoveEmptyEntries)
     .Aggregate(new List<Tuple<string, string>>(),
         (l, z) =>
         {
             var az = z.Split('}');
             l.Add(new Tuple<string, string>(az[0], az[1]));
             return l;
         });
I have a string as shown below
string names = "<?startname; Max?><?startname; Alex?><?startname; Rudy?>";
Is there any way I can split this string and add Max, Alex and Rudy into a separate list?
Sure, split on two strings (all that consistently comes before, and all that consistently comes after) and specify that you want Split to remove the empties:
var r = names.Split(new[]{ "<?startname; ", "?>" }, StringSplitOptions.RemoveEmptyEntries);
If you take out RemoveEmptyEntries you get a clearer idea of how the splitting works. In essence, without it you'd get your names interspersed with empty strings, because Split found one delimiter (the <?...) immediately following another (the ?>) with nothing between them.
You can read the volumes of info about this form of Split in the documentation (the link was to the .NET Core 3.1 version; you can change your version in the table of contents). This variant of Split has been available since .NET Framework 2.0.
You did also say "add to a separate list". I didn't see any code for that, so I guess you will either be happy to treat r here as "a separate list" (an array actually, but probably adequately equivalent, and easy to convert with LINQ's ToList() if not), or, if you have another list of names (one that really is a List<string>), you can call thatList.AddRange(r).
Another idea is to use a regex.
The following regex should work:
(?<=; )(.*?)(?=\s*\?>)
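A possible way to apply that pattern and collect the names into a List<string> is sketched below, written as C# top-level statements. The variable name nameList is an illustrative choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

string names = "<?startname; Max?><?startname; Alex?><?startname; Rudy?>";

// The lookbehind (?<=; ) and lookahead (?=\s*\?>) only anchor the match;
// each match value is just the text between them.
List<string> nameList = Regex.Matches(names, @"(?<=; )(.*?)(?=\s*\?>)")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();

Console.WriteLine(string.Join(", ", nameList)); // Max, Alex, Rudy
```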
I have a StringBuilder which stores many words. For example, I did:
StringBuilder builder = new StringBuilder();
builder.Append(reader.Value);
Now builder contains a string such as
" india is a great great country and it has many states and territories". It contains many paragraphs.
I want each unique word to be listed along with its word count. For example:
india: 1
great: 2
country: 1
and: 2
Also, this result should be saved in an Excel file. But I am not getting the result.
I searched on Google, but the answers I find either use LINQ or hard-code the words themselves. Can you please help me out? I am a beginner.
You can use LINQ to achieve it. Try something like this. (StringBuilder has no Split method, so convert it to a string first.)
var result = from word in builder.ToString().Split(' ')
             group word by word into g
             select new { Word = g.Key, Count = g.Count() };
You can also convert this result into a Dictionary object like this:
Dictionary<string, int> output = result.ToDictionary(a => a.Word, a => a.Count);
So here each item in output will contain the word as its key and its count as the value.
Well, this is one way to get the words:
IEnumerable<string> words = builder.ToString().Split(' ');
Look into using the String.Split() function to break up your string into words. You can then use a Dictionary<string, int> to keep track of unique words and their counts.
You don't really need a StringBuilder for this, though. A StringBuilder is useful when you concatenate strings together a lot. You only have a single input string here and you won't add to it; you'll split it up.
Once you finish processing all the words in the input string, you can write the code to export the results to Excel. The simplest way to do that is to create a comma-separated text file - search for that phrase and look into using a StreamWriter to save the output. Excel has built-in converters for CSV files.
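The Dictionary and StreamWriter steps described above could look roughly like this. It is a sketch under assumptions, written as C# top-level statements: the file name wordcounts.csv and the temp-folder location are placeholders, words are compared case-insensitively, and punctuation is not stripped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

string input = " india is a great great country and it has many states and territories";

// Split on whitespace and drop the empty entries caused by leading or repeated spaces.
string[] words = input.Split(new[] { ' ', '\t', '\r', '\n' },
                             StringSplitOptions.RemoveEmptyEntries);

// Case-insensitive comparison is an assumption; use the default comparer to keep case.
var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
foreach (string word in words)
{
    counts.TryGetValue(word, out int current); // current stays 0 for a new word
    counts[word] = current + 1;
}

// Placeholder output path; Excel can open the resulting CSV file directly.
string csvPath = Path.Combine(Path.GetTempPath(), "wordcounts.csv");
using (var writer = new StreamWriter(csvPath))
{
    writer.WriteLine("Word,Count");
    foreach (KeyValuePair<string, int> pair in counts)
    {
        writer.WriteLine($"{pair.Key},{pair.Value}");
    }
}

Console.WriteLine($"great: {counts["great"]}, and: {counts["and"]}"); // great: 2, and: 2
```

If the text can contain commas or quotes attached to words, those values would need quoting or stripping before being written to the CSV.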
Regex is one of those things I've wanted to be able to write myself, and although I have a basic understanding of how it works, I've never found myself in a situation where I needed a pattern that isn't already widely available on the web (such as for validating email addresses).
A problem that I have is that I am receiving a string which is comma separated, however some of the string values contain commas also. For example I might receive:
$COMMAND=1,2,3,"string","another,string",4,5,6
Generally I will never receive anything like this, however the device sending me this string array allows for it to happen so I would like to be able to split the array accordingly if it ever were to occur.
So obviously just splitting it like so (where rawResponse has the $COMMAND= part removed):
string[] response = rawResponse.Split(',');
is not good enough! I think regex is the correct tool for the job; could anyone help me write it?
string rawResponse = @"1,2,3,""string"",""another,string"",4,5";
string pattern = @"[^,""]+|""([^""]*)""";
foreach (Match match in Regex.Matches(rawResponse, pattern))
{
    // use match.Value
}
Results:
1
2
3
"string"
"another,string"
4
5
If you need the response as an array of strings you can use LINQ:
var response = Regex.Matches(rawResponse, pattern).Cast<Match>()
.Select(m => m.Value).ToArray();
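string originalString = @"1,2,3,""string"",""another,string"",4,5,6";
string regexPattern = @"(("".*?"")|(.*?))(,|$)";
foreach (Match match in Regex.Matches(originalString, regexPattern))
{
    // match.Groups[1].Value is the field without its trailing comma.
    // The pattern can also yield an empty match at the end of the input; skip it.
}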
I have a string of attribute names and definitions.
I am trying to split the string on the attribute name into a Dictionary<string, string>, where the key is the attribute name and the value is the definition. I won't know the attribute names ahead of time, so I have been trying to somehow split on the ":" character, but am having trouble with that because the attribute name is not included in the split.
For example, I need to split this string on "Organization:", "OrganizationType:", and "Nationality:" into a Dictionary. Any ideas on the best way to do this with C# and .NET?
Organization: Name of a governmental, military or other organization. OrganizationType: Organization classification to one of the following types: sports, governmental military, governmental civilian or political party. (required) Nationality: Organization nationality if mentioned in the document. (required)
Here is some sample code to help:
private static void Main()
{
const string str = "Organization: Name of a governmental, military or other organization. OrganizationType: Organization classification to one of the following types sports, governmental military, governmental civilian or political party. (required) Nationality: Organization nationality if mentioned in the document. (required)";
var array = str.Split(':');
var dictionary = array.ToDictionary(x => x[0], x => x[1]);
foreach (var item in dictionary)
{
Console.WriteLine("{0}: {1}", item.Key, item.Value);
}
// Expecting to see the following output:
// Organization: Name of a governmental, military or other organization.
// OrganizationType: Organization classification to one of the following types sports, governmental military, governmental civilian or political party.
// Nationality: Organization nationality if mentioned in the document. (required)
}
Here is a visual explanation of what I am trying to do:
http://farm5.static.flickr.com/4081/4829708565_ac75b119a0_b.jpg
I'd do it in two phases, firstly split into the property pairs using something like this:
Regex.Split(input, @"\s(?=[A-Z][A-Za-z]*:)")
This looks for any whitespace followed by an alphabetic string followed by a colon. The alphabetic string must start with a capital letter. It then splits on that whitespace. That will get you three strings of the form "PropertyName: PropertyValue". Splitting on that first colon is then pretty easy (I'd personally probably just use Substring and IndexOf rather than another regular expression, but you sound like you can do that bit fine on your own). Shout if you do want help with the second split.
The only thing to say is be careful in case you get false matches due to the input being awkward. In that case you'll just have to make the regex more complicated to try to compensate.
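Both phases together might look like the sketch below, written as C# top-level statements. It uses the sample text from the question and assumes every attribute name is a single capitalised word; the variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "Organization: Name of a governmental, military or other organization. " +
    "OrganizationType: Organization classification to one of the following types: sports, " +
    "governmental military, governmental civilian or political party. (required) " +
    "Nationality: Organization nationality if mentioned in the document. (required)";

var attributes = new Dictionary<string, string>();

// Phase 1: split on whitespace that is followed by "CapitalisedWord:".
foreach (string pair in Regex.Split(input, @"\s(?=[A-Z][A-Za-z]*:)"))
{
    // Phase 2: split on the first colon only; later colons belong to the value.
    int colon = pair.IndexOf(':');
    if (colon < 0) continue; // no name in this piece, so skip it

    string key = pair.Substring(0, colon).Trim();
    string value = pair.Substring(colon + 1).Trim();
    attributes[key] = value;
}

foreach (var item in attributes)
{
    Console.WriteLine("{0}: {1}", item.Key, item.Value);
}
```

Note that the colon after "types" stays inside the OrganizationType value, because "types" is lowercase and so does not trigger a split.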
You would need some delimiter to indicate when it is the end of each pair as opposed to having one large string with sections in between e.g.
Organization: Name of a governmental, military or other organization.|OrganizationType: Organization classification to one of the following types: sports, governmental military, governmental civilian or political party. (required) |Nationality: Organization nationality if mentioned in the document. (required)
Notice the | character, which indicates the end of each pair. Then it is just a case of using a very specific delimiter between name and value, something that is not likely to appear in the description text. Instead of one colon you could use two (::), since a single colon could crop up on occasion, as others have suggested. That means you would just need to do:
// split the string into rows
string[] rows = myString.Split('|');
Dictionary<string, string> pairs = new Dictionary<string, string>();
foreach (var r in rows)
{
// split each row into a pair and add to the dictionary
string[] split = Regex.Split(r, "::");
pairs.Add(split[0], split[1]);
}
You can use LINQ as others have suggested, the above is more for readability so you can see what is happening.
Another alternative is to devise some custom regex to do what you need but again you would need to be making a lot of assumptions of how the description text would be formatted etc.
Considering that each word in front of the colon always has at least one capital (please confirm), you could solve this by using regular expressions (otherwise you'd end up splitting on all colons, which also appear inside the sentences):
var resultDict = Regex.Split(input, @"(?<= [A-Z][a-zA-Z]+):")
.ToDictionary(a => a[0], a => a[1]);
The (?<=...) is a positive look-behind expression that doesn't "eat up" the characters, thus only the colon is removed from the output. Tested with your input here.
The [A-Z][a-zA-Z]+ means: a word that starts with a capital.
Note that, as others have suggested, a "smarter" delimiter will provide easier parsing, as does escaping the delimiter (i.e. like "::" or "\:") when you are required to use colons. Not sure if those are options for you though, hence the solution with regular expressions above.
Edit
For one reason or another, I kept getting errors when using ToDictionary, so here's the unwound version; at least it works. Apologies for the earlier non-working version. Note that the regular expression has changed: the first did not include the key, which is the inverse of the data.
var splitArray = Regex.Split(input, @"(?<=( |^)[A-Z][a-zA-Z]+):|( )(?=[A-Z][a-zA-Z]+:)")
.Where(a => a.Trim() != "").ToArray();
Dictionary<string, string> resultDict = new Dictionary<string, string>();
for(int i = 0; i < splitArray.Count(); i+=2)
{
resultDict.Add(splitArray[i], splitArray[i+1]);
}
Note: the regular expression becomes a tad complex in this scenario. As suggested in the thread below, you can split it in smaller steps. Also note that the current regex creates a few empty matches, which I remove with the Where-expression above. The for-loop should not be needed if you manage to get ToDictionary working.