I have a lot of different collections of values that I generate at runtime and want to send to Elasticsearch. I can represent them as List<object> or, if it really doesn't work any other way, as List<string>. But I can't find any example of how to do that. Here is an example of the code, which doesn't work. There is probably a lot wrong with it, so any additional pointers are highly appreciated.
var client = new ElasticClient(new Uri("http://localhost:9200"));
client.CreateIndex("testentry");
var values = new List<object> {"StringValue", 123, DateTime.Now};
var indexResponse = client.Index(values, descriptor => descriptor.Index("testentry"));
Console.WriteLine(indexResponse.DebugInformation);
Which results in:
Invalid NEST response built from a unsuccessful low level call on POST: /testentry/list%601
# Audit trail of this API call:
- [1] BadResponse: Node: http://localhost:9200/ Took: 00:00:00.0600035
# ServerError: ServerError: 400Type: mapper_parsing_exception Reason: "failed to parse" CausedBy: "Type: not_x_content_exception Reason: "Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes""
and
[2016-09-17 14:16:20,955][DEBUG][action.index ] [Gin Genie] failed to execute [index {[testentry][list`1][AVc4E3HaPglqpoLcosDo], source[_na_]}] on [[testentry][1]]
MapperParsingException[failed to parse]; nested: NotXContentException[Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes];
at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:156)
I'm using Elasticsearch.Net 2.4.3 and NEST 2.4.3.
In addition to Henrik's answer, you could also index values in a Dictionary<string, object>:
public class MyType
{
    public MyType()
    {
        Values = new Dictionary<string, object>();
    }

    public Dictionary<string, object> Values { get; private set; }
}
void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var connectionSettings = new ConnectionSettings(pool);
    var client = new ElasticClient(connectionSettings);

    var myType = new MyType
    {
        Values =
        {
            { "value1", "StringValue" },
            { "value2", 123 },
            { "value3", DateTime.Now },
        }
    };

    client.Index(myType, i => i.Index("index-name"));
}
The Dictionary<string, object> will be serialized to a JSON object, with property names matching the dictionary keys:
{
"values": {
"value1": "StringValue",
"value2": 123,
"value3": "2016-09-18T18:41:48.7344837+10:00"
}
}
Within Elasticsearch, the mapping will be inferred as an object type.
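For illustration, the dynamically generated mapping for the document above would look roughly like this in Elasticsearch 2.x (a sketch only; the type name mytype is assumed here, and the actual generated mapping may differ):
{
  "mappings": {
    "mytype": {
      "properties": {
        "values": {
          "properties": {
            "value1": { "type": "string" },
            "value2": { "type": "long" },
            "value3": { "type": "date" }
          }
        }
      }
    }
  }
}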
Arrays with a mixture of data types are not supported.
You could convert all of the values to strings:
client.CreateIndex("testentry");
var values = new List<string> { "StringValue", "123", DateTime.Now.ToString() };
var indexResponse = client.Index(new { Values = values}, descriptor => descriptor.Index("testentry").Type("test"));
Or specify the fields that the values should be indexed to:
client.CreateIndex("testentry");
var values = new { Field1 = "StringValue", Field2 = 123, Field3 = DateTime.Now };
var indexResponse = client.Index(values, descriptor => descriptor.Index("testentry").Type("test"));
Consider specifying the type of the document with the IndexDescriptor or create a class for the document.
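For example, a minimal sketch of the strongly typed approach (the Entry class and its property names here are illustrative, not from the original post):
public class Entry
{
    public string Value1 { get; set; }
    public int Value2 { get; set; }
    public DateTime Value3 { get; set; }
}

var entry = new Entry { Value1 = "StringValue", Value2 = 123, Value3 = DateTime.Now };
// NEST infers the type name from the CLR type; .Type() overrides it explicitly
var indexResponse = client.Index(entry, i => i.Index("testentry").Type("entry"));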
I am trying to upsert documents within an array of documents via the C# Driver for MongoDB. I manage to modify existing array elements via $set & arrayFilters, but struggle to add non-existing elements via $addToSet.
I would be glad about any suggestion, even if there is a completely different way.
My simplified class in C#:
internal class TimeSeries
{
    [BsonId]
    internal string Name;

    [BsonDictionaryOptions(DictionaryRepresentation.ArrayOfDocuments)]
    internal Dictionary<DateTime, double> Container;
}
// add a test document
void foo()
{
    var coll = _myDatabase.GetCollection<TimeSeries>("myColl");
    var res = coll.InsertOne(new TimeSeries()
    {
        Name = "abc",
        Container = new Dictionary<DateTime, double> { { new DateTime(2000, 1, 1), 20 } }
    });
}
$set and $addToSet in the Mongo Shell work fine:
// modify the existing value to 30
db.myColl.update( {"_id":"abc"}, {$set: {"Container.$[loc].v":30}}, {arrayFilters:[{"loc.k":new Date("2000-01-01")}]})
// add if no existent
db.myColl.update( {"_id":"abc"}, {$addToSet: {"Container": {"k":new Date("2000-02-01"),"v":200}}})
In C# I can reproduce $set, but I get a "Specified cast is not valid" error for $addToSet.
var filter = Builders<TimeSeries>.Filter.Eq("_id", "abc");
var arrayFilters = new List<ArrayFilterDefinition<BsonDocument>>
    { new BsonDocument("loc.k", new DateTime(2000, 1, 1)) };

// $set
var upsert = Builders<TimeSeries>.Update.Set("Container.$[loc].v", 30);
var resUpt = coll.UpdateOne(filter, upsert, new UpdateOptions { ArrayFilters = arrayFilters });

// $addToSet
var upsert_add = Builders<TimeSeries>.Update.AddToSet("Container", new BsonDocument { { "k", new DateTime(2000, 2, 1) }, { "v", 50 } });
var res_add = coll.UpdateOne(filter, upsert_add); // Specified cast is not valid
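One workaround that may avoid the cast error is to pass a typed KeyValuePair instead of a raw BsonDocument, so the driver can resolve the dictionary's item serializer itself. A sketch (untested here, assuming the ArrayOfDocuments representation shown above):
// Hypothetical: let the driver serialize the pair rather than a hand-built BsonDocument
var pair = new KeyValuePair<DateTime, double>(new DateTime(2000, 2, 1), 200);
var upsert_add2 = Builders<TimeSeries>.Update.AddToSet(x => x.Container, pair);
var res_add2 = coll.UpdateOne(filter, upsert_add2);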
I am setting up a ChangeStream to notify me when a document has changed in a collection so that I can upsert the "LastModified" element for that document to the time of the event. Since this update will cause a new event to occur on the ChangeStream, I need to filter out these updates to prevent an infinite loop (updating the LastModified element because the LastModified element was just updated...).
I have the following code that is working when I specify the exact field:
ChangeStreamOptions options = new ChangeStreamOptions();
options.ResumeAfter = resumeToken;
string filter = "{ $and: [ { operationType: { $in: ['replace','insert','update'] } }, { 'updateDescription.updatedFields.LastModified': { $exists: false } } ] }";
var pipeline = new EmptyPipelineDefinition<ChangeStreamDocument<BsonDocument>>().Match(filter);
var cursor = collection.Watch(pipeline, options, cancelToken);
However, instead of hard-coding the "updateDescription.updatedFields.LastModified", I would like to provide a list of element names that I don't want to exist in the updatedFields document.
I attempted:
string filter = "{ $and: [ { operationType: { $in: ['replace','insert','update'] } }, { 'updateDescription.updatedFields': { $nin: [ 'LastModified' ] } } ] }";
but this didn't work as expected (I still got the update events for the LastModified change).
I originally was using the Filter Builder:
FilterDefinitionBuilder<ChangeStreamDocument<BsonDocument>> filterBuilder = Builders<ChangeStreamDocument<BsonDocument>>.Filter;
FilterDefinition<ChangeStreamDocument<BsonDocument>> filter = filterBuilder.In("operationType", new string[] { "replace", "insert", "update" }); //Only include the change if it was one of these types. Available types are: insert, update, replace, delete, invalidate
filter &= filterBuilder.Nin("updateDescription.updatedFields", ChangedFieldsToIgnore); //If this is an update, only include it if the field(s) updated contains 1+ fields not in the ChangedFieldsToIgnore list
where ChangedFieldsToIgnore is a List containing the field names that I do not want to get events for.
Can anyone help with the syntax that I need to use? Or do I need to create a loop around my ChangedFieldsToIgnore list and create a new entry in the filter for each item to "$exists: false"? (This doesn't seem very efficient.)
EDIT:
I attempted the following code based on the answer by @wan-bachtiar, but I'm getting an exception on my enumerator.MoveNext() call:
var match1 = new BsonDocument { { "$match", new BsonDocument { { "operationType", new BsonDocument { { "$in", new BsonArray(new string[] { "replace", "insert", "update" }) } } } } } };
var match2 = new BsonDocument { { "$addFields", new BsonDocument { { "tmpfields", new BsonDocument { { "$objectToArray", "$updateDescription.updatedFields" } } } } } };
var match3 = new BsonDocument { { "$match", new BsonDocument { { "tmpfields.k", new BsonDocument { { "$nin", new BsonArray(updatedFieldsToIgnore) } } } } } };
var pipeline = new[] { match1, match2, match3 };
var cursor = collection.Watch<ChangeStreamDocument<BsonDocument>>(pipeline, options, Profile.CancellationToken);
enumerator = cursor.ToEnumerable().GetEnumerator();
enumerator.MoveNext();
ChangeStreamDocument<BsonDocument> doc = enumerator.Current;
The exception is: Invalid field name: "tmpfields".
I suspect the problem might be that I'm getting "replace" and "insert" events which do not contain the updateDescription field, so the $addFields/$objectToArray are failing. I'm too new to figure out the syntax, but I think I need to use a filter that does:
{ $match: { "operationType": { $in: ["replace", "insert"] } } }
OR
{ $eq: { "operationTYpe": "update" }} AND { $addFields....}
Also, it appears that the C# driver does not include a Builder that helps with the $addFields and $objectToArray operations. I was only able to use the new BsonDocument {...} method to build the pipeline variable.
ChangedFieldsToIgnore is a List containing the field names that I do not want to get events for.
If you would like to filter based on multiple keys (whether updatedFields contains certain fields), it's easier if you convert the keys to values first.
You can convert the document contained within updatedFields into values by utilising aggregation operator $objectToArray. For example:
pipeline = [{"$addFields": {
"tmpfields":{
"$objectToArray":"$updateDescription.updatedFields"}
}},
{"$match":{"tmpfields.k":{
"$nin":["LastModified", "AnotherUnwantedField"]}}}
];
The above aggregation pipeline adds a temporary field called tmpfields. This new field pivots the content of updateDescription.updatedFields, turning {name: value} into [{k: name, v: value}]. Once we have those keys as values, we can utilise $nin against them as an array filter.
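Concretely, if an update touched two fields (the field names below are just for illustration), the $addFields stage produces:
// updateDescription.updatedFields as emitted by the change stream
{ "LastModified": ISODate("2018-01-01T00:00:00Z"), "Status": "Done" }

// tmpfields after $objectToArray
[
  { "k": "LastModified", "v": ISODate("2018-01-01T00:00:00Z") },
  { "k": "Status", "v": "Done" }
]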
UPDATED
The reason you're getting an exception about tmpfields being an invalid field name is that the result is cast into the ChangeStreamDocument model, which does not have a recognizable field called tmpfields.
For operations that do not have the updateDescription.updatedFields field, the value of tmpfields will simply be null.
Below is an example of MongoDB ChangeStream .Net/C# using MongoDB .Net driver v2.5, along with an aggregation pipeline that modifies the output change stream.
This example is not type safe and returns BsonDocument:
var database = client.GetDatabase("database");
var collection = database.GetCollection<BsonDocument>("collection");
var options = new ChangeStreamOptions { FullDocument = ChangeStreamFullDocumentOption.UpdateLookup };
// Aggregation Pipeline
var addFields = new BsonDocument {
    { "$addFields", new BsonDocument {
        { "tmpfields", new BsonDocument {
            { "$objectToArray", "$updateDescription.updatedFields" }
        } }
    } } };

var match = new BsonDocument {
    { "$match", new BsonDocument {
        { "tmpfields.k", new BsonDocument {
            { "$nin", new BsonArray { "LastModified", "Unwanted" } }
        } }
    } } };

var pipeline = new[] { addFields, match };
// ChangeStreams
var cursor = collection.Watch<BsonDocument>(pipeline, options);
foreach (var change in cursor.ToEnumerable())
{
Console.WriteLine(change.ToJson());
}
I wrote the piece of code below as I was having the same issues you were having. No need to mess around with BsonDocument objects ...
//The operationType can be one of the following: insert, update, replace, delete, invalidate
//ignore the LastRun and IsRunning fields, as we would otherwise end up in an endless loop
var pipeline = new EmptyPipelineDefinition<ChangeStreamDocument<ATask>>()
.Match("{ operationType: { $in: [ 'replace', 'update' ] } }")
.Match(#"{ ""updateDescription.updatedFields.LastRun"" : { $exists: false } }")
.Match(#"{ ""updateDescription.updatedFields.IsRunning"" : { $exists: false } }");
var options = new ChangeStreamOptions { FullDocument = ChangeStreamFullDocumentOption.UpdateLookup };
var changeStream = Collection.Watch(pipeline, options);
while (changeStream.MoveNext())
{
var next = changeStream.Current;
foreach (var obj in next)
yield return obj.FullDocument;
}
I'm trying to create some dynamic ExpandoObjects, and I've encountered a certain problem.
As I don't know what the names of the different properties in my objects should be, I can't do it like this:
var list = new ArrayList();
var obj = new ExpandoObject();
obj.ID = 1;
obj.Product = "Pie";
obj.Days = 1;
obj.QTY = 65;
list.Add(obj);
Let me explain my situation: I wish to get data from a random DB (I don't know which; I build a connection string from the information I get from the UI), therefore I don't know what data I need to get. This could be an example of a DB table:
TABLE Sale
ID: int,
Product: nvarchar(100),
Days: int,
QTY: bigint
This could be another example:
TABLE Foobar
Id: int,
Days: int,
QTY: bigint,
Product_Id: int,
Department_Id: int
As you can see, I don't know what the DB looks like (this is 100% anonymous, therefore it needs to be 100% dynamic), and the data I want to return should look like well-constructed JSON, like so:
[
  {
    "ID": 1,
    "Product": "Pie",
    "Days": 1,
    "QTY": 65
  },
  {
    "ID": 2,
    "Product": "Melons",
    "Days": 5,
    "QTY": 12
  }
]
Or, with the other example:
[
  {
    "ID": 1,
    "Days": 2,
    "QTY": 56,
    "Product_Id": 5,
    "Department_Id": 2
  },
  {
    "ID": 2,
    "Days": 6,
    "QTY": 12,
    "Product_Id": 2,
    "Department_Id": 5
  }
]
I've tried working with these ExpandoObjects, but can't seem to make it work, as I can't do what's illustrated at the top of this question (I don't know the names of the properties). Is there a way for me to say something like:
var obj = new ExpandoObject();
var propName = "Product";
var obj.propName = "Pie"
Console.WriteLine("Let's print!: " + obj.Product);
//OUTPUT
Let's print!: Pie
Does anyone have a solution, or simply guidance toward a structure that might solve this situation?
Rather than creating an ExpandoObject or some other dynamic type, you could create a List<Dictionary<string, object>> where each Dictionary<string, object> contains the name/value pairs you want to serialize. Then serialize to JSON using Json.NET (or JavaScriptSerializer, though that is less flexible):
var list = new List<Dictionary<string, object>>();
// Build a dictionary entry using a dictionary initializer: https://msdn.microsoft.com/en-us/library/bb531208.aspx
list.Add(new Dictionary<string, object> { { "ID", 1 }, {"Product", "Pie"}, {"Days", 1}, {"QTY", 65} });
// Build a dictionary entry incrementally
// See https://msdn.microsoft.com/en-us/library/xfhwa508%28v=vs.110%29.aspx
var dict = new Dictionary<string, object>();
dict["ID"] = 2;
dict["Product"] = "Melons";
dict["Days"] = 5;
dict["QTY"] = 12;
list.Add(dict);
Console.WriteLine(JsonConvert.SerializeObject(list, Formatting.Indented));
Console.WriteLine(new JavaScriptSerializer().Serialize(list));
The first outputs:
[
{
"ID": 1,
"Product": "Pie",
"Days": 1,
"QTY": 65
},
{
"ID": 2,
"Product": "Melons",
"Days": 5,
"QTY": 12
}
]
The second outputs the same without the indentation:
[{"ID":1,"Product":"Pie","Days":1,"QTY":65},{"ID":2,"Product":"Melons","Days":5,"QTY":12}]
Use dynamic, then cast to IDictionary<string, object> to loop through your properties:
dynamic obj = new ExpandoObject();
obj.Product = "Pie";
obj.Quantity = 2;
// Loop through all added properties
foreach(var prop in (IDictionary<string, object>)obj)
{
Console.WriteLine(prop.Key + " : " + prop.Value);
}
I've made a fiddle: https://dotnetfiddle.net/yFLy2u
Now this is a solution to your question... other answers, like @dbc's, might be better suited to the problem (which is not the question, really).
As you can see in the ExpandoObject Class documentation, ExpandoObject implements IDictionary<string, object>, so you can use that fact like this:
IDictionary<string, object> obj = new ExpandoObject();
var propName = "Product";
obj[propName] = "Pie";
Console.WriteLine("Let's print!: " + obj[propName]);

// Verify it's working
Console.WriteLine("Let's print again!: " + ((dynamic)obj).Product);
While I was writing this answer, I see you already got a proper answer. You can use a Dictionary<string, object> or even a Tuple.
But as per your original question, you wanted to add properties dynamically. For that you can refer to the other answer using ExpandoObject. This is just the same solution (using ExpandoObject to dynamically add properties) with classes similar to your code.
//example classes
public class DictKey
{
    public string DisplayName { get; set; }
    public DictKey(string name) { DisplayName = name; }
}

public class DictValue
{
    public int ColumnIndex { get; set; }
    public DictValue(int idx) { ColumnIndex = idx; }
}

//utility method
public static IDictionary<string, object> GetExpando(KeyValuePair<DictKey, List<DictValue>> dictPair)
{
    IDictionary<string, object> dynamicObject = new ExpandoObject();
    dynamicObject["Station"] = dictPair.Key.DisplayName;
    foreach (var item in dictPair.Value)
    {
        dynamicObject["Month" + (item.ColumnIndex + 1)] = item;
    }
    return dynamicObject;
}
And a usage example:
var dictionaryByMonth = new Dictionary<DictKey, List<DictValue>>();
dictionaryByMonth.Add(new DictKey("Set1"), new List<DictValue> { new DictValue(0), new DictValue(2), new DictValue(4), new DictValue(6), new DictValue(8) });
dictionaryByMonth.Add(new DictKey("Set2"), new List<DictValue> { new DictValue(1), new DictValue(2), new DictValue(5), new DictValue(6), new DictValue(11) });
var rowsByMonth = dictionaryByMonth.Select(item => GetExpando(item));
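For reference, serializing rowsByMonth with a JSON serializer such as Json.NET would produce one object per dictionary entry, roughly like this for Set1 (a sketch; the DictValue values come out as nested objects):
{ "Station": "Set1", "Month1": { "ColumnIndex": 0 }, "Month3": { "ColumnIndex": 2 }, "Month5": { "ColumnIndex": 4 }, "Month7": { "ColumnIndex": 6 }, "Month9": { "ColumnIndex": 8 } }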
First, read this blog post by the C# team thoroughly.
Let's look at your code:
var obj = new ExpandoObject();
var propName = "Product";
var obj.propName = "Pie"
Console.WriteLine("Let's print!: " + obj.Product);
//OUTPUT
Let's print!: Pie
In your code you are using var obj = new ExpandoObject();, so you are creating a statically typed object of type ExpandoObject. In the blog they specifically call out:
I didn’t write ExpandoObject contact = new ExpandoObject(), because if I did contact would be a statically-typed object of the ExpandoObject type. And of course, statically-typed variables cannot add members at run time. So I used the new dynamic keyword instead of a type declaration, and since ExpandoObject supports dynamic operations, the code works
So if you rewrite your code to use dynamic obj and add the dynamic properties as properties, it should work!
But for your particular use case, you'd be better off using dictionaries, as suggested above by @dbc.
dynamic obj = new ExpandoObject();
obj.Product = "Pie";
Console.WriteLine("Let's print!: " + obj.Product);

//OUTPUT
Let's print!: Pie
This code is used with the C# driver to select items from a collection whose location field value is in a set of location id values; I am just providing it as an example:
var locations = new BsonValue[] { 1, 2, 3, 4 };
var data = collection
.Find(Builders<BsonDocument>.Filter.In("LocationId", locations))
.Project(x => Mapper.Map<BsonDocument, ItemViewModel>(x))
.ToListAsync().Result;
Does BsonValue just serve to initialize an array here? Where do I get more information? How do I convert a regular C# list/array into that bson value?
BsonDocument provides a flexible way to represent JSON/BSON in C#. Creating a BsonDocument is similar to creating JSON objects.
Simple document
new BsonDocument("name", "Joe")
creates JSON { "name" : "Joe" }
More complex document
new BsonDocument
{
{"Name", "Joe"},
{
"Books", new BsonArray(new[]
{
new BsonDocument("Name", "Book1"),
new BsonDocument("Name", "Book2")
})
}
}
creates JSON {"Name":"Joe", "Books" : [ { "Name":"Book1" },{ "Name":"Book2" } ]}
Array
new BsonArray(new [] {1, 2, 3})
creates JSON [1,2,3]
Convert C# class to BsonDocument
var product = new Product { Name = "Book", Pages = 3 }.ToBsonDocument();
creates JSON {"Name":"Book","Pages":3}
Implicit conversions help initialize variables:
BsonValue bsonInt = 1;
BsonValue bsonBool = true;
new BsonValue[] { 1, 2, 3, 4 }
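To the last part of the question, converting a regular C# list or array: BsonArray has constructors that accept an IEnumerable, and individual values convert implicitly, so (a sketch reusing the LocationId example above; Select requires System.Linq):
var locationIds = new List<int> { 1, 2, 3, 4 };

// wrap the whole list at once
var asArray = new BsonArray(locationIds); // [1, 2, 3, 4]

// or produce the BsonValue[] form used in the question
var asValues = locationIds.Select(i => (BsonValue)i).ToArray();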
Can I retrieve basic information about all collections in a MongoDB with F#?
I have a MongoDB with > 450 collections. I can access the db with
open MongoDB.Bson
open MongoDB.Driver
open MongoDB.Driver.Core
open MongoDB.FSharp
open System.Collections.Generic
let connectionString = "mystring"
let client = new MongoClient(connectionString)
let db = client.GetDatabase(name = "Production")
I had considered trying to just get all collections then loop through each collection name and get basic information about each collection with
let collections = db.ListCollections()
and
db.GetCollection([name of a collection])
but the db.GetCollection([name]) requires me to define a type to pull the information about each collection. This is challenging for me as I don't want to have to define a type for each collection, of which there are > 450, and frankly, I don't really know much about this DB. (Actually, no one in my org does; that's why I'm trying to put together a very basic data dictionary.)
Is defining the type for each collection really necessary? Can I use the MongoCollection methods available here without having to define a type for each collection?
EDIT: Ultimately, I'd like to be able to output collection name, the n documents in each collection, a list of the field names in each collection, and a list of each field type.
I chose to write my examples in C# as I'm more familiar with the C# driver, and it is a listed tag on the question. You can run an aggregation against each collection to find all top-level fields and their (MongoDB) types for each document.
The aggregation is done in 3 steps. Let's assume the input is 10 documents which all have this form:
{
"_id": ObjectId("myId"),
"num": 1,
"str": "Hello, world!"
}
1. $project: Convert each document into an array of documents with values fieldName and fieldType. Outputs 10 documents, each with a single array field of 3 elements.
2. $unwind: Unwind the arrays of field info. Outputs 30 documents, each with a single field corresponding to an element from the output of step 1.
3. $group: Group the fields by fieldName and fieldType to get distinct values. Outputs 3 documents. Since all fields with the same name always have the same type in this example, there is only one final output document for each field. If two different documents defined the same field, one as a string and one as an int, there would be separate entries in this result set for both.
// Define our aggregation steps.
// Step 1, $project:
var project = new BsonDocument
{
    { "$project", new BsonDocument
        {
            { "_id", 0 },
            { "fields", new BsonDocument
                {
                    { "$map", new BsonDocument
                        {
                            { "input", new BsonDocument { { "$objectToArray", "$$ROOT" } } },
                            { "in", new BsonDocument
                                {
                                    { "fieldName", "$$this.k" },
                                    { "fieldType", new BsonDocument { { "$type", "$$this.v" } } }
                                } }
                        } }
                } }
        } }
};

// Step 2, $unwind:
var unwind = new BsonDocument { { "$unwind", "$fields" } };

// Step 3, $group:
var group = new BsonDocument
{
    { "$group", new BsonDocument
        {
            { "_id", new BsonDocument
                {
                    { "fieldName", "$fields.fieldName" },
                    { "fieldType", "$fields.fieldType" }
                } }
        } }
};
// Connect to our database
var client = new MongoClient("myConnectionString");
var db = client.GetDatabase("myDatabase");
var collections = db.ListCollections().ToEnumerable();
/*
We will store the results in a dictionary of collections.
Since the same field can have multiple types associated with it, the inner value corresponding to each field is `List<string>`.
The outer dictionary keys are collection names. The inner dictionary keys are field names.
The inner dictionary values are the types for the provided inner dictionary's key (field name).
List<string> fieldTypes = allCollectionFieldTypes[collectionName][fieldName]
*/
Dictionary<string, Dictionary<string, List<string>>> allCollectionFieldTypes = new Dictionary<string, Dictionary<string, List<string>>>();
foreach (var collInfo in collections)
{
var collName = collInfo["name"].AsString;
var coll = db.GetCollection<BsonDocument>(collName);
Console.WriteLine("Finding field information for " + collName);
var pipeline = PipelineDefinition<BsonDocument, BsonDocument>.Create(project, unwind, group);
var cursor = coll.Aggregate(pipeline);
var lst = cursor.ToList();
allCollectionFieldTypes.Add(collName, new Dictionary<string, List<string>>());
foreach (var item in lst)
{
var innerDict = allCollectionFieldTypes[collName];
var fieldName = item["_id"]["fieldName"].AsString;
var fieldType = item["_id"]["fieldType"].AsString;
if (!innerDict.ContainsKey(fieldName))
{
innerDict.Add(fieldName, new List<string>());
}
innerDict[fieldName].Add(fieldType);
}
}
Now you can iterate over your result set:
foreach(var collKvp in allCollectionFieldTypes)
{
foreach(var fieldKvp in collKvp.Value)
{
foreach(var fieldType in fieldKvp.Value)
{
Console.WriteLine($"Collection {collKvp.Key} has field name {fieldKvp.Key} with type {fieldType}");
}
}
}
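The question's edit also asks for the number of documents in each collection. A count per collection could be added inside the earlier foreach loop, for example (CountDocuments exists in driver 2.7+; older versions use Count):
var docCount = coll.CountDocuments(FilterDefinition<BsonDocument>.Empty);
Console.WriteLine($"Collection {collName} has {docCount} documents");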