I am currently trying to figure out if CouchDB is suitable for my use-case and if so, how. I have a situation similar to the following:
First set of documents (let's call them companies):
{
"_id" : 1,
"name" : "Foo"
}
{
"_id" : 2,
"name" : "Bar"
}
{
"_id" : 3,
"name" : "Baz"
}
Second set of documents (let's call them projects):
{
"_id" : 4,
"name" : "FooProject1",
"company" : 1
}
{
"_id" : 5,
"name" : "FooProject2",
"company" : 1
}
...
{
"_id" : 100,
"name" : "BazProject2",
"company" : 3
}
Third set of documents (let's call them incidents):
{
"_id" : "300",
"project" : 4,
"description" : "...",
"cost" : 200
}
{
"_id" : "301",
"project" : 4,
"description" : "...",
"cost" : 400
}
{
"_id" : "302",
"project" : 4,
"description" : "...",
"cost" : 500
}
...
So in short, every company has multiple projects, and every project can have multiple incidents. One reason I model the data this way is that I come mainly from a SQL background, so the modelling may be completely unsuitable. The second reason is that I would like to add new incidents very easily by just using the REST API provided by CouchDB. So the incidents have to be single documents.
However, I would now like to get a view that allows me to calculate the total cost for each company. I can easily define a view using map-reduce and linked documents which gets me the total amount per project. However, once I am at the project level I cannot get any further, to the level of the company.
Is this possible at all using CouchDB? This kind of summarising of data sounds like a perfect use case for map-reduce. In SQL I would just do a three-table join, but it seems like in CouchDB the best I can get is two-table joins.
As mentioned, you cannot do joins in CouchDB, but this isn't so much a limitation as an invitation to think about your problem and approach it differently. A common way to do this in CouchDB is to denormalize: give each incident a small data structure, called for example IncidentReference, composed of:
The project id
And the company id
That way your data would look like :
{
"_id" : "301",
"project" : 4,
"description" : "...",
"cost" : 400,
"reference" : {
"projectId" : 4,
"companyId" : 1
}
}
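This is just fine: duplicating the project and company ids into each incident is a normal trade-off in CouchDB. Once you have that, a map function can emit the company id as the key and the cost as the value, and a grouped reduce (for example the built-in _sum with group=true) gives you the total per company. Generally speaking, you need to think about the way you are going to query your data.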
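To make the idea concrete, here is a minimal sketch in plain Python that mimics what such a view would compute. It does not talk to CouchDB; the function names and the incident with _id "303" are made up for illustration, and a real view would be written as a JavaScript map function plus a reduce.

```python
from collections import defaultdict

# Incident documents, denormalized so each one carries its company id.
# The document with _id "303" is hypothetical sample data.
incidents = [
    {"_id": "300", "cost": 200, "reference": {"projectId": 4, "companyId": 1}},
    {"_id": "301", "cost": 400, "reference": {"projectId": 4, "companyId": 1}},
    {"_id": "302", "cost": 500, "reference": {"projectId": 4, "companyId": 1}},
    {"_id": "303", "cost": 50, "reference": {"projectId": 100, "companyId": 3}},
]


def map_incident(doc):
    """Map step: emit (companyId, cost) for every incident document."""
    ref = doc.get("reference")
    if ref is not None and "cost" in doc:
        yield ref["companyId"], doc["cost"]


def total_cost_per_company(docs):
    """Reduce step: sum the emitted values per key, like a grouped sum."""
    totals = defaultdict(int)
    for doc in docs:
        for key, value in map_incident(doc):
            totals[key] += value
    return dict(totals)


print(total_cost_per_company(incidents))
```

Because the company id is already in every incident, no lookup through the project documents is needed at query time.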
I have elasticsearch set up for searching across a products catalog's variants. Basically where:
Product has_many variants
Variant belongs_to product
And the variant index json / mapping contains the product name.
I am trying to search variants, grouped by product id, bucket size of 1. I am able to do it and sort by min price, max price, etc.
This works:
POST /variants/_search?size=0
{
"aggs" : {
"min_price" : { "min" : { "field" : "price" } }
}
}
This is (sort of) what I need next:
POST /variants/_search?size=0
{
"aggs" : {
"product_name" : { "sort by product_name asc / desc" }
}
}
My last task is sorting them alphabetically, but I don't seem to be able to sort by a keyword field (asc/desc) using an aggregation.
In ES 6.0 you can do this with a terms aggregation ordered by _key. Note that size limits how many buckets are returned, and the more you request, the more expensive the query will be to execute. So if you really need many thousands, you will probably want to try a different approach, probably something where you create a separate rolled-up index for products that you can search and sort instead of trying to do it through aggregations.
GET /variants/_search
{
"size": 0,
"aggs" : {
"product_name" : {
"terms" : {
"field" : "product_name",
"size": 1000,
"order" : { "_key" : "asc" }
}
}
}
}
Reference:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-order
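If you need both directions, the request body can be generated. The following is only a sketch in Python that builds the same JSON as above; the helper name product_name_terms_agg is made up, and sending the body to Elasticsearch is left out.

```python
import json


def product_name_terms_agg(direction="asc", size=1000):
    """Build a search body that buckets by product_name, ordered by bucket key."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return {
        "size": 0,  # no hits, only aggregation results
        "aggs": {
            "product_name": {
                "terms": {
                    "field": "product_name",
                    "size": size,  # maximum number of buckets returned
                    "order": {"_key": direction},
                }
            }
        },
    }


print(json.dumps(product_name_terms_agg("desc"), indent=2))
```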
I would like to change my database from SQLite to MongoDB, since MongoDB is schemaless. In the SQL database I had to create multiple rows for each attribute of the same SKU (a product). I had to create n columns, since each attribute has different specifications. Now in MongoDB I am planning to create only one document (row) per SKU, with the same id. To achieve this I would like to create a field (column) for each specification such as html, pdf, description, etc. The last field is for the attributes, which have different values. I would like to store them as a hash (key-value pairs). Does it make sense to store all the attributes in a single cell? Am I going in the right direction? Please suggest.
EDIT:
I want something like this.
My question is: in SQL I was creating columns for each attribute, like attribute 1 name, value and attribute 2 name, value. This extends the row size. Now I want to store all the attributes in hash format (as shown in the image), since MongoDB is schemaless. Is it possible? And does it make sense? Is there any better option?
Ultimately, how you store the data should be influenced by how you intend to access or update it, but one approach would be to create an embedded attributes object within each sku/product, holding all attributes for that sku/product:
Based on your example:
{
"sku_id" : 14628109,
"product_name" : "TE Connectivity",
"attributes" : {
"Width" : [ "4", "mm" ],
"Height" : [ "56", "cm" ],
"Strain_Relief_Body_Orientation" : "straight",
"Minimum_Operating_Temperature" : [ "40" , "C" ]
}
},
{
"sku_id" : 14628110,
"product_name" : "Tuning Holder",
"attributes" : {
"Width" : [ "7", "mm" ],
"diameter" : [ "78", "cm" ],
"Strain_Relief_Body_Orientation" : "straight",
"Minimum_Operating_Temperature" : [ "4" , "C" ]
}
},
{
"sku_id" : 14628111,
"product_name" : "Facing Holder",
"attributes" : {
"size" : [ "56", "nos" ],
"Height" : [ "89", "cm" ],
"Strain_Relief_Body_Orientation" : "straight",
"Minimum_Operating_Temperature" : [ "56" , "C" ]
}
}
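As a rough sketch of how the old one-row-per-attribute data could be folded into such a document, here is a small Python helper. The function name rows_to_document and the (name, value, unit) row layout are assumptions about your SQLite table, not something taken from your schema.

```python
def rows_to_document(sku_id, product_name, rows):
    """Fold (name, value, unit) attribute rows into one embedded attributes object."""
    attributes = {}
    for name, value, unit in rows:
        # Keep [value, unit] when there is a unit, otherwise the bare value.
        attributes[name] = [value, unit] if unit else value
    return {"sku_id": sku_id, "product_name": product_name, "attributes": attributes}


# Hypothetical rows as they might come out of the SQL table.
rows = [
    ("Height", "56", "cm"),
    ("Strain_Relief_Body_Orientation", "straight", None),
    ("Minimum_Operating_Temperature", "40", "C"),
]

doc = rows_to_document(14628109, "TE Connectivity", rows)
print(doc)
```

Once stored this way, MongoDB lets you address an embedded field with dot notation, for example attributes.Height.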
I've installed SwiftMongoDB using CocoaPods and added 2 documents to the collection. When I try to retrieve them using the .find() method, it only returns one document.
func all() -> [MongoDocument] {
    let UsersCollection = MongoCollection(name: "users")
    mongodb?.mongodb.registerCollection(UsersCollection)
    for (index, value) in UsersCollection.find().successValue!.enumerate() {
        debugPrint(value)
    }
    // UsersCollection.find().successValue!.count
    // returns 1.
    return UsersCollection.find().successValue!
}
My collection looks like:
{ "_id" : ObjectId("56bb29ca42b9b41900000000"), "address" : "US", "given" : "User", "birthDate" : "1985-08-01", "family" : "UserFam", "identifier" : "E3826", "date" : "10.2.2016 at 14:15:6" }
{ "_id" : ObjectId("56bb29ca42b9b41900000000"), "address" : "US", "given" : "User2", "birthDate" : "1985-08-01", "family" : "UserFam2", "identifier" : "E3826", "date" : "10.2.2016 at 14:15:6" }
Is there another way of getting all the documents? Am I doing something wrong?
I have never used SwiftMongoDB, but I have used Swift for iOS development and MongoDB with Java. First of all, here are your first and second objects' ids together:
1st: 56bb29ca42b9b41900000000
2nd: 56bb29ca42b9b41900000000
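As you can see, they are the same. In MongoDB the _id field is the primary key and must be unique within a collection, so I strongly believe your issue arises from that. How are the two documents being inserted, and where does the _id come from?
This looks like a bug. Maybe the package is still in its early versions.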
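A quick way to confirm this kind of problem is to check a dump of the documents for repeated ids. This is a small Python sketch over plain dictionaries; the helper name duplicate_ids is made up, and the ids are written as strings instead of ObjectId values to keep it self-contained.

```python
from collections import Counter


def duplicate_ids(docs):
    """Return the sorted list of _id values that occur more than once."""
    counts = Counter(doc["_id"] for doc in docs)
    return sorted(_id for _id, n in counts.items() if n > 1)


# The two documents from the question, reduced to the relevant fields.
users = [
    {"_id": "56bb29ca42b9b41900000000", "given": "User"},
    {"_id": "56bb29ca42b9b41900000000", "given": "User2"},
]

print(duplicate_ids(users))
```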
When using the iOS API, I'm making a call to connect entities via the ApigeeDataClient connectEntities method. I pass in the type "users", then the user's uuid, then the connectionType "likes", with the connectee type of "songs" and the song's uuid.
Example:
ApigeeClientResponse *response = [_dataClient connectEntities:@"users" connectorID:_apigeeUser.uuid connectionType:@"likes" connecteeType:@"songs" connecteeID:song.uuid];
When I make the connection, it says successful, but when I look at the data on the server, it seems to save the connection incorrectly. For example, for the song, I see:
connecting :likes :/songs/b523a6aa-bb39-11e4-a2bb-35673af856e9/connecting/likes
It looks like the song's uuid isn't in the connecting path.
The same is true for the connection related to the user. It's the user's uuid that seems to be connected to the same user. The uuid is that of the song's uuid, not the user's. When I make the call to getEntityConnections, like so:
ApigeeClientResponse *response = [_dataClient getEntityConnections:@"songs" connectorID:_apigeeUser.uuid connectionType:@"likes" query:nil];
It returns an error saying that it expected a song but got the user's uuid:
"Entity c831e1c4-2e6e-11e4-94ce-299efa8c6fd5 is not the expected type, expected song, found user"
In looking in Apigee itself, in the data section, I see the following snippet:
"connections": {
"likes": "/users/c831e1c4-2e6e-11e4-94ce-299efa8c6fd5/likes"
}
The song's uuid is missing. Even when I try to update the JSON directly on the server, basically adding the song's uuid to the end, it says it's saved, but it removes the song's uuid.
Even just using the curl method to make a connection doesn't work. For example:
curl -X POST http://api.usergrid.com/peterdj/sandbox/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/likes/c37f1eaa-bfa3-11e4-9141-97b3510c98e6
When I make that call, I get this
{"action":"post",
"application":"0baaf590-2c1b-11e4-9bb5-11cb139f1620",
"params":{
},
"path":"/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/likes",
"uri":"https://api.usergrid.com/peterdj/sandbox/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/likes",
"entities":[
{
"uuid":"c37f1eaa-bfa3-11e4-9141-97b3510c98e6",
"type":"song",
"name":"WingSpan",
"created":1425167080842,
"modified":1425167080842,
"bpm":"124",
"code":"WingSpan",
"genre":"Progressive House",
"metadata":{
"connecting":{
"likes":"/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/likes/c37f1eaa-bfa3-11e4-9141-97b3510c98e6/connecting/likes"
},
"path":"/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/likes/c37f1eaa-bfa3-11e4-9141-97b3510c98e6"
},
"title":"Wing Span"
}
],
"timestamp":1425246006718,
"duration":78,
"organization":"peterdj",
"applicationName":"sandbox"
}
Notice that the resulting connecting path seems correct when it's returned, but when I do another GET with curl, like so:
curl http://api.usergrid.com/peterdj/sandbox/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d
The song's uuid isn't there:
{
"action" : "get",
"application" : "0baaf590-2c1b-11e4-9bb5-11cb139f1620",
"params" : { },
"path" : "/users",
"uri" : "https://api.usergrid.com/peterdj/sandbox/users",
"entities" : [ {
"uuid" : "bc2fc82a-bfa3-11e4-a994-b19963f1779d",
"type" : "user",
"name" : "peter",
"created" : 1425167068578,
"modified" : 1425167495412,
"username" : "peterdj",
"email" : "asdf@adf.com",
"activated" : true,
"picture" :"",
"metadata" : {
"path" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d",
"sets" : {
"rolenames" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/roles",
"permissions" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/permissions"
},
"connections" : {
"likes" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/likes"
},
"collections" : {
"activities" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/activities",
"devices" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/devices",
"feed" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/feed",
"groups" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/groups",
"roles" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/roles",
"following" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/following",
"followers" : "/users/bc2fc82a-bfa3-11e4-a994-b19963f1779d/followers"
}
}
} ],
"timestamp" : 1425311662762,
"duration" : 12,
"organization" : "peterdj",
"applicationName" : "sandbox"
}
Is this a bug with the entity connections with Apigee/Usergrid or am I doing something wrong?
Thanks
Well, it turns out, thanks to the comments by @remus, that I've figured it out.
In this call:
ApigeeClientResponse *response = [_dataClient getEntityConnections:@"songs" connectorID:_apigeeUser.uuid connectionType:@"likes" query:nil];
The first argument (the connector type) needs to be "users", not "songs", because the connectorID being passed is the user's uuid. Works now. Thanks @remus
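To show why the types have to line up, here is a tiny Python sketch that builds the same kind of path seen in the responses above (/users/<user uuid>/likes). The helper connection_path is hypothetical and not part of the Apigee SDK; it only mirrors the path layout visible in the JSON.

```python
def connection_path(connector_type, connector_id, connection_type, connectee_id=None):
    """Build a connection path; the connector type must match the connector id."""
    path = "/{}/{}/{}".format(connector_type, connector_id, connection_type)
    if connectee_id is not None:
        path += "/{}".format(connectee_id)
    return path


user_uuid = "bc2fc82a-bfa3-11e4-a994-b19963f1779d"
song_uuid = "c37f1eaa-bfa3-11e4-9141-97b3510c98e6"

# Listing what the user likes: the connector is the user, so the type is "users".
print(connection_path("users", user_uuid, "likes"))
# Path of one specific connection, as used in the curl POST above.
print(connection_path("users", user_uuid, "likes", song_uuid))
```

Passing "songs" together with the user's uuid would produce /songs/<user uuid>/likes, which is why the server complained about the entity type.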
I read the REST API documentation at http://docs.neo4j.org/chunked/snapshot/rest-api-traverse.html
and checked my code. I can find the shortest n paths with traversals, and I can find nodes or relationships with an index. But my project has 300M nodes, and when I find the shortest n paths by traversal with a condition such as the name property containing 'hi', Neo4j's filter method is really slow. I want to use the index (I have created it) instead. My current code looks like:
{
"order" : "breadth_first",
"return_filter" : {
"body" : "position.endNode().getProperty('name').toLowerCase().contains('t')",
"language" : "javascript"
},
"prune_evaluator" : {
"body" : "position.length() > 10",
"language" : "javascript"
},
"uniqueness" : "node_global",
"relationships" : [ {
"direction" : "all",
"type" : "knows"
}, {
"direction" : "all",
"type" : "loves"
} ],
"max_depth" : 3
}
What I want is something like:
{
"order" : "breadth_first",
"return_filter" : {
"body" : "position.endNode().name:*hi*",
"language" : "javascript"
},
"prune_evaluator" : {
"body" : "position.length() > 10",
"language" : "javascript"
},
"uniqueness" : "node_global",
"relationships" : [ {
"direction" : "all",
"type" : "knows"
}, {
"direction" : "all",
"type" : "loves"
} ],
"max_depth" : 3
}
Can someone help me?
Is it slow the first time or are consecutive requests equally slow? Properties are loaded per node/relationship the first time any property is requested for that node/relationship and maybe you're seeing that performance hit.
I think that using an index would help for nodes that haven't been loaded yet, but not otherwise. Doing this over REST could be tricky, in that you'd have to do the index lookup beforehand and pass that list into the evaluator, and that doesn't scale. Could you write a server extension for this instead?
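To illustrate the "index lookup first, then traverse" idea, here is a plain in-memory sketch in Python. It is not Neo4j code: the graph is an adjacency dictionary, and matching_ids stands in for whatever the index query would return; all names are made up.

```python
from collections import deque


def traverse(graph, start, matching_ids, max_depth=3):
    """Breadth-first traversal that returns visited nodes found in matching_ids."""
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        node, depth = queue.popleft()
        # Set membership replaces the per-node property check of a return filter.
        if node in matching_ids:
            results.append(node)
        if depth == max_depth:
            continue  # do not expand beyond max_depth
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:  # each node is visited once (node_global)
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return results


# Hypothetical graph and index result.
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"], "e": ["f"], "f": []}
matching_ids = {"d", "f"}  # ids the index lookup would have returned

print(traverse(graph, "a", matching_ids, max_depth=3))
```

The membership test is cheap, but the id set has to be shipped with every request, which is the scaling problem mentioned above.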