Solr Dynamic filter - solr4

I have electronic documents with associated metadata that I indexed into Solr. I also have a web application that allows users to log in and perform searches. I would now like to apply dynamic access rights to the documents. Let me explain. For us, a document has:
one type (Contract, CV, birth certificate, ...), out of about 250 unique types
one person concerned, out of about 10,000 unique persons
one effective date
one content: the electronic document itself
Some users should (or shouldn't) have access to some documents depending on who they are in our organization. For example, user 'x' can see the CV of user 'y' from date #1 to date #2. There are thousands of combinations, and in fact it is more complex than just these three parameters. So I developed an application based on a rules engine that computes the access rights given a user and a document. The rules may change quite often, and the facts change constantly.
At the moment, I filter the results that Solr returns in my client web application, and that works. However, by filtering after searching I lose many features provided by Solr (facets, paging, ...). I am looking for a way to call my rules engine (a web service) to filter results before the other Solr components run, especially faceting.
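One possible direction, sketched here only as an illustration: Solr applies filter queries (the fq parameter) before it computes facets and paging, so if the rules engine can express a user's rights as a list of grants, those grants can be turned into an fq string. The Grant structure and the field names doc_type, person_id and effective_date below are hypothetical, values are assumed to contain no characters that need escaping, and with thousands of grants per user the resulting query may become too large to be practical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Grant:
    """One access grant as a (hypothetical) rules engine might return it."""
    doc_type: str
    person_id: str
    start: date
    end: date


def grant_to_clause(grant: Grant) -> str:
    # Solr range syntax: [start TO end], both bounds inclusive.
    date_range = (
        f"[{grant.start.isoformat()}T00:00:00Z TO "
        f"{grant.end.isoformat()}T23:59:59Z]"
    )
    return (
        f'(doc_type:"{grant.doc_type}" AND person_id:"{grant.person_id}" '
        f"AND effective_date:{date_range})"
    )


def build_filter_query(grants: List[Grant]) -> str:
    """OR all grants together into a single fq value."""
    if not grants:
        # An empty fq would mean "no filter", so refuse to build one.
        raise ValueError("no grants: skip the search instead of querying unfiltered")
    return " OR ".join(grant_to_clause(g) for g in grants)
```

The returned string would be sent as the fq parameter alongside the user's q, so that facet counts and paging are computed only over documents the user may see.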

Related

Which architecture/mindset is suitable for building a web app where users can purchase features, like Azure does?

We recently got a new business rule that will require our users to pay for individual modules in our web application.
So the features we build into the application will not apply to all users. Users can choose to add the features they want.
I've tried researching an architecture/mindset for approaching this kind of development.
If I could get an idea of how to get started with this, I would very much appreciate it.
I work with .NET web applications, and Microsoft SQL Server.
Thanks.
First, list the "objects" or things you need to keep track of.
Users:
    userid
    fullname
    can manage his features? (You said not all users can.)
    ...
Features:
    featureid
    description
    cost
    ...
UserHasFeature:
    a link between a user and a feature
    each line is userid, featureid
Using this you can query which user has what feature. Or list the users that have access to a particular feature.
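The three tables and the two queries just mentioned can be sketched as follows. This is only an illustration: it uses Python with an in-memory SQLite database so that it is self-contained (the question itself targets SQL Server), and the table names, column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    userid   INTEGER PRIMARY KEY,
    fullname TEXT NOT NULL
);
CREATE TABLE features (
    featureid   INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    cost        REAL NOT NULL
);
-- Link table: one row per (user, feature) pair.
CREATE TABLE user_has_feature (
    userid    INTEGER NOT NULL REFERENCES users(userid),
    featureid INTEGER NOT NULL REFERENCES features(featureid),
    PRIMARY KEY (userid, featureid)
);
INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO features VALUES (10, 'Reporting', 5.0), (20, 'Export', 3.0);
INSERT INTO user_has_feature VALUES (1, 10), (1, 20), (2, 10);
""")


def features_of(userid):
    """Descriptions of the features a given user has."""
    rows = conn.execute(
        "SELECT f.description FROM features f "
        "JOIN user_has_feature uf ON uf.featureid = f.featureid "
        "WHERE uf.userid = ? ORDER BY f.description",
        (userid,),
    )
    return [row[0] for row in rows]


def users_with(featureid):
    """Names of the users who have access to a given feature."""
    rows = conn.execute(
        "SELECT u.fullname FROM users u "
        "JOIN user_has_feature uf ON uf.userid = u.userid "
        "WHERE uf.featureid = ? ORDER BY u.fullname",
        (featureid,),
    )
    return [row[0] for row in rows]


def total_cost(userid):
    """Sum of the costs of all features a user has (0 if none)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(f.cost), 0) FROM features f "
        "JOIN user_has_feature uf ON uf.featureid = f.featureid "
        "WHERE uf.userid = ?",
        (userid,),
    ).fetchone()
    return row[0]
```

The composite primary key on the link table prevents the same feature from being attached to the same user twice.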
In your web app, you will need administrator functions:
    user management: add, remove, modify, list
    feature management: add, remove, modify, list
    link management: add, remove, list
    reports: whatever reports you want to have
And user functions:
    user: sign in, modify, reset password, view all features, view features the user already has, add a new feature, remove a feature
    reports: total cost of the features the user is using, others
Now, this is a very quick first draft. There are a lot of missing requirements:
    approval workflow: can a user modify his features without the approval of X?
    payment methods
    project number for internal billing
    cost structure: monthly, one time, ...?
    can managers view the features of the employees they manage?
    ...
Things to remember:
    Start with the objects in your project. These become tables.
    Characteristics of the objects become fields in your tables.
    If the same characteristic appears in many object tables, with the same values, consider creating a new table for it. For example, in an address, you would not leave the country value as a simple VARCHAR field. You would link to another table holding the country values.
    List the relations. These become foreign keys or link tables.
    Split your objects: apply at least 1NF, 2NF and 3NF (NF = Normal Form). That is enough for most applications.
    Each table and each link requires administrator pages (CRUD).
    Users have a limited view, related to their own features only.
This is a huge subject, I could go on and on, but this could get you started.
Have fun!

Neo4J Multi Tenancy and Role Based Access to Nodes

I am trying to define a user management and permissions model for Neo4j. I have a web application (Angular 2) that connects to Neo4j via an API (KOANEO4J). Neo4j is the only database or persistent storage that the application uses. Through the application a user can add/edit/delete content which uses the API to carry out these instructions in Neo4j by running Cypher Statements. Up to now I have not worried about supporting multiple users but as a next step I am starting to think about this.
The product will be used by multiple different companies and each company will have multiple users so I need some way to support this. The model I am considering in Neo4J is as follows:
An "Organization" is represented by a node and it can have one or more "Organization Catalogs". All of the nodes belonging to a catalog will be children of one of the "Organization Catalogs".
Each user will also be represented by a node in the database. They will belong to an Organization. They will have certain access permissions on an Organization Catalog, identified by an edge.
I am looking for some advice on whether or not this is an appropriate model to follow or if there are any examples or documents that describe how to achieve this in Neo4j.
If I do implement this model, would it be better to model the permissions as separate nodes, so that a user is connected to a permission node (e.g. Read Only Access) that is then connected to the Organization Catalog?
Any suggestions on how I would actually get the API to work with this type of model? I'm sure I can pass the User Id to Neo4j as part of each query and then filter the results to show only nodes the user has access to, but this doesn't seem like a very elegant solution. It also means that all of the security would depend on carefully written Cypher queries that don't leak data that a user isn't supposed to access.
Thanks a lot
"I am looking for some advice on whether or not this is an appropriate model to follow or if there are any examples or documents that describe how to achieve this in Neo4j."
The answer to this question is: it depends. Remember that when modelling a graph database you should consider the queries that will be run against it. If this model fits those queries, then it is appropriate; otherwise it is not. Take a look at Chapter 5 (Graphs in the Real World) of the book Graph Databases (by Ian Robinson, Jim Webber and Emil Eifrem; available for download). This chapter walks through the modelling process of an authorization and access control system in Neo4j. It may be enlightening and helpful to you.
"If I do implement this model then would it be better to model the permissions as separate nodes so a user is connected to a permission node (e.g. Read Only Access) that is then connected to the Organization Catalog."
Again, it depends. Do it if the Permission entity has connections to other entities of your application besides a User and an Organization Catalog. Otherwise, I believe your permission can be modeled as a relationship between a user and an organization catalog.
"Any suggestions on how I would actually get the API to work with this type of model. I'm sure I can pass the User Id to Neo4j as part of each query and then filter the results to show only nodes the user has access to but this doesn't seem like a very elegant solution - it also means that all of the security would be dependent on carefully written Cypher queries that don't leak data that a user isn't supposed to access."
It may be a good idea to add another layer of software between your Angular client app and the Neo4j database. In this new layer (a Node.js application, for example) you can implement an access control system that verifies whether the authenticated user can access the resource being requested.
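To make the idea of an intermediate layer concrete, here is a minimal sketch. It is written in Python rather than Node.js purely for illustration, and everything in it (the in-memory PERMISSIONS table, the access levels, the handle_request function) is hypothetical. In a real system the permission lookup would itself read the user-to-catalog relationship from the graph.

```python
from typing import Callable, Dict, Tuple

# Hypothetical permission store: (user_id, catalog_id) -> access level.
# In the model discussed above this would come from the relationship
# between a user node and an Organization Catalog node.
PERMISSIONS: Dict[Tuple[str, str], str] = {
    ("alice", "catalog-1"): "read_write",
    ("bob", "catalog-1"): "read_only",
}

# Which actions each access level allows.
ALLOWED_ACTIONS = {
    "read_only": {"read"},
    "read_write": {"read", "write"},
}


class AccessDenied(Exception):
    """Raised when a user may not perform an action on a catalog."""


def authorize(user_id: str, catalog_id: str, action: str) -> None:
    level = PERMISSIONS.get((user_id, catalog_id))
    if level is None or action not in ALLOWED_ACTIONS[level]:
        raise AccessDenied(f"{user_id} may not {action} {catalog_id}")


def handle_request(user_id: str, catalog_id: str, action: str,
                   run_query: Callable[[str], object]) -> object:
    """Check permissions first; only then let the query reach the database."""
    authorize(user_id, catalog_id, action)
    return run_query(catalog_id)
```

Because every request passes through handle_request, the individual Cypher queries no longer carry the whole burden of keeping data from leaking.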

How to hide value from Firebase in multi part request iOS

I'm using Firebase in my iOS app but I want to ensure a value is never sent from the server to the client.
Users in the app are shown to each other based on a score they have. So a user with a score of 5 will see other users who have a score of 5. I don't want to include this value in the request/response to Firebase.
Where I can manage the server I can have server side logic handle this by looking up the user on the server then calling a function that determines who has the same score and returning the relevant users without the client ever receiving the user score.
With Firebase my understanding is I'd have to send the value to Firebase in a query i.e. get all users with this user's score.
How can I do this without exposing the user's score? I want something along the lines of a node user_scores where I can query the current user's score, and then use that to query another node, users, to return the relevant users, without having to nest the query on the client and thus expose the score in the request/response.
Many thanks!
Your understanding is pretty much on point: there is no way to make a "dynamic" query like this without actually exposing the varying parameter to the client.
Here are two ideas you could try to use as a workaround:
A variation of "security by obscurity": instead of exposing a single number, obfuscate that value in a way that makes guessing its purpose and other values an unpleasant experience; and share that with the client.
If you keep your users grouped by this key, not just as a flat list where this is a child node, you can use security rules to enforce that the user cannot read any other group than theirs.
(Note that this is also true for numerical values. Security rules are not filters.)
In a much more involved strategy, you could make the query static. Store and maintain a list of matching users per user, so that clients can load their own personal list with no varying parameter other than the UID.
(This is probably not really feasible if there is a lot of movement involved. But it might work in some edge cases.)
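The "static query" idea can be sketched as a small job that runs somewhere trusted (not on the client) and precomputes, for every user, the list of users with the same score. This is only an illustration in Python; the function name and the shape of the input are assumptions, and writing the result back to Firebase is left out.

```python
from collections import defaultdict
from typing import Dict, List


def build_match_lists(scores: Dict[str, int]) -> Dict[str, List[str]]:
    """Map each UID to the sorted UIDs of the other users with the same score.

    The output contains no scores, so it can be stored per user and read
    by the client without revealing the value that produced it.
    """
    by_score: Dict[int, List[str]] = defaultdict(list)
    for uid, score in scores.items():
        by_score[score].append(uid)
    return {
        uid: sorted(other for other in by_score[score] if other != uid)
        for uid, score in scores.items()
    }
```

Each client would then read only its own list, keyed by its UID. The lists have to be rebuilt whenever a score changes, which is the maintenance cost mentioned above.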

Deploying Neo4j database

I developed a small Neo4j database with the aim of providing users with path-related information (the shortest path from A to B and the properties of individual sections of the path). My programming skills are very basic, but I want to make the database very user-friendly.
Basically, I would like to have a screen where users can choose start location and end location from dropdown lists, click a button, and the results (shortest path, distance of the path, properties of the path segments) will appear. For example, if this database had been made in MS Access, I would have made a form, where users could choose the locations, then click a control button which would have executed a query and produced results on a nice report.
Please note that all the nodes, relationships and queries are already in place. All I am looking for are some tips regarding the most user-friendly way of making the information accessible to the users.
Currently, all I can do is make the users install Neo4j, run Neo4j every time they need it, open the browser, edit the Cypher script (typing the locations in as strings) and then execute the query. This is rather impractical for users, and I am also worried that some user might corrupt the data.
I'd suggest making a web application using a web framework like Rails, especially if you're new to programming. You can use the neo4j gem for that to connect to your database and create models to access the data in a friendly way:
https://github.com/neo4jrb/neo4j
I'm one of the maintainers of that gem, so feel free to contact us if you have any questions:
neo4jrb@googlegroups.com
http://twitter.com/neo4jrb
Also, you might be interested in looking at my newest project, called meta model:
https://github.com/neo4jrb/meta_model
It's a Rails app that lets you define your database model (or at least part of it) via the web app UI and then browse/edit the objects via the web app. It's still very much preliminary, but I'd like it to be able to do things like what you're talking about (letting users examine data and the relationships between them in a user-friendly way).
In general you would write a tiny (web/desktop/forms) application that contains the form, takes the form values and issues the Cypher requests with the form values as parameters.
The results can then be rendered as a table or chart or whatever.
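As a rough sketch of "issuing the Cypher request with the form values as parameters", the snippet below builds the JSON body that Neo4j's transactional HTTP endpoint accepts. It is an illustration only: the Location label, the name property and the parameter names are hypothetical, the $param syntax applies to newer Neo4j versions (older ones use {param}), and the endpoint URL, which depends on the Neo4j version, is deliberately left out.

```python
import json

# The query text is fixed; users never edit it. Only the two parameters vary.
SHORTEST_PATH_QUERY = (
    "MATCH (a:Location {name: $start_name}), (b:Location {name: $end_name}), "
    "p = shortestPath((a)-[*]-(b)) "
    "RETURN p"
)


def build_request_body(start_name: str, end_name: str) -> str:
    """Return the JSON body for one parameterized Cypher statement."""
    payload = {
        "statements": [
            {
                "statement": SHORTEST_PATH_QUERY,
                "parameters": {"start_name": start_name, "end_name": end_name},
            }
        ]
    }
    return json.dumps(payload)
```

Because the dropdown values travel as parameters rather than being pasted into the query string, a user cannot change what the query does, which addresses part of the worry about corrupting the data.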
You could even run this from Excel or Access with a Macro (using the Neo4j http endpoint).
Depending on your programming skills (which programming language can you write in) it can be anything. There is also a Neo4j .Net client (see http://neo4j.com/developer/dotnet).
And its author, Tatham Oddie, showed a while ago how to do that with Excel

Need advice on MongoDB Schema for Chat App. Embedded vs Related Documents

I'm starting a MongoDB project just for kicks and as a chance to learn MongoDB/NoSQL schemas. It'll be a live chat app and the stack includes: Rails 3, Ruby 1.9.2, Devise, Mongoid/MongoDB, CarrierWave, Redis, JQuery.
I'll be handling the live chat polling/message queueing separately. Not sure how yet, either Node.js, APE or custom EventMachine app. But in regards to Mongo, I'm thinking to use it for everything else in the app, specifically chat logs and historical transcripts.
My question is how best to design the schema, as all my previous experience has been with MySQL and relational DB schemas. And as a sub-question, when is it best to use embedded documents vs related documents?
The app will have:
Multiple accounts which have multiple rooms
Multiple rooms
Multiple users per room
List of rooms a user is allowed to be in
Multiple user chats per room
Searchable chat logs on a per room and per user basis
Optional file attachment for a given chat
Given Mongo (at least last time I checked) has a document limit of 4MB, I don't think having a collection for rooms and storing all room chats as embedded documents would work out so well.
From what I've thought about so far, I'm thinking of doing something like:
A collection for accounts
A collection for rooms
    Each room relates back to an account
    Related documents in chats collections for all chat messages in the room
    Embedded Document listing all users currently in the room
A collection for users
    Embedded Document listing all the rooms the user is currently in
    Embedded Document listing all the rooms the user is allowed to be in
A collection for chats
    Each chat relates back to a room in the rooms collection
    Each chat relates back to a user in the users collection
    Embedded document with info about optional uploaded file attachment.
My main concern is how far do I go until this ends up looking like a relational schema and I defeat the purpose? There is definitely more relating than embedding going on.
Another concern is that, from what I've heard, referencing related documents is much slower than accessing embedded documents.
I want to make generic queries such as:
Give me all rooms for an account
Give me all chats in a room (or filtered via date range)
Give me all chats from a specific user
Give me all uploaded files in a given room or for a given org
etc
Any suggestions on how to structure the schema efficiently in a way that scales? Thanks everyone.
I think you're pretty much on the right track. I'd use a capped collection for chat lines, with each line containing the user ID, room ID, timestamp, and what was said. This data would expire once the capped collection's "end" is reached, so if you needed a historical log you'd want to copy data out of the capped collection into a "log" collection periodically, but capped collections are specifically designed for logging-style applications where you aren't going to be deleting documents, and insertion order matters. In the case of chat, it's a perfect match.
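The behaviour described here (oldest lines expire once the cap is reached, so history must be copied out periodically) can be mimicked in plain Python. This is an analogy only, not MongoDB's API: the CappedChatLog class, its field names and the seq counter are all made up for illustration, with a fixed-size deque standing in for the capped collection.

```python
from collections import deque


class CappedChatLog:
    """Fixed-size buffer of recent chat lines plus a separate archive."""

    def __init__(self, capacity: int):
        # Once full, appending drops the oldest line automatically.
        self.recent = deque(maxlen=capacity)
        self.archive = []
        self._next_seq = 0       # sequence number for the next inserted line
        self._archived_seq = 0   # lines with seq below this are archived

    def insert(self, user_id, room_id, timestamp, text):
        self.recent.append({
            "seq": self._next_seq,
            "user_id": user_id,
            "room_id": room_id,
            "timestamp": timestamp,
            "text": text,
        })
        self._next_seq += 1

    def archive_new(self) -> int:
        """Copy lines not yet archived into the archive; return how many."""
        fresh = [line for line in self.recent
                 if line["seq"] >= self._archived_seq]
        self.archive.extend(fresh)
        self._archived_seq = self._next_seq
        return len(fresh)
```

If archive_new runs less often than the buffer turns over, lines are lost before they are copied, which is why the copy job has to run often enough relative to the cap.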
The only other change I'd suggest would be to maintain uploads in a separate collection, as well.
I am a big fan of MongoDB as a document database as well. But are you sure you are using MongoDB for the right reason? What is MongoDB powerful at?
It's a subjective question, but for me in-place (atomic) updates on documents are what make MongoDB powerful, and I can't really see you using them that much. On top of that, you are hitting the document size limit problem as well. (From experience I can tell you that embedding files in MongoDB is not a good idea.) You also want to run a live chat application on top of the database.
Your document schema seems logical. But I wouldn't go with MongoDB for this kind of project, where your application depends heavily on inserts. I would go for CouchDB.
With CouchDB you wouldn't have to worry about the attachment problem; you can embed attachments easily. The "_changes" feed would make your life much easier, whether you are building a live chat application, long polling, or feeding a search engine (if you want to implement one).
I also saw an open source showcase project at CouchOne that has some similarities with your goals: Anologue. You should check it out.
PS: Sorry, this was a little off topic, but I couldn't help myself.

Resources