How to limit parsing depth using Tinkerpop Frames - neo4j

Hi I have an interface and a corresponding implementation class like:
public interface IActor extends VertexFrame {
    @Property(ActorProps.nodeClass)
    public String getNodeClass();
    @Property(ActorProps.nodeClass)
    public void setNodeClass(String str);
    @Property(ActorProps.id)
    public String getId();
    @Property(ActorProps.id)
    public void setId(String id);
    @Property(ActorProps.name)
    public String getName();
    @Property(ActorProps.name)
    public void setText(String text);
    @Property(ActorProps.uuid)
    public String getUuid();
    @Property(ActorProps.uuid)
    public void setUuid(String uuid);
    @Adjacency(label = RelClasses.CoActors, direction = Direction.OUT)
    public Iterable<IActor> getCoactors();
}
I use it with OrientDB, with code that looks something like the following. I had a similar implementation with Neo4j as well:
Graph graph = new OrientGraph("remote:localhost/actordb");
FramedGraph<Graph> manager = new FramedGraphFactory().create(graph);
IActor actor = manager.frame(((OrientGraph)graph).getVertexByKey("Actor.uuid",uuid), IActor.class);
The above works, but because there is a relationship between two vertices of class Actor, there could potentially be a loop in the graph. Is there a way, either by annotation or by some other means (e.g. through the manager), to stop after x steps for a specific @Adjacency so that traversal won't go on forever? If the @GremlinGroovy annotation (https://github.com/tinkerpop/frames/wiki/Gremlin-Groovy) is the answer, could you please give an example?

I'm not sure I understand the question/problem. (You say "potentially", but haven't actually proven that there's a problem!)
Is the problem that there is a loop in the Vertex/Frames, and (you think) loading the object will result in an infinite loop?
Have you been able to prove that there is a problem loading a Vertex/Frame with a loop? (show me the code/problem)
As I understand it, the Pipelines will lazy-load objects (only load them when required). The frames (I imagine) only load adjacent frames when requested. Basically, as far as I can tell, there's no problem.
Example (Groovy)
// create some framed vertices
Person nick = createPerson(name: 'Nick')
Person michail = createPerson(name: 'Michail')
// create a recursive loop
nick.addKnows(michail)
michail.addKnows(nick)
// handles recursion = true!
Person nick2 = framedGraph.getVertex(nick.asVertex().id, Person)
assert nick2.knows.knows.knows.knows.knows.name == 'Michail'
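If an explicit cap on the number of hops is still wanted, the idea can be sketched in plain Java. This is only an illustration of "stop after x steps": the Actor class and the walk method below are hypothetical stand-ins, not part of the TinkerPop Frames API, and no Frames annotation is assumed to provide this behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a framed vertex; not a TinkerPop Frames type.
final class Actor {
    final String name;
    final List<Actor> coactors = new ArrayList<>();

    Actor(String name) {
        this.name = name;
    }
}

final class DepthLimitedWalk {
    // Returns the names visited when following coactor links
    // at most maxDepth hops away from the start vertex.
    static List<String> walk(Actor start, int maxDepth) {
        List<String> visited = new ArrayList<>();
        walk(start, maxDepth, visited);
        return visited;
    }

    private static void walk(Actor current, int remaining, List<String> visited) {
        visited.add(current.name);
        if (remaining == 0) {
            return; // hop budget used up: stop even if the graph loops
        }
        for (Actor next : current.coactors) {
            walk(next, remaining - 1, visited);
        }
    }

    public static void main(String[] args) {
        Actor nick = new Actor("Nick");
        Actor michail = new Actor("Michail");
        // Create the same two-vertex loop as in the Groovy example.
        nick.coactors.add(michail);
        michail.coactors.add(nick);
        System.out.println(walk(nick, 3));
    }
}
```

Because the budget is decremented on every hop, the traversal terminates on a cyclic graph without needing a visited set; a visited set would be the alternative if each vertex should appear only once.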

Related

Trying to add a listener to a model (backed by a TDB2 dataset)

After a little research, org.apache.jena.sparql.core.DatasetGraphMonitor looked the way to go.
To my understanding, I have to create a DatasetGraph wrapped by the DatasetGraphMonitor and use this graph to create a Model; all modifications to the model are then reported to my DatasetChanges object.
So that's what I'm doing:
//create a Dataset backed by TDB2
Dataset dataset = TDB2Factory.connectDataset(location);
//wrap the dataset with a DatasetGraphMonitor and obtain a DatasetGraph
DatasetGraph datasetGraph = new DatasetGraphMonitor(dataset.asDatasetGraph(), new DatasetChanges() {
    @Override
    public void start() {
    }
    @Override
    public void reset() {
    }
    @Override
    public void finish() {
    }
    @Override
    public void change(QuadAction qaction, Node g, Node s, Node p, Node o) {
        LOG.info("Dataset change: "+qaction);
    }
});
//create a model using the DatasetGraphMonitor as underlying graph
Model model = ModelFactory.createModelForGraph(datasetGraph.getDefaultGraph());
//run an insert sparql query to add new triples to the triplestore (this really is in a write transaction, maybe I'm oversimplifying here)
UpdateAction.parseExecute(sparqlQuery, model);
Well, you guessed it already: change never gets called.
Any idea about what I'm doing wrong here? Thanks.
DatasetGraphMonitor is for monitoring actions on the dataset. Getting the default graph and making it a model doesn't trigger that machinery. (If it did, you'd get a "not in transaction" exception.) The returned graph goes straight to the core database.
Instead, either:
Wrap the graph from datasetGraph.getDefaultGraph() with GraphWrapper and put the monitoring code on the various add/delete methods, or
do the update (in a transaction) on the datasetGraph.
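The reason the listener stays silent can be pictured with a small plain-Java decorator. This is a sketch of the general wrapper pattern only: TripleStore, CoreStore, and MonitoredStore are hypothetical names, not Jena classes, and the real DatasetGraphMonitor may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal store interface; not a Jena type.
interface TripleStore {
    void add(String triple);
}

// The "core database": stores triples and knows nothing about listeners.
final class CoreStore implements TripleStore {
    final List<String> triples = new ArrayList<>();

    @Override
    public void add(String triple) {
        triples.add(triple);
    }
}

// Decorator that records every change made THROUGH it.
final class MonitoredStore implements TripleStore {
    private final TripleStore inner;
    final List<String> changes = new ArrayList<>();

    MonitoredStore(TripleStore inner) {
        this.inner = inner;
    }

    @Override
    public void add(String triple) {
        changes.add("ADD " + triple); // notification happens only on this path
        inner.add(triple);
    }

    // Handing out the inner object lets callers bypass the monitoring.
    TripleStore unwrapped() {
        return inner;
    }
}
```

Writing through monitored.unwrapped() changes the core store but leaves changes empty, which mirrors building a Model from the graph returned by getDefaultGraph(); writing through monitored.add(...) is the analogue of running the update on the datasetGraph itself.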

How do I make View's asList() sortable in Google Dataflow SDK?

We have a problem making the asList() method return a sorted list.
We thought we could do this by extending the View class and overriding the asList method, but the View class has a private constructor, so we could not.
Our other attempt was to fork the Google Dataflow code on GitHub and modify the PCollectionViews class to return a sorted list by using the Collections.sort method, as shown in the code snippet below:
@Override
protected List<T> fromElements(Iterable<WindowedValue<T>> contents) {
    Iterable<T> itr = Iterables.transform(
        contents,
        new Function<WindowedValue<T>, T>() {
            @SuppressWarnings("unchecked")
            @Override
            public T apply(WindowedValue<T> input){
                return input.getValue();
            }
        });
    LOG.info("#### About to start sorting the list !");
    List<T> tempList = new ArrayList<T>();
    for (T element : itr) {
        tempList.add(element);
    }
    Collections.sort((List<? extends Comparable>) tempList);
    LOG.info("##### List should now be sorted !");
    return ImmutableList.copyOf(tempList);
}
Note that we are now sorting the list.
This seemed to work when run with the DirectPipelineRunner, but when we tried the BlockingDataflowPipelineRunner, it didn't seem like the code change was being executed.
Note: We actually recompiled the Dataflow SDK and used it in our project, but this did not work.
How can we get a sorted list from the asList method call?
The classes in PCollectionViews are not intended for extension. Only the primitive view types provided by View.asSingleton, View.asIterable, View.asList, View.asMap, and View.asMultimap are supported.
To obtain a sorted list from a PCollectionView, you'll need to sort it after you have read it. The following code demonstrates the pattern.
// Assume you have some PCollection
PCollection<MyComparable> myPC = ...;
// Prepare it for side input as a list
final PCollectionView<List<MyComparable>> myView = myPC.apply(View.asList());
// Side input the list and sort it
someOtherValue.apply(
    ParDo.withSideInputs(myView).of(
        new DoFn<A, B>() {
            @Override
            public void processElement(ProcessContext ctx) {
                // Copy first: the side input itself should not be modified
                List<MyComparable> tempList =
                    Lists.newArrayList(ctx.sideInput(myView));
                Collections.sort(tempList);
                // do whatever you want with sorted list
            }
        }));
Of course, you may not want to sort it repeatedly, depending on the cost of sorting vs the cost of materializing it as a new PCollection, so you can output this value and read it as a new side input without difficulty:
// Side input the list, sort it, and put it in a PCollection
PCollection<List<MyComparable>> sortedSingleton = Create.<Void>of(null).apply(
    ParDo.withSideInputs(myView).of(
        // The output type is the sorted list itself
        new DoFn<Void, List<MyComparable>>() {
            @Override
            public void processElement(ProcessContext ctx) {
                List<MyComparable> tempList =
                    Lists.newArrayList(ctx.sideInput(myView));
                Collections.sort(tempList);
                ctx.output(tempList);
            }
        }));
// Prepare the sorted list for side input as a singleton
final PCollectionView<List<MyComparable>> sortedView =
    sortedSingleton.apply(View.asSingleton());
someOtherValue.apply(
    ParDo.withSideInputs(sortedView).of(
        new DoFn<A, B>() {
            @Override
            public void processElement(ProcessContext ctx) {
                ... ctx.sideInput(sortedView) ...
                // do whatever you want with sorted list
            }
        }));
You may also be interested in the unsupported sorter contrib module for doing larger sorts using both memory and local disk.
We tried to do it the way Ken Knowles suggested, but there's a problem for large datasets. If tempList is large (so the sort takes measurable time, since it is O(n log n)) and there are millions of elements in the "someOtherValue" PCollection, then we are unnecessarily re-sorting the same list millions of times. We should be able to sort ONCE and FIRST, before passing the list to the DoFn in someOtherValue.apply.
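The "sort once, reuse many times" goal from the comment above can be sketched in plain Java as lazy caching. SortedOnce is a hypothetical helper, not part of the Dataflow SDK, and it only helps within a single object instance; whether a runner reuses one DoFn instance across elements is an assumption that would need checking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: sorts a copy of the source list on first use
// and hands back the cached result on every later call.
final class SortedOnce<T extends Comparable<T>> {
    private final List<T> source;
    private List<T> sorted; // stays null until get() is first called
    int sortCount = 0;      // exposed only to show how often sorting ran

    SortedOnce(List<T> source) {
        this.source = source;
    }

    synchronized List<T> get() {
        if (sorted == null) {
            List<T> copy = new ArrayList<>(source); // leave the source untouched
            Collections.sort(copy);
            sorted = Collections.unmodifiableList(copy);
            sortCount++;
        }
        return sorted;
    }
}
```

Calling get() for each of a million elements then costs one sort plus a million cheap lookups, instead of a million sorts.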

Json and Circular Reference Exception

I have an object which has a circular reference to another object. Given the relationship between these objects this is the right design.
To Illustrate
Machine => Customer => Machine
As is expected I run into an issue when I try to use Json to serialize a machine or customer object. What I am unsure of is how to resolve this issue as I don't want to break the relationship between the Machine and Customer objects. What are the options for resolving this issue?
Edit
Presently I am using the Json method provided by the Controller base class, so the serialization I am doing is as basic as:
Json(machineForm);
Update:
Do not try to use NonSerializedAttribute, as the JavaScriptSerializer apparently ignores it.
Instead, use the ScriptIgnoreAttribute in System.Web.Script.Serialization.
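public class Machine
{
    // Forward reference: serialized along with the Machine
    public Customer Customer { get; set; }
    // Other members
    // ...
}
public class Customer
{
    // Back-reference: skipped by JavaScriptSerializer
    [ScriptIgnore]
    public Machine Machine { get; set; } // Parent reference?
    // Other members
    // ...
}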
This way, when you toss a Machine into the Json method, it will traverse the relationship from Machine to Customer but will not try to go back from Customer to Machine.
The relationship is still there for your code to do as it pleases with, but the JavaScriptSerializer (used by the Json method) will ignore it.
I'm answering this despite its age because it is currently the 3rd result from Google for "json.encode circular reference", and because I don't completely agree with the answers above: using the ScriptIgnoreAttribute assumes that you will never want to traverse the relationship in the other direction for some JSON anywhere in your code. I don't believe in locking down your model because of one use case.
It did inspire me to use this simple solution.
Since you're working in a View in MVC, you have the Model and simply want to assign it to ViewData.Model within your controller. Go ahead and use a LINQ query to flatten the data, removing the offending circular reference for the particular JSON you want, like this:
var jsonMachines = from m in machineForm
select new { m.X, m.Y, // other Machine properties you desire
Customer = new { m.Customer.Id, m.Customer.Name, // other Customer properties you desire
}};
return Json(jsonMachines);
Or if the Machine -> Customer relationship is 1..* -> * then try:
var jsonMachines = from m in machineForm
select new { m.X, m.Y, // other machine properties you desire
Customers = new List<Customer>(
(from c in m.Customers
select new Customer()
{
Id = c.Id,
Name = c.Name,
// Other Customer properties you desire
}).Cast<Customer>())
};
return Json(jsonMachines);
Based on txl's answer, you have to disable lazy loading and proxy creation; then you can use the normal methods to get your data.
Example:
//Retrieve Items with Json:
public JsonResult Search(string id = "")
{
db.Configuration.LazyLoadingEnabled = false;
db.Configuration.ProxyCreationEnabled = false;
var res = db.Table.Where(a => a.Name.Contains(id)).Take(8);
return Json(res, JsonRequestBehavior.AllowGet);
}
I used to have the same problem. I created a simple extension method that "flattens" L2E objects into an IDictionary. An IDictionary is serialized correctly by the JavaScriptSerializer, and the resulting JSON is the same as directly serializing the object.
Since I limit the level of serialization, circular references are avoided. It also will not include 1->n linked tables (Entitysets).
private static IDictionary<string, object> JsonFlatten(object data, int maxLevel, int currLevel) {
var result = new Dictionary<string, object>();
var myType = data.GetType();
var myAssembly = myType.Assembly;
var props = myType.GetProperties();
foreach (var prop in props) {
// Remove EntityKey etc.
if (prop.Name.StartsWith("Entity")) {
continue;
}
if (prop.Name.EndsWith("Reference")) {
continue;
}
// Do not include lookups to linked tables
Type typeOfProp = prop.PropertyType;
if (typeOfProp.Name.StartsWith("EntityCollection")) {
continue;
}
// If the type is from my assembly == custom type
// include it, but flattened
if (typeOfProp.Assembly == myAssembly) {
if (currLevel < maxLevel) {
result.Add(prop.Name, JsonFlatten(prop.GetValue(data, null), maxLevel, currLevel + 1));
}
} else {
result.Add(prop.Name, prop.GetValue(data, null));
}
}
return result;
}
public static IDictionary<string, object> JsonFlatten(this Controller controller, object data, int maxLevel = 2) {
return JsonFlatten(data, maxLevel, 1);
}
My Action method looks like this:
public JsonResult AsJson(int id) {
var data = Find(id);
var result = this.JsonFlatten(data);
return Json(result, JsonRequestBehavior.AllowGet);
}
In the Entity Framework version 4, there is an option available: ObjectContextOptions.LazyLoadingEnabled
Setting it to false should avoid the 'circular reference' issue. However, you will have to explicitly load the navigation properties that you want to include.
see: http://msdn.microsoft.com/en-us/library/bb896272.aspx
Since, to my knowledge, you cannot serialize object references, only copies, you could try a bit of a dirty hack that goes something like this:
Customer should serialize its Machine reference as the machine's id.
When you deserialize the JSON, you can then run a simple function over it that transforms those ids into proper references.
You need to decide which is the "root" object. Say the machine is the root; then the customer is a sub-object of the machine. When you serialise the machine, it will serialise the customer as a sub-object in the JSON, and when the customer is serialised, it will NOT serialise its back-reference to the machine. When your code deserialises the machine, it will deserialise the machine's customer sub-object and reinstate the back-reference from the customer to the machine.
Most serialisation libraries provide some kind of hook to modify how deserialisation is performed for each class. You'd need to use that hook to modify deserialisation for the machine class to reinstate the backreference in the machine's customer. Exactly what that hook is depends on the JSON library you are using.
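The "root object plus id" idea from the two answers above can be sketched without any JSON library. The example is in Java rather than C#, uses nested maps as a stand-in for the JSON tree, and all names (Machine, Customer, BackReferenceDemo, the machineId key) are hypothetical; a real serializer hook would replace the hand-written toTree and fromTree methods.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class Customer {
    String name;
    Machine machine; // back-reference that would cause the cycle
}

final class Machine {
    int id;
    Customer customer;
}

final class BackReferenceDemo {
    // Machine is the root: the customer is nested, and its back-reference
    // is written out as the machine's id instead of the machine itself.
    static Map<String, Object> toTree(Machine machine) {
        Map<String, Object> customerTree = new LinkedHashMap<>();
        customerTree.put("name", machine.customer.name);
        customerTree.put("machineId", machine.id);

        Map<String, Object> machineTree = new LinkedHashMap<>();
        machineTree.put("id", machine.id);
        machineTree.put("customer", customerTree);
        return machineTree;
    }

    // Rebuilds the objects and reinstates the customer -> machine reference.
    @SuppressWarnings("unchecked")
    static Machine fromTree(Map<String, Object> machineTree) {
        Machine machine = new Machine();
        machine.id = (Integer) machineTree.get("id");

        Map<String, Object> customerTree =
            (Map<String, Object>) machineTree.get("customer");
        Customer customer = new Customer();
        customer.name = (String) customerTree.get("name");
        customer.machine = machine; // back-reference restored here
        machine.customer = customer;
        return machine;
    }
}
```

The tree produced by toTree contains no cycles, so any serializer can write it out, while the object model keeps its bidirectional relationship.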
I've had the same problem this week as well, and could not use anonymous types because I needed to implement an interface asking for a List<MyType>. After making a diagram showing all relationships with navigability, I found out that MyType had a bidirectional relationship with MyObject which caused this circular reference, since they both saved each other.
After deciding that MyObject did not really need to know MyType, and thereby making it a unidirectional relationship this problem was solved.
What I have done is a bit radical, but I don't need the property, which makes the nasty circular-reference-causing error, so I have set it to null before serializing.
SessionTickets result = GetTicketsSession();
foreach(var r in result.Tickets)
{
r.TicketTypes = null; //those two were creating the problem
r.SelectedTicketType = null;
}
return Json(result);
If you really need your properties, you can create a viewmodel which does not hold circular references, but maybe keeps some Id of the important element, that you could use later for restoring the original value.

Iterating over an unknown IQueryable's properties?

Forgive me if this has been asked before; I couldn't find anything close after a few searches:
I'm trying to write an ActionFilter in MVC that will "intercept" an IQueryable and nullify all the parent-child relationships at runtime. I'm doing this because Linq does not serialize objects properly if they have parent-child relationships (it throws a circular reference error because the parent refers to the child, which refers back to the parent and so on), and I need the object serialized to Json for an Ajax call. I have tried marking the child relationship in the DBML file with a privacy status of internal, and while this fixes the serialization problem, it also hides the child members from the view engine when the page renders, causing another error to be thrown. So, by fixing one problem, I cause another.
The only thing that fixes both problems is to manually set the child members to null just before returning the serialization, but I'm trying to avoid doing that because it's cumbersome, not reusable, etc. I'd rather use an ActionFilter to inspect the IQueryable that is being serialized and nullify any members with a Type of EntitySet (how Foreign Keys/Associations are represented). However, I don't have much experience with Reflection and can't find any examples that illustrate how to do something like this. So... is this possible with Reflection? Is there a better way to accomplish the same thing? I'll post the relevant code tomorrow when I'm back at my work computer.
Thanks,
Daniel
As promised, the code:
[GridAction]
public ActionResult _GetGrid()
{
IQueryable<Form> result = formRepository.GetAll();
foreach (Form f in result)
{
f.LineItems = null;
f.Notes = null;
}
return View(new GridModel<Form> { Data = result });
}
An added wrinkle is that I'm using the new Telerik MVC Extensions, so I'm not actually serializing the Json myself -- I'm just returning the IQueryable in an IGridModel, and the action filter [GridAction] does the rest.
So, just in case anyone's curious, here's how I finally solved this problem: I modified Damien Guard's T4 template to include the attribute [ScriptIgnore] above entities of type Association. This lets the JSON serializer know to not bother serializing these, thus preventing the circular reference problem I was getting. The generated code ends up looking like this:
private EntitySet<LineItem> _LineItems;
[ScriptIgnore]
[Association(Name=@"Form_LineItem", Storage=@"_LineItems", ThisKey=@"Id", OtherKey=@"FormId")]
public EntitySet<LineItem> LineItems
{
    get {
        return _LineItems;
    }
    set {
        _LineItems.Assign(value);
    }
}
This fixes the serialization problem I was having without disabling the use of child tables through LINQ. The grid action on the controller ends up looking like this:
[GridAction]
public ActionResult _GetGrid()
{
return View(new GridModel<Form> { Data = formRepository.GetAll() });
}
There are two options: one is to ignore those properties during serialization using [XmlIgnore]; the other is to nullify the properties using reflection.
Ignoring in serialization: a simple usage sample showing that an ignored property comes back with its default value after a round trip:
[Serializable]
public class MyClass
{
[XmlIgnore]
public int IgnoredVal { get; set; }
public int Val { get; set; }
}
public void UsageSample()
{
var xmlSerializer = new XmlSerializer(typeof(MyClass));
var memoryStream = new MemoryStream();
var toSerialize = new MyClass { IgnoredVal = 1, Val = 2 };
xmlSerializer.Serialize(memoryStream, toSerialize);
memoryStream.Position = 0;
var deserialize = (MyClass)xmlSerializer.Deserialize(memoryStream);
Assert.AreEqual(0, deserialize.IgnoredVal);
Assert.AreEqual(2, deserialize.Val);
}
Nullify with reflection, code sample:
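public void NullifyEntitySetProperties(object obj)
{
    // EntitySet<T> is generic, so compare against the open generic type definition
    var entitySetProperties = obj.GetType().GetProperties()
        .Where(property => property.PropertyType.IsGenericType
            && property.PropertyType.GetGenericTypeDefinition() == typeof(EntitySet<>));
    foreach (var property in entitySetProperties)
    {
        property.SetValue(obj, null, null);
    }
}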
In my opinion, if the first option can be used in your code, it's better. It is more direct and economical.
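For comparison, the same "nullify by reflection" technique looks like this in Java. It is a rough analogue, not a translation of the LINQ to SQL code: Form and Nullifier are hypothetical classes, and java.util.Collection fields stand in for EntitySet properties.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical entity with two child collections.
final class Form {
    String title = "Order";
    List<String> lineItems = new ArrayList<>();
    List<String> notes = new ArrayList<>();
}

final class Nullifier {
    // Sets every field whose declared type is a Collection to null
    // and returns how many fields were cleared.
    static int nullifyCollections(Object target) {
        int cleared = 0;
        for (Field field : target.getClass().getDeclaredFields()) {
            if (Collection.class.isAssignableFrom(field.getType())) {
                try {
                    field.setAccessible(true);
                    field.set(target, null);
                    cleared++;
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot clear " + field.getName(), e);
                }
            }
        }
        return cleared;
    }
}
```

The check is on the declared field type, so scalar members such as title are left alone; only the child collections that would drag in the parent-child cycle are cleared.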

Using Stored Procedures with Linq To Sql which have Additional Parameters

I have a very big problem and can't seem to find anybody else on the internet that has my problem. I sure hope StackOverflow can help me...
I am writing an ASP.NET MVC application and I'm using the Repository concept with Linq To Sql as my data store. Everything is working great in regards to selecting rows from views. And trapping very basic business rule constraints. However, I'm faced with a problem in my stored procedure mappings for deletes, inserts, and updates. Let me explain:
Our DBA has put a lot of work into putting the business logic into all of our stored procedures so that I don't have to worry about it on my end. Sure, I do basic validation, but he manages data integrity and conflicting date constraints, etc... The problem that I'm faced with is that all of the stored procedures (and I mean all) have 5 additional parameters (6 for inserts) that provide information back to me. The idea is that when something breaks, I can prompt the user with the appropriate information from our database.
For example:
sp_AddCategory(
    @userID INT,
    @categoryName NVARCHAR(100),
    @isActive BIT,
    @errNumber INT OUTPUT,
    @errMessage NVARCHAR(1000) OUTPUT,
    @errDetailLogID INT OUTPUT,
    @sqlErrNumber INT OUTPUT,
    @sqlErrMessage NVARCHAR(1000) OUTPUT,
    @newRowID INT OUTPUT)
In the above stored procedure, the first 3 parameters are the only ones used to "create" the Category record. The remaining parameters simply tell me what happened inside the procedure. If a business rule is broken inside the stored procedure, he does NOT use the SQL RAISERROR statement. Instead, he provides information about the error back to me using the OUTPUT parameters. He does this for every single stored procedure in our database, even the Updates and Deletes. All of the 'Get' calls are done using custom views. They have all been tested, and the idea was to make my job easier since I don't have to add the business logic to trap all of the various scenarios to ensure data quality.
As I said, I'm using Linq To Sql, and I'm now faced with a problem. My "Category" model object has just 4 properties on it: CategoryID, CategoryName, UserId, and IsActive. When I opened up the designer to start mapping my properties for the insert, I realized that there is really no (easy) way for me to account for the additional parameters unless I add them to my Model object.
Theoretically what I would LIKE to do is this:
// note: Repository Methods
public void AddCategory(Category category)
{
_dbContext.Categories.InsertOnSubmit(category);
}
public void Save()
{
_dbContext.SubmitChanges();
}
And then from my CategoryController class I would simply do the following:
[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Create(FormCollection collection)
{
var category = new Category();
try
{
UpdateModel(category); // simple validation here...
_repository.AddCategory(category);
_repository.Save(); // should get error here!!
return RedirectToAction("Index");
}
catch
{
// manage friendly messages here somehow... (??)
// ...
return View(category);
}
}
What is the best way to manage this using Linq to Sql? I (personally) don't feel that it makes sense to have all of these additional properties added to each model object... For example, the 'Get' should NEVER have errors and I don't want my repository methods to return one type of object for Get calls, but accept another type of object for CUD calls.
Update: My Solution! (Dec. 1, 2009)
Here is what I did to fix my problem. I got rid of my 'Save()' method on all of my repositories. Instead, I added an 'Update()' method to each repository and actually commit the data to the database on each CUD (ie. Create / Update / Delete) call.
I knew that each stored procedure had the same parameters, so I created a class to hold them:
public class MySprocArgs
{
private readonly string _methodName;
public int? Number;
public string Message;
public int? ErrorLogId;
public int? SqlErrorNumber;
public string SqlErrorMessage;
public int? NewRowId;
public MySprocArgs(string methodName)
{
if (string.IsNullOrEmpty(methodName))
throw new ArgumentNullException("methodName");
_methodName = methodName;
}
public string MethodName
{
get { return _methodName; }
}
}
I also created a MySprocException that accepts the MySprocArgs in its constructor:
public class MySprocException : ApplicationException
{
private readonly MySprocArgs _args;
public MySprocException(MySprocArgs args) : base(args.Message)
{
_args = args;
}
public int? ErrorNumber
{
get { return _args.Number; }
}
public string ErrorMessage
{
get { return _args.Message; }
}
public int? ErrorLogId
{
get { return _args.ErrorLogId; }
}
public int? SqlErrorNumber
{
get { return _args.SqlErrorNumber; }
}
public string SqlErrorMessage
{
get { return _args.SqlErrorMessage; }
}
}
Now here is where it all comes together... Using the example that I started with in my initial inquiry, here is what the 'AddCategory()' method might look like:
public void AddCategory(Category category)
{
var args = new MySprocArgs("AddCategory");
var result = _dbContext.AddWidgetSproc(
category.CreatedByUserId,
category.Name,
category.IsActive,
ref args.Number, // <-- Notice use of 'args'
ref args.Message,
ref args.ErrorLogId,
ref args.SqlErrorNumber,
ref args.SqlErrorMessage,
ref args.NewRowId);
if (result == -1)
throw new MySprocException(args);
}
Now from my controller, I simply do the following:
[HandleError(ExceptionType = typeof(MySprocException), View = "SprocError")]
public class MyController : Controller
{
[AcceptVerbs(HttpVerbs.Post)]
public ActionResult Create(Category category)
{
if (!ModelState.IsValid)
{
// manage friendly messages
return View(category);
}
_repository.AddCategory(category);
return RedirectToAction("Index");
}
}
The trick to managing the new MySprocException is to simply trap it using the HandleError attribute and redirect the user to a page that understands the MySprocException.
I hope this helps somebody. :)
I don't believe you can add the output parameters to any of your LINQ classes because the parameters do not persist in any table in your database.
But you can handle output parameters in LINQ in the following way.
Add the stored procedure(s) you wish to call to your .dbml using the designer.
Call your stored procedure in your code
using (YourDataContext context = new YourDataContext())
{
Nullable<int> errNumber = null;
String errMessage = null;
Nullable<int> errDetailLogID = null;
Nullable<int> sqlErrNumber = null;
String sqlErrMessage = null;
Nullable<int> newRowID = null;
Nullable<int> userID = 23;
Nullable<bool> isActive=true;
context.YourAddStoredProcedure(userID, "New Category", isActive, ref errNumber, ref errMessage, ref errDetailLogID, ref sqlErrNumber, ref sqlErrMessage, ref newRowID);
}
I haven't tried it yet, but you can look at this article, where he talks about stored procedures that return output parameters.
http://weblogs.asp.net/scottgu/archive/2007/08/16/linq-to-sql-part-6-retrieving-data-using-stored-procedures.aspx
Basically drag the stored procedure into your LINQ to SQL designer then it should do the work for you.
The dbContext.SubmitChanges() call will work only for Entity Framework. I suggest handling Save, Update, and Delete either with a single stored procedure or with 3 different procedures.
