Preventing the "serializing the whole world" issue for large object graph in SnakeYaml


We have a very large object graph (lazy loaded from the DB via DataNucleus ORM during normal program execution so no problem normally) but we only want to serialize a small portion of it with SnakeYaml - just a small subset of classes.
One of these classes has a relationship to other classes that end up "reaching" most of the other objects in the graph, so serializing it pulls nearly every object from the database into the YAML stream. This is the classic "serializing the whole world" problem, which doesn't end well when you have millions of reachable objects, as you can imagine :)
I found the SnakeYaml 'Representer' class, which appears to be a hook that lets you tell SnakeYaml not to serialize a particular bean, but it doesn't seem to act as a circuit breaker on the object graph navigation when it encounters that bean. It won't write YAML output for that bean, but SnakeYaml appears to keep navigating the object graph past it.
// Imports needed at the top of the enclosing file:
import org.yaml.snakeyaml.introspector.Property;
import org.yaml.snakeyaml.nodes.NodeTuple;
import org.yaml.snakeyaml.nodes.Tag;
import org.yaml.snakeyaml.representer.Representer;

private class CircuitBreakerRepresenter extends Representer
{
    @Override
    protected NodeTuple representJavaBeanProperty(Object javaBean, Property property,
            Object propertyValue, Tag customTag) {
        // Intention: Don't navigate past the instances of 'Role' class when serializing
        // Outcome: Appears to continue navigating past 'Role' class instances
        if (javaBean instanceof Role) {
            return null;
        } else {
            return super.representJavaBeanProperty(javaBean, property, propertyValue,
                    customTag);
        }
    }
}
Is there a way to cause SnakeYaml to not navigate past a particular bean when serializing an object graph?

I just managed to answer my own question :)
What I was doing wrong was trying to stop the object graph navigation at the class level.
To circuit-break the navigation, you need to do it at the level of the individual relationship/property that you don't want SnakeYaml to navigate beyond during serialization.
So instead of
    if (javaBean instanceof Role) {
        return null;
    }
I needed to do
    if (javaBean instanceof ClassWithAttribute && property.getName().equals("classAttributeName")) {
        return null;
    }
where:
"ClassWithAttribute" is the name of the class that owns the relationship beyond which you don't want the serialization to proceed.
"classAttributeName" is the name of the relationship attribute beyond which you don't want the serialization to proceed.
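To make the difference between the two checks concrete, here is a minimal, library-free sketch of pruning a traversal per (owner class, property name). It does not use SnakeYaml at all; the User, Role and Permission classes and the walk/visit helpers are hypothetical names invented for this illustration, and the walker has no cycle detection because the sample graph is acyclic.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical stand-ins for real domain classes; none of this is SnakeYaml API.
class PropertyCircuitBreakerSketch {
    static class Permission { String name = "read"; }
    static class Role { Permission permission = new Permission(); }
    static class User { String login = "alice"; Role role = new Role(); }

    // Depth-first walk over declared fields. 'skip' decides, per
    // (owner class, property name), which relationships are never followed.
    // No cycle detection: the sample graph is acyclic.
    static void walk(Object bean, BiPredicate<Class<?>, String> skip, List<String> visited) {
        visited.add(bean.getClass().getSimpleName());
        for (Field field : bean.getClass().getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            if (skip.test(bean.getClass(), field.getName())) {
                continue; // circuit breaker: this relationship is not followed
            }
            try {
                field.setAccessible(true);
                Object value = field.get(bean);
                // Only descend into our own classes; JDK types such as String are leaves.
                if (value != null && !value.getClass().getName().startsWith("java.")) {
                    walk(value, skip, visited);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    static List<String> visit(Object root, BiPredicate<Class<?>, String> skip) {
        List<String> visited = new ArrayList<>();
        walk(root, skip, visited);
        return visited;
    }

    public static void main(String[] args) {
        // No filter: everything reachable is visited.
        System.out.println(visit(new User(), (owner, name) -> false));
        // prints [User, Role, Permission]

        // Break the circuit at the User.role relationship.
        System.out.println(visit(new User(),
                (owner, name) -> owner == User.class && name.equals("role")));
        // prints [User]
    }
}
```

The filter is keyed on the class that owns the relationship plus the property name, which mirrors the shape of the check in the answer above.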

Related

How to dynamically add a property / field to a domain class in Grails?

For a project I'm currently working on I need to dynamically add properties to a domain class and persist them later in the database. In general, I need a key/value store attached to a "normal" domain class. Sadly I cannot use a NoSQL database (e.g. Redis).
My approach would be to handle the additional properties on a save() by identifying them within afterInsert or afterUpdate and writing them to another table - I would prefer not to use a map property within the domain class but an additional "Field" table (to better support searches).
I tried to add properties using the metaClass approach:
person.metaClass.middlename = "Biterius"
assert person.middlename == "Biterius" // OK
This works and I can identify the additional properties in the afterInsert/afterUpdate methods but it seems that I cannot change the value thereafter - i.e., the following does not work:
person.middlename = "Tiberius"
assert person.middlename == "Tiberius" // FAIL
Then I tried an Expando approach by having the Person class extend Expando, both directly ("Person extends Expando") and via an abstract intermediate class ("Person extends AbstractPerson" and "AbstractPerson extends Expando").
def person = new Person()
assert person in Person // OK
assert person in AbstractPerson // OK
assert person in Expando // OK
Neither variant worked: I could assign values to arbitrary "properties", but the values were not stored!
person.mynewproperty = "Tiberius" // no MissingPropertyException is thrown
println person.mynewproperty // prints null
So how can I add properties to a domain class programmatically during runtime, change them and retrieve them during afterInsert or afterUpdate in order to "manually" store them in a "Fields" table?
Or am I doing something completely wrong? Are there other / simpler ways to do this?

What about turning your DB into a "NoSQL" one?
In one of my projects, I just used a String-property to store a map as JSON-Object.
For Groovy it's not a big problem to convert between a map and a JSON-Object. And since you can access a map just like an object with properties, I found this solution very convenient.
Only drawback: you have to plan the size of your String-property in advance...
Update: sorry, I just read that you want to support searches... What about
class Person {
    ...
    static hasMany = [extProperties: KeyValue]
    ...
    def invokeMethod(String name, args) {
        if (name.startsWith('get')) {
            // an unknown property's getter is called
        }
        // add the same for the setter
    }
}
class KeyValue {
    String key
    String value
}
I guess such a schema would give you all the freedom you need. Even without the hasMany, you can make use of invokeMethod to handle your external tables...
The getter and setter can save your values in a transient String property (static transients = ['myTransientProperty']). This property should be available in the afterInsert / afterUpdate events.

Why don't you just create a map of strings on the domain object and store your extra data there manually? Unless you're storing complex data you should be able to cast anything you need to/from a string.
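As a rough illustration of the map-of-strings idea, here is a small sketch written in Java rather than Groovy. The class name PersonWithExtras and its methods are hypothetical, and nothing here touches GORM; it only shows values being stored as strings, overwritten, and converted back on read.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical domain class; in Grails this would be a Groovy domain class
// with a Map property or a separate key/value table.
class PersonWithExtras {
    private final Map<String, String> extras = new LinkedHashMap<>();

    // Store any simple value as its string form; a later put overwrites it.
    void put(String key, Object value) {
        extras.put(key, String.valueOf(value));
    }

    String getString(String key) {
        return extras.get(key);
    }

    // Convert back from the stored string when a typed value is needed.
    int getInt(String key) {
        return Integer.parseInt(extras.get(key));
    }

    // Read-only view, e.g. for writing one row per entry to a "Field" table.
    Map<String, String> rows() {
        return Collections.unmodifiableMap(extras);
    }

    public static void main(String[] args) {
        PersonWithExtras person = new PersonWithExtras();
        person.put("middlename", "Biterius");
        person.put("middlename", "Tiberius"); // the value can be changed afterwards
        person.put("age", 42);
        System.out.println(person.getString("middlename")); // prints Tiberius
        System.out.println(person.getInt("age") + 1);       // prints 43
    }
}
```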

InSingletonScope using Ninject and a Windows Service

I re-posted this question in a new post, as I think this one is a bit vague.
I am currently using a Windows Service that is on a 2 minute timer. I am using EF code first with a repository pattern for data access. I am using Ninject to inject my dependencies. I have the following bindings in my NinjectDependencyResolver class:
ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["Database"];
Bind<IDatabaseFactory>().To<DatabaseFactory>()
    .InSingletonScope()
    .WithConstructorArgument("connectionString", connectionStringSettings.Name);
Bind<IUnitOfWork>().To<UnitOfWork>().InSingletonScope();
Bind<IMyRepository>().To<MyRepository>().InSingletonScope();
When my service runs every 2 minutes I do something similar to this:
foreach (var row in rows)
{
    var existing = myRepository.GetById(row.Id);
    if (existing == null)
    {
        existing = new Row();
        myRepository.Add(existing);
        unitOfWork.Commit();
    }
}
I am starting to see an error in my logs that says:
The changes to the database were committed successfully, but an error occurred while updating the object context. The ObjectContext might be in an inconsistent state. Inner exception message: AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges.
Is it correct to use InSingletonScope when using Ninject in a Windows Service? I believe I tried using different scopes like InTransientScope but I could only get InSingletonScope to work with data access. Does the error message have anything to do with scope or is it unrelated?

Assuming that the service is not the only process that operates on the database, you shouldn't use Singleton. What happens in this case is that you are reusing a DbContext that has cached entities which are out of date.
The better way is to treat each timer execution of the service as if it were a web/WCF request and create a new job processor for the request.
// In the timer callback:
var processor = factory.CreateRowsProcessor();
processor.ProcessRows(rows);

public class RowsProcessor
{
    // The constructor name must match the class name.
    public RowsProcessor(UoW uow, ....)
    {
        ...
    }

    public void ProcessRows(Rows[] rows)
    {
        foreach (var row in rows)
        {
            var existing = myRepository.GetById(row.Id);
            if (existing == null)
            {
                existing = new Row();
                myRepository.Add(existing);
                unitOfWork.Commit();
            }
        }
    }
}
Depending on the problem, it might be even better to put the loop outside and have a new processor for each single row.
Read http://www.planetgeek.ch/2011/12/31/ninject-extensions-factory-introduction/ for more information about factories. Also have a look at the InCallScope of the named scope extension if you need to inject the UoW into multiple classes: http://www.planetgeek.ch/2010/12/08/how-to-use-the-additional-ninject-scopes-of-namedscope/

InSingletonScope will create a singleton context, i.e. one context for the whole lifetime of your service. That is a very bad solution. Because the context holds all objects from all previous timer events, its memory consumption grows, and you can get errors like the one you are receiving at the moment (the error could be unrelated to your singleton context, but most likely it is not). The exception says that you have two different objects with the same key identifier tracked by the context, which is not allowed.
Instead of using a singleton UoW, repository and context, use a singleton factory and request fresh instances from the factory in each timer event. Dispose the context at the end of the timer event processing.
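The "singleton factory, fresh context per timer event, dispose at the end" shape can be sketched independently of Ninject and EF. The version below is in Java and uses only the standard library; Context, FACTORY and onTimerTick are hypothetical names, and the commit is a placeholder rather than a real database call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class FreshContextPerTickSketch {
    // Hypothetical stand-in for a context/unit of work that tracks entities
    // until it is disposed.
    static class Context implements AutoCloseable {
        final List<String> tracked = new ArrayList<>();

        void add(String entity) {
            tracked.add(entity);
        }

        void commit() {
            // A real implementation would write the tracked changes here.
        }

        @Override
        public void close() {
            tracked.clear(); // nothing survives into the next timer event
        }
    }

    // The factory is the only long-lived (singleton) object.
    static final Supplier<Context> FACTORY = Context::new;

    // Called once per timer event; returns how many entities this event tracked.
    static int onTimerTick(List<String> rows) {
        try (Context context = FACTORY.get()) {
            for (String row : rows) {
                context.add(row);
            }
            context.commit();
            return context.tracked.size(); // evaluated before close() runs
        }
    }

    public static void main(String[] args) {
        System.out.println(onTimerTick(List.of("row-1", "row-2"))); // prints 2
        System.out.println(onTimerTick(List.of("row-1")));          // prints 1
    }
}
```

Because every event gets its own context, the second call starts with nothing tracked, so an entity with the same key as one from an earlier event cannot collide inside the context.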

Cancel/Block the save of a domain object based on some criteria?

I need to block or cancel the save of a domain object based on some property.
Can this be done in a constraint?
Example:
If an 'Order' domain object has a state of 'invoiced', then the order should not be able to be updated anymore.
Any suggestions on how to tackle this?

I see no reason why you couldn't simply use a constraint for this (as you suggested). Something like this should do it
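class Order {
    String state
    static constraints = {
        state(validator: { stateValue, self ->
            // Only check orders that have already been saved. Compare against the
            // stored value (GORM's getPersistentValue) so that a saved order can
            // still be moved into the 'invoiced' state once.
            if (self.id && self.getPersistentValue('state') == 'invoiced') {
                return false
            }
            return true
        })
    }
}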
If for some reason you can't use a constraint, here are a couple of alternative suggestions:
Meta-Programming
Use Groovy's method-interception capabilities to intercept calls to save(). Your interceptor should forward the call to the intercepted save() only if the order does not have an invoiced state.
There are some good examples of how to do this in the Programming Groovy book.
GORM Events
GORM provides a number of events that are triggered during a persisted object's lifecycle. It may be possible to prevent the update from the beforeUpdate or beforeValidate events (I guess throwing an exception would work).
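The "refuse the update by throwing" idea from the GORM-events suggestion can be sketched without Grails. The Java snippet below is only an illustration of the guard logic; the Order class, its persistedState field and the save() method are hypothetical and do not correspond to any GORM API.

```java
class InvoicedOrderGuardSketch {
    // Hypothetical order that remembers the state it was last saved with.
    static class Order {
        private String persistedState; // null until the first successful save
        String state;

        Order(String state) {
            this.state = state;
        }

        // Refuses the update when the stored copy is already invoiced.
        void save() {
            if ("invoiced".equals(persistedState)) {
                throw new IllegalStateException("invoiced orders can no longer be updated");
            }
            persistedState = state;
        }
    }

    public static void main(String[] args) {
        Order order = new Order("open");
        order.save();             // first save succeeds
        order.state = "invoiced";
        order.save();             // moving into 'invoiced' is still allowed
        order.state = "open";
        try {
            order.save();         // any later change is rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
            // prints invoiced orders can no longer be updated
        }
    }
}
```

The guard checks the state the order was last saved with, not the state it is about to be saved with, so an order can still become invoiced exactly once.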

System.InvalidOperationException when trying to iteratively add objects using EF 4

This question is very similar to this one. However, the resolutions to that question either:
do not seem to apply, or
are somewhat suspect and don't seem like a good approach to resolving the problem.
Basically, I'm iterating over a generic list of objects, and inserting them. Using MVC 2, EF 4 with the default code generation.
foreach (Requirement r in requirements)
{
    var car = new CustomerAgreementRequirement();
    car.CustomerAgreementId = viewModel.Agreement.CustomerAgreementId;
    car.RequirementId = r.RequirementId;
    _carRepo.Add(car); // Save new record
}
And the Repository.Add() method:
public class BaseRepository<TEntity> : IRepository<TEntity> where TEntity : class
{
    private TxRPEntities txDB;
    private ObjectSet<TEntity> _objectSet;

    public void Add(TEntity entity)
    {
        SetUpdateParams(entity);
        _objectSet.AddObject(entity);
        txDB.SaveChanges();
    }
}
I should note that I've been successfully using the Add() method throughout my code for single inserts; this is the first time I've tried to use it to iteratively insert a group of objects.
The error:
System.InvalidOperationException: The changes to the database were committed successfully, but an error occurred while updating the object context. The ObjectContext might be in an inconsistent state. Inner exception message: AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges.
As stated in the prior question, the EntityKey is set to True, StoreGeneratedPattern = Identity. The actual table that is being inserted into is a relationship table, in that it is comprised of an identity field and two foreign key fields. The error always occurs on the second insert, regardless of whether that specific entity has been inserted before or not, and I can confirm that the values are always different, no key conflicts as far as the database is concerned. My suspicion is that it has something to do with the temporary entitykey that gets set prior to the actual insert, but I don't know how to confirm that, nor do I know how to resolve it.
My gut feeling is that the solution in the prior question, to set the SaveOptions to None, would not be the best solution. (See prior discussion here)

I've had this issue with my repository using a loop as well and thought that it might be caused by some weird race-like condition. What I've done is refactor out a UnitOfWork class, so that the repository.add() method is strictly adding to the database, but not storing the context. Thus, the repository is only responsible for the collection itself, and every operation on that collection happens in the scope of the unit of work.
The catch is that, in a loop, you run out of memory damn fast with EF4. So you do need to save the changes periodically; I just don't save after every add.
public class BaseRepository<TEntity> : IRepository<TEntity> where TEntity : class
{
    private TxRPEntities txDB;
    private ObjectSet<TEntity> _objectSet;

    public void Add(TEntity entity)
    {
        SetUpdateParams(entity);
        _objectSet.AddObject(entity);
    }

    public void Save()
    {
        txDB.SaveChanges();
    }
}
Then you can do something like
const int batchSize = 500; // hypothetical limit; only matters if you have thousands of records
int pending = 0;
foreach (Requirement r in requirements)
{
    var car = new CustomerAgreementRequirement();
    car.CustomerAgreementId = viewModel.Agreement.CustomerAgreementId;
    car.RequirementId = r.RequirementId;
    _carRepo.Add(car); // Add new record (not saved yet)
    if (++pending % batchSize == 0)
        _carRepo.Save(); // Save periodically
}
_carRepo.Save(); // Save whatever is left
Note: I don't really like this solution, but I hunted around to try to find why things break in a loop when they work elsewhere, and that's the best I came up with.

We have had some odd collision issues if the entity is not added to the context directly after being created (before doing any assignments). The only time I've noticed the issue is when adding objects in a loop.
Try adding the newed up entity to the context, do the assignments, then save the context. Also, you don't need to save the context each time you add a new entity unless you absolutely need the primary key.

Entity Framework 4 Code First and the new() Operator

I have a rather deep hierarchy of objects that I'm trying to persist with Entity Framework 4, POCO, PI (Persistence Ignorance) and Code First. Suddenly things started working pretty well when it dawned on me to not use the new() operator. As originally written, the objects frequently use new() to create child objects.
Instead I'm using my take on the Repository Pattern to create all child objects as needed. For example, given:
class Adam
{
    List<Child> children;
    void AddChildGivenInput(string input) { children.Add(new Child(...)); }
}
class Child
{
    List<GrandChild> grandchildren;
    void AddGrandChildGivenInput(string input) { grandchildren.Add(new GrandChild(...)); }
}
class GrandChild
{
}
("GivenInput" implies some processing not shown here)
I define an AdamRepository like:
class AdamRepository
{
    Adam Add()
    {
        return objectContext.Create<Adam>();
    }

    Child AddChildGivenInput(Adam adam, string input)
    {
        // List<T>.Add returns void, so create the child first and return it.
        var child = new Child(...);
        adam.children.Add(child);
        return child;
    }

    GrandChild AddGrandchildGivenInput(Child child, string input)
    {
        var grandChild = new GrandChild(...);
        child.grandchildren.Add(grandChild);
        return grandChild;
    }
}
Now, this works well enough. However, I'm no longer "ignorant" of my persistence mechanism as I have abandoned the new() operator.
Additionally, I'm at risk of an anemic domain model since so much logic ends up in the repository rather than in the domain objects.
After much ado, a question:
Or rather several questions...
Is this pattern required to work with EF 4 Code First?
Is there a way to retain use of new() and still work with EF 4 / POCO / Code First?
Is there another pattern that would leave logic in the domain object and still work with EF 4 / POCO / Code First?
Will this restriction be lifted in later versions of Code First support?
Sometimes trying to go the POCO / Persistence Ignorance route feels like swimming upstream, other times it feels like swimming up Niagara Falls. Still, I want to believe...

Here are a couple of points that might help answer your question:
In your classes you have a field for the children collection and a method to add to the children. EF in general (not just Code First) currently requires that collections are surfaced as properties, so this pattern is not currently supported. More flexibility in how we interact with classes is a common ask for EF, and our team is looking at how we can support this at the moment.
You mentioned that you need to explicitly register entities with the context; this isn't necessarily the case. In the following example, if GetAdam() returned an Adam object that is attached to the underlying context, then the new child Cain would be automatically discovered by EF when you save, and inserted into the database.
var adam = myAdamRepository.GetAdam();
var cain = new Child();
adam.Children.Add(cain);
~Rowan
