TypeAccessException in Orleans heterogeneous silos

In a single Orleans cluster (v3.6) there are six silos: An (A1, A2, A3) implements one part of the grains; Bn (B1, B2, B3) implements the other part. The membership provider is Consul.
A client connects to the cluster to call grain methods hosted on either An or Bn.
This setup worked well until the following happened:
B1, B2, and B3 were deployed with a new grain type whose method accepts a parameter of a new data type, and that type is not known to the An group. The client occasionally receives a TypeAccessException when calling this grain; the error says the new type cannot be resolved.
I think the issue is in the silo gateway: the client may connect to an An silo to send the call, and since An does not know the new type, it fails to resolve it. To avoid this kind of error, both An and Bn would have to be deployed with the new data type.
But logically An does not need to know that data type, since it never calls that grain.
Is there a way to avoid this error without redeploying An?
Is it possible to expose only the Bn silos as gateways?
Is it possible to make the client select silo gateways by some tag?

Related

Is a Sales Transaction modeled as a Hub or a Link in Data Vault 2.0?

I'm a rookie in Data Vault, so please excuse my ignorance. I am currently ramping up and, in parallel, modeling a Raw Data Vault using Data Vault 2.0. I have a few assumptions and need help validating them.
1) Individual Hubs are modeled for:
a) Product (holds PK Product_Hkey, BK, metadata),
b) Customer (holds PK Customer_Hkey, BK, metadata),
c) Store (holds PK Store_Hkey, BK, metadata).
Now, a Sales Txn that involves all of the above business objects should be modeled as a Link table:
d) Link table - Sales_Link (holds PK Sales_Hkey, Sales Txn ID, Product_Hkey (FK), Customer_Hkey (FK), Store_Hkey (FK), metadata), with a Satellite associated to the Link table holding descriptive data about the Link.
Is the above approach valid?
My rationale for the Link table is that I consider Sales Txn ID to be a non-BK, and hence the sales transaction must be hosted in a Link as opposed to a Hub.
2) Operational data has different types of customers (Retail, Professional). All customers (regardless of type) should be modeled in one Hub, and the distinction between customer types should be made by modeling different Satellites (one for Retail, one for Professional) tied to the Customer Hub.
Is the above valid?
I have researched online technical forums, but got conflicting theories, so I'm posting it here.
There is no code applicable here
I would suggest modeling sales as a Hub if you are fine with the points below; otherwise a Link is a perfectly good design.
Sales transaction as a Hub (Sales_Hub):
What is the business key? Can you consider "Sales Txn ID" (a unique number) as a BK?
Is this Hub, or the same BK, used in another Link (other than Sales_Link), i.e. link-on-link?
Are you OK with Sales_Link having no Satellite, since all the descriptive data exists in Sales_Hub?
Also, it will store the same BK and audit metadata in two places (Hub and Link) and require additional joins to fetch data from the Hub's Satellite.
Your second point is valid when customer information (Retail, Professional, etc.) is stored in separate tables in the source system(s).
If the data comes through a single source table, you should model one Satellite and then apply soft rules to split customers by type in the Business Data Vault.

Microservices (application-level joins): do more API calls lead to more latency?

I have two microservices, one for Orders and one for Customers, exactly like the example below:
http://microservices.io/patterns/data/database-per-service.html
This works without any problem: I can list Customer data and Order data based on an input CustomerId.
But now there is a new requirement to develop a screen that shows the Orders for an input Date, with the CustomerName beside each Order.
When implementing this, I can fetch the list of Orders for the input Date, but to show the corresponding CustomerNames for the list of CustomerIds I make multiple API calls to the Customer microservice, each call sending one CustomerId to get a CustomerName.
This leads to more latency. I know the above solution is a bad one, so any ideas please?
The point of a microservices architecture is to split your problem domain into (technically, organizationally and semantically) independent parts. Making the "microservices" glorified (apified) tables actually creates more problems than it solves, if it solves any problem at all.
Here are a few things to do first:
List your architectural constraints (i.e. the reasons for doing microservices): separate scaling ability, organizational problems, making teams independent, etc.
List business-relevant boundaries in the problem domain (i.e. parts that theoretically don't need each other to work, or don't require synchronous communication).
With that information, here are a few ways to fix the problem:
Restructure the services based on business boundaries instead of technical ones. This means not using tables or layers or other technical stuff to split functions. Services should be a complete vertical slice of the problem domain.
Or, as a work-around, create a third system which aggregates data and can create reports (a sketch follows this list).
Or if you find there is actually no reason to keep the microservices approach, just do it in a way you are used to.
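As an illustration of the aggregator work-around, here is a minimal Java sketch. OrderClient, CustomerClient, the bulk findNamesByIds endpoint and the DTO fields are all hypothetical names introduced for the example, not existing APIs; the point is only that a separate read-only component composes data from both services in one place.

    import java.time.LocalDate;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.stream.Collectors;

    // Hypothetical DTOs and service clients; all names are illustrative.
    record Order(String orderId, String customerId, LocalDate date) {}
    record OrderReportRow(String orderId, String customerName) {}

    interface OrderClient {
        List<Order> findOrdersByDate(LocalDate date);        // assumed endpoint on the Order service
    }

    interface CustomerClient {
        Map<String, String> findNamesByIds(Set<String> ids); // assumed bulk endpoint on the Customer service
    }

    // A third, read-only system that aggregates data from both services for reporting.
    class OrderReportService {
        private final OrderClient orders;
        private final CustomerClient customers;

        OrderReportService(OrderClient orders, CustomerClient customers) {
            this.orders = orders;
            this.customers = customers;
        }

        List<OrderReportRow> ordersWithCustomerNames(LocalDate date) {
            List<Order> dayOrders = orders.findOrdersByDate(date);
            // One bulk lookup instead of one remote call per order.
            Set<String> ids = dayOrders.stream().map(Order::customerId).collect(Collectors.toSet());
            Map<String, String> names = customers.findNamesByIds(ids);
            return dayOrders.stream()
                    .map(o -> new OrderReportRow(o.orderId(), names.getOrDefault(o.customerId(), "unknown")))
                    .collect(Collectors.toList());
        }
    }

If you cannot add a bulk lookup to the Customer service, the same aggregator can still loop over single-id calls, but the bulk endpoint is what actually removes the extra round trips.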
The new requirement needs data from across domains. Here are some ways to handle it:
Fetch the customer Id and Name on every call. The issue is latency, as there would be multiple round trips.
Keep a cache of all CustomerNames by Id in the Order service (I am assuming there is a finite number of customers). The issue is deciding when to refresh or invalidate the cache; for that you may need to expose a REST call to invalidate entries. For new customers that are not yet in the cache, fetch them from the database and update the cache for future use (see the sketch after this list).
Use a CQRS approach in which all the needed data (Orders, Customers, etc.) goes into a separate read table. With this schema you can write a single composite SQL query, which removes the round trips.
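Here is a minimal Java sketch of the cache option above; CustomerNameCache and fetchName are illustrative names, with fetchName standing in for whatever remote call the Order service makes to the Customer service.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;

    // Cache of CustomerName by CustomerId kept inside the Order service.
    class CustomerNameCache {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Function<String, String> fetchName; // remote call to the Customer microservice

        CustomerNameCache(Function<String, String> fetchName) {
            this.fetchName = fetchName;
        }

        // Cache hit: no remote call. Cache miss: fetch once and remember it.
        String nameFor(String customerId) {
            return cache.computeIfAbsent(customerId, fetchName);
        }

        // Exposed (e.g. via a REST endpoint) so entries can be invalidated
        // when a customer is renamed or removed.
        void invalidate(String customerId) {
            cache.remove(customerId);
        }
    }

The invalidation endpoint is the part to watch: without it, a renamed customer keeps its old name in the Order service until the cache is refreshed.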

Where does Hazelcast map.get() go? And scalability concerns

According to this, "Each map.get(k) will be a remote operation." But where is the remote? For example, I have a node that writes into the IMap with key k, and another 50 nodes that read from the IMap using map.get(k). What happens when 50 nodes call map.get(k)? Does each call go to the node that did the write? If so, how many copies of the IMap will this "remote" node create in response to these 50 calls? Is it multi-threaded? Is the IMap a singleton, or will each thread create a deep copy of the IMap?
But where is the remote?
The answer is in the preceding sentence of the documentation you linked: "Imagine that you are reading the key k so many times and k is owned by another member in your cluster." Each key is hashed and mapped to a partition, as explained in http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#sharding-in-hazelcast and the "Data Partitioning" section that follows. The cluster member that owns that partition is the owner (or primary replica) of the key. Each read and write on that key will be executed by the same thread on that particular member (unless your configuration allows reading from backups).
What happens when 50 nodes call map.get(k)? Does each call go to the node that did the write?
Yes, it is always the key owner that executes operations on that key.
If so, how many copies of the IMap will this "remote" node create in response to these 50 calls?
The member only has one instance of the IMap; there are no copies.
Is it multi-threaded?
No; all map operations involving the same key k will be executed on the same partition thread on the same member, which holds the primary replica of that key. You can read more about the threading model of operations at http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#operation-threading
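As a small illustration, here is a sketch against the Hazelcast 3.x Java API that prints which member owns a given key, and enables the read-from-backup setting mentioned above; the map name and key are just placeholders.

    import com.hazelcast.config.Config;
    import com.hazelcast.config.MapConfig;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;
    import com.hazelcast.core.Member;
    import com.hazelcast.core.Partition;

    public class KeyOwnerDemo {
        public static void main(String[] args) {
            Config config = new Config();
            // Optional: lets a member that holds a backup replica of the key serve
            // map.get() locally, trading possibly stale reads for fewer remote calls.
            config.addMapConfig(new MapConfig("demo").setReadBackupData(true));

            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
            IMap<String, String> map = hz.getMap("demo");
            map.put("k", "v");

            // Every key hashes to exactly one partition; the member that owns that
            // partition is the primary replica and executes get/put for the key.
            Partition partition = hz.getPartitionService().getPartition("k");
            Member owner = partition.getOwner();
            System.out.println("key 'k' is in partition " + partition.getPartitionId()
                    + ", owned by " + owner.getAddress());

            hz.shutdown();
        }
    }

On a single member the owner is of course the local member; with several members running, the output shows whichever member owns the key's partition, regardless of which member you call getPartition from.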

HBase Row Key Design

I'm using HBase coupled with Phoenix for interactive analytics, and I'm trying to design my HBase row key for an IoT project, but I'm not sure I'm doing it right.
My database can be represented as something like this:
Client--->Project ----> Cluster1 ---> Cluster 2 ----> Sensor1
Client--->Project ----> Building ----> Sensor2
Client--->Project ----> Cluster1 ---> Building ----> Sensor3
What I have done is a composite primary key of (Client_ID, Project_ID, Cluster_ID, Building_ID, Sensor_ID):
(1,1,1#2,0,1)
(1,1,0,1,2)
(1,1,1,1,3)
We can specify multiple clusters or buildings with a # separator (1#2#454, etc.), and if a node is missing we insert 0.
In the column family we store the sensor value and multiple pieces of metadata.
My question: is this HBase row key design valid for a request such as "all sensors for the cluster with ID 1"?
I also thought about putting just (Sensor_ID, Timestamp) in the key and keeping all the routing information in the column family, but I'm not sure that design is a good fit for my requests.
My third idea for this project is to combine Neo4j for the routing and HBase for the data.
Does anyone have experience with similar problems who can guide me toward the best approach for designing this database?
It seems that you are dealing with time series data. One of the main risks of using HBase with time series data (or other forms of monotonically increasing keys) is hotspotting: a dangerous scenario in which your cluster effectively behaves as a single machine.
You should consider OpenTSDB on top of HBase, as it approaches the problem quite nicely. The single most important thing to understand is how it engineers the HBase schema/key. Note that the timestamp is not in the leading part of the key, and the design assumes that the number of distinct metric_uid values is much larger than the number of slave nodes and region servers (this is essential for a balanced cluster).
An OpenTSDB key has the following structure:
<metric_uid><timestamp><tagk1><tagv1>[...<tagkN><tagvN>]
Depending on your specific use case you should engineer your metric_uid appropriately (maybe a compound key unique to a sensor reading) as well as the tags. Tags will play a fundamental role in data aggregation.
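To make that layout concrete, here is a small Java sketch that packs a metric UID, a base timestamp, and tag UID pairs into a single byte-array row key. The 3-byte UID and 4-byte timestamp widths follow OpenTSDB's defaults as I understand them, and the helper names are invented for the example; it is an illustration of the key shape, not OpenTSDB's actual code.

    import java.nio.ByteBuffer;

    // Builds a row key shaped like <metric_uid><timestamp><tagk1><tagv1>...
    public class TsdbStyleKey {

        // Write the low 3 bytes of a UID into the buffer.
        private static void putUid(ByteBuffer buf, int uid) {
            buf.put((byte) (uid >>> 16));
            buf.put((byte) (uid >>> 8));
            buf.put((byte) uid);
        }

        // tagUids are alternating tagk/tagv UIDs, e.g. {clientK, clientV, sensorK, sensorV}.
        public static byte[] build(int metricUid, int baseTimestampSeconds, int... tagUids) {
            ByteBuffer buf = ByteBuffer.allocate(3 + 4 + 3 * tagUids.length);
            putUid(buf, metricUid);
            buf.putInt(baseTimestampSeconds);   // the timestamp follows the metric; it does not lead the key
            for (int uid : tagUids) {
                putUid(buf, uid);
            }
            return buf.array();
        }

        public static void main(String[] args) {
            // Hypothetical UIDs: metric "temperature"; tags client=..., building=..., sensor=...
            byte[] rowKey = build(42, 1_500_000_000, 1, 10, 2, 70, 3, 33);
            System.out.println("row key is " + rowKey.length + " bytes");
        }
    }

Because a well-distributed metric UID leads the key rather than the timestamp, writes for different metrics land on different regions, which is exactly the protection against hotspotting described above.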
NOTE: As of v2.0, OpenTSDB introduced the concept of Trees, which could be very helpful for navigating your sensor readings and facilitating aggregations. I'm not too familiar with them, but I assume you could create a hierarchical structure that helps determine which sensors are associated with which client, project, cluster, building, and so on.
P.S. I don't think there is room for Neo4j in this project.

Does it make sense to use different CoreData configurations to improve performance/reduce storage even if using just one persistent store?

I am working on a suite of apps that will have a lot of model code in common. I'm using Core Data, so I currently plan on having just one model file for all the different apps, although not all apps use all the entities defined in the model.
I have read about Core Data configurations that can be defined in the managed object model to get only a subset of all entities. I am wondering whether I could use these to also optimize Core Data usage in my apps.
Consider the following scenario:
I have three apps, App1, App2 and App3.
They have a shared managed object model with the following entities.
A, A1, A2, A3, B, C, D
where A is abstract and A1, A2, and A3 all inherit from A. Each of the A1, A2, and A3 entities has around 10-20 attributes/relationships.
Now
App1 only uses A, A1, B, C, D,
App2 only uses A, A2, B, C, D,
App3 only uses A, A3, B
I have read (I can't remember where) that to model sub-entities in SQLite, Core Data just creates a table for the parent entity that contains all the attributes and relationships of the sub-entities as table columns. Therefore it is often not advisable to create small parent entities with several large sub-entities, since this leads to a lot of empty columns for each sub-entity (which doesn't need the columns for the attributes of the other sub-entities).
Now, using configurations, I could create three configurations Conf1, Conf2, Conf3 like that:
Conf1 contains entities A, A1, B, C, D,
Conf2 contains entities A, A2, B, C, D,
Conf3 contains entities A, A3, B
Each of the apps would use a single store with the appropriate configuration, so I wouldn't make use of the "store the object automatically in the correct store" advantage configurations have when used with several stores.
However, my hope is that by adding a store for the specific configuration in each of the apps, the store would ignore the attributes of the non-included entities and thus not create the corresponding table columns. In the case of App3/Conf3 it would even avoid creating tables for entities C and D altogether.
My question is: does it work that way? Would the superfluous columns be left out of persistent stores that use the corresponding configuration?
And if so: does it actually make a difference in performance or storage requirements (assuming enough objects that performance optimizations actually start to matter)?
How Core Data represents sub entities in the SQLite store is an implementation detail that is hidden from you and subject to change. Do not depend on it working one way, because at some point it may work completely differently.
You may be prematurely optimizing. Build it, test it, and if there is a performance issue stemming from how you're using entities address it at that point.
As to your broader question of whether there should be performance advantages in using multiple configurations for a single store: there shouldn't be. If you have one SQLite store and only one configuration, Core Data is not going to make additional optimizations based on that (single) configuration.
Much of Core Data's performance comes from your data model design and access patterns. An application that is architected to be aware of Core Data's faulting behavior and that uses a well-thought-out data model will be quite performant. Even if you have a less than optimal data model, Core Data can be very fast if you optimize your round trips to the persistent store (i.e. managing faults, batch faulting when appropriate, implementing a correct find-or-create).
The Incremental Store Programming Guide contains a very good description of how faults are fulfilled. The Core Data Programming Guide has a higher level description of faulting, and discusses batch faulting, prefetching, and the find or create pattern.
