Whitebox pool vs. blackbox pool in BPMN

Blackbox pools are used to model an external participant, while whitebox pools are used to model entities whose process we are interested in. When should I choose one over the other, or can we just pick either?

Internal pools will always be whitebox pools, because you want to execute the internal process. External pools can be either whitebox or blackbox pools.
The difference is pretty obvious: you can't see what's happening in a blackbox. As a process developer you save a few minutes by throwing in a blackbox pool, but it makes the process harder to understand. Therefore you should use whitebox pools even for external partners.
Reasons to use a whitebox:
You don't have to infer from incoming and outgoing message flows what is happening on the partner's side.
The external process makes the partner's workflow constraints visible. For example, you cannot parallelize two tasks because your partner expects one message before the other.

There is no such thing as a "white box", but I understand why you say it.
The term "white box" is mentioned once in the BPMN specification, but only to distinguish it from a "black box". It is not a definition. A "white box" is really just a pool that contains a process.
Another aspect of pools is whether or not there is any activity detailed within the pool. Thus, a given pool may be shown as a "white box," with all details (e.g., a process) exposed, or as a "black box," with all details hidden.
- "Business Process Model and Notation (BPMN), v2.0.2", page 111
Also, a black box doesn't even have to be black or dark. A black box is only a term for a pool without a process.
When should I choose one over the other, or can we just pick either?
- OP
Black box: A term used to describe an empty pool. That is, a pool whose contents are unknown to you, out of your control, or irrelevant.
Use it in a collaboration diagram when you want to refer to an external participant whose processes are out of your control or irrelevant to the diagram you are making.
White box (regular pool): A term used as an antonym to a black box. But it is really just a pool containing a process.
Use it in a collaboration diagram (two or more participants) to refer to an internal process. In other words, use it as a normal pool - that's what it is.
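For what it's worth, the distinction is visible in the BPMN 2.0 XML itself: a participant in a collaboration either references a process (white box) or it doesn't (black box). A minimal sketch, with namespaces and the surrounding definitions element omitted:

    <collaboration id="Collaboration_1">
      <!-- white box: the participant references a process -->
      <participant id="Customer" name="Customer" processRef="CustomerProcess"/>
      <!-- black box: no processRef, so the pool is drawn empty -->
      <participant id="Supplier" name="Supplier"/>
    </collaboration>
    <process id="CustomerProcess">
      <task id="PlaceOrder" name="Place order"/>
    </process>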

A collaborative model has several participants. Since we are interested in explaining the activities of one particular participant, we use a whitebox pool for that participant; the rest of the pools are treated as blackboxes, because most of the time we either don't know those participants' business processes or don't want to show them at all.
So when we have to show a specific participant's details, we use a whitebox pool; otherwise a blackbox, which means no detail is provided for it.

I'll add some details as to why you should or shouldn't use black-box pools:
While you know what you send to (output) and receive from (input) an external source (vendor, etc.), you do not know the inner workings of your partner's process. Hence, you would use a black box, because you have neither knowledge of their exact processes nor control over those processes. The only knowns are what you send and receive, nothing more, nothing less.
On the other hand, a white box could be used to identify a source of input and output within your organization or through a possible partnership. In that case, using a white box makes sense, since you could interact with, and possibly modify, how that source interacts with your current process.
However, using a white box for an external source over which you have no control, and whose inner workings you don't know, would be misleading at best, and dangerous in some scenarios.
In the case of an ongoing project, a development/project team could wrongly conclude that it can interact with, or modify interactions with, an external source that is mistakenly modeled as a white box.

Related

Why are read-only nodes called read-only in the case of data store replication?

I was going through the article https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs, which says: "If separate read and write databases are used, they must be kept in sync". One obvious benefit I can see of having separate read replicas is that they can be scaled horizontally. However, I have some doubts:
It says, "Updating the database and publishing the event must occur in a single transaction". My understanding is that there is no guarantee that the updated data will be available immediately on the read-only nodes because it depends on when the event will be consumed by the read-only nodes. Did I get it correctly?
Data must first be written to the read-only nodes before it can be read from them, i.e. write operations are also performed on the read-only nodes. So why are they called read-only? Is it because the writes to these nodes are performed not directly by the data-producer application, but by some serverless function (e.g. an AWS Lambda or Azure Function) that picks the event up from the topic (e.g. a Kafka topic) to which the write side sent it?
Is the data sharded across the read-only nodes or does every read-only node have the complete set of data?
All of these have "it depends"-like answers...
Yes, usually, although some implementations might choose to (try to) update the read models transactionally with the write. With multiple nodes you're quickly forced to learn the CAP theorem, though, and so in many CQRS contexts eventual consistency is simply accepted as a feature, since the gains from tolerating it usually significantly outweigh the losses.
I suspect the bit you quoted actually refers to transactionally updating the write store together with publishing the event. Even that can be difficult to achieve, and it is one of the problems event sourcing seeks to solve.
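One common way to get that atomicity without full event sourcing is an outbox: write the state change and the event to the same database in the same transaction, and let a separate relay publish the event afterwards. A hedged C# sketch, with invented table, column and event names:

    using System;
    using System.Data.SqlClient;

    public static class ProductWriter
    {
        // Outbox sketch: the state change and the event row commit or roll
        // back together. A separate relay process later reads the Outbox
        // table and publishes the events to the topic the read side consumes.
        public static void RenameProduct(string connectionString, Guid id, string newName)
        {
            using var conn = new SqlConnection(connectionString);
            conn.Open();
            using var tx = conn.BeginTransaction();

            using (var update = new SqlCommand(
                "UPDATE Products SET Name = @name WHERE Id = @id", conn, tx))
            {
                update.Parameters.AddWithValue("@name", newName);
                update.Parameters.AddWithValue("@id", id);
                update.ExecuteNonQuery();
            }

            using (var insert = new SqlCommand(
                "INSERT INTO Outbox (EventType, Payload) VALUES (@type, @payload)", conn, tx))
            {
                insert.Parameters.AddWithValue("@type", "ProductRenamed");
                insert.Parameters.AddWithValue("@payload", $"{{\"id\":\"{id}\",\"name\":\"{newName}\"}}");
                insert.ExecuteNonQuery();
            }

            tx.Commit(); // both rows, or neither
        }
    }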
Yes. It's trivially obvious - in this context - that data must be written before it can be read, but your applications, as consumers of the data, see these nodes as read-only.
Both are valid designs. Usually this part is less an application concern and more something delegated to the capabilities of your chosen read-model infrastructure (Mongo, Cosmos, Dynamo, etc.).
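To illustrate the "read-only from the application's point of view" answer, here is a minimal C# sketch with invented names: the projector is the only writer to the read model, and it runs only when an event arrives, which is exactly where the eventual consistency comes from.

    using System;
    using System.Collections.Concurrent;

    // A write-side event, as published after the write store commits.
    public record ProductRenamed(Guid ProductId, string NewName);

    // The projector is the only writer to this read model; application
    // code only ever queries it - hence "read-only" nodes. Until Apply
    // runs, queries still see the old name: eventual consistency.
    public class ProductReadModel
    {
        private readonly ConcurrentDictionary<Guid, string> _names = new();

        // Called by whatever delivers events (a bus consumer, an AWS
        // Lambda / Azure Function picking them off a topic, ...).
        public void Apply(ProductRenamed evt) => _names[evt.ProductId] = evt.NewName;

        public string GetName(Guid productId) =>
            _names.TryGetValue(productId, out var name) ? name : null;
    }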

Is Hyperledger Fabric considered a centralized blockchain?

Hyperledger has some classic/old-world mechanisms that bring up the question: is it really decentralized?
Having a REST server to communicate with the blockchain resembles the cloud model.
Even though the ledger is distributed, someone calling a REST API may end up written to the server logs with data such as IP address, geo info and more.
So, is Hyperledger Fabric considered a centralized blockchain, or maybe a decentralized one?
Thanks
If you mean decentralised as in not controlled by any one entity, and anonymous, then no, it is not that.
Fabric is the backbone of blockchain applications, and certain things are plug and play. It is not meant to be anonymous; it is meant to be controlled, and only known parties have access to anything.
It's simply an ecosystem which allows anyone to build blockchain-based applications. Imagine a system where your bank holds your money and you want to pay someone. The bank needs to make sure that you are who you say you are and that no one else can authorise payments from your account. That's what the permissions mean. It's not meant to cut out the middleman the way Bitcoin and other cryptocurrencies are. Those, however, are just one implementation of a blockchain system; they are not the only way you can use such a system.
The immutability of the ledger offers certain advantages. Imagine an audit system where every action is recorded and can't be changed. If your audit records are in a SQL database, for example, anyone with access can go in and change or delete that data. What goes on the ledger stays there forever and can't be changed. This doesn't mean that your asset data cannot change - that is a fundamental thing to understand. The underlying data can be changed via a new transaction against the same asset, but the history of the asset remains clearly visible and can't be modified.
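That point - immutable history, mutable current state via new transactions - can be shown with a generic C# sketch (an illustration of the idea, not Fabric's actual chaincode API):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Append-only ledger: history is never modified; the "current"
    // value of an asset is simply its latest entry.
    public record LedgerEntry(string AssetId, string Value, DateTime Timestamp);

    public class Ledger
    {
        private readonly List<LedgerEntry> _entries = new();

        // The only mutation allowed is appending a new transaction.
        public void Append(string assetId, string value) =>
            _entries.Add(new LedgerEntry(assetId, value, DateTime.UtcNow));

        // Current state: the last transaction against the asset wins.
        public string CurrentValue(string assetId) =>
            _entries.LastOrDefault(e => e.AssetId == assetId)?.Value;

        // The full history stays visible and unmodified.
        public IEnumerable<LedgerEntry> History(string assetId) =>
            _entries.Where(e => e.AssetId == assetId);
    }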
In this world, you build something, someone controls it and gives access to other organisations and the people within them, and every action has its source identified.
It is decentralised in the sense that the ledger does not live in one place only, a copy of the ledger exists on every peer that is joined to a channel.
However, it is not meant to be anonymous, all the participants are known and their access level controlled, that's the whole point.

Distributing an Erlang Chat system

I just finished Erlang in Practice screencasts (code here), and have some questions about distribution.
Here's the overall architecture, and what the supervision tree looks like (diagrams not reproduced here).
Reading Distributed Applications leads me to believe that one of the primary motivations is for failover/takeover.
However, is it possible, for example, for the Message Router supervisor and its workers to be on one node and the rest of the system on another, without many changes to the code?
Or should there be 3 different OTP applications?
Also, how can this system be made to scale horizontally? For example, if I realize that my system can handle 100 users and I've identified the Message Router as the main bottleneck, how can I 'just add another node' so that the system can handle 200?
I've developed Erlang apps only during my studies, but generally we had many small processes, each doing only one thing and sending messages to other processes. And the beauty of Erlang is that it doesn't matter whether you send a message within the same Erlang VM, within the same computer, on the same LAN or over the Internet: the call and the reference to the other process always look the same to the developer.
So you really want to have one application for every small part of the system.
That being said, it doesn't make it any simpler to build an application which can scale out. A rule of thumb says that if you want an application to work on 10 times as many nodes, you need to rewrite it, since otherwise the messaging overhead becomes too large. And obviously, when you go from 1 node to 2, you also need to consider it.
So if you have found a bottleneck - the application which is particularly slow when handling too many clients - you'll want to run it a second time, and you need some additional load balancing implemented before you start the second instance.
Let's assume the supervisor checks the message content for inappropriate content and is therefore slow. In this case the node everyone talks to would be a simple router application which forwards the messages to the different instances of the supervisor application in a round-robin manner. If those 1 or 2 instances are not enough, you could write the router in such a way that you can change the number of instances by sending it control messages.
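The round-robin idea itself is language-agnostic. Here is a hedged sketch in C# rather than Erlang (in Erlang each worker would be a process with its own mailbox; a channel stands in for the mailbox here):

    using System;
    using System.Collections.Generic;
    using System.Threading.Channels;
    using System.Threading.Tasks;

    // Each worker drains its own "mailbox"; the router dispatches incoming
    // messages round-robin. Adding capacity means appending another mailbox.
    public class RoundRobinRouter
    {
        private readonly List<Channel<string>> _mailboxes = new();
        private int _next;

        public RoundRobinRouter(int workerCount)
        {
            for (int i = 0; i < workerCount; i++)
            {
                var mailbox = Channel.CreateUnbounded<string>();
                int id = i;
                _mailboxes.Add(mailbox);
                // The worker loop: process messages one at a time.
                Task.Run(async () =>
                {
                    await foreach (var msg in mailbox.Reader.ReadAllAsync())
                        Console.WriteLine($"worker {id} handled: {msg}");
                });
            }
        }

        // Forward each message to the next worker in turn.
        public ValueTask Route(string message)
        {
            var target = _mailboxes[_next];
            _next = (_next + 1) % _mailboxes.Count;
            return target.Writer.WriteAsync(message);
        }
    }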
However, for this to work automatically, you would need another process monitoring the servers and detecting that they are overloaded or underutilized.
I know that dynamically adding and removing resources always sounds great when you hear about it, but as you can see it is a lot of work: you need a messaging system built to allow it, as well as a monitoring system which can detect the need.
Hope this gives you some idea of how it could be done. Unfortunately it's been over a year since I wrote my last Erlang application, and I didn't want to provide code that might be wrong.

Printing from one Client to another Client via the Server

I don't know if it sounds crazy, but here's the scenario -
I need to print a document over the internet. My PC (ClientX) initiates the process, using the web browser to access a server (ServerY) on the internet, and the printer is connected to another client (ClientZ - maybe yours).
1. The document is stored on ServerY.
2. ClientZ is purely a client; no IIS, no print server, etc.
3. I have the specific details of ClientZ, IP, Port, etc.
4. It'll be a completely server-side application (with no client-side piece on ClientZ), in ASP.NET and C#.
So, is it possible? If yes, please give me some clues. Thanks in advance.
This is kind of too big a question for SO, but basically what you need to do is:
1. upload files to the server -- trivial
2. figure out whether the user is allowed to print the document -- trivial to hard, depending on scope
3. add items to a print queue and associate them with a user/session -- easy
4. render and print the document -- trivial to hard, depending on scope
5. notify the user that the document has been printed
6. handle errors
The big unknown here is scope. If this is a school project, you probably don't have to worry about billing or queue priority in step 2. If it's a commercial product, billing can be a significant subsystem in itself.
The difficulty of step 4 depends directly on which formats you are going to support, as many formats require format-specific libraries or applications. There are also security considerations here if this is a commercial product, since it isn't safe to try to render all types of files.
Notifications (step 5) can be easy or hard depending on how you want to do them. You can post back to the HTML page, but depending on how long a job takes to complete, it might be nice to have an email option as well.
You also need to think about errors (step 6). What is going to happen when paper or toner runs out, or when someone tries to print something on A4 paper? Someone has to be notified so that jobs don't just build up.
On the server I would run just the user-interaction piece on the web, and have a "print daemon" running as a service to manage getting the documents printed and to monitor their status. I would use WCF for IPC between the two.
Within the print daemon you are going to need a set of components to print different kinds of documents. I would make one assembly per type (or cluster of types) and load them into your service as plugins using MEF.
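A minimal C# sketch of that plugin shape using MEF - the interface and names are invented for illustration:

    using System;
    using System.Collections.Generic;
    using System.ComponentModel.Composition;
    using System.ComponentModel.Composition.Hosting;

    // One plugin contract per document type. A plugin assembly marks its
    // implementation with [Export(typeof(IDocumentPrinter))].
    public interface IDocumentPrinter
    {
        bool CanHandle(string extension);                // e.g. ".pdf"
        void Print(string filePath, string printerName);
    }

    // The daemon composes every printer plugin found in a directory.
    public class PrintDaemon
    {
        [ImportMany]
        public IEnumerable<IDocumentPrinter> Printers { get; set; }

        public PrintDaemon(string pluginDirectory)
        {
            var catalog = new DirectoryCatalog(pluginDirectory);
            var container = new CompositionContainer(catalog);
            container.ComposeParts(this);   // fills Printers via [ImportMany]
        }

        public void PrintJob(string filePath, string extension, string printerName)
        {
            foreach (var printer in Printers)
            {
                if (printer.CanHandle(extension))
                {
                    printer.Print(filePath, printerName);
                    return;
                }
            }
            throw new NotSupportedException("No plugin for " + extension);
        }
    }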
Sorry this is so general, but you are asking a pretty general and difficult-to-answer question.

Is this a good reason to use a service bus? Alternatives please

I'm in the planning phase of our new site - it's an extension of some mobile apps we've built. We want to provide our users with a central point for communication and also provide features for users who don't want to/can't use the mobile apps. One of the features we're looking at adding is a reputation system similar in nature to the SO badge system. We're designing the system to use SOA.
I don't want to have to code all of this logic into the main app as discrete chunks. I'm thinking of creating a means of defining new thresholds and rules for gaining reputation and having them injected into some service. The two ways I've thought of doing this so far are:
Look for certain traits in a user's actions and respond. This would mean having a service that runs through the 'plugged-in' award definitions, checks for thresholds that have been met, and responds appropriately.
Fire events when the user performs actions, then listen for those events and respond appropriately. Because the services carrying out these actions run in separate app domains, potentially on separate servers, the only way I can see of having a central message bus to listen and respond to these events is to use something like MassTransit, NServiceBus or Rhino.Esb.
I know that a service bus can all too easily be designed into an application that simply doesn't need it, and that most of the time - unless you're integrating disparate, heterogeneous systems - you won't need one when designing a new system. But I'm a bit lost for options as to the best way to do this. I don't like the idea of having a service hammer the DB in the background all the time, even though that sounds a lot simpler early on - later on, I dread to think!
Has anyone here designed a system like this? How did you accomplish this? We're designing for high throughput as we expect there will be times when the system will need to be able to cope with bursts of users.
I've designed a system that had similar requirements. To achieve this the key elements were:
Plugins
Event messaging - using Emesary
The basic concept is that the core is not aware of exactly which module will perform any given task.
The messages are defined, and at points within the system they are dispatched. The sender is not aware whether any recipient needs the message. This effectively decouples vast chunks of the system.
So to perform a job, some code is plugged in that registers with the event-messaging bus and receives messages. When it receives a message that it needs to process, it processes it.
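In spirit, the pattern looks something like this C# sketch (an illustration of the idea, not Emesary's actual API):

    using System.Collections.Generic;

    // A plugged-in module implements this and registers itself.
    public interface IRecipient
    {
        // Handle a notification; return true if it was processed.
        bool Receive(object notification);
    }

    public class Transmitter
    {
        private readonly List<IRecipient> _recipients = new();

        // Plugged-in modules register themselves here.
        public void Register(IRecipient recipient) => _recipients.Add(recipient);

        // The sender never knows which module, if any, handles the message.
        public void NotifyAll(object notification)
        {
            foreach (var recipient in _recipients)
                recipient.Receive(notification);
        }
    }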
The Emesary code is extremely small and efficient. I've called it Emesary, and you're free to use it; it is available from the Emesary project on CodePlex.
As the system becomes more complex, there can be lots of events flying about. If you get more than 20k a second, it was always in my design to add filtering and routing (implemented by extending the recipient interface to let a recipient specify, during registration, which messages it wants to receive). I've never needed to add this filtering, because Emesary is sufficiently efficient that it is the processing of the messages that takes the time.
I've built a version of Emesary which bridges two Notifiers across disparate systems using WCF, CORBA and TCP/IP. I investigated using RabbitMQ and decided it would be possible to use it underneath Emesary if needed.
Base Class Diagram
Scalable server.
This is a fairly complex example; however, it shows where Emesary fits in. In this diagram anything with a drop shadow can have multiple instances, and this is managed outside of what I'm trying to explain here.
