Currently I am working with Orleans and I am wondering what exactly the PrimarySilo is in a cluster? My guess is that if I make a silo the PrimarySilo, by using the following piece of code for example:
...
var primarySiloEndpoint = new IPEndPoint(IPAddress.Loopback, localPrimarySiloEndpoint);
siloHostBuilder.UseDevelopmentClustering(primarySiloEndpoint)
.Configure<EndpointOptions>(options => options.AdvertisedIPAddress = IPAddress.Loopback);
...
Then am I actually configuring which ip:port the cluster itself is listening on, since clients talk to silos and not to clusters? Can a cluster have multiple primary silos, for example? I guess not.
See point 6 in the documentation on configuring a local development cluster. In summary, the term only applies to development clustering: the primary silo is the silo that hosts the in-memory membership table, and every silo in the cluster bootstraps by pointing UseDevelopmentClustering at that silo's endpoint. There can only be one primary silo, and if it goes down the development cluster stops working. Production clustering providers (Azure Table, SQL Server, Consul, etc.) have no notion of a primary silo at all, and clients always connect to silo gateways rather than to a cluster-wide endpoint.
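For concreteness, here is a minimal sketch of a two-silo development cluster; the ports, cluster ids, and exact builder calls are illustrative assumptions, not taken from your code:

// The primary silo hosts the in-memory membership table and seeds itself.
var primaryEndpoint = new IPEndPoint(IPAddress.Loopback, 11111);

var primary = new SiloHostBuilder()
    .UseDevelopmentClustering(primaryEndpoint)
    .Configure<ClusterOptions>(o => { o.ClusterId = "dev"; o.ServiceId = "dev"; })
    .Configure<EndpointOptions>(o => o.AdvertisedIPAddress = IPAddress.Loopback);

// Any other silo joins by pointing at the same primary endpoint; clients still
// connect to each silo's own gateway port, not to a cluster-wide address.
var secondary = new SiloHostBuilder()
    .UseDevelopmentClustering(primaryEndpoint)
    .Configure<ClusterOptions>(o => { o.ClusterId = "dev"; o.ServiceId = "dev"; })
    .Configure<EndpointOptions>(o =>
    {
        o.AdvertisedIPAddress = IPAddress.Loopback;
        o.SiloPort = 11112;       // distinct ports so both silos can run locally
        o.GatewayPort = 30001;
    });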
EDIT:
My question was horrifically put, so I deleted it and rephrased it entirely here.
I'll give a tl;dr:
I'm trying to assign each computation to a designated worker that fits the computation type.
In long:
I'm trying to run a simulation, so I represent it using a class of the form:
from dask.distributed import Client

class Simulation:
    def __init__(self, first_client: Client, second_client: Client):
        self.first_client = first_client
        self.second_client = second_client

    def first_calculation(self, input):
        with self.first_client.as_current():
            # ... do the first computation ...
            return output

    def second_calculation(self, input):
        with self.second_client.as_current():
            # ... do the second computation ...
            return output

    def run(self, input):
        return self.second_calculation(self.first_calculation(input))
This format has downsides, such as the fact that the simulation object is not pickleable.
I could edit the Simulation object to contain only scheduler addresses instead of clients, for example, but I feel as if there must be a better solution. For instance, I would like the simulation object to work the following way:
class Simulation:
    def first_calculation(self, input):
        client = dask.distributed.get_client()
        with client.as_current():
            # ... do the first computation ...
            return output
    ...
The thing is, the Dask workers best suited for the first calculation are different from the Dask workers best suited for the second calculation, which is why my Simulation object has two clients connecting to two different schedulers in the first place. Is there any way to have only one client but two kinds of schedulers, such that the client knows to send first_calculation to the first scheduler and second_calculation to the second one?
Dask will chop up large computations into smaller tasks that can run in parallel. Those tasks are then submitted by the client to the scheduler, which in turn schedules them on the available workers.
Sending the client object to a Dask scheduler will likely not work due to the serialization issue you mention.
You could try one of two approaches:
Depending on how you actually run those worker machines, you could provide different types of workers for different tasks. If you run on Kubernetes, for example, you could try to leverage the node pool functionality to make different worker types available.
An easier approach using your existing infrastructure would be to return the result of your first computation back to the machine from which you are using the client, using something like .compute(), and then use that data as input for the second computation (see the sketch below). In this case you're sending the actual data over the network instead of the client. If the size of that data becomes an issue, you can always write the intermediate results to something like S3.
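For illustration, a minimal sketch of that second approach; the scheduler addresses and the two calculation functions are placeholders, not taken from your setup:

from dask.distributed import Client

first_client = Client("tcp://scheduler-a:8786")   # cluster suited to the first step
second_client = Client("tcp://scheduler-b:8786")  # cluster suited to the second step

def run(data):
    # run the first step on the first cluster and pull the result back locally
    intermediate = first_client.submit(first_calculation, data).result()
    # feed that concrete, already-computed data into the second cluster
    return second_client.submit(second_calculation, intermediate).result()

Only plain data crosses the boundary between the two clusters, so nothing needs to be pickled except the functions and their inputs.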
Dask does support giving specific tasks to specific workers with annotate. Here's an example snippet, where a delayed_sum task was passed to one worker and the doubled task was sent to the other worker. The assert statements check that those workers really were restricted to only those tasks. With annotate you shouldn't need separate clusters. You'll also need the most recent versions of Dask and Distributed for this to work because of a recent bug fix.
import distributed
import dask
from dask import delayed

local_cluster = distributed.LocalCluster(n_workers=2)
client = distributed.Client(local_cluster)
workers = list(client.scheduler_info()['workers'].keys())

with dask.annotate(workers=workers[0]):
    delayed_sum = delayed(sum)([1, 2])

with dask.annotate(workers=workers[1]):
    doubled = delayed_sum * 2

# use persist so the scheduler doesn't clean up the tasks, and wrap in
# distributed.wait to make sure they're there when we check the scheduler
distributed.wait([doubled.persist(), delayed_sum.persist()])

worker_restrictions = local_cluster.scheduler.worker_restrictions
assert worker_restrictions[delayed_sum.key] == {workers[0]}
assert worker_restrictions[doubled.key] == {workers[1]}
I have pointed MassTransit at my AWS account and given it a "scope". This creates one SNS topic per scope. Here's my config with the scope set to "development":
container.AddMassTransit(config =>
{
    config.AddConsumer<FooConsumer>(typeof(FooConsumerDefinition));

    config.UsingAmazonSqs((amazonContext, amazonConfig) =>
    {
        amazonConfig.Host(
            new UriBuilder("amazonsqs://host")
            {
                Host = "eu-north-1"
            }.Uri, h =>
            {
                h.AccessKey("my-access-key-is-very-secret");
                h.SecretKey("my-secret-key-also-secret");
                h.Scope("development", true);
            });

        amazonConfig.ConfigureEndpoints(amazonContext);
    });
});
// Somewhere else:
public class FooConsumerDefinition : ConsumerDefinition<FooConsumer>
{
    public FooConsumerDefinition()
    {
        ConcurrentMessageLimit = 1;
        // I used to set EndpointName, but I stopped doing that to test scopes
    }
}
If I change the scope and run it again, I get more SNS topics and subscriptions, which are prefixed with my scope. Something like:
development_Namespace_ObjectThatFooRecieves
However, the SQS queues aren't prefixed and don't increase in number; there is still just the one queue:
Foo
The more scopes you run under, the more SNS subscriptions 'Foo' gets. On top of that, if I start a consuming application configured with, say, "development", it will start consuming all the messages from all the different scopes. As a consequence, I'm not getting any environment separation.
Is there any way to have these different topics feed different queues? Is there any way to neatly prefix my queues alongside my topics?
In fact, what's the point of the "Scope" configuration if it only separates out topics, which then all go to the same queue to be processed indiscriminately?
NB: I don't think the solution here is to just use a separate subscription. That's significant overhead, and I feel like "Scope" should just work.
MassTransit 7.0.4 introduced new options on the endpoint name formatter, including the option to include the namespace, as well as add a prefix in front of the queue name. This should cover most of your requirements.
In your example above, you could use:
services.AddSingleton<IEndpointNameFormatter>(provider =>
new DefaultEndpointNameFormatter("development", true));
That should give you what you want. You will still need to specify the scope as you have already done, since that is what is applied to the SNS topic names.
SQS support in MassTransit has a long history, and teams hate breaking changes; that is why these options aren't the new defaults, as they would break existing applications.
For my work, we are trying to spin up a Docker Swarm cluster with Puppet. We use puppetlabs-docker for this, which provides docker::swarm. This allows you to instantiate a Docker Swarm manager on your master node, and that works so far.
On the Docker workers you can join the Docker Swarm manager with exported resources:
node 'manager' {
  @@docker::swarm { 'cluster_worker':
    join           => true,
    advertise_addr => '192.168.1.2',
    listen_addr    => '192.168.1.2',
    manager_ip     => '192.168.1.1',
    token          => 'your_join_token',
    tag            => 'docker-join',
  }
}
However, your_join_token needs to be retrieved from the Docker Swarm manager with docker swarm join-token worker -q. This is possible with an Exec resource.
My question is: is there a way (without breaking Puppet's philosophy of idempotency and convergence) to get the output from the join-token Exec and pass it along to the exported resource, so that my workers can join the master?
My question is: is there a way (without breaking Puppet's philosophy of idempotency and convergence) to get the output from the join-token Exec and pass it along to the exported resource, so that my workers can join the master?
No, because the properties of resource declarations, exported or otherwise, are determined when the target node's catalog is built (on the master), whereas the command of an Exec resource is run only later, when the fully-built catalog is applied to the target node.
I'm uncertain about the detailed requirements for token generation, but possibly you could use Puppet's generate() function to obtain one at need, during catalog building on the master.
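A rough sketch of that idea (the wrapper script is an assumption: generate() runs its command on the Puppet server while the catalog is compiled, so the script would itself have to reach the swarm manager, e.g. over ssh, and print the worker join token):

# site.pp, manager node -- sketch only, not tested
$worker_join_token = generate('/usr/local/bin/swarm-worker-token.sh')

@@docker::swarm { 'cluster_worker':
  join       => true,
  manager_ip => '192.168.1.1',
  token      => $worker_join_token,  # may need stdlib's chomp() to drop a trailing newline
  tag        => 'docker-join',
}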
Update
Another alternative would be an external (or custom) fact. This is the conventional mechanism for gathering information from a node to be used during catalog building for that node, and as such, it might be more suited to your particular needs. There are some potential issues with this, but I'm unsure how many actually apply:
The fact has to know for which nodes to generate join tokens. This might be easier / more robust or trickier / more brittle depending on factors including
whether join tokens are node-specific (harder if they are)
whether it is important to avoid generating multiple join tokens for the same node (over multiple Puppet runs; harder if this is important)
notwithstanding the preceding, whether there is ever a need to generate a new join token for a node for which one was already generated (harder if this is a requirement)
If implemented as a dynamically-generated external fact -- which seems a reasonable option -- then when a new node is added to the list, the fact implementation will be updated on the manager's next puppet run, but the data will not be available until the following one. This is not necessarily a big deal, however, as it is a one-time delay with respect to each new node, and you can always manually perform a catalog run on the manager node to move things along more quickly.
It has more moving parts, with more complex relationships among them, hence there is a larger cross-section for bugs and unexpected behavior.
Thanks to @John Bollinger I seem to have fixed my issue. In the end it was a bit more work than I envisioned, but this is the idea:
My puppet setup now uses PuppetDB for storing facts and sharing exported resources.
I have added an additional custom fact to the code base of Docker (in ./lib/facter/docker.rb).
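A rough sketch of what that fact can look like; the exact code in the tweaked module may differ, but the fact name has to match the $worker_join_token referenced in the manifest below:

# lib/facter/docker.rb (addition) -- sketch only
Facter.add(:worker_join_token) do
  confine { Facter::Core::Execution.which('docker') }
  setcode do
    token = Facter::Core::Execution.execute(
      'docker swarm join-token worker -q', on_fail: nil
    )
    token unless token.nil? || token.empty?
  end
end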
The bare minimum in the site.pp file now contains:
node 'manager' {
  docker::swarm { 'cluster_manager':
    init           => true,
    advertise_addr => "${::ipaddress}",
    listen_addr    => "${::ipaddress}",
    require        => Class['docker'],
  }

  @@docker::swarm { 'cluster_worker':
    join       => true,
    manager_ip => "${::ipaddress}",
    token      => "${worker_join_token}",
    tag        => "cluster_join_command",
    require    => Class['docker'],
  }
}

node 'worker' {
  Docker::Swarm <<| tag == 'cluster_join_command' |>> {
    advertise_addr => "${::ipaddress}",
    listen_addr    => "${::ipaddress}",
  }
}
Do keep in mind that for this to work, puppet agent -t has to be run twice on the manager node, and once (after this) on the worker node. The first run on the manager will start the cluster_manager, while the second one will fetch the worker_join_token and upload it to PuppetDB. After this fact is set, the manifest for the worker can be properly compiled and run.
In the case of a different module, you have to add the custom fact yourself. When I was researching how to do this, I first added the custom fact to Ruby's LOAD_PATH, but was unable to find it in my PuppetDB. After some browsing I found that it is facts shipped inside a module (synced to the node via pluginsync) that get uploaded to PuppetDB, which is why I tweaked the upstream Docker module.
I have a Rails web application. I want to create a class that takes an email address, say "matt@trucksandstuff.com", parses out the domain, and then checks whether the domain is listed in the Spamhaus DBL. I am having no luck with the dig or host commands as described on their website, and the Charon gem doesn't seem to work with their sample domain either. Any ideas?
EDIT: Here is what is on the website:
In response to "How can I test the DBL?" they said:
First, the DBL follows RFC5782 for determining whether a URI zone is operational with an entry for TEST. Second, the DBL has a specific domain for testing DBL applications: dbltest.com. To test functionality of the DBL use the host or dig command to do a manual query. (If you need to look up a domain in the DBL via the web, use the domain lookup form at our Blocklist Removal Center. Do not query our website with automated tools.).
I have tried using the Charon gem, which I think should be as simple as running
Charon.query('dbltest.com')
with variations that remove the parentheses, add a space, etc.
Also tried
resolver = Resolv::DNS.new
name = 'dbltest.com'
resolver.getresources("#{name}.zen.spamhaus.org", Resolv::DNS::Resource::IN::A)
in the Rails console.
The Zen database is only for IP addresses; the DBL list is for hostnames. Therefore Charon (which queries Zen) only works with IP addresses. To test hostnames, query them with Resolv against dbl.spamhaus.org:
require 'resolv'

def is_spammer?(host)
  !Resolv::DNS.new.getresources("#{host}.dbl.spamhaus.org",
                                Resolv::DNS::Resource::IN::A).empty?
end

is_spammer?('dbltest.com')
# => true
is_spammer?('google.com')
# => false
I am trying to bind to two different SMSCs over SMPP 3.4 from a single Rails application using the ruby-smpp gem.
Following the example included in the documentation of this gem, I have two configuration blocks pointing to the different ISPs, i.e.
config_1 = {
#.......
}
config_2 = {
#.......
}
I go on to declare and run two instances of the gateway as shown below:
gw_1 = SampleGateway.new
gw_1.start(config_1)
gw_2 = SampleGateway.new
gw_2.start(config_2)
I am able to bind to the respective ISPs, but the problems I am experiencing are as follows:
Whenever one of the binds is lost (i.e. on unbound), both ISP connections are lost.
When I send an SMS through a particular ISP, at least twice that number of SMSes get sent through that ISP (i.e. if I send 1 SMS through ISP1, 2 SMSes are delivered to the handset).
Any ideas as to how I can prevent the above from happening, or should I connect to the ISPs from two different Rails apps?
The sample gateway provided by the project is not suitable for your use case. If you look at https://github.com/raykrueger/ruby-smpp/blob/master/examples/sample_gateway.rb#L64 you'll see the EventMachine connection is stored in a class variable, which means your second call, gw_2.start(config_2), will just overwrite the first one.
You should probably orient yourself on the basic usage shown at https://github.com/raykrueger/ruby-smpp and write your own gateway, for example along the lines of the sketch below.
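A rough, untested sketch of such a gateway: the class name, config keys, and delegate callbacks mirror the ruby-smpp sample gateway, but treat the details as assumptions and adapt them to your code. The key points are that the transceiver lives in an instance variable and that a single EventMachine reactor hosts both binds:

require 'smpp'

class IsolatedGateway
  def start(config)
    @config = config
    # per-instance transceiver instead of the sample's class variable
    @transceiver = EventMachine.connect(
      config[:host], config[:port], Smpp::Transceiver, config, self
    )
  end

  def send_mt(message_id, sender, receiver, body)
    @transceiver.send_mt(message_id, sender, receiver, body)
  end

  # ruby-smpp delegate callbacks
  def bound(transceiver); end

  def unbound(transceiver)
    # do not stop the EventMachine loop here, or the other bind dies with it
  end

  def mo_received(transceiver, pdu); end
  def delivery_report_received(transceiver, pdu); end
  def message_accepted(transceiver, mt_message_id, pdu); end
  def message_rejected(transceiver, mt_message_id, pdu); end
end

# One reactor, two independent binds:
EventMachine.run do
  gw_1 = IsolatedGateway.new
  gw_1.start(config_1)

  gw_2 = IsolatedGateway.new
  gw_2.start(config_2)
end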