Django Channels: can you check the number of sockets inside a room/channel_layer?

Same as the question itself. How can you check the number of live sockets inside a room or a channel_layer if you are using Django Channels?

You can't do this directly from the generic channel layer API. If you're using Redis, you could look into the Redis API and check how many subscriptions are open.
This could be done using this command:
https://redis.io/commands/client-list
(This might be quite slow and costly if you have lots of open connections to your Redis cluster.)
You will need to convert the group name into the group key in the same way the Redis layer does it; see here:
https://github.com/django/channels_redis/blob/master/channels_redis/core.py#L582
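For illustration, the default channels_redis layer keeps each group's membership in a Redis sorted set whose key is built from the layer prefix and the group name, so counting the members of that set gives the number of channels currently added to the group. A minimal sketch, assuming the default asgi prefix and that this internal key format hasn't changed:

```python
# Hedged sketch: count the channels registered in a Channels group by reading the
# sorted set that channels_redis keeps per group. The "<prefix>:group:<name>" key
# format is an internal detail of channels_redis and may change between versions.
import redis

def count_group_channels(group_name: str,
                         prefix: str = "asgi",
                         url: str = "redis://localhost:6379/0") -> int:
    r = redis.Redis.from_url(url)
    return r.zcard(f"{prefix}:group:{group_name}")

# Usage: count_group_channels("chat_lobby")
```

Note that this counts the channel names currently added to the group (roughly one per connected consumer); entries for connections that dropped without being discarded can linger until the group expiry.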

Related

How to send image data via different microservices with Redis

I wanted to ask what options make sense with Redis, as I am unsure about Redis Pub/Sub in particular. Suppose I have a service A (Java client) that processes images. Unfortunately it can't process all kinds of images (because the language/framework doesn't support it yet). This is where service B comes into play (Node.js).
Service A streams the image bytes to Redis. Service B should read these bytes from Redis and encode them into the correct format, then stream the result back to Redis, and Service A is somehow notified to read the result from Redis.
There are two strategies I am considering:
1. Use the Pub/Sub feature of Redis. Service A writes the chunks to Redis (e.g., via writeStream) and then, as publisher, publishes certain metadata to Service B (and its replicas) as subscribers. Service B then reads the stream (locking it for other replicas), processes it, and streams the result back to Redis. It then publishes a message to Service A saying the result can be fetched from Redis.
2. Put everything, metadata and bytes, directly into Redis Pub/Sub and then proceed as in 1). But how do I then lock the message for other replicas of B? I want to avoid them all processing the same image.
So my question is:
Does the Pub/Sub feature of Redis allow strategy no. 2 in terms of performance, or is it intended exclusively for "lightweight" messages such as log data, metadata, and IDs?
And if Redis in general is not a good solution for this approach, which one is? Async REST endpoints?
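To make the first strategy concrete, here is a rough sketch with redis-py: the heavy bytes go under a plain key, only small metadata goes through Pub/Sub, and a SET NX key acts as the lock that keeps other replicas of B away. All key, channel, and function names are invented for illustration, and encode() is a placeholder for the actual conversion.

```python
# Hedged sketch of strategy 1; names are placeholders, not an existing API.
import json
import redis

r = redis.Redis()

def submit_image(job_id: str, image_bytes: bytes) -> None:
    # Service A: store the heavy payload under a key, announce it with metadata only.
    r.set(f"image:input:{job_id}", image_bytes, ex=3600)
    r.publish("image-jobs", json.dumps({"job_id": job_id}))

def on_job_message(message: dict) -> None:
    # Service B replica: SET NX acts as a lock so exactly one replica claims the job.
    job_id = json.loads(message["data"])["job_id"]
    if not r.set(f"image:lock:{job_id}", "replica-1", nx=True, ex=600):
        return  # another replica got there first
    raw = r.get(f"image:input:{job_id}")
    result = encode(raw)
    r.set(f"image:result:{job_id}", result, ex=3600)
    r.publish("image-results", json.dumps({"job_id": job_id}))

def encode(raw: bytes) -> bytes:
    return raw  # placeholder for the real conversion done by service B
```

Whether Pub/Sub itself should carry the raw bytes (strategy 2) is a separate question; this sketch sidesteps it by keeping the payload out of the message entirely.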

What would Kafka do if producer goes down?

I'm a bit confused about Kafka architecture. We would like to capture the Twitter Streaming API. We came across this Twitter Producer: https://github.com/NFLabs/kafka-twitter/blob/master/src/main/java/com/nflabs/peloton2/kafka/producer/TwitterProducer.java
What I'm thinking about is how to design the system so it's fault tolerant.
If the producer goes down, does it mean we lose some of the data? How to prevent this from happening?
If the producer you linked to stops running, new data from the Twitter API will not make its way into Kafka. I'm not sure how the Twitter Streaming API works, but it may be possible to get historic data, allowing you to fetch all data back to the point when the producer failed.
Another option is to use Kafka Connect, which is a distributed, fault tolerant service for connecting data sources and sinks to Kafka. Connect exposes a higher-level API and uses the out-of-the-box producer/consumer API behind the scenes. The documentation explains Connect very thoroughly, so give that a read and go from there.
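As a rough illustration of the replay idea, a producer could checkpoint the ID of the last tweet it successfully handed to Kafka and, on restart, backfill from that point, assuming the Twitter API lets you re-fetch tweets newer than a given ID. Everything below (topic name, checkpoint file, fetch_since helper) is hypothetical, not part of the linked producer:

```python
# Hedged sketch using kafka-python; fetch_since() is a made-up stand-in for a
# Twitter API call that returns tweets newer than the given ID.
import json
from pathlib import Path
from kafka import KafkaProducer

CHECKPOINT = Path("last_tweet_id.txt")
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    acks="all",  # wait for full broker acknowledgement
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def fetch_since(last_id: int):
    """Hypothetical stand-in for fetching tweets newer than last_id."""
    return []

def publish_tweets(tweets) -> None:
    for tweet in tweets:
        producer.send("tweets", tweet).get(timeout=30)  # block until acked
        CHECKPOINT.write_text(str(tweet["id"]))         # checkpoint only after success

# On restart, backfill whatever was missed while the producer was down.
if CHECKPOINT.exists():
    publish_tweets(fetch_since(int(CHECKPOINT.read_text())))
```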

How do I retrieve data from statsd?

I'm going through their documentation here:
http://www.rubydoc.info/github/github/statsd-ruby/Statsd
There are methods for recording data, but I can't seem to find anything about retrieving recorded data. I'm adopting a project with an existing statsd addition. Its host is likely a defunct URL. Perhaps that host is where those stats are recorded?
The statsd server implementations that Mircea links to just take care of receiving and aggregating metrics and publishing them to a backend service. Etsy's statsd definition (emphasis mine):
A network daemon that runs on the Node.js platform and listens for statistics, like counters and timers, sent over UDP or TCP and sends aggregates to one or more pluggable backend services (e.g., Graphite).
To retrieve the recorded data you have to query the backend. Check the list of available backends. The most common one is Graphite.
See also this question: How does StatsD store its data?
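If Graphite is the backend, a minimal sketch of reading the data back might use its render API; the host and metric path below are placeholders (the stats.counters.<name>.count path follows statsd's usual default naming, which can differ per configuration):

```python
# Hedged sketch: query the Graphite render API for metrics that statsd flushed to it.
import requests

resp = requests.get(
    "http://graphite.example.com/render",
    params={
        "target": "stats.counters.my_app.page_views.count",  # hypothetical metric path
        "from": "-1h",      # last hour
        "format": "json",
    },
)
for series in resp.json():
    print(series["target"], series["datapoints"][:5])  # [value, timestamp] pairs
```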
There are 2 parts to statsd: a client and a server.
What you're looking at is the client part. You will not see functionality for retrieving the data because it's not there; retrieval normally happens on the server side.
Here is a list of statsd server implementations:
http://www.joemiller.me/2011/09/21/list-of-statsd-server-implementations/
Research and pick one that fits your needs.
Statsd originally started at etsy: https://github.com/etsy/statsd/wiki

Independent publisher/subscriber in different threads via RabbitMQ in Rails

I need to implement an independent publisher/subscriber via RabbitMQ in Rails (planning to use the amqp gem), but I need the publisher to work in one thread and the subscriber in another, and they must not depend on each other.
Currently I'm using the amqp gem, but it requires that messages are sent and consumed inside a single EventMachine code block. So my question is: how do I avoid this and make them completely independent?
It looks like a possible approach is to run the publisher and the subscriber in different processes (not threads).
You can also use DDS for this. Different processes/applications can communicate using the Data Distribution Service (DDS), an Object Management Group standard. Check this link
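To illustrate the separate-processes suggestion in the first answer, here is a rough sketch in Python with pika rather than Ruby/amqp, purely to show the shape: the publisher and the subscriber are two standalone scripts started as independent processes, so neither blocks or depends on the other. The queue name is invented.

```python
# publisher.py -- hedged illustration; run as its own process
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.queue_declare(queue="events")
channel.basic_publish(exchange="", routing_key="events", body=b"hello")
conn.close()
```

```python
# subscriber.py -- run as a separate, independent process
import pika

def handle(ch, method, properties, body):
    print("received:", body)

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.queue_declare(queue="events")
channel.basic_consume(queue="events", on_message_callback=handle, auto_ack=True)
channel.start_consuming()  # blocks, but only in this process
```

In a Rails app the equivalent would typically be two standalone scripts or daemons (for example a Bunny-based consumer) started independently of the web process.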

DataSift multiple threads in the same connection

We are trying out an option of having multiple parallel threads, meaning multiple consumers for different sets of keywords, over a single established connection. Is this possible, or should we establish a new connection if we want to consume another set of data? Can anyone guide us on this?
Regards,
Balaji D
This is documented on DataSift's Multiple Streaming docs page. Some of the official API Client Libraries also support multi-streaming over WebSockets.
