I'm working with a PHP frontend which connects to a distributed back end, using Amazon SQS and a variety of message types and message consumers. I'm trying to come up with a way to safely debug those consumers, as we don't want message handlers with new, untested code consuming end-user messages, risking the messages being lost or incorrectly processed.
The actual message queue names are hardcoded as PHP constants in a class, so my first tactic was to create two different sets of queues, one for production and another for debugging, and to externalise the queue name constants into two different files. Depending on whether our debug condition is true or not, I wanted to include one or the other of those constant definitions and assign the constants in the included file to the class constants which currently have the names hardcoded.
This doesn't seem to work, though, because constants seem to act like class variables in PHP, whereas I am trying to assign the values like instance variables. The next tactic was to see if there was anything on Amazon's side that would allow us to debug our message consumers transparently, without adding lots of hacks to our code, but I couldn't see anything there that facilitated this. I'd love to know if anyone else has experienced (and, ideally, solved) this problem.
SQS doesn't provide a way to inspect the contents of messages in the queue, or for the sender to see if any consumers are failing to process messages.
A common approach to this problem would be to set up two sets of queues as you suggest and have the producer post the same message onto both queues. That way you can debug your code against a stream of production messages without affecting the actual production queue.
I'd recommend moving the decision of which queue to use out of your code and into config, and then deploying different config files to your development boxes vs your production boxes. The risk is always that a development box ends up talking to production systems, so having a single consistent approach to configuring those endpoints across all your code is much less risky than doing it on an ad-hoc basis each time you call out to a service.
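As a concrete illustration of the config-driven approach, here's a minimal sketch (written in Ruby with the aws-sdk-sqs gem purely because it's compact; the same pattern applies with the PHP SDK, and the environment-variable names and queue URLs below are assumptions):

    require 'aws-sdk-sqs'

    # Queue URLs come from per-environment config (here, an environment variable),
    # never from constants baked into the code.
    QUEUE_URLS = ENV.fetch('ORDER_QUEUE_URLS').split(',')  # production URL, optionally plus a debug URL

    sqs = Aws::SQS::Client.new(region: ENV.fetch('AWS_REGION', 'us-east-1'))

    # Fan the same message out to every configured queue, so a debug consumer
    # can read a copy of the production stream without touching the real queue.
    message_body = '{"type":"order_created","id":123}'
    QUEUE_URLS.each do |queue_url|
      sqs.send_message(queue_url: queue_url, message_body: message_body)
    end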
I'd also recommend putting your production and development queues in different AWS accounts with different access credentials. That way you can give your production account permission to publish to the development account's queue, but you can guarantee that your development systems can't read from the production queue.
I had a Python application which, with the help of some shell tricks, was able to email me when new error messages appeared in the log during an execution session, scheduled with cron.
Now I am packaging it in Docker and have been able to reproduce most of its functionality with docker-compose.
But when it comes to emails on failures, I am not sure what the best way to implement it is.
What are your suggestions? Are there any best practices?
Update:
The app runs a couple of times a day. Previously, all prints to stderr were duplicated to stdout to preserve chronological order in the main log file. Then, the wrapper script would accumulate all stderr from a single session in another, temporary file. If that file was not empty after the session, its content was sent in a single email from me to myself through SMTP with proper authentication. I have been happy to receive them, and able to act on them, for the last few months.
Right now I see three possible solutions:
Duplicate everything worth sending to a temporary file right in the app; this way docker logs would still capture it. Then send it after the session from the entrypoint, provided there is a way to set up all the requirements inside the container.
Grep the docker log from the outside. But that somewhat misses the point of Docker.
Relay reports over the local network to another container, with something like https://hub.docker.com/r/juanluisbaptiste/postfix/, which would then send them in an email.
I was not able to properly set up postfix or use the mail utility inside the container, but Python seems to work just fine: https://realpython.com/python-send-email/
TL;DR look at NewRelic's free tier.
There is a lot to unpack here. First, it would be helpful to know more about what you had been doing previously with the backticks; more information about the commands involved might change how I'd respond to this. Without that, I may make some assumptions that are incorrect or not applicable.
I would consider errors via email a bad idea for a few reasons:
If many errors occur quickly, your inbox will get flooded with emails, and/or you will flood a mail server with messages or the network with traffic. It's your mailbox and your network, so you can do what you want, but it tends to fail dramatically when it happens, especially in production.
To send an email you need to route those messages through an SMTP server/gateway/relay of some sort, and automated scripts like this often get blocked because they trigger spam detection. When that happens and there is an error, the messages get silently dropped and production issues go unreported. Again, it's your data and your errors and you can do that if you want, but I wouldn't consider it best practice as far as reliability goes. I've seen it fail to alert many times in my past experience for this very reason.
It's been my experience (over 20 years in the field) that any alerting sent via email quickly gets routed to a sub-folder via a mail rule and starts getting ignored because it is noisy. As soon as people start to ignore the messages, serious errors get lost in the inbox along with them and go unnoticed.
De-duplication of error messages isn't built in, and you could get hundreds or thousands of emails in a few seconds, with the one meaningful error buried like a needle in a haystack of emails.
So I'd recommend not using email for those reasons. If you were really set on using email, you could have a script running inside the container that tails the error log and sends the emails off using some SMTP client (which could be installed in the Docker container). You'd likely have to set up credentials depending on your mail server, but it could be done. Again, I'd suggest against it.
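For what it's worth, the shape of such a script is roughly the following (a hedged sketch, written in Ruby only because that's the language of the other snippets on this page; the paths, addresses and relay host are placeholders, and the Python smtplib version would look much the same):

    require 'net/smtp'

    ERROR_LOG = '/tmp/session_errors.log'  # where the session's stderr was accumulated
    errors = File.exist?(ERROR_LOG) ? File.read(ERROR_LOG) : ''

    unless errors.empty?
      message = <<~MAIL
        From: alerts@example.com
        To: me@example.com
        Subject: Errors from the last container run

        #{errors}
      MAIL

      # Assumes a relay (for example the postfix container mentioned in the
      # question) reachable on the local network; add credentials/STARTTLS
      # as your mail server requires.
      Net::SMTP.start('mail-relay', 25) do |smtp|
        smtp.send_message(message, 'alerts@example.com', 'me@example.com')
      end
    end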
Instead, I'd recommend having the logs sent to something like AWS SQS or AWS CloudWatch, where you can set up rules to alert (via SNS, which supports email alerting) if there are N messages in M minutes. You can configure those thresholds as you see fit, and it can also handle de-duplication.
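To make the CloudWatch route concrete, the alerting piece is just an alarm on a metric that counts error lines. A rough sketch with the aws-sdk-cloudwatch gem (it assumes you have already attached a metric filter to the container's log group that increments an AppErrors metric, and every name and ARN below is a placeholder):

    require 'aws-sdk-cloudwatch'

    cloudwatch = Aws::CloudWatch::Client.new(region: 'us-east-1')

    # Email me (via an SNS topic with an email subscription) if 5 or more
    # errors are logged within a single 5-minute window.
    cloudwatch.put_metric_alarm(
      alarm_name:          'app-errors',
      namespace:           'MyApp',
      metric_name:         'AppErrors',
      statistic:           'Sum',
      period:              300,
      evaluation_periods:  1,
      threshold:           5,
      comparison_operator: 'GreaterThanOrEqualToThreshold',
      treat_missing_data:  'notBreaching',
      alarm_actions:       ['arn:aws:sns:us-east-1:123456789012:alert-me']
    )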
If you didn't want to use AWS (or some other cloud provider), you could perhaps use something like ElastiCache to store the events, with a script to check for recent events and perform de-duplication.
There are plenty of 3rd party companies that will take this data and give you an all-in-one solution for storing the logs and a nice dashboard with custom notifications (via email/SMS/etc.), some of which are free. NewRelic comes to mind and is free assuming you don't need log retention.
I don't work for any of these companies; they're just some tools I've worked with that I'd consider using before I tried to roll a cron job to send SMTP messages (although I've done that several times in my career when I needed something quick and dirty).
I want to build a service that notifies me when a URL returns status 200. I'm currently using a Sidekiq worker: if the status == 200, it updates my database (row.available = true); if not, it raises an exception and retries the worker in n seconds, up to n times.
Though this works, it doesn't feel efficient or scalable (thousands of checks would result in thousands of exceptions, and on certain platforms that's bad news -- JRuby), and I'm sure there is a way to build an internal service to manage this URL monitoring that doesn't rely on Sidekiq (perhaps in Go, or another, better-suited Ruby gem). However, I have no idea where to begin, so I'd appreciate some general direction.
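For reference, the current worker is roughly this (simplified sketch; the Row class and column names are stand-ins for my actual models):

    require 'sidekiq'
    require 'net/http'

    class UrlCheckWorker
      include Sidekiq::Worker
      sidekiq_options retry: 10  # Sidekiq re-runs the job when perform raises

      def perform(row_id)
        row = Row.find(row_id)   # Row is a stand-in for the ActiveRecord model
        response = Net::HTTP.get_response(URI(row.url))
        if response.code == '200'
          row.update(available: true)
        else
          raise "#{row.url} returned #{response.code}"  # raising triggers a retry
        end
      end
    end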
Writing and running a simple link checker is easy. Doing that for 1000s of links quickly, without redundancy, and handling dead and slow-responding links without bogging down your entire system gets harder.
I'd use three threads, plus two queues:
A dispatcher thread that only reads from the database. It is responsible for finding URLs to be checked and pushing them into a "to be checked" queue.
A worker thread that consumes from the first queue and pushes results into the "updated URL results" queue.
An updater/consumer thread that takes the result of a thread in #2 and updates the database.
Ruby has some built-in classes to help:
Thread
Queue
I'd highly recommend Typhoeus and Hydra for use in the middle thread. The documentation for those two classes covers a lot of what you need to do as far as running multiple requests in parallel.
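A compressed sketch of that pipeline, with Hydra doing the fetching in the middle thread (a rough outline, not production code; the two helper methods stand in for whatever data layer you use):

    require 'typhoeus'

    # Stand-ins for your real data-access code (Active Record, Sequel, ...).
    def fetch_urls_to_check
      ['https://example.com/', 'https://example.org/']
    end

    def record_result(url, code)
      puts "#{url} -> #{code}"   # e.g. update the row's "available" flag here
    end

    to_check = Queue.new   # URLs waiting to be checked
    results  = Queue.new   # [url, status code] pairs waiting to be written back

    dispatcher = Thread.new do          # 1. only reads from the database
      fetch_urls_to_check.each { |url| to_check << url }
      to_check << :done
    end

    worker = Thread.new do              # 2. checks URLs; Hydra runs them in parallel
      hydra = Typhoeus::Hydra.new(max_concurrency: 20)
      while (url = to_check.pop) != :done
        request = Typhoeus::Request.new(url, method: :head, followlocation: true, timeout: 10)
        request.on_complete { |response| results << [url, response.code] }
        hydra.queue(request)
      end
      hydra.run                         # blocks until every queued request finishes
      results << :done
    end

    updater = Thread.new do             # 3. the only thread that writes to the database
      while (item = results.pop) != :done
        url, code = item
        record_result(url, code)
      end
    end

    [dispatcher, worker, updater].each(&:join)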
I wouldn't write this code as part of a Rails application. There is no value added by Rails to this, nor is it necessary. I would either require Active Record and piggy-back on the existing database.yaml settings and models, or use Rails' "runner" to run the code as an adjunct to the Rails code.
Or, I'd write a small, application-specific, piece of code to run on a different server to avoid bogging down the Rails server. Using something like MySQL or PostgreSQL drivers would let you talk to the same database that Rails uses. In this case I'd use the Sequel gem to act as the ORM, but that's because I prefer it over Active Record.
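The Sequel side of that is only a few lines (a sketch; the connection string, table and column names are assumptions to be adapted to the Rails schema):

    require 'sequel'

    # Point Sequel at the same database the Rails app uses.
    DB = Sequel.connect(ENV.fetch('DATABASE_URL', 'postgres://localhost/myapp_production'))

    urls = DB[:urls]   # the table behind the Rails Url model

    # Mark a link as available after a successful check.
    urls.where(id: 42).update(available: true, checked_at: Time.now)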
There are a lot of things to consider as you write this code, including retrying failed URLs, detecting redirections and updating the source URLs to reflect them so you don't waste time, and not beating up the hosting servers, which could get you banned.
I've written several apps for this purpose over the years and doing it right takes a lot of forethought, so think out your design up front otherwise you could end up with some major rewrites later on.
I am curious as to how to proceed with this issue; I currently have a DataSnap server set up with a TDSAuthenticationManager class managing the authentication.
If an authentication fails, is it safe for me to write directly onto a form TMemo or something similar for logging purposes? What's the best way to observe this?
Do I need threading?
Cheers for reading,
Adrian
Yes, you need synchronization, since DataSnap events run in the context of different threads, and, as you may know, UI programming is limited to the main thread.
So, if you want to display something in the UI, you have to take care of how to do it.
On the other hand, if you want to log to a file, you don't need synchronization, but you have to be careful, since it is possible for two different threads to try to log at the same time.
The options I would evaluate are:
Protect the access to the log file using a Critical Section, thus avoiding the multi-thread access with a lock. Only one thread can access the file at a time and all other interested threads have to wait.
Create a new logging class with a global instance that accepts log requests by simply adding the message to a (thread-safe) queue in memory, and that runs its own thread which writes the messages to a file whenever the queue is not empty.
Since servers tend to run as services in production environments, I would choose the latter.
I have a rails app using delayed_job. I need my jobs to communicate with each other for things like "task 5 is done" or "this is the list of things that need to be processed for task 5".
Right now I have a special table just for this, and I always access it inside a transaction. It's working fine. I want to build out a cleaner API/DSL for it, but first I wanted to check whether there were existing solutions for this already. Weirdly, I haven't found a single thing; I'm either googling completely wrong, or the task is so simple (set and get values inside a transaction) that no one has abstracted it out yet.
Am I missing something?
clarification: I'm not looking for a new queueing system, I'm looking for a way for background tasks to communicate with one another. Basically just safely shared variables. Do the below frameworks offer this facility? It's a shame that delayed job does not.
use case: "do these 5 tasks in parallel, and then when they are all done, do this 1 final task." So, each of the 5 tasks checks to see if it's the last one, and if it is, it fires off the final task.
I use resque. Also there are lots of plugins, which should make inter-process comms easier.
Using redis has another advantage: you can use the pub-sub channels for communication between workers/services.
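To make that concrete for the "five tasks, then one final task" use case, here is a hedged sketch with the redis gem (the channel name and the final-task hook are made up): each task publishes a completion event, and a small listener fires the final task once it has heard from all five.

    require 'redis'

    # Inside each of the five background jobs, after the work is done:
    Redis.new.publish('task_events', 'task finished')

    # A small coordinator process listening for those events:
    redis    = Redis.new
    finished = 0

    redis.subscribe('task_events') do |on|
      on.message do |_channel, _message|
        finished += 1
        if finished == 5
          puts 'all five tasks done'   # enqueue the final task here
          redis.unsubscribe
        end
      end
    end

An even simpler variant that avoids a listener process is an atomic counter: each task calls INCR on a shared key when it finishes, and whichever task sees the count reach five enqueues the final job itself.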
Another approach (but untested by me): http://www.zeromq.org/, which also has Ruby bindings. If you like to test new stuff, then try ZeroMQ.
Update
To clarify/explain/extend my comments above:
The reason I would switch from DelayedJob to Resque is the advantage mentioned above: you get the queue and the messages in one system, because Redis offers both.
Further sources:
https://github.com/blog/542-introducing-resque
https://github.com/defunkt/resque#readme
If I had to stay on DJ, I would extend the worker classes with Redis or ZeroMQ/0MQ (only examples here) to get messaging into my existing background jobs.
I would not try messaging with ActiveRecord/MySQL (not even queueing, actually!) because this DB isn't the best-performing system for this use case, especially if the application has many background workers, huge queues, and countless message exchanges in a short time.
If it is a small app with few workers, you could also implement simple messaging via the DB, but even here I would prefer memcached instead; messages are short-lived chunks of data which can be handled entirely in memory.
Shared variables will never be a good solution. Think of multiple machines where your application and your workers can live. How would you ensure a safe variable transfer between them?
Okay, someone could mention DRb (Distributed Ruby), but it doesn't seem to be widely used anymore (I've never seen a real-world example so far).
If you want to play around with DRb however, read this short introduction.
My personal preference order: Messaging (real) > Database driven messaging > Variable sharing
memcached
rabbitmq
You can use Pipes:
    require 'timeout'   # Timeout lives in the standard library but must be required

    reader, writer = IO.pipe

    # Child process: writes a marshalled payload down the pipe twice a second.
    fork do
      loop do
        payload = { name: 'Kris' }
        writer.puts Marshal.dump(payload)
        sleep(0.5)
      end
    end

    # Parent process: reads and unmarshals messages, giving up after one
    # second of silence on each attempt.
    loop do
      begin
        Timeout::timeout(1) do
          puts Marshal.load(reader.gets) # => { name: 'Kris' }
        end
      rescue Timeout::Error
        # no-op, no messages to receive
      end
    end
This is one-way communication, read as a byte stream. Pipes come as a pair, a reader and a writer; to get two-way communication you need two sets of pipes.
I'm writing a Rails web service that interacts with various pieces of hardware scattered throughout the country.
When a call is made to the web service, the Rails app then attempts to contact the appropriate piece of hardware, get the needed information, and reply to the web client. The time between the client's call and the reply may be up to 10 seconds, depending upon lots of factors.
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
I basically see two options. Either run JRuby and use multithreading, or run several regular Ruby instances and hope that not many people try to use the service at a time. JRuby seems like the much better solution, but it still doesn't seem to be mainstream or to have out-of-the-box support at Heroku and EngineYard. The multiple-instance solution seems like a total kludge.
1) Am I right about my two options? Is there a better one I'm missing?
2) Is there an easy deployment option for JRuby?
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
From an engineering perspective, this seems like it would be the best alternative.
Why don't you want to do it?
There's a third option: If you host your Rails app with Passenger and enable global queueing, you can do this transparently. I have some actions that take several minutes, with no issues (caveat: some browsers may time out, but that may not be a concern for you).
If you're worried about browser timeouts, or you cannot control the deployment environment, you may want to process the request in the background (a sketch follows the list below):
User requests data
You enter request into a queue
Your web service returns a "ticket" identifier to check the progress
A background process processes the jobs in the queue
The user polls back, referencing the "ticket" id
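A minimal sketch of that flow, assuming delayed_job and placeholder names (the Ticket model, HardwareQueryJob and the Hardware.query call are all stand-ins):

    # POST /hardware_queries      -> returns a ticket id immediately
    # GET  /hardware_queries/:id  -> the client polls this until status is "done"
    class HardwareQueriesController < ApplicationController
      def create
        ticket = Ticket.create!(status: 'pending')
        Delayed::Job.enqueue(HardwareQueryJob.new(ticket.id, params[:device_id]))
        render json: { ticket_id: ticket.id }, status: :accepted
      end

      def show
        ticket = Ticket.find(params[:id])
        render json: { status: ticket.status, result: ticket.result }
      end
    end

    # delayed_job will run any object that responds to #perform.
    HardwareQueryJob = Struct.new(:ticket_id, :device_id) do
      def perform
        result = Hardware.query(device_id)   # the slow (up to ~10 s) call to the device
        Ticket.find(ticket_id).update!(status: 'done', result: result)
      end
    end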
As far as hosting in JRuby, I've deployed a couple of small internal applications using the glassfish gem, but I'm not sure how much I would trust it for customer-facing apps. Just make sure you run config.threadsafe! in production.rb. I've heard good things about Trinidad, too.
You can also run the web service call in a delayed background job so that it's not tying up a web server, and it can even run on a separate physical box. This is also a much more scalable approach. If you make the web call using AJAX, then you can ping the server every second or two to see whether your results are ready; that way your client is not held in limbo while the results are being calculated, and the request does not time out.