I've got the mailman gem integrated into my Rails project. It fetches emails from Gmail successfully. In my app there is a Message model for my emails, and incoming emails are properly saved as Message records.
The problem is that some emails are saved multiple times, and I can't recognize a pattern: some emails are saved once, some twice, and some three times. But I can't find the failure in my code.
Here is my mailman_server script:
script/mailman_server
#!/usr/bin/env ruby
# encoding: UTF-8
require "rubygems"
require "bundler/setup"
require File.expand_path(File.join(File.dirname(__FILE__), '..', 'config', 'environment'))
require 'mailman'

Mailman.config.ignore_stdin = true
# Mailman.config.logger = Logger.new File.expand_path("../../log/mailman_#{Rails.env}.log", __FILE__)

if Rails.env == 'test'
  Mailman.config.maildir = File.expand_path("../../tmp/test_maildir", __FILE__)
else
  Mailman.config.logger = Logger.new File.expand_path("../../log/mailman_#{Rails.env}.log", __FILE__)
  Mailman.config.poll_interval = 15
  Mailman.config.imap = {
    server: 'imap.gmail.com',
    port: 993, # 993 is IMAP over SSL (995 would be POP3 over SSL)
    ssl: true,
    username: 'my@email.com',
    password: 'my_password'
  }
end

Mailman::Application.run do
  default do
    begin
      Message.receive_message(message)
    rescue Exception => e
      Mailman.logger.error "Exception occurred while receiving message:\n#{message}"
      Mailman.logger.error [e, *e.backtrace].join("\n")
    end
  end
end
The email is processed inside my Message class:
def self.receive_message(message)
  if message.from.first == "my@email.com"
    Message.save_bcc_mail(message)
  else
    Message.save_incoming_mail(message)
  end
end

def self.save_incoming_mail(message)
  part_to_use = message.html_part || message.text_part || message
  encoding = part_to_use.content_type_parameters['charset']
  if Kontakt.where(:email => message.from.first).empty?
    Message.create topic: message.subject,
                   message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'),
                   communication_partner: message.from.first,
                   inbound: true,
                   time: message.date
  else
    Message.create topic: message.subject,
                   message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'),
                   communication_partner: message.from.first,
                   inbound: true,
                   time: message.date,
                   messageable_type: 'Company',
                   messageable_id: Kontakt.where(:email => message.from.first).first.year.id
  end
end

def self.save_bcc_mail(message)
  part_to_use = message.html_part || message.text_part || message
  encoding = part_to_use.content_type_parameters['charset']
  if Kontakt.where(:email => message.to.first).empty?
    Message.create topic: message.subject,
                   message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'),
                   communication_partner: message.to.first,
                   inbound: false,
                   time: message.date
  else
    Message.create topic: message.subject,
                   message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'),
                   communication_partner: message.to.first,
                   inbound: false,
                   time: message.date,
                   messageable_type: 'Company',
                   messageable_id: Kontakt.where(:email => message.to.first).first.year.id
  end
end
I have daemonized the mailman_server with this script:
script/mailman_daemon
#!/usr/bin/env ruby
require 'rubygems'
require "bundler/setup"
require 'daemons'
Daemons.run('script/mailman_server')
I deploy with Capistrano. These are the parts responsible for stopping, starting and restarting my mailman_server:
script/deploy.rb
set :rails_env, "production" #added for delayed job
after "deploy:stop", "delayed_job:stop"
after "deploy:start", "delayed_job:start"
after "deploy:restart", "delayed_job:restart"
after "deploy:stop", "mailman:stop"
after "deploy:start", "mailman:start"
after "deploy:restart", "mailman:restart"
namespace :deploy do
  desc "make mailman script executable"
  task :mailman_executable, :roles => :app do
    run "chmod +x #{current_path}/script/mailman_server"
  end

  desc "make mailman daemon executable"
  task :mailman_daemon_executable, :roles => :app do
    run "chmod +x #{current_path}/script/mailman_daemon"
  end
end

namespace :mailman do
  desc "Mailman::Start"
  task :start, :roles => [:app] do
    run "cd #{current_path};RAILS_ENV=#{fetch(:rails_env)} bundle exec script/mailman_daemon start"
  end

  desc "Mailman::Stop"
  task :stop, :roles => [:app] do
    run "cd #{current_path};RAILS_ENV=#{fetch(:rails_env)} bundle exec script/mailman_daemon stop"
  end

  desc "Mailman::Restart"
  task :restart, :roles => [:app] do
    mailman.stop
    mailman.start
  end
end
Could it be that multiple instances of the mailman server are started during my deploy at nearly the same time, and that each instance then polls at nearly the same time? That is, the second and third instances poll before the first instance marks the email as read, so they process the email as well?
Update 30.01.
I set the polling interval to 60 seconds, but that changed nothing.
I checked the folder where the mailman pid file is stored. There is only one mailman pid file, so there is definitely only one mailman server running. I checked the logfile and can see that the messages are fetched multiple times:
Mailman v0.7.0 started
IMAP receiver enabled (my@email.com).
Polling enabled. Checking every 60 seconds.
Got new message from 'my.other@email.com' with subject 'Test nr 0'.
Got new message from 'my.other@email.com' with subject 'Test nr 1'.
Got new message from 'my.other@email.com' with subject 'test nr 2'.
Got new message from 'my.other@email.com' with subject 'test nr 2'.
Got new message from 'my.other@email.com' with subject 'test nr 3'.
Got new message from 'my.other@email.com' with subject 'test nr 4'.
Got new message from 'my.other@email.com' with subject 'test nr 4'.
Got new message from 'my.other@email.com' with subject 'test nr 4'.
So it seems to me that the problem is definitely in my mailman server code.
Update 31.1.
It seems to me that it has something to do with my production machine. When I test this in development with the exact same configuration as on the production machine (I changed my local database from SQLite to MySQL this morning to test it), I don't get duplicates. So my code is probably fine, but there is a problem with the production machine. I will ask my hoster whether they see a solution for this. Until then I will go with Ariejan's suggestion.
The solution:
I found the problem. I deploy to a machine where the tmp directory is shared between all releases. I forgot to define the path where the pid file of the mailman_daemon should be saved, so it was saved in the script directory instead of the shared tmp/pids directory. Because of this, the old mailman_daemon could not be stopped after a new deploy. That led to an army of running mailman_daemons, all polling my mail account... After killing all these processes everything went well. No more duplicates!
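For reference, here is a minimal sketch of how the pid file location can be pinned in script/mailman_daemon, assuming the Daemons gem's standard :dir_mode/:dir options:
#!/usr/bin/env ruby
require 'rubygems'
require "bundler/setup"
require 'daemons'

# Write the pid file into the shared tmp/pids directory instead of the
# script directory, so the next deploy can find and stop the old daemon.
Daemons.run('script/mailman_server',
            dir_mode: :normal,
            dir: File.expand_path('../../tmp/pids', __FILE__))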
This may be some concurrency/timing issue, e.g. new mails being imported before the ones currently being processed have been saved.
Edit: Just noticed you have Mailman.config.poll_interval set to 15, which means it will check for new messages every 15 seconds. Try increasing this value to the default of 60 seconds. Regardless of this setting, it might be a good idea to add the deduplication code I mention below.
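That is, in script/mailman_server:
Mailman.config.poll_interval = 60 # the default; was 15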
My tip would be to also store the message_id from each email, so you can easily spot duplicates.
Instead of:
Message.create(...)
do:
# This keeps the latest pulled version of each mail.
record = Message.where(message_id: message.message_id).first_or_create
record.update_attributes(...)
# Or, to import a message once and then ignore further duplicates:
if !Message.where(message_id: message.message_id).exists?
  Message.create(...)
end
For more info on message_id: http://rdoc.info/github/mikel/mail/Mail/Message#message_id-instance_method
Remember that email and IMAP are not meant to be consistent data stores like you'd expect PostgreSQL or MySQL to be. Hope this helps you sort out the duplicate mails.
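If you go this route, Message needs a message_id column. A minimal sketch of the migration, assuming standard Rails conventions (the names are illustrative):
class AddMessageIdToMessages < ActiveRecord::Migration
  def change
    add_column :messages, :message_id, :string
    # A unique index also guards against duplicate inserts under races.
    add_index :messages, :message_id, unique: true
  end
end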
Related
I am familiar with Rails, but this is my first time deploying to production. I am able to successfully upload my app to AWS and deploy it. However, every time I do that, I have to SSH into my server and run the necessary rake tasks to clean up my models and fully prep my website. Is there a file, like production.rb, where you can write a script to be run on every production deploy, for instance to run all tests and rake tasks? Does someone have a simple example of such a script?
Note: I am using AWS Beanstalk, which is super easy to deploy with; I just want to run some post-deploy scripts.
This is the rake file whose commands I want to run after deployment:
require "#{Rails.root}/app/helpers/application_helper"
include ApplicationHelper

namespace :db do
  desc "Generate a new blog post markdown"
  task new_article: :environment do
    cp 'lib/assets/articles/template.md', "lib/assets/articles/NEW_ARTICLE#{Time.now.strftime("%s")}.md"
    puts 'new article created!'
  end

  task populate: :environment do
    Article.destroy_all
    if User.count == 0
      User.create!(name: "AJ", email: "aj@psychowarfare.com")
    end
    puts Dir.pwd
    a = File.join("lib", "assets", "articles", "*.md")
    Dir.glob(a).reject { |name| /.*(template|NEW_ARTICLE).*/ =~ name }.each do |file|
      File.open(file, "r") do |f|
        contents = f.read
        mkdown = Metadown.render(contents)
        md = mkdown.metadata
        unrendered_content = contents.sub(/^---(\n|.)*---/, '')
        # puts unrendered_content
        article = Article.create!(title: md["title"],
                                  content: markdown(unrendered_content),
                                  header_image: md["header_image"],
                                  published: md["published"],
                                  useful_links: md["useful_links"],
                                  people_mentioned: md["people_mentioned"],
                                  written_at_date: md["written_at_date"],
                                  timestamp: md["timestamp"],
                                  embedded_link: md["embedded_link"],
                                  user: User.first)
        article.add_tag(md["tags"])
        puts article.useful_links
        puts article.people_mentioned
        puts article.header_image
        puts article.tags
      end
    end
    puts "Article Count: #{Article.count}"
  end
end
For post-deployment tasks, you can try the following approach. Create a file at .ebextensions/01_build.config:
commands:
  create_post_dir:
    command: "mkdir /opt/elasticbeanstalk/hooks/appdeploy/post"
    ignoreErrors: true
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/99_build_app.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      cd /var/app/current/app/
      Your-Post-Deploy-Command1
      Your-Post-Deploy-Command2
      Your-Post-Deploy-Command3
What this config does is: create the "post" directory if it doesn't already exist (it won't by default), ignoring any errors (such as the directory already existing), and then deploy the shell script with the appropriate permissions into the right directory.
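For instance, to run the populate task from the question after each deploy, the hook script body might look like this (a sketch; whether you need RAILS_ENV and where bundler lives depend on your platform configuration):
#!/usr/bin/env bash
cd /var/app/current  # adjust to where your app lives on the instance
bundle exec rake db:populate RAILS_ENV=production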
For more details look at the following references: Blog-Article & Stackoverflow-Question
I am trying to run message queues on Heroku. For this I am using the RabbitMQ Bigwig plugin.
I am publishing messages using the bunny gem and trying to receive messages with the sneakers gem. This whole setup works smoothly on my local machine.
I take the following steps to set up the queue. I run this rake task on the server:
namespace :rabbitmq do
  desc 'Setup routing'
  task :setup_test_commands_queue do
    require 'bunny'
    conn = Bunny.new(ENV['SYNC_AMQP'], read_timeout: 10, heartbeat: 10)
    conn.start
    ch = conn.create_channel
    # get or create exchange
    x = ch.direct('testsync.pcc', :durable => true)
    # get or create queue (note the durable setting)
    queue = ch.queue('test.commands', :durable => true, :ack => true, :routing_key => 'test_cmd')
    # bind queue to exchange
    queue.bind(x, :routing_key => 'test_cmd')
    conn.close
  end
end
I am able to see this queue, with the mentioned binding, in the RabbitMQ management plugin. This is my publisher:
class TestPublisher
  def self.publish(test)
    x = channel.direct("testsync.pcc", :durable => true)
    puts "publishing this = #{test}"
    x.publish(test, :persistent => true, :routing_key => 'pcc_cmd')
  end

  def self.channel
    @channel ||= connection.create_channel
  end

  def self.connection
    @conn = Bunny.new(ENV['RABBITMQ_BIGWIG_TX_URL'], read_timeout: 10, heartbeat: 10) # getting configuration from rabbitmq.yml
    @conn.start
  end
end
I am calling TestPublisher.publish() to publish a message.
I have a Sneakers worker like this:
require 'test_sync'

class TestsWorker
  include Sneakers::Worker
  from_queue "test.commands", env: nil

  def work(raw_event)
    puts "^" * 100
    puts raw_event
    # o = CaseNote.create!(content: raw_event, creator_id: 1)
    # puts "#########{o}"
    test = Oj.load raw_event
    test.execute
    # event_params = JSON.parse(raw_event)
    # SomeWiseService.build.call(event_params)
    ack!
  end
end
My Procfile
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
worker: bundle exec rake jobs:work
sneaker: WORKERS=TestsWorker bundle exec rake sneakers:run
My Rakefile
require File.expand_path('../config/application', __FILE__)
require 'rake/dsl_definition'
require 'rake'
require 'sneakers/tasks'
Test::Application.load_tasks
My Sneakers configuration:
require 'sneakers'
Sneakers.configure amqp: ENV['RABBITMQ_BIGWIG_RX_URL'],
                   log: "log/sneakers.log",
                   threads: 1,
                   workers: 1
puts "configuring sneaker"
I am sure that the message gets published; I can see it in the RabbitMQ management plugin. But the Sneakers worker does not run, and there is nothing in sneakers.log that can help.
sneakers.log on Heroku:
# Logfile created on 2016-04-05 14:40:59 +0530 by logger.rb/41212
Sorry for the late response. I was able to get this working on Heroku. When I first faced this error, I was not able to fix it even after hours of debugging, so I rewrote all of the above code without checking what was wrong with the previous version.
The only difference between this code and the correct code is the queue binding. I had two queues on the same exchange: pcc.commands with routing key pcc_cmd, and test.commands with routing key test_cmd. I was working with test_cmd, but as per the following line in TestPublisher:
x.publish(test, :persistent => true, :routing_key => 'pcc_cmd')
I was publishing to the other queue (pcc.commands), hence I never received the message on the test.commands queue.
In TestsWorker, the line
from_queue "test.commands", env: nil
states: fetch messages only from the test.commands queue.
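So the fix was to publish with the routing key that test.commands is actually bound to:
x.publish(test, :persistent => true, :routing_key => 'test_cmd')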
Regarding the sneakers.log file: the above setup was not able to give me logs in sneakers.log. Yes, this setup works on your local development machine, but it was not working on Heroku. Nowadays, to debug such issues, I omit the log attribute from the configuration, like this:
require 'sneakers'
Sneakers.configure amqp: ENV['RABBITMQ_BIGWIG_RX_URL'],
                   # log: "log/sneakers.log",
                   threads: 1,
                   workers: 1
This way you will get the Sneakers logs (even heartbeat logs) in the Heroku logs, which can be seen by running heroku logs -a app_name --tail.
I am using Resque and Resque Scheduler to start a job that has to run immediately on application start. Other scheduled jobs are loaded every 30 seconds.
This is the code for my config/initializers/redis.rb:
require 'rake'
require 'resque'
require 'resque/server'
require 'resque_scheduler/tasks'
# This will make the tabs show up.
require 'resque_scheduler'
require 'resque_scheduler/server'
uri = URI.parse(ENV["REDISTOGO_URL"])
REDIS = Redis.new(:host => uri.host, :port => uri.port, :password => uri.password)
Resque.redis = REDIS
Dir["#{Rails.root}/app/workers/*.rb"].each { |file| require file }
Resque.enqueue(AllMessageRetriever)
Resque.schedule = YAML.load_file(Rails.root.join('config', 'schedule.yml'))
When the application is started up, AllMessageRetriever gets run 2-3 times rather than only once. Do initializers get called more than once? This happens both on Heroku and in my local environment.
Is it possible to set up a delayed job in Resque-Scheduler which will be executed only once, immediately at runtime?
The code for AllMessageRetriever: basically it loops over a table, calls an external API to get data, and then updates the table. This entire task happens 2-3 times if I add the enqueue call to the initializer file.
require 'socialcast'

module AllMessageRetriever
  @queue = :message_queue

  def self.perform()
    Watchedgroup.all.each do |group|
      puts "Running group #{group.name}"
      continueLoading = true
      page = 1
      per_page = 500
      while continueLoading == true
        User.first.refresh_token_if_expired
        token = User.first.token
        puts "ContinueLoading: #{continueLoading}"
        @test = Socialcast.get_all_messages(group.name, token, page, per_page)
        messagesArray = ActiveSupport::JSON.decode(@test)["messages"]
        puts "Message Count: #{messagesArray.count}"
        if messagesArray.count == 0
          puts 'count is zero now'
          continueLoading = false
        else
          messagesArray.each do |message|
            if not Message.exists?(message["id"])
              Message.create_with_socialcast(message, group.id)
            else
              Message.update_with_socialcast(message)
            end
          end
        end
        page += 1
      end
      Resqueaudit.create({:watchedgroup_id => group.id, :timecompleted => DateTime.now})
    end
    # Do anything here, like access models, etc
    puts "Doing my job"
  end
end
Rake
Firstly, why are you trying to enqueue on init?
You'd be much better off delegating this to a rake task instead of running it from an initializer. This removes the dependency on the initialization process, which should clear things up a lot. I wouldn't put this in an initializer itself, as it is better handled elsewhere (modularity).
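A minimal sketch of such a task (the file path and names are illustrative), which you would invoke with bundle exec rake messages:retrieve_all instead of enqueuing inside the initializer:
# lib/tasks/messages.rake
namespace :messages do
  desc "Enqueue the one-off AllMessageRetriever job"
  task retrieve_all: :environment do
    Resque.enqueue(AllMessageRetriever)
  end
end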
Problem
I think this line is causing the issue:
Resque.enqueue(AllMessageRetriever)
Without seeing the contents of AllMessageRetriever, I'd surmise that your AllMessageRetriever (module/class?) is returning the results 2-3 times, causing Resque to add the data set to the queue 2-3 times.
I could be wrong, but it would make sense, and it would mean your issue is not with Resque or the initializers, but with your AllMessageRetriever class. It would be a big help if you showed it!
I'm using the whenever gem to have a Rails cron job send emails. Everything seems to work just fine and I have no errors in my cron.log or my production.log file, but I never receive an email. I've also checked that the email address is correct.
Any help is appreciated.
The production.log file contains this:
Connecting to database specified by database.yml
Rendered email_mailer/send_birthday_reminders.html.erb (5.3ms)
Sent mail to tomcaflisch@gmail.com (409ms)
Here's my whenever gem schedule.rb file
set :output, "#{path}/log/cron.log"

every :hour do
  runner "BirthdayRemindersController.send_birthday_email_reminders"
end
birthday_reminders_controller.rb
class BirthdayRemindersController < ApplicationController
  # cron job that sends birthday reminders
  def self.send_birthday_email_reminders
    users = User.all
    email_addresses = []
    users.each_with_index do |user, i|
      if user.user_details.birthday_reminders == true
        email_addresses[i] = get_primary_email(user)
      end
    end
    p "email_addresses to send to:"
    p email_addresses
    users.each do |user|
      p "this user is"
      p user.user_details.full_name
      if user.user_details.birthday.try(:strftime, "%m") == Date.today.strftime("%m") && user.user_details.birthday.try(:strftime, "%d") == Date.today.strftime("%d")
        p "reminder sent"
        EmailMailer.send_birthday_reminders(user, email_addresses).deliver
      end
    end
  end
end
email_mailer.rb snippet
class EmailMailer < ActionMailer::Base
  include ApplicationHelper
  default :from => "\"FamNFo\" <no-reply@mysite.com>"

  def send_birthday_reminders(birthday_person, email_addresses)
    p "we in send_birthday_reminders mailer"
    p email_addresses
    @birthday_person = birthday_person
    mail(:subject => "Birthday Reminder For The Caflisch Family", :to => email_addresses, :reply_to => email_addresses)
  end
end
Capistrano's deploy.rb contains this:
# needed for the 'whenever' gem
set(:whenever_command) { "RAILS_ENV=#{rails_env} bundle exec whenever"}
require "whenever/capistrano"
Check your spam folder. To make sure emails don't end up there, add an "Unsubscribe" link in each email.
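A related mechanism is the List-Unsubscribe mail header, which providers like Gmail surface as a built-in unsubscribe button. A minimal sketch in the mailer (the URL is hypothetical):
# in EmailMailer#send_birthday_reminders, before calling mail(...)
headers['List-Unsubscribe'] = '<https://mysite.com/unsubscribe>'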
This could happen if your Action Mailer configuration specifies perform_deliveries = false. You can check the configuration in your environment files.
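For example, in config/environments/production.rb you'd want something like this (a sketch of the standard Action Mailer flags):
# config/environments/production.rb
config.action_mailer.perform_deliveries = true
config.action_mailer.raise_delivery_errors = true # surface delivery failures in the log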
If your application is deployed to a cloud service, your emails may be ending up in spam folders. Entire IP blocks of such services are registered as spam sources at services like Spamhaus, which is a sensible precaution or else we'd be getting even more spam than usual.
You can enter your server's IP address in the lookup form on the Spamhaus site to see if you're listed as a spammer.
If you are, you can request that Spamhaus lift the block.
The other big issue I have found is that PATH and rbenv may not be initialized in the crontab, depending on how your setup looks.
I would recommend adding the following to the top of your .bashrc file
export PATH="$HOME/.rbenv/bin:$PATH"
eval "$(rbenv init -)"
This ensures that if you are using whenever to call model methods, rbenv and Ruby are fully available.
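Alternatively, whenever can write environment variables into the generated crontab directly from schedule.rb. A sketch; this captures the PATH of the user running whenever --update-crontab:
# config/schedule.rb
env 'PATH', ENV['PATH'] # make rbenv shims available to cron jobs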
I have two questions:
How can I add a Heroku worker just before running a delayed job and remove it after it finishes?
Is my cron.rake OK?
cron.rake:
desc "This task is called by the Heroku cron add-on"
task :cron => :environment do
  puts "requesting homepage to refresh cache"
  uri = URI.parse('http://something.com')
  Net::HTTP.get(uri)
  puts "end requesting homepage"

  puts "start sending daily mail"
  User.notified_today.each do |user|
    Delayed::Job.enqueue UserMailer.daily_mail(user).deliver
  end
  puts "end sending daily mail"
end
I use collectiveidea delayed_job.
I've had good success with HireFire.
Easy setup:
Add gem 'hirefire' to your Gemfile
Create Rails.root/config/initializers/hirefire.rb with the config information.
To add/remove workers, hook into your ORM's after :create / after :destroy callbacks.
With DataMapper on Heroku, I did it like this (you must set the ENV vars yourself):
MAX_CONCURRENT_WORKERS = 5

if ENV["HEROKU_APP"]
  Delayed::Job.after :create do
    workers_needed = [Delayed::Job.count, MAX_CONCURRENT_WORKERS].min
    client = Heroku::Client.new(ENV['HEROKU_USERNAME'], ENV['HEROKU_PASSWORD'])
    client.set_workers(ENV['HEROKU_APP'], workers_needed)
    puts "- Initialized Heroku workers for ZipDecoder"
  end

  Delayed::Job.after :destroy do
    workers_needed = [Delayed::Job.count, MAX_CONCURRENT_WORKERS].min
    client = Heroku::Client.new(ENV['HEROKU_USERNAME'], ENV['HEROKU_PASSWORD'])
    client.set_workers(ENV['HEROKU_APP'], workers_needed)
    puts "- Cleaned Up a Delayed Job for ZipDecoder ---------------------------------"
  end
end
You could maybe use an "autoscale" plugin like workless or heroku-autoscale.
As for the cron task, I don't see any problem with it...