Rails - Access sidekiq status via API - ruby-on-rails

I'm using sidekiq and sidekiq-status gems for workers and tracking progress of them on web UI: /sidekiq/statuses.
For individual worker tracking /sidekiq/statuses/job_id.
How can I access progress info from a frontend via API?
On GET /sidekiq/stats I get the response:
{
  "sidekiq": {
    "processed": 805,
    "failed": 62,
    "busy": 3,
    "processes": 1,
    "enqueued": 0,
    "scheduled": 0,
    "retries": 1,
    "dead": 0,
    "default_latency": 0
  },
  "redis": {
    "redis_version": "3.0.6",
    "uptime_in_days": "0",
    "connected_clients": "24",
    "used_memory_human": "1.06M",
    "used_memory_peak_human": "2.00M"
  },
  "server_utc_time": "14:50:29 UTC"
}
Can I do a similar thing for /statuses/job_id?

In case you want to get stats for individual Sidekiq queues, I have written a small gem, sidekiq_queue_metrics, to do so. It shows per-queue stats in the Web UI and also provides an API to fetch the metrics.
require 'sidekiq_queue_metrics'

Sidekiq.configure_server do |config|
  Sidekiq::QueueMetrics.init(config)
end
API to fetch stats:
Sidekiq::QueueMetrics.fetch
#=> {
#     "mailer_queue"  => {"processed" => 5,  "failed" => 1, "enqueued" => 2, "in_retry" => 0},
#     "default_queue" => {"processed" => 10, "failed" => 0, "enqueued" => 1, "in_retry" => 1}
#   }
Web UI: (screenshot of the per-queue metrics view in the Sidekiq Web UI omitted)

Looking at the Sidekiq::Web source, the only endpoints I see returning JSON are /sidekiq/stats and /sidekiq/stats/queues.
Remember, Sidekiq has many API helpers for use in your Ruby code, so there's no reason you can't create your own controller to pass job info to the frontend via the Ruby API, e.g. Sidekiq::Queue.new("high-queue").find_job(jid).
This also has the advantage of letting you set up more fine-grained access control over the data, rather than letting any user of your frontend have access to the whole Sidekiq API.
If you plan on making heavy use of this, you might think about upgrading to Sidekiq Pro, which includes a Pro API with a more efficient Sidekiq::JobSet#find_job(jid) method.
Finally, if there's no API helper for what you need, remember that Sidekiq is all just Redis in the backend, and you could write your own Redis queries to fetch the right data in the shape you want.
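For example, here is a minimal sketch of such a controller. It assumes the sidekiq-status gem's Ruby API (Sidekiq::Status.get_all), since the question already uses that gem; the route and controller names are made up for illustration:

# config/routes.rb (hypothetical route)
#   get '/job_statuses/:id', to: 'job_statuses#show'

# app/controllers/job_statuses_controller.rb
class JobStatusesController < ApplicationController
  # GET /job_statuses/:id
  def show
    jid = params[:id]

    # sidekiq-status keeps job progress in Redis; get_all returns everything
    # it has recorded for the jid (status, pct_complete, custom payload, ...).
    status = Sidekiq::Status.get_all(jid)

    if status.present?
      render json: status
    else
      render json: { error: "unknown job id" }, status: :not_found
    end
  end
end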

Related

In Power Automate, is there a way to filter on a Custom Field using DevOp's Send HTTP Request?

I'm trying to use Power Automate to return a custom work item in Azure DevOps using the "workitemsearch" API (via the "Send HTTP Request" action). Part of this will require me to filter based on the value of a Custom Field; however, I have not been able to get it to work. Here is a copy of my HTTP Request Body:
{
  "searchText": "ValueToSearch",
  "$skip": 0,
  "$top": 1,
  "filters": {
    "System.TeamProject": ["MyProject"],
    "System.AreaPath": ["MyAreaPath"],
    "System.WorkItemType": ["MyCustomWorkItem"],
    "Custom.RequestNumber": ["ValueToSearch"]
  },
  "$orderBy": [
    {
      "field": "system.id",
      "sortOrder": "ASC"
    }
  ],
  "includeFacets": true
}
I have been able to get it to work by removing "Custom.RequestNumber": ["ValueToSearch"], but I am hesitant to use that in case my ValueToSearch is found in other places, like the comments of other work items.
Any help on this would be appreciated.
Cheers!
From WorkItemSearchResponse, we can see that the facets (a dictionary storing an array of Filter objects against each facet) only support the following fields:
"System.TeamProject"
"System.WorkItemType"
"System.State"
"System.AssignedTo"
If you want to filter on RequestNumber, you can set it in the searchText using the following syntax:
"searchText": "RequestNumber:ValueToSearch"

Get JIRA issue properties using JRJC search

To search for JIRA issues in Java, we can use this REST resource:
/api/2.0.alpha1/search?jql&startAt&maxResults
for example:
/api/2.0.alpha1/search?assignee=mehran
but unfortunately, according to the documentation, the result is in this format:
{
  "startAt": 0,
  "maxResults": 50,
  "total": 1,
  "issues": [
    {
      "self": "http://www.example.com/jira/rest/api/2.0/jira/rest/api/2.0/issue/HSP-1",
      "key": "HSP-1"
    }
  ]
}
How can I access the other properties of the issues, like title, description, etc.?
Well, first of all, why are you looking at an ancient version (4.3) of the API documentation? The latest is 7.2.4, for example.
If you're also running JIRA 4.3 then you're out of luck, as at that point their REST API was in a very early state.
However, if your JIRA instance is newer (and if it isn't, upgrade), then open up the proper documentation at https://docs.atlassian.com/jira/REST/{yourVersion}/. At some point the search endpoint was improved so you can expand issues and specify exactly which fields, including custom field values, you want to retrieve.
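For example, against the modern /rest/api/2/search resource, the fields parameter limits which issue fields come back. A hedged sketch, shown in Ruby for brevity; the host, credentials, and field list are placeholders:

require 'net/http'
require 'json'
require 'uri'

base = 'https://jira.example.com'                         # placeholder host
jql  = URI.encode_www_form_component('assignee = mehran')

# Ask for the fields we actually need instead of the bare self/key stubs.
uri = URI("#{base}/rest/api/2/search?jql=#{jql}&fields=summary,description,status&maxResults=50")

request = Net::HTTP::Get.new(uri)
request.basic_auth('username', 'password')                # placeholder credentials

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }

JSON.parse(response.body)['issues'].each do |issue|
  puts "#{issue['key']}: #{issue.dig('fields', 'summary')}"
end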

Strategies for sending weekly newsletter emails from Rails

So we have about 50,000 users who have signed up for a weekly newsletter. The contents of this email are personalized for each user, though; it's not a mass email.
We are using Rails 4 and Mandrill.
Right now it takes about 12 hours every time we want to fire off this emails.rake task, and I'm looking for a way to distribute that work or make it shorter.
What are some techniques I can use to improve this time that is only growing longer the more people sign up?
I was thinking of using Mandrill templates and just sending the JSON object to Mandrill and having them send the email from their end, but I'm not really sure if this is even going to help improve speeds.
At the 50,000+ level: How do I keep email sending times manageable?
Looks like you could use MailyHerald. It is a Rails gem for managing application emails. It sends personalized emails in the background using Sidekiq worker threads which should help you out in terms of performance.
MailyHerald has a nice Web UI and works with email services like Amazon SES or Mandrill.
You probably need to look into Merge Tags on Mandrill. They allow you to define custom content per recipient, so you can break your newsletter send into a handful of API calls to Mandrill instead of one per email. The number of calls will just depend on the size of your data, since there is presumably a limit per call.
You can create a template and put merge vars such as *|custom_content_placeholder|* wherever you need user-specific content to be placed. You can do this templating in your system and pass it into the message, or you can set it up in Mandrill and make a call to that template.
When you make the Mandrill API call to send an email or email template, you just attach the JSON data such as:
"message": {
  "global_merge_vars": [
    {
      "name": "global_placeholder",
      "content": "Content to replace for all emails"
    }
  ],
  "merge_vars": [
    {
      "rcpt": "user@domain.com",
      "vars": [
        {
          "name": "custom_content_placeholder",
          "content": "User specific content"
        },
        {
          "name": "custom_content_placeholder2",
          "content": "More user specific content"
        }
      ]
    },
    {
      "rcpt": "user2@domain.com",
      "vars": [
        {
          "name": "custom_content_placeholder",
          "content": "User2 specific content"
        },
        {
          "name": "custom_content_placeholder2",
          "content": "More user2 specific content"
        }
      ]
    }
  ]
}
You can find more info on Merge Tags here:
https://mandrill.zendesk.com/hc/en-us/articles/205582487-How-to-Use-Merge-Tags-to-Add-Dynamic-Content
If you are familiar with handlebars for templating, Mandrill now supports it with the merge tags:
http://blog.mandrill.com/handlebars-for-templates-and-dynamic-content.html
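For reference, a rough sketch of what the batched call could look like from Rails, assuming the mandrill-api gem and a template named 'weekly-newsletter' that you have already defined in Mandrill (both are assumptions, not something from the question):

require 'mandrill'

mandrill = Mandrill::API.new(ENV['MANDRILL_API_KEY'])

message = {
  'to' => [
    { 'email' => 'user@domain.com',  'type' => 'to' },
    { 'email' => 'user2@domain.com', 'type' => 'to' }
  ],
  'global_merge_vars' => [
    { 'name' => 'global_placeholder', 'content' => 'Content to replace for all emails' }
  ],
  'merge_vars' => [
    { 'rcpt' => 'user@domain.com',
      'vars' => [{ 'name' => 'custom_content_placeholder', 'content' => 'User specific content' }] },
    { 'rcpt' => 'user2@domain.com',
      'vars' => [{ 'name' => 'custom_content_placeholder', 'content' => 'User2 specific content' }] }
  ]
}

# One API call covers every recipient in the batch; Mandrill fills each
# *|custom_content_placeholder|* tag per recipient from merge_vars.
mandrill.messages.send_template('weekly-newsletter', [], message, true)

Batching recipients this way is what turns tens of thousands of individual sends into a much smaller number of API calls.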

Log Data Analytics : Choice of Database [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I am getting Log Data from various web applications in the following format:
Session  Timestamp  Event              Parameters
1        1          Started Session
1        2          Logged In          Username:"user1"
2        3          Started Session
1        3          Started Challenge  title:"Challenge 1", level:"2"
2        4          Logged In          Username:"user2"
Now, a person wants to carry out analytics on this log data (and would like to receive it as a JSON blob after appropriate transformations). For example, he may want to receive a JSON blob where the log data is grouped by Session, and TimeFromSessionStart and CountOfEvents are added before the data is sent, so that he can carry out meaningful analysis. Here I should return:
[
{
"session":1,"CountOfEvents":3,"Actions":[{"TimeFromSessionStart":0,"Event":"Session Started"}, {"TimeFromSessionStart":1, "Event":"Logged In", "Username":"user1"}, {"TimeFromSessionStart":2, "Event":"Startd Challenge", "title":"Challenge 1", "level":"2" }]
},
{
"session":2, "CountOfEvents":2,"Actions":[{"TimeFromSessionStart":0,"Event":"Session Started"}, {"TimeFromSessionStart":2, "Event":"Logged In", "Username":"user2"}]
}
]
Here, TimeFromSessionStart, CountOfEvents, etc. (let's call them synthetic additional data) will not be hard-coded, and I will make a web interface that allows the person to decide what kind of synthetic data he requires in the JSON blob. I would like to give the person a good amount of flexibility in deciding what synthetic data he wants.
I am expecting the database to store around 1 Million rows and carry out transformations in a reasonable amount of time.
My question is regarding the choice of database. What are the relative advantages and disadvantages of using a SQL database such as PostgreSQL versus a NoSQL database such as MongoDB? From what I have read so far, I think NoSQL may not provide enough flexibility for adding the synthetic data. On the other hand, I may face issues with flexibility of data representation if I use a SQL database.
I think the storage requirement for both MongoDB and PostgreSQL will be comparable since I will have to build similar indices (probably!) in both situations to speed up querying.
If I use PostgreSQL, I can store the data in the following manner:
Session and Event can be strings, Timestamp can be a date, and Parameters can be an hstore (key-value pairs, available in PostgreSQL). After that, I can use SQL queries to compute the synthetic (or additional) data, store it temporarily in variables in a Rails application (which will interact with the PostgreSQL database and act as the interface for the person who wants the JSON blob), and create the JSON blob from it.
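For concreteness, a minimal sketch of that schema as a Rails migration (table and column names are assumed, not prescribed; the sample data uses integer timestamps):

class CreateLogEvents < ActiveRecord::Migration
  def change
    enable_extension 'hstore'

    create_table :log_events do |t|
      t.integer :session
      t.integer :timestamp
      t.string  :event
      t.hstore  :parameters    # key-value pairs such as Username, title, level
    end

    add_index :log_events, :session
    add_index :log_events, :parameters, using: :gin    # speeds up parameter lookups
  end
end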
Another possible approach is to use MongoDB for storing the log data and use Mongoid as the interface with the Rails application, if I can get enough flexibility for adding the synthetic data for analytics and some performance/storage improvements over PostgreSQL. But in this case I am not clear on the best way to store the log data in MongoDB. Also, I have read that MongoDB will be somewhat slower than PostgreSQL and is mainly meant to run in the background.
Edit:
From whatever I have read in the past few days, Apache Hadoop seems to be a good choice as well because of its greater speed over MongoDB (being multi-threaded).
Edit:
I am not asking for opinions and would like to know the specific advantages or disadvantages of using a particular approach. Therefore, I don't think that the question is opinion based.
You should check out Logstash / Kibana from Elasticsearch. The primary use case for that stack is collecting log data, storing it, and analyzing it.
http://www.elasticsearch.org/overview/logstash/
http://www.elasticsearch.org/videos/kibana-logstash/
Mongo is a good choice too if you are looking at building it all yourself, but I think you may find that the Elasticsearch products solve your needs and allow the integration you need.
MongoDB is well suited to your task, and its document storage is more flexible than a rigid SQL table structure.
Below is a working test using Mongoid that demonstrates parsing of your log data input, easy storage as documents in a MongoDB collection, and analytics using MongoDB's aggregation framework. I've chosen to put the parameters in a sub-document; this matches your sample input table more closely and simplifies the pipeline. The resulting JSON is slightly modified, but all of the specified calculations, data, and grouping are present.
I've also added a test for an index on the parameter Username, which demonstrates an index on a sub-document field. This is adequate for specific fields that you want to index, but a completely general index can't be built over arbitrary keys; you would have to restructure the data so that the keys become values.
I hope that this helps and that you like it.
test/unit/log_data_test.rb
require 'test_helper'
require 'json'
require 'pp'

class LogDataTest < ActiveSupport::TestCase
  def setup
    LogData.delete_all
    @log_data_analysis_pipeline = [
      {'$group' => {
        '_id' => '$session',
        'session' => {'$first' => '$session'},
        'CountOfEvents' => {'$sum' => 1},
        'timestamp0' => {'$first' => '$timestamp'},
        'Actions' => {
          '$push' => {
            'timestamp' => '$timestamp',
            'event' => '$event',
            'parameters' => '$parameters'}}}},
      {'$project' => {
        '_id' => 0,
        'session' => '$session',
        'CountOfEvents' => '$CountOfEvents',
        'Actions' => {
          '$map' => {'input' => '$Actions', 'as' => 'action',
            'in' => {
              'TimeFromSessionStart' => {
                '$subtract' => ['$$action.timestamp', '$timestamp0']},
              'event' => '$$action.event',
              'parameters' => '$$action.parameters'
            }}}}
      }
    ]
    @key_names = %w(session timestamp event parameters)
    # columns are assumed to be separated by two or more spaces
    @log_data = <<-EOT.gsub(/^\s+/, '').split(/\n/)
      1  1  Started Session
      1  2  Logged In  Username:"user1"
      2  3  Started Session
      1  3  Started Challenge  title:"Challenge 1", level:"2"
      2  4  Logged In  Username:"user2"
    EOT
    docs = @log_data.collect { |line| line_to_doc(line) }
    LogData.create(docs)
    assert_equal(docs.size, LogData.count)
    puts
  end

  def line_to_doc(line)
    doc = Hash[*@key_names.zip(line.split(/ {2,}/)).flatten]
    doc['session'] = doc['session'].to_i
    doc['timestamp'] = doc['timestamp'].to_i
    doc['parameters'] = eval("{#{doc['parameters']}}") if doc['parameters']
    doc
  end

  test "versions" do
    puts "Mongoid version: #{Mongoid::VERSION}\nMoped version: #{Moped::VERSION}"
    puts "MongoDB version: #{LogData.collection.database.command({:buildinfo => 1})['version']}"
  end

  test "log data analytics" do
    pp LogData.all.to_a
    result = LogData.collection.aggregate(@log_data_analysis_pipeline)
    json = <<-EOT
      [
        {
          "session":1,"CountOfEvents":3,"Actions":[{"TimeFromSessionStart":0,"Event":"Session Started"}, {"TimeFromSessionStart":1, "Event":"Logged In", "Username":"user1"}, {"TimeFromSessionStart":2, "Event":"Started Challenge", "title":"Challenge 1", "level":"2" }]
        },
        {
          "session":2, "CountOfEvents":2,"Actions":[{"TimeFromSessionStart":0,"Event":"Session Started"}, {"TimeFromSessionStart":2, "Event":"Logged In", "Username":"user2"}]
        }
      ]
    EOT
    puts JSON.pretty_generate(result)
  end

  test "explain" do
    LogData.collection.indexes.create('parameters.Username' => 1)
    pp LogData.collection.find({'parameters.Username' => 'user2'}).to_a
    pp LogData.collection.find({'parameters.Username' => 'user2'}).explain['cursor']
  end
end
app/models/log_data.rb
class LogData
  include Mongoid::Document
  field :session, type: Integer
  field :timestamp, type: Integer
  field :event, type: String
  field :parameters, type: Hash
end
$ rake test
Run options:
# Running tests:
[1/3] LogDataTest#test_explain
[{"_id"=>"537258257f11ba8f03000005",
"session"=>2,
"timestamp"=>4,
"event"=>"Logged In",
"parameters"=>{"Username"=>"user2"}}]
"BtreeCursor parameters.Username_1"
[2/3] LogDataTest#test_log_data_analytics
[#<LogData _id: 537258257f11ba8f03000006, session: 1, timestamp: 1, event: "Started Session", parameters: nil>,
#<LogData _id: 537258257f11ba8f03000007, session: 1, timestamp: 2, event: "Logged In", parameters: {"Username"=>"user1"}>,
#<LogData _id: 537258257f11ba8f03000008, session: 2, timestamp: 3, event: "Started Session", parameters: nil>,
#<LogData _id: 537258257f11ba8f03000009, session: 1, timestamp: 3, event: "Started Challenge", parameters: {"title"=>"Challenge 1", "level"=>"2"}>,
#<LogData _id: 537258257f11ba8f0300000a, session: 2, timestamp: 4, event: "Logged In", parameters: {"Username"=>"user2"}>]
[
  {
    "session": 2,
    "CountOfEvents": 2,
    "Actions": [
      {
        "TimeFromSessionStart": 0,
        "event": "Started Session",
        "parameters": null
      },
      {
        "TimeFromSessionStart": 1,
        "event": "Logged In",
        "parameters": {
          "Username": "user2"
        }
      }
    ]
  },
  {
    "session": 1,
    "CountOfEvents": 3,
    "Actions": [
      {
        "TimeFromSessionStart": 0,
        "event": "Started Session",
        "parameters": null
      },
      {
        "TimeFromSessionStart": 1,
        "event": "Logged In",
        "parameters": {
          "Username": "user1"
        }
      },
      {
        "TimeFromSessionStart": 2,
        "event": "Started Challenge",
        "parameters": {
          "title": "Challenge 1",
          "level": "2"
        }
      }
    ]
  }
]
[3/3] LogDataTest#test_versions
Mongoid version: 3.1.6
Moped version: 1.5.2
MongoDB version: 2.6.1
Finished tests in 0.083465s, 35.9432 tests/s, 35.9432 assertions/s.
3 tests, 3 assertions, 0 failures, 0 errors, 0 skips
MongoDB is an ideal database for this.
Create a collection for your raw log data.
Use one of Mongo's powerful aggregation tools and output the aggregated data to another collection (or multiple output collections, if you want different buckets or views of the raw data).
You can either do the aggregation offline, with a set of pre-determined possibilities that users can pull from, or do it on demand / ad hoc, if you can tolerate some latency in your response.
http://docs.mongodb.org/manual/aggregation/
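A rough sketch of the pre-computed variant, reusing the LogData model from the answer above and writing into an assumed session_summaries collection via $out:

# Run offline (e.g. from a scheduled job) and materialize the result
# into another collection that the API can read cheaply.
pipeline = [
  { '$group' => {
      '_id'           => '$session',
      'CountOfEvents' => { '$sum' => 1 },
      'Actions'       => { '$push' => { 'timestamp' => '$timestamp', 'event' => '$event' } } } },
  { '$out' => 'session_summaries' }    # $out must be the final stage
]

LogData.collection.aggregate(pipeline)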

Restkit: How to get and map data from multiple source

I'm currently working on an iOS application with RestKit 0.20 to access data from a Tastypie API.
I am trying to get feed data from a URL like this:
/api/v2/feed/?format=json
Then I get an array of feeds as below:
{
  "meta": {
    "limit": 20,
    "next": null,
    "offset": 0,
    "previous": null,
    "total_count": 2
  },
  "objects": [
    {
      "id": 1,
      "info": "This is my first post",
      "pub_date": "2013-02-03T15:59:33.311000",
      "user": "/api/v2/user/1/",
      "resource_uri": "/api/v2/feed/1/"
    },
    {
      "id": 2,
      "info": "second post, yeah",
      "pub_date": "2013-02-03T16:00:09.350000",
      "user": "/api/v2/user/1/",
      "resource_uri": "/api/v2/feed/2/"
    }
  ]
}
If I want to fetch more data about the user, which Tastypie sends as a URL acting like a foreign key ("user": "/api/v2/user/1/"), do I have to make nested objectRequestOperation calls?
I'm confused because I'm using blocks as callbacks for when data is successfully loaded. Is there a better way than requesting the user data again for each feed after the feed request completes?
Thank you very much :)
You have to define this in the Feed resource:
user = fields.ToOneField(UserResource, full=True)
More info in the tastypie doc http://django-tastypie.readthedocs.org/en/latest/resources.html
Why Resource URIs?
Resource URIs play a heavy role in how Tastypie delivers data. This can seem very different from other solutions which simply inline related data. Though Tastypie can inline data like that (using full=True on the field with the relation), the default is to provide URIs.
URIs are useful because they result in smaller payloads, letting you fetch only the data that is important to you. You can imagine an instance where an object has thousands of related items that you may not be interested in.
URIs are also very cache-able, because the data at each endpoint is less likely to frequently change.
And URIs encourage proper use of each endpoint to display the data that endpoint covers.
Ideology aside, you should use whatever suits you. If you prefer fewer requests & fewer endpoints, use of full=True is available, but be aware of the consequences of each approach.
