Rails save log data to database - ruby-on-rails

Is it possible to access the information being written to a Rails log file without reading the log file? To be clear, I do not want to ship the log file as a batch process; rather, for every event that is written to the log file, I also want to send it via a background job to a separate database.
I have multiple apps running in Docker containers and wish to save the log entries of each into a shared telemetry database running on the server. Currently the logs are formatted with lograge, but I have not figured out how to access this information directly and send it to a background job to be processed (as stated before, I would like direct access to the data being written to the log so I can send it via a background job).
I am aware of Rails.logger.instance_variable_get(:@logger); however, what I am looking for is the actual data being saved to the logs, so I can ship it to a database.
The reasoning behind this is that there are multiple Rails APIs running in Docker containers. I have an after_action set up to run a background job that I hoped would send just the individual log entry, but this is where I am stuck. Sizing isn't an issue, as the data stored in this database will be purged every 2 weeks. This is more of a tool for the in-house devs to track telemetry through a dashboard. I appreciate you taking the time to respond.
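For context, here is a minimal sketch of the kind of after_action hook described above; the controller, the job name (LogSenderJob) and the fields being shipped are illustrative, not the original setup:
```ruby
# Illustrative sketch only - not the original app's code.
class ApplicationController < ActionController::API
  after_action :ship_telemetry

  private

  # Enqueue a background job with a few request details.
  def ship_telemetry
    LogSenderJob.perform_async(
      "controller" => controller_name,
      "action"     => action_name,
      "status"     => response.status,
      "path"       => request.fullpath
    )
  end
end
```
The limitation the question runs into is that this hook only sees the request/response objects, not the formatted log line itself, which is why the accepted approach below hooks into the lograge formatter instead.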

You would probably have to go through your app code and manually save the output from the logger into a table/field in your database inline. Theoretically, any data that ends up in your log should be accessible from within your app.
Depending on how much data you're planning on saving, this may not be the best idea, as it has the potential to grow your database extremely quickly (it's not uncommon for apps to generate GBs worth of logs in a single day).
You could write a background job that opens the log files, searches for data, and saves it to your database, but the configuration required for that will depend largely on your hosting setup.
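As a rough illustration of the inline approach mentioned above, something like the following could persist a message at the same time it is logged; LogEntry and its columns are assumptions, not an existing API:
```ruby
# Assumes a LogEntry model backed by a log_entries table
# with `message` and `logged_at` columns (illustrative names).
def log_and_store(message)
  Rails.logger.info(message)
  LogEntry.create(message: message, logged_at: Time.current)
end
```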

So I got a solution working, and in fairness it wasn't as difficult as I had thought. As I was using the lograge gem for formatting the logs, I created a custom formatter following the guide in this link.
As I wanted the JSON format, I just copied that format, but I was able to put in the call to a background job at this point and also strip out some data I did not want.
module Lograge
  module Formatters
    class SomeService < Lograge::Formatters::Json
      def call(data)
        # Strip fields we do not want to ship
        data = data.delete_if do |key|
          [:format, :view, :db].include?(key)
        end
        # Faktory job to ship the entry to the telemetry database
        LogSenderJob.perform_async(data)
        # Let the stock JSON formatter render the log line as usual
        super
      end
    end
  end
end
This was just one solution to the problem, made easier because I was able to get the data formatted via lograge. Another solution would be to create a custom logger and, in there, tell it to write to a database if necessary.
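For reference, this is roughly how such a formatter would be registered with lograge; the class name SomeService comes from the snippet above:
```ruby
# config/environments/production.rb (or an initializer)
Rails.application.configure do
  config.lograge.enabled = true
  # Every formatted log entry also gets shipped via LogSenderJob (see formatter above)
  config.lograge.formatter = Lograge::Formatters::SomeService.new
end
```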

Related

Preventing Rails from connecting to database during initialization

I am quite new to Ruby/Rails. I am building a service that makes an API available to users and ends up with some files created in the local filesystem, without any need to connect to any database. Then, once every few hours, I want to run a piece of Ruby code that takes these local files, uploads them to Amazon S3 and registers their location in a Postgres database.
Right now both pieces of code live together in the same project. I am observing that every time a user does something the system connects to the database. I have seen this answer, which recommends eliminating all traces of ActiveRecord from my code, but given that I want my background bookkeeping process to connect to the database, I am stuck on what to do.
Is it possible to define two different profiles (one with a database and one without) and specify which profile a certain function call should run under? Would this work?
I'm a bit confused by this: the app does not magically connect to the database for kicks on every request; it does so because a specific request requires it, generally through ActiveRecord but not exclusively.
If your system is connecting every time you make a request, then that implies you have some sort of user-metric or authorisation-based code in there. Just killing off the database will cause this to fail, and you'll likely have to find it anyway to get your system to work, so I'd advise locating it.
Things to look for are before_filters in controllers, or database session management, for example; or look at what is in the logs - the query should appear - and that will tell you what is being loaded, modified or whatnot.
It might even work to stop your database just before doing a user activity and see where the error leads you. Rinse and repeat until the user activity works without the database.
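As an illustration of what to look for, a filter like the one below would open a database connection on every request; the filter, model and header names are made up, but this is the shape of thing the advice above is pointing at:
```ruby
class ApplicationController < ActionController::Base
  # This kind of filter forces a DB hit on every request
  before_action :authenticate_user

  private

  def authenticate_user
    @current_user = User.find_by(api_token: request.headers["X-Api-Token"])
  end
end

class UploadsController < ApplicationController
  # Hypothetical: endpoints that only touch the filesystem could skip the DB-backed filter
  skip_before_action :authenticate_user
end
```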

Rails generating/caching rarely changing datasets

On a website I am working on, I have an expensive controller endpoint (5+ seconds between the request being received and the response being ready) that outputs data sourced from database tables that almost never change.
The only time I expect the response to change is once every few months when I get new source data, or when redeploying the site with related code changes.
As such, I am thinking of just creating the resulting JSON (etc.) files and serving them statically. But is there a Rails way to do this, rather than just some custom CLI script that creates a bunch of files I then commit? Ideally I would still be able to use ActiveRecord etc., maybe as some sort of build/deploy-time action.
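One way the build/deploy-time idea could look is a rake task that uses ActiveRecord and writes the JSON into public/, where it is served statically; the task, model and path names here are assumptions, not an established Rails convention:
```ruby
# lib/tasks/static_json.rake
namespace :static_json do
  desc "Dump rarely changing datasets to public/ as static JSON"
  task generate: :environment do
    require "fileutils"

    data = ExpensiveDataset.all.map(&:as_json) # hypothetical model
    path = Rails.root.join("public", "datasets", "expensive_dataset.json")

    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, JSON.pretty_generate(data))
  end
end
```
The task could be run as part of the deploy, so the expensive query only happens when the source data or the code actually changes.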

Mvc azure storage, auto delete storage after certain time

I'm developing an Azure website where users can upload blobs and metadata. I want uploaded content to be deleted after some time.
The only way I can think of is to go for a cloud app instead of a website, with a worker role that checks, say, every hour whether an uploaded file has expired and then deletes it. However, I'm going for a simple website here without worker roles.
I have a function that checks whether an uploaded item should be deleted, and if the user does something on the page I can easily call this function, BUT... if the user isn't doing anything and the time runs out, it won't delete it because the user never calls the function. The storage will never be deleted. How would you solve this?
Thanks
This is too broad to give one right answer, as you can solve it in many ways. But, from an objective perspective: because you're using Web Sites, I suggest you look at Web Jobs and see if this might be the right tool for you (as it gives you the ability to run periodic jobs without the bulk of extra VMs in a web/worker configuration). You'll still need a way to manage your metadata to know what to delete.
Regarding other Azure-specific built-in mechanisms, you can also consider queuing delete messages, with an invisibility time equal to the time the content is to be available. After that time expires, the queue message becomes visible, and any queue consumer would then see the message and be able to act on it. This can be your Web Job (which has SDK support for queues) or really any other mechanism you build.
Again, a very broad question with no single right answer, so I'm just pointing out the Azure-specific mechanisms that could help solve this particular problem.
Like David said in his answer, there can be many solutions to your problem. One solution could be to rely on the blob itself. In this approach you periodically fetch the list of blobs in the blob container and decide whether each blob should be removed. The periodic fetching could be done through an Azure WebJob (if the application is deployed as a website) or through an Azure Worker Role. The worker role approach is independent of how your main application is deployed; it could be deployed as a cloud service or as a website.
With that, there are two possible approaches you can take:
Rely on the blob's Last Modified date: whenever a blob is updated, its Last Modified property gets updated. You can use that to identify whether the blob should be deleted. This approach works best if the uploaded blob is never modified (a rough sketch follows below).
Rely on Blob's custom metadata: Whenever a blob is uploaded, you could set the upload date/time in blob's metadata. When you fetch the list of blobs, you could compare the upload date/time metadata value with the current date/time and decide if the blob should be deleted or not.
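The rest of this page is Ruby, so here is a rough Ruby sketch of the Last Modified approach using the azure-storage-blob gem; the original site is ASP.NET, and the container name and retention period are made up, so treat this purely as an illustration of the idea:
```ruby
require "azure/storage/blob"
require "time"

client = Azure::Storage::Blob::BlobService.create(
  storage_account_name: ENV["AZURE_STORAGE_ACCOUNT"],
  storage_access_key:   ENV["AZURE_STORAGE_ACCESS_KEY"]
)

retention = 7 * 24 * 3600 # keep uploads for one week (example value)

# Run this periodically, e.g. from a WebJob or cron-like scheduler
client.list_blobs("uploads").each do |blob|
  last_modified = Time.parse(blob.properties[:last_modified])
  client.delete_blob("uploads", blob.name) if Time.now - last_modified > retention
end
```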
Another approach might be to use the container name as the "expiry date".
This might make deletion easier, as you could then just remove expired containers.

Asp.Net MVC - how would i maintain user state in azure in my application

I know that there are a few questions like this, but this question is specific to this particular situation.
I'm developing a platform for taking tests online. A test is a set of images and accompanying questions. It's being hosted on Azure and uses MVC 4.
I would love it if, when the user has taken half the test and the browser crashes or something else pulls them away from the test, they could come back and be given the option to resume.
I have one idea myself, but would like to know if there are other options. I was considering using localStorage: when a user starts a test, the information for the test is saved in localStorage, and every time they move on to a new image, the local state is updated. Then, when the test player is loaded, it checks whether any ongoing tests are available.
How could I do it? Has anyone dealt with a similar problem/solution?
Local Storage is not a good choice, because it is specific to each instance. That means if you have two instances of a Web Role (the recommended minimum), then each instance would have its own local storage. They are not shared, and there is no way to access local storage on a specific machine.
You really have two options. You could use a database like SQL Azure, or use Azure caching. Azure caching is probably easier, since it's super easy to serialize/deserialize complex objects, but the downside is that caching is only valid for 72 hours. If a cached object isn't accessed/updated in 72 hours, it gets purged.
I would not recommend you storing this information on the client browser. The user has access to local storage, cookies, etc ... and could modify it. You could store the test start time in your database on the server. Then every time the user sends a request to the server in order to answer a question you would verify if the test is still active or the maximum allowed time has elapsed.

How to run some tasks in the background and put the results to a global memory hash?

I'm building a website with Rails that lets users check whether some domains have been registered. The logic is designed like this:
There is a text field on a page that lets users input some domain names
When the user clicks the check button, the input is posted to the server
The server gets the input, creates a background task (which will be executed in another process), and returns some text like "checking now..." to the user immediately
The background task posts the domain names to another site, gets the response, parses it for the useful information, then puts the data into a global memory hash (the task takes 3 seconds)
The page containing "checking now..." has a JavaScript function that requests the check result from the server. It runs every 2 seconds until it gets a result.
There is an action on the server side to handle the "check-status" request. It checks that global memory hash; if it finds the result, it returns it, otherwise it returns "[]"
(I'm using Rails 2.3.8, delayed_job)
I was a Java developer, and this is how I would do it in Java. But I find it difficult to hold a "global memory hash" shared between the delayed_job worker and Rails, because they are in different processes.
Is my design impossible in Rails? Must I store the checking result in the database or in something like memcached?
And can these background tasks run in parallel?
It's not just between DelayedJob's workers and Rails' workers that you won't see the global variables; in production you will be dealing with multiple Rails worker processes as well. So even if you were to set a global variable in Rails, some other Rails worker won't see it.
This is by design. It has the advantage that you can spread the load of Rails and DelayedJob across multiple machines, because they only deal with the stateless request, and look at the database system or other persistent storage to add the statefulness that your web application needs.
From what I've gathered, a Java web application may use a threaded model that would allow you to perform background tasks and set global variables like you want to. But that also limits you to a single machine; what would you do if you had to scale up?
Memcached actually sounds like a really good solution in this case. It's a snap to install, requires very little set up, and is easy to use from within Rails too.
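A minimal sketch of that pattern, assuming a memcached-backed Rails.cache and delayed_job; DomainCheckJob, DomainChecksController and the cache key are illustrative names, not from the original app:
```ruby
# Sketch only: the delayed_job worker writes results into Rails.cache
# (backed by memcached), and the controller polls for them.
class DomainCheckJob < Struct.new(:check_id, :domains)
  def perform
    results = domains.map { |d| { :domain => d, :registered => registered?(d) } }
    Rails.cache.write("domain_check:#{check_id}", results, :expires_in => 10.minutes)
  end

  def registered?(domain)
    # post the domain name to the external site and parse the response
    false # placeholder
  end
end

class DomainChecksController < ApplicationController
  def create
    check_id = ActiveSupport::SecureRandom.hex(8)
    Delayed::Job.enqueue(DomainCheckJob.new(check_id, params[:domains]))
    render :json => { :id => check_id, :status => "checking now..." }
  end

  # Polled every 2 seconds by the page's JavaScript
  def status
    results = Rails.cache.read("domain_check:#{params[:id]}")
    render :json => (results || [])
  end
end
```
Because the state lives in memcached rather than in a per-process global, any Rails worker and any delayed_job worker can read and write it, and the background tasks can run in parallel as long as enough workers are running.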
