Fluentd record with source filename parts - fluentd

I'm using fluentd on a server to export logs.
My configuration uses something like this to capture several log files:
<source>
type tail
path /my/path/to/file/*/*.log
</source>
The different files are tracked properly, however, I have one more feature needed:
The two wildcards parts of the path should be added to the record as well (let's call them directory and filename).
If the in_tail plugin would add the filename to the record, I could write a formatter to split and edit.
Am I missing something, or is rewriting in_tail to suit my needs the best way to go?

So, yes. Extending in_tail is the way to go.
I've written a new plugin that inherits from NewTailInput and uses a slightly different parse_singleline and parse_multilines to add the path to the record.
Much better than expected.
Update 6/3/2020:
I've dug up the code, this was the least Ruby I could muster to solve the problem.
Customize convert_line_to_event_with_path_names for your needs to add custom data to the records.
module Fluent
class DirParsingTailInput < NewTailInput
Plugin.register_input('dir_parsing_tail', self)
def initialize
super
end
def receive_lines(lines, tail_watcher)
es = @receive_handler.call(lines, tail_watcher)
unless es.empty?
tag = if @tag_prefix || @tag_suffix
@tag_prefix + tail_watcher.tag + @tag_suffix
else
@tag
end
begin
router.emit_stream(tag, es)
rescue
# ignore errors. Engine shows logs and backtraces.
end
end
end
def convert_line_to_event_with_path_names(line, es, path)
begin
directory = File.basename(File.dirname(path))
filename = File.basename(path, ".*")
line.chomp! # remove \n
@parser.parse(line) { |time, record|
if time && record
if directory != "logs"
record["parent"] = directory
record["child"] = filename
else
record["parent"] = filename
end
es.add(time, record)
else
log.warn "pattern not match: #{line.inspect}"
end
}
rescue => e
log.warn line.dump, :error => e.to_s
log.debug_backtrace(e.backtrace)
end
end
def parse_singleline(lines, tail_watcher)
es = MultiEventStream.new
lines.each { |line|
convert_line_to_event_with_path_names(line, es, tail_watcher.path)
}
es
end
def parse_multilines(lines, tail_watcher)
lb = tail_watcher.line_buffer
es = MultiEventStream.new
if @parser.has_firstline?
lines.each { |line|
if @parser.firstline?(line)
if lb
convert_line_to_event_with_path_names(lb, es, tail_watcher.path)
end
lb = line
else
if lb.nil?
log.warn "got incomplete line before first line from #{tail_watcher.path}: #{line.inspect}"
else
lb << line
end
end
}
else
lb ||= ''
lines.each do |line|
lb << line
@parser.parse(lb) { |time, record|
if time && record
convert_line_to_event_with_path_names(lb, es, tail_watcher.path)
lb = ''
end
}
end
end
tail_watcher.line_buffer = lb
es
end
end
end
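The path handling inside convert_line_to_event_with_path_names can be tried on its own, outside fluentd. This is a minimal sketch, assuming a hypothetical helper name (path_parts) and the record keys "directory" and "filename" from the question; only File.dirname and File.basename come from the plugin code above.

```ruby
# Hypothetical helper (not part of fluentd): derive the two wildcard
# parts of a tailed file's path.
def path_parts(path)
  {
    "directory" => File.basename(File.dirname(path)), # last directory component
    "filename"  => File.basename(path, ".*")          # file name without its extension
  }
end

# Merge the parts into a record the way the plugin adds its own keys.
record = { "message" => "hello" }
record.merge!(path_parts("/my/path/to/file/app1/server.log"))
puts record.inspect
```

For a path matched by /my/path/to/file/*/*.log, the first wildcard ends up under "directory" and the second, minus the .log extension, under "filename".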

Related

Redirect to another endpoint with large data - Rails/Ruby

I have a doubt about showing a generated CSV file to the user (with a large amount of data). So here is the task I have to do.
App: I have a film that has many characters.
Task:
allow users to upload characters via CSV (ok, done)
if there are errors, show them for each row (ok, done)
in the results page, also show a link to a new CSV file only with the remaining characters - the ones that couldn’t be created (I’m stuck here)
Here is part of my code (upload method):
def upload
saved_characters = []
characters_with_errors = []
errors = {}
begin
CSV.parse(params[:csv].read, **csv_options) do |row|
row_hash = clear_input(row.to_h)
new_character = Character.new(row_hash)
if new_character.save
add_images_to(new_character, row)
saved_characters << new_character
else
characters_with_errors << new_character
errors[new_character.name] = new_character.errors.full_messages.join(', ')
end
end
rescue CSV::MalformedCSVError => e
errors = { 'General error': e.message }.merge(errors)
end
@upload = {
errors: errors,
characters: saved_characters,
characters_with_errors: characters_with_errors
}
end
The issue: large amount of data
In the end, almost everything in upload.html.erb works fine: it shows the results and the errors per column. BUT I'm not sure how to create a link on this page that sends the user to the new CSV file (containing only the characters with errors). If the link sends the user to another method / GET endpoint (for the view with CSV format), how can I send such a large amount of data (params won't work because they will get too long)? What would be the best practice here?
You can store the data in a session variable and then redirect to a new action that downloads the file. In the new action, read the data back from the session and generate the CSV file. Keep in mind that Rails' default cookie-based session store holds only about 4 KB, so for a large upload this needs a server-side session store.
For example, in the upload action, you can do something like this:
session[:characters_with_errors] = characters_with_errors
redirect_to download_csv_path
In the download_csv action, you can do something like this:
characters_with_errors = session[:characters_with_errors]
session[:characters_with_errors] = nil
respond_to do |format|
format.csv { send_data generate_csv(characters_with_errors) }
end
In the generate_csv method, you can do something like this:
def generate_csv(characters_with_errors)
CSV.generate do |csv|
csv << ['name', 'age' ]
characters_with_errors.each do |character|
csv << [character.name, character.age]
end
end
end
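To try generate_csv outside a Rails app, here is a minimal sketch that swaps the Character model for a plain Struct. The Struct and the sample rows are assumptions for illustration; the method body is the one from the answer.

```ruby
require 'csv'

# Stand-in for the Character model (assumption): only the two
# attributes that generate_csv reads.
Character = Struct.new(:name, :age)

def generate_csv(characters_with_errors)
  CSV.generate do |csv|
    csv << ['name', 'age']                      # header row
    characters_with_errors.each do |character|
      csv << [character.name, character.age]    # one row per failed record
    end
  end
end

failed = [Character.new('Rick', nil), Character.new('Ilsa', 'abc')]
puts generate_csv(failed)
```

A nil attribute is written as an empty field, so rows that failed because of a missing value still line up with the header.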
Another option is to store the data in a temporary file and then send the user to the new CSV file. Here is an example:
def upload
saved_characters = []
characters_with_errors = []
errors = {}
begin
CSV.parse(params[:csv].read, **csv_options) do |row|
row_hash = clear_input(row.to_h)
new_character = Character.new(row_hash)
if new_character.save
add_images_to(new_character, row)
saved_characters << new_character
else
characters_with_errors << new_character
errors[new_character.name] = new_character.errors.full_messages.join(', ')
end
end
rescue CSV::MalformedCSVError => e
errors = { 'General error': e.message }.merge(errors)
end
@upload = {
errors: errors,
characters: saved_characters,
characters_with_errors: characters_with_errors
}
respond_to do |format|
format.html
format.csv do
# Create a temporary file
tmp = Tempfile.new('characters_with_errors')
# Write the CSV data to the temporary file
tmp.write(characters_with_errors.to_csv)
# Send the user to the new CSV file
send_file tmp.path, filename: 'characters_with_errors.csv'
# Close the temporary file
tmp.close
end
end
end
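The temporary-file step can be sketched in plain Ruby. This assumes the failed rows are already plain arrays of values and uses a hypothetical helper name (write_failed_rows); it only shows the write, flush and cleanup order, not the Rails response.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: write the failed rows to a temp file and return it.
def write_failed_rows(rows)
  tmp = Tempfile.new(['characters_with_errors', '.csv'])
  tmp.write(CSV.generate { |csv| rows.each { |row| csv << row } })
  tmp.flush # make sure the data is on disk before anything reads tmp.path
  tmp
end

tmp = write_failed_rows([%w[name age], ['Rick', 'abc']])
puts File.read(tmp.path)
tmp.close! # close and delete once the file is no longer needed
```

Flushing before handing out tmp.path matters because anything that opens the path separately would otherwise see an empty or partial file.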

How to import a large size (5.5Gb) CSV file to Postgresql using ruby on rails?

I have a huge CSV file, 5.5 GB in size, with more than 100 columns. I want to import only specific columns from it. What are the possible ways to do this?
I want to import it into two different tables: one field into one table and the rest of the fields into another table.
Should I use the COPY command in PostgreSQL, the CSV class, or a gem such as SmartCSV for this purpose?
Regards,
Suresh.
If I had 5 GB of CSV, I would rather import it without Rails! But you may have a use case that needs Rails...
Since you said Rails, I suppose you are talking about a web request and ActiveRecord...
If you don't mind waiting (and tying up one instance of your server process), you can do this:
First, notice two things: 1) the use of a temp table, so that errors don't mess up your destination table (this is optional, of course); 2) the use of an option to truncate the destination table first.
CONTROLLER ACTION:
def updateDB
remote_file = params[:remote_file] ##<ActionDispatch::Http::UploadedFile>
truncate = (params[:truncate]=='true') ? true : false
if remote_file
result = Model.csv2tempTable(remote_file.original_filename, remote_file.tempfile)
if result[:result]
Model.updateFromTempTable(truncate)
flash[:notice] = 'success.'
else
# csv2tempTable returns its messages under the :erros key
flash[:error] = 'Errors: ' + result[:erros].join(" ==>")
end
else
flash[:error] = 'Error: no file given.'
end
redirect_to somewhere_else_path
end
MODEL METHODS:
# References:
# http://www.kadrmasconcepts.com/blog/2013/12/15/copy-millions-of-rows-to-postgresql-with-rails/
# http://stackoverflow.com/questions/14526489/using-copy-from-in-a-rails-app-on-heroku-with-the-postgresql-backend
# http://www.postgresql.org/docs/9.1/static/sql-copy.html
#
def self.csv2tempTable(uploaded_name, uploaded_file)
erros = []
begin
#read csv file
file = uploaded_file
Rails.logger.info "Creating temp table...\n From: #{uploaded_name}\n "
#init connection
conn = ActiveRecord::Base.connection
rc = conn.raw_connection
# remove columns created_at/updated_at
rc.exec "drop table IF EXISTS #{TEMP_TABLE}; "
rc.exec "create table #{TEMP_TABLE} (like #{self.table_name}); "
rc.exec "alter table #{TEMP_TABLE} drop column created_at, drop column updated_at;"
#copy it!
rc.exec("COPY #{TEMP_TABLE} FROM STDIN WITH CSV HEADER")
while !file.eof?
# Add row to copy data
l = file.readline
if l.encoding.name != 'UTF-8'
Rails.logger.info "line encoding is #{l.encoding.name}..."
# ENCODING:
# If the source string is already encoded in UTF-8, then just calling .encode('UTF-8') is a no-op,
# and no checks are run. However, converting it to UTF-16 first forces all the checks for invalid byte
# sequences to be run, and replacements are done as needed.
# Reference: http://stackoverflow.com/questions/2982677/ruby-1-9-invalid-byte-sequence-in-utf-8?rq=1
l = l.encode('UTF-16', 'UTF-8').encode('UTF-8', 'UTF-16')
end
Rails.logger.info "writing line with encoding #{l.encoding.name} => #{l[0..80]}"
rc.put_copy_data( l )
end
# We are done adding copy data
rc.put_copy_end
# Display any error messages
while res = rc.get_result
e_message = res.error_message
if e_message.present?
erros << "Error executing SQL: \n" + e_message
end
end
rescue StandardError => e
erros << "Error in csv2tempTable: \n #{e} => #{e.to_yaml}"
end
if erros.present?
Rails.logger.error erros.join("*******************************\n")
{ result: false, erros: erros }
else
{ result: true, erros: [] }
end
end
# copy from TEMP_TABLE into self.table_name
# If <truncate> = true, truncates self.table_name first
# If <truncate> = false, update lines from TEMP_TABLE into self.table_name
#
def self.updateFromTempTable(truncate)
erros = []
begin
Rails.logger.info "Refreshing table #{self.table_name}...\n Truncate: #{truncate}\n "
#init connection
conn = ActiveRecord::Base.connection
rc = conn.raw_connection
#
if truncate
rc.exec "TRUNCATE TABLE #{self.table_name}"
return false unless check_exec(rc)
rc.exec "INSERT INTO #{self.table_name} SELECT *, '#{DateTime.now}' as created_at, '#{DateTime.now}' as updated_at FROM #{TEMP_TABLE}"
return false unless check_exec(rc)
else
#remove lines from self.table_name that are present in temp
rc.exec "DELETE FROM #{self.table_name} WHERE id IN ( SELECT id FROM #{TEMP_TABLE} )"
return false unless check_exec(rc)
#copy lines from temp into self + includes timestamps
rc.exec "INSERT INTO #{self.table_name} SELECT *, '#{DateTime.now}' as created_at, '#{DateTime.now}' as updated_at FROM #{TEMP_TABLE};"
return false unless check_exec(rc)
end
rescue StandardError => e
Rails.logger.error "Error in updateFromTempTable: \n #{e} => #{e.to_yaml}"
return false
end
true
end
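The COPY approach above loads every column in the file, while the question asks for specific columns only. One way to bridge that, sketched here under assumptions, is to stream the big file once and write a trimmed CSV containing just the wanted columns, which could then be fed to COPY. The helper name (extract_columns) and the sample column names are hypothetical; CSV.foreach reads one row at a time, so the whole file is never held in memory.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: stream `input` and write only `columns` to `output`.
def extract_columns(input, output, columns)
  CSV.open(output, 'w') do |out|
    out << columns                               # header for the trimmed file
    CSV.foreach(input, headers: true) do |row|   # one row at a time
      out << row.values_at(*columns)             # pick fields by header name
    end
  end
end

# Small demo with made-up data.
Dir.mktmpdir do |dir|
  src = File.join(dir, 'in.csv')
  dst = File.join(dir, 'out.csv')
  File.write(src, "id,name,extra\n1,a,x\n2,b,y\n")
  extract_columns(src, dst, %w[id name])
  puts File.read(dst)
end
```

Calling the helper twice with different column lists would give one trimmed file per destination table.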

how to set query timeout for oracle 11 in ruby

I saw other threads stating how to do it for MySQL, and even how to do it in Java, but not how to set the query timeout in Ruby.
I'm trying to use the setQueryTimeout function in JRuby using OJDBC7, but can't find how to do it in Ruby. I've tried the following:
@c.connection.instance_variable_get(:@connection).instance_variable_set(:@query_timeout, 1)
@c.connection.instance_variable_get(:@connection).instance_variable_set(:@read_timeout, 1)
@c.connection.setQueryTimeout(1)
I also tried modifying my database.yml file to include
adapter: jdbc
driver: oracle.jdbc.driver.OracleDriver
timeout: 1
None of the above had any effect, other than setQueryTimeout, which threw a method error.
Any help would be great.
So I found a way to make it work, but I don't like it. It's very hackish and orphans queries on the database, but it at least allows my app to continue executing. I would still love to find a way to cancel the statement so I'm not orphaning queries that take longer than 10 seconds.
query_thread = Thread.new {
#execute query
}
begin
Timeout::timeout(10) do
query_thread.join()
end
rescue
Thread.kill(query_thread)
results = Array.new
end
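The same workaround can be written without the Timeout module, since Thread#join accepts a time limit and returns nil when it expires. This is a minimal sketch with a hypothetical wrapper name (run_with_deadline) and a sleep standing in for the query; it has the same limitation as the code above, in that killing the Ruby thread does not cancel the statement on the database.

```ruby
# Hypothetical wrapper: run the block in a thread and give up after `seconds`.
def run_with_deadline(seconds, fallback)
  worker = Thread.new { yield }
  if worker.join(seconds) # join returns nil when the limit expires
    worker.value          # the block's result
  else
    worker.kill           # stops the Ruby thread only; the database may keep working
    fallback
  end
end

slow = run_with_deadline(0.1, []) { sleep 2; [:row] } # stand-in for a slow query
fast = run_with_deadline(1, [])   { [:row] }
p slow
p fast
```

The slow call returns the fallback (an empty array, like results = Array.new above) and the fast call returns the block's value.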
Query timeout on an Oracle DB works for me with Rails 4 and JRuby.
With JRuby you can use the JDBC function statement.setQueryTimeout to define the query timeout.
Unfortunately this requires patching the oracle-enhanced adapter as shown below.
This example implements an iterator query that does not store the result in an array and also uses the query timeout.
# hold open SQL-Cursor and iterate over SQL-result without storing whole result in Array
# Peter Ramm, 02.03.2016
# extend the class with a getter to allow access to the internal variable @raw_statement
ActiveRecord::ConnectionAdapters::OracleEnhancedJDBCConnection::Cursor.class_eval do
def get_raw_statement
@raw_statement
end
end
# Class extension by Module-Declaration : module ActiveRecord, module ConnectionAdapters, module OracleEnhancedDatabaseStatements
# does not work as Engine with Winstone application server, therefore hard manipulation of class ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter
# and extension with method iterate_query
ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter.class_eval do
# Method comparable with ActiveRecord::ConnectionAdapters::OracleEnhancedDatabaseStatements.exec_query,
# but without storing whole result in memory
def iterate_query(sql, name = 'SQL', binds = [], modifier = nil, query_timeout = nil, &block)
type_casted_binds = binds.map { |col, val|
[col, type_cast(val, col)]
}
log(sql, name, type_casted_binds) do
cursor = nil
cached = false
if without_prepared_statement?(binds)
cursor = @connection.prepare(sql)
else
unless @statements.key? sql
@statements[sql] = @connection.prepare(sql)
end
cursor = @statements[sql]
binds.each_with_index do |bind, i|
col, val = bind
cursor.bind_param(i + 1, type_cast(val, col), col)
end
cached = true
end
cursor.get_raw_statement.setQueryTimeout(query_timeout) if query_timeout
cursor.exec
if name == 'EXPLAIN' and sql =~ /^EXPLAIN/
res = true
else
columns = cursor.get_col_names.map do |col_name|
@connection.oracle_downcase(col_name).freeze
end
fetch_options = {:get_lob_value => (name != 'Writable Large Object')}
while row = cursor.fetch(fetch_options)
result_hash = {}
columns.each_index do |index|
# Strip first so the cleaned value is what ends up in result_hash
row[index] = row[index].strip if row[index].class == String # Remove possible 0x00 at end of string, this leads to error in Internet Explorer
result_hash[columns[index]] = row[index]
end
result_hash.extend SelectHashHelper
modifier.call(result_hash) unless modifier.nil?
yield result_hash
end
end
cursor.close unless cached
nil
end
end #iterate_query
end #class_eval
class SqlSelectIterator
def initialize(stmt, binds, modifier, query_timeout)
@stmt = stmt
@binds = binds
@modifier = modifier # proc for modification of record
@query_timeout = query_timeout
end
def each(&block)
# Execute SQL and call block for every record of result
ActiveRecord::Base.connection.iterate_query(@stmt, 'sql_select_iterator', @binds, @modifier, @query_timeout, &block)
end
end
Use the SqlSelectIterator class above like this:
SqlSelectIterator.new(stmt, binds, modifier, query_timeout).each do |record|
process(record)
end
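The iterator pattern itself can be tried without a database. The sketch below is an assumption-laden stand-in: RowIterator and its in-memory rows are hypothetical and replace the call to iterate_query, but the shape is the same, with an each method that applies an optional modifier and yields one record at a time. Including Enumerable is an addition not present in the class above; it gives methods such as first and map on top of each.

```ruby
# Stand-in row source (hypothetical); the real class delegates to iterate_query.
class RowIterator
  include Enumerable

  def initialize(rows, modifier = nil)
    @rows = rows
    @modifier = modifier # optional proc that changes each record in place
  end

  def each
    return enum_for(:each) unless block_given?
    @rows.each do |row|
      @modifier.call(row) unless @modifier.nil?
      yield row # hand over one record at a time
    end
  end
end

strip_name = ->(record) { record['name'] = record['name'].strip }
rows = [{ 'name' => ' a ' }, { 'name' => 'b' }]
RowIterator.new(rows, strip_name).each { |record| puts record['name'] }
```

Because each yields as it goes, a caller that stops early (for example with first) never touches the remaining records.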

Silence ActionView::Template::Errors, like "isn't precompiled"

My question is about the standard behavior of the action-view gem when using the Rails asset pipeline.
It throws an Exception and the app-execution stops whenever there's an image which isn't precompiled, so the user just gets to see the standard blank page saying: "... something went wrong".
Something as trivial as a missing image (could be an icon, maybe with just a misspelled name...) shouldn't be a showstopper. Should it be?!
We would like to change this radical behavior to a more mild version: Having the app continue working, but, of course, notifying us about the missing image.
Question:
Is there any other way than monkeypatching the relevant part of the helper method contained in the action-view gem?
Is there any config we could modify so there would be no need for this patch?
Having this kind of monkeypatch is considered a maintenance nightmare in case of gem-updates, isn't it?
This is our actual patch: called: "assetpipe_easy_errors.rb" residing in config/initializers, the relevant method is "digest_for"
Sprockets::Helpers::RailsHelper::AssetPaths.class_eval do
attr_accessor :asset_environment, :asset_prefix, :asset_digests, :compile_assets, :digest_assets
class AssetNotPrecompiledError < StandardError; end
def asset_for(source, ext)
source = source.to_s
return nil if is_uri?(source)
source = rewrite_extension(source, nil, ext)
asset_environment[source]
rescue Sprockets::FileOutsidePaths
nil
end
def digest_for(logical_path)
if digest_assets && asset_digests && (digest = asset_digests[logical_path])
return digest
end
if compile_assets
if digest_assets && asset = asset_environment[logical_path]
return asset.digest_path
end
return logical_path
else
#original code: raise AssetNotPrecompiledError.new("#{logical_path} isn't precompiled")
### own Patch: these next four lines:
Rails.logger.info(" arrg!! an image is missing ")
### example: FeedbackMailer.generic_system_message(subject,bodytext).deliver
FeedbackMailer.generic_system_message("asset error",logical_path).deliver
return logical_path
end
end
def rewrite_asset_path(source, dir, options = {})
if source[0] == ?/
source
else
if digest_assets && options[:digest] != false
source = digest_for(source)
end
source = File.join(dir, source)
source = "/#{source}" unless source =~ /^\//
source
end
end
def rewrite_extension(source, dir, ext)
source_ext = File.extname(source)
if ext && source_ext != ".#{ext}"
if !source_ext.empty? && (asset = asset_environment[source]) &&
asset.pathname.to_s =~ /#{source}\Z/
source
else
"#{source}.#{ext}"
end
else
source
end
end
end
Any ideas are highly appreciated
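One direction that avoids touching the gem, offered only as a sketch under assumptions, is to leave digest_for alone and route asset lookups through a helper of your own that rescues the error, logs it, and falls back to the logical path. Everything below is a self-contained stand-in: the error class, the PRECOMPILED table and this digest_for are hypothetical imitations of the behaviour described in the question, and safe_asset_path is an invented name, not a Rails or Sprockets API.

```ruby
require 'logger'

# Hypothetical stand-ins that mimic "raise when the asset isn't precompiled".
class AssetNotPrecompiledError < StandardError; end

PRECOMPILED = { 'logo.png' => 'logo-abc123.png' }.freeze

def digest_for(logical_path)
  PRECOMPILED.fetch(logical_path) do
    raise AssetNotPrecompiledError, "#{logical_path} isn't precompiled"
  end
end

# Wrapper to call instead of the resolver: log the problem, keep rendering.
def safe_asset_path(logical_path, logger)
  digest_for(logical_path)
rescue AssetNotPrecompiledError => e
  logger.warn("asset error: #{e.message}") # a mailer call could go here as well
  logical_path
end

logger = Logger.new($stderr)
puts safe_asset_path('logo.png', logger)
puts safe_asset_path('missing.png', logger)
```

The trade-off is that every view has to use the wrapper, whereas the monkeypatch changes the behaviour everywhere at once.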

Traversing directories and reading from files in ruby on rails

I'm having some trouble figuring out how to 1) traverse a directory and 2) take each file (.txt) and save its contents as a string. I'm obviously pretty new to both Ruby and Rails.
I know that I could open the file with f=File.open("/path/*.txt") and then output it with puts f.read, but I would rather save it as a string, not .txt, and don't know how to do this for each file.
Thanks!
You could use Dir.glob and build a hash that maps each filename to its contents, read with IO.read. A sketch:
file_names_with_contents = Dir.glob('/path/*.txt').inject({}) { |results, file_name| results[file_name] = IO.read(file_name); results }
You could probably also use tap here:
file_names_with_contents = {}.tap do |h|
Dir.glob('/path/*.txt').each{|file_name| h[file_name] = IO.read(file_name)}
end
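A runnable variant of the same idea, as a minimal sketch: it assumes Ruby 2.6 or later for Array#to_h with a block, uses a hypothetical helper name (read_text_files), and creates its own throwaway directory so it can be run as-is.

```ruby
require 'tmpdir'

# Hypothetical helper: map each .txt file in `dir` to its contents.
def read_text_files(dir)
  Dir.glob(File.join(dir, '*.txt')).to_h { |path| [path, File.read(path)] }
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.txt'), 'alpha')
  File.write(File.join(dir, 'b.txt'), 'beta')
  File.write(File.join(dir, 'c.log'), 'ignored') # not matched by *.txt
  read_text_files(dir).each { |path, text| puts "#{File.basename(path)}: #{text}" }
end
```

File.read opens, reads and closes the file in one call, so each value in the hash is already a plain string.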
The following is based on Python's os.walk function, which returns a list of tuples of the form (dirname, dirs, files). Since this is Ruby, you get a list of arrays of the form
[dirname, dirs, files]. This should be easier to process than trying to recursively walk the directory yourself. To run the code, you'll need to provide a demo_folder.
def walk(dir)
dir_list = []
def _walk(dir, dir_list)
fns = Dir.entries(dir)
dirs = []
files = []
dirname = File.expand_path(dir)
list_item = [dirname, dirs, files]
fns.each do |fn|
next if [".",".."].include? fn
path_fn = File.join(dirname, fn)
if File.directory? path_fn
dirs << fn
_walk(path_fn, dir_list)
else
files << fn
end
end
dir_list << list_item
end
_walk(dir, dir_list)
dir_list
end
if __FILE__ == $0
require 'json'
dir_list = walk('demo_folder')
puts JSON.pretty_generate(dir_list)
end
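If only the files are needed rather than the per-directory grouping, Ruby's standard library Find module already walks a tree recursively. A minimal sketch, assuming a hypothetical helper name (files_under) and a throwaway directory built for the demo:

```ruby
require 'find'
require 'tmpdir'

# Hypothetical helper: collect every regular file under `root`, recursively.
def files_under(root)
  Find.find(root).select { |path| File.file?(path) }
end

Dir.mktmpdir do |dir|
  Dir.mkdir(File.join(dir, 'sub'))
  File.write(File.join(dir, 'y.txt'), 'top level')
  File.write(File.join(dir, 'sub', 'x.txt'), 'nested')
  files_under(dir).each { |path| puts "#{path}: #{File.read(path)}" }
end
```

Find.find yields directories as well as files, which is why the helper filters with File.file?.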
Jake's answer is good enough, but each_with_object will make it slightly shorter. I also made it recursive.
def read_dir dir
Dir.glob("#{dir}/*").each_with_object({}) do |f, h|
if File.file?(f)
h[f] = File.read(f) # reads and closes the file in one call
elsif File.directory?(f)
h[f] = read_dir(f)
end
end
end
When the directory is like:
--+ directory_a
+----file_b
+-+--directory_c
| +-----file_d
+----file_e
then
read_dir(directory_a)
will return:
{file_b => contents_of_file_b,
directory_c => {file_d => contents_of_file_d},
file_e => contents_of_file_e}
