I have an array of email addresses and I want to delete everything after the ";".
I tried chomp, but that didn't seem to work.
How would I loop through the array and remove everything after the ";" in my view?
Try:
["aaaa;bbb"].map { |e| e.gsub(/;.*/, ';') }
From the documentation: gsub returns a copy of str with all occurrences of pattern substituted for the second argument.
This regexp matches ; and every character after it, so you have to pass ; as the second argument to keep the semicolon itself.
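As a minimal sketch of the approach above, here it is applied to a whole array. The `emails` array and its addresses are made-up placeholders, not data from the question. The second variant replaces the match with an empty string, in case the semicolon should go as well.

```ruby
# Hypothetical sample data; the real array comes from your application.
emails = ["alice@example.com;Alice", "bob@example.com;Bob;extra", "carol@example.com"]

# Keep the ";" and drop everything after it (same regexp as in the answer).
with_semicolon = emails.map { |e| e.gsub(/;.*/, ';') }

# Drop the ";" too by replacing the match with an empty string.
without_semicolon = emails.map { |e| e.gsub(/;.*/, '') }

puts with_semicolon.inspect
puts without_semicolon.inspect
```

Both calls return new arrays and leave `emails` unchanged; entries without a ";" pass through as they are.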
I have a test_list.txt file containing lines of file names. Each file name contains the date when the file was created. Here's what it looks like:
test_list.txt:
UTF_06012018_SAMPLE_Control.xlsx
UTF_06022018_SAMPLE_Control.xlsx
UTF_06092018_SAMPLE_Control.xlsx
UTF_06022018_SAMPLE_Control.xlsx
UTF_06082018_SAMPLE_Control.xlsx
UTF_06032018_SAMPLE_Demand.xlsx
UTF_06092018_SAMPLE_Demand.xlsx
UTF_06122018_SAMPLE_Demand.xlsx
UTF_06032018_SAMPLE_Control.xlsx
UTF_06022018_SAMPLE_Demand.xlsx
The date in the file name is in the format mmddyyyy. Also, there are files which were created on the same date. What I'm trying to do is to print the line that matches the regex expression for the dates and sort them alphabetically by date.
Here's my code so far:
path = Dir.glob('/path/to/my/file/*.txt').first
regex = /(\d{1,2}\d{1,2}\d{4})/
samplefile = File.open(path)
string = File.read(samplefile)
string.scan(regex).each do|x|
sorted = x.sort_by { |s| s.scan(/\d+/).first.to_i }
puts sorted
end
However, my code only prints the dates, not the entire line. On top of that, it doesn't even sort them. How can I tweak it to do what I intend?
You may use
string.scan(/^([^_]*_(\d++)(.*))/).sort_by { |m,n,z| [n.to_i,z] }.collect{ |m,n,z| m}.join("\n")
See the Ruby demo.
The regex will extract all lines into a three element array with the following values: whole line, the date string, and the string after the date. Then, .sort_by { |m,n,z| [n.to_i,z] } will sort by the date string first, and then by the substring after the date. The .collect{ |m,n,z| m} will only keep the first value of the array elements and .join("\n") will re-build the resulting string.
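Note that instead of [n.to_i,z], you might want to parse the date string first, then use [Date.strptime(n,"%m%d%Y"),z] (add require 'date'). The format string is %m%d%Y because the dates are in mmddyyyy format; comparing the digits as a single integer only sorts correctly while all dates fall in the same year.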
Regex details
^ - start of a line
([^_]*_(\d++)(.*)) - Group 1 (m): the whole line meeting the following patterns:
[^_]* - zero or more chars other than _
_ - an underscore
(\d++) - Group 2 (n): 1+ digits, a possessive match
(.*) - Group 3 (z): the rest of the line.
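The date-parsing variant could look roughly like the sketch below. It assumes the mmddyyyy format stated in the question (hence "%m%d%Y") and uses a few of the sample lines inline instead of reading test_list.txt; the variable names are only illustrative.

```ruby
require 'date'

# A few sample lines from the question, deliberately out of order.
string = <<~LIST
  UTF_06092018_SAMPLE_Control.xlsx
  UTF_06012018_SAMPLE_Control.xlsx
  UTF_06032018_SAMPLE_Demand.xlsx
  UTF_06032018_SAMPLE_Control.xlsx
LIST

# Each scan result is [whole line, date digits, rest of the line].
matches = string.scan(/^([^_]*_(\d++)(.*))/)

# Sort by the parsed date first, then by the text after the date.
sorted = matches
  .sort_by { |_line, date, rest| [Date.strptime(date, "%m%d%Y"), rest] }
  .map(&:first)

puts sorted
```

Sorting on real Date objects keeps the order correct even if the list later spans more than one year.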
I want to achieve this: retrieve a word from a CSV file, then check whether a post contains a hashtag made from that word. The problem is that I was unable to perform the concatenation.
The "Type mismatch" error could be solved by enclosing the concatenation in parentheses, as in:
WHERE line[0] =~ (".*#" + line[0] + ".*")
However, logically, that WHERE clause can never be true. A string cannot be equal to a longer string (itself, preceded by an extra character).
If you are just trying to see if a word starts with a hashtag, this should work:
WHERE line[0] STARTS WITH "#"
Or, if you want to see if there is a hashtag in the string:
WHERE line[0] CONTAINS "#"
I have a long list of information stored in a variable and I need to run some regex expressions against that variable and get various pieces of information from what is found.
How can you store the line that matches a regex expression in a variable?
How can you get the line number of the line that matches a regex expression?
Here is an example of what I'm talking about.
body = "service timestamps log datetime msec localtime show-timezone
service password-encryption
!
hostname switch01
!
boot-start-marker"
If I search for the line that contains "hostname" I need the line number, in this case it would be 4. I also need to store the line "hostname switch01" as another variable.
Any ideas?
Thanks!
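First, convert the string to lines: body.split("\n"). Note the double quotes: in Ruby, '\n' in single quotes is a literal backslash followed by n, not a newline. Then add line numbers with .each_with_index, and select the matching lines with .select {|line, line_nr| line =~ your_regex }. Putting it all together:
body.split("\n").each_with_index
.select {|line, line_nr| line =~ your_regex }
.map {|line, line_nr| line_nr }
This gives you the indices of all lines matching your_regex. The indices start at 0, so add 1 for a 1-based line number (the "hostname" line in the question has index 3, that is, line 4). Drop the final .map to keep the matching lines alongside their indices.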
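For the example in the question, a sketch along these lines returns both the matching line and a 1-based line number. It uses with_index(1) so that counting starts at 1; the variable names are only illustrative.

```ruby
body = "service timestamps log datetime msec localtime show-timezone
service password-encryption
!
hostname switch01
!
boot-start-marker"

# Split into lines, strip the trailing newlines, and number them from 1.
matches = body.lines.map(&:chomp).each.with_index(1)
              .select { |line, _nr| line =~ /hostname/ }

# Each match is a [line, line_number] pair; take the first one.
line, line_nr = matches.first

puts line
puts line_nr
```

If several lines can match, iterate over `matches` instead of taking only the first pair.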
Let's say you have an object file that provides a #lines method:
lines = file.lines.each_with_index.select {|line, i| line =~ /regex/ }
If you already have a list of lines you can leave out the call to #lines. If you have a string you can use string.split("\n").
This will result in the variable lines containing an array of 2-element arrays with the line that matched your RegEx and the index of the line in the original file.
Breakdown
file.lines gets the lines - of course the other methods I mentioned might also apply here for you. We then add the index to each element with #each_with_index, because you want to store these as well. This has the same effect as #map.with_index {|e, i| [e, i]}, i.e. map every element to [element, index]. We then use the #select method to get all lines that do match your RegEx (FYI, =~ is the matching operator in Ruby, Perl and other languages - in case you didn't already know). We're done after that, but you might need to further transform the data so you can process it.
I am working on CSV generation. I am separating values with a comma (,). If the value in a field contains a comma, it should not split the field in Excel, so I want to put an escape character there. I am using FasterCSV. How can I add an escape character? What is the escape character of FasterCSV?
Just quote the fields (double quotes by default) and commas inside them are ignored:
CSV.generate(:col_sep=>',', :quote_char => '"') do |row|
row << ["Quid, quid", "latinum dictum"]
row << ["sit, altum", "viditur."]
end
=> "\"Quid, quid\",latinum dictum\n\"sit, altum\",viditur.\n"
If you have commas in your data, set a different column separator with the :col_sep option. If you like your commas and cannot live without them, put the data within quotation marks.
If you use the FasterCSV methods, this will be handled for you automatically!
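To illustrate the automatic handling, here is a small sketch using Ruby's standard csv library (FasterCSV became the standard CSV class in Ruby 1.9). The rows are the ones from the answer above; force_quotes is shown only as an option for when every field should be quoted.

```ruby
require 'csv'

rows = [["Quid, quid", "latinum dictum"], ["sit, altum", "viditur."]]

# Default behaviour: only fields that need quoting get quoted.
default_csv = CSV.generate { |csv| rows.each { |r| csv << r } }

# force_quotes: true wraps every field in the quote character.
quoted_csv = CSV.generate(force_quotes: true) { |csv| rows.each { |r| csv << r } }

# Parsing the generated text recovers the original fields, commas included.
round_trip = CSV.parse(default_csv)

puts default_csv
puts quoted_csv
puts round_trip.inspect
```

Because the library quotes on write and unquotes on read, no separate escape character is needed for embedded commas.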
A creative way is to replace the real comma with a look-alike character. This may be a bad idea; it all depends on your use case. It was OK for us, I think.
my_string.gsub(',','‚')
I'm not sure if this copy-pasted correctly, but you can create it on a Mac by holding ALT (option) + ,
I have the string "(1,2,3,4,5,6),(1,2,3)" and I would like to change it to "('1','2','3','4','5','6'),('1','2','3')", that is, replace all parts that match /([^,)("])/ with '$1', '$2', etc.
"(1,2,3,4,5,6),(1,2,3)".gsub(/([^,)("]\w*)/,"'\\1'")
gsub is a "global replace" method in String class. It finds all occurrences of given regular expression and replaces them with the string given as the second parameter (as opposed to sub which replaces first occurrence only). That string can contain references to groups marked with () in the regexp. First group is \1, second is \2, and so on.
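A short sketch of the difference between gsub and sub, and of group references. The first two calls use the string from the question; the "key=value" string is a made-up example to show a second group.

```ruby
input = "(1,2,3,4,5,6),(1,2,3)"

# gsub replaces every match; \1 refers to the first capture group.
all = input.gsub(/([^,)("]\w*)/, "'\\1'")

# sub replaces only the first match.
first_only = input.sub(/([^,)("]\w*)/, "'\\1'")

# With two groups, \1 and \2 can be reordered in the replacement.
swapped = "key=value".sub(/(\w+)=(\w+)/, '\2=\1')

puts all
puts first_only
puts swapped
```

In a double-quoted replacement string the backslash has to be doubled ("\\1"), while a single-quoted string can use '\1' directly.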
Try
mystring.gsub(/([\w.]+)/, '\'\1\'')
This will replace numbers (ints/floats) and words with their "quote-surrounded" selves while leaving punctuation (except the dot) alone.
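The handling of floats is where this pattern differs from one that stops at the dot. The sketch below runs both patterns on a made-up input containing a float; the input string is an assumption for illustration only.

```ruby
# Hypothetical input with a float, a word, and an integer.
input = "(1.5,abc,2)"

# [\w.]+ treats the dot as part of the token, so 1.5 stays together.
dotted = input.gsub(/([\w.]+)/, '\'\1\'')

# [^,)("]\w* stops at the dot and then starts a new token with ".5".
split_at_dot = input.gsub(/([^,)("]\w*)/, "'\\1'")

puts dotted
puts split_at_dot
```

For input that only ever contains integers, both patterns give the same result.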
UPDATED: I think you want to search for this
(([^,)("])+)
And replace it with this
'$1'
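This looks for one or more characters that are not a comma, a parenthesis, or a double quote, and captures them in group 1 because of the outer parentheses. The replacement wraps whatever was captured in single quotes. In Ruby, the replacement string refers to the group as \1 rather than $1.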