I am trying to test the contents of a file that is generated from code. The problem is that the full name of the file is based on a timestamp: abc123_#{d.strftime('%Y%m%d%I%M%S')}.log
How could I use File to find this file and read it? I tried doing File.exists?() with a regular expression as the parameter but that didn't work.
I found this in another question on stackoverflow:
File.basename(file_path).match(/_.*(css|scss|sass)/)
How would I be able to use that in my case, where the file is located in my public folder?
ANSWER
So the two answers below both work and I used a combination of them.
Dir['public/*.log'].select { |f| f =~ /purge_cc_website/}
The * is a glob wildcard, which does a job similar to a simple regular expression. After that you filter the resulting array using an actual regex.
Dir[] takes a file glob so, if your pattern isn't too complicated, you can just do:
Dir['public/abc123_*.log']
See the Dir.glob documentation for the full glob syntax.
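As a minimal sketch of how the glob fits into a test, the snippet below writes a file with a timestamped name into a temporary directory, finds it with Dir[], and reads it with File.read. The helper name find_logs and the temporary directory standing in for public/ are assumptions made for illustration.

```ruby
require 'tmpdir'

# Hypothetical helper: return every file in +dir+ whose name matches the
# glob pattern abc123_*.log, sorted by name.
def find_logs(dir)
  Dir[File.join(dir, 'abc123_*.log')].sort
end

Dir.mktmpdir do |dir|
  # Stand-in for the code under test, which writes a timestamped log file.
  name = "abc123_#{Time.now.strftime('%Y%m%d%I%M%S')}.log"
  File.write(File.join(dir, name), "hello\n")

  path = find_logs(dir).first
  puts File.basename(path) == name # true
  puts File.read(path)             # hello
end
```

The test never needs to know the timestamp; it only relies on the fixed prefix and extension.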
File is for reading one file. You need to use Dir to find files by name.
files = Dir['*'].select {|x| x =~ /_.*(css|scss|sass)/ }
If you just want the last file in the case of dups:
files = Dir['*'].select {|x| x =~ /_.*(css|scss|sass)/ }.sort.last
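One caveat with .sort.last: it picks the lexically greatest name, which is the newest file only when the names sort chronologically. For the timestamped log in the first question that is not guaranteed, because %I is the 12-hour clock, so a file written at 1 pm (01) sorts before one written at 11 am (11) on the same day. A sketch that picks by modification time instead, assuming the file system's mtime reflects when the file was written (newest_file is a hypothetical helper name):

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: newest file matching +pattern+, judged by mtime
# rather than by name. Returns nil when nothing matches.
def newest_file(pattern)
  Dir[pattern].max_by { |path| File.mtime(path) }
end

Dir.mktmpdir do |dir|
  older = File.join(dir, 'abc123_20240101110000.log') # 11 am
  newer = File.join(dir, 'abc123_20240101010000.log') # 1 pm with %I
  FileUtils.touch(older, mtime: Time.now - 60)
  FileUtils.touch(newer, mtime: Time.now)

  pattern = File.join(dir, 'abc123_*.log')
  puts Dir[pattern].sort.last == older # true: name order picks the older file
  puts newest_file(pattern) == newer   # true
end
```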
I'm writing an Apache Beam pipeline in python and trying to load multiple text files but encounter an error when using the pattern match. When I pass in an exact filename, the pipeline runs correctly.
For example:
files = p | 'Read' >> ReadFromText('lyrics.txt')
However, when using pattern match an error occurs:
files = p | 'Read' >> ReadFromText('lyrics*')
IOError: No files found based on the file pattern
In this example, I have several files that start with "lyrics".
I've tried many different pattern types but haven't had any success with anything except passing the complete file name. Is there a different way to apply pattern match in this case?
Updated with answer
If you're on Windows, don't forget to use a backslash instead of a forward slash when specifying directories. For example: ReadFromText('.\lyrics*')
This looks like a bug. I've filed https://issues.apache.org/jira/browse/BEAM-7560. In the meantime, try an absolute path or ReadFromText('./lyrics*').
I use flume-1.8.0.
The documentation says that I cannot use a pattern for the directory:
(Regular expression (and not file system patterns) can be used for filename only.)
But I need a directory pattern to collect logs from another system, which is controlled by another team.
Is there a way to set a directory path that matches /dir/201801/0101.log, /dir/201802/0001.log, ...?
Use a regular expression in the file group path, like this (see https://en.wikipedia.org/wiki/Regular_expression for more details on the syntax):
a1.sources.r1.filegroups.f2 = /path/to/files/with/pattern/databundle_cnt_[0-9]{4}-[0-9]{2}-[0-9]{2}.csv
In your case I would suggest:
a1.sources.r1.filegroups.f2 = /dir/[0-9]{6}/[0-9]{4}.log
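Before putting a pattern like this into the config, it can help to check the regular expression itself against sample paths. The sketch below does that in Ruby. It only exercises the regex and says nothing about whether a given Flume version accepts a pattern in the directory part, which the documentation quoted in the question suggests it may not. The constant name, the helper name, and the escaped dot are assumptions added here; in the unescaped form above, the dot matches any character.

```ruby
# Anchored version of the suggested pattern, with the dot escaped so that
# it only matches a literal ".".
LOG_PATH = %r{\A/dir/[0-9]{6}/[0-9]{4}\.log\z}

# Hypothetical helper: true when +path+ has the expected layout.
def log_path?(path)
  LOG_PATH.match?(path)
end

puts log_path?('/dir/201801/0101.log') # true
puts log_path?('/dir/201802/0001.log') # true
puts log_path?('/dir/2018/0101.log')   # false: directory needs six digits
puts log_path?('/dir/201801/0101xlog') # false: the dot must be literal
```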
For the import of my Sass files, I use sass-rails' (https://github.com/rails/sass-rails) glob feature. It says
Any valid ruby glob may be used
I want to exclude a directory and a file when using @import. Ruby code that uses blocks doesn't work in this scenario, and even excluding a single file doesn't work the way I want.
Consider this tree structure
/_bar.scss
/_foo.scss
/all.scss
For example, I want to exclude the file _foo.scss. I read at https://stackoverflow.com/a/27707682/228370 that you can negate a pattern using a !.
I tried the following:
Dir["{[!_foo]}*.scss"]
=> ["all.scss"]
But this skips _bar.scss. When looking into the glob reference of Ruby (http://ruby-doc.org/core-2.2.0/Dir.html#method-c-glob) it becomes clear why:
[set]
Matches any one character in set. Behaves exactly like character sets in Regexp, including set negation ([^a-z]).
(apparently, negation can be achieved with ! AND ^)
Because [!_foo] is a character set, it rejects any file whose name starts with _, f, or o, so every file with a leading underscore gets excluded.
But what would be the solution to exclude one specific file?
There's probably a regex way of doing it. But if you're talking about one specific file, it might be easier to just do:
Dir["*.scss"].reject { |i| i == '_foo.scss' }
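The same reject approach extends to excluding a directory as well as a file. A minimal sketch, assuming a hypothetical vendor/ directory next to the files from the question and a helper name, scss_files, chosen for illustration. Since this is plain Ruby with a block, it suits scripts that build a file list, not a Sass import line.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: all .scss files under +root+ (paths relative to it),
# minus any file named +skip_file+ and everything below +skip_dir+.
def scss_files(root, skip_file:, skip_dir:)
  Dir.chdir(root) do
    Dir['**/*.scss'].reject do |path|
      File.basename(path) == skip_file || path.start_with?("#{skip_dir}/")
    end.sort
  end
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'vendor'))
  %w[_bar.scss _foo.scss all.scss vendor/lib.scss].each do |name|
    FileUtils.touch(File.join(root, name))
  end

  p scss_files(root, skip_file: '_foo.scss', skip_dir: 'vendor')
  # => ["_bar.scss", "all.scss"]
end
```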
File.rename(blog_path + '/' + project_path, File.expand_path(topic_name, blog_path))
I use this code to rename a file in Ruby, but I think there is a better way to write it with less code, since it includes blog_path twice.
The code is OK, but I think there is no need for expand_path here; that method creates an absolute path from a relative one.
Also, it is better to use File.join to build a path instead of concatenating the parts with a slash, which keeps the code OS independent. So I would write your code like this:
File.rename(File.join(blog_path, project_path), File.join(blog_path, topic_name))
Or, if you want to get rid of the duplicated blog_path, change the working directory before doing the rename:
Dir.chdir(blog_path)
File.rename(project_path, topic_name)
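Note that a bare Dir.chdir changes the working directory for the rest of the process. A sketch of the block form, which restores the previous directory when the block returns; the helper name rename_in and the file names are placeholders:

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: rename +old_name+ to +new_name+ inside +dir+
# without repeating +dir+ in both paths.
def rename_in(dir, old_name, new_name)
  Dir.chdir(dir) { File.rename(old_name, new_name) }
end

Dir.mktmpdir do |blog_path|
  FileUtils.touch(File.join(blog_path, 'draft.md'))
  before = Dir.pwd

  rename_in(blog_path, 'draft.md', 'topic.md')

  puts File.exist?(File.join(blog_path, 'topic.md')) # true
  puts Dir.pwd == before                             # true
end
```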
You can find more on working with files and directories in Ruby in the article Ruby for Admins: Files and Directories.
I think (correct me if I am wrong) that it is better to put a / at the end of most URLs, like this: http://www.myweb/file/
And not to put a / at the end of file names: http://www.myweb/name.html
I have to correct that on a website with a lot of links. Is there a fast way to do it? For instance, in some programs like Dreamweaver I can use find and replace.
The second case is quite easy with Dreamweaver:
- Find: .html/"
- Replace: .html"
But how can I say something like:
- Find: all the links that end with a directory. Like http://www.myweb/file
- Replace: the same link but with a / at the end. Like http://www.myweb/file/
Your approach may work but it is based on the assumption that all files have a file extension.
There is a distinct difference between the URLs http://www.myweb/file and http://www.myweb/file/, because the latter could resolve to http://www.myweb/file/index.php or any other file in the default set configured in your web server. The former could also reference a perfectly valid file that has no file extension, such as a REST endpoint.
So you are correct insofar as you should explicitly add a "/" if you are referring to a directory, for example if you expect the web server to look up the correct index page or to produce a directory listing.
To replace the incorrect URLs, regular expressions are your friend.
To find all files which have an erroneous "/" you could use /\.(html|php|jpg|png)\//, adding as many different file extensions into that pipe-separated list as you like. You can then replace that with .$1 or .\1 depending on your tool.
An example of doing this with Perl would be:
perl -pi -e 's/\.(html|php|jpg|png)\//.\1/g' theFileYouWantToCheck.html
Or (if you're using a Linux-based system) you can automate that nicely with find:
find path/to/html/root -type f -name "*.html" | xargs perl -pi -e 's/\.(html|php|jpg|png)\//.\1/g'
which will find all html files in the directory and do an inline find and replace. Assuming you're using version control, it's then easy to see the changes it's applied :)
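For comparison, here is the same substitution as a small Ruby sketch. The method name is illustrative, and one detail differs from the Perl one-liner: this version only strips the slash when it is the last character before a closing quote, which is an added assumption meant to leave longer paths that merely contain ".html/" alone.

```ruby
# Matches ".ext/" for the listed extensions, but only when the slash is
# immediately followed by a closing quote.
TRAILING_SLASH = %r{\.(html|php|jpg|png)/(?=["'])}

# Hypothetical helper: remove the erroneous slash after a file name.
def strip_file_slashes(html)
  html.gsub(TRAILING_SLASH, '.\1')
end

puts strip_file_slashes('<a href="http://www.myweb/name.html/">x</a>')
# <a href="http://www.myweb/name.html">x</a>
puts strip_file_slashes('<a href="http://www.myweb/file/">x</a>')
# <a href="http://www.myweb/file/">x</a>
```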
Update
Solving the problem for adding a slash to directories isn't trivial. The approach I'd take:
- Write a script to recurse through your website structure locally, making a list of all files
- Parse the HTML files to extract all href=".*" and replace them with href=".*/" only if the end of the URL isn't present in the list extracted by the first script.
Any text-based find and replace is not going to be aware of whether the link is actually to a file or not.
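The two steps above could be sketched in Ruby roughly as follows. This is a rough illustration under several assumptions: the method names are hypothetical, links are absolute and start with a known base URL, hrefs use double quotes, and query strings and fragments are not handled. A link to something that is not a file on disk, such as a REST endpoint, would still get a slash, so the output needs review.

```ruby
require 'tmpdir'
require 'fileutils'
require 'set'

# Step 1: collect every file under +root+ as a site-relative path ("/a/b.html").
def site_files(root)
  Dir.chdir(root) do
    Dir['**/*'].select { |p| File.file?(p) }.map { |p| "/#{p}" }.to_set
  end
end

# Step 2: append "/" to each href under +base_url+ that is not a known file
# and does not already end in a slash.
def add_directory_slashes(html, base_url, files)
  html.gsub(/href="#{Regexp.escape(base_url)}([^"]*)"/) do
    path = Regexp.last_match(1)
    keep = path.end_with?('/') || files.include?(path)
    %(href="#{base_url}#{keep ? path : "#{path}/"}")
  end
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'file'))
  FileUtils.touch(File.join(root, 'name.html'))
  files = site_files(root)

  html = '<a href="http://www.myweb/file">a</a> <a href="http://www.myweb/name.html">b</a>'
  puts add_directory_slashes(html, 'http://www.myweb', files)
  # <a href="http://www.myweb/file/">a</a> <a href="http://www.myweb/name.html">b</a>
end
```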