Server side rendering for dynamic pages with PhantomJS on Ruby On Rails - ruby-on-rails

I have a web page that is 90% JavaScript. The whole site is rendered dynamically.
I want this content to be rendered by the server as well so that Google can crawl and index all of my content and links.
I know that in order not to get banned by google, the content of the dynamic page and the server rendered page must be almost identical.
I don't want to code two different pages (one from the client with Handlebars and one from the server with ERB in this case).
So I thought of PhantomJS. What I want is this: when I get the _escaped_fragment_ param from Google, I open the same page without that param in PhantomJS, render it to HTML there, and return that HTML from the server to Google. This way, I don't have to create two different pages for anything.
I know that I can use Handlebars for Server Side templating as well, but I'd have to code everything twice anyway.
Does anybody know how to accomplish this with PhantomJS? Is there any other way for not repeating the Logic and code Twice and have Google index the Site?
Thanks!!!

Yes you can.
Add the following to the <head> of your JavaScript-intensive page:
<meta name="fragment" content="!">
When the Google bot finds this tag, it will issue a new http GET request. This time, it will add ?_escaped_fragment_= to your URL.
So if your web page with Javascript is located at:
www.mysite.com/mypage
Google will issue a new GET using the following URL:
www.mysite.com/mypage?_escaped_fragment_=
In your Ruby GET handler, you simply call PhantomJS with the unescaped URL (just do a string replace). In your PhantomJS JavaScript code, wait for the page to render, then extract the HTML using regular JavaScript and return it to your Ruby GET handler, which simply responds to the GET with the HTML text string.
In this way you do not have to write your code twice. The solution is generic and will snapshot anything.
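The "string replace" step can be sketched roughly as below. This is only an illustration in JavaScript, not the Ruby handler itself; toPrettyUrl is a made-up name, and mapping a non-empty _escaped_fragment_ value back to a #! fragment is my reading of the AJAX crawling scheme rather than something stated above.

```javascript
// Hypothetical helper: turn the crawler's URL back into the URL a browser would load.
function toPrettyUrl(crawlerUrl) {
  const url = new URL(crawlerUrl);
  const fragment = url.searchParams.get('_escaped_fragment_');

  // No _escaped_fragment_ param: this is a normal request, leave it alone.
  if (fragment === null) return crawlerUrl;

  // Drop the param; any other query parameters are kept.
  url.searchParams.delete('_escaped_fragment_');

  // Assumption: a non-empty value stands for the part after "#!" in the original URL.
  if (fragment !== '') url.hash = '#!' + fragment;

  return url.toString();
}

console.log(toPrettyUrl('http://www.mysite.com/mypage?_escaped_fragment_='));
```

The returned URL is the one you would hand to PhantomJS for the snapshot.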

Related

Generate PDF from Vue template

I need to create downloadable PDFs of a page which is rendered using Vue. The HTML-to-PDF API we're using is DocRaptor.
API built using Rails
Client built using Vue
Two types of approaches are possible:
1. Passing in a URL to the page, which is then rendered to a PDF. Problems:
   - The page is behind our auth. Do I pass in the session token in the header?
   - The page calls our API, meaning the above wouldn't even matter. I assume the DocRaptor POST request will only fetch the raw HTML of the page, not run its JS.
2. Passing in the raw HTML in the DocRaptor POST request, with styling. Problems:
   - We don't use server-side rendering, so we don't have access to a nice pre-rendered HTML string.
   - Figuring out how to compile Vue to raw HTML.
Am I way off the mark here?
The two options above seem like the way to go. Would love for option 1 to work, but I don't see how - which leaves me with option 2, however no amount of googling has given me answers beyond server side rendering. Can I even do that for single pages? I assume the whole app gets rendered.
Option 1 could work, assuming you have some sort of authentication mechanism in place (for example a short lived token). DocRaptor does indeed execute your javascript, so it should work.
You can render to an invisible element on the client (or maybe even a visible one, so the user thinks it is a preview) and then use good old innerHTML:
let html = document.querySelector('#render-placeholder').innerHTML;
and then post it to server to forward to pdf renderer (to keep service access tokens secret).
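Put together, that could look something like the sketch below. The endpoint /api/pdfs, the JSON shape and both function names are hypothetical; the Rails side would forward the HTML to the PDF renderer with its own credentials.

```javascript
// Wrap a rendered fragment in a full HTML document so the PDF renderer
// receives complete markup along with the styles it needs.
function toStandaloneDocument(fragmentHtml, cssText = '') {
  return [
    '<!DOCTYPE html>',
    '<html>',
    `<head><meta charset="utf-8"><style>${cssText}</style></head>`,
    `<body>${fragmentHtml}</body>`,
    '</html>',
  ].join('\n');
}

// Browser-only: read the rendered markup and post it to your own server.
// Not called here; call it once Vue has finished rendering the element.
async function requestPdf(selector = '#render-placeholder', endpoint = '/api/pdfs') {
  const html = document.querySelector(selector).innerHTML;
  return fetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ html: toStandaloneDocument(html) }),
  });
}
```

Keeping the wrapping in a separate function makes it easy to check the generated document without a browser.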

PWA - Cache and load same appshell for different seo-friendly urls

We are a consumer internet company and have SEO-friendly URLs: www.xyz.com/user-1, www.xyz.com/user-2, www.xyz.com/user-n. Technically these are all user pages with different URLs, and we need to load only one HTML file (the app shell) for all of them.
What I want to achieve is that:
Go to www.xyz.com/user-1 page, cache the html (app-shell) file.
Navigate to www.xyz.com/user-2 page, get the html response from cache of www.xyz.com/user-1 ( since it is the same app-shell ).
I couldn't achieve this because, the 'match' method of cache api works on url request object, and I couldn't manipulate it. Is there a way where I can manipulate the url request object? Or is there a workaround for it?
You can create your own response if you want. However I think you are describing the classic SPA architecture. There you would have an app shell and fill in the main content area on the client-side, either by rendering the markup in the browser or appending pre-rendered markup from the server.
You might want to check out the sw templating strategy as a possible place to start -> https://jakearchibald.com/2014/offline-cookbook/#serviceworker-side-templating
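As a rough sketch of "create your own response": instead of matching on the incoming request, look the shell up under one fixed key for every page navigation. The path /app-shell.html is a hypothetical name, and the sketch assumes the shell was already put into a cache (for example during the service worker's install step).

```javascript
// Hypothetical cache key under which the one app shell is stored.
const APP_SHELL_URL = '/app-shell.html';

// Page navigations (/user-1, /user-2, ...) all map to the shell;
// every other request (scripts, images, API calls) is looked up as-is.
function cacheKeyFor(request) {
  return request.mode === 'navigate' ? APP_SHELL_URL : request;
}

// Only register the handler when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    event.respondWith(
      caches.match(cacheKeyFor(event.request))
        // Fall back to the network when nothing is cached for this key.
        .then((cached) => cached || fetch(event.request))
    );
  });
}
```

The client-side code then reads the real URL (user-1, user-2, ...) and fills in the content.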

How is this URL modification possible?

Could anyone please explain how the site http://www.outsharked.com/imagemapster/default.aspx?what.html works the way it does? It modifies the URL without loading or reloading the page. I don't think this is done with HTML5, because it works in IE6, which doesn't support HTML5.
I created that site. The commenter is correct, it uses Javascript to change the URL. There's nothing about how that navigation works that is different for IE6 - that browser supports the necessary client-side functionality to do this kind of thing. The basic functionality involves:
capturing click events on the nav, and loading the inner content via AJAX
update the URL to reflect a working direct URL to target.
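Those two steps could be sketched like this for a modern browser. This is not the site's actual code: the selectors, the function names and the idea of fetching the inner page by the name found in the query string are all assumptions for illustration.

```javascript
// Hypothetical: the inner page name is whatever follows "?" in the link,
// e.g. ".../default.aspx?faq.html" -> "faq.html".
function innerPageOf(href) {
  return new URL(href).search.slice(1);
}

// Browser-only; call once after the page has loaded.
function enableAjaxNav(navSelector = 'nav a', contentSelector = '#content') {
  document.addEventListener('click', async (event) => {
    const link = event.target.closest(navSelector);
    if (!link) return; // not a nav link, let the browser handle it

    event.preventDefault();

    // Step 1: load only the inner content via AJAX.
    const response = await fetch(innerPageOf(link.href));
    document.querySelector(contentSelector).innerHTML = await response.text();

    // Step 2: update the address bar to a URL that also works when opened directly.
    history.pushState(null, '', link.href);
  });
}
```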
The links are also valid anchor links that, in the absence of JavaScript, would go to the same page (but load the whole thing). This is your basic AJAX web site setup with one minor difference. It's common practice to use URLs like this in AJAX/single-page web sites:
http://mysite.com/home#somepage
or even just
http://mysite.com/#somepage
Where the hashtag part represents the actual page a user has navigated to. If someone accessed that url directly, e.g. from outside the site, the site would use Javascript to load the correct content based on the hashtag, after the page had loaded. This means that there might be a little delay for the inner content to reflect the correct page, since it has to run another request after the initial page has loaded from the browser to get the inner content via AJAX.
I was trying to avoid that by creating a setup that worked completely with and without Javascript. If you go directly to a URL within the site such as http://www.outsharked.com/imagemapster/default.aspx?faq.html you will notice it loads the content directly. This URL will work even if Javascript is disabled. You can't actually do this using hashtags, since hashtag content is not sent to the server. Only the client knows what's after the hashtag in a URL. That's why I was using query strings to represent inner pages.
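To make the hash versus query string point concrete, here is a small sketch (serverVisiblePath is a made-up name) of which part of a URL ends up in the request the server sees:

```javascript
// The browser sends the path and the query string to the server;
// the part after "#" stays on the client.
function serverVisiblePath(href) {
  const url = new URL(href);
  return url.pathname + url.search; // url.hash is deliberately left out
}

// Hash-based URL: the server cannot tell which inner page was requested.
console.log(serverVisiblePath('http://mysite.com/home#somepage'));

// Query-string URL: the inner page name reaches the server.
console.log(serverVisiblePath('http://www.outsharked.com/imagemapster/default.aspx?faq.html'));
```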
This site architecture was sort of an experiment at the time. It works pretty well but the code isn't fantastic, I didn't really do anything else with it, and I'm sure there are other better-fleshed-out/tested/full-featured frameworks out there to do much the same thing.
But it might not be a bad example of the nuts and bolts of creating a basic AJAX navigation setup, as a learning tool, since it's pretty concise, and also does HTML5 history navigation (e.g. so the back button works on modern browsers).

Umbraco - different behaviour for usual and ajax requests

I'm developing an Umbraco site that is a "single page" - no reload, only ajax calls.
The site will have nice urls and use html5 push state history.
The problem here is that every time a request is made to the server I need to handle it differently depending on the type of the request: normal or ajax.
For usual requests I need to display the content along with its master page.
For ajax requests I need to display only the content.
I don't know how to accomplish this - routing and master page magic.
Can anyone help?
You could use alternate templates. For more information see here. Basically, have the alternate template just render out the content in whatever format you want, without the full html template, and then make sure that all your AJAX requests call the pages using the alternate template.
One word of warning though, if you're doing all the site navigation with AJAX and no page reloads, then Google (or most other search engine spiders for that matter) won't be able to index your site properly (as they don't process javascript) and your site won't rank very well.
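On the client side, the alternate template idea could be sketched like this. As far as I know Umbraco accepts the alternate template as an altTemplate query-string parameter, but treat that as an assumption and check it against your version; the template alias ContentOnly and the function name are hypothetical.

```javascript
// Build the URL an AJAX call should request so that the server renders
// the page with the content-only alternate template.
function withAltTemplate(pageUrl, templateAlias, base) {
  // "base" is only needed for relative URLs (e.g. location.href in a browser).
  const url = new URL(pageUrl, base);
  url.searchParams.set('altTemplate', templateAlias);
  return url.toString();
}

console.log(withAltTemplate('http://example.com/about', 'ContentOnly'));
```

Normal page loads keep using the plain URL, so they still get the full master page.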

jQuery Mobile POST request with multiple pages response

From the documentation, http://jquerymobile.com/test/docs/pages/page-links.html
It's important to note that if you are linking from a mobile page that was loaded via Ajax to a page that contains multiple internal pages, you need to add a rel="external" or data-ajax="false" to the link. This tells the framework to do a full page reload to clear out the Ajax hash in the URL. This is critical because Ajax pages use the hash (#) to track the Ajax history, while multiple internal pages use the hash to indicate internal pages so there will be conflicts in the hash between these two modes.
Now, that seems to only apply to GET requests (i.e. via link elements). However, what is the guideline for POST requests? Right now, for the project I am currently working on, if I make a POST request with response that contains multiple N internal pages (lots of divs with data-role="page" and unique IDs), jQuery mobile only loads the first one it sees, and ignores the rest.
I am pulling my hair out, not sure how to work around this. Is there a way to force jQuery Mobile to do a full page reload?
Thanks a bunch in advance!
It turns out that data-ajax will work just fine inside the form tag. i.e
<form action="target.php" method="post" data-ajax="false">...</form>
Previously, I thought I had set data-ajax="false". However, it turns out that with the PHP framework I use (Yii), there is a big difference between "data-ajax" => false and "data-ajax" => "false" (the former assigns a boolean, the latter a string). Anyway, long story short, data-ajax is indeed the solution.
