I need to create downloadable PDFs of a page that is rendered using Vue. The HTML-to-PDF API we're using is DocRaptor.
API built using Rails
Client built using Vue
Two types of approaches are possible:
Option 1: passing in a URL to the page, which is then rendered to a PDF.
Problems:
The page is behind our auth. Do I pass in the session token in the header?
The page calls our API, so the above might not even matter... I assume DocRaptor will only fetch the raw HTML, not run the JS, when handling the POST request.
Option 2: passing in the raw HTML in the DocRaptor POST request, with styling.
Problems:
We don't use server-side rendering, so we don't have access to a nice pre-rendered HTML string.
Figuring out how to compile Vue to raw HTML.
Am I way off the mark here?
The two options above seem like the way to go. Would love for option 1 to work, but I don't see how - which leaves me with option 2, however no amount of googling has given me answers beyond server side rendering. Can I even do that for single pages? I assume the whole app gets rendered.
Option 1 could work, assuming you have some sort of authentication mechanism in place (for example, a short-lived token). DocRaptor does indeed execute your JavaScript, so it should work.
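A rough sketch of what option 1 might look like. The `print_token` query parameter, the example page URL and the helper names are all hypothetical; your Rails API would have to issue the token and accept it in place of the session. The `user_credentials`, `document_type`, `document_url`, `javascript` and `test` fields follow DocRaptor's JSON API as I understand it, so check the current DocRaptor docs before relying on them. It is written in JavaScript only to match the snippet below; in practice the request would be built and sent from the Rails side.

```javascript
// Build the URL DocRaptor should fetch, carrying a short-lived token.
// `print_token` is a hypothetical parameter name, not a DocRaptor feature.
function buildPrintUrl(pageUrl, token) {
  const url = new URL(pageUrl);
  url.searchParams.set('print_token', token);
  return url.toString();
}

// Body for the POST to DocRaptor. Send this from the server, not the
// browser, so the DocRaptor API key stays secret.
function buildUrlDocRequest(apiKey, pageUrl, token) {
  return {
    user_credentials: apiKey,
    doc: {
      document_type: 'pdf',
      document_url: buildPrintUrl(pageUrl, token),
      javascript: true, // ask DocRaptor to run the page's JS before rendering
      test: true        // test mode while developing
    }
  };
}

// Example values only; none of these come from the question.
const request = buildUrlDocRequest(
  'YOUR_API_KEY',
  'https://app.example.com/reports/42',
  'short-lived-token'
);
console.log(JSON.stringify(request, null, 2));
```

The page itself would then pass the token along on its API calls, so the data it fetches while DocRaptor renders it is authorized too.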
You can render to an invisible element on the client (or maybe even a visible one, and let the user think it is a preview) and then use good old innerHTML:
let html = document.querySelector('#render-placeholder').innerHTML;
and then post it to your server, which forwards it to the PDF renderer (to keep the service access tokens secret).
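Put together, that could look roughly like the sketch below. The `/api/pdfs` endpoint, the function names and the sample CSS are assumptions for illustration, not anything from your app. The markup is wrapped in a full HTML document so the styles travel with it.

```javascript
// Wrap the rendered fragment in a complete HTML document so the PDF
// renderer receives the styles as well.
function buildPdfHtml(fragmentHtml, cssText) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head><meta charset="utf-8"><style>' + cssText + '</style></head>',
    '<body>' + fragmentHtml + '</body>',
    '</html>'
  ].join('\n');
}

// Browser-only part: read the rendered markup and post it to our own API.
// '/api/pdfs' is a hypothetical Rails endpoint that forwards the HTML to
// the PDF service using the secret key.
function requestPdf() {
  const fragment = document.querySelector('#render-placeholder').innerHTML;
  const html = buildPdfHtml(fragment, 'body { font-family: sans-serif; }');
  return fetch('/api/pdfs', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ html: html })
  });
}

// The pure part runs anywhere:
console.log(buildPdfHtml('<h1>Invoice</h1>', 'h1 { color: #333; }'));
```

In a Vue component you would call `requestPdf()` only once the DOM has been updated, for example inside `this.$nextTick(...)`, otherwise innerHTML may still hold the previous render.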
We are a consumer internet company and have SEO-friendly URLs: www.xyz.com/user-1, www.xyz.com/user-2, www.xyz.com/user-n. Technically these are all user pages with different URLs, and we need to load only one HTML file (the app shell) for all of them.
What I want to achieve is that:
Go to www.xyz.com/user-1 page, cache the html (app-shell) file.
Navigate to www.xyz.com/user-2 page, get the html response from the cache of www.xyz.com/user-1 (since it is the same app-shell).
I couldn't achieve this because the 'match' method of the Cache API works on the URL of the request object, and I couldn't manipulate it. Is there a way to manipulate the request URL, or is there a workaround?
You can create your own response if you want. However, I think you are describing the classic SPA architecture. There you would have an app shell and fill in the main content area on the client side, either by rendering the markup in the browser or by appending pre-rendered markup from the server.
You might want to check out the sw templating strategy as a possible place to start -> https://jakearchibald.com/2014/offline-cookbook/#serviceworker-side-templating
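As a starting point, here is a minimal sketch of a service worker that answers every user-page navigation with one cached shell. The cache name, the `/app-shell.html` URL and the `isUserPage` helper are assumptions; only the /user-... URL pattern comes from the question. The trick is to look the shell up by its own URL rather than by the URL that was requested, so nothing on the request object has to be changed.

```javascript
// Cache name and shell URL are assumptions for this sketch.
const SHELL_CACHE = 'app-shell-v1';
const SHELL_URL = '/app-shell.html';

// Decide whether a requested URL is one of the user pages that should be
// served from the single cached shell. Adjust the pattern to the real routes.
function isUserPage(requestUrl) {
  const path = new URL(requestUrl).pathname;
  return /^\/user-[^/]+\/?$/.test(path);
}

// Only wire up the handlers when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('install', (event) => {
    // Cache the shell once, under its own URL.
    event.waitUntil(
      caches.open(SHELL_CACHE).then((cache) => cache.add(SHELL_URL))
    );
  });

  self.addEventListener('fetch', (event) => {
    if (event.request.mode === 'navigate' && isUserPage(event.request.url)) {
      // Match on the shell URL instead of the requested URL, and fall
      // back to the network if the shell is not cached yet.
      event.respondWith(
        caches.match(SHELL_URL).then((cached) => cached || fetch(event.request))
      );
    }
  });
}

console.log(isUserPage('https://www.xyz.com/user-1')); // true
console.log(isUserPage('https://www.xyz.com/about'));  // false
```

The shell then reads `location.pathname` on the client to decide which user's content to load.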
I'm trying to send some data to a form on a site where I'm a member using cURL, but when I look at the headers being sent, they seem to have been encrypted.
Is there a way I can get around this by making the computer/server visit the site, actually add the data to the inputs on the form and then hit submit, so that it generates the correct data and posts the form?
You have got a few options:
reverse engineer the JavaScript that does the encryption (or possibly just encoding) process
get a browser engine (e.g. the Gecko engine), and add some scripting to it to fill in the forms and push the submit button - of course you would need JavaScript support within the page itself
parse the HTML using an HTML parser, feed the JavaScript in it to a JavaScript runtime with the correct libraries, fill in the "form" and hit the submit button
It's probably easiest to go for the first option. The JavaScript must be in the open to be able to be executed in the browser. But it may take some time to reverse-engineer as it is likely obfuscated.
You can use a framework to automate user interaction on the web pages, like Selenium.
This would enable you to not bother reverse engineering anything.
Selenium has bindings in various languages, including Python and Java.
Provided the JavaScript is visible on the website in question, you should be able to simply copy and paste their encryption routines to prepare the headers exactly as they do.
A hacky fix, if you can isolate the function that encodes the data you type in the form, is to use something like PyV8 to execute the JS inside Python.
Use AutoHotKeyIt and have it actually use the browser normally. It can read from files and repeat tasks indefinitely. You can also set a flag so that it only acts within that application, which means you can have it minimized and still perform the action.
You seem to be having issues with them encrypting the headers and such, so why not simply use that to your advantage? You're still pushing the same data in, but now you're working around their system, with little to no side effect to you.
Could anyone please explain how the site http://www.outsharked.com/imagemapster/default.aspx?what.html works the way it does? It modifies the URL without loading/reloading the page. I don't think this is done with HTML5, because it works in IE6, which doesn't support HTML5.
I created that site. The commenter is correct, it uses Javascript to change the URL. There's nothing about how that navigation works that is different for IE6 - that browser supports the necessary client-side functionality to do this kind of thing. The basic functionality involves:
capturing click events on the nav, and loading the inner content via AJAX
updating the URL to reflect a working direct URL to the target.
The links are also valid anchor links that, in the absence of Javascript, would go to the same page (but load the whole thing). This is your basic AJAX web site setup with one minor difference. It's common practice to use URLs like this in AJAX/single-page web sites:
http://mysite.com/home#somepage
or even just
http://mysite.com/#somepage
Where the hashtag part represents the actual page a user has navigated to. If someone accessed that url directly, e.g. from outside the site, the site would use Javascript to load the correct content based on the hashtag, after the page had loaded. This means that there might be a little delay for the inner content to reflect the correct page, since it has to run another request after the initial page has loaded from the browser to get the inner content via AJAX.
I was trying to avoid that by creating a setup that worked completely with and without Javascript. If you go directly to a URL within the site such as http://www.outsharked.com/imagemapster/default.aspx?faq.html you will notice it loads the content directly. This URL will work even if Javascript is disabled. You can't actually do this using hashtags, since hashtag content is not sent to the server. Only the client knows what's after the hashtag in a URL. That's why I was using query strings to represent inner pages.
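To make the hashtag-versus-query-string point concrete, here is a small illustration using the standard URL class. The helper names are made up for this example and this is not the site's actual code. It shows which part of each URL reaches the server.

```javascript
// What the browser puts on the request line: path plus query string.
// The fragment (everything from '#') never leaves the client.
function pathSentToServer(href) {
  const url = new URL(href);
  return url.pathname + url.search;
}

// Inner page name for the query-string scheme described above,
// e.g. default.aspx?faq.html -> "faq.html".
function innerPageFromQuery(href) {
  return new URL(href).search.slice(1);
}

console.log(pathSentToServer('http://mysite.com/home#somepage'));
// /home
console.log(pathSentToServer('http://www.outsharked.com/imagemapster/default.aspx?faq.html'));
// /imagemapster/default.aspx?faq.html
console.log(innerPageFromQuery('http://www.outsharked.com/imagemapster/default.aspx?faq.html'));
// faq.html
```

With the hash form the server only ever sees /home, so it cannot pick the inner page. With the query form it can render the right content on the first request.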
This site architecture was sort of an experiment at the time. It works pretty well but the code isn't fantastic, I didn't really do anything else with it, and I'm sure there are other better-fleshed-out/tested/full-featured frameworks out there to do much the same thing.
But it might not be a bad example of the nuts and bolts of creating a basic AJAX navigation setup, as a learning tool, since it's pretty concise, and also does HTML5 history navigation (e.g. so the back button works on modern browsers).
I have a web page that is 90% JavaScript. The whole website is rendered dynamically.
I want this content to be rendered by the server as well so that Google can crawl and index all of my content and links.
I know that in order not to get banned by Google, the content of the dynamic page and the server-rendered page must be almost identical.
I don't want to code two different pages (one from the client with Handlebars and one from the server with ERB in this case).
So I thought of PhantomJS. What I want is this: when I get the _escaped_fragment_ param from Google, I open the page without that param in PhantomJS, render it to HTML there, and return that HTML from the server to Google. This way, I don't have to create two different pages for anything.
I know that I can use Handlebars for Server Side templating as well, but I'd have to code everything twice anyway.
Does anybody know how to accomplish this with PhantomJS? Is there any other way to avoid repeating the logic and code twice and still have Google index the site?
Thanks!!!
Yes you can.
Add the following to the <head> of your JavaScript-intensive page:
<meta name="fragment" content="!">
When the Google bot finds this tag, it will issue a new http GET request. This time, it will add ?_escaped_fragment_= to your URL.
So if your web page with Javascript is located at:
www.mysite.com/mypage
Google will issue a new GET using the following URL:
www.mysite.com/mypage?_escaped_fragment_=
In your Ruby GET handler, you simply call PhantomJS with the unescaped URL (just do a string replace). In your PhantomJS JavaScript code, wait for the page to render, then extract the HTML using regular JavaScript and return it to your Ruby GET handler, which simply responds to the GET with the HTML text string.
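The "unescape" step can be sketched as follows. It is shown in JavaScript for illustration, and the `originalUrl` name is made up; in your Ruby handler a plain string replace does the same job for the empty-value case. The non-empty case, mapping the value back to a "#!" fragment, follows the AJAX crawling scheme as I understand it.

```javascript
// Turn the crawler's "_escaped_fragment_" URL back into the URL a browser
// would load. An empty value means "the same page"; a non-empty value maps
// back to a "#!" fragment.
function originalUrl(crawlerUrl) {
  const url = new URL(crawlerUrl);
  const fragment = url.searchParams.get('_escaped_fragment_');
  if (fragment === null) {
    return crawlerUrl; // not a crawler snapshot request, leave untouched
  }
  url.searchParams.delete('_escaped_fragment_');
  if (fragment !== '') {
    url.hash = '!' + fragment;
  }
  return url.toString();
}

console.log(originalUrl('http://www.mysite.com/mypage?_escaped_fragment_='));
// http://www.mysite.com/mypage
```

The resulting URL is what you hand to the PhantomJS script, which opens it, waits for rendering to finish and returns the page's HTML (PhantomJS exposes it as `page.content`).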
In this way you do not have to write your code twice. The solution is generic and will snapshot anything.
I understand what progressive enhancement is, I'm just fuzzy on some of the details in actually pulling it off. Of course, that could be because I'm looking at it in the wrong way. Let me try to explain my difficulty with a hypothetical:
ASP.NET MVC site. I have a view that has tabbed navigation. Each tab is for a movie category/genre and displays 5-10 links to movies in that category. The movie data is obtained through Netflix's OData.
My initial thought is to use Ajax to pull and parse the JSON from the proper OData GET requests when each tab is clicked. How would I provide a non-JavaScript version of that? Is it even possible?
For simpler requests where JSON isn't necessary - like, say, having a user log into the system - I see how I could simply set a cookie and dynamically change the page based on it to reflect the change. But what if I need to return and parse JSON? How do I provide an alternative?
The deal with progressive enhancement is that your server side must be fully capable of generating every last bit of HTML that appears in all of your pages. This is obvious, since otherwise (if JS is turned off) there will be no part of your application capable of doing said rendering.
Since the server side must know how to render everything, it doesn't make much sense to generate things (DOM elements/HTML) on the client side from JSON responses the server gives you. Why repeat yourself?
This brings us to the logical conclusion that when doing dynamic updates on the client, you need to get ready-made HTML from the server (since the rendering logic is over there) and insert it into the DOM as appropriate. You are then free to work on the newly inserted elements with jQuery and enhance them all you want.
So -- forget about parsing JSON on the client, otherwise you're locking yourself out of progressive enhancement. If you want to call a third party, have the server be your intermediary: call the server with all the necessary information for it to call the third party and get ready-made HTML back.
If you do this, then the server can of course provide non-JS versions of everything on your site with no problem. Total non-reliance on JS achieved.
Without JS in the browser there is nothing to make AJAX calls or to parse JSON (JavaScript Object Notation) on the client. Your pages will render as is, just like old-school sites.
If you need to do this progressively, you will have to call the OData service server-side, provide .NET objects to the site in ViewData or your view model, and have your views/partials render them.
In ASP.NET MVC actions, the request available via the controller's HttpContext has an extension method, this.HttpContext.Request.IsAjaxRequest(), which can be used to decide whether to return a full view, just JSON data, or whatever other type of ActionResult you want. This can be an excellent timesaver for building progressive-enhancement-style sites.
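The idea behind that check, sketched in JavaScript rather than C#. This is an illustration of the pattern, not ASP.NET code, and the function names and movie data are made up. IsAjaxRequest() looks for the X-Requested-With: XMLHttpRequest header that jQuery and similar libraries add to AJAX calls. One rendering function then serves both the full page and the fragment.

```javascript
// Escape text before putting it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// One rendering function, shared by the full page and the AJAX fragment.
function renderMovieList(movies) {
  const items = movies
    .map((m) => '<li><a href="' + escapeHtml(m.url) + '">' + escapeHtml(m.title) + '</a></li>')
    .join('');
  return '<ul class="movies">' + items + '</ul>';
}

// Same idea as IsAjaxRequest(): check the X-Requested-With header.
// Header names are assumed to be lower-cased already.
function isAjaxRequest(headers) {
  return headers['x-requested-with'] === 'XMLHttpRequest';
}

// Fragment for AJAX callers, complete page for everyone else.
function respond(headers, movies) {
  const list = renderMovieList(movies);
  if (isAjaxRequest(headers)) {
    return list;
  }
  return '<!DOCTYPE html><html><body><nav>tabs go here</nav>' + list + '</body></html>';
}

// Hypothetical data, standing in for what the server got from OData.
const movies = [{ title: 'Alien', url: '/movies/alien' }];
console.log(respond({ 'x-requested-with': 'XMLHttpRequest' }, movies));
console.log(respond({}, movies));
```

Each tab is then a normal link to a full page. With JS enabled, a click handler requests the same URL via AJAX and inserts the returned fragment; with JS disabled, the browser simply follows the link.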