Rails Cloudinary Widget implementation trouble - ruby-on-rails

I am in the process of moving from Dropzone's widget to Cloudinary's widget and running into heaps of trouble.
First off Dropzone is currently working beautifully with uploads to cloudinary. I am moving to their proprietary widget for a bunch of reasons that would just distract this post.
The issue I am having is "simple" at first glance. Images are uploading correctly to Cloudinary. It is on the subsequent form post that I am having issues.
Dropzone automagically creates necessary hidden inputs and values...Cloudinary you have to roll your own. So I have done that and not only is it not working the input values are very different to what dropzone generates for the very same image. I cannot find the logic in dropzone.js that can explain how the inputs are created.
For example, here is what dropzone renders for one image:
<input type="hidden" name="entity[job_entries_attributes][0][images][]" value="eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBNkpLQWc9PSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--7d13c16894d2a146f1ac85e12ddea03d9c14c26e">
When I hand roll I have access to the object returned from direct upload to Cloudinary - public_id, asset_id, etc. But none of them resemble the value above. I am assuming the post and subsequent image rendering is failing because of this.
Anyone have experience with this??? Driving me nuts...

The Upload Widget is an interactive UI for uploading images to your Cloudinary Media Library account and its usage can be extended depending on the use case requirements. Although, the widget does not have many features that Dropzone has, but you can use the different events to include additional custom processes using javascript or HTML (see sample demos here and here).
<div id="widgetdiv"></div>
<script type="text/javascript">
cloudinary.openUploadWidget({
cloudName: 'cloudname',
inlineContainer: '#widgetdiv',
uploadPreset: 'uploadPreset',
showPoweredBy: false,
sources: ['local','instagram']},
(error, result) => {
if (!error && result && result.event === "success") {
console.log('Done! Here is the image info: ', result.info);
console.log('Custom implementations here...');
var filename = result.info.secure_url.split('/').pop().split('#')[0].split('?')[0];
var filesize = result.info.bytes;
// Populate form, card container, post to another endpoint, etc
}
}
);
</script>

Related

Twitter Card Images not working on Gatsby app

I'm working on a Gatsby app with Netlify CMS (and hosted on Netlify). Trying to get the metadata working so that Twitter cards display correctly with images.
The metadata is generally all right, but the images aren't showing on the Twitter validator or if I try to post to Twitter. The problem is clearly the images themselves, which are hosted on the site using Gatsby and Gatsby Image Sharp to render.
In fact, the validator seems to show no fundamental issues. Simply, the image doesn't show up:
Example relevant metadata:
<meta name="twitter:url" content="https://example.com/" data-react-helmet="true">
<meta name="twitter:image" content="https://example.com/static/12345/c5b20/blah.jpg" data-react-helmet="true">
<meta data-react-helmet="true" name="twitter:title" content="Site title">
<meta data-react-helmet="true" name="twitter:card" content="summary_large_image">
I know the images the issue, because if I replace my image URL (which is the full image URL) with an external URL, it works fine, showing the full card with image.
Any idea what could be causing this? I'm sizing the image down so it loads quickly, and it seems to load just fine directly (eg). (I mean, is there something weird/off about that image?)
NOTE: In a previous version of this question, I referenced Cloudinary and Uploadcare, but have since removed those two in a branch to simplify the problem. (They seem to have been unecessary holdovers from the starter app I used.) You can now see an example page for that branch here and the associated image in the twitter:image tag here. I feed this pre-processed/shrunk image into the header using React Helmet (and Gatsby React Helmet) and using the following code in my GraphQL call to get the image associated with the blogpost in that particular, smaller format:
featuredimage {
childImageSharp {
fixed(width: 480, quality: 75) {
src
}
}
Second Note/thought: Should I be worried about the fact that the pages in production seem to be re-rendering on every reload? Isn't SSR supposed to ensure that doesn't happen? I tested this by including a call to Math.random(), hidden, in the page. You can see the result by running document.getElementsByClassName('document')[0].children[0].innerText, and note that it produces a different number on each page reload. This implies to me that the whole page is being re-rendered by the client. Isn't that wrong? Why would that be happening? Might that relate to some sort of client processing of the images on each request, which might be screwing up the Twitter cards?
Third update: I put together a simpler reproduction here. It's based off of this starter template, with Uploadcare/Cloudinary removed and Twitter card metadata added to the header. Other than that, and removing unnecessary pages, I didn't make any other changes. I used this starter for a repro rather than a vanilla starter app, because I'm unsure whether the issue is caused by the interaction of Netlify CMS and the Gatsby Sharp Image plugin. I might try to put together a second reproduction. For now, the code for this repo is here, and the pages that should show Twitter cards are the blog posts, such as this one.
ACTUALLY, it seems that a super basic reproduction, with Gatsby 3 and no Netlify CMS or anything, has the same issue. Here's the minimal reproduction, with the image taken from src/images using an allImageSharp query and inserted into the metadata for each page. Code here.
FINAL UPDATE
Based on Derek's answer below, I removed the #reach/router stuff, and got the site URL from Netlify build env variables. It appeared that #reach/router only gave this information when JS was running, which excluded the Twitterbot, resulting in an undefined base URL, which broke the Twitter image. Including the URL from Netlify (using process.env.URL in the Gatsby config and pulling that in through a siteMetadata query) fixed the problem!
Update:
I think I might have found the issue. When opening the minimal production with script disabled, the url for twitter:image is invalid:
<meta data-react-helmet="true" name="twitter:image" content="undefined/static/03475800ca60d2a62669c6ad87f5fda0/58026/energy.jpg">
So for some reasons, during build, the hostname is missing, but when JS kicks in, it appears (Might have something to do with the way you get the hostname). Twitter crawlers probably does not have JS enabled & couldn't fetch the image.
Make sure your opengraph images are absolute urls with https:// or http:// protocols. I checked your example link & saw that it was a relative link (/static/etc.)
For Twitter, it seems to demand social cards to be 2:1
Images for this Card support an aspect ratio of 2:1 with minimum dimensions of 300x157 or maximum of 4096x4096 pixels.
https://developer.twitter.com/en/docs/twitter-for-websites/cards/overview/summary-card-with-large-image
If you're using the latest Gatsby image plugin, you can use aspectRatio to crop the image.
Also note that you can skip the twitter:image tag, if your og:image has already satisfied Twitter's card requirement.
SSR does not mean to never run JS in the client, React will render your page on the client side regardless of SSR.
This was solved here: https://github.com/gatsbyjs/gatsby/discussions/32100.
"location and thus origin is not available during gatsby build and thus the generated HTML has undefined there."
I got it working by changing the way I create the image URL inside seo.js from this:
let origin = "";
if (typeof window !== "undefined") {
origin = window.location.origin;
}
const image = origin + imageSrc;
to this:
const imageSrc = thumbnail && thumbnail.childImageSharp.fixed.src;
const image = site.siteMetadata?.siteUrl + imageSrc;
You need to use siteUrl from siteMetadata.
Below is my pageQuery from inside blog-post.js:
export const pageQuery = graphql`
query BlogPostBySlug(
$id: String!
$previousPostId: String
$nextPostId: String
) {
site {
siteMetadata {
title
siteUrl
}
}
markdownRemark(id: { eq: $id }) {
id
excerpt(pruneLength: 160)
html
frontmatter {
title
date(formatString: "MMMM DD, YYYY")
description
thumbnail {
childImageSharp {
fixed(width: 1200) {
...GatsbyImageSharpFixed
}
}
}
}
}
}
`

Twilio video: How to render mobile layout on desktop

I am using Twilio Video React application, for my video application.
Twilio video renders video in two views, desktop and mobile, based on the device.
Due to space constraints on my desktop application I would like to render the video similar to that of a mobile on desktop, Is this possible? Is there a variable that I could set to allow me to do this ? Basically, I would like Twilio video to think that I am running the app on the mobile.
I tried to set isMobile to true in utils (as shown below), this doesn't seem to make a difference to the UI.
export const isMobile = (() => {
if (
typeof navigator === "undefined" ||
typeof navigator.userAgent !== "string"
) {
return true;
}
return /Mobile/.test(navigator.userAgent);
})();
I would like to achieve the below:
Twilio developer evangelist here.
I've not worked on this application myself, so I'm not familiar with how it is styled. There is not a variable for setting the style on mobile though, it is mostly controlled by CSS media query break points.
What you will notice among the code is that the CSS is embedded within the JavaScript. You will also find lines like:
[theme.breakpoints.down('xs')]: {
// styles
}
That breakpoint defines how a number of the styles are supposed to work at the small screen size. So if you remove the breakpoint and use the styles inside the breakpoint as the default styles, then the application will lay out in the mobile version.
Once you've done that, you can then place the video parts of the application within a div with a width you define and place the rest of your application around it.
Let me know if that helps at all.

How to scrape images from eBay and Amazon using XPath in Nokogiri from JSON

I'm trying to scrape images from websites using Nokogiri and XPath, so far with limited success. For a typical website whose HTML has img and src, I can use:
tmp2 = Nokogiri::HTML(open(site_url))
tmp2.xpath("//img/#src").each do |src|
...do whatever
end
However, some sites like Amazon and eBay only trigger certain images with JavaScript. If I look at the code I can see the data in arrays. For example, from Amazon:
<script type="text/javascript">
P.when('jQuery', 'cf').execute(function($, cf){
P.load.js('http://z-ecx.images-amazon.com/images/G/01/browser-scripts/imageBlock-udp-airy/imageBlock-udp-airy-4060168860._V1_.js');
});
P.when('A', 'jQuery', 'ImageBlockATF', 'cf').register('ImageBlockBTF', function(A, $, imageBlockATF, cf){
var data = {"indexToColor":[],"burjImageBlock":0,"isSwatchHoverConsistent":1,"heroFocalPoint":null,"visualDimensions":["color_name"],"productGroupID":"apparel_display_on_website","newVideoMissing":0,"useIV":0,"useClickZoom":null,"useChildVideos":0,"numColors":7,"logMetrics":0,"defaultColor":"initial","airyConfig":{"enableContinuousPlay":null,"installFlashButtonText":"Install Flash Player","contentTitle":null,"autoplayCutOffTimeSeconds":null,"ageGate":{"monthNames":["January","February","March","April","May","June","July","August","September","October","November","December"],"deniedPrompt":"We're sorry. You are not old enough to watch this video.","submitText":"Submit","prompt":"This video is not intended for all audiences. What date were you born?"},"videoAds":null,"videoUnsupportedPrompt":"Sorry, this video is unsupported on this browser.","desiredMode":null,"swfUrl":"http://g-ecx.images-amazon.com/images/G/01/vap/video/airy2/prod/2.0.1102.0/flash/AiryBasicRenderer._V304902271_.swf","isAutoplayEnabled":null,"installFlashPrompt":"Adobe Flash Player is required to watch this video.","isLiveStream":null,"regionCode":"NA","contentId":null,"playbackErrorPrompt":"Sorry, an error has occurred while attempting video playback. Please try again later.","contentMinAge":null,"isForesterTrackingDisabled":null,"streamingUrls":null,"parentId":null,"foresterMetadataParams":{"client":"Dpx","requestId":"1MX7VHFRVAS6TWY64BXC","marketplaceId":"ATVPDKIKX0DER","session":"182-9511970-7757812","method":"Apparel.ImageBlock"},"jsUrl":"http://z-ecx.images-amazon.com/images/G/01/vap/video/airy2/prod/2.0.1102.0/js/airy.chromeless._V304902265_.js"},"mainImageMaxSizes":null,"staticStrings":{"playVideo":"Click to play video","rollOverToZoom":"Roll over image to zoom in","images":"Images","video":"video","clickToZoom":"Click on image to zoom in","touchToZoom":"Touch the image to zoom in","videos":"Videos","close":"Close","pleaseSelect":"Please select","clickToExpand":"Click to open expanded view","allMedia":"All Media"},"notThumbnailClickImmersiveView":1,"gIsNewTwister":1,"title":"Threads 4 Thought Women's Tabitha Basic Tank Top","ivRepresentativeAsin":{"6":"B00T46V76W","4":"B00WM3O7ES","1":"B00T46YZES","3":"B00WM3NLPE","2":"B00T46VD16","5":"B00T46VGXQ"},"mainImageSizes":[[342,445],[385,500],[425,550],[466,606],[522,679]],"isQuickview":0,"ipadVideoSizes":[[340,444],[384,500]],"colorToAsin":{"Coral Dreams":{"asin":"B00T46V76W"},"Heather Grey":{"asin":"B00WM3NLPE"},"Black":{"asin":"B00T46YZES"},"White":{"asin":"B00T46VGXQ"},"Deep Blue Sea":{"asin":"B00T46VD16"},"Sea Glass":{"asin":"B00WM3O7ES"}},"thumbExperimentEnabledValue":1,"showLITBOnClick":0,"videoSizes":[[342,445],[384,500]],"stretchyGoodnessWidth":[1280,1440,1640,1800],"autoplayVideo":0,"hoverZoomIndicator":"","sitbReftag":"","useHoverZoom":1,"staticImages":{"zoomOut":"http://g-ecx.images-amazon.com/images/G/01/detail-page/cursors/zoom-out._V184888738_.bmp","hoverZoomIcon":"http://g-ecx.images-amazon.com/images/G/01/img11/apparel/UX/DP/icon_zoom._V138923886_.png","zoomIn":"http://g-ecx.images-amazon.com/images/G/01/detail-page/cursors/zoom-in._V184888790_.bmp","zoomLensBackground":"http://g-ecx.images-amazon.com/images/G/01/apparel/rcxgs/tile._V211431200_.gif","videoThumbIcon":"http://g-ecx.images-amazon.com/images/G/01/Quarterdeck/en_US/images/video._V183716339_SX38_SY50_CR,0,0,38,50_.gif","spinner":"http://g-ecx.images-amazon.com/images/G/01/ui/loadIndicators/loading-large_labeled._V192238949_.gif","zoomInCur":"http://g-ecx.images-amazon.com/images/G/01/detail-page/cursors/zoomIn._V323082799_.cur","videoSWFPath":"http://g-ecx.images-amazon.com/images/G/01/Quarterdeck/en_US/video/20110518115040892/Video._V178668404_.swf","arrow":"http://g-ecx.images-amazon.com/images/G/01/javascripts/lib/popover/images/light/sprite-vertical-popover-arrow._V186877868_.png","zoomOutCur":"http://g-ecx.images-amazon.com/images/G/01/detail-page/cursors/zoomOut._V323082798_.cur"},"videos":[],"gPreferChildVideos":0,"altsOnLeft":1,"ivImageSetKeys":{"Coral Dreams":"6","Heather Grey":"3","Black":"1","initial":0,"White":"5","Deep Blue Sea":"2","Sea Glass":"4"},"useHoverZoomIpad":"","isUDP":1,"alwaysIncludeVideo":0,"widths":[1280,1440,1640,1800],"maxAlts":7,"useChromelessVideoPlayer":1,"mainImageHeightPartitions":null};
data["customerImages"] = eval('[]');
data["colorImages"] = {"Coral Dreams":[{"large":"http://ecx.images-amazon.com/images/I/41FGlhksmtL.jpg","variant":"MAIN","hiRes":"http://ecx.images-amazon.com/images/I/81iXQbkcpiL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41FGlhksmtL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81iXQbkcpiL._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/81iXQbkcpiL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81iXQbkcpiL._UY550_.jpg":["423","550"],"http://ecx.images-amazon.com/images/I/81iXQbkcpiL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81iXQbkcpiL._UY500_.jpg":["385","500"]}},{"large":"http://ecx.images-amazon.com/images/I/41XR9o0cV-L.jpg","variant":"BACK","hiRes":"http://ecx.images-amazon.com/images/I/81bVmFiRu0L._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41XR9o0cV-L._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81bVmFiRu0L._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81bVmFiRu0L._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81bVmFiRu0L._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81bVmFiRu0L._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/81bVmFiRu0L._UY550_.jpg":["423","550"]}}],"Heather Grey":[{"large":"http://ecx.images-amazon.com/images/I/41f-8R8Eu-L.jpg","variant":"MAIN","hiRes":"http://ecx.images-amazon.com/images/I/81dTYkBL%2BxL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41f-8R8Eu-L._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81dTYkBL%2BxL._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/81dTYkBL%2BxL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81dTYkBL%2BxL._UY550_.jpg":["423","550"],"http://ecx.images-amazon.com/images/I/81dTYkBL%2BxL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81dTYkBL%2BxL._UX342_.jpg":["342","445"]}},{"large":"http://ecx.images-amazon.com/images/I/41gLiFBbcdL.jpg","variant":"BACK","hiRes":"http://ecx.images-amazon.com/images/I/81ua3AXCpJL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41gLiFBbcdL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81ua3AXCpJL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81ua3AXCpJL._UY550_.jpg":["423","550"],"http://ecx.images-amazon.com/images/I/81ua3AXCpJL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81ua3AXCpJL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81ua3AXCpJL._UX466_.jpg":["466","606"]}}],"Black":[{"large":"http://ecx.images-amazon.com/images/I/41BxSpfEM7L.jpg","variant":"MAIN","hiRes":"http://ecx.images-amazon.com/images/I/81%2BTW8762BL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41BxSpfEM7L._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81%2BTW8762BL._UY550_.jpg":["423","550"],"http://ecx.images-amazon.com/images/I/81%2BTW8762BL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81%2BTW8762BL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81%2BTW8762BL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81%2BTW8762BL._UX466_.jpg":["466","606"]}},{"large":"http://ecx.images-amazon.com/images/I/41Gf%2BW-cPTL.jpg","variant":"BACK","hiRes":"http://ecx.images-amazon.com/images/I/81SJwuaCspL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41Gf%2BW-cPTL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81SJwuaCspL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81SJwuaCspL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81SJwuaCspL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81SJwuaCspL._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/81SJwuaCspL._UY550_.jpg":["423","550"]}}],"White":[{"large":"http://ecx.images-amazon.com/images/I/41tElK2wPKL.jpg","variant":"MAIN","hiRes":"http://ecx.images-amazon.com/images/I/81kKgU75rIL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41tElK2wPKL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81kKgU75rIL._UY550_.jpg":["423","550"],"http://ecx.images-amazon.com/images/I/81kKgU75rIL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81kKgU75rIL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81kKgU75rIL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81kKgU75rIL._UX466_.jpg":["466","606"]}},{"large":"http://ecx.images-amazon.com/images/I/31lEDIs4cqL.jpg","variant":"BACK","hiRes":"http://ecx.images-amazon.com/images/I/81OBgvbUR7L._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/31lEDIs4cqL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81OBgvbUR7L._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/81OBgvbUR7L._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81OBgvbUR7L._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81OBgvbUR7L._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81OBgvbUR7L._UY550_.jpg":["423","550"]}}],"Deep Blue Sea":[{"large":"http://ecx.images-amazon.com/images/I/41oNq3KmSGL.jpg","variant":"MAIN","hiRes":"http://ecx.images-amazon.com/images/I/81MtZtmxVLL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41oNq3KmSGL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81MtZtmxVLL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81MtZtmxVLL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81MtZtmxVLL._UY550_.jpg":["423","550"],"http://ecx.images-amazon.com/images/I/81MtZtmxVLL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81MtZtmxVLL._UX466_.jpg":["466","606"]}},{"large":"http://ecx.images-amazon.com/images/I/41AJgd1OuYL.jpg","variant":"BACK","hiRes":"http://ecx.images-amazon.com/images/I/81uLEksrYFL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41AJgd1OuYL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81uLEksrYFL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81uLEksrYFL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81uLEksrYFL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81uLEksrYFL._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/81uLEksrYFL._UY550_.jpg":["423","550"]}}],"Sea Glass":[{"large":"http://ecx.images-amazon.com/images/I/418vg-re8oL.jpg","variant":"MAIN","hiRes":"http://ecx.images-amazon.com/images/I/81YgtD-bEwL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/418vg-re8oL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/81YgtD-bEwL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/81YgtD-bEwL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/81YgtD-bEwL._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/81YgtD-bEwL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/81YgtD-bEwL._UY550_.jpg":["423","550"]}},{"large":"http://ecx.images-amazon.com/images/I/41lcpC41VSL.jpg","variant":"BACK","hiRes":"http://ecx.images-amazon.com/images/I/814%2B6ZLwIxL._UL1500_.jpg","thumb":"http://ecx.images-amazon.com/images/I/41lcpC41VSL._SR38,50_.jpg","main":{"http://ecx.images-amazon.com/images/I/814%2B6ZLwIxL._UY500_.jpg":["385","500"],"http://ecx.images-amazon.com/images/I/814%2B6ZLwIxL._UX342_.jpg":["342","445"],"http://ecx.images-amazon.com/images/I/814%2B6ZLwIxL._UX522_.jpg":["522","679"],"http://ecx.images-amazon.com/images/I/814%2B6ZLwIxL._UX466_.jpg":["466","606"],"http://ecx.images-amazon.com/images/I/814%2B6ZLwIxL._UY550_.jpg":["423","550"]}}]};
data["heroImage"] = {};
data["landingAsinColor"] = 'Coral Dreams';
data["shouldApplyResizeFix"] = false;
return data;
});
</script>
The filenames I want to grab don't have src (i.e. http://ecx.images-amazon.com/images/I/81%2BTW8762BL._UY500_.jpg) In this case, the array is called data["colorImages"]. But I can't hard-code anything because the same thing happens on eBay.
The filenames I need here are in enImgCarousel.
On a side note, when I use the following JavaScript bookmarklet for each URL to get images, I'm able to get the correct images:
a='';
for (b=0;b<document.images.length;b++){
a+='<img src='+document.images[b].src+'><br>'};
ifa=''){
document.writea+'</center>');
void(document.close())
}else{
alert('No images!')
}
Back to Nokogiri and XPath, I've also tried:
tmp2.xpath("//img").each do |src|...
and
tmp2.xpath("html//img").each do |src|
Any ideas how I should do this or which direction to go in?
This is alternative way to solve what you want; you can use Capybara and Poltergeist.
I assume you don't have to dive into JavaScript with this solution.
If you scrape, I recommend that you consider Capybara with Poltergeist, you can find many sources to reference.
This is the code I tried:
require 'capybara'
require 'capybara/dsl'
require 'capybara/poltergeist'
Capybara.register_driver :poltergeist_debug do |app|
Capybara::Poltergeist::Driver.new(app, inspector: true)
end
Capybara.javascript_driver = :poltergeist_debug
Capybara.current_driver = :poltergeist_debug
# Amazon Case
visit_site('https://www.amazon.com/dp/B00T46V758/?tag=stackoverfl08-20')
doc_amazon = Nokogiri::HTML.parse(page.html)
doc_amazon.xpath("//img/#src").each do |src|
p src.value
end
#ebay case
visit_site('https://www.ebay.com/itm/Summer-Women-Casual-Chiffon-Loose-Tops-Batwing-Short-Sleeve-Loose-T-Shirt-Blouse-/351411949784?pt=LH_DefaultDomain_0&var=&hash=item51d1c8d0d8')
doc_ebay = Nokogiri::HTML.parse(page.html)
doc_ebay.xpath("//img/#src").each do |src|
p src.value
end
If you want to dig into it:
doc.xpath("//div[#id='imgTagWrapperId']/img").attribute('src').value
# => "https://images-na.ssl-images-amazon.com/images/I/81%2BTW8762BL._UX453_.jpg"
doc.xpath("//div[#id='mainImgHldr']/img[#id='icImg']").attribute('src').value
# => "https://i.ebayimg.com/images/g/dtAAAOSwpdpVZuU~/s-l300.jpg"
Are you trying to generate a database of competitors items with pricing, etc.?
Are you trying to grab entire categories or individual sellers?
The reason why I ask is you can get an RSS feed of items each seller lists if they have turned that feature on. This way, you do not have to waste time scraping a page when you can get the central data from an RSS feed.
When parsing webpages, depending upon where you are in the webpage (you mentioned carousel) the indices you are encountering are from the stash of thumbnails representing the larger images.
I recommend looking at the eBay API and the Amazon API and finding the RSS feeds for the sellers first.
As far as getting past any Javascript issues, the webpage loads rotating slideshows and carousels dynamically, so you will have to use Mechanize (as RAJ suggested above) or Beautiful Soup or Selenium to get fully rendered web pages in which all images are in a scrapable state.
Feel free to post your source if there is anything else I can help with.
Sorry, as I am posting the answer from mobile phone, I can't write full code right away, however, I can give you a way. You should use Mechanize with selenium-webdriver & watir instead of only Nokogiri.
Using Mechanize, you will be able to handle elements coming from JavaScript. You can mock the actual moves on browser i.e. you can code for clicking on links/buttons, you can wait for image load and then can scrape it. And all this can be done using Mechanize very easily.

"document" in mozilla extension js modules?

I am building Firefox extension, that creates single XMPP chat connection, that can be accessed from all tabs and windows, so I figured, that only way to to this, is to create connection in javascript module and include it on every browser window. Correct me if I am wrong...
EDIT: I am building traditional extension with xul overlays, not using sdk, and talking about those modules: https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules
So I copied Strophe.js into js module. Strophe.js uses code like this:
/*_Private_ function that creates a dummy XML DOM document to serve as
* an element and text node generator.
*/
[---]
if (document.implementation.createDocument === undefined) {
doc = this._getIEXmlDom();
doc.appendChild(doc.createElement('strophe'));
} else {
doc = document.implementation
.createDocument('jabber:client', 'strophe', null);
}
and later uses doc.createElement() to create xml(or html?) nodes.
All worked fine, but in module I got error "Error: ReferenceError: document is not defined".
How to get around this?
(Larger piece of exact code: http://pastebin.com/R64gYiKC )
Use the hiddenDOMwindow
Cu.import("resource://gre/modules/Services.jsm");
var doc = Services.appShell.hiddenDOMWindow.document;
It sounds like you might not be correctly attaching your content script to the worker page. Make sure that you're using something like tabs.attach() to attach one or more content scripts to the worker page (see documentation here).
Otherwise you may need to wait for the DOM to load, waiting for the entire page to load
window.onload = function ()
{
Javascript code goes here
}
Should take at least diagnose that issue (even if the above isn't the best method to use in production). But if I had to wager, I'd say that you're not attaching the content script.

Why is the file upload disabled in the Youtube Upload Widget?

I have just been looking into allowing my users to upload videos to their YouTube accounts directly from my site using the Youtube Upload widget. This widget is like 1000 times easier to deploy that the usual API process.
I have seen that it currently defaults to webcam_only=true but am wondering why? If I change the iframe to webcam_only=false I get the upload button and it all seems to work fine...
Obviously it would be an enourmous time saver for me if I could just use this functionality as opposed to trying to get my head around the whole API 2 way of doing things- plus that method seems to require refreshing the page which is no good for my app.
Any updates on why this is disabled and when it may be enabled?
Thanks in advance
webcam_only is set to true by default if the api creates the api. You can create the iframe element yourself as detailed in the "Loading an upload widget" of the developer docs.
https://developers.google.com/youtube/youtube_upload_widget
<iframe id="widget" type="text/html" width="640" height="390"
src="https://www.youtube.com/upload_embed" frameborder="0"></iframe>
<script>
widget = new YT.UploadWidget('widget', {
});
</script>
Or
<div id="widget"></div>
widget = new YT.UploadWidget('widget', {
webcamOnly: true;
});
I believe the option to upload videos using the YouTube Upload Direct widget has been removed as of August 2012. Although you can force add the upload button you get the button, but nothing happens with it.
https://developers.google.com/youtube/youtube_upload_widget#Revision_History
August 22, 2012 This update contains the following changes:
The webcamOnly property has been removed from the list of widget options that you can specify in the constructor for the upload widget.
Previously, this property was documented as having a default value of
false, which would mean that the widget would also display a button
for uploading an existing video file. However, the option to upload an
existing file is not currently supported, so the widget always only
displays an option to record and upload a webcam video.
Here is an example of the button with no action: http://sandboxsite.us/youtubetest.php
This is using:
<iframe id="widget" type="text/html" width="640" height="390"
src="https://www.youtube.com/upload_embed?webcam_only=false" frameborder="0"></iframe>
<script>
widget = new YT.UploadWidget('widget', {
});
</script>
If anyone figures out how to add the button AND make the uploads work, I'd surely buy them a beer!

Resources