Locale setting of importxml function in google sheet not working - google-sheets

I'm trying to extract prices from the Book Depository site in local currency. However, it always retrieves the USD prices no matter what I try.
A specific example is:
=IMPORTXML("https://www.bookdepository.com/1/9783836519885";"//span[@class='sale-price']";"bg-BG")
gives US$47.63 even though the Google Sheets settings are changed to Bulgarian and the locale is set to "bg-BG".
The same US$47.63 result is retrieved when I use another scraping method like:
=IMPORTXML("https://www.bookdepository.com/1/9783836519885";"//meta[@itemprop='price']/@content";"bg-BG")
The following does not retrieve any result (but this is a secondary problem I am investigating which will follow once I understand the locale problem):
=IMPORTXML("https://www.bookdepository.com/1/9783836519885";"/html/body/div[2]/div[6]/div[3]/div/div[1]/div[1]/div[3]/div/div[2]/div/div[3]/div/span[1]";"bg-BG")
What do you think - is there a workaround?

I don't think that can be done with importXML().
The function help page seems to be missing the locale parameter, but the formula editor's inline help box says the following:
locale: A language and region locale code to use when parsing the data. If unspecified, the document locale will be used.
The importXML() function only finds data that actually appears in the XML document. The endpoint you mention seems to adjust its content per the client's IP address, but in each response, it only has prices in one currency.
The locale parameter does not change the IP address the request is sent from. It gets sent from one of Google's servers, most of which are in the United States. When you set the locale parameter, the content may get parsed in a different way, but that will not magically make additional content appear in the page.

In this case, what you actually need is to stop Google Sheets from fetching the page over its default route.
You need something like https://www.4everproxy.com/ (but with Bulgarian support).
Here is an example; with the proxied URL, the formula becomes:
=IMPORTXML("https://de.4everproxy.com/direct/aHR0cHM6Ly93d3cuYm9va2RlcG9zaXRvcnkuY29tLzEvOTc4MzgzNjUxOTg4NQ--","//span[@class='sale-price']")
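A side note on the proxied URL above: its last path segment appears to be the Base64 encoding of the original page address, with the trailing "=" padding written as "-". That is inferred from this one URL only, not from any documented proxy behaviour. A minimal Node.js sketch under that assumption (the function name toProxyUrl is hypothetical):

```javascript
// Hypothetical helper: rebuild the proxy link from a target URL.
// Assumption: the proxy path is plain Base64 with "=" padding replaced
// by "-", as inferred from the single example URL in the answer.
function toProxyUrl(targetUrl, proxyBase = "https://de.4everproxy.com/direct/") {
  const encoded = Buffer.from(targetUrl, "utf8")
    .toString("base64")
    .replace(/=/g, "-");
  return proxyBase + encoded;
}

console.log(toProxyUrl("https://www.bookdepository.com/1/9783836519885"));
```

If the assumption holds, the printed link is the one used in the formula above, so the same helper could produce the URL for other book pages.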

Related

Using XML spreadSheet to get the price of an item

I want to get the price of an item that is on the Steam market. I tried this formula, but it does not work: it tells me that the value is too big, and I don't know what to do.
=VALUE(REGEXEXTRACT(REGEXEXTRACT(CONCATENATE(IMPORTXML("https://steamcommunity.com/market/listings/730/Clutch%20Case", "//script[2]")),".*]]"), "[0-9]+.[0-9]+"))
The main problem here is that the prices in the Steam page are generated by JavaScript, and IMPORTXML cannot retrieve dynamically generated data. It seems that you're trying to get around this by importing a <script> section, but this will not execute the script; you're just grabbing a bunch of code.
According to this answer, Steam has some endpoints that you can use to get market data. These return a simple JSON string with the item information. The endpoint looks like this:
http://steamcommunity.com/market/priceoverview/?currency=1&appid=[ID]&market_hash_name=[Item name]
The appid is the game's ID, and the market_hash_name is the URL-encoded name of the item. Conveniently you can already find these in the URL that you are already using, https://steamcommunity.com/market/listings/730/Clutch%20Case. The game ID is 730 and the name is Clutch%20Case. So you can plug these in to the endpoint to get this URL:
http://steamcommunity.com/market/priceoverview/?currency=1&appid=730&market_hash_name=Clutch%20Case
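The substitution described above can be sketched in a few lines of JavaScript. The function name priceOverviewUrl is hypothetical; the endpoint and its three parameters are the ones quoted in this answer.

```javascript
// Hypothetical helper: assemble the priceoverview URL from its parts.
// encodeURIComponent turns "Clutch Case" into "Clutch%20Case".
function priceOverviewUrl(appId, itemName, currency = 1) {
  const base = "http://steamcommunity.com/market/priceoverview/";
  return `${base}?currency=${currency}&appid=${appId}` +
    `&market_hash_name=${encodeURIComponent(itemName)}`;
}

console.log(priceOverviewUrl(730, "Clutch Case"));
```

Passing the plain item name and letting the helper do the URL-encoding avoids encoding it twice by accident.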
The endpoint's JSON looks like this:
{
"success":true,
"lowest_price":"$0.30",
"volume":"94,440",
"median_price":"$0.31"
}
Since you only care about the median price, we can use a formula with REGEXEXTRACT to extract only that part. Here's a sample with the URL pasted into A1:
=REGEXEXTRACT(JOIN("", IMPORTDATA(A1)), "median_price:""(\$[0-9]+.[0-9]+)")
Edit: As mentioned in the answer I linked, you can test the currency parameter in the URL with different numbers to get other currencies. In your case you can try currency=2 for pounds (£). You'll also have to edit the REGEXEXTRACT to account for this change:
URL: http://steamcommunity.com/market/priceoverview/?currency=2&appid=730&market_hash_name=Clutch%20Case
Formula: =REGEXEXTRACT(JOIN("", IMPORTDATA(A1)), "median_price:""(£[0-9]+.[0-9]+)")
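Outside of Sheets, the same extraction is easier with a JSON parser than with a regular expression. The sketch below runs on the sample response quoted earlier rather than on a live request; the function name medianPrice is hypothetical, and it assumes the price uses "." as the decimal separator, as in that sample.

```javascript
// Sample response copied from the answer above.
const sample =
  '{"success":true,"lowest_price":"$0.30","volume":"94,440","median_price":"$0.31"}';

// Hypothetical helper: return the median price as a number, or null
// when the response is unsuccessful or has no usable median_price.
function medianPrice(jsonText) {
  const data = JSON.parse(jsonText);
  if (!data.success || typeof data.median_price !== "string") return null;
  // Skip the currency symbol; assumes a "." decimal separator.
  const match = data.median_price.match(/[0-9]+(?:\.[0-9]+)?/);
  return match ? Number(match[0]) : null;
}

console.log(medianPrice(sample));
```

Because the currency symbol is skipped rather than matched, the same helper works for the "$" and "£" responses without editing a pattern.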

Google Sheet find cell by URL parameter

I have a database of elements, and each element has its own QR code. After reading the code, I would like to open the worksheet on a specific tab and jump to the appropriate cell (according to the element name). Calling a worksheet through a URL with the #gid parameter opens a tab, and the "range" parameter jumps to a specific cell. But what if I want to search for an item by name? Something like: https://docs.google.com/spreadsheets/d/1fER4x1p.../edit#gid=82420100&search=element_name. Is that possible?
Google has not introduced this yet.
But you can look into Google Apps Script (the macro-like scripting for Google Sheets) to achieve this.
A simpler approach is to just filter the data, although this obviously changes your requirement. For example, you can create a filter with the name you are looking for, and then you will get its URL.
This is the URL to a sample of this; it should open the spreadsheet and filter the data when loaded. This is the icon to look for to create the filters.
Here is some documentation to get you started on Google Apps Script, but I don't have a direct link showing how to catch the URL parameters so the script can process them. What I can tell you is that this is a much more complicated approach than just a URL, because it involves programmatic processing on the spreadsheet side.
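Whatever the script around it looks like, the search step comes down to finding the name in the sheet's values and turning its position into a cell address. Here is a minimal sketch of just that step in plain JavaScript. The function names are hypothetical, and the values are assumed to be a 2D array of rows such as the one an Apps Script range returns from getValues(). Reading a name from the URL and moving the cursor are not shown.

```javascript
// Hypothetical helper: find the first cell equal to `name` in a 2D array
// of values (an array of rows) and return its A1 address, or null.
function findCellA1(values, name) {
  for (let r = 0; r < values.length; r++) {
    const c = values[r].indexOf(name);
    if (c !== -1) return columnLetter(c) + (r + 1);
  }
  return null;
}

// Convert a zero-based column index to letters: 0 -> A, 25 -> Z, 26 -> AA.
function columnLetter(index) {
  let letters = "";
  for (let n = index + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    letters = String.fromCharCode(65 + ((n - 1) % 26)) + letters;
  }
  return letters;
}

const rows = [["name", "qty"], ["bolt", 3], ["nut", 7]];
console.log(findCellA1(rows, "nut"));
```

The returned address is in the same form that the "range" URL parameter mentioned in the question takes.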

cumulocity mqtt measurement

I am pretty new to Cumulocity and I am trying to get data into the platform from my own device using MQTT and the SmartREST templates. I can get data in using the static templates, but they only support certain data types. I am struggling to create the appropriate SmartREST template in the UI, and the documentation doesn't go into much detail.
I get that the template name goes in the MQTT topic (or selected on login as part of the username) in s/ut/template_name and the messageId of the messages in the template get matched to the first CSV field of the MQTT publish payload. What I don't get is the template terminology. In the UI I choose API->Measurement and Method->POST and I am presented with required values $.type and $.time. My questions:
Is $.type the "measurement fragment type" name or do I have to make it "c8y_CustomMeasurement"? Can I call it whatever I want?
$.time has a value field. Is this the default value if one is not supplied in the publish?
I assume I need to add a numerical value in the optional API values. To link it to the value of the data point should I make the key "c8y_CustomMeasurement.custom.value"?
Am I way off base here?
Every time I publish to my own SmartREST template the server drops the connection, so I assume it's an error in my template setup, but I don't see a way of accessing debug messages (also, nothing is published back to me on s/e or s/dt).
For the sake of an example, let's say I wish to publish a unitless, timestamped pulse count with payload format "mId,ts,value" with example data "p01,'2017-07-17 12:34:00',1234"
What you wrote so far is mostly correct. Just to be a bit more precise:
The topic is s/uc/template_id (not the template name; that is just a label).
The $.type refers to the 'type' fragment in the measurement JSON. It is a free-text field.
In 99% of cases you want to leave $.time empty. If you set something here, it is not a default but a fixed timestamp, and you cannot change it when using the template. If you leave it empty and still do not send a time in the message, the server time is used.
Example: p01,2017-07-17T12:34:00,1234 (no quotes around the timestamp, and ISO 8601 format)
Example without sending time: p01,,1234 (sending an empty string as the time results in the server time being set; the template is the same)
Hope these points help you find your issue.
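The two example payloads can be produced by one small function. This is only a sketch of the CSV formatting. The function name pulsePayload is hypothetical, and publishing the string to the s/uc topic is left to whatever MQTT client the device uses.

```javascript
// Hypothetical helper: build the CSV payload "messageId,time,value".
// Omit the timestamp to leave the time field empty, so the server
// sets the time itself.
function pulsePayload(messageId, value, timestamp = "") {
  return [messageId, timestamp, value].join(",");
}

console.log(pulsePayload("p01", 1234, "2017-07-17T12:34:00"));
console.log(pulsePayload("p01", 1234));
```

Keeping the empty field between the two commas matters: dropping it would shift the value into the time position.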

Twitter Search API IDs meaning

I am using the Twitter Search API and I can't understand the id field of a tweet.
For example, here is one: <id>tag:search.twitter.com,2005:1990561514</id>. The real ID is the final number part, right? Why doesn't Twitter already provide this in a single element? And why is there a year of 2005 in the ID field? Is that the ID of that year, and do the following year's tweets get an ID recounted from zero? Is the ID indexed to the year?
I am asking all this because I am going to use the since_id option to retrieve new tweets. If the ID isn't really unique and depends on the year, it won't work as expected.
Thanks.
The tag is unique - but parts of it are redundant.
tag:search.twitter.com,2005:1990561514
Obviously, search.twitter.com is the URL from where you requested the document.
The ,2005 is constant. As far as I can tell, it has never changed since the service was launched. While there's no official documentation, I would guess that it refers to the Atom specification namespace, http://www.w3.org/2005/Atom.
Finally, the long number is the Tweet's status ID. It will always be unique and can be used for the since_id.
What you will need to do is split the string, and just use the number after the colon as your ID.
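As a minimal sketch of that split in JavaScript (the function name statusIdFromTag is hypothetical): the ID is kept as a string, because longer IDs such as the 122032448266698752 quoted elsewhere in this thread exceed Number.MAX_SAFE_INTEGER and would lose precision as a Number.

```javascript
// Hypothetical helper: take the part after the last colon of the Atom
// <id> value and return it as a string of digits.
function statusIdFromTag(tag) {
  const id = tag.slice(tag.lastIndexOf(":") + 1);
  if (!/^[0-9]+$/.test(id)) {
    throw new Error("unexpected tag format: " + tag);
  }
  return id; // kept as a string to avoid losing precision
}

console.log(statusIdFromTag("tag:search.twitter.com,2005:1990561514"));
```

The string can be passed as since_id unchanged.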
I believe you are doing something wrong. If you look at all of the example results from the Twitter Search API, none of the id fields are formatted like this one you are showing.
For example:
http://search.twitter.com/search.json?q=%40twitterapi%20-via
Also, if you check out the example requests page, you will see that all of the id fields have normal formats, i.e.:
"id":122032448266698752
Update:
Now that I know you are using the atom feed, I can see where the seemingly oddly formatted element comes from. See this article on avoiding duplicates in atom feeds. Another helpful article.
Basically, atom feeds REQUIRE a unique id for each element in a feed. Some feeds use the "tag" scheme to ensure uniqueness. This format is actually pretty common in atom feeds and many frameworks use it by default. For instance, the RoR AtomFeedHelper (which might even be what Twitter uses) specifies the default format to be:
"tag:#{request.host},#{options}:#{request.fullpath.split(".")}"

How does a website highlight search terms you used in the search engine?

I've seen some websites highlight the search engine keywords you used to reach the page (such as the keywords you typed in the Google search listing).
How does it know what keywords you typed in the search engine? Does it examine the referrer HTTP header or something? Any available scripts that can do this? It might be server-side or JavaScript, I'm not sure.
This can be done either server-side or client-side. The search keywords are determined by looking at the HTTP Referer (sic) header. In JavaScript you can look at document.referrer.
Once you have the referrer, you check to see if it's a search engine results page you know about, and then parse out the search terms.
For example, Google's search results have URLs that look like this:
http://www.google.com/search?hl=en&q=programming+questions
The q query parameter is the search query, so you'd want to pull that out and un-URL-escape it, resulting in:
programming questions
Then you can search for the terms on your page and highlight them as necessary. If you're doing this server-side, you'd modify the HTML before sending it to the client. If you're doing it client-side, you'd manipulate the DOM.
There are existing libraries that can do this for you, like this one.
Realizing this is probably too late to make any difference...
Please, I beg you -- find out how to accomplish this and then never do it. As a web user, I find it intensely annoying (and distracting) when I come across a site that does this automatically. Most of the time it just ends up highlighting every other word on the page. If I need assistance finding a certain word within a page, my browser has a much more appropriate "find" function built right in, which I can use or not use at will, rather than having to reload the whole page to get it to go away when I don't want it (which is the vast majority of the time).
Basically, you...
Examine document.referrer.
Have a map from search engine domains to the GET param that contains the search terms.
var searchEnginesToGetParam = {
  'google.com': 'q',
  'bing.com': 'q'
};
Extract the appropriate GET param, and decodeURIComponent() it.
Parse the text nodes where you want to highlight the terms (see Replacing text with JavaScript).
You're done!
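The first three steps can be sketched as one function. The names here are hypothetical; in a browser you would pass document.referrer as the argument. It relies on the standard URL and URLSearchParams APIs, which decode the parameter (including "+" as a space), so a separate decodeURIComponent() call is not needed with this approach. Highlighting the text nodes is not shown.

```javascript
// Map from search engine domain to the GET param holding the query.
const searchEngineParams = { "google.com": "q", "bing.com": "q" };

// Hypothetical helper: return the decoded search terms from a referrer
// URL, or null when it is not a known search engine results page.
function searchTermsFromReferrer(referrer) {
  let url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return null; // empty or malformed referrer
  }
  for (const [domain, param] of Object.entries(searchEngineParams)) {
    const host = url.hostname;
    if (host === domain || host.endsWith("." + domain)) {
      return url.searchParams.get(param);
    }
  }
  return null;
}

console.log(
  searchTermsFromReferrer("http://www.google.com/search?hl=en&q=programming+questions")
);
```

Matching on the exact domain or a "." plus domain suffix keeps a host like notgoogle.com from being treated as Google.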
