Google Docs IMPORTXML exact data value - google-sheets

First of all: I'm completely new to this. I only started with XML/HTML an hour ago, and I'm trying to import data from a website to work with it in Google Sheets. The website uses HTML, and I want to import the exact value behind the rounded one that is displayed, i.e. instead of the displayed 86% I want the underlying value 0.8597.
In this example I'm trying to grab the playoff chances for every NFL team in the 2021 season (most recent week), based on FiveThirtyEight's predictions.
First off, I import the team names using this formula:
=IMPORTXML("https://projects.fivethirtyeight.com/2021-nfl-predictions/"; "//table[#id='standings-table']/tbody/tr/td[#class='team']/span[#class='full']")
After that, I try to import the playoff chances with this formula:
=IMPORTXML("https://projects.fivethirtyeight.com/2021-nfl-predictions/"; "//table[#id='standings-table']/tbody/tr/td[#class='pct div']")
That returns the following:
TEAM         Playoff chances
Rams         86%
Buccaneers   83%
...          ...
But I want this:
TEAM         Playoff chances
Rams         0.8597
Buccaneers   0.82702
...          ...
The whole element with the values looks like this, in this case for the Los Angeles Rams (LAR): <td class="pct div" data-team="LAR" data-cat="make_playoffs" data-val="0.8597" style="background-color:rgb(36, 159, 219);color:#000;">86%</td>
Instead of the displayed >86%< I want to receive the value from data-val="0.8597".
I tried to describe my problem as well as I can, and I hope you can understand it. Thanks to anyone reading this or even answering!
-Daxelinho

In your situation, how about the following xpath?
Modified xpath:
//table[@id='standings-table']/tbody/tr/td[@class='pct div']/@data-val
and
//td[@class='pct div']/@data-val
Result:
When the above xpath is used, the data-val values (for example 0.8597 for the Rams) are returned instead of the rounded percentages.
In this case, the sample formula is =IMPORTXML("https://projects.fivethirtyeight.com/2021-nfl-predictions/", "//td[@class='pct div']/@data-val")
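If you want to double-check the same XPath outside of Sheets, here is a minimal Python sketch using requests and lxml (my own illustration, assuming the table is present in the static HTML, which the working IMPORTXML call suggests):

import requests
from lxml import html

url = "https://projects.fivethirtyeight.com/2021-nfl-predictions/"
tree = html.fromstring(requests.get(url).content)

# Same XPath as in the IMPORTXML formula: select the data-val attribute
# instead of the rounded text shown in each cell.
values = tree.xpath("//td[@class='pct div']/@data-val")
print(values[:5])  # e.g. ['0.8597', '0.82702', ...]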

Related

MQL4 gets different data than the MT4 History Center shows

I'm quite new to MQL so this may be a misunderstanding.
This code:
Print( TimeToStr( iTime(NULL,PERIOD_M5,i), TIME_DATE|TIME_MINUTES), iOpen(NULL,PERIOD_M5,i), iHigh(NULL,PERIOD_M5,i), iLow(NULL,PERIOD_M5,i), iClose(NULL,PERIOD_M5,i) );
Prints this:
2023.01.05 23:25, 0.91648, 0.91678, 0.91636, 0.91676
If I export data from the History Center I get slightly different numbers:
2023.01.05T23:25, 0.91646, 0.91675, 0.91633, 0.91675, 146
The values are just a tiny bit smaller. This is consistent - ALL of the values exported from the History Center are a bit lower than what I see when I examine the data with iOpen() etc. The difference seems to range from 0.1 to 0.3 pips.
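For reference, here is a quick check of the two quoted bars (plain Python, just to show the size of the differences in pips; a pip here is 0.0001):

# Values copied from the two outputs above.
tester  = [0.91648, 0.91678, 0.91636, 0.91676]   # iOpen/iHigh/iLow/iClose
history = [0.91646, 0.91675, 0.91633, 0.91675]   # History Center export

for name, a, b in zip(["open", "high", "low", "close"], tester, history):
    print(name, round((a - b) / 0.0001, 1), "pips")
# open 0.2 pips, high 0.3 pips, low 0.3 pips, close 0.1 pips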
I assume this is a Bid vs Ask thing. Is it? I understand the iOpen() etc returns Bid prices. But does the History database contain both Bid and Ask? Is there a way to see both prices?
Or is it just guessing? If so, it does not seem to be adjusting the prices consistently.
Edit: I am running in the Strategy Tester.
Edit2: Today I downloaded a bunch of M1 data from Dukascopy and imported it into MT4 History Center. Now I get exactly the results I expect from my program.
But my question stands, about the data MT4 downloads from the broker.

Google Sheets IF AND OR Logic

I am making a scoring system on Google sheets and I am struggling with the logic I need for the final step.
This question might be related, but I can't seem to apply the logic.
There are a number of chemicals tested, and for each an amount detected (AD) is given, and each has a benchmark amount allowed (AL). From AL and AD we calculate AD/AL = %AL.
The Total Score (TS) is calculated based on an additive and weighted formula that takes into consideration the individual %ALs, but I won't go into that formula.
The final step is for me to "calculate" the Display Score (DS), which has some rules to it, and this is where I need the logic. The rules are as follows:
If any of the %ALs are over 100 (this will make TS > 100 too), then DS should show "100+".
If none of the %ALs are over 99 (TS may be above or below 100), then DS can NOT be over 99, so it should show TS, maxing out at 99.
I want to do this within the sheet itself. I think the correct tool is logic operators IF, AND, OR.
I have made many attempts, these are some: (I am replacing cell references with the acronyms I used above)
=IF(TS>100,"100+",TS)
=IF(OR(AND(MAX(RANGE_OF_%ALS)<100,TS>99),(AND(MAX(RANGE_OF_%ALS)>100,TS>100)),99,"100+"))
I have also tried to think about how I would solve this in Python (just to explore it, I don't want to use Python for the solution). This was my attempt:
if Max%AL < 100:
    if TS < 100:
        print(TS)
    else:
        print("99")
else:
    if TS > 100:
        print("100+")
Those are my attempts at thinking through the problem. I would appreciate some help.
This is a link to a copy of my sheet: https://docs.google.com/spreadsheets/d/1ZBnaFUepVdduEE2GBdxf5iEsfDsFNPIYhrhblHDHEYs/edit?usp=sharing
Please try:
=if(max(RANGE_OF_%ALS)>1,"100+",if(max(RANGE_OF_%ALS)<=0.99,MIN(TS,0.99),"?"))
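In plain Python, the logic of that formula is roughly the following (note the suggested formula assumes the %AL and TS values are stored as fractions, so 100% is 1.0; the names below are placeholders for your cell ranges):

def display_score(pct_als, ts):
    if max(pct_als) > 1:         # some %AL over 100% -> show "100+"
        return "100+"
    elif max(pct_als) <= 0.99:   # no %AL over 99% -> show TS, capped at 99%
        return min(ts, 0.99)
    else:                        # max %AL between 99% and 100%: rules don't say
        return "?"

print(display_score([0.40, 1.20], 1.05))  # "100+"
print(display_score([0.40, 0.80], 1.05))  # 0.99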

How to use external data in an OSRM profile

In this Mapbox blog post, Lauren Budorick shares how they got a routing engine working with OSRM that uses elevation data in order to give cyclists better routes... AMAZING!
I also want to explore the potential of OSRM's routing when plugging in external (user-generated) data, but I'm still having a hard time grasping how OSRM's profiles work. I think I get the main idea: every way (or node?) is piped into a few functions that, all together, score how good that path is.
But that's it; there are plenty of missing parts in my head, like what each of the functions Lauren uses in her profile does. If anyone could point me to some more detailed information on how all of this works, you'd make my next week much, much easier :)
Also, in Lauren's post, inside source_function she loads a ./srtm_bayarea.asc file. What does that .asc file look like? How would one generate a file like that from, let's say, data stored in a pgsql database? Can we use some other format, like GeoJSON?
Then, when in segment_function she uses things like source.lon and target.lat, do those refer to the raw data stored in the .asc file? Or is that file processed into some standard format that everything is mapped to?
As you can see, I'm a complete newbie to routing, and maybe GIS in general, but I'd love to learn more about these standards and tools that circle around the OSRM ecosystem. Can you share some tips with me?
I think I get the main idea: every way (or node?) is piped into a few functions that, all together, score how good that path is.
Right, every way and every node are scored as they are read from an OSM dump to determine passability of a node and speed of a way (used as the scoring heuristic).
A basic description of the data format can be found here. As it reads, data immediately available in ArcInfo ASCII grids includes SRTM data. Currently plaintext ASCII grids are the only supported format. There are several great Python tools for GIS developers that may help in converting other data types to ASCII grids - check out rasterio, for example. Here's an example of a really simple python script to convert NED IMGs to ASCII grids:
import sys
import rasterio as rio
import numpy as np

args = sys.argv[1:]

with rio.drivers():  # rasterio >= 1.0 uses rio.Env() instead
    with rio.open(args[0]) as src:
        elev = src.read()[0]       # first band as a 2D array
        profile = src.profile      # keeps the nodata value, among other things

def shortify(x):
    if x == profile['nodata']:
        return -9999
    elif x == np.finfo(x).tiny:
        return 0
    else:
        return int(round(x))

# list comprehension instead of map() so this also works under Python 3
out_elev = [[shortify(x) for x in row] for row in elev]

with open(args[0] + '.asc', 'a') as dst:
    np.savetxt(dst, np.array(out_elev), fmt="%s", delimiter=" ")
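For example, if the script above is saved as (hypothetically) img_to_asc.py, running python img_to_asc.py ned_tile.img should write ned_tile.img.asc next to the input file, which could then be referenced from the profile the way Lauren's source_function references its .asc file.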
Regarding source.lon and target.lat: source and target are nodes provided as arguments by the extraction process. Their coordinates are used to look up data at each location during extraction.
Make sure to read thoroughly through the relevant wiki page (already linked).
Alternatively, feel free to open a GitHub issue at https://github.com/Project-OSRM/osrm-backend/issues with OSRM questions.

How do you include categories with 0 responses in SPSS frequency output?

Is there a way to display response options that have 0 responses in SPSS frequency output? The default is for SPSS to omit in the frequency table output any response option that is not selected by at least a single respondent. I looked for a syntax-driven option to no avail. Thank you in advance for any assistance!
It doesn't show because there is not a single case in the data with that attribute. So, by forcing a row of zeros, realize that we're asking SPSS to do something it considers incorrect.
Having said that, you can introduce a fake case with the missing category. E.g. if you have Orange, Apple, and Pear, but no one answered that they like Pear, then add one fake case that says Pear.
Now, make a new weight variable that consists only of 1s, but for the Pear case make it very, very small, like 0.00001. Then go to Data > Weight Cases > Weight cases by and move that new weight variable over. Click OK to apply. What happens now is that SPSS treats the real cases with a weight of 1 and the fake case with a weight that is 1/100,000 of a normal case. If you rerun the frequencies, you should see the category with a zero count show up.
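To see why the tiny weight works, here is a small Python analogy (not SPSS syntax), using the Orange/Apple/Pear example:

# Real responses get weight 1; the fake Pear case gets a tiny weight,
# so Pear appears in the table with a (near-)zero count.
responses = ["Orange", "Orange", "Apple", "Pear"]   # last one is the fake case
weights   = [1, 1, 1, 0.00001]

counts = {}
for r, w in zip(responses, weights):
    counts[r] = counts.get(r, 0) + w
print(counts)   # {'Orange': 2, 'Apple': 1, 'Pear': 1e-05}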
If you have purchased the Custom Tables module, you can also do that directly, as far as I can tell from their technical documentation. That module costs 637 to 3630 depending on the license type, so it's probably only worth a try if your institution already has it.
So, I'm a noob with SPSS; I (shame on me) have a cracked version of SPSS 22, and if I understood your question correctly, this is my solution:
double click the Frequency table in Output
right click table, select Table Properties
go to General and then uncheck the Hide empty rows and columns option
Hope this helps someone!
If your SPSS version has no Custom Tables installed and you haven't collected the money for that module yet, then use the following (run this syntax):
*Note: please use variable names up to 8 characters long.
set mxloops 1000. /*in case your list of values is longer than 40
matrix.
get vars /vari= V1 V2 /names= names /miss= omit. /*V1 V2 here is your categorical variable(s)
comp vals= {1,2,3,4,5,99}. /*let this be the list of possible values shared by the variables
comp freq= make(ncol(vals),ncol(vars),0).
loop i= 1 to ncol(vals).
comp freq(i,:)= csum(vars=vals(i)).
end loop.
comp names= {'vals',names}.
print {t(vals),freq} /cnames= names /title 'Frequency'. /*here you are - the frequencies
print {t(vals),freq/nrow(vars)*100} /cnames= names /format f8.2 /title 'Percent'. /*and percents
end matrix.
*If variables have missing values, they are deleted listwise. To include missings, use
get vars /vari= V1 V2 /names= names /miss= -999. /*or other value
*To exclude missings individually from each variable, analyze by separate variables.

How to combine two files and create a report with matched fields in COBOL

I have two files :
first file contains jobname and start time which looks like below:
ZPUDA13V STARTED - TIME=00.13.30
ZPUDM00V STARTED - TIME=03.26.54
ZPUDM01V STARTED - TIME=03.26.54
ZPUDM02V STARTED - TIME=03.26.54
ZPUDM03V STARTED - TIME=03.26.56
and the second file contains jobname and Endtime which looks like below:
ZPUDA13V ENDED - TIME=00.13.37
ZPUDM00V ENDED - TIME=03.27.38
ZPUDM01V ENDED - TIME=03.27.34
ZPUDM02V ENDED - TIME=03.27.29
ZPUDM03V ENDED - TIME=03.27.27
Now I am trying to combine these two files to get a report like JOBNAME START TIME END TIME. I have used ICETOOL to produce the report, but if I get JOBNAME and START TIME then END TIME is spaces, and if I get END TIME then JOBNAME and START TIME are spaces.
Please let me know how to code the OUTREC fields, as I have tried almost every possibility to get the desired output, but my output is still not what I need.
I have no idea what ICETOOL is (nor the inclination to even look it up in Google :-) but this is a classic COBOL data processing task.
Based on your simple data input, the algorithm would be:
for every record S in startfile:
    for every record E in endfile:
        if S.jobname = E.jobname:
            output S.jobname S.time E.time
            next S
        endif
    endfor
endfor
However, you may need to take into account the fact that:
multiple jobs of the same name may run during the day (multiple entries in the file).
multiple jobs of the same name may run at the same time.
You could get around the first problem by ensuring the E record was the one immediately following the S record (based on time). The second problem is a doozy.
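To make the "E record immediately following the S record" idea concrete, here is a rough sketch in Python (an illustration only, not ICETOOL or COBOL; the file names are made up), pairing each STARTED line with the next unused ENDED line of the same job name:

import re
from collections import defaultdict

def parse(path, keyword):
    # Lines look like: 'ZPUDA13V STARTED - TIME=00.13.30'
    pattern = re.compile(r"^(\S+)\s+" + keyword + r"\s+-\s+TIME=(\S+)")
    records = []
    with open(path) as f:
        for line in f:
            m = pattern.match(line)
            if m:
                records.append((m.group(1), m.group(2)))
    return records

starts = parse("startfile.txt", "STARTED")
ends = defaultdict(list)
for job, t in parse("endfile.txt", "ENDED"):
    ends[job].append(t)

for job, start_time in starts:
    # Take the earliest unused end time for this job name; if several jobs
    # share a name, this pairs them in file order.
    end_time = ends[job].pop(0) if ends[job] else "????????"
    print(job.ljust(8), start_time, end_time)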
If you're running on z/OS (and you probably are, given the job names), have you considered using information from the SMF records to do this collection and analysis? I'm pretty certain SMF type 30 records hold everything you need.
And assuming this is a mainframe question, here's a shameless plug for a book one of my friends at work has written, check out What On Earth is a Mainframe? by David Stephens (ISBN-13 = 978-1409225355).
I know I'm too late with my resolution, but it may be helpful for newcomers to Stack Overflow.
You can make use of the JOINKEYS feature of DFSORT via JCL:
JOINKEYS FILE=F1,FIELDS=(01,08,CH,A)
JOINKEYS FILE=F2,FIELDS=(01,08,CH,A)
REFORMAT FIELDS=(F1:01,33,F2:25,08)
SORT FIELDS=COPY
OUTREC FIELDS=(01,08,25,08,34,08)
The OUTREC will hold the data as you need: the job name from columns 1-8, the start time from columns 25-32 of the file 1 record, and the end time from columns 34-41, where the REFORMAT statement appends the file 2 time to the joined record.
