options for saving xarray dataset with to_netcdf - save

I would like to add units, long_name, and maybe a description to a variable while using the to_netcdf command. Let me know if you know how.
Here is my code that work:
filename = path+'file.nc'
ds = xr.Dataset({'sla': (('time_counter','x', 'y'), SLA)}, coords={'time_counter':time_counter,'nav_lon':(('x','y'),lon),'nav_lat':(('x','y'),lat)})
ds.to_netcdf(filename, 'w')
Supplementary informations if you want to use this:
'sla' is the name I give while saving the variable SLA
SLA has 3 dimensions; I give them the names 'time_counter', 'x', and 'y'
I defined coordinates, one of which ('time_counter') is directly a dimension of SLA, but also it is possible to have a coordinate with multiple dimensions (e.g., 'nav_lon' and 'nav_lat' have 2 dimensions.
Here is the link that explain the function: http://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html

You can set the attributes of each variable before saving the Dataset to NetCDF, for example (after creating your ds):
ds['sla'].attrs = {'units': 'something'}
After the to_netcdf() step I get (part of the ncdump -h):
double sla(time_counter, x, y) ;
...
sla:units = "something" ;

Related

Extract a list of variables satisfying certain conditions and storing it in a new variable using SPSS Syntax

I have around 300 variables and I am calculating their Skewness and Kurtosis. Now, I want to create a new varaible which will consist of the list of all those variables whose Skewness and Kurtosis are within a certain range. The idea is to select only those variables which are satisfying a condition and perform normalization on all the other variables.
To calcualte Skewness i am using;
Descriptives A TO Z
/Statistics Skewness.
Execute.
I know this is not a valid Syntax but i Need something like this:
Compute x= if(Skewness(A TO Z)>1)
Please help me out with an SPSS Syntax for this.
There are multiple ways to approach this, so there might be an easier way.
you just need to change the 'var1 TO varN' to your list of variables and whatever criteria you want for Skewness & Kurtosis on the two COMPUTE lines that create the flags, and this will do it for you.
If I were doing this I would go a step further and build the normalization into the syntax using WRITE OUT = ".sps" /CMD. INSERT FILE = ".sps", but that isn't what you asked for.
DATASET DECLARE DistributionSyntax.
OMS
/SELECT TABLES
/IF SUBTYPES=["Descriptives"] INSTANCES=[1]
/DESTINATION FORMAT=SAV OUTFILE = 'DistributionSyntax'.
EXAMINE VARIABLES=var1 TO varN
/PLOT NONE
/STATISTICS DESCRIPTIVES
/CINTERVAL 95
/MISSING PAIRWISE
/NOTOTAL.
OMSEND.
DATASET ACTIVATE DistributionSyntax.
USE ALL.
FILTER OFF.
SELECT IF ANY(Var2,'Skewness','Kurtosis').
EXECUTE.
STRING VarName (A64).
COMPUTE SkewnessFlag = (Var2 = 'Skewness' AND ABS(Statistic) > 2).
COMPUTE KurtosisFlag = (Var2 = 'Kurtosis' AND ABS(Statistic) > 2).
COMPUTE VarName = CHAR.SUBSTR(Var1,1,CHAR.INDEX(Var1,' ')-1).
EXECUTE.
USE ALL.
COMPUTE filter_$=(SkewnessFlag = 1).
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
FRE VarName.
USE ALL.
COMPUTE filter_$=(KurtosisFlag= 1).
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
FRE VarName.
USE ALL.
FILTER OFF.
EXECUTE.
If you omit the select data blocks after you compute the flags and replace it with this, it will calculate normalized versions of the variables that meet your criteria. This calculates new variables, and you will want to add a file location for the syntax file (replace the "~/" in the WRITE and INSERT commands), and change the name of the dataset referenced as 'RAWDATA' to whatever your dataset name is:
USE ALL.
FILTER OFF.
SELECT IF ANY(1,SkewnessFlag,KurtosisFlag).
EXECUTE.
STRING CMD (A250).
COMPUTE CMD = CONCAT("COMPUTE ",RTRIM(VarName),".Norm = ln(",RTRIM(VarName),").").
EXECUTE.
DATA LIST /CMD 1-250 (A).
BEGIN DATA
EXECUTE.
END DATA.
DATASET NAME EXE WINDOW = FRONT.
DATASET ACTIVATE DistributionSyntax.
ADD FILES /FILE = *
/FILE = 'EXE'.
EXECUTE.
DATASET CLOSE EXE.
DATASET ACTIVATE DistributionSyntax.
WRITE OUT="~\Normalize Variables.sps" /CMD.
DATASET CLOSE DistributionSyntax.
DATASET ACTIVATE RAWDATA.
INSERT FILE="~\Normalize Variables.sps".

How to modify the data in a column using Wolfram Mathematica?

I am working on a Dataset object with one column, named Property.
The data is given as shown in the following picture:
Based on the range, I would like to assign a new value, and eventually replace the whole column in question. For example if the range is 500-5000, I would like to get the value 1, and for 5000-50000, I would like to give the value 2, and so on.
As I understand it, you want to recode one column of a dataset by modifying the dataset. To my knowledge, datasets are not really designed to be mutable types. If you can accept that, here are two ways to proceed.
First, let's get some artifical data.
ds = Dataset[<|"x" -> RandomInteger[10],
"y" -> Interval[{10^#, 10^(# + 1)}]|> & /# Range[5]]
Now suppose we want to recode the second column with a function f:
ds[All, {2 -> f}]
Note that the original dataset is unchanged. (Usually a good thing.)
Here's an example function to try out.
f[x_Interval] := Log[10, x[[1, 1]]]
ds[All, {2 -> f}]
Now a big problem with this is that your new dataset has a column with exactly the same name but entirely different interpretation. If this bothers you, you can instead append to the dataset with a new name.
Append[#, "y2" -> f[#y]] & /# ds
Edit:
What about those dollar signs? Unless you show us the full form of an entry, I'll have to guess. So I'll guess that the following artificial data gets us close enough to be useful:
ds = Dataset[<|"x" -> RandomInteger[10],
"y" -> Quantity[Interval[{10^#, 10^(# + 1)}], "USDollars"]|> & /# Range[5]]
This just means we need to make a small change in f:
f[Quantity[Interval[{x_, _}], _]] := Log[10, x]
Then we can replace or append as before:
ds[All, {2 -> f}]
Append[#, "y2" -> f[#y]] & /# ds
If we have grid stuff with column integer x (starting from 1 as we are in mathematica) named "Property", the code to get the column of transformed ranges in x -- to what I think want you -- is below:
Replace[#1[[1]] & /# stuff, x_ :> IntegerLength[x[[1, 1]]] - 2, {1}]
It takes all the ranges in the specified column, and subtracts 2 from the length of the lower part of the range to give you your result.
For example, if we take your sample ranges:
stuff = {{$Interval[{500, 50000}], things, things},
{$Interval[{5000, 5000000}], things, things}}
And run it through our Replace:
Replace[#1[[1]] & /# stuff, x_ :> IntegerLength[x[[1, 1]]] - 2, {1}]
We get an Out: of:
{1, 2}
You can then easily modify the Replace above to give you the transformed column in situ of stuff.

List still being treated as a set even after converting

So i have an instance where even after converting my sets to lists, they aren't recognized as lists.
So the idea is to delete extra columns from a data frame comparing with columns in another. I have two data frames say df_test and df_train . I need to remove columns in df_test which are not in train .
extracols = set(df_test.columns) - set(df_train.columns) #Gives cols 2b
deltd
l = [extracols] # or list(extracols)
Xdp.dropna( subset = l, how ='any' , axis = 0)
I get an error : Unhashable type set
Even on printing l it prints like a set with {} curlies.
[{set}] doesn't cast to list, it just creates a list of length 1 with your set inside it.
Are you sure that list({set}) isn't working for you? Maybe you should post more of your code as it is hard to see where this is going wrong for you.

f# deedle filter data frame based on a list

I wanted to filter a Deedle dataframe based on a list of values how would I go about doing this?
I had an idea to use the following code below:
let d= df1|>filterRowValues(fun row -> row.GetAs<float>("ts") = timex)
However the issue with this is that it is only based on one variable, I then thought of combining this with a for loop and an append function:
for i in 0.. recd.length -1 do
df2.Append(df1|>filterRowValues(fun row -> row.GetAs<float>("ts") = recd.[i]))
This does not work either however and there must be a better way of doing this without using a for loop. In R I could for instance using an %in%.
You can use the F# set type to create a set of the values that you are interested. In the filtering, you can then check whether the set contains the actual value for the row.
For example, say that you have recd of type seq<float>. Then you should be able to write:
let recdSet = set recd
let d = df1 |> Frame.filterRowValues (fun row ->
recdSet.Contains(row.GetAs<float>("ts"))
Some other things that might be useful:
You can replace row.GetAs<float>("ts") with just row?ts (which always returns float and works only when you have a fixed name, like "ts", but it makes the code nicer)
Comparing float values might not be the best thing to do (because of floating point imprecisions, this might not always work as expected).

Using properties in F#

Hopefully this is coherent, albeit longwinded.
I am trying to create a property in one .fs file that I can set from a separate .fs file and then use that value in a module in the first .fs file... For instance,
In my first file function.fs, I would like to define a property theta.
I would then like to define a function Q in function.fs such that:
function Q = Q(r) and...
Q(r) is dependent on some calculations that are dependent on theta,
i.e
A1(theta), A2(theta), A3(theta)
Q returns a data set in the form of a list.
I would also like to maintain a set of theta values in my main .fs file program.fs (i.e.
theta = [90;120;150;180])
I would then like to generate a data sets from function.fs for each theta.
My thought was to do this by setting a value for a property theta, running the program to generate a data set, setting a new value for theta, running the program to generate a data set, repeat... I have done a fair amount of research, what is not clear to me is how I actually recall the value of the property in the code for Q(r).
I have successfully setup a property in my function.fs file that I can set from program.fs:
In function.fs I have:
namespace models.test
type ContactAngle() =
let mutable m_theta = 90.0
//read only property
member this.Empty =
m_theta = 90.0
//read-write property
//i think i'm onto something with this static...
member this.Angle
with get() =
m_theta
and set newAmt =
m_theta <- newAmt
//module HTModel =
And in program.fs I have:
open models.test
let me = new ContactAngle()
printfn "%A" me.Angle
me.Angle <- 120.0
printfn "%A" me.Angle
This allows me to redefine the value theta. Where I am struggling is how I now use the new property value in a function in function.fs.
I feel like I am missing something very elementary and need some help! Any insight would be greatly appreciated!
Since functions evaluate when they are called not when created (exactly like in, for example, C#) you can just create a normal function in your ContactAngle type like this:
member this.DoSomenthingWithTheta multiplier
m_theta <- m_theta * multiplier
You can reuse your mutable value anywhere in your class.
To clarify all that you should read 'members' section of F# language reference.
But if you want to use that value outside your type and in a place where you don't have your initiated instance. Well then You would have to take a different approach. For example create a static mutable field and expose it with static property. Or create a singleton to store your value throughout the application.
But this kind of kills the 'spirit' of functional programming :).

Resources