Kraken API AssetPairs - trading

I am consuming the Kraken API, and I am not able to find a good explanation for the fields in the response.
For a given pair, I have the following info:
altname = alternate pair name
aclass_base = asset class of base component
base = asset id of base component
aclass_quote = asset class of quote component
quote = asset id of quote component
lot = volume lot size
pair_decimals = scaling decimal places for pair
lot_decimals = scaling decimal places for volume
lot_multiplier = amount to multiply lot volume by to get currency volume
leverage_buy = array of leverage amounts available when buying
leverage_sell = array of leverage amounts available when selling
fees = fee schedule array in [volume, percent fee] tuples
fees_maker = maker fee schedule array in [volume, percent fee] tuples (if on maker/taker)
fee_volume_currency = volume discount currency
margin_call = margin call level
margin_stop = stop-out/liquidation margin level
I have some questions about this payload. Thanks to everyone who will help.
lot_decimals and pair_decimals: how do these apply? I am guessing that the former means the quantity of the first currency can be represented with at most lot_decimals decimal places, and the latter means the pair's price can be represented with at most pair_decimals decimal places. Is this reasonable?
lot_multiplier: the explanation is clear, but it's always 1. Any reason for it being there?
lot: is this the amount of the first currency that you want to price in the second currency? Example: in 100 EURBTC, is 100 the lot size?
Thanks

I copied this over from inside the question by @Bruno Ripa:
I had feedback from Kraken. It is:
lot_decimals is the maximal precision of the order size (volume), which is in the base currency; pair_decimals is the price precision.
lot_multiplier is unused at the moment.
lot: if the pair is BTC/EUR, a volume of 100 is denominated in BTC, which is the base currency.
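To make the two precision fields concrete, here is a minimal sketch in Python (the metadata values below are hypothetical, picked only for illustration):
from decimal import Decimal, ROUND_DOWN

def quantize(value, decimals):
    # Truncate to the allowed number of decimal places.
    return Decimal(str(value)).quantize(Decimal(1).scaleb(-decimals), rounding=ROUND_DOWN)

# Hypothetical metadata for a BTC/EUR-like pair:
pair_decimals = 1  # price precision
lot_decimals = 8   # volume (order size) precision, in the base currency (BTC)

price = quantize(20123.456, pair_decimals)    # Decimal('20123.4')
volume = quantize(0.123456789, lot_decimals)  # Decimal('0.12345678')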


GEE - Random Forest: how to calculate % variable importance?

I'm trying to calculate the importance (in percentage) of each variable in my model (using smileRandomForest) in GEE.
var RFmodel = ee.Classifier.smileRandomForest(1000).train(trainingData, 'classID', predictionBands);
var var_imp = ee.Feature(null, ee.Dictionary(RFmodel.explain()).get('importance'));
In the example above, "var_imp" is a feature that has "importance" as a property. To calculate importance as %, I'm assuming I'll need to do something like:
Importance (%) = (variable importance value)/(total sum of all importance variables) * 100
Can someone help me to write a function for this? I'm relatively new to GEE and have no idea where to start. I've tried using aggregate_sum() at least to sum all variables, but "var_imp" isn't a FeatureCollection so it doesn't work.
You can work directly with the dictionary. Extract a list of the values and reduce it with a sum reducer to get the total importance. Then you can map over the importance dictionary and calculate the percentage of each band.
For future questions, please include a link to the code editor (use Get Link) and make sure all used assets are shared. It makes it easier to help you, increasing your chance of getting answers to your questions.
var importance = ee.Dictionary(
  // 'classifier' is the trained classifier (RFmodel in the question above)
  classifier.explain().get('importance')
)
var totalImportance = importance.values().reduce(ee.Reducer.sum())
var importancePercentage = importance.map(function (band, importance) {
  // each band's share of the total importance, in percent
  return ee.Number(importance).divide(totalImportance).multiply(100)
})
https://code.earthengine.google.com/bd63aa319a37516d924a6d8c391ab076

Generate custom length hash values of a String in Swift

Is it possible to somehow "hash" a given String with length n to a hash value of an arbitrary length m?
I want to achieve something like follows:
let s1 = "<UNIQUE_USER_IDENTIFIER_1>"
let s2 = "<UNIQUE_USER_IDENTIFIER_2>"
let x1 = s1.hashValue(length: 4)
let x2 = s2.hashValue(length: 4)
I want to assign each given user a (e.g. four-digit) number that is based on their unique UID. Is that possible?
First, I want to be clear that you mean "hash" and don't mean "(lossless) compress." You should expect some collisions where x1 and x2 are the same value for different s1 and s2. If you really mean a mapping so that there are no collisions, then we have to know a lot more about the problem. It is impossible to achieve that in the general case (see the Pigeonhole principle). But it can be achieved in some special cases where there is sufficient redundancy in the input. Or it can be done by maintaining a table (i.e. a database or the like). The rest of this answer is about hashing.
If your UID is a UUID created on iOS (or any v4 UUID), then its bits are already quite high quality, and the last four digits should be fine without doing any hashing at all. There are a couple of bytes in the middle that you should avoid, but the whole end section is random and so an ideal hash.
If your UUID is not random, you can try using the default hashes and pulling the required number of bits out of them, but non-cryptographic hashes don't always have good independence between their bits, so this may collide more than you like.
In that case, use a cryptographic hash larger than the size you need and truncate it (or take the least-significant bits; either set is fine). This is commonly done in cryptography. For example, SHA-512/256 is a commonly used hash that computes a 512-bit hash and extracts 256 bits from it. Cryptographic hashes require high independence of all their bits, so any subset of bits will also be collision resistant.
BTW, if you mean "4 decimal digits," then you should expect a collision about 1 time out of 100. If you mean 16 bits (4 hex digits), you should expect a collision about 1 time in 300. These are your best-case scenarios and mean your hash is working well. See Birthday Attack for a table of expectations and some helpful approximations.
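To illustrate the truncate-a-cryptographic-hash idea, here is a minimal sketch (in Python for brevity; in Swift, CryptoKit's SHA256 would play the same role) that maps a string to a 4-decimal-digit value that is stable across runs:
import hashlib

def short_hash(s, digits=4):
    # SHA-256 the UTF-8 bytes, then keep only as much as we need.
    digest = hashlib.sha256(s.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to
    # `digits` decimal digits; any subset of a crypto hash is fine.
    return int.from_bytes(digest[:8], "big") % (10 ** digits)

short_hash("<UNIQUE_USER_IDENTIFIER_1>")  # e.g. a stable 4-digit value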
Based on only the information you provided:
extension String {
    func hashValue(length: Int) -> Int? {
        // Keep the leading `length` digits of the hash's absolute value.
        return Int(String(self.hashValue.magnitude).prefix(length))
    }
}
Usage:
"foo".hashValue(length: 4) // e.g. 5192
This will give you a positive integer result based on the string input, consistent within a single run (Swift seeds its string hashing randomly at each launch, so values differ between runs). Obviously it is not very useful for UUID purposes, but useful for other use cases nonetheless.

Create a JSON string with number of significant figures / decimal places based on key (iOS, Obj-C)

I need to upload JSON data from an app (iOS) to the backend server.
The goal is to optimise the size of the upload packet, which is JSON encoded as an NSString. The string is currently about 5MB but consists mostly of doubles that have more precision than necessary.
The size of the packet can be reduced by around 40-50% by removing unnecessary decimal places from the doubles. This has to be customisable based on the key.
What is the best way to create a JSON string with different numbers of significant figures or decimal places depending on the key?
You may need to do some experiments. Let's say you want to send data with two decimal digits, like 3.14 instead of pi. You know you have to turn all numbers into NSNumber. You would turn asDouble into a number with two decimals by writing
double asDouble = 3.141592653;
NSNumber* asNumber = @(round(asDouble * 100.0) / 100.0);
However, you need to check that this always works; with some bad luck this could send 3.140000000000000000000001 to your server.
Obviously you can replace the 100.0 with 1000.0 etc. Do not replace the division with a multiplication by 0.01 because that will increase rounding errors and the chance that you get tons of decimal digits.
You might check what happens if you write
NSNumber* asNumber = @((float) asDouble);
If NSJSONSerialization is clever enough, it will send fewer decimals.
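To make the per-key part concrete, here is a minimal sketch (in Python for brevity; an Obj-C version would walk the dictionary the same way, and the keys and precisions below are hypothetical):
import json

# Hypothetical map: decimal places to keep for each key.
PRECISION = {"lat": 5, "lon": 5, "speed": 1}

def round_by_key(obj):
    # Recursively round floats according to the key they appear under.
    if isinstance(obj, dict):
        return {k: round(v, PRECISION[k])
                if isinstance(v, float) and k in PRECISION
                else round_by_key(v)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [round_by_key(v) for v in obj]
    return obj

payload = {"lat": 48.137154278, "lon": 11.576124689, "speed": 4.26}
json.dumps(round_by_key(payload))
# '{"lat": 48.13715, "lon": 11.57612, "speed": 4.3}'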

Getting length of vector in SPSS

I have a .sav file with plenty of variables. What I would like to do now is create macros/routines that detect basic properties of a range of item sets, using SPSS syntax.
COMPUTE scale_vars_01 = v_28 TO v_240.
The code above is intended to define a range of items which I would like to observe in further detail. How can I get the number of elements in the "array" scale_vars_01, as an integer?
Thanks for any info. (As you see, SPSS syntax is still kind of strange to me, and I am thinking about using Python instead, but that might be too much overhead for my relatively simple purposes.)
One way is to use COUNT, such as:
COUNT Total = v_28 TO v_240 (LO THRU HI).
This will count all of the valid values in the vector. This will not work if the vector contains mixed types (e.g. string and numeric) or if the vector has missing values. An inefficient way to get the entire count using DO REPEAT is below:
DO IF $casenum = 1.
COMPUTE Total = 0.
DO REPEAT V = v_28 TO v_240.
COMPUTE Total = Total + 1.
END REPEAT.
ELSE.
COMPUTE Total = LAG(Total).
END IF.
This will work for mixed-type variables and will count fields with missing values. (The DO IF would work the same with COUNT; it forces a data pass either way, but for large datasets and long variable lists it only evaluates the expression for the first case.)
Python is probably the most efficient way to do this though - and I see no reason not to use it if you are familiar with it.
BEGIN PROGRAM.
import spss
beg = 'X1'
end = 'X10'
MyVars = []
for i in range(spss.GetVariableCount()):
    x = spss.GetVariableName(i)
    MyVars.append(x)
length = MyVars.index(end) - MyVars.index(beg) + 1
print(length)
END PROGRAM.
Statistics has a built-in macro facility that could be used to define sets of variables, but the Python APIs provide much more powerful ways to access and use the metadata. And there is an extension command, SPSSINC SELECT VARIABLES, that can define macros based on variable metadata such as patterns in names, measurement level, type, and other properties. It generates a macro listing these variables that can then be used in standard syntax.

Lookup table size reduction

I have an application in which I have to store a couple of million integers in a lookup table. Obviously I cannot store that amount of data in memory, and my requirements are very tight: the data has to live on an embedded system, so I am very limited in space. I would like to ask about recommended methods for reducing the size of the lookup table. I cannot use function approximation such as neural networks; the values need to be in a table. The range of the integers is not known at the moment. When I say integers I mean 32-bit values.
Basically the idea is to use some compression method to reduce the amount of memory used, but without losing much precision. This needs to run in hardware, so the computation overhead cannot be very high.
In my algorithm I have to access one value of the table, do some operations with it, and afterwards update the value. In the end what I need is a function that takes an index and returns a value, plus another function to write a value into the table.
I found one method called tile coding, which is based on several lookup tables. Does anyone know any other methods?
Thanks.
I'd look at the types of numbers you need to store and pull out the information that's common for many of them. For example, if they're tightly clustered, you can take the mean, store it, and store the offsets. The offsets will have fewer bits than the original numbers. Or, if they're more or less uniformly distributed, you can store the first number and then store the offset to the next number.
It would help to know what your key is to look up the numbers.
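A minimal sketch of the mean-plus-offsets idea in Python (assuming the values are clustered tightly enough that every offset fits in a signed 16-bit int):
import array

values = [100000123, 100000087, 100000456, 99999901]

base = sum(values) // len(values)  # stored once, at full width
offsets = array.array('h', (v - base for v in values))  # 16-bit signed offsets

def lookup(i):
    return base + offsets[i]

def update(i, new_value):
    offsets[i] = new_value - base  # raises OverflowError if it no longer fits

lookup(2)  # 100000456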
I need more detail on the problem. If you cannot store the real value of the integers but instead an approximation, that means you are going to reduce (throw away) some of the data (detail), correct? I think you are looking for a hash, which can be an art form in itself. For example, say you have 32-bit values: one hash would be to take the 4 bytes and XOR them together, producing a single 8-bit value, reducing your storage by a factor of 4 but also losing the real value of the original data. Typically you could/would go further and only use a few of those 8 bits, say the lower 4, and reduce the value further.
I think the real question is: either you need the data or you don't. If you need the data, you need to compress it or find more memory to store it. If you don't, then use a hash of some sort to reduce the number of bits until you reach the amount of memory you have for storage.
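A minimal sketch of that XOR-fold (lossy by design, so distinct inputs will collide):
def fold32_to_8(n):
    # XOR the four bytes of a 32-bit value together -> 8 bits.
    return (n ^ (n >> 8) ^ (n >> 16) ^ (n >> 24)) & 0xFF

def fold32_to_4(n):
    # Go further: keep only the lower 4 bits of the folded byte.
    return fold32_to_8(n) & 0x0F

fold32_to_8(0x12345678)  # 0x08 == 0x12 ^ 0x34 ^ 0x56 ^ 0x78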
Read http://www.cs.ualberta.ca/~sutton/RL-FAQ.html
"Function approximation" refers to the
use of a parameterized functional form
to represent the value function
(and/or the policy), as opposed to a
simple table."
Perhaps that applies. Also, update your question with additional facts -- don't merely answer in the comments.
Edit.
A bit array can easily store a bit for each of your millions of numbers. Let's say you have numbers in the range of 1 to 8 million. In a single megabyte of storage you can have a 1 bit for each number in your set and a 0 for each number not in your set.
If you have numbers in the range of 1 to 32 million, you'll require 4 MB of memory for a bit table covering all 32M distinct numbers.
See my answer to Modern, high performance bloom filter in Python? for a Python implementation of a bit array of unlimited size.
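A minimal sketch of such a bit array in Python, with a bytearray standing in for the single megabyte of storage (range 1 to 8 million, as in the example above):
LIMIT = 8_000_000  # numbers 1..8 million -> roughly 1 MB of bits

bits = bytearray(LIMIT // 8 + 1)

def add(n):
    bits[n >> 3] |= 1 << (n & 7)

def contains(n):
    return bool(bits[n >> 3] & (1 << (n & 7)))

add(4242)
contains(4242)  # True
contains(4243)  # False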
If you are merely testing for the presence of a number, a Bloom filter might be what you are looking for. Honestly, though, your question is fairly vague and confusing. It would help to explain what Q values are and what you do with them once you find them in the table.
If your set of integers is homogeneous, then you could try a hash table, because there is a trick you can use to cut the size of the stored integers, in your case, in half.
Assume the integer n can serve as its own hash, because the set is homogeneous. Assume you have 0x10000 (65,536) buckets. Each bucket index is iBucket = n & 0xFFFF. Each item in a bucket then only needs to store 16 bits, since the low 16 bits are implied by the bucket index. The other thing you have to do to keep the data small is to store a count of the items in each bucket and use an array to hold them; a linked list would be too large and slow. When you iterate the array looking for a match, remember you only need to compare the 16 bits that are stored.
So assume a bucket is a pointer to the array plus a count. On a 32-bit system, this is 64 bits max. If the number of ints was small enough we might be able to do some fancy things and use 32 bits per bucket. 65,536 buckets * 8 bytes = 512K, and 2 million 16-bit shorts = 4 MB. So this gets you a method to look up the ints with about 40% compression.
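A minimal sketch of this bucket scheme in Python (sizes as described above; a real embedded version would use fixed arrays of 16-bit shorts rather than Python lists):
NUM_BUCKETS = 0x10000  # 65,536 buckets, indexed by the low 16 bits

buckets = [[] for _ in range(NUM_BUCKETS)]  # each holds 16-bit high halves

def insert(n):
    buckets[n & 0xFFFF].append((n >> 16) & 0xFFFF)

def contains(n):
    return ((n >> 16) & 0xFFFF) in buckets[n & 0xFFFF]

insert(0x12345678)
contains(0x12345678)  # True
contains(0xAAAA5678)  # False: same bucket, different high 16 bits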
