interrogate a Ruby array of hashes - ruby-on-rails

I need to group people by age in Ruby. I have their date of birth, and a method that returns their age in years, so a solution like this works:
case
when (0..15).cover?(age_years)
  'child'
when (16..24).cover?(age_years)
  '16 to 24'
when (25..34).cover?(age_years)
  '25 to 34'
when (35..44).cover?(age_years)
  '35 to 44'
when (45..54).cover?(age_years)
  '45 to 54'
when (55..64).cover?(age_years)
  '55 to 64'
when age_years > 64
  'really old'
else
  'unknown'
end
However, I am trying to learn Ruby and am looking for a more elegant solution. I thought about putting the age_ranges into an array of hashes like this...
age_ranges = [{ name: 'child', min_age: 0, max_age: 15 },
              { name: '16 to 24', min_age: 16, max_age: 24 }]
but I am at a loss as to how to interrogate this data to return the correct name when age_years falls within the appropriate range. I could even use a range, like this
age_ranges = [{ name: 'child', age_range: '0..15' },
              { name: '16 to 24', age_range: '16..24' }]
which looks neater, but I have no idea whether I have written gibberish, as I don't know how to extract the name when age_years matches.
Can someone point me in the right direction?

Now that you have a map of age names and ranges (note that I used a Range, not a String, as the value of age_range), you want to search the age_ranges array of hashes for the entry whose age_range value includes the age:
def age_ranges
  [
    { name: 'child', age_range: 0..15 },
    { name: '16 to 24', age_range: 16..24 }
  ]
end
def find_age(age)
  age_ranges.find { |hash| hash[:age_range].include?(age) }[:name]
end
find_age(12)
#=> "child"
find_age(17)
#=> "16 to 24"
Note that [:name] will fail if find returns nil (meaning no match was found).
To overcome this, either add an infinite range as the last entry in the array (I'd prefer this option, because it is simpler):
def age_ranges
  [
    { name: 'child', age_range: 0..15 },
    { name: '16 to 24', age_range: 16..24 },
    { name: 'unknown', age_range: 25..Float::INFINITY }
  ]
end
Or handle it while fetching the name in the find_age method:
def find_age(age)
  age_ranges.each_with_object('unknown') { |hash, _| break hash[:name] if hash[:age_range].include?(age) }
end
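Equivalently, you can make the nil-handling explicit with a plain guard, which some find easier to read than the each_with_object trick (a minimal sketch, assuming the age_ranges method above):
def find_age(age)
  match = age_ranges.find { |hash| hash[:age_range].include?(age) }
  match ? match[:name] : 'unknown'
end

find_age(99)
#=> "unknown"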
Also, make sure to handle negative numbers passed to the method (since the age ranges do not cover negatives):
def find_age(age)
  return 'Age cannot be less than 0' if age.negative?

  age_ranges.find { |hash| hash[:age_range].include?(age) }[:name]
end
P.S. After all these "note"s and "make sure"s, I want to say that #mudasobwa's answer is the simplest way to go about it :)

Use Range#=== (triple equals) directly, as it is meant to be used:
case age_years
when 0..15 then 'child'
when 16..24 then '16 to 24'
when 25..34 then '25 to 34'
when 35..44 then '35 to 44'
when 45..54 then '45 to 54'
when 55..64 then '55 to 64'
when 64..Float::INFINITY then 'really old' # or when 64.method(:<).to_proc
else 'unknown'
end
To make the case statement accept floats as well, one should use three-dot (end-exclusive) ranges:
case age_years
when 0...16 then 'child'
when 16...25 then '16 to 24'
when 25...35 then '25 to 34'
when 35...45 then '35 to 44'
when 45...55 then '45 to 54'
when 55...65 then '55 to 64' # ...65, not ...64, so that 64.x stays in this bucket
when 65..Float::INFINITY then 'really old'
else 'unknown'
end
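As a side note, on Ruby 2.6+ an endless range can replace Float::INFINITY for the open-ended bucket (a sketch, assuming a recent Ruby; abbreviated to the interesting branches):
case age_years
when 0...16  then 'child'
when 16...25 then '16 to 24'
# ... remaining buckets as above ...
when (65..)  then 'really old' # endless range, Ruby 2.6+
else 'unknown'
end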

Here's how I'd do it, to avoid code repetition between 16 and 64:
def age_range(age, offset=4, span=10, lowest_age=16)
  i = ((age - offset - 1) / span).to_i          # which span-wide bucket the age falls in
  min = [i * span + offset + 1, lowest_age].max # bucket lower bound, clamped to lowest_age
  max = (i + 1) * span + offset                 # bucket upper bound
  "#{min} to #{max}"
end
def age_description(age)
  case age
  when 0...16 then 'child'
  when 16..64 then age_range(age)
  when 64..999 then 'really old'
  else 'unknown'
  end
end
(0..99).each do |age|
  puts "%s (%s)" % [age_description(age), age]
end
It outputs:
child (0)
child (1)
child (2)
child (3)
child (4)
child (5)
child (6)
child (7)
child (8)
child (9)
child (10)
child (11)
child (12)
child (13)
child (14)
child (15)
16 to 24 (16)
16 to 24 (17)
16 to 24 (18)
16 to 24 (19)
16 to 24 (20)
16 to 24 (21)
16 to 24 (22)
16 to 24 (23)
16 to 24 (24)
25 to 34 (25)
25 to 34 (26)
25 to 34 (27)
25 to 34 (28)
25 to 34 (29)
25 to 34 (30)
25 to 34 (31)
25 to 34 (32)
25 to 34 (33)
25 to 34 (34)
35 to 44 (35)
...
As a bonus, it also works with Floats (e.g. 15.9 and 16.0).
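For example (assuming the two methods above are defined):
age_description(15.9) #=> "child"
age_description(16.0) #=> "16 to 24"
age_description(64.0) #=> "55 to 64"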

Related

ActiveRecord::Fixture::FixtureError: table has no columns named "false"

I am getting the error: ActiveRecord::Fixture::FixtureError: table "creatures" has no columns named "false". I have no column named false in this model.
What is going on?
Here is my fixture:
one:
  name: MyString
  no: 1
  type1: 1
  type2: 1
  total: 1
  hp: 1
  attack: 1
  defense: 1
  special_attack: 1
  special_defense: 1
  speed: 1
  generation: 1
  legendary: false
Putting the no in single quotes solved the problem.
If I put a debugger call just before the error is raised:
[475, 484] in /usr/local/bundle/gems/activerecord-7.0.4.2/lib/active_record/connection_adapters/abstract/database_statements.rb
475: fixture = fixture.stringify_keys
476:
477: unknown_columns = fixture.keys - columns.keys
478: if unknown_columns.any?
479: debugger
=> 480: raise Fixture::FixtureError, %(table "#{table_name}" has no columns named #{unknown_columns.map(&:inspect).join(', ')}.)
481: end
482:
483: columns.map do |name, column|
484: if fixture.key?(name)
(byebug) fixtures
It looks like the no key got interpreted as false (YAML 1.1 treats the unquoted scalars yes/no/on/off as booleans):
[{"name"=>"MyString", false=>1, "type1"=>1, "type2"=>1, "total"=>1, "hp"=>1, "attack"=>1, "defense"=>1, "special_attack"=>1, "special_defense"=>1, "speed"=>1, "genneration"=>1, "legendary"=>false, "created_at"=>2023-02-05 18:53:31.63881045 UTC, "updated_at"=>2023-02-05 18:53:31.63881045 UTC, "id"=>980190962}, {"name"=>"MyString", false=>2, "type1"=>1, "type2"=>1, "total"=>1, "hp"=>1, "attack"=>1, "defense"=>1, "special_attack"=>1, "special_defense"=>1, "speed"=>1, "genneration"=>1, "legendary"=>false, "created_at"=>2023-02-05 18:53:31.63881045 UTC, "updated_at"=>2023-02-05 18:53:31.63881045 UTC, "id"=>298486374}]
Quoting the no key fixes it:
one:
  name: MyString
  'no': 1
  type1: 1
  type2: 1
  total: 1
  hp: 1
  attack: 1
  defense: 1
  special_attack: 1
  special_defense: 1
  speed: 1
  generation: 1
  legendary: false

two:
  name: MyString
  'no': 2
  type1: 1
  type2: 1
  total: 1
  hp: 1
  attack: 1
  defense: 1
  special_attack: 1
  special_defense: 1
  speed: 1
  generation: 1
  legendary: false

xarray transpose: TypeError: unhashable type: 'list'

I am trying to rearrange the dimensions of the following dataset:
X

<xarray.Dataset>
Dimensions:       (lon: 720, lat: 360, sector: 8, time: 240)
Coordinates:
  * lon           (lon) float64 -179.8 -179.2 ... 179.2 179.8
  * lat           (lat) float64 -89.75 -89.25 ... 89.25 89.75
  * sector        (sector) int32 0 1 2 3 4 5 6 7
  * time          (time) object 2000-01-16 00:00:00 ... 2019-12-...
Data variables:
    CO_em_anthro  (time, sector, lat, lon) float32 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
Attributes: (33)
This used to work
X.transpose(['lat','lon','sector','time'])
but in version 0.20.1, I am getting the following error:
File /nbhome/f1p/miniconda3/envs/f1p_gfdl/lib/python3.9/site-packages/xarray/core/utils.py:879, in drop_missing_dims(supplied_dims, dims, missing_dims)
868 """Depending on the setting of missing_dims, drop any dimensions from supplied_dims that
869 are not present in dims.
870
(...)
875 missing_dims : {"raise", "warn", "ignore"}
876 """
878 if missing_dims == "raise":
--> 879 supplied_dims_set = {val for val in supplied_dims if val is not ...}
880 invalid = supplied_dims_set - set(dims)
881 if invalid:
File /nbhome/f1p/miniconda3/envs/f1p_gfdl/lib/python3.9/site-packages/xarray/core/utils.py:879, in <setcomp>(.0)
868 """Depending on the setting of missing_dims, drop any dimensions from supplied_dims that
869 are not present in dims.
870
(...)
875 missing_dims : {"raise", "warn", "ignore"}
876 """
878 if missing_dims == "raise":
--> 879 supplied_dims_set = {val for val in supplied_dims if val is not ...}
880 invalid = supplied_dims_set - set(dims)
881 if invalid:
TypeError: unhashable type: 'list'
Calling transpose without the dimension names does work.
I am not sure how to fix this issue. Thanks
Xarray’s transpose accepts the target dimensions as multiple arguments, not a list of dimensions.
See the *args in the transpose docs.
You need to change your code to:
X.transpose('lat','lon','sector','time')
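A self-contained way to see both behaviours (a sketch with toy data; the names just mirror the question):
import numpy as np
import xarray as xr

# Tiny dataset with the question's dimension names (values are made up)
ds = xr.Dataset(
    {"CO_em_anthro": (("time", "sector", "lat", "lon"), np.zeros((2, 3, 4, 5)))}
)

ds.transpose("lat", "lon", "sector", "time")      # OK: dims passed as *args
# ds.transpose(["lat", "lon", "sector", "time"])  # TypeError: unhashable type: 'list'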

Inner join return duplicated record

My table:

| Report_Period | Entity   | Tag | Users Count | Report_Period_M-1 | Report_Period_Q-1 | ... |
|---------------|----------|-----|-------------|-------------------|-------------------|-----|
| 2017-06-30    | entity 1 | X   | 471         | 2017-05-31        | 2017-03-31        | ... |
| 2020-12-31    | entity 2 | A   | 135         | 2020-11-30        | 2020-09-30        | ... |
| 2020-11-30    | entity 3 | X   | 402         | 2020-10-31        | 2020-08-31        | ... |
What I want:

| Report_Period | Entity   | Tag | Users Count | Users_Count_M-1 | Users_Count_Q-1 | ... |
|---------------|----------|-----|-------------|-----------------|-----------------|-----|
| 2017-06-30    | entity 1 | X   | 471         | 450             | 438             | ... |
| 2020-12-31    | entity 2 | A   | 135         | 122             | 118             | ... |
| 2020-11-30    | entity 3 | X   | 402         | 380             | 380             | ... |
I have tried this code, but it duplicates records! How can I avoid that?
SELECT M."Entity", M."Tag", M."Report_Period", M."Users Count",
       M."Report_Period_M-1", M1."Users Count" AS "Users Count M1"
FROM "DB"."SCHEMA"."PERIOD" M, "DB"."SCHEMA"."PERIOD" M1
WHERE M."Report_Period_M-1" = M1."Report_Period"
Your join clause should also include the entity and tag columns (I suspect):
SELECT M."Entity",
M."Tag",
M."Report_Period",
M."Users Count",
M."Report_Period_M-1",
M1."Users Count" AS "Users Count M1",
FROM "DB"."SCHEMA"."PERIOD" M,
"DB"."SCHEMA"."PERIOD" M1
WHERE M."Report_Period_M-1"= M1."Report_Period"
AND M."Entity" = M1."Entity"
AND M."Tag" = M1."Tag"

Dask: ValueError: Integer column has NA values

I tried to use dask and found something that appears to be a bug in dask.dataframe.read_csv.
import os
import dask.dataframe as dd

types = {'id': 'int16', 'Semana': 'uint8', 'Agencia_ID': 'uint16', 'Canal_ID': 'uint8',
         'Ruta_SAK': 'uint16', 'Cliente_ID': 'float32', 'Producto_ID': 'float32'}
name_map = {'Semana': 'week', 'Agencia_ID': 'agency', 'Canal_ID': 'channel',
            'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}

test = dd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
test = test.rename(columns=name_map)
gives:
ValueError: Integer column has NA values in column 1
However, the same pandas read_csv operation completes fine and does not yield any NA:
import os
import pandas as pd

types = {'id': 'int16', 'Semana': 'uint8', 'Agencia_ID': 'uint16', 'Canal_ID': 'uint8',
         'Ruta_SAK': 'uint16', 'Cliente_ID': 'float32', 'Producto_ID': 'float32'}
name_map = {'Semana': 'week', 'Agencia_ID': 'agency', 'Canal_ID': 'channel',
            'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}

test = pd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
test = test.rename(columns=name_map)
test.isnull().any()
id False
week False
agency False
channel False
route False
client False
prod False
dtype: bool
Should I consider this to be an established bug and raise a JIRA for it?
Full traceback:
ValueError Traceback (most recent call last)
in ()
4 'Ruta_SAK': 'route', 'Cliente_ID': 'client', 'Producto_ID': 'prod'}
5
----> 6 test = dd.read_csv(os.path.join(datadir, 'test.csv'), usecols=types.keys(), dtype=types)
7 test = test.rename(columns=name_map)
D:\PROGLANG\Anaconda2\lib\site-packages\dask\dataframe\csv.pyc in read_csv(filename, blocksize, chunkbytes, collection, lineterminator, compression, sample, enforce, storage_options, **kwargs)
195 else:
196 header = sample.split(b_lineterminator)[0] + b_lineterminator
--> 197 head = pd.read_csv(BytesIO(sample), **kwargs)
198
199 df = read_csv_from_bytes(values, header, head, kwargs,
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
560 skip_blank_lines=skip_blank_lines)
561
--> 562 return _read(filepath_or_buffer, kwds)
563
564 parser_f.name = name
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in _read(filepath_or_buffer, kwds)
323 return parser
324
--> 325 return parser.read()
326
327 _parser_defaults = {
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in read(self, nrows)
813 raise ValueError('skip_footer not supported for iteration')
814
--> 815 ret = self._engine.read(nrows)
816
817 if self.options.get('as_recarray'):
D:\PROGLANG\Anaconda2\lib\site-packages\pandas\io\parsers.pyc in read(self, nrows)
1312 def read(self, nrows=None):
1313 try:
-> 1314 data = self._reader.read(nrows)
1315 except StopIteration:
1316 if self._first_chunk:
pandas\parser.pyx in pandas.parser.TextReader.read (pandas\parser.c:8748)()
pandas\parser.pyx in pandas.parser.TextReader._read_low_memory (pandas\parser.c:9003)()
pandas\parser.pyx in pandas.parser.TextReader._read_rows (pandas\parser.c:10022)()
pandas\parser.pyx in pandas.parser.TextReader._convert_column_data (pandas\parser.c:11397)()
pandas\parser.pyx in pandas.parser.TextReader._convert_tokens (pandas\parser.c:12093)()
pandas\parser.pyx in pandas.parser.TextReader._convert_with_dtype (pandas\parser.c:13057)()
ValueError: Integer column has NA values in column 1

decodingTCAP message - dialoguePortion

I'm writing a simulator (for learning purposes) for the complete M3UA-SCCP-TCAP-MAP stack (over SCTP). So far the M3UA and SCCP stacks are OK.
M3UA based on RFC 4666 (September 2006)
SCCP based on ITU-T Q.711-Q.716
TCAP based on ITU-T Q.771-Q.775
But while decoding the TCAP part, I got lost on the dialoguePortion.
TCAP is ASN.1 BER-encoded, so everything is tag + length + data.
Wireshark decodes it differently than my decoder does.
Message is:
62434804102f00676b1e281c060700118605010101a011600f80020780a1090607040000010005036c1ba1190201010201163011800590896734f283010086059062859107
Basically, my message is BER-decoded as
Note: the format is hex(tag) + (the BER identifier split into CLS+PC+TAG in decimal) + hex(data)
62 ( 64 32 2 )
  48 ( 64 0 8 ) 102f0067
  6b ( 64 32 11 )
    28 ( 0 32 8 )
      06 ( 0 0 6 ) 00118605010101 OID=0.0.17.773.1.1.1
      a0 ( 128 32 0 )
        60 ( 64 32 0 )
          80 ( 128 0 0 ) 0780
          a1 ( 128 32 1 )
            06 ( 0 0 6 ) 04000001000503 OID=0.4.0.0.1.0.5.3
  6c ( 64 32 12 )
  ...
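For reference, the CLS/PC/TAG triple in the dump above is just the BER identifier octet split into its three bit fields; a minimal sketch in Python (mine, not part of the original question):
def split_identifier(octet):
    """Split a BER identifier octet into class, primitive/constructed and tag number."""
    cls = octet & 0xC0  # bits 8-7: 0=universal, 64=application, 128=context, 192=private
    pc = octet & 0x20   # bit 6: 0=primitive, 32=constructed
    tag = octet & 0x1F  # bits 5-1: tag number (low-tag-number form)
    return cls, pc, tag

print(split_identifier(0x62))  # (64, 32, 2) -> begin [APPLICATION 2], constructed
print(split_identifier(0x48))  # (64, 0, 8)  -> otid [APPLICATION 8], primitive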
So I can see a begin[2] message containing otid[8], dialoguePortion[11] and componentPortion[12].
otid and componentPortion are decoded correctly, but not dialoguePortion.
The ASN.1 for dialoguePortion does not mention any of these codes.
Even more confusing, Wireshark decodes it differently (the id-as-dialogue OID is NOT inside the dialoguePortion, but shows up as a field after otid, which is NOT as described in the ITU-T documentation - or not as I'm understanding it).
Wireshark decoded:

Transaction Capabilities Application Part
  begin
    Source Transaction ID
      otid: 102f0067
    oid: 0.0.17.773.1.1.1 (id-as-dialogue)
    dialogueRequest
      Padding: 7
      protocol-version: 80 (version1)
        1... .... = version1: True
      application-context-name: 0.4.0.0.1.0.5.3 (locationInfoRetrievalContext-v3)
    components: 1 item
    ...
I can't find any reference to Padding in the dialoguePDU ASN.1.
Can someone point me in the right direction? I would like to know how to properly decode this message.
DialoguePDU format should be simple in this case:
dialogue-as-id OBJECT IDENTIFIER ::= {itu-t recommendation q 773 as(1) dialogue-as(1) version1(1)}
DialoguePDU ::= CHOICE {
  dialogueRequest  AARQ-apdu,
  dialogueResponse AARE-apdu,
  dialogueAbort    ABRT-apdu
}

AARQ-apdu ::= [APPLICATION 0] IMPLICIT SEQUENCE {
  protocol-version         [0] IMPLICIT BIT STRING { version1(0) } DEFAULT { version1 },
  application-context-name [1] OBJECT IDENTIFIER,
  user-information         [30] IMPLICIT SEQUENCE OF EXTERNAL OPTIONAL
}
Wireshark is still wrong :-). But then, that is just display: it shows the values correctly, only in the wrong section, probably to make decoding easier.
What I was missing was the definition of EXTERNAL [8]. DialoguePortion is declared as an EXTERNAL... so now everything makes sense.
For your message, my very own decoder says:
begin [APPLICATION 2] (x67)
  otid [APPLICATION 8] (x4) =102f0067h
  dialoguePortion [APPLICATION 11] (x30)
    EXTERNAL (x28)
      direct-reference [OBJECT IDENTIFIER] (x7) =00118605010101h
      encoding:single-ASN1-type [0] (x17)
        dialogueRequest [APPLICATION 0] (x15)
          protocol-version [0] (x2) = 80 {version1 (0) } spare bits= 7
          application-context-name [1] (x9)
            OBJECT IDENTIFIER (x7) =04000001000503h
  components [APPLICATION 12] (x27)
    invoke [1] (x25)
      invokeID [INTEGER] (x1) =1d (01h)
      operationCode [INTEGER] (x1) = (22) SendRoutingInfo
      parameter [SEQUENCE] (x17)
        msisdn [0] (x5) = 90896734f2h
          Nature of Address: international number (1)
          Numbering Plan Indicator: unknown (0)
          signal= 9876432
        interrogationType [3] (x1) = (0) basicCall
        gmsc-Address [6] (x5) = 9062859107h
          Nature of Address: international number (1)
          Numbering Plan Indicator: unknown (0)
          signal= 26581970
Now, Wireshark's Padding: 7 and my spare bits= 7 both refer to the protocol-version field, defined in Q.773 as:
AARQ-apdu ::= [APPLICATION 0] IMPLICIT SEQUENCE {
  protocol-version         [0] IMPLICIT BIT STRING { version1 (0) }
                               DEFAULT { version1 },
  application-context-name [1] OBJECT IDENTIFIER,
  user-information         [30] IMPLICIT SEQUENCE OF EXTERNAL OPTIONAL
}
The BIT STRING definition assigns a name to just the leading bit (version1); the remaining 7 bits are not given a name, and Wireshark considers them padding.
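Concretely, the protocol-version content octets 07 80 decode like any BER BIT STRING: the first octet gives the number of unused (spare/padding) bits, and the remaining octets carry the bits, most significant first (a small sketch):
content = bytes.fromhex("0780")     # BIT STRING contents from the message
unused_bits = content[0]            # 7 -> Wireshark's "Padding: 7", my "spare bits= 7"
version1 = bool(content[1] & 0x80)  # leading bit, named version1(0) in the ASN.1
print(unused_bits, version1)        # 7 True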
