Sorting a list in Erlang - erlang

How does one sort a list in Erlang depending on a tag for each element?
I have a process that loops, receiving a single value at a time from multiple processes, and I want to keep the list arranged according to the tag (which is an index) as each element is received.
How does one do this without using BIFs?
I currently do the following but would like to arrange the list when an element is added, like an insertion sort using the tags.
fibLoop(CalcData) ->
    receive
        {Number, Tag} ->
            fibLoop([{Number, Tag} | CalcData])
    end.

Something like this would probably work:
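%% Inserts Data in front of the first element whose tag is not larger,
%% so the list stays in descending tag order (use =< for ascending).
insert({_, Tag} = Data, [{_, HTag} | _] = List) when Tag >= HTag ->
    [Data | List];
insert(Data, [H | T]) ->
    [H | insert(Data, T)];
insert(Data, []) ->
    [Data].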
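A minimal sketch of how an insert function like this could be wired into the receiving loop. It is an illustration under assumptions, not the asker's actual code: the module name tag_insert, the from_list/1 helper and the {get, From} message are hypothetical, and the guard uses =< so that the list is kept in ascending tag order.

```erlang
-module(tag_insert).
-export([insert/2, from_list/1, loop/1]).

%% Insert {Number, Tag} into a list that is already sorted by ascending Tag.
insert({_, Tag} = Data, [{_, HTag} | _] = List) when Tag =< HTag ->
    [Data | List];
insert(Data, [H | T]) ->
    [H | insert(Data, T)];
insert(Data, []) ->
    [Data].

%% Insertion sort: insert the elements one by one, in arrival order.
from_list(Items) -> from_list(Items, []).

from_list([], Acc) -> Acc;
from_list([H | T], Acc) -> from_list(T, insert(H, Acc)).

%% Receiving loop that keeps its state sorted at all times.
%% {get, From} is a hypothetical message for reading the state; it must be
%% matched first because {Number, Tag} would also match that tuple.
loop(Sorted) ->
    receive
        {get, From} ->
            From ! {sorted, Sorted},
            loop(Sorted);
        {Number, Tag} ->
            loop(insert({Number, Tag}, Sorted))
    end.
```

Each insert walks the list until it finds the right position, so adding one element costs time proportional to the list length.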

There are multiple ways to do what you want, depending somewhat on what you want to use the values for later.
An easy solution would be to use gb_trees. A gb_tree is a sorted structure that you can loop over using an iterator.
Or, if you want to keep it simple and have a list, you could use orddict (or possibly ordsets).
orddict:store(Tag, Number, CalcData)
inserts the pair {Tag, Number} into a list ordered by key. The tag has to be the key, because an orddict is ordered by its keys; note that storing an existing tag replaces its value. See the documentation for orddict for more information.
To get the smallest value in the list you can use hd/1, and to get the biggest, lists:last/1 (not that I recommend lists:last, mind you, since it walks the whole list).
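A small sketch of both suggestions, assuming the tag is used as the key so that the structures are ordered by tag. The module name tag_sorted and the sample data are made up for illustration.

```erlang
-module(tag_sorted).
-export([demo/0]).

demo() ->
    %% {Number, Tag} pairs in the order they happened to arrive.
    Results = [{2, 3}, {1, 1}, {1, 2}],
    %% orddict: a plain list of {Key, Value} pairs kept sorted by key.
    Dict = lists:foldl(
             fun({Number, Tag}, Acc) -> orddict:store(Tag, Number, Acc) end,
             orddict:new(), Results),
    %% gb_trees: a balanced tree, also ordered by key.
    Tree = lists:foldl(
             fun({Number, Tag}, Acc) -> gb_trees:enter(Tag, Number, Acc) end,
             gb_trees:empty(), Results),
    {orddict:to_list(Dict), gb_trees:smallest(Tree), gb_trees:largest(Tree)}.
```

With gb_trees, smallest/1 and largest/1 return the {Key, Value} pair at either end without walking a whole list.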

Related

Java 8- forEach method iterator behaviour

I recently started checking out the new Java 8 features.
I've come across the forEach method, which iterates over a Collection.
Say I have an ArrayList<Integer> with the values {1,2,3,4,5}:
list.forEach(i -> System.out.println(i));
This statement iterates over the list and prints the values in it.
I'd like to know how I can specify that I want it to iterate over some specific values only.
For example, I want it to start from the 2nd value and iterate up to the 2nd-to-last value, or to visit only alternate elements.
How am I going to do that?
To iterate on a section of the original list, use the subList method:
list.subList(1, list.size()-1)
.stream() // This line is optional since List already has a forEach method taking a Consumer as parameter
.forEach(...);
This is the concept of streams. After one operation, the results of that operation become the input for the next.
So for your specific example, you can follow @Joni's suggestion. But if you're asking in general, then you can create a filter to only get the values you want to loop over.
For example, if you only wanted to print the even numbers, you could create a filter on the streams before you forEached them. Like this:
List<Integer> intList = Arrays.asList(1,2,3,4,5);
intList.stream()
.filter(e -> (e & 1) == 0)
.forEach(System.out::println);
You can similarly pick out the stuff you want to loop over before reaching your terminal operation (in your case the forEach) on the stream. I suggest you read this stream tutorial to get a better idea of how they work: http://winterbe.com/posts/2014/07/31/java8-stream-tutorial-examples/

Reorder elements in Erlang

I want to restore the order of a list of tuples based on specific key strings.
For example, I have a list of tuples like this:
[{"a",["r001"]},
{"bi",["bidder"]},
{"bo",["an"]}]
But sometimes the order of the tuples can change, for example:
[{"bi",["bidder"]},
{"a",["r001"]},
{"bo",["an"]}]
or
[{"bo",["an"]},
{"a",["r001"]},
{"bi",["bidder"]}]
The first string/list in each tuple is my unique key ("bo", "a", "bi").
But I want to be able to reorder the list of tuples so that it always looks like this:
[{"a",["r001"]},
{"bi",["bidder"]},
{"bo",["an"]}]
How can I achieve this?
This will do it:
lists:sort(fun({A,_},{B,_}) -> A =< B end, List).
Or this, which will sort by the tuple's second element after the first:
lists:sort(List).
I offer the second version, because without the custom sort function, it is faster for data like this.
If you need to sort by a specific element, use lists:keysort/2 with that element's position:
lists:keysort(1, List).
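As a quick check, here is a small sketch that runs the three variants on one of the shuffled lists from the question. The module name reorder is made up for illustration.

```erlang
-module(reorder).
-export([demo/0]).

demo() ->
    Shuffled = [{"bo", ["an"]}, {"a", ["r001"]}, {"bi", ["bidder"]}],
    %% Custom ordering fun comparing only the keys.
    ByFun = lists:sort(fun({A, _}, {B, _}) -> A =< B end, Shuffled),
    %% Default term order: first element, then second element.
    ByTerm = lists:sort(Shuffled),
    %% Sort on the first tuple element only.
    ByKey = lists:keysort(1, Shuffled),
    {ByFun, ByTerm, ByKey}.
```

Because the keys are unique here, all three calls give the same result. lists:keysort/2 is stable, so tuples with equal keys would keep their original relative order.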

Determine if record with field of certain value exists in list

I need to determine if a record with a given value exists in a list, what is the most efficient way to do this?
I think like this:
[ L || L = #record{state=determined} <- List ].
And the most efficient way is:
lists:any(fun(#record{state=determined}) -> true; (_) -> false end, List).
The first approach is applicable if your list contains only a few records with the determined state and you want to get all of them.
The second approach is the most efficient because lists:any/2 comes from the standard library and stops iterating over the list as soon as it finds a matching record.
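A self-contained sketch of the existence check. The record name job and its fields are hypothetical, chosen only so that the example compiles on its own. Since records are tuples, lists:keymember/3 with the field's position (#job.state) is another option, assuming every element of the list is a job record.

```erlang
-module(record_exists).
-export([has_state/2, has_state_keymember/2]).

%% Hypothetical record used only for this illustration.
-record(job, {id, state}).

%% Stops at the first record whose state field equals State.
has_state(State, List) ->
    lists:any(fun(#job{state = S}) -> S =:= State;
                 (_) -> false
              end, List).

%% Same check using the tuple position of the state field.
has_state_keymember(State, List) ->
    lists:keymember(State, #job.state, List).
```

Both functions return true or false and do not build an intermediate list, unlike the list comprehension.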

Getting lots of data from Mnesia - fastest way

I have a record:
-record(bigdata, {mykey,some1,some2}).
Is doing a
mnesia:match_object({bigdata, mykey, some1,'_'})
the fastest way to fetch more than 5000 rows?
Clarification:
Creating "custom" keys is an option (so I can do a read), but is doing 5000 reads faster than one match_object on a single key?
I'm curious as to the problem you are solving, how many rows are in the table, etc. Without that information this might not be a relevant answer, but...
If you have a bag, then it might be better to use read/2 on the key and then traverse the list of records returned. It would be best, if possible, to structure your data to avoid selects and matches.
In general select/2 is preferred to match_object as it tends to better avoid full table scans. Also, dirty_select is going to be faster than select/2, assuming you do not need transactional support. And, if you can live with the constraints, Mnesia allows you to go against the underlying ets table directly, which is very fast, but look at the documentation, as it is appropriate only in very rare situations.
Mnesia is more of a key-value storage system, and it will traverse all of its records to find a match.
To fetch quickly, design the storage structure to support the query directly: make some1 the key or an index, then fetch the rows with read or index_read.
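A minimal sketch of the index approach, assuming a RAM-only table on a single node and no need for transactions. The module name bigdata_index and the setup/0 helper are made up for illustration; the record is the one from the question.

```erlang
-module(bigdata_index).
-export([setup/0, by_some1/1]).

-record(bigdata, {mykey, some1, some2}).

%% Start Mnesia and create the table with a secondary index on some1.
%% Without further options the table is kept in RAM on the local node.
setup() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(
                     bigdata,
                     [{attributes, record_info(fields, bigdata)},
                      {index, [some1]}]),
    ok.

%% Fetch all rows with the given some1 value through the index,
%% outside of a transaction (a "dirty" read).
by_some1(Value) ->
    mnesia:dirty_index_read(bigdata, Value, #bigdata.some1).
```

Inside a transaction, mnesia:index_read/3 takes the same arguments. Whether this beats match_object for a given table is something to measure on real data.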
What counts as the fastest way to return more than 5000 rows depends on the problem in question. What is the database structure? What do we want? What is the record structure? After that, it boils down to how you write your read functions. If we are sure about the primary key, we use mnesia:read/1 or mnesia:read/2; if not, it's better and more beautiful to use query list comprehensions (QLC). They are more flexible for searching nested records and for complex conditional queries. See the usage below:
-include_lib("stdlib/include/qlc.hrl").
-record(bigdata, {mykey,some1,some2}).

%% query list comprehensions
select(Q) ->
    %% to prevent nested transactions, and
    %% to ensure it works whether or not the table
    %% is fragmented, we use mnesia:activity/4
    case mnesia:is_transaction() of
        false ->
            F = fun(QH) -> qlc:e(QH) end,
            mnesia:activity(transaction, F, [Q], mnesia_frag);
        true -> qlc:e(Q)
    end.

%% to read by a given field, or even several,
%% you use a list comprehension and pass the guards
%% to filter those records accordingly
read_by_field(some2, Value) ->
    QueryHandle = qlc:q([X || X <- mnesia:table(bigdata),
                              X#bigdata.some2 == Value]),
    select(QueryHandle).

%% selecting by several conditions
read_by_several() ->
    %% you can pass as many guard expressions as you need
    QueryHandle = qlc:q([X || X <- mnesia:table(bigdata),
                              X#bigdata.some2 =< 300,
                              X#bigdata.some1 > 50
                        ]),
    select(QueryHandle).

%% It's possible to pass a 'fun' which will do the
%% record selection in the query list comprehension
auto_reader(ValidatorFun) ->
    QueryHandle = qlc:q([X || X <- mnesia:table(bigdata),
                              ValidatorFun(X) == true]),
    select(QueryHandle).

read_using_auto() ->
    F = fun({bigdata, _SomeKey, _, _Some2}) -> true;
           (_) -> false
        end,
    auto_reader(F).
So I think that to find the fastest way, we need more clarification and detail about the problem. Speed depends on many factors!

how to sort documents using the erlang map reduce in riak

I'm using Riak to store JSON documents right now, and I want to sort them based on some attribute. Let's say there's a key, i.e.
{
"someAttribute": "whatever",
"order": 1
}
So I want to sort the documents based on the "order".
I am currently retrieving the documents from Riak with the Erlang interface. I can retrieve a document back as a string, but I don't really know what to do after that. I'm thinking the map function just returns the JSON document itself, and in the reduce function I'd check whether the item I'm looking at has a higher "order" than the head of the rest of the list, and if so prepend it, and then return a lists:reverse.
Despite my ideas above, I've had zero results after almost an entire day; I'm so confused by the Erlang interface in Riak. Can someone provide insight on how to write this map/reduce function, or just how to parse the JSON document?
As far as I know, you do not have access to the input list in Map. From Map you emit a document as a one-element list.
Inputs (all the docs to handle as {Bucket, Key}) -> Map (handle single doc) -> Reduce (whole list emitted from Map).
Map is executed for each doc on many nodes, whereas Reduce is done once on the so-called coordinator node (the one where the query was called).
Solution:
Define Inputs (as a list or bucket)
Retrieve the value in Map and emit the whole doc or {Id, Val_to_sort_by}
Sort in Reduce (using regular lists:keysort)
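The steps above could be sketched in Erlang roughly as follows. This is an untested sketch under assumptions: the module name riak_sort_sketch and both function names are made up, and it assumes the code runs on the Riak nodes, where riak_object:get_value/1 and mochijson2:decode/1 are available. Only the reduce function can be tried outside Riak.

```erlang
-module(riak_sort_sketch).
-export([map_order/3, reduce_sort/2]).

%% Map phase: called once per stored object.
%% Decodes the JSON value and emits a one-element list [{Order, Doc}].
%% mochijson2 represents a JSON object as {struct, [{Key, Value}]}.
map_order(Object, _KeyData, _Arg) ->
    Json = riak_object:get_value(Object),
    {struct, Props} = Doc = mochijson2:decode(Json),
    Order = proplists:get_value(<<"order">>, Props),
    [{Order, Doc}].

%% Reduce phase: receives the list of {Order, Doc} tuples emitted by the
%% map phase and sorts it on the first tuple element.
reduce_sort(Values, _Arg) ->
    lists:keysort(1, Values).
```

The compiled module would have to be on the code path of the Riak nodes so that the query can name these functions in its map and reduce phases.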
This is not a map reduce solution but you should check out Riak Search.
So I "solved" the problem using JavaScript, but I still can't do it using Erlang.
Here is my query:
{"inputs":"test",
"query":[{"map":{"language":"javascript",
"source":"function(value, keyData, arg){ var data = Riak.mapValuesJson(value)[0]; var obj = {}; obj[data.order] = data; return [ obj ];}"}},
{"reduce":{"language":"javascript",
"source":"function(values, arg){ return [ values.reduce(function(acc, item){ for(var order in item){ acc[order] = item[order]; } return acc; }) ];}",
"keep":true}}
]
}
So in the map phase, all I do is create a new object, obj, with the order as the key and the data itself as the value. Visually, obj looks like this:
{"1":{"firstName":"John","order":1}}
In the reduce phase, I'm just putting it into the accumulator, so basically that's the sort if you think about it, because when you're done, everything will be put in order for you. I stored 2 JSON documents for testing: one is above, the other is just firstName: Billie, order 2. Here is my result for the query above:
[{"1":{"firstName":"John","order":1},"2":{"firstName":"Billie","order":2}}]
So it works! But I still need to do this in Erlang. Any insights?
