Create several Mnesia tables with the same columns - erlang

I want to create the following schema in Mnesia. Have three tables, called t1, t2 and t3, each of them storing elements of the following record:
-record(pe, {pid, event}).
I tried creating the tables with:
Attrs = record_info(fields, pe),
Tbls = [t1, t2, t3],
[mnesia:create_table(Tbl, [{attributes, Attrs}]) || Tbl <- Tbls],
and then write some content using the following line (P and E have values):
mnesia:write(t1, #pe{pid=P, event=E}, write)
but I got a bad type error. (Relevant commands were passed to transactions, so it's not a sync problem.)
All the textbook examples of Mnesia show how to create different tables for different records. Can someone please reply with an example for creating different tables for the same record?

regarding your "DDT" for creating the tables, I don't see any mystake at first sight, just remember that using tables with names different from the record names makes you lose the "simple" commands (like mnesia:write/1) because they use element(1, RecordTuple) to retrieve table name.
When defining tables, you can use option {record_name, RecordName} (in your case: {record_name, pe}) to tell mnesia that first atom in tuple representing records in table is not the table name, but instead the atom you passed with record_name; so in case of your table t1 it makes mnesia expecting 'pe' records when inserting or looking up for records.
If you want to insert a record in all tables, you might use a script similar to the one used to create table (but in a function wrapper for mnesia transaction context):
insert_record_in_all_tables(Pid, Event, Tables) ->
mnesia:transaction(fun() -> [mnesia:write(T, #pe{pid=Pid, event=Event}, write) || T <- Tables] end).
Hope this helps!

Related

How to effectively delete a lot of mnesia tables

I faced a situation when I need to delete a lot of mnesia tables on the node (about 20000). Since there is a name pattern for these tables I can collect and delete them this way:
Tables = [Table || Table <- mnesia:system_info(tables), re:run(atom_to_list(Table), "<pattern>") /= nomatch],
lists:foreach(
fun (Table) ->
mnesia:delete_table(Table)
end,
Tables).
However deleting them one by one is very slow and it takes very long to delete 20k tables.
Is there any way to do it more effectively?
you can spawn processes.
lists:foreach(
fun (Table) ->
spawn(mnesia, delete_table, [Table])
end,
Tables).

How to extract data from mnesia backup file

Problem statement
I have a mnesia backup file and would like to extract values from it. There are 3 tables(to make it simple), Employee, Skills, and attendance. So the mnesia back up file contains all those data from these three tables.
Emplyee table is :
Empid (Key)
Name
SkillId
AttendanceId
Skill table is
SkillId (Key)
Skill Name
Attendance table is
Code (Key)
AttendanceId
Percentage
What i have tried
I have used
ets:foldl(Fetch,OutputFile,Table)
Fetch : is separate function to traverse the record fetched to bring in desired output format.
OutputFile : it writes to this file
Table : name of the table
Expecting
I am gettig records with AttendanceId(as this is the key) where as i Want to get code only. It displays employee informations and attendance id.
Help me out.
Backup and restore is described in the mnesia user guide here.
To read an existing backup, without restoring it, use mnesia:traverse_backup/4.
1> mnesia:backup(backup_file).
ok
2> Fun = fun(BackupItems, Acc) -> {[], []} end.
#Fun<erl_eval.12.90072148>
3> mnesia:traverse_backup(backup_file, mnesia_backup, [], read_only, Fun, []).
{ok,[]}
Now add something to the Fun to get what you want.

How do I remove rows of an RDD whose key is not in another RDD?

Let's say I have a PairRDD, students (id, name). I would like to only keep rows where id is in another RDD, activeStudents (id).
The solution I have is to create a PairDD from activeStudents, (id, id), and the do a join with students.
Is there a more elegant way of doing this?
Thats a pretty good solution to start with. If active students is small enough you could collect the ids as a map and then filter with the id presence (this avoids having to a do a shuffle).
Much like you thought, you can do an outer join if both RDDs contain keys and values.
val students: RDD[(Long, String)]
val activeStudents: RDD[Long]
val activeMap: RDD[(Long, Unit)] = activeStudents.map(_ -> ())
val activeWithName: RDD[(Long, String)] =
students.leftOuterJoin(activeMap).flatMapValues {
case (name, Some(())) => Some(name)
case (name, None) => None
}
If you don't have to join those two data sets then you should definitely avoid it.
I had a similar problem recently and I successfully solved it using a broadcasted Set, which I used in UDF to check whether each RDD row (rather value from one of its columns) is in that Set. That UDF is than used as the basis for the filter transformation.
More here: whats-the-most-efficient-way-to-filter-a-dataframe.
Hope this helps. Ask if it's not clear.

transfer data from table to another table in erlang

I have in my database mnesia two table which have this syntax :
-record(person, {firstname, lastname,adress}).
-record(personBackup, {firstname, lastname,adress}).
I want to transfer the data from the table person to the table personBackup
I think that I should create the two tables with this syntax ( I'm agree with your idea)
mnesia:create_table(person,
[{disc_copies, [node()]},
{attributes, record_info(fields, person)}]),
mnesia:create_table(person_backup,
[{disc_copies, [node()]},
{attributes, record_info(fields, person)},
{record_name, person}]),
now I have a function named verify
in this function I will do a test and if the test is verified I should transfert data from person to person_backup and then I should do a reset
this is my function
verify(Form)->
if Form =:= 40 ->
%%here I should transert data from person to person_backup : read all lines from person and write this lines into person_backup
reset();
Form =/= 40 ->
io:format("it is ok")
end.
this is the function reset :
reset() ->
stop(),
destroy(),
create(),
start(),
{ok}.
You don't have to use a separate record definition for each table. mnesia:create_table takes a record_name option, so you could create your tables like this:
mnesia:create_table(person,
[{disc_copies, [node()]},
{attributes, record_info(fields, person)}]),
mnesia:create_table(person_backup,
[{disc_copies, [node()]},
{attributes, record_info(fields, person)},
{record_name, person}]),
The value for record_name defaults to the name of the table, so there's no need to specify it for person. (I changed personBackup to person_backup, as Erlang atoms are usually written without camel case, unlike variables.)
Then you can put the same kind of records in both tables. Read or select from person, and write to person_backup, no conversion necessary.
There is no need for a seperated record definition for each table. Variables 90 and 80 will do the trick. If you wish to take a option of recod_name, you could use mnesia:create_table
# legoscia, you are correct for everything except for line 6.
mnesia:create_table(person, if player value = 1
this way the outcome can disc all copies and nodes.

Getting lots of data from Mnesia - fastest way

I have a record:
-record(bigdata, {mykey,some1,some2}).
Is doing a
mnesia:match_object({bigdata, mykey, some1,'_'})
the fastest way fetching more than 5000 rows?
Clarification:
Creating "custom" keys is an option (so I can do a read) but is doing 5000 reads fastest than match_object on one single key?
I'm curious as to the problem you are solving, how many rows are in the table, etc., without that information this might not be a relevant answer, but...
If you have a bag, then it might be better to use read/2 on the key and then traverse the list of records being returned. It would be best, if possible, to structure your data to avoid selects and match.
In general select/2 is preferred to match_object as it tends to better avoid full table scans. Also, dirty_select is going to be faster then select/2 assuming you do not need transactional support. And, if you can live with the constraints, Mensa allows you to go against the underlying ets table directly which is very fast, but look at the documentation as it is appropriate only in very rarified situations.
Mnesia is more a key-value storage system, and it will traverse all its records for getting match.
To fetch in a fast way, you should design the storage structure to directly support the query. To Make some1 as key or index. Then fetch them by read or index_read.
The statement Fastest Way to return more than 5000 rows depends on the problem in question. What is the database structure ? What do we want ? what is the record structure ? After those, then, it boils down to how you write your read functions. If we are sure about the primary key, then we use mnesia:read/1 or mnesia:read/2 if not, its better and more beautiful to use Query List comprehensions. Its more flexible to search nested records and with complex conditional queries. see usage below:
-include_lib("stdlib/include/qlc.hrl").
-record(bigdata, {mykey,some1,some2}).
%% query list comprehenshions
select(Q)->
%% to prevent against nested transactions
%% to ensure it also works whether table
%% is fragmented or not, we will use
%% mnesia:activity/4
case mnesia:is_transaction() of
false ->
F = fun(QH)-> qlc:e(QH) end,
mnesia:activity(transaction,F,[Q],mnesia_frag);
true -> qlc:e(Q)
end.
%% to read by a given field or even several
%% you use a list comprehension and pass the guards
%% to filter those records accordingly
read_by_field(some2,Value)->
QueryHandle = qlc:q([X || X <- mnesia:table(bigdata),
X#bigdata.some2 == Value]),
select(QueryHandle).
%% selecting by several conditions
read_by_several()->
%% you can pass as many guard expressions
QueryHandle = qlc:q([X || X <- mnesia:table(bigdata),
X#bigdata.some2 =< 300,
X#bigdata.some1 > 50
]),
select(QueryHandle).
%% Its possible to pass a 'fun' which will do the
%% record selection in the query list comprehension
auto_reader(ValidatorFun)->
QueryHandle = qlc:q([X || X <- mnesia:table(bigdata),
ValidatorFun(X) == true]),
select(QueryHandle).
read_using_auto()->
F = fun({bigdata,SomeKey,_,Some2}) -> true;
(_) -> false
end,
auto_reader(F).
So i think if you want fastest way, we need more clarification and problem detail. Speed depends on many factors my dear !

Resources