How to extract data from a Mnesia backup file - Erlang

Problem statement
I have a Mnesia backup file and would like to extract values from it. To keep it simple, there are three tables: Employee, Skill, and Attendance, and the backup file contains the data from all three.
The Employee table is:
Empid (key)
Name
SkillId
AttendanceId
The Skill table is:
SkillId (key)
Skill Name
The Attendance table is:
Code (key)
AttendanceId
Percentage
What I have tried
I have used
ets:foldl(Fetch, OutputFile, Table)
Fetch: a separate function that traverses each fetched record to produce the desired output format.
OutputFile: the file the output is written to.
Table: the name of the table.
Expecting
I am getting records with the AttendanceId (as this is the key), whereas I want to get the Code only. It displays the employee information and the attendance id.
Help me out.

Backup and restore are described in the Mnesia User's Guide.
To read an existing backup without restoring it, use mnesia:traverse_backup/6 with read_only as the target module, so nothing is written:
1> mnesia:backup(backup_file).
ok
2> Fun = fun(BackupItems, Acc) -> {[], []} end.
#Fun<erl_eval.12.90072148>
3> mnesia:traverse_backup(backup_file, mnesia_backup, [], read_only, Fun, []).
{ok,[]}
Now add something to the Fun to get what you want.
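For example, to collect just the Code values from the Attendance table, the Fun can pattern-match on the backup items. A sketch, assuming the records are stored as {attendance, Code, AttendanceId, Percentage} tuples (the table layout from the question); adjust the pattern to your actual record definition:
Fetch = fun(BackupItems, Acc) ->
            %% record items look like the stored tuples, e.g.
            %% {attendance, Code, AttendanceId, Percentage}; schema items
            %% like {schema, Tab, Opts} fail the pattern and are skipped
            Codes = [Code || {attendance, Code, _AttId, _Pct} <- BackupItems],
            {[], Codes ++ Acc}   %% returned items are ignored for read_only
        end,
{ok, AllCodes} = mnesia:traverse_backup(backup_file, mnesia_backup,
                                        [], read_only, Fetch, []).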

Related

Insert a lot of data that depends on the previously inserted data

I am trying to store a "navigation" path in the database.
The paths are stored in the logfile as a string, something like "a1 b1 c1 d1", where each element is a "token".
For each token I want to store the path to it. As an example I can have
a1 -> b1 -> c1
a1 -> b1 -> c2
a1 -> b2 -> c2
So, if I ask for all the subtokens of a1, I get [b1 => 2, b2 => 1] in a token => count format.
This way I can get all the subtokens of a given token and the "usage count" for each of those subtokens.
It is possible to have
a1 -> b1 -> c1
g1 -> h1 -> b1
But to me, those two b1 are not the same; their counts should be kept separate.
There should not be a lot of distinct tokens, but there will be a lot of entries in the logfile, so I expect large count values for those tokens.
I am representing the data like this (sqlite3):
id; parent_id; token; count
where the parent_id is a FK to the same table.
My issue is: I have around 50k entries in my log, and I can have more.
I am inserting the data into the database using the following procedure:
search for an entry that has the parent_id + token (for the first token the parent_id is null)
EXISTS: update the count
DOESN'T EXIST: create an entry
save the id of the updated/new entry as parent_id
repeat until there are no more tokens to consume
With 50k entries at an average of 4 tokens per entry, that gives 200k tokens to process.
It does not write a lot of data to the database, as many of those tokens repeat, even though the same token can appear with different parent_ids.
The issue is... it is too slow. I cannot perform the inserts in chunks, because each one depends on the id of an existing row or of a newly created one. Worse, I also need to update the count.
I was thinking of using some sort of tree to store this data, but there is the problem that old records may need to be preserved, and the new data needs to be counted on top of the existing counts.
I could build the tree from the database and then update it with the current data, but that feels like an overcomplicated solution to the problem.
Does anyone have any idea on how to optimize the insertion of this data?
I am using rails (active record) + sqlite 3.
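One way to sidestep the row-by-row dependency is to aggregate all counts in memory first, keying each token by its full path prefix (so the b1 under a1 and the b1 under g1 stay distinct), and only then write the totals to the database in one batch. A minimal sketch of that aggregation pass, in Erlang rather than Ruby only to match the rest of this page:
count_paths(Lines) ->
    lists:foldl(fun(Line, Map0) ->
        Tokens = string:tokens(Line, " "),
        %% walk the tokens, extending the path prefix and bumping one
        %% counter per full prefix, e.g. ["a1"], ["a1","b1"], ...
        {_FinalPath, Map} =
            lists:foldl(fun(Token, {Prefix, M}) ->
                            Path = Prefix ++ [Token],
                            {Path, maps:update_with(Path,
                                                    fun(N) -> N + 1 end, 1, M)}
                        end, {[], Map0}, Tokens),
        Map
    end, #{}, Lines).

%% count_paths(["a1 b1 c1", "a1 b1 c2", "a1 b2 c2"]) yields, among others,
%% #{["a1","b1"] => 2, ["a1","b2"] => 1}, i.e. the [b1 => 2, b2 => 1] answer
%% for a1's subtokens. Each map entry then becomes a single upsert.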

Mnesia - aborted with bad_type when specifying storage strategy

So I'm getting {aborted,{bad_type,link,disc_copies,'my_server@127.0.0.1'}} (it is returned by my init_db/0 function):
-record(link, {hash, original, timestamp}).
init_db() ->
    application:set_env(mnesia, dir, "/tmp/mnesia_db"),
    mnesia:create_schema([node()]),
    mnesia:start(),
    mnesia:create_table(link, [
        {index, [timestamp]},
        {attributes, record_info(fields, link)},
        {disc_copies, [node()]}]).
Without {disc_copies, [node()]} the table is created properly.
Verify write permissions on the parent directory of the mnesia dir you're specifying via application:set_env/3. If the mnesia dir parent directory doesn't allow you to write, you'll get this error. (Another way to get this error is to forget to set mnesia dir entirely, but your set_env call is clearly doing that.)
Update: looking more carefully at your reported error, I see the node mentioned in the error is not in a list:
{aborted,{bad_type,link,disc_copies,'my_server@127.0.0.1'}}
This might mean that the code you show in your question doesn't match what's really running. Specifically, if you call mnesia:create_table/2 passing a node instead of a list of nodes in the disc_copies tuple, as shown below, you'll get the same exact error:
mnesia:create_table(link, [{index, [timestamp]},
                           {attributes, record_info(fields, link)},
                           {disc_copies, node()}]). % note: no list here, should be [node()]
You may need to change the schema table to disc_copies which seems to affect the entire node.
mnesia:change_table_copy_type(schema, node(), disc_copies)
From the mnesia docs:
This function can also be used to change the storage type of the table named schema. The schema table can only have ram_copies or disc_copies as the storage type. If the storage type of the schema is ram_copies, no other table can be disc-resident on that node.
After this, you should be able to create disc_copies tables on the node.
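Putting both answers together, a sketch of an init_db/0 that makes the schema disc-resident before creating the table (assuming a single node and the link record from the question):
init_db() ->
    application:set_env(mnesia, dir, "/tmp/mnesia_db"),
    mnesia:create_schema([node()]),   %% creates a disc schema if none exists yet
    mnesia:start(),
    %% if an existing schema is still ram_copies, move it to disc; this
    %% returns {aborted, ...} when it is already disc_copies, which is
    %% harmless in this sketch
    mnesia:change_table_copy_type(schema, node(), disc_copies),
    mnesia:create_table(link, [{index, [timestamp]},
                               {attributes, record_info(fields, link)},
                               {disc_copies, [node()]}]).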

Mnesia deletion error

I am using a mnesia table. This table has two attributes (a primary key and its value).
Now I am trying to delete a tuple from the mnesia table, using mnesia's delete function. It takes the table name and the key corresponding to the tuple to be deleted. My problem is how to handle the scenario where no tuple corresponds to the passed key: this delete function gives {atomic,ok} every time.
For your case you have to read the record first and delete it only after that. To prevent other transactions from accessing the record between the read and the delete, take a write lock when reading the record. That gives your transaction exclusive access to it:
delete_record(Table, Key) ->
    F = fun() ->
            case mnesia:read(Table, Key, write) of
                [Record] ->
                    mnesia:delete({Table, Key}),
                    {ok, Record};
                [] ->
                    mnesia:abort(not_exist)
            end
        end,
    mnesia:transaction(F).
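The caller can then tell the two outcomes apart; a short usage sketch (the table name and key are made up):
case delete_record(employee, 42) of
    {atomic, {ok, Record}} ->
        io:format("deleted ~p~n", [Record]);
    {aborted, not_exist} ->
        io:format("no tuple with that key~n")
end.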

Create several Mnesia tables with the same columns

I want to create the following schema in Mnesia: three tables, called t1, t2 and t3, each of them storing elements of the following record:
-record(pe, {pid, event}).
I tried creating the tables with:
Attrs = record_info(fields, pe),
Tbls = [t1, t2, t3],
[mnesia:create_table(Tbl, [{attributes, Attrs}]) || Tbl <- Tbls],
and then wrote some content using the following line (P and E are bound to values):
mnesia:write(t1, #pe{pid=P, event=E}, write)
but I got a bad_type error. (The relevant commands were run inside transactions, so it's not a sync problem.)
All the textbook examples of Mnesia show how to create different tables for different records. Can someone please reply with an example for creating different tables for the same record?
Regarding your "DDT" for creating the tables, I don't see any mistake at first sight. Just remember that using table names different from the record name makes you lose the "simple" commands (like mnesia:write/1), because they use element(1, RecordTuple) to retrieve the table name.
When defining tables, you can use the option {record_name, RecordName} (in your case: {record_name, pe}) to tell mnesia that the first atom in the tuples representing records in the table is not the table name but the atom you passed with record_name; so in the case of your table t1, it makes mnesia expect 'pe' records when inserting or looking up records.
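For example, the table creation from the question would then become (a sketch keeping the question's variable names):
Attrs = record_info(fields, pe),
Tbls = [t1, t2, t3],
%% record_name tells mnesia to expect #pe{} tuples in each table
[mnesia:create_table(Tbl, [{record_name, pe}, {attributes, Attrs}]) || Tbl <- Tbls].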
If you want to insert a record in all tables, you might use a script similar to the one used to create table (but in a function wrapper for mnesia transaction context):
insert_record_in_all_tables(Pid, Event, Tables) ->
    mnesia:transaction(fun() ->
        [mnesia:write(T, #pe{pid = Pid, event = Event}, write) || T <- Tables]
    end).
Hope this helps!

erlang - how can I match tuple contents with qlc and mnesia?

I have a mnesia table for this record.
-record(peer, {
    peer_key,          %% key is the tuple {FileId, PeerId}
    last_seen,
    last_event,
    uploaded = 0,
    downloaded = 0,
    left = 0,
    ip_port,
    key
}).
The peer_key field is a tuple {FileId, ClientId}. Now I need to extract the ip_port field from all peers that have a specific FileId.
I came up with a workable solution, but I'm not sure if this is a good approach:
qlc:q([IpPort || #peer{peer_key={FileId,_}, ip_port=IpPort} <- mnesia:table(peer), FileId=:=RequiredFileId])
Thanks.
Using an ordered_set table type with a tuple primary key like {FileId, PeerId}, and then partially binding a prefix of the tuple like {RequiredFileId, _}, will be very efficient: only the range of keys with that prefix is examined, not the full table. You can use qlc:info/1 to examine the query plan and ensure that the selects that occur actually bind the key prefix.
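For comparison, a hand-written select with a partially bound key, roughly what the qlc planner should arrive at here, might look like this (a sketch, to be run inside an mnesia transaction):
%% '_' leaves the PeerId part of the key and all other fields unbound;
%% '$1' selects only the ip_port field
MatchHead = #peer{peer_key = {RequiredFileId, '_'}, ip_port = '$1', _ = '_'},
mnesia:select(peer, [{MatchHead, [], ['$1']}]).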
Your query time will grow linearly with the table size, as it requires scanning all rows. So benchmark it with realistic table data to see if it really is workable.
If you need to speed it up, you should focus on quickly finding all peers that carry the file id. This could be done with a table of type bag with [fileid, peerid] as attributes. Given a file id, you would get all peer ids, and with those you could construct the peer-table keys to look up.
Of course, you would also need to maintain that bag-type table inside every transaction that changes the peer table.
Another option would be to duplicate fileid in its own column and add a mnesia index on it. I am just not that into mnesia's own secondary indexes.
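A sketch of that last option, assuming a file_id field is added to #peer{} and kept in sync with the key:
mnesia:create_table(peer, [{attributes, record_info(fields, peer)},
                           {index, [file_id]}]),
%% inside a transaction (or via mnesia:dirty_index_read/3):
Peers = mnesia:index_read(peer, RequiredFileId, #peer.file_id),
[P#peer.ip_port || P <- Peers].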
