Are ETS bulk operations atomic? - erlang

Specifically, :ets.tab2list and :ets.file2tab. Do these functions "snapshot" the table state, or can other operations interleave reads and writes while these functions run?

Based on the documentation here:
Functions that internally traverse over a table, like select and match, give the same guarantee as safe_fixtable.
Where
[...] function safe_fixtable can be used to guarantee that a sequence of first/1 and next/2 calls traverse the table without errors and that each existing object in the table is visited exactly once, even if another (or the same) process simultaneously deletes or inserts objects into the table.
And specifically related to your question:
Nothing else is guaranteed; in particular objects that are inserted or deleted during such a traversal can be visited once or not at all.
EDIT
ets:tab2list/1 calls ets:match_object/2 which is a built-in function (BIF) implemented in C. The implementation here is using the BIF ets_select2, which is the implementation for ets:select/2.
ets:file2tab ends up calling load_table/3 which simply uses ets:insert/2.
The code for ets:tab2file/3 in ets.erl uses ets:select/3 to get the first chunk and then ets:select/1 to get the remaining chunks of the table.

Related

Is there any standard stored procedure to capture table refresh details in snowflake

I am trying to log table refresh details in a Snowflake DWH.
The details include:
Batch Date, Source Table Name, Target Table Name, rows loaded, timestamp, status, err.Message.
Is there any standard SQL/Snowflake stored procedure that could serve as a common one for the entire DWH, to trace/audit table refresh details and log them into a single table?
I have variables which capture the batch date, target table name, source table name, etc.
If I could get a standard stored procedure that logs the start and the end of the activity, that would be really helpful.
Regards,
Srinivas
If you are looking for some ideas moving forward, here are a couple of things that can help you out:
Query History is useful, but hard to filter. If you use a query_tag in your batch processes, then you can reference query_history for information.
In addition, if you want to capture information as it's running, you could use Streams and Tasks on your tables to capture counts of updates, inserts, deletes, etc. for each batch in the background.
There is no standard stored procedure that you can leverage within Snowflake to query this information, but there is a lot of data available in the snowflake.account_usage share.
Not sure what exactly you're trying to achieve here, but
- you can use last_altered on a table to see when the data was last modified
- you can filter the query_history view to see which queries modified the table: https://docs.snowflake.com/en/sql-reference/functions/query_history.html
You can take advantage of Snowflake Streams: https://docs.snowflake.com/en/sql-reference/sql/create-stream.html
When you create a stream, you point it at a target table. The stream then records the changes made to that table (INSERTs, UPDATEs and DELETEs) between two points in time.
You can query a stream like any table to look for changes.
What's great about streams is that after a DML operation successfully consumes data from a stream, the stream is purged (its offset advances), so when you query it again it will be empty.
Use them without guilt: streams don't duplicate your data. They only store the offset and the change-tracking (CDC) information, so the data remains in your table.
Some useful guides that build something related to what you need:
- Part 1: https://www.snowflake.com/blog/building-a-type-2-slowly-changing-dimension-in-snowflake-using-streams-and-tasks-part-1/
- Part 2: https://www.snowflake.com/blog/building-a-type-2-slowly-changing-dimension-in-snowflake-using-streams-and-tasks-part-2/

When to use HANA SPs instead of graphical Calculation views?

I haven't come across any scenario where we have to use a stored procedure instead of a Calculation View, but I have read on many sites that such scenarios exist. They say a stored procedure can be used in complex scenarios, but I am confused about which scenarios are meant.
Can anyone suggest scenarios where we have to use a stored procedure instead of a graphical Calculation View?
Hierarchies
If you are looking for the parents (or children) of an object for undetermined depth, you have to do many SELECTs in a loop.
If you use views, the loop has to be on ABAP side, causing many roundtrips between the application server and the DB.
Stored procedures are very beneficial in this case, as they can run the loops on the HANA side. You only have to move the end result over the network.
Sidenote: you should be using CDS views instead of Calculation views, as they offer many benefits.
First of all they are used by SAP internally in S/4 products, making CDS the way of the present and future.
Also they are ABAP objects, transported together with the referencing ABAP coding.
In a stored procedure, or in an AMDP, you can use a script code block which can contain more than a single SELECT statement. For example, you can keep the outcomes of previous SELECT commands in temporary tables inside that AMDP and use them later.
AMDP enables developers to keep the business logic in it.
But if you are using a view, you are generally limited to the functions allowed in a single SELECT statement.
For example, I could not use the TRIM function within a CDS view, but I can use it in an AMDP.

Which is better practice between passing ActiveRecord Objects vs database table ids to a method

I have a rake task that pulls events with status 'Open' from the database (Event model) and processes them by calling methods in two different classes. The first batches up the events based on some condition. The other is a crawler that generates a CSV for these event batches, uploads that CSV to an external website, and then, after the crawl finishes executing, updates the status of each event referenced by those batches.
There are two ways I could pass the events to these two classes' methods:
- Pass the ActiveRecord objects to the two classes (my current implementation).
- Pass the objects' database table ids and do a fetch in each of those classes.
Which of these options has less of a 'smell' to it? My brain tells me that passing the ids would have a performance downside, because another database query is needed once the ids get to the other class. On the other hand, passing ActiveRecord objects with all that data seems superfluous, since all that will be updated is the status. So which option is the better one? I've included the rake task to give a clear picture of what I mean.
desc "Process open Events ..."
task :process_open_events => :environment do
  # Load every event whose status is 'Open'
  open_events = Event.find_all_by_status("Open")
  # Group the events into batches, then hand the batches to the crawler
  event_batches = EventUtils::EventProcessor.create_event_batches(open_events)
  crawler = EventsCrawler.create!
  crawler.enqueue_crawler(event_batches)
end
If you passed ids, you would read (and hold in memory) a list of records, map their ids to an array (also held in memory), and send that array to another method. That method would then not only rerun a query which you know will return identical results, but also put a duplicate of the initial list into memory.
Working with a list of ids seems to me less efficient in every way: processing time, memory usage, and database usage.

what is the basic difference between stack and queue?

What is the basic difference between a stack and a queue?
Please help me; I am unable to find the difference.
How do you differentiate a stack and a queue?
I searched for the answer in various links and found this answer:
In high level programming,
a stack is defined as a list or sequence of elements that is lengthened by placing new elements "on top" of existing elements and shortened by removing elements from the top of existing elements. It is an ADT[Abstract Data Type] with math operations of "push" and "pop".
A queue is a sequence of elements that is added to by placing the new element at the rear of existing and shortened by removing elements in front of queue. It is an ADT[Abstract Data Type]. There is more to these terms understood in programming of Java, C++, Python and so on.
Can I have an answer which is more detailed? Please help me.
A stack is a LIFO (last in, first out) data structure. The associated link to Wikipedia contains a detailed description and examples.
A queue is a FIFO (first in, first out) data structure. The associated link to Wikipedia contains a detailed description and examples.
Imagine a stack of paper. The last piece put into the stack is on the top, so it is the first one to come out. This is LIFO. Adding a piece of paper is called "pushing", and removing a piece of paper is called "popping".
Imagine a queue at the store. The first person in line is the first person to get out of line. This is FIFO. A person getting into line is "enqueued", and a person getting out of line is "dequeued".
A Visual Model
Pancake Stack (LIFO)
The only way to add one and/or remove one is from the top.
Line Queue (FIFO)
When one arrives they arrive at the end of the queue and when one leaves they leave from the front of the queue.
Fun fact: the British refer to lines of people as a Queue
You can think of both as an ordered list of things (ordered by the time at which they were added to the list). The main difference between the two is how new elements enter the list and old elements leave the list.
For a stack, if I have a list a, b, c, and I add d, it gets tacked on the end, so I end up with a,b,c,d. If I want to pop an element off the list, I remove the last element I added, which is d. After a pop, my list is now a,b,c again.
For a queue, I add new elements in the same way. a,b,c becomes a,b,c,d after adding d. But, now when I pop, I have to take an element from the front of the list, so it becomes b,c,d.
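The a, b, c, d walk-through above can be sketched in a few lines of Python. This is only an illustration: a plain list stands in for the stack and collections.deque for the queue, and the variable names are made up for the example.

```python
from collections import deque

# Stack: elements are added and removed at the same end (the "top").
stack = ["a", "b", "c"]
stack.append("d")            # push: a, b, c, d
popped = stack.pop()         # pop removes the most recently added element

# Queue: elements are added at the rear and removed from the front.
queue = deque(["a", "b", "c"])
queue.append("d")            # enqueue: a, b, c, d
dequeued = queue.popleft()   # dequeue removes the oldest element

print(popped, stack)           # d ['a', 'b', 'c']
print(dequeued, list(queue))   # a ['b', 'c', 'd']
```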
It's very simple!
Queue
Queue is an ordered collection of items.
Items are deleted at one end called ‘front’ end of the queue.
Items are inserted at other end called ‘rear’ of the queue.
The first item inserted is the first to be removed (FIFO).
Stack
Stack is a collection of items.
It allows access to only one data item: the last item inserted.
Items are inserted & deleted at one end called ‘Top of the stack’.
It is a dynamic & constantly changing object.
All the data items are put on top of the stack and taken off the top.
This structure of accessing is known as a Last In First Out (LIFO) structure.
STACK:
Stack is defined as a list of elements in which we can insert or delete elements only at the top of the stack.
The behaviour of a stack is like a Last-In First-Out (LIFO) system.
A stack is used to pass parameters between functions. On a call to a function, the parameters and local variables are stored on a stack.
High-level programming languages such as Pascal, C, etc. that provide support for recursion use the stack for bookkeeping. Remember that in each recursive call, there is a need to save the current value of parameters, local variables, and the return address (the address to which the control has to return after the call).
QUEUE:
Queue is a collection of the same type of element. It is a linear list in which insertions can take place only at one end of the list, called the rear of the list, and deletions can take place only at the other end, called the front of the list.
The behaviour of a queue is like a First-In-First-Out (FIFO) system.
A stack is a collection of elements, which can be stored and retrieved one at a time. Elements are retrieved in reverse order of their time of storage, i.e. the latest element stored is the next element to be retrieved. A stack is sometimes referred to as a Last-In-First-Out (LIFO) or First-In-Last-Out (FILO) structure. Elements previously stored cannot be retrieved until the latest element (usually referred to as the 'top' element) has been retrieved.
A queue is a collection of elements, which can be stored and retrieved one at a time. Elements are retrieved in order of their time of storage, i.e. the first element stored is the next element to be retrieved. A queue is sometimes referred to as a First-In-First-Out (FIFO) or Last-In-Last-Out (LILO) structure. Elements subsequently stored cannot be retrieved until the first element (usually referred to as the 'front' element) has been retrieved.
To over-simplify the description of a stack and a queue:
They are both dynamic chains of information elements that can be accessed from the ends of the chain, and the only real difference between them is that:
when working with a stack
you insert elements at one end of the chain and
you retrieve and/or remove elements from the same end of the chain
while with a queue
you insert elements at one end of the chain and
you retrieve/remove them from the other end
NOTE:
I am using the abstract wording of retrieve/remove in this context because there are instances when you just retrieve the element from the chain (in a sense, just read it or access its value), there are also instances when you remove the element from the chain, and finally there are instances when you do both actions with the same call.
Also, the word element is purposely used in order to abstract the imaginary chain as much as possible and decouple it from specific programming language terms. This abstract information entity called element could be anything: a pointer, a value, a string of characters, an object, ... depending on the language.
In most cases, though, it is actually either a value or a memory location (i.e. a pointer). And the rest are just hiding this fact behind the language jargon.
A queue can be helpful when the order of the elements is important and needs to be exactly the same as when the elements first came into your program. For instance, when you process an audio stream, when you buffer network data, or when you do any type of store-and-forward processing. In all of these cases you need the elements to be output in the same order as they came into your program, otherwise the information may stop making sense. So, you could break your program into a part that reads data from some input, does some processing and writes the data to a queue, and a part that retrieves data from the queue, processes it and stores it in another queue for further processing or transmission.
A stack can be helpful when you need to temporarily store an element that is going to be used in the immediate step(s) of your program. For instance, programming languages usually use a stack structure to pass variables to functions. What they actually do is store (or push) the function arguments on the stack and then jump to the function, where they remove and retrieve (or pop) the same number of elements from the stack. That way the size of the stack depends on the number of nested function calls. Additionally, after a function has been called and has finished what it was doing, it leaves the stack in the exact same condition as before it was called! That way any function can operate on the stack while ignoring how other functions operate on it.
Lastly, you should know that other terms are used out there for the same or similar concepts. For instance, a stack is sometimes called a pushdown list or a LIFO list. (It should not be confused with a heap, which is a different structure.) There are also hybrid versions of these concepts: a double-ended queue can behave as a stack and as a queue at the same time, because it can be accessed from both ends. Additionally, the fact that a data structure is provided to you as a stack or as a queue does not necessarily mean that it is implemented as such; a data structure can be implemented as anything and be offered as a specific data structure simply because it can be made to behave like one. In other words, if you provide push and pop methods on any data structure, it magically becomes a stack!
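As a small illustration of the hybrid case, here is a sketch using Python's collections.deque as the double-ended queue. The values are arbitrary; the point is only that the same object can be consumed stack-style or queue-style depending on which end you remove from.

```python
from collections import deque

items = deque([1, 2, 3])
items.append(4)             # add at the right end: 1, 2, 3, 4

last = items.pop()          # stack-style: remove from the end just added to
first = items.popleft()     # queue-style: remove from the opposite end

items.appendleft(0)         # a deque also accepts insertions at the front

print(last, first, list(items))   # 4 1 [0, 2, 3]
```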
A STACK is a LIFO (Last In First Out) list. Suppose 3 elements are inserted into the stack, i.e. 10, 20, 30. 10 is inserted first and 30 is inserted last, so 30 is deleted first from the stack and 10 is deleted last.
A QUEUE is a FIFO (First In First Out) list: the element that is inserted first is the one deleted first, e.g. a queue of people.
A stack can be considered a vertical collection. First understand that a collection is an OBJECT that gathers and organizes other, smaller OBJECTS. These smaller OBJECTS are commonly referred to as elements. These elements are "pushed" onto the stack in A, B, C order, where A is first and C is last. Vertically it would look like this:
3rd element added) C
2nd element added) B
1st element added) A
Notice that the "A" which was first added to the stack is on the bottom.
If you want to remove the "A" from the stack, you first have to remove "C", then "B", and then finally your target element "A". A stack requires a LIFO (Last In First Out) approach. When removing an element from a stack, the correct term is pop: we don't remove an element from a stack, we "pop" it off.
Recall that "A" was the first element pushed onto the stack and "C" was the last. Should you want to see what is at the bottom of the stack, with A the first, B the second and C the third element added, the top element (C) would have to be popped off, then the second element (B), in order to view the bottom of the stack.
Simply put, a stack is a data structure that retrieves data in opposite order that it was stored in. Meaning that Insertion and Deletion both follow the LIFO (Last In First Out) system. You only ever have access to the top of the stack.
A queue retrieves data in the same order in which it was stored. You have access to the front of the queue when removing, and to the back when adding. This follows the FIFO (First In First Out) system.
Stacks use push, pop, peek, size, and clear. Queues use enqueue, dequeue, peek, size, and clear.
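The operations named above could look like this in Python. This is a minimal sketch, not a reference implementation: the class names and the choice of a list and a deque as backing storage are assumptions made for the example.

```python
from collections import deque


class Stack:
    """LIFO: only the top (the most recently pushed element) is accessible."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()      # raises IndexError if the stack is empty

    def peek(self):
        return self._items[-1]        # look at the top without removing it

    def size(self):
        return len(self._items)

    def clear(self):
        self._items.clear()


class Queue:
    """FIFO: add at the back, remove from the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        return self._items.popleft()  # raises IndexError if the queue is empty

    def peek(self):
        return self._items[0]         # look at the front without removing it

    def size(self):
        return len(self._items)

    def clear(self):
        self._items.clear()
```

Pushing A, B, C onto the Stack and popping once yields C; enqueueing A, B, C on the Queue and dequeueing once yields A.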

Can TCustomClientDataset apply updates in a batch mode?

I've got a DB Express TSimpleDataset connected to a Firebird database. I've just added several thousand rows of data to the dataset, and now it's time to call ApplyUpdates.
Unfortunately, this results in several thousand database hits as it tries to INSERT each row individually. That's a bit disappointing. What I'd really like to see is the dataset generate a single transaction with a few thousand INSERT statements in it and send the whole thing at once. I could set that up myself if I had to, but first I'd like to know if there's any method for it built in to the dataset or the DBX framework.
Don't know if it is possible with a TSimpleDataset (never used it), but surely you can if you use a TClientDataset + TDatasetProvider + <put your db dataset here>. You can write a BeforeUpdateRecord to handle the apply process yourself. Basically, it allows you to bypass the standard apply process, access the dataset delta with changes made to records, and then use your own code and components to apply changes to the database. For example you could call stored procedures to modify data, and so on.
However, there is a difference between a transaction and what is called "array DML", "bulk insert" or the like. Even if you use a single transaction (and an "apply" AFAIK happens in a single transaction), within the transaction you may still need to send "n" INSERTs. Some databases support a way of sending a single INSERT (or UPDATE, or DELETE) with an array of parameters, reducing the number of individual statements, but that may be very database specific and AFAIK dbExpress/DataSnap do not support it. You could still use the BeforeUpdateRecord event to take advantage of specific database capabilities.
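To make the "one transaction, one statement, many parameter sets" idea concrete, here is a sketch in Python using the standard sqlite3 module. It is only a stand-in for illustration: it is not Delphi, dbExpress or Firebird, the table and column names are hypothetical, and whether a given driver actually sends such a batch in a single round trip is driver specific.

```python
import sqlite3

# Hypothetical data: several thousand rows waiting to be applied.
rows = [(i, f"name-{i}") for i in range(5000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

# The connection used as a context manager wraps the work in one transaction:
# it commits if the block succeeds and rolls back if an exception is raised.
with conn:
    # One parameterized INSERT, executed once per parameter tuple.
    conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(count)
```

The transaction gives all-or-nothing behaviour; the parameter batching is a separate concern, which is the distinction drawn above.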
