Azure Data Factory: read URL values from a CSV file and copy the data to a SQL database

I am quite new to ADF, so I am asking for suggestions.
The use case:
I have a CSV file which contains a unique ID and URLs (see the image below). I would like to use this file to export the values from the various URLs. In the second image you can see an example of the data from one URL.
So in the current situation I take each URL and insert it manually as the source of an ADF Copy activity to export the data to a SQL DB. This is a very time-consuming method.
How can I create an ADF pipeline that uses the CSV file as a source, so that a Copy activity uses the URL from each row and copies the data to an Azure SQL DB? Do I need to add a Get Metadata activity, for example? If so, how?
Many thanks.

Use a Lookup activity that reads all the data, then use a ForEach loop which iterates row by row. Inside the ForEach, use a Copy activity where you can copy the response to the sink.

In order to copy the XML response of a URL, we can use an HTTP linked service with an XML dataset. As @BeingReal said, a Lookup activity should be used to refer to the table which contains all the URLs, and inside a ForEach activity the Copy activity is given HTTP as the source and a sink as per the requirement. I tried to repro the same in my environment. Below are the steps.
A lookup table with 3 URLs is taken, as in the image below.
A ForEach activity is added in sequence with the Lookup activity.
Inside the ForEach, a Copy activity is added. The source is given as the HTTP linked service.
In the HTTP linked service, the base URL is given as @item().name. name is the column that stores the URLs in the lookup table. Replace name with the column name that you gave in your lookup table.
In the sink, an Azure SQL database is given (any sink of your requirement can be used). The data is copied to the SQL database.
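For reference, the two expressions that wire this together look roughly as follows (assuming the Lookup activity is named Lookup1 and the URL column is called name; adjust both to match your own pipeline):
ForEach activity > Settings > Items: @activity('Lookup1').output.value
HTTP dataset base URL parameter: @item().name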

This is the HTTP dataset inside the Copy activity.
This is the input of the Copy activity inside the ForEach.
This is the output of the Copy activity.
My sink is an Azure SQL database without any tables yet; I would like to auto-create the table on the fly from ADF. I don't understand why this error came up.

Related

Is there a way to Bulk extract contact details out of Oracle Eloqua's API?

I am trying to extract a large amount of details out of our Eloqua system using its API, and got this API to work perfectly for single IDs: https://docs.oracle.com/en/cloud/saas/marketing/eloqua-rest-api/op-api-rest-1.0-data-contact-id-get.html
The problem is that I need to run this for a large number of IDs, and it would take a lot of individual calls to cover the entire population. Are there any bulk APIs that can extract all of the following details out of Eloqua/Contact for the entire population? I don't see any on that page's documentation under the Bulk section that meet this need.
contactid, company, employees, company_revenue, business_phone, email_address, web_domain, date_created, date_modified, address_1, address_2, city, state_or_province, zip_or_postal_code, mobile_phone, first_name, last_name, title
It's a multi-step process with the Bulk API, typically in the following fashion (a rough sketch in code follows the list):
Get a list of the current internal field names - useful for creating your export definition
Create an export definition and POST it here. There is a useful example on the page; you do not need filter criteria. Store the export ID somewhere.
Using your export definition id, create a sync. It will gather the data in the background and prepare it for you. Take note of the sync ID provided in the initial response.
Check on the sync status with your sync ID here. It should only take a couple of minutes - and there is a callback url option as well in the previous step, if you don't want to keep polling.
Once your data is ready, use that sync id and request the data. Depending on how many rows were retrieved, you might need to paginate through the results using the offset query param. By default it will give you JSON, but I usually choose CSV (specify in the header).
If you need updated data, feel free to create a new sync using the same export definition id. You do not need to create a new export definition each time.
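This is not official client code, just a rough Ruby sketch of those steps, assuming the Bulk 2.0 endpoints and basic auth; the base URL, credentials and field statements are placeholders you would need to adapt:

require "net/http"
require "json"
require "uri"

# Placeholders -- replace the base URL, credentials and field statements
# ({{Contact.Field(...)}}) with your own instance's values.
BASE = "https://secure.p01.eloqua.com/api/bulk/2.0"

def eloqua(method, path, body = nil)
  uri = URI("#{BASE}#{path}")
  req = method == :post ? Net::HTTP::Post.new(uri) : Net::HTTP::Get.new(uri)
  req.basic_auth("SiteName\\user.name", "password")
  req["Content-Type"] = "application/json"
  req.body = body.to_json if body
  res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
  JSON.parse(res.body)
end

# 1. Create the export definition once and keep its uri for reuse
export = eloqua(:post, "/contacts/exports", {
  "name" => "Contact export",
  "fields" => {
    "ContactID"    => "{{Contact.Id}}",
    "EmailAddress" => "{{Contact.Field(C_EmailAddress)}}",
    "FirstName"    => "{{Contact.Field(C_FirstName)}}",
    "LastName"     => "{{Contact.Field(C_LastName)}}"
  }
})

# 2. Create a sync for that export; Eloqua stages the data in the background
sync = eloqua(:post, "/syncs", { "syncedInstanceUri" => export["uri"] })

# 3. Poll the sync until it finishes (or use the callback URL option instead)
until %w[success warning error].include?(sync["status"])
  sleep 10
  sync = eloqua(:get, sync["uri"])
end

# 4. Page through the staged rows with the offset query param
rows, offset = [], 0
loop do
  page = eloqua(:get, "#{sync['uri']}/data?limit=50000&offset=#{offset}")
  rows.concat(page["items"] || [])
  break unless page["hasMore"]
  offset += 50000
end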

How to create large CSV file and send it to front end

I'm building a project where the front end is React and the backend is Ruby on Rails with a Postgres DB. A required functionality is the ability for users to export a large dataset: they get a table view, click "export", and that sends a request to the backend, which should create a CSV file and send it to the front end.
This is the query that displays the data in the table and how it's executed (using find_by_sql):
query = <<-SQL
SELECT * FROM ORDERS WHERE ORDERS.STORE_ID = ? OFFSET ? LIMIT ?
SQL
query_result = Order.find_by_sql([query, store_id.to_i, offset.to_i, 50])
Now whenever the user clicks export, it's going to make a request to the same endpoint, except it'll set a flag to notify the backend that it wants a CSV file, and the limit will be much greater than 50: it could be hundreds of thousands to millions of records.
What is the best way to create a CSV to send to the front end, taking into account that the number of records will be large?
You have a couple of options:
Create a temporary file, use the standard CSV library to populate that temporary file, and then use send_file to dispatch that file to the user (see the sketch after this answer).
Depending on the size of data and/or your server's ability to host large temporary files, spooling to a tempfile might take too long or be otherwise impractical. In that case, you might want to stream the CSV data as it's generated, which is more complicated to set up but lessens the impact on your server.
This article has some well-thought-out steps to set up an interface for streaming data. As a bonus, it also delegates the act of generating the CSV to PostgreSQL itself. This will give you the best possible performance, but at the expense of code readability. However, it should set you on your way.
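As a rough illustration of the first option, a controller action along these lines would work; the model, columns and filename here are placeholders, so substitute your own:

require "csv"
require "tempfile"

def export
  file = Tempfile.new(["orders", ".csv"])
  CSV.open(file.path, "w") do |csv|
    csv << ["id", "store_id", "created_at"]              # header row (illustrative columns)
    Order.where(store_id: params[:store_id].to_i)
         .find_each(batch_size: 1000) do |order|         # pull rows from Postgres in batches
      csv << [order.id, order.store_id, order.created_at]
    end
  end
  send_file file.path, filename: "orders.csv", type: "text/csv"
end

find_each keeps memory flat even for millions of rows, but the request still blocks until the whole file is written, which is where the streaming option described in the linked article comes in.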

Getting more than 1000 documents using folder() in Appian

I am writing an Appian web API to retrieve documents from our Appian system, which will be used to integrate with our other systems.
To this end, I am using the folder() method to get information about the contents of a folder in Appian.
folder(
theCaseFolder,
"documentChildren"
)
The problem I am having is that while this code works most of the time - we have some cases where there are more than 1000 documents stored against the case. I note that the Appian documentation states that:
The documentChildren and folderChildren properties return up to the first 1000 documents or folders, respectively, that are direct children of the selected folder.
My problem is that we have a few cases where there are more than 3000 documents attached to the case. Is there a way to get a list of those child documents, or am I plain out of luck?
In the long term I would suggest storing some information about each document in a separate table in the DB. That way you can query the DB however you wish, either from Appian or with SQL.
In the short term you can get the first 1000, as described in the documentation, and then move them to a subfolder/different folder or delete them. This can be repeated multiple times to get all files from the folder.
Move Document Appian Function

Suitelink Reference Table

I work with Wonderware software. One of the objects used to perform communication between Wonderware and the PLC is called SuiteLink. In it, I have a table defined that has the name of one of my application fields on the left side and the name of the PLC tag providing its value on the right.
Once this is saved and activated (deployed), the PLC tags will feed values into the field attributes in Wonderware.
Does anyone know where this list is saved in the system?
I am working on a web page and want to retrieve this list dynamically so I can have the page updated based on the current live value of the PLC tag being used.
I have looked in the database but could not find it.
C:\ProgramData\Wonderware\DAServer
Then within there you'll have several subfolders for your DA Servers. Open the subfolder to find a *.AAcfg file; your contents are in there in what looks like an XML format. You'll be hunting for all the <DeviceItem> tags.
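If you want to pull that list out programmatically for your web page, something along these lines could parse it; this is only a sketch, and the path, folder name and attribute names are guesses, so check your own .AAcfg file for the actual structure:

require "rexml/document"

# The path and attribute layout below are illustrative -- open your own *.AAcfg
# file to see exactly how the <DeviceItem> entries are structured.
cfg_path = "C:/ProgramData/Wonderware/DAServer/YourDAServer/YourDAServer.AAcfg"
doc = REXML::Document.new(File.read(cfg_path))

REXML::XPath.each(doc, "//DeviceItem") do |item|
  # Dump every attribute of each DeviceItem (field name, PLC tag, etc.)
  pairs = []
  item.attributes.each { |name, value| pairs << "#{name}=#{value}" }
  puts pairs.join("  ")
end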

how to send HL7 message using mirth by reading data from my database

I'm having a problem in sending (creating) an HL7 message using Mirth.
I want to read data from my patient table in SQL Server 2008 and, using that data,
I want to send a message to my destination connector, a File Writer. I want my messages to get saved in the File Writer's output directory.
So far I'm able to generate the message, but the size of the output file in my destination directory keeps increasing as the channel keeps polling.
Have I done something wrong in the transformer mapping?
UPDATE:
The size of the output file in my destination directory IS increasing (my .txt file starts at 1 KB and grows to 900 KB and so on). This is happening because the same data is getting generated again and again, multiple times. For example, my generated message has one (MSH, PID, PV1, ORM) set for one row of data in my database, but the same MSH, PID, PV1 and ORM are getting generated multiple times.
If you are seeing the same data generated in your output directory multiple times, the most likely cause is that you are not doing anything to indicate to your database that a given record has been processed.
For example, if you have 1 record in your database: ["John", "Smith", "12134" ...] on the first poll, you will generate 1 message. If on the second poll you also have a second record ["Fred", "Jones", "98371" ...], you will generate TWO messages - one for John Smith and one for Fred Jones. And so on.
The key is to use the "Run On-Update Statement" of your Database Reader (Source) connector to update the database table you are polling with an indication that a given record has been processed. This ensures that the same record is not processed multiple times.
This requires that your source table have some kind of column to indicate the record has been processed. Mirth will not keep track of this for you - you must do it manually.
You can't have a file reader as a destination, so I assume you mean file writer. You say that "the size of my file in my destination is increasing." Is that a typo? Do you mean NOT increasing?
If it is increasing, then your messages are getting generated and you can view them to start your next round of troubleshooting...
If not, then you should look at the message log in the dashboard to see what is happening on a message-by-message basis - that would be the next place to troubleshoot.
You have to have a way of distinguishing which records to pull from the database by filtering on some sort of status flag or possibly a timestamp. Then, you have to use some sort of On-Update statement to mark those same records as processed.
For example:
Select id, patient, result from results where status_flag='N'
or
Select * from results where status_flag = 'N' and created_date >= '9/25/2012'
Then, in either a transformer step or the On-Update section of your Source, you would do something like:
Update results
set status_flag = 'Y' where id=$(id)
If you do not do something like this and you have Mirth polling at a certain interval, it will just keep pulling the same records over and over.
You have to change your connector type to Database Reader in the source.
You have to change your connector type to File Writer in the destination.
And you can write your data to a file to which you have write access.
While creating the HL7 template, you have to use the following code in the outbound message template:
MSH|^~\&|||
Thanks
Krishna
