XSLT transform/mapping work - what questions will you ask?

I have been given an assignment. It pertains to integration/transformation using XML/XSLT and is deliberately vague. I have been given a sample integration (shown below) and have been tasked with listing several questions I would ask before delivering this design.
The hypothetical integration:
Data Source --> Mapping --> Output
The question is so vague that I couldn't think of much. I am not looking for anyone to plagiarise from, but I am hoping someone could post some sample questions to help me get started.

Pertinent Information
Note: Stack Overflow is not a place for you to cheat on an interview process. I am providing this information for other users who are looking to familiarize themselves with integrations. If you don't already know what questions to ask here, and are applying for an SOA job, you will likely be fired within a month. Dishonesty can cost a business a lot of money, and if you cheat your way into a job, don't be surprised when you get blackballed or, worse, perpetuate a harmful stereotype.
There are a variety of questions you would need to ask before implementing this type of integration. Here are a few things that come to mind.
1. What type of integration is this?
There are a variety of integration paradigms. I would need to know whether it is:
An app- or request-driven orchestration
A scheduled orchestration
A file transfer
A pub/sub subscription
2. Is it invoked or triggered?
An invoked integration is one that begins when it is specifically called. If I had a REST service that returned a list of countries, and you called that service every time a button was clicked, that would be an invocation-based integration.
Integrations can also be trigger-based. Let's say you had a table that stored customers, and you want to send an email whenever a new customer is added to that table. If you set your initial data source (adapter) as a trigger source on a row insert, you could essentially have the integration run without being explicitly invoked. A polling sketch of this idea follows.
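Many platforms implement a row-insert trigger by polling the source table under the hood. Here is a toy sketch of that idea in plain JDBC; the connection string, table, and column names are all hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class CustomerTriggerPoller {
        public static void main(String[] args) throws Exception {
            long lastSeenId = 0; // high-water mark; a real adapter would persist this
            try (Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost/crm", "user", "pass")) {
                while (true) {
                    // Hypothetical customers table; fetch only rows added since the last poll
                    try (PreparedStatement ps = con.prepareStatement(
                            "SELECT id, email FROM customers WHERE id > ? ORDER BY id")) {
                        ps.setLong(1, lastSeenId);
                        try (ResultSet rs = ps.executeQuery()) {
                            while (rs.next()) {
                                lastSeenId = rs.getLong("id");
                                sendWelcomeEmail(rs.getString("email")); // the integration body
                            }
                        }
                    }
                    Thread.sleep(10_000); // poll interval
                }
            }
        }

        private static void sendWelcomeEmail(String email) {
            System.out.println("Would send welcome email to " + email);
        }
    }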
3. What is the data source?
I would need to know if the data source is REST, SOAP, a database (DB2, MySQL, Oracle DB, etc.), a custom adapter, etc. Is the data source adapter the entry point here, or is the initial app adapter not shown?
4. What is the schema definition of the request / response body, and how is it specified?
You have a data source (which appears to be your initial app adapter), then a transformation, and then a response. You can't do any transformation (or build an integration) if you don't know what the input and output will be (with some exceptions). This is really a multi-level question.
How do I specify the request and response? Do I need to draft a JSON Schema or an XSD document? Some platforms let you provide sample XML or JSON and will do their best to generate a schema for you.
What is the request and response content type? The format you use to specify the request/response is not necessarily the content type itself. For example, some platforms let you specify your request body with an XSD while the actual content type is JSON. Is it XML, JSON, plain text, or something else? A small validation sketch follows.
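To make the schema question concrete, here is a minimal sketch that validates a hypothetical request body against a hypothetical XSD using the JDK's built-in validator; the element names are made up for illustration:

    import java.io.StringReader;
    import javax.xml.XMLConstants;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.Schema;
    import javax.xml.validation.SchemaFactory;
    import javax.xml.validation.Validator;

    public class RequestSchemaCheck {
        // Hypothetical contract for the integration's input
        private static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "  <xs:element name='customer'>"
          + "    <xs:complexType><xs:sequence>"
          + "      <xs:element name='name' type='xs:string'/>"
          + "      <xs:element name='zip'  type='xs:string'/>"
          + "    </xs:sequence></xs:complexType>"
          + "  </xs:element>"
          + "</xs:schema>";

        public static void main(String[] args) throws Exception {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            Validator v = schema.newValidator();
            // Throws SAXException if the request body does not match the contract
            v.validate(new StreamSource(new StringReader(
                "<customer><name>Ann</name><zip>02139</zip></customer>")));
            System.out.println("request body is valid");
        }
    }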
5. What about other parameters?
Basically, what does the endpoint look like? Are there query parameters, template parameters, custom header parameters, etc?
6. How is this integration secured?
Is this integration secured using OAuth? If so, what type of tokens are used (JWT, etc.)? Does the integration use basic authentication?
Based on the answers to the previous questions you may then have questions about the mapping. For example, if I was provided a schema definition for the output that had an attribute called "zip", I might ask how they wish to format that (see the sketch below). I wouldn't ask anything about what technology is used for the mapping: firstly, because it's almost always XPath/XSLT; secondly, that isn't something you need to know, it's something you would figure out on your own.
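As a minimal sketch of the kind of mapping that "zip" question probes, here is the JDK's XSLT engine applying a made-up formatting rule (ZIP+4 with a hyphen) to a made-up input shape; both are assumptions you would confirm with the consumer:

    import java.io.StringReader;
    import java.io.StringWriter;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class ZipMappingDemo {
        // Hypothetical source document, just to illustrate the mapping question
        private static final String INPUT =
            "<customer><name>Ann</name><zip>021391234</zip></customer>";

        private static final String XSLT =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "  <xsl:output method='xml' indent='yes'/>"
          + "  <xsl:template match='/customer'>"
          + "    <contact>"
          + "      <fullName><xsl:value-of select='name'/></fullName>"
          // The formatting decision you would confirm: plain ZIP vs ZIP+4
          + "      <zip><xsl:value-of select='concat(substring(zip,1,5), \"-\", substring(zip,6))'/></zip>"
          + "    </contact>"
          + "  </xsl:template>"
          + "</xsl:stylesheet>";

        public static void main(String[] args) throws Exception {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(INPUT)), new StreamResult(out));
            System.out.println(out); // <contact>...<zip>02139-1234</zip></contact>
        }
    }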

Related

How to enable Watson conversation service to use your own database for serving users' requests

I want to build a smart search agent which would use Watson Conversation to process the request and give a response, but which would use my own database, say SQL Server, to search for the desired output.
In short: instead of writing intents and dialogs manually or importing them from a CSV file, I want to write my own code in .NET in such a way that all the requests and responses are influenced by my own data stored in my database. I only intend to use Watson's processing and interpreting capability, but the processing must happen on my data.
E.g., if the user searches for a word, say "Dog", the Watson Conversation service must search my database and give relevant answers to the user based on the search.
Take a look at the solution architecture in the Watson Conversation documentation. Your database would be one of the depicted backend systems. Your application would be written in .NET, as you mentioned, and would use WCS to process the user input. WCS would return a response with all the associated metadata. Instead of having complete answers configured in a dialog, you would use something I have described as "replaced markers" in my collection of examples. Those markers are hints to your application about which database query or which action to perform.
Note that WCS requires some intents and entities to work on. If you want to rely just on the detected intents and entities, you could work with one or two generic dialog nodes. As another technique you could use data from your database to generate intents and entities as an initial setup. In my "Mutating EgoBot" I use the Watson Conversation API to add intents and entities on the fly.
I believe you should use the standard trick:
Instead of defining responses in the nodes of your dialog, define an action on the output object of the node and let your application take care of providing the response (see https://console.bluemix.net/docs/services/conversation/develop-app.html#building-a-client-application).
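Here is a rough sketch of that client-side dispatch, in Java for brevity (the question mentions .NET, but the pattern is the same). The "action" field name and the database lookup are hypothetical; the real output shape is whatever you define on your dialog nodes:

    import org.json.JSONObject;

    public class ActionDispatcher {
        // Takes the raw WCS response and either returns the configured text
        // or runs the database query hinted at by the "action" marker.
        public String respond(String watsonResponseJson) {
            JSONObject output = new JSONObject(watsonResponseJson).getJSONObject("output");
            String action = output.optString("action", ""); // hypothetical marker set on the dialog node
            if (action.isEmpty()) {
                return output.getJSONArray("text").getString(0); // canned answer from the dialog
            }
            switch (action) {
                case "lookup_term":
                    // Hypothetical DAO call, e.g. SELECT ... FROM articles WHERE topic = ?
                    return searchDatabase(output.optString("term"));
                default:
                    return "Sorry, I don't know how to handle: " + action;
            }
        }

        private String searchDatabase(String term) {
            return "results for " + term; // stand-in for the real SQL Server query
        }
    }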

Check Site URL which fills data in Report Suite in SiteCatalyst (Omniture)

This question may seem odd, but we have a slight mixup within our report suites on Omniture (SiteCatalyst). Multiple report suites are generating analytics, and it's hard for us to find which site URLs are contributing to the results.
Hence my question: is there any way we can find which site is filling data within a certain report suite?
Using the following JS, I am able to find which report suite is being used by a certain site, though:
javascript:void(window.open("","dp_debugger","width=600,height=600,location=0,menubar=0,status=1,toolbar=0,resizable=1,scrollbars=1").document.write("<script language=\"JavaScript\" id=dbg src=\"https://www.adobetag.com/d1/digitalpulsedebugger/live/DPD.js\"></"+"script>"));
But I am hoping to find the other way around: where a report suite gets its data from, within the SiteCatalyst admin.
Any assistance?
Thanks
Adobe Analytics (formerly SiteCatalyst) does not have anything native or built in to globally look at all incoming data to see which page/site is sending data to which report suite. However, you can contact Adobe ClientCare and request raw hit logs for a date range, and you can parse those logs yourself, if you really want.
Alternatively, if you have Data Warehouse access, you can export URLs and domains from there for a given date range. You can only select one report suite at a time, but that's better than nothing if you really need the historical data now.
Another alternative: if your sites are NOT currently setting s.pageName, you may be in some measure of luck for your historical data. The Pages report is popped from the s.pageName value; if you do not set that variable, it defaults to the URL of the web page that made the request. So, at a minimum, you will be able to see your URLs in that report right now, which should help you out. And if you define "site" as equivalent to "domain" (location.hostname), you can also set up a classification level on pages for domain, and then use the Classification Rule Builder with a regular expression to pop the classification with the domain, which will give you some aggregated numbers.
Some suggestions moving forward...
A good strategy moving forward is to have all of your sites report to a global report suite. Then you can have each site also send data to a site-level report suite (warning: make sure you have enough server calls in your contract to cover this, since AA does not have unlimited server calls). Alternatively, you can stick with one global report suite and set up segments for each site. Another alternative is to create a rollup report suite for all the data from your other report suites to also go to. Rollup report suites do not have as many features as standard report suites, but for basic things such as pages and page views, they work.
The overall point though is that one way or the other, you should have all of your data go into one report suite as the first step.
Then, you should also assign a few custom variables to be output on the pages of all your sites. These are the 4 main things I always try to include in an implementation to make it easier to find out which sites/pages are reporting to what.
A custom variable to identify the site. Some people use s.server for this. However, you may also want to pop a prop or eVar with the value, depending on how you'd like to break the data down. The big question here is: how do you define "site"? I have seen it defined many different ways.
If you do NOT define "site" as domain (e.g. location.hostname), then I suggest you pop a prop and eVar with the domain, because AA does not have a native report for this. But if you do, then you can skip this, since it's the same thing as point #1.
A custom prop and eVar with the report suite(s). Unless you have a super old version of legacy code, just set it with s.sa(). This will ensure you get the final report suite(s), in case you happen to use a version that uses dynamic account variables (e.g. s.dynamicAccountList).
If you set s.pageName with a custom value, then I suggest you pop a prop and eVar with the URL. Tip: to save on request URL length to AA, you can use dynamic variable syntax to copy the g parameter already present in a given AA request. For example (assuming you don't have code that changes the dynamic variable prefix): s.prop1='D=g'; Or, you can pop this with a processing rule if you have the access.
You can normally find this sort of information in the Site Content -> Servers report. There will be information in there that indicates what sites are sending in the hits. Your mileage may vary based on the actual tagging implementation; it is not common for anyone to explicitly set the server, so the implicit value is the domain the hit is coming in from.

Web Load Test with MVC route parameters creates many instance URLs

I have a web test for the following request:
{{WebServer1}}/Listing/Details/{{Active Listing IDs.AWE2_RandomActiveAuctionIds#csv.ListingID}}
And I receive the error message:
The maximum number of unique Web test request URLs to report on has been exceeded; performance data for other requests will not be available
because there are thousands of unique URLs (since I'm testing different values in the URL). Does anyone know how to fix this?
There are a few features within Visual Studio (2010 and above) that will help with this problem. I am speaking for 2010 specifically, and assuming that the same or similar options are available in later versions as well.
1. Maximum Request URLs Reported:
This is a general option available in the Run Settings properties of the load test. The default value set here is 1,000. This default value is usually sufficient... but not always high enough. Depending on the load test size, it may be necessary to increase this. Based on personal experience, if you are thinking about adjusting this value, go through your tests first and get an idea of how many requests you are expecting to see reported. For me, a rough guideline that is helpful:
total_estimated_requests = number_of_requests_in_webtest × number_of_users_in_load_test
If your load test has multiple web tests in it, adjust the above accordingly: figure out the number of requests in each individual test, sum them up, and multiply by the number of users. For example, two web tests with 10 and 15 requests, run with 200 virtual users, gives (10 + 15) × 200 = 5,000 estimated requests, well above the default limit.
This option is more appropriate for large load tests that reference several web tests and/or have a very high user count. One reference for a problem/resolution use case of this option is here:
https://social.msdn.microsoft.com/Forums/en-US/ffc16064-c6fc-4ef7-93b0-4decbd72167e/maximum-number-of-unique-web-test-requests-exceeded?forum=vstswebtest
In the specific case mentioned in the originally posted question, this would not resolve the problem entirely. In fact, it could create a new one, where you end up running out of virtual memory. Visual Studio will continue to create new request metrics and counters for every single request that has a unique AWE2_RandomActiveAuctionIds.
2. Reporting Name:
Another option, which @AdrianHHH already touched on, is the "Reporting Name" option. This option is found in the request properties, inside the web test. It defaults to nothing, which in turn results in Visual Studio trying to create the name that it will use from the request itself. This behavior creates the issue you are experiencing.
This option is the one that will directly resolve the issue of a new reported request being created for every unique URL.
If you have a good idea of the expected number of requests to be seen in the load test (and I think it is a good idea to know this information, for debugging purposes, when running into this exception) a debugging step would be to set the "Maximum Request URLs Reported" DOWN to that value. This would force the exception you are seeing to pop up more quickly. If you see it after adjusting this value, then there is likely a request that is having a new reported value generated each time a virtual user executes the test.
Speaking from experience, this debugging step can save you some time and hair when dealing with requests that contain sessionId, GUID, and other similar types of information in them. Unless you are explicitly defining a Reporting Name for every single request... it can be easy to miss a request that has dynamic data in it.
3. Record Results:
Depending on the request, and its importance to your test, you can opt to remove it from your test results by setting this value to false. It is accessed under the request properties, within the web test itself. I personally do not use this option, but it may also be used to directly resolve the issue you are experiencing, given that it would just remove the request from the report all together.
A holy-grail-of-sorts document can be found on CodePlex that covers the Reporting Name option in a small amount of detail. At the time of writing this answer, the following link can be used to get to it:
http://vsptqrg.codeplex.com/releases/view/42484
I hope this information helps.

how to make web search in grails

Hi, I am a student doing my academic project. I need some guidance in completing it.
My project is based on the Grails framework: it searches for books from 3 different bookstores and gives the price from all 3 stores. I need help with the searching part:
how to direct the search to those bookstores once the user types in the required book.
thanks in advance
You need to give more details. By searching bookstores, do you mean searching in a database or are these like Amazon etc?
I would find out if these online bookstores have APIs, or, if you have a choice, select online bookstores that do have APIs you can use to do your searching. For example, Amazon has a "Product Advertising API" that can be used to search its catalogue (see http://docs.amazonwebservices.com/AWSECommerceService/latest/DG). You usually have to register as an affiliate to get access to these sorts of things.
Once you have several online bookstores that are accessible via APIs, it is relatively easy to write some Grails code to call them and coordinate the results. APIs usually take the form of web requests, either REST or SOAP (e.g. see Amazon - AnatomyOfaRESTRequest). Groovy's HTTPBuilder can be used to call and consume the bookstores' API web services if you can use simple REST, or I believe there are a couple of Grails plugins (e.g. REST Client builder). For SOAP, consider the Grails CXF Client plugin. A plain-JDK sketch of such a call follows.
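For illustration, here is a call to a hypothetical bookstore search API using the JDK's HttpClient (Java 11+); Groovy's HTTPBuilder would make this terser, and the endpoint and parameters are made up:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class BookstoreSearch {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // Hypothetical REST endpoint; real stores (e.g. Amazon) each have their own scheme
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example-bookstore.com/search?title=grails+in+action"))
                .header("Accept", "application/json")
                .GET()
                .build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body()); // parse JSON and coordinate with the other stores' results
        }
    }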
You could do the searches on the APIs one by one, or if you want to get more advanced, you could try calling all 3 APIs at the same time asynchronously using the new servlet 3.0 async feature (see how to use from Grails 2.0.x: Grails Web Features - scroll to "Servlet 3.0 Async Features"). You would probably need to coordinate this via the DB, and perhaps poll through AJAX on your result page to check when results come in.
So the sequence would be as follows (a bare-servlet sketch of the kick-off step appears after the list):
User submits search request from a form on a page to the server
Server creates and saves a DB object to track requests, kicks off API calls asynchronously (i.e. so the request is not blocked), then returns a page back to the user.
The "pending results" page is shown to user and a periodic AJAX update is used to check the progress of results.
Meanwhile your API calls are executing. When they return, hopefully with results, they update the DB object (or better, a related object) to store the results and status of the call.
Eventually all your results will be in the DB, and your periodic AJAX check to the server which is querying the results will be able to return them to the page. It could wait for all of the calls to the 3 bookstores to finish or it could update the page as and when it gets results back.
Your AJAX call updates the page to show the results to the user.
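The kick-off step (2) can be sketched at the plain servlet level with an executor; the DB tracking and the bookstore calls are stubbed out here, and in Grails you would use its async/events machinery rather than a raw servlet:

    import java.io.IOException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet("/search")
    public class SearchServlet extends HttpServlet {
        private final ExecutorService pool = Executors.newFixedThreadPool(9); // 3 stores x a few searches

        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            String title = req.getParameter("title");
            long searchId = createSearchRecord(title); // hypothetical DB row tracking this search

            // Step 2: kick off one task per bookstore, then return immediately
            for (String store : new String[] {"storeA", "storeB", "storeC"}) {
                pool.submit(() -> saveResult(searchId, store, callStoreApi(store, title)));
            }

            // Step 3: "pending results" page; its AJAX polls a status action for searchId
            resp.sendRedirect("pending?id=" + searchId);
        }

        // Stubs standing in for real GORM/JDBC and HTTP code
        private long createSearchRecord(String title) { return 42L; }
        private String callStoreApi(String store, String title) { return "{\"results\":[]}"; }
        private void saveResult(long id, String store, String json) { }
    }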
Note that if a bookstore doesn't have an API, you might have to consider "web scraping" the results straight from the bookstore's website. This is a bit harder and can be quite brittle, since web pages obviously change frequently. I have used Geb (http://www.gebish.org/) to automate the browsing, along with some simple string matching to pick out the things I needed. Also remember to check the terms and conditions of the website involved, since sometimes scraping is specifically not allowed.
Also note that the above is a server-oriented way of accomplishing this kind of thing. You could do it purely on the client (browser), calling out to the web services using AJAX and processing via JavaScript. But I'm a server man :)

Is there a tool to automate/stress POST calls to my site for testing?

I would like to stress (not sure this is the right word, but keep reading) the [POST] actions of my controllers. Does a tool exist that would generate many scenarios, like omitting fields, adding some, generating valid and invalid values, injecting attacks, and so on? Thx
Update: I don't want to benchmark/performance test my site. Just automatically filling/tampering forms and see what happens
WebInspect from SPI Dynamics (HP bought them).
I've used this one in my previous job (I recommended it to my employer at the time) and I was overwhelmed with the amount of info and testing I could do with it.
https://download.spidynamics.com/webinspect/default.htm
Apache JMeter is more likely to benchmark/stress itself rather than your site. I was recently pointed towards Faban, which can be used for very simple and more complex tests and scenarios, and is very performant. Also, take a look at OpenSTA and WebLoad, both free and powerful, with capabilities to record and replay complex scenarios.
Apache JMeter might fit the bill?
Have you seen CrossBow Web Stress Tester over at CodePlex?
Supports GET and POST operations
You specify the number of threads, requests, waits, and timeouts
Reads a txt file with name/value pairs for posting values
You'd have to download and modify the source if you wanted to generate random data for your POST variables.
Build one yourself using WebClient with Html Agility Pack. Use the Agility Pack to parse your HTML and get the form, then just randomly fill the fields and make POSTs.
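If you are not tied to .NET, the same parse-the-form, fill-the-fields, POST loop can be sketched in Java with jsoup; the URL here is hypothetical and the "fuzzing" is just random strings:

    import java.util.UUID;
    import org.jsoup.Connection;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public class FormFuzzer {
        public static void main(String[] args) throws Exception {
            // Hypothetical page under test
            Document page = Jsoup.connect("http://localhost:8080/contact").get();
            Element form = page.selectFirst("form");
            String action = form.absUrl("action");

            Connection post = Jsoup.connect(action).method(Connection.Method.POST);
            for (Element input : form.select("input[name]")) {
                // Naive fuzzing: random junk in every field; extend with omitted fields,
                // boundary values, attack payloads, etc.
                post.data(input.attr("name"), UUID.randomUUID().toString());
            }
            Document result = post.execute().parse();
            System.out.println(result.title()); // inspect what came back
        }
    }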
