Check Site URL which fills data in Report Suite in SiteCatalyst (Omniture) - adobe-analytics

This question may seem odd, but we have a slight mixup within our Report Suites on Omniture (SiteCatalyst). Multiple Report Suites are generating analytics and it's hard for us to find which site URL is contributing to the results.
Hence my question: is there any way we can find which site is filling data within a certain Report Suite?
With the following JS, I am able to find which "report suite" is being used by a certain site:
javascript:void(window.open("","dp_debugger","width=600,height=600,location=0,menubar=0,status=1,toolbar=0,resizable=1,scrollbars=1").document.write("<script language=\"JavaScript\" id=dbg src=\"https://www.adobetag.com/d1/digitalpulsedebugger/live/DPD.js\"></"+"script>"));
But I am hoping to go the other way around: to find, from within the SiteCatalyst admin, where a Report Suite gets its data from.
Any assistance?
Thanks

Adobe Analytics (formerly SiteCatalyst) does not have anything native or built in to globally look at all data coming to see which page/site is sending data to which report suite. However, you can contact Adobe ClientCare and request raw hit logs for a date range, and you can parse those logs yourself, if you really want.
Alternatively, if you have Data Warehouse access, you can export urls and domains from there for a given date range. You can only select one report suite at a time but that's also better than nothing, if you really need the historical data now.
Another alternative is if your sites are NOT currently setting s.pageName, then you may be in some measure of luck for your historical data. The pages report is populated from the s.pageName value. If you do not set that variable, it will default to the URL of the web page that made the request. So, at a minimum, you will be able to see your URLs in that report right now, and that should help you out. And if you define "site" as equivalent to "domain" (location.hostname), you can also set up a classification level for pages for domain and then use the Classification Rule Builder and a regular expression to populate the classification with the domain, which will give you some aggregated numbers.
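As a rough illustration of the regular expression idea above, here is a minimal sketch that pulls the domain out of page values that are full URLs and sums page views per domain. The pattern, the function names and the sample rows are assumptions made for illustration; the Classification Rule Builder has its own interface, so treat this only as a way to try out a pattern before using it there.

```javascript
// Hypothetical pattern: capture the host part of an http(s) URL.
const DOMAIN_PATTERN = /^https?:\/\/([^\/:?#]+)/i;

// Returns the lowercased domain, or null when the page value is not a URL
// (for example when s.pageName was set to a custom name).
function domainFromPageValue(pageValue) {
  const match = DOMAIN_PATTERN.exec(pageValue);
  return match ? match[1].toLowerCase() : null;
}

// Aggregates page views per domain from rows shaped like { page, pageViews }.
function pageViewsByDomain(rows) {
  const totals = {};
  for (const { page, pageViews } of rows) {
    const domain = domainFromPageValue(page) || "(no domain)";
    totals[domain] = (totals[domain] || 0) + pageViews;
  }
  return totals;
}

// Made-up rows standing in for an exported pages report.
const sampleRows = [
  { page: "http://www.example.com/home", pageViews: 10 },
  { page: "https://shop.example.com/cart?id=1", pageViews: 4 },
  { page: "http://www.example.com/about", pageViews: 6 },
];
console.log(pageViewsByDomain(sampleRows));
```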
Some suggestions moving forward...
A good strategy moving forward is to have all of your sites report to a global report suite. Then, you can have each site also send data to a site-level report suite (warning: make sure you have enough server calls in your contract to cover this, since AA does not have unlimited server calls). Alternatively, you can stick with one global report suite and set up segments for each site. Another alternative is to create a rollup report suite that all data from your other report suites also goes to. Rollup report suites do not have as many features as standard report suites, but for basic things such as pages and page views, they work.
The overall point though is that one way or the other, you should have all of your data go into one report suite as the first step.
Then, you should also assign a few custom variables to be output on the pages of all your sites. These are the 4 main things I always try to include in an implementation to make it easier to find out which sites/pages are reporting to what.
1. A custom variable to identify the site. Some people use s.server for this. However, you may also want to pop a prop or eVar with the value as well, depending on how you'd like to be able to break data down. The big question here is: how do you define "site"? I have seen it defined many different ways.
2. If you do NOT define "site" as domain (e.g. location.hostname), then I suggest you pop a prop and eVar with the domain, because AA does not have a native report for this. But if you do, then you can skip this, since it's the same thing as point #1.
3. A custom prop and eVar with the report suite(s). Unless you have a super old version of legacy code, just set it with s.sa(). This will ensure you get the final report suite(s), in case you happen to use a version that uses Dynamic Account variables (e.g. s.dynamicAccountList).
4. If you set s.pageName with a custom value, then I suggest you pop a prop and eVar with the URL. Tip: to save on request URL length to AA, you can use dynamic variable syntax to copy the g parameter already in a given AA request. For example (assuming you don't have code that changes the dynamic variable prefix): s.prop1='D=g'; Or, you can pop this with a processing rule if you have the access.
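A minimal sketch of what setting those variables could look like. The prop/eVar numbers are hypothetical slots, the function name is made up, and reading the report suite from s.account is an assumption of this sketch (the answer above suggests s.sa() instead); check your own implementation before copying any of it.

```javascript
// Sketch only: `s` is your tracking object, `loc` is something shaped like
// window.location. prop1-3 / eVar1-3 are hypothetical slots, pick your own.
function tagSiteContext(s, loc) {
  // Site identifier, here assumed to be the domain.
  s.server = loc.hostname;

  // URL of the page, copied from the g parameter with dynamic variable syntax.
  s.prop1 = "D=g";
  s.eVar1 = "D=g";

  // Domain, useful when "site" is NOT defined as the domain.
  s.prop2 = loc.hostname;
  s.eVar2 = loc.hostname;

  // Report suite(s). Assumption: the ID(s) are available on s.account.
  s.prop3 = s.account;
  s.eVar3 = s.account;

  return s;
}

// Usage with stand-in objects, so it runs outside a browser too.
const demo = tagSiteContext(
  { account: "globalsuite,sitesuite" },
  { hostname: "www.example.com" }
);
console.log(demo);
```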

You can normally find this sort of information in the Site Content -> Servers report. There will be information in there that indicates which sites are sending in the hits. Your mileage may vary based on the actual tagging implementation. It is not common for anyone to explicitly set the server, so the implicit value is the domain the hit is coming in from.
Thanks C.

Related

XSLT transform/mapping work - what questions will you ask?

I have been given an assignment. It pertains to integration/transformation using XML/XSLT and is deliberately vague. I have been given a sample integration (shown below) and I have been tasked with listing several questions I would ask before delivering this design.
The hypothetical integration
Data Source --> Mapping ---> output
The question is so vague I couldn't think much. I am not looking for anyone to plagiarise from, but I am hoping someone could post some sample questions to help me get started.
Pertinent Information
Note: Stack Overflow is not a place for you to cheat on an interview process. I am providing this information for other users who are looking to familiarize themselves with integrations. If you don't already know what questions to ask here, and are applying for an SOA job, you will likely be fired within a month. Dishonesty can cost a business a lot of money, and if you cheat your way into a job don't be surprised when you get blackballed or worse - perpetuate a harmful stereotype.
There are a variety of questions you would need to ask before implementing this type of integration. Here are a few things that come to mind.
1. What type of integration is this?
There are a variety of different integration paradigms. I would need to know if it is
An app/request driven orchestration
A scheduled orchestration
A file transfer
A pub/sub subscription
2. Is it invoked or triggered?
An invoked integration is one that begins when it is specifically called. If I had a REST service that returned a list of countries, and you called that service every time a button was clicked, that would be an invocation-based integration.
Integrations can also be trigger-based. Let's say you had a table that stored customers. You want to send an email whenever a new customer is added to that table. If you set your initial data source (adapter) as a trigger source on a row insert, you could essentially have the integration run without it being explicitly invoked.
3. What is the data source?
I would need to know if the data source is REST, SOAP, a database (DB2, MySQL, Oracle DB, etc.), a custom adapter, etc. Is the data source adapter the entry point here, or is the initial app adapter not shown?
4. What is the schema definition of the request / response body, and how is it specified?
You have a data source (which appears to be your initial app adapter), then you have a transformation, and a response. You can't do any transformation (or build an integration) if you don't know what the input / output will be (with some exceptions). This is really a multi level question.
How do I specify the request and response? Do I need to draft a JSON Schema or XSD document? Some platforms allow you to specify XML or JSON and they will do their best to generate a schema for you.
What is the request and response content type? You can specify the request / response in whatever format is acceptable, but that doesn't necessarily mean that is the request / response type. For example some platforms let you specify your request body with an XSD but the content type is actually JSON. Is it XML, JSON, Plain Text, other?
5. What about other parameters?
Basically, what does the endpoint look like? Are there query parameters, template parameters, custom header parameters, etc?
6. How is this integration secured?
Is this integration secured using OAuth? If so, what type of tokens are used (JWT, etc.)? Does the integration use basic authentication?
Based on the answers to the previous questions, you may then have questions about the mapping. For example, if I was provided a schema definition for the output that had an attribute called "zip", I might ask how they wish to format that, etc. I wouldn't ask anything about what technology is used for the mapping: firstly, because it's almost always XPath/XSLT; secondly, because that isn't something you need to know, it's something you would figure out on your own.

Web Load Test with MVC route parameters creates many instance URL's

I have a web test for the following request:
{{WebServer1}}/Listing/Details/{{Active Listing IDs.AWE2_RandomActiveAuctionIds#csv.ListingID}}
And I receive the error message:
The maximum number of unique Web test request URLs to report on has been exceeded; performance data for other requests will not be available
because there are thousands of unique URLs (because I'm testing for different values in the URL). Does anyone know how to fix this?
There are a few features within Visual Studio (2010 and above) that will help with this problem. I am speaking for 2010 specifically, and assuming that the same or similar options are available in later versions as well.
1. Maximum Request URLs Reported:
This is a General Option available in the Run Setting Properties of the load test. The default value set here is 1,000. This default value is usually sufficient... but not always high enough. Depending on the load test size, it may be necessary to increase this. Based on personal experience, if you are thinking about adjusting this value, go through your tests first and get an idea of how many requests you are expecting to see reported. For me, a rough guideline that is helpful:
number_of_requests_in_webtest * number_of_users_in_load_test = total_estimated_requests
If your load test has multiple web tests in it, adjust the above accordingly by figuring out the number of requests in each individual test, summing those values, and multiplying by the number of users.
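The rough guideline above can be worked through in a few lines. This is only a sketch of the arithmetic; the function name and the sample numbers are made up for illustration.

```javascript
// requestsPerWebTest: number of requests in each web test of the load test.
// virtualUsers: number of users in the load test.
function estimateReportedRequests(requestsPerWebTest, virtualUsers) {
  const requestsPerUser = requestsPerWebTest.reduce((sum, n) => sum + n, 0);
  return requestsPerUser * virtualUsers;
}

// Two web tests with 12 and 8 requests, run by 50 users: (12 + 8) * 50.
const estimate = estimateReportedRequests([12, 8], 50);
console.log(estimate); // 1000
```

An estimate at or above the default of 1,000 suggests the setting may need to be raised for that load test.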
This option is more appropriate for large load tests that are referencing several web tests and/or have a very high user count. One reference for a problem|resolution use-case of this option is here:
https://social.msdn.microsoft.com/Forums/en-US/ffc16064-c6fc-4ef7-93b0-4decbd72167e/maximum-number-of-unique-web-test-requests-exceeded?forum=vstswebtest
In the specific case mentioned in the originally posted question, this would not resolve the problem entirely. In fact, it could create a new one, where you end up running out of virtual memory. Visual Studio will continue to create new request metrics and counters for every single request that has a unique AWE2_RandomActiveAuctionIds.
2. Reporting Name:
Another option, which @AdrianHHH already touched on, is the "Reporting Name" option. This option is found in the request properties, inside the web test. It defaults to nothing, which in turn results in Visual Studio trying to create the name that it will use from the request itself. This behavior creates the issue you are experiencing.
This option is the one that will directly resolve the issue of a new request being reported for every unique request URL.
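To see why a fixed reporting name collapses thousands of entries into one, here is a small sketch that groups URLs the way a shared name would. It is not how Visual Studio works internally; the placeholder name, the host and the helper functions are assumptions for illustration.

```javascript
// Replace purely numeric path segments with a placeholder, so that
// /Listing/Details/101 and /Listing/Details/102 share one name.
function reportingNameFor(url) {
  const path = new URL(url).pathname;
  return path.replace(/\/\d+(?=\/|$)/g, "/{id}");
}

// Counts how many distinct entries a report would contain.
function countReportedEntries(urls, nameFor) {
  return new Set(urls.map(nameFor)).size;
}

// Hypothetical host standing in for {{WebServer1}}.
const urls = [
  "http://webserver1.example/Listing/Details/101",
  "http://webserver1.example/Listing/Details/102",
  "http://webserver1.example/Listing/Details/103",
];

console.log(countReportedEntries(urls, (u) => u));      // one entry per URL
console.log(countReportedEntries(urls, reportingNameFor)); // one shared entry
```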
If you have a good idea of the expected number of requests to be seen in the load test (and I think it is a good idea to know this information, for debugging purposes, when running into this exception) a debugging step would be to set the "Maximum Request URLs Reported" DOWN to that value. This would force the exception you are seeing to pop up more quickly. If you see it after adjusting this value, then there is likely a request that is having a new reported value generated each time a virtual user executes the test.
Speaking from experience, this debugging step can save you some time and hair when dealing with requests that contain sessionId, GUID, and other similar types of information in them. Unless you are explicitly defining a Reporting Name for every single request... it can be easy to miss a request that has dynamic data in it.
3. Record Results:
Depending on the request, and its importance to your test, you can opt to remove it from your test results by setting this value to false. It is accessed under the request properties, within the web test itself. I personally do not use this option, but it may also be used to directly resolve the issue you are experiencing, given that it would just remove the request from the report altogether.
A holy-grail-of-sorts document can be found on Codeplex that covers the Reporting Name option in a small amount of detail. At the time of writing this answer, the following link can be used to get to it:
http://vsptqrg.codeplex.com/releases/view/42484
I hope this information helps.

Why would Google Search use client-side URL parameters?

Yesterday morning I noticed Google Search was using hash parameters:
http://www.google.com/#q=Client-side+URL+parameters
which seems to be the same as the more usual search (with search?q=Client-side+URL+parameters). (It seems they are no longer using it by default when doing a search using their form.)
Why would they do that?
More generally, I see hash parameters cropping up on a lot of web sites. Is it a good thing? Is it a hack? Is it a departure from REST principles? I'm wondering if I should use this technique in web applications, and when.
There's a discussion by the W3C of different use cases, but I don't see which one would apply to the example above. They also seem undecided about recommendations.
Google has many live experimental features that are turned on/off based on your preferences, location and other factors (probably random selection as well.) I'm pretty sure the one you mention is one of those as well.
What happens in the background when a hash is used instead of a query string parameter is that the page queries the "real" URL (http://www.google.com/search?q=hello) using JavaScript, then modifies the existing page with the content. This will appear much more responsive to the user since the page does not have to reload entirely. The reason for the hash is so that browser history and state are maintained. If you go to http://www.google.com/#q=hello you'll find that you actually get the search results for "hello" (even if your browser is really only requesting http://www.google.com/). With JavaScript turned off, however, it wouldn't work, and you'd just get the Google front page.
Hashes are appearing more and more as dynamic web sites are becoming the norm. Hashes are maintained entirely on the client and therefore do not incur a server request when changed. This makes them excellent candidates for maintaining unique addresses to different states of the web application, while still being on the exact same page.
I have been using them myself more and more lately, and you can find one example here: http://blixt.org/js -- If you have a look at the "Hash" library on that page, you'll see my implementation of supporting hashes across browsers.
Here's a little guide for using hashes for storing state:
How?
Maintaining state in hashes implies that your application (I'll call it application since you generally only use hashes for state in more advanced web solutions) relies on JavaScript. Without JavaScript, the only function of hashes would be to tell the browser to find content somewhere on the page.
Once you have implemented some JavaScript to detect changes to the hash, the next step would be to parse the hash into meaningful data (just as you would with query string parameters.)
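A minimal sketch of that parsing step, treating the hash like a query string. The function name is made up, and the optional "?" after the "#" is handled only because one of the example URLs in this answer uses that form.

```javascript
// Turns "#q=hello&page=2" (or "#?q=hello") into { q: "hello", page: "2" }.
function parseHash(hash) {
  const raw = hash.replace(/^#\??/, ""); // drop leading "#" and optional "?"
  const state = {};
  for (const [key, value] of new URLSearchParams(raw)) {
    state[key] = value;
  }
  return state;
}

console.log(parseHash("#q=Client-side+URL+parameters"));
console.log(parseHash("#?state=7001"));

// In a browser you would typically re-parse whenever the hash changes, e.g.
//   window.addEventListener("hashchange", () => render(parseHash(location.hash)));
// where render() is a hypothetical function of your application.
```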
Why?
Once you've got the state in the hash, it can be modified by your code (or your user) to represent the current state in your application. There are many reasons for why you would want to do this.
One common case is when only a small part of a page changes based on a variable, and it would be inefficient to reload the entire page to reflect that change (Example: You've got a box with tabs. The active tab can be identified in the hash.)
Other cases are when you load content dynamically in JavaScript, and you want to tell the client what content to load (Example: http://beta.multifarce.com/#?state=7001, will take you to a specific point in the text adventure.)
When?
If you had a look at my "JavaScript realm" you'll see a border-line overkill case. I did it simply because I wanted to cram as much JavaScript dynamics into that page as possible. In a normal project I would be conservative about when to do this, and only do it when you will see positive changes in one or more of the following areas:
User interactivity
Usually the user won't see much difference, but the URLs can be confusing
Remember loading indicators! Loading content dynamically can be frustrating to the user if it takes time.
Responsiveness (time from one state to another)
Performance (bandwidth, server CPU)
No JavaScript?
Here comes a big deterrent. While you can safely rely on 99% of your users to have a browser capable of using your page with hashes for state, there are still many cases where you simply can't rely on this. Search engine crawlers, for example. While Google is constantly working to make their crawler work with the latest web technologies (did you know that they index Flash applications?), it still isn't a person and can't make sense of some things.
Basically, you're at a crossroads between compatibility and user experience.
But you can always build a road in between, which of course requires more work. In less metaphorical terms: implement both solutions, so that there is a server-side URL for every client-side URL that outputs relevant content. For compatible clients, it would redirect them to the hash URL. This way, Google can index "hard" URLs, and when users click them, they get the dynamic state stuff!
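One way to picture the redirect for compatible clients is a small mapping from the "hard" URL to its hash equivalent. This sketch assumes the dynamic application lives at the site root and that the query string carries the whole state; both are assumptions, and the function name is made up.

```javascript
// Maps e.g. http://www.google.com/search?q=hello to http://www.google.com/#q=hello
function toHashUrl(serverUrl) {
  const url = new URL(serverUrl);
  const params = url.search.replace(/^\?/, ""); // query string without "?"
  return `${url.origin}/#${params}`;
}

console.log(toHashUrl("http://www.google.com/search?q=hello"));

// In a browser, a script on the server-rendered page could then do
//   location.replace(toHashUrl(location.href));
// while clients without JavaScript simply keep the server-rendered content.
```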
Recently Google also stopped serving direct links in search results, offering redirects instead.
I believe both have to do with gathering usage statistics, what searches were performed by the same user, in what sequence, what of the search results the user has followed etc.
P.S. Now, that's interesting, direct links are back. I absolutely remember seeing there only redirects in the last couple of weeks. They are definitely experimenting with something.

Sticky notes associated with web page - how to?

I have this idea for a project. Associated with any web page, I want to create notes that will be saved locally in a database. The notes will be reloaded automatically from that database the next time I visit the same page.
Creating the note is easy, but I'm looking for how to link the notes to the web page URL and how to keep track of the active web page. Any ideas?
(Note: I came across this while searching the internet: http://webkit.org/demos/sticky-notes/, which is part of the WebKit open source project. This is about what I'm looking for.)
Thanks.
Browser-dependent, probably. You'll have to have a plugin for every browser type.
IE might be doable via the COM interface, but that probably would require starting IE via a way you control. So that probably will have to be a plugin too.
For browser independence, there are quite a few challenges in this one. One way would be to implement a proxy server and watch for text/html content....this will work for most of the general cases, but not every case. Handling frames for instance... which resource is the "parent" and which is the "child"? Which one contains the sticky note? I think you would have to inject some client side javascript to keep track of things, and that might break some websites.
protonotes.com is a web service version of this. Not sure how they do it though.
Actually, Daniel H hit the nail on the head mate: http://www.protonotes.com
It does exactly as you want, in fact it gives you two options to store your data, the first is hosted, the second is your own mySQL db - protonotes pipes the data from the tack-on style notes to your own db, if you prefer. This means that you're not the only person who can see the notes - access is granted by a unique 'group' key.
I've just deployed protonotes as our main online review tool for two reasons, we can save our own data, and it lacks some features which I generally label "dubious" anyway.
Its simplicity is great. The only thing I'm aware of that could cause a problem is that it dumps a bunch of stuff in the global namespace, if that's a potential problem for you.

Is there a tool to automate/stress POST calls to my site for testing?

I would like to stress (not sure this is the right word, but keep reading) the [POST] actions of my controllers. Does a tool exist that would generate many scenarios, like omitting fields, adding some, generating valid and invalid values, injecting attacks, and so on ? Thx
Update: I don't want to benchmark/performance test my site. Just automatically filling/tampering forms and see what happens
WebInspect from Spidynamics (HP bought them).
I've used this one in my previous job (I recommended it to my employer at the time) and I was overwhelmed with the amount of info and testing I could do with it.
https://download.spidynamics.com/webinspect/default.htm
Apache JMeter is more likely to benchmark/stress itself rather than your site. I was recently pointed towards Faban, which can be used for very simple and more complex tests and scenarios; it's very performant. Also, take a look at OpenSTA and WebLoad, both free and powerful, with capabilities to record and replay complex scenarios.
Apache JMeter might fit the bill?
Have you seen CrossBow Web Stress Tester over at CodePlex?
supports get and post operations
you specify the number of threads, requests, waits, and timeouts
reads a txt file with name/value pairs for posting values
You'd have to download & modify the source if you wanted to generate random data for your Post variables.
Build one yourself by using WebClient with HtmlAgilityPack. Use agilitypack to parse your html and get the form, then just randomly fill the fields and make POSTs
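The "randomly fill the fields" part of that idea can be sketched independently of the HTTP client. The sketch below is JavaScript rather than the WebClient/HtmlAgilityPack combination named above, and it only generates tampered payloads; the attack strings, labels and field names are illustrative assumptions, and actually sending each payload as a POST is left to whatever client you use.

```javascript
// A couple of illustrative injection strings (not a complete list).
const ATTACK_STRINGS = ["' OR '1'='1", "<script>alert(1)</script>"];

// Given one known-good set of form fields, produce tampered variants:
// each field omitted, emptied or replaced by an attack string, plus one
// variant with an unexpected extra field.
function generateVariants(validFields) {
  const variants = [{ label: "valid", fields: { ...validFields } }];
  for (const name of Object.keys(validFields)) {
    const omitted = { ...validFields };
    delete omitted[name];
    variants.push({ label: `omit ${name}`, fields: omitted });
    variants.push({ label: `empty ${name}`, fields: { ...validFields, [name]: "" } });
    for (const attack of ATTACK_STRINGS) {
      variants.push({ label: `inject ${name}`, fields: { ...validFields, [name]: attack } });
    }
  }
  variants.push({ label: "extra field", fields: { ...validFields, unexpectedField: "x" } });
  return variants;
}

// Encodes a payload as an application/x-www-form-urlencoded body.
function encodeForm(fields) {
  return new URLSearchParams(fields).toString();
}

// Hypothetical form with two fields.
const variants = generateVariants({ name: "Alice", email: "alice@example.com" });
for (const v of variants) {
  console.log(v.label, "->", encodeForm(v.fields));
}
```

Each encoded body would then be POSTed to the controller action under test, and the responses inspected for errors or unexpected successes.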
