Way to get current editor id (doc id) in HiveServerClient in Hue source code - hue

When multiple Hue pages run Tez applications at the same time, the same session is sometimes assigned to two different tasks, which causes one of them to receive a KILL signal while the other complains that the current app master is in use and keeps retrying. I looked into the code of HiveServerClient._get_tez_session and I think the problem lies in the way busy_sessions is retrieved, which is not thread-safe. So there is a chance that two queries submitted at virtually the same time will be allocated to the same session.
I'd like to know whether there is any way to get the current editor id (doc_id) from the HiveServerClient._get_tez_session method, so I could do some hacking for a quick fix for now. Thanks.

You can solve this by disabling Tez session mode:
set tez.am.mode.session=false;
Session mode is more aggressive in reserving execution resources and
is typically used for interactive applications where multiple DAGs are
submitted in quick succession by the same user. For long running
applications, one-off executions, batch jobs etc non-session mode is
recommended. If session mode is enabled then container reuse is
recommended.
Also try to disable container reuse:
set tez.am.container.reuse.enabled=false;
See all Tez configuration settings here.
Also read this thread about Tez session naming.
I did not test it myself, but maybe you can use the hive.session.id property for getting/setting session IDs.
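If it behaves like the other settings above, a per-query override might look like this (untested, and the value shown is a made-up example):
set hive.session.id=hue_editor_12345;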

Related

How to handle SAP Kapsel Offline app OData conflicts properly?

I built an app that is able to store OData offline by using the SAP Kapsel plugins.
More or less it's the same as generated by Web IDE, or similar to the apps in this example: https://blogs.sap.com/2017/01/24/getting-started-with-kapsel-part-10-offline-odatasp13/
Now I am at the point of checking the error resolution potential. I created a sync conflict (changing data on the server after the offline database was stored, then changing something in the app and starting a flush).
As mentioned in the documentation, I can see the error in the ErrorArchive and could also see some details. But what I am missing is the "current" data in the database.
In the error details I can only see the data on the device, not the data changed on the server.
For example:
Device is loading some names into offline store
Device is offline
User A is changing some names
User B is changing one of these names directly online
User A is online again and starts a sync
User A is now informed about the entity that was changed BUT:
not about the content user B entered
I just see the "offline" data.
Is there a solution to see the "current" and the "offline" one in a kind of compare view?
Please also note that the server communication is done by the Kapsel plugin and not with normal AJAX calls. That could be an alternative, but I am wondering whether there is a smarter way supported by the API?
Meanwhile I figured out how to load the online data (manually).
This can be done by switching the HTTP handler back to the normal one.
sap.OData.removeHttpClient();
sap.OData.applyHttpClient();
Anyhow, this does not look like a proper solution, and I also have an issue with the conflict log itself: it must be deleted before any refresh can be applied.
I could not find any proper documentation for that. ETag handling is also barely described in the SAPUI5 and SAP Kapsel documentation.
This question is a really tricky one, due to its implications. I understand that you are simulating a synchronization error due to concurrent modification, and want to know if there is a way for the client to obtain the "current" server state in order to give the user a means to compare the local and server state.
First, let me give you the short answer: No, there is no way for the client to see the current server state "for reference" via the Offline APIs when there are synchronization errors. Doing an online query as outlined above might work, but it certainly is a bad idea.
Now for the longer answer, which explains why this is not necessarily a defect and why I said there are quite some implications to the answer.
Types of Synchronization Errors
We distinguish a number of synchronization errors, and in this context, we are clearly dealing with business-related issues. There are two subtypes here: Those that the user can correct, e.g. validation errors, and those that are issues in the business process itself.
If the user violates the input range, e.g. by putting a negative price for a product, the server would reply with the corresponding message: "-1 is not a valid input value for 'Price'". You, as a developer, can display such messages to the user from the error archive, and the ensuing fix is indeed a very easy one.
Now when we talk about concurrent modification, things get really, really nasty. In fact, I like to say that in this case there is an issue with the business process, because on one hand, we allow data to get out of sync, and on the other hand, the process allows multiple users to manipulate the same piece of information. How all relevant users should be notified and synchronize is no longer just a technical detail, but in fact a new business process. There just is no way to generically decide how to handle this case. In most cases, it would involve back-office experts who need to decide how the changes should be merged.
A Better Solution
Angstrom pointed out that there is no way to manipulate ETags on the client side, and you should in fact not even think about it. ETags work like version numbers in optimistic locking scenarios, and changing the ETag basically means "Just overwrite what's on the server". This is a no-go in serious scenarios.
An acceptable workaround would be the following:
Make sure the server returns verbose error messages so that the user can see what happened and what caused the conflict.
If that does not help, refresh the data. This will get you an updated ETag, and merge the local changes into the "current" server state, but only locally. "Merging" really means that local changes always overwrite remote changes.
The user now has another opportunity to review the data and can submit it again.
A Good Solution
Better is not necessarily good, so here is what you should really do: never let concurrent modification happen, because it is really expensive to handle. This implies that it is not the developer who should address this issue; the business needs to change the process.
The right question to ask is: "When you replicate data in a distributed system, why do you allow it to be modified concurrently at all?" Typically stakeholders will not like this kind of question, and the appropriate reaction is to work out a conflict resolution process together with them. Only then will they realize how expensive fixing that kind of desynchronization is, and more often than not they will see that adjusting the process is way cheaper than insisting on yet another back-office process to fix the issues it causes. Even if they insist that there is a need for this concurrent modification, they will now understand that it is not your task to sort this out and that they need to invest in a conflict resolution process.
TL;DR
There is no way to compare the client state to the server state on the client, but you can do a refresh to retain the local changes and get an updated ETag. The real solution, however, is to rework the business process, because this is no longer a purely technical issue.
The default solution is that SMP or HCPms detects conflicts via ETags. On the client side there is no API to manipulate ETags in case of conflicts. A potential solution to implement a kind of diff view on the device would work like this:
Show the errors
Cache the errors (maybe only in memory?)
Delete the errors
Do a refresh of the database
Build a diff view from the current data and the cached errors
The idea with
sap.OData.removeHttpClient();
sap.OData.applyHttpClient();
could also work but could be very tricky and may introduce side effects.
Maybe some requests are triggered against the "wrong" backend.

Unique Realm container objects

I implemented real-time sync following Realm's Tasks demo app.
There, a dummy container is used to hold a List with the models.
The demo app doesn't seem to support offline usage.
I wondered what happens, given this setup, when I start the app on both an online and an offline device and then bring the offline device online.
My initial expectation was that I'd end up with 2 containers (which would be an invalid state), but when I tested it, there was surprisingly only 1 container at the end.
But sometimes I do get 2 containers, and I haven't been able to identify what causes this.
The question then is: how exactly does this work? I assume the reason the container is normally not duplicated when I sync the offline device for the first time is that it's handled as the same object, maybe because it doesn't have a primary key or something? But then why is it sometimes duplicated? And what would be the best practice here? Do I have to use a primary key, or should I check after connecting whether there is duplication and, if so, do a manual merge of the containers?
At the moment, Realm Tasks merely checks if the default Realm is empty before it tries to add a new base list container object. If the synchronization process hasn't completed by the time this check occurs, it's reasonable that a second container would be created. When testing the app on a local network, this usually isn't a problem since the download speeds are so fast, but we definitely should test this a bit more thoroughly.
Adding a primary key will definitely help since it means that if a second list is created locally, it will get merged with the version that comes down from the server.
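For illustration, a container with a primary key could look roughly like this in Realm's .NET binding (the class, property, and key values here are made up; the same idea applies to the Swift demo app):

using System.Collections.Generic;
using Realms;

public class TaskItem : RealmObject
{
    public string Title { get; set; }
}

// Hypothetical container with a fixed primary key: a second copy created
// offline is merged with the server copy instead of being duplicated.
public class TaskListContainer : RealmObject
{
    [PrimaryKey]
    public string Id { get; set; } = "default-list";

    public IList<TaskItem> Items { get; }
}

With a primary key in place, adding the container via realm.Add(container, update: true) updates an existing object with the same Id instead of inserting a second one.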
We've recently been focusing on the 'on-boarding' process when a second device connects to a user's Realm Mobile Platform account via the new progress notification system. A more logical approach would be to wait for the synchronization to complete the initial download after logging in, and then checking for the presence of the objects. Once the documentation is complete, we'll most likely be revamping how Realm Tasks handles this.
The demo app (as well as the Realm Mobile Platform) does support offline use, but only after the user has logged in for the first time (which is when these container objects are initially generated). After that, the app can be used offline, and any changes made in the interim are synchronized the next time it comes online.
We're planning on building an 'anonymous user' feature where a user can start using the app straight away (even offline), and any changes they made before logging in are then transferred to the user account once they do so.

C# 5 .NET MVC long async task, progress report and cancel globally

I use ASP.NET MVC 5 and I have a long-running action which has to poll web services, process data and store the results in a database.
For that I want to use the TPL library to start the task asynchronously.
But I wonder how to do 3 things:
I want to report the progress of this task. For this I am thinking about SignalR.
I want to be able to leave the page where I started this task and still see its progress across the website (from a panel on the left, but that part is OK).
And I want to be able to cancel this task globally (from my panel on the left).
I know a fair amount about all of the technologies involved, but I'm not sure about the best way to combine them.
Can someone help me with the best solution?
The fact that you want to run long running work while the user can navigate away from the page that initiates the work means that you need to run this work "in the background". It cannot be performed as part of a regular HTTP request because the user might cancel his request at any time by navigating away or closing the browser. In fact this seems to be a key scenario for you.
Background work in ASP.NET is dangerous. You can certainly pull it off, but it is not easy to get right. Also, worker processes can exit for many reasons (app pool recycle, deployment, machine reboot, machine failure, stack overflow or OOM exception on an unrelated thread). So make sure your long-running work tolerates being aborted mid-way. You can reduce the likelihood that this happens but never exclude the possibility.
You can make your code safe in the face of arbitrary termination by wrapping all work in a transaction. This of course only works if you don't cause non-transacted side-effects like web-service calls that change state. It is not possible to give a general answer here because achieving safety in the presence of arbitrary termination depends highly on the concrete work to be done.
Here's a possible architecture that I have used in the past:
When a job comes in you write all necessary input data to a database table and report success to the client.
You need a way to start a worker to work on that job. You could start a task immediately for that. You also need a periodic check that looks for unstarted work in case the app exits after having added the work item but before starting a task for it. Have the Windows task scheduler call a secret URL in your app once per minute that does this.
When you start working on a job you mark that job as running so that it is not accidentally picked up a second time. Work on that job, write the results and mark it as done, all in a single transaction. If your process happens to exit mid-way, the database will roll back all data involved.
Write job progress to a separate table row on a separate connection and separate transaction. The browser can poll the server for progress information. You could also use SignalR but I don't have experience with that and I expect it would be hard to get it to resume progress reporting in the presence of arbitrary termination.
Cancellation would be done by setting a cancel flag in the progress information row. The app needs to poll that flag.
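To make this concrete, here is a minimal sketch of such a worker. The Jobs and JobProgress tables, their columns, and the DoWork placeholder are assumptions for illustration, not part of the original answer:

using System;
using System.Data.SqlClient;

public class JobWorker
{
    private readonly string _connectionString;

    public JobWorker(string connectionString)
    {
        _connectionString = connectionString;
    }

    // Called from the periodic check (e.g. the scheduler-pinged URL):
    // claim one pending job, work on it, and mark it done, all in a
    // single transaction so a crash mid-way rolls everything back.
    public void ProcessNextJob()
    {
        using (SqlConnection conn = new SqlConnection(_connectionString))
        {
            conn.Open();
            using (SqlTransaction tx = conn.BeginTransaction())
            {
                SqlCommand claim = new SqlCommand(
                    @"UPDATE TOP (1) Jobs SET Status = 'Running'
                      OUTPUT INSERTED.Id
                      WHERE Status = 'Pending'", conn, tx);
                object id = claim.ExecuteScalar();
                if (id == null) return; // no pending work

                DoWork((int)id, conn, tx); // writes results on conn/tx

                SqlCommand done = new SqlCommand(
                    "UPDATE Jobs SET Status = 'Done' WHERE Id = @id", conn, tx);
                done.Parameters.AddWithValue("@id", id);
                done.ExecuteNonQuery();

                tx.Commit();
            }
        }
    }

    // Progress lives in a separate row and uses its own connection and
    // transaction, so the browser can poll it while the job is running.
    private void ReportProgress(int jobId, int percent)
    {
        using (SqlConnection conn = new SqlConnection(_connectionString))
        {
            conn.Open();
            SqlCommand cmd = new SqlCommand(
                "UPDATE JobProgress SET Percent = @p WHERE JobId = @id", conn);
            cmd.Parameters.AddWithValue("@p", percent);
            cmd.Parameters.AddWithValue("@id", jobId);
            cmd.ExecuteNonQuery();
        }
    }

    // Cancellation: the running job polls a flag in the progress row.
    private bool IsCancelRequested(int jobId)
    {
        using (SqlConnection conn = new SqlConnection(_connectionString))
        {
            conn.Open();
            SqlCommand cmd = new SqlCommand(
                "SELECT CancelRequested FROM JobProgress WHERE JobId = @id", conn);
            cmd.Parameters.AddWithValue("@id", jobId);
            return (bool)cmd.ExecuteScalar();
        }
    }

    private void DoWork(int jobId, SqlConnection conn, SqlTransaction tx)
    {
        // ... poll IsCancelRequested(jobId) and call ReportProgress(...)
        // between steps; write all results using conn and tx ...
    }
}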
Maybe you can make use of message queueing for job processing, but I'm always wary of using it. To process a message in a transacted way you need MSDTC, which is unsupported with many high-availability solutions for SQL Server.
You might think that this architecture is not very sophisticated. It makes use of polling for lots of things. Polling is a primitive technique but it works quite well. It is reliable and well-understood. It has a simple concurrency model.
If you can assume that your application never exits at inopportune times the architecture would be much simpler. But this cannot be assumed. You cannot assume that there will be no deployments during work hours and that there will be no bugs leading to crashes.
Even though using an HTTP worker to run long tasks is a bad idea, I have made a small example of how to manage it with SignalR.
Inside this example you can:
Start a task
See task progression
Cancel a task
It's based on:
Twitter Bootstrap
Knockout.js
SignalR
C# 5.0 async/await with CancellationToken and IProgress
You can find the source of this example here:
https://github.com/dragouf/SignalR.Progress

How to log user activity with time spent and application name using c#.net 2.0?

I am creating a desktop application in which I want to track user activity on the system, such as when the user opened Microsoft Excel, with the file name and how much time they spent working on it.
I want to create an XML file to maintain that log.
Please provide me some help with that.
This feels like one of those questions where you have to figure out what is meant by the question itself. Taken at face value, it sounds like you want to monitor how long a user spends in any process running in their session; however, it may be that you only really want to know whether, and for how long, a user spends time in a specific subset of all running processes.
Since I'm not sure which of these is the correct assumption to make, I will address both as best I can.
Regardless of whether you are monitoring one or all processes, you need to know what processes are running when you start up, and you need to be notified when a new process is created. The first of these requirements can be met using the GetProcesses() method of the System.Diagnostics.Process class, the second is a tad more tricky.
One option for checking whether new processes exist is to call GetProcesses after a specified interval (polling) and determine whether the list of processes has changed. While you can do this, it may be very expensive in terms of system resources, especially if done too frequently.
Another option is to look for some mechanism that allows you to register to be notified of the creation of a new process asynchronously. I don't believe such a thing exists within the .NET Framework 2.0, but it is likely to exist as part of the Win32 API; unfortunately I can't give you a specific function name because I don't know what it is.
Finally, however you do it, I recommend being as specific as you can about the notifications you choose to subscribe to: the fewer of them there are, the fewer resources are used generating and processing them.
Once you know what processes are running and which ones you are interested in, you will need to determine when focus changes to a new process of interest so that you can time how long the user actually spends using the application. For this you can use the GetForegroundWindow function to get the window handle of the currently focused window.
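Putting those pieces together might look like the following sketch; the polling loop and per-application bookkeeping around it are left out:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class ForegroundTracker
{
    [DllImport("user32.dll")]
    private static extern IntPtr GetForegroundWindow();

    [DllImport("user32.dll")]
    private static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint processId);

    // Returns the process that owns the currently focused window,
    // or null if it cannot be resolved.
    public static Process GetForegroundProcess()
    {
        IntPtr hwnd = GetForegroundWindow();
        if (hwnd == IntPtr.Zero) return null;

        uint pid;
        GetWindowThreadProcessId(hwnd, out pid);
        try
        {
            return Process.GetProcessById((int)pid);
        }
        catch (ArgumentException)
        {
            return null; // the process exited between the two calls
        }
    }
}

Polling this method at a modest interval and accumulating the elapsed time per process name gives you the "time spent per application" figure.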
As far as logging to an XML file goes, you can either use an external library such as log4net, as suggested by pranay's answer, or you can build the log file using the XmlTextWriter or XmlDocument classes in the System.Xml namespace.
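For the XmlDocument route, appending one entry per observed activity could look roughly like this (the file, element, and attribute names are illustrative, and the code sticks to .NET 2.0 APIs):

using System;
using System.IO;
using System.Xml;

public static class ActivityLog
{
    // Appends an entry to activity-log.xml, creating the file on first use.
    public static void Append(string processName, TimeSpan timeSpent)
    {
        XmlDocument doc = new XmlDocument();
        if (File.Exists("activity-log.xml"))
            doc.Load("activity-log.xml");
        else
            doc.AppendChild(doc.CreateElement("activity"));

        XmlElement entry = doc.CreateElement("entry");
        entry.SetAttribute("process", processName);
        entry.SetAttribute("seconds", timeSpent.TotalSeconds.ToString("F0"));
        entry.SetAttribute("loggedAt", DateTime.Now.ToString("s"));
        doc.DocumentElement.AppendChild(entry);
        doc.Save("activity-log.xml");
    }
}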

any better timer in asp.net?

I used System.Timers.Timer in Global.asax in ASP.NET to set up a timer for scheduling the execution of a function, let's say transferMoney().
But it seems that this timer might stop unexpectedly after several hours, and this causes all the actions to be left pending.
I want to know whether there are any better methods to set up a timer in ASP.NET MVC 1.0?
Thanks in advance!
It might just be because the application got recycled. Global.asax is not really the right place to do long-running tasks, because if your AppDomain gets recycled your timer will die. I suggest making a Windows service for the job instead.
Edit: Well, it's fairly easy to create a Windows service project in Visual Studio: just do [File] > [Add] > [New Project...] > [Windows] > [Windows Service] and you will get the stub code for the project.
It's hard to come up with a complete example here, so I suggest you google it. ;) There are tons of samples out there for you to look at.
This article on CodeProject seems to be a good introduction to Windows services.
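A minimal sketch of what such a service could look like once wired up with a timer (the service name and the 10-minute interval are illustrative):

using System;
using System.ServiceProcess;
using System.Threading;

public class MoneyTransferService : ServiceBase
{
    private Timer _timer;

    protected override void OnStart(string[] args)
    {
        // Run the periodic work every 10 minutes, independent of IIS
        // recycling, starting immediately.
        _timer = new Timer(delegate { TransferMoney(); }, null,
                           TimeSpan.Zero, TimeSpan.FromMinutes(10));
    }

    protected override void OnStop()
    {
        _timer.Dispose();
    }

    private void TransferMoney()
    {
        // ... the actual work goes here ...
    }

    public static void Main()
    {
        ServiceBase.Run(new MoneyTransferService());
    }
}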
Any timer you'll use in ASP.NET apps will eventually "terminate", but this is expected behavior due to process recycling.
The timer will never keep working because IIS recycles the worker process regularly, based on the application pool settings; when it recycles, your timer gets destroyed and you need to recreate it.
You can put in a check on whether the timer object is still available and create it again if it is not (using any other timer object will not help). But this still has a problem: if you don't have any web requests for a particular period of time, the application will be shut down and the timer destroyed with it. Best is to set up a ping monitor somewhere else that keeps your website alive.
You can't reliably run a timer in ASP.NET. If there are no requests coming in, IIS can shut down the application, and it will not start until the next request arrives.
Why do you think that you need a timer? In most web applications a timer is not needed at all for periodic updates unless they depend on an external source.
If you are just moving data around inside your application, the actual transactions don't have to happen at an exact interval; you only have to calculate what the result would be if they had happened. Whenever a request comes in, you calculate how many transactions would have happened since the last request, and perform them to catch up to the current state.
If your transactions rely on an external source, so that they actually have to run at a specific time, you simply can't do it with ASP.NET alone. You need an application that runs outside IIS, for example one started periodically by the Windows Task Scheduler.
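As a sketch of that catch-up idea, suppose a balance should accrue interest once per day (the class and the numbers are made up):

using System;

public class InterestAccount
{
    public decimal Balance { get; private set; }
    public DateTime LastProcessedUtc { get; private set; }

    private const decimal DailyRate = 0.0001m;

    public InterestAccount(decimal balance, DateTime lastProcessedUtc)
    {
        Balance = balance;
        LastProcessedUtc = lastProcessedUtc;
    }

    // Called on each incoming request: apply every interval that has
    // elapsed since the last one, instead of relying on a live timer.
    public void CatchUp(DateTime nowUtc)
    {
        while (LastProcessedUtc.AddDays(1) <= nowUtc)
        {
            Balance += Balance * DailyRate;
            LastProcessedUtc = LastProcessedUtc.AddDays(1);
        }
    }
}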
You could try System.Threading.Timer:
http://msdn.microsoft.com/en-us/library/system.threading.timer.aspx
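A minimal usage sketch (note that this timer is still subject to the same recycling problem described above):

using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // Fire the callback every 30 seconds, starting immediately.
        using (Timer timer = new Timer(
            delegate { Console.WriteLine("tick at {0:T}", DateTime.Now); },
            null, TimeSpan.Zero, TimeSpan.FromSeconds(30)))
        {
            Console.ReadLine(); // keep the process (and the timer) alive
        }
    }
}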
