Does Google Dataflow retry DoFns when a RuntimeException is thrown? - google-cloud-dataflow

We have a simple pipeline that transforms data from an unbounded data source.
In one step, where we enrich the data from an external service, a RuntimeException is sometimes thrown (because Dataflow is so fast ( :p ) that the external service is not yet aware of this particular data). After 10s it will be aware, and the RuntimeException will no longer be thrown.
With this in mind, we switched completely away from using failsafe and are trying to rely on the native Dataflow mechanism (according to this: https://cloud.google.com/dataflow/pipelines/troubleshooting-your-pipeline#detecting-an-exception-in-worker-code ).
But we've found that this is not really working: the bundle is not redelivered to the DoFn, so our sink does not have all the data that comes into our source.
Also, when running locally, this exception quits the execution of the whole pipeline.
Is this a problem only with this particular type of exception (RuntimeException)? How can we force Dataflow to reprocess the bundle?
update
The DoFn in which the exception appears:
@DoFn.ProcessElement
public void processElement(ProcessContext c) {
    String txHash = c.element().getHash();
    try {
        LOG.info("TransformId: " + txHash);
        // here the RuntimeException is thrown
        throw new RuntimeException();
    } catch (Exception e) {
        // log the failing id, then rethrow so the runner sees the failure
        LOG.error("Exception during processing id: " + txHash, e);
        throw e;
    }
}
And logs:
2018-02-22 17:15:53.633 CET
Receiver: 00ff ( this is the source, where we receive the id )
2018-02-22 17:15:53.634 CET
TransformId: 00ff ( beginning of the DoFn )
2018-02-22 17:15:53.634 CET
getTxRest invoked: 00ff ( the enriching service )
2018-02-22 17:15:53.638 CET
Exception during processing id: 00ff
2018-02-22 17:15:53.834 CET
Uncaught exception: ( and here are the details that the log name is: "xxx/logs dataflow.googleapis.com%2Fworker" )
Why am I saying that this is not retried? Because this id 00ff does not exist anywhere else in the log.

There could be 2 reasons:
1. getHash() is non-deterministic, in which case a retried element may be logged under a different id.
2. You're reading from a custom UnboundedSource that does not provide at-least-once reads. E.g., the source might not support acking at all, or might be incorrectly acking records immediately when they are received, rather than in finalizeCheckpoint().
The second is more likely in this case. When the bundle is retried, it retries reading from the source, and the source doesn't give this record back again.
If the source cannot be fixed, as a workaround you can pass the data from the source through Reshuffle.viaRandomKey(). That will effectively materialize it temporarily, so retries will concern only the processing and not the reading, at the expense of a small performance overhead.
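To illustrate the second reason, here is a minimal, self-contained Java sketch of why acking on receipt loses data when a bundle is retried. It is a toy model and not the Beam API: ToySource, readBundle, finalizeBundle and run are hypothetical names, and the first attempt is simply skipped to stand in for the RuntimeException. Only finalizeCheckpoint() and the id 00ff come from this thread; the second id, 01aa, is made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of bundle retries; this is NOT the Beam UnboundedSource API. */
class AckSimulation {

    /** Hypothetical queue-backed source: records stay pending until they are acked. */
    static final class ToySource {
        private final Deque<String> pending = new ArrayDeque<>();
        private final boolean ackOnReceive;

        ToySource(List<String> records, boolean ackOnReceive) {
            pending.addAll(records);
            this.ackOnReceive = ackOnReceive;
        }

        /** Hands out every pending record as one bundle. */
        List<String> readBundle() {
            List<String> bundle = new ArrayList<>(pending);
            if (ackOnReceive) {
                // Incorrect behaviour: the records are gone before processing succeeded.
                pending.clear();
            }
            return bundle;
        }

        /** Called only after a bundle committed; similar in spirit to finalizeCheckpoint(). */
        void finalizeBundle(List<String> bundle) {
            pending.removeAll(bundle);
        }
    }

    /** Processes the source twice: attempt 1 fails, attempt 2 is the retry. */
    static List<String> run(boolean ackOnReceive) {
        ToySource source = new ToySource(List.of("00ff", "01aa"), ackOnReceive);
        List<String> sink = new ArrayList<>();
        for (int attempt = 1; attempt <= 2; attempt++) {
            List<String> bundle = source.readBundle();
            if (attempt == 1) {
                continue; // simulated RuntimeException: nothing is committed
            }
            sink.addAll(bundle);
            source.finalizeBundle(bundle);
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println("ack on receive:  " + run(true));
        System.out.println("ack on finalize: " + run(false));
    }
}
```

Running main should print an empty list for the ack-on-receive case and [00ff, 01aa] for the ack-on-finalize case. The first case matches the symptom in the question: the retry reads from the source again, gets nothing back, and the id never shows up in the logs a second time.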

Related

How to identify exactly where exception occurred in Jenkins pipeline?

In a Jenkins pipeline build, I've sometimes seen null pointer or other exceptions like:
java.lang.NullPointerException: Cannot invoke method trim() on null object
Generally, if we run a Java program through an IDE or the command line and an exception occurs, we see the line number at which it occurred.
But the Jenkins build console output does not show the line number where the exception occurred.
In this case, based on the method name from the log (i.e. trim()), I check wherever the trim() method is used. But as I've used it in multiple places in the same method, it becomes difficult to identify exactly where the error occurred.
Another way is to add echo statements, re-run the build, and see where it throws this exception, but this is time-consuming.
Is there a better way or plugin with which I can identify the line of pipeline code at which the exception occurred?
I don't really know if it's possible to show the exact line number, but you can wrap your code in try-catch statements and then print the exception info in the catch, like so:
try {
    // line with trim()
} catch (ex) {
    println "Exception while trimming: $ex"
}

Web scraping stock dividend data with F#

I am attempting to scrape stock dividend data from web pages using F# and the FSharp.Data library. An example page can be seen at http://www.nasdaq.com/symbol/ibm/dividend-history.
To request the web page, my code is set up as a simple console app, as follows:
open FSharp.Data

[<EntryPoint>]
let main argv =
    let url = "http://www.nasdaq.com/symbol/ibm/dividend-history"
    let result = Http.RequestString(url)
    System.Console.ReadLine() |> ignore
    0 // return an integer exit code
When run, the RequestString method errors with:
"An unhandled exception of type 'System.ArgumentOutOfRangeException' occurred in FSharp.Core.dll
Additional information: Length cannot be less than zero."
It looks like the page is formatted in such a way that "traditional" scraping approaches won't work. Any ideas or thoughts would be appreciated.
This is the full stacktrace I get when I run the code:
System.ArgumentOutOfRangeException: Length cannot be less than zero.
Parameter name: length
at System.String.Substring(Int32 startIndex, Int32 length)
at FSharp.Data.HttpHelpers.getAllCookiesFromHeader@671.Invoke(Int32 i, String cookiePart) in C:\Git\FSharp.Data\src\Net\Http.fs:line 675
at Microsoft.FSharp.Collections.ArrayModule.IterateIndexed[T](FSharpFunc`2 action, T[] array)
at FSharp.Data.HttpHelpers.getAllCookiesFromHeader(String header, Uri responseUri, CookieContainer cookieContainer) in C:\Git\FSharp.Data\src\Net\Http.fs:line 671
at <StartupCode$FSharp-Data>.$Http.InnerRequest@803-5.Invoke(WebResponse _arg2) in C:\Git\FSharp.Data\src\Net\Http.fs:line 803
at Microsoft.FSharp.Control.AsyncBuilderImpl.args@835-1.Invoke(a a)
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.FSharp.Control.AsyncBuilderImpl.commit[a](Result`1 res)
at Microsoft.FSharp.Control.CancellationTokenOps.RunSynchronously[a](CancellationToken token, FSharpAsync`1 computation, FSharpOption`1 timeout)
at Microsoft.FSharp.Control.FSharpAsync.RunSynchronously[T](FSharpAsync`1 computation, FSharpOption`1 timeout, FSharpOption`1 cancellationToken)
at <StartupCode$FSI_0004>.$FSI_0004.main@() in C:\Users\helgeu.COMPODEAL\AppData\Local\Temp\~vs2B9.fsx:line 8
Stopped due to error
I think you have unfortunately stumbled upon a bug related to this cookie handling code:
https://github.com/fsharp/FSharp.Data/issues/904
<rant>
I have tried to look into that code, but it gives me a headache: it looks like an evil cut and paste of some Google answer on how to handle cookies in C#, badly translated to F#.
</rant>
I think adding info to that GitHub issue might be a better option than posting here.

TFS 2012 Exception Message: TF246021

I'm receiving the following error within TFS:
Exception Message: TF246021: An error occurred while processing your
request. Technical information (for administrator): SQL Server Error:
2601 (type VersionControlException) Exception Stack Trace: Server
stack trace: at
Microsoft.TeamFoundation.Client.Channels.TfsHttpClientBase.HandleReply(TfsClientOperation
operation, TfsMessage message, Object[]& outputs) at
Microsoft.TeamFoundation.VersionControl.Client.Repository5.LabelItem(String
workspaceName, String workspaceOwner, VersionControlLabel label,
LabelItemSpec[] labelSpecs, LabelChildOption children, Int32
maxClientPathLength, Failure[]& failures) at
Microsoft.TeamFoundation.VersionControl.Client.WebServiceLayer.LabelItem(String
workspaceName, String workspaceOwner, VersionControlLabel label,
LabelItemSpec[] labelSpecs, LabelChildOption children, Failure[]&
failures) at
Microsoft.TeamFoundation.VersionControl.Client.VersionControlServer.CreateLabel(VersionControlLabel
label, LabelItemSpec[] itemSpecs, LabelChildOption options, Failure[]&
failures) at
Microsoft.TeamFoundation.Build.Workflow.Activities.TfLabel.TfLabelInternal.RunCommand(VersionControlScope
versionControlScope, String nonFatalError, VersionControlLabel label,
IEnumerable`1 items, LabelChildOption childOption) at
System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr
md, Object[] args, Object server, Object[]& outArgs) at
System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage
msg, IMessageSink replySink)
Exception rethrown at [0]: at
System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message
reqMsg, Boolean bProxyCase) at
System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed,
MessageData& msgData) at System.Func`6.EndInvoke(IAsyncResult
result) at
Microsoft.TeamFoundation.Build.Workflow.Activities.TfLabel.TfLabelInternal.EndExecute(AsyncCodeActivityContext
context, IAsyncResult result) at
System.Activities.AsyncCodeActivity`1.System.Activities.IAsyncCodeActivity.FinishExecution(AsyncCodeActivityContext
context, IAsyncResult result) at
System.Activities.AsyncCodeActivity.CompleteAsyncCodeActivityData.CompleteAsyncCodeActivityWorkItem.Execute(ActivityExecutor
executor, BookmarkManager bookmarkManager)
Inner Exception Details:
Exception Message: TF246021: An error occurred while processing your
request. Technical information (for administrator): SQL Server Error:
2601 (type SoapException)SoapException Details: Exception
Stack Trace:
I've read a previous post on Stack Overflow that points to recreating workspaces; I tried that and it doesn't work.
I've also tried cleaning the cache, again without any luck. One thing I noticed is that when I disable 'Label Sources' and run a build, it works. I believe it has to do with creating a branch and then deleting some files from a main branch, which then gets TFS into a mess. I'm not sure what the best way to fix this is; it'll be difficult to upgrade to a newer version of TFS.
Make sure your TFS 2012 has been upgraded to the latest update, Update 4.
Make sure the build server is on the same version as the TFS Application Tier.
Make sure both client cache and server cache are cleaned. Client cache: C:\Users\username\AppData\Local\Microsoft\Team Foundation\4.0\Cache. Server cache: C:\ProgramData\Microsoft\Team Foundation\Web Access\Cache_v11.0
Check Event log in Event Viewer to see whether there is something helpful.

NServiceBus RavenDB exception

From time to time I receive an exception that seems to appear when my NServiceBus console shows the following message:
NServiceBus.Timeout.Hosting.Windows.TimeoutPersisterReceiver [(null)] <(null)> - Polling next retrieval is at 11/19/2015 11:20:49.
Exception:
A first chance exception of type 'System.Net.WebException' occurred in System.dll
Additional information: The remote server returned an error: (404) Not Found.
Callstack:
[External Code]
Raven.Client.Lightweight!Raven.Client.Connection.HttpJsonRequest.ReadJsonInternal(System.Func<System.Net.WebResponse> getResponse) Line 332 C#
Raven.Client.Lightweight!Raven.Client.Connection.HttpJsonRequest.ReadResponseJson() Line 225 C#
Raven.Client.Lightweight!Raven.Client.Connection.ServerClient.DirectGet(string serverUrl, string key) Line 203 C#
Raven.Client.Lightweight!Raven.Client.Connection.ReplicationInformer.RefreshReplicationInformation(Raven.Client.Connection.ServerClient commands) Line 351 C#
Raven.Client.Lightweight!Raven.Client.Connection.ReplicationInformer.UpdateReplicationInformationIfNeeded.AnonymousMethod__6() Line 134 C#
[External Code]
It's an exception that is thrown and caught inside the RavenDB code itself, so I suspect it doesn't interfere with my own code. But it might be hiding a problem.
So I'm wondering why this exception is happening and how I can avoid it.
That is fine; a 404 error is raised when you try to load a non-existent document.
This is expected and part of how this works.

Rejected by Server TFS Error

I am building an application but getting an error. Can someone help me to understand how to resolve this?
Exception Message: The request was rejected by the server. Technical
information: HTTP code 400: Bad Request (type
TeamFoundationServerInvalidRequestException)
Exception Stack Trace:
at
Microsoft.TeamFoundation.Client.Channels.TfsHttpWebRequest.ReadResponse(HttpWebResponse
webResponse, WebException webException) at
Microsoft.TeamFoundation.Client.Channels.TfsHttpWebRequest.IsAuthenticationChallenge(TfsMessage
requestMessage, HttpWebResponse webResponse, WebException
webException, TfsMessage& responseMessage) at
Microsoft.TeamFoundation.Client.Channels.TfsHttpWebRequest.SendRequest()
at
Microsoft.TeamFoundation.Client.Channels.TfsHttpRequestChannel.Request(TfsMessage
message, TimeSpan timeout) at
Microsoft.TeamFoundation.Client.Channels.TfsHttpRetryChannel.Request(TfsMessage
message, TimeSpan timeout) at
Microsoft.TeamFoundation.Client.Channels.TfsHttpClientBase.Invoke(TfsClientOperation
operation, Object[] parameters, TimeSpan timeout, Object[]& outputs)
at
Microsoft.TeamFoundation.TestImpact.Client.TestImpactServer.Microsoft.TeamFoundation.TestImpact.Client.ITestImpactServer.PublishBuildChanges(Uri
buildUri, CodeChange[] changes) at
Microsoft.TeamFoundation.TestImpact.BuildIntegration.BuildActivities.GetImpactedTests.Execute(CodeActivityContext
context) at
System.Activities.CodeActivity.InternalExecute(ActivityInstance
instance, ActivityExecutor executor, BookmarkManager bookmarkManager)
at
System.Activities.Runtime.ActivityExecutor.ExecuteActivityWorkItem.ExecuteBody(ActivityExecutor
executor, BookmarkManager bookmarkManager, Location
resultLocation)
Inner Exception Details:
Exception Message: The remote server returned an error: (400) Bad Request. (type WebException)
Status: ProtocolError
Response Status Code: BadRequest
Response Status Message: Bad Request
Exception Stack Trace:
at System.Net.HttpWebRequest.GetResponse()
I've just run into the same issue with TFS2013 and our CI builds.
It seems that the error happens when the Test Impact analyser fails somehow.
You can alter your build configuration to not analyse the test impact.
It depends on which build template you are using, but for Scrum 2013 you'll find it under: Process > Test > Advanced > Analyze Test Impact. Set this to false.
Jaans' "solution" works for me too, obviously. As for the cause, I started seeing this error after enabling obfuscation as part of my TFS build.
It doesn't look like it's possible to disable test impact analysis only for the release configuration (debug is not obfuscated). If I really want the test impact analysis, I'll need 2 build definitions, and must not build the release configuration in the one where test impact analysis is enabled.
It's also "interesting" that it breaks the build even if there are no unit tests.
I didn't find out why it occurs, but I resolved the error by using a loop with a try-catch that retries until getting the impacted tests succeeds.
