We have an ASP.NET application (.NET 4.5.2) that's hosted by IIS 8.5. It makes calls into several web services that are hosted on the same machine. We use HttpClient to make the calls to the web services and we use the FQDN of the server to address the web services. There may be several users connected to the server at any given time.
We're seeing somewhat inexplicable timeouts in the application and are trying to understand how to fix them. We've isolated the issue in the System.Net trace, but I don't know how to map it to what might be happening in the app.
We always see a trace that looks roughly like this:
System.Net Verbose: 0 : [7040] ServicePoint#54409111::ServicePoint([fqdn]:443)
DateTime=2018-07-31T14:19:39.8579341Z
System.Net Information: 0 : [7040] Associating HttpWebRequest#63284140 with ServicePoint#54409111
DateTime=2018-07-31T14:19:39.8579341Z
System.Net Information: 0 : [7040] Associating Connection#66464819 with HttpWebRequest#63284140
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Socket#15069449::Socket(AddressFamily#2)
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Exiting Socket#15069449::Socket()
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Socket#36384690::Socket(AddressFamily#23)
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Exiting Socket#36384690::Socket()
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] DNS::TryInternalResolve([fqdn])
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Socket#36384690::BeginConnectEx()
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Socket#36384690::InternalBind([::]:0#-1630021378)
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Exiting Socket#36384690::InternalBind()
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [7040] Exiting Socket#36384690::BeginConnectEx() -> ConnectOverlappedAsyncResult#20281278
DateTime=2018-07-31T14:19:39.8579341Z
System.Net Verbose: 0 : [7040] Exiting HttpWebRequest#63284140::BeginGetResponse() -> ContextAwareResult#61049080
DateTime=2018-07-31T14:19:39.8579341Z
System.Net.Sockets Verbose: 0 : [1988] Socket#36384690::EndConnect(ConnectOverlappedAsyncResult#20281278)
DateTime=2018-07-31T14:20:00.8591809Z
System.Net.Sockets Error: 0 : [1988] Socket#36384690::UpdateStatusAfterSocketError() - TimedOut
DateTime=2018-07-31T14:20:00.8591809Z
System.Net.Sockets Error: 0 : [1988] Exception in Socket#36384690::EndConnect - A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [fe80::10f8:605a:8a44:5f1e%12]:443.
DateTime=2018-07-31T14:20:00.8591809Z
Every time this timeout occurs, we see a call to:
DNS::TryInternalResolve
followed by:
Socket#########::InternalBind([::]:0#-1630021378)
In a successful connection we instead see:
::InternalBind(0.0.0.0:0#0)
with no call to resolve the DNS name.
The weird thing is that the application never sees any errors. The call to HttpClient just seems to take a long time.
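For reference, the length of the stall can be read straight off the two timestamps in the trace: the one logged with BeginConnectEx and the one logged with the EndConnect error. The sketch below is plain Python, not part of the application, and the helper names are made up for illustration. It only does the subtraction; treating the roughly 21-second gap as an operating-system connect timeout is an assumption, not something the trace states.

```python
from datetime import datetime


def parse_trace_time(stamp: str) -> datetime:
    """Parse a System.Net trace timestamp such as 2018-07-31T14:19:39.8579341Z."""
    # The trace carries 7 fractional digits; datetime keeps at most 6, so trim.
    body = stamp.rstrip("Z")
    date_part, _, fraction = body.partition(".")
    fraction = (fraction + "000000")[:6]
    return datetime.strptime(f"{date_part}.{fraction}", "%Y-%m-%dT%H:%M:%S.%f")


def stall_seconds(begin: str, end: str) -> float:
    """Seconds elapsed between two trace timestamps."""
    return (parse_trace_time(end) - parse_trace_time(begin)).total_seconds()


# Timestamps copied from the trace above.
begin_connect = "2018-07-31T14:19:39.8579341Z"  # Socket::BeginConnectEx()
end_connect = "2018-07-31T14:20:00.8591809Z"    # EndConnect -> TimedOut

print(f"{stall_seconds(begin_connect, end_connect):.3f} s")  # prints "21.001 s"
```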
Anyone know what's happening here or if there's more debugging information I can turn on to learn more?
A couple of thoughts:
Check whether IPv6 is enabled on the host machine. It sounds like the initial DNS lookups (perhaps occurring when a cached record's TTL expires) are sometimes attempted via IPv6, which may have a bogus DNS server associated with it. Check your IP configuration and test that ping {fqdn} -6 actually works, or simply disable IPv6.
DNS might be a red herring here, and the real problem may be that you’re hitting a maximum-connections limit. There are many places where this could occur, but two things are easy to check. First, make sure you’re not recreating and disposing your HttpClient for each call; it should be a static, shared instance. Second, if you are making more than 100 TCP connections per second, consider increasing the ServicePointManager connection limit (ServicePointManager.DefaultConnectionLimit).
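To make the IPv6 check concrete: the failed connect in the trace targeted [fe80::10f8:605a:8a44:5f1e%12]:443, which is a link-local IPv6 address. Below is a minimal Python sketch (not .NET, and not the application's code) that lists the addresses a host name resolves to and flags link-local IPv6 entries. The function names are hypothetical, and whether such an entry explains the timeout on your server is an assumption to verify.

```python
import ipaddress
import socket


def is_link_local_ipv6(address: str) -> bool:
    """Return True for fe80::/10 addresses, with or without a zone index."""
    # Drop a zone index such as "%12" before parsing.
    bare = address.split("%", 1)[0]
    try:
        parsed = ipaddress.ip_address(bare)
    except ValueError:
        return False
    return parsed.version == 6 and parsed.is_link_local


def resolved_addresses(host: str, port: int = 443) -> list:
    """Return the distinct addresses the resolver hands back for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # The first element of each sockaddr is the textual address
    # for both AF_INET and AF_INET6 results.
    return sorted({info[4][0] for info in infos})
```

Calling resolved_addresses with the server's FQDN on the server itself and passing each result through is_link_local_ipv6 would show whether the resolver is handing back a link-local entry like the one in the trace.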
Related
I have a uwsgi process running a flask application. There is haproxy (running in mode http) sitting between the client and the application.
I am seeing occasional haproxy termination states of "SD--" with Tc = 0 and Tr = -1, and the returned HTTP code is -1. This means that haproxy encountered an explicit TCP disconnection from the uwsgi server.
Looking at the uwsgi logs, I found that the server was processing other requests normally at the same time, but the affected request never reached the server.
The only strange thing about the uwsgi logs at that point in time is that the number of requests managed by the current uwsgi worker is greater than the total number of requests managed by the whole uwsgi app, like this:
[pid: 22759|app: 0|req: **47188**/**47178**] * POST * => generated 84 bytes in 970 msecs (HTTP/1.1 200) 2 headers in 71 bytes (3 switches on core 98)
I am wondering whether this is abnormal, and in what scenarios these counters can end up like this.
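To find every occurrence of this in the logs, the two counters can be pulled out of each request line and compared. The following is a minimal Python sketch, assuming the log lines keep the "req: X/Y" layout shown above. The function names are hypothetical, and the regular expression tolerates the asterisks that were added above for emphasis.

```python
import re

# Matches "req: 47188/47178"; optional asterisks around either number
# are accepted so the emphasized sample line also parses.
REQ_COUNTERS = re.compile(r"req:\s*\**(\d+)\**\s*/\s*\**(\d+)\**")


def request_counters(log_line):
    """Return (worker_count, total_count) from a uwsgi request log line, or None."""
    match = REQ_COUNTERS.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def worker_exceeds_total(log_line):
    """True when the first counter is larger than the second."""
    counters = request_counters(log_line)
    return counters is not None and counters[0] > counters[1]
```

Running worker_exceeds_total over the log file line by line would show how often, and around which requests, the first counter overtakes the second.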
I’m trying to set up a local MongoDB crawler for my Watson Discovery service. MongoDB is up and running. I downloaded the JDBC connector (mongodb-driver-3.4.2.jar) and placed it in /opt/ibm/crawler/connectorFramework/crawler-connector-framework-0.1.18/lib/java/database/
Let me show you how I modified the configuration files:
In crawler.conf, under the main section “input_adapter”, I changed the following values:
crawl_config_file = "connectors/database.conf",
crawl_seed_file = "seeds/database-seed.conf",
extra_jars_dir = "database",
In seeds/database-seed.conf, in the seed > attribute section, the url portion looks like this:
{
name ="url",
value="mongo://localhost:27017/local/tweets?per=1000"
},
(tried also using mongodb instead of mongo)
In connectors/database.conf, the first portion of the file looks like this:
crawl_extender {
attribute = [
{
name="protocol",
value="mongo"
},
{
name="collection",
value="SomeCollection"
}
],
(also tried using mongodb instead of mongo)
When I run the crawler command, this is my output:
pish#ubuntu-crawler:~$ crawler crawl --config ./crawler-config/config/crawler.conf
2017-08-02 04:29:10,206 INFO: Connector Framework service will start and connect to crawler on port 35775
2017-08-02 04:29:10,460 INFO: This crawl is running in CrawlRun mode
2017-08-02 04:29:10,460 INFO: Running a crawl...
2017-08-02 04:29:10,465 INFO: URLs matching these patterns will be not be processed: (?i)\.(xlsx?|pptx?|jpe?g|gif|png|mp3|tiff)$
2017-08-02 04:29:10,500 INFO: HikariPool-1 - Starting...
2017-08-02 04:29:10,685 INFO: HikariPool-1 - Start completed.
2017-08-02 04:29:12,161 ERROR: There was a problem processing URL mongo://localhost:27017/local/tweets?per=1000: Couldn't load JDBC driver :
2017-08-02 04:29:17,184 INFO: HikariPool-1 - Shutdown initiated...
2017-08-02 04:29:17,196 INFO: HikariPool-1 - Shutdown completed.
2017-08-02 04:29:17,198 INFO: The service for the Connector Framework Input Adapter was signaled to halt.
Attempting to shutdown the crawler cleanly.
What am I missing or doing wrong in my crawler?
In the end, it turned out that I also had to specify the connection string in one of the configuration files. It works now.
I have a Xamarin Android app with a PCL shared library in which the SignalR communication is set up.
In debug mode it works.
See IIS trace:
2017-06-24 09:40:50 192.168.1.220 GET
/signalr/negotiate+clientProtocol=1.4&connectionData=[{}]&UserId=1137158
- 80 - 192.168.1.1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/58.0.3029.110+Safari/537.36
- 301 0 0 31
In release mode it's NOT working:
2017-06-24 09:53:54 192.168.1.220 GET /signalr/negotiate
clientProtocol=1.5&connectionData=[%7B%7D]&UserId=1137158 80 -
192.168.1.1 SignalR.Client.NetStandard/2.2.2.0+(Unknown+OS) - 500 0 0 31
The server-side SignalR code throws an exception somewhere internally (spotted in the logs).
It can be fixed by setting 'Linkage' to 'None' in the Android compilation options, but I'm not sure that's the correct way to solve this.
(Found here: "SignalR .Net Client fails with 500 server error on device, on simulator works fine")
I have managed to get everything working with a local master and two remote workers. Now I want to connect to a remote master that has the same remote workers. I have tried different combinations of settings within /etc/hosts and other recommendations from the Internet, but nothing worked.
The Main class is:
public static void main(String[] args) {
    ScalaInterface sInterface = new ScalaInterface(CHUNK_SIZE,
            "awsAccessKeyId",
            "awsSecretAccessKey");
    SparkConf conf = new SparkConf().setAppName("POC_JAVA_AND_SPARK")
            .setMaster("spark://spark-master:7077");
    org.apache.spark.SparkContext sc = new org.apache.spark.SparkContext(
            conf);
    sInterface.enableS3Connection(sc);
    org.apache.spark.rdd.RDD<Tuple2<Path, Text>> fileAndLine = (RDD<Tuple2<Path, Text>>) sInterface.getMappedRDD(sc, "s3n://somebucket/");
    org.apache.spark.rdd.RDD<String> pInfo = (RDD<String>) sInterface.mapPartitionsWithIndex(fileAndLine);
    JavaRDD<String> pInfoJ = pInfo.toJavaRDD();
    List<String> result = pInfoJ.collect();
    String miscInfo = sInterface.getMiscInfo(sc, pInfo);
    System.out.println(miscInfo);
}
It fails at:
List<String> result = pInfoJ.collect();
The error I am getting is:
1354 [sparkDriver-akka.actor.default-dispatcher-3] ERROR akka.remote.transport.netty.NettyTransport - failed to bind to spark-master/192.168.0.191:0, shutting down Netty transport
1354 [main] WARN org.apache.spark.util.Utils - Service 'sparkDriver' could not bind on port 0. Attempting port 1.
1355 [main] DEBUG org.apache.spark.util.AkkaUtils - In createActorSystem, requireCookie is: off
1363 [sparkDriver-akka.actor.default-dispatcher-3] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
1364 [sparkDriver-akka.actor.default-dispatcher-3] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
1364 [sparkDriver-akka.actor.default-dispatcher-5] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
1367 [sparkDriver-akka.actor.default-dispatcher-4] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
1370 [sparkDriver-akka.actor.default-dispatcher-6] INFO Remoting - Starting remoting
1380 [sparkDriver-akka.actor.default-dispatcher-4] ERROR akka.remote.transport.netty.NettyTransport - failed to bind to spark-master/192.168.0.191:0, shutting down Netty transport
Exception in thread "main" 1382 [sparkDriver-akka.actor.default-dispatcher-6] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
1382 [sparkDriver-akka.actor.default-dispatcher-6] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
java.net.BindException: Failed to bind to: spark-master/192.168.0.191:0: Service 'sparkDriver' failed after 16 retries!
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Success.map(Try.scala:206)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
1383 [sparkDriver-akka.actor.default-dispatcher-7] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
1385 [delete Spark temp dirs] DEBUG org.apache.spark.util.Utils - Shutdown hook called
Thank you kindly for your help!
Setting the environment variable SPARK_LOCAL_IP=127.0.0.1 solved this for me.
I had this problem when my /etc/hosts file was mapping the wrong IP address to my local hostname.
The BindException in your logs complains about the IP address 192.168.0.191. I assume that resolves to the hostname of your machine and it's not the actual IP address that your network interface is using. It should work fine once you fix that.
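One way to test this theory outside Spark is to check whether the address in the BindException can be bound on the driver machine at all. The following is a minimal Python sketch (not Spark code), assuming it is run on the machine that launches the driver; the function names are hypothetical.

```python
import socket


def can_bind(ip: str) -> bool:
    """Return True if a TCP socket can be bound to ip on an ephemeral port."""
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            # Port 0 asks the OS for any free port, as in the Spark log.
            sock.bind((ip, 0))
    except OSError:
        # Typically "cannot assign requested address" when no local
        # interface owns this IP.
        return False
    return True


def hostname_is_bindable(hostname: str) -> bool:
    """Resolve hostname the way /etc/hosts or DNS dictates, then try to bind."""
    try:
        ip = socket.gethostbyname(hostname)
    except OSError:
        return False
    return can_bind(ip)
```

If can_bind("192.168.0.191") returns False on the driver machine, the name is resolving to an address that no local interface owns, which would match the explanation above.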
I had Spark working on my EC2 instance. I then started a new web server, and to meet its requirements I had to change the hostname to the EC2 public DNS name, i.e.
hostname ec2-54-xxx-xxx-xxx.compute-1.amazonaws.com
After that, Spark stopped working and showed the error below:
16/09/20 21:02:22 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
16/09/20 21:02:22 ERROR SparkContext: Error initializing SparkContext.
I solved it by setting SPARK_LOCAL_IP as below:
export SPARK_LOCAL_IP="localhost"
and then launched the Spark shell as below:
$SPARK_HOME/bin/spark-shell
Possibly your master is running on a non-default port. Can you post your submit command?
Have a look at https://spark.apache.org/docs/latest/spark-standalone.html#connecting-an-application-to-the-cluster
I've been using the Enyim Memcached client for .NET, trying to connect to a server running on AppHarbor. The relevant parts of my configuration file look like this:
<enyim.com>
  <log factory="Enyim.Caching.DiagnosticsLogFactory, Enyim.Caching" />
  <memcached protocol="Binary">
    <servers>
      <add address="8d593f28-37d7-4c4f-a702-aa7687a85ea1.memcacher.com" port="11211" />
    </servers>
    <authentication
      type="Enyim.Caching.Memcached.PlainTextAuthenticator, Enyim.Caching"
      userName="changed to post on stack overflow"
      password="changed to post on stack overflow"
      zone="AUTHZ"
    />
  </memcached>
</enyim.com>
My connection keeps timing out. Any ideas what's going on here? Here are the logs from the Enyim client:
2012-01-21 18:56:08 [ERROR] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Could not init pool. - System.TimeoutException: Could not connect to 50.19.210.46:11211
at Enyim.Caching.Memcached.PooledSocket.ConnectWithTimeout(Socket socket, IPEndPoint endpoint, Int32 timeout)
at Enyim.Caching.Memcached.PooledSocket..ctor(IPEndPoint endpoint, TimeSpan connectionTimeout, TimeSpan receiveTimeout)
at Enyim.Caching.Memcached.MemcachedNode.CreateSocket()
at Enyim.Caching.Memcached.Protocol.Binary.BinaryNode.CreateSocket()
at Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl.CreateSocket()
at Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl.InitPool()
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Mark as dead was requested for 50.19.210.46:11211
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - FailurePolicy.ShouldFail(): True
2012-01-21 18:56:08 [WARN] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Marking node 50.19.210.46:11211 as dead
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.DefaultServerPool - Node 50.19.210.46:11211 is dead.
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.DefaultServerPool - Starting the recovery timer.
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.DefaultServerPool - Timer started.
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Acquiring stream from pool. 50.19.210.46:11211
2012-01-21 18:56:08 [DEBUG] 7 Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Pool is dead or disposed, returning null. 50.19.210.46:11211
UPDATE:
It turns out the reason I can't connect to the memcached server is that it's only accessible from AppHarbor's environment. So for anyone else who runs across this: you need to use a local memcached service when developing locally, then simply change the credentials when deploying (which AppHarbor actually does automatically for you). Problem resolved.
AppHarbor Memcacher buckets are only accessible from AppHarbor application servers. The documentation has been amended to clearly reflect this.
You should use a locally installed memcached server for testing.
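A quick way to tell this kind of network restriction apart from a client misconfiguration is to try a plain TCP connection to the memcached endpoint from the machine in question. The following is a minimal Python sketch (not Enyim or AppHarbor code); the function name and the default timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

If can_connect returns False for the remote address and port 11211 from a development machine but True for a locally installed memcached, the endpoint is simply not reachable from there, which is the situation described above.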