Our query takes 20s and we need to reduce this substantially. We're calling it via the Python DataFrameClient, but I reproduced the same query and the same 20s response time via the CLI client:
influx --host 10.0.5.183 --precision RFC3339 -execute "select * from turbine_ops.permanent.turbine_interval where ((turbine_id = 'NKWF-T15' or turbine_id = 'NKWF-T41' or turbine_id = 'NKWF-T23' or turbine_id = 'NKWF-T19' or turbine_id = 'NKWF-T51' or turbine_id = 'NKWF-T14' or turbine_id = 'NKWF-T42' or turbine_id = 'NKWF-T26' or turbine_id = 'NKWF-T39' or turbine_id = 'NKWF-T49' or turbine_id = 'NKWF-T38') and time >= '2019-05-01')">/dev/null
Influx is running on an r5.large EC2 instance with an EBS General Purpose SSD (gp2) volume; the CLI runs on an EC2 instance in the same subnet. The query returns 747120 rows, each having 1 tag (turbine_id) and 5 fields (all decimal values). Does this seem normal?
Via htop on the Influx host I see no significant change in RAM usage, a brief CPU spike lasting ~1s at the start of the query, and then no subsequent CPU activity.
Shard duration is set to 1 year.
show series exact cardinality on turbine_ops
name: turbine_interval
count
-----
11
I tried scaling the InfluxDB host to an r5.8xlarge and the query time did not change at all.
explain select * from turbine_ops.permanent.turbine_interval where ((turbine_ = 'NKWF-T15' or turbine_id = 'NKWF-T41' or turbine_id = 'NKWF-T23' or turbine_id = 'NKWF-T19' or turbine_id = 'NKWF-T51' or turbine_id = 'NKWF-T14' or turbine_id = 'NKWF-T42' or turbine_id = 'NKWF-T26' or turbine_id = 'NKWF-T39' or turbine_id = 'NKWF-T49' or turbine_id = 'NKWF-T38') and time >= '2019-05-01')
QUERY PLAN
EXPRESSION:
AUXILIARY FIELDS: active_power::float, "duration"::integer, rotor_rpm::float, turbine_id::tag, wind_speed::float, yaw_direction::float
NUMBER OF SHARDS: 1
NUMBER OF SERIES: 10
CACHED VALUES: 0
NUMBER OF FILES: 150
NUMBER OF BLOCKS: 3515
SIZE OF BLOCKS: 12403470
explain analyze select * from turbine_ops.permanent.turbine_interval where ((turbine_ = 'NKWF-T15' or turbine_id = 'NKWF-T41' or turbine_id = 'NKWF-T23' or turbine_id = 'NKWF-T19' or turbine_id = 'NKWF-T51' or turbine_id = 'NKWF-T14' or turbine_id = 'NKWF-T42' or turbine_id = 'NKWF-T26' or turbine_id = 'NKWF-T39' or turbine_id = 'NKWF-T49' or turbine_id = 'NKWF-T38') and time >= '2019-05-01')
EXPLAIN ANALYZE
.
└── select
├── execution_time: 1.442047426s
├── planning_time: 2.105094ms
├── total_time: 1.44415252s
└── build_cursor
├── labels
│ └── statement: SELECT active_power::float, "duration"::integer, rotor_rpm::float, turbine_id::tag, wind_speed::float, yaw_direction::float FROM turbine_ops.permanent.turbine_interval WHERE turbine_ = 'NKWF-T15' OR turbine_id::tag = 'NKWF-T41' OR turbine_id::tag = 'NKWF-T23' OR turbine_id::tag = 'NKWF-T19' OR turbine_id::tag = 'NKWF-T51' OR turbine_id::tag = 'NKWF-T14' OR turbine_id::tag = 'NKWF-T42' OR turbine_id::tag = 'NKWF-T26' OR turbine_id::tag = 'NKWF-T39' OR turbine_id::tag = 'NKWF-T49' OR turbine_id::tag = 'NKWF-T38'
└── iterator_scanner
├── labels
│ └── auxiliary_fields: active_power::float, "duration"::integer, rotor_rpm::float, turbine_id::tag, wind_speed::float, yaw_direction::float
└── create_iterator
├── labels
│ ├── cond: turbine_ = 'NKWF-T15' OR turbine_id::tag = 'NKWF-T41' OR turbine_id::tag = 'NKWF-T23' OR turbine_id::tag = 'NKWF-T19' OR turbine_id::tag = 'NKWF-T51' OR turbine_id::tag = 'NKWF-T14' OR turbine_id::tag = 'NKWF-T42' OR turbine_id::tag = 'NKWF-T26' OR turbine_id::tag = 'NKWF-T39' OR turbine_id::tag = 'NKWF-T49' OR turbine_id::tag = 'NKWF-T38'
│ ├── measurement: turbine_interval
│ └── shard_id: 1584
├── cursors_ref: 0
├── cursors_aux: 50
├── cursors_cond: 0
├── float_blocks_decoded: 2812
├── float_blocks_size_bytes: 12382380
├── integer_blocks_decoded: 703
├── integer_blocks_size_bytes: 21090
├── unsigned_blocks_decoded: 0
├── unsigned_blocks_size_bytes: 0
├── string_blocks_decoded: 0
├── string_blocks_size_bytes: 0
├── boolean_blocks_decoded: 0
├── boolean_blocks_size_bytes: 0
└── planning_time: 1.624627ms
Please let me know any optimizations we may be able to make.
My suspicion that Influx itself was not the culprit was confirmed when I hit the HTTP API directly with curl and got a ~3s response. I'm not sure why the CLI and the Python DataFrameClient add so much overhead, but I got to a pandas DataFrame in 3.78s using this:
import urllib.parse
import urllib.request
import pandas as pd
from io import BytesIO

# Build the query-string parameters for the InfluxDB HTTP API
data = {}
data['db'] = 'turbine_ops'
data['precision'] = 's'
data['q'] = "select * from turbine_ops.permanent.turbine_interval where ((turbine_id = 'NKWF-T15' or turbine_id = 'NKWF-T41' or turbine_id = 'NKWF-T23' or turbine_id = 'NKWF-T19' or turbine_id = 'NKWF-T51' or turbine_id = 'NKWF-T14' or turbine_id = 'NKWF-T42' or turbine_id = 'NKWF-T26' or turbine_id = 'NKWF-T39' or turbine_id = 'NKWF-T49' or turbine_id = 'NKWF-T38') and time >= '2019-05-01')"

url_values = urllib.parse.urlencode(data)
url = "http://10.0.5.183:8086/query?" + url_values

# Ask for CSV instead of JSON; it is much cheaper to parse into a DataFrame
request = urllib.request.Request(url, headers={'Accept': 'application/csv'})
response = urllib.request.urlopen(request)
response_bytestr = response.read()
df = pd.read_csv(BytesIO(response_bytestr), sep=",")
This is a good start; faster would be even better, so please do suggest other solutions.
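One further idea I may try myself (a sketch only, not benchmarked): stream the CSV response straight into pandas instead of buffering the whole body first. This assumes the requests library is available and reuses the data dict from the snippet above:

import requests
import pandas as pd

# Sketch: stream the CSV response rather than reading it all into one bytes object.
# Assumes the requests library is installed and reuses the `data` dict from above.
resp = requests.get("http://10.0.5.183:8086/query",
                    params=data,
                    headers={'Accept': 'application/csv'},
                    stream=True)
resp.raw.decode_content = True       # let urllib3 transparently decode gzip, if any
df = pd.read_csv(resp.raw, sep=",")  # pandas parses directly from the open response

This avoids holding both the raw bytes and the parsed DataFrame in memory at the same time, which may matter more as the row count grows.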
Related
I have a huge problem with training with LdaMulticore: it takes 2.5h to get only 25 topics, while only one core is active, even though I have 16 of them on Amazon EC2.
How can I optimize this?
Something is bottlenecking this process... When I take a look at the processes, only one core is active; after some time all cores become active for a couple of seconds, and then it is back to one core.
numberTopics = 25  # Number of topics

model_gensim = LdaMulticore(num_topics=numberTopics,
                            id2word=dictionary,
                            iterations=10,
                            passes=1,
                            chunksize=50,
                            eta='auto',
                            workers=12)

perp_gensim = []
times_gensim = []
i = 0
max_it = 5
min_prep = np.inf
start = time()

for _ in tqdm_notebook(range(100)):
    model_gensim.update(corpus)
    tmp = np.exp(-1 * model_gensim.log_perplexity(corpus))
    perp_gensim.append(tmp)
    times_gensim.append(time() - start)

    if tmp < min_prep:
        min_prep = tmp
        i = 0
    else:
        i = i + 1
    if i == max_it:
        break

model_gensim.save('results/model_genism/model_genism.model')

with open('results/model_genism/perp_gensim.pickle', 'wb') as f:
    pickle.dump(perp_gensim, f)
with open('results/model_genism/time_gensim.pickle', 'wb') as f:
    pickle.dump(times_gensim, f)

for i, topic in enumerate(model_gensim.get_topics().argsort(axis=1)[:, -10:][:, ::-1], 1):
    print('Topic {}: {}'.format(i, ' '.join([vocabulary[id] for id in topic])))
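A possible factor worth checking (not verified here): LdaMulticore parallelizes over document chunks, and with chunksize=50 each dispatch gives the 12 worker processes very little to do, so dispatch overhead can dominate. A sketch of the same call with an assumed, much larger chunk size:

# Sketch, with an assumed larger chunk size: give each of the 12 worker
# processes a substantial batch of documents per dispatch instead of 50.
model_gensim = LdaMulticore(num_topics=numberTopics,
                            id2word=dictionary,
                            iterations=10,
                            passes=1,
                            chunksize=2000,  # was 50; assumed value, not benchmarked
                            eta='auto',
                            workers=12)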
Good morning.
First of all, I've created a Docker swarm with 2 physical hosts and an overlay network. On the same host I've created 2 containers (the postgres and ambari servers) and one with the ambari agent, on which I'll install Kafka, ZooKeeper, Spark, ... from Ambari. The goal is to spread several containers across several hosts, but I'm trying it like this first because I can't get it working.
Once it is deployed with Ambari, I change the Kafka configuration to set advertised.host.name to the physical host's IP and advertised.port to 9092, to bind it to the physical host's port 9092.
When trying it, I'm always getting the following errors:
WARN Error while fetching metadata with correlation id 17 : {prueba2=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
when sending to the container's port 6667,
or
[2018-08-30 12:06:28,758] WARN Bootstrap broker 192.168.0.28:9092 disconnected (org.apache.kafka.clients.NetworkClient)
when sending to the physical host's port 9092.
Thanks for the help, and please ask for any further information needed to solve this problem.
EDIT1:
I changed the configuration to the properties shown below, and now the kafka broker is not running. server.log shows the following trace:
root#host1:/usr/hdp/2.6.3.0-235/kafka# cat /var/log/kafka/server.log
[2018-09-04 09:17:46,964] INFO KafkaConfig values:
advertised.host.name = host1.ambari
advertised.listeners = INTERNO://host1.ambari:6667,EXTERNO://192.168.0.28:9092
advertised.port = 9092
authorizer.class.name =
auto.create.topics.enable = true
auto.leader.rebalance.enable = true
background.threads = 10
broker.id = -1
broker.id.generation.enable = true
broker.rack = null
compression.type = producer
connections.max.idle.ms = 600000
controlled.shutdown.enable = true
controlled.shutdown.max.retries = 3
controlled.shutdown.retry.backoff.ms = 5000
controller.socket.timeout.ms = 30000
default.replication.factor = 1
delete.topic.enable = false
fetch.purgatory.purge.interval.requests = 10000
group.max.session.timeout.ms = 300000
group.min.session.timeout.ms = 6000
host.name =
inter.broker.protocol.version = 0.10.1-IV2
leader.imbalance.check.interval.seconds = 300
leader.imbalance.per.broker.percentage = 10
listeners = INTERNO://host1.ambari:6667,EXTERNO://host1.ambari:6667
log.cleaner.backoff.ms = 15000
log.cleaner.dedupe.buffer.size = 134217728
log.cleaner.delete.retention.ms = 86400000
log.cleaner.enable = true
log.cleaner.io.buffer.load.factor = 0.9
log.cleaner.io.buffer.size = 524288
log.cleaner.io.max.bytes.per.second = 1.7976931348623157E308
log.cleaner.min.cleanable.ratio = 0.5
log.cleaner.min.compaction.lag.ms = 0
log.cleaner.threads = 1
log.cleanup.policy = [delete]
log.dir = /tmp/kafka-logs
log.dirs = /kafka-logs
log.flush.interval.messages = 9223372036854775807
log.flush.interval.ms = null
log.flush.offset.checkpoint.interval.ms = 60000
log.flush.scheduler.interval.ms = 9223372036854775807
log.index.interval.bytes = 4096
log.index.size.max.bytes = 10485760
log.message.format.version = 0.10.1-IV2
log.message.timestamp.difference.max.ms = 9223372036854775807
log.message.timestamp.type = CreateTime
log.preallocate = false
log.retention.bytes = -1
log.retention.check.interval.ms = 300000
log.retention.hours = 168
log.retention.minutes = null
log.retention.ms = null
log.roll.hours = 168
log.roll.jitter.hours = 0
log.roll.jitter.ms = null
log.roll.ms = null
log.segment.bytes = 1073741824
log.segment.delete.delay.ms = 60000
max.connections.per.ip = 2147483647
max.connections.per.ip.overrides =
message.max.bytes = 1000000
metric.reporters = []
metrics.num.samples = 2
metrics.sample.window.ms = 30000
min.insync.replicas = 1
num.io.threads = 8
num.network.threads = 3
num.partitions = 1
num.recovery.threads.per.data.dir = 1
num.replica.fetchers = 1
offset.metadata.max.bytes = 4096
offsets.commit.required.acks = -1
offsets.commit.timeout.ms = 5000
offsets.load.buffer.size = 5242880
offsets.retention.check.interval.ms = 600000
offsets.retention.minutes = 86400000
offsets.topic.compression.codec = 0
offsets.topic.num.partitions = 50
offsets.topic.replication.factor = 3
offsets.topic.segment.bytes = 104857600
port = 6667
principal.builder.class = class org.apache.kafka.common.security.auth.DefaultPrincipalBuilder
producer.purgatory.purge.interval.requests = 10000
queued.max.requests = 500
quota.consumer.default = 9223372036854775807
quota.producer.default = 9223372036854775807
quota.window.num = 11
quota.window.size.seconds = 1
replica.fetch.backoff.ms = 1000
replica.fetch.max.bytes = 1048576
replica.fetch.min.bytes = 1
replica.fetch.response.max.bytes = 10485760
replica.fetch.wait.max.ms = 500
replica.high.watermark.checkpoint.interval.ms = 5000
replica.lag.time.max.ms = 10000
replica.socket.receive.buffer.bytes = 65536
replica.socket.timeout.ms = 30000
replication.quota.window.num = 11
replication.quota.window.size.seconds = 1
request.timeout.ms = 30000
reserved.broker.max.id = 1000
sasl.enabled.mechanisms = [GSSAPI]
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.principal.to.local.rules = [DEFAULT]
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism.inter.broker.protocol = GSSAPI
security.inter.broker.protocol = PLAINTEXT
socket.receive.buffer.bytes = 102400
socket.request.max.bytes = 104857600
socket.send.buffer.bytes = 102400
ssl.cipher.suites = null
ssl.client.auth = none
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
unclean.leader.election.enable = true
zookeeper.connect = host1.ambari:2181
zookeeper.connection.timeout.ms = 25000
zookeeper.session.timeout.ms = 30000
zookeeper.set.acl = false
zookeeper.sync.time.ms = 2000
(kafka.server.KafkaConfig)
[2018-09-04 09:17:46,974] FATAL (kafka.Kafka$)
java.lang.IllegalArgumentException: Error creating broker listeners from 'INTERNO://host1.ambari:6667,EXTERNO://host1.ambari:6667': No enum constant org.apache.kafka.common.protocol.SecurityProtocol.INTERNO
at kafka.server.KafkaConfig.validateUniquePortAndProtocol(KafkaConfig.scala:994)
at kafka.server.KafkaConfig.getListeners(KafkaConfig.scala:1013)
at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:966)
at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:779)
at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:776)
at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:28)
at kafka.Kafka$.main(Kafka.scala:58)
at kafka.Kafka.main(Kafka.scala)
If you're deploying Kafka within Docker, you need to configure the listeners so that the broker is reachable from outside your Docker network as well. This article explains the details.
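As a concrete illustration (a sketch only, using the addresses from the post): the FATAL error above is thrown because INTERNO/EXTERNO are not security protocol names. Brokers that support named listeners (Kafka 0.10.2 and later) need a listener.security.protocol.map entry for them; on older brokers the listener names must themselves be protocols such as PLAINTEXT. Something along these lines:

# Sketch only - addresses taken from the post; adjust to your environment.
# Bind inside the container on both ports:
listeners=INTERNO://0.0.0.0:6667,EXTERNO://0.0.0.0:9092
# Advertise the container hostname internally and the physical host externally:
advertised.listeners=INTERNO://host1.ambari:6667,EXTERNO://192.168.0.28:9092
# Map each listener name to a real security protocol:
listener.security.protocol.map=INTERNO:PLAINTEXT,EXTERNO:PLAINTEXT
# Tell brokers which listener to use among themselves
# (and do not also set security.inter.broker.protocol explicitly):
inter.broker.listener.name=INTERNO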
I have written a distributed TensorFlow program with 1 ps job and 2 worker jobs. I was expecting the data batches to be distributed among the workers, but that doesn't seem to be the case: only one worker (task=0) is doing all the work while the other one is idle. Could you please help me find the issue with this program?
import math
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Flags for defining the tf.train.ClusterSpec
tf.app.flags.DEFINE_string("ps_hosts", "",
                           "Comma-separated list of hostname:port pairs")
tf.app.flags.DEFINE_string("worker_hosts", "",
                           "Comma-separated list of hostname:port pairs")
tf.app.flags.DEFINE_string("master_hosts", "oser502110:2222",
                           "Comma-separated list of hostname:port pairs")

# Flags for defining the tf.train.Server
tf.app.flags.DEFINE_string("job_name", "", "One of 'ps', 'worker'")
tf.app.flags.DEFINE_integer("task_index", 0, "Index of task within the job")
tf.app.flags.DEFINE_integer("hidden_units", 100,
                            "Number of units in the hidden layer of the NN")
tf.app.flags.DEFINE_string("data_dir", "/tmp/mnist-data",
                           "Directory for storing mnist data")
tf.app.flags.DEFINE_integer("batch_size", 100, "Training batch size")
tf.app.flags.DEFINE_string("worker_grpc_url", None,
                           "Worker GRPC URL")

FLAGS = tf.app.flags.FLAGS

IMAGE_PIXELS = 28


def main(_):
  ps_hosts = FLAGS.ps_hosts.split(",")
  worker_hosts = FLAGS.worker_hosts.split(",")
  master_hosts = FLAGS.master_hosts.split(",")

  cluster = tf.train.ClusterSpec({"ps": ps_hosts, "worker": worker_hosts})

  # Create and start a server for the local task.
  server = tf.train.Server(cluster,
                           job_name=FLAGS.job_name,
                           task_index=FLAGS.task_index)

  if FLAGS.job_name == "ps":
    server.join()
  elif FLAGS.job_name == "worker":
    is_chief = (FLAGS.task_index == 0)
    if is_chief: tf.reset_default_graph()

    # Assigns ops to the local worker by default.
    with tf.device(tf.train.replica_device_setter(
        worker_device="/job:worker/task:%d" % FLAGS.task_index,
        cluster=cluster)):

      # Variables of the hidden layer
      hid_w = tf.Variable(
          tf.truncated_normal([IMAGE_PIXELS * IMAGE_PIXELS, FLAGS.hidden_units],
                              stddev=1.0 / IMAGE_PIXELS), name="hid_w")
      hid_b = tf.Variable(tf.zeros([FLAGS.hidden_units]), name="hid_b")

      # Variables of the softmax layer
      sm_w = tf.Variable(
          tf.truncated_normal([FLAGS.hidden_units, 10],
                              stddev=1.0 / math.sqrt(FLAGS.hidden_units)),
          name="sm_w")
      sm_b = tf.Variable(tf.zeros([10]), name="sm_b")

      x = tf.placeholder(tf.float32, [None, IMAGE_PIXELS * IMAGE_PIXELS])
      y_ = tf.placeholder(tf.float32, [None, 10])

      hid_lin = tf.nn.xw_plus_b(x, hid_w, hid_b)
      hid = tf.nn.relu(hid_lin)

      y = tf.nn.softmax(tf.nn.xw_plus_b(hid, sm_w, sm_b))
      loss = -tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0)))

      global_step = tf.Variable(0, trainable=False)

      train_op = tf.train.AdagradOptimizer(0.01).minimize(
          loss, global_step=global_step)

      saver = tf.train.Saver()
      #summary_op = tf.merge_all_summaries()
      init_op = tf.initialize_all_variables()

    # Create a "supervisor", which oversees the training process.
    sv = tf.train.Supervisor(is_chief=is_chief,
                             logdir="/tmp/train_logs",
                             init_op=init_op,
                             recovery_wait_secs=1,
                             saver=saver,
                             global_step=global_step,
                             save_model_secs=600)

    if is_chief:
      print("Worker %d: Initializing session..." % FLAGS.task_index)
    else:
      print("Worker %d: Waiting for session to be initialized..." % FLAGS.task_index)

    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)

    sess_config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True,
                                 device_filters=["/job:ps", "/job:worker/task:%d" % FLAGS.task_index])

    # The supervisor takes care of session initialization, restoring from
    # a checkpoint, and closing when done or an error occurs.
    with sv.prepare_or_wait_for_session(server.target, config=sess_config) as sess:
      print("Worker %d: Session initialization complete." % FLAGS.task_index)

      # Loop until the supervisor shuts down or 1000000 steps have completed.
      step = 0
      while not sv.should_stop() and step < 1000000:
        # Run a training step asynchronously.
        # See `tf.train.SyncReplicasOptimizer` for additional details on how to
        # perform *synchronous* training.
        batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
        print("FETCHING NEXT BATCH %d" % FLAGS.batch_size)
        train_feed = {x: batch_xs, y_: batch_ys}

        _, step = sess.run([train_op, global_step], feed_dict=train_feed)
        if step % 100 == 0:
          print("Done step %d" % step)

    # Ask for all the services to stop.
    sv.stop()


if __name__ == "__main__":
  tf.app.run()
and here are the logs from the worker with task=0:
2017-06-20 04:50:58.405431: I tensorflow/core/common_runtime/simple_placer.cc:841] Adagrad/value: (Const)/job:ps/replica:0/task:0/cpu:0 truncated_normal/stddev: (Const): /job:worker/replica:0/task:0/gpu:0
2017-06-20 04:50:58.405456: I tensorflow/core/common_runtime/simple_placer.cc:841] truncated_normal/stddev: (Const)/job:worker/replica:0/task:0/gpu:0 truncated_normal/mean: (Const): /job:worker/replica:0/task:0/gpu:0
2017-06-20 04:50:58.405481: I tensorflow/core/common_runtime/simple_placer.cc:841] truncated_normal/mean: (Const)/job:worker/replica:0/task:0/gpu:0 truncated_normal/shape: (Const): /job:worker/replica:0/task:0/gpu:0
2017-06-20 04:50:58.405506: I tensorflow/core/common_runtime/simple_placer.cc:841] truncated_normal/shape: (Const)/job:worker/replica:0/task:0/gpu:0 Worker 0: Session initialization complete.
FETCHING NEXT BATCH 500
FETCHING NEXT BATCH 500
FETCHING NEXT BATCH 500
FETCHING NEXT BATCH 500
FETCHING NEXT BATCH 500
Done step 408800
...
...
but the logs from worker 2 (task=1) look like:
2017-06-20 04:51:07.288600: I tensorflow/core/common_runtime/simple_placer.cc:841] zeros: (Const)/job:worker/replica:0/task:1/gpu:0 Adagrad/value: (Const): /job:ps/replica:0/task:0/cpu:0
2017-06-20 04:51:07.288614: I tensorflow/core/common_runtime/simple_placer.cc:841] Adagrad/value: (Const)/job:ps/replica:0/task:0/cpu:0 truncated_normal/stddev: (Const): /job:worker/replica:0/task:1/gpu:0
2017-06-20 04:51:07.288639: I tensorflow/core/common_runtime/simple_placer.cc:841] truncated_normal/stddev: (Const)/job:worker/replica:0/task:1/gpu:0 truncated_normal/mean: (Const): /job:worker/replica:0/task:1/gpu:0
2017-06-20 04:51:07.288664: I tensorflow/core/common_runtime/simple_placer.cc:841] truncated_normal/mean: (Const)/job:worker/replica:0/task:1/gpu:0 truncated_normal/shape: (Const): /job:worker/replica:0/task:1/gpu:0 2017-06-20 04:51:07.288689: I tensorflow/core/common_runtime/simple_placer.cc:841] truncated_normal/shape: (Const)/job:worker/replica:0/task:1/gpu:0
I was expecting similar logs from both workers. Please help me understand this. Looking forward to your help.
You need to manually separate out the data for each worker.
# Get the subset of data for this worker
mnist = input_data.read_data_sets('/tmp/mnist_temp', one_hot=True)
num_old = mnist.train._num_examples
ids = list(range(task_index, mnist.train._num_examples, num_workers))
mnist.train._images = mnist.train._images[ids,]
mnist.train._labels = mnist.train._labels[ids,]
mnist.train._num_examples = mnist.train._images.shape[0]
print("subset of training examples ", mnist.train._num_examples,"/",num_old)
I'm testing the Spring Cloud Stream app for Twitter.
I started the Docker container with the following environment properties related to Kafka:
KAFKA_ADVERTISED_HOST_NAME=<ip>
advertised.host.name=<ip>:9092
spring.cloud.stream.bindings.output.destination=twitter-source-test
spring.cloud.stream.kafka.binder.brokers=<ip>:9092
spring.cloud.stream.kafka.binder.zkNodes=<ip>:2181
My Kafka ProducerConfig values follow:
2017-01-12 14:47:09.979 INFO 1 --- [itterSource-1-1] o.a.k.clients.producer.ProducerConfig : ProducerConfig values:
compression.type = none
metric.reporters = []
metadata.max.age.ms = 300000
metadata.fetch.timeout.ms = 60000
reconnect.backoff.ms = 50
sasl.kerberos.ticket.renew.window.factor = 0.8
bootstrap.servers = [192.168.127.188:9092]
retry.backoff.ms = 100
sasl.kerberos.kinit.cmd = /usr/bin/kinit
buffer.memory = 33554432
timeout.ms = 30000
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
ssl.keystore.type = JKS
ssl.trustmanager.algorithm = PKIX
block.on.buffer.full = false
ssl.key.password = null
max.block.ms = 60000
sasl.kerberos.min.time.before.relogin = 60000
connections.max.idle.ms = 540000
ssl.truststore.password = null
max.in.flight.requests.per.connection = 5
metrics.num.samples = 2
client.id =
ssl.endpoint.identification.algorithm = null
ssl.protocol = TLS
request.timeout.ms = 30000
ssl.provider = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
acks = 1
batch.size = 16384
ssl.keystore.location = null
receive.buffer.bytes = 32768
ssl.cipher.suites = null
ssl.truststore.type = JKS
security.protocol = PLAINTEXT
retries = 0
max.request.size = 1048576
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
ssl.truststore.location = null
ssl.keystore.password = null
ssl.keymanager.algorithm = SunX509
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
send.buffer.bytes = 131072
linger.ms = 0
2017-01-12 14:47:09.985 INFO 1 --- [itterSource-1-1] o.a.kafka.common.utils.AppInfoParser : Kafka version : 0.9.0.1
But the producer continuously throws the following exception:
2017-01-12 14:47:42.196 ERROR 1 --- [ad | producer-3] o.s.k.support.LoggingProducerListener : Exception thrown when sending a message with key='null' and payload='{-1, 1, 11, 99, 111, 110, 116, 101, 110, 116, 84, 121, 112, 101, 0, 0, 0, 12, 34, 116, 101, 120, 116...' to topic twitter-source-test:
org.apache.kafka.common.errors.TimeoutException: Batch Expired
I can telnet from my Docker container to the broker at 192.168.127.188:9092 and to 2181. Also, my Kafka server is not a Docker container.
I saw solutions suggesting adding 'advertised.host.name', but that didn't work. Or is the way I have passed the env props correct?
Any help?
Sharing the fix.
Setting listeners in server.properties resolves the problem, e.g. listeners=PLAINTEXT://your.host.name:9092
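If the address the broker binds to is not the address clients should use to reach it (as is common with Docker port mappings), the companion advertised.listeners setting can be added as well; a sketch with placeholder addresses:

# Sketch - placeholder/bind-all addresses; adjust to your network.
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://192.168.127.188:9092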
In my UCF file in XPS, besides the MicroBlaze clock, I have to add one more clock for my design. I can't figure out how to do that.
It's giving me a warning like:
WARNING:ConstraintSystem:56 - Constraint [system.ucf(13)]: Unable to find an active 'TNM' or 'TimeGrp'
constraint named 'clk'.
Here is my UCF file, which produces the warning:
NET "CLK_N" LOC = AD11 | IOSTANDARD = DIFF_SSTL15;
NET "CLK_P" LOC = AD12 | IOSTANDARD = DIFF_SSTL15;
NET RESET LOC = "AB7" | IOSTANDARD = "LVCMOS15";
NET RS232_Uart_1_sin LOC = "M19" | IOSTANDARD = "LVCMOS25";
NET RS232_Uart_1_sout LOC = "K24" | IOSTANDARD = "LVCMOS25";
NET sm_fan_pwm_net_vcc LOC = "L26" | IOSTANDARD = "LVCMOS25";
NET "CLK" TNM_NET = sys_clk_pin;
TIMESPEC TS_sys_clk_pin = PERIOD sys_clk_pin 200000 kHz;
CONFIG DCI_CASCADE = "33 32 34";
#######custom ip clk##########
NET "clk" TNM_NET = "clk";
TIMESPEC TS_clk = PERIOD "clk" 5 ns HIGH 50 %;
How do I define the user IP clock in the UCF file? I want to give the same clock to both MicroBlaze and my Verilog design.
You have to assign the timing constraint to the clock pin. The clock pin is CLK_P (we usually use the positive pin for differential pairs), while you are trying to assign the constraint to clk, which doesn't exist.
Moreover, you have a duplicate constraint for the timing net clk. This UCF should work better:
NET "CLK_N" LOC = AD11 | IOSTANDARD = DIFF_SSTL15;
NET "CLK_P" LOC = AD12 | IOSTANDARD = DIFF_SSTL15;
NET RESET LOC = "AB7" | IOSTANDARD = "LVCMOS15";
NET RS232_Uart_1_sin LOC = "M19" | IOSTANDARD = "LVCMOS25";
NET RS232_Uart_1_sout LOC = "K24" | IOSTANDARD = "LVCMOS25";
NET sm_fan_pwm_net_vcc LOC = "L26" | IOSTANDARD = "LVCMOS25";
NET "CLK_P" TNM_NET = sys_clk_pin;
TIMESPEC TS_sys_clk_pin = PERIOD sys_clk_pin 200000 kHz;
CONFIG DCI_CASCADE = "33 32 34";