OpenCL Theano - How to forcefully disable CUDA? - machine-learning

After a series of pains, I have installed Theano on a machine with AMD graphics card - Radeon HD 5450 (Cedar).
Now, consider a following code.
import numpy
import theano
import theano.tensor as T
rng = numpy.random
N = 400 #number of samples
feats = 784 #dimensionality of features
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))
training_steps = 10000
# theano symbolic variables
x = T.matrix("x")
y = T.vector("y")
w = theano.shared(rng.randn(784), name="w")
b = theano.shared(0., name="b")
print("Initial Model:")
print(str(w.get_value()) + " " + str(b.get_value()) )
p_1 = 1/(1 + T.exp(-T.dot(x, w) - b)) # probability of target being 1
prediction = p_1 > 0.5 # prediction threshold
xent = -y * T.log(p_1) - (1-y)*T.log(1-p_1) # cross-entropy loss function
cost = xent.mean() + 0.01 * (w**2).sum() # cost - to be minimized
gw, gb = T.grad(cost, [w, b])
#compile it
train = theano.function(
inputs = [x, y],
outputs = [prediction, xent],
updates = {w: w - 0.1*gw, b: b - 0.1*gb} )
predict = theano.function(inputs = [x], outputs = prediction)
#train it
for i in range (training_steps):
pred, err = train(D[0], D[1])
print("Final Model: ")
print(str(w.get_value()) + " " + str(b.get_value()) )
print("Target values for D: " + str(D[1]))
print("Predictions on D: " + str(D[0]))
I think this code should work just fine. But I get a series of errors:
ERROR (theano.gof.opt): Optimization failure due to: local_gpua_hgemm
ERROR (theano.gof.opt): node: dot(x.T, Elemwise{sub,no_inplace}.0)
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
File "/home/user/anaconda3/lib/python3.5/site-packages/theano/gof/opt.py", line 1772, in process_node
replacements = lopt.transform(node)
File "/home/user/anaconda3/lib/python3.5/site-packages/theano/sandbox/gpuarray/opt.py", line 140, in local_opt
new_op = maker(node, context_name)
File "/home/user/anaconda3/lib/python3.5/site-packages/theano/sandbox/gpuarray/opt.py", line 732, in local_gpua_hgemm
if nvcc_compiler.nvcc_version < '7.5':
TypeError: unorderable types: NoneType() < str()
And I get the same set of messages multiple times. Then at the end:
File "/home/user/anaconda3/lib/python3.5/site-packages/pygpu-0.2.1-py3.5-linux-x86_64.egg/pygpu/elemwise.py", line 286, in __init__
**self.flags)
File "pygpu/gpuarray.pyx", line 1950, in pygpu.gpuarray.GpuKernel.__cinit__ (pygpu/gpuarray.c:24214)
File "pygpu/gpuarray.pyx", line 467, in pygpu.gpuarray.kernel_init (pygpu/gpuarray.c:7174)
pygpu.gpuarray.UnsupportedException: ('The following error happened while compiling the node', GpuElemwise{Composite{((-i0) - i1)}}[(0, 0)]<gpuarray>(GpuFromHost<None>.0, InplaceGpuDimShuffle{x}.0), '\n', b'Device does not support operation')
Does this mean I cannot use this GPU or I have done something wrong in my code. Moreover, from the errors, it seems there is been a search for nvcc. But I do not have CUDA, I have opencl.
>>> import theano
Mapped name None to device opencl0:0: Cedar
also:
>>> from theano import config
>>> config.device
'opencl0:0'
>>> config.cuda
<theano.configparser.AddConfigVar.<locals>.SubObj object at 0x7fba9dee7d30>
>>> config.nvcc
<theano.configparser.AddConfigVar.<locals>.SubObj object at 0x7fba9e5967f0>
>>> config.gpu
<theano.configparser.AddConfigVar.<locals>.SubObj object at 0x7fbaa9f61828>
So how do I go from here? Is there way to make sure clcc is searched instead of nvcc.
PS_1: hello world works.
PS_2: System = 14.04 64 bit

OpenCL is not yet supported by Theano. As a result, only NVIDIA GPUs are supported.
The status of OpenCL is recorded on GitHub.
You need to disable GPU operation by setting device=cpu in your Theano config. There are multiple ways to do this (i.e. via THEANO_FLAGS environment variable or via a .theanorc file; see documentation).
Before running the script, try setting
export THEANO_FLAGS=device=cpu,floatX=float64
Your situation may need additional configuration options. See the documentation for more.

Related

Map Dask bincount over 2d array columns

I am trying to use bincount over a 2D array. Specifically I have this code:
import numpy as np
import dask.array as da
def dask_bincount(weights, x):
da.bincount(x, weights)
idx = da.random.random_integers(0, 1024, 1000)
weight = da.random.random((1000, 2))
bin_count = da.apply_along_axis(dask_bincount, 1, weight, idx)
The idea is that the bincount can be made with the same idx array on each one of the weight columns. That would return an array of size (np.amax(x) + 1, 2) if I am correct.
However when doing this I get this error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-17-5b8eed89ad32> in <module>
----> 1 bin_count = da.apply_along_axis(dask_bincount, 1, weight, idx)
~/.local/lib/python3.9/site-packages/dask/array/routines.py in apply_along_axis(func1d, axis, arr, dtype, shape, *args, **kwargs)
454 if shape is None or dtype is None:
455 test_data = np.ones((1,), dtype=arr.dtype)
--> 456 test_result = np.array(func1d(test_data, *args, **kwargs))
457 if shape is None:
458 shape = test_result.shape
<ipython-input-14-34fd0eb9b775> in dask_bincount(weights, x)
1 def dask_bincount(weights, x):
----> 2 da.bincount(x, weights)
~/.local/lib/python3.9/site-packages/dask/array/routines.py in bincount(x, weights, minlength, split_every)
670 raise ValueError("Input array must be one dimensional. Try using x.ravel()")
671 if weights is not None:
--> 672 if weights.chunks != x.chunks:
673 raise ValueError("Chunks of input array x and weights must match.")
674
AttributeError: 'numpy.ndarray' object has no attribute 'chunks'
I thought that when dask array were created the library automatically assigns them chunks, so the error does not say much. How can I fix this?
I made an script that does it on numpy with map.
idx_np = np.random.randint(0, 1024, 1000)
weight_np = np.random.random((1000,2))
f = lambda y: np.bincount(idx_np, weight_np[:,y])
result = map(f, [i for i in range(2)])
np.array(list(result))
array([[0.9885341 , 0.9977873 , 0.24937023, ..., 0.31024526, 1.40754883,
0.87609759],
[1.77406303, 0.84787723, 0.14591474, ..., 0.54584068, 0.38357015,
0.85202672]])
I would like to the same but with dask
There are multiple problems at play.
Weights should be (2, 1000)
You discover this by trying to write the same function in numpy using apply_along_axis.
idx_np = np.random.random_integers(0, 1024, 1000)
weight_np = np.random.random((2, 1000)) # <- transposed
# This gives the same result as the code you provided
np.apply_along_axis(lambda weight, idx: np.bincount(idx, weight), 1, weight_np, idx_np)
da.apply_along_axis applies the function to numpy arrays
You're getting the error
AttributeError: 'numpy.ndarray' object has no attribute 'chunks'
This suggests that what makes it into the da.bincount method is actually a numpy array. The fact is that da.apply_along_axis actually takes each row of weight and sends it to the function as a numpy array.
Your function should therefore actually be a numpy function:
def bincount(weights, x):
return np.bincount(x, weights)
However, if you try this, you will still get the same error. I believe that happens for a whole another reason though:
Dask doesn't know what the output shape will be and tries to infer it
In the code and/or documentation for apply_along_axis, we can see that Dask tries to infer the output shape and dtype by passing in the array [1] (related question). This is a problem, since bincount cannot just accept such argument.
What we can do instead is provide shape and dtype to the method so that Dask doesn't have to infer it.
The problem here is that bincount's output shape depends on the maximum value of the input array. Unless you know it beforehand, you will sadly need to compute it. The whole operation therefore won't be fully lazy.
This is the full answer:
import numpy as np
import dask.array as da
idx = da.random.random_integers(0, 1024, 1000)
weight = da.random.random((2, 1000))
def bincount(weights, x):
return np.bincount(x, weights)
m = idx.max().compute()
da.apply_along_axis(bincount, 1, weight, idx, shape=(m,), dtype=weight.dtype)
Appendix: randint vs random_integers
Be careful, because these are subtly different
randint takes integers from low (inclusive) to high (exclusive)
random_integers takes integers from low (inclusive) to high (inclusive)
Thus you have to call randint with high + 1 to get the same value.

How to feed previous time-stamp prediction as additional input to the next time-stamp?

This question might have been asked, but I got confused.
I am trying to apply one of RNN types, e.g. LSTM for time-series forecasting. I have inputs, y (stock returns). For each timestamp, I'd like to get the predictions. Q1 - Am I correct choosing seq2seq approach?
I also want to use predictions from previous timestamp (initializing initial values with some constant) as additional (still using my existing inputs) input in the form of squared residuals, i.e. using
eps_{t-1} = (y_{t-1} - y^_{t-1})^2 as additional input at t (as well as previous inputs).
So, how can I do this in tensorflow or in pytorch?
I tried to depict what I want on the attached graph. The graph
p.s. Sorry, it the question is poorly formulated
Let say your input if of dimension (32,10,1) with batch_size 32, time steps of length 10 and dimension of 1. Same for your target (stock return). This code make use of the tf.scan function, which is usefull when implementing custom recurrent networks (it will iterate over the timesteps). It remains to use the residual of t-1 in t somewhere, as you would like to.
ps: it is a very basic implementation of lstm from scratch, without any bias or output activation.
import tensorflow as tf
import numpy as np
tf.reset_default_graph()
BS = 32
TS = 10
inputs_dim = 1
target_dim = 1
inputs = tf.placeholder(shape=[BS, TS, inputs_dim], dtype=tf.float32)
stock_returns = tf.placeholder(shape=[BS, TS, target_dim], dtype=tf.float32)
state_size = 16
# initial hidden state
init_state = tf.placeholder(shape=[2, BS, state_size],
dtype=tf.float32, name='initial_state')
# initializer
xav_init = tf.contrib.layers.xavier_initializer
# params
W = tf.get_variable('W', shape=[4, state_size, state_size],
initializer=xav_init())
U = tf.get_variable('U', shape=[4, inputs_dim, state_size],
initializer=xav_init())
W_out = tf.get_variable('W_out', shape=[state_size, target_dim],
initializer=xav_init())
#the function to feed tf.scan with
def step(prev, inputs_):
#unpack all inputs and previous outputs
st_1, ct_1 = prev[0][0], prev[0][1]
x = inputs_[0]
target = inputs_[1]
#get previous squared residual
eps = prev[1]
"""
here do whatever you want with eps_t-1
like x += eps if x if of the same dimension
or include it somewhere in your graph
"""
# lstm gates (add bias if needed)
#
# input gate
i = tf.sigmoid(tf.matmul(x,U[0]) + tf.matmul(st_1,W[0]))
# forget gate
f = tf.sigmoid(tf.matmul(x,U[1]) + tf.matmul(st_1,W[1]))
# output gate
o = tf.sigmoid(tf.matmul(x,U[2]) + tf.matmul(st_1,W[2]))
# gate weights
g = tf.tanh(tf.matmul(x,U[3]) + tf.matmul(st_1,W[3]))
ct = ct_1*f + g*i
st = tf.tanh(ct)*o
"""
make prediction, compute residual in t
and pass it to t+1
Normaly, we would compute prediction outside the scan function,
but as we need it here, we could just keep it and return it back
as an output of the scan function
"""
prediction_t = tf.matmul(st, W_out) # + bias
eps = (target - prediction_t)**2
return [tf.stack((st, ct), axis=0), eps, prediction_t]
states, eps, preds = tf.scan(step, [tf.transpose(inputs, [1,0,2]),
tf.transpose(stock_returns, [1,0,2])], initializer=[init_state,
tf.zeros((32,1), dtype=tf.float32),
tf.zeros((32,1),dtype=tf.float32)])
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
out = sess.run(preds, feed_dict=
{inputs:np.random.rand(BS,TS,inputs_dim),
stock_returns:np.random.rand(BS,TS,target_dim),
init_state:np.zeros((2,BS,state_size))})
out = tf.transpose(out,[1,0,2])
print(out)
And the output :
Tensor("transpose_2:0", shape=(32, 10, 1), dtype=float32)
Base code from here

unsupervised learning how to get number of clusters

In this code below the author says that -
"Before I begin the kmeans clustering I want to use a hierarchial clustering to figure how many clusters I should have. I truncated the dendrogram because if I didn't the dendrogram will be hard to read. I cut at 20 because it has the second biggest distance jump (the first big jump is at 60). After the cut there are 7 clusters."
I am not able to see in the Dendrogram how he arrived at the numbers he mentioned - 20, 60 or 7
I am attaching the dendrogram that I have got from the sample data taken from his github example and am wondering if anyone can shed light on how he arrived at the numbers 20, 60 or 7
he also says "Let's fit k-means on the matrix with a range of clusters 1 - 19." where did he get that range 1 to 19 from? is it cause of the drop at 20 (or the cut off at 20)
github - https://github.com/moyphilip/SKU-Clustering
Also what would one say should be the number of clusters in this second image attached here ? 6 clusters ? (its a different dataset)
from sklearn.feature_extraction.text import TfidfVectorizer
import os
import pandas as pd
import re
import numpy as np
df = pd.read_csv('sample-data.csv')
def split_description(string):
string_split = string.split(' - ',1)
name = string_split[0]
return name
df_new = pd.DataFrame()
df_new['name'] = df.loc[:,'description'].apply(lambda x: split_description(x))
df_new['id'] = df['id']
def remove(name):
new_name = re.sub("[0-9]", '', name)
new_name = ' '.join(new_name.split())
return new_name
df_new['name'] = df_new.loc[:,'name'].apply(lambda x: remove(x))
df_new.head()
tfidf_vectorizer = TfidfVectorizer(
use_idf=True,
stop_words = 'english',
ngram_range=(1,4), min_df = 0.01, max_df = 0.8)
tfidf_matrix = tfidf_vectorizer.fit_transform(df_new['name'])
print (tfidf_matrix.shape)
print (tfidf_vectorizer.get_feature_names())
from sklearn.metrics.pairwise import cosine_similarity
dist = 1.0 - cosine_similarity(tfidf_matrix)
print (dist)
from scipy.cluster.hierarchy import ward, dendrogram
#run_line_magic('matplotlib', 'inline')
import matplotlib.pyplot as plt
linkage_matrix = ward(dist) #define the linkage_matrix using ward clustering pre-computed distances
fig, ax = plt.subplots(figsize=(15, 20)) # set size
ax = dendrogram(linkage_matrix,
truncate_mode='lastp', # show only the last p merged clusters
p=20, # show only the last p merged clusters
leaf_rotation=90.,
leaf_font_size=12.,
labels=list(df_new['name']))
plt.axhline(y=20, linewidth = 2, color = 'black')
fig.suptitle("Hierarchial Clustering Dendrogram Truncated", fontsize = 35, fontweight = 'bold')
#fig.show()
from sklearn.cluster import KMeans
num_clusters = range(1,20)
KM = [KMeans(n_clusters=k, random_state = 1).fit(tfidf_matrix) for k in num_clusters]
# Let's plot the within cluster sum of squares for each k to see which k I should choose.
#
# The plot shows a steady decline from from 0 to 19. Since the elbow rule does not apply for this I will choose k = 7 because of the previous dendrogram.
# In[17]:
import matplotlib.pyplot as plt
#get_ipython().run_line_magic('matplotlib', 'inline')
with_in_cluster = [KM[k].inertia_ for k in range(0,len(num_clusters))]
plt.plot(num_clusters, with_in_cluster)
plt.ylim(min(with_in_cluster)-1000, max(with_in_cluster)+1000)
plt.ylabel('with-in cluster sum of squares')
plt.xlabel('# of clusters')
plt.title('kmeans within ss for k value')
plt.show()
# I add the cluster label to each record in df_new
# In[18]:
model = KM[6]
clusters = model.labels_.tolist()
df_new['cluster'] = clusters
# Here is the distribution of clusters. Cluster 0 has a records, then cluster 1. Cluster 2 - 4 seem pretty even.
# In[19]:
df_new['cluster'].value_counts()
# I print the top terms per cluster and the names in the respective cluster.
# In[20]:
print("Top terms per cluster:")
print
order_centroids = model.cluster_centers_.argsort()[:, ::-1]
terms = tfidf_vectorizer.get_feature_names()
for i in range(model.n_clusters):
print ("Cluster %d : " %i )
for ind in order_centroids[i, :10]:
print ( '%s' % terms[ind])
print
print ("Cluster %d names:" %i)
for idx in df_new[df_new['cluster'] == i]['name'].sample(n = 10):
print ( ' %s' %idx)
print
print
# I reduce the dist to 2 dimensions with MDS. The dissimilarity is precomputed because we provide 1 - cosine similarity. Then I assign the x and y variables.
# In[21]:
import matplotlib.pyplot as plt
import matplotlib as mpl
from sklearn.manifold import MDS
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=1)
pos = mds.fit_transform(dist)
xs, ys = pos[:, 0], pos[:, 1]
# In[22]:
cluster_colors = {0: '#85C1E9', 1: '#FF0000', 2: '#800000', 3: '#04B320',
4: '#6033FF', 5: '#33FF49', 6: '#F9E79F', 7: '#935116',
8: '#9B59B6', 9: '#95A5A6'}
cluster_labels = {0: 'vest dress print', 1: 'shirt merino island',
2: 'pants guide pants guide', 3: 'shorts board board shorts',
4: 'simply live live simply', 5: 'cap cap bottoms bottoms',
6: 'jkt zip jkt guide'}
#some ipython magic to show the matplotlib plots inline
#get_ipython().run_line_magic('matplotlib', 'inline')
#create data frame that has the result of the MDS plus the cluster numbers and titles
df_plot = pd.DataFrame(dict(x=xs, y=ys, label=clusters, name=df_new['name']))
#group by cluster
groups = df_plot.groupby('label')
# set up plot
fig, ax = plt.subplots(figsize=(17, 9)) # set size
for name, group in groups:
ax.plot(group.x, group.y, marker='o', linestyle='', ms=12,
label = cluster_labels[name],
color = cluster_colors[name])
ax.set_aspect('auto')
ax.legend(numpoints = 1)
fig.suptitle("SKU Clustering", fontsize = 35, fontweight = 'bold')
#plt.show()

Tensorflow Neural Network for Binary Classicication; how do I use placeholder

Here is my code:
My target is a vector with shape(N,) which is a vector with only binary numbers
However, I'm running into compiling errors
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/Lai/Dropbox/PersonalProject/MachineLearningForSports/models/NeuralNetwork.py
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
Traceback (most recent call last):
File "/Users/Lai/Dropbox/PersonalProject/MachineLearningForSports/models/NeuralNetwork.py", line 102, in <module>
_, c = sess.run([optimizer,cost],feed_dict = {x:batch_x,y:batch_y})
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 943, in _run
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (100,) for Tensor 'Placeholder_1:0', which has shape '(?, 2)'
Since my batch size is 100; I believe that the error is at when comparing my target to my predictions. The tf.placable seems make the prediction with N*2, although I'm sure. Any help ?? Thanks
import tensorflow as tf
import DataPrepare as dp
import numpy as np
def random_init(x,num_feature_1st,num_feature_2nd,num_class):
W1 = tf.Variable(tf.random_normal([num_feature_1st,num_feature_2nd]))
bias1 = tf.Variable(tf.random_normal([num_feature_2nd]))
W2 = tf.Variable(tf.random_normal([num_feature_2nd,num_class]))
bias2 = tf.Variable(tf.random_normal([num_class]))
return [W1,bias1,W2,bias2]
def softsign(z):
"""The softsign function, applied elementwise."""
return z / (1. + np.abs(z))
def multilayer_perceptron(x,num_feature_1st,num_feature_2nd,num_class):
params = random_init(x,num_feature_1st,num_feature_2nd,num_class)
layer_1 = tf.add(tf.matmul(x,params[0]),params[1])
layer_1 = softsign(layer_1)
#layer_1 = tf.nn.relu(layer_1)
layer_2 = tf.add(tf.matmul(layer_1,params[2]),params[3])
#output = tf.nn.softmax(layer_2)
output = tf.nn.sigmoid(layer_2)
return output
def next_batch(num, dataX,dataY):
idx = np.arange(0,len(dataX))
np.random.shuffle(idx)
idx = idx[0:num]
dataX_shuffle = [dataX[i] for i in idx]
dataY_shuffle = [dataY[i] for i in idx]
dataX_shuffle = np.asarray(dataX_shuffle)
dataY_shuffle = np.asarray(dataY_shuffle)
return dataX_shuffle, dataY_shuffle
if __name__ == "__main__":
#sess = tf.InteractiveSession()
learning_rate = 0.001
training_epochs = 10
batch_size = 100
display_step = 1
num_feature_1st = 6
num_feature_2nd = 500
num_class = 2
x = tf.placeholder('float', [None, 6])
y = tf.placeholder('float',[None,2])
data = dp.dataPrepare(dp.datas,dp.path)
trainX = data[0]
testX = data[1] # a matrix
trainY = data[2] # a vector with binary number
testY = data[3]
params = random_init(x,num_feature_1st,num_feature_2nd,num_class)
# construct model
pred = multilayer_perceptron(x, num_feature_1st, num_feature_2nd, num_class)
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer().minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
#train
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(len(trainX[:,0])/batch_size)
for i in range(total_batch):
batch_x, batch_y = next_batch(batch_size,trainX,trainY)
_, c = sess.run([optimizer,cost],feed_dict = {x:batch_x,y:batch_y})
avg_cost += c/total_batch
if epoch % display_step ==0:
print("Epoch: ", "%04d" % (epoch+1), " cost= ", "{:.9f}".format(avg_cost))
print("Optimization Finished!")
Whenever you execute a dynamic node from the computation graph - which is pretty much any node that is not an input - you need to specify all dependent variables. Think about this this way: If you had a mathematical function of the form
y = f(x) = Ax + b (for example)
and you wanted to evaluate that function, you need to specify x as well. You need not, however, specify x if you wanted to evaluate (i.e. read) the value of A, since A is known (at least in this context).
Consequently, you can evaluate (by passing it to tf.Session.run(...) the parameters of your network without specifying the inputs (A in the example above). You cannot, however, evaluate the output of your functions without specifying the inputs (in the example, you need to specify x).
As for your code, the following line will thus not work:
print(sess.run(pred)), since you ask the session to evaluate a function without specifying its inputs.

Neural network model not learning?

I tried to model a NN using softmax regression.
After 999 iterations, I got error of about 0.02% for per data point, which i thought was good. But when I visualize the model on tensorboard, my cost function did not reach towards 0 instead I got something like this
And for weights and bias histogram this
I am a beginner and I can't seem to understand the mistake. May be I am using a wrong method to define cost?
Here is my full code for reference.
import tensorflow as tf
import numpy as np
import random
lorange= 1
hirange= 10
amplitude= np.random.uniform(-10,10)
t= 10
random.seed()
tau=np.random.uniform(lorange,hirange)
x_node = tf.placeholder(tf.float32, (10,))
y_node = tf.placeholder(tf.float32, (10,))
W = tf.Variable(tf.truncated_normal([10,10], stddev= .1))
b = tf.Variable(.1)
y = tf.nn.softmax(tf.matmul(tf.reshape(x_node,[1,10]), W) + b)
##ADD SUMMARY
W_hist = tf.histogram_summary("weights", W)
b_hist = tf.histogram_summary("biases", b)
y_hist = tf.histogram_summary("y", y)
# Cost function sum((y_-y)**2)
with tf.name_scope("cost") as scope:
cost = tf.reduce_mean(tf.square(y_node-y))
cost_sum = tf.scalar_summary("cost", cost)
# Training using Gradient Descent to minimize cost
with tf.name_scope("train") as scope:
train_step = tf.train.GradientDescentOptimizer(0.00001).minimize(cost)
sess = tf.InteractiveSession()
# Merge all the summaries and write them out to logfile
merged = tf.merge_all_summaries()
writer = tf.train.SummaryWriter("/tmp/mnist_logs_4", sess.graph_def)
error = tf.reduce_sum(tf.abs(y - y_node))
init = tf.initialize_all_variables()
sess.run(init)
steps = 1000
for i in range(steps):
xs = np.arange(t)
ys = amplitude * np.exp(-xs / tau)
feed = {x_node: xs, y_node: ys}
sess.run(train_step, feed_dict=feed)
print("After %d iteration:" % i)
print("W: %s" % sess.run(W))
print("b: %s" % sess.run(b))
print('Total Error: ', error.eval(feed_dict={x_node: xs, y_node:ys}))
# Record summary data, and the accuracy every 10 steps
if i % 10 == 0:
result = sess.run(merged, feed_dict=feed)
writer.add_summary(result, i)
I got the same plot like you a couple of times.
That happened mostly when I was running tensorboard on multiple log-files. That is, the logdir I gave to TensorBoard contained multiple log-files. Try to run TensorBoard on one single log-file and let me know what happens

Resources