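The line:
self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_))
should be:
self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_), axis=1)
Without axis=1, tf.reduce_sum sums over every element, including the batch dimension, and returns a single scalar instead of one Q-value per sample.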
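To see the difference in shapes, here is a minimal sketch that uses NumPy in place of TensorFlow (np.sum with no axis reduces over all elements, just as tf.reduce_sum does by default). The arrays `output` and `actions` are hypothetical stand-ins for self.output and self.actions_, assuming a batch-first layout and one-hot actions; the values are made up for illustration.

```python
import numpy as np

# Hypothetical network output: 3 states in the batch, 2 actions per state.
output = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])

# Hypothetical one-hot encoding of the action taken in each state.
actions = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 0.0]])

# No axis: every element is summed, so the batch collapses to one scalar.
q_all = np.sum(np.multiply(output, actions))

# axis=1: sum over the action dimension only, one Q-value per sample.
q_per_sample = np.sum(np.multiply(output, actions), axis=1)

print(q_all)         # 10.0
print(q_per_sample)  # [1. 4. 5.]
```

With the scalar version, a loss computed against a vector of per-sample targets would compare every target with the same summed value, which is probably not what is intended.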