It is possible for the loss to drop a bit while the accuracy doesn't improve at all: accuracy only changes when a prediction crosses the decision threshold between classes, whereas the loss is a mean over continuous probabilities, so it can shrink even when no prediction actually flips class.
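Here is a minimal sketch of that effect (the numbers are made up for illustration): the predicted probabilities move toward the true class, so the log loss drops, but none of them cross the 0.5 threshold, so accuracy doesn't move.

```python
import numpy as np
from sklearn.metrics import log_loss, accuracy_score

# Toy example: three positive samples, all predicted below the 0.5
# threshold both before and after a training update.
y_true = np.array([1, 1, 1])

p_before = np.array([0.30, 0.40, 0.45])  # predicted P(class=1) before the update
p_after  = np.array([0.40, 0.45, 0.49])  # probabilities moved toward 1, but not past 0.5

for name, p in [("before", p_before), ("after", p_after)]:
    preds = (p >= 0.5).astype(int)
    print(name,
          "log loss:", round(log_loss(y_true, p, labels=[0, 1]), 3),
          "accuracy:", accuracy_score(y_true, preds))
# The mean log loss drops, yet accuracy stays at 0.0 because no
# probability crossed the classification threshold.
```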
As for which metric to use when evaluating the model, I think you usually want it to be accuracy (for classification), since that's what matters in the end. We mostly use the log loss to check that everything is OK (i.e. that convergence is smooth and monotonically decreasing).
I believe you probably also used the same number of epochs for the different learning rates, right? In that case it's natural for the lower learning rate to make slower progress. Unless it gets stuck on a nasty plateau, though, it should eventually converge to a lower minimum.
Anyway, when it comes to learning rates, the best practice is to make the rate smaller as training progresses, i.e. learning rate decay. Many practitioners use a cosine schedule to shrink the learning rate and then reset it (cosine annealing with warm restarts), so that training is less likely to get stuck in local minima; a rough sketch of wiring that up is below. You can also check out the first lecture of the fast.ai course for a look at other techniques along these lines.
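A minimal sketch of such a schedule, assuming PyTorch and a throwaway linear model just to show the wiring (the model, optimizer and T_0/T_mult values are placeholders, not a recommendation):

```python
import torch
from torch import nn, optim

# Tiny placeholder model and optimizer, only to demonstrate the scheduler.
model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Cosine annealing with warm restarts: the LR decays along a cosine curve
# for T_0 epochs, then resets ("restarts") and decays again over a longer cycle.
scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=10, T_mult=2, eta_min=1e-4
)

for epoch in range(30):
    # ... real training loop for one epoch would go here ...
    optimizer.step()    # placeholder step; normally called once per batch
    scheduler.step()    # advance the cosine schedule once per epoch
    print(epoch, scheduler.get_last_lr())
```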
So... my earlier reasoning was a bit oversimplified, which is bad, so let's go over it again in a little more detail. What should you analyse here? The answer is: it depends on your problem. Accuracy is a good guide, but if, for example, 90% of your dataset consists of one class, your algorithm might "learn" to predict that class for everything and call it a day, which gives you 90% accuracy but doesn't mean anything. To spot these imbalanced cases more easily, many libraries let you compare your classifier against a baseline ("dummy" or zero-rule) classifier that always predicts the majority class, which serves as the basic standard of comparison for performance; a small comparison sketch follows.
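A sketch of that comparison, assuming scikit-learn and a synthetic imbalanced dataset (the dataset and the logistic regression model are just illustrative choices):

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic imbalanced dataset: roughly 90% of samples belong to class 0.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline that always predicts the majority class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("baseline accuracy:", accuracy_score(y_test, baseline.predict(X_test)))
print("model accuracy:   ", accuracy_score(y_test, model.predict(X_test)))
# If the model barely beats the baseline, its high accuracy is telling you
# more about the class imbalance than about the model.
```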
In almost all cases, the recommendation for assessing the performance of classifiers is to use the confusion matrix and look at the statistics you care most about. For instance, if you're trying to identify whether a person has cancer, you want the false negative rate to be as low as possible, because sending an ill patient home as healthy can be a disaster, while if someone is not ill it isn't that big a problem for them to do a few more tests and be sure about it.
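A minimal sketch of reading those statistics off the confusion matrix with scikit-learn (the labels below are made up for illustration):

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical labels for a screening task: 1 = has the disease, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

# For binary problems, ravel() unpacks the 2x2 matrix in this order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("confusion matrix (tn, fp, fn, tp):", tn, fp, fn, tp)

# Recall (sensitivity) = TP / (TP + FN); the false negative rate is 1 - recall.
recall = recall_score(y_true, y_pred)
print("recall (sensitivity):", recall)
print("false negative rate:", 1 - recall)
```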
Another very standard tool for analysing classifier performance is the ROC curve and its AUC (area under the curve), which plot the trade-off between the false positive rate (FPR) and the true positive rate (TPR) across decision thresholds and summarise it in a single number.
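A quick sketch with scikit-learn, using made-up labels and scores just to show the calls:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical true labels and predicted probabilities for the positive class.
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7])

# roc_curve sweeps the decision threshold and returns matching FPR/TPR pairs;
# roc_auc_score summarises the whole curve in a single number.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", roc_auc_score(y_true, y_score))
```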
Let me reiterate that this is not at all an easy problem and careful analysis needs to be done. Use all the necessary and available tools.