Is this normal?
It is not surprising.
First, you are using different measures of feature importance. It's like ranking the importance of people by their a) weight, b) height, c) wealth, and d) IQ. With a) and b) you might get quite similar rankings, but these are likely to differ from the rankings obtained with c) and d).
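To make this concrete, here is a minimal sketch (synthetic data and arbitrary model choices, for illustration only) of how two model families can rank the same features quite differently, simply because each one speaks a different "language" of importance:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
import numpy as np

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X, y)
rf = RandomForestClassifier(random_state=0).fit(X, y)

# Two different "languages" for importance: absolute coefficient size
# vs. impurity-based importance, each producing its own ranking.
print("LR ranking:", np.argsort(-np.abs(lr.coef_[0])))
print("RF ranking:", np.argsort(-rf.feature_importances_))
```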
Second, the performance of your models is likely to differ. In an extreme case, the output of one of your models could be complete rubbish (in your case that is more likely to be NB); feature importances produced by such a model are not credible. In less extreme scenarios, where the difference in the models' performance is not so dramatic, the trustworthiness of the importances produced by the two models is more comparable. Still, the importances might be quite different because of the first argument, i.e. the different language used to capture importance.
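A quick way to decide how much trust an importance ranking deserves is to first check how well each model actually predicts, e.g. via cross-validation. A sketch along those lines (again assuming synthetic data and NB vs. a random forest as stand-ins for your models):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# If one model's CV score is near chance level, its importances
# should be discounted accordingly.
for name, model in [("NB", GaussianNB()),
                    ("RF", RandomForestClassifier(random_state=0))]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")
```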
You have not asked about it in your question, but there are feature importance approaches that are model-agnostic, i.e. they can be applied to any predictive model. For example, check the permutation importance approach, described in Section 15.3.2 "Variable Importance" of The Elements of Statistical Learning.
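A minimal from-scratch sketch of the idea, assuming synthetic data and an arbitrary fitted model: shuffle one feature at a time and measure how much the held-out score drops.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = GaussianNB().fit(X_tr, y_tr)
baseline = model.score(X_te, y_te)  # accuracy before any permutation

rng = np.random.default_rng(0)
for j in range(X_te.shape[1]):
    X_perm = X_te.copy()
    # Shuffling column j breaks its link to the target; the resulting
    # drop in score is that feature's permutation importance.
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    print(f"feature {j}: importance {baseline - model.score(X_perm, y_te):.3f}")
```

scikit-learn also ships a ready-made version of this as `sklearn.inspection.permutation_importance`, which repeats the shuffling several times and reports the mean and standard deviation of the score drops.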