I am doing some problems on an application of decision tree/random forest. I am trying to fit a problem which has numbers as well as strings (such as country name) as features. Now the library, scikit-learn takes only numbers as parameters, but I want to inject the strings as well as they carry a significant amount of knowledge.
How do I handle such a scenario?
I can convert a string to numbers by some mechanism such as hashing in Python. But I would like to know the best practice on how strings are handled in decision tree problems.