There is nothing necessarily wrong with this. If you have no better information, then using past performance (i.e., prior probabilities) can work pretty well, particularly when your classes are very unevenly distributed.
Example methods using class priors are Gaussian Maximum Likelihood classification and Naïve Bayes.
Since you've added additional details to the question...
Suppose you are doing 10-fold cross-validation (holding out 10% of the data for validating each of the 10 subsets). If you use the entire data set to establish the priors (including the 10% of validation data), then yes, it is "cheating" since each of the 10 subset models uses information from the corresponding validation set (i.e., it is not truly a blind test). However, if the priors are recomputed for each fold using only the 90% of data used for that fold, then it is a "fair" validation.
An example of the effect of this "cheating" is if you have a single, extreme outlier in your data. Normally, with k-fold cross-validation, there would be one fold where the outlier is in the validation data and not the training data. When applying the corresponding classifier to the outlier during validation, it would likely perform poorly. However, if the training data for that fold included global statistics (from the entire data set), then the outlier would influence the statistics (priors) for that fold, potentially resulting in artificially favorable performance.