You're pretty much on the right track, save for using a compressor. I don't use Audition, so someone else will have to chime in on the specifics there. It sounds to me like this is some kind of promo or in-house instructional video. In that case, I would favour speed and clarity over the timbre of the voice. So:
- Adding a compressor to the track/bus to attain more consistent levels. You probably could be more aggressive with it.
- Cutting up the dialogue and adding fades and crossfades where necessary. If the voiceover was recorded in an adequately quiet room and has a good S/N ratio, I probably wouldn't even bother with the fades unless I heard a pop.
- Using clip gain and volume automation to fine-tune any bits the compressor isn't processing well enough.
- Doing some noise reduction (Declick, Declip, Denoise) where necessary.
- Using a DeEsser to soften sibilance.
- Using an EQ to attain more clarity with the voiceover.
- Adding a limiter to the track/bus to stop any peaks from slipping through. If some still do, go back to the clip gain and volume automation step.
This is the sequence in which I usually work on voiceover for not-so-consequential material. It's kind of a quick and dirty way of getting things done fast. Definitely not how you want to go about treating dialogue for film and TV.
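If it helps to see what the compressor and limiter steps are doing to the levels, here is a rough Python sketch. It is not how Audition or any particular plugin works: the function names, the -18 dB threshold, the 4:1 ratio and the -1 dB ceiling are all made up for illustration, the compressor has no attack/release smoothing, and the "limiter" is just a hard clip standing in for a real look-ahead limiter.

```python
import math

def db_to_lin(db):
    """Convert a level in dBFS to a linear amplitude (0 dBFS = 1.0)."""
    return 10.0 ** (db / 20.0)

def lin_to_db(x, floor_db=-120.0):
    """Convert a linear amplitude to dBFS, clamped at floor_db for silence."""
    return 20.0 * math.log10(x) if x > db_to_lin(floor_db) else floor_db

def compress(samples, threshold_db=-18.0, ratio=4.0, makeup_db=0.0):
    """Static downward compressor (illustrative values, no attack/release)."""
    out = []
    for s in samples:
        over = lin_to_db(abs(s)) - threshold_db
        gain_db = makeup_db
        if over > 0.0:
            # Above the threshold the output rises 1/ratio dB per input dB,
            # so the gain reduction is over * (1 - 1/ratio).
            gain_db -= over * (1.0 - 1.0 / ratio)
        out.append(s * db_to_lin(gain_db))
    return out

def limit(samples, ceiling_db=-1.0):
    """Crude stand-in for a limiter: hard-clip anything above the ceiling."""
    c = db_to_lin(ceiling_db)
    return [max(-c, min(c, s)) for s in samples]

def peak_db(samples):
    """Peak level of a block of samples in dBFS."""
    return lin_to_db(max(abs(s) for s in samples))

if __name__ == "__main__":
    loud = [db_to_lin(-6.0)]    # a loud word peaking at -6 dBFS
    quiet = [db_to_lin(-30.0)]  # a quiet word peaking at -30 dBFS
    print(round(peak_db(compress(loud)), 2), round(peak_db(compress(quiet)), 2))
```

With these settings the loud peak at -6 dBFS is 12 dB over the threshold and gets pulled down by 9 dB, while the quiet one at -30 dBFS passes through untouched. That narrower gap is the "more consistent levels" part; whatever the compressor still misses is what clip gain, automation and the limiter are for.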