This paper presents a method for producing optimal circuits. I have verified most of its procedures and they are correct, except for this one:
I cannot get a correct result from the value returned by ProduceArray(3): the decomposed matrices do not reproduce the original matrix. This is only the 3-qubit case. Could you please review the procedure and verify whether it is correct?
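A check of this kind can be sketched as follows. This is only a minimal illustration, not the paper's procedure: the helper names `matmul`, `identity`, `recompose`, and `matches` are hypothetical, matrices are assumed to be square lists of rows, and the factors are assumed to be listed in matrix-product order (leftmost factor first). If the factors are listed in circuit order instead, the list has to be reversed before multiplying, since the gate applied first is the rightmost matrix in the product.

```python
from functools import reduce


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def recompose(factors):
    """Multiply the factors left to right: factors[0] * factors[1] * ..."""
    return reduce(matmul, factors, identity(len(factors[0])))


def matches(u, v, tol=1e-9, up_to_phase=False):
    """Compare u and v entrywise, optionally ignoring a global phase."""
    n = len(u)
    phase = 1
    if up_to_phase:
        # Estimate the phase from the first non-negligible entry of v.
        pairs = [(u[i][j], v[i][j]) for i in range(n) for j in range(n)
                 if abs(v[i][j]) > tol]
        if pairs:
            phase = pairs[0][0] / pairs[0][1]
            # A global phase must have unit magnitude.
            if abs(abs(phase) - 1) > tol:
                return False
    return all(abs(u[i][j] - phase * v[i][j]) <= tol
               for i in range(n) for j in range(n))


# Small 1-qubit example: two Hadamard gates multiply to the identity.
s = 2 ** -0.5
H = [[s, s], [s, -s]]
print(matches(recompose([H, H]), identity(2)))
```

For the 3-qubit case the same functions apply to 8 x 8 matrices. Comparing with `up_to_phase=True` as well helps distinguish a genuinely wrong decomposition from one that differs from the original only by a global phase.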