In conditional generative adversarial networks (cGANs), the objective function of the two-player minimax game is

$$\min _{G} \max _{D} V(D, G)=\mathbb{E}_{\boldsymbol{x} \sim p_{\text {data }}(\boldsymbol{x})}[\log D(\boldsymbol{x} | \boldsymbol{y})]+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}[\log (1-D(G(\boldsymbol{z} | \boldsymbol{y})))]$$

Both the discriminator and the generator take the auxiliary information $y$ as an input.

I am confused about what the difference is between using $\log D(x, y)$ and $\log(1-D(G(z, y)))$, since $y$ goes into both $D$ and $G$ as an additional input alongside $x$ and $z$.
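In practice, "conditioning" usually just means feeding $y$ as an extra input to both networks, e.g. by concatenating it with $z$ (for $G$) and with $x$ (for $D$). The following is a minimal sketch of that idea; the single linear layers with random weights are stand-ins for trained networks, and all dimensions are made-up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
Z_DIM, Y_DIM, X_DIM = 8, 4, 16

# Random weights standing in for trained networks: one linear layer
# each, just to show where y enters.
W_g = rng.normal(size=(Z_DIM + Y_DIM, X_DIM))
W_d = rng.normal(size=(X_DIM + Y_DIM,))

def generator(z, y):
    # G(z | y): the condition y is concatenated with the noise z,
    # so the generated sample can vary with y.
    return np.concatenate([z, y]) @ W_g

def discriminator(x, y):
    # D(x | y): the same condition y is concatenated with the sample,
    # so D judges "real vs. fake, given y".
    logit = np.concatenate([x, y]) @ W_d
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid -> probability in (0, 1)

z = rng.normal(size=Z_DIM)   # noise
y = rng.normal(size=Y_DIM)   # auxiliary information
x_real = rng.normal(size=X_DIM)
x_fake = generator(z, y)

# One Monte Carlo sample of the two terms in V(D, G).
d_real_term = np.log(discriminator(x_real, y))
d_fake_term = np.log(1.0 - discriminator(x_fake, y))
print(d_real_term + d_fake_term)
```

So the "difference" between the two terms is not in how $y$ is used, but in what they score: the first term rewards $D$ for assigning high probability to real $x$ given $y$, the second rewards it for assigning low probability to $G$'s output given the same $y$.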

Suppose some information, say y (which is given and not being modeled), is given to the discriminator only. Then, for each generated and real frame, can we write it as D(x/y) and D(G(z)/y)? – matsu – 2018-09-16T18:39:12.510

Careful: the bar | is not the same as the slash / (use shift + \ to get the bar), and it has a very different meaning. In that scenario we would model it as D(x|y) and D(G(z)|y), but that is a very strange situation to imagine. If x is not conditionally independent of y, then it is very unlikely that your generator will produce challenging examples: it cannot produce examples that vary with y, yet the discriminator observes y for every sample. – John Doucette – 2018-09-17T01:26:43.300