Tag: policy-iteration

5 Why are policy iteration and value iteration studied as separate algorithms? 2020-08-13T13:31:43.800

4 Would you categorize policy iteration as an actor-critic reinforcement learning approach? 2020-05-13T10:32:45.627

4 Why is the update rule of the value function different in policy evaluation and policy iteration? 2020-05-19T06:08:46.437

4 Why are the Bellman operators contractions? 2020-07-31T02:48:34.320

3 Choosing more than one action in a parameterized policy 2019-02-18T12:44:06.557

3 Understanding the update rule for the policy in the policy iteration algorithm 2019-05-12T11:15:54.263

3 How can the policy iteration algorithm be model-free if it uses the transition probabilities? 2020-03-11T16:11:52.107

3 Why do value iteration and policy iteration obtain similar policies even though they have different value functions? 2020-04-21T20:03:40.037

2 Can policy iteration use only the immediate reward for updates? 2019-09-11T16:05:13.550

2 Equation not satisfied in Policy Iteration Algorithm 2020-06-06T07:34:06.170

2 Why doesn't value iteration use $\pi(a \mid s)$ while policy evaluation does? 2020-08-25T12:35:26.587

1 Is Value Iteration better than Policy Iteration for the first few iterations? 2019-09-11T17:20:27.750

1 What is the difference between value iteration and policy iteration? 2019-12-01T11:52:09.413

1 What is generalized policy iteration? 2020-04-25T16:17:27.533

1 Monte Carlo epsilon-greedy Policy Iteration: monotonic improvement for all cases or for the expected value? 2020-04-25T20:06:16.880

0 Soft Actor Critic - Losses are not converging 2020-06-14T13:55:02.290

0 Why care about the value of the action which I'm not gonna take in policy iteration? 2020-06-21T17:01:18.127