Hey, thanks for your response.
I mean these two equations:
- Bellman equation for state value function:
- Bellman equation for state-action value function:
The right-hand sides of both Bellman equations are exactly the same, and I don’t think that should be the case. The first one looks weird to me. Maybe I’m wrong here.
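For comparison, here are the standard textbook forms as I understand them. The notation is an assumption on my part (Sutton and Barto style: policy \pi, dynamics p(s', r | s, a), discount \gamma) and may not match the symbols used in the material. The point is that the state value version has an extra outer sum over actions weighted by the policy, so the two right-hand sides should not be identical:

```latex
% State value function: average over actions under the policy, then over transitions
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma \, v_\pi(s') \right]

% State-action value function: the action a is fixed, so there is no outer sum over a
q_\pi(s, a) = \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma \sum_{a'} \pi(a' \mid s') \, q_\pi(s', a') \right]

% The two are linked by
v_\pi(s) = \sum_{a} \pi(a \mid s) \, q_\pi(s, a)
```

If the first equation in the material has no sum over actions weighted by \pi(a | s), that might be why it looks off.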