init cyy mind

MobKBK
2026-03-08 04:05:39 +08:00
commit 1a9ee73be5
34 changed files with 1292 additions and 0 deletions


@@ -0,0 +1,40 @@
# PPO algorithm
[Learning the PPO reinforcement learning algorithm from scratch_bilibili](https://www.bilibili.com/video/BV1iz421h7gb/?spm_id_from=333.337.search-card.all.click&vd_source=f553a12b04c16a678ddc0064cc04563c)
<img src="http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203074417725.png" alt="image-20260203074417725" style="zoom: 67%;" />
<img src="http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203074801825.png" alt="image-20260203074801825" style="zoom:80%;" />
action space
policy
trajectory
return
Markov chain
Monte Carlo
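
The terms above can be tied together with a small sketch: a policy produces a trajectory, the trajectory has a return, and Monte Carlo sampling averages returns over many trajectories to estimate the expected return. This is only an illustration, not something from the video. The toy policy, `p_right`, `horizon`, and the discount factor `gamma` are all assumptions made for the example.

```python
import random


def discounted_return(rewards, gamma=1.0):
    """Return of one trajectory: r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    total = 0.0
    # Accumulate from the last step backwards so each step is discounted once more.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def sample_rewards(rng, p_right=0.5, horizon=5):
    """Hypothetical toy trajectory.

    At every step the policy picks action 'right' with probability p_right.
    'right' yields reward 1, 'left' yields reward 0.
    """
    return [1.0 if rng.random() < p_right else 0.0 for _ in range(horizon)]


def monte_carlo_expected_return(num_trajectories, rng, gamma=1.0):
    """Estimate the expected return by averaging over sampled trajectories."""
    returns = [
        discounted_return(sample_rewards(rng), gamma)
        for _ in range(num_trajectories)
    ]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    rng = random.Random(0)
    print(discounted_return([1.0, 2.0, 3.0], gamma=0.5))
    print(monte_carlo_expected_return(10_000, rng))
```

With `p_right=0.5`, `horizon=5`, and no discounting, the true expected return is 2.5, so the Monte Carlo estimate should land close to that value as the number of sampled trajectories grows.
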
![image-20260203075054012](http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203075054012.png)
![image-20260203075357801](http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203075357801.png)
![image-20260203075438132](http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203075438132.png)
![image-20260203075557016](http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203075557016.png)
![image-20260203075828723](http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203075828723.png)
![image-20260203075933230](http://tuchuang-cyy.oss-cn-beijing.aliyuncs.com/img/image-20260203075933230.png)