Group Meetings (KR)

From Multiagent Communications and Networking Lab
Revision as of 08:16, 29 August 2022 by Mcnl


  • Offline RL Policies Should be Trained to be Adaptive
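    • Date: 22.08.01
    • Author presentation:
    • Summary
      • Proposes training offline RL policies to be adaptive by means of a Bayesian approach
      • Using the Q-functions inferred from the offline data together with a belief over them, the policy can apply an appropriate model when it encounters unseen data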
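
The idea in the summary above can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a small set of K candidate Q-functions stored as a table, a belief vector over them, and a hypothetical per-candidate log-likelihood score for newly observed data. The names `update_belief`, `act`, and the numbers used are illustrative assumptions only.

```python
import numpy as np


def update_belief(belief, log_likelihoods):
    """Bayes rule in log space: posterior is proportional to prior * likelihood.

    belief:          shape (K,), current probability of each candidate Q-function
    log_likelihoods: shape (K,), hypothetical score of how well each candidate
                     explains the newly observed data (an assumption of this sketch)
    """
    log_post = np.log(belief) + log_likelihoods
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


def act(q_values, belief):
    """Choose the action with the highest belief-weighted Q-value.

    q_values: shape (K, A), Q-value of each action under each candidate
    belief:   shape (K,)
    """
    return int(np.argmax(belief @ q_values))


# Toy setting: K = 2 candidate Q-functions, A = 2 actions, a single state.
q_values = np.array([[1.0, 0.0],
                     [0.0, 2.0]])
belief = np.array([0.8, 0.2])

a0 = act(q_values, belief)  # prior belief favours candidate 0

# New evidence (made-up numbers) favours candidate 1, so the belief shifts.
belief = update_belief(belief, np.array([-3.0, 0.0]))
a1 = act(q_values, belief)

print(a0, a1, belief)
```

In this toy run the chosen action changes once the belief moves toward the second candidate, which is the sense in which the policy is adaptive at test time.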