Kobold Sentinel Attack 3

夏洛爾 | 2023-01-28 14:07:57


Kobold Sentinel Attack V3

Experiment goal:
1. Get the kobold to strike the target hard with its weapon

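Experiment design:
1. The episode fails if any weak point touches the ground (the tail, the weapon, and the Calf are not weak points)
2. When the weapon triggers OnCollisionEnter with the Player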
//enterCoef = 0.05f
agent.AddReward( Mathf.Clamp01(collision.impulse.magnitude * enterCoef) );
3.
//Set: judge.endEpisode = true
//Set: judge.episodeLength = 3.3f
//Set: weapon, tail not weakness
//Set: useClampReward = true
if(weaknessOnGround)
{
    if(inferenceMode)
    {
        brainMode = BrainMode.GetUp;
        SetModel("KoboldGetUp", getUpBrain);
        behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
    else
    {
        // ===Train Attack===
        if(!hitTarget)
        {
            float survivedTime = Time.fixedTime - arrivedMoment;
            if(survivedTime < judge.episodeLength )
            {
                AddReward( (survivedTime - judge.episodeLength) * 0.1f );
            }
        }
        judge.outLife++;
        judge.Reset();
        return;

        //===Train Other===
        // brainMode = BrainMode.GetUp;
        // SetModel("KoboldGetUp", getUpBrain);
        // behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
}
else if(koboldRoot.localPosition.y < -1f)
{
    if(inferenceMode)
    {
        brainMode = BrainMode.GetUp;
        SetModel("KoboldGetUp", getUpBrain);
        behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
    else
    {
        // ===Train Attack===
        if(!hitTarget)
        {
            float survivedTime = Time.fixedTime - arrivedMoment;
            if(survivedTime < judge.episodeLength )
            {
                AddReward( (survivedTime - judge.episodeLength) * 0.3f );
            }
        }
        judge.outY++;
        judge.Reset();
        return;

        // ===Train Other===
        // brainMode = BrainMode.GetUp;
        // SetModel("KoboldGetUp", getUpBrain);
        // behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
}
else
{
    if(hitTarget)
    {
        // float myVelocity = velocityBuffer.GetSmoothVal().magnitude;
        // float myAngularVelocity = Vector3.Project(angularVelocityBuffer.GetSmoothVal(), Vector3.up).magnitude;
        float targetVelocity = targetVelocityBuffer.GetSmoothVal().magnitude;
        float targetAngularVelocity = targetAngularVelocityBuffer.GetSmoothVal().magnitude;
        // velocityCoef = Mathf.InverseLerp(15f, 0f, myVelocity );
        // float angularVelocityCoef = Mathf.InverseLerp(0f, 30f, myAngularVelocity );
        velocityCoef = Mathf.InverseLerp(0f, 20f, targetVelocity );
        float angularVelocityCoef = Mathf.InverseLerp(0f, 30f, targetAngularVelocity );
        lastReward = velocityCoef * 0.05f + angularVelocityCoef * 0.05f;
        totalReward += lastReward;
        AddReward(lastReward);
    }
}

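// Roughly speaking
1. The hit reward is now given only for the first hit
2. After the hit, the reward adds terms for the target's velocity and angular velocity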
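
To make the size of each reward term easier to see, here is a minimal sketch that recomputes the three terms from the design above in plain C#, outside Unity. The function names HitReward, EarlyFailPenalty, and PostHitReward are hypothetical and do not appear in the project; Clamp01 and InverseLerp are local stand-ins written to behave like Unity's Mathf helpers, and the sample inputs are made up for illustration.

```csharp
using System;

// Plain-C# stand-ins for Mathf.Clamp01 and Mathf.InverseLerp (assumed equivalent behavior).
static float Clamp01(float x) => Math.Clamp(x, 0f, 1f);
static float InverseLerp(float a, float b, float v) => a == b ? 0f : Clamp01((v - a) / (b - a));

// Reward when the weapon collides with the Player: impulse scaled by enterCoef, clamped to [0, 1].
static float HitReward(float impulseMagnitude, float enterCoef = 0.05f)
    => Clamp01(impulseMagnitude * enterCoef);

// Penalty when the episode fails before any hit landed.
// coef is 0.1f for a weak point on the ground and 0.3f for falling below y = -1.
static float EarlyFailPenalty(float survivedTime, float episodeLength, float coef)
    => survivedTime < episodeLength ? (survivedTime - episodeLength) * coef : 0f;

// Per-step reward after a hit: target speed mapped over [0, 20], angular speed over [0, 30].
static float PostHitReward(float targetSpeed, float targetAngularSpeed)
    => InverseLerp(0f, 20f, targetSpeed) * 0.05f
     + InverseLerp(0f, 30f, targetAngularSpeed) * 0.05f;

// Made-up sample values, only to show the scale of each term.
Console.WriteLine(HitReward(10f));                    // mid-range impulse
Console.WriteLine(HitReward(40f));                    // saturates at the clamp
Console.WriteLine(EarlyFailPenalty(1.3f, 3.3f, 0.1f)); // fell over 2 s early
Console.WriteLine(PostHitReward(10f, 15f));           // target at half of both ranges
```

With enterCoef = 0.05f, the hit reward saturates at an impulse magnitude of 20, while the post-hit term is capped at 0.1 per step, so a single strong hit is worth about ten steps of maximum post-hit reward.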

Experiment duration:
Step: 2.5e7
Time Elapsed: 114764s (31.88hr)
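
As a quick check on these figures, the following sketch converts the elapsed seconds to hours and derives the average training throughput, which comes out to roughly 218 steps per second. The variable names are chosen here for illustration only.

```csharp
using System;

double steps = 2.5e7;            // Step
double elapsedSeconds = 114764;  // Time Elapsed

double hours = elapsedSeconds / 3600.0;          // seconds to hours
double stepsPerSecond = steps / elapsedSeconds;  // average throughput over the run

Console.WriteLine($"{hours:F2} hr, {stepsPerSecond:F1} steps/s");
```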

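Experiment results:
The experiment failed: the kobold shows very little intent to use its weapon.

That said, SAC produced quite lively motion, unlike long ago when the agent would just stand there like a wooden dummy.

As things stand, the kobold still relies mainly on ramming.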

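There are currently two directions for follow-up experiments, and they do not conflict with each other:
1. Add vision
The kobold is currently blind: it knows only the position of the opponent's core and has no idea how the limbs relate to each other.
With vision added, the kobold may be better able to work out the relationship between its weapon and the enemy.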

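2. Strengthen striking
First, the kobold is not allowed to touch the enemy with its body, which should reinforce its tendency to use the weapon and avoid ramming.
Then the knock-up reward is computed only when 紅蓮 is flying upward (velocity pointing up) or its y coordinate is greater than 1.5 (it is airborne).

I want to see whether this can get the kobold to evolve a tendency to strike upward.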

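Adding vision would mean retraining everything from scratch, while strengthened striking may be imprecise without vision but should still be good enough for an experiment.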

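So, for the next experiment:
1. Use SAC
2. The body must not touch the enemy
3. The knock-up reward is computed only when 紅蓮 is flying upward (velocity pointing up) or its y coordinate is greater than 1.5 (airborne)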
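
Rule 3 could be sketched roughly as follows. This is only an illustration in plain C# without UnityEngine: IsLaunched and KnockUpReward are hypothetical names, the velocity and angular-velocity ranges are carried over from this experiment on the assumption that they stay unchanged, and rule 2 (no body contact) is not modeled because how it would be enforced has not been decided here.

```csharp
using System;

// Plain-C# stand-in for Mathf.InverseLerp (assumed equivalent behavior).
static float InverseLerp(float a, float b, float v)
    => a == b ? 0f : Math.Clamp((v - a) / (b - a), 0f, 1f);

// Gate from rule 3: the target is moving upward, or is already above y = 1.5.
static bool IsLaunched(float targetVelocityY, float targetY, float airborneHeight = 1.5f)
    => targetVelocityY > 0f || targetY > airborneHeight;

// Knock-up reward: the post-hit reward from this experiment, paid only while the gate is open.
static float KnockUpReward(bool hitTarget, float targetVelocityY, float targetY,
                           float targetSpeed, float targetAngularSpeed)
{
    if (!hitTarget || !IsLaunched(targetVelocityY, targetY)) return 0f;
    return InverseLerp(0f, 20f, targetSpeed) * 0.05f
         + InverseLerp(0f, 30f, targetAngularSpeed) * 0.05f;
}

// Made-up sample values: a target pushed sideways along the ground earns nothing,
// the same speed with an upward component (or while airborne) does.
Console.WriteLine(KnockUpReward(true, 0f, 0.5f, 10f, 15f));
Console.WriteLine(KnockUpReward(true, 3f, 0.5f, 10f, 15f));
Console.WriteLine(KnockUpReward(true, -2f, 2.0f, 10f, 15f));
```

Because the gate uses "or", a target that is already falling but still above 1.5 keeps earning reward, which matches the "airborne" wording of rule 3.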
