Kobold Sentinel Attack V6
Experiment goal:
1. Get the kobold to strike the target forcefully with its weapon
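Experiment design:
1. The episode fails if any weak point touches the ground (the tail, the weapon, and the Calf are not weak points)
2. When the weapon's OnCollisionEnter fires on contact with the Player,
it passes collision.impulse to the following method: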
//impulseRewardCoef = 0.05f
public void HitWithWeapon(Vector3 impulse)
{
    if(!hitTarget)
    {
        avgVelocity = velocityBuffer.GetSmoothVal();
        hitOnVelocity = avgVelocity.normalized;
        float reward = Vector3.ProjectOnPlane(impulse, hitOnVelocity).magnitude * impulseRewardCoef;
        lastReward += reward;
        totalReward += reward;
        AddReward( reward );
        arrivedMoment = Time.fixedTime;
        hitTarget = true;
    }
}
3.
//Set: judge.endEpisode = true
//Set: judge.episodeLength = 3.3f
//Set: weapon, tail not weakness
//Set: useClampReward = true
if(weaknessOnGround)
{
    if(inferenceMode)
    {
        brainMode = BrainMode.GetUp;
        SetModel("KoboldGetUp", getUpBrain);
        behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
    else
    {
        // ===Train Attack===
        if(!hitTarget)
        {
            float survivedTime = Time.fixedTime - arrivedMoment;
            if(survivedTime < judge.episodeLength )
            {
                AddReward( (survivedTime - judge.episodeLength) * 0.1f );
            }
        }
        judge.outLife++;
        judge.Reset();
        return;
        //===Train Other===
        // brainMode = BrainMode.GetUp;
        // SetModel("KoboldGetUp", getUpBrain);
        // behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
}
else if(koboldRoot.localPosition.y < -1f)
{
    if(inferenceMode)
    {
        brainMode = BrainMode.GetUp;
        SetModel("KoboldGetUp", getUpBrain);
        behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
    else
    {
        // ===Train Attack===
        if(!hitTarget)
        {
            float survivedTime = Time.fixedTime - arrivedMoment;
            if(survivedTime < judge.episodeLength )
            {
                AddReward( (survivedTime - judge.episodeLength) * 0.3f );
            }
        }
        judge.outY++;
        judge.Reset();
        return;
        // ===Train Other===
        // brainMode = BrainMode.GetUp;
        // SetModel("KoboldGetUp", getUpBrain);
        // behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
}
/*
else if( IsCollideWithBody() )
{
    if(inferenceMode)
    {
        brainMode = BrainMode.GetUp;
        SetModel("KoboldGetUp", getUpBrain);
        behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
    else
    {
        // ===Train Attack===
        if(!hitTarget)
        {
            float survivedTime = Time.fixedTime - arrivedMoment;
            if(survivedTime < judge.episodeLength )
            {
                AddReward( (survivedTime - judge.episodeLength) * 0.1f );
            }
        }
        judge.outLife++;
        judge.Reset();
        return;
        //===Train Other===
        // brainMode = BrainMode.GetUp;
        // SetModel("KoboldGetUp", getUpBrain);
        // behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
}
*/
else
{
    if(hitTarget)
    {
        targetSmoothVelocity = targetVelocityBuffer.GetSmoothVal();
        lastReward = Vector3.ProjectOnPlane(targetSmoothVelocity, hitOnVelocity).magnitude * 0.01f;
        totalReward += lastReward;
        AddReward(lastReward);
        if(inferenceMode)
        {
            if(hasArrived && Time.fixedTime - arrivedMoment >= judge.episodeLength)
            {
                hitTarget = false;
                brainMode = BrainMode.GetUp;
                SetModel("KoboldGetUp", getUpBrain);
                behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
            }
        }
    }
}
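//Roughly speaking:
At the moment the weapon hits the target, record the kobold's own direction of movement.
After that, reward the part of the target's velocity that does not point along that direction.
Also, the rule against colliding with the target is lifted for now.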
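
To make the three reward terms above easier to check by hand, here is a minimal standalone sketch. It is an illustration only, not the project's code: it uses System.Numerics instead of UnityEngine, so ProjectOnPlane is re-implemented locally, and the class name, method names, and sample vectors are all hypothetical. Only the coefficients (0.05, 0.01, 0.1, 0.3) and the episode length (3.3) come from the notes.

```csharp
using System;
using System.Numerics;

// Standalone sketch of the reward terms; all names here are hypothetical.
static class AttackRewardSketch
{
    // Removes the component of v that lies along the plane normal,
    // which is what Unity's Vector3.ProjectOnPlane is documented to do.
    static Vector3 ProjectOnPlane(Vector3 v, Vector3 normal)
    {
        float sqrMag = Vector3.Dot(normal, normal);
        if (sqrMag < 1e-12f) return v; // degenerate normal: nothing to remove
        return v - normal * (Vector3.Dot(v, normal) / sqrMag);
    }

    // One-off reward at the moment of the hit (impulseRewardCoef = 0.05f in the notes).
    static float HitReward(Vector3 impulse, Vector3 hitOnVelocity, float coef = 0.05f)
        => ProjectOnPlane(impulse, hitOnVelocity).Length() * coef;

    // Per-step reward after the hit (coefficient 0.01f in the notes).
    static float PostHitReward(Vector3 targetVelocity, Vector3 hitOnVelocity, float coef = 0.01f)
        => ProjectOnPlane(targetVelocity, hitOnVelocity).Length() * coef;

    // Penalty when the episode fails before any hit; an earlier failure costs more.
    // The notes use coef 0.1f for a weak point on the ground and 0.3f for falling below y = -1.
    static float EarlyFailPenalty(float survivedTime, float episodeLength, float coef)
        => survivedTime < episodeLength ? (survivedTime - episodeLength) * coef : 0f;

    static void Main()
    {
        // Assumed sample values: the kobold was moving along +Z when it hit.
        Vector3 hitOnVelocity = Vector3.Normalize(new Vector3(0f, 0f, 2f));
        Vector3 impulse = new Vector3(3f, 0f, 4f);

        // Only the sideways part of the impulse, (3, 0, 0), contributes.
        Console.WriteLine(HitReward(impulse, hitOnVelocity));

        // A target knocked sideways earns reward; one pushed straight ahead earns none.
        Console.WriteLine(PostHitReward(new Vector3(3f, 0f, 4f), hitOnVelocity));
        Console.WriteLine(PostHitReward(new Vector3(0f, 0f, 5f), hitOnVelocity));

        // Falling over 1.3 s into a 3.3 s episode without a hit: roughly -0.2.
        Console.WriteLine(EarlyFailPenalty(1.3f, 3.3f, 0.1f));
    }
}
```

One consequence of this formula worth noting: a target pushed exactly along the kobold's own direction of travel yields no post-hit reward, so simply running through the target is not rewarded after contact.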
Experiment duration:
Step: 5e7
Time Elapsed: 220422s (61.23hr)
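
As a quick sanity check on the figures above, the following sketch recomputes the elapsed hours and derives the average training throughput. The throughput number is not stated in the notes; it simply follows from dividing the step count by the elapsed seconds, and comes out at about 61.23 hours and roughly 227 steps per second.

```csharp
using System;
using System.Globalization;

static class TrainingTimeSketch
{
    static void Main()
    {
        const double steps = 5e7;             // "Step: 5e7" from the notes
        const double elapsedSeconds = 220422; // "Time Elapsed" from the notes

        double hours = elapsedSeconds / 3600.0;         // seconds to hours
        double stepsPerSecond = steps / elapsedSeconds; // average throughput

        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "{0:F2} hr, {1:F1} steps/s", hours, stepsPerSecond));
    }
}
```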
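Experiment results:
The experiment succeeded: the kobold clearly uses its weapon to knock the target off in another direction.
This confirms that, when the reward shaping is suitable, ML can handle this kind of problem after all.
There are currently three shortcomings, though:
1. The strike does not look very forceful
It looks like a flick powered by the forearm alone
2. It performs poorly against moving targets
Training used a stationary target at a fixed position, so when the kobold tries to attack 水月 its hit rate is visibly poor
3. The attack only works as a follow-up to a fast run
If the kobold has to accelerate from very close range, it fails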
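On reflection, what this series of experiments should establish first is feasibility and the key factors.
So the priority is to work out how to make the kobold willing to attack more fiercely.
For moving targets, some variation in position is enough for now.
So for the next experiment: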
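1. Make the target heavier
5 => 50
Increase the coefficient on the target's velocity
2. The target's position will vary slightly
3. Shrink the spawn area
so that most of the time the kobold has not sprinted up to high speed. This may make it want to take some steps of its own (at high speed such attempts keep ending in falls)
4. The kobold scores for rotating about Vector3.up
The aim is to encourage a whirlwind hammer swing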
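
The notes do not say how the rotation score in item 4 would be computed. One possible form, offered purely as an assumption, is to reward the magnitude of the body's angular velocity about the up axis. In the sketch below, SpinReward, spinRewardCoef, and the sample angular velocities are hypothetical, and System.Numerics stands in for UnityEngine.

```csharp
using System;
using System.Numerics;

static class SpinRewardSketch
{
    // Unity's Vector3.up is (0, 1, 0); Vector3.UnitY is the System.Numerics equivalent.
    static readonly Vector3 Up = Vector3.UnitY;

    // Hypothetical reward: size of the angular velocity component about the up axis.
    // Taking the absolute value rewards spinning in either direction.
    static float SpinReward(Vector3 angularVelocity, float spinRewardCoef)
        => MathF.Abs(Vector3.Dot(angularVelocity, Up)) * spinRewardCoef;

    static void Main()
    {
        Vector3 spinning = new Vector3(0.5f, 6f, 0f); // rad/s, mostly yaw
        Vector3 tumbling = new Vector3(6f, 0f, 0.5f); // rad/s, no yaw component

        Console.WriteLine(SpinReward(spinning, 0.01f)); // positive: yaw is rewarded
        Console.WriteLine(SpinReward(tumbling, 0.01f)); // zero: rolling or pitching is not
    }
}
```

Projecting onto the up axis rather than using the full angular speed keeps the term from rewarding tumbling, which would otherwise work against the weak-point-on-ground failure condition.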
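However, this experiment has a fairly high chance of failing.
The hardest part of strike experiments is that only once the kobold actually hits the target does it learn that hitting scores, and only then can it start optimizing.
Until now, the kobold's high-speed running had the side effect of "bumping into" the target.
But if the kobold barely runs, it may never even discover that the target can be hit, so the spawn points may need adjusting so that some of them are very close to the target.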
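
One way to arrange that some spawn points are very close to the target is to draw the spawn distance from two ranges, picking the near range with a fixed probability. The sketch below is only an illustration of that idea: SampleSpawnDistance, the 20% near chance, and the distance ranges are assumed values, not settings taken from the experiment.

```csharp
using System;

static class SpawnDistanceSketch
{
    // Picks a spawn distance from the near range with probability nearChance,
    // otherwise from the far range. All parameters are hypothetical.
    static float SampleSpawnDistance(Random rng, float nearChance,
        float nearMin, float nearMax, float farMin, float farMax)
    {
        bool spawnNear = rng.NextDouble() < nearChance;
        float min = spawnNear ? nearMin : farMin;
        float max = spawnNear ? nearMax : farMax;
        return min + (float)rng.NextDouble() * (max - min);
    }

    static void Main()
    {
        var rng = new Random(0);
        const int episodes = 1000;
        int nearCount = 0;

        for (int i = 0; i < episodes; i++)
        {
            // Near range 1 to 2 units, far range 4 to 8 units (assumed values).
            float distance = SampleSpawnDistance(rng, 0.2f, 1f, 2f, 4f, 8f);
            if (distance <= 2f) nearCount++;
        }

        Console.WriteLine($"{nearCount} of {episodes} episodes spawned within 2 units of the target");
    }
}
```

With a setup like this, a fraction of episodes start within weapon reach, which gives the policy a chance to stumble onto the hit reward even when it rarely sprints.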