
Doyle Stand 2

夏洛爾 | 2022-10-31 03:59:38

Doyle Stand V2

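Experiment goals:

1. Right after the agent reaches a standing pose it may still be unstable, so it then has to settle into a steady stand.

2. Right after the agent reaches a standing pose it may not be facing the target, so it has to turn toward the target.

Experiment design:

1. Any weak point touching the ground counts as a failure (the tail and the sword are not weak points).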

2.

if(weaknessOnGround)
{
    if(inferenceMode)
    {
        brainMode = DoyleMode.GetUp;
        SetModel("DoyleGetUp", getUpBrain);
        behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
    else
    {
        AddReward(-1f);
        judge.outLife++;
        judge.Reset();
        return;
        // brainMode = DoyleMode.GetUp;
        // SetModel("DoyleGetUp", getUpBrain);
        // behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
}
else if(doyleRoot.localPosition.y < -10f)
{
    if(inferenceMode)
    {
        brainMode = DoyleMode.GetUp;
        SetModel("DoyleGetUp", getUpBrain);
        behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
    else
    {
        AddReward(-1f);
        judge.outY++;
        judge.Reset();
        return;
        // brainMode = DoyleMode.GetUp;
        // SetModel("DoyleGetUp", getUpBrain);
        // behaviorParameters.BehaviorType = BehaviorType.InferenceOnly;
    }
}
else
{
    targetSmoothPosition = targetPositionBuffer.GetSmoothVal();
    headDir = targetSmoothPosition - stageBase.InverseTransformPoint(doyleHeadRb.position);
    rootDir = targetSmoothPosition - stageBase.InverseTransformPoint(doyleRootRb.position);
    flatTargetVelocity = rootDir;
    flatTargetVelocity.y = 0f;
    targetDistance = flatTargetVelocity.magnitude;
    Vector3 flatLeftDir = Vector3.Cross(flatTargetVelocity, Vector3.up);

    lookAngle = Mathf.InverseLerp(180f, 0f, Vector3.Angle(doyleHead.up, headDir));
    upAngle = Mathf.InverseLerp(180f, 0f, Vector3.Angle(doyleHead.right * -1f, Vector3.up));
    spineLookAngle = Mathf.InverseLerp(180f, 0f, Vector3.Angle(doyleSpine.forward, flatLeftDir));
    spineUpAngle = Mathf.InverseLerp(180f, 45f, Vector3.Angle(doyleSpine.right * -1f, Vector3.up));
    rootLookAngle = Mathf.InverseLerp(180f, 0f, Vector3.Angle(doyleRoot.right * -1f, flatLeftDir));
    rootUpAngle = Mathf.InverseLerp(180f, 30f, Vector3.Angle(doyleRoot.up, Vector3.up));

    // float velocityReward = Mathf.InverseLerp(0f, 10f, doyleRootRb.velocity.magnitude) * 0.5f
    //     + Mathf.InverseLerp(0f, 10f, doyleSpineRb.velocity.magnitude) * 0.3f
    //     + Mathf.InverseLerp(0f, 10f, doyleHeadRb.velocity.magnitude) * 0.2f;
    // float angularReward = Mathf.InverseLerp(0f, 6.28f, doyleRootRb.angularVelocity.magnitude) * 0.2f
    //     + Mathf.InverseLerp(0f, 6.28f, doyleSpineRb.angularVelocity.magnitude) * 0.3f
    //     + Mathf.InverseLerp(0f, 6.28f, doyleHeadRb.angularVelocity.magnitude) * 0.5f;
    velocityReward = GetVelocityReward(8f);
    angularReward = GetAngularVelocityReward(10f);
    float standReward = (doyleLeftFeetBody.isStand? 0.5f : 0f) + (doyleRightFeetBody.isStand? 0.5f : 0f);

    lastReward = (1f-velocityReward) * 0.02f
        + (1f-angularReward) * 0.02f
        + (lookAngle + upAngle + spineLookAngle + spineUpAngle + rootLookAngle + rootUpAngle) * 0.008f
        + standReward * 0.012f;
    totalReward += lastReward;
    AddReward( lastReward );
}

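// Roughly speaking:

--1. Reward keeping linear and angular velocity low

--2. Reward the head's look angle toward the target

--3. Reward the Spine and Root angles (facing the target and staying upright)

--4. Reward both feet being in contact with the ground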
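To make the weighting easier to see, here is a minimal stand-alone sketch of the per-step reward, written in plain C# without Unity. The class `StandRewardSketch` and its method names are hypothetical and not part of the project. `InverseLerp` is re-implemented to behave like `UnityEngine.Mathf.InverseLerp` (clamped to 0..1). The sketch assumes `GetVelocityReward` and `GetAngularVelocityReward`, whose bodies are not shown in this post, return values in the 0..1 range, as the commented-out `InverseLerp` version suggests.

```csharp
using System;

// Hypothetical stand-alone sketch; not the actual agent class from this post.
public static class StandRewardSketch
{
    // Same contract as UnityEngine.Mathf.InverseLerp:
    // position of value between a and b, clamped to [0, 1].
    public static float InverseLerp(float a, float b, float value)
    {
        if (a == b) return 0f;
        return Math.Clamp((value - a) / (b - a), 0f, 1f);
    }

    // Maps an angle error in degrees to a score:
    // worstAngle -> 0, bestAngle (or anything smaller) -> 1.
    public static float AngleScore(float angleDeg, float worstAngle = 180f, float bestAngle = 0f)
    {
        return InverseLerp(worstAngle, bestAngle, angleDeg);
    }

    // Weighted sum with the same coefficients as lastReward above.
    // ASSUMPTION: velocityReward and angularReward are already in [0, 1].
    // angleScoreSum is the sum of the six angle scores (0..6).
    // feetOnGround is 0, 1, or 2 (each foot contributes 0.5 to standReward).
    public static float StepReward(float velocityReward, float angularReward,
                                   float angleScoreSum, int feetOnGround)
    {
        float standReward = 0.5f * feetOnGround;
        return (1f - velocityReward) * 0.02f
             + (1f - angularReward) * 0.02f
             + angleScoreSum * 0.008f
             + standReward * 0.012f;
    }
}
```

Under those assumptions, `StandRewardSketch.StepReward(0f, 0f, 6f, 2)` comes to about 0.1 per step: 0.02 + 0.02 from staying still, 6 × 0.008 = 0.048 from the six angle terms, and 0.012 from both feet being planted. Because `spineUpAngle` and `rootUpAngle` use 45 and 30 degrees as their upper bound, `AngleScore(30f, 180f, 45f)` is already 1, so any tilt inside that tolerance earns the full score.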

Experiment time:

Step: 5e7

Time Elapsed: 42085s (11.69hr)

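Experiment results:

The experiment succeeded, but the result is not ideal.

It falls short in three ways:

1. Compared with the extremely fast get-up, the Stand model is too slow at adjusting its facing.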