Kobold Sentinel Get-Up 1

夏洛爾 | 2022-11-23 21:19:51

Kobold Sentinel GetUp v1

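Experiment goals:

1. Penalty-based scoring

2. Reach the "standing moment" quickly

3. No constraint on gaze direction at the standing moment (aiming will be handled by the standing behavior)

4. Enable Take Actions Between Decisions

(New) 5. Size is randomized between 1 and 2; Size affects mass and JointDrive

(New) 6. ML-Agents is Release 19; Unity is 2021.3.11f1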

Experiment design:

(Carried over from 道爾 Get-Up 16)

1. Weak point touches the ground

AddReward(-0.0001f * koboldBodies[i].damageCoef);
life -= 0.005f * koboldBodies[i].damageCoef;
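To see how the two constants relate, here is a minimal sketch of repeated weak-point contact, in plain C# without Unity. The starting life of 1, the damageCoef of 2, and the helper name `ContactStep` are assumptions; only the two constants come from the snippet above.

```csharp
using System;

// One step of weak-point contact, using the constants from the snippet above:
// reward changes by -0.0001 * damageCoef, life drops by 0.005 * damageCoef.
static (float Life, float Reward) ContactStep(float currentLife, float currentReward, float damageCoef) =>
    (currentLife - 0.005f * damageCoef, currentReward - 0.0001f * damageCoef);

// Assumptions for illustration only: life starts at 1 and damageCoef is 2.
float life = 1f, reward = 0f;
int steps = 0;
while (life > 0f)
{
    (life, reward) = ContactStep(life, reward, 2f);
    steps++;
}
Console.WriteLine($"steps until life <= 0: {steps}, accumulated contact penalty: {reward:F4}");
```

Because both terms are multiplied by the same damageCoef, draining one unit of life always costs about 0.02 in reward regardless of which body part touched the ground. The coefficient only changes how fast that happens.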

2.

// Set: judge.endEpisode = true
// Set: judge.episodeLength = 10f
if (life <= 0f)
{
    if (inferenceMode)
    {
    }
    else
    {
        // === Train Get Up ===
        float survivedTime = Time.fixedTime - arrivedMoment;
        if (survivedTime < judge.episodeLength)
        {
            AddReward((survivedTime - judge.episodeLength) * 0.1f);
        }
        judge.outLife++;
        judge.Reset();
        return;
    }
}
else if (koboldRoot.localPosition.y < -10f)
{
    if (inferenceMode)
    {
    }
    else
    {
        // === Train Get Up ===
        float survivedTime = Time.fixedTime - arrivedMoment;
        if (survivedTime < judge.episodeLength)
        {
            AddReward((survivedTime - judge.episodeLength) * 0.1f);
        }
        judge.outY++;
        // === All Required ===
        judge.Reset();
        return;
    }
}

targetSmoothPosition = targetPositionBuffer.GetSmoothVal();
headDir = targetSmoothPosition - stageBase.InverseTransformPoint(koboldHeadRb.position);
spineDir = targetSmoothPosition - stageBase.InverseTransformPoint(koboldSpine.position);
rootDir = targetSmoothPosition - stageBase.InverseTransformPoint(koboldRootRb.position);

lookAngle = Mathf.InverseLerp(180f, 0f, Vector3.Angle(koboldHead.up, headDir));
upAngle = Mathf.InverseLerp(180f, 0f, Vector3.Angle(koboldHead.right * -1f, Vector3.up));
spineLookAngle = Mathf.InverseLerp(180f, 30f, Vector3.Angle(koboldSpine.up, spineDir));
spineUpAngle = Mathf.InverseLerp(180f, 30f, Vector3.Angle(koboldSpine.right * -1f, Vector3.up));
rootLookAngle = Mathf.InverseLerp(180f, 30f, Vector3.Angle(koboldRoot.forward, rootDir));
rootUpAngle = Mathf.InverseLerp(180f, 30f, Vector3.Angle(koboldRoot.up, Vector3.up));

leftThighAngle = Mathf.InverseLerp(180f, 45f, Vector3.Angle(koboldLeftThigh.right, Vector3.up));
leftCalfAngle = Mathf.InverseLerp(180f, 45f, Vector3.Angle(koboldLeftCalf.right, Vector3.up));
rightThighAngle = Mathf.InverseLerp(180f, 45f, Vector3.Angle(koboldRightThigh.right, Vector3.up));
rightCalfAngle = Mathf.InverseLerp(180f, 45f, Vector3.Angle(koboldRightCalf.right, Vector3.up));

avgVelocity = velocityBuffer.GetSmoothVal();
flatVelocity = avgVelocity;
flatVelocity.y = 0f;
velocityCoef = Mathf.InverseLerp(0f, 10f, flatVelocity.magnitude);

// Reward: -1 + angles
lastReward = (upAngle + spineUpAngle + rootUpAngle) * 0.00033f
    + (lookAngle + spineLookAngle + rootLookAngle) * 0.000133f
    + (leftThighAngle + leftCalfAngle + rightThighAngle + rightCalfAngle) * 0.0001f
    + (1f - velocityCoef) * 0.00018f
    + (1f - exertionRatio) * 0.00002f
    - 0.002f;
totalReward += lastReward;
AddReward(lastReward);

if (hasLanding && !weaknessOnGround && velocityCoef < 0.2f
    && upAngle > 0.9f && spineUpAngle > 0.9f && rootUpAngle > 0.9f
    && leftThighAngle > 0.9f && leftCalfAngle > 0.9f
    && rightThighAngle > 0.9f && rightCalfAngle > 0.9f)
{
    // === Train Get Up ===
    AddReward(1f);
    judge.survived++;
    judge.Reset();
    return;
}
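The "penalty-based" goal can be checked against the weights in the code above. The sketch below is plain C# without Unity and evaluates the per-step reward at its two extremes, plus the early-termination penalty. `StepReward` and `EarlyEndPenalty` are names made up for this sketch. The angle and velocity terms are `Mathf.InverseLerp` results, which are clamped to 0..1; that exertionRatio is also in 0..1 is an assumption, since its definition is not shown in the post.

```csharp
using System;

// Per-step reward with the weights from the code above.
// upSum, lookSum and legSum are sums of ratios that each lie in [0, 1].
static float StepReward(float upSum, float lookSum, float legSum,
                        float velocityCoef, float exertionRatio) =>
    upSum * 0.00033f                    // upAngle + spineUpAngle + rootUpAngle, at most 3
    + lookSum * 0.000133f               // lookAngle + spineLookAngle + rootLookAngle, at most 3
    + legSum * 0.0001f                  // four thigh/calf ratios, at most 4
    + (1f - velocityCoef) * 0.00018f
    + (1f - exertionRatio) * 0.00002f
    - 0.002f;

// Early-termination penalty: 0.1 per second short of episodeLength.
static float EarlyEndPenalty(float survivedTime, float episodeLength) =>
    survivedTime < episodeLength ? (survivedTime - episodeLength) * 0.1f : 0f;

float best = StepReward(3f, 3f, 4f, 0f, 0f);   // perfect posture, motionless, no exertion
float worst = StepReward(0f, 0f, 0f, 1f, 1f);  // every term at its minimum
Console.WriteLine($"best step reward:  {best:E2}");
Console.WriteLine($"worst step reward: {worst:E2}");
Console.WriteLine($"ending after 2.5 s of a 10 s episode: {EarlyEndPenalty(2.5f, 10f):F2}");
```

The positive weights sum to 0.001989, just under the constant 0.002, so even a perfect pose earns about -0.000011 per step. The only positive reward is the +1 for reaching the standing moment, which pushes the agent to get there quickly.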

Experiment time:

Step: 5e7

Time Elapsed: 47884s (13.3hr)

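Experiment results:

The experiment was a success: the Kobold Sentinel gets up efficiently and reaches the "standing moment", and does so quite effectively.

Its motion and efficiency are also nearly identical regardless of the scale.

However, although the tail was set as a weak point this time, the agent still shows a clear tendency to use its tail heavily.

In addition, compared with 道爾 Get-Up 16, which finished in only 8.6 hr, the Kobold Sentinel took about 1.5 times as long.

From observation, the hit force may be set too high: some hits send the kobold flying out of the arena, and those episodes waste training time.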
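The timing figures above can be reproduced with a few lines of arithmetic. This is only a check of the reported numbers (5e7 steps, 47884 s, and the 8.6 hr reference run), not a new measurement.

```csharp
using System;

// Figures reported in this post.
const double steps = 5e7;
const double elapsedSeconds = 47884;
const double referenceHours = 8.6;   // 道爾 Get-Up 16

double hours = elapsedSeconds / 3600.0;
double stepsPerSecond = steps / elapsedSeconds;
double ratio = hours / referenceHours;

Console.WriteLine($"elapsed: {hours:F1} hr");
Console.WriteLine($"throughput: {stepsPerSecond:F0} steps/s");
Console.WriteLine($"ratio vs reference: {ratio:F2}x");
```

This works out to roughly 1,044 steps per second and a ratio of about 1.55, consistent with the "1.5 times" stated above.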

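The next experiment will be Kobold Sentinel standing still; the design is expected to carry over from 道爾 standing still:

1. Reward the aiming direction

2. Reward suppressing whole-body velocity and angular velocity

3. Reward both feet touching the ground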

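Warning:

There is a bug in the scaling: when the character is scaled up, only the Spring and Damper of the JointDrive are boosted, while Maximum Force is not.

The new version of the code fixes this. Because the character does not necessarily hit Maximum Force, the Kobold Sentinel model still gets up efficiently in current tests.

Depending on how things go, though, retraining the get-up may still have to be considered.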
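The bug described above can be sketched as follows. This is plain C# without Unity; the tuple stands in for the three JointDrive values named in the warning (Spring, Damper, Maximum Force). The helper names, the base values, and the multiplier are all made up for illustration, and the real scaling rule is not given in this post.

```csharp
using System;

// Buggy variant: Spring and Damper are boosted, Maximum Force is left as-is.
static (float Spring, float Damper, float MaxForce) ScaleBuggy(
    (float Spring, float Damper, float MaxForce) drive, float multiplier) =>
    (drive.Spring * multiplier, drive.Damper * multiplier, drive.MaxForce);

// Fixed variant: all three values receive the same boost.
static (float Spring, float Damper, float MaxForce) ScaleFixed(
    (float Spring, float Damper, float MaxForce) drive, float multiplier) =>
    (drive.Spring * multiplier, drive.Damper * multiplier, drive.MaxForce * multiplier);

var baseDrive = (Spring: 1000f, Damper: 50f, MaxForce: 2000f); // made-up values
float sizeBoost = 4f; // hypothetical strength multiplier for a larger Size

Console.WriteLine($"buggy: {ScaleBuggy(baseDrive, sizeBoost)}");
Console.WriteLine($"fixed: {ScaleFixed(baseDrive, sizeBoost)}");
```

With the buggy variant a larger character has stiffer joints but the same force cap, so the difference only shows up when a joint actually saturates at Maximum Force. That fits the observation that the trained model still works after the fix.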

#PhysicsBasedCharacter #UnityML #KoboldSentinel
