  1. #1 LootaBox (Apprentice, registered May 2017, 38 posts)

    Performant damage over time

    Hi,

    I have been working on a system that allows various damage-over-time effects, e.g. burn for fire spells. So far I have only experimented with LeGo buffs, and with them I have been able to set up a somewhat working system where the damage ticks, for example, once per second.

    However, if I increase the tick rate to more than a few times per second, the performance becomes a little choppy when there are a lot of buffs going around, e.g. a lot of burn debuffs from a fire rain spell. I can't help but wonder if there is some more performant approach I could explore.

    Even if there are no other easy approaches, perhaps someone can simply confirm that these LeGo buffs can be performant even if there are many of them ticking simultaneously! Any ideas or hints would be much appreciated.

  2. #2 LootaBox (Apprentice, registered May 2017, 38 posts)
    So, I think I found a potential solution.

    My idea is to modify the unused existing HP/Mana regeneration mechanic (with ATR_REGENERATEHP and ATR_REGENERATEMANA) so that these attributes instead control what direction HP/Mana should be changed at a set interval.

    Using IDA, I was able to find the part of oCNpc::Regenerate that handles this mechanic. For example, this is the part that handles the regeneration for HP.
    Code:
    00742030 - push    1               ; int
    00742032 - push    0               ; int
    00742034 - mov     ecx, esi        ; this
    00742036 - call    ?ChangeAttribute@oCNpc@@QAEXHH@Z ; oCNpc::ChangeAttribute(int,int)
    0074203B - fild    dword ptr [esi+1D0h]
    00742041 - fmul    ds:__real@447a0000
    00742047 - fstp    dword ptr [esi+7C4h]
    The part up to and including the ChangeAttribute call handles the regeneration: 1 is the amount, 0 is the attribute index for hitpoints. The part after it resets the timer for the regen. It loads the integer from ATR_REGENERATEHP (converted to float) and multiplies it by 1000 to get milliseconds, then stores the result for the next cycle.
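    As a sanity check on the constant, here is a minimal Python sketch (not mod code; the helper names are made up for illustration). It decodes the bit pattern from __real@447a0000 and mirrors what the fild/fmul/fstp sequence computes, assuming the attribute is a plain integer number of seconds.

```python
import struct


def float_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def next_timer_ms(regen_attr: int) -> float:
    """Mirror fild [esi+1D0h]; fmul __real@447a0000; fstp [esi+7C4h]."""
    return float(regen_attr) * float_from_bits(0x447A0000)


print(float_from_bits(0x447A0000))  # the multiplier used by fmul
print(next_timer_ms(3))             # timer value stored for ATR_REGENERATEHP = 3
```

    So the symbol name itself spells out the constant: 0x447A0000 is the float 1000.0.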

    It is clear to me what needs to be done:
    * Set the amount to be read from the attribute instead of using constant 1.
    * Set the timer to be stored from constant value instead of reading it from the attribute.

    I think I might be able to figure out how to load and push the attribute value for the ChangeAttribute call, but the bit with the floating points is a bit difficult... I don't really know how to "load" a constant and then store it.

    Any help would be much appreciated!

    PS, in case you are wondering:
    Outside of this engine change, my plan would be to use the LeGo buffs to increase/decrease these attributes on the hero/NPCs according to what is happening in the game. This should probably solve the main issue I'm having with damage-over-time effects based on LeGo buffs: choppy performance with many overlapping buffs ticking at the same time.

    EDIT

    I realize I could also just change that "push 1" into a "push 0" and use a hook to actually handle the damage/heal effects. This could give some more flexibility, though I don't know if it might have the same performance issue I am having with the buffs. In any case, I would still need to somehow override the timer with a constant value.
    Last edited by LootaBox (19.06.2021 at 17:46)

  3. #3 LootaBox (Apprentice, registered May 2017, 38 posts)
    Alright, made some progress, but still struggling.

    First, I noticed a bug: mana regen actually looks at ATR_REGENERATEHP to check whether it triggers at all. The following snippet fixes it and could be used independently (in case some madman wanted to use the regen mechanic as it is).
    Code:
        // Fix: mana regen triggers if ATR_REGENERATEMANA != 0 (instead of ATR_REGENERATEHP)
        MemoryProtectionOverride(7610451, 1);
        MEM_WriteByte(7610451 /* 0x742053 */, 212 /* 0xd4 */); // [esi+1D0h] -> [esi+1D4h]
    Honestly, no clue how the MemoryProtectionOverride is supposed to be used, but that works.


    Despite my initial trepidation, I managed to set the timer to a static value, rather than it being a number of seconds determined by the attribute. With the following snippet it is pretty easy to control how often this regen/degen is applied.
    Code:
        const float intervalMs = 200.0;
        const int oCNpc__Regenerate__Life_SetTimer_Start = 7610427; // 0x74203B
        const int oCNpc__Regenerate__Life_SetTimer_End   = 7610439; // 0x742047
        const int oCNpc__Regenerate__Mana_SetTimer_Start = 7610503; // 0x742087
        const int oCNpc__Regenerate__Mana_SetTimer_End   = 7610515; // 0x742093
    
        // Fill with NOPs (ultimately only replaces fild(6) -> NOP(x6))
        repeat(i1, oCNpc__Regenerate__Life_SetTimer_End - oCNpc__Regenerate__Life_SetTimer_Start); var int i1;
            MEM_WriteByte(oCNpc__Regenerate__Life_SetTimer_Start + i1, 144); // NOP | 0x90
        end;
        repeat(i2, oCNpc__Regenerate__Mana_SetTimer_End - oCNpc__Regenerate__Mana_SetTimer_Start); var int i2;
            MEM_WriteByte(oCNpc__Regenerate__Mana_SetTimer_Start + i2, 144); // NOP | 0x90
        end;
    
        // Replace fmul(6) -> fld(6) using *intervalMs
        var int intervalPtr; intervalPtr = MEM_GetFloatAddress(intervalMs);
        MEM_WriteByte(oCNpc__Regenerate__Life_SetTimer_End - 6, 217); // fld | 0xd9
        MEM_WriteByte(oCNpc__Regenerate__Life_SetTimer_End - 5,   5); // fld | 0x05
        MEM_WriteInt (oCNpc__Regenerate__Life_SetTimer_End - 4, intervalPtr);
        MEM_WriteByte(oCNpc__Regenerate__Mana_SetTimer_End - 6, 217); // fld | 0xd9
        MEM_WriteByte(oCNpc__Regenerate__Mana_SetTimer_End - 5,   5); // fld | 0x05
        MEM_WriteInt (oCNpc__Regenerate__Mana_SetTimer_End - 4, intervalPtr);
    The only bit I can't figure out is actually replacing that push 1 with what basically should be push dword ptr [esi+1D0h]. The trouble is that the original instruction is 2 bytes, while the replacement is longer (6 bytes with a 32-bit displacement: FF B6 D0 01 00 00). As the addresses following the call instruction are now NOPs, I assumed I could just "move" the call and all preceding instructions forward in memory to make space for the larger push instruction. I've tried several ways to do this, but I can't get anything working. I feel like I'm missing something obvious, but I can't see it.
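    To make the size difference concrete, here is a small Python sketch (an illustration only, with made-up helper names) that assembles both forms of the push by hand using the standard x86 ModRM layout. It covers only the simple case of a base register plus 32-bit displacement; ESP as base would need an extra SIB byte and is not handled.

```python
import struct

ESI = 6  # register number of ESI in the ModRM r/m field


def push_imm8(value: int) -> bytes:
    """push imm8: opcode 6A followed by one signed byte."""
    return bytes([0x6A]) + struct.pack("<b", value)


def push_mem_disp32(base_reg: int, disp: int) -> bytes:
    """push dword ptr [base_reg + disp32]: opcode FF /6 with mod=10 (disp32).

    Not valid for base_reg == 4 (ESP), which requires a SIB byte.
    """
    modrm = (0b10 << 6) | (6 << 3) | base_reg
    return bytes([0xFF, modrm]) + struct.pack("<i", disp)


old = push_imm8(1)
new = push_mem_disp32(ESI, 0x1D0)
print(old.hex(" "), "->", new.hex(" "))
print("extra bytes needed:", len(new) - len(old))
```

    The displacement 0x1D0 does not fit into a signed byte, so the short 3-byte form with an 8-bit displacement is not available here and the instruction grows by 4 bytes.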

    Any help or hints are appreciated, e.g. on how to move instructions around in memory; perhaps there is some magic between the bytes that I should be aware of. I am also wondering how this memory protection override should actually be used.
    Last edited by LootaBox (27.06.2021 at 13:34)

  4. #4 Lehona (Dea, registered Jul 2007, 10,446 posts)
    I can't really give you a complete solution, but at least I can fill in some bits and pieces to help you understand.

    Let's start with MemoryProtectionOverride. At the lowest level, all regions of memory are the same, i.e. the region where the code is placed is in the same physical RAM as the data (e.g. the name of an NPC, or the value of ATR_REGENERATEHP for a given NPC). This is called Von Neumann architecture. There are devices where this is not true (Harvard architecture), but those are generally only found in e.g. embedded devices and hence not relevant for us (consumer hardware is always Von Neumann architecture).

    However, the operating system still distinguishes these regions. To protect against malicious code, the region where the code from an executable is located is marked as read-only. In other words: you're not allowed to modify any of the bytes in that region. This is helpful even if there's no malicious code involved, because accidentally overwriting those bytes is generally a bug which could cause unforeseen consequences and might be hard to catch.

    You can, however, change the permissions for any given address with a call to the Windows kernel, which is exactly what MemoryProtectionOverride does. Hence the second parameter to MemoryProtectionOverride is the length of the memory region which should be made writeable. If you only need to overwrite one byte, you only need a length of 1 (MEM_WriteByte used to always write 4 bytes, but I think in the newest version of Ikarus it writes exactly 1 byte due to another bug we encountered).
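    One detail that may explain why a length of 1 was enough for several nearby patches: assuming MemoryProtectionOverride ends up in the Windows VirtualProtect call (an assumption, its implementation is not shown in this thread), protection is changed for whole pages, not single bytes. The Python sketch below only does the address arithmetic; the 4 KiB page size and the helper name are assumptions for illustration.

```python
PAGE_SIZE = 0x1000  # 4 KiB, the usual page size on x86 Windows (assumption)


def affected_range(address: int, length: int) -> tuple[int, int]:
    """Return first and last byte address of all pages touched by the range."""
    first_page = address & ~(PAGE_SIZE - 1)
    last_page = (address + length - 1) & ~(PAGE_SIZE - 1)
    return first_page, last_page + PAGE_SIZE - 1


lo, hi = affected_range(0x742053, 1)  # the override from the mana regen fix
print(hex(lo), hex(hi))

# The timer patches discussed earlier in the thread lie in the same page.
for patched in (0x74203B, 0x742047, 0x742087, 0x742093):
    print(hex(patched), lo <= patched <= hi)
```

    Under that assumption, a patch in a different function would need its own override if it falls into another page.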

    OK, this was quite long-winded; the next part (why you can't simply move the call) is a bit simpler. Both jumps and calls are generally encoded using relative offsets, i.e. they do not contain the absolute address which should be executed. Instead they contain an offset which is added to the current address (more precisely, the address of the instruction that follows). Hence moving the instruction, even slightly, will cause a crash most of the time. It's generally easier to avoid moving such an instruction, but if you absolutely have to do it, you will need to adjust the offset accordingly.
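    A minimal Python sketch of that adjustment, for a near call (opcode E8 followed by a 4-byte relative offset). The call address 0x742036 is taken from the listing earlier in the thread; the target address and the shift of 4 bytes are hypothetical values chosen for illustration.

```python
import struct

CALL_LEN = 5  # E8 opcode + 4-byte signed relative offset


def encode_call(call_addr: int, target: int) -> bytes:
    """Encode 'call target' for an instruction placed at call_addr."""
    rel32 = target - (call_addr + CALL_LEN)
    return b"\xE8" + struct.pack("<i", rel32)


def decode_target(call_addr: int, code: bytes) -> int:
    """Return the absolute address a call at call_addr would jump to."""
    (rel32,) = struct.unpack("<i", code[1:5])
    return call_addr + CALL_LEN + rel32


old_addr = 0x742036   # position of the call in the listing
target = 0x72E1A0     # hypothetical address of the called function
shift = 4             # hypothetical number of bytes the call is moved forward

original = encode_call(old_addr, target)

# Copying the bytes unchanged makes the call land 'shift' bytes past the target.
print(hex(decode_target(old_addr + shift, original)))

# Subtracting the shift from the stored offset restores the intended target.
(rel32,) = struct.unpack("<i", original[1:5])
fixed = b"\xE8" + struct.pack("<i", rel32 - shift)
print(hex(decode_target(old_addr + shift, fixed)))
```

    The same reasoning applies to relative jumps, with the instruction length adjusted to the jump form in question.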

  5. #5 Kirides (Ritter, Northern Germany, registered Jul 2009, 1,780 posts)
    Can you try to modify the LeGo "Talents.d" code that does the NPC lookup?
    I found that if you "optimize" the lookup for buffed NPCs, you get much better performance in the regular case; in the worst case, you lose a tiny bit of performance because of the newly introduced checks.

    Look for "Npc_FindByID" and adjust the code as in this spoiler. This might also work well for you.

    Spoiler:
    Code:
    func int _TAL_List_FindByID(var int list, var int targetID) {
        var zCListSort l;
        var C_Npc npc;
    
        while(list);
            l = _^(list);
            if (l.data) {
                npc = _^(l.data);
                if (npc.aivar[AIV_TALENT] == targetID) {
                    return l.data;
                };
            };
            list = l.next;
        end;
        return 0;
    };
    
    func int Npc_FindByID(var int ID) { // GetByID would probably be too similar to GetID
        if (MEM_World.voblist_npcs) {
            return _TAL_List_FindByID(MEM_World.voblist_npcs, ID);
        };
        return 0;
    };
    Last edited by Kirides (25.06.2021 at 14:46)

  6. #6 LootaBox (Apprentice, registered May 2017, 38 posts)
    Thank you Lehona, this really helped me understand the whole thing better. From my previous snippets the MemoryProtectionOverride length can be reduced to just 1 and I managed to change the "push 1" now to push the appropriate attribute values:

    Code:
        const int oCNpc__Regenerate__Life_CallChangeAttribute_Start = 7610416; // 0x742030
        const int oCNpc__Regenerate__Life_CallChangeAttribute_End   = 7610427; // 0x74203B
        const int oCNpc__Regenerate__Mana_CallChangeAttribute_Start = 7610492; // 0x74207C
        const int oCNpc__Regenerate__Mana_CallChangeAttribute_End   = 7610503; // 0x742087
    
        // All call related instructions will be offset, so the call offset itself must first be adjusted
        const int offset = 4; var int orig_pos;
        orig_pos = MEM_ReadInt(oCNpc__Regenerate__Life_CallChangeAttribute_End - 4);
        MEM_WriteInt(oCNpc__Regenerate__Life_CallChangeAttribute_End - 4, orig_pos - offset);
        orig_pos = MEM_ReadInt(oCNpc__Regenerate__Mana_CallChangeAttribute_End - 4);
        MEM_WriteInt(oCNpc__Regenerate__Mana_CallChangeAttribute_End - 4, orig_pos - offset);
    
        // Offset all CallChangeAttribute related instruction to make space
        repeat(k1, oCNpc__Regenerate__Life_CallChangeAttribute_End - oCNpc__Regenerate__Life_CallChangeAttribute_Start); var int k1;
            MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_End - k1 + (offset - 1), MEM_ReadByte(oCNpc__Regenerate__Life_CallChangeAttribute_End - k1 - 1));
        end;
        repeat(k2, oCNpc__Regenerate__Mana_CallChangeAttribute_End - oCNpc__Regenerate__Mana_CallChangeAttribute_Start); var int k2;
            MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_End - k2 + (offset - 1), MEM_ReadByte(oCNpc__Regenerate__Mana_CallChangeAttribute_End - k2 - 1));
        end;
    
        // push 1 -> push dword ptr [esi+1D0h]
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 0, 255); // 0xff
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 1, 182); // 0xb6
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 2, 208); // 0xd0
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 3,   1); // 0x01
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 4,   0); // 0x00
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 5,   0); // 0x00
    
        // push 1 -> push dword ptr [esi+1D4h]
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 0, 255); // 0xff
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 1, 182); // 0xb6
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 2, 212); // 0xd4
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 3,   1); // 0x01
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 4,   0); // 0x00
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 5,   0); // 0x00
    So, this combined with the previous snippets allows easy control over regen, degen or a combination of the two for any NPC. I have not done any performance testing yet to see if it is actually any faster, but if nothing else this was a great learning experience for me.

    There is one issue, another bug it seems, that I have not yet quite resolved... Sometimes the regen just stops, e.g. if you take fall damage, due to this condition:
    Code:
    fcomp   ds:__real@00000000
    fnstsw  ax
    test    ah, 41h
    jp      short loc_742051
    After breaking it down, my understanding is that it jumps (skipping regen) while st(0) > 0.0: test ah, 41h keeps only the C3 and C0 condition bits, and jp is taken when both are clear (or both set, for an unordered compare). As far as I can see, what should be on top of the stack is the timer for the regen, as expected, so regen is skipped as long as the timer has not run down. My suspicion is that for some reason this timer (at esi+7C4h inside the function) is overwritten somewhere else, but I'm not sure... I will have to look into this a bit more before I even know what to ask about.
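    For anyone who wants to check the flag logic, here is a small Python model of the fcomp / fnstsw ax / test ah, 41h / jp sequence. It is only a sketch of the documented x87 condition codes and the x86 parity flag; the function names are made up for illustration.

```python
def fcomp_ah(st0: float, src: float) -> int:
    """Condition bits as they appear in AH after fcomp + fnstsw ax.

    C0 = 0x01, C2 = 0x04, C3 = 0x40 (other status bits are ignored here).
    """
    if st0 != st0 or src != src:  # NaN on either side: unordered
        return 0x45
    if st0 > src:
        return 0x00
    if st0 < src:
        return 0x01
    return 0x40  # equal


def jp_taken(ah: int, mask: int = 0x41) -> bool:
    """test ah, mask sets PF when the result has an even number of 1 bits."""
    result = ah & mask
    return bin(result).count("1") % 2 == 0


for timer in (250.0, 0.0, -16.0):
    print(timer, "skip regen" if jp_taken(fcomp_ah(timer, 0.0)) else "run regen")
```

    In this model the jump is taken for a positive value and not taken for zero or negative values, which matches a countdown timer that only lets the regen run once it has expired.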

    Kirides, I did not end up trying out your idea, as it does not seem like I will need it for this one, but I will keep it in mind. It may come in handy if I need better buff performance for some other reason.
    Thanks for all the help.

  7. #7 LootaBox (Apprentice, registered May 2017, 38 posts)
    Alright, finally found the time to dive into this again.

    The issue was not just with fall damage. My understanding is that PB (Piranha Bytes) had intended to reset the timer for this regen mechanic whenever any damage was received. Of course, it was reset according to the original mechanic, where the attribute determined the interval of the regen in seconds. I don't see the need for this reset with my changes, so I opted to remove it completely. This was easy enough to do by filling in some NOPs at the very end of oCNpc::OnDamage_Hit.

    Here is the full solution:
    Code:
        // ------------------------------------------------------------------------------
        // Fix: mana regen triggers if ATR_REGENERATEMANA != 0 (instead of ATR_REGENERATEHP)
        MemoryProtectionOverride(7610451, 1);
        MEM_WriteByte(7610451 /* 0x742053 */, 212 /* 0xd4 */); // [esi+1D0h] -> [esi+1D4h]
    
        // ------------------------------------------------------------------------------
        // Make timer for regen static and independent of the attribute
        const float intervalMs = 1000.0;
        const int oCNpc__Regenerate__Life_SetTimer_Start = 7610427; // 0x74203B
        const int oCNpc__Regenerate__Life_SetTimer_End   = 7610439; // 0x742047
        const int oCNpc__Regenerate__Mana_SetTimer_Start = 7610503; // 0x742087
        const int oCNpc__Regenerate__Mana_SetTimer_End   = 7610515; // 0x742093
    
        // Fill with NOPs (ultimately only replaces fild(6) -> NOP(x6))
        repeat(i1, oCNpc__Regenerate__Life_SetTimer_End - oCNpc__Regenerate__Life_SetTimer_Start); var int i1;
            MEM_WriteByte(oCNpc__Regenerate__Life_SetTimer_Start + i1, 144); // NOP | 0x90
        end;
        repeat(i2, oCNpc__Regenerate__Mana_SetTimer_End - oCNpc__Regenerate__Mana_SetTimer_Start); var int i2;
            MEM_WriteByte(oCNpc__Regenerate__Mana_SetTimer_Start + i2, 144); // NOP | 0x90
        end;
    
        // Replace fmul(6) -> fld(6) using *intervalMs
        var int intervalPtr; intervalPtr = MEM_GetFloatAddress(intervalMs);
        MEM_WriteByte(oCNpc__Regenerate__Life_SetTimer_End - 6, 217); // fld | 0xd9
        MEM_WriteByte(oCNpc__Regenerate__Life_SetTimer_End - 5,   5); // fld | 0x05
        MEM_WriteInt (oCNpc__Regenerate__Life_SetTimer_End - 4, intervalPtr);
        MEM_WriteByte(oCNpc__Regenerate__Mana_SetTimer_End - 6, 217); // fld | 0xd9
        MEM_WriteByte(oCNpc__Regenerate__Mana_SetTimer_End - 5,   5); // fld | 0x05
        MEM_WriteInt (oCNpc__Regenerate__Mana_SetTimer_End - 4, intervalPtr);
    
        // ------------------------------------------------------------------------------
        // Remove resetting of regen timers on damage calculation
        const int oCNpc__OnDamage_Hit__ResetTimers_Start = 6737613; // 0x66CECD
        const int oCNpc__OnDamage_Hit__ResetTimers_End   = 6737686; // 0x66CF16
    
        // Fill with NOPs
        var int address;
        repeat(r, oCNpc__OnDamage_Hit__ResetTimers_End - oCNpc__OnDamage_Hit__ResetTimers_Start); var int r;
            address = oCNpc__OnDamage_Hit__ResetTimers_Start + r;
            // there are unrelated mov and pop instructions in the middle; leave them untouched
            if (address < 6737657 /* 0x66CEF9 */ || address >= 6737664 /* 0x66CF00*/)
            && (address < 6737678 /* 0x66CF0E */ || address >= 6737680 /* 0x66CF10*/) {
                MEM_WriteByte(address, 144); // NOP | 0x90
            };
        end;
    
        // ------------------------------------------------------------------------------
        // Make regen use attribute for regen amount per cycle
        const int oCNpc__Regenerate__Life_CallChangeAttribute_Start = 7610416; // 0x742030
        const int oCNpc__Regenerate__Life_CallChangeAttribute_End   = 7610427; // 0x74203B
        const int oCNpc__Regenerate__Mana_CallChangeAttribute_Start = 7610492; // 0x74207C
        const int oCNpc__Regenerate__Mana_CallChangeAttribute_End   = 7610503; // 0x742087
    
        // All call related instructions will be offset, so the call offset itself must first be adjusted
        const int offset = 4; var int orig_pos;
        orig_pos = MEM_ReadInt(oCNpc__Regenerate__Life_CallChangeAttribute_End - 4);
        MEM_WriteInt(oCNpc__Regenerate__Life_CallChangeAttribute_End - 4, orig_pos - offset);
        orig_pos = MEM_ReadInt(oCNpc__Regenerate__Mana_CallChangeAttribute_End - 4);
        MEM_WriteInt(oCNpc__Regenerate__Mana_CallChangeAttribute_End - 4, orig_pos - offset);
    
        // Offset all CallChangeAttribute related instruction to make space
        repeat(k1, oCNpc__Regenerate__Life_CallChangeAttribute_End - oCNpc__Regenerate__Life_CallChangeAttribute_Start); var int k1;
            MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_End - k1 + (offset - 1), MEM_ReadByte(oCNpc__Regenerate__Life_CallChangeAttribute_End - k1 - 1));
        end;
        repeat(k2, oCNpc__Regenerate__Mana_CallChangeAttribute_End - oCNpc__Regenerate__Mana_CallChangeAttribute_Start); var int k2;
            MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_End - k2 + (offset - 1), MEM_ReadByte(oCNpc__Regenerate__Mana_CallChangeAttribute_End - k2 - 1));
        end;
    
        // push 1 -> push dword ptr [esi+1D0h]
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 0, 255); // 0xff
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 1, 182); // 0xb6
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 2, 208); // 0xd0
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 3,   1); // 0x01
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 4,   0); // 0x00
        MEM_WriteByte(oCNpc__Regenerate__Life_CallChangeAttribute_Start + 5,   0); // 0x00
    
        // push 1 -> push dword ptr [esi+1D4h]
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 0, 255); // 0xff
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 1, 182); // 0xb6
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 2, 212); // 0xd4
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 3,   1); // 0x01
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 4,   0); // 0x00
        MEM_WriteByte(oCNpc__Regenerate__Mana_CallChangeAttribute_Start + 5,   0); // 0x00

    It is a bit of spaghetti, but it works. It has not been tested through a full playthrough, but I couldn't find a situation where anything weird or unwanted would occur. It does have two pitfalls: the tick interval is global, and it is not possible to regenerate slower than 1 HP/mana per tick. This means there is a compromise to be made between how small you want the regen/degen values to be and how smooth you want them to be...
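    The compromise can be put into numbers with a short Python sketch. It assumes the patched behaviour described above (the attribute value is applied once per global interval); the function names are made up for illustration.

```python
def per_second(amount_per_tick: int, interval_ms: float) -> float:
    """Effective HP/mana change per second for a given attribute value."""
    return amount_per_tick * 1000.0 / interval_ms


def slowest_nonzero_rate(interval_ms: float) -> float:
    """Smallest possible non-zero rate: an attribute value of 1 per tick."""
    return per_second(1, interval_ms)


for interval in (1000.0, 200.0):
    print(interval, "ms ->", slowest_nonzero_rate(interval), "per second minimum")

print(per_second(-3, 200.0))  # a burn of 3 per tick at a 200 ms interval
```

    A shorter interval gives smoother ticks but raises the minimum rate for every NPC, since the interval is shared globally.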

    This is good enough for my needs though.

    Thank you all, I learned a lot working on this and with your help.

World of Gothic © by World of Gothic Team
Gothic, Gothic 2 & Gothic 3 are © by Piranha Bytes & Egmont Interactive & JoWooD Productions AG, all rights reserved worldwide