Re: JA2 1.13 Windows 8.1 -- Consistent Freezes[message #347684 is a reply to message #335812]
Sun, 04 December 2016 01:14
MaiorPain
Messages: 15
Registered: July 2015 Location: Russia
http://steamcommunity.com/games/215930/announcements/detail/583609044411508094
Today is the good day to...
So, the Steam devs have announced a patch that fixes launching on Windows 10 and 8 (including 8.1).
Maybe it can help us fix this long-running issue.
Patch date: 22.11.2016
Let's wait for a v1.13 fix that includes this patch.
P.S. I tested the Steam version; it works fine on 8.1.
[Updated on: Sun, 04 December 2016 01:16]
I'm sorry for my English. It's not my native language and I never studied it. Games and a translator were my teachers. Don't judge me too hard; I'm always glad to be corrected.
Re: JA2 1.13 Windows 8.1 -- Consistent Freezes[message #349417 is a reply to message #349338]
Thu, 06 April 2017 09:54
TrentL
Messages: 68
Registered: February 2015 Location: United States
Well trying this all out again after a couple years.
New hardware: Windows 10 Anniversary Update on a Surface Pro 4 (Core i7).
I've tried Wine, the registry fix, XP SP3 compatibility mode, and 16-bit color mode.
Playing full screen, I get freezes (no response, music still plays) randomly, but only when multiple sound effects are triggered.
If I turn sound down all the way for PC/NPC and music, no random lockups / freezes.
With sound on...
It is diabolically easy to replicate. With "annoying speech" at 40%, doing a SHIFT+CTRL+R after a battle to reload weapons will freeze the game. (7 members on squad)
Doing a mass move of merc squad (7 members) occasionally freezes the game.
20 militia with the command to "inspect militia" doesn't lock up the game BUT causes a game-ending issue where the mercs' mouths all continually wag nonstop, and the buttons on the strategic map no longer work. (Once you exit the sector back to the strategic map, it's game over: enter sector, options, resume time, etc. are all grayed out.)
The issue is definitely related to sfx.
Is there a way to track how many simultaneous sound effects are playing and cap it?
Seems to me that if enough SFX are triggered simultaneously it causes the issue, as the lockup ONLY occurs when sound effects are playing simultaneously with one another. Since music keeps playing and other things keep working, it almost seems like the code which fires the sound effects is crapping out the in-game ogg sound driver. The functions never return (stack overflow that isn't trapped?) and the game logic cannot continue.
If there were a setting externalized (e.g. "MAX_SIMULTANEOUS_SOUNDS") it would allow us to further test & diagnose.
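To make the idea concrete, here is a minimal sketch of what such a cap could look like. Everything in it is hypothetical: `SoundLimiter` and its methods are illustration names, not anything from the JA2 source, and the sketch assumes all calls come from one thread.

```cpp
#include <cstddef>

// Hypothetical limiter; none of these names exist in the JA2 code base.
// Assumes single-threaded use (no locking).
class SoundLimiter {
public:
    explicit SoundLimiter(std::size_t maxSimultaneous)
        : max_(maxSimultaneous), active_(0) {}

    // Call before starting a sample. Returns false when the cap is reached,
    // so the caller can skip the sound instead of blocking on it.
    bool tryAcquire() {
        if (active_ >= max_) return false;
        ++active_;
        return true;
    }

    // Call when a sample finishes or is stopped.
    void release() {
        if (active_ > 0) --active_;
    }

    std::size_t active() const { return active_; }

private:
    std::size_t max_;
    std::size_t active_;
};
```

The cap passed to the constructor would be the externalized setting; lowering it step by step would show whether the freeze really depends on the number of overlapping sounds.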
Another useful option would be (grasping at straws here) some ability to forcefully re-initialize the sound system; perhaps returning control back to the game processing loop.
(To be clear when the game "locks up" from this; screen redraw still works, music still works, scrolling around the map still works, but cannot click on anything, UI is totally nonresponsive other than border scroll, keyboard input is ignored. Whatever is breaking is causing only a partial failure where mouse and keyboard input is no longer accepted and the sounds which were supposed to play never play; e.g. when it locks up on CTRL+SHIFT+R after a battle or when I do "militia inspect" there's no sounds that actually play - whatever dies, dies during sound processing and then stops all input from working.)
Re: JA2 1.13 Windows 8.1 -- Consistent Freezes[message #349424 is a reply to message #349422]
Sat, 08 April 2017 03:26
The_Bob
Messages: 415
Registered: May 2009 Location: Behind you.
I've been trying to track down this issue. So far it appears to be caused by something weird going on in the timer control code.
The timer is based on the QueryPerformanceCounter() system function, which supplies a timestamp that should be reasonably independent of current clock speed, number of CPUs and such things. So once the timer rolls over to the next tick, it stores two things, the current timestamp and the calculated timestamp of the next clock tick.
I've found that whenever the game seems to freeze, that second timestamp usually contains some weird values, sometimes decades into the future. Whenever that happens, the timer simply waits for the real timestamp to meet that bogus value. So far I'm trying to rule out errors in calculating that value, although I'm slightly worried that it might be a case of random memory corruption.
I'm testing an experimental fix where I try to detect an incorrect value and reset the timer in case it seems off. While not 100% effective, it managed to prevent the game from freezing a few times and got it back on track after it froze for several minutes.
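A minimal sketch of that kind of check, assuming the timer keeps the two values described above. The struct, the function name and the "more than two intervals ahead" threshold are all assumptions for illustration; the real code in Timer Control.cpp may look quite different.

```cpp
#include <cstdint>

// Hypothetical state; field names are illustrative only.
struct TickState {
    std::int64_t lastTick;  // counter value when the last tick fired
    std::int64_t nextTick;  // counter value at which the next tick is due
};

// Returns true if the state looked bogus and was reset.
// `now` is the current counter reading (assumed non-negative) and
// `ticksPerInterval` the expected distance between two game ticks.
bool sanitizeTickState(TickState& s, std::int64_t now,
                       std::int64_t ticksPerInterval) {
    const bool wentBackwards = now < s.lastTick;
    const bool deadlineBeforeLastTick = s.nextTick < s.lastTick;
    // Unsigned subtraction is safe here because nextTick > now is checked first.
    const bool deadlineTooFar =
        s.nextTick > now &&
        static_cast<std::uint64_t>(s.nextTick) - static_cast<std::uint64_t>(now) >
            2 * static_cast<std::uint64_t>(ticksPerInterval);

    if (!(wentBackwards || deadlineBeforeLastTick || deadlineTooFar)) {
        return false;  // state looks plausible, leave it alone
    }
    // Re-base the timer on the current reading instead of waiting
    // for the counter to reach a deadline that may never come.
    s.lastTick = now;
    s.nextTick = now + ticksPerInterval;
    return true;
}
```

A check like this only treats the symptom: if the bogus value comes from memory corruption, the corruption itself is still there.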
Re: JA2 1.13 Windows 8.1 -- Consistent Freezes[message #349429 is a reply to message #349424]
Sat, 08 April 2017 12:16
TrentL
Messages: 68
Registered: February 2015 Location: United States
Ok that's actually kind of interesting. Since this only happens on multi-core (running single core fixes the issue but "breaks" other things such as AI movement) you might definitely be on to something.
Read this Bob;
http://www.virtualdub.org/blog/pivot/entry.php?id=106
There's other articles out there on this.
The issue is that, as good as it is for microsecond time differentials, the QueryPerformanceCounter() API function sometimes returns screwy data on multicore systems, including negative values (depending on what calculations you are using). Furthermore, the way timing was handled changed somewhat moving from XP into the Windows 7 era, and again for Win8/8.1/10.
How does your code accommodate time moving into the past, if you get a value preceding the current value?
If you detect a difference that's greater than X ms can you not do some bounds checking logic to resample or just force a new time interval?
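One simple way to handle readings that step backwards is to filter the raw counter so the value handed to the game never decreases. This is only a sketch; `MonotonicCounter` is a made-up name and the raw reading is passed in as a plain integer so the logic can be tried without calling QPC.

```cpp
#include <cstdint>

// Hypothetical wrapper: remembers the highest reading seen so far and
// returns that whenever a newer raw reading is lower.
class MonotonicCounter {
public:
    std::int64_t filter(std::int64_t raw) {
        if (!primed_ || raw > last_) {
            last_ = raw;
            primed_ = true;
        }
        return last_;  // never smaller than any earlier return value
    }

private:
    std::int64_t last_ = 0;
    bool primed_ = false;
};
```

With this in place a reading of 95 arriving after 100 is reported as 100, so any "time until next tick" calculation built on top can no longer go negative.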
Are you reading system clock speed (cpu frequency) with QueryPerformanceFrequency() to translate the counter to the approx microsecond value? (this requires kernel mode QPC/QPF() calls).
The system I am using is a dual-core i7 Surface tablet with 2 logical processors per core. It *also* frequently varies CPU clock speed based on load, as most systems do these days. If you are reading QPC() without taking the frequency into account, OR the frequency changes between the QPC/QPF calls (hey, stuff happens fast these days), it could cause you to get some really weird values back depending on how you're trying to calculate "the next tick" (if I'm inferring what you wrote properly). E.g. if the CPU suddenly went from a burst mode of 3.2 GHz to a power-saving mode of 1.2 GHz and the frequency wasn't taken into account, you could get a negative estimate (compounded with QPC's sometimes odd "moving back in time" thing, which causes it to spontaneously roll back its time a few ms here and there...)
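For reference, the usual counter-to-microseconds conversion can be sketched as below. As far as I can tell from the MSDN documentation, the value returned by QueryPerformanceFrequency() is fixed at system boot, so it should be enough to read it once at startup and cache it. The function name is an assumption, and the counter values are passed in as plain integers rather than read from the API.

```cpp
#include <cstdint>

// Converts a counter delta to microseconds.
// The naive form (ticks * 1000000 / frequency) overflows a 64-bit integer
// once ticks exceeds roughly 9.2e12, so split the value into whole seconds
// and a remainder first. Assumes frequency > 0 and ticks >= 0.
std::int64_t ticksToMicroseconds(std::int64_t ticks, std::int64_t frequency) {
    const std::int64_t wholeSeconds = ticks / frequency;
    const std::int64_t remainder = ticks % frequency;
    return wholeSeconds * 1000000 + (remainder * 1000000) / frequency;
}
```

At a 10 MHz counter frequency the naive multiplication would start overflowing after something like ten days of uptime, which is one way to end up with negative or wildly wrong timestamps without any help from the hardware.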
Does the application *need* nanosecond timing or would millisecond timing work? timeGetTime will be far more accurate (less buggy on multicore systems) but only gives ms timing.
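If millisecond timing is enough, the main thing to watch with timeGetTime() is that it returns a 32-bit millisecond count that wraps around after roughly 49.7 days. A sketch of wrap-safe interval handling, with made-up helper names and the readings passed in as plain integers:

```cpp
#include <cstdint>

// Milliseconds elapsed between two 32-bit readings. Unsigned subtraction
// is done modulo 2^32, so the result stays correct across one wraparound.
std::uint32_t elapsedMs(std::uint32_t earlier, std::uint32_t later) {
    return static_cast<std::uint32_t>(later - earlier);
}

// True once at least intervalMs have passed since lastTickMs.
bool tickDue(std::uint32_t lastTickMs, std::uint32_t nowMs,
             std::uint32_t intervalMs) {
    return elapsedMs(lastTickMs, nowMs) >= intervalMs;
}
```

Comparing elapsed time against an interval, rather than comparing "now" against a stored absolute deadline, avoids the situation where a bad deadline makes the timer wait indefinitely.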
http://stackoverflow.com/questions/1825720/c-high-precision-time-measurement-in-windows
"This is not just a theoretical problem; we ran into it with our application and had to conclude that the only reliable time source is timeGetTime which only has ms precision (which fortunately was sufficient in our case). We also tried fixating the thread affinity for our threads to guarantee that each thread always got a consistent value from QueryPerformanceCounter, this worked but it absolutely killed the performance in the application."
Sound familiar? Thread affinity ABSOLUTELY fixes the issue in JA but it screws up AI and other performance.
(I ran into similar timing issues in a network monitoring system I wrote once; System.Diagnostics.Stopwatch uses QPC() calls and occasionally I'd get ping responses that were oddly "wrong". E.g. the response packet arrived at a time "previous" to when I actually sent the network packet out on very low latency networks. "How did I just get a -1ms ping?")
https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx
This MSDN article covers some interesting differences between the various OSes and can shed some light on why this might be the underlying issue in JA2:
Windows 8, Windows 8.1, Windows Server 2012, and Windows Server 2012 R2
Windows 8, Windows 8.1, Windows Server 2012, and Windows Server 2012 R2 use TSCs as the basis for the performance counter. The TSC synchronization algorithm was significantly improved to better accommodate large systems with many processors. In addition, support for the new precise time-of-day API was added, which enables acquiring precise wall clock time stamps from the operating system. For more info, see GetSystemTimePreciseAsFileTime. On Windows RT PC platforms, the performance counter is based on either a proprietary platform counter or the system counter provided by the Windows RT PC Generic Timer if the platform is so equipped.
Now what they aren't telling you here is that different CPU cores are going to give different clock rates (naturally) and that all cores may not be running the same exact frequency (naturally). BUT, they need applications to give very precise timing which is consistent across all cores. So whatever mechanism they are using has to account for per core drift.
Personally I think their implementation is buggy - Microsoft says it's foolproof but I have seen it screw up firsthand (and there are a lot of other articles out there about this from people who have delved into it). On Server 2012R2+ if you do a ping -t on a low latency, totally clean, isolated lab network, you'll see about 1 out of 3,000 packets give "request timed out" - often a single 1 second instance, other times in groups of 2 or 3 seconds. The system timers temporarily flip out, giving future times which exceed the 1000ms default wait time of the ping command. I have also witnessed this on multicore systems just reading the QPC/QPF counters as you mentioned. This is more deviation than you would expect.
I suspect this is due to some internal mechanism of 'normalizing' the ticks across multiple CPUs.
Now Windows XP SP2 used CPU TSC timer by default but could be forced to use PM timer.
Windows XP SP3 forced TSC timer. (We know JA2 runs good on this)
Windows 7+ uses HPET or some other CPU timer (which I'm trying to track down).
I'm trying to find a way to force the use of TSC instead of HPET. Currently it looks like you have to read the __rdtsc() intrinsic and work directly off of that. (That will force use of the TSC, but there could be threading issues.) That function returns the number of ticks since the last processor reset. It should (importantly) never run negative and also (importantly) never return some bogus arbitrary value. It can (only) be used to track ticks between checks. (It has no bearing on processor frequency, etc.)
Another symptom that points to this is when a machine hibernates/ sleeps (if I walk away from my Surface to grab coffee, take a leak, etc, come back and sign on) JA2 locks up hard most of the time.
From MSDN:
QueryPerformanceCounter reads the performance counter and returns the total number of ticks that have occurred since the Windows operating system was started, including the time when the machine was in a sleep state such as standby, hibernate, or connected standby.
Now if you are comparing values from last read to current and suddenly there's a reaaaaly big jump (say 20 minutes) between the last read and the read NOW.. you could very well find yourself waiting a very long time for the program to get to the next "tick" (if I am inferring from your post that the system determines or tries to determine when the next "tick" will be?)
You need to accommodate that.. "how much time do I have before I see the next tick" is going to be damn tricky using the QPC.
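One way to accommodate it is to work from the elapsed delta and cap how many ticks get processed after a big jump, instead of waiting for (or replaying) all the time that passed during sleep. This is a sketch under assumptions: the names and the catch-up cap are invented for illustration, and counter readings are passed in as plain integers.

```cpp
#include <cstdint>

struct TickAdvance {
    std::int64_t ticksToRun;   // game ticks to process right now
    std::int64_t newLastTick;  // counter value to store as "last tick"
};

// lastTick/now are counter readings, interval is the counter distance
// between two game ticks (> 0), maxCatchUp caps ticks per call.
TickAdvance advanceTicks(std::int64_t lastTick, std::int64_t now,
                         std::int64_t interval, std::int64_t maxCatchUp) {
    if (now < lastTick) {
        // Counter stepped backwards: re-base rather than wait for it
        // to climb back up to the old value.
        return {0, now};
    }
    const std::int64_t due = (now - lastTick) / interval;
    if (due > maxCatchUp) {
        // Huge jump (e.g. after hibernate): run a few ticks, drop the rest.
        return {maxCatchUp, now};
    }
    // Normal case: keep the fractional remainder by advancing in whole intervals.
    return {due, lastTick + due * interval};
}
```

With a cap of 5, a 20 minute gap results in five ticks and a fresh baseline rather than a long wait or a burst of thousands of ticks.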
Meanwhile, reading the TSC counter should NOT be affected by this since the CPU stops making "ticks" while it's asleep.
Can you post the timing code and show how this is being done?
"So once the timer rolls over to the next tick, it stores two things, the current timestamp and the calculated timestamp of the next clock tick."
I'll check back tomorrow.. will try to help however I can.
[Updated on: Sat, 08 April 2017 12:20]
Re: JA2 1.13 Windows 8.1 -- Consistent Freezes[message #349434 is a reply to message #335812]
Sat, 08 April 2017 14:09
The_Bob
Messages: 415
Registered: May 2009 Location: Behind you.
In my latest attempt to fix the timer I did two things:
- Ensured all variable types were correct, so there would be no overflows or signed/unsigned conversion issues
- Added a sanity check in the function responsible for calculating the timestamp for the next tick
The timer code is in this file: https://ja2svn.mooo.com/source/ja2/trunk/GameSource/ja2_v1.13/Build/Utils/Timer%20Control.cpp
There are three essential JA2 threads - the main thread, the notify thread and the timer thread. Also there's a bunch of auxiliary threads for audio and stuff that don't seem to cause trouble.
Hopefully this will fix the issue for a while. If not, I have a few more ideas on how to fix this, or work around the problem. While the timer thread may be frozen, waiting for an hour after the clock went backwards, the main thread is still working the Windows message loop. It shouldn't be too hard to use it to wake up or restart the clock thread if it's not responding for too long, or add a key that allows the user to reset it once things start going off the rails.
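The wake-up idea could be sketched as a heartbeat that the timer thread updates and the main thread checks from its message loop. All names here are hypothetical, and the sketch assumes the millisecond timestamps come from a source independent of the suspect counter (GetTickCount64() or timeGetTime(), for example); they are passed in as parameters so the logic can be exercised on its own.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical heartbeat shared between the timer thread and the main thread.
class TimerWatchdog {
public:
    explicit TimerWatchdog(std::int64_t timeoutMs)
        : timeoutMs_(timeoutMs), lastBeatMs_(0) {}

    // Timer thread: call once per tick with the current time in ms.
    void beat(std::int64_t nowMs) {
        lastBeatMs_.store(nowMs, std::memory_order_relaxed);
    }

    // Main thread: call from the message loop. True means the timer thread
    // has been silent for longer than the timeout and should be reset.
    bool isStalled(std::int64_t nowMs) const {
        return nowMs - lastBeatMs_.load(std::memory_order_relaxed) > timeoutMs_;
    }

private:
    const std::int64_t timeoutMs_;
    std::atomic<std::int64_t> lastBeatMs_;
};
```

What "reset" means in practice (recomputing the next tick, signalling an event, or restarting the thread) depends on how the timer thread waits, which this sketch does not cover.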
I'm guessing that installing different runtimes and changing compatibility modes altered how the QPC() function behaved, which is why it helped in some cases. I rather doubt CPU pinning helped, at least for Win8/Win10; at most it delayed the freeze by slowing down the timer.
Re: JA2 1.13 Windows 8.1 -- Consistent Freezes[message #349479 is a reply to message #349475]
Mon, 10 April 2017 21:06
TrentL
Messages: 68
Registered: February 2015 Location: United States
Definitely good so far.
Bob, I'm going to go ahead and extensively playtest this: I'll finish out my campaign over the next couple of weeks on the 8396 version you posted above. I'm not very far into it, so it should give a pretty lengthy test. That way you've got a long-term test before committing it.
Playing without crashes has me positively giddy. I was getting so frustrated and about to toss it back on the shelf for another year lol.
[Updated on: Mon, 10 April 2017 21:06]
Re: JA2 1.13 Windows 8.1 -- Consistent Freezes[message #349499 is a reply to message #349497]
Wed, 12 April 2017 05:41
TrentL
Messages: 68
Registered: February 2015 Location: United States
Ok, had an interesting non-crash, Bob. I was assaulting the HQ sector in Alma and at one point in the fight, everything went into hyperdrive.
I couldn't *do* anything with my guys, everything was flashing super fast. I could change mercs, aim, etc, but couldn't actually fire a weapon (although I could do everything else aiming wise.)
I went back to the strategic map OK, and it was flashing uber fast as well. I saved the game, reloaded, and things were back to normal. Never had to leave the game.
So not exactly a crash - as I could save and load and everything was OK after a reload, without leaving the game - but definitely something I'd never seen JA2 do before.
I don't know what precisely caused that, I was just aiming. It *almost* felt like the fast-forward AI turn never "finished" (or didn't slow back down to normal speed, when it was over), since everything was still going like it was on fast forward.
Better than crashing for sure!
But odd!
That was ~4 hours into a game without any issues.
I was able to recover just fine and keep playing, only left the game to come write a report about it before I forgot.