Windows Task Manager does not report correct GPU usage in DX12 applications. Use something like MSI Afterburner to correctly monitor it under DX12.
Hey. Here they are; https://www.trainsimcommunity.com/m...ini-tweaks/i3889-tsw4-to-d4-lighting-overhaul
I have found that taking out some of the "r.Streaming..." lines makes for better performance without a sacrifice in quality, while I can increase some view / shadow / foliage distance settings without taxing my GPU much. Just an illustration of what you are saying. Ryzen 7 / RTX 3060
Good morning. For my part I am on an i5 12400, RTX 3060 (12 GB), 16 GB of RAM, with TSW4 installed on an M.2 SSD. The game runs at ultra quality in 1080p at a consistent 60 fps under DX11 out of the box (but with a lot of stutter, of course). So I took all your settings in "blue", and activated DX12 and the PSO cache as indicated. I should point out that I have no other optimization in the config .ini, only what is in blue. On average I lost 25% of headroom on my graphics card, but on the other hand the fluidity is exemplary!!

So I changed two quality settings to compensate for the 25% loss, because my FPS was dropping towards 55 with the graphics card at 100%. To get 60 fps with a slightly less loaded graphics card, I set it as follows: I am now on a mixed "ultra-high" quality. The sky is on "high" quality and the far view distance is on "high"; everything else is on ultra. The sky is the element that consumes the most!?!? I capped my fps at 60 and everything is fine: the graphics card is at around 80-90%, the processor at 18%, and I don't see any difference in graphics!!!

I can therefore confirm that with your settings and DX12 activated the game is much smoother; there are only a few micro stutters left due to the loading of the tiles (very, very, very minimal!!!). I'll stay tuned in case further improvements are possible. Thank you very much for your work.
You’re absolutely right. ‘Ultra’ sky quality is incredibly resource hungry for very little difference in quality. It’s very odd.
According to Will, the PSO cache should make no difference, as it's not used here. The Sky quality determines the voxel size and resolution the clouds are made of. This is very resource-hungry because, unlike the bitmaps in ToD3, they are real objects; they are not made of a polygon mesh and textures, but of small blocks, and the higher the quality, the smaller and more numerous those blocks become - correct me if I'm wrong here. I'm fine with medium cloud quality. All the rest is on Ultra with stable 75 FPS.
Yes, I saw right away that something was very greedy! After suspecting the shadows, I searched and found that, of course, the sky was really very, very resource-hungry. I just put it on medium and still gained a lot of headroom, for an almost non-existent loss of graphic quality, which allowed me to put everything else on ultra. So should I deactivate the PSO cache, or should I keep it activated anyway?
According to the devs, it has no function due to how TSW is set up - explained in Post #52 https://forums.dovetailgames.com/threads/please-optimize-tsw4-engine-ini.73687/page-2#post-722087
About the sky there are some things to consider, because it does indeed have an impact on performance. Sun light got an apparent improvement in TSW4 and is always using resources, even though we don't have ray tracing. Cloud resolution itself is not a particularly resource-hungry feature (no different from any other texture, I mean). However, clouds are similar to trees in that both cast shadows onto the terrain and nearby objects. So higher texture resolutions mean more shadow detail, and more trees also mean more shadows. That gets even worse as you increase view distance (more objects will receive light and shadows, and will also cast their own shadows onto nearby objects).

Indeed, I think those two elements (sun light and cloud shadows) are responsible for the remaining tile hitching, because it looks as if something gets updated globally whenever a tile is loaded. Maybe it's just a legacy feature which was left active. I tried to tune/deactivate some settings, even trying some of the variables that force updates per frame, but the issue is still there upon tile loading. The results of the tests I posted with those pictures above show that shadows and light may still be updated upon tile loading in some situations (especially when using savegames), not only in real time as the sun position changes. A global update of the whole scene's shadows could explain why threads are hanging, due to the amount of objects to be updated in a short time. However, the variables seem to produce no effect, so maybe it's just hardcoded, I don't know.

There's also a random-appearance factor that confuses me a lot. The pattern is clearly there, but it is not always reproduced in the same way. It may be caused by the random weather presets, as those can change the environmental conditions even for a given savegame. The hitching always happens at the same exact place, but the visual effect resulting from that new tile loading sometimes changes.
In cloudy weather, for instance, tile-loading hitching is less noticeable due to the smaller amount of cast cloud shadows, but there's still hitching anyway, so something still gets processed despite the custom settings (which force max details basically everywhere to mitigate that) and despite the reduced sun light conditions. It also happens during pitch-black night, which makes this even harder to understand. Cheers
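For anyone who wants to experiment with this themselves, the cloud- and shadow-related settings discussed here are plain [SystemSettings] entries in Engine.ini. The values below are only illustrative, copied from the config shared elsewhere in this thread, not recommendations for every system:

```ini
[SystemSettings]
; Volumetric cloud shadows - the suspected tile-loading cost
TimeOfDaySystem.VolumetricCloud.RayMarchedShadows=1
TimeofDaySystem.CloudShadowVolumetricResolutionScale=1
; Sun-shadow reach and view distance - both multiply the number of shadow casters
r.Shadow.DistanceScale=2
r.ViewDistanceScale=4
```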
Hello Geloxo, thank you so much for this information. My system specs are similar to yours and I am curious how these settings will change my TSW experience. One question: you mentioned some sort of Unreal tweaker which is needed, and a "Patreon version". Where would I find information about this? Is it needed to utilize your settings?
That's the Universal Unreal Engine Unlocker, and it's not needed to run those settings, only to test variable changes in real time while in game. My settings just go in Engine.ini and will be loaded by the game directly, with no further action needed. Cheers
I turned on Low Latency mode on nVidia settings, and there are fewer of the 1-2 second pauses. Shader-load stutter is still there. Ryzen 7 / RTX 3060
Hi everyone. I updated my config with the following main changes:

Forced use of the DirectX shader compiler instead of the default one (r.D3D.ForceDXC=1). Shaders created with this new compiler are only compatible with DX12, but they produce better results and also give better performance figures under DX12. Don't use this new compiler if you plan to use DX11 or switch between modes.

Disabled the shaders fast-math optimization (r.Shaders.FastMath=0). This generates more precise shaders (with fewer potential errors), which reduces some residual hitching after shaders are already generated. This optimization was originally intended as a performance saver on mobile devices and low-end systems, but it can indeed propagate errors into generated shaders, as it takes shortcuts in the floating-point arithmetic. This is another example of why Unreal's ancient optimizations should not be used at all.

My config still proposes the use of the DX12 PSO cache, in particular the user cache, which is managed by the GPU driver directly. It will still create some PSOs even if they are not originally shipped with the default game or fully set up in the current game version, so this feature can still be used until the game eventually implements full PSO support. In that case the game would load both PSO sets (native and user-created) and merge them in case any PSOs are missing, using the existing ones stored in the user cache to complete the missing ones.

Added shader pipeline cache pre-optimization (r.ShaderPipelineCache.PreOptimizeEnabled=1), which allows some PSOs to be generated before the game loads, reducing the need to create them on demand later on.
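Taken together, the shader-compiler changes described above boil down to these Engine.ini lines (the names and values all appear in the full config quoted later in this thread; remember that r.D3D.ForceDXC=1 is DX12-only):

```ini
[SystemSettings]
; DX12-only: use the modern DXC shader compiler
r.D3D.ForceDXC=1
; Disable fast-math to get more precise, less error-prone shaders
r.Shaders.FastMath=0
; DX12 PSO disk caches (driver-managed user cache)
D3D12.PSO.DiskCache=1
D3D12.PSO.DriverOptimizedDiskCache=1
; Pre-generate some PSOs before the game loads
r.ShaderPipelineCache.PreOptimizeEnabled=1
```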
Other changes:
- Shader pipeline variables adjusted to improve the first-time generation process
- Level streaming and garbage collector adjusted to prevent crashes on some specific routes, such as the Arosa line
- Some optional fixes included as a result of the configuration testing (see notes at the end of the post)
- Config entries sorted a bit for better classification of variables

Please check the post with colors and read the updated notes at the end of it. I may still fine-tune some values in the following days, as soon as I test this new set with more routes, to get the best standard set I can propose. Cheers
Thanks a lot. So far I´m having better results than before but it would be great if anyone else who also tested the new set could confirm the results to know if parameters are fine for everyone. Cheers
Note: I used the following settings. Steam TSW 4 launch options set to -dx12 -USEALLAVAILABLECORES -HIGH -threads 8. Used the following ini edit:

[SystemSettings]
ts2.dbg.JourneyChapterLockOverride=1
ts2.save.CheckpointsEnabled=1
ts2.dbg.SteamParticleSpawnMultiplier=2.0
TimeOfDaySystem.AutoExposure.SpeedUp=10
TimeOfDaySystem.AutoExposure.SpeedDown=5
r.SkeletalMeshLODBias=-3
r.StaticMeshLODDistanceScale=0.1
r.HLOD.DistanceOverrideScale=8
r.Streaming.LimitPoolSizeToVRAM=0
r.Streaming.PoolSize=7000
r.Streaming.MaxTempMemoryAllowed=3000
r.RenderTargetPoolMin=1200
r.Streaming.FramesForFullUpdate=2
r.Streaming.Boost=2
r.GTSyncType=1
r.OneFrameThreadLag=1
g.TimeToBlockOnRenderFence=0
r.RHICmdBalanceParallelLists=1
D3D12.MaximumFrameLatency=0
a.ForceParallelAnimUpdate=1
r.CreateShadersOnLoad=1
fx.ForceCompileOnLoad=1
niagara.CreateShadersOnLoad=1
r.ForceAllCoresForShaderCompiling=1
r.D3D.ForceDXC=1
r.Shaders.FastMath=0
D3D12.PSO.DiskCache=1
D3D12.PSO.DriverOptimizedDiskCache=1
r.ShaderPipelineCache.PreOptimizeEnabled=1
r.ShaderPipelineCache.BatchTime=2
r.ShaderPipelineCache.PrecompileBatchTime=2
r.ShaderPipelineCache.BackgroundBatchTime=0
r.ShaderPipelineCache.BatchSize=5
r.ShaderPipelineCache.PrecompileBatchSize=5
r.ShaderPipelineCache.BackgroundBatchSize=0
r.LightMaxDrawDistanceScale=10
r.MinScreenRadiusForLights=0.001
r.Shadow.WholeSceneShadowCacheMb=2000
s.AsyncLoadingTimeLimit=2
s.PriorityAsyncLoadingExtraTime=1
s.UnregisterComponentsTimeLimit=2
s.LevelStreamingActorsUpdateTimeLimit=2
s.PriorityLevelStreamingActorsUpdateExtraTime=1
s.LevelStreamingComponentsUnregistrationGranularity=10
s.LevelStreamingComponentsRegistrationGranularity=50
r.Streaming.NumStaticComponentsProcessedPerFrame=500
gc.TimeBetweenPurgingPendingKillObjects=900
gc.NumRetriesBeforeForcingGC=5
gc.MinDesiredObjectsPerSubTask=20
s.ContinuouslyIncrementalGCWhileLevelsPendingPurge=0
r.DFDistanceScale=8
r.DFFullResolution=0
r.DFFarTransitionScale=0.1
r.AmbientOcclusion.Method=1
r.AmbientOcclusionLevels=3
r.Shadow.FilterMethod=0
r.Shadow.DistanceScale=2
r.ShadowQuality=3
r.Shadow.CSM.MaxCascades=10
r.Shadow.MaxResolution=4096
r.Shadow.MaxCSMResolution=2048
r.Shadow.SpotLightTransitionScale=2048
r.Shadow.RadiusThreshold=0.005
r.Shadow.CSM.TransitionScale=2
r.Shadow.CSMShadowDistanceFadeoutMultiplier=0.1
r.GTAO.FalloffEnd=300
r.GTAO.SpatialFilter=0
r.GTAO.NumAngles=2
r.GTAO.UseNormals=1
r.GTAO.ThicknessBlend=0
r.AllowLandscapeShadows=1
r.DistanceFieldShadowing=1
r.ViewDistanceScale=4
r.LandscapeLOD0DistributionScale=4
foliage.DitheredLOD=4
foliage.LODDistanceScale=4
TimeOfDaySystem.VolumetricCloud.RayMarchedShadows=1
TimeOfDaySystem.AutoExposure.ExposureBias=-0.5
TimeOfDaySystem.SkyLightPollutionLuminance=1.0
TimeOfDaySystem.BloomIntensity=0.17
TimeofDaySystem.SunIntensity=50000
TimeofDaySystem.CloudShadowVolumetricResolutionScale=1
TimeofDaySystem.Clouds.HighAltitude.CloudDensityMult=0.2
TimeofDaySystem.VolumetricCloud.GroundContribution=1
TimeofDaySystem.VolumetricCloud.LayerHeightScale=1.8
TimeOfDaySystem.StarIntensity=50
TimeofDaySystem.LegacyEmissiveAdjustments.EmissiveMultNonLamp=550
r.VolumetricFog.DepthDistributionScale=32.000000
r.VolumetricFog.GridDivisor=120
r.VolumetricFog.GridPixelSize=8
r.VolumetricFog.GridSizeZ=96
r.VolumetricFog.HistoryMissSupersampleCount=1
r.VolumetricFog.HistoryWeight=0.900000
r.VolumetricFog.InjectShadowedLightsSeparately=1
r.VolumetricFog.InverseSquaredLightDistanceBiasScale=1.000000
r.VolumetricFog.Jitter=1
r.VolumetricFog.LightFunctionSupersampleScale=2.000000
r.VolumetricFog.TemporalReprojection=1
r.VolumetricFog.VoxelizationShowOnlyPassIndex=-1
r.VolumetricFog.VoxelizationSlicesPerGSPass=8
r.VolumetricLightmap.VisualizationMinScreenFraction=0.001000
r.VolumetricLightmap.VisualizationRadiusScale=0.010000

[Audio]
UnfocusedVolumeMultiplier=1.000000
MaxChannels=96

Everything went OK, but I noticed the game took very long to get past the small TSW 4 popup screen. I guess it is building up cache files before starting the game, too.
Some shaders are compiled during loading, yes, but they just take 1-2 seconds and are basically the ones needed by the game menus. The most relevant portion of the shader compiling needed for routes still happens on demand during gameplay, as DTG has not enabled the possibility of generating all shaders during route loading, for instance. That would be what we need, even if first-time route loading then took 15 minutes, as it would only be needed once. Some shaders are compiled on route loading anyway due to my tweaks, but that's still a very limited amount (basically the ones needed at the station where you start, and they also take only a few seconds to compile). DX12 user cache files can basically be generated without disturbing gameplay. Indeed, the major update happens when you exit the game.

Shaders, no matter if you are on DX11 or DX12, are the main reason for hitching during on-demand compilation, and the problem is that they need to be compiled every few seconds as you move. Compiling shaders for such big routes on demand is not optimal at all, because you basically need several days/weeks to build the complete shader set for your whole content collection, and the shaders are only loaded once you reload a route after they were generated for the first time or updated later on. You need to play several times on each route for that to happen, and those first runs are basically a nightmare.

There's something else in game loading that causes the delay you described. I also noticed this random delay in the first days after the latest patch, even while using default settings. Cheers
For me the first run after deleting all DX shaders took somewhat longer than expected; I thought the game had hanged. The second run was a normal startup. I did runs on SEHS and Cajon Pass, both first runs after the cache delete, and even then I hardly noticed stutter; performance was pretty good already.

In case you wonder, this is my system. Not the fastest anymore, since it is now about 5 years old, but I am still very happy with it:
Windows 11 Pro 64-bit
Cooler Master MasterCase H500M case
Asus ROG STRIX Z390-H GAMING motherboard
Intel Core i9-9900K
Corsair Vengeance LPX 32 GB DDR4
ASUS ROG-STRIX-RTX3080-O10G-WHITE 10 GB
PSU Seasonic Prime Ultra 1000 Titanium
Windows 11 64-bit => Samsung 970 EVO Plus 500 GB M.2 drive
Steam: client on drive E (Samsung 960 Pro 1 TB M.2 SSD)
Did another run on SEHS from Faversham to Rochester and averaged around 70 fps, which is very good for running it at ScreenPercentage=140.000000.
Thanks a lot. I'm also getting really good figures and high fps on the routes I have tested so far. There's still one last micro hitch (less than 0.1 seconds) that happens on the tile-loading event. I could measure it happening systematically every 20 s at max speed on Riesa-Dresden (so every 800 m, more or less) on the portion of the route with fewer details. It's clearly every 20 s, always, and it's not scenery loading nor trains, because I'm alone there in a random scenery area. I imagine something global gets updated upon each tile load, or just periodically. Disabling shadows completely does not remove it. I also included some sound tweaks in the next version, so sounds are apparently not the guilty ones. So either this is caused by the physics/collisions, or it could even simply be the sun-cycle position update. I noticed in the past that shadows cast by sun light were in some cases updated every 20 s too, and generally every 3-5 s. We will see... Cheers
I always notice, for instance with USA trains (it happens with European trains too, but those from the USA sure make a lot of noise, so it's easy to hear), that when driving with the wipers on you always hear them while driving on a straight line, but as soon as the rails curve the wiper sounds go away. A good example is the Slippery Descent scenario.
I found a problem with the original sound thread settings that I fixed today. Maybe that helps in your case too. The problem was the use of too many sound commands in one go. It resulted, for instance, in the German locomotives' high-pitch sound being cut on tile loading. I had been trying to solve that particular problem for a looong time indeed... Hopefully I will publish the new tweaks today. We will have even less hitching than before. We are closer to finding the complete solution. Cheers
Dear all, I included new modifications, involving sounds in this case. There was a problem with the default parameters that resulted in an overload of audio commands, which had a severe impact on overall performance, as Unreal can't manage so many sounds at once efficiently. Basically, the amount of sound batches (AudioThread.BatchAsyncBatchSize) was excessive for the audio thread, even though it runs async. On top of that, even sound loops from very distant objects (objects barely visible and sounds still far outside their max audible range) were checked periodically. One of the most noticeable in-game effects of this severe performance degradation was the cutting of the German locomotives' high-pitch sound upon new tile loading. Those values didn't produce better sounds, just worse performance, by computing something that could not even be audible at all.

I included the following modification in the performance section, as it was the main reason for the sound overload and performance decrease:

AudioThread.BatchAsyncBatchSize (default 125)

The following group is added as optional (marked now in pink) in the list, and some of them are also secondary performance tweaks:

au.Concurrency.MinVolumeScale (default 0.001) --> controls the fading distance for sounds (the default value is effectively its virtual maximum). I increased that range, but we only gain a very little extra distance. Most sounds will indeed still be faded at around 200-300 m in game for performance reasons, as Unreal can't play infinite sounds, so this tweak is not really needed.

au.SoundDistanceOptimizationLength (default 1) --> defines how long one-shot sounds (those played just once, even if the sound source is destroyed afterwards, e.g. via particle emitters) must be in order to be culled with distance. I increased this, and it could be even more. It basically makes no sense to optimize such a short sound: it will be played just once and will finish before you can even finish optimizing it, so it's better to let it play and terminate by itself.

au.VirtualLoops.PerfDistance (default 15000) --> extra distance added to a sound source's effective max audible distance, used to check whether ambient sound loops shall be resumed (when the player is approaching them) or stay muted. Those loops are cars, factories, people in stations, birds, general noises, etc. It makes no sense to use 15000 there. I limited it to a value lower than the standard map tile size, so that the engine saves some workload by checking only sounds within less than a tile-size distance, and at distances not much bigger than the sources' real max audible ranges, which is what matters in the end, because that's the true limiting factor we have. This variable is basically not very useful, as a moving player train won't ever get close enough to a sound to miss it if we use fast frequency checks instead (see next).

au.VirtualLoops.UpdateRate.Max (default 3) --> how frequently distant loop sounds are checked, to trigger them when the player is still outside their max audible distance but approaching.

au.VirtualLoops.UpdateRate.Min (default 0.1) --> how frequently sound loops are checked when the player is already inside their max audible distance.

After adding those sound fixes to my existing config (in particular the main sound performance fix) I get 5 ms latency on the threads (and even as low as 2 ms) in regular areas, which is virtually perfect. I was also able to increase performance from 80 fps to 90 fps, while still keeping it close to 80 fps at big stations like Dresden, so the sound problem was relevant, as you can see. The post with colors is updated. I also added an extra couple of changes there, which go in the GameUserSettings.ini file, not in Engine.ini, and allow tuning volumes above 100% (the limit in the user interface). This is just optional. Please read the notes at the end of the original post. Departing Dresden: Arriving Reading.
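For reference, these optional sound variables are ordinary [SystemSettings] entries in Engine.ini. The values below are just the engine defaults quoted above, shown as placeholders; the actual tweaked values are in the colored post:

```ini
[SystemSettings]
; Engine defaults quoted above - replace with the values from the colored post
AudioThread.BatchAsyncBatchSize=125
au.Concurrency.MinVolumeScale=0.001
au.SoundDistanceOptimizationLength=1
au.VirtualLoops.PerfDistance=15000
au.VirtualLoops.UpdateRate.Max=3
au.VirtualLoops.UpdateRate.Min=0.1
```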
Look at those 3 ms and 4 ms latencies: first-time shader generation test with Azuma. Check how we still get low latencies and high fps figures, even in such a demanding situation. The DirectX compiler tweaks show here why those changes were needed: Cheers
Got to say thanks for this. Using DX12 with a 4090 and a 57" Samsung Odyssey it looks so good, and I am now down to 65% usage on the GPU and 67% temps.
Just as another data point and because most in here are like "oh all this stuff is so obvious, should be a no-brainer for DTG" – enabling DX12 seriously tanks performance for me, both on my PC (GTX 980 + i7-4790) and Surface Book 2 (GTX 1060 + i7-8650U). With DX11 I can play with stable 30FPS in 3840x2160 on my Surface Book and 2560x1080 with 150% scaling on my PC, but enabling DX12 (with or without the DX Compiler and PSO Cache) worsens performance for me significantly. Still trying to figure out which of the individual settings might be worth it, but most ini tweaks seem to negatively affect the experience on my low-end hardware.
You can always choose your own settings, but keep in mind that most players will probably not be running the latest hardware, so I think DTG must be far more careful in selecting optimal settings than you can afford to be. In this post I see a long list of settings people claim to be an improvement. But they are settings you can choose, and it is likely not a good idea to apply them under all circumstances. I would like a good manual with a simple explanation of what makes sense in which circumstances, probably depending mainly on the graphics card you actually use.
Well yes, it would do. Both of those systems have very weak GPUs so enabling DX12 (which takes load off the CPU and puts the calculations on the GPU) means it is going to bottleneck immediately, particularly when you’re running it in 4K!
I have updated the post to classify each set as mandatory or optional. The minimum you need to include is the blue and black portions. With that alone you will see a clear difference. The rest is optional and involves changes in quality. In case you use those blocks, the settings need to be adjusted for each system depending on its capacity, following the guidelines I included at the end of my post to balance quality vs performance. It's expected to see a severe performance drop if you load my config (made for a 4090 card, using max details) on a system with a 980 or 1060 card. You need to adjust the quality configuration for your system. Cheers
OK, the problem with sounds was more relevant than I initially expected, and it turned out to have a really significant impact on performance. The main problem is that the engine is only able to use 32 sound channels at once, but more than 100 sounds can be playing at once in regular situations, for instance when two trains cross on a normal main-line tile. The engine has a sound-streaming feature (similar to the texture-streaming one) that allows it to handle such situations, but the default settings were not set correctly at all. Buffers were set to 4 MB, to start with, when just the player train's sounds together with some random nature sound loops can use around 50 MB RAM. In addition, all sounds were running on a single thread instead of using multiple worker threads, which resulted in the game thread being delayed until all sounds could be played or streamed. In many cases some sounds still needed to be decompressed in real time, at least partially, before playing them. That's a no-go. Basically, the simple update of existing sounds while moving inside the same tile was hammering the game thread, and therefore delaying the whole computing and simulation. I made the following changes:

Parallel sound threads activated via au.DisableParallelSourceProcessing (disabled in default settings) --> no comment... somebody deactivated this.

Increased the worker threads that decompress audio via s.IoDispatcherDecompressionWorkerCount (default: only 4 threads) --> helps with on-demand sound decompression.

Set a big enough cache for the sound dispatcher via s.IoDispatcherCacheSizeMB (default 0) --> streaming needs a cache, bro, so let it be big...

Increased dispatcher buffer memory via s.IoDispatcherBufferMemoryMB (default 4 MB) --> allows more sounds to be read per cycle. 4 MB is by far not enough, even in regular situations.

Reduced the dispatcher buffer size via s.IoDispatcherBufferSizeKB (default 256 KB) --> allows less than 1 ms latency when playing sounds. The lower the latency, the more CPU usage, but on the other hand the lower the chance of audio cuts. As CPU usage is greatly improved under DX12, we can go to the lower limits with this (small sizes are intended for real-time playback of quality audio anyway, so we could increase the buffer size a bit if needed).

The extra headroom we have released means less impact during other demanding tasks, such as tile loading and shader generation, resulting in an overall stability improvement and less hitching. I added those tweaks to the blue performance section, together with some extra enhancements like running animations and FX on parallel threads, which add their small contribution as well and should never have been forced to run single-threaded either (basically, it takes more time to wake up a new thread for a new short task than to run a small batch of short tasks on any existing one). I also tweaked some of the existing variables as follows:

Set a fixed-size texture-streaming cache via r.Streaming.UseFixedPoolSize. Although I initially discarded this change, the dynamic resizing of the cache during gameplay can have an impact even on my card.

Reduced the streaming cache and texture-streaming cache, as I have not seen the game really need so much VRAM, even at max details in dense areas. In most cases it stays below 10 GB VRAM, and that includes texture streaming as well as the non-streamable objects.

Increased view distance to 5. After the tweaks, increasing this variable has barely any impact, and it improves terrain rendering and overall rendering, among other things.

Original post updated. Please review it carefully due to the amount of changes and the placement of new variables inside existing blocks. First-time shader generation now results in amazingly low latencies, and therefore high fps, even on arrival at terminals or in heavy rain. See the example of Lübeck, with threads in the 5 ms range on the main line and at stations with several trains.
The GPU is still in the 10 ms range all the time, which is perfect, as we can then deliver 80-90 fps almost everywhere on the routes, no matter how dense station areas are or how many trains we have there.

There's still one issue I can't solve or understand. Even a route made of empty tiles (forced to be empty during tests by reducing all details and view distance to 0, so that only terrain, a few residual objects and one player-driven locomotive are visible) still produces hitching when the next tile is loaded. That hitching will therefore also be reproduced on any route, always at the same places and with the same short or long duration (depending on the point). Indeed, if you stop shortly before one of those points, or spawn on foot and walk there, you can see how some AI trains also reproduce a small hitch when they enter the next map tile at that point, or when they come from that tile into the one you are in. In that case you don't see any fps changes and rendering is smooth for you, but the AI train's movement simulation is briefly halted instead. Something is clearly stalling inside the engine. My settings helped to reduce this effect a lot, but it's still there. I monitored the garbage collector and it's not firing, thanks to my settings, so it's caused by something that is not the garbage collector, sounds, shadows, geometry, textures, shaders or lights. This simply shouldn't happen. The tile is empty and details are set to 0, so nothing is being rendered. If anyone has an explanation for this, please let me know. I've run out of ideas...

One thing I could imagine is the residual work of decompressing/reading the DLC pak files, in case they are not fully read or loaded at start due to one of those exotic performance-saving ideas. Another possibility is the handling of the asset classes themselves. And the last one I can imagine is some custom process implemented by DTG for the game (train dispatching, route/signal updates, timetables, weather, etc.).
As that portion of the code is not documented, I can't investigate it any further. If any of those things is running single-threaded (as sounds were doing, for instance) and handling too many instructions per frame cycle, that could explain why this effect is still reproduced in game and why a bottleneck still exists, even for the already-optimized threads, after using my settings. I think we need the help of the devs to clarify or eventually solve this, as they are the ones who can know what's running on the threads. Cheers
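To save readers hunting through the posts above, the sound-threading and dispatcher variables discussed earlier are again plain [SystemSettings] entries. The values below are only illustrative guesses showing the direction of each change (parallel processing on, bigger cache and buffer memory, smaller buffer size); the exact figures are in the colored post, not here:

```ini
[SystemSettings]
; 0 = enable parallel sound-source processing (defaults leave it disabled)
au.DisableParallelSourceProcessing=0
; More worker threads for on-demand audio decompression (default is 4) - example value
s.IoDispatcherDecompressionWorkerCount=8
; Give the dispatcher a real cache (default 0) and more buffer memory (default 4 MB) - example values
s.IoDispatcherCacheSizeMB=256
s.IoDispatcherBufferMemoryMB=64
; Smaller buffer size for sub-millisecond playback latency (default 256 KB) - example value
s.IoDispatcherBufferSizeKB=128
```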
Thanks, geloxo. Your posts are really informative and I am getting so much more out of TSW4 as a result.
But that is also true for newer GPUs. E.g. on my 4070 Ti I'm GPU-limited all the time, so moving from DX11 to DX12 and putting more load on the GPU should also decrease performance there, no? It has nothing to do with how powerful the GPU is; you would see the same results if you compared a 1080 Ti with a 3060, where the 3060 (a less powerful card) would benefit from DX12 and the 1080 Ti would not. It comes down to architecture, where the newer cards benefit more from DX12, plus Nvidia has been continuously rolling out DX12 driver optimizations for newer RTX cards. The 10- and 9-series have been on legacy support for some time now.
This is a good point – I'm still using a 2021 driver (newer ones aren't officially distributed by Microsoft/Surface), will try if a newer one improves perf. Not that optimistic as 10xx cards are legacy as you said, but might help a bit nonetheless
I doubt it will help, I had a GTX1080 until a month ago, and even with quite new drivers the performance under DX12 was worse than in DX11. The exact opposite of my current 4070Ti
DX12 does not always add extra load to the GPU; rather, it relieves the CPU by reducing the need for so many CPU-->GPU calls, because DX12 can process requests in batches. The amount of overall work is still the same (what the rendered scene needs, i.e. what the game is requesting your system to provide), but that work is performed more efficiently under DX12. Some of the extra work that exists on DX11 is simply removed under DX12. That's the main difference. Even if TSW4 is not yet fully optimized for DX12, the GPU driver is still able to provide optimizations on its own (use of the PSO user cache available in Unreal, for instance), which help during geometry calculations on the GPU side, and it will also use any optimizations it has available thanks to its native DX12 support (a newer shader model version, for instance).

Anyway, TSW4 is not that demanding. Just to give you a quick idea: rendering Dresden at rush hour with multiple trains, using my settings and resolution, typically needs less than 1 GB VRAM for texture streaming plus 4 GB VRAM used by non-streamable objects (that comes from the real debug information, so it's not just a random number I'm giving here). In addition to that, screen resolution, scaling and the rest add another 4-5 GB VRAM. In total the GPU is always using around 10 GB VRAM with such high graphical settings. The use of streaming frees a lot of memory (as fewer things are resident in memory) but adds the extra work of loading and unloading them very often. However, the workload per frame, even in a dense scene, is not that high, so even slower systems can still handle it, provided the graphical quality is balanced according to their capacity. Any system will therefore be working more often at medium or high power, and spending less time idle, because there's always something new to stream in or out of VRAM, and fewer things remain resident in memory for long periods of time.
Geometry also changes frequently, as LODs switch quite often, especially at shorter distances when using default settings, and new tiles are loaded roughly every 1 km (or every few seconds while the train is moving). The game will normally use just 8 GB of RAM most of the time, and can run on 5 GB of RAM in regular areas.

Flight Simulator, by comparison, can use 20 GB of VRAM and 16 GB of RAM in very dense areas and barely goes below 10 GB in regular ones, and it also uses streaming and DX11 or DX12. The high numbers there come from very high resolution textures, higher resolution terrain meshes, many more objects, and the fact that it typically renders detailed ground and sky out to 30 km (almost 10 times the max effective view distance in TSW4), with a virtually unlimited low-resolution distance beyond that point. Its physics simulation is also much more complex, as it includes weather dynamics and full airplane aerodynamics too. Results under DX12 can differ, but in general I have always seen better results under DX12 there, both with a 3080Ti and now with the 4090, using the same 12900KS CPU in both cases. When I first purchased the 3080Ti I had a 9900KS and results were also better at that time. In addition, Nvidia frame generation is available in that engine, which is not the case in Unreal. Cheers
Which sounds exactly fall under AmbienceSoundVolume=2.5 (it will increase ambient sounds to 250% volume)? I always thought it was only animals, people, cars and other external sounds, since you set it so loud.
That's basically it. It's everything that isn't a train sound. They are sound loops, repeating the same sounds constantly and automatically while you are inside their maximum audible range, and some of them include a degree of randomness. For instance, next to those orange excavators found on many routes you can sometimes hear a jackhammer as well as the excavator, or both can be silent. Some factories and industrial areas also contain a small mix of several sounds in the loop. Others are not loops but are fired once on demand, such as the platform announcement system or the clocks/bells on some church towers; they are triggered by specific events or at particular times of day.

The reason to set them so loud is that the default volumes were very soft, and they fade out completely at around 200 m anyway, so distant sounds are not heard at all with default settings. Even a very noisy factory was drowned out by any regular train engine sound. Tune them according to your own preferences; I prefer to notice ambient sounds. With the increased volumes you can now even hear the sound of women's heels as they walk along the platforms, for instance. There is excessive sound occlusion in the game anyway, as well as sounds that are too soft: moving the camera behind a wagon or a small wall can turn the whole world basically silent, even if there is a factory on the other side. Cheers
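For reference, the line being discussed looks like this in the config (the 2.5 value is the one quoted above; 1.0 would be the default scale):

```ini
; Scales non-train ambient audio: looping sounds (factories, excavators)
; and one-shot events (platform announcements, church bells).
; 1.0 = default volume; 2.5 = 250%, loud enough to survive the ~200 m fade-out.
AmbienceSoundVolume=2.5
```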
Is it a good idea to lock the fps to something like 63, just to stop my 4080 going to 100% all the time? Is there a performance difference between locked and unlimited?
There is no reason to lock your FPS other than wanting a smooth framerate (so you lock it to a value you can achieve 99% of the time). You want your GPU to go to 100%; you paid for all the performance, not just 80%.
Yes, locking fps can reduce GPU load. But don't worry about 100% GPU usage if you are getting high fps and the card is not overheating. Hardware is meant to be used and is designed for it; that's why you purchased that card, isn't it? Rendering at high resolution, too high a scaling setting in game, or high monitor refresh rates can also force your card to work harder, so it's not only a matter of the fps you are getting. You should pay attention to GPU usage if you can't produce high fps even with the system at full power; then a lower graphical quality or resolution/scaling is most likely needed, as the current combination of those factors may be too much for your system.

Locking fps is also good for avoiding sudden fps drops, which cause a momentary performance degradation. The reason is that most of the relevant work in the game is calculated per frame, so a drop to lower fps means threads can also resolve fewer operations per second. However, locking below your monitor refresh rate while using V-sync may result in very noticeable stuttering. If your system can produce 70 fps most of the time with fps unlocked, for instance, then you can lock to 60 fps, or run the monitor at 60 Hz with fps unlocked instead of at a higher refresh rate, as V-sync will cap fps to the monitor refresh rate anyway. That gives very smooth rendering whenever your system can still produce 60 fps or more, but you lose that smoothness when it drops below 60 fps. Even though I'm using G-sync with a 120 Hz monitor, which always produces a smooth image whatever the fps, I still lock fps to 80-90 in game. My system can't reach 120 fps but it can reach 100 fps, so I adjust quality to make sure I can hold 80-90 fps most of the time, which is easier than holding 100 fps, especially in big stations.
My system then has a small reserve it can draw on in very demanding situations that lower my fps, reducing the risk of a very sudden, giant fps drop. A drop from 100 fps to 50 fps in one second, or from 60 fps to 30 fps on systems using V-sync at 60 Hz, is really noticeable. A drop from 80 fps to 50 fps is still noticeable, but it gives the system a bit more time to adapt and reduces the probability of a bottleneck that pushes pending tasks into several of the next frames, which can result in micro-freezes for a few seconds until the system can produce higher fps again. Cheers
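As a sketch of the capping approach described above (the value is only my example; t.MaxFPS is Unreal's generic frame limiter, and the game's own limiter or driver-level caps can be used instead):

```ini
[SystemSettings]
; Example only: on a system that holds ~100 fps, cap at 85 to leave
; headroom for demanding scenes (big stations) and soften sudden drops.
t.MaxFPS=85
```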
I'm on a 4060 Ti (1080p, 150% scaling, ultra preset except clouds). DX12 gives me worse performance:
- stutters are rarer, but they have become more noticeable;
- FPS drops more in heavy zones like Doncaster (41 on DX11, 32 on DX12);
- FPS starts dropping in dense forest areas;
- I don't know why, but Antelope Valley runs extremely badly (stutters, low FPS) on DX12 and almost perfectly on DX11.
What settings are you using to compare the two modes? DX12 with the default Engine.ini is basically useless; most of the features DX12 takes advantage of are disabled in that config. Cheers
With DX11 I used an almost default ini file, with only this:

r.TemporalAACatmullRom=1
r.TemporalAASharpness=1.0
r.TemporalAASamples=4
r.TemporalAAFilterSize=1.0
r.TemporalAAPauseCorrect=1
r.Tonemapper.Quality=0
r.MaxAnisotropy=16

With DX12 I added the blue and black strings from your ini. My second run on ECML was really good compared to DX11. But then something went wrong with Antelope Valley: I ran services 3 times and performance was awful. I decided to test on ECML again and performance had become worse on that route too. So I rolled back to DX11 and everything returned to normal.
I can imagine what the problem is:

r.D3D.ForceDXC=1
r.Shaders.FastMath=0

As already written in my configuration post, the first line will create shaders that are only compatible with DX12. If you intend to switch between modes, it would be better not to use that line. You can try removing those two lines from my config, and it will then use the same DX11 shader method for DX12 as well.

Take into account that DX11 in game uses the old shader compiler and the fast-math workaround, intended for low-spec systems. This results in less precise shader computation, and that's why you can see better performance with DX11. But this approach is wrong in my opinion, as it generates low-precision shaders and can even produce computing errors or faulty shaders. In particular, DirectX (whether version 11 or 12) should not be used with that fast-math optimization because it is not compliant with the standards. That's still my own opinion; you can find others on the forums. That's why I included those fixes.

If you do switch between DirectX modes, it's best to delete the old shaders so you don't have a mixture of files generated by different compilers with different precisions. Let the game just regenerate them from scratch so you start from a clean state. This is what I did for my shader tests at least. Cheers
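To make the switch-friendly setup concrete (a sketch only, following the reasoning above): commenting out both lines keeps DX11 and DX12 on the same shader compiler and precision settings.

```ini
; If you switch between DX11 and DX12, leave these two out (or commented)
; so both modes share the same shader pipeline:
;r.D3D.ForceDXC=1      ; DXC compiler: produces DX12-only shaders
;r.Shaders.FastMath=0  ; 0 disables the imprecise fast-math shortcuts
```

Either way, delete the old shader cache afterwards so the game regenerates everything with one compiler.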
Hehe, I thought "hey, where are my DX shader files?", but NVIDIA moved that folder again. Since driver 545.84, the location for manually deleting NVIDIA cache files has moved to C:\Users\Your Name\AppData\LocalLow\NVIDIA\PerDriverVersion\DXCache