Commit Graph

106 Commits

Author SHA1 Message Date
Morph
295fc7d0f8 x64: cpu_wait: Implement MWAITX for non-MSVC compilers 2023-06-28 01:39:15 -04:00
Morph
2b68a3cbbf x64: cpu_wait: Remove magic values 2023-06-28 01:39:06 -04:00
Morph
3d868baaa4 x64: cpu_wait: Make use of MWAITX in MicroSleep
MWAITX is equivalent to UMWAIT on Intel's Alder Lake CPUs.
We can emulate TPAUSE by using MONITORX in conjunction with MWAITX to wait for 100K cycles.
2023-06-28 01:38:55 -04:00
Morph
4303ed614d x64: Add detection of monitorx instructions
monitorx introduces 2 instructions: MONITORX and MWAITX.
2023-06-28 01:36:06 -04:00
Morph
907507886d (wall, native)_clock: Add GetGPUTick
Allows us to directly calculate the GPU tick without double conversion to and from the host clock tick.
2023-06-07 21:44:42 -04:00
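A minimal sketch of why a direct conversion helps, assuming the "double conversion" passes through an intermediate unit such as nanoseconds (the commit message does not say which). `ScaleTicks`, `GpuTickViaNanoseconds`, `GpuTickDirect` and every frequency used with them are hypothetical names and values, not yuzu's API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rescale a tick count from one frequency to another.
// The 128-bit intermediate (a GCC/Clang extension) keeps ticks * to_hz from
// overflowing 64 bits.
inline std::uint64_t ScaleTicks(std::uint64_t ticks, std::uint64_t from_hz,
                                std::uint64_t to_hz) {
    assert(from_hz != 0);
    const unsigned __int128 scaled = static_cast<unsigned __int128>(ticks) * to_hz;
    return static_cast<std::uint64_t>(scaled / from_hz);
}

// Two-step path: host ticks -> nanoseconds -> GPU ticks. Each step truncates.
inline std::uint64_t GpuTickViaNanoseconds(std::uint64_t tsc, std::uint64_t tsc_hz,
                                           std::uint64_t gpu_hz) {
    const std::uint64_t ns = ScaleTicks(tsc, tsc_hz, 1'000'000'000ULL);
    return ScaleTicks(ns, 1'000'000'000ULL, gpu_hz);
}

// Direct path: a single multiply and divide, so only one truncation.
inline std::uint64_t GpuTickDirect(std::uint64_t tsc, std::uint64_t tsc_hz,
                                   std::uint64_t gpu_hz) {
    return ScaleTicks(tsc, tsc_hz, gpu_hz);
}
```

With a 3 GHz host counter and a 600 MHz target, 5 host ticks map to exactly 1 target tick on the direct path, while the two-step path truncates to 0.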
Morph
1492a65454 (wall, native)_clock: Rework NativeClock 2023-06-07 21:44:42 -04:00
Morph
dd12dd4c67 x64: Deduplicate RDTSC usage 2023-06-07 21:44:42 -04:00
Morph
981bc8aa1c x64: Simplify RDTSC on non-MSVC compilers
Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
2023-03-27 17:45:22 -04:00
Morph
27c33ab73f x64: Add MicroSleep
MicroSleep allows the processor to pause for a "short" amount of time (in the microsecond range). This is useful for spin-waiting that does not require nanosecond precision.
This uses the new TPAUSE instruction introduced on Intel's newest processors as part of the waitpkg instructions. For CPUs that do not support waitpkg instructions, this is equivalent to yield().

Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
2023-03-27 17:45:22 -04:00
Morph
d2cfe25b07 x64: cpu_detect: Add detection of waitpkg instructions
waitpkg introduces 3 instructions, UMONITOR, UMWAIT and TPAUSE.
2023-03-27 17:45:22 -04:00
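A sketch of the feature test, assuming the caller has already executed CPUID leaf 7, subleaf 0, and passes in the resulting ECX value. Intel documents WAITPKG as bit 5 of that register. `TestBit` and `HasWaitpkg` are hypothetical helpers, not the functions in cpu_detect.

```cpp
#include <cstdint>

// Hypothetical helper: test bit N of a 32-bit register value.
template <unsigned N>
constexpr bool TestBit(std::uint32_t value) {
    static_assert(N < 32, "bit index out of range");
    return ((value >> N) & 1u) != 0;
}

// CPUID.(EAX=07H, ECX=0):ECX bit 5 reports WAITPKG
// (UMONITOR, UMWAIT and TPAUSE).
constexpr bool HasWaitpkg(std::uint32_t leaf7_ecx) {
    return TestBit<5>(leaf7_ecx);
}

static_assert(HasWaitpkg(0x20u), "bit 5 set means waitpkg is available");
static_assert(!HasWaitpkg(0x10u), "bit 4 alone does not indicate waitpkg");
```

Keeping the check a pure function of the register value makes it testable without running CPUID on the build machine.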
Morph
d718eab351 native_clock: Wait for 10 seconds instead of 30
It was experimentally determined to be sufficient.
2023-03-07 21:17:46 -05:00
Morph
c27a626b5b native_clock: Use RealTimeClock instead of SteadyClock
We want to synchronize RDTSC to real time.
2023-03-07 21:17:46 -05:00
Morph
dcd13a7566 native_clock: Re-adjust the RDTSC frequency
The RDTSC frequency reported by CPUID is not accurate to its true frequency.
We will spawn a separate thread to calculate the true RDTSC frequency after a measurement period of 30 seconds has elapsed.
2023-03-07 21:17:46 -05:00
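The measurement itself reduces to counted ticks divided by elapsed time. The sketch below assumes the caller sampled the TSC and a wall clock at the start and end of the measurement period; `MeasuredTscHz` is a hypothetical name, and the thread that would gather the samples is omitted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: derive the TSC frequency in Hz from two TSC samples
// and the wall-clock time between them, in nanoseconds.
// Over a 30 second window, ticks * 1e9 exceeds 64 bits, so the intermediate
// is 128 bits wide (unsigned __int128 is a GCC/Clang extension).
inline std::uint64_t MeasuredTscHz(std::uint64_t tsc_begin, std::uint64_t tsc_end,
                                   std::uint64_t elapsed_ns) {
    assert(elapsed_ns != 0 && tsc_end >= tsc_begin);
    const unsigned __int128 ticks = tsc_end - tsc_begin;
    return static_cast<std::uint64_t>(ticks * 1'000'000'000ULL / elapsed_ns);
}
```

For example, 63,360,000,000 ticks observed over 30 seconds correspond to 2,112,000,000 Hz.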
Morph
376a414f5b native_clock: Round RDTSC frequency to the nearest 1000 2023-03-05 02:36:31 -05:00
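Rounding to the nearest multiple is a one-line integer operation. This is a sketch under the assumption that the frequency is an unsigned value in Hz; `RoundToNearest` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: round value to the nearest multiple, with ties
// rounding up. A multiple of 0 returns the value unchanged.
constexpr std::uint64_t RoundToNearest(std::uint64_t value, std::uint64_t multiple) {
    return multiple == 0 ? value : (value + multiple / 2) / multiple * multiple;
}

static_assert(RoundToNearest(2'111'999'600ULL, 1000) == 2'112'000'000ULL, "rounds up");
static_assert(RoundToNearest(2'112'000'499ULL, 1000) == 2'112'000'000ULL, "rounds down");
```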
Matías Locatti
69768ec71e Add CPU core count to log files 2022-11-11 23:50:48 -03:00
Maide
2e46110379 Revert Coretiming PRs 8531 and 7454 (#8591) 2022-07-27 19:47:06 -04:00
Andrea Pappacoda
cdb240f3d4 chore: make yuzu REUSE compliant
[REUSE] is a specification that aims at making file copyright
information consistent, so that it can be both human and machine
readable. It basically requires that all files have a header containing
copyright and licensing information. When this isn't possible, like
when dealing with binary assets, generated files or embedded third-party
dependencies, it is permitted to insert copyright information in the
`.reuse/dep5` file.

Oh, and it also requires that all the licenses used in the project are
present in the `LICENSES` folder, that's why the diff is so huge.
This can be done automatically with `reuse download --all`.

The `reuse` tool also contains a handy subcommand that analyzes the
project and tells whether or not the project is (still) compliant,
`reuse lint`.

Following REUSE has a few advantages over the current approach:

- Copyright information is easy to access for users / downstream
- Files like `dist/license.md` do not need to exist anymore, as
  `.reuse/dep5` is used instead
- `reuse lint` makes it easy to ensure that copyright information of
  files like binary assets / images is always accurate and up to date

To add copyright information of files that didn't have it I looked up
who committed what and when, for each file. As yuzu contributors do not
have to sign a CLA or similar I couldn't assume that copyright ownership
was of the "yuzu Emulator Project", so I used the name and/or email of
the commit author instead.

[REUSE]: https://reuse.software

Follow-up to 01cf05bc75
2022-07-27 12:53:49 +02:00
Marshall Mohror
e71d457af9 guard against div-by-zero 2022-07-06 13:00:00 -05:00
Marshall Mohror
b2ad4dd189 common/x64: Use TSC clock rate from CPUID when available
The current method used to estimate the TSC is fairly accurate - within a few kHz - but the exact value can be extracted from CPUID if available.
2022-07-06 12:42:01 -05:00
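A sketch of the calculation, assuming the value comes from CPUID leaf 0x15 (the commit message does not name the leaf). On Intel CPUs that leaf returns the TSC-to-crystal ratio denominator in EAX, the numerator in EBX and the crystal frequency in Hz in ECX. `TscLeaf` and `TscHzFromCpuid` are hypothetical names. Returning 0 for "not available" also covers the division-by-zero case that the follow-up commit guards against.

```cpp
#include <cstdint>

// Register values as returned by CPUID leaf 0x15 (assumed leaf).
struct TscLeaf {
    std::uint32_t denominator;  // EAX
    std::uint32_t numerator;    // EBX
    std::uint32_t crystal_hz;   // ECX
};

// Hypothetical helper: returns the TSC frequency in Hz, or 0 when the leaf
// does not enumerate it. The caller would then fall back to estimation.
inline std::uint64_t TscHzFromCpuid(const TscLeaf& leaf) {
    if (leaf.denominator == 0 || leaf.numerator == 0 || leaf.crystal_hz == 0) {
        return 0;
    }
    // Two 32-bit factors always fit in a 64-bit product.
    return std::uint64_t{leaf.crystal_hz} * leaf.numerator / leaf.denominator;
}
```

A 24 MHz crystal with a ratio of 176/2 gives 2,112,000,000 Hz.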
Fernando Sahmkow
3196d957b0 Address Feedback. 2022-06-30 10:18:56 +02:00
Fernando Sahmkow
2575a93dc6 Native clock: Use atomic ops as before. 2022-06-28 22:42:00 +02:00
Fernando Sahmkow
f5c1d7b8c8 Native Clock: remove inaccuracy mask. 2022-06-28 01:47:00 +02:00
Fernando Sahmkow
9cafb0d912 Core: Fix tests. 2022-06-28 01:10:55 +02:00
Fernando Sahmkow
096366ead5 Common: improve native clock. 2022-06-28 01:06:48 +02:00
Morph
99ceb03a1c general: Convert source file copyright comments over to SPDX
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
2022-04-23 05:55:32 -04:00
Merry
4052bfb4ad native_clock: Internal linkage for FencedRDTSC
__forceinline required on MSVC for function to be inlined
2022-04-03 22:38:12 +01:00
merry
fdd4d019ef native_clock: Use lfence with rdtsc 2022-04-03 22:38:10 +01:00
merry
979e53b87b native_clock: Use writeback from CAS to avoid double-loading 2022-04-02 22:22:48 +01:00
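The idea can be illustrated with `std::atomic`: when a compare-exchange fails, it writes the value it observed back into the `expected` argument, so a retry loop does not need to load again. This sketch uses a 64-bit counter and a hypothetical `PublishMax` function. It is not the 128-bit code in native_clock.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: store candidate if it is larger than the current
// value, and return whatever value ends up stored.
inline std::uint64_t PublishMax(std::atomic<std::uint64_t>& latest,
                                std::uint64_t candidate) {
    // The only explicit load; later iterations reuse the CAS writeback.
    std::uint64_t observed = latest.load(std::memory_order_relaxed);
    while (observed < candidate) {
        // On failure, compare_exchange_weak overwrites `observed` with the
        // value currently stored, so the loop condition sees fresh data.
        if (latest.compare_exchange_weak(observed, candidate,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
            return candidate;
        }
    }
    return observed;
}
```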
Merry
c562c1d6be native_clock: Use AtomicLoad128 2022-04-02 20:55:36 +01:00
ameerj
923decae5a common: Reduce unused includes 2022-03-19 15:01:31 -04:00
Wunkolo
d248c1203e cpu_detect: Add additional x86 flags and telemetry
Adds detection of additional CPU flags to cpu_detect and additions to telemetry output.

This is not exhaustive but guided by features that [dynarmic utilizes](bcfe377aaa/src/dynarmic/backend/x64/host_feature.h (L12-L33)) as well as features that are currently utilized but not reported to telemetry (invariant_tsc). This is intended to guide future optimizations.

AVX512 in particular is broken up into its individual subsets and some other processor features such as [sha](https://en.wikipedia.org/wiki/Intel_SHA_extensions) and [gfni](https://en.wikipedia.org/wiki/AVX-512#GFNI) are added to have some forward-facing data-points.

What used to be a single `CPU_Extension_x64_AVX512` telemetry field
is also broken up into individual `CPU_Extension_x64_AVX512{F,VL,CD,...}` fields.
2022-03-11 10:27:00 -08:00
Wunkolo
d9b1199ffb cpu_detect: Revert __cpuid{ex} array-type argument
Restores compatibility with MSVC's `__cpuid` intrinsic.
2022-03-09 19:50:01 -08:00
Wunkolo
873a9fa7e5 cpu_detect: Add missing lzcnt detection 2022-03-09 13:57:47 -08:00
Wunkolo
ec5f3351b6 cpu_detect: Refactor cpu/manufacturer identification
Set the zero-enum value to Unknown
Move the Manufacturer enum into the CPUCaps structure namespace
Add "ParseManufacturer" utility-function
Fix cpu/brand string buffer sizes(!)
2022-03-09 13:57:47 -08:00
Wunkolo
86e9e60f07 cpu_detect: Update array-types to span and array
Update some uses of `int` into some more explicitly sized types as well
2022-03-09 13:57:47 -08:00
Wunkolo
3c33ba7f18 cpu_detect: Utilize Bit<N> utility function 2022-03-09 13:57:47 -08:00
Wunkolo
d233de8194 cpu_detect: Compact capability fields
As this structure gets more explicit, bools can be bitfields and
small enums can use smaller types for their span of values.
2022-03-09 13:57:47 -08:00
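A sketch of the size effect, using hypothetical structures and field names in place of the real CPUCaps. Bitfield layout is implementation-defined, so the size comparison below reflects common compilers (GCC, Clang, MSVC) and is not a language guarantee.

```cpp
#include <cstdint>

// Before: default int-sized enum and one byte per flag.
enum class LooseVendor { Unknown, Intel, AMD };
struct LooseCaps {
    LooseVendor vendor;
    bool sse4_2, avx, avx2, bmi1, bmi2, fma, lzcnt, popcnt;
};

// After: the enum fits in one byte and each flag takes one bit.
enum class CompactVendor : std::uint8_t { Unknown, Intel, AMD };
struct CompactCaps {
    CompactVendor vendor;
    bool sse4_2 : 1;
    bool avx : 1;
    bool avx2 : 1;
    bool bmi1 : 1;
    bool bmi2 : 1;
    bool fma : 1;
    bool lzcnt : 1;
    bool popcnt : 1;
};

static_assert(sizeof(CompactCaps) < sizeof(LooseCaps),
              "compact layout should be smaller on mainstream compilers");
```

The flags are still read and written as ordinary `bool` members, so call sites do not change.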
Morph
4e766280c4 common: wall_clock: Utilize constants for ms, us, and ns ratios 2022-01-30 12:36:56 -05:00
Lioncash
f6a049337e common/xbyak_api: Make BuildRegSet() constexpr
This allows us to eliminate any static constructors that would have been
emitted due to the function not being constexpr.
2022-01-26 16:29:15 -05:00
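A sketch of the effect with a plain bit mask in place of the real register-set type. `BuildRegMask`, `kExampleRegs` and the register indices are hypothetical. Because the function is `constexpr`, the namespace-scope constant is computed at compile time and needs no dynamic initializer.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical stand-in for BuildRegSet(): fold register indices into a mask.
// Requires C++14 or later for the loop inside a constexpr function.
constexpr std::uint32_t BuildRegMask(std::initializer_list<unsigned> regs) {
    std::uint32_t mask = 0;
    for (const unsigned reg : regs) {
        mask |= std::uint32_t{1} << reg;
    }
    return mask;
}

// Constant-initialized: no static constructor runs at program start.
constexpr std::uint32_t kExampleRegs = BuildRegMask({0, 1, 2, 8, 9, 10, 11});
static_assert(kExampleRegs == 0x0F07u, "bits 0-2 and 8-11 set");
```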
Morph
4af413623b common/cpu_detect: Remove CPU family and model
We currently do not make use of these fields, remove them for now.
2021-12-13 20:45:18 -05:00
Morph
f919498f8f native_clock: Wait for less time in EstimateRDTSCFrequency
In my testing, waiting for 200ms provided the same level of precision as the previous implementation when estimating the RDTSC frequency.
This significantly improves the yuzu executable launch times since we reduced the wait time from 3 seconds to 200 milliseconds.
2021-12-03 19:55:59 -05:00
Morph
762b8ad448 general: Replace high_resolution_clock with steady_clock
On some OSes, high_resolution_clock is an alias to system_clock and is not monotonic in nature. Replace this with steady_clock.
2021-12-02 14:20:43 -05:00
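A short sketch of interval timing with `steady_clock`. `ElapsedNs` is a hypothetical helper. The standard guarantees `steady_clock::is_steady`, whereas `high_resolution_clock` may alias either clock.

```cpp
#include <chrono>
#include <cstdint>

static_assert(std::chrono::steady_clock::is_steady,
              "steady_clock never goes backwards");

// Hypothetical helper: time a callable against the monotonic clock and
// return the elapsed time in nanoseconds.
template <typename F>
std::int64_t ElapsedNs(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

Because the clock is monotonic, the result is never negative, even if the system time is adjusted while `work` runs.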
Merry
1770503185 xbyak: Update include path 2021-08-15 19:26:38 +01:00
bunnei
0a91599aec common: Merge uint128 to a single header file with inlines. 2021-02-15 14:46:04 -08:00
Fernando Sahmkow
53d92318b8 X86/NativeClock: Reimplement RDTSC access to be lock free. 2021-01-02 04:00:27 +01:00
Fernando Sahmkow
d4f871cb6a X86/NativeClock: Improve performance of clock calculations on hot path. 2021-01-02 00:43:47 +01:00
Lioncash
2c375013dd xbyak_abi: Shorten std::size_t to size_t
Makes for less reading.
2020-12-05 00:43:55 -05:00
Lioncash
b126267442 xbyak_abi: Avoid implicit sign conversions 2020-12-05 00:43:41 -05:00
Lioncash
1ea6bdef05 audio_core: Make shadowing and unused parameters errors
Moves the audio code closer to enabling warnings as errors in general.
2020-12-03 00:54:31 -05:00
Lioncash
4a4b685a04 common: Enable warnings as errors
Cleans up common so that we can enable warnings as errors.
2020-11-02 15:50:58 -05:00