
Core: Implement Device Mapping & GPU SMMU #12579

Merged
23 commits merged into yuzu-emu:master on Jan 22, 2024

Conversation

@FernandoS27 (Contributor) commented Jan 4, 2024

Memory in the Tegra X1 works very differently from how it's currently emulated. Normally there is physical memory (4 GB), a virtual memory address space used by processes (applications such as games), and device virtual memory spaces used by peripherals (GPU, DSP, Bluetooth, etc.). The unit that manages this device memory space is normally called the System Memory Management Unit (SMMU) or IOMMU. Through this address space, each device can map physical memory belonging to many different applications into a single space that only it can see. The advantage is that memory can be shared between each device and multiple processes in the OS.

This PR implements a general, simplified version of the SMMU and applies it to the GPU, with some differences: first, we use the full 34 bits of the address space instead of just 32 bits for pinning memory; second, the GPU uses a common SMMU page table for every process instead of switching it per channel.
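
As a rough illustration of that shared device space, here is a minimal C++ sketch of an SMMU-style page table: a 34-bit device address space, 4 KiB pages, and mappings from device pages to physical pages. The class name, the page size, and the flat hash-map layout are illustrative assumptions only, not yuzu's actual KDevicePageTable.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical sketch only; names and layout do not match yuzu's implementation.
class DevicePageTable {
public:
    static constexpr std::uint64_t PageBits = 12;          // assume 4 KiB pages
    static constexpr std::uint64_t PageSize = 1ULL << PageBits;
    static constexpr std::uint64_t AddressSpaceBits = 34;  // full 34-bit device address space
    static constexpr std::uint64_t AddressSpaceSize = 1ULL << AddressSpaceBits;

    // Pin a run of physical pages into the device address space.
    void Map(std::uint64_t device_addr, std::uint64_t phys_addr, std::uint64_t size) {
        for (std::uint64_t offset = 0; offset < size; offset += PageSize) {
            entries[(device_addr + offset) >> PageBits] = (phys_addr + offset) >> PageBits;
        }
    }

    // Translate a device virtual address to a physical address, if mapped.
    std::optional<std::uint64_t> Translate(std::uint64_t device_addr) const {
        const auto it = entries.find(device_addr >> PageBits);
        if (it == entries.end()) {
            return std::nullopt;
        }
        return (it->second << PageBits) | (device_addr & (PageSize - 1));
    }

private:
    // device page index -> physical page index; shared by every process using the device
    std::unordered_map<std::uint64_t, std::uint64_t> entries;
};
```

A production page table would likely use multi-level tables and reference counting for pinned ranges; the flat map here only keeps the idea visible.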

Currently in yuzu, the GPU handles memory with a model of GMMU (the GPU's MMU) -> PMMU (the main application's virtual address space) -> physical memory. This is great when only one application uses the GPU at a time and it is never shared with other processes, which is what happens on the Switch 95% of the time. However, it is possible to have multiple processes running and rendering concurrently, as is the case with overlays and applets like the inline software keyboard. It's also possible to have one application suspended while another is running (an applet running while the game is suspended). With the SMMU implemented, GPU resources can now be shared without any issues whatsoever.
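
To make the change in translation concrete, here is a hedged sketch (again with made-up names) of the new chain: each channel's GMMU resolves a GPU virtual address to a device virtual address, and the single shared SMMU resolves that to a physical address, which is why several processes can pin memory into the same device space at once.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

constexpr std::uint64_t kPageBits = 12;  // assume 4 KiB pages at both levels

// Hypothetical composition of the two translation levels; not yuzu's classes.
struct GpuAddressTranslator {
    std::unordered_map<std::uint64_t, std::uint64_t> gmmu;  // GPU VA page -> device VA page (per channel)
    std::unordered_map<std::uint64_t, std::uint64_t> smmu;  // device VA page -> physical page (shared)

    std::optional<std::uint64_t> Resolve(std::uint64_t gpu_va) const {
        const auto gmmu_it = gmmu.find(gpu_va >> kPageBits);
        if (gmmu_it == gmmu.end()) {
            return std::nullopt;  // not mapped in the GPU's own MMU
        }
        const auto smmu_it = smmu.find(gmmu_it->second);
        if (smmu_it == smmu.end()) {
            return std::nullopt;  // not pinned into the shared device space
        }
        const std::uint64_t offset = gpu_va & ((1ULL << kPageBits) - 1);
        return (smmu_it->second << kPageBits) | offset;
    }
};
```

Under the old model the middle level was instead the owning application's virtual address space, so the GPU's memory state was tied to a single process; with the shared device space that restriction goes away.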

Advantages of new SMMU:

  • Not perfect but more accurate than before.
  • Uses less memory overall for tracking GPU memory.
  • More optimizations possible in the future.
  • Multiprocess use of the emulated GPU.
  • Other devices can use the device mapper if needed, or it can be extended through a correct implementation of KDevicePageTable.
  • Total memory usage of the SMMU is about 80 MB including tracking; the tracking alone in the non-SMMU version cost 256 MB.

Disadvantages of the new SMMU:

  • Harder tracking of resources.
  • Complicated.
  • More accurate than before, but not perfectly accurate, in order to avoid sacrificing performance.

Current issues:

  • Pikmin 4 seems to get some vertex explosions after transitioning in several worlds. (Fixed)
  • There's some memory leaking after closing an application. (Unrelated, the leak is on master.)
  • It's not working with 6 GB/8 GB memory layouts. (Fixed)
  • Needs more cleanup.

@FernandoS27 marked this pull request as draft on January 4, 2024 at 19:16
Review threads on src/core/hle/service/nvdrv/core/nvmap.cpp and src/core/hle/service/nvdrv/devices/nvmap.cpp (outdated, resolved).

@FernandoS27 (Contributor, Author) replied:

> I hope to test this feature

This alone means almost nothing to users, but it will allow a LOT of cool stuff soon.

@FernandoS27 added the core-new, gpu-new, and gpu labels on Jan 7, 2024
@FernandoS27 marked this pull request as ready for review on January 7, 2024 at 04:05

@Slexer commented Jan 7, 2024

How much work is left for a complete UMA implementation? Is SMMU a big chunk of it or just a little bit?

@FernandoS27 (Contributor, Author) replied:

> How much work is left for a complete UMA implementation? Is SMMU a big chunk of it or just a little bit?

For a really good and accurate UMA implementation: a lot of work. The SMMU makes it easier to create the mirrors needed for DMI.

After that, we'll do DMI only for downloads, which will improve things a bit.

Doing full DMI will be a challenge if we want to keep or even increase the compatibility we have. It will take a lot of time. I can't give an ETA.

@brujo5 commented Jan 8, 2024

By DMI, do you mean direct memory import? If so, I remember that the Skyline developers said it only worked perfectly on a7xx; on a6xx you had to restart the phone or games would start to crash, and it was a hardware bug. Is that true?

@FernandoS27 (Contributor, Author) replied:

> By DMI, do you mean direct memory import? If so, I remember that the Skyline developers said it only worked perfectly on a7xx; on a6xx you had to restart the phone or games would start to crash, and it was a hardware bug. Is that true?

Yes. It's true.

@liamwhite added the mainline-merge label and removed the early-access-merge label on Jan 19, 2024
@liamwhite merged commit 8bd1047 into yuzu-emu:master on Jan 22, 2024
10 checks passed