Commit graph

3029 commits

Author SHA1 Message Date
Tony Gutierrez
d327cdba07 gpu-compute: add gpu_isa.hh to switch hdrs, add GPUISA to WF
the GPUISA class is meant to encapsulate any ISA-specific behavior - special
register accesses, isa-specific WF/kernel state, etc. - in a generic enough
way so that it may be used in ISA-agnostic code.

gpu-compute: use the GPUISA object to advance the PC

the GPU model treats the PC as a pointer to individual instruction objects -
which are store in a contiguous array - and not a byte address to be fetched
from the real memory system. this is ok for HSAIL because all instructions
are considered by the model to be the same size.

in machine ISA, however, instructions may be 32b or 64b, and branches are
calculated by advancing the PC by the number of words (4 byte chunks) it
needs to advance in the real instruction stream. because of this there is
a mismatch between the PC we use to index into the instruction array, and
the actual byte address PC the ISA expects. here we move the PC advance
calculation to the ISA so that differences in the instrucion sizes may be
accounted for in generic way.
2016-10-26 22:47:38 -04:00
Tony Gutierrez
c7a79c9a42 gpu-compute, hsail: call discardFetch() from the WF
because every taken branch causes fetch to be discarded, we move the call
to the WF to avoid to have to call it from each and every branch instruction
type.
2016-10-26 22:47:27 -04:00
Tony Gutierrez
00a6346c91 hsail, gpu-compute: remove doGm/SmReturn add completeAcc
we are removing doGmReturn from the GM pipe, and adding completeAcc()
implementations for the HSAIL mem ops. the behavior in doGmReturn is
dependent on HSAIL and HSAIL mem ops, however the completion phase
of memory ops in machine ISA can be very different, even amongst individual
machine ISA mem ops. so we remove this functionality from the pipeline and
allow it to be implemented by the individual instructions.
2016-10-26 22:47:19 -04:00
Tony Gutierrez
7ac38849ab gpu-compute: remove inst enums and use bit flag for attributes
this patch removes the GPUStaticInst enums that were defined in GPU.py.
instead, a simple set of attribute flags that can be set in the base
instruction class are used. this will help unify the attributes of HSAIL
and machine ISA instructions within the model itself.

because the static instrution now carries the attributes, a GPUDynInst
must carry a pointer to a valid GPUStaticInst so a new static kernel launch
instruction is added, which carries the attributes needed to perform a
the kernel launch.
2016-10-26 22:47:11 -04:00
Tony Gutierrez
e1ad8035a3 gpu-compute: move disassemle() implementation to GPUStaticInst 2016-10-26 22:47:05 -04:00
Tony Gutierrez
0a6cdff176 gpu-compute, arch: add some methods to the base inst classes for ISA support 2016-10-26 22:47:01 -04:00
Fernando Endo
6c72c35519 cpu, arm: Distinguish Float* and SimdFloat*, create FloatMem* opClass
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to
Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which
distinguishes writes to the INT and FP register banks.
Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72,
where the "latency" of FMADD is 3 if the next instruction is a FMADD and
has only the augend to destination dependency, otherwise it's 7 cycles.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-10-15 14:58:45 -05:00
Mitch Hayenga
bd0c2d5b0b isa,arm: Add missing AArch32 FP instructions
This commit adds missing non-predicated, scalar floating point
instructions.  Specifically VRINT* floating point integer rounding
instructions and VSEL* floating point conditional selects.

Change-Id: I23cbd1389f151389ac8beb28a7d18d5f93d000e7
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-10-13 19:22:10 +01:00
Alexandru Dutu
3f0118876f kvm: Adding details to kvm page fault in x86
Adding details, e.g. rip, rsp etc. to the kvm pagefault exit when in SE mode.
2016-10-04 13:06:05 -04:00
Alexandru Dutu
68127ca3da hsail: Fix disassembly of load instruction with 3 destination operands 2016-09-16 12:36:20 -04:00
Alexandru Dutu
e9b14d5111 gpu-compute: Refactoring Wavefront::dynWaveId 2016-09-16 12:31:46 -04:00
Alexandru Dutu
589e13a23b gpu-compute: Wavefront refactoring
Renaming members of the Wavefront class in accordance with the style guide.
2016-09-16 12:26:52 -04:00
Ricardo Alves
e5c1488cb6 arm: Add m5_fail support for aarch64
Change-Id: Id2acbc09772be310a0eb9e33295afab07e08a4fa
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-09-15 18:21:24 +01:00
Michael LeBeane
2c43a21687 x86: Force strict ordering for memory mapped m5ops
Normal MMAPPED_IPR requests are allowed to execute speculatively under the
assumption that they have no side effects.  The special case of m5ops that are
treated like MMAPPED_IPR should not be allowed to execute speculatively, since
they can have side-effects.  Adding the STRICT_ORDER flag to these requests
blocks execution until the associated instruction hits the ROB head.
2016-09-13 23:18:34 -04:00
Nikos Nikoleris
698767e538 cpu, arch: fix the type used for the request flags
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-15 12:00:35 +01:00
Tony Gutierrez
fa5e64987e sim: fix issues with pwrite(); don't enable fstatfs
this patch fixes issues with changeset 11593

use the host's pwrite() syscall for pwrite64Func(),
as opposed to pwrite64(), because pwrite64() does
not work well on all distros.

undo the enabling of fstatfs, as we will add this
in a separate pate.
2016-08-05 17:15:19 -04:00
Tony Gutierrez
0b68475b10 x86, sim: add some syscalls to X86
this patch adds an implementation for the pwrite64 syscall and
enables it for x86_64, and enables fstatfs for x86_64.
2016-08-04 12:32:21 -04:00
Curtis Dunham
8b3434a4f2 arm: refactor page table walking
Introduce and use a lookup table.

Using fetchDescriptor() rather than DMA cleanly handles nested paging.

Change-Id: I69ec762f176bd752ba1040890e731826b58d15a6
2016-08-02 10:38:03 +01:00
Dylan Johnson
09218ed397 arm: warn not fail on use of missing miscreg CNTHCTL_EL2
During host bootup, KVM reads/writes to CNTHCTL_EL2. Because this
miscreg has not been implemented, the simulation would end there. This
patch causes the simulation to warn about the read/write instead of fail.

Change-Id: If034bfd0818a9a5e50c5fe86609e945258c96fa3
2016-08-02 10:38:03 +01:00
Dylan Johnson
c15711725d arm: Check TLB stage 2 permissions in AArch64
This fixes a bug where stage 2 lookups used the AArch32
permissions rules even if we were executing in AArch64 mode.

Change-Id: Ia40758f0599667ca7ca15268bd3bf051342c24c1
2016-08-02 10:38:03 +01:00
Dylan Johnson
bce923c189 arm: correctly assign faulting IPA's to HPFAR_EL2
This patch corrects IPA reporting if the translation faults in a
stage 2 lookup.

Change-Id: I0b914527f8a9f98a5e980a131cf9d03e5584b4e9
2016-08-02 10:38:03 +01:00
Dylan Johnson
4d5d47c173 arm: Add TLBI instruction for stage 2 IPA's
This patch adds support for stage 2 TLBI instructions
such as TLBI IPAS2E1_Xt.

Change-Id: I0cd5e8055b0c1003e03439aa5183252f50ea0a88
2016-08-02 10:38:03 +01:00
Dylan Johnson
89511856fe arm: Fix stage 2 memory attribute checking in AArch64
Change-Id: I14c93a5460550051a12129e792a9a9bd522a145c
2016-08-02 10:38:03 +01:00
Dylan Johnson
02fcca9b6f arm: Fix trapping to Hypervisor during MSR/MRS read/write
This patch restricts trapping to hypervisor only if we are in the
correct exception level for the trap to happen.

Change-Id: I0a382b6a572ef835ea36d2702b8a81b633bd3df0
2016-08-02 10:38:03 +01:00
Dylan Johnson
c2271e301d arm: Fix secure state checking in various places
Faults that could potentially be routed to the hypervisor checked
whether or not they were in a secure state without checking if security
was enabled or not. This caused faults not to be routed correctly. This
patch causes secure state checking to first ask if security is enabled.

Change-Id: I179e9b181b27f552734c9bab2b18d05ac579a119
2016-08-02 10:38:02 +01:00
Dylan Johnson
996c1ed33c arm: Fix stage 2 determination in table walker
We recompute if we are doing a stage 2 walk inside of the table walker
but we have already figured it out in the tlb. Pass the information in
to the walk instead of recomputing it.

Change-Id: I39637ce99309b2ddbc30344d45ac9ebf6a203401
2016-08-02 10:38:02 +01:00
Dylan Johnson
eac27759e7 arm: Refactor aarch64 table walk logic to remove redundancy
The functional case is already handled within the fetchDescriptor()
function. We can thus use that function for both atomic and functional
mode when we start the table walk.

Change-Id: Iacaed28cd9024d259fd37a58150efd00ff94d86e
2016-08-02 10:38:02 +01:00
Dylan Johnson
f9a6f68e0b arm: Add check to fault routing for hypervisor/virtualization
This patch adds the option for faults to be routed to the hypervisor
using the pre-existing routeToHyp() functions that are present in each
fault type.

Change-Id: I9735512c094457636b9870456a5be5432288e004
2016-08-02 10:38:02 +01:00
Dylan Johnson
fc6879097b arm: Fix EL perceived at TLB for address translation instructions
During address translation instructions (such as AT S1E1R_Xt) the exception
level can be different than the current exception level. This patch fixes
how the TLB determines what EL to use during these instructions.

Change-Id: Ia9ce229404de9e284bc1f7479fd2c580efd55f8f
2016-08-02 10:38:02 +01:00
Dylan Johnson
2950a95672 arm: Add AArch64 hypervisor call instruction 'hvc'
This patch adds the AArch64 instruction hvc which raises an exception
from EL1 into EL2. The host OS uses this instruction to world switch
into the guest.

Change-Id: I930ee43f4f0abd4b35a68eb2a72e44e3ea6570be
2016-08-02 10:38:02 +01:00
Dylan Johnson
c53a57f74f arm: add stage2 translation support
Change-Id: I8f7c09c7ec3a97149ebebf4b21471b244e6cecc1
2016-08-02 10:38:02 +01:00
Curtis Dunham
49538a7118 arm: enable EL2 support
Change-Id: I59fa4fae98c33d9e5c2185382e1411911d27d341
2016-08-02 10:38:01 +01:00
Dylan Johnson
4fbf40daab arm: invalidate TLB miscreg cache on modification of HSCTLR
Change-Id: I5212c91c56435fe008950ed99feacc6921609226
2016-08-02 10:38:01 +01:00
Dylan Johnson
e727a0eeaa arm: change instruction classes to catch hyp traps
Change-Id: I122918d0e3dfd01ae1a4ca4f19240a069115c8b7
2016-08-02 10:38:01 +01:00
Mitch Hayenga
8a476d387c isa: Modify get/check interrupt routines
Make it so that getInterrupt *always* returns an interrupt if
checkInterrupts() returns true.  This fixes/simplifies handling
of interrupts on the SMT FS CPUs (currently minor).
2016-07-21 17:19:15 +01:00
Andreas Sandberg
f471cc22f0 arm: Don't consult the TLB test iface for functional translations
Don't consult the TLB test interface for PA's returned by functional
translations by the AT instruction. We implement this by chaning the
ISA code to synthesize 0-length functional reads for the TLB lookup.
The TLB then bypasses the final PA check in the tester if the size is
zero.

Change-Id: I2487b7f829cea88c37e229e9fc7a4543aced961b
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-07-11 10:39:56 +01:00
Nikos Nikoleris
1fac3a292a arm: Mark uninitialized new TLB entries as not valid
Previously when we initialized the TLB we would allocate a number of
TLB entries which would be marked as valid. As a result the TLB
contained an entry which would be considered a valid entry for the 0
page.

Change-Id: I23ace86426a171a4f6200ebeb29ad57c21647036
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-06-20 15:51:31 +01:00
Andreas Sandberg
37bb0d0fb3 kern, arm: Dump dmesg on kernel panic/oops
Add helper functions to dump the guest kernel's dmesg buffer to a text
file in m5out. This functionality is split into two parts. First, a
dmesg dump function that can be used in other places:

void Linux::dumpDmesg(ThreadContext *, std::ostream &)

This function is used to implement two PCEvents: DmesgDumpEvent and
KernelPanic event. The only difference between the two is that the
latter produces a gem5 panic instead of a warning in addition to
dumping the kernel log.

Change-Id: I6d2af1d666ace57124089648ea906f6c787ac63c
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-06-20 14:39:49 +01:00
Tuan Ta
bb9033c26b gpu-compute: Fixed a bug in decoding Atomic ST
There is a mismatch between DataType and SrcDataType in constructing
Atomic ST instruction. The mismatch causes atomic_store and
atomic_store_explicit function to store incorrect value in memory.
2016-06-18 13:02:13 -04:00
jkalamat
3724fb15fa gpu-compute: parametrize Wavefront size
Eliminate the VSZ constant that defined the Wavefront size (in numbers of work
items); replaced it with a parameter in the GPU.py configuration script.
Changed all data structures dependent on the Wavefront size to be dynamically
sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at
initialization time.
2016-06-09 11:24:55 -04:00
David Guillen Fandos
70798b1ba0 stats: Fixing regStats function for some SimObjects
Fixing an issue with regStats not calling the parent class method
for most SimObjects in Gem5. This causes issues if one adds new
stats in the base class (since they are never initialized properly!).

Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-06-06 17:16:43 +01:00
Stephan Diestelhorst
589033c94c sim: Call regStats of base-class as well
We want to extend the stats of objects hierarchically and thus it is necessary
to register the statistics of the base-class(es), as well.  For now, these are
empty, but generic stats will be added there.

Patch originally provided by Akash Bagdia at ARM Ltd.
2016-06-06 17:16:43 +01:00
Curtis Dunham
d31c0f165d arm: refactor page table format determination
In particular, when EL0 is in AArch32 but EL1 is AArch64, AArch64
memory translation must be used.  This is essential for typical
AArch64/32 interworking use cases.
2016-06-02 16:44:57 +01:00
Andreas Sandberg
660fbd543f arm: Rewrite ERET to behave according to the ARMv8 ARM
The ERET instruction doesn't set PSTATE correctly in some cases
(particularly when returning to aarch32 code). Among other things,
this breaks EL0 thumb code when using a 64-bit kernel. This changeset
updates the ERET implementation to match the ARM ARM.

Change-Id: I408e7c69a23cce437859313dfe84e68744b07c98
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-06-02 13:41:26 +01:00
Andreas Sandberg
f48ad5b29d arm: Correctly check FP/SIMD access permission in aarch32
The current implementation of aarch32 FP/SIMD in gem5 assumes that EL1
and higher are all 32-bit. This breaks interprocessing since an
aarch64 EL1 uses different enable/disable bits. This change updates
the permission checks to according to what is prescribed by the ARM
ARM.

Change-Id: Icdcef31b00644cfeebec00216b3993aa1de12b88
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-06-02 13:38:30 +01:00
Andreas Sandberg
c661cc75ec arm: Enable LPAE support by default
LPAE has been tested with Linux 4.4 and seems to work just fine. Let's
enable it by default.

Change-Id: Id88c6e3c91ae9c353279d42f2aa1f8a78485bd32
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-05-31 12:14:40 +01:00
Andreas Sandberg
9d4a42e8c5 arm: Correctly check translation mode (aarch64/aarch32)
According to the ARM ARM (see AArch32.TranslateAddress in the
pseudocode library), the TLB should be operating in aarch64 mode if
the EL0 is aarch32 and EL1 is aarch64. This is currently not the case
in gem5, which breaks 64/32 interprocessing. Update the check to match
the reference manual.

Change-Id: I6f1444d57c0e2eb5f8880f513f33a9197b7cb2ce
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-05-31 12:14:37 +01:00
Andreas Sandberg
4a6bb82123 arm: Use the target EL state when determining fault format
We currently check the current state instead of the state of the
target EL when determining how we report a fault. This breaks
interprocessing since EL0 in aarch32 would report its fault status
using the aarch32 registers even if EL1 is in aarch64. Fix this to
report the fault using the format of the target EL.

Change-Id: Ic080267ac210783d1e01c722a4ddaa687dce280e
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
2016-05-27 15:02:01 +01:00
Andreas Sandberg
2ace05044c arm: Fix incorrect TLB permission check in aarch32
The TLB currently assumes that the pxn bit in an LPAE page descriptor
disables execution from unprivileged mode. However, according to the
architecture manual, this bit should disable execution from privileged
modes. Update the TLB implementation to reflect this behavior.

Change-Id: I7f1bb232d7a94a93fd601a9230223195ac952947
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-05-26 17:38:15 +01:00
Andreas Sandberg
e360f4db07 arm: Make EL checks available in SE mode
A lot of code assumes that it is possible to test what the highest EL
is and if it is 64 bit. These calls currently don't work in SE mode
since they rely on an instance of an ArmSystem.

Change-Id: I0d1f261926a66ce3dc4fa116845ffb2a081446f2
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-05-26 17:33:38 +01:00