Commit graph

3044 commits

Author SHA1 Message Date
Alec Roelke ee0c261e10 riscv: [Patch 7/5] Corrected LRSC semantics
RISC-V makes use of load-reserved and store-conditional instructions to
enable creation of lock-free concurrent data manipulation as well as
ACQUIRE and RELEASE semantics for memory ordering of LR, SC, and AMO
instructions (the latter of which do not follow LR/SC semantics). This
patch is a correction to patch 4, which added these instructions to the
implementation of RISC-V. It modifies locked_mem.hh and the
implementations of lr.w, sc.w, lr.d, and sc.d to apply the proper gem5
flags and return the proper values.

An important difference between gem5's LLSC semantics and RISC-V's LR/SC
ones, beyond the name, is that gem5 uses 0 to indicate failure and 1 to
indicate success, while RISC-V is the opposite. Strictly speaking, RISC-V
uses 0 to indicate success and nonzero to indicate failure where the
value would indicate the error, but currently only 1 is reserved as a
failure code by the ISA reference.

This is the seventh patch in the series which originally consisted of five
patches that added the RISC-V ISA to gem5. The original five patches added
all of the instructions and added support for more detailed CPU models and
the sixth patch corrected the implementations of Linux constants and
structs. There will be an eighth patch that adds some regression tests
for the instructions.

[Removed some commented-out code from locked_mem.hh.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 84020a8aed riscv: [Patch 6/5] Improve Linux emulation for RISC-V
This is an add-on patch for the original series that implemented RISC-V
that improves the implementation of Linux emulation for SE mode. Basically
it cleans up linux/linux.hh by removing constants that haven't been
defined for the RISC-V Linux proxy kernel and rearranging the stat
struct so it aligns with RISC-V's implementation of it. It also adds
placeholders for system calls that have been given numbers in RISC-V
but haven't been given implementations yet. These system calls are
as follows:
- readlinkat
- sigprocmask
- ioctl
- clock_gettime
- getrusage
- getrlimit
- setrlimit

The first five patches implemented RISC-V with the base ISA and multiply,
floating point, and atomic extensions and added support for detailed
CPU models with memory timing.

[Fixed incompatibility with changes made from patch 1.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 126c0360e2 riscv: [Patch 5/5] Added missing support for timing CPU models
Last of five patches adding RISC-V to GEM5. This patch adds support for
timing, minor, and detailed CPU models that was missing in the last four,
which basically consists of handling timing-mode memory accesses and
telling the minor and detailed models what a no-op instruction should
be (addi zero, zero, 0).

Patches 1-4 introduced RISC-V and implemented the base instruction set,
RV64I, and added the multiply, floating point, and atomic memory
extensions, RV64MAFD.

[Fixed compatibility with edit from patch 1.]
[Fixed compatibility with hg copy edit from patch 1.]
[Fixed some style errors in locked_mem.hh.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 535e6c5fa4 riscv: [Patch 4/5] Added RISC-V atomic memory extension RV64A
Fourth of five patches adding RISC-V to GEM5. This patch adds the RV64A
extension, which includes atomic memory instructions. These instructions
atomically read a value from memory, modify it with a value contained in a
source register, and store the original memory value in the destination
register and modified value back into memory. Because this requires two
memory accesses and GEM5 does not support two timing memory accesses in
a single instruction, each of these instructions is split into two micro-
ops: A "load" micro-op, which reads the memory, and a "store" micro-op,
which modifies and writes it back.  Each atomic memory instruction also has
two bits that acquire and release a lock on its memory location.
Additionally, there are atomic load and store instructions that only either
load or store, but not both, and can acquire or release memory locks.

Note that because the current implementation of RISC-V only supports one
core and one thread, it doesn't make sense to make use of AMO instructions.
However, they do form a standard extension of the RISC-V ISA, so they are
included mostly as a placeholder for when multithreaded execution is
implemented.  As a result, any tests for their correctness in a future
patch may be abbreviated.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I;
patch 2 implemented the integer multiply extension, RV64M; and patch 3
implemented the single- and double-precision floating point extensions,
RV64FD.

Patch 5 will add support for timing, minor, and detailed CPU models that
isn't present in patches 1-4.

[Added missing file amo.isa]
[Replaced information removed from initial patch that was missed during
division into multiple patches.]
[Fixed some minor formatting issues.]
[Fixed oversight where LR and SC didn't have both AQ and RL flags.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 1229b3b623 riscv: [Patch 3/5] Added RISCV floating point extensions RV64FD
Third of five patches adding RISC-V to GEM5. This patch adds the RV64FD
extensions, which include single- and double-precision floating point
instructions.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I
and patch 2 implemented the integer multiply extension, RV64M.

Patch 4 will implement the atomic memory instructions, RV64A, and patch
5 will add support for timing, minor, and detailed CPU models that is
missing from the first four patches.

[Fixed exception handling in floating-point instructions to conform better
to IEEE-754 2008 standard and behavior of the Chisel-generated RISC-V
simulator.]
[Fixed style errors in decoder.isa.]
[Fixed some fuzz caused by modifying a previous patch.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 070da98493 riscv: [Patch 2/5] Added RISC-V multiply extension RV64M
Second of five patches adding RISC-V to GEM5.  This patch adds the
RV64M extension, which includes integer multiply and divide instructions.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I.

Patch 3 will implement the floating point extensions, RV64FD; patch 4 will
implement the atomic memory instructions, RV64A; and patch 5 will add
support for timing, minor, and detailed CPU models that is missing from
the first four patches.

[Added mulw instruction that was missed when dividing changes among
patches.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke e76bfc8764 arch: [Patch 1/5] Added RISC-V base instruction set RV64I
First of five patches adding RISC-V to GEM5. This patch introduces the
base 64-bit ISA (RV64I) in src/arch/riscv for use with syscall emulation.
The multiply, floating point, and atomic memory instructions will be added
in additional patches, as well as support for more detailed CPU models.
The loader is also modified to be able to parse RISC-V ELF files, and a
"Hello world\!" example for RISC-V is added to test-progs.

Patch 2 will implement the multiply extension, RV64M; patch 3 will implement
the floating point (single- and double-precision) extensions, RV64FD;
patch 4 will implement the atomic memory instructions, RV64A, and patch 5
will add support for timing, minor, and detailed CPU models that is missing
from the first four patches (such as handling locked memory).

[Removed several unused parameters and imports from RiscvInterrupts.py,
RiscvISA.py, and RiscvSystem.py.]
[Fixed copyright information in RISC-V files copied from elsewhere that had
ARM licenses attached.]
[Reorganized instruction definitions in decoder.isa so that they are sorted
by opcode in preparation for the addition of ISA extensions M, A, F, D.]
[Fixed formatting of several files, removed some variables and
instructions that were missed when moving them to other patches, fixed
RISC-V Foundation copyright attribution, and fixed history of files
copied from other architectures using hg copy.]
[Fixed indentation of switch cases in isa.cc.]
[Reorganized syscall descriptions in linux/process.cc to remove large
number of repeated unimplemented system calls and added implmementations
to functions that have received them since it process.cc was first
created.]
[Fixed spacing for some copyright attributions.]
[Replaced the rest of the file copies using hg copy.]
[Fixed style check errors and corrected unaligned memory accesses.]
[Fix some minor formatting mistakes.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Tony Gutierrez 0799600686 x86: fix issue with casting in Cvtf2i
UBSAN flags this operation because it detects that arg is being cast directly
to an unsigned type, argBits. this patch fixes this by first casting the
value to a signed int type, then reintrepreting the raw bits of the signed
int into argBits.
2016-11-21 15:35:56 -05:00
Tony Gutierrez ae55cba281 x86: fix loading/storing of Float80 types 2016-11-19 12:35:14 -05:00
Andreas Hansson 6ed567d600 alpha: Remove ALPHA tru64 support and associated tests
No one appears to be using it, and it is causing build issues
and increases the development and maintenance effort.
2016-11-17 04:54:14 -05:00
Tony Gutierrez 74249f80df hsail,gpu-compute: fixes to appease clang++
fixes to appease clang++. tested on:

Ubuntu clang version 3.5.0-4ubuntu2~trusty2
(tags/RELEASE_350/final) (based on LLVM 3.5.0)

Ubuntu clang version 3.6.0-2ubuntu1~trusty1
(tags/RELEASE_360/final) (based on LLVM 3.6.0)

the fixes address the following five issues:

1) the exec continuations in gpu_static_inst.hh were marked
   as protected when they should be public. here we mark
   them as public

2) the Abs instruction uses std::abs() in its execute method.
   because Abs is templated, it can also operate on U32 and U64,
   types, which cause Abs::execute() to pass uint32_t and uint64_t
   types to std::abs() respectively. this triggers a warning
   because std::abs() has no effect in this case. to rememdy this
   we add template specialization for the execute() method of Abs
   when its template paramter is U32 or U64.

3) Some potocols that utilize the code in cprintf.hh were missing
   includes to BoolVec.hh, which defines operator<< for the BoolVec
   type. This would cause issues when the generated code would try
   to pass a BoolVec type to a method in cprintf.hh that used
   operator<< on an instance of a BoolVec.

4) Surprise, clang doesn't like it when you clobber all the bits
   in a newly allocated object. I.e., this code:

   tlb = new GpuTlbEntry\[size\];
   std::memset(tlb, 0, sizeof(GpuTlbEntry) \* size);

   Let's use std::vector to track the TLB entries in the GpuTlb now...

5) There were a few variables used only in DPRINTFs, so we mark them
   with M5_VAR_USED.
2016-10-26 22:48:45 -04:00
Michael LeBeane dc16c1ceb8 dev: Add m5 op to toggle synchronization for dist-gem5.
This patch adds the ability for an application to request dist-gem5 to begin/
end synchronization using an m5 op. When toggling on sync, all nodes agree
on the next sync point based on the maximum of all nodes' ticks. CPUs are
suspended until the sync point to avoid sending network messages until sync has
been enabled. Toggling off sync acts like a global execution barrier, where
all CPUs are disabled until every node reaches the toggle off point. This
avoids tricky situations such as one node hitting a toggle off followed by a
toggle on before the other nodes hit the first toggle off.
2016-10-26 22:48:40 -04:00
Tony Gutierrez de72e36619 gpu-compute: support in-order data delivery in GM pipe
this patch adds an ordered response buffer to the GM pipeline
to ensure in-order data delivery. the buffer is implemented as
a stl ordered map, which sorts the request in program order by
using their sequence ID. when requests return to the GM pipeline
they are marked as done. only the oldest request may be serviced
from the ordered buffer, and only if is marked as done.

the FIFO response buffers are kept and used in OoO delivery mode
2016-10-26 22:48:28 -04:00
Tony Gutierrez b63eb1302b gpu-compute, hsail: pass GPUDynInstPtr to getRegisterIndex()
for HSAIL an operand's indices into the register files may be calculated
trivially, because the operands are always read from a register file, or are
an immediate.

for machine ISA, however, an op selector may specify special registers, or
may specify special SGPRs with an alias op selector value. the location of
some of the special registers values are dependent on the size of the RF
in some cases. here we add a way for the underlying getRegisterIndex()
method to know about the size of the RFs, so that it may find the relative
positions of the special register values.
2016-10-26 22:47:49 -04:00
Tony Gutierrez 844fb845a5 gpu-compute, hsail: make the PC a byte address, not an instruction index
currently the PC is incremented on an instruction granularity, and not as an
instruction's byte address. machine ISA instructions assume the PC is a byte
address, and is incremented accordingly. here we make the GPU model, and the
HSAIL instructions treat the PC as a byte address as well.
2016-10-26 22:47:43 -04:00
Tony Gutierrez d327cdba07 gpu-compute: add gpu_isa.hh to switch hdrs, add GPUISA to WF
the GPUISA class is meant to encapsulate any ISA-specific behavior - special
register accesses, isa-specific WF/kernel state, etc. - in a generic enough
way so that it may be used in ISA-agnostic code.

gpu-compute: use the GPUISA object to advance the PC

the GPU model treats the PC as a pointer to individual instruction objects -
which are store in a contiguous array - and not a byte address to be fetched
from the real memory system. this is ok for HSAIL because all instructions
are considered by the model to be the same size.

in machine ISA, however, instructions may be 32b or 64b, and branches are
calculated by advancing the PC by the number of words (4 byte chunks) it
needs to advance in the real instruction stream. because of this there is
a mismatch between the PC we use to index into the instruction array, and
the actual byte address PC the ISA expects. here we move the PC advance
calculation to the ISA so that differences in the instrucion sizes may be
accounted for in generic way.
2016-10-26 22:47:38 -04:00
Tony Gutierrez c7a79c9a42 gpu-compute, hsail: call discardFetch() from the WF
because every taken branch causes fetch to be discarded, we move the call
to the WF to avoid to have to call it from each and every branch instruction
type.
2016-10-26 22:47:27 -04:00
Tony Gutierrez 00a6346c91 hsail, gpu-compute: remove doGm/SmReturn add completeAcc
we are removing doGmReturn from the GM pipe, and adding completeAcc()
implementations for the HSAIL mem ops. the behavior in doGmReturn is
dependent on HSAIL and HSAIL mem ops, however the completion phase
of memory ops in machine ISA can be very different, even amongst individual
machine ISA mem ops. so we remove this functionality from the pipeline and
allow it to be implemented by the individual instructions.
2016-10-26 22:47:19 -04:00
Tony Gutierrez 7ac38849ab gpu-compute: remove inst enums and use bit flag for attributes
this patch removes the GPUStaticInst enums that were defined in GPU.py.
instead, a simple set of attribute flags that can be set in the base
instruction class are used. this will help unify the attributes of HSAIL
and machine ISA instructions within the model itself.

because the static instrution now carries the attributes, a GPUDynInst
must carry a pointer to a valid GPUStaticInst so a new static kernel launch
instruction is added, which carries the attributes needed to perform a
the kernel launch.
2016-10-26 22:47:11 -04:00
Tony Gutierrez e1ad8035a3 gpu-compute: move disassemle() implementation to GPUStaticInst 2016-10-26 22:47:05 -04:00
Tony Gutierrez 0a6cdff176 gpu-compute, arch: add some methods to the base inst classes for ISA support 2016-10-26 22:47:01 -04:00
Fernando Endo 6c72c35519 cpu, arm: Distinguish Float* and SimdFloat*, create FloatMem* opClass
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to
Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which
distinguishes writes to the INT and FP register banks.
Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72,
where the "latency" of FMADD is 3 if the next instruction is a FMADD and
has only the augend to destination dependency, otherwise it's 7 cycles.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-10-15 14:58:45 -05:00
Mitch Hayenga bd0c2d5b0b isa,arm: Add missing AArch32 FP instructions
This commit adds missing non-predicated, scalar floating point
instructions.  Specifically VRINT* floating point integer rounding
instructions and VSEL* floating point conditional selects.

Change-Id: I23cbd1389f151389ac8beb28a7d18d5f93d000e7
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-10-13 19:22:10 +01:00
Alexandru Dutu 3f0118876f kvm: Adding details to kvm page fault in x86
Adding details, e.g. rip, rsp etc. to the kvm pagefault exit when in SE mode.
2016-10-04 13:06:05 -04:00
Alexandru Dutu 68127ca3da hsail: Fix disassembly of load instruction with 3 destination operands 2016-09-16 12:36:20 -04:00
Alexandru Dutu e9b14d5111 gpu-compute: Refactoring Wavefront::dynWaveId 2016-09-16 12:31:46 -04:00
Alexandru Dutu 589e13a23b gpu-compute: Wavefront refactoring
Renaming members of the Wavefront class in accordance with the style guide.
2016-09-16 12:26:52 -04:00
Ricardo Alves e5c1488cb6 arm: Add m5_fail support for aarch64
Change-Id: Id2acbc09772be310a0eb9e33295afab07e08a4fa
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-09-15 18:21:24 +01:00
Michael LeBeane 2c43a21687 x86: Force strict ordering for memory mapped m5ops
Normal MMAPPED_IPR requests are allowed to execute speculatively under the
assumption that they have no side effects.  The special case of m5ops that are
treated like MMAPPED_IPR should not be allowed to execute speculatively, since
they can have side-effects.  Adding the STRICT_ORDER flag to these requests
blocks execution until the associated instruction hits the ROB head.
2016-09-13 23:18:34 -04:00
Nikos Nikoleris 698767e538 cpu, arch: fix the type used for the request flags
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-15 12:00:35 +01:00
Tony Gutierrez fa5e64987e sim: fix issues with pwrite(); don't enable fstatfs
this patch fixes issues with changeset 11593

use the host's pwrite() syscall for pwrite64Func(),
as opposed to pwrite64(), because pwrite64() does
not work well on all distros.

undo the enabling of fstatfs, as we will add this
in a separate pate.
2016-08-05 17:15:19 -04:00
Tony Gutierrez 0b68475b10 x86, sim: add some syscalls to X86
this patch adds an implementation for the pwrite64 syscall and
enables it for x86_64, and enables fstatfs for x86_64.
2016-08-04 12:32:21 -04:00
Curtis Dunham 8b3434a4f2 arm: refactor page table walking
Introduce and use a lookup table.

Using fetchDescriptor() rather than DMA cleanly handles nested paging.

Change-Id: I69ec762f176bd752ba1040890e731826b58d15a6
2016-08-02 10:38:03 +01:00
Dylan Johnson 09218ed397 arm: warn not fail on use of missing miscreg CNTHCTL_EL2
During host bootup, KVM reads/writes to CNTHCTL_EL2. Because this
miscreg has not been implemented, the simulation would end there. This
patch causes the simulation to warn about the read/write instead of fail.

Change-Id: If034bfd0818a9a5e50c5fe86609e945258c96fa3
2016-08-02 10:38:03 +01:00
Dylan Johnson c15711725d arm: Check TLB stage 2 permissions in AArch64
This fixes a bug where stage 2 lookups used the AArch32
permissions rules even if we were executing in AArch64 mode.

Change-Id: Ia40758f0599667ca7ca15268bd3bf051342c24c1
2016-08-02 10:38:03 +01:00
Dylan Johnson bce923c189 arm: correctly assign faulting IPA's to HPFAR_EL2
This patch corrects IPA reporting if the translation faults in a
stage 2 lookup.

Change-Id: I0b914527f8a9f98a5e980a131cf9d03e5584b4e9
2016-08-02 10:38:03 +01:00
Dylan Johnson 4d5d47c173 arm: Add TLBI instruction for stage 2 IPA's
This patch adds support for stage 2 TLBI instructions
such as TLBI IPAS2E1_Xt.

Change-Id: I0cd5e8055b0c1003e03439aa5183252f50ea0a88
2016-08-02 10:38:03 +01:00
Dylan Johnson 89511856fe arm: Fix stage 2 memory attribute checking in AArch64
Change-Id: I14c93a5460550051a12129e792a9a9bd522a145c
2016-08-02 10:38:03 +01:00
Dylan Johnson 02fcca9b6f arm: Fix trapping to Hypervisor during MSR/MRS read/write
This patch restricts trapping to hypervisor only if we are in the
correct exception level for the trap to happen.

Change-Id: I0a382b6a572ef835ea36d2702b8a81b633bd3df0
2016-08-02 10:38:03 +01:00
Dylan Johnson c2271e301d arm: Fix secure state checking in various places
Faults that could potentially be routed to the hypervisor checked
whether or not they were in a secure state without checking if security
was enabled or not. This caused faults not to be routed correctly. This
patch causes secure state checking to first ask if security is enabled.

Change-Id: I179e9b181b27f552734c9bab2b18d05ac579a119
2016-08-02 10:38:02 +01:00
Dylan Johnson 996c1ed33c arm: Fix stage 2 determination in table walker
We recompute if we are doing a stage 2 walk inside of the table walker
but we have already figured it out in the tlb. Pass the information in
to the walk instead of recomputing it.

Change-Id: I39637ce99309b2ddbc30344d45ac9ebf6a203401
2016-08-02 10:38:02 +01:00
Dylan Johnson eac27759e7 arm: Refactor aarch64 table walk logic to remove redundancy
The functional case is already handled within the fetchDescriptor()
function. We can thus use that function for both atomic and functional
mode when we start the table walk.

Change-Id: Iacaed28cd9024d259fd37a58150efd00ff94d86e
2016-08-02 10:38:02 +01:00
Dylan Johnson f9a6f68e0b arm: Add check to fault routing for hypervisor/virtualization
This patch adds the option for faults to be routed to the hypervisor
using the pre-existing routeToHyp() functions that are present in each
fault type.

Change-Id: I9735512c094457636b9870456a5be5432288e004
2016-08-02 10:38:02 +01:00
Dylan Johnson fc6879097b arm: Fix EL perceived at TLB for address translation instructions
During address translation instructions (such as AT S1E1R_Xt) the exception
level can be different than the current exception level. This patch fixes
how the TLB determines what EL to use during these instructions.

Change-Id: Ia9ce229404de9e284bc1f7479fd2c580efd55f8f
2016-08-02 10:38:02 +01:00
Dylan Johnson 2950a95672 arm: Add AArch64 hypervisor call instruction 'hvc'
This patch adds the AArch64 instruction hvc which raises an exception
from EL1 into EL2. The host OS uses this instruction to world switch
into the guest.

Change-Id: I930ee43f4f0abd4b35a68eb2a72e44e3ea6570be
2016-08-02 10:38:02 +01:00
Dylan Johnson c53a57f74f arm: add stage2 translation support
Change-Id: I8f7c09c7ec3a97149ebebf4b21471b244e6cecc1
2016-08-02 10:38:02 +01:00
Curtis Dunham 49538a7118 arm: enable EL2 support
Change-Id: I59fa4fae98c33d9e5c2185382e1411911d27d341
2016-08-02 10:38:01 +01:00
Dylan Johnson 4fbf40daab arm: invalidate TLB miscreg cache on modification of HSCTLR
Change-Id: I5212c91c56435fe008950ed99feacc6921609226
2016-08-02 10:38:01 +01:00
Dylan Johnson e727a0eeaa arm: change instruction classes to catch hyp traps
Change-Id: I122918d0e3dfd01ae1a4ca4f19240a069115c8b7
2016-08-02 10:38:01 +01:00
Mitch Hayenga 8a476d387c isa: Modify get/check interrupt routines
Make it so that getInterrupt *always* returns an interrupt if
checkInterrupts() returns true.  This fixes/simplifies handling
of interrupts on the SMT FS CPUs (currently minor).
2016-07-21 17:19:15 +01:00