Commit graph

3075 commits

Author SHA1 Message Date
Brandon Potter 6c41181b8e syscall_emul: [patch 9/22] remove unused global variable (num_processes) 2016-11-09 14:27:42 -06:00
Brandon Potter 49009f170a syscall_emul: [patch 8/22] refactor process class
Moves aux_vector into its own .hh and .cc files just to get it out of the
already crowded Process files. Arguably, it could stay there, but it's
probably better just to move it and give it files.

The changeset looks ugly around the Process header file, but the goal here is
to move methods and members around so that they're not defined randomly
throughout the entire header file. I expect this is likely one of the reasons
why I several unused variables related to this class. So, the methods are
declared first followed by members. I've tried to aggregate them together
so that similar entries reside near one another.

There are other changes coming to this code so this is by no means the
final product.
2016-11-09 14:27:41 -06:00
Brandon Potter 3886c4a8f2 syscall_emul: [patch 5/22] remove LiveProcess class and use Process instead
The EIOProcess class was removed recently and it was the only other class
which derived from Process. Since every Process invocation is also a
LiveProcess invocation, it makes sense to simplify the organization by
combining the fields from LiveProcess into Process.
2016-11-09 14:27:40 -06:00
Brandon Potter 7b6cf951e2 sparc: fix bugs caused by cd7f3a1dbf55
Turns out that SPARC SE mode relied on M5_pid being "0" in
all cases. The entries in the SPARC TLBs are accessed with
M5_pid as their context. This is buggy in the sense that it
will never work with more than one process or any
initialization that doesn't have the M5_pid value passed in
as "0".

cd7f3a1dbf55 broke the SPARC build because it deletes M5_pid
and uses a _pid with a default of "100" instead. This caused
the SPARC TLB to never return any valid lookups for any
request; the program never moved past the first instruction
with SPARC SE in the regression tester.

The solution proposed in this changeset is to initialize
the address space identification register with the PID value
that is passed into the process class as a parameter from
Python. This should return the correct responses from the TLB
since the insertions and lookups into the page table will be
using the same PID.

Furthermore, there are corner cases in the code which elevate
privileges and revert to using context "0" as the context in
the TLB. I believe that these are related to kernel level
traps and hypervisor privilege escalations, but I'm not
completely sure. I've tried to address the corner cases
properly, but it would be beneficial to have someone who is
familiar with the SPARC architecture to take a look at this
fix.
2017-02-17 12:01:51 -05:00
Curtis Dunham 80c17d0a8d arm, kvm: remove KvmGic
KvmGic functionality has been subsumed within the new MuxingKvmGic
model, which has Pl390 fallback when not using KVM for fast emulation.
This simplifies configuration and will enable checkpointing between
KVM emulation and full-system simulation.

Change-Id: Ie61251720064c512843015c075e4ac419a4081e8
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14 15:09:18 -06:00
Curtis Dunham 0edf6dc956 arm, kvm: implement MuxingKvmGic
This device allows us to, when KVM support is detected and compiled in,
instantiate the same Gic device whether the actual simulation is with
KVM cores or simulated cores.  Checkpointing is not yet supported.

Change-Id: I67e4e0b6fb7ab5058e52c933f4f3d8e7ab24981e
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14 15:09:18 -06:00
Curtis Dunham 41beacce08 sim, kvm: make KvmVM a System parameter
A KVM VM is typically a child of the System object already, but for
solving future issues with configuration graph resolution, the most
logical way to keep track of this object is for it to be an actual
parameter of the System object.

Change-Id: I965ded22203ff8667db9ca02de0042ff1c772220
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14 15:09:18 -06:00
Curtis Dunham d3bfc03688 sim,kvm,arm: fix typos
Change-Id: Ifc65d42eebfd109c1c622c82c3c3b3e523819e85
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14 15:09:18 -06:00
Jason Lowe-Power 153e5879c6 x86: Fix implicit stack addressing in 64-bit mode
When in 64-bit mode, if the stack is accessed implicitly by an instruction
the alternate address prefix should be ignored if present.

This patch adds an extra flag to the ldstop which signifies when the
address override should be ignored. Then, for all of the affected
instructions, this patch adds two options to the ld and st opcode to use
the current stack addressing mode for all addresses and to ignore the
AddressSizeFlagBit.  Finally, this patch updates the x86 TLB to not
truncate the address if it is in 64-bit mode and the IgnoreAddrSizeFlagBit
is set.

This fixes a problem when calling __libc_start_main with a binary that is
linked with a recent version of ld. This version of ld uses the address
override prefix (0x67) on the call instruction instead of a nop.

Note: This has not been tested in compatibility mode and only the call
instruction with the address override prefix has been tested.

See [1] page 9 (pdf page 45)

For instructions that are affected see [1] page 519 (pdf page 555).

[1] http://support.amd.com/TechDocs/24594.pdf

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-10 11:19:34 -05:00
Bjoern A. Zeeb f0786704db arm: AArch64 report cache size correctly when reading CTR_EL0
Trying to read MISCREG_CTR_EL0 on AArch64 returned 0 as is was not
implmemented.  With that an operating system relying on the cache line
sizes reported in order to manage the caches would (a) panic given the
returned value 0 is not valid (high bit is RES1) or (b) worst case would
assume a cache line size of 4 doing a tremendous amount of extra
instruction work (including fetching).  Return the same values as for ARMv7
as the fields seem to be the same, or RES0/1 seem to be reported
accordingly for AArch64

In collaboration with:  Andrew Turner

Testing Done: Checked on FreeBSD boots with extra printfs;  also observed a
reduction of a factor of about 10 in instruction fetches for a simple
micro-test.

Reviewed at http://reviews.gem5.org/r/3667/

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-09 18:54:28 -05:00
Alec Roelke e4c57275d3 riscv: Fix crash when syscall argument reg index is too high
By default, doSyscall gets the values of six registers to be used for
system call arguments.  RISC-V, by convention, only has four.  Because
RISC-V's implementation of these indices is as arrays of integers rather
than as base indices plus offsets, trying to get the fifth argument
register's value will cause a crash.  This patch fixes that by returning 0
for any index higher than 3.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-27 15:05:01 -06:00
Brandon Potter e387521527 syscall_emul: [patch 4/22] remove redundant M5_pid field from process 2016-11-09 14:27:40 -06:00
Brandon Potter a928a438b8 style: [patch 3/22] reduce include dependencies in some headers
Used cppclean to help identify useless includes and removed them. This
involved erroneously included headers, but also cases where forward
declarations could have been used rather than a full include.
2016-11-09 14:27:40 -06:00
Brandon Potter 1ced08c850 syscall_emul: [patch 2/22] move SyscallDesc into its own .hh and .cc
The class was crammed into syscall_emul.hh which has tons of forward
declarations and template definitions. To clean it up a bit, moved the
class into separate files and commented the class with doxygen style
comments. Also, provided some encapsulation by adding some accessors and
a mutator.

The syscallreturn.hh file was renamed syscall_return.hh to make it consistent
with other similarly named files in the src/sim directory.

The DPRINTF_SYSCALL macro was moved into its own header file with the
include the Base and Verbose flags as well.

--HG--
rename : src/sim/syscallreturn.hh => src/sim/syscall_return.hh
2016-11-09 14:27:40 -06:00
Brandon Potter 7a8dda49a4 style: [patch 1/22] use /r/3648/ to reorganize includes 2016-11-09 14:27:37 -06:00
Andreas Sandberg abe7ef95cb sim: Remove redundant export_method_cxx_predecls
The headers declared in export_method_cxx_predecls are redundant since a
SimObject's main header is automatically included.

Change-Id: Ied9e84630b36960e54efe91d16f8c66fba7e0da0
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Joe Gross <joseph.gross@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-03 12:03:06 +00:00
Curtis Dunham f04d81163c arm: provide correct timer availability in ID_PFR1 register
Change-Id: Id4cd839c12b70616017a5830e3f9bbb59b0f97ba
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:28 -06:00
Curtis Dunham ae2e0ca3d0 arm: compute ID_AA64PFR{0,1}_EL1 registers
Compute the proper values of the aforementioned registers from
the system configuration rather than configuring the values themselves.

Change-Id: If9774b6610a29568b80ae4866107b9a6a5b5be0f
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:28 -06:00
Curtis Dunham a73937b60c arm: compute ID_PFR{0,1} registers
Compute the proper values of the aforementioned registers from
the system configuration rather than configuring the values themselves.

Change-Id: Ie7685b5d8b5f2dd9d6380b4af74f16d596b2bfd1
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham 282cf5807d arm: miscreg refactoring
Change-Id: I4e9e8f264a4a4239dd135a6c7a1c8da213b6d345
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham 9cf6bc444b arm: audit SCTLR
Change-Id: I814f1431a5f754f75721c9ac51171f860a714d24
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham 7ddb55a5f2 arm: remove SCTLR.FI
Removed from ARMARM.

Change-Id: Ie8f28e4fa6e1b46dfd9c8c4b379e5b42fe25421d
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham 19d90956eb arm: update AArch{64,32} register mappings
Change-Id: Idaaaeb3f7b1a0bdbf18d8e2d46686c78bb411317
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Brandon Potter cc84eb813c syscall_emul: implement fallocate 2016-12-15 13:16:25 -05:00
Brandon Potter 68e9c0e73b syscall_emul: add support for x86 statfs system calls 2016-12-15 13:16:03 -05:00
Brandon Potter 3d0a537862 hsail: disable asserts to allow immediate operands i.e. 0 with loads 2016-12-02 18:01:58 -05:00
Brandon Potter 900fd15622 hsail: add stub type and stub out several instructions 2016-12-02 18:01:57 -05:00
Brandon Potter 86b375f2f3 hsail: add popcount type and generate popcount instructions 2016-12-02 18:01:55 -05:00
Brandon Potter 3bb3db6194 hsail: add a wavesize case statement to register operand code 2016-12-02 18:01:52 -05:00
Brandon Potter 69c2d86d68 hsail: generate mov instructions for more arith_types and bit_types 2016-12-02 18:01:49 -05:00
Tony Gutierrez 38708f369b hsail: fix unsigned offset bug in address calculation
it's possible for the offset provided to an HSAIL mem inst to be a negative
value, however the variable we use to hold the offset is an unsigned type.
this can lead to excessively large offset values when the offset is negative,
which will almost certainly cause the access to go out of bounds.
2016-12-02 11:40:52 -05:00
Alec Roelke ee0c261e10 riscv: [Patch 7/5] Corrected LRSC semantics
RISC-V makes use of load-reserved and store-conditional instructions to
enable creation of lock-free concurrent data manipulation as well as
ACQUIRE and RELEASE semantics for memory ordering of LR, SC, and AMO
instructions (the latter of which do not follow LR/SC semantics). This
patch is a correction to patch 4, which added these instructions to the
implementation of RISC-V. It modifies locked_mem.hh and the
implementations of lr.w, sc.w, lr.d, and sc.d to apply the proper gem5
flags and return the proper values.

An important difference between gem5's LLSC semantics and RISC-V's LR/SC
ones, beyond the name, is that gem5 uses 0 to indicate failure and 1 to
indicate success, while RISC-V is the opposite. Strictly speaking, RISC-V
uses 0 to indicate success and nonzero to indicate failure where the
value would indicate the error, but currently only 1 is reserved as a
failure code by the ISA reference.

This is the seventh patch in the series which originally consisted of five
patches that added the RISC-V ISA to gem5. The original five patches added
all of the instructions and added support for more detailed CPU models and
the sixth patch corrected the implementations of Linux constants and
structs. There will be an eighth patch that adds some regression tests
for the instructions.

[Removed some commented-out code from locked_mem.hh.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 84020a8aed riscv: [Patch 6/5] Improve Linux emulation for RISC-V
This is an add-on patch for the original series that implemented RISC-V
that improves the implementation of Linux emulation for SE mode. Basically
it cleans up linux/linux.hh by removing constants that haven't been
defined for the RISC-V Linux proxy kernel and rearranging the stat
struct so it aligns with RISC-V's implementation of it. It also adds
placeholders for system calls that have been given numbers in RISC-V
but haven't been given implementations yet. These system calls are
as follows:
- readlinkat
- sigprocmask
- ioctl
- clock_gettime
- getrusage
- getrlimit
- setrlimit

The first five patches implemented RISC-V with the base ISA and multiply,
floating point, and atomic extensions and added support for detailed
CPU models with memory timing.

[Fixed incompatibility with changes made from patch 1.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 126c0360e2 riscv: [Patch 5/5] Added missing support for timing CPU models
Last of five patches adding RISC-V to GEM5. This patch adds support for
timing, minor, and detailed CPU models that was missing in the last four,
which basically consists of handling timing-mode memory accesses and
telling the minor and detailed models what a no-op instruction should
be (addi zero, zero, 0).

Patches 1-4 introduced RISC-V and implemented the base instruction set,
RV64I, and added the multiply, floating point, and atomic memory
extensions, RV64MAFD.

[Fixed compatibility with edit from patch 1.]
[Fixed compatibility with hg copy edit from patch 1.]
[Fixed some style errors in locked_mem.hh.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 535e6c5fa4 riscv: [Patch 4/5] Added RISC-V atomic memory extension RV64A
Fourth of five patches adding RISC-V to GEM5. This patch adds the RV64A
extension, which includes atomic memory instructions. These instructions
atomically read a value from memory, modify it with a value contained in a
source register, and store the original memory value in the destination
register and modified value back into memory. Because this requires two
memory accesses and GEM5 does not support two timing memory accesses in
a single instruction, each of these instructions is split into two micro-
ops: A "load" micro-op, which reads the memory, and a "store" micro-op,
which modifies and writes it back.  Each atomic memory instruction also has
two bits that acquire and release a lock on its memory location.
Additionally, there are atomic load and store instructions that only either
load or store, but not both, and can acquire or release memory locks.

Note that because the current implementation of RISC-V only supports one
core and one thread, it doesn't make sense to make use of AMO instructions.
However, they do form a standard extension of the RISC-V ISA, so they are
included mostly as a placeholder for when multithreaded execution is
implemented.  As a result, any tests for their correctness in a future
patch may be abbreviated.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I;
patch 2 implemented the integer multiply extension, RV64M; and patch 3
implemented the single- and double-precision floating point extensions,
RV64FD.

Patch 5 will add support for timing, minor, and detailed CPU models that
isn't present in patches 1-4.

[Added missing file amo.isa]
[Replaced information removed from initial patch that was missed during
division into multiple patches.]
[Fixed some minor formatting issues.]
[Fixed oversight where LR and SC didn't have both AQ and RL flags.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 1229b3b623 riscv: [Patch 3/5] Added RISCV floating point extensions RV64FD
Third of five patches adding RISC-V to GEM5. This patch adds the RV64FD
extensions, which include single- and double-precision floating point
instructions.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I
and patch 2 implemented the integer multiply extension, RV64M.

Patch 4 will implement the atomic memory instructions, RV64A, and patch
5 will add support for timing, minor, and detailed CPU models that is
missing from the first four patches.

[Fixed exception handling in floating-point instructions to conform better
to IEEE-754 2008 standard and behavior of the Chisel-generated RISC-V
simulator.]
[Fixed style errors in decoder.isa.]
[Fixed some fuzz caused by modifying a previous patch.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke 070da98493 riscv: [Patch 2/5] Added RISC-V multiply extension RV64M
Second of five patches adding RISC-V to GEM5.  This patch adds the
RV64M extension, which includes integer multiply and divide instructions.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I.

Patch 3 will implement the floating point extensions, RV64FD; patch 4 will
implement the atomic memory instructions, RV64A; and patch 5 will add
support for timing, minor, and detailed CPU models that is missing from
the first four patches.

[Added mulw instruction that was missed when dividing changes among
patches.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke e76bfc8764 arch: [Patch 1/5] Added RISC-V base instruction set RV64I
First of five patches adding RISC-V to GEM5. This patch introduces the
base 64-bit ISA (RV64I) in src/arch/riscv for use with syscall emulation.
The multiply, floating point, and atomic memory instructions will be added
in additional patches, as well as support for more detailed CPU models.
The loader is also modified to be able to parse RISC-V ELF files, and a
"Hello world\!" example for RISC-V is added to test-progs.

Patch 2 will implement the multiply extension, RV64M; patch 3 will implement
the floating point (single- and double-precision) extensions, RV64FD;
patch 4 will implement the atomic memory instructions, RV64A, and patch 5
will add support for timing, minor, and detailed CPU models that is missing
from the first four patches (such as handling locked memory).

[Removed several unused parameters and imports from RiscvInterrupts.py,
RiscvISA.py, and RiscvSystem.py.]
[Fixed copyright information in RISC-V files copied from elsewhere that had
ARM licenses attached.]
[Reorganized instruction definitions in decoder.isa so that they are sorted
by opcode in preparation for the addition of ISA extensions M, A, F, D.]
[Fixed formatting of several files, removed some variables and
instructions that were missed when moving them to other patches, fixed
RISC-V Foundation copyright attribution, and fixed history of files
copied from other architectures using hg copy.]
[Fixed indentation of switch cases in isa.cc.]
[Reorganized syscall descriptions in linux/process.cc to remove large
number of repeated unimplemented system calls and added implmementations
to functions that have received them since it process.cc was first
created.]
[Fixed spacing for some copyright attributions.]
[Replaced the rest of the file copies using hg copy.]
[Fixed style check errors and corrected unaligned memory accesses.]
[Fix some minor formatting mistakes.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Tony Gutierrez 0799600686 x86: fix issue with casting in Cvtf2i
UBSAN flags this operation because it detects that arg is being cast directly
to an unsigned type, argBits. this patch fixes this by first casting the
value to a signed int type, then reintrepreting the raw bits of the signed
int into argBits.
2016-11-21 15:35:56 -05:00
Tony Gutierrez ae55cba281 x86: fix loading/storing of Float80 types 2016-11-19 12:35:14 -05:00
Andreas Hansson 6ed567d600 alpha: Remove ALPHA tru64 support and associated tests
No one appears to be using it, and it is causing build issues
and increases the development and maintenance effort.
2016-11-17 04:54:14 -05:00
Tony Gutierrez 74249f80df hsail,gpu-compute: fixes to appease clang++
fixes to appease clang++. tested on:

Ubuntu clang version 3.5.0-4ubuntu2~trusty2
(tags/RELEASE_350/final) (based on LLVM 3.5.0)

Ubuntu clang version 3.6.0-2ubuntu1~trusty1
(tags/RELEASE_360/final) (based on LLVM 3.6.0)

the fixes address the following five issues:

1) the exec continuations in gpu_static_inst.hh were marked
   as protected when they should be public. here we mark
   them as public

2) the Abs instruction uses std::abs() in its execute method.
   because Abs is templated, it can also operate on U32 and U64,
   types, which cause Abs::execute() to pass uint32_t and uint64_t
   types to std::abs() respectively. this triggers a warning
   because std::abs() has no effect in this case. to rememdy this
   we add template specialization for the execute() method of Abs
   when its template paramter is U32 or U64.

3) Some potocols that utilize the code in cprintf.hh were missing
   includes to BoolVec.hh, which defines operator<< for the BoolVec
   type. This would cause issues when the generated code would try
   to pass a BoolVec type to a method in cprintf.hh that used
   operator<< on an instance of a BoolVec.

4) Surprise, clang doesn't like it when you clobber all the bits
   in a newly allocated object. I.e., this code:

   tlb = new GpuTlbEntry\[size\];
   std::memset(tlb, 0, sizeof(GpuTlbEntry) \* size);

   Let's use std::vector to track the TLB entries in the GpuTlb now...

5) There were a few variables used only in DPRINTFs, so we mark them
   with M5_VAR_USED.
2016-10-26 22:48:45 -04:00
Michael LeBeane dc16c1ceb8 dev: Add m5 op to toggle synchronization for dist-gem5.
This patch adds the ability for an application to request dist-gem5 to begin/
end synchronization using an m5 op. When toggling on sync, all nodes agree
on the next sync point based on the maximum of all nodes' ticks. CPUs are
suspended until the sync point to avoid sending network messages until sync has
been enabled. Toggling off sync acts like a global execution barrier, where
all CPUs are disabled until every node reaches the toggle off point. This
avoids tricky situations such as one node hitting a toggle off followed by a
toggle on before the other nodes hit the first toggle off.
2016-10-26 22:48:40 -04:00
Tony Gutierrez de72e36619 gpu-compute: support in-order data delivery in GM pipe
this patch adds an ordered response buffer to the GM pipeline
to ensure in-order data delivery. the buffer is implemented as
a stl ordered map, which sorts the request in program order by
using their sequence ID. when requests return to the GM pipeline
they are marked as done. only the oldest request may be serviced
from the ordered buffer, and only if is marked as done.

the FIFO response buffers are kept and used in OoO delivery mode
2016-10-26 22:48:28 -04:00
Tony Gutierrez b63eb1302b gpu-compute, hsail: pass GPUDynInstPtr to getRegisterIndex()
for HSAIL an operand's indices into the register files may be calculated
trivially, because the operands are always read from a register file, or are
an immediate.

for machine ISA, however, an op selector may specify special registers, or
may specify special SGPRs with an alias op selector value. the location of
some of the special registers values are dependent on the size of the RF
in some cases. here we add a way for the underlying getRegisterIndex()
method to know about the size of the RFs, so that it may find the relative
positions of the special register values.
2016-10-26 22:47:49 -04:00
Tony Gutierrez 844fb845a5 gpu-compute, hsail: make the PC a byte address, not an instruction index
currently the PC is incremented on an instruction granularity, and not as an
instruction's byte address. machine ISA instructions assume the PC is a byte
address, and is incremented accordingly. here we make the GPU model, and the
HSAIL instructions treat the PC as a byte address as well.
2016-10-26 22:47:43 -04:00
Tony Gutierrez d327cdba07 gpu-compute: add gpu_isa.hh to switch hdrs, add GPUISA to WF
the GPUISA class is meant to encapsulate any ISA-specific behavior - special
register accesses, isa-specific WF/kernel state, etc. - in a generic enough
way so that it may be used in ISA-agnostic code.

gpu-compute: use the GPUISA object to advance the PC

the GPU model treats the PC as a pointer to individual instruction objects -
which are store in a contiguous array - and not a byte address to be fetched
from the real memory system. this is ok for HSAIL because all instructions
are considered by the model to be the same size.

in machine ISA, however, instructions may be 32b or 64b, and branches are
calculated by advancing the PC by the number of words (4 byte chunks) it
needs to advance in the real instruction stream. because of this there is
a mismatch between the PC we use to index into the instruction array, and
the actual byte address PC the ISA expects. here we move the PC advance
calculation to the ISA so that differences in the instrucion sizes may be
accounted for in generic way.
2016-10-26 22:47:38 -04:00
Tony Gutierrez c7a79c9a42 gpu-compute, hsail: call discardFetch() from the WF
because every taken branch causes fetch to be discarded, we move the call
to the WF to avoid to have to call it from each and every branch instruction
type.
2016-10-26 22:47:27 -04:00
Tony Gutierrez 00a6346c91 hsail, gpu-compute: remove doGm/SmReturn add completeAcc
we are removing doGmReturn from the GM pipe, and adding completeAcc()
implementations for the HSAIL mem ops. the behavior in doGmReturn is
dependent on HSAIL and HSAIL mem ops, however the completion phase
of memory ops in machine ISA can be very different, even amongst individual
machine ISA mem ops. so we remove this functionality from the pipeline and
allow it to be implemented by the individual instructions.
2016-10-26 22:47:19 -04:00
Tony Gutierrez 7ac38849ab gpu-compute: remove inst enums and use bit flag for attributes
this patch removes the GPUStaticInst enums that were defined in GPU.py.
instead, a simple set of attribute flags that can be set in the base
instruction class are used. this will help unify the attributes of HSAIL
and machine ISA instructions within the model itself.

because the static instrution now carries the attributes, a GPUDynInst
must carry a pointer to a valid GPUStaticInst so a new static kernel launch
instruction is added, which carries the attributes needed to perform a
the kernel launch.
2016-10-26 22:47:11 -04:00