sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Jason Lowe-Power	7520331402	tests: Regression stats updated for recent patches	2016-11-30 17:12:59 -05:00
Alec Roelke	33683bd087	riscv: [Patch 8/5] Added some regression tests to RISC-V This patch is the eighth patch in a series adding RISC-V to gem5, and third of the bonus patches to the original series of five. It adds some regression tests to RISC-V. Regression tests included: - se/00.hello - se/02.insttest (split into several binaries which are not included due to large size) The tests added to 02.insttest will need to be built manually; to facilitate this, a Makefile is included. The required toolchain and compiler (riscv64-unknown-elf-gcc) can be built from the riscv-tools GitHub repository at https://github.com/riscv/riscv-tools. Note that because EBREAK only makes sense when gdb is running or while in FS mode, it is not included in the linux-rv64i insttest. ERET is not included because it does not make sense in SE mode and, in fact, causes a panic by design. Note also that not every system call is tested in linux-rv64i; of the ones defined in linux/process.hh, some have been given numbers but not definitions for the toolchain, or are merely stubs that always return 0. Of the ones that do work properly, only a subset are tested due to similar functionality. Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:12:56 -05:00
Alec Roelke	ee0c261e10	riscv: [Patch 7/5] Corrected LRSC semantics RISC-V makes use of load-reserved and store-conditional instructions to enable creation of lock-free concurrent data manipulation as well as ACQUIRE and RELEASE semantics for memory ordering of LR, SC, and AMO instructions (the latter of which do not follow LR/SC semantics). This patch is a correction to patch 4, which added these instructions to the implementation of RISC-V. It modifies locked_mem.hh and the implementations of lr.w, sc.w, lr.d, and sc.d to apply the proper gem5 flags and return the proper values. An important difference between gem5's LLSC semantics and RISC-V's LR/SC ones, beyond the name, is that gem5 uses 0 to indicate failure and 1 to indicate success, while RISC-V is the opposite. Strictly speaking, RISC-V uses 0 to indicate success and nonzero to indicate failure where the value would indicate the error, but currently only 1 is reserved as a failure code by the ISA reference. This is the seventh patch in the series which originally consisted of five patches that added the RISC-V ISA to gem5. The original five patches added all of the instructions and added support for more detailed CPU models and the sixth patch corrected the implementations of Linux constants and structs. There will be an eighth patch that adds some regression tests for the instructions. [Removed some commented-out code from locked_mem.hh.] Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
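The success/failure inversion described in the LR/SC message can be sketched as below. This is only an illustration under stated assumptions: the constant and function names are hypothetical and are not taken from locked_mem.hh or the RISC-V ISA files.

```python
# Hypothetical names for illustration; not gem5's actual identifiers.
GEM5_LLSC_FAILURE = 0  # gem5 convention: 0 means the store-conditional failed
GEM5_LLSC_SUCCESS = 1  # gem5 convention: 1 means it succeeded

RISCV_SC_SUCCESS = 0   # RISC-V convention: 0 means success
RISCV_SC_FAILURE = 1   # the only failure code the message says is reserved


def riscv_sc_result(gem5_result: int) -> int:
    """Translate a gem5-style LLSC result into the value RISC-V writes to rd."""
    if gem5_result not in (GEM5_LLSC_FAILURE, GEM5_LLSC_SUCCESS):
        raise ValueError("unexpected gem5 LLSC result: %r" % gem5_result)
    if gem5_result == GEM5_LLSC_SUCCESS:
        return RISCV_SC_SUCCESS
    return RISCV_SC_FAILURE
```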
Alec Roelke	84020a8aed	riscv: [Patch 6/5] Improve Linux emulation for RISC-V This is an add-on patch for the original series that implemented RISC-V that improves the implementation of Linux emulation for SE mode. Basically it cleans up linux/linux.hh by removing constants that haven't been defined for the RISC-V Linux proxy kernel and rearranging the stat struct so it aligns with RISC-V's implementation of it. It also adds placeholders for system calls that have been given numbers in RISC-V but haven't been given implementations yet. These system calls are as follows: - readlinkat - sigprocmask - ioctl - clock_gettime - getrusage - getrlimit - setrlimit The first five patches implemented RISC-V with the base ISA and multiply, floating point, and atomic extensions and added support for detailed CPU models with memory timing. [Fixed incompatibility with changes made from patch 1.] Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
Alec Roelke	126c0360e2	riscv: [Patch 5/5] Added missing support for timing CPU models Last of five patches adding RISC-V to GEM5. This patch adds support for timing, minor, and detailed CPU models that was missing in the last four, which basically consists of handling timing-mode memory accesses and telling the minor and detailed models what a no-op instruction should be (addi zero, zero, 0). Patches 1-4 introduced RISC-V and implemented the base instruction set, RV64I, and added the multiply, floating point, and atomic memory extensions, RV64MAFD. [Fixed compatibility with edit from patch 1.] [Fixed compatibility with hg copy edit from patch 1.] [Fixed some style errors in locked_mem.hh.] Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
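For reference, the no-op named in the message above (addi zero, zero, 0) can be encoded with the standard RISC-V I-type layout. The helper name below is hypothetical; the field positions and the ADDI opcode (0x13, funct3 0) follow the base ISA encoding.

```python
def encode_i_type(imm: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack the fields of a RISC-V I-type instruction into a 32-bit word."""
    return (((imm & 0xFFF) << 20)
            | ((rs1 & 0x1F) << 15)
            | ((funct3 & 0x7) << 12)
            | ((rd & 0x1F) << 7)
            | (opcode & 0x7F))


# addi zero, zero, 0: immediate 0, rs1 = x0, funct3 = 0, rd = x0, opcode 0x13
NOP = encode_i_type(imm=0, rs1=0, funct3=0, rd=0, opcode=0x13)
```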
Alec Roelke	535e6c5fa4	riscv: [Patch 4/5] Added RISC-V atomic memory extension RV64A Fourth of five patches adding RISC-V to GEM5. This patch adds the RV64A extension, which includes atomic memory instructions. These instructions atomically read a value from memory, modify it with a value contained in a source register, and store the original memory value in the destination register and modified value back into memory. Because this requires two memory accesses and GEM5 does not support two timing memory accesses in a single instruction, each of these instructions is split into two micro-ops: A "load" micro-op, which reads the memory, and a "store" micro-op, which modifies and writes it back. Each atomic memory instruction also has two bits that acquire and release a lock on its memory location. Additionally, there are atomic load and store instructions that only either load or store, but not both, and can acquire or release memory locks. Note that because the current implementation of RISC-V only supports one core and one thread, it doesn't make sense to make use of AMO instructions. However, they do form a standard extension of the RISC-V ISA, so they are included mostly as a placeholder for when multithreaded execution is implemented. As a result, any tests for their correctness in a future patch may be abbreviated. Patch 1 introduced RISC-V and implemented the base instruction set, RV64I; patch 2 implemented the integer multiply extension, RV64M; and patch 3 implemented the single- and double-precision floating point extensions, RV64FD. Patch 5 will add support for timing, minor, and detailed CPU models that isn't present in patches 1-4. [Added missing file amo.isa] [Replaced information removed from initial patch that was missed during division into multiple patches.] [Fixed some minor formatting issues.] [Fixed oversight where LR and SC didn't have both AQ and RL flags.] Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
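The two-micro-op split described in the RV64A message can be pictured roughly as follows. This is a toy model, not the code in amo.isa: memory is a plain dict, the function names are hypothetical, and the acquire/release bits are ignored.

```python
MASK64 = (1 << 64) - 1


def amo_load_uop(mem: dict, addr: int) -> int:
    """'Load' micro-op: read the original memory value."""
    return mem[addr]


def amo_store_uop(mem: dict, addr: int, loaded: int, src: int, op) -> None:
    """'Store' micro-op: combine the loaded value with the source register
    value and write the result back to memory."""
    mem[addr] = op(loaded, src) & MASK64


def amoadd(mem: dict, addr: int, src: int) -> int:
    """Model of an atomic add: returns the value destined for rd
    (the original memory contents)."""
    loaded = amo_load_uop(mem, addr)
    amo_store_uop(mem, addr, loaded, src, lambda a, b: a + b)
    return loaded
```

With `mem = {0x100: 5}`, calling `amoadd(mem, 0x100, 3)` returns the original value and leaves the sum in memory.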
Alec Roelke	1229b3b623	riscv: [Patch 3/5] Added RISCV floating point extensions RV64FD Third of five patches adding RISC-V to GEM5. This patch adds the RV64FD extensions, which include single- and double-precision floating point instructions. Patch 1 introduced RISC-V and implemented the base instruction set, RV64I and patch 2 implemented the integer multiply extension, RV64M. Patch 4 will implement the atomic memory instructions, RV64A, and patch 5 will add support for timing, minor, and detailed CPU models that is missing from the first four patches. [Fixed exception handling in floating-point instructions to conform better to IEEE-754 2008 standard and behavior of the Chisel-generated RISC-V simulator.] [Fixed style errors in decoder.isa.] [Fixed some fuzz caused by modifying a previous patch.] Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
Alec Roelke	070da98493	riscv: [Patch 2/5] Added RISC-V multiply extension RV64M Second of five patches adding RISC-V to GEM5. This patch adds the RV64M extension, which includes integer multiply and divide instructions. Patch 1 introduced RISC-V and implemented the base instruction set, RV64I. Patch 3 will implement the floating point extensions, RV64FD; patch 4 will implement the atomic memory instructions, RV64A; and patch 5 will add support for timing, minor, and detailed CPU models that is missing from the first four patches. [Added mulw instruction that was missed when dividing changes among patches.] Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
Alec Roelke	e76bfc8764	arch: [Patch 1/5] Added RISC-V base instruction set RV64I First of five patches adding RISC-V to GEM5. This patch introduces the base 64-bit ISA (RV64I) in src/arch/riscv for use with syscall emulation. The multiply, floating point, and atomic memory instructions will be added in additional patches, as well as support for more detailed CPU models. The loader is also modified to be able to parse RISC-V ELF files, and a "Hello world!" example for RISC-V is added to test-progs. Patch 2 will implement the multiply extension, RV64M; patch 3 will implement the floating point (single- and double-precision) extensions, RV64FD; patch 4 will implement the atomic memory instructions, RV64A, and patch 5 will add support for timing, minor, and detailed CPU models that is missing from the first four patches (such as handling locked memory). [Removed several unused parameters and imports from RiscvInterrupts.py, RiscvISA.py, and RiscvSystem.py.] [Fixed copyright information in RISC-V files copied from elsewhere that had ARM licenses attached.] [Reorganized instruction definitions in decoder.isa so that they are sorted by opcode in preparation for the addition of ISA extensions M, A, F, D.] [Fixed formatting of several files, removed some variables and instructions that were missed when moving them to other patches, fixed RISC-V Foundation copyright attribution, and fixed history of files copied from other architectures using hg copy.] [Fixed indentation of switch cases in isa.cc.] [Reorganized syscall descriptions in linux/process.cc to remove large number of repeated unimplemented system calls and added implementations to functions that have received them since process.cc was first created.] [Fixed spacing for some copyright attributions.] [Replaced the rest of the file copies using hg copy.] [Fixed style check errors and corrected unaligned memory accesses.] [Fix some minor formatting mistakes.] Signed-off by: Alec Roelke Signed-off by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
Sophiane Senni	ce2722cdd9	mem: Split the hit_latency into tag_latency and data_latency If the cache access mode is parallel, i.e. "sequential_access" parameter is set to "False", tags and data are accessed in parallel. Therefore, the hit_latency is the maximum latency between tag_latency and data_latency. On the other hand, if the cache access mode is sequential, i.e. "sequential_access" parameter is set to "True", tags and data are accessed sequentially. Therefore, the hit_latency is the sum of tag_latency plus data_latency. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:27 -05:00
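The latency rule stated in the message above reduces to a small function. The sketch below reuses the parameter names from the message but is not the gem5 cache code itself; the function name is hypothetical.

```python
def hit_latency(tag_latency: int, data_latency: int,
                sequential_access: bool) -> int:
    """Effective hit latency for the two access modes in the message."""
    if sequential_access:
        # Tags are looked up first, then data: the latencies add up.
        return tag_latency + data_latency
    # Tags and data are accessed in parallel: the slower one dominates.
    return max(tag_latency, data_latency)
```

For example, with a tag latency of 2 cycles and a data latency of 3 cycles, the parallel mode gives 3 cycles and the sequential mode gives 5.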
Jason Lowe-Power	047caf24ba	cpu: Remove branch predictor function predictInOrder This function was used by the now-defunct InOrderCPU model. Since this model is no longer in gem5, this function was not called from anywhere in the code.	2016-11-30 17:10:27 -05:00
Andreas Hansson	2a56260d5d	tests: Check for TrafficGen as part of memcheck regression Since protobuf is still considered optional we do not always have the TrafficGen. Check before running the memcheck regression.	2016-11-30 11:15:21 -05:00
Michael LeBeane	cd4b26b6ae	dev: Fix buffer length when unserializing an eth pkt Changeset 11701 only serialized the useful portion of an ethernet packet's payload. However, the device models expect each ethernet packet to contain a 16KB buffer, even if there is no data in it. This patch adds a 'bufLength' field to EthPacketData so the original size of the packet buffer can always be unserialized. Reported-by: Gabor Dozsa <Gabor.Dozsa@arm.com>	2016-11-29 13:04:45 -05:00
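A rough model of the idea in this message: serialize only the useful bytes, but record the buffer size so the full buffer can be rebuilt on restore. The class below is a hypothetical stand-in for EthPacketData, and the checkpoint is modeled as a plain dict rather than gem5's serialization interface.

```python
from dataclasses import dataclass


@dataclass
class EthPacketSketch:
    """Hypothetical stand-in for EthPacketData; not the gem5 class."""
    data: bytearray   # full packet buffer (16KB in the device models)
    length: int       # number of useful bytes in the buffer
    sim_length: int   # effective length used for timing

    def serialize(self) -> dict:
        # Store only the useful bytes plus the original buffer size.
        return {
            "length": self.length,
            "simLength": self.sim_length,
            "bufLength": len(self.data),
            "data": bytes(self.data[:self.length]),
        }

    @classmethod
    def unserialize(cls, cp: dict) -> "EthPacketSketch":
        # Rebuild a buffer of the original size, then copy the payload in.
        buf = bytearray(cp["bufLength"])
        buf[:cp["length"]] = cp["data"]
        return cls(buf, cp["length"], cp["simLength"])
```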
Joe Gross	4b7bc5b1e1	scons: fix sanitizer flags with multiple sanitizers There has been some problem when using address and undefined-behavior sanitizers at the same time. This patch will look for the special case where both are enabled at once and change the flags passed to the compiler to reflect this.	2016-11-28 12:44:54 -05:00
Andreas Sandberg	faaf2d396f	style: Add options to select checkers and apply fixes Add an option, --checker/-c, to style.py that selects individual style checkers to apply. When this option isn't specified, the script defaults to all available style checkers. The option may be specified multiple times to run multiple style checkers. The option, --fix/-f, can be specified to automatically fix style violations. Change-Id: Id7597fba6b65cecfa17a88b1c87c8a4c8315af59 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>	2016-11-25 10:33:15 +00:00
Rekai Gonzalez Alberquilla	ac29b6c6fc	util: git pre-commit hook to check staged files This patch updates the git-pre-commit hook to check the files as they will be after the commit, instead of as they are currently. This prevents the following undesired situation: - unstylish modification of a file - stage said file for commit - try to commit and fail due to style - fix style, forgetting to stage the changes - try to commit and fail, because although the staged changes are not style-compliant, the current content of the file is. Change-Id: I5cc3f783375d9e4162e310e176103ebbf0a59023 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [andreas.sandberg@arm.com: Rebased on top of latest gem5]	2016-11-25 10:31:21 +00:00
Jieming Yin	b0856ab3b1	ruby: Fix potential bugs in garnet2.0 1. Delete unused variable from struct LinkEntry 2. Correct GarnetExtLink and GarnetIntLink inheritance	2016-11-21 15:41:30 -05:00
Tony Gutierrez	14deacf86e	gpu-compute: fix segfault when constructing GPUExecContext the GPUExecContext context currently stores a reference to its parent WF's GPUISA object, however there are some special instructions that do not have an associated WF. when these objects are constructed they set their WF pointer to null, which causes the GPUExecContext to segfault when trying to dereference the WF pointer to get at the WF's GPUISA object. here we change the GPUISA reference in the GPUExecContext class to a pointer so that it may be set to null.	2016-11-21 15:40:03 -05:00
Tony Gutierrez	a0d4019abd	gpu-compute: init valid field of GpuTlbEntry in default ctor valid field for GpuTlbEntry is not set in the default ctor, which can lead to strange behavior, and is also flagged by UBSAN.	2016-11-21 15:38:30 -05:00
Tony Gutierrez	f82418acef	ruby: add default ctor for MachineID type not all uses of MachineID initialize its fields, so here we add a default ctor.	2016-11-21 15:37:07 -05:00
Tony Gutierrez	0799600686	x86: fix issue with casting in Cvtf2i UBSAN flags this operation because it detects that arg is being cast directly to an unsigned type, argBits. this patch fixes this by first casting the value to a signed int type, then reinterpreting the raw bits of the signed int into argBits.	2016-11-21 15:35:56 -05:00
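The two-step conversion in this message (float to signed integer, then a bit-level reinterpretation as unsigned) can be mimicked in Python. Python has no undefined behavior here, so this only illustrates the order of operations; the function name is hypothetical and a 64-bit width is an assumption.

```python
import struct


def float_to_unsigned_bits(arg: float) -> int:
    """Convert via a signed 64-bit integer, then reinterpret its raw bits."""
    signed_value = int(arg)  # step 1: float -> signed int (truncates toward zero)
    # step 2: reinterpret the same 8 bytes as an unsigned 64-bit integer
    return struct.unpack("<Q", struct.pack("<q", signed_value))[0]
```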
Sooraj Puthoor	29d38e7576	ruby: init MessageSizeType of SequencerMsg to Request_Control SequencerMsg is autogenerated by slicc scripts and the MessageSizeType is initialized to the max enum value by default. The DMASequencer pushes this message to the mandatory queue and since the MessageSizeType is uninitialized, string_to_MessageSizeType() function used by traces to print the message fails with a panic. This patch avoids this problem by initializing MessageSizeType of SequencerMsg to Request_Control.	2016-11-19 12:39:04 -05:00
Tony Gutierrez	ae55cba281	x86: fix loading/storing of Float80 types	2016-11-19 12:35:14 -05:00
Andreas Sandberg	af934452af	ext: Update fputils to rev 13589cd This patch updates fputils to the latest revision (13589cd) from the upstream repository (github.com/andysan/fputils).	2016-11-18 18:08:20 +00:00
Andreas Hansson	b8a162087d	stats, alpha: Update ALPHA stats Reflect the removal of the syscall tracking.	2016-11-17 04:54:18 -05:00
Andreas Hansson	4cf7f6c4ca	tests, ruby: Move rubytests from ALPHA (linux) to NULL (none) This patch avoids compiling ALPHA six times as part of running 'util/regress', and instead relies on NULL with different protocols to run the rubytest. All we need is the memory system, so there is really no need to compile the ISA over and over again. The one downside is the removal of running 'hello' for the various ALPHA and protocol combinations, but if this is a concern we should rather beef up the synthetic tests for the various protocols. --HG-- rename : build_opts/NULL => build_opts/NULL_MESI_Two_Level rename : build_opts/NULL => build_opts/NULL_MOESI_CMP_directory rename : build_opts/NULL => build_opts/NULL_MOESI_CMP_token rename : build_opts/NULL => build_opts/NULL_MOESI_hammer rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/config.ini => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simerr => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MESI_Two_Level/simerr rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simout => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MESI_Two_Level/simout rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/stats.txt => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MESI_Two_Level/stats.txt rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_directory/config.ini => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_directory/config.ini rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_directory/simerr => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_directory/simerr rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_directory/simout =>
tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_directory/simout rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_directory/stats.txt => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_directory/stats.txt rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_token/config.ini => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_token/config.ini rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_token/simerr => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_token/simerr rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_token/simout => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_token/simout rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_CMP_token/stats.txt => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_CMP_token/stats.txt rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_hammer/config.ini => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_hammer/config.ini rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_hammer/simerr => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_hammer/simerr rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_hammer/simout => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_hammer/simout rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MOESI_hammer/stats.txt => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby-MOESI_hammer/stats.txt rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby/config.ini => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby/config.ini rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby/simerr => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby/simerr rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby/simout => 
tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby/simout rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby/stats.txt => tests/quick/se/60.rubytest/ref/null/none/rubytest-ruby/stats.txt	2016-11-17 04:54:16 -05:00
Andreas Hansson	6ed567d600	alpha: Remove ALPHA tru64 support and associated tests No one appears to be using it, and it is causing build issues and increases the development and maintenance effort.	2016-11-17 04:54:14 -05:00
Tony Gutierrez	74249f80df	hsail,gpu-compute: fixes to appease clang++ fixes to appease clang++. tested on: Ubuntu clang version 3.5.0-4ubuntu2~trusty2 (tags/RELEASE_350/final) (based on LLVM 3.5.0) Ubuntu clang version 3.6.0-2ubuntu1~trusty1 (tags/RELEASE_360/final) (based on LLVM 3.6.0) the fixes address the following five issues: 1) the exec continuations in gpu_static_inst.hh were marked as protected when they should be public. here we mark them as public 2) the Abs instruction uses std::abs() in its execute method. because Abs is templated, it can also operate on U32 and U64 types, which cause Abs::execute() to pass uint32_t and uint64_t types to std::abs() respectively. this triggers a warning because std::abs() has no effect in this case. to remedy this we add template specialization for the execute() method of Abs when its template parameter is U32 or U64. 3) Some protocols that utilize the code in cprintf.hh were missing includes to BoolVec.hh, which defines operator<< for the BoolVec type. This would cause issues when the generated code would try to pass a BoolVec type to a method in cprintf.hh that used operator<< on an instance of a BoolVec. 4) Surprise, clang doesn't like it when you clobber all the bits in a newly allocated object. I.e., this code: tlb = new GpuTlbEntry[size]; std::memset(tlb, 0, sizeof(GpuTlbEntry) * size); Let's use std::vector to track the TLB entries in the GpuTlb now... 5) There were a few variables used only in DPRINTFs, so we mark them with M5_VAR_USED.	2016-10-26 22:48:45 -04:00
Michael LeBeane	dc16c1ceb8	dev: Add m5 op to toggle synchronization for dist-gem5. This patch adds the ability for an application to request dist-gem5 to begin/ end synchronization using an m5 op. When toggling on sync, all nodes agree on the next sync point based on the maximum of all nodes' ticks. CPUs are suspended until the sync point to avoid sending network messages until sync has been enabled. Toggling off sync acts like a global execution barrier, where all CPUs are disabled until every node reaches the toggle off point. This avoids tricky situations such as one node hitting a toggle off followed by a toggle on before the other nodes hit the first toggle off.	2016-10-26 22:48:40 -04:00
Michael LeBeane	48e43c9ad1	ruby: Allow multiple outstanding DMA requests DMA sequencers and protocols can currently only issue one DMA access at a time. This patch implements the necessary functionality to support multiple outstanding DMA requests in Ruby.	2016-10-26 22:48:37 -04:00
mlebeane	96905971f2	dev: Add 'simLength' parameter in EthPacketData Currently, all the network devices create a 16K buffer for the 'data' field in EthPacketData, and use 'length' to keep track of the size of the packet in the buffer. This patch introduces the 'simLength' parameter to EthPacketData, which is used to hold the effective length of the packet used for all timing calculations in the simulator. Serialization is performed using only the useful data in the packet ('length') and not necessarily the entire original buffer.	2016-10-26 22:48:33 -04:00
Tony Gutierrez	de72e36619	gpu-compute: support in-order data delivery in GM pipe this patch adds an ordered response buffer to the GM pipeline to ensure in-order data delivery. the buffer is implemented as a stl ordered map, which sorts the request in program order by using their sequence ID. when requests return to the GM pipeline they are marked as done. only the oldest request may be serviced from the ordered buffer, and only if it is marked as done. the FIFO response buffers are kept and used in OoO delivery mode	2016-10-26 22:48:28 -04:00
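The ordered response buffer described in this message can be sketched as follows. The message says the real buffer is an STL ordered map keyed by sequence ID; this Python model uses a dict and picks the smallest key instead, and the class and method names are hypothetical.

```python
class OrderedResponseBuffer:
    """Toy model of in-order delivery keyed by sequence ID."""

    def __init__(self):
        self._entries = {}  # seq_id -> [request, done flag]

    def issue(self, seq_id, request):
        """Record a request when it is sent to memory."""
        self._entries[seq_id] = [request, False]

    def mark_done(self, seq_id):
        """Mark a request as returned from memory."""
        self._entries[seq_id][1] = True

    def pop_ready(self):
        """Return the oldest request if it is done, otherwise None."""
        if not self._entries:
            return None
        oldest = min(self._entries)  # lowest sequence ID = oldest in program order
        request, done = self._entries[oldest]
        if not done:
            return None  # younger requests must wait for the oldest one
        del self._entries[oldest]
        return request
```

A younger request that returns early stays in the buffer until every older request has been serviced.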
Tony Gutierrez	b63eb1302b	gpu-compute, hsail: pass GPUDynInstPtr to getRegisterIndex() for HSAIL an operand's indices into the register files may be calculated trivially, because the operands are always read from a register file, or are an immediate. for machine ISA, however, an op selector may specify special registers, or may specify special SGPRs with an alias op selector value. the location of some of the special registers values are dependent on the size of the RF in some cases. here we add a way for the underlying getRegisterIndex() method to know about the size of the RFs, so that it may find the relative positions of the special register values.	2016-10-26 22:47:49 -04:00
Tony Gutierrez	aa7364276f	gpu-compute: use System cache line size in the GPU	2016-10-26 22:47:47 -04:00
Tony Gutierrez	844fb845a5	gpu-compute, hsail: make the PC a byte address, not an instruction index currently the PC is incremented on an instruction granularity, and not as an instruction's byte address. machine ISA instructions assume the PC is a byte address, and is incremented accordingly. here we make the GPU model, and the HSAIL instructions treat the PC as a byte address as well.	2016-10-26 22:47:43 -04:00
Tony Gutierrez	d327cdba07	gpu-compute: add gpu_isa.hh to switch hdrs, add GPUISA to WF the GPUISA class is meant to encapsulate any ISA-specific behavior - special register accesses, isa-specific WF/kernel state, etc. - in a generic enough way so that it may be used in ISA-agnostic code. gpu-compute: use the GPUISA object to advance the PC the GPU model treats the PC as a pointer to individual instruction objects - which are stored in a contiguous array - and not a byte address to be fetched from the real memory system. this is ok for HSAIL because all instructions are considered by the model to be the same size. in machine ISA, however, instructions may be 32b or 64b, and branches are calculated by advancing the PC by the number of words (4 byte chunks) it needs to advance in the real instruction stream. because of this there is a mismatch between the PC we use to index into the instruction array, and the actual byte address PC the ISA expects. here we move the PC advance calculation to the ISA so that differences in the instruction sizes may be accounted for in a generic way.	2016-10-26 22:47:38 -04:00
Tony Gutierrez	98d8a7051d	gpu-compute: add instruction mix stats for the gpu	2016-10-26 22:47:30 -04:00
Tony Gutierrez	c7a79c9a42	gpu-compute, hsail: call discardFetch() from the WF because every taken branch causes fetch to be discarded, we move the call to the WF to avoid having to call it from each and every branch instruction type.	2016-10-26 22:47:27 -04:00
Tony Gutierrez	00a6346c91	hsail, gpu-compute: remove doGm/SmReturn add completeAcc we are removing doGmReturn from the GM pipe, and adding completeAcc() implementations for the HSAIL mem ops. the behavior in doGmReturn is dependent on HSAIL and HSAIL mem ops, however the completion phase of memory ops in machine ISA can be very different, even amongst individual machine ISA mem ops. so we remove this functionality from the pipeline and allow it to be implemented by the individual instructions.	2016-10-26 22:47:19 -04:00
Tony Gutierrez	7ac38849ab	gpu-compute: remove inst enums and use bit flag for attributes this patch removes the GPUStaticInst enums that were defined in GPU.py. instead, a simple set of attribute flags that can be set in the base instruction class are used. this will help unify the attributes of HSAIL and machine ISA instructions within the model itself. because the static instruction now carries the attributes, a GPUDynInst must carry a pointer to a valid GPUStaticInst so a new static kernel launch instruction is added, which carries the attributes needed to perform the kernel launch.	2016-10-26 22:47:11 -04:00
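The attribute-flag approach in this message can be illustrated with Python's `enum.Flag`. The flag names and the class below are invented for illustration and do not correspond to the actual GPUStaticInst flags.

```python
from enum import Flag, auto


class InstFlag(Flag):
    """Hypothetical attribute flags; not the real gem5 flag set."""
    MEMORY_REF = auto()
    LOAD = auto()
    BRANCH = auto()
    KERNEL_LAUNCH = auto()


class StaticInstSketch:
    """Base instruction carrying its attributes as a set of bit flags."""

    def __init__(self, flags: InstFlag = InstFlag(0)):
        self.flags = flags

    def is_set(self, flag: InstFlag) -> bool:
        return bool(self.flags & flag)


# A load is both a memory reference and a load; no per-kind enum is needed.
load_inst = StaticInstSketch(InstFlag.MEMORY_REF | InstFlag.LOAD)
```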
Tony Gutierrez	e1ad8035a3	gpu-compute: move disassemble() implementation to GPUStaticInst	2016-10-26 22:47:05 -04:00
Tony Gutierrez	0a6cdff176	gpu-compute, arch: add some methods to the base inst classes for ISA support	2016-10-26 22:47:01 -04:00
Tony Gutierrez	c7d4afd878	ruby: make a RequestDesc class instead of std::pair the RequestDesc was previously implemented as a std::pair, which made the implementation overly complex and error prone. here we encapsulate the packet, primary, and secondary types all in a single data structure with all members properly initialized in a ctor	2016-10-26 22:46:58 -04:00
Andreas Hansson	90b087171b	config: Break out base options for usage with NULL ISA This patch breaks out the most basic configuration options into a set of base options, to allow them to be used also by scripts that do not involve any ISA, and thus no actual CPUs or devices. The patch also fixes a few modules so that they can be imported in a NULL build, and avoid dragging in FSConfig every time Options is imported.	2016-10-26 14:50:54 -04:00
Andreas Hansson	607c277291	stats: Update stats to reflect recent changes to floats Mostly just splitting out the floats ops and corresponding reads/writes.	2016-10-19 06:20:04 -04:00
Shawn Rosti	71c982ff70	arm: Fix for ARM's Streamline conversion script tracked down issue with ARM's version of gem5 using the "cluster" name. The public/github version of ARM Gem5 does not use the "cluster" naming mechanism. Signed-off-by: Dam Sunwoo <dam.sunwoo@arm.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-10-15 15:11:07 -05:00
Bjoern A. Zeeb	28c84d2886	arm, dev: pl011 console interactivity Improve PL011 console interactivity Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-10-15 15:11:04 -05:00
Nicolas Derumigny	976ef444b8	syscall: read() should not write anything if reading EOF. Read() should not write anything when returning 0 (EOF). This patch does not correct the same bug occurring for: nbr_read=read(file, buf, nbytes) When nbr_read<nbytes, nbytes bytes are copied into the virtual RAM instead of nbr_read. If buf is smaller than nbytes, a page fault occurs, even if buf is in fact bigger than nbr_read. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-10-15 15:06:24 -05:00
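The intended read() behavior can be sketched as below. Note the assumption: this sketch also copies only nbr_read bytes on a short read, which is the case the message says the patch does not yet fix. Guest memory is modeled as a bytearray and the function name is hypothetical.

```python
def emulated_read(source: bytes, guest_mem: bytearray,
                  buf_addr: int, nbytes: int) -> int:
    """Copy at most nbytes from source into guest memory at buf_addr.

    Returns the number of bytes actually read; 0 means EOF, in which
    case guest memory is left untouched.
    """
    data = source[:nbytes]
    nbr_read = len(data)
    if nbr_read > 0:
        # Write back only the bytes that were really read, not nbytes.
        guest_mem[buf_addr:buf_addr + nbr_read] = data
    return nbr_read
```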
Fernando Endo	6c72c35519	cpu, arm: Distinguish Float* and SimdFloat, create FloatMem opClass Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which distinguishes writes to the INT and FP register banks. Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72, where the "latency" of FMADD is 3 if the next instruction is a FMADD and has only the augend to destination dependency, otherwise it's 7 cycles. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-10-15 14:58:45 -05:00
Andreas Hansson	2f5262eb67	config: Make configs/common a Python package Continue along the same line as the recent patch that made the Ruby-related config scripts Python packages and make also the configs/common directory a package. All affected config scripts are updated (hopefully). Note that this change makes it apparent that the current organisation and naming of the config directory and its subdirectories is rather chaotic. We mix scripts that are directly invoked with scripts that merely contain convenience functions. While it is not addressed in this patch we should follow up with a re-organisation of the config structure, and renaming of some of the packages.	2016-10-14 10:37:38 -04:00

11932 commits