sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Brandon Potter	2367198921	syscall_emul: [PATCH 15/22] add clone/execve for threading and multiprocess simulations Modifies the clone system call and adds execve system call. Requires allowing processes to steal thread contexts from other processes in the same system object and the ability to detach pieces of process state (such as MemState) to allow dynamic sharing.	2017-02-27 14:10:15 -05:00
Brandon Potter	a5802c823f	syscall_emul: [patch 13/22] add system call retry capability This changeset adds functionality that allows system calls to retry without affecting thread context state such as the program counter or register values for the associated thread context (when system calls return with a retry fault). This functionality is needed to solve problems with blocking system calls in multi-process or multi-threaded simulations where information is passed between processes/threads. Blocking system calls can cause deadlock because the simulator itself is single threaded. There is only a single thread servicing the event queue which can cause deadlock if the thread hits a blocking system call instruction. To illustrate the problem, consider two processes using the producer/consumer sharing model. The processes can use file descriptors and the read and write calls to pass information to one another. If the consumer calls the blocking read system call before the producer has produced anything, the call will block the event queue (while executing the system call instruction) and deadlock the simulation. The solution implemented in this changeset is to recognize that the system calls will block and then generate a special retry fault. The fault will be sent back up through the function call chain until it is exposed to the cpu model's pipeline where the fault becomes visible. The fault will trigger the cpu model to replay the instruction at a future tick where the call has a chance to succeed without actually going into a blocking state. In subsequent patches, we recognize that a syscall will block by calling a non-blocking poll (from inside the system call implementation) and checking for events. When events show up during the poll, it signifies that the call would not have blocked and the syscall is allowed to proceed (calling an underlying host system call if necessary). If no events are returned from the poll, we generate the fault and try the instruction for the thread context at a distant tick. Note that retrying every tick is not efficient. As an aside, the simulator has some multi-threading support for the event queue, but it is not used by default and needs work. Even if the event queue was completely multi-threaded, meaning that there is a hardware thread on the host servicing a single simulator thread context with a 1:1 mapping between them, it's still possible to run into deadlock due to the event queue barriers on quantum boundaries. The solution of replaying at a later tick is the simplest solution and solves the problem generally.	2015-07-20 09:15:21 -05:00
Curtis Dunham	41beacce08	sim, kvm: make KvmVM a System parameter A KVM VM is typically a child of the System object already, but for solving future issues with configuration graph resolution, the most logical way to keep track of this object is for it to be an actual parameter of the System object. Change-Id: I965ded22203ff8667db9ca02de0042ff1c772220 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2017-02-14 15:09:18 -06:00
Brandon Potter	a928a438b8	style: [patch 3/22] reduce include dependencies in some headers Used cppclean to help identify useless includes and removed them. This involved erroneously included headers, but also cases where forward declarations could have been used rather than a full include.	2016-11-09 14:27:40 -06:00
Brandon Potter	7a8dda49a4	style: [patch 1/22] use /r/3648/ to reorganize includes	2016-11-09 14:27:37 -06:00
Andreas Sandberg	abe7ef95cb	sim: Remove redundant export_method_cxx_predecls The headers declared in export_method_cxx_predecls are redundant since a SimObject's main header is automatically included. Change-Id: Ied9e84630b36960e54efe91d16f8c66fba7e0da0 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Joe Gross <joseph.gross@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>	2017-01-03 12:03:06 +00:00
Arthur Perais	c9d933efb0	cpu: implement an L-TAGE branch predictor This patch implements an L-TAGE predictor, based on André Seznec's code available from CBP-2 (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-12-21 15:25:13 -06:00
Arthur Perais	497cc2d373	cpu: disallow speculative update of branch predictor tables (o3) The Minor and o3 cpu models share the branch prediction code. Minor relies on the BPredUnit::squash() function to update the branch predictor tables on a branch misprediction. This is fine because Minor executes in-order, so the update is on the correct path. However, this causes the branch predictor to be updated on out-of-order branch mispredictions when using the o3 model, which should not be the case. This patch guards against speculative update of the branch prediction tables. On a branch misprediction, BPredUnit::squash() calls BPredUnit::update(..., squashed = true). The underlying branch predictor tests against the value of squashed. If it is true, it restores any speculatively updated internal state it might have (e.g., global/local branch history), then returns. If false, it updates its prediction tables. Previously, existing predictors did not test against the "squashed" parameter. To accommodate this change, the Minor model must now call BPredUnit::squash() then BPredUnit::update(..., squashed = false) on branch mispredictions. Before, calling BPredUnit::squash() performed the prediction tables update. The effect is a slight MPKI improvement when using the o3 model. A further patch should perform the same modifications for the indirect target predictor and BTB (less critical). Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-12-21 15:07:16 -06:00
Arthur Perais	34065f8d5f	cpu: correct comments in tournament branch predictor The tournament predictor is presented as doing speculative update of the global history and non-speculative update of the local history used to generate the branch prediction. However, the code does speculative update of both histories. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-12-21 15:06:13 -06:00
Arthur Perais	1664625db8	cpu: Resolve targets of predicted 'taken' decode for O3 The target of taken conditional direct branches does not need to be resolved in IEW: the target can be computed at decode, usually using the decoded instruction word and the PC. The higher-than-necessary penalty is taken only on conditional branches that are predicted taken but miss in the BTB. Thus, this is mostly inconsequential on IPC if the BTB is big/associative enough (fewer capacity/conflict misses). Nonetheless, what gem5 simulates is not representative of how conditional branch targets can be handled. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-12-21 15:05:24 -06:00
Arthur Perais	e5fb6752d6	cpu: Clarify meaning of cachePorts variable in lsq_unit.hh of O3 cachePorts currently constrains the number of store packets written to the D-Cache each cycle, but loads currently affect this variable. This leads to unexpected congestion (e.g., setting cachePorts to a realistic 1 will in fact allow a store to WB only if no loads have accessed the D-Cache this cycle). In the absence of arbitration, this patch decouples how many loads can be done per cycle from how many stores can be done per cycle. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-12-21 15:04:06 -06:00
Nikos Nikoleris	61860f2419	cpu: Change traffic generators to use different values for writes Previously all traffic generators would use the same value for write requests. With this change traffic generators use their master id as the payload of write requests making them more useful for the memchecker. Change-Id: Id1a6b8f02853789b108ef6003f4c32ab929bb123 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>	2016-12-05 16:48:20 -05:00
Alec Roelke	e76bfc8764	arch: [Patch 1/5] Added RISC-V base instruction set RV64I First of five patches adding RISC-V to GEM5. This patch introduces the base 64-bit ISA (RV64I) in src/arch/riscv for use with syscall emulation. The multiply, floating point, and atomic memory instructions will be added in additional patches, as well as support for more detailed CPU models. The loader is also modified to be able to parse RISC-V ELF files, and a "Hello world!" example for RISC-V is added to test-progs. Patch 2 will implement the multiply extension, RV64M; patch 3 will implement the floating point (single- and double-precision) extensions, RV64FD; patch 4 will implement the atomic memory instructions, RV64A, and patch 5 will add support for timing, minor, and detailed CPU models that is missing from the first four patches (such as handling locked memory). [Removed several unused parameters and imports from RiscvInterrupts.py, RiscvISA.py, and RiscvSystem.py.] [Fixed copyright information in RISC-V files copied from elsewhere that had ARM licenses attached.] [Reorganized instruction definitions in decoder.isa so that they are sorted by opcode in preparation for the addition of ISA extensions M, A, F, D.] [Fixed formatting of several files, removed some variables and instructions that were missed when moving them to other patches, fixed RISC-V Foundation copyright attribution, and fixed history of files copied from other architectures using hg copy.] [Fixed indentation of switch cases in isa.cc.] [Reorganized syscall descriptions in linux/process.cc to remove large number of repeated unimplemented system calls and added implementations to functions that have received them since process.cc was first created.] [Fixed spacing for some copyright attributions.] [Replaced the rest of the file copies using hg copy.] [Fixed style check errors and corrected unaligned memory accesses.] [Fix some minor formatting mistakes.] Signed-off-by: Alec Roelke Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-11-30 17:10:28 -05:00
Jason Lowe-Power	047caf24ba	cpu: Remove branch predictor function predictInOrder This function was used by the now-defunct InOrderCPU model. Since this model is no longer in gem5, this function was not called from anywhere in the code.	2016-11-30 17:10:27 -05:00
Fernando Endo	6c72c35519	cpu, arm: Distinguish Float* and SimdFloat, create FloatMem opClass Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which distinguishes writes to the INT and FP register banks. Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72, where the "latency" of FMADD is 3 if the next instruction is a FMADD and has only the augend to destination dependency, otherwise it's 7 cycles. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2016-10-15 14:58:45 -05:00
Tushar Krishna	0f68b50ff1	ruby: rename networktest to garnet_synthetic_traffic. networktest is essentially a collection of synthetic traffic patterns for the network. The protocol name and the tester having the same name led to multiple python configuration files with the same name, adding confusion. This patch renames networktest to garnet_synthetic_traffic, and also adds more synthetic traffic patterns.	2016-10-06 14:35:16 -04:00
Rekai Gonzalez-Alberquilla	ad296b068c	cpu: Fix the O3 CPU Drain The drain did not wait until stages were ready again. Therefore, as a result of messages in the TimeBuffer being drained, the state after the drain was not consistent and asserts fired in some places when the draining happened after a stage got blocked, but before the notification arrived to the previous stages. Change-Id: Ib50b3b40b7f745b62c1eba2931dec76860824c71 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2016-09-22 10:49:10 +01:00
Radhika Jagtap	1fe5f63137	cpu: Support exit when any one Trace CPU completes replay This change adds a Trace CPU param to exit simulation early, i.e. when the first (any one) trace execution is complete. With this change the user gets a choice to configure exit as either when the last CPU finishes (default) or first CPU finishes replay. Configuring an early exit enables simulating and measuring stats strictly when memory-system resources are being stressed by all Trace CPUs. Change-Id: I3998045fdcc5cd343e1ca92d18dd7f7ecdba8f1d Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>	2016-09-15 18:01:20 +01:00
Radhika Jagtap	d067327fc0	cpu: Adjust for trace offset and fix stats This change subtracts the time offset present in the trace from all the event times when nodes and request are sent so that the replay starts immediately when the simulation starts. This makes the stats accurate when the time offset in traces is large, for example when traces are generated in the middle of a workload execution. It also solves the problem of unnecessary DRAM refresh events that would keep occurring during the large time offset before even a single request is replayed into the system. Change-Id: Ie0898842615def867ffd5c219948386d952af7f7 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>	2016-09-15 18:01:16 +01:00
Radhika Jagtap	d7724d5f54	cpu: Add frequency scaling to the Trace CPU This change adds a simple feature to scale the frequency of the Trace CPU. The compute delays in the input traces provide timing. This change adds a frequency multiplier parameter to the Trace CPU set to 1.0 by default. The compute delay is manipulated to effectively achieve the frequency at which the nodes become ready and thus scale the frequency of the Trace CPU. Change-Id: Iaabbd57806941ad56094fcddbeb38fcee1172431 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>	2016-09-15 18:01:09 +01:00
Michael LeBeane	443da2c030	kvm: Support timing accesses for KVM cpu This patch enables timing accesses for KVM cpu. A new state, RunningMMIOPending, is added to indicate that there are outstanding timing requests generated by KVM in the system. KVM's tick() is disabled and the simulation does not enter into KVM until all outstanding timing requests have completed. The main motivation for this is to allow KVM CPU to perform MMIO in Ruby, since Ruby does not support atomic accesses.	2016-09-13 23:20:03 -04:00
Michael LeBeane	458d4a3c7b	sim: Refactor quiesce and remove FS asserts The quiesce family of magic ops can be simplified by the inclusion of quiesceTick() and quiesce() functions on ThreadContext. This patch also gets rid of the FS guards, since suspending a CPU is also a valid operation for SE mode.	2016-09-13 23:17:42 -04:00
David Hashe	f3ccaab1e9	cpu, mem, sim: Change how KVM maps memory Only map memories into the KVM guest address space that are marked as usable by KVM. Create BackingStoreEntry class containing flags for is_conf_reported, in_addr_map, and kvm_map.	2016-08-22 11:41:05 -04:00
Andreas Sandberg	2c05f5207d	cpu: Add missing override in Minor's exec context Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>	2016-08-15 12:00:37 +01:00
Reiley Jeapaul	ff8257c7c2	cpu: Fixed clang errors. Added 'override' keyword for virtual functions. Change-Id: Ic37311443ca11ee6d95bceffea599e054e7aa110 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2016-08-15 12:00:36 +01:00
Nikos Nikoleris	698767e538	cpu, arch: fix the type used for the request flags Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2016-08-15 12:00:35 +01:00
Mitch Hayenga	752f1c1fe9	cpu: Fix Minor SMT WFI/drain interaction issues The behavior of WFI is to cause minor to cease evaluating pipeline logic until an interrupt is observed, however a user may wish to drain the system while a core is sleeping due to a WFI. This patch makes WFI drain. If an actual drain occurs during a WFI, the CPU is already drained and will immediately be ready for swapping, checkpointing, etc. This should not negatively impact performance as WFI instructions are 'stream-changing' (treated like unpredicted branches), so all remaining instructions are wrong-path and will be squashed rapidly. Change-Id: I63833d5acb53d8dde78f9f0c9611de0ece385e45	2016-07-21 17:19:16 +01:00
Mitch Hayenga	ff4009ac00	cpu: Add SMT support to MinorCPU This patch adds SMT support to the MinorCPU. Currently RoundRobin or Random thread scheduling are supported. Change-Id: I91faf39ff881af5918cca05051829fc6261f20e3	2016-07-21 17:19:16 +01:00
Andreas Sandberg	efb7fb6f85	mem: Resolve TrafficGen trace relative to the config The traffic generator currently resolves relative trace paths relative to gem5's current working directory. This can lead to surprising results for relative paths where the expectation would normally be that they are resolved relative to the configuration file. This changeset implements config-relative trace file lookups. The old behavior is kept as a fallback for configs that expect that behavior. Change-Id: I1bda4e16725842666ffc37dcb6838c23a6ff138c Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>	2016-06-20 14:49:37 +01:00
David Guillen Fandos	fb5fc11da4	pwr: Low-power idle power state for idle CPUs Add functionality to the BaseCPU that will put the entire CPU into a low-power idle state whenever all threads in it are idle. Change-Id: I984d1656eb0a4863c87ceacd773d2d10de5cfd2b	2016-06-06 17:16:43 +01:00
David Guillen Fandos	70798b1ba0	stats: Fixing regStats function for some SimObjects Fixing an issue with regStats not calling the parent class method for most SimObjects in Gem5. This causes issues if one adds new stats in the base class (since they are never initialized properly!). Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2016-06-06 17:16:43 +01:00
Stephan Diestelhorst	589033c94c	sim: Call regStats of base-class as well We want to extend the stats of objects hierarchically and thus it is necessary to register the statistics of the base-class(es), as well. For now, these are empty, but generic stats will be added there. Patch originally provided by Akash Bagdia at ARM Ltd.	2016-06-06 17:16:43 +01:00
Ilias Vougioukas	7c8d6e3660	cpu: fix lastStopped unserialisation MinorCPU fix for corrupt numCycles when resuming from a previous simulation. --- src/cpu/minor/cpu.cc | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)	2016-05-27 16:55:01 +01:00
Andreas Hansson	d023b7e8db	cpu: Add a basic progress check to the TrafficGen This patch adds a progress check to the TrafficGen so that it is easier to detect deadlock scenarios where the generator gets stuck waiting for a retry, and makes no further progress. Change-Id: Ifb8779ad0939f52c0518d0e867bac73f99b82e2b Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>	2016-05-26 11:56:24 +01:00
Mitch Hayenga	c75ff71139	mem: Remove threadId from memory request class In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu. This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit.	2016-04-07 09:30:20 -05:00
Mitch Hayenga	d99deff8ea	cpu: Implement per-thread GHRs Branch predictors that use GHRs should index them on a per-thread basis. This makes that so. This is a re-spin of fb51231 after the revert (bd1c6789).	2016-04-05 12:20:19 -05:00
Mitch Hayenga	0fd4bb7f12	cpu: Add an indirect branch target predictor This patch adds a configurable indirect branch predictor that can be indexed by a combination of GHR and path history hashes. Implements the functionality described in: "Target prediction for indirect jumps" by Chang, Hao, and Patt http://dl.acm.org/citation.cfm?id=264209 This is a re-spin of fb9d142 after the revert (bd1c6789).	2016-04-05 11:48:37 -05:00
Mitch Hayenga	3f6874cb29	cpu: Fix BTB threading oversight The extant BTB code doesn't hash on the thread id but does check the thread id for 'btb hits'. This results in one thread of a multi-threaded workload taking a BTB entry, and all other threads missing for the same branch.	2016-04-05 11:44:27 -05:00
Andreas Sandberg	fd52a63e24	Revert to 74c1e6513bd0 (sim: Thermal support for Linux)	2016-04-07 10:42:07 +01:00
Andreas Sandberg	be28d96510	Revert power patch sets with unexpected interactions The following patches had unexpected interactions with the current upstream code and have been reverted for now: e07fd01651f3: power: Add support for power models 831c7f2f9e39: power: Low-power idle power state for idle CPUs 4f749e00b667: power: Add power states to ClockedObject Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> --HG-- extra : amend_source : 0b6fb073c6bbc24be533ec431eb51fbf1b269508	2016-04-06 19:43:31 +01:00
Mitch Hayenga	8615b27174	mem: Remove threadId from memory request class In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu.	2016-04-05 12:39:21 -05:00
Curtis Dunham	76ee011a12	cpu: Implement per-thread GHRs Branch predictors that use GHRs should index them on a per-thread basis. This makes that so.	2016-04-05 12:20:19 -05:00
Mitch Hayenga	1578d2d0b6	cpu: Add an indirect branch target predictor This patch adds a configurable indirect branch predictor that can be indexed by a combination of GHR and path history hashes. Implements the functionality described in: "Target prediction for indirect jumps" by Chang, Hao, and Patt http://dl.acm.org/citation.cfm?id=264209	2016-04-05 11:48:37 -05:00
Mitch Hayenga	7bc52af771	cpu: Fix BTB threading oversight The extant BTB code doesn't hash on the thread id but does check the thread id for 'btb hits'. This results in one thread of a multi-threaded workload taking a BTB entry, and all other threads missing for the same branch.	2016-04-05 11:44:27 -05:00
Akash Bagdia	1c34ee20df	power: Low-power idle power state for idle CPUs Add functionality to the BaseCPU that will put the entire CPU into a low-power idle state whenever all threads in it are idle.	2014-12-09 10:42:08 +00:00
Akash Bagdia	3ee4957b49	power: Add power states to ClockedObject Add 4 power states to the ClockedObject, provides necessary access functions to check and update the power state. Default power state is UNDEFINED, it is responsibility of the respective simulation model to provide the startup state and any other logic for state change. Add number of transition stat. Add distribution of time spent in clock gated state. Add power state residency stat. Add dump call back function to allow stats update of distribution and residency stats.	2014-11-18 14:00:48 +00:00
Mitch Hayenga	85dadcd381	cpu: Add instruction opclass histogram to minor	2016-04-05 08:08:12 -05:00
Geoffrey Blake	f948f9fca9	cpu: Query CPU for inst executed from Python This patch adds the ability for the simulator to query the number of instructions a CPU has executed so far per hw-thread. This can be used to enable more flexible periodic events such as taking checkpoints starting 1s into simulation and X instructions thereafter.	2016-04-05 05:29:02 -05:00
Andreas Sandberg	a3efb6bd1d	kvm: Add an option to force context sync on kvm entry/exit This changeset adds an option to force the kvm-based CPUs to always synchronize the gem5 thread context representation on entry/exit into the kernel. This is very useful for debugging. Unfortunately, it is also the only way to get reliable register contents when using remote gdb functionality. The long-term solution for the latter would be to implement a kvm-specific thread context. Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Alexandru Dutu <alexandru.dutu@amd.com>	2016-03-30 10:52:25 +01:00
Andreas Hansson	3ba481496d	cpu: warn if TrafficGen is suppressing a large number of packets Add a basic warning for every 10000 packets suppressed to alert the user.	2016-03-20 06:38:34 -04:00

1691 commits
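
The message for commit a5802c823f describes recognizing that a system call would block by doing a non-blocking poll first, then either proceeding or signalling a retry. The following is a minimal sketch of that idea in Python, not gem5's implementation: `RETRY` and `emulated_read` are hypothetical names invented for illustration, and the sketch assumes a POSIX host where `select.select` accepts pipe file descriptors.

```python
import os
import select

# Hypothetical stand-in for the "retry fault" named in the commit message.
RETRY = object()


def emulated_read(fd, nbytes):
    """Read from fd only if data is ready now; otherwise signal a retry."""
    # A timeout of 0 turns select() into a non-blocking readiness check.
    readable, _, _ = select.select([fd], [], [], 0)
    if not readable:
        # The real call would block, so ask the caller to replay it later.
        return RETRY
    # Data is available, so this host read returns without blocking.
    return os.read(fd, nbytes)


# Producer/consumer demo over a pipe, all within one host thread.
read_end, write_end = os.pipe()
first_attempt = emulated_read(read_end, 5)    # consumer runs before producer
os.write(write_end, b"hello")                 # producer writes its data
second_attempt = emulated_read(read_end, 5)   # replayed call can now succeed
os.close(read_end)
os.close(write_end)
```

In this sketch the first attempt yields `RETRY` instead of blocking the only host thread, and the replayed attempt returns the producer's bytes. How far in the future to replay is left to the caller, which matches the message's note that retrying every tick is inefficient.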