sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Nikos Nikoleris	ec64b81a9d	cpu: fix RetiredStores probe point Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-10 14:30:53 -06:00
Andrew Lukefahr	6d32004407	minor: fixed LSQ MasterPortID Minor was reporting the data cache access as ".inst" accesses. This just switches the MasterPortID to dataMasterPortId. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Gabe Black	70eb68beae	Let other objects set up memory like regions in a KVM VM.	2014-12-09 21:53:44 -08:00
Gabe Black	bacbb8ecbc	cpu: Only check for PC events on instruction boundaries. Only the instruction address is actually checked, so there's no need to check repeatedly while we're working through the microops of a macroop and that's not changing.	2014-12-05 01:47:35 -08:00
Andrew Bardsley	df37cad0fd	cpu: Fix retries on barrier/store in Minor's store buffer This patch fixes a case where a store in Minor's store buffer never leaves the store buffer as it is pre-maturely counted as having been issued, leading to the store buffer idling. LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses either in memory, or still in the store buffer after being completed. For stores which are also barriers, the store will stay in the store buffer for a cycle after it is completed and will be cleaned up by the barrier clearing code (to ensure that barriers are completed in-order). To acheive this, numUnissuedAccesses is not decremented when a store-barrier is issued to memory, but when its barrier effect is cleared. Without this patch, the correct behaviour happens when a memory transaction is immediately accepted, but not if it needs a retry.	2014-12-02 06:08:15 -05:00
Andrew Bardsley	98f3e7a310	cpu: Fix memoryIssueLimit checking in Minor This patch fixes the checking of the number of memory instructions issued per cycles in the Minor CPU.	2014-12-02 06:08:13 -05:00
Marco Elver	9649395f85	cpu, o3: Ignored invalidate causing same-address load reordering In case the memory subsystem sends a combined response with invalidate (e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part of the response. If we were to ignore the invalidate part, under certain circumstances this effectively leads to reordering of loads to the same address which is not permitted under any memory consistency model implemented in gem5. Consider the case where a later load's address is computed before an earlier load in program order, and is therefore sent to the memory subsystem first. At some point the earlier load's address is computed and in doing so correctly marks the later load as a possibleLoadViolation. In the meantime some other node writes and sends invalidations to all other nodes. The invalidation races with the later load's ReadResp, and arrives before ReadResp and is deferred. Upon receipt of the ReadResp, the response is changed to ReadRespWithInvalidate, and sent to the CPU. If we ignore the invalidate part of the packet, we let the later load read the old value of the address. Eventually the earlier load's ReadResp arrives, but with new data. As there was no invalidate snoop (sunk into the ReadRespWithInvalidate), and if we did not process the invalidate of the ReadRespWithInvalidate, we obtain a load reordering. A similar scenario can be constructed where the earlier load's address is computed after ReadRespWithInvalidate arrives for the younger load. In this case hitExternalSnoop needs to be set to true on the ReadRespWithInvalidate, so that upon knowing the address of the earlier load, checkViolations will cause the later load to be squashed. Finally we must account for the case where both loads are sent to the memory subsystem (reordered), a snoop invalidate arrives and correctly sets the later loads fault to ReExec. However, before the CPU processes the fault, the later load's ReadResp arrives and the writeback discards the outstanding fault. We must add a check to ensure that we do not skip any unprocessed faults.	2014-12-02 06:08:03 -05:00
Stephan Diestelhorst	810349a8a7	cpu: Move packet deallocation to recvTimingResp in the O3 CPU Move the packet deallocations in the O3 CPU so that the completeDataAccess deals only with the LSQ specific parts and the generic recvTimingResp frees the packet in all other cases.	2014-12-02 06:07:58 -05:00
Andreas Hansson	41846cb61b	mem: Assume all dynamic packet data is array allocated This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic was in the Ruby testers. The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks. As the last part the patch, it renames dataDynamicArray to dataDynamic.	2014-12-02 06:07:43 -05:00
Andreas Hansson	9779ba2e37	mem: Add const getters for write packet data This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used. The patch also removes the unused isReadWrite function.	2014-12-02 06:07:36 -05:00
Andreas Hansson	25bfc24999	mem: Remove null-check bypassing in Packet::getPtr This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null. The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone object or has any good suggestions). Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer.	2014-12-02 06:07:34 -05:00
Alexandru Dutu	adbaa4dfde	kvm, x86: Adding support for SE mode execution This patch adds methods in KvmCPU model to handle KVM exits caused by syscall instructions and page faults. These types of exits will be encountered if KvmCPU is run in SE mode.	2014-11-23 18:01:08 -08:00
Andreas Hansson	481eb6ae80	arm: Fixes based on UBSan and static analysis Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base. Most of the fixes are simply ensuring that proper intialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.	2014-11-14 03:53:51 -05:00
Ali Saidi	b6f32253dd	arm: Fix timing wakeup with LLSC	2014-11-12 09:05:22 -05:00
Marc Orr	bf80734b2c	x86 isa: This patch attempts an implementation at mwait. Mwait works as follows: 1. A cpu monitors an address of interest (monitor instruction) 2. A cpu calls mwait - this loads the cache line into that cpu's cache. 3. The cpu goes to sleep. 4. When another processor requests write permission for the line, it is evicted from the sleeping cpu's cache. This eviction is forwarded to the sleeping cpu, which then wakes up. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:22 -06:00
Andrew Lukefahr	bd32d55a2c	cpu: Minor Draining Bug Fixes a bug where Minor drains in the midst of committing a conditional store. While committing a conditional store, lastCommitWasEndOfMacroop is true (from the previous instruction) as we still haven't finished the conditional store. If a drain occurs before the cache response, Minor would check just lastCommitWasEndOfMacroop, which was true, and set drainState=DrainHaltFetch, which increases the streamSeqNum. This caused the conditional store to be squashed when the memory responded and it completed. However, to the memory the store succeeded, while to the instruction sequence it never occurred. In the case of an LLSC, the instruction sequence will replay the squashed STREX, which will fail as the cache is no longer in LLSC. Then the instruction sequence will loop back to a LDREX, which receives the updated (incorrect) value. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:21 -06:00
Mitch Hayenga	5bfa521c46	cpu: Add writeback modeling for drain functionality It is possible for the O3 CPU to consider itself drained and later have a squashed instruction perform a writeback. This patch re-adds tracking of in-flight instructions to prevent falsely signaling a drained event.	2014-10-29 23:18:27 -05:00
Mitch Hayenga	6847bbf7ce	cpu: Add drain check functionality to IEW IEW did not check the instQueue and memDepUnit to ensure they were drained. This caused issues when drainSanityCheck() did check those structures after asserting IEW was drained.	2014-10-29 23:18:26 -05:00
Ali Saidi	e3ee27c7b4	cpu: Add support to checker for CACHE_BLOCK_ZERO commands. The checker didn't know how to properly validate these new commands.	2014-10-29 23:18:24 -05:00
Andrew Bardsley	536c72333f	cpu: Fix barrier push to store buffer when full bug in Minor This patch fixes a bug where a completing load or store which is also a barrier can push a barrier into the store buffer without first checking that there is a free slot. The bug was not fatal but would print a warning that the store buffer was full when inserting.	2014-10-29 23:18:24 -05:00
Nilay Vaish	922a9d8ed2	cpu: o3: corrects base FP and CC register index in removeThread()	2014-10-20 16:47:55 -05:00
Andreas Hansson	a2d246b6b8	arch: Use shared_ptr for all Faults This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared".	2014-10-16 05:49:51 -04:00
Andreas Hansson	a769963d16	o3: Use shared_ptr for MemDepEntry This patch transitions the o3 MemDepEntry from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared".	2014-10-16 05:49:49 -04:00
Andreas Sandberg	e0074324ba	cpu: Probe points for basic PMU stats This changeset adds probe points that can be used to implement PMU counters for CPU stats. The following probes are supported: * BaseCPU::ppCycles / Cycles * BaseCPU::ppRetiredInsts / RetiredInsts * BaseCPU::ppRetiredLoads / RetiredLoads * BaseCPU::ppRetiredStores / RetiredStores * BaseCPU::ppRetiredBranches RetiredBranches	2014-10-16 05:49:41 -04:00
Andreas Sandberg	76b0ff9ecd	cpu: Add branch predictor PMU probe points This changeset adds probe points that can be used to implement PMU counters for branch predictor stats. The following probes are supported: * BPRedUnit::ppBranches / Branches * BPRedUnit::ppMisses / Misses	2014-10-16 05:49:40 -04:00
Andrew Lukefahr	8e07b36d2b	cpu: Fix o3 SMT IQCount bug Commmitted by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-11 16:16:02 -05:00
Mitch Hayenga	06f4b521aa	cpu: Remove Ozone CPU from the source tree The Ozone CPU is now very much out of date and completely non-functional, with no one actively working on restoring it. It is a source of confusion for new users who attempt to use it before realizing its current state. RIP	2014-10-09 17:51:58 -04:00
Andreas Hansson	341dbf2662	arch: Use const StaticInstPtr references where possible This patch optimises the passing of StaticInstPtr by avoiding copying the reference-counting pointer. This avoids first incrementing and then decrementing the reference-counting pointer.	2014-09-27 09:08:36 -04:00
Andreas Hansson	deb2200671	scons: Address issues related to gcc 4.9.1 Fix a number few minor issues to please gcc 4.9.1. Removing the '-fuse-linker-plugin' flag means no libraries are part of the LTO process, but hopefully this is an acceptable loss, as the flag causes issues on a lot of systems (only certain combinations of gcc, ld and ar work).	2014-09-27 09:08:34 -04:00
Andreas Hansson	de62aedabc	misc: Fix a bunch of minor issues identified by static analysis Add some missing initialisation, and fix a handful benign resource leaks (including some false positives).	2014-09-27 09:08:29 -04:00
Mitch Hayenga	cc6523e2d6	cpu: Remove unused deallocateContext calls The call paths for de-scheduling a thread are halt() and suspend(), from the thread context. There is no call to deallocateContext() in general, though some CPUs chose to define it. This patch removes the function from BaseCPU and the cores which do not require it.	2014-09-20 17:18:36 -04:00
Mitch Hayenga	e1403fc2af	alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate activate(), suspend(), and halt() used on thread contexts had an optional delay parameter. However this parameter was often ignored. Also, when used, the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were ever specified). This patch removes the delay parameter and 'Events' associated with them across all ISAs and cores. Unused activate logic is also removed.	2014-09-20 17:18:35 -04:00
Andreas Hansson	1f6d5f8f84	mem: Rename Bus to XBar to better reflect its behaviour This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus. As part of this renaming, the code has also been clean up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code is due to the delay variables in the packet. --HG-- rename : src/mem/Bus.py => src/mem/XBar.py rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh rename : src/mem/bus.cc => src/mem/xbar.cc rename : src/mem/bus.hh => src/mem/xbar.hh	2014-09-20 17:18:32 -04:00
Wendy Elsasser	a384525355	cpu: Update DRAM traffic gen Add new DRAM_ROTATE mode to traffic generator. This mode will generate DRAM traffic that rotates across banks per rank, command types, and ranks per channel The looping order is illustrated below: for (ranks per channel) for (command types) for (banks per rank) // Generate DRAM Command Series This patch also adds the read percentage as an input argument to the DRAM sweep script. If the simulated read percentage is 0 or 100, the middle for loop does not generate additional commands. This loop is used only when the read percentage is set to 50, in which case the middle loop will toggle between read and write commands. Modified sweep.py script, which generates DRAM traffic. Added input arguments and support for new DRAM_ROTATE mode. The script now has input arguments for: 1) Read percentage 2) Number of ranks 3) Address mapping 4) Traffic generator mode (DRAM or DRAM_ROTATE) The default values are: 100% reads, 1 rank, RoRaBaCoCh address mapping, and DRAM traffic gen mode For the DRAM traffic mode, added multi-rank support.	2014-09-20 17:17:55 -04:00
Andreas Hansson	0fa128bbd0	base: Clean up redundant string functions and use C++11 This patch does a bit of housekeeping on the string helper functions and relies on the C++11 standard library where possible. It also does away with our custom string hash as an implementation is already part of the standard library.	2014-09-20 17:17:49 -04:00
Mitch Hayenga	4f0e3cd4d7	cpu: Add ExecFlags debug flag Adds a debug flag to print out the flags a instruction is tagged with.	2014-09-20 17:17:45 -04:00
Dam Sunwoo	ca3513d630	cpu: use probes infrastructure to do simpoint profiling Instead of having code embedded in cpu model to do simpoint profiling use the probes infrastructure to do it.	2014-09-20 17:17:43 -04:00
Andreas Hansson	41fc8a573e	arch: Pass faults by const reference where possible This patch changes how faults are passed between methods in an attempt to copy as few reference-counting pointer instances as possible. This should avoid unecessary copies being created, contributing to the increment/decrement of the reference counters.	2014-09-19 10:35:18 -04:00
Andreas Hansson	619c5519fe	cpu: Use a deque in o3 rename instruction queue Switch from a list to a data structure with better data layout.	2014-09-19 10:35:14 -04:00
Andreas Hansson	efd5cf323a	misc: Use safe_cast when assumptions are made about return value This patch changes two dynamic_cast to safe_cast as we assume the return value is not NULL (without checking).	2014-09-19 10:35:11 -04:00
Andrew Bardsley	1a45a8c5d3	cpu: Fix memory access in Minor not setting parent Request flags This patch fixes cases where uncacheable/memory type flags are not set correctly on a memory op which is split in the LSQ. Without this patch, request->request if freely used to check flags where the flags should actually come from the accumulation of request fragment flags. This patch also fixes a bug where an uncacheable access which passes through tryToSendRequest more than once can increment LSQ::numAccessesInMemorySystem more than once.	2014-09-12 10:22:49 -04:00
Andrew Bardsley	c8b919aba2	style: Fix line continuation, especially in debug messages This patch closes a number of space gaps in debug messages caused by the incorrect use of line continuation within strings. (There's also one consistency change to a similar, but correct, use of line continuation)	2014-09-12 10:22:47 -04:00
Andreas Hansson	2b4906fc64	minor: Fix typo in DPRINTF for Minor branch prediction	2014-09-12 10:22:46 -04:00
Mitch Hayenga	cd1bd7572a	cpu: Only iterate over possible threads on the o3 cpu Some places in O3 always iterated over "Impl::MaxThreads" even if a CPU had fewer threads. This removes a few of those instances.	2014-09-09 04:36:34 -04:00
Andreas Hansson	da4539dc74	misc: Fix a number of unitialised variables and members Static analysis unearther a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means less false positives next time round.	2014-09-09 04:36:31 -04:00
Andreas Hansson	2698e73966	base: Use the global Mersenne twister throughout This patch tidies up random number generation to ensure that it is done consistently throughout the code base. In essence this involves a clean-up of Ruby, and some code simplifications in the traffic generator. As part of this patch a bunch of skewed distributions (off-by-one etc) have been fixed. Note that a single global random number generator is used, and that the object instantiation order will impact the behaviour (the sequence of numbers will be unaffected, but if module A calles random before module B then they would obviously see a different outcome). The dependency on the instantiation order is true in any case due to the execution-model of gem5, so we leave it as is. Also note that the global ranom generator is not thread safe at this point. Regressions using the memtest, TrafficGen or any Ruby tester are affected and will be updated accordingly.	2014-09-03 07:42:54 -04:00
Curtis Dunham	e3b19cb294	mem: Refactor assignment of Packet types Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function.	2014-05-13 12:20:48 -05:00
Mitch Hayenga	659bdc1a6b	cpu: Fix o3 drain bug For X86, the o3 CPU would get stuck with the commit stage not being drained if an interrupt arrived while drain was pending. isDrained() makes sure that pcState.microPC() == 0, thus ensuring that we are at an instruction boundary. However, when we take an interrupt we execute: pcState.upc(romMicroPC(entry)); pcState.nupc(romMicroPC(entry) + 1); tc->pcState(pcState); As a result, the MicroPC is no longer zero. This patch ensures the drain is delayed until no interrupts are present. Once draining, non-synchronous interrupts are deffered until after the switch.	2014-09-03 07:42:45 -04:00
Curtis Dunham	4a3f11149d	arm: use condition code registers for ARM ISA Analogous to ee049bf (for x86). Requires a bump of the checkpoint version and corresponding upgrader code to move the condition code register values to the new register file.	2014-04-29 16:05:02 -05:00
Dam Sunwoo	5008a20aa4	cpu: fix bimodal predictor to use correct global history reg A small bug in the bimodal predictor caused significant degradation in performance on some benchmarks. This was caused by using the wrong globalHistoryReg during the update phase. This patches fixes the bug and brings the performance to normal level.	2014-09-03 07:42:41 -04:00

1 2 3 4 5 ...

1531 commits