sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Mitch Hayenga	771c864bf4	mem: Allowed tagged instruction prefetching in stride prefetcher For systems with a tightly coupled L2, a stride-based prefetcher may observe access requests from both instruction and data L1 caches. However, the PC address of an instruction miss gives no relevant training information to the stride based prefetcher(there is no stride to train). In theses cases, its better if the L2 stride prefetcher simply reverted back to a simple N-block ahead prefetcher. This patch enables this option. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-29 23:21:26 -06:00
Mitch Hayenga ext:(%2C%20Amin%20Farmahini%20%3Caminfar%40gmail.com%3E)	95735e10e7	mem: prefetcher: add options, support for unaligned addresses This patch extends the classic prefetcher to work on non-block aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64 byte cachelines, the current stride based prefetcher would see an access pattern of 0, 64, 64, 128, 192.... Thus not detecting a constant stride pattern. This patch fixes this, by training the prefetcher on access and not masking off the lower address bits. It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data acceses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy. Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean).	2014-01-29 23:21:25 -06:00
Amin Farmahini	ffbdaa7cce	mem: Remove redundant findVictim() input argument The patch (1) removes the redundant writeback argument from findVictim() (2) fixes the description of access() function Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-28 18:00:50 -06:00
Giacomo Gabrielli	aefe9cc624	mem: Add support for a security bit in the memory system This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.	2014-01-24 15:29:30 -06:00
Timothy M. Jones	427ceb57a9	Cache: Collect very basic stats on tag and data accesses Adds very basic statistics on the number of tag and data accesses within the cache, which is important for power modelling. For the tags, simply count the associativity of the cache each time. For the data, this depends on whether tags and data are accessed sequentially, which is given by a new parameter. In the parallel case, all data blocks are accessed each time, but with sequential accesses, a single data block is accessed only on a hit.	2014-01-24 15:29:30 -06:00
Dam Sunwoo	85e8779de7	mem: per-thread cache occupancy and per-block ages This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.	2014-01-24 15:29:30 -06:00
Matt Horsnell	ca89eba79e	mem: track per-request latencies and access depths in the cache hierarchy Add some values and methods to the request object to track the translation and access latency for a request and which level of the cache hierarchy responded to the request.	2014-01-24 15:29:30 -06:00
Matt Horsnell	6decd70bfb	cpu: add consistent guarding to *_impl.hh files.	2013-10-17 10:20:45 -05:00
Andreas Hansson	19a5b68db7	arch: Resurrect the NOISA build target and rename it NULL This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models. The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. --HG-- rename : build_opts/NOISA => build_opts/NULL rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc	2013-09-04 13:22:57 -04:00
Andreas Hansson	d4273cc9a6	mem: Set the cache line size on a system level This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.	2013-07-18 08:31:16 -04:00
Xiangyu Dong	4e8ecd7c6f	mem: Add cache class destructor to avoid memory leaks Make valgrind a little bit happier	2013-07-18 08:29:47 -04:00
Prakash Ramrakhyani	ac515d7a9b	mem: Reorganize cache tags and make them a SimObject This patch reorganizes the cache tags to allow more flexibility to implement new replacement policies. The base tags class is now a clocked object so that derived classes can use a clock if they need one. Also having deriving from SimObject allows specialized Tag classes to be swapped in/out in .py files. The cache set is now templatized to allow it to contain customized cache blocks with additional informaiton. This involved moving code to the .hh file and removing cacheset.cc. The statistics belonging to the cache tags are now including ".tags" in their name. Hence, the stats need an update to reflect the change in naming.	2013-06-27 05:49:50 -04:00
Andreas Hansson	0d68d36b9d	mem: Remove the cache builder This patch removes the redundant cache builder class.	2013-06-27 05:49:50 -04:00
Andreas Hansson	33a8d777ad	mem: Align cache timing to clock edges This patch changes the cache timing calculations such that the results are aligned to clock edges. Plenty stats change as a results of this patch.	2013-06-27 05:49:49 -04:00
Andreas Hansson	368f50a0a1	mem: Cycles converted to Ticks in atomic cache accesses This patch fixes an outstanding issue in the cache timing calculations where an atomic access returned a time in Cycles, but the port forwarded it on as if it was in Ticks. A separate patch will update the regression stats.	2013-06-27 05:49:49 -04:00
Andreas Hansson	f330b3c28d	mem: Remove a redundant heap allocation for a snoop packet This patch changes the updards snoop packet to avoid allocating and later deleting it. As the code executes in 0 time and the lifetime of the packet does not extend beyond the block there is no reason to heap allocate it.	2013-06-27 05:49:49 -04:00
Andreas Hansson	7da851d1a8	mem: Spring cleaning of MSHR and MSHRQueue This patch does some minor tidying up of the MSHR and MSHRQueue. The clean up started as part of some ad-hoc tracing and debugging, but seems worthwhile enough to go in as a separate patch. The highlights of the changes are reduced scoping (private) members where possible, avoiding redundant new/delete, and constructor initialisation to please static code analyzers.	2013-05-30 12:54:11 -04:00
Andreas Hansson	42191522cc	mem: Fix MSHR print format This patch fixes an incorrect print format string by adding an additional string element.	2013-05-30 12:54:09 -04:00
Uri Wiener	a8fbfefb5e	mem: Adding verbose debug output in the memory system This patch provides useful printouts throughut the memory system. This includes pretty-printed cache tags and function call messages (call-stack like).	2013-04-22 13:20:33 -04:00
Mitch Hayenga	4920f0d7e5	mem: Fix cache latency bug Fixes a latency calculation bug for accesses during a cache line fill. Under a cache miss, before the line is filled, accesses to the cache are associated with a MSHR and marked as targets. Once the line fill completes, MSHR target packets pay an additional latency of "responseLatency + busSerializationLatency". However, the "whenReady" field of the cache line is only set to an additional delay of "busSerializationLatency". This lacks the responseLatency component of the fill. It is possible for accesses that occur on the cycle of (or briefly after) the line fill to respond without properly paying the responseLatency. This also creates the situation where two accesses to the same address may be serviced in an order opposite of how they were received by the cache. For stores to the same address, this means that although the cache performs the stores in the order they were received, acknowledgements may be sent in a different order. Adding the responseLatency component to the whenReady field preserves the penalty that should be paid and prevents these ordering issues. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-03-27 18:36:09 -05:00
Rene de Jong	87089175cc	mem: Cancel cache retry event when blocking port This patch solves the corner case scenario where the sendRetryEvent could be scheduled twice, when an io device stresses the IOcache in the system. This should not be possible in the cache system.	2013-03-26 14:46:51 -04:00
Andreas Hansson	da950caed2	mem: Fix sender state bug and delay popping This patch fixes a newly introduced bug where the sender state was popped before checking that it should be. Amazingly all regressions pass, but Linux fails to boot on the detailed CPU with caches enabled.	2013-02-19 12:57:47 -05:00
Andreas Hansson	c10098f28b	scons: Fix up numerous warnings about name shadowing This patch address the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.	2013-02-19 05:56:06 -05:00
Andreas Hansson	860155a5fc	mem: Enforce strict use of busFirst- and busLastWordTime This patch adds a check to ensure that the delay incurred by the bus is not simply disregarded, but accounted for by someone. At this point, all the modules do is to zero it out, and no additional time is spent. This highlights where the bus timing is simply dropped instead of being paid for. As a follow up, the locations identified in this patch should add this additional time to the packets in one way or another. For now it simply acts as a sanity check and highlights where the delay is simply ignored. Since no time is added, all regressions remain the same.	2013-02-19 05:56:06 -05:00
Andreas Hansson	40d0e6c899	mem: Change accessor function names to match the port interface This patch changes the names of the cache accessor functions to be in line with those used by the ports. This is done to avoid confusion and get closer to a one-to-one correspondence between the interface of the memory object (the cache in this case) and the port itself. The member function timingAccess has been split into a snoop/non-snoop part to avoid branching on the isResponse() of the packet.	2013-02-19 05:56:06 -05:00
Andreas Hansson	b3fc8839c4	mem: Make packet bus-related time accounting relative This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to create a Last-Level Cache (LLC) directly to a memory controller without a bus inbetween. The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc) should be doing this if we opt for keeping this way of accounting for the bus timing.	2013-02-19 05:56:06 -05:00
Andreas Hansson	362160c8ae	mem: Add deferred packet class to prefetcher This patch removes the time field from the packet as it was only used by the preftecher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent.	2013-02-19 05:56:06 -05:00
Andreas Hansson	7cd49b24d2	sim: Make clock private and access using clockPeriod() This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier.	2013-02-19 05:56:06 -05:00
Sascha Bischoff	86a4d09269	mem: Fix SenderState related cache deadlock This patch fixes a potential deadlock in the caches. This deadlock could occur when more than one cache is used in a system, and pkt->senderState is modified in between the two caches. This happened as the caches relied on the senderState remaining unchanged, and used it for instantaneous upstream communication with other caches. This issue has been addressed by iterating over the linked list of senderStates until we are either able to cast to a MSHR* or senderState is NULL. If the cast is successful, we know that the packet has previously passed through another cache, and therefore update the downstreamPending flag accordingly. Otherwise, we do nothing.	2013-02-19 05:56:06 -05:00
Andreas Hansson	0622f30961	mem: Add predecessor to SenderState base class This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.	2013-02-19 05:56:05 -05:00
Andreas Hansson	f6550b3d20	mem: Tighten up cache constness and scoping This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected.	2013-02-15 17:40:10 -05:00
Andreas Sandberg	b904bd5437	sim: Add a system-global option to bypass caches Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.	2013-02-15 17:40:09 -05:00
Anthony Gutierrez	af0f8b31db	cache: remove drainManager because it's not used the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.	2013-01-28 20:19:42 -05:00
Mitch Hayenga	c7dbd5e768	mem: Make LL/SC locks fine grained The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range.	2013-01-08 08:54:07 -05:00
Andreas Sandberg	964aa49d15	mem: Fix guest corruption when caches handle uncacheable accesses When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS. Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses.	2013-01-07 13:05:47 -05:00
Andreas Sandberg	d44f2f611f	mem: Remove the IIC replacement policy The IIC replacement policy seems to be unused and has probably gathered too much bit rot to be useful. This patch removes the IIC and its associated cache parameters.	2013-01-07 13:05:39 -05:00
Andreas Hansson	921490a060	sim: Fatal if a clocked object is set to have a clock of 0 This patch adds a check to the clocked object constructor to ensure it is not configured to have a clock period of 0.	2013-01-07 13:05:39 -05:00
Ali Saidi	9a645d6e9b	cache: add note about where conflicts are handled	2013-01-07 13:05:32 -05:00
Andreas Sandberg	ddd6af414c	mem: Add support for writing back and flushing caches This patch adds support for the following optional drain methods in the classical memory system's cache model: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested. Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	b81a977e6a	sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	c0ab52799c	sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it.	2012-11-02 11:32:01 -05:00
Andreas Hansson	2a740aa096	Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.	2012-10-15 08:12:35 -04:00
Andreas Hansson	93a159875a	Fix: Address a few minor issues identified by cppcheck This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only get's called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used.	2012-10-15 08:12:23 -04:00
Andreas Hansson	88554790c3	Mem: Use cycles to express cache-related latencies This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.	2012-10-15 08:10:54 -04:00
Djordje Kovacevic	80a26a3e39	MEM: Put memory system document into doxygen	2012-09-25 11:49:41 -05:00
Mrinmoy Ghosh	6fc0094337	Cache: add a response latency to the caches In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.	2012-09-25 11:49:41 -05:00
Andreas Hansson	ffb6aec603	AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. --HG-- rename : src/base/range.hh => src/base/addr_range.hh rename : src/base/range_map.hh => src/base/addr_range_map.hh	2012-09-19 06:15:44 -04:00
Andreas Hansson	292d8252a4	clang: Fix issues identified by the clang static analyzer This patch addresses a few minor issues reported by the clang static analyzer. The analysis was run with: scan-build -disable-checker deadcode \ -enable-checker experimental.core \ -disable-checker experimental.core.CastToStruct \ -enable-checker experimental.cpluscplus	2012-09-11 14:15:47 -04:00
Lena Olson	584eba3ab6	Cache: Split invalidateBlk up to seperate block vs. tags This seperates the functionality to clear the state in a block into blk.hh and the functionality to udpate the tag information into the tags. This gets rid of the case where calling invalidateBlk on an already-invalid block does something different than calling it on a valid block, which was confusing.	2012-09-11 14:14:49 -04:00
Andreas Hansson	287ea1a081	Param: Transition to Cycles for relevant parameters This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py.	2012-09-07 12:34:38 -04:00

1 2 3 4 5 ...

389 commits