sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Andreas Hansson	83a5977481	mem: Be less conservative in clearing load locks in the cache Avoid being overly conservative in clearing load locks in the cache, and allow writes to the line if they are from the same context. This is in line with ALPHA and ARM.	2016-02-10 04:08:25 -05:00
Andreas Hansson	92f021cbbe	mem: Move the point of coherency to the coherent crossbar This patch introduces the ability of making the coherent crossbar the point of coherency. If so, the crossbar does not forward packets where a cache with ownership has already committed to responding, and also does not forward any coherency-related packets that are not intended for a downstream memory controller. Thus, invalidations and upgrades are turned around in the crossbar, and the memory controller only sees normal reads and writes. In addition this patch moves the express snoop promotion of a packet to the crossbar, thus allowing the downstream cache to check the express snoop flag (as it should) for bypassing any blocking, rather than relying on whether a cache is responding or not.	2016-02-10 04:08:25 -05:00
Andreas Hansson	f84ee031cc	mem: Align cache behaviour in atomic when upstream is responding Adopt the same flow as in timing mode, where the caches on the path to memory get to keep the line (if present), and we use the responderHadWritable flag to determine if we need to forward the (invalidating) packet or not.	2016-02-10 04:08:24 -05:00
Andreas Hansson	986214f181	mem: Align how snoops are handled when hitting writebacks This patch unifies the snoop handling in case of hitting writebacks with how we handle snoops hitting in the tags. As a result, we end up using the same optimisation as the normal snoops, where we inform the downstream cache if we encounter a line in Modified (writable and dirty) state, which enables us to avoid sending out express snoops to invalidate any Shared copies of the line. A few regressions consequently change, as some transactions are sunk higher up in the cache hierarchy.	2016-02-10 04:08:24 -05:00
Andreas Hansson	fbdeb60316	mem: Deduce if cache should forward snoops This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. Less error prone, and less parameters to worry about. The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implement	2016-02-10 04:08:24 -05:00
Steve Reinhardt	f6b828d068	style: eliminate explicit boolean comparisons Result of running 'hg m5style --skip-all --fix-control -a' to get rid of '== true' comparisons, plus trivial manual edits to get rid of '== false'/'== False' comparisons. Left a couple of explicit comparisons in where they didn't seem unreasonable: invalid boolean comparison in src/arch/mips/interrupts.cc:155 >> DPRINTF(Interrupt, "Interrupts OnCpuTimerINterrupt(tc) == true\n");<< invalid boolean comparison in src/unittest/unittest.hh:110 >> "EXPECT_FALSE(" #expr ")", (expr) == false)<<	2016-02-06 17:21:20 -08:00
Steve Reinhardt	5592798865	style: fix missing spaces in control statements Result of running 'hg m5style --skip-all --fix-control -a'.	2016-02-06 17:21:19 -08:00
Steve Reinhardt	dc8018a5c3	style: remove trailing whitespace Result of running 'hg m5style --skip-all --fix-white -a'.	2016-02-06 17:21:18 -08:00
Brad Beckmann	dcd8eeec3b	ruby: removed Write_Only AccessPermission	2016-01-22 10:42:12 -05:00
David Hashe	698866d461	ruby: split CPU and GPU latency stats	2015-07-20 09:15:18 -05:00
Tony Gutierrez	1a7d3f9fcb	gpu-compute: AMD's baseline GPU model	2016-01-19 14:28:22 -05:00
Tony Gutierrez	28e353e040	mem: write combining for ruby protocols This patch adds support for write-combining in ruby.	2016-01-19 14:05:03 -05:00
Tony Gutierrez	d658b6e1cc	mem: support for gpu-style RMWs in ruby This patch adds support for GPU-style read-modify-write (RMW) operations in ruby. Such atomic operations are traditionally executed at the memory controller (instead of through an L1 cache using cache-line locking). Currently, this patch works by propagating operation functors through the memory system.	2016-01-19 13:57:50 -05:00
Blake Hechtman	34fb6b5e35	mem: misc flags for AMD gpu model This patch adds support to mark memory requests/packets with attributes defined in HSA, such as memory order and scope.	2015-07-20 09:15:18 -05:00
Steve Reinhardt	8406a54907	mem: fix bug in packet access endianness changes The new Packet::setRaw() method incorrectly still contained an htog() conversion. As a result, calls to the old set() method (now defined as setRaw(htog(v))) underwent two htog conversions, which breaks things when htog() is not a no-op. Interestingly the only test that caught this was a SPARC boot test, where an IsaFake device with a non-zero return value was getting swapped twice resulting in a register getting loaded with 0x100000000000000 instead of 1. (Good reason for keeping SPARC around, perhaps?)	2016-01-11 16:20:38 -05:00
Andreas Hansson	12eb034378	scons: Enable -Wextra by default Make best use of the compiler, and enable -Wextra as well as -Wall. There are a few issues that had to be resolved, but they are all trivial.	2016-01-11 05:52:20 -05:00
Steve Reinhardt	6caa2c9b4e	mem: add CacheVerbose debug flag, filter noisy DPRINTFs Some of the DPRINTFs added to the classic cache in cset 45df88079f04, while useful to those unfamiliar with the cache code, end up being noise when you're familiar with the code but are trying to debug tricky protocol issues. (Particularly getting two messages from each cache as it receives a snoop request then declares that there was no match.) This patch introduces a CacheVerbose debug flag, and moves a subset of the added DPRINTFs into that category, so that Cache by itself returns to being a more succinct summary of cache activity. Also added a CacheAll compound flag to turn on all the cache-related debug flags (other than CacheTags, which you really have to want badly to turn it on, IMO).	2015-12-31 09:32:09 -08:00
Andreas Hansson	c153b669fd	mem: Do not rely on the NeedsWritable flag for responses This patch removes the NeedsWritable flag for all responses, as it is really only the request that needs a writable response. The response, on the other hand, should in these cases always provide the line in a writable state, as indicated by the hasSharers flag not being set. When we send requests that have NeedsWritable set, the response will always have the hasSharers flag not set. Additionally, there are cases where the request did not have NeedsWritable set, and we still get a writable response with the hasSharers flag not set. This never happens on snoops, but is used by downstream caches to pass ownership upstream. As part of this patch, the affected response types are updated, and the snoop filter is similarly modified to check only the hasSharers flag (as it should). A sanity check is also added to the packet class, asserting that we never look at the NeedsWritable flag for responses. No regressions are affected.	2015-12-31 09:34:18 -05:00
Andreas Hansson	7fca994d04	mem: Do not allocate space for packet data if not needed This patch looks at the request and response command to determine if either actually has any data payload, and if not, we do not allocate any space for packet data. The only tricky case is where the command type is changed as part of the MSHR functionality. In these cases where the original packet had no data, but the new packet does, we need to explicitly call allocate().	2015-12-31 09:33:39 -05:00
Andreas Hansson	f1ec326be5	mem: Do not alter cache block state on uncacheable snoops This patch ensures we do not respond with a Modified (dirty and writable) line if the request is uncacheable, and that the cache responding retains the line without modifying the state (even if responding).	2015-12-31 09:33:25 -05:00
Andreas Hansson	0fcb376e5f	mem: Make cache terminology easier to understand This patch changes the name of a bunch of packet flags and MSHR member functions and variables to make the coherency protocol easier to understand. In addition the patch adds and updates lots of descriptions, explicitly spelling out assumptions. The following name changes are made: * the packet memInhibit flag is renamed to cacheResponding * the packet sharedAsserted flag is renamed to hasSharers * the packet NeedsExclusive attribute is renamed to NeedsWritable * the packet isSupplyExclusive is renamed responderHadWritable * the MSHR pendingDirty is renamed to pendingModified The cache states, Modified, Owned, Exclusive, Shared are also called out in the cache and MSHR code to make it easier to understand.	2015-12-31 09:32:58 -05:00
Tony Gutierrez	a317764577	ruby: slicc: have a static MachineType This patch is imported from reviewboard patch 2551 by Nilay. This patch moves from a dynamically defined MachineType to a statically defined one. The need for this patch was felt since a dynamically defined type prevents us from having types for which no machine definition may exist. The following changes have been made: i. each machine definition now uses a type from the MachineType enumeration instead of any random identifier. This required changing the grammar and the .sm files. ii. MachineType enumeration defined statically in RubySlicc_Exports.sm. * * normal protocol fixes for nilay's parser machine type fix	2015-07-20 09:15:18 -05:00
Tony Gutierrez	3f68884c0e	ruby: slicc: remove support for single machine, multiple types This patch is imported from reviewboard patch 2550 by Nilay. It was possible to specify multiple machine types with a single state machine. This seems unnecessary and is being removed.	2015-07-20 09:15:18 -05:00
Andreas Hansson	f5c4a45889	mem: Explicitly check MSHR snoops for cases not dealt with Add a sanity check to make it explicit that we currently do not allow an I/O coherent agent to directly issue writes into the coherent part of the memory system (it has to go via a cache, and get transformed into a read ex, upgrade or invalidation).	2015-12-28 11:14:18 -05:00
Andreas Hansson	f6525ff221	mem: Remove unused cache squash functionality This patch removes the unused squash function from the MSHR queue, and the associated (and also unused) threadNum member from the MSHR.	2015-12-28 11:14:16 -05:00
Andreas Hansson	fbf3987c7b	mem: Avoid unnecessary checks when creating HardPFReq in cache The checks made before sending out a HardPFReq were unnecessarily complex, and checked for cases that never occur. This patch tidies it up.	2015-12-28 11:14:15 -05:00
Andreas Hansson	b93a9d0d51	mem: Do not use sender state to track forwarded snoops in cache This patch changes how the cache tracks which snoops are forwarded, and which ones are created locally. Previously the identification was based on an empty sender state of a specific class, but this method fails to distinguish which cache actually attached the sender state. Instead we use the same mechanism as the crossbar, and keep track of the requests that have outstanding snoops.	2015-12-28 11:14:14 -05:00
Andreas Hansson	036263e280	mem: Fix cache sender state handling and add clarification This patch addresses a bug in how the cache attached the MSHR as a sender state. Rather than overwriting any existing sender state it now pushes a new one. The handling of upward snoops is also clarified.	2015-12-28 11:14:10 -05:00
Andreas Hansson	97887eb6dc	mem: Fix memory allocation bug in deferred snoop handling This patch fixes a corner case in the deferred snoop handling, where requests ended up being used by multiple packets with different lifetimes, and inadvertently got deleted while they were still in use.	2015-12-17 17:07:11 -05:00
David Hashe	f5f04c3120	mem: add request types for acquire and release Add support for acquire and release requests. These synchronization operations are commonly supported by several modern instruction sets.	2015-07-20 09:15:18 -05:00
Brad Beckmann	173a786921	ruby: more flexible ruby tester support This patch allows the ruby random tester to use ruby ports that may only support instr or data requests. This patch is similar to a previous changeset (8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets. This current patch implements the support in a more straight-forward way. Since retries are now tested when running the ruby random tester, this patch splits up the retry and drain check behavior so that RubyPort children, such as the GPUCoalescer, can perform those operations correctly without having to duplicate code. Finally, the patch also includes better DPRINTFs for debugging the tester.	2015-07-20 09:15:18 -05:00
Tony Gutierrez	413f3088ea	mem: remove acq/rel cmds from packet and add mem fence req	2015-12-09 22:56:31 -05:00
Radhika Jagtap	54519fd51f	cpu: Support virtual addr in elastic traces This patch adds support to optionally capture the virtual address and asid for load/store instructions in the elastic traces. If they are present in the traces, Trace CPU will set those fields of the request during replay.	2015-12-07 16:42:16 -06:00
Radhika Jagtap	36bb848104	mem: Add instruction sequence number to request This patch adds the instruction sequence number to the request and provides a request constructor that accepts a sequence number for initialization.	2015-12-07 16:42:15 -06:00
Andreas Hansson	72b14f7ef6	mem: Fix search-replace issues in DRAMPower wrapper license Fix a number of unintentional insertions of 'const'.	2015-11-25 13:52:56 -05:00
Andreas Sandberg	2a6fe97092	arm: Add missing explicit overrides for classic caches Make clang when compiling on OSX.	2015-11-15 21:28:00 +00:00
Brad Beckmann	95f20a2905	ruby: added stl vector of ints to be used by SLICC	2015-07-20 09:15:20 -05:00
Tony Gutierrez	d10fac27bc	slicc: fixes for the Address to Addr changeset (11025) misc changes now that Address has become Addr including int to address util function	2015-11-13 17:30:58 -05:00
Joe Gross	5143d480f3	ruby: add BoolVec The BoolVec typedef and insertion operator overload function simplify usage of vectors of type bool	2015-11-13 17:30:56 -05:00
Brad Beckmann	aef8d851bd	mem: add boolean to disable PacketQueue's size sanity check the sanity check, while generally useful for exposing memory system bugs, may be spurious with respect to GPU workloads, which may generate many more requests than typical CPU workloads. the large number of requests generated by the GPU may cause the req/resp queues to back up, thus queueing more than 100 packets.	2015-07-20 09:15:18 -05:00
Andreas Hansson	7433d77fcf	mem: Add an option to perform clean writebacks from caches This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated. The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable. The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py	2015-11-06 03:26:43 -05:00
Andreas Hansson	654266f39c	mem: Add cache clusivity This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive. The choice of policy guides the behaviour on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective. This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache. Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow on patch adds the clean writebacks. The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly).	2015-11-06 03:26:41 -05:00
Ali Jafri	f02a9338c1	mem: Avoid unnecessary snoops on writebacks and clean evictions This patch optimises the handling of writebacks and clean evictions when using a snoop filter. Instead of snooping into the caches to determine if the block is cached or not, simply set the status based on the snoop-filter result.	2015-11-06 03:26:40 -05:00
Andreas Hansson	c086c20bd2	mem: Order packet queue only on matching addresses Instead of conservatively enforcing order for all packets, which may negatively impact the simulated-system performance, this patch updates the packet queue such that it only applies the restriction if there are already packets with the same address in the queue. The basic need for the order enforcement is due to coherency interactions where requests/responses to the same cache line must not over-take each other. We rely on the fact that any packet that needs order enforcement will have a block-aligned address. Thus, there is no need for the queue to know about the cacheline size.	2015-11-06 03:26:38 -05:00
Ali Jafri	52c8ae5187	mem: Enforce insertion order on the cache response path This patch enforces insertion order transmission of packets on the response path in the cache. Note that the logic to enforce order is already present in the packet queue, this patch simply turns it on for queues in the response path. Without this patch, there are corner cases where a request-response is faster than a response-response forwarded through the cache. This violation of queuing order causes problems in the snoop filter leaving it with inaccurate information. This causes assert failures in the snoop filter later on. A follow on patch relaxes the order enforcement in the packet queue to limit the performance impact.	2015-11-06 03:26:37 -05:00
Andreas Hansson	6b70afd0d4	mem: Use the packet delays and do not just zero them out This patch updates the I/O devices, bridge and simple memory to take the packet header and payload delay into account in their latency calculations. In all cases we add the header delay, i.e. the accumulated pipeline delay of any crossbars, and the payload delay needed for deserialisation of any payload. Due to the additional unknown latency contribution, the packet queue of the simple memory is changed to use insertion sorting based on the time stamp. Moreover, since the memory hands out exclusive (non shared) responses, we also need to ensure ordering for reads to the same address.	2015-11-06 03:26:36 -05:00
Andreas Hansson	8bc925e36d	mem: Align rules for sinking inhibited packets at the slave This patch aligns how the memory-system slaves, i.e. the various memory controllers and the bridge, identify and deal with sinking of inhibited packets that are only useful within the coherent part of the memory system. In the future we could shift the onus to the crossbar, and add a parameter "is_point_of_coherence" that would allow it to sink the aforementioned packets.	2015-11-06 03:26:35 -05:00
Andreas Hansson	8e55d51aaa	mem: Do not treat CleanEvict as a write operation This patch changes the CleanEvict command type to not be considered a write. Initially it was made a zero-sized write to match the writeback command, but as things developed it became clear that it causes more problems than it solves. For example, the memory modules (and bridge) should not consider the CleanEvict as a write, but instead discard it. With this patch it will be neither a read, nor write, and as it does not need a response the slave will simply sink it.	2015-11-06 03:26:33 -05:00
Andreas Hansson	ac1368df50	mem: Unify delayed packet deletion This patch unifies how we deal with delayed packet deletion, where the receiving slave is responsible for deleting the packet, but the sending agent (e.g. a cache) is still relying on the pointer until the call to sendTimingReq completes. Previously we used a mix of a deletion vector and a construct using unique_ptr. With this patch we ensure all slaves use the latter approach.	2015-11-06 03:26:21 -05:00
Andreas Hansson	2cb5467e85	misc: Appease clang static analyzer A few minor fixes to issues identified by the clang static analyzer.	2015-11-06 03:26:16 -05:00

1920 commits