sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Nilay Vaish	9ea5d9cad9	ruby: rename variables Addr to addr Avoid clash between type Addr and variable name Addr.	2015-08-14 12:04:47 -05:00
Joel Hestness	905c0b347c	ruby: Protocol changes for SimObject MessageBuffers	2015-08-14 00:19:45 -05:00
Joel Hestness	581bae9ecb	ruby: Expose MessageBuffers as SimObjects Expose MessageBuffers from SLICC controllers as SimObjects that can be manipulated in Python. This patch has numerous benefits: 1) First and foremost, it exposes MessageBuffers as SimObjects that can be manipulated in Python code. This allows parameters to be set and checked in Python code to avoid obfuscating parameters within protocol files. Further, now as SimObjects, MessageBuffer parameters are printed to config output files as a way to track parameters across simulations (e.g. buffer sizes) 2) Cleans up special-case code for responseFromMemory buffers, and aligns their instantiation and use with mandatoryQueue buffers. These two special buffers are the only MessageBuffers that are exposed to components outside of SLICC controllers, and they're both slave ends of these buffers. They should be exposed outside of SLICC in the same way, and this patch does it. 3) Distinguishes buffer-specific parameters from buffer-to-network parameters. Specifically, buffer size, randomization, ordering, recycle latency, and ports are all specific to a MessageBuffer, while the virtual network ID and type are intrinsics of how the buffer is connected to network ports. The former are specified in the Python object, while the latter are specified in the controller *.sm files. Unlike buffer-specific parameters, which may need to change depending on the simulated system structure, buffer-to-network parameters can be specified statically for most or all different simulated systems.	2015-08-14 00:19:44 -05:00
Joel Hestness	bf06911b3f	ruby: Change PerfectCacheMemory::lookup to return pointer CacheMemory and DirectoryMemory lookup functions return pointers to entries stored in the memory. Bring PerfectCacheMemory in line with this convention, and clean up SLICC code generation that was in place solely to handle references like that which was returned by PerfectCacheMemory::lookup.	2015-08-14 00:19:39 -05:00
Joel Hestness	9567c839fe	ruby: Remove the RubyCache/CacheMemory latency The RubyCache (CacheMemory) latency parameter is only used for top-level caches instantiated for Ruby coherence protocols. However, the top-level cache hit latency is assessed by the Sequencer as accesses flow through to the cache hierarchy. Further, protocol state machines should be enforcing these cache hit latencies, but RubyCaches do not expose their latency to any existing state machines through the SLICC/C++ interface. Thus, the RubyCache latency parameter is superfluous for all caches. This is confusing for users. As a step toward pushing L0/L1 cache hit latency into the top-level cache controllers, move their latencies out of the RubyCache declarations and over to their Sequencers. Eventually, these Sequencer parameters should be exposed as parameters to the top-level cache controllers, which should assess the latency. NOTE: Assessing these latencies in the cache controllers will require modifying each to eliminate instantaneous Ruby hit callbacks in transitions that finish accesses, which is likely a large undertaking.	2015-08-14 00:19:37 -05:00
Nilay Vaish	380a2ca918	ruby: slicc: allow mathematical operations on Ticks	2015-08-11 11:39:23 -05:00
Andreas Sandberg	bbb3abc167	mem: Cleanup packet accessor methods The Packet::get() and Packet::set() methods both have very strange semantics. Currently, they automatically convert between the guest system's endianness and the host system's endianness. This behavior is usually undesired and unexpected. This patch introduces three new method pairs to access data: * getLE() / setLE() - Get data stored as little endian. * getBE() / setBE() - Get data stored as big endian. * get(ByteOrder) / set(v, ByteOrder) - Configurable endianness For example, a little endian device that is receiving a write request will use the getLE() method to get the data from the packet. The old interface will be deprecated once all existing devices have been ported to the new interface.	2015-08-07 09:59:28 +01:00
Andreas Sandberg	53e777d683	base: Declare a type for context IDs Context IDs used to be declared as ad hoc (usually as int). This changeset introduces a typedef for ContextIDs and a constant for invalid context IDs.	2015-08-07 09:59:13 +01:00
Andreas Hansson	83a668ad25	mem: Remove extraneous acquire/release flags and attributes This patch removes the extraneous flags and attributes from the request and packet, and simply leaves the new commands. The change introduced when adding acquire/release breaks all compatibility with existing traces, and there is really no need for any new flags and attributes. The commands should be sufficient. This patch fixes packet tracing (urgent), and also removes the unnecessary complexity.	2015-08-07 04:55:38 -04:00
Andreas Sandberg	0194e6eb2d	mem: Fixup incorrect include guards	2015-08-05 10:12:12 +01:00
Andreas Sandberg	a3f49f60c7	mem: Move trace functionality from the CommMonitor to a probe This changeset moves the access trace functionality from the CommMonitor into a separate probe. The probe can be hooked up to any component that exports probe points of the type ProbePoints::Packet. This patch moves the dependency on Google's Protocol Buffers library from the CommMonitor to the MemTraceProbe, which means that the CommMonitor (including stack distance profiling) no longer depends on it.	2015-08-04 10:29:13 +01:00
Andreas Sandberg	022e69e6de	mem: Redesign the stack distance calculator as a probe This changeset removes the stack distance calculator hooks from the CommMonitor class and implements a stack distance calculator as a memory system probe instead. The probe can be hooked up to any component that exports probe points of the type ProbePoints::Packet.	2015-08-04 10:29:13 +01:00
Andreas Sandberg	feded87fc9	mem: Add probe support to the CommMonitor This changeset adds a standardized probe point type to monitor packets in the memory system and adds two probe points to the CommMonitor class. These probe points enable monitoring of successfully delivered requests and successfully delivered responses. Memory system probe listeners should use the BaseMemProbe base class to provide a unified configuration interface and reuse listener registration code. Unlike the ProbeListenerObject class, the BaseMemProbe allows objects to be wired to multiple ProbeManager instances as long as they use the same probe point name.	2015-08-04 10:29:13 +01:00
Timothy Jones	96091f358b	ruby: Fix checkpointing and restore There are 2 problems with the existing checkpoint and restore code in ruby. The first is that when the event queue is altered by ruby during serialization, some events that are currently scheduled cannot be found (e.g. the event to stop simulation that always lives on the queue), causing a panic. The second is that ruby is sometimes serialized after the memory system, meaning that the dirty data in its cache is flushed back to memory too late and so isn't included in the checkpoint. These are fixed by implementing memory writeback in ruby, using the same technique of hijacking the event queue, but first descheduling all events that are currently on it. They are saved, along with their scheduled time, so that the event queue can be faithfully reconstructed after writeback has finished. Events with the AutoDelete flag set will delete themselves when they are descheduled, causing an error when attempting to schedule them again. This is fixed by simply not recording them when taking them off the queue. Writeback is still implemented using flushing, so the cache recorder object, that is created to generate the trace and manage flushing, is kept around and used during serialization to write the trace to disk. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-08-03 23:08:40 -05:00
Nilay Vaish	676ae57827	ruby: mesi three level: multiple corrections to the protocol 1. Eliminate state NP in L0 and L1 Caches: The two states 'NP' and 'I' both mean that the cache block is not present in the cache. 'I' also means that the cache entry has been allocated. This causes problems when we do not correctly initialize the cache entry when it is re-used. Hence, this patch eliminates the state NP altogether. Every time a new block comes into the cache, a cache entry is allocated. Every time a block leaves, the corresponding entry is deallocated. 2. Separate transient state for instruction fetches: purely for accounting purposes. 3. Drop state IS_I in L1 Cache and the message type STALE_DATA: when invalidation is received for a block in IS, the block used to be moved to IS_I. This meant that the data that would arrive in future would be used but not stored since the controller lost the permissions after gaining them. This state is being dropped and now invalidation messages would not be processed till the data has arrived. This also means that STALE_DATA type is no longer required.	2015-08-03 22:44:29 -05:00
Nilay Vaish	9bf3b8828a	ruby: mesi two,three level: copy data only when dirty The level 2 controller has a bug. In one particular action, the data block was copied from a message irrespective of whether the block is dirty or not. In cases when L1 sends no data, the data value copied was incorrect.	2015-08-03 22:44:28 -05:00
Brad Beckmann	03f2b8c23d	ruby: removed invalid assert in message comparator It is perfectly valid to compare the same message and the greater than operator should work correctly.	2015-08-01 12:59:47 -04:00
Brad Beckmann	6b52f828cc	ruby: improved stall and wait debugging Added dprintfs and asserts for identifying stall and wait bugs.	2015-07-20 09:15:18 -05:00
Brad Beckmann	848861a17d	slicc: fix error in conflicting symbol declaration	2015-07-20 09:15:18 -05:00
Brad Beckmann	8a54adc2a5	slicc: enable overloading in functions not in classes For many years the slicc symbol table has supported overloaded functions in external classes. This patch extends that support to functions that are not part of classes (a.k.a. no parent). For example, this support allows slicc to understand that mapAddressToRange is overloaded and the NodeID is an optional parameter.	2015-07-20 09:15:18 -05:00
David Hashe	0d00cbc97b	ruby: change router pipeline stages to 2 This patch changes the router pipeline stages from 4 to 2. The canonical 4-stage router is conservative while a lower-latency router with look ahead routing and speculative allocation is well acknowledged.	2015-07-20 09:15:18 -05:00
David Hashe	8b32dad4d8	ruby: change advance_stage for flit_d Sets m_stage.second to the second parameter of the function. Then, for every place where advance_stage is called, adds a cycle to the argument being passed.	2015-07-20 09:15:18 -05:00
Brad Beckmann	f9fa242f42	slicc: improved stalling support in protocols Adds features to allow protocols to reschedule controllers when conditionally stalling within inport logic or actions. Also ensures that resource and protocol stalls are re-evaluated the next cycle.	2015-07-20 09:15:18 -05:00
David Hashe	c4ffd4989c	ruby: expose access permission to replacement policies This patch adds support that allows the replacement policy to identify each cache block's access permission. This information can be useful when making replacement decisions.	2015-07-20 09:15:18 -05:00
David Hashe	967cfa939a	ruby: adds size and empty apis to the msg buffer stallmap	2015-07-20 09:15:18 -05:00
David Hashe	21aa5734a0	ruby: fix deadlock bug in banked array resource checks The Ruby banked array resource checks (initiated from SLICC) did a check and allocate at the same time. If a transition needs more than one resource, then it might check/allocate resource #1, then fail to get resource #2. Another transition might then try to get the same resources, but in reverse order. Deadlock. This patch separates resource checking and resource reservation into two steps to avoid deadlock.	2015-07-20 09:15:18 -05:00
David Hashe	63a9f10de8	ruby: Fix for stallAndWait bug It was previously possible for a stalled message to be reordered after an incoming message. This patch ensures that any stalled message stays in its original request order.	2015-07-20 09:15:18 -05:00
David Hashe	6511ab4654	mem: add request types for acquire and release Add support for acquire and release requests. These synchronization operations are commonly supported by several modern instruction sets.	2015-07-20 09:15:18 -05:00
David Hashe	7e9562013b	ruby: allocate a block in CacheMemory without updating LRU state	2015-07-20 09:15:18 -05:00
David Hashe	7e00772bda	ruby: speed up function used for cache walks This patch adds a few helpful functions that allow .sm files to directly invalidate all cache blocks using a trigger queue rather than rely on each individual cache block to be invalidated via requests from the mandatory queue.	2015-07-20 09:15:18 -05:00
David Hashe	3454a4a36e	slicc: support for arbitrary DPRINTF flags (not just RubySlicc) This patch allows DPRINTFs to be used in SLICC state machines similar to how they are used by the rest of gem5. Previously all DPRINTFs in the .sm files had to use the RubySlicc flag.	2015-07-20 09:15:18 -05:00
David Hashe	9324239922	slicc: support for local variable declarations in action blocks	2015-07-20 09:15:18 -05:00
David Hashe	1850ed410f	ruby: initialize replacement policies with their own simobjs this is in preparation for other replacement policies that take additional parameters.	2015-07-20 09:15:18 -05:00
David Hashe	74ca89f8b7	ruby: give access to cache tag/data latencies from SLICC This patch exposes the tag and data array latencies to the SLICC state machines so that it can be used to determine the correct enqueue latency for response messages.	2015-07-20 09:15:18 -05:00
David Hashe	536e3664e4	slicc: support for multiple cache entry types in the same state machine To have multiple Entry types (e.g., a cache Entry type and a directory Entry type), just declare one of them as a secondary type by using the pair 'main="false"', e.g.: structure(DirEntry, desc="...", interface="AbstractCacheEntry", main="false") { ...and the primary type would be declared: structure(Entry, desc="...", interface="AbstractCacheEntry") {	2015-07-20 09:15:18 -05:00
David Hashe	910638f338	slicc: Fix bug in enqueue and peek statements. These were not generating the correct c names for types declared within a machine scope.	2015-07-20 09:15:18 -05:00
David Hashe	3d8c8a85fa	slicc: fix missing inline function in LocalVariableAST	2015-07-20 09:15:18 -05:00
David Hashe	93fff6f636	slicc: improve support for prefix operations This patch fixes the type handling when prefix operations are used. Previously prefix operators would assume a void return type, which made it impossible to combine prefix operations with other expressions. This patch allows SLICC programmers to use prefix operations more naturally.	2015-07-20 09:15:18 -05:00
David Hashe	ee0d414fa8	slicc: support for transitions with a wildcard next state This patch adds support for transitions of the form: transition(START, EVENTS, *) { ACTIONS } This allows a machine to collapse states that differ only in the next state transition into one, and can help shorten/simplify some protocols significantly. When * is encountered as an end state of a transition, the next state is determined by calling the machine-specific getNextState function. The next state is determined before any actions of the transition execute, and therefore the next state calculation cannot depend on any of the transition actions.	2015-07-20 09:15:18 -05:00
David Hashe	6a288d9de3	slicc: support for multiple message types on the same buffer This patch allows SLICC protocols to use more than one message type with a message buffer. For example, you can declare two in ports as such: in_port(ResponseQueue_in, ResponseMsg, responseFromDir, rank=3) { ... } in_port(tgtResponseQueue_in, TgtResponseMsg, responseFromDir, rank=2) { ... }	2015-07-20 09:15:18 -05:00
Brad Beckmann	b609b032aa	slicc: fatal->panic on invalid transitions	2015-08-01 12:37:52 -04:00
David Hashe	3444d5f359	mem: Hit callback delay fix This patch was created by Bihn Pham during his internship at AMD. There is no need to delay hit callback response messages by a cycle because the response latency is already incurred in the Ruby protocol. This ensures correct timing of memory instructions.	2015-07-20 09:15:18 -05:00
Brad Beckmann	0c78abb302	ruby: re-added the addressToInt slicc interface function This helper function is very useful for converting address offsets to integers that can be used for protocol specific destination mapping.	2015-07-20 09:15:18 -05:00
Brad Beckmann	4710eba588	ruby: add useful dprints to sequencer Added two data block dprints that are useful when tracking down data check failures in the ruby random tester.	2015-07-20 09:15:18 -05:00
David Hashe	a254786a19	slicc: isinstance bugfix This fix prevents spurious errors when searching for a symbol that may be located in one of multiple symbol tables.	2015-07-20 09:15:18 -05:00
Andreas Hansson	6fac40ceb0	mem: Add missing clean eviction on uncacheable access This patch adds a missing clean eviction, occurring when an uncacheable access flushes and invalidates an existing block.	2015-07-30 03:42:25 -04:00
Andreas Hansson	540e59fd70	mem: Remove unused RequestCause in cache This patch removes the RequestCause, and also simplifies how we schedule the sending of packets through the memory-side port. The deassertion of bus requests is removed as it is not used.	2015-07-30 03:41:43 -04:00
David Guillen-Fandos	0c89c15b23	mem: Make caches way aware This patch makes cache sets aware of the way number. This enables some nice features such as the ability to restrict way allocation. The implemented mechanism allows setting a maximum way number to be allocated 'k' which must fulfill 0 < k <= N (where N is the number of ways). In the future more sophisticated mechanisms can be implemented.	2015-07-30 03:41:42 -04:00
Andreas Hansson	5a18e181ff	mem: Transition away from isSupplyExclusive for writebacks This patch changes how writebacks communicate whether the line is passed as modified or owned. Previously we relied on the isSupplyExclusive mechanism, which was originally designed to avoid unnecessary snoops. For normal cache requests we use the sharedAsserted mechanism to determine if a block should be marked writeable or not, and with this patch we transition the writebacks to also use this mechanism. Conceptually this is cleaner and more consistent.	2015-07-30 03:41:40 -04:00
Andreas Hansson	5902e29e84	mem: Tidy up CacheBlk class This patch modernises and tidies up the CacheBlk, removing dead code.	2015-07-30 03:41:39 -04:00
Andreas Hansson	41b39b22cd	mem: Tidy up packet Some minor fixes and removal of dead code. Changing the flags to be enums rather than static const (to avoid any linking issues caused by the latter). Also adding a getBlockAddr member which hopefully can slowly find its way into caches, snoop filters etc.	2015-07-30 03:41:38 -04:00
Brandon Potter	582793468d	ruby: dma sequencer: removes redundant code	2015-07-24 12:25:22 -07:00
Nilay Vaish	1b71a20391	ruby: network: NetworkLink inherits from Consumer now.	2015-07-22 11:20:07 -05:00
Andreas Hansson	5410660919	mem: Fix (ab)use of emplace to avoid temporary object creation	2015-07-13 08:46:28 -04:00
Andreas Hansson	d870c399d3	mem: Updated DRAMSim2 wrapper to new drain API Somehow this one slipped through without being updated.	2015-07-13 08:46:16 -04:00
Brandon Potter	bfe7ee96ad	ruby: replace global g_abs_controls with per-RubySystem var This is another step in the process of removing global variables from Ruby to enable multiple RubySystem instances in a single simulation. The list of abstract controllers is per-RubySystem and should be represented that way, rather than as a global. Since this is the last remaining Ruby global variable, the src/mem/ruby/Common/Global.* files are also removed.	2015-07-10 16:05:24 -05:00
Brandon Potter	f9a370f172	ruby: replace global g_system_ptr with per-object pointers This is another step in the process of removing global variables from Ruby to enable multiple RubySystem instances in a single simulation. With possibly multiple RubySystem objects, we can no longer use a global variable to find "the" RubySystem object. Instead, each Ruby component has to carry a pointer to the RubySystem object to which it belongs.	2015-07-10 16:05:23 -05:00
Brandon Potter	c38f5098b1	ruby: replace g_ruby_start with per-RubySystem m_start_cycle This patch begins the process of removing global variables from the Ruby source with the goal of eventually allowing users to create multiple Ruby instances in a single simulation. Currently, users cannot do so because several global variables and static members are referenced by the RubySystem object in a way that assumes that there will only ever be a single RubySystem. These need to be replaced with per-RubySystem equivalents. This specific patch replaces the global var g_ruby_start, which is used to calculate throughput statistics for Throttles in simple networks and links in Garnet networks, with a RubySystem instance var m_start_cycle.	2015-07-10 16:05:23 -05:00
Brandon Potter	9eda4bdc5a	ruby: remove extra whitespace and correct misspelled words	2015-07-10 16:05:23 -05:00
Andreas Sandberg	ed38e3432c	sim: Refactor and simplify the drain API The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining. This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error. Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects.	2015-07-07 09:51:05 +01:00
Andreas Sandberg	f16c0a4a90	sim: Decouple draining from the SimObject hierarchy Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this. This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object. While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining. A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains.	2015-07-07 09:51:05 +01:00
Andreas Sandberg	e9c3d59aae	sim: Make the drain state a global typed enum The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	76cd4393c0	sim: Refactor the serialization base class Objects that can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize().
Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.	2015-07-07 09:51:03 +01:00
Andreas Sandberg	777cc71c4a	mem: Cleanup CommMonitor in preparation for probe support Make configuration parameters constant and get rid of an unnecessary dependency on the Time class.	2015-07-06 17:08:53 +01:00
Nilay Vaish	d29d7c41f1	mem: packet: Add const to constructor argument	2015-07-04 10:43:46 -05:00
Nilay Vaish	16ac48e6a4	ruby: drop NetworkMessage class This patch drops the NetworkMessage class. The relevant data members and functions have been moved to the Message class, which was the parent of NetworkMessage.	2015-07-04 10:43:46 -05:00
Nilay Vaish	baa3eb0de3	ruby: mesi three level: name change to avoid clash The accessor function getDestination() for Destination variable in the coherence message clashes with the getDestination() that is part of the Message class. Hence the name change.	2015-07-04 10:43:46 -05:00
Nilay Vaish	b4efb48a71	ruby: remove message buffer node This structure's only purpose was to provide a comparison function for ordering messages in the MessageBuffer. The comparison function is now being moved to the Message class itself. So we no longer require this structure.	2015-07-04 10:43:46 -05:00
Andreas Hansson	7e711c98f8	mem: Increase the default buffer sizes for the DDR4 controller This patch increases the default read/write buffer sizes for the DDR4 controller config to values that are more suitable for the high bandwidth and high bank count.	2015-07-03 10:14:48 -04:00
Wendy Elsasser	31f901b69d	mem: Update DRAM command scheduler for bank groups This patch updates the command arbitration so that bank group timing as well as rank-to-rank delays will be taken into account. The resulting arbitration no longer selects commands (prepped or not) that cannot issue seamlessly if there are commands that can issue back-to-back, minimizing the effect of rank-to-rank (tCS) & same bank group (tCCD_L) delays. The arbitration selects a new command based on the following priority. Within each priority band, the arbitration will use FCFS to select the appropriate command: 1) Bank is prepped and burst can issue seamlessly, without a bubble 2) Bank is not prepped, but can prep and issue seamlessly, without a bubble 3) Bank is prepped but burst cannot issue seamlessly. In this case, a bubble will occur on the bus Thus, to enable more parallelism in subsequent selections, an unprepped packet is given higher priority if the bank prep can be hidden. If the bank prep cannot be hidden, the selection logic will choose a prepped packet that cannot issue seamlessly if one exists. Otherwise, the default selection will choose the packet with the minimum bank prep delay.	2015-07-03 10:14:46 -04:00
Andreas Hansson	b56167b682	mem: Avoid DRAM write queue iteration for merging and read lookup This patch adds a simple lookup structure to avoid iterating over the write queue to find read matches, and for the merging of write bursts. Instead of relying on iteration we simply store a set of currently-buffered write-burst addresses and compare against these. For the reads we still perform the iteration if we have a match. For the writes, we rely entirely on the set. Note that there are corner-cases where sub-bursts would actually not be mergeable without a read-modify-write. We ignore these cases and opt for speed.	2015-07-03 10:14:45 -04:00
Andreas Hansson	db85ddca1a	mem: Delay responses in the crossbar before forwarding This patch changes how the crossbar classes deal with responses. Instead of forwarding responses directly and burdening the neighbouring modules in paying for the latency (through the pkt->headerDelay), we now queue them before sending them. The coherency protocol is not affected as requests and any snoop requests/responses are still passed on in zero time. Thus, the responses end up paying for any header delay accumulated when passing through the crossbar. Any latency incurred on the request path will be paid for on the response side, if no other module has dealt with it. As a result of this patch, responses are returned at a later point. This affects the number of outstanding transactions, and quite a few regressions see an impact in blocking due to no MSHRs, increased cache-miss latencies, etc. Going forward we should be able to use the same concept also for snoop responses, and any request that is not an express snoop.	2015-07-03 10:14:44 -04:00
Andreas Hansson	b93c912013	mem: Remove redundant is_top_level cache parameter This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed. This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).	2015-07-03 10:14:43 -04:00
Andreas Hansson	71856cfbbc	mem: Split WriteInvalidateReq into write and invalidate WriteInvalidateReq ensures that a whole-line write does not incur the cost of first doing a read exclusive, only to later overwrite the data. This patch splits the existing WriteInvalidateReq into a WriteLineReq, which is done locally, and an InvalidateReq that is sent out throughout the memory system. The WriteLineReq re-uses the normal WriteResp. The change allows us to better express the difference between the cache that is performing the write, and the ones that are merely invalidating. As a consequence, we no longer have to rely on the isTopLevel flag. Moreover, the actual memory in the system does not see the initial write, only the writeback. We were marking the written line as dirty already, so there is really no need to also push the write all the way to the memory. The overall flow of the write-invalidate operation remains the same, i.e. the operation is only carried out once the response for the invalidate comes back. This patch adds the InvalidateResp for this very reason.	2015-07-03 10:14:41 -04:00
Andreas Hansson	0ddde83a47	mem: Add ReadCleanReq and ReadSharedReq packets This patch adds two new read requests packets: ReadCleanReq - For a cache to explicitly request clean data. The response is thus exclusive or shared, but not owned or modified. The read-only caches (see previous patch) use this request type to ensure they do not get dirty data. ReadSharedReq - We add this to distinguish cache read requests from those issued by other masters, such as devices and CPUs. Thus, devices use ReadReq, and caches use ReadCleanReq, ReadExReq, or ReadSharedReq. For the latter, the response can be any state, shared, exclusive, owned or even modified. Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The two transactions are aligned with the emerging cache-coherent TLM standard and the AMBA nomenclature. With this change, the normal ReadReq should never be used by a cache, and is reserved for the actual (non-caching) masters in the system. We thus have a way of identifying if a request came from a cache or not. The introduction of ReadSharedReq thus removes the need for the current isTopLevel hack, and also allows us to stop relying on checking the packet size to determine if the source is a cache or not. This is fixed in follow-on patches.	2015-07-03 10:14:40 -04:00
Andreas Hansson	893533a126	mem: Allow read-only caches and check compliance This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data. A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.	2015-07-03 10:14:39 -04:00
Ali Jafri	a262908acc	mem: Add clean evicts to improve snoop filter tracking This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory hierarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache. The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory hierarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages. We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implementation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message.
We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.	2015-07-03 10:14:37 -04:00
Andreas Hansson	aa5bbe81f6	mem: Convert Request static const flags to enums This patch fixes an issue which is very wide spread in the codebase, causing sporadic linking failures. The issue is that we declare static const class variables in the header, without any definition (as part of a source file). In most cases the compiler propagates the value and we have no issues. However, especially for less optimising builds such as debug, we get sporadic linking failures due to undefined references. This patch fixes the Request class, by turning the static const flags and master IDs into C++11 typed enums.	2015-07-03 10:14:36 -04:00
Nilay Vaish	57971248f6	ruby: slicc: remove README No longer maintained. Updates are only made to the wiki page. So being dropped.	2015-06-25 11:58:30 -05:00
Nilay Vaish	0647d99854	ruby: message: remove a data member added by mistake I (Nilay) had mistakenly added a data member to the Message class in revision c1694b4032a6. The data member is being removed.	2015-06-25 11:58:29 -05:00
Jason Power	2f3c467883	Ruby: Remove assert in RubyPort retry list logic Remove the assert when adding a port to the RubyPort retry list. Instead of asserting, just ignore the added port, since it's already on the list. Without this patch, Ruby+detailed fails for even the simplest tests	2015-06-25 11:58:28 -05:00
Ali Jafri	f0c3b70451	mem: Add check for express snoop in packet destructor Snoop packets share the request pointer with the originating packets. We need to ensure that the snoop packet destruction does not delete the request. Snoops are used for reads, invalidations, HardPFReqs, Writebacks and CleanEvicts. Reads, invalidations, and HardPFReqs need a response so their snoops do not delete the request. For Writebacks and CleanEvicts we need to check explicitly whether the current packet is an express snoop, in which case we do not delete the request.	2015-06-09 09:21:18 -04:00
Andreas Hansson	578a7f20c6	mem: Fix snoop packet data allocation bug This patch fixes an issue where the snoop packet did not properly forward the data pointer in case of static data.	2015-06-09 09:21:17 -04:00
Marco Elver	6599dd87c8	ruby: Fix MESI consistency bug Fixes missed forward eviction to CPU. With the O3CPU this can lead to load-load reordering, as the LQ is never notified of the invalidate. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-06-07 14:02:40 -05:00
Matthias Jung	25fe4c2529	mem: Add HMC Timing Parameters A single HMC-2500 x32 model based on: [1] DRAMSpec: a high-level DRAM bank modelling tool developed at the University of Kaiserslautern. This high level tool uses RC (resistance-capacitance) and CV (capacitance-voltage) models to estimate the DRAM bank latency and power numbers. [2] A Logic-base Interconnect for Supporting Near Memory Computation in the Hybrid Memory Cube (E. Azarkhish et al.) Assumed for the HMC model is a 30 nm technology node. The modelled HMC consists of a 4 Gbit part with 4 layers connected with TSVs. Each layer has 16 vaults and each vault consists of 2 banks per layer. In order to be able to use the same controller used for 2D DRAM generations for HMC, the following analogy is done: Channel (DDR) => Vault (HMC) device_size (DDR) => size of a single layer in a vault ranks per channel (DDR) => number of layers banks per rank (DDR) => banks per layer devices per rank (DDR) => devices per layer (1 for HMC). The parameters for which no input is available are inherited from the DDR3 configuration.	2015-06-07 14:02:40 -05:00
Christoph Pfister	4a17494708	mem: addr_mapper: restore old address if request not sent Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-05-30 13:45:17 +02:00
Andreas Hansson	0cc350d2c5	ruby: Deprecation warning for RubyMemoryControl A step towards removing RubyMemoryControl and shift users to DRAMCtrl. The latter is faster, more representative, very versatile, and is integrated with power models.	2015-05-26 03:21:34 -04:00
Joel Hestness	0479569f67	ruby: Fix RubySystem warm-up and cool-down scope The processes of warming up and cooling down Ruby caches are simulation-wide processes, not just RubySystem instance-specific processes. Thus, the warm-up and cool-down variables should be globally visible to any Ruby components participating in either process. Make these variables static members and track the warm-up and cool-down processes as appropriate. This patch also has two side benefits: 1) It removes references to the RubySystem g_system_ptr, which are problematic for allowing multiple RubySystem instances in a single simulation. Warmup and cooldown variables being static (global) reduces the need for instance-specific dereferences through the RubySystem. 2) From the AbstractController, it removes local RubySystem pointers, which are used inconsistently with other uses of the RubySystem: 11 other uses reference the RubySystem with the g_system_ptr. Only sequencers have local pointers.	2015-05-19 10:56:51 -05:00
Stephan Diestelhorst	2847d5f517	mem: Create a request copy for deferred snoops Sometimes, we need to defer an express snoop in an MSHR, but the original request might complete and deallocate the original pkt->req. In those cases, create a copy of the request so that someone who is inspecting the delayed snoop can also inspect the request still. All of this is rather hacky, but the allocation / linking and general life-time management of Packet and Request is rather tricky. Deleting the copy is another tricky area, testing so far has shown that the right copy is deleted at the right time.	2015-03-17 11:50:55 +00:00
Andreas Sandberg	48281375ee	mem, cpu: Add a separate flag for strictly ordered memory The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models. This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations: * Speculation: CPU models are not allowed to issue speculative loads. * Write combining: CPU models and caches are not allowed to merge writes to the same cache line. Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior.	2015-05-05 03:22:33 -04:00
Andreas Sandberg	1da634ace0	mem, alpha: Move Alpha-specific request flags Move Alpha-specific memory request flags to an architecture-specific header and map them to the architecture specific flag bit range.	2015-05-05 03:22:31 -04:00
Andreas Hansson	36f29496a0	mem: Snoop into caches on uncacheable accesses This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent. The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block.	2015-05-05 03:22:29 -04:00
Andreas Hansson	14e5b2ea55	mem: Pass shared downstream through caches This patch ensures that we pass on information about a packet being shared (rather than exclusive), when forwarding a packet downstream. Without this patch there is a risk that a downstream cache considers the line exclusive when it really isn't.	2015-05-05 03:22:26 -04:00
Ali Jafri	3d33432136	mem: Add forward snoop check for HardPFReqs We should always check whether the cache is supposed to be forwarding snoops before generating snoops.	2015-05-05 03:22:25 -04:00
Andreas Hansson	0ebbf3f951	mem: Add missing stats update for uncacheable MSHRs This patch adds a missing counter update for the uncacheable accesses. By updating this counter we also get a meaningful average latency for uncacheable accesses (previously inf).	2015-05-05 03:22:24 -04:00
Andreas Hansson	33e3e370f2	mem: Tidy up BaseCache parameters This patch simply tidies up the BaseCache parameters and removes the unused "two_queue" parameter.	2015-05-05 03:22:22 -04:00
David Guillen	5287945a8b	mem: Remove templates in cache model This patch changes the cache implementation to rely on virtual methods rather than using the replacement policy as a template argument. There is no impact on the simulation performance, and overall the changes make it easier to modify (and subclass) the cache and/or replacement policy.	2015-05-05 03:22:21 -04:00
Rizwana Begum	52a3bc5e5c	mem: Simplify page close checks for adaptive policies Both open_adaptive and close_adaptive page policies keep the page open if a row hit is found. If a row hit is not found, close_adaptive page policy precharges the row, and open_adaptive policy precharges the row only if there is a bank conflict request waiting in the queue. This patch makes the checks for the above conditions simpler. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-29 22:35:22 -05:00
Nilay Vaish	3a2731fb8c	ruby: set: replace long by unsigned long UBSan complains about negative value being shifted	2015-04-29 22:35:22 -05:00
Lena Olson	dea7acdb3e	ruby: allow restoring from checkpoint when using DRAMCtrl Restoring from a checkpoint with ruby + the DRAMCtrl memory model was not working, because ruby and DRAMCtrl disagreed on the current tick during warmup. Since there is no reason to do timing requests during warmup, use functional requests instead. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-13 17:33:57 -05:00
Stephan Diestelhorst	cb8856f580	mem: Support any number of master-IDs in stride prefetcher The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs) that it could handle. Since master IDs need to be unique per system, and every core, cache etc. requires a separate master port, a static limit on these does not make much sense. Instead, this patch adds a small hash map that will map all master IDs to the right prefetch state and dynamically allocates new state for new master IDs.	2015-03-27 04:56:03 -04:00
Andreas Hansson	0197e580e5	mem: Allocate cache writebacks before new MSHRs This patch changes the order of writeback allocation such that any writebacks resulting from a tag lookup (e.g. for an uncacheable access), are added to the writebuffer before any new MSHR entries are allocated. This ensures that the writebacks logically precede the new allocations. The patch also changes the uncacheable flush to use proper timed (or atomic) writebacks, as opposed to functional writes.	2015-03-27 04:56:02 -04:00
Andreas Hansson	24763c2177	mem: Cleanup flow for uncacheable accesses This patch simplifies the code dealing with uncacheable timing accesses, aiming to align it with the existing miss handling. Similar to what we do in atomic, a timing request now goes through Cache::access (where the block is also flushed), and then proceeds to ignore any existing MSHR for the block in question. This unifies the flow for cacheable and uncacheable accesses, and for atomic and timing.	2015-03-27 04:56:01 -04:00
Andreas Hansson	a7a1e6004a	mem: Ignore uncacheable MSHRs when finding matches This patch changes how we search for matching MSHRs, ignoring any MSHR that is allocated for an uncacheable access. By doing so, this patch fixes a corner case in the MSHRs where incorrect data ended up being copied into a (cacheable) read packet due to a first uncacheable MSHR target of size 4, followed by a cacheable target to the same MSHR of size 64. The latter target was filled with nonsense data.	2015-03-27 04:56:00 -04:00
Andreas Hansson	801ce65eae	mem: Remove redundant allocateUncachedReadBuffer in cache This patch removes the no-longer-needed allocateUncachedReadBuffer. Besides the checks it is exactly the same as allocateMissBuffer and thus provides no value.	2015-03-27 04:55:59 -04:00
Andreas Hansson	fe806a0dd7	mem: Modernise MSHR iterators to C++11 This patch updates the iterators in the MSHR and MSHR queues to use C++11 range-based for loops. It also does a bit of additional house keeping.	2015-03-27 04:55:57 -04:00
Andreas Hansson	7bae98459c	mem: Align all MSHR entries to block boundaries This patch aligns all MSHR queue entries to block boundaries to simplify checks for matches. Previously there were corner cases that could lead to existing entries not being identified as matches. There are, rather alarmingly, a few regressions that change with this patch.	2015-03-27 04:55:55 -04:00
Ali Jafri	15f0d9ff14	mem: Rename PREFETCH_SNOOP_SQUASH flag to BLOCK_CACHED This patch subsumes the PREFETCH_SNOOP_SQUASH flag with the more generic BLOCK_CACHED flag. Future patches implementing cache eviction messages can use the BLOCK_CACHED flag in almost the same manner as hardware prefetches use the PREFETCH_SNOOP_SQUASH flag. The PREFETCH_SNOOP_SQUASH flag is set if the prefetch target is found in the tags or the MSHRs in any state, so we are simply replacing calls to setPrefetchSquashed() with setBlockCached(). The case where the prefetch target is found in the writeback MSHRs of upper level caches continues to be covered by the MEM_INHIBIT flag.	2015-03-27 04:55:54 -04:00
Steve Reinhardt	6677b9122a	mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW Makes x86-style locked operations even more distinct from LLSC operations. Using "locked" by itself should be obviously ambiguous now.	2015-03-23 16:14:20 -07:00
Andreas Hansson	45286d9b64	mem: Tidy up Request This patch does a bit of house keeping, fixing up typos, removing dead code etc.	2015-03-23 06:57:34 -04:00
Andreas Hansson	5275c9d740	mem: Use emplace front/back for deferred packets Embrace C++11 for the deferred packets as we actually store the objects in the data structure, and not just pointers.	2015-03-19 04:06:11 -04:00
Geoffrey Blake	1d403960af	mem: Enable CommMonitor to output traces in atomic mode The CommMonitor by default only allows memory traces to be gathered in timing mode. This patch allows memory traces to be gathered in atomic mode if all one needs is a functional trace of memory addresses used and timing information is of a secondary concern.	2015-03-19 04:06:10 -04:00
Steve Reinhardt	e57ab463cf	mem: remove redundant test in Cache::recvTimingResp() For some reason we were checking mshr->hasTargets() even though we had already called mshr->getTarget() unconditionally earlier in the same function (which asserts if there are no targets). Get rid of this useless check, and while we're at it get rid of the redundant call to mshr->getTarget(), since we still have the value saved in a local var.	2015-02-11 10:48:53 -08:00
Steve Reinhardt	89bb03a1a6	mem: add local var in Cache::recvTimingResp() The main loop in recvTimingResp() uses target->pkt all over the place. Create a local tgt_pkt to help keep lines under the line length limit.	2015-02-11 10:48:52 -08:00
Steve Reinhardt	ee0b52404c	mem: restructure Packet cmd initialization a bit more Refactor the way that specific MemCmd values are generated for packets. The new approach is a little more elegant in that we assign the right value up front, and it's also more amenable to non-heap-allocated Packet objects. Also replaced the code in the Minor model that was still doing it the ad-hoc way. This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.	2015-02-11 10:48:50 -08:00
Steve Reinhardt	ccef61d1cc	mem: clean up write buffer check in Cache::handleSnoop() The 'if (writebacks.size)' check was redundant, because writeBuffer.findMatches() would return false if the writebacks list was empty. Also renamed 'mshr' to 'wb_entry' in this context since we are pointing at a writebuffer entry and not an MSHR (even though it's the same C++ class).	2015-03-14 06:51:07 -07:00
Andreas Hansson	fc315901ff	mem: Unify all cache DPRINTF address formatting This patch changes all the DPRINTF messages in the cache to use '%#llx' every time a packet address is printed. The inclusion of '#' ensures '0x' is prepended, and since the address type is a uint64_t %x really should be %llx.	2015-03-02 04:00:56 -05:00
Andreas Hansson	88e2963951	mem: Fix cache MSHR conflict determination This patch fixes a rather subtle issue in the sending of MSHR requests in the cache, where the logic previously did not check for conflicts between the MSHR queue and the write queue when requests were not ready. The correct thing to do is to always check, since not having a ready MSHR does not guarantee that there is no conflict. The underlying problem seems to have slipped past due to the symmetric timings used for the write queue and MSHR queue. However, with the recent timing changes the bug caused regressions to fail.	2015-03-02 04:00:54 -05:00
Andreas Hansson	407737614e	mem: Add byte mask to Packet::checkFunctional This patch changes the valid-bytes start/end to a proper byte mask. With the changes in timing introduced in previous patches there are more packets waiting in queues, and there are regressions using the checker CPU failing due to non-contiguous read data being found in the various cache queues. This patch also adds some more comments explaining what is going on, and adds the fourth and missing case to Packet::checkFunctional.	2015-03-02 04:00:52 -05:00
Stephan Diestelhorst	ecef1612b8	mem: Add option to force in-order insertion in PacketQueue By default, the packet queue is ordered by the ticks of the to-be-sent packets. With the recent modifications of packets sinking their header time when their response leaves the caches, there could be cases of MSHR targets being allocated and ordered A, B, but their responses being sent out in the order B, A. This led to inconsistencies in bus traffic, in particular the snoop filter observing first a ReadExResp and later a ReadRespWithInv. Logically, these were ordered the other way around behind the MSHR, but due to the timing adjustments when inserting into the PacketQueue, they were sent out in the wrong order on the bus, confusing the snoop filter. This patch adds a flag (off by default) such that these special cases can request in-order insertion into the packet queue, which might offset timing slightly. This is expected to occur rarely and not affect timing results.	2015-03-02 04:00:49 -05:00
Marco Balboni	d4ef8368aa	mem: Downstream components consume new crossbar delays This patch makes the caches and memory controllers consume the delay that is annotated to a packet by the crossbar. Previously many components simply threw these delays away. Note that the devices still do not pay for these delays.	2015-03-02 04:00:48 -05:00
Andreas Hansson	36dc93a5fa	mem: Move crossbar default latencies to subclasses This patch introduces a few subclasses to the CoherentXBar and NoncoherentXBar to distinguish the different uses in the system. We use the crossbar in a wide range of places: interfacing cores to the L2, as a system interconnect, connecting I/O and peripherals, etc. Needless to say, these crossbars have very different performance, and the clock frequency alone is not enough to distinguish these scenarios. Instead of trying to capture every possible case, this patch introduces dedicated subclasses for the three primary use-cases: L2XBar, SystemXBar and IOXbar. More can be added if needed, and the defaults can be overridden.	2015-03-02 04:00:47 -05:00
Marco Balboni	d35dd71ab4	mem: Add crossbar latencies This patch introduces latencies in crossbar that were neglected before. In particular, it adds three parameters in crossbar model: front_end_latency, forward_latency, and response_latency. Along with these parameters, three corresponding members are added: frontEndLatency, forwardLatency, and responseLatency. The coherent crossbar has an additional snoop_response_latency. The latency of the request path through the xbar is set as --> frontEndLatency + forwardLatency In case the snoop filter is enabled, the request path latency is charged also by look-up latency of the snoop filter. --> frontEndLatency + SF(lookupLatency) + forwardLatency. The latency of the response path through the xbar is set instead as --> responseLatency. In case of snoop response, if the response is treated as a normal response the latency associated is again --> responseLatency; If instead it is forwarded as snoop response we add an additional variable + snoopResponseLatency and the latency associated is --> snoopResponseLatency; Furthermore, this patch lets the crossbar progress on the next clock edge after an unused retry, changing the time the crossbar considers itself busy after sending a retry that was not acted upon.	2015-03-02 04:00:46 -05:00
Andreas Hansson	987de4f5cc	mem: Tidy up the cache debug messages Avoid redundant inclusion of the name in the DPRINTF string.	2015-03-02 04:00:37 -05:00
Andreas Hansson	f26a289295	mem: Split port retry for all different packet classes This patch fixes a long-standing issue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fix the previously seen deadlocks.	2015-03-02 04:00:35 -05:00
Ali Jafri	6ebe8d863a	mem: Fix prefetchSquash + memInhibitAsserted bug This patch resolves a bug with hardware prefetches. Before a hardware prefetch is sent towards the memory, the system generates a snoop request to check all caches above the prefetch generating cache for the presence of the prefetch target. If the prefetch target is found in the tags or the MSHRs of the upper caches, the cache sets the prefetchSquashed flag in the snoop packet. When the snoop packet returns with the prefetchSquashed flag set, the prefetch generating cache deallocates the MSHR reserved for the prefetch. If the prefetch target is found in the writeback buffer of the upper cache, the cache sets the memInhibit flag, which signals the prefetch generating cache to expect the data from the writeback. When the snoop packet returns with the memInhibitAsserted flag set, it marks the allocated MSHR as inService and waits for the data from the writeback. If the prefetch target is found in multiple upper level caches, specifically in the tags or MSHRs of one upper level cache and the writeback buffer of another, the snoop packet will return with both prefetchSquashed and memInhibitAsserted set, while the current code is not written to handle such an outcome. The current code checks for the prefetchSquashed flag first; if it finds the flag, it deallocates the reserved MSHR. This leads to an assert failure when the data from the writeback appears at the cache. In this fix, we simply switch the order of checks. We first check for memInhibitAsserted and then for prefetchSquashed.	2015-03-02 04:00:34 -05:00
Jason Power	670f44e05e	Ruby: Update backing store option to propagate through to all RubyPorts Previously, the user would have to manually set access_backing_store=True on all RubyPorts (Sequencers) in the config files. Now, instead there is one global option that each RubyPort checks on initialization. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-02-26 09:58:26 -06:00
Stephan Diestelhorst	93fa8e3cd4	mem: Fix initial value problem with MemChecker In highly loaded cases, reads might actually overlap with writes to the initial memory state. The mem checker needs to detect such cases and permit the read reading either from the writes (what it is doing now) or read from the initial, unknown value. This patch adds this logic.	2015-02-16 03:34:47 -05:00
Andreas Hansson	e17328a227	mem: mmap the backing store with MAP_NORESERVE This patch ensures we can run simulations with very large simulated memories (at least 64 TB based on some quick runs on a Linux workstation). In essence this allows us to efficiently deal with sparse address maps without having to implement a redirection layer in the backing store. This opens up for run-time errors if we eventually exhaust the host's memory and swap space, but this should hopefully never happen.	2015-02-16 03:33:47 -05:00
Andreas Hansson	57758ca685	mem: Use the range cache for lookup as well as access This patch changes the range cache used in the global physical memory to be an iterator so that we can use it not only as part of isMemAddr, but also access and functionalAccess. This matches use-cases where a core is using the atomic non-caching memory mode, and repeatedly calls isMemAddr and access. Linux boot on aarch32, with a single atomic CPU, is now more than 30% faster when using "--fastmem" compared to not using the direct memory access.	2015-02-16 03:33:37 -05:00
Marco Balboni	268d9e59c5	mem: Clarification of packet crossbar timings This patch clarifies the packet timings annotated when going through a crossbar. The old 'firstWordDelay' is replaced by 'headerDelay' that represents the delay associated with the delivery of the header of the packet. The old 'lastWordDelay' is replaced by 'payloadDelay' that represents the delay needed to process the payload of the packet. For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay.	2015-02-11 10:23:47 -05:00
Marco Balboni	e2828587b3	mem: Clarify usage of latency in the cache This patch adds some much-needed clarity in the specification of the cache timing. For now, hit_latency and response_latency are kept as top-level parameters, but the cache itself has a number of local variables to better map the individual timing variables to different behaviours (and sub-components). The introduced variables are: - lookupLatency: latency of tag lookup, occurring on any access - forwardLatency: latency that occurs in case of outbound miss - fillLatency: latency to fill a cache block We keep the existing responseLatency. The forwardLatency is used by allocateInternalBuffer() for: - MSHR allocateWriteBuffer (uncached write forwarded to WriteBuffer); - MSHR allocateMissBuffer (cacheable miss in MSHR queue); - MSHR allocateUncachedReadBuffer (uncached read allocated in MSHR queue) It is our assumption that the time for the above three buffers is the same. Similarly, for snoop responses passing through the cache we use forwardLatency.	2015-02-11 10:23:36 -05:00
Andreas Hansson	461a80beb3	mem: Clarify express snoop behaviour This patch adds a bit of documentation with insights around how express snoops really work.	2015-02-03 14:26:02 -05:00
Andreas Hansson	193325ff60	mem: Clarify cache behaviour for pending dirty responses This patch adds a bit of clarification around the assumptions made in the cache when packets are sent out, and dirty responses are pending. As part of the change, the marking of an MSHR as in service is simplified slightly, and comments are added to explain what assumptions are made.	2015-02-03 14:25:59 -05:00
Andreas Hansson	5ea60a95b3	config: Adjust DRAM channel interleaving defaults This patch changes the DRAM channel interleaving default behaviour to be more representative. The default address mapping (RoRaBaCoCh) moves the channel bits towards the least significant bits, and uses 128 byte as the default channel interleaving granularity. These defaults can be overridden if desired, but should serve as a sensible starting point for most use-cases.	2015-02-03 14:25:52 -05:00
Andreas Hansson	10c69bb168	mem: Remove unused Packet src and dest fields This patch takes the final step in removing the src and dest fields in the packet. These fields were rather confusing in that they only remember a single multiplexing component, and pushed the responsibility to the bridge and caches to store the fields in a senderstate, thus effectively creating a stack. With the recent changes to the crossbar response routing the crossbar is now responsible without relying on the packet fields. Thus, these variables are now unused and can be removed.	2015-01-22 05:01:31 -05:00
Andreas Hansson	15c64035ed	mem: Remove Packet source from ForwardResponseRecord This patch removes the source field from the ForwardResponseRecord, but keeps the class as it is part of how the cache identifies responses to hardware prefetches that are snooped upwards.	2015-01-22 05:01:30 -05:00
Andreas Hansson	0c2ffd2daa	mem: Remove unused RequestState in the bridge This patch removes the bridge sender state as the Crossbar now takes care of remembering its own routing decisions.	2015-01-22 05:01:27 -05:00
Andreas Hansson	00536b0efc	mem: Always use SenderState for response routing in RubyPort This patch aligns how the response routing is done in the RubyPort, using the SenderState for both memory and I/O accesses. Before this patch, only the I/O used the SenderState, whereas the memory accesses relied on the src field in the packet. With this patch we shift to using SenderState in both cases, thus not relying on the src field any longer.	2015-01-22 05:01:24 -05:00
Andreas Hansson	072f78471d	mem: Make the XBar responsible for tracking response routing This patch removes the need for a source and destination field in the packet by shifting the onus of the tracking to the crossbar, much like a real implementation. This change in behaviour also means we no longer need a SenderState to remember the source/dest whenever we have multiple crossbars in the system. Thus, the stack that was created by the SenderState is not needed, and each crossbar locally tracks the response routing. The fields in the packet are still left behind as the RubyPort (which also acts as a crossbar) does routing based on them. In the succeeding patches the uses of the src and dest field will be removed. Combined, these patches improve the simulation performance by roughly 2%.	2015-01-22 05:01:14 -05:00
Andreas Hansson	f49830ce0b	mem: Clean up Request initialisation This patch tidies up how we create and set the fields of a Request. In essence it tries to use the constructor where possible (as opposed to setPhys and setVirt), thus avoiding spreading the information across a number of locations. In fact, setPhys is made private as part of this patch, and a number of places where we called setVirt instead use the appropriate constructor.	2015-01-22 05:00:53 -05:00
Andreas Hansson	6096e2f9c1	mem: Fix bug in cache request retry mechanism This patch ensures that inhibited packets that are about to be turned into express snoops do not update the retry flag in the cache.	2015-01-20 08:12:01 -05:00
Andreas Hansson	92585d60c9	mem: Move DRAM interleaving check to init This patch fixes a bug where the DRAM controller tried to access the system cacheline size before the system pointer was initialised. It also fixes a bug where the granularity is 0 (no interleaving).	2015-01-20 08:11:55 -05:00
Mitch Hayenga	b2342c5d9a	mem: Change prefetcher to use random_mt Prefetchers previously used rand() to generate random numbers.	2014-12-23 09:31:19 -05:00
Curtis Dunham	516e6046ae	mem: Hide WriteInvalidate requests from prefetchers Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate.	2014-12-23 09:31:19 -05:00
Mitch Hayenga	bd4f901c77	mem: Fix event scheduling issue for prefetches The cache's MemSidePacketQueue schedules a sendEvent based upon nextMSHRReadyTime() which is the time when the next MSHR is ready or whenever a future prefetch is ready. However, a prefetch being ready does not guarantee that it can obtain an MSHR. So, when all MSHRs are full, the simulation ends up unnecessarily scheduling a sendEvent every picosecond until an MSHR is finally freed and the prefetch can happen. This patch fixes this by not signaling the prefetch ready time if the prefetch could not be generated. The event is rescheduled as soon as an MSHR becomes available.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	4acd4a2055	mem: Fix bug relating to writebacks and prefetches Previously the code commented about an unhandled case where it might be possible for a writeback to arrive after a prefetch was generated but before it was sent to the memory system. I hit that case. Luckily the prefetchSquash() logic already in the code handles dropping prefetch requests in certain circumstances.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	df82a2d003	mem: Rework the structuring of the prefetchers Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class. Finally, the stride-based prefetcher now assumes a customizable lookup table (sets/ways) rather than the previous fully associative structure.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	6cb58b2bd2	mem: Add parameter to reserve MSHR entries for demand access Adds a new parameter that reserves some number of MSHR entries for demand accesses. This helps prevent prefetchers from taking all MSHRs, forcing demand requests from the CPU to stall.	2014-12-23 09:31:18 -05:00
Andreas Hansson	59460b91f3	config: Expose the DRAM ranks as a command-line option This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding).	2014-12-23 09:31:18 -05:00
Andreas Hansson	2f7baf9dbe	mem: Ensure DRAM controller is idle when in atomic mode This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller force the simulator to switch out of the KVM mode, thus killing performance. The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions. The switcheroo regression requires a minor stats bump as a result.	2014-12-23 09:31:18 -05:00
Omar Naji	381d1da791	mem: Add rank-wise refresh to the DRAM controller This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank. Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy. The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank.	2014-12-23 09:31:18 -05:00
Omar Naji	152c02354e	mem: Fix a bug in the DRAM controller arbitration Fix a minor issue that affects multi-rank systems.	2014-12-23 09:31:18 -05:00
Kanishk Sugand	7a25b1a0e0	mem: Add stack distance statistics to the CommMonitor This patch adds the stack distance calculator to the CommMonitor. The stats are disabled by default.	2014-12-23 09:31:18 -05:00
Kanishk Sugand	888975b29d	mem: Add a stack distance calculator This patch adds a stand-alone stack distance calculator. The stack distance calculator is a passive SimObject that observes the addresses passed to it. It calculates stack distances (LRU Distances) of incoming addresses based on the partial sum hierarchy tree algorithm described by Almasi et al. http://doi.acm.org/10.1145/773039.773043. For each transaction a hashtable look-up is performed. At every non-unique transaction the tree is traversed from the leaf at the returned index to the root, the old node is deleted from the tree, and the sums (to the right) are collected and decremented. The collected sum represents the stack distance of the found node. At every unique transaction the stack distance is returned as numeric_limits<uint64>::max(). In addition to the basic stack distance calculation, a feature to mark an old node in the tree is added. This is useful if it is required to see the reuse pattern. For example, Writebacks to the lower level (e.g. membus from L2), can be marked instead of being removed from the stack (isMarked flag of Node set to True). And then later if this same address is accessed (by L1), the value of the isMarked flag would be True. This gives some insight on how the Writeback policy of the lower level affects the read/write accesses in an application. Debugging is enabled by setting the verify flag to true. Debugging is implemented using a dummy stack that behaves in a naive way, using STL vectors. Note that this has a large impact on run time.	2014-12-23 09:31:18 -05:00
Marco Elver	dd0f3943e2	mem: Add MemChecker and MemCheckerMonitor This patch adds the MemChecker and MemCheckerMonitor classes. While MemChecker can be integrated anywhere in the system and is independent, the most convenient usage is through the MemCheckerMonitor -- this however, puts limitations on where the MemChecker is able to observe read/write transactions.	2014-12-23 09:31:17 -05:00
Curtis Dunham	5d22250845	mem: Support WriteInvalidate (again) This patch takes a clean-slate approach to providing WriteInvalidate (write streaming, full cache line writes without first reading) support. Unlike the prior attempt, which took an aggressive approach of directly writing into the cache before handling the coherence actions, this approach follows the existing cache flows as closely as possible.	2014-12-02 06:08:19 -05:00
Curtis Dunham	7ca27dd3cc	mem: Remove WriteInvalidate support Prepare for a different implementation following in the next patch	2014-12-02 06:08:17 -05:00
Andreas Hansson	5c84157c29	mem: Relax packet src/dest check and shift onus to crossbar This patch allows objects to get the src/dest of a packet even if it is not set to a valid port id. This simplifies (ab)using the bridge as a buffer and latency adapter in situations where the neighbouring MemObjects are not crossbars. The checks that were done in the packet are now shifted to the crossbar where the fields are used to index into the port arrays. Thus, the carrier of the information is not burdened with checking, and the crossbar can check not only that the destination is set, but also that the port index is within limits.	2014-12-02 06:07:56 -05:00
Andreas Hansson	ea5ccc7041	mem: Clean up packet data allocation This patch attempts to make the rules for data allocation in the packet explicit, understandable, and easy to verify. The constructor that copies a packet is extended with an additional flag "alloc_data" to enable the call site to explicitly say whether the newly created packet is short-lived (a zero-time snoop), or has an unknown life-time and therefore should allocate its own data (or copy a static pointer in the case of static data). The tricky case is the static data. In essence this is a copy-avoidance scheme where the original source of the request (DMA, CPU etc) does not ask the memory system to return data as part of the packet, but instead provides a pointer, and then the memory system carries this pointer around, and copies the appropriate data to the location itself. Thus any derived packet actually never copies any data. As the original source does not copy any data from the response packet when arriving back at the source, we must maintain the copy of the original pointer to not break the system. We might want to revisit this one day and pay the price for a few extra memcpy invocations. All in all this patch should make it easier to grok what is going on in the memory system and how data is actually copied (or not).	2014-12-02 06:07:54 -05:00
Andreas Hansson	f012166bb6	mem: Cleanup Packet::checkFunctional and hasData usage This patch cleans up the use of hasData and checkFunctional in the packet. The hasData function is unfortunately suggesting that it checks if the packet has a valid data pointer, when it does in fact only check if the specific packet type is specified to have a data payload. The confusion led to a bug in checkFunctional. The latter function is also tidied up to avoid name overloading.	2014-12-02 06:07:52 -05:00
Andreas Hansson	a2ee51f631	mem: Make the requests carried by packets const This adds a basic level of sanity checking to the packet by ensuring that a request is not modified once the packet is created. The only issue that had to be worked around is the relaying of software-prefetches in the cache. The specific situation is now solved by first copying the request, and then creating a new packet accordingly.	2014-12-02 06:07:50 -05:00
Andreas Hansson	fa60d5cf27	mem: Make Request getters const This patch tidies up the Request class, making all getters const. The odd one out is incAccessDepth which is called by the memory system as packets carry the request around. This is also const to enable the packet to hold on to a const Request.	2014-12-02 06:07:48 -05:00
Andreas Hansson	3d6ec81e66	mem: Add checks and explanation for assertMemInhibit usage	2014-12-02 06:07:46 -05:00
Andreas Hansson	41846cb61b	mem: Assume all dynamic packet data is array allocated This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic was in the Ruby testers. The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks. As the last part of the patch, it renames dataDynamicArray to dataDynamic.	2014-12-02 06:07:43 -05:00
Andreas Hansson	5df96cb690	mem: Remove redundant Packet::allocate calls This patch cleans up the packet memory allocation confusion. The data is always allocated at the requesting side, when a packet is created (or copied), and there is never a need for any device to allocate any space if it is merely responding to a packet. This behaviour is in line with how SystemC and TLM work as well, thus increasing interoperability, and matching established conventions. The redundant calls to Packet::allocate are removed, and the checks in the function are tightened up to make sure data is only ever allocated once. There are still some oddities in the packet copy constructor where we copy the data pointer if it is static (without ownership), and allocate new space if the data is dynamic (with ownership). The latter is being worked on further in a follow-on patch.	2014-12-02 06:07:41 -05:00
Andreas Hansson	0706a25203	mem: Use const pointers for port proxy write functions This patch changes the various write functions in the port proxies to use const pointers for all sources (similar to how memcpy works). The one unfortunate aspect is the need for a const_cast in the packet, to avoid having to juggle a const and a non-const data pointer. This design decision can always be re-evaluated at a later stage.	2014-12-02 06:07:38 -05:00
Andreas Hansson	9779ba2e37	mem: Add const getters for write packet data This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used. The patch also removes the unused isReadWrite function.	2014-12-02 06:07:36 -05:00
Andreas Hansson	25bfc24999	mem: Remove null-check bypassing in Packet::getPtr This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null. The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone objects or has any good suggestions). Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer.	2014-12-02 06:07:34 -05:00
Omar Naji	0e63d2cd62	mem: Add a GDDR5 DRAM config This patch adds a first cut GDDR5 config to accommodate the users combining gem5 and GPUSim. The config is based on a SK Hynix datasheet, and the Nvidia GTX580 specification. Someone from the GPUSim user-camp should tweak the default page-policy and static frontend and backend latencies.	2014-12-02 06:07:32 -05:00
Andreas Hansson	d66b14ca61	misc: Another round of static analysis fixups Mostly addressing uninitialised members.	2014-11-24 09:03:38 -05:00
Alexandru Dutu	1f539f13c3	mem: Page Table map api modification This patch adds uncacheable/cacheable and read-only/read-write attributes to the map method of PageTableBase. It also modifies the constructor of TlbEntry structs for all architectures to consider the new attributes.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	c11bcb8119	mem: Multi Level Page Table bug fix The multi level page table was giving false positives for already mapped translations. This patch fixes the bogus behavior.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	e4859fae5b	mem: Page Table long lines Trimmed down all the lines greater than 78 characters.	2014-11-23 18:01:09 -08:00
Andreas Hansson	9ffe0e7ba6	mem: Clarify unit of DRAM controller buffer size	2014-11-14 03:53:48 -05:00
Mitch Hayenga	9d6d8e02aa	mem: Delete unused variable in Garnet NetworkLink With recent changes OSX clang compilation fails due to an unused variable.	2014-11-12 09:05:23 -05:00
Nilay Vaish	0811f21f67	ruby: provide a backing store Ruby's functional accesses are not guaranteed to succeed as of now. While this is not a problem for the protocols that are currently in the mainline repo, it seems that coherence protocols for gpus rely on a backing store to supply the correct data. The aim of this patch is to make this backing store configurable i.e. it comes into play only when a particular option: --access-backing-store is invoked. The backing store has been there since M5 and GEMS were integrated. The only difference is that earlier the system used to maintain the backing store and ruby's copy was write-only. Sometime last year, we moved to data being supplied by ruby in SE mode simulations. And now we have patches on the reviewboard, which remove ruby's copy of memory altogether and rely completely on the system's memory to supply data. This patch adds back a SimpleMemory member to RubySystem. This member is used only if the option: access-backing-store is set to true. By default, the memory would not be accessed.	2014-11-06 05:42:21 -06:00
Nilay Vaish	3022d463fb	ruby: interface with classic memory controller This patch is the final in the series. The whole series and this patch in particular were written with the aim of interfacing ruby's directory controller with the memory controller in the classic memory system. This is being done since ruby's memory controller has not been kept up to date with the changes going on in DRAMs. Classic's memory controller is more up to date and supports multiple different types of DRAM. This also brings classic and ruby ever more close. The patch also changes ruby's memory controller to expose the same interface.	2014-11-06 05:42:21 -06:00
Nilay Vaish	68ddfab8a4	ruby: remove the function functionalReadBuffers() This function was added when I had incorrectly arrived at the conclusion that such a function can improve the chances of a functional read succeeding. As was later realized, this is not possible in the current setup. While the code using this function was dropped long back, this function was not. Hence the patch.	2014-11-06 05:42:20 -06:00
Nilay Vaish	d25b722e4a	ruby: coherence protocols: remove data block from directory entry This patch removes the data block present in the directory entry structure of each protocol in gem5's mainline. Firstly, this is required for moving towards common set of memory controllers for classic and ruby memory systems. Secondly, the data block was being misused in several places. It was being used for having free access to the physical memory instead of calling on the memory controller. From now on, the directory controller will not have a direct visibility into the physical memory. The Memory Vector object now resides in the Memory Controller class. This also means that some significant changes are being made to the functional accesses in ruby.	2014-11-06 05:42:20 -06:00
Nilay Vaish	0baaed60ab	ruby: slicc: allow adding a bool to an int, like C++.	2014-11-06 05:42:20 -06:00
Nilay Vaish	85c29973a3	ruby: remove sparse memory. In my opinion, it creates needless complications in rest of the code. Also, this structure hinders the move towards common set of code for physical memory controllers.	2014-11-06 05:42:20 -06:00
Nilay Vaish	95a0b18431	ruby: single physical memory in fs mode Both ruby and the system used to maintain memory copies. With the changes carried for programmed io accesses, only one single memory is required for fs simulations. This patch sets the copy of memory that used to reside with the system to null, so that no space is allocated, but address checks can still be carried out. All the memory accesses now source and sink values to the memory maintained by ruby.	2014-11-06 05:41:44 -06:00
Nilay Vaish	8ccfd9defa	ruby: dma sequencer: remove RubyPort as parent class As of now DMASequencer inherits from the RubyPort class. But the code in RubyPort class is heavily tailored for the CPU Sequencer. There are parts of the code that are not required at all for the DMA sequencer. Moreover, the next patch uses the dma sequencer for carrying out memory accesses for all the io devices. Hence, it is better to have a leaner dma sequencer.	2014-11-06 00:55:09 -06:00
Ali Saidi	b31d9e93e2	arm, mem: Fix drain bug and provide drain prints for more components.	2014-10-29 23:18:26 -05:00
Curtis Dunham	4024fab7fc	mem: don't inhibit WriteInv's or defer snoops on their MSHRs WriteInvalidate semantics depend on the unconditional writeback or they won't complete. Also, there's no point in deferring snoops on their MSHRs, as they don't get new data at the end of their life cycle the way other transactions do. Add comment in the cache about a minor inefficiency re: WriteInvalidate.	2014-10-21 17:04:41 -05:00
Curtis Dunham	46f9f11a55	mem: have WriteInvalidate obsolete MSHRs Since WriteInvalidate directly writes into the cache, it can create tricky timing interleavings with reads and writes to the same cache line that haven't yet completed. This patch ensures that these requests, when completed, don't overwrite the newer data from the WriteInvalidate.	2014-10-29 23:18:24 -05:00
Omar Naji	a4a8568bd2	mem: Fix DRAM activationLimit bug Ensure that we do the proper event scheduling also when the activation limit is disabled.	2014-10-20 18:03:55 -04:00
Omar Naji	29dd2887f4	mem: Add DRAM device size and check against config This patch adds the size of the DRAM device to the DRAM config. It also compares the actual DRAM size (calculated using information from the config) to the size defined in the system. If these two values do not match gem5 will print a warning. In order to do correct DRAM research the size of the memory defined in the system should match the size of the DRAM in the config. The timing and current parameters found in the DRAM configs are defined for a DRAM device with a specific size and would differ for another device with a different size.	2014-10-20 18:03:52 -04:00
Andreas Hansson	6d4866383f	mem: Modernise PhysicalMemory with C++11 features Bring the PhysicalMemory up-to-date by making use of range-based for loops and vector initialisation where possible.	2014-10-16 05:50:01 -04:00
Andreas Hansson	edc77fc03c	misc: Move AddrRangeList from port.hh to addr_range.hh The new location seems like a better fit. The iterator typedefs are removed in favour of using C++11 auto.	2014-10-16 05:49:59 -04:00
Andrew Bardsley	d6732895a5	mem: Add ExternalMaster and ExternalSlave ports This patch adds two MemoryObjects: ExternalMaster and ExternalSlave. Each object has a single port which can be bound to an externally-provided bridge to a port of another simulation system at initialisation.	2014-10-16 05:49:56 -04:00
Andreas Hansson	db3739682d	mem: Use shared_ptr for Ruby Message classes This patch transitions the Ruby Message and its derived classes from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared". The cloning of derived messages is slightly changed as they previously relied on overriding the base-class through covariant return types.	2014-10-16 05:49:49 -04:00
Andreas Hansson	2475862747	arch,x86,mem: Dynamically determine the ISA for Ruby store check This patch makes the memory system ISA-agnostic by enabling the Ruby Sequencer to dynamically determine if it has to do a store check. To enable this check, the ISA is encoded as an enum, and the system is able to provide the ISA to the Sequencer at run time. --HG-- rename : src/arch/x86/insts/microldstop.hh => src/arch/x86/ldstflags.hh	2014-10-16 05:49:44 -04:00
Andreas Hansson	df973abef3	mem: Dynamically determine page bytes in memory components This patch takes a step towards an ISA-agnostic memory system by enabling the components to establish the page size after instantiation. The swap operation in the memory is now also allowing any granularity to avoid depending on the IntReg of the ISA.	2014-10-16 05:49:43 -04:00
Nilay Vaish	a098fad174	ruby: network: garnet: add statistics for different activities This patch adds some statistics to garnet that record the activity of certain structures in the on-chip network. These statistics, in a later patch, will be used for computing the energy consumed by the on-chip network.	2014-10-11 15:02:23 -05:00
Nilay Vaish	25bb18f12b	ruby: network: garnet: remove functions for computing power	2014-10-11 15:02:23 -05:00
Nilay Vaish	9321a41c62	ruby: drop Orion network power model Orion is being dropped from ruby. It would be replaced with DSENT which has better models. Note that the power / energy numbers reported after this patch has been applied are not for use.	2014-10-11 15:02:23 -05:00
Nilay Vaish	b6d804a1e6	ruby: mesi: slight renaming	2014-10-11 15:02:23 -05:00
Nilay Vaish	e7f918d8cd	ruby: structures: correct #ifndef macros in header files	2014-10-11 15:02:22 -05:00
Omar Naji	cd8023a1ee	mem: DRAMPower integration for on-line DRAM power stats This patch takes the final step in integrating DRAMPower and adds the appropriate calls in the DRAM controller to provide the command trace and extract the power and energy stats. The debug printouts are still left in place, but will eventually be removed. At the moment the DRAM power calculation is always on when using the DRAM controller model. The run-time impact of this addition is around 1.5% when looking at the total host seconds of the regressions. We deem this a sensible trade-off to avoid the complication of adding an enable/disable mechanism.	2014-07-29 17:22:44 +01:00
Omar Naji	afc6ce6228	mem: Add DRAMPower wrapping class This patch adds a class to wrap DRAMPower Library in gem5. This class initiates an object of class MemorySpecification of the DRAMPower Library, passes the parameters from DRAMCtrl.py to this object and creates an object of drampower library using the memory specification.	2014-07-29 17:29:36 +01:00
Omar Naji	00b37ffe50	mem: Add missing timing and current parameters to DRAM configs This patch adds missing timing and current parameters to the existing DRAM configs. These missing timing and current parameters are required by DRAMPower for the DRAM power calculations. The missing values are datasheet values of the specified DRAMs, and the appropriate references are added for the various configs.	2014-07-25 10:05:59 +01:00
Omar Naji	f9fce9ba07	mem: Remove DRAMSim2 DDR3 configuration This patch prunes the DDR3 config that was initially created to match the default config of DRAMSim2. The config is not complete as it is, and to avoid having to maintain it, the easiest way forward is to simply prune it. Going forward we are adding power number etc to the other configurations.	2014-10-09 17:52:04 -04:00
Andreas Hansson	f4a538f862	mem: Add packet sanity checks to cache and MSHRs This patch adds a number of asserts to the cache, checking basic assumptions about packets being requests or responses.	2014-10-09 17:51:56 -04:00
Andreas Hansson	4a453e8c95	mem: Allow packet queue to move next send event forward This patch changes the packet queue such that when scheduling a send, the queue is allowed to move the event forward.	2014-10-09 17:51:52 -04:00
Andreas Hansson	6498ccddb2	misc: Fix issues identified by static analysis Another bunch of issues addressed.	2014-10-01 08:05:54 -04:00
Curtis Dunham	b7f1d675da	mem: Output precise range when XBar has conflicts	2014-09-27 09:08:32 -04:00
Curtis Dunham	725be98fe8	mem: Provide better diagnostic for unconnected port When _masterPort is null, a message to that effect is more helpful than a segfault.	2014-09-27 09:08:30 -04:00
Andreas Hansson	de62aedabc	misc: Fix a bunch of minor issues identified by static analysis Add some missing initialisation, and fix a handful of benign resource leaks (including some false positives).	2014-09-27 09:08:29 -04:00
Andreas Hansson	1f6d5f8f84	mem: Rename Bus to XBar to better reflect its behaviour This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus. As part of this renaming, the code has also been cleaned up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code are due to the delay variables in the packet. --HG-- rename : src/mem/Bus.py => src/mem/XBar.py rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh rename : src/mem/bus.cc => src/mem/xbar.cc rename : src/mem/bus.hh => src/mem/xbar.hh	2014-09-20 17:18:32 -04:00
Stephan Diestelhorst	435f4aec3d	mem: Add access statistics for the snoop filter Adds a simple access counter for requests and snoops for the snoop filter and also classifies hits based on whether a single other holder existed or whether multiple sharers held the line.	2014-04-25 12:36:16 +01:00
Stephan Diestelhorst	afa2428eca	mem: Tie in the snoop filter in the coherent bus	2014-09-20 17:18:29 -04:00
Stephan Diestelhorst	7d488cc66f	mem: Add a simple snoop counter per bus This patch adds a simple counter for both total messages and a histogram for the fan-out of snoop messages. The fan-out describes to how many ports snoops had to be sent per incoming request / snoop-from-below. Without any cleverness, this usually means to either all, or all but the requesting port.	2014-04-24 13:28:47 +01:00
Stephan Diestelhorst	ba98d598ae	mem: Simple Snoop Filter This is a first cut at a simple snoop filter that tracks presence of lines in the caches "above" it. The snoop filter can be applied at any given cache hierarchy and will then handle the caches above it appropriately; there is no need to use this only in the last-level bus. This design currently has some limitations: missing stats, no notion of clean evictions (these will not update the underlying snoop filter, because they are not sent from the evicting cache down), no notion of capacity for the snoop filter and thus no need for invalidations caused by capacity pressure in the snoop filter. These are planned to be added on top with future change sets.	2014-09-20 17:18:26 -04:00
Wendy Elsasser	bf23847072	mem: Add DDR4 bank group timing Added the following parameter to the DRAMCtrl class: - bank_groups_per_rank This defaults to 1. For the DDR4 case, the default is overridden to indicate bank group architecture, with multiple bank groups per rank. Added the following delays to the DRAMCtrl class: - tCCD_L : CAS-to-CAS, same bank group delay - tRRD_L : RAS-to-RAS, same bank group delay These parameters are only applied when bank group timing is enabled. Bank group timing is currently enabled only for DDR4 memories. For all other memories, these delays will default to '0 ns' In the DRAM controller model, applied the bank group timing to the per bank parameters actAllowedAt and colAllowedAt. The actAllowedAt will be updated based on bank group when an ACT is issued. The colAllowedAt will be updated based on bank group when a RD/WR burst is issued. At the moment no modifications are made to the scheduling.	2014-09-20 17:18:21 -04:00
Wendy Elsasser	b6ecfe9183	mem: Add memory rank-to-rank delay Add the following delay to the DRAM controller: - tCS : Different rank bus turnaround delay This will be applied for 1) read-to-read, 2) write-to-write, 3) write-to-read, and 4) read-to-write command sequences, where the new command accesses a different rank than the previous burst. The delay defaults to 2*tCK for each defined memory class. Note that this does not correspond to one particular timing constraint, but is a way of modelling all the associated constraints. The DRAM controller has some minor changes to prioritize commands to the same rank. This prioritization will only occur when the command stream is not switching from a read to write or vice versa (in the case of switching we have a gap in any case). To prioritize commands to the same rank, the model will determine if there are any commands queued (same type) to the same rank as the previous command. This check will ensure that the 'same rank' command will be able to execute without adding bubbles to the command flow, e.g. any ACT delay requirements can be done under the hood, allowing the burst to issue seamlessly.	2014-09-20 17:17:57 -04:00
Mitch Hayenga	3e5bf0c922	mem: Remove the GHB prefetcher from the source tree There are two primary issues with this code which make it deserving of deletion. 1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher 2) This prefetcher isn't even structured like a GHB prefetcher. It's basically a worse version of the stride prefetcher. It primarily serves to confuse new gem5 users and most functionality is already present in the stride prefetcher.	2014-09-20 17:17:44 -04:00
Andreas Hansson	efd5cf323a	misc: Use safe_cast when assumptions are made about return value This patch changes two dynamic_cast to safe_cast as we assume the return value is not NULL (without checking).	2014-09-19 10:35:11 -04:00
Andreas Hansson	f615c4aeb0	misc: Remove assertions ensuring unsigned values >= 0	2014-09-19 10:35:07 -04:00
Andreas Hansson	377f081251	mem: Check return value of checkFunctional in SimpleMemory Simple fix to ensure we only iterate until we are done.	2014-09-19 10:35:06 -04:00
Andreas Hansson	38646d48eb	mem: Add checks to sendTimingReq in cache A small fix to ensure the return value is not ignored.	2014-09-19 10:35:04 -04:00
Nilay Vaish	2ccdfc547d	ruby: network: revert some of the changes from ad9c042dce54 The changeset ad9c042dce54 made changes to the structures under the network directory to use a map of buffers instead of vector of buffers. The reasoning was that not all vnets that are created are used and we needlessly allocate more buffers than required and then iterate over them while processing network messages. But the move to map resulted in a slow down which was pointed out by Andreas Hansson. This patch moves things back to using vector of message buffers.	2014-09-15 16:19:38 -05:00
Mitch Hayenga	9a595fac74	mem: Add accessor function for vaddr Determine if a request has an associated virtual address.	2014-09-09 04:36:33 -04:00
Andreas Hansson	da4539dc74	misc: Fix a number of uninitialised variables and members Static analysis unearthed a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means fewer false positives next time round.	2014-09-09 04:36:31 -04:00
Andreas Hansson	2698e73966	base: Use the global Mersenne twister throughout This patch tidies up random number generation to ensure that it is done consistently throughout the code base. In essence this involves a clean-up of Ruby, and some code simplifications in the traffic generator. As part of this patch a bunch of skewed distributions (off-by-one etc) have been fixed. Note that a single global random number generator is used, and that the object instantiation order will impact the behaviour (the sequence of numbers will be unaffected, but if module A calls random before module B then they would obviously see a different outcome). The dependency on the instantiation order is true in any case due to the execution-model of gem5, so we leave it as is. Also note that the global random generator is not thread safe at this point. Regressions using the memtest, TrafficGen or any Ruby tester are affected and will be updated accordingly.	2014-09-03 07:42:54 -04:00
Andreas Hansson	1ff4c45bbb	mem: Avoid unnecessary retries when bus peer is not ready This patch removes unnecessary retries that happened when the bus layer itself was no longer busy, but the peer was not yet ready. Instead of sending a retry that will inevitably not succeed, the bus now silently waits until the peer sends a retry.	2014-09-03 07:42:53 -04:00
Curtis Dunham	f6f63ec0aa	mem: write streaming support via WriteInvalidate promotion Support full-block writes directly rather than requiring RMW: * a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp. * only top-level caches allocate the line; the others just pass the request along and invalidate as necessary. * to close a timing window between the Req and the Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory.	2014-06-27 12:29:00 -05:00
Andreas Hansson	3be4f4b846	mem: Fix a bug in the cache port flow control This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ. The patch fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding.	2014-09-03 07:42:50 -04:00
Curtis Dunham	5d029463ee	cpu, mem: Make software prefetches non-blocking Previously, they were treated so much like loads that they could stall at the head of the ROB. Now they are always treated like L1 hits. If they actually miss, a new request is created at the L1 and tracked from the MSHRs there if necessary (i.e. if it didn't coalesce with an existing outstanding load).	2014-05-13 12:20:49 -05:00
Curtis Dunham	e3b19cb294	mem: Refactor assignment of Packet types Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function.	2014-05-13 12:20:48 -05:00
Geoffrey Blake	b404ffde60	cache: Fix handling of LL/SC requests under contention If a set of LL/SC requests contend on the same cache block we can get into a situation where CPUs will deadlock if they expect a failed SC to supply them data. This case happens where 3 or more cores are contending for a cache block using LL/SC and the system is configured where 2 cores are connected to a local bus and the third is connected to a remote bus. If a core on the local bus sends an SCUpgrade and the core on the remote bus sends an SCUpgrade they will race to see who will win the SC access. In the meantime if the other core appends a read to one of the SCUpgrades it will expect to be supplied data by that SCUpgrade transaction. If it happens that the SCUpgrade that was picked to supply the data is failed, it will drop the appended request for data and never respond, leaving the requesting core to deadlock. This patch makes all SC's behave as normal stores to prevent this case but still makes sure to check whether it can perform the update.	2014-09-03 07:42:31 -04:00
Andreas Hansson	77c28cc395	mem: Packet queue clean up No change in functionality, just a bit of tidying up.	2014-09-03 07:42:28 -04:00
Andreas Hansson	e1ac962939	arch: Cleanup unused ISA traits constants This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc. The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed.	2014-09-03 07:42:21 -04:00
Nilay Vaish	2cbe7c705b	ruby: remove typedef of Index as int64 The Index type defined as typedef int64 does not really provide any help since in most places we use primitive types instead of Index. Also, the name Index is so generic that it does not merit being used as a typename.	2014-09-01 16:55:50 -05:00
Nilay Vaish	b4dade6fb2	ruby: PerfectSwitch: moves code to a per vnet helper function This patch moves code from the wakeup() function to an operateVnet() function. The aim is to improve the readability of the code.	2014-09-01 16:55:48 -05:00
Nilay Vaish	7a0d5aafe4	ruby: message buffers: significant changes This patch is the final patch in a series of patches. The aim of the series is to make ruby more configurable than it was. More specifically, the connections between controllers are not at all possible (unless one is ready to make significant changes to the coherence protocol). Moreover the buffers themselves are magically connected to the network inside the slicc code. These connections are not part of the configuration file. This patch makes changes so that these connections will now be made in the python configuration files associated with the protocols. This requires each state machine to expose the message buffers it uses for input and output. So, the patch makes these buffers configurable members of the machines. The patch drops the slicc code that used to connect these buffers to the network. Now these buffers are exposed to the python configuration system as Master and Slave ports. In the configuration files, any master port can be connected to any slave port. The file pyobject.cc has been modified to take care of allocating the actual message buffer. This is in line with how other port connections work.	2014-09-01 16:55:47 -05:00
Nilay Vaish	00286fc5cb	build opts: add MI_example to NULL ISA A later changeset changes the file src/python/swig/pyobject.cc to include a header file that includes a header file generated at build time depending on the PROTOCOL in use. Since NULL ISA was not specifying any protocol, this resulted in compilation problems. Hence, the changeset.	2014-09-01 16:55:46 -05:00
Nilay Vaish	d07abd9b5b	mem: change the namespace Message to ProtoMessage The namespace Message conflicts with the Message data type used extensively in Ruby. Since Ruby is being moved to the same Master/Slave ports based configuration style as the rest of gem5, this conflict needs to be resolved. Hence, the namespace is being renamed to ProtoMessage.	2014-09-01 16:55:46 -05:00
Nilay Vaish	cee8faaad0	ruby: slicc: change the way configurable members are specified There are two changes this patch makes to the way configurable members of a state machine are specified in SLICC. The first change is that the data member declarations will need to be separated by a semi-colon instead of a comma. Secondly, the default value to be assigned would now use SLICC's assignment operator i.e. ':='.	2014-09-01 16:55:45 -05:00
Nilay Vaish	b1d3873ec5	ruby: slicc: improve the grammar This patch changes the grammar for SLICC so as to remove some of the redundant / duplicate rules. In particular rules for object/variable declaration and class member declaration have been unified. Similarly, the rules for a general function and a class method have been unified. One more change is in the priority of two rules. The first rule is on declaring a function with all the params typed and named. The second rule is on declaring a function with all the params only typed. Earlier the second rule had a higher priority. Now the first rule has a higher priority.	2014-09-01 16:55:44 -05:00
Nilay Vaish	3202ec98e7	ruby: mesi three level: slight naming changes.	2014-09-01 16:55:44 -05:00
Nilay Vaish	557200725c	ruby: slicc: do not prefix machine name to variables This changeset does away with prefixing of member variables of state machines with the identity of the machine itself.	2014-09-01 16:55:43 -05:00
Nilay Vaish	6ceb1aadc2	ruby: remove unused toString() from AbstractController	2014-09-01 16:55:42 -05:00
Nilay Vaish	00dbadcbb0	ruby: network: move getNumNodes() to base class All the implementations were doing the same things.	2014-09-01 16:55:42 -05:00
Nilay Vaish	cc2cc58869	ruby: eliminate type Time There is another type Time in src/base class which results in a conflict.	2014-09-01 16:55:41 -05:00
Nilay Vaish	82d136285d	ruby: move files from ruby/system to ruby/structures The directory ruby/system is crowded and unorganized. Hence, the files that hold actual physical structures are being moved to the directory ruby/structures. This includes Cache Memory, Directory Memory, Memory Controller, Wire Buffer, TBE Table, Perfect Cache Memory, Timer Table, Bank Array. The directory ruby/system has the glue code that holds these structures together. --HG-- rename : src/mem/ruby/system/MachineID.hh => src/mem/ruby/common/MachineID.hh rename : src/mem/ruby/buffers/MessageBuffer.cc => src/mem/ruby/network/MessageBuffer.cc rename : src/mem/ruby/buffers/MessageBuffer.hh => src/mem/ruby/network/MessageBuffer.hh rename : src/mem/ruby/buffers/MessageBufferNode.cc => src/mem/ruby/network/MessageBufferNode.cc rename : src/mem/ruby/buffers/MessageBufferNode.hh => src/mem/ruby/network/MessageBufferNode.hh rename : src/mem/ruby/system/AbstractReplacementPolicy.hh => src/mem/ruby/structures/AbstractReplacementPolicy.hh rename : src/mem/ruby/system/BankedArray.cc => src/mem/ruby/structures/BankedArray.cc rename : src/mem/ruby/system/BankedArray.hh => src/mem/ruby/structures/BankedArray.hh rename : src/mem/ruby/system/Cache.py => src/mem/ruby/structures/Cache.py rename : src/mem/ruby/system/CacheMemory.cc => src/mem/ruby/structures/CacheMemory.cc rename : src/mem/ruby/system/CacheMemory.hh => src/mem/ruby/structures/CacheMemory.hh rename : src/mem/ruby/system/DirectoryMemory.cc => src/mem/ruby/structures/DirectoryMemory.cc rename : src/mem/ruby/system/DirectoryMemory.hh => src/mem/ruby/structures/DirectoryMemory.hh rename : src/mem/ruby/system/DirectoryMemory.py => src/mem/ruby/structures/DirectoryMemory.py rename : src/mem/ruby/system/LRUPolicy.hh => src/mem/ruby/structures/LRUPolicy.hh rename : src/mem/ruby/system/MemoryControl.cc => src/mem/ruby/structures/MemoryControl.cc rename : src/mem/ruby/system/MemoryControl.hh => src/mem/ruby/structures/MemoryControl.hh rename : 
src/mem/ruby/system/MemoryControl.py => src/mem/ruby/structures/MemoryControl.py rename : src/mem/ruby/system/MemoryNode.cc => src/mem/ruby/structures/MemoryNode.cc rename : src/mem/ruby/system/MemoryNode.hh => src/mem/ruby/structures/MemoryNode.hh rename : src/mem/ruby/system/MemoryVector.hh => src/mem/ruby/structures/MemoryVector.hh rename : src/mem/ruby/system/PerfectCacheMemory.hh => src/mem/ruby/structures/PerfectCacheMemory.hh rename : src/mem/ruby/system/PersistentTable.cc => src/mem/ruby/structures/PersistentTable.cc rename : src/mem/ruby/system/PersistentTable.hh => src/mem/ruby/structures/PersistentTable.hh rename : src/mem/ruby/system/PseudoLRUPolicy.hh => src/mem/ruby/structures/PseudoLRUPolicy.hh rename : src/mem/ruby/system/RubyMemoryControl.cc => src/mem/ruby/structures/RubyMemoryControl.cc rename : src/mem/ruby/system/RubyMemoryControl.hh => src/mem/ruby/structures/RubyMemoryControl.hh rename : src/mem/ruby/system/RubyMemoryControl.py => src/mem/ruby/structures/RubyMemoryControl.py rename : src/mem/ruby/system/SparseMemory.cc => src/mem/ruby/structures/SparseMemory.cc rename : src/mem/ruby/system/SparseMemory.hh => src/mem/ruby/structures/SparseMemory.hh rename : src/mem/ruby/system/TBETable.hh => src/mem/ruby/structures/TBETable.hh rename : src/mem/ruby/system/TimerTable.cc => src/mem/ruby/structures/TimerTable.cc rename : src/mem/ruby/system/TimerTable.hh => src/mem/ruby/structures/TimerTable.hh rename : src/mem/ruby/system/WireBuffer.cc => src/mem/ruby/structures/WireBuffer.cc rename : src/mem/ruby/system/WireBuffer.hh => src/mem/ruby/structures/WireBuffer.hh rename : src/mem/ruby/system/WireBuffer.py => src/mem/ruby/structures/WireBuffer.py rename : src/mem/ruby/recorder/CacheRecorder.cc => src/mem/ruby/system/CacheRecorder.cc rename : src/mem/ruby/recorder/CacheRecorder.hh => src/mem/ruby/system/CacheRecorder.hh	2014-09-01 16:55:40 -05:00
Alexandru	5efbb4442a	mem: adding architectural page table support for SE mode This patch enables the use of page tables that are stored in system memory and respect x86 specification, in SE mode. It defines an architectural page table for x86 as a MultiLevelPageTable class and puts a placeholder class for other ISAs page tables, giving the possibility for future implementation.	2014-08-28 10:11:44 -05:00
Alexandru	26ac28dec2	mem: adding a multi-level page table class This patch defines a multi-level page table class that stores the page table in system memory, consistent with ISA specifications. In this way, cpu models that use the actual hardware to execute (e.g. KvmCPU), are able to traverse the page table.	2014-04-01 12:18:12 -05:00
Andreas Hansson	9e4cd5bf1e	mem: Fix DRAMSim2 cycle check when restoring from checkpoint This patch ensures the cycle check is still valid even when restoring from a checkpoint. In this case the DRAMSim2 cycle count is relative to the startTick rather than 0.	2014-08-26 10:14:38 -04:00
Andreas Hansson	3efabb4b2f	mem: Update DRAM controller comments Update comments and add a reference for more information.	2014-08-26 10:13:03 -04:00
Andreas Hansson	56b7796e0d	mem: Fix address interleaving bug in DRAM controller This patch fixes a bug in the DRAM controller address decoding. In cases where the DRAM burst size (e.g. 32 bytes in a rank with a single LPDDR3 x32) was smaller than the channel interleaving size (e.g. systems with a 64-byte cache line) one address bit effectively got used as a channel bit when it should have been a low-order column bit. This patch adds a notion of "columns per stripe", and more clearly deals with the low-order column bits and high-order column bits. The patch also relaxes the granularity check such that it is possible to use interleaving granularities other than the cache line size. The patch also adds a missing M5_CLASS_VAR_USED to the tCK member as it is only used in the debug build for now.	2014-08-26 10:12:45 -04:00
Mitch Hayenga	f6f6ae461e	mem: Properly set cache block status fields on writebacks When a cacheline is written back to a lower-level cache, tags->insertBlock() sets various status parameters. However these status bits were cleared immediately after calling. This patch makes it so that these status fields are not cleared by moving them outside of the tags->insertBlock() call.	2014-08-13 06:57:24 -04:00
Anthony Gutierrez	a628afedad	mem: refactor LRU cache tags and add random replacement tags this patch implements a new tags class that uses a random replacement policy. these tags prefer to evict invalid blocks first, if none are available a replacement candidate is chosen at random. this patch factors out the common code in the LRU class and creates a new abstract class: the BaseSetAssoc class. any set associative tag class must implement the functionality related to the actual replacement policy in the following methods: accessBlock() findVictim() insertBlock() invalidate()	2014-07-28 12:23:23 -04:00
Andreas Hansson	1f539ce4cc	mem: DRAMPower trace output This patch adds a DRAMPower flag to enable off-line DRAM power analysis using the DRAMPower tool. A new DRAMPower flag is added and a follow-on patch adds a Python script to post-process the output and order it based on time stamps. The long-term goal is to link DRAMPower as a library and provide the commands through function calls to the model rather than first printing and then parsing the commands. At the moment it is also up to the user to ensure that the same DRAM configuration is used by the gem5 controller model and DRAMPower.	2014-06-30 13:56:03 -04:00
Andreas Hansson	b4ce51eb9e	mem: Add bank and rank indices as fields to the DRAM bank This patch adds the index of the bank and rank as a field so that we can determine the identity of a given bank (reference or pointer) for the power tracing. We also grab the opportunity of cleaning up the arguments used for identifying the bank when activating.	2014-06-30 13:56:02 -04:00
Andreas Hansson	d59bc8ee1f	mem: Extend DRAM row bits from 16 to 32 for larger densities This patch extends the DRAM row bits to 32 to support larger density memories. Additional checks are also added to ensure the row fits in the 32 bits.	2014-06-30 13:56:01 -04:00
Steve Reinhardt	0be64ffe2f	style: eliminate equality tests with true and false Using '== true' in a boolean expression is totally redundant, and using '== false' is pretty verbose (and arguably less readable in most cases) compared to '!'. It's somewhat of a pet peeve, perhaps, but I had some time waiting for some tests to run and decided to clean these up. Unfortunately, SLICC appears not to have the '!' operator, so I had to leave the '== false' tests in the SLICC code.	2014-05-31 18:00:23 -07:00
Nilay Vaish	e685767b58	ruby: slicc: remove unused ids DNUCA*	2014-05-23 06:07:02 -05:00
Nilay Vaish	9c9257a612	ruby: remove old protocol documentation	2014-05-23 06:07:02 -05:00
Nilay Vaish	8bf41e41c1	ruby: message buffer: drop dequeue_getDelayCycles() The functionality of updating and returning the delay cycles would now be performed by the dequeue() function itself.	2014-05-23 06:07:02 -05:00
Andreas Hansson	f800f268db	mem: Update DDR3 and DDR4 based on datasheets This patch makes a more firm connection between the DDR3-1600 configuration and the corresponding datasheet, and also adds a DDR3-2133 and a DDR4-2400 configuration. At the moment there is also an ongoing effort to align the choice of datasheets to what is available in DRAMPower.	2014-05-09 18:58:49 -04:00
Andreas Hansson	cc4ca78f99	mem: Add DRAM cycle time This patch extends the current timing parameters with the DRAM cycle time. This is needed as the DRAMPower tool expects timestamps in DRAM cycles. At the moment we could get away with doing this in a post-processing step as the DRAMPower execution is separate from the simulation run. However, in the long run we want the tool to be called during the simulation, and then the cycle time is needed.	2014-05-09 18:58:49 -04:00
Andreas Hansson	8c56efe747	mem: Simplify DRAM response scheduling This patch simplifies the DRAM response scheduling based on the assumption that they are always returned in order.	2014-05-09 18:58:48 -04:00
Andreas Hansson	8e3869411d	mem: Add precharge all (PREA) to the DRAM controller This patch adds the basic ingredients for a precharge all operation, to be used in conjunction with DRAM power modelling. Currently we do not try and apply any cleverness when precharging all banks, thus even if only a single bank is open we use PREA as opposed to PRE. At the moment we only have a single tRP (tRPpb), and do not model the slightly longer all-bank precharge constraint (tRPab).	2014-05-09 18:58:48 -04:00
Andreas Hansson	0ba1e72e9b	mem: Remove printing of DRAM params This patch removes the redundant printing of DRAM params.	2014-05-09 18:58:48 -04:00
Andreas Hansson	6753cb705e	mem: Add tRTP to the DRAM controller This patch adds the tRTP timing constraint, governing the minimum time between a read command and a precharge. Default values are provided for the existing DRAM types.	2014-05-09 18:58:48 -04:00
Andreas Hansson	60799dc552	mem: Merge DRAM latency calculation and bank state update This patch merges the two control paths used to estimate the latency and update the bank state. As a result of this merging the computation is now in one place only, and should be easier to follow as it is all done in absolute (rather than relative) time. As part of this change, the scheduling is also refined to ensure that we look at a sensible estimate of the bank ready time in choosing the next request. The bank latency stat is removed as it ends up being misleading when the DRAM access code gets evaluated ahead of time (due to the eagerness of waking the model up for scheduling the next request).	2014-05-09 18:58:48 -04:00
Andreas Hansson	b8631d9ae8	mem: Add tWR to DRAM activate and precharge constraints This patch adds the write recovery time to the DRAM timing constraints, and changes the current tRASDoneAt to a more generic preAllowedAt, capturing when a precharge is allowed to take place. The part of the DRAM access code that accounts for the precharge and activate constraints is updated accordingly.	2014-05-09 18:58:48 -04:00
Andreas Hansson	c735ef6cb0	mem: Merge DRAM page-management calculations This patch treats the closed page policy as yet another case of auto-precharging, and thus merges the code with that used for the other policies.	2014-05-09 18:58:48 -04:00
Andreas Hansson	87f4c956c4	mem: Add DRAM power states to the controller This patch adds power states to the controller. These states and the transitions can be used together with the Micron power model. As a more elaborate use-case, the transitions can be used to drive the DRAMPower tool. At the moment, the power-down modes are not used, and this patch simply serves to capture the idle, auto refresh and active modes. The patch adds a third state machine that interacts with the refresh state machine.	2014-05-09 18:58:48 -04:00
Andreas Hansson	babf072c1c	mem: Ensure DRAM refresh respects timings This patch adds a state machine for the refresh scheduling to ensure that no accesses are allowed while the refresh is in progress, and that all banks are properly precharged. As part of this change, the precharging of banks is broken out into a method of its own, making it similar to how activations are dealt with. The idle accounting is also updated to ensure that the refresh duration is not added to the time that the DRAM is in the idle state with all banks precharged.	2014-05-09 18:58:48 -04:00
Andreas Hansson	5c2c3f598e	mem: Make DRAM read/write switching less conservative This patch changes the read/write event loop to use a single event (nextReqEvent), along with a state variable, thus joining the two control flows. This change makes it easier to follow the state transitions, and control what happens when. With the new loop we modify the overly conservative switching times such that the write-to-read switch allows bank preparation to happen in parallel with the bus turn around. Similarly, the read-to-write switch uses the introduced tRTW constraint.	2014-05-09 18:58:48 -04:00
Mitch Hayenga	a15b713cba	mem: Squash prefetch requests from downstream caches This patch squashes prefetch requests from downstream caches, so that they do not steal cachelines away from caches closer to the cpu. It was originally coded by Mitch Hayenga and modified by Aasheesh Kolli.	2014-05-09 18:58:46 -04:00
Sascha Bischoff	e940bac278	mem: Auto-generate CommMonitor trace file names Splits the CommMonitor trace_file parameter into three parameters. Previously, the trace was only enabled if the trace_file parameter was set, and would be written to this file. This patch adds in a trace_enable and trace_compress parameter to the CommMonitor. No trace is generated if trace_enable is set to False. If it is set to True, the trace is written to a file based on the name of the SimObject in the simulation hierarchy. For example, system.cluster.il1_commmonitor.trc. This filename can be overridden by additionally specifying a file name to the trace_file parameter (more on this later). The trace_compress parameter will append .gz to any filename if set to True. This enables compression of the generated traces. If the file name already ends in .gz, then no changes are made. The trace_file parameter will override the name set by the trace_enable parameter. In the case that the specified name does not end in .gz but trace_compress is set to true, .gz is appended to the supplied file name.	2014-05-09 18:58:46 -04:00
Mitch Hayenga	a0d30f36a6	mem: Don't print out the data of a cache block This never actually worked since it was printing out only a word of the cache block and not the entire thing, and doubly didn't work because csprintf overrides the %#x specifier and assumes a char* array is actually a string.	2014-04-01 14:24:36 -05:00
Nilay Vaish	4ceeda20aa	ruby: slicc: remove old documentation Has not been maintained at all. Since there is alternate documentation available on gem5.org, no need to have this separately.	2014-04-19 09:00:31 -05:00
Nilay Vaish	183100b8cb	ruby: slicc: slight change to rule for transitions It had an unnecessary pairs token which is being removed.	2014-04-19 09:00:31 -05:00
Marco Elver	d9fa950396	ruby: recorder: Fix (de-)serializing with different cache block-sizes Upon aggregating records, serialize system's cache-block size, as the cache-block size can be different when restoring from a checkpoint. This way, we can correctly read all records when restoring from a checkpoint, even if the cache-block size is different. Note that it is only possible to restore from a checkpoint if the desired cache-block size is smaller than or equal to the cache-block size when the checkpoint was taken; we can split one larger request into multiple small ones, but it is not reliable to do the opposite. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-04-19 09:00:30 -05:00
Nilay Vaish	d805e42b81	ruby: slicc: change enqueue statement As of now, the enqueue statement can take in any number of 'pairs' as argument. But we only use the pair in which latency is the key. This latency is allowed to be either a fixed integer or a member variable of controller in which the expression appears. This patch drops the use of pairs in an enqueue statement. Instead, an expression is allowed which will be interpreted to be the latency of the enqueue. This expression can be anything allowed by slicc including a constant integer or a member variable.	2014-04-08 13:26:30 -05:00
Nilay Vaish	e689c00b16	ruby: coherence protocols: drop the phrase IntraChip The phrase is no longer valid since we do not distinguish between inter and intra chip communication.	2014-04-08 13:26:29 -05:00
Andreas Hansson	a00383a40a	mem: Track DRAM read/write switching and add hysteresis This patch adds stats for tracking the number of reads/writes per bus turn around, and also adds hysteresis to the write-to-read switching to ensure that the queue does not oscillate around the low threshold.	2014-03-23 11:12:14 -04:00
Andreas Hansson	7c18691db1	mem: Rename SimpleDRAM to a more suitable DRAMCtrl This patch renames the not-so-simple SimpleDRAM to a more suitable DRAMCtrl. The name change is intended to ensure that we do not send the wrong message (although the "simple" in SimpleDRAM was originally intended as in cleverly simple, or elegant). As the DRAM controller modelling work is being presented at ISPASS'14 our hope is that a broader audience will use the model in the future. --HG-- rename : src/mem/SimpleDRAM.py => src/mem/DRAMCtrl.py rename : src/mem/simple_dram.cc => src/mem/dram_ctrl.cc rename : src/mem/simple_dram.hh => src/mem/dram_ctrl.hh	2014-03-23 11:12:12 -04:00
Andreas Hansson	3dd1587afc	mem: Change memory defaults to be more representative Make the default memory type DDR3-1600 x64, and use the open-adaptive page policy. This change is aiming to ensure that users by default are using a realistic memory system.	2014-03-23 11:12:10 -04:00
Wendy Elsasser	bbbae677ed	mem: Add close adaptive paging policy to DRAM controller model This patch adds a second adaptive page policy to the DRAM controller, closing the page unless there are already queued accesses to the open page.	2014-03-23 11:12:08 -04:00
Andreas Hansson	03a1aed803	mem: DRAM controller tidying up Minor tidying up and removing of redundant code, including the printing of queue state every million accesses.	2014-03-23 11:12:06 -04:00
Andreas Hansson	bc83eb2197	mem: Fix bug in DRAM bytes per activate This patch ensures that we do not sample the bytes per activate when the row has already been closed.	2014-03-23 11:12:05 -04:00
Andreas Hansson	116985d661	mem: Limit the accesses to a page before forcing a precharge This patch adds a basic starvation-prevention mechanism where a DRAM page is forced to close after a certain number of accesses. The limit is combined with the open and open-adaptive page policy and if reached causes an auto-precharge.	2014-03-23 11:12:03 -04:00
Andreas Hansson	6557741311	mem: Make DRAM write queue draining more aggressive This patch changes the triggering condition for the write draining such that we grab the opportunity to issue writes if there are no reads waiting (as opposed to waiting for the writes to reach the high threshold). As a result, we potentially drain some of the writes in read idle periods (if any). A low threshold is added to be able to control how many write bursts are kept in the memory controller queue (acting as on-chip storage). The high and low thresholds are updated to sensible values for a 32/64 size write buffer. Note that the thresholds should be adjusted along with the queue sizes. This patch also adds some basic initialisation sanity checks and moves part of the initialisation to the constructor.	2014-03-23 11:12:01 -04:00
Neha Agarwal	43abaf518f	mem: DDR3 config for comparing with DRAMSim2 This patch adds a new DDR3 configuration to match with the parameters that are specified in one of the DDR3 configs used in DRAMSim2.	2014-03-23 11:11:56 -04:00
Andreas Hansson	7e7b67472a	mem: More descriptive address-mapping scheme names This patch adds the row bits to the name of the address mapping schemes to make it more clear that all the current schemes place the row bits as the most significant bits.	2014-03-23 11:11:53 -04:00
Andreas Hansson	9ac4f781ec	ruby: Move Ruby debug flags to ruby dir and remove stale options This patch moves the Ruby-related debug flags to the ruby sub-directory, and also removes the stale SConsopts that add the no-longer-used NO_VECTOR_BOUNDS_CHECK.	2014-03-23 11:11:48 -04:00
Andreas Hansson	9f018d2f5a	mem: Include the DRAMSim2 wrapper in NULL build This patch makes sure DRAMSim2 is included in a build of the NULL ISA.	2014-03-23 11:11:44 -04:00
Sascha Bischoff	548d47ea2c	mem: CommMonitor trace warn on non-timing mode Add a warning to the CommMonitor which will alert the user if they try and record a trace when the system is not in timing mode.	2014-03-23 11:11:40 -04:00
Nilay Vaish	52a83c1d0e	ruby: consumer: avoid accessing wakeup times when waking up Each consumer object maintains a set of tick values when the object is supposed to wakeup and do some processing. As of now, the object accesses this set both when scheduling a wakeup event and when the object actually wakes up. The set is accessed during wakeup to remove the current tick value from the set. This functionality is now being moved to the scheduling function where ticks are removed at a later time.	2014-03-20 09:14:14 -05:00
Nilay Vaish	4b67ada89e	ruby: garnet: convert network interfaces into clocked objects This helps in configuring the network interfaces from the python script and these objects no longer rely on the network object for the timing information.	2014-03-20 09:14:14 -05:00
Nilay Vaish	4f7ef51efb	ruby: slicc: code refactor	2014-03-20 09:14:14 -05:00
Nilay Vaish	9b3418d163	ruby: no piobus in se mode Piobus was recently added to se scripts for ruby so that the interrupt controller can be connected to something (required since the interrupt controller sends address range messages). This patch removes the piobus and instead, the pio port of ruby port will now ignore the range change messages in se mode.	2014-03-20 08:03:09 -05:00
Nilay Vaish	f7e7fa6d90	ruby: remove some of the unnecessary code	2014-03-17 17:40:14 -05:00
Prakash Ramrakhyani	e88cffb30a	mem: Fix incorrect assert failure in the Cache This patch fixes an assert condition that is not true at all times. There are valid situations that arise in dual-core dual-workload runs where the assert condition is false. The function call following the assert however needs to be called only when the condition is true (a block cannot be invalidated in the tags structure if it has not been allocated in the structure, and the tempBlock is never allocated). Hence the 'assert' has been replaced with an 'if'.	2014-03-07 15:56:23 -05:00

2021 commits