Commit graph

1977 commits

Author SHA1 Message Date
Nilay Vaish 4b19e06644 ruby: sequencer: remove commented out function printProgress() 2015-09-16 11:59:55 -05:00
David Hashe b6b972da99 ruby: rename System.{hh,cc} to RubySystem.{hh,cc}
The eventual aim of this change is to pass RubySystem pointers through to
objects generated from the SLICC protocol code.

Because some of these objects need to dereference their RubySystem pointers,
they need access to the System.hh header file.

In src/mem/ruby/SConscript, the MakeInclude function creates single-line header
files in the build directory that do nothing except include the corresponding
header file from the source tree.

However, SLICC also generates a list of header files from its symbol table, and
writes it to mem/protocol/Types.hh in the build directory. This code assumes
that the header file name is the same as the class name.

The end result of this is the many of the generated slicc files try to include
RubySystem.hh, when the file they really need is System.hh. The path of least
resistence is just to rename System.hh to RubySystem.hh.

--HG--
rename : src/mem/ruby/system/System.cc => src/mem/ruby/system/RubySystem.cc
rename : src/mem/ruby/system/System.hh => src/mem/ruby/system/RubySystem.hh
2015-09-16 12:03:03 -04:00
Anthony Gutierrez 3edadb0bd3 slicc: export uint64_t instead of uint64 2015-09-16 12:01:39 -04:00
Nilay Vaish 6bee1d9124 ruby: topology: refactor code. 2015-09-14 10:14:50 -05:00
Nilay Vaish 4e898be762 ruby: slicc: remove member buffer_expr from Var class
This was added by changeset 51f40b101a56.  Instead, buffer_expr would now be
associated with the InPort class.
2015-09-14 10:04:55 -05:00
Nilay Vaish 8b199b775e ruby: perfect switch: refactor code
Refactored the code in operateVnet(), moved partly to a new function
operateMessageBuffer().  This is required since a later patch moves to having a
wakeup event per MessageBuffer instead of one event for the entire Switch.
2015-09-12 16:16:17 -05:00
Nilay Vaish 25cd13dbf1 ruby: simple network: store Switch* in PerfectSwitch and Throttle
There are two reasons for doing so:

a. provide a source of clock to PerfectSwitch. A follow on patch removes sender
and receiver pointers from MessageBuffer means that the object owning the
buffer should have some way of providing timing info.

b. schedule events.  A follow on patch removes the consumer class.  So the
PerfectSwitch needs some EventManager object to schedule events on its own.
2015-09-12 16:16:03 -05:00
Nilay Vaish f611d4f22e ruby: slicc: remove nextLineHack from Type.py 2015-09-08 19:32:04 -05:00
Nilay Vaish 740984b30b ruby: call setMRU from L1 controllers, not from sequencer
Currently the sequencer calls the function setMRU that updates the replacement
policy structures with the first level caches.  While functionally this is
correct, the problem is that this requires calling findTagInSet() which is an
expensive function.  This patch removes the calls to setMRU from the sequencer.
All controllers should now update the replacement policy on their own.

The set and the way index for a given cache entry can be found within the
AbstractCacheEntry structure. Use these indicies to update the replacement
policy structures.
2015-09-05 09:35:39 -05:00
Nilay Vaish 8f29298bc7 ruby: adds set and way indices to AbstractCacheEntry 2015-09-05 09:35:31 -05:00
Nilay Vaish abcc67010e ruby: set: reimplement using std::bitset
The current Set data structure is slow and therefore is being reimplemented
using std::bitset. A maximum limit of 64 is being set on the number of
controllers of each type.  This means that for simulating a system with more
controllers of a given type, one would need to change the value of the variable
NUMBER_BITS_PER_SET
2015-09-05 09:34:25 -05:00
Nilay Vaish 7962a81148 ruby: declare all protocol message buffers as parameters
MessageBuffer is a SimObject now.  There were protocols that still declared
some of the message buffers are variables of the controller, but not as input
parameters.  Special handling was required for these variables in the SLICC
compiler.  This patch changes this.  Now all message buffers are declared as
input parameters.
2015-09-05 09:34:24 -05:00
Andreas Hansson 419d437385 mem: Avoid setting markPending if not needed
In cases where a newly added target does not have any upstream MSHR to
mark as downstreamPending, remember that nothing is marked. This
allows us to avoid attempting to find the MSHR as part of the clearing
of downstreamPending.
2015-09-04 13:14:03 -04:00
Andreas Hansson 2c50a83ba2 mem: Tidy up CacheSet
Minor tweaks and house keeping.
2015-09-04 13:14:01 -04:00
Andreas Hansson 76088fb9ca mem: Tidy up the snoop state-transition logic
Remove broken and unused option to pass dirty data on non-exclusive
snoops. Also beef up the comments a bit.
2015-09-04 13:13:58 -04:00
Nilay Vaish fe47f0a72f ruby: remove random seed
We no longer use the C library based random number generator: random().
Instead we use the C++ library provided rng.  So setting the random seed for
the RubySystem class has no effect.  Hence the variable and the corresponding
option are being dropped.
2015-09-01 15:50:33 -05:00
Nilay Vaish 5d555df359 ruby: directory memory: drop unused variable. 2015-09-01 15:50:32 -05:00
Nilay Vaish a60a93eb05 ruby: specify number of vnets for each protocol
The default value for number of virtual networks is being removed.  Each protocol
should now specify the value it needs.
2015-08-30 12:24:18 -05:00
Nilay Vaish bf8ae288fa ruby: network: drop member m_in_use
This member indicates whether or not a particular virtual network is in use.
Instead of having a default big value for the number of virtual networks and
then checking whether a virtual network is in use, the next patch removes the
default value and the protocol configuration file would now specify the
number of virtual networks it requires.

Additionally, the patch also refactors some of the code used for computing the
virtual channel next in the round robin order.
2015-08-30 12:24:18 -05:00
Nilay Vaish 7175db4a3f ruby: garnet: mark few functions const in BaseGarnetNetwork.hh 2015-08-30 12:24:18 -05:00
Nilay Vaish 426e38af8b ruby: slicc: avoid duplicate code for function argument check
Both FuncCallExprAST and MethodCallExprAST had code for checking the arguments
with which a function is being called.  The patch does away with this
duplication.  Now the code for checking function call arguments resides in the
Func class.
2015-08-30 10:52:58 -05:00
Nilay Vaish 4727fc26f8 ruby: eliminate type uint64 and int64
These types are being replaced with uint64_t and int64_t.
2015-08-29 10:19:23 -05:00
Andreas Sandberg e9d6bf5e35 ruby: Use the const serialize interface in RubySystem
The new serialization code (kudos to Tim Jones) moves all of the state
mangling in RubySystem to memWriteback. This makes it possible to use
the new const serialization interface.

This changeset moves the cache recorder cleanup from the checkpoint()
method to drainResume() to make checkpointing truly constant and
updates the checkpointing code to use the new interface.
2015-08-28 10:58:44 +01:00
Nilay Vaish fc3d34a488 ruby: handle llsc accesses through CacheEntry, not CacheMemory
The sequencer takes care of llsc accesses by calling upon functions
from the CacheMemory.  This is unnecessary once the required CacheEntry object
is available.  Thus some of the calls to findTagInSet() are avoided.
2015-08-27 12:51:40 -05:00
Andreas Hansson ce4f6a9020 mem: Revert requirement on packet addr/size always valid
This patch reverts part of (842f56345a42), as apparently there are
use-cases outside the main repository relying on the late setting of
the physical address.
2015-08-24 05:03:45 -04:00
Andreas Hansson daaae3744d mem: Reflect that packet address and size are always valid
This patch simplifies the packet, and removes the possibility of
creating a packet without a valid address and/or size. Under no
circumstances are these fields set at a later point, and thus they
really have to be provided at construction time.

The patch also fixes a case there the MinorCPU creates a packet
without a valid address and size, only to later delete it.
2015-08-21 07:03:27 -04:00
Andreas Hansson 6eb434c8a2 arm, mem: Remove unused CLEAR_LL request flag
Cleaning up dead code. The CLREX stores zero directly to
MISCREG_LOCKFLAG and so the request flag is no longer needed. The
corresponding functionality in the cache tags is also removed.
2015-08-21 07:03:25 -04:00
Andreas Hansson bda79817c8 mem: Remove unused cache squash functionality
Tidying up.
2015-08-21 07:03:24 -04:00
Andreas Hansson ddfa96cf45 mem: Add explicit Cache subclass and make BaseCache abstract
Open up for other subclasses to BaseCache and transition to using the
explicit Cache subclass.

--HG--
rename : src/mem/cache/BaseCache.py => src/mem/cache/Cache.py
2015-08-21 07:03:23 -04:00
Andreas Hansson d71a0d790d ruby: Move Rubys cache class from Cache.py to RubyCache.py
This patch serves to avoid name clashes with the classic cache. For
some reason having two 'SimObject' files with the same name creates
problems.

--HG--
rename : src/mem/ruby/structures/Cache.py => src/mem/ruby/structures/RubyCache.py
2015-08-21 07:03:21 -04:00
Andreas Hansson 1bf389a2bf mem: Move cache_impl.hh to cache.cc
There is no longer any need to keep the implementation in a header.
2015-08-21 07:03:20 -04:00
Andreas Hansson ae06e9a5c6 cpu: Move invldPid constant from Request to BaseCPU
A more natural home for this constant.
2015-08-21 07:03:14 -04:00
Nilay Vaish 2f44dada68 ruby: reverts to changeset: bf82f1f7b040 2015-08-19 10:02:01 -05:00
Nilay Vaish 2d9f3f8582 ruby: add accessor functions to SLICC def of MachineID 2015-08-14 19:28:44 -05:00
Nilay Vaish 62dcbe3d95 ruby: simple network: refactor code
Drops an unused variable and marks three variables as const.
2015-08-14 19:28:44 -05:00
Nilay Vaish d0cf41300b ruby: profiler: provide the number of vnets through ruby system
The aim is to ultimately do away with the static function
Network::getNumberOfVirtualNetworks().
2015-08-14 19:28:44 -05:00
Nilay Vaish e63c120d0d ruby: directory memory: drop unused variable. 2015-08-14 19:28:44 -05:00
Nilay Vaish 8114c7ff2c ruby: slicc: remove a stray line in StateMachine.py 2015-08-14 19:28:44 -05:00
Nilay Vaish 85506f1c21 ruby: garnet: flexible: refactor flit 2015-08-14 19:28:44 -05:00
Nilay Vaish ae87d68551 ruby: DataBlock: adds a comment 2015-08-14 19:28:44 -05:00
Nilay Vaish d660b3145b ruby: remove random seed
We no longer use the C library based random number generator: random().
Instead we use the C++ library provided rng.  So setting the random seed for
the RubySystem class has no effect.  Hence the variable and the corresponding
option are being dropped.
2015-08-14 19:28:44 -05:00
Nilay Vaish ca368765a1 ruby: SubBlock: refactor code 2015-08-14 19:28:44 -05:00
Nilay Vaish 514f18cdda ruby: cache recorder: move check on block size to RubySystem. 2015-08-14 19:28:44 -05:00
Nilay Vaish 3a726752c1 ruby: abstract controller: mark some variables as const 2015-08-14 19:28:44 -05:00
Nilay Vaish 3230a0b89f ruby: simple network: store Switch* in PerfectSwitch and Throttle 2015-08-14 19:28:44 -05:00
Nilay Vaish cb133b5f2c ruby: remove unused functionalRead() function. 2015-08-14 19:28:44 -05:00
Nilay Vaish 5f1d1ce5d4 ruby: perfect switch: refactor code
Refactored the code in operateVnet(), moved partly to a new function
operateMessageBuffer().
2015-08-14 19:28:44 -05:00
Nilay Vaish a706b6259a ruby: cache memory: drop {try,test}CacheAccess functions 2015-08-14 19:28:43 -05:00
Nilay Vaish 5060e572ca ruby: call setMRU from L1 controllers, not from sequencer
Currently the sequencer calls the function setMRU that updates the replacement
policy structures with the first level caches.  While functionally this is
correct, the problem is that this requires calling findTagInSet() which is an
expensive function.  This patch removes the calls to setMRU from the sequencer.
All controllers should now update the replacement policy on their own.

The set and the way index for a given cache entry can be found within the
AbstractCacheEntry structure. Use these indicies to update the replacement
policy structures.
2015-08-14 19:28:43 -05:00
Nilay Vaish b815221718 ruby: adds set and way indices to AbstractCacheEntry 2015-08-14 19:28:43 -05:00
Nilay Vaish a6f3f38f2c ruby: eliminate type uint64 and int64
These types are being replaced with uint64_t and int64_t.
2015-08-14 19:28:43 -05:00
Nilay Vaish 9648c05db1 ruby: slicc: use default argument value
Before this patch, while one could declare / define a function with default
argument values, but the actual function call would require one to specify
all the arguments.  This patch changes the check for  function arguments.
Now a function call needs to specify arguments that are at least as much as
those with default values and at most the total number of arguments taken
as input by the function.
2015-08-14 19:28:43 -05:00
Nilay Vaish 7fc725fdb5 ruby: slicc: avoid duplicate code for function argument check
Both FuncCallExprAST and MethodCallExprAST had code for checking the arguments
with which a function is being called.  The patch does away with this
duplication.  Now the code for checking function call arguments resides in the
Func class.
2015-08-14 19:28:43 -05:00
Nilay Vaish f391cee5e1 ruby: drop the [] notation for lookup function.
This is in preparation for adding a second arugment to the lookup
function for the CacheMemory class.  The change to *.sm files was made using
the following sed command:

sed -i 's/\[\([0-9A-Za-z._()]*\)\]/.lookup(\1)/' src/mem/protocol/*.sm
2015-08-14 19:28:43 -05:00
Nilay Vaish 1a3e8a3370 ruby: handle llsc accesses through CacheEntry, not CacheMemory
The sequencer takes care of llsc accesses by calling upon functions
from the CacheMemory.  This is unnecessary once the required CacheEntry object
is available.  Thus some of the calls to findTagInSet() are avoided.
2015-08-14 19:28:42 -05:00
Nilay Vaish 91a84c5b3c ruby: replace Address by Addr
This patch eliminates the type Address defined by the ruby memory system.
This memory system would now use the type Addr that is in use by the
rest of the system.
2015-08-14 12:04:51 -05:00
Nilay Vaish 9ea5d9cad9 ruby: rename variables Addr to addr
Avoid clash between type Addr and variable name Addr.
2015-08-14 12:04:47 -05:00
Joel Hestness 905c0b347c ruby: Protocol changes for SimObject MessageBuffers 2015-08-14 00:19:45 -05:00
Joel Hestness 581bae9ecb ruby: Expose MessageBuffers as SimObjects
Expose MessageBuffers from SLICC controllers as SimObjects that can be
manipulated in Python. This patch has numerous benefits:
1) First and foremost, it exposes MessageBuffers as SimObjects that can be
manipulated in Python code. This allows parameters to be set and checked in
Python code to avoid obfuscating parameters within protocol files. Further, now
as SimObjects, MessageBuffer parameters are printed to config output files as a
way to track parameters across simulations (e.g. buffer sizes)

2) Cleans up special-case code for responseFromMemory buffers, and aligns their
instantiation and use with mandatoryQueue buffers. These two special buffers
are the only MessageBuffers that are exposed to components outside of SLICC
controllers, and they're both slave ends of these buffers. They should be
exposed outside of SLICC in the same way, and this patch does it.

3) Distinguishes buffer-specific parameters from buffer-to-network parameters.
Specifically, buffer size, randomization, ordering, recycle latency, and ports
are all specific to a MessageBuffer, while the virtual network ID and type are
intrinsics of how the buffer is connected to network ports. The former are
specified in the Python object, while the latter are specified in the
controller *.sm files. Unlike buffer-specific parameters, which may need to
change depending on the simulated system structure, buffer-to-network
parameters can be specified statically for most or all different simulated
systems.
2015-08-14 00:19:44 -05:00
Joel Hestness bf06911b3f ruby: Change PerfectCacheMemory::lookup to return pointer
CacheMemory and DirectoryMemory lookup functions return pointers to entries
stored in the memory. Bring PerfectCacheMemory in line with this convention,
and clean up SLICC code generation that was in place solely to handle
references like that which was returned by PerfectCacheMemory::lookup.
2015-08-14 00:19:39 -05:00
Joel Hestness 9567c839fe ruby: Remove the RubyCache/CacheMemory latency
The RubyCache (CacheMemory) latency parameter is only used for top-level caches
instantiated for Ruby coherence protocols. However, the top-level cache hit
latency is assessed by the Sequencer as accesses flow through to the cache
hierarchy. Further, protocol state machines should be enforcing these cache hit
latencies, but RubyCaches do not expose their latency to any existng state
machines through the SLICC/C++ interface. Thus, the RubyCache latency parameter
is superfluous for all caches. This is confusing for users.

As a step toward pushing L0/L1 cache hit latency into the top-level cache
controllers, move their latencies out of the RubyCache declarations and over to
their Sequencers. Eventually, these Sequencer parameters should be exposed as
parameters to the top-level cache controllers, which should assess the latency.
NOTE: Assessing these latencies in the cache controllers will require modifying
each to eliminate instantaneous Ruby hit callbacks in transitions that finish
accesses, which is likely a large undertaking.
2015-08-14 00:19:37 -05:00
Nilay Vaish 380a2ca918 ruby: slicc: allow mathematical operations on Ticks 2015-08-11 11:39:23 -05:00
Andreas Sandberg bbb3abc167 mem: Cleanup packet accessor methods
The Packet::get() and Packet::set() methods both have very strange
semantics. Currently, they automatically convert between the guest
system's endianness and the host system's endianness. This behavior is
usually undesired and unexpected.

This patch introduces three new method pairs to access data:
  * getLE() / setLE() - Get data stored as little endian.
  * getBE() / setBE() - Get data stored as big endian.
  * get(ByteOrder) / set(v, ByteOrder) - Configurable endianness

For example, a little endian device that is receiving a write request
will use teh getLE() method to get the data from the packet.

The old interface will be deprecated once all existing devices have
been ported to the new interface.
2015-08-07 09:59:28 +01:00
Andreas Sandberg 53e777d683 base: Declare a type for context IDs
Context IDs used to be declared as ad hoc (usually as int). This
changeset introduces a typedef for ContextIDs and a constant for
invalid context IDs.
2015-08-07 09:59:13 +01:00
Andreas Hansson 83a668ad25 mem: Remove extraneous acquire/release flags and attributes
This patch removes the extraneous flags and attributes from the
request and packet, and simply leaves the new commands. The change
introduced when adding acquire/release breaks all compatibility with
existing traces, and there is really no need for any new flags and
attributes. The commands should be sufficient.

This patch fixes packet tracing (urgent), and also removes the
unnecessary complexity.
2015-08-07 04:55:38 -04:00
Andreas Sandberg 0194e6eb2d mem: Fixup incorrect include guards
--HG--
extra : rebase_source : 9dba84eaf9c734a114ecd0940e1d505303644064
2015-08-05 10:12:12 +01:00
Andreas Sandberg a3f49f60c7 mem: Move trace functionality from the CommMonitor to a probe
This changeset moves the access trace functionality from the
CommMonitor into a separate probe. The probe can be hooked up to any
component that exports probe points of the type ProbePoints::Packet.

This patch moves the dependency on Google's Protocol Buffers library
from the CommMonitor to the MemTraceProbe, which means that the
CommMonitor (including stack distance profiling) no long depends on
it.
2015-08-04 10:29:13 +01:00
Andreas Sandberg 022e69e6de mem: Redesign the stack distance calculator as a probe
This changeset removes the stack distance calculator hooks from the
CommMonitor class and implements a stack distance calculator as a
memory system probe instead. The probe can be hooked up to any
component that exports probe points of the type ProbePoints::Packet.
2015-08-04 10:29:13 +01:00
Andreas Sandberg feded87fc9 mem: Add probe support to the CommMonitor
This changeset adds a standardized probe point type to monitor packets
in the memory system and adds two probe points to the CommMonitor
class. These probe points enable monitoring of successfully delivered
requests and successfully delivered responses.

Memory system probe listeners should use the BaseMemProbe base class
to provide a unified configuration interface and reuse listener
registration code. Unlike the ProbeListenerObject class, the
BaseMemProbe allows objects to be wired to multiple ProbeManager
instances as long as they use the same probe point name.
2015-08-04 10:29:13 +01:00
Timothy Jones 96091f358b uby: Fix checkpointing and restore
There are 2 problems with the existing checkpoint and restore code in ruby.
The first is that when the event queue is altered by ruby during serialization,
some events that are currently scheduled cannot be found (e.g. the event to
stop simulation that always lives on the queue), causing a panic.
The second is that ruby is sometimes serialized after the memory system,
meaning that the dirty data in its cache is flushed back to memory too late
and so isn't included in the checkpoint.

These are fixed by implementing memory writeback in ruby, using the same
technique of hijacking the event queue, but first descheduling all events that
are currently on it.  They are saved, along with their scheduled time, so that
the event queue can be faithfully reconstructed after writeback has finished.
Events with the AutoDelete flag set will delete themselves when they
are descheduled, causing an error when attempting to schedule them again.
This is fixed by simply not recording them when taking them off the queue.

Writeback is still implemented using flushing, so the cache recorder object,
that is created to generate the trace and manage flushing, is kept
around and used during serialization to write the trace to disk.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-08-03 23:08:40 -05:00
Nilay Vaish 676ae57827 ruby: mesi three level: multiple corrections to the protocol
1. Eliminate state NP in L0 and L1 Caches:  The two states 'NP' and 'I' both
mean that the cache block is not present in the cache.  'I' also means that the
cache entry has been allocated.  This causes problems when we do not correctly
initialize the cache entry when it is re-used.  Hence, this patch eliminates
the state NP altogether.  Everytime a new block comes into the cache, a cache
entry is allocated.  Everytime a block leaves, the corresponding entry is
deallocated.

2. Separate transient state for instruction fetches: purely for accouting
purposes.

3. Drop state IS_I in L1 Cache and the message type STALE_DATA: when
invalidation is received for a block in IS, the block used to be moved to IS_I.
This meant that the data that would arrive in future would be used but not
stored since the controller lost the permissions after gaining them.  This
state is being dropped and now invalidation messages would not processed till
the data has arrived.  This also means that STALE_DATA type is not longer
required.
2015-08-03 22:44:29 -05:00
Nilay Vaish 9bf3b8828a ruby: mesi two,three level: copy data only when dirty
The level 2 controller has a bug. In one particular action, the data block was
copied from a message irrespective whether the block is dirty or not.  In cases
when L1 sends no data, the data value copied was incorrect.
2015-08-03 22:44:28 -05:00
Brad Beckmann 03f2b8c23d ruby: removed invalid assert in message comparitor
It is perfectly valid to compare the same message and the greater than
operator should work correctly.
2015-08-01 12:59:47 -04:00
Brad Beckmann 6b52f828cc ruby: improved stall and wait debugging
Added dprintfs and asserts for identifying stall and wait bugs.
2015-07-20 09:15:18 -05:00
Brad Beckmann 848861a17d slicc: fix error in conflicing symbol declaration 2015-07-20 09:15:18 -05:00
Brad Beckmann 8a54adc2a5 slicc: enable overloading in functions not in classes
For many years the slicc symbol table has supported overloaded functions in
external classes.  This patch extends that support to functions that are not
part of classes (a.k.a. no parent).  For example, this support allows slicc
to understand that mapAddressToRange is overloaded and the NodeID is an
optional parameter.
2015-07-20 09:15:18 -05:00
David Hashe 0d00cbc97b ruby: change router pipeline stages to 2
This patch changes the router pipeline stages from 4 to 2. The
canonical 4-stage router is conservative while a lower-latency router
with look ahead routing and speculative allocation is well acknowledged.
2015-07-20 09:15:18 -05:00
David Hashe 8b32dad4d8 ruby: change advance_stage for flit_d
Sets m_stage.second to the second parameter of the function.
Then, for every place where advance_stage is called, adds
a cycle to the argument being passed.
2015-07-20 09:15:18 -05:00
Brad Beckmann f9fa242f42 slicc: improved stalling support in protocols
Adds features to allow protocols to reschedule controllers when conditionally
stalling within inport logic or actions.  Also insures that resource and
protocol stalls are re-evaluated the next cycle.
2015-07-20 09:15:18 -05:00
David Hashe c4ffd4989c ruby: expose access permission to replacement policies
This patch adds support that allows the replacement policy to identify each
cache block's access permission.  This information can be useful when making
replacement decisions.
2015-07-20 09:15:18 -05:00
David Hashe 967cfa939a ruby: adds size and empty apis to the msg buffer stallmap 2015-07-20 09:15:18 -05:00
David Hashe 21aa5734a0 ruby: fix deadlock bug in banked array resource checks
The Ruby banked array resource checks (initiated from SLICC) did a check and
allocate at the same time. If a transition needs more than one resource, then
it might check/allocate resource #1, then fail to get resource #2. Another
transition might then try to get the same resources, but in reverse order.
Deadlock.

This patch separates resource checking and resource reservation into two
steps to avoid deadlock.
2015-07-20 09:15:18 -05:00
David Hashe 63a9f10de8 ruby: Fix for stallAndWait bug
It was previously possible for a stalled message to be reordered after an
incomming message. This patch ensures that any stalled message stays in its
original request order.
2015-07-20 09:15:18 -05:00
David Hashe 6511ab4654 mem: add request types for acquire and release
Add support for acquire and release requests.  These synchronization operations
are commonly supported by several modern instruction sets.
2015-07-20 09:15:18 -05:00
David Hashe 7e9562013b ruby: allocate a block in CacheMemory without updating LRU state 2015-07-20 09:15:18 -05:00
David Hashe 7e00772bda ruby: speed up function used for cache walks
This patch adds a few helpful functions that allow .sm files to directly
invalidate all cache blocks using a trigger queue rather than rely on each
individual cache block to be invalidated via requests from the mandatory
queue.
2015-07-20 09:15:18 -05:00
David Hashe 3454a4a36e slicc: support for arbitrary DPRINTF flags (not just RubySlicc)
This patch allows DPRINTFs to be used in SLICC state machines similar to how
they are used by the rest of gem5.  Previously all DPRINTFs in the .sm files
had to use the RubySlicc flag.
2015-07-20 09:15:18 -05:00
David Hashe 9324239922 slicc: support for local variable declarations in action blocks 2015-07-20 09:15:18 -05:00
David Hashe 1850ed410f ruby: initialize replacement policies with their own simobjs
this is in preparation for other replacement policies that take additional
parameters.
2015-07-20 09:15:18 -05:00
David Hashe 74ca89f8b7 ruby: give access to cache tag/data latencies from SLICC
This patch exposes the tag and data array latencies to the SLICC state machines
so that it can be used to determine the correct enqueue latency for response
messages.
2015-07-20 09:15:18 -05:00
David Hashe 536e3664e4 slicc: support for multiple cache entry types in the same state machine
To have multiple Entry types (e.g., a cache Entry type and
a directory Entry type), just declare one of them as a secondary
type by using the pair 'main="false"', e.g.:

  structure(DirEntry, desc="...", interface="AbstractCacheEntry",
            main="false") {

...and the primary type would be declared:

  structure(Entry, desc="...", interface="AbstractCacheEntry") {
2015-07-20 09:15:18 -05:00
David Hashe 910638f338 slicc: Fix bug in enqueue and peek statements.
These were not generating the correct c names for types declared within a
machine scope.
2015-07-20 09:15:18 -05:00
David Hashe 3d8c8a85fa slicc: fix missing inline function in LocalVariableAST 2015-07-20 09:15:18 -05:00
David Hashe 93fff6f636 slicc: improve support for prefix operations
This patch fixes the type handling when prefix operations are used.  Previously
prefix operators would assume a void return type, which made it impossible to
combine prefix operations with other expressions.  This patch allows SLICC
programmers to use prefix operations more naturally.
2015-07-20 09:15:18 -05:00
David Hashe ee0d414fa8 slicc: support for transitions with a wildcard next state
This patches adds support for transitions of the form:

transition(START, EVENTS, *) { ACTIONS }

This allows a machine to collapse states that differ only in the next state
transition to collapse into one, and can help shorten/simplfy some protocols
significantly.

When * is encountered as an end state of a transition, the next state is
determined by calling the machine-specific getNextState function. The next
state is determined before any actions of the transition execute, and
therefore the next state calculation cannot depend on any of the transition
actions.
2015-07-20 09:15:18 -05:00
David Hashe 6a288d9de3 slicc: support for multiple message types on the same buffer
This patch allows SLICC protocols to use more than one message type with a
message buffer. For example, you can declare two in ports as such:

  in_port(ResponseQueue_in, ResponseMsg, responseFromDir, rank=3) { ... }
  in_port(tgtResponseQueue_in, TgtResponseMsg, responseFromDir, rank=2) { ... }
2015-07-20 09:15:18 -05:00
Brad Beckmann b609b032aa slicc: fatal->panic on invalid transitions 2015-08-01 12:37:52 -04:00
David Hashe 3444d5f359 mem: Hit callback delay fix
This patch was created by Bihn Pham during his internship at AMD.

There is no need to delay hit callback response messages by a cycle because
the response latency is already incurred in the Ruby protocol. This ensures
correct timing of memory instructions.
2015-07-20 09:15:18 -05:00
Brad Beckmann 0c78abb302 ruby: re-added the addressToInt slicc interface function
This helper function is very useful converting address offsets to integers
that can be used for protocol specific destination mapping.
2015-07-20 09:15:18 -05:00
Brad Beckmann 4710eba588 ruby: add useful dprints to sequencer
Added two data block dprints that are useful when tracking down data check
failures in the ruby random tester.
2015-07-20 09:15:18 -05:00
David Hashe a254786a19 slicc: isinstance bugfix
This fix prevents spurious errors when searching for a symbol that may be
located in one of multiple symbol tables.
2015-07-20 09:15:18 -05:00
Andreas Hansson 6fac40ceb0 mem: Add missing clean eviction on uncacheable access
This patch adds a missing clean eviction, occuring when an uncacheable
access flushes and invalidates an existing block.
2015-07-30 03:42:25 -04:00
Andreas Hansson 540e59fd70 mem: Remove unused RequestCause in cache
This patch removes the RequestCause, and also simplifies how we
schedule the sending of packets through the memory-side port. The
deassertion of bus requests is removed as it is not used.
2015-07-30 03:41:43 -04:00
David Guillen-Fandos 0c89c15b23 mem: Make caches way aware
This patch makes cache sets aware of the way number. This enables
some nice features such as the ablity to restrict way allocation. The
implemented mechanism allows to set a maximum way number to be
allocated 'k' which must fulfill 0 < k <= N (where N is the number of
ways). In the future more sophisticated mechasims can be implemented.
2015-07-30 03:41:42 -04:00
Andreas Hansson 5a18e181ff mem: Transition away from isSupplyExclusive for writebacks
This patch changes how writebacks communicate whether the line is
passed as modified or owned. Previously we relied on the
isSupplyExclusive mechanism, which was originally designed to avoid
unecessary snoops.

For normal cache requests we use the sharedAsserted mechanism to
determine if a block should be marked writeable or not, and with this
patch we transition the writebacks to also use this
mechanism. Conceptually this is cleaner and more consistent.
2015-07-30 03:41:40 -04:00
Andreas Hansson 5902e29e84 mem: Tidy up CacheBlk class
This patch modernises and tidies up the CacheBlk, removing dead code.
2015-07-30 03:41:39 -04:00
Andreas Hansson 41b39b22cd mem: Tidy up packet
Some minor fixes and removal of dead code. Changing the flags to be
enums rather than static const (to avoid any linking issues caused by
the latter). Also adding a getBlockAddr member which hopefully can
slowly finds its way into caches, snoop filters etc.
2015-07-30 03:41:38 -04:00
Brandon Potter 582793468d ruby: dma sequencer: removes redundant code 2015-07-24 12:25:22 -07:00
Nilay Vaish 1b71a20391 ruby: network: NetworkLink inherits from Consumer now. 2015-07-22 11:20:07 -05:00
Andreas Hansson 5410660919 mem: Fix (ab)use of emplace to avoid temporary object creation 2015-07-13 08:46:28 -04:00
Andreas Hansson d870c399d3 mem: Updated DRAMSim2 wrapper to new drain API
Somehow this one slipped through without being updated.
2015-07-13 08:46:16 -04:00
Brandon Potter bfe7ee96ad ruby: replace global g_abs_controls with per-RubySystem var
This is another step in the process of removing global variables
from Ruby to enable multiple RubySystem instances in a single simulation.

The list of abstract controllers is per-RubySystem and should be
represented that way, rather than as a global.

Since this is the last remaining Ruby global variable, the
src/mem/ruby/Common/Global.* files are also removed.
2015-07-10 16:05:24 -05:00
Brandon Potter f9a370f172 ruby: replace global g_system_ptr with per-object pointers
This is another step in the process of removing global variables
from Ruby to enable multiple RubySystem instances in a single simulation.

With possibly multiple RubySystem objects, we can no longer use a global
variable to find "the" RubySystem object.  Instead, each Ruby component
has to carry a pointer to the RubySystem object to which it belongs.
2015-07-10 16:05:23 -05:00
Brandon Potter c38f5098b1 ruby: replace g_ruby_start with per-RubySystem m_start_cycle
This patch begins the process of removing global variables from the Ruby
source with the goal of eventually allowing users to create multiple Ruby
instances in a single simulation.  Currently, users cannot do so because
several global variables and static members are referenced by the RubySystem
object in a way that assumes that there will only ever be a single RubySystem.
These need to be replaced with per-RubySystem equivalents.

This specific patch replaces the global var g_ruby_start, which is used
to calculate throughput statistics for Throttles in simple networks and
links in Garnet networks, with a RubySystem instance var m_start_cycle.
2015-07-10 16:05:23 -05:00
Brandon Potter 9eda4bdc5a ruby: remove extra whitespace and correct misspelled words 2015-07-10 16:05:23 -05:00
Andreas Sandberg ed38e3432c sim: Refactor and simplify the drain API
The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.

This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.

Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.
2015-07-07 09:51:05 +01:00
Andreas Sandberg f16c0a4a90 sim: Decouple draining from the SimObject hierarchy
Draining is currently done by traversing the SimObject graph and
calling drain()/drainResume() on the SimObjects. This is not ideal
when non-SimObjects (e.g., ports) need draining since this means that
SimObjects owning those objects need to be aware of this.

This changeset moves the responsibility for finding objects that need
draining from SimObjects and the Python-side of the simulator to the
DrainManager. The DrainManager now maintains a set of all objects that
need draining. To reduce the overhead in classes owning non-SimObjects
that need draining, objects inheriting from Drainable now
automatically register with the DrainManager. If such an object is
destroyed, it is automatically unregistered. This means that drain()
and drainResume() should never be called directly on a Drainable
object.

While implementing the new functionality, the DrainManager has now
been made thread safe. In practice, this means that it takes a lock
whenever it manipulates the set of Drainable objects since SimObjects
in different threads may create Drainable objects
dynamically. Similarly, the drain counter is now an atomic_uint, which
ensures that it is manipulated correctly when objects signal that they
are done draining.

A nice side effect of these changes is that it makes the drain state
changes stricter, which the simulation scripts can exploit to avoid
redundant drains.
2015-07-07 09:51:05 +01:00
Andreas Sandberg e9c3d59aae sim: Make the drain state a global typed enum
The drain state enum is currently a part of the Drainable
interface. The same state machine will be used by the DrainManager to
identify the global state of the simulator. Make the drain state a
global typed enum to better cater for this usage scenario.
2015-07-07 09:51:04 +01:00
Andreas Sandberg 76cd4393c0 sim: Refactor the serialization base class
Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:

  * Add a set of APIs to serialize into a subsection of the current
    object. Previously, objects that needed this functionality would
    use ad-hoc solutions using nameOut() and section name
    generation. In the new world, an object that implements the
    interface has the methods serializeSection() and
    unserializeSection() that serialize into a named /subsection/ of
    the current object. Calling serialize() serializes an object into
    the current section.

  * Move the name() method from Serializable to SimObject as it is no
    longer needed for serialization. The fully qualified section name
    is generated by the main serialization code on the fly as objects
    serialize sub-objects.

  * Add a scoped ScopedCheckpointSection helper class. Some objects
    need to serialize data structures, that are not deriving from
    Serializable, into subsections. Previously, this was done using
    nameOut() and manual section name generation. To simplify this,
    this changeset introduces a ScopedCheckpointSection() helper
    class. When this class is instantiated, it adds a new /subsection/
    and subsequent serialization calls during the lifetime of this
    helper class happen inside this section (or a subsection in case
    of nested sections).

  * The serialize() call is now const which prevents accidental state
    manipulation during serialization. Objects that rely on modifying
    state can use the serializeOld() call instead. The default
    implementation simply calls serialize(). Note: The old-style calls
    need to be explicitly called using the
    serializeOld()/serializeSectionOld() style APIs. These are used by
    default when serializing SimObjects.

  * Both the input and output checkpoints now use their own named
    types. This hides underlying checkpoint implementation from
    objects that need checkpointing and makes it easier to change the
    underlying checkpoint storage code.
2015-07-07 09:51:03 +01:00
Andreas Sandberg 777cc71c4a mem: Cleanup CommMonitor in preparation for probe support
Make configuration parameters constant and get rid of an unnecessary
dependency on the Time class.
2015-07-06 17:08:53 +01:00
Nilay Vaish d29d7c41f1 mem: packet: Add const to constructor argument 2015-07-04 10:43:46 -05:00
Nilay Vaish 16ac48e6a4 ruby: drop NetworkMessage class
This patch drops the NetworkMessage class.  The relevant data members and functions
have been moved to the Message class, which was the parent of NetworkMessage.
2015-07-04 10:43:46 -05:00
Nilay Vaish baa3eb0de3 ruby: mesi three level: name change to avoid clash
The accessor function getDestination() for Destination variable in the
coherence message clashes with the getDestination() that is part of the Message
class.  Hence the name change.
2015-07-04 10:43:46 -05:00
Nilay Vaish b4efb48a71 ruby: remove message buffer node
This structure's only purpose was to provide a comparison function for
ordering messages in the MessageBuffer.  The comparison function is now
being moved to the Message class itself.  So we no longer require this
structure.
2015-07-04 10:43:46 -05:00
Andreas Hansson 7e711c98f8 mem: Increase the default buffer sizes for the DDR4 controller
This patch increases the default read/write buffer sizes for the DDR4
controller config to values that are more suitable for the high
bandwidth and high bank count.
2015-07-03 10:14:48 -04:00
Wendy Elsasser 31f901b69d mem: Update DRAM command scheduler for bank groups
This patch updates the command arbitration so that bank group timing
as well as rank-to-rank delays will be taken into account. The
resulting arbitration no longer selects commands (prepped or not) that
cannot issue seamlessly if there are commands that can issue
back-to-back, minimizing the effect of rank-to-rank (tCS) & same bank
group (tCCD_L) delays.

The arbitration selects a new command based on the following priority.
Within each priority band, the arbitration will use FCFS to select the
appropriate command:

1) Bank is prepped and burst can issue seamlessly, without a bubble

2) Bank is not prepped, but can prep and issue seamlessly, without a
bubble

3) Bank is prepped but burst cannot issue seamlessly. In this case, a
bubble will occur on the bus

Thus, to enable more parallelism in subsequent selections, an
unprepped packet is given higher priority if the bank prep can be
hidden. If the bank prep cannot be hidden, the selection logic will
choose a prepped packet that cannot issue seamlessly if one exist.
Otherwise, the default selection will choose the packet with the
minimum bank prep delay.
2015-07-03 10:14:46 -04:00
Andreas Hansson b56167b682 mem: Avoid DRAM write queue iteration for merging and read lookup
This patch adds a simple lookup structure to avoid iterating over the
write queue to find read matches, and for the merging of write
bursts. Instead of relying on iteration we simply store a set of
currently-buffered write-burst addresses and compare against
these. For the reads we still perform the iteration if we have a
match. For the writes, we rely entirely on the set. Note that there
are corner-cases where sub-bursts would actually not be mergeable
without a read-modify-write. We ignore these cases and opt for speed.
2015-07-03 10:14:45 -04:00
Andreas Hansson db85ddca1a mem: Delay responses in the crossbar before forwarding
This patch changes how the crossbar classes deal with
responses. Instead of forwarding responses directly and burdening the
neighbouring modules in paying for the latency (through the
pkt->headerDelay), we now queue them before sending them.

The coherency protocol is not affected as requests and any snoop
requests/responses are still passed on in zero time. Thus, the
responses end up paying for any header delay accumulated when passing
through the crossbar. Any latency incurred on the request path will be
paid for on the response side, if no other module has dealt with it.

As a result of this patch, responses are returned at a later
point. This affects the number of outstanding transactions, and quite
a few regressions see an impact in blocking due to no MSHRs, increased
cache-miss latencies, etc.

Going forward we should be able to use the same concept also for snoop
responses, and any request that is not an express snoop.
2015-07-03 10:14:44 -04:00
Andreas Hansson b93c912013 mem: Remove redundant is_top_level cache parameter
This patch takes the final step in removing the is_top_level parameter
from the cache. With the recent changes to read requests and write
invalidations, the parameter is no longer needed, and consequently
removed.

This also means that asymmetric cache hierarchies are now fully
supported (and we are actually using them already with L1 caches, but
no table-walker caches, connected to a shared L2).
2015-07-03 10:14:43 -04:00
Andreas Hansson 71856cfbbc mem: Split WriteInvalidateReq into write and invalidate
WriteInvalidateReq ensures that a whole-line write does not incur the
cost of first doing a read exclusive, only to later overwrite the
data. This patch splits the existing WriteInvalidateReq into a
WriteLineReq, which is done locally, and an InvalidateReq that is sent
out throughout the memory system. The WriteLineReq re-uses the normal
WriteResp.

The change allows us to better express the difference between the
cache that is performing the write, and the ones that are merely
invalidating. As a consequence, we no longer have to rely on the
isTopLevel flag. Moreover, the actual memory in the system does not
see the intitial write, only the writeback. We were marking the
written line as dirty already, so there is really no need to also push
the write all the way to the memory.

The overall flow of the write-invalidate operation remains the same,
i.e. the operation is only carried out once the response for the
invalidate comes back. This patch adds the InvalidateResp for this
very reason.
2015-07-03 10:14:41 -04:00
Andreas Hansson 0ddde83a47 mem: Add ReadCleanReq and ReadSharedReq packets
This patch adds two new read requests packets:

ReadCleanReq - For a cache to explicitly request clean data. The
response is thus exclusive or shared, but not owned or modified. The
read-only caches (see previous patch) use this request type to ensure
they do not get dirty data.

ReadSharedReq - We add this to distinguish cache read requests from
those issued by other masters, such as devices and CPUs. Thus, devices
use ReadReq, and caches use ReadCleanReq, ReadExReq, or
ReadSharedReq. For the latter, the response can be any state, shared,
exclusive, owned or even modified.

Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The
two transactions are aligned with the emerging cache-coherent TLM
standard and the AMBA nomenclature.

With this change, the normal ReadReq should never be used by a cache,
and is reserved for the actual (non-caching) masters in the system. We
thus have a way of identifying if a request came from a cache or
not. The introduction of ReadSharedReq thus removes the need for the
current isTopLevel hack, and also allows us to stop relying on
checking the packet size to determine if the source is a cache or
not. This is fixed in follow-on patches.
2015-07-03 10:14:40 -04:00
Andreas Hansson 893533a126 mem: Allow read-only caches and check compliance
This patch adds a parameter to the BaseCache to enable a read-only
cache, for example for the instruction cache, or table-walker cache
(not for x86). A number of checks are put in place in the code to
ensure a read-only cache does not end up with dirty data.

A follow-on patch adds suitable read requests to allow a read-only
cache to explicitly ask for clean data.
2015-07-03 10:14:39 -04:00
Ali Jafri a262908acc mem: Add clean evicts to improve snoop filter tracking
This patch adds eviction notices to the caches, to provide accurate
tracking of cache blocks in snoop filters. We add the CleanEvict
message to the memory heirarchy and use both CleanEvicts and
Writebacks with BLOCK_CACHED flags to propagate notice of clean and
dirty evictions respectively, down the memory hierarchy. Note that the
BLOCK_CACHED flag indicates whether there exist any copies of the
evicted block in the caches above the evicting cache.

The purpose of the CleanEvict message is to notify snoop filters of
silent evictions in the relevant caches. The CleanEvict message
behaves much like a Writeback. CleanEvict is a write and a request but
unlike a Writeback, CleanEvict does not have data and does not need
exclusive access to the block. The cache generates the CleanEvict
message on a fill resulting in eviction of a clean block. Before
travelling downwards CleanEvict requests generate zero-time snoop
requests to check if the same block is cached in upper levels of the
memory heirarchy. If the block exists, the cache discards the
CleanEvict message. The snoops check the tags, writeback queue and the
MSHRs of upper level caches in a manner similar to snoops generated
from HardPFReqs. Currently CleanEvicts keep travelling towards main
memory unless they encounter the block corresponding to their address
or reach main memory (since we have no well defined point of
serialisation). Main memory simply discards CleanEvict messages.

We have modified the behavior of Writebacks, such that they generate
snoops to check for the presence of blocks in upper level caches. It
is possible in our current implmentation for a lower level cache to be
writing back a block while a shared copy of the same block exists in
the upper level cache. If the snoops find the same block in upper
level caches, we set the BLOCK_CACHED flag in the Writeback message.

We have also added logic to account for interaction of other message
types with CleanEvicts waiting in the writeback queue. A simple
example is of a response arriving at a cache removing any CleanEvicts
to the same address from the cache's writeback queue.
2015-07-03 10:14:37 -04:00
Andreas Hansson aa5bbe81f6 mem: Convert Request static const flags to enums
This patch fixes an issue which is very wide spread in the codebase,
causing sporadic linking failures. The issue is that we declare static
const class variables in the header, without any definition (as part
of a source file). In most cases the compiler propagates the value and
we have no issues. However, especially for less optimising builds such
as debug, we get sporadic linking failures due to undefined
references.

This patch fixes the Request class, by turning the static const flags
and master IDs into C++11 typed enums.
2015-07-03 10:14:36 -04:00
Nilay Vaish 57971248f6 ruby: slicc: remove README
No longer maintained.  Updates are only made to the wiki page.  So being
dropped.
2015-06-25 11:58:30 -05:00
Nilay Vaish 0647d99854 ruby: message: remove a data member added by mistake
I (Nilay) had mistakenly added a data member to  the Message class in revision c1694b4032a6.
The data member is being removed.
2015-06-25 11:58:29 -05:00
Jason Power 2f3c467883 Ruby: Remove assert in RubyPort retry list logic
Remove the assert when adding a port to the RubyPort retry list.
Instead of asserting, just ignore the added port, since it's
already on the list.
Without this patch, Ruby+detailed fails for even the simplest tests
2015-06-25 11:58:28 -05:00
Ali Jafri f0c3b70451 mem: Add check for express snoop in packet destructor
Snoop packets share the request pointer with the originating
packets. We need to ensure that the snoop packet destruction does not
delete the request. Snoops are used for reads, invalidations,
HardPFReqs, Writebacks and CleansEvicts. Reads, invalidations, and
HardPFReqs need a response so their snoops do not delete the
request. For Writebacks and CleanEvicts we need to check explicitly
for whethere the current packet is an express snoop, in whcih case do
not delete the request.
2015-06-09 09:21:18 -04:00
Andreas Hansson 578a7f20c6 mem: Fix snoop packet data allocation bug
This patch fixes an issue where the snoop packet did not properly
forward the data pointer in case of static data.
2015-06-09 09:21:17 -04:00
Marco Elver 6599dd87c8 ruby: Fix MESI consistency bug
Fixes missed forward eviction to CPU. With the O3CPU this can lead to load-load
reordering, as the LQ is never notified of the invalidate.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-06-07 14:02:40 -05:00
Matthias Jung 25fe4c2529 mem: Add HMC Timing Parameters
A single HMC-2500 x32 model based on:

[1] DRAMSpec: a high-level DRAM bank modelling tool developed at the University
of Kaiserslautern. This high level tool uses RC (resistance-capacitance) and CV
(capacitance-voltage) models to estimate the DRAM bank latency and power
numbers.

[2] A Logic-base Interconnect for Supporting Near Memory Computation in the
Hybrid Memory Cube (E. Azarkhish et. al) Assumed for the HMC model is a 30 nm
technology node.  The modelled HMC consists of a 4 Gbit part with 4 layers
connected with TSVs.  Each layer has 16 vaults and each vault consists of 2
banks per layer.  In order to be able to use the same controller used for 2D
DRAM generations for HMC, the following analogy is done: Channel (DDR) => Vault
(HMC) device_size (DDR) => size of a single layer in a vault ranks per channel
(DDR) => number of layers banks per rank (DDR) => banks per layer devices per
rank (DDR) => devices per layer ( 1 for HMC).  The parameters for which no
input is available are inherited from the DDR3 configuration.
2015-06-07 14:02:40 -05:00
Christoph Pfister 4a17494708 mem: addr_mapper: restore old address if request not sent
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-05-30 13:45:17 +02:00
Andreas Hansson 0cc350d2c5 ruby: Deprecation warning for RubyMemoryControl
A step towards removing RubyMemoryControl and shift users to
DRAMCtrl. The latter is faster, more representative, very versatile,
and is integrated with power models.
2015-05-26 03:21:34 -04:00
Joel Hestness 0479569f67 ruby: Fix RubySystem warm-up and cool-down scope
The processes of warming up and cooling down Ruby caches are simulation-wide
processes, not just RubySystem instance-specific processes. Thus, the warm-up
and cool-down variables should be globally visible to any Ruby components
participating in either process. Make these variables static members and track
the warm-up and cool-down processes as appropriate.

This patch also has two side benefits:
1) It removes references to the RubySystem g_system_ptr, which are problematic
for allowing multiple RubySystem instances in a single simulation. Warmup and
cooldown variables being static (global) reduces the need for instance-specific
dereferences through the RubySystem.
2) From the AbstractController, it removes local RubySystem pointers, which are
used inconsistently with other uses of the RubySystem: 11 other uses reference
the RubySystem with the g_system_ptr. Only sequencers have local pointers.
2015-05-19 10:56:51 -05:00
Stephan Diestelhorst 2847d5f517 mem: Create a request copy for deferred snoops
Sometimes, we need to defer an express snoop in an MSHR, but the original
request might complete and deallocate the original pkt->req.  In those cases,
create a copy of the request so that someone who is inspecting the delayed
snoop can also inspect the request still.  All of this is rather hacky, but the
allocation / linking and general life-time management of Packet and Request is
rather tricky.  Deleting the copy is another tricky area, testing so far has
shown that the right copy is deleted at the right time.
2015-03-17 11:50:55 +00:00
Andreas Sandberg 48281375ee mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different
functions. The first, and obvious, function is to prevent the memory
system from caching data in the request. The second function is to
prevent reordering and speculation in CPU models.

This changeset gives the order/speculation requirement a separate flag
(Request::STRICT_ORDER). This flag prevents CPU models from doing the
following optimizations:

    * Speculation: CPU models are not allowed to issue speculative
      loads.

    * Write combining: CPU models and caches are not allowed to merge
      writes to the same cache line.

Note: The memory system may still reorder accesses unless the
UNCACHEABLE flag is set. It is therefore expected that the
STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent
this behavior.
2015-05-05 03:22:33 -04:00
Andreas Sandberg 1da634ace0 mem, alpha: Move Alpha-specific request flags
Move Alpha-specific memory request flags to an architecture-specific
header and map them to the architecture specific flag bit range.
2015-05-05 03:22:31 -04:00
Andreas Hansson 36f29496a0 mem: Snoop into caches on uncacheable accesses
This patch takes a last step in fixing issues related to uncacheable
accesses. We do not separate uncacheable memory from uncacheable
devices, and in cases where it is really memory, there are valid
scenarios where we need to snoop since we do not support cache
maintenance instructions (yet). On snooping an uncacheable access we
thus provide data if possible. In essence this makes uncacheable
accesses IO coherent.

The snoop filter is also queried to steer the snoops, but not updated
since the uncacheable accesses do not allocate a block.
2015-05-05 03:22:29 -04:00
Andreas Hansson 14e5b2ea55 mem: Pass shared downstream through caches
This patch ensures that we pass on information about a packet being
shared (rather than exclusive), when forwarding a packet downstream.

Without this patch there is a risk that a downstream cache considers
the line exclusive when it really isn't.
2015-05-05 03:22:26 -04:00
Ali Jafri 3d33432136 mem: Add forward snoop check for HardPFReqs
We should always check whether the cache is supposed to be forwarding snoops
before generating snoops.
2015-05-05 03:22:25 -04:00
Andreas Hansson 0ebbf3f951 mem: Add missing stats update for uncacheable MSHRs
This patch adds a missing counter update for the uncacheable
accesses. By updating this counter we also get a meaningful average
latency for uncacheable accesses (previously inf).
2015-05-05 03:22:24 -04:00
Andreas Hansson 33e3e370f2 mem: Tidy up BaseCache parameters
This patch simply tidies up the BaseCache parameters and removes the
unused "two_queue" parameter.
2015-05-05 03:22:22 -04:00
David Guillen 5287945a8b mem: Remove templates in cache model
This patch changes the cache implementation to rely on virtual methods
rather than using the replacement policy as a template argument.

There is no impact on the simulation performance, and overall the
changes make it easier to modify (and subclass) the cache and/or
replacement policy.
2015-05-05 03:22:21 -04:00
Rizwana Begum 52a3bc5e5c mem: Simplify page close checks for adaptive policies
Both open_adaptive and close_adaptive page polices keep the page
open if a row hit is found. If a row hit is not found, close_adaptive
page policy precharges the row, and open_adaptive policy precharges
the row only if there is a bank conflict request waiting in the queue.

This patch makes the checks for above conditions simpler.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-29 22:35:22 -05:00
Nilay Vaish 3a2731fb8c ruby: set: replace long by unsigned long
UBSan complains about negative value being shifted
2015-04-29 22:35:22 -05:00
Lena Olson dea7acdb3e ruby: allow restoring from checkpoint when using DRAMCtrl
Restoring from a checkpoint with ruby + the DRAMCtrl memory model was not
working, because ruby and DRAMCtrl disagreed on the current tick during warmup.
Since there is no reason to do timing requests during warmup, use functional
requests instead.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-13 17:33:57 -05:00
Stephan Diestelhorst cb8856f580 mem: Support any number of master-IDs in stride prefetcher
The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs)
that it could handle. Since master IDs need to be unique per system, and
every core, cache etc. requires a separate master port, a static limit on
these does not make much sense.

Instead, this patch adds a small hash map that will map all master IDs to
the right prefetch state and dynamically allocates new state for new master
IDs.
2015-03-27 04:56:03 -04:00
Andreas Hansson 0197e580e5 mem: Allocate cache writebacks before new MSHRs
This patch changes the order of writeback allocation such that any
writebacks resulting from a tag lookup (e.g. for an uncacheable
access), are added to the writebuffer before any new MSHR entries are
allocated. This ensures that the writebacks logically precedes the new
allocations.

The patch also changes the uncacheable flush to use proper timed (or
atomic) writebacks, as opposed to functional writes.
2015-03-27 04:56:02 -04:00
Andreas Hansson 24763c2177 mem: Cleanup flow for uncacheable accesses
This patch simplifies the code dealing with uncacheable timing
accesses, aiming to align it with the existing miss handling. Similar
to what we do in atomic, a timing request now goes through
Cache::access (where the block is also flushed), and then proceeds to
ignore any existing MSHR for the block in question. This unifies the
flow for cacheable and uncacheable accesses, and for atomic and timing.
2015-03-27 04:56:01 -04:00
Andreas Hansson a7a1e6004a mem: Ignore uncacheable MSHRs when finding matches
This patch changes how we search for matching MSHRs, ignoring any MSHR
that is allocated for an uncacheable access. By doing so, this patch
fixes a corner case in the MSHRs where incorrect data ended up being
copied into a (cacheable) read packet due to a first uncacheable MSHR
target of size 4, followed by a cacheable target to the same MSHR of
size 64. The latter target was filled with nonsense data.
2015-03-27 04:56:00 -04:00
Andreas Hansson 801ce65eae mem: Remove redundant allocateUncachedReadBuffer in cache
This patch removes the no-longer-needed
allocateUncachedReadBuffer. Besides the checks it is exactly the same
as allocateMissBuffer and thus provides no value.
2015-03-27 04:55:59 -04:00
Andreas Hansson fe806a0dd7 mem: Modernise MSHR iterators to C++11
This patch updates the iterators in the MSHR and MSHR queues to use
C++11 range-based for loops. It also does a bit of additional house
keeping.
2015-03-27 04:55:57 -04:00
Andreas Hansson 7bae98459c mem: Align all MSHR entries to block boundaries
This patch aligns all MSHR queue entries to block boundaries to
simplify checks for matches. Previously there were corner cases that
could lead to existing entries not being identified as matches.

There are, rather alarmingly, a few regressions that change with this
patch.
2015-03-27 04:55:55 -04:00
Ali Jafri 15f0d9ff14 mem: Rename PREFETCH_SNOOP_SQUASH flag to BLOCK_CACHED
This patch subsumes the PREFETCH_SNOOP_SQUASH flag with the more
generic BLOCK_CACHED flag. Future patches implementing cache eviction
messages can use the BLOCK_CACHED flag in almost the same manner as
hardware prefetches use the PREFETCH_SNOOP_SQUASH flag. The
PREFTECH_SNOOP_FLAG is set if the prefetch target is found in the tags
or the MSHRs in any state, so we are simply replacing calls to
setPrefetchSquashed() with setBlockCached(). The case of where the
prefetch target is found in the writeback MSHRs of upper level caches
continues to be covered by the MEM_INHIBIT flag.
2015-03-27 04:55:54 -04:00
Steve Reinhardt 6677b9122a mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW
Makes x86-style locked operations even more distinct from
LLSC operations.  Using "locked" by itself should be
obviously ambiguous now.
2015-03-23 16:14:20 -07:00
Andreas Hansson 45286d9b64 mem: Tidy up Request
This patch does a bit of house keeping, fixing up typos, removing dead
code etc.
2015-03-23 06:57:34 -04:00
Andreas Hansson 5275c9d740 mem: Use emplace front/back for deferred packets
Embrace C++11 for the deferred packets as we actually store the
objects in the data structure, and not just pointers.
2015-03-19 04:06:11 -04:00
Geoffrey Blake 1d403960af mem: Enable CommMonitor to output traces in atomic mode
The CommMonitor by default only allows memory traces to be gathered in
timing mode. This patch allows memory traces to be gathered in atomic
mode if all one needs is a functional trace of memory addresses used
and timing information is of a secondary concern.
2015-03-19 04:06:10 -04:00
Steve Reinhardt e57ab463cf mem: remove redundant test in in Cache::recvTimingResp()
For some reason we were checking mshr->hasTargets() even though
we had already called mshr->getTarget() unconditionally earlier
in the same function (which asserts if there are no targets).
Get rid of this useless check, and while we're at it get rid
of the redundant call to mshr->getTarget(), since we still have
the value saved in a local var.
2015-02-11 10:48:53 -08:00
Steve Reinhardt 89bb03a1a6 mem: add local var in Cache::recvTimingResp()
The main loop in recvTimingResp() uses target->pkt all over
the place.  Create a local tgt_pkt to help keep lines
under the line length limit.
2015-02-11 10:48:52 -08:00
Steve Reinhardt ee0b52404c mem: restructure Packet cmd initialization a bit more
Refactor the way that specific MemCmd values are generated for packets.
The new approach is a little more elegant in that we assign the right
value up front, and it's also more amenable to non-heap-allocated
Packet objects.

Also replaced the code in the Minor model that was still doing it the
ad-hoc way.

This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.
2015-02-11 10:48:50 -08:00
Steve Reinhardt ccef61d1cc mem: clean up write buffer check in Cache::handleSnoop()
The 'if (writebacks.size)' check was redundant, because
writeBuffer.findMatches() would return false if the
writebacks list was empty.

Also renamed 'mshr' to 'wb_entry' in this context since
we are pointing at a writebuffer entry and not an MSHR
(even though it's the same C++ class).
2015-03-14 06:51:07 -07:00
Andreas Hansson fc315901ff mem: Unify all cache DPRINTF address formatting
This patch changes all the DPRINTF messages in the cache to use
'%#llx' every time a packet address is printed. The inclusion of '#'
ensures '0x' is prepended, and since the address type is a uint64_t %x
really should be %llx.
2015-03-02 04:00:56 -05:00
Andreas Hansson 88e2963951 mem: Fix cache MSHR conflict determination
This patch fixes a rather subtle issue in the sending of MSHR requests
in the cache, where the logic previously did not check for conflicts
between the MSRH queue and the write queue when requests were not
ready. The correct thing to do is to always check, since not having a
ready MSHR does not guarantee that there is no conflict.

The underlying problem seems to have slipped past due to the symmetric
timings used for the write queue and MSHR queue. However, with the
recent timing changes the bug caused regressions to fail.
2015-03-02 04:00:54 -05:00
Andreas Hansson 407737614e mem: Add byte mask to Packet::checkFunctional
This patch changes the valid-bytes start/end to a proper byte
mask. With the changes in timing introduced in previous patches there
are more packets waiting in queues, and there are regressions using
the checker CPU failing due to non-contigous read data being found in
the various cache queues.

This patch also adds some more comments explaining what is going on,
and adds the fourth and missing case to Packet::checkFunctional.
2015-03-02 04:00:52 -05:00
Stephan Diestelhorst ecef1612b8 mem: Add option to force in-order insertion in PacketQueue
By default, the packet queue is ordered by the ticks of the to-be-sent
packages. With the recent modifications of packages sinking their header time
when their resposne leaves the caches, there could be cases of MSHR targets
being allocated and ordered A, B, but their responses being sent out in the
order B,A. This led to inconsistencies in bus traffic, in particular the snoop
filter observing first a ReadExResp and later a ReadRespWithInv.  Logically,
these were ordered the other way around behind the MSHR, but due to the timing
adjustments when inserting into the PacketQueue, they were sent out in the
wrong order on the bus, confusing the snoop filter.

This patch adds a flag (off by default) such that these special cases can
request in-order insertion into the packet queue, which might offset timing
slighty. This is expected to occur rarely and not affect timing results.
2015-03-02 04:00:49 -05:00
Marco Balboni d4ef8368aa mem: Downstream components consumes new crossbar delays
This patch makes the caches and memory controllers consume the delay
that is annotated to a packet by the crossbar. Previously many
components simply threw these delays away. Note that the devices still
do not pay for these delays.
2015-03-02 04:00:48 -05:00
Andreas Hansson 36dc93a5fa mem: Move crossbar default latencies to subclasses
This patch introduces a few subclasses to the CoherentXBar and
NoncoherentXBar to distinguish the different uses in the system. We
use the crossbar in a wide range of places: interfacing cores to the
L2, as a system interconnect, connecting I/O and peripherals,
etc. Needless to say, these crossbars have very different performance,
and the clock frequency alone is not enough to distinguish these
scenarios.

Instead of trying to capture every possible case, this patch
introduces dedicated subclasses for the three primary use-cases:
L2XBar, SystemXBar and IOXbar. More can be added if needed, and the
defaults can be overridden.
2015-03-02 04:00:47 -05:00
Marco Balboni d35dd71ab4 mem: Add crossbar latencies
This patch introduces latencies in crossbar that were neglected
before. In particular, it adds three parameters in crossbar model:
front_end_latency, forward_latency, and response_latency. Along with
these parameters, three corresponding members are added:
frontEndLatency, forwardLatency, and responseLatency. The coherent
crossbar has an additional snoop_response_latency.

The latency of the request path through the xbar is set as
--> frontEndLatency + forwardLatency

In case the snoop filter is enabled, the request path latency is charged
also by look-up latency of the snoop filter.
--> frontEndLatency + SF(lookupLatency) + forwardLatency.

The latency of the response path through the xbar is set instead as
--> responseLatency.

In case of snoop response, if the response is treated as a normal response
the latency associated is again
--> responseLatency;

If instead it is forwarded as snoop response we add an additional variable
+ snoopResponseLatency
and the latency associated is
--> snoopResponseLatency;

Furthermore, this patch lets the crossbar progress on the next clock
edge after an unused retry, changing the time the crossbar considers
itself busy after sending a retry that was not acted upon.
2015-03-02 04:00:46 -05:00
Andreas Hansson 987de4f5cc mem: Tidy up the cache debug messages
Avoid redundant inclusion of the name in the DPRINTF string.
2015-03-02 04:00:37 -05:00
Andreas Hansson f26a289295 mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.
2015-03-02 04:00:35 -05:00
Ali Jafri 6ebe8d863a mem: Fix prefetchSquash + memInhibitAsserted bug
This patch resolves a bug with hardware prefetches. Before a hardware prefetch
is sent towards the memory, the system generates a snoop request to check all
caches above the prefetch generating cache for the presence of the prefetth
target. If the prefetch target is found in the tags or the MSHRs of the upper
caches, the cache sets the prefetchSquashed flag in the snoop packet. When the
snoop packet returns with the prefetchSquashed flag set, the prefetch
generating cache deallocates the MSHR reserved for the prefetch. If the
prefetch target is found in the writeback buffer of the upper cache, the cache
sets the memInhibit flag, which signals the prefetch generating cache to
expect the data from the writeback. When the snoop packet returns with the
memInhibitAsserted flag set, it marks the allocated MSHR as inService and
waits for the data from the writeback.

If the prefetch target is found in multiple upper level caches, specifically
in the tags or MSHRs of one upper level cache and the writeback buffer of
another, the snoop packet will return with both prefetchSquashed and
memInhibitAsserted set, while the current code is not written to handle such
an outcome. Current code checks for the prefetchSquashed flag first, if it
finds the flag, it deallocates the reserved MSHR. This leads to assert failure
when the data from the writeback appears at cache. In this fix, we simply
switch the order of checks. We first check for memInhibitAsserted and then for
prefetch squashed.
2015-03-02 04:00:34 -05:00
Jason Power 670f44e05e Ruby: Update backing store option to propagate through to all RubyPorts
Previously, the user would have to manually set access_backing_store=True
on all RubyPorts (Sequencers) in the config files.
Now, instead there is one global option that each RubyPort checks on
initialization.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-02-26 09:58:26 -06:00
Stephan Diestelhorst 93fa8e3cd4 mem: Fix initial value problem with MemChecker
In highly loaded cases, reads might actually overlap with writes to the
initial memory state. The mem checker needs to detect such cases and
permit the read reading either from the writes (what it is doing now) or
read from the initial, unknown value.

This patch adds this logic.
2015-02-16 03:34:47 -05:00
Andreas Hansson e17328a227 mem: mmap the backing store with MAP_NORESERVE
This patch ensures we can run simulations with very large simulated
memories (at least 64 TB based on some quick runs on a Linux
workstation). In essence this allows us to efficiently deal with
sparse address maps without having to implement a redirection layer in
the backing store.

This opens up for run-time errors if we eventually exhausts the hosts
memory and swap space, but this should hopefully never happen.
2015-02-16 03:33:47 -05:00
Andreas Hansson 57758ca685 mem: Use the range cache for lookup as well as access
This patch changes the range cache used in the global physical memory
to be an iterator so that we can use it not only as part of isMemAddr,
but also access and functionalAccess. This matches use-cases where a
core is using the atomic non-caching memory mode, and repeatedly calls
isMemAddr and access.

Linux boot on aarch32, with a single atomic CPU, is now more than 30%
faster when using "--fastmem" compared to not using the direct memory
access.
2015-02-16 03:33:37 -05:00
Marco Balboni 268d9e59c5 mem: Clarification of packet crossbar timings
This patch clarifies the packet timings annotated
when going through a crossbar.

The old 'firstWordDelay' is replaced by 'headerDelay' that represents
the delay associated to the delivery of the header of the packet.

The old 'lastWordDelay' is replaced by 'payloadDelay' that represents
the delay needed to processing the payload of the packet.

For now the uses and values remain identical. However, going forward
the payloadDelay will be additive, and not include the
headerDelay. Follow-on patches will make the headerDelay capture the
pipeline latency incurred in the crossbar, whereas the payloadDelay
will capture the additional serialisation delay.
2015-02-11 10:23:47 -05:00
Marco Balboni e2828587b3 mem: Clarify usage of latency in the cache
This patch adds some much-needed clarity in the specification of the
cache timing. For now, hit_latency and response_latency are kept as
top-level parameters, but the cache itself has a number of local
variables to better map the individual timing variables to different
behaviours (and sub-components).

The introduced variables are:
- lookupLatency: latency of tag lookup, occuring on any access
- forwardLatency: latency that occurs in case of outbound miss
- fillLatency: latency to fill a cache block
We keep the existing responseLatency

The forwardLatency is used by allocateInternalBuffer() for:
- MSHR allocateWriteBuffer (unchached write forwarded to WriteBuffer);
- MSHR allocateMissBuffer (cacheable miss in MSHR queue);
- MSHR allocateUncachedReadBuffer (unchached read allocated in MSHR
  queue)
It is our assumption that the time for the above three buffers is the
same. Similarly, for snoop responses passing through the cache we use
forwardLatency.
2015-02-11 10:23:36 -05:00
Andreas Hansson 461a80beb3 mem: Clarify express snoop behaviour
This patch adds a bit of documentation with insights around how
express snoops really work.
2015-02-03 14:26:02 -05:00
Andreas Hansson 193325ff60 mem: Clarify cache behaviour for pending dirty responses
This patch adds a bit of clarification around the assumptions made in
the cache when packets are sent out, and dirty responses are
pending. As part of the change, the marking of an MSHR as in service
is simplified slightly, and comments are added to explain what
assumptions are made.
2015-02-03 14:25:59 -05:00
Andreas Hansson 5ea60a95b3 config: Adjust DRAM channel interleaving defaults
This patch changes the DRAM channel interleaving default behaviour to
be more representative. The default address mapping (RoRaBaCoCh) moves
the channel bits towards the least significant bits, and uses 128 byte
as the default channel interleaving granularity.

These defaults can be overridden if desired, but should serve as a
sensible starting point for most use-cases.
2015-02-03 14:25:52 -05:00
Andreas Hansson 10c69bb168 mem: Remove unused Packet src and dest fields
This patch takes the final step in removing the src and dest fields in
the packet. These fields were rather confusing in that they only
remember a single multiplexing component, and pushed the
responsibility to the bridge and caches to store the fields in a
senderstate, thus effectively creating a stack. With the recent
changes to the crossbar response routing the crossbar is now
responsible without relying on the packet fields. Thus, these
variables are now unused and can be removed.
2015-01-22 05:01:31 -05:00
Andreas Hansson 15c64035ed mem: Remove Packet source from ForwardResponseRecord
This patch removes the source field from the ForwardResponseRecord,
but keeps the class as it is part of how the cache identifies
responses to hardware prefetches that are snooped upwards.
2015-01-22 05:01:30 -05:00
Andreas Hansson 0c2ffd2daa mem: Remove unused RequestState in the bridge
This patch removes the bridge sender state as the Crossbar now takes
care of remembering its own routing decisions.
2015-01-22 05:01:27 -05:00
Andreas Hansson 00536b0efc mem: Always use SenderState for response routing in RubyPort
This patch aligns how the response routing is done in the RubyPort,
using the SenderState for both memory and I/O accesses. Before this
patch, only the I/O used the SenderState, whereas the memory accesses
relied on the src field in the packet. With this patch we shift to
using SenderState in both cases, thus not relying on the src field any
longer.
2015-01-22 05:01:24 -05:00
Andreas Hansson 072f78471d mem: Make the XBar responsible for tracking response routing
This patch removes the need for a source and destination field in the
packet by shifting the onus of the tracking to the crossbar, much like
a real implementation. This change in behaviour also means we no
longer need a SenderState to remember the source/dest when ever we
have multiple crossbars in the system. Thus, the stack that was
created by the SenderState is not needed, and each crossbar locally
tracks the response routing.

The fields in the packet are still left behind as the RubyPort (which
also acts as a crossbar) does routing based on them. In the succeeding
patches the uses of the src and dest field will be removed. Combined,
these patches improve the simulation performance by roughly 2%.
2015-01-22 05:01:14 -05:00
Andreas Hansson f49830ce0b mem: Clean up Request initialisation
This patch tidies up how we create and set the fields of a Request. In
essence it tries to use the constructor where possible (as opposed to
setPhys and setVirt), thus avoiding spreading the information across a
number of locations. In fact, setPhys is made private as part of this
patch, and a number of places where we callede setVirt instead uses
the appropriate constructor.
2015-01-22 05:00:53 -05:00
Andreas Hansson 6096e2f9c1 mem: Fix bug in cache request retry mechanism
This patch ensures that inhibited packets that are about to be turned
into express snoops do not update the retry flag in the cache.
2015-01-20 08:12:01 -05:00
Andreas Hansson 92585d60c9 mem: Move DRAM interleaving check to init
This patch fixes a bug where the DRAM controller tried to access the
system cacheline size before the system pointer was initialised. It
also fixes a bug where the granularity is 0 (no interleaving).
2015-01-20 08:11:55 -05:00
Mitch Hayenga b2342c5d9a mem: Change prefetcher to use random_mt
Prefechers has used rand() to generate random numers previously.
2014-12-23 09:31:19 -05:00