Commit graph

1546 commits

Author SHA1 Message Date
Andreas Hansson 2698e73966 base: Use the global Mersenne twister throughout
This patch tidies up random number generation to ensure that it is
done consistently throughout the code base. In essence this involves a
clean-up of Ruby, and some code simplifications in the traffic
generator.

As part of this patch a bunch of skewed distributions (off-by-one etc)
have been fixed.

Note that a single global random number generator is used, and that
the object instantiation order will impact the behaviour (the sequence
of numbers will be unaffected, but if module A calles random before
module B then they would obviously see a different outcome). The
dependency on the instantiation order is true in any case due to the
execution-model of gem5, so we leave it as is. Also note that the
global ranom generator is not thread safe at this point.

Regressions using the memtest, TrafficGen or any Ruby tester are
affected and will be updated accordingly.
2014-09-03 07:42:54 -04:00
Andreas Hansson 1ff4c45bbb mem: Avoid unecessary retries when bus peer is not ready
This patch removes unecessary retries that happened when the bus layer
itself was no longer busy, but the the peer was not yet ready. Instead
of sending a retry that will inevitably not succeed, the bus now
silenty waits until the peer sends a retry.
2014-09-03 07:42:53 -04:00
Curtis Dunham f6f63ec0aa mem: write streaming support via WriteInvalidate promotion
Support full-block writes directly rather than requiring RMW:
 * a cache line is allocated in the cache upon receipt of a
   WriteInvalidateReq, not the WriteInvalidateResp.
 * only top-level caches allocate the line; the others just pass
   the request along and invalidate as necessary.
 * to close a timing window between the *Req and the *Resp, a new
   metadata bit tracks whether another cache has read a copy of
   the new line before the writeback to memory.
2014-06-27 12:29:00 -05:00
Andreas Hansson 3be4f4b846 mem: Fix a bug in the cache port flow control
This patch fixes a bug in the cache port where the retry flag was
reset too early, allowing new requests to arrive before the retry was
actually sent, but with the event already scheduled. This caused a
deadlock in the interactions with the O3 LSQ.

The patche fixes the underlying issue by shifting the resetting of the
flag to be done by the event that also calls sendRetry(). The patch
also tidies up the flow control in recvTimingReq and ensures that we
also check if we already have a retry outstanding.
2014-09-03 07:42:50 -04:00
Curtis Dunham 5d029463ee cpu, mem: Make software prefetches non-blocking
Previously, they were treated so much like loads that they could stall
at the head of the ROB.  Now they are always treated like L1 hits.
If they actually miss, a new request is created at the L1 and tracked
from the MSHRs there if necessary (i.e. if it didn't coalesce with
an existing outstanding load).
2014-05-13 12:20:49 -05:00
Curtis Dunham e3b19cb294 mem: Refactor assignment of Packet types
Put the packet type swizzling (that is currently done in a lot of places)
into a refineCommand() member function.
2014-05-13 12:20:48 -05:00
Geoffrey Blake b404ffde60 cache: Fix handling of LL/SC requests under contention
If a set of LL/SC requests contend on the same cache block we
can get into a situation where CPUs will deadlock if they expect
a failed SC to supply them data.  This case happens where 3 or
more cores are contending for a cache block using LL/SC and the system
is configured where 2 cores are connected to a local bus and the
third is connected to a remote bus.  If a core on the local bus
sends an SCUpgrade and the core on the remote bus sends and SCUpgrade
they will race to see who will win the SC access.  In the meantime
if the other core appends a read to one of the SCUpgrades it will expect
to be supplied data by that SCUpgrade transaction.  If it happens that
the SCUpgrade that was picked to supply the data is failed, it will
drop the appended request for data and never respond, leaving the requesting
core to deadlock.  This patch makes all SC's behave as normal stores to
prevent this case but still makes sure to check whether it can perform
the update.
2014-09-03 07:42:31 -04:00
Andreas Hansson 77c28cc395 mem: Packet queue clean up
No change in functionality, just a bit of tidying up.
2014-09-03 07:42:28 -04:00
Andreas Hansson e1ac962939 arch: Cleanup unused ISA traits constants
This patch prunes unused values, and also unifies how the values are
defined (not using an enum for ALPHA), aligning the use of int vs Addr
etc.

The patch also removes the duplication of PageBytes/PageShift and
VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical
values and the latter has been removed.
2014-09-03 07:42:21 -04:00
Nilay Vaish 2cbe7c705b ruby: remove typedef of Index as int64
The Index type defined as typedef int64 does not really provide any help
since in most places we use primitive types instead of Index.  Also, the name
Index is very generic that it does not merit being used as a typename.
2014-09-01 16:55:50 -05:00
Nilay Vaish b4dade6fb2 ruby: PerfectSwitch: moves code to a per vnet helper function
This patch moves code from the wakeup() function to a operateVnet().
The aim is to improve the readiblity of the code.
2014-09-01 16:55:48 -05:00
Nilay Vaish 7a0d5aafe4 ruby: message buffers: significant changes
This patch is the final patch in a series of patches.  The aim of the series
is to make ruby more configurable than it was.  More specifically, the
connections between controllers are not at all possible (unless one is ready
to make significant changes to the coherence protocol).  Moreover the buffers
themselves are magically connected to the network inside the slicc code.
These connections are not part of the configuration file.

This patch makes changes so that these connections will now be made in the
python configuration files associated with the protocols.  This requires
each state machine to expose the message buffers it uses for input and output.
So, the patch makes these buffers configurable members of the machines.

The patch drops the slicc code that usd to connect these buffers to the
network.  Now these buffers are exposed to the python configuration system
as Master and Slave ports.  In the configuration files, any master port
can be connected any slave port.  The file pyobject.cc has been modified to
take care of allocating the actual message buffer.  This is inline with how
other port connections work.
2014-09-01 16:55:47 -05:00
Nilay Vaish 00286fc5cb build opts: add MI_example to NULL ISA
A later changeset changes the file src/python/swig/pyobject.cc to include
a header file that includes a header file generated at build time depending
on the PROTOCOL in use.  Since NULL ISA was not specifying any protocol,
this resulted in compilation problems.  Hence, the changeset.
2014-09-01 16:55:46 -05:00
Nilay Vaish d07abd9b5b mem: change the namespace Message to ProtoMessage
The namespace Message conflicts with the Message data type used extensively
in Ruby.  Since Ruby is being moved to the same Master/Slave ports based
configuration style as the rest of gem5, this conflict needs to be resolved.
Hence, the namespace is being renamed to ProtoMessage.
2014-09-01 16:55:46 -05:00
Nilay Vaish cee8faaad0 ruby: slicc: change the way configurable members are specified
There are two changes this patch makes to the way configurable members of a
state machine are specified in SLICC.  The first change is that the data
member declarations will need to be separated by a semi-colon instead of a
comma.  Secondly, the default value to be assigned would now use SLICC's
assignment operator i.e. ':='.
2014-09-01 16:55:45 -05:00
Nilay Vaish b1d3873ec5 ruby: slicc: improve the grammar
This patch changes the grammar for SLICC so as to remove some of the
redundant / duplicate rules.  In particular rules for object/variable
declaration and class member declaration have been unified. Similarly, the
rules for a general function and a class method have been unified.

One more change is in the priority of two rules.  The first rule is on
declaring a function with all the params typed and named.  The second rule is
on declaring a function with all the params only typed.  Earlier the second
rule had a higher priority.  Now the first rule has a higher priority.
2014-09-01 16:55:44 -05:00
Nilay Vaish 3202ec98e7 ruby: mesi three level: slight naming changes. 2014-09-01 16:55:44 -05:00
Nilay Vaish 557200725c ruby: slicc: donot prefix machine name to variables
This changeset does away with prefixing of member variables of state machines
with the identity of the machine itself.
2014-09-01 16:55:43 -05:00
Nilay Vaish 6ceb1aadc2 ruby: remove unused toString() from AbstractController 2014-09-01 16:55:42 -05:00
Nilay Vaish 00dbadcbb0 ruby: network: move getNumNodes() to base class
All the implementations were doing the same things.
2014-09-01 16:55:42 -05:00
Nilay Vaish cc2cc58869 ruby: eliminate type Time
There is another type Time in src/base class which results in a conflict.
2014-09-01 16:55:41 -05:00
Nilay Vaish 82d136285d ruby: move files from ruby/system to ruby/structures
The directory ruby/system is crowded and unorganized. Hence, the files the
hold actual physical structures, are being moved to the directory
ruby/structures.  This includes Cache Memory, Directory Memory,
Memory Controller, Wire Buffer, TBE Table, Perfect Cache Memory, Timer Table,
Bank Array.

The directory ruby/systems has the glue code that holds these structures
together.

--HG--
rename : src/mem/ruby/system/MachineID.hh => src/mem/ruby/common/MachineID.hh
rename : src/mem/ruby/buffers/MessageBuffer.cc => src/mem/ruby/network/MessageBuffer.cc
rename : src/mem/ruby/buffers/MessageBuffer.hh => src/mem/ruby/network/MessageBuffer.hh
rename : src/mem/ruby/buffers/MessageBufferNode.cc => src/mem/ruby/network/MessageBufferNode.cc
rename : src/mem/ruby/buffers/MessageBufferNode.hh => src/mem/ruby/network/MessageBufferNode.hh
rename : src/mem/ruby/system/AbstractReplacementPolicy.hh => src/mem/ruby/structures/AbstractReplacementPolicy.hh
rename : src/mem/ruby/system/BankedArray.cc => src/mem/ruby/structures/BankedArray.cc
rename : src/mem/ruby/system/BankedArray.hh => src/mem/ruby/structures/BankedArray.hh
rename : src/mem/ruby/system/Cache.py => src/mem/ruby/structures/Cache.py
rename : src/mem/ruby/system/CacheMemory.cc => src/mem/ruby/structures/CacheMemory.cc
rename : src/mem/ruby/system/CacheMemory.hh => src/mem/ruby/structures/CacheMemory.hh
rename : src/mem/ruby/system/DirectoryMemory.cc => src/mem/ruby/structures/DirectoryMemory.cc
rename : src/mem/ruby/system/DirectoryMemory.hh => src/mem/ruby/structures/DirectoryMemory.hh
rename : src/mem/ruby/system/DirectoryMemory.py => src/mem/ruby/structures/DirectoryMemory.py
rename : src/mem/ruby/system/LRUPolicy.hh => src/mem/ruby/structures/LRUPolicy.hh
rename : src/mem/ruby/system/MemoryControl.cc => src/mem/ruby/structures/MemoryControl.cc
rename : src/mem/ruby/system/MemoryControl.hh => src/mem/ruby/structures/MemoryControl.hh
rename : src/mem/ruby/system/MemoryControl.py => src/mem/ruby/structures/MemoryControl.py
rename : src/mem/ruby/system/MemoryNode.cc => src/mem/ruby/structures/MemoryNode.cc
rename : src/mem/ruby/system/MemoryNode.hh => src/mem/ruby/structures/MemoryNode.hh
rename : src/mem/ruby/system/MemoryVector.hh => src/mem/ruby/structures/MemoryVector.hh
rename : src/mem/ruby/system/PerfectCacheMemory.hh => src/mem/ruby/structures/PerfectCacheMemory.hh
rename : src/mem/ruby/system/PersistentTable.cc => src/mem/ruby/structures/PersistentTable.cc
rename : src/mem/ruby/system/PersistentTable.hh => src/mem/ruby/structures/PersistentTable.hh
rename : src/mem/ruby/system/PseudoLRUPolicy.hh => src/mem/ruby/structures/PseudoLRUPolicy.hh
rename : src/mem/ruby/system/RubyMemoryControl.cc => src/mem/ruby/structures/RubyMemoryControl.cc
rename : src/mem/ruby/system/RubyMemoryControl.hh => src/mem/ruby/structures/RubyMemoryControl.hh
rename : src/mem/ruby/system/RubyMemoryControl.py => src/mem/ruby/structures/RubyMemoryControl.py
rename : src/mem/ruby/system/SparseMemory.cc => src/mem/ruby/structures/SparseMemory.cc
rename : src/mem/ruby/system/SparseMemory.hh => src/mem/ruby/structures/SparseMemory.hh
rename : src/mem/ruby/system/TBETable.hh => src/mem/ruby/structures/TBETable.hh
rename : src/mem/ruby/system/TimerTable.cc => src/mem/ruby/structures/TimerTable.cc
rename : src/mem/ruby/system/TimerTable.hh => src/mem/ruby/structures/TimerTable.hh
rename : src/mem/ruby/system/WireBuffer.cc => src/mem/ruby/structures/WireBuffer.cc
rename : src/mem/ruby/system/WireBuffer.hh => src/mem/ruby/structures/WireBuffer.hh
rename : src/mem/ruby/system/WireBuffer.py => src/mem/ruby/structures/WireBuffer.py
rename : src/mem/ruby/recorder/CacheRecorder.cc => src/mem/ruby/system/CacheRecorder.cc
rename : src/mem/ruby/recorder/CacheRecorder.hh => src/mem/ruby/system/CacheRecorder.hh
2014-09-01 16:55:40 -05:00
Alexandru 5efbb4442a mem: adding architectural page table support for SE mode
This patch enables the use of page tables that are stored in system memory
and respect x86 specification, in SE mode. It defines an architectural
page table for x86 as a MultiLevelPageTable class and puts a placeholder
class for other ISAs page tables, giving the possibility for future
implementation.
2014-08-28 10:11:44 -05:00
Alexandru 26ac28dec2 mem: adding a multi-level page table class
This patch defines a multi-level page table class that stores the page table in
system memory, consistent with ISA specifications. In this way, cpu models that
use the actual hardware to execute (e.g. KvmCPU), are able to traverse the page
table.
2014-04-01 12:18:12 -05:00
Andreas Hansson 9e4cd5bf1e mem: Fix DRAMSim2 cycle check when restoring from checkpoint
This patch ensures the cycle check is still valid even restoring from
a checkpoint. In this case the DRAMSim2 cycle count is relative to the
startTick rather than 0.
2014-08-26 10:14:38 -04:00
Andreas Hansson 3efabb4b2f mem: Update DRAM controller comments
Update comments and add a reference for more information.
2014-08-26 10:13:03 -04:00
Andreas Hansson 56b7796e0d mem: Fix address interleaving bug in DRAM controller
This patch fixes a bug in the DRAM controller address decoding. In
cases where the DRAM burst size (e.g. 32 bytes in a rank with a single
LPDDR3 x32) was smaller than the channel interleaving size
(e.g. systems with a 64-byte cache line) one address bit effectively
got used as a channel bit when it should have been a low-order column
bit.

This patch adds a notion of "columns per stripe", and more clearly
deals with the low-order column bits and high-order column bits. The
patch also relaxes the granularity check such that it is possible to
use interleaving granularities other than the cache line size.

The patch also adds a missing M5_CLASS_VAR_USED to the tCK member as
it is only used in the debug build for now.
2014-08-26 10:12:45 -04:00
Mitch Hayenga f6f6ae461e mem: Properly set cache block status fields on writebacks
When a cacheline is written back to a lower-level cache,
tags->insertBlock() sets various status parameters. However these
status bits were cleared immediately after calling. This patch makes
it so that these status fields are not cleared by moving them outside
of the tags->insertBlock() call.
2014-08-13 06:57:24 -04:00
Anthony Gutierrez a628afedad mem: refactor LRU cache tags and add random replacement tags
this patch implements a new tags class that uses a random replacement policy.
these tags prefer to evict invalid blocks first, if none are available a
replacement candidate is chosen at random.

this patch factors out the common code in the LRU class and creates a new
abstract class: the BaseSetAssoc class. any set associative tag class must
implement the functionality related to the actual replacement policy in the
following methods:

accessBlock()
findVictim()
insertBlock()
invalidate()
2014-07-28 12:23:23 -04:00
Andreas Hansson 1f539ce4cc mem: DRAMPower trace output
This patch adds a DRAMPower flag to enable off-line DRAM power
analysis using the DRAMPower tool. A new DRAMPower flag is added
and a follow-on patch adds a Python script to post-process the output
and order it based on time stamps.

The long-term goal is to link DRAMPower as a library and provide the
commands through function calls to the model rather than first
printing and then parsing the commands. At the moment it is also up to
the user to ensure that the same DRAM configuration is used by the
gem5 controller model and DRAMPower.
2014-06-30 13:56:03 -04:00
Andreas Hansson b4ce51eb9e mem: Add bank and rank indices as fields to the DRAM bank
This patch adds the index of the bank and rank as a field so that we can
determine the identity of a given bank (reference or pointer) for the
power tracing. We also grab the opportunity of cleaning up the
arguments used for identifying the bank when activating.
2014-06-30 13:56:02 -04:00
Andreas Hansson d59bc8ee1f mem: Extend DRAM row bits from 16 to 32 for larger densities
This patch extends the DRAM row bits to 32 to support larger density
memories. Additional checks are also added to ensure the row fits in
the 32 bits.
2014-06-30 13:56:01 -04:00
Steve Reinhardt 0be64ffe2f style: eliminate equality tests with true and false
Using '== true' in a boolean expression is totally redundant,
and using '== false' is pretty verbose (and arguably less
readable in most cases) compared to '!'.

It's somewhat of a pet peeve, perhaps, but I had some time
waiting for some tests to run and decided to clean these up.

Unfortunately, SLICC appears not to have the '!' operator,
so I had to leave the '== false' tests in the SLICC code.
2014-05-31 18:00:23 -07:00
Nilay Vaish e685767b58 ruby: slicc: remove unused ids DNUCA* 2014-05-23 06:07:02 -05:00
Nilay Vaish 9c9257a612 ruby: remove old protocol documentation 2014-05-23 06:07:02 -05:00
Nilay Vaish 8bf41e41c1 ruby: message buffer: drop dequeue_getDelayCycles()
The functionality of updating and returning the delay cycles would now be
performed by the dequeue() function itself.
2014-05-23 06:07:02 -05:00
Andreas Hansson f800f268db mem: Update DDR3 and DDR4 based on datasheets
This patch makes a more firm connection between the DDR3-1600
configuration and the corresponding datasheet, and also adds a
DDR3-2133 and a DDR4-2400 configuration. At the moment there is also
an ongoing effort to align the choice of datasheets to what is
available in DRAMPower.
2014-05-09 18:58:49 -04:00
Andreas Hansson cc4ca78f99 mem: Add DRAM cycle time
This patch extends the current timing parameters with the DRAM cycle
time. This is needed as the DRAMPower tool expects timestamps in DRAM
cycles. At the moment we could get away with doing this in a
post-processing step as the DRAMPower execution is separate from the
simulation run. However, in the long run we want the tool to be called
during the simulation, and then the cycle time is needed.
2014-05-09 18:58:49 -04:00
Andreas Hansson 8c56efe747 mem: Simplify DRAM response scheduling
This patch simplifies the DRAM response scheduling based on the
assumption that they are always returned in order.
2014-05-09 18:58:48 -04:00
Andreas Hansson 8e3869411d mem: Add precharge all (PREA) to the DRAM controller
This patch adds the basic ingredients for a precharge all operation,
to be used in conjunction with DRAM power modelling.

Currently we do not try and apply any cleverness when precharging all
banks, thus even if only a single bank is open we use PREA as opposed
to PRE. At the moment we only have a single tRP (tRPpb), and do not
model the slightly longer all-bank precharge constraint (tRPab).
2014-05-09 18:58:48 -04:00
Andreas Hansson 0ba1e72e9b mem: Remove printing of DRAM params
This patch removes the redundant printing of DRAM params.
2014-05-09 18:58:48 -04:00
Andreas Hansson 6753cb705e mem: Add tRTP to the DRAM controller
This patch adds the tRTP timing constraint, governing the minimum time
between a read command and a precharge. Default values are provided
for the existing DRAM types.
2014-05-09 18:58:48 -04:00
Andreas Hansson 60799dc552 mem: Merge DRAM latency calculation and bank state update
This patch merges the two control paths used to estimate the latency
and update the bank state. As a result of this merging the computation
is now in one place only, and should be easier to follow as it is all
done in absolute (rather than relative) time.

As part of this change, the scheduling is also refined to ensure that
we look at a sensible estimate of the bank ready time in choosing the
next request. The bank latency stat is removed as it ends up being
misleading when the DRAM access code gets evaluated ahead of time (due
to the eagerness of waking the model up for scheduling the next
request).
2014-05-09 18:58:48 -04:00
Andreas Hansson b8631d9ae8 mem: Add tWR to DRAM activate and precharge constraints
This patch adds the write recovery time to the DRAM timing
constraints, and changes the current tRASDoneAt to a more generic
preAllowedAt, capturing when a precharge is allowed to take place.

The part of the DRAM access code that accounts for the precharge and
activate constraints is updated accordingly.
2014-05-09 18:58:48 -04:00
Andreas Hansson c735ef6cb0 mem: Merge DRAM page-management calculations
This patch treats the closed page policy as yet another case of
auto-precharging, and thus merges the code with that used for the
other policies.
2014-05-09 18:58:48 -04:00
Andreas Hansson 87f4c956c4 mem: Add DRAM power states to the controller
This patch adds power states to the controller. These states and the
transitions can be used together with the Micron power model. As a
more elaborate use-case, the transitions can be used to drive the
DRAMPower tool.

At the moment, the power-down modes are not used, and this patch
simply serves to capture the idle, auto refresh and active modes. The
patch adds a third state machine that interacts with the refresh state
machine.
2014-05-09 18:58:48 -04:00
Andreas Hansson babf072c1c mem: Ensure DRAM refresh respects timings
This patch adds a state machine for the refresh scheduling to
ensure that no accesses are allowed while the refresh is in progress,
and that all banks are propely precharged.

As part of this change, the precharging of banks of broken out into a
method of its own, making is similar to how activations are dealt
with. The idle accounting is also updated to ensure that the refresh
duration is not added to the time that the DRAM is in the idle state
with all banks precharged.
2014-05-09 18:58:48 -04:00
Andreas Hansson 5c2c3f598e mem: Make DRAM read/write switching less conservative
This patch changes the read/write event loop to use a single event
(nextReqEvent), along with a state variable, thus joining the two
control flows. This change makes it easier to follow the state
transitions, and control what happens when.

With the new loop we modify the overly conservative switching times
such that the write-to-read switch allows bank preparation to happen
in parallel with the bus turn around. Similarly, the read-to-write
switch uses the introduced tRTW constraint.
2014-05-09 18:58:48 -04:00
Mitch Hayenga a15b713cba mem: Squash prefetch requests from downstream caches
This patch squashes prefetch requests from downstream caches,
so that they do not steal cachelines away from caches closer
to the cpu.  It was originally coded by Mitch Hayenga and
modified by Aasheesh Kolli.
2014-05-09 18:58:46 -04:00
Sascha Bischoff e940bac278 mem: Auto-generate CommMonitor trace file names
Splits the CommMonitor trace_file parameter into three parameters. Previously,
the trace was only enabled if the trace_file parameter was set, and would be
written to this file. This patch adds in a trace_enable and trace_compress
parameter to the CommMonitor.

No trace is generated if trace_enable is set to False. If it is set to True, the
trace is written to a file based on the name of the SimObject in the simulation
hierarchy. For example, system.cluster.il1_commmonitor.trc. This filename can be
overridden by additionally specifying a file name to the trace_file parameter
(more on this later).

The trace_compress parameter will append .gz to any filename if set to True.
This enables compression of the generated traces. If the file name already ends
in .gz, then no changes are made.

The trace_file parameter will override the name set by the trace_enable
parameter. In the case that the specified name does not end in .gz but
trace_compress is set to true, .gz is appended to the supplied file name.
2014-05-09 18:58:46 -04:00