Commit graph

22 commits

Author SHA1 Message Date
Andreas Hansson 1f6d5f8f84 mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.

--HG--
rename : src/mem/Bus.py => src/mem/XBar.py
rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc
rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh
rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc
rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh
rename : src/mem/bus.cc => src/mem/xbar.cc
rename : src/mem/bus.hh => src/mem/xbar.hh
2014-09-20 17:18:32 -04:00
Stephan Diestelhorst afa2428eca mem: Tie in the snoop filter in the coherent bus 2014-09-20 17:18:29 -04:00
Stephan Diestelhorst 7d488cc66f mem: Add a simple snoop counter per bus
This patch adds a simple counter for both total messages and a histogram for
the fan-out of snoop messages.  The fan-out describes to how many ports snoops
had to be sent per incoming request / snoop-from-below.  Without any
cleverness, this usually means to either all, or all but the requesting port.
2014-04-24 13:28:47 +01:00
Andreas Hansson 7e13c4d046 mem: Make returning snoop responses occupy response layer
This patch introduces a mirrored internal snoop port to facilitate
easy addition of flow control for the snoop responses that are turned
into normal responses on their return. To perform this, the slave
ports of the coherent bus are wrapped in internal master ports that
are passed as the source ports to the response layer in question.

As a result of this patch, there is more contention for the response
resources, and as such system performance will decrease slightly.

A consequence of the mirrored internal port is that the port the bus
tells to retry (the internal one) and the port actually retrying (the
mirrored) one are not the same. Thus, the existing check in tryTiming
is not longer correct. In fact, the test is redundant as the layer is
only in the retry state while calling sendRetry on the waiting port,
and if the latter does not immediately call the bus then the retry
state is left. Consequently the check is removed.
2013-05-30 12:54:02 -04:00
Andreas Hansson 2308f812ef mem: Make the buses multi layered
This patch makes the buses multi layered, and effectively creates a
crossbar structure with distributed contention ports at the
destination ports. Before this patch, a bus could have a single
request, response and snoop response in flight at any time, and with
these changes there can be as many requests as connected slaves (bus
master ports), and as many responses as connected masters (bus slave
ports).

Together with address interleaving, this patch enables us to create
high-throughput memory interconnects, e.g. 50+ GByte/s.
2013-05-30 12:54:01 -04:00
Andreas Hansson e82996d9da mem: Separate the two snoop response cases in the bus
This patch makes the flow control and state updates of the coherent
bus more clear by separating the two cases, i.e. forward as a snoop
response, or turn it into a normal response.

With this change it is also more clear what resources are being
occupied, and that we effectively bypass the busy check for the second
case. As a result of the change in resource usage some stats change.
2013-05-30 12:54:00 -04:00
Andreas Hansson cb62d39835 mem: Tidy up a few variables in the bus
This patch does some minor housekeeping on the bus code, removing
redundant code, and moving the extraction of the destination id to the
top of the functions using it.
2013-05-30 12:53:59 -04:00
Uri Wiener 91f7b065a9 mem: Add basic stats to the buses
This patch adds a basic set of stats which are hard to impossible to
implement using only communication monitors, and are needed for
insight such as bus utilization, transactions through the bus etc.

Stats added include throughput and transaction distribution, and also
a two-dimensional vector capturing how many packets and how much data
is exchanged between the masters and slaves connected to the bus.
2013-05-30 12:53:58 -04:00
Uri Wiener a8fbfefb5e mem: Adding verbose debug output in the memory system
This patch provides useful printouts throughut the memory system. This
includes pretty-printed cache tags and function call messages
(call-stack like).
2013-04-22 13:20:33 -04:00
Andreas Hansson 93a8423dea mem: Separate waiting for the bus and waiting for a peer
This patch splits the retryList into a list of ports that are waiting
for the bus itself to become available, and a map that tracks the
ports where forwarding failed due to a peer not accepting the
packet. Thus, when a retry reaches the bus, it can be sent to the
appropriate port that initiated that transaction.

As a consequence of this patch, only ports that are really ready to go
will get a retry, thus reducing the amount of redundant failed
attempts. This patch also makes it easier to reason about the order of
servicing requests as the ports waiting for the bus are now clearly
FIFO and much easier to change if desired.
2013-03-26 14:46:47 -04:00
Andreas Hansson 860155a5fc mem: Enforce strict use of busFirst- and busLastWordTime
This patch adds a check to ensure that the delay incurred by
the bus is not simply disregarded, but accounted for by someone. At
this point, all the modules do is to zero it out, and no additional
time is spent. This highlights where the bus timing is simply dropped
instead of being paid for.

As a follow up, the locations identified in this patch should add this
additional time to the packets in one way or another. For now it
simply acts as a sanity check and highlights where the delay is simply
ignored.

Since no time is added, all regressions remain the same.
2013-02-19 05:56:06 -05:00
Andreas Hansson b3fc8839c4 mem: Make packet bus-related time accounting relative
This patch changes the bus-related time accounting done in the packet
to be relative. Besides making it easier to align the cache timing to
cache clock cycles, it also makes it possible to create a Last-Level
Cache (LLC) directly to a memory controller without a bus inbetween.

The bus is unique in that it does not ever make the packets wait to
reflect the time spent forwarding them. Instead, the cache is
currently responsible for making the packets wait. Thus, the bus
annotates the packets with the time needed for the first word to
appear, and also the last word. The cache then delays the packets in
its queues before passing them on. It is worth noting that every
object attached to a bus (devices, memories, bridges, etc) should be
doing this if we opt for keeping this way of accounting for the bus
timing.
2013-02-19 05:56:06 -05:00
Andreas Hansson 7cd49b24d2 sim: Make clock private and access using clockPeriod()
This patch makes the clock member private to the ClockedObject and
forces all children to access it using clockPeriod(). This makes it
impossible to inadvertently change the clock, and also makes it easier
to transition to a situation where the clock is derived from e.g. a
clock domain, or through a multiplier.
2013-02-19 05:56:06 -05:00
Andreas Sandberg b904bd5437 sim: Add a system-global option to bypass caches
Virtualized CPUs and the fastmem mode of the atomic CPU require direct
access to physical memory. We currently require caches to be disabled
when using them to prevent chaos. This is not ideal when switching
between hardware virutalized CPUs and other CPU models as it would
require a configuration change on each switch. This changeset
introduces a new version of the atomic memory mode,
'atomic_noncaching', where memory accesses are inserted into the
memory system as atomic accesses, but bypass caches.

To make memory mode tests cleaner, the following methods are added to
the System class:

 * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'.
 * isTimingMode() -- True if the memory mode is 'timing'.
 * bypassCaches() -- True if caches should be bypassed.

The old getMemoryMode() and setMemoryMode() methods should never be
used from the C++ world anymore.
2013-02-15 17:40:09 -05:00
Andreas Sandberg b81a977e6a sim: Move the draining interface into a separate base class
This patch moves the draining interface from SimObject to a separate
class that can be used by any object needing draining. However,
objects not visible to the Python code (i.e., objects not deriving
from SimObject) still depend on their parents informing them when to
drain. This patch also gets rid of the CountedDrainEvent (which isn't
really an event) and replaces it with a DrainManager.
2012-11-02 11:32:01 -05:00
Andreas Hansson 43ca8415e8 Mem: Determine bus block size during initialisation
This patch moves the block size computation from findBlockSize to
initialisation time, once all the neighbouring ports are connected.

There is no need to dynamically update the block size, and the caching
of the value effectively avoided that anyhow. This is very similar to
what was already in place, just with a slightly leaner implementation.
2012-10-11 06:38:43 -04:00
Andreas Hansson b265d9925c Port: Align port names in C++ and Python
This patch is a first step to align the port names used in the Python
world and the C++ world. Ultimately it serves to make the use of
config.json together with output from the simulation easier, including
post-processing of statistics.

Most notably, the CPU, cache, and bus is addressed in this patch, and
there might be other ports that should be updated accordingly. The
dash name separator has also been replaced with a "." which is what is
used to concatenate the names in python, and a separation is made
between the master and slave port in the bus.
2012-07-09 12:35:39 -04:00
Andreas Hansson 8caaac048a Bus: Split the bus into separate request/response layers
This patch splits the existing buses into multiple layers. The
non-coherent bus is split into a request and a response layer, and the
coherent bus adds an additional layer for the snoop responses. The
layer is modified to be templatised on the port type, such that the
different layers can have retryLists with either master or slave
ports. This patch also removes the dynamic cast from the retry, as
previously promised when moving the recvRetry from the port base class
to the master/slave port respectively.

Overall, the split bus more closely reflects any modern on-chip bus
and should be at step in the right direction. From this point, it
would be reasonable straight forward to add separate layers (and thus
contention points and arbitration) for each port and thus create a
true crossbar.

The regressions all produce the correct output, but have varying
degrees of changes to their statistics. A separate patch will be
pushed with the updates to the reference statistics.
2012-07-09 12:35:37 -04:00
Andreas Hansson 995e6e4670 Bus: Add a notion of layers to the buses
This patch moves all flow control, arbitration and state information
into a bus layer. The layer is thus responsible for all the state
transitions, and for keeping hold of the retry list. Consequently the
layer is also responsible for the draining.

With this change, the non-coherent and coherent bus are given a single
layer to avoid changing any temporal behaviour, but the patch opens up
for adding more layers.
2012-07-09 12:35:36 -04:00
Andreas Hansson 14f9c77dd3 Bus: Replace tickNextIdle and inRetry with a state variable
This patch adds a state enum and member variable in the bus, tracking
the bus state, thus eliminating the need for tickNextIdle and inRetry,
and fixing an issue that allowed the bus to be occupied by multiple
packets at once (hopefully it also makes it easier to understand the
code).

The bus, in its current form, uses tickNextIdle and inRetry to keep
track of the state of the bus. However, it only updates tickNextIdle
_after_ forwarding a packet using sendTiming, and the result is that
the bus is still seen as idle, and a module that receives the packet
and starts transmitting new packets in zero time will still see the
bus as idle (and this is done by a number of DMA devices). The issue
can also be seen in isOccupied where the bus calls reschedule on an
event instead of schedule.

This patch addresses the problem by marking the bus as _not_ idle
already by the time we conclude that the bus is not occupied and we
will deal with the packet.

As a result of not allowing multiple packets to occupy the bus, some
regressions have slight changes in their statistics. A separate patch
updates these accordingly.

Further ahead, a follow-on patch will introduce a separate state
variable for request/responses/snoop responses, and thus implement a
split request/response bus with separate flow control for the
different message types (even further ahead it will introduce a
multi-layer bus).
2012-07-09 12:35:35 -04:00
Andreas Hansson 49407d76aa Port: Add isSnooping to slave port (asking master port)
This patch adds isSnooping to the slave port, and thus avoids going
through getMasterPort to be able to ask the master. Over the course of
the next few patches, all getMasterPort/getSlavePort in Port and
MemObject are to be protocol agnostic, and the snooping is part of the
protocol layer.

The function is already present on the master port, where it is
implemented by the module itself, e.g. a cache. On the slave side, it
is merely asking the connected master port. The same name is used by
both functions despite their difference in behaviour. The initial
design used isMasterSnooping on the slave port side, but the more
verbose function name was later changed.
2012-07-09 12:35:32 -04:00
Andreas Hansson 0d32940711 Bus: Split the bus into a non-coherent and coherent bus
This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.

--HG--
rename : src/mem/bus.cc => src/mem/coherent_bus.cc
rename : src/mem/bus.hh => src/mem/coherent_bus.hh
rename : src/mem/bus.cc => src/mem/noncoherent_bus.cc
rename : src/mem/bus.hh => src/mem/noncoherent_bus.hh
2012-05-31 13:30:04 -04:00