Commit graph

337 commits

Author SHA1 Message Date
Anthony Gutierrez 0b3897fc90 O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs
This patch fixes some problems with the drain/switchout functionality
for the O3 cpu and for the ARM ISA and adds some useful debug print
statements.

This is an incremental fix as there are still a few bugs/mem leaks with the
switchout code. Particularly when switching from an O3CPU to a
TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA
I haven't encountered any more assertion failures; now the kernel will
typically panic inside of simulation.
2012-08-15 10:38:08 -04:00
Anthony Gutierrez 7bf14aedbf cache: don't allow dirty data in the i-cache
removes the optimization that forwards an exclusive copy to a requester on a
read, only for the i-cache. this optimization isn't necessary because we
typically won't be writing to the i-cache.
2012-07-27 16:08:04 -04:00
Andreas Hansson b265d9925c Port: Align port names in C++ and Python
This patch is a first step to align the port names used in the Python
world and the C++ world. Ultimately it serves to make the use of
config.json together with output from the simulation easier, including
post-processing of statistics.

Most notably, the CPU, cache, and bus is addressed in this patch, and
there might be other ports that should be updated accordingly. The
dash name separator has also been replaced with a "." which is what is
used to concatenate the names in python, and a separation is made
between the master and slave port in the bus.
2012-07-09 12:35:39 -04:00
Andreas Hansson 46d9adb68c Port: Make getAddrRanges const
This patch makes getAddrRanges const throughout the code base. There
is no reason why it should not be, and making it const prevents adding
any unintentional side-effects.
2012-07-09 12:35:34 -04:00
Andreas Hansson 49407d76aa Port: Add isSnooping to slave port (asking master port)
This patch adds isSnooping to the slave port, and thus avoids going
through getMasterPort to be able to ask the master. Over the course of
the next few patches, all getMasterPort/getSlavePort in Port and
MemObject are to be protocol agnostic, and the snooping is part of the
protocol layer.

The function is already present on the master port, where it is
implemented by the module itself, e.g. a cache. On the slave side, it
is merely asking the connected master port. The same name is used by
both functions despite their difference in behaviour. The initial
design used isMasterSnooping on the slave port side, but the more
verbose function name was later changed.
2012-07-09 12:35:32 -04:00
Andreas Hansson 17f9270dad Port: Move retry from port base class to Master/SlavePort
This patch is the last part of moving all protocol-related
functionality out of the Port base class. All the send/recv functions
are already moved, and the retry (which still governs all the timing
transport functions) is the only part that remained in the base class.

The only point where this currently causes a bit of inconvenience is
in the bus where the retry list is global and holds Port pointers (not
Master/SlavePort). This is about to change with the split into a
request/response bus and will soon be removed anyway.

The patch has no impact on any regressions.
2012-07-09 12:35:31 -04:00
Andreas Hansson ff5718f042 Fix: Address a few benign memory leaks
This patch is the result of static analysis identifying a number of
memory leaks. The leaks are all benign as they are a result of not
deallocating memory in the desctructor. The fix still has value as it
removes false positives in the static analysis.
2012-07-09 12:35:30 -04:00
Lena Olson d2ebade5a5 Cache: Fix the LRU policy for classic memory hierarchy
The LRU policy always evicted the least recently touched way, even if it
contained valid data and another way was invalid, as can happen if a block has
been invalidated by coherance.  This can result in caches never warming up even
though they are replacing blocks.  This modifies the LRU policy to move blocks
to LRU position on invalidation.
2012-06-29 11:21:58 -04:00
Dam Sunwoo 7cbe0cf564 Mem: fix master id assertion in cache_impl.hh
The assertion was applied to the wrong packet.
This patch fixes the issue rerported by Xiang Jiang on the gem5-dev mailing list.
2012-06-29 11:19:07 -04:00
Ali Saidi 8d1e56bdcd Cache: Only invalidate a line in the cache when an uncacheable write is seen. 2012-06-29 11:18:29 -04:00
Ali Saidi c80cd4136e mem: Delay deleting of incoming packets by one call.
This patch is a temporary fix until Andreas' four-phase patches
get reviewed and committed. Removing FastAlloc seems to have exposed
an issue which previously was reasonable rare in which packets are freed
before the sending cache is done with them. This change puts incoming packets
no a pendingDelete queue which are deleted at the start of the next call and
thus breaks the dependency between when the caller returns true and when the
packet is actually used by the sending cache.

Running valgrind on a multi-core linux boot and the memtester results in no
valgrind warnings.
2012-06-07 10:59:03 -04:00
Ali Saidi 1b370431d0 sim: Remove FastAlloc
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe.
After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc
when running twolf for ARM.
2012-06-05 01:23:08 -04:00
Andreas Hansson 5880fbe96d Bus: Turn the PortId into a transport function parameter
The main aim of this patch is to arrive at a suitable port interface
for vector ports, including both the packet and the port id. This
patch changes the bus transport functions
(recvFunctional/Atomic/Timing) to require a PortId parameter
indicating the source port. Previously this information was passed by
setting the source field of the packet, and this is only required in
the case of a timing request.

With this patch, the use of the source and destination field is also
more restrictive, as they are only needed for timing accesses. The
modifications to these fields for atomic snoops is now removed
entirely, also making minor modifications to the cache.
2012-05-30 05:30:24 -04:00
Andreas Hansson cad802761a Packet: Unify the use of PortID in packet and port
This patch removes the Packet::NodeID typedef and unifies it with the
Port::PortId. The src and dest fields in the packet are used to hold a
port id (e.g. in the bus), and thus the two should actually be the
same.

The typedef PortID is now global (in base/types.hh) and aligned with
the ThreadID in terms of capitalisation and naming of the
InvalidPortID constant.

Before this patch, two flags were used for valid destination and
source, rather than relying on a named value (InvalidPortID), and
this is now redundant, as the src and dest field themselves are
sufficient to tell whether the current value is a valid port
identifier or not. Consequently, the VALID_SRC and VALID_DST are
removed.

As part of the cleaning up, a number of int parameters and local
variables are updated to use PortID.

Note that Ruby still has its own NodeID typedef. Furthermore, the
MemObject getMaster/SlavePort still has an int idx parameter with a
default value of -1 which should eventually change to PortID idx =
InvalidPortID.
2012-05-30 05:29:42 -04:00
Andreas Hansson 49da0497d3 Cache: Remove dangling doWriteback declaration
This patch removes the declaration of doWriteback as there is no
implementation for this member function.
2012-05-24 04:09:19 -04:00
Ali Saidi c02dc07424 Cache: restructure code that actually isn't a loop 2012-05-10 18:04:27 -05:00
Ali Saidi 4f66bcdd2e gem5: fix some iterator use and erase bugs 2012-05-10 18:04:27 -05:00
Ali Saidi 8cee4dacc8 gem5: Fix a number of incorrect case statements 2012-05-10 18:04:26 -05:00
Ali Saidi f6895e8bd4 Cache: Panic if you attempt to create a checkpoint with a cache in the system 2012-05-10 18:04:26 -05:00
Andreas Hansson 3fea59e162 MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
2012-05-01 13:40:42 -04:00
Andreas Hansson 750f33a901 MEM: Remove the Broadcast destination from the packet
This patch simplifies the packet by removing the broadcast flag and
instead more firmly relying on (and enforcing) the semantics of
transactions in the classic memory system, i.e. request packets are
routed from a master to a slave based on the address, and when they
are created they have neither a valid source, nor destination. On
their way to the slave, the request packet is updated with a source
field for all modules that multiplex packets from multiple master
(e.g. a bus). When a request packet is turned into a response packet
(at the final slave), it moves the potentially populated source field
to the destination field, and the response packet is routed through
any multiplexing components back to the master based on the
destination field.

Modules that connect multiplexing components, such as caches and
bridges store any existing source and destination field in the sender
state as a stack (just as before).

The packet constructor is simplified in that there is no longer a need
to pass the Packet::Broadcast as the destination (this was always the
case for the classic memory system). In the case of Ruby, rather than
using the parameter to the constructor we now rely on setDest, as
there is already another three-argument constructor in the packet
class.

In many places where the packet information was printed as part of
DPRINTFs, request packets would be printed with a numeric "dest" that
would always be -1 (Broadcast) and that field is now removed from the
printing.
2012-04-14 05:45:55 -04:00
Andreas Hansson dccca0d3a9 MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.

Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.

Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.

Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.

The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.

In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.
2012-04-14 05:45:07 -04:00
Andreas Hansson b00949d88b MEM: Enable multiple distributed generalized memories
This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.

All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.

To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.

Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.

--HG--
rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py
rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py
rename : src/mem/physical.cc => src/mem/abstract_mem.cc
rename : src/mem/physical.hh => src/mem/abstract_mem.hh
rename : src/mem/physical.cc => src/mem/simple_mem.cc
rename : src/mem/physical.hh => src/mem/simple_mem.hh
2012-04-06 13:46:31 -04:00
William Wang f9d403a7b9 MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.

The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.

The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.

The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.
2012-03-30 09:40:11 -04:00
Andreas Hansson c2d2ea99e3 MEM: Split SimpleTimingPort into PacketQueue and ports
This patch decouples the queueing and the port interactions to
simplify the introduction of the master and slave ports. By separating
the queueing functionality from the port itself, it becomes much
easier to distinguish between master and slave ports, and still retain
the queueing ability for both (without code duplication).

As part of the split into a PacketQueue and a port, there is now also
a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The
QueuedPort is useful for ports that want to leave the packet
transmission of outgoing packets to the queue and is used by both
master and slave ports. The SimpleTimingPort inherits from the
QueuedPort and adds the implemention of recvTiming and recvFunctional
through recvAtomic.

The PioPort and MessagePort are cleaned up as part of the changes.

--HG--
rename : src/mem/tport.cc => src/mem/packet_queue.cc
rename : src/mem/tport.hh => src/mem/packet_queue.hh
2012-03-22 06:36:27 -04:00
Ali Saidi eaa994e7f6 cache: Allow main memory to be at disjoint address ranges. 2012-03-09 09:59:25 -05:00
Ali Saidi d907d0ec72 Cache: Fix an issue with LRU when bonus block is used to complete transaction.
The block is never inserted because it's the one extra block in the cache, but
it can be invalidated twice in a row. In that case the block doesn't have a
new master id (beacuse it was never inserted), however it is valid and
the accounting goes wrong at that point.
2012-03-01 17:26:31 -06:00
Andreas Hansson 0cd0a8fdd3 MEM: Simplify cache ports preparing for master/slave split
This patch splits the two cache ports into a master (memory-side) and
slave (cpu-side) subclass of port with slightly different
functionality. For example, it is only the CPU-side port that blocks
incoming requests, and only the memory-side port that schedules send
events outside of what the transmit list dictates.

This patch simplifies the two classes by relying further on
SimpleTimingPort and also generalises the latter to better accommodate
the changes (introducing trySendTiming and scheduleSend). The
memory-side cache port overrides sendDeferredPacket to be able to not
only send responses from the transmit list, but also send requests
based on the MSHRs.

A follow on patch further simplifies the SimpleTimingPort and the
cache ports.
2012-02-24 11:52:49 -05:00
Andreas Hansson 5a9a743cfc MEM: Introduce the master/slave port roles in the Python classes
This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
2012-02-13 06:43:09 -05:00
Dam Sunwoo 230540e655 mem: fix cache stats to use request ids correctly
This patch fixes the cache stats to use the new request ids.
Cache stats also display the requestor names in the vector subnames.
Most cache stats now include "nozero" and "nonan" flags to reduce the
amount of excessive cache stat dump. Also, simplified
incMissCount()/incHitCount() functions.
2012-02-12 16:07:39 -06:00
Ali Saidi 8aaa39e93d mem: Add a master ID to each request object.
This change adds a master id to each request object which can be
used identify every device in the system that is capable of issuing a request.
This is part of the way to removing the numCpus+1 stats in the cache and
replacing them with the master ids. This is one of a series of changes
that make way for the stats output to be changed to python.
2012-02-12 16:07:38 -06:00
Mrinmoy Ghosh 7e104a1af2 prefetcher: Make prefetcher a sim object instead of it being a parameter on cache 2012-02-12 16:07:38 -06:00
Gabe Black ea8b347dc5 Merge with head, hopefully the last time for this batch. 2012-01-31 22:40:08 -08:00
Koan-Sin Tan 7d4f187700 clang: Enable compiling gem5 using clang 2.9 and 3.0
This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).

clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.
2012-01-31 12:05:52 -05:00
Andreas Hansson 4590b91fb8 MEM: Remove the otherPort from the cache ports
This patch is a very straight-forward simplification, removing the
unecessary otherPort pointer from the cache port. The pointer was only
used to forward range changes, and the address range is fixed for the
cache. Removing the pointer simplifies the transition to master/slave
ports.
2012-01-31 11:51:19 -05:00
Gabe Black c3d41a2def Merge with the main repo.
--HG--
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-28 07:24:01 -08:00
William Wang e731cf4c1d MEM: Remove the functional ports from the memory system
The functional ports are no longer used and this patch cleans up the
legacy that is still present in buses, memories, CPUs etc. Note that
this does not refer to the class FunctionalPort (already removed), but
rather ports with the name (and use) functional.
2012-01-17 12:55:09 -06:00
Andreas Hansson 07cf9d914b MEM: Separate queries for snooping and address ranges
This patch simplifies the address-range determination mechanism and
also unifies the naming across ports and devices. It further splits
the queries for determining if a port is snooping and what address
ranges it responds to (aiming towards a separation of
cache-maintenance ports and pure memory-mapped ports). Default
behaviours are such that most ports do not have to define isSnooping,
and master ports need not implement getAddrRanges.
2012-01-17 12:55:09 -06:00
Andreas Hansson 142380a373 MEM: Remove Port removeConn and MemObject deletePortRefs
Cleaning up and simplifying the ports and going towards a more strict
elaboration-time creation and binding of the ports.
2012-01-17 12:55:09 -06:00
Andreas Hansson de34e49d15 MEM: Simplify ports by removing EventManager
This patch removes the inheritance of EventManager from the ports and
moves all responsibility for event queues to the owner. Eventually the
event manager should be the interface block, which could either be the
structural owner or a subblock like a LSQ in the O3 CPU for example.
2012-01-17 12:55:09 -06:00
Andreas Hansson 13ef7a5647 MEM: Differentiate functional cache accesses from CPU and memory
This patch changes the functionalAccess member function in the cache
model such that it is aware of what port the access came from, i.e. if
it came from the CPU side or from the memory side. By adding this
information, it is possible to respect the 'forwardSnoops' flag for
snooping requests coming from the memory side and not forward
them. This fixes an outstanding issue with the IO bus getting accesses
that have no valid destination port and also cleans up future changes
to the bus model.
2012-01-17 12:55:07 -06:00
Gabe Black 36a822f08e Merge with main repository. 2012-01-07 02:10:34 -08:00
Gabe Black 85424bef19 SE/FS: Get rid of includes of config/full_system.hh. 2011-11-18 02:20:22 -08:00
Gabe Black 71c4534ce9 SE/FS: Get rid of FULL_SYSTEM in mem. 2011-11-07 01:13:43 -08:00
Gabe Black d735abe5da GCC: Get everything working with gcc 4.6.1.
And by "everything" I mean all the quick regressions.
2011-10-31 01:09:44 -07:00
Ali Saidi 0c29a97ba9 Prefetch: Don't prefetch if address is in the write queue.
Check that we're not currently writing back an address the prefetcher is trying
to prefetch before issuing it. We previously checked the mshrQueue and the cache
itself, but forgot to check the writeBuffer. This fixes a memory corrucption
issue with an L2 prefetcher.
2011-09-13 12:06:13 -05:00
Lisa Hsu f6a2ef22ff Fix build for gcc-4.2 opt/fast
Even though the code is safe, compiler flags a warning here, which are treated as errors for fast/opt. I know it's redundant but it has no side effects and fixes the compile.
2011-09-01 15:25:54 -07:00
Ali Saidi c9c2d979b8 Mem: Put prefetcher notify call before packet is deleted. 2011-08-19 15:08:08 -05:00
Ali Saidi 6779bd3e5d Prefetcher: Fix some memory leaks with the prefetcher. 2011-08-19 15:08:05 -05:00
Ali Saidi 147095cb08 Mem: Fix issue with prefetches originating at non-L1 caches getting stale data
Prefetch requests issued from the L2 or below wouldn't check if valid data is
present higher in the system. If a prefetch into the L2 occured at the same
time as writeback from a higher-level cache the dirty data could be replaced
in by unmodified data in memory.
2011-07-15 11:53:35 -05:00