Commit graph

5558 commits

Author SHA1 Message Date
Nilay Vaish
b422994fea MESI Protocol: Correct the virtual network in profile functions
The virtual network in a couple of places was incorrectly mentioned
as 3 in place of 1. This is being corrected.
2012-08-25 15:49:06 -05:00
Nilay Vaish
01f1430833 MESI Coherence Protocol: Add copyright notice 2012-08-25 13:16:45 -05:00
Andreas Hansson
2c1052cd4d DMA: Refactor the DMA device and align timing and atomic
This patch does a bunch of house-keeping updates on the DMA, including
indentation, and formatting, but most importantly breaks out the
response handling such that it can be shared between the atomic and
timing modes. It also removes a potential bug caused by the atomic
handling of responses only deleting the allocated request (pkt->req)
once the DMA action completes instead of doing so for every packet.

Before this patch, the handling of responses was near identical for
atomic and timing, but the code was simply duplicated. With this
patch, the handleResp method deals with the responses in both cases.

There are further updates to make after removing the NACKs, but that
will be part of a separate follow-up patch. This patch does not change
the behaviour of any regression.
2012-08-22 11:40:01 -04:00
Andreas Hansson
c60db56741 Packet: Remove NACKs from packet and its use in endpoints
This patch removes the NACK frrom the packet as there is no longer any
module in the system that issues them (the bridge was the only one and
the previous patch removes that).

The handling of NACKs was mostly avoided throughout the code base, by
using e.g. panic or assert false, but in a few locations the NACKs
were actually dealt with (although NACKs never occured in any of the
regressions). Most notably, the DMA port will now never receive a NACK
and the backoff time is thus never changed. As a consequence, the
entire backoff mechanism (similar to a PCI bus) is now removed and the
DMA port entirely relies on the bus performing the arbitration and
issuing a retry when appropriate. This is more in line with e.g. PCIe.

Surprisingly, this patch has no impact on any of the regressions. As
mentioned in the patch that removes the NACK from the bridge, a
follow-up patch should change the request and response buffer size for
at least one regression to also verify that the system behaves as
expected when the bridge fills up.
2012-08-22 11:39:59 -04:00
Andreas Hansson
a6074016e2 Bridge: Remove NACKs in the bridge and unify with packet queue
This patch removes the NACKing in the bridge, as the split
request/response busses now ensure that protocol deadlocks do not
occur, i.e. the message-dependency chain is broken by always allowing
responses to make progress without being stalled by requests. The
NACKs had limited support in the system with most components ignoring
their use (with a suitable call to panic), and as the NACKs are no
longer needed to avoid protocol deadlocks, the cleanest way is to
simply remove them.

The bridge is the starting point as this is the only place where the
NACKs are created. A follow-up patch will remove the code that deals
with NACKs in the endpoints, e.g. the X86 table walker and DMA
port. Ultimately the type of packet can be complete removed (until
someone sees a need for modelling more complex protocols, which can
now be done in parts of the system since the port and interface is
split).

As a consequence of the NACK removal, the bridge now has to send a
retry to a master if the request or response queue was full on the
first attempt. This change also makes the bridge ports very similar to
QueuedPorts, and a later patch will change the bridge to use these. A
first step in this direction is taken by aligning the name of the
member functions, as done by this patch.

A bit of tidying up has also been done as part of the simplifications.

Surprisingly, this patch has no impact on any of the
regressions. Hence, there was never any NACKs issued. In a follow-up
patch I would suggest changing the size of the bridge buffers set in
FSConfig.py to also test the situation where the bridge fills up.
2012-08-22 11:39:58 -04:00
Andreas Hansson
e317d8b9ff Port: Extend the QueuedPort interface and use where appropriate
This patch extends the queued port interfaces with methods for
scheduling the transmission of a timing request/response. The methods
are named similar to the corresponding sendTiming(Snoop)Req/Resp,
replacing the "send" with "sched". As the queues are currently
unbounded, the methods always succeed and hence do not return a value.

This functionality was previously provided in the subclasses by
calling PacketQueue::schedSendTiming with the appropriate
parameters. With this change, there is no need to introduce these
extra methods in the subclasses, and the use of the queued interface
is more uniform and explicit.
2012-08-22 11:39:56 -04:00
Andreas Hansson
70e99e0b91 Device: Remove overloaded pio_latency parameter
This patch removes the overloading of the parameter, which seems both
redundant, and possibly incorrect.

The PciConfigAll now also uses a Param.Latency rather than a
Param.Tick. For backwards compatibility it still sets the pio_latency
to 1 tick. All the comments have also been updated to not state that
it is in simticks when it is not necessarily the case.
2012-08-21 05:50:03 -04:00
Andreas Hansson
a81c969529 CPU: Remove overloaded function_trace_start parameter
This patch removes the overloading of the parameter, which seems both
redundant, and possibly incorrect.

The inorder CPU is particularly interesting as it uses a different
name for the parameter, and never make any use of it internally.
2012-08-21 05:49:43 -04:00
Andreas Hansson
5803309574 PacketQueue: Allow queuing in the same tick as desired send tick
This patch allows packets to be enqueued in the same tick as they are
intended to be sent. This does not imply they actually are sent that
tick, although that is possible.

This change is useful for module that use the queued ports primarly to
avoid handling the flow control involved in sending and retrying
packets.
2012-08-21 05:49:24 -04:00
Andreas Hansson
4be1ae3cf8 EventManager: Remove test for NULL pointer in constructor
This patch tidies up the EventManager constructor and prunes a corner
case where the EventManager would initialise its eventq pointer to
NULL. This would cause segmentation faults on actual use and should
never happen.
2012-08-21 05:49:18 -04:00
Andreas Hansson
016593f2e9 Clock: Make Tick unsigned and remove UTick
This patch makes the Tick unsigned and removes the UTick typedef. The
ticks should never be negative, and there was only one major issue
with removing it, caused by the o3 CPU using a -1 as an initial value.

The patch has no impact on any regressions.
2012-08-21 05:49:09 -04:00
Andreas Hansson
452217817f Clock: Move the clock and related functions to ClockedObject
This patch moves the clock of the CPU, bus, and numerous devices to
the new class ClockedObject, that sits in between the SimObject and
MemObject in the class hierarchy. Although there are currently a fair
amount of MemObjects that do not make use of the clock, they
potentially should do so, e.g. the caches should at some point have
the same clock as the CPU, potentially with a 1:n ratio. This patch
does not introduce any new clock objects or object hierarchies
(clusters, clock domains etc), but is still a step in the direction of
having a more structured approach clock domains.

The most contentious part of this patch is the serialisation of clocks
that some of the modules (but not all) did previously. This
serialisation should not be needed as the clock is set through the
parameters even when restoring from the checkpoint. In other words,
the state is "stored" in the Python code that creates the modules.

The nextCycle methods are also simplified and the clock phase
parameter of the CPU is removed (this could be part of a clock object
once they are introduced).
2012-08-21 05:49:01 -04:00
Nilay Vaish
0160d51483 Ruby Banked Array: add copyrights 2012-08-19 13:05:53 -05:00
Jason Power
44b4c96253 Ruby: Add RubySystem parameter to MemoryControl
This guarantees that RubySystem object is created before the MemoryController
object is created.
2012-08-16 23:39:36 -05:00
Nilay Vaish
649e377937 Alpha System: override startup(), instead of loadState()
Alpha System was overriding loadState() function to setup some functional
event. The system tried to read/write to memory before the Ruby memory had
unserialized the state. With this patch, Alpha System overrides the
startup() function, and sets up functional events in this function. This
works because startup() is called after Ruby memory system has unserialized
the memory state.
2012-08-16 23:45:21 -05:00
Anthony Gutierrez
0b3897fc90 O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs
This patch fixes some problems with the drain/switchout functionality
for the O3 cpu and for the ARM ISA and adds some useful debug print
statements.

This is an incremental fix as there are still a few bugs/mem leaks with the
switchout code. Particularly when switching from an O3CPU to a
TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA
I haven't encountered any more assertion failures; now the kernel will
typically panic inside of simulation.
2012-08-15 10:38:08 -04:00
Ali Saidi
dd1b346584 sysemul: bump all linux versions of for syscal emulation to 3.0.
New tool chains seem to be looking for kernel versions newer than what
this this was previously set to. Also take this opportunity to change
the hostname we report in uname to sim.gem5.org.
2012-08-15 10:38:04 -04:00
Jason Power
11411cc9c7 Ruby: Clean up topology changes
This patch moves instantiateTopology into Ruby.py and removes the
mem/ruby/network/topologies directory. It also adds some extra inheritance to
the topologies to clean up some issues in the existing topologies.
2012-08-10 13:50:42 -05:00
Nilay Vaish
706e84f2b8 System: set kernel to null, if unspecified. 2012-08-08 13:40:32 -05:00
Marc Orr
7cef6b9bef syscall emulation: Enabled getrlimit and getrusage for x86.
Added/moved rlimit constants to base linux header file.

This patch is a revised version of Vince Weaver's earlier patch.
2012-08-06 19:52:56 -07:00
Steve Reinhardt
f4b424cd53 SETranslatingPortProxy: fix bug in tryReadString()
Off-by-one loop termination meant that we were stuffing
the terminating '\0' into the std::string value, which
makes for difficult-to-debug string comparison failures.
2012-08-06 16:57:11 -07:00
Steve Reinhardt
73ef8bd168 process: add progName() virtual function
This replaces a (potentially uninitialized) string
field with a virtual function so that we can have
a safe interface without requiring changes to the
eio code.
2012-08-06 16:55:34 -07:00
Steve Reinhardt
e232152db6 syscall_emul: clean up open() code a bit. 2012-08-06 16:55:28 -07:00
Steve Reinhardt
b647b48bf4 str: add an overloaded startswith() utility method
for various string types and use it in a few places.
2012-08-06 16:52:49 -07:00
Marc Orr
d55115936e syscall emulation: Clean up ioctl handling, and implement for x86.
Enable different whitelists for different OS/arch combinations,
since some use the generic Linux definitions only, and others
use definitions inherited from earlier Unix flavors on those
architectures.

Also update x86 function pointers so ioctl is no longer
unimplemented on that platform.

This patch is a revised version of Vince Weaver's earlier patch.
2012-08-06 16:52:40 -07:00
Jason Power
6721b3e325 Ruby NetDest: add assert for bad element in netdest 2012-08-01 17:07:34 -05:00
Anthony Gutierrez
630068be6f dma: remove unused variable
this patch removes the actionInProgress field from the DmaPort class.
this variable is only defined and initiated in the ctor. it is never used.
2012-07-27 16:08:05 -04:00
Anthony Gutierrez
8133f2460f checker: make checker cpu id match its host's cpu id
when using the checker i ran into problems where an instruction reading the
cpu id register failed because the ids did not match, and hence, the result
of the instruction did not match. this patch ensures that the ids match so
this instruction does not fail. this problem only seemed to manifest itself
when multiple cores were in the system, either multi-core, or extra switched-
out cores present in the system.
2012-07-27 16:08:04 -04:00
Anthony Gutierrez
7bf14aedbf cache: don't allow dirty data in the i-cache
removes the optimization that forwards an exclusive copy to a requester on a
read, only for the i-cache. this optimization isn't necessary because we
typically won't be writing to the i-cache.
2012-07-27 16:08:04 -04:00
Anthony Gutierrez
2eb6b403c9 ARM: fix value of MISCREG_CTR returned by readMiscReg()
According to the A15 TRM the value of this register is as follows (assuming 16 word = 64 byte lines)
[31:29] Format - b100 specifies v7
[28] RAZ - b0
[27:24] CWG log2(max writeback size #words) - 0x4 16 words
[23:20] ERG log2(max reservation size #words) - 0x4 16 words
[19:16] DminLine log2(smallest dcache line #words) - 0x4 16 words
[15:14] L1Ip L1 index/tagging policy - b11 specifies PIPT
[13:4] RAZ - b0000000000
[3:0] IminLine log2(smallest icache line #words) - 0x4 16 words
2012-07-27 16:08:04 -04:00
Andreas Hansson
66f5124e2b Bridge: Use EventWrapper instead of Event subclass for sendEvent
This class simply cleans up the code by making use of the EventWrapper
convenience class to schedule the sendEvent in the bridge ports.
2012-07-23 09:32:19 -04:00
Nilay Vaish
11a551ae3a X86 CPUID: Return false if unknown processor family 2012-07-22 20:31:23 -05:00
Andreas Hansson
f00cba34eb Mem: Make SimpleMemory single ported
This patch changes the simple memory to have a single slave port
rather than a vector port. The simple memory makes no attempts at
modelling the contention between multiple ports, and any such
multiplexing and demultiplexing could be done in a bus (or crossbar)
outside the memory controller. This scenario also matches with the
ongoing work on a SimpleDRAM model, which will be a single-ported
single-channel controller that can be used in conjunction with a bus
(or crossbar) to create a multi-port multi-channel controller.

There are only very few regressions that make use of the vector port,
and these are all for functional accesses only. To facilitate these
cases, memtest and memtest-ruby have been updated to also have a
"functional" bus to perform the (de)multiplexing of the functional
memory accesses.
2012-07-12 12:56:13 -04:00
Nilay Vaish
b913af440b Ruby: remove config information from ruby.stats
This patch removes printConfig() functions from all structures in Ruby.
Most of the information is already part of config.ini, and where ever it
is not, it would become in due course.
2012-07-12 08:39:19 -05:00
Nilay Vaish
ce4e9a9a50 Ruby: remove some unused stuff from SLICC files 2012-07-12 08:39:18 -05:00
Brad Beckmann
8c18f6da9e x86: added page size in bytes tlb entry function 2012-07-11 12:21:04 -07:00
Brad Beckmann
5931087dcd ruby: improved DRAM reset comment 2012-07-11 09:44:34 -07:00
Marc Orr
387f843d51 syscall emulation: Add the futex system call. 2012-07-10 22:51:54 -07:00
Brad Beckmann
52540b1b78 x86: logSize and lruSeq are now optional ckpt params 2012-07-10 22:51:54 -07:00
Steve Reinhardt
2e47aaabc0 Add hook to call map() on Process from python.
This enables configuration scripts to set up mappings
from process virtual addresses to specific physical
addresses in SE mode.  This feature is needed to
support modeling of user-accessible memories or
devices in SE mode, avoiding the complexities of FS
mode and the need to write a device driver.
2012-07-10 22:51:54 -07:00
Brad Beckmann
645fa9c262 # User Brad Beckmann <Brad.Beckmann@amd.com>
ruby: fixed fatal print statement
2012-07-10 22:51:54 -07:00
Brad Beckmann
6f9bd33b73 ruby: remove the cpu assumptions for the random tester 2012-07-10 22:51:54 -07:00
Brad Beckmann
a22918dd41 # User Brad Beckmann <Brad.Beckmann@amd.com>
ruby: fixed msgptr print call
2012-07-10 22:51:54 -07:00
Brad Beckmann
884cd6f752 imported patch jason/slicc-external-structure-fix 2012-07-10 22:51:54 -07:00
Brad Beckmann
86d6b788f6 ruby: banked cache array resource model
This patch models a cache as separate tag and data arrays.  The patch exposes
the banked array as another resource that is checked by SLICC before a
transition is allowed to execute.  This is similar to how TBE entries and slots
in output ports are modeled.
2012-07-10 22:51:54 -07:00
Joel Hestness
467093ebf2 ruby: tag and data cache access support
Updates to Ruby to support statistics counting of cache accesses.  This feature
serves multiple purposes beyond simple stats collection.  It provides the
foundation for ruby to model the cache tag and data arrays as physical
resources, as well as provide the necessary input data for McPAT power
modeling.
2012-07-10 22:51:54 -07:00
Nuwan Jayasena
c10f348120 ruby: adds reset function to Ruby memory controllers 2012-07-10 22:51:54 -07:00
Nuwan Jayasena
1740c4c448 ruby: memory controllers now inherit from an abstract "MemoryControl" class 2012-07-10 22:51:53 -07:00
Brad Beckmann
4a52a6ea2d cpu: added assertions to ensure the correct proxies are used 2012-07-10 22:51:53 -07:00
Brad Beckmann
11b725c19d ruby: changes how Topologies are created
Instead of just passing a list of controllers to the makeTopology function
in src/mem/ruby/network/topologies/<Topo>.py we pass in a function pointer
which knows how to make the topology, possibly with some extra state set
in the configs/ruby/<protocol>.py file. Thus, we can move all of the files
from network/topologies to configs/topologies. A new class BaseTopology
is added which all topologies in configs/topologies must inheirit from and
follow its API.

--HG--
rename : src/mem/ruby/network/topologies/Crossbar.py => configs/topologies/Crossbar.py
rename : src/mem/ruby/network/topologies/Mesh.py => configs/topologies/Mesh.py
rename : src/mem/ruby/network/topologies/MeshDirCorners.py => configs/topologies/MeshDirCorners.py
rename : src/mem/ruby/network/topologies/Pt2Pt.py => configs/topologies/Pt2Pt.py
rename : src/mem/ruby/network/topologies/Torus.py => configs/topologies/Torus.py
2012-07-10 22:51:53 -07:00
Andreas Hansson
745274cbd4 EventManager: Rename queue accessor and remove cast operator
This patch renames the queue() accessor to the less ambigious
eventQueue, and also removes the cast operator. The queue() member
function cause problems in derived classes that declare members with
the same name, e.g. a MemObject subclass that has a packet queue on
its own. The operator is not causing any harm at this point, but as it
is not used there is little point in keeping it.
2012-07-09 12:35:46 -04:00
Andreas Hansson
d2f458e7b5 Mem: Make members relating to range and size constant
This patch makes the address-range related members const. The change
is trivial and merely ensures that they can be called on a const
memory.
2012-07-09 12:35:44 -04:00
Andreas Hansson
67e257f442 Port: Hide the queue implementation in SimpleTimingPort
This patch makes the queue implementation in the SimpleTimingPort
private to avoid confusion with the protected member queue in the
QueuedSlavePort. The SimpleTimingPort provides the queue_impl to the
QueuedSlavePort and it can be accessed via the reference in the base
class. The use of the member name queue is thus no longer overloaded.
2012-07-09 12:35:42 -04:00
Andreas Hansson
b265d9925c Port: Align port names in C++ and Python
This patch is a first step to align the port names used in the Python
world and the C++ world. Ultimately it serves to make the use of
config.json together with output from the simulation easier, including
post-processing of statistics.

Most notably, the CPU, cache, and bus is addressed in this patch, and
there might be other ports that should be updated accordingly. The
dash name separator has also been replaced with a "." which is what is
used to concatenate the names in python, and a separation is made
between the master and slave port in the bus.
2012-07-09 12:35:39 -04:00
Andreas Hansson
1c2ee987f3 Bus: Make the default bus width 8 bytes instead of 64
This patch changes the default bus width to a more sensible 8 bytes
(64 bits), which is in line with most on-chip buses. Although there
are cases where a wider or narrower bus is useful, the 8 bytes is a
good compromise to serve as the default.

This patch changes essentially all statistics, and will be bundled
with the outstanding changes to the bus.
2012-07-09 12:35:38 -04:00
Andreas Hansson
8caaac048a Bus: Split the bus into separate request/response layers
This patch splits the existing buses into multiple layers. The
non-coherent bus is split into a request and a response layer, and the
coherent bus adds an additional layer for the snoop responses. The
layer is modified to be templatised on the port type, such that the
different layers can have retryLists with either master or slave
ports. This patch also removes the dynamic cast from the retry, as
previously promised when moving the recvRetry from the port base class
to the master/slave port respectively.

Overall, the split bus more closely reflects any modern on-chip bus
and should be at step in the right direction. From this point, it
would be reasonable straight forward to add separate layers (and thus
contention points and arbitration) for each port and thus create a
true crossbar.

The regressions all produce the correct output, but have varying
degrees of changes to their statistics. A separate patch will be
pushed with the updates to the reference statistics.
2012-07-09 12:35:37 -04:00
Andreas Hansson
995e6e4670 Bus: Add a notion of layers to the buses
This patch moves all flow control, arbitration and state information
into a bus layer. The layer is thus responsible for all the state
transitions, and for keeping hold of the retry list. Consequently the
layer is also responsible for the draining.

With this change, the non-coherent and coherent bus are given a single
layer to avoid changing any temporal behaviour, but the patch opens up
for adding more layers.
2012-07-09 12:35:36 -04:00
Andreas Hansson
14f9c77dd3 Bus: Replace tickNextIdle and inRetry with a state variable
This patch adds a state enum and member variable in the bus, tracking
the bus state, thus eliminating the need for tickNextIdle and inRetry,
and fixing an issue that allowed the bus to be occupied by multiple
packets at once (hopefully it also makes it easier to understand the
code).

The bus, in its current form, uses tickNextIdle and inRetry to keep
track of the state of the bus. However, it only updates tickNextIdle
_after_ forwarding a packet using sendTiming, and the result is that
the bus is still seen as idle, and a module that receives the packet
and starts transmitting new packets in zero time will still see the
bus as idle (and this is done by a number of DMA devices). The issue
can also be seen in isOccupied where the bus calls reschedule on an
event instead of schedule.

This patch addresses the problem by marking the bus as _not_ idle
already by the time we conclude that the bus is not occupied and we
will deal with the packet.

As a result of not allowing multiple packets to occupy the bus, some
regressions have slight changes in their statistics. A separate patch
updates these accordingly.

Further ahead, a follow-on patch will introduce a separate state
variable for request/responses/snoop responses, and thus implement a
split request/response bus with separate flow control for the
different message types (even further ahead it will introduce a
multi-layer bus).
2012-07-09 12:35:35 -04:00
Andreas Hansson
46d9adb68c Port: Make getAddrRanges const
This patch makes getAddrRanges const throughout the code base. There
is no reason why it should not be, and making it const prevents adding
any unintentional side-effects.
2012-07-09 12:35:34 -04:00
Andreas Hansson
830391cad9 Port: Add getAddrRanges to master port (asking slave port)
This patch adds getAddrRanges to the master port, and thus avoids
going through getSlavePort to be able to ask the slave. Similar to the
previous patch that added isSnooping to the SlavePort, this patch aims
to introduce an additional level of hierarchy in the ports (base port
being protocol-agnostic) and getSlave/MasterPort will return port
pointers to these base classes.

The function is named getAddrRanges also on the master port, but does
nothing besides asking the connected slave port. The slave port, as
before, has to provide an implementation and actually produce a list
of address ranges. The initial design used the name getSlaveAddrRanges
for the new function, but the more verbose name was later changed.
2012-07-09 12:35:33 -04:00
Andreas Hansson
49407d76aa Port: Add isSnooping to slave port (asking master port)
This patch adds isSnooping to the slave port, and thus avoids going
through getMasterPort to be able to ask the master. Over the course of
the next few patches, all getMasterPort/getSlavePort in Port and
MemObject are to be protocol agnostic, and the snooping is part of the
protocol layer.

The function is already present on the master port, where it is
implemented by the module itself, e.g. a cache. On the slave side, it
is merely asking the connected master port. The same name is used by
both functions despite their difference in behaviour. The initial
design used isMasterSnooping on the slave port side, but the more
verbose function name was later changed.
2012-07-09 12:35:32 -04:00
Andreas Hansson
17f9270dad Port: Move retry from port base class to Master/SlavePort
This patch is the last part of moving all protocol-related
functionality out of the Port base class. All the send/recv functions
are already moved, and the retry (which still governs all the timing
transport functions) is the only part that remained in the base class.

The only point where this currently causes a bit of inconvenience is
in the bus where the retry list is global and holds Port pointers (not
Master/SlavePort). This is about to change with the split into a
request/response bus and will soon be removed anyway.

The patch has no impact on any regressions.
2012-07-09 12:35:31 -04:00
Andreas Hansson
ff5718f042 Fix: Address a few benign memory leaks
This patch is the result of static analysis identifying a number of
memory leaks. The leaks are all benign as they are a result of not
deallocating memory in the desctructor. The fix still has value as it
removes false positives in the static analysis.
2012-07-09 12:35:30 -04:00
Andreas Hansson
92eaac0711 gcc: Fix warnings for gcc 4.7 and clang 3.1
This patch fixes two warnings, one related to a narrowing conversion
(int to MachInst), and one due to the cast operator for arguments and
a mismatch in const-ness (const void* and void*).
2012-07-02 08:21:53 -04:00
Lena Olson
d2ebade5a5 Cache: Fix the LRU policy for classic memory hierarchy
The LRU policy always evicted the least recently touched way, even if it
contained valid data and another way was invalid, as can happen if a block has
been invalidated by coherance.  This can result in caches never warming up even
though they are replacing blocks.  This modifies the LRU policy to move blocks
to LRU position on invalidation.
2012-06-29 11:21:58 -04:00
Uri Wiener
fcccab0dcd Bus: enable non/coherent buses sub-classes
This patch merely changes several methods to be virtual in order to enable
non/coherent buses sub-classes.
2012-06-29 11:19:08 -04:00
Dam Sunwoo
7cbe0cf564 Mem: fix master id assertion in cache_impl.hh
The assertion was applied to the wrong packet.
This patch fixes the issue rerported by Xiang Jiang on the gem5-dev mailing list.
2012-06-29 11:19:07 -04:00
Matt Evans
579047c76d Mem: Fix a livelock resulting in LLSC/locked memory access implementation.
Currently when multiple CPUs perform a load-linked/store-conditional sequence,
the loads all create a list of reservations which is then scanned when the
stores occur.  A reservation matching the context and address of the store is
sought, BUT all reservations matching the address are also erased at this point.

The upshot is that a store-conditional will remove all reservations even if the
store itself does not succeed.  A livelock was observed using 7-8 CPUs where a
thread would erase the reservations of other threads, not succeed, loop and put
its own reservation in again only to have it blown by another thread that
unsuccessfully now tries to store-conditional -- no forward progress was made,
hanging the system.

The correct way to do this is to only blow a reservation when a store
(conditional or not) actually /occurs/ to its address.  One thread always wins
(the one that does the store-conditional first).
2012-06-29 11:19:05 -04:00
Nathanael Premillieu
af2b14a362 O3: Track if the RAS has been pushed or not to pop the RAS if neccessary.
Add new flag (named pushedRAS) in the PredictorHistory structure.
This flag tracks whether the RAS has been pushed or not during a prediction.
Then, in the squash function it is used to pop the RAS if necessary.
2012-06-29 11:18:29 -04:00
Ali Saidi
71daeb0b2b ARM: Fix identification of one RAS pop instruction.
The check should be with the op2 field, not with the op1 field.
2012-06-29 11:18:29 -04:00
Ali Saidi
8d1e56bdcd Cache: Only invalidate a line in the cache when an uncacheable write is seen. 2012-06-29 11:18:29 -04:00
Ali Saidi
7e3496c78c ARM: Update version of linux we claim to be to 3.0.0.
Static binaries generated with new versions of libc complain that the kernel
is too old otherwise.
2012-06-29 11:18:29 -04:00
Ali Saidi
aed8050824 ARM: Fix issue with predicted next pc being wrong because of advance() ordering.
npc in PCState for ARM was being calculated before the current flags were
updated with the next flags. This causes an issue as the npc is incremented by
two or four depending on the current flags (thumb or not) and was leading to
branches that were predicted correctly being identified as mispredicted.
2012-06-29 11:18:28 -04:00
Ali Saidi
c51fc5ceff ARM: Fix address range issue with VExpress EMM 2012-06-27 19:23:02 -04:00
Anthony Gutierrez
9764cde7f2 ARM: implement the ProcessInfo methods 2012-06-11 11:07:41 -04:00
Andreas Hansson
754a9570f2 Timing CPU: Remove a redundant port pointer
This patch is trivial and merely prunes a pointer that was never set
or used.
2012-06-08 12:45:24 -04:00
Andreas Hansson
a118c01716 Power: Fix MaxMiscDestRegs which was set to zero
This patch fixes a failing compilation caused by MaxMiscDestRegs being
zero. According to gcc 4.6, the result is a comparison that is always
false due to limited range of data type.
2012-06-08 12:44:17 -04:00
Nilay Vaish
d6609793d4 X86 TLB: Add a missing = sign 2012-06-07 17:03:45 -05:00
Ali Saidi
c80cd4136e mem: Delay deleting of incoming packets by one call.
This patch is a temporary fix until Andreas' four-phase patches
get reviewed and committed. Removing FastAlloc seems to have exposed
an issue which previously was reasonable rare in which packets are freed
before the sending cache is done with them. This change puts incoming packets
no a pendingDelete queue which are deleted at the start of the next call and
thus breaks the dependency between when the caller returns true and when the
packet is actually used by the sending cache.

Running valgrind on a multi-core linux boot and the memtester results in no
valgrind warnings.
2012-06-07 10:59:03 -04:00
Jayneel Gandhi
7183c3fd56 X86 TLB: Fix for gcc 4.4.3
Due to recent changes to X86 TLB, gem5 stopped compiling on
gcc version 4.4.3. This patch provides the fix for that problem. The patch
is tested on gcc 4.4.3. The change is not required for more recent
versions of gcc (like on 4.6.3).
2012-06-07 08:11:00 -05:00
Anthony Gutierrez
d6da3ff317 cpu: Don't init simple and inorder CPUs if they are defered.
initCPU() will be called to initialize switched out CPUs for the simple and
inorder CPU models. this patch prevents those CPUs from being initialized
because they should get their state from the active CPU when it is switched
out.
2012-06-05 14:20:13 -04:00
Ali Saidi
20d25b9da7 ISA: Back-out NoopMachInst as a StaticInstPtr change. 2012-06-05 13:52:30 -04:00
Ali Saidi
c06970b673 cpt: update some comments in the checkpoint migration script 2012-06-05 10:36:59 -04:00
William Wang
e5f0d6016b stats: when applying an operation to two vectors sum the components first.
Previously writing X/Y in a formula would result in:
x[0]/y[0] + x[1]/y[1]
In reality you want:
(x[0] +x[1])/(y[0] + y[1])
2012-06-05 01:23:11 -04:00
Dam Sunwoo
14539ccae1 Mem: add per-master stats to physmem
Added per-master stats (similar to cache stats) to physmem.
2012-06-05 01:23:11 -04:00
Geoffrey Blake
eced845a5e ARM: Add PCIe support to VExpress_EMM model and remove deprecated ELT 2012-06-05 01:23:11 -04:00
Chander Sudanthi
15228694d0 ARM: removed extra white space
Extra white space fixes in miscregs.hh
2012-06-05 01:23:10 -04:00
Chander Sudanthi
8a2ca2fd24 ARM: Fix MPIDR and MIDR register implementation.
This change allows designating a system as MP capable or not as some
bootloaders/kernels care that it's set right. You can have a single
processor MP capable system, but you can't have a multi-processor
UP only system. This change also fixes the initialization of the MIDR
register.
2012-06-05 01:23:10 -04:00
Chander Sudanthi
e60b2ac706 ARM: PS2 encoding fix
Fixed Disable encoding and added SetDefaults.
See http://wiki.osdev.org/Mouse_Input for encodings.
2012-06-05 01:23:10 -04:00
Ali Saidi
70d7d6cc7f sim: Provide a framework for detecting out of data checkpoints and migrating them. 2012-06-05 01:23:10 -04:00
Ali Saidi
2e988bbab0 stats: Add stats unittest for total calculations. 2012-06-05 01:23:10 -04:00
Ali Saidi
6df196b71e O3: Clean up the O3 structures and try to pack them a bit better.
DynInst is extremely large the hope is that this re-organization will put the
most used members close to each other.
2012-06-05 01:23:09 -04:00
Ali Saidi
1b370431d0 sim: Remove FastAlloc
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe.
After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc
when running twolf for ARM.
2012-06-05 01:23:08 -04:00
Ali Saidi
d6997777be ARM: Fix over-eager assert in gic. 2012-06-05 01:23:08 -04:00
Mitchell Hayenga
8294d49bb6 stats: Provide a mechanism to get a callback when stats are dumped.
This mechanism is useful for dumping output that is correlated with stats
dumping, but isn't tracked by the gem5 statistics.
2012-06-05 01:23:08 -04:00
Ali Saidi
0b0c5621ee ARM: Fix compilation on ARM after Gabe's change. 2012-06-05 01:23:08 -04:00
Gabe Black
008b17d816 ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst.
This eliminates a use of the ExtMachInst type outside of the ISAs.
2012-06-04 10:57:23 -07:00
Gabe Black
35fa5074aa X86: Ensure that the CPUID instruction always writes its outputs.
The CPUID instruction was implemented so that it would only write its results
if the instruction was successful. This works fine on the simple CPU where
unwritten registers retain their old values, but on a CPU like O3 with
renaming this is broken. The instruction needs to write the old values back
into the registers explicitly if they aren't being changed.
2012-06-04 10:43:09 -07:00
Gabe Black
7b73c36f5d X86: Ensure that the decoder's internal ExtMachInst is completely initialized.
There are some bits of some fields of the ExtMachInst which are not actually
used for anything but are included in the hash of an ExtMachInst for
simplicity and efficiency. This change makes sure the decoder's internal
working ExtMachInst is completely initialized, even these unused bits, so that
there isn't any nondeterministic behavior, no valgrind messages about
uninitialized variables, and no potential false misses/redundant entries in
the decode cache.
2012-06-04 10:43:08 -07:00
Andreas Hansson
0d32940711 Bus: Split the bus into a non-coherent and coherent bus
This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.

--HG--
rename : src/mem/bus.cc => src/mem/coherent_bus.cc
rename : src/mem/bus.hh => src/mem/coherent_bus.hh
rename : src/mem/bus.cc => src/mem/noncoherent_bus.cc
rename : src/mem/bus.hh => src/mem/noncoherent_bus.hh
2012-05-31 13:30:04 -04:00
Andreas Hansson
1d520cda80 gcc: Small fixes to compile with gcc 4.7
This patch makes two very minor changes to please gcc 4.7. The
CopyData function no longer exists and this has been replaced. For
some reason previous versions of gcc did not complain on the const
char casting not having an implementation, but this is now addressed.
2012-05-30 05:31:48 -04:00
Andreas Hansson
b8cf48accc Bus: Remove redundant packet parameter from isOccupied
This patch merely remove the Packet* from the isOccupied member
function. Historically this was used to check if the packet was an
express snoop, but this is now done outside this function (where
relevant).
2012-05-30 05:31:11 -04:00
Andreas Hansson
5880fbe96d Bus: Turn the PortId into a transport function parameter
The main aim of this patch is to arrive at a suitable port interface
for vector ports, including both the packet and the port id. This
patch changes the bus transport functions
(recvFunctional/Atomic/Timing) to require a PortId parameter
indicating the source port. Previously this information was passed by
setting the source field of the packet, and this is only required in
the case of a timing request.

With this patch, the use of the source and destination field is also
more restrictive, as they are only needed for timing accesses. The
modifications to these fields for atomic snoops is now removed
entirely, also making minor modifications to the cache.
2012-05-30 05:30:24 -04:00
Andreas Hansson
cad802761a Packet: Unify the use of PortID in packet and port
This patch removes the Packet::NodeID typedef and unifies it with the
Port::PortId. The src and dest fields in the packet are used to hold a
port id (e.g. in the bus), and thus the two should actually be the
same.

The typedef PortID is now global (in base/types.hh) and aligned with
the ThreadID in terms of capitalisation and naming of the
InvalidPortID constant.

Before this patch, two flags were used for valid destination and
source, rather than relying on a named value (InvalidPortID), and
this is now redundant, as the src and dest field themselves are
sufficient to tell whether the current value is a valid port
identifier or not. Consequently, the VALID_SRC and VALID_DST are
removed.

As part of the cleaning up, a number of int parameters and local
variables are updated to use PortID.

Note that Ruby still has its own NodeID typedef. Furthermore, the
MemObject getMaster/SlavePort still has an int idx parameter with a
default value of -1 which should eventually change to PortID idx =
InvalidPortID.
2012-05-30 05:29:42 -04:00
Andreas Hansson
6a54f7fc5f Packet: Updated comments for src and dest fields
This patch updates the comments for the src and dest fields to reflect
their actual use. Due to a number of patches (e.g. removing the
Broadcast flag), the old comments are no longer indicative of the
current usage.
2012-05-30 05:29:07 -04:00
Andreas Hansson
3b367db42c Bridge: Split deferred request, response and sender state
This patch splits the PacketBuffer class into a RequestState and a
DeferredRequest and DeferredResponse. Only the requests need a
SenderState, and the deferred requests and responses only need an
associated point in time for the request and the response queue.

Besides the cleaning up, the goal is to simplify the transition to a
new port handshake, and with these changes, the two packet queues are
starting to look very similar to the generic packet queue, but
currently they do a few unique things relating to the NACK and
counting of requests/responses that the packet queue cannot be
conveniently used. This will be addressed in a later patch.
2012-05-30 05:28:06 -04:00
Gabe Black
d9988ded3c X86: Use the HandyM5Reg to avoid a register read and some logic in the TLB. 2012-05-28 21:56:23 -07:00
Gabe Black
40084e0c3e X86: Move the GDT down to where it can be accessed in 32 bit mode.
The GDT can be accessed by user level software running in compatibility mode
by moving segment selectors into segment registers. The GDT needs to be set up
at an address accessible in this mode.
2012-05-27 19:01:08 -07:00
Gabe Black
1d96135087 X86: Truncate addresses to 32 bits except in 64 bit mode, not long mode.
A small change was added a while ago to keep addresses from overflowing 32
bits when larger addresses shouldn't be accessible to software. That change
truncated when not in long mode, but really it should have truncated when not
in 64 bit mode. The difference is whether compatibility mode is included, a
mode that's supposed to act like a legacy 32 bit mode.
2012-05-27 19:01:04 -07:00
Gabe Black
19df4e94ee ISA,CPU: Generalize and split out the components of the decode cache.
This will allow it to be specialized by the ISAs. The existing caching scheme
is provided by the BasicDecodeCache in the GenericISA namespace and is built
from the generalized components.

--HG--
rename : src/cpu/decode_cache.cc => src/arch/generic/decode_cache.cc
2012-05-26 13:45:12 -07:00
Gabe Black
0cba96ba6a CPU: Merge the predecoder and decoder.
These classes are always used together, and merging them will give the ISAs
more flexibility in how they cache things and manage the process.

--HG--
rename : src/arch/x86/predecoder_tables.cc => src/arch/x86/decoder_tables.cc
2012-05-26 13:44:46 -07:00
Gabe Black
eae1e97fb0 ISA: Make the decode function part of the ISA's decoder. 2012-05-25 00:55:24 -07:00
Gabe Black
276f3e9535 CPU: Simplify the implementation of the decode cache.
Also reorganize it to make it more amenable to being rearranged later.
2012-05-25 00:54:39 -07:00
Gabe Black
82a228bd43 Decode: Make the Decoder class defined per ISA.
--HG--
rename : src/cpu/decode.cc => src/arch/generic/decoder.cc
rename : src/cpu/decode.hh => src/arch/generic/decoder.hh
2012-05-25 00:53:37 -07:00
Andreas Hansson
49da0497d3 Cache: Remove dangling doWriteback declaration
This patch removes the declaration of doWriteback as there is no
implementation for this member function.
2012-05-24 04:09:19 -04:00
Andreas Hansson
3e0ed08706 Packet: Cleaning up packet command and attribute
This patch removes unused commands and attributes from the packet to
avoid any confusion. It is part of an effort to clear up how and where
different commands and attributes are used.
2012-05-23 09:18:04 -04:00
Andreas Hansson
01906f957a Config: Use the attribute naming and include ports in JSON
This patch changes the organisation of the JSON output slightly to
make it easier to traverse and use the files. Most importantly, the
hierarchical dictionaries now use keys that correspond to the
attribute names also in the case of VectorParams (used to be
e.f. "cpu0 cpu1"). It also adds the name and the path to each
SimObject directory entry. Before this patch, to get cpu0, you would
have to query dict['system']['cpu0 cpu1'][0] and this could be a dict
with 'cpu0' : { cpu parameters }. Now you use dict['system']['cpu'][0]
and get { cpu parameters } (where one is "name" : "cpu0").

Additionally this patch includes more verbose information about the
ports, specifying their role, and using a JSON array rather than a
concatenated string for the peer.
2012-05-23 09:16:39 -04:00
Andreas Hansson
d4847fe6ea DMA: Split the DMA device and IO device into seperate files
This patch moves the DMA device to its own set of files, splitting it
from the IO device. There are no behavioural changes associated with
this patch.

The patch also grabs the opportunity to do some very minor tidying up,
including some white space removal and pruning some redundant
parameters.

Besides the immediate benefits of the separation-of-concerns, this
patch also makes upcoming changes more streamlined as it split the
devices that are only slaves and the DMA device that also acts as a
master.

--HG--
rename : src/dev/io_device.cc => src/dev/dma_device.cc
rename : src/dev/io_device.hh => src/dev/dma_device.hh
2012-05-23 09:15:45 -04:00
Andreas Hansson
5b36cf623c MEM: Add a snooping DMA port subclass for table walker
This patch makes the (device) DmaPort non-snooping and removes the
recvSnoop constructor parameter and instead introduces a
SnoopingDmaPort subclass for the ARM table walker.

Functionality is unchanged, as are the stats, and the patch merely
clarifies that the normal DMA ports are not snooping (although they
may issue requests that are snooped by others, as done with PCI, PCIe,
AMBA4 ACE etc).

Currently this port is declared in the ARM table walker as it is not
used anywhere else. If other ports were to have similar behaviour it
could be moved in a future patch.
2012-05-23 09:14:12 -04:00
Andreas Hansson
31b4ac5cec Config: Exit with fatal if a port is already connected
This patch turns the existing warning into a fatal, as there should
never be any cases where a (non-vector) port is assigned to and then
later connected to something else. If this behaviour is allowed, as it
used to be, there are cases where the wrong number of C++ ports are
created when instantiating objects with VectorPorts (obviously that
could be fixed, but the better approach is to simply not allow it).
2012-05-23 09:01:56 -04:00
Nilay Vaish
1031fe7b6f Ruby: Remove the unused src/mem/ruby/common/Driver.* files. 2012-05-22 11:35:58 -05:00
Nilay Vaish
6a966d5eeb Ruby Sequencer: Schedule deadlock check event at correct time
The scheduling of the deadlock check event was being done incorrectly as the
clock was not being multiplied, so as to convert the time into ticks. This
patch removes that bug.
2012-05-22 11:32:57 -05:00
Nilay Vaish
4d4d212ae9 X86: Split Condition Code register
This patch moves the ECF and EZF bits to individual registers (ecfBit and
ezfBit) and the CF and OF bits to cfofFlag registers. This is being done
so as to lower the read after write dependencies on the the condition code
register. Ultimately we will have the following registers [ZAPS], [OF],
[CF], [ECF], [EZF] and [DF]. Note that this is only one part of the
solution for lowering the dependencies. The other part will check whether
or not the condition code register needs to be actually read. This would
be done through a separate patch.
2012-05-22 11:29:53 -05:00
Marc Orr
16a559c9c6 x86 ISA: Implement the sse3 haddps instruction.
Shuffle the 32 bit values into position, and then add in parallel.
2012-05-19 04:32:25 -07:00
Gabe Black
250c40799d Syscalls: warn when the length argument to mmap is excessive.
If the length argument to mmap is larger than the arbitrary but reasonable
limit of 4GB, there's a good chance that the value is nonsense and not
intentional. Rather than attempting to satisfy the mmap anyway, this change
makes gem5 warn to make it more apparent what's going wrong.
2012-05-19 04:13:47 -07:00
Lena Olson
8fe8efeb34 Mem: Fix size check when allocating physical memory 2012-05-14 20:31:33 -05:00
Koan-Sin Tan
0b2d5e20d1 ARM: fix the calculation of the values in the RV clocks
This clock is used by the linux scheduler.
2012-05-10 18:04:28 -05:00
Ali Saidi
331696582f stats: fix compilation of unit test. 2012-05-10 18:04:28 -05:00
Ali Saidi
ec50c78f83 stats: fix bug in assert for 2d vector 2012-05-10 18:04:28 -05:00
Chander Sudanthi
1965a89873 ARM: pl011 raw interrupt fix
Raw interrupt was not being set when interrupt was disabled.
This patch sets the raw interrupt regardless of the mask.
2012-05-10 18:04:28 -05:00
Chander Sudanthi
200689c53f ARM: EMM board address range fix
0x40000000 is reservered for external AXI addresses.  This address
range is not used currently.  Removed the range from the bridge.
2012-05-10 18:04:28 -05:00
Uri Wiener
29a5e6ff35 DOT: improved dot-based system visualization
Revised system visualization to reflect structure and memory hierarchy.
Improved visualization: less congested and cluttered; more colorful.
Nodes reflect components; directed edges reflect dirctional relation, from
a master port to a slave port. Requires pydot.
2012-05-10 18:04:27 -05:00
Uri Wiener
cb1b63ea61 DOT: fixed broken code for visualizing configuration using dot
Fixed broken code which visualizes the system configuration by generating a
tree from each component's children, starting from root.
Requires DOT (hence pydot).
2012-05-10 18:04:27 -05:00
Dam Sunwoo
f2f7fa1a1c ARM: guard masked symbol tables by default
Symbol tables masked with the loadAddrMask create redundant entries
that could conflict with kernel function events that rely on the
original addresses.  This patch guards the creation of those masked
symbol tables by default, with an option to enable them when needed
(for early-stage kernel debugging, etc.)
2012-05-10 18:04:27 -05:00
Ali Saidi
041b932428 mem: fix bug with CopyStringOut and null string termination. 2012-05-10 18:04:27 -05:00
Ali Saidi
c02dc07424 Cache: restructure code that actually isn't a loop 2012-05-10 18:04:27 -05:00
Ali Saidi
e029941bda dev: use correct delete operation in SimpleDisk 2012-05-10 18:04:27 -05:00
Ali Saidi
d9b484b41a ARM: Fix incorrect use of not operators in arm devices 2012-05-10 18:04:27 -05:00
Ali Saidi
5745665509 gem5: assert before indexing intro arrays to verify bounds 2012-05-10 18:04:27 -05:00
Ali Saidi
4f66bcdd2e gem5: fix some iterator use and erase bugs 2012-05-10 18:04:27 -05:00
Ali Saidi
5ecaf30219 gem5: fix a number of use after free issues 2012-05-10 18:04:27 -05:00
Ali Saidi
da10fbf5ca base: fix a invalid ?: operator 2012-05-10 18:04:27 -05:00
Ali Saidi
8cee4dacc8 gem5: Fix a number of incorrect case statements 2012-05-10 18:04:26 -05:00
Ali Saidi
413ba1fdaf stats: track if the stats have been enabled and prevent requesting master id
Track the point in the initialization where statistics have been registered.
After this point registering new masterIds can no longer work as some
SimObjects may have sized stats vectors based on the previous value. If someone
tries to register a masterId after this point the simulator executes fatal().
2012-05-10 18:04:26 -05:00
Ali Saidi
f6895e8bd4 Cache: Panic if you attempt to create a checkpoint with a cache in the system 2012-05-10 18:04:26 -05:00
Pritha Ghoshal
dc456d8166 IGbE: Fix writeback conditions for i8254x GbE in updated data sheet.
An older revision of the data sheet specified that txdctl.gran was 1 the granularity was
based on cache block and gran being 0 is based on descriptor count. The newer version of
the data sheet reverses this errata
2012-05-10 18:04:26 -05:00
Nathan Binkert
55411f7f71 stats: use nan instead of no_value 2012-05-09 11:51:42 -07:00
Andreas Hansson
ab23e29487 MEM: Add the communication monitor
This patch adds a communication monitor MemObject that can be inserted
between a master and slave port to provide a range of statistics about
the communication passing through it. The communication monitor is
non-invasive and does not change any properties or timing of the
packets, with the exception of adding a sender state to be able to
track latency. The statistics are only collected in timing mode (not
atomic) to avoid slowing down any fast forwarding.

An example of the statistics captured by the monitor are: read/write
burst lengths, bandwidth, request-response latency, outstanding
transactions, inter transaction time, transaction count, and address
distribution. The monitor can be used in combination with periodic
resetting and dumping of stats (through schedStatEvent) to study the
behaviour over time.

In future patches, a selection of convenience scripts will be added to
aid in visualising the statistics collected by the monitor.
2012-05-09 04:37:45 -04:00
Andreas Hansson
692351ea34 MEM: Do not forward uncacheable to bus snoopers
This patch adds a guarding if-statement to avoid forwarding
uncacheable requests (or rather their corresponding request packets)
to bus snoopers. These packets should never have any effect on the
caches, and thus there is no need to forward them to the snoopers.
2012-05-08 05:15:52 -04:00
Andreas Hansson
15e28c5ba6 Ruby: Ensure snoop requests are sent using sendTimingSnoopReq
This patch fixes a bug that caused snoop requests to be placed in a
packet queue. Instead, the packet is now sent immediately using
sendTimingSnoopReq, thus bypassing the packet queue and any normal
responses waiting to be sent.
2012-05-04 03:30:02 -04:00
Andreas Hansson
3fea59e162 MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
2012-05-01 13:40:42 -04:00
Gabe Black
2c85cf41a2 X86: Fix the IMUL_R_P_I macroop.
The disp displacement was left off the load microop so the wrong value was
used.
2012-04-29 02:26:34 -07:00
Vince Weaver
03a91b0533 X86: Fix up the open system call's flags. 2012-04-29 00:31:03 -07:00
Vince Weaver
38799e2b3f X86: Make gem5 ignore a bunch of syscalls. 2012-04-29 00:30:56 -07:00
Nilay Vaish
04a558bb41 Garnet: Correct computation of link utilization
The computation for link utilization was incorrect for the flexible network.
The utilization was being divided twice by the total time.
2012-04-28 16:57:31 -05:00
Nilay Vaish
c3dad222e3 Ruby: Remove extra statements from Sequencer 2012-04-25 17:52:03 -05:00
Andreas Hansson
beed20d7bc MEM: Use base class Master/SlavePort pointers in the bus
This patch makes some rather trivial simplifications to the bus in
that it changes the use of BusMasterPort and BusSlavePort pointers to
simply use MasterPort and SlavePort (iterators are also updated
accordingly).

This change is a step towards a future patch that introduces a
separation of the interface and the structural port itself.
2012-04-25 10:45:23 -04:00
Andreas Hansson
4c92708b48 MEM: Add the PortId type and a corresponding id field to Port
This patch introduces the PortId type, moves the definition of
INVALID_PORT_ID to the Port class, and also gives every port an id to
reflect the fact that each element in a vector port has an
identifier/index.

Previously the bus and Ruby testers (and potentially other users of
the vector ports) added the id field in their port subclasses, and now
this functionality is always present as it is moved to the base class.
2012-04-25 10:41:23 -04:00
Andreas Hansson
79750fc575 clang/gcc: Use STL hash function for int64_t and uint64_t
This patch changes the guards for the definition of hash functions to
also exclude the int64_t and uint64_t hash functions in the case we
are using the c++0x STL <unordered_map> (and <hash>) or the TR1
version of the same header. Previously the guard only covered the hash
function for strings, but it seems there is also no need to define a
hash for the 64-bit integer types, and this has caused problems with
builds on 32-bit Ubuntu.
2012-04-25 08:57:18 -04:00
Gabe Black
64bf90dca3 X86: Clear out duplicate TLB entries when adding a new one.
It's possible for two page table walks to overlap which will go in the same
place in the TLB's trie. They would land on top of each other, so this change
adds some code which detects if an address already matches an entry and if so
throws away the new one.
2012-04-24 00:48:41 -07:00
Gabe Black
74ca8a3cd0 ISA: Put parser generated files in a "generated" directory.
This is to avoid collision with non-generated files.
2012-04-23 12:00:41 -07:00
Gabe Black
80c6cdae18 base: Include cassert in trie.hh.
trie.hh uses assert, but it wasn't explicitly including cassert.
2012-04-22 05:20:44 -07:00
Gabe Black
29329e61b7 X86: Report an error if there's no kernel object, don't blindly use it.
This way the user gets a nice message instead of a less nice segfault.
2012-04-21 15:00:23 -07:00
Gabe Black
a5187f9d96 CPU: Tidy up some formatting and a DPRINTF in the simple CPU base class.
Put the { on the same line as the if and put a space between the if and the
open paren. Also, use the # format modifier which puts a 0x in front of hex
values automatically. If the ExtMachInst type isn't integral and actually
prints something more complicated, the # falls away harmlessly and we aren't
left with a phantom 0x followed by a bunch of unrelated text.
2012-04-15 12:35:49 -07:00
Gabe Black
8fe112d61b X86: Fix a tiny typo in the load/store microop constructor.
The parameter is _machInst, which is very similar to the member machInst. If
machInst is used to pass the parameter to a lower level constructor, what
really happens is that machInst is set to whatever it already happened to be,
effectively leaving it uninitialized.
2012-04-15 01:07:39 -07:00
Gabe Black
aacb676220 X86: Use the AddrTrie class to implement the TLB.
This change also adjusts the TlbEntry class so that it stores the number of
address bits wide a page is rather than its size in bytes. In other words,
instead of storing 4K for a 4K page, it stores 12. 12 is easy to turn into 4K,
but it's a little harder going the other way.
2012-04-14 23:24:18 -07:00
Gabe Black
d6031d72df sim: Update some comments in trie.hh that were meant to go in the last change. 2012-04-14 23:22:57 -07:00
Gabe Black
c4c27ded42 sim: A trie data structure specifically to speed up paging lookups.
This change adds a trie data structure which stores an arbitrary pointer type
based on an address and a number of relevant bits. Then lookups can be done
against the trie where the tree is traversed and the first legitimate match
found is returned.
2012-04-14 23:19:34 -07:00
Andreas Hansson
14edc6013d Ruby: Use MasterPort base-class pointers where possible
This patch simplifies future patches by changing the pointer type used
in a number of the Ruby testers to use MasterPort instead of using a
derived CpuPort class. There is no reason for using the more
specialised pointers, and there is no longer a need to do any casting.

With the latest changes to the tester, organising ports as readers and
writes, things got a bit more complicated, and the "type" now had to
be removed to be able to fall back to using MasterPort rather than
CpuPort.
2012-04-14 05:46:59 -04:00
Andreas Hansson
750f33a901 MEM: Remove the Broadcast destination from the packet
This patch simplifies the packet by removing the broadcast flag and
instead more firmly relying on (and enforcing) the semantics of
transactions in the classic memory system, i.e. request packets are
routed from a master to a slave based on the address, and when they
are created they have neither a valid source, nor destination. On
their way to the slave, the request packet is updated with a source
field for all modules that multiplex packets from multiple master
(e.g. a bus). When a request packet is turned into a response packet
(at the final slave), it moves the potentially populated source field
to the destination field, and the response packet is routed through
any multiplexing components back to the master based on the
destination field.

Modules that connect multiplexing components, such as caches and
bridges store any existing source and destination field in the sender
state as a stack (just as before).

The packet constructor is simplified in that there is no longer a need
to pass the Packet::Broadcast as the destination (this was always the
case for the classic memory system). In the case of Ruby, rather than
using the parameter to the constructor we now rely on setDest, as
there is already another three-argument constructor in the packet
class.

In many places where the packet information was printed as part of
DPRINTFs, request packets would be printed with a numeric "dest" that
would always be -1 (Broadcast) and that field is now removed from the
printing.
2012-04-14 05:45:55 -04:00
Andreas Hansson
dccca0d3a9 MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.

Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.

Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.

Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.

The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.

In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.
2012-04-14 05:45:07 -04:00
Andreas Hansson
b9bc530ad2 Regression: Add ANSI colours to highlight test status
This patch adds a very basic pretty-printing of the test status
(passed or failed) to highlight failing tests even more: green for
passed, and red for failed. The printing only uses ANSI it the target
output is a tty and supports ANSI colours. Hence, any regression
scripts that are outputting to files or sending e-mails etc should
still be fine.
2012-04-14 05:44:27 -04:00
Andreas Hansson
b6aa6d55eb clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6
This patch addresses a number of minor issues that cause problems when
compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it
avoids using the deprecated ext/hash_map and instead uses
unordered_map (and similarly so for the hash_set). To make use of the
new STL containers, g++ and clang has to be invoked with "-std=c++0x",
and this is now added for all gcc versions >= 4.6, and for clang >=
3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1
unordered_map to avoid the deprecation warning.

The addition of c++0x in turn causes a few problems, as the
compiler is more stringent and adds a number of new warnings. Below,
the most important issues are enumerated:

1) the use of namespaces is more strict, e.g. for isnan, and all
   headers opening the entire namespace std are now fixed.

2) another other issue caused by the more stringent compiler is the
   narrowing of the embedded python, which used to be a char array,
   and is now unsigned char since there were values larger than 128.

3) a particularly odd issue that arose with the new c++0x behaviour is
   found in range.hh, where the operator< causes gcc to complain about
   the template type parsing (the "<" is interpreted as the beginning
   of a template argument), and the problem seems to be related to the
   begin/end members introduced for the range-type iteration, which is
   a new feature in c++11.

As a minor update, this patch also fixes the build flags for the clang
debug target that used to be shared with gcc and incorrectly use
"-ggdb".
2012-04-14 05:43:31 -04:00
Steve Reinhardt
29482e90ba SCons: restore Werror option in src/SConscript
Partial backout of cset 8b223e308b08.

Although it's great that there's currently no need
for Werror=false in the current tree, some of us
have uncommitted code that still needs this option.
2012-04-13 08:13:04 -07:00
Andreas Hansson
c9634d9b38 Ruby: Ensure order-dependent iteration uses an ordered map
This patch fixes a bug in Ruby that caused non-deterministic
simulation when changing the underlying hash map implementation. The
reason is order-dependent behaviour in combination with iteration over
the hash map contents. The two locations where a sorted container is
assumed are now changed to make use of a std::map instead of the
unordered hash map.

With this change, the stats changes slightly and the follow-on
changeset will update the relevant statistics.
2012-04-12 08:35:49 -04:00
Gabe Black
15ca4f2fc7 tests: Fix building unit tests.
Unit tests shouldn't build in gem5's main function because they have thier
own.
2012-04-09 23:20:30 -07:00
Brad Beckmann
3fd425124c rubytest: remove spurious printf 2012-04-06 17:51:47 -07:00
Lisa Hsu
a5287efc58 slicc: Controllers attached to Sequencers no longer have to be named L1Cache. 2012-04-06 13:47:08 -07:00
Brad Beckmann
5dfa4cd3f5 sim-ruby: checkpointing fixes and dependent eventq improvements
Fixes checkpointing with respect to lost events after swapping event queues.
Also adds DPRINTFs to better understand what's going on when Ruby serializes
and unserializes.
2012-04-06 13:47:07 -07:00
Brad Beckmann
70682e36dd slicc: fixed error message when the type has no inheritance 2012-04-06 13:47:07 -07:00
Brad Beckmann
5838ed7290 MOESI_hammer: tbe allocation and dependent wakeup fixes 2012-04-06 13:47:07 -07:00
Brad Beckmann
f12961bf25 python: added __nonzero__ function to SimObject Bool params 2012-04-06 13:47:07 -07:00
Brad Beckmann
f050ebe3a8 MOESI_hammer: fixed bug with single cpu + flushes, then modified the regression tester to check this functionality 2012-04-06 13:47:06 -07:00
Brad Beckmann
0a9f4b950f rubytest: seperated read and write ports.
This patch allows the ruby tester to support protocols where the i-cache and d-cache
are managed by seperate controllers.
2012-04-06 13:47:06 -07:00
Andreas Hansson
b00949d88b MEM: Enable multiple distributed generalized memories
This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.

All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.

To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.

Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.

--HG--
rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py
rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py
rename : src/mem/physical.cc => src/mem/abstract_mem.cc
rename : src/mem/physical.hh => src/mem/abstract_mem.hh
rename : src/mem/physical.cc => src/mem/simple_mem.cc
rename : src/mem/physical.hh => src/mem/simple_mem.hh
2012-04-06 13:46:31 -04:00
Tushar Krishna
dbe1608fd5 NetworkTest: remove unnecessary memory allocation 2012-04-05 17:51:26 -04:00
Nilay Vaish
4f4a710457 Config: corrects the way Ruby attaches to the DMA ports
With recent changes to the memory system, a port cannot be assigned a peer
port twice. While making use of the Ruby memory system in FS mode, DMA
ports were assigned peer twice, once for the classic memory system
and once for the Ruby memory system. This patch removes this double
assignment of peer ports.
2012-04-05 11:09:19 -05:00
Andreas Hansson
aab2001ab7 Python: Make the All proxy traverse SimObject children as well
This patch changes the behaviour of the All proxy parameter to not
only consider the direct children, but also do a pre-order depth-first
traversal of the object tree and append all results from the
children.

This is used in a later patch to find all the memories in the system,
independent of where they are located in the hierarchy.
2012-04-05 10:44:35 -04:00
Andreas Hansson
a8e6adb0b1 Atomic: Remove the physmem_port and access memory directly
This patch removes the physmem_port from the Atomic CPU and instead
uses the system pointer to access the physmem when using the fastmem
option. The system already keeps track of the physmem and the valid
memory address ranges, and with this patch we merely make use of that
existing functionality. As a result of this change, the overloaded
getMasterPort in the Atomic CPU can be removed, thus unifying the CPUs.
2012-04-03 03:50:14 -04:00
Gabe Black
a7859f7e45 X86: Fix address size handling so real mode works properly.
Virtual (pre-segmentation) addresses are truncated based on address size, and
any non-64 bit linear address is truncated to 32 bits. This means that real
mode addresses aren't truncated down to 16 bits after their segment bases are
added in.
2012-03-31 12:27:33 -07:00
Andreas Hansson
74043c4f5c MEM: Remove legacy DRAM in preparation for memory updates
This patch removes the DRAM memory class in preparation for updates to
the memory system, with the first one introducing an abstract memory
class, and removing the assumption of a single physical memory.
2012-03-30 12:57:48 -04:00
Andreas Hansson
a128ba7cd1 Ruby: Remove the physMemPort and instead access memory directly
This patch removes the physMemPort from the RubySequencer and instead
uses the system pointer to access the physmem. The system already
keeps track of the physmem and the valid memory address ranges, and
with this patch we merely make use of that existing functionality. The
memory is modified so that it is possible to call the access functions
(atomic and functional) without going through the port, and the memory
is allowed to be unconnected, i.e. have no ports (since Ruby does not
attach it like the conventional memory system).
2012-03-30 09:42:36 -04:00
William Wang
f9d403a7b9 MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.

The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.

The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.

The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.
2012-03-30 09:40:11 -04:00
Andreas Hansson
a14013af3a CPU: Unify initMemProxies across CPUs and simulation modes
This patch unifies where initMemProxies is called, in the init()
method of each BaseCPU subclass, before TheISA::initCPU is
called. Moreover, it also ensures that initMemProxies is called in
both full-system and syscall-emulation mode, thus unifying also across
the modes. An additional check is added in the ThreadState to ensure
that initMemProxies is only called once.
2012-03-30 09:38:35 -04:00
Andreas Hansson
9d7c715c46 range_map: Enable const find and iteration
This patch adds const access functions to the range_map to enable its use
in a const context, similar to the STL container classes.
2012-03-26 05:37:00 -04:00
Andreas Hansson
312efd742e Power: Change bitfield name to avoid conflicts with range_map
This patch changes the name of a bitfield from W to W_FIELD to avoid
clashes with W being used as a class (typename) in the templatized
range_map. It also changes L to L_FIELD to avoid future problems. The
problem manifestes itself when the CPU includes a header that in turn
includes range_map.hh. The relevant parts of the decoder are updated.
2012-03-26 05:35:24 -04:00
Andreas Hansson
ca9790a2db Ruby: Fix Set::print for 32-bit hosts
This patch fixes a compilation error caused by a length mismatch on
32-bit hosts. The ifdef and sprintf is replaced by a csprintf.
2012-03-23 06:54:25 -04:00
Andreas Hansson
9727b1be18 MEM: Unify bus access methods and prepare for master/slave split
This patch unifies the recvFunctional, recvAtomic and recvTiming to
all be based on a similar structure: 1) extract information about the
incoming packet, 2) send it out to the appropriate snoopers, 3)
determine where it is going, and 4) forward it to the right
destination. The naming of variables across the different access
functions is now consistent as well.

Additionally, the patch introduces the member functions releaseBus and
retryWaiting to better distinguish between the two cases when we
should tell a sender to retry. The first case is when the bus goes
from busy to idle, and the second case is when it receives a retry
from a destination that did not immediatelly accept a packet.

As a very minor change, the MMU debug flag is no longer used in the bus.
2012-03-22 06:37:21 -04:00
Andreas Hansson
c2d2ea99e3 MEM: Split SimpleTimingPort into PacketQueue and ports
This patch decouples the queueing and the port interactions to
simplify the introduction of the master and slave ports. By separating
the queueing functionality from the port itself, it becomes much
easier to distinguish between master and slave ports, and still retain
the queueing ability for both (without code duplication).

As part of the split into a PacketQueue and a port, there is now also
a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The
QueuedPort is useful for ports that want to leave the packet
transmission of outgoing packets to the queue and is used by both
master and slave ports. The SimpleTimingPort inherits from the
QueuedPort and adds the implemention of recvTiming and recvFunctional
through recvAtomic.

The PioPort and MessagePort are cleaned up as part of the changes.

--HG--
rename : src/mem/tport.cc => src/mem/packet_queue.cc
rename : src/mem/tport.hh => src/mem/packet_queue.hh
2012-03-22 06:36:27 -04:00
Andreas Hansson
fb395b56dd Scons: Remove Werror=False in SConscript files
This patch removes the overriding of "-Werror" in a handful of
cases. The code compiles with gcc 4.6.3 and clang 3.0 without any
warnings, and thus without any errors. There are no functional changes
introduced by this patch. In the future, rather than ypassing
"-Werror", address the warnings.
2012-03-22 06:34:50 -04:00