Commit graph

10568 commits

Author SHA1 Message Date
Andreas Hansson
341dbf2662 arch: Use const StaticInstPtr references where possible
This patch optimises the passing of StaticInstPtr by avoiding copying
the reference-counting pointer. This avoids first incrementing and
then decrementing the reference-counting pointer.
2014-09-27 09:08:36 -04:00
Andreas Hansson
deb2200671 scons: Address issues related to gcc 4.9.1
Fix a number few minor issues to please gcc 4.9.1. Removing the
'-fuse-linker-plugin' flag means no libraries are part of the LTO
process, but hopefully this is an acceptable loss, as the flag causes
issues on a lot of systems (only certain combinations of gcc, ld and
ar work).
2014-09-27 09:08:34 -04:00
Curtis Dunham
4836aef1e4 dev: Output invalid access size in IsaFake panic 2014-09-27 09:08:33 -04:00
Curtis Dunham
b7f1d675da mem: Output precise range when XBar has conflicts 2014-09-27 09:08:32 -04:00
Curtis Dunham
725be98fe8 mem: Provide better diagnostic for unconnected port
When _masterPort is null, a message to that effect is
more helpful than a segfault.
2014-09-27 09:08:30 -04:00
Andreas Hansson
de62aedabc misc: Fix a bunch of minor issues identified by static analysis
Add some missing initialisation, and fix a handful benign resource
leaks (including some false positives).
2014-09-27 09:08:29 -04:00
Steve Reinhardt
71d5f03175 stats: update t1000 stats for recent changes 2014-09-21 23:04:39 -04:00
Steve Reinhardt
8649757135 stats: update eio stats for recent changes 2014-09-21 16:15:14 -04:00
Andreas Hansson
c4e91289ae stats: Bump stats for filter, crossbar and config changes
This patch bumps the stats to reflect the addition of the snoop filter
and snoop stats, the change from bus to crossbar, and the updates to
the ARM regressions that are now using a different CPU and cache
configuration. Lastly, some minor changes are expected due to the
activation cleanup of the CPUs.
2014-09-20 17:18:53 -04:00
Mitch Hayenga
cc6523e2d6 cpu: Remove unused deallocateContext calls
The call paths for de-scheduling a thread are halt() and suspend(), from
the thread context. There is no call to deallocateContext() in general,
though some CPUs chose to define it. This patch removes the function
from BaseCPU and the cores which do not require it.
2014-09-20 17:18:36 -04:00
Mitch Hayenga
e1403fc2af alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate
activate(), suspend(), and halt() used on thread contexts had an optional
delay parameter. However this parameter was often ignored. Also, when used,
the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were
ever specified). This patch removes the delay parameter and 'Events'
associated with them across all ISAs and cores. Unused activate logic
is also removed.
2014-09-20 17:18:35 -04:00
Andreas Hansson
2b0438a11e tests: Use more representative configs for ARM tests
This patch changes the CPU and cache configurations used in the ARM SE and FS
regressions to make them more representative, and also get better code
coverage by exercising different replacement policies and use an L2
prefetcher.
2014-09-20 17:18:33 -04:00
Andreas Hansson
1f6d5f8f84 mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.

--HG--
rename : src/mem/Bus.py => src/mem/XBar.py
rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc
rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh
rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc
rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh
rename : src/mem/bus.cc => src/mem/xbar.cc
rename : src/mem/bus.hh => src/mem/xbar.hh
2014-09-20 17:18:32 -04:00
Andreas Hansson
1884bcc03b tests: Add a memtest version using the ideal SnoopFilter
This patch adds a basic regression test for the snoop filter.

--HG--
rename : tests/configs/memtest.py => tests/configs/memtest-filter.py
2014-09-20 17:18:30 -04:00
Stephan Diestelhorst
435f4aec3d mem: Add access statistics for the snoop filter
Adds a simple access counter for requests and snoops for the snoop filter and
also classifies hits based on whether a single other holder existed or whether
multiple shares held the line.
2014-04-25 12:36:16 +01:00
Stephan Diestelhorst
afa2428eca mem: Tie in the snoop filter in the coherent bus 2014-09-20 17:18:29 -04:00
Stephan Diestelhorst
7d488cc66f mem: Add a simple snoop counter per bus
This patch adds a simple counter for both total messages and a histogram for
the fan-out of snoop messages.  The fan-out describes to how many ports snoops
had to be sent per incoming request / snoop-from-below.  Without any
cleverness, this usually means to either all, or all but the requesting port.
2014-04-24 13:28:47 +01:00
Stephan Diestelhorst
fe98cb6be4 misc: Add functions for doing popcount and power-of-two checking
Adds two public domain algorithms for determining number of set bits and also
whether a value is a power of two, uses the builtin that is available in GCC
and clang for popcount.
2014-04-24 17:41:26 +01:00
Stephan Diestelhorst
ba98d598ae mem: Simple Snoop Filter
This is a first cut at a simple snoop filter that tracks presence of lines in
the caches "above" it. The snoop filter can be applied at any given cache
hierarchy and will then handle the caches above it appropriately; there is no
need to use this only in the last-level bus.

This design currently has some limitations: missing stats, no notion of clean
evictions (these will not update the underlying snoop filter, because they are
not sent from the evicting cache down), no notion of capacity for the snoop
filter and thus no need for invalidations caused by capacity pressure in the
snoop filter. These are planned to be added on top with future change sets.
2014-09-20 17:18:26 -04:00
Stephan Diestelhorst
16351ba8d6 energy: Tighter checking of levels for DFS systems
There are cases where users might by accident / intention specify less voltage
operating points thatn frequency points.  We consider one of these cases
special: giving only a single voltage to a voltage domain effectively renders
it as a static domain.  This patch adds additional logic in the auxiliary parts
of the functionality to handle these cases properly (simple driver asking for
N>1 operating levels, we should return the same voltage for all of them) and
adds error checking code in the voltage domain.
2014-08-12 19:00:44 +01:00
Stephan Diestelhorst
65aaf62714 energy: Add the Energy Controller in the right configs
Tie in the newly created energy controller components in the default
configurations.
2014-07-25 13:36:23 +01:00
Akash Bagdia
04e51e5e3e energy: Memory-mapped Energy Controller component
This patch provides an Energy Controller device that provides software
(driver) access to a DVFS handler. The device is currently residing in
the dev/arm tree, but there is nothing inherently ARM specific in the
behaviour. It is currently only tested and supported for ARM Linux,
hence the location.
2014-09-20 17:18:23 -04:00
Stephan Diestelhorst
4422d1322a energy: Small extentions and fixes for DVFS handler
These additions allow easier interoperability with and querying from an
additional controller which will be in a separate patch.  Also adding warnings
for changing the enabled state of the handler across checkpoint / resume and
deviating from the state in the configuration.

Contributed-by: Akash Bagdia <akash.bagdia@arm.com>
2014-06-16 14:59:44 +01:00
Wendy Elsasser
bf23847072 mem: Add DDR4 bank group timing
Added the following parameter to the DRAMCtrl class:
 - bank_groups_per_rank

This defaults to 1. For the DDR4 case, the default is overridden to indicate
bank group architecture, with multiple bank groups per rank.

Added the following delays to the DRAMCtrl class:
 - tCCD_L : CAS-to-CAS, same bank group delay
 - tRRD_L : RAS-to-RAS, same bank group delay

These parameters are only applied when bank group timing is enabled.  Bank
group timing is currently enabled only for DDR4 memories.

For all other memories, these delays will default to '0 ns'

In the DRAM controller model, applied the bank group timing to the per bank
parameters actAllowedAt and colAllowedAt.
The actAllowedAt will be updated based on bank group when an ACT is issued.
The colAllowedAt will be updated based on bank group when a RD/WR burst is
issued.

At the moment no modifications are made to the scheduling.
2014-09-20 17:18:21 -04:00
Wendy Elsasser
b6ecfe9183 mem: Add memory rank-to-rank delay
Add the following delay to the DRAM controller:
 - tCS : Different rank bus turnaround delay

This will be applied for
 1) read-to-read,
 2) write-to-write,
 3) write-to-read, and
 4) read-to-write
command sequences, where the new command accesses a different rank
than the previous burst.

The delay defaults to 2*tCK for each defined memory class. Note that
this does not correspond to one particular timing constraint, but is a
way of modelling all the associated constraints.

The DRAM controller has some minor changes to prioritize commands to
the same rank. This prioritization will only occur when the command
stream is not switching from a read to write or vice versa (in the
case of switching we have a gap in any case).

To prioritize commands to the same rank, the model will determine if there are
any commands queued (same type) to the same rank as the previous command.
This check will ensure that the 'same rank' command will be able to execute
without adding bubbles to the command flow, e.g. any ACT delay requirements
can be done under the hoods, allowing the burst to issue seamlessly.
2014-09-20 17:17:57 -04:00
Wendy Elsasser
a384525355 cpu: Update DRAM traffic gen
Add new DRAM_ROTATE mode to traffic generator.
This mode will generate DRAM traffic that rotates across
banks per rank, command types, and ranks per channel

The looping order is illustrated below:
for (ranks per channel)
   for (command types)
      for (banks per rank)
         // Generate DRAM Command Series

This patch also adds the read percentage as an input argument to the
DRAM sweep script. If the simulated read percentage is 0 or 100, the
middle for loop does not generate additional commands.  This loop is
used only when the read percentage is set to 50, in which case the
middle loop will toggle between read and write commands.

Modified sweep.py script, which generates DRAM traffic.
Added input arguments and support for new DRAM_ROTATE mode.
The script now has input arguments for:
 1) Read percentage
 2) Number of ranks
 3) Address mapping
 4) Traffic generator mode  (DRAM or DRAM_ROTATE)

The default values are:
 100% reads, 1 rank, RoRaBaCoCh address mapping, and DRAM traffic gen mode

For the DRAM traffic mode, added multi-rank support.
2014-09-20 17:17:55 -04:00
Andreas Sandberg
3f7a9348dd dev: Add support for 9p proxying over VirtIO
This patch adds support for 9p filesystem proxying over VirtIO. It can
currently operate by connecting to a 9p server over a socket
(VirtIO9PSocket) or by starting the diod 9p server and connecting over
pipe (VirtIO9PDiod).


*WARNING*: Checkpoints are currently not supported for systems with 9p
 proxies!
2014-09-20 17:17:54 -04:00
Andreas Sandberg
8c070c8f1b dev: Add a VirtIO block device model 2014-09-20 17:17:53 -04:00
Andreas Sandberg
b8c9b04bd6 dev: Add a VirtIO console device model 2014-09-20 17:17:52 -04:00
Andreas Sandberg
bf2c2183c6 dev, pci: Implement basic VirtIO support
This patch adds support for VirtIO over the PCI bus. It does so by
providing the following new SimObjects:

 * VirtIODeviceBase - Abstract base class for VirtIO devices.
 * PciVirtIO - VirtIO PCI transport interface.

A VirtIO device is hooked up to the guest system by adding a PciVirtIO
device to the PCI bus and connecting it to a VirtIO device using the
vio parameter.

New VirtIO devices should inherit from VirtIODevice base and
implementing one or more VirtQueues. The VirtQueues are usually
device-specific and all derive from the VirtQueue class. Queues must
be registered with the base class from the constructor since the
device assumes that the number of queues stay constant.
2014-09-20 17:17:51 -04:00
Andreas Sandberg
0c5139310d dev: Refactor terminal<->UART interface to make it more generic
The terminal currently assumes that the transport to the guest always
inherits from the Uart class. This assumption breaks when
implementing, for example, a VirtIO consoles. This patch removes this
assumption by adding pointer to the from the terminal to the uart and
replacing it with a more general callback interface. The Uart, or any
other class using the terminal, class implements an instance of the
callbacks class and registers it with the terminal.
2014-09-20 17:17:50 -04:00
Andreas Hansson
0fa128bbd0 base: Clean up redundant string functions and use C++11
This patch does a bit of housekeeping on the string helper functions
and relies on the C++11 standard library where possible. It also does
away with our custom string hash as an implementation is already part
of the standard library.
2014-09-20 17:17:49 -04:00
Andrew Bardsley
b2c2e67468 base: Add getSectionNames to IniFile
Add an accessor to IniFile to list all the sections in the file.
2014-09-20 17:17:47 -04:00
Curtis Dunham
e553ca67d4 tests: automatically kill regressions that take too long
When GNU coreutils 'timeout' is available, limit each regression
simulation to 4 hours.
2014-08-25 14:32:00 -05:00
Mitch Hayenga
4f0e3cd4d7 cpu: Add ExecFlags debug flag
Adds a debug flag to print out the flags a instruction is tagged with.
2014-09-20 17:17:45 -04:00
Mitch Hayenga
3e5bf0c922 mem: Remove the GHB prefetcher from the source tree
There are two primary issues with this code which make it deserving of deletion.

1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher
2) This prefetcher isn't even structured like a GHB prefetcher.
   It's basically a worse version of the stride prefetcher.

It primarily serves to confuse new gem5 users and most functionality is already
present in the stride prefetcher.
2014-09-20 17:17:44 -04:00
Dam Sunwoo
ca3513d630 cpu: use probes infrastructure to do simpoint profiling
Instead of having code embedded in cpu model to do simpoint profiling use
the probes infrastructure to do it.
2014-09-20 17:17:43 -04:00
Andrew Bardsley
7329c0e20b config: Cleanup .json config file generation
This patch 'completes' .json config files generation by adding in the
SimObject references and String-valued parameters not currently
printed.

TickParamValues are also changed to print in the same tick-value
format as in .ini files.

This allows .json files to describe a system as fully as the .ini files
currently do.

This patch adds a new function config_value (which mirrors ini_str) to
each ParamValue and to SimObject.  This function can then be explicitly
changed to give different .json and .ini printing behaviour rather than
being written in terms of ini_str.
2014-09-20 17:17:42 -04:00
Andreas Hansson
41fc8a573e arch: Pass faults by const reference where possible
This patch changes how faults are passed between methods in an attempt
to copy as few reference-counting pointer instances as possible. This
should avoid unecessary copies being created, contributing to the
increment/decrement of the reference counters.
2014-09-19 10:35:18 -04:00
Andreas Hansson
619c5519fe cpu: Use a deque in o3 rename instruction queue
Switch from a list to a data structure with better data layout.
2014-09-19 10:35:14 -04:00
Andreas Hansson
586a219d11 base: Ensure the CP annotation compiles again
A bit of revamping to get the CP annotate functionality to compile.
2014-09-19 10:35:12 -04:00
Andreas Hansson
efd5cf323a misc: Use safe_cast when assumptions are made about return value
This patch changes two dynamic_cast to safe_cast as we assume the
return value is not NULL (without checking).
2014-09-19 10:35:11 -04:00
Andreas Hansson
32c111eda4 misc: Restore ostream flags where needed
This patch ensures we adhere to the normal ostream usage rules, and
restore the flags after modifying them.
2014-09-19 10:35:09 -04:00
Andreas Hansson
addfd89dce stats: Fix flow-control bug in Vector2D printing 2014-09-19 10:35:08 -04:00
Andreas Hansson
f615c4aeb0 misc: Remove assertions ensuring unsigned values >= 0 2014-09-19 10:35:07 -04:00
Andreas Hansson
377f081251 mem: Check return value of checkFunctional in SimpleMemory
Simple fix to ensure we only iterate until we are done.
2014-09-19 10:35:06 -04:00
Andreas Hansson
38646d48eb mem: Add checks to sendTimingReq in cache
A small fix to ensure the return value is not ignored.
2014-09-19 10:35:04 -04:00
Nilay Vaish
2ccdfc547d ruby: network: revert some of the changes from ad9c042dce54
The changeset ad9c042dce54 made changes to the structures under the network
directory to use a map of buffers instead of vector of buffers.
The reasoning was that not all vnets that are created are used and we
needlessly allocate more buffers than required and then iterate over them
while processing network messages.  But the move to map resulted in a slow
down which was pointed out by Andreas Hansson.  This patch moves things
back to using vector of message buffers.
2014-09-15 16:19:38 -05:00
Andreas Hansson
8d18713d28 stats: Minor update of Minor stats after uncacheable fix 2014-09-12 10:22:50 -04:00
Andrew Bardsley
1a45a8c5d3 cpu: Fix memory access in Minor not setting parent Request flags
This patch fixes cases where uncacheable/memory type flags are not set
correctly on a memory op which is split in the LSQ.  Without this
patch, request->request if freely used to check flags where the flags
should actually come from the accumulation of request fragment flags.

This patch also fixes a bug where an uncacheable access which passes
through tryToSendRequest more than once can increment
LSQ::numAccessesInMemorySystem more than once.
2014-09-12 10:22:49 -04:00