Couple of errors were discovered in 4eec7bdde5b0 which necessitated this patch.
Firstly, we create interrupt controllers in the se mode, but no piobus was
being created. RubyPort, which earlier used to ignore range changes now
forwards those to the piobus. The lack of piobus resulted in segmentation
fault. This patch creates a piobus even in se mode. It is not created only
when some tester is running. Secondly, I had missed out on modifying port
connections for other coherence protocols.
Currently, the interrupt controller in x86 is connected to the io bus
directly. Therefore the packets between the io devices and the interrupt
controller do not go through ruby. This patch changes ruby port so that
these packets arrive at the ruby port first, which then routes them to their
destination. Note that the patch does not make these packets go through the
ruby network. That would happen in a subsequent patch.
For some reason, the default x86 kernel is specified in
tests/configs/x86_generic.py and not in configs/common/FSConfig.py,
where the kernels for all the other ISAs are. This means that
running configs/example/fs.py for x86 fails because no kernel
is specified. Moving the specification over fixes this problem.
There is another problem that this uncovers, which is that going
past the init stage (i.e., past where the regression test stops)
fails because the fsck test on the disk device fails, but that's
a separate issue.
The output from the switcheroo tests is voluminous and
(because it includes timestamps) highly sensitive to
minor changes, leading to extremely large updates to the
reference outputs. This patch addresses this problem
by suppressing output from the tests. An internal
parameter can be set to enable the output. Wiring that
up to a command-line flag (perhaps even the rudimantary
-v/-q options in m5/main.py) is left for future work.
Update stats for recent changes. Mostly minor changes
in register access stats due to addition of new cc
register type and slightly different (and more accurate)
classification of int vs. fp register accesses.
In the unusual case that regressions are run with --update-ref
when there is no existing regression output, scons gets
confused because it depends on stats.txt to trigger the
update, but it has no indication that running the test will
generate the stats.txt file. (In the typical case where
stats.txt already exists, scons doesn't care about where
it came from.)
It's easy to fix this just by adding the stats.txt file
to the target list for the test action.
This patch simply brings the stats for the pc-simple-timing-ruby
regression up to date. The particular regression seems to give
different results on different systems unfortunately, and this update
reflects the current behaviour on zizzer.
The updates to the x87 caused the stats for several regressions to
change. This was mainly caused by the addition of a working 32-bit and
80-bit FP load instruction and xsave support.
Apparently only stats.txt was updated the last time, so
this changeset updates other reference output files
(config.ini, simout, simerr, ruby.stats) so that
test output diffs should not be cluttered with irrelevant
changes. There are a few stats.txt updates too, but
they are in the minority.
This patch simply takes a first step to use the NULL ISA build for
tests that do not make use of a CPU. Most of the Ruby tests could go
the same way, but to avoid duplicating a lot of compilation targets
that will have to wait until Ruby is built as a library and linked in
independently.
--HG--
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest/config.ini => tests/quick/se/50.memtest/ref/null/none/memtest/config.ini
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest/simerr => tests/quick/se/50.memtest/ref/null/none/memtest/simerr
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest/simout => tests/quick/se/50.memtest/ref/null/none/memtest/simout
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest/stats.txt => tests/quick/se/50.memtest/ref/null/none/memtest/stats.txt
rename : tests/quick/se/70.tgen/ref/arm/linux/tgen-simple-dram/simerr => tests/quick/se/70.tgen/ref/null/none/tgen-simple-dram/simerr
rename : tests/quick/se/70.tgen/ref/arm/linux/tgen-simple-dram/simout => tests/quick/se/70.tgen/ref/null/none/tgen-simple-dram/simout
rename : tests/quick/se/70.tgen/ref/arm/linux/tgen-simple-dram/stats.txt => tests/quick/se/70.tgen/ref/null/none/tgen-simple-dram/stats.txt
rename : tests/quick/se/70.tgen/ref/arm/linux/tgen-simple-mem/simerr => tests/quick/se/70.tgen/ref/null/none/tgen-simple-mem/simerr
rename : tests/quick/se/70.tgen/ref/arm/linux/tgen-simple-mem/simout => tests/quick/se/70.tgen/ref/null/none/tgen-simple-mem/simout
rename : tests/quick/se/70.tgen/ref/arm/linux/tgen-simple-mem/stats.txt => tests/quick/se/70.tgen/ref/null/none/tgen-simple-mem/stats.txt
The number of transitions per cycle that a controller can carry out is
a proxy for the number of ports that a controller has. This value is
currently 32 which is way too high. The patch introduces an option
for the number of ports and uses this option in the protocol files
to set the number of transitions. The default value is being set to
4. None of the se regressions change. Ruby stats for the fs regression
change and are being updated.
This patch updates the stats to reflect the: 1) addition of the
internal queue in SimpleMemory, 2) moving of the memory class outside
FSConfig, 3) fixing up of the 2D vector printing format, 4) specifying
burst size and interface width for the DRAM instead of relying on
cache-line size, 5) performing merging in the DRAM controller write
buffer, and 6) fixing how idle cycles are counted in the atomic and
timing CPU models.
The main reason for bundling them up is to minimise the changeset
size.
This patch changes the default parameter value of conf_table_reported
to match the common case. It also simplifies the regression and config
scripts to reflect this change.
This patch adds the notion of voltage domains, and groups clock
domains that operate under the same voltage (i.e. power supply) into
domains. Each clock domain is required to be associated with a voltage
domain, and the latter requires the voltage to be explicitly set.
A voltage domain is an independently controllable voltage supply being
provided to section of the design. Thus, if you wish to perform
dynamic voltage scaling on a CPU, its clock domain should be
associated with a separate voltage domain.
The current implementation of the voltage domain does not take into
consideration cases where there are derived voltage domains running at
ratio of native voltage domains, as with the case where there can be
on-chip buck/boost (charge pumps) voltage regulation logic.
The regression and configuration scripts are updated with a generic
voltage domain for the system, and one for the CPUs.
This patch moves the instantiation of the memory controller outside
FSConfig and instead relies on the mem_ranges to pass the information
to the caller (e.g. fs.py or one of the regression scripts). The main
motivation for this change is to expose the structural composition of
the memory system and allow more tuning and configuration without
adding a large number of options to the makeSystem functions.
The patch updates the relevant example scripts to maintain the current
functionality. As the order that ports are connected to the memory bus
changes (in certain regresisons), some bus stats are shuffled
around. For example, what used to be layer 0 is now layer 1.
Going forward, options will be added to support the addition of
multi-channel memory controllers.
This patch removes the sparse histogram total from the CommMonitor
stats. It also bumps the stats after the unit fixes in the atomic
cache access. Lastly, it updates the stats to match the new port
ordering. All numbers are the same, and the only thing that changes is
which master corresponds to what port index.
This patch adds the notion of source- and derived-clock domains to the
ClockedObjects. As such, all clock information is moved to the clock
domain, and the ClockedObjects are grouped into domains.
The clock domains are either source domains, with a specific clock
period, or derived domains that have a parent domain and a divider
(potentially chained). For piece of logic that runs at a derived clock
(a ratio of the clock its parent is running at) the necessary derived
clock domain is created from its corresponding parent clock
domain. For now, the derived clock domain only supports a divider,
thus ensuring a lower speed compared to its parent. Multiplier
functionality implies a PLL logic that has not been modelled yet
(create a separate clock instead).
The clock domains should be used as a mechanism to provide a
controllable clock source that affects clock for every clocked object
lying beneath it. The clock of the domain can (in a future patch) be
controlled by a handler responsible for dynamic frequency scaling of
the respective clock domains.
All the config scripts have been retro-fitted with clock domains. For
the System a default SrcClockDomain is created. For CPUs that run at a
different speed than the system, there is a seperate clock domain
created. This domain incorporates the CPU and the associated
caches. As before, Ruby runs under its own clock domain.
The clock period of all domains are pre-computed, such that no virtual
functions or multiplications are needed when calling
clockPeriod. Instead, the clock period is pre-computed when any
changes occur. For this to be possible, each clock domain tracks its
children.
This patch extends the existing system builders to also include a
syscall-emulation builder. This builder is deployed in all
syscall-emulation regressions that do not involve Ruby,
i.e. o3-timing, simple-timing and simple-atomic, as well as the
multi-processor regressions o3-timing-mp, simple-timing-mp and
simple-atomic-mp (the latter are only used by SPARC at this point).
The values chosen for the cache sizes match those that were used in
the existing config scripts (despite being on the large
side). Similarly, a mem_class parameter is added to the builder base
class to enable simple-atomic to use SimpleMemory and o3-timing to use
the default DDR3 configuration.
Due to the different order the ports are connected, the bus stats get
shuffled around for the multi-processor regressions. A separate patch
bumps the port indices. Besides this, all behaviour is exactly the
same.
This patch adds a 'sys_clock' command-line option and use it to assign
clocks to the system during instantiation.
As part of this change, the default clock in the System class is
removed and whenever a system is instantiated a system clock value
must be set. A default value is provided for the command-line option.
The configs and tests are updated accordingly.
This patch removes the explicit setting of the clock period for
certain instances of CoherentBus, NonCoherentBus and IOCache where the
specified clock is same as the default value of the system clock. As
all the values used are the defaults, there are no performance
changes. There are similar cases where the toL2Bus is set to use the
parent CPU clock which is already the default behaviour.
The main motivation for these simplifications is to ease the
introduction of clock domains.
This patch prunes the 00.gzip regressions with the main motivation
being that it adds little (or no) coverage and requires a substantial
amount of run time.
A complete regression run, including compilation from a clean repo, is
almost 20% faster(!).
This patch changes the regression script such that it is possible to
identify the runs that fail with an exit code, and those that finish
with stats differences. The ones that truly fail are reported as
FAILED, and those that finish with changed stats as CHANGED.
The yellow colour has been reclaimed from the skipped regressions and
is now used for the changed ones. With no obvious good option left the
skipped ones are now in cyan.
While I was editing the script I also bumped any occurence of M5 to
gem5.
Ruby's controller statistics have been mostly moved to stats.txt now.
Plus stats.txt for solaris/t1000-simple-atomic and arm/20.parser are
also being updated.
This patch updates the stats to reflect the addition of the bus stats,
and changes to the bus layers. In addition it updates the stats to
match the addition of the static pipeline latency of the memory
conotroller and the addition of a stat tracking the bytes per
activate.
This patch changes the class names of the variuos DRAM configurations
to better reflect what memory they are based on. The speed and
interface width is now part of the name, and also the alias that is
used to select them on the command line.
Some minor changes are done to the actual parameters, to better
reflect the named configurations. As a result of these changes the
regressions change slightly and the stats will be bumped in a separate
patch.
This patch enables selection of the memory controller class through a
mem-type command-line option. Behind the scenes, this option is
treated much like the cpu-type, and a similar framework is used to
resolve the valid options, and translate the short-hand description to
a valid class.
The regression scripts are updated with a hardcoded memory class for
the moment. The best solution going forward is probably to get the
memory out of the makeSystem functions, but Ruby complicates things as
it does not connect the memory controller to the membus.
--HG--
rename : configs/common/CpuConfig.py => configs/common/MemConfig.py
This changeset adds support for initializing a KVM VM in the
BaseSystem test class and adds the following methods in run.py:
require_file -- Test if a file exists and abort/skip if not.
require_kvm -- Test if KVM support has been compiled into gem5 (i.e.,
BaseKvmCPU exists) and the KVM device exists on the
host.
Add the options 'panic_on_panic' and 'panic_on_oops' to the
LinuxArmSystem SimObject. When these option are enabled, the simulator
panics when the guest kernel panics or oopses. Enable panic on panic
and panic on oops in ARM-based test cases.
The new changeset that can reorder Ruby profilers will cause the ruby.stats
files to reordered statistics (the point of the patch). Update the references
to ensure that these changes are reflected in regressions.
CPU switching consists of the following steps:
1. Drain the system
2. Switch out old CPUs (cpu.switchOut())
3. Change the system timing mode to the mode the new CPUs require
4. Flush caches if switching to hardware virtualization
5. Inform new CPUs of the handover (cpu.takeOverFrom())
6. Resume the system
m5.switchCpus() previously only did step 2 & 5. Since information
about the new processors' memory system requirements is now exposed,
do all of the steps above.
This patch adds automatic memory system switching and flush (if
needed) to switchCpus(). Additionally, it adds optional draining to
switchCpus(). This has the following implications:
* changeToTiming and changeToAtomic are no longer needed, so they have
been removed.
* changeMemoryMode is only used internally, so it is has been renamed
to be private.
* switchCpus requires a reference to the system containing the CPUs as
its first parameter.
WARNING: This changeset breaks compatibility with existing
configuration scripts since it changes the signature of
m5.switchCpus().
This patch moves the default DRAM parameters from the SimpleDRAM class
to two different subclasses, one for DDR3 and one for LPDDR2. More can
be added as we go forward.
The regressions that previously used the SimpleDRAM are now using
SimpleDDR3 as this is the most similar configuration.
The actual statistical values are being updated for only two tests belonging
to sparc architecture and inorder cpu: 00.hello and 02.insttest. For others
the patch updates config.ini and name changes to statistical variables.
This changeset adds a set of tests that stress the CPU switching
code. It adds the following test configurations:
* tsunami-switcheroo-full -- Alpha system (atomic, timing, O3)
* realview-switcheroo-atomic -- ARM system (atomic<->atomic)
* realview-switcheroo-timing -- ARM system (timing<->timing)
* realview-switcheroo-o3 -- ARM system (O3<->O3)
* realview-switcheroo-full -- ARM system (atomic, timing, O3)
Reference data is provided for the 10.linux-boot test case. All of the
tests trigger a CPU switch once per millisecond during the boot
process.
The in-order CPU model was not included in any of the tests as it does
not support CPU handover.
This patch generalises the address range resolution for the I/O cache
and I/O bridge such that they do not assume a single memory. The patch
involves adding a parameter to the system which is then defined based
on the memories that are to be visible from the I/O subsystem, whether
behind a cache or a bridge.
The change is needed to allow interleaved memory controllers in the
system.
This patch adds support for reading input traces encoded using
protobuf according to what is done in the CommMonitor.
A follow-up patch adds a Python script that can be used to convert the
previously used ASCII traces to protobuf equivalents. The appropriate
regression input is updated as part of this patch.
The EIO tests depend on the EIO support from the "encumbered"
repository, which means that they are not normally built with
gem5. This causes all EIO related tests to fail, which is both
annoying and confusing. This patch addresses this by adding support
for skipping tests if certain conditions (e.g., the presence of a
SimObject) can not be met. It introduces the following Python
functions that can be called from within a test case:
* skip_test -- Skip a test and optionally print why the test was
skipped.
* has_sim_object -- Test if a SimObject exists.
* require_sim_object -- Test if a SimObject exists and skip, or
optionally fail, the test if not.
Additionally, this patch updates the EIO tests to check for the
presence of EioProcess.
This patch adds packet tracing to the communication monitor using a
protobuf as the mechanism for creating the trace.
If no file is specified, then the tracing is disabled. If a file is
specified, then for every packet that is successfully sent, a protobuf
message is serialized to the file.
This patch changes the traffic generator period such that it does not
completely saturate the DRAM controller and create an ever-growing
backlog in the queued port.
A separate patch updates the stats.
The ISA class on stores the contents of ID registers on many
architectures. In order to make reset values of such registers
configurable, we make the class inherit from SimObject, which allows
us to use the normal generated parameter headers.
This patch introduces a Python helper method, BaseCPU.createThreads(),
which creates a set of ISAs for each of the threads in an SMT
system. Although it is currently only needed when creating
multi-threaded CPUs, it should always be called before instantiating
the system as this is an obvious place to configure ID registers
identifying a thread/CPU.
Previous to this change we didn't always set the memory mode which worked as
long as we never attempted to switch CPUs or checked that a CPU was in a
memory system with the correct mode. Future changes will make CPUs verify
that they're operating in the correct mode and thus we need to always set it.
Most of the test cases currently contain a large amount of duplicated
boiler plate code. This changeset introduces a set of classes that
encapsulates most of the functionality when setting up a test
configuration.
The following base classes are introduced:
* BaseSystem - Basic system configuration that can be used for both
SE and FS simulation.
* BaseFSSystem - Basic FS configuration uni-processor and multi-processor
configurations.
* BaseFSSystemUniprocessor - Basic FS configuration for uni-processor
configurations. This is provided as a way
to make existing test cases backwards
compatible.
Architecture specific implementations are provided for ARM, Alpha, and
X86.