This patch changes the CPU configuration used for the full-system ARM
regressions to increase the test coverage. Note that it is only the
core configuration, and not the caches etc.
This patch moves the instantiation of the memory controller outside
FSConfig and instead relies on the mem_ranges to pass the information
to the caller (e.g. fs.py or one of the regression scripts). The main
motivation for this change is to expose the structural composition of
the memory system and allow more tuning and configuration without
adding a large number of options to the makeSystem functions.
The patch updates the relevant example scripts to maintain the current
functionality. As the order that ports are connected to the memory bus
changes (in certain regresisons), some bus stats are shuffled
around. For example, what used to be layer 0 is now layer 1.
Going forward, options will be added to support the addition of
multi-channel memory controllers.
Most of the test cases currently contain a large amount of duplicated
boiler plate code. This changeset introduces a set of classes that
encapsulates most of the functionality when setting up a test
configuration.
The following base classes are introduced:
* BaseSystem - Basic system configuration that can be used for both
SE and FS simulation.
* BaseFSSystem - Basic FS configuration uni-processor and multi-processor
configurations.
* BaseFSSystemUniprocessor - Basic FS configuration for uni-processor
configurations. This is provided as a way
to make existing test cases backwards
compatible.
Architecture specific implementations are provided for ARM, Alpha, and
X86.
This patch uses the common L1, L2 and IOCache configuration for the
regressions that all share the same cache parameters. There are a few
regressions that use a slightly different configuration (memtest,
o3-timing=mp, simple-atomic-mp and simple-timing-mp), and the latter
are not changed in this patch. They will be updated in a future patch.
The common cache configurations are changed to match the ones used in
the regressions, and are slightly changed with respect to what they
were. Hopefully this means we can converge on a common base
configuration, used both in the normal user configurations and
regressions.
As only regressions that shared the same cache configuration are
updated, no regressions are affected.
This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.
The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.
As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.
This patch unifies the full-system regression config scripts and uses
the BaseCPU convenience method addTwoLevelCacheHierarchy to connect up
the L1s and L2, and create the bus inbetween.
The patch is a step on the way to use the clock period to express the
cache latencies, as the CPU is now the parent of the L1, L2 and L1-L2
bus, and these modules thus use the CPU clock.
The patch does not change the value of any stats, but plenty names,
and a follow-up patch contains the update to the stats, chaning
system.l2c to system.cpu.l2cache.
In the current caches the hit latency is paid twice on a miss. This patch lets
a configurable response latency be set of the cache for the backward path.
This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.
A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.
A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.
The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.
A bit of minor tidying up has also been done.
--HG--
rename : src/mem/bus.cc => src/mem/coherent_bus.cc
rename : src/mem/bus.hh => src/mem/coherent_bus.hh
rename : src/mem/bus.cc => src/mem/noncoherent_bus.cc
rename : src/mem/bus.hh => src/mem/noncoherent_bus.hh