sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Brandon Potter	9eda4bdc5a	ruby: remove extra whitespace and correct misspelled words	2015-07-10 16:05:23 -05:00
Andreas Sandberg	a74c446e7d	dev, arm: Add a device model that uses the NoMali model Add a simple device shim that interfaces with the NoMali model library. The gem5 side of the interface supports Mali T60x/T62x/T760 GPUs. This device model pretends to be a Mali GPU, but doesn't render anything and executes in zero time.	2015-07-07 10:03:14 +01:00
Andreas Sandberg	ed38e3432c	sim: Refactor and simplify the drain API The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining. This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error. Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects.	2015-07-07 09:51:05 +01:00
Andreas Sandberg	f16c0a4a90	sim: Decouple draining from the SimObject hierarchy Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this. This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object. While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining. A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains.	2015-07-07 09:51:05 +01:00
Andreas Sandberg	d5f5fbb855	sim: Move mem(Writeback\|Invalidate) to SimObject The memWriteback() and memInvalidate() calls used to live in the Serializable interface. In this series of patches, the Serializable interface will be redesigned to make serialization independent of the object graph and always work on the entire simulator. This means that the Serialization interface won't be useful to perform maintenance of the caches in a sub-graph of the entire SimObject graph. This changeset moves these memory maintenance methods to the SimObject interface instead.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	e9c3d59aae	sim: Make the drain state a global typed enum The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	1dc5e63b88	python: Remove redundant drain when changing memory modes When the Python helper code switches CPU models, it sometimes also needs to change the memory mode of the simulator. When this happens, it accidentally tried to drain the simulator despite having done so already. This changeset removes the redundant drain.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	7773cb9565	sim: Add macros to serialize objects into a section Add the SERIALIZE_OBJ / UNSERIALIZE_OBJ macros that serialize an object into a subsection of the current checkpoint section.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	b3ecfa6ae0	base: Add serialization support to Pixels and FrameBuffer Serialize pixels as unsigned 32 bit integers by adding the required to_number() and stream operators. This is used by the FrameBuffer, which now implements the Serializable interface. Users of frame buffers are expected to serialize it into its own section by calling serializeSection().	2015-07-07 09:51:04 +01:00
Andreas Sandberg	888ec455cb	sim: Fix broken event unserialization Events expected to be unserialized using an event-specific unserializeEvent call. This call was never actually used, which meant the events relying on it never got unserialized (or scheduled after unserialization). Instead of relying on a custom call, we now use the normal serialization code again. In order to schedule the event correctly, the parrent object is expected to use the EventQueue::checkpointReschedule() call. This happens automatically for events that are serialized using the AutoSerialize mechanism.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	76cd4393c0	sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.	2015-07-07 09:51:03 +01:00
Andreas Sandberg	7cd5db8c6d	sim: Add serialization macros for std containers	2015-07-07 09:51:03 +01:00
Andreas Sandberg	777cc71c4a	mem: Cleanup CommMonitor in preparation for probe support Make configuration parameters constant and get rid of an unnecessary dependency on the Time class.	2015-07-06 17:08:53 +01:00
Nikos Nikoleris	67925a8334	x86: Adjust the size of the values written to the x87 misc registers All x87 misc registers are implemented in an array of 64 bit values but in real hardware the size of some of these registers is smaller. Previsouly all 64 bits where incorrectly set and then later read. To ensure correctness we mask the value in setMiscRegNoEffect to write only the valid bits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-07-04 10:43:47 -05:00
Nilay Vaish	11a48faeb4	o3: correct the number of cc registers in rename map	2015-07-04 10:43:46 -05:00
Nilay Vaish	d29d7c41f1	mem: packet: Add const to constructor argument	2015-07-04 10:43:46 -05:00
Nilay Vaish	16ac48e6a4	ruby: drop NetworkMessage class This patch drops the NetworkMessage class. The relevant data members and functions have been moved to the Message class, which was the parent of NetworkMessage.	2015-07-04 10:43:46 -05:00
Nilay Vaish	baa3eb0de3	ruby: mesi three level: name change to avoid clash The accessor function getDestination() for Destination variable in the coherence message clashes with the getDestination() that is part of the Message class. Hence the name change.	2015-07-04 10:43:46 -05:00
Nilay Vaish	b4efb48a71	ruby: remove message buffer node This structure's only purpose was to provide a comparison function for ordering messages in the MessageBuffer. The comparison function is now being moved to the Message class itself. So we no longer require this structure.	2015-07-04 10:43:46 -05:00
Andreas Hansson	7e711c98f8	mem: Increase the default buffer sizes for the DDR4 controller This patch increases the default read/write buffer sizes for the DDR4 controller config to values that are more suitable for the high bandwidth and high bank count.	2015-07-03 10:14:48 -04:00
Wendy Elsasser	31f901b69d	mem: Update DRAM command scheduler for bank groups This patch updates the command arbitration so that bank group timing as well as rank-to-rank delays will be taken into account. The resulting arbitration no longer selects commands (prepped or not) that cannot issue seamlessly if there are commands that can issue back-to-back, minimizing the effect of rank-to-rank (tCS) & same bank group (tCCD_L) delays. The arbitration selects a new command based on the following priority. Within each priority band, the arbitration will use FCFS to select the appropriate command: 1) Bank is prepped and burst can issue seamlessly, without a bubble 2) Bank is not prepped, but can prep and issue seamlessly, without a bubble 3) Bank is prepped but burst cannot issue seamlessly. In this case, a bubble will occur on the bus Thus, to enable more parallelism in subsequent selections, an unprepped packet is given higher priority if the bank prep can be hidden. If the bank prep cannot be hidden, the selection logic will choose a prepped packet that cannot issue seamlessly if one exist. Otherwise, the default selection will choose the packet with the minimum bank prep delay.	2015-07-03 10:14:46 -04:00
Andreas Hansson	b56167b682	mem: Avoid DRAM write queue iteration for merging and read lookup This patch adds a simple lookup structure to avoid iterating over the write queue to find read matches, and for the merging of write bursts. Instead of relying on iteration we simply store a set of currently-buffered write-burst addresses and compare against these. For the reads we still perform the iteration if we have a match. For the writes, we rely entirely on the set. Note that there are corner-cases where sub-bursts would actually not be mergeable without a read-modify-write. We ignore these cases and opt for speed.	2015-07-03 10:14:45 -04:00
Andreas Hansson	db85ddca1a	mem: Delay responses in the crossbar before forwarding This patch changes how the crossbar classes deal with responses. Instead of forwarding responses directly and burdening the neighbouring modules in paying for the latency (through the pkt->headerDelay), we now queue them before sending them. The coherency protocol is not affected as requests and any snoop requests/responses are still passed on in zero time. Thus, the responses end up paying for any header delay accumulated when passing through the crossbar. Any latency incurred on the request path will be paid for on the response side, if no other module has dealt with it. As a result of this patch, responses are returned at a later point. This affects the number of outstanding transactions, and quite a few regressions see an impact in blocking due to no MSHRs, increased cache-miss latencies, etc. Going forward we should be able to use the same concept also for snoop responses, and any request that is not an express snoop.	2015-07-03 10:14:44 -04:00
Andreas Hansson	b93c912013	mem: Remove redundant is_top_level cache parameter This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed. This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).	2015-07-03 10:14:43 -04:00
Andreas Hansson	71856cfbbc	mem: Split WriteInvalidateReq into write and invalidate WriteInvalidateReq ensures that a whole-line write does not incur the cost of first doing a read exclusive, only to later overwrite the data. This patch splits the existing WriteInvalidateReq into a WriteLineReq, which is done locally, and an InvalidateReq that is sent out throughout the memory system. The WriteLineReq re-uses the normal WriteResp. The change allows us to better express the difference between the cache that is performing the write, and the ones that are merely invalidating. As a consequence, we no longer have to rely on the isTopLevel flag. Moreover, the actual memory in the system does not see the intitial write, only the writeback. We were marking the written line as dirty already, so there is really no need to also push the write all the way to the memory. The overall flow of the write-invalidate operation remains the same, i.e. the operation is only carried out once the response for the invalidate comes back. This patch adds the InvalidateResp for this very reason.	2015-07-03 10:14:41 -04:00
Andreas Hansson	0ddde83a47	mem: Add ReadCleanReq and ReadSharedReq packets This patch adds two new read requests packets: ReadCleanReq - For a cache to explicitly request clean data. The response is thus exclusive or shared, but not owned or modified. The read-only caches (see previous patch) use this request type to ensure they do not get dirty data. ReadSharedReq - We add this to distinguish cache read requests from those issued by other masters, such as devices and CPUs. Thus, devices use ReadReq, and caches use ReadCleanReq, ReadExReq, or ReadSharedReq. For the latter, the response can be any state, shared, exclusive, owned or even modified. Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The two transactions are aligned with the emerging cache-coherent TLM standard and the AMBA nomenclature. With this change, the normal ReadReq should never be used by a cache, and is reserved for the actual (non-caching) masters in the system. We thus have a way of identifying if a request came from a cache or not. The introduction of ReadSharedReq thus removes the need for the current isTopLevel hack, and also allows us to stop relying on checking the packet size to determine if the source is a cache or not. This is fixed in follow-on patches.	2015-07-03 10:14:40 -04:00
Andreas Hansson	893533a126	mem: Allow read-only caches and check compliance This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data. A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.	2015-07-03 10:14:39 -04:00
Ali Jafri	a262908acc	mem: Add clean evicts to improve snoop filter tracking This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory heirarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache. The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory heirarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages. We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implmentation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message. We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.	2015-07-03 10:14:37 -04:00
Andreas Hansson	aa5bbe81f6	mem: Convert Request static const flags to enums This patch fixes an issue which is very wide spread in the codebase, causing sporadic linking failures. The issue is that we declare static const class variables in the header, without any definition (as part of a source file). In most cases the compiler propagates the value and we have no issues. However, especially for less optimising builds such as debug, we get sporadic linking failures due to undefined references. This patch fixes the Request class, by turning the static const flags and master IDs into C++11 typed enums.	2015-07-03 10:14:36 -04:00
Curtis Dunham	e385ae0c72	base: remove fd from object loaders All the object loaders directly examine the (already completely loaded by object_file.cc) memory image. There is no current motivation to keep the fd around.	2015-07-03 10:14:34 -04:00
Andreas Hansson	c466d55a26	scons: Bump compiler requirement to gcc >= 4.7 and clang >= 3.1 This patch updates the compiler minimum requirement to gcc 4.7 and clang 3.1, thus allowing: 1. Explicit virtual overrides (no need for M5_ATTR_OVERRIDE) 2. Non-static data member initializers 3. Template aliases 4. Delegating constructors This patch also enables a transition from --std=c++0x to --std=c++11.	2015-07-03 10:14:15 -04:00
Nilay Vaish	57971248f6	ruby: slicc: remove README No longer maintained. Updates are only made to the wiki page. So being dropped.	2015-06-25 11:58:30 -05:00
Nilay Vaish	0647d99854	ruby: message: remove a data member added by mistake I (Nilay) had mistakenly added a data member to the Message class in revision c1694b4032a6. The data member is being removed.	2015-06-25 11:58:29 -05:00
Jason Power	2f3c467883	Ruby: Remove assert in RubyPort retry list logic Remove the assert when adding a port to the RubyPort retry list. Instead of asserting, just ignore the added port, since it's already on the list. Without this patch, Ruby+detailed fails for even the simplest tests	2015-06-25 11:58:28 -05:00
Andreas Sandberg	cc813cd5f7	base: Add a warn_if macro Add a warn if macro that is analogous to the panic_if and fatal_if.	2015-06-21 20:52:13 +01:00
Andreas Sandberg	d541038549	arm: Cleanup arch headers to remove dma_device.hh dependency Break the dependency on dma_device.hh by forward-declaring DmaPort in the relevant header.	2015-06-21 20:48:33 +01:00
Ali Jafri	f0c3b70451	mem: Add check for express snoop in packet destructor Snoop packets share the request pointer with the originating packets. We need to ensure that the snoop packet destruction does not delete the request. Snoops are used for reads, invalidations, HardPFReqs, Writebacks and CleansEvicts. Reads, invalidations, and HardPFReqs need a response so their snoops do not delete the request. For Writebacks and CleanEvicts we need to check explicitly for whethere the current packet is an express snoop, in whcih case do not delete the request.	2015-06-09 09:21:18 -04:00
Andreas Hansson	578a7f20c6	mem: Fix snoop packet data allocation bug This patch fixes an issue where the snoop packet did not properly forward the data pointer in case of static data.	2015-06-09 09:21:17 -04:00
Rune Holm	eb3ed11794	arm: Delete debug print in initialization of hardware thread There seems to have been a debug print left in when the original ARMv8 support was merged in. This printout is performed every time you initialize a hardware thread, and it prints raw pointers, so it always causes diffs in the regression. This patch removes the debug print.	2015-06-09 09:21:16 -04:00
Rune Holm	f4311d3932	arm: Fix typo in ldrsh instruction name ldrsh was typoed as hdrsh, which is a bit annoying when printing instructions. This patch fixes it.	2015-06-09 09:21:15 -04:00
Andreas Sandberg	737e5da7f6	base: Reset CircleBuf size on flush() The flush() method in CircleBuf resets the state of the circular buffer, but fails to set size to zero. This obviously confuses code that tries to determine the amount of data in the buffer. Set the size to zero on flush.	2015-06-09 09:21:14 -04:00
Andreas Sandberg	a9cad92011	dev, arm: Include PIO size in AmbaDmaDevice constructor Make it possible to specify the size of the PIO space for an AMBA DMA device. Maintain backwards compatibility and default to zero.	2015-06-09 09:21:12 -04:00
Marco Elver	6599dd87c8	ruby: Fix MESI consistency bug Fixes missed forward eviction to CPU. With the O3CPU this can lead to load-load reordering, as the LQ is never notified of the invalidate. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-06-07 14:02:40 -05:00
Matthias Jung	25fe4c2529	mem: Add HMC Timing Parameters A single HMC-2500 x32 model based on: [1] DRAMSpec: a high-level DRAM bank modelling tool developed at the University of Kaiserslautern. This high level tool uses RC (resistance-capacitance) and CV (capacitance-voltage) models to estimate the DRAM bank latency and power numbers. [2] A Logic-base Interconnect for Supporting Near Memory Computation in the Hybrid Memory Cube (E. Azarkhish et. al) Assumed for the HMC model is a 30 nm technology node. The modelled HMC consists of a 4 Gbit part with 4 layers connected with TSVs. Each layer has 16 vaults and each vault consists of 2 banks per layer. In order to be able to use the same controller used for 2D DRAM generations for HMC, the following analogy is done: Channel (DDR) => Vault (HMC) device_size (DDR) => size of a single layer in a vault ranks per channel (DDR) => number of layers banks per rank (DDR) => banks per layer devices per rank (DDR) => devices per layer ( 1 for HMC). The parameters for which no input is available are inherited from the DDR3 configuration.	2015-06-07 14:02:40 -05:00
Ruslan Bukin ext:(%2C%20Zhang%20Guoye)	736d3314bf	arch: fix build under MacOSX put O_DIRECT under ifdefs -- this fixes build for MacOSX. Also use correct class for arm64 openFlagTable. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-06-07 14:02:40 -05:00
Christoph Pfister	4a17494708	mem: addr_mapper: restore old address if request not sent Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-05-30 13:45:17 +02:00
Curtis Dunham	31825bd988	sim, arm: add checkpoint upgrader for d02b45a5 The insertion of CONTEXTIDR_EL2 in the ARM miscellaneous registers obsoletes old checkpoints.	2015-06-01 18:05:11 -05:00
Andreas Sandberg	7c4eb3b4d8	kvm, arm: Add support for aarch64 This changeset adds support for aarch64 in kvm. The CPU module supports both checkpointing and online CPU model switching as long as no devices are simulated by the host kernel. It currently has the following limitations: * The system register based generic timer can only be simulated by the host kernel. Workaround: Use a memory mapped timer instead to simulate the timer in gem5. * Simulating devices (e.g., the generic timer) in the host kernel requires that the host kernel also simulates the GIC. * ID registers in the host and in gem5 must match for switching between simulated CPUs and KVM. This is particularly important for ID registers describing memory system capabilities (e.g., ASID size, physical address size). * Switching between a virtualized CPU and a simulated CPU is currently not supported if in-kernel device emulation is used. This could be worked around by adding support for switching to the gem5 (e.g., the KvmGic) side of the device models. A simpler workaround is to avoid in-kernel device models altogether.	2015-06-01 19:44:19 +01:00
Andreas Sandberg	dbfd6effe0	kvm, arm, dev: Add an in-kernel GIC implementation This changeset adds a GIC implementation that uses the kernel's built-in support for simulating the interrupt controller. Since there is currently no support for state transfer between gem5 and the kernel, the device model does not support serialization and CPU switching (which would require switching to a gem5-simulated GIC).	2015-06-01 19:44:17 +01:00
Andreas Sandberg	8e7c0575dc	kvm: Handle inst events at the current instruction count There are cases (particularly when attaching GDB) when instruction events are scheduled at the current instruction tick. This used to trigger an assertion error in kvm. This changeset adds a check for this condition and forces KVM to do a quick entry that completes any pending IO operations, but does not execute any new instructions, before servicing the event. We could check if we need to enter KVM at all, but forcing a quick entry is makes the code slightly cleaner and does not hurt correctness (performance is hardly an issue in these cases).	2015-06-01 19:43:41 +01:00
Andreas Sandberg	06cf5cc60b	kvm, arm: Move ARM-specific files to arch/arm/kvm/ This changeset moves the ARM-specific KVM CPU implementation to arch/arm/kvm/. This change is expected to keep the source tree somewhat cleaner as we start adding support for ARMv8 and KVM in-kernel interrupt controller simulation. --HG-- rename : src/cpu/kvm/ArmKvmCPU.py => src/arch/arm/kvm/ArmKvmCPU.py rename : src/cpu/kvm/arm_cpu.cc => src/arch/arm/kvm/arm_cpu.cc rename : src/cpu/kvm/arm_cpu.hh => src/arch/arm/kvm/arm_cpu.hh	2015-06-01 19:43:40 +01:00
Curtis Dunham	e590f0d1ef	arm: implement the CONTEXTIDR_EL2 system reg.	2015-05-26 03:21:45 -04:00
Nathanael Premillieu	31fd18ab15	arm: Make address translation faster with better caching This patch adds better caching of the sys regs for AArch64, thus avoiding unnecessary calls to tc->readMiscReg(MISCREG_CPSR) in the non-faulting case.	2015-05-26 03:21:42 -04:00
Andreas Hansson	53a360985b	base: Allow multiple interleaved ranges This patch changes how the address range calculates intersection such that a system can have a number of non-overlapping interleaved ranges without complaining. Without this patch we end up with a panic.	2015-05-26 03:21:40 -04:00
Andrew Bardsley	cea1d14a93	cpu: Fix a bug in counting issued instructions in MinorCPU The MinorCPU would count bubbles in Execute::issue as part of the num_insts_issued and so sometimes reach the instruction issue limit incorrectly. Fixed by checking for a bubble in one new place.	2015-05-26 03:21:37 -04:00
Giacomo Gabrielli	cc2346e8ca	arm: Implement some missing syscalls (SE mode) Adding a few syscalls that were previously considered unimplemented.	2015-05-26 03:21:35 -04:00
Andreas Hansson	0cc350d2c5	ruby: Deprecation warning for RubyMemoryControl A step towards removing RubyMemoryControl and shift users to DRAMCtrl. The latter is faster, more representative, very versatile, and is integrated with power models.	2015-05-26 03:21:34 -04:00
Andreas Sandberg	f3f06e1684	arm, dev: Add support for a memory mapped generic timer There are cases when we don't want to use a system register mapped generic timer, but can't use the SP804. For example, when using KVM on aarch64, we want to intercept accesses to the generic timer, but can't do so if it is using the system register interface. In such cases, we need to use a memory-mapped generic timer. This changeset adds a device model that implements the memory mapped generic timer interface. The current implementation only supports a single frame (i.e., one virtual timer and one physical timer).	2015-05-23 13:46:56 +01:00
Andreas Sandberg	6533f2000b	arm: Get rid of pointless have_generic_timer param The ArmSystem class has a parameter to indicate whether it is configured to use the generic timer extension or not. This parameter doesn't affect any feature flags in the current implementation and is therefore completely unnecessary. In fact, we usually don't set it even if a system has a generic timer. If we ever need to check if there is a generic timer present, we should just request a pointer and check if it is non-null instead.	2015-05-23 13:46:54 +01:00
Andreas Sandberg	2278fec1d1	dev, arm: Add virtual timers to the generic timer model The generic timer model currently does not support virtual counters. Virtual and physical counters both tick with the same frequency. However, virtual timers allow a hypervisor to set an offset that is subtracted from the counter when it is read. This enables the hypervisor to present a time base that ticks with virtual time in the VM (i.e., doesn't tick when the VM isn't running). Modern Linux kernels generally assume that virtual counters exist and try to use them by default.	2015-05-23 13:46:53 +01:00
Andreas Sandberg	65f3f097d3	dev, arm: Refactor and clean up the generic timer model This changeset cleans up the generic timer a bit and moves most of the register juggling from the ISA code into a separate class in the same source file as the rest of the generic timer. It also removes the assumption that there is always 8 or fewer CPUs in the system. Instead of having a fixed limit, we now instantiate per-core timers as they are requested. This is all in preparation for other patches that add support for virtual timers and a memory mapped interface.	2015-05-23 13:46:52 +01:00
Andreas Sandberg	5435f25ec8	kvm: Fix dumping code for large registers The register dumping code in kvm tries to print the bytes in large registers (128 bits and larger) instead of printing them as hex. This changeset fixes that.	2015-05-23 13:37:22 +01:00
Andreas Sandberg	ed447bbff9	kvm, x86: Guard x86-specific APIs in KvmVM Protect x86-specific APIs in KvmVM with compile-time guards to avoid breaking ARM builds.	2015-05-23 13:37:20 +01:00
Andreas Sandberg	cba3a125e1	arm: Workaround incorrect HDLCD register order in kernel Some versions of the kernel incorrectly swap the red and blue color select registers. This changeset adds a workaround for that by swapping them when instantiating a PixelConverter.	2015-05-23 13:37:04 +01:00
Andreas Sandberg	db5c9a5f90	base: Redesign internal frame buffer handling Currently, frame buffer handling in gem5 is quite ad hoc. In practice, we pass around naked pointers to raw pixel data and expect consumers to convert frame buffers using the (broken) VideoConverter. This changeset completely redesigns the way we handle frame buffers internally. In summary, it fixes several color conversion bugs, adds support for more color formats (e.g., big endian), and makes the code base easier to follow. In the new world, gem5 always represents pixel data using the Pixel struct when pixels need to be passed between different classes (e.g., a display controller and the VNC server). Producers of entire frames (e.g., display controllers) should use the FrameBuffer class to represent a frame. Frame producers are expected to create one instance of the FrameBuffer class in their constructors and register it with its consumers once. Consumers are expected to check the dimensions of the frame buffer when they consume it. Conversion between the external representation and the internal representation is supported for all common "true color" RGB formats of up to 32-bit color depth. The external pixel representation is expected to be between 1 and 4 bytes in either big endian or little endian. Color channels are assumed to be contiguous ranges of bits within each pixel word. The external pixel value is scaled to an 8-bit internal representation using a floating multiplication to map it to the entire 8-bit range.	2015-05-23 13:37:03 +01:00
Andreas Sandberg	1985d28ef9	base: Clean up bitmap generation code The bitmap generation code is hard to follow and incorrectly uses the size of an enum member to calculate the size of a pixel. This changeset cleans up the code and adds some documentation.	2015-05-23 13:37:01 +01:00
Joel Hestness	0479569f67	ruby: Fix RubySystem warm-up and cool-down scope The processes of warming up and cooling down Ruby caches are simulation-wide processes, not just RubySystem instance-specific processes. Thus, the warm-up and cool-down variables should be globally visible to any Ruby components participating in either process. Make these variables static members and track the warm-up and cool-down processes as appropriate. This patch also has two side benefits: 1) It removes references to the RubySystem g_system_ptr, which are problematic for allowing multiple RubySystem instances in a single simulation. Warmup and cooldown variables being static (global) reduces the need for instance-specific dereferences through the RubySystem. 2) From the AbstractController, it removes local RubySystem pointers, which are used inconsistently with other uses of the RubySystem: 11 other uses reference the RubySystem with the g_system_ptr. Only sequencers have local pointers.	2015-05-19 10:56:51 -05:00
Andreas Hansson	99d3fa5945	arm: Identify table-walker requests This patch ensures all page-table walks are flagged as such.	2015-05-15 13:40:01 -04:00
Andreas Hansson	bd583d00f9	misc: Appease gcc 5.1 Three minor issues are resolved: 1. Apparently gcc 5.1 does not like negation of booleans followed by bitwise AND. 2. Somehow the compiler also gets confused and warns about NoopMachInst being unused (removing it causes compilation errors though). Most likely a compiler bug. 3. There seems to be a number of instances where loop unrolling causes false positives for the array-bounds check. For now, switch to std::array. Potentially we could disable the warning for newer gcc versions, but switching to std::array is probably a good move in any case.	2015-05-15 13:39:53 -04:00
Andreas Sandberg	37aab4a155	sim: Don't clear the active CPU vector in System::initState The system class currently clears the vector of active CPUs in initState(). CPUs are added to the list by registerThreadContext() which is called from BaseCPU::init(). This obviously breaks when the System object is initialized after the CPUs. This changeset removes the offending clear() call since the list will be empty after it has been instantiated anyway.	2015-05-15 13:39:44 -04:00
Steve Reinhardt	c65fa3dceb	syscall_emul: fix warn_once behavior The current ignoreWarnOnceFunc doesn't really work as expected, since it will only generate one warning total, for whichever "warn-once" syscall is invoked first. This patch fixes that behavior by keeping a "warned" flag in the SyscallDesc object, allowing suitably flagged syscalls to warn exactly once per syscall.	2015-05-05 09:25:59 -07:00
Andreas Hansson	f349592071	arm: Add missing FPEXC.EN check Add a missing check to ensure that exceptions are generated properly.	2015-05-05 03:22:45 -04:00
Giacomo Gabrielli	a3f23894eb	arm: enable DCZVA by default in SE mode	2015-05-05 03:22:42 -04:00
Stephan Diestelhorst	2847d5f517	mem: Create a request copy for deferred snoops Sometimes, we need to defer an express snoop in an MSHR, but the original request might complete and deallocate the original pkt->req. In those cases, create a copy of the request so that someone who is inspecting the delayed snoop can also inspect the request still. All of this is rather hacky, but the allocation / linking and general life-time management of Packet and Request is rather tricky. Deleting the copy is another tricky area, testing so far has shown that the right copy is deleted at the right time.	2015-03-17 11:50:55 +00:00
Andreas Sandberg	706597f021	arm: Relax ordering for some uncacheable accesses We currently assume that all uncacheable memory accesses are strictly ordered. Instead of always enforcing strict ordering, we now only enforce it if the required memory type is device memory or strongly ordered memory.	2015-05-05 03:22:34 -04:00
Andreas Sandberg	48281375ee	mem, cpu: Add a separate flag for strictly ordered memory The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models. This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations: * Speculation: CPU models are not allowed to issue speculative loads. * Write combining: CPU models and caches are not allowed to merge writes to the same cache line. Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior.	2015-05-05 03:22:33 -04:00
Andreas Sandberg	1da634ace0	mem, alpha: Move Alpha-specific request flags Move Alpha-specific memory request flags to an architecture-specific header and map them to the architecture specific flag bit range.	2015-05-05 03:22:31 -04:00
Andreas Hansson	23b9792681	arm: Remove unnecessary boot uncachability With the recent patches addressing how we deal with uncacheable accesses there is no longer need for the work arounds put in place to enforce certain sections of memory to be uncacheable during boot.	2015-05-05 03:22:30 -04:00
Andreas Hansson	36f29496a0	mem: Snoop into caches on uncacheable accesses This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent. The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block.	2015-05-05 03:22:29 -04:00
Andreas Hansson	554ddc7c07	arch, cpu: Do not forward snoops to table walker This patch simplifies the overall CPU by changing the TLB caches such that they do not forward snoops to the table walker port(s). Note that only ARM and X86 are affected. There is no reason for the ports to snoop as they do not actually take any action, and from a performance point of view we are better of not snooping more than we have to. Should it at a later point be required to snoop for a particular TLB design it is easy enough to add it back.	2015-05-05 03:22:27 -04:00
Andreas Hansson	14e5b2ea55	mem: Pass shared downstream through caches This patch ensures that we pass on information about a packet being shared (rather than exclusive), when forwarding a packet downstream. Without this patch there is a risk that a downstream cache considers the line exclusive when it really isn't.	2015-05-05 03:22:26 -04:00
Ali Jafri	3d33432136	mem: Add forward snoop check for HardPFReqs We should always check whether the cache is supposed to be forwarding snoops before generating snoops.	2015-05-05 03:22:25 -04:00
Andreas Hansson	0ebbf3f951	mem: Add missing stats update for uncacheable MSHRs This patch adds a missing counter update for the uncacheable accesses. By updating this counter we also get a meaningful average latency for uncacheable accesses (previously inf).	2015-05-05 03:22:24 -04:00
Andreas Hansson	33e3e370f2	mem: Tidy up BaseCache parameters This patch simply tidies up the BaseCache parameters and removes the unused "two_queue" parameter.	2015-05-05 03:22:22 -04:00
David Guillen	5287945a8b	mem: Remove templates in cache model This patch changes the cache implementation to rely on virtual methods rather than using the replacement policy as a template argument. There is no impact on the simulation performance, and overall the changes make it easier to modify (and subclass) the cache and/or replacement policy.	2015-05-05 03:22:21 -04:00
Andreas Hansson	d0d933facc	cpu: Work around gcc 4.9 issues with Num_OpClasses This patch fixes a recent issue with gcc 4.9 (and possibly more) being convinced that indices outside the array bounds are used when initialising the FUPool members.	2015-05-05 03:22:19 -04:00
Ruslan Bukin	81f3211149	arch, base, dev, kern, sym: FreeBSD support This adds support for FreeBSD/aarch64 FS and SE mode (basic set of syscalls only) Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-29 22:35:23 -05:00
Rizwana Begum	52a3bc5e5c	mem: Simplify page close checks for adaptive policies Both open_adaptive and close_adaptive page polices keep the page open if a row hit is found. If a row hit is not found, close_adaptive page policy precharges the row, and open_adaptive policy precharges the row only if there is a bank conflict request waiting in the queue. This patch makes the checks for above conditions simpler. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-29 22:35:22 -05:00
Nilay Vaish	3a2731fb8c	ruby: set: replace long by unsigned long UBSan complains about negative value being shifted	2015-04-29 22:35:22 -05:00
Nilay Vaish	4333549575	cpu: o3: replace issueLatency with bool pipelined Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.	2015-04-29 22:35:22 -05:00
Nilay Vaish	0dbd696aae	cpu: o3: single cycle default div microop latency on x86 This patch sets the default latency of the division microop to a single cycle on x86. This is because the division instructions DIV and IDIV have been implemented as loops of div microops, where each microop computes a single bit of the quotient.	2015-04-29 22:35:22 -05:00
Nilay Vaish	ee06fed656	x86: change divide-by-zero fault to divide-error Same exception is raised whether division with zero is performed or the quotient is greater than the maximum value that the provided space can hold. Divide-by-Zero is the AMD terminology, while Divide-Error is Intel's.	2015-04-29 22:35:22 -05:00
Andreas Hansson	179787f31f	misc: Appease gcc 5.1 without moving GDB_REG_BYTES This patch rolls back the move of the GDB_REG_BYTES constant, and instead adds M5_VAR_USED.	2015-04-24 03:30:08 -04:00
Rene de Jong	483f873d01	arm, dev: Add a UFS device This patch introduces a UFS host controller and a UFS device. More information about the UFS standard can be found at the JEDEC site: http://www.jedec.org/standards-documents/results/jesd220 Note that the model does not implement the complete standard, and as such is not an actual implementation of UFS. The following SCSI commands are implemented: inquiry, read, read capacity, report LUNs, start/stop, test unit ready, verify, write, format unit, send diagnostic, synchronize cache, mode select, mode sense, request sense, unmap, write buffer and read buffer. This is sufficient for usage with Linux and Android. To interact with this model a kernel version 3.9 or above is needed.	2015-04-23 13:37:50 -04:00
Rene de Jong	fff28ce954	arm, dev: Add a NAND flash timing model This adds a NAND flash timing model. This model takes the number of planes into account and is ultimately intended to be used as a high-level performance model for any device using flash. To access the memory, use either readMemory or writeMemory. To make use of the model you will need an interface model such as UFSHostDevice, which is part of a separate patch. At the moment the flash device is part of the ARM device tree since the only use if the UFSHostDevice, and that in turn relies on the ARM GIC.	2015-04-23 13:37:49 -04:00
Peter Enns	2e64590b88	dev: Add support for i2c devices This patch adds an I2C bus and base device. I2C is used to connect a variety of sensors, and this patch serves as a starting point to enable a range of I2C devices.	2015-04-23 13:37:48 -04:00
Andreas Hansson	c8c4f66889	misc: Appease gcc 5.1 This patch fixes a few small issues to ensure gem5 compiles when using gcc 5.1. First, the GDB_REG_BYTES in the RemoteGDB header are, rather surprisingly, flagged as unused for both ARM and X86. Removing them, however, causes compilation errors as they are actually used in the source file. Moving the constant into the class definition fixes the issue. Possibly a gcc bug. Second, we have an unused EthPktData constructor using auto_ptr, and the latter is deprecated. Since the code is never used it is simply removed.	2015-04-23 13:37:46 -04:00
Brandon Potter	a70a83155b	cpu: remove conditional check (count > 0) on o3 IQ squashes The o3 cpu instruction queue model uses the count variable to track the number of unissued instructions in the queue. Previously, the squash method used this variable to avoid executing the doSquash method when there were no unissued instructions in the pipeline. A corner case problem exists when only issued instructions exist in the pipeline and a squash occurs; the doSquash code is not invoked and subsequently does not clean up state properly.	2015-04-22 07:52:03 -07:00
Brandon Potter	4991c29965	syscall_emul: implement clock_gettime system call	2015-04-22 07:51:27 -07:00
Monir Mozumder	00e3cab8fc	syscall_emul: update x86 syscall table Update table with additional definitions through Linux 3.13.	2015-04-22 07:51:27 -07:00

1 2 3 4 5 ...

6802 commits