sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Andreas Sandberg	d5f5fbb855	sim: Move mem(Writeback\|Invalidate) to SimObject The memWriteback() and memInvalidate() calls used to live in the Serializable interface. In this series of patches, the Serializable interface will be redesigned to make serialization independent of the object graph and always work on the entire simulator. This means that the Serialization interface won't be useful to perform maintenance of the caches in a sub-graph of the entire SimObject graph. This changeset moves these memory maintenance methods to the SimObject interface instead.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	e9c3d59aae	sim: Make the drain state a global typed enum The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	1dc5e63b88	python: Remove redundant drain when changing memory modes When the Python helper code switches CPU models, it sometimes also needs to change the memory mode of the simulator. When this happens, it accidentally tried to drain the simulator despite having done so already. This changeset removes the redundant drain.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	7773cb9565	sim: Add macros to serialize objects into a section Add the SERIALIZE_OBJ / UNSERIALIZE_OBJ macros that serialize an object into a subsection of the current checkpoint section.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	b3ecfa6ae0	base: Add serialization support to Pixels and FrameBuffer Serialize pixels as unsigned 32 bit integers by adding the required to_number() and stream operators. This is used by the FrameBuffer, which now implements the Serializable interface. Users of frame buffers are expected to serialize it into its own section by calling serializeSection().	2015-07-07 09:51:04 +01:00
Andreas Sandberg	888ec455cb	sim: Fix broken event unserialization Events expected to be unserialized using an event-specific unserializeEvent call. This call was never actually used, which meant the events relying on it never got unserialized (or scheduled after unserialization). Instead of relying on a custom call, we now use the normal serialization code again. In order to schedule the event correctly, the parrent object is expected to use the EventQueue::checkpointReschedule() call. This happens automatically for events that are serialized using the AutoSerialize mechanism.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	76cd4393c0	sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.	2015-07-07 09:51:03 +01:00
Andreas Sandberg	d7a56ee524	tests: Skip SPARC tests if the required binaries are missing The full-system SPARC tests depend on several binaries that aren't generally available to the wider community. Flag the tests as skipped instead of failed if these binaries can't be found.	2015-07-07 09:51:03 +01:00
Andreas Sandberg	7cd5db8c6d	sim: Add serialization macros for std containers	2015-07-07 09:51:03 +01:00
Andreas Sandberg	777cc71c4a	mem: Cleanup CommMonitor in preparation for probe support Make configuration parameters constant and get rid of an unnecessary dependency on the Time class.	2015-07-06 17:08:53 +01:00
Nilay Vaish	381e9191dd	stats: x86: update stats missed out on in preivous changeset	2015-07-05 20:26:18 -05:00
Nilay Vaish	9954eb74df	stats: update stale config.ini files, eio and few other stats.	2015-07-04 10:43:47 -05:00
Nikos Nikoleris	67925a8334	x86: Adjust the size of the values written to the x87 misc registers All x87 misc registers are implemented in an array of 64 bit values but in real hardware the size of some of these registers is smaller. Previsouly all 64 bits where incorrectly set and then later read. To ensure correctness we mask the value in setMiscRegNoEffect to write only the valid bits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-07-04 10:43:47 -05:00
David Hashe	64af6dafb1	config: Update location of ruby topologies in help Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-07-04 10:43:47 -05:00
Nilay Vaish	11a48faeb4	o3: correct the number of cc registers in rename map	2015-07-04 10:43:46 -05:00
Nilay Vaish	d29d7c41f1	mem: packet: Add const to constructor argument	2015-07-04 10:43:46 -05:00
Nilay Vaish	16ac48e6a4	ruby: drop NetworkMessage class This patch drops the NetworkMessage class. The relevant data members and functions have been moved to the Message class, which was the parent of NetworkMessage.	2015-07-04 10:43:46 -05:00
Nilay Vaish	baa3eb0de3	ruby: mesi three level: name change to avoid clash The accessor function getDestination() for Destination variable in the coherence message clashes with the getDestination() that is part of the Message class. Hence the name change.	2015-07-04 10:43:46 -05:00
Nilay Vaish	b4efb48a71	ruby: remove message buffer node This structure's only purpose was to provide a comparison function for ordering messages in the MessageBuffer. The comparison function is now being moved to the Message class itself. So we no longer require this structure.	2015-07-04 10:43:46 -05:00
Andreas Hansson	25e1b1c1f5	stats: Update stats for cache, crossbar and DRAM changes This update includes the changes to whole-line writes, the refinement of Read to ReadClean and ReadShared, the introduction of CleanEvict for snoop-filter tracking, and updates to the DRAM command scheduler for bank-group-aware scheduling. Needless to say, almost every regression is affected.	2015-07-03 10:15:03 -04:00
Andreas Hansson	7e711c98f8	mem: Increase the default buffer sizes for the DDR4 controller This patch increases the default read/write buffer sizes for the DDR4 controller config to values that are more suitable for the high bandwidth and high bank count.	2015-07-03 10:14:48 -04:00
Wendy Elsasser	31f901b69d	mem: Update DRAM command scheduler for bank groups This patch updates the command arbitration so that bank group timing as well as rank-to-rank delays will be taken into account. The resulting arbitration no longer selects commands (prepped or not) that cannot issue seamlessly if there are commands that can issue back-to-back, minimizing the effect of rank-to-rank (tCS) & same bank group (tCCD_L) delays. The arbitration selects a new command based on the following priority. Within each priority band, the arbitration will use FCFS to select the appropriate command: 1) Bank is prepped and burst can issue seamlessly, without a bubble 2) Bank is not prepped, but can prep and issue seamlessly, without a bubble 3) Bank is prepped but burst cannot issue seamlessly. In this case, a bubble will occur on the bus Thus, to enable more parallelism in subsequent selections, an unprepped packet is given higher priority if the bank prep can be hidden. If the bank prep cannot be hidden, the selection logic will choose a prepped packet that cannot issue seamlessly if one exist. Otherwise, the default selection will choose the packet with the minimum bank prep delay.	2015-07-03 10:14:46 -04:00
Andreas Hansson	b56167b682	mem: Avoid DRAM write queue iteration for merging and read lookup This patch adds a simple lookup structure to avoid iterating over the write queue to find read matches, and for the merging of write bursts. Instead of relying on iteration we simply store a set of currently-buffered write-burst addresses and compare against these. For the reads we still perform the iteration if we have a match. For the writes, we rely entirely on the set. Note that there are corner-cases where sub-bursts would actually not be mergeable without a read-modify-write. We ignore these cases and opt for speed.	2015-07-03 10:14:45 -04:00
Andreas Hansson	db85ddca1a	mem: Delay responses in the crossbar before forwarding This patch changes how the crossbar classes deal with responses. Instead of forwarding responses directly and burdening the neighbouring modules in paying for the latency (through the pkt->headerDelay), we now queue them before sending them. The coherency protocol is not affected as requests and any snoop requests/responses are still passed on in zero time. Thus, the responses end up paying for any header delay accumulated when passing through the crossbar. Any latency incurred on the request path will be paid for on the response side, if no other module has dealt with it. As a result of this patch, responses are returned at a later point. This affects the number of outstanding transactions, and quite a few regressions see an impact in blocking due to no MSHRs, increased cache-miss latencies, etc. Going forward we should be able to use the same concept also for snoop responses, and any request that is not an express snoop.	2015-07-03 10:14:44 -04:00
Andreas Hansson	b93c912013	mem: Remove redundant is_top_level cache parameter This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed. This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).	2015-07-03 10:14:43 -04:00
Andreas Hansson	71856cfbbc	mem: Split WriteInvalidateReq into write and invalidate WriteInvalidateReq ensures that a whole-line write does not incur the cost of first doing a read exclusive, only to later overwrite the data. This patch splits the existing WriteInvalidateReq into a WriteLineReq, which is done locally, and an InvalidateReq that is sent out throughout the memory system. The WriteLineReq re-uses the normal WriteResp. The change allows us to better express the difference between the cache that is performing the write, and the ones that are merely invalidating. As a consequence, we no longer have to rely on the isTopLevel flag. Moreover, the actual memory in the system does not see the intitial write, only the writeback. We were marking the written line as dirty already, so there is really no need to also push the write all the way to the memory. The overall flow of the write-invalidate operation remains the same, i.e. the operation is only carried out once the response for the invalidate comes back. This patch adds the InvalidateResp for this very reason.	2015-07-03 10:14:41 -04:00
Andreas Hansson	0ddde83a47	mem: Add ReadCleanReq and ReadSharedReq packets This patch adds two new read requests packets: ReadCleanReq - For a cache to explicitly request clean data. The response is thus exclusive or shared, but not owned or modified. The read-only caches (see previous patch) use this request type to ensure they do not get dirty data. ReadSharedReq - We add this to distinguish cache read requests from those issued by other masters, such as devices and CPUs. Thus, devices use ReadReq, and caches use ReadCleanReq, ReadExReq, or ReadSharedReq. For the latter, the response can be any state, shared, exclusive, owned or even modified. Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The two transactions are aligned with the emerging cache-coherent TLM standard and the AMBA nomenclature. With this change, the normal ReadReq should never be used by a cache, and is reserved for the actual (non-caching) masters in the system. We thus have a way of identifying if a request came from a cache or not. The introduction of ReadSharedReq thus removes the need for the current isTopLevel hack, and also allows us to stop relying on checking the packet size to determine if the source is a cache or not. This is fixed in follow-on patches.	2015-07-03 10:14:40 -04:00
Andreas Hansson	893533a126	mem: Allow read-only caches and check compliance This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data. A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.	2015-07-03 10:14:39 -04:00
Ali Jafri	a262908acc	mem: Add clean evicts to improve snoop filter tracking This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory heirarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache. The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory heirarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages. We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implmentation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message. We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.	2015-07-03 10:14:37 -04:00
Andreas Hansson	aa5bbe81f6	mem: Convert Request static const flags to enums This patch fixes an issue which is very wide spread in the codebase, causing sporadic linking failures. The issue is that we declare static const class variables in the header, without any definition (as part of a source file). In most cases the compiler propagates the value and we have no issues. However, especially for less optimising builds such as debug, we get sporadic linking failures due to undefined references. This patch fixes the Request class, by turning the static const flags and master IDs into C++11 typed enums.	2015-07-03 10:14:36 -04:00
Curtis Dunham	359904194d	scons: remove dead leading underscore check e56c3d8 (2008) added it but 8e37348 (2010) removed its only use.	2015-07-03 10:14:35 -04:00
Curtis Dunham	e385ae0c72	base: remove fd from object loaders All the object loaders directly examine the (already completely loaded by object_file.cc) memory image. There is no current motivation to keep the fd around.	2015-07-03 10:14:34 -04:00
Andreas Hansson	d9f8f07613	util: Remove DRAMPower trace script This script is deprecated and DRAMPower is now properly integrated with the controller model.	2015-07-03 10:14:24 -04:00
Andreas Hansson	c466d55a26	scons: Bump compiler requirement to gcc >= 4.7 and clang >= 3.1 This patch updates the compiler minimum requirement to gcc 4.7 and clang 3.1, thus allowing: 1. Explicit virtual overrides (no need for M5_ATTR_OVERRIDE) 2. Non-static data member initializers 3. Template aliases 4. Delegating constructors This patch also enables a transition from --std=c++0x to --std=c++11.	2015-07-03 10:14:15 -04:00
Nilay Vaish	57971248f6	ruby: slicc: remove README No longer maintained. Updates are only made to the wiki page. So being dropped.	2015-06-25 11:58:30 -05:00
Nilay Vaish	0647d99854	ruby: message: remove a data member added by mistake I (Nilay) had mistakenly added a data member to the Message class in revision c1694b4032a6. The data member is being removed.	2015-06-25 11:58:29 -05:00
Jason Power	2f3c467883	Ruby: Remove assert in RubyPort retry list logic Remove the assert when adding a port to the RubyPort retry list. Instead of asserting, just ignore the added port, since it's already on the list. Without this patch, Ruby+detailed fails for even the simplest tests	2015-06-25 11:58:28 -05:00
Andreas Sandberg	cc813cd5f7	base: Add a warn_if macro Add a warn if macro that is analogous to the panic_if and fatal_if.	2015-06-21 20:52:13 +01:00
Andreas Sandberg	d541038549	arm: Cleanup arch headers to remove dma_device.hh dependency Break the dependency on dma_device.hh by forward-declaring DmaPort in the relevant header.	2015-06-21 20:48:33 +01:00
Ali Jafri	f0c3b70451	mem: Add check for express snoop in packet destructor Snoop packets share the request pointer with the originating packets. We need to ensure that the snoop packet destruction does not delete the request. Snoops are used for reads, invalidations, HardPFReqs, Writebacks and CleansEvicts. Reads, invalidations, and HardPFReqs need a response so their snoops do not delete the request. For Writebacks and CleanEvicts we need to check explicitly for whethere the current packet is an express snoop, in whcih case do not delete the request.	2015-06-09 09:21:18 -04:00
Andreas Hansson	578a7f20c6	mem: Fix snoop packet data allocation bug This patch fixes an issue where the snoop packet did not properly forward the data pointer in case of static data.	2015-06-09 09:21:17 -04:00
Rune Holm	eb3ed11794	arm: Delete debug print in initialization of hardware thread There seems to have been a debug print left in when the original ARMv8 support was merged in. This printout is performed every time you initialize a hardware thread, and it prints raw pointers, so it always causes diffs in the regression. This patch removes the debug print.	2015-06-09 09:21:16 -04:00
Rune Holm	f4311d3932	arm: Fix typo in ldrsh instruction name ldrsh was typoed as hdrsh, which is a bit annoying when printing instructions. This patch fixes it.	2015-06-09 09:21:15 -04:00
Andreas Sandberg	737e5da7f6	base: Reset CircleBuf size on flush() The flush() method in CircleBuf resets the state of the circular buffer, but fails to set size to zero. This obviously confuses code that tries to determine the amount of data in the buffer. Set the size to zero on flush.	2015-06-09 09:21:14 -04:00
Andreas Sandberg	a9cad92011	dev, arm: Include PIO size in AmbaDmaDevice constructor Make it possible to specify the size of the PIO space for an AMBA DMA device. Maintain backwards compatibility and default to zero.	2015-06-09 09:21:12 -04:00
Andreas Hansson	b29e55d44a	scons: Allow GNU assembler version strings with hyphen Make scons a bit more forgiving when determining the GNU assembler version.	2015-06-09 09:21:11 -04:00
Marco Elver	6599dd87c8	ruby: Fix MESI consistency bug Fixes missed forward eviction to CPU. With the O3CPU this can lead to load-load reordering, as the LQ is never notified of the invalidate. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-06-07 14:02:40 -05:00
Matthias Jung	25fe4c2529	mem: Add HMC Timing Parameters A single HMC-2500 x32 model based on: [1] DRAMSpec: a high-level DRAM bank modelling tool developed at the University of Kaiserslautern. This high level tool uses RC (resistance-capacitance) and CV (capacitance-voltage) models to estimate the DRAM bank latency and power numbers. [2] A Logic-base Interconnect for Supporting Near Memory Computation in the Hybrid Memory Cube (E. Azarkhish et. al) Assumed for the HMC model is a 30 nm technology node. The modelled HMC consists of a 4 Gbit part with 4 layers connected with TSVs. Each layer has 16 vaults and each vault consists of 2 banks per layer. In order to be able to use the same controller used for 2D DRAM generations for HMC, the following analogy is done: Channel (DDR) => Vault (HMC) device_size (DDR) => size of a single layer in a vault ranks per channel (DDR) => number of layers banks per rank (DDR) => banks per layer devices per rank (DDR) => devices per layer ( 1 for HMC). The parameters for which no input is available are inherited from the DDR3 configuration.	2015-06-07 14:02:40 -05:00
Ruslan Bukin ext:(%2C%20Zhang%20Guoye)	736d3314bf	arch: fix build under MacOSX put O_DIRECT under ifdefs -- this fixes build for MacOSX. Also use correct class for arm64 openFlagTable. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-06-07 14:02:40 -05:00
Christoph Pfister	4a17494708	mem: addr_mapper: restore old address if request not sent Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-05-30 13:45:17 +02:00

... 8 9 10 11 12 ...

11362 commits