sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Nilay Vaish	f59a7af50a	ruby: stats: use gem5's stats for cache and memory controllers This moves event and transition count statistics for cache controllers to gem5's statistics. It does the same for the statistics associated with the memory controller in ruby. All the cache/directory/dma controllers individually collect the event and transition counts. A callback function, collateStats(), has been added that is invoked on the controller version 0 of each controller class. This function adds all the individual controller statistics to vector variables. All the code for registering the statistical variables and collating them is generated by SLICC. The patch removes the files _Profiler.{cc,hh} and _ProfileDumper.{cc,hh} which were earlier used for collecting and dumping statistics respectively.	2013-06-09 07:29:59 -05:00
Nilay Vaish	38736ce7c3	ruby: remove undefined functions in Address class	2013-06-09 07:29:58 -05:00
Andreas Hansson	3bc4ecdcb4	mem: More descriptive DRAM config names This patch changes the class names of the various DRAM configurations to better reflect what memory they are based on. The speed and interface width are now part of the name, and also the alias that is used to select them on the command line. Some minor changes are done to the actual parameters, to better reflect the named configurations. As a result of these changes the regressions change slightly and the stats will be bumped in a separate patch.	2013-05-30 12:54:14 -04:00
Andreas Hansson	83d99aebb1	mem: Add bytes per activate DRAM controller stat This patch adds a histogram to track how many bytes are accessed in an open row before it is closed. This metric is useful in characterising a workload and the efficiency of the DRAM scheduler. For example, a DDR3-1600 device requires 44 cycles (tRC) before it can activate another row in the same bank. For a x32 interface (8 bytes per cycle) that means 8 x 44 = 352 bytes must be transferred to hide the preparation time.	2013-05-30 12:54:13 -04:00
Andreas Hansson	d82bffd297	mem: Add static latency to the DRAM controller This patch adds a frontend and backend static latency to the DRAM controller by delaying the responses. Two parameters expressing the frontend and backend contributions in absolute time are added to the controller, and the appropriate latency is added to the responses when adding them to the (infinite) queued port for sending. For writes and reads that hit in the write buffer, only the frontend latency is added. For reads that are serviced by the DRAM, the static latency is the sum of the pipeline latencies of the entire frontend, backend and PHY. The default values are chosen based on having roughly 10 pipeline stages in total at 500 MHz. In the future, it would be sensible to make the controller use its clock and convert these latencies (and a few of the DRAM timings) to cycles.	2013-05-30 12:54:12 -04:00
Andreas Hansson	7da851d1a8	mem: Spring cleaning of MSHR and MSHRQueue This patch does some minor tidying up of the MSHR and MSHRQueue. The clean up started as part of some ad-hoc tracing and debugging, but seems worthwhile enough to go in as a separate patch. The highlights of the changes are reducing the scope of members (making them private) where possible, avoiding redundant new/delete, and constructor initialisation to please static code analyzers.	2013-05-30 12:54:11 -04:00
Andreas Hansson	42191522cc	mem: Fix MSHR print format This patch fixes an incorrect print format string by adding an additional string element.	2013-05-30 12:54:09 -04:00
Andreas Hansson	7e13c4d046	mem: Make returning snoop responses occupy response layer This patch introduces a mirrored internal snoop port to facilitate easy addition of flow control for the snoop responses that are turned into normal responses on their return. To perform this, the slave ports of the coherent bus are wrapped in internal master ports that are passed as the source ports to the response layer in question. As a result of this patch, there is more contention for the response resources, and as such system performance will decrease slightly. A consequence of the mirrored internal port is that the port the bus tells to retry (the internal one) and the port actually retrying (the mirrored one) are not the same. Thus, the existing check in tryTiming is no longer correct. In fact, the test is redundant as the layer is only in the retry state while calling sendRetry on the waiting port, and if the latter does not immediately call the bus then the retry state is left. Consequently the check is removed.	2013-05-30 12:54:02 -04:00
Andreas Hansson	2308f812ef	mem: Make the buses multi layered This patch makes the buses multi layered, and effectively creates a crossbar structure with distributed contention ports at the destination ports. Before this patch, a bus could have a single request, response and snoop response in flight at any time, and with these changes there can be as many requests as connected slaves (bus master ports), and as many responses as connected masters (bus slave ports). Together with address interleaving, this patch enables us to create high-throughput memory interconnects, e.g. 50+ GByte/s.	2013-05-30 12:54:01 -04:00
Andreas Hansson	e82996d9da	mem: Separate the two snoop response cases in the bus This patch makes the flow control and state updates of the coherent bus more clear by separating the two cases, i.e. forward as a snoop response, or turn it into a normal response. With this change it is also more clear what resources are being occupied, and that we effectively bypass the busy check for the second case. As a result of the change in resource usage some stats change.	2013-05-30 12:54:00 -04:00
Andreas Hansson	cb62d39835	mem: Tidy up a few variables in the bus This patch does some minor housekeeping on the bus code, removing redundant code, and moving the extraction of the destination id to the top of the functions using it.	2013-05-30 12:53:59 -04:00
Uri Wiener	91f7b065a9	mem: Add basic stats to the buses This patch adds a basic set of stats which are hard to impossible to implement using only communication monitors, and are needed for insight such as bus utilization, transactions through the bus etc. Stats added include throughput and transaction distribution, and also a two-dimensional vector capturing how many packets and how much data is exchanged between the masters and slaves connected to the bus.	2013-05-30 12:53:58 -04:00
Andreas Hansson	e1e73c5f39	mem: Use unordered set in bus request tracking This patch changes the set used to track outstanding requests to an unordered set (part of C++11 STL). There is no need to maintain the order, and hopefully there might even be a small performance benefit.	2013-05-30 12:53:57 -04:00
Andreas Hansson	82397921a5	mem: Check for waiting state in bus draining This patch fixes a bug in the bus where the bus transitions from busy to idle and still has a port that is waiting for a retry from a peer.	2013-05-30 12:53:57 -04:00
Andreas Hansson	bf6291460d	mem: Add a LPDDR3-1600 configuration This patch adds a typical (leaning towards fast) LPDDR3 configuration based on publicly available data. As expected, it looks very similar to the LPDDR2-S4 configuration, only with a slightly lower burst time.	2013-05-30 12:53:56 -04:00
Andreas Hansson	ce1ad84abd	mem: Adapt the LPDDR2 to match a single x32 channel This patch adapts the existing LPDDR2 configuration to make use of the multi-channel functionality. Thus, to get a x64 interface two controllers should be instantiated using the makeMultiChannel method. The page size and ranks are also adapted to better suit a typical LPDDR2 part.	2013-05-30 12:53:55 -04:00
Andreas Hansson	88aa7755f4	mem: Avoid explicitly zeroing the memory backing store This patch removes the explicit memset as it is redundant and causes the simulator to touch the entire space, forcing the host system to allocate the pages. Anonymous pages are mapped on the first access, and the page-fault handler is responsible for zeroing them. Thus, the pages are still zeroed, but we avoid touching the entire allocated space which enables us to use much larger memory sizes as long as not all the memory is actually used.	2013-05-30 12:53:54 -04:00
Malek Musleh	64af621cc6	ruby: slicc: fix error msg in TypeFieldMemberAST.py	2013-05-21 11:57:14 -05:00
Nilay Vaish	4ef466cc8a	ruby: moesi hammer: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:45 -05:00
Nilay Vaish	09d5bc7e6f	ruby: mesi cmp directory: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:38 -05:00
Nilay Vaish	bd3d1955da	ruby: moesi cmp token: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:24 -05:00
Nilay Vaish	e7ce518168	ruby: moesi cmp directory: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:15 -05:00
Nilay Vaish, Malek Musleh <malek.musleh@gmail.com>	59a7abff29	ruby: add stats to .sm files, remove cache profiler This patch changes the way cache statistics are collected in ruby. As of now, there is a separate entity called CacheProfiler which holds statistical variables for caches. The CacheMemory class defines different functions for accessing the CacheProfiler. These functions are then invoked in the .sm files. I find this approach opaque and prone to error. Secondly, we probably should not be paying the cost of a function call for recording statistics. Instead, this patch allows for accessing statistical variables in the .sm files. The collection would become transparent. Secondly, it would happen in place, so no function calls. The patch also removes the CacheProfiler class. --HG-- rename : src/mem/slicc/ast/InfixOperatorExprAST.py => src/mem/slicc/ast/OperatorExprAST.py	2013-05-21 11:31:31 -05:00
Mitch Hayenga	b222ba2fd3	sim: Fix two bugs relating to software caching of PageTable entries. The existing implementation can read uninitialized data or stale information from the cached PageTable entries. 1) Add a valid bit for the cache entries. Simply using zero for the virtual address to signify invalid entries is not sufficient. Speculative, wrong-path accesses frequently access page zero. The current implementation would return an uninitialized TLB entry when address zero was accessed and the PageTable cache entry was invalid. 2) When unmapping/mapping/remapping a page, invalidate the corresponding PageTable cache entry if one already exists.	2013-04-23 09:47:52 -04:00
Nilay Vaish	95eebf9e5e	ruby: mesi coherence protocol: remove unused state M_MB	2013-04-23 00:03:07 -05:00
Nilay Vaish	aa86800e7a	ruby: patch checkpoint restore with garnet Due to recent changes to clocking system in Ruby and the way Ruby restores state from a checkpoint, garnet was failing to run from a checkpointed state. The problem is that Ruby resets the time to zero while warming up the caches. If any component records a local copy of the time (read: calls curCycle()) before the simulation has started, then that component will not operate until that time is reached. In the context of this particular patch, the Garnet Network class calls curCycle() at multiple places. Any non-operational component can block requests in the memory system, which the system interprets as a deadlock. This patch makes changes so that Garnet can successfully run from checkpointed state. It adds a globally visible time at which the actual execution started. This time is initialized in RubySystem::startup() function. This variable is only meant for components within Ruby. This replaces the private variable that was maintained within Garnet since it is not possible to figure out the correct time when the value of this variable can be set. The patch also does away with all cases where curCycle() is called within some Ruby component before the system has actually started executing. This is required due to the quirky manner in which ruby restores from a checkpoint.	2013-04-23 00:03:02 -05:00
Andreas Hansson	e23e3bea8b	mem: Address mapping with fine-grained channel interleaving This patch adds an address mapping scheme where the channel interleaving takes place on a cache line granularity. It is similar to the existing RaBaChCo that interleaves on a DRAM page, but should give higher performance when there is less locality in the address stream.	2013-04-22 13:20:34 -04:00
Andreas Hansson	e61799aa7c	mem: More descriptive enum names for address mapping This patch changes the slightly ambiguous names used for the address mapping scheme to be more descriptive, and actually spell out what they do. With this patch we also open up for adding more flavours of open- and close-type mappings, i.e. interleaving across channels with the open map.	2013-04-22 13:20:33 -04:00
Andreas Hansson	a35d3ff167	mem: Add a WideIO DRAM configuration This patch adds a WideIO 200 MHz configuration that can be used as a baseline to compare with DDRx and LPDDRx. Note that it is a single channel and that it should be replicated 4 times. It is based on publicly available information and attempts to capture an envisioned 8 Gbit single-die part (i.e. without TSVs).	2013-04-22 13:20:33 -04:00
Uri Wiener	a8fbfefb5e	mem: Adding verbose debug output in the memory system This patch provides useful printouts throughout the memory system. This includes pretty-printed cache tags and function call messages (call-stack like).	2013-04-22 13:20:33 -04:00
Andreas Hansson	9929e884b6	mem: Replace check with panic where inhibited should not happen This patch changes the SimpleTimingPort and RubyPort to panic on inhibited requests as this should never happen in either of the cases. The SimpleTimingPort is only used for the I/O devices PIO port and the DMA devices config port and should thus never see an inhibited request. Similarly, the SimpleTimingPort is also used for the MessagePort in x86, and there should also not be any cases where the port sees an inhibited request.	2013-04-22 13:20:33 -04:00
Dam Sunwoo	e8381142b0	sim: separate nextCycle() and clockEdge() in clockedObjects Previously, nextCycle() could return the current cycle if the current tick was already aligned with the clock edge. This behavior is not only confusing (not quite what the function name implies), but also caused problems in the drainResume() function. When exiting/re-entering the sim loop (e.g., to take checkpoints), the CPUs will drain and resume. Due to the previous behavior of nextCycle(), the CPU tick events were being rescheduled in the same ticks that were already processed before draining. This caused divergence from runs that did not exit/re-enter the sim loop. (Initially a cycle difference, but a significant impact later on.) This patch separates out the two behaviors (nextCycle() and clockEdge()), uses nextCycle() in drainResume, and uses clockEdge() everywhere else. Nothing (other than name) should change except for the drainResume timing.	2013-04-22 13:20:31 -04:00
Nilay Vaish	03c60f005e	ruby: moesi cmp directory: add copyright notice	2013-04-17 16:06:58 -05:00
Joel Hestness	1583056de8	Ruby: Fix RubyPort evict packet memory leak When using the o3 or inorder CPUs with many Ruby protocols, the caches may need to forward invalidations to the CPUs. The RubyPort was instantiating a packet to be sent to the CPUs to signal the eviction, but the packets were not being freed by the CPUs. Consistent with the classic memory model, stack allocate the packet and heap allocate the request so on ruby_eviction_callback() completion, the packet destructor is called, and deletes the request (*Note: stack allocating the request causes double deletion, since it will be deleted in the packet destructor). This results in the least memory allocations without memory errors.	2013-04-09 16:25:30 -05:00
Joel Hestness	46d4b71aa2	Ruby: Delete packet requests during warmup When warming up caches in Ruby, the CacheRecorder sends fetch requests into Ruby Sequencers with packet types that require responses. Since responses are never generated for these CacheRecorder requests, the requests are not deleted in the packet destructor called from the Ruby hit callback. Free the request.	2013-04-09 16:25:29 -05:00
Joel Hestness	e98c3c227d	Ruby: Add field to slicc machine for generic type This allows you to have (e.g.) an L2 cache that is not named "L2Cache" but is still a GenericMachineType_L2Cache. This is particularly helpful if the protocol has multiple L2 controllers.	2013-04-09 16:25:29 -05:00
Joel Hestness	b936619ab4	Ruby: Order profilers based on version When Ruby stats are printed for events and transitions, they include stats for all of the controllers of the same type, but they are not necessarily printed in order of the controller ID "version", because of the way the profilers were added to the profiler vector. This patch fixes the push order problem so that the stats are printed in ascending order 0->(# controllers), so statistics parsers may correctly assume the controller to which the stats belong.	2013-04-09 16:25:29 -05:00
Jason Power	88d34665d0	Ruby: More descriptive message buffer connection fatal When connecting message buffers between Ruby controllers, it is easy to mistakenly connect multiple controllers to the same message buffer. This patch prints a more descriptive fatal message than the previous assert statement in order to facilitate easier debugging.	2013-04-09 16:15:06 -05:00
Jason Power	19cc9fc6bd	Ruby: Fix typo in Slicc if-statement AST error The error in the SLICC code was hidden by the python error in SLICC parser before this patch	2013-04-09 16:12:42 -05:00
Joel Hestness	3b02210713	Ruby System, Cache Recorder: Use delete [] for trace vars The cache trace variables are array allocated uint8_t* in the RubySystem and the Ruby CacheRecorder, but the code used delete to free the memory, resulting in Valgrind memory errors. Change these deletes to delete [] to get rid of the errors.	2013-04-07 20:31:15 -05:00
Mitch Hayenga	4920f0d7e5	mem: Fix cache latency bug Fixes a latency calculation bug for accesses during a cache line fill. Under a cache miss, before the line is filled, accesses to the cache are associated with a MSHR and marked as targets. Once the line fill completes, MSHR target packets pay an additional latency of "responseLatency + busSerializationLatency". However, the "whenReady" field of the cache line is only set to an additional delay of "busSerializationLatency". This lacks the responseLatency component of the fill. It is possible for accesses that occur on the cycle of (or briefly after) the line fill to respond without properly paying the responseLatency. This also creates the situation where two accesses to the same address may be serviced in an order opposite of how they were received by the cache. For stores to the same address, this means that although the cache performs the stores in the order they were received, acknowledgements may be sent in a different order. Adding the responseLatency component to the whenReady field preserves the penalty that should be paid and prevents these ordering issues. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-03-27 18:36:09 -05:00
Rene de Jong	87089175cc	mem: Cancel cache retry event when blocking port This patch solves the corner case scenario where the sendRetryEvent could be scheduled twice, when an io device stresses the IOcache in the system. This should not be possible in the cache system.	2013-03-26 14:46:51 -04:00
Andreas Hansson	93a8423dea	mem: Separate waiting for the bus and waiting for a peer This patch splits the retryList into a list of ports that are waiting for the bus itself to become available, and a map that tracks the ports where forwarding failed due to a peer not accepting the packet. Thus, when a retry reaches the bus, it can be sent to the appropriate port that initiated that transaction. As a consequence of this patch, only ports that are really ready to go will get a retry, thus reducing the amount of redundant failed attempts. This patch also makes it easier to reason about the order of servicing requests as the ports waiting for the bus are now clearly FIFO and much easier to change if desired.	2013-03-26 14:46:47 -04:00
Andreas Hansson	362f6f1a16	mem: Introduce a variable for the retrying port This patch introduces a variable to keep track of the retrying port instead of relying on it being the front of the retryList. Besides the improvement in readability, this patch is a step towards separating out the two cases where a port is waiting for the bus to be free, and where the forwarding did not succeed and the bus is waiting for a retry to pass on to the original initiator of the transaction. The changes made are currently such that the regressions are not affected. This is ensured by always prioritizing the currently retrying port and putting it back at the front of the retry list.	2013-03-26 14:46:46 -04:00
Andreas Hansson	7a57b1bce0	mem: Add optional request flags to the packet trace This patch adds an optional flags field to the packet trace to encode the request flags that contain information about whether the request is (un)cacheable, instruction fetch, prefetch etc.	2013-03-26 14:46:44 -04:00
Nilay Vaish	b2c8c50f17	ruby: slicc: set sender, receiver clock objs for optional queue	2013-03-22 17:21:23 -05:00
Nilay Vaish	e85b556d70	ruby: message buffer: correct previous errors A recent set of patches added support for multiple clock domains to ruby. I had made some errors while writing those patches. The sender was using the receiver side clock while enqueuing a message in the buffer. Those errors became visible while creating (or restoring from) checkpoints. The errors also become visible when a multi eventq scenario occurs.	2013-03-22 17:21:22 -05:00
Nilay Vaish	47c8cb72fc	ruby: message buffer: remove _ptr from some variables The names were getting too long.	2013-03-22 15:53:27 -05:00
Nilay Vaish	6465cf5824	ruby: message buffer node: used Tick in place of Cycles The message buffer node used to keep time in terms of Cycles. Since the sender and the receiver can have different clock periods, storing node time in cycles requires some conversion. Instead store the time directly in Ticks.	2013-03-22 15:53:26 -05:00
Nilay Vaish	39e9445468	ruby: consumer: avoid using receiver side clock A set of patches was recently committed to allow multiple clock domains in ruby. In those patches, I had inadvertently made an incorrect use of the clocks. Suppose object A needs to schedule an event on object B. It was possible that A accesses B's clock to schedule the event. This is not possible in an actual system. Hence, changes are being made to the Consumer class so as to avoid such happenings. Note that in a multi eventq simulation, this can possibly lead to an incorrect simulation. There are two functions in the Consumer class that are used for scheduling events. The first function takes in the relative delay over the current time as the argument and adds the current time to it for scheduling the event. The second function takes in the absolute time (in ticks) for scheduling the event. The first function is now being moved to protected section of the class so that only objects of the derived classes can use it. All other objects will have to specify absolute time while scheduling an event for some consumer.	2013-03-22 15:53:26 -05:00
Nilay Vaish	28005a7626	ruby: remove unused profile functions	2013-03-22 15:53:25 -05:00
Nilay Vaish	89bb826079	ruby: keep histogram of outstanding requests in seq The histogram for tracking outstanding counts per cycle is maintained in the profiler. For a parallel implementation of the memory system, this histogram needs to be maintained locally. Hence it will now be kept in the sequencer itself. The resulting histograms will be merged when the stats are printed.	2013-03-22 15:53:25 -05:00
Nilay Vaish	870d545788	slicc: remove check if the L1Cache has a sequencer	2013-03-22 15:53:24 -05:00
Nilay Vaish	8573a69d8f	ruby: move stall and wakeup functions to AbstractController These functions are currently implemented in one of the files related to Slicc. Since these are purely C++ functions, they are better suited to be in the base class.	2013-03-22 15:53:24 -05:00
Nilay Vaish	eccc86e809	ruby: connect two controllers using only message buffers This patch modifies ruby so that two controllers can be connected to each other with only message buffers in between. Before this patch, all the controllers had to be connected to the network for them to communicate with each other. With this patch, one can have protocols where a controller is not connected to the network, but communicates with another controller through a message buffer.	2013-03-22 15:53:23 -05:00
Nilay Vaish	5aa43e130a	ruby: convert Topology to regular class The Topology class in Ruby does not need to inherit from SimObject class. This patch turns it into a regular class. The topology object is now created in the constructor of the Network class. All the parameters for the topology class have been moved to the network class.	2013-03-22 15:53:23 -05:00
Nilay Vaish	2d50127642	ruby: network: move routers from topology to network	2013-03-22 15:53:22 -05:00
Andreas Hansson	c01c5e971b	mem: Fix missing delete of packet in DRAM access This patch fixes a memory leak caused by not deleting packets that require no response.	2013-03-18 05:22:45 -04:00
Nilay Vaish	dc37b03439	ruby: set: corrects csprintf() call introduced by 7d95b650c9b6	2013-03-15 16:28:08 -05:00
Andreas Hansson	92e973b310	ruby: Fix gcc 4.8 maybe-uninitialized compilation error This patch fixes the one-and-only gcc 4.8 compilation error, being a warning about "maybe uninitialized" in Orion.	2013-03-07 05:55:02 -05:00
Nilay Vaish	c061819890	ruby: remove the functional copy of memory in se mode This patch removes the functional copy of the memory that was maintained in the se mode. Now ruby itself will provide the data.	2013-03-06 21:53:57 -06:00
Nilay Vaish	e8802fa127	ruby: garnet: fixed: implement functional access	2013-03-06 21:53:16 -06:00
Blake Hechtman, Nilay Vaish <nilay@cs.wisc.edu>	af8eb67fb4	ruby: fixes functional writes to RubyRequest The functional write code was assuming that all writes are block sized, which may not be true for Ruby Requests. This bug can lead to a buffer overflow. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-03-02 23:12:55 -06:00
Andreas Hansson	e5bcb30756	mem: Add check if SimpleDRAM nextReqEvent is scheduled This check covers a case where a retry is called from the SimpleDRAM causing a new request to appear before the DRAM itself schedules a nextReqEvent. By adding this check, the event is not scheduled twice.	2013-03-01 13:20:33 -05:00
Andreas Hansson	da5356ccce	mem: Add a method to build multi-channel DRAM configurations This patch adds a class method that allows easy creation of channel-interleaved multi-channel DRAM configurations. It is enabled by a class method to allow customisation of the class independent of the channel configuration. For example, the user can create a MyDDR subclass of e.g. SimpleDDR3, and then create a four-channel configuration of the subclass by calling MyDDR.makeMultiChannel(4, mem_start, mem_size).	2013-03-01 13:20:32 -05:00
Andreas Hansson	0facc8e1ac	mem: SimpleDRAM variable naming and whitespace fixes This patch fixes a number of small cosmetic issues in the SimpleDRAM module. The most important change is to move the accounting of received packets to after the check is made if the packet should be retried or not. Thus, packets are only counted if they are actually accepted.	2013-03-01 13:20:24 -05:00
Andreas Hansson	3ba131f4d5	mem: Add support for multi-channel DRAM configurations This patch adds support for multi-channel instances of the DRAM controller model by stripping away the channel bits in the address decoding. The patch relies on the availability of address interleaving and, at this time, it is up to the user to configure the interleaving appropriately. At the moment it is assumed that the channel interleaving bits are immediately following the column bits (smallest sensible interleaving). Convenience methods for building multi-channel configurations will be added later.	2013-03-01 13:20:22 -05:00
Andreas Hansson	1a58362e25	mem: Merge interleaved ranges when creating backing store This patch adds merging of interleaved ranges before creating the backing stores. The backing stores are always a contiguous chunk of the address space, and with this patch it is possible to have interleaved memories in the system.	2013-03-01 13:20:21 -05:00
Andreas Hansson	cafd38f36c	mem: Merge ranges in bus before passing them on This patch adds basic merging of address ranges to the bus, such that interleaved ranges are merged together before being passed on by the bus. As such, the bus aggregates the address ranges of the connected slave ports and then passes on the merged ranges through its master ports. The bus thus hides the complexity of the interleaved ranges and only exposes contiguous ranges to the surrounding system. As part of this patch, the bus ranges are also cached for any future queries.	2013-03-01 13:20:19 -05:00
Dibakar Gope, Nilay Vaish <nilay@cs.wisc.edu>	c636a09e83	ruby: mesi coherence protocol: invalidate lock The MESI CMP directory coherence protocol, while transitioning from SM to IM, did not invalidate the lock that it might have taken on a cache line. This patch adds an action for doing so. The problem was found by Dibakar, but I was not happy with his proposed solution. So I implemented a different solution. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-02-28 10:04:26 -06:00
Nilay Vaish	fea27bc49b	slicc: remove unused variable message_buffer_names	2013-02-19 22:58:51 -06:00
Nilay Vaish	e95e78ff2f	ruby: remove unused variable m_print_config in class Topology	2013-02-19 22:58:50 -06:00
Andreas Hansson	da950caed2	mem: Fix sender state bug and delay popping This patch fixes a newly introduced bug where the sender state was popped before checking that it should be. Amazingly all regressions pass, but Linux fails to boot on the detailed CPU with caches enabled.	2013-02-19 12:57:47 -05:00
Andreas Hansson	a62afd094b	scons: Fix warnings issued by clang 3.2svn (XCode 4.6) This patch fixes the warnings that clang 3.2svn emits due to the "-Wall" flag. There is one case of an uninitialised value in the ARM neon ISA description, and then a whole range of unused private fields that are pruned.	2013-02-19 05:56:08 -05:00
Andreas Hansson	319443d42d	scons: Add warning for missing declarations This patch enables warnings for missing declarations. To avoid issues with SWIG-generated code, the warning is only applied to non-SWIG code.	2013-02-19 05:56:07 -05:00
Andreas Hansson	c10098f28b	scons: Fix up numerous warnings about name shadowing This patch addresses the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.	2013-02-19 05:56:06 -05:00
Andreas Hansson	860155a5fc	mem: Enforce strict use of busFirst- and busLastWordTime This patch adds a check to ensure that the delay incurred by the bus is not simply disregarded, but accounted for by someone. At this point, all the modules do is to zero it out, and no additional time is spent. This highlights where the bus timing is simply dropped instead of being paid for. As a follow up, the locations identified in this patch should add this additional time to the packets in one way or another. For now it simply acts as a sanity check and highlights where the delay is simply ignored. Since no time is added, all regressions remain the same.	2013-02-19 05:56:06 -05:00
Andreas Hansson	40d0e6c899	mem: Change accessor function names to match the port interface This patch changes the names of the cache accessor functions to be in line with those used by the ports. This is done to avoid confusion and get closer to a one-to-one correspondence between the interface of the memory object (the cache in this case) and the port itself. The member function timingAccess has been split into a snoop/non-snoop part to avoid branching on the isResponse() of the packet.	2013-02-19 05:56:06 -05:00
Andreas Hansson	b3fc8839c4	mem: Make packet bus-related time accounting relative This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to create a Last-Level Cache (LLC) directly to a memory controller without a bus in between. The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc) should be doing this if we opt for keeping this way of accounting for the bus timing.	2013-02-19 05:56:06 -05:00
Andreas Hansson	362160c8ae	mem: Add deferred packet class to prefetcher This patch removes the time field from the packet as it was only used by the prefetcher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent.	2013-02-19 05:56:06 -05:00
Andreas Hansson	7cd49b24d2	sim: Make clock private and access using clockPeriod() This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier.	2013-02-19 05:56:06 -05:00
Sascha Bischoff	86a4d09269	mem: Fix SenderState related cache deadlock This patch fixes a potential deadlock in the caches. This deadlock could occur when more than one cache is used in a system, and pkt->senderState is modified in between the two caches. This happened as the caches relied on the senderState remaining unchanged, and used it for instantaneous upstream communication with other caches. This issue has been addressed by iterating over the linked list of senderStates until we are either able to cast to a MSHR* or senderState is NULL. If the cast is successful, we know that the packet has previously passed through another cache, and therefore update the downstreamPending flag accordingly. Otherwise, we do nothing.	2013-02-19 05:56:06 -05:00
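The senderState traversal described in the commit above can be sketched roughly as follows. This is only an illustration in Python, not the C++ code from the patch: the class names mirror the commit message, while the `predecessor` link, the `downstream_pending` attribute and the `find_mshr` helper are hypothetical names chosen for the sketch.

```python
class SenderState:
    """Base class for per-hop packet state (illustrative stand-in)."""
    def __init__(self, predecessor=None):
        # Link to the state pushed by the previous component (assumed name).
        self.predecessor = predecessor


class MSHR(SenderState):
    """Stand-in for a cache miss status holding register."""
    def __init__(self, predecessor=None):
        super().__init__(predecessor)
        self.downstream_pending = True


def find_mshr(state):
    """Walk the chain until an MSHR is found or the chain ends (None)."""
    while state is not None and not isinstance(state, MSHR):
        state = state.predecessor
    return state


# A packet that passed through a cache (MSHR) and then two other components.
chain = SenderState(SenderState(MSHR()))
mshr = find_mshr(chain)
if mshr is not None:
    # The packet previously passed through another cache: update its flag.
    mshr.downstream_pending = False
```

If no MSHR is found, `find_mshr` returns `None` and nothing is updated, which matches the "otherwise, we do nothing" case in the message.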
Andreas Hansson	0622f30961	mem: Add predecessor to SenderState base class This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.	2013-02-19 05:56:05 -05:00
Andreas Hansson	9947923c60	mem: Ensure trace captures packet fields before forwarding This patch fixes a bug in the CommMonitor caused by the packet being modified before it is captured in the trace. By recording the fields before passing the packet on, and then putting these values in the trace we ensure that even if the packet is modified the trace captures what the CommMonitor saw.	2013-02-19 05:56:05 -05:00
Andreas Hansson	f6550b3d20	mem: Tighten up cache constness and scoping This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected.	2013-02-15 17:40:10 -05:00
Andreas Sandberg	b904bd5437	sim: Add a system-global option to bypass caches Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virtualized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.	2013-02-15 17:40:09 -05:00
Andreas Hansson	7c6bc52bf5	Ruby: Fix compilation errors on gcc 4.7 and clang 3.2 This patch fixes a few (recently added) errors that prevented gem5 from compiling on more recent versions of gcc and clang.	2013-02-14 12:24:51 -05:00
Nilay Vaish	71c27e6370	ruby: MI protocol: add a missing transition The transition for state MII and event Store was found missing during testing. The transition is being added. The controller will not stall the Store request in state MII	2013-02-10 21:43:18 -06:00
Nilay Vaish	cb7782f78d	ruby: enable multiple clock domains This patch allows ruby to have multiple clock domains. As I understand with this patch, controllers can have different frequencies. The entire network needs to run at a single frequency. The idea is that within an object, time is treated in terms of cycles. But the messages that are passed from one entity to another should contain the time in Ticks. As of now, this is only true for the message buffers, but not for the links in the network. As I understand the code, all the entities in different networks (simple, garnet-fixed, garnet-flexible) should be clocked at the same frequency. Another problem is that the directory controller has to operate at the same frequency as the ruby system. This is because the memory controller does not make use of the Message Buffer, and instead implements a buffer of its own. So, it has no idea of the frequency at which the directory controller is operating and uses ruby system's frequency for scheduling events.	2013-02-10 21:43:17 -06:00
Nilay Vaish	253e8edf13	ruby: replace Time with Cycles (final patch in the series) This patch is as of now the final patch in the series of patches that replace Time with Cycles. This patch further replaces Time with Cycles in Sequencer, Profiler, different protocols and related entities. Though Time has not been completely removed, the places where it is in use seem benign as of now.	2013-02-10 21:43:10 -06:00
Nilay Vaish	f6e3ab7bd4	ruby: replace Time with Cycles in garnet fixed and flexible	2013-02-10 21:43:09 -06:00
Nilay Vaish	9d6d6c6718	ruby: replace Time with Tick in replacement policy classes	2013-02-10 21:43:08 -06:00
Nilay Vaish	221d39284e	ruby: convert block size, memory size to unsigned	2013-02-10 21:43:07 -06:00
Nilay Vaish	5e33045a2a	ruby: replace Time with Cycles in MessageBuffer	2013-02-10 21:26:26 -06:00
Nilay Vaish	b742081cc1	ruby: replace Time with Cycles in Memory Controller	2013-02-10 21:26:25 -06:00
Nilay Vaish	89f86dbd28	ruby: Replace Time with Cycles in SequencerMessage	2013-02-10 21:26:25 -06:00
Nilay Vaish	7862478eef	ruby: replace Time with Cycles in Message class Concomitant changes are being committed as well, including the io operator<< for the Cycles class.	2013-02-10 21:26:24 -06:00
Nilay Vaish	d3aebe1f91	ruby: replaces Time with Cycles in many places The patch started off with replacing Time with Cycles in the Consumer class. But to get ruby to compile, the rest of the changes had to be carried out. Subsequent patches will further this process, till we completely replace Time with Cycles.	2013-02-10 21:26:24 -06:00
Nilay Vaish	bc1daae7fd	ruby: modifies histogram add() function This patch modifies the Histogram class' add() function so that it can add linear histograms as well. The function assumes that the left end point of the ranges of the two histograms are the same. It also assumes that when the ranges of the two histograms are changed to accommodate an element not in the range, the factor used in changing the range is the same for both histograms. This function is then used in removing one of the calls to the global profiler*. The histograms for recording the delays incurred in processing different requests are now maintained by the controllers. The profiler adds these histograms when it needs to print the stats.	2013-02-10 21:26:22 -06:00
Nilay Vaish	a49b1df3f0	ruby: record fully busy cycle within the controller This patch does several things. First, the counter for fully busy cycles for a controller is now kept within the controller, instead of being part of the profiler. Second, the topology class no longer keeps an array of controllers which was only used for printing stats. Instead, ruby system will now ask each controller to print the stats. Thirdly, the statistical variable for recording how many different types were created is being moved into the controller from the profiler. Note that for printing, the profiler will collate results from different controllers.	2013-02-10 21:26:22 -06:00
Nilay Vaish	6aed4d4f93	ruby: correct computation of number of bits required for address The number of bits required for an address was set to floorLog2(memory size). This is correct under the assumption that the memory size is a power of 2, which is not always true. Hence, floorLog2 is being replaced with ceilLog2.	2013-01-31 09:44:20 -06:00
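To illustrate why the commit above switches from floorLog2 to ceilLog2, here is a small Python sketch. The helper names `floor_log2` and `ceil_log2` are stand-ins written for this example, not the gem5 functions themselves, and the 3 GiB memory size is an arbitrary choice.

```python
def floor_log2(n: int) -> int:
    """Largest k such that 2**k <= n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return n.bit_length() - 1


def ceil_log2(n: int) -> int:
    """Smallest k such that 2**k >= n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return (n - 1).bit_length()


GIB = 2 ** 30

# Power-of-two size: both functions agree.
pow2_bits = (floor_log2(2 * GIB), ceil_log2(2 * GIB))

# Non-power-of-two size: floor_log2 yields too few bits to address
# the top of memory, ceil_log2 yields enough.
mem_size = 3 * GIB
too_few = floor_log2(mem_size)
enough = ceil_log2(mem_size)
highest_addr = mem_size - 1
```

With a 3 GiB memory the highest address needs 32 bits, so the floor-based count of 31 bits cannot represent it.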
Andreas Hansson	a4288dabf9	mem: Add comments for the DRAM address decoding This patch adds more verbose comments to explain the two different address mapping schemes of the DRAM controller.	2013-01-31 07:49:18 -05:00
Andreas Hansson	c4898b15bc	mem: Add DDR3 and LPDDR2 DRAM controller configurations This patch moves the default DRAM parameters from the SimpleDRAM class to two different subclasses, one for DDR3 and one for LPDDR2. More can be added as we go forward. The regressions that previously used the SimpleDRAM are now using SimpleDDR3 as this is the most similar configuration.	2013-01-31 07:49:14 -05:00
Ani Udipi	eaa37e611f	mem: Add tTAW and tFAW to the SimpleDRAM model This patch adds two additional scheduling constraints to the DRAM controller model, to constrain the activation rate. The two metrics determine the size of the activation window in terms of the number of activates and the minimum time required for that number of activates. This maps to current DDRx, LPDDRx and WIOx standards that have either tFAW (4 activate window) or tTAW (2 activate window) scheduling constraints.	2013-01-31 07:49:14 -05:00
Andreas Hansson	b7153e2a64	mem: Separate out the different cases for DRAM bus busy time This patch changes how the data bus busy time is calculated such that it is delayed to the actual scheduling time of the request as opposed to being done as soon as possible. This patch changes a bunch of statistics, and the stats update is bundled together with the introduction of tFAW/tTAW and the named DRAM configurations like DDR3 and LPDDR2.	2013-01-31 07:49:13 -05:00
Anthony Gutierrez	af0f8b31db	cache: remove drainManager because it's not used the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.	2013-01-28 20:19:42 -05:00
Nilay Vaish	a8eb5b18e0	ruby: remove get_time() This patch replaces get_time() in *.sm files with curCycle() which is now possible since controllers are clocked objects.	2013-01-28 06:14:18 -06:00
Nilay Vaish	31659e83fb	ruby: remove call to curCycle in panic() The panic() function already prints the current tick value. This call to curCycle() is as such redundant. Since we are trying to move towards multiple clock domains, this call will print misleading time.	2013-01-28 06:11:42 -06:00
Nilay Vaish	5b6f972750	ruby: remove calls to g_system_ptr->getTime() This patch further removes calls to g_system_ptr->getTime() where ever other clocked objects are available for providing current time.	2013-01-17 13:10:12 -06:00
Malek Musleh	1abf950f3c	ruby sequencer: converts cycles to ticks in deadlock panic() This patch converts the panic() print outs in the Sequencer::wakeup() call from ruby cycles to Ticks(). This makes it easier to debug deadlocks with the ProtocolTrace flag so the issue time indicated in the panic message can be quickly searched for. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-14 10:05:12 -06:00
Nilay Vaish	2012983718	Ruby: remove reference to g_system_ptr from class Message This patch was initiated so as to remove reference to g_system_ptr, the pointer to Ruby System that is used for getting the current time. That simple change actually requires changing a great many things in slicc and garnet. All these changes are related to how time is handled. In most of the places, g_system_ptr has been replaced by another clock object. The changes have been done under the assumption that all the components in the memory system are on the same clock frequency, but the actual clocks might be distributed.	2013-01-14 10:05:10 -06:00
Nilay Vaish	cf232de461	Ruby: use ClockedObject in Consumer class Many Ruby structures inherit from the Consumer, which is used for scheduling events. The Consumer used to rely on an Event Manager for scheduling events and on g_system_ptr for time. With this patch, the Consumer will now use a ClockedObject to schedule events and to query for current time. This resulted in several structures being converted from SimObjects to ClockedObjects. Also, the MessageBuffer class now requires a pointer to a ClockedObject so as to query for time.	2013-01-14 10:04:21 -06:00
Mitch Hayenga	c7dbd5e768	mem: Make LL/SC locks fine grained The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range.	2013-01-08 08:54:07 -05:00
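The fine-grained lock check described in the commit above might look something like the following Python sketch. It is not the gem5 implementation: `LockRecord`, its field names and `surviving_locks` are hypothetical, and inclusive address bounds are an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockRecord:
    """A load-linked reservation covering [low_addr, high_addr] (inclusive)."""
    context_id: int
    low_addr: int
    high_addr: int

    def overlaps(self, addr: int, size: int) -> bool:
        """True if a store of `size` bytes at `addr` touches the locked bytes."""
        return addr <= self.high_addr and addr + size - 1 >= self.low_addr


def surviving_locks(locks, store_addr, store_size):
    """Keep only the locks that the store does not overlap."""
    return [l for l in locks if not l.overlaps(store_addr, store_size)]


# A 4-byte reservation at 0x100 inside a 64-byte cache line.
locks = [LockRecord(context_id=0, low_addr=0x100, high_addr=0x103)]

# Store to another part of the same line: the reservation is kept.
after_far_store = surviving_locks(locks, store_addr=0x120, store_size=4)

# Store overlapping the reserved bytes: the reservation is dropped.
after_near_store = surviving_locks(locks, store_addr=0x102, store_size=1)
```

With a per-cacheline list and no range, both stores would have cleared the reservation, which is the spurious LL/SC failure the patch removes.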
Mitch Hayenga	dc4a0aa2fa	mem: Fix use-after-free bug Running with valgrind I noticed a use after free originating from simple_mem.cc. It looks like this is a known issue and this additional call site was missed in an earlier patch.	2013-01-08 08:54:06 -05:00
Andreas Sandberg	964aa49d15	mem: Fix guest corruption when caches handle uncacheable accesses When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS. Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses.	2013-01-07 13:05:47 -05:00
Andreas Sandberg	d44f2f611f	mem: Remove the IIC replacement policy The IIC replacement policy seems to be unused and has probably gathered too much bit rot to be useful. This patch removes the IIC and its associated cache parameters.	2013-01-07 13:05:39 -05:00
Andreas Hansson	921490a060	sim: Fatal if a clocked object is set to have a clock of 0 This patch adds a check to the clocked object constructor to ensure it is not configured to have a clock period of 0.	2013-01-07 13:05:39 -05:00
Andreas Hansson	18b147acef	mem: Merge ranges that are part of the conf table This patch adds basic merging of address ranges when determining which address ranges should be reported in the configuration table. By performing this merging it is possible to distribute an address range across many memory channels (controllers). This is essential to enable address interleaving.	2013-01-07 13:05:38 -05:00
Andreas Hansson	01c5598373	mem: Add interleaving bits to the address ranges This patch adds support for interleaving bits for the address ranges. What was previously just a start and end address, now has an additional three fields, for the high bit, and number of bits to use for interleaving, and a match value to compare against. If the number of interleaving bits is set to zero it is effectively disabled. A number of convenience functions are added to the range to enquire about the interleaving, its granularity and the number of stripes it is part of.	2013-01-07 13:05:38 -05:00
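A rough Python sketch of an address range with interleaving fields, following the description in the commit above. The class name, the field names, and the assumption that the selector bits are the `intlv_bits` bits ending at `intlv_high_bit` are illustrative guesses, not taken from the gem5 source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterleavedRange:
    start: int
    end: int                 # inclusive
    intlv_high_bit: int = 0  # highest address bit used for interleaving
    intlv_bits: int = 0      # number of interleaving bits (0 disables it)
    intlv_match: int = 0     # selector value this range responds to

    def interleaved(self) -> bool:
        return self.intlv_bits != 0

    def granularity(self) -> int:
        """Bytes covered by one stripe before the selector changes."""
        return 1 << (self.intlv_high_bit - self.intlv_bits + 1)

    def stripes(self) -> int:
        """Number of ranges the address space is striped across."""
        return 1 << self.intlv_bits

    def contains(self, addr: int) -> bool:
        if not (self.start <= addr <= self.end):
            return False
        if not self.interleaved():
            return True
        low_bit = self.intlv_high_bit - self.intlv_bits + 1
        selector = (addr >> low_bit) & ((1 << self.intlv_bits) - 1)
        return selector == self.intlv_match


# Two channels sharing 0..0xFFFF, interleaved on address bit 6 (64-byte stripes).
ch0 = InterleavedRange(0x0, 0xFFFF, intlv_high_bit=6, intlv_bits=1, intlv_match=0)
ch1 = InterleavedRange(0x0, 0xFFFF, intlv_high_bit=6, intlv_bits=1, intlv_match=1)
```

In this configuration consecutive 64-byte blocks alternate between `ch0` and `ch1`, and a range with `intlv_bits` left at zero behaves like a plain start/end range.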
Andreas Hansson	e0d93fde99	base: Simplify the AddrRangeMap by removing unused code This patch cleans up the AddrRangeMap in preparation for the addition of interleaving by removing unused code. The non-const editions of find are never used, and hence the duplication is not needed.	2013-01-07 13:05:38 -05:00
Andreas Hansson	15a979c6be	mem: Tidy up bus addr range debug messages This patch tidies up a number of the bus DPRINTFs related to range manipulation. In particular, it shifts the message about range changes to the start of the member function, and also adds information about when all ranges are received.	2013-01-07 13:05:38 -05:00
Andreas Hansson	caf6786ad5	mem: Skip address mapper range checks to allow more flexibility This patch makes the address mapper less stringent about checking the before and after ranges, i.e. the original and remapped ranges. The checks were not really necessary, and there are situations when the previous checks were too strict.	2013-01-07 13:05:38 -05:00
Andreas Hansson	71da1d2157	base: Encapsulate the underlying fields in AddrRange This patch makes the start and end address private in a move to prevent direct manipulation and matching of ranges based on these fields. This is done so that a transition to ranges with interleaving support is possible. As a result of hiding the start and end, a number of member functions are needed to perform the comparisons and manipulations that previously took place directly on the members. An accessor function is provided for the start address, and a function is added to test if an address is within a range. As a result of the latter the != and == operator is also removed in favour of the member function. A member function that returns a string representation is also created to allow debug printing. In general, this patch does not add any functionality, but it does take us closer to a situation where interleaving (and more cleverness) can be added under the bonnet without exposing it to the user. More on that in a later patch.	2013-01-07 13:05:38 -05:00
Andreas Hansson	cfdaf53104	mem: Remove the joining of neighbouring ranges This patch temporarily removes the joining of ranges when creating the backing store, to reserve this functionality for the interleaved ranges that are about to be introduced. When creating the mmaps for the backing store, there is no point in creating larger contiguous chunks than what is necessary. The larger chunks will only make life more difficult for the host. Merging will be re-added later, but then only for interleaved ranges.	2013-01-07 13:05:38 -05:00
Andreas Hansson	f456c7983d	mem: Add tracing support in the communication monitor This patch adds packet tracing to the communication monitor using a protobuf as the mechanism for creating the trace. If no file is specified, then the tracing is disabled. If a file is specified, then for every packet that is successfully sent, a protobuf message is serialized to the file.	2013-01-07 13:05:37 -05:00
Andreas Hansson	852a7bcf92	mem: Add sanity check to packet queue size This patch adds a basic check to ensure that the packet queue does not grow absurdly large. The queue should only be used to store packets that were delayed due to blocking from the neighbouring port, and not for actual storage. Thus, a limit of 100 has been chosen for now (which is already quite substantial).	2013-01-07 13:05:35 -05:00
Andreas Hansson	ce5fc494e3	ruby: Fix missing cxx_header in Switch This patch addresses a warning related to the swig interface generation for the Switch class. The cxx_header is now specified correctly, and the header in question has got a few includes added to make it all compile.	2013-01-07 13:05:35 -05:00
Andreas Hansson	174269978a	mem: Fix a bug in the memory serialization file naming This patch fixes a bug that caused multiple systems to overwrite each other's physical memory. The system name is now included in the filename such that this is avoided.	2013-01-07 13:05:35 -05:00
Ali Saidi	9a645d6e9b	cache: add note about where conflicts are handled	2013-01-07 13:05:32 -05:00
Nilay Vaish	f3d0be210f	ruby: add support for prefetching to MESI protocol	2012-12-11 10:05:56 -06:00
Nilay Vaish	9b72a0f627	ruby: change slicc to allow for constructor args The patch adds support to slicc for recognizing arguments that should be passed to the constructor of a class. I did not like the fact that an explicit check was being carried on the type 'TBETable' to figure out the arguments to be passed to the constructor. The patch also moves some of the member variables that are declared for all the controllers to the base class AbstractController.	2012-12-11 10:05:55 -06:00
Nilay Vaish	93e283abb3	ruby: add a prefetcher This patch adds a prefetcher for the ruby memory system. The prefetcher is based on a prefetcher implemented by others (well, I don't know who wrote the original). The prefetcher does stride-based prefetching, both unit and non-unit. It observes the misses in the cache and trains on these. After the training period is over, the prefetcher starts issuing prefetch requests to the controller.	2012-12-11 10:05:54 -06:00
Nilay Vaish	d502384795	ruby: add functions for computing next stride/page address	2012-12-11 10:05:53 -06:00
Nilay Vaish	2d6470936c	sim: have a curTick per eventq This patch adds a _curTick variable to an eventq. This variable is updated whenever an event is serviced in function serviceOne(), or all events up to a particular time are processed in function serviceEvents(). This change helps when there are eventqs that do not make use of curTick for scheduling events.	2012-11-16 10:27:47 -06:00
Nilay Vaish	90c45c29fe	ruby: support functional accesses in garnet flexible network	2012-11-10 17:18:01 -06:00
Nilay Vaish	1492ab066d	ruby: bug in functionalRead, revert recent changes Recent changes to functionalRead() in the memory system were not correct. The change allowed for returning data from the first message found in the buffers of the memory system. This is not correct since it is possible that a timing message has data from an older state of the block. The changes are being reverted.	2012-11-10 17:18:00 -06:00
Andreas Hansson	c4b36901d0	mem: Fix DRAM draining to ensure write queue is empty This patch fixes the draining of the SimpleDRAM controller model. The controller performs buffering of writes and normally there is no need to ever empty the write buffer (if you have a fast on-chip memory, then use it). The patch adds checks to ensure the write buffer is drained when the controller is asked to do so.	2012-11-08 04:25:06 -05:00
Hamid Reza Khaleghzadeh, Lluc Alvarez <lluc.alvarez@bsc.es>, Nilay Vaish <nilay@cs.wisc.edu>	8cd475d58e	ruby: reset and dump stats along with reset of the system This patch adds support to ruby so that the statistics maintained by ruby are reset/dumped when the statistics for the rest of the system are reset/dumped. For resetting the statistics, ruby now provides the resetStats() function that a sim object can provide. As a consequence, the clearStats() function has been removed from RubySystem. For dumping stats, Ruby now adds a callback event to the dumpStatsQueue. The exit callback that ruby used to add earlier is being removed. Created by: Hamid Reza Khaleghzadeh. Improved by: Lluc Alvarez, Nilay Vaish Committed by: Nilay Vaish	2012-11-02 12:18:25 -05:00
Ali Saidi	ce5766c409	mem: fix use after free issue in memories until 4-phase work complete.	2012-11-02 11:50:16 -05:00
Andreas Sandberg	ddd6af414c	mem: Add support for writing back and flushing caches This patch adds support for the following optional drain methods in the classical memory system's cache model: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested. Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	b81a977e6a	sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	c0ab52799c	sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce the use of the header attribute, but a warning will be generated for objects that do not use it.	2012-11-02 11:32:01 -05:00
Dam Sunwoo	81406018b0	ARM: dump stats and process info on context switches This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM).	2012-11-02 11:32:01 -05:00
Andreas Hansson	3d98119717	mem: Fix typo in port comments This patch merely fixes a few typos in the port comments.	2012-10-31 09:28:23 -04:00
Andreas Hansson	6f6adbf0f6	dev: Make default clock more reasonable for system and devices This patch changes the default system clock from 1THz to 1GHz. This clock is used by all modules that do not override the default (parent clock), and primarily affects the IO subsystem. Every DMA device uses its clock to schedule the next transfer, and the change will thus cause this inter-transfer delay to be longer. The default clock of the bus is removed, as the clock inherited from the system provides exactly the same value. A follow-on patch will bump the stats.	2012-10-25 13:14:44 -04:00
Nilay Vaish	52d8693677	ruby: functional access updates to network test protocol I had forgotten to change the network test protocol while making changes to ruby for supporting functional accesses. This patch updates the protocol so that it can compile correctly.	2012-10-18 18:35:42 -05:00
Nilay Vaish	5ffc165939	ruby: improved support for functional accesses This patch adds support to different entities in the ruby memory system for more reliable functional read/write accesses. Only the simple network has been augmented as of now. Later on Garnet will also support functional accesses. The patch adds functional access code to all the different types of messages that protocols can send around. These messages are functionally accessed by going through the buffers maintained by the network entities. The patch also rectifies some of the bugs found in coherence protocols while testing the patch. With this patch applied, functional writes always succeed. But functional reads can still fail.	2012-10-15 17:51:57 -05:00
Nilay Vaish	61434a9943	ruby: register multiple memory controllers Currently the Ruby System maintains pointer to only one of the memory controllers. But there can be multiple controllers in the system. This patch adds a vector of memory controllers.	2012-10-15 17:27:17 -05:00
Nilay Vaish	c14e6cfc4e	ruby: remove AbstractMemOrCache The only place where this abstract class is in use is the memory controller, which itself is an abstract class. Does not seem useful at all.	2012-10-15 17:27:16 -05:00
Nilay Vaish	3e607f146f	ruby: allow function definition in slicc structs This patch adds support for function definitions to appear in slicc structs. This is required for supporting functional accesses for different types of messages. Subsequent patches will build on this.	2012-10-15 17:27:16 -05:00
Nilay Vaish	c7b0901b97	ruby banked array: do away with event scheduling It seems unnecessary that the BankedArray class needs to schedule an event to figure out when the access ends. Instead only the time for the end of access needs to be tracked.	2012-10-15 17:27:15 -05:00
Nilay Vaish	6a65fafa52	ruby: reset timing after cache warm up Ruby system was recently converted to a clocked object. Such objects maintain state related to the time that has passed so far. During the cache warmup, Ruby system changes its own time and the global time. Later on, the global time is restored. So Ruby system also needs to reset its own time.	2012-10-15 17:27:15 -05:00
Andreas Hansson	b6bd4f34b4	Mem: Fix incorrect logic in bus blocksize check This patch fixes the logic in the blocksize check such that the warning is printed if the size is not 16, 32, 64 or 128.	2012-10-15 12:51:21 -04:00
Andreas Hansson	2a740aa096	Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocol-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.	2012-10-15 08:12:35 -04:00
Andreas Hansson	9baa35ba80	Mem: Separate the host and guest views of memory backing store This patch moves all the memory backing store operations from the independent memory controllers to the global physical memory. The main reason for this patch is to allow address striping in a future set of patches, but at this point it already provides some useful functionality in that it is now possible to change the number of memory controllers and their address mapping in combination with checkpointing. Thus, the host and guest view of the memory backing store are now completely separate. With this patch, the individual memory controllers are far simpler as all responsibility for serializing/unserializing is moved to the physical memory. Currently, the functionality is more or less moved from AbstractMemory to PhysicalMemory without any major changes. However, in a future patch the physical memory will also resolve any ranges that are interleaved and properly assign the backing store to the memory controllers, and keep the host memory as a single contiguous chunk per address range. Functionality for future extensions which involve CPU virtualization also enable the host to get pointers to the backing store.	2012-10-15 08:12:32 -04:00
Andreas Hansson	0c58106b6e	Mem: Use deque instead of list for bus retries This patch changes the data structure used to keep track of ports that should be told to retry. As the bus is doing this in an FCFS way, there is no point having a list. A deque is a better match (and is at least in theory a better choice from a performance point of view).	2012-10-15 08:12:25 -04:00
Andreas Hansson	93a159875a	Fix: Address a few minor issues identified by cppcheck This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only gets called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used.	2012-10-15 08:12:23 -04:00
Andreas Hansson	88554790c3	Mem: Use cycles to express cache-related latencies This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counted in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.	2012-10-15 08:10:54 -04:00
Andreas Hansson	36d199b9a9	Mem: Use range operations in bus in preparation for striping This patch transitions the bus to use the AddrRange operations instead of directly accessing the start and end. The change facilitates the move to a more elaborate AddrRange class that also supports address striping in the bus by specifying interleaving bits in the ranges. Two new functions are added to the AddrRange to determine if two ranges intersect, and if one is a subset of another. The bus propagation of address ranges is also tweaked such that an update is only propagated if the bus received information from all the downstream slave modules. This avoids the iteration and need for the cycle-breaking scheme that was previously used.	2012-10-15 08:07:04 -04:00
Andreas Hansson	43ca8415e8	Mem: Determine bus block size during initialisation This patch moves the block size computation from findBlockSize to initialisation time, once all the neighbouring ports are connected. There is no need to dynamically update the block size, and the caching of the value effectively avoided that anyhow. This is very similar to what was already in place, just with a slightly leaner implementation.	2012-10-11 06:38:43 -04:00
Nilay Vaish	88ba1c452b	ruby: makes some members non-static This patch makes some of the members (profiler, network, memory vector) of ruby system non-static.	2012-10-02 14:35:45 -05:00
Nilay Vaish	4488379244	ruby: changes to simple network This patch makes the Switch structure inherit from BasicRouter, as is done in two other networks.	2012-10-02 14:35:45 -05:00
Nilay Vaish	b370f6a7b2	ruby: rename template_hack to template I don't like using the word hack. Hence, the patch.	2012-10-02 14:35:44 -05:00
Nilay Vaish	d58f84c481	ruby: remove unused code in protocols	2012-10-02 14:35:44 -05:00
Nilay Vaish	73eafe4849	ruby: remove some unused things in slicc This patch removes the parts of slicc that were required for multi-chip protocols. Going ahead, it seems multi-chip protocols would be implemented by playing with the network itself.	2012-10-02 14:35:43 -05:00
Nilay Vaish	3c9d3b16d8	ruby: move functional access to ruby system This patch moves the code for functional accesses to ruby system. This is because the subsequent patches add support for making functional accesses to the messages in the interconnect. Making those accesses from the ruby port would be cumbersome.	2012-10-02 14:35:42 -05:00
Nilay Vaish	95664da097	MI coherence protocol: add copyright notice	2012-09-30 13:20:53 -05:00
Djordje Kovacevic	80a26a3e39	MEM: Put memory system document into doxygen	2012-09-25 11:49:41 -05:00
Mrinmoy Ghosh	6fc0094337	Cache: add a response latency to the caches In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set for the cache's backward path.	2012-09-25 11:49:41 -05:00
Ali Saidi	396600de10	mem: Add a gasket that allows memory ranges to be re-mapped. For example if DRAM is at two locations and mirrored this patch allows the mirroring to occur.	2012-09-25 11:49:40 -05:00
Joel Hestness	4095af5fd6	RubyPort and Sequencer: Fix draining Fix the drain functionality of the RubyPort to only call drain on child ports during a system-wide drain process, instead of calling each time that a ruby_hit_callback is executed. This fixes the issue of the RubyPort ports being reawakened during the drain simulation, possibly with work they didn't previously have to complete. If they have new work, they may call process on the drain event that they had not registered work for, causing an assertion failure when completing the drain event. Also, in RubyPort, set the drainEvent to NULL when there are no events to be drained. If not set to NULL, the drain loop can result in stale drainEvents used.	2012-09-23 13:57:08 -05:00
Andreas Hansson	3b6a143ec5	DRAM: Introduce SimpleDRAM to capture a high-level controller This patch introduces a high-level model of a DRAM controller, with a basic read/write buffer structure, a selectable and customisable arbiter, a few address mapping options, and the basic DRAM timing constraints. The parameters make it possible to turn this model into any desired DDRx/LPDDRx/WideIOx memory controller. The intention is not to be cycle accurate or capture every aspect of a DDR DRAM interface, but rather to enable exploring of the high-level knobs with a good simulation speed. Thus, contrary to e.g. DRAMSim this module emphasizes simulation speed with a good-enough accuracy. This module is merely a starting point, and there are plenty of additions and improvements to come. A notable addition is the support for address-striping in the bus to enable a multi-channel DRAM controller. Also note that there are still a few "todo's" in the code base that will be addressed as we go along. A follow-up patch will add basic performance regressions that use the traffic generator to exercise a few well-defined corner cases.	2012-09-21 11:48:13 -04:00
Andreas Hansson	4aee3aa073	Mem: Tidy up bus member variables types This patch merely tidies up the types used for the bus member variables. It also makes the constant ones const.	2012-09-21 10:11:24 -04:00
Anthony Gutierrez	9cd0c5ecc8	bus: removed outdated warn regarding 64 B block sizes this warn is outdated as 64 B blocks are very common, and even the default size for some CPU types. E.g., arm_detailed.	2012-09-20 17:25:52 -04:00
Andreas Hansson	a731f8f9dd	Mem: Remove the file parameter from AbstractMemory This patch removes the unused file parameter from the AbstractMemory. The patch serves to make it easier to transition to a separation of the actual contiguous host memory backing store, and the gem5 memory controllers. Without the file parameter it becomes easier to hide the creation of the mmap in the PhysicalMemory, as there are no longer any reasons to expose the actual contiguous ranges to the user. To the best of my knowledge there is no use of the parameter, so the change should not affect anyone.	2012-09-19 06:15:46 -04:00
Andreas Hansson	ffb6aec603	AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. --HG-- rename : src/base/range.hh => src/base/addr_range.hh rename : src/base/range_map.hh => src/base/addr_range_map.hh	2012-09-19 06:15:44 -04:00
Nilay Vaish	33c904e0a5	ruby: eliminate typedef integer_t	2012-09-18 22:49:12 -05:00
Nilay Vaish	86b1c0fd54	ruby: avoid using g_system_ptr for event scheduling This patch removes the use of g_system_ptr for event scheduling. Each consumer object now needs to specify upfront an EventManager object it would use for scheduling events. This makes the ruby memory system more amenable for a multi-threaded simulation.	2012-09-18 22:46:34 -05:00
Andreas Hansson	7c55464aac	Mem: Add a maximum bandwidth to SimpleMemory This patch makes a minor addition to the SimpleMemory by enforcing a maximum data rate. The bandwidth is configurable, and a reasonable value (12.8GB/s) has been chosen as the default. The changes do add some complexity to the SimpleMemory, but they should definitely be justifiable as this enables a far more realistic setup using even this simple memory controller. The rate regulation is done for reads and writes combined to reflect the bidirectional data busses used by most (if not all) relevant memories. Moreover, the regulation is done per packet as opposed to long term, as it is the short term data rate (data bus width times frequency) that is the limiting factor. A follow-up patch bumps the stats for the regressions.	2012-09-18 10:30:02 -04:00
Andreas Hansson	806a1144ce	scons: Use c++0x with gcc >= 4.4 instead of 4.6 This patch shifts the version of gcc for which we enable c++0x from 4.6 to 4.4. The more long term plan is to see what the c++0x features can bring and what level of support would be enabled simply by bumping the required version of gcc from 4.3 to 4.4. A few minor things had to be fixed in the code base, most notably the choice of a hashmap implementation. In the Ruby Sequencer there were also a few minor issues that gcc 4.4 was not too happy about.	2012-09-14 12:13:18 -04:00
Jason Power	aa8bcd15ec	Ruby: Modify Scons so that we can put .sm files in extras Also allows for header files which are required in slicc generated code to be in a directory other than src/mem/ruby/slicc_interface.	2012-09-12 14:52:04 -05:00
Andreas Hansson	292d8252a4	clang: Fix issues identified by the clang static analyzer This patch addresses a few minor issues reported by the clang static analyzer. The analysis was run with: scan-build -disable-checker deadcode \ -enable-checker experimental.core \ -disable-checker experimental.core.CastToStruct \ -enable-checker experimental.cpluscplus	2012-09-11 14:15:47 -04:00
Lena Olson	584eba3ab6	Cache: Split invalidateBlk up to separate block vs. tags This separates the functionality to clear the state in a block into blk.hh and the functionality to update the tag information into the tags. This gets rid of the case where calling invalidateBlk on an already-invalid block does something different than calling it on a valid block, which was confusing.	2012-09-11 14:14:49 -04:00
Nilay Vaish	637c6c7e32	Ruby: Use uint32_t instead of uint32 everywhere	2012-09-11 09:24:45 -05:00
Nilay Vaish	f00347a20f	Ruby: Use uint8_t instead of uint8 everywhere	2012-09-11 09:23:56 -05:00
Nilay Vaish	c5bf1390aa	Ruby System: Convert to Clocked Object This patch moves Ruby System from being a SimObject to recently introduced ClockedObject.	2012-09-10 12:21:01 -05:00
Nilay Vaish	4e6f048ef0	Ruby Slicc: remove the call to cin.get() function If I understand correctly, this was put in place so that a debugger can be attached when the protocol aborts. While this sounds useful, it is a problem when the simulation is not being actively monitored. I think it is better to remove this.	2012-09-10 12:20:34 -05:00
Marco Elver	9e0edbcea8	Mem: Allow serializing of more than INT_MAX bytes Despite gzwrite taking an unsigned for length, it returns an int for bytes written; gzwrite fails if (int)len < 0. Because of this, call gzwrite with len no larger than INT_MAX: write in blocks of INT_MAX if data to be written is larger than INT_MAX.	2012-09-10 11:57:43 -04:00
Andreas Hansson	287ea1a081	Param: Transition to Cycles for relevant parameters This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py.	2012-09-07 12:34:38 -04:00
Joel Hestness	6924e10978	Ruby Memory Controller: Fix clocking	2012-09-05 20:51:41 -05:00
Jason Power	494f6a858e	Ruby: Correct DataBlock =operator The =operator for the DataBlock class was incorrectly interpreting the class member m_alloc. This variable stands for whether the assigned memory for the data block needs to be freed or not by the class itself. It seems that the =operator interpreted the variable as whether the memory is assigned to the data block. This wrong interpretation was causing values not to propagate to RubySystem::m_mem_vec_ptr. This caused major issues with restoring from checkpoints when using a protocol which verified that the cache data was consistent with the backing store (i.e. MOESI-hammer).	2012-08-28 17:57:51 -05:00
Andreas Hansson	0cacf7e817	Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. The changes done as part of this patch mostly follow directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.	2012-08-28 14:30:33 -04:00
Andreas Hansson	d14e5857c7	Port: Stricter port bind/unbind semantics This patch tightens up the semantics around port binding and checks that the ports that are being bound are currently not connected, and similarly connected before unbind is called. The patch consequently also changes the order of the unbind and bind for the switching of CPUs to ensure that the rules are adhered to. Previously the ports would be "over-written" without any check. There are no changes in behaviour due to this patch, and the only place where the unbind functionality is used is in the CPU.	2012-08-28 14:30:27 -04:00
Nilay Vaish	85c7352462	Ruby: remove README.debugging and Decommissioning_note These files were relevant when Ruby was part of GEMS. They are not required any longer.	2012-08-27 14:57:46 -05:00
Nilay Vaish	9190940511	Ruby: Remove RubyEventQueue This patch removes RubyEventQueue. Consumer objects now rely on RubySystem or themselves for scheduling events.	2012-08-27 01:00:55 -05:00
Nilay Vaish	7122b83d8f	Ruby Memory Vector: Allow more than 4GB of memory The memory size variable was a 32-bit int. This meant that the size of the memory was limited to 4GB. This patch changes the type of the variable to 64-bit to support larger memory sizes. Thanks to Raghuraman Balasubramanian for bringing this to notice.	2012-08-27 01:00:54 -05:00
Nilay Vaish	b422994fea	MESI Protocol: Correct the virtual network in profile functions The virtual network in a couple of places was incorrectly mentioned as 3 in place of 1. This is being corrected.	2012-08-25 15:49:06 -05:00
Nilay Vaish	01f1430833	MESI Coherence Protocol: Add copyright notice	2012-08-25 13:16:45 -05:00
Andreas Hansson	c60db56741	Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK from the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occurred in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.	2012-08-22 11:39:59 -04:00
Andreas Hansson	a6074016e2	Bridge: Remove NACKs in the bridge and unify with packet queue This patch removes the NACKing in the bridge, as the split request/response busses now ensure that protocol deadlocks do not occur, i.e. the message-dependency chain is broken by always allowing responses to make progress without being stalled by requests. The NACKs had limited support in the system with most components ignoring their use (with a suitable call to panic), and as the NACKs are no longer needed to avoid protocol deadlocks, the cleanest way is to simply remove them. The bridge is the starting point as this is the only place where the NACKs are created. A follow-up patch will remove the code that deals with NACKs in the endpoints, e.g. the X86 table walker and DMA port. Ultimately the type of packet can be completely removed (until someone sees a need for modelling more complex protocols, which can now be done in parts of the system since the port and interface is split). As a consequence of the NACK removal, the bridge now has to send a retry to a master if the request or response queue was full on the first attempt. This change also makes the bridge ports very similar to QueuedPorts, and a later patch will change the bridge to use these. A first step in this direction is taken by aligning the name of the member functions, as done by this patch. A bit of tidying up has also been done as part of the simplifications. Surprisingly, this patch has no impact on any of the regressions. Hence, there were never any NACKs issued. In a follow-up patch I would suggest changing the size of the bridge buffers set in FSConfig.py to also test the situation where the bridge fills up.	2012-08-22 11:39:58 -04:00

1500 commits