sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Marco Balboni	d35dd71ab4	mem: Add crossbar latencies This patch introduces latencies in crossbar that were neglected before. In particular, it adds three parameters in crossbar model: front_end_latency, forward_latency, and response_latency. Along with these parameters, three corresponding members are added: frontEndLatency, forwardLatency, and responseLatency. The coherent crossbar has an additional snoop_response_latency. The latency of the request path through the xbar is set as --> frontEndLatency + forwardLatency In case the snoop filter is enabled, the request path latency is charged also by look-up latency of the snoop filter. --> frontEndLatency + SF(lookupLatency) + forwardLatency. The latency of the response path through the xbar is set instead as --> responseLatency. In case of snoop response, if the response is treated as a normal response the latency associated is again --> responseLatency; If instead it is forwarded as snoop response we add an additional variable + snoopResponseLatency and the latency associated is --> snoopResponseLatency; Furthermore, this patch lets the crossbar progress on the next clock edge after an unused retry, changing the time the crossbar considers itself busy after sending a retry that was not acted upon.	2015-03-02 04:00:46 -05:00
Andreas Sandberg	7be9d4eb67	dev, arm: Clean up PL011 and rewrite interrupt handling The ARM PL011 UART model didn't clear and raise interrupts correctly. This changeset rewrites the whole interrupt handling and makes it both simpler and fixes several cases where the correct interrupts weren't raised or cleared. Additionally, it cleans up many other aspects of the code.	2015-03-02 04:00:44 -05:00
Andreas Hansson	d64b34bef8	arm: Share a port for the two table walker objects This patch changes how the MMU and table walkers are created such that a single port is used to connect the MMU and the TLBs to the memory system. Previously two ports were needed as there are two table walker objects (stage one and stage two), and they both had a port. Now the port itself is moved to the Stage2MMU, and each TableWalker is simply using the port from the parent. By using the same port we also remove the need for having an additional crossbar joining the two ports before the walker cache or the L2. This simplifies the creation of the CPU cache topology in BaseCPU.py considerably. Moreover, for naming and symmetry reasons, the TLB walker port is connected through the stage-one table walker thus making the naming identical to x86. Along the same line, we use the stage-one table walker to generate the master id that is used by all TLB-related requests.	2015-03-02 04:00:42 -05:00
Giacomo Gabrielli	bd70db5521	arm: Remove unnecessary dependencies between AArch64 FP instructions	2015-03-02 04:00:41 -05:00
Rekai	3d5434022a	cpu: o3 register renaming request handling improved Now, prior to the renaming, the instruction requests the exact amount of registers it will need, and the rename_map decides whether the instruction is allowed to proceed or not.	2015-03-02 04:00:38 -05:00
Andreas Hansson	987de4f5cc	mem: Tidy up the cache debug messages Avoid redundant inclusion of the name in the DPRINTF string.	2015-03-02 04:00:37 -05:00
Andreas Hansson	f26a289295	mem: Split port retry for all different packet classes This patch fixes a long-standing issue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fix the previously seen deadlocks.	2015-03-02 04:00:35 -05:00
Ali Jafri	6ebe8d863a	mem: Fix prefetchSquash + memInhibitAsserted bug This patch resolves a bug with hardware prefetches. Before a hardware prefetch is sent towards the memory, the system generates a snoop request to check all caches above the prefetch generating cache for the presence of the prefetch target. If the prefetch target is found in the tags or the MSHRs of the upper caches, the cache sets the prefetchSquashed flag in the snoop packet. When the snoop packet returns with the prefetchSquashed flag set, the prefetch generating cache deallocates the MSHR reserved for the prefetch. If the prefetch target is found in the writeback buffer of the upper cache, the cache sets the memInhibit flag, which signals the prefetch generating cache to expect the data from the writeback. When the snoop packet returns with the memInhibitAsserted flag set, the prefetch generating cache marks the allocated MSHR as inService and waits for the data from the writeback. If the prefetch target is found in multiple upper level caches, specifically in the tags or MSHRs of one upper level cache and the writeback buffer of another, the snoop packet will return with both prefetchSquashed and memInhibitAsserted set, while the current code is not written to handle such an outcome. The current code checks for the prefetchSquashed flag first and, if it finds the flag, deallocates the reserved MSHR. This leads to an assert failure when the data from the writeback appears at the cache. In this fix, we simply switch the order of checks. We first check for memInhibitAsserted and then for prefetchSquashed.	2015-03-02 04:00:34 -05:00
Stephan Diestelhorst	de46eeade7	cpu: Add a PC-value to the traffic generator requests Have the traffic generator add its masterID as the PC address to the requests. That way, prefetchers (and other components) that use a PC for request classification will see per-tester streams of requests. This enables us to test strided prefetchers with the memchecker, too.	2015-03-02 04:00:31 -05:00
Andreas Hansson	eed0795f3a	tests: Run regression timeout as foreground Allow the user to send signals such as Ctrl-C to the gem5 runs. Note that this assumes coreutils >= 8.13, which aligns with Ubuntu 12.04 and RHEL6.	2015-03-02 04:00:29 -05:00
Andreas Sandberg	3b4ae7debb	arm: Don't truncate 16-bit ASIDs to 8 bits The ISA code sometimes stores 16-bit ASIDs as 8-bit unsigned integers and has a couple of inverted checks that mask out the high 8 bits of an ASID if 16-bit ASIDs have been /enabled/. This changeset fixes both of those issues.	2015-03-02 04:00:28 -05:00
Andreas Sandberg	804b11a3ed	arm: Correctly access the stack pointer in GDB We currently use INTREG_X31 instead of INTREG_SPX when accessing the stack pointer in GDB. gem5 normally uses INTREG_SPX to access the stack pointer, which gets mapped to the stack pointer (INTREG_SPn) corresponding to the current exception level. This changeset updates the GDB interface to use SPX instead of X31 (which is always zero) when transferring CPU state to gdb.	2015-03-02 04:00:27 -05:00
Andreas Sandberg	34dcd90b61	arm: Fix broken page table permissions checks in remote GDB The remote GDB interface currently doesn't check if translations are valid before reading memory. This causes a panic when GDB tries to access unmapped memory (e.g., when getting a stack trace). There are two reasons for this: 1) The function used to check for valid translations (virtvalid()) doesn't work and panics on invalid translations. 2) The method in the GDB interface used to test if a translation is valid (RemoteGDB::acc) always returns true regardless of the return from virtvalid(). This changeset fixes both of these issues.	2015-03-02 04:00:27 -05:00
Jason Power	670f44e05e	Ruby: Update backing store option to propagate through to all RubyPorts Previously, the user would have to manually set access_backing_store=True on all RubyPorts (Sequencers) in the config files. Now, instead there is one global option that each RubyPort checks on initialization. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-02-26 09:58:26 -06:00
Andreas Hansson	f18d2120fa	config: Add memcheck stress test This is a rather unfortunate copy of the memtest.py example script, that actually stresses the system with true sharing as opposed to the false sharing of the MemTest. To do so it uses TrafficGen instances to generate the reads/writes, and MemCheckerMonitor combined with the MemChecker to check the validity of the read/written values. As a bonus, this script also enables the addition of prefetchers, and the traffic is created to have a mix of random addresses and linear strides. We use the TaggedPrefetcher since the packets do not have a request with a PC. At the moment the code is almost identical to the memtest.py script, and no effort has been made to factor out the construction of the tree. The challenge is that the instantiation and connection of the testers and monitors is done as part of the tree building.	2015-02-16 03:35:23 -05:00
Andreas Hansson	8c78aa31ea	cpu: TrafficGen sinks snoops without complaining To be able to use the TrafficGen in a system with caches we need to allow it to sink incoming snoop requests. By default the master port panics, so silently ignore any snoops.	2015-02-16 03:34:55 -05:00
Stephan Diestelhorst	93fa8e3cd4	mem: Fix initial value problem with MemChecker In highly loaded cases, reads might actually overlap with writes to the initial memory state. The mem checker needs to detect such cases and permit the read to return either a value from the writes (what it is doing now) or the initial, unknown value. This patch adds this logic.	2015-02-16 03:34:47 -05:00
Andreas Hansson	661dac1598	dev: Fix undefined behaviour in i8254xGBe This patch fixes a rather unfortunate oversight where the annotation pointer was used even though it is null. Somehow the code still works, but UBSan is rather unhappy. The use is now guarded, and the variable is initialised in the constructor (as well as init()).	2015-02-16 03:34:35 -05:00
Andreas Sandberg	0a2ee77616	arm: Wire up the GIC with the platform in the base class Move the (common) GIC initialization code that notifies the platform code of the new GIC to the base class (BaseGic) instead of the Pl390 implementation.	2015-02-16 03:34:18 -05:00
Andreas Hansson	e17328a227	mem: mmap the backing store with MAP_NORESERVE This patch ensures we can run simulations with very large simulated memories (at least 64 TB based on some quick runs on a Linux workstation). In essence this allows us to efficiently deal with sparse address maps without having to implement a redirection layer in the backing store. This opens up for run-time errors if we eventually exhaust the host's memory and swap space, but this should hopefully never happen.	2015-02-16 03:33:47 -05:00
Andreas Hansson	57758ca685	mem: Use the range cache for lookup as well as access This patch changes the range cache used in the global physical memory to be an iterator so that we can use it not only as part of isMemAddr, but also access and functionalAccess. This matches use-cases where a core is using the atomic non-caching memory mode, and repeatedly calls isMemAddr and access. Linux boot on aarch32, with a single atomic CPU, is now more than 30% faster when using "--fastmem" compared to not using the direct memory access.	2015-02-16 03:33:37 -05:00
Andreas Hansson	d0e1b8a19c	arch: Make readMiscRegNoEffect const throughout Finally took the plunge and made this apply to all ISAs, not just ARM.	2015-02-16 03:33:28 -05:00
Curtis Dunham	07ce60bdfa	config: add --root-device machine parameter In case /dev/sda1 is not actually the boot partition for an image, we can override it on the command line or in a benchmark definition.	2015-01-16 14:12:03 -06:00
Andreas Sandberg	5bfa7e3d59	arm: Merge ISA files with pseudo instructions This changeset moves the pseudo instructions used to signal unknown instructions and unimplemented instructions to the same source files as the decoder fault.	2015-02-16 03:32:58 -05:00
Ali Saidi	4eff4fa12e	cpu: add support for outputting a protobuf-formatted CPU trace Doesn't support x86 due to static instruction representation. --HG-- rename : src/cpu/CPUTracers.py => src/cpu/InstPBTrace.py	2015-02-16 03:32:38 -05:00
Marco Balboni	268d9e59c5	mem: Clarification of packet crossbar timings This patch clarifies the packet timings annotated when going through a crossbar. The old 'firstWordDelay' is replaced by 'headerDelay' that represents the delay associated with the delivery of the header of the packet. The old 'lastWordDelay' is replaced by 'payloadDelay' that represents the delay needed to process the payload of the packet. For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay.	2015-02-11 10:23:47 -05:00
Marco Balboni	e2828587b3	mem: Clarify usage of latency in the cache This patch adds some much-needed clarity in the specification of the cache timing. For now, hit_latency and response_latency are kept as top-level parameters, but the cache itself has a number of local variables to better map the individual timing variables to different behaviours (and sub-components). The introduced variables are: - lookupLatency: latency of tag lookup, occurring on any access - forwardLatency: latency that occurs in case of outbound miss - fillLatency: latency to fill a cache block. We keep the existing responseLatency. The forwardLatency is used by allocateInternalBuffer() for: - MSHR allocateWriteBuffer (uncached write forwarded to WriteBuffer); - MSHR allocateMissBuffer (cacheable miss in MSHR queue); - MSHR allocateUncachedReadBuffer (uncached read allocated in MSHR queue). It is our assumption that the time for the above three buffers is the same. Similarly, for snoop responses passing through the cache we use forwardLatency.	2015-02-11 10:23:36 -05:00
Andreas Sandberg	5a573762d0	style: Fix broken m5format command The m5format command didn't actually work due to parameter handling issues and missing language detection. This changeset fixes those issues and cleans up some of the code shared between the style checker and the format checker.	2015-02-11 10:23:34 -05:00
Andreas Sandberg	267443fa22	style: Fix incorrect style checker option name The style checker used to support the option -w to automatically fix white space issues. However, this option was actually wired up to fix all style issues the checker encountered. This changeset cleans up the code that handles automatic fixing and adds an option to fix all issues, and separate options for white spaces and include ordering.	2015-02-11 10:23:33 -05:00
Andreas Hansson	9738f34411	config: Revamp memtest to allow testers on any level This patch revamps the memtest example script and allows for the insertion of testers at any level in the cache hierarchy. Previously all created topologies placed testers only at the very top, and the tree was thus entirely symmetric. With the changes made, it is possible to not only place testers at the leaf caches (L1), but also to connect testers at the L2, L3 etc. As part of the changes the object hierarchy is also simplified to ensure that the visual representation from the DOT printing looks sensible. Using SubSystems to group the objects is one of the key features.	2015-02-11 10:23:31 -05:00
Andreas Hansson	acf5a4a3da	stats: Bump the MemTest regression stats Reflect changes in the tester behaviour.	2015-02-11 10:23:31 -05:00
Andreas Hansson	6563ec8634	cpu: Tidy up the MemTest and make false sharing more obvious The MemTest class really only tests false sharing, and as such there was a lot of old cruft that could be removed. This patch cleans up the tester, and also makes it more clear what the assumptions are. As part of this simplification the reference functional memory is also removed. The regression configs using MemTest are updated to reflect the changes, and the stats will be bumped in a separate patch. The example config will be updated in a separate patch due to more extensive re-work. In a follow-on patch a new tester will be introduced that uses the MemChecker to implement true sharing.	2015-02-11 10:23:28 -05:00
Andreas Sandberg	550c318490	sim: Move the BaseTLB to src/arch/generic/ The TLB-related code is generally architecture dependent and should live in the arch directory to signify that. --HG-- rename : src/sim/BaseTLB.py => src/arch/generic/BaseTLB.py rename : src/sim/tlb.cc => src/arch/generic/tlb.cc rename : src/sim/tlb.hh => src/arch/generic/tlb.hh	2015-02-11 10:23:27 -05:00
Andreas Sandberg	9e6f803254	base: Add compiler macros to add deprecation warnings Gcc and clang both provide an attribute that can be used to flag a function as deprecated at compile time. This changeset adds a gem5 compiler macro for that compiler feature. The macro can be used to indicate that a legacy API within gem5 has been deprecated and provide a graceful migration to the new API.	2015-02-11 10:23:24 -05:00
Andreas Hansson	c9b8616c51	base: Do not dereference NULL in CompoundFlag creation This patch fixes the CompoundFlag constructor, ensuring that it does not dereference NULL. Doing so has undefined behaviour, and both clang's and gcc's undefined-behaviour sanitisers were rather unhappy.	2015-02-11 10:23:23 -05:00
Andreas Sandberg	431a6d708b	dev: Remove unused system pointer in the Platform base class The Platform base class contains a pointer to an instance of the System which is never initialized. This can lead to subtle bugs since some architecture-specific platform implementations contain their own system pointer which is normally used. However, if the platform is accessed through a pointer to its base class, the dangling pointer will be used instead.	2015-02-11 10:23:22 -05:00
Alexandru Dutu	ad1b177550	cpu: Idle CPU status logic revised This patch sets the CPU status to idle when the last active thread gets suspended.	2015-02-06 18:01:22 -08:00
Steve Reinhardt	774922895b	config: rename 'file' var Rename uses of 'file' as a local variable to avoid conflict with the built-in type of the same name.	2015-02-05 16:45:12 -08:00
Steve Reinhardt	634d923751	config: make M5_PATH a real search path Although you can put a list of colon-separated directory names in M5_PATH, the current code just takes the first one that exists and assumes all files must live there. This change makes the code search the specified list of directories for each individual binary or disk image that's requested. The main motivation is that the x86/Alpha binaries and the ARM binaries are in separate downloads, and thus naturally end up in separate directories. With this change, you can have M5_PATH point to those two directories, then run any FS regression test without changing M5_PATH. Currently, you either have to merge the two download directories or change M5_PATH (or do something else I haven't figured out).	2015-02-05 16:45:06 -08:00
Andreas Hansson	461a80beb3	mem: Clarify express snoop behaviour This patch adds a bit of documentation with insights around how express snoops really work.	2015-02-03 14:26:02 -05:00
Andreas Hansson	193325ff60	mem: Clarify cache behaviour for pending dirty responses This patch adds a bit of clarification around the assumptions made in the cache when packets are sent out, and dirty responses are pending. As part of the change, the marking of an MSHR as in service is simplified slightly, and comments are added to explain what assumptions are made.	2015-02-03 14:25:59 -05:00
Curtis Dunham	f0a764edc6	base: add an accessor and operators ==,!= to address ranges	2015-02-03 14:25:58 -05:00
Andreas Hansson	28a7cea2b3	config: Add XOR hashing to the DRAM channel interleaving This patch uses the recently added XOR hashing capabilities for the DRAM channel interleaving. This avoids channel biasing due to strided access patterns.	2015-02-03 14:25:55 -05:00
Andreas Hansson	ccb512ecc1	base: Add XOR-based hashed address interleaving This patch extends the current address interleaving with basic hashing support. Instead of directly comparing a number of address bits with a matching value, it is now possible to use two independent sets of address bits XOR'ed together. This avoids issues where strided address patterns are heavily biased to a subset of the interleaved ranges.	2015-02-03 14:25:54 -05:00
Andreas Hansson	5ea60a95b3	config: Adjust DRAM channel interleaving defaults This patch changes the DRAM channel interleaving default behaviour to be more representative. The default address mapping (RoRaBaCoCh) moves the channel bits towards the least significant bits, and uses 128 bytes as the default channel interleaving granularity. These defaults can be overridden if desired, but should serve as a sensible starting point for most use-cases.	2015-02-03 14:25:52 -05:00
Andreas Sandberg	9aad5b4569	style: Update the style checker to handle new include order As of August 2014, the gem5 style guide mandates that a source file's primary header is included first in that source file. This helps to ensure that the header file does not depend on include file ordering and avoids surprises down the road when someone tries to reuse code. In the new order, include files are grouped into the following blocks: * Primary header file (e.g., foo.hh for foo.cc) * Python headers * C system/stdlib includes * C++ stdlib includes * Include files in the gem5 source tree Just like before, include files within a block are required to be sorted in alphabetical order. This changeset updates the style checker to enforce the new order.	2015-02-03 14:25:50 -05:00
Andreas Sandberg	fe200c2487	sim: Remove test for non-NULL this in Event The method Event::initialized() tests if this != NULL as a part of the expression that tests if an event is initialized. The only case when this check could be false is if the method is called on a null pointer, which is illegal and leads to undefined behavior (such as eating your pets) according to the C++ standard. Because of this, modern compilers (specifically, recent versions of clang) warn about this which we treat as an error. This changeset removes the redundant check to fix said warning.	2015-02-03 14:25:48 -05:00
Andreas Sandberg	851b29ad20	dev: Correctly clear interrupts in VirtIO PCI Correctly clear the PCI interrupt belonging to a VirtIO device when the ISR register is read.	2015-02-03 14:25:47 -05:00
Andreas Hansson	b34b55b597	scons: Avoid implicit command dependencies Work around a bug in scons that causes the param wrappers to be compiled twice. The easiest way for us to do so is to tell scons to ignore implicit command dependencies.	2015-02-03 14:25:43 -05:00
Curtis Dunham	b89fd57663	sim: prioritize async events; prevent starvation If a time quantum event is the only one in the queue, async events (Ctrl-C, I/O, etc.) will never be processed. So process them first.	2014-12-19 15:32:34 -06:00
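
The XOR-based interleaving described in commit ccb512ecc1 can be illustrated with a small sketch. This is not the gem5 implementation: the function names, the bit positions (7 and 13), and the four-channel setup are assumptions chosen for illustration, with only the 128-byte granularity taken from commit 5ea60a95b3.

```python
# Hypothetical sketch of channel selection, not gem5's AddrRange code.
# Assumed: 4 channels, 128-byte interleaving granularity (bit 7 upward),
# and a second, independent bit field starting at bit 13.

def channel_plain(addr, granularity=128, channels=4):
    """Select a channel by directly using the address bits above the granularity."""
    return (addr // granularity) % channels


def channel_xor(addr, low_shift=7, high_shift=13, bits=2):
    """Select a channel by XOR-ing two independent sets of address bits."""
    mask = (1 << bits) - 1
    low = (addr >> low_shift) & mask    # the plain interleaving bits
    high = (addr >> high_shift) & mask  # the extra bits used for hashing
    return low ^ high


if __name__ == "__main__":
    # A 512-byte stride always hits the same plain channel.
    strided = [i * 512 for i in range(64)]
    print(sorted({channel_plain(a) for a in strided}))
    print(sorted({channel_xor(a) for a in strided}))
```

With the assumed parameters, the 512-byte stride maps every address to channel 0 under plain interleaving, while the XOR variant spreads the same addresses over all four channels.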

10720 commits
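
The search-path behaviour described in commit 634d923751 (M5_PATH as a real search path) can be sketched as follows. This is a minimal illustration, not the code from the gem5 config scripts: the function name `find_in_search_path` and the `subdir` argument are hypothetical, and the lookup simply tries each listed directory in order for every requested file.

```python
import os


def find_in_search_path(filename, search_path, subdir=""):
    """Return the first existing match for filename in a separator-delimited path.

    Hypothetical helper: each directory in search_path is tried in order, so
    different files may be found in different directories.
    """
    for directory in search_path.split(os.pathsep):  # ':' on POSIX systems
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, subdir, filename)
        if os.path.isfile(candidate):
            return candidate
    raise IOError("Can't find '%s' in path '%s'" % (filename, search_path))


if __name__ == "__main__":
    # Usage sketch: look up a file relative to every entry in M5_PATH.
    path = os.environ.get("M5_PATH", "")
    try:
        print(find_in_search_path("example.img", path, "disks"))
    except IOError as err:
        print(err)
```

Searching per file rather than committing to the first existing directory is what lets separately downloaded binary sets live in separate directories.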