sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Nathanael Premillieu	e6a6d6445b	arm: Add secure flag to TableWalker request when needed	2015-10-29 08:48:26 -04:00
Victor Garcia	8427d05daa	kvm, arm: Fix compilation errors due to API changes The checkpoint changes, along with the SMT patches have changed a number of APIs. Adapt the ArmKvmCPU accordingly.	2015-10-29 08:48:23 -04:00
Boris Shingarov	58cb57bacc	power: Implement Remote GDB	2015-10-25 16:01:52 -07:00
Andreas Hansson	b48ed9b6c2	x86: Add missing explicit overrides for X86 devices Make clang >= 3.5 happy when compiling build/X86/gem5.opt on OSX.	2015-10-23 09:51:12 -04:00
Andreas Hansson	2ac04c11ac	misc: Add explicit overrides and fix other clang >= 3.5 issues This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication. As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables).	2015-10-12 04:08:01 -04:00
Andreas Hansson	22c04190c6	misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions.	2015-10-12 04:07:59 -04:00
Rekai Gonzalez Alberquilla	d3d159749a	isa: Add parameter to pick different decoder inside ISA The decoder is responsible for splitting instructions in micro operations (uops). Given that different micro architectures may split operations differently, this patch allows to specify which micro architecture each isa implements, so different cores in the system can split instructions differently, also decoupling uop splitting (microArch) from ISA (Arch). This is done making the decodification calls templates that receive a type 'DecoderFlavour' that maps the name of the operation to the class that implements it. This way there is only one selection point (converting the command line enum to the appropriate DecodeFeatures object). In addition, there is no explicit code replication: template instantiation hides that, and the compiler should be able to resolve a number of things at compile-time.	2015-10-09 14:50:54 -05:00
Steve Reinhardt	90c279e4b1	arch: clean up isa_parser error handling Although some decent error messages were getting generated inside isa_parser.py, they weren't always getting printed because of the screwy way we were handling exceptions. (Basically an inner exception would get hidden by an outer exception, and the more informative inner error message would not get printed.) Also line numbers were messed up, since they were taken from the lexer, which is typically a token (or more) ahead of the grammar rule that's being matched. Using the 'lineno' attribute that PLY associates with the grammar production is more accurate. The new LineTracker class extends lineno to track filenames as well as line numbers.	2015-10-06 17:26:50 -07:00
Steve Reinhardt	a2c875c746	x86: implement rcpps and rcpss SSE insts These are packed single-precision approximate reciprocal operations, vector and scalar versions, respectively. This code was basically developed by copying the code for sqrtps and sqrtss. The mrcp micro-op was simplified relative to msqrt since there are no double-precision versions of this operation.	2015-10-06 17:26:50 -07:00
Steve Reinhardt	57b9f53afa	x86: implement fild, fucomi, and fucomip x87 insts fild loads an integer value into the x87 top of stack register. fucomi/fucomip compare two x87 register values (the latter also doing a stack pop). These instructions are used by some versions of GNU libstdc++.	2015-10-06 17:26:50 -07:00
Mitch Hayenga	ccf4f6c3d7	arm: Change TLB Software Caching In ARM, certain variables are only updated when a necessary change is detected. Having 2 SMT threads share a TLB resulted in these not being updated as required. This patch adds a thread context identifer to assist in the invalidation of these variables.	2015-09-30 11:14:19 -05:00
Mitch Hayenga	9e07a7504c	cpu,isa,mem: Add per-thread wakeup logic Changes wakeup functionality so that only specific threads on SMT capable cpus are woken.	2015-09-30 11:14:19 -05:00
Mitch Hayenga	a5c4eb3de9	isa,cpu: Add support for FS SMT Interrupts Adds per-thread interrupt controllers and thread/context logic so that interrupts properly get routed in SMT systems.	2015-09-30 11:14:19 -05:00
Mitch Hayenga	e255fa053f	arm: SMT MPIDR Setting Changes assignment of the MPIDR for multi-threaded systems only.	2015-09-30 11:14:19 -05:00
Palle Lyckegaard	3de9def6c1	sparc: writing to tick_cmpr should not cause a panic This register is writable according to UA2005 Tried to boot NetBSD which starts the kernel by writing to the tick_cmpr register. Without the patch gem5 crashes with a panic. With the patch NetBSD starts to boot normally (although sun4v support in NetBSD is not complete yet) Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-09-15 08:14:07 -05:00
Andreas Hansson	6eb434c8a2	arm, mem: Remove unused CLEAR_LL request flag Cleaning up dead code. The CLREX stores zero directly to MISCREG_LOCKFLAG and so the request flag is no longer needed. The corresponding functionality in the cache tags is also removed.	2015-08-21 07:03:25 -04:00
Andreas Hansson	ae06e9a5c6	cpu: Move invldPid constant from Request to BaseCPU A more natural home for this constant.	2015-08-21 07:03:14 -04:00
David Hashe	a2d9aae3c3	x86: x86 instruction-implementation bug fixes Added explicit data sizes and an opcode type for correct execution.	2015-07-20 09:15:18 -05:00
David Hashe	d0f6aad3c6	syscall: Add readlink to x86 with special case /proc/self/exe This patch implements the correct behavior.	2015-07-20 09:15:18 -05:00
Nilay Vaish	aafa5c3f86	revert 5af8f40d8f2c	2015-07-28 01:58:04 -05:00
Nilay Vaish	608641e23c	cpu: implements vector registers This adds a vector register type. The type is defined as a std::array of a fixed number of uint64_ts. The isa_parser.py has been modified to parse vector register operands and generate the required code. Different cpus have vector register files now.	2015-07-26 10:21:20 -05:00
Nilay Vaish	0ef3dcc27b	x86: decode instructions with vex prefix This patch updates the x86 decoder so that it can decode instructions with vex prefix. It also updates the isa with opcodes from vex opcode maps 1, 2 and 3. Note that none of the instructions have been implemented yet. The implementations would be provided in due course of time.	2015-07-17 11:31:22 -05:00
Andreas Sandberg	ed38e3432c	sim: Refactor and simplify the drain API The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining. This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error. Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects.	2015-07-07 09:51:05 +01:00
Andreas Sandberg	f16c0a4a90	sim: Decouple draining from the SimObject hierarchy Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this. This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object. While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining. A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains.	2015-07-07 09:51:05 +01:00
Andreas Sandberg	e9c3d59aae	sim: Make the drain state a global typed enum The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario.	2015-07-07 09:51:04 +01:00
Andreas Sandberg	76cd4393c0	sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.	2015-07-07 09:51:03 +01:00
Nikos Nikoleris	67925a8334	x86: Adjust the size of the values written to the x87 misc registers All x87 misc registers are implemented in an array of 64 bit values but in real hardware the size of some of these registers is smaller. Previsouly all 64 bits where incorrectly set and then later read. To ensure correctness we mask the value in setMiscRegNoEffect to write only the valid bits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-07-04 10:43:47 -05:00
Andreas Sandberg	d541038549	arm: Cleanup arch headers to remove dma_device.hh dependency Break the dependency on dma_device.hh by forward-declaring DmaPort in the relevant header.	2015-06-21 20:48:33 +01:00
Rune Holm	eb3ed11794	arm: Delete debug print in initialization of hardware thread There seems to have been a debug print left in when the original ARMv8 support was merged in. This printout is performed every time you initialize a hardware thread, and it prints raw pointers, so it always causes diffs in the regression. This patch removes the debug print.	2015-06-09 09:21:16 -04:00
Rune Holm	f4311d3932	arm: Fix typo in ldrsh instruction name ldrsh was typoed as hdrsh, which is a bit annoying when printing instructions. This patch fixes it.	2015-06-09 09:21:15 -04:00
Ruslan Bukin ext:(%2C%20Zhang%20Guoye)	736d3314bf	arch: fix build under MacOSX put O_DIRECT under ifdefs -- this fixes build for MacOSX. Also use correct class for arm64 openFlagTable. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-06-07 14:02:40 -05:00
Andreas Sandberg	7c4eb3b4d8	kvm, arm: Add support for aarch64 This changeset adds support for aarch64 in kvm. The CPU module supports both checkpointing and online CPU model switching as long as no devices are simulated by the host kernel. It currently has the following limitations: * The system register based generic timer can only be simulated by the host kernel. Workaround: Use a memory mapped timer instead to simulate the timer in gem5. * Simulating devices (e.g., the generic timer) in the host kernel requires that the host kernel also simulates the GIC. * ID registers in the host and in gem5 must match for switching between simulated CPUs and KVM. This is particularly important for ID registers describing memory system capabilities (e.g., ASID size, physical address size). * Switching between a virtualized CPU and a simulated CPU is currently not supported if in-kernel device emulation is used. This could be worked around by adding support for switching to the gem5 (e.g., the KvmGic) side of the device models. A simpler workaround is to avoid in-kernel device models altogether.	2015-06-01 19:44:19 +01:00
Andreas Sandberg	dbfd6effe0	kvm, arm, dev: Add an in-kernel GIC implementation This changeset adds a GIC implementation that uses the kernel's built-in support for simulating the interrupt controller. Since there is currently no support for state transfer between gem5 and the kernel, the device model does not support serialization and CPU switching (which would require switching to a gem5-simulated GIC).	2015-06-01 19:44:17 +01:00
Andreas Sandberg	06cf5cc60b	kvm, arm: Move ARM-specific files to arch/arm/kvm/ This changeset moves the ARM-specific KVM CPU implementation to arch/arm/kvm/. This change is expected to keep the source tree somewhat cleaner as we start adding support for ARMv8 and KVM in-kernel interrupt controller simulation. --HG-- rename : src/cpu/kvm/ArmKvmCPU.py => src/arch/arm/kvm/ArmKvmCPU.py rename : src/cpu/kvm/arm_cpu.cc => src/arch/arm/kvm/arm_cpu.cc rename : src/cpu/kvm/arm_cpu.hh => src/arch/arm/kvm/arm_cpu.hh	2015-06-01 19:43:40 +01:00
Curtis Dunham	e590f0d1ef	arm: implement the CONTEXTIDR_EL2 system reg.	2015-05-26 03:21:45 -04:00
Nathanael Premillieu	31fd18ab15	arm: Make address translation faster with better caching This patch adds better caching of the sys regs for AArch64, thus avoiding unnecessary calls to tc->readMiscReg(MISCREG_CPSR) in the non-faulting case.	2015-05-26 03:21:42 -04:00
Giacomo Gabrielli	cc2346e8ca	arm: Implement some missing syscalls (SE mode) Adding a few syscalls that were previously considered unimplemented.	2015-05-26 03:21:35 -04:00
Andreas Sandberg	6533f2000b	arm: Get rid of pointless have_generic_timer param The ArmSystem class has a parameter to indicate whether it is configured to use the generic timer extension or not. This parameter doesn't affect any feature flags in the current implementation and is therefore completely unnecessary. In fact, we usually don't set it even if a system has a generic timer. If we ever need to check if there is a generic timer present, we should just request a pointer and check if it is non-null instead.	2015-05-23 13:46:54 +01:00
Andreas Sandberg	2278fec1d1	dev, arm: Add virtual timers to the generic timer model The generic timer model currently does not support virtual counters. Virtual and physical counters both tick with the same frequency. However, virtual timers allow a hypervisor to set an offset that is subtracted from the counter when it is read. This enables the hypervisor to present a time base that ticks with virtual time in the VM (i.e., doesn't tick when the VM isn't running). Modern Linux kernels generally assume that virtual counters exist and try to use them by default.	2015-05-23 13:46:53 +01:00
Andreas Sandberg	65f3f097d3	dev, arm: Refactor and clean up the generic timer model This changeset cleans up the generic timer a bit and moves most of the register juggling from the ISA code into a separate class in the same source file as the rest of the generic timer. It also removes the assumption that there is always 8 or fewer CPUs in the system. Instead of having a fixed limit, we now instantiate per-core timers as they are requested. This is all in preparation for other patches that add support for virtual timers and a memory mapped interface.	2015-05-23 13:46:52 +01:00
Andreas Hansson	99d3fa5945	arm: Identify table-walker requests This patch ensures all page-table walks are flagged as such.	2015-05-15 13:40:01 -04:00
Andreas Hansson	bd583d00f9	misc: Appease gcc 5.1 Three minor issues are resolved: 1. Apparently gcc 5.1 does not like negation of booleans followed by bitwise AND. 2. Somehow the compiler also gets confused and warns about NoopMachInst being unused (removing it causes compilation errors though). Most likely a compiler bug. 3. There seems to be a number of instances where loop unrolling causes false positives for the array-bounds check. For now, switch to std::array. Potentially we could disable the warning for newer gcc versions, but switching to std::array is probably a good move in any case.	2015-05-15 13:39:53 -04:00
Steve Reinhardt	c65fa3dceb	syscall_emul: fix warn_once behavior The current ignoreWarnOnceFunc doesn't really work as expected, since it will only generate one warning total, for whichever "warn-once" syscall is invoked first. This patch fixes that behavior by keeping a "warned" flag in the SyscallDesc object, allowing suitably flagged syscalls to warn exactly once per syscall.	2015-05-05 09:25:59 -07:00
Andreas Hansson	f349592071	arm: Add missing FPEXC.EN check Add a missing check to ensure that exceptions are generated properly.	2015-05-05 03:22:45 -04:00
Giacomo Gabrielli	a3f23894eb	arm: enable DCZVA by default in SE mode	2015-05-05 03:22:42 -04:00
Andreas Sandberg	706597f021	arm: Relax ordering for some uncacheable accesses We currently assume that all uncacheable memory accesses are strictly ordered. Instead of always enforcing strict ordering, we now only enforce it if the required memory type is device memory or strongly ordered memory.	2015-05-05 03:22:34 -04:00
Andreas Sandberg	48281375ee	mem, cpu: Add a separate flag for strictly ordered memory The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models. This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations: * Speculation: CPU models are not allowed to issue speculative loads. * Write combining: CPU models and caches are not allowed to merge writes to the same cache line. Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior.	2015-05-05 03:22:33 -04:00
Andreas Sandberg	1da634ace0	mem, alpha: Move Alpha-specific request flags Move Alpha-specific memory request flags to an architecture-specific header and map them to the architecture specific flag bit range.	2015-05-05 03:22:31 -04:00
Andreas Hansson	23b9792681	arm: Remove unnecessary boot uncachability With the recent patches addressing how we deal with uncacheable accesses there is no longer need for the work arounds put in place to enforce certain sections of memory to be uncacheable during boot.	2015-05-05 03:22:30 -04:00
Andreas Hansson	554ddc7c07	arch, cpu: Do not forward snoops to table walker This patch simplifies the overall CPU by changing the TLB caches such that they do not forward snoops to the table walker port(s). Note that only ARM and X86 are affected. There is no reason for the ports to snoop as they do not actually take any action, and from a performance point of view we are better of not snooping more than we have to. Should it at a later point be required to snoop for a particular TLB design it is easy enough to add it back.	2015-05-05 03:22:27 -04:00
Ruslan Bukin	81f3211149	arch, base, dev, kern, sym: FreeBSD support This adds support for FreeBSD/aarch64 FS and SE mode (basic set of syscalls only) Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-29 22:35:23 -05:00
Nilay Vaish	ee06fed656	x86: change divide-by-zero fault to divide-error Same exception is raised whether division with zero is performed or the quotient is greater than the maximum value that the provided space can hold. Divide-by-Zero is the AMD terminology, while Divide-Error is Intel's.	2015-04-29 22:35:22 -05:00
Andreas Hansson	179787f31f	misc: Appease gcc 5.1 without moving GDB_REG_BYTES This patch rolls back the move of the GDB_REG_BYTES constant, and instead adds M5_VAR_USED.	2015-04-24 03:30:08 -04:00
Andreas Hansson	c8c4f66889	misc: Appease gcc 5.1 This patch fixes a few small issues to ensure gem5 compiles when using gcc 5.1. First, the GDB_REG_BYTES in the RemoteGDB header are, rather surprisingly, flagged as unused for both ARM and X86. Removing them, however, causes compilation errors as they are actually used in the source file. Moving the constant into the class definition fixes the issue. Possibly a gcc bug. Second, we have an unused EthPktData constructor using auto_ptr, and the latter is deprecated. Since the code is never used it is simply removed.	2015-04-23 13:37:46 -04:00
Brandon Potter	4991c29965	syscall_emul: implement clock_gettime system call	2015-04-22 07:51:27 -07:00
Monir Mozumder	00e3cab8fc	syscall_emul: update x86 syscall table Update table with additional definitions through Linux 3.13.	2015-04-22 07:51:27 -07:00
Nilay Vaish	e596e52498	x86: implements x87 mult/div instructions	2015-04-13 17:33:57 -05:00
Lena Olson	333988a73e	x86: fix debug trace output for mwait When running with the Exec flag, the mwait instruction attempted to print out its source registers, which were never actually initialized. This led to sporadic assertion failures when the value stored there was invalid. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-03 11:42:10 -05:00
Steve Reinhardt	6677b9122a	mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW Makes x86-style locked operations even more distinct from LLSC operations. Using "locked" by itself should be obviously ambiguous now.	2015-03-23 16:14:20 -07:00
Andreas Hansson	d64b34bef8	arm: Share a port for the two table walker objects This patch changes how the MMU and table walkers are created such that a single port is used to connect the MMU and the TLBs to the memory system. Previously two ports were needed as there are two table walker objects (stage one and stage two), and they both had a port. Now the port itself is moved to the Stage2MMU, and each TableWalker is simply using the port from the parent. By using the same port we also remove the need for having an additional crossbar joining the two ports before the walker cache or the L2. This simplifies the creation of the CPU cache topology in BaseCPU.py considerably. Moreover, for naming and symmetry reasons, the TLB walker port is connected through the stage-one table walker thus making the naming identical to x86. Along the same line, we use the stage-one table walker to generate the master id that is used by all TLB-related requests.	2015-03-02 04:00:42 -05:00
Giacomo Gabrielli	bd70db5521	arm: Remove unnecessary dependencies between AArch64 FP instructions	2015-03-02 04:00:41 -05:00
Andreas Hansson	f26a289295	mem: Split port retry for all different packet classes This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks.	2015-03-02 04:00:35 -05:00
Andreas Sandberg	3b4ae7debb	arm: Don't truncate 16-bit ASIDs to 8 bits The ISA code sometimes stores 16-bit ASIDs as 8-bit unsigned integers and has a couple of inverted checks that mask out the high 8 bits of an ASID if 16-bit ASIDs have been /enabled/. This changeset fixes both of those issues.	2015-03-02 04:00:28 -05:00
Andreas Sandberg	804b11a3ed	arm: Correctly access the stack pointer in GDB We curently use INTREG_X31 instead of INTREG_SPX when accessing the stack pointer in GDB. gem5 normally uses INTREG_SPX to access the stack pointer, which gets mapped to the stack pointer corresponding (INTREG_SPn) to the current exception level. This changeset updates the GDB interface to use SPX instead of X31 (which is always zero) when transfering CPU state to gdb.	2015-03-02 04:00:27 -05:00
Andreas Sandberg	34dcd90b61	arm: Fix broken page table permissions checks in remote GDB The remote GDB interface currently doesn't check if translations are valid before reading memory. This causes a panic when GDB tries to access unmapped memory (e.g., when getting a stack trace). There are two reasons for this: 1) The function used to check for valid translations (virtvalid()) doesn't work and panics on invalid translations. 2) The method in the GDB interface used to test if a translation is valid (RemoteGDB::acc) always returns true regardless of the return from virtvalid(). This changeset fixes both of these issues.	2015-03-02 04:00:27 -05:00
Andreas Hansson	d0e1b8a19c	arch: Make readMiscRegNoEffect const throughout Finally took the plunge and made this apply to all ISAs, not just ARM.	2015-02-16 03:33:28 -05:00
Andreas Sandberg	5bfa7e3d59	arm: Merge ISA files with pseudo instructions This changeset moves the pseudo instructions used to signal unknown instructions and unimplemented instructions to the same source files as the decoder fault.	2015-02-16 03:32:58 -05:00
Marco Balboni	268d9e59c5	mem: Clarification of packet crossbar timings This patch clarifies the packet timings annotated when going through a crossbar. The old 'firstWordDelay' is replaced by 'headerDelay' that represents the delay associated to the delivery of the header of the packet. The old 'lastWordDelay' is replaced by 'payloadDelay' that represents the delay needed to processing the payload of the packet. For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay.	2015-02-11 10:23:47 -05:00
Andreas Sandberg	550c318490	sim: Move the BaseTLB to src/arch/generic/ The TLB-related code is generally architecture dependent and should live in the arch directory to signify that. --HG-- rename : src/sim/BaseTLB.py => src/arch/generic/BaseTLB.py rename : src/sim/tlb.cc => src/arch/generic/tlb.cc rename : src/sim/tlb.hh => src/arch/generic/tlb.hh	2015-02-11 10:23:27 -05:00
Ali Saidi	89b3616d7e	arm: always set the IsFirstMicroop flag While the IsFirstMicroop flag exists it was only occasionally used in the ARM instructions that gem5 microOps and therefore couldn't be relied on to be correct.	2015-01-25 07:22:56 -05:00
Ali Saidi	f6742ea26e	cpu: Remove all notion that we know when the cpu is misspeculating. We have no way of knowing if a CPU model is on the wrong path with our execute-in-execute CPU models. Don't pretend that we do.	2015-01-25 07:22:26 -05:00
Ali Saidi	0bd986015b	cpu: Put all CPU instruction tracers in a single file	2015-01-25 07:22:17 -05:00
Andreas Hansson	10c69bb168	mem: Remove unused Packet src and dest fields This patch takes the final step in removing the src and dest fields in the packet. These fields were rather confusing in that they only remember a single multiplexing component, and pushed the responsibility to the bridge and caches to store the fields in a senderstate, thus effectively creating a stack. With the recent changes to the crossbar response routing the crossbar is now responsible without relying on the packet fields. Thus, these variables are now unused and can be removed.	2015-01-22 05:01:31 -05:00
Andreas Hansson	ce12d4bc63	x86: Delay X86 table walk on receiving walker response This patch fixes a minor issue in the X86 page table walker where it ended up sending new request packets to the crossbar before the response processing was finished (recvTimingResp is directly calling sendTimingReq). Under certain conditions this caused the crossbar to see illegal combinations of request/response overlap, in turn causing problems with a slightly modified crossbar implementation.	2015-01-22 05:00:54 -05:00
Andreas Hansson	f49830ce0b	mem: Clean up Request initialisation This patch tidies up how we create and set the fields of a Request. In essence it tries to use the constructor where possible (as opposed to setPhys and setVirt), thus avoiding spreading the information across a number of locations. In fact, setPhys is made private as part of this patch, and a number of places where we callede setVirt instead uses the appropriate constructor.	2015-01-22 05:00:53 -05:00
Emilio Castillo	7bb65dd434	x86 : fxsave and fxrestore missing template code This patch corrects the FXSAVE and FXRSTOR Macroops. The actual code used for saving/restore the FP registers is in the file but it was not used. The FXSAVE and FXRSTOR instructions are used in the kernel for saving and loading the state of the mmx,xmm and fpu registers. This operation is triggered in FS by issuing a Device Not Available Fault. The cr0 register has a TS flag that is set upon each context change. Every time a task access any FP related register (SIMD as well) if the TS flag is set to one, the device not available fault is issued. The kernel saves the current state of the registers, and restore the previous state of the currently running task. Right now Gem5 lacks of this capability. the Device Not Available Fault is never issued, leading to several problems when different threads share the same CPU and SMT is not used. The PARSEC Ferret benchmark is an example of this behavior. In order to test this a hack in the atomic cpu code was done to detect if a static instruction has any FP operands and the cr0 reg TS bit is set. This check must be done in the ISA dependent code. But it seems to be tricky to access the cr0 register while executing an instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-10 14:30:53 -06:00
Gabe Black	cd6380605c	x86: Enable three bits in the FamilyModelStepping ECX CPUID bitfield. These are for the monitor/mwait instructions, SSSE3, and XSAVE.	2015-01-06 22:15:00 -08:00
Gabe Black	cb181d6f91	cpuid, x86: Revert "Enabling more features in CPUid" That change enables CPUID bits for features that aren't implemented in gem5. If a simulated system tries to use those features because it was told it could, bad things can happen.	2015-01-06 22:13:56 -08:00
mike upton	cb911559dc	arm: Add unlinkat syscall implementation added ARM aarch64 unlinkat syscall support, modeled on other <xxx>at syscalls. This gets all of the cpu2006 int workloads passing in SE mode on aarch64. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Maxime Martinasso	5a5416d575	x86: implements the simd128 ADDSUBPD instruction This patch implements the simd128 ADDSUBPD instruction for the x86 architecture. Tested with a simple program in assembly language which executes the instruction. Checked that different versions of the instruction are executed by using the execution tracing option. Committed by: Nilay Vaish <nilay@cs.wisc.edu	2015-01-03 17:51:48 -06:00
Curtis Dunham	4d88978913	arm: Add stats to table walker This patch adds table walker stats for: - Walk events - Instruction vs Data - Page size histogram - Wait time and service time histograms - Pending requests histogram (per cycle) - measures dist. of L (p(1..) = how often busy, p(0) = how often idle) - Squashes, before starting and after completion	2014-12-23 09:31:18 -05:00
Andreas Sandberg	184fefbb3b	arm: Raise an alignment fault if a PC has illegal alignment We currently don't handle unaligned PCs correctly. There is one check for unaligned PCs in the TLB when running in aarch64 mode, but this check does not cover cases where the CPU does not do a TLB lookup when decoding an instruction (e.g., a branch stays within the same cache line). Additionally, the Decoder class sometimes throws an assertion for unaligned PCs which breaks speculation. This changeset introduces a decoder fault bit field in the ExtMachInst structure. This field can be used to signal a decoder failure. If set, the decoder generates an internal gem5fault instruction instead of a normal instruction. This instruction in turns either panics (fault type PANIC), returns an PCAlignmentFault (fault type UNALIGNED, aarch64) or PrefetchAbort (fault type UNALIGNED, aarch32). The patch causes minor changes to the realview64 regressions, and a stats bump will follow.	2014-12-23 09:31:17 -05:00
Andreas Sandberg	b33812ba43	arm: Clean up and document decoder API This changeset adds more documentation to the ArmISA::Decoder class and restructures it slightly to make API groups more obvious.	2014-12-23 09:31:17 -05:00
Andreas Sandberg	070b4a81db	arm: Add support for filtering in the PMU This patch adds support for filtering events in the PMU. In order to do so, it updates the ISADevice base class to forward an ISA pointer to ISA devices. This enables such devices to access the MiscReg file to determine the current execution level.	2014-12-23 09:31:17 -05:00
Andreas Sandberg	9b7578d8c7	arm: Fix decoding of PMXEVTYPER_EL0 and PMCCFILTR_EL0 The aarch64 system register decoder is currently not decoding PMXEVTYPER_EL0 and PMCCFILTR_EL0 correctly. This changeset updates the decoder so that they are decoded using the values in table C5-6 in ARM DDI 0478A.c.	2014-12-08 04:49:53 -05:00
Gabe Black	4a8a0a0798	misc: Generalize GDB single stepping. The new single stepping implementation for x86 doesn't rely on any ISA specific properties or functionality. This change pulls out the per ISA implementation of those functions and promotes the X86 implementation to the base class. One drawback of that implementation is that the CPU might stop on an instruction twice if it's affected by both breakpoints and single stepping. While that might be a little surprising, it's harmless and would only happen under somewhat unlikely circumstances.	2014-12-05 22:37:03 -08:00
Gabe Black	fb07d43b1a	x86: Implement a remote GDB stub. This stub should allow remote debugging of 32 bit and 64 bit targets. Single stepping seems to work, as do breakpoints. If both breakpoints and single stepping affect an instruction, gdb will stop at the instruction twice before continuing. That's a little surprising, but is generally harmless.	2014-12-05 22:36:16 -08:00
Gabe Black	fe48c0a32b	misc: Make the GDB register cache accessible in various sized chunks. Not all ISAs have 64 bit sized registers, so it's not always very convenient to access the GDB register cache in 64 bit sized chunks. This change makes it accessible in 8, 16, 32, or 64 bit chunks. The MIPS and ARM implementations were working around that limitation by bundling and unbundling 32 bit values into 64 bit values. That code has been removed.	2014-12-05 01:44:24 -08:00
Gabe Black	22aaa5867f	x86: Rework opcode parsing to support 3 byte opcodes properly. Instead of counting the number of opcode bytes in an instruction and recording each byte before the actual opcode, we can represent the path we took to get to the actual opcode byte by using a type code. That has a couple of advantages. First, we can disambiguate the properties of opcodes of the same length which have different properties. Second, it reduces the amount of data stored in an ExtMachInst, making them slightly easier/faster to create and process. This also adds some flexibility as far as how different types of opcodes are handled, which might come in handy if we decide to support VEX or XOP instructions. This change also adds tables to support properly decoding 3 byte opcodes. Before we would fall off the end of some arrays, on top of the ambiguity described above. This change doesn't measureably affect performance on the twolf benchmark. --HG-- rename : src/arch/x86/isa/decoder/three_byte_opcodes.isa => src/arch/x86/isa/decoder/three_byte_0f38_opcodes.isa rename : src/arch/x86/isa/decoder/three_byte_opcodes.isa => src/arch/x86/isa/decoder/three_byte_0f3a_opcodes.isa	2014-12-04 15:53:54 -08:00
Gabe Black	3069c28a02	arch: Allow named constants as decode case values. The values in a "bitfield" or in an ExtMachInst structure member may not be a literal value, it might select from an arbitrary collection of options. Instead of using the raw value of those constants in the decoder, it's easier to tell what's going on if they can be referred to as a symbolic constant/enum. To support that, the ISA description language is extended slightly so that in addition to integer literals, the case value for decode blobs can also be a string literal. It's up to the ISA author to ensure that the string evaluates to a legal constant value when interpretted as C++.	2014-12-04 15:52:48 -08:00
Gabe Black	d67cf81f5d	x86: Clean up style in process.cc.	2014-12-02 22:01:51 -08:00
Andrew Bardsley	3cd0b1f6a6	arm: Fix TLB ignoring faults when table walking This patch fixes a case where the Minor CPU can deadlock due to the lack of a response to TLB request because of a bug in fault handling in the ARM table walker. TableWalker::processWalkWrapper is the scheduler-called wrapper which handles deferred walks which calls to TableWalker::wait cannot immediately process. The handling of faults generated by processWalk{AArch64,LPAE,} calls in those two functions is is different. processWalkWrapper ignores fault returns from processWalk... which can lead to ::finish not being called on a translation. This fix provides fault handling in processWalkWrapper similar to that found in the leaf functions which BaseTLB::Translation::finish.	2014-12-02 06:08:11 -05:00
Andreas Hansson	74bbe20141	cpu: Always mask the snoop address when performing lock check Ensure the snoop address check is always using a cache-block aligned address. This patch updates Alpha and Mips to match the other ISAs.	2014-12-02 06:08:00 -05:00
Alexandru Dutu	1f539f13c3	mem: Page Table map api modification This patch adds uncacheable/cacheable and read-only/read-write attributes to the map method of PageTableBase. It also modifies the constructor of TlbEntry structs for all architectures to consider the new attributes.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	f743bdcb69	x86: Segment initialization to support KvmCPU in SE This patch sets up low and high privilege code and data segments and places them in the following order: cs low, ds low, ds, cs, in the GDT. Additionally, a syscall and page fault handler for KvmCPU in SE mode are defined. The order of the segment selectors in GDT is required in this manner for interrupt handling to work properly. Segment initialization is done for all the thread contexts.	2014-11-23 18:01:08 -08:00
Alexandru Dutu	adbaa4dfde	kvm, x86: Adding support for SE mode execution This patch adds methods in KvmCPU model to handle KVM exits caused by syscall instructions and page faults. These types of exits will be encountered if KvmCPU is run in SE mode.	2014-11-23 18:01:08 -08:00
Alexandru Dutu	335514dfdc	cpuid, x86: Enabling more features in CPUid Adding more features in the CPUid with the purpose of supporting running the KvmCPU in SE mode.	2014-11-23 18:01:08 -08:00
Gabe Black	aceeecb192	x86: Fix setting segment bases in real mode. The data size used for actually writing the base value for the segment was the default size, but really it should set the entire value without any possible truncation.	2014-11-17 01:00:53 -08:00
Gabe Black	f8603fa120	x86: Fix some bugs in the real mode far jmp instruction. The far pointer should be shifted right to get the selector value, not left. Also, when calculating the width of the offset, the wrong register was used in one spot.	2014-11-17 00:20:01 -08:00
Gabe Black	7739c24fbe	x86: APIC: Only set deliveryStatus if our IPI is going somewhere. Otherwise the IPI which isn't sent will never arrive, and the deliveryStatus bit will never be cleared.	2014-11-17 00:19:07 -08:00
Gabe Black	79e7ca307e	x86: APIC: Fix the getRegArrayBit function. The getRegArrayBit function extracts a bit from a series of registers which are treated as a single large bit array. A previous change had modified the logic which figured out which bit to extract from ">> 5" to "% 5" which seems wrong, especially when other, similar functions were changed to use "% 32".	2014-11-17 00:17:06 -08:00
Gabe Black	d228db1143	x86: Fix the CPUID Long Mode Address Size function. The value in EAX has an 8 bit field for the linear address size and one for the physical address size when calling that function. A recent change implemented it but returned 0xff for both of those fields. That implies that linear and physical addresses are 255 bits wide which is wrong. When using the KVM CPU model this causes an error, presumably because some of those bits are actually reserved, or the CPU or kernel realizes 255 bits is a bad value. This change makes those values 48.	2014-11-16 23:12:42 -08:00
Andreas Hansson	481eb6ae80	arm: Fixes based on UBSan and static analysis Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base. Most of the fixes are simply ensuring that proper intialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.	2014-11-14 03:53:51 -05:00
Marc Orr	bf80734b2c	x86 isa: This patch attempts an implementation at mwait. Mwait works as follows: 1. A cpu monitors an address of interest (monitor instruction) 2. A cpu calls mwait - this loads the cache line into that cpu's cache. 3. The cpu goes to sleep. 4. When another processor requests write permission for the line, it is evicted from the sleeping cpu's cache. This eviction is forwarded to the sleeping cpu, which then wakes up. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:22 -06:00
Ali Saidi	7a0bf814b6	automated merge	2014-10-29 23:22:26 -05:00
Ali Saidi	f2db2a96d1	arm, tests: Update config files to more recent kernels and create 64-bit regressions. This changes the default ARM system to a Versatile Express-like system that supports 2GB of memory and PCI devices and updates the default kernels/file-systems for AArch64 ARM systems (64-bit) to support up to 32GB of memory and PCI devices. Some platforms that are no longer supported have been pruned from the configuration files. In addition a set of 64-bit ARM regressions have been added to the regression system.	2014-10-29 23:18:27 -05:00
Ali Saidi	b31d9e93e2	arm, mem: Fix drain bug and provide drain prints for more components.	2014-10-29 23:18:26 -05:00
Ali Saidi	baf88e908d	arm: Fix multi-system AArch64 boot w/caches. Automatically extract cpu release address from DTB file. Check SCTLR_EL1 to verify all caches are enabled.	2014-10-29 23:18:26 -05:00
Ali Saidi	9900629f83	arm: Mark some miscregs (timer counter) registers at unverifiable. The checker can't verify timer registers, so it should just grab the version from the executing CPU, otherwise it could get a larger value and diverge execution.	2014-10-29 23:18:24 -05:00
Nilay Vaish	6523aad25c	sim: revert 6709bbcf564d The identifier SYS_getdents is not available on Mac OS X. Therefore, its use results in compilation failure. It seems there is no straight forward way to implement the system call getdents using readdir() or similar C functions. Hence the commit 6709bbcf564d is being rolled back.	2014-10-22 15:59:57 -05:00
Andreas Hansson	d6f1c6ce89	x86: Fixes to avoid LTO warnings This patch fixes a few minor issues that caused link-time warnings when using LTO, mainly for x86. The most important change is how the syscall array is created. Previously gcc and clang would complain that the declaration and definition types did not match. The organisation is now changed to match how it is done for ARM, moving the code that was previously in syscalls.cc into process.cc, and having a class variable pointing to the static array. With these changes, there are no longer any warnings using gcc 4.6.3 with LTO.	2014-10-20 18:03:56 -04:00
Michael Adler	a3fe4c0662	sim: implement getdents/getdents64 in user mode Has been tested only for alpha. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-20 16:44:53 -05:00
Andreas Hansson	a2d246b6b8	arch: Use shared_ptr for all Faults This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared".	2014-10-16 05:49:51 -04:00
Andreas Hansson	2475862747	arch,x86,mem: Dynamically determine the ISA for Ruby store check This patch makes the memory system ISA-agnostic by enabling the Ruby Sequencer to dynamically determine if it has to do a store check. To enable this check, the ISA is encoded as an enum, and the system is able to provide the ISA to the Sequencer at run time. --HG-- rename : src/arch/x86/insts/microldstop.hh => src/arch/x86/ldstflags.hh	2014-10-16 05:49:44 -04:00
Andreas Sandberg	37908d62a4	arm: Add helper methods to setup architected PMU events	2014-10-16 05:49:42 -04:00
Andreas Sandberg	9d35d48e84	arm: Add TLB PMU probes This changeset adds probe points that can be used to implement PMU counters for TLB stats. The following probes are supported: * ArmISA::TLB::ppRefills / TLB Refills (TLB insertions)	2014-10-16 05:49:41 -04:00
Andreas Sandberg	3697990c27	arm: Add a model of an ARM PMUv3 This class implements a subset of the ARM PMU v3 specification as described in the ARMv8 reference manual. It supports most of the features of the PMU, however the following features are known to be missing: * Event filtering (e.g., from different privilege levels). * Access controls (the PMU currently ignores the execution level). * The chain counter (event no. 0x1E) is unimplemented. The PMU itself does not implement any events, it merely provides an interface for the configuration scripts to hook up probes that drive events. Configuration scripts should call addEventProbe() to configure custom events or high-level methods to configure architected events. The Python implementation of addEventProbe() automatically delays event type registration until after instantiation. In order to support CPU switching and some combined counters (e.g., memory references synthesized from loads and stores), the PMU allows multiple probes per event type. When creating a system that switches between CPU models that share the same PMU, PMU events for all of the CPU models can be registered with the PMU. Kudos to Matt Horsnell for the initial gem5 implementation of the PMU.	2014-10-16 05:49:39 -04:00
Akash Bagdia	8b7724d04c	arm: Don't speculatively access most miscregisters. Speculative exeuction can cause panics in detailed execution mode that shouldn't happen.	2014-09-02 11:26:32 +01:00
Jiuyue Ma	9fb8b8515b	x86: add LongModeAddressSize function to cpuid LongModeAddressSize was used by kernel 2.6.28.4 for physical address validation, if not properly implemented, PCI resource allocation may failed because of ioremap failed: - linux-2.6.28.4/arch/x86/mm/ioremap.c:27-30 27 static inline int phys_addr_valid(unsigned long addr) 28 { 29 return addr < (1UL << boot_cpu_data.x86_phys_bits); 30 } - linux-2.6.28.4/arch/x86/kernel/cpu/common.c:475-482 475 #ifdef CONFIG_X86_64 476 if (c->extended_cpuid_level >= 0x80000008) { 477 u32 eax = cpuid_eax(0x80000008); 478 479 c->x86_virt_bits = (eax >> 8) & 0xff; 480 c->x86_phys_bits = eax & 0xff; 481 } 482 #endif - linux-2.6.28.4/arch/x86/mm/ioremap.c:209-214 209 if (!phys_addr_valid(phys_addr)) { 210 printk(KERN_WARNING "ioremap: invalid physical address %llx\n", 211 (unsigned long long)phys_addr); 212 WARN_ON_ONCE(1); 213 return NULL; 214 } This patch return 0x0000ffff for LongModeAddressSize, which guarantee phys_addr_valid never failed. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-06-13 16:48:47 +08:00
Andreas Hansson	b520223699	arm: Use MiscRegIndex rather than int when flattening Some additional type checking to avoid future issues.	2014-10-01 08:05:52 -04:00
Andreas Hansson	10f82934be	arm: More UBSan cleanups after additional full-system runs Some incorrect casting to IntRegIndex, and a few uninitialized members in the i8254xGBe device.	2014-10-01 08:05:51 -04:00
Andreas Hansson	ec41000dad	arm: Fixed undefined behaviours identified by gcc This patch fixes the runtime errors highlighted by the undefined behaviour sanitizer. In the end there were two issues. First, when rotating an immediate, we ended up shifting an uint32_t by 32 in some cases. This case is fixed by checking for a rotation by 0 positions. Second, the Mrc15 and Mcr15 are operating on an IntReg and a MiscReg, but we used the type RegRegImmOp and passed a MiscRegIndex as an IntRegIndex. This issue is resolved by introducing a MiscRegRegImmOp and RegMiscRegImmOp with the appropriate types. With these fixes there are no runtime errors identified for the full ARM regressions.	2014-09-27 09:08:37 -04:00
Andreas Hansson	341dbf2662	arch: Use const StaticInstPtr references where possible This patch optimises the passing of StaticInstPtr by avoiding copying the reference-counting pointer. This avoids first incrementing and then decrementing the reference-counting pointer.	2014-09-27 09:08:36 -04:00
Andreas Hansson	deb2200671	scons: Address issues related to gcc 4.9.1 Fix a number few minor issues to please gcc 4.9.1. Removing the '-fuse-linker-plugin' flag means no libraries are part of the LTO process, but hopefully this is an acceptable loss, as the flag causes issues on a lot of systems (only certain combinations of gcc, ld and ar work).	2014-09-27 09:08:34 -04:00
Mitch Hayenga	e1403fc2af	alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate activate(), suspend(), and halt() used on thread contexts had an optional delay parameter. However this parameter was often ignored. Also, when used, the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were ever specified). This patch removes the delay parameter and 'Events' associated with them across all ISAs and cores. Unused activate logic is also removed.	2014-09-20 17:18:35 -04:00
Andreas Hansson	1f6d5f8f84	mem: Rename Bus to XBar to better reflect its behaviour This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus. As part of this renaming, the code has also been clean up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code is due to the delay variables in the packet. --HG-- rename : src/mem/Bus.py => src/mem/XBar.py rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh rename : src/mem/bus.cc => src/mem/xbar.cc rename : src/mem/bus.hh => src/mem/xbar.hh	2014-09-20 17:18:32 -04:00
Andreas Hansson	41fc8a573e	arch: Pass faults by const reference where possible This patch changes how faults are passed between methods in an attempt to copy as few reference-counting pointer instances as possible. This should avoid unecessary copies being created, contributing to the increment/decrement of the reference counters.	2014-09-19 10:35:18 -04:00
Andrew Bardsley	c8b919aba2	style: Fix line continuation, especially in debug messages This patch closes a number of space gaps in debug messages caused by the incorrect use of line continuation within strings. (There's also one consistency change to a similar, but correct, use of line continuation)	2014-09-12 10:22:47 -04:00
Mitch Hayenga	8f95144e16	arm: Make memory ops work on 64bit/128-bit quantities Multiple instructions assume only 32-bit load operations are available, this patch increases load sizes to 64-bit or 128-bit for many load pair and load multiple instructions.	2014-09-03 07:42:52 -04:00
Mitch Hayenga	afbae1ec95	x86: Flag instructions that call suspend as IsQuiesce The o3 cpu relies upon instructions that suspend a thread context being flagged as "IsQuiesce". If they are not, unpredictable behavior can occur. This patch fixes that for the x86 ISA.	2014-09-03 07:42:46 -04:00
Mitch Hayenga	bb1e6cf7c4	arm: Fix v8 neon latency issue for loads/stores Neon memory ops that operate on multiple registers currently have very poor performance because of interleave/deinterleave micro-ops. This patch marks the deinterleave/interleave micro-ops as "No_OpClass" such that they take minumum cycles to execute and are never resource constrained. Additionaly the micro-ops over-read registers. Although one form may need to read up to 20 sources, not all do. This adds in new forms so false dependencies are not modeled. Instructions read their minimum number of sources.	2014-09-03 07:42:44 -04:00
Curtis Dunham	4a3f11149d	arm: use condition code registers for ARM ISA Analogous to ee049bf (for x86). Requires a bump of the checkpoint version and corresponding upgrader code to move the condition code register values to the new register file.	2014-04-29 16:05:02 -05:00
Andrew Bardsley	035a82ee2c	arm: ISA X31 destination register fix This patch substituted the zero register for X31 used as a destination register. This prevents false dependencies based on X31.	2014-09-03 07:42:43 -04:00
Mitch Hayenga	476c6fe368	arm: Mark v7 cbz instructions as direct branches v7 cbz/cbnz instructions were improperly marked as indirect branches.	2014-09-03 07:42:40 -04:00
Mitch Hayenga	fd722946dd	arch: Properly guess OpClass from optional StaticInst flags isa_parser.py guesses the OpClass if none were given based upon the StaticInst flags. The existing code does not take into account optionally set flags. This code hoists the setting of optional flags so OpClass is properly assigned.	2014-09-03 07:42:32 -04:00
Curtis Dunham	12210ada54	arm: support 16kb vm granules	2014-05-27 11:00:56 -05:00
Andreas Sandberg	326662b01b	arch, cpu: Factor out the ExecContext into a proper base class We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models. The main argument for using one set of ISA code per CPU model has always been performance as this avoid indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%.	2014-09-03 07:42:22 -04:00
Andreas Hansson	e1ac962939	arch: Cleanup unused ISA traits constants This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc. The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed.	2014-09-03 07:42:21 -04:00
Mitch Hayenga	23c8540756	config: Change parsing of Addr so hex values work from scripts When passed from a configuration script with a hexadecimal value (like "0x80000000"), gem5 would error out. This is because it would call "toMemorySize" which requires the argument to end with a size specifier (like 1MB, etc). This modification makes it so raw hex values can be passed through Addr parameters from the configuration scripts.	2014-09-03 07:42:20 -04:00
Andreas Hansson	1046b8d6e5	arm: Fix ExtMachInst hash operator underlying type This patch fixes the hash operator used for ARM ExtMachInst, which incorrectly was still using uint32_t. Instead of changing it to uint64_t it is not using the underlying data type of the BitUnion.	2014-09-03 07:42:19 -04:00
Nilay Vaish	4ccdf8fb81	x86: set op class of two fp instructions This patch sets op class of two fp instructions: movfp and pop x87 stack as IntAluOp since these instructions do not make use of the fp alu.	2014-09-01 16:55:49 -05:00
Alexandru	5efbb4442a	mem: adding architectural page table support for SE mode This patch enables the use of page tables that are stored in system memory and respect x86 specification, in SE mode. It defines an architectural page table for x86 as a MultiLevelPageTable class and puts a placeholder class for other ISAs page tables, giving the possibility for future implementation.	2014-08-28 10:11:44 -05:00
Andreas Sandberg	70176fecd1	base: Replace the internal varargs stuff with C++11 constructs We currently use our own home-baked support for type-safe variadic functions. This is confusing and somewhat limited (e.g., cprintf only supports a limited number of arguments). This changeset converts all uses of our internal varargs support to use C++11 variadic macros.	2014-08-26 10:13:45 -04:00
Mitch Hayenga	0da99b7e0c	mips: Fix RLIMIT_RSS naming MIPS defined RLIMIT_RSS in a way that could cause a naming conflict with RLIMIT_RSS from the host system. Broke clang+MacOS build.	2014-08-26 10:13:31 -04:00
Andreas Sandberg	a3d3eb0ff7	sparc: Fixup bit ordering in the PSTATE bit union The order of the MSB and LSB bit of the mm field in the PSTATE union is wrong. Any access to this field will currently be ignored and reads will always return zero. This patch fixes the ordering so it is <MSB, LSB> instead of <LSB, MSB>.	2014-08-26 10:13:23 -04:00
Dam Sunwoo	b04d6c7c33	arm: change MISCREG_L2ERRSR to warn not fail Some newer binaries compiled for Versatile Express TC2 contain access to implementation specific L2MERRSR registers. This causes an infinite loop of undefined exceptions. This patch changes the behavior to "warn not fail" to keep the workloads going.	2014-08-13 06:57:36 -04:00
Andreas Sandberg	8b8d991df0	mips: Remove unused private members to fix compile-time warning Certain versions of clang complain about unused private members if they are not used. This changeset removes such members from the MIPS-specific classes to silence the warning.	2014-08-13 06:57:30 -04:00
Andreas Sandberg	8d04e32a83	power: Remove unused private members to fix compile-time warning Certain versions of clang complain about unused private members if they are not used. This changeset removes such members from the POWER-specific ProcessInfo struct to silence the warning.	2014-08-13 06:57:29 -04:00
Curtis Dunham	94daae6864	arm: remove dead code fplib mul64x64	2014-03-11 09:50:02 -05:00
Stephan Diestelhorst	65cea4708e	power: Add basic DVFS support for gem5 Adds DVFS capabilities to gem5, by allowing users to specify lists for frequencies and voltages in SrcClockDomains and VoltageDomains respectively. A separate component, DVFSHandler, provides a small interface to change operating points of the associated domains. Clock domains will be linked to voltage domains and thus allow separate clock, but shared voltage lines. Currently all the valid performance-level updates are performed with a fixed transition latency as specified for the domain. Config file example: ... vd = VoltageDomain(voltage = ['1V','0.95V','0.90V','0.85V']) tsys.cluster1.clk_domain.clock = ['1GHz','700MHz','400MHz','230MHz'] tsys.cluster2.clk_domain.clock = ['1GHz','700MHz','400MHz','230MHz'] tsys.cluster1.clk_domain.domain_id = 0 tsys.cluster2.clk_domain.domain_id = 1 tsys.cluster1.clk_domain.voltage_domain = vd tsys.cluster2.clk_domain.voltage_domain = vd tsys.dvfs_handler.domains = [tsys.cluster1.clk_domain, tsys.cluster2.clk_domain] tsys.dvfs_handler.enable = True	2014-06-30 13:56:06 -04:00
Binh Pham	b085db84af	x86: fix table walker assertion In a cycle, we could see a R and W requests corresponding to the same page walk being sent to the memory. During the cycle that assertion happens, we have 2 responses corresponding to the R and W above. We also have a 'read' variable to keep track of the inflight Read request, this gets reset to NULL right after we send out any R request; and gets set to the next R in the page walk when a response comes back. The issue we are seeing here is when we get a response for W request, assert(!read) fires because we got a response for R request right before this, hence we set 'read' to NOT NULL value, pointing to the next R request in the pagewalk! This work was done while Binh was an intern at AMD Research.	2014-06-21 10:39:44 -07:00
Steve Reinhardt	0be64ffe2f	style: eliminate equality tests with true and false Using '== true' in a boolean expression is totally redundant, and using '== false' is pretty verbose (and arguably less readable in most cases) compared to '!'. It's somewhat of a pet peeve, perhaps, but I had some time waiting for some tests to run and decided to clean these up. Unfortunately, SLICC appears not to have the '!' operator, so I had to leave the '== false' tests in the SLICC code.	2014-05-31 18:00:23 -07:00
Steve Reinhardt	109908c2a6	syscall emulation: clean up & comment SyscallReturn	2014-05-12 14:23:31 -07:00
Ali Saidi	dbaf43394b	arm: Make sure UndefinedInstructions are properly initialized	2014-04-17 16:56:09 -05:00
Ali Saidi	a00b44ebe8	arm: allow DC instructions by default so SE mode works	2014-04-17 16:55:54 -05:00
Ali Saidi	c4a2f76fea	sim, arm: implement more of the at variety syscalls Needed for new AArch64 binaries	2014-04-17 16:55:05 -05:00
Andrew Bardsley	bf78299f04	cpu: Add flag name printing to StaticInst This patch adds a the member function StaticInst::printFlags to allow all of an instruction's flags to be printed without using the individual is... member functions or resorting to exposing the 'flags' vector It also replaces the enum definition StaticInst::Flags with a Python-generated enumeration and adds to the enum generation mechanism in src/python/m5/params.py to allow Enums to be placed in namespaces other than Enums or, alternatively, in wrapper structs allowing them to be inherited by other classes (so populating that class's name-space with the enumeration element names).	2014-05-09 18:58:47 -04:00
Andrew Bardsley	f7d80348fa	arm: Add branch flags onto macroops Mark branch flags onto macroops to allow branch prediction before microop decomposition	2014-05-09 18:58:47 -04:00
Curtis Dunham	af39ab297f	arm: add preliminary ISA splits for ARM arch	2014-05-09 18:58:47 -04:00
Curtis Dunham	fe27f937aa	arch: teach ISA parser how to split code across files This patch encompasses several interrelated and interdependent changes to the ISA generation step. The end goal is to reduce the size of the generated compilation units for instruction execution and decoding so that batch compilation can proceed with all CPUs active without exhausting physical memory. The ISA parser (src/arch/isa_parser.py) has been improved so that it can accept 'split [output_type];' directives at the top level of the grammar and 'split(output_type)' python calls within 'exec {{ ... }}' blocks. This has the effect of "splitting" the files into smaller compilation units. I use air-quotes around "splitting" because the files themselves are not split, but preprocessing directives are inserted to have the same effect. Architecturally, the ISA parser has had some changes in how it works. In general, it emits code sooner. It doesn't generate per-CPU files, and instead defers to the C preprocessor to create the duplicate copies for each CPU type. Likewise there are more files emitted and the C preprocessor does more substitution that used to be done by the ISA parser. Finally, the build system (SCons) needs to be able to cope with a dynamic list of source files coming out of the ISA parser. The changes to the SCons{cript,truct} files support this. In broad strokes, the targets requested on the command line are hidden from SCons until all the build dependencies are determined, otherwise it would try, realize it can't reach the goal, and terminate in failure. Since build steps (i.e. running the ISA parser) must be taken to determine the file list, several new build stages have been inserted at the very start of the build. First, the build dependencies from the ISA parser will be emitted to arch/$ISA/generated/inc.d, which is then read by a new SCons builder to finalize the dependencies. (Once inc.d exists, the ISA parser will not need to be run to complete this step.) Once the dependencies are known, the 'Environments' are made by the makeEnv() function. This function used to be called before the build began but now happens during the build. It is easy to see that this step is quite slow; this is a known issue and it's important to realize that it was already slow, but there was no obvious cause to attribute it to since nothing was displayed to the terminal. Since new steps that used to be performed serially are now in a potentially-parallel build phase, the pathname handling in the SCons scripts has been tightened up to deal with chdir() race conditions. In general, pathnames are computed earlier and more likely to be stored, passed around, and processed as absolute paths rather than relative paths. In the end, some of these issues had to be fixed by inserting serializing dependencies in the build. Minor note: For the null ISA, we just provide a dummy inc.d so SCons is never compelled to try to generate it. While it seems slightly wrong to have anything in src/arch/*/generated (i.e. a non-generated 'generated' file), it's by far the simplest solution.	2014-05-09 18:58:47 -04:00
Geoffrey Blake	85940fd537	arch, arm: Preserve TLB bootUncacheability when switching CPUs The ARM TLBs have a bootUncacheability flag used to make some loads and stores become uncacheable when booting in FS mode. Later the flag is cleared to let those loads and stores operate as normal. When doing a takeOverFrom(), this flag's state is not preserved and is momentarily reset until the CPSR is touched. On single core runs this is a non-issue. On multi-core runs this can lead to crashes on the O3 CPU model from the following series of events: 1) takeOverFrom executed to switch from Atomic -> O3 2) All bootUncacheability flags are reset to true 3) Core2 tries to execute a load covered by bootUncacheability, it is flagged as uncacheable 4) Core2's load needs to replay due to a pipeline flush 3) Core1 core does an action on CPSR 4) The handling code for CPSR then checks all other cores to determine if bootUncacheability can be set to false 5) Asynchronously set bootUncacheability on all cores to false 6) Core2 replays load previously set as uncacheable and notices it is now flagged as cacheable, leads to a panic. This patch implements takeOverFrom() functionality for the ARM TLBs to preserve flag values when switching from atomic -> detailed.	2014-05-09 18:58:47 -04:00
Akash Bagdia	2b1a01ee6c	cpu, arm: Allow the specification of a socket field Allow the specification of a socket ID for every core that is reflected in the MPIDR field in ARM systems. This allows studying multi-socket / cluster systems with ARM CPUs.	2014-05-09 18:58:46 -04:00
Geoffrey Blake	29601eada7	arm: Panics in miscreg read functions can be tripped by O3 model Unimplemented miscregs for the generic timer were guarded by panics in arm/isa.cc which can be tripped by the O3 model if it speculatively executes a wrong path containing a mrs instruction with a bad miscreg index. These registers were flagged as implemented and accessible. This patch changes the miscreg info bit vector to flag them as unimplemented and inaccessible. In this case, and UndefinedInst fault will be generated if the register access is not trapped by a hypervisor.	2014-05-09 18:58:46 -04:00
Curtis Dunham	7f1603d207	arch: remove inline specifiers on all inst constrs, all ISAs With (upcoming) separate compilation, they are useless. Only link-time optimization could re-inline them, but ideally feedback-directed optimization would choose to do so only for profitable (i.e. common) instructions.	2014-05-09 18:58:46 -04:00
Curtis Dunham	eb61f0123b	arm: cleanup ARM ISA definition	2014-05-09 18:58:46 -04:00
Curtis Dunham	ecf774bc56	arm: Correctly display disassembly of vldmia/vstmia The MicroMemOp class generates the disassembly for both integer and floating point instructions, but it would always print its first operand as an integer register without considering that the op may be a floating instruction in which case a float register should be displayed instead.	2014-04-23 05:18:30 -04:00
Mitchell Hayenga	0fad0c7f7d	arm: Don't use a stack allocated mnemonic FailUnimplemented passed a stack created mnemonic as a const char * which causes some grief when the stack goes away.	2014-04-23 05:18:20 -04:00
Curtis Dunham	e651188f75	arch: remove 'null update' check in isa-parser SCons already does this for all build steps.	2014-04-23 05:17:57 -04:00
Eric Van Hensbergen	7630168a75	arm: m5ops readfile64 args broken, offset coming through garbage There were several sections of the m5ops code which were essentially copy/pasted versions of the 32-bit code. The problem is that some of these didn't account fo4 64-bit registers leading to arguments being in the wrong registers. This patch addresses the args for readfile64, writefile64, and addsymbol64 -- all of which seemed to suffer from a similar set of problems when moving to 64-bit.	2014-03-23 11:11:34 -04:00
Andreas Sandberg	f791e7b313	kvm: x86: Add support for x86 INIT and STARTUP handling This changeset adds support for INIT and STARTUP IPI handling. We currently handle both of these interrupts in gem5 and transfer the state to KVM. Since we do not have a BIOS loaded, we pretend that the INIT interrupt suspends the CPU after reset. --HG-- extra : rebase_source : 7f3b25f3801d68f668b6cd91eaf50d6f48ee2a6a	2014-03-16 17:28:23 +01:00
Paul Rosenfeld	32bf74cb8e	alpha: Small removal of dead comments/code from alpha ISA Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-03-12 07:03:22 -05:00
Geoffrey Blake	c4a8e5c36c	arm: Handle functional TLB walks properly The table walker code currently accounts for two types of walks, Atomic and Timing, and treats them differently. Atomic walks keep a single instance of WalkerState around for all walks to use in currState. Timing mode keeps a queue of in-flight WalkerStates and maintains currState as NULL between walks. If a functional walk is done during Timing mode, it is treated as an atomic walk and either creates a persistent WalkerState if in between Timing walks, or stomps an existing currState for an in-progress Timing walk. This patch distinguishes functional walks as being able to exist at any time and sets up a temporary WalkerState for its exclusive use and then cleans up when finished, leaving any in progress Atomic or Timing walks undisturbed.	2014-03-07 15:56:23 -05:00
Mitch Hayenga	b9a9d99b22	scons: Fixes uninitialized warnings issued by clang Small fixes to appease recent clang versions.	2014-03-07 15:56:23 -05:00
Stephan Diestelhorst	bef2086f5b	arm: Fix uninitialised warning with gcc 4.8 Small fix for a warning that prevents compilation with gcc 4.8.1 due to detecting that a variable might be uninitialised. The fix is to assign a safe default.	2014-03-07 15:56:23 -05:00
Ali Saidi	bf39a475fe	mem: Wakeup sleeping CPUs without caches on LLSC For systems without caches, the LLSC code does not get snoops for wake-ups. We add the LLSC code in the abstract memory to do the job for us.	2014-03-07 15:56:23 -05:00
Andreas Sandberg	be246cef62	x86: Setup correct TSL/TR segment attributes on INIT The TSL/LDT & TR/TSS segments didn't contain valid attributes. This caused problems when transfering the state into KVM where invalid state is a no-go. Fixup the attributes with values from AMD's architecture programmer's manual.	2014-03-03 14:44:57 +01:00
Christopher Torng	919baa603d	cpu: Enable fast-forwarding for MIPS InOrderCPU and O3CPU A copyRegs() function is added to MIPS utilities to copy architectural state from the old CPU to the new CPU during fast-forwarding. This addition alone enables fast-forwarding for the o3 cpu model running MIPS. The patch also adds takeOverFrom() and drainResume() functions to the InOrderCPU to enable it to take over from another CPU. This change enables fast-forwarding for the inorder cpu model running MIPS, but not for Alpha. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-03-01 23:35:23 -06:00
Andreas Sandberg	e76a37985f	x86: Fix x87 state transfer bug Changeset 7274310be1bb (isa: clean up register constants) increased the value of NumFloatRegs, which triggered a bug in X86ISA::copyRegs(). This bug is caused by the x87 stack being copied twice since register indexes past NUM_FLOATREGS are mapped into the x87 stack relative to the top of the stack, which is undefined when the copy takes place. This changeset updates the copyRegs() function to use access registers using the non-flattening interface, which guarantees that undesirable register folding does not happen.	2014-02-05 14:08:13 +01:00
Nikos Nikoleris	c6279f2d19	x86, kvm: Fix bug in the RFlags get and set functions The getRFlags and setRFlags utility functions were not updated correctly when condition registers were separated into their own register class. This lead to incorrect state transfer in calls from kvm into the simulator (e.g., m5 readfile ended up in an infinite loop) and when switching CPUs. This patch makes these utility functions use getCCReg and setCCReg instead of getIntReg and setIntReg which read and write the integer registers. Reviewed-by: Andreas Sandberg <andreas@sandberg.pp.se>	2014-02-02 16:37:35 +01:00
Mitch Hayenga	b77ca57f8c	arm: Enable umask syscall in SE mode Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-28 18:00:51 -06:00
Nilay Vaish	bdee69d0b1	x86: use lfpimm instead of limm for fptan	2014-01-27 18:50:54 -06:00
Nilay Vaish	6a543b5134	x86: implements x87 add/sub instructions	2014-01-27 18:50:53 -06:00
Nilay Vaish	5be0b846b1	x86: implements fxch instruction.	2014-01-27 18:50:52 -06:00
Nilay Vaish	4eb3b1ed0b	x86: correct error in emms instruction.	2014-01-27 18:50:51 -06:00
ARM gem5 Developers	612f8f074f	arm: Add support for ARMv8 (AArch64 & AArch32) Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64 kernel you are restricted to AArch64 user-mode binaries. This will be addressed in a later patch. Note: Virtualization is only supported in AArch32 mode. This will also be fixed in a later patch. Contributors: Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation) Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation) Mbou Eyole (AArch64 NEON, validation) Ali Saidi (AArch64 Linux support, code integration, validation) Edmund Grimley-Evans (AArch64 FP) William Wang (AArch64 Linux support) Rene De Jong (AArch64 Linux support, performance opt.) Matt Horsnell (AArch64 MP, validation) Matt Evans (device models, code integration, validation) Chris Adeniyi-Jones (AArch64 syscall-emulation) Prakash Ramrakhyani (validation) Dam Sunwoo (validation) Chander Sudanthi (validation) Stephan Diestelhorst (validation) Andreas Hansson (code integration, performance opt.) Eric Van Hensbergen (performance opt.) Gabe Black	2014-01-24 15:29:34 -06:00
Andreas Hansson	cfc4a99982	arch: Make all register index flattening const This patch makes all the register index flattening methods const for all the ISAs. As part of this, readMiscRegNoEffect for ARM is also made const.	2014-01-24 15:29:30 -06:00
Ali Saidi	7d0344704a	arch, cpu: Add support for flattening misc register indexes. With ARMv8 support the same misc register id results in accessing different registers depending on the current mode of the processor. This patch adds the same orthogonality to the misc register file as the others (int, float, cc). For all the othre ISAs this is currently a null-implementation. Additionally, a system variable is added to all the ISA objects.	2014-01-24 15:29:30 -06:00
Ali Saidi	6bed6e0352	cpu: Add CPU support for generatig wake up events when LLSC adresses are snooped. This patch add support for generating wake-up events in the CPU when an address that is currently in the exclusive state is hit by a snoop. This mechanism is required for ARMv8 multi-processor support.	2014-01-24 15:29:30 -06:00
Ali Saidi	904872a01a	mem: Remove explict cast from memhelper. Previously we were casting the result type to the the memory type which is incorrect for things like dual-memory operations which still return a single result.	2014-01-24 15:29:30 -06:00
Dam Sunwoo	85e8779de7	mem: per-thread cache occupancy and per-block ages This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.	2014-01-24 15:29:30 -06:00
Andreas Hansson	f2b0b551cc	x86: Fix memory leak in table walker This patch fixes a memory leak in the table walker, by ensuring that the sender state is deleted again if the request packet cannot be successfully sent.	2014-01-24 15:29:29 -06:00
Christopher Torng	903b442228	mips: Floating point convert bug fix In mips architecture, floating point convert instructions use the FloatConvertOp format defined in src/arch/mips/isa/formats/fp.isa. The type of the operands in the ISA description file (_sw for signed word, or _sf for signed float, etc.) is used to create a type for the operand in C++. Then the operand is converted using the fpConvert() function in src/arch/mips/utility.cc. If we are converting from a word to a float, and we want to convert 0xffffffff, we expect -1 to be passed into fpConvert(). Instead, we see MAX_INT passed in. Then fpConvert() converts _val_ to MAX_INT in single-precision floating point, and we get the wrong value. To fix it, the signs of the convert operands are being changed from unsigned to signed in the MIPS ISA description. Then, the FloatConvertOp format is being changed to insert a int32_t into the C++ code instead of a uint32_t. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-12-29 19:29:45 -06:00
Christian Menard	d4f205ea2f	x86: Implementation of Int3 and Int_Ib in long mode This is an implementation of the x86 int3 and int immediate instructions for long mode according to 'AMD64 Programmers Manual Volume 3'.	2013-11-26 17:51:07 +01:00
Chander Sudanthi	3e6da89419	ARM: add support for TEEHBR access Thumb2 ARM kernels may access the TEEHBR via thumbee_notifier in arch/arm/kernel/thumbee.c. The Linux kernel code just seems to be saving and restoring the register. This patch adds support for the TEEHBR cp14 register. Note, this may be a special case when restoring from an image that was run on a system that supports ThumbEE.	2013-10-31 13:41:13 -05:00
Prakash Ramrakhyani	885656f2ed	mem: Add privilege info to request class This patch adds a flag in the request class that indicates if the request was made in privileged mode.	2013-10-31 13:41:13 -05:00
Eric Van Hensbergen	bfdd031c0d	arm: Accomodate function name changes in newer linux kernels	2013-10-17 10:20:45 -05:00
Yasuko Eckert	1bb293d1e7	arch/x86: add support for explicit CC register file Convert condition code registers from being specialized ("pseudo") integer registers to using the recently added CC register class. Nilay Vaish also contributed to this patch.	2013-10-15 14:22:44 -04:00
Yasuko Eckert	2c293823aa	cpu: add a condition-code register class Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.	2013-10-15 14:22:44 -04:00
Steve Reinhardt	219c423f1f	cpu: rename _DepTag constants to _Reg_Base Make these names more meaningful. Specifically, made these substitutions: s/FP_Base_DepTag/FP_Reg_Base/g; s/Ctrl_Base_DepTag/Misc_Reg_Base/g; s/Max_DepTag/Max_Reg_Index/g;	2013-10-15 14:22:43 -04:00
Steve Reinhardt	a830e63de7	isa: clean up register constants Clean up and add some consistency to the *_Base_DepTag constants as well as some related register constants: - Get rid of NumMiscArchRegs, TotalArchRegs, and TotalDataRegs since they're never used and not always defined - Set FP_Base_DepTag = NumIntRegs when possible (i.e., every case except x86) - Set Ctrl_Base_DepTag = FP_Base_DepTag + NumFloatRegs (this was true before, but wasn't always expressed that way) - Drastically reduce the number of arbitrary constants appearing in these calculations	2013-10-15 14:22:43 -04:00

... 2 3 4 5 6 ...

3075 commits