sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Nilay Vaish	e038741598	x86: add tlb checkpointing This patch adds checkpointing support to x86 tlb. It upgrades the cpt_upgrader.py script so that previously created checkpoints can be updated. It moves the checkpoint version to 6.	2013-08-07 14:51:17 -05:00
Andreas Hansson	d4273cc9a6	mem: Set the cache line size on a system level This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.	2013-07-18 08:31:16 -04:00
Steve Reinhardt	1f43e244bd	dev: make BasicPioDevice take size in constructor Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 21:57:04 -05:00
Steve Reinhardt	502ad1e675	dev: consistently end device classes in 'Device' PciDev and IntDev stuck out as the only device classes that ended in 'Dev' rather than 'Device'. This patch takes care of that inconsistency. Note that you may need to delete pre-existing files matching build//python/m5/internal/param_ as scons does not pick up indirect dependencies on imported python modules when generating params, and the PciDev -> PciDevice rename takes place in a file (dev/Device.py) that gets imported quite a bit. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 21:56:50 -05:00
Steve Reinhardt	b0b1c0205c	devices: make more classes derive from BasicPioDevice A couple of devices that have single fixed memory mapped regions were not derived from BasicPioDevice, when that's exactly the functionality that BasicPioDevice provides. This patch gets rid of a little bit of redundant code by making those devices actually do so. Also fixed the weird case of X86ISA::Interrupts, where the class already did derive from BasicPioDevice but didn't actually use all the features it could have. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 21:56:24 -05:00
Akash Bagdia	7d7ab73862	sim: Add the notion of clock domains to all ClockedObjects This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains. The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead). The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains. All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a seperate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain. The clock period of all domains are pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.	2013-06-27 05:49:49 -04:00
Andreas Sandberg	d06064c386	x86: Add support for maintaining the x87 tag word The current implementation of the x87 never updates the x87 tag word. This is currently not a big issue since the simulated x87 never checks for stack overflows, however this becomes an issue when switching between a virtualized CPU and a simulated CPU. This changeset adds support, which is enabled by default, for updating the tag register to every floating point microop that updates the stack top using the spm mechanism. The new tag words is generated by the helper function X86ISA::genX87Tags(). This function is currently limited to flagging a stack position as valid or invalid and does not try to distinguish between the valid, zero, and special states.	2013-06-18 16:36:08 +02:00
Andreas Sandberg	a8e8c4f433	x86: Fix loading of floating point constants This changeset actually fixes two issues: * The lfpimm instruction didn't work correctly when applied to a floating point constant (it did work for integers containing the bit string representation of a constant) since it used reinterpret_cast to convert a double to a uint64_t. This caused a compilation error, at least, in gcc 4.6.3. * The instructions loading floating point constants in the x87 processor didn't work correctly since they just stored a truncated integer instead of a double in the floating point register. This changeset fixes the old microcode by using lfpimm instruction instead of the limm instructions.	2013-06-18 16:30:06 +02:00
Andreas Sandberg	c9c02efb99	x86: Initialize the MXCSR register	2013-06-18 16:28:36 +02:00
Andreas Sandberg	688fc7f71f	x86: Make the boot state VMX compliant This patch allows the default x86 state to be used when by CPUs that use hardware virtualization.	2013-06-18 16:27:28 +02:00
Andreas Sandberg	5d584934ad	x86: Make fprem like the fprem on a real x87 The current implementation of fprem simply does an fmod and doesn't simulate any of the iterative behavior in a real fprem. This isn't normally a problem, however, it can lead to problems when switching between CPU models. If switching from a real CPU in the middle of an fprem loop to a simulated CPU, the output of the fprem loop becomes correupted. This changeset changes the fprem implementation to work like the one on real hardware.	2013-06-18 16:10:42 +02:00
Andreas Sandberg	46a8cbbb7f	x86: Add helper functions to access rflags The rflags register is spread across several different registers. Most of the flags are stored in MISCREG_RFLAGS, but some are stored in microcode registers. When accessing RFLAGS, we need to reconstruct it from these registers. This changeset adds two functions, X86ISA::getRFlags() and X86ISA::setRFlags(), that take care of this magic.	2013-06-18 16:10:22 +02:00
Andreas Sandberg	de89e133d8	x86: Fix the flag handling code in FABS and FCHS This changeset fixes two problems in the FABS and FCHS implementation. First, the ISA parser expects the assignment in flag_code to be a pure assignment and not an and-assignment, which leads to the isa_parser omitting the misc reg update. Second, the FCHS and FABS macro-ops don't set the SetStatus flag, which means that the default micro-op version, which doesn't update FSW, is executed.	2013-06-18 16:10:21 +02:00
Andreas Sandberg	0b4a8b4086	x86: Fix bug when copying TSC on CPU handover The TSC value stored in MISCREG_TSC is actually just an offset from the current CPU cycle to the actual TSC value. Writes with side-effects to the TSC subtract the current cycle count before storing the new value, while reads add the current cycle count. When switching CPUs, the current value is copied without side-effects. This works as long as the source and the destination CPUs have the same clock frequencies. The TSC will jump, sometimes backwards, if they have different clock frequencies. Most OSes assume the TSC to be monotonic and break when this happens. This changeset makes sure that the TSC is copied with side-effects to ensure that the offset is updated to match the new CPU.	2013-06-11 09:24:38 +02:00
Andreas Sandberg	7846f59d0d	arch: Create a method to finalize physical addresses in the TLB Some architectures (currently only x86) require some fixing-up of physical addresses after a normal address translation. This is usually to remap devices such as the APIC, but could be used for other memory mapped devices as well. When running the CPU in a using hardware virtualization, we still need to do these address fix-ups before inserting the request into the memory system. This patch moves this patch allows that code to be used by such CPUs without doing full address translations.	2013-06-03 13:55:41 +02:00
Gedare Bloom	22b60c57e6	x86: Squash outstanding walks when instructions are squashed. This is the x86 version of the ARM changeset baa17ba80e06. In case an instruction has been squashed by the o3 cpu, this patch allows page table walker to avoid carrying out a pending translation that the instruction requested for.	2013-05-21 11:40:11 -05:00
Nilay Vaish	30fe807316	x86: mark instructions for being function call/return Currently call and return instructions are marked as IsCall and IsReturn. Thus, the branch predictor does not use RAS for these instructions. Similarly, the number of function calls that took place is recorded as 0. This patch marks these instructions as they should be.	2013-05-21 11:34:41 -05:00
Nilay Vaish	fba40864aa	x86: add op class for int and fp microops in isa description Currently all the integer microops are marked as IntAluOp and the floating point microops are marked as FloatAddOp. This patch adds support for marking different microops differently. Now IntMultOp, IntDivOp, FloatDivOp, FloatMultOp, FloatCvtOp, FloatSqrtOp classes will be used as well. This will help in providing different latencies for different op class.	2013-05-21 11:33:57 -05:00
Andreas Sandberg	1ae30c68c1	arm: Add support for the m5fail pseudo-op	2013-05-14 15:06:50 +02:00
Michael Levenhagen	223f89a162	x86: corrects vsyscall address for gettimeofday The vsyscall address for gettimeofday is 0xffffffffff600000ul. The offset therefore should be 0x0 instead of 0x410. This can be cross checked with the file sysdeps/unix/sysv/linux/x86_64/gettimeofday.c in source of glibc. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-23 15:21:32 -05:00
Michael Levenhagen	794d00257a	x86: enable gettimeofday and getppid system calls Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-23 15:21:30 -05:00
Christian Menard	25a6b1866e	x86: increment the stack pointer in lret inst The 'lret' instruction reloads instruction pointer and code segment from the stack and then pops them. But the popping part is missing from the current implementation. This caused incorrect behavior in some code related to the Fiasco OS. Microops are being added to rectify the behavior of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-23 00:03:04 -05:00
Andreas Sandberg	6d2941d990	arm: Add a method to query interrupt state ignoring CPSR masks Add the method checkRaw to ArmISA::Interrupts. This method can be used to query the raw state (ignoring CPSR masks) of an interrupt. It is primarily intended for hardware virtualized CPUs.	2013-04-22 13:20:32 -04:00
Andreas Sandberg	5f2361f3af	arm: Enable support for triggering a sim panic on kernel panics Add the options 'panic_on_panic' and 'panic_on_oops' to the LinuxArmSystem SimObject. When these option are enabled, the simulator panics when the guest kernel panics or oopses. Enable panic on panic and panic on oops in ARM-based test cases.	2013-04-22 13:20:31 -04:00
Andreas Sandberg	aa08069b3f	sim: Add helper functions that add PCEvents with custom arguments This changeset adds support for forwarding arguments to the PC event constructors to following methods: addKernelFuncEvent addFuncEvent Additionally, this changeset adds the following helper method to the System base class: addFuncEventOrPanic - Hook a PCEvent to a symbol, panic on failure. addKernelFuncEventOrPanic - Hook a PCEvent to a kernel symbol, panic on failure. System implementations have been updated to use the new functionality where appropriate.	2013-04-22 13:20:31 -04:00
Nathanael Premillieu	3ff091bdf4	arm: set ldr_ret_uop as conditional or unconditional control This patch adds a missing flag to the ldr_ret_uop microop instruction. The flag is added when the instruction is used, not directly in the constructor of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>"	2013-04-17 16:07:10 -05:00
Nilay Vaish	d2fd3b2ec2	x86: changes to apic, keyboard It is possible that operating system wants to shutdown the lapic timer by writing timer's initial count to 0. This patch adds a check that the timer event is only scheduled if the count is 0. The patch also converts few of the panics related to the keyboard to warnings since we are any way not interested in simulating the keyboard.	2013-03-28 09:34:23 -05:00
Nilay Vaish	5c940fec0a	x86: implement some of the x87 instructions This patch implements ftan, fprem, fyl2x, fld* floating-point instructions.	2013-03-11 13:15:46 -05:00
Andreas Hansson	c4645c0d68	x86: Make the table walker reset the packet delay This patch fixes an issue related to the table walker recycling packets that still have a bus delay that is not accounted for. For now, we simply ignore the values and reset them to zero.	2013-03-07 05:55:01 -05:00
Ali Saidi	f4fd12d49e	ARM: fix some cases where instructions that write to fp reg 15 are accidently branches.	2013-03-04 23:33:47 -05:00
Andreas Hansson	a62afd094b	scons: Fix warnings issued by clang 3.2svn (XCode 4.6) This patch fixes the warnings that clang3.2svn emit due to the "-Wall" flag. There is one case of an uninitialised value in the ARM neon ISA description, and then a whole range of unused private fields that are pruned.	2013-02-19 05:56:08 -05:00
Andreas Hansson	319443d42d	scons: Add warning for missing declarations This patch enables warnings for missing declarations. To avoid issues with SWIG-generated code, the warning is only applied to non-SWIG code.	2013-02-19 05:56:07 -05:00
Andreas Hansson	b44e0ce52b	scons: Add warning for overloaded virtual functions Fix the ISA startup warnings	2013-02-19 05:56:07 -05:00
Andreas Hansson	0acd2a96e5	scons: Add warning for overloaded virtual functions A derived function with a different signature than a base class function will result in the base class function of the same name being hidden. The parameter list and return type for the member function in the derived class must match those of the member function in the base class, otherwise the function in the derived class will hide the function in the base class and no polymorphic behaviour will occur. This patch addresses these warnings by ensuring a unique function name to avoid (unintentionally) hiding any functions.	2013-02-19 05:56:06 -05:00
Andreas Hansson	d670fa60a1	scons: Add warning for missing field initializers This patch adds a warning for missing field initializers for both gcc and clang, and addresses the warnings that were generated.	2013-02-19 05:56:06 -05:00
Andreas Hansson	c10098f28b	scons: Fix up numerous warnings about name shadowing This patch address the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.	2013-02-19 05:56:06 -05:00
Andreas Hansson	5c7ebee434	x86: Move APIC clock divider to Python This patch moves the 16x APIC clock divider to the Python code to avoid the post-instantiation modifications to the clock. The x86 APIC was the only object setting the clock after creation time and this required some custom functionality and configuration. With this patch, the clock multiplier is moved to the Python code and the objects are instantiated with the appropriate clock.	2013-02-19 05:56:06 -05:00
Andreas Hansson	0622f30961	mem: Add predecessor to SenderState base class This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.	2013-02-19 05:56:05 -05:00
Anthony Gutierrez	f7107fb795	loader: add a flattened device tree blob (dtb) object this adds a dtb_object so the loader can load in the dtb file for linux/android ARM kernels.	2013-02-15 18:48:59 -05:00
Mrinmoy Ghosh	8cef39fb67	arm: fix a page table walker issue where a page could be translated multiple times If multiple memory operations to the same page are miss the TLB they are all inserted into the page table queue and before this change could result in multiple uncessesary walks as well as duplicate enteries being inserted into the TLB.	2013-02-15 17:40:10 -05:00
Andreas Sandberg	b904bd5437	sim: Add a system-global option to bypass caches Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.	2013-02-15 17:40:09 -05:00
Ali Saidi	db5c478e70	arm: fix some fp comparisons that worked by accident. The explict tests in the follwing fp comparison operations were incorrect as they checked for only signaling NaNs and not quite-NaNs as well. When compiled with gcc, the comparison generates a fp exception that causes the FE_INVALID flag to be set and we check for it, so even though the check was incorrect, the correct exception was set. With clang this behavior seems to not occur. The checks are updated to test for nans and the behavior is now correct with both clang and gcc.	2013-02-15 17:40:08 -05:00
Ali Saidi	68495a0748	ARM: Fix an issue with clang generating wrong code. Clang generated executables would enter the if condition when it wasn't supposted to, resulting in the wrong simulated behavior. Implementing the operation this way is a bit faster anyway.	2013-02-15 17:40:08 -05:00
Nilay Vaish	fc57ae6401	x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switch The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.	2013-01-22 00:10:10 -06:00
Nilay Vaish	f2bcf4f01c	x86 cpuid: enable clflush Note that clflush is only being enabled. It is not implemented in actual. A warning is printed if the cpu encounters a clflush instruction. We need to enable this instruction in cpuid since JRE 1.7 tests for it.	2013-01-15 07:43:21 -06:00
Nilay Vaish	ac9bb51405	x86: implements fsin, fcos instructions	2013-01-15 07:43:21 -06:00
Nilay Vaish	7f5463539b	x86: implements emms instruction	2013-01-15 07:43:20 -06:00
Nilay Vaish	91b00d98a5	x86: implement fabs, fchs instructions	2013-01-15 07:43:19 -06:00
Nilay Vaish	25ec278a0b	x86: Changes to decoder, corrects 9376 The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.	2013-01-12 22:09:48 -06:00
Lluís Vilanova	807168a1de	util: add m5_fail op. Used as a command in full-system scripts helps the user ensure the benchmarks have finished successfully. For example, one can use: /path/to/benchmark args \|\| /sbin/m5 fail 1 and thus ensure gem5 will exit with an error if the benchmark fails.	2013-01-08 08:54:12 -05:00
Mitch Hayenga	4a752b1655	arm: add access syscall for ARM SE mode This patch adds the "access" syscall for ARM SE as required by some spec2006 benchmarks.	2013-01-08 08:54:07 -05:00
Andreas Sandberg	e09e9fa279	cpu: Flush TLBs on switchOut() This changeset inserts a TLB flush in BaseCPU::switchOut to prevent stale translations when doing repeated switching. Additionally, the TLB flushing functionality is exported to the Python to make debugging of switching/checkpointing easier. A simulation script will typically use the TLB flushing functionality to generate a reference trace. The following sequence can be used to simulate a handover (this depends on how drain is implemented, but is generally the case) between identically configured CPU models: m5.drain(test_sys) [ cpu.flushTLBs() for cpu in test_sys.cpu ] m5.resume(test_sys) The generated trace should normally be identical to a trace generated when switching between identically configured CPU models or checkpointing and resuming.	2013-01-07 13:05:48 -05:00
Andreas Sandberg	fb52ea9220	arm: Invalidate cached TLB configuration in drainResume Currently, we invalidate the cached miscregs in TLB::unserialize(). The intended use of the drainResume() method is to invalidate cached state and prepare the system to resume after a CPU handover or (un)serialization. This patch moves the TLB miscregs invalidation code to the drainResume() method to avoid surprising behavior.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	0d59549cd9	arm: Fix draining of the pagetable walker when squashing Since the page table walker only checks if a drain has completed in doL1DescriptorWrapper() and doL2DescriptorWrapper(), it sometimes looses track of a drain request if there is a squash. This changeset adds a completeDrain() call after squashing requests in the pending queue, which fixes this issue.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	38925ff621	arm: Remove the register mapping hack used when copying TCs In order to see all registers independent of the current CPU mode, the ARM architecture model uses the magic MISCREG_CPSR_MODE register to change the register mappings without actually updating the CPU mode. This hack is no longer needed since the thread context now provides a flat interface to the register file. This patch replaces the CPSR_MODE hack with the flat register interface.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	17b47d35e1	arch: Move the ISA object to a separate section After making the ISA an independent SimObject, it is serialized automatically by the Python world. Previously, this just resulted in an empty ISA section. This patch moves the contents of the ISA to that section and removes the explicit ISA serialization from the thread contexts, which makes it behave like a normal SimObject during serialization. Note: This patch breaks checkpoint backwards compatibility! Use the cpt_upgrader.py utility to upgrade old checkpoints to the new format.	2013-01-07 13:05:42 -05:00
Andreas Sandberg	94561dd526	arch: Add support for invalidating TLBs when draining This patch adds support for the memInvalidate() drain method. TLB flushing is requested by calling the virtual flushAll() method on the TLB. Note: This patch renames invalidateAll() to flushAll() on x86 and SPARC to make the interface consistent across all supported architectures.	2013-01-07 13:05:40 -05:00
Andreas Sandberg	c3551e82f7	arch: Fix broken M5VarArgsFault initialization At least gcc 4.4.3 seems to get confused by the use of func both as a template parameter and a member variable in the M5VarArgsFault class. This causes the value of the member variable func to be unpredictable in M5VarArgsFault objects. This changeset renames the template parameter to remove this ambiguity.	2013-01-07 13:05:38 -05:00
Andreas Hansson	71da1d2157	base: Encapsulate the underlying fields in AddrRange This patch makes the start and end address private in a move to prevent direct manipulation and matching of ranges based on these fields. This is done so that a transition to ranges with interleaving support is possible. As a result of hiding the start and end, a number of member functions are needed to perform the comparisons and manipulations that previously took place directly on the members. An accessor function is provided for the start address, and a function is added to test if an address is within a range. As a result of the latter the != and == operator is also removed in favour of the member function. A member function that returns a string representation is also created to allow debug printing. In general, this patch does not add any functionality, but it does take us closer to a situation where interleaving (and more cleverness) can be added under the bonnet without exposing it to the user. More on that in a later patch.	2013-01-07 13:05:38 -05:00
Andreas Sandberg	0d1ad50326	arm: Make ID registers ISA parameters This patch makes the values of ID_ISARx, MIDR, and FPSID configurable as ISA parameter values. Additionally, setMiscReg now ignores writes to all of the ID registers. Note: This moves the MIDR parameter from ArmSystem to ArmISA for consistency.	2013-01-07 13:05:35 -05:00
Andreas Sandberg	3db3f83a5e	arch: Make the ISA class inherit from SimObject The ISA class on stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU.	2013-01-07 13:05:35 -05:00
Ali Saidi	69d419f313	o3: Fix issue with LLSC ordering and speculation This patch unlocks the cpu-local monitor when the CPU sees a snoop to a locked address. Previously we relied on the cache to handle the locking for us, however some users on the gem5 mailing list reported a case where the cpu speculatively executes a ll operation after a pending sc operation in the pipeline and that makes the cache monitor valid. This should handle that case by invaliding the local monitor.	2013-01-07 13:05:33 -05:00
Gabe Black	e17c375ddd	Decoder: Remove the thread context get/set from the decoder. This interface is no longer used, and getting rid of it simplifies the decoders and code that sets up the decoders. The thread context had been used to read architectural state which was used to contextualize the instruction memory as it came in. That was changed so that the state is now sent to the decoders to keep locally if/when it changes. That's significantly more efficient. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 19:00:45 -06:00
Gabe Black	d1965af220	X86: Move address based decode caching in front of the predecoder. The predecoder in x86 does a lot of work, most of which can be skipped if the decoder cache is put in front of it. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 19:00:44 -06:00
Gabe Black	63b10907ef	SPARC: Keep a copy of the current ASI in the decoder. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 18:09:45 -06:00
Gabe Black	a83e74b37a	ARM: Keep a copy of the fpscr len and stride fields in the decoder. Avoid reading them every instruction, and also eliminate the last use of the thread context in the decoders. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 18:09:35 -06:00
Nilay Vaish	e9fa54de58	x86: implement x87 fp instruction fnstsw This patch implements the fnstsw instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.	2012-12-30 12:45:50 -06:00
Nilay Vaish	23ba6fc5fb	x86: implement x87 fp instruction fsincos This patch implements the fsincos instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.	2012-12-30 12:45:45 -06:00
Nathanael Premillieu	3026a116ba	arm: set uopSet_uop as conditional or unconditional control uopSet_uop is microop instruction that has the IsControl flags set, but the IsCondControl or IsUncondControl flags seems not to be set, neither in the construction nor where the microop is used. This patch adds the the flags in the constructor of the instruction (MicroUopSetPCCPSR). Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-12 09:50:33 -06:00
Nathanael Premillieu	84fc57bfe6	arm: set movret_uop as conditional or unconditional control A flag was missing for the movret_uop microop instruction. This patch adds that flag when the instruction is used, not directly in the constructor of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-12 09:50:16 -06:00
Andreas Sandberg	b81a977e6a	sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	c0ab52799c	sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	249e318212	mips: Remove unused Python file Remove BISystem.py, BareIronMipsSystem is already implemented in MipsSystem.py.	2012-11-02 11:32:01 -05:00
Dam Sunwoo	81406018b0	ARM: dump stats and process info on context switches This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM).	2012-11-02 11:32:01 -05:00
Dam Sunwoo	ac161c1d72	ISA: generic Linux thread info support This patch takes the Linux thread info support scattered across different ISA implementations (currently in ARM, ALPHA, and MIPS), and unifies them into a single file. Adds a few more helper functions to read out TGID, mm, etc. ISA-specific information (e.g., ALPHA PCBB register) is now moved to the corresponding isa_traits.hh files.	2012-11-02 11:32:00 -05:00
Andreas Hansson	1fdc4e850e	arm: Use table walker clock that is inherited from CPU This patch simplifies the scheduling of the next walk for the ARM table walker. Previously it used the CPU clock, but as the table walker inherits the clock from the CPU, it is cleaner to simply use its own clock (which is the same).	2012-10-25 04:32:42 -04:00
Andreas Hansson	2a740aa096	Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.	2012-10-15 08:12:35 -04:00
Andreas Hansson	d7ad8dc608	Checkpoint: Make system serialize call children This patch changes how the serialization of the system works. The base class had a non-virtual serialize and unserialize, that was hidden by a function with the same name for a number of subclasses (most likely not intentional as the base class should have been virtual). A few of the derived systems had no specialization at all (e.g. Power and x86 that simply called the System::serialize), but MIPS and Alpha adds additional symbol table entries to the checkpoint. Instead of overriding the virtual function, the additional entries are now printed through a virtual function (un)serializeSymtab. The reason for not calling System::serialize from the two related systems is that a follow up patch will require the system to also serialize the PhysicalMemory, and if this is done in the base class if ends up being between the general parts and the specialized symbol table. With this patch, the checkpoint is not modified, as the order of the segments is unchanged.	2012-10-15 08:12:29 -04:00
Andreas Hansson	93a159875a	Fix: Address a few minor issues identified by cppcheck This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only get's called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used.	2012-10-15 08:12:23 -04:00
Dam Sunwoo	acbb7a2eed	ARM: added support for flattened device tree blobs Newer Linux kernels require DTB (device tree blobs) to specify platform configurations. The input DTB filename can be specified through gem5 parameters in LinuxArmSystem.	2012-09-25 11:49:41 -05:00
Ali Saidi	0c99d21ad7	ARM: Squash outstanding walks when instructions are squashed.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	6f603e0807	arm: Use a static_assert to test that miscRegName[] is complete Instead of statically defining miscRegName to contain NUM_MISCREGS elements, let the compiler determine the length of the array. This allows us to use a static_assert to test that all registers are listed in the name vector.	2012-09-25 11:49:40 -05:00
Nathanael Premillieu	bfffbb6797	ARM: Inst writing to cntrlReg registers not set as control inst Deletion of the fact that instructions that writes to registers of type "cntrlReg" are not set as control instruction (flag IsControl not set).	2012-09-25 11:49:40 -05:00
Ali Saidi	04ca96427c	ARM: Predict target of more instructions that modify PC.	2012-09-25 11:49:40 -05:00
Andreas Hansson	ffb6aec603	AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. --HG-- rename : src/base/range.hh => src/base/addr_range.hh rename : src/base/range_map.hh => src/base/addr_range_map.hh	2012-09-19 06:15:44 -04:00
Nilay Vaish	f47c2f6415	X86: make use of register predication The patch introduces two predicates for condition code registers -- one tests if a register needs to be read, the other tests whether a register needs to be written to. These predicates are evaluated twice -- during construction of the microop and during its execution. Register reads and writes are elided depending on how the predicates evaluate.	2012-09-11 09:33:42 -05:00
Nilay Vaish	6369df59c8	x86: Add a separate register for D flag bit The D flag bit is part of the cc flag bit register currently. But since it is not being used any where in the implementation, it creates an unnecessary dependency. Hence, it is being moved to a separate register.	2012-09-11 09:25:43 -05:00
Nilay Vaish	3700e5448a	ISA Parser: Allow predication of source and destination registers This patch is meant for allowing predicated reads and writes. Note that this predication is different from the ISA provided predication. They way we currently provide the ISA description for X86, we read/write registers that do not need to be actually read/written. This is likely to be true for other ISAs as well. This patch allows for read and write predicates to be associated with operands. It allows for the register indices for source and destination registers to be decided at the time when the microop is constructed. The run time indicies come in to play only when the at least one of the predicates has been provided. This patch will not affect any of the ISAs that do not provide these predicates. Also the patch assumes that the order in which operands appear in any function of the microop is same across all the functions of the microops. A subsequent patch will enable predication for the x86 ISA.	2012-06-03 10:59:04 -05:00
Palle Lyckegaard	21d4d50ba1	NetBSD: Build on NetBSD Minor patch against so building on NetBSD is possible.	2012-09-10 11:57:42 -04:00
Andreas Hansson	0cacf7e817	Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.	2012-08-28 14:30:33 -04:00
Andreas Hansson	d53d04473e	Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.	2012-08-28 14:30:31 -04:00
Andreas Hansson	c60db56741	Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.	2012-08-22 11:39:59 -04:00
Andreas Hansson	70e99e0b91	Device: Remove overloaded pio_latency parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The PciConfigAll now also uses a Param.Latency rather than a Param.Tick. For backwards compatibility it still sets the pio_latency to 1 tick. All the comments have also been updated to not state that it is in simticks when it is not necessarily the case.	2012-08-21 05:50:03 -04:00
Andreas Hansson	452217817f	Clock: Move the clock and related functions to ClockedObject This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach clock domains. The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules. The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced).	2012-08-21 05:49:01 -04:00
Nilay Vaish	649e377937	Alpha System: override startup(), instead of loadState() Alpha System was overriding loadState() function to setup some functional event. The system tried to read/write to memory before the Ruby memory had unserialized the state. With this patch, Alpha System overrides the startup() function, and sets up functional events in this function. This works because startup() is called after Ruby memory system has unserialized the memory state.	2012-08-16 23:45:21 -05:00
Anthony Gutierrez	0b3897fc90	O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation.	2012-08-15 10:38:08 -04:00
Ali Saidi	dd1b346584	sysemul: bump all linux versions of for syscal emulation to 3.0. New tool chains seem to be looking for kernel versions newer than what this this was previously set to. Also take this opportunity to change the hostname we report in uname to sim.gem5.org.	2012-08-15 10:38:04 -04:00
Marc Orr	7cef6b9bef	syscall emulation: Enabled getrlimit and getrusage for x86. Added/moved rlimit constants to base linux header file. This patch is a revised version of Vince Weaver's earlier patch.	2012-08-06 19:52:56 -07:00
Marc Orr	d55115936e	syscall emulation: Clean up ioctl handling, and implement for x86. Enable different whitelists for different OS/arch combinations, since some use the generic Linux definitions only, and others use definitions inherited from earlier Unix flavors on those architectures. Also update x86 function pointers so ioctl is no longer unimplemented on that platform. This patch is a revised version of Vince Weaver's earlier patch.	2012-08-06 16:52:40 -07:00
Anthony Gutierrez	2eb6b403c9	ARM: fix value of MISCREG_CTR returned by readMiscReg() According to the A15 TRM the value of this register is as follows (assuming 16 word = 64 byte lines) [31:29] Format - b100 specifies v7 [28] RAZ - b0 [27:24] CWG log2(max writeback size #words) - 0x4 16 words [23:20] ERG log2(max reservation size #words) - 0x4 16 words [19:16] DminLine log2(smallest dcache line #words) - 0x4 16 words [15:14] L1Ip L1 index/tagging policy - b11 specifies PIPT [13:4] RAZ - b0000000000 [3:0] IminLine log2(smallest icache line #words) - 0x4 16 words	2012-07-27 16:08:04 -04:00
Nilay Vaish	11a551ae3a	X86 CPUID: Return false if unknown processor family	2012-07-22 20:31:23 -05:00
Brad Beckmann	8c18f6da9e	x86: added page size in bytes tlb entry function	2012-07-11 12:21:04 -07:00
Marc Orr	387f843d51	syscall emulation: Add the futex system call.	2012-07-10 22:51:54 -07:00
Brad Beckmann	52540b1b78	x86: logSize and lruSeq are now optional ckpt params	2012-07-10 22:51:54 -07:00
Andreas Hansson	46d9adb68c	Port: Make getAddrRanges const This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects.	2012-07-09 12:35:34 -04:00
Andreas Hansson	92eaac0711	gcc: Fix warnings for gcc 4.7 and clang 3.1 This patch fixes two warnings, one related to a narrowing conversion (int to MachInst), and one due to the cast operator for arguments and a mismatch in const-ness (const void* and void*).	2012-07-02 08:21:53 -04:00
Ali Saidi	71daeb0b2b	ARM: Fix identification of one RAS pop instruction. The check should be with the op2 field, not with the op1 field.	2012-06-29 11:18:29 -04:00
Ali Saidi	7e3496c78c	ARM: Update version of linux we claim to be to 3.0.0. Static binaries generated with new versions of libc complain that the kernel is too old otherwise.	2012-06-29 11:18:29 -04:00
Ali Saidi	aed8050824	ARM: Fix issue with predicted next pc being wrong because of advance() ordering. npc in PCState for ARM was being calculated before the current flags were updated with the next flags. This causes an issue as the npc is incremented by two or four depending on the current flags (thumb or not) and was leading to branches that were predicted correctly being identified as mispredicted.	2012-06-29 11:18:28 -04:00
Anthony Gutierrez	9764cde7f2	ARM: implement the ProcessInfo methods	2012-06-11 11:07:41 -04:00
Andreas Hansson	a118c01716	Power: Fix MaxMiscDestRegs which was set to zero This patch fixes a failing compilation caused by MaxMiscDestRegs being zero. According to gcc 4.6, the result is a comparison that is always false due to limited range of data type.	2012-06-08 12:44:17 -04:00
Nilay Vaish	d6609793d4	X86 TLB: Add a missing = sign	2012-06-07 17:03:45 -05:00
Jayneel Gandhi	7183c3fd56	X86 TLB: Fix for gcc 4.4.3 Due to recent changes to X86 TLB, gem5 stopped compiling on gcc version 4.4.3. This patch provides the fix for that problem. The patch is tested on gcc 4.4.3. The change is not required for more recent versions of gcc (like on 4.6.3).	2012-06-07 08:11:00 -05:00
Anthony Gutierrez	d6da3ff317	cpu: Don't init simple and inorder CPUs if they are defered. initCPU() will be called to initialize switched out CPUs for the simple and inorder CPU models. this patch prevents those CPUs from being initialized because they should get their state from the active CPU when it is switched out.	2012-06-05 14:20:13 -04:00
Ali Saidi	20d25b9da7	ISA: Back-out NoopMachInst as a StaticInstPtr change.	2012-06-05 13:52:30 -04:00
Chander Sudanthi	15228694d0	ARM: removed extra white space Extra white space fixes in miscregs.hh	2012-06-05 01:23:10 -04:00
Chander Sudanthi	8a2ca2fd24	ARM: Fix MPIDR and MIDR register implementation. This change allows designating a system as MP capable or not as some bootloaders/kernels care that it's set right. You can have a single processor MP capable system, but you can't have a multi-processor UP only system. This change also fixes the initialization of the MIDR register.	2012-06-05 01:23:10 -04:00
Ali Saidi	6df196b71e	O3: Clean up the O3 structures and try to pack them a bit better. DynInst is extremely large the hope is that this re-organization will put the most used members close to each other.	2012-06-05 01:23:09 -04:00
Ali Saidi	1b370431d0	sim: Remove FastAlloc While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM.	2012-06-05 01:23:08 -04:00
Ali Saidi	0b0c5621ee	ARM: Fix compilation on ARM after Gabe's change.	2012-06-05 01:23:08 -04:00
Gabe Black	008b17d816	ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs.	2012-06-04 10:57:23 -07:00
Gabe Black	35fa5074aa	X86: Ensure that the CPUID instruction always writes its outputs. The CPUID instruction was implemented so that it would only write its results if the instruction was successful. This works fine on the simple CPU where unwritten registers retain their old values, but on a CPU like O3 with renaming this is broken. The instruction needs to write the old values back into the registers explicitly if they aren't being changed.	2012-06-04 10:43:09 -07:00
Gabe Black	7b73c36f5d	X86: Ensure that the decoder's internal ExtMachInst is completely initialized. There are some bits of some fields of the ExtMachInst which are not actually used for anything but are included in the hash of an ExtMachInst for simplicity and efficiency. This change makes sure the decoder's internal working ExtMachInst is completely initialized, even these unused bits, so that there isn't any nondeterministic behavior, no valgrind messages about uninitialized variables, and no potential false misses/redundant entries in the decode cache.	2012-06-04 10:43:08 -07:00
Gabe Black	d9988ded3c	X86: Use the HandyM5Reg to avoid a register read and some logic in the TLB.	2012-05-28 21:56:23 -07:00
Gabe Black	40084e0c3e	X86: Move the GDT down to where it can be accessed in 32 bit mode. The GDT can be accessed by user level software running in compatibility mode by moving segment selectors into segment registers. The GDT needs to be set up at an address accessible in this mode.	2012-05-27 19:01:08 -07:00
Gabe Black	1d96135087	X86: Truncate addresses to 32 bits except in 64 bit mode, not long mode. A small change was added a while ago to keep addresses from overflowing 32 bits when larger addresses shouldn't be accessible to software. That change truncated when not in long mode, but really it should have truncated when not in 64 bit mode. The difference is whether compatibility mode is included, a mode that's supposed to act like a legacy 32 bit mode.	2012-05-27 19:01:04 -07:00
Gabe Black	19df4e94ee	ISA,CPU: Generalize and split out the components of the decode cache. This will allow it to be specialized by the ISAs. The existing caching scheme is provided by the BasicDecodeCache in the GenericISA namespace and is built from the generalized components. --HG-- rename : src/cpu/decode_cache.cc => src/arch/generic/decode_cache.cc	2012-05-26 13:45:12 -07:00
Gabe Black	0cba96ba6a	CPU: Merge the predecoder and decoder. These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process. --HG-- rename : src/arch/x86/predecoder_tables.cc => src/arch/x86/decoder_tables.cc	2012-05-26 13:44:46 -07:00
Gabe Black	eae1e97fb0	ISA: Make the decode function part of the ISA's decoder.	2012-05-25 00:55:24 -07:00
Gabe Black	82a228bd43	Decode: Make the Decoder class defined per ISA. --HG-- rename : src/cpu/decode.cc => src/arch/generic/decoder.cc rename : src/cpu/decode.hh => src/arch/generic/decoder.hh	2012-05-25 00:53:37 -07:00
Andreas Hansson	d4847fe6ea	DMA: Split the DMA device and IO device into seperate files This patch moves the DMA device to its own set of files, splitting it from the IO device. There are no behavioural changes associated with this patch. The patch also grabs the opportunity to do some very minor tidying up, including some white space removal and pruning some redundant parameters. Besides the immediate benefits of the separation-of-concerns, this patch also makes upcoming changes more streamlined as it split the devices that are only slaves and the DMA device that also acts as a master. --HG-- rename : src/dev/io_device.cc => src/dev/dma_device.cc rename : src/dev/io_device.hh => src/dev/dma_device.hh	2012-05-23 09:15:45 -04:00
Andreas Hansson	5b36cf623c	MEM: Add a snooping DMA port subclass for table walker This patch makes the (device) DmaPort non-snooping and removes the recvSnoop constructor parameter and instead introduces a SnoopingDmaPort subclass for the ARM table walker. Functionality is unchanged, as are the stats, and the patch merely clarifies that the normal DMA ports are not snooping (although they may issue requests that are snooped by others, as done with PCI, PCIe, AMBA4 ACE etc). Currently this port is declared in the ARM table walker as it is not used anywhere else. If other ports were to have similar behaviour it could be moved in a future patch.	2012-05-23 09:14:12 -04:00
Nilay Vaish	4d4d212ae9	X86: Split Condition Code register This patch moves the ECF and EZF bits to individual registers (ecfBit and ezfBit) and the CF and OF bits to cfofFlag registers. This is being done so as to lower the read after write dependencies on the the condition code register. Ultimately we will have the following registers [ZAPS], [OF], [CF], [ECF], [EZF] and [DF]. Note that this is only one part of the solution for lowering the dependencies. The other part will check whether or not the condition code register needs to be actually read. This would be done through a separate patch.	2012-05-22 11:29:53 -05:00
Marc Orr	16a559c9c6	x86 ISA: Implement the sse3 haddps instruction. Shuffle the 32 bit values into position, and then add in parallel.	2012-05-19 04:32:25 -07:00
Dam Sunwoo	f2f7fa1a1c	ARM: guard masked symbol tables by default Symbol tables masked with the loadAddrMask create redundant entries that could conflict with kernel function events that rely on the original addresses. This patch guards the creation of those masked symbol tables by default, with an option to enable them when needed (for early-stage kernel debugging, etc.)	2012-05-10 18:04:27 -05:00
Ali Saidi	8cee4dacc8	gem5: Fix a number of incorrect case statements	2012-05-10 18:04:26 -05:00
Andreas Hansson	3fea59e162	MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.	2012-05-01 13:40:42 -04:00
Gabe Black	2c85cf41a2	X86: Fix the IMUL_R_P_I macroop. The disp displacement was left off the load microop so the wrong value was used.	2012-04-29 02:26:34 -07:00
Vince Weaver	03a91b0533	X86: Fix up the open system call's flags.	2012-04-29 00:31:03 -07:00
Vince Weaver	38799e2b3f	X86: Make gem5 ignore a bunch of syscalls.	2012-04-29 00:30:56 -07:00
Gabe Black	64bf90dca3	X86: Clear out duplicate TLB entries when adding a new one. It's possible for two page table walks to overlap which will go in the same place in the TLB's trie. They would land on top of each other, so this change adds some code which detects if an address already matches an entry and if so throws away the new one.	2012-04-24 00:48:41 -07:00
Gabe Black	74ca8a3cd0	ISA: Put parser generated files in a "generated" directory. This is to avoid collision with non-generated files.	2012-04-23 12:00:41 -07:00
Gabe Black	29329e61b7	X86: Report an error if there's no kernel object, don't blindly use it. This way the user gets a nice message instead of a less nice segfault.	2012-04-21 15:00:23 -07:00
Gabe Black	8fe112d61b	X86: Fix a tiny typo in the load/store microop constructor. The parameter is _machInst, which is very similar to the member machInst. If machInst is used to pass the parameter to a lower level constructor, what really happens is that machInst is set to whatever it already happened to be, effectively leaving it uninitialized.	2012-04-15 01:07:39 -07:00
Gabe Black	aacb676220	X86: Use the AddrTrie class to implement the TLB. This change also adjusts the TlbEntry class so that it stores the number of address bits wide a page is rather than its size in bytes. In other words, instead of storing 4K for a 4K page, it stores 12. 12 is easy to turn into 4K, but it's a little harder going the other way.	2012-04-14 23:24:18 -07:00
Andreas Hansson	750f33a901	MEM: Remove the Broadcast destination from the packet This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.	2012-04-14 05:45:55 -04:00
Andreas Hansson	dccca0d3a9	MEM: Separate snoops and normal memory requests/responses This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.	2012-04-14 05:45:07 -04:00
Andreas Hansson	b6aa6d55eb	clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6 This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang has to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning. The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated: 1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed. 2) another other issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128. 3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11. As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb".	2012-04-14 05:43:31 -04:00
Andreas Hansson	b00949d88b	MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. --HG-- rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py rename : src/mem/physical.cc => src/mem/abstract_mem.cc rename : src/mem/physical.hh => src/mem/abstract_mem.hh rename : src/mem/physical.cc => src/mem/simple_mem.cc rename : src/mem/physical.hh => src/mem/simple_mem.hh	2012-04-06 13:46:31 -04:00
Gabe Black	a7859f7e45	X86: Fix address size handling so real mode works properly. Virtual (pre-segmentation) addresses are truncated based on address size, and any non-64 bit linear address is truncated to 32 bits. This means that real mode addresses aren't truncated down to 16 bits after their segment bases are added in.	2012-03-31 12:27:33 -07:00

1 2 3 4 5 ...

2806 commits