sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Anthony Gutierrez	8a53da22c2	cpu: allow the fetch buffer to be smaller than a cache line the current implementation of the fetch buffer in the o3 cpu is only allowed to be the size of a cache line. some architectures, e.g., ARM, have fetch buffers smaller than a cache line, see slide 22 at: http://www.arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf this patch allows the fetch buffer to be set to values smaller than a cache line.	2013-11-15 13:21:15 -05:00
Andreas Hansson	f028da7af7	cpu: Fix Checker register index use This patch fixes an issue in the checker CPU register indexing. The code will not even compile using LTO as deep inlining causes the used index to be outside the array bounds.	2013-11-15 03:47:10 -05:00
Steve Reinhardt	a2c21d47a8	tests: suppress output on switcheroo tests The output from the switcheroo tests is voluminous and (because it includes timestamps) highly sensitive to minor changes, leading to extremely large updates to the reference outputs. This patch addresses this problem by suppressing output from the tests. An internal parameter can be set to enable the output. Wiring that up to a command-line flag (perhaps even the rudimentary -v/-q options in m5/main.py) is left for future work.	2013-11-14 15:03:42 -08:00
Anthony Gutierrez	99d6c3b7e0	sim: fix event priority name for debug-start option	2013-11-12 11:46:48 -05:00
Andreas Hansson	460cc77d6d	mem: Fixes for DRAM stats accounting This patch fixes a number of stats accounting issues in the DRAM controller. Most importantly, it separates the system interface and DRAM interface so that it is clearer what the actual DRAM bandwidth (and consequently utilisation) is.	2013-11-01 11:56:31 -04:00
Andreas Hansson	ce93982cc6	mem: Fix the LPDDR3 page size This patch corrects the LPDDR3 page size, which was set too low.	2013-11-01 11:56:30 -04:00
Neha Agarwal	5c486908d7	mem: Adding stats for DRAM power calculation This patch adds stats which are used for offline power calculation from the 'Micron Power Calculator' spreadsheet.	2013-11-01 11:56:28 -04:00
Neha Agarwal	77fce1ce0e	mem: Unify request selection for read and write queues This patch unifies the request selection across read and write queues for FR-FCFS scheduling policy. It also fixes the request selection code to prioritize the row hits present in the request queues over the selection based on earliest bank availability.	2013-11-01 11:56:27 -04:00
Andreas Hansson	bb572663cf	mem: Add a simple adaptive version of the open-page policy This patch adds a basic adaptive version of the open-page policy that guides the decision to keep open or close by looking at the contents of the controller queues. If no row hits are found, and bank conflicts are present, then the row is closed by means of an auto precharge. This is a well-known technique that should improve performance in most use-cases.	2013-11-01 11:56:26 -04:00
Neha Agarwal	da6fd72f62	mem: Just-in-time write scheduling in DRAM controller This patch removes the untimed while loop in the write scheduling mechanism and now schedules commands taking into account the minimum timing constraint. It also introduces an optimization to track write queue size and switch from writes to reads if the number of write requests falls below the write low threshold.	2013-11-01 11:56:25 -04:00
Andreas Hansson	ee6b41a1e4	mem: Add tRRD as a timing parameter for the DRAM controller This patch adds the tRRD parameter to the DRAM controller. With the recent addition of the actAllowedAt member for each bank, this addition is trivial.	2013-11-01 11:56:24 -04:00
Andreas Hansson	491d3a77cf	mem: Less conservative tRAS in DRAM configurations This patch changes the default values of the tRAS timing parameter to be less conservative, and closer in line with existing parts.	2013-11-01 11:56:23 -04:00
Ani Udipi	8bc855fa15	mem: Make tXAW enforcement less conservative and per rank This patch changes the tXAW constraint so that it is enforced per rank rather than globally for all ranks in the channel. It also avoids using the bank freeAt to enforce the activation limit, as doing so also precludes performing any column or row command to the DRAM. Instead the patch introduces a new variable actAllowedAt for the banks and uses this to track when a potential activation can occur.	2013-11-01 11:56:22 -04:00
Neha Agarwal	7645c8e611	mem: Fix for 100% write threshold in DRAM controller This patch fixes the controller when a write threshold of 100% is used. Earlier for 100% write threshold no data is written to memory as writes never get triggered since this corner case is not considered.	2013-11-01 11:56:21 -04:00
Andreas Hansson	10e8978ec0	mem: Pick the next DRAM request based on bank availability This patch changes the FCFS bit of FR-FCFS such that requests that target the earliest available bank are picked first (as suggested in the original work on FR-FCFS by Rixner et al). To accommodate this we add functionality to identify a bank through a one-dimensional identifier (bank id). The member names of the DRAMPacket are also updated to match the style guide.	2013-11-01 11:56:20 -04:00
Ani Udipi	ea76f97576	mem: Use the same timing calculation for DRAM read and write This patch simplifies the DRAM model by re-using the function that computes the busy and access time for both reads and writes.	2013-11-01 11:56:19 -04:00
Ani Udipi	655bf86828	mem: Fix DRAM bank occupancy for streaming access This patch fixes an issue that allowed more than 100% bus utilisation in certain cases.	2013-11-01 11:56:18 -04:00
Ani Udipi	be62a142cf	mem: Schedule time for DRAM event taking tRAS into account This patch changes the time the controller is woken up to take the next scheduling decisions. tRAS is now handled in estimateLatency and doDRAMAccess and we do not need to worry about it at scheduling time. The earliest we need to wake up is to do a pre-charge, row access and column access before the bus becomes free for use.	2013-11-01 11:56:17 -04:00
Ani Udipi	d4cf009b95	mem: Add tRAS parameter to the DRAM controller model This patch adds an explicit tRAS parameter to the DRAM controller model. Previously tRAS was, rather conservatively, assumed to be tRCD + tCL + tRP. The default values for tRAS are chosen to match the previous behaviour and will be updated later.	2013-11-01 11:56:16 -04:00
Andreas Hansson	c9a8b7b147	sim: Clarify the difference between tracing and debugging This patch changes the names of the command-line options related to debug output so that they all start with "debug" rather than being a mix of that and "trace". It also makes it clear that the breakpoint time is specified in ticks and not in cycles.	2013-11-01 11:56:13 -04:00
Chander Sudanthi	3e6da89419	ARM: add support for TEEHBR access Thumb2 ARM kernels may access the TEEHBR via thumbee_notifier in arch/arm/kernel/thumbee.c. The Linux kernel code just seems to be saving and restoring the register. This patch adds support for the TEEHBR cp14 register. Note, this may be a special case when restoring from an image that was run on a system that supports ThumbEE.	2013-10-31 13:41:13 -05:00
Matt Evans	d17529b046	dev: Add 'OSC' oscillator sys control reg support to VersatileExpress The VE motherboard provides a set of system control registers through which various motherboard and coretile registers are accessed. Voltage regulators and oscillator (DLL/PLL) config are examples. These registers must be implemented to boot Linux 3.9+ kernels.	2013-10-31 13:41:13 -05:00
Geoffrey Blake	c32fbb7c00	dev: Add support for MSI-X and Capability Lists for ARM and PCI devices This patch adds the registers and fields to the PCI device to support Capability lists and to support MSI-X in the GIC.	2013-10-31 13:41:13 -05:00
Geoffrey Blake	be4aa2b6ba	dev: Fix race conditions in IDE device on newer kernels Newer Linux kernels and distros exercise more functionality in the IDE device than previously, exposing two races. The first race was that the handling of aborted DMA commands would immediately report the device as ready back to the kernel, causing commands already in flight to trip a simulator assertion when they returned and discovered an inconsistent device state. The second race was due to the Status register not being handled correctly: the interrupt status bit would get stuck at 1, and the driver would eventually view this as a bad state and log the condition to the terminal. This patch fixes these two conditions by making the device handle aborted commands gracefully and properly clear the interrupt status bit in the Status register.	2013-10-31 13:41:13 -05:00
Geoffrey Blake	fb0496498d	base: Add support for ipv6 into inet.hh/inet.cc	2013-10-31 13:41:13 -05:00
Faissal Sleiman	397dc784fd	cpu: Construct ROB with cpu params struct instead of each variable Most other structures/stages get passed the cpu params struct.	2013-10-31 13:41:13 -05:00
Geoffrey Blake	15938e0492	config: Fix handling of parents for simobject vectors SimObjectVector objects did not provide the same interface to the _parent attribute through get_parent() like a normal SimObject. It also handled assigning a _parent incorrectly if objects in a SimObjectVector were changed post-creation, leading to errors later when the simulator tried to execute. This patch fixes these two omissions.	2013-10-31 13:41:13 -05:00
Dam Sunwoo	6b4543184e	sim: added option to serialize SimLoopExitEvent SimLoopExitEvents weren't serialized by default. Some benchmarks utilize a delayed m5 exit pseudo op call to terminate the simulation and this event was lost when resuming from a checkpoint generated after the pseudo op call. This patch adds the capability to serialize the SimLoopExitEvents and enable serialization for m5_exit and m5_fail pseudo ops by default. Does not affect other generic SimLoopExitEvents.	2013-10-31 13:41:13 -05:00
Stephan Diestelhorst	19c2a606fa	mem: Add "const" attribute to Packet getters Add a "const" keywords to the getters in the Packet class so these can be invoked on const Packet objects.	2013-10-31 13:41:13 -05:00
Prakash Ramrakhyani	885656f2ed	mem: Add privilege info to request class This patch adds a flag in the request class that indicates if the request was made in privileged mode.	2013-10-31 13:41:13 -05:00
Ali Saidi	79f81e2641	cpu: Fix O3 issue with load+barrier instructions. Fix a problem in the O3 CPU for instructions that are both memory loads and memory barriers (e.g. load acquire) and access uncacheable memory. This combination can confuse the commit stage into committing an instruction that hasn't executed and got its value yet. At the same time refactor the code slightly to remove duplication between two of the cases.	2013-10-31 13:41:13 -05:00
Lluc Alvarez	2b9b245fb3	ruby: set SenderMachine in messages of MOESI_CMP_directory This patch adds missing initializations of the SenderMachine field of out_msg's when they are created in the L2 cache controller of the MOESI_CMP_directory coherence protocol. When an out_msg is created and this field is left uninitialized, it is set to the default value MachineType_NUM. This causes a panic in the MachineType_to_string function when gem5 is executed with the Ruby debug flag on and it tries to print the message. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-10-30 10:35:06 -05:00
Emilio Castillo	80fa6a0edc	ruby: Fixed a deadlock when restoring a checkpoint with garnet This patch fixes a problem in Garnet where the enqueue time in the VCallocator and the SWallocator, which is of type Cycles, was being stored in a variable of type int. This led to a known problem restoring checkpoints with garnet & the fixed pipeline enabled. The value was too big to fit in the variable and overflowed it, so some conditions in the VC allocation stage & the SW allocation stage were not met and the packets didn't advance through the network, leading to a deadlock panic right after the checkpoint was restored. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-10-30 10:35:05 -05:00
Stephan Diestelhorst	4e9d91016a	mem: De-virtualise interfaces in the CoherentBus The CoherentBus eventually got virtual methods for its interface. The "virtuality" of the CoherentBus, however, comes already from the virtual interface of the bus' ports. There is no need to add another layer of virtual functions, here.	2013-10-17 10:20:45 -05:00
Matt Horsnell	6decd70bfb	cpu: add consistent guarding to *_impl.hh files.	2013-10-17 10:20:45 -05:00
Sascha Bischoff	52f90890a3	mem: Add PortID to QueuedMasterPort constructor This patch adds the PortID to the QueuedMasterPort. This allows a PortID to be specified; previously it was always set to the default value of -1.	2013-10-17 10:20:45 -05:00
Matt Evans	94d17a547c	arm: Add a 'clear PPI' method to gic_pl390 The underlying assumption that all PPIs must be edge-triggered is strained when the architected timers and VGIC interfaces make level-behaviour observable. For example, a virtual timer interrupt 'goes away' when the hypervisor is entered and the vtimer is disabled; this requires a PPI to be de-activated. The new method simply clears the interrupt pending state.	2013-10-17 10:20:45 -05:00
Geoffrey Blake	2b9138135e	config: Fix omission of number base in ethernet address param The ethernet address param tries to convert a hexadecimal string using int() in Python, which defaults to base 10; base 16 needs to be specified in this case.	2013-10-17 10:20:45 -05:00
Geoffrey Blake	3d582c767a	config: Fix for port references generated multiple times SimObjects are expected to only generate one port reference per port belonging to them. There is a subtle bug with using "not" here as a VectorPort is seen as not having a reference if it is either None or empty as per Python docs sec 9.9 for Standard operators. Intended behavior is to only check if we have not created the reference.	2013-10-17 10:20:45 -05:00
Dam Sunwoo	ad614bf24d	dev: Add option to disable framebuffer .bmp dump in run folder There is an option to enable/disable all framebuffer dumps, but the last frame always gets dumped in the run folder with no other way to disable it. These files can add up very quickly running many experiments. This patch adds an option to disable them. The default behavior remains unchanged.	2013-10-17 10:20:45 -05:00
Faissal Sleiman	1746eb4a11	cpu: Removing an unused variable in rename	2013-10-17 10:20:45 -05:00
Faissal Sleiman	9195f1fbfd	cpu: Change IEW DPRINTF to use IEW debug flag IEW DPRINTF uses Decode debug flag, which appears to be a copying error. This patch changes this to the IEW Debug flag.	2013-10-17 10:20:45 -05:00
Faissal Sleiman	e516531bd0	cpu: Put in assertions to check for maximum supported LQ/SQ size LSQSenderState represents the LQ/SQ index using uint8_t, which supports up to 256 entries (including the sentinel entry). Sending packets to memory with a higher index than 255 truncates the index, such that the response matches the wrong entry. For instance, this can result in a deadlock if a store completion does not clear the head entry.	2013-10-17 10:20:45 -05:00
Eric Van Hensbergen	bfdd031c0d	arm: Accommodate function name changes in newer linux kernels	2013-10-17 10:20:45 -05:00
Ali Saidi	2f7b012ced	arm: Fix a GIC mask register bug This resulted in a kernel printk that said, "GIC CPU mask not found - kernel will fail to boot."	2013-10-17 10:20:45 -05:00
Ali Saidi	cf266f05a9	cpu: Fix O3 uncacheable load that is replayed but misses the TLB This change fixes an issue in the O3 CPU where execution of an uncacheable instruction is attempted before it reaches the head of the ROB. It is determined to be uncacheable, and is replayed, but a PanicFault is attached to the instruction to make sure that it is properly executed before committing. If the TLB entry it was using is replaced in the intervening time, the TLB returns a delayed translation when the load is replayed at the head of the ROB, however the LSQ code can't differentiate between the old fault and the new one. If the translation isn't complete it can't be faulting, so clear the fault.	2013-10-17 10:20:45 -05:00
Ali Saidi	60ce2b34fe	mem: Make MemoryAccess flag more verbose This patch extends the MemoryAccess debug flag to report who sent the requests and the cacheability.	2013-10-17 10:20:45 -05:00
Andreas Hansson	8a8e5cdc7e	build: Place proto output in the same directory, also for EXTRAS This patch changes the ProtoBuf builder such that the generated source and header is placed in the build directory of the proto file. This was previously not the case for the directories included as EXTRAS. To make this work, we also ensure that the build directory for the EXTRAS are added to the include path (which does not seem to automatically be the case).	2013-10-17 10:20:45 -05:00
Ali Saidi	88b811b4ef	dev: Allow additional UART interrupts to be set This patch allows setting a few additional interrupts for status changes that should never occur.	2013-10-17 10:20:45 -05:00
Andreas Sandberg	cc42e87b85	kvm: Fix latency calculation of IPR accesses When handling IPR accesses in doMMIOAccess, the KVM CPU used clockEdge() to convert between cycles and ticks. This is incorrect since doMMIOAccess is supposed to return a latency in ticks rather than when the access is done. This changeset fixes this issue by returning clockPeriod() * ipr_delay instead.	2013-10-16 18:12:15 +02:00
Steve Reinhardt	b10ff075b1	ruby: eliminate non-determinism from ruby.stats output Get rid of non-deterministic "stats" in ruby.stats output such as time & date of run, elapsed & CPU time used, and memory usage. These values cause spurious miscomparisons when looking at output diffs (though they don't affect regressions, since the regressions pass/fail status currently ignores ruby.stats entirely). Most of this information is already captured in other places (time & date in stdout, elapsed time & mem usage in stats.txt), where the regression script is smart enough to filter it out. It seems easier to get rid of the redundant output rather than teaching the regression tester to ignore the same information in two different places.	2013-10-15 18:22:49 -04:00
Yasuko Eckert	1bb293d1e7	arch/x86: add support for explicit CC register file Convert condition code registers from being specialized ("pseudo") integer registers to using the recently added CC register class. Nilay Vaish also contributed to this patch.	2013-10-15 14:22:44 -04:00
Yasuko Eckert	2c293823aa	cpu: add a condition-code register class Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.	2013-10-15 14:22:44 -04:00
Steve Reinhardt	5526221847	cpu/o3: clean up rename map and free list Restructured rename map and free list to clean up some extraneous code and separate out common code that can be reused across different register classes (int and fp at this point). Both components now consist of a set of Simple* objects that are stand-alone rename map & free list for each class, plus a Unified* object that presents a unified interface across all register classes and then redirects accesses to the appropriate Simple* object as needed. Moved free list initialization to PhysRegFile to better isolate knowledge of physical register index mappings to that class (and remove the need to pass a number of parameters to the free list constructor). Causes a small change to these stats: cpu.rename.int_rename_lookups cpu.rename.fp_rename_lookups because they are now categorized on a per-operand basis rather than a per-instruction basis. That is, an instruction with mixed fp/int/misc operand types will have each operand categorized independently, where previously the lookup was categorized based on the instruction type.	2013-10-15 14:22:44 -04:00
Steve Reinhardt	219c423f1f	cpu: rename _DepTag constants to _Reg_Base Make these names more meaningful. Specifically, made these substitutions: s/FP_Base_DepTag/FP_Reg_Base/g; s/Ctrl_Base_DepTag/Misc_Reg_Base/g; s/Max_DepTag/Max_Reg_Index/g;	2013-10-15 14:22:43 -04:00
Steve Reinhardt	a830e63de7	isa: clean up register constants Clean up and add some consistency to the *_Base_DepTag constants as well as some related register constants: - Get rid of NumMiscArchRegs, TotalArchRegs, and TotalDataRegs since they're never used and not always defined - Set FP_Base_DepTag = NumIntRegs when possible (i.e., every case except x86) - Set Ctrl_Base_DepTag = FP_Base_DepTag + NumFloatRegs (this was true before, but wasn't always expressed that way) - Drastically reduce the number of arbitrary constants appearing in these calculations	2013-10-15 14:22:43 -04:00
Steve Reinhardt	9bd017b8ae	cpu/o3: clean up scoreboard object It had a bunch of fields (and associated constructor parameters) that it didn't really use, and the array initialization was needlessly verbose. Also just hardwired the getReg() method to always return true for misc regs, rather than having an array of bits that we always kept marked as ready.	2013-10-15 14:22:43 -04:00
Steve Reinhardt	c009d0eb2a	cpu/o3: clean up physical register file No need for PhysRegFile to be a template class, or have a pointer back to the CPU. Also made some methods for checking the physical register type (int vs. float) based on the phys reg index, which will come in handy later.	2013-10-15 14:22:43 -04:00
Steve Reinhardt	06d246ab4a	cpu/inorder: merge register class enums The previous patch introduced a RegClass enum to clean up register classification. The inorder model already had an equivalent enum (RegType) that was used internally. This patch replaces RegType with RegClass to get rid of the now-redundant code.	2013-10-15 14:22:43 -04:00
Steve Reinhardt	7aa423acad	cpu: clean up architectural register classification Move from a poorly documented scheme where the mapping of unified architectural register indices to register classes is hardcoded all over to one where there's an enum for the register classes and a function that encapsulates the mapping.	2013-10-15 14:22:42 -04:00
Andreas Sandberg	4f5775df64	mem: Rename the ASI_BITS flag field in Request ASI_BITS in the Request object were originally used to store a memory request's ASI on SPARC. This is not the case any more since other ISAs use the ASI bits to store architecture-dependent information. This changeset renames the ASI_BITS to ARCH_BITS which better describes their use. Additionally, the getAsi() accessor is renamed to getArchFlags().	2013-10-15 13:26:34 +02:00
Andreas Sandberg	5e7738467b	mem: Use a flag instead of address bit 63 for generic IPRs Using address bit 63 to identify generic IPRs caused problems on SPARC, where IPRs are heavily used. This changeset redefines how generic IPRs are identified. Instead of using bit 63, we now use a separate flag (GENERIC_IPR) in a memory request.	2013-10-15 13:24:35 +02:00
Nilay Vaish	87cc327abb	x86: enables lstat and readlink syscalls	2013-10-07 18:05:49 -05:00
Andreas Sandberg	c0f367e514	base: Fix a potential race in PollQueue::setupAsyncIO There is a potential race between enabling asynchronous IO and selecting the target for the SIGIO signal. This changeset moves the F_SETOWN call to before the F_SETFL call that enables SIGIO delivery. This ensures that signals are always sent to the correct process.	2013-10-07 16:03:15 +02:00
Andreas Sandberg	0dd6f87e63	kvm: Service events in the instruction event queues This changeset adds calls to service the instruction event queues that accidentally went missing from commit [0063c7dd18ec]. The original commit only included the code needed to schedule instruction stops from KVM and missed the functionality to actually service the events.	2013-10-03 11:00:18 +02:00
Andreas Sandberg	fec2dea5c3	x86: Add support for m5ops through a memory mapped interface In order to support m5ops in virtualized environments, we need to use a memory mapped interface. This changeset adds support for that by reserving 0xFFFF0000-0xFFFFFFFF and mapping those to the generic IPR interface for m5ops. The mapping is done in the X86ISA::TLB::finalizePhysical() which means that it just works for all of the CPU models, including virtualized ones.	2013-09-30 12:20:53 +02:00
Andreas Sandberg	d9856f33a4	arch: Add support for m5ops using mmapped IPRs In order to support m5ops on virtualized CPUs, we need to either intercept hypercall instructions or provide a memory mapped m5ops interface. Since KVM does not normally pass the results of hypercalls to userspace, that method is unfeasible. This changeset introduces support for m5ops using memory mapped IPRs. This is implemented by adding a class of "generic" IPRs which are handled by architecture-independent code. Such IPRs always have bit 63 set and are handled by handleGenericIprRead() and handleGenericIprWrite(). Platform specific implementations of handleIprRead and handleIprWrite should use GenericISA::isGenericIprAccess to determine if an IPR address should be handled by the generic code instead of the architecture-specific code. Platforms that don't need their own IPR support can reuse GenericISA::handleIprRead() and GenericISA::handleIprWrite().	2013-09-30 12:20:43 +02:00
Andreas Sandberg	114b643dd0	x86: Add support for FXSAVE, FXSAVE64, FXRSTOR, and FXRSTOR64	2013-09-30 12:06:36 +02:00
Andreas Sandberg	47bcc5c737	x86: Add support for FLDENV & FNSTENV	2013-09-30 12:04:36 +02:00
Andreas Sandberg	654d1e675a	x86: Add support for loading 32-bit and 80-bit floats in the x87 The x87 FPU supports three floating point formats: 32-bit, 64-bit, and 80-bit floats. The current gem5 implementation supports 32-bit and 64-bit floats, but only works correctly for 64-bit floats. This changeset fixes the 32-bit float handling by correctly loading and rounding (using truncation) 32-bit floats instead of simply truncating the bit pattern. 80-bit floats are loaded by first loading the 80-bits of the float to two temporary integer registers. A micro-op (cvtint_fp80) then converts the contents of the two integer registers to the internal FP representation (double). Similarly, when storing an 80-bit float, there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that convert an internal FP register to 80-bit and store the upper 64-bits or lower 32-bits to an integer register, which is then written to memory using normal integer stores.	2013-09-30 12:00:20 +02:00
Andreas Sandberg	c299dcedc6	x86: Fix re-entrancy problems in x87 store instructions X87 store instructions typically load and pop the top value of the stack and store it in memory. The current implementation pops the stack at the same time as the floating point value is loaded to a temporary register. This will corrupt the state of the x87 stack if the store fails. This changeset introduces a pop87 micro-instruction that pops the stack and uses this instruction in the affected macro-instructions to pop the stack after storing the value to memory.	2013-09-30 11:51:25 +02:00
Andreas Sandberg	469f2e31cf	kvm: Add support for thread-specific instruction events Instruction events are currently ignored when executing in KVM. This changeset adds support for triggering KVM exits based on instruction counts using hardware performance counters. Depending on the underlying performance counter implementation, there might be some inaccuracies due to instructions being counted in the host kernel when entering/exiting KVM. Due to limitations/bugs in Linux's performance counter interface, we can't reliably change the period of an overflow counter. We work around this issue by detaching and reattaching the counter if we need to reconfigure it.	2013-09-30 09:53:52 +02:00
Andreas Sandberg	86bade714e	kvm: FPU synchronization support on x86 This changeset adds support for synchronizing the FPU and SIMD state of a virtual x86 CPU with gem5. It supports both the XSave API and the KVM_(GET|SET)_FPU kernel API. The XSave interface can be disabled using the useXSave parameter (in case of kernel issues). Unfortunately, the KVM_(GET|SET)_FPU interface seems to be buggy in some kernels (specifically, the MXCSR register isn't always synchronized), which means that it might not be possible to synchronize MXCSR on old kernels without the XSave interface. This changeset depends on the __float80 type in gcc and might not build using llvm.	2013-09-30 09:43:43 +02:00
Andreas Sandberg	cccca70149	x86: Add support routines to load and store 80-bit floats The x87 FPU on x86 supports extended floating point. We currently handle all floating point on x86 as double and don't support 80-bit loads/stores. This changeset adds a utility function to load and convert 80-bit floats to doubles (loadFloat80) and another function to store doubles as 80-bit floats (storeFloat80). Both functions use libfputils to do the conversion in software. The functions are currently not used, but are required to handle floating point in KVM and to properly support all x87 loads/stores.	2013-09-30 09:42:30 +02:00
Andreas Sandberg	3af2d8eab0	x86: Add limited support for extracting function call arguments Add support for extracting the first 6 64-bit integer arguments to a function call in X86ISA::getArgument().	2013-09-30 09:37:17 +02:00
Andreas Sandberg	30841926a3	kvm: x86: Fix segment registers to make them VMX compatible There are cases when the segment registers in gem5 are not compatible with VMX. This changeset works around all known such issues. Specifically: * The accessed bits in CS, SS, DS, ES, FS, GS are forced to 1. * The busy bit in TR is forced to 1. * The protection level of SS is forced to the same protection level as CS. The difference /seems/ to be caused by a bug in gem5's x86 implementation.	2013-09-30 09:36:54 +02:00
Andreas Sandberg	e5c319db43	kvm: Add x86 segment register verification to help debugging	2013-09-25 12:35:21 +02:00
Andreas Sandberg	599b59b387	kvm: Initial x86 support This changeset adds support for KVM on x86. Full support is split across a number of commits since some features are relatively complex. This changeset includes support for: * Integer state synchronization (including segment regs) * CPUID (gem5's CPUID values are inserted into KVM) * x86 legacy IO (remapped and handled by gem5's memory system) * Memory mapped IO * PCI * MSRs * State dumping Most of the functionality is fairly straightforward. There are some quirks to support PCI enumerations since this is done in the TLB(!) in the simulated CPUs. We currently replicate some of that code. Unlike the ARM implementation, the x86 implementation of the virtual CPU does not use the cycles hardware counter. KVM on x86 simulates the time stamp counter (TSC) in the kernel. If we just measure host cycles using perfevent, we might end up measuring a slightly different number of cycles. If we don't get the cycle accounting right, we might end up rewinding the TSC, with all kinds of chaos as a result. An additional feature of the KVM CPU on x86 is extended state dumping. This enables Python scripts controlling the simulator to request dumping of a subset of the processor state. The following methods are currently supported: * dumpFpuRegs * dumpIntRegs * dumpSpecRegs * dumpDebugRegs * dumpXCRs * dumpXSave * dumpVCpuEvents * dumpMSRs Known limitations: * M5 ops are currently not supported. * FPU synchronization is not supported (only affects CPU switching). Both of the limitations will be addressed in separate commits.	2013-09-25 12:24:26 +02:00
Andreas Sandberg	cd9cd85ce9	kvm: Correctly handle the return value from handleIpr(Read|Write) The KVM base class incorrectly assumed that handleIprRead and handleIprWrite both return ticks. This is not the case, instead they return cycles. This changeset converts the returned cycles to ticks when handling IPR accesses.	2013-09-19 17:55:04 +02:00
Andreas Sandberg	211c10b46d	kvm: Fix a case where the run timers weren't armed properly There is a possibility that the timespec used to arm a timer becomes zero if the number of ticks used when arming a timer is close to the resolution of the timer. Due to the semantics of POSIX timers, this actually disarms the timer. This changeset fixes this issue by eliminating the rounding error (we always round away from zero now). It also reuses the minimum number of cycles, which were previously only used for cycle-based timers, to calculate a more useful resolution.	2013-09-19 17:55:03 +02:00
Andreas Sandberg	a6e723e4d6	x86: Add support routines to convert between x87 tag formats This changeset adds the convX87XTagsToTags() and convX87TagsToXTags() which convert between the tag formats in the FTW register and the format used in the xsave area. The conversion to the x87 FTW representation currently loses some information since it does not reconstruct the valid/zero/special flags which are not included in the xsave representation.	2013-09-19 17:30:26 +02:00
Andreas Sandberg	4dbf25adc3	sim: Fix undefined behavior in the pseudo-inst interface The order between updating and using arg_num in PseudoInst::pseudoInst() is currently undefined. This changeset explicitly updates arg_num after it has been used to extract an argument. --HG-- extra : rebase_source : 67c46dc3333d16ce56687ee8aea41ce6c6d133bb	2013-09-18 17:08:35 +02:00
Andreas Hansson	9aa939891f	mem: Fix scheduling bug in SimpleMemory This patch ensures that a dequeue event is not scheduled if the memory controller is waiting for a retry already. Without this check it is possible for the controller to attempt sending something whilst already having one packet that is in retry, thus causing the bus to have an assertion failure.	2013-09-18 08:46:33 -04:00
Andreas Hansson	fe5212f932	swig: Fix issue with circular import in 2.0.9/2.0.10 This patch fixes an issue which prevented gem5 from running when built using swig 2.0.9 and 2.0.10. The generated event.py tried to import m5.internal which in turn relied on importing event. This patch seems to fix the problem, and so far has not caused any other issues.	2013-09-18 08:46:31 -04:00
Andreas Sandberg	e93e12a62b	x86: Expose the raw hash map of MSRs This patch allows the KVM CPU module to initialize its MSRs by enumerating the MSRs in the gem5 x86 implementation.	2013-09-18 11:28:28 +02:00
Andreas Sandberg	4b840b8322	x86: Add support for checking the raw state of an interrupt In order to support hardware virtualization, we need to be able to check if there are any interrupts pending regardless of the rflags.intf value. This changeset adds the checkInterruptsRaw() method to the x86 interrupt control. It returns true if there are pending interrupts that can be delivered as soon as the CPU is ready for interrupt delivery.	2013-09-18 11:28:27 +02:00
Andreas Sandberg	15733e9b33	x86: Expose the interrupt vector in faults This patch allows a hardware virtualized CPU to discover which interrupt to deliver to the guest.	2013-09-18 11:28:24 +02:00
Joel Hestness	cc155ffa0d	ruby: Fix Topology throttle connections The Topology source sets up input and output buffers for each of the external nodes of a topology by indexing on Ruby's generated controller unique IDs. These unique IDs are found by adding the MachineType_base_number to the version number of each controller (see any generated *_Controller.cc - init() calls getToNetQueue and getFromNetQueue using m_version + base). However, the Topology object used the cntrl_id - which is required to be unique across all controllers - to index the controllers list as they are being connected to their input and output buffers. If the cntrl_ids did not match the Ruby unique ID, the throttles end up connected to incorrectly indexed nodes in the network, resulting in packets traversing incorrect network paths. This patch fixes the Topology indexing scheme by using the Ruby unique ID to match that of the SimpleNetwork buffer vectors.	2013-09-11 15:35:18 -05:00
Joel Hestness	a1f9081bab	cpu: Dynamically instantiate O3 CPU LSQUnits Previously, the LSQ would instantiate MaxThreads LSQUnits in the body of its object, but it would only initialize numThreads LSQUnits as specified by the user. This had the effect of leaving some LSQUnits uninitialized when the number of threads was less than MaxThreads, and when adding statistics to the LSQUnit that must be initialized, this caused the stats initialization check to fail. By dynamically instantiating LSQUnits, they are all initialized, which prevents uninitialized LSQUnits from floating around during runtime.	2013-09-11 15:34:50 -05:00
Joel Hestness	c1cf55c738	ruby: Statically allocate stats in SimpleNetwork, Switch, Throttle The previous changeset (9863:9483739f83ee) used STL vector containers to dynamically allocate stats in the Ruby SimpleNetwork, Switch and Throttle. For gcc versions before at least 4.6.3, this causes the standard vector allocator to call Stats copy constructors (a no-no, since stats should be allocated in the body of each SimObject instance). Since the size of these stats arrays is known at compile time (NOTE: after code generation), this patch changes their allocation to be static rather than using an STL vector.	2013-09-11 15:33:27 -05:00
Nilay Vaish	e391fd151b	stats: add operator= for DataWrapVec class gcc/g++ 4.4.7 complained about the operator= being undefined. This changeset adds the operator.	2013-09-09 18:52:23 -05:00
Nilay Vaish	90bfbd9793	ruby: network: convert to gem5 style stats	2013-09-06 16:21:35 -05:00
Nilay Vaish	24dc914d87	ruby: profiler: removes function resourceUsage()	2013-09-06 16:21:32 -05:00
Nilay Vaish	79b5ea9d19	ruby: remove undefined message size type This message size type does not work well with one of the statistical variables. It also seems unnecessary.	2013-09-06 16:21:30 -05:00
Nilay Vaish	0280997fbf	ruby: network: removes reset functionality	2013-09-06 16:21:30 -05:00
Nilay Vaish	e7bd70e079	ruby: network: shorten variable names	2013-09-06 16:21:29 -05:00
Nilay Vaish	47d113696d	stats: adds a Formula operator for division	2013-09-06 16:21:29 -05:00
Nilay Vaish	c0a8ad0a35	ruby: converts sparse memory stats to gem5 style	2013-09-06 16:21:28 -05:00
Andreas Hansson	53cf77cf18	sim: Fix clang warning for unused variable This patch ensures the NULL ISA can build without causing issues with an unused variable.	2013-09-05 13:53:54 -04:00
Andreas Hansson	3b90f52b61	util: Add ini string as tooltip info in dot output This patch adds the config ini string as a tooltip that can be displayed in most browsers rendering the resulting svg. Certain characters are modified for HTML output. Tested on chrome and firefox.	2013-09-04 13:23:00 -04:00
Andreas Hansson	fad36b35c6	util: Add colours to the dot output This patch is adding a splash of colour to the dot output to make it easier to distinguish objects of different types. As a bonus, the pastel-colour palette also makes the output look like something from the 21st century.	2013-09-04 13:22:59 -04:00
Andreas Hansson	62cf785178	util: Add class name to dot graph and output to svg This patch adds the class name to the label, creates some more space by increasing the rank separation, and additionally outputs the graph as an editable SVG in addition to the PDF.	2013-09-04 13:22:58 -04:00
Andreas Hansson	19a5b68db7	arch: Resurrect the NOISA build target and rename it NULL This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models. The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. --HG-- rename : build_opts/NOISA => build_opts/NULL rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc	2013-09-04 13:22:57 -04:00
Andreas Hansson	ea40297018	cpu: Move the branch predictor out of the BaseCPU The branch predictor is guarded by having either the in-order or out-of-order CPU as one of the available CPU models and therefore should not be used in the BaseCPU. This patch moves the parameter to the relevant CPU classes.	2013-09-04 13:22:56 -04:00
Andreas Hansson	bb1d2f3957	arch: Header clean up for NOISA resurrection This patch is a first step to getting NOISA working again. A number of redundant includes make life more difficult than it has to be and this patch simply removes them. There are also some redundant forward declarations removed.	2013-09-04 13:22:55 -04:00
Andreas Hansson	cead68a781	alpha: Move system virtProxy to Alpha only This patch moves the system virtual port proxy to the Alpha system only to make the resurrection of the NOISA slightly less painful. Alpha is the only ISA that is actually using it.	2013-09-04 13:22:55 -04:00
Andreas Hansson	fdf6f6c4b6	scons: Enable build on OSX This patch changes the SConscript to build gem5 with libc++ on OSX as the conventional libstdc++ does not have the C++11 constructs that the current code base makes use of (e.g. std::forward). Since this was the last use of the transitional TR1, the unordered map and set header can now be simplified as well.	2013-09-04 13:22:54 -04:00
Andreas Hansson	c6062a3981	cpu: Fix timing CPU isDrained comment formatting This patch fixes up the comment formatting for isDrained in the timing CPU.	2013-08-20 11:21:27 -04:00
Andreas Hansson	c57c452143	base: Fix VectorPrint initialisation This patch changes how the initialisation of the VectorPrint struct is done so that gcc 4.4 is happy again.	2013-08-20 11:21:26 -04:00
Andreas Hansson	b63631536d	stats: Cumulative stats update This patch updates the stats to reflect the: 1) addition of the internal queue in SimpleMemory, 2) moving of the memory class outside FSConfig, 3) fixing up of the 2D vector printing format, 4) specifying burst size and interface width for the DRAM instead of relying on cache-line size, 5) performing merging in the DRAM controller write buffer, and 6) fixing how idle cycles are counted in the atomic and timing CPU models. The main reason for bundling them up is to minimise the changeset size.	2013-08-19 03:52:36 -04:00
Lena Olson	646c4a23ca	cpu: Accurately count idle cycles for simple cpu Added a couple missing updates to the notIdleFraction stat. Without these, it sometimes gives a (not) idle fraction that is greater than 1 or less than 0.	2013-08-19 03:52:35 -04:00
Andreas Hansson	c26911013c	config: Command line support for multi-channel memory This patch adds support for specifying multi-channel memory configurations on the command line, e.g. 'se/fs.py --mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it enhances the functionality of MemConfig and moves the existing makeMultiChannel class method from SimpleDRAM to the support scripts. The se/fs.py example scripts are updated to make use of the new feature.	2013-08-19 03:52:34 -04:00
Andreas Hansson	49d88f08b0	mem: Change AbstractMemory defaults to match the common case This patch changes the default parameter value of conf_table_reported to match the common case. It also simplifies the regression and config scripts to reflect this change.	2013-08-19 03:52:33 -04:00
Sascha Bischoff	e553844efc	cpu: Fix TrafficGen trace playback This patch addresses an issue with trace playback in the TrafficGen where the trace was reset but the header was not read from the trace when a captured trace was played back for a second time. This resulted in parsing errors as the expected message was not found in the trace file. The header check is moved to an init function which is called by the constructor and when the trace is reset. This ensures that the trace header is read each time when the trace is replayed. This patch also addresses a small formatting issue in a panic.	2013-08-19 03:52:32 -04:00
Andreas Hansson	6279eaf1f7	mem: Use STL deque in favour of list for DRAM queues This patch changes the data structure used for the DRAM read, write and response queues from an STL list to deque. This optimisation is based on the observation that the size is small (and fixed), and that the structures are frequently iterated over in a linear fashion.	2013-08-19 03:52:32 -04:00
Andreas Hansson	ac42db8134	mem: Perform write merging in the DRAM write queue This patch implements basic write merging in the DRAM to avoid redundant bursts. When a new access is added to the queue it is compared against the existing entries, and if it is either intersecting or immediately succeeding/preceding an existing item it is merged. There is currently no attempt made at avoiding iterating over the existing items in determining whether merging is possible or not.	2013-08-19 03:52:31 -04:00
Amin Farmahini	243f135e5f	mem: Replacing bytesPerCacheLine with DRAM burstLength in SimpleDRAM This patch gets rid of bytesPerCacheLine parameter and makes the DRAM configuration separate from cache line size. Instead of bytesPerCacheLine, we define a parameter for the DRAM called burst_length. The burst_length parameter shows the length of a DRAM device burst in bits. Also, lines_per_rowbuffer is replaced with device_rowbuffer_size to improve code portability. This patch adds a burst length in beats for each memory type, an interface width for each memory type, and the memory controller model is extended to reason about "system" packets vs "dram" packets and assemble the responses properly. It means that system packets larger than a full burst are split into multiple dram packets.	2013-08-19 03:52:30 -04:00
Andreas Hansson	7a61f667f0	cpu: Fix timing CPU drain check This patch modifies the SimpleTimingCPU drain check to also consider the fetch event. Previously, there was an assumption that there is never a fetch event scheduled if the CPU is not executing microcode. However, when a context is activated, a fetch event is scheduled, and microPC() is zero.	2013-08-19 03:52:30 -04:00
Andreas Hansson	f7d44590cb	alpha: Check interrupts before quiesce This patch adds a check to the quiesce operation to ensure that the CPU does not suspend itself when there are unmasked interrupts pending. Without this patch there are corner cases when the CPU gets an interrupt before the quiesce is executed and then never wakes up again.	2013-08-19 03:52:29 -04:00
Sascha Bischoff	6211c24a96	stats: Fix issue when printing 2D vectors This patch addresses an issue with the text-based stats output which resulted in Vector2D stats being printed without subnames in the event that one of the dimensions was of length 1. This patch also fixes the total printing for the 2D vector. Previously totals were printed without explicitly stating that a total was being printed. This has been rectified in this patch.	2013-08-19 03:52:29 -04:00
Akash Bagdia	e7e17f92db	power: Add voltage domains to the clock domains This patch adds the notion of voltage domains, and groups clock domains that operate under the same voltage (i.e. power supply) into domains. Each clock domain is required to be associated with a voltage domain, and the latter requires the voltage to be explicitly set. A voltage domain is an independently controllable voltage supply being provided to a section of the design. Thus, if you wish to perform dynamic voltage scaling on a CPU, its clock domain should be associated with a separate voltage domain. The current implementation of the voltage domain does not take into consideration cases where there are derived voltage domains running at a ratio of native voltage domains, as with the case where there can be on-chip buck/boost (charge pumps) voltage regulation logic. The regression and configuration scripts are updated with a generic voltage domain for the system, and one for the CPUs.	2013-08-19 03:52:28 -04:00
Andreas Hansson	d5593f3c75	mem: Warn instead of panic for tXAW violation Until the performance bug is fixed, avoid killing simulations.	2013-08-19 03:52:26 -04:00
Andreas Hansson	7bc3eaec7a	mem: Allow disabling of tXAW through a 0 activation limit This patch fixes an issue where an activation limit of 0 was not allowed. With this patch, setting the limit to 0 simply disables the tXAW constraint.	2013-08-19 03:52:26 -04:00
Andreas Hansson	2a675aecb9	mem: Add an internal packet queue in SimpleMemory This patch adds a packet queue in SimpleMemory to avoid using the packet queue in the port (and thus have no involvement in the flow control). The port queue was bound to 100 packets, and as the SimpleMemory is modelling both a controller and an actual RAM, it potentially has a large number of packets in flight. There is currently no limit on the number of packets in the memory controller, but this could easily be added in a follow-on patch. As a result of the added internal storage, the functional access and draining is updated. Some minor cleaning up and renaming has also been done. The memtest regression changes as a result of this patch and the stats will be updated.	2013-08-19 03:52:25 -04:00
Andreas Hansson	9b2effd9e2	cpu: Fix a bug in the O3 CPU introduced by the cache line patch This patch fixes a bug in the O3 fetch stage that was introduced when the cache line size was moved to the system. By mistake, the initialisation and resetting of the fetch stage was merged and put in the constructor. The resetting is now re-added where it should be.	2013-08-19 03:52:24 -04:00
Nilay Vaish	95381f8a99	ruby: slicc: remove double trigger, continueProcessing These constructs are not in use and are not being maintained by any one. In addition, it is not known if doubleTrigger works correctly with Ruby now.	2013-08-07 14:51:18 -05:00
Nilay Vaish	f1b17bf157	ruby: slicc: move some code to AbstractController Some of the code in StateMachine.py file is added to all the controllers and is independent of the controller definition. This code is being moved to the AbstractController class which is the parent class of all controllers.	2013-08-07 14:51:18 -05:00
Nilay Vaish	e038741598	x86: add tlb checkpointing This patch adds checkpointing support to x86 tlb. It upgrades the cpt_upgrader.py script so that previously created checkpoints can be updated. It moves the checkpoint version to 6.	2013-08-07 14:51:17 -05:00
Andreas Sandberg	b5bb2a25aa	cpu: Remove unused getBranchPred() method from BaseCPU Remove unused virtual getBranchPred() method from BaseCPU as it is not implemented by any of the CPU models. It used to always return NULL.	2013-07-19 11:52:07 +02:00
Andreas Hansson	d4273cc9a6	mem: Set the cache line size on a system level This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty of checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.	2013-07-18 08:31:16 -04:00
Xiangyu Dong	4e8ecd7c6f	mem: Add cache class destructor to avoid memory leaks Make valgrind a little bit happier	2013-07-18 08:29:47 -04:00
Andreas Hansson	204df3b928	sim: Make MaxTick in Python match the one in C++ This patch aligns the MaxTick in Python with the one in C++. Thus, both reflect the maximum value that an unsigned 64-bit integer can have.	2013-07-18 08:29:08 -04:00
Deyuan Guo	fb29dcf378	loader: Load weak symbols for function tracing	2013-07-15 18:08:57 -04:00
Umesh Bhaskar	5ba9e7afe2	debug : Fixes the issue wherein Debug symbols were not getting dumped into trace files for SE mode	2013-07-15 11:08:34 -04:00
Steve Reinhardt	1f43e244bd	dev: make BasicPioDevice take size in constructor Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 21:57:04 -05:00
Steve Reinhardt	502ad1e675	dev: consistently end device classes in 'Device' PciDev and IntDev stuck out as the only device classes that ended in 'Dev' rather than 'Device'. This patch takes care of that inconsistency. Note that you may need to delete pre-existing files matching build//python/m5/internal/param_ as scons does not pick up indirect dependencies on imported python modules when generating params, and the PciDev -> PciDevice rename takes place in a file (dev/Device.py) that gets imported quite a bit. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 21:56:50 -05:00
Steve Reinhardt	2737650a69	dev/arm: get rid of AmbaDev namespace It was confusing having an AmbaDev namespace along with an AmbaDevice class. The namespace stuff is now moved in to a new base AmbaDevice class, which is a mixin for classes AmbaPioDevice (the former AmbaDevice) and AmbaDmaDevice to provide the readId function as an inherited member function. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 21:56:39 -05:00
Steve Reinhardt	b0b1c0205c	devices: make more classes derive from BasicPioDevice A couple of devices that have single fixed memory mapped regions were not derived from BasicPioDevice, when that's exactly the functionality that BasicPioDevice provides. This patch gets rid of a little bit of redundant code by making those devices actually do so. Also fixed the weird case of X86ISA::Interrupts, where the class already did derive from BasicPioDevice but didn't actually use all the features it could have. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 21:56:24 -05:00
Brad Beckmann	8e54c93222	ruby: removed the very old double trigger hack Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-07-11 13:56:05 -05:00
Nilay Vaish	1be0098c0b	ruby: append transition comment only when in opt/debug	2013-06-28 21:42:27 -05:00
Nilay Vaish	b3980cdb9a	ruby: network: remove reconfiguration code This code seems not to be of any use now. There is no path in the simulator that allows for reconfiguring the network. A better approach would be to take a checkpoint and start the simulation from the checkpoint with the new configuration.	2013-06-28 21:36:37 -05:00
Prakash Ramrakhyani	ac515d7a9b	mem: Reorganize cache tags and make them a SimObject This patch reorganizes the cache tags to allow more flexibility to implement new replacement policies. The base tags class is now a clocked object so that derived classes can use a clock if they need one. Also, deriving from SimObject allows specialized Tag classes to be swapped in/out in .py files. The cache set is now templatized to allow it to contain customized cache blocks with additional information. This involved moving code to the .hh file and removing cacheset.cc. The statistics belonging to the cache tags now include ".tags" in their name. Hence, the stats need an update to reflect the change in naming.	2013-06-27 05:49:50 -04:00
Andreas Hansson	0d68d36b9d	mem: Remove the cache builder This patch removes the redundant cache builder class.	2013-06-27 05:49:50 -04:00
Andreas Hansson	a0e551869c	config: Remove Clock parameter multiplication This patch removes the multiplication operator support for Clock parameters as this functionality is now achieved by creating derived clock domains. Nate, this one is for you.	2013-06-27 05:49:50 -04:00
Akash Bagdia	7d7ab73862	sim: Add the notion of clock domains to all ClockedObjects This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains. The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For a piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead). The clock domains should be used as a mechanism to provide a controllable clock source that affects the clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains. All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a separate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain. The clock period of all domains is pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.	2013-06-27 05:49:49 -04:00
Akash Bagdia	076d04a653	config: Add a system clock command-line option This patch adds a 'sys_clock' command-line option and uses it to assign clocks to the system during instantiation. As part of this change, the default clock in the System class is removed and whenever a system is instantiated a system clock value must be set. A default value is provided for the command-line option. The configs and tests are updated accordingly.	2013-06-27 05:49:49 -04:00
Akash Bagdia	7eccb1b779	config: Remove redundant explicit setting of default clocks This patch removes the explicit setting of the clock period for certain instances of CoherentBus, NonCoherentBus and IOCache where the specified clock is same as the default value of the system clock. As all the values used are the defaults, there are no performance changes. There are similar cases where the toL2Bus is set to use the parent CPU clock which is already the default behaviour. The main motivation for these simplifications is to ease the introduction of clock domains.	2013-06-27 05:49:49 -04:00
Andreas Hansson	3b92748937	mem: Tidy up the bridge with const and additional checks This patch does a bit of tidying up in the bridge code, adding const where appropriate and also removing redundant checks and adding a few new ones. There are no changes to the behaviour of any regressions.	2013-06-27 05:49:49 -04:00
Andreas Hansson	f25ea3fd56	mem: Fix CommMonitor style and response check This patch fixes the CommMonitor local variable names, and also introduces a variable to capture if it expects to see a response. The latter check considers both needsResponse and memInhibitAsserted.	2013-06-27 05:49:49 -04:00
Andreas Hansson	33a8d777ad	mem: Align cache timing to clock edges This patch changes the cache timing calculations such that the results are aligned to clock edges. Plenty of stats change as a result of this patch.	2013-06-27 05:49:49 -04:00
Andreas Hansson	10650fc525	cpu: Consider instructions waiting for FU completion in draining This patch changes the IEW drain check to include the FU pool as there can be instructions that are "stored" in FU completion events and thus not covered by the existing checks. With this patch, we simply include a check to see if all the FUs are considered non-busy in the next tick. Without this patch, the pc-switcheroo-full regression fails after minor changes to the cache timing (aligning to clock edge).	2013-06-27 05:49:49 -04:00
Andreas Hansson	368f50a0a1	mem: Cycles converted to Ticks in atomic cache accesses This patch fixes an outstanding issue in the cache timing calculations where an atomic access returned a time in Cycles, but the port forwarded it on as if it was in Ticks. A separate patch will update the regression stats.	2013-06-27 05:49:49 -04:00
Andreas Hansson	997a6a4add	base: Fix address range granularity calculation This patch fixes a bug in the granularity calculation. For example, if the high bit is 6 (counting from 0) and we have one interleaving bit, then the granularity is now 2 ** (6 - 1 + 1) = 64.	2013-06-27 05:49:49 -04:00
Andreas Hansson	f330b3c28d	mem: Remove a redundant heap allocation for a snoop packet This patch changes the upwards snoop packet to avoid allocating and later deleting it. As the code executes in 0 time and the lifetime of the packet does not extend beyond the block there is no reason to heap allocate it.	2013-06-27 05:49:49 -04:00
Andreas Hansson	9a1169f3d7	mem: Remove CoherentBus snoop port unused private member This patch removes an unused member to avoid getting compiler warnings when using clang.	2013-06-27 05:49:49 -04:00
Sascha Bischoff	3d19bccb93	stats: Remove printing of SparseHist total This patch removes the printing of the SparseHist total in the stats.txt output file. This has been removed as a sparse histogram has no total, and therefore this was printing out the value of a non-local, unrelated variable.	2013-06-27 05:49:49 -04:00
Nilay Vaish	d8ed1d1a2c	ruby: moesi cmp directory: separate actions for external hits This patch adds separate actions for requests that missed in the local cache and messages were sent out to get the requested line. These separate actions are required for differentiating between the hit and miss latencies in the statistics collected.	2013-06-25 00:32:04 -05:00
Nilay Vaish	128ab50c47	ruby: mesi cmp directory: separate actions for external hits This patch adds separate actions for requests that missed in the local cache and messages were sent out to get the requested line. These separate actions are required for differentiating between the hit and miss latencies in the statistics collected.	2013-06-25 00:32:03 -05:00
Nilay Vaish	beb6e57c6f	ruby: profiler: lots of inter-related changes The patch started off with removing the global variables from the profiler for profiling the miss latency of requests made to the cache. The corresponding histograms have been moved to the Sequencer. These are combined together when the histograms are printed. Separate histograms are now maintained for tracking latency of all requests together, of hits only and of misses only. A particular set of histograms used to use the type GenericMachineType defined in one of the protocol files. This patch removes this type. Now, everything that relied on this type would use MachineType instead. To do this, SLICC has been changed so that multiple machine types can be declared by a controller in its preamble.	2013-06-25 00:32:03 -05:00
Nilay Vaish	b3db882dee	ruby: remove the three files related to profiling This patch removes the following three files: RubySlicc_Profiler.sm, RubySlicc_Profiler_interface.cc and RubySlicc_Profiler_interface.hh. Only one function prototyped in the file RubySlicc_Profiler.sm is in use; the rest of the code appearing in any of these files is not. Therefore, these files are being removed. That one single function, profileMsgDelay(), is being moved to the protocol files where it is in use. If we need any of these deleted functions, I think the right way to make them visible is to have the AbstractController class in a .sm and let the controller state machine inherit from this class. The AbstractController class can then have the prototypes of these profiling functions in its definition.	2013-06-24 08:59:08 -05:00
Joel Hestness	71c6c43110	ruby: MessageBuffer: Remove unused m_size variable The m_size variable attempted to track m_prio_heap.size(), but it did so incorrectly due to the functions reanalyzeMessages() and reanalyzeAllMessages(). Since this variable is intended to track m_prio_heap.size(), we can simply replace instances where m_size is referenced with m_prio_heap.size(), which has the added bonus of removing the need for m_size. Note: This patch also removes an extraneous DPRINTF format string designator from reanalyzeAllMessages(). Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-06-24 06:57:06 -05:00
Lena Olson	94280c7e51	ruby: fix typo in MOESI_CMP_token protocol	2013-06-20 16:20:38 -05:00
Lena Olson	ed234ddec6	ruby: Fix prefetching for MESI_CMP_Directory Transitions from present on PF_Ifetch were missing, causing a crash when prefetching is enabled. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-06-18 16:59:22 -05:00
Lena Olson	eb1279ff49	ruby: fix slicc compiler to complain about duplicate symbols Previously, .sm files were allowed to use the same name for a type and a variable. This is unnecessarily confusing and has some bad side effects, like not being able to declare later variables in the same scope with the same type. This causes the compiler to complain and die on things like Address Address. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-06-18 16:58:52 -05:00
Lena Olson	7c39d5df7e	ruby: restrict Address to being a type and not a variable name Change all occurrences of Address as a variable name to instead use Addr. Address is an allowed name in slicc even when Address is also being used as a type, leading to declarations of "Address Address". While this works, it prevents adding another field of type Address because the compiler then thinks Address is a variable name, not type. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-06-18 16:58:33 -05:00
Andreas Sandberg	d06064c386	x86: Add support for maintaining the x87 tag word The current implementation of the x87 never updates the x87 tag word. This is currently not a big issue since the simulated x87 never checks for stack overflows, however this becomes an issue when switching between a virtualized CPU and a simulated CPU. This changeset adds support, which is enabled by default, for updating the tag register to every floating point microop that updates the stack top using the spm mechanism. The new tag word is generated by the helper function X86ISA::genX87Tags(). This function is currently limited to flagging a stack position as valid or invalid and does not try to distinguish between the valid, zero, and special states.	2013-06-18 16:36:08 +02:00
Andreas Sandberg	a8e8c4f433	x86: Fix loading of floating point constants This changeset actually fixes two issues: * The lfpimm instruction didn't work correctly when applied to a floating point constant (it did work for integers containing the bit string representation of a constant) since it used reinterpret_cast to convert a double to a uint64_t. This caused a compilation error, at least, in gcc 4.6.3. * The instructions loading floating point constants in the x87 processor didn't work correctly since they just stored a truncated integer instead of a double in the floating point register. This changeset fixes the old microcode by using lfpimm instruction instead of the limm instructions.	2013-06-18 16:30:06 +02:00
Andreas Sandberg	c9c02efb99	x86: Initialize the MXCSR register	2013-06-18 16:28:36 +02:00
Andreas Sandberg	688fc7f71f	x86: Make the boot state VMX compliant This patch allows the default x86 state to be used by CPUs that use hardware virtualization.	2013-06-18 16:27:28 +02:00
Andreas Sandberg	5d584934ad	x86: Make fprem like the fprem on a real x87 The current implementation of fprem simply does an fmod and doesn't simulate any of the iterative behavior in a real fprem. This isn't normally a problem, however, it can lead to problems when switching between CPU models. If switching from a real CPU in the middle of an fprem loop to a simulated CPU, the output of the fprem loop becomes corrupted. This changeset changes the fprem implementation to work like the one on real hardware.	2013-06-18 16:10:42 +02:00
Andreas Sandberg	6151c0f7f4	kvm: Use the address finalization code in the TLB Reuse the address finalization code in the TLB instead of replicating it when handling MMIO. This patch also adds support for injecting memory mapped IPR requests into the memory system.	2013-06-18 16:10:22 +02:00
Andreas Sandberg	46a8cbbb7f	x86: Add helper functions to access rflags The rflags register is spread across several different registers. Most of the flags are stored in MISCREG_RFLAGS, but some are stored in microcode registers. When accessing RFLAGS, we need to reconstruct it from these registers. This changeset adds two functions, X86ISA::getRFlags() and X86ISA::setRFlags(), that take care of this magic.	2013-06-18 16:10:22 +02:00
Andreas Sandberg	de89e133d8	x86: Fix the flag handling code in FABS and FCHS This changeset fixes two problems in the FABS and FCHS implementation. First, the ISA parser expects the assignment in flag_code to be a pure assignment and not an and-assignment, which leads to the isa_parser omitting the misc reg update. Second, the FCHS and FABS macro-ops don't set the SetStatus flag, which means that the default micro-op version, which doesn't update FSW, is executed.	2013-06-18 16:10:21 +02:00
Andreas Sandberg	64270b19c3	kvm: Add more VM stats This changeset adds the following stats to KVM: * numVMHalfEntries: Number of entries into KVM to finalize pending IO operations without executing guest instructions. These typically happen as a result of a drain where the guest must finalize some operations before the guest state is consistent. * numExitSignal: Number of VM exits that have been triggered by a signal. These usually happen as a result of the timer that limits the time spent in KVM.	2013-06-11 09:43:05 +02:00
Andreas Sandberg	c97a99110b	kvm: Separate host frequency from simulated CPU frequency We used to use the KVM CPU's clock to specify the host frequency. This was not ideal for several reasons. One of them being that the clock parameter of a CPU determines the frequency of some of the components connected to the CPU. This changeset adds a separate hostFreq parameter that should be used to specify the host frequency until we add code to autodetect it. The hostFactor should still be used to specify the conversion factor between the host performance and that of the simulated system.	2013-06-11 09:24:55 +02:00
Andreas Sandberg	4f002930bc	kvm: Don't handle IO and execute in the same tick We currently execute instructions in the guest and then handle any IO request right after we break out of the virtualized environment. This has the effect of executing IO requests in the exact same tick as the first instruction in the sequence that was just run. There seem to be cases where this simplification upsets some timing-sensitive devices. This changeset splits execute and IO (and other services) across multiple ticks. This is implemented by adding a separate RunningService state to the CPU state machine. When a VM requires service, it enters into this state and pending IO is then serviced in the future instead of immediately. The delay between getting the request and servicing it depends on the number of cycles executed in the guest, which allows other components to catch up with the CPU.	2013-06-11 09:24:51 +02:00
Andreas Sandberg	df059f45a0	kvm: Maintain a local instruction counter and update totalNumInsts Update the system's totalNumInst counter when exiting from KVM and maintain an internal absolute instruction count instead of relying on the one from perf.	2013-06-11 09:24:40 +02:00
Andreas Sandberg	0b4a8b4086	x86: Fix bug when copying TSC on CPU handover The TSC value stored in MISCREG_TSC is actually just an offset from the current CPU cycle to the actual TSC value. Writes with side-effects to the TSC subtract the current cycle count before storing the new value, while reads add the current cycle count. When switching CPUs, the current value is copied without side-effects. This works as long as the source and the destination CPUs have the same clock frequencies. The TSC will jump, sometimes backwards, if they have different clock frequencies. Most OSes assume the TSC to be monotonic and break when this happens. This changeset makes sure that the TSC is copied with side-effects to ensure that the offset is updated to match the new CPU.	2013-06-11 09:24:38 +02:00
Andreas Sandberg	2442aae54f	sim: Revert [34e3295b0e39] (sim: Fix early termination in mult...) HG changeset 34e3295b0e39 introduced a check in the main simulation loop that discards exit events that happen at the same tick as another exit event. This was supposed to fix a problem where a simulation script got confused by multiple exit events. This obviously breaks the simulator since it can hide important simulation events, such as a simulation failure, that happen at the same time as a non-fatal simulation event.	2013-06-11 09:24:10 +02:00
Andreas Sandberg	0793d0727b	cpu: Add support for scheduling multiple inst/load stop events Currently, the only way to get a CPU to stop after a fixed number of instructions/loads is to set a property on the CPU that causes a SimLoopExitEvent to be scheduled when the CPU is constructed. This is clearly not ideal in cases where the simulation script wants the CPU to stop at multiple instruction counts (e.g., SimPoint generation). This changeset adds the methods scheduleInstStop() and scheduleLoadStop() to the BaseCPU. These methods are exported to Python and are designed to be used from the simulation script. By using these methods instead of the old properties, a simulation script can schedule a stop at any point during simulation or schedule multiple stops. The number of instructions specified when scheduling a stop is relative to the current point of execution.	2013-06-11 09:18:25 +02:00
Nilay Vaish	d32ee94231	ruby: remove several unused variables in Profiler This patch removes per processor cycle count, histogram for filter stats, histogram for multicasts, histogram for prefetch wait, some function prototypes that do not have definitions.	2013-06-09 07:30:00 -05:00
Nilay Vaish	27b321f2f7	ruby: remove periodic event from Profiler The Profiler class does not need an event for dumping statistics periodically. This is because there is a method for dumping statistics for all the sim objects periodically. Since Ruby is a sim object, its statistics are also included.	2013-06-09 07:29:59 -05:00
Nilay Vaish	f59a7af50a	ruby: stats: use gem5's stats for cache and memory controllers This moves event and transition count statistics for cache controllers to gem5's statistics. It does the same for the statistics associated with the memory controller in ruby. All the cache/directory/dma controllers individually collect the event and transition counts. A callback function, collateStats(), has been added that is invoked on the controller version 0 of each controller class. This function adds all the individual controller statistics to vector variables. All the code for registering the statistical variables and collating them is generated by SLICC. The patch removes the files _Profiler.{cc,hh} and _ProfileDumper.{cc,hh} which were earlier used for collecting and dumping statistics respectively.	2013-06-09 07:29:59 -05:00
Nilay Vaish	38736ce7c3	ruby: remove undefined functions in Address class	2013-06-09 07:29:58 -05:00
Nilay Vaish	f2b5b4c8cc	stats: allow printing vectors on a single line This patch adds a new flag to specify if the data values for a given vector should be printed in one line in the stats.txt file. The default behavior will be to print the data in multiple lines. It makes changes to print functions to enforce this behavior.	2013-06-09 07:29:57 -05:00
Andreas Sandberg	a3685b0181	dev: Clarify why updates are delayed when the MC146818 is activated	2013-06-04 10:08:21 +02:00
Andreas Sandberg	7846f59d0d	arch: Create a method to finalize physical addresses in the TLB Some architectures (currently only x86) require some fixing-up of physical addresses after a normal address translation. This is usually to remap devices such as the APIC, but could be used for other memory mapped devices as well. When running the CPU using hardware virtualization, we still need to do these address fix-ups before inserting the request into the memory system. This patch allows that code to be used by such CPUs without doing full address translations.	2013-06-03 13:55:41 +02:00
Andreas Sandberg	63dae28703	base: Make the Python module loader PEP302 compliant The custom Python loader didn't comply with PEP302 for two reasons: * Previously, we would overwrite old modules on name conflicts. PEP302 explicitly states that: "If there is an existing module object named 'fullname' in sys.modules, the loader must use that existing module". * The "__package__" attribute wasn't set. PEP302: "The __package__ attribute must be set." This changeset addresses both of these issues.	2013-06-03 13:51:03 +02:00
Andreas Sandberg	c2ec232920	kvm: Allow architectures to override the cycle accounting mechanism Some architectures have special registers in the guest that can be used to do cycle accounting. This is generally preferable since it prevents the guest from seeing a non-monotonic clock. This changeset adds a virtual method, getHostCycles(), that the architecture-specific code can override to implement this functionality. The default implementation uses the hwCycles counter.	2013-06-03 13:39:11 +02:00
Andreas Sandberg	15f81b6ed9	kvm: Add handling of EAGAIN when creating timers timer_create can apparently return -1 and set errno to EAGAIN if the kernel suffered a temporary failure when allocating a timer. This happens from time to time, so we need to handle it.	2013-06-03 13:38:59 +02:00
Andreas Sandberg	743f80712e	sim: Add debug output when executing pseudo-instructions	2013-06-03 13:21:21 +02:00
Andreas Sandberg	2b65fce5d9	kvm: Add a call to thread->startup() in startup() It is now required to initialize the thread context by calling startup() on it. Failing to do so currently causes the decoder in x86-based CPUs to get very confused when restoring from checkpoints.	2013-06-03 12:36:56 +02:00
Andreas Sandberg	5e60f87aa3	dev: Add support for disabling ticking and the divider in MC146818 Some Linux versions disable updates (regB.set = 1) to prevent the chip from updating its internal state while the OS is updating it. Support for this was already there, this patch merely disables the check in writeReg that prevented it from being enabled. The patch also includes support for disabling the divider, which is used to control when clock updates should start after setting the internal RTC state. These changes are required to boot most vanilla Linux distributions that update the RTC settings at boot.	2013-06-03 12:28:52 +02:00
Andreas Sandberg	14b8a17f28	dev: Clean up MC146818 register (A & B) handling Rewrite reg A & B handling to use the bitunion stuff instead of bit masking. Add better error messages when the kernel tries to enable unsupported stuff.	2013-06-03 12:28:41 +02:00
Andreas Hansson	3bc4ecdcb4	mem: More descriptive DRAM config names This patch changes the class names of the various DRAM configurations to better reflect what memory they are based on. The speed and interface width is now part of the name, and also the alias that is used to select them on the command line. Some minor changes are done to the actual parameters, to better reflect the named configurations. As a result of these changes the regressions change slightly and the stats will be bumped in a separate patch.	2013-05-30 12:54:14 -04:00
Andreas Hansson	83d99aebb1	mem: Add bytes per activate DRAM controller stat This patch adds a histogram to track how many bytes are accessed in an open row before it is closed. This metric is useful in characterising a workload and the efficiency of the DRAM scheduler. For example, a DDR3-1600 device requires 44 cycles (tRC) before it can activate another row in the same bank. For a x32 interface (8 bytes per cycle) that means 8 x 44 = 352 bytes must be transferred to hide the preparation time.	2013-05-30 12:54:13 -04:00
Andreas Hansson	d82bffd297	mem: Add static latency to the DRAM controller This patch adds a frontend and backend static latency to the DRAM controller by delaying the responses. Two parameters expressing the frontend and backend contributions in absolute time are added to the controller, and the appropriate latency is added to the responses when adding them to the (infinite) queued port for sending. For writes and reads that hit in the write buffer, only the frontend latency is added. For reads that are serviced by the DRAM, the static latency is the sum of the pipeline latencies of the entire frontend, backend and PHY. The default values are chosen based on having roughly 10 pipeline stages in total at 500 MHz. In the future, it would be sensible to make the controller use its clock and convert these latencies (and a few of the DRAM timings) to cycles.	2013-05-30 12:54:12 -04:00
Andreas Hansson	7da851d1a8	mem: Spring cleaning of MSHR and MSHRQueue This patch does some minor tidying up of the MSHR and MSHRQueue. The clean up started as part of some ad-hoc tracing and debugging, but seems worthwhile enough to go in as a separate patch. The highlights of the changes are reduced scoping (private) members where possible, avoiding redundant new/delete, and constructor initialisation to please static code analyzers.	2013-05-30 12:54:11 -04:00
Andreas Hansson	42191522cc	mem: Fix MSHR print format This patch fixes an incorrect print format string by adding an additional string element.	2013-05-30 12:54:09 -04:00
Andreas Hansson	4d7d8393ed	cpu: Prune the stale TraceCPU This patch prunes the TraceCPU as the code is stale and the functionality that it provided can now be achieved with the TrafficGen using its trace playback mode. The TraceCPU was able to play back pre-recorded memory traces of a few different formats, and to achieve this level of flexibility with the TrafficGen, use the util/encode_packet_trace (with suitable modifications) to create a protobuf trace off-line.	2013-05-30 12:54:09 -04:00
Sascha Bischoff	6f4be9bd4c	cpu: Check that minimum TrafficGen period is less than max period Add a check which ensures that the minimum period for the LINEAR and RANDOM traffic generator states is less than or equal to the maximum period. If the minimum period is greater than the maximum period a fatal is triggered.	2013-05-30 12:54:08 -04:00
Sascha Bischoff	04ccc79134	cpu: Fix bug when reading in TrafficGen state transitions This patch fixes a bug with the traffic generator which occurred when reading in the state transitions from the configuration file. Previously, the size of the vector which stored the transitions was used to get the size of the transitions matrix, rather than using the number of states. Therefore, if there were more transitions than states, i.e. some transitions have a probability of less than 1, then the traffic generator would fatal when trying to check the transitions. This issue has been addressed by using the number of input states, rather than the number of transitions.	2013-05-30 12:54:07 -04:00
Andreas Hansson	fc09bc8678	cpu: Add request elasticity to the traffic generator This patch adds an optional request elasticity to the traffic generator, effectively compensating for it in the case of the linear and random generators, and adding it in the case of the trace generator. The accounting is left with the top-level traffic generator, and the individual generators do the necessary math as part of determining the next packet tick. Note that in the linear and random generators we have to compensate for the blocked time to not be elastic, i.e. without this patch the aforementioned generators will slow down in the case of back-pressure.	2013-05-30 12:54:06 -04:00
Andreas Hansson	4931414ca7	cpu: Block traffic generator when requests have to retry This patch changes the queued port for a conventional master port and stalls the traffic generator when requests are not immediately accepted. This is a first step to allowing elasticity in the injection of requests. The patch also adds stats for the sent packets and retries, and slightly changes how the nextPacketTick and getNextPacket interact. The advancing of the trace is now moved to getNextPacket and nextPacketTick is only responsible for answering the question when the next packet should be sent.	2013-05-30 12:54:05 -04:00
Andreas Hansson	c9c35da934	cpu: Move traffic generator sending out of generator states This patch moves the responsibility for sending packets out of the generator states and leaves it with the top-level traffic generator. The main aim of this patch is to enable a transition to non-queued ports, i.e. with send/retry flow control, and to do so it is much more convenient to not wrap the port interactions and instead leave it all local to the traffic generator. The generator states now only govern when they are ready to send something new, and the generation of the packets to send. They thus have no knowledge of the port that is used.	2013-05-30 12:54:04 -04:00
Andreas Hansson	ba11a02cf2	cpu: Fold together the StateGraph and the TrafficGen This patch simplifies the object hierarchy of the traffic generator by getting rid of the StateGraph class and folding this functionality into the traffic generator itself. The main goal of this patch is to facilitate upcoming changes by reducing the number of affected layers.	2013-05-30 12:54:03 -04:00
Andreas Hansson	7e13c4d046	mem: Make returning snoop responses occupy response layer This patch introduces a mirrored internal snoop port to facilitate easy addition of flow control for the snoop responses that are turned into normal responses on their return. To perform this, the slave ports of the coherent bus are wrapped in internal master ports that are passed as the source ports to the response layer in question. As a result of this patch, there is more contention for the response resources, and as such system performance will decrease slightly. A consequence of the mirrored internal port is that the port the bus tells to retry (the internal one) and the port actually retrying (the mirrored one) are not the same. Thus, the existing check in tryTiming is no longer correct. In fact, the test is redundant as the layer is only in the retry state while calling sendRetry on the waiting port, and if the latter does not immediately call the bus then the retry state is left. Consequently the check is removed.	2013-05-30 12:54:02 -04:00
Andreas Hansson	2308f812ef	mem: Make the buses multi layered This patch makes the buses multi layered, and effectively creates a crossbar structure with distributed contention ports at the destination ports. Before this patch, a bus could have a single request, response and snoop response in flight at any time, and with these changes there can be as many requests as connected slaves (bus master ports), and as many responses as connected masters (bus slave ports). Together with address interleaving, this patch enables us to create high-throughput memory interconnects, e.g. 50+ GByte/s.	2013-05-30 12:54:01 -04:00
Andreas Hansson	e82996d9da	mem: Separate the two snoop response cases in the bus This patch makes the flow control and state updates of the coherent bus more clear by separating the two cases, i.e. forward as a snoop response, or turn it into a normal response. With this change it is also more clear what resources are being occupied, and that we effectively bypass the busy check for the second case. As a result of the change in resource usage some stats change.	2013-05-30 12:54:00 -04:00
Andreas Hansson	cb62d39835	mem: Tidy up a few variables in the bus This patch does some minor housekeeping on the bus code, removing redundant code, and moving the extraction of the destination id to the top of the functions using it.	2013-05-30 12:53:59 -04:00
Uri Wiener	91f7b065a9	mem: Add basic stats to the buses This patch adds a basic set of stats which are hard to impossible to implement using only communication monitors, and are needed for insight such as bus utilization, transactions through the bus etc. Stats added include throughput and transaction distribution, and also a two-dimensional vector capturing how many packets and how much data is exchanged between the masters and slaves connected to the bus.	2013-05-30 12:53:58 -04:00
Andreas Hansson	e1e73c5f39	mem: Use unordered set in bus request tracking This patch changes the set used to track outstanding requests to an unordered set (part of C++11 STL). There is no need to maintain the order, and hopefully there might even be a small performance benefit.	2013-05-30 12:53:57 -04:00
Andreas Hansson	82397921a5	mem: Check for waiting state in bus draining This patch fixes a bug in the bus where the bus transitions from busy to idle and still has a port that is waiting for a retry from a peer.	2013-05-30 12:53:57 -04:00
Andreas Hansson	bf6291460d	mem: Add a LPDDR3-1600 configuration This patch adds a typical (leaning towards fast) LPDDR3 configuration based on publicly available data. As expected, it looks very similar to the LPDDR2-S4 configuration, only with a slightly lower burst time.	2013-05-30 12:53:56 -04:00
Andreas Hansson	ce1ad84abd	mem: Adapt the LPDDR2 to match a single x32 channel This patch adapts the existing LPDDR2 configuration to make use of the multi-channel functionality. Thus, to get a x64 interface two controllers should be instantiated using the makeMultiChannel method. The page size and ranks are also adapted to better suit a typical LPDDR2 part.	2013-05-30 12:53:55 -04:00
Andreas Hansson	88aa7755f4	mem: Avoid explicitly zeroing the memory backing store This patch removes the explicit memset as it is redundant and causes the simulator to touch the entire space, forcing the host system to allocate the pages. Anonymous pages are mapped on the first access, and the page-fault handler is responsible for zeroing them. Thus, the pages are still zeroed, but we avoid touching the entire allocated space which enables us to use much larger memory sizes as long as not all the memory is actually used.	2013-05-30 12:53:54 -04:00
Andreas Hansson	4c7a283e55	base: Avoid size limitation on protobuf coded streams This patch changes how the streams are created to avoid the size limitation on the coded streams. As we only read/write a single message at a time, there is never any message larger than a few bytes. However, the coded stream eventually complains that its internal counter reaches 64+ MByte if the total file size exceeds this value. Based on suggestions in the protobuf discussion forums, the coded stream is now created for every message that is read/written. The result is that the internal byte count never goes above tens of bytes, and we can read/write any size file that the underlying file I/O can handle.	2013-05-30 12:53:53 -04:00
Andreas Hansson	d1a43d83da	cpu: Make hash struct instead of class to please clang This patch changes the type of the hash function for BasicBlockRanges to match the original definition of the templatized type. Without this, clang raises a warning and combined with the "-Werror" flag this causes compilation to fail.	2013-05-30 12:53:52 -04:00
Malek Musleh	64af621cc6	ruby: slicc: fix error msg in TypeFieldMemberAST.py	2013-05-21 11:57:14 -05:00
Gedare Bloom	22b60c57e6	x86: Squash outstanding walks when instructions are squashed. This is the x86 version of the ARM changeset baa17ba80e06. In case an instruction has been squashed by the o3 cpu, this patch allows page table walker to avoid carrying out a pending translation that the instruction requested for.	2013-05-21 11:40:11 -05:00
Nilay Vaish	30fe807316	x86: mark instructions for being function call/return Currently call and return instructions are not marked as IsCall and IsReturn. Thus, the branch predictor does not use RAS for these instructions. Similarly, the number of function calls that took place is recorded as 0. This patch marks these instructions as they should be.	2013-05-21 11:34:41 -05:00
Nilay Vaish	fba40864aa	x86: add op class for int and fp microops in isa description Currently all the integer microops are marked as IntAluOp and the floating point microops are marked as FloatAddOp. This patch adds support for marking different microops differently. Now IntMultOp, IntDivOp, FloatDivOp, FloatMultOp, FloatCvtOp, FloatSqrtOp classes will be used as well. This will help in providing different latencies for different op class.	2013-05-21 11:33:57 -05:00
Nilay Vaish	4ef466cc8a	ruby: moesi hammer: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:45 -05:00
Nilay Vaish	09d5bc7e6f	ruby: mesi cmp directory: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:38 -05:00
Nilay Vaish	bd3d1955da	ruby: moesi cmp token: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:24 -05:00
Nilay Vaish	e7ce518168	ruby: moesi cmp directory: cosmetic changes Updates copyright years, removes space at the end of lines, shortens variable names.	2013-05-21 11:32:15 -05:00
Nilay Vaish, Malek Musleh <malek.musleh@gmail.com>	59a7abff29	ruby: add stats to .sm files, remove cache profiler This patch changes the way cache statistics are collected in ruby. As of now, there is a separate entity called CacheProfiler which holds statistical variables for caches. The CacheMemory class defines different functions for accessing the CacheProfiler. These functions are then invoked in the .sm files. I find this approach opaque and prone to error. Secondly, we probably should not be paying the cost of a function call for recording statistics. Instead, this patch allows for accessing statistical variables in the .sm files. The collection would become transparent. Secondly, it would happen in place, so no function calls. The patch also removes the CacheProfiler class. --HG-- rename : src/mem/slicc/ast/InfixOperatorExprAST.py => src/mem/slicc/ast/OperatorExprAST.py	2013-05-21 11:31:31 -05:00
Anthony Gutierrez	d3c33d91b6	cpu: remove local/globalHistoryBits params from branch pred having separate params for the local/globalHistoryBits and the local/globalPredictorSize can lead to inconsistencies if they are not carefully set. this patch derives the number of bits necessary to index into the local/global predictors based on their size. the value of the localHistoryTableSize for the ARM O3 CPU has been increased to 1024 from 64, which is more accurate for an A15 based on some correlation against A15 hardware.	2013-05-14 18:39:47 -04:00
Andreas Sandberg	4e52789c6d	kvm: Add support for disabling coalesced MMIO Add the option useCoalescedMMIO to the BaseKvmCPU. The default behavior is to disable coalesced MMIO since this hasn't been heavily tested.	2013-05-14 16:02:45 +02:00
Andreas Sandberg	3ba93822cc	kvm: Dump state before panic in KVM exit handlers	2013-05-14 15:59:43 +02:00
Andreas Sandberg	98483ba858	kvm: Fix the memory interface used by KVM The CpuPort class was removed before the KVM patches were committed, which means that the KVM interface currently doesn't compile. This changeset adds the BaseKvmCPU::KVMCpuPort class which derives from MasterPort. This class is used on the data and instruction ports instead of the old CpuPort.	2013-05-14 15:56:04 +02:00
Andreas Sandberg	1ae30c68c1	arm: Add support for the m5fail pseudo-op	2013-05-14 15:06:50 +02:00
Andreas Sandberg	e316e4e5fe	kvm: Add a stat counting number of instructions executed This changeset adds a 'numInsts' stat to the KVM-based CPU. It also cleans up the variable names in kvmRun to make the distinction between host cycles and estimated simulated cycles clearer. As a bonus feature, it also fixes a warning (unreferenced variable) when compiling in fast mode.	2013-05-02 12:03:43 +02:00
Andreas Sandberg	fa249461ca	kvm: Add checkpoint debug print Add a debug print (when the Checkpoint debug flag is set) on serialize and unserialize. Additionally, dump the KVM state before serializing. The KVM state isn't dumped after unserializing since the state is loaded lazily on the next KVM entry.	2013-05-02 12:02:19 +02:00
Andreas Sandberg	41156c8196	kvm: Make MMIO requests uncacheable Device accesses are normally uncacheable. This change probably doesn't make any difference since we normally disable caching when KVM is active. However, there might be devices that check this, so we'd better enable this flag to be safe.	2013-05-02 12:01:50 +02:00
Andreas Sandberg	12d7498ad5	sim: Add support for m5fail in pseudoInst()	2013-05-02 11:54:08 +02:00
Michael Levenhagen	223f89a162	x86: corrects vsyscall address for gettimeofday The vsyscall address for gettimeofday is 0xffffffffff600000ul. The offset therefore should be 0x0 instead of 0x410. This can be cross checked with the file sysdeps/unix/sysv/linux/x86_64/gettimeofday.c in source of glibc. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-23 15:21:32 -05:00
Michael Levenhagen	794d00257a	x86: enable gettimeofday and getppid system calls Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-23 15:21:30 -05:00
Mitch Hayenga	b222ba2fd3	sim: Fix two bugs relating to software caching of PageTable entries. The existing implementation can read uninitialized data or stale information from the cached PageTable entries. 1) Add a valid bit for the cache entries. Simply using zero for the virtual address to signify invalid entries is not sufficient. Speculative, wrong-path accesses frequently access page zero. The current implementation would return an uninitialized TLB entry when address zero was accessed and the PageTable cache entry was invalid. 2) When unmapping/mapping/remapping a page, invalidate the corresponding PageTable cache entry if one already exists.	2013-04-23 09:47:52 -04:00
Andreas Hansson	3e35fa5dcc	cpu: Fix TraceGen flag initialisation This patch ensures the flags are always initialised.	2013-04-23 05:07:10 -04:00
Nilay Vaish	95eebf9e5e	ruby: mesi coherence protocol: remove unused state M_MB	2013-04-23 00:03:07 -05:00
Christian Menard	25a6b1866e	x86: increment the stack pointer in lret inst The 'lret' instruction reloads instruction pointer and code segment from the stack and then pops them. But the popping part is missing from the current implementation. This caused incorrect behavior in some code related to the Fiasco OS. Microops are being added to rectify the behavior of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-23 00:03:04 -05:00
Nilay Vaish	aa86800e7a	ruby: patch checkpoint restore with garnet Due to recent changes to clocking system in Ruby and the way Ruby restores state from a checkpoint, garnet was failing to run from a checkpointed state. The problem is that Ruby resets the time to zero while warming up the caches. If any component records a local copy of the time (read calls curCycle()) before the simulation has started, then that component will not operate until that time is reached. In the context of this particular patch, the Garnet Network class calls curCycle() at multiple places. Any non-operational component can block requests in the memory system, which the system interprets as a deadlock. This patch makes changes so that Garnet can successfully run from checkpointed state. It adds a globally visible time at which the actual execution started. This time is initialized in RubySystem::startup() function. This variable is only meant for components within Ruby. This replaces the private variable that was maintained within Garnet since it is not possible to figure out the correct time when the value of this variable can be set. The patch also does away with all cases where curCycle() is called within some Ruby component before the system has actually started executing. This is required due to the quirky manner in which ruby restores from a checkpoint.	2013-04-23 00:03:02 -05:00
Andreas Hansson	e23e3bea8b	mem: Address mapping with fine-grained channel interleaving This patch adds an address mapping scheme where the channel interleaving takes place on a cache line granularity. It is similar to the existing RaBaChCo that interleaves on a DRAM page, but should give higher performance when there is less locality in the address stream.	2013-04-22 13:20:34 -04:00
Andreas Hansson	e61799aa7c	mem: More descriptive enum names for address mapping This patch changes the slightly ambiguous names used for the address mapping scheme to be more descriptive, and actually spell out what they do. With this patch we also open up for adding more flavours of open- and close-type mappings, i.e. interleaving across channels with the open map.	2013-04-22 13:20:33 -04:00
Andreas Hansson	99b3a12a75	cpu: Use request flags in trace playback This patch changes the TraceGen such that it uses the optional request flags from the protobuf trace if they are present.	2013-04-22 13:20:33 -04:00
Andreas Hansson	fe97f0e2b1	cpu: Make the generators usable outside the TrafficGen module This patch enables the use of the generator behaviours outside the TrafficGen module. This is useful e.g. to allow packet replay modes for other devices in the system without having to replace them with a TrafficGen in the configuration files. This change also enables more specific behaviours to be composed as specific modules, e.g. BaseBandModem can use a number of generators and have application-specific parameters based around a specific set of generators.	2013-04-22 13:20:33 -04:00
Andreas Hansson	a35d3ff167	mem: Add a WideIO DRAM configuration This patch adds a WideIO 200 MHz configuration that can be used as a baseline to compare with DDRx and LPDDRx. Note that it is a single channel and that it should be replicated 4 times. It is based on publicly available information and attempts to capture an envisioned 8 Gbit single-die part (i.e. without TSVs).	2013-04-22 13:20:33 -04:00
Uri Wiener	a8fbfefb5e	mem: Adding verbose debug output in the memory system This patch provides useful printouts throughout the memory system. This includes pretty-printed cache tags and function call messages (call-stack like).	2013-04-22 13:20:33 -04:00
Andreas Hansson	9929e884b6	mem: Replace check with panic where inhibited should not happen This patch changes the SimpleTimingPort and RubyPort to panic on inhibited requests as this should never happen in either of the cases. The SimpleTimingPort is only used for the I/O devices PIO port and the DMA devices config port and should thus never see an inhibited request. Similarly, the SimpleTimingPort is also used for the MessagePort in x86, and there should also not be any cases where the port sees an inhibited request.	2013-04-22 13:20:33 -04:00
Andreas Sandberg	33ab8f735d	kvm: Add support for pseudo-ops on ARM This changeset adds support for m5 pseudo-ops when running in kvm-mode. Unfortunately, we can't trap the normal gem5 co-processor entry in KVM (it doesn't seem to be possible to trap accesses to non-existing co-processors). We therefore use BZJ instructions to cause a trap from virtualized mode into gem5. The BZJ instruction becomes a normal branch to the gem5 fallback code when running in simulated mode, which means that this patch does not need to change the ARM ISA-specific code. Note: This requires a patched host kernel.	2013-04-22 13:20:32 -04:00
Andreas Sandberg	1c529a4196	sim: Add a helper function to execute pseudo instructions All architectures execute m5 pseudo instructions by setting up arguments according to the ABI and executing a magic instruction that contains an operation number. Handling of such instructions is currently spread across the different ISA implementations. This changeset introduces the PseudoInst::pseudoInst function which handles most of this in an architecture independent way. This function is mainly intended to be used from KVM, but can also be used from the simulated CPUs.	2013-04-22 13:20:32 -04:00
Andreas Sandberg	32ecd72b6e	kvm: Add support for state dumping on ARM	2013-04-22 13:20:32 -04:00
Andreas Sandberg	f156020158	kvm: Add basic support for ARM Architecture specific limitations: * LPAE is currently not supported by gem5. We therefore panic if LPAE is enabled when returning to gem5. * The co-processor based interface to the architected timer is unsupported. We can't support this due to limitations in the KVM API on ARM. * M5 ops are currently not supported. This requires either a kernel hack or a memory mapped device that handles the guest<->m5 interface.	2013-04-22 13:20:32 -04:00
Andreas Sandberg	6d2941d990	arm: Add a method to query interrupt state ignoring CPSR masks Add the method checkRaw to ArmISA::Interrupts. This method can be used to query the raw state (ignoring CPSR masks) of an interrupt. It is primarily intended for hardware virtualized CPUs.	2013-04-22 13:20:32 -04:00
Andreas Sandberg	f8f66fa3df	kvm: Add experimental support for a perf-based execution timer Add support for using the CPU cycle counter instead of a normal POSIX timer to generate timed exits to gem5. This should, in theory, provide better resolution when requesting timer signals. The perf-based timer requires a fairly recent kernel since it requires a working PERF_EVENT_IOC_PERIOD ioctl. This ioctl has existed in the kernel for a long time, but it used to be completely broken due to an inverted match when the kernel copied things from user space. Additionally, the ioctl does not change the sample period correctly on all kernel versions which implement it. It is currently only known to work reliably on kernel version 3.7 and above on ARM.	2013-04-22 13:20:32 -04:00
Andreas Sandberg	2607efded8	kvm: Avoid synchronizing the TC on every KVM exit Reduce the number of KVM->TC synchronizations by overloading the getContext() method and only request an update when the TC is requested as opposed to every time KVM returns to gem5.	2013-04-22 13:20:32 -04:00
Andreas Sandberg	f485ad1908	kvm: Basic support for hardware virtualized CPUs This changeset introduces the architecture independent parts required to support KVM-accelerated CPUs. It introduces two new simulation objects: KvmVM -- The KVM VM is a component shared between all CPUs in a shared memory domain. It is typically instantiated as a child of the system object in the simulation hierarchy. It provides access to KVM VM specific interfaces. BaseKvmCPU -- Abstract base class for all KVM-based CPUs. Architecture dependent CPU implementations inherit from this class and implement the following methods: * updateKvmState() -- Update the architecture-dependent KVM state from the gem5 thread context associated with the CPU. * updateThreadContext() -- Update the thread context from the architecture-dependent KVM state. * dump() -- Dump the KVM state (optional). In order to deliver interrupts to the guest, CPU implementations typically override the tick() method and check for, and deliver, interrupts prior to entering KVM. Hardware-virtualized CPUs currently have the following limitations: * SE mode is not supported. * PC events are not supported. * Timing statistics are currently very limited. The current approach simply scales the host cycles with a user-configurable factor. * The simulated system must not contain any caches. * Since cycle counts are approximate, there is no way to request an exact number of cycles (or instructions) to be executed by the CPU. * Hardware virtualized CPUs and gem5 CPUs must not execute at the same time in the same simulator instance. * Only single-CPU systems can be simulated. * Remote GDB connections to the guest system are not supported. Additionally, m5ops requires an architecture specific interface and might not be supported.	2013-04-22 13:20:32 -04:00
Timothy M. Jones	005616518c	cpu: Let python scripts obtain the number of instructions executed	2013-04-22 13:20:31 -04:00
Andreas Sandberg	5f2361f3af	arm: Enable support for triggering a sim panic on kernel panics Add the options 'panic_on_panic' and 'panic_on_oops' to the LinuxArmSystem SimObject. When these options are enabled, the simulator panics when the guest kernel panics or oopses. Enable panic on panic and panic on oops in ARM-based test cases.	2013-04-22 13:20:31 -04:00
Dam Sunwoo	e8381142b0	sim: separate nextCycle() and clockEdge() in clockedObjects Previously, nextCycle() could return the current cycle if the current tick was already aligned with the clock edge. This behavior is not only confusing (not quite what the function name implies), but also caused problems in the drainResume() function. When exiting/re-entering the sim loop (e.g., to take checkpoints), the CPUs will drain and resume. Due to the previous behavior of nextCycle(), the CPU tick events were being rescheduled in the same ticks that were already processed before draining. This caused divergence from runs that did not exit/re-enter the sim loop. (Initially a cycle difference, but a significant impact later on.) This patch separates out the two behaviors (nextCycle() and clockEdge()), uses nextCycle() in drainResume, and uses clockEdge() everywhere else. Nothing (other than name) should change except for the drainResume timing.	2013-04-22 13:20:31 -04:00
Dam Sunwoo	2c1e344313	cpu: generate SimPoint basic block vector profiles This patch is based on http://reviews.m5sim.org/r/1474/ originally written by Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout folder) based on start and end addresses of basic blocks. Some comments to the original patch are addressed and hooks are added to create and resume from checkpoints based on instruction counts dictated by external SimPoint analysis tools. SimPoint creation/resuming options will be implemented as a separate patch.	2013-04-22 13:20:31 -04:00
Chris Emmons	121b15a54d	ARM: Add support for HDLCD controller for TC2 and newer Versatile Express tiles. Newer core tiles / daughterboards for the Versatile Express platform have an HDLCD controller that supports HD-quality output. This patch adds an implementation of the controller.	2013-04-22 13:20:31 -04:00
Andreas Sandberg	aa08069b3f	sim: Add helper functions that add PCEvents with custom arguments This changeset adds support for forwarding arguments to the PC event constructors to the following methods: addKernelFuncEvent addFuncEvent Additionally, this changeset adds the following helper methods to the System base class: addFuncEventOrPanic - Hook a PCEvent to a symbol, panic on failure. addKernelFuncEventOrPanic - Hook a PCEvent to a kernel symbol, panic on failure. System implementations have been updated to use the new functionality where appropriate.	2013-04-22 13:20:31 -04:00
Ali Saidi	c9e4678c16	cpu: fix a switching issue with the o3 cpu. This change fixes the switcheroo test that broke earlier this month. The code that was checking for the pipeline being blocked wasn't checking for a pending translation, only for an icache access.	2013-04-22 13:20:31 -04:00
Nilay Vaish	3d858e5627	Merged c22628fa2564 and 2285b98847d7	2013-04-17 16:09:37 -05:00
Deyuan Guo	b54e118628	base: load weak symbols from object file Without loading weak symbols into gem5, some function names and the given PC cannot correspond correctly, because the binding attributes of function names in an ELF file are not only STB_GLOBAL or STB_LOCAL, but also STB_WEAK. This patch adds a function for loading weak symbols. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-17 16:07:19 -05:00
Nathanael Premillieu	3ff091bdf4	arm: set ldr_ret_uop as conditional or unconditional control This patch adds a missing flag to the ldr_ret_uop microop instruction. The flag is added when the instruction is used, not directly in the constructor of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-04-17 16:07:10 -05:00
Nilay Vaish	03c60f005e	ruby: moesi cmp directory: add copyright notice	2013-04-17 16:06:58 -05:00
Andreas Hansson	234c9a36a2	dev: Fix a bug in the use of seekp/seekg This patch fixes two instances of incorrect use of the seekp/seekg stream member functions. These two functions return a stream reference (*this), and should not be compared to an integer value.	2013-04-17 08:17:03 -04:00
Joel Hestness	1583056de8	Ruby: Fix RubyPort evict packet memory leak When using the o3 or inorder CPUs with many Ruby protocols, the caches may need to forward invalidations to the CPUs. The RubyPort was instantiating a packet to be sent to the CPUs to signal the eviction, but the packets were not being freed by the CPUs. Consistent with the classic memory model, stack allocate the packet and heap allocate the request so on ruby_eviction_callback() completion, the packet destructor is called, and deletes the request (*Note: stack allocating the request causes double deletion, since it will be deleted in the packet destructor). This results in the fewest memory allocations without memory errors.	2013-04-09 16:25:30 -05:00
Joel Hestness	46d4b71aa2	Ruby: Delete packet requests during warmup When warming up caches in Ruby, the CacheRecorder sends fetch requests into Ruby Sequencers with packet types that require responses. Since responses are never generated for these CacheRecorder requests, the requests are not deleted in the packet destructor called from the Ruby hit callback. Free the request.	2013-04-09 16:25:29 -05:00
Joel Hestness	e98c3c227d	Ruby: Add field to slicc machine for generic type This allows you to have (e.g.) an L2 cache that is not named "L2Cache" but is still a GenericMachineType_L2Cache. This is particularly helpful if the protocol has multiple L2 controllers.	2013-04-09 16:25:29 -05:00
Joel Hestness	b936619ab4	Ruby: Order profilers based on version When Ruby stats are printed for events and transitions, they include stats for all of the controllers of the same type, but they are not necessarily printed in order of the controller ID "version", because of the way the profilers were added to the profiler vector. This patch fixes the push order problem so that the stats are printed in ascending order 0->(# controllers), so statistics parsers may correctly assume the controller to which the stats belong.	2013-04-09 16:25:29 -05:00
Jason Power	88d34665d0	Ruby: More descriptive message buffer connection fatal When connecting message buffers between Ruby controllers, it is easy to mistakenly connect multiple controllers to the same message buffer. This patch prints a more descriptive fatal message than the previous assert statement in order to facilitate easier debugging.	2013-04-09 16:15:06 -05:00
Jason Power	19cc9fc6bd	Ruby: Fix typo in Slicc if-statement AST error The error in the SLICC code was hidden by the python error in SLICC parser before this patch	2013-04-09 16:12:42 -05:00
Joel Hestness	3b02210713	Ruby System, Cache Recorder: Use delete [] for trace vars The cache trace variables are array allocated uint8_t* in the RubySystem and the Ruby CacheRecorder, but the code used delete to free the memory, resulting in Valgrind memory errors. Change these deletes to delete [] to get rid of the errors.	2013-04-07 20:31:15 -05:00
Nilay Vaish	ac778b1d02	o3cpu: commit: changes interrupt handling Currently the commit stage keeps a local copy of the interrupt object. Since the interrupt is usually handled several cycles after the commit stage becomes aware of it, it is possible that the local copy of the interrupt object may not be the interrupt that is actually handled. It is possible that another interrupt occurred in the interval between interrupt detection and interrupt handling. This patch creates a copy of the interrupt just before the interrupt is handled. The local copy is ignored.	2013-03-29 14:05:26 -05:00
Nilay Vaish	d2fd3b2ec2	x86: changes to apic, keyboard It is possible that the operating system wants to shut down the lapic timer by writing the timer's initial count to 0. This patch adds a check that the timer event is only scheduled if the count is non-zero. The patch also converts a few of the panics related to the keyboard to warnings since we are anyway not interested in simulating the keyboard.	2013-03-28 09:34:23 -05:00
Mitch Hayenga	4920f0d7e5	mem: Fix cache latency bug Fixes a latency calculation bug for accesses during a cache line fill. Under a cache miss, before the line is filled, accesses to the cache are associated with a MSHR and marked as targets. Once the line fill completes, MSHR target packets pay an additional latency of "responseLatency + busSerializationLatency". However, the "whenReady" field of the cache line is only set to an additional delay of "busSerializationLatency". This lacks the responseLatency component of the fill. It is possible for accesses that occur on the cycle of (or briefly after) the line fill to respond without properly paying the responseLatency. This also creates the situation where two accesses to the same address may be serviced in an order opposite of how they were received by the cache. For stores to the same address, this means that although the cache performs the stores in the order they were received, acknowledgements may be sent in a different order. Adding the responseLatency component to the whenReady field preserves the penalty that should be paid and prevents these ordering issues. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-03-27 18:36:09 -05:00
Steve Reinhardt	f0b745d556	scons: don't die on warnings in swig-generated code There's not much to do about it other than disable the offending warning anyway, so it's not worth terminating the build over. Also suppress uninitialized variable warnings on gcc (happens at least with gcc 4.4 and swig 1.3.40).	2013-03-27 10:03:02 -07:00
Rene de Jong	87089175cc	mem: Cancel cache retry event when blocking port This patch solves the corner case scenario where the sendRetryEvent could be scheduled twice, when an io device stresses the IOcache in the system. This should not be possible in the cache system.	2013-03-26 14:46:51 -04:00
Andreas Hansson	93a8423dea	mem: Separate waiting for the bus and waiting for a peer This patch splits the retryList into a list of ports that are waiting for the bus itself to become available, and a map that tracks the ports where forwarding failed due to a peer not accepting the packet. Thus, when a retry reaches the bus, it can be sent to the appropriate port that initiated that transaction. As a consequence of this patch, only ports that are really ready to go will get a retry, thus reducing the amount of redundant failed attempts. This patch also makes it easier to reason about the order of servicing requests as the ports waiting for the bus are now clearly FIFO and much easier to change if desired.	2013-03-26 14:46:47 -04:00
Andreas Hansson	362f6f1a16	mem: Introduce a variable for the retrying port This patch introduces a variable to keep track of the retrying port instead of relying on it being the front of the retryList. Besides the improvement in readability, this patch is a step towards separating out the two cases where a port is waiting for the bus to be free, and where the forwarding did not succeed and the bus is waiting for a retry to pass on to the original initiator of the transaction. The changes made are currently such that the regressions are not affected. This is ensured by always prioritizing the currently retrying port and putting it back at the front of the retry list.	2013-03-26 14:46:46 -04:00
Andreas Hansson	2123176684	mem: Add a generic id field to the packet trace This patch adds an optional generic 64-bit identifier field to the packet trace. This can be used to store the sequential number of the instruction that gave rise to the packet, thread id, master id, "sub"-master within a larger module etc. As the field is optional it has a marginal cost if not used.	2013-03-26 14:46:45 -04:00
Andreas Hansson	7a57b1bce0	mem: Add optional request flags to the packet trace This patch adds an optional flags field to the packet trace to encode the request flags that contain information about whether the request is (un)cacheable, instruction fetch, prefetch etc.	2013-03-26 14:46:44 -04:00
Andreas Hansson	08c1835bef	cpu: Remove CpuPort and use MasterPort in the CPU classes This patch changes the port in the CPU classes to use MasterPort instead of the derived CpuPort. The functions of the CpuPort are now distributed across the relevant subclasses. The port accessor functions (getInstPort and getDataPort) now return a MasterPort instead of a CpuPort. This simplifies creating derivative CPUs that do not use the CpuPort.	2013-03-26 14:46:42 -04:00
Nilay Vaish	b2c8c50f17	ruby: slicc: set sender, receiver clock objs for optional queue	2013-03-22 17:21:23 -05:00
Nilay Vaish	e85b556d70	ruby: message buffer: correct previous errors A recent set of patches added support for multiple clock domains to ruby. I had made some errors while writing those patches. The sender was using the receiver side clock while enqueuing a message in the buffer. Those errors became visible while creating (or restoring from) checkpoints. The errors also become visible when a multi eventq scenario occurs.	2013-03-22 17:21:22 -05:00
Nilay Vaish	47c8cb72fc	ruby: message buffer: remove _ptr from some variables The names were getting too long.	2013-03-22 15:53:27 -05:00
Nilay Vaish	6465cf5824	ruby: message buffer node: used Tick in place of Cycles The message buffer node used to keep time in terms of Cycles. Since the sender and the receiver can have different clock periods, storing node time in cycles requires some conversion. Instead store the time directly in Ticks.	2013-03-22 15:53:26 -05:00
Nilay Vaish	39e9445468	ruby: consumer: avoid using receiver side clock A set of patches was recently committed to allow multiple clock domains in ruby. In those patches, I had inadvertently made an incorrect use of the clocks. Suppose object A needs to schedule an event on object B. It was possible that A accesses B's clock to schedule the event. This is not possible in an actual system. Hence, changes are being made to the Consumer class so as to avoid such happenings. Note that in a multi eventq simulation, this can possibly lead to an incorrect simulation. There are two functions in the Consumer class that are used for scheduling events. The first function takes in the relative delay over the current time as the argument and adds the current time to it for scheduling the event. The second function takes in the absolute time (in ticks) for scheduling the event. The first function is now being moved to protected section of the class so that only objects of the derived classes can use it. All other objects will have to specify absolute time while scheduling an event for some consumer.	2013-03-22 15:53:26 -05:00
Nilay Vaish	28005a7626	ruby: remove unused profile functions	2013-03-22 15:53:25 -05:00
Nilay Vaish	89bb826079	ruby: keep histogram of outstanding requests in seq The histogram for tracking outstanding counts per cycle is maintained in the profiler. For a parallel implementation of the memory system, this histogram needs to be maintained locally. Hence it will now be kept in the sequencer itself. The resulting histograms will be merged when the stats are printed.	2013-03-22 15:53:25 -05:00
Nilay Vaish	870d545788	slicc: remove check if the L1Cache has a sequencer	2013-03-22 15:53:24 -05:00
Nilay Vaish	8573a69d8f	ruby: move stall and wakeup functions to AbstractController These functions are currently implemented in one of the files related to Slicc. Since these are purely C++ functions, they are better suited to be in the base class.	2013-03-22 15:53:24 -05:00
Nilay Vaish	eccc86e809	ruby: connect two controllers using only message buffers This patch modifies ruby so that two controllers can be connected to each other with only message buffers in between. Before this patch, all the controllers had to be connected to the network for them to communicate with each other. With this patch, one can have protocols where a controller is not connected to the network, but communicates with another controller through a message buffer.	2013-03-22 15:53:23 -05:00
Nilay Vaish	5aa43e130a	ruby: convert Topology to regular class The Topology class in Ruby does not need to inherit from SimObject class. This patch turns it into a regular class. The topology object is now created in the constructor of the Network class. All the parameters for the topology class have been moved to the network class.	2013-03-22 15:53:23 -05:00
Nilay Vaish	2d50127642	ruby: network: move routers from topology to network	2013-03-22 15:53:22 -05:00
Andreas Hansson	2ca42cd626	cpu: Avoid including inorder TLBUnit to avoid gcc LTO bug This patch comments out the inclusion of the inorder TLBUnit which is only used in the 9-stage pipeline. With the TLBUnit present, gcc >= 4.6 in combination with LTO ends up throwing away the definition of the TLBUnit destructor, and consequently fails to link. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53808 for more details about the bug, and http://gcc.gnu.org/ml/gcc/2012-06/msg00397.html for the discussion thread that also touches on similar issues seen with clang.	2013-03-20 06:41:23 -04:00
Andreas Hansson	c01c5e971b	mem: Fix missing delete of packet in DRAM access This patch fixes a memory leak caused by not deleting packets that require no response.	2013-03-18 05:22:45 -04:00
Nilay Vaish	dc37b03439	ruby: set: corrects csprintf() call introduced by 7d95b650c9b6	2013-03-15 16:28:08 -05:00
Andreas Sandberg	fc6f569d94	cpu: Fix state transition bug in the traffic generator The traffic generator used to incorrectly determine the next state when state 0 had a non-zero probability. Due to the way the next transition was determined, state 0 could never be entered other than as an initial state. This changeset updates the transition() method to correctly handle such cases and cases where the transition matrix is a 1x1 matrix.	2013-03-12 18:41:29 +01:00
Nilay Vaish	5c940fec0a	x86: implement some of the x87 instructions This patch implements ftan, fprem, fyl2x, fld* floating-point instructions.	2013-03-11 13:15:46 -05:00
Andreas Hansson	82f600e02d	base: Fix address range granularity calculations This patch fixes a bug in the address range granularity calculations. Previously it incorrectly used the high bit to establish the size of the regions created, when it should really be looking at the low bit.	2013-03-07 05:55:03 -05:00
Andreas Hansson	92e973b310	ruby: Fix gcc 4.8 maybe-uninitialized compilation error This patch fixes the one-and-only gcc 4.8 compilation error, being a warning about "maybe uninitialized" in Orion.	2013-03-07 05:55:02 -05:00
Andreas Hansson	c4645c0d68	x86: Make the table walker reset the packet delay This patch fixes an issue related to the table walker recycling packets that still have a bus delay that is not accounted for. For now, we simply ignore the values and reset them to zero.	2013-03-07 05:55:01 -05:00
Nilay Vaish	c061819890	ruby: remove the functional copy of memory in se mode This patch removes the functional copy of the memory that was maintained in the se mode. Now ruby itself will provide the data.	2013-03-06 21:53:57 -06:00
Nilay Vaish	e8802fa127	ruby: garnet: fixed: implement functional access	2013-03-06 21:53:16 -06:00
Ali Saidi	f205d83359	cpu: fix a switching issue with the o3 cpu. This change fixes the switcheroo test that broke earlier this month. The code that was checking for the pipeline being blocked wasn't checking for a pending translation, only for an icache access.	2013-03-04 23:33:47 -05:00
Ali Saidi	f4fd12d49e	ARM: fix some cases where instructions that write to fp reg 15 are accidentally branches.	2013-03-04 23:33:47 -05:00
Blake Hechtman	af8eb67fb4	ruby: fixes functional writes to RubyRequest The functional write code was assuming that all writes are block sized, which may not be true for Ruby Requests. This bug can lead to a buffer overflow. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-03-02 23:12:55 -06:00
Nilay Vaish	a4e8512afa	sim: remove duplicate check on stack size	2013-03-02 18:04:51 -06:00
Andreas Hansson	e5bcb30756	mem: Add check if SimpleDRAM nextReqEvent is scheduled This check covers a case where a retry is called from the SimpleDRAM causing a new request to appear before the DRAM itself schedules a nextReqEvent. By adding this check, the event is not scheduled twice.	2013-03-01 13:20:33 -05:00
Andreas Hansson	da5356ccce	mem: Add a method to build multi-channel DRAM configurations This patch adds a class method that allows easy creation of channel-interleaved multi-channel DRAM configurations. It is enabled by a class method to allow customisation of the class independent of the channel configuration. For example, the user can create a MyDDR subclass of e.g. SimpleDDR3, and then create a four-channel configuration of the subclass by calling MyDDR.makeMultiChannel(4, mem_start, mem_size).	2013-03-01 13:20:32 -05:00
Andreas Hansson	0facc8e1ac	mem: SimpleDRAM variable naming and whitespace fixes This patch fixes a number of small cosmetic issues in the SimpleDRAM module. The most important change is to move the accounting of received packets to after the check is made if the packet should be retried or not. Thus, packets are only counted if they are actually accepted.	2013-03-01 13:20:24 -05:00
Andreas Hansson	3ba131f4d5	mem: Add support for multi-channel DRAM configurations This patch adds support for multi-channel instances of the DRAM controller model by stripping away the channel bits in the address decoding. The patch relies on the availability of address interleaving and, at this time, it is up to the user to configure the interleaving appropriately. At the moment it is assumed that the channel interleaving bits are immediately following the column bits (smallest sensible interleaving). Convenience methods for building multi-channel configurations will be added later.	2013-03-01 13:20:22 -05:00
Andreas Hansson	1a58362e25	mem: Merge interleaved ranges when creating backing store This patch adds merging of interleaved ranges before creating the backing stores. The backing stores are always a contiguous chunk of the address space, and with this patch it is possible to have interleaved memories in the system.	2013-03-01 13:20:21 -05:00
Andreas Hansson	cafd38f36c	mem: Merge ranges in bus before passing them on This patch adds basic merging of address ranges to the bus, such that interleaved ranges are merged together before being passed on by the bus. As such, the bus aggregates the address ranges of the connected slave ports and then passes on the merged ranges through its master ports. The bus thus hides the complexity of the interleaved ranges and only exposes contiguous ranges to the surrounding system. As part of this patch, the bus ranges are also cached for any future queries.	2013-03-01 13:20:19 -05:00
Dibakar Gope, Nilay Vaish <nilay@cs.wisc.edu>	c636a09e83	ruby: mesi coherence protocol: invalidate lock The MESI CMP directory coherence protocol, while transitioning from SM to IM, did not invalidate the lock that it might have taken on a cache line. This patch adds an action for doing so. The problem was found by Dibakar, but I was not happy with his proposed solution. So I implemented a different solution. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-02-28 10:04:26 -06:00
Nilay Vaish	fea27bc49b	slicc: remove unused variable message_buffer_names	2013-02-19 22:58:51 -06:00
Nilay Vaish	e95e78ff2f	ruby: remove unused variable m_print_config in class Topology	2013-02-19 22:58:50 -06:00
Andreas Hansson	da950caed2	mem: Fix sender state bug and delay popping This patch fixes a newly introduced bug where the sender state was popped before checking that it should be. Amazingly all regressions pass, but Linux fails to boot on the detailed CPU with caches enabled.	2013-02-19 12:57:47 -05:00
Andreas Hansson	a62afd094b	scons: Fix warnings issued by clang 3.2svn (XCode 4.6) This patch fixes the warnings that clang 3.2svn emits due to the "-Wall" flag. There is one case of an uninitialised value in the ARM neon ISA description, and then a whole range of unused private fields that are pruned.	2013-02-19 05:56:08 -05:00
Andreas Hansson	08a5fd328b	scons: Unify the flags shared by gcc and clang This patch restructures and unifies the flags used by gcc and clang as they are largely the same. The common parts are now dealt with in a shared block of code, and the few bits and pieces that are specifically affecting either gcc or clang are done separately.	2013-02-19 05:56:07 -05:00
Andreas Hansson	5eddb63877	scons: Add warning delete with non-virtual destructor This patch enables a warning for deleting derived classes that do not have a virtual destructor. The patch merely adds additional checks, and there are currently no cases that had to be fixed.	2013-02-19 05:56:07 -05:00
Andreas Hansson	319443d42d	scons: Add warning for missing declarations This patch enables warnings for missing declarations. To avoid issues with SWIG-generated code, the warning is only applied to non-SWIG code.	2013-02-19 05:56:07 -05:00
Andreas Hansson	b44e0ce52b	scons: Add warning for overloaded virtual functions Fix the ISA startup warnings	2013-02-19 05:56:07 -05:00
Andreas Hansson	0acd2a96e5	scons: Add warning for overloaded virtual functions A derived function with a different signature than a base class function will result in the base class function of the same name being hidden. The parameter list and return type for the member function in the derived class must match those of the member function in the base class, otherwise the function in the derived class will hide the function in the base class and no polymorphic behaviour will occur. This patch addresses these warnings by ensuring a unique function name to avoid (unintentionally) hiding any functions.	2013-02-19 05:56:06 -05:00
Andreas Hansson	d670fa60a1	scons: Add warning for missing field initializers This patch adds a warning for missing field initializers for both gcc and clang, and addresses the warnings that were generated.	2013-02-19 05:56:06 -05:00
Andreas Hansson	c10098f28b	scons: Fix up numerous warnings about name shadowing This patch addresses the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.	2013-02-19 05:56:06 -05:00
Andreas Hansson	860155a5fc	mem: Enforce strict use of busFirst- and busLastWordTime This patch adds a check to ensure that the delay incurred by the bus is not simply disregarded, but accounted for by someone. At this point, all the modules do is to zero it out, and no additional time is spent. This highlights where the bus timing is simply dropped instead of being paid for. As a follow up, the locations identified in this patch should add this additional time to the packets in one way or another. For now it simply acts as a sanity check and highlights where the delay is simply ignored. Since no time is added, all regressions remain the same.	2013-02-19 05:56:06 -05:00
Andreas Hansson	40d0e6c899	mem: Change accessor function names to match the port interface This patch changes the names of the cache accessor functions to be in line with those used by the ports. This is done to avoid confusion and get closer to a one-to-one correspondence between the interface of the memory object (the cache in this case) and the port itself. The member function timingAccess has been split into a snoop/non-snoop part to avoid branching on the isResponse() of the packet.	2013-02-19 05:56:06 -05:00
Andreas Hansson	b3fc8839c4	mem: Make packet bus-related time accounting relative This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to connect a Last-Level Cache (LLC) directly to a memory controller without a bus in between. The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc) should be doing this if we opt for keeping this way of accounting for the bus timing.	2013-02-19 05:56:06 -05:00
Andreas Hansson	362160c8ae	mem: Add deferred packet class to prefetcher This patch removes the time field from the packet as it was only used by the prefetcher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent.	2013-02-19 05:56:06 -05:00
Andreas Hansson	7cd49b24d2	sim: Make clock private and access using clockPeriod() This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier.	2013-02-19 05:56:06 -05:00
Andreas Hansson	5c7ebee434	x86: Move APIC clock divider to Python This patch moves the 16x APIC clock divider to the Python code to avoid the post-instantiation modifications to the clock. The x86 APIC was the only object setting the clock after creation time and this required some custom functionality and configuration. With this patch, the clock multiplier is moved to the Python code and the objects are instantiated with the appropriate clock.	2013-02-19 05:56:06 -05:00
Sascha Bischoff	86a4d09269	mem: Fix SenderState related cache deadlock This patch fixes a potential deadlock in the caches. This deadlock could occur when more than one cache is used in a system, and pkt->senderState is modified in between the two caches. This happened as the caches relied on the senderState remaining unchanged, and used it for instantaneous upstream communication with other caches. This issue has been addressed by iterating over the linked list of senderStates until we are either able to cast to a MSHR* or senderState is NULL. If the cast is successful, we know that the packet has previously passed through another cache, and therefore update the downstreamPending flag accordingly. Otherwise, we do nothing.	2013-02-19 05:56:06 -05:00
Andreas Hansson	0622f30961	mem: Add predecessor to SenderState base class This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.	2013-02-19 05:56:05 -05:00
Andreas Hansson	f69d431ede	base: Fix a bug in the address interleaving This patch fixes a minor (but important) typo in the matching of an address to an interleaved range.	2013-02-19 05:56:05 -05:00
Andreas Hansson	9947923c60	mem: Ensure trace captures packet fields before forwarding This patch fixes a bug in the CommMonitor caused by the packet being modified before it is captured in the trace. By recording the fields before passing the packet on, and then putting these values in the trace we ensure that even if the packet is modified the trace captures what the CommMonitor saw.	2013-02-19 05:56:05 -05:00
Anthony Gutierrez	f7107fb795	loader: add a flattened device tree blob (dtb) object this adds a dtb_object so the loader can load in the dtb file for linux/android ARM kernels.	2013-02-15 18:48:59 -05:00
Mrinmoy Ghosh	8cef39fb67	arm: fix a page table walker issue where a page could be translated multiple times If multiple memory operations to the same page miss the TLB, they are all inserted into the page table queue and, before this change, could result in multiple unnecessary walks as well as duplicate entries being inserted into the TLB.	2013-02-15 17:40:10 -05:00
Andreas Sandberg	3af59ab386	cpu: Document exec trace flags	2013-02-15 17:40:10 -05:00
Andreas Sandberg	08467a88a6	dev: Use the correct return type for disk offsets Replace the use of off_t in the various DiskImage related classes with std::streampos. off_t is a signed 32 bit integer on most 32-bit systems, whereas std::streampos is normally a 64 bit integer on most modern systems. Furthermore, std::streampos is the type used by tellg() and seekg() in the standard library, so it should have been used in the first place. This patch makes it possible to use disk images larger than 2 GiB on 32 bit systems with a modern C++ standard library.	2013-02-15 17:40:10 -05:00
Geoffrey Blake	ca96e7bff1	cpu: Avoid duplicate entries in tracking structures for writes to misc regs setMiscReg currently makes a new entry for each write to a misc reg without checking for duplicates; this can trigger the assert if an instruction gets replayed and writes to the same misc regs multiple times. This fix prevents duplicate entries and instead updates the value.	2013-02-15 17:40:10 -05:00
Geoffrey Blake	8e79c68936	cpu: Fix rename mis-handling serializing instructions when resource constrained The rename can mis-handle serializing instructions (i.e. strex) if it gets into a resource constrained situation and the serializing instruction has to be placed on the skid buffer to handle blocking. In this situation the instruction informs the pipeline it is serializing and logs that the next instruction must be serialized, but since we are blocking the pipeline defers this action to place the serializing instruction and incoming instructions into the skid buffer. When resuming from blocking, rename will pull the serializing instruction from the skid buffer and the current logic will see this as the "next" instruction that has to be serialized and because of flags set on the serializing instruction, it passes through the pipeline stage as normal and resets rename to non-serializing. This causes instructions to follow the serializing inst incorrectly and eventually leads to an error in the pipeline. To fix this rename should check first if it has to block before checking for serializing instructions.	2013-02-15 17:40:10 -05:00
Chris Emmons	27630e9cad	ARM: Postpones creation of framebuffer output file until it is actually used. This delay prevents a potential conflict with the HDLCD if both are in the same system even if only one is enabled.	2013-02-15 17:40:10 -05:00
Andreas Hansson	f6550b3d20	mem: Tighten up cache constness and scoping This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected.	2013-02-15 17:40:10 -05:00
Sascha Bischoff	2f3b322280	base: Add warn() and inform() to m5.util for use from python This patch adds two functions to m5.util, warn and inform, which mirror those found in the C++ side of gem5. These are added in addition to the already existing m5.util.panic and m5.util.fatal which already mirror the C++ functionality. This ensures that warning and information messages generated by python are in the same format as those generated by C++. Occurrences of print "Warning: %s..." % name have been replaced with warn("%s...", name)	2013-02-15 17:40:10 -05:00
Matt Horsnell	e88e7d88b9	o3: fix tick used for renaming and issue with range selection Fixes the tick used from rename: - previously this gathered the tick on leaving rename which was always 1 less than the dispatch. This conflated the decode ticks when back pressure built in the pipeline. - now picks up tick on entry. Added --store_completions flag: - will additionally display the store completion tail in the viewer. - this highlights periods when large numbers of stores are outstanding (>16 LSQ blocking) Allows selection by tick range (previously this caused an infinite loop)	2013-02-15 17:40:09 -05:00
Andreas Sandberg	6459908069	arm: Don't export private GIC methods	2012-10-25 14:08:29 +01:00
Andreas Sandberg	81be8b9d15	arm: Create a GIC base class and make the PL390 derive from it This patch moves the GIC interface to a separate base class and makes all interrupt devices use that base class instead of a pointer to the PL390 implementation. This allows us to have multiple GIC implementations. Future implementations will allow in-kernel GIC implementations when using hardware virtualization. --HG-- rename : src/dev/arm/gic.cc => src/dev/arm/gic_pl390.cc rename : src/dev/arm/gic.hh => src/dev/arm/gic_pl390.hh	2012-10-25 14:05:24 +01:00
Andreas Sandberg	b904bd5437	sim: Add a system-global option to bypass caches Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virtualized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.	2013-02-15 17:40:09 -05:00
Andreas Sandberg	1eec115c31	cpu: Refactor memory system checks CPUs need to test that the memory system is in the right mode in two places, when the CPU is initialized (unless it's switched out) and on a drainResume(). This led to some code duplication in the CPU models. This changeset introduces the verifyMemoryMode() method which is called by BaseCPU::init() if the CPU isn't switched out. The individual CPU models are responsible for calling this method when resuming from a drain as this code is CPU model specific.	2013-02-15 17:40:08 -05:00
Andreas Sandberg	e5dca84c3f	config: Move CPU handover logic to m5.switchCpus() CPU switching consists of the following steps: 1. Drain the system 2. Switch out old CPUs (cpu.switchOut()) 3. Change the system timing mode to the mode the new CPUs require 4. Flush caches if switching to hardware virtualization 5. Inform new CPUs of the handover (cpu.takeOverFrom()) 6. Resume the system m5.switchCpus() previously only did step 2 & 5. Since information about the new processors' memory system requirements is now exposed, do all of the steps above. This patch adds automatic memory system switching and flush (if needed) to switchCpus(). Additionally, it adds optional draining to switchCpus(). This has the following implications: * changeToTiming and changeToAtomic are no longer needed, so they have been removed. * changeMemoryMode is only used internally, so it has been renamed to be private. * switchCpus requires a reference to the system containing the CPUs as its first parameter. WARNING: This changeset breaks compatibility with existing configuration scripts since it changes the signature of m5.switchCpus().	2013-02-15 17:40:08 -05:00
Andreas Sandberg	7f1263f144	cpu: Make checker CPUs inherit from CheckerCPU in the Python hierarchy Checker CPUs currently don't inherit from the CheckerCPU in the Python object hierarchy. This has two consequences: * It makes CPU model discovery from the Python world somewhat complicated as there is no way of testing if a CPU is a checker. * Parameters are duplicated in the checker configuration specification. This changeset makes all checker CPUs inherit from the base checker CPU class.	2013-02-15 17:40:08 -05:00
Andreas Sandberg	7cd1fd4324	cpu: Add CPU metadata on the Python classes The configuration scripts currently hard-code the requirements of each CPU. This is clearly not optimal as it makes writing new configuration scripts painful and adding new CPU models requires existing scripts to be updated. This patch adds the following class methods to the base CPU and all relevant CPUs: * memory_mode -- Return a string describing the current memory mode (invalid/atomic/timing). * require_caches -- Does the CPU model require caches? * support_take_over -- Does the CPU support CPU handover?	2013-02-15 17:40:08 -05:00
Ali Saidi	db5c478e70	arm: fix some fp comparisons that worked by accident. The explicit tests in the following fp comparison operations were incorrect as they checked for only signaling NaNs and not quiet NaNs as well. When compiled with gcc, the comparison generates a fp exception that causes the FE_INVALID flag to be set and we check for it, so even though the check was incorrect, the correct exception was set. With clang this behavior seems to not occur. The checks are updated to test for NaNs and the behavior is now correct with both clang and gcc.	2013-02-15 17:40:08 -05:00
Ali Saidi	4412046041	cpu: include set in o3/commit_impl. While the majority of compilers seemed to pick up set from elsewhere, one version of gcc 4.7 complains, so explicitly add it.	2013-02-15 17:40:08 -05:00
Ali Saidi	68495a0748	ARM: Fix an issue with clang generating wrong code. Clang generated executables would enter the if condition when it wasn't supposed to, resulting in the wrong simulated behavior. Implementing the operation this way is a bit faster anyway.	2013-02-15 17:40:08 -05:00
Ali Saidi	7ae06a3b3b	cpu: fix case with o3 cpu blocking and unblocking decode in cycle Fix a case in the O3 CPU where the decode stage blocks and unblocks in a single cycle sending both signals to fetch which causes an assert or worse. The previous check could never work before since the status was set to Blocked before a test for the status being Unblocking was executed.	2013-02-15 17:40:08 -05:00
Ali Saidi	b84bd3028c	cpu: Fix a livelock in the o3 cpu. Check if an instruction just enabled interrupts and we've previously had an interrupt pending that was not handled because interrupts were subsequently disabled before the pipeline reached a place to handle the interrupt. In that case squash now to make sure the interrupt is handled.	2013-02-15 17:40:07 -05:00
Andreas Sandberg	d4eca0591d	base: Add support for newer versions of IPython IPython is used for the interactive gem5 shell if it exists. IPython made API changes in version 0.11. This patch adds support for IPython version 0.11 and above. --HG-- extra : rebase_source : 5388d0919adb58d97f49a1a637db48cba61283a3	2013-02-10 13:23:58 +01:00
Andreas Hansson	7c6bc52bf5	Ruby: Fix compilation errors on gcc 4.7 and clang 3.2 This patch fixes a few (recently added) errors that prevented gem5 from compiling on more recent versions of gcc and clang.	2013-02-14 12:24:51 -05:00
Nilay Vaish	71c27e6370	ruby: MI protocol: add a missing transition The transition for state MII and event Store was found missing during testing. The transition is being added. The controller will not stall the Store request in state MII	2013-02-10 21:43:18 -06:00
Nilay Vaish	cb7782f78d	ruby: enable multiple clock domains This patch allows ruby to have multiple clock domains. As I understand with this patch, controllers can have different frequencies. The entire network needs to run at a single frequency. The idea is that within an object, time is treated in terms of cycles. But the messages that are passed from one entity to another should contain the time in Ticks. As of now, this is only true for the message buffers, but not for the links in the network. As I understand the code, all the entities in different networks (simple, garnet-fixed, garnet-flexible) should be clocked at the same frequency. Another problem is that the directory controller has to operate at the same frequency as the ruby system. This is because the memory controller does not make use of the Message Buffer, and instead implements a buffer of its own. So, it has no idea of the frequency at which the directory controller is operating and uses ruby system's frequency for scheduling events.	2013-02-10 21:43:17 -06:00
Nilay Vaish	253e8edf13	ruby: replace Time with Cycles (final patch in the series) This patch is as of now the final patch in the series of patches that replace Time with Cycles. This patch further replaces Time with Cycles in Sequencer, Profiler, different protocols and related entities. Though Time has not been completely removed, the places where it is in use seem benign as of now.	2013-02-10 21:43:10 -06:00
Nilay Vaish	f6e3ab7bd4	ruby: replace Time with Cycles in garnet fixed and flexible	2013-02-10 21:43:09 -06:00
Nilay Vaish	9d6d6c6718	ruby: replace Time with Tick in replacement policy classes	2013-02-10 21:43:08 -06:00
Nilay Vaish	221d39284e	ruby: convert block size, memory size to unsigned	2013-02-10 21:43:07 -06:00
Nilay Vaish	5e33045a2a	ruby: replace Time with Cycles in MessageBuffer	2013-02-10 21:26:26 -06:00
Nilay Vaish	b742081cc1	ruby: replace Time with Cycles in Memory Controller	2013-02-10 21:26:25 -06:00
Nilay Vaish	89f86dbd28	ruby: Replace Time with Cycles in SequencerMessage	2013-02-10 21:26:25 -06:00
Nilay Vaish	7862478eef	ruby: replace Time with Cycles in Message class Concomitant changes are being committed as well, including the io operator<< for the Cycles class.	2013-02-10 21:26:24 -06:00
Nilay Vaish	d3aebe1f91	ruby: replaces Time with Cycles in many places The patch started off with replacing Time with Cycles in the Consumer class. But to get ruby to compile, the rest of the changes had to be carried out. Subsequent patches will further this process, till we completely replace Time with Cycles.	2013-02-10 21:26:24 -06:00
Nilay Vaish	affd77ea77	base: add some mathematical operators to Cycles class	2013-02-10 21:26:23 -06:00
Nilay Vaish	bc1daae7fd	ruby: modifies histogram add() function This patch modifies the Histogram class' add() function so that it can add linear histograms as well. The function assumes that the left end point of the ranges of the two histograms are the same. It also assumes that when the ranges of the two histograms are changed to accommodate an element not in the range, the factor used in changing the range is same for both the histograms. This function is then used in removing one of the calls to the global profiler*. The histograms for recording the delays incurred in processing different requests are now maintained by the controllers. The profiler adds these histograms when it needs to print the stats.	2013-02-10 21:26:22 -06:00
Nilay Vaish	a49b1df3f0	ruby: record fully busy cycle within the controller This patch does several things. First, the counter for fully busy cycles for a controller is now kept within the controller, instead of being part of the profiler. Second, the topology class no longer keeps an array of controllers which was only used for printing stats. Instead, ruby system will now ask each controller to print the stats. Thirdly, the statistical variable for recording how many different types were created is being moved into the controller from the profiler. Note that for printing, the profiler will collate results from different controllers.	2013-02-10 21:26:22 -06:00
Andreas Sandberg	10f1f8c6a4	base: Fix broken IPython argument handling Prior to this changeset, we used to clear sys.argv before entering the IPython shell. This caused some versions of IPython to crash because they assume argv[0] to exist. The correct way of overriding the arguments passed to IPython is to set the argv keyword argument when initializing the shell.	2013-02-10 13:23:56 +01:00
Nilay Vaish	87ea04ab2f	sim: remove unused struct priority_compare	2013-01-31 21:26:29 -06:00
Nilay Vaish	6aed4d4f93	ruby: correct computation of number of bits required for address The number of bits required for an address was set to floorLog2(memory size). This is correct under the assumption that the memory size is a power of 2, which is not always true. Hence, floorLog2 is being replaced with ceilLog2.	2013-01-31 09:44:20 -06:00
Andreas Hansson	a4288dabf9	mem: Add comments for the DRAM address decoding This patch adds more verbose comments to explain the two different address mapping schemes of the DRAM controller.	2013-01-31 07:49:18 -05:00
Andreas Hansson	c4898b15bc	mem: Add DDR3 and LPDDR2 DRAM controller configurations This patch moves the default DRAM parameters from the SimpleDRAM class to two different subclasses, one for DDR3 and one for LPDDR2. More can be added as we go forward. The regressions that previously used the SimpleDRAM are now using SimpleDDR3 as this is the most similar configuration.	2013-01-31 07:49:14 -05:00
Ani Udipi	eaa37e611f	mem: Add tTAW and tFAW to the SimpleDRAM model This patch adds two additional scheduling constraints to the DRAM controller model, to constrain the activation rate. The two metrics determine the size of the activation window in terms of the number of activates and the minimum time required for that number of activates. This maps to current DDRx, LPDDRx and WIOx standards that have either tFAW (4 activate window) or tTAW (2 activate window) scheduling constraints.	2013-01-31 07:49:14 -05:00
Andreas Hansson	b7153e2a64	mem: Separate out the different cases for DRAM bus busy time This patch changes how the data bus busy time is calculated such that it is delayed to the actual scheduling time of the request as opposed to being done as soon as possible. This patch changes a bunch of statistics, and the stats update is bundled together with the introduction of tFAW/tTAW and the named DRAM configurations like DDR3 and LPDDR2.	2013-01-31 07:49:13 -05:00
Anthony Gutierrez	af0f8b31db	cache: remove drainManager because it's not used the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.	2013-01-28 20:19:42 -05:00
Nilay Vaish	a8eb5b18e0	ruby: remove get_time() This patch replaces get_time() in *.sm files with curCycle() which is now possible since controllers are clocked objects.	2013-01-28 06:14:18 -06:00
Nilay Vaish	31659e83fb	ruby: remove call to curCycle in panic() The panic() function already prints the current tick value. This call to curCycle() is as such redundant. Since we are trying to move towards multiple clock domains, this call will print misleading time.	2013-01-28 06:11:42 -06:00
Nilay Vaish, Timothy Jones <timothy.jones@cl.cam.ac.uk>	dbeabedaf0	branch predictor: move out of o3 and inorder cpus This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh	2013-01-24 12:28:51 -06:00
Andrea Pellegrini	11d5ffa108	o3 cpu: fix zero reg problem There was an issue w/ the rename logic, which would assign a previous physical register to the ZeroReg architectural register in x86. This issue was giving problems for instructions squashed in threads w/ ID different from 0, sometimes allowing non-mispredicted instructions to obtain a value different from zero when reading the zeroReg.	2013-01-22 00:13:28 -06:00
Nilay Vaish	fc57ae6401	x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switch The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.	2013-01-22 00:10:10 -06:00
Joel Hestness	1429d21244	O3 IEW: Make incrWb and decrWb clearer Move the increment/decrement of wbOutstanding outside of the comparison in incrWb and decrWb in the IEW. This also fixes a compiler bug with gcc 4.4.7, which incorrectly optimizes "-- ==" as "-=".	2013-01-19 15:14:54 -06:00
Nilay Vaish	5b6f972750	ruby: remove calls to g_system_ptr->getTime() This patch further removes calls to g_system_ptr->getTime() wherever other clocked objects are available for providing current time.	2013-01-17 13:10:12 -06:00
Nilay Vaish	f2bcf4f01c	x86 cpuid: enable clflush Note that clflush is only being enabled. It is not actually implemented. A warning is printed if the cpu encounters a clflush instruction. We need to enable this instruction in cpuid since JRE 1.7 tests for it.	2013-01-15 07:43:21 -06:00
Nilay Vaish	ac9bb51405	x86: implements fsin, fcos instructions	2013-01-15 07:43:21 -06:00
Nilay Vaish	7f5463539b	x86: implements emms instruction	2013-01-15 07:43:20 -06:00
Nilay Vaish	91b00d98a5	x86: implement fabs, fchs instructions	2013-01-15 07:43:19 -06:00
Malek Musleh	1abf950f3c	ruby sequencer: converts cycles to ticks in deadlock panic() This patch converts the panic() print outs in the Sequencer::wakeup() call from ruby cycles to Ticks(). This makes it easier to debug deadlocks with the ProtocolTrace flag so the issue time indicated in the panic message can be quickly searched for. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-14 10:05:12 -06:00
Nilay Vaish	2012983718	Ruby: remove reference to g_system_ptr from class Message This patch was initiated so as to remove reference to g_system_ptr, the pointer to Ruby System that is used for getting the current time. That simple change actually requires changing many things in slicc and garnet. All these changes are related to how time is handled. In most of the places, g_system_ptr has been replaced by another clock object. The changes have been done under the assumption that all the components in the memory system are on the same clock frequency, but the actual clocks might be distributed.	2013-01-14 10:05:10 -06:00
Nilay Vaish	cf232de461	Ruby: use ClockedObject in Consumer class Many Ruby structures inherit from the Consumer, which is used for scheduling events. The Consumer used to rely on an Event Manager for scheduling events and on g_system_ptr for time. With this patch, the Consumer will now use a ClockedObject to schedule events and to query for current time. This resulted in several structures being converted from SimObjects to ClockedObjects. Also, the MessageBuffer class now requires a pointer to a ClockedObject so as to query for time.	2013-01-14 10:04:21 -06:00
Andreas Hansson	cbbc4c7f6b	scons: Address clang 3.2 compilation error This patch fixes a compilation error encountered using clang 3.2 on OSX.	2013-01-14 10:23:56 -05:00
Nilay Vaish	f7c0ba406e	base simple cpu: removes commented out code about cache ops	2013-01-12 22:11:16 -06:00
Nilay Vaish	25ec278a0b	x86: Changes to decoder, corrects 9376 The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.	2013-01-12 22:09:48 -06:00
Lluís Vilanova	807168a1de	util: add m5_fail op. Used as a command in full-system scripts, it helps the user ensure the benchmarks have finished successfully. For example, one can use: /path/to/benchmark args || /sbin/m5 fail 1 and thus ensure gem5 will exit with an error if the benchmark fails.	2013-01-08 08:54:12 -05:00
Tao Zhang	858d99b7cc	sim: Fix early termination in multi-core simulation under SE mode. When "-I" (maximum instruction number) and "-F" (fastforward instruction number) are applied together, gem5 immediately exits after the cpu switching. The reason is that multiple exit events may be generated in the same cycle by Atomic CPU and inserted to mainEventQueue. However, mainEventQueue can only serve one exit event in one cycle. Therefore, the remaining exit events are left in mainEventQueue without being descheduled or deleted, which causes gem5 to exit immediately after the system resumes by cpu switching.	2013-01-08 08:54:11 -05:00
Mitch Hayenga	4a752b1655	arm: add access syscall for ARM SE mode This patch adds the "access" syscall for ARM SE as required by some spec2006 benchmarks.	2013-01-08 08:54:07 -05:00
Mitch Hayenga	c7dbd5e768	mem: Make LL/SC locks fine grained The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range.	2013-01-08 08:54:07 -05:00
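The range check described in c7dbd5e768 can be illustrated with a minimal Python sketch. It is not the gem5 implementation (which is C++); the `LockRecord` class, its field names, and `surviving_locks` are hypothetical names chosen for illustration. The only idea taken from the commit message is that a store invalidates a lock only when the store's bytes overlap the locked address range.

```python
from dataclasses import dataclass


@dataclass
class LockRecord:
    """Hypothetical per-lock record: who holds it and which bytes it covers."""
    context_id: int
    start: int  # first byte covered by the load-linked access
    size: int   # number of bytes covered

    def overlaps(self, addr: int, size: int) -> bool:
        # Two half-open byte ranges overlap if each starts before the other ends.
        return addr < self.start + self.size and self.start < addr + size


def surviving_locks(locks, store_addr, store_size):
    """Return the locks that a store to [store_addr, store_addr + store_size) leaves intact."""
    return [lock for lock in locks if not lock.overlaps(store_addr, store_size)]


# Two locks in the same (assumed 64-byte) cache line.
locks = [LockRecord(0, 0x1000, 4), LockRecord(1, 0x1020, 8)]

# A store to a different word of the line leaves both locks alone.
print(len(surviving_locks(locks, 0x1004, 4)))
# A store that touches bytes 0x1002..0x1005 clears only the first lock.
print(surviving_locks(locks, 0x1002, 4))
```

With a per-cacheline list and no range, both stores above would have cleared both locks, which is the spurious LL/SC failure the commit message describes.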
Mitch Hayenga	dc4a0aa2fa	mem: Fix use-after-free bug Running with valgrind I noticed a use after free originating from simple_mem.cc. It looks like this is a known issue and this additional call site was missed in an earlier patch.	2013-01-08 08:54:06 -05:00
Andreas Sandberg	8480615d8d	dev: Fix infinite recursion in DMA devices The DMA device sometimes calls the process() method on a completion event directly instead of scheduling it on the current tick. This breaks some devices that assume that the completion handler won't be called until the current event handler has returned. Specifically, it causes infinite recursion in the IdeDisk component because it does not advance its chunk generator until after a dmaRead()/dmaWrite() has returned. This changeset removes this micro-optimization and schedules the event in the current tick instead. This way the semantics of event handling stay the same even when the delay is 0.	2013-01-07 16:56:39 -05:00
Sascha Bischoff	8a767885d6	stats: Fix swig wrapping for Tick in stats Tick was not correctly wrapped for the stats system, and therefore it was not possible to configure the stats dumping from the python scripts without defining Ticks as long long. This patch fixes the wrapping of Tick by copying the typemap of uint64_t to Tick.	2013-01-07 16:56:36 -05:00
Andreas Sandberg	009970f59b	cpu: Unify the serialization code for all of the CPU models Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the CPU-specific thread state class uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models.	2013-01-07 13:05:52 -05:00
Andreas Sandberg	e09e9fa279	cpu: Flush TLBs on switchOut() This changeset inserts a TLB flush in BaseCPU::switchOut to prevent stale translations when doing repeated switching. Additionally, the TLB flushing functionality is exported to the Python to make debugging of switching/checkpointing easier. A simulation script will typically use the TLB flushing functionality to generate a reference trace. The following sequence can be used to simulate a handover (this depends on how drain is implemented, but is generally the case) between identically configured CPU models: m5.drain(test_sys) [ cpu.flushTLBs() for cpu in test_sys.cpu ] m5.resume(test_sys) The generated trace should normally be identical to a trace generated when switching between identically configured CPU models or checkpointing and resuming.	2013-01-07 13:05:48 -05:00
Andreas Sandberg	964aa49d15	mem: Fix guest corruption when caches handle uncacheable accesses When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS. Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses.	2013-01-07 13:05:47 -05:00
Andreas Sandberg	1814a85a05	cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	9e8003148f	cpu: Make sure that a drained atomic CPU isn't executing ucode Currently, the atomic CPU can be in the middle of a microcode sequence when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * Since curMacroStaticInst is populated when executing microcode, repeated switching between CPUs executing microcode leads to incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset fixes a bug where multiple switches to the same atomic CPU sometimes corrupt the target state because of dangling pointers to the currently executing microinstruction. Note: This changeset moves tick event descheduling from switchOut() to drain(), which makes timing consistent between just draining a system and draining /and/ switching between two atomic CPUs. This makes debugging quite a lot easier (execution traces get the same timing), but the latency of the last instruction before a drain will not be accounted for correctly (it will always be 1 cycle). Note 2: This changeset removes the so_state variable, the locked variable, and the tickEvent from checkpoints since none of them contain state that needs to be preserved across checkpoints. The so_state is made redundant because we don't use the drain state variable anymore, the lock variable should never be set when the system is drained, and the tick event isn't scheduled.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	f9bcf46371	cpu: Make sure that a drained timing CPU isn't executing ucode Currently, the timing CPU can be in the middle of a microcode sequence or multicycle (stayAtPC is true) instruction when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * If stayAtPC is true we might execute half of an instruction twice when restoring a checkpoint or switching CPUs, which leads to an incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset also fixes a bug where the timing CPU sometimes switches out while stayAtPC is true, which corrupts the target state after a CPU switch or checkpoint. Note: This changeset removes the so_state variable from checkpoints since the drain state isn't used anymore.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	52ff37caa3	cpu: Fix broken thread context handover The thread context handover code used to break when multiple handovers were performed during the same quiesce period. Previously, the thread contexts would assign the TC pointer in the old quiesce event to the new TC. This obviously broke in cases where multiple switches were performed within the same quiesce period, in which case the TC pointer in the quiesce event would point to an old CPU. The new implementation deschedules pending quiesce events in the old TC and schedules a new quiesce event in the new TC. The code has been refactored to remove most of the code duplication.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	fca4fea769	cpu: Fix O3 LSQ debug dumping constness and formatting	2013-01-07 13:05:46 -05:00
Andreas Sandberg	fb52ea9220	arm: Invalidate cached TLB configuration in drainResume Currently, we invalidate the cached miscregs in TLB::unserialize(). The intended use of the drainResume() method is to invalidate cached state and prepare the system to resume after a CPU handover or (un)serialization. This patch moves the TLB miscregs invalidation code to the drainResume() method to avoid surprising behavior.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	0d59549cd9	arm: Fix draining of the pagetable walker when squashing Since the page table walker only checks if a drain has completed in doL1DescriptorWrapper() and doL2DescriptorWrapper(), it sometimes loses track of a drain request if there is a squash. This changeset adds a completeDrain() call after squashing requests in the pending queue, which fixes this issue.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	8db27aa230	cpu: Fix broken squashAfter implementation in O3 CPU Commit can currently both commit and squash in the same cycle. This confuses other stages since the signals coming from the commit stage can only signal either a squash or a commit in a cycle. This changeset changes the behavior of squashAfter so that it commits all instructions, including the instruction that requested the squash, in the first cycle and then starts to squash in the next cycle.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	a2077ccf02	o3 cpu: Remove unused variables	2013-01-07 13:05:45 -05:00
Andreas Sandberg	1c3a1888d8	sim: Remove unused variables	2013-01-07 13:05:45 -05:00
Andreas Sandberg	2cfe62adc4	cpu: Rename defer_registration->switched_out The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	f7da0fddd1	cpu: Remove unused params.hh header file in inorder CPU	2013-01-07 13:05:45 -05:00
Andreas Sandberg	38925ff621	arm: Remove the register mapping hack used when copying TCs In order to see all registers independent of the current CPU mode, the ARM architecture model uses the magic MISCREG_CPSR_MODE register to change the register mappings without actually updating the CPU mode. This hack is no longer needed since the thread context now provides a flat interface to the register file. This patch replaces the CPSR_MODE hack with the flat register interface.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	a7e0cbeb36	cpu: Introduce sanity checks when switching between CPUs This patch introduces the following sanity checks when switching between CPUs: * Check that the set of new and old CPUs do not overlap. Having an overlap between the set of new CPUs and the set of old CPUs is currently not supported. Doing such a switch used to result in the following assertion error: BaseCPU::takeOverFrom(BaseCPU): \ Assertion `!new_itb_port->isConnected()' failed. * Check that all new CPUs are in the switched out state. * Check that all old CPUs are in the switched in state.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	901258c22b	cpu: Correctly call parent on switchOut() and takeOverFrom() This patch cleans up the CPU switching functionality by making sure that CPU models consistently call the parent on switchOut() and takeOverFrom(). This has the following implications that might alter current functionality: * The call to BaseCPU::switchout() in the O3 CPU is moved from signalDrained() (!) to switchOut(). * A call to BaseSimpleCPU::switchOut() is introduced in the simple CPUs.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	4ae02295d5	cpu: Unify SimpleCPU and O3 CPU serialization code The O3 CPU used to copy its thread context to a SimpleThread in order to do serialization. This was a bit of a hack involving two static SimpleThread instances and a magic constructor that was only used by the O3 CPU. This patch moves the ThreadContext serialization code into two global procedures that, in addition to the normal serialization parameters, take a ThreadContext reference as a parameter. This allows us to reuse the serialization code in all ThreadContext implementations.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	6daada2701	cpu: Initialize the O3 pipeline from startup() The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	e2dad8236a	cpu: Implement a flat register interface in thread contexts Some architectures map registers differently depending on their mode of operations. There is currently no architecture independent way of accessing all registers. This patch introduces a flat register interface to the ThreadContext class. This interface is useful, for example, when serializing or copying thread contexts.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	17b47d35e1	arch: Move the ISA object to a separate section After making the ISA an independent SimObject, it is serialized automatically by the Python world. Previously, this just resulted in an empty ISA section. This patch moves the contents of the ISA to that section and removes the explicit ISA serialization from the thread contexts, which makes it behave like a normal SimObject during serialization. Note: This patch breaks checkpoint backwards compatibility! Use the cpt_upgrader.py utility to upgrade old checkpoints to the new format.	2013-01-07 13:05:42 -05:00
Andreas Sandberg	7eb0fb8b6e	cpu: Check that the memory system is in the correct mode This patch adds checks to all CPU models to make sure that the memory system is in the correct mode at startup and when resuming after a drain. Previously, we only checked that the memory system was in the right mode when resuming. This is inadequate since this is a configuration error that should be detected at startup as well as when resuming. Additionally, since the check was done using an assert, it wasn't performed when NDEBUG was set (e.g., the fast target).	2013-01-07 13:05:41 -05:00
Andreas Sandberg	94561dd526	arch: Add support for invalidating TLBs when draining This patch adds support for the memInvalidate() drain method. TLB flushing is requested by calling the virtual flushAll() method on the TLB. Note: This patch renames invalidateAll() to flushAll() on x86 and SPARC to make the interface consistent across all supported architectures.	2013-01-07 13:05:40 -05:00
Andreas Sandberg	d44f2f611f	mem: Remove the IIC replacement policy The IIC replacement policy seems to be unused and has probably gathered too much bit rot to be useful. This patch removes the IIC and its associated cache parameters.	2013-01-07 13:05:39 -05:00
Andreas Hansson	9364d35b8b	dev: Do not serialize timer parameters This patch removes the intNum and clock from the serialized scalars as these are set by the Python parameters and should not be part of the checkpoint.	2013-01-07 13:05:39 -05:00
Andreas Hansson	406891c62a	scons: Enforce gcc >= 4.4 or clang >= 2.9 and c++0x support This patch checks that the compiler in use is either gcc >= 4.4 or clang >= 2.9, and enables building with --std=c++0x in all cases. As a consequence, we can tidy up the hashmap and always have static_assert available. If anyone wants to use alternative compilers, icc for example supports c++0x to a similar level and could be added if needed. This patch opens up for a more elaborate use of c++0x features that are present in gcc 4.4 and clang 2.9, e.g. auto typed variables, variadic templates, rvalues and move semantics, and strongly typed enums. There will be no going back on this one...	2013-01-07 13:05:39 -05:00
Andreas Hansson	221302335b	scons: Remove stale compiler options This patch simply prunes the SUNCC and ICC compiler options as they are both sufficiently stale that they would have to be re-written from scratch anyhow. The patch serves to clean things up before shifting to a build environment that enforces basic c++11 compliance as done in the following patch.	2013-01-07 13:05:39 -05:00
Andreas Hansson	921490a060	sim: Fatal if a clocked object is set to have a clock of 0 This patch adds a check to the clocked object constructor to ensure it is not configured to have a clock period of 0.	2013-01-07 13:05:39 -05:00
Andreas Hansson	490dc30d96	dev: Make the ethernet devices use a non-zero clock This patch changes the NS gige controller to have a non-zero clock, and sets the default to 500 MHz. The blocks that could previously be bypassed with a zero clock are now always present, and the user is left with the option of setting a very high clock frequency to achieve a similar performance.	2013-01-07 13:05:39 -05:00
Chander Sudanthi	694a81e994	ARM: pl111/LCD framebuffer checkpointing fix Fixed check pointing of the framebuffer. Previously, the pixel size was not considered in determining the size of the buffer to checkpoint. This patch checkpoints the entire framebuffer instead of the first quarter.	2013-01-07 13:05:39 -05:00
Andreas Sandberg	c3551e82f7	arch: Fix broken M5VarArgsFault initialization At least gcc 4.4.3 seems to get confused by the use of func both as a template parameter and a member variable in the M5VarArgsFault class. This causes the value of the member variable func to be unpredictable in M5VarArgsFault objects. This changeset renames the template parameter to remove this ambiguity.	2013-01-07 13:05:38 -05:00
Andreas Hansson	18b147acef	mem: Merge ranges that are part of the conf table This patch adds basic merging of address ranges when determining which address ranges should be reported in the configuration table. By performing this merging it is possible to distribute an address range across many memory channels (controllers). This is essential to enable address interleaving.	2013-01-07 13:05:38 -05:00
Andreas Hansson	b8c2fa6ba9	base: Add support for merging of interleaved address ranges This patch adds support for merging a vector of interleaved address ranges into a contiguous range. The functionality will be used in the interconnect and the PhysicalMemory to transform interleaved memory ranges to contiguous ranges before passing them on. The actual use of the merging is appearing in future patches.	2013-01-07 13:05:38 -05:00
Andreas Hansson	01c5598373	mem: Add interleaving bits to the address ranges This patch adds support for interleaving bits for the address ranges. What was previously just a start and end address, now has an additional three fields, for the high bit, and number of bits to use for interleaving, and a match value to compare against. If the number of interleaving bits is set to zero it is effectively disabled. A number of convenience functions are added to the range to enquire about the interleaving, its granularity and the number of stripes it is part of.	2013-01-07 13:05:38 -05:00
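A minimal Python sketch of the interleaved range described in 01c5598373 follows. It is an illustration under stated assumptions, not the gem5 `AddrRange` class: the class and field names are hypothetical, and the bit layout (the interleaving field occupies bits `intlv_high_bit` down to `intlv_high_bit - intlv_bits + 1`) is an assumption, since the commit message does not spell it out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterleavedRange:
    """Hypothetical address range with optional interleaving."""
    start: int
    end: int                 # inclusive end address
    intlv_high_bit: int = 0  # highest address bit used for interleaving
    intlv_bits: int = 0      # number of interleaving bits; 0 disables interleaving
    intlv_match: int = 0     # value the interleaving bits must equal

    def interleaved(self) -> bool:
        return self.intlv_bits != 0

    def stripes(self) -> int:
        # Number of ranges that together cover the whole address span.
        return 1 << self.intlv_bits

    def granularity(self) -> int:
        # Assumed layout: size of one contiguous chunk owned by a stripe.
        return 1 << (self.intlv_high_bit - self.intlv_bits + 1)

    def contains(self, addr: int) -> bool:
        if not (self.start <= addr <= self.end):
            return False
        if not self.interleaved():
            return True
        low_bit = self.intlv_high_bit - self.intlv_bits + 1
        field = (addr >> low_bit) & (self.stripes() - 1)
        return field == self.intlv_match


# Two channels sharing 0x0000..0xFFFF, interleaved on address bit 6 (64-byte chunks).
ch0 = InterleavedRange(0x0000, 0xFFFF, intlv_high_bit=6, intlv_bits=1, intlv_match=0)
ch1 = InterleavedRange(0x0000, 0xFFFF, intlv_high_bit=6, intlv_bits=1, intlv_match=1)

print(ch0.contains(0x00), ch1.contains(0x40), ch0.contains(0x80))
```

In this sketch every address in the span belongs to exactly one channel, which is the property that lets several memory controllers share one address range.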
Andreas Hansson	e6c57786a4	config: Traverse lists when visiting children in all proxy This patch makes the all proxy traverse any potential list that is encountered in the object hierarchy instead of only looking at children that are SimObjects. An example of where this is useful is when creating a multi-channel memory system as a list of controllers, whilst ensuring that the memories are still visible in the system.	2013-01-07 13:05:38 -05:00
Andreas Hansson	e0d93fde99	base: Simplify the AddrRangeMap by removing unused code This patch cleans up the AddrRangeMap in preparation for the addition of interleaving by removing unused code. The non-const editions of find are never used, and hence the duplication is not needed.	2013-01-07 13:05:38 -05:00
Andreas Hansson	e65de3f5ca	config: Do not use hardcoded physmem in fs script This patch generalises the address range resolution for the I/O cache and I/O bridge such that they do not assume a single memory. The patch involves adding a parameter to the system which is then defined based on the memories that are to be visible from the I/O subsystem, whether behind a cache or a bridge. The change is needed to allow interleaved memory controllers in the system.	2013-01-07 13:05:38 -05:00
Andreas Hansson	15a979c6be	mem: Tidy up bus addr range debug messages This patch tidies up a number of the bus DPRINTFs related to range manipulation. In particular, it shifts the message about range changes to the start of the member function, and also adds information about when all ranges are received.	2013-01-07 13:05:38 -05:00
Andreas Hansson	caf6786ad5	mem: Skip address mapper range checks to allow more flexibility This patch makes the address mapper less stringent about checking the before and after ranges, i.e. the original and remapped ranges. The checks were not really necessary, and there are situations when the previous checks were too strict.	2013-01-07 13:05:38 -05:00
Andreas Hansson	71da1d2157	base: Encapsulate the underlying fields in AddrRange This patch makes the start and end address private in a move to prevent direct manipulation and matching of ranges based on these fields. This is done so that a transition to ranges with interleaving support is possible. As a result of hiding the start and end, a number of member functions are needed to perform the comparisons and manipulations that previously took place directly on the members. An accessor function is provided for the start address, and a function is added to test if an address is within a range. As a result of the latter the != and == operator is also removed in favour of the member function. A member function that returns a string representation is also created to allow debug printing. In general, this patch does not add any functionality, but it does take us closer to a situation where interleaving (and more cleverness) can be added under the bonnet without exposing it to the user. More on that in a later patch.	2013-01-07 13:05:38 -05:00
Andreas Hansson	cfdaf53104	mem: Remove the joining of neighbouring ranges This patch temporarily removes the joining of ranges when creating the backing store, to reserve this functionality for the interleaved ranges that are about to be introduced. When creating the mmaps for the backing store, there is no point in creating larger contiguous chunks than what is necessary. The larger chunks will only make life more difficult for the host. Merging will be re-added later, but then only for interleaved ranges.	2013-01-07 13:05:38 -05:00
Andreas Hansson	ccb6c64047	cpu: Share the send functionality between traffic generators This patch moves the packet creating and sending to a member function in the shared base class to avoid code duplication.	2013-01-07 13:05:37 -05:00
Andreas Hansson	1da209140c	cpu: Add support for protobuf input for the trace generator This patch adds support for reading input traces encoded using protobuf according to what is done in the CommMonitor. A follow-up patch adds a Python script that can be used to convert the previously used ASCII traces to protobuf equivalents. The appropriate regression input is updated as part of this patch.	2013-01-07 13:05:37 -05:00
Andreas Hansson	35bdee72cb	cpu: Encapsulate traffic generator input in a stream This patch encapsulates the traffic generator input in a stream class such that the parsing is not visible to the trace generator. The change takes us one step closer to using protobuf-based input traces for the trace replay. The functionality of the current input stream is identical to what it was, and the ASCII format remains the same for now.	2013-01-07 13:05:37 -05:00
Andreas Hansson	4afa6c4c3e	base: Add wrapped protobuf input stream This patch adds support for inputting protobuf messages through a ProtoInputStream which hides the internal streams used by the library. The stream is created based on the name of an input file and optionally includes decompression using gzip. The input stream will start by getting a magic number from the file, and also verify that it matches with the expected value. Once opened, messages can be read incrementally from the stream, returning true/false until an error occurs or the end of the file is reached.	2013-01-07 13:05:37 -05:00
Andreas Hansson	f456c7983d	mem: Add tracing support in the communication monitor This patch adds packet tracing to the communication monitor using a protobuf as the mechanism for creating the trace. If no file is specified, then the tracing is disabled. If a file is specified, then for every packet that is successfully sent, a protobuf message is serialized to the file.	2013-01-07 13:05:37 -05:00
Andreas Hansson	11ab30fa5a	base: Add wrapped protobuf output streams This patch adds support for outputting protobuf messages through a ProtoOutputStream which hides the internal streams used by the library. The stream is created based on the name of an output file and optionally includes compression using gzip. The output stream will start by putting a magic number in the file, and then for every message that is serialized prepend the size such that the stream can be written and read incrementally. At this point this merely serves as a proof of concept.	2013-01-07 13:05:37 -05:00
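The framing described in 11ab30fa5a and 4afa6c4c3e (a magic number at the start of the file, then each message preceded by its size so the stream can be written and read incrementally) can be sketched in Python without the protobuf library. The `MAGIC` value, the fixed 4-byte little-endian size field, and the function names are assumptions made for this sketch; the real streams wrap the protobuf library's own coded streams and may encode the size differently.

```python
import io
import struct

MAGIC = b"gem5"  # hypothetical magic value, chosen for illustration only


def write_stream(out, messages):
    """Write the magic number, then each message prefixed by its byte length."""
    out.write(MAGIC)
    for payload in messages:
        out.write(struct.pack("<I", len(payload)))  # assumed 4-byte size field
        out.write(payload)


def read_stream(inp):
    """Verify the magic number, then yield messages until end of file."""
    if inp.read(len(MAGIC)) != MAGIC:
        raise ValueError("bad magic number")
    while True:
        header = inp.read(4)
        if len(header) < 4:
            return  # clean end of stream
        (size,) = struct.unpack("<I", header)
        payload = inp.read(size)
        if len(payload) < size:
            raise ValueError("truncated message")
        yield payload


buf = io.BytesIO()
write_stream(buf, [b"first", b"", b"third message"])
buf.seek(0)
print(list(read_stream(buf)))
```

Because each message carries its own size, a reader can stop after any message and a writer can append without rewriting earlier data, which is what makes the format suitable for long traces.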
Andreas Hansson	41f228c2ea	scons: Add support for google protobuf building This patch enables the use of protobuf input files in the build process, thus allowing .proto files to be added to input. Each .proto file is compiled using the protoc tool and the newly created C++ source is added to the list of sources. The first location where the protobufs will be used is in the capturing and replay of memory traces, involving the communication monitor and the trace-generator state of the traffic generator. This will follow in the next patch. This patch does add a dependency on the availability of the BSD licensed protobuf library (and headers), and the protobuf compiler, protoc. These dependencies are checked in the SConstruct, similar to e.g. swig. The user can override the use of protoc from the PATH by specifying the PROTOC environment variable. Although the dependency on libprotobuf and protoc might seem like a big step, they add significant value to the project going forward. Execution traces and other types of traces could easily be added and parsers for C++ and Python are automatically generated. We could also envision using protobufs for the checkpoints, description of the traffic-generator behaviour etc. The sky is the limit. We could also use the GzipOutputStream from the protobuf library instead of the current GPL gzstream. Currently, only the C++ source and header is generated. Going forward we might want to add the Python output to support simple command-line tools for displaying and editing the traces.	2013-01-07 13:05:37 -05:00
Andreas Sandberg	63f1d0516d	arm: Fix DMA event handling bug in the PL111 model The PL111 model currently maintains a list of pre-allocated DmaDoneEvents to prevent unnecessary heap allocations. This list effectively works like a stack where the top element is the latest scheduled event. When an event triggers, the top pointer is moved down the stack. This obviously breaks since events usually retire from the bottom (events don't necessarily have to retire in order), which triggers the following assertion: gem5.debug: build/ARM/dev/arm/pl111.cc:460: void Pl111::fillFifo(): \ Assertion `!dmaDoneEvent[dmaPendingNum-1].scheduled()' failed. This changeset adds a vector listing the currently unused events. This vector acts like a stack where an element is popped off the stack when a new event is needed and pushed on the stack when it triggers.	2013-01-07 13:05:37 -05:00
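The free-list idea in 63f1d0516d can be sketched in a few lines of Python. This is an illustration only: `EventPool` and its methods are hypothetical names, and events are represented by plain integer slot indices rather than gem5 event objects. The point it shows is that tracking the unused slots explicitly allows events to retire in any order.

```python
class EventPool:
    """Hypothetical pool of pre-allocated event slots with an explicit free stack."""

    def __init__(self, size):
        self.free = list(range(size))  # indices of currently unused slots

    def acquire(self):
        # Pop an unused slot when a new event has to be scheduled.
        if not self.free:
            raise RuntimeError("no free event slots")
        return self.free.pop()

    def release(self, slot):
        # Push the slot back when its event triggers, whatever the order.
        assert slot not in self.free, "slot released twice"
        self.free.append(slot)


pool = EventPool(3)
first = pool.acquire()
second = pool.acquire()
pool.release(first)       # the older event retires before the newer one
reused = pool.acquire()   # the freed slot is handed out again
print(first, second, reused)
```

A single "top of stack" counter, as in the original model, would instead have treated `second` as free after the first retirement, which is the situation the assertion in the commit message caught.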
Andreas Hansson	fffdc6a450	dev: Fix the Pl111 timings by separating pixel and DMA clock This patch fixes the Pl111 timings by creating a separate clock for the pixel timings. The device clock is used for all interactions with the memory system, just like the AHB clock on the actual module. The result without this patch is that the module only is allowed to send one request every tick of the 24MHz clock which causes a huge backlog.	2013-01-07 13:05:36 -05:00
Andreas Hansson	f22d3bb9c3	cpu: Fix the traffic gen read percentage This patch fixes the computation that determines whether to perform a read or a write such that the two corner cases (0 and 100) are both more efficient and handled correctly.	2013-01-07 13:05:35 -05:00
Andreas Hansson	852a7bcf92	mem: Add sanity check to packet queue size This patch adds a basic check to ensure that the packet queue does not grow absurdly large. The queue should only be used to store packets that were delayed due to blocking from the neighbouring port, and not for actual storage. Thus, a limit of 100 has been chosen for now (which is already quite substantial).	2013-01-07 13:05:35 -05:00
Andreas Hansson	ce5fc494e3	ruby: Fix missing cxx_header in Switch This patch addresses a warning related to the swig interface generation for the Switch class. The cxx_header is now specified correctly, and the header in question has got a few includes added to make it all compile.	2013-01-07 13:05:35 -05:00
Chris Emmons	b7827a5aaa	config: Replace second keyboard with a mouse. The platform has two KMI devices that are both setup to be keyboards. This patch changes the second keyboard to a mouse. This patch will allow keyboard input as usual and additionally provide mouse support.	2013-01-07 13:05:35 -05:00
Andreas Hansson	174269978a	mem: Fix a bug in the memory serialization file naming This patch fixes a bug that caused multiple systems to overwrite each other's physical memory. The system name is now included in the filename such that this is avoided.	2013-01-07 13:05:35 -05:00
Andreas Sandberg	0d1ad50326	arm: Make ID registers ISA parameters This patch makes the values of ID_ISARx, MIDR, and FPSID configurable as ISA parameter values. Additionally, setMiscReg now ignores writes to all of the ID registers. Note: This moves the MIDR parameter from ArmSystem to ArmISA for consistency.	2013-01-07 13:05:35 -05:00
Andreas Sandberg	3db3f83a5e	arch: Make the ISA class inherit from SimObject The ISA class stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU.	2013-01-07 13:05:35 -05:00
Ali Saidi	69d419f313	o3: Fix issue with LLSC ordering and speculation This patch unlocks the cpu-local monitor when the CPU sees a snoop to a locked address. Previously we relied on the cache to handle the locking for us, however some users on the gem5 mailing list reported a case where the cpu speculatively executes a ll operation after a pending sc operation in the pipeline and that makes the cache monitor valid. This should handle that case by invalidating the local monitor.	2013-01-07 13:05:33 -05:00
Ali Saidi	5146a69835	cpu: rename the misleading inSyscall to noSquashFromTC inSyscall was originally created because during handling of a syscall in SE mode the threadcontext had to be updated. However, in many places this is used in FS mode (e.g. fault handlers) and the name doesn't make much sense. The boolean actually stops gem5 from squashing speculative and non-committed state when a write to a threadcontext happens, so rename the variable to something more appropriate.	2013-01-07 13:05:33 -05:00
Ali Saidi	9a645d6e9b	cache: add note about where conflicts are handled	2013-01-07 13:05:32 -05:00
Gabe Black	e17c375ddd	Decoder: Remove the thread context get/set from the decoder. This interface is no longer used, and getting rid of it simplifies the decoders and code that sets up the decoders. The thread context had been used to read architectural state which was used to contextualize the instruction memory as it came in. That was changed so that the state is now sent to the decoders to keep locally if/when it changes. That's significantly more efficient. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 19:00:45 -06:00
Gabe Black	d1965af220	X86: Move address based decode caching in front of the predecoder. The predecoder in x86 does a lot of work, most of which can be skipped if the decoder cache is put in front of it. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 19:00:44 -06:00
Gabe Black	63b10907ef	SPARC: Keep a copy of the current ASI in the decoder. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 18:09:45 -06:00
Gabe Black	a83e74b37a	ARM: Keep a copy of the fpscr len and stride fields in the decoder. Avoid reading them every instruction, and also eliminate the last use of the thread context in the decoders. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 18:09:35 -06:00
Nilay Vaish	e9fa54de58	x86: implement x87 fp instruction fnstsw This patch implements the fnstsw instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.	2012-12-30 12:45:50 -06:00
Nilay Vaish	23ba6fc5fb	x86: implement x87 fp instruction fsincos This patch implements the fsincos instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.	2012-12-30 12:45:45 -06:00
Nathanael Premillieu	3026a116ba	arm: set uopSet_uop as conditional or unconditional control uopSet_uop is a microop instruction that has the IsControl flag set, but the IsCondControl or IsUncondControl flags seem not to be set, neither in the constructor nor where the microop is used. This patch adds the flags in the constructor of the instruction (MicroUopSetPCCPSR). Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-12 09:50:33 -06:00
Nathanael Premillieu	84fc57bfe6	arm: set movret_uop as conditional or unconditional control A flag was missing for the movret_uop microop instruction. This patch adds that flag when the instruction is used, not directly in the constructor of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-12 09:50:16 -06:00
Nilay Vaish	f3d0be210f	ruby: add support for prefetching to MESI protocol	2012-12-11 10:05:56 -06:00
Nilay Vaish	c120273708	ruby: modify the directed tester to read/write streams The directed tester supports generating only read or only write accesses. The patch modifies the tester to support streams that have both read and write accesses.	2012-12-11 10:05:55 -06:00
Nilay Vaish	9b72a0f627	ruby: change slicc to allow for constructor args The patch adds support to slicc for recognizing arguments that should be passed to the constructor of a class. I did not like the fact that an explicit check was being carried on the type 'TBETable' to figure out the arguments to be passed to the constructor. The patch also moves some of the member variables that are declared for all the controllers to the base class AbstractController.	2012-12-11 10:05:55 -06:00
Nilay Vaish	93e283abb3	ruby: add a prefetcher This patch adds a prefetcher for the ruby memory system. The prefetcher is based on a prefetcher implemented by others (well, I don't know who wrote the original). The prefetcher does stride-based prefetching, both unit and non-unit. It observes the misses in the cache and trains on these. After the training period is over, the prefetcher starts issuing prefetch requests to the controller.	2012-12-11 10:05:54 -06:00
Nilay Vaish	d502384795	ruby: add functions for computing next stride/page address	2012-12-11 10:05:53 -06:00
Erik Tomusk	3dc7e4f496	TournamentBP: Fix some bugs with table sizes and counters globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 09:31:06 -06:00
Malek Musleh	150e9b8c68	inorder cpu: add missing DPRINTF argument Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 05:25:40 -06:00
Nathanael Premillieu	eb899407c5	o3 cpu: remove some unused buggy functions in the lsq Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 04:36:51 -06:00
Nilay Vaish	2d6470936c	sim: have a curTick per eventq This patch adds a _curTick variable to an eventq. This variable is updated whenever an event is serviced in function serviceOne(), or all events up to a particular time are processed in function serviceEvents(). This change helps when there are eventqs that do not make use of curTick for scheduling events.	2012-11-16 10:27:47 -06:00
Nilay Vaish	90c45c29fe	ruby: support functional accesses in garnet flexible network	2012-11-10 17:18:01 -06:00
Nilay Vaish	1492ab066d	ruby: bug in functionalRead, revert recent changes Recent changes to functionalRead() in the memory system were not correct. The change allowed for returning data from the first message found in the buffers of the memory system. This is not correct since it is possible that a timing message has data from an older state of the block. The changes are being reverted.	2012-11-10 17:18:00 -06:00
Andreas Hansson	c4b36901d0	mem: Fix DRAM draining to ensure write queue is empty This patch fixes the draining of the SimpleDRAM controller model. The controller performs buffering of writes and normally there is no need to ever empty the write buffer (if you have a fast on-chip memory, then use it). The patch adds checks to ensure the write buffer is drained when the controller is asked to do so.	2012-11-08 04:25:06 -05:00
Hamid Reza Khaleghzadeh, Lluc Alvarez <lluc.alvarez@bsc.es>, Nilay Vaish <nilay@cs.wisc.edu>	8cd475d58e	ruby: reset and dump stats along with reset of the system This patch adds support to ruby so that the statistics maintained by ruby are reset/dumped when the statistics for the rest of the system are reset/dumped. For resetting the statistics, ruby now provides the resetStats() function that a sim object can provide. As a consequence, the clearStats() function has been removed from RubySystem. For dumping stats, Ruby now adds a callback event to the dumpStatsQueue. The exit callback that ruby used to add earlier is being removed. Created by: Hamid Reza Khaleghzadeh. Improved by: Lluc Alvarez, Nilay Vaish Committed by: Nilay Vaish	2012-11-02 12:18:25 -05:00
Ali Saidi	ce5766c409	mem: fix use after free issue in memories until 4-phase work complete.	2012-11-02 11:50:16 -05:00
Andreas Sandberg	ddd6af414c	mem: Add support for writing back and flushing caches This patch adds support for the following optional drain methods in the classical memory system's cache model: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested. Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	050f24c796	sim: Add drain methods to request additional cleanup operations This patch adds the following two methods to the Drainable base class: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate memory system buffers. Dirty data won't be written back. Calling memWriteback() after draining will allow us to checkpoint systems with caches. memInvalidate() can be used to drop memory system buffers in preparation for switching to an accelerated CPU model that bypasses the gem5 memory system (e.g., hardware virtualized CPUs). Note: This patch only adds the methods to Drainable, the code for flushing the TLB and the cache is committed separately.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	aae6134b54	sim: Add SWIG interface for Serializable This changeset adds a SWIG interface for the Serializable class, which fixes a warning when compiling the SWIG interface for the event queue. Currently, the only method exported is the name() method.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	dc01535c7e	python: Rename doDrain()->drain() and make it do the right thing There is no point in exporting the old drain() method in Simulate.py. It should only be used internally by doDrain(). This patch moves the old drain() method into doDrain() and renames doDrain() to drain().	2012-11-02 11:32:02 -05:00
Andreas Sandberg	196397fea4	sim: Reuse the code to change memory mode. changeToAtomic and changeToTiming both do essentially the same thing, they check the type of their input argument, drain the system, and switch to the desired memory mode. This patch moves all of that code to a separate method (changeMemoryMode) and calls that from both changeToAtomic and changeToTiming.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	b81a977e6a	sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	eb703a4b4e	cpu: O3 add a header declaring the DerivO3CPU SWIG needs a complete declaration of all wrapped objects. This patch adds a header file with the DerivO3CPU class and includes it in the SWIG interface. --HG-- rename : src/cpu/o3/cpu_builder.cc => src/cpu/o3/deriv.cc	2012-11-02 11:32:01 -05:00
Andreas Sandberg	ebe65a394b	cpu: Add header files for checker CPUs In order to create reliable SWIG wrappers, we need to include the declaration of the wrapped class in the SWIG file. Previously, we didn't expose the declaration of checker CPUs. This patch adds header files for such CPUs and include them in the SWIG wrapper. --HG-- rename : src/cpu/dummy_checker_builder.cc => src/cpu/dummy_checker.cc rename : src/cpu/o3/checker_builder.cc => src/cpu/o3/checker.cc	2012-11-02 11:32:01 -05:00
Andreas Sandberg	df02047d5a	dev: Fix ethernet device inheritance structure The Python wrappers and the C++ should have the same object structure. If this is not the case, bad things will happen when the SWIG wrappers cast between an object and any of its base classes. This was not the case for NSGigE and Sinic devices. This patch makes NSGigE and Sinic inherit from the new EtherDevBase class, which in turn inherits from EtherDevice. As a bonus, this removes some duplicated statistics from the Sinic device.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	c0ab52799c	sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce the use of the header attribute, but a warning will be generated for objects that do not use it.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	044a652587	pci: Make Python wrapper cast to the right type The PCI base class is PciDev and not PciDevice, which is used by the Python world. Make sure this is reflected in the wrapper code.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	249e318212	mips: Remove unused Python file Remove BISystem.py, BareIronMipsSystem is already implemented in MipsSystem.py.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	49a799ce77	dev: Add missing inline declarations	2012-11-02 11:32:01 -05:00
Andreas Sandberg	00b3a57d88	base: Add missing header file to addr_range.hh.	2012-11-02 11:32:01 -05:00
Dam Sunwoo	81406018b0	ARM: dump stats and process info on context switches This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM).	2012-11-02 11:32:01 -05:00
Chander Sudanthi	322daba74c	base: Fix a few incorrectly handled print format cases This patch ensures cases like %0.6u, %06f, and %.6u are processed correctly. The case like %06f is ambiguous and was made to match printf. Also, this patch removes the goto statement in cprintf.cc in favor of a function call.	2012-11-02 11:32:00 -05:00
Chander Sudanthi	55787cc0d0	base: split out the VncServer into a VncInput and Server classes This patch adds a VncInput base class which VncServer inherits from. Another class can implement the same interface and be used instead of the VncServer, for example a class that replays Vnc traffic. --HG-- rename : src/base/vnc/VncServer.py => src/base/vnc/Vnc.py rename : src/base/vnc/vncserver.cc => src/base/vnc/vncinput.cc rename : src/base/vnc/vncserver.hh => src/base/vnc/vncinput.hh	2012-11-02 11:32:00 -05:00
Dam Sunwoo	ac161c1d72	ISA: generic Linux thread info support This patch takes the Linux thread info support scattered across different ISA implementations (currently in ARM, ALPHA, and MIPS), and unifies them into a single file. Adds a few more helper functions to read out TGID, mm, etc. ISA-specific information (e.g., ALPHA PCBB register) is now moved to the corresponding isa_traits.hh files.	2012-11-02 11:32:00 -05:00
Ali Saidi	d0678d1c31	sim: Fix an issue where exit events on instr queues are used after freed.	2012-11-02 11:32:00 -05:00
Mrinmoy Ghosh	4440332bdd	o3: Fix a couple of issues with the local predictor. Fix some issues with the local predictor and the way it's indexed.	2012-11-02 11:32:00 -05:00
Andreas Sandberg	7e25052fee	Partly revert [4f54b0f229b5] and move draining to m5.changeToTiming Changeset 4f54b0f229b5 removed the call to doDrain in changeToTiming based on the assumption that the system does not need draining when running in atomic mode. This is a false assumption since at least the System class requires the system to be drained before it allows switching of memory modes. This patch reverts that part of the changeset.	2012-11-02 11:32:00 -05:00
Andreas Hansson	3d98119717	mem: Fix typo in port comments This patch merely fixes a few typos in the port comments.	2012-10-31 09:28:23 -04:00
Andreas Hansson	6f6adbf0f6	dev: Make default clock more reasonable for system and devices This patch changes the default system clock from 1THz to 1GHz. This clock is used by all modules that do not override the default (parent clock), and primarily affects the IO subsystem. Every DMA device uses its clock to schedule the next transfer, and the change will thus cause this inter-transfer delay to be longer. The default clock of the bus is removed, as the clock inherited from the system provides exactly the same value. A follow-on patch will bump the stats.	2012-10-25 13:14:44 -04:00
Andreas Hansson	1fdc4e850e	arm: Use table walker clock that is inherited from CPU This patch simplifies the scheduling of the next walk for the ARM table walker. Previously it used the CPU clock, but as the table walker inherits the clock from the CPU, it is cleaner to simply use its own clock (which is the same).	2012-10-25 04:32:42 -04:00
Andreas Hansson	69e82539fd	dev: Remove zero-time loop in DMA timing send This patch removes the zero-time loop used to send items from the DMA port transmit list. Instead of having a loop, the DMA port now uses an event to schedule sending of a single packet. Ultimately this patch serves to ease the transition to a blocking 4-phase handshake. A follow-on patch will update the regression statistics.	2012-10-23 04:49:33 -04:00
Nilay Vaish	52d8693677	ruby: functional access updates to network test protocol I had forgotten to change the network test protocol while making changes to ruby for supporting functional accesses. This patch updates the protocol so that it can compile correctly.	2012-10-18 18:35:42 -05:00
Nilay Vaish	5ffc165939	ruby: improved support for functional accesses This patch adds support to different entities in the ruby memory system for more reliable functional read/write accesses. Only the simple network has been augmented as of now. Later on Garnet will also support functional accesses. The patch adds functional access code to all the different types of messages that protocols can send around. These messages are functionally accessed by going through the buffers maintained by the network entities. The patch also rectifies some of the bugs found in coherence protocols while testing the patch. With this patch applied, functional writes always succeed. But functional reads can still fail.	2012-10-15 17:51:57 -05:00
Nilay Vaish	07ce90f7aa	memtest: move check on outstanding requests The Memtest tester allows for only one request to be outstanding for a particular physical address. The check has been written separately for reads and writes. This patch moves the check earlier than its current position so that it need not be written separately for reads and writes.	2012-10-15 17:27:17 -05:00
Nilay Vaish	61434a9943	ruby: register multiple memory controllers Currently the Ruby System maintains pointer to only one of the memory controllers. But there can be multiple controllers in the system. This patch adds a vector of memory controllers.	2012-10-15 17:27:17 -05:00
Nilay Vaish	c14e6cfc4e	ruby: remove AbstractMemOrCache The only place where this abstract class is in use is the memory controller, which itself is an abstract class. Does not seem useful at all.	2012-10-15 17:27:16 -05:00
Nilay Vaish	3e607f146f	ruby: allow function definition in slicc structs This patch adds support for function definitions to appear in slicc structs. This is required for supporting functional accesses for different types of messages. Subsequent patches will make use of this.	2012-10-15 17:27:16 -05:00
Nilay Vaish	c7b0901b97	ruby banked array: do away with event scheduling It seems unnecessary that the BankedArray class needs to schedule an event to figure out when the access ends. Instead only the time for the end of access needs to be tracked.	2012-10-15 17:27:15 -05:00
Nilay Vaish	6a65fafa52	ruby: reset timing after cache warm up Ruby system was recently converted to a clocked object. Such objects maintain state related to the time that has passed so far. During the cache warmup, Ruby system changes its own time and the global time. Later on, the global time is restored. So Ruby system also needs to reset its own time.	2012-10-15 17:27:15 -05:00
Andreas Hansson	b6bd4f34b4	Mem: Fix incorrect logic in bus blocksize check This patch fixes the logic in the blocksize check such that the warning is printed if the size is not 16, 32, 64 or 128.	2012-10-15 12:51:21 -04:00
Andreas Hansson	2a740aa096	Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocol-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.	2012-10-15 08:12:35 -04:00
Andreas Hansson	9baa35ba80	Mem: Separate the host and guest views of memory backing store This patch moves all the memory backing store operations from the independent memory controllers to the global physical memory. The main reason for this patch is to allow address striping in a future set of patches, but at this point it already provides some useful functionality in that it is now possible to change the number of memory controllers and their address mapping in combination with checkpointing. Thus, the host and guest view of the memory backing store are now completely separate. With this patch, the individual memory controllers are far simpler as all responsibility for serializing/unserializing is moved to the physical memory. Currently, the functionality is more or less moved from AbstractMemory to PhysicalMemory without any major changes. However, in a future patch the physical memory will also resolve any ranges that are interleaved and properly assign the backing store to the memory controllers, and keep the host memory as a single contiguous chunk per address range. Functionality for future extensions which involve CPU virtualization also enable the host to get pointers to the backing store.	2012-10-15 08:12:32 -04:00
Andreas Hansson	d7ad8dc608	Checkpoint: Make system serialize call children This patch changes how the serialization of the system works. The base class had a non-virtual serialize and unserialize, that was hidden by a function with the same name for a number of subclasses (most likely not intentional as the base class should have been virtual). A few of the derived systems had no specialization at all (e.g. Power and x86 that simply called the System::serialize), but MIPS and Alpha add additional symbol table entries to the checkpoint. Instead of overriding the virtual function, the additional entries are now printed through a virtual function (un)serializeSymtab. The reason for not calling System::serialize from the two related systems is that a follow up patch will require the system to also serialize the PhysicalMemory, and if this is done in the base class it ends up being between the general parts and the specialized symbol table. With this patch, the checkpoint is not modified, as the order of the segments is unchanged.	2012-10-15 08:12:29 -04:00
Andreas Hansson	0c58106b6e	Mem: Use deque instead of list for bus retries This patch changes the data structure used to keep track of ports that should be told to retry. As the bus is doing this in an FCFS way, there is no point having a list. A deque is a better match (and is at least in theory a better choice from a performance point of view).	2012-10-15 08:12:25 -04:00
Andreas Hansson	93a159875a	Fix: Address a few minor issues identified by cppcheck This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only gets called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used.	2012-10-15 08:12:23 -04:00
Andreas Hansson	88554790c3	Mem: Use cycles to express cache-related latencies This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counted in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.	2012-10-15 08:10:54 -04:00
Andreas Hansson	1c321b8847	Regression: Use CPU clock and 32-byte width for L1-L2 bus This patch changes the CoherentBus between the L1s and L2 to use the CPU clock and also four times the width compared to the default bus. The parameters are not intended to fit every single scenario, but rather serve as a better starting point than what we previously had. Note that the scripts that do not use the addTwoLevelCacheHiearchy are not affected by this change. A separate patch will update the stats.	2012-10-15 08:08:08 -04:00
Andreas Hansson	930db9257d	Clock: Inherit the clock from parent by default This patch changes the default 1 Tick clock period to a proxy that resolves the parent's clock. As a result of this, the caches and L1-to-L2 bus, for example, will automatically use the clock period of the CPU unless explicitly overridden. To ensure backwards compatibility, the System class overrides the proxy and specifies a 1 Tick clock. We could change this to something more reasonable in a follow-on patch, perhaps 1 GHz or something similar. With this patch applied, all clocked objects should have a reasonable clock period set, and could start specifying delays in Cycles instead of absolute time.	2012-10-15 08:07:07 -04:00
Andreas Hansson	8cc503f1dd	Param: Fix proxy traversal to support chained proxies This patch modifies how proxies are traversed and unproxied to allow chained proxies. The issue that is solved manifested itself when a proxy during its evaluation ended up hitting another proxy, and the second one got evaluated using the object that was originally used for the first proxy. For a more tangible example, see the following patch on making the default clock being inherited from the parent. In this patch, the CPU clock is a proxy Parent.clock, which is overridden in the system to be an actual value. This all works fine, but the AlphaLinuxSystem has a boot_cpu_frequency parameter that is Self.cpu[0].clock.frequency. When the latter is evaluated, it all happens relative to the current object of the proxy, i.e. the system. Thus the cpu.clock is evaluated as Parent.clock, but using the system rather than the cpu as the object to enquire.	2012-10-15 08:07:06 -04:00
Andreas Hansson	36d199b9a9	Mem: Use range operations in bus in preparation for striping This patch transitions the bus to use the AddrRange operations instead of directly accessing the start and end. The change facilitates the move to a more elaborate AddrRange class that also supports address striping in the bus by specifying interleaving bits in the ranges. Two new functions are added to the AddrRange to determine if two ranges intersect, and if one is a subset of another. The bus propagation of address ranges is also tweaked such that an update is only propagated if the bus received information from all the downstream slave modules. This avoids the iteration and need for the cycle-breaking scheme that was previously used.	2012-10-15 08:07:04 -04:00
Andreas Hansson	43ca8415e8	Mem: Determine bus block size during initialisation This patch moves the block size computation from findBlockSize to initialisation time, once all the neighbouring ports are connected. There is no need to dynamically update the block size, and the caching of the value effectively avoided that anyhow. This is very similar to what was already in place, just with a slightly leaner implementation.	2012-10-11 06:38:43 -04:00
Andreas Hansson	5dba9225f7	Doxygen: Update the version of the Doxyfile This patch bumps the Doxyfile to match more recent versions of Doxygen. The sections that are deprecated have been removed, and the new ones added. The project name has also been updated.	2012-10-11 06:38:42 -04:00
Nilay Vaish	88ba1c452b	ruby: makes some members non-static This patch makes some of the members (profiler, network, memory vector) of ruby system non-static.	2012-10-02 14:35:45 -05:00
Nilay Vaish	4488379244	ruby: changes to simple network This patch makes the Switch structure inherit from BasicRouter, as is done in two other networks.	2012-10-02 14:35:45 -05:00
Nilay Vaish	b370f6a7b2	ruby: rename template_hack to template I don't like using the word hack. Hence, the patch.	2012-10-02 14:35:44 -05:00
Nilay Vaish	d58f84c481	ruby: remove unused code in protocols	2012-10-02 14:35:44 -05:00
Nilay Vaish	73eafe4849	ruby: remove some unused things in slicc This patch removes the parts of slicc that were required for multi-chip protocols. Going ahead, it seems multi-chip protocols would be implemented by playing with the network itself.	2012-10-02 14:35:43 -05:00
Nilay Vaish	3c9d3b16d8	ruby: move functional access to ruby system This patch moves the code for functional accesses to ruby system. This is because the subsequent patches add support for making functional accesses to the messages in the interconnect. Making those accesses from the ruby port would be cumbersome.	2012-10-02 14:35:42 -05:00
Nilay Vaish	95664da097	MI coherence protocol: add copyright notice	2012-09-30 13:20:53 -05:00
Djordje Kovacevic	80a26a3e39	MEM: Put memory system document into doxygen	2012-09-25 11:49:41 -05:00
Mrinmoy Ghosh	6fc0094337	Cache: add a response latency to the caches In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set on the cache for the backward path.	2012-09-25 11:49:41 -05:00
Sascha Bischoff	74ab69c7ea	Statistics: Add a function to configure periodic stats dumping This patch adds a function, periodicStatDump(long long period), which will dump and reset the statistics every period. This function is designed to be called from the python configuration scripts. This allows the periodic stats dumping to be configured more easily at run time. The period is currently specified as a long long as there are issues passing Tick into the C++ from the python as they have conflicting definitions. If the period is less than curTick, the first occurrence is at curTick. If the period is set to 0, then the event is descheduled and the stats are not periodically dumped. Due to issues when resuming from a checkpoint, the StatDump event must be moved forward such that it occurs AFTER the current tick. As the function is called from the python, the event is scheduled before the system resumes from the checkpoint. Therefore, the event is moved using the updateEvents() function. This is called from simulate.py once the system has resumed from the checkpoint. NOTE: It should be noted that this is a fairly temporary patch which re-adds the capability to extract temporal information from the communication monitors. It should not be used at the same time as anything that relies on dumping the statistics based on in-simulation events, i.e. a context switch.	2012-09-25 11:49:41 -05:00
Dam Sunwoo	acbb7a2eed	ARM: added support for flattened device tree blobs Newer Linux kernels require DTB (device tree blobs) to specify platform configurations. The input DTB filename can be specified through gem5 parameters in LinuxArmSystem.	2012-09-25 11:49:41 -05:00
Ali Saidi	5adb4ddc12	O3: Pack the comm structures a bit better to reduce their size.	2012-09-25 11:49:40 -05:00
Ali Saidi	396600de10	mem: Add a gasket that allows memory ranges to be re-mapped. For example if DRAM is at two locations and mirrored this patch allows the mirroring to occur.	2012-09-25 11:49:40 -05:00
Ali Saidi	0c99d21ad7	ARM: Squash outstanding walks when instructions are squashed.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	6f603e0807	arm: Use a static_assert to test that miscRegName[] is complete Instead of statically defining miscRegName to contain NUM_MISCREGS elements, let the compiler determine the length of the array. This allows us to use a static_assert to test that all registers are listed in the name vector.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	4544f3def4	base: Check for static_assert support and provide fallback C++11 has support for static_asserts to provide compile-time assertion checking. This is very useful when testing, for example, structure sizes to make sure that the compiler got the right alignment or vector sizes.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	6598241f2c	sim: Move CPU-specific methods from SimObject to the BaseCPU class	2012-09-25 11:49:40 -05:00
Andreas Sandberg	5f32eceeda	sim: Remove SimObject::setMemoryMode Remove SimObject::setMemoryMode from the main SimObject class since it is only valid for the System class. In addition to removing the method from the C++ sources, this patch also removes getMemoryMode and changeTiming from SimObject.py and updates the simulation code to call the (get\|set)MemoryMode method on the System object instead.	2012-09-25 11:49:40 -05:00
Djordje Kovacevic	d060a28a29	CPU: Add abandoned instructions to O3 Pipe Viewer	2012-09-25 11:49:40 -05:00
Nathanael Premillieu	bfffbb6797	ARM: Inst writing to cntrlReg registers not set as control inst Deletion of the fact that instructions that write to registers of type "cntrlReg" are not set as control instructions (flag IsControl not set).	2012-09-25 11:49:40 -05:00
Ali Saidi	04ca96427c	ARM: Predict target of more instructions that modify PC.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	1b29352dd5	build: Add missing dependencies when building param SWIG interfaces This patch adds an explicit dependency between param_%s.i and the Python source file defining the object. Previously, the build system didn't rebuild SWIG interfaces correctly when an object's Python sources were updated.	2012-09-25 11:49:40 -05:00
Joel Hestness	4095af5fd6	RubyPort and Sequencer: Fix draining Fix the drain functionality of the RubyPort to only call drain on child ports during a system-wide drain process, instead of calling each time that a ruby_hit_callback is executed. This fixes the issue of the RubyPort ports being reawakened during the drain simulation, possibly with work they didn't previously have to complete. If they have new work, they may call process on the drain event that they had not registered work for, causing an assertion failure when completing the drain event. Also, in RubyPort, set the drainEvent to NULL when there are no events to be drained. If not set to NULL, the drain loop can result in stale drainEvents used.	2012-09-23 13:57:08 -05:00
Andreas Hansson	3b6a143ec5	DRAM: Introduce SimpleDRAM to capture a high-level controller This patch introduces a high-level model of a DRAM controller, with a basic read/write buffer structure, a selectable and customisable arbiter, a few address mapping options, and the basic DRAM timing constraints. The parameters make it possible to turn this model into any desired DDRx/LPDDRx/WideIOx memory controller. The intention is not to be cycle accurate or capture every aspect of a DDR DRAM interface, but rather to enable exploring of the high-level knobs with a good simulation speed. Thus, contrary to e.g. DRAMSim this module emphasizes simulation speed with a good-enough accuracy. This module is merely a starting point, and there are plenty additions and improvements to come. A notable addition is the support for address-striping in the bus to enable a multi-channel DRAM controller. Also note that there are still a few "todo's" in the code base that will be addressed as we go along. A follow-up patch will add basic performance regressions that use the traffic generator to exercise a few well-defined corner cases.	2012-09-21 11:48:13 -04:00
Andreas Hansson	d75b1b5a73	TrafficGen: Add a basic traffic generator This patch adds a traffic generator to the code base. The generator is aimed to be used as a black box model to create appropriate use-cases and benchmarks for the memory system, and in particular the interconnect and the memory controller. The traffic generator is a master module, where the actual behaviour is captured in a state-transition graph where each state generates some sort of traffic. By constructing a graph it is possible to create very elaborate scenarios from basic generators. Currently the set of generators includes idling, linear address sweeps, random address sequences and playback of traces (recording will be done by the Communication Monitor in a follow-up patch). At the moment the graph and the states are described in an ad-hoc line-based format, and in the future this should be aligned with our use of e.g. the Google protobufs. Similarly for the traces, the format is currently a simplistic ad-hoc line-based format that merely serves as a starting point. In addition to being used as a black-box model for system components, the traffic generator is also useful for creating test cases and regressions for the interconnect and memory system. In future patches we will use the traffic generator to create DRAM test cases for the controller model. The patch following this one adds a basic regression which also contains an example configuration script and trace file for playback.	2012-09-21 11:48:08 -04:00
Andreas Hansson	4aee3aa073	Mem: Tidy up bus member variables types This patch merely tidies up the types used for the bus member variables. It also makes the constant ones const.	2012-09-21 10:11:24 -04:00
Lluc Alvarez	c8de765468	SE: Ignore FUTEX_PRIVATE_FLAG of sys_futex This patch ignores the FUTEX_PRIVATE_FLAG of the sys_futex system call in SE mode. With this patch, when sys_futex with the options FUTEX_WAIT_PRIVATE or FUTEX_WAKE_PRIVATE is emulated, the FUTEX_PRIVATE_FLAG is ignored and so their behaviours are the regular FUTEX_WAIT and FUTEX_WAKE. Emulating FUTEX_WAIT_PRIVATE and FUTEX_WAKE_PRIVATE as if they were non-private is safe from a functional point of view. The FUTEX_PRIVATE_FLAG does not change the semantics of the futex, it's just a mechanism to improve performance under certain circumstances that can be ignored in SE mode.	2012-09-21 04:51:18 -04:00
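
Ignoring FUTEX_PRIVATE_FLAG, as commit c8de765468 describes, amounts to masking one bit out of the futex operation code. The sketch below uses the constant values from the Linux `linux/futex.h` header; the helper name `strip_private_flag` is hypothetical and this is not the gem5 syscall emulation code.

```python
# Values as defined in the Linux header linux/futex.h.
FUTEX_WAIT = 0
FUTEX_WAKE = 1
FUTEX_PRIVATE_FLAG = 128
FUTEX_WAIT_PRIVATE = FUTEX_WAIT | FUTEX_PRIVATE_FLAG
FUTEX_WAKE_PRIVATE = FUTEX_WAKE | FUTEX_PRIVATE_FLAG


def strip_private_flag(op: int) -> int:
    """Return the futex operation with the private flag cleared (hypothetical helper)."""
    return op & ~FUTEX_PRIVATE_FLAG
```

After masking, `FUTEX_WAIT_PRIVATE` and `FUTEX_WAKE_PRIVATE` dispatch exactly like `FUTEX_WAIT` and `FUTEX_WAKE`, and operations that never carried the flag are left unchanged.
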
Anthony Gutierrez	9cd0c5ecc8	bus: removed outdated warn regarding 64 B block sizes this warn is outdated as 64 B blocks are very common, and even the default size for some CPU types. E.g., arm_detailed.	2012-09-20 17:25:52 -04:00
Andreas Hansson	a731f8f9dd	Mem: Remove the file parameter from AbstractMemory This patch removes the unused file parameter from the AbstractMemory. The patch serves to make it easier to transition to a separation of the actual contiguous host memory backing store, and the gem5 memory controllers. Without the file parameter it becomes easier to hide the creation of the mmap in the PhysicalMemory, as there are no longer any reasons to expose the actual contiguous ranges to the user. To the best of my knowledge there is no use of the parameter, so the change should not affect anyone.	2012-09-19 06:15:46 -04:00
Andreas Hansson	ffb6aec603	AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. --HG-- rename : src/base/range.hh => src/base/addr_range.hh rename : src/base/range_map.hh => src/base/addr_range_map.hh	2012-09-19 06:15:44 -04:00
Andreas Hansson	c34df76272	AddrRange: Simplify Range by removing stream input/output This patch simplifies the Range class in preparation for the introduction of a more specific AddrRange class that allows interleaving/striping. The only place where the parsing was used was in the unit test.	2012-09-19 06:15:43 -04:00
Andreas Hansson	12c291f9d7	AddrRange: Remove unused range_multimap This patch simply removes the unused range_multimap in preparation for a more specific AddrRangeMap that also allows interleaving in addition to pure ranges.	2012-09-19 06:15:42 -04:00
Andreas Hansson	fccbf8bb45	AddrRange: Simplify AddrRange params Python hierarchy This patch simplifies the Range object hierarchy in preparation for an address range class that also allows striping (e.g. selecting a few bits as matching in addition to the range). To extend the AddrRange class to an AddrRegion, the first step is to simplify the hierarchy such that we can make it as lean as possible before adding the new functionality. The only class using Range and MetaRange is AddrRange, and the three classes are now collapsed into one.	2012-09-19 06:15:41 -04:00
Nilay Vaish	33c904e0a5	ruby: eliminate typedef integer_t	2012-09-18 22:49:12 -05:00
Nilay Vaish	86b1c0fd54	ruby: avoid using g_system_ptr for event scheduling This patch removes the use of g_system_ptr for event scheduling. Each consumer object now needs to specify upfront an EventManager object it would use for scheduling events. This makes the ruby memory system more amenable for a multi-threaded simulation.	2012-09-18 22:46:34 -05:00
Andreas Hansson	7c55464aac	Mem: Add a maximum bandwidth to SimpleMemory This patch makes a minor addition to the SimpleMemory by enforcing a maximum data rate. The bandwidth is configurable, and a reasonable value (12.8GB/s) has been chosen as the default. The changes do add some complexity to the SimpleMemory, but they should definitely be justifiable as this enables a far more realistic setup using even this simple memory controller. The rate regulation is done for reads and writes combined to reflect the bidirectional data busses used by most (if not all) relevant memories. Moreover, the regulation is done per packet as opposed to long term, as it is the short term data rate (data bus width times frequency) that is the limiting factor. A follow-up patch bumps the stats for the regressions.	2012-09-18 10:30:02 -04:00
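
The per-packet rate regulation that commit 7c55464aac describes can be sketched as a single "busy until" timestamp shared by reads and writes. This is an illustration only: the class and method names are hypothetical, 12.8GB/s is taken to mean 12.8 × 10^9 bytes per second, and one tick is assumed to be one picosecond.

```python
TICKS_PER_SECOND = 10**12  # assumption: 1 tick = 1 ps


class BandwidthLimitedMemory:
    """Hypothetical model: each packet occupies the data bus for size / bandwidth."""

    def __init__(self, bytes_per_second: int):
        self.bytes_per_second = bytes_per_second
        self.busy_until = 0  # tick at which the shared data bus becomes free

    def packet_duration(self, size_bytes: int) -> int:
        # Ticks needed to move one packet at the configured rate.
        return size_bytes * TICKS_PER_SECOND // self.bytes_per_second

    def try_accept(self, now: int, size_bytes: int) -> bool:
        # Reads and writes share one bus, so a single timestamp covers both.
        if now < self.busy_until:
            return False  # the caller would have to retry later
        self.busy_until = now + self.packet_duration(size_bytes)
        return True


mem = BandwidthLimitedMemory(12_800_000_000)
```

Under these assumptions a 64-byte packet occupies the bus for 5000 ticks (5 ns), so a second packet offered before then is refused. Regulating per packet in this way bounds the short-term data rate rather than a long-term average.
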
Andreas Hansson	d1f3a3b91a	gcc: Enable Link-Time Optimization for gcc >= 4.6 This patch adds Link-Time Optimization when building the fast target using gcc >= 4.6, and adds a scons flag to disable it (-no-lto). No check is performed to guarantee that the linker supports LTO and use of the linker plugin, so the user has to ensure that binutils GNU ld >= 2.21 or the gold linker is available. Typically, if gcc >= 4.6 is available, the latter should not be a problem. Currently the LTO option is only useful for gcc >= 4.6, due to the limited support on clang and earlier versions of gcc. The intention is to also add support for clang once the LTO integration matures. The same number of jobs is used for the parallel phase of LTO as the jobs specified on the scons command line, using the -flto=n flag that was introduced with gcc 4.6. The gold linker also supports concurrent and incremental linking, but this is not used at this point. The compilation and linking time is increased by almost 50% on average, although ARM seems to be particularly demanding with an increase of almost 100%. Also beware when using this as gcc uses a tremendous amount of memory and temp space in the process. You have been warned. After some careful consideration, and plenty of discussions, the flag is only added to the fast target, and the warning that was issued in an earlier version of this patch is now removed. Similarly, where the flag previously enabled LTO, the default is now to use it, and the flag has been modified to disable LTO. The rationale behind this decision is that opt is used for development, whereas fast is only used for long runs, e.g. regressions or more elaborate experiments where the additional compile and link time is amortized by a much larger run time. When it comes to the return on investment, the regression seems to be roughly 15% faster with LTO. 
For a bit more detail, I ran twolf on ARM.fast, with three repeated runs, and they all finish within 42 minutes (+- 25 seconds) without LTO and 31 minutes (+- 25 seconds) with LTO, i.e. LTO gives an impressive >25% speed-up for this case. Without LTO (ARM.fast twolf) real 42m37.632s user 42m34.448s sys 0m0.390s real 41m51.793s user 41m50.384s sys 0m0.131s real 41m45.491s user 41m39.791s sys 0m0.139s With LTO (ARM.fast twolf) real 30m33.588s user 30m5.701s sys 0m0.141s real 31m27.791s user 31m24.674s sys 0m0.111s real 31m25.500s user 31m16.731s sys 0m0.106s	2012-09-14 12:13:22 -04:00
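
The ">25% speed-up" quoted in commit d1f3a3b91a can be checked from the six `real` times listed in the message. The sketch below only redoes that arithmetic; the helper names are hypothetical, and "speed-up" is read here as the reduction in mean wall-clock time.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '42m37.632s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


def mean_seconds(stamps):
    return sum(to_seconds(s) for s in stamps) / len(stamps)


# 'real' times copied from the commit message (ARM.fast twolf).
without_lto = ["42m37.632s", "41m51.793s", "41m45.491s"]
with_lto = ["30m33.588s", "31m27.791s", "31m25.500s"]

base = mean_seconds(without_lto)
lto = mean_seconds(with_lto)
time_saved = 1 - lto / base   # fraction of run time removed by LTO
throughput_gain = base / lto  # how many times faster the LTO build runs
```

On these numbers the mean run time drops by roughly 26%, which corresponds to the LTO binary running about 1.35 times as fast.
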
Andreas Hansson	a57eda0843	scons: Add a target for google-perftools profiling This patch adds a new target called 'perf' that facilitates profiling using google perftools rather than gprof. The perftools CPU profiler offers plenty useful information in addition to gprof, and the latter is kept mostly to offer profiling also on non-Linux hosts.	2012-09-14 12:13:21 -04:00
Andreas Hansson	224ea5fba6	scons: Restructure ccflags and ldflags This patch restructures the ccflags such that the common parts are defined in a single location, also capturing all the target types in a single place. The patch also adds a corresponding ldflags in preparation for google-perf profiling support and the addition of Link-Time Optimization.	2012-09-14 12:13:20 -04:00
Andreas Hansson	806a1144ce	scons: Use c++0x with gcc >= 4.4 instead of 4.6 This patch shifts the version of gcc for which we enable c++0x from 4.6 to 4.4 The more long term plan is to see what the c++0x features can bring and what level of support would be enabled simply by bumping the required version of gcc from 4.3 to 4.4. A few minor things had to be fixed in the code base, most notably the choice of a hashmap implementation. In the Ruby Sequencer there were also a few minor issues that gcc 4.4 was not too happy about.	2012-09-14 12:13:18 -04:00
Joel Hestness	234fa4cf7e	Standard Switch: Drain the system before switching CPUs When switching from an atomic CPU to any of the timing CPUs, a drain is unnecessary since no events are scheduled in atomic mode. However, when trying to switch CPUs starting with a timing CPU, there may be events scheduled. This change ensures that all events are drained from the system by calling m5.drain before switching CPUs.	2012-09-12 21:41:37 -05:00
Joel Hestness	16dcb723c1	Base CPU: Initialize profileEvent to NULL The profileEvent pointer is tested against NULL in various places, but it is not initialized unless running in full-system mode. In SE mode, this can result in segmentation faults when profileEvent default-initializes to something other than NULL.	2012-09-12 21:40:28 -05:00
Jason Power	aa8bcd15ec	Ruby: Modify Scons so that we can put .sm files in extras Also allows for header files which are required in slicc generated code to be in a directory other than src/mem/ruby/slicc_interface.	2012-09-12 14:52:04 -05:00
Anthony Gutierrez	c6927ed138	stats: remove duplicate instruction stats from the commit stage these stats are duplicates of insts/opsCommitted, cause confusion, and are poorly named.	2012-09-12 11:35:52 -04:00
Andreas Hansson	292d8252a4	clang: Fix issues identified by the clang static analyzer This patch addresses a few minor issues reported by the clang static analyzer. The analysis was run with: scan-build -disable-checker deadcode \ -enable-checker experimental.core \ -disable-checker experimental.core.CastToStruct \ -enable-checker experimental.cpluscplus	2012-09-11 14:15:47 -04:00
Lena Olson	584eba3ab6	Cache: Split invalidateBlk up to separate block vs. tags This separates the functionality to clear the state in a block into blk.hh and the functionality to update the tag information into the tags. This gets rid of the case where calling invalidateBlk on an already-invalid block does something different than calling it on a valid block, which was confusing.	2012-09-11 14:14:49 -04:00
Nilay Vaish	f47c2f6415	X86: make use of register predication The patch introduces two predicates for condition code registers -- one tests if a register needs to be read, the other tests whether a register needs to be written to. These predicates are evaluated twice -- during construction of the microop and during its execution. Register reads and writes are elided depending on how the predicates evaluate.	2012-09-11 09:33:42 -05:00
Nilay Vaish	6369df59c8	x86: Add a separate register for D flag bit The D flag bit is part of the cc flag bit register currently. But since it is not being used any where in the implementation, it creates an unnecessary dependency. Hence, it is being moved to a separate register.	2012-09-11 09:25:43 -05:00
Nilay Vaish	3700e5448a	ISA Parser: Allow predication of source and destination registers This patch is meant for allowing predicated reads and writes. Note that this predication is different from the ISA provided predication. The way we currently provide the ISA description for X86, we read/write registers that do not need to be actually read/written. This is likely to be true for other ISAs as well. This patch allows for read and write predicates to be associated with operands. It allows for the register indices for source and destination registers to be decided at the time when the microop is constructed. The run time indices come into play only when at least one of the predicates has been provided. This patch will not affect any of the ISAs that do not provide these predicates. Also the patch assumes that the order in which operands appear in any function of the microop is same across all the functions of the microops. A subsequent patch will enable predication for the x86 ISA.	2012-06-03 10:59:04 -05:00
Nilay Vaish	637c6c7e32	Ruby: Use uint32_t instead of uint32 everywhere	2012-09-11 09:24:45 -05:00
Nilay Vaish	f00347a20f	Ruby: Use uint8_t instead of uint8 everywhere	2012-09-11 09:23:56 -05:00
Nilay Vaish	c5bf1390aa	Ruby System: Convert to Clocked Object This patch moves Ruby System from being a SimObject to recently introduced ClockedObject.	2012-09-10 12:21:01 -05:00
Nilay Vaish	4e6f048ef0	Ruby Slicc: remove the call to cin.get() function If I understand correctly, this was put in place so that a debugger can be attached when the protocol aborts. While this sounds useful, it is a problem when the simulation is not being actively monitored. I think it is better to remove this.	2012-09-10 12:20:34 -05:00
Marco Elver	9e0edbcea8	Mem: Allow serializing of more than INT_MAX bytes Despite gzwrite taking an unsigned for length, it returns an int for bytes written; gzwrite fails if (int)len < 0. Because of this, call gzwrite with len no larger than INT_MAX: write in blocks of INT_MAX if data to be written is larger than INT_MAX.	2012-09-10 11:57:43 -04:00
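
The chunking strategy from commit 9e0edbcea8 (never hand the writer more than INT_MAX bytes at once) can be sketched generically. This is not the gem5 serialization code: `write_in_chunks` and its `write_fn` parameter are hypothetical, and `write_fn` stands in for any writer that returns the number of bytes it consumed.

```python
INT_MAX = 2**31 - 1  # largest value of a 32-bit signed int


def write_in_chunks(write_fn, data, max_chunk=INT_MAX):
    """Write `data` through `write_fn` in pieces of at most `max_chunk` bytes.

    `write_fn` is assumed to return the number of bytes it wrote, which is
    why each piece must fit in a signed int in the gzwrite case.
    """
    view = memoryview(data)  # slicing a memoryview does not copy the data
    written = 0
    while written < len(view):
        chunk = view[written:written + max_chunk]
        count = write_fn(chunk)
        if count != len(chunk):
            raise IOError("short write: %d of %d bytes" % (count, len(chunk)))
        written += count
    return written
```

Passing a small `max_chunk` makes the loop easy to exercise without allocating gigabytes; for example, ten bytes with `max_chunk=4` are written as pieces of 4, 4 and 2 bytes.
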
Palle Lyckegaard	21d4d50ba1	NetBSD: Build on NetBSD Minor patch so that building on NetBSD is possible.	2012-09-10 11:57:42 -04:00
Andreas Hansson	3215ed9754	AddrRange: Remove the unused range_ops header This patch prunes the range_ops header that is no longer used. The bridge used it to do filtering of address ranges, but this changed quite some time ago. Ultimately this patch aims to simplify the handling of ranges before specialising the AddrRange to an AddrRegion that also allows striping bits to be selected.	2012-09-10 11:57:40 -04:00
Andreas Hansson	1f9c3bcb46	Inet: Remove the SackRange and its use This patch aims to simplify the use of the Range class before introducing a more elaborate AddrRegion to replace the AddrRange. The SackRange is the only use of the range class besides address ranges, and the removal of this use makes for an easier modification of the range class. The functionality that is removed with this patch is not used anywhere throughout the code base.	2012-09-10 11:57:39 -04:00
Andreas Hansson	cf5935445f	Device: Bump PIO and PCI latencies to more reasonable values This patch addresses a previously highlighted issue with the default latencies used for PIO and PCI devices. The values are merely educated guesses and might not represent the particular system you want to model. However, the values in this patch are definitely far more realistic than the previous ones. In i8254xGBe, the writeConfig method is updated to use configDelay instead of pioDelay. A follow-up patch will update the regression stats.	2012-09-10 11:57:36 -04:00
Andreas Sandberg	d4a6d9846a	sim: Update the SimObject documentation Includes a small change in sim_object.cc that adds the name space to the output stream parameter in serializeAll. Leaving out the name space unfortunately confuses Doxygen.	2012-09-07 14:20:53 -05:00
Andreas Sandberg	2f397f314b	sim: Remove the unused SimObject::regFormulas method Simulation objects normally register derived statistics, presumably what regFormulas originally was meant for, in regStats(). This patch removes regFormulas since there is no need to have a separate method call to register formulas.	2012-09-07 14:20:53 -05:00
Ali Saidi	03ff612054	O3: Get rid of incorrect assert in RAS.	2012-09-07 14:20:53 -05:00
Ali Saidi	2059c01673	dev: Fix bitfield definition in timer_cpulocal.hh Bitfield definition in the local timer model for ARM had the bitfield range numbers reversed which could lead to buggy behavior.	2012-09-07 14:20:53 -05:00
Ali Saidi	5217d5a451	Igbe: Newer kernels seem to allow TSO headers and packet data to be in one desc Implement some code we used to panic on as it actually does happen with the e1000 driver in Linux 3.3+. We used to assume that a TSO header would never be part of a larger payload, however it appears as though it now can be.	2012-09-07 14:20:53 -05:00
Krishnendra Nathella	3f5ee1cf8c	sim: add validation to make sure there is memory where we're loading the kernel	2012-09-07 14:20:53 -05:00
Ali Saidi	3742b19b36	loader: initialize all memory in the ObjectFile objects. Some bare metal build flows seem to build binaries that we aren't necessarily expecting. Initialize everything to 0, so we don't make any assumptions about what is or isn't in the binary.	2012-09-07 14:20:52 -05:00
Ali Saidi	8fc0cef611	ARM: Fix one of the timers used in the VExpress EMM platform.	2012-09-07 14:20:52 -05:00
Andreas Hansson	287ea1a081	Param: Transition to Cycles for relevant parameters This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py.	2012-09-07 12:34:38 -04:00
Joel Hestness	6924e10978	Ruby Memory Controller: Fix clocking	2012-09-05 20:51:41 -05:00
Jason Power	494f6a858e	Ruby: Correct DataBlock =operator The =operator for the DataBlock class was incorrectly interpreting the class member m_alloc. This variable stands for whether the assigned memory for the data block needs to be freed or not by the class itself. It seems that the =operator interpreted the variable as whether the memory is assigned to the data block. This wrong interpretation was causing values not to propagate to RubySystem::m_mem_vec_ptr. This caused major issues with restoring from checkpoints when using a protocol which verified that the cache data was consistent with the backing store (i.e. MOESI-hammer).	2012-08-28 17:57:51 -05:00
Andreas Hansson	0cacf7e817	Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. The changes made as part of this patch mostly follow directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.	2012-08-28 14:30:33 -04:00
Andreas Hansson	d53d04473e	Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.	2012-08-28 14:30:31 -04:00
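
The clock update idea in commit d53d04473e (keep a private cycle and tick pair and advance it, rather than dividing the current tick by the period on every query) can be sketched as below. The class and method names are hypothetical, and the exact update policy (one cheap step first, then a single ceiling division for larger gaps) is an assumption about how such a scheme might look, not a copy of the gem5 code.

```python
class ClockedSketch:
    """Hypothetical clocked object caching its most recent clock edge."""

    def __init__(self, period: int):
        self.period = period  # clock period in ticks
        self._tick = 0        # tick of the cached clock edge
        self._cycle = 0       # cycle number of that edge

    def _update(self, cur_tick: int) -> None:
        if self._tick >= cur_tick:
            return  # cached edge is still at or ahead of the current tick
        # Common case: the object was active in the previous cycle.
        self._tick += self.period
        self._cycle += 1
        if self._tick >= cur_tick:
            return
        # Rare case: long idle gap, catch up with one ceiling division.
        elapsed = -(-(cur_tick - self._tick) // self.period)
        self._cycle += elapsed
        self._tick += elapsed * self.period

    def clock_edge(self, cur_tick: int, cycles: int = 0) -> int:
        """Tick of the next clock edge at or after cur_tick, plus `cycles` periods."""
        self._update(cur_tick)
        return self._tick + cycles * self.period

    def cur_cycle(self, cur_tick: int) -> int:
        self._update(cur_tick)
        return self._cycle
```

With a period of 1000 ticks, a query at tick 1 returns the edge at tick 1000 without any division, while a query at tick 5500 after a long idle stretch falls through to the single division and lands on the edge at tick 6000.
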
Andreas Hansson	d14e5857c7	Port: Stricter port bind/unbind semantics This patch tightens up the semantics around port binding and checks that the ports that are being bound are currently not connected, and similarly connected before unbind is called. The patch consequently also changes the order of the unbind and bind for the switching of CPUs to ensure that the rules are adhered to. Previously the ports would be "over-written" without any check. There are no changes in behaviour due to this patch, and the only place where the unbind functionality is used is in the CPU.	2012-08-28 14:30:27 -04:00
Andreas Hansson	105ad88d35	Checker: Fix checker CPU ports This patch updates how the checker CPU handles the ports such that the regressions will once again run without causing a panic. A minor amount of tidying up was also done as part of this patch.	2012-08-28 14:30:24 -04:00
Andreas Hansson	d090f4d930	swig: Disable unused value warning with llvm 3.1 compilers This patch disables a warning for unused values which causes problems when compiling the swig-generated sources using recent llvm-based compilers like llvm-gcc and clang.	2012-08-28 14:30:22 -04:00
Anthony Gutierrez	5b1614de02	sim: fix overflow check in simulate because Tick is now unsigned	2012-08-27 20:53:20 -04:00
Nilay Vaish	85c7352462	Ruby: remove README.debugging and Decommissioning_note These files were relevant when Ruby was part of GEMS. They are not required any longer.	2012-08-27 14:57:46 -05:00
Nilay Vaish	0737837109	System: Remove redundant call to startupCPU	2012-08-27 01:14:46 -05:00
Nilay Vaish	9190940511	Ruby: Remove RubyEventQueue This patch removes RubyEventQueue. Consumer objects now rely on RubySystem or themselves for scheduling events.	2012-08-27 01:00:55 -05:00
Nilay Vaish	7122b83d8f	Ruby Memory Vector: Allow more than 4GB of memory The memory size variable was a 32-bit int. This meant that the size of the memory was limited to 4GB. This patch changes the type of the variable to 64-bit to support larger memory sizes. Thanks to Raghuraman Balasubramanian for bringing this to notice.	2012-08-27 01:00:54 -05:00
Nilay Vaish	b422994fea	MESI Protocol: Correct the virtual network in profile functions The virtual network in a couple of places was incorrectly mentioned as 3 in place of 1. This is being corrected.	2012-08-25 15:49:06 -05:00
Nilay Vaish	01f1430833	MESI Coherence Protocol: Add copyright notice	2012-08-25 13:16:45 -05:00
Andreas Hansson	2c1052cd4d	DMA: Refactor the DMA device and align timing and atomic This patch does a bunch of house-keeping updates on the DMA, including indentation, and formatting, but most importantly breaks out the response handling such that it can be shared between the atomic and timing modes. It also removes a potential bug caused by the atomic handling of responses only deleting the allocated request (pkt->req) once the DMA action completes instead of doing so for every packet. Before this patch, the handling of responses was near identical for atomic and timing, but the code was simply duplicated. With this patch, the handleResp method deals with the responses in both cases. There are further updates to make after removing the NACKs, but that will be part of a separate follow-up patch. This patch does not change the behaviour of any regression.	2012-08-22 11:40:01 -04:00
Andreas Hansson	c60db56741	Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK from the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occurred in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.	2012-08-22 11:39:59 -04:00
Andreas Hansson	a6074016e2	Bridge: Remove NACKs in the bridge and unify with packet queue This patch removes the NACKing in the bridge, as the split request/response busses now ensure that protocol deadlocks do not occur, i.e. the message-dependency chain is broken by always allowing responses to make progress without being stalled by requests. The NACKs had limited support in the system with most components ignoring their use (with a suitable call to panic), and as the NACKs are no longer needed to avoid protocol deadlocks, the cleanest way is to simply remove them. The bridge is the starting point as this is the only place where the NACKs are created. A follow-up patch will remove the code that deals with NACKs in the endpoints, e.g. the X86 table walker and DMA port. Ultimately the type of packet can be completely removed (until someone sees a need for modelling more complex protocols, which can now be done in parts of the system since the port and interface is split). As a consequence of the NACK removal, the bridge now has to send a retry to a master if the request or response queue was full on the first attempt. This change also makes the bridge ports very similar to QueuedPorts, and a later patch will change the bridge to use these. A first step in this direction is taken by aligning the name of the member functions, as done by this patch. A bit of tidying up has also been done as part of the simplifications. Surprisingly, this patch has no impact on any of the regressions. Hence, there were never any NACKs issued. In a follow-up patch I would suggest changing the size of the bridge buffers set in FSConfig.py to also test the situation where the bridge fills up.	2012-08-22 11:39:58 -04:00
Andreas Hansson	e317d8b9ff	Port: Extend the QueuedPort interface and use where appropriate This patch extends the queued port interfaces with methods for scheduling the transmission of a timing request/response. The methods are named similar to the corresponding sendTiming(Snoop)Req/Resp, replacing the "send" with "sched". As the queues are currently unbounded, the methods always succeed and hence do not return a value. This functionality was previously provided in the subclasses by calling PacketQueue::schedSendTiming with the appropriate parameters. With this change, there is no need to introduce these extra methods in the subclasses, and the use of the queued interface is more uniform and explicit.	2012-08-22 11:39:56 -04:00
Andreas Hansson	70e99e0b91	Device: Remove overloaded pio_latency parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The PciConfigAll now also uses a Param.Latency rather than a Param.Tick. For backwards compatibility it still sets the pio_latency to 1 tick. All the comments have also been updated to not state that it is in simticks when it is not necessarily the case.	2012-08-21 05:50:03 -04:00
Andreas Hansson	a81c969529	CPU: Remove overloaded function_trace_start parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The inorder CPU is particularly interesting as it uses a different name for the parameter, and never makes any use of it internally.	2012-08-21 05:49:43 -04:00
Andreas Hansson	5803309574	PacketQueue: Allow queuing in the same tick as desired send tick This patch allows packets to be enqueued in the same tick as they are intended to be sent. This does not imply they actually are sent that tick, although that is possible. This change is useful for modules that use the queued ports primarily to avoid handling the flow control involved in sending and retrying packets.	2012-08-21 05:49:24 -04:00
Andreas Hansson	4be1ae3cf8	EventManager: Remove test for NULL pointer in constructor This patch tidies up the EventManager constructor and prunes a corner case where the EventManager would initialise its eventq pointer to NULL. This would cause segmentation faults on actual use and should never happen.	2012-08-21 05:49:18 -04:00
Andreas Hansson	016593f2e9	Clock: Make Tick unsigned and remove UTick This patch makes the Tick unsigned and removes the UTick typedef. The ticks should never be negative, and there was only one major issue with removing it, caused by the o3 CPU using a -1 as an initial value. The patch has no impact on any regressions.	2012-08-21 05:49:09 -04:00
Andreas Hansson	452217817f	Clock: Move the clock and related functions to ClockedObject This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach to clock domains. The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules. The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced).	2012-08-21 05:49:01 -04:00
Nilay Vaish	0160d51483	Ruby Banked Array: add copyrights	2012-08-19 13:05:53 -05:00
Jason Power	44b4c96253	Ruby: Add RubySystem parameter to MemoryControl This guarantees that RubySystem object is created before the MemoryController object is created.	2012-08-16 23:39:36 -05:00
Nilay Vaish	649e377937	Alpha System: override startup(), instead of loadState() Alpha System was overriding loadState() function to setup some functional event. The system tried to read/write to memory before the Ruby memory had unserialized the state. With this patch, Alpha System overrides the startup() function, and sets up functional events in this function. This works because startup() is called after Ruby memory system has unserialized the memory state.	2012-08-16 23:45:21 -05:00
Anthony Gutierrez	0b3897fc90	O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation.	2012-08-15 10:38:08 -04:00
Ali Saidi	dd1b346584	sysemul: bump all linux versions for syscall emulation to 3.0. New tool chains seem to be looking for kernel versions newer than what this was previously set to. Also take this opportunity to change the hostname we report in uname to sim.gem5.org.	2012-08-15 10:38:04 -04:00
Jason Power	11411cc9c7	Ruby: Clean up topology changes This patch moves instantiateTopology into Ruby.py and removes the mem/ruby/network/topologies directory. It also adds some extra inheritance to the topologies to clean up some issues in the existing topologies.	2012-08-10 13:50:42 -05:00
Nilay Vaish	706e84f2b8	System: set kernel to null, if unspecified.	2012-08-08 13:40:32 -05:00
Marc Orr	7cef6b9bef	syscall emulation: Enabled getrlimit and getrusage for x86. Added/moved rlimit constants to base linux header file. This patch is a revised version of Vince Weaver's earlier patch.	2012-08-06 19:52:56 -07:00
Steve Reinhardt	f4b424cd53	SETranslatingPortProxy: fix bug in tryReadString() Off-by-one loop termination meant that we were stuffing the terminating '\0' into the std::string value, which makes for difficult-to-debug string comparison failures.	2012-08-06 16:57:11 -07:00
Steve Reinhardt	73ef8bd168	process: add progName() virtual function This replaces a (potentially uninitialized) string field with a virtual function so that we can have a safe interface without requiring changes to the eio code.	2012-08-06 16:55:34 -07:00
Steve Reinhardt	e232152db6	syscall_emul: clean up open() code a bit.	2012-08-06 16:55:28 -07:00
Steve Reinhardt	b647b48bf4	str: add an overloaded startswith() utility method for various string types and use it in a few places.	2012-08-06 16:52:49 -07:00
Marc Orr	d55115936e	syscall emulation: Clean up ioctl handling, and implement for x86. Enable different whitelists for different OS/arch combinations, since some use the generic Linux definitions only, and others use definitions inherited from earlier Unix flavors on those architectures. Also update x86 function pointers so ioctl is no longer unimplemented on that platform. This patch is a revised version of Vince Weaver's earlier patch.	2012-08-06 16:52:40 -07:00
Jason Power	6721b3e325	Ruby NetDest: add assert for bad element in netdest	2012-08-01 17:07:34 -05:00
Anthony Gutierrez	630068be6f	dma: remove unused variable this patch removes the actionInProgress field from the DmaPort class. this variable is only defined and initiated in the ctor. it is never used.	2012-07-27 16:08:05 -04:00
Anthony Gutierrez	8133f2460f	checker: make checker cpu id match its host's cpu id when using the checker i ran into problems where an instruction reading the cpu id register failed because the ids did not match, and hence, the result of the instruction did not match. this patch ensures that the ids match so this instruction does not fail. this problem only seemed to manifest itself when multiple cores were in the system, either multi-core, or extra switched-out cores present in the system.	2012-07-27 16:08:04 -04:00
Anthony Gutierrez	7bf14aedbf	cache: don't allow dirty data in the i-cache removes the optimization that forwards an exclusive copy to a requester on a read, only for the i-cache. this optimization isn't necessary because we typically won't be writing to the i-cache.	2012-07-27 16:08:04 -04:00
Anthony Gutierrez	2eb6b403c9	ARM: fix value of MISCREG_CTR returned by readMiscReg() According to the A15 TRM the value of this register is as follows (assuming 16 word = 64 byte lines) [31:29] Format - b100 specifies v7 [28] RAZ - b0 [27:24] CWG log2(max writeback size #words) - 0x4 16 words [23:20] ERG log2(max reservation size #words) - 0x4 16 words [19:16] DminLine log2(smallest dcache line #words) - 0x4 16 words [15:14] L1Ip L1 index/tagging policy - b11 specifies PIPT [13:4] RAZ - b0000000000 [3:0] IminLine log2(smallest icache line #words) - 0x4 16 words	2012-07-27 16:08:04 -04:00
Andreas Hansson	66f5124e2b	Bridge: Use EventWrapper instead of Event subclass for sendEvent This class simply cleans up the code by making use of the EventWrapper convenience class to schedule the sendEvent in the bridge ports.	2012-07-23 09:32:19 -04:00
Nilay Vaish	11a551ae3a	X86 CPUID: Return false if unknown processor family	2012-07-22 20:31:23 -05:00
Andreas Hansson	f00cba34eb	Mem: Make SimpleMemory single ported This patch changes the simple memory to have a single slave port rather than a vector port. The simple memory makes no attempts at modelling the contention between multiple ports, and any such multiplexing and demultiplexing could be done in a bus (or crossbar) outside the memory controller. This scenario also matches with the ongoing work on a SimpleDRAM model, which will be a single-ported single-channel controller that can be used in conjunction with a bus (or crossbar) to create a multi-port multi-channel controller. There are only very few regressions that make use of the vector port, and these are all for functional accesses only. To facilitate these cases, memtest and memtest-ruby have been updated to also have a "functional" bus to perform the (de)multiplexing of the functional memory accesses.	2012-07-12 12:56:13 -04:00
Nilay Vaish	b913af440b	Ruby: remove config information from ruby.stats This patch removes printConfig() functions from all structures in Ruby. Most of the information is already part of config.ini, and wherever it is not, it will become so in due course.	2012-07-12 08:39:19 -05:00
Nilay Vaish	ce4e9a9a50	Ruby: remove some unused stuff from SLICC files	2012-07-12 08:39:18 -05:00
Brad Beckmann	8c18f6da9e	x86: added page size in bytes tlb entry function	2012-07-11 12:21:04 -07:00
Brad Beckmann	5931087dcd	ruby: improved DRAM reset comment	2012-07-11 09:44:34 -07:00
Marc Orr	387f843d51	syscall emulation: Add the futex system call.	2012-07-10 22:51:54 -07:00
Brad Beckmann	52540b1b78	x86: logSize and lruSeq are now optional ckpt params	2012-07-10 22:51:54 -07:00
Steve Reinhardt	2e47aaabc0	Add hook to call map() on Process from python. This enables configuration scripts to set up mappings from process virtual addresses to specific physical addresses in SE mode. This feature is needed to support modeling of user-accessible memories or devices in SE mode, avoiding the complexities of FS mode and the need to write a device driver.	2012-07-10 22:51:54 -07:00
Brad Beckmann	645fa9c262	# User Brad Beckmann <Brad.Beckmann@amd.com> ruby: fixed fatal print statement	2012-07-10 22:51:54 -07:00
Brad Beckmann	6f9bd33b73	ruby: remove the cpu assumptions for the random tester	2012-07-10 22:51:54 -07:00
Brad Beckmann	a22918dd41	# User Brad Beckmann <Brad.Beckmann@amd.com> ruby: fixed msgptr print call	2012-07-10 22:51:54 -07:00
Brad Beckmann	884cd6f752	imported patch jason/slicc-external-structure-fix	2012-07-10 22:51:54 -07:00
Brad Beckmann	86d6b788f6	ruby: banked cache array resource model This patch models a cache as separate tag and data arrays. The patch exposes the banked array as another resource that is checked by SLICC before a transition is allowed to execute. This is similar to how TBE entries and slots in output ports are modeled.	2012-07-10 22:51:54 -07:00
Joel Hestness	467093ebf2	ruby: tag and data cache access support Updates to Ruby to support statistics counting of cache accesses. This feature serves multiple purposes beyond simple stats collection. It provides the foundation for ruby to model the cache tag and data arrays as physical resources, as well as provide the necessary input data for McPAT power modeling.	2012-07-10 22:51:54 -07:00
Nuwan Jayasena	c10f348120	ruby: adds reset function to Ruby memory controllers	2012-07-10 22:51:54 -07:00
Nuwan Jayasena	1740c4c448	ruby: memory controllers now inherit from an abstract "MemoryControl" class	2012-07-10 22:51:53 -07:00
Brad Beckmann	4a52a6ea2d	cpu: added assertions to ensure the correct proxies are used	2012-07-10 22:51:53 -07:00
Brad Beckmann	11b725c19d	ruby: changes how Topologies are created Instead of just passing a list of controllers to the makeTopology function in src/mem/ruby/network/topologies/<Topo>.py we pass in a function pointer which knows how to make the topology, possibly with some extra state set in the configs/ruby/<protocol>.py file. Thus, we can move all of the files from network/topologies to configs/topologies. A new class BaseTopology is added which all topologies in configs/topologies must inherit from and follow its API. --HG-- rename : src/mem/ruby/network/topologies/Crossbar.py => configs/topologies/Crossbar.py rename : src/mem/ruby/network/topologies/Mesh.py => configs/topologies/Mesh.py rename : src/mem/ruby/network/topologies/MeshDirCorners.py => configs/topologies/MeshDirCorners.py rename : src/mem/ruby/network/topologies/Pt2Pt.py => configs/topologies/Pt2Pt.py rename : src/mem/ruby/network/topologies/Torus.py => configs/topologies/Torus.py	2012-07-10 22:51:53 -07:00
Andreas Hansson	745274cbd4	EventManager: Rename queue accessor and remove cast operator This patch renames the queue() accessor to the less ambiguous eventQueue, and also removes the cast operator. The queue() member function causes problems in derived classes that declare members with the same name, e.g. a MemObject subclass that has a packet queue of its own. The operator is not causing any harm at this point, but as it is not used there is little point in keeping it.	2012-07-09 12:35:46 -04:00
Andreas Hansson	d2f458e7b5	Mem: Make members relating to range and size constant This patch makes the address-range related members const. The change is trivial and merely ensures that they can be called on a const memory.	2012-07-09 12:35:44 -04:00
Andreas Hansson	67e257f442	Port: Hide the queue implementation in SimpleTimingPort This patch makes the queue implementation in the SimpleTimingPort private to avoid confusion with the protected member queue in the QueuedSlavePort. The SimpleTimingPort provides the queue_impl to the QueuedSlavePort and it can be accessed via the reference in the base class. The use of the member name queue is thus no longer overloaded.	2012-07-09 12:35:42 -04:00
Andreas Hansson	b265d9925c	Port: Align port names in C++ and Python This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus are addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus.	2012-07-09 12:35:39 -04:00
Andreas Hansson	1c2ee987f3	Bus: Make the default bus width 8 bytes instead of 64 This patch changes the default bus width to a more sensible 8 bytes (64 bits), which is in line with most on-chip buses. Although there are cases where a wider or narrower bus is useful, the 8 bytes is a good compromise to serve as the default. This patch changes essentially all statistics, and will be bundled with the outstanding changes to the bus.	2012-07-09 12:35:38 -04:00
Andreas Hansson	8caaac048a	Bus: Split the bus into separate request/response layers This patch splits the existing buses into multiple layers. The non-coherent bus is split into a request and a response layer, and the coherent bus adds an additional layer for the snoop responses. The layer is modified to be templatised on the port type, such that the different layers can have retryLists with either master or slave ports. This patch also removes the dynamic cast from the retry, as previously promised when moving the recvRetry from the port base class to the master/slave port respectively. Overall, the split bus more closely reflects any modern on-chip bus and should be a step in the right direction. From this point, it would be reasonably straightforward to add separate layers (and thus contention points and arbitration) for each port and thus create a true crossbar. The regressions all produce the correct output, but have varying degrees of changes to their statistics. A separate patch will be pushed with the updates to the reference statistics.	2012-07-09 12:35:37 -04:00
Andreas Hansson	995e6e4670	Bus: Add a notion of layers to the buses This patch moves all flow control, arbitration and state information into a bus layer. The layer is thus responsible for all the state transitions, and for keeping hold of the retry list. Consequently the layer is also responsible for the draining. With this change, the non-coherent and coherent bus are given a single layer to avoid changing any temporal behaviour, but the patch opens up for adding more layers.	2012-07-09 12:35:36 -04:00
Andreas Hansson	14f9c77dd3	Bus: Replace tickNextIdle and inRetry with a state variable This patch adds a state enum and member variable in the bus, tracking the bus state, thus eliminating the need for tickNextIdle and inRetry, and fixing an issue that allowed the bus to be occupied by multiple packets at once (hopefully it also makes it easier to understand the code). The bus, in its current form, uses tickNextIdle and inRetry to keep track of the state of the bus. However, it only updates tickNextIdle _after_ forwarding a packet using sendTiming, and the result is that the bus is still seen as idle, and a module that receives the packet and starts transmitting new packets in zero time will still see the bus as idle (and this is done by a number of DMA devices). The issue can also be seen in isOccupied where the bus calls reschedule on an event instead of schedule. This patch addresses the problem by marking the bus as _not_ idle already by the time we conclude that the bus is not occupied and we will deal with the packet. As a result of not allowing multiple packets to occupy the bus, some regressions have slight changes in their statistics. A separate patch updates these accordingly. Further ahead, a follow-on patch will introduce a separate state variable for request/responses/snoop responses, and thus implement a split request/response bus with separate flow control for the different message types (even further ahead it will introduce a multi-layer bus).	2012-07-09 12:35:35 -04:00
Andreas Hansson	46d9adb68c	Port: Make getAddrRanges const This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects.	2012-07-09 12:35:34 -04:00
Andreas Hansson	830391cad9	Port: Add getAddrRanges to master port (asking slave port) This patch adds getAddrRanges to the master port, and thus avoids going through getSlavePort to be able to ask the slave. Similar to the previous patch that added isSnooping to the SlavePort, this patch aims to introduce an additional level of hierarchy in the ports (base port being protocol-agnostic) and getSlave/MasterPort will return port pointers to these base classes. The function is named getAddrRanges also on the master port, but does nothing besides asking the connected slave port. The slave port, as before, has to provide an implementation and actually produce a list of address ranges. The initial design used the name getSlaveAddrRanges for the new function, but the more verbose name was later changed.	2012-07-09 12:35:33 -04:00
Andreas Hansson	49407d76aa	Port: Add isSnooping to slave port (asking master port) This patch adds isSnooping to the slave port, and thus avoids going through getMasterPort to be able to ask the master. Over the course of the next few patches, all getMasterPort/getSlavePort in Port and MemObject are to be protocol agnostic, and the snooping is part of the protocol layer. The function is already present on the master port, where it is implemented by the module itself, e.g. a cache. On the slave side, it is merely asking the connected master port. The same name is used by both functions despite their difference in behaviour. The initial design used isMasterSnooping on the slave port side, but the more verbose function name was later changed.	2012-07-09 12:35:32 -04:00
Andreas Hansson	17f9270dad	Port: Move retry from port base class to Master/SlavePort This patch is the last part of moving all protocol-related functionality out of the Port base class. All the send/recv functions are already moved, and the retry (which still governs all the timing transport functions) is the only part that remained in the base class. The only point where this currently causes a bit of inconvenience is in the bus where the retry list is global and holds Port pointers (not Master/SlavePort). This is about to change with the split into a request/response bus and will soon be removed anyway. The patch has no impact on any regressions.	2012-07-09 12:35:31 -04:00
Andreas Hansson	ff5718f042	Fix: Address a few benign memory leaks This patch is the result of static analysis identifying a number of memory leaks. The leaks are all benign as they are a result of not deallocating memory in the destructor. The fix still has value as it removes false positives in the static analysis.	2012-07-09 12:35:30 -04:00
Andreas Hansson	92eaac0711	gcc: Fix warnings for gcc 4.7 and clang 3.1 This patch fixes two warnings, one related to a narrowing conversion (int to MachInst), and one due to the cast operator for arguments and a mismatch in const-ness (const void* and void*).	2012-07-02 08:21:53 -04:00
Lena Olson	d2ebade5a5	Cache: Fix the LRU policy for classic memory hierarchy The LRU policy always evicted the least recently touched way, even if it contained valid data and another way was invalid, as can happen if a block has been invalidated by coherence. This can result in caches never warming up even though they are replacing blocks. This modifies the LRU policy to move blocks to LRU position on invalidation.	2012-06-29 11:21:58 -04:00
Uri Wiener	fcccab0dcd	Bus: enable non/coherent buses sub-classes This patch merely changes several methods to be virtual in order to enable non/coherent buses sub-classes.	2012-06-29 11:19:08 -04:00
Dam Sunwoo	7cbe0cf564	Mem: fix master id assertion in cache_impl.hh The assertion was applied to the wrong packet. This patch fixes the issue reported by Xiang Jiang on the gem5-dev mailing list.	2012-06-29 11:19:07 -04:00
Matt Evans	579047c76d	Mem: Fix a livelock resulting in LLSC/locked memory access implementation. Currently when multiple CPUs perform a load-linked/store-conditional sequence, the loads all create a list of reservations which is then scanned when the stores occur. A reservation matching the context and address of the store is sought, BUT all reservations matching the address are also erased at this point. The upshot is that a store-conditional will remove all reservations even if the store itself does not succeed. A livelock was observed using 7-8 CPUs where a thread would erase the reservations of other threads, not succeed, loop and put its own reservation in again only to have it blown by another thread that unsuccessfully now tries to store-conditional -- no forward progress was made, hanging the system. The correct way to do this is to only blow a reservation when a store (conditional or not) actually /occurs/ to its address. One thread always wins (the one that does the store-conditional first).	2012-06-29 11:19:05 -04:00
Nathanael Premillieu	af2b14a362	O3: Track if the RAS has been pushed or not to pop the RAS if necessary. Add new flag (named pushedRAS) in the PredictorHistory structure. This flag tracks whether the RAS has been pushed or not during a prediction. Then, in the squash function it is used to pop the RAS if necessary.	2012-06-29 11:18:29 -04:00
Ali Saidi	71daeb0b2b	ARM: Fix identification of one RAS pop instruction. The check should be with the op2 field, not with the op1 field.	2012-06-29 11:18:29 -04:00
Ali Saidi	8d1e56bdcd	Cache: Only invalidate a line in the cache when an uncacheable write is seen.	2012-06-29 11:18:29 -04:00
Ali Saidi	7e3496c78c	ARM: Update version of linux we claim to be to 3.0.0. Static binaries generated with new versions of libc complain that the kernel is too old otherwise.	2012-06-29 11:18:29 -04:00
Ali Saidi	aed8050824	ARM: Fix issue with predicted next pc being wrong because of advance() ordering. npc in PCState for ARM was being calculated before the current flags were updated with the next flags. This causes an issue as the npc is incremented by two or four depending on the current flags (thumb or not) and was leading to branches that were predicted correctly being identified as mispredicted.	2012-06-29 11:18:28 -04:00
Ali Saidi	c51fc5ceff	ARM: Fix address range issue with VExpress EMM	2012-06-27 19:23:02 -04:00
Anthony Gutierrez	9764cde7f2	ARM: implement the ProcessInfo methods	2012-06-11 11:07:41 -04:00
Andreas Hansson	754a9570f2	Timing CPU: Remove a redundant port pointer This patch is trivial and merely prunes a pointer that was never set or used.	2012-06-08 12:45:24 -04:00
Andreas Hansson	a118c01716	Power: Fix MaxMiscDestRegs which was set to zero This patch fixes a failing compilation caused by MaxMiscDestRegs being zero. According to gcc 4.6, the result is a comparison that is always false due to limited range of data type.	2012-06-08 12:44:17 -04:00
Nilay Vaish	d6609793d4	X86 TLB: Add a missing = sign	2012-06-07 17:03:45 -05:00
Ali Saidi	c80cd4136e	mem: Delay deleting of incoming packets by one call. This patch is a temporary fix until Andreas' four-phase patches get reviewed and committed. Removing FastAlloc seems to have exposed an issue which previously was reasonably rare in which packets are freed before the sending cache is done with them. This change puts incoming packets on a pendingDelete queue which are deleted at the start of the next call and thus breaks the dependency between when the caller returns true and when the packet is actually used by the sending cache. Running valgrind on a multi-core linux boot and the memtester results in no valgrind warnings.	2012-06-07 10:59:03 -04:00
Jayneel Gandhi	7183c3fd56	X86 TLB: Fix for gcc 4.4.3 Due to recent changes to X86 TLB, gem5 stopped compiling on gcc version 4.4.3. This patch provides the fix for that problem. The patch is tested on gcc 4.4.3. The change is not required for more recent versions of gcc (like on 4.6.3).	2012-06-07 08:11:00 -05:00
Anthony Gutierrez	d6da3ff317	cpu: Don't init simple and inorder CPUs if they are deferred. initCPU() will be called to initialize switched out CPUs for the simple and inorder CPU models. this patch prevents those CPUs from being initialized because they should get their state from the active CPU when it is switched out.	2012-06-05 14:20:13 -04:00
Ali Saidi	20d25b9da7	ISA: Back-out NoopMachInst as a StaticInstPtr change.	2012-06-05 13:52:30 -04:00
Ali Saidi	c06970b673	cpt: update some comments in the checkpoint migration script	2012-06-05 10:36:59 -04:00
William Wang	e5f0d6016b	stats: when applying an operation to two vectors sum the components first. Previously writing X/Y in a formula would result in: x[0]/y[0] + x[1]/y[1] In reality you want: (x[0] +x[1])/(y[0] + y[1])	2012-06-05 01:23:11 -04:00
Dam Sunwoo	14539ccae1	Mem: add per-master stats to physmem Added per-master stats (similar to cache stats) to physmem.	2012-06-05 01:23:11 -04:00
Geoffrey Blake	eced845a5e	ARM: Add PCIe support to VExpress_EMM model and remove deprecated ELT	2012-06-05 01:23:11 -04:00
Chander Sudanthi	15228694d0	ARM: removed extra white space Extra white space fixes in miscregs.hh	2012-06-05 01:23:10 -04:00
Chander Sudanthi	8a2ca2fd24	ARM: Fix MPIDR and MIDR register implementation. This change allows designating a system as MP capable or not as some bootloaders/kernels care that it's set right. You can have a single processor MP capable system, but you can't have a multi-processor UP only system. This change also fixes the initialization of the MIDR register.	2012-06-05 01:23:10 -04:00
Chander Sudanthi	e60b2ac706	ARM: PS2 encoding fix Fixed Disable encoding and added SetDefaults. See http://wiki.osdev.org/Mouse_Input for encodings.	2012-06-05 01:23:10 -04:00
Ali Saidi	70d7d6cc7f	sim: Provide a framework for detecting out of date checkpoints and migrating them.	2012-06-05 01:23:10 -04:00
Ali Saidi	2e988bbab0	stats: Add stats unittest for total calculations.	2012-06-05 01:23:10 -04:00
Ali Saidi	6df196b71e	O3: Clean up the O3 structures and try to pack them a bit better. DynInst is extremely large; the hope is that this re-organization will put the most used members close to each other.	2012-06-05 01:23:09 -04:00
Ali Saidi	1b370431d0	sim: Remove FastAlloc While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM.	2012-06-05 01:23:08 -04:00
Ali Saidi	d6997777be	ARM: Fix over-eager assert in gic.	2012-06-05 01:23:08 -04:00
Mitchell Hayenga	8294d49bb6	stats: Provide a mechanism to get a callback when stats are dumped. This mechanism is useful for dumping output that is correlated with stats dumping, but isn't tracked by the gem5 statistics.	2012-06-05 01:23:08 -04:00
Ali Saidi	0b0c5621ee	ARM: Fix compilation on ARM after Gabe's change.	2012-06-05 01:23:08 -04:00
Gabe Black	008b17d816	ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs.	2012-06-04 10:57:23 -07:00
Gabe Black	35fa5074aa	X86: Ensure that the CPUID instruction always writes its outputs. The CPUID instruction was implemented so that it would only write its results if the instruction was successful. This works fine on the simple CPU where unwritten registers retain their old values, but on a CPU like O3 with renaming this is broken. The instruction needs to write the old values back into the registers explicitly if they aren't being changed.	2012-06-04 10:43:09 -07:00
Gabe Black	7b73c36f5d	X86: Ensure that the decoder's internal ExtMachInst is completely initialized. There are some bits of some fields of the ExtMachInst which are not actually used for anything but are included in the hash of an ExtMachInst for simplicity and efficiency. This change makes sure the decoder's internal working ExtMachInst is completely initialized, even these unused bits, so that there isn't any nondeterministic behavior, no valgrind messages about uninitialized variables, and no potential false misses/redundant entries in the decode cache.	2012-06-04 10:43:08 -07:00
Andreas Hansson	0d32940711	Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. --HG-- rename : src/mem/bus.cc => src/mem/coherent_bus.cc rename : src/mem/bus.hh => src/mem/coherent_bus.hh rename : src/mem/bus.cc => src/mem/noncoherent_bus.cc rename : src/mem/bus.hh => src/mem/noncoherent_bus.hh	2012-05-31 13:30:04 -04:00
Andreas Hansson	1d520cda80	gcc: Small fixes to compile with gcc 4.7 This patch makes two very minor changes to please gcc 4.7. The CopyData function no longer exists and this has been replaced. For some reason previous versions of gcc did not complain on the const char casting not having an implementation, but this is now addressed.	2012-05-30 05:31:48 -04:00
Andreas Hansson	b8cf48accc	Bus: Remove redundant packet parameter from isOccupied This patch merely removes the Packet* from the isOccupied member function. Historically this was used to check if the packet was an express snoop, but this is now done outside this function (where relevant).	2012-05-30 05:31:11 -04:00
Andreas Hansson	5880fbe96d	Bus: Turn the PortId into a transport function parameter The main aim of this patch is to arrive at a suitable port interface for vector ports, including both the packet and the port id. This patch changes the bus transport functions (recvFunctional/Atomic/Timing) to require a PortId parameter indicating the source port. Previously this information was passed by setting the source field of the packet, and this is only required in the case of a timing request. With this patch, the use of the source and destination field is also more restrictive, as they are only needed for timing accesses. The modifications to these fields for atomic snoops are now removed entirely, also making minor modifications to the cache.	2012-05-30 05:30:24 -04:00
Andreas Hansson	cad802761a	Packet: Unify the use of PortID in packet and port This patch removes the Packet::NodeID typedef and unifies it with the Port::PortId. The src and dest fields in the packet are used to hold a port id (e.g. in the bus), and thus the two should actually be the same. The typedef PortID is now global (in base/types.hh) and aligned with the ThreadID in terms of capitalisation and naming of the InvalidPortID constant. Before this patch, two flags were used for valid destination and source, rather than relying on a named value (InvalidPortID), and this is now redundant, as the src and dest field themselves are sufficient to tell whether the current value is a valid port identifier or not. Consequently, the VALID_SRC and VALID_DST are removed. As part of the cleaning up, a number of int parameters and local variables are updated to use PortID. Note that Ruby still has its own NodeID typedef. Furthermore, the MemObject getMaster/SlavePort still has an int idx parameter with a default value of -1 which should eventually change to PortID idx = InvalidPortID.	2012-05-30 05:29:42 -04:00
Andreas Hansson	6a54f7fc5f	Packet: Updated comments for src and dest fields This patch updates the comments for the src and dest fields to reflect their actual use. Due to a number of patches (e.g. removing the Broadcast flag), the old comments are no longer indicative of the current usage.	2012-05-30 05:29:07 -04:00
Andreas Hansson	3b367db42c	Bridge: Split deferred request, response and sender state This patch splits the PacketBuffer class into a RequestState and a DeferredRequest and DeferredResponse. Only the requests need a SenderState, and the deferred requests and responses only need an associated point in time for the request and the response queue. Besides the cleaning up, the goal is to simplify the transition to a new port handshake, and with these changes, the two packet queues are starting to look very similar to the generic packet queue, but currently they do a few unique things relating to the NACK and counting of requests/responses, so the packet queue cannot be conveniently used. This will be addressed in a later patch.	2012-05-30 05:28:06 -04:00
Gabe Black	d9988ded3c	X86: Use the HandyM5Reg to avoid a register read and some logic in the TLB.	2012-05-28 21:56:23 -07:00
Gabe Black	40084e0c3e	X86: Move the GDT down to where it can be accessed in 32 bit mode. The GDT can be accessed by user level software running in compatibility mode by moving segment selectors into segment registers. The GDT needs to be set up at an address accessible in this mode.	2012-05-27 19:01:08 -07:00
Gabe Black	1d96135087	X86: Truncate addresses to 32 bits except in 64 bit mode, not long mode. A small change was added a while ago to keep addresses from overflowing 32 bits when larger addresses shouldn't be accessible to software. That change truncated when not in long mode, but really it should have truncated when not in 64 bit mode. The difference is whether compatibility mode is included, a mode that's supposed to act like a legacy 32 bit mode.	2012-05-27 19:01:04 -07:00
Gabe Black	19df4e94ee	ISA,CPU: Generalize and split out the components of the decode cache. This will allow it to be specialized by the ISAs. The existing caching scheme is provided by the BasicDecodeCache in the GenericISA namespace and is built from the generalized components. --HG-- rename : src/cpu/decode_cache.cc => src/arch/generic/decode_cache.cc	2012-05-26 13:45:12 -07:00
Gabe Black	0cba96ba6a	CPU: Merge the predecoder and decoder. These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process. --HG-- rename : src/arch/x86/predecoder_tables.cc => src/arch/x86/decoder_tables.cc	2012-05-26 13:44:46 -07:00
Gabe Black	eae1e97fb0	ISA: Make the decode function part of the ISA's decoder.	2012-05-25 00:55:24 -07:00
Gabe Black	276f3e9535	CPU: Simplify the implementation of the decode cache. Also reorganize it to make it more amenable to being rearranged later.	2012-05-25 00:54:39 -07:00
Gabe Black	82a228bd43	Decode: Make the Decoder class defined per ISA. --HG-- rename : src/cpu/decode.cc => src/arch/generic/decoder.cc rename : src/cpu/decode.hh => src/arch/generic/decoder.hh	2012-05-25 00:53:37 -07:00
Andreas Hansson	49da0497d3	Cache: Remove dangling doWriteback declaration This patch removes the declaration of doWriteback as there is no implementation for this member function.	2012-05-24 04:09:19 -04:00
Andreas Hansson	3e0ed08706	Packet: Cleaning up packet command and attribute This patch removes unused commands and attributes from the packet to avoid any confusion. It is part of an effort to clear up how and where different commands and attributes are used.	2012-05-23 09:18:04 -04:00
Andreas Hansson	01906f957a	Config: Use the attribute naming and include ports in JSON This patch changes the organisation of the JSON output slightly to make it easier to traverse and use the files. Most importantly, the hierarchical dictionaries now use keys that correspond to the attribute names also in the case of VectorParams (used to be e.g. "cpu0 cpu1"). It also adds the name and the path to each SimObject directory entry. Before this patch, to get cpu0, you would have to query dict['system']['cpu0 cpu1'][0] and this could be a dict with 'cpu0' : { cpu parameters }. Now you use dict['system']['cpu'][0] and get { cpu parameters } (where one is "name" : "cpu0"). Additionally this patch includes more verbose information about the ports, specifying their role, and using a JSON array rather than a concatenated string for the peer.	2012-05-23 09:16:39 -04:00
Andreas Hansson	d4847fe6ea	DMA: Split the DMA device and IO device into separate files This patch moves the DMA device to its own set of files, splitting it from the IO device. There are no behavioural changes associated with this patch. The patch also grabs the opportunity to do some very minor tidying up, including some white space removal and pruning some redundant parameters. Besides the immediate benefits of the separation-of-concerns, this patch also makes upcoming changes more streamlined as it splits the devices that are only slaves and the DMA device that also acts as a master. --HG-- rename : src/dev/io_device.cc => src/dev/dma_device.cc rename : src/dev/io_device.hh => src/dev/dma_device.hh	2012-05-23 09:15:45 -04:00
Andreas Hansson	5b36cf623c	MEM: Add a snooping DMA port subclass for table walker This patch makes the (device) DmaPort non-snooping and removes the recvSnoop constructor parameter and instead introduces a SnoopingDmaPort subclass for the ARM table walker. Functionality is unchanged, as are the stats, and the patch merely clarifies that the normal DMA ports are not snooping (although they may issue requests that are snooped by others, as done with PCI, PCIe, AMBA4 ACE etc). Currently this port is declared in the ARM table walker as it is not used anywhere else. If other ports were to have similar behaviour it could be moved in a future patch.	2012-05-23 09:14:12 -04:00
Andreas Hansson	31b4ac5cec	Config: Exit with fatal if a port is already connected This patch turns the existing warning into a fatal, as there should never be any cases where a (non-vector) port is assigned to and then later connected to something else. If this behaviour is allowed, as it used to be, there are cases where the wrong number of C++ ports are created when instantiating objects with VectorPorts (obviously that could be fixed, but the better approach is to simply not allow it).	2012-05-23 09:01:56 -04:00
Nilay Vaish	1031fe7b6f	Ruby: Remove the unused src/mem/ruby/common/Driver.* files.	2012-05-22 11:35:58 -05:00
Nilay Vaish	6a966d5eeb	Ruby Sequencer: Schedule deadlock check event at correct time The scheduling of the deadlock check event was being done incorrectly as the clock was not being multiplied, so as to convert the time into ticks. This patch removes that bug.	2012-05-22 11:32:57 -05:00
Nilay Vaish	4d4d212ae9	X86: Split Condition Code register This patch moves the ECF and EZF bits to individual registers (ecfBit and ezfBit) and the CF and OF bits to cfofFlag registers. This is being done so as to lower the read after write dependencies on the condition code register. Ultimately we will have the following registers [ZAPS], [OF], [CF], [ECF], [EZF] and [DF]. Note that this is only one part of the solution for lowering the dependencies. The other part will check whether or not the condition code register needs to be actually read. This would be done through a separate patch.	2012-05-22 11:29:53 -05:00
Marc Orr	16a559c9c6	x86 ISA: Implement the sse3 haddps instruction. Shuffle the 32 bit values into position, and then add in parallel.	2012-05-19 04:32:25 -07:00
Gabe Black	250c40799d	Syscalls: warn when the length argument to mmap is excessive. If the length argument to mmap is larger than the arbitrary but reasonable limit of 4GB, there's a good chance that the value is nonsense and not intentional. Rather than attempting to satisfy the mmap anyway, this change makes gem5 warn to make it more apparent what's going wrong.	2012-05-19 04:13:47 -07:00
Lena Olson	8fe8efeb34	Mem: Fix size check when allocating physical memory	2012-05-14 20:31:33 -05:00
Koan-Sin Tan	0b2d5e20d1	ARM: fix the calculation of the values in the RV clocks This clock is used by the linux scheduler.	2012-05-10 18:04:28 -05:00
Ali Saidi	331696582f	stats: fix compilation of unit test.	2012-05-10 18:04:28 -05:00
Ali Saidi	ec50c78f83	stats: fix bug in assert for 2d vector	2012-05-10 18:04:28 -05:00
Chander Sudanthi	1965a89873	ARM: pl011 raw interrupt fix Raw interrupt was not being set when interrupt was disabled. This patch sets the raw interrupt regardless of the mask.	2012-05-10 18:04:28 -05:00
Chander Sudanthi	200689c53f	ARM: EMM board address range fix 0x40000000 is reserved for external AXI addresses. This address range is not used currently. Removed the range from the bridge.	2012-05-10 18:04:28 -05:00
Uri Wiener	29a5e6ff35	DOT: improved dot-based system visualization Revised system visualization to reflect structure and memory hierarchy. Improved visualization: less congested and cluttered; more colorful. Nodes reflect components; directed edges reflect directional relation, from a master port to a slave port. Requires pydot.	2012-05-10 18:04:27 -05:00
Uri Wiener	cb1b63ea61	DOT: fixed broken code for visualizing configuration using dot Fixed broken code which visualizes the system configuration by generating a tree from each component's children, starting from root. Requires DOT (hence pydot).	2012-05-10 18:04:27 -05:00
Dam Sunwoo	f2f7fa1a1c	ARM: guard masked symbol tables by default Symbol tables masked with the loadAddrMask create redundant entries that could conflict with kernel function events that rely on the original addresses. This patch guards the creation of those masked symbol tables by default, with an option to enable them when needed (for early-stage kernel debugging, etc.)	2012-05-10 18:04:27 -05:00
Ali Saidi	041b932428	mem: fix bug with CopyStringOut and null string termination.	2012-05-10 18:04:27 -05:00
Ali Saidi	c02dc07424	Cache: restructure code that actually isn't a loop	2012-05-10 18:04:27 -05:00
Ali Saidi	e029941bda	dev: use correct delete operation in SimpleDisk	2012-05-10 18:04:27 -05:00
Ali Saidi	d9b484b41a	ARM: Fix incorrect use of not operators in arm devices	2012-05-10 18:04:27 -05:00
Ali Saidi	5745665509	gem5: assert before indexing into arrays to verify bounds	2012-05-10 18:04:27 -05:00
Ali Saidi	4f66bcdd2e	gem5: fix some iterator use and erase bugs	2012-05-10 18:04:27 -05:00
Ali Saidi	5ecaf30219	gem5: fix a number of use after free issues	2012-05-10 18:04:27 -05:00
Ali Saidi	da10fbf5ca	base: fix an invalid ?: operator	2012-05-10 18:04:27 -05:00
Ali Saidi	8cee4dacc8	gem5: Fix a number of incorrect case statements	2012-05-10 18:04:26 -05:00
Ali Saidi	413ba1fdaf	stats: track if the stats have been enabled and prevent requesting master id Track the point in the initialization where statistics have been registered. After this point registering new masterIds can no longer work as some SimObjects may have sized stats vectors based on the previous value. If someone tries to register a masterId after this point the simulator executes fatal().	2012-05-10 18:04:26 -05:00
Ali Saidi	f6895e8bd4	Cache: Panic if you attempt to create a checkpoint with a cache in the system	2012-05-10 18:04:26 -05:00
Pritha Ghoshal	dc456d8166	IGbE: Fix writeback conditions for i8254x GbE in updated data sheet. An older revision of the data sheet specified that if txdctl.gran was 1 the granularity was based on cache block, and gran being 0 was based on descriptor count. The newer version of the data sheet reverses this.	2012-05-10 18:04:26 -05:00
Nathan Binkert	55411f7f71	stats: use nan instead of no_value	2012-05-09 11:51:42 -07:00
Andreas Hansson	ab23e29487	MEM: Add the communication monitor This patch adds a communication monitor MemObject that can be inserted between a master and slave port to provide a range of statistics about the communication passing through it. The communication monitor is non-invasive and does not change any properties or timing of the packets, with the exception of adding a sender state to be able to track latency. The statistics are only collected in timing mode (not atomic) to avoid slowing down any fast forwarding. Examples of the statistics captured by the monitor are: read/write burst lengths, bandwidth, request-response latency, outstanding transactions, inter transaction time, transaction count, and address distribution. The monitor can be used in combination with periodic resetting and dumping of stats (through schedStatEvent) to study the behaviour over time. In future patches, a selection of convenience scripts will be added to aid in visualising the statistics collected by the monitor.	2012-05-09 04:37:45 -04:00
Andreas Hansson	692351ea34	MEM: Do not forward uncacheable to bus snoopers This patch adds a guarding if-statement to avoid forwarding uncacheable requests (or rather their corresponding request packets) to bus snoopers. These packets should never have any effect on the caches, and thus there is no need to forward them to the snoopers.	2012-05-08 05:15:52 -04:00
Andreas Hansson	15e28c5ba6	Ruby: Ensure snoop requests are sent using sendTimingSnoopReq This patch fixes a bug that caused snoop requests to be placed in a packet queue. Instead, the packet is now sent immediately using sendTimingSnoopReq, thus bypassing the packet queue and any normal responses waiting to be sent.	2012-05-04 03:30:02 -04:00
Andreas Hansson	3fea59e162	MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the appropriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not allow snoop requests to be rejected or stalled. All these assumptions are now explicitly part of the port interface itself.	2012-05-01 13:40:42 -04:00
Gabe Black	2c85cf41a2	X86: Fix the IMUL_R_P_I macroop. The disp displacement was left off the load microop so the wrong value was used.	2012-04-29 02:26:34 -07:00
Vince Weaver	03a91b0533	X86: Fix up the open system call's flags.	2012-04-29 00:31:03 -07:00
Vince Weaver	38799e2b3f	X86: Make gem5 ignore a bunch of syscalls.	2012-04-29 00:30:56 -07:00
Nilay Vaish	04a558bb41	Garnet: Correct computation of link utilization The computation for link utilization was incorrect for the flexible network. The utilization was being divided twice by the total time.	2012-04-28 16:57:31 -05:00
Nilay Vaish	c3dad222e3	Ruby: Remove extra statements from Sequencer	2012-04-25 17:52:03 -05:00
Andreas Hansson	beed20d7bc	MEM: Use base class Master/SlavePort pointers in the bus This patch makes some rather trivial simplifications to the bus in that it changes the use of BusMasterPort and BusSlavePort pointers to simply use MasterPort and SlavePort (iterators are also updated accordingly). This change is a step towards a future patch that introduces a separation of the interface and the structural port itself.	2012-04-25 10:45:23 -04:00
Andreas Hansson	4c92708b48	MEM: Add the PortId type and a corresponding id field to Port This patch introduces the PortId type, moves the definition of INVALID_PORT_ID to the Port class, and also gives every port an id to reflect the fact that each element in a vector port has an identifier/index. Previously the bus and Ruby testers (and potentially other users of the vector ports) added the id field in their port subclasses, and now this functionality is always present as it is moved to the base class.	2012-04-25 10:41:23 -04:00
Andreas Hansson	79750fc575	clang/gcc: Use STL hash function for int64_t and uint64_t This patch changes the guards for the definition of hash functions to also exclude the int64_t and uint64_t hash functions in the case we are using the c++0x STL <unordered_map> (and <hash>) or the TR1 version of the same header. Previously the guard only covered the hash function for strings, but it seems there is also no need to define a hash for the 64-bit integer types, and this has caused problems with builds on 32-bit Ubuntu.	2012-04-25 08:57:18 -04:00
Gabe Black	64bf90dca3	X86: Clear out duplicate TLB entries when adding a new one. It's possible for two page table walks to overlap which will go in the same place in the TLB's trie. They would land on top of each other, so this change adds some code which detects if an address already matches an entry and if so throws away the new one.	2012-04-24 00:48:41 -07:00
Gabe Black	74ca8a3cd0	ISA: Put parser generated files in a "generated" directory. This is to avoid collision with non-generated files.	2012-04-23 12:00:41 -07:00
Gabe Black	80c6cdae18	base: Include cassert in trie.hh. trie.hh uses assert, but it wasn't explicitly including cassert.	2012-04-22 05:20:44 -07:00
Gabe Black	29329e61b7	X86: Report an error if there's no kernel object, don't blindly use it. This way the user gets a nice message instead of a less nice segfault.	2012-04-21 15:00:23 -07:00
Gabe Black	a5187f9d96	CPU: Tidy up some formatting and a DPRINTF in the simple CPU base class. Put the { on the same line as the if and put a space between the if and the open paren. Also, use the # format modifier which puts a 0x in front of hex values automatically. If the ExtMachInst type isn't integral and actually prints something more complicated, the # falls away harmlessly and we aren't left with a phantom 0x followed by a bunch of unrelated text.	2012-04-15 12:35:49 -07:00
Gabe Black	8fe112d61b	X86: Fix a tiny typo in the load/store microop constructor. The parameter is _machInst, which is very similar to the member machInst. If machInst is used to pass the parameter to a lower level constructor, what really happens is that machInst is set to whatever it already happened to be, effectively leaving it uninitialized.	2012-04-15 01:07:39 -07:00
Gabe Black	aacb676220	X86: Use the AddrTrie class to implement the TLB. This change also adjusts the TlbEntry class so that it stores the number of address bits wide a page is rather than its size in bytes. In other words, instead of storing 4K for a 4K page, it stores 12. 12 is easy to turn into 4K, but it's a little harder going the other way.	2012-04-14 23:24:18 -07:00
Gabe Black	d6031d72df	sim: Update some comments in trie.hh that were meant to go in the last change.	2012-04-14 23:22:57 -07:00
Gabe Black	c4c27ded42	sim: A trie data structure specifically to speed up paging lookups. This change adds a trie data structure which stores an arbitrary pointer type based on an address and a number of relevant bits. Then lookups can be done against the trie where the tree is traversed and the first legitimate match found is returned.	2012-04-14 23:19:34 -07:00
Andreas Hansson	14edc6013d	Ruby: Use MasterPort base-class pointers where possible This patch simplifies future patches by changing the pointer type used in a number of the Ruby testers to use MasterPort instead of using a derived CpuPort class. There is no reason for using the more specialised pointers, and there is no longer a need to do any casting. With the latest changes to the tester, organising ports as readers and writers, things got a bit more complicated, and the "type" now had to be removed to be able to fall back to using MasterPort rather than CpuPort.	2012-04-14 05:46:59 -04:00
Andreas Hansson	750f33a901	MEM: Remove the Broadcast destination from the packet This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple masters (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.	2012-04-14 05:45:55 -04:00
Andreas Hansson	dccca0d3a9	MEM: Separate snoops and normal memory requests/responses This patch introduces port access methods that separate snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal requests and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.	2012-04-14 05:45:07 -04:00
Andreas Hansson	b9bc530ad2	Regression: Add ANSI colours to highlight test status This patch adds a very basic pretty-printing of the test status (passed or failed) to highlight failing tests even more: green for passed, and red for failed. The printing only uses ANSI if the target output is a tty and supports ANSI colours. Hence, any regression scripts that are outputting to files or sending e-mails etc should still be fine.	2012-04-14 05:44:27 -04:00
Andreas Hansson	b6aa6d55eb	clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6 This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang have to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning. The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated: 1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed. 2) another issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128. 3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11. As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb".	2012-04-14 05:43:31 -04:00
Steve Reinhardt	29482e90ba	SCons: restore Werror option in src/SConscript Partial backout of cset 8b223e308b08. Although it's great that there's currently no need for Werror=false in the current tree, some of us have uncommitted code that still needs this option.	2012-04-13 08:13:04 -07:00
Andreas Hansson	c9634d9b38	Ruby: Ensure order-dependent iteration uses an ordered map This patch fixes a bug in Ruby that caused non-deterministic simulation when changing the underlying hash map implementation. The reason is order-dependent behaviour in combination with iteration over the hash map contents. The two locations where a sorted container is assumed are now changed to make use of a std::map instead of the unordered hash map. With this change, the stats change slightly and the follow-on changeset will update the relevant statistics.	2012-04-12 08:35:49 -04:00
Gabe Black	15ca4f2fc7	tests: Fix building unit tests. Unit tests shouldn't build in gem5's main function because they have their own.	2012-04-09 23:20:30 -07:00
Brad Beckmann	3fd425124c	rubytest: remove spurious printf	2012-04-06 17:51:47 -07:00
Lisa Hsu	a5287efc58	slicc: Controllers attached to Sequencers no longer have to be named L1Cache.	2012-04-06 13:47:08 -07:00
Brad Beckmann	5dfa4cd3f5	sim-ruby: checkpointing fixes and dependent eventq improvements Fixes checkpointing with respect to lost events after swapping event queues. Also adds DPRINTFs to better understand what's going on when Ruby serializes and unserializes.	2012-04-06 13:47:07 -07:00
Brad Beckmann	70682e36dd	slicc: fixed error message when the type has no inheritance	2012-04-06 13:47:07 -07:00
Brad Beckmann	5838ed7290	MOESI_hammer: tbe allocation and dependent wakeup fixes	2012-04-06 13:47:07 -07:00
Brad Beckmann	f12961bf25	python: added __nonzero__ function to SimObject Bool params	2012-04-06 13:47:07 -07:00
Brad Beckmann	f050ebe3a8	MOESI_hammer: fixed bug with single cpu + flushes, then modified the regression tester to check this functionality	2012-04-06 13:47:06 -07:00
Brad Beckmann	0a9f4b950f	rubytest: separated read and write ports. This patch allows the ruby tester to support protocols where the i-cache and d-cache are managed by separate controllers.	2012-04-06 13:47:06 -07:00
Andreas Hansson	b00949d88b	MEM: Enable multiple distributed generalized memories This patch removes the assumption of having a single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviour of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. --HG-- rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py rename : src/mem/physical.cc => src/mem/abstract_mem.cc rename : src/mem/physical.hh => src/mem/abstract_mem.hh rename : src/mem/physical.cc => src/mem/simple_mem.cc rename : src/mem/physical.hh => src/mem/simple_mem.hh	2012-04-06 13:46:31 -04:00
Tushar Krishna	dbe1608fd5	NetworkTest: remove unnecessary memory allocation	2012-04-05 17:51:26 -04:00
Nilay Vaish	4f4a710457	Config: corrects the way Ruby attaches to the DMA ports With recent changes to the memory system, a port cannot be assigned a peer port twice. While making use of the Ruby memory system in FS mode, DMA ports were assigned peer twice, once for the classic memory system and once for the Ruby memory system. This patch removes this double assignment of peer ports.	2012-04-05 11:09:19 -05:00
Andreas Hansson	aab2001ab7	Python: Make the All proxy traverse SimObject children as well This patch changes the behaviour of the All proxy parameter to not only consider the direct children, but also do a pre-order depth-first traversal of the object tree and append all results from the children. This is used in a later patch to find all the memories in the system, independent of where they are located in the hierarchy.	2012-04-05 10:44:35 -04:00
Andreas Hansson	a8e6adb0b1	Atomic: Remove the physmem_port and access memory directly This patch removes the physmem_port from the Atomic CPU and instead uses the system pointer to access the physmem when using the fastmem option. The system already keeps track of the physmem and the valid memory address ranges, and with this patch we merely make use of that existing functionality. As a result of this change, the overloaded getMasterPort in the Atomic CPU can be removed, thus unifying the CPUs.	2012-04-03 03:50:14 -04:00
Gabe Black	a7859f7e45	X86: Fix address size handling so real mode works properly. Virtual (pre-segmentation) addresses are truncated based on address size, and any non-64 bit linear address is truncated to 32 bits. This means that real mode addresses aren't truncated down to 16 bits after their segment bases are added in.	2012-03-31 12:27:33 -07:00
Andreas Hansson	74043c4f5c	MEM: Remove legacy DRAM in preparation for memory updates This patch removes the DRAM memory class in preparation for updates to the memory system, with the first one introducing an abstract memory class, and removing the assumption of a single physical memory.	2012-03-30 12:57:48 -04:00
Andreas Hansson	a128ba7cd1	Ruby: Remove the physMemPort and instead access memory directly This patch removes the physMemPort from the RubySequencer and instead uses the system pointer to access the physmem. The system already keeps track of the physmem and the valid memory address ranges, and with this patch we merely make use of that existing functionality. The memory is modified so that it is possible to call the access functions (atomic and functional) without going through the port, and the memory is allowed to be unconnected, i.e. have no ports (since Ruby does not attach it like the conventional memory system).	2012-03-30 09:42:36 -04:00
William Wang	f9d403a7b9	MEM: Introduce the master/slave port sub-classes in C++ This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enforce compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specialisations are to come in later patches. The getPort function is now getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.	2012-03-30 09:40:11 -04:00
Andreas Hansson	a14013af3a	CPU: Unify initMemProxies across CPUs and simulation modes This patch unifies where initMemProxies is called, in the init() method of each BaseCPU subclass, before TheISA::initCPU is called. Moreover, it also ensures that initMemProxies is called in both full-system and syscall-emulation mode, thus unifying also across the modes. An additional check is added in the ThreadState to ensure that initMemProxies is only called once.	2012-03-30 09:38:35 -04:00
Andreas Hansson	9d7c715c46	range_map: Enable const find and iteration This patch adds const access functions to the range_map to enable its use in a const context, similar to the STL container classes.	2012-03-26 05:37:00 -04:00
Andreas Hansson	312efd742e	Power: Change bitfield name to avoid conflicts with range_map This patch changes the name of a bitfield from W to W_FIELD to avoid clashes with W being used as a class (typename) in the templatized range_map. It also changes L to L_FIELD to avoid future problems. The problem manifests itself when the CPU includes a header that in turn includes range_map.hh. The relevant parts of the decoder are updated.	2012-03-26 05:35:24 -04:00
Andreas Hansson	ca9790a2db	Ruby: Fix Set::print for 32-bit hosts This patch fixes a compilation error caused by a length mismatch on 32-bit hosts. The ifdef and sprintf is replaced by a csprintf.	2012-03-23 06:54:25 -04:00
Andreas Hansson	9727b1be18	MEM: Unify bus access methods and prepare for master/slave split This patch unifies the recvFunctional, recvAtomic and recvTiming to all be based on a similar structure: 1) extract information about the incoming packet, 2) send it out to the appropriate snoopers, 3) determine where it is going, and 4) forward it to the right destination. The naming of variables across the different access functions is now consistent as well. Additionally, the patch introduces the member functions releaseBus and retryWaiting to better distinguish between the two cases when we should tell a sender to retry. The first case is when the bus goes from busy to idle, and the second case is when it receives a retry from a destination that did not immediately accept a packet. As a very minor change, the MMU debug flag is no longer used in the bus.	2012-03-22 06:37:21 -04:00
Andreas Hansson	c2d2ea99e3	MEM: Split SimpleTimingPort into PacketQueue and ports This patch decouples the queueing and the port interactions to simplify the introduction of the master and slave ports. By separating the queueing functionality from the port itself, it becomes much easier to distinguish between master and slave ports, and still retain the queueing ability for both (without code duplication). As part of the split into a PacketQueue and a port, there is now also a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The QueuedPort is useful for ports that want to leave the packet transmission of outgoing packets to the queue and is used by both master and slave ports. The SimpleTimingPort inherits from the QueuedPort and adds the implementation of recvTiming and recvFunctional through recvAtomic. The PioPort and MessagePort are cleaned up as part of the changes. --HG-- rename : src/mem/tport.cc => src/mem/packet_queue.cc rename : src/mem/tport.hh => src/mem/packet_queue.hh	2012-03-22 06:36:27 -04:00
Andreas Hansson	fb395b56dd	Scons: Remove Werror=False in SConscript files This patch removes the overriding of "-Werror" in a handful of cases. The code compiles with gcc 4.6.3 and clang 3.0 without any warnings, and thus without any errors. There are no functional changes introduced by this patch. In the future, rather than bypassing "-Werror", address the warnings.	2012-03-22 06:34:50 -04:00
Andreas Hansson	12742835bc	Python: Fix a conditional expression that requires Python 2.5 This patch changes a conditional expression to a conventional if/else block, which does not require Python >= 2.5.	2012-03-21 19:02:03 -04:00
Nathanael Premillieu	8e2a8fbb7e	ARM: Fix case where cond/uncond control is mis-specified	2012-03-21 10:34:06 -05:00
Ali Saidi	ed8ed6e761	ARM: Clean up condCodes in IT blocks.	2012-03-21 10:34:06 -05:00
Geoffrey Blake	a64319f764	ARM: IT doesn't need to be serializing.	2012-03-21 10:34:06 -05:00
Andrew Lukefahr	b4e5be717d	O3: Fix sizing of decode to rename skid buffer.	2012-03-21 10:34:06 -05:00
Koan-Sin Tan	0376422c0b	ARM: Add RTC to PBX System	2012-03-21 10:34:05 -05:00
Brian Grayson	565c1de4a8	O3: Fix size of skid buffer between fetch and decode when widths are different	2012-03-21 10:34:05 -05:00
Ali Saidi	1981ba21ca	ARM: Fix uninitialized value in ARM RTC model.	2012-03-21 10:34:05 -05:00
Tushar Krishna	c9e4bca8d8	Garnet: Stats at vnet granularity + code cleanup This patch (1) Moves redundant code from fixed and flexible networks to BaseGarnetNetwork. (2) Prints network stats at vnet granularity.	2012-03-19 17:34:17 -04:00
Andreas Hansson	72538294fb	gcc: Clean-up of non-C++0x compliant code, first steps This patch cleans up a number of minor issues aiming to get closer to compliance with the C++0x standard as interpreted by gcc and clang (compile with std=c++0x and -pedantic-errors). In particular, the patch cleans up enums where the last item was succeeded by a comma, namespaces closed by a curly brace followed by a semi-colon, and the use of the GNU-extension typeof (replaced by templated functions). It does not address variable-length arrays, zero-size arrays, anonymous structs, range expressions in switch statements, and the use of long long. The generated CPU code also has a large number of issues that remain to be fixed, mainly related to overflows in implicit constant conversion (due to shifts).	2012-03-19 06:36:09 -04:00
Andreas Hansson	adb8621031	clang: Fix recently introduced clang compilation errors This patch makes the code compile with clang 2.9 and 3.0 again by making two very minor changes. First, it maintains a strict typing in the forward declaration of the BaseCPUParams. Second, it adds a FullSystemInt flag of the type unsigned int next to the boolean FullSystem flag. The FullSystemInt variable can be used in decode-statements (expands to switch statements) in the instruction decoder.	2012-03-19 06:35:04 -04:00
Andreas Hansson	a444a6f8d6	scripts: Fix to ensure that port connection count is always set This patch ensures that the port connection count is set to zero in those cases when the port is not connected.	2012-03-19 06:34:02 -04:00
Brian Grayson	98185658c5	O3: Add fatal when fetchWidth > Impl::MaxWidth.	2012-03-11 10:20:54 -04:00
Brian Grayson	9a9a4a0780	ARM: Fix branch prediction issue with CB(N)Z instruction	2012-03-09 15:32:41 -05:00
Geoffrey Blake	69d229ce28	O3/Ozone: Eliminate dead code counting software prefetch insts Eliminates dead code in the O3 and Ozone CPU models that counted software prefetch instructions separately for the ALPHA ISA only.	2012-03-09 09:59:28 -05:00
Geoffrey Blake	98cf57fb89	CheckerCPU: Add function stubs to non-ARM ISA source to compile with CheckerCPU Making the CheckerCPU a runtime option requires the code to be compatible with ISAs other than ARM. This patch adds the appropriate function stubs to allow compilation.	2012-03-09 09:59:28 -05:00
Geoffrey Blake	043709fdfa	CheckerCPU: Make CheckerCPU runtime selectable instead of compile selectable Enables the CheckerCPU to be selected at runtime with the --checker option from the configs/example/fs.py and configs/example/se.py configuration files. Also merges with the SE/FS changes.	2012-03-09 09:59:27 -05:00
Ali Saidi	df05ffab12	ARM: Don't reset CPUs that are going to be switched in.	2012-03-09 09:59:26 -05:00
Ali Saidi	3ce2d0fad0	System: Move code in initState() back into constructor whenever possible. The change to port proxies recently moved code out of the constructor into initState(). This is needed for code that loads data into memory; however, for code that sets up symbol tables, kernel-based events, etc., this is the wrong thing to do, as that code is only called when a checkpoint isn't being restored from.	2012-03-09 09:59:26 -05:00
Ali Saidi	ec1ef24895	ARM: Fix valgrind reported error on O3 that was causing minor stats changes.	2012-03-09 09:59:26 -05:00
Ali Saidi	eaa994e7f6	cache: Allow main memory to be at disjoint address ranges.	2012-03-09 09:59:25 -05:00
Marc Orr	eb43883bef	build scripts: Made minor modifications to reduce build overhead time. 1. --implicit-cache behavior is default. 2. makeEnv in src/SConscript is conditionally called. 3. decider set to MD5-timestamp 4. NO_HTML build option changed to SLICC_HTML (defaults to False)	2012-03-06 19:07:41 -08:00
Steve Reinhardt	fd2d5ae2af	DynInst: get rid of dead MyHash code. Not sure what this was ever used for, but it doesn't seem used anymore.	2012-03-02 09:17:42 -08:00
Andreas Hansson	32eae8094d	CPU: Check that the interrupt controller is created when needed This patch adds a creation-time check to the CPU to ensure that the interrupt controller is created for the cases where it is needed, i.e. if the CPU is not being switched in later and not a checker CPU. The patch also adds the "createInterruptController" call to a number of the regression scripts.	2012-03-02 09:21:48 -05:00
Andreas Hansson	adc419a13a	Ruby: Rename RubyPort::sendTiming to avoid overriding base class This patch renames the sendTiming member function in the RubyPort to avoid inadvertently hiding Port::sendTiming (discovered through some rather painful debugging). The RubyPort does, in fact, rely on the functionality of the queued port and the implementation merely schedules a send the next cycle. The new name for the member function is sendNextCycle to better reflect this behaviour. In the unlikely event that we ever shift to using C++11 the member functions in Port should have a "final" identifier to prevent any overriding in derived classes.	2012-03-02 09:16:50 -05:00
Ali Saidi	b129d7ce00	ARM: Fix a bug preventing multiple cores booting a VExpress_EMM machine. New kernel code verifies that multi-processor extensions are available before booting secondary CPUs.	2012-03-02 08:18:19 -06:00
Ali Saidi	96e37eb17c	ARM: Fix missing cf controller connection.	2012-03-01 22:43:23 -06:00
Chander Sudanthi	357fb0a185	VNC: spacing Fixed some spacing in a switch statement	2012-03-01 17:26:36 -06:00
Ali Saidi	91b737ed48	ARM: Add support for Versatile Express extended memory map Also clean up how we create boot loader memory a bit.	2012-03-01 17:26:31 -06:00
Ali Saidi	3876105bdb	ARM: Add RTC device for ARM platforms. This change implements a PL031 real time clock. --HG-- rename : src/dev/arm/timer_sp804.cc => src/dev/arm/rtc_pl031.cc rename : src/dev/arm/timer_sp804.hh => src/dev/arm/rtc_pl031.hh	2012-03-01 17:26:31 -06:00
Matt Horsnell	08187e3916	ARM: Add limited CP14 support. New kernels attempt to read CP14 to determine what debug architecture is available. These changes add the debug registers and return that none is currently available.	2012-03-01 17:26:31 -06:00
Ali Saidi	d907d0ec72	Cache: Fix an issue with LRU when bonus block is used to complete transaction. The block is never inserted because it's the one extra block in the cache, but it can be invalidated twice in a row. In that case the block doesn't have a new master id (because it was never inserted), however it is valid and the accounting goes wrong at that point.	2012-03-01 17:26:31 -06:00
Dam Sunwoo	86d1042d9f	ARM: move kernel func event to correct location. With the recent series of patches, the symbol table loading moved from "construct" time to "init" time, but the kernel function event callback registration was left behind. This patch moves it to the proper location.	2012-03-01 17:26:31 -06:00
Giacomo Gabrielli	d51478db4e	ARM: fix bits-to-fp conversion function declarations. Add extra declarations to allow the compiler to pick up the right function. Please note that these declarations have been added as part of the clang-related changes.	2012-03-01 17:26:30 -06:00
Nilay Vaish	4b32c9fb4d	x86: Fix x86 TLB and Walker This patch adds a function to X86 tlb that returns the walker port. This port is required for correctly connecting the walker ports for the cpu just switched in	2012-03-01 11:37:03 -06:00
Nilay Vaish	c80af04d7d	x86: Fix switching of CPUs This patch prevents creation of interrupt controller for cpus that will be switched in later	2012-03-01 11:37:02 -06:00
Andreas Hansson	e5ac647fc9	MEM: Make all the port proxy members const This is a trivial patch that merely makes all the member functions of the port proxies const. There is no good reason why they should not be, and this change only serves to make it explicit that they are not modified through their use.	2012-02-29 04:47:51 -05:00
Andreas Hansson	88abdc0fad	SWIG: Ensure ptrdiff_t is a known type in gcc >= 4.6.1 This patch fixes a compilation error that occurs with gcc >= 4.6.1, caused by swig not including cstddef and not using the std:: namespace prefix for ptrdiff_t. There is an old patch, http://reviews.m5sim.org/r/913/ that no longer applies cleanly and this might be re-iterating the same issue. We work around the problem by always enforcing the inclusion of cstddef in all swig interface declarations, and also by explicitly using std::ptrdiff_t.	2012-02-29 04:26:58 -05:00
Gabe Black	559b43a372	X86: Use the M5PanicFault fault in execute methods instead of calling panic. If an instruction is executed speculatively and hits a situation where it wants to panic, it should return a fault instead. If the instruction was misspeculated, the fault can be thrown away. If the instruction wasn't misspeculated, the fault will be invoked and the panic will still happen.	2012-02-26 15:32:53 -08:00
Andreas Hansson	0cd0a8fdd3	MEM: Simplify cache ports preparing for master/slave split This patch splits the two cache ports into a master (memory-side) and slave (cpu-side) subclass of port with slightly different functionality. For example, it is only the CPU-side port that blocks incoming requests, and only the memory-side port that schedules send events outside of what the transmit list dictates. This patch simplifies the two classes by relying further on SimpleTimingPort and also generalises the latter to better accommodate the changes (introducing trySendTiming and scheduleSend). The memory-side cache port overrides sendDeferredPacket to be able to not only send responses from the transmit list, but also send requests based on the MSHRs. A follow on patch further simplifies the SimpleTimingPort and the cache ports.	2012-02-24 11:52:49 -05:00
Andreas Hansson	77878d0a87	MEM: Prepare mport for master/slave split This patch simplifies the mport in preparation for a split into a master and slave role for the message ports. In particular, sendMessageAtomic was only used in a single location and similarly so sendMessageTiming. The affected interrupt device is updated accordingly.	2012-02-24 11:50:15 -05:00
Andreas Hansson	86c2aad482	Ruby: Simplify tester ports by not using SimpleTimingPort This patch simplifies the master ports used by RubyDirectedTester and RubyTester by avoiding the use of SimpleTimingPort. Neither tester made any use of the functionality offered by SimpleTimingPort besides a trivial implementation of recvFunctional (only snoops) and recvRangeChange (not relevant since there is only one master). The patch does not change or add any functionality, it merely makes the introduction of a master/slave port easier (in a future patch).	2012-02-24 11:48:48 -05:00
Andreas Hansson	485d103255	MEM: Move all read/write blob functions from Port to PortProxy This patch moves the readBlob/writeBlob/memsetBlob from the Port class to the PortProxy class, thus making a clear separation of the basic port functionality (recv/send functional/atomic/timing), and the higher-level functional accessors available on the port proxies. There are only a few places in the code base where the blob functions were used on ports, and they are all for peeking into the memory system without making a normal memory access (in the memtest, and the malta and tsunami pchip). The memtest also exemplifies how easy it is to create a non-translating proxy if desired. The malta and tsunami pchip used a slave port to perform a functional read, and this is now changed to rely on the physProxy of the system (to which they already have a pointer).	2012-02-24 11:46:39 -05:00
Andreas Hansson	9e3c8de30b	MEM: Make port proxies use references rather than pointers This patch is adding a clearer design intent to all objects that would not be complete without a port proxy by making the proxies members rather than dynamically allocated. In essence, if NULL would not be a valid value for the proxy, then we avoid using a pointer to make this clear. The same approach is used for the methods using these proxies, such as loadSections, that now use references rather than pointers to better reflect the fact that NULL would not be an acceptable value (in fact the code would break and that is how this patch started out). Overall the concept of "using a reference to express unconditional composition where a NULL pointer is never valid" could be done on a much broader scale throughout the code base, but for now it is only done in the locations affected by the proxies.	2012-02-24 11:45:30 -05:00
Andreas Hansson	1031b824b9	MEM: Move port creation to the memory object(s) construction This patch moves all port creation from the getPort method to be consistently done in the MemObject's constructor. This is possible thanks to the Swig interface passing the length of the vector ports. Previously there was a mix of: 1) creating the ports as members (at object construction time) and using getPort for the name resolution, or 2) dynamically creating the ports in the getPort call. This is now uniform. Furthermore, objects that would not be complete without a port have these ports as members rather than having pointers to dynamically allocated ports. This patch also enables an elaboration-time enumeration of all the ports in the system which can be used to determine the masterId.	2012-02-24 11:43:53 -05:00
Andreas Hansson	9f07d2ce7e	CPU: Round-two unifying instr/data CPU ports across models This patch continues the unification of how the different CPU models create and share their instruction and data ports. Most importantly, it forces every CPU to have an instruction and a data port, and gives these ports explicit getters in the BaseCPU (getDataPort and getInstPort). The patch helps to simplify the code, make assumptions more explicit, and further ease future patches related to the CPU ports. The biggest changes are in the in-order model (that was not modified in the previous unification patch), which now moves the ports from the CacheUnit to the CPU. It also distinguishes the instruction fetch and load-store unit from the rest of the resources, and avoids the use of indices and casting in favour of keeping track of these two units explicitly (since they are always there anyways). The atomic, timing and O3 model simply return references to their already existing ports.	2012-02-24 11:42:00 -05:00
Andreas Hansson	ef4af8cec8	MEM: Fatal when no port can be found for an address This patch adds a check in the findPort method to ensure that an invalid port id is never returned. Previously this could happen if no default port was set, and no address matched the request, in which case -1 was returned causing a SEGFAULT when using the id to index in the port array. To clean things up further a symbolic name is added for the invalid port id.	2012-02-24 11:40:29 -05:00
Steve Reinhardt	e121708e08	SimObject: make get_config_as_dict() tolerate undefined params Without this patch, undefined params cause a cryptic KeyError in multidict inside get_config_as_dict(). This patch lets undefined params through get_config_as_dict() so they can once again generate meaningful error messages later on in the configuration process.	2012-02-20 08:11:14 -08:00
Andreas Hansson	6cf9f182f6	MEM: Fix residual bus ports and make them master/slave This patch cleans up a number of remaining uses of bus.port which is now split into bus.master and bus.slave. The only non-trivial change is the memtest where the level building now has to be aware of the role of the ports used in the previous level.	2012-02-14 14:15:30 -05:00
Mrinmoy Ghosh	9b05e96b9e	BPred: Fix RAS to handle predicated call/return instructions. Change RAS to fix issues with predicated call/return instructions. Handled all cases in the life of a predicated call and return instruction.	2012-02-13 12:26:25 -06:00
Mrinmoy Ghosh	fd90c3676d	BP: Fix several Branch Predictor issues. 1. Updates the Branch Predictor correctly to the state just after a mispredicted branch, if a squash occurs. 2. If a BTB does not find an entry, the branch is predicted not taken. The global history is modified to correctly reflect this prediction. 3. Local history is now updated at the fetch stage instead of execute stage. 4. In the Update stage of the branch predictor the local predictors are now correctly updated according to the state of local history during fetch stage. This patch also improves performance by as much as 17% on some benchmarks	2012-02-13 12:26:24 -06:00
Andreas Hansson	abc212461b	MEM: Explicit ports and Python binding on CopyEngine The copy-engine ports were previously created implicitly and bound based on the dma port peer rather than relying on the normal Python binding (connectPorts) being called explicitly. This patch makes the copy engine port similar to all other ports in that they are visible in the Python class and bound using the normal explicit calls through Python.	2012-02-13 06:46:43 -05:00
Andreas Hansson	63777fb23f	MEM: Pass the ports from Python to C++ using the Swig params This patch adds basic information about the ports in the parameter classes to be passed from the Python world to the corresponding C++ object. Currently, the only information passed is the number of connected peers, which for a Port is either 0 or 1, and for a VectorPort reflects the size of the VectorPort. The default port of the bus had to be renamed to avoid using the name "default" as a field in the parameter class. It is possible to extend the Swig'ed information further and add e.g. a pair with a description and size.	2012-02-13 06:45:11 -05:00
Andreas Hansson	5a9a743cfc	MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried out when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves.	2012-02-13 06:43:09 -05:00
Gabe Black	eada4268ef	X86: open flags: Another patch from Vince Weaver	2012-02-12 16:41:29 -08:00
Anthony Gutierrez	542d0ceebc	cpu: add separate stats for insts/ops both globally and per cpu model	2012-02-12 16:07:39 -06:00
Dam Sunwoo	230540e655	mem: fix cache stats to use request ids correctly This patch fixes the cache stats to use the new request ids. Cache stats also display the requestor names in the vector subnames. Most cache stats now include "nozero" and "nonan" flags to reduce the amount of excessive cache stat dump. Also, simplified incMissCount()/incHitCount() functions.	2012-02-12 16:07:39 -06:00
Ali Saidi	8aaa39e93d	mem: Add a master ID to each request object. This change adds a master id to each request object which can be used to identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python.	2012-02-12 16:07:38 -06:00
Mrinmoy Ghosh	7e104a1af2	prefetcher: Make prefetcher a sim object instead of it being a parameter on cache	2012-02-12 16:07:38 -06:00
Gabe Black	5b557a314f	SPARC: Make PSTATE and HPSTATE a BitUnion. This gets rid of cryptic bits of code with lots of bit manipulation, and makes some comments redundant.	2012-02-11 14:16:38 -08:00
Nilay Vaish	aa513a4a99	Ruby: Remove isTagPresent() calls from Sequencer.cc This patch removes the calls to isTagPresent() from Sequencer.cc. These calls are made just for setting the cache block to have been most recently used. The calls have been folded in to the function setMRU().	2012-02-10 11:29:02 -06:00
Nilay Vaish	69d8600bf8	MESI: Add queues for stalled requests This patch adds support for stalling the requests queued up at different controllers for the MESI CMP directory protocol. Earlier the controllers would recycle the requests using some fixed latency. This results in younger requests getting serviced first at times, and can result in starvation. Instead all the requests that need a particular block to be in a stable state are moved to a separate queue, where they wait till that block returns to a stable state and then they are processed.	2012-02-10 11:05:24 -06:00
Nilay Vaish	72f3f526fc	sim/system: initialize the pagePtr variable	2012-02-10 09:52:32 -06:00
Nilay Vaish	6a7a6263e1	O3 CPU: Improve handling of delayed commit flag The delayed commit flag is used in conjunction with interrupt pending flag to figure out whether or not fetch stage should get more instructions. This patch clears this flag when instructions are squashed. Also, in case an interrupt is pending, currently it is not possible to access the instruction cache. This patch allows accessing the cache in case this flag is set.	2012-02-10 08:37:31 -06:00
Nilay Vaish	cd765c23a2	O3 CPU: Strengthen condition for handling interrupts The condition for handling interrupts is to check whether or not the cpu's instruction list is empty. As observed, this can lead to cases in which even though the instruction list is empty, interrupts are handled when they should not be. The condition is being strengthened so that interrupts get handled only when the last committed microop did not have IsDelayedCommit set.	2012-02-10 08:37:30 -06:00
Nilay Vaish	8f7e03d4cf	O3 CPU: Provide the squashing instruction This patch adds a function to the ROB that will get the squashing instruction from the ROB's list of instructions. This squashing instruction is used for figuring out the macroop from which the fetch stage should fetch the microops. Further, a check has been added that if the instructions are to be fetched from the cache maintained by the fetch stage, then the data in the cache should be valid and the PC of the thread being fetched from is the same as the address of the cache block.	2012-02-10 08:37:28 -06:00
Nilay Vaish	0e597e944a	O3 Fetch: Check if PC is pointing to Microcode ROM	2012-02-10 08:37:26 -06:00
Gabe Black	e80ebc308f	SE/FS: Record the system pointer all the time for the simple CPU. This pointer was only being stored in code that came from SE mode. The system pointer is always meaningful and available, so it should always be stored.	2012-02-10 02:05:31 -08:00
Andreas Hansson	cdb32860b4	MEM: Remove onRetryList from BusPort and rely on retryList This patch removes the onRetryList field from the BusPort class and entirely relies on the retryList which holds all ports that are waiting to retry. The onRetryList field and the retryList were previously used with overloaded functionalities and only one is really needed (there were also checks to assert they held the same information). After this patch the bus ports will be split into master and slave ports and this simplifies that transition.	2012-02-09 13:06:27 -05:00
Gabe Black	a6246bb047	Checker: Access workload element 0 only if there is an element 0.	2012-02-07 04:44:01 -08:00
Gabe Black	f2b46fdb85	Faults: Turn off arch/faults.hh Because there are no longer architecture independent but specialized functions in arch/XXX/faults.hh, code that isn't using the faults from a particular ISA no longer needs to be able to include them through the switching header file arch/faults.hh. By removing that header file (arch/faults.hh), the potential interface between ISA code and non ISA code is narrowed.	2012-02-07 04:43:21 -08:00
Gabe Black	cbcdcd53a7	System: Forgot to qrefresh with my last change.	2012-02-03 09:48:10 -08:00
Gabe Black	acebd9bf91	System: Fix the check which detects running out of physical memory. The code that checks whether pages allocated by allocPhysPages fit into physical memory only checks that the first page fits, not that all of them do. This change makes the code check the last page which should work properly. This function used to only allocate one page at a time, so the first page and last page used to be the same thing.	2012-02-02 23:54:25 -08:00
Ali Saidi	0a26883296	configs: More fixes for the memory system updates	2012-02-01 09:48:28 -08:00
Gabe Black	ea8b347dc5	Merge with head, hopefully the last time for this batch.	2012-01-31 22:40:08 -08:00
Koan-Sin Tan	7d4f187700	clang: Enable compiling gem5 using clang 2.9 and 3.0 This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh). clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.	2012-01-31 12:05:52 -05:00
Andreas Hansson	4590b91fb8	MEM: Remove the otherPort from the cache ports This patch is a very straight-forward simplification, removing the unnecessary otherPort pointer from the cache port. The pointer was only used to forward range changes, and the address range is fixed for the cache. Removing the pointer simplifies the transition to master/slave ports.	2012-01-31 11:51:19 -05:00
Andreas Hansson	4fdecae443	Thread: Use inherited baseCpu rather than cpu in SimpleThread This patch is a trivial simplification, removing the cpu pointer from SimpleThread and relying on the baseCpu pointer in ThreadState. The patch does not add or change any functionality, it merely cleans up the code.	2012-01-31 11:50:07 -05:00
Dam Sunwoo	0ed3c84c7b	util: implements "writefile" gem5 op to export file from guest to host filesystem Usage: m5 writefile <filename> File will be created in the gem5 output folder with the identical filename. Implementation is largely based on the existing "readfile" functionality. Currently does not support exporting of folders.	2012-01-31 07:46:04 -08:00
Geoffrey Blake	af6aaf2581	CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5 Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification.	2012-01-31 07:46:03 -08:00
Gabe Black	e88165a431	Merge with main repository.	2012-01-30 21:07:57 -08:00
Andreas Hansson	cfc268ad9e	MEM: Make the RubyPort physMemPort a PioPort instead of M5Port This patch makes the physMemPort of the RubyPort a PioPort rather than an M5Port. This reflects the fact that the M5Port and PioPort have different roles. The M5Port is really a coherent slave that is connected to the CPUs and other coherent masters of the system, e.g. DMA ports. The PioPort, on the other hand, is a master port that is connected to the memory and other slaves, for example the pio devices. This simplifies future changes into master/slave ports and is consistent with the port roles throughout the system.	2012-01-30 05:38:24 -05:00
Andreas Hansson	ef9fc01073	MEM: Clean-up of Functional/Virtual/TranslatingPort remnants This patch cleans up forward declarations and a member-function prototype that still referred to the old FunctionalPort, VirtualPort and TranslatingPort. There is no change in functionality.	2012-01-30 03:44:25 -05:00
Gabe Black	39f314cc15	Yet another merge with the main repository. --HG-- rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/simout rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal rename : tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini => tests/long/se/00.gzip/ref/x86/linux/o3-timing/config.ini rename : tests/long/00.gzip/ref/x86/linux/o3-timing/simout => tests/long/se/00.gzip/ref/x86/linux/o3-timing/simout rename : tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt => tests/long/se/00.gzip/ref/x86/linux/o3-timing/stats.txt rename : tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini => tests/long/se/10.mcf/ref/x86/linux/o3-timing/config.ini rename : tests/long/10.mcf/ref/x86/linux/o3-timing/simout => tests/long/se/10.mcf/ref/x86/linux/o3-timing/simout rename : tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/10.mcf/ref/x86/linux/o3-timing/stats.txt rename : tests/long/20.parser/ref/x86/linux/o3-timing/config.ini => tests/long/se/20.parser/ref/x86/linux/o3-timing/config.ini rename : tests/long/20.parser/ref/x86/linux/o3-timing/simout => tests/long/se/20.parser/ref/x86/linux/o3-timing/simout rename : tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt => tests/long/se/20.parser/ref/x86/linux/o3-timing/stats.txt rename : tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini => tests/long/se/70.twolf/ref/x86/linux/o3-timing/config.ini rename : tests/long/70.twolf/ref/x86/linux/o3-timing/simout => tests/long/se/70.twolf/ref/x86/linux/o3-timing/simout rename : 
tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/70.twolf/ref/x86/linux/o3-timing/stats.txt rename : tests/quick/00.hello/ref/x86/linux/o3-timing/config.ini => tests/quick/se/00.hello/ref/x86/linux/o3-timing/config.ini rename : tests/quick/00.hello/ref/x86/linux/o3-timing/simout => tests/quick/se/00.hello/ref/x86/linux/o3-timing/simout rename : tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt => tests/quick/se/00.hello/ref/x86/linux/o3-timing/stats.txt	2012-01-29 03:27:15 -08:00
Gabe Black	dc0e629ea1	Implement Ali's review feedback. Try to decrease indentation, and remove some redundant FullSystem checks.	2012-01-29 02:04:34 -08:00
Nilay Vaish	5c2fc35e02	O3 CPU LSQ: Implement TSO This patch makes O3's LSQ maintain total order between stores. Essentially only the store at the head of the store buffer is allowed to be in flight. Only after that store completes, the next store is issued to the memory system. By default, the x86 architecture will have TSO.	2012-01-28 19:09:04 -06:00
Gabe Black	ec20ee2f7c	SE/FS: Make SE vs. FS mode a runtime parameter.	2012-01-28 07:24:34 -08:00
Gabe Black	eab5c60286	MIPS: Fix a compiler warning from the eret instruction.	2012-01-28 07:24:23 -08:00
Gabe Black	c3d41a2def	Merge with the main repo. --HG-- rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh	2012-01-28 07:24:01 -08:00
Andreas Hansson	4acca8a053	ns_gige: Fix a missing curly brace in if-statement This patch adds a missing curly brace when clearing and setting the appropriate bits in the ns_gige.cc code. This commit is not based on any runtime bug experienced, but rather inspection of the code.	2012-01-27 12:54:11 -05:00
Gabe Black	da2a4acc26	Merge yet again with the main repository.	2012-01-16 04:27:10 -08:00
Mitchell Hayenga	698408bce2	Fix memory corruption issue with CopyStringOut() CopyStringOut() improperly indexed when setting the null character, which would result in zeroing a random byte of memory after (out of bounds of) the character array.	2012-01-12 15:27:20 -06:00
Ali Saidi	bd55c9e2af	sim: display final value of curTick in stats Different from sim_ticks in that this value is restored from checkpoints and is never reset. Useful for aligning with framebuffer output ticks	2012-01-25 17:18:25 +00:00
Ali Saidi	e1c48dfce5	Mem: Add simple bandwidth stats to PhysicalMemory	2012-01-25 17:18:25 +00:00
Nilay Vaish	63563c9df2	O3, Ruby: Forward invalidations from Ruby to O3 CPU This patch implements the functionality for forwarding invalidations and replacements from the L1 cache of the Ruby memory system to the O3 CPU. The implementation adds a list of ports to RubyPort. Whenever a replacement or an invalidation is performed, the L1 cache forwards this to all the ports, which is the LSQ in case of the O3 CPU.	2012-01-23 11:07:14 -06:00
Nilay Vaish	9481d05b8a	MemCmd: Add a command for invalidation requests to LSQ This command will be sent from the memory system (Ruby) to the LSQ of an O3 CPU so that the LSQ, if it needs to, invalidates the address in the request packet.	2012-01-23 11:07:11 -06:00
Andreas Hansson	acd289b7ef	MEM: Make the bus default port yet another port This patch removes the idiosyncratic nature of the default bus port and makes it yet another port in the list of interfaces. Rather than having a specific pointer to the default port we merely track the identifier of this port. This change makes future port diversification easier and overall cleans up the bus code.	2012-01-17 12:55:09 -06:00
Andreas Hansson	55cf3f4ac1	MEM: Removing the default port peer from Python ports In preparation for the introduction of Master and Slave ports, this patch removes the default port parameter in the Python port and thus forces the argument list of the Port to contain only the description. The drawback at this point is that the config port and dma port of PCI and DMA devices have to be connected explicitly. This is key for future diversification as the pio and config port are slaves, but the dma port is a master.	2012-01-17 12:55:09 -06:00
Andreas Hansson	2208ea049f	MEM: Make the bus bridge unidirectional and fixed address range This patch makes the bus bridge uni-directional and specialises the bus ports to be a master port and a slave port. This greatly simplifies the assumptions on both sides as either port only has to deal with requests or responses. The following patches introduce the notion of master and slave ports, and would not be possible without this split of responsibilities. In making the bridge unidirectional, the address range mechanism of the bridge is also changed. For the cases where communication is taking place both ways, an additional bridge is needed. This causes issues with the existing mechanism, as the busses cannot determine when to stop iterating the address updates from the two bridges. To avoid this issue, and also greatly simplify the specification, the bridge now has a fixed set of address ranges, specified at creation time.	2012-01-17 12:55:09 -06:00
William Wang	e731cf4c1d	MEM: Remove the functional ports from the memory system The functional ports are no longer used and this patch cleans up the legacy that is still present in buses, memories, CPUs etc. Note that this does not refer to the class FunctionalPort (already removed), but rather ports with the name (and use) functional.	2012-01-17 12:55:09 -06:00
Andreas Hansson	07cf9d914b	MEM: Separate queries for snooping and address ranges This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges.	2012-01-17 12:55:09 -06:00
Andreas Hansson	142380a373	MEM: Remove Port removeConn and MemObject deletePortRefs Cleaning up and simplifying the ports and going towards a more strict elaboration-time creation and binding of the ports.	2012-01-17 12:55:09 -06:00
Andreas Hansson	6315e5bbb5	MEM: Remove the notion of the default port This patch removes the default port and instead relies on the peer being set to NULL initially. The binding check (i.e. is a port connected or not) will eventually be moved to the init function of the modules.	2012-01-17 12:55:09 -06:00
Andreas Hansson	de34e49d15	MEM: Simplify ports by removing EventManager This patch removes the inheritance of EventManager from the ports and moves all responsibility for event queues to the owner. Eventually the event manager should be the interface block, which could either be the structural owner or a subblock like a LSQ in the O3 CPU for example.	2012-01-17 12:55:09 -06:00
Andreas Hansson	b3f930c884	CPU: Moving towards a more general port across CPU models This patch performs minimal changes to move the instruction and data ports from specialised subclasses to the base CPU (to the largest degree possible). Ultimately it serves to make the CPU(s) have a well-defined interface to the memory sub-system.	2012-01-17 12:55:08 -06:00
Andreas Hansson	f85286b3de	MEM: Add port proxies instead of non-structural ports Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy --HG-- rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh	2012-01-17 12:55:08 -06:00
Andreas Hansson	43a45edcf0	Ruby: Change the access permissions for MOESI hammer This patch changes the access permission for the WB_E_W state from Busy to Read_Write to avoid having issues in follow-on patches with functional accesses going through Ruby. This change was made after consultation with all involved parties and is more of a work-around than a fix.	2012-01-17 12:55:07 -06:00
Andreas Hansson	41af57f9fb	MEM: Add the system port as a central access point The system port is used as a globally reachable access point to the memory subsystem. The benefit of using an actual port is that the usual infrastructure is used to resolve any access and thus makes the overall system able to handle distributed memories in any configuration, and also makes the accesses agnostic to the address map. This patch only introduces the port and does not actually use it for anything.	2012-01-17 12:55:07 -06:00
Andreas Hansson	13ef7a5647	MEM: Differentiate functional cache accesses from CPU and memory This patch changes the functionalAccess member function in the cache model such that it is aware of what port the access came from, i.e. if it came from the CPU side or from the memory side. By adding this information, it is possible to respect the 'forwardSnoops' flag for snooping requests coming from the memory side and not forward them. This fixes an outstanding issue with the IO bus getting accesses that have no valid destination port and also cleans up future changes to the bus model.	2012-01-17 12:55:07 -06:00
Steve Reinhardt	7a3a37307a	Alpha: warn_once about broken PAL breakpoints. A recent changeset (aae12ce9f34c) removed support for PAL-mode breakpoints in Alpha, since it was awkward and likely unused. This patch lets a user know if they potentially run into this limitation.	2012-01-16 19:01:27 -08:00
Steve Reinhardt	1585cfb5b5	debug: fix AllFlags::disable() Looks like copy-and-paste bug, apparently I'm the first person to ever use this since it's plainly broken.	2012-01-16 19:00:59 -08:00
Maximilien Breughe	a7394ad680	inorder: MDU deadlock fix	2012-01-12 10:15:00 -05:00
Deyuan Guo	4a59cf00b4	mips: compatibility between MIPS_SE and cross compiler from CodeSourcery	2012-01-12 09:59:01 -05:00
Deyuan Guo	31b6941a52	mips: Fix bugs in faults.cc/hh and tlb.cc for MIPS_FS	2012-01-12 09:59:00 -05:00
Deyuan Guo	a40ec5671f	mips: Fix decoder of two float-convert instructions	2012-01-12 09:58:59 -05:00
Deyuan Guo	7f782a6c79	mips: definition of MIPS64_QNAN in registers.hh	2012-01-12 09:58:58 -05:00
Nilay Vaish	0e6d6a5e25	PerfectCacheMemory: Remove references to CacheMsg The definition for the class CacheMsg was removed long back. Some declaration had still survived, which was recently removed. Since the PerfectCacheMemory class relied on this particular declaration, its absence led to compilation breaking down. Hence this patch.	2012-01-12 00:35:57 -06:00
Ali Saidi	c40ae2c3fb	Packet: Put back part of the assert	2012-01-11 19:27:11 -05:00
Ali Saidi	bc1c21274e	Packet: Remove meaningless assert statement	2012-01-11 19:24:13 -05:00
Nilay Vaish	bf59a9298f	Ruby: Resurrect Cache Warmup Capability This patch resurrects ruby's cache warmup capability. It essentially makes use of all the infrastructure that was added to the controllers, memories and the cache recorder.	2012-01-11 13:48:48 -06:00
Nilay Vaish	3f8065290a	Ruby Debug Flags: Remove one, add another The flag RubyStoreBuffer is being removed, instead RubySystem is being added	2012-01-11 13:42:00 -06:00
Nilay Vaish	2d3cae02f5	Ruby Port: Add a list of cpu ports attached to this port	2012-01-11 13:39:58 -06:00
Nilay Vaish	17fc60ee88	Ruby EventQueue: Remove unused functions	2012-01-11 13:31:04 -06:00
Nilay Vaish	8b3ad17cc3	Ruby Sparse Memory: Add function for collating blocks This patch adds a function to the Sparse Memory so that the blocks can be recorded in a cache trace. The blocks are added to the cache recorder which can later write them into a file.	2012-01-11 13:29:54 -06:00
Nilay Vaish	c3109f7775	Ruby: Add infrastructure for recording cache contents This patch changes CacheRecorder, CacheMemory, CacheControllers so that the contents of a cache can be recorded for checkpointing purposes.	2012-01-11 13:29:15 -06:00
Nilay Vaish	ab0347a1c6	Ruby Memory Vector: Functions for collating and populating pages This patch adds functions to the memory vector class that can be used for collating memory pages to raw trace and for populating pages from a raw trace.	2012-01-11 11:46:23 -06:00
Nilay Vaish	bd739a75b9	Ruby: remove the files related to the tracer The Ruby Tracer is out of date with the changes that are being carried out to support checkpointing. Hence, it needs to be removed.	2012-01-10 18:35:45 -06:00
Nilay Vaish	70cb16ba14	MOESI Hammer: Remove a couple of bugs A couple of bugs were observed while building checkpointing support in Ruby. This patch changes transitions to remove those errors.	2012-01-10 17:28:44 -06:00
Nilay Vaish	adff204c97	Sparse Memory: Simplify the structure for an entry The SparseMemEntry structure includes just one void* pointer. It seems unnecessary that we have a structure for this. The patch removes the structure and makes use of a typedef on void* instead.	2012-01-10 10:20:32 -06:00
Ali Saidi	cfa1d26b43	Automated merge with ssh://repo.gem5.org/gem5	2012-01-10 10:18:08 -06:00
Ali Saidi	8f18898e85	config: Fix json output for Python lt 2.6.	2012-01-10 10:17:33 -06:00
Nilay Vaish	9957035a42	DPRINTF: Improve some dprintf messages.	2012-01-10 10:15:02 -06:00
Nilay Vaish	acbc03ae46	X86: Add memory fence to I/O instructions	2012-01-09 20:13:31 -06:00
Anders Handler	b587d511c3	CPU: Remove Alpha-specific PC alignment check.	2012-01-09 20:05:07 -05:00
Ali Saidi	e308208f30	Config: Fix issue with JSON output	2012-01-09 20:04:28 -05:00
Geoffrey Blake	e826d23a2e	Packet: Add derived class FunctionalPacket to enable partial functional reads This adds the derived class FunctionalPacket to fix a long standing deficiency in the Packet class where it was unable to handle finding data to partially satisfy a functional access. Made this a derived class as functional accesses are used only in certain contexts and to not add any additional overhead to the existing Packet class.	2012-01-09 18:10:05 -06:00
Dam Sunwoo	bda1125e88	stats: fix Vector2d to display stats correctly when y_subname is not specified. Vector2d stats with no y_subname were not displayed as the VectorPrint subname was not initialized correctly to reflect the empty field.	2012-01-09 18:08:20 -06:00
Prakash Ramrakhyani	51aa7e4a03	sim: Enable sampling of run-time for code-sections marked using pseudo insts. This patch adds a mechanism to collect run time samples for specific portions of a benchmark, using work_begin and work_end pseudo instructions. It also enhances the histogram stat to report geometric mean.	2012-01-09 18:08:20 -06:00
Ali Saidi	525d1e46dc	O3: Remove some asserts that no longer seem to be valid.	2012-01-09 18:08:20 -06:00
Ali Saidi	68d387ec80	config: support outputting a pickle of the configuration tree	2012-01-09 18:08:20 -06:00
Min Kyu Jeong	c94e5256d9	mem: Change DPRINTF to print a more useful destination port number. Old code prints 0 for destination since pkt->getDest() returns 0 for pkt->getDest() == Packet::Broadcast, which is always true.	2012-01-09 18:08:20 -06:00
Ali Saidi	d2c26f402c	O3: Add support of function tracing with O3 CPU.	2012-01-09 18:08:20 -06:00
Ali Saidi	bcb71963eb	ARM: Add support for running multiple systems	2012-01-09 18:08:20 -06:00
Ali Saidi	80a6907927	ARM: Add support for initparam m5 op	2012-01-09 18:08:20 -06:00
Dam Sunwoo	3f9e352de4	Base: Fixed shift amount in genrand() to work with large numbers The previous version didn't work correctly with max integer values (2^31-1 for 32-bit, 2^63-1 for 64bit version), causing "shift" to become -1. For smaller numbers, it wouldn't have caused functional errors, but would have resulted in more than necessary loops in the while loop. Special-cased cases when (max + 1 == 0) to prevent the ceilLog2 functions from failing.	2012-01-09 18:08:20 -06:00
Andreas Hansson	59b7cad3ec	SWIG: Make gem5 compile and link with swig 2.0.4 To make gem5 compile and run with swig 2.0.4 a few minor fixes are necessary: the fail label issued by swig must not be treated as an error by gcc (tested with gcc 4.2.1), and the vector wrappers must have SWIGPY_SLICE_ARG defined which happens in pycontainer.swg, included through std_container.i. By adding the aforementioned include to the vector wrappers everything seems to work.	2012-01-09 18:08:20 -06:00
Andreas Hansson	c2dbfc1d6c	MAC: Make gem5 compile and run on MacOSX 10.7.2 Adaptations to make gem5 compile and run on OSX 10.7.2, with a stock gcc 4.2.1 and the remaining dependencies from macports, i.e. python 2.7.2, swig 2.0.4, mercurial 2.0. The changes include an adaptation of the SConstruct to handle non-library linker flags, and Darwin-specific code to find the memory usage of gem5. A number of Ruby files relied on ambiguous uint (without the 32 suffix) which caused compilation errors.	2012-01-09 18:08:20 -06:00
Nilay Vaish	10c2e8ae9a	Ruby Cache: Add param for marking caches as instruction only	2012-01-07 07:38:53 -06:00
Gabe Black	241cc0c840	Another merge with the main repository.	2012-01-07 02:16:37 -08:00
Gabe Black	ec936364b7	Merge with the main repository again.	2012-01-07 02:15:35 -08:00
Gabe Black	36a822f08e	Merge with main repository.	2012-01-07 02:10:34 -08:00
Nilay Vaish	ce941fd2ae	AbstractController: Remove some of the unused functions --HG-- extra : rebase_source : 78df7398a609f1db8a2592cd2d1bdc9156d1b8c3	2012-01-06 05:11:07 -06:00
Nilay Vaish	6da125cc3c	Ruby Set: Move NUMBER_WORDS_PER_SET to Set.hh This constant is currently in System.hh, but is only used in Set.hh. It is being moved to Set.hh to remove this artificial dependence of Set.hh on System.hh. --HG-- extra : rebase_source : 683c43a5eeaec4f5f523b3ea32953a07f65cfee7	2012-01-06 05:11:07 -06:00
Nilay Vaish	daa4c7526a	eventq: add a function for replacing head of the queue This patch adds a function for replacing the event at the head of the queue with another event. This helps in running a different set of events. Events already scheduled can be processed by replacing the original head event back. This function has been specifically added to support cache warmup and cooldown required for creating and restoring checkpoints. --HG-- extra : rebase_source : ed6e2905720b6bfdefd020fab76235ccf33d28d1	2012-01-05 11:02:56 -06:00
Nilay Vaish	d3aa01eed9	MESI Coherence Protocol: Fix L2 miss statistics This patch removes calls to uu_ProfileMiss from transitions where the request is satisfied by the L2 cache controller. --HG-- extra : rebase_source : e59fe7c6cd5795c0019cf178dd3b062d73cc2ff5	2012-01-05 11:00:45 -06:00
Nilay Vaish	bd23a37198	X86 TLB: Move a DPRINTF to its correct place The DPRINTF for doing protection checks appears after the checks have been carried out. It is possible that the function returns while the checks are being carried out, in which case the printf is missed out. This patch moves the DPRINTF before the checks. --HG-- extra : rebase_source : 172896057e593022444d882ea93323a5d9f77a89	2012-01-05 11:00:32 -06:00
Nilay Vaish	ea94029ea5	Ruby: Shuffle some of the included files This patch adds and removes included files from some of the files so as to remove some false dependencies and include some files directly instead of transitively. --HG-- extra : rebase_source : 09b482ee9ae00b3a204ace0c63550bc3ca220134	2011-12-31 18:44:51 -06:00
Nilay Vaish	734ef9a209	SLICC: Use pointers for directory entries SLICC uses pointers for cache and TBE entries but not for directory entries. This patch changes the protocols, SLICC and Ruby memory system so that even directory entries are referenced using pointers. --HG-- extra : rebase_source : abeb4ac78033d003153751f216fd1948251fcfad	2011-12-31 16:38:30 -06:00
Ali Saidi	94ce971278	IO: Fix bug in DMA Device where receiving a snoop on DMA port would cause a panic. --HG-- extra : rebase_source : 8152d4fa7d7354c9f150a450ae0710e95141ba4b	2011-12-15 00:09:46 -05:00
Nathan Binkert	6ef9691035	gcc: fix unused variable warnings from GCC 4.6.1 --HG-- extra : rebase_source : f9e22de341493a25ac6106c16ac35c61c128a080	2011-12-13 11:49:27 -08:00
Ali Saidi	9b52717a92	Trace: Fix issue with creation of trace file with output dir overhaul. --HG-- extra : rebase_source : c1ab57ea8805703d97cdee4f32410821a2d2a9db	2011-12-01 17:36:22 -08:00
Brad Beckmann	8daad28a90	MOESI_hammer: fixed L2 to L1 infinite stalls and deadlock --HG-- extra : rebase_source : 90f217f28e195a8cee5d64b25c913b452d818676	2011-12-01 10:08:52 -08:00
Brad Beckmann	cecbdb6d79	physmem: Improved fatal message for size mismatch --HG-- extra : rebase_source : 16da1c63263f8fd6fef9a842c577343cd6246a35	2011-12-01 10:08:52 -08:00
Chris Emmons	9aea847f58	VNC: Add support for capturing frame buffer to file each time it is changed. When a change in the frame buffer from the VNC server is detected, the new frame is stored out to the m5out/frames_*/ directory. Specify the flag "--frame-capture" when running configs/example/fs.py to enable this behavior. --HG-- extra : rebase_source : d4e08e83f4fa6ff79f3dc9c433fc1f0487e057fc	2011-12-01 00:15:26 -08:00
Chris Emmons	5bde1d359f	Output: Add hierarchical output support and cleanup existing codebase. --HG-- extra : rebase_source : 3301137733cdf5fdb471d56ef7990e7a3a865442	2011-12-01 00:15:25 -08:00
Ali Saidi	5d50ee420d	SE: Don't warn when not extending stack as it's too noisy with O3. --HG-- extra : rebase_source : e56d1551d42d46b5f357cd63f9891715b664f6fc	2011-12-01 00:15:25 -08:00
Chander Sudanthi	61c14da751	O3: Remove hardcoded tgts_per_mshr in O3CPU.py. There are two lines in O3CPU.py that set the dcache and icache tgts_per_mshr to 20, ignoring any pre-configured value of tgts_per_mshr. This patch removes these hardcoded lines from O3CPU.py and sets the default L1 cache mshr targets to 20. --HG-- extra : rebase_source : 6f92d950e90496a3102967442814e97dc84db08b	2011-12-01 00:15:22 -08:00
Mitchell Hayenga	fa753c1454	Device: Make changes necessary to support a coherent page walker cache. Adds the flag 'recvSnoops' which enables pagewalkers using DmaPorts, to properly configure snoops. --HG-- extra : rebase_source : 64207bef62c3268ddff2236ee4adae873812325f	2011-12-01 00:15:22 -08:00
Ali Saidi	946f7f0f55	ARM: Add support for having a TLB cache. --HG-- extra : rebase_source : 7a5780ab74d7c294682738c7ccb3ce8d56c6fd63	2011-12-01 00:15:22 -08:00
Ali Saidi	5901c5223f	ARM: Add IsSerializeAfter and IsNonSpeculative flag to the syscall instruction. Squashes the subsequent instructions in O3 pipe after the service call, so that they see the effect of the system call when re-executed. This isn't really an issue with FS mode, but can show up in SE mode. --HG-- extra : rebase_source : 613a69fe1d9834261e25a8cd340aa6b47578e1fe	2011-12-01 00:15:22 -08:00
Ali Saidi	1444103998	O3: Add stat that counts how many cycles the O3 cpu was quiesced. --HG-- extra : rebase_source : 043b9307eef3c5b87f8e6370765641e016ed1fa7	2011-12-01 00:15:22 -08:00
Gabe Black	93fb460fad	X86: Fix a bad segmentation check for the stack segment. --HG-- extra : rebase_source : 755f4f6eae52f88ed516a1f1ac9e2565725d89c1	2011-12-01 00:17:14 -05:00
Gabe Black	87b66c9ae3	SPARC: Minor style fix. I forgot to fix this as well per Ali's feedback. --HG-- extra : rebase_source : e70d031cb5f91e2212a1a73ea1769bf0549b826c	2011-11-28 04:35:55 -05:00
Andreas Hansson	64ccfecf95	SPARC: Fixing a minor copy-paste bug using the wrong variable There was a bug in the mm_disk implementation where a copy paste error resulted in the d32 variable not being initialised (as it incorrectly was used instead of d16), and gcc 4.5 complaining. --HG-- extra : rebase_source : 9515e87b188b9eac189da8034cb13c3bf7d9e20b	2011-11-28 04:34:18 -05:00
Gabe Black	e7d0c999a1	SPARC: Isolate FP operations enough to prevent code/rounding mode reordering. --HG-- extra : rebase_source : ee79ab89c5a707c1294f38abb84c60f8ef64196c	2011-11-27 22:00:58 -05:00
Gabe Black	13552dc304	Compiler: Add an M5_NO_INLINE define. --HG-- extra : rebase_source : 1f5e8b7bb6b0a8bb4f951b6d7189964d96ed5df1	2011-11-27 22:00:57 -05:00
Tushar Krishna	88e91cafc6	Topology: bug fix in external link initialization --HG-- extra : rebase_source : c226cd1e5e5ed4d4c64fa9427de4905bd8335e34	2011-11-23 16:34:13 -05:00
Tushar Krishna	eff430a972	Remove standard_1level_CMP-protocol.sm include statement from Network --HG-- extra : rebase_source : 51a2dd4bb643e3dc5b0218a6190cf5c1989f9691	2011-11-22 20:11:18 -05:00
Gabe Black	49a2d54e1a	X86: Fix the constant detecting three byte opcodes in the predecoder. --HG-- extra : rebase_source : b64c3d2348cb73177024695fb6e205d51bf1cda9	2011-11-20 05:10:05 -08:00
Gabe Black	85424bef19	SE/FS: Get rid of includes of config/full_system.hh.	2011-11-18 02:20:22 -08:00
Gabe Black	de21bb93ea	SE/FS: Get rid of FULL_SYSTEM in the CPU directory.	2011-11-18 01:33:28 -08:00


6977 commits