sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Nilay Vaish	183100b8cb	ruby: slicc: slight change to rule for transitions It had an unnecessary pairs token which is being removed.	2014-04-19 09:00:31 -05:00
Faissal Sleiman	a1570f544f	o3: Fix occupancy checks for SMT A number of calls to isEmpty() and numFreeEntries() should be thread-specific. In cpu.cc, the fact that tid is /commented/ out is a bug. Say the rob has instructions from thread 0 (isEmpty() returns false), and none from thread 1. If we are trying to squash all of thread 1, then readTailInst(thread 1) will be called because rob->isEmpty() returns false. The result is end_it is not in the list and the while statement loops indefinitely back over the cpu's instList. In iew_impl.hh, all threads are told they have the entire remaining IQ, when each thread actually has a certain allocation. The result is extra stalls at the iew dispatch stage which the rename stage usually takes care of. In commit_impl.hh, rob->readHeadInst(thread 1) can be called if the rob only contains instructions from thread 0. This returns a dummyInst (which may work since we are trying to squash all instructions, but hardly seems like the right way to do it). In rob_impl.hh this fix skips the rest of the function more frequently and is more efficient. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-04-19 09:00:30 -05:00
Marco Elver	d9fa950396	ruby: recorder: Fix (de-)serializing with different cache block-sizes Upon aggregating records, serialize system's cache-block size, as the cache-block size can be different when restoring from a checkpoint. This way, we can correctly read all records when restoring from a checkpoint, even if the cache-block size is different. Note that it is only possible to restore from a checkpoint if the desired cache-block size is smaller or equal to the cache-block size when the checkpoint was taken; we can split one larger request into multiple small ones, but it is not reliable to do the opposite. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-04-19 09:00:30 -05:00
Andreas Sandberg	02b51afb7e	kvm, x86: Add initial support for multicore simulation Simulating an SMP or multicore requires devices to be shared between multiple KVM vCPUs. This means that locking is required when accessing devices. This changeset adds the necessary locking to allow devices to execute correctly. It is implemented by temporarily migrating the KVM CPU to the VM's (and devices') event queue when handling MMIO. Similarly, the VM migrates to the interrupt controller's event queue when delivering an interrupt. The support for fast-forwarding of multicore simulations added by this changeset assumes that all devices in a system are simulated in the same thread and each vCPU has its own thread. Special care must be taken to ensure that devices living under the CPU in the object hierarchy (e.g., the interrupt controller) do not inherit the parent CPU's thread and are assigned to the device thread. The KvmVM object is assumed to live in the same thread as the other devices in the system.	2014-04-09 16:01:58 +02:00
Andreas Sandberg	221f4f232a	dev: Protect PollEvent processing when running in parallel mode The calling thread is undefined when the PollQueue services events. This implies that PollEvents need to handle the case where they are processed from a different thread than the thread that created the event. This changeset adds temporary event queue migrations to the VNC server, the ethernet tap device, and the terminal to protect them from inter-thread calls.	2014-04-09 16:01:43 +02:00
Nilay Vaish	d805e42b81	ruby: slicc: change enqueue statement As of now, the enqueue statement can take in any number of 'pairs' as argument. But we only use the pair in which latency is the key. This latency is allowed to be either a fixed integer or a member variable of controller in which the expression appears. This patch drops the use of pairs in an enqueue statement. Instead, an expression is allowed which will be interpreted to be the latency of the enqueue. This expression can be anything allowed by slicc including a constant integer or a member variable.	2014-04-08 13:26:30 -05:00
Nilay Vaish	e689c00b16	ruby: coherence protocols: drop the phrase IntraChip The phrase is no longer valid since we do not distinguish between inter and intra chip communication.	2014-04-08 13:26:29 -05:00
Andreas Sandberg	838bcd3b19	sim: Add the ability to lock and migrate between event queues We need the ability to lock event queues to enable device accesses across threads. The serviceOne() method now takes a service lock prior to handling a new event. By locking an event queue, a different thread/eq can effectively execute in the context of the locked event queue. To simplify temporary event queue migrations, this changeset introduces the EventQueue::ScopedMigration class that unlocks the current event queue, locks a new event queue, and updates the current event queue variable. In order to prevent deadlocks, event queues need to be released when waiting on barriers. This is implemented using the EventQueue::ScopedRelease class. An instance of this class is, for example, used in the BaseGlobalEvent class to release the event queue when waiting on the synchronization barrier. The intended use for this functionality is when devices need to be accessed across thread boundaries. For example, when fast-forwarding, it might be useful to run devices and CPUs in separate threads. In such a case, the CPU locks the device queue whenever it needs to perform IO. This functionality is primarily intended for KVM. Note: Migrating between event queues can lead to non-deterministic timing. Use with extreme care! --HG-- extra : rebase_source : 23e3a741a1fd73861d1339782dbbe1bc76285315	2014-04-03 11:22:49 +02:00
Marco Elver	b884fcf412	cpu: o3: lsq: Fix TSO implementation This patch fixes a violation of TSO in the O3CPU, as all loads must be ordered with all other loads. In the LQ, if a snoop is observed, all subsequent loads need to be squashed if the system is TSO. Prior to this patch, the following case could be violated: P0 | P1 ; MOV [x],$1 | MOV EAX,[y] ; MOV [y],$1 | MOV EBX,[x] ; exists (1:EAX=1 /\ 1:EBX=0) [is a violation] The problem was found using litmus [http://diy.inria.fr]. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-03-25 13:15:04 -05:00
Andreas Hansson	a00383a40a	mem: Track DRAM read/write switching and add hysteresis This patch adds stats for tracking the number of reads/writes per bus turn around, and also adds hysteresis to the write-to-read switching to ensure that the queue does not oscillate around the low threshold.	2014-03-23 11:12:14 -04:00
Andreas Hansson	7c18691db1	mem: Rename SimpleDRAM to a more suitable DRAMCtrl This patch renames the not-so-simple SimpleDRAM to a more suitable DRAMCtrl. The name change is intended to ensure that we do not send the wrong message (although the "simple" in SimpleDRAM was originally intended as in cleverly simple, or elegant). As the DRAM controller modelling work is being presented at ISPASS'14 our hope is that a broader audience will use the model in the future. --HG-- rename : src/mem/SimpleDRAM.py => src/mem/DRAMCtrl.py rename : src/mem/simple_dram.cc => src/mem/dram_ctrl.cc rename : src/mem/simple_dram.hh => src/mem/dram_ctrl.hh	2014-03-23 11:12:12 -04:00
Andreas Hansson	3dd1587afc	mem: Change memory defaults to be more representative Make the default memory type DDR3-1600 x64, and use the open-adaptive page policy. This change is aiming to ensure that users by default are using a realistic memory system.	2014-03-23 11:12:10 -04:00
Wendy Elsasser	bbbae677ed	mem: Add close adaptive paging policy to DRAM controller model This patch adds a second adaptive page policy to the DRAM controller, closing the page unless there are already queued accesses to the open page.	2014-03-23 11:12:08 -04:00
Andreas Hansson	03a1aed803	mem: DRAM controller tidying up Minor tidying up and removing of redundant code, including the printing of queue state every million accesses.	2014-03-23 11:12:06 -04:00
Andreas Hansson	bc83eb2197	mem: Fix bug in DRAM bytes per activate This patch ensures that we do not sample the bytes per activate when the row has already been closed.	2014-03-23 11:12:05 -04:00
Andreas Hansson	116985d661	mem: Limit the accesses to a page before forcing a precharge This patch adds a basic starvation-prevention mechanism where a DRAM page is forced to close after a certain number of accesses. The limit is combined with the open and open-adaptive page policy and if reached causes an auto-precharge.	2014-03-23 11:12:03 -04:00
Andreas Hansson	6557741311	mem: Make DRAM write queue draining more aggressive This patch changes the triggering condition for the write draining such that we grab the opportunity to issue writes if there are no reads waiting (as opposed to waiting for the writes to reach the high threshold). As a result, we potentially drain some of the writes in read idle periods (if any). A low threshold is added to be able to control how many write bursts are kept in the memory controller queue (acting as on-chip storage). The high and low thresholds are updated to sensible values for a 32/64 size write buffer. Note that the thresholds should be adjusted along with the queue sizes. This patch also adds some basic initialisation sanity checks and moves part of the initialisation to the constructor.	2014-03-23 11:12:01 -04:00
Neha Agarwal	364a51181e	cpu: DRAM Traffic Generator This patch enables a new 'DRAM' mode to the existing traffic generator, catered to generate specific requests to DRAM based on required hit length (stride size) and bank utilization. It is an add on to the Random mode. The basic idea is to control how many successive packets target the same page, and how many banks are being used in parallel. This gives a two-dimensional space that stresses different aspects of the DRAM timing. The configuration file needed to use this patch has to be changed as follows: (reference to Random Mode, LPDDR3 memory type) 'STATE 0 10000000000 RANDOM 50 0 134217728 64 3004 5002 0' -> 'STATE 0 10000000000 DRAM 50 0 134217728 32 3004 5002 0 96 1024 8 6 1' The last 4 parameters to be added are: <stride size (bytes), page size(bytes), number of banks available in DRAM, number of banks to be utilized, address mapping scheme> The address mapping information is used to get the stride address stream of the specified size and to know where to find the bank bits. The configuration file has a parameter where '0'-> RoCoRaBaCh, '1'-> RoRaBaCoCh/RoRaBaChCo address-mapping schemes. Note that the generator currently assumes a single channel and a single rank. This is to avoid overwhelming the traffic generator with information about the memory organisation.	2014-03-23 11:11:58 -04:00
Neha Agarwal	43abaf518f	mem: DDR3 config for comparing with DRAMSim2 This patch adds a new DDR3 configuration to match with the parameters that are specified in one of the DDR3 configs used in DRAMSim2.	2014-03-23 11:11:56 -04:00
Andreas Hansson	7e7b67472a	mem: More descriptive address-mapping scheme names This patch adds the row bits to the name of the address mapping schemes to make it clearer that all the current schemes place the row bits as the most significant bits.	2014-03-23 11:11:53 -04:00
Stan Czerniawski	4f77bc230a	misc: Fix -q (quiet) flag Check the right flag.	2014-03-23 11:11:49 -04:00
Andreas Hansson	9ac4f781ec	ruby: Move Ruby debug flags to ruby dir and remove stale options This patch moves the Ruby-related debug flags to the ruby sub-directory, and also removes the stale SConsopts that adds the no-longer-used NO_VECTOR_BOUNDS_CHECK.	2014-03-23 11:11:48 -04:00
Andreas Hansson	9f018d2f5a	mem: Include the DRAMSim2 wrapper in NULL build This patch makes sure DRAMSim2 is included in a build of the NULL ISA.	2014-03-23 11:11:44 -04:00
Sascha Bischoff	548d47ea2c	mem: CommMonitor trace warn on non-timing mode Add a warning to the CommMonitor which will alert the user if they try and record a trace when the system is not in timing mode.	2014-03-23 11:11:40 -04:00
Stan Czerniawski	e18d0e04a2	cpu: Add basic check to TrafficGen initial state Prevent incomplete configuration of TrafficGen class from causing segmentation faults. If an 'INIT' line is not present in the configuration file then the currState variable will remain uninitialized which may result in a crash.	2014-03-23 11:11:39 -04:00
Andrew Bardsley	0c001e729a	dev: Fix IsaFake's cxx_header setting cxx_header was set incorrectly on IsaFake	2014-03-23 11:11:37 -04:00
Eric Van Hensbergen	7630168a75	arm: m5ops readfile64 args broken, offset coming through garbage There were several sections of the m5ops code which were essentially copy/pasted versions of the 32-bit code. The problem is that some of these didn't account for 64-bit registers leading to arguments being in the wrong registers. This patch addresses the args for readfile64, writefile64, and addsymbol64 -- all of which seemed to suffer from a similar set of problems when moving to 64-bit.	2014-03-23 11:11:34 -04:00
Andreas Hansson	5093e58dc2	base: Fix error message time unit (cycle -> tick) This patch fixes the unit used in all error messages.	2014-03-23 11:11:32 -04:00
Nilay Vaish	52a83c1d0e	ruby: consumer: avoid accessing wakeup times when waking up Each consumer object maintains a set of tick values when the object is supposed to wakeup and do some processing. As of now, the object accesses this set both when scheduling a wakeup event and when the object actually wakes up. The set is accessed during wakeup to remove the current tick value from the set. This functionality is now being moved to the scheduling function where ticks are removed at a later time.	2014-03-20 09:14:14 -05:00
Nilay Vaish	4b67ada89e	ruby: garnet: convert network interfaces into clocked objects This helps in configuring the network interfaces from the python script and these objects no longer rely on the network object for the timing information.	2014-03-20 09:14:14 -05:00
Nilay Vaish	4f7ef51efb	ruby: slicc: code refactor	2014-03-20 09:14:14 -05:00
Nilay Vaish	9b3418d163	ruby: no piobus in se mode Piobus was recently added to se scripts for ruby so that the interrupt controller can be connected to something (required since the interrupt controller sends address range messages). This patch removes the piobus and instead, the pio port of ruby port will now ignore the range change messages in se mode.	2014-03-20 08:03:09 -05:00
Nilay Vaish	f7e7fa6d90	ruby: remove some of the unnecessary code	2014-03-17 17:40:14 -05:00
Andreas Sandberg	11ffa379ab	kvm: Clean up signal handling KVM used to use two signals, one for instruction count exits and one for timer exits. There is really no need to distinguish between the two since they only trigger exits from KVM. This changeset unifies and renames the signals and adds a method, kick(), that can be used to raise the control signal in the vCPU thread. It also removes the early timer warning since we do not normally see if the signal was delivered. --HG-- extra : rebase_source : cd0e45ca90894c3d6f6aa115b9b06a1d8f0fda4d	2014-03-16 17:40:58 +01:00
Andreas Sandberg	5db547bca4	kvm: x86: Adjust PC to remove the CS segment base address gem5 seems to store the PC as RIP+CS_BASE. This is not what KVM expects, so we need to subtract CS_BASE prior to transferring the PC into KVM. This changeset adds the necessary PC manipulation and refactors thread context updates slightly to avoid reading registers multiple times from KVM. --HG-- extra : rebase_source : 3f0569dca06a1fcd8694925f75c8918d954ada44	2014-03-16 17:30:24 +01:00
Andreas Sandberg	f791e7b313	kvm: x86: Add support for x86 INIT and STARTUP handling This changeset adds support for INIT and STARTUP IPI handling. We currently handle both of these interrupts in gem5 and transfer the state to KVM. Since we do not have a BIOS loaded, we pretend that the INIT interrupt suspends the CPU after reset. --HG-- extra : rebase_source : 7f3b25f3801d68f668b6cd91eaf50d6f48ee2a6a	2014-03-16 17:28:23 +01:00
Paul Rosenfeld	32bf74cb8e	alpha: Small removal of dead comments/code from alpha ISA Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-03-12 07:03:22 -05:00
Andreas Hansson	62fe81e9c1	cpu: Make CPU and ThreadContext getters const This patch merely tidies up the CPU and ThreadContext getters by making them const where appropriate.	2014-03-07 15:56:23 -05:00
Geoffrey Blake	c4a8e5c36c	arm: Handle functional TLB walks properly The table walker code currently accounts for two types of walks, Atomic and Timing, and treats them differently. Atomic walks keep a single instance of WalkerState around for all walks to use in currState. Timing mode keeps a queue of in-flight WalkerStates and maintains currState as NULL between walks. If a functional walk is done during Timing mode, it is treated as an atomic walk and either creates a persistent WalkerState if in between Timing walks, or stomps an existing currState for an in-progress Timing walk. This patch distinguishes functional walks as being able to exist at any time and sets up a temporary WalkerState for its exclusive use and then cleans up when finished, leaving any in progress Atomic or Timing walks undisturbed.	2014-03-07 15:56:23 -05:00
Prakash Ramrakhyani	e88cffb30a	mem: Fix incorrect assert failure in the Cache This patch fixes an assert condition that is not true at all times. There are valid situations that arise in dual-core dual-workload runs where the assert condition is false. The function call following the assert however needs to be called only when the condition is true (a block cannot be invalidated in the tags structure if has not been allocated in the structure, and the tempBlock is never allocated). Hence the 'assert' has been replaced with an 'if'.	2014-03-07 15:56:23 -05:00
Radhika Jagtap	c446dc40bd	mem: Edit proto Packet and enhance the python script This patch changes the decode script to output the optional fields of the proto message Packet, namely id and flags. The flags field is set by the communication monitor. The id field is useful for CPU trace experiments, e.g. linking the fetch side to decode side. It had to be renamed because it clashes with a built in python function id() for getting the "identity" of an object. This patch also takes a few common function definitions out from the multiple scripts and adds them to a protolib python module.	2014-03-07 15:56:23 -05:00
Stephan Diestelhorst	45677ffa97	misc: Add panic_if / fatal_if / chatty_assert This snippet can be used to replace if + {panics, fatals, asserts} constructs. The idea is to have both the condition checking and a verbose printout in a single statement. The interface is as follows: panic_if(foo != bar, "These should be equal: foo %i bar %i", foo, bar); fatal_if(foo != bar, "These should be equal: foo %i bar %i", foo, bar); chatty_assert(foo == bar, "These should be equal: foo %i bar %i", foo, bar);	2014-03-07 15:56:23 -05:00
Mitch Hayenga	b9a9d99b22	scons: Fixes uninitialized warnings issued by clang Small fixes to appease recent clang versions.	2014-03-07 15:56:23 -05:00
Stephan Diestelhorst	bef2086f5b	arm: Fix uninitialised warning with gcc 4.8 Small fix for a warning that prevents compilation with gcc 4.8.1 due to detecting that a variable might be uninitialised. The fix is to assign a safe default.	2014-03-07 15:56:23 -05:00
Ali Saidi	bf39a475fe	mem: Wakeup sleeping CPUs without caches on LLSC For systems without caches, the LLSC code does not get snoops for wake-ups. We add the LLSC code in the abstract memory to do the job for us.	2014-03-07 15:56:23 -05:00
Andreas Sandberg	f4a897d8e3	sim: Schedule the global sync event at curTick() + simQuantum The global synchronization event used to be scheduled at simQuantum. This prevented repeated entries into gem5 from Python as it can be scheduled in the past. This changeset ensures that the first global synchronization happens at curTick() + simQuantum instead.	2014-03-06 15:59:53 +01:00
Andreas Sandberg	be246cef62	x86: Setup correct TSL/TR segment attributes on INIT The TSL/LDT & TR/TSS segments didn't contain valid attributes. This caused problems when transferring the state into KVM where invalid state is a no-go. Fixup the attributes with values from AMD's architecture programmer's manual.	2014-03-03 14:44:57 +01:00
Andreas Sandberg	e7d230ede0	kvm: x86: Always assume segments to be usable When transferring segment registers into kvm, we need to find the value of the unusable bit. We used to assume that this could be inferred from the selector since segments are generally unusable if their selector is 0. This assumption breaks in some weird corner cases. Instead, we just assume that segments are always usable. This is what qemu does so it should work.	2014-03-03 14:34:33 +01:00
Andreas Sandberg	739cc0128b	kvm: Initialize signal handlers from startupThread() Signal handlers in KVM are controlled per thread and should be initialized from the thread that is going to execute the CPU. This changeset moves the initialization call from startup() to startupThread().	2014-03-03 14:31:39 +01:00
Nilay Vaish	5cd9dd29bd	ruby: message buffer: changes related to tracking push/pop times The last pop operation is now tracked as a Tick instead of in Cycles. This helps in avoiding use of the receiver's clock during the enqueue operation.	2014-03-01 23:59:58 -06:00
Nilay Vaish	67cd04b6fe	ruby: make the max_size variable of the MessageBuffer unsigned	2014-03-01 23:59:57 -06:00
Christopher Torng	919baa603d	cpu: Enable fast-forwarding for MIPS InOrderCPU and O3CPU A copyRegs() function is added to MIPS utilities to copy architectural state from the old CPU to the new CPU during fast-forwarding. This addition alone enables fast-forwarding for the o3 cpu model running MIPS. The patch also adds takeOverFrom() and drainResume() functions to the InOrderCPU to enable it to take over from another CPU. This change enables fast-forwarding for the inorder cpu model running MIPS, but not for Alpha. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-03-01 23:35:23 -06:00
Nilay Vaish	a533f3f983	ruby: profiler: statically allocate stats variable A couple of users observed a segmentation fault when the simulator tries to register the statistical variable m_IncompleteTimes. It seems that there is some problem with the initialization of these variables when allocated in the constructor.	2014-03-01 23:35:21 -06:00
Nilay Vaish	7e27860ef4	ruby: route all packets through ruby port Currently, the interrupt controller in x86 is connected to the io bus directly. Therefore the packets between the io devices and the interrupt controller do not go through ruby. This patch changes ruby port so that these packets arrive at the ruby port first, which then routes them to their destination. Note that the patch does not make these packets go through the ruby network. That would happen in a subsequent patch.	2014-02-23 19:16:16 -06:00
Andreas Hansson	5755fff998	ruby: Simplify RubyPort flow control and routing This patch simplifies the retry logic in the RubyPort, avoiding redundant attributes, and enforcing more stringent checks on the interactions with the normal ports. The patch also simplifies the routing done by the RubyPort, using the port identifiers instead of a heavy-weight sender state. The patch also fixes a bug in the sending of responses from PIO ports. Previously these responses bypassed the queue in the queued port, and ignored the return value, potentially leading to response packets being lost. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-02-23 19:16:16 -06:00
Nilay Vaish	7572ab71b5	ruby: message buffer: refactor code Code in two of the functions was exactly the same. This patch moves this code to a new function which is called from the two functions mentioned initially.	2014-02-23 19:16:15 -06:00
Nilay Vaish	cde20fd476	ruby: remove few not required #includes	2014-02-23 19:16:15 -06:00
Nilay Vaish	82378f7301	ruby: slicc: remove unused COPY_HEAD functionality	2014-02-23 19:16:15 -06:00
Nilay Vaish	13ad07601b	ruby: protocols: remove unused action z_stall	2014-02-23 19:16:15 -06:00
Nilay Vaish	cd33f9bc42	ruby: network: move message buffers to base network class.	2014-02-21 08:02:05 -06:00
Nilay Vaish	bd8f954526	ruby: network: garnet: fixed: removes net_ptr from links	2014-02-21 08:02:04 -06:00
Nilay Vaish	307f53e164	ruby: cache: remove not required variable m_cache_name	2014-02-21 08:02:02 -06:00
Nilay Vaish	f8f8b7e5c2	ruby: network: garnet: fixed: removes next cycle functions At several places, there are functions that take a cycle value as input and performs some computation. Along with each such function, another function was being defined that simply added one more cycle to input and computed the same function. This patch removes this second copy of the function. Places where these functions were being called have been updated to use the original function with argument being current cycle + 1.	2014-02-20 17:28:01 -06:00
Nilay Vaish	896654746a	ruby: controller: slight code refactoring	2014-02-20 17:27:45 -06:00
Nilay Vaish	0ce8c25919	ruby: mesi three level: rename incorrectly named files Two files had been incorrectly named with a .cache suffix. --HG-- rename : src/mem/protocol/MESI_Three_Level-L0.cache => src/mem/protocol/MESI_Three_Level-L0cache.sm rename : src/mem/protocol/MESI_Three_Level-L1.cache => src/mem/protocol/MESI_Three_Level-L1cache.sm	2014-02-20 17:27:17 -06:00
Nilay Vaish	db5b3d37fe	ruby: network: removes unused code.	2014-02-20 17:27:07 -06:00
Nilay Vaish	dd5c72e5a7	ruby: slicc: slight code refactoring	2014-02-20 17:26:49 -06:00
Nilay Vaish	b312a41f21	ruby: message buffer: removes some unnecessary functions.	2014-02-20 17:26:41 -06:00
Andreas Sandberg	0d6009e8dc	kvm: Add support for multi-system simulation The introduction of parallel event queues added most of the support needed to run multiple VMs (systems) within the same gem5 instance. This changeset fixes up signal delivery so that KVM's control signals are delivered to the thread that executes the CPU's event queue. Specifically: * Timers and counters are now initialized from a separate method (startupThread) that is scheduled as the first event in the thread-specific event queue. This ensures that they are initialized from the thread that is going to execute the CPU's event queue and enables signal delivery to the right thread when exiting from KVM. * The POSIX-timer-based KVM timer (used to force exits from KVM) has been updated to deliver signals to the thread that's executing KVM instead of the process (thread is undefined in that case). This assumes that the timer is instantiated from the thread that is going to execute the KVM vCPU. * Signal masking is now done using pthread_sigmask instead of sigprocmask. The behavior of the latter is undefined in threaded applications. * Since signal masks can be inherited, make sure to actively unmask the control signals when setting up the KVM signal mask. There are currently no facilities to multiplex between multiple KVM CPUs in the same event queue, we are therefore limited to configurations where there is only one KVM CPU per event queue. In practice, this means that multi-system configurations can be simulated, but not multiple CPUs in a shared-memory configuration.	2014-02-20 15:43:53 +01:00
Andreas Hansson	4b81585c49	mem: Fix bug in PhysicalMemory use of mmap and munmap This patch fixes a bug in how physical memory used to be mapped and unmapped. Previously we unmapped and re-mapped if restoring from a checkpoint. However, we never checked that the new mapping was actually the same, it was just magically working as the OS seems to fairly reliably give us the same chunk back. This patch fixes this issue by relying entirely on the mmap call in the constructor.	2014-02-18 05:51:01 -05:00
Andreas Hansson	f0ea79c41f	dev: Include basic devices in NULL ISA build This patch enables use of the basic PIO devices as part of the NULL build. Although it might seem counter intuitive to have a PIO device without being able to execute a driver, this change enables us to break a device class hierarchy into an ISA-agnostic part, and an ISA-specific part, without requiring multiple-inheritance. The ISA-agnostic base class is a PIO device, but does not make use of the port.	2014-02-18 05:50:59 -05:00
Andreas Hansson	969b436243	mem: Filter cache snoops based on address ranges This patch adds a filter to the cache to drop snoop requests that are not for a range covered by the cache. This fixes an issue observed when multiple caches are placed in parallel, covering different address ranges. Without this patch, all the caches will forward the snoop upwards, when only one should do so.	2014-02-18 05:50:58 -05:00
Andreas Hansson	bf2f178f85	mem: Add a wrapped DRAMSim2 memory controller This patch adds DRAMSim2 as a memory controller by wrapping the external library and creating a subclass of AbstractMemory that bridges between the semantics of gem5 and the DRAMSim2 interface. The DRAMSim2 wrapper extracts the clock period from the config file. There is no way of extracting this information from DRAMSim2 itself, so we simply read the same config file and get it from there. To properly model the response queue, the wrapper keeps track of how many transactions are in the actual controller, and how many are stacking up waiting to be sent back as responses (in the wrapper). The latter requires us to move away from the queued port and manage the packets ourselves. This is due to DRAMSim2 not having any flow control on the response path. DRAMSim2 assumes that the transactions it is given are matching the burst size of the chosen memory. The wrapper checks to ensure the cache line size of the system matches the burst size of DRAMSim2 as there are currently no provisions to split the system requests. In theory we could allow a cache line size smaller than the burst size, but that would lead to inefficient use of the DRAM, so for now we fatal also in this case.	2014-02-18 05:50:53 -05:00
Andreas Hansson	c9cb492e1c	mem: Fix input to DPRINTF in CommMonitor Minor fix of the debug message parameters.	2014-02-18 05:50:51 -05:00
Andreas Sandberg	c52190a695	cpu: simple: Add support for using branch predictors This changeset adds branch predictor support to the BaseSimpleCPU. The simple CPUs normally don't need a branch predictor, however, there are at least two cases where it can be desirable: 1) A simple CPU can be used to warm the branch predictor of an O3 CPU before switching to the slower O3 model. 2) The simple CPU can be used as a quick way of evaluating/debugging new branch predictors since it exposes branch predictor statistics. Limitations: * Since the simple CPU doesn't speculate, only one instruction will be active in the branch predictor at a time (i.e., the branch predictor will never see speculative branches). * The outcome of a branch prediction does not affect the performance of the simple CPU.	2014-02-09 20:49:28 +01:00
Nilay Vaish	eb73a14fe2	base: calls abort() from fatal Currently fatal() ends the simulation in a normal fashion. This results in the call stack getting lost when using a debugger and it is not always possible to debug the simulation just from the information provided by the printed error message. Even though the error is likely due to a user's fault, the information available should not be thrown away. Hence, this patch to call abort() from fatal().	2014-02-06 16:30:13 -06:00
Nilay Vaish	bb0e9119e7	ruby: memory controller: use MemoryNode *	2014-02-06 16:30:12 -06:00
Andreas Sandberg	e76a37985f	x86: Fix x87 state transfer bug Changeset 7274310be1bb (isa: clean up register constants) increased the value of NumFloatRegs, which triggered a bug in X86ISA::copyRegs(). This bug is caused by the x87 stack being copied twice since register indexes past NUM_FLOATREGS are mapped into the x87 stack relative to the top of the stack, which is undefined when the copy takes place. This changeset updates the copyRegs() function to access registers using the non-flattening interface, which guarantees that undesirable register folding does not happen.	2014-02-05 14:08:13 +01:00
Nikos Nikoleris	c6279f2d19	x86, kvm: Fix bug in the RFlags get and set functions The getRFlags and setRFlags utility functions were not updated correctly when condition registers were separated into their own register class. This led to incorrect state transfer in calls from kvm into the simulator (e.g., m5 readfile ended up in an infinite loop) and when switching CPUs. This patch makes these utility functions use getCCReg and setCCReg instead of getIntReg and setIntReg which read and write the integer registers. Reviewed-by: Andreas Sandberg <andreas@sandberg.pp.se>	2014-02-02 16:37:35 +01:00
Ola Jeppsson	7f16951451	unittest: Fix build errors Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-30 12:21:58 -06:00
Mitch Hayenga	96317d466e	mem: Add additional tolerance to stride prefetcher Forces the prefetcher to mispredict twice in a row before resetting the confidence of prefetching. This helps cases where a load PC strides by a constant factor, however it may operate on different arrays at times. Avoids the cost of retraining. Primarily helps with small iteration loops. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-29 23:21:26 -06:00
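The tolerance described in 96317d466e can be illustrated with a small sketch. This is not the gem5 implementation: the class name `StrideEntry`, its fields, and the single-bit confidence are assumptions made for illustration. It only shows the idea that one mispredicted stride is tolerated and a second consecutive one triggers retraining.

```python
class StrideEntry:
    """Per-PC stride state (hypothetical names, not gem5's classes)."""

    def __init__(self, addr):
        self.last_addr = addr
        self.stride = 0
        self.confident = False
        self.misses_in_a_row = 0

    def observe(self, addr):
        """Train on a new address; return a prefetch address or None."""
        new_stride = addr - self.last_addr
        self.last_addr = addr
        if new_stride == self.stride and new_stride != 0:
            # Stride confirmed: keep (or gain) confidence.
            self.confident = True
            self.misses_in_a_row = 0
        else:
            self.misses_in_a_row += 1
            if self.misses_in_a_row >= 2:
                # Second consecutive mispredict: drop confidence and retrain.
                self.stride = new_stride
                self.confident = False
                self.misses_in_a_row = 0
        return addr + self.stride if self.confident else None


entry = StrideEntry(0)
# Stride-64 walk over one array, a jump to a second array, then stride 64 again.
results = [entry.observe(a) for a in (64, 128, 192, 1000, 1064)]
```

In this sketch the jump to address 1000 counts as a single mispredict, so the learned stride of 64 survives and prefetching continues on the second array without retraining.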
Mitch Hayenga	771c864bf4	mem: Allowed tagged instruction prefetching in stride prefetcher For systems with a tightly coupled L2, a stride-based prefetcher may observe access requests from both instruction and data L1 caches. However, the PC address of an instruction miss gives no relevant training information to the stride-based prefetcher (there is no stride to train). In these cases, it's better if the L2 stride prefetcher simply reverted back to a simple N-block ahead prefetcher. This patch enables this option. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-29 23:21:26 -06:00
Mitch Hayenga, Amin Farmahini <aminfar@gmail.com>	95735e10e7	mem: prefetcher: add options, support for unaligned addresses This patch extends the classic prefetcher to work on non-block aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64 byte cachelines, the current stride-based prefetcher would see an access pattern of 0, 64, 64, 128, 192, ..., and thus would not detect a constant stride pattern. This patch fixes this by training the prefetcher on access and not masking off the lower address bits. It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data accesses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy. Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean).	2014-01-29 23:21:25 -06:00
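The masking problem in 95735e10e7 is easy to reproduce with a few lines of arithmetic. The sketch below assumes a starting address of 16, chosen only because it yields the pattern quoted in the message; the helper name `strides` is hypothetical.

```python
BLOCK_SIZE = 64  # cache line size in bytes, as in the commit message


def strides(addrs):
    """Differences between consecutive addresses."""
    return [b - a for a, b in zip(addrs, addrs[1:])]


# A load striding by 48 bytes (starting address 16 is an assumption).
accesses = [16 + 48 * i for i in range(5)]

# What a prefetcher sees if the low address bits are masked off.
masked = [a & ~(BLOCK_SIZE - 1) for a in accesses]

print(masked, strides(masked))      # irregular strides after masking
print(accesses, strides(accesses))  # constant stride on raw addresses
```

On the masked addresses the stride alternates between 64 and 0, so a stride table never gains confidence, while the unmasked addresses show a constant stride of 48.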
Xiangyu Dong	32cc2ea8b9	cpu: fix bug when TrafficGen deschedules event Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-29 22:35:04 -06:00
Mitch Hayenga	b77ca57f8c	arm: Enable umask syscall in SE mode Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-28 18:00:51 -06:00
Mitch Hayenga	55a4ff5f04	base: Fix race condition in the socket listen function gem5 makes the incorrect assumption that by binding a socket, it effectively has allocated a port. Linux only allocates ports once you call listen on the given socket, not when you call bind. So even if the port was free when bind was called, another process (gem5 instance) could race in between the bind & listen calls and steal the port. In the current code, if the call to bind fails due to the port being in use (EADDRINUSE), gem5 retries for a different port. However if listen fails, gem5 just panics. The fix is testing the return value of listen and re-trying if it was due to EADDRINUSE. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-28 18:00:51 -06:00
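The retry logic described in 55a4ff5f04 can be sketched as follows. This is an illustration in Python rather than gem5's C++ code; the function name `listen_on_free_port`, the loopback address, and the `max_tries` limit are assumptions. The point is that EADDRINUSE is handled the same way whether it comes from `bind` or from `listen`.

```python
import errno
import socket


def listen_on_free_port(start_port, max_tries=100):
    """Return (socket, port) for the first port that both binds and listens."""
    for port in range(start_port, start_port + max_tries):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind(("127.0.0.1", port))
            # listen() can also fail with EADDRINUSE if another process
            # claimed the port between the two calls, so it is checked too.
            sock.listen(1)
        except OSError as exc:
            sock.close()
            if exc.errno == errno.EADDRINUSE:
                continue  # try the next port instead of giving up
            raise
        return sock, port
    raise RuntimeError("no free port found")
```

A caller would keep the returned socket open for as long as the port is needed and close it afterwards.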
Amin Farmahini	ffbdaa7cce	mem: Remove redundant findVictim() input argument The patch (1) removes the redundant writeback argument from findVictim() (2) fixes the description of access() function Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-28 18:00:50 -06:00
Amin Farmahini	575a73f4a1	mem: Fixes a bug in simple_dram write merging Fixes updating the value of size in the write merge function. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-01-28 18:00:49 -06:00
Nilay Vaish	bdee69d0b1	x86: use lfpimm instead of limm for fptan	2014-01-27 18:50:54 -06:00
Nilay Vaish	6a543b5134	x86: implements x87 add/sub instructions	2014-01-27 18:50:53 -06:00
Nilay Vaish	5be0b846b1	x86: implements fxch instruction.	2014-01-27 18:50:52 -06:00
Nilay Vaish	4eb3b1ed0b	x86: correct error in emms instruction.	2014-01-27 18:50:51 -06:00
ARM gem5 Developers	612f8f074f	arm: Add support for ARMv8 (AArch64 & AArch32) Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64 kernel you are restricted to AArch64 user-mode binaries. This will be addressed in a later patch. Note: Virtualization is only supported in AArch32 mode. This will also be fixed in a later patch. Contributors: Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation) Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation) Mbou Eyole (AArch64 NEON, validation) Ali Saidi (AArch64 Linux support, code integration, validation) Edmund Grimley-Evans (AArch64 FP) William Wang (AArch64 Linux support) Rene De Jong (AArch64 Linux support, performance opt.) Matt Horsnell (AArch64 MP, validation) Matt Evans (device models, code integration, validation) Chris Adeniyi-Jones (AArch64 syscall-emulation) Prakash Ramrakhyani (validation) Dam Sunwoo (validation) Chander Sudanthi (validation) Stephan Diestelhorst (validation) Andreas Hansson (code integration, performance opt.) Eric Van Hensbergen (performance opt.) Gabe Black	2014-01-24 15:29:34 -06:00
Andreas Hansson	cfc4a99982	arch: Make all register index flattening const This patch makes all the register index flattening methods const for all the ISAs. As part of this, readMiscRegNoEffect for ARM is also made const.	2014-01-24 15:29:30 -06:00
Geoffrey Blake	9633282fc8	checker: CheckerCPU handling of MiscRegs was incorrect The CheckerCPU model in pre-v8 code was not checking the updates to miscellaneous registers because some methods for setting misc regs were not instrumented. The v8 patches exposed this by calling the instrumented misc reg update methods and then invoking the checker before the main CPU had updated its misc regs, leading to false positives about register mismatches. This patch fixes the non-instrumented misc reg update methods and places calls to the checker in the proper places in the O3 model.	2014-01-24 15:29:30 -06:00
Ali Saidi	7d0344704a	arch, cpu: Add support for flattening misc register indexes. With ARMv8 support the same misc register id results in accessing different registers depending on the current mode of the processor. This patch adds the same orthogonality to the misc register file as the others (int, float, cc). For all the other ISAs this is currently a null-implementation. Additionally, a system variable is added to all the ISA objects.	2014-01-24 15:29:30 -06:00
Giacomo Gabrielli	3436de0c2a	cpu: Add support for Memory+Barrier instruction types in O3 cpu.	2014-01-24 15:29:30 -06:00
Ali Saidi	90b1775a8f	cpu: Add support for instructions that zero cache lines.	2014-01-24 15:29:30 -06:00
Ali Saidi	6bed6e0352	cpu: Add CPU support for generating wake up events when LLSC addresses are snooped. This patch adds support for generating wake-up events in the CPU when an address that is currently in the exclusive state is hit by a snoop. This mechanism is required for ARMv8 multi-processor support.	2014-01-24 15:29:30 -06:00
Giacomo Gabrielli	d3444c6603	mem: Add flag to request if it was generated by a page table walk	2014-01-24 15:29:30 -06:00
Giacomo Gabrielli	aefe9cc624	mem: Add support for a security bit in the memory system This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.	2014-01-24 15:29:30 -06:00
Chris Adeniyi-Jones	7f835a59f1	sim: Add openat/fstatat syscalls and fix mremap This patch adds support for the openat and fstatat syscalls and broadens the support for mremap to make it work on OS X.	2014-01-24 15:29:30 -06:00
Ali Saidi	904872a01a	mem: Remove explicit cast from memhelper. Previously we were casting the result type to the memory type, which is incorrect for things like dual-memory operations which still return a single result.	2014-01-24 15:29:30 -06:00
Timothy M. Jones	427ceb57a9	Cache: Collect very basic stats on tag and data accesses Adds very basic statistics on the number of tag and data accesses within the cache, which is important for power modelling. For the tags, simply count the associativity of the cache each time. For the data, this depends on whether tags and data are accessed sequentially, which is given by a new parameter. In the parallel case, all data blocks are accessed each time, but with sequential accesses, a single data block is accessed only on a hit.	2014-01-24 15:29:30 -06:00
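The counting rule described in 427ceb57a9 can be written out as a short sketch. The function name `count_accesses` and its arguments are hypothetical; only the rule itself (tags counted by associativity, data counted according to the sequential/parallel setting) comes from the message.

```python
def count_accesses(hits, assoc, sequential):
    """Count tag and data array accesses for a series of cache lookups.

    hits: iterable of booleans, one per lookup (True means a hit).
    assoc: associativity of the cache.
    sequential: True if tags are checked before data is read.
    """
    tag_accesses = 0
    data_accesses = 0
    for hit in hits:
        # Every lookup compares against all tags in the set.
        tag_accesses += assoc
        if sequential:
            # Data is read only after a tag match, and only one block.
            data_accesses += 1 if hit else 0
        else:
            # Parallel access reads every data block in the set.
            data_accesses += assoc
    return tag_accesses, data_accesses


lookups = [True, False, True]
sequential_counts = count_accesses(lookups, assoc=4, sequential=True)
parallel_counts = count_accesses(lookups, assoc=4, sequential=False)
```

For a 4-way cache and these three lookups, the tag count is the same in both modes, while the data count differs by a factor of six, which is the kind of difference a power model would care about.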
Dam Sunwoo	85e8779de7	mem: per-thread cache occupancy and per-block ages This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.	2014-01-24 15:29:30 -06:00
Matt Horsnell	739c6df94e	base: add support for probe points and common probes The probe patch is motivated by the desire to move analytical and trace code away from functional code. This is achieved by the probe interface which is essentially a glorified observer model. What this means to users: * add a probe point and a "notify" call at the source of an "event" * add an isolated module, that is being used to carry out your analysis (e.g. generate a trace) * register that module as a probe listener Note: an example is given for reference in src/cpu/o3/simple_trace.[hh|cc] and src/cpu/SimpleTrace.py What is happening under the hood: * every SimObject has a ProbeManager. * during initialization (src/python/m5/simulate.py) first regProbePoints and then regProbeListeners is called on each SimObject. This hooks up the probe point notify calls with the listeners. FAQs: Why did you develop probe points: * to remove trace, stats gathering, analytical code out of the functional code. * the belief that probes could be generically useful. What is a probe point: * a probe point is used to notify upon a given event (e.g. cpu commits an instruction) What is a probe listener: * a class that handles whatever the user wishes to do when they are notified about an event. What can be passed on notify: * probe points are templates, and so the user can generate probes that pass any type of argument (by const reference) to a listener. What relationships can be generated (1:1, 1:N, N:M etc): * there isn't a restriction. You can hook probe points and listeners up in a 1:1, 1:N, N:M relationship. They become useful when a number of modules listen to the same probe points. The idea being that you can add a small number of probes into the source code and develop a larger number of useful analysis modules that use information passed by the probes.
Can you give examples: * adding a probe point to the cpu's commit method allows you to build a trace module (outputting assembler), you could re-use this to gather instruction distribution (arithmetic, load/store, conditional, control flow) stats. Why is the probe interface currently restricted to passing a const reference: * the desire, initially at least, is to allow an interface to observe functionality, but not to change functionality. * of course this can be subverted by const-casting. What is the performance impact of adding probes: * when nothing is actively listening to the probes they should have a relatively minor impact. Profiling has suggested even with a large number of probes (60) the impact of them (when not active) is very minimal (<1%).	2014-01-24 15:29:30 -06:00
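The observer model behind 739c6df94e can be sketched in a few lines. This is a Python illustration of the pattern, not the gem5 C++ interface: the class `ProbePoint`, the method `add_listener`, and the instruction dictionaries are all assumptions made for the example.

```python
from collections import Counter


class ProbePoint:
    """A named notification source; listeners are plain callables."""

    def __init__(self, name):
        self.name = name
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def notify(self, arg):
        # With no listeners attached this loop does nothing.
        for listener in self._listeners:
            listener(arg)


# Hypothetical "commit" probe with two independent analysis modules (1:N).
commit_probe = ProbePoint("commit")

trace = []
commit_probe.add_listener(lambda inst: trace.append(inst["asm"]))

inst_mix = Counter()
commit_probe.add_listener(lambda inst: inst_mix.update([inst["kind"]]))

# The functional code only calls notify() where the event occurs.
for inst in (
    {"asm": "add r1, r2, r3", "kind": "arithmetic"},
    {"asm": "ldr r4, [r1]", "kind": "load/store"},
    {"asm": "add r5, r4, r1", "kind": "arithmetic"},
):
    commit_probe.notify(inst)
```

The same probe point feeds both a trace and an instruction-mix counter, which mirrors the message's example of reusing one commit probe for several analysis modules.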
Andreas Hansson	4de69821e6	sim: Expose the current voltage for each object as a stat	2014-01-24 15:29:30 -06:00
Andreas Hansson	1d85e914a6	sim: Expose the current clock period as a stat This patch adds observability to the clock period of the clock domains by including it as a stat. As a result of adding this, the regressions will be updated in a separate patch.	2014-01-24 15:29:30 -06:00
Matt Horsnell	ca89eba79e	mem: track per-request latencies and access depths in the cache hierarchy Add some values and methods to the request object to track the translation and access latency for a request and which level of the cache hierarchy responded to the request.	2014-01-24 15:29:30 -06:00
Andreas Hansson	daa781d2db	config: Make the Clock a Tick parameter like Latency/Frequency This patch makes the Clock a TickParamValue just like Latency/Frequency. There is no longer any need to distinguish it (originally needed to support multiplication).	2014-01-24 15:29:29 -06:00
Andreas Hansson	f2b0b551cc	x86: Fix memory leak in table walker This patch fixes a memory leak in the table walker, by ensuring that the sender state is deleted again if the request packet cannot be successfully sent.	2014-01-24 15:29:29 -06:00
Andreas Hansson	7db542c0dd	cpu: Relax check on squashed non-speculative instructions This patch relaxes the check performed when squashing non-speculative instructions, as it caused problems with loads that were marked ready, and then stalled on a blocked cache. The assertion is now allowing memory references to be non-faulting.	2014-01-24 15:29:29 -06:00
Dam Sunwoo	f1cd6b1ba8	cpu: remove faulty simpoint basic block inst count assertion This patch removes an assertion in the simpoint profiling code that asserts that a previously-seen basic block has the exact same number of instructions executed as before. This can be false if the basic block generates aborts or takes interrupts at different locations within the basic block. The basic block profiling is not affected significantly as these events are rare in general.	2014-01-24 15:29:29 -06:00
Nilay Vaish	37433d91a3	ruby: remove unused label no_vector	2014-01-17 11:02:15 -06:00
Nilay Vaish	407f37e15f	ruby: move all statistics to stats.txt, eliminate ruby.stats	2014-01-10 16:19:47 -06:00
Nilay Vaish	cfe912a512	stats: add function for adding two histograms This patch adds a function to the HistStor class for adding two histograms. This functionality is required for Ruby. It also adds support for printing histograms in a single line.	2014-01-10 16:19:40 -06:00
Nilay Vaish	0387281e2a	ruby: fix bug introduced to revision 8523754f8885	2014-01-09 10:45:50 -06:00
Nilay Vaish	8559081648	ruby: slicc: remove variable 'addr' used in calls to doTransition This variable causes trouble if a variable of same name is declared in a protocol file. Hence it is being eliminated.	2014-01-08 04:26:25 -06:00
Nilay Vaish	4070b00875	ruby: add a three level MESI protocol. The first two levels (L0, L1) are private to the core, the third level (L2) is possibly shared. The protocol supports clustered designs. For example, one can have two sets of two cores. Each core has an L0 and L1 cache. There are two L2 controllers where each set accesses only one of the L2 controllers.	2014-01-04 00:03:34 -06:00
Nilay Vaish	bb6d7d402b	ruby: rename MESI_CMP_directory to MESI_Two_Level This is because the next patch introduces a three level hierarchy. --HG-- rename : build_opts/ALPHA_MESI_CMP_directory => build_opts/ALPHA_MESI_Two_Level rename : build_opts/X86_MESI_CMP_directory => build_opts/X86_MESI_Two_Level rename : configs/ruby/MESI_CMP_directory.py => configs/ruby/MESI_Two_Level.py rename : src/mem/protocol/MESI_CMP_directory-L1cache.sm => src/mem/protocol/MESI_Two_Level-L1cache.sm rename : src/mem/protocol/MESI_CMP_directory-L2cache.sm => src/mem/protocol/MESI_Two_Level-L2cache.sm rename : src/mem/protocol/MESI_CMP_directory-dir.sm => src/mem/protocol/MESI_Two_Level-dir.sm rename : src/mem/protocol/MESI_CMP_directory-dma.sm => src/mem/protocol/MESI_Two_Level-dma.sm rename : src/mem/protocol/MESI_CMP_directory-msg.sm => src/mem/protocol/MESI_Two_Level-msg.sm rename : src/mem/protocol/MESI_CMP_directory.slicc => src/mem/protocol/MESI_Two_Level.slicc rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/config.ini rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/ruby.stats rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simerr => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simerr rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simout rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/stats.txt rename : 
tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/system.pc.com_1.terminal rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simerr rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simout rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/stats.txt rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simerr rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simout rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/stats.txt => 
tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/stats.txt rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simerr => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simerr rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simout => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simout rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simerr => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simerr rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simout => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simout rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/stats.txt	2014-01-04 00:03:33 -06:00
Nilay Vaish	5b1804e3bd	ruby: add support for clusters A cluster over here means a set of controllers that can be accessed only by a certain set of cores. For example, consider a two level hierarchy. Assume there are 4 L1 controllers (private) and 2 L2 controllers. We can have two different hierarchies here: a. the address space is partitioned between the two L2 controllers. Each L1 controller accesses both the L2 controllers. In this case, each L1 controller is a cluster in itself. b. both the L2 controllers can cache any address. An L1 controller has access to only one of the L2 controllers. In this case, each L2 controller, along with the L1 controllers that access it, forms a cluster. This patch allows for each controller to have a cluster ID, which is 0 by default. By setting the cluster ID properly, one can instantiate hierarchies with clusters. Note that the coherence protocol might have to be changed as well.	2014-01-04 00:03:31 -06:00
Nilay Vaish	9853ef6651	ruby: some small changes	2014-01-04 00:03:30 -06:00
Steve Reinhardt	d8c9b5431b	python: provide better error message for wrapped C++ methods If you successfully export a C++ SimObject method, but try to invoke it from Python before the C++ object is created, you get a confusing error that says the attribute does not exist, making you question whether you successfully exported the method at all. In reality, your only problem is that you're calling the method too soon. This patch enhances the error message to give you a better clue.	2014-01-03 17:08:43 -08:00
Steve Reinhardt	ba9ec669bc	python: don't die on assignment to cloned object Updating the SimObject topology of a cloned hierarchy is a little dangerous, in that cloning is a "deep copy" and the clone does not inherit SimObject updates the same way it would inherit scalar variable assignments. However, because of various SimObject-valued proxy parameters, like 'memories', 'clk_domain', and 'system', it turns out that there are a number of implicit topology changes that happen at instantiation, which means that these changes are impossible to avoid. So in order to make cloning systems useful, this error has to go. Changing it to a warning produces a lot of noise, so it seems best just to delete it.	2014-01-03 17:08:42 -08:00
Christopher Torng	b4b03a60b1	sim: Add support for dynamic frequency scaling This patch provides support for DFS by having ClockedObjects register themselves with their clock domain at construction time in a member list. Using this list, a clock domain can update each member's tick to the curTick() before modifying the clock period. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-12-29 19:29:45 -06:00
Christopher Torng	903b442228	mips: Floating point convert bug fix In mips architecture, floating point convert instructions use the FloatConvertOp format defined in src/arch/mips/isa/formats/fp.isa. The type of the operands in the ISA description file (_sw for signed word, or _sf for signed float, etc.) is used to create a type for the operand in C++. Then the operand is converted using the fpConvert() function in src/arch/mips/utility.cc. If we are converting from a word to a float, and we want to convert 0xffffffff, we expect -1 to be passed into fpConvert(). Instead, we see MAX_INT passed in. Then fpConvert() converts _val_ to MAX_INT in single-precision floating point, and we get the wrong value. To fix it, the signs of the convert operands are being changed from unsigned to signed in the MIPS ISA description. Then, the FloatConvertOp format is being changed to insert a int32_t into the C++ code instead of a uint32_t. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-12-29 19:29:45 -06:00
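The signedness problem in 903b442228 can be reproduced outside the simulator. The sketch below uses Python's `ctypes` fixed-width integers as a stand-in for the C++ operand types; it is not the MIPS ISA description code, and the variable names are hypothetical.

```python
import ctypes

raw = 0xFFFFFFFF  # bit pattern of the word operand

# Read as uint32_t (the old operand type) the value is a large positive
# number, so the word-to-float conversion gives the wrong result.
as_unsigned = float(ctypes.c_uint32(raw).value)

# Read as int32_t (the fixed operand type) the same bits mean -1.
as_signed = float(ctypes.c_int32(raw).value)

print(as_unsigned, as_signed)
```

The two results differ only in how the same 32 bits are interpreted before the conversion, which is why changing the operand type is enough to fix the instruction.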
Nilay Vaish	d71311b1cf	ruby: fix bugs in mesi cmp directory protocol This patch fixes a couple of bugs in the L2 controller of the mesi cmp directory protocol. 1. The state MT_I was transitioning to NP on receiving a clean writeback from the L1 controller. This patch makes it inform the directory controller about the writeback. 2. The L2 controller was sending the dirty bit to the L1 controller and the L2 controller used writeback from the L1 controller to update the dirty bit unconditionally. Now, the L1 controller always assumes that the incoming data is clean. The L2 controller updates the dirty bit only when the L1 controller writes to the block. 3. Certain unused functions and events are being removed.	2013-12-26 15:18:55 -06:00
Nilay Vaish	fc53f9ffcc	ruby: slicc: replace max_in_port_rank with number of inports This patch replaces max_in_port_rank with the number of inports. The use of max_in_port_rank was causing spurious re-builds and incorrect initialization of variables in ruby related regression tests. This was due to the variable value being used across threads while compiling when it was not meant to be. Since the number of inports is state machine specific value, this problem should get solved.	2013-12-20 20:34:04 -06:00
Nilay Vaish	30b259a31e	ruby: declare variables to be unsigned in Address.hh	2013-12-20 20:34:03 -06:00
Nilay Vaish	f5b52a265a	ruby: mesi: remove owner and sharer fields from directory tags The directory controller should not have the sharer field since there is only one level 2 cache. Anyway the field was not in use. The owner field was being used to track the l2 cache version (in case of distributed l2) that has the cache block under consideration. The information is not required since the version of the level 2 cache can be obtained from a subset of the address bits.	2013-12-20 20:34:03 -06:00
Nilay Vaish	50d250f514	sim: reset stats after startup Currently statistics are reset after the initial / checkpoint state has been loaded. But ruby does some checkpoint processing in its startup() function. So the stats need to be reset after the startup() function has been called. This patch moves the call to stats.reset() to achieve this change in functionality.	2013-12-03 10:51:40 -06:00
Nilay Vaish	5800e83223	cpu: call BaseCPU startup() function in o3 cpu	2013-12-03 10:36:04 -06:00
Andreas Sandberg	c033ead992	base: Fix race in PollQueue and remove SIGALRM workaround There is a race between enabling asynchronous IO for a file descriptor and IO events happening on that descriptor. A SIGIO won't normally be delivered if an event is pending when asynchronous IO is enabled. Instead, the signal will be raised the next time there is an event on the FD. This changeset simulates a SIGIO by setting the async_io flag when setting up asynchronous IO for an FD. This causes the main event loop to poll all file descriptors to check for pending IO. As a consequence of this, the old SIGALRM hack should no longer be needed and is therefore removed.	2013-11-29 14:36:10 +01:00
Andreas Sandberg	9c57d5b5a6	base: Clean up signal handling The PollEvent class dynamically installs a SIGIO and SIGALRM handler when a file handler is registered. Most signal handlers currently get registered in the initSignals() function. This changeset moves the SIGIO/SIGALRM handlers to initSignals() to live with the other signal handlers. The original code installs SIGIO and SIGALRM with the SA_RESTART option to prevent syscalls from returning EINTR. This changeset consistently uses this flag for all signal handlers to ensure that other signals that trigger asynchronous behavior (e.g., statistics dumping) do not cause undesirable EINTR returns.	2013-11-29 14:35:36 +01:00
Nilay Vaish	9fb93e5cd2	sim: correct ticksToCycles() function.	2013-11-26 17:05:22 -06:00
Andreas Sandberg	4b8be6a90b	kvm: Set the perf exclude_host attribute if available The performance counting framework in Linux 3.2 and onwards supports an attribute to exclude events generated by the host when running KVM. Setting this attribute allows us to get more reliable measurements of the guest machine. For example, on a highly loaded system, the instruction counts from the guest can be severely distorted by the host kernel (e.g., by page fault handlers). This changeset introduces a check for the attribute and enables it in the KVM CPU if present.	2013-10-15 10:09:23 +02:00
Christian Menard	d4f205ea2f	x86: Implementation of Int3 and Int_Ib in long mode This is an implementation of the x86 int3 and int immediate instructions for long mode according to 'AMD64 Programmers Manual Volume 3'.	2013-11-26 17:51:07 +01:00
Andreas Sandberg	e5d63d0535	kvm: Remove the unused hostFreq member from BaseKvmCPU	2013-11-26 17:40:58 +01:00
Steve Reinhardt, Nilay Vaish <nilay@cs.wisc.edu>, Ali Saidi <Ali.Saidi@ARM.com>	de366a16f1	sim: simulate with multiple threads and event queues This patch adds support for simulating with multiple threads, each of which operates on an event queue. Each sim object specifies which eventq it would like to be on. A custom barrier implementation is added, which the eventqs use to synchronize. The patch was tested in two different configurations: 1. ruby_network_test.py: in this simulation L1 cache controllers receive requests from the cpu. The requests are replied to immediately without any communication taking place with any other level. 2. twosys-tsunami-simple-atomic: this configuration simulates a client-server system connected by an ethernet link. We still lack the ability to communicate using message buffers or ports. But other things like simulation start and end, synchronizing after every quantum are working. Committed by: Nilay Vaish	2013-11-25 11:21:00 -06:00
Anthony Gutierrez	8a53da22c2	cpu: allow the fetch buffer to be smaller than a cache line the current implementation of the fetch buffer in the o3 cpu is only allowed to be the size of a cache line. some architectures, e.g., ARM, have fetch buffers smaller than a cache line, see slide 22 at: http://www.arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf this patch allows the fetch buffer to be set to values smaller than a cache line.	2013-11-15 13:21:15 -05:00
Andreas Hansson	f028da7af7	cpu: Fix Checker register index use This patch fixes an issue in the checker CPU register indexing. The code will not even compile using LTO as deep inlining causes the used index to be outside the array bounds.	2013-11-15 03:47:10 -05:00
Steve Reinhardt	a2c21d47a8	tests: suppress output on switcheroo tests The output from the switcheroo tests is voluminous and (because it includes timestamps) highly sensitive to minor changes, leading to extremely large updates to the reference outputs. This patch addresses this problem by suppressing output from the tests. An internal parameter can be set to enable the output. Wiring that up to a command-line flag (perhaps even the rudimentary -v/-q options in m5/main.py) is left for future work.	2013-11-14 15:03:42 -08:00
Anthony Gutierrez	99d6c3b7e0	sim: fix event priority name for debug-start option	2013-11-12 11:46:48 -05:00
Andreas Hansson	460cc77d6d	mem: Fixes for DRAM stats accounting This patch fixes a number of stats accounting issues in the DRAM controller. Most importantly, it separates the system interface and DRAM interface so that it is clearer what the actual DRAM bandwidth (and consequently utilisation) is.	2013-11-01 11:56:31 -04:00
Andreas Hansson	ce93982cc6	mem: Fix the LPDDR3 page size This patch corrects the LPDDR3 page size, which was set too low.	2013-11-01 11:56:30 -04:00
Neha Agarwal	5c486908d7	mem: Adding stats for DRAM power calculation This patch adds stats which are used for offline power calculation from the 'Micron Power Calculator' spreadsheet.	2013-11-01 11:56:28 -04:00
Neha Agarwal	77fce1ce0e	mem: Unify request selection for read and write queues This patch unifies the request selection across read and write queues for FR-FCFS scheduling policy. It also fixes the request selection code to prioritize the row hits present in the request queues over the selection based on earliest bank availability.	2013-11-01 11:56:27 -04:00
Andreas Hansson	bb572663cf	mem: Add a simple adaptive version of the open-page policy This patch adds a basic adaptive version of the open-page policy that guides the decision to keep open or close by looking at the contents of the controller queues. If no row hits are found, and bank conflicts are present, then the row is closed by means of an auto precharge. This is a well-known technique that should improve performance in most use-cases.	2013-11-01 11:56:26 -04:00
Neha Agarwal	da6fd72f62	mem: Just-in-time write scheduling in DRAM controller This patch removes the untimed while loop in the write scheduling mechanism and now schedules commands taking into account the minimum timing constraint. It also introduces an optimization to track write queue size and switch from writes to reads if the number of write requests falls below the write low threshold.	2013-11-01 11:56:25 -04:00
Andreas Hansson	ee6b41a1e4	mem: Add tRRD as a timing parameter for the DRAM controller This patch adds the tRRD parameter to the DRAM controller. With the recent addition of the actAllowedAt member for each bank, this addition is trivial.	2013-11-01 11:56:24 -04:00


6266 commits