sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Matt Horsnell	e88e7d88b9	o3: fix tick used for renaming and issue with range selection Fixes the tick used from rename: - previously this gathered the tick on leaving rename which was always 1 less than the dispatch. This conflated the decode ticks when back pressure built in the pipeline. - now picks up tick on entry. Added --store_completions flag: - will additionally display the store completion tail in the viewer. - this highlights periods when large numbers of stores are outstanding (>16 LSQ blocking) Allows selection by tick range (previously this caused an infinite loop)	2013-02-15 17:40:09 -05:00
Andreas Sandberg	6459908069	arm: Don't export private GIC methods	2012-10-25 14:08:29 +01:00
Andreas Sandberg	81be8b9d15	arm: Create a GIC base class and make the PL390 derive from it This patch moves the GIC interface to a separate base class and makes all interrupt devices use that base class instead of a pointer to the PL390 implementation. This allows us to have multiple GIC implementations. Future implementations will allow in-kernel GIC implementations when using hardware virtualization. --HG-- rename : src/dev/arm/gic.cc => src/dev/arm/gic_pl390.cc rename : src/dev/arm/gic.hh => src/dev/arm/gic_pl390.hh	2012-10-25 14:05:24 +01:00
Andreas Sandberg	b904bd5437	sim: Add a system-global option to bypass caches Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.	2013-02-15 17:40:09 -05:00
Andreas Sandberg	1eec115c31	cpu: Refactor memory system checks CPUs need to test that the memory system is in the right mode in two places, when the CPU is initialized (unless it's switched out) and on a drainResume(). This led to some code duplication in the CPU models. This changeset introduces the verifyMemoryMode() method which is called by BaseCPU::init() if the CPU isn't switched out. The individual CPU models are responsible for calling this method when resuming from a drain as this code is CPU model specific.	2013-02-15 17:40:08 -05:00
Andreas Sandberg	e5dca84c3f	config: Move CPU handover logic to m5.switchCpus() CPU switching consists of the following steps: 1. Drain the system 2. Switch out old CPUs (cpu.switchOut()) 3. Change the system timing mode to the mode the new CPUs require 4. Flush caches if switching to hardware virtualization 5. Inform new CPUs of the handover (cpu.takeOverFrom()) 6. Resume the system m5.switchCpus() previously only did step 2 & 5. Since information about the new processors' memory system requirements is now exposed, do all of the steps above. This patch adds automatic memory system switching and flush (if needed) to switchCpus(). Additionally, it adds optional draining to switchCpus(). This has the following implications: * changeToTiming and changeToAtomic are no longer needed, so they have been removed. * changeMemoryMode is only used internally, so it is has been renamed to be private. * switchCpus requires a reference to the system containing the CPUs as its first parameter. WARNING: This changeset breaks compatibility with existing configuration scripts since it changes the signature of m5.switchCpus().	2013-02-15 17:40:08 -05:00
Andreas Sandberg	7f1263f144	cpu: Make checker CPUs inherit from CheckerCPU in the Python hierarchy Checker CPUs currently don't inherit from the CheckerCPU in the Python object hierarchy. This has two consequences: * It makes CPU model discovery from the Python world somewhat complicated as there is no way of testing if a CPU is a checker. * Parameters are duplicated in the checker configuration specification. This changeset makes all checker CPUs inherit from the base checker CPU class.	2013-02-15 17:40:08 -05:00
Andreas Sandberg	7cd1fd4324	cpu: Add CPU metadata om the Python classes The configuration scripts currently hard-code the requirements of each CPU. This is clearly not optimal as it makes writing new configuration scripts painful and adding new CPU models requires existing scripts to be updated. This patch adds the following class methods to the base CPU and all relevant CPUs: * memory_mode -- Return a string describing the current memory mode (invalid/atomic/timing). * require_caches -- Does the CPU model require caches? * support_take_over -- Does the CPU support CPU handover?	2013-02-15 17:40:08 -05:00
Ali Saidi	db5c478e70	arm: fix some fp comparisons that worked by accident. The explict tests in the follwing fp comparison operations were incorrect as they checked for only signaling NaNs and not quite-NaNs as well. When compiled with gcc, the comparison generates a fp exception that causes the FE_INVALID flag to be set and we check for it, so even though the check was incorrect, the correct exception was set. With clang this behavior seems to not occur. The checks are updated to test for nans and the behavior is now correct with both clang and gcc.	2013-02-15 17:40:08 -05:00
Ali Saidi	4412046041	cpu: include set in o3/commit_impl. While the majority of compilers seemed to pickup set from else where, one version of gcc 4.7 complains, so explictly add it.	2013-02-15 17:40:08 -05:00
Ali Saidi	68495a0748	ARM: Fix an issue with clang generating wrong code. Clang generated executables would enter the if condition when it wasn't supposted to, resulting in the wrong simulated behavior. Implementing the operation this way is a bit faster anyway.	2013-02-15 17:40:08 -05:00
Ali Saidi	7ae06a3b3b	cpu: fix case with o3 cpu blocking and unblocking decode in cycle Fix a case in the O3 CPU where the decode stage blocks and unblocks in a single cycle sending both signals to fetch which causes an assert or worse. The previous check could never work before since the status was set to Blocked before a test for the status being Unblocking was executed.	2013-02-15 17:40:08 -05:00
Ali Saidi	b84bd3028c	cpu: Fix a livelock in the o3 cpu. Check if an instruction just enabled interrupts and we've previously had an interrupt pending that was not handled because interrupts were subsequently disabled before the pipeline reached a place to handle the interrupt. In that case squash now to make sure the interrupt is handled.	2013-02-15 17:40:07 -05:00
Andreas Sandberg	d4eca0591d	base: Add support for newer versions of IPython IPython is used for the interactive gem5 shell if it exists. IPython made API changes in version 0.11. This patch adds support for IPython version 0.11 and above. --HG-- extra : rebase_source : 5388d0919adb58d97f49a1a637db48cba61283a3	2013-02-10 13:23:58 +01:00
Andreas Hansson	7c6bc52bf5	Ruby: Fix compilation errors on gcc 4.7 and clang 3.2 This patch fixes a few (recently added) errors that prevented gem5 from compiling on more recent versions of gcc and clang.	2013-02-14 12:24:51 -05:00
Nilay Vaish	71c27e6370	ruby: MI protocol: add a missing transition The transition for state MII and event Store was found missing during testing. The transition is being added. The controller will not stall the Store request in state MII	2013-02-10 21:43:18 -06:00
Nilay Vaish	cb7782f78d	ruby: enable multiple clock domains This patch allows ruby to have multiple clock domains. As I understand with this patch, controllers can have different frequencies. The entire network needs to run at a single frequency. The idea is that with in an object, time is treated in terms of cycles. But the messages that are passed from one entity to another should contain the time in Ticks. As of now, this is only true for the message buffers, but not for the links in the network. As I understand the code, all the entities in different networks (simple, garnet-fixed, garnet-flexible) should be clocked at the same frequency. Another problem is that the directory controller has to operate at the same frequency as the ruby system. This is because the memory controller does not make use of the Message Buffer, and instead implements a buffer of its own. So, it has no idea of the frequency at which the directory controller is operating and uses ruby system's frequency for scheduling events.	2013-02-10 21:43:17 -06:00
Nilay Vaish	253e8edf13	ruby: replace Time with Cycles (final patch in the series) This patch is as of now the final patch in the series of patches that replace Time with Cycles.This patch further replaces Time with Cycles in Sequencer, Profiler, different protocols and related entities. Though Time has not been completely removed, the places where it is in use seem benign as of now.	2013-02-10 21:43:10 -06:00
Nilay Vaish	f6e3ab7bd4	ruby: replace Time with Cycles in garnet fixed and flexible	2013-02-10 21:43:09 -06:00
Nilay Vaish	9d6d6c6718	ruby: replace Time with Tick in replacement policy classes	2013-02-10 21:43:08 -06:00
Nilay Vaish	221d39284e	ruby: convert block size, memory size to unsigned	2013-02-10 21:43:07 -06:00
Nilay Vaish	5e33045a2a	ruby: replace Time with Cycles in MessageBuffer	2013-02-10 21:26:26 -06:00
Nilay Vaish	b742081cc1	ruby: replace Time with Cycles in Memory Controller	2013-02-10 21:26:25 -06:00
Nilay Vaish	89f86dbd28	ruby: Replace Time with Cycles in SequencerMessage	2013-02-10 21:26:25 -06:00
Nilay Vaish	7862478eef	ruby: replace Time with Cycles in Message class Concomitant changes are being committed as well, including the io operator<< for the Cycles class.	2013-02-10 21:26:24 -06:00
Nilay Vaish	d3aebe1f91	ruby: replaces Time with Cycles in many places The patch started of with replacing Time with Cycles in the Consumer class. But to get ruby to compile, the rest of the changes had to be carried out. Subsequent patches will further this process, till we completely replace Time with Cycles.	2013-02-10 21:26:24 -06:00
Nilay Vaish	affd77ea77	base: add some mathematical operators to Cycles class	2013-02-10 21:26:23 -06:00
Nilay Vaish	bc1daae7fd	ruby: modifies histogram add() function This patch modifies the Histogram class' add() function so that it can add linear histograms as well. The function assumes that the left end point of the ranges of the two histograms are the same. It also assumes that when the ranges of the two histogram are changed to accomodate an element not in the range, the factor used in changing the range is same for both the histograms. This function is then used in removing one of the calls to the global profiler*. The histograms for recording the delays incurred in processing different requests are now maintained by the controllers. The profiler adds these histograms when it needs to print the stats.	2013-02-10 21:26:22 -06:00
Nilay Vaish	a49b1df3f0	ruby: record fully busy cycle with in the controller This patch does several things. First, the counter for fully busy cycles for a controller is now kept with in the controller, instead of being part of the profiler. Second, the topology class no longer keeps an array of controllers which was only used for printing stats. Instead, ruby system will now ask each controller to print the stats. Thirdly, the statistical variable for recording how many different types were created is being moved in to the controller from the profiler. Note that for printing, the profiler will collate results from different controllers.	2013-02-10 21:26:22 -06:00
Andreas Sandberg	10f1f8c6a4	base: Fix broken IPython argument handling Prior to this changeset, we used to clear sys.argv before entering the IPython shell. This caused some versions of IPython to crash because they assume argv[0] to exist. The correct way of overriding the arguments passed to IPython is to set the argv keyword argument when initializing the shell.	2013-02-10 13:23:56 +01:00
Nilay Vaish	87ea04ab2f	sim: remove unused struct priority_compare	2013-01-31 21:26:29 -06:00
Nilay Vaish	6aed4d4f93	ruby: correct computation of number of bits required for address The number of bits required for an address was set to floorLog2(memory size). This is correct under the assumption that the memory size is a power of 2, which is not always true. Hence, floorLog2 is being replaced with ceilLog2.	2013-01-31 09:44:20 -06:00
Andreas Hansson	a4288dabf9	mem: Add comments for the DRAM address decoding This patch adds more verbose comments to explain the two different address mapping schemes of the DRAM controller.	2013-01-31 07:49:18 -05:00
Andreas Hansson	c4898b15bc	mem: Add DDR3 and LPDDR2 DRAM controller configurations This patch moves the default DRAM parameters from the SimpleDRAM class to two different subclasses, one for DDR3 and one for LPDDR2. More can be added as we go forward. The regressions that previously used the SimpleDRAM are now using SimpleDDR3 as this is the most similar configuration.	2013-01-31 07:49:14 -05:00
Ani Udipi	eaa37e611f	mem: Add tTAW and tFAW to the SimpleDRAM model This patch adds two additional scheduling constraints to the DRAM controller model, to constrain the activation rate. The two metrics are determine the size of the activation window in terms of the number of activates and the minimum time required for that number of activates. This maps to current DDRx, LPDDRx and WIOx standards that have either tFAW (4 activate window) or tTAW (2 activate window) scheduling constraints.	2013-01-31 07:49:14 -05:00
Andreas Hansson	b7153e2a64	mem: Separate out the different cases for DRAM bus busy time This patch changes how the data bus busy time is calculated such that it is delayed to the actual scheduling time of the request as opposed to being done as soon as possible. This patch changes a bunch of statistics, and the stats update is bundled together with the introruction of tFAW/tTAW and the named DRAM configurations like DDR3 and LPDDR2.	2013-01-31 07:49:13 -05:00
Anthony Gutierrez	af0f8b31db	cache: remove drainManager because it's not used the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.	2013-01-28 20:19:42 -05:00
Nilay Vaish	a8eb5b18e0	ruby: remove get_time() This patch replaces get_time() in *.sm files with curCycle() which is now possible since controllers are clocked objects.	2013-01-28 06:14:18 -06:00
Nilay Vaish	31659e83fb	ruby: remove call to curCycle in panic() The panic() function already prints the current tick value. This call to curCycle() is as such redundant. Since we are trying to move towards multiple clock domains, this call will print misleading time.	2013-01-28 06:11:42 -06:00
Nilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)	dbeabedaf0	branch predictor: move out of o3 and inorder cpus This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh	2013-01-24 12:28:51 -06:00
Andrea Pellegrini	11d5ffa108	o3 cpu: fix zero reg problem There was an issue w/ the rename logic, which would assign a previous physical register to the ZeroReg architectural register in x86. This issue was giving problems for instructions squashed in threads w/ ID different from 0, sometimes allowing non-mispredicted instructions to obtain a value different from zero when reading the zeroReg.	2013-01-22 00:13:28 -06:00
Nilay Vaish	fc57ae6401	x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switch The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.	2013-01-22 00:10:10 -06:00
Joel Hestness	1429d21244	O3 IEW: Make incrWb and decrWb clearer Move the increment/decrement of wbOutstanding outside of the comparison in incrWb and decrWb in the IEW. This also fixes a compiler bug with gcc 4.4.7, which incorrectly optimizes "-- ==" as "-=".	2013-01-19 15:14:54 -06:00
Nilay Vaish	5b6f972750	ruby: remove calls to g_system_ptr->getTime() This patch further removes calls to g_system_ptr->getTime() where ever other clocked objects are available for providing current time.	2013-01-17 13:10:12 -06:00
Nilay Vaish	f2bcf4f01c	x86 cpuid: enable clflush Note that clflush is only being enabled. It is not implemented in actual. A warning is printed if the cpu encounters a clflush instruction. We need to enable this instruction in cpuid since JRE 1.7 tests for it.	2013-01-15 07:43:21 -06:00
Nilay Vaish	ac9bb51405	x86: implements fsin, fcos instructions	2013-01-15 07:43:21 -06:00
Nilay Vaish	7f5463539b	x86: implements emms instruction	2013-01-15 07:43:20 -06:00
Nilay Vaish	91b00d98a5	x86: implement fabs, fchs instructions	2013-01-15 07:43:19 -06:00
Malek Musleh	1abf950f3c	ruby sequencer: converts cycles to ticks in deadlock panic() This patch converts the panic() print outs in the Sequencer::wakeup() call from ruby cycles to Ticks(). This makes it easier to debug deadlocks with the ProtocolTrace flag so the issue time indicated in the panic message can be quickly searched for. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-14 10:05:12 -06:00
Nilay Vaish	2012983718	Ruby: remove reference to g_system_ptr from class Message This patch was initiated so as to remove reference to g_system_ptr, the pointer to Ruby System that is used for getting the current time. That simple change actual requires changing a lot many things in slicc and garnet. All these changes are related to how time is handled. In most of the places, g_system_ptr has been replaced by another clock object. The changes have been done under the assumption that all the components in the memory system are on the same clock frequency, but the actual clocks might be distributed.	2013-01-14 10:05:10 -06:00
Nilay Vaish	cf232de461	Ruby: use ClockedObject in Consumer class Many Ruby structures inherit from the Consumer, which is used for scheduling events. The Consumer used to relay on an Event Manager for scheduling events and on g_system_ptr for time. With this patch, the Consumer will now use a ClockedObject to schedule events and to query for current time. This resulted in several structures being converted from SimObjects to ClockedObjects. Also, the MessageBuffer class now requires a pointer to a ClockedObject so as to query for time.	2013-01-14 10:04:21 -06:00
Andreas Hansson	cbbc4c7f6b	scons: Address clang 3.2 compilation error This patch fixes a compilation error encountered using clang 3.2 on OSX.	2013-01-14 10:23:56 -05:00
Nilay Vaish	f7c0ba406e	base simple cpu: removes commented out code about cache ops	2013-01-12 22:11:16 -06:00
Nilay Vaish	25ec278a0b	x86: Changes to decoder, corrects 9376 The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.	2013-01-12 22:09:48 -06:00
Lluís Vilanova	807168a1de	util: add m5_fail op. Used as a command in full-system scripts helps the user ensure the benchmarks have finished successfully. For example, one can use: /path/to/benchmark args \|\| /sbin/m5 fail 1 and thus ensure gem5 will exit with an error if the benchmark fails.	2013-01-08 08:54:12 -05:00
Tao Zhang	858d99b7cc	sim: Fix early termination in multi-core simulation under SE mode. When "-I" (maximum instruction number) and "-F" (fastforward instruction number) are applied together, gem5 immediately exits after the cpu switching. The reason is that multiple exit events may be generated in the same cycle by Atomic CPU and inserted to mainEventQueue. However, mainEventQueue can only serve one exit event in one cycle. Therefore, the rest exit events are left in mainEventQueue without being descheduled or deleted, which causes gem5 exits immediately after the system resumes by cpu switching.	2013-01-08 08:54:11 -05:00
Mitch Hayenga	4a752b1655	arm: add access syscall for ARM SE mode This patch adds the "access" syscall for ARM SE as required by some spec2006 benchmarks.	2013-01-08 08:54:07 -05:00
Mitch Hayenga	c7dbd5e768	mem: Make LL/SC locks fine grained The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range.	2013-01-08 08:54:07 -05:00
Mitch Hayenga	dc4a0aa2fa	mem: Fix use-after-free bug Running with valgrind I noticed a use after free originating from simple_mem.cc. It looks like this is a known issue and this additional call site was missed in an earlier patch.	2013-01-08 08:54:06 -05:00
Andreas Sandberg	8480615d8d	dev: Fix infinite recursion in DMA devices The DMA device sometimes calls the process() method on a completion event directly instead of scheduling it on the current tick. This breaks some devices that assume that the completion handler won't be called until the current event handler has returned. Specifically, it causes infinite recursion in the IdeDisk component because it does not advance its chunk generator until after a dmaRead()/dmaWrite() has returned. This changeset removes this mico-optimization and schedules the event in the current tick instead. This way the semantics event handling stay the same even when the delay is 0.	2013-01-07 16:56:39 -05:00
Sascha Bischoff	8a767885d6	stats: Fix swig wrapping for Tick in stats Tick was not correctly wrapped for the stats system, and therefore it was not possible to configure the stats dumping from the python scripts without defining Ticks as long long. This patch fixes the wrapping of Tick by copying the typemap of uint64_t to Tick.	2013-01-07 16:56:36 -05:00
Andreas Sandberg	009970f59b	cpu: Unify the serialization code for all of the CPU models Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the thread state class uses the CPU-specific thread state uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models.	2013-01-07 13:05:52 -05:00
Andreas Sandberg	e09e9fa279	cpu: Flush TLBs on switchOut() This changeset inserts a TLB flush in BaseCPU::switchOut to prevent stale translations when doing repeated switching. Additionally, the TLB flushing functionality is exported to the Python to make debugging of switching/checkpointing easier. A simulation script will typically use the TLB flushing functionality to generate a reference trace. The following sequence can be used to simulate a handover (this depends on how drain is implemented, but is generally the case) between identically configured CPU models: m5.drain(test_sys) [ cpu.flushTLBs() for cpu in test_sys.cpu ] m5.resume(test_sys) The generated trace should normally be identical to a trace generated when switching between identically configured CPU models or checkpointing and resuming.	2013-01-07 13:05:48 -05:00
Andreas Sandberg	964aa49d15	mem: Fix guest corruption when caches handle uncacheable accesses When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS. Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses.	2013-01-07 13:05:47 -05:00
Andreas Sandberg	1814a85a05	cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	9e8003148f	cpu: Make sure that a drained atomic CPU isn't executing ucode Currently, the atomic CPU can be in the middle of a microcode sequence when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * Since curMacroStaticInst is populated when executing microcode, repeated switching between CPUs executing microcode leads to incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset fixes a bug where the multiple switches to the same atomic CPU sometimes corrupts the target state because of dangling pointers to the currently executing microinstruction. Note: This changeset moves tick event descheduling from switchOut() to drain(), which makes timing consistent between just draining a system and draining /and/ switching between two atomic CPUs. This makes debugging quite a lot easier (execution traces get the same timing), but the latency of the last instruction before a drain will not be accounted for correctly (it will always be 1 cycle). Note 2: This changeset removes so_state variable, the locked variable, and the tickEvent from checkpoints since none of them contain state that needs to be preserved across checkpoints. The so_state is made redundant because we don't use the drain state variable anymore, the lock variable should never be set when the system is drained, and the tick event isn't scheduled.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	f9bcf46371	cpu: Make sure that a drained timing CPU isn't executing ucode Currently, the timing CPU can be in the middle of a microcode sequence or multicycle (stayAtPC is true) instruction when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * If stayAtPC is true we might execute half of an instruction twice when restoring a checkpoint or switching CPUs, which leads to an incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset also fixes a bug where the timing CPU sometimes switches out with while stayAtPC is true, which corrupts the target state after a CPU switch or checkpoint. Note: This changeset removes the so_state variable from checkpoints since the drain state isn't used anymore.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	52ff37caa3	cpu: Fix broken thread context handover The thread context handover code used to break when multiple handovers were performed during the same quiesce period. Previously, the thread contexts would assign the TC pointer in the old quiesce event to the new TC. This obviously broke in cases where multiple switches were performed within the same quiesce period, in which case the TC pointer in the quiesce event would point to an old CPU. The new implementation deschedules pending quiesce events in the old TC and schedules a new quiesce event in the new TC. The code has been refactored to remove most of the code duplication.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	fca4fea769	cpu: Fix O3 LSQ debug dumping constness and formatting	2013-01-07 13:05:46 -05:00
Andreas Sandberg	fb52ea9220	arm: Invalidate cached TLB configuration in drainResume Currently, we invalidate the cached miscregs in TLB::unserialize(). The intended use of the drainResume() method is to invalidate cached state and prepare the system to resume after a CPU handover or (un)serialization. This patch moves the TLB miscregs invalidation code to the drainResume() method to avoid surprising behavior.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	0d59549cd9	arm: Fix draining of the pagetable walker when squashing Since the page table walker only checks if a drain has completed in doL1DescriptorWrapper() and doL2DescriptorWrapper(), it sometimes looses track of a drain request if there is a squash. This changeset adds a completeDrain() call after squashing requests in the pending queue, which fixes this issue.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	8db27aa230	cpu: Fix broken squashAfter implementation in O3 CPU Commit can currently both commit and squash in the same cycle. This confuses other stages since the signals coming from the commit stage can only signal either a squash or a commit in a cycle. This changeset changes the behavior of squashAfter so that it commits all instructions, including the instruction that requested the squash, in the first cycle and then starts to squash in the next cycle.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	a2077ccf02	o3 cpu: Remove unused variables	2013-01-07 13:05:45 -05:00
Andreas Sandberg	1c3a1888d8	sim: Remove unused variables	2013-01-07 13:05:45 -05:00
Andreas Sandberg	2cfe62adc4	cpu: Rename defer_registration->switched_out The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	f7da0fddd1	cpu: Remove unused params.hh header file in inorder CPU	2013-01-07 13:05:45 -05:00
Andreas Sandberg	38925ff621	arm: Remove the register mapping hack used when copying TCs In order to see all registers independent of the current CPU mode, the ARM architecture model uses the magic MISCREG_CPSR_MODE register to change the register mappings without actually updating the CPU mode. This hack is no longer needed since the thread context now provides a flat interface to the register file. This patch replaces the CPSR_MODE hack with the flat register interface.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	a7e0cbeb36	cpu: Introduce sanity checks when switching between CPUs This patch introduces the following sanity checks when switching between CPUs: * Check that the set of new and old CPUs do not overlap. Having an overlap between the set of new CPUs and the set of old CPUs is currently not supported. Doing such a switch used to result in the following assertion error: BaseCPU::takeOverFrom(BaseCPU): \ Assertion `!new_itb_port->isConnected()' failed. Check that all new CPUs are in the switched out state. * Check that all old CPUs are in the switched in state.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	901258c22b	cpu: Correctly call parent on switchOut() and takeOverFrom() This patch cleans up the CPU switching functionality by making sure that CPU models consistently call the parent on switchOut() and takeOverFrom(). This has the following implications that might alter current functionality: * The call to BaseCPU::switchout() in the O3 CPU is moved from signalDrained() (!) to switchOut(). * A call to BaseSimpleCPU::switchOut() is introduced in the simple CPUs.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	4ae02295d5	cpu: Unify SimpleCPU and O3 CPU serialization code The O3 CPU used to copy its thread context to a SimpleThread in order to do serialization. This was a bit of a hack involving two static SimpleThread instances and a magic constructor that was only used by the O3 CPU. This patch moves the ThreadContext serialization code into two global procedures that, in addition to the normal serialization parameters, take a ThreadContext reference as a parameter. This allows us to reuse the serialization code in all ThreadContext implementations.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	6daada2701	cpu: Initialize the O3 pipeline from startup() The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	e2dad8236a	cpu: Implement a flat register interface in thread contexts Some architectures map registers differently depending on their mode of operations. There is currently no architecture independent way of accessing all registers. This patch introduces a flat register interface to the ThreadContext class. This interface is useful, for example, when serializing or copying thread contexts.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	17b47d35e1	arch: Move the ISA object to a separate section After making the ISA an independent SimObject, it is serialized automatically by the Python world. Previously, this just resulted in an empty ISA section. This patch moves the contents of the ISA to that section and removes the explicit ISA serialization from the thread contexts, which makes it behave like a normal SimObject during serialization. Note: This patch breaks checkpoint backwards compatibility! Use the cpt_upgrader.py utility to upgrade old checkpoints to the new format.	2013-01-07 13:05:42 -05:00
Andreas Sandberg	7eb0fb8b6e	cpu: Check that the memory system is in the correct mode This patch adds checks to all CPU models to make sure that the memory system is in the correct mode at startup and when resuming after a drain. Previously, we only checked that the memory system was in the right mode when resuming. This is inadequate since this is a configuration error that should be detected at startup as well as when resuming. Additionally, since the check was done using an assert, it wasn't performed when NDEBUG was set (e.g., the fast target).	2013-01-07 13:05:41 -05:00
Andreas Sandberg	94561dd526	arch: Add support for invalidating TLBs when draining This patch adds support for the memInvalidate() drain method. TLB flushing is requested by calling the virtual flushAll() method on the TLB. Note: This patch renames invalidateAll() to flushAll() on x86 and SPARC to make the interface consistent across all supported architectures.	2013-01-07 13:05:40 -05:00
Andreas Sandberg	d44f2f611f	mem: Remove the IIC replacement policy The IIC replacement policy seems to be unused and has probably gathered too much bit rot to be useful. This patch removes the IIC and its associated cache parameters.	2013-01-07 13:05:39 -05:00
Andreas Hansson	9364d35b8b	dev: Do not serialize timer parameters This patch removes the intNum and clock from the serialized scalars as these are set by the Python parameters and should not be part of the checkpoint.	2013-01-07 13:05:39 -05:00
Andreas Hansson	406891c62a	scons: Enforce gcc >= 4.4 or clang >= 2.9 and c++0x support This patch checks that the compiler in use is either gcc >= 4.4 or clang >= 2.9. and enables building with --std=c++0x in all cases. As a consequence, we can tidy up the hashmap and always have static_assert available. If anyone wants to use alternative compilers, icc for example supports c++0x to a similar level and could be added if needed. This patch opens up for a more elaborate use of c++0x features that are present in gcc 4.4 and clang 2.9, e.g. auto typed variables, variadic templates, rvalues and move semantics, and strongly typed enums. There will be no going back on this one...	2013-01-07 13:05:39 -05:00
Andreas Hansson	221302335b	scons: Remove stale compiler options This patch simply prunes the SUNCC and ICC compiler options as they are both sufficiently stale that they would have to be re-written from scratch anyhow. The patch serves to clean things up before shifting to a build environment that enforces basic c++11 compliance as done in the following patch.	2013-01-07 13:05:39 -05:00
Andreas Hansson	921490a060	sim: Fatal if a clocked object is set to have a clock of 0 This patch adds a check to the clocked object constructor to ensure it is not configured to have a clock period of 0.	2013-01-07 13:05:39 -05:00
Andreas Hansson	490dc30d96	dev: Make the ethernet devices use a non-zero clock This patch changes the NS gige controller to have a non-clock, and sets the default to 500 MHz. The blocks that could prevoiusly be by-passed with a zero clock are now always present, and the user is left with the option of setting a very high clock frequency to achieve a similar performance.	2013-01-07 13:05:39 -05:00
Chander Sudanthi	694a81e994	ARM: pl111/LCD framebuffer checkpointing fix Fixed check pointing of the framebuffer. Previously, the pixel size was not considered in determining the size of the buffer to checkpoint. This patch checkpoints the entire framebuffer instead of the first quarter.	2013-01-07 13:05:39 -05:00
Andreas Sandberg	c3551e82f7	arch: Fix broken M5VarArgsFault initialization At least gcc 4.4.3 seems to get confused by the use of func both as a template parameter and a member variable in the M5VarArgsFault class. This causes the value of the member variable func to be unpredictable in M5VarArgsFault objects. This changeset renames the template parameter to remove this ambiguity.	2013-01-07 13:05:38 -05:00
Andreas Hansson	18b147acef	mem: Merge ranges that are part of the conf table This patch adds basic merging of address ranges when determining which address ranges should be reported in the configuration table. By performing this merging it is possible to distribute an address range across many memory channels (controllers). This is essential to enable address interleaving.	2013-01-07 13:05:38 -05:00
Andreas Hansson	b8c2fa6ba9	base: Add support for merging of interleaved address ranges This patch adds support for merging a vector of interleaved address ranges into a contigous range. The functionality will be used in the interconnect and the PhysicalMemory to transform interleaved memory ranges to contigous ranges before passing them on. The actual use of the merging is appearing in future patches.	2013-01-07 13:05:38 -05:00
Andreas Hansson	01c5598373	mem: Add interleaving bits to the address ranges This patch adds support for interleaving bits for the address ranges. What was previously just a start and end address, now has an additional three fields, for the high bit, and number of bits to use for interleaving, and a match value to compare against. If the number of interleaving bits is set to zero it is effectively disabled. A number of convenience functions are added to the range to enquire about the interleaving, its granularity and the number of stripes it is part of.	2013-01-07 13:05:38 -05:00
Andreas Hansson	e6c57786a4	config: Traverse lists when visiting children in all proxy This patch makes the all proxy traverse any potential list that is encountered in the object hierarchy instead of only looking at children that are SimObjects. An example of where this is useful is when creating a multi-channel memory system as a list of controllers, whilst ensuring that the memories are still visible in the system.	2013-01-07 13:05:38 -05:00
Andreas Hansson	e0d93fde99	base: Simplify the AddrRangeMap by removing unused code This patch cleans up the AddrRangeMap in preparation for the addition of interleaving by removing unused code. The non-const editions of find are never used, and hence the duplication is not needed.	2013-01-07 13:05:38 -05:00
Andreas Hansson	e65de3f5ca	config: Do not use hardcoded physmem in fs script This patch generalises the address range resolution for the I/O cache and I/O bridge such that they do not assume a single memory. The patch involves adding a parameter to the system which is then defined based on the memories that are to be visible from the I/O subsystem, whether behind a cache or a bridge. The change is needed to allow interleaved memory controllers in the system.	2013-01-07 13:05:38 -05:00
Andreas Hansson	15a979c6be	mem: Tidy up bus addr range debug messages This patch tidies up a number of the bus DPRINTFs related to range manipulation. In particular, it shifts the message about range changes to the start of the member function, and also adds information about when all ranges are received.	2013-01-07 13:05:38 -05:00
Andreas Hansson	caf6786ad5	mem: Skip address mapper range checks to allow more flexibility This patch makes the address mapper less stringent about checking the before and after ranges, i.e. the original and remapped ranges. The checks were not really necessary, and there are situations when the previous checks were too strict.	2013-01-07 13:05:38 -05:00
Andreas Hansson	71da1d2157	base: Encapsulate the underlying fields in AddrRange This patch makes the start and end address private in a move to prevent direct manipulation and matching of ranges based on these fields. This is done so that a transition to ranges with interleaving support is possible. As a result of hiding the start and end, a number of member functions are needed to perform the comparisons and manipulations that previously took place directly on the members. An accessor function is provided for the start address, and a function is added to test if an address is within a range. As a result of the latter the != and == operator is also removed in favour of the member function. A member function that returns a string representation is also created to allow debug printing. In general, this patch does not add any functionality, but it does take us closer to a situation where interleaving (and more cleverness) can be added under the bonnet without exposing it to the user. More on that in a later patch.	2013-01-07 13:05:38 -05:00
Andreas Hansson	cfdaf53104	mem: Remove the joining of neighbouring ranges This patch temporarily removes the joining of ranges when creating the backing store, to reserve this functionality for the interleaved ranges that are about to be introduced. When creating the mmaps for the backing store, there is no point in creating larger contigous chunks that what is necessary. The larger chunks will only make life more difficult for the host. Merging will be re-added later, but then only for interleaved ranges.	2013-01-07 13:05:38 -05:00
Andreas Hansson	ccb6c64047	cpu: Share the send functionality between traffic generators This patch moves the packet creating and sending to a member function in the shared base class to avoid code duplication.	2013-01-07 13:05:37 -05:00
Andreas Hansson	1da209140c	cpu: Add support for protobuf input for the trace generator This patch adds support for reading input traces encoded using protobuf according to what is done in the CommMonitor. A follow-up patch adds a Python script that can be used to convert the previously used ASCII traces to protobuf equivalents. The appropriate regression input is updated as part of this patch.	2013-01-07 13:05:37 -05:00
Andreas Hansson	35bdee72cb	cpu: Encapsulate traffic generator input in a stream This patch encapsulates the traffic generator input in a stream class such that the parsing is not visible to the trace generator. The change takes us one step closer to using protobuf-based input traces for the trace replay. The functionality of the current input stream is identical to what it was, and the ASCII format remains the same for now.	2013-01-07 13:05:37 -05:00
Andreas Hansson	4afa6c4c3e	base: Add wrapped protobuf input stream This patch adds support for inputting protobuf messages through a ProtoInputStream which hides the internal streams used by the library. The stream is created based on the name of an input file and optionally includes decompression using gzip. The input stream will start by getting a magic number from the file, and also verify that it matches with the expected value. Once opened, messages can be read incrementally from the stream, returning true/false until an error occurs or the end of the file is reached.	2013-01-07 13:05:37 -05:00
Andreas Hansson	f456c7983d	mem: Add tracing support in the communication monitor This patch adds packet tracing to the communication monitor using a protobuf as the mechanism for creating the trace. If no file is specified, then the tracing is disabled. If a file is specified, then for every packet that is successfully sent, a protobuf message is serialized to the file.	2013-01-07 13:05:37 -05:00
Andreas Hansson	11ab30fa5a	base: Add wrapped protobuf output streams This patch adds support for outputting protobuf messages through a ProtoOutputStream which hides the internal streams used by the library. The stream is created based on the name of an output file and optionally includes compression using gzip. The output stream will start by putting a magic number in the file, and then for every message that is serialized prepend the size such that the stream can be written and read incrementally. At this point this merely serves as a proof of concept.	2013-01-07 13:05:37 -05:00
Andreas Hansson	41f228c2ea	scons: Add support for google protobuf building This patch enables the use of protobuf input files in the build process, thus allowing .proto files to be added to input. Each .proto file is compiled using the protoc tool and the newly created C++ source is added to the list of sources. The first location where the protobufs will be used is in the capturing and replay of memory traces, involving the communication monitor and the trace-generator state of the traffic generator. This will follow in the next patch. This patch does add a dependency on the availability of the BSD licensed protobuf library (and headers), and the protobuf compiler, protoc. These dependencies are checked in the SConstruct, similar to e.g. swig. The user can override the use of protoc from the PATH by specifying the PROTOC environment variable. Although the dependency on libprotobuf and protoc might seem like a big step, they add significant value to the project going forward. Execution traces and other types of traces could easily be added and parsers for C++ and Python are automatically generated. We could also envision using protobufs for the checkpoints, description of the traffic-generator behaviour etc. The sky is the limit. We could also use the GzipOutputStream from the protobuf library instead of the current GPL gzstream. Currently, only the C++ source and header is generated. Going forward we might want to add the Python output to support simple command-line tools for displaying and editing the traces.	2013-01-07 13:05:37 -05:00
Andreas Sandberg	63f1d0516d	arm: Fix DMA event handling bug in the PL111 model The PL111 model currently maintains a list of pre-allocated DmaDoneEvents to prevent unnecessary heap allocations. This list effectively works like a stack where the top element is the latest scheduled event. When an event triggers, the top pointer is moved down the stack. This obviously breaks since events usually retire from the bottom (events don't necessarily have to retire in order), which triggers the following assertion: gem5.debug: build/ARM/dev/arm/pl111.cc:460: void Pl111::fillFifo(): \ Assertion `!dmaDoneEvent[dmaPendingNum-1].scheduled()' failed. This changeset adds a vector listing the currently unused events. This vector acts like a stack where the an element is popped off the stack when a new event is needed an pushed on the stack when they trigger.	2013-01-07 13:05:37 -05:00
Andreas Hansson	fffdc6a450	dev: Fix the Pl111 timings by separating pixel and DMA clock This patch fixes the Pl111 timings by creating a separate clock for the pixel timings. The device clock is used for all interactions with the memory system, just like the AHB clock on the actual module. The result without this patch is that the module only is allowed to send one request every tick of the 24MHz clock which causes a huge backlog.	2013-01-07 13:05:36 -05:00
Andreas Hansson	f22d3bb9c3	cpu: Fix the traffic gen read percentage This patch fixes the computation that determines whether to perform a read or a write such that the two corner cases (0 and 100) are both more efficient and handled correctly.	2013-01-07 13:05:35 -05:00
Andreas Hansson	852a7bcf92	mem: Add sanity check to packet queue size This patch adds a basic check to ensure that the packet queue does not grow absurdly large. The queue should only be used to store packets that were delayed due to blocking from the neighbouring port, and not for actual storage. Thus, a limit of 100 has been chosen for now (which is already quite substantial).	2013-01-07 13:05:35 -05:00
Andreas Hansson	ce5fc494e3	ruby: Fix missing cxx_header in Switch This patch addresses a warning related to the swig interface generation for the Switch class. The cxx_header is now specified correctly, and the header in question has got a few includes added to make it all compile.	2013-01-07 13:05:35 -05:00
Chris Emmons	b7827a5aaa	config: Replace second keyboard with a mouse. The platform has two KMI devices that are both setup to be keyboards. This patch changes the second keyboard to a mouse. This patch will allow keyboard input as usual and additionally provide mouse support.	2013-01-07 13:05:35 -05:00
Andreas Hansson	174269978a	mem: Fix a bug in the memory serialization file naming This patch fixes a bug that caused multiple systems to overwrite each other physical memory. The system name is now included in the filename such that this is avoided.	2013-01-07 13:05:35 -05:00
Andreas Sandberg	0d1ad50326	arm: Make ID registers ISA parameters This patch makes the values of ID_ISARx, MIDR, and FPSID configurable as ISA parameter values. Additionally, setMiscReg now ignores writes to all of the ID registers. Note: This moves the MIDR parameter from ArmSystem to ArmISA for consistency.	2013-01-07 13:05:35 -05:00
Andreas Sandberg	3db3f83a5e	arch: Make the ISA class inherit from SimObject The ISA class on stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU.	2013-01-07 13:05:35 -05:00
Ali Saidi	69d419f313	o3: Fix issue with LLSC ordering and speculation This patch unlocks the cpu-local monitor when the CPU sees a snoop to a locked address. Previously we relied on the cache to handle the locking for us, however some users on the gem5 mailing list reported a case where the cpu speculatively executes a ll operation after a pending sc operation in the pipeline and that makes the cache monitor valid. This should handle that case by invaliding the local monitor.	2013-01-07 13:05:33 -05:00
Ali Saidi	5146a69835	cpu: rename the misleading inSyscall to noSquashFromTC isSyscall was originally created because during handling of a syscall in SE mode the threadcontext had to be updated. However, in many places this is used in FS mode (e.g. fault handlers) and the name doesn't make much sense. The boolean actually stops gem5 from squashing speculative and non-committed state when a write to a threadcontext happens, so re-name the variable to something more appropriate	2013-01-07 13:05:33 -05:00
Ali Saidi	9a645d6e9b	cache: add note about where conflicts are handled	2013-01-07 13:05:32 -05:00
Gabe Black	e17c375ddd	Decoder: Remove the thread context get/set from the decoder. This interface is no longer used, and getting rid of it simplifies the decoders and code that sets up the decoders. The thread context had been used to read architectural state which was used to contextualize the instruction memory as it came in. That was changed so that the state is now sent to the decoders to keep locally if/when it changes. That's significantly more efficient. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 19:00:45 -06:00
Gabe Black	d1965af220	X86: Move address based decode caching in front of the predecoder. The predecoder in x86 does a lot of work, most of which can be skipped if the decoder cache is put in front of it. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 19:00:44 -06:00
Gabe Black	63b10907ef	SPARC: Keep a copy of the current ASI in the decoder. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 18:09:45 -06:00
Gabe Black	a83e74b37a	ARM: Keep a copy of the fpscr len and stride fields in the decoder. Avoid reading them every instruction, and also eliminate the last use of the thread context in the decoders. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 18:09:35 -06:00
Nilay Vaish	e9fa54de58	x86: implement x87 fp instruction fnstsw This patch implements the fnstsw instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.	2012-12-30 12:45:50 -06:00
Nilay Vaish	23ba6fc5fb	x86: implement x87 fp instruction fsincos This patch implements the fsincos instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.	2012-12-30 12:45:45 -06:00
Nathanael Premillieu	3026a116ba	arm: set uopSet_uop as conditional or unconditional control uopSet_uop is microop instruction that has the IsControl flags set, but the IsCondControl or IsUncondControl flags seems not to be set, neither in the construction nor where the microop is used. This patch adds the the flags in the constructor of the instruction (MicroUopSetPCCPSR). Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-12 09:50:33 -06:00
Nathanael Premillieu	84fc57bfe6	arm: set movret_uop as conditional or unconditional control A flag was missing for the movret_uop microop instruction. This patch adds that flag when the instruction is used, not directly in the constructor of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-12 09:50:16 -06:00
Nilay Vaish	f3d0be210f	ruby: add support for prefetching to MESI protocol	2012-12-11 10:05:56 -06:00
Nilay Vaish	c120273708	ruby: modify the directed tester to read/write streams The directed tester supports only generating only read or only write accesses. The patch modifies the tester to support streams that have both read and write accesses.	2012-12-11 10:05:55 -06:00
Nilay Vaish	9b72a0f627	ruby: change slicc to allow for constructor args The patch adds support to slicc for recognizing arguments that should be passed to the constructor of a class. I did not like the fact that an explicit check was being carried on the type 'TBETable' to figure out the arguments to be passed to the constructor. The patch also moves some of the member variables that are declared for all the controllers to the base class AbstractController.	2012-12-11 10:05:55 -06:00
Nilay Vaish	93e283abb3	ruby: add a prefetcher This patch adds a prefetcher for the ruby memory system. The prefetcher is based on a prefetcher implemented by others (well, I don't know who wrote the original). The prefetcher does stride-based prefetching, both unit and non-unit. It obseves the misses in the cache and trains on these. After the training period is over, the prefetcher starts issuing prefetch requests to the controller.	2012-12-11 10:05:54 -06:00
Nilay Vaish	d502384795	ruby: add functions for computing next stride/page address	2012-12-11 10:05:53 -06:00
Erik Tomusk	3dc7e4f496	TournamentBP: Fix some bugs with table sizes and counters globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 09:31:06 -06:00
Malek Musleh	150e9b8c68	inorder cpu: add missing DPRINTF argument Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 05:25:40 -06:00
Nathanael Premillieu	eb899407c5	o3 cpu: remove some unused buggy functions in the lsq Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 04:36:51 -06:00
Nilay Vaish	2d6470936c	sim: have a curTick per eventq This patch adds a _curTick variable to an eventq. This variable is updated whenever an event is serviced in function serviceOne(), or all events upto a particular time are processed in function serviceEvents(). This change helps when there are eventqs that do not make use of curTick for scheduling events.	2012-11-16 10:27:47 -06:00
Nilay Vaish	90c45c29fe	ruby: support functional accesses in garnet flexible network	2012-11-10 17:18:01 -06:00
Nilay Vaish	1492ab066d	ruby: bug in functionalRead, revert recent changes Recent changes to functionalRead() in the memory system was not correct. The change allowed for returning data from the first message found in the buffers of the memory system. This is not correct since it is possible that a timing message has data from an older state of the block. The changes are being reverted.	2012-11-10 17:18:00 -06:00
Andreas Hansson	c4b36901d0	mem: Fix DRAM draining to ensure write queue is empty This patch fixes the draining of the SimpleDRAM controller model. The controller performs buffering of writes and normally there is no need to ever empty the write buffer (if you have a fast on-chip memory, then use it). The patch adds checks to ensure the write buffer is drained when the controller is asked to do so.	2012-11-08 04:25:06 -05:00
Hamid Reza Khaleghzadeh ext:(%2C%20Lluc%20Alvarez%20%3Clluc.alvarez%40bsc.es%3E%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E)	8cd475d58e	ruby: reset and dump stats along with reset of the system This patch adds support to ruby so that the statistics maintained by ruby are reset/dumped when the statistics for the rest of the system are reset/dumped. For resetting the statistics, ruby now provides the resetStats() function that a sim object can provide. As a consequence, the clearStats() function has been removed from RubySystem. For dumping stats, Ruby now adds a callback event to the dumpStatsQueue. The exit callback that ruby used to add earlier is being removed. Created by: Hamid Reza Khaleghzadeh. Improved by: Lluc Alvarez, Nilay Vaish Committed by: Nilay Vaish	2012-11-02 12:18:25 -05:00
Ali Saidi	ce5766c409	mem: fix use after free issue in memories until 4-phase work complete.	2012-11-02 11:50:16 -05:00
Andreas Sandberg	ddd6af414c	mem: Add support for writing back and flushing caches This patch adds support for the following optional drain methods in the classical memory system's cache model: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested. Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	050f24c796	sim: Add drain methods to request additional cleanup operations This patch adds the following two methods to the Drainable base class: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate memory system buffers. Dirty data won't be written back. Specifying calling memWriteback() after draining will allow us to checkpoint systems with caches. memInvalidate() can be used to drop memory system buffers in preparation for switching to an accelerated CPU model that bypasses the gem5 memory system (e.g., hardware virtualized CPUs). Note: This patch only adds the methods to Drainable, the code for flushing the TLB and the cache is committed separately.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	aae6134b54	sim: Add SWIG interface for Serializable This changeset adds a SWIG interface for the Serializable class, which fixes a warning when compiling the SWIG interface for the event queue. Currently, the only method exported is the name() method.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	dc01535c7e	python: Rename doDrain()->drain() and make it do the right thing There is no point in exporting the old drain() method in Simulate.py. It should only be used internally by doDrain(). This patch moves the old drain() method into doDrain() and renames doDrain() to drain().	2012-11-02 11:32:02 -05:00
Andreas Sandberg	196397fea4	sim: Reuse the code to change memory mode. changeToAtomic and changeToTiming both do essentially the same thing, they check the type of their input argument, drain the system, and switch to the desired memory mode. This patch moves all of that code to a separate method (changeMemoryMode) and calls that from both changeToAtomic and changeToTiming.	2012-11-02 11:32:02 -05:00
Andreas Sandberg	b81a977e6a	sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	eb703a4b4e	cpu: O3 add a header declaring the DerivO3CPU SWIG needs a complete declaration of all wrapped objects. This patch adds a header file with the DerivO3CPU class and includes it in the SWIG interface. --HG-- rename : src/cpu/o3/cpu_builder.cc => src/cpu/o3/deriv.cc	2012-11-02 11:32:01 -05:00
Andreas Sandberg	ebe65a394b	cpu: Add header files for checker CPUs In order to create reliable SWIG wrappers, we need to include the declaration of the wrapped class in the SWIG file. Previously, we didn't expose the declaration of checker CPUs. This patch adds header files for such CPUs and include them in the SWIG wrapper. --HG-- rename : src/cpu/dummy_checker_builder.cc => src/cpu/dummy_checker.cc rename : src/cpu/o3/checker_builder.cc => src/cpu/o3/checker.cc	2012-11-02 11:32:01 -05:00
Andreas Sandberg	df02047d5a	dev: Fix ethernet device inheritance structure The Python wrappers and the C++ should have the same object structure. If this is not the case, bad things will happen when the SWIG wrappers cast between an object and any of its base classes. This was not the case for NSGigE and Sinic devices. This patch makes NSGigE and Sinic inherit from the new EtherDevBase class, which in turn inherits from EtherDevice. As a bonus, this removes some duplicated statistics from the Sinic device.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	c0ab52799c	sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	044a652587	pci: Make Python wrapper cast to the right type The PCI base class is PciDev and not PciDevice, which is used by the Python world. Make sure this is reflected in the wrapper code.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	249e318212	mips: Remove unused Python file Remove BISystem.py, BareIronMipsSystem is already implemented in MipsSystem.py.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	49a799ce77	dev: Add missing inline declarations	2012-11-02 11:32:01 -05:00
Andreas Sandberg	00b3a57d88	base: Add missing header file to addr_range.hh.	2012-11-02 11:32:01 -05:00
Dam Sunwoo	81406018b0	ARM: dump stats and process info on context switches This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM).	2012-11-02 11:32:01 -05:00
Chander Sudanthi	322daba74c	base: Fix a few incorrectly handled print format cases This patch ensures cases like %0.6u, %06f, and %.6u are processed correctly. The case like %06f is ambiguous and was made to match printf. Also, this patch removes the goto statement in cprintf.cc in favor of a function call.	2012-11-02 11:32:00 -05:00
Chander Sudanthi	55787cc0d0	base: split out the VncServer into a VncInput and Server classes This patch adds a VncInput base class which VncServer inherits from. Another class can implement the same interface and be used instead of the VncServer, for example a class that replays Vnc traffic. --HG-- rename : src/base/vnc/VncServer.py => src/base/vnc/Vnc.py rename : src/base/vnc/vncserver.cc => src/base/vnc/vncinput.cc rename : src/base/vnc/vncserver.hh => src/base/vnc/vncinput.hh	2012-11-02 11:32:00 -05:00
Dam Sunwoo	ac161c1d72	ISA: generic Linux thread info support This patch takes the Linux thread info support scattered across different ISA implementations (currently in ARM, ALPHA, and MIPS), and unifies them into a single file. Adds a few more helper functions to read out TGID, mm, etc. ISA-specific information (e.g., ALPHA PCBB register) is now moved to the corresponding isa_traits.hh files.	2012-11-02 11:32:00 -05:00
Ali Saidi	d0678d1c31	sim: Fix as issue where exit events on instr queues are used after freed.	2012-11-02 11:32:00 -05:00
Mrinmoy Ghosh	4440332bdd	o3: Fix a couple of issues with the local predictor. Fix some issues with the local predictor and the way it's indexed.	2012-11-02 11:32:00 -05:00
Andreas Sandberg	7e25052fee	Partly revert [4f54b0f229b5] and move draining to m5.changeToTiming Changeset 4f54b0f229b5 removed the call to doDrain in changeToTiming based on the assumption that the system does not need draining when running in atomic mode. This is a false assumption since at least the System class requires the system to be drained before it allows switching of memory modes. This patch reverts that part of the changeset.	2012-11-02 11:32:00 -05:00
Andreas Hansson	3d98119717	mem: Fix typo in port comments This patch merely fixes a few typos in the port comments.	2012-10-31 09:28:23 -04:00
Andreas Hansson	6f6adbf0f6	dev: Make default clock more reasonable for system and devices This patch changes the default system clock from 1THz to 1GHz. This clock is used by all modules that do not override the default (parent clock), and primarily affects the IO subsystem. Every DMA device uses its clock to schedule the next transfer, and the change will thus cause this inter-transfer delay to be longer. The default clock of the bus is removed, as the clock inherited from the system provides exactly the same value. A follow-on patch will bump the stats.	2012-10-25 13:14:44 -04:00
Andreas Hansson	1fdc4e850e	arm: Use table walker clock that is inherited from CPU This patch simplifies the scheduling of the next walk for the ARM table walker. Previously it used the CPU clock, but as the table walker inherits the clock from the CPU, it is cleaner to simply use its own clock (which is the same).	2012-10-25 04:32:42 -04:00
Andreas Hansson	69e82539fd	dev: Remove zero-time loop in DMA timing send This patch removes the zero-time loop used to send items from the DMA port transmit list. Instead of having a loop, the DMA port now uses an event to schedule sending of a single packet. Ultimately this patch serves to ease the transition to a blocking 4-phase handshake. A follow-on patch will update the regression statistics.	2012-10-23 04:49:33 -04:00
Nilay Vaish	52d8693677	ruby: functional access updates to network test protocol I had forgotten to change the network test protocol while making changes to ruby for supporting functional accesses. This patch updates the protocol so that it can compile correctly.	2012-10-18 18:35:42 -05:00
Nilay Vaish	5ffc165939	ruby: improved support for functional accesses This patch adds support to different entities in the ruby memory system for more reliable functional read/write accesses. Only the simple network has been augmented as of now. Later on Garnet will also support functional accesses. The patch adds functional access code to all the different types of messages that protocols can send around. These messages are functionally accessed by going through the buffers maintained by the network entities. The patch also rectifies some of the bugs found in coherence protocols while testing the patch. With this patch applied, functional writes always succeed. But functional reads can still fail.	2012-10-15 17:51:57 -05:00
Nilay Vaish	07ce90f7aa	memtest: move check on outstanding requests The Memtest tester allows for only one request to be outstanding for a particular physical address. The check has been written separately for reads and writes. This patch moves the check earlier than its current position so that it need not be written separately for reads and writes.	2012-10-15 17:27:17 -05:00
Nilay Vaish	61434a9943	ruby: register multiple memory controllers Currently the Ruby System maintains pointer to only one of the memory controllers. But there can be multiple controllers in the system. This patch adds a vector of memory controllers.	2012-10-15 17:27:17 -05:00
Nilay Vaish	c14e6cfc4e	ruby: remove AbstractMemOrCache The only place where this abstract class is in use is the memory controller, which it self is an abstract class. Does not seem useful at all.	2012-10-15 17:27:16 -05:00
Nilay Vaish	3e607f146f	ruby: allow function definition in slicc structs This patch adds support for function definitions to appear in slicc structs. This is required for supporting functional accesses for different types of messages. Subsequent patches will use this to development.	2012-10-15 17:27:16 -05:00
Nilay Vaish	c7b0901b97	ruby banked array: do away with event scheduling It seems unecessary that the BankedArray class needs to schedule an event to figure out when the access ends. Instead only the time for the end of access needs to be tracked.	2012-10-15 17:27:15 -05:00
Nilay Vaish	6a65fafa52	ruby: reset timing after cache warm up Ruby system was recently converted to a clocked object. Such objects maintain state related to the time that has passed so far. During the cache warmup, Ruby system changes its own time and the global time. Later on, the global time is restored. So Ruby system also needs to reset its own time.	2012-10-15 17:27:15 -05:00
Andreas Hansson	b6bd4f34b4	Mem: Fix incorrect logic in bus blocksize check This patch fixes the logic in the blocksize check such that the warning is printed if the size is not 16, 32, 64 or 128.	2012-10-15 12:51:21 -04:00
Andreas Hansson	2a740aa096	Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.	2012-10-15 08:12:35 -04:00
Andreas Hansson	9baa35ba80	Mem: Separate the host and guest views of memory backing store This patch moves all the memory backing store operations from the independent memory controllers to the global physical memory. The main reason for this patch is to allow address striping in a future set of patches, but at this point it already provides some useful functionality in that it is now possible to change the number of memory controllers and their address mapping in combination with checkpointing. Thus, the host and guest view of the memory backing store are now completely separate. With this patch, the individual memory controllers are far simpler as all responsibility for serializing/unserializing is moved to the physical memory. Currently, the functionality is more or less moved from AbstractMemory to PhysicalMemory without any major changes. However, in a future patch the physical memory will also resolve any ranges that are interleaved and properly assign the backing store to the memory controllers, and keep the host memory as a single contigous chunk per address range. Functionality for future extensions which involve CPU virtualization also enable the host to get pointers to the backing store.	2012-10-15 08:12:32 -04:00
Andreas Hansson	d7ad8dc608	Checkpoint: Make system serialize call children This patch changes how the serialization of the system works. The base class had a non-virtual serialize and unserialize, that was hidden by a function with the same name for a number of subclasses (most likely not intentional as the base class should have been virtual). A few of the derived systems had no specialization at all (e.g. Power and x86 that simply called the System::serialize), but MIPS and Alpha adds additional symbol table entries to the checkpoint. Instead of overriding the virtual function, the additional entries are now printed through a virtual function (un)serializeSymtab. The reason for not calling System::serialize from the two related systems is that a follow up patch will require the system to also serialize the PhysicalMemory, and if this is done in the base class if ends up being between the general parts and the specialized symbol table. With this patch, the checkpoint is not modified, as the order of the segments is unchanged.	2012-10-15 08:12:29 -04:00
Andreas Hansson	0c58106b6e	Mem: Use deque instead of list for bus retries This patch changes the data structure used to keep track of ports that should be told to retry. As the bus is doing this in an FCFS way, there is no point having a list. A deque is a better match (and is at least in theory a better choice from a performance point of view).	2012-10-15 08:12:25 -04:00
Andreas Hansson	93a159875a	Fix: Address a few minor issues identified by cppcheck This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only get's called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used.	2012-10-15 08:12:23 -04:00
Andreas Hansson	88554790c3	Mem: Use cycles to express cache-related latencies This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.	2012-10-15 08:10:54 -04:00
Andreas Hansson	1c321b8847	Regression: Use CPU clock and 32-byte width for L1-L2 bus This patch changes the CoherentBus between the L1s and L2 to use the CPU clock and also four times the width compared to the default bus. The parameters are not intending to fit every single scenario, but rather serve as a better startingpoint than what we previously had. Note that the scripts that do not use the addTwoLevelCacheHiearchy are not affected by this change. A separate patch will update the stats.	2012-10-15 08:08:08 -04:00
Andreas Hansson	930db9257d	Clock: Inherit the clock from parent by default This patch changes the default 1 Tick clock period to a proxy that resolves the parents clock. As a result of this, the caches and L1-to-L2 bus, for example, will automatically use the clock period of the CPU unless explicitly overridden. To ensure backwards compatibility, the System class overrides the proxy and specifies a 1 Tick clock. We could change this to something more reasonable in a follow-on patch, perhaps 1 GHz or something similar. With this patch applied, all clocked objects should have a reasonable clock period set, and could start specifying delays in Cycles instead of absolute time.	2012-10-15 08:07:07 -04:00
Andreas Hansson	8cc503f1dd	Param: Fix proxy traversal to support chained proxies This patch modifies how proxies are traversed and unproxied to allow chained proxies. The issue that is solved manifested itself when a proxy during its evaluation ended up being hitting another proxy, and the second one got evaluated using the object that was originally used for the first proxy. For a more tangible example, see the following patch on making the default clock being inherited from the parent. In this patch, the CPU clock is a proxy Parent.clock, which is overridden in the system to be an actual value. This all works fine, but the AlphaLinuxSystem has a boot_cpu_frequency parameter that is Self.cpu[0].clock.frequency. When the latter is evaluated, it all happens relative to the current object of the proxy, i.e. the system. Thus the cpu.clock is evaluated as Parent.clock, but using the system rather than the cpu as the object to enquire.	2012-10-15 08:07:06 -04:00
Andreas Hansson	36d199b9a9	Mem: Use range operations in bus in preparation for striping This patch transitions the bus to use the AddrRange operations instead of directly accessing the start and end. The change facilitates the move to a more elaborate AddrRange class that also supports address striping in the bus by specifying interleaving bits in the ranges. Two new functions are added to the AddrRange to determine if two ranges intersect, and if one is a subset of another. The bus propagation of address ranges is also tweaked such that an update is only propagated if the bus received information from all the downstream slave modules. This avoids the iteration and need for the cycle-breaking scheme that was previously used.	2012-10-15 08:07:04 -04:00
Andreas Hansson	43ca8415e8	Mem: Determine bus block size during initialisation This patch moves the block size computation from findBlockSize to initialisation time, once all the neighbouring ports are connected. There is no need to dynamically update the block size, and the caching of the value effectively avoided that anyhow. This is very similar to what was already in place, just with a slightly leaner implementation.	2012-10-11 06:38:43 -04:00
Andreas Hansson	5dba9225f7	Doxygen: Update the version of the Doxyfile This patch bumps the Doxyfile to match more recent versions of Doxygen. The sections that are deprecated have been removed, and the new ones added. The project name has also been updated.	2012-10-11 06:38:42 -04:00
Nilay Vaish	88ba1c452b	ruby: makes some members non-static This patch makes some of the members (profiler, network, memory vector) of ruby system non-static.	2012-10-02 14:35:45 -05:00
Nilay Vaish	4488379244	ruby: changes to simple network This patch makes the Switch structure inherit from BasicRouter, as is done in two other networks.	2012-10-02 14:35:45 -05:00
Nilay Vaish	b370f6a7b2	ruby: rename template_hack to template I don't like using the word hack. Hence, the patch.	2012-10-02 14:35:44 -05:00
Nilay Vaish	d58f84c481	ruby: remove unused code in protocols	2012-10-02 14:35:44 -05:00
Nilay Vaish	73eafe4849	ruby: remove some unused things in slicc This patch removes the parts of slicc that were required for multi-chip protocols. Going ahead, it seems multi-chip protocols would be implemented by playing with the network itself.	2012-10-02 14:35:43 -05:00
Nilay Vaish	3c9d3b16d8	ruby: move functional access to ruby system This patch moves the code for functional accesses to ruby system. This is because the subsequent patches add support for making functional accesses to the messages in the interconnect. Making those accesses from the ruby port would be cumbersome.	2012-10-02 14:35:42 -05:00
Nilay Vaish	95664da097	MI coherence protocol: add copyright notice	2012-09-30 13:20:53 -05:00
Djordje Kovacevic	80a26a3e39	MEM: Put memory system document into doxygen	2012-09-25 11:49:41 -05:00
Mrinmoy Ghosh	6fc0094337	Cache: add a response latency to the caches In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.	2012-09-25 11:49:41 -05:00
Sascha Bischoff	74ab69c7ea	Statistics: Add a function to configure periodic stats dumping This patch adds a function, periodicStatDump(long long period), which will dump and reset the statistics every period. This function is designed to be called from the python configuration scripts. This allows the periodic stats dumping to be configured more easilly at run time. The period is currently specified as a long long as there are issues passing Tick into the C++ from the python as they have conflicting definitions. If the period is less than curTick, the first occurance occurs at curTick. If the period is set to 0, then the event is descheduled and the stats are not periodically dumped. Due to issues when resumung from a checkpoint, the StatDump event must be moved forward such that it occues AFTER the current tick. As the function is called from the python, the event is scheduled before the system resumes from the checkpoint. Therefore, the event is moved using the updateEvents() function. This is called from simulate.py once the system has resumed from the checkpoint. NOTE: It should be noted that this is a fairly temporary patch which re-adds the capability to extract temporal information from the communication monitors. It should not be used at the same time as anything that relies on dumping the statistics based on in simulation events i.e. a context switch.	2012-09-25 11:49:41 -05:00
Dam Sunwoo	acbb7a2eed	ARM: added support for flattened device tree blobs Newer Linux kernels require DTB (device tree blobs) to specify platform configurations. The input DTB filename can be specified through gem5 parameters in LinuxArmSystem.	2012-09-25 11:49:41 -05:00
Ali Saidi	5adb4ddc12	O3: Pack the comm structures a bit better to reduce their size.	2012-09-25 11:49:40 -05:00
Ali Saidi	396600de10	mem: Add a gasket that allows memory ranges to be re-mapped. For example if DRAM is at two locations and mirrored this patch allows the mirroring to occur.	2012-09-25 11:49:40 -05:00
Ali Saidi	0c99d21ad7	ARM: Squash outstanding walks when instructions are squashed.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	6f603e0807	arm: Use a static_assert to test that miscRegName[] is complete Instead of statically defining miscRegName to contain NUM_MISCREGS elements, let the compiler determine the length of the array. This allows us to use a static_assert to test that all registers are listed in the name vector.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	4544f3def4	base: Check for static_assert support and provide fallback C++11 has support for static_asserts to provide compile-time assertion checking. This is very useful when testing, for example, structure sizes to make sure that the compiler got the right alignment or vector sizes.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	6598241f2c	sim: Move CPU-specific methods from SimObject to the BaseCPU class	2012-09-25 11:49:40 -05:00
Andreas Sandberg	5f32eceeda	sim: Remove SimObject::setMemoryMode Remove SimObject::setMemoryMode from the main SimObject class since it is only valid for the System class. In addition to removing the method from the C++ sources, this patch also removes getMemoryMode and changeTiming from SimObject.py and updates the simulation code to call the (get\|set)MemoryMode method on the System object instead.	2012-09-25 11:49:40 -05:00
Djordje Kovacevic	d060a28a29	CPU: Add abandoned instructions to O3 Pipe Viewer	2012-09-25 11:49:40 -05:00
Nathanael Premillieu	bfffbb6797	ARM: Inst writing to cntrlReg registers not set as control inst Deletion of the fact that instructions that writes to registers of type "cntrlReg" are not set as control instruction (flag IsControl not set).	2012-09-25 11:49:40 -05:00
Ali Saidi	04ca96427c	ARM: Predict target of more instructions that modify PC.	2012-09-25 11:49:40 -05:00
Andreas Sandberg	1b29352dd5	build: Add missing dependencies when building param SWIG interfaces This patch adds an explicit dependency between param_%s.i and the Python source file defining the object. Previously, the build system didn't rebuild SWIG interfaces correctly when an object's Python sources were updated.	2012-09-25 11:49:40 -05:00
Joel Hestness	4095af5fd6	RubyPort and Sequencer: Fix draining Fix the drain functionality of the RubyPort to only call drain on child ports during a system-wide drain process, instead of calling each time that a ruby_hit_callback is executed. This fixes the issue of the RubyPort ports being reawakened during the drain simulation, possibly with work they didn't previously have to complete. If they have new work, they may call process on the drain event that they had not registered work for, causing an assertion failure when completing the drain event. Also, in RubyPort, set the drainEvent to NULL when there are no events to be drained. If not set to NULL, the drain loop can result in stale drainEvents used.	2012-09-23 13:57:08 -05:00
Andreas Hansson	3b6a143ec5	DRAM: Introduce SimpleDRAM to capture a high-level controller This patch introduces a high-level model of a DRAM controller, with a basic read/write buffer structure, a selectable and customisable arbiter, a few address mapping options, and the basic DRAM timing constraints. The parameters make it possible to turn this model into any desired DDRx/LPDDRx/WideIOx memory controller. The intention is not to be cycle accurate or capture every aspect of a DDR DRAM interface, but rather to enable exploring of the high-level knobs with a good simulation speed. Thus, contrary to e.g. DRAMSim this module emphasizes simulation speed with a good-enough accuracy. This module is merely a starting point, and there are plenty additions and improvements to come. A notable addition is the support for address-striping in the bus to enable a multi-channel DRAM controller. Also note that there are still a few "todo's" in the code base that will be addressed as we go along. A follow-up patch will add basic performance regressions that use the traffic generator to exercise a few well-defined corner cases.	2012-09-21 11:48:13 -04:00
Andreas Hansson	d75b1b5a73	TrafficGen: Add a basic traffic generator This patch adds a traffic generator to the code base. The generator is aimed to be used as a black box model to create appropriate use-cases and benchmarks for the memory system, and in particular the interconnect and the memory controller. The traffic generator is a master module, where the actual behaviour is captured in a state-transition graph where each state generates some sort of traffic. By constructing a graph it is possible to create very elaborate scenarios from basic generators. Currencly the set of generators include idling, linear address sweeps, random address sequences and playback of traces (recording will be done by the Communication Monitor in a follow-up patch). At the moment the graph and the states are described in an ad-hoc line-based format, and in the future this should be aligned with our used of e.g. the Google protobufs. Similarly for the traces, the format is currently a simplistic ad-hoc line-based format that merely serves as a starting point. In addition to being used as a black-box model for system components, the traffic generator is also useful for creating test cases and regressions for the interconnect and memory system. In future patches we will use the traffic generator to create DRAM test cases for the controller model. The patch following this one adds a basic regressions which also contains an example configuration script and trace file for playback.	2012-09-21 11:48:08 -04:00
Andreas Hansson	4aee3aa073	Mem: Tidy up bus member variables types This patch merely tidies up the types used for the bus member variables. It also makes the constant ones const.	2012-09-21 10:11:24 -04:00
Lluc Alvarez	c8de765468	SE: Ignore FUTEX_PRIVATE_FLAG of sys_futex This patch ignores the FUTEX_PRIVATE_FLAG of the sys_futex system call in SE mode. With this patch, when sys_futex with the options FUTEX_WAIT_PRIVATE or FUTEX_WAKE_PRIVATE is emulated, the FUTEX_PRIVATE_FLAG is ignored and so their behaviours are the regular FUTEX_WAIT and FUTEX_WAKE. Emulating FUTEX_WAIT_PRIVATE and FUTEX_WAKE_PRIVATE as if they were non-private is safe from a functional point of view. The FUTEX_PRIVATE_FLAG does not change the semantics of the futex, it's just a mechanism to improve performance under certain circunstances that can be ignored in SE mode.	2012-09-21 04:51:18 -04:00
Anthony Gutierrez	9cd0c5ecc8	bus: removed outdated warn regarding 64 B block sizes this warn is outdated as 64 B blocks are very common, and even the default size for some CPU types. E.g., arm_detailed.	2012-09-20 17:25:52 -04:00
Andreas Hansson	a731f8f9dd	Mem: Remove the file parameter from AbstractMemory This patch removes the unused file parameter from the AbstractMemory. The patch serves to make it easier to transition to a separation of the actual contigious host memory backing store, and the gem5 memory controllers. Without the file parameter it becomes easier to hide the creation of the mmap in the PhysicalMemory, as there are no longer any reasons to expose the actual contigious ranges to the user. To the best of my knowledge there is no use of the parameter, so the change should not affect anyone.	2012-09-19 06:15:46 -04:00
Andreas Hansson	ffb6aec603	AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. --HG-- rename : src/base/range.hh => src/base/addr_range.hh rename : src/base/range_map.hh => src/base/addr_range_map.hh	2012-09-19 06:15:44 -04:00
Andreas Hansson	c34df76272	AddrRange: Simplify Range by removing stream input/output This patch simplifies the Range class in preparation for the introduction of a more specific AddrRange class that allows interleaving/striping. The only place where the parsing was used was in the unit test.	2012-09-19 06:15:43 -04:00
Andreas Hansson	12c291f9d7	AddrRange: Remove unused range_multimap This patch simply removes the unused range_multimap in preparation for a more specific AddrRangeMap that also allows interleaving in addition to pure ranges.	2012-09-19 06:15:42 -04:00
Andreas Hansson	fccbf8bb45	AddrRange: Simplify AddrRange params Python hierarchy This patch simplifies the Range object hierarchy in preparation for an address range class that also allows striping (e.g. selecting a few bits as matching in addition to the range). To extend the AddrRange class to an AddrRegion, the first step is to simplify the hierarchy such that we can make it as lean as possible before adding the new functionality. The only class using Range and MetaRange is AddrRange, and the three classes are now collapsed into one.	2012-09-19 06:15:41 -04:00
Nilay Vaish	33c904e0a5	ruby: eliminate typedef integer_t	2012-09-18 22:49:12 -05:00
Nilay Vaish	86b1c0fd54	ruby: avoid using g_system_ptr for event scheduling This patch removes the use of g_system_ptr for event scheduling. Each consumer object now needs to specify upfront an EventManager object it would use for scheduling events. This makes the ruby memory system more amenable for a multi-threaded simulation.	2012-09-18 22:46:34 -05:00
Andreas Hansson	7c55464aac	Mem: Add a maximum bandwidth to SimpleMemory This patch makes a minor addition to the SimpleMemory by enforcing a maximum data rate. The bandwidth is configurable, and a reasonable value (12.8GB/s) has been choosen as the default. The changes do add some complexity to the SimpleMemory, but they should definitely be justifiable as this enables a far more realistic setup using even this simple memory controller. The rate regulation is done for reads and writes combined to reflect the bidirectional data busses used by most (if not all) relevant memories. Moreover, the regulation is done per packet as opposed to long term, as it is the short term data rate (data bus width times frequency) that is the limiting factor. A follow-up patch bumps the stats for the regressions.	2012-09-18 10:30:02 -04:00
Andreas Hansson	d1f3a3b91a	gcc: Enable Link-Time Optimization for gcc >= 4.6 This patch adds Link-Time Optimization when building the fast target using gcc >= 4.6, and adds a scons flag to disable it (-no-lto). No check is performed to guarantee that the linker supports LTO and use of the linker plugin, so the user has to ensure that binutils GNU ld >= 2.21 or the gold linker is available. Typically, if gcc >= 4.6 is available, the latter should not be a problem. Currently the LTO option is only useful for gcc >= 4.6, due to the limited support on clang and earlier versions of gcc. The intention is to also add support for clang once the LTO integration matures. The same number of jobs is used for the parallel phase of LTO as the jobs specified on the scons command line, using the -flto=n flag that was introduced with gcc 4.6. The gold linker also supports concurrent and incremental linking, but this is not used at this point. The compilation and linking time is increased by almost 50% on average, although ARM seems to be particularly demanding with an increase of almost 100%. Also beware when using this as gcc uses a tremendous amount of memory and temp space in the process. You have been warned. After some careful consideration, and plenty discussions, the flag is only added to the fast target, and the warning that was issued in an earlier version of this patch is now removed. Similarly, the flag used to enable LTO, now the default is to use it, and the flag has been modified to disable LTO. The rationale behind this decision is that opt is used for development, whereas fast is only used for long runs, e.g. regressions or more elaborate experiments where the additional compile and link time is amortized by a much larger run time. When it comes to the return on investment, the regression seems to be roughly 15% faster with LTO. For a bit more detail, I ran twolf on ARM.fast, with three repeated runs, and they all finish within 42 minutes (+- 25 seconds) without LTO and 31 minutes (+- 25 seconds) with LTO, i.e. LTO gives an impressive >25% speed-up for this case. Without LTO (ARM.fast twolf) real 42m37.632s user 42m34.448s sys 0m0.390s real 41m51.793s user 41m50.384s sys 0m0.131s real 41m45.491s user 41m39.791s sys 0m0.139s With LTO (ARM.fast twolf) real 30m33.588s user 30m5.701s sys 0m0.141s real 31m27.791s user 31m24.674s sys 0m0.111s real 31m25.500s user 31m16.731s sys 0m0.106s	2012-09-14 12:13:22 -04:00
Andreas Hansson	a57eda0843	scons: Add a target for google-perftools profiling This patch adds a new target called 'perf' that facilitates profiling using google perftools rather than gprof. The perftools CPU profiler offers plenty useful information in addition to gprof, and the latter is kept mostly to offer profiling also on non-Linux hosts.	2012-09-14 12:13:21 -04:00
Andreas Hansson	224ea5fba6	scons: Restructure ccflags and ldflags This patch restructures the ccflags such that the common parts are defined in a single location, also capturing all the target types in a single place. The patch also adds a corresponding ldflags in preparation for google-perf profiling support and the addition of Link-Time Optimization.	2012-09-14 12:13:20 -04:00
Andreas Hansson	806a1144ce	scons: Use c++0x with gcc >= 4.4 instead of 4.6 This patch shifts the version of gcc for which we enable c++0x from 4.6 to 4.4 The more long term plan is to see what the c++0x features can bring and what level of support would be enabled simply by bumping the required version of gcc from 4.3 to 4.4. A few minor things had to be fixed in the code base, most notably the choice of a hashmap implementation. In the Ruby Sequencer there were also a few minor issues that gcc 4.4 was not too happy about.	2012-09-14 12:13:18 -04:00
Joel Hestness	234fa4cf7e	Standard Switch: Drain the system before switching CPUs When switching from an atomic CPU to any of the timing CPUs, a drain is unnecessary since no events are scheduled in atomic mode. However, when trying to switch CPUs starting with a timing CPU, there may be events scheduled. This change ensures that all events are drained from the system by calling m5.drain before switching CPUs.	2012-09-12 21:41:37 -05:00
Joel Hestness	16dcb723c1	Base CPU: Initialize profileEvent to NULL The profileEvent pointer is tested against NULL in various places, but it is not initialized unless running in full-system mode. In SE mode, this can result in segmentation faults when profileEvent default intializes to something other than NULL.	2012-09-12 21:40:28 -05:00
Jason Power	aa8bcd15ec	Ruby: Modify Scons so that we can put .sm files in extras Also allows for header files which are required in slicc generated code to be in a directory other than src/mem/ruby/slicc_interface.	2012-09-12 14:52:04 -05:00
Anthony Gutierrez	c6927ed138	stats: remove duplicate instruction stats from the commit stage these stats are duplicates of insts/opsCommitted, cause confusion, and are poorly named.	2012-09-12 11:35:52 -04:00
Andreas Hansson	292d8252a4	clang: Fix issues identified by the clang static analyzer This patch addresses a few minor issues reported by the clang static analyzer. The analysis was run with: scan-build -disable-checker deadcode \ -enable-checker experimental.core \ -disable-checker experimental.core.CastToStruct \ -enable-checker experimental.cpluscplus	2012-09-11 14:15:47 -04:00
Lena Olson	584eba3ab6	Cache: Split invalidateBlk up to seperate block vs. tags This seperates the functionality to clear the state in a block into blk.hh and the functionality to udpate the tag information into the tags. This gets rid of the case where calling invalidateBlk on an already-invalid block does something different than calling it on a valid block, which was confusing.	2012-09-11 14:14:49 -04:00
Nilay Vaish	f47c2f6415	X86: make use of register predication The patch introduces two predicates for condition code registers -- one tests if a register needs to be read, the other tests whether a register needs to be written to. These predicates are evaluated twice -- during construction of the microop and during its execution. Register reads and writes are elided depending on how the predicates evaluate.	2012-09-11 09:33:42 -05:00
Nilay Vaish	6369df59c8	x86: Add a separate register for D flag bit The D flag bit is part of the cc flag bit register currently. But since it is not being used any where in the implementation, it creates an unnecessary dependency. Hence, it is being moved to a separate register.	2012-09-11 09:25:43 -05:00
Nilay Vaish	3700e5448a	ISA Parser: Allow predication of source and destination registers This patch is meant for allowing predicated reads and writes. Note that this predication is different from the ISA provided predication. They way we currently provide the ISA description for X86, we read/write registers that do not need to be actually read/written. This is likely to be true for other ISAs as well. This patch allows for read and write predicates to be associated with operands. It allows for the register indices for source and destination registers to be decided at the time when the microop is constructed. The run time indicies come in to play only when the at least one of the predicates has been provided. This patch will not affect any of the ISAs that do not provide these predicates. Also the patch assumes that the order in which operands appear in any function of the microop is same across all the functions of the microops. A subsequent patch will enable predication for the x86 ISA.	2012-06-03 10:59:04 -05:00
Nilay Vaish	637c6c7e32	Ruby: Use uint32_t instead of uint32 everywhere	2012-09-11 09:24:45 -05:00
Nilay Vaish	f00347a20f	Ruby: Use uint8_t instead of uint8 everywhere	2012-09-11 09:23:56 -05:00
Nilay Vaish	c5bf1390aa	Ruby System: Convert to Clocked Object This patch moves Ruby System from being a SimObject to recently introduced ClockedObject.	2012-09-10 12:21:01 -05:00
Nilay Vaish	4e6f048ef0	Ruby Slicc: remove the call to cin.get() function If I understand correctly, this was put in place so that a debugger can be attached when the protocol aborts. While this sounds useful, it is a problem when the simulation is not being actively monitored. I think it is better to remove this.	2012-09-10 12:20:34 -05:00
Marco Elver	9e0edbcea8	Mem: Allow serializing of more than INT_MAX bytes Despite gzwrite taking an unsigned for length, it returns an int for bytes written; gzwrite fails if (int)len < 0. Because of this, call gzwrite with len no larger than INT_MAX: write in blocks of INT_MAX if data to be written is larger than INT_MAX.	2012-09-10 11:57:43 -04:00
Palle Lyckegaard	21d4d50ba1	NetBSD: Build on NetBSD Minor patch against so building on NetBSD is possible.	2012-09-10 11:57:42 -04:00
Andreas Hansson	3215ed9754	AddrRange: Remove the unused range_ops header This patch prunes the range_ops header that is no longer used. The bridge used it to do filtering of address ranges, but this is changed since quite some time. Ultimately this patch aims to simplify the handling of ranges before specialising the AddrRange to an AddrRegion that also allows striping bits to be selected.	2012-09-10 11:57:40 -04:00
Andreas Hansson	1f9c3bcb46	Inet: Remove the SackRange and its use This patch aims to simplify the use of the Range class before introducing a more elaborate AddrRegion to replace the AddrRange. The SackRange is the only use of the range class besides address ranges, and the removal of this use makes for an easier modification of the range class. The functionlity that is removed with this patch is not used anywhere throughout the code base.	2012-09-10 11:57:39 -04:00
Andreas Hansson	cf5935445f	Device: Bump PIO and PCI latencies to more reasonable values This patch addresses a previously highlighted issue with the default latencies used for PIO and PCI devices. The values are merely educated guesses and might not represent the particular system you want to model. However, the values in this patch are definitely far more realistic than the previous ones. In i8254xGBe, the writeConfig method is updated to use configDelay instead of pioDelay. A follow-up patch will update the regression stats.	2012-09-10 11:57:36 -04:00
Andreas Sandberg	d4a6d9846a	sim: Update the SimObject documentation Includes a small change in sim_object.cc that adds the name space to the output stream parameter in serializeAll. Leaving out the name space unfortunately confuses Doxygen.	2012-09-07 14:20:53 -05:00
Andreas Sandberg	2f397f314b	sim: Remove the unused SimObject::regFormulas method Simulation objects normally register derived statistics, presumably what regFormulas originally was meant for, in regStats(). This patch removes regRegformulas since there is no need to have a separate method call to register formulas.	2012-09-07 14:20:53 -05:00
Ali Saidi	03ff612054	O3: Get rid of incorrect assert in RAS.	2012-09-07 14:20:53 -05:00
Ali Saidi	2059c01673	dev: Fix bifield definition in timer_cpulocal.hh Bitfield definition in the local timer model for ARM had the bitfield range numbers reversed which could lead to buggy behavior.	2012-09-07 14:20:53 -05:00
Ali Saidi	5217d5a451	Igbe: Newer kernels seem to allow TSO headers and packet data to be in one desc Implement some code we used to panic on as it actually does happen with the e1000 driver in Linux 3.3+. We used to assume that a TSO header would never be part of a larger payload, however it appears as though it now can be.	2012-09-07 14:20:53 -05:00
Krishnendra Nathella	3f5ee1cf8c	sim: add validation to make sure there is memory where we're loading the kernel	2012-09-07 14:20:53 -05:00
Ali Saidi	3742b19b36	loader: initialize all memory in the ObjectFile objects. Some bare metal build flows seem to build binaries that we aren't necessarily expecting. Initialize everything to 0, so we don't make any assumptions about what is or isn't in the binary.	2012-09-07 14:20:52 -05:00
Ali Saidi	8fc0cef611	ARM: Fix one of the timers used in the VExpress EMM platform.	2012-09-07 14:20:52 -05:00
Andreas Hansson	287ea1a081	Param: Transition to Cycles for relevant parameters This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py.	2012-09-07 12:34:38 -04:00
Joel Hestness	6924e10978	Ruby Memory Controller: Fix clocking	2012-09-05 20:51:41 -05:00
Jason Power	494f6a858e	Ruby: Correct DataBlock =operator The =operator for the DataBlock class was incorrectly interpreting the class member m_alloc. This variable stands for whether the assigned memory for the data block needs to be freed or not by the class itself. It seems that the =operator interpreted the variable as whether the memory is assigned to the data block. This wrong interpretation was causing values not to propagate to RubySystem::m_mem_vec_ptr. This caused major issues with restoring from checkpoints when using a protocol which verified that the cache data was consistent with the backing store (i.e. MOESI-hammer).	2012-08-28 17:57:51 -05:00
Andreas Hansson	0cacf7e817	Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.	2012-08-28 14:30:33 -04:00
Andreas Hansson	d53d04473e	Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.	2012-08-28 14:30:31 -04:00
Andreas Hansson	d14e5857c7	Port: Stricter port bind/unbind semantics This patch tightens up the semantics around port binding and checks that the ports that are being bound are currently not connected, and similarly connected before unbind is called. The patch consequently also changes the order of the unbind and bind for the switching of CPUs to ensure that the rules are adhered to. Previously the ports would be "over-written" without any check. There are no changes in behaviour due to this patch, and the only place where the unbind functionality is used is in the CPU.	2012-08-28 14:30:27 -04:00
Andreas Hansson	105ad88d35	Checker: Fix checker CPU ports This patch updates how the checker CPU handles the ports such that the regressions will once again run without causing a panic. A minor amount of tidying up was also done as part of this patch.	2012-08-28 14:30:24 -04:00
Andreas Hansson	d090f4d930	swig: Disable unused value warning with llvm 3.1 compilers This patch disables a warning for unused values which causes problems when compiling the swig-generated sources using recent llvm-based compilers like llvm-gcc and clang.	2012-08-28 14:30:22 -04:00
Anthony Gutierrez	5b1614de02	sim: fix overflow check in simulate because Tick is now unsigned	2012-08-27 20:53:20 -04:00
Nilay Vaish	85c7352462	Ruby: remove README.debugging and Decommissioning_note These files were relevant when Ruby was part of GEMS. They are not required any longer.	2012-08-27 14:57:46 -05:00
Nilay Vaish	0737837109	System: Remove redundant call to startupCPU	2012-08-27 01:14:46 -05:00
Nilay Vaish	9190940511	Ruby: Remove RubyEventQueue This patch removes RubyEventQueue. Consumer objects now rely on RubySystem or themselves for scheduling events.	2012-08-27 01:00:55 -05:00
Nilay Vaish	7122b83d8f	Ruby Memory Vector: Allow more than 4GB of memory The memory size variable was a 32-bit int. This meant that the size of the memory was limited to 4GB. This patch changes the type of the variable to 64-bit to support larger memory sizes. Thanks to Raghuraman Balasubramanian for bringing this to notice.	2012-08-27 01:00:54 -05:00
Nilay Vaish	b422994fea	MESI Protocol: Correct the virtual network in profile functions The virtual network in a couple of places was incorrectly mentioned as 3 in place of 1. This is being corrected.	2012-08-25 15:49:06 -05:00
Nilay Vaish	01f1430833	MESI Coherence Protocol: Add copyright notice	2012-08-25 13:16:45 -05:00
Andreas Hansson	2c1052cd4d	DMA: Refactor the DMA device and align timing and atomic This patch does a bunch of house-keeping updates on the DMA, including indentation, and formatting, but most importantly breaks out the response handling such that it can be shared between the atomic and timing modes. It also removes a potential bug caused by the atomic handling of responses only deleting the allocated request (pkt->req) once the DMA action completes instead of doing so for every packet. Before this patch, the handling of responses was near identical for atomic and timing, but the code was simply duplicated. With this patch, the handleResp method deals with the responses in both cases. There are further updates to make after removing the NACKs, but that will be part of a separate follow-up patch. This patch does not change the behaviour of any regression.	2012-08-22 11:40:01 -04:00
Andreas Hansson	c60db56741	Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.	2012-08-22 11:39:59 -04:00
Andreas Hansson	a6074016e2	Bridge: Remove NACKs in the bridge and unify with packet queue This patch removes the NACKing in the bridge, as the split request/response busses now ensure that protocol deadlocks do not occur, i.e. the message-dependency chain is broken by always allowing responses to make progress without being stalled by requests. The NACKs had limited support in the system with most components ignoring their use (with a suitable call to panic), and as the NACKs are no longer needed to avoid protocol deadlocks, the cleanest way is to simply remove them. The bridge is the starting point as this is the only place where the NACKs are created. A follow-up patch will remove the code that deals with NACKs in the endpoints, e.g. the X86 table walker and DMA port. Ultimately the type of packet can be complete removed (until someone sees a need for modelling more complex protocols, which can now be done in parts of the system since the port and interface is split). As a consequence of the NACK removal, the bridge now has to send a retry to a master if the request or response queue was full on the first attempt. This change also makes the bridge ports very similar to QueuedPorts, and a later patch will change the bridge to use these. A first step in this direction is taken by aligning the name of the member functions, as done by this patch. A bit of tidying up has also been done as part of the simplifications. Surprisingly, this patch has no impact on any of the regressions. Hence, there was never any NACKs issued. In a follow-up patch I would suggest changing the size of the bridge buffers set in FSConfig.py to also test the situation where the bridge fills up.	2012-08-22 11:39:58 -04:00
Andreas Hansson	e317d8b9ff	Port: Extend the QueuedPort interface and use where appropriate This patch extends the queued port interfaces with methods for scheduling the transmission of a timing request/response. The methods are named similar to the corresponding sendTiming(Snoop)Req/Resp, replacing the "send" with "sched". As the queues are currently unbounded, the methods always succeed and hence do not return a value. This functionality was previously provided in the subclasses by calling PacketQueue::schedSendTiming with the appropriate parameters. With this change, there is no need to introduce these extra methods in the subclasses, and the use of the queued interface is more uniform and explicit.	2012-08-22 11:39:56 -04:00
Andreas Hansson	70e99e0b91	Device: Remove overloaded pio_latency parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The PciConfigAll now also uses a Param.Latency rather than a Param.Tick. For backwards compatibility it still sets the pio_latency to 1 tick. All the comments have also been updated to not state that it is in simticks when it is not necessarily the case.	2012-08-21 05:50:03 -04:00
Andreas Hansson	a81c969529	CPU: Remove overloaded function_trace_start parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The inorder CPU is particularly interesting as it uses a different name for the parameter, and never make any use of it internally.	2012-08-21 05:49:43 -04:00
Andreas Hansson	5803309574	PacketQueue: Allow queuing in the same tick as desired send tick This patch allows packets to be enqueued in the same tick as they are intended to be sent. This does not imply they actually are sent that tick, although that is possible. This change is useful for module that use the queued ports primarly to avoid handling the flow control involved in sending and retrying packets.	2012-08-21 05:49:24 -04:00
Andreas Hansson	4be1ae3cf8	EventManager: Remove test for NULL pointer in constructor This patch tidies up the EventManager constructor and prunes a corner case where the EventManager would initialise its eventq pointer to NULL. This would cause segmentation faults on actual use and should never happen.	2012-08-21 05:49:18 -04:00
Andreas Hansson	016593f2e9	Clock: Make Tick unsigned and remove UTick This patch makes the Tick unsigned and removes the UTick typedef. The ticks should never be negative, and there was only one major issue with removing it, caused by the o3 CPU using a -1 as an initial value. The patch has no impact on any regressions.	2012-08-21 05:49:09 -04:00
Andreas Hansson	452217817f	Clock: Move the clock and related functions to ClockedObject This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach clock domains. The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules. The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced).	2012-08-21 05:49:01 -04:00
Nilay Vaish	0160d51483	Ruby Banked Array: add copyrights	2012-08-19 13:05:53 -05:00
Jason Power	44b4c96253	Ruby: Add RubySystem parameter to MemoryControl This guarantees that RubySystem object is created before the MemoryController object is created.	2012-08-16 23:39:36 -05:00
Nilay Vaish	649e377937	Alpha System: override startup(), instead of loadState() Alpha System was overriding loadState() function to setup some functional event. The system tried to read/write to memory before the Ruby memory had unserialized the state. With this patch, Alpha System overrides the startup() function, and sets up functional events in this function. This works because startup() is called after Ruby memory system has unserialized the memory state.	2012-08-16 23:45:21 -05:00
Anthony Gutierrez	0b3897fc90	O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation.	2012-08-15 10:38:08 -04:00
Ali Saidi	dd1b346584	sysemul: bump all linux versions of for syscal emulation to 3.0. New tool chains seem to be looking for kernel versions newer than what this this was previously set to. Also take this opportunity to change the hostname we report in uname to sim.gem5.org.	2012-08-15 10:38:04 -04:00
Jason Power	11411cc9c7	Ruby: Clean up topology changes This patch moves instantiateTopology into Ruby.py and removes the mem/ruby/network/topologies directory. It also adds some extra inheritance to the topologies to clean up some issues in the existing topologies.	2012-08-10 13:50:42 -05:00
Nilay Vaish	706e84f2b8	System: set kernel to null, if unspecified.	2012-08-08 13:40:32 -05:00
Marc Orr	7cef6b9bef	syscall emulation: Enabled getrlimit and getrusage for x86. Added/moved rlimit constants to base linux header file. This patch is a revised version of Vince Weaver's earlier patch.	2012-08-06 19:52:56 -07:00
Steve Reinhardt	f4b424cd53	SETranslatingPortProxy: fix bug in tryReadString() Off-by-one loop termination meant that we were stuffing the terminating '\0' into the std::string value, which makes for difficult-to-debug string comparison failures.	2012-08-06 16:57:11 -07:00
Steve Reinhardt	73ef8bd168	process: add progName() virtual function This replaces a (potentially uninitialized) string field with a virtual function so that we can have a safe interface without requiring changes to the eio code.	2012-08-06 16:55:34 -07:00
Steve Reinhardt	e232152db6	syscall_emul: clean up open() code a bit.	2012-08-06 16:55:28 -07:00
Steve Reinhardt	b647b48bf4	str: add an overloaded startswith() utility method for various string types and use it in a few places.	2012-08-06 16:52:49 -07:00
Marc Orr	d55115936e	syscall emulation: Clean up ioctl handling, and implement for x86. Enable different whitelists for different OS/arch combinations, since some use the generic Linux definitions only, and others use definitions inherited from earlier Unix flavors on those architectures. Also update x86 function pointers so ioctl is no longer unimplemented on that platform. This patch is a revised version of Vince Weaver's earlier patch.	2012-08-06 16:52:40 -07:00
Jason Power	6721b3e325	Ruby NetDest: add assert for bad element in netdest	2012-08-01 17:07:34 -05:00
Anthony Gutierrez	630068be6f	dma: remove unused variable this patch removes the actionInProgress field from the DmaPort class. this variable is only defined and initiated in the ctor. it is never used.	2012-07-27 16:08:05 -04:00
Anthony Gutierrez	8133f2460f	checker: make checker cpu id match its host's cpu id when using the checker i ran into problems where an instruction reading the cpu id register failed because the ids did not match, and hence, the result of the instruction did not match. this patch ensures that the ids match so this instruction does not fail. this problem only seemed to manifest itself when multiple cores were in the system, either multi-core, or extra switched- out cores present in the system.	2012-07-27 16:08:04 -04:00
Anthony Gutierrez	7bf14aedbf	cache: don't allow dirty data in the i-cache removes the optimization that forwards an exclusive copy to a requester on a read, only for the i-cache. this optimization isn't necessary because we typically won't be writing to the i-cache.	2012-07-27 16:08:04 -04:00
Anthony Gutierrez	2eb6b403c9	ARM: fix value of MISCREG_CTR returned by readMiscReg() According to the A15 TRM the value of this register is as follows (assuming 16 word = 64 byte lines) [31:29] Format - b100 specifies v7 [28] RAZ - b0 [27:24] CWG log2(max writeback size #words) - 0x4 16 words [23:20] ERG log2(max reservation size #words) - 0x4 16 words [19:16] DminLine log2(smallest dcache line #words) - 0x4 16 words [15:14] L1Ip L1 index/tagging policy - b11 specifies PIPT [13:4] RAZ - b0000000000 [3:0] IminLine log2(smallest icache line #words) - 0x4 16 words	2012-07-27 16:08:04 -04:00
Andreas Hansson	66f5124e2b	Bridge: Use EventWrapper instead of Event subclass for sendEvent This class simply cleans up the code by making use of the EventWrapper convenience class to schedule the sendEvent in the bridge ports.	2012-07-23 09:32:19 -04:00

... 4 5 6 7 8 ...

5927 commits