gem5/src/arch/mips/isa/formats/fp.isa
Curtis Dunham fe27f937aa arch: teach ISA parser how to split code across files
This patch encompasses several interrelated and interdependent changes
to the ISA generation step.  The end goal is to reduce the size of the
generated compilation units for instruction execution and decoding so
that batch compilation can proceed with all CPUs active without
exhausting physical memory.

The ISA parser (src/arch/isa_parser.py) has been improved so that it can
accept 'split [output_type];' directives at the top level of the grammar
and 'split(output_type)' python calls within 'exec {{ ... }}' blocks.
This has the effect of "splitting" the files into smaller compilation
units.  I use air-quotes around "splitting" because the files themselves
are not split, but preprocessing directives are inserted to have the same
effect.

Architecturally, the ISA parser has had some changes in how it works.
In general, it emits code sooner.  It doesn't generate per-CPU files,
and instead defers to the C preprocessor to create the duplicate copies
for each CPU type.  Likewise there are more files emitted and the C
preprocessor does more substitution that used to be done by the ISA parser.

Finally, the build system (SCons) needs to be able to cope with a
dynamic list of source files coming out of the ISA parser. The changes
to the SCons{cript,truct} files support this. In broad strokes, the
targets requested on the command line are hidden from SCons until all
the build dependencies are determined, otherwise it would try, realize
it can't reach the goal, and terminate in failure. Since build steps
(i.e. running the ISA parser) must be taken to determine the file list,
several new build stages have been inserted at the very start of the
build. First, the build dependencies from the ISA parser will be emitted
to arch/$ISA/generated/inc.d, which is then read by a new SCons builder
to finalize the dependencies. (Once inc.d exists, the ISA parser will not
need to be run to complete this step.) Once the dependencies are known,
the 'Environments' are made by the makeEnv() function. This function used
to be called before the build began but now happens during the build.
It is easy to see that this step is quite slow; this is a known issue
and it's important to realize that it was already slow, but there was
no obvious cause to attribute it to since nothing was displayed to the
terminal. Since new steps that used to be performed serially are now in a
potentially-parallel build phase, the pathname handling in the SCons scripts
has been tightened up to deal with chdir() race conditions. In general,
pathnames are computed earlier and more likely to be stored, passed around,
and processed as absolute paths rather than relative paths.  In the end,
some of these issues had to be fixed by inserting serializing dependencies
in the build.

Minor note:
For the null ISA, we just provide a dummy inc.d so SCons is never
compelled to try to generate it. While it seems slightly wrong to have
anything in src/arch/*/generated (i.e. a non-generated 'generated' file),
it's by far the simplest solution.
2014-05-09 18:58:47 -04:00

376 lines
12 KiB
C++

// -*- mode:c++ -*-
// Copyright (c) 2007 MIPS Technologies, Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met: redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer;
// redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution;
// neither the name of the copyright holders nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: Korey Sewell
////////////////////////////////////////////////////////////////////
//
// Floating Point operate instructions
//
output header {{
/**
* Base class for FP operations.
*/
class FPOp : public MipsStaticInst
{
protected:
/// Constructor
FPOp(const char *mnem, MachInst _machInst, OpClass __opClass) : MipsStaticInst(mnem, _machInst, __opClass)
{
}
//std::string generateDisassembly(Addr pc, const SymbolTable *symtab) const;
//needs function to check for fpEnable or not
};
class FPCompareOp : public FPOp
{
protected:
FPCompareOp(const char *mnem, MachInst _machInst, OpClass __opClass) : FPOp(mnem, _machInst, __opClass)
{
}
std::string generateDisassembly(Addr pc, const SymbolTable *symtab) const;
};
}};
output decoder {{
std::string FPCompareOp::generateDisassembly(Addr pc, const SymbolTable *symtab) const
{
std::stringstream ss;
ccprintf(ss, "%-10s ", mnemonic);
ccprintf(ss,"%d",CC);
if(_numSrcRegs > 0) {
ss << ", ";
printReg(ss, _srcRegIdx[0]);
}
if(_numSrcRegs > 1) {
ss << ", ";
printReg(ss, _srcRegIdx[1]);
}
return ss.str();
}
}};
output header {{
void fpResetCauseBits(%(CPU_exec_context)s *cpu);
}};
output exec {{
inline Fault checkFpEnableFault(CPU_EXEC_CONTEXT *xc)
{
//@TODO: Implement correct CP0 checks to see if the CP1
// unit is enable or not
if (!isCoprocessorEnabled(xc, 1))
return new CoprocessorUnusableFault(1);
return NoFault;
}
//If any operand is Nan return the appropriate QNaN
template <class T>
bool
fpNanOperands(FPOp *inst, CPU_EXEC_CONTEXT *xc, const T &src_type,
Trace::InstRecord *traceData)
{
uint64_t mips_nan = 0;
assert(sizeof(T) == 4);
for (int i = 0; i < inst->numSrcRegs(); i++) {
uint64_t src_bits = xc->readFloatRegOperandBits(inst, 0);
if (isNan(&src_bits, 32) ) {
mips_nan = MIPS32_QNAN;
xc->setFloatRegOperandBits(inst, 0, mips_nan);
if (traceData) { traceData->setData(mips_nan); }
return true;
}
}
return false;
}
template <class T>
bool
fpInvalidOp(FPOp *inst, CPU_EXEC_CONTEXT *cpu, const T dest_val,
Trace::InstRecord *traceData)
{
uint64_t mips_nan = 0;
T src_op = dest_val;
assert(sizeof(T) == 4);
if (isNan(&src_op, 32)) {
mips_nan = MIPS32_QNAN;
//Set value to QNAN
cpu->setFloatRegOperandBits(inst, 0, mips_nan);
//Read FCSR from FloatRegFile
uint32_t fcsr_bits =
cpu->tcBase()->readFloatRegBits(FLOATREG_FCSR);
uint32_t new_fcsr = genInvalidVector(fcsr_bits);
//Write FCSR from FloatRegFile
cpu->tcBase()->setFloatRegBits(FLOATREG_FCSR, new_fcsr);
if (traceData) { traceData->setData(mips_nan); }
return true;
}
return false;
}
void
fpResetCauseBits(CPU_EXEC_CONTEXT *cpu)
{
//Read FCSR from FloatRegFile
uint32_t fcsr = cpu->tcBase()->readFloatRegBits(FLOATREG_FCSR);
// TODO: Use utility function here
fcsr = bits(fcsr, 31, 18) << 18 | bits(fcsr, 11, 0);
//Write FCSR from FloatRegFile
cpu->tcBase()->setFloatRegBits(FLOATREG_FCSR, fcsr);
}
}};
def template FloatingPointExecute {{
Fault %(class_name)s::execute(CPU_EXEC_CONTEXT *xc, Trace::InstRecord *traceData) const
{
Fault fault = NoFault;
%(fp_enable_check)s;
//When is the right time to reset cause bits?
//start of every instruction or every cycle?
if (FullSystem)
fpResetCauseBits(xc);
%(op_decl)s;
%(op_rd)s;
//Check if any FP operand is a NaN value
if (!fpNanOperands((FPOp*)this, xc, Fd, traceData)) {
%(code)s;
//Change this code for Full-System/Sycall Emulation
//separation
//----
//Should Full System-Mode throw a fault here?
//----
//Check for IEEE 754 FP Exceptions
//fault = fpNanOperands((FPOp*)this, xc, Fd, traceData);
bool invalid_op = false;
if (FullSystem) {
invalid_op =
fpInvalidOp((FPOp*)this, xc, Fd, traceData);
}
if (!invalid_op && fault == NoFault) {
%(op_wb)s;
}
}
return fault;
}
}};
// Primary format for float point operate instructions:
def format FloatOp(code, *flags) {{
iop = InstObjParams(name, Name, 'FPOp', code, flags)
header_output = BasicDeclare.subst(iop)
decoder_output = BasicConstructor.subst(iop)
decode_block = BasicDecode.subst(iop)
exec_output = FloatingPointExecute.subst(iop)
}};
def format FloatCompareOp(cond_code, *flags) {{
import sys
code = 'bool cond;\n'
if '_sf' in cond_code or 'SinglePrecision' in flags:
if 'QnanException' in flags:
code += 'if (isQnan(&Fs_sf, 32) || isQnan(&Ft_sf, 32)) {\n'
code += '\tFCSR = genInvalidVector(FCSR);\n'
code += '\treturn NoFault;'
code += '}\n else '
code += 'if (isNan(&Fs_sf, 32) || isNan(&Ft_sf, 32)) {\n'
elif '_df' in cond_code or 'DoublePrecision' in flags:
if 'QnanException' in flags:
code += 'if (isQnan(&Fs_df, 64) || isQnan(&Ft_df, 64)) {\n'
code += '\tFCSR = genInvalidVector(FCSR);\n'
code += '\treturn NoFault;'
code += '}\n else '
code += 'if (isNan(&Fs_df, 64) || isNan(&Ft_df, 64)) {\n'
else:
sys.exit('Decoder Failed: Can\'t Determine Operand Type\n')
if 'UnorderedTrue' in flags:
code += 'cond = 1;\n'
elif 'UnorderedFalse' in flags:
code += 'cond = 0;\n'
else:
sys.exit('Decoder Failed: Float Compare Instruction Needs A Unordered Flag\n')
code += '} else {\n'
code += cond_code + '}'
code += 'FCSR = genCCVector(FCSR, CC, cond);\n'
iop = InstObjParams(name, Name, 'FPCompareOp', code)
header_output = BasicDeclare.subst(iop)
decoder_output = BasicConstructor.subst(iop)
decode_block = BasicDecode.subst(iop)
exec_output = BasicExecute.subst(iop)
}};
def format FloatConvertOp(code, *flags) {{
import sys
#Determine Source Type
convert = 'fpConvert('
if '_sf' in code:
code = 'float ' + code + '\n'
convert += 'SINGLE_TO_'
elif '_df' in code:
code = 'double ' + code + '\n'
convert += 'DOUBLE_TO_'
elif '_sw' in code:
code = 'int32_t ' + code + '\n'
convert += 'WORD_TO_'
elif '_sd' in code:
code = 'int64_t ' + code + '\n'
convert += 'LONG_TO_'
else:
sys.exit("Error Determining Source Type for Conversion")
#Determine Destination Type
if 'ToSingle' in flags:
code += 'Fd_uw = ' + convert + 'SINGLE, '
elif 'ToDouble' in flags:
code += 'Fd_ud = ' + convert + 'DOUBLE, '
elif 'ToWord' in flags:
code += 'Fd_uw = ' + convert + 'WORD, '
elif 'ToLong' in flags:
code += 'Fd_ud = ' + convert + 'LONG, '
else:
sys.exit("Error Determining Destination Type for Conversion")
#Figure out how to round value
if 'Ceil' in flags:
code += 'ceil(val)); '
elif 'Floor' in flags:
code += 'floor(val)); '
elif 'Round' in flags:
code += 'roundFP(val, 0)); '
elif 'Trunc' in flags:
code += 'truncFP(val));'
else:
code += 'val); '
iop = InstObjParams(name, Name, 'FPOp', code)
header_output = BasicDeclare.subst(iop)
decoder_output = BasicConstructor.subst(iop)
decode_block = BasicDecode.subst(iop)
exec_output = BasicExecute.subst(iop)
}};
def format FloatAccOp(code, *flags) {{
iop = InstObjParams(name, Name, 'FPOp', code, flags)
header_output = BasicDeclare.subst(iop)
decoder_output = BasicConstructor.subst(iop)
decode_block = BasicDecode.subst(iop)
exec_output = BasicExecute.subst(iop)
}};
// Primary format for float64 operate instructions:
def format Float64Op(code, *flags) {{
iop = InstObjParams(name, Name, 'MipsStaticInst', code, flags)
header_output = BasicDeclare.subst(iop)
decoder_output = BasicConstructor.subst(iop)
decode_block = BasicDecode.subst(iop)
exec_output = BasicExecute.subst(iop)
}};
def format FloatPSCompareOp(cond_code1, cond_code2, *flags) {{
import sys
code = 'bool cond1, cond2;\n'
code += 'bool code_block1, code_block2;\n'
code += 'code_block1 = code_block2 = true;\n'
if 'QnanException' in flags:
code += 'if (isQnan(&Fs1_sf, 32) || isQnan(&Ft1_sf, 32)) {\n'
code += '\tFCSR = genInvalidVector(FCSR);\n'
code += 'code_block1 = false;'
code += '}\n'
code += 'if (isQnan(&Fs2_sf, 32) || isQnan(&Ft2_sf, 32)) {\n'
code += '\tFCSR = genInvalidVector(FCSR);\n'
code += 'code_block2 = false;'
code += '}\n'
code += 'if (code_block1) {'
code += '\tif (isNan(&Fs1_sf, 32) || isNan(&Ft1_sf, 32)) {\n'
if 'UnorderedTrue' in flags:
code += 'cond1 = 1;\n'
elif 'UnorderedFalse' in flags:
code += 'cond1 = 0;\n'
else:
sys.exit('Decoder Failed: Float Compare Instruction Needs A Unordered Flag\n')
code += '} else {\n'
code += cond_code1
code += 'FCSR = genCCVector(FCSR, CC, cond1);}\n}\n'
code += 'if (code_block2) {'
code += '\tif (isNan(&Fs2_sf, 32) || isNan(&Ft2_sf, 32)) {\n'
if 'UnorderedTrue' in flags:
code += 'cond2 = 1;\n'
elif 'UnorderedFalse' in flags:
code += 'cond2 = 0;\n'
else:
sys.exit('Decoder Failed: Float Compare Instruction Needs A Unordered Flag\n')
code += '} else {\n'
code += cond_code2
code += 'FCSR = genCCVector(FCSR, CC, cond2);}\n}'
iop = InstObjParams(name, Name, 'FPCompareOp', code)
header_output = BasicDeclare.subst(iop)
decoder_output = BasicConstructor.subst(iop)
decode_block = BasicDecode.subst(iop)
exec_output = BasicExecute.subst(iop)
}};