The Objective-C runtime ISA pointer is defined as such:
union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};
The layout of the bits field is given by the ISA_BITFIELD definitions in isa.h.
When I read a Mach-O file from disk, go to the __objc_classlist section, and follow a pointer to an objc_class, which is defined as such:
struct objc_class : objc_object {
    objc_class(const objc_class&) = delete;
    objc_class(objc_class&&) = delete;
    void operator=(const objc_class&) = delete;
    void operator=(objc_class&&) = delete;
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    ...
and objc_object is defined as such:
struct objc_object {
private:
    isa_t isa;
public:
    ...
This means I should be able to interpret the first 8 bytes of an objc_class as the bits field of an isa_t. But when I do this and try to decode the bits, I get values that look random and incorrect.
On the other hand, if I interpret the first 8 bytes as a pointer, it leads me to another objc_class instance on disk, which is usually the metaclass of the class. So what is the purpose of the isa_t union and its bits field in the Objective-C runtime? Is it only correct to interpret the isa as a union with bit-fields once an object has been instantiated at runtime, while on disk it is just a pointer to a metaclass definition?
EDIT:
This is how I read the objc_class struct from the file with Python:
from __future__ import annotations  # allows the ObjcClass forward reference below

import ctypes
import struct
from dataclasses import dataclass

ISA_MASK = 0x0000000ffffffff8

@dataclass
class Isa():
    bits: ctypes.c_size_t
    _cls: ctypes.c_size_t

    def __init__(self, fp, addr):
        # Read the raw 8-byte, little-endian isa value stored at addr.
        fp.seek(addr)
        self.bits = struct.unpack("<Q", fp.read(8))[0]
        self._cls = self.bits

    # Bit-field accessors following the ARM64 (non-ptrauth) layout in isa.h.
    def nonpointer(self):
        return self.bits & 1
    def has_assoc(self):
        return (self.bits >> 1) & 1
    def has_cxx_dtor(self):
        return (self.bits >> 2) & 1
    def shiftcls(self):
        # shiftcls is 33 bits wide (bits 3..35), matching ISA_MASK.
        return (self.bits >> 3) & 0x1ffffffff
    def magic(self):
        return (self.bits >> 36) & 0x3f
    def weakly_referenced(self):
        return (self.bits >> 42) & 1
    def unused(self):
        return (self.bits >> 43) & 1
    def has_sidetable_rc(self):
        return (self.bits >> 44) & 1
    def extra_rc(self):
        return (self.bits >> 45) & 0x7ffff
    def get_class(self):
        clsbits = self.bits
        clsbits &= ISA_MASK
        return clsbits

@dataclass
class ObjcObject:
    isa: Isa
    _addr: ctypes.c_size_t

    def __init__(self, fp, addr, isa_class, external_block_addr):
        self.isa = None
        self._addr = addr
        fp.seek(addr)
        isa_addr = struct.unpack("<Q", fp.read(8))[0]
        if isa_addr != 0 and isa_addr < external_block_addr:
            # Isa.__init__ takes only (fp, addr).
            self.isa = Isa(fp, isa_addr)

@dataclass
class ObjcClass(ObjcObject):
    super_class: ObjcClass
    cache: Cache
    class_ro: ClassRo

    def __init__(self, fp, addr, external_block_addr):
        super().__init__(fp, addr, ObjcClass, external_block_addr)
        ...
...
I have, for example, a class; let's call it A. After processing the chained fixups, at address 0x0025eed0 I have its symbol _OBJC_CLASS_$_A and the objc_class defined at that address.
The first 8 bytes of the structure are the isa, as we've established by looking at the runtime sources. Following it as a pointer rather than treating it as the isa_t union, I get to another objc_class struct for the symbol _OBJC_METACLASS_$_A, which is the metaclass of this class.
Now if, instead of treating the first 8 bytes of the objc_class struct as a pointer to the metaclass, I interpret them as the bits of the isa_t union as in the code above, then for example the has_cxx_dtor method returns False. That is incorrect, because I can clearly find the corresponding method in the method_list_t structure of the class_ro. The result doesn't match what I parse, and hence the isa_t union seems unrelated to the actual data of the class on disk.
Note that I extract the data from the bits of isa_t according to isa.h, assuming an ARM64 Mach-O without pointer authentication that is not built for the simulator.
After digging a bit through the runtime, it appears that non-pointer isas are a runtime-only concept, and that all on-disk isas will always be regular pointers.
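To make the difference concrete, here is a minimal sketch (not runtime code) that decodes two values with the ARM64 non-ptrauth layout from the question: a plain on-disk pointer and a packed isa of the kind the runtime builds. The address 0x0025eea8 and the pack_isa helper are hypothetical, made up for illustration. Since class structures are 8-byte aligned, the low three bits of a plain pointer are always zero, so nonpointer, has_assoc and has_cxx_dtor all decode as 0 no matter what the class actually contains.

```python
ISA_MASK = 0x0000000FFFFFFFF8  # ARM64 non-ptrauth class mask, as in the question


def decode_isa(bits):
    """Split a 64-bit isa value into a few of the ARM64 bit-fields."""
    return {
        "nonpointer": bits & 1,
        "has_assoc": (bits >> 1) & 1,
        "has_cxx_dtor": (bits >> 2) & 1,
        "shiftcls": (bits >> 3) & 0x1FFFFFFFF,  # 33 bits, bits 3..35
    }


def pack_isa(cls_addr, has_cxx_dtor=False):
    """Hypothetical stand-in for what the runtime does when it installs a class."""
    bits = 1                         # nonpointer = 1 marks a packed isa
    bits |= int(has_cxx_dtor) << 2   # flag supplied by the runtime, not by the file
    bits |= cls_addr & ISA_MASK      # class pointer lands in the shiftcls field
    return bits


on_disk = 0x0025EEA8                 # hypothetical 8-byte aligned metaclass address
packed = pack_isa(on_disk, has_cxx_dtor=True)

print(decode_isa(on_disk))           # low flag bits are all 0
print(decode_isa(packed))            # nonpointer and has_cxx_dtor are 1
print(hex(packed & ISA_MASK))        # the class pointer is recovered by masking
```

In other words, a nonpointer bit of 0 is the signal that the value is a raw class pointer, and the remaining fields carry no meaning in that case.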
The loading process of Obj-C classes in an object file:

1. dyld calls _objc_map_images (objc-internal.h/objc-runtime-new.mm), passing in the object headers to read and load classes from
2. _objc_map_images does a bit of setup before calling map_images (objc-private.h/objc-runtime-new.mm)
3. map_images takes the runtime lock, then calls map_images_nolock (objc-private.h/objc-os.mm)
4. map_images_nolock iterates over the mach headers, searching for Obj-C info and performing some validation. It passes all of the headers which contain Obj-C classes to _read_images (objc-private.h/objc-runtime-new.mm)
5. _read_images is where we actually get to the interesting parts. It first sets up support for non-pointer isas as relevant for the runtime target, and sets up some tables for storing class information. After reading and fixing up selectors, it starts reading class info (OBJC_RUNTIME_DISCOVER_CLASSES_START())
6. It reads the classlist stored in the header, receiving direct pointers to each of the classes in the image
7. Each class pointer is passed to readClass (objc-runtime-new.mm), which resolves mangled class names, Swift classes, and more; but at the end of the day, the read classref_t (raw pointer to dyld class) is either cast to Class (the class object), or replaced by an allocated Class instance

So, where do non-pointer isas come into play? Only when setting an object's class at runtime:

1. When you create an object via objc_constructInstance or class_createInstance (runtime.h), or set an object's class via object_setClass, the object has either objc_object::initInstanceIsa or objc_object::initIsa (objc-object.h) called on it (and initInstanceIsa just calls through to initIsa anyway)
2. objc_object::initIsa has two implementations (one for SUPPORT_NONPOINTER_ISA and the other for non-supported), but both call down to isa_t::setClass (objc-private.h/objc-object.h)
3. isa_t::setClass also has two implementations: when SUPPORT_NONPOINTER_ISA is true, the implementation sets the appropriate bits in the isa value itself, setting shiftcls as necessary; when SUPPORT_NONPOINTER_ISA is false, it just sets the class directly

(Or in reverse, if you prefer: isa_t::setClass is only called from objc_object::initIsa/objc_object::changeIsa, which themselves are only called from objc_constructInstance/class_createInstance/object_setClass.)

So, when you read these object files on disk, you will only ever encounter pointer isas for objects and classes; the bits inside isas are set exclusively at runtime. If there are details you're hoping to read from those bits, you'll need to construct that info yourself from the surrounding Mach-O data.
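As a rough sketch of what "constructing that info yourself" could look like for has_cxx_dtor: treat the isa as a plain pointer to the metaclass, and take the flag from class_ro_t instead. Everything here is an assumption based on my reading of objc-runtime-new.h rather than something confirmed above: a 64-bit little-endian image, pointers already resolved to file offsets (chained fixups applied), the data field at offset 32 of objc_class with flag bits in its low three bits, flags as the first 32-bit field of class_ro_t, and the RO_META and RO_HAS_CXX_STRUCTORS flag values. The helper names and the demo image are hypothetical.

```python
import io
import struct

# Assumed layout: isa (8), superclass (8), cache (16), then the data field.
CLASS_DATA_OFFSET = 32
DATA_PTR_MASK = 0x00007FFFFFFFFFF8  # assumed: low 3 bits of the data field are flags
RO_META = 1 << 0                    # assumed class_ro_t flag: this is a metaclass
RO_HAS_CXX_STRUCTORS = 1 << 2       # assumed flag: has .cxx_construct/.cxx_destruct


def read_u32(fp, offset):
    fp.seek(offset)
    return struct.unpack("<I", fp.read(4))[0]


def read_u64(fp, offset):
    fp.seek(offset)
    return struct.unpack("<Q", fp.read(8))[0]


def read_on_disk_class(fp, offset):
    """Parse one on-disk objc_class, treating isa as a plain pointer."""
    isa = read_u64(fp, offset)  # on disk: pointer to the metaclass
    ro = read_u64(fp, offset + CLASS_DATA_OFFSET) & DATA_PTR_MASK
    flags = read_u32(fp, ro)    # flags is assumed to be the first field of class_ro_t
    return {
        "isa": isa,
        "is_meta": bool(flags & RO_META),
        "has_cxx_dtor": bool(flags & RO_HAS_CXX_STRUCTORS),
    }


def build_demo_image():
    """Hypothetical 1 KiB image: metaclass at 0x100, class A at 0x200."""
    image = bytearray(0x400)
    struct.pack_into("<Q", image, 0x100 + CLASS_DATA_OFFSET, 0x300)  # metaclass -> ro
    struct.pack_into("<I", image, 0x300, RO_META)
    struct.pack_into("<Q", image, 0x200, 0x100)                      # A.isa -> metaclass
    struct.pack_into("<Q", image, 0x200 + CLASS_DATA_OFFSET, 0x340)  # A -> ro
    struct.pack_into("<I", image, 0x340, RO_HAS_CXX_STRUCTORS)
    return io.BytesIO(image)


fp = build_demo_image()
cls_a = read_on_disk_class(fp, 0x200)
meta_a = read_on_disk_class(fp, cls_a["isa"])  # follow isa as a pointer
print(cls_a)
print(meta_a)
```

On a real file you would replace build_demo_image with the opened Mach-O and translate each pointer to a file offset first; the point is only that the flag lives in the class's read-only data, not in the on-disk isa.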