Saturday, September 21, 2024

Reversing iOS System Libraries Using Radare2: A Deep Dive into Dyld Cache (Part 2)


This is the second (and the shortest) installment of my blog series about reverse engineering iOS system libraries with radare2. We’ll introduce the concept of cross-references, why they’re important and the gist of how they work in r2.

This blog post explores how to find cross-references from within a single library in the dyld shared library cache (DSC to friends), so if you missed the first post you can look it up now to learn how to handle the DSC as a single big executable with radare2.



Finding Cross-References

When it comes to reverse engineering compiled code, a common task is figuring out how things are connected in order to answer questions such as:

  • What code is using this text string / global variable?
  • What code is calling this function / method?
  • Is this pointer stored in any data section?

All of the above boils down to finding cross-references, that is, finding from which address another address is referenced.
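To make the idea concrete, here is a minimal sketch in Python of what a cross-reference table amounts to: a list of (from, to) address pairs inverted into a lookup keyed by the referenced address. The function name and the addresses are made up for illustration; this is not how radare2 stores its xrefs internally.

```python
from collections import defaultdict

def build_xref_index(references):
    """Invert (from_addr, to_addr) pairs into a to_addr -> [from_addr] map."""
    index = defaultdict(list)
    for src, dst in references:
        index[dst].append(src)
    return index

# Made-up addresses, for illustration only.
refs = [(0x1000, 0x8000), (0x1010, 0x9000), (0x1024, 0x8000)]
xrefs_to = build_xref_index(refs)

# "Who references 0x8000?" becomes a dictionary lookup.
print([hex(a) for a in xrefs_to[0x8000]])
```

Each of the three questions above maps to the same lookup: only the kind of target address (string, function, pointer) changes.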

In radare2, cross-references are computed as a result of an analysis command.

Specifically, the bare minimum analysis which can be performed to compute references from executable code is to linearly emulate it using ESIL. ESIL is the underlying intermediate language used by radare2 to represent and virtually execute code in a way that’s independent from the architecture.

The command to perform that is aaex. It takes an optional size in bytes and emulates that many bytes of executable code, doing the minimum amount of work needed to compute references, starting from the current address (which can also be specified as a temporary seek using the @ address suffix). If no argument is provided, it’ll emulate all executable sections of the binary.

In the case of the DSC, however, we don’t usually want to emulate all the code that’s there, because it’s huge and can take literally hours. Instead, we have to restrict the emulation to the portion of code we’re interested in, for example the __text section of a single library.
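As a rough intuition for why linear emulation yields references, and why limiting its range limits the work, here is a toy model in Python. It is not ESIL and not radare2’s implementation: the two-instruction "ISA" is invented, loosely inspired by the adrp + add pattern that ARM64 code uses to compute an address, and all addresses are made up.

```python
def emulate_range(code, start, size):
    """Linearly emulate toy instructions in [start, start + size) and
    return (from_addr, to_addr) pairs for every address that gets computed."""
    regs = {}
    found = []
    for addr in sorted(code):
        if not (start <= addr < start + size):
            continue  # outside the requested range: no work is done here
        op, reg, value = code[addr]
        if op == "adrp":
            # Load a page base into a register.
            regs[reg] = value
        elif op == "add" and reg in regs:
            # Adding the offset completes the address: record a reference.
            regs[reg] += value
            found.append((addr, regs[reg]))
    return found

# Toy program: address -> (operation, register, immediate).
code = {
    0x1000: ("adrp", "x0", 0x8000),
    0x1004: ("add", "x0", 0xbdb),
    0x2000: ("adrp", "x1", 0x9000),
    0x2004: ("add", "x1", 0x10),
}

# Emulating only the first 0x100 bytes sees only the first reference.
print(emulate_range(code, 0x1000, 0x100))
```

The point of the sketch is the range check: references are only discovered for code that actually gets emulated, which is why picking the right section matters.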

Cross-References Within a Library

The simplest and most frequent use case is to find references to code or data within a library which is part of the DSC.

Here are some examples covering a few common scenarios.

The common premise, though, is that we have to know ahead of time which library we’re interested in.

Example: Find Code Referencing a Specific String

Whether you want to narrow down the reason for an error, find the code producing an obscure log message or simply match source code to compiled code, locating references to strings is always a useful capability.

One task I periodically have to perform is analyzing the code of Apple dyld (available as open source) to support new versions of the “slide info” structures used for rebasing and signing pointers, as part of radare2 DSC plugin maintenance.

An easy way to locate the interesting code for this purpose is to find references to the error string “invalid slide info in cache file”. It appears at the end of a big if / else chain dealing with all the possible versions of it.

Here’s a GitHub excerpt of the source file dyld/SharedCacheRuntime.cpp which shows the usage of that string in its context:


As of fairly recently, dyld’s executable code itself is also present in the DSC, as part of the lib/dyld image.

In order to find the reference to the string in the executable from within the DSC, we have to make sure we opened the DSC while having dyld as part of the R_DYLDCACHE_FILTER environment variable:

Let’s locate the string first to determine its address using the izq command:

This tells us the string is at 0x1ab7a9bdb. Take note of this because it’s the address that any code using this string will be referencing.

Then we can locate the __text section for lib/dyld, which defines the memory address range in which we expect to find a reference to it, because that contains the entire compiled code for the dyld executable. We can do this by filtering the output of the iSq command (which lists sections like iS, but the extra “q” suppresses the default ASCII-art columns for performance and brevity):

Now we can proceed to emulate the whole executable section in order to compute cross-references. We do that by seeking to the section first, then running the aaex command:

We pass $SS as the size argument, which means “current section’s size”, setting the temporary seek to $S which means “current section’s start address”. That’s a pretty generic command we can reuse to emulate the whole current section.

After this command finishes (and it’s usually pretty fast when we limit its scope like this), we can then proceed to query the computed cross-references for the one we’re interested in.

For this we can use the axt command (analyze xrefs to), passing it the address of the string:

This immediately returns the address of the instruction found computing the address of the string (0x1ab7782ec), which we can then visualize in its context using pd. This is precisely the function we’re interested in:

This technique is quite generic and will work even in the absence of debug symbols for matching source code to the corresponding compiled code, as long as string constants are available.

As a final note, remember that all cross-references, after being computed, are stored in memory, so there’s no need to run the aaex command again to find different references from the regions of code that have already been emulated.
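For readers who prefer scripting this, the seek / emulate / query sequence described in this example can be sketched as a small Python helper. The helper takes any callable that sends one r2 command and returns its output; with r2pipe that could be the cmd method of an open session, which is an assumption about your setup. The helper name is hypothetical, the section address below is a placeholder (the real one comes from the iSq output), and the command strings follow the description above.

```python
def xrefs_to_in_section(cmd, section_addr, target_addr):
    """Seek into a section, emulate only that section, then query xrefs."""
    cmd(f"s {section_addr:#x}")          # seek inside the section of interest
    cmd("aaex $SS @ $S")                 # emulate the whole current section
    return cmd(f"axt {target_addr:#x}")  # list references to the target

# Stand-in for a real session: it records commands instead of running them.
sent = []

def fake_cmd(command):
    sent.append(command)
    return ""

# 0x1000 is a placeholder section address; 0x1ab7a9bdb is the string address
# found earlier with izq.
xrefs_to_in_section(fake_cmd, 0x1000, 0x1ab7a9bdb)
print(sent)
```

Since computed references stay in memory, a script like this only needs to run the emulation step once per section and can then issue as many axt queries as needed.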

Example: Finding Multiple References

You can use the same methodology as in the previous example to find references to any address, including global variables, functions, Objective-C ivars and even Objective-C selectors.

For example, let’s find the references to the tubeManager ivar of __NSURLSessionLocal which we already encountered in the first episode.

To do that, let’s open the DSC with a filter including the Foundation framework first:

Then get the address of the tubeManager ivar by grepping flags:

Let’s find out where the ivar itself lives, using the iSq. command which reveals the section containing the address where the command is run:

This output tells us that the ivar’s address belongs to the __objc_ivar section of the __DATA segment of the CFNetwork framework. So let’s find references from the CFNetwork framework’s executable code (which is stored in the __text section of the __TEXT segment of the same framework). As before, we can find the start address of that section, seek there, emulate it to compute the references and find the references to the ivar address:

There are quite a few hits this time. When this happens, depending on the specific task at hand, we may need some more information in order to prioritize which ones we should look at first.

A good way to do so in r2 is to feed every address of the resulting xrefs into the fd command, which translates any address into a flag name + offset representation.

We can do this using a one-liner which leverages command composition, where the filtered output of one command can be used as input for another one by means of the backticks (`) operator.

In this case, we’re going to use the second column (~[1]) of the axt output above (the address) to create a list of addresses, then use the seek iteration operator @@= to use each element of the list as a temporary address for the fd command, which is then called once for each address of the found references:

This illuminates the context in which the references are found before digging deeper into each of them, all without having to wait for r2 to analyze all possible functions ahead of time.
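The same composition can be approximated outside of r2. Here is a minimal Python sketch, assuming the axt output is whitespace-separated with the address in the second column, as described above. The sample lines are made up and only mimic that column layout, and the helper names are hypothetical.

```python
def xref_addresses(axt_output):
    """Return the second whitespace-separated column of each non-empty line."""
    addrs = []
    for line in axt_output.splitlines():
        cols = line.split()
        if len(cols) >= 2:
            addrs.append(cols[1])
    return addrs

def describe_commands(axt_output):
    """Build one fd command per reference, using a temporary seek (@)."""
    return [f"fd @ {addr}" for addr in xref_addresses(axt_output)]

# Made-up sample; only the column layout matters here.
sample = """name_a 0x1000 rest of the line
name_b 0x1024 rest of the line
"""
print(describe_commands(sample))
```

Each generated command plays the role of one iteration of @@=: fd is run once with the temporary seek set to one reference address.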

For example in the output above, if the task is to figure out what a “tubeManager” is or does, I’d skip over everything which seems to be part of a deallocation routine or a copy constructor. These functions usually reference all ivars and it’s unlikely there is any logic specific to this one in particular.

Beware, however, that there’s no guarantee about the offset from a flag falling inside the corresponding symbol, because there could be unnamed functions in between. Therefore, it’s always a good idea to double check the results of fd for that before deciding whether to skip over or dig deeper into one, by being extra suspicious of big offsets and of short functions.
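One way to apply that suspicion mechanically is to flag results whose offset exceeds some threshold. The sketch below assumes each fd result looks like "name + offset", which is my reading of the flag name + offset representation mentioned above; the sample names and the 0x400 threshold are arbitrary choices for illustration, not a rule from r2.

```python
def parse_flag_offset(text):
    """Split 'name + offset' into (name, offset); a bare name means offset 0."""
    name, sep, offset = text.rpartition(" + ")
    if not sep:
        return text.strip(), 0
    return name.strip(), int(offset.strip(), 0)

def suspicious(results, max_offset=0x400):
    """Keep the results whose offset is larger than max_offset."""
    return [r for r in results if parse_flag_offset(r)[1] > max_offset]

# Made-up flag names, for illustration only.
sample = ["sym.example_dealloc + 24", "sym.example_init + 8200"]
print(suspicious(sample))
```

A large offset does not prove the address lies outside the named symbol, it only marks the result as worth a manual look before skipping it.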

Conclusion

Well, this went by quickly! Hopefully you found the blog useful. There’s still a lot to say about cross-references, so in the next post we’ll dig deeper and look at some ways to find references across libraries within the DSC.

As always, questions are welcome, as are issues and pull requests on radare2’s GitHub.


