Thursday, September 19, 2024

Reversing iOS System Libraries Using Radare2: A Deep Dive into Dyld Cache (Part 3)


Welcome to the final blog post in our series about reverse engineering iOS system libraries with radare2. This time we’ll focus on finding cross-references across different libraries present in the dyld shared library cache (or DSC if you’re into acronyms).

We’ll discuss various techniques to achieve this, each with its own trade-off between performance and the amount of prior knowledge required.

If you missed the previous two installments, now is a good time to catch up, because they cover the concepts needed to understand today’s topic.



Cross-References Across Libraries

Functions that are exported symbols can be called by other libraries within the DSC. This usually happens through an “import stub,” a small function that loads the address of the final function and jumps to it.

Normally, each executable that calls functions exported by other libraries embeds its own import stubs in a specific section of its own Mach-O file. While this remains true for executables embedded into the macOS and simulator DSCs, it differs on iOS because Apple implemented several optimizations over time to reduce the size of the entire DSC.

On iOS 15.x, libraries “near enough” in the DSC memory layout (but not necessarily related to each other) were able to reuse each other’s import stubs if they happened to depend on the same imports.

More recently, by leveraging the capabilities of multi-file caches, all import stubs are instead grouped together into “stub islands”, which in turn are cleverly mapped to maximize the number of executables that can reuse the same import stub.

To implement these optimizations, the original libraries are transformed while being embedded into the DSC. That’s why extracting separate libraries from the DSC can be cumbersome and lead to incomplete information; it’s simpler to work with the entire DSC.

Challenges with Performance

The main problem when finding cross-references across libraries in the DSC is performance. The challenge is to reduce the size of the problem as much as possible while still accomplishing the task.

Since any exported symbol may be referenced by many libraries, we want to load only the “interesting libraries” when opening the DSC with r2. We need to know both the library that holds the function we want to find references to and, ideally, the caller libraries where we want to find the calling code.

This means finding all references to a given exported symbol is a tedious task that requires:

  • Finding all import stubs that reference the target symbol.
  • Finding references to the stubs.

Even when “stub islands” are available and easy to emulate in a reasonable time, finding all the references to a subset of them may still require emulating most of the executable code in the DSC. This isn’t ideal and consumes significant time and memory.

Instead of blindly searching for references, I typically opt for a more interactive discovery of calls to imported functions, provided I know the caller code I’m interested in. This approach requires some help from r2pipe, which is the easiest way to automate tasks in radare2. In addition to the $what script we already encountered in the first episode, here’s another r2pipe Python script that I frequently use for this task.

The “namestubs” Script

This script uses r2’s emulation powers to name an import stub after the function it wraps. It also supports the relatively recent Objective-C method stubs, which encapsulate calls to objc_msgSend using a specific selector.

You can find the script on GitHub: Gist link. The alias I use for this script is called $namestubs:


The resulting $namestubs command must be given a list of addresses. If everything works correctly, it produces no output and creates functions with the correct names. The produced names have the stub. prefix, and if multiple stubs have the same name, they’ll be postfixed with an incremental number.

The stub function creation is idempotent, in the sense that if $namestubs is called multiple times on the same address, or if the same address appears multiple times in the argument list, the script will detect that a stub name is already defined and move on.
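The naming rules described above can be modeled in a few lines of plain Python. This is only a sketch of the behavior, not the code from the gist: the StubNamer class, its method names, and the exact postfix format (an underscore followed by a counter) are assumptions made for illustration.

```python
class StubNamer:
    """Toy model of the naming rules: 'stub.' prefix, numeric postfix, idempotent."""

    def __init__(self):
        self.by_addr = {}   # stub address -> name already assigned to it
        self.used = set()   # every name handed out so far

    def name(self, addr, target):
        # Idempotency: an address that already has a name keeps it.
        if addr in self.by_addr:
            return self.by_addr[addr]
        base = "stub." + target
        candidate = base
        counter = 0
        # Hypothetical postfix scheme: stub.abort, stub.abort_1, stub.abort_2, ...
        while candidate in self.used:
            counter += 1
            candidate = f"{base}_{counter}"
        self.used.add(candidate)
        self.by_addr[addr] = candidate
        return candidate


namer = StubNamer()
# Made-up addresses; the first one appears twice, as it could in an argument list.
for addr, target in [(0x1000, "abort"), (0x2000, "abort"), (0x1000, "abort")]:
    print(hex(addr), namer.name(addr, target))
```

Keying the lookup on the address rather than on the name is what makes repeated invocations harmless: a second stub wrapping the same function gets a new postfix, while a repeated address is simply skipped.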

Example: Exploring Calls to Exports Interactively

Let’s put it all together with an example where we analyze the code of a very simple function that calls exports from other libraries.

The function in question is +[_NSPredicateUtilities _predicateSecurityAction], an Objective-C class method that has been stripped out of the Objective-C runtime via compiler optimizations (as pointed out in the first post).

That function lives in the Foundation framework, so we can open the DSC with a filter like

If we disassemble that function as it is now, here’s what we see:

This output isn’t very informative since it only shows raw addresses without symbol names. However, if we examine the first call target, the 0x18bf71170 address:

We can see that it’s computing an address in x16 and then jumping to it.

If we enable r2’s emulation and disassemble that again, we can see that the computed address refers to a symbol with a name:

This means it’s a stub for objc_opt_self().
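To make the address computation concrete, here is a small pure-Python decoder for the kind of stub described above. It is a sketch under assumptions: the stub is taken to be an adrp / add / br sequence on the same register (one common arm64 stub shape), and the three instruction words are made up for illustration rather than taken from a real cache. In practice r2’s emulation does this work for you.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_adrp(insn, pc):
    """Return (rd, page_address) for an ADRP instruction word located at pc."""
    if insn & 0x9F000000 != 0x90000000:
        raise ValueError("not an ADRP instruction")
    immlo = (insn >> 29) & 0x3
    immhi = (insn >> 5) & 0x7FFFF
    imm = sign_extend((immhi << 2) | immlo, 21) << 12
    return insn & 0x1F, (pc & ~0xFFF) + imm


def decode_add_imm(insn):
    """Return (rd, rn, imm) for a 64-bit ADD (immediate) instruction word."""
    if insn & 0xFF800000 != 0x91000000:
        raise ValueError("not a 64-bit ADD (immediate)")
    imm = (insn >> 10) & 0xFFF
    if (insn >> 22) & 1:      # optional 'lsl #12' applied to the immediate
        imm <<= 12
    return insn & 0x1F, (insn >> 5) & 0x1F, imm


def decode_br(insn):
    """Return the register number used by a BR instruction word."""
    if insn & 0xFFFFFC1F != 0xD61F0000:
        raise ValueError("not a BR instruction")
    return (insn >> 5) & 0x1F


def resolve_stub(words, pc):
    """Compute the jump target of an adrp/add/br stub whose first word is at pc."""
    adrp_rd, page = decode_adrp(words[0], pc)
    add_rd, add_rn, offset = decode_add_imm(words[1])
    br_rn = decode_br(words[2])
    if not (adrp_rd == add_rn == add_rd == br_rn):
        raise ValueError("registers do not chain; not the assumed stub shape")
    return page + offset


# Made-up encodings: adrp x16, +5 pages; add x16, x16, #0x2a0; br x16
HYPOTHETICAL_STUB = [0xB0000030, 0x910A8210, 0xD61F0200]
print(hex(resolve_stub(HYPOTHETICAL_STUB, 0x18BF71170)))  # 0x18bf762a0
```

The resolved value is only an address; turning it into a name such as objc_opt_self still requires the symbol information that r2 holds for the loaded cache.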

That’s exactly what the $namestubs script does, so let’s feed it all the function addresses called by the +[_NSPredicateUtilities _predicateSecurityAction] function:

Now the stub functions are created automatically with the correct names, so if we disassemble the function again, we get:

We can now use the $what script to check which library these stubs belong to, but that will fail to resolve the image. By using the iSq command to reveal the corresponding section, though, we can see that all these stubs belong to the same stub island:

Stub islands don’t belong to any particular image because they’re shared between neighboring libraries.

Instance: Interactive Exploration of Older DSC

For completeness, let’s examine the same function on iOS 15.7, where stub islands weren’t a thing yet:

The first thing that stands out is that dyld_program_sdk_at_least is not a stub. We can confirm this by checking the section it belongs to:

This shows that this function calls a symbol from libdyld directly, without any stub.

As for the other two, if we run $what and iSq. on them:

We can see that they’re indeed stubs, reused from CoreFoundation and libicucore, respectively. Note that iSq. would not have returned any results if the stubs had been reused from libraries not present in the load filter, whereas $what works regardless.

Let’s name them using $namestubs and print the function’s disassembly again:

Here, we can see they were indeed objc_opt_self and abort, confirming that the stubs were reused because they were near in memory, not because CoreFoundation or libicucore are responsible for these symbols.

Example: Find All References Leveraging Stub Islands

If interactive exploration isn’t an option and you still need to enumerate all the calls to a specific exported symbol, you can do it by leveraging stub islands; just be prepared for disappointing performance compared to the interactive way.

Things you should know upfront:

  • The function you want to find calls to
  • Ideally, a superset of the caller libraries to set the filter properly at load time.

The steps are:

  • Emulate all stub islands first
  • Find references to the target function inside stub islands to locate the corresponding stubs
  • Find references to the stubs you’re interested in, limiting the search to code near enough to call the target stubs directly (which is the point of having stubs).

First, we need to locate all stub islands. This is easy because r2 creates a section for each of them, so they can be obtained by filtering the sections list by name, similar to what we do for libraries:

Here the first column is the section’s start address and the second column is its end address.

To emulate all stub islands with a single one-liner, we can call:

The command runs aaex once at the beginning of each stub island section, emulating $SS (section size) bytes of each.

The process takes a while (not too long, about 1.5 minutes on my MacBook Pro M1), but after it finishes, we can query references to any symbol and immediately find out where its stubs are.

For instance:

All these references point inside the stub functions, specifically to the second instruction of each stub.

Let’s examine the first one (they all have the same structure). By disassembling three instructions starting from 4 bytes before the found reference, we can see the entire stub code:

The next step is to collect the ranges of addresses we need to emulate in order to find direct call (or branch) references to each of these stubs.

This is where things get problematic performance-wise, because the code we need to emulate can very well be all the executable code in the DSC.

Assuming an arm64 DSC, it’s possible to restrict the emulation region if the target symbol isn’t widely used and appears only in a subset of the islands. In that case, since on arm64 the b / bl instruction can jump to code located within +/- 128MB of the instruction itself, we can take advantage of this fact to focus only on the parts of the __text sections that can call our stub functions directly. This will hopefully reduce the work needed to emulate them all and find the references to the stubs, and it can be easily automated using r2pipe.
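The range restriction described above boils down to interval arithmetic. The sketch below is not the PoC script; it only illustrates the idea with made-up stub and section addresses, and the helper names (caller_window, merge, ranges_to_emulate) are hypothetical. It relies on the fact that b / bl encode a signed 26-bit word offset, so a branch reaches from 128MB backward to 128MB minus 4 bytes forward.

```python
BRANCH_BACK = 1 << 27         # farthest backward jump of b/bl, in bytes
BRANCH_FWD = (1 << 27) - 4    # farthest forward jump of b/bl, in bytes


def caller_window(stub):
    """Half-open range of instruction addresses whose b/bl can reach stub."""
    return (stub - BRANCH_FWD, stub + BRANCH_BACK + 4)


def merge(ranges):
    """Merge overlapping or touching half-open (start, end) ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def ranges_to_emulate(stubs, text_sections):
    """Clip every __text section to the addresses that can reach some stub."""
    windows = merge(caller_window(s) for s in stubs)
    clipped = []
    for sec_start, sec_end in text_sections:
        for win_start, win_end in windows:
            start, end = max(sec_start, win_start), min(sec_end, win_end)
            if start < end:
                clipped.append((start, end))
    return merge(clipped)


# Made-up layout: one stub and three __text sections (start, end).
stubs = [0x190000000]
text_sections = [
    (0x180000000, 0x181000000),   # too far below the stub: skipped
    (0x187000000, 0x189000000),   # partially in range: clipped
    (0x1A0000000, 0x1A1000000),   # too far above the stub: skipped
]
for start, end in ranges_to_emulate(stubs, text_sections):
    print(hex(start), hex(end))
```

When the stubs for a symbol sit in islands spread over the whole cache, the merged windows cover nearly everything and the clipping saves little, which is the brute-force case.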

However, if the stubs for the target symbol are scattered across the DSC, the brute-force approach might be our only option. This would involve emulating all the __text sections in the DSC with the current filter, which generally is not advisable unless the filter already narrows down the loaded libraries to just a few tens.

Here’s a proof-of-concept (PoC) r2pipe script that automates the above: Gist link, which can be invoked directly or through an alias like this:

The script requires an address or flag name of any exported symbol and will output all the references to the corresponding stubs found in the stub islands. A few warnings to note:

  • This works only on caches with stub islands (iOS 16+)
  • It can take several hours to complete!
  • It’s just a PoC and should be tailored/optimized for your own task

Here’s an example of using this script where we open an unfiltered cache (worst-case scenario) and look for all references to the mach_continuous_approximate_time function. To speed up the loading of unfiltered caches, it’s possible to disable the parsing of strings and classes, and the demangling of symbols, by using the -e command line switch multiple times to set these configuration variables:

Alternatively, you could use Blacktop’s ipsw tool, as described here. Be prepared for some waiting, as analyzing several gigabytes of code with about 9 million symbols takes time. Once the focus is narrower and ready for interactive exploration at a more fine-grained scope, you can go back to r2, where it really shines.

Conclusion

That’s it, for now. Thanks for taking the time to digest all this. Hopefully, you can use this information for profit, or to have fun opening issues and pull requests on radare2’s GitHub.


