
Sam Thorogood

Disassembling apps to 'Ok Google' my garage

In early 2018, I disassembled the Android app for my garage door so that I could tinker with its innards and hook it up to the Google Assistant 🗣️.

This was fun, and involved snooping on HTTPS and grokking Android bytecode. If you'd like to learn a bit of reverse engineering, read on!

Super low quality GIF of opening my garage

⚠️ This post is about the process of reverse engineering and it is for educational purposes only. I'm also not identifying any vendors in this post or making source code available. I don't condone or endorse this behavior. 😅

The Backstory

I have a garage door with the usual interfaces: two keyring dongles, a button on the wall, and the company actually provides iOS & Android apps. These apps are pretty good, although pretty complex for what is really a simple action. 🙅

But let's face it: having to press buttons is a hassle. There's the pointing, finding the button, pressing it—👉🔘—but since we live in the future 📡🔬, I should be able to talk to my garage via Google, through my Google Home or my phone.


The End

Here are the steps taken when I speak 0️⃣ Ok Google, open the garage door:

1️⃣ Google recognizes my voice using its service
2️⃣ The IFTTT service's Applet matches the string "open the garage door"
3️⃣ IFTTT makes a simple HTTP call to my App Engine backend
4️⃣ My backend makes a sequence of login/request calls to the vendor's API
5️⃣ This connects to the physical internet-connected controller (from the vendor) in my house
6️⃣ The controller broadcasts on a proprietary frequency for the 'dumb' motor to perform an action

The seven steps needed to open a garage door

This is a lot of steps to automate a simple task. But the time from my speech to the door opening or closing is no more than a couple of seconds. I also don't mind the online parts: I'm not interested in offline, as Google's recognizer isn't designed to work offline anyway.


So, how did I make this happen?

The bulk of my work was in 🛠️ reverse engineering the garage door company's Android application and how it talked to the physical controller—which as I mentioned above, works fine but is just too complex for my liking.

Garage door and controller

The controller device from the vendor, which bridges the vendor's API and the dumb motor next to it

Online vs Offline

One of the first things I did was try a few different network environments to work out how everything talks to each other.

  1. I took my internet down, while leaving my phone and the controller on the same WiFi connection—the vendor's Android application still worked, so they have a LAN fallback

  2. I took my phone outside and off WiFi—they could still talk, so there's probably a server involved somewhere

Snooping on HTTPS

For context: it's fair to say that most APIs are HTTP-based—there's a huge win to be able to pierce things like transparent proxies or to use the same API on a mobile application as well as a web admin page.

And in 2018, it's hard to imagine any protocol going over pure HTTP—I have to start with HTTPS.

So, naïvely, I started off trying to examine the HTTPS traffic being generated by the vendor's Android application, to see what it was doing. I created a local WiFi network from my computer, and then used mitmproxy to intercept HTTPS connections (with some setup steps that vary based on your OS).

There are two possible ways mitmproxy, and other similar tools, work:

  1. You have access to the private key for the vendor's SSL certificate and can act as them (only really useful if you are the vendor, for debugging)
  2. The tool generates new fake certificates, and you are able to install a new root CA on the device you're listening to (so that it trusts the tool)

Since I'm not the garage door company, I needed to go with the second option. mitmproxy documents how to add a new root CA to your device of choice—actually, I have a spare iPhone, and I found adding the certificate there much easier.

SSL Certificate Pinning

But for this post, there was a catch. The vendor, in their wisdom, implemented what's known as SSL Certificate Pinning—meaning mitmproxy was useless.

In this case, the vendor's application expects a very specific certificate (based on its public key), ignoring the device's own notion of trust. Not all applications will be built this way—mitmproxy actually could correctly read a lot of traffic from other parts of my iPhone, including communication with iCloud, which is interesting if you're into that.

The details of how pinning works are not worth discussing, but this was a dead end, and good on the company in question—this is a really good idea. The only interesting thing that I could discover (actually via a tool called Wireshark) is the DNS lookups the application was making—the hostnames of the APIs it was calling.
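
For readers who do want the gist of pinning, here is a minimal Python sketch of the idea. It is not the vendor's implementation (which this post doesn't show): the key bytes, the `PINNED` value and the function names are all made up for illustration. The point is that the app compares a hash of the server's public key against a value baked into the app, and never asks the device which root CAs it trusts.

```python
import base64
import hashlib

# Hypothetical key material, invented for this example only.
VENDOR_KEY = b"pretend-these-are-the-vendor-public-key-bytes"
PROXY_KEY = b"key-generated-by-an-intercepting-proxy"


def pin_for(public_key_bytes: bytes) -> str:
    """Return the base64-encoded SHA-256 digest of a public key."""
    digest = hashlib.sha256(public_key_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


# The pin an app might ship with, computed ahead of time from the real key.
PINNED = pin_for(VENDOR_KEY)


def connection_allowed(public_key_bytes: bytes, pinned: str = PINNED) -> bool:
    """Accept the connection only if the presented key matches the pin.

    The device's trusted root CAs are never consulted, which is why
    installing a proxy's root CA on the device does not help.
    """
    return pin_for(public_key_bytes) == pinned


print(connection_allowed(VENDOR_KEY))  # key presented by the genuine server
print(connection_allowed(PROXY_KEY))   # key presented by an intercepting proxy
```

A proxy's freshly generated certificate carries a different key, so the comparison fails no matter what the device itself trusts.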


Because I wasn't able to intercept the HTTPS connections being made by the vendor's application, I decided the best way would be to snoop around in the vendor's source code.

Disassembly

On Android, this is actually pretty trivial, but the resulting source code is tricky—not impossible—to read. On iOS, it's a lot harder—most iOS applications are actual native code (unlike Android, where they're bytecode), so unless you want to read assembler all day (I don't), you're out of luck.

Examining an Android

The first step to analyzing the application is to actually get the application. Android applications are shipped as APKs (which are really just ZIPs).
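
To see the "really just ZIPs" part for yourself, a short Python sketch using only the standard library is enough. The archive below is a fake built in memory, and its entry names are only typical examples of what an APK contains; with a real file you would pass its bytes to `list_apk_entries` instead.

```python
import io
import zipfile
from typing import List


def list_apk_entries(apk_bytes: bytes) -> List[str]:
    """Return the sorted entry names inside an APK, treating it as a ZIP."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return sorted(apk.namelist())


def build_fake_apk() -> bytes:
    """Build a tiny stand-in archive so the example runs without a real APK."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"placeholder")
        archive.writestr("classes.dex", b"placeholder")
        archive.writestr("lib/armeabi/libexample.so", b"placeholder")
    return buffer.getvalue()


fake_apk = build_fake_apk()
print(zipfile.is_zipfile(io.BytesIO(fake_apk)))
for name in list_apk_entries(fake_apk):
    print(name)
```

Note that unzipping a real APK this way gives you the compiled classes.dex, not readable Smali, which is why a tool like Apktool is still needed.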

It's actually tricky to get these off your own device. However, if you Google for "apk download name.of.the.package", you'll find a number of APK download sites.

Aside: I won't comment on the legality of this, but it's certainly easy. There's also no guarantee that these sites aren't changing the application before you download it (think of the usual attack vectors—mining cryptocurrency, stealing passwords).

Once you have an APK, you'll need Apktool. On macOS/Homebrew, you can install it with brew install apktool.

Run the tool and check out some of the Smali files—the readable versions of Android bytecode—with these commands:

apktool d name-of-apk.apk
ls -R name-of-apk/smali/

Apktool may also extract .so files, which are actual binaries for different platforms (you'll typically see them in x86, armeabi paths). You can't really do anything with these (they're assembler)—in my case, I found a few support libraries but no application or protocol logic.

Analyzing Code

Now that you have the source code to your application, open up any file and take a look. Smali is hard to read, and can use obtuse variable or class names like a, b, c etc, due to obfuscation. Take a look:

.method public constructor <init>(Lcom/example/ViewProxy;)V
    .locals 2
    .param p1, "proxy"    # Lcom/example/ViewProxy;

    .prologue
    .line 39
    invoke-direct {p0, p1}, Lcom/example/UIView;-><init>(Lcom/example/ViewProxy;)V

    .line 41
    new-instance v0, Linfo/whistlr/SquareNavBar;

    invoke-virtual {p1}, Lcom/example/ViewProxy;->getActivity()Landroid/app/Activity;

    move-result-object v1

    invoke-direct {v0, v1}, Linfo/whistlr/SquareNavBar;-><init>(Landroid/content/Context;)V

    .line 43
    .local v0, "hcsb":Linfo/whistlr/SquareNavBar;
    new-instance v1, Linfo/whistlr/View$1;

    # ...

At this point, you have two options.

1️⃣ Traverse the entire codebase and trace code flows
2️⃣ Add a huge amount of logging calls and recompile the app.

Of course, you want to do the second option.

The best part of working with disassembled Android code is that calls to built-in methods and classes always have their "real" names. What does this mean? It means that calls to things like HttpsURLConnection, the standard HTTPS client on Android for doing network requests, happen with that name—so you can just search for them and log around them.

Finding a Vector

So I need to look for references to HttpsURLConnection—I could literally use grep for this:

grep -r HttpsURLConnection name-of-apk/smali/

In my vendor application's case, all the references were localized to a couple of files. The vendor obviously has streamlined their API so it's only called in a couple of places, which makes it really easy for us to log.
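
If you'd rather see a per-file count than raw grep output, the same search can be sketched in Python. This is just an illustration: the directory tree below is a throwaway one with made-up file names and contents, and `count_references` is a hypothetical helper, not part of any tool mentioned here.

```python
import os
import tempfile
from collections import Counter


def count_references(root: str, needle: str) -> Counter:
    """Count occurrences of `needle` in every .smali file under `root`."""
    hits = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".smali"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                count = sum(line.count(needle) for line in handle)
            if count:
                hits[os.path.relpath(path, root)] = count
    return hits


# Demo on a temporary tree; the file names and contents are invented.
with tempfile.TemporaryDirectory() as root:
    samples = {
        os.path.join("com", "example", "a.smali"): (
            "invoke-virtual {v0}, Ljavax/net/ssl/HttpsURLConnection;->connect()V\n"
            "invoke-virtual {v0}, Ljavax/net/ssl/HttpsURLConnection;->disconnect()V\n"
        ),
        os.path.join("com", "example", "b.smali"): 'const-string v1, "hello"\n',
    }
    for relative_path, text in samples.items():
        full_path = os.path.join(root, relative_path)
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w", encoding="utf-8") as handle:
            handle.write(text)
    demo_hits = count_references(root, "HttpsURLConnection")

# Files with the most references first: these are the ones worth opening.
for path, count in demo_hits.most_common():
    print(count, path)
```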

I opened up one of those files (in my case, it was smali/com/vendorname/client/comms/c/b.smali) and poked around. This code, for example, sets the User-Agent to the result of invoking a static method:

const-string v1, "User-Agent"
invoke-static {}, Lcom/vendorname/client/comms/c/a;->a()Ljava/lang/String;
move-result-object v2
invoke-virtual {v0, v1, v2}, Ljavax/net/ssl/HttpsURLConnection;->setRequestProperty(Ljava/lang/String;Ljava/lang/String;)V

And there's a huge amount of similar code here, including setting the request URL and payload—this is perfect! 🎉👨‍💻


Aside: Bytecode 101

The Smali code, which is just readable Android bytecode, has classes, static methods, and instance methods. But there are no variables or declarations. Instead, each method has a number of registers.

We see the number of registers at the top of any method:

.method public writeToParcel(Landroid/os/Parcel;I)V
    .locals 2  # <-- this has two local registers

In total, this method actually has five registers: two locals (referenced as v0 and v1) and three parameters (the implicit this object as p0, the Parcel argument as p1, and the int argument as p2). There are two things to remember:

  1. On instance methods, p0 is always the this object.
  2. The parameters can also be referenced as locals, at the parameter number plus the number of locals. So in our example, p0 is equivalent to v2 (the next unused local).
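
The arithmetic behind that aliasing is small enough to sketch in Python. `parameter_aliases` is a hypothetical helper written for this post, not part of any Smali tooling. It also shows why code that refers to pN keeps working after you change `.locals`: the p names stay the same even though the underlying v numbers shift.

```python
from typing import Dict


def parameter_aliases(locals_count: int, parameter_registers: int) -> Dict[str, str]:
    """Map each pN name to the vN register it shares.

    Parameter registers sit directly after the locals, so pN is v(locals + N).
    """
    if locals_count < 0 or parameter_registers < 0:
        raise ValueError("register counts cannot be negative")
    return {f"p{n}": f"v{locals_count + n}" for n in range(parameter_registers)}


# With `.locals 2`, the first parameter register p0 is the same register as v2.
print(parameter_aliases(locals_count=2, parameter_registers=1))

# After bumping to `.locals 3`, the name p0 is unchanged but now aliases v3.
print(parameter_aliases(locals_count=3, parameter_registers=1))
```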

The reason Android turns a regular method into this register approach is that it's trying to use as few variables as possible. If you have a method that uses tens of variables but only one at a time, the Android compiler can conceivably pack those uses all into one register. You'll see this throughout the Smali:

const-string v1, "POST"  # store "POST" in v1
invoke-virtual {v2}, Lcom/app/client/c;->d()Ljavax/net/ssl/SSLSocketFactory;
move-result-object v1  # store the SSLSocketFactory in v1

(Importantly, the type of a register can change, but this is checked at compile time.)


Back to the file I opened before. 🗃️🔙

Logging for Fun and Profit

We basically want to find any strings—HTTP URLs, parameter names—and then log them to Android's console.

We can call Android's Log methods to make this happen. Log.v takes two arguments: a "tag" and a string. The tag is useful, because I can put a unique string there so we can find the output quickly in Android logs.

To declare a new tag, the easiest thing to do is bump the number of .locals so we can declare a new string in the highest local (the count isn't fixed; the bytecode is just trying to be efficient). If your method has, e.g., six locals, then just add another one:

.method public run()V
    .locals 7  # was 6, increase this number as much as you need

Now, to log a register—like our User-Agent from before—all you need to do is:

const-string v1, "User-Agent"
invoke-static {}, Lcom/vendorname/client/comms/c/a;->a()Ljava/lang/String;
move-result-object v2
invoke-virtual {v0, v1, v2}, Ljavax/net/ssl/HttpsURLConnection;->setRequestProperty(Ljava/lang/String;Ljava/lang/String;)V

# now log "User-Agent" and its value
const-string v6, "__XXX__garagedoor__XXX__"
invoke-static {v6, v1}, Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I
invoke-static {v6, v2}, Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I

I found the URL that the service was calling, because the application was constructing a java/net/URL and I logged the v4 string passed to it.
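
Since every log point follows the same two-step pattern (bump `.locals`, then add a const-string plus one Log.v call per register), I could have generated the snippets with a tiny helper. The Python below is a sketch under that assumption. Both function names are made up, and it only produces text for you to paste into a Smali file by hand.

```python
import re
from typing import List


def bump_locals(smali: str, extra: int = 1) -> str:
    """Increase the first `.locals N` directive in the text by `extra`."""
    def replace(match):
        return f"{match.group(1)}{int(match.group(2)) + extra}"

    return re.sub(r"^([ \t]*\.locals[ \t]+)(\d+)", replace, smali,
                  count=1, flags=re.MULTILINE)


def smali_log_lines(tag: str, tag_register: str, value_registers: List[str]) -> List[str]:
    """Build Smali lines that log each String register under `tag`.

    Assumes `tag` contains no quotes and that every register holds a String.
    """
    lines = [f'const-string {tag_register}, "{tag}"']
    for register in value_registers:
        lines.append(
            f"invoke-static {{{tag_register}, {register}}}, "
            "Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I"
        )
    return lines


print(bump_locals(".method public run()V\n    .locals 6\n"))
for line in smali_log_lines("__XXX__garagedoor__XXX__", "v6", ["v1", "v2"]):
    print(line)
```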

I also found the JSON payload of the POST request—a goldmine!—by finding calls to getOutputStream on HttpsURLConnection, which the code then used to write a String to. That string was the payload of the request:

# this code gets OutputStream (v3) and creates a DataOutputStream (v2) around it
invoke-virtual {v0}, Ljavax/net/ssl/HttpsURLConnection;->getOutputStream()Ljava/io/OutputStream;
move-result-object v3
invoke-direct {v2, v3}, Ljava/io/DataOutputStream;-><init>(Ljava/io/OutputStream;)V

# **add this** code to log v1...
const-string v6, "__XXX__garagedoor_payload__XXX__"
invoke-static {v6, v1}, Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I

# ... because the app writes the string in v1 to our DataOutputStream (v2)
invoke-virtual {v2, v1}, Ljava/io/DataOutputStream;->writeBytes(Ljava/lang/String;)V

Ok, so I added some log statements. What now? 🤔

Back To the Device

I can now use apktool and a few other tools to reassemble my "own" version of the APK. If you've made any mistakes in your Smali, apktool will complain here—it's basically recompiling the code. This is my flow:

# apktool to reassemble the APK from the decoded directory (the one containing apktool.yml)
apktool b -f name-of-apk/ -o replacement.unaligned.apk

# you'll first need to generate a 'keystore' (e.g. with the JDK's keytool);
# the last argument to jarsigner is the alias of the key inside that keystore
jarsigner -verbose -sigalg MD5withRSA -digestalg SHA1 \
  -keystore my-key.keystore -storepass YOUR_KEYSTORE_PASSWORD \
  replacement.unaligned.apk YOUR_KEYSTORE_NAME

# required, tool from Android SDK
zipalign -v 4 replacement.unaligned.apk replacement.apk

# finally- install onto your device!
adb install -r replacement.apk

Before installing the application, it's worth noting that I have to sign it with my own keystore, so I can't overwrite the app I installed via the Play Store (Android rejects an update signed with a different key). I uninstalled that version first.

Note that the tools above come from a few different places: jarsigner ships with the JDK, zipalign is part of the Android SDK build-tools, and adb is part of the Android SDK platform-tools.

Reading logs

Now, I just open up the application and sign in like I normally would (the application has an odd account system, but it's not really relevant for now).

I can just use adb logcat, grepping for the tag before—and I'll see a bunch of interesting log statements:

$ adb logcat | grep __XXX__garagedoor
V/__XXX__garagedoor__XXX__(  871): User-Agent
V/__XXX__garagedoor__XXX__(  871): name-of-user-agent; v1.0

Great! Phew! Wow, that wasn't short. I'm not going to cover more on the Android side—it's interesting but now you have all the tools you need to go and find out what the application is doing.
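
Once you have more than a handful of log points, grep gets noisy. Here is a small Python sketch that pulls just the messages out of lines in the format shown above. One assumption: newer versions of adb print a different default format, so you may need `adb logcat -v brief` to get lines shaped like these. The sample lines reuse the output from this post, plus one invented line from another tag.

```python
import re
from typing import Iterable, Iterator

# Matches logcat's "brief" format: PRIORITY/TAG(  PID): MESSAGE
BRIEF_LINE = re.compile(r"^([VDIWEF])/(.*?)\s*\(\s*(\d+)\): (.*)$")


def messages_for_tag(lines: Iterable[str], tag_prefix: str) -> Iterator[str]:
    """Yield the message part of each line whose tag starts with `tag_prefix`."""
    for line in lines:
        match = BRIEF_LINE.match(line.rstrip("\n"))
        if match and match.group(2).startswith(tag_prefix):
            yield match.group(4)


sample_lines = [
    "V/__XXX__garagedoor__XXX__(  871): User-Agent",
    "D/SomeOtherTag(  123): unrelated noise",  # invented line for the demo
    "V/__XXX__garagedoor__XXX__(  871): name-of-user-agent; v1.0",
]

found = list(messages_for_tag(sample_lines, "__XXX__garagedoor"))
for message in found:
    print(message)
```

To run it against a device, you could feed it `sys.stdin` and pipe `adb logcat -v brief` into the script.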

Payload Peculiarities

One thing I'll note is that the payloads sent over HTTPS by my vendor's application were actually encrypted in a fairly odd fashion, despite the fact that HTTPS already encrypts the payload. So at face value, they were quite hard to parse. But remember, that encrypted payload was just generated somewhere else in the app. 🤷

So when I found the origin and the code that did the encryption, it was trivial(-ish) to reproduce the same steps myself.


Once I'd worked out the protocol the application used, the rest was actually fairly straightforward. There are a few parts.

My App Engine backend

App Engine is perfect for small HTTP servers: it's basically free for small projects, and supports Python, Node.js, Go and other languages. I wrote a small Go server which implemented the vendor's protocol.

When you hit it with an HTTP POST, it does a login/request dance to the vendor's API. It looks a bit like:

Diagram of HTTP requests

My App Engine backend has knowledge of the account needed to hit the vendor's API. But I require HTTP callers to know the 'secret'—so anyone who discovers the API can't just open my garage door randomly (imagine a @OpenMyGarageDoor Twitter bot).
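
My real backend is written in Go and I'm not publishing it, but the 'secret' check is simple enough to sketch. The Python below is an illustration only: the `secret` form field, the placeholder secret value and the `open_door` callable (standing in for the whole vendor login/request dance) are all hypothetical.

```python
import hmac
from typing import Callable, Dict, Tuple

SHARED_SECRET = "replace-me-with-a-long-random-string"  # hypothetical value


def is_authorized(presented: str, expected: str = SHARED_SECRET) -> bool:
    """Compare secrets in constant time to avoid leaking how much matched."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def handle_post(form: Dict[str, str], open_door: Callable[[], None]) -> Tuple[int, str]:
    """Return an (HTTP status, body) pair for an incoming POST."""
    if not is_authorized(form.get("secret", "")):
        return 403, "forbidden"
    try:
        open_door()  # stand-in for the calls to the vendor's API
    except Exception as error:
        return 502, f"vendor call failed: {error}"
    return 200, "ok"


calls = []
print(handle_post({"secret": "wrong"}, lambda: calls.append("opened")))
print(handle_post({"secret": SHARED_SECRET}, lambda: calls.append("opened")))
print(calls)
```

Returning a real error status costs nothing on the backend side, even if the caller ends up ignoring it.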

IFTTT integration

IFTTT allows you to set up custom "Applets" that run on certain Google Assistant triggers. This is how I actually use the door.

IFTTT applet

This applet is pretty simple: I set up a number of voice commands, like "open the garage", which call an HTTP endpoint. One challenge with this IFTTT integration, though, is that IFTTT will always say the precanned response (e.g. "Yes, I'm opening the door!") even if the HTTP request fails. There seems to be no way to really propagate errors through IFTTT. 🚨👎

Google Voice accounts

IFTTT works specifically with your account. In my household, where I'm not the only one with a smart device, that's not ideal—it only opens from my phone.

However, if your Google Home device has multi-user support, the extra voice action—"open the garage door"—seems to be shared with all local users. So if my partner or visitors say "open the garage" inside the house, to a Home, it works fine. Currently though, they can't trigger the same thing from their phones, unless they also set up the IFTTT applet.


Finished

Thanks for reading! I hope you've learned a bit about how easy it is to get started with software reverse engineering, and that maybe it's inspired you to go make your house work for you. 🏡👩‍🏭

If I spent more time on this, I'd probably work to remove the IFTTT part: not returning a failure code if the request fails is quite frustrating, and it means occasionally nothing will happen and I can't find out why. Google has other options such as Dialogflow and Actions on Google which I've not really looked at yet, but IFTTT has been great for prototyping.

And if the company I bought my product from had Assistant support, I never would have had a go at this! So thanks for giving me a fun project. 😂

Thanks

Comments

Massimo Artizzu

Ahahah that's awesome, Sam!
I'll finish to read this later and I'll forward it to my colleagues, but all I can say now is that it's pretty inspirational. Pushes to reveal the hacker in us! 🛠️

Chingiz Huseynzade

Omg, it was super duper article that I read today. So much inspiration and motivation to look forward what can developer achieve if we need to do something. Thanks for the great article. <3
