Real-Time Record and Replay on
Android for Malware Analysis
Zong Shen Shen
Chia Wei Hsu
Lin Chun Huang
Shan Shin Li
Shiuh Pyng Shieh
CISC 2013
Outline
• Introduction
  • Challenge & contribution
• Assessment factors for R&R
  • Two possible architectures
  • Three key factors
  • Prototype implementation
• Discussion
  • Experiment
  • Limitation
Why Record & Replay?
• To apply well-developed tools for precise analysis
• To migrate the heavy-weight procedures from end
device to host machine
• Mobile environment replication on dedicated servers
• Execution trace recording on the mobile
• Trace replaying for mobile image on the server
• Versatile engine deployment on the server to trace
behavior revealed by the image
Challenge
• Conventional R&R schemes emphasize fine-grained
state consistency; the resulting high resource
consumption is impractical on a real device
• A balance must be found between accuracy and
computation overhead
Observation & Objective
• Prolific human-machine interaction on Android
• Most of the program entry points are components
listening for user commands which can determine the
execution flow
• Controlling the UI events can control the program
behavior in most cases
• Application-level record and replay
• UI events as execution trace
• Concrete program behavior as consistency metric
Contribution
• This research proposes a new architecture for
end-point protection and evaluates suitable
approaches for real-world deployment
• It exploits well-developed analysis engines
without incurring significant overhead on the end
device
Assessment Factors for R&R
• Two possible architectures
• Mobile to server side emulator
• Mobile to server side mobile
• Three key factors
• The scalability to serve multiple users
• The configuration effort to build the system
• The preciseness of replayed behavior
Mobile to Server Side Emulator
• High scalability
• Multiple images to multiple users mapping
• Low configuration effort
• The convenience of off-the-shelf SDK tools
• Low replay preciseness
• Unfaithful emulated environment
Mobile to Server Side Mobile
• Low scalability
• One device to one user mapping
• High configuration effort
• Requiring customization of replay device
• High replay preciseness
• Full support of telecom services and hardware gadgets
Chosen Architecture
• Mobile-to-emulator
• With the Android SDK, analysts can configure the
experiment platform more efficiently
• Improvements to the emulated environment can bridge
the gap between the emulator and a real device
• R&R agent survey
• Prototype implementation
How to build R&R Tools?
• Application layer
• Injecting analytics modules into the UI event
receivers of interest in the analyzed app
• Aspect-oriented programming
• System layer
• Monitoring the system-wide events fired from
hardware gadgets
• Linux Getevent & Sendevent service
Aspect-Oriented Programming
• Aspect
• Relevant information recording for UI events
• Join point
• The entry point of each event receiver
• Incompatible code problem
• The aspect is instantiated in JVM format, but the join
point of the target app is in DVM format
• Code transformation is a possible solution but carries
the risk of semantics loss
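The aspect and join-point idea above can be illustrated with a minimal sketch. It is written in Python as an analogy only: a decorator stands in for the AOP weaver, and it is not the AspectJ-style tooling or bytecode transformation the slides refer to. The names `TRACE`, `record_ui_event`, and `on_click` are hypothetical.

```python
import functools
import time

# Hypothetical in-memory trace; a real recorder would ship entries to the server.
TRACE = []


def record_ui_event(receiver):
    """Aspect: log relevant information whenever an event receiver is entered."""

    @functools.wraps(receiver)
    def wrapper(*args, **kwargs):
        # Join point: the entry of the event receiver.
        TRACE.append(
            {
                "receiver": receiver.__name__,
                "args": args,
                "kwargs": kwargs,
                "time": time.time(),
            }
        )
        return receiver(*args, **kwargs)

    return wrapper


@record_ui_event
def on_click(view_id):
    # Stand-in for an app's UI event receiver.
    return f"clicked {view_id}"
```

Calling `on_click("send_button")` runs the receiver unchanged and appends one entry to `TRACE`, which is the separation of concerns the aspect is meant to provide.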
Linux Getevent & Sendevent
• Getevent
• Providing a live dump of hardware events
• Sendevent
• Injecting events into target hardware gadget to trigger
its action
• Built-in tools on both the Android emulator and real devices
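As a rough illustration of how a recorded event could be turned into a replay command, the sketch below parses one line of `getevent` output and builds the matching `sendevent` command. It assumes the default `getevent` line format (device path followed by type, code, and value in hexadecimal, optionally prefixed by a bracketed timestamp) and relies on `sendevent` taking decimal arguments. The function names are hypothetical, and this is not the authors' converter.

```python
import re

# Assumed line shape: "[  123.456789] /dev/input/event0: 0003 0035 000001f4"
# (the bracketed timestamp is optional).
GETEVENT_LINE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?"
    r"(?P<device>/dev/input/event\d+):\s+"
    r"(?P<type>[0-9a-fA-F]{4})\s+"
    r"(?P<code>[0-9a-fA-F]{4})\s+"
    r"(?P<value>[0-9a-fA-F]{8})$"
)


def parse_getevent_line(line):
    """Return (device, type, code, value) as integers, or None for other lines."""
    match = GETEVENT_LINE.match(line.strip())
    if match is None:
        return None  # e.g. "add device ..." header lines
    value = int(match.group("value"), 16)
    if value >= 0x80000000:
        value -= 0x100000000  # event values are signed 32-bit integers
    return (
        match.group("device"),
        int(match.group("type"), 16),
        int(match.group("code"), 16),
        value,
    )


def to_sendevent_command(event):
    """Build a sendevent command line; its arguments are decimal."""
    device, ev_type, code, value = event
    return f"sendevent {device} {ev_type} {code} {value}"
```

For instance, `/dev/input/event0: 0003 0035 000001f4` maps to `sendevent /dev/input/event0 3 53 500`. The device node on the emulator may differ from the one on the phone, so a real converter would likely also need to remap the device path.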
Prototype Implementation
• Recorder
• Event Dumper
• Event Filter
• App Monitor
• Packet Encapsulator
• Replayer
• Packet Decapsulator
• Event Converter
• App Launcher
• Event Sender
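The Packet Encapsulator and Packet Decapsulator stages can be sketched as a round trip over a small binary format. The wire layout here is purely an assumption for illustration (a 2-byte event count, then per event a 1-byte device index, 2-byte type, 2-byte code, and signed 4-byte value, all in network byte order); the slides do not specify the prototype's actual packet format.

```python
import struct

# Hypothetical wire format, network byte order, no padding.
HEADER = struct.Struct("!H")     # number of events in the packet
EVENT = struct.Struct("!BHHi")   # device index, type, code, signed value


def encapsulate(events):
    """Recorder side: pack (device_index, type, code, value) tuples into bytes."""
    payload = HEADER.pack(len(events))
    for event in events:
        payload += EVENT.pack(*event)
    return payload


def decapsulate(packet):
    """Replayer side: recover the list of event tuples from a packet."""
    (count,) = HEADER.unpack_from(packet, 0)
    offset = HEADER.size
    events = []
    for _ in range(count):
        events.append(EVENT.unpack_from(packet, offset))
        offset += EVENT.size
    return events
```

A fixed-size record per event keeps the recorder side cheap, which matches the stated goal of keeping computation on the phone light. A real deployment would likely add timestamps and an identifier for the foreground app.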
Prototype Demo
• https://www.youtube.com/watch?v=YfgrfNddp9g
Experiment (1/2)
1. After install
2. After reboot
Experiment (2/2)
1. After reboot
2. Installing new package
3. Fake Google search
Limitation
• Complicated event-driven model of Android apps
• This research focuses on UI events but does not
address background broadcast events
• Unfaithful emulated environment
• The subject on the server may crash if it invokes
unsupported services and gadgets
Conclusion
• Two R&R architectures are proposed
• Mobile-to-emulator is the better choice due to high
scalability and low configuration effort
• Two R&R agents are discussed
• AOP has great potential, but Linux Getevent & Sendevent
can ease the effort of test bench construction
• Accuracy issue
• Background events should be covered to improve
replay consistency


Editor's Notes

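  • #5 However, the R&R techniques developed for desktop systems are mainly used to locate subtle software flaws, so they stress keeping the record and replay ends as precisely consistent as possible, so that important bugs can be reproduced during analysis. Most of these methods record system call arguments and the synchronous/asynchronous signals issued by the system. This produces a very large volume of trace data and consumes considerable computing resources, so they still cannot be deployed directly on a real phone. This creates an interesting challenge: we must find a balance between accuracy and resource consumption in a real phone environment. Although very precise R&R is out of reach, we can at least reproduce the important behavior on the server side while avoiding heavy computation on the phone.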
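  • #6 We observed a very important characteristic of Android: its applications feature rich human-machine interaction. In essence, Android applications follow the event-driven programming paradigm. A program consists of multiple entry points, and these entry points are components waiting for user commands. When these components receive different commands from the user, the subsequent handler routines switch to different primary tasks accordingly, which changes the application's execution flow. In most cases, if we control the UI events, we can control the application's control flow and, in turn, its externally visible behavior. Based on this observation, our goal is to design an application-level R&R system that uses UI events as the execution trace, keeping computation on the phone lightweight, and that defines state consistency as successfully reproducing concrete, externally visible behavior on the replay side. Examples of such behavior include an application receiving user input and sending it out over the network, or an application recording audio on the user's command and writing it to a file.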
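  • #7 The most important contribution of this research is a new malicious-behavior analysis architecture that achieves real-time end-point protection. That is, we deploy a lightweight execution trace recorder in the user environment to record the events that occur there, and periodically send the trace log to the server for replay, while using a variety of malicious-behavior analysis and detection tools to capture suspicious behavior. In addition, we evaluate and give guidance on the implementation and deployment models best suited to real-world use, as a reference for future research in related fields.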
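  • #10 Next, we introduce the Mobile to Server Side Mobile architecture. Its main difference from the first architecture is that the replay side performs trace replay on a real phone. Analysis tools can be ported directly onto the replay phone. Alternatively, the operating system can be customized to insert logging tools at important system calls and object accesses, so that the replay image generates behavior patterns as it runs and an attached external machine performs the behavior analysis. This architecture appears more complex than the former, but we designed it mainly to make the replay environment more realistic and to reproduce, as far as possible, behavior unique to the phone environment. To improve the practicality of the overall system, we try to let one replay phone serve multiple users.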
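  • #11 After evaluating the three factors above, we consider Mobile-to-Emulator the architecture that is easier to deploy at this stage. First, the Android SDK provides very convenient tools for building the experiment platform and for scaling the system later. Second, although the emulator's fidelity is still insufficient at this stage, customizing emulator functionality and relying on advances in Android virtualization technology can gradually narrow the gap between the emulator and a real phone.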
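  • #13 In this section, we introduce the R&R tools and evaluate their strengths and weaknesses. The first is aspect-oriented programming (AOP). AOP is a programming model that improves modular program design. The main idea is to extract the fragments that lie outside a program's main execution flow and turn them into reusable modules, called aspects in AOP terminology. Programmers can also design their own aspects. With the help of an AOP engine, an aspect is instantiated as code and woven into a specified location in the target program; this insertion location is called a join point. For this project, we can design the following: define the aspect as the computation needed to record UI events, and define the join points as all event receivers in the application.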
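  • #14 The second is G&S (Getevent & Sendevent), a service available by default on both real phones and the emulator. Getevent provides a live event dump that lists the events currently generated by all hardware devices in the environment. Sendevent injects a specific event into a target hardware device, causing it to produce the corresponding behavior. Although this tool still needs to be wrapped so that it selects only the events we are interested in, using it avoids much of the trouble of porting tools and libraries, so it is the more suitable tool to adopt at this stage.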
  • #15 For the prototype implementation, we adopt Mobile to Server Side Emulator as the experimental architecture and Linux G&S as the R&R tool.
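  • #19 Our current R&R design faces two limitations. The first concerns the Android app event-driven model. We currently focus only on recording UI events, expecting that controlling these events is enough to control the app's workflow. However, Android apps have another special component called the Broadcast Receiver, which listens for system broadcast events. After a Broadcast Receiver receives a system event, there may be execution paths that change the application's main execution flow. If an application registers many such Broadcast Receivers, it becomes harder for us to guarantee the consistency of the program's execution flow. The second is the widespread repackaging of Android malware. The infected carrier and the malicious payload do not necessarily depend on each other, yet many of the events we currently record are events that affect the infected carrier. The biggest shortcoming is that we have not yet evaluated the impact of the infected carrier on replay precision.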