Theses

Open Thesis Topics

12 entries found


The majority of static analysis tools focus on generating the call graph of the whole program (i.e., both the application and the libraries that the application depends on). A popular compromise to the excessive cost of building a call graph for the whole program is to build an application-only call graph. To achieve this, all the effects of the library code are usually ignored. This results in potential unsoundness in the generated call graph and therefore in analyses that use it.

Ali and Lhoták present and evaluate Averroes, a tool that generates a placeholder library that over-approximates the possible behaviour of an original library. The placeholder library can be constructed quickly without analyzing the whole program. Any existing whole-program call graph construction framework can use the placeholder library as a replacement for the actual libraries to efficiently construct a sound and precise application call graph. A natural extension to Averroes would be applying it to large Java frameworks (e.g., Android, J2EE, Eclipse Plug-ins). In particular, applying Averroes to the Android framework will lead to an easier means of analyzing client apps without the need to analyze the Android SDK. Like a library, the Android SDK satisfies the separate compilation assumption because it is developed without knowledge of the client apps that will be developed for it.

In this thesis, you will extend Averroes to support generating analyzable placeholder libraries for the Android SDK. One major challenge is that in Android, the main entry point to the program resides in the framework rather than in the client app. Additionally, the Android SDK makes many callbacks into the client app. Therefore, improving the precision with which Averroes handles library callbacks would be necessary to achieve better results. Finally, Averroes has to reason about the lifecycle of an Android app, similar to what FlowDroid does. Otherwise, unrealizable paths would be present in the analyses used by Averroes, which would render the generated placeholder library highly imprecise.

Ideal candidates have experience with static analysis, in particular for Java. Prior knowledge of developing static analyses in Soot and Android app development is helpful but not necessary.

 

Thesis opening as PDF

 

Interested? Please contact Karim Ali at karim.ali@remove-this.cased.de

Many modern Android applications make heavy use of native code written in C or C++ to speed up computation-intensive operations such as scene rendering for games or photo/video processing. While such unmanaged code is helpful or even required for application development, it also poses new security challenges. State-of-the-art static data flow trackers for Android such as FlowDroid do not support analyzing native code and instead apply heuristics on the effect of calls into such code. Pure native code analysis tools, on the other hand, usually have no notion of how the code interacts with an Android app and its environment. This gap in analysis techniques allows malware to hide behavior from automated vetting processes by mixing Android and native code in the same application. Manually linking the data flows from both worlds is a cumbersome undertaking that quickly becomes infeasible for larger applications.
In this thesis, you will evaluate existing tools for native code analysis and how they can be integrated into the FlowDroid data flow tracker for Android apps. You will implement a hybrid data flow analysis which can track flows between Android code and native libraries and evaluate it on real-world benign and malware apps.


Requirements: Ideal candidates have a profound understanding of the Java and C/C++ programming languages and experience with good software design and efficient programming. Prior knowledge of static analysis is helpful, but not absolutely necessary. 

 

Thesis opening as PDF

 

Interested? Please contact Steven Arzt at Steven.Arzt@remove-this.ec-spride.de

16.09.2014

Semantic Data Flow Aggregation for Security

Bachelor's thesis

Available


Scanning large Android apps or Java programs for data leaks or other security weaknesses usually results in hundreds, if not thousands, of findings. Existing tools display these findings in isolation even though many of them have a common cause such as a missing validation or a common vulnerable component. Much time can be saved if these findings could be aggregated, pointing the human analyst directly to the common parts of similar findings and proposing possible places for fixes or further inspection.

The problem is aggravated by the presence of false positives. One false positive in heavily re-used code can lead to hundreds of false data flows being reported. In existing tools, all these findings must be checked and marked as false positives in isolation. An ideal tool would however allow the analyst to mark the common mistake as a false positive and then automatically apply this knowledge to filter all consequences of this mistake.

In this thesis, you will explore possibilities to aggregate data flows using exact (common subgraphs) and inexact (machine learning) techniques and raise the level of abstraction in the interaction with static analysis tools. You will apply your techniques to the FlowDroid open-source taint tracking tool and its existing Eclipse plugin for visualization.
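The exact (common-subgraph) direction can be illustrated with a small Python sketch. It is only an illustration under assumptions: each finding is modeled as a list of node labels from source to sink, and the function names `group_by_shared_node` and `filter_false_positives` as well as the sample labels are hypothetical and not part of FlowDroid.

```python
from collections import defaultdict

def group_by_shared_node(findings):
    """Map each intermediate node to the ids of the findings passing through it.

    `findings` maps a finding id to its path, a list of node labels running
    from source to sink. Only nodes shared by at least two findings are kept.
    """
    by_node = defaultdict(set)
    for finding_id, path in findings.items():
        for node in path[1:-1]:          # skip the source and the sink
            by_node[node].add(finding_id)
    return {n: ids for n, ids in by_node.items() if len(ids) > 1}

def filter_false_positives(findings, false_nodes):
    """Drop every finding whose path contains a node marked as a false positive."""
    return {fid: path for fid, path in findings.items()
            if not any(node in false_nodes for node in path)}

# Made-up sample findings (labels are illustrative only).
findings = {
    "f1": ["source.deviceId", "Util.wrap", "sink.log"],
    "f2": ["source.location", "Util.wrap", "sink.sms"],
    "f3": ["source.deviceId", "Crypto.hash", "sink.http"],
}
shared = group_by_shared_node(findings)
remaining = filter_false_positives(findings, {"Util.wrap"})
```

Here `shared` points the analyst to `Util.wrap` as the part common to two findings, and marking that single node as a false positive removes both findings at once.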

Requirements: Ideal candidates have a profound understanding of the Java language and experience with good software design and efficient programming. Prior knowledge of static analysis is helpful, but not absolutely necessary.

Thesis opening as PDF

Interested? Please contact Steven Arzt at Steven.Arzt@remove-this.ec-spride.de

31.03.2014

Generating Android Malware App with Zero Permissions

Bachelor's thesis, Master's thesis

Available


Various studies and news reports have shown that many malicious apps exist in the wild. Even Google’s Play Store cannot effectively prevent malware from entering the store. There are various kinds of malicious apps, but the most widespread kind leaks personal information such as the user’s contacts, the installed applications, or the current location. Fortunately, during installation of such apps, the user sees the permissions an app requires (e.g., for accessing contacts or location information) and can decide whether to install the app. We have recently started to explore attack vectors that let a malicious application exercise privileged operations with zero permissions (or at most one). Such an app would be able to leak all of the user’s sensitive information without the user noticing. The goal of this thesis is to evaluate this attack vector and to construct a tool that generates such malicious applications from a given specification. More concrete details will be given in a personal meeting with the supervisor.

 

Requirements: Knowledge about Android is required (implementation of own Android apps would be beneficial), as is interest in Android security and software engineering. Background knowledge in program analysis is beneficial.

 

Are you interested? Please contact Siegfried Rasthofer at siegfried.rasthofer@ec-spride.de / +49 61 51 16-75425


Thesis opening as PDF

As in any other software project, developers of static analyses go through many test-and-fix cycles, and these cycles involve debugging. Static analyses, however, tend to be hard to debug because they are typically performed on code bases with huge call graphs containing thousands of nodes. When it comes to scalability issues or flows propagating through large portions of such call graphs, traditional debugging tools offer only a narrowly focused view of small fractions, which is of little use. To see the bigger picture, one typically introduces additional logging, either text-based or, even better, as graph-based representations.

 

While working on a static analysis in our group, we initially relied on the built-in textual logging of the analysis framework. Very soon, the output became too complex to be comprehensible. Since our main focus was the static analysis itself, we started creating simple static graphs using Graphviz/Dot like the one shown above. These static graphs are better than the textual logging, but still have their limitations. Above all, such an approach does not scale to large scenarios because information cannot be filtered or aggregated on demand.

 

In this topic, the existing static graph generation is to be replaced by an implementation that supports interactions such as filtering, aggregating, showing additional information on demand, and highlighting single paths to track flows.

As in any other software project, developers of static analyses go through many test-and-fix cycles, and these cycles involve debugging. Static analyses, however, tend to be hard to debug because they are typically performed on code bases with huge call graphs containing thousands of nodes. When it comes to scalability issues or flows propagating through large portions of such call graphs, traditional debugging tools offer only a narrowly focused view of small fractions, which is of little use. To see the bigger picture, one typically introduces additional logging, either text-based or, even better, as graph-based representations. These techniques, however, introduce additional computational effort and memory consumption. Therefore, in some cases it is not possible to log in full detail, and some aggregation has to be introduced. It is often unclear up front which information can be aggregated or filtered and which information has to be highlighted. The developer of the analysis may therefore have to change these settings often to try out different aggregations. Since some static analyses have long initialization phases, which are typically not affected by these changes, the test-and-fix cycles will nevertheless be long, which actively hinders the developer from making changes and instantly observing the results.

 

In this topic, an existing static analysis based on Soot has to be split into two parts. The first part should contain the stable initialization, which loads the bytecode of the Java Class Library, transforms it into an intermediate representation, generates a call graph, and keeps these artifacts in memory. The second part should contain the analysis code itself. It should be possible to reload the second part dynamically without losing the progress of the first part. One possible solution might be to use the OSGi framework and place each part of the analysis in a separate OSGi bundle. OSGi will then allow reloading the analysis bundle without restarting the whole program. The main challenges will therefore be to find a good split, eliminate side effects, free memory allocated by replaced code, stop the execution of code scheduled to be replaced, and provide build infrastructure that integrates easily into the typical Eclipse development workflow.
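The intended split can be illustrated with a small Python analogue. This is not OSGi and not Soot: it is a minimal sketch that swaps in plain Python source reloading for bundle reloading, and the class `AnalysisHost`, the function `demo`, and the toy call graph are hypothetical names chosen for illustration. The host runs its (stand-in) initialization once and re-reads the analysis source on every run.

```python
import os
import tempfile
import types

class AnalysisHost:
    """Keeps the expensive, stable artifacts alive across analysis reloads."""

    def __init__(self):
        self.init_runs = 0
        self.call_graph = self._initialize()

    def _initialize(self):
        # Stand-in for the slow phase (loading bytecode, building the call graph).
        self.init_runs += 1
        return {"main": ["a", "b"], "a": [], "b": []}

    def run(self, analysis_path):
        # Re-read and re-execute the analysis source on every run, so edits
        # take effect without repeating the initialization above.
        with open(analysis_path, encoding="utf-8") as handle:
            source = handle.read()
        module = types.ModuleType("analysis")
        exec(compile(source, analysis_path, "exec"), module.__dict__)
        return module.analyze(self.call_graph)

def demo():
    host = AnalysisHost()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "analysis.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("def analyze(cg):\n    return len(cg)\n")
        first = host.run(path)       # counts the nodes
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(
                "def analyze(cg):\n    return sum(len(v) for v in cg.values())\n")
        second = host.run(path)      # edited analysis: counts the edges
    return first, second, host.init_runs
```

The sketch leaves out the hard parts named above, such as stopping running code and releasing memory held by the replaced version, which is where a framework like OSGi would come in.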

Security assurance enables developing a coherent, objective argument that supports the claim that a software product mitigates its security risks. A security assurance case, a semi-formal approach to security assurance, is a collection of security-related claims, arguments, and evidence. Security assurance cases are currently developed separately from the software.

The goal of the proposed thesis is to investigate associating security assurance cases with software code. Questions to investigate may include: How can evidence collection activities be modeled? What is the impact of code changes on the security assurance cases of software? The work includes the development of an Eclipse plugin to model security assurance cases and to associate the artifacts of the assurance cases with software code.


Candidates should have good experience with Java and Eclipse and be interested in engineering secure software.


Are you interested? Please contact Lotfi ben Othmane at lotfiben.othmane[at]cased.de


Thesis opening as PDF

The Java Class Library (JCL) is heavily used, Java being one of the most widely adopted programming languages, and it is an implicitly trusted library on which many mission-critical applications are based. In order to prevent abuse, Java has a sophisticated security model to ensure the isolation of protected areas inside a program. However, attackers have found and continue to find ways to disable the security model, thus rendering it useless.

One way of effectively evading the Java security model is to perform operations in native code. Since attackers cannot easily introduce new native libraries during an attack, they are keen to abuse an exploitable part of the native code already used in the JCL itself. As this is not a small part (roughly 800k LOC in Java 1.7) of the JCL, a manual code review looking for security vulnerabilities is hardly an option. Automated methods have to be developed to mitigate the possible threat the native part of the JCL poses.

Clearly, not every one of the roughly 1,800 native methods in the JCL constitutes a serious risk; some of them might even be completely benign. For instance, a call to java.io.FileOutputStream.write might be harmless in contrast to sun.misc.Unsafe.copyMemory. The potential threat of a native method thus depends on its treatment of the input data and the resulting expected (or unexpected) side effects it produces (e.g. memory alterations, buffer overflows, …).

In this thesis, an automated code analysis has to be developed that operates on the native part of the JCL (e.g. with LLVM) and classifies the methods visible to the Java part of the JCL according to their potential threat. Different input data can be used for this classification. Interesting signals might be (but are not limited to) functional purity, direct memory manipulation, pointer arithmetic, or type misuse. Basically anything from the current exploit literature can be applied here to achieve more precise and meaningful results.

Publications

  • Drake, Joshua J.: Exploiting Memory Corruption Vulnerabilities in the Java Runtime. 2011
  • Tan, Gang, and Jason Croft: An Empirical Security Study of the Native Code in the JDK. USENIX Security Symposium. 2008.
  • Bratus, Sergey, et al.: Exploit Programming: From Buffer Overflows to "Weird Machines" and Theory of Computation. USENIX ;login:, December 2011.

10.02.2014

Modelling the use of native methods in the Java Class Library

Bachelor's thesis, Master's thesis

Available


The Java Class Library (JCL) is heavily used, Java being one of the most widely adopted programming languages, and it is an implicitly trusted library on which many mission-critical applications are based. In order to prevent abuse, Java has a sophisticated security model to ensure the isolation of protected areas inside a program. However, attackers have found and continue to find ways to disable the security model, thus rendering it useless.

One way of effectively evading the Java security model is to perform operations in native code. Since attackers cannot easily introduce new native libraries during an attack, they are keen to abuse an exploitable part of the native code already used in the JCL itself. As this is not a small part (roughly 800k LOC in Java 1.7) of the JCL, a manual code review looking for security vulnerabilities is hardly an option. Automated methods have to be developed to mitigate the possible threat the native part of the JCL poses.

Currently, users of the public API of the JCL are completely unaware that most of their method calls will sooner or later result in a native call. Thus, they rely on the JCL to perform any necessary checks or sanitization. Application developers therefore place their trust in the non-native part of the JCL.

As the JCL and its native part have grown over the years, security reviews have become increasingly complicated to perform purely by hand. Oracle, as one of the larger contributors to Java, runs code analysis tools to aid these reviews. Due to the complex nature of the Java security model and the architecture of the JCL, finding vulnerabilities is a rather tough problem.

In this thesis, an automated static code analysis has to be implemented that evaluates how the possible threat posed by native calls propagates to the public API of the JCL. Assuming that every native method poses the same amount of threat, the propagation of this threat depends solely on the data provided to these methods. Therefore, the analysis will benefit greatly from an elaborate rating of the data, its type, the safeguards on that data, and possible treatments applied to it. In order to combine this information, techniques from data mining, machine learning, and graph or network theory can be applied to the transitive closure of the reverse call graph (around 95,000 methods) of the native methods (around 1,800 methods) in the JCL. After a successful analysis run, it should be possible to determine the risk of calling methods of the JCL. Developers can then take effective countermeasures when processing user input and make educated choices about the classes and methods they use.
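As a starting point, the transitive closure of the reverse call graph can be computed with a breadth-first search from the native methods. The following Python sketch assumes the call graph is available as a dictionary from caller to callees; the function names and the toy method labels are hypothetical, and the call distance it records is only one conceivable input to a risk rating.

```python
from collections import deque

def reverse_graph(call_graph):
    """Invert caller -> callees edges into callee -> callers edges."""
    reverse = {}
    for caller, callees in call_graph.items():
        reverse.setdefault(caller, set())
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)
    return reverse

def callers_reaching(call_graph, native_methods):
    """Return, per method, the minimal call distance to any native method.

    Methods that cannot reach native code do not appear in the result.
    """
    reverse = reverse_graph(call_graph)
    distance = {m: 0 for m in native_methods}
    queue = deque(native_methods)
    while queue:
        method = queue.popleft()
        for caller in reverse.get(method, ()):
            if caller not in distance:      # first visit is the shortest path
                distance[caller] = distance[method] + 1
                queue.append(caller)
    return distance

# Made-up toy call graph (labels are illustrative only).
call_graph = {
    "Api.save": ["Impl.write"],
    "Impl.write": ["Impl.nativeWrite"],
    "Api.length": [],
}
result = callers_reaching(call_graph, ["Impl.nativeWrite"])
```

In this toy graph, `Api.save` reaches native code in two calls, while `Api.length` never does and is therefore absent from `result`.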

Publications

  • Tan, Gang, and Jason Croft: An Empirical Security Study of the Native Code in the JDK. USENIX Security Symposium. 2008.
  • Feng, Henry Hanping, et al.: Anomaly Detection Using Call Stack Information. IEEE Symposium on Security and Privacy. 2003.

The Java Class Library (JCL) is heavily used, Java being one of the most widely adopted programming languages, and it is an implicitly trusted library on which many mission-critical applications are based. In order to prevent abuse, Java has a sophisticated security model to ensure the isolation of protected areas inside a program. However, attackers have found and continue to find ways to disable the security model, thus rendering it useless.

One way of effectively evading the Java Security Model is to perform operations in native code. Since attackers cannot easily introduce new native libraries during an attack, they are keen to abuse an exploitable part of the native code already provided by the JCL itself. As this is not a small part (roughly 800k LOC in Java 1.7) of the JCL, a manual code review looking for security vulnerabilities is hardly an option. Automated methods have to be developed to mitigate the possible threat the native part of the JCL poses.

Most attacks against the Java Security Model that exploit the native part of the JCL send specially crafted input through Java methods to the native part. This crafted input might break the native part and thus enable the Java part of the exploit to deactivate the Java Security Model (e.g. CVE-2013-2465) and continue in fully privileged mode. With an applet as the delivery method for the exploit, the number of possible targets easily becomes interesting for an attacker.

In this thesis, an automated analysis of the data flows between the VM-controlled part and the native part of the JCL has to be created. As it will be hard for a single analysis to cross the language boundary between the two parts, the analysis may run in two steps: one step analyzing the Java part of the JCL (e.g. with Soot) and another step analyzing the native part of the JCL (e.g. with LLVM). The results of both analysis steps then have to be combined to produce an overall result. A classification scheme has to be developed to characterize data flows according to their possible exploitability. For instance, some safeguards and input sanitizers might mitigate threats well, while others might not. Additionally, certain data types could be more prone to exploitation than others. Some parameters of the Java Native API, however, might not be accessible to an attacker at all.
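How the results of the two steps might be combined can be sketched as a join at the language boundary. The Python sketch below is an assumption-laden illustration, not a prescribed design: the tuple layout, the function `combine`, the effect label `unchecked_length`, and the three ratings are all hypothetical, and the per-step results would in practice come from the Java-side and native-side analyses.

```python
def combine(java_flows, native_summaries):
    """Join Java-side flows with native-side summaries at the native boundary.

    java_flows: list of (api_method, native_method, arg_index, sanitized)
        tuples, one per flow from a public API method into a native argument.
    native_summaries: dict mapping (native_method, arg_index) to the set of
        risky effects the native-side analysis found for that argument.
    Returns (api_method, native_method, arg_index, effects, rating) tuples.
    """
    combined = []
    for api, native, index, sanitized in java_flows:
        effects = native_summaries.get((native, index), set())
        if not effects:
            rating = "benign"       # native side reports nothing risky
        elif sanitized:
            rating = "guarded"      # risky effect, but input is checked in Java
        else:
            rating = "exposed"      # risky effect reachable with raw input
        combined.append((api, native, index, effects, rating))
    return combined

# Made-up sample results of the two analysis steps.
java_flows = [
    ("Api.copy", "Impl.nativeCopy", 0, False),
    ("Api.checkedCopy", "Impl.nativeCopy", 0, True),
    ("Api.now", "Impl.nativeTime", 0, False),
]
native_summaries = {("Impl.nativeCopy", 0): {"unchecked_length"}}
ratings = [row[-1] for row in combine(java_flows, native_summaries)]
```

A real classification scheme would need finer distinctions, for example how effective a given sanitizer is or whether an attacker can control the parameter at all.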

Publications

  • Tan, Gang, and Jason Croft: An Empirical Security Study of the Native Code in the JDK. USENIX Security Symposium. 2008.
  • Bratus, Sergey, et al.: Exploit Programming: From Buffer Overflows to "Weird Machines" and Theory of Computation. USENIX ;login:, December 2011.
  • Drake, Joshua J.: Exploiting Memory Corruption Vulnerabilities in the Java Runtime. 2011