Are our clone detectors good enough? An empirical study of code effects by obfuscation
Abstract
Clone detection has received much attention in many fields such as malicious code detection, vulnerability hunting, and copyright infringement detection. However, cyber criminals may obfuscate code to impede violation detection. To date, few studies have investigated the robustness of clone detectors, especially in-fashion deep learning-based ones, against obfuscation. Meanwhile, most of these studies only measure the difference between one code snippet and its obfuscated version. In reality, attackers may modify the original code before obfuscating it. What we should evaluate, then, is the detection of obfuscated code derived from cloned code, not from the original code. For this, we conduct a comprehensive study evaluating 3 popular deep learning-based clone detectors and 6 commonly used traditional ones. Regarding the data, we collect 6512 clone pairs of five types from the dataset BigCloneBench and obfuscate one program of each pair via 64 strategies of state-of-the-art commercial obfuscators. We also collect 1424 non-clone pairs to evaluate false positives. In sum, a benchmark of 524,148 code pairs (either clone or not) is generated and passed to the clone detectors for evaluation. To automate the evaluation, we develop a uniform evaluation framework integrating the clone detectors and obfuscators. The results bring us interesting findings on how obfuscation affects the performance of clone detectors. In addition, manual code reviews uncover the root cause of the phenomenon and give suggestions to users from different perspectives.
Similar resources
Analyzing the Robustness of Clone Detection Tools Regarding Code Obfuscation
Research has shown that 7% to 23% of a typical source code system consists of cloned code. Some clones are introduced intentionally, but the majority are created unintentionally. To find these clones, several code clone detection tools have been developed. They are used in several fields, such as software plagiarism detection, malware detection, and code quality improvement. However, this process is ...
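Empirical Evaluation of Similar Defect Detection by Code Clone Search
Abstract: We empirically evaluated the detection of similar defects by code clone search, using the pre-fix source code fragment from a defect fix as the search key. Previous studies on open source software have confirmed the usefulness of finding similar defects through code clone search. This paper therefore targets commercially developed software, aiming to provide guidelines for applying similar defect detection by code clone search in commercial development settings. The subjects are three released source code bases developed in three different projects at Panasonic MSE, each with a recorded revision history of the defect fixes made during the testing phase. Using the pre-fix source code fragments recorded in the revision history as search keys, we ran code clone searches and detected similar defects. The results confirmed the effectiveness of the approach in the commercial development studied as well. Keywords: code clone analysis, similar ...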
Empirical Studies of Code Clone Genealogies
Two identical or similar code fragments form a clone pair. Previous studies have identified cloning as a risky practice. Therefore, a developer needs to be aware of any clone pairs so as to properly propagate any changes between clones. A clone pair experiences many changes during the creation and maintenance of software systems. A change can either maintain or remove the similarity between clo...
Are Standard Solutions Good Enough?
Remote collaboration on physical objects is a topic of recurring interest within the CSCW community. Up until now, research has primarily focused on stationary settings with specially designed technical support to address problems of reference due to non-mutual access to the object. In this workshop paper I present remote truck service as an example of work practices that require mobile remote ...
Are We Doing Enough for Our Field?
Correspondents Argentina: O. Civitaresse, La Plata; Australia: A. W. Thomas, Adelaide; Austria: H. Oberhummer, Vienna; Belgium: M. Huyse, Leuven; Brasil: M. Hussein, Sao Paulo; Bulgaria: D. Balabanski, Sofia; Canada: J.-M. Poutissou, TRIUMF; K. Sharma, Manitoba; J. Simpson, Guelph; China: W. Zhan, Lanzhou; Croatia: R. Caplar, Zagreb; Czech Republic: J. Kvasil, Prague; Slovak Republic: P. Povine...
Journal
Journal title: Cybersecurity
Year: 2023
ISSN: 2523-3246
DOI: https://doi.org/10.1186/s42400-023-00148-x