DeepDual-SD: Deep Dual Attribute-Aware Embedding for Binary Code Similarity Detection

Authors

Abstract

Binary code similarity detection (BCSD) is the task of detecting similarities between binary functions for which the corresponding source code is not available. It has been widely utilized to facilitate various kinds of crucial security analysis in software engineering. Because of the complexity of the program compilation process, identifying binary code similarity presents tough challenges. The most sensible binary similarity detector relies on a robust vector representation of binary code. However, few BCSD approaches form suitable vector representations for analyzing similarities between binaries, which may diverge not only in semantics but also in structures. Existing solutions that depend only on hands-on feature engineering to form feature vectors fail to take into consideration the relationships between instructions. To resolve these problems, we propose a novel and unified approach called DeepDual-SD that aims to combine the dual attributes (the semantic attribute and the structural attribute). More specifically, DeepDual-SD consists of two branches: one text-based representation driven by semantic attribute learning to exploit instruction semantics, and another graph-based representation driven by structural attribute learning to investigate structural differences. Meanwhile, deep embedding (DE) technology is utilized to map this attribute information into a low-dimensional vector representation. In addition, to get the dual attributes together, a fusion mechanism based on a gate architecture is designed to pay proper attention to the attribute-aware embeddings. Experimental verifications are conducted on the Openssl and Debian datasets for several tasks, including cross-compiler, cross-architecture and cross-version scenarios. The results demonstrate that our method outperforms the state-of-the-art BCSD methods in different scenarios in terms of accuracy.
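The abstract does not give the gate equations, so the following is only a minimal sketch of what a gate-based fusion of a semantic embedding and a structural embedding could look like. It assumes a per-dimension sigmoid gate computed from the concatenated embeddings, followed by a cosine comparison of the fused vectors. The function names (gated_fusion, cosine_similarity), the weights and the bias are hypothetical illustrations, not the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(semantic, structural, weights, bias):
    """Blend two equal-length embeddings with a per-dimension gate.

    Assumed (hypothetical) form, not taken from the paper:
      gate[i]  = sigmoid(weights[i] . concat(semantic, structural) + bias[i])
      fused[i] = gate[i] * semantic[i] + (1 - gate[i]) * structural[i]
    """
    joint = semantic + structural  # list concatenation of both embeddings
    fused = []
    for i, (sem, struct) in enumerate(zip(semantic, structural)):
        pre_activation = sum(w * x for w, x in zip(weights[i], joint)) + bias[i]
        gate = sigmoid(pre_activation)
        fused.append(gate * sem + (1.0 - gate) * struct)
    return fused


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy example: zero weights and bias give gate = 0.5 in every dimension,
# so the fused embedding is the plain average of the two inputs.
zero_weights = [[0.0] * 4 for _ in range(2)]
fused_a = gated_fusion([1.0, 0.0], [0.0, 1.0], zero_weights, [0.0, 0.0])
fused_b = gated_fusion([0.8, 0.2], [0.1, 0.9], zero_weights, [0.0, 0.0])
score = cosine_similarity(fused_a, fused_b)
```

In a trained model the weights and bias would be learned, so the gate could lean toward the semantic or the structural embedding per dimension instead of averaging them as in this toy case.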

Similar resources

Local Similarity-Aware Deep Feature Embedding

Existing deep embedding methods in vision tasks are capable of learning a compact Euclidean space from images, where Euclidean distances correspond to a similarity metric. To make learning more effective and efficient, hard sample mining is usually employed, with samples identified through computing the Euclidean feature distance. However, the global Euclidean distance cannot faithfully charact...

Quadra-Embedding: Binary Code Embedding with Low Quantization Error

Thanks to compact data representations and fast similarity computation, many binary code embedding techniques have been proposed for large-scale similarity search used in many computer vision applications including image retrieval. Most prior techniques have centered around optimizing a set of projections for accurate embedding. In spite of active research efforts, existing solutions suffer fro...

Collaborative Deep Embedding via Dual Networks

Despite the long history of research on recommender systems, current approaches still face a number of challenges in practice, e.g. the difficulties in handling new items, the high diversity of user interests, and the noisiness and sparsity of observations. Many of such difficulties stem from the lack of expressive power to capture the complex relations between items and users. This paper prese...

Binary code-based Human Detection

HOG features are effective for object detection, but their focus on local regions makes them high-dimensional features. To reduce the memory required for the HOG features, this paper proposes a new feature, R-HOG, which creates binary codes from the HOG features extracted from two local regions. This approach enables the created binary codes to reflect the relationships between local regions. Co...

Towards Optimal Binary Code Learning via Ordinal Embedding

Binary code learning, also known as hashing, has recently become popular due to its high efficiency in large-scale similarity search and recognition. It typically maps high-dimensional data points to binary codes, where data similarity can be efficiently computed via rapid Hamming distance. Most existing unsupervised hashing schemes pursue binary codes by reducing the quantization error from an origina...

Journal

Journal title: International Journal of Computational Intelligence Systems

Year: 2023

ISSN: 1875-6883, 1875-6891

DOI: https://doi.org/10.1007/s44196-023-00206-9