DeepDual-SD: Deep Dual Attribute-Aware Embedding for Binary Code Similarity Detection
Abstract
Binary code similarity detection (BCSD) is the task of detecting similar binary functions when the corresponding source code is not available. It has been widely used to facilitate various kinds of crucial security analysis in software engineering. Because of the complexity of the program compilation process, identifying binary code similarity presents tough challenges. The most sensible binary similarity detector relies on a robust vector representation of binary code. However, few BCSD approaches generate a suitable form of vector representation for analyzing the similarities between binaries, which may diverge not only in semantics but also in structure. Moreover, existing solutions that depend on hands-on feature engineering to form feature vectors fail to take into consideration the relationships between instructions. To resolve these problems, we propose a novel and unified BCSD approach called DeepDual-SD that aims to combine dual attributes (the semantic attribute and the structural attribute). More specifically, DeepDual-SD consists of two branches: one text-based branch, driven by semantic attribute learning, to exploit instruction semantics, and another graph-based branch, driven by structural attribute learning, to investigate structural differences. Meanwhile, deep embedding (DE) technology is used to map this information into a low-dimensional vector representation. In addition, to bring the dual attributes together, a fusion mechanism based on a gate architecture is designed to pay proper attention to the attribute-aware embeddings. Experimental verifications are conducted on the OpenSSL and Debian datasets for several tasks, including cross-compiler, cross-architecture and cross-version scenarios. The results demonstrate that our method outperforms state-of-the-art BCSD methods in different scenarios in terms of accuracy.
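The abstract describes a gate-based mechanism that fuses a semantic embedding and a structural embedding into one function representation. It does not give the gate's equations, so the following is only a minimal sketch under assumptions: an element-wise sigmoid gate computed from the concatenated embeddings, and cosine similarity as the comparison score. The names `gated_fusion`, `cosine_similarity`, `weights` and `bias` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(semantic, structural, weights, bias):
    """Fuse two equal-length embeddings with an element-wise gate.

    ASSUMPTION (not from the paper): for each dimension i,
        gate_i  = sigmoid(weights[i] . [semantic; structural] + bias[i])
        fused_i = gate_i * semantic_i + (1 - gate_i) * structural_i
    In a real model, weights and bias would be learned parameters.
    """
    if len(semantic) != len(structural):
        raise ValueError("embeddings must have the same dimension")
    concat = list(semantic) + list(structural)
    fused = []
    for i in range(len(semantic)):
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        fused.append(g * semantic[i] + (1.0 - g) * structural[i])
    return fused


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 2-dimensional embeddings for two binary functions (made-up values).
    zero_weights = [[0.0] * 4 for _ in range(2)]
    zero_bias = [0.0, 0.0]
    f1 = gated_fusion([0.9, 0.1], [0.2, 0.8], zero_weights, zero_bias)
    f2 = gated_fusion([0.8, 0.2], [0.3, 0.7], zero_weights, zero_bias)
    print(cosine_similarity(f1, f2))
```

With all-zero weights and bias the gate is 0.5 in every dimension, so the fused vector is simply the average of the two inputs; a trained gate would instead shift weight toward whichever attribute is more informative for a given function.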
Similar resources
Local Similarity-Aware Deep Feature Embedding
Existing deep embedding methods in vision tasks are capable of learning a compact Euclidean space from images, where Euclidean distances correspond to a similarity metric. To make learning more effective and efficient, hard sample mining is usually employed, with samples identified through computing the Euclidean feature distance. However, the global Euclidean distance cannot faithfully charact...
Quadra-Embedding: Binary Code Embedding with Low Quantization Error
Thanks to compact data representations and fast similarity computation, many binary code embedding techniques have been proposed for large-scale similarity search used in many computer vision applications including image retrieval. Most prior techniques have centered around optimizing a set of projections for accurate embedding. In spite of active research efforts, existing solutions suffer fro...
Collaborative Deep Embedding via Dual Networks
Despite the long history of research on recommender systems, current approaches still face a number of challenges in practice, e.g. the difficulties in handling new items, the high diversity of user interests, and the noisiness and sparsity of observations. Many of such difficulties stem from the lack of expressive power to capture the complex relations between items and users. This paper prese...
Binary code-based Human Detection
HOG features are effective for object detection, but their focus on local regions makes them high-dimensional features. To reduce the memory required for the HOG features, this paper proposes a new feature, R-HOG, which creates binary codes from the HOG features extracted from two local regions. This approach enables the created binary codes to reflect the relationships between local regions. Co...
Towards Optimal Binary Code Learning via Ordinal Embedding
Binary code learning, a.k.a., hashing, has been recently popular due to its high efficiency in large-scale similarity search and recognition. It typically maps high-dimensional data points to binary codes, where data similarity can be efficiently computed via rapid Hamming distance. Most existing unsupervised hashing schemes pursue binary codes by reducing the quantization error from an origina...
Journal
Journal title: International Journal of Computational Intelligence Systems
Year: 2023
ISSN: 1875-6883, 1875-6891
DOI: https://doi.org/10.1007/s44196-023-00206-9