Similar Articles
20 similar articles found (search time: 46 ms)
1.
Investigating seized devices within digital forensics is becoming increasingly difficult due to the growing amount of data. Hence, a common procedure uses automated file identification, which reduces the amount of data an investigator has to examine by hand. Besides identifying exact duplicates, which is mostly solved using cryptographic hash functions, it is also helpful to detect similar data by applying approximate matching. Let x denote the number of digests in a database; the lookup for a single similarity digest then has a complexity of O(x). In other words, the digest has to be compared against all digests in the database. In contrast, cryptographic hash values are stored within binary trees or hash tables, and hence the lookup complexity of a single digest is O(log2(x)) or O(1), respectively. In this paper we present and evaluate a concept to extend existing approximate matching algorithms which reduces the lookup complexity from O(x) to O(1). Instead of using multiple small Bloom filters (the common procedure), we demonstrate that a single, huge Bloom filter offers far better performance. Our evaluation shows that current approximate matching algorithms are too slow (e.g., over 21 min to compare 4457 digests of a common file corpus against each other), while the improved version solves this challenge within seconds. Studying the precision and recall rates shows that our approach works as reliably as the original implementations. We obtain this benefit at the cost of accuracy: the comparison is now a file-against-set comparison, and thus it is not possible to see which file in the database is matched.
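As a rough illustration of the single-filter idea (a minimal sketch, not the authors' implementation; the filter size, the number of hash functions, and the SHA-256-based index derivation are illustrative assumptions), all features of every database file go into one large Bloom filter, so a feature lookup touches only k bit positions regardless of how many digests are stored:

    import hashlib

    class SingleBloomFilter:
        """One large Bloom filter holding the features of all database files,
        so a single feature lookup costs O(1) instead of O(x)."""

        def __init__(self, size_bits=2**30, num_hashes=5):   # illustrative sizing
            self.m, self.k = size_bits, num_hashes
            self.bits = bytearray(self.m // 8)

        def _positions(self, feature: bytes):
            # derive k indices from fixed-width slices of one SHA-256 digest
            digest = hashlib.sha256(feature).digest()
            for i in range(self.k):
                yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m

        def add(self, feature: bytes):
            for pos in self._positions(feature):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def contains(self, feature: bytes) -> bool:
            return all(self.bits[pos // 8] >> (pos % 8) & 1
                       for pos in self._positions(feature))

A file-against-set query then amounts to checking what fraction of a file's features hit the filter, which is exactly why a match can no longer be attributed to an individual database file.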

2.
Automated input identification is a challenging but important task. Within computer forensics, it reduces the amount of data an investigator has to examine by hand. Besides identifying exact duplicates, which is mostly solved using cryptographic hash functions, it is also necessary to cope with similar inputs (e.g., different versions of a file), embedded objects (e.g., a JPG within a Word document), and fragments (e.g., network packets). In recent years, several similarity hashing algorithms have been published. However, due to the absence of a definition and a test framework, it is hardly possible to evaluate and compare these approaches and establish them in the community. The paper at hand provides an assessment methodology and a sample implementation called FRASH: a framework to test similarity hashing algorithms. First, we describe common use cases of a similarity hashing algorithm to motivate our two test classes, efficiency and sensitivity & robustness. Next, our open and freely available framework is briefly described. Finally, we apply FRASH to the well-known similarity hashing approaches ssdeep and sdhash to show their strengths and weaknesses.
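A sensitivity & robustness check of the kind such a framework performs can be sketched generically (the test fractions and the byte-flipping mutation are assumptions; hash_fn and compare_fn stand in for any similarity hashing implementation such as ssdeep or sdhash):

    import random

    def robustness_test(data: bytes, hash_fn, compare_fn,
                        fractions=(0.005, 0.01, 0.05)):
        """Mutate a growing fraction of bytes and record how the similarity
        score between original and mutated input degrades."""
        reference = hash_fn(data)
        scores = {}
        for frac in fractions:
            mutated = bytearray(data)
            for pos in random.sample(range(len(data)), int(len(data) * frac)):
                mutated[pos] = random.randrange(256)        # overwrite one byte
            scores[frac] = compare_fn(reference, hash_fn(bytes(mutated)))
        return scores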

3.
Communication apps can be an important source of evidence in a forensic investigation (e.g., in the investigation of a drug trafficking or terrorism case where the communication apps were used by the accused persons during the transactions or planning activities). This study presents the first evidence-based forensic taxonomy of Windows Phone communication apps, using an existing two-dimensional Android forensic taxonomy as a baseline. Specifically, 30 Windows Phone communication apps, including Instant Messaging (IM) and Voice over IP (VoIP) apps, are examined. Artifacts extracted using physical acquisition are analyzed, and seven digital evidence objects of forensic interest are identified, namely: Call Log, Chats, Contacts, Locations, Installed Applications, SMSs and User Accounts. Findings from this study would help to facilitate timely and effective forensic investigations involving Windows Phone communication apps.

4.
《Digital Investigation》2014,11(4):314-322
This research comparatively evaluates four competing clustering algorithms for thematically clustering digital forensic text string search output. It does so in a context that is more realistic, in terms of data size and heterogeneity, than has been studied in the past. In this study, we used physical-level text string search output consisting of over two million search hits found in nearly 50,000 allocated files and unallocated blocks. Holding the data set constant, we comparatively evaluated k-Means, Kohonen SOM, Latent Dirichlet Allocation (LDA) followed by k-Means, and LDA followed by SOM. This enables true cross-algorithm evaluation, whereas past studies evaluated single algorithms using unique, non-reproducible datasets. Our research shows that LDA + k-Means with a linear, centroid-based user navigation procedure produces optimal results, increasing information retrieval effectiveness from the baseline random-walk absolute precision rate of 0.04 to an average precision rate of 0.67. We also explored a variety of algorithms for user navigation of search hit results and found that the performance of k-Means clustering can be greatly improved with a non-linear, non-centroid-based cluster and document navigation procedure. This has potential implications for digital forensic tools and their use, particularly given the popularity and speed of k-Means clustering.
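As a rough sketch of the winning pipeline (not the study's exact configuration; the scikit-learn estimators and all parameter values below are assumptions), search-hit text can be projected onto LDA topic proportions and the topic vectors then grouped with k-Means:

    from sklearn.cluster import KMeans
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    def cluster_search_hits(hit_texts, n_topics=25, n_clusters=50):
        """Vectorize text string search hits, reduce them to topic
        proportions with LDA, then cluster the topic vectors with k-Means."""
        counts = CountVectorizer(max_features=5000,
                                 stop_words="english").fit_transform(hit_texts)
        topics = LatentDirichletAllocation(n_components=n_topics,
                                           random_state=0).fit_transform(counts)
        return KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=0).fit_predict(topics)

Each search hit then carries a cluster label that an examiner can navigate cluster by cluster instead of hit by hit.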

5.
Bytewise approximate matching is a relatively new area within digital forensics, but its importance is growing quickly as practitioners look for fast methods to screen and analyze the increasing amounts of data in forensic investigations. The essential idea is to complement the use of cryptographic hash functions, which detect data objects with bytewise identical representations, with the capability to find objects with bytewise similar representations. Unlike cryptographic hash functions, which have been studied and tested for a long time, approximate matching algorithms are still in their early development stages and their evaluation methodology is still evolving. Broadly, prior approaches have used either a human in the loop to manually evaluate the goodness of similarity matches on real-world data, or controlled (pseudo-random) data to perform automated evaluation. This work's contribution is to introduce automated approximate matching evaluation on real data by relating approximate matching results to the longest common substring (LCS). Specifically, we introduce a computationally efficient LCS approximation and use it to obtain ground truth on the t5 set. Using the results, we evaluate three existing approximate matching schemes relative to the LCS and analyze their performance.
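For reference, the exact longest common substring used as ground truth can be computed with standard dynamic programming; the sketch below is this exact baseline, not the computationally cheaper approximation introduced in the paper:

    def longest_common_substring(a: bytes, b: bytes) -> int:
        """Length of the longest byte run shared by a and b
        (two-row dynamic programming, O(len(a) * len(b)) time)."""
        prev = [0] * (len(b) + 1)
        best = 0
        for i in range(1, len(a) + 1):
            curr = [0] * (len(b) + 1)
            for j in range(1, len(b) + 1):
                if a[i - 1] == b[j - 1]:
                    curr[j] = prev[j - 1] + 1
                    best = max(best, curr[j])
            prev = curr
        return best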

6.
The rapid growth in the average size of digital forensic targets demands new automated means to quickly, accurately, and reliably correlate digital artifacts. Such tools need to offer more flexibility than routine known-file filtering based on cryptographic hashes. Currently, there are two tools for which NIST has produced reference hash sets: ssdeep and sdhash. The former provides a fixed-size fuzzy hash based on random polynomials, whereas the latter produces a variable-length similarity digest based on statistically identified features packed into Bloom filters. This study provides a baseline evaluation of the capabilities of these tools, both in a controlled environment and on real-world data. The results show that the similarity digest approach significantly outperforms the fuzzy hashing approach in terms of recall and precision in all tested scenarios, and demonstrates robust and scalable behavior.
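A pairwise comparison of the kind evaluated here might look as follows, assuming the python-ssdeep bindings are installed (sdhash is usually driven through its command-line tool instead):

    import ssdeep  # python-ssdeep bindings (assumed available)

    def ssdeep_similarity(path_a: str, path_b: str) -> int:
        """Return ssdeep's 0-100 match score for two files."""
        with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
            return ssdeep.compare(ssdeep.hash(fa.read()), ssdeep.hash(fb.read()))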

7.
《Digital Investigation》2014,11(4):349-362
This paper presents a unified social graph based text mining framework to identify digital evidence from chat log data. It considers both users' conversation and interaction data in group chats to discover overlapping user interests and their social ties. The proposed framework applies an n-gram technique in association with a self-customized hyperlink-induced topic search (HITS) algorithm to identify key-terms representing users' interests, key-users, and key-sessions. We propose a social graph generation technique to model users' interactions, where ties (edges) between a pair of users (nodes) are established only if they participate in at least one common group-chat session, and weights are assigned to the ties based on the degree of overlap in users' interests and interactions. Finally, we present three possible cyber-crime investigation scenarios and a user-group identification method for each of them. We present our experimental results on a data set comprising 1100 chat logs of 11,143 chat sessions spanning 29 months from January 2010 to May 2012. The results suggest that the proposed framework is able to identify key-terms, key-users, key-sessions, and user-groups from chat log data, all of which are crucial for cyber-crime investigation. Although the chat logs in our experiments were recovered from a single computer, in a real scenario logs would likely be collected from multiple computers and could be combined to generate a richer social graph. Our experiments show, however, that the objectives can be achieved even with logs recovered from a single computer by using group-chat data to draw relationships between every pair of users.
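The graph construction and a key-user ranking step can be sketched with networkx; the paper's HITS variant and weighting scheme are customized, so the simple session-count weights below are only a simplifying assumption:

    from itertools import combinations
    import networkx as nx

    def build_social_graph(sessions):
        """sessions: iterable of sets of user ids sharing one group-chat.
        An edge is created only for users with at least one common session;
        its weight counts the shared sessions."""
        g = nx.Graph()
        for participants in sessions:
            for u, v in combinations(sorted(participants), 2):
                weight = g[u][v]["weight"] + 1 if g.has_edge(u, v) else 1
                g.add_edge(u, v, weight=weight)
        return g

    g = build_social_graph([{"alice", "bob", "carol"}, {"alice", "bob"}, {"bob", "dave"}])
    hubs, authorities = nx.hits(g, max_iter=500)      # rank candidate key-users
    print(sorted(authorities, key=authorities.get, reverse=True)[:3])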

8.
There is an abundance of measures available to standard digital device users that provide the opportunity to act in an anti-forensic manner and conceal any potential digital evidence denoting a criminal act. Whilst there is a lack of empirical evidence evaluating the scale of this threat to digital forensic investigations, leaving the true extent of engagement with such tools unknown, the field should arguably take proactive steps to examine and record the capabilities of these measures. Whilst forensic science has long accepted the concept of toolmark analysis as part of criminal investigations, ‘digital tool marks’ (DTMs) are a notion rarely acknowledged and considered in digital investigations. DTMs are the traces left behind by a tool or process on a suspect system, which can help to determine what malicious behaviour has occurred on a device. This article discusses and champions the need for DTM research in digital forensics, highlighting the benefits of doing so.

9.
Memory forensics has gradually moved into the focus of researchers and practitioners alike in recent years. As increasing effort is devoted to extracting valuable information from snapshots of a computer's RAM, the need to properly assess the respective solutions grows as well. In this paper, we present an evaluation platform for forensic memory acquisition software. The platform is capable of measuring distinct factors that determine the quality of a generated memory image, specifically its correctness, atomicity, and integrity. Tests are performed for three popular open source applications, win32dd, WinPMEM, and mdd, as well as for different memory sizes.

10.
11.
《Science & justice》2021,61(5):627-634
The importance of ensuring that the results of any digital forensic (DF) examination are effectively communicated cannot be overstated. In most cases, this communication will be done via a written report, yet there is arguably limited best-practice guidance available that is specific to this field with regard to report construction. Poor reporting practices in DF are likely to undermine the reliability of evidence provided across the field. There is a need for formalised guidance regarding the requirements for effective DF report construction; this should not be a task left solely to each individual practitioner to determine without instruction. For this, the field of DF should look to the wider forensic community and the existing work in this area for support. In line with many other ‘traditional’ forensic science types, a DF practitioner can be commissioned to report in one of three ways: ‘technical’, ‘investigative’ or ‘evaluative’, where each reporting type has a specific purpose and interpretative context, determined by the examination workflow undertaken by the practitioner following client instruction. This work draws upon guidance set out in fundamental forensic science reporting literature to describe each reporting type in turn, outlining its scope, content and construction requirements in an attempt to provide support for the DF field.

12.
Abstract

Various terms have been used to describe the intersection between computing technology and violations of the law, including computer crime, electronic crime, and cybercrime. While there remains little agreement on terminology, most experts agree that the use of electronic devices to commit crime has increased dramatically and is now commonplace. It is the role of the digital investigator to bring cybercriminals to justice. Cybercrime, however, differs from traditional crime and presents a variety of unique challenges, including the variety of electronic devices available, the amount of data produced by these devices, the absence of standard practices and guidelines for analyzing that data, the lack of qualified personnel to perform investigations, and the lack of resources to provide ongoing training. This paper examines these challenges.

13.
Performing a digital forensic investigation (DFI) requires a standardized and formalized process. Currently, neither an international standard nor a global, harmonized DFI process (DFIP) exists. The authors studied existing state-of-the-art DFIP models and concluded that there are significant disparities pertaining to the number of processes, the scope, the hierarchical levels, and the concepts applied. This paper proposes a comprehensive model that harmonizes the existing models. An effort was made to incorporate all types of processes proposed by the existing models, including those aimed at achieving digital forensic readiness. The authors introduce a novel class of processes called concurrent processes. This contribution should, together with the rest of the model, enable more efficient and effective DFIs, while ensuring the admissibility of digital evidence. Ultimately, the proposed model is intended to be used for different types of DFI and should lead to standardization.

14.
Large-scale digital forensic investigations present at least two fundamental challenges. The first is accommodating the computational needs of a large amount of data to be processed. The second is extracting useful information from the raw data in an automated fashion. Both of these problems can result in long processing times that seriously hamper an investigation. In this paper, we discuss a new approach to one of the basic operations that is invariably applied to raw data: hashing. The essential idea is to produce an efficient and scalable hashing scheme that can supplement traditional cryptographic hashing during the initial pass over the raw evidence. The goal is to retain enough information to allow binary data to be queried for similarity at various levels of granularity without any further pre-processing or indexing. The specific solution we propose, called a multi-resolution similarity hash (or MRS hash), is a generalization of recent work in the area. Its main advantages are robust performance (raw speed comparable to a high-grade block-level cryptographic hash), scalability (the ability to compare targets that vary in size by orders of magnitude), and space efficiency (typically below 0.5% of the size of the target).
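The general idea of hashing the same target at several resolutions can be sketched as follows; the block sizes, the digest truncation, and the set-overlap comparison are illustrative choices and not the MRS hash construction itself:

    import hashlib

    def multi_resolution_digest(data: bytes,
                                block_sizes=(1024, 16 * 1024, 256 * 1024)):
        """Hash the input at several block granularities: coarse levels keep
        huge targets comparable, fine levels localize the matching regions."""
        digest = {}
        for size in block_sizes:
            blocks = (data[off:off + size] for off in range(0, len(data), size))
            # keep only 8 bytes per block so the digest stays small
            digest[size] = [hashlib.sha256(b).digest()[:8] for b in blocks]
        return digest

    def resemblance(d1, d2, size):
        """Jaccard overlap of the block hashes at one resolution level."""
        s1, s2 = set(d1[size]), set(d2[size])
        return len(s1 & s2) / max(len(s1 | s2), 1)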

15.
Due to the democratisation of new technologies, computer forensics investigators have to deal with volumes of data that are becoming increasingly large and heterogeneous. Indeed, on a single machine, hundreds of events occur per minute, produced and logged by the operating system and various software. The identification of evidence, and more generally the reconstruction of past events, is therefore a tedious and time-consuming task for investigators. Our work aims at automatically reconstructing and analysing the events related to a digital incident, while respecting legal requirements. To tackle the three main problems (volume, heterogeneity and legal requirements), we identify seven criteria that an efficient reconstruction tool must meet. This paper introduces an approach based on a three-layered ontology, called ORD2I, to represent digital events. ORD2I is associated with a set of operators to analyse the resulting timeline and to ensure the reproducibility of the investigation.
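Representing reconstructed events as ontology instances can be sketched with rdflib; the namespace and property names below are hypothetical stand-ins rather than the actual ORD2I vocabulary:

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF, XSD

    EX = Namespace("http://example.org/ord2i#")          # hypothetical namespace

    g = Graph()
    event = URIRef("http://example.org/case42/event/1")
    g.add((event, RDF.type, EX.Event))
    g.add((event, EX.timestamp, Literal("2015-03-01T10:22:03",
                                        datatype=XSD.dateTime)))
    g.add((event, EX.source, Literal("apache access.log")))
    g.add((event, EX.action, Literal("file downloaded")))

    # timeline view: list events in chronological order
    for subject, ts in sorted(g.subject_objects(EX.timestamp), key=lambda p: str(p[1])):
        print(ts, subject)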

16.
《Digital Investigation》2014,11(3):187-200
A recent increase in the prevalence of embedded systems has led them to become a primary target of digital forensic investigations. Embedded systems with DVR (Digital Video Recorder) capabilities are able to generate multimedia (video/audio) data and can act as vital pieces of evidence in the field of digital forensics. To counter anti-forensics, it is necessary to derive systematic forensic techniques that can be used on data fragments in unused (unallocated) areas of files or images. Specifically, the techniques should extract meaningful information from various types of data fragments, such as non-sequential fragmentation and missing fragments overwritten by other data. This paper proposes a new digital forensic system for use on video data fragments related to DVRs. We demonstrate in detail special techniques for the classification, reassembly, and extraction of video data fragments, and introduce an integrated framework for data fragment forensics based on the techniques described in this paper.
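A first classification pass over unallocated space typically scans for known frame signatures; the sketch below looks for an MPEG program-stream pack header as a stand-in for a vendor-specific DVR signature (real DVR formats are proprietary and vary by manufacturer):

    FRAME_MAGIC = b"\x00\x00\x01\xba"     # MPEG-PS pack header, used here as a stand-in

    def find_fragment_offsets(image_path, chunk_size=1 << 20):
        """Scan a raw image in chunks and return offsets of candidate frame starts."""
        overlap = len(FRAME_MAGIC) - 1
        offsets, tail, tail_start = [], b"", 0
        with open(image_path, "rb") as img:
            while True:
                chunk = img.read(chunk_size)
                if not chunk:
                    break
                data = tail + chunk
                pos = data.find(FRAME_MAGIC)
                while pos != -1:
                    offsets.append(tail_start + pos)
                    pos = data.find(FRAME_MAGIC, pos + 1)
                tail = data[-overlap:]
                tail_start += len(data) - overlap
        return offsets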

17.
18.
Contemporary criminal investigation assisted by computing technology imposes challenges to the right to a fair trial and the scientific validity of digital evidence. This paper identifies three categories of unaddressed threats to fairness and the presumption of innocence during investigations: (i) the inappropriate and inconsistent use of technology; (ii) old procedural guarantees that are not adapted to contemporary digital evidence processes and services; and (iii) the lack of reliability testing in digital forensics practice. Further, the solutions that have been suggested to overcome these issues are critically reviewed to identify their shortcomings. Ultimately, the paper argues for the need for legislative intervention and the enforcement of standards and validation procedures for digital evidence in order to protect innocent suspects and all parties in criminal proceedings from the negative consequences of technology-assisted investigations.

19.
Since its inception over a decade ago, the field of digital forensics has faced numerous challenges. Although various researchers and digital forensic practitioners have studied and analysed known digital forensic challenges, as of 2013 there still exists a need for a formal classification of these challenges. This article therefore reviews existing research literature and highlights the various challenges that digital forensics has faced over the last 10 years. In conducting this research study, it was difficult for the authors to review all the existing research literature in the digital forensic domain; hence, sampling and randomization techniques were employed to facilitate the review of the gathered literature. A taxonomy of the various challenges is subsequently proposed based on our review of the literature. The taxonomy classifies the large number of digital forensic challenges into four well-defined and easily understood categories. The proposed taxonomy can be useful, for example, in future development of automated digital forensic tools, by explicitly describing processes and procedures that focus on addressing the specific challenges identified in this paper. It should be noted, however, that the purpose of this paper is not to propose solutions to the individual challenges that digital forensics faces, but to serve as a survey of the state of the art of the research area.

20.
《Science & justice》2022,62(1):86-93
The prominence of technology usage in society has inevitably led to increasing numbers of digital devices being seized, and digital evidence now often features in criminal investigations. Such demand has led to well-documented backlogs placing pressure on digital forensic labs, and in an effort to combat this issue, the ‘at-scene triage’ of devices has been touted as a solution. Yet such triage approaches are not straightforward to implement, with multiple technical and procedural issues, including determining when it is actually appropriate to triage the contents of a device at-scene. This work focuses on this point due to the complexities associated with it, and to support first responders, a nine-stage triage decision model is offered, designed to promote consistent and transparent practice when determining whether a device should be triaged.
