Similar Articles
 20 similar articles found (search time: 46 ms)
1.
This study was performed to define standards for determining the sex of prehispanic individuals from the Canary Islands from their skeletal remains. Osteometric data on the right tibia were obtained from 59 complete skeletons from Gran Canaria, housed in the Museo Canario (Las Palmas): 45 males and 14 females (the totality of complete prehispanic skeletons known from Gran Canaria). The parameters measured were: tibial length, proximal and distal epiphyseal breadth, transverse and anteroposterior diameter, perimeter at the nutrient foramen level, and minimum shaft perimeter. These parameters were subjected to different SPSS discriminant function analyses, combining all of them, only the proximal or distal ones, excluding tibial length, etc., in order to obtain functions usable even when only bone fragments are available. Transverse diameter, proximal epiphyseal breadth and minimum shaft perimeter showed the highest discriminant power. The functions obtained showed high average accuracies, ranging from 94.9% to 98.3%, with female accuracies of 100%. The functions were further applied to a test prehispanic population (ten males and ten females) from El Hierro. Overall accuracies when applied to this population ranged from 65% to 94.7%, with female accuracies ranging from 80% to 100%.
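The discriminant approach above can be illustrated with a minimal univariate sketch (entirely hypothetical measurements, not the study's data or functions): a sectioning point midway between the group means separates the two sexes.

```python
# Illustrative sectioning-point discriminant (hypothetical measurements in mm,
# not the study's data): classify by the midpoint of the two group means.
male_widths = [78.0, 80.5, 82.0, 79.5]      # e.g. proximal epiphyseal breadth
female_widths = [68.0, 70.5, 69.0, 71.5]

def sectioning_point(group_a, group_b):
    """Midpoint between the two group means."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return (mean_a + mean_b) / 2

cut = sectioning_point(male_widths, female_widths)

def classify(width):
    """Assign sex by which side of the sectioning point a measurement falls."""
    return "male" if width > cut else "female"

print(cut, classify(75.0))  # prints "74.875 male"
```

A real discriminant function would weight several such measurements; this shows only the one-variable case.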

2.
Digital Investigation, 2014, 11(3): 224-233
The allocation algorithm of the Linux FAT32 file system driver positions files on disk in such a way that their relative positions reveal information about the order in which the files were created. This provides an opportunity to enrich information from (carved) file fragments with time information, even when such fragments lack the file system metadata in which time-related information is usually found. Through source code analysis and experiments, the behaviour of the Linux FAT allocator is examined. How an understanding of this allocator can be applied in practice is demonstrated with a case study involving a TomTom GPS car navigation device, in which time information played a crucial role. Large amounts of location records could be carved from this device's flash storage, yielding insight into the locations the device had visited, yet the carved records themselves offered no information on when the device had been at those locations. Still, bounds on the records' time of creation could be inferred from file system timestamps related to neighbouring on-disk positions. Finally, we perform experiments which contrast the Linux behaviour with that of Windows 7. We show that the latter differs subtly, breaking the strong relation between creation order and position.
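A minimal sketch of the inference described above, with hypothetical cluster numbers and timestamps: under a first-available allocation policy, a carved record's creation time can be bounded by the timestamps of files at neighbouring on-disk positions.

```python
from bisect import bisect_left

# Hypothetical data: (cluster_number, unix_time) pairs for files whose FAT
# directory entries survived, sorted by on-disk cluster position. Under the
# Linux FAT32 allocator's first-available policy, a lower cluster number
# generally implies earlier creation.
known = [(100, 1000), (250, 1600), (400, 2200)]

def time_bounds(cluster, known):
    """Bound the creation time of a carved record at `cluster` using the
    timestamps of its on-disk neighbours; None means unbounded on that side."""
    clusters = [c for c, _ in known]
    i = bisect_left(clusters, cluster)
    lower = known[i - 1][1] if i > 0 else None
    upper = known[i][1] if i < len(known) else None
    return lower, upper

print(time_bounds(300, known))  # record lies between clusters 250 and 400
```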

3.
This study tests whether postcranial sex estimation methods generated from Hispanic, mainly Mexican, samples can be successfully applied to other increasingly common migrant populations from Central America. We use a sample of postcranial data from a modern (1980s) Guatemalan Maya sample (n = 219). Results indicate a decrease in classification accuracies for previously established univariate methods when applied to the Guatemalan study sample, specifically for males, whose accuracies ranged from 30% to 84%. This bias toward inaccuracies for Guatemalan males is associated with the smaller skeletal sizes of the Guatemalan sample compared to the samples used in the tested sex estimation methods. In contrast, the tested multivariate discriminant function classification yielded less sex bias and improved classification accuracies, ranging from 82% to 89%. Our results highlight which of the tested univariate and multivariate methods reach acceptable levels of accuracy for sex estimation in cases where the region of origin may include Guatemala.

4.
The classification of file fragments is an important problem in digital forensics. The literature does not include comprehensive work on applying machine learning techniques to this problem. In this work, we explore the use of techniques from natural language processing to classify file fragments. We take a supervised learning approach based on support vector machines combined with the bag-of-words model, in which text documents are represented as unordered bags of words. This technique has been repeatedly shown to be effective and robust in classifying text documents (e.g., in distinguishing positive movie reviews from negative ones). In our approach, we represent file fragments as “bags of bytes” with feature vectors consisting of unigram and bigram counts as well as other statistical measurements (such as entropy). We made use of the publicly available Garfinkel data corpus to generate file fragments for training and testing. We ran a series of experiments and found that this approach is effective in this domain as well.
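The feature extraction described above can be sketched in a few lines (a simplified illustration, not the paper's implementation): unigram counts, bigram counts, and Shannon entropy for a fragment, which would then feed a classifier such as an SVM.

```python
import math
from collections import Counter

def fragment_features(data: bytes):
    """Build 'bag of bytes' features for a file fragment: byte unigram
    counts, byte bigram counts, and the Shannon entropy of the bytes."""
    unigrams = Counter(data)
    bigrams = Counter(zip(data, data[1:]))
    probs = [n / len(data) for n in unigrams.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    return unigrams, bigrams, entropy

uni, bi, h = fragment_features(b"aabab")
print(round(h, 4))  # prints "0.971"
```

High entropy suggests compressed or encrypted data; low entropy suggests text or sparse formats, which is why entropy complements the n-gram counts.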

5.
Over the past decade, a substantial effort has been put into developing methods to classify file fragments. Throughout, it has been an article of faith that data fragments, such as disk blocks, can be attributed to different file types. This work critically examines the underlying assumptions and compares them to empirically collected data. Specifically, we focus most of our effort on surveying several common compressed data formats, and show that the simplistic conceptual framework of prior work is at odds with the realities of actual data. We introduce a new tool, zsniff, which allows us to analyze deflate-encoded data, and we use it to perform an empirical survey of deflate-encoded text, images, and executables. The results offer a conceptually new type of classification capability that cannot be achieved by other means.

6.
File carving is the process of reassembling files from disk fragments based on the file content in the absence of file system metadata. By leveraging both file header and footer pairs, traditional file carving mainly focuses on document and image files such as PDF and JPEG. With the vast amount of malware code appearing in the wild daily, recovery of binary executable files becomes an important problem, especially for the case in which malware deletes itself after compromising a computer. However, unlike image files that usually have both a header and footer pair, executable files only have header information, which makes the carving much harder. In this paper, we present Bin-Carver, a first-of-its-kind system to automatically recover executable files with deleted or corrupted metadata. The key idea is to explore the road map information defined in executable file headers and the explicit control flow paths present in the binary code. Our experiment with thousands of binary code files has shown our Bin-Carver to be incredibly accurate, with an identification rate of 96.3% and recovery rate of 93.1% on average when handling file systems ranging from pristine to chaotic and highly fragmented.

7.
This paper presents a method for recovering data from files compressed with the DEFLATE algorithm where short segments in the middle of the file have been corrupted, yielding a mix of literal bytes, bytes aligned with literals across the corrupted segment, and co-indexed unknown bytes. An improved reconstruction algorithm based on long byte n-grams increases the proportion of reconstructed bytes by an average of 8.9% absolute across the 21 languages of the Europarl corpus compared to previously published work, and the proportion of unknown bytes correctly reconstructed by an average of 20.9% absolute, while running in one-twelfth the time on average. Combined with the new recovery method, corrupted segments of 128–4096 bytes in the compressed bit-stream result in reconstructed output which differs from the original file by an average of less than twice the number of bytes represented by the corrupted segment. Both new algorithms are implemented in the trainable open-source ZipRec 1.0 utility program.

8.
Forensic examiners are frequently confronted with content in languages they do not understand, and they could benefit from machine translation into their native language. But automated translation of file paths is a difficult problem because of the minimal context available and the frequent mixing of multiple languages within a path. This work developed a prototype implementation of a file-path translator that first identifies the language of each directory segment of a path, and then translates into English those segments that are neither English nor artificial words. Brown's LA-Strings utility for language identification was tried, but its performance was found to be inadequate on short strings, so it was supplemented with clues from dictionary lookup, Unicode character distributions for languages, country of origin, and language-related keywords. To provide better data for language inference, the words used in each directory over a large corpus were aggregated for analysis. The resulting directory-language probabilities were combined with those for each path segment, from dictionary lookup and character-type distributions, to infer the segment's most likely language. Tests were done on a corpus of 50.1 million file paths, looking for 35 different languages. They showed 90.4% accuracy in identifying the languages of directories and 93.7% accuracy in identifying the languages of directory/file segments of file paths, even after excluding 44.4% of the paths as obviously English or untranslatable. Two of seven proposed language clues were shown to impair directory-language identification. Experiments also compared three translation methods: the Systran translation tool, Google Translate, and word-for-word substitution using dictionaries. Google Translate usually performed best, but all three still made errors with European languages and a significant number of errors with Arabic and Chinese.
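A toy sketch of the per-segment idea (hypothetical word lists, not the paper's LA-Strings pipeline): split the path into segments and vote by dictionary lookup, falling back to "unknown" when no dictionary matches.

```python
# Hypothetical mini-dictionaries for illustration only; a real system would
# combine dictionary hits with character distributions and other clues.
DICTS = {
    "english": {"my", "documents", "holiday", "pictures"},
    "german": {"meine", "dokumente", "urlaub", "bilder"},
}

def guess_language(segment: str) -> str:
    """Guess a path segment's language by counting dictionary hits."""
    words = segment.lower().replace("_", " ").split()
    scores = {lang: sum(w in vocab for w in words)
              for lang, vocab in DICTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

path = "C:/meine dokumente/holiday pictures/img_0001.jpg"
print([guess_language(seg) for seg in path.split("/")[1:]])
# prints "['german', 'english', 'unknown']"
```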

9.
The National Software Reference Library (NSRL) is an essential data source for forensic investigators, providing in its Reference Data Set (RDS) a set of hash values of known software. However, the NSRL RDS has not previously been tested against a broad spectrum of real-world data. The current work did so using a corpus of 36 million files on 2337 drives from 21 countries. These experiments answered a number of important questions about the NSRL RDS, including what fraction of files of different types it recognizes. NSRL coverage by vendor/product was also tested; 51% of the vendor/product names in our corpus had no hash values at all in NSRL. It is shown that the coverage or “recall” of the NSRL can be improved with additions from our corpus, such as frequently occurring files and files whose paths were found previously in NSRL with a different hash value. This provided 937,570 new hash values, which should be uncontroversial additions to NSRL. Several additional tests investigated the accuracy of the NSRL data. Experiments testing the hash values found no evidence of errors. Tests of file sizes showed them to be consistent except in a few cases. On the other hand, the product types assigned by NSRL can be disputed, and NSRL failed to recognize any of a sample of virus-infected files. The file names provided by NSRL had numerous discrepancies with the file names found in the corpus, so the discrepancies were categorized; among other things, there were apparent spelling and punctuation errors. Some file names suggest that NSRL hash values were computed on deleted files, which is not a safe practice. The tests had the secondary benefit of helping identify occasional errors in the metadata obtained from drive imaging of deleted files in our corpus. This research has provided much data useful for improving NSRL and the forensic tools that depend upon it. It also provides a general methodology and software for testing hash sets against corpora.

10.
Sex determination is critical for developing the biological profile of unidentified skeletal remains. When more commonly used elements (os coxa, cranium) for sexing are not available, methods utilizing other skeletal elements are needed. This study aims to assess the degree of sexual dimorphism of the lumbar vertebrae and develop discriminant functions for sex determination from them, using a sample of South African blacks from the Raymond A. Dart Collection (47 males, 51 females). Eleven variables at each lumbar level were subjected to univariate and multivariate discriminant function analyses. Univariate equations produced classification rates ranging from 57.7% to 83.5%, with the highest accuracies associated with dimensions of the vertebral body. Multivariate stepwise analysis generated classification rates ranging from 75.9% to 88.7%. These results are comparable to other methods for sexing the skeleton and indicate that measures of the lumbar vertebrae can be used as an effective tool for sex determination.

11.
Several studies have shown that osteometric differences exist between different population groups; thus, discriminant function equations derived for the determination of sex from skeletal elements are population specific. In a previous study, the authors derived such equations from nine measurements of the talus of South African whites, with high average accuracies. The validity of some of those equations was tested on data collected from a South African black sample consisting of 120 tali, equally distributed by sex, derived from the Raymond A. Dart Collection of Human Skeletons. The average accuracies dropped significantly, necessitating the derivation of new equations for the South African black population; the average accuracies obtained ranged between 80% and 89%. The validity of the equations derived in the present study was tested using leave-one-out classification and two independent samples. The equations with the highest classification rates performed poorly when applied to Independent Sample 1, consisting of 10 white tali, but yielded acceptably high average accuracies on Independent Sample 2, consisting of 10 black tali, thereby supporting earlier observations on the population specificity of discriminant function equations.

12.
This paper explores the use of purpose-built functions and cryptographic hashes of small data blocks for identifying data in sectors, file fragments, and entire files. It introduces and defines the concept of a “distinct” disk sector—a sector that is unlikely to exist elsewhere except as a copy of the original. Techniques are presented for improved detection of JPEG, MPEG and compressed data; for rapidly classifying the forensic contents of a drive using random sampling; and for carving data based on sector hashes.
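The sector-hashing idea can be sketched as follows (a simplified illustration with a hypothetical target, not the paper's tooling): hash every 512-byte sector of a drive image and look for matches against hashes of known content; a hit on a distinct sector ties the drive to a known file.

```python
import hashlib

SECTOR = 512  # classic sector size; modern drives may use 4096

def sector_hashes(image: bytes):
    """Map each sector's SHA-256 hex digest to its sector index.
    (In this sketch, duplicate sectors keep only the last index seen.)"""
    return {hashlib.sha256(image[i:i + SECTOR]).hexdigest(): i // SECTOR
            for i in range(0, len(image), SECTOR)}

# Hypothetical target: the hash of a sector from a known file of interest.
target = hashlib.sha256(b"\x00" * SECTOR).hexdigest()

image = b"\x00" * SECTOR + b"\xff" * SECTOR   # two-sector toy image
hits = {h: n for h, n in sector_hashes(image).items() if h == target}
print(hits)  # sector 0 of the image matches the target
```

A real deployment would sample sectors randomly for triage and filter out common (non-distinct) sectors before claiming a match.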

13.
A review of results obtained from standard guilty and innocent treatment conditions in 14 mock crime studies of the control question polygraph technique revealed accuracies ranging from chance to 100% correct. The present study examined several factors that may have contributed to the observed variability in detection rates across studies: sampling error, differences in the populations from which subjects were drawn (Subjects), differences in the nature of the incentives provided to subjects for passing the polygraph test (Incentives), and differences in the methods for diagnosing truth or deception (Decision Policy). A meta-analysis revealed that approximately 24% of the variance in detection rates could be attributed to sampling error, and detection rates were correlated with types of Subjects (r = .61), Incentives (r = .73), and Decision Policies (r = .67). The highest diagnostic accuracies were obtained from nonstudent subject samples, when both guilty and innocent subjects were offered monetary incentives to convince the examiner of their innocence, and when conventional field methods were used for interpreting the physiological recordings and diagnosing truth and deception. Together, differences in Subjects, Incentives, and Decision Policies may account for as much as 65% of the observed variance in detection rates. The present findings highlight the importance of conducting mock crime experiments that closely approximate field conditions.

14.
This study presents a method for identifying small subsets of morphological attributes of the skeletal pelvis that have consistently high reliability in assigning the sex of unknown individuals. An inductive computer algorithm (ID3) was applied to a bootstrapped training set/test set design in which the model was developed from 70% of the sample and tested on the remaining 30%. Relative accuracy of sex classification was evaluated for seven subsets of 31 morphological features of the adult os coxae. Using 115 ossa coxarum selected from the Terry Collection, a selected suite of the three most consistently diagnostic attributes averaged 93.1% correct classification of individuals by sex over ten trials. Attribute suites developed collaboratively with three well-known skeletal experts averaged 87.8%, 91.3%, and 89.6% correct. The full set of 31 attributes averaged 90.0% accuracy. We demonstrate a small set of three criteria, selected and ordered by ID3, that is more accurate than other combinations, and suggest that ID3 is a useful approach for developing identification systems.

15.
Investigating seized devices in digital forensics is becoming more and more difficult due to the increasing amount of data. Hence, a common procedure uses automated file identification, which reduces the amount of data an investigator has to look at by hand. Besides identifying exact duplicates, which is mostly solved using cryptographic hash functions, it is also helpful to detect similar data by applying approximate matching. Let x denote the number of digests in a database; the lookup of a single similarity digest then has complexity O(x). In other words, the digest has to be compared against all digests in the database. In contrast, cryptographic hash values are stored in binary trees or hash tables, so the lookup complexity of a single digest is O(log2(x)) or O(1), respectively. In this paper we present and evaluate a concept to extend existing approximate matching algorithms that reduces the lookup complexity from O(x) to O(1). Instead of using multiple small Bloom filters (the common procedure), we demonstrate that a single, huge Bloom filter has far better performance. Our evaluation demonstrates that current approximate matching algorithms are too slow (e.g., over 21 min to compare 4457 digests of a common file corpus against each other) while the improved version solves this challenge within seconds. Studying the precision and recall rates shows that our approach works as reliably as the original implementations. This benefit comes at the cost of accuracy: the comparison is now a file-against-set comparison, and thus it is not possible to see which file in the database was matched.
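A minimal sketch of the single-large-Bloom-filter idea (illustrative parameters and feature items, not the paper's implementation): all features from the database go into one filter, so a membership test is O(1), but, as the abstract notes, a hit cannot say which database entry matched.

```python
import hashlib

class BloomFilter:
    """One large Bloom filter holding all similarity-digest features,
    giving O(1) file-against-set membership tests."""
    def __init__(self, bits=2**20, k=4):
        self.bits, self.k = bits, k
        self.array = bytearray(bits // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions by hashing the item with k different salts.
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.bits

    def add(self, item: bytes):
        for p in self._positions(item):
            self.array[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes):
        # All k bits set -> probably present; any bit clear -> definitely absent.
        return all(self.array[p // 8] >> (p % 8) & 1
                   for p in self._positions(item))

bf = BloomFilter()
bf.add(b"feature-1")
print(b"feature-1" in bf, b"feature-2" in bf)  # prints "True False"
```

Sizing the filter (bits per stored feature, number of hash functions k) controls the false-positive rate; the lookup cost stays constant regardless of database size.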

16.
A problem that arises in computer forensics is to determine the type of a file fragment. An extension to the file name indicating the type is stored in the disk directory, but when a file is deleted, the entry for the file in the directory may be overwritten. This problem is easily solved when the fragment includes the initial header, which contains explicit type-identifying information, but it is more difficult to determine the type of a fragment from the middle of a file. We investigate two algorithms for predicting the type of a fragment: one based on Fisher's linear discriminant and the other based on longest common subsequences of the fragment with various sets of test files. We test the ability of the algorithms to predict a variety of common file types. Algorithms of this kind may be useful in designing the next generation of file carvers – programs that reconstruct files when directory information is lost or deleted. These methods may also be useful in designing virus scanners, firewalls and search engines to find files that are similar to a given file.
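The longest-common-subsequence score mentioned above can be sketched as follows (a toy illustration with hypothetical test strings, not the paper's algorithm): a higher LCS length against test files of a candidate type suggests the fragment belongs to that type.

```python
def lcs_length(a: bytes, b: bytes) -> int:
    """Length of the longest common subsequence of two byte strings,
    usable as a similarity score between a fragment and a test file.
    Standard dynamic programming with a rolling row (O(len(a)*len(b)) time)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[-1]))
        prev = cur
    return prev[-1]

# Hypothetical: score a fragment against samples of two candidate types.
fragment = b"GIF89a..."
print(lcs_length(fragment, b"GIF87a"), lcs_length(fragment, b"%PDF-1.4"))
# prints "5 2" -- the fragment is far closer to the GIF sample
```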

17.
In situations where badly burnt fragments of bone are found, identification of their human or non-human origin may be impossible by gross morphology alone, and other techniques have to be employed. In order to determine whether histological methods were redundant and should be superseded by biomolecular analyses, small fragments of artificially burnt bone (human and non-human) were examined by quantitative and standard light microscopy, and the findings compared with newer biomolecular analyses based on identifying specific human albumin by ELISA and amplifying human mitochondrial DNA by PCR. For quantitative microscopy, reference data were first created using burnt bones from 15 humans and 20 common domestic and farm animals. Measured osteon and Haversian canal parameters were analysed using multivariate statistical methods. Highly significant differences were found between values for human and non-human bone, and a canonical discriminant function equation was derived, giving a predicted correct classification of 79%. For the main study, samples of cortical bone were taken from three fresh cadavers, six human skeletons and ten freshly slaughtered animals and burnt by exposure to temperatures ranging from 800 to 1200 degrees C; charred fragments of human cortical bone from two forensic cases were also tested. Quantitative microscopy and the canonical discriminant function gave the correct origin of every sample. Standard microscopy falsely assigned burnt bone from one human skeleton and one forensic case to a non-human source, but otherwise gave correct results. Human albumin was identified in five individuals, including one of the forensic cases, but mitochondrial DNA could not be amplified from any of the human bone. No false positive results were seen with either biomolecular method, and human albumin and mitochondrial DNA were correctly identified in all unburnt control specimens.
It was concluded that histological methods were not redundant and that quantitative microscopy provided an accurate and consistent means of determining the human or non-human origin of burnt bone and was more reliable than standard microscopy or the newer immunological and DNA techniques tested here.

18.
Craniometric studies of South Africans yield high accuracies of sex and ancestry classification, but most assess only inter-group variation of Black and White individuals, excluding the highly heterogeneous Colored group, which constitutes a significant proportion of the population. This study applied a geometric morphometric approach to the neurocrania of 774 Black, Colored, and White individuals to assess sex and ancestry estimation accuracy based on the detected morphological variation. Accuracies of 70% and 83% were achieved for sex and ancestry, respectively, with ancestry-related variation contributing the largest proportion of overall observed variation. Even when comparing the closely related Black and Colored groups, relatively high accuracies were obtained. It is thus recommended that a similar approach be used to develop a contemporary three-dimensional database, which can be used to objectively, reliably, and accurately classify unknown remains in the South African forensic context.

19.
Automated input identification is a very challenging but important task. Within computer forensics it reduces the amount of data an investigator has to look at by hand. Besides identifying exact duplicates, which is mostly solved using cryptographic hash functions, it is necessary to cope with similar inputs (e.g., different versions of a file), embedded objects (e.g., a JPG within a Word document), and fragments (e.g., network packets). Over recent years a number of similarity hashing algorithms have been published. However, in the absence of a definition and a test framework, it is hardly possible to evaluate and compare these approaches and establish them in the community. This paper aims at providing an assessment methodology and a sample implementation called FRASH: a framework to test algorithms of similarity hashing. First, we describe common use cases of a similarity hashing algorithm to motivate our two test classes: efficiency, and sensitivity & robustness. Next, our open and freely available framework is briefly described. Finally, we apply FRASH to the well-known similarity hashing approaches ssdeep and sdhash to show their strengths and weaknesses.

20.
This paper critically reviews and integrates multidisciplinary literature informing the conceptualization and determination of disability and return to work from a psychological injury perspective. It dissects and disentangles the ambiguities and complexities of theories and definitions of disability, impairment, and return to work, highlighting the conceptual quagmires that affect both research and clinical methodology in the field. The paper discusses the strengths and limitations of the main theoretical perspectives on disability and return to work (social, medical, and biopsychosocial) and the associated applied perspectives, including the legal/administrative, clinical, and research-oriented. It provides a special focus on the Americans with Disabilities Act and the American Medical Association's Guides to the Evaluation of Permanent Impairment as leading applied perspectives. The paper also highlights the features and methodological implications of the integrative framework of the International Classification of Functioning, Disability and Health (World Health Organization, Geneva, 2001). Finally, it draws implications for the field of psychological injury in a legal context.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号