Similar literature
1.
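English and Chinese differ greatly because of their respective social histories, cultural backgrounds, psychological characteristics, modes of thinking, and writing systems. By analyzing differences in expression between the two languages, this paper discusses the explicit and implicit states of morphology, syntax, discourse, and rhetorical devices in English and Chinese culture and writing.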

2.
Alternative intellectual property laws relating to the use of languages in IP organisations can result in an asymmetric distribution of costs of patenting between innovators. We present a framework for the characterisation and evaluation of the fairness of the language regime of the European Patent Office (EPO), which is currently based on three official languages: English, French and German. We estimate that the costs of access to patenting procedures borne by English-, French- or German-speaking applicants are at least 30% lower than those borne by European applicants whose first language is not one of the current official languages of the EPO. In order to correct language-related cost asymmetries, we explore two possible alternative language regimes. In both cases, we introduce a centralised system of financial compensation that covers translation costs borne by European applicants whose first language is not one of the official languages of the EPO. In the first alternative, financial compensation is covered by savings on current translation costs at the granting stage. In the second alternative, the number of official languages is increased to five and financial compensation is funded by an increase in the filing fee and by partial savings on translation costs at the granting stage. We show that both alternatives could substantially reduce language-related cost asymmetries among European applicants. In addition, we show that the five-language alternative would reduce the global costs of access to patenting procedures borne by all European applicants. Thus, more multilingualism can be cheaper than less multilingualism, provided that “implicit costs” are taken into account. We conclude by discussing the conditions under which the proposed alternative language regimes could have a positive impact on the effectiveness and the cost-effectiveness of the EPO language regime.

3.
This paper presents a method for recovering data from files compressed with the DEFLATE algorithm where short segments in the middle of the file have been corrupted, yielding a mix of literal bytes, bytes aligned with literals across the corrupted segment, and co-indexed unknown bytes. An improved reconstruction algorithm based on long byte n-grams increases the proportion of reconstructed bytes by an average of 8.9% absolute across the 21 languages of the Europarl corpus compared to previously published work, and the proportion of unknown bytes correctly reconstructed by an average of 20.9% absolute, while running in one-twelfth the time on average. Combined with the new recovery method, corrupted segments of 128–4096 bytes in the compressed bit-stream result in reconstructed output which differs from the original file by an average of less than twice the number of bytes represented by the corrupted segment. Both new algorithms are implemented in the trainable open-source ZipRec 1.0 utility program.

4.
The National Software Reference Library (NSRL) is an essential data source for forensic investigators, providing in its Reference Data Set (RDS) a set of hash values of known software. However, the NSRL RDS has not previously been tested against a broad spectrum of real-world data. The current work did this using a corpus of 36 million files on 2337 drives from 21 countries. These experiments answered a number of important questions about the NSRL RDS, including what fraction of files of different types it recognizes. NSRL coverage by vendor/product was also tested, finding that 51% of the vendor/product names in our corpus had no hash values at all in NSRL. It is shown that coverage or “recall” of the NSRL can be improved with additions from our corpus such as frequently occurring files and files whose paths were found previously in NSRL with a different hash value. This provided 937,570 new hash values which should be uncontroversial additions to NSRL. Several additional tests investigated the accuracy of the NSRL data. Experiments testing the hash values saw no evidence of errors. Tests of file sizes showed them to be consistent except for a few cases. On the other hand, the product types assigned by NSRL can be disputed, and it failed to recognize any of a sample of virus-infected files. The file names provided by NSRL had numerous discrepancies with the file names found in the corpus, so the discrepancies were categorized; among other things, there were apparent spelling and punctuation errors. Some file names suggest that NSRL hash values were computed on deleted files, which is not a safe practice. The tests had the secondary benefit of helping identify occasional errors in the metadata obtained from drive imaging on deleted files in our corpus. This research has provided much data useful in improving NSRL and the forensic tools that depend upon it. It also provides a general methodology and software for testing hash sets against corpora.

5.
The classification of file fragments is an important problem in digital forensics. The literature does not include comprehensive work on applying machine learning techniques to this problem. In this work, we explore the use of techniques from natural language processing to classify file fragments. We take a supervised learning approach, based on the use of support vector machines combined with the bag-of-words model, where text documents are represented as unordered bags of words. This technique has been repeatedly shown to be effective and robust in classifying text documents (e.g., in distinguishing positive movie reviews from negative ones). In our approach, we represent file fragments as “bags of bytes” with feature vectors consisting of unigram and bigram counts, as well as other statistical measurements (including entropy and others). We made use of the publicly available Garfinkel data corpus to generate file fragments for training and testing. We ran a series of experiments, and found that this approach is effective in this domain as well.
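The “bags of bytes” representation described in this abstract can be illustrated with a minimal sketch. The function name `bag_of_bytes_features` and the dictionary layout are assumptions made for illustration; the abstract does not specify how the authors encode their feature vectors, and the SVM training step is not shown.

```python
import math
from collections import Counter


def bag_of_bytes_features(fragment: bytes) -> dict:
    """Hypothetical helper: unigram counts, bigram counts, and Shannon entropy."""
    # Iterating over a bytes object yields integers in the range 0..255.
    unigrams = Counter(fragment)
    # Pair each byte with its successor to count adjacent byte pairs.
    bigrams = Counter(zip(fragment, fragment[1:]))
    n = len(fragment)
    entropy = 0.0
    if n:
        # Shannon entropy in bits per byte, from the unigram distribution.
        entropy = -sum((c / n) * math.log2(c / n) for c in unigrams.values())
    return {"unigrams": unigrams, "bigrams": bigrams, "entropy": entropy}
```

In a full pipeline these counts would typically be flattened into fixed-length vectors (256 unigram slots, 65,536 bigram slots) before being passed to a classifier; that layout is likewise an assumption.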

6.
In forensic voice comparison, deep learning has recently become widely popular. It is mainly used to learn speaker representations, called embeddings or embedding vectors. Speaker embeddings are often trained using corpora that mostly contain widely spoken languages. Thus, language dependency is an important factor in automatic forensic voice comparison, especially when the target language is linguistically very different from the one the model was trained on. In the case of a low-resource language, developing a corpus for forensic purposes containing enough speakers to train deep learning models is costly. This study aims to investigate whether a model pre-trained on a multilingual (mostly English) corpus can be used on a target low-resource language (here, Hungarian) not represented in the model. Often multiple samples are not available from the offender (unknown speaker). Samples are therefore compared pairwise with and without speaker enrollment for suspect (known) speakers. Two corpora are used that were developed especially for forensic purposes and a third that is meant for traditional speaker verification. Speaker embedding vectors are extracted by the x-vector and ECAPA-TDNN techniques. Speaker verification was evaluated in the likelihood-ratio framework. A comparison is made between the language combinations (modeling, LR calibration, and evaluation). The results were evaluated by the Cllr^min and EER metrics. It was found that a model pre-trained on a different language but on a corpus with a significant number of speakers can be used on samples with language mismatch. Sample duration and speaking style also seem to affect the performance.

7.
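Translation in China has a history of more than four thousand years, from the Xia dynasty to modern times. Translation from Chinese into foreign languages began in 505 CE, the Ming and Qing periods saw translation flourish, and today translation in China is thriving as never before, having made an indelible contribution to the advancement of world civilization. The Chinese classic the Analects (《论语》) is a treasure of ancient thought and culture, and its translation into foreign languages has played a major role in enriching the thought and culture of all humanity. Differences in translators' linguistic and cultural competence inevitably lead to differences in how the same work is understood, and this is especially true of translations of the Analects. This paper analyzes five English translations of the Analects with the aim of improving the quality of translations of Chinese culture for foreign audiences.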

8.
张传玺 《北方法学》2012,6(4):146-154
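As one of the most important forms of action in English legal history, trespass on the case has been rendered into Chinese in several ways, each with its own rationale. A brief examination of the existing renderings and the historical sources shows that the once-dominant renderings "类案侵害之诉" (action for trespass in similar cases) and "间接侵害之诉" (action for indirect trespass) are not appropriate, and that the other existing Chinese renderings also have shortcomings. The rendering "个案侵害之诉" (action for trespass on the individual case) appears to accord better with the historical facts and can be briefly explained by reference to the openness of English law. Examining the different theories behind each rendering makes it possible to trace the transition from older to newer scholarship in the study of English legal history.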

9.
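Scholars say that "in translation, fixing the terms is the hardest task", and the ancients taught that "a craftsman who wishes to do his work well must first sharpen his tools". The accurate rendering of terms in reference works therefore bears directly on whether a translation achieves its purpose. Legal English dictionaries suffer from several problems: mistranslations, limited coverage, failure to distinguish near-synonyms, and large numbers of non-lexicalized entries. Dictionary editors should do their utmost to improve these reference works so as to remove the obstacles they create for the translation of legal English.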

10.
A problem that arises in computer forensics is to determine the type of a file fragment. An extension to the file name indicating the type is stored in the disk directory, but when a file is deleted, the entry for the file in the directory may be overwritten. This problem is easily solved when the fragment includes the initial header, which contains explicit type-identifying information, but it is more difficult to determine the type of a fragment from the middle of a file. We investigate two algorithms for predicting the type of a fragment: one based on Fisher's linear discriminant and the other based on longest common subsequences of the fragment with various sets of test files. We test the ability of the algorithms to predict a variety of common file types. Algorithms of this kind may be useful in designing the next generation of file-carvers – programs that reconstruct files when directory information is lost or deleted. These methods may also be useful in designing virus scanners, firewalls and search engines to find files that are similar to a given file.
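The longest-common-subsequence idea in this abstract can be sketched as follows. The abstract does not say how the LCS results are turned into a prediction, so the scoring rule here (pick the type whose sample files give the longest LCS with the fragment) and the names `lcs_length` and `predict_type` are assumptions for illustration only.

```python
def lcs_length(a: bytes, b: bytes) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence that ended before both bytes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping a byte from a or b.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def predict_type(fragment: bytes, samples: dict) -> str:
    """Hypothetical rule: choose the label whose samples share the longest LCS.

    `samples` maps a type label to a non-empty list of sample byte strings.
    """
    def score(label: str) -> int:
        return max(lcs_length(fragment, s) for s in samples[label])

    return max(samples, key=score)
```

The dynamic program takes time proportional to the product of the two lengths, so a practical tool would need short fragments or a cheaper approximation.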

11.
A comparison of the national-cultural semantics of English and Chinese color terms, and culture teaching
徐霞  周影韶 《政法学刊》2005,22(3):119-121
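The meaning a word carries includes not only its conceptual meaning but also its socio-cultural coloring and emotional connotations. National-cultural semantics (国俗语义) is one manifestation of the national character of meaning. Color terms in English and Chinese share very little national-cultural meaning. They differ in social attributes, emotional associations, and psychological associations, and each language has gaps where the other's color terms carry national-cultural meanings.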

12.
Comparative scholars of legislative politics continue to face the challenge of measuring a key theoretical concept: conflict at the level of legislative bills. We address this challenge with a multilingual sentiment-based approach and show that such a measure can effectively capture different types of parliamentary conflict. We also demonstrate that an automated translation of the dictionary yields valid results and therefore greatly facilitates comparative work on legislatures. Our applications show that a sentiment approach recovers government-opposition dynamics in various settings. The use of a simple, translatable sentiment dictionary opens up the possibility of studying legislative conflict in bill debates across languages and countries.

13.
Conceptual confusions permeate all forms of intellectual pursuit. Many have contended that multilingual legislation, i.e., one law enacted in different languages, is unviable when carried out by means of translation. But not many have realized that the same would also be true of drafting if their contention could be justified. My involvement in the translation of Hong Kong laws into Chinese in the run-up to 1997 exposed me to a whole world of myths and misconceptions about legal translation arising from our failure to command a clear view of the workings of language. Over the years I have endeavoured to come to grips with the problems inherent in legal translation, showing that the arguments against the possibility of exact translation, against the possibility of achieving equivalence between different language texts of the law, and against the possibility of bridging the conceptual gap between legal terminologies in different languages, are all ill-grounded and misguided. There are indeed enormous difficulties in drafting and translating multilingual law, but they are essentially of a technical nature, by no means theoretically irresolvable. The viability of multilingual legislation is simply grounded in our innate communicative intention to use signs and symbols to convey meaning. As language users, we are capable of making language work for us for any particular purpose. Just as we can translate the rules of chess from one language to another whereby players speaking different languages can play the same game called “chess”, we don’t see why we can’t do the same with multilingual legislation. The door has always been open!

14.
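The translation of legal English into Chinese should attend to both linguistic transfer and cultural transfer, but the two call for different methods. For the sake of readability, translators should exercise creativity and initiative and use domestication in linguistic transfer. In view of the purpose and function of translating legal English into Chinese, foreignization should be used in cultural transfer. The ultimate aim of foreignization is to achieve genuine domestication, which benefits both the reception of foreign legal culture and the enrichment of local legal culture.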

15.
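This study used a cross-language semantic categorization task to examine the characteristics of conceptual representation in less proficient Chinese-English bilingual children. The participants were pupils in primary grades 3 and 5 and in the first year of junior secondary school, all of whom had begun learning English in primary grade 1. The categorization responses of grade 3 participants were affected by the language of the category name only when the target word was in their first language. The responses of grade 5 participants were affected by the category language only when the target word was in their second language, and the responses of first-year junior secondary participants were unaffected by the category language. These results indicate that second-language words can directly access conceptual representations in less proficient Chinese-English bilingual children, and that as grade level rises the links between second-language words and concepts strengthen, so that this access gradually grows stronger.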

16.
This article tries to bring to light the mistaken idea that the words the law borrows from plain language, without explicit definition, should keep their original meaning. Although legal language and plain language are obviously close “friends”, they also seem to be “false friends”, because these words belonging to two different languages have, beyond their formal similarities, partially different meanings. For this purpose, this article provides a critical analysis of the reference in Belgian case law to the ordinary meaning of words. This reference is analysed in relation to three different matters: interpretation, legality, and autonomy of criminal law.

17.
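China is a land of poetry: shi, ci, songs, and fu have been handed down from generation to generation, and poets have emerged in every age. Almost all of China's political figures through history could write poetry and prose. The poems of Mao Zedong, a great figure of the modern era, are classical and elegant in style, wide-ranging in subject matter, and rich and meaningful in content, and they are often recited and quoted by Chinese people and by friends abroad. Mao Zedong's poems were translated into many foreign languages as early as the beginning of the 1960s and have spread around the world. There are some twenty English translations, each with its own merits. Comparative study of the different translations helps to further improve translation quality and to promote more effectively the healthy development of literary translation in China.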

18.
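Vagueness in treaty language is not an isolated problem. An empirical study of WTO cases shows that "shall" and "should" are polysemous in authoritative general English dictionaries and legal English dictionaries alike. As a result, Members and the dispute settlement bodies disagree over whether "shall" and "should" in the WTO agreements carry a mandatory or a hortatory meaning, which undermines the legal certainty and predictability of the relevant WTO rules. The causes of this disagreement are complex. They include the limitations of international law itself, the fact that the meaning of treaty terms may vary with context, Members' shifting and inconsistent readings of international rules in line with their own interests, and the vagueness of language. Using normative analysis, semantic analysis, empirical study, and systematic analysis, the appropriate meaning of "shall" and "should" can be selected reasonably and accurately under the customary rules of interpretation set out in Articles 31 and 32 of the Vienna Convention on the Law of Treaties. Both words have mandatory as well as hortatory senses, and how to determine their specific meaning in a given context has not yet been settled by the still-insufficient treaty practice and rules of treaty law.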

19.
Investigating seized devices within digital forensics gets more and more difficult due to the increasing amount of data. Hence, a common procedure uses automated file identification, which reduces the amount of data an investigator has to look at by hand. Besides identifying exact duplicates, which is mostly solved using cryptographic hash functions, it is also helpful to detect similar data by applying approximate matching. Let x denote the number of digests in a database; then the lookup for a single similarity digest has a complexity of O(x). In other words, the digest has to be compared against all digests in the database. In contrast, cryptographic hash values are stored within binary trees or hash tables, and hence the lookup complexity of a single digest is O(log2(x)) or O(1), respectively. In this paper we present and evaluate a concept to extend existing approximate matching algorithms, which reduces the lookup complexity from O(x) to O(1). Instead of using multiple small Bloom filters (which is the common procedure), we demonstrate that a single, huge Bloom filter has far better performance. Our evaluation demonstrates that current approximate matching algorithms are too slow (e.g., over 21 min to compare 4457 digests of a common file corpus against each other) while the improved version solves this challenge within seconds. Studying the precision and recall rates shows that our approach works as reliably as the original implementations. This benefit comes at the cost of granularity: the comparison is now a file-against-set comparison, and thus it is not possible to see which file in the database is matched.
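The single-Bloom-filter idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BloomFilter`, the default sizes, and the use of SHA-256 with double hashing to derive bit positions are all assumptions chosen to keep the example short.

```python
import hashlib


class BloomFilter:
    """Hypothetical single large Bloom filter holding features from many files."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # Cost depends only on num_hashes, not on how many items were inserted.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

Inserting the features of every known file into one filter makes each query a constant-time membership test, but the filter only answers whether a feature was seen somewhere in the set, which matches the file-against-set limitation the abstract notes. False positives remain possible, as with any Bloom filter.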

20.