- Postcard claiming to be written by 'Jack the Ripper' sells at auction
- Recommended Games
- Jack The Ripper : Letters From Hell - Extended Edition
The most common theory about the authorship of these texts is that journalists fabricated them to increase newspaper sales. At the time, the Central News Agency was in fierce competition with other news agencies and had a reputation for fabricating or embellishing news (Evans and Skinner; Begg). As a first step towards shedding light on the authorship question of the Jack the Ripper letters, the present article reports on an authorship analysis of the texts connected to Jack the Ripper that were received during and after the Whitechapel murders case.
The available data set lends itself to several authorship questions, such as the profiling of the anonymous author(s), or the comparison between some key letters and Bulling's and Best's writings. Establishing whether some of the Jack the Ripper texts could have been written by the same person is an important preliminary step, as any future study, either involving profiling or comparison, would benefit from knowing if a number of questioned texts can be clustered together. In this sense, the authorship question tackled in the present study constitutes a useful starting point for any future authorship study on the Jack the Ripper letters.
The data set used in the present study is a corpus that includes the texts connected to the Whitechapel murders: the Jack the Ripper Corpus (JRC; see Supplementary Material).
This corpus consists of the letters or postcards found and transcribed in the Appendix of Evans and Skinner, who claim to have collected all of the texts involved in the Whitechapel murders related to Jack the Ripper from the Metropolitan Police files. These letters were OCR-scanned from the book and the scans were manually checked for scanning errors. The corpus consists of texts and 17, word tokens.
The peculiarity of the JRC is that almost all of the texts in the corpus are comparable in terms of their broad situational parameters (Biber), as they are almost all written letters or postcards with similar linguistic purposes. The vast majority of the letters were postmarked or found in London, although other letters were postmarked or found in places all over the UK, such as Birmingham, Bradford, Dublin, Edinburgh, Liverpool, Manchester, or Plymouth. The corpus ranges from 24 September to 14 October , thus spanning more than 10 years after the murders. Before this date, according to Evans and Skinner's collection, four texts were received: Text 1 (24 September, word tokens): In this text the author admits to the killing of Chapman and states the intention to stop killing.
The letter is unsigned.

The authorship question considered for this study concerns finding out which texts in a corpus are likely to have been written by the same author. The best solutions proposed to solve this type of problem involve the addition of distractor texts belonging to similar registers and the use of similarity metrics applied to feature sets consisting of frequencies of linguistic features.
The problem in applying any of these techniques to the JRC is that its texts are too short to produce reliable frequencies, as the average text length for the corpus is only eighty-three word tokens. For this reason, it is necessary in this case to adopt a method that does not involve the computation of frequencies.
After being successfully applied to a text message case, methods using the Jaccard coefficient have been applied with good results to other registers, including newspaper articles (Juola), short emails (Johnson and Wright; Wright), and elicited personal narratives (Larner). Within plagiarism detection research, word n-gram techniques based on similar mathematical principles are very common (Oakes). Word n-grams have been extensively adopted as linguistic features in traditional frequency-based stylometric methods for authorship attribution, although they are not deemed the best stylometric features, as they are often surpassed in efficacy by function words, simple word frequency, and, above all, character n-grams (Grieve; Stamatatos). Character n-grams could also be good features, but they are less amenable to interpretation, which can be a drawback depending on the ultimate goal of the research.
In addition to these methodological advantages, the use of word n-grams as features has theoretical support. Wright reveals the idiolectal nature of certain word n-grams by holding one specific speech act constant and then analysing how different authors realize this act, finding that each author resorts to their own idiosyncratic set of lexical choices to perform the same act. However, evidence of common authorship of two sets of documents can come not only from finding similarity but also from establishing that this similarity is distinctive (Grant). Although it is difficult to establish a universal threshold for distinctiveness, it is safe to assume that if a particular n-gram or lexicogrammatical structure does not occur at all, or occurs extremely infrequently, in a comparable reference corpus, then this n-gram or structure is distinctive.
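The notion of distinctiveness described above can be illustrated with a minimal sketch that counts how often a word n-gram occurs per million tokens of a reference text. The tokenizer, the function name `ngram_rate_per_million`, and the toy reference string are assumptions made for illustration only; they are not taken from the study, which relied on large historical corpora.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_rate_per_million(ngram, reference_text):
    """Return the occurrences of a word n-gram per million reference tokens."""
    target = tuple(tokenize(ngram))
    tokens = tokenize(reference_text)
    n = len(target)
    if not tokens or n == 0:
        return 0.0
    # Slide a window of length n over the reference tokens and count exact matches.
    hits = sum(
        1
        for i in range(len(tokens) - n + 1)
        if tuple(tokens[i:i + n]) == target
    )
    return hits / len(tokens) * 1_000_000


# Toy stand-in for a reference corpus (hypothetical data, 13 tokens long).
reference = "the cat sat on the mat and the dog sat on the rug"
rate_common = ngram_rate_per_million("sat on", reference)
rate_absent = ngram_rate_per_million("right away", reference)
```

An n-gram whose rate is zero or very low in a comparable reference corpus would count as distinctive under the reasoning above; where exactly to draw the line remains a judgment call.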
If a smaller sub-sample of its texts is considered, the remainder of the JRC itself is indeed a corpus with relevant population data. However, because of its relatively small size, more data from 19th century English are necessary to find evidence of distinctiveness. Ideally, because of the pervasiveness of register variation, the perfect comparison corpus would be one including a large number of 19th century English letters written in comparable communicative situations (Biber). However, in the absence of an extensive resource of this kind, the most comprehensive available set of general reference corpora was used instead, consisting of the largest available corpora of 19th century English: The million word 19th century section of the Corpus of Historical American English (COHA);
However, provided that the shared n-grams found are also highly distinctive, the evidence of common authorship is nonetheless valid despite differences in text lengths.

[Figure: Relationship between rank and percentage of occurrence for each word 2-gram in the JRC occurring in at least two texts.]

Because of their frequent occurrence and thus reduced discriminatory power, these top eight 2-grams were excluded from further analysis.
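The feature selection step described above (keeping word 2-grams that occur in at least two texts and discarding the most widespread ones) could be sketched as follows. The tokenizer, the function names, the tie-breaking rule, and the toy texts are assumptions for illustration; the cut-off of eight comes from the text, but the toy example drops only one 2-gram because it is so small.

```python
import re
from collections import Counter


def word_bigrams(text):
    """Return the set of word 2-grams in a text (presence only, no counts)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return set(zip(tokens, tokens[1:]))


def select_bigrams(texts, min_texts=2, drop_top=8):
    """Keep 2-grams found in at least `min_texts` texts, minus the `drop_top` most widespread."""
    doc_freq = Counter()
    for text in texts:
        # Each text contributes at most 1 to a 2-gram's document frequency.
        doc_freq.update(word_bigrams(text))
    shared = [(bg, count) for bg, count in doc_freq.items() if count >= min_texts]
    # Rank by document frequency (descending); ties are broken alphabetically.
    shared.sort(key=lambda item: (-item[1], item[0]))
    return [bg for bg, _ in shared[drop_top:]]


# Hypothetical toy texts, not drawn from the corpus.
toy_texts = [
    "i am jack the ripper",
    "i am down on them",
    "catch me if you can jack the ripper",
    "catch me when you can",
    "i am not a butcher",
]
kept = select_bigrams(toy_texts, min_texts=2, drop_top=1)
```

In this toy run the 2-gram ("i", "am") appears in three texts, ranks first, and is therefore removed, while the four 2-grams shared by exactly two texts are kept.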
The distance between each pair of texts was quantified using the Jaccard distance based on the presence or absence of the remaining word 2-grams, and a distance matrix was generated.

[Figure: Histogram and boxplot showing the distribution of Jaccard distance values for all possible pairs of texts in the JRC.]
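As a rough illustration of the measure just described, the sketch below computes the Jaccard distance between texts from the presence or absence of word 2-grams, that is, one minus the size of the intersection of the two 2-gram sets divided by the size of their union. The tokenizer, the function names, the convention for two empty sets, and the toy texts are assumptions; for simplicity the sketch uses all 2-grams in each text without any prior filtering.

```python
import re
from itertools import combinations


def word_bigrams(text):
    """Return the set of word 2-grams in a text (presence only, no counts)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return set(zip(tokens, tokens[1:]))


def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # assumed convention: two empty feature sets are identical
    return 1 - len(a & b) / len(union)


def distance_matrix(texts):
    """Map each unordered pair of text names to its Jaccard distance."""
    features = {name: word_bigrams(text) for name, text in texts.items()}
    return {
        (x, y): jaccard_distance(features[x], features[y])
        for x, y in combinations(sorted(features), 2)
    }


# Hypothetical toy texts, not drawn from the corpus.
toy = {
    "A": "catch me when you can",
    "B": "catch me if you can",
    "C": "my knife is nice and sharp",
}
distances = distance_matrix(toy)
```

Texts A and B share two of six distinct 2-grams, so their distance is 1 - 2/6; A and C share none, so their distance is 1. Because only presence or absence matters, no frequency has to be estimated from very short texts.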
The distance matrix was then used for a hierarchical cluster analysis, which can be visualized through a radial dendrogram.

[Figure: Radial dendrogram displaying the results of a hierarchical cluster analysis of the JRC using the Ward method based on Jaccard distances. The name of each text is a code starting with two letters from the signature, followed by the date on which it was received. The texts mentioned in the introduction, including the pre-publication texts, are labelled with their name in addition to the code.]

Three main branches stem from the centre of the dendrogram.
On the right, there are two main clusters, one of which includes only two texts. The remaining texts are all classified into another cluster whose branch points to the left and that further splits into two other clusters that roughly correspond to the two hemispheres of the graph. The most historically interesting texts, including the pre-publication texts, are all grouped in the cluster spanning over the top hemisphere of the graph and therefore the rest of the article will focus on this cluster.
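The hierarchical cluster analysis mentioned above can be sketched in plain Python. The version below performs agglomerative clustering with Ward's criterion through the Lance-Williams update formula applied to a precomputed distance table. It is only an illustrative sketch: the source does not say which software was used, and the function name `ward_linkage`, the input format, and the toy distances are assumptions.

```python
def ward_linkage(names, dist):
    """Agglomerative clustering with Ward's criterion (Lance-Williams update).

    names: list of text labels.
    dist: dict mapping frozenset({a, b}) to the distance between texts a and b.
    Returns the merges in order as (cluster_a, cluster_b, merge_distance).
    """
    clusters = {i: (name,) for i, name in enumerate(names)}
    size = {i: 1 for i in clusters}
    d = {
        (i, j): dist[frozenset((names[i], names[j]))]
        for i in clusters
        for j in clusters
        if i < j
    }
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        (i, j), dij = min(d.items(), key=lambda item: item[1])
        merges.append((clusters[i], clusters[j], dij))
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            dki = d[(min(k, i), max(k, i))]
            dkj = d[(min(k, j), max(k, j))]
            total = size[i] + size[j] + size[k]
            # Lance-Williams update for Ward's method, on squared distances.
            squared = (
                (size[i] + size[k]) * dki ** 2
                + (size[j] + size[k]) * dkj ** 2
                - size[k] * dij ** 2
            ) / total
            updated[k] = squared ** 0.5
        # Drop every distance involving the two merged clusters.
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        for k, value in updated.items():
            d[(k, next_id)] = value  # k is always smaller than next_id
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        size[next_id] = size.pop(i) + size.pop(j)
        next_id += 1
    return merges


# Hypothetical distances between four toy texts.
labels = ["A", "B", "C", "D"]
toy_dist = {frozenset(("A", "B")): 0.2, frozenset(("C", "D")): 0.3}
for a in "AB":
    for b in "CD":
        toy_dist[frozenset((a, b))] = 0.9
merges = ward_linkage(labels, toy_dist)
```

With these toy values, A and B merge first, C and D second, and the two pairs are joined last, which is the kind of nested structure a dendrogram displays.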
Although it would be interesting to explore the other clusters, this is beyond the scope and space of this study. The left branch then splits into two more clusters, with the rightmost one splitting again into two large clusters. Let us therefore examine the pre-publication texts using a network graph. The graph also reports the Jaccard distance for each pair of texts. Additionally, these two texts have a Jaccard distance of 0.

Table 1. Syntactic analysis of the concordances for the 2-grams in common between Dear Boss and Saucy Jacky.
A closer examination reveals that the two texts share 2-grams of varying distinctiveness. The structure is quite rare even at a more general level, as it is found about ten to eighteen times per million words across the reference corpora.
It is very difficult to estimate distinctiveness for (7) using larger reference corpora, however, as it would involve the manual analysis of thousands of instances. As Fig. This is not reported in Fig.

[Figure: Network graph visualizing the relationships between the pre-publication texts. The size of each node is proportional to each text's length. Each edge represents a shared word 2-gram. For each pair of texts the Jaccard distance is also reported. Distances are rounded up.]
My knife's so nice and sharp I want to get to work right away if I get a chance. Dear Boss.
Only twelve JRC texts have a Jaccard distance lower than 0.

Table 2. Syntactic analysis of the concordances for the n-grams in common between Dear Boss and Saucy Jacky and the Moab and Midian letters.

The syntactic structure underlying this 4-gram is a verb phrase headed by a phrasal verb that, used within that particular structure underlying that particular unit of meaning, is also rare and distinctive overall.
The presence of this 4-gram and of this structure thus supports the hypothesis that the two texts were written by the same person. This and many other letters in the JRC can be analysed in more detail in the future. Historically speaking, the comparison presented between the earliest letters ever received in the Whitechapel murders case provides linguistic evidence supporting the hypothesis that the two most iconic texts sent during the case were written by the same person.
The present analysis, however, found linguistic evidence that supports the common authorship of these two texts. Future analyses focused on their profiling or on the comparison with known writings of suspect authors can thus take as point of departure a link between these two texts. The linguistic link found between these three texts is therefore far from coincidental in the light of the other non-linguistic evidence and significantly contributes to the debate on the origin of the letter.
The present analysis also has serious implications for modern research in forensic linguistics and authorship analysis. Quantitatively speaking, even though these letters were available in full in the public domain, only a very limited percentage of them present substantial linguistic similarities, implying that techniques for analysing short texts with similarity measures such as the Jaccard coefficient are quite effective in filtering this type of noise. Theoretically, the results presented in this article also contribute to the understanding of idiolect.
A superficial reading of most of the JRC letters would only reveal their similarities in terms of meanings, themes, purposes, and some phraseology. However, this analysis has revealed that, by investigating the way these meanings, themes, and purposes are encoded linguistically, uniqueness emerges, as demonstrated by the relatively high average Jaccard distances between the letters. As shown by Wright for short emails, although meanings and speech acts can be shared, it is the way they are encoded in words and syntactic structures that tends to be idiosyncratic or unique.
In this article, an analysis of the texts sent during the Whitechapel murders case was presented.
These results constitute new forensic evidence in the Jack the Ripper case after more than years, even though they do not reveal information about the identity of the killer(s). Besides the historical and forensic implications, the results presented in this article also have interesting consequences for modern research in authorship analysis, forensic linguistics, and research on idiolect.
Luckily I used to, which was the only way I knew how to distinguish "English stamps" from the rest of the huge pile.
The worst was a so-called memory game where you have to put graffiti back onto buildings after having taken it off a few puzzles before. Here is where I could make some off-color pun, like suggesting that this title be called Game from Hell instead of Letters from Hell. But hopefully you get the idea.