devtools::install_github("STAT325-S24/HistoryAmherstCollege")
Example analyses
The following is a glimpse of the package’s data objects, along with a sample analysis using Named Entity Recognition (thanks to the cleanNLP package).
packageVersion("HistoryAmherstCollege")
[1] '1.0'
glimpse(history_text)
Rows: 25,537
Columns: 4
$ text <chr> "Amherst College DURING ITS FIRST HALF CENTURY. 1821-1871."…
$ chapter <chr> "00", "00", "00", "00", "00", "00", "00", "00", "00", "00", …
$ paragraph <dbl> 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, …
$ page_num <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
mosaic::tally(~ chapter, data = history_text)
chapter
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
157 422 394 207 398 402 402 693 536 900 1336 1284 958 1109 1192 288
16 17 18 19 20 21 22 23 24 25 26 27 28 29
1313 1231 439 1298 2081 1238 1077 674 907 1486 666 308 1638 503
glimpse(history_anno)
List of 3
$ token : tibble [281,272 × 10] (S3: tbl_df/tbl/data.frame)
..$ doc_id : int [1:281272] 1 1 1 1 1 1 1 1 1 1 ...
..$ sid : int [1:281272] 1 1 1 1 1 1 1 1 1 2 ...
..$ tid : int [1:281272] 1 2 3 4 5 6 7 8 9 1 ...
..$ token : chr [1:281272] "Amherst" "College" "DURING" "ITS" ...
..$ token_with_ws: chr [1:281272] "Amherst " "College " "DURING " "ITS " ...
..$ lemma : chr [1:281272] "Amherst" "College" "during" "its" ...
..$ upos : chr [1:281272] "PROPN" "PROPN" "ADP" "PRON" ...
..$ xpos : chr [1:281272] "NNP" "NNP" "IN" "PRP$" ...
..$ tid_source : int [1:281272] 2 0 2 7 7 7 3 2 8 3 ...
..$ relation : chr [1:281272] "compound" "root" "prep" "poss" ...
$ entity : tibble [19,065 × 6] (S3: tbl_df/tbl/data.frame)
..$ doc_id : int [1:19065] 1 1 1 2 3 4 6 6 7 8 ...
..$ sid : int [1:19065] 1 1 2 1 1 1 1 1 1 1 ...
..$ tid : int [1:19065] 1 4 1 2 5 4 1 3 1 1 ...
..$ tid_end : int [1:19065] 2 7 3 4 5 8 1 3 3 1 ...
..$ entity_type: chr [1:19065] "ORG" "DATE" "DATE" "PERSON" ...
..$ entity : chr [1:19065] "Amherst College" "ITS FIRST HALF CENTURY" "1821-1871" "W. S. TYLER" ...
$ document: tibble [25,537 × 4] (S3: tbl_df/tbl/data.frame)
..$ chapter : chr [1:25537] "00" "00" "00" "00" ...
..$ paragraph: num [1:25537] 1 1 1 1 2 2 2 2 3 3 ...
..$ page_num : num [1:25537] 1 1 1 1 1 1 1 1 1 1 ...
..$ doc_id : int [1:25537] 1 2 3 4 5 6 7 8 9 10 ...
- attr(*, "class")= chr [1:2] "cnlp_annotation" "list"
mosaic::tally(~ entity$entity_type, data = history_anno)
entity$entity_type
CARDINAL DATE EVENT FAC GPE LANGUAGE
1921 4021 24 169 2164 56
LAW LOC MONEY NORP ORDINAL ORG
38 343 343 845 627 3826
PERCENT PERSON PRODUCT QUANTITY TIME WORK_OF_ART
5 4127 89 44 269 154
glimpse(history_subtitles)
Rows: 656
Columns: 4
$ page_number <int> 5, 6, 7, 8, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24…
$ page_header <chr> "PREFACE. 5 (v)", "6 (vi) PREFACE. ", "PREFACE. 7 (vii)",…
$ first_line <chr> "THIS History was a part of the plan for the Semi-Centenni…
$ chapter <chr> "00", "00", "00", "00", "01", "01", "01", "01", "01", "01"…
glimpse(history_tables)
Rows: 15
Columns: 3
$ pdf_pg <dbl> 15, 335, 412, 493, 506, 649, 655, 657, 668, 676, 680, 688, 693…
$ book_pg <dbl> 9, 321, 392, 465, 478, 607, 613, 615, 626, 632, 636, 644, 649,…
$ title <chr> "Contents", "Misc_Donations", "Principal_Donations", "Religiou…
glimpse(history_figures)
Rows: 20
Columns: 2
$ pdf_pg <dbl> 6, 40, 71, 84, 293, 341, 372, 409, 421, 425, 429, 453, 590, 59…
$ caption <chr> "view_from_ne", "view_from_mt_pleasant", "ac_in_1821", "zeph_s…
history_anno$entity |>
filter(entity_type == "PERSON") |>
group_by(entity) |>
count() |>
arrange(desc(n)) |>
head(6)
# A tibble: 6 × 2
# Groups: entity [6]
entity n
<chr> <int>
1 Humphrey 165
2 Hitchcock 160
3 Moore 67
4 Fiske 60
5 Stearns 58
6 Sabbath 49
The above table shows us the six most common names in the book. Some familiar ones immediately pop out, such as President Hitchcock, the eponym of the “Hitchcock Residence Hall” and one of the founding members of the American Statistical Association.
Note that “Sabbath” is incorrectly listed as a person. The lines of text that mention it show that it refers to the day of worship:
history_text |>
filter(str_detect(text, "Sabbath")) |>
select(-paragraph, -page_num) |>
kable()
text | chapter |
---|---|
the Faculty and students worshipped on the Sabbath with | 07 |
the first Sabbath-school in Amherst. And it may not be amiss to | 07 |
were most frequently superintendents of the village Sabbath- | 07 |
Sabbath-school in Amherst, and some of the good people of the | 08 |
one or more of the classes. He preached on the Sabbath, occasionally, | 10 |
Sabbath to the congregation. He also sustained (from the first, | 10 |
at the Sabbath morning prayer-meetings of the students, | 11 |
the Sabbath in the old parish meeting-house on the hill. I soon | 11 |
by the students, and there were increasing symptoms from Sabbath | 11 |
to Sabbath qf collision and disturbance. I accordingly told | 11 |
Sabbath worship. The subject of a new chapel came before the | 11 |
Sabbath, the chapel building contained originally four recitation | 11 |
College, and father cares so much as that for it. The next Sabbath | 11 |
every other Sabbath, and by the other clerical members of the | 12 |
Faculty in rotation on each alternate Sabbath ; and at their first | 12 |
two hundred dollars, that is, five dollars a Sabbath, as the | 12 |
was at length doubled, and since that time ten dollars a Sabbath | 12 |
the public services of the Sabbath, were the religious lecture on | 12 |
hour immediately preceding public worship Sabbath morning. | 12 |
day seem like the Sabbath in its most strict observance. The | 12 |
particularly those which were held Sabbath mornings at half | 12 |
like a meeting at one of their houses Sabbath afternoon. The | 12 |
time a meeting, with a Sabbath-school, was sustained during my | 12 |
The Holy Spirit was evidently present. Sabbath day | 12 |
approaching Sabbath, July 6,” ” the pastor stated to the church | 12 |
village Sabbath-school, were greatly useful in promoting it, if | 12 |
at the Sabbath-school concert, and how the whole crowded assembly | 12 |
every Sabbath, at Granby.” | 13 |
on the Sabbath ever since. The sacrament of the Lord’s supper | 15 |
Sabbaths should be taken by the Professors, all of whom | 15 |
sermons on the Sabbath were all that the Trustees required ; | 15 |
Thursday evening as the public exercises on the Sabbath. For | 15 |
Besides the regular ministrations of the Sabbath, we have had | 15 |
Saturday night, sometimes Sabbath morning immediately after | 15 |
prayers, and sometimes Sabbath evening one hour before prayers. | 15 |
between the hours of nine and ten o’clock Sabbath evening. | 15 |
attend the Sabbath services, by visiting him in his harvest-field, | 16 |
whole community. ” On the first Sabbath of November the | 16 |
of wit. I remember well how his reproving eye one Sabbath | 16 |
either a teacher or superintendent of the Sabbath School. | 16 |
it the subject of private prayer on the Sabbath. Monday came, | 17 |
In addition to the faithful preaching on the Sabbath, the | 18 |
in the Faculty, preached in rotation on the Sabbath, at the | 18 |
others that attend public worship on the Sabbath at the College | 18 |
An extra Sabbath evening prayer-meeting of all the classes was | 18 |
preached the next Sabbath, March 10, the feeling increasing | 18 |
dollars. He preached twenty-one Sabbaths, before receiving | 19 |
of their Sabbath School. | 19 |
of a pleasant Sabbath morning, as our young men and families | 20 |
of the Sabbath, as other churches do, in a retired, consecrated | 20 |
Sabbath home, from which all the studies and distractions of | 20 |
Sabbath worship and service, but also of a professorship whose | 20 |
His duties shall be to preach on the Sabbath such portion of the | 20 |
the Sabbath, such assistance may be expected from other Professors | 20 |
half of the time on the Sabbath, and to assist as heretofore in | 20 |
as on the Sabbath to two or three hundred young men, and | 20 |
The President preached in the College Chapel every other Sabbath, | 21 |
and on the alternate Sabbaths the clerical Professors | 21 |
Sabbath. But this was generally felt to be more suitable to | 21 |
Saturday evening a part of the Sabbath, according to Jewish | 21 |
custom, and on Sabbath evening preparing the lesson for | 21 |
and her daughters thus employed Sabbath evening. Under | 21 |
under the fourth and left Sabbath evening free from secular | 21 |
on the Sabbath, but more on Sunday, Tuesday and Thursday | 21 |
As the College Chapel was undergoing repairs, the Sabbath | 21 |
avoid too many required religious exercises on the Sabbath, and | 21 |
” June 15, 1856, Sabbath evening, the pastor called a meeting | 21 |
teacher in the Sabbath school, both in Lowell and in Boston, | 22 |
up for temperance and the observance of the Sabbath, attended | 22 |
and Sabbath schools, and the use of other suitable means, were | 22 |
which they assembled for worship on the Sabbath without any | 22 |
After describing the sermon and the scene on that Sabbath | 22 |
Congregational Church, his superintendence of a large Sabbath | 23 |
Conferences and Sabbath School Conventions, he is equally | 23 |
Sabbaths as the thirty-seventh candidate, he was ordained and | 24 |
a superintendent or teacher in the Sabbath School, it is the testimony | 24 |
that did business on the Sabbath, and he relinquished the tempting | 25 |
evening prayers and in the Sabbath services of the chapel. | 26 |
fitly introduce the services of the Sabbath and accompany the | 26 |
deacon of the Church, superintendent of the Sabbath | 28 |
a command which we deem sacred, ” Remember the Sabbath day to ketp it holy.” We | 29 |
and have seen and heard Mr. Otis, on the Sabbath as well as on other days. | 29 |
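
Since the lines above use “Sabbath” for the day of worship rather than for a person, one option is to drop such false positives before counting names. The following is a minimal sketch and not part of the package: `toy_entity` is a small made-up tibble that only mimics the `entity_type` and `entity` columns of `history_anno$entity`, and `false_positives` is a hypothetical, hand-maintained vector.

```r
library(dplyr)

# Toy stand-in for history_anno$entity (illustrative values only, not real counts)
toy_entity <- tibble(
  entity_type = c("PERSON", "PERSON", "PERSON", "PERSON", "ORG"),
  entity      = c("Hitchcock", "Sabbath", "Hitchcock", "Humphrey", "Amherst College")
)

# Hypothetical list of strings known to be mislabeled as PERSON
false_positives <- c("Sabbath")

# Keep PERSON entities, drop the known false positives, then count each name
person_counts <- toy_entity |>
  filter(entity_type == "PERSON", !entity %in% false_positives) |>
  count(entity, sort = TRUE)

person_counts
```

To try this on the real annotations, `toy_entity` would be replaced with `history_anno$entity`; the list of false positives would need to be built by inspecting the output, as was done for “Sabbath” above.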
history_anno$token |>
filter(upos == "NOUN") |>
group_by(token) |>
count() |>
arrange(desc(n)) |>
head(6)
# A tibble: 6 × 2
# Groups: token [6]
token n
<chr> <int>
1 years 703
2 time 614
3 students 517
4 year 390
5 dollars 379
6 class 329
The above table displays the six most common nouns in the text.
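
In the table, “years” and “year” are counted separately because the grouping is on the surface form in `token`. The `token` table also has a `lemma` column, so grouping on it instead should combine inflected forms of the same noun. The sketch below illustrates the idea on a made-up tibble (`toy_token`, a hypothetical stand-in that only mimics the `token`, `lemma`, and `upos` columns); it does not report results for the actual book.

```r
library(dplyr)

# Toy stand-in for history_anno$token (illustrative values only)
toy_token <- tibble(
  token = c("years", "year", "students", "student", "years", "during"),
  lemma = c("year", "year", "student", "student", "year", "during"),
  upos  = c("NOUN", "NOUN", "NOUN", "NOUN", "NOUN", "ADP")
)

# Keep nouns and count by lemma, so singular and plural forms are combined
noun_lemma_counts <- toy_token |>
  filter(upos == "NOUN") |>
  count(lemma, sort = TRUE)

noun_lemma_counts
```

Whether to count by `token` or by `lemma` depends on the question: tokens preserve the distinction between singular and plural usage, while lemmas give a cleaner picture of which concepts appear most often.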