Written discourse
As of March 2009, the written discourse in EANC comprises over 106 million tokens. Excluding the press subcorpus, the EANC database lists 510 authors.
Written Discourse | Tokens | % EANC |
Press | 47 264 735 | 42.9% |
Fiction | 37 279 344 | 33.8% |
Science | 13 875 930 | 12.6% |
Other Non-Fiction | 4 735 997 | 4.3% |
Poetry | 3 648 160 | 3.3% |
Total Written Discourse | 106 804 166 | 96.8% |
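The percentages in the table above are given relative to the whole of EANC, which is why they add up to about 96.8% rather than 100%. As a minimal sketch, the same token counts can be used to check the written-discourse total and to work out each genre's share of written discourse alone. The variable names are illustrative only and are not part of any EANC tool.

```python
# Token counts copied from the table above (March 2009 figures).
written_tokens = {
    "Press": 47_264_735,
    "Fiction": 37_279_344,
    "Science": 13_875_930,
    "Other Non-Fiction": 4_735_997,
    "Poetry": 3_648_160,
}

# The sum of the genre rows should match the "Total Written Discourse" row.
total_written = sum(written_tokens.values())

# Share of each genre within written discourse only (not within the whole EANC).
share_of_written = {
    genre: 100 * count / total_written
    for genre, count in written_tokens.items()
}

print(f"Total written discourse: {total_written} tokens")
for genre, share in share_of_written.items():
    print(f"{genre}: {share:.1f}% of written discourse")
```

Because the denominator here is the written-discourse total rather than the full corpus, the resulting shares come out slightly higher than the "% EANC" column.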
The genres of EANC texts are distributed unevenly over time. The 19th and 20th centuries are represented mostly by literary texts, both prose and poetry. Some older press has been added to the corpus in a joint project by EANC and the Armenian National Library (see Press Archive). The bulk of the press subcorpus, however, was acquired by downloading texts from open newspaper archives and thus represents the modern (from 2000 on) language of internet news resources of the Republic of Armenia (see also Armenian texts online). As a result, the ratio between press and fiction texts for the last decade is very different from the same ratio for the rest of the corpus.
Non-fiction texts are also represented unevenly over time. Most scientific texts come from the Soviet period (primarily the 1960s and 1970s). Most legal texts, however, have been obtained from open internet sources and come from the last decade.
Period | Prose (tokens) | Prose (% of period) | Poetry (tokens) | Poetry (% of period) | Non-fiction (tokens) | Non-fiction (% of period) | Press (tokens) | Press (% of period) | Total by period |
before 1870 | 291 930 | 64% | 3 630 | 1% | n/a | 0% | 160 704 | 35% | 456 264 |
1870 - 1879 | 514 702 | 53% | 48 811 | 5% | 249 572 | 26% | 149 631 | 16% | 962 716 |
1880 - 1889 | 1 431 103 | 74% | 4 020 | 0% | 48 411 | 3% | 446 963 | 23% | 1 930 497 |
1890 - 1899 | 801 630 | 100% | n/a | 0% | n/a | 0% | n/a | 0% | 801 630 |
1900 - 1909 | 735 988 | 36% | 84 430 | 4% | 253 204 | 12% | 954 997 | 47% | 2 028 619 |
1910 - 1919 | 451 942 | 60% | 61 526 | 8% | n/a | 0% | 245 806 | 32% | 759 274 |
1920 - 1929 | 739 636 | 44% | 296 573 | 18% | 44 170 | 3% | 599 488 | 36% | 1 679 867 |
1930 - 1939 | 2 211 314 | 57% | 27 747 | 1% | 242 714 | 6% | 1 410 425 | 36% | 3 892 200 |
1940 - 1949 | 922 848 | 46% | 138 791 | 7% | 198 717 | 10% | 732 734 | 37% | 1 993 090 |
1950 - 1959 | 2 408 255 | 47% | 784 771 | 15% | 462 914 | 9% | 1 421 629 | 28% | 5 077 569 |
1960 - 1969 | 4 013 652 | 57% | 479 107 | 7% | 425 842 | 6% | 2 176 226 | 31% | 7 094 827 |
1970 - 1979 | 5 885 441 | 48% | 121 854 | 1% | 4 354 936 | 36% | 1 899 469 | 15% | 12 261 700 |
1980 - 1989 | 3 983 807 | 34% | 69 216 | 1% | 5 935 592 | 50% | 1 861 032 | 16% | 11 849 647 |
1990 - 1999 | 1 227 048 | 37% | 78 553 | 2% | 1 324 881 | 40% | 650 432 | 20% | 3 280 914 |
2000 - 2008 | 1 129 320 | 2% | 57 638 | 0% | 4 174 458 | 10% | 34 552 624 | 88% | 39 914 040 |
undated | 10 530 728 | 82% | 1 391 493 | 11% | 896 516 | 7% | 2 575 | 0% | 12 821 312 |
Total | 37 279 344 | 35% | 3 648 160 | 3% | 18 611 927 | 17% | 47 264 735 | 44% | 106 804 166 |
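To make the difference in the press-to-fiction ratio concrete, the following sketch compares the press-to-prose token ratio for 2000 - 2008 with the ratio for everything else in the table. It assumes that the "Prose" column stands for fiction, since its total matches the Fiction figure given earlier. The "rest" group here includes the undated texts, and the variable names are illustrative only.

```python
# Token counts copied from the period table above.
press_total, prose_total = 47_264_735, 37_279_344    # "Total" row
press_recent, prose_recent = 34_552_624, 1_129_320   # "2000 - 2008" row

# Press-to-prose ratio for the last decade covered by the corpus.
ratio_recent = press_recent / prose_recent

# Everything outside 2000 - 2008 (all earlier periods plus undated texts).
press_rest = press_total - press_recent
prose_rest = prose_total - prose_recent
ratio_rest = press_rest / prose_rest

print(f"2000 - 2008: {ratio_recent:.1f} press tokens per prose token")
print(f"Rest of corpus: {ratio_rest:.2f} press tokens per prose token")
```

Under these assumptions, press outweighs prose by a factor of roughly thirty in 2000 - 2008, while in the rest of the corpus prose is about three times larger than press.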
One of the important objectives of EANC is to collect as many Standard Eastern Armenian (SEA) fiction texts as practicable. EANC includes all reading texts in today’s Armenian secondary school curriculum, as well as the vast majority of SEA classical literature, starting from Khachatur Abovian (mid-19th century). Many classical writings from before 1938 are now available in full in the EANC Electronic Library.