Technical Report Series on Corpus Building


Vol. 9 (June 2013)
Swedish Corpora

Uwe Quasthoff, Dirk Goldhahn
Abteilung Automatische Sprachverarbeitung, Institut für Informatik, Universität Leipzig

Affiliation of the authors:
Uwe Quasthoff, Dirk Goldhahn: Institut für Informatik, Universität Leipzig {quasthoff,

Copyright: Abteilung Automatische Sprachverarbeitung, Institut für Informatik, Universität Leipzig

Technical Report Series on Corpus Building
Vol. 1: Deutscher Wortschatz 2013
Vol. 2: Danish Corpora
Vol. 3: Dutch Corpora
Vol. 4: Icelandic Corpora
Vol. 5: Hungarian Corpora
Vol. 6: Ukrainian Corpora
Vol. 7: Indonesian Corpora
Vol. 8: Czech Corpora
Vol. 9: Swedish Corpora

This PDF document was created using the open source tool mwlib. For more information, see
PDF generated at: 26 June 2013

Swedish corpora
  Introduction to corpus creation
  SWE - a processing related language description
  SWE corpora
  SWE corpus comparison

Each appendix type listed below is given once per corpus, in this order: swe news 2007, 2008, 2009, 2010, 2011, 2012; swe newscrawl 2011, 2012; swe web 2002, 2011, 2012; swe wikipedia 2007, 2012; swe mixed 2012. Exceptions are noted in parentheses.

Processing details
  Database summary

Content details
  Size of different TLDs (no wikipedia corpora)
  Size of largest domains (no wikipedia corpora)
  Number of sources by time period (news corpora only)

Word details
  Words by length without multiplicity (not listed for swe wikipedia 2007)
  Words by length with multiplicity
  The most frequent 50 words
  Longest words in top by rank

Character N-gram details
  Alphabet as used in the top words

Abbreviation details
  Most frequent abbreviations
  Left neighbors of the full stop
  Left neighbors of the full stop with additional internal full stops

Sentences details
  Shortest sentences
  Longest sentences
  Length of sentences in characters
  Length of sentences in words

Oddities details
  Longest words
  Sentences with high average word length
  Problems with sentence segmentation - words ending in a stopword

Swedish corpora

Introduction to corpus creation

The Leipzig Corpora Collection (LCC) collects Web-based corpora for many different languages. The main text genres are newspaper texts, Wikipedias and randomly collected web pages. All corpora are processed in the same way:

- Crawling of Web pages
- HTML stripping
- Language identification
- Sentence segmentation
- Cleaning: removal of ill-formed sentences
- Duplicate removal
- Calculation of word frequencies and word co-occurrences

As a result, we have a corpus containing only well-formed sentences in the language under consideration. The sentences are in random order; hence, sharing the corpus does not violate copyright law because it is impossible to reconstruct the original texts.

The pre-processing steps comprise both language-independent steps (such as HTML stripping and duplicate removal) and language-dependent steps (such as language identification and sentence segmentation). The language-specific parts in particular are vulnerable to processing problems. The aim of this paper is to identify possible problems and to evaluate the results. The following topics are addressed:

- A processing-focused language description
- Language size: How much text is available for this language? What are the biggest sources?
- Corpus description: genre, size, crawling and processing date.
- Possible problems in language identification: Which languages are similar?
- Character set and alphabet
- Inspecting the word list: most frequent words, longer high-frequency words and the longest words overall. Word length distribution.
- Can abbreviations confuse sentence segmentation? Information about the abbreviation list.
- Inspecting sentences: the shortest and longest sentences are inspected to identify possible segmentation problems. Sentence length distribution.

The paper describes the results of these inspections; the appendices show the exact results for the different corpora. This helps to compare the corpora with respect to quality. In the section Quality Overview, an overall quality description for each corpus is given. All corpora contain only minor problems which are irrelevant for most applications. Where larger problems were found, the corpus creation was repeated.
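The last processing steps listed above (duplicate removal, random sentence order, word frequencies and sentence-based co-occurrences) can be illustrated with a minimal Python sketch. It is not the LCC toolchain: the function name build_corpus, the simple \w+ tokenisation and the exact-match duplicate test are assumptions made only for this illustration.

```python
import random
import re
from collections import Counter
from itertools import combinations


def build_corpus(sentences, seed=0):
    """Deduplicate, shuffle and count words for a list of sentence strings.

    Hypothetical helper for illustration; not part of the LCC toolchain.
    """
    # Duplicate removal: keep the first occurrence of each exact sentence.
    unique = list(dict.fromkeys(s.strip() for s in sentences if s.strip()))
    # Random order, so that the original texts cannot be reconstructed.
    random.Random(seed).shuffle(unique)

    word_freq = Counter()
    cooc = Counter()
    for sentence in unique:
        # Assumed word definition: runs of letters, digits and underscore.
        words = re.findall(r"\w+", sentence.lower())
        word_freq.update(words)
        # Sentence-based co-occurrence: unordered pairs of distinct words.
        for pair in combinations(sorted(set(words)), 2):
            cooc[pair] += 1
    return unique, word_freq, cooc


sentences, word_freq, cooc = build_corpus(
    ["Det är bra.", "Det är bra.", "Hon är här."]
)
```

Exact matching removes only verbatim duplicates; near-duplicate sentences would need a fuzzier comparison, which this sketch does not attempt.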

SWE - a processing related language description

General properties of the Swedish language

Native name: Svenska
Classification: Indo-European, Germanic, North, East Scandinavian, Danish-Swedish, Swedish
Total number of speakers: 8.4M
Largest countries with number of speakers: Sweden (8.0M). Also spoken in parts of Finland, where it has equal legal standing with Finnish. Largely mutually intelligible with Norwegian and Danish.
Source: www.ethnologue.com/language/swe

Processing summary

- Latin alphabet with some additional characters
- The full stop is used as sentence boundary and for abbreviations
- Apostrophes are used rarely

Properties important for processing

Alphabet and punctuation

The alphabet is Latin based, with the following specialities (source: en.wikipedia.org/wiki/Swedish_alphabet):

- Swedish includes all 26 base letters and Å, Ä, Ö. In the alphabetic ordering, the letters Å, Ä, Ö follow Z at the end of the alphabet.
- Usual Latin punctuation
- Usage of uppercase letters: at sentence beginnings and for proper names (of persons, organisations, countries etc.).

Sentence segmentation and word tokenization

Sentence beginnings: Sentences begin with a capitalized first word.
Abbreviations: Abbreviations can be confused with sentence boundaries; a special abbreviation list has to be inspected. Sources for abbreviations: ??? Abbreviations with a full stop may appear in the word list without the full stop.
Apostrophes: The use of apostrophes is infrequent.
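The interplay of full stops, capitalized sentence beginnings and abbreviations described above can be sketched as follows. This is only an illustration under stated assumptions: the function split_sentences and the small abbreviation set are hypothetical, and the set is not the abbreviation list used for the corpora.

```python
import re

# Illustrative Swedish abbreviations only; NOT the list used for the corpora.
ABBREVIATIONS = {"t.ex.", "bl.a.", "dvs.", "kl."}


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Split at '.', '!' or '?' followed by whitespace and an uppercase letter,
    unless the token before the boundary is a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]\s+(?=[A-ZÅÄÖ])", text):
        candidate = text[start:match.start() + 1]
        last_token = candidate.split()[-1].lower()
        if last_token in abbreviations:
            continue  # the full stop belongs to an abbreviation
        sentences.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


text = "Vi besökte bl.a. Göteborg och Malmö. Det regnade."
with_list = split_sentences(text)
without_list = split_sentences(text, abbreviations=set())
```

With the abbreviation set, the example text yields two sentences; with an empty set, the full stop after "bl.a." is taken as a boundary and a third, faulty sentence ending in "bl.a." appears. This is the over-generation effect that a missing abbreviation causes.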

Sources and ranking (2012)

Estimated number of webpages containing text:

- Google.com top-5 words: 337,000,000 results for "och" "i" "att" "som" "på"
- Google.com top-10 words: 232,000,000 results for "och" "i" "att" "som" "på" "är" "en" "av" "för" "med"

Rank according to number of speakers (Ethnologue): 86
Rank according to Wikipedia size (see de.wikipedia.org/wiki/Wikipedia:Sprachen): Rank 5 with articles ( ).
Rank according to number of newspapers as found by AbyZ (5/2012): 256 newspapers, rank 10.
Rank according to number of newspapers with RSS feeds (5/2012): 122 newspapers, rank 13.
Rank according to our corpus size (9/2012): 13

SWE corpora

Quality Overview

Quality Ratings

A: Very good quality. Ready to use (or already used) for frequency dictionary.
   - Size as large as possible
   - Only minimal errors
   - Multiple genres (if possible)
A-: Small problems identified. They should not affect usage.
B: Native speaker quality.
   - Information about abbreviations and sentence boundaries by native speaker
   - Resulting statistics checked by native speaker, possible errors corrected
C: Non-native speaker quality.
   - Obvious problems shown in corpus statistics are corrected
D: First version.
   - Pre-processing with default abbreviation list and default sentence boundaries
E: Poor quality: old, outdated or faulty.

Corpus Quality

The quality of the corpora differs slightly because the corpus processing toolchain changed slightly over several years. Moreover, the original data are often no longer available. Hence, improving quality often means removing incomplete or doubtful sentences. Forthcoming editions of all corpora might thus have a slightly smaller number of sentences. This especially applies to near-duplicate sentences, which are removed only sparingly.

The following table shows the quality of the corpora. Minimal errors are still possible and are described in the sections below. All possible major improvements are mentioned here.

Corpus              Quality rating  Known problems                        To-dos
swe_news_2007       A               -                                     -
swe_news_2008       A               -                                     -
swe_news_2009       A-              Some duplicate sentences              -
swe_news_2010       A               -                                     -
swe_news_2011       A               -                                     -
swe_news_2012       A               -                                     -
swe_newscrawl_2011  A-              several near duplicate peaks          -
swe_newscrawl_2012  A               -                                     -
swe_web_2002        A-              max. 255 bytes instead of characters  -
swe_web_2011        A               -                                     -
swe_web_2012        A               -                                     -
swe_wikipedia_2007  A-              max. 255 bytes instead of characters  -
swe_wikipedia_2012  A               -                                     -
swe_mixed_2012      A               -                                     -

Processing Overview

For more details, see Appendix: Database Summary and Appendix: Number of sources by time period.

Corpus  Size (M sentences)  Size (M running words)  Multiwords  Crawling date  Production date
swe_news_ mainly 2005 and
swe_news_ daily 2008, 17% without date 2011
swe_news_ daily
swe_news_ daily
swe_news_ daily
swe_news_ daily
swe_newscrawl_ /
swe_newscrawl_ /
swe_web_ batch crawl
swe_web_ / /
swe_web_ / /
swe_wikipedia_ /
swe_wikipedia_ /
swe_mixed_ see above 2013

Content Overview

For more details, see Appendix: Size of different TLDs and Appendix: Size of different domains.

Corpus              Type of sources  Countries                                 Number of sources  Publishing date  Biggest source
swe_news_2007       News             .se (93%), .fi (3%), .com (2%)            113                mainly 5/ /2007
swe_news_2008       News             .se
swe_news_2009       News             .se
swe_news_2010       News             .se
swe_news_2011       News             .se
swe_news_2012       News             .se (95%), .ax (5%)
swe_newscrawl_2011  News             .se (80%), .com (18%)                                        and before
swe_newscrawl_2012  News             .se (82%), .fi (8%), .com (7%), .nu (2%)                     and before
swe_web_2002        Web              .se                                                          and before
swe_web_2011        Web              .se (88%), .com (5%), .fi (4%)                               and before
swe_web_2012        Web              .se (86%), .com (7%), .fi (3%)                               and before
swe_wikipedia_2007  Wikipedia                                                                     and before       wikipedia.org
swe_wikipedia_2012  Wikipedia                                                                     and before       wikipedia.org
swe_mixed_2012      Mixed Sources    .se (80%), .com (6%), .fi (3%)                               and before

Words

- Appendix: Words by Length without multiplicity shows a plot of the corresponding length distribution. A smooth asymmetric bell-shaped curve is expected.
- Appendix: Words by Length with multiplicity shows a plot of the corresponding length distribution. A smooth asymmetric bell-shaped curve is expected.
- Appendix: The Most Frequent 50 Words shows the most frequent stopwords as well as one or more words related to the region.
- Appendix: Longest Words in Top-1000 by rank shows the 25 longest words within the top 1000. They usually give an impression of the main topics treated in the corpus.
- Appendix: Longest Words with minimum frequency 2 should give an idea of very long words. In the case of processing problems, different types of non-words may appear. This might help to improve the word definition.
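The difference between the two word length distributions can be made concrete with a short Python sketch. The function name length_distributions and the toy frequency list are assumptions for illustration; "without multiplicity" is read here as counting each distinct word form once, "with multiplicity" as counting every running word.

```python
from collections import Counter


def length_distributions(word_freq):
    """Return (without_multiplicity, with_multiplicity) word-length counts.

    word_freq maps a word form to its corpus frequency.
    """
    without = Counter()
    with_mult = Counter()
    for word, freq in word_freq.items():
        without[len(word)] += 1        # each distinct word form counts once
        with_mult[len(word)] += freq   # every running word counts
    return without, with_mult


# Toy frequency list (made-up counts, for illustration only).
toy = {"och": 10, "att": 5, "på": 3, "sverige": 1}
without, with_mult = length_distributions(toy)
```

Because short stopwords are very frequent, the distribution with multiplicity is shifted towards short lengths compared with the one without multiplicity.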

Columns: (1) Word length graph without multiplicity, (2) Word length graph with multiplicity, (3) Most Frequent 50 Words, (4) Longest Words in Top-1000, (5) Longest Words with minimum frequency 2

Corpus              (1)   (2)   (3)   (4)                      (5)
swe_news_2007       okay  okay  okay  okay                     URLs, missing blanks
swe_news_2008       okay  okay  okay  okay                     missing blanks
swe_news_2009       okay  okay  okay  okay                     missing blanks, routes
swe_news_2010       okay  okay  okay  okay                     missing blanks, junk
swe_news_2011       okay  okay  okay  Rank 636:                URLs, missing blanks
swe_news_2012       okay  okay  okay  okay                     okay
swe_newscrawl_2011  okay  okay  okay  okay                     Missing blanks, routes, junk, URLs
swe_newscrawl_2012  okay  okay  okay  okay                     URLs, missing blanks, junk, etc.
swe_web_2002        okay  okay  okay  okay                     URLs, missing blanks, chemicals
swe_web_2011        okay  okay  okay  okay                     Routes, URLs, missing blanks, junk
swe_web_2012        okay  okay  okay  okay                     Routes, missing blanks, URLs, junk
swe_wikipedia_2007  okay  okay  okay  Rank 971: RobotQuistnix  Routes, URLs
swe_wikipedia_2012  okay  okay  okay  okay                     URLs
swe_mixed_2012      okay  okay  okay  okay                     all of the above

Abbreviations

A full stop that follows an abbreviation usually does not mark a sentence boundary. Conversely, abbreviations missing from the abbreviation list can over-generate sentence boundaries. Due to limitations in the processing chain, the list of abbreviations used for sentence boundary detection can differ from the abbreviations in the word list. Appendix: Most Frequent Abbreviations shows possible under-generation of sentence boundaries caused by wrong abbreviations (i.e. words ending in a full stop) in the word list.

Sentences

- Appendix: Shortest sentences shows the shortest declarative, exclamatory and interrogative sentences. In preprocessing, a minimal length for sentences might be specified. Missing abbreviations are often visible as faulty sentence endings.
- Appendix: Longest sentences shows the longest declarative, exclamatory and interrogative sentences. Usually, the maximum sentence length is defined as 256 characters (not 256 bytes). Very long exclamatory or interrogative sentences often contain an overlooked sentence boundary.
- Appendix: Length of sentences in characters shows the distribution of the sentence length. A large and balanced corpus will result in a smooth and bell-shaped curve. Isolated local maxima usually result from large sets of near-duplicate sentences.
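Why a length limit counted in bytes behaves differently from one counted in characters can be shown with a small Python sketch. It assumes UTF-8 encoding, which the report does not state; the function within_limit and the default of 255 (taken from the wording "max. 255 bytes" in the tables) are illustrative choices.

```python
def within_limit(sentence, limit=255, unit="characters"):
    """Check a sentence against a length limit in characters or UTF-8 bytes.

    Hypothetical helper; the encoding (UTF-8) is an assumption.
    """
    if unit == "characters":
        length = len(sentence)
    elif unit == "bytes":
        length = len(sentence.encode("utf-8"))
    else:
        raise ValueError("unit must be 'characters' or 'bytes'")
    return length <= limit


# 200 characters, but 400 bytes in UTF-8, because U+00E5 (å) needs two bytes.
sample = "\u00e5" * 200
fits_as_characters = within_limit(sample)              # True
fits_as_bytes = within_limit(sample, unit="bytes")     # False
```

Under a byte-based limit, Swedish sentences containing many occurrences of Å, Ä or Ö reach the limit earlier than sentences of the same character length written with base letters only.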

Columns: (1) Shortest sentences, (2) Longest sentences, (3) Length distribution (in characters), (4) Length distribution (in words)

Corpus              (1)                       (2)                                   (3)                                   (4)
swe_news_2007       okay                      max. 255 bytes instead of characters  okay                                  okay
swe_news_2008       okay                      okay                                  okay                                  okay
swe_news_2009       Some duplicate sentences  okay                                  okay                                  okay
swe_news_2010       okay                      okay                                  okay                                  okay
swe_news_2011       okay                      okay                                  okay                                  okay
swe_news_2012       okay                      okay                                  okay                                  okay
swe_newscrawl_2011  okay                      okay                                  several near duplicate peaks          okay
swe_newscrawl_2012  okay                      okay                                  okay                                  okay
swe_web_2002        okay                      okay                                  max. 255 bytes instead of characters  okay
swe_web_2011        okay                      okay                                  okay                                  okay
swe_web_2012        okay                      okay                                  okay                                  okay
swe_wikipedia_2007  okay                      okay                                  max. 255 bytes instead of characters  okay
swe_wikipedia_2012  okay                      okay                                  okay                                  okay
swe_mixed_2012      okay                      okay                                  okay                                  okay

Oddities

- Appendix: Sentences with high average word length: Average sentences contain many stopwords, and these stopwords are usually short. Hence, they limit the average word length in a sentence. Conversely, sentences with a high average word length are often ill-formed. They may be used to improve pre-processing.
- Appendix: Problems with sentence segmentation - Words ending in a stopword: If there are many ill-formed word or sentence boundaries without a blank between two words, they will generate new ill-formed words. The appendix shows the most frequent words ending in an uppercase stopword. If they are infrequent, then the data are of high quality.
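The check for words ending in an uppercase stopword can be sketched in Python as follows. Several points are assumptions: "uppercase stopword" is read as a stopword with a capitalized first letter (as at a sentence beginning), the stopword subset is taken from the top-10 words quoted earlier, and the function name glued_stopword_forms and the toy word list are hypothetical.

```python
from collections import Counter

# Illustrative subset of frequent Swedish stopwords; not the list actually used.
STOPWORDS = {"och", "att", "som", "en", "med"}


def glued_stopword_forms(word_freq, stopwords=STOPWORDS):
    """Find word forms that end in a capitalized stopword glued to a
    preceding lowercase letter, e.g. a missing blank at a sentence boundary."""
    hits = Counter()
    for word, freq in word_freq.items():
        for stop in stopwords:
            suffix = stop.capitalize()
            if (
                word.endswith(suffix)
                and len(word) > len(suffix)
                and word[-len(suffix) - 1].islower()
            ):
                hits[word] = freq
                break
    return hits


# Toy word list with made-up frequencies.
toy = {"sedanAtt": 4, "Att": 100, "gickOch": 2, "huset": 50}
hits = glued_stopword_forms(toy)
maxfreq = max(hits.values(), default=0)
```

The value maxfreq computed here corresponds in spirit to the maxfreq figures reported per corpus: the lower it is, the fewer missing blanks survived pre-processing.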
Corpus | Sentences with high average word length | Words ending in a stopword
swe_news_2007 | missing blanks | maxfreq=48
swe_news_2008 | routes, proper names | okay, maxfreq=8
swe_news_2009 | okay | okay, maxfreq=11
swe_news_2010 | URLs, missing blanks, routes | maxfreq=19
swe_news_2011 | okay | maxfreq=17
swe_news_2012 | okay | okay, maxfreq=4
swe_newscrawl_2011 | URLs, missing blanks, junk | maxfreq=203
swe_newscrawl_2012 | missing blanks, junk | maxfreq=94
swe_web_2002 | URLs, junk, special characters | maxfreq=33
swe_web_2011 | URLs, missing blanks, routes, junk | maxfreq=32
swe_web_2012 | URLs, missing blanks, routes, junk | okay, maxfreq=12
swe_wikipedia_2007 | URLs, chemicals, routes | okay
swe_wikipedia_2012 | URLs, Japanese, routes | okay
swe_mixed_2012 | as above | maxfreq=203

SWE corpus comparison

Automated corpus comparison

For the following comparisons, two tests are performed on the top-1000 words. First, vectors based on the frequencies of the top-1000 words are created for the analysed languages, and the cosine of the angle between these vectors is computed. The result is reported as a distance: identical languages receive a value of 0, distinct languages a value of 1. Second, the same analysis is conducted using the frequencies of the top-1000 typical letter trigrams of the languages.

Monolingual word list comparison (top-1000 words)

As one can expect, the comparisons show:
- The different news corpora have different word lists, with a maximum distance of 0.23 (swe_newscrawl_2011 and swe_news_2011).
- The Wikipedia corpora are similar, with a maximum distance of 0.09.
- The web corpora have a maximum distance of 0.18 (swe_web_2002 and swe_web_2012).
- The mixed corpus swe_mixed_2012 holds a central position, with maximum distances of 0.32 to the other corpora.

Multilingual word list comparison (top-1000 words)

Both the comparison of the top-1000 words and the comparison of the letter trigrams used in these words show that there are similar languages in our data, mainly members of the North Germanic family. The distance of the mixed corpus to the next language, Slovak, is 0.47 for the words and 0.54 for the letter trigrams. Both distances are below average. The average value for the most similar language is 0.58 for trigrams.

The most similar languages based on words: Danish, Norwegian (Bokmål), Norwegian (Nynorsk)

source | language_short_name | language_name | cos_logfreq
swe | dan | Danish |
swe | nob | Norwegian, Bokmål |
swe | nno | Norwegian, Nynorsk |
swe | loy | Loke |
swe | cat | Catalan-Valencian-Balear |

The most similar languages based on letter trigrams: Danish, Norwegian (Bokmål), Norwegian (Nynorsk)

source | language_short_name | language_name | cos_logfreq
swe | dan | Danish |
swe | nob | Norwegian, Bokmål |
swe | nno | Norwegian, Nynorsk |
swe | eng | English |
swe | nld | Dutch |
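
The comparison described in this section can be illustrated with a small sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the distance is taken to be 1 minus the cosine, the vector components are log(1 + frequency) (suggested by the column name cos_logfreq, but not confirmed by the report), and the vectors span the union of both top-1000 lists. The function names top_n, cosine_distance and letter_trigrams are hypothetical.

```python
import math
from collections import Counter


def top_n(counts, n=1000):
    """Return the n most frequent items with their counts as a dict."""
    return dict(Counter(counts).most_common(n))


def cosine_distance(freq_a, freq_b):
    """Return 1 - cosine of the angle between two log-frequency vectors.

    Assumptions (not stated in the report): the vectors are spanned by
    the union of both item lists, and each component is log(1 + frequency).
    """
    keys = set(freq_a) | set(freq_b)
    vec_a = [math.log1p(freq_a.get(k, 0)) for k in keys]
    vec_b = [math.log1p(freq_b.get(k, 0)) for k in keys]
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty list shares nothing with any other list
    return 1.0 - dot / (norm_a * norm_b)


def letter_trigrams(words):
    """Count letter trigrams over a word list, weighted by word frequency."""
    trigrams = Counter()
    for word, freq in words.items():
        for i in range(len(word) - 2):
            trigrams[word[i:i + 3]] += freq
    return trigrams
```

Under these assumptions, comparing a list with itself gives 0 up to rounding, and two lists with no shared items give exactly 1, which matches the value range stated above. The trigram comparison would reuse cosine_distance on top_n(letter_trigrams(words)).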

Processing details

Appendix to swe news 2007: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords | 0
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |

Appendix to swe news 2008: Database summary

Values for some general parameters

Parameter | Value
Number of sentences |
Number of running word forms |
Number of distinct word forms |
Number of multiwords |
Percentage of words with frequency= |
Number of sentence based co-occurrences |
Number of neighbour co-occurrences |
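
To make the parameters in these summaries concrete, the following sketch computes comparable counts for a small tokenised corpus. It is only an illustration under stated assumptions, not the procedure used to build the databases: co-occurrences are counted here as distinct word pairs without any significance filtering, multiwords are not handled, and the frequency threshold is exposed as the hypothetical argument frequency because its value is not legible in the source. The function name corpus_summary and the result keys are also hypothetical.

```python
from collections import Counter
from itertools import combinations


def corpus_summary(sentences, frequency=1):
    """Compute summary parameters for a list of tokenised sentences.

    `sentences` is a list of lists of word forms. `frequency` is a
    hypothetical threshold; the value used in the report is not legible.
    """
    word_counts = Counter(w for s in sentences for w in s)
    sentence_pairs = Counter()
    neighbour_pairs = Counter()
    for s in sentences:
        # Sentence-based co-occurrence: two distinct word forms share a sentence.
        for a, b in combinations(sorted(set(s)), 2):
            sentence_pairs[(a, b)] += 1
        # Neighbour co-occurrence: two word forms are directly adjacent.
        for left, right in zip(s, s[1:]):
            neighbour_pairs[(left, right)] += 1
    distinct = len(word_counts)
    with_freq = sum(1 for c in word_counts.values() if c == frequency)
    return {
        "sentences": len(sentences),
        "running_word_forms": sum(word_counts.values()),
        "distinct_word_forms": distinct,
        "percent_with_frequency": 100.0 * with_freq / distinct if distinct else 0.0,
        "sentence_cooccurrences": len(sentence_pairs),
        "neighbour_cooccurrences": len(neighbour_pairs),
    }
```

Counting distinct pairs keeps the sketch short. A production pipeline would more likely keep only pairs that pass a statistical significance test, which this example deliberately leaves out.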

Isolda Purchase - EDI

Isolda Purchase - EDI Isolda Purchase - EDI Document v 1.0 1 Table of Contents Table of Contents... 2 1 Introduction... 3 1.1 What is EDI?... 4 1.2 Sending and receiving documents... 4 1.3 File format... 4 1.3.1 XML (language

Läs mer

Grafisk teknik IMCDP IMCDP IMCDP. IMCDP(filter) Sasan Gooran (HT 2006) Assumptions:

Grafisk teknik IMCDP IMCDP IMCDP. IMCDP(filter) Sasan Gooran (HT 2006) Assumptions: IMCDP Grafisk teknik The impact of the placed dot is fed back to the original image by a filter Original Image Binary Image Sasan Gooran (HT 2006) The next dot is placed where the modified image has its

Läs mer

Schenker Privpak AB Telefon VAT Nr. SE Schenker ABs ansvarsbestämmelser, identiska med Box 905 Faxnr Säte: Borås

Schenker Privpak AB Telefon VAT Nr. SE Schenker ABs ansvarsbestämmelser, identiska med Box 905 Faxnr Säte: Borås Schenker Privpak AB Interface documentation for web service packageservices.asmx 2012-09-01 Version: 1.0.0 Doc. no.: I04304b Sida 2 av 7 Revision history Datum Version Sign. Kommentar 2012-09-01 1.0.0

Läs mer

Schenker Privpak AB Telefon 033-178300 VAT Nr. SE556124398001 Schenker ABs ansvarsbestämmelser, identiska med Box 905 Faxnr 033-257475 Säte: Borås

Schenker Privpak AB Telefon 033-178300 VAT Nr. SE556124398001 Schenker ABs ansvarsbestämmelser, identiska med Box 905 Faxnr 033-257475 Säte: Borås Schenker Privpak AB Interface documentation for web service packageservices.asmx 2010-10-21 Version: 1.2.2 Doc. no.: I04304 Sida 2 av 14 Revision history Datum Version Sign. Kommentar 2010-02-18 1.0.0

Läs mer

Schenker Privpak AB Telefon 033-178300 VAT Nr. SE556124398001 Schenker ABs ansvarsbestämmelser, identiska med Box 905 Faxnr 033-257475 Säte: Borås

Schenker Privpak AB Telefon 033-178300 VAT Nr. SE556124398001 Schenker ABs ansvarsbestämmelser, identiska med Box 905 Faxnr 033-257475 Säte: Borås Schenker Privpak AB Interface documentation for Parcel Search 2011-10-18 Version: 1 Doc. no.: I04306 Sida 2 av 5 Revision history Datum Version Sign. Kommentar 2011-10-18 1.0.0 PD First public version.

Läs mer

Writing with context. Att skriva med sammanhang

Writing with context. Att skriva med sammanhang Writing with context Att skriva med sammanhang What makes a piece of writing easy and interesting to read? Discuss in pairs and write down one word (in English or Swedish) to express your opinion http://korta.nu/sust(answer

Läs mer

Styrteknik: Binära tal, talsystem och koder D3:1

Styrteknik: Binära tal, talsystem och koder D3:1 Styrteknik: Binära tal, talsystem och koder D3:1 Digitala kursmoment D1 Boolesk algebra D2 Grundläggande logiska funktioner D3 Binära tal, talsystem och koder Styrteknik :Binära tal, talsystem och koder

Läs mer

Isometries of the plane

Isometries of the plane Isometries of the plane Mikael Forsberg August 23, 2011 Abstract Här följer del av ett dokument om Tesselering som jag skrivit för en annan kurs. Denna del handlar om isometrier och innehåller bevis för

Läs mer

Stiftelsen Allmänna Barnhuset KARLSTADS UNIVERSITET

Stiftelsen Allmänna Barnhuset KARLSTADS UNIVERSITET Stiftelsen Allmänna Barnhuset KARLSTADS UNIVERSITET National Swedish parental studies using the same methodology have been performed in 1980, 2000, 2006 and 2011 (current study). In 1980 and 2000 the studies

Läs mer

1. Compute the following matrix: (2 p) 2. Compute the determinant of the following matrix: (2 p)

1. Compute the following matrix: (2 p) 2. Compute the determinant of the following matrix: (2 p) UMEÅ UNIVERSITY Department of Mathematics and Mathematical Statistics Pre-exam in mathematics Linear algebra 2012-02-07 1. Compute the following matrix: (2 p 3 1 2 3 2 2 7 ( 4 3 5 2 2. Compute the determinant

Läs mer

Viktig information för transmittrar med option /A1 Gold-Plated Diaphragm

Viktig information för transmittrar med option /A1 Gold-Plated Diaphragm Viktig information för transmittrar med option /A1 Gold-Plated Diaphragm Guldplätering kan aldrig helt stoppa genomträngningen av vätgas, men den får processen att gå långsammare. En tjock guldplätering

Läs mer

Module 6: Integrals and applications

Module 6: Integrals and applications Department of Mathematics SF65 Calculus Year 5/6 Module 6: Integrals and applications Sections 6. and 6.5 and Chapter 7 in Calculus by Adams and Essex. Three lectures, two tutorials and one seminar. Important

Läs mer

SAMMANFATTNING AV SUMMARY OF

SAMMANFATTNING AV SUMMARY OF Detta dokument är en enkel sammanfattning i syfte att ge en första orientering av investeringsvillkoren. Fullständiga villkor erhålles genom att registera sin e- postadress på ansökningssidan för FastForward

Läs mer

Webbregistrering pa kurs och termin

Webbregistrering pa kurs och termin Webbregistrering pa kurs och termin 1. Du loggar in på www.kth.se via den personliga menyn Under fliken Kurser och under fliken Program finns på höger sida en länk till Studieöversiktssidan. På den sidan

Läs mer

Support for Artist Residencies

Support for Artist Residencies 1. Basic information 1.1. Name of the Artist-in-Residence centre 0/100 1.2. Name of the Residency Programme (if any) 0/100 1.3. Give a short description in English of the activities that the support is

Läs mer

Rastercell. Digital Rastrering. AM & FM Raster. Rastercell. AM & FM Raster. Sasan Gooran (VT 2007) Rastrering. Rastercell. Konventionellt, AM

Rastercell. Digital Rastrering. AM & FM Raster. Rastercell. AM & FM Raster. Sasan Gooran (VT 2007) Rastrering. Rastercell. Konventionellt, AM Rastercell Digital Rastrering Hybridraster, Rastervinkel, Rotation av digitala bilder, AM/FM rastrering Sasan Gooran (VT 2007) Önskat mått * 2* rastertätheten = inläsningsupplösning originalets mått 2

Läs mer

EXPERT SURVEY OF THE NEWS MEDIA

EXPERT SURVEY OF THE NEWS MEDIA EXPERT SURVEY OF THE NEWS MEDIA THE SHORENSTEIN CENTER ON THE PRESS, POLITICS & PUBLIC POLICY JOHN F. KENNEDY SCHOOL OF GOVERNMENT, HARVARD UNIVERSITY, CAMBRIDGE, MA 0238 PIPPA_NORRIS@HARVARD.EDU. FAX:

Läs mer

Boiler with heatpump / Värmepumpsberedare

Boiler with heatpump / Värmepumpsberedare Boiler with heatpump / Värmepumpsberedare QUICK START GUIDE / SNABBSTART GUIDE More information and instruction videos on our homepage www.indol.se Mer information och instruktionsvideos på vår hemsida

Läs mer

WindPRO version 2.7.448 feb 2010. SHADOW - Main Result. Calculation: inkl Halmstad SWT 2.3. Assumptions for shadow calculations. Shadow receptor-input

WindPRO version 2.7.448 feb 2010. SHADOW - Main Result. Calculation: inkl Halmstad SWT 2.3. Assumptions for shadow calculations. Shadow receptor-input SHADOW - Main Result Calculation: inkl Halmstad SWT 2.3 Assumptions for shadow calculations Maximum distance for influence Calculate only when more than 20 % of sun is covered by the blade Please look

Läs mer

Make a speech. How to make the perfect speech. söndag 6 oktober 13

Make a speech. How to make the perfect speech. söndag 6 oktober 13 Make a speech How to make the perfect speech FOPPA FOPPA Finding FOPPA Finding Organizing FOPPA Finding Organizing Phrasing FOPPA Finding Organizing Phrasing Preparing FOPPA Finding Organizing Phrasing

Läs mer

1. Unpack content of zip-file to temporary folder and double click Setup

1. Unpack content of zip-file to temporary folder and double click Setup Instruktioner Dokumentnummer/Document Number Titel/Title Sida/Page 13626-1 BM800 Data Interface - Installation Instructions 1/8 Utfärdare/Originator Godkänd av/approved by Gäller från/effective date Mats

Läs mer

Calculate check digits according to the modulus-11 method

Calculate check digits according to the modulus-11 method 2016-12-01 Beräkning av kontrollsiffra 11-modulen Calculate check digits according to the modulus-11 method Postadress: 105 19 Stockholm Besöksadress: Palmfeltsvägen 5 www.bankgirot.se Bankgironr: 160-9908

Läs mer

PORTSECURITY IN SÖLVESBORG

PORTSECURITY IN SÖLVESBORG PORTSECURITY IN SÖLVESBORG Kontaktlista i skyddsfrågor / List of contacts in security matters Skyddschef/PFSO Tord Berg Phone: +46 456 422 44. Mobile: +46 705 82 32 11 Fax: +46 456 104 37. E-mail: tord.berg@sbgport.com

Läs mer

Managing addresses in the City of Kokkola Underhåll av adresser i Karleby stad

Managing addresses in the City of Kokkola Underhåll av adresser i Karleby stad Managing addresses in the City of Kokkola Underhåll av adresser i Karleby stad Nordic Address Meeting Odense 3.-4. June 2010 Asko Pekkarinen Anna Kujala Facts about Kokkola Fakta om Karleby Population:

Läs mer

Materialplanering och styrning på grundnivå. 7,5 högskolepoäng

Materialplanering och styrning på grundnivå. 7,5 högskolepoäng Materialplanering och styrning på grundnivå Provmoment: Ladokkod: Tentamen ges för: Skriftlig tentamen TI6612 Af3-Ma, Al3, Log3,IBE3 7,5 högskolepoäng Namn: (Ifylles av student) Personnummer: (Ifylles

Läs mer

http://marvel.com/games/play/31/create_your_own_superhero http://www.heromachine.com/

http://marvel.com/games/play/31/create_your_own_superhero http://www.heromachine.com/ Name: Year 9 w. 4-7 The leading comic book publisher, Marvel Comics, is starting a new comic, which it hopes will become as popular as its classics Spiderman, Superman and The Incredible Hulk. Your job

Läs mer

Metodprov för kontroll av svetsmutterförband Kontrollbestämmelse Method test for inspection of joints of weld nut Inspection specification

Metodprov för kontroll av svetsmutterförband Kontrollbestämmelse Method test for inspection of joints of weld nut Inspection specification Stämpel/Etikett Security stamp/lable Metodprov för kontroll av svetsmutterförband Kontrollbestämmelse Method test for inspection of joints of weld nut Inspection specification Granskad av Reviewed by Göran

Läs mer

Documentation SN 3102

Documentation SN 3102 This document has been created by AHDS History and is based on information supplied by the depositor /////////////////////////////////////////////////////////// THE EUROPEAN STATE FINANCE DATABASE (Director:

Läs mer

Beijer Electronics AB 2000, MA00336A, 2000-12

Beijer Electronics AB 2000, MA00336A, 2000-12 Demonstration driver English Svenska Beijer Electronics AB 2000, MA00336A, 2000-12 Beijer Electronics AB reserves the right to change information in this manual without prior notice. All examples in this

Läs mer

Protokoll Föreningsutskottet 2013-10-22

Protokoll Föreningsutskottet 2013-10-22 Protokoll Föreningsutskottet 2013-10-22 Närvarande: Oliver Stenbom, Andreas Estmark, Henrik Almén, Ellinor Ugland, Oliver Jonstoij Berg. 1. Mötets öppnande. Ordförande Oliver Stenbom öppnade mötet. 2.

Läs mer

Dagens Nyheter STHLM Total. A Stockholm paper made by and for those that love Stockholm

Dagens Nyheter STHLM Total. A Stockholm paper made by and for those that love Stockholm Summary Dagens Nyheter STHLM Total. A Stockholm paper made by and for those that love Stockholm Our readers know the product as DagensNyheter STHLM. At the same time our advertisers know this specific

Läs mer

Evaluation Ny Nordisk Mat II Appendix 1. Questionnaire evaluation Ny Nordisk Mat II

Evaluation Ny Nordisk Mat II Appendix 1. Questionnaire evaluation Ny Nordisk Mat II Evaluation Ny Nordisk Mat II Appendix 1. Questionnaire evaluation Ny Nordisk Mat II English version A. About the Program in General We will now ask some questions about your relationship to the program

Läs mer

En bild säger mer än tusen ord?

En bild säger mer än tusen ord? Faculteit Letteren en Wijsbegeerte Academiejaar 2009-2010 En bild säger mer än tusen ord? En studie om dialogen mellan illustrationer och text i Tiina Nunnallys engelska översättning av Pippi Långstrump

Läs mer

Annonsformat desktop. Startsida / områdesstartsidor. Artikel/nyhets-sidor. 1. Toppbanner, format 1050x180 pxl. Format 1060x180 px + 250x240 pxl.

Annonsformat desktop. Startsida / områdesstartsidor. Artikel/nyhets-sidor. 1. Toppbanner, format 1050x180 pxl. Format 1060x180 px + 250x240 pxl. Annonsformat desktop Startsida / områdesstartsidor 1. Toppbanner, format 1050x180 pxl. Bigbang (toppbanner + bannerplats 2) Format 1060x180 px + 250x240 pxl. 2. DW, format 250x240 pxl. 3. TW, format 250x360

Läs mer

BOENDEFORMENS BETYDELSE FÖR ASYLSÖKANDES INTEGRATION Lina Sandström

BOENDEFORMENS BETYDELSE FÖR ASYLSÖKANDES INTEGRATION Lina Sandström BOENDEFORMENS BETYDELSE FÖR ASYLSÖKANDES INTEGRATION Lina Sandström Frågeställningar Kan asylprocessen förstås som en integrationsprocess? Hur fungerar i sådana fall denna process? Skiljer sig asylprocessen

Läs mer

Tentamen i Matematik 2: M0030M.

Tentamen i Matematik 2: M0030M. Tentamen i Matematik 2: M0030M. Datum: 203-0-5 Skrivtid: 09:00 4:00 Antal uppgifter: 2 ( 30 poäng ). Examinator: Norbert Euler Tel: 0920-492878 Tillåtna hjälpmedel: Inga Betygsgränser: 4p 9p = 3; 20p 24p

Läs mer

Kurskod: TAIU06 MATEMATISK STATISTIK Provkod: TENA 15 August 2016, 8:00-12:00. English Version

Kurskod: TAIU06 MATEMATISK STATISTIK Provkod: TENA 15 August 2016, 8:00-12:00. English Version Kurskod: TAIU06 MATEMATISK STATISTIK Provkod: TENA 15 August 2016, 8:00-12:00 Examiner: Xiangfeng Yang (Tel: 070 0896661). Please answer in ENGLISH if you can. a. Allowed to use: a calculator, Formelsamling

Läs mer

EXTERNAL ASSESSMENT SAMPLE TASKS SWEDISH BREAKTHROUGH LSPSWEB/0Y09

EXTERNAL ASSESSMENT SAMPLE TASKS SWEDISH BREAKTHROUGH LSPSWEB/0Y09 EXTENAL ASSESSENT SAPLE TASKS SWEDISH BEAKTHOUGH LSPSWEB/0Y09 Asset Languages External Assessment Sample Tasks Breakthrough Stage Listening and eading Swedish Contents Page Introduction 2 Listening Sample

Läs mer

Questionnaire for visa applicants Appendix A

Questionnaire for visa applicants Appendix A Questionnaire for visa applicants Appendix A Business Conference visit 1 Personal particulars Surname Date of birth (yr, mth, day) Given names (in full) 2 Your stay in Sweden A. Who took the initiative

Läs mer

Datasäkerhet och integritet

Datasäkerhet och integritet Chapter 4 module A Networking Concepts OSI-modellen TCP/IP This module is a refresher on networking concepts, which are important in information security A Simple Home Network 2 Unshielded Twisted Pair

Läs mer

Kurskod: TAIU06 MATEMATISK STATISTIK Provkod: TENA 17 August 2015, 8:00-12:00. English Version

Kurskod: TAIU06 MATEMATISK STATISTIK Provkod: TENA 17 August 2015, 8:00-12:00. English Version Kurskod: TAIU06 MATEMATISK STATISTIK Provkod: TENA 17 August 2015, 8:00-12:00 Examiner: Xiangfeng Yang (Tel: 070 2234765). Please answer in ENGLISH if you can. a. Allowed to use: a calculator, Formelsamling

Läs mer

F ξ (x) = f(y, x)dydx = 1. We say that a random variable ξ has a distribution F (x), if. F (x) =

F ξ (x) = f(y, x)dydx = 1. We say that a random variable ξ has a distribution F (x), if. F (x) = Problems for the Basic Course in Probability (Fall 00) Discrete Probability. Die A has 4 red and white faces, whereas die B has red and 4 white faces. A fair coin is flipped once. If it lands on heads,

Läs mer

Övning 5 ETS052 Datorkommuniktion Routing och Networking

Övning 5 ETS052 Datorkommuniktion Routing och Networking Övning 5 TS5 Datorkommuniktion - 4 Routing och Networking October 7, 4 Uppgift. Rita hur ett paket som skickas ut i nätet nedan från nod, med flooding, sprider sig genom nätet om hop count = 3. Solution.

Läs mer

Förändrade förväntningar

Förändrade förväntningar Förändrade förväntningar Deloitte Ca 200 000 medarbetare 150 länder 700 kontor Omsättning cirka 31,3 Mdr USD Spetskompetens av världsklass och djup lokal expertis för att hjälpa klienter med de insikter

Läs mer

DVG C01 TENTAMEN I PROGRAMSPRÅK PROGRAMMING LANGUAGES EXAMINATION :15-13: 15

DVG C01 TENTAMEN I PROGRAMSPRÅK PROGRAMMING LANGUAGES EXAMINATION :15-13: 15 DVG C01 TENTAMEN I PROGRAMSPRÅK PROGRAMMING LANGUAGES EXAMINATION 120607 08:15-13: 15 Ansvarig Lärare: Donald F. Ross Hjälpmedel: Bilaga A: BNF-definition En ordbok: studentenshemspråk engelska Betygsgräns:

Läs mer

SWESIAQ Swedish Chapter of International Society of Indoor Air Quality and Climate

SWESIAQ Swedish Chapter of International Society of Indoor Air Quality and Climate Swedish Chapter of International Society of Indoor Air Quality and Climate Aneta Wierzbicka Swedish Chapter of International Society of Indoor Air Quality and Climate Independent and non-profit Swedish

Läs mer

Kurskod: TAMS11 Provkod: TENB 28 August 2014, 08:00-12:00. English Version

Kurskod: TAMS11 Provkod: TENB 28 August 2014, 08:00-12:00. English Version Kurskod: TAMS11 Provkod: TENB 28 August 2014, 08:00-12:00 Examinator/Examiner: Xiangfeng Yang (Tel: 070 2234765) a. You are permitted to bring: a calculator; formel -och tabellsamling i matematisk statistik

Läs mer

FORSKNINGSKOMMUNIKATION OCH PUBLICERINGS- MÖNSTER INOM UTBILDNINGSVETENSKAP

FORSKNINGSKOMMUNIKATION OCH PUBLICERINGS- MÖNSTER INOM UTBILDNINGSVETENSKAP FORSKNINGSKOMMUNIKATION OCH PUBLICERINGS- MÖNSTER INOM UTBILDNINGSVETENSKAP En studie av svensk utbildningsvetenskaplig forskning vid tre lärosäten VETENSKAPSRÅDETS RAPPORTSERIE 10:2010 Forskningskommunikation

Läs mer

FÖRBERED UNDERLAG FÖR BEDÖMNING SÅ HÄR

FÖRBERED UNDERLAG FÖR BEDÖMNING SÅ HÄR FÖRBERED UNDERLAG FÖR BEDÖMNING SÅ HÄR Kontrollera vilka kurser du vill söka under utbytet. Fyll i Basis for nomination for exchange studies i samråd med din lärare. För att läraren ska kunna göra en korrekt

Läs mer

Biblioteket.se. A library project, not a web project. Daniel Andersson. Biblioteket.se. New Communication Channels in Libraries Budapest Nov 19, 2007

Biblioteket.se. A library project, not a web project. Daniel Andersson. Biblioteket.se. New Communication Channels in Libraries Budapest Nov 19, 2007 A library project, not a web project New Communication Channels in Libraries Budapest Nov 19, 2007 Daniel Andersson, daniel@biblioteket.se 1 Daniel Andersson Project manager and CDO at, Stockholm Public

Läs mer

Statistical Quality Control Statistisk kvalitetsstyrning. 7,5 högskolepoäng. Ladok code: 41T05A, Name: Personal number:

Statistical Quality Control Statistisk kvalitetsstyrning. 7,5 högskolepoäng. Ladok code: 41T05A, Name: Personal number: Statistical Quality Control Statistisk kvalitetsstyrning 7,5 högskolepoäng Ladok code: 41T05A, The exam is given to: 41I02B IBE11, Pu2, Af2-ma Name: Personal number: Date of exam: 1 June Time: 9-13 Hjälpmedel

Läs mer

STANDARD. UTM Ingegerd Annergren UTMS Lina Orbéus. UTMD Anders Johansson UTMS Jan Sandberg

STANDARD. UTM Ingegerd Annergren UTMS Lina Orbéus. UTMD Anders Johansson UTMS Jan Sandberg 1(7) Distribution: Scania, Supplier Presskruvar med rundat huvud - Metrisk gänga med grov delning Innehåll Sida Orientering... 1 Ändringar från föregående utgåva... 1 1 Material och hållfasthet... 1 2

Läs mer

Kvalitetsarbete I Landstinget i Kalmar län. 24 oktober 2007 Eva Arvidsson

Kvalitetsarbete I Landstinget i Kalmar län. 24 oktober 2007 Eva Arvidsson Kvalitetsarbete I Landstinget i Kalmar län 24 oktober 2007 Eva Arvidsson Bakgrund Sammanhållen primärvård 2005 Nytt ekonomiskt system Olika tradition och förutsättningar Olika pågående projekt Get the

Läs mer

1.1 Invoicing Requirements

1.1 Invoicing Requirements 1.1 Invoicing Requirements Document name The document should clearly state INVOICE, DOWNPAYMENT REQUEST or CREDIT NOTE. Invoice lines and credit lines cannot be sent in the same document. Invoicing currency.

Läs mer

Senaste trenderna från testforskningen: Passar de industrin? Robert Feldt,

Senaste trenderna från testforskningen: Passar de industrin? Robert Feldt, Senaste trenderna från testforskningen: Passar de industrin? Robert Feldt, robert.feldt@bth.se Vad är på gång i forskningen? (ICST 2015 & 2016) Security testing Mutation testing GUI testing Model-based

Läs mer

55R Kia Carens 2013»

55R Kia Carens 2013» 55R-013714 60 Kia Carens 2013» 630-0810 rev. 2014-06-09 DC Congratulations on purchasing an ATS towbar Alexo Towbars Sweden offer quality towbars produced as a result of direct market research. Every towbar

Läs mer

55R Volvo XC » Volvo XC » Volvo S » Volvo V » Volvo XC » Volvo V »

55R Volvo XC » Volvo XC » Volvo S » Volvo V » Volvo XC » Volvo V » 55R-01 3687 90 Volvo XC60 2008-2013» Volvo XC60 2013» Volvo S60 2010» Volvo V60 2010» Volvo XC70 2007» Volvo V70 2007» 668-0312 rev. 2014-05-07 RG Congratulations on purchasing an ATS towbar Alexo Towbars

Läs mer

Dokumentnamn Order and safety regulations for Hässleholms Kretsloppscenter. Godkänd/ansvarig Gunilla Holmberg. Kretsloppscenter

Dokumentnamn Order and safety regulations for Hässleholms Kretsloppscenter. Godkänd/ansvarig Gunilla Holmberg. Kretsloppscenter 1(5) The speed through the entire area is 30 km/h, unless otherwise indicated. Beware of crossing vehicles! Traffic signs, guardrails and exclusions shall be observed and followed. Smoking is prohibited

Läs mer

The Swedish National Patient Overview (NPO)

The Swedish National Patient Overview (NPO) The Swedish National Patient Overview (NPO) Background and status 2009 Tieto Corporation Christer Bergh Manager of Healthcare Sweden Tieto, Healthcare & Welfare christer.bergh@tieto.com Agenda Background

Läs mer

Custom-made software solutions for increased transport quality and creation of cargo specific lashing protocols.

Custom-made software solutions for increased transport quality and creation of cargo specific lashing protocols. Custom-made software solutions for increased transport quality and creation of cargo specific lashing protocols. ExcelLoad simulates the maximum forces that may appear during a transport no matter if the

Läs mer

Rev No. Magnetic gripper 3

Rev No. Magnetic gripper 3 Magnetic gripper 1 Magnetic gripper 2 Magnetic gripper 3 Magnetic gripper 4 Pneumatic switchable permanent magnet. A customized gripper designed to handle large objects in/out of press break/laser cutting

Läs mer

Om oss DET PERFEKTA KOMPLEMENTET THE PERFECT COMPLETION 04 EN BINZ ÄR PRECIS SÅ BRA SOM DU FÖRVÄNTAR DIG A BINZ IS JUST AS GOOD AS YOU THINK 05

Om oss DET PERFEKTA KOMPLEMENTET THE PERFECT COMPLETION 04 EN BINZ ÄR PRECIS SÅ BRA SOM DU FÖRVÄNTAR DIG A BINZ IS JUST AS GOOD AS YOU THINK 05 Om oss Vi på Binz är glada att du är intresserad av vårt support-system för begravningsbilar. Sedan mer än 75 år tillverkar vi specialfordon i Lorch för de flesta olika användningsändamål, och detta enligt

Läs mer

PRESS FÄLLKONSTRUKTION FOLDING INSTRUCTIONS

PRESS FÄLLKONSTRUKTION FOLDING INSTRUCTIONS PRESS FÄLLKONSTRUKTION FOLDING INSTRUCTIONS Vänd bordet upp och ner eller ställ det på långsidan. Tryck ner vid PRESS och fäll benen samtidigt. Om benen sitter i spänn tryck benen mot kortsidan före de

Läs mer

PROFINET MELLAN EL6631 OCH EK9300

PROFINET MELLAN EL6631 OCH EK9300 PROFINET MELLAN EL6631 OCH EK9300 Installation och beskrivningsfil Exemplet visar igångkörning av profinet mellan Beckhoff-master och Beckhoff-kopplare för EL-terminaler. Med ny hårdvara är det viktigt

Läs mer

Workplan Food. Spring term 2016 Year 7. Name:

Workplan Food. Spring term 2016 Year 7. Name: Workplan Food Spring term 2016 Year 7 Name: During the time we work with this workplan you will also be getting some tests in English. You cannot practice for these tests. Compulsory o Read My Canadian

Läs mer

Pharmacovigilance lagstiftning - PSUR

Pharmacovigilance lagstiftning - PSUR Pharmacovigilance lagstiftning - PSUR Karl Mikael Kälkner Tf enhetschef ES1 EUROPAPARLAMENTETS OCH RÅDETS DIREKTIV 2010/84/EU av den 15 december om ändring, när det gäller säkerhetsövervakning av läkemedel,

Läs mer

Typografi, text & designperspektiv

Typografi, text & designperspektiv Typografi, text & designperspektiv Serif Hårstreck Stapel Heplx x-höjd Baslinje Grundstreck Serif Underhäng Inre form I dag Lite bakgrund Övergripande grunder inom typografi Text hantering Elva Synlig

Läs mer

Kurskod: TAMS11 Provkod: TENB 07 April 2015, 14:00-18:00. English Version

Kurskod: TAMS11 Provkod: TENB 07 April 2015, 14:00-18:00. English Version Kurskod: TAMS11 Provkod: TENB 07 April 2015, 14:00-18:00 Examiner: Xiangfeng Yang (Tel: 070 2234765). Please answer in ENGLISH if you can. a. You are allowed to use: a calculator; formel -och tabellsamling

Läs mer

- den bredaste guiden om Mallorca på svenska! -

- den bredaste guiden om Mallorca på svenska! - - den bredaste guiden om Mallorca på svenska! - Driver du företag, har en affärsrörelse på Mallorca eller relaterad till Mallorca och vill nå ut till våra läsare? Då har du möjlighet att annonsera på Mallorcaguide.se

Läs mer

Country report: Sweden

Country report: Sweden Country report: Sweden Anneli Petersson, PhD. Swedish Gas Centre Sweden Statistics for 2006 1.2 TWh produced per year 223 plants 138 municipal sewage treatment plants 60 landfills 3 Industrial wastewater

Läs mer

Item 6 - Resolution for preferential rights issue.

Item 6 - Resolution for preferential rights issue. Item 6 - Resolution for preferential rights issue. The board of directors in Tobii AB (publ), reg. no. 556613-9654, (the Company ) has on November 5, 2016, resolved to issue shares in the Company, subject

Läs mer

FOI MEMO. Jonas Hallberg FOI Memo 5253

FOI MEMO. Jonas Hallberg FOI Memo 5253 Projekt/Project Security culture and information technology Projektnummer/Project no Kund/Customer B34103 MSB Sidnr/Page no 1 (5) Handläggare/Our reference Datum/Date Jonas Hallberg 2015-01-21 FOI Memo

Läs mer

Mönster. Ulf Cederling Växjö University Ulf.Cederling@msi.vxu.se http://www.msi.vxu.se/~ulfce. Slide 1

Mönster. Ulf Cederling Växjö University Ulf.Cederling@msi.vxu.se http://www.msi.vxu.se/~ulfce. Slide 1 Mönster Ulf Cederling Växjö University UlfCederling@msivxuse http://wwwmsivxuse/~ulfce Slide 1 Beskrivningsmall Beskrivningsmallen är inspirerad av den som användes på AG Communication Systems (AGCS) Linda

Läs mer

Jämförelse mellan FCI-reglerna och de svenska reglerna för elitklass lydnad - ur ett tävlandeperspektiv

Jämförelse mellan FCI-reglerna och de svenska reglerna för elitklass lydnad - ur ett tävlandeperspektiv Jämförelse mellan FCI-reglerna och de svenska reglerna för elitklass lydnad - ur ett tävlandeperspektiv Genomgången gjord av Niina Svartberg april 2009 Tävlingsupplägg (Layout of the competition) sid 5

Läs mer

Analys och bedömning av företag och förvaltning. Omtentamen. Ladokkod: SAN023. Tentamen ges för: Namn: (Ifylles av student.

Analys och bedömning av företag och förvaltning. Omtentamen. Ladokkod: SAN023. Tentamen ges för: Namn: (Ifylles av student. Analys och bedömning av företag och förvaltning Omtentamen Ladokkod: SAN023 Tentamen ges för: Namn: (Ifylles av student Personnummer: (Ifylles av student) Tentamensdatum: Tid: 2014-02-17 Hjälpmedel: Lexikon

Läs mer

Service och bemötande. Torbjörn Johansson, GAF Pär Magnusson, Öjestrand GC

Service och bemötande. Torbjörn Johansson, GAF Pär Magnusson, Öjestrand GC Service och bemötande Torbjörn Johansson, GAF Pär Magnusson, Öjestrand GC Vad är service? Åsikter? Service är något vi upplever i vårt möte med butikssäljaren, med kundserviceavdelningen, med företagets

Läs mer

Provlektion Just Stuff B Textbook Just Stuff B Workbook

Provlektion Just Stuff B Textbook Just Stuff B Workbook Provlektion Just Stuff B Textbook Just Stuff B Workbook Genomförande I provlektionen får ni arbeta med ett avsnitt ur kapitlet Hobbies - The Rehearsal. Det handlar om några elever som skall sätta upp Romeo

Läs mer

Alias 1.0 Rollbaserad inloggning

Alias 1.0 Rollbaserad inloggning Alias 1.0 Rollbaserad inloggning Alias 1.0 Rollbaserad inloggning Magnus Bergqvist Tekniskt Säljstöd Magnus.Bergqvist@msb.se 072-502 09 56 Alias 1.0 Rollbaserad inloggning Funktionen Förutsättningar Funktionen

Läs mer

2 Uppgifter. Uppgifter. Svaren börjar på sidan 35. Uppgift 1. Steg 1. Problem 1 : 2. Problem 1 : 3

2 Uppgifter. Uppgifter. Svaren börjar på sidan 35. Uppgift 1. Steg 1. Problem 1 : 2. Problem 1 : 3 1 2 Uppgifter Uppgifter Svaren börjar på sidan 35. Uppgift 1. Steg 1 Problem 1 : 2 Problem 1 : 3 Uppgifter 3 Svarsalternativ. Answer alternative 1. a Svarsalternativ. Answer alternative 1. b Svarsalternativ.

Läs mer

English. Things to remember

English. Things to remember English Things to remember Essay Kolla instruktionerna noggrant! Gå tillbaka och läs igenom igen och kolla att allt är med. + Håll dig till ämnet! Vem riktar ni er till? Var ska den publiceras? Vad är

Läs mer

STORSEMINARIET 3. Amplitud. frekvens. frekvens uppgift 9.4 (cylindriskt rör)

STORSEMINARIET 3. Amplitud. frekvens. frekvens uppgift 9.4 (cylindriskt rör) STORSEMINARIET 1 uppgift SS1.1 A 320 g block oscillates with an amplitude of 15 cm at the end of a spring, k =6Nm -1.Attimet = 0, the displacement x = 7.5 cm and the velocity is positive, v > 0. Write

Läs mer

Undergraduate research:

Undergraduate research: Undergraduate research: Laboratory experiments with many variables Arne Rosén 1, Magnus Karlsteen 2, Jonathan Weidow 2, Andreas Isacsson 2 and Ingvar Albinsson 1 1 Department of Physics, University of

Läs mer

8 < x 1 + x 2 x 3 = 1, x 1 +2x 2 + x 4 = 0, x 1 +2x 3 + x 4 = 2. x 1 2x 12 1A är inverterbar, och bestäm i så fall dess invers.

8 < x 1 + x 2 x 3 = 1, x 1 +2x 2 + x 4 = 0, x 1 +2x 3 + x 4 = 2. x 1 2x 12 1A är inverterbar, och bestäm i så fall dess invers. MÄLARDALENS HÖGSKOLA Akademin för utbildning, kultur och kommunikation Avdelningen för tillämpad matematik Examinator: Erik Darpö TENTAMEN I MATEMATIK MAA150 Vektoralgebra TEN1 Datum: 9januari2015 Skrivtid:

Läs mer

H0008 Skrivskydd FBWF

H0008 Skrivskydd FBWF Skrivskydd FBWF Skrivskydd FBWF (File-Based Write Filter) är en Microsoft komponent som finns med i Windows Embedded image. Det finns inte för Windows CE/Compact 7 operativ, hanteringen av skrivningar

Läs mer

säkerhetsutrustning / SAFETY EQUIPMENT

säkerhetsutrustning / SAFETY EQUIPMENT säkerhetsutrustning / SAFETY EQUIPMENT Hastighetsvakt / Speed monitor Kellves hastighetsvakter används för att stoppa bandtransportören när dess hastighet sjunker under beräknade minimihastigheten. Kellve

Läs mer

SEKUNDERNA - THE SECONDS, FILM/PROJECT

SEKUNDERNA - THE SECONDS, FILM/PROJECT SEKUNDERNA - THE SECONDS, 2004-2007 FILM/PROJECT Verkbeskrivning - Work description Filmen sekunderna besår av 365 stillbilder överförda till 40:min 35mm, HD stum film. Den 6 juli 2004 kl.18.30 utfördes

Läs mer

Tänder din grill på sextio sekunder. Lights your grill in sixty seconds.

Tänder din grill på sextio sekunder. Lights your grill in sixty seconds. LOOFTLIGHTER Tänder din grill på sextio sekunder. Lights your grill in sixty seconds. Hur den fungerar Med Looftlighter behöver du aldrig mer använda tändvätska för att tända din grill. Istället används

Läs mer

Översättning av galleriet. Hjälp till den som vill...

Översättning av galleriet. Hjälp till den som vill... Hjälp till den som vill... $txt['aeva_title'] = 'Galleri'; $txt['aeva_admin'] = 'Admin'; $txt['aeva_add_title'] = 'Titel'; $txt['aeva_add_desc'] = 'Beskrivning'; $txt['aeva_add_file'] = 'Fil att ladda

Läs mer

Quick-guide to Min ansökan

Quick-guide to Min ansökan Version 2015-05-12 Quick-guide to Min ansökan Before filling in the application To be able to fill in an application you need to create a user account (instructions on p. 3). If you have already created

Läs mer

Anvisning för Guide for

Anvisning för Guide for Anvisning för Guide for PRISMA SENSOR 1 96243235zPC Montering i tak/installation in the ceiling Byte av kupa/change of diffuser 2 Installation Installation från gavel / Installation from the end Installationskabel

Arctic. Design by Rolf Fransson

Endless possibilities of combinations. Oändliga kombinationsmöjligheter. If you are looking for a range of storage furniture whose limits of combination are set by

Exam in Mathematics 2: M0030M.

Date: 2010-01-12. Writing time: 09:00-14:00. Number of problems: 6 (30 points). Teacher on duty: Norbert Euler. Telephone: 0920-492878. Permitted aids: None. For all the problems

Utfärdad av Compiled by Tjst Dept. Telefon Telephone Datum Date Utg nr Edition No. Dokumentnummer Document No.

Stämpel/Etikett / Security stamp/label. PROVNINGSBESTÄMMELSE OFÖRSTÖRANDE PROVNING AV STÅLGJUTGODS / TEST SPECIFICATION NON-DESTRUCTIVE TESTING OF STEEL CASTINGS. Granskad av / Reviewed by Göran Magnusson. Tjst / Dept.

FORSBERGS SKOLA DISTANSKURS BREV 24

One of the most important accessories today is the very familiar shopping bag, now in common use throughout many countries of the world. Created to carry all sorts of

Evaluation of SFI, autumn term 2013

Library visits: 3%, Not answered 3%, 26%, 68%. "I hope to go to the library once every two weeks." "I think its important to come to library but maybe not every week." "I like because

2.1 Installation of driver using Internet; 2.2 Installation of driver from disk

Content / Innehåll. Manual English Demodriver. Foreword. Introduction. Install and update driver: 2.1 Installation of driver using Internet; 2.2 Installation of driver from disk. Connecting the terminal to the PLC system

2(x + 1) x f(x) = 3. Find the area of the surface generated by rotating the curve. y = x 3, 0 x 1,

MÄLARDALEN UNIVERSITY, School of Education, Culture and Communication, Department of Applied Mathematics. Examiner: Lars-Göran Larsson. EXAMINATION IN MATHEMATICS MAA5 Single Variable Calculus, TEN Date: 06--0

Click here to change the format

Marianne Andrén, General Manager, marianne.andren@sandviken.se, Sandbacka Park, Högbovägen 45, SE 811 32 Sandviken. Telephone: +46 26 24 21 33. Mobile: +46 70 230 67 41. www.isea.se. From the Off e project

Association Committee 32, Bridges and Tunnels

Award for an outstanding structure in the field of bridges and tunnels. www.nvfnorden.org. Statutes in brief: Structures of importance to the art of engineering. Need not

The cornerstone of Swedish disability policy is the principle that everyone is of equal value and has equal rights.

Swedish disability policy: service and care for people with functional impairments. The cornerstone of Swedish disability policy is the principle that everyone is of equal value and has equal rights. The
