Bri's world: electronics, programming and more


German and English word lists for download, created from Wikipedia

For my last project I needed a list of German words, but I could not find a good, free one. So I had the idea of creating a word list from all Wikipedia articles.

Download free German and English word lists

For those who only want to download the word lists (license: Creative Commons, as used by Wikipedia):

All words from German Wikipedia: word_list_german_all.txt.7z
All spell-checked words from German Wikipedia: word_list_german_spell_checked.txt.7z
All spell-checked words from German Wikipedia, case-insensitive: word_list_german_uppercase_spell_checked.txt.7z
All words from English Wikipedia: word_list_english_all.txt.7z
All spell-checked words from English Wikipedia: word_list_english_spell_checked.txt.7z
All spell-checked words from English Wikipedia, case-insensitive: word_list_english_uppercase_spell_checked.txt.7z

Those who want to create their own word lists should read the following sections.

Creating a word list from Wikipedia

Open ZIM

All Wikipedia articles are available for offline reading. The whole of Wikipedia can be downloaded as a compressed archive in Open ZIM format: https://download.kiwix.org/zim/wikipedia

libzim

A library called libzim can decompress and read the articles. It is written in C++, which has the disadvantage that it is not easy to use from a C# program. Since I like C# as a programming language, I created a C library as a wrapper around libzim.
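To illustrate why a plain C interface helps: a C function can be loaded and called directly through a foreign-function interface, which is what P/Invoke does for C#. The sketch below uses Python's ctypes and the C standard library's strlen as a stand-in. It is not the libzim wrapper, whose function names are not listed here, and it assumes a Linux or other POSIX system.

```python
import ctypes

# Assumption: POSIX system. CDLL(None) exposes the symbols already loaded
# into the running process, which include the C standard library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Call the C function strlen on a byte string."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"Wikipedia"))
```

A C++ class cannot be declared this way, because its symbol names and calling conventions depend on the compiler. Wrapping it in C functions avoids that problem.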

Spell checking

I used Hunspell to check the words. In theory, Hunspell could also be used to generate word lists, but not all of the generated words make sense. So it is better to use it only to spell-check the words found in Wikipedia articles.
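The filtering step can be sketched roughly as follows. This is a minimal illustration, not the code of the actual program: the spell checker is passed in as a function, and a small set of known words stands in for Hunspell. The names filter_spell_checked and to_case_insensitive are made up for this example, and building the case-insensitive list by converting to uppercase is an assumption based on the file names.

```python
def filter_spell_checked(words, is_correct):
    """Keep only the distinct words accepted by the spell checker."""
    return sorted({w for w in words if is_correct(w)})


def to_case_insensitive(words):
    """Merge words that differ only in case by converting them to uppercase."""
    return sorted({w.upper() for w in words})


if __name__ == "__main__":
    # Stand-in for a real spell checker such as Hunspell (assumption).
    known = {"Haus", "haus", "und"}
    found = ["Haus", "Hauss", "und", "haus", "Haus"]

    checked = filter_spell_checked(found, known.__contains__)
    print(checked)
    print(to_case_insensitive(checked))
```

Passing the checker as a function keeps the filter independent of the spell-checking library, so a real Hunspell binding could be plugged in without changing the rest.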

Program to create the word list

WordList_program_screenshot.png

This program was written with MonoDevelop on Linux. There are two versions, one for the console and one with a GUI. I hope the usage is self-explanatory. (License: GPL v3)

Download source code and binary: Woerterbuch.zip

(The binary is located in the directory WoerterbuchGUI/bin/debug)
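For readers who want to experiment without the program, the core step of collecting the distinct words from article text might look like the sketch below. It is written in Python rather than C# and is not taken from the program. It assumes the articles have already been read from the archive as HTML strings, and the names html_to_text and collect_words are made up for this example.

```python
import re
from html.parser import HTMLParser

# A word is a run of letters; digits and underscores are excluded.
# This also matches non-ASCII letters such as German umlauts.
WORD_RE = re.compile(r"[^\W\d_]+")


class _TextExtractor(HTMLParser):
    """Collects the text between HTML tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip the tags from an HTML string and return the remaining text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps words in adjacent elements apart,
    # at the cost of splitting a word that has markup in the middle.
    return " ".join(parser.parts)


def collect_words(articles):
    """Return the sorted list of distinct words found in all articles."""
    words = set()
    for html in articles:
        words.update(WORD_RE.findall(html_to_text(html)))
    return sorted(words)


if __name__ == "__main__":
    sample = ["<p>Die Stra&szlig;e ist <b>&uuml;ber</b> 3 km lang.</p>"]
    print(collect_words(sample))
```

This simple version also picks up text inside script and style elements, so a real run over Wikipedia would need extra filtering.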

Bri's world © Torsten Brischalle. Design based upon BlueWebTemplates.com