RENDER » corpora

Browsing for tag "corpora"

Aug 05 2011

Tool: Corpex – Wikipedia Corpora Explorer

Tags: API, corpex, corpora, explorer, frequency analysis, RENDER work, tool, wikipedia, word occurrence
Published by Carmen Brenner at 11:15 under News, Project News

Developed within the RENDER project by KIT Karlsruhe, the Wikipedia corpora explorer Corpex lets you swiftly browse through all the words of Wikipedia. Select your language, and as you start typing, the system shows you two statistics in four graphs:

the ten most frequent words that start with the typed sequence of letters (as a bar chart and a pie chart), and
the most frequent letters following the typed sequence of letters (again, as a bar chart and a pie chart).

Additionally, the ten words that most frequently follow any input word are visualized (as a bar chart and a pie chart).
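
The three statistics described above can be tried out on any small text. The following is a minimal sketch in Python, not Corpex's actual implementation: the function names, the simple regex tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def top_completions(word_counts, prefix, n=10):
    """Return the n most frequent words that start with the prefix."""
    matches = Counter(
        {w: c for w, c in word_counts.items() if w.startswith(prefix)}
    )
    return matches.most_common(n)


def next_letters(word_counts, prefix):
    """Count which letter follows the prefix, weighted by word frequency."""
    letters = Counter()
    for word, count in word_counts.items():
        if word.startswith(prefix) and len(word) > len(prefix):
            letters[word[len(prefix)]] += count
    return letters.most_common()


def following_words(tokens, word, n=10):
    """Return the n words that most often come directly after the word."""
    followers = Counter(b for a, b in zip(tokens, tokens[1:]) if a == word)
    return followers.most_common(n)


# Hypothetical sample text standing in for a Wikipedia corpus.
sample = "the cat sat on the mat and the cat then ran to the car"
tokens = tokenize(sample)
word_counts = Counter(tokens)

print(top_completions(word_counts, "ca"))
print(next_letters(word_counts, "ca"))
print(following_words(tokens, "the"))
```

A real corpus of Wikipedia size would likely need precomputed indexes (for example a prefix tree) rather than a scan over all words per query; the linear scan here only keeps the sketch short.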

This can be used in any application where the occurrence of words in the different language editions of Wikipedia is of interest. An API is also provided for easy access to the data.

Corpex is currently available in the following languages: German (de), English (en), Spanish (es), French (fr), Hungarian (hu), Romanian (ro), Albanian (sq), Bulgarian (bg), Czech (cs), Italian (it), Swedish (sv), Serbian (sr), Croatian (hr), Serbo-Croatian (sh), Bosnian (bs), and Simple English (simple). It is also available for the Brown Corpus (brown). Further languages are being prepared.

Corpex is still under development. The source code is fully open source, and all the data is freely available. Feedback, and especially suggestions for cooperation, are welcome.

Try it out at render-project.eu/tools/Corpex

RENDER – Reflecting Knowledge Diversity
