Word segmentation, part-of-speech tagging, and more:
Check out the Center for Corpus Development, NINJAL for links to many corpora and databases. Here are some notable corpora.
A WordNet has been created for Japanese; it contains synsets (synonym sets) only. To use it in your own code, visit the page to download the sqlite3 database of Japanese WordNet, then access it through one of the APIs available in a variety of programming languages. Here is a link to the Python API for Japanese WordNet.
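If you would rather query the sqlite3 database directly than go through an API, something like the following may work. It is only a minimal sketch: the table and column names (`word`, `sense`, `wordid`, `lemma`, `lang`, `synset`) and the `jpn` language code are assumptions about the database layout, and the function name `synonyms` is made up for illustration. Check the schema of the file you actually downloaded before relying on it.

```python
import sqlite3


def synonyms(conn, lemma, lang="jpn"):
    """Return lemmas that share at least one synset with `lemma`.

    Assumes (not verified here) a `word` table with wordid/lang/lemma
    columns and a `sense` table linking wordid to synset.
    """
    query = """
        SELECT DISTINCT w2.lemma
        FROM word AS w1
        JOIN sense AS s1 ON s1.wordid = w1.wordid
        JOIN sense AS s2 ON s2.synset = s1.synset
        JOIN word  AS w2 ON w2.wordid = s2.wordid
        WHERE w1.lemma = ?
          AND w1.lang = ?
          AND w2.lang = ?
          AND w2.wordid != w1.wordid
        ORDER BY w2.lemma
    """
    return [row[0] for row in conn.execute(query, (lemma, lang, lang))]


# Hypothetical usage; replace the path with your downloaded database file:
#   conn = sqlite3.connect("path/to/japanese_wordnet.db")
#   print(synonyms(conn, "犬"))
```

The self-join on `sense` is what turns "same synset" into "synonym": two words are returned together only when both have a sense row pointing at the same synset identifier.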
These are plain-text files formatted for use with the topic modeling software MALLET. Each contains the title of the work, the author, and the year, followed by the text with words separated by spaces. I have provided both lemmatized and raw text where available.
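If you want to inspect one of these files outside MALLET, a short script can split the metadata from the text. The sketch below assumes one work per line with four tab-separated fields (title, author, year, text); the description above does not state the delimiter, so open a file and adjust `sep` if yours differs. The names `Work` and `parse_line` are hypothetical, and the sample line is made up for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Work(NamedTuple):
    title: str
    author: str
    year: str      # kept as a string because the exact year format is not known
    tokens: list   # words, already segmented by spaces in the source file


def parse_line(line, sep="\t"):
    """Split one line into metadata and a list of space-separated tokens.

    Assumes the first three `sep`-delimited fields are title, author, and
    year, and that everything after them is the text.
    """
    title, author, year, text = line.rstrip("\n").split(sep, 3)
    return Work(title, author, year, text.split())


# Illustrative sample line, not taken from the actual files.
sample = "吾輩は猫である\t夏目漱石\t1905\t吾輩 は 猫 で ある\n"
work = parse_line(sample)
print(work.title, work.author, work.year)
print(Counter(work.tokens).most_common(3))
```

Because the words are already separated by spaces, a plain `str.split()` is enough to recover tokens; no Japanese segmenter is needed at this stage.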