tsleyson/kindle_pub_formatter

clean_html

A Clojure program that cleans up the HTML output from LibreOffice, merges multiple files into one, and makes the result suitable for publishing on the Amazon Kindle Store.

Usage

The input files, their directory, the cleaner function, and other configuration options go in a config.clj file, written in Clojure. resources/strawberrysunflower/config.clj is an example. Note that running the program on the example config files won't work, because the book text they point to isn't in the repository: I'm not giving away the full text of my books on GitHub when I'm asking 99 cents for them on Amazon.

Options

  • :directory is the directory where your HTML files are stored.
  • :order is the order in which the content of the files should appear in the final merged file.
  • :title and :subtitle are self-explanatory. :authors is a vector, to allow for multiple authors.
  • :heading-selector and :paragraph-selector are Enlive selectors that select the headings and standard paragraphs from an HTML file. See the Enlive documentation.
  • :cleaner is an Enlive transformation function that cleans up the HTML generated by programs like Word, which often includes redundant font and span tags and other oddities. Note that replacing Unicode characters with HTML entities can't be done here, because Enlive automatically escapes certain characters, including the ampersand. So &ldquo; will be changed by Enlive to &amp;ldquo;. I have written an Emacs Lisp function which does this replacement; see my Emacs stuff.
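
As a rough illustration of how the options above might fit together, here is a minimal sketch of a config.clj. It assumes the file evaluates to a single map and that :order holds file names relative to :directory; neither is confirmed here, so check resources/strawberrysunflower/config.clj for the real shape. The directory, file names, title, and authors are all placeholders. The cleaner uses Enlive's transformation macro and unwrap, which replaces a matched element with its content.

```clojure
;; Hypothetical config.clj sketch. The map shape, paths, file names, and
;; metadata are assumptions for illustration, not the project's confirmed format.
(require '[net.cgrand.enlive-html :as html])

{:directory "resources/mybook"                ; placeholder: where the exported HTML lives
 :order ["prologue.html" "chapter1.html"]     ; assumed: file names, in merge order
 :title "My Book"
 :subtitle "A Story"
 :authors ["First Author" "Second Author"]    ; vector, so co-authors are possible
 :heading-selector [:h1]                      ; Enlive selector matching every h1
 :paragraph-selector [:p]                     ; Enlive selector matching every p
 ;; Strip redundant wrappers but keep their content:
 ;; unwrap replaces each matched element with its children.
 :cleaner (html/transformation
            [:font] html/unwrap
            [:span] html/unwrap)}
```

The selectors here are deliberately the simplest possible; a word processor export may need something more specific, such as a class-qualified selector, depending on how it marks up headings and body text.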

License

Copyright (C) 2013 Trisdan Leyson.

Distributed under the Eclipse Public License, the same as Clojure.
