Ebook conversion.
Hoo, boy! This should be the easy step, but I think this is where most of my effort went, in the ten days before my publication deadline. You need to convert your book to the ebook format required for the publishing platform you've chosen. This should be straightforward and painless, but I'm told it's anything but. My early experiments bore that out: but read on, I found a good and simple approach using Calibre.
Lest this sound a bit scary, I should cut to the chase and say that once you've cleaned up your source document, and entered the metadata into Calibre for your book (by typing in the blurb, ISBN, title, author, etc.), the conversion to epub or Amazon's .mobi using Calibre requires only a few minutes of effort, and produces a good and reliable result.
What you want is for the ebook to be as nicely laid out and as well-formatted as what you have in your word processor. But for various reasons, the conversion can have major problems.
I was using LibreOffice 5.0.3 for my word processor, incidentally, at the time I wrote this blog post. Most people probably use Microsoft Word. A few people (especially if they're producing something like a photo book or a children's illustrated story) might use InDesign. I think InDesign is poorly suited to producing a reflowable ebook: InDesign is, after all, designed to do the opposite, to place each piece of text and artwork exactly where you put it, and not to be movable or resizeable. (That said, it would be great for non-reflowable ebooks, like children's picture books or coffee-table books with lots of images.)
In contrast, one of the biggest strengths of an ebook is that the reader can enlarge or reduce the font (or possibly change the font completely), to suit their eyes and/or the lighting conditions. So this means that you have to be careful about preparing your MS so this reader-directed reformatting can work well: you need to use true page breaks when you want a new page, and an indent or centring style when you want to indent or centre something. An awful mistake is to use spaces/tabs and blank lines to manually position some text so it looks right for you in your MS. Likewise, it's a bad mistake to make a piece of text look like a chapter title but not mark it as such. In short, you should be using paragraph styles exclusively. Doing so also means that if you decide you need to make a change to how you've laid out every instance of something, you need only change the paragraph style, not each paragraph!
Here are some problems I've encountered, using LibreOffice:
The HTML it produces is far more complex than it needs to be. I have basically two kinds of paragraphs (body text and chapter titles), and a few pieces of text that must be centred. I use italics for emphasis. But LO (currently) splits the runs of text into shorter spans of differently-named but identical styles (usually, showing where I went in and edited the text), and for my MS produced literally thousands of paragraph styles.
Worst problem for me was when I copied the whole MS into the new template I wanted to use: after resetting all the paragraphs to the correct body text style, and setting all the chapter titles to the chapter title style, it lost (randomly as far as I could see) about half the places where I'd applied the italic style. I delved into the XML and after a day's work of writing some programs, was able to recover about 90% of the italics. The rest I just had to notice and re-do manually during one of my many read-throughs.
Conversion to Microsoft docx and back seemed to help somewhat: but then other aspects of the page formatting got messed up (left and right margins, footers, etc.). So that was a dead-end.
The table of contents generated for the ebook format was just bizarre. You don't want page numbers matched to chapters in the TOC, since page numbers basically don't exist in the ebook format (because the ereaders don't want to calculate this up-front and store and display the information: the technology may change in future). Anyway, not only did my chapters have page numbers, but after conversion, each chapter in the TOC occupied a separate page. Ouch!
Loading up the LibreOffice .odt file into Calibre for conversion also produced poorish results when I then used Calibre (v2.45) to convert it to epub and Kindle (.mobi) format. Calibre seemed to work better when I saved as a .docx and then loaded that up. But the formatting wasn't what I wanted, so more investigation was needed. I made this note to myself when I started this step:
"I have some notes on a page at Goodreads about tips for conversion to ebook formats, and a few web pages to read, as well as my own experiments to perform. Worst case, I may have to unzip the epub format and have a look at the raw HTML, and simplify it down to the bare minimum. This is not meant to be rocket science, after all." (Which is what happened, incidentally.)
Oh, I also had a plug-in for LO that converted straight to ebook formats: but that seemed to have its own problems, so I uninstalled it.
Here's a very good explanation of self-publishing on Kindle.
Now, at this point I could have delved into the HTML side of things, since Kindle uses a subset of HTML as one of the possible input file formats for producing a book. (The others are PDF and mobi.) If you want to delve in and get your hands dirty, and take control of lots of aspects of the layout, then learning about the HTML may be the best approach. I didn't do that in the end, but rather than ignore all the tips I've gathered, I'll stick them in an "appendix" after this main article ("Kindle's HTML").
The approach I took in the end was much simpler, and I believe generally more useful, and should only continue to improve with time, so I'll focus mainly on that.
My approach was to rely on Calibre, since one of Calibre's important main functions is to perform ebook conversions.
I thought the best starting point was probably to see what it produced, and to look into any problems I found. In that way, if I found something that looked like a genuine problem in Calibre, I could inform its creator, Kovid Goyal. Or, it might point to problems in what I'd done.
Importing my .odt file and converting to epub was painless, as was reading the resulting epub version within Calibre – but I noticed quite a few issues. Here's a screen shot of Calibre in action, with Wild Thing imported into it:
Calibre is powerful, flexible, and has a well thought out user interface that makes it pleasant to use. But the flexibility (necessary to deal with all the different tasks it can perform) meant that I found myself at a bit of a loss as to what to do. Calibre also comes with lots of good documentation, including video demos.
Another thing I found was that Calibre itself has what looks like an excellent editor built in, with a well-written user manual available for free here. (I did what I normally do: "printed" the web page to a PDF file, then opened it in Adobe's "acroread" program, selected "booklet" printing, and sent that to the printer, stapling the result with my long-arm stapler to make a convenient A5 reference booklet.) Obviously, editing the MS in Calibre is a last resort: if I make changes to the original MS and re-convert in Calibre, I'd need to make any edits in Calibre again – or else make the Calibre version the main source file for the book, which wouldn't be a good idea since then I'd have different source files for the print and the epub editions.
The trouble for me was that my deadline was now so close I simply didn't have the time to study and learn Calibre thoroughly. I needed some shortcuts.
Because I'm technical, I decided to poke about under the bonnet and see if I could work out what was happening. Now, most of this poking about was unnecessary and is merely a distraction: so I'm again going to put most of it into yet another "appendix" to this article ("Poking about in Calibre").
But one aspect of it proved so useful that I'll shortly describe some of my poking about, within this article proper.
The ".epub" file is actually a ZIP file with specific files and file formats inside it (that's good design!). One of the intentional consequences of this is that you can poke about inside. If you know what you're doing, and don't break the rules, you can even change things. (You just zip it all back up into an .epub file again.)
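For anyone who wants to try the unzip-and-rezip round trip from a script, here is a minimal sketch using Python's standard zipfile module. The function names are my own, purely for illustration. The one rule I'm sure of is that the "mimetype" file has to be the first entry in the archive and stored uncompressed; everything else can be compressed as usual.

```python
import zipfile
from pathlib import Path


def unpack_epub(epub_path, out_dir):
    """Extract every file in the .epub (a ZIP archive) into out_dir."""
    with zipfile.ZipFile(epub_path) as zf:
        zf.extractall(out_dir)


def repack_epub(src_dir, epub_path):
    """Zip src_dir back into an .epub, writing 'mimetype' first and uncompressed."""
    src = Path(src_dir)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The EPUB container rules want 'mimetype' as the first, stored entry.
        zf.write(src / "mimetype", "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(src.rglob("*")):
            rel = path.relative_to(src).as_posix()
            if path.is_file() and rel != "mimetype":
                zf.write(path, rel, compress_type=zipfile.ZIP_DEFLATED)
```

Write the new .epub somewhere outside the unpacked directory, so the archive doesn't try to include itself.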
So I poked about and spent a while developing some scripts to try to fix what I saw as problems, and continued delving into and learning the technical details of how the .epub format worked, and what Calibre produced. I made some progress, but then reached a point where things looked sufficiently wrong that I filled in a bug report to describe what I was experiencing.
The three big errors I had were:
- After importing my LibreOffice MS into Calibre and converting to epub, the Table of Contents (of about 70 chapters), appeared as one separate line on each page: you have to page forward through 70 pages to get to the actual 1st page. Or, if you wanted to "jump" to chapter N, then you have to page forward N-1 to get to the ToC entry for Chapter N, and then click on the link.
- I was getting white rectangles appearing for some of the non-breaking space characters when I uploaded the .mobi file to KDP.
- Lots more fonts than I expected in the output file, leading to strange changes in the paragraphs.
The developer of Calibre, Kovid Goyal, was very helpful. And the executive summary of the long and detailed description of my trials and tribulations is that all the problems were caused by either a wrong setting I'd chosen in Calibre, or a genuine bit of messiness in my MS.
To fix problem (1), Kovid pointed out two simple corrections to what I was choosing for the conversion operation:
- He recommended converting from .odt to .docx format and giving Calibre the .docx file, since a lot more work had gone into the conversion from .docx. So I chose "Save a copy" as .docx and tried that.
- In the Structure Detection section of the conversion dialogue, either change the "chapter mark" to none, or change the chapter detection expression to "/". So I set the "Chapter mark" to "none", and also changed both the "Detect Chapters at" and the "Insert Page Breaks before" to just a "/" (this means, "disable this function").
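I made these changes in the conversion dialogue, but Calibre also ships a command-line converter, ebook-convert, and I believe the same three settings map onto its --chapter-mark, --chapter and --page-breaks-before options. The sketch below only builds the argument list; the helper names and the file names in the comment are hypothetical, and it's worth checking the options against your own Calibre version before relying on them.

```python
import subprocess


def build_convert_command(source, target):
    """Return an ebook-convert argument list with chapter marks and extra page breaks disabled."""
    return [
        "ebook-convert", source, target,
        "--chapter-mark", "none",      # "Chapter mark" set to none
        "--chapter", "/",              # "Detect chapters at": "/" disables detection
        "--page-breaks-before", "/",   # "Insert page breaks before": "/" disables it
    ]


def run_conversion(source, target):
    """Run the conversion; needs Calibre installed and ebook-convert on the PATH."""
    subprocess.run(build_convert_command(source, target), check=True)


# Example with hypothetical file names:
# run_conversion("WildThing.docx", "WildThing.epub")
```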
Together, those two pieces of advice fixed pretty much all the errors I'd found in the conversion process – apart from errors obviously coming from my MS itself. Here's another screen shot, showing all three (changed) settings in Calibre:
For problem (2), it turned out to be a character I was occasionally (accidentally) somehow typing instead of a non-breaking space, in LibreOffice. The character appeared to be a non-breaking space in LO but was illegal for Kindle (specifically, it was the Unicode character U+FFF9). I just had to find and delete all those characters from my MS.
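If you can get at the text of your MS outside the word processor, a few lines of Python will find these strays for you. This is only a sketch: the function names are made up, and it assumes you have the text available as an ordinary Python string.

```python
BAD_CHAR = "\ufff9"  # the character that Kindle rejected in my MS


def find_bad_chars(text, bad=BAD_CHAR, context=10):
    """Return (offset, nearby text) for each occurrence of `bad`, marked as <?>."""
    hits = []
    start = text.find(bad)
    while start != -1:
        snippet = text[max(0, start - context):start + context + 1]
        hits.append((start, snippet.replace(bad, "<?>")))
        start = text.find(bad, start + 1)
    return hits


def strip_bad_chars(text, bad=BAD_CHAR):
    """Delete every occurrence of `bad` from the text."""
    return text.replace(bad, "")
```

Listing the hits before deleting anything lets you check whether a real non-breaking space was intended at each spot.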
Problem (3) – fun with fonts! When I paged through the ebook in Calibre, doing a visual check, I noticed there were some sections in the wrong font. Checking back into the MS I found that, sure enough, there was a genuine font change in the MS – with a visual difference so small that it was almost undetectable in LibreOffice. But then I realised that finding all these by visual inspection (i.e., manually) was an unnecessarily laborious way of finding and fixing them. By manual inspection, I had learned that the font errors so far had been because some of the text was still in Garamond 9pt. Because Garamond is not installed on my system and the license costs for me to acquire it and pay the annual fees were prohibitive, I had changed the body-text paragraph style to Georgia 10.5pt. Anyway, the odd slippage back into Garamond 9pt would have happened because I must have done some manual formatting, so that when I changed the font style within the paragraph style, those manually-applied formatting changes persisted.
So I called up the Find&Replace dialogue, clicked on Format, and then on the Fonts tab. Now, since Garamond isn't installed, I couldn't pick it from the list. However (and this is really praiseworthy on the LO developers' parts), I was able to type Garamond (with leading cap) into the font text field, choose "Regular expressions", and then enter ".*" (without the quotes, naturally) into the Search field.
Now, I could fix them one by one: note whether the text should be italic or not, clear the direct formatting, and then set it italic again as needed. But that, too, was tedious. Useful to do a little, though, to get a feel for the errors and start correcting them.
I could see exactly how it had happened, and it had made sense at the time, but it did have this side-effect that I'd been unaware of. Anyway, understanding the types of errors, I then did a Find All (such a cool LO feature: you end up with multiple separate selections of the matching pattern throughout the document, so you can make changes to those pieces of text in one operation). LO told me it had 299 words and 1,457 characters selected, and I noted that the font size was 9pt in every case (by simply observing that "9" showed in the font size box on the main menu-bar), and then changed the font to Georgia and the point size to 10.5, and that was that!
(Except for LO bug 62603 – aarghh!)
The joy of the "stylesheet.css" file. I thought I'd do a search for other typefaces that might have crept in (Arial, Calibri, Times Roman). I found that there were a few places I'd unintentionally slipped into Times New Roman. Then I realised I could see every single font I had actually used just by looking inside the "stylesheet.css" file inside the .epub file.
And, yes, there were a whole bunch more there. Here's the list, for maximum self-embarrassment:
$ grep "font-family" WildThing-exp5.unz/stylesheet.css | sort -u
font-family: "Liberation Sans", Arial;
font-family: "Liberation Serif", serif;
font-family: "Times New Roman";
font-family: Calibri;
font-family: Garamond, sans-serif;
font-family: Georgia1, serif;
font-family: Georgia;
font-family: Times New Roman1, sans-serif;
font-family: Ubuntu, sans-serif;
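If you don't have grep to hand (on Windows, say), the same check is a few lines of Python. This is a rough equivalent rather than anything Calibre provides; the function name is my own, and it lists the font names without the "font-family:" prefix.

```python
import re


def font_families(css_text):
    """Return the sorted, de-duplicated font-family values found in css_text."""
    pattern = re.compile(r"font-family\s*:\s*([^;}]+)")
    return sorted({match.group(1).strip() for match in pattern.finditer(css_text)})


# Example, using the path from the grep command above:
# from pathlib import Path
# css = Path("WildThing-exp5.unz/stylesheet.css").read_text(encoding="utf-8")
# for family in font_families(css):
#     print(family)
```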
Actually, I couldn't make much sense of what was going on when I searched for Liberation Serif: it seemed like a high percentage of paragraphs in the document were in this font: I could clear the formatting so they weren't found, but doing so made no visible difference. And these weren't paragraphs that had proven to be a problem in the conversion. I wondered if some weird font substitution was still going on? The particularly weird thing was that these searches would also find the chapter and part headings – but when they were selected, the font showed as Times New Roman, not Liberation Serif. Yet when I'd searched for Times Roman, they hadn't shown up. (The only pieces of text in the whole MS in Times New Roman were the title and "by L. J. Kendall" on the title page.)
Because of a bug (feature?) in LO's ToC generation, I had had to add a space character to the end of each chapter title (to make the chapter names actually appear in the ToC). Because of that workaround, it seemed like these were all appearing in Liberation Serif. Finding and fixing these was tedious. I couldn't do a Find&Replace because the word "Chapter" and the chapter number were supplied from the paragraph style (they weren't "in" the document body). It turned out that the text is provided via the Bullets and Numbering dialogue's Options tab; the font is kind of defined by the "Character Style" you choose; but I couldn't at first see where these character styles were defined (or modifiable). It was set to "None"; when I changed it to "Header Char", it changed to Liberation Serif, even though it appeared as Times New Roman because that was the font defined for the Chapter Title style in the "Styles and Formatting" dialogue. A bit confusing, eh?
So, in the "Styles and Formatting" dialogue I needed to select the "Character Styles" section, and then change that to Liberation Serif. While there, I set the font size and colour, too. That changed all the auto-formatted chapter titles to be the same: I changed the Character Style for the Prolog to match, but chose not to do so for the Parts nor the Acknowledgement, nor Afterword etc.
So, after making that change, I checked to see if the "added space" in the title was still being detected as Liberation Serif when I did my Find&Replace: it was, but I figured that since this was a non-printing character, letting it be changed to Georgia shouldn't make a visible difference. So I went ahead and did a Find All for the Liberation Serif font: this time, it worked harder, finding I think 66,000 words in that font. So I went ahead and changed the font to Georgia, and the font size to 10.5 for all of them. That seemed to make no visible difference, but now when I searched for Liberation Serif, there were no matches.
Whew!
Okay, next on the list of probably-spurious fonts was Liberation Sans: that found only the Acknowledgement and Table of Contents headings – good!
I then checked the others – none of them were used, except for Ubuntu in four places. So, after that, it seemed that I had (finally!) cleaned up my font mess. Oh: and in the Find&Replace dialogue I clicked on the "No format" button to clear the search for font, and also unchecked the regular-expression, so as not to accidentally confuse myself the next time I used the dialogue.
In Calibre, you update the book by choosing "Add files to selected book records" from the Add books menu, and then selecting the (preferably .docx file) for your MS.
I noticed some odd justification; checking back, I discovered that my two-space pedantry had come undone: all the occurrences of a non-breaking space followed by a normal space, between sentences, had been lost somehow. A quick regular expression Find&Replace fixed that. Oh, and for some reason, every paragraph was no longer fully-justified. Re-converting fixed not just the weird spacing problem but also the justification problem. Now all that remained was trying to work out how to centre the chapter headings for the auto-titled chapters (and ideally, add a little space below).
Ha – in checking, I found that the back cover image I'd inserted on the last page, and anchored to the page, had stayed on that page even when the font change had increased the number of pages. It now appeared about 50 pages before the end. So I fixed that, too.
After that, "all" that was required was about half a day of intense and error-prone work to find the single-character "smearing" of regular-into-italics and italics-into-regular caused throughout the MS in every place where there had been a style-change at the edge of where a Find&Replace had operated.
Sigh. But at least that was the last of my problems, and things proceeded smoothly from there.
Okay – once again, this blog post has turned out to be much longer than I expected, so I'll need yet another separate and new post to cover the topic of the nitty gritty of uploading the file to Kindle and making your Amazon Author page and making updates. So I'll leave it there, for today. Stay tuned!
A Nice Table of Contents
Oh, one thing I did to make the Table of Contents nicer was to delete the leader-characters and page numbers from the auto-generated table of contents, leaving just the (useful) hyperlinks for the ebook edition. I've also found a problem, due to LibreOffice's bad habit of breaking runs of characters into separate logical spans of text. This means that if you edit any text in your ToC, any new text you type in will be in a separate text span. This was very visible because I'd chosen a colour for my ebook chapter titles (not plain black): the edit was in black and the auto-generated text was blue.
The first time, I deleted the leader characters (the row of dots) and the page number references (mostly useless in an ebook) from the ToC by hand. After that, I used the Find&Replace dialogue instead: I turned on the Other Options, selected Regular Expressions, put nothing in the Replace With field and this text (without the quotes) in the Search For field: "\[0-9][0-9]*", and then quickly replaced each one.
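The same idea can be tried outside LibreOffice. Here's a small Python sketch that strips a trailing page number from one ToC line. Note that the pattern is my own assumption about what the lines look like (a tab or some spaces, then digits, at the very end of the line), not the expression quoted above. Also beware: a line with no page number but ending in a chapter number ("Chapter 1") would lose that number, so only run it on lines that really do carry a page number.

```python
import re

# Assumed layout: title, then a tab or spaces, then the page number at end of line.
TRAILING_PAGE_NUMBER = re.compile(r"[ \t]+[0-9]+$")


def strip_page_number(toc_line):
    """Remove one trailing page number (and the whitespace before it) from a ToC line."""
    return TRAILING_PAGE_NUMBER.sub("", toc_line)
```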
Poking about in Calibre and .epub
So I copied the .epub file Calibre had produced, into a temporary directory to play with. The 1st step was to unzip the .epub file. Now, I noticed there were far more "index_split" files than there were chapters, and so I opened the first few of these files, and the content.opf, page_styles.css, titlepage.xhtml, and toc.ncx files. I immediately noticed a few unexpected things: text was still being broken up into some very short runs of characters, with different paragraph styles for each run. When I got to the stylesheet.css file, things started becoming clearer. That file defines all the paragraph styles in a very human-readable format, and it became obvious that I apparently had sections of text that looked the same, but which weren't. Since the paragraph style names created by Calibre are pretty distinct (like p-p1, p-p2 and so forth), it meant it would be easy to search the html files to see where they were used. In that way, I could find where I'd used these odd styles, and change the original MS to remove the weird styling.
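Searching the files for one of these style names is easy to script. The sketch below walks the unzipped directory and reports each line of each .xhtml file that uses a given class; the function name is hypothetical, and it assumes class attributes are written with double quotes, as they were in my files.

```python
import re
from pathlib import Path

CLASS_ATTR = re.compile(r'class="([^"]*)"')


def find_class_uses(unpacked_dir, class_name):
    """Return (file name, line number, line) for each line using class_name."""
    hits = []
    for path in sorted(Path(unpacked_dir).rglob("*.xhtml")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, 1):
            for match in CLASS_ATTR.finditer(line):
                # Compare whole class names, so "p-p1" doesn't match "p-p18".
                if class_name in match.group(1).split():
                    hits.append((path.name, number, line.strip()))
                    break
    return hits
```

Each hit gives you a file and line to open, and from the text on that line you can find the matching spot in the original MS.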
Now, the 1st problem in my epub version is that the Table of Contents (TOC) puts each line of the TOC on a separate page! Looking in the toc.ncx file, I see it looks like this:
<navPoint id="ujwaAQQK4di35KOrpmPFchB" playOrder="1">
  <navLabel>
    <text>Prolog 1</text>
  </navLabel>
  <content src="index_split_004.xhtml"/>
</navPoint>
<navPoint id="utE7axnMMjYH5AzmPYfyOp2" playOrder="2">
  <navLabel>
    <text>Part I 14</text>
  </navLabel>
  <content src="index_split_005.xhtml"/>
</navPoint>
<navPoint id="uyHLH5SBCmIBt42iiswSe24" playOrder="3">
  <navLabel>
    <text>Chapter 1 15</text>
  </navLabel>
  <content src="index_split_006.xhtml"/>
</navPoint>
So it looks like the page-breaks must be inside the actual content files like "index_split_004.xhtml". Looking inside that shows that the text is "Prolog 1" as you'd expect, but that instead of being a normal paragraph, it's defined to be a header element (H1). That seems odd. Look:
<h1 class="p-p18" id="calibre_toc_2">
  <a href="index_split_075.xhtml#anchor8" class="s-t">
    <span class="s-t4">Prolog 1</span>
  </a>
  <a id="anchor9" class="s-t"></a>
  <a class="s-t"></a>
</h1>
Another odd thing there is that there are three anchors: the 1st with the text "Prolog 1" and a link to "anchor8", a 2nd with no text/content but an ID of "anchor9", and a 3rd with, again, no text/content. All three anchors have class "s-t", and the "Prolog 1" text has class "s-t4". I don't understand that. Looking again at the stylesheet.css, I see that the "s-t" class defines a style with 0 margin, 0 padding, and a 1.2 line-height, and s-t4 is the same but with font-size 0.77419em.
Ahhh...! Perhaps because I manually reformatted the main block of the TOC to be 9pt, and the PARTs to be 10pt? And the 1st anchor is for the left text, the middle for the leader/spacing dots (omitted) and the 3rd for the page number? Sounds plausible, except the page number is in the 1st anchor, not the 3rd. Hmm. It kind of matches the definition at http://www.w3.org/TR/html5/text-level-semantics.html#the-a-element which says that if you leave out the actual link, "then the element represents a placeholder for where a link might otherwise have been placed, if it had been relevant, consisting of just the element's contents." So: why is the page number inside the 1st anchor, not the 3rd? The anchors are often used to indicate clickable content.
What is a class, anyway? Sitepoint says that while an HTML element's name (like "h1", "div", "a" or "p") specifies its type (header, division, anchor or paragraph), the "class" attribute lets you specify one or more subtypes. These subtypes are used to label semantically-similar things for identification, so that CSS or JavaScript code can then do clever stuff like define some properties or do something to all elements in the whole document structure (DOM) that have that subtype.
It also means that if I want to auto-remove the page numbers from the TOC entries, I may have to remove it from both the index_split_004.xhtml file and the toc.ncx file – something to keep in mind. So: does this mean I should manually delete the TOC from the MS and regenerate one with page numbers turned off? Or write a script to do the edits, so I can keep a single MS file? Or maybe insert two TOCs, one designed for epub and one for print, and write a script to delete one or the other depending on whether I'm generating an ebook or a pbook? Or, is there a tweak in Calibre itself I could do? If you click on "Convert books" and then on the "Structure Detection" button, it shows you the code it uses to "Detect chapters" (which looks perfect to me), and also to "Insert page breaks", which again looks good, and explains where the 1-page-per-TOC-line problem is coming from: each line is of style "H1", so it's of course getting a page break.
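For the "write a script" option, here is roughly what the toc.ncx half could look like in Python, using the standard ElementTree module. Treat it as a sketch under assumptions: the function name is mine, it assumes every navLabel ends with a space and a page number (as in the extract above), and ElementTree drops the XML declaration and DOCTYPE when it writes the result back out, so you'd need to check the result still satisfies your ereader.

```python
import re
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def strip_ncx_page_numbers(ncx_xml):
    """Return the NCX markup with a trailing page number removed from each navLabel."""
    ET.register_namespace("", NCX_NS)  # keep the output free of ns0: prefixes
    root = ET.fromstring(ncx_xml)
    for label in root.iter("{%s}navLabel" % NCX_NS):
        text = label.find("{%s}text" % NCX_NS)
        if text is not None and text.text:
            # "Chapter 1 15" becomes "Chapter 1"
            text.text = re.sub(r"\s+[0-9]+\s*$", "", text.text)
    return ET.tostring(root, encoding="unicode")
```

The matching headings inside the index_split files would need the same treatment separately.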
So it sounds like a 1st step might be just to regenerate the TOC in the MS, and not fiddle at all with the formatting, and then see what Calibre produces.
Ah, and I see in Calibre's "Heuristic Processing" conversion option, one that says "Renumber sequences of <h1> or <h2> tags to prevent splitting" – that sounds like what I want, so the problem may become very simple to solve. Hmm: nope. It now has two lines of the TOC per page! An improvement but not a solution.
Oh, and while fiddling with fonts, I noticed the paragraph indent was being measured in centimetres rather than points. So:
To change the measurement units used in this dialogue, choose Tools - Options - LibreOffice Writer - General, and then select a new measurement unit in the Settings area.
Hmm, tried again after fixing up far more font-change issues than I expected: I did have several messes. Now, it's all in Georgia, 10.5pt. TOC still wrong. Changed an option to force it to use the auto-generated TOC – I assume that means the one in the .odt file…
"Appendix – Kindle's HTML"
Most of the text below (apart from the occasional editorial remark in italics) is directly quoted from the Support Indie Authors forum on Goodreads.
Never specify fonts or font sizes.
Use percents for indents and em's for margins.
Limit use of symbol characters (try to use HTML symbol codes) and in-line formatting. (We only use bold and italics -- not even super and subscripts.)
But Morris noted (good tips here http://www.morrisegraham.com/):
You can, and should specify font sizes for chapter and title headings. This is done by specifying size in the style part of your header by using a CSS style.
Like....
p.h1 { font-size: 1.5em; text-indent: 0em; text-align: center; font-weight: bold }
note that when I use an h1 header in my body, it causes the text to be 1.5 em, which is 1.5 times the width of the letter "m." So when the customer on the end adjusts font size in their ereader device, the titles and chapter headings stay relative in ratio and proportion to the default font size. You can only get this kind of control by doing your book in notepad and then converting it to eBook with a converter later.
But Owen then noted:
If H1, H2, H3 etc tags are used, each device will display them in their own different default size. You can of course define your own styles and adjust the sizes if you want to, but we've always been happy with the defaults.
Micah noted:
One thing that always bugged me about the KDP defaults is that they automatically stick a huge extra line of space between paragraphs. This makes intentional section breaks very difficult for the reader to interpret. To avoid Kindle sticking in extra space, you do this (presumably in the CSS section):
I just define the normal paragraph properties:
p { text-indent: 1.3em; margin-bottom: 0.2em; }
I found that setting the bottom margin to 0 seemed a bit too tight, and also thought the default text indent was too large for most eReaders. But you can adjust them as you like.
And he added:
Oh, and another thing which I find a bit odd about Amazon's eBook formatting: if you do not hard code in justified text, Kindle eReaders will justify the text by default.
That's not an issue, however what may be an issue is that this default justification does NOT show up in their Look Inside feature.
I have not hard coded justified text, so in the Look Inside preview, it appears as if my books have left justified text. But in an actual eReader they show up with fully justified text.
Owen shared lots of good info:
In our case, we specify a "normal" style in our style sheet for all normal text that is:
p.Normal, li.Normal {margin-top:.60em; margin-right:0; margin-bottom:0; margin-left:0; text-indent:5%; }
Our paragraph style (which is hardly ever used) is:
p {margin-top:0; margin-right:0; margin-bottom:0; margin-left:0; }
Our heading styles follow this format:
h1 {margin-top:2em; margin-right:0; margin-bottom:.30em; margin-left:0; border:none; padding:0; font-weight:bold; }
h2 {margin-top:0.6em; margin-right:0; margin-bottom:0; margin-left:0; border:none; padding:0; text-transform:uppercase; font-weight:bold; }
h3 {margin-top:0.6em; margin-right:0; margin-bottom:0; margin-left:0; border:none; padding:0; font-weight:bold; }
h4, h5, h6 {margin-top:0.6em; margin-right:0; margin-bottom:0.6em; margin-left:0; text-align:right; font-weight:normal; font-style:italic; }
We've never set the text justification manually. I was unaware until Micah mentioned it that anyone would think this "unprofessional", since I would have thought that readers of eBooks would be aware that the Amazon "Look inside" feature formats things differently than eReaders. I don't think I want such people reading our books anyway (our characters often speak and act "unprofessionally" and that might annoy them?)
I have read that some readers find left justified text easier to read. I suppose there might be some worth in not hard-coding the justification in case their device allows them to select that option?
And also:
Igzy wrote: "If it's not too much trouble could one of you offer an example of two properly formatted paragraphs laid out together? I'd be much obliged if I could see the tags in action, as I'm not familiar with HTML formatting. "
No problem. Of course GR interprets HTML code, so this example used parentheses in place of pointy brackets, thus ( = < and ) = > . This is how we begin a chapter, with notes after the ||:
(br clear=all style='page-break-before:always') || This tells the Kindle converter to break the page. Below are the headings: heading 3 for the main, and heading 4 for the sub, which is italic and aligned right.
(h3)(a name="_Toc410049128")Prologue: Zero Day(/a)(/h3) || name ID's the target for the TOC link.
(h4)Janin Station;(br) || br is a linefeed (line break)
Tau Verde, Vulpecula Region(/h4)
(p class=NormalBlock)It was make-and-mend day for the Halith Imperial Navy’s Kerberos Fleet ... ruled the lives of Halith mariners—especially when the fleet was lying up at a comfortable port like Janin. (/p) || This is a paragraph container. HTML designates the beginning of a container with a code, just p here -- h3 above -- and ends it with a / in front of the code: /p. "class" defines the type of paragraph as defined in the style sheet. This is a flush-left paragraph with extra space at the top. The CSS entry for it is below.
(p class=Normal)Watchstanding and sensor sweeps ... guarded by a ring of monitors. (/p) || This is a standard text paragraph.
p.NormalBlock {margin-top:2.5em; || creates the extra space. Note there is no text indent.
margin-right:0; margin-bottom:0; margin-left:0; }
That is 90% of an HTML text doc right there (with the CSS examples above, in the previous posts). Yes, you see a lot of godawful gibberish in the code here and there, like style="much gibberish" or (span style="much gibberish") (/span). Almost all the time, that is unwelcome. Word will put it in to try to mimic the exact look of a doc in IE -- not what you want.
And Amazon have their own guide which comes recommended:
"Building Your Book for Kindle", and the ebook is free.