Build Log: Send-to-Kindle Service + EPUB Conversion
Friday, 30 October 2020
Introduction
Amazon provides a Send to Kindle by E-mail service that is a great way to get non-Amazon ebooks onto the device wirelessly. It doesn’t work very well for EPUBs, though, which is unfortunately the format that a number of DRM-free ebook providers default to. A conversion service is provided:
PDFs can be converted to the Kindle format so you can take advantage of functionality such as variable font size, annotations, and Whispersync. To have a document converted to Kindle format (.azw), the subject line should be “convert” when e-mailing a personal document to your Send-to-Kindle address
In practice, the conversion is pretty shoddy, with typographical issues and no real chapter detection. My typical workflow, then, is:
- Buy a DRM-free EPUB
- Convert it to MOBI locally with (the excellent) Calibre:
# Convert all EPUBs in the current directory to MOBI, 4 at a time.
$ ls *epub | xargs -I{} -P4 /Applications/calibre.app/Contents/MacOS/ebook-convert '{}' '{}'.mobi
- Email the MOBI to Amazon’s email service
This isn’t terrible, but is pretty frustrating when I’m away from my computer, say on an iPad or phone. I couldn’t find a hosted solution that provided this workflow (or similar), so I ended up building it for myself.
Implementation
I considered doing this in a few different languages (going as far as a tiny PoC in Rust), but eventually settled on Elixir. My main priority here was developer productivity: I wanted to get this done in ~2 days, so I didn’t want to spend too much time on “simple” things like parameter parsing and file uploads:
- Rust: Much lower-level than I wanted. Things like handling multipart form uploads are extremely boilerplate-y.
- Go: Webserver handlers are still fairly low-level. The module system was an immediate turn-off. No preemptive scheduler, which is a (minor, at this scale) downside here: long-running goroutines could back the runtime up, especially on the single-core VPS I intend to put this app on.
- Rails: This seemed like a much better choice for whipping this project up quickly, since Rails is heavily optimized for developer productivity. However, the story for background processes hasn’t changed very much, so to run the Calibre converter I’d have to set up a background queue with workers reading off it, which is much too heavy for this project. Having everything run in a single process would be a significant improvement.
- Elixir/Plug: This is pretty perfect. All the niceties of Ruby/Rails, but the process model allows me to orchestrate the Calibre conversion from the webserver process, so I’m responsible for fewer moving parts. The observability/hot-reloading capabilities are a nice bonus, as is the BEAM’s preemptive scheduler.
I’ve briefly used Elixir in the past (mainly to build a custom RSS server), so I wasn’t coming to it entirely fresh. This project has no real need for a database, so Phoenix seemed a bit much. I started off with a raw Plug app. I’m a fan of the amount of magic that Plug provides (which is to say, not much); my initial attempt at a solution was simply:
defmodule Ebook.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      {Plug.Cowboy, scheme: :http, plug: Ebook.Webserver, options: [port: 4002]}
    ]

    opts = [strategy: :one_for_one, name: Ebook.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

defmodule Ebook.Webserver do
  use Plug.Builder

  plug Plug.Logger
  plug Ebook.Router
end

defmodule Ebook.Router do
  use Plug.Router
  import Plug.Conn
  require EEx

  plug :match
  plug Plug.Parsers, parsers: [:multipart]
  plug :dispatch

  EEx.function_from_file(:defp, :home_view, "template/home.eex", [])

  get "/" do
    send_resp(conn, 200, home_view())
  end

  post "/submit" do
    %{"who" => _who, "file" => file} = conn.params

    format = cond do
      String.match?(file.filename, ~r/epub$/iu) -> :epub
      String.match?(file.filename, ~r/mobi$/iu) -> :mobi
      true -> :none
    end

    conn = put_resp_content_type(conn, "application/json")

    if format == :none do
      send_resp(conn, 400, Jason.encode!(%{error: true, message: "Don't know how to handle files of this type"}))
    else
      {:ok, pid} = Task.start(fn ->
        if format == :epub do
          System.cmd("/Applications/calibre.app/Contents/MacOS/ebook-convert", [file.filename, "#{file.filename}.mobi"])
        end
      end)

      response = %{message: "Conversion started!"}
      send_resp(conn, 202, Jason.encode!(response))
    end
  end
end
This is succinct and gets the initial job done nicely. As usual, there is a long tail of smaller concerns to handle:
- Email: I wanted to email the resulting MOBI to Amazon using my own Gmail account via SMTP. I fought with this for a long time before I realized that Google doesn’t really allow SMTP access with your Gmail username/password, even with the Less secure apps setting enabled. I eventually fixed this by creating an app password and using that instead.
- Authentication: Überauth works well out of the box, although the documentation is a bit lacking/out-of-date. I’ve used CloudFront+Lambda@Edge to authenticate personal projects in the past (using cloudfront-auth), but there are benefits to having this app manage its own auth, like figuring out which Kindle email to send an upload to based on the logged-in user. This required the Plug session store and a couple of plugs to set up a Rails-like current_user mechanism.
- Frontend: I very quickly outgrew the static HTML frontend. The very nature of this app is to click “Submit” and wait for 30-60 seconds while the conversion occurs, so something dynamic helps fill that time better. I could’ve gotten away with something simpler, but I used create-react-app with TypeScript, mainly to learn. A “Live View”-esque approach would’ve worked here, too. Annoyingly, the React development server takes 10 seconds to start, while the Elixir server starts up in a second or so.
- File handling: Plug’s multipart handler deletes uploaded files when the process dies (when the server finishes processing a request, essentially). This isn’t good enough for us, because we want the file around after the initial request has completed, so we have to make a copy (and delete it when we’re done).
- Progress notification: How does the frontend know when conversion has completed? Elixir’s Registry is probably a good way to solve this at a larger scale, but I went with something far more rudimentary. The file upload handler extracts the PID of the conversion process and returns it. The frontend uses a different (polling) endpoint to periodically check if that process is still alive or not; once that process doesn’t exist anymore, the job is done. The incantation to convert a string PID back to an actual PID was a bit tricky to find, but here it is:
get "/poll/:pid" do
  pid = :erlang.list_to_pid('<#{pid}>')

  if Process.info(pid) do
    send_resp(conn, 200, Jason.encode!(%{exists: true}))
  else
    send_resp(conn, 200, Jason.encode!(%{exists: false}))
  end
end
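The handler above expects the PID without its angle brackets, so the upload handler has to produce a string in that shape. Here is a minimal sketch of the round trip; the `PidParam` module and its function names are hypothetical and not part of the app. It also wraps the decode step, because `:erlang.list_to_pid/1` raises when the string is not a valid PID, which matters once the value arrives from a URL.

```elixir
defmodule PidParam do
  # Hypothetical helper: turns a pid such as #PID<0.110.0> into "0.110.0".
  def encode(pid) when is_pid(pid) do
    pid
    |> :erlang.pid_to_list()
    |> List.to_string()
    |> String.trim_leading("<")
    |> String.trim_trailing(">")
  end

  # Reverses encode/1. Returns :error instead of raising on malformed input.
  def decode(string) when is_binary(string) do
    {:ok, :erlang.list_to_pid(String.to_charlist("<" <> string <> ">"))}
  rescue
    ArgumentError -> :error
  end
end

# Usage: round-trip the current process's pid.
encoded = PidParam.encode(self())
{:ok, decoded} = PidParam.decode(encoded)
IO.inspect(decoded == self())
```

With something like this, the polling route could answer with a 400 when `decode/1` returns `:error` rather than crashing the request process.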
Error Handling / Resilience
At this point this service is passable, but has one glaring flaw: there’s pretty much no error handling. The conversion process could have crashed, but the frontend will happily assume that the absence of the process indicates success. How do we disambiguate these end states better?
Elixir provides a GenServer behaviour that seemed like a good fit. From the docs:
A behaviour module for implementing the server of a client-server relation. A GenServer is a process like any other Elixir process and it can be used to keep state, execute code asynchronously and so on. The advantage of using a generic server process (GenServer) implemented using this module is that it will have a standard set of interface functions and include functionality for tracing and error reporting.
There are a couple of different ways this could be structured; here’s the approach I went for:
- Every incoming request spawns a new cowboy process.
- The /submit file upload handler then starts an instance of the JobServer GenServer, passing in a function that does the actual work (convert to MOBI, email).
- The JobServer internally spawns a new (raw) process that executes this “job function”.
- The JobServer passes a reference to itself to the job function, and exposes a few functions that allow the job function to check in with the JobServer about its progress. In this way, the JobServer is notified when the job function reaches various checkpoints, and can collect log output.
- When the job function is spawned, we call Process.monitor(pid) on it so the JobServer is notified when the job finishes (either successfully or because of a crash) with a :DOWN message.
- The JobServer maintains the status of the running job function as internal state, and passes this state back to any /poll requests that come in.
- Once the job function is done, the JobServer sets a timeout to shut itself down in 1 minute (we poll every second, so this is reasonable).
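The monitoring step in the list above hinges on the exit reason carried by the :DOWN message: `:normal` when the function returns, anything else when it crashes or exits abnormally. A minimal standalone sketch of just that mechanism follows; `DownDemo` is a hypothetical name, and it uses `spawn_monitor/1` (spawn and monitor in one step) rather than a separate `spawn` plus `Process.monitor` call.

```elixir
defmodule DownDemo do
  # Runs `fun` in a monitored process and reports how it ended.
  def run(fun) when is_function(fun, 0) do
    {pid, ref} = spawn_monitor(fun)

    receive do
      # The function returned without raising or exiting abnormally.
      {:DOWN, ^ref, :process, ^pid, :normal} -> :done
      # Any other reason means the process crashed or exited abnormally.
      {:DOWN, ^ref, :process, ^pid, reason} -> {:crashed, reason}
    after
      5_000 -> :timeout
    end
  end
end

IO.inspect(DownDemo.run(fn -> :ok end))
IO.inspect(DownDemo.run(fn -> exit(:boom) end))
```

One detail worth knowing: monitoring a process that has already exited still delivers a :DOWN message, but with the reason `:noproc`, so a very short job monitored after the fact could look like a crash. Spawning and monitoring atomically sidesteps that.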
With that out of the way, here’s what the JobServer looks like (a quick read of the GenServer docs is probably a prerequisite to understand this; it’s a bit cryptic at first glance):
defmodule JobServer do
  use GenServer

  # Client: the process that sets up and starts this server
  def start(task_fn), do: GenServer.start(__MODULE__, task_fn)
  def poll(pid), do: GenServer.call(pid, :poll)

  # Job: the job process/function that this server is executing
  def mark_checkpoint(pid, checkpoint), do: GenServer.cast(pid, {:checkpoint, checkpoint})
  def record_log(pid, log), do: GenServer.cast(pid, {:log, log})

  # Server: callbacks
  @impl true
  def init(task_fn) do
    {:ok, task_fn, {:continue, :start_task}}
  end

  @impl true
  def handle_continue(:start_task, task_fn) do
    parent = self()
    pid = spawn(fn -> task_fn.(parent) end)
    Process.monitor(pid)
    {:noreply, %{logs: []}}
  end

  @impl true
  def handle_call(:poll, _from, state = %{done: true}), do: {:reply, state, state, 60000}
  def handle_call(:poll, _from, state = %{crashed: {true, _}}), do: {:reply, state, state, 60000}
  def handle_call(:poll, _from, state), do: {:reply, state, state}

  @impl true
  def handle_cast({:log, l}, state = %{logs: logs}), do: {:noreply, Map.put(state, :logs, [l | logs])}
  def handle_cast({:checkpoint, c}, state), do: {:noreply, Map.put(state, :checkpoint, c)}

  @impl true
  def handle_info({:DOWN, _ref, :process, _object, :normal}, state), do: {:noreply, Map.put(state, :done, true), 60000}
  def handle_info({:DOWN, _ref, :process, _object, reason}, state), do: {:noreply, Map.put(state, :crashed, {true, reason}), 60000}
  def handle_info(:timeout, state), do: {:stop, :shutdown, state}
end
And the “job function” looks like:
fn (pid) ->
  JobServer.mark_checkpoint(pid, :started)

  filename = if format == :epub do
    JobServer.mark_checkpoint(pid, :conversion)
    JobServer.record_log(pid, "Attempting to convert #{filename} to MOBI")
    {:ok, path} = Ebook.Conversion.convert(filename)
    JobServer.record_log(pid, "Converted #{filename} to #{path}")
    path
  else
    JobServer.record_log(pid, "Received MOBI file; no conversion necessary.")
    filename
  end

  JobServer.mark_checkpoint(pid, :email)
  JobServer.record_log(pid, "Attempting to send MOBI file via email")
  email = Ebook.Email.SendToKindle.personal_document(who, filename)
  Ebook.Mailer.deliver!(email)
  JobServer.record_log(pid, "Email sent!")

  JobServer.mark_checkpoint(pid, :cleanup)
  JobServer.record_log(pid, "Cleaning up")
  File.rm(filename)
end
This is now far more robust. The frontend now has access to fine-grained state about the status of the running job function: did it crash? did it complete successfully? if it’s still running, how far has it gotten?
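The state map that the JobServer hands back is not quite ready to serialize as-is: the logs are prepended, so they are stored newest-first, and a crash reason can be a tuple (for a raised exception, typically the exception plus a stacktrace), which Jason does not encode. Below is a rough sketch of a translation step the polling endpoint could apply before encoding. The `PollView` module, its function name, and the output keys are all assumptions made for illustration; only the state keys (`:done`, `:crashed`, `:checkpoint`, `:logs`) come from the JobServer above.

```elixir
defmodule PollView do
  # Hypothetical helper: converts JobServer state into a JSON-friendly map.
  def summarize(state) when is_map(state) do
    {status, reason} =
      case state do
        # inspect/1 makes arbitrary exit reasons (tuples, structs) printable.
        %{crashed: {true, reason}} -> {"crashed", inspect(reason)}
        %{done: true} -> {"done", nil}
        _ -> {"running", nil}
      end

    %{
      status: status,
      reason: reason,
      checkpoint: Map.get(state, :checkpoint),
      # Logs are prepended as they arrive, so reverse for chronological order.
      logs: state |> Map.get(:logs, []) |> Enum.reverse()
    }
  end
end

IO.inspect(PollView.summarize(%{logs: ["second", "first"], checkpoint: :conversion}))
```

A single status field also gives the frontend one thing to switch on, instead of checking for the presence of several keys.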
Streaming Logs
One last improvement I wanted to make was to have the frontend display raw logs from the System.cmd command. Calibre’s ebook-convert program is fairly verbose about what it’s doing, and having this stream scroll past in the frontend UI would be a nice touch.
This was a bit more involved than I expected, though.
System.cmd allows the caller to pass an :into value, which “injects the result into the given collectable”.
I had to look this up; Collectable is an Elixir protocol, described as “A protocol to traverse data structures”.
This is a bit vague, but the documentation continues:
The Collectable module was designed to fill the gap left by the Enumerable protocol. Collectable.into/1 can be seen as the opposite of Enumerable.reduce/3. If the functions in Enumerable are about taking values out, then Collectable.into/1 is about collecting those values into a structure.
If I’m reading that correctly, all we need is to write a Collectable that calls JobServer.record_log on every line of output it receives:
defmodule SystemCmdStreamingLogger do
  defstruct [:pid]

  defimpl Collectable do
    def into(logger) do
      {logger, fn
        (_, {:cont, term}) ->
          # Collapse each chunk of output into a single log line
          term = term |> String.trim() |> String.replace("\n", ", ") |> String.trim()
          # `logger` is the struct passed to into/1; its pid is the JobServer
          JobServer.record_log(logger.pid, term)
        (_, :done) -> :done
      end}
    end
  end
end
and pass that to System.cmd:
System.cmd(
  "/Applications/calibre.app/Contents/MacOS/ebook-convert",
  [filename, "#{filename}.mobi"],
  stderr_to_stdout: true,
  # `pid` is the process ID of the `JobServer` instance this conversion is running under
  into: %SystemCmdStreamingLogger{pid: pid}
)
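A collectable like the one above can be exercised without invoking Calibre at all, because `Enum.into/2` drives the same `{:cont, term}` / `:done` callbacks. The sketch below uses a hypothetical `LineForwarder` struct that sends each chunk to a process as a plain message instead of calling the JobServer, so it runs on its own; the sample strings are made up.

```elixir
defmodule LineForwarder do
  # Hypothetical stand-in for the streaming logger: forwards chunks as messages.
  defstruct [:pid]

  defimpl Collectable do
    def into(forwarder) do
      collector = fn
        acc, {:cont, chunk} ->
          send(acc.pid, {:line, String.trim(chunk)})
          # Return the accumulator so the next chunk can still reach the pid.
          acc

        acc, :done ->
          acc

        _acc, :halt ->
          :ok
      end

      {forwarder, collector}
    end
  end
end

# Feed two fake output chunks through the collectable.
Enum.into(["Converting input\n", "Output saved\n"], %LineForwarder{pid: self()})

# Drain the two forwarded messages from this process's mailbox.
lines =
  for _ <- 1..2 do
    receive do
      {:line, line} -> line
    after
      100 -> :missing
    end
  end

IO.inspect(lines)
```

Note that `System.cmd` hands the collectable whatever chunks the port delivers, which are not guaranteed to align with line boundaries, so a chunk may contain several lines or part of one.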
Conclusion
This feels like a good stopping point; this implementation is likely to require minimal supervision/babysitting while doing its thing (which is an [if not the] important concern for weekend projects). On the whole, I’m very happy with the choice of Elixir here.