Dynamic Thinking

Saturday, 18 August 2012

iPod/iPhone Video Encoding with ffmpeg and mplayer

Up until fairly recently, I've been transcoding videos for playback on my (ancient, 1st-gen) iPod Touch using mencoder. But it's less than ideal. mencoder is, by admission of core mplayer devs, an unsupported, unmaintained piece of software that may or may not work at any given moment, and which isn't particularly high on the fixit list. When it comes to transcoding, ffmpeg is the better platform.

Almost.

ffmpeg isn't too great with subtitles, and I watch a great deal of subtitled content. So what's really required is something that's as good at playback and subtitle rendering as MPlayer, but with FFmpeg's encoding capabilities. So the answer really is to use both: mplayer to write raw video frames to a pipe, and ffmpeg to transcode from that pipe. As a final step, package the encoded video in an iPod-ready m4v file and copy the audio stream(s) from the original source:

mkfifo a-named-pipe.fifo

mplayer a-source-file.mkv -noconfig all -vf-clr -nosound -benchmark -ass -vf scale=480:-10 -vo yuv4mpeg:file=a-named-pipe.fifo

This will block immediately, as there's nothing reading from the pipe that mplayer is writing to, and the pipe only has a small buffer. In another terminal (in the same directory), run this to pull video frames from the pipe into ffmpeg:

ffmpeg -i a-named-pipe.fifo -vcodec libx264 -b:v 768k -flags +loop+mv4 -cmp 256 -partitions +parti4x4+parti8x8+partp4x4+partp8x8+partb8x8 -me_method hex -subq 7 -threads auto -trellis 1 -refs 5 -bf 0 -coder 0 -me_range 16 -profile:v baseline -g 250 -keyint_min 25 -sc_threshold 40 -i_qfactor 0.71 -qmin 10 -qmax 51 -qdiff 4 -y video-only-file.avi

Once mplayer terminates at end-of-file, you'll have video-only-file.avi, a silent-movie AVI encapsulating a video stream that's iPod-ready. If you had subtitles, they're burned into the video. You can discard a-named-pipe.fifo. Now, you just need to take the encoded video in the AVI, splice in an AAC version of the audio stream from the original source, and pack it into an M4V container that your device can work with.

ffmpeg -i a-source-file.mkv -i video-only-file.avi -acodec libfaac -b:a 128k -ac 2 -vcodec copy -f ipod -map 0:a -map 1:v -metadata title="My Video" -y output-video.m4v

Once you've checked that output-video.m4v is good, you can discard the intermediate video-only-file.avi.
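The two-terminal dance around the named pipe can also be driven from a single shell by putting the writer in the background. Here's a minimal sketch of that idea. run_through_fifo and the FIFO variable are names made up for illustration, it assumes mktemp -d is available, and the mplayer/ffmpeg options are left as placeholders for the ones shown above:

```shell
# run_through_fifo WRITER READER
# Runs the WRITER command in the background and the READER command in the
# foreground, connected by a temporary named pipe. Both commands see the
# pipe's path as $FIFO. (Hypothetical helper, not part of mplayer or ffmpeg.)
#
# Intended use, with the full option lists from the commands above:
#   run_through_fifo \
#     'mplayer a-source-file.mkv <options> -vo yuv4mpeg:file="$FIFO"' \
#     'ffmpeg -i "$FIFO" <options> -y video-only-file.avi'
run_through_fifo() {
    writer=$1
    reader=$2
    tmpdir=$(mktemp -d) || return 1
    FIFO="$tmpdir/frames.fifo"
    export FIFO
    mkfifo "$FIFO" || { rm -rf "$tmpdir"; return 1; }

    # The writer blocks on the pipe until the reader opens the other end,
    # so it has to run in the background.
    sh -c "$writer" &
    writer_pid=$!

    # The reader runs in the foreground and finishes when the writer closes
    # the pipe at end-of-file.
    sh -c "$reader"
    status=$?

    wait "$writer_pid"
    rm -rf "$tmpdir"
    return "$status"
}
```

The pipe is created in a throwaway directory, so there's nothing left to discard afterwards; the function's exit status is the reader's.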

Now, this all looks like a lot of hassle (and it kind of is), but it's far more resilient and reliable than the mencoder-only version. It even gives you chapters and multiple audio streams, if they were in the source.

If all that's a bit too fussy, I've wrapped it up in a Perl script hosted on GitHub, meaning that you can do everything with just:

perl ipod-encode.pl --title='My Video Title' --standalone input-video.mkv

Feedback and improvement suggestions are always welcome.

Sunday, 22 July 2012

Emacs: Speeding Up Loading Your .emacs File

One of the best things about Emacs is its extensibility. The bulk of its editing functions are written in the same language as the editor itself, Emacs Lisp, which you, the user, can program inside the editor itself.

Experienced Emacs users invariably find packages on the Web, maybe from EmacsWiki or GitHub, and stuff them into their .emacs file, loading them with (require). For a small number of small packages, this can often work well, but eventually you're going to either stumble across a large package like Org Mode or, if you're a polyglot programmer, load the major modes for half a dozen or more languages.

All those (load) and (require) calls will slow down the time taken to load your .emacs file considerably. If you want to see how long it takes, M-x emacs-init-time will give you the answer. Personally, I consider load times in excess of one or two seconds to be unacceptable for development-class hardware.

Loading What You Need, When You Need It

So how do you speed up loading of your init file, but at the same time keep everything that makes Emacs work for you? Simple: autoload and eval-after-load.

autoload tells Emacs that a function is defined in a specific file, but doesn't actually load that file until it's explicitly asked for, e.g:

(autoload 'ruby-mode "ruby-mode")

This tells Emacs that the function ruby-mode is in the file ruby-mode.el (or ruby-mode.elc, if appropriate). When the function is invoked, the path elements in load-path are checked in order to find ruby-mode.el, which is then loaded on-demand.

Automatically Using the Right Mode for a File

So, how does ruby-mode get called? Well, it's either explicitly (M-x ruby-mode), or by specifying that it should be invoked when you load an appropriate file, like this:

(add-to-list 'auto-mode-alist '("\\.rb\\'" . ruby-mode))

This says that Emacs should enter ruby-mode whenever a file with the ".rb" extension is loaded. When it is, ruby-mode is started, and the autoload ensures that it is loaded from the appropriate file if required.

Customising Just-Loaded Code

But what happens if you want to do more than just load the major mode when .rb files are loaded? Maybe you also want to load yasnippet, ace-jump or other modules?

You've got two choices: a mode hook, or eval-after-load. The former is cleaner, but the latter is more flexible, and available in the event that there's no mode hook variable for your language (e.g. scala-mode-hook isn't available until after scala-mode is loaded).

eval-after-load takes an Emacs Lisp form as an unevaluated list and only evaluates it once the specified file has been loaded, e.g.:

(eval-after-load "ruby-mode" '(setup-autocompletion))

The time-saver here comes from the fact that the quoted form isn't evaluated until the file is loaded. It can do quite a lot of work, but none of it happens at startup: you'll only pay for it the first time you enter that specific mode.

In this case, the same effect can be achieved by adding a function to ruby-mode-hook:

(add-hook 'ruby-mode-hook #'setup-autocompletion)

Here, I'm assuming that (setup-autocompletion) has already been defined and, as in my case, is referred to by multiple programming modes (nb: that function is one I've written myself; it's not part of Emacs, so you won't have it unless you've written one with the same name!). The form can be arbitrarily complex, but it must be a single form. Multiple forms can be wrapped in a (progn) to compose them. Here's an example for cperl-mode that's more involved, setting some defaults, defining some Perl-specific functions, and doing that (setup-autocompletion) again.

(eval-after-load "cperl-mode"
  '(progn
    (setq
     cperl-merge-trailing-else nil
     cperl-continued-statement-offset 0
     cperl-extra-newline-before-brace t)
    (defun installed-perl-version ()
      (interactive)
      (let ((perl (executable-find "perl")))
        (if perl
            (shell-command-to-string (concat perl " -e '($v = $]) =~ s/(?<!\\.)(?=(\\d{3})+$)/./g; print $v;'")))))
    (defun use-installed-perl-version ()
      (interactive)
      (let ((perl-version (installed-perl-version)))
        (if perl-version
            (save-excursion
              (goto-char (point-min))
              (let ((case-fold-search nil))
                (re-search-forward "^use [a-z]" (point-max) t)
                (beginning-of-line)
                (open-line 1)
                (insert (concat "use v" perl-version ";"))))
            (message "Couldn't determine perl version"))))
    (setup-autocompletion)))

eval-after-load isn't limited to programming modes. Here's an example of how you can auto-load your credentials when loading GNUS:

(eval-after-load "gnus"
  '(setq
    nntp-authinfo-file "~/.authinfo"
    gnus-nntp-server *my-nntp-server*))

Summary

So, there you have it:

auto-mode-alist allows you to enter specific modes based on file extension
autoload allows you to not load that mode until it's actually required by opening a file of that type
eval-after-load and/or (add-hook '{mode}-mode-hook) can be used for mode customisation or setup, again only when the mode actually loads

As much as possible, ensure that your .emacs file makes minimal use of non-deferred (require) or (load) calls, although (require 'cl) is an exception. It's also fair to load code/minor modes that you use all the time, since there's no point in deferring them just to load them the first time you do anything!

Thursday, 7 June 2012

Perl: 'my' vs. 'local'

There's often a lot of confusion among new Perl hackers as to what the difference is between the my and local declarations on variables. Unfortunately, many reach for the wrong one because the language that they're used to working with has local variables, and the words are deceptively familiar...

If you've just Google'd for a quick answer, you almost certainly want to be using my.

However, local's got to be there for a reason, right? Read on...

Here's a sample program that declares a variable, $global, that's visible to all of the subs beneath it. There's a function, print_global, that displays its value, and two functions, my_variable and local_variable, that call it.

our $global = 'global';

sub print_global { print "\$global='$global'\n" }

sub my_variable {
  my $global = 'lexical';
  print_global;
}

sub local_variable {
  local $global = 'local';
  print_global;
}

my_variable();
local_variable();

The last two lines actually call the declared subs.

my_variable behaves exactly as you'd expect if you're coming from Java, C or a similar language. It's got a variable with the name $global in it, but when print_global is called, it's the top-level $global (the string 'global') that gets displayed, as that's the one that's visible to that function. In this case, the variable called $global in my_variable isn't used anywhere and is wasted.

my variables are lexically scoped.

local_variable paints a different picture. local eclipses or shadows the globally visible $global, so that any function that subsequently asks for its value sees the shadowing value, not the previous value. This can happen repeatedly, so that there is a stack of shadowed values, with a single visible value. So, the call to print_global from local_variable displays the string 'local'. When local_variable returns, the previous value is restored.

local variables are dynamically scoped.

Now, it's up to you to determine which you need. But if you're reading this, I'll bet that it's my :-)

Sunday, 4 March 2012

Ruby: Exceptions and Continuations

Years ago, I stumbled across a paper describing user interface continuations. At the time, the concept of continuations seemed like nothing short of wizardry: the program would be at one place, processing instructions, and then would suddenly be somewhere else to collect a piece of data, and then back at the original place, with that collected data available to the computation that was happening originally.

A nice theoretical exercise, but that could never be useful, right?

Well, since then, I've learned Common Lisp, and then its sibling, Scheme. Continuations aren't a part of Common Lisp, but they're readily available in Scheme and, looking back, CL's awesome condition system starts to look like a regular exception framework with continuations thrown in for some fun.

So far, this is all sounding somewhat academic: papers and arcane languages don't have anything to do with what programmers do on a day to day basis, right?

Ruby is a modern scripting language, and supports continuations out of the box: they're available as the Kernel method callcc, although from Ruby 1.9 onwards you need to require 'continuation' first (the method name has been lifted from Scheme, where it's called call/cc, or call-with-current-continuation, if you like typing).

This method takes a one-arg block, where the argument is supplied by the system and represents the current continuation, which is a representation of where the program is at the moment that it is created. So what can you do with that? Well, continuations can be call'ed and, when they are, the value of the callcc call becomes the value that the continuation is call'ed with, regardless of where in the program the call was made. The continuation is a regular object that can be stored in data structures, passed around, etc., but when it's invoked, program flow resumes from the site of the callcc.

This needs an example.

I mentioned Common Lisp's condition system earlier. It's analogous to the exception mechanism in languages like Java and Python, with one notable difference: when a condition is signalled, the stack is not unwound to an enclosing exception handler. Instead, the stack is searched for a handler, which then gets to look at the condition and, if it determines that there is remedial action that can be taken, can provide information to the exception site that can tell the code there how to proceed. These are called restarts.

Where would this be useful?

A simple example from Peter Seibel's Practical Common Lisp is a log file parser. Imagine you're writing this, and you've sensibly layered the different functions: from the abstract 'parse a log file with this filename', through 'process all log entries in the file', into 'extract a single log entry', and then 'analyse a single log entry'.

But what if something goes wrong at the analysis level? What do you do with a malformed entry? If you're working with functions, you just have to pass something into the function that tells it how to handle that. If it's an object, set some property on the object for this situation. But what if it is, as in this case, several layers down from the application's interface? Well, you can pass some property dictionary or other miscellaneous contextual information into either the intervening functions or objects.

This kind of action indicates a break in reasoning: you're setting or passing properties on something that really shouldn't need to care about their existence. This kind of clutter makes maintenance programming difficult, as objects and functions are littered with things that they don't use themselves, but instead are made aware of for the sole purpose of handing over to something else. This has adverse effects upon reusability, as it's now assumed that these objects are part of a particular call chain.

Looking at it, the only layers that need to know about the problem are the bottom one, where the problem occurs, and the top one, where the business logic lives.

With continuations, we can make this happen. Here's an example.

  require 'continuation'  # needed for callcc on Ruby 1.9 and later
  
  class Condition < Exception
    attr_accessor :continuation, :payload
    def initialize(continuation, payload)
      self.continuation = continuation
      self.payload = payload
    end
    def continue(value)
      @continuation.call(value)
    end
  end
  
  def topLevel
    begin
      intermediateLayer1
    rescue Condition => c
      c.continue(0 - c.payload)
    end
  end
  
  def intermediateLayer1
    intermediateLayer2
  end
  
  def intermediateLayer2
    intermediateLayer3
  end
  
  def intermediateLayer3
    fragileLayer
  end
  
  def fragileLayer
    (1..5).each { |i|
      i = callcc { |cc|
        begin
          processEntry(i)
        rescue Exception
          raise Condition.new(cc,i)
        end
      }
      puts i
    }
  end
  
  def processEntry(entry)
    (entry % 2 == 1) ? (raise "Can't deal with odd numbers!") : entry
  end
  
  topLevel()

The purpose of this program is quite simple: a top-level caller gets some work done (in this case, printing the numbers from 1 to 5) by asking a lower layer. The intermediate layers exist to demonstrate that there's no direct linkage between the raiser of the exception and its handler.

In this example, the bottom layer refuses to work with odd numbers, and raises an exception when given one. This is caught, but the decision as to what to do next is not appropriate for that low level: the business logic needs to make that decision, but it's several layers up in the stack.

At this point, a continuation is captured with callcc, and a new Condition is raised. There's nothing special about these objects: they just encapsulate the continuation and the data that caused the error.

Normally, an exception propagating up the stack causes the intermediate stack frames to become inaccessible and therefore eligible for garbage collection. However, the continuation captured in the exception that's just been thrown refers to the stack frame in which it was created, so the stack remains live, even if control flow is being unwound through it.

Now, the wizardry: the top level handler has access to the continuation and the problematic value, so it can decide what to do next. It can re-raise the exception, or it can provide a new value to the original source of the exception to be used in its place. Continuations can be call'ed, and they take a value to treat as the return value of the callcc call. Lower-level processing can continue as though it hadn't been interrupted; the intermediate layers are not unwound or invoked again.

The output of the above is just:

-1
2
-3
4
-5

Now, there's no need to follow this precise pattern. The value that's returned could instead be a Symbol that indicates which of a range of choices should be executed. It could be a Proc, which the receiver is expected to call.

Pretty neat, huh?

Saturday, 25 February 2012

Visual Studio 11's New Look

Microsoft recently unveiled a proposal for a new look on Visual Studio 11. Now, I'm hardly a graphic designer, but one thing repeatedly strikes me as a bit bonkers in software written in the past ten years, and it's not restricted to Visual Studio: the 'save' and 'save all' icons.

Do you see that? They're floppy disks. I don't know if their continued representation is maybe indicative of an aging population of computer programmers, but floppy disks haven't been relevant since the turn of the millennium.

I wonder if new up-and-coming programmers even know what they're clicking?

Saturday, 18 February 2012

Mac DVDRipper Pro

I have a Mac mini under my TV, with an NFS mount to 2TB of RAID-1 space on the other end of a gigabit ethernet home network. So it makes perfect sense to get rid of physical DVDs, putting them in a box in the loft after ripping them into digital form. So I've been looking for software that does this well, and stumbled across Mac DVD Ripper Pro a few days ago. It even managed to rip/transcode some troublesome DVDs that seem to resist other means of doing the same.

MDRP offers an on-the-fly transcode feature which looks to be built on HandBrakeCLI.

All things considered, it's a pretty good converter: simple, slick and reliable. That said, I'm not sure how its proprietary license works with the GPL'd HandBrake underneath it.

Update 19-Apr-2012: I ended up not really bothering with transcoding my DVDs: the storage on the NAS is such that I could avoid the re-coding overhead and just rip the VOBs directly with MPlayer:

mplayer dvd://1//dev/sr0 -dumpstream -dumpfile "Some DVD Title.ps"

This way, I get all the audio and subtitle streams, and with the help of lsdvd can pick out just the movie track (which MDRP can do as well).

One potential gotcha, particularly with subtitled/multi-audio movies: remember to grab the .IFO files from the DVD, too. They contain the names of the tracks and the palette used in rendering the subs; the subs may render strangely without this information.
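Picking out the movie track with lsdvd can be scripted, too. This is only a sketch: longest_title is a made-up helper name, and it assumes that your lsdvd ends its listing with a line of the form "Longest track: 01", so check the output of your version before relying on it:

```shell
# longest_title: reads lsdvd's listing on stdin and prints the number of the
# longest title with any leading zeros removed. Assumes a summary line that
# looks like "Longest track: 01"; prints nothing if no such line is found.
#
# Intended use:
#   title=$(lsdvd /dev/sr0 | longest_title)
#   mplayer "dvd://$title//dev/sr0" -dumpstream -dumpfile "Some DVD Title.ps"
longest_title() {
    sed -n 's/^Longest track: *0*\([1-9][0-9]*\).*$/\1/p'
}
```

The longest title is usually the movie itself, but not always, so it's worth eyeballing the full lsdvd listing for discs with lots of extras.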

Sunday, 12 February 2012

Shell Scripting: Counting Occurrences of a Character

I recently found myself needing to count the occurrences of a character in a line from within a shell script. I was working with a delimited file, and needed to know how many columns were in a given line (it would be consistent within the same file, but could differ between files, depending upon the version of the code that produced it).

Surprisingly, it took a bit more digging than I thought would be needed for such a simple task. Counting lines is easy, but characters within a line? It's not a difficult thing to do in Perl or awk, but launching their respective interpreters seemed a bit heavyweight.

tr to the rescue. This under-used command line utility translates from one character set to another (a set can just be a single character). It can also delete characters from its input and, using the -c (complement) switch, can work on a set that's the inverse of the one specified. Tying those loose threads together, you end up with this to pick out occurrences of the letter 'e':


danny@khisanth ~ [2] % echo one two three four five | tr -cd 'e'
eeee%

That % is my shell indicating that the line didn't terminate with a newline, so it was just the 'e's. Once you have those, wc -c can do the rest:


danny@khisanth ~ [3] % echo one two three four five | tr -cd e | wc -c
       4

Annoyingly, wc on some Unixes indents its output, so if you just want the number itself, you might need to play with your shell's string manipulation functions to get something neater. e.g., in zsh:


danny@khisanth ~ [4] % echo ${$(echo one two three four five | tr -cd e | wc -c)// /}
4
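If you're not in zsh, arithmetic expansion is one portable way of getting rid of wc's padding. Here's a minimal sketch for plain sh; count_char and count_columns are names I've made up for illustration, and the column count assumes that the delimiter never appears inside a field:

```shell
# count_char CHAR LINE
# Prints how many times the single character CHAR occurs in LINE.
count_char() {
    # tr -cd deletes everything except CHAR and wc -c counts what is left.
    # Wrapping the result in $(( )) strips the padding that some wc
    # implementations add.
    echo $(( $(printf '%s' "$2" | tr -cd "$1" | wc -c) ))
}

# count_columns DELIM LINE
# Prints the number of DELIM-separated columns in LINE, which is one more
# than the number of delimiters.
count_columns() {
    echo $(( $(count_char "$1" "$2") + 1 ))
}
```

So count_char e 'one two three four five' prints 4, and count_columns , 'a,b,c,d' prints 4 as well. Note that wc -c counts bytes, so this only works for single-byte characters, and characters that tr treats specially (a backslash, for instance) need escaping.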