An OSX Service to get a web page title
The issue: I have a bunch of services that I use to drop URLs into a journal-type text file that lives in Dropbox, which I then go through to write blog posts, newsletters and the like.
Going through each link (opening it up in a web browser, then copying the relevant details from the web page back into the text file) is a boring task. But the real problem is that it's a boring task that I only do when I'm in the right mood to be doing the more creative task of writing up whatever it is that I'm writing.
The idea: I want a service where I can just click on a URL and automatically convert it to a (Markdown) link, looking up the web page from the URL to get the title of the page.
Turns out that it's pretty simple. I set up a Service in Automator that receives selected text, with the "Output replaces selected text" option turned on.
All the Service does is run the following Ruby shell script:
require 'open-uri'
require 'nokogiri'

# For each line of input (a URL), fetch the page and print a Markdown
# link, using the page's <title> (with runs of whitespace removed) as
# the link text.
ARGF.each do |f|
  url = f.strip
  doc = Nokogiri::HTML(open(url))
  title = doc.at_css("title").content.gsub(/\s{2,}/, "")
  print "[" + title + "](" + url + ")"
end
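You can also try the script outside of Automator by piping a URL into it from a terminal. (The filename get_title.rb here is just for illustration; save the script wherever you like.)

echo "https://example.com" | ruby get_title.rb
# prints something like: [Example Domain](https://example.com)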
To make it work, you will need the Nokogiri gem installed in your system Ruby. (Nokogiri can be straightforward to install, but it can also be a complicated mess, so the instructions are outside the scope of this blog post.)
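For what it's worth, in the straightforward case it usually comes down to something like this, though depending on your setup it may need sudo and a working compiler toolchain:

gem install nokogiri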
Obviously, there is room for improvement on this. For starters, it seems like overkill to pull down a whole web page's HTML and then use a full HTML/XML parsing tool like Nokogiri just to get a page title. (readline seems like it could be useful here.) It would also be nice to extract a URL from a selected piece of text – that is, to turn it into a service that could be used on a selection of text with multiple URLs in it. And it would be nice to detect URLs that are already either HTML or Markdown links and ignore them.
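As a rough sketch of that first idea (not what the service above actually uses), something like this reads the response line by line and stops as soon as it finds a title tag. It assumes the whole <title>...</title> element sits on one line, which is often, but not always, the case:

require 'open-uri'

# Sketch only: grab the title without a full HTML parse by reading the
# page line by line and stopping at the first <title> element.
# (Newer Rubies want URI.open here rather than the bare open that
# open-uri patches in on older versions.)
def page_title(url)
  open(url) do |io|
    io.each_line do |line|
      return $1.strip if line =~ %r{<title[^>]*>(.*?)</title>}i
    end
  end
  nil
end

The other two ideas (pulling URLs out of a larger block of text, and skipping ones that are already links) would mostly be a matter of some regex work before the lookup.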
But as a starting point, it does the job.