Emacs. Transform an HTML page to an Org file
To transform an HTML page (for example, one open in eww) into an Org mode file, the easy way is to use pandoc, which accepts a URL as input. The option --extract-media
creates a directory where the images are stored.
pandoc -f html -t org -o output.org --extract-media=images https://torres.epv.uniovi.es/centon/visualizacion-congestion-puertos.html
The original html is:
And the final Org file is:
So we wrap this command in an Elisp function:
(defun etm-eww-html-to-org (&optional url)
  "Convert a URL or a web page (eww) to Org text.
Images are included and stored in the directory \"images\"."
  (interactive nil eww-mode)
  (let ((url (or url (plist-get eww-data :url)))
        (dirimages "images"))
    ;; Check for pandoc before creating the output buffer.
    (unless (executable-find "pandoc")
      (error "The program pandoc does not exist"))
    (switch-to-buffer (generate-new-buffer "*eww2org*"))
    (message "Transforming %s" url)
    ;; Quote the arguments so that characters such as & or ? in the URL
    ;; are not interpreted by the shell.
    (shell-command
     (concat "pandoc -f html -t org --extract-media="
             (shell-quote-argument dirimages) " "
             (shell-quote-argument url))
     (current-buffer))
    (org-mode)))
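Since the function is declared with (interactive nil eww-mode), it is intended to be run with M-x etm-eww-html-to-org from an eww buffer. As a small sketch of how it could be wired into an init file, the fragment below binds it to a key in eww buffers. The key C-c o is only an assumption chosen for illustration, and the fragment assumes that etm-eww-html-to-org has already been defined.

```elisp
;; Hypothetical key binding (C-c o is an arbitrary choice, not from the
;; original post). The binding is installed once eww has been loaded,
;; because eww-mode-map does not exist before that.
(with-eval-after-load 'eww
  (define-key eww-mode-map (kbd "C-c o") #'etm-eww-html-to-org))
```

With this in place, pressing C-c o while reading a page in eww should open a new *eww2org* buffer containing the Org version of that page.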