Ruby on Rails - Replace relative path URLs with absolute path URLs


I have a bunch of HTML content stored in the database, and I'm looking to convert all of the relative asset references to use absolute paths instead. For instance, all of my image tags look like this:

<img src=\"/system/images/146/original/03.png?1362691463\"> 

I'm trying to prepend "http://mydomain.com" to the "/system/images/" bit. I had the following code that I was hoping would handle it, but sadly it doesn't seem to result in any changes:

text = "<img src=\"/system/images/146/original/03.png?1362691463\">"
text.gsub(%r{<img src=\\('|")\/system\/images\/}, "<img src=\"http://virtualrobotgames.com/system/images/")

Instead of manipulating a URL string using normal string manipulation, use a tool made for the job. Ruby includes the URI class, and there's the more thorough Addressable gem.
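As a minimal sketch of what the URI class gives you (standard library only; the `base` value is just the example domain from the question), parsing a reference exposes its parts, so you can tell whether it already names a host before rewriting it:

```ruby
require 'uri'

base = URI.parse('http://virtualrobotgames.com')
ref  = URI.parse('/system/images/146/original/03.png?1362691463')

# A path-only reference has no scheme or host of its own.
ref.host    # nil
ref.path    # "/system/images/146/original/03.png"
ref.query   # "1362691463"

# Borrow the missing parts from the base URI.
absolute = ref.dup
absolute.scheme = base.scheme
absolute.host   = base.host

puts absolute
```

Because the query string is a separate component, it survives the rewrite without any special handling in a regex.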

Here's what I'd do if I had some HTML with links I wanted to rewrite:

First, parse the document:

require 'nokogiri'
require 'uri'

# Parse the site URL so its scheme and host can be reused below
source_site = URI.parse("http://virtualrobotgames.com")

html = '
<html>
<head></head>
<body>
  <img src="/system/images/146/original/03.png?1362691463">
  <script src="/scripts/foo.js"></script>
  <a href="/foo/bar.html">foo</a>
</body>
</html>
'
doc = Nokogiri::HTML(html)

Then you're in a position to walk through the document and modify tags like <a>, <img>, <script>, and anything else you want:

# Find things using the 'src' and 'href' parameters
tags = {
  'img'    => 'src',
  'script' => 'src',
  'a'      => 'href'
}

doc.search(tags.keys.join(',')).each do |node|
  url_param = tags[node.name]

  src = node[url_param]
  # Skip tags that lack the attribute or have an empty value
  next if src.nil? || src.empty?

  uri = URI.parse(src)
  # Only rewrite URLs that don't already name a host
  unless uri.host
    uri.scheme = source_site.scheme
    uri.host = source_site.host
    node[url_param] = uri.to_s
  end
end

puts doc.to_html

Which, after running, outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head><meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"></head>
<body>
  <img src="http://virtualrobotgames.com/system/images/146/original/03.png?1362691463"><script src="http://virtualrobotgames.com/scripts/foo.js"></script><a href="http://virtualrobotgames.com/foo/bar.html">foo</a>
</body>
</html>

This isn't meant to be a complete, fully-working example. It works with absolute paths (those starting with "/"); you'll also have to deal with relative links, links to sibling/peer hostnames, and missing parameters.
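One way to cover several of those cases is to resolve each reference against the URL of the page it came from with `URI.join`, which applies the standard relative-reference rules. The sketch below is only an assumption about how you might structure it: `absolutize` and `page_url` are hypothetical names, and the helper returns its input unchanged when the value is missing or can't be parsed.

```ruby
require 'uri'

# Hypothetical helper: resolve +ref+ against the page it appeared on.
def absolutize(page_url, ref)
  return ref if ref.nil? || ref.strip.empty?   # missing or blank attribute
  URI.join(page_url, ref).to_s
rescue URI::InvalidURIError
  ref                                          # leave unparseable values alone
end

page_url = 'http://virtualrobotgames.com/foo/bar.html'

absolutize(page_url, '/scripts/foo.js')           # root-relative path
absolutize(page_url, 'baz.html')                  # relative to /foo/
absolutize(page_url, '../img/a.png')              # climbs one directory
absolutize(page_url, '//cdn.example.com/x.js')    # protocol-relative, keeps its host
absolutize(page_url, 'http://other.example.com/') # already absolute, unchanged
```

Inside the loop you could then assign `node[url_param] = absolutize(page_url, src)` instead of setting the scheme and host by hand, at the cost of needing to know the URL of the page each fragment belongs to.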

You'll want to check the errors method of "doc" after parsing to make sure it's valid HTML. The parser can rewrite/trim nodes in invalid HTML while trying to make sense of it.

