A family member was having a problem with some mixed-up image names on a static HTML site. I could have fixed it manually in a few shakes, but that’s no fun. Instead I used hpricot to scrape, open-uri to test for broken-ness, Find to search and some good old-fashioned regex to correct.
This was my first time messing around with hpricot and I found it to be powerful and easy to use, two thumbs up. I foresee some scraping and spidering posts in the near future.
On to the code:
My final script was a bit hairy so I broke out the bit I used to find the broken images.
If you run the script it’ll print the offending paths to screen:
ruby image_scanner.rb http://site.com/busted.html
Or you can call the get_broken_images method to get an array back:
scanner = Image_Scanner.new
broken_images = scanner.get_broken_images "http://site.com/busted.html"
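For anyone who just wants the gist, here’s a rough sketch of the same idea. It is not the original Image_Scanner: the class name BrokenImageSketch and its helper methods are made up for illustration, and it swaps hpricot and open-uri for a plain regex and Net::HTTP so it runs on the standard library alone. It also assumes every src value is a valid URI and, to keep things short, treats redirects as broken.

```ruby
require 'net/http'
require 'uri'

# Hypothetical stand-in for the scanner, not the original Image_Scanner class.
class BrokenImageSketch
  # Captures the src value of each <img> tag (quoted attributes only).
  IMG_SRC = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/i

  # Pull every <img src="..."> out of the HTML and resolve it
  # against the URL of the page it came from.
  def image_urls(html, page_url)
    html.scan(IMG_SRC).flatten.map { |src| URI.join(page_url, src).to_s }.uniq
  end

  # True when a HEAD request does not come back with a 2xx status.
  # Redirects and any network or parse error count as broken.
  def broken?(url)
    uri = URI.parse(url)
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
      http.head(uri.request_uri)
    end
    !response.is_a?(Net::HTTPSuccess)
  rescue StandardError
    true
  end

  # Fetch the page and return the image URLs that look broken.
  def get_broken_images(page_url)
    html = Net::HTTP.get(URI.parse(page_url))
    image_urls(html, page_url).select { |url| broken?(url) }
  end
end
```

Calling BrokenImageSketch.new.get_broken_images("http://site.com/busted.html") should hand back an array of URLs, much like the method above. A real HTML parser will cope with sloppy markup far better than this regex does.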
In case you’re interested, I’ve also uploaded the full code that I used to search for and correct the images, although it’s implementation-specific, riddled with laziness and poorly tested. Read the disclaimer!
Just run it and be amazed!
ruby image_scanner.rb http://site.com/busted.html /media_folder /busted.html /fixed.html
Download only the broken image scanner
Download the full script
When I first fired up this blog I thought I’d be posting a lot about the kind of problems and solutions I bang my head against daily at work. But instead it’s evolved into me posting a lot of the little scripts that I play around with while I’m watching tv or doing laundry or whatever else.
Sometimes I worry that what I’m posting might give one the impression that I’m a bit of a programming simpleton. In order to offset this I’ve thought about posting about some more advanced topics, but my heart just isn’t in it. For whatever reason I’ve really been enjoying playing with and posting my little programs…so that’s what I’m going to do!
I’d tried a few times over the last week to update Battlefield: Bad Company on the PlayStation 3. Every time I tried I would get hung up on the “Downloading update data…” screen. It would stay at 0% for a few minutes and then briefly flash me an error code in the top right corner (8002AD23) and a few minutes after that it’d bounce me to another screen: “An error occurred during the download operation (80710723)”
I googled around for a while. Opened some ports on the router, plugged directly into the wall, restarted. I finally found somewhere that recommended I disable my media server via the network settings and BAM! “fixed”
My “source” for the “fix”: http://www.fixya.com/support/t291065-ps3_system_update_error
Somehow I managed to bugger up the ColdFusion installation on my beloved laptop. Whenever I would try to start the cf server I would get the following not-very-helpful message.
Running the ColdFusion 8 connector wizard
Configuring the web server connector
(Launched on the first run of the ColdFusion 8 start script)
Running apache connector wizard...
There was an error while running the connector wizard
Connector installation was not successful
I googled around enough to realize that, since it didn’t appear to be a common problem, it was probably something I did. And it was.
As you may have figured out by now, I’m not a server-config kind of guy. I poked around a little bit and found an interesting looking shell script at /opt/coldfusion8/bin/connectors/apache_connector.sh. Running that gave me a much better error message:
Could not find directory /etc/apache2/apache2.conf
I opened the file and realized that the paths were all boogered up, meaning the paths I entered when installing ColdFusion were wrong. In any case, I fixed the incorrect paths and here’s what the file looks like now:
# Configure the Apache connector.
# -dir should be the *directory* which contains httpd.conf
# -bin should be the path to the apache *executable*
# -script should be the path to the script which is used to
# start/stop apache
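Going by the error message, -dir appears to have been pointed at the config file itself rather than the directory that holds it. A quick check like the one below would have caught that before running the connector script. This is only a sketch: the method name connector_path_problems is made up, and the paths you pass in are whatever your own install uses.

```ruby
# Sanity-check the three paths the connector script expects.
# Hypothetical helper; pass in the paths from your own install.
def connector_path_problems(dir:, bin:, script:)
  problems = []
  # -dir must be a directory, not the config file inside it.
  problems << "-dir is not a directory: #{dir}" unless File.directory?(dir)
  # -bin must be a regular file that can be executed.
  problems << "-bin is not an executable file: #{bin}" unless File.file?(bin) && File.executable?(bin)
  # -script only needs to exist as a regular file.
  problems << "-script does not exist: #{script}" unless File.file?(script)
  problems
end
```

An empty array means all three paths look sane. Anything else is a list of messages saying which flag to fix.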
New computer begets new blog. That’s just how these things work. I intend this to mainly be a professional blog, but like all professional blogs I actually enjoy reading I’ll also be smushing in some personal stuff here and there.
I plan on re-beginning this whole blogging thing by posting solutions to some of the little problems I (and anyone else who uses a computer) run into all the friggin’ time.
Ta Ta for now.