_ hpricot.com
| blog | demos | contact | hpricot alternatives

Demos



Simple Search

Sample Syntax:

hpricot_object.search("a").to_html



Simple Remove

Sample Syntax:

hpricot_object.search("a[@href*=shtml]").remove

I use hpricot_object.to_html to see the effect.

Removing comments is a bit more work. See the comments removal demo a bit further down.


Stacked Search

Sample Syntax:

hpricot_object.search("a").search("img").to_html


"at" Search is similar to Simple Search

Sample Syntax:

hpricot_object.at("img[@src*=png]").to_html

.at() is useful if I need just 1 element. If I want more than 1, I use .search() instead.


If I need to "peel off" an outer tag, I use .inner_html combined with .at()

Sample Syntax:

hpricot_object.at("html/body/div/div/div/div/div/div").inner_html


If .search() gives me several elements and I want the first, I could use first (I'd probably use .at() though).

Sample Syntax:

hpricot_object.search("img[@src*=png]").first.to_html


Ruby is well suited for working with Enumerable objects.

And, .search() returns an enumerable object.

Sample Syntax:

hpricot_object.search("a").map{|e| "<hr />#{e.to_html}" }


Here, I loop through the object and attach an <hr /> to the front of each element.

I also sort the result by href, a step the sample syntax above leaves out.
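The sort by href is not part of the sample syntax, so here is a minimal sketch of how it might look. Link is a hypothetical stand-in for an element so the snippet runs without Hpricot, and ruled_links_by_href is a name I made up. With a real element I would expect the sort key to be e.get_attribute("href"), the call used in the attribute example further down.

```ruby
# Hypothetical stand-in for an element: only the two methods this sketch needs.
Link = Struct.new(:href, :text) do
  def to_html
    %(<a href="#{href}">#{text}</a>)
  end
end

# Sort by href first, then put an <hr /> in front of each element's HTML.
def ruled_links_by_href(links)
  links.sort_by { |e| e.href }.map { |e| "<hr />#{e.to_html}" }
end

links = [Link.new("/b.html", "B"), Link.new("/a.html", "A")]
puts ruled_links_by_href(links)
```

Sorting before mapping keeps the sort key simple: I compare plain href strings instead of the generated HTML.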


I can burrow into an hpricot_object and wrap some HTML around the outside of each element pointed to by the .search() method.

Sample Syntax which wraps <h1 /> around the outside of every <a> element:

hpricot_object.search("a").wrap("<h1>")

An easy way to see the effect is to just run hpricot_object.to_html

Or, I could run the original search and then use .search("..") to go up a level:

hpricot_object.search("a").search("..").to_html


If I want to replace an element with some HTML of my choice, I can use .at() combined with .swap()

Sample Syntax:

hpricot_object.at("div/a").swap("<h1>hpricot.com</h1>")

An easy way to see the effect is to just run hpricot_object.to_html

If I know about the parent of the new element, I can search for the parent:

hpricot_object.search("div").to_html


I can use Simple Search to find element attributes. Perhaps I need a list of href attributes?

Sample Syntax:

hpricot_object.search("a[@href*=nytimes.com]").map {|e| '<hr />' + e.get_attribute("href") }

The above search is a bit loose. It will match if 'nytimes.com' appears anywhere in href.

If I need an exact match, my call to search would look like this: .search("a[@href='http://www.google.com']")


Perhaps I want to visualize the enumerable returned by .search() ?

Sample Syntax:

i = -1; hpricot_object.search("a").map{|e| i+=1;"<hr />#{i}#{e}"}

Here, I use .map() to create an array of numbered HTML strings.
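The hand-managed counter works, but Ruby can supply the index itself. The sketch below uses each_with_index.map on plain strings that stand in for elements. numbered is a made-up helper name, and I am assuming the result of .search() can be enumerated the same way an Array can.

```ruby
# Number each item without a separate counter variable.
# Plain strings stand in for the elements returned by a search.
def numbered(items)
  items.each_with_index.map { |e, i| "<hr />#{i}#{e}" }
end

puts numbered(["<a>x</a>", "<a>y</a>"])
```

The numbering starts at 0, the same as the i = -1 version, because that version increments before it builds each string.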


Perhaps I want to see just a slice of the enumerable returned by .search() ?

Sample Syntax:

i = -1; hpricot_object.search("a").map{|e| i+=1;"<hr />#{i}#{e}"}[5,11]

Here, I get up to 11 elements, starting at index 5 (the sixth element, since counting starts at 0).
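The two numbers in the brackets are a start index and a length, not a first and last position. A small sketch on a plain Ruby Array shows the difference. window is a made-up helper name and the letters are only sample data.

```ruby
# Array#[] with two arguments means [start, length], not [first, last].
def window(items, start, length)
  items[start, length]
end

letters = ("a".."t").to_a      # 20 items, indexes 0 through 19
p window(letters, 5, 11)       # 11 items, beginning at index 5
p window(letters, 15, 11)      # fewer than 11: only 5 items remain from index 15
```

If the list is shorter than start plus length, I simply get whatever is left, so the slice is safe on short result sets.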


Working with HTML comments

Sample Syntax:

hpricot_object.search("body").search("*").map{|e| "<hr />#{e}" if e.comment?}

I cannot use a .search() selector to locate HTML comments directly.

I can, however, use .search("*") to get a list of all the nodes.

Then, I loop through the list and ask, "Is this a comment?"

So, one way to display HTML comments is to use .map() to create an array of HTML strings from a stacked .search()
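One detail of this pattern: a .map() block with a trailing if returns nil for every node that is not a comment, so the array is padded with nils. The sketch below drops them with compact. FakeNode is a hypothetical stand-in that answers comment? and to_s, so the snippet runs without Hpricot, and comment_strings is a made-up helper name.

```ruby
# Hypothetical stand-in node: answers comment? and to_s like the nodes in the loop.
FakeNode = Struct.new(:html, :is_comment) do
  def comment?
    is_comment
  end

  def to_s
    html
  end
end

# map with a trailing `if` yields nil for non-comments; compact removes those nils.
def comment_strings(nodes)
  nodes.map { |e| "<hr />#{e}" if e.comment? }.compact
end

nodes = [FakeNode.new("<p>text</p>", false), FakeNode.new("<!-- note -->", true)]
puts comment_strings(nodes)
```

Filtering with select before the map would work equally well; compact just keeps the change to the original one-liner small.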


Removing HTML comments

Sample Syntax:

hpricot_object.search("body").search("*").each{|e| (lst=e.parent.children;e.parent=nil;lst.delete(e)) if e.comment?}

Removing HTML comments is similar to displaying them.

I remove HTML comments using a stacked search and a loop. I use hpricot_object.to_html to see the effect.


Searching For Text Nodes

Sample Syntax:

hpricot_object.search("a[text()*='Google']")

Notice that this is identical to simple search. I just need to know this format: [text()*='Washington']

If I'm looking for an EXACT match, I use this format: [text()='Washington']

This is similar to searching for an element by its attributes rather than its name.

See the href searching example above.


Altering Text Nodes

Sample Syntax:

hpricot_object.search("*").each {|e| e.content=e.content().gsub(Regexp.new('bikle.com'), 'bikle.com IS MY SITE!') if e.text? }

The "getter" method for a text node content is: .content()

The "setter" method for a text node content is: .content=

I use hpricot_object.to_html to see the effect.
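A caution about the pattern: in Regexp.new('bikle.com') the dot matches any character, so text like "biklexcom" would also be rewritten. The sketch below escapes the site name first. It works on plain strings, and tag_site is a made-up helper name.

```ruby
# Regexp.escape turns the dot into a literal dot, so only the exact
# site name matches. The block form appends the suffix to each match.
def tag_site(text, site, suffix)
  text.gsub(Regexp.new(Regexp.escape(site))) { |match| match + suffix }
end

puts tag_site("see bikle.com and biklexcom", "bikle.com", " IS MY SITE!")
```

Passing the plain string 'bikle.com' as the first argument to gsub would also match literally; I keep Regexp.new here only to stay close to the sample syntax.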


Removing Attributes

Sometimes I want to remove a JavaScript onclick attribute from an <a> tag.

Sample Syntax:

hpricot_object.search("a[@onclick]").remove_attr("onclick")

I use hpricot_object.to_html to see the effect.

