find links containing bold text using WWW::Mechanize

Without using extra modules that can parse the contents of the page, it's going to be difficult to achieve your goal with WWW::Mechanize. However, there are other modules that will allow you to achieve this very easily.

Here is an example using Mojo::DOM, which lets you select elements with CSS selectors. The Mojolicious distribution also contains Mojo::UserAgent, so you could migrate your code over to Mojo fairly easily if you are not too tied to WWW::Mechanize.

use feature 'say';
use Mojo::DOM;

# $html is the content of the page
my $dom = Mojo::DOM->new($html);

# Find all <b> elements that are under <a> elements (at any depth
# beneath the <a>), then get the <a> ancestors of those elements.
# The result is a Mojo::Collection object.
my $collection = $dom->find('a b')
    ->map( sub { return $_->ancestors('a') } )
    ->flatten;

$collection->each( sub {
    say "LINK: " . $_->all_text;
} );

# Use a sub to perform an action on each of the retrieved <a>
$dom->find('a b')->each( sub {
    $_->ancestors('a')->each( sub {
        say "All in one: " . $_->all_text
    } )
} );

Here's a demonstration with a sample list of links:

<ul><li><a href="abc.com"><b>ABC</b> industry</a></li>
<li><a href="google.com">ABC Search</a></li>
<li>Here is <a href="#">a link 
    <span>with a span 
        <b>and a "b" tag</b> 
          even though
    </span> "b" tags are deprecated.</a> Yay!</li>
<li><a href="abc.com">Movies with <b>ABC</b></a></li>
</ul>

Output:

LINK: ABC industry
LINK: a link with a span and a "b" tag even though "b" tags are deprecated.
LINK: Movies with ABC
All in one: ABC industry
All in one: a link with a span and a "b" tag even though "b" tags are deprecated.
All in one: Movies with ABC
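
If you would rather keep WWW::Mechanize for fetching pages, the same query can be wrapped in a small helper that takes an HTML string. The following is only a minimal sketch: the name bold_link_hrefs is made up for illustration and is not part of either module. It returns href values rather than link text, and removes duplicates because a link holding several <b> elements would otherwise be found once per <b>.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical helper (not part of WWW::Mechanize or Mojolicious):
# return the href of every <a> element that contains a <b> element
# at any depth.
sub bold_link_hrefs {
    my ($html) = @_;
    return Mojo::DOM->new($html)
        ->find('a b')                                # every <b> inside an <a>
        ->map( sub { $_->ancestors('a')->first } )   # nearest enclosing <a>
        ->map( sub { $_->attr('href') } )            # its href attribute
        ->compact                                    # drop missing/empty hrefs
        ->uniq                                       # one entry per href
        ->each;                                      # list context: plain list
}

my $html = '<a href="one.html"><b>One</b></a> <a href="two.html">Two</a>';
say for bold_link_hrefs($html);    # prints one.html
```

With WWW::Mechanize, the HTML string would come from $mech->content after a successful $mech->get.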

If you use Mojo::UserAgent instead of WWW::Mechanize your search can be even easier. Mojo::UserAgent can get a page (just like WWW::Mechanize), and the DOM of the returned page can be accessed using $ua->get($url)->res->dom. You can then chain your query above on this, to give the following:

use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new();
# get the page at $url and find the links with a <b> element in them:
$ua->get($url)
   ->res->dom('a b')->each( sub {
       $_->ancestors('a')->each( sub { say $_->all_text } )
   } );

# example using this page (with $url set to this page's address):
# print the contents of divs with class 'spacer' that contain a link with a
# div in it:
$ua->get($url)
   ->res->dom('a div')->each( sub {
    $_->ancestors('div.spacer')->each( sub {
        say $_->all_text
    } )
} );
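
The chained calls above do not check whether the request succeeded. As a rough sketch of a more defensive version, assuming a Mojolicious release recent enough to provide result on the transaction, the helper below reports failed requests before touching the DOM. The name bold_link_texts is hypothetical. It also selects each <a> once by filtering on whether it contains a <b>, which is a variation on the ancestors approach rather than the same query.

```perl
use strict;
use warnings;
use Mojo::UserAgent;

# Hypothetical helper: fetch $url with the given Mojo::UserAgent and
# return the text of every <a> element that has a <b> descendant.
sub bold_link_texts {
    my ($ua, $url) = @_;

    # result() throws an exception on connection errors
    my $res = $ua->get($url)->result;
    die sprintf( "Request for %s failed: %s %s\n",
        $url, $res->code, $res->message )
        unless $res->is_success;

    return $res->dom->find('a')
        ->grep( sub { defined $_->at('b') } )   # keep links containing a <b>
        ->map( sub { $_->all_text } )           # text of the whole link
        ->each;                                 # list context: plain list
}
```

It could be called as: say for bold_link_texts(Mojo::UserAgent->new, $url);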


There are lots of examples in the Mojolicious documentation in case this isn't immediately comprehensible!

For a helpful 8-minute introductory video on Mojo::DOM and Mojo::UserAgent, check out Mojocast Episode 5.

