If you want to get all the URLs within a variable $content (which in my case contains a full HTML page that I crawled) and place them in a variable called $urls, you can use the following:
$urls = array();
// $urls[1] receives the href values, $urls[2] the link text after each tag
preg_match_all('@href="([^"]+)"[^>]*>([^<]+)@', $content, $urls);
return $urls[1];
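As a rough sketch, the same pattern can be wrapped in a small function and tried on a sample string. The function name extract_urls and the sample HTML are made up for illustration; only the regular expression comes from the snippet above.

```php
<?php
// Hypothetical helper name; the pattern is the one from the snippet above.
function extract_urls($content)
{
    $matches = array();
    // Group 1 captures the href value, group 2 the text that follows the tag.
    preg_match_all('@href="([^"]+)"[^>]*>([^<]+)@', $content, $matches);
    return $matches[1];
}

// Made-up sample page for illustration.
$content = '<p><a href="http://example.com/a">First</a> and <a class="x" href="/b">Second</a></p>';
$urls = extract_urls($content);
print_r($urls);
```

Note that this pattern only sees double-quoted href attributes, and it skips links whose opening tag is followed directly by another tag (an image link, for example), because it expects some text right after the closing ">".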
To get the title of a page from a variable called $content (containing a full HTML page), you can use the following:
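$matches = array();
// The i modifier matches <TITLE> as well, without lowercasing the title text
preg_match_all('@<title>([^<]*)</title>@i', $content, $matches);
// Return an empty string when the page has no title
return isset($matches[1][0]) ? $matches[1][0] : '';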
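Here is a minimal sketch of the title lookup as a complete function. The name get_title and the sample page are hypothetical. It assumes you want the title with its original capitalization, so it uses the i modifier for a case-insensitive match instead of lowercasing the whole page, and it returns an empty string when no title is found.

```php
<?php
// Hypothetical helper name, not from the original snippet.
function get_title($content)
{
    $matches = array();
    // The i modifier lets the pattern match <TITLE> or <Title> too.
    if (preg_match('@<title>([^<]*)</title>@i', $content, $matches)) {
        // Strip surrounding whitespace from the captured title text.
        return trim($matches[1]);
    }
    // No title tag found.
    return '';
}

// Made-up sample page for illustration.
$page = '<html><head><TITLE> My Page </TITLE></head><body></body></html>';
$title = get_title($page);
echo $title, "\n";
```

Since a page normally has only one title, preg_match is enough here; it stops at the first match.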
By PHPin24 @ 2010-03-14 03:28:34