all these triangles!

processing.org — nik @ 4:10 pm

yay! Finally uploaded something with fine lines to vimeo that didn’t get entirely screwed up by the compression process. Animation Codec @ 720×480 with lots of keyframes seems to do okay.

all these triangles! from nik hanselmann on Vimeo.

working out these blocks

processing.org — nik @ 5:41 pm

working out these blocks from nik hanselmann on Vimeo.

simpler times from nik hanselmann on Vimeo.

Screen scraping with Perl, HTML::TreeBuilder

perl — nik @ 6:41 pm

I put together a simple screen scraper using HTML::TreeBuilder in Perl. I am new to Perl and had a tricky time getting the code available online to do what I wanted, so after a bunch of trial and error I ended up with this. The script requests up to 1000 results for “Doogie Howser, M.D.” from Google, 100 per page, and then parses out the title, link, and summary for each result. As you can see, I used HTML::TreeBuilder to pick out elements by their tag and CSS class in the HTML.

#!/usr/bin/perl

#screenscraping with HTML::TreeBuilder

use strict;
use LWP;
use HTML::TreeBuilder;

my($user_agent, $url, $browser, $request, $max_result, $response, $start,
$content, $search_result, @search_results);

$max_result = 1000;
$start = 0;

#Google for "Doogie Howser, M.D.", 100 results per page.
$url = 'http://www.google.com/search?hl=en&q=Doogie+Howser%2C+M.D.&aq=f&oq=&aqi=&num=100';
$user_agent = LWP::UserAgent->new();

print "Searching...this may take a minute\n";
while ( $start < $max_result ) {
	$request = HTTP::Request->new(GET => $url."&start=$start");
	$user_agent->timeout(30);
	$user_agent->agent('Mozilla/5.0');
	$response = $user_agent->request($request);
	if($response->is_success){
		$content .= $response->content;
	}
	else {
		#stop here instead of retrying the same page forever
		warn "request failed at start=$start: ".$response->status_line."\n";
		last;
	}
	$start += 100;
}

#parse $content with treebuilder
my $page = HTML::TreeBuilder->new();
$page->parse($content);
$page->eof();

#the following code finds every <li> item with class 'g' -- right now,
#this is how search results are styled on google. so, the HTML element
#for each item returned by google is stored in @search_results.
#the regex matches 'g' as a whole word, so classes that merely contain
#the letter g are skipped, and elements with no class never match.

@search_results = $page->look_down(
	_tag  => 'li',
	class => qr/\bg\b/,
);

#you could uncomment this:
#foreach $search_result (@search_results){
# print $search_result->as_HTML,"\n\n";
#}
#and see how each item looks in the array.

foreach $search_result (@search_results){
	my($url, $title, $summary, $link);

	#each item in @search_results is already an HTML::Element, so we can
	#search inside it directly instead of re-parsing its HTML:

	#the title is styled as <h3 class=r>
	$title = $search_result->look_down(_tag => 'h3', class => qr/\br\b/);

	#the summary is styled as <div class=s>
	$summary = $search_result->look_down(_tag => 'div', class => qr/\bs\b/);

	#the link is styled as <a class=l href="..."> inside the title,
	#so we look it up there and read its href attribute.
	#some items have no title, so check before using it.
	if($title){
		$link = $title->look_down(_tag => 'a', class => qr/\bl\b/);
		$url = $link->attr('href') if $link;
	}

	#print everything out...
	if($title) { print 'title: '.$title->as_text."\n";}
	if($summary){ print 'summary: '.$summary->as_text."\n";}
	if($url){ print 'url: '.$url."\n\n";}
}

#delete the treebuilder object.
$page->delete;
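
If you want to try the look_down part without hitting Google at all, here is a minimal sketch that runs the same kind of lookup on a hard-coded chunk of HTML. The markup is made up to imitate the structure described in the comments above (li class g, h3 class r, div class s, a class l); it is not real Google output, and extract_results is just a name I picked for the helper, not something HTML::TreeBuilder provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Made-up markup imitating the result structure described above.
my $html = <<'HTML';
<ol>
  <li class="g">
    <h3 class="r"><a class="l" href="http://example.com/one">First result</a></h3>
    <div class="s">Summary of the first result.</div>
  </li>
  <li class="g">
    <h3 class="r"><a class="l" href="http://example.com/two">Second result</a></h3>
    <div class="s">Summary of the second result.</div>
  </li>
  <li class="other">Not a search result</li>
</ol>
HTML

# Hypothetical helper: returns one hashref per <li class="g"> item.
sub extract_results {
	my ($markup) = @_;
	my $tree = HTML::TreeBuilder->new_from_content($markup);
	my @results;

	# attr => 'value' criteria require an exact match on that attribute
	foreach my $item ($tree->look_down(_tag => 'li', class => 'g')) {
		my $title   = $item->look_down(_tag => 'h3',  class => 'r');
		my $summary = $item->look_down(_tag => 'div', class => 's');
		my $link    = $title ? $title->look_down(_tag => 'a', class => 'l') : undef;

		push @results, {
			title   => $title   ? $title->as_text     : '',
			summary => $summary ? $summary->as_text   : '',
			url     => $link    ? $link->attr('href') : '',
		};
	}

	# copy out plain strings first, then free the tree
	$tree->delete;
	return @results;
}

foreach my $result (extract_results($html)) {
	print "title: $result->{title}\n";
	print "summary: $result->{summary}\n";
	print "url: $result->{url}\n\n";
}
```

Since the helper copies the text and href into plain hashes before deleting the tree, nothing it returns still points into the parsed document.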

IT BOOT UP

technostalgia — nik @ 3:47 pm

itbootup1

looking around google code…

programming — nik @ 12:15 am

The best thing I found while searching:

FnPtr FnED[256]=
{
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  in_b_xc,out_xc_b,sbc_hl_bc,ld_xworde_bc,neg,retn,im_0,ld_i_a,
  in_c_xc,out_xc_c,adc_hl_bc,ld_bc_xworde,fuck,reti,fuck,ld_r_a,
  in_d_xc,out_xc_d,sbc_hl_de,ld_xworde_de,fuck,fuck,im_1,ld_a_i,
  in_e_xc,out_xc_e,adc_hl_de,ld_de_xworde,fuck,fuck,im_2,ld_a_r,
  in_h_xc,out_xc_h,sbc_hl_hl,ld_xworde_hl,fuck,fuck,fuck,rrd,
  in_l_xc,out_xc_l,adc_hl_hl,ld_hl_xworde,fuck,fuck,fuck,rld,
  in_f_xc,fuck,sbc_hl_sp,ld_xworde_sp,fuck,fuck,fuck,fuck,
  in_a_xc,out_xc_a,adc_hl_sp,ld_sp_xworde,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  ldi,cpi,ini,outi,fuck,fuck,fuck,fuck,
  ldd,cpd,ind,outd,fuck,fuck,fuck,fuck,
  ldir,cpir,inir,otir,fuck,fuck,fuck,fuck,
  lddr,cpdr,indr,otdr,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,fuck,fuck,
  fuck,fuck,fuck,fuck,fuck,fuck,patch,fuck
};

from http://www.google.com/codesearch/p?hl=en#OiylO0H42Bs/codeed.c&q=fuck&l=373

more particles

processing.org — nik @ 7:37 am

still figuring out how to compress for Vimeo

particles from nik hanselmann on Vimeo.

bezier polygon test

processing.org — nik @ 1:59 am

bezier polygon test from nik hanselmann on Vimeo.

vimeo compressor is awful.

playing with texture

processing.org — nik @ 12:22 am


memorandum for john rizzo, 3 permutations

torture — nik @ 12:07 am

http://luxmedia.vo.llnwd.net/o10/clients/aclu/olc_08012002_bybee.pdf

feeling glitchy

maxmsp — nik @ 10:35 pm

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
(c) 2010 nik’s blog | powered by WordPress with Barecity