AV女優の情報をデータベースに突っ込む
データベースを扱うプログラムの例として, DMM.comからAV女優の
情報を取得して、データベースに突っ込むというコードを書いてみた。
#!/usr/bin/env perl use strict; use warnings; use utf8; use Web::Scraper; use URI; use Time::HiRes; use DBI; binmode STDOUT, ":utf8"; my @actresses = get_actress_info(); my $dbh = DBI->connect( "dbi:SQLite:dbname=actress.db", undef, undef, { AutoCommit => 0, } ); die DBI->errstr unless defined $dbh; my $retval = $dbh->do(q/ CREATE TABLE IF NOT EXISTS actress ( id INTEGER PRIMARY KEY, name TEXT, image TEXT )/); die "Can't create actress table\n" unless defined $retval; my $sth = $dbh->prepare(q/ INSERT INTO actress (id, name, image) VALUES(?, ?, ?) /); die DBI->errstr unless defined $sth; foreach my $actress (@actresses) { my $r = $sth->execute($actress->{dmm_id}, $actress->{name}, $actress->{image}); die $sth->errstr unless defined $r; print "Insert ", $actress->{name}, "\n"; } $dbh->commit; $dbh->disconnect; sub get_actress_info { my @urls = get_dmm_actress_urls(); my @actresses; foreach my $url (@urls) { push @actresses, collect_actress_image($url); } return @actresses; } sub collect_actress_image { my $url = shift; my $image = scraper { process '.act-box > ul > li ', 'actresses[]' => scraper { process 'a > img', image => '@src'; process 'a', name => 'TEXT'; process 'a', id => '@href'; }; }; my $info = $image->scrape( URI->new($url) ); my @actresses; foreach my $info ( @{$info->{'actresses'}}) { my ($name, $image, $id) = ($info->{name}, $info->{image}, $info->{id}); $name =~ s{[((].+?[))]}{}xms; $image =~ s{/(?:thumbnail|medium)}{}xms; $id =~ m{id=(\d+)}; my $dmm_id = $1; my $actress = { name => $name, image => $image, dmm_id => $dmm_id, }; push @actresses, $actress; } return @actresses; } sub get_dmm_actress_urls { my @initials = collect_initial_chars(); my $url_tmpl = 'http://www.dmm.co.jp/mono/dvd/-/actress/=/keyword=?/'; my @urls; foreach my $initial (@initials) { (my $url = $url_tmpl) =~ s{ \? }{$initial}xms; push @urls, $url; } return @urls; } sub collect_initial_chars { my @initials = qw(a i u e o); foreach my $a (qw/k s t n h m y r w/) { foreach my $b (qw/a i u e o/) { next if $a eq 'y' && $b =~ m/[ie]/; next if $a eq 'w' && $b ne 'a'; push @initials, ($a . $b); } } return @initials; }
確認。ひらがなの「なお」で終わる人の名前の検索
% sqlite actress.db sqlite > SELECT name FROM actress WHERE name LIKE "%なお"; 津田なお 鮎川なお 林なお 加藤なお
うまくいってますね。
SQLが全然理解できていないので, まずはそこから勉強しないとね。
ORMを理解して使ってみたいけど、それはまだまだ先になりそうですね。
SQLでいい本はないものだろうか?