It took me quite a while to find a usable metadata dump of the Citeseer catalog. Citeseer supports the OAI harvesting protocol, but the server that handles it seems to be down most of the time. Fortunately, after much digging, I found:
http://www.cs.purdue.edu/commugrate/data_access/all_data_sets.php
and more specifically:
http://www.cs.purdue.edu/commugrate/data/citeseer/
Ah, metadata, how I missed you.
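
For anyone who does catch an OAI server on a good day, here is a rough sketch of what harvesting over OAI-PMH looks like. It is not what I ended up using (I went with the dump above). The base URL in the usage note is a placeholder, not Citeseer's real endpoint, and the function names are my own. It assumes the server exposes the standard `oai_dc` (Dublin Core) metadata format and only pulls out titles.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 responses and Dublin Core elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (helper name is hypothetical).

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent along with it.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (titles, next_token) for one page of a ListRecords response.

    next_token is None when the server sends no token or an empty one,
    which signals the end of the list.
    """
    root = ET.fromstring(xml_text)
    titles = [el.text for el in root.iter(DC_NS + "title")]
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    has_token = token_el is not None and token_el.text
    return titles, (token_el.text if has_token else None)
```

To harvest, you would fetch `list_records_url("http://example.org/oai")` (placeholder address), hand the response body to `parse_page`, and keep requesting with the returned token until it comes back as `None`. When the server is as flaky as this one, you would also want retries and a polite delay between requests.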