This repository was archived by the owner on Oct 17, 2021. It is now read-only.

0rca/clj-scraper

clj-scraper

A web scraper written for personal enjoyment and for experimenting with core.async. It supports two websites, selected with the --source option (ngo or vrotmne).

Requirements

  1. Leiningen
  2. JDK >= 1.6

Building

$ lein uberjar

Usage

$ java -jar target/scraper-0.3.1-standalone.jar

Options

-c, --cache [dir]           cache files directory
-o, --output [dir]          downloaded images directory
-w, --workers [num]         number of download workers
-d, --debug                 display debug info
-s, --source [ngo|vrotmne]  handle of website to scrape
-S, --skip [num]            skip first num posts of LJ
-L, --list-only             save image urls, but don't download
-x, --exit-on-exist         exit the process if downloaded file exists
-h, --help                  print this help

Examples

$ java -jar target/scraper-0.3.1-standalone.jar -w 20 -s ngo
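
For repeated runs, the options above can be wrapped in a small shell helper. The following is only a sketch: `run_scraper` is a hypothetical name, `./images` is a placeholder directory, and the worker count of 4 is an arbitrary illustrative value rather than the tool's default. It assumes the flags behave as their help text describes.

```shell
#!/bin/sh
# Jar path as given in the Usage section; override with JAR=... if your build differs.
JAR="${JAR:-target/scraper-0.3.1-standalone.jar}"

# run_scraper SOURCE [WORKERS]
# Hypothetical helper: validates the source handle before starting the JVM.
run_scraper() {
    src="$1"
    workers="${2:-4}"   # illustrative fallback, not the tool's own default

    case "$src" in
        ngo|vrotmne) ;;  # the two handles listed under --source
        *)
            echo "unknown source: $src (expected ngo or vrotmne)" >&2
            return 2
            ;;
    esac

    # -x stops the process once an already downloaded file is found,
    # which is meant to keep incremental re-runs short.
    java -jar "$JAR" -s "$src" -w "$workers" -o ./images -x
}
```

After sourcing the file, `run_scraper ngo 20` would start a run against the ngo source with 20 download workers, while an unrecognised handle is rejected before Java is launched.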

License

Copyright © 2013 FIXME

Distributed under the Eclipse Public License, the same as Clojure.
