clj-scraper

A web scraper written for personal enjoyment and for experimenting with core.async. It supports two websites, selected with the -s option (handles ngo and vrotmne).

Requirements

Leiningen
JDK >= 1.6

Building

$ lein uberjar

Usage

$ java -jar target/scraper-0.3.1-standalone.jar

Options

-c, --cache [dir]           cache files directory
-o, --output [dir]          downloaded images directory
-w, --workers [num]         number of download workers
-d, --debug                 display debug info
-s, --source [ngo|vrotmne]  handle of website to scrape
-S, --skip [num]            skip first num posts of LJ
-L, --list-only             save image urls, but don't download
-x, --exit-on-exist         exit the process if downloaded file exists
-h, --help                  print this help

Examples

$ java -jar target/scraper-0.3.1-standalone.jar -w 20 -s ngo
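
The options listed above can be combined in a single call. The snippet below is only a sketch of a small wrapper, not part of the project: the function name run_scraper and the ./cache and ./images directories are hypothetical, and the jar path is the one shown in the Usage section, so adjust the version to match your build.

```shell
#!/bin/sh
# Hypothetical wrapper; the jar path is taken from the Usage section.
JAR="${JAR:-target/scraper-0.3.1-standalone.jar}"

# Run the scraper with the given options, or explain how to build it
# when the standalone jar has not been produced yet.
run_scraper() {
  if [ ! -f "$JAR" ]; then
    echo "missing $JAR; build it first with: lein uberjar" >&2
    return 1
  fi
  java -jar "$JAR" "$@"
}

# Example calls (uncomment one):
# run_scraper -s vrotmne -w 5 -c ./cache -o ./images   # separate cache and image directories
# run_scraper -s ngo -L -d                             # save image urls only, with debug info
```

The wrapper passes every argument through unchanged, so any option from the list above can be used with it.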

License

Distributed under the Eclipse Public License, the same as Clojure.
