code4craft/webmagic
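WebMagic is an extensible Java web crawler framework that simplifies crawler development. It supports downloading, URL management, content extraction, and storage.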
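The four responsibilities named above (downloading, URL management, content extraction, storage) can be pictured with a minimal plain-Java sketch. It deliberately does not use WebMagic's own API: `CrawlSketch`, `Downloader`, `Pipeline`, and the in-memory pages are hypothetical names chosen for illustration, and regex matching stands in for a real HTML parser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these types are WebMagic classes.
final class CrawlSketch {

    /** Downloading: turns a URL into raw HTML (null if unavailable). */
    interface Downloader {
        String download(String url);
    }

    /** Storage: receives each extracted result. */
    interface Pipeline {
        void store(String url, String title);
    }

    private static final Pattern TITLE = Pattern.compile("<title>(.*?)</title>");
    private static final Pattern LINK = Pattern.compile("href=\"([^\"]+)\"");

    static void crawl(String startUrl, Downloader downloader, Pipeline pipeline) {
        // URL management: a FIFO queue plus a seen-set for de-duplication.
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        queue.add(startUrl);
        seen.add(startUrl);

        while (!queue.isEmpty()) {
            String url = queue.poll();
            String html = downloader.download(url);
            if (html == null) {
                continue; // page not available, skip it
            }

            // Content extraction: pull the title and the outgoing links.
            Matcher title = TITLE.matcher(html);
            if (title.find()) {
                pipeline.store(url, title.group(1));
            }
            Matcher link = LINK.matcher(html);
            while (link.find()) {
                String next = link.group(1);
                if (seen.add(next)) { // add() is false for already-seen URLs
                    queue.add(next);
                }
            }
        }
    }

    public static void main(String[] args) {
        // In-memory "web" so the sketch runs without network access.
        Map<String, String> pages = Map.of(
            "a", "<title>A</title><a href=\"b\">b</a><a href=\"a\">self</a>",
            "b", "<title>B</title><a href=\"a\">back</a>");
        List<String> stored = new ArrayList<>();
        crawl("a", pages::get, (url, t) -> stored.add(url + "=" + t));
        System.out.println(stored); // prints [a=A, b=B]
    }
}
```

Keeping each stage behind its own small interface is what makes a crawler of this shape extensible: the downloader or the storage step can be swapped without touching the crawl loop.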
Tech stack
Root directory java
Dependencies (28)
com.alibaba:fastjson
com.github.detro:phantomjsdriver
com.github.dreamhead:moco-core
com.google.guava:guava
com.jayway.jsonpath:json-path
commons-cli:commons-cli
commons-io:commons-io
net.sf.saxon:Saxon-HE
net.sourceforge.htmlcleaner:htmlcleaner
org.apache.commons:commons-collections4
org.apache.commons:commons-lang3
org.apache.httpcomponents:httpclient
org.apache.httpcomponents:httpcore
org.apache.logging.log4j:log4j-core
org.apache.logging.log4j:log4j-slf4j2-impl
org.assertj:assertj-core
org.codehaus.groovy:groovy-all
org.jruby:jruby
org.junit.jupiter:junit-jupiter-engine
org.junit.platform:junit-platform-launcher
org.junit.platform:junit-platform-runner
org.junit.vintage:junit-vintage-engine
org.mockito:mockito-all
org.python:jython
org.seleniumhq.selenium:selenium-java
org.slf4j:slf4j-api
redis.clients:jedis
us.codecraft:xsoup
webmagic-core java
Dependencies (11)
com.alibaba:fastjson
com.github.dreamhead:moco-core
com.jayway.jsonpath:json-path
commons-io:commons-io
org.apache.commons:commons-collections4
org.apache.commons:commons-lang3
org.apache.httpcomponents:httpclient
org.assertj:assertj-core
org.mockito:mockito-all
org.slf4j:slf4j-api
us.codecraft:xsoup
webmagic-coverage java
Dependencies (6)
${project.groupId}:webmagic-core
${project.groupId}:webmagic-extension
${project.groupId}:webmagic-samples
${project.groupId}:webmagic-saxon
${project.groupId}:webmagic-scripts
${project.groupId}:webmagic-selenium
webmagic-extension java
Dependencies (5)
${project.groupId}:webmagic-core
com.google.guava:guava
org.assertj:assertj-core
org.projectlombok:lombok
redis.clients:jedis
webmagic-samples java
Dependencies (6)
${project.groupId}:webmagic-core
${project.groupId}:webmagic-extension
com.fasterxml.jackson.core:jackson-annotations
com.fasterxml.jackson.core:jackson-core
com.fasterxml.jackson.core:jackson-databind
org.mapdb:mapdb
webmagic-saxon java
Dependencies (3)
${project.groupId}:webmagic-core
net.sf.saxon:Saxon-HE
net.sourceforge.htmlcleaner:htmlcleaner
webmagic-scripts java
Dependencies (9)
${project.groupId}:webmagic-core
${project.groupId}:webmagic-extension
commons-cli:commons-cli
org.apache.logging.log4j:log4j-core
org.apache.logging.log4j:log4j-slf4j2-impl
org.jetbrains.kotlin:kotlin-stdlib
org.jruby:jruby
org.projectlombok:lombok
org.python:jython
webmagic-selenium java
Dependencies (3)
${project.groupId}:webmagic-core
com.github.detro:phantomjsdriver
org.seleniumhq.selenium:selenium-java