Friday, 29 May 2015

Error: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist

I am new to Nutch and Solr integration.

I want to crawl new URLs, so I installed Solr 4.6.0 and Nutch 1.6 on Ubuntu. I started with some basic configuration, but I still get this error:

org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/crawl_fetch

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/crawl_parse

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/parse_data

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/parse_text

In the log file I get this error:

2015-05-29 03:05:41,153 ERROR security.UserGroupInformation - PriviledgedActionException as:cloudera cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/crawl_fetch

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/crawl_parse

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/parse_data

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/parse_text

2015-05-29 03:05:41,153 ERROR solr.SolrIndexer - org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/crawl_fetch

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/crawl_parse

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/parse_data

Input path does not exist: file:/home/cloudera/apache-nutch-1.6/bin/20150529030452/parse_text
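
As a first diagnostic, it may help to check which of the directories named in the error actually exist on disk. The following is only a minimal sketch, not part of Nutch: the helper name `missing_segment_parts` is hypothetical, the list of subdirectories is taken from the error messages above, and the path is copied from the error and should be adjusted to the segment being indexed.

```python
import os

# Subdirectories named in the error messages above.
EXPECTED_PARTS = ["crawl_fetch", "crawl_parse", "parse_data", "parse_text"]


def missing_segment_parts(segment_dir, expected=EXPECTED_PARTS):
    """Return the expected subdirectories that are absent from segment_dir.

    Hypothetical helper for illustration; it only inspects the local
    filesystem and does not call any Nutch or Hadoop API.
    """
    return [name for name in expected
            if not os.path.isdir(os.path.join(segment_dir, name))]


if __name__ == "__main__":
    # Path copied from the error message; adjust as needed.
    segment = "/home/cloudera/apache-nutch-1.6/bin/20150529030452"
    missing = missing_segment_parts(segment)
    if missing:
        print("Missing under %s: %s" % (segment, ", ".join(missing)))
    else:
        print("All expected subdirectories are present.")
```

If every subdirectory is reported missing, one possibility worth checking is whether the segment path passed to the indexing step points at the real segment directory at all.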

What does this error mean? Can you please explain what the issue is and how I can solve it?

I would greatly appreciate your help.
