Create an Account
username: password:
 
  MemeStreams Logo

MemeStreams Discussion

search


This page contains all of the posts and discussion on MemeStreams referencing the following web page: [#HADOOP-5815] Sqoop: A database import tool for Hadoop - ASF JIRA. You can find discussions on MemeStreams as you surf the web, even if you aren't a MemeStreams member, using the Threads Bookmarklet.

[#HADOOP-5815] Sqoop: A database import tool for Hadoop - ASF JIRA
by Lost at 9:58 pm EDT, May 19, 2009

Overview:

Sqoop is a tool designed to help users import existing relational databases into their Hadoop clusters. Sqoop uses JDBC to connect to a database, examine the schema for tables, and auto-generate the necessary classes to import data into HDFS. It then instantiates a MapReduce job to read the table from the database via the DBInputFormat (JDBC-based InputFormat). The table is read into a set of files loaded into HDFS. Both SequenceFile and text-based targets are supported.

Longer term, Sqoop will support automatic connectivity to Hive, with the ability to load data files directly into the Hive warehouse directory, and also to inject the appropriate table definition into the metastore.


 
 
Powered By Industrial Memetics