I'm importing data from a Comma Separated Value (CSV) file into Cloudant, using the most excellent CouchDB tools provided by my IBM colleague, Glynn Bird.
Having created a CSV: -
vi cartoon.csv
id,givenName,familyName
1,Maggie,Simpson
2,Lisa,Simpson
3,Bart,Simpson
4,Homer,Simpson
5,Fred,Flintstone
6,Wilma,Flintstone
7,Barney,Rubble
8,Betty,Rubble
( with due respect to the creators and owners of The Simpsons and The Flintstones )
I set up my environment: -
export ACCOUNT=0e5c777542c5e2cc2418013429e0824f-bluemix:d088ff753c9e258add92e45128cd161dacbffedbcec0c8f78b216368ba0503ab
export HOST=d088ff753c9e258add92e45128cd161d-bluemix.cloudant.com
export COUCH_URL=https://$ACCOUNT@$HOST
export COUCH_DATABASE="CARTOON"
export COUCH_DATABASE=`echo $COUCH_DATABASE | tr '[:upper:]' '[:lower:]'`
export COUCH_DELIMITER=","
and created my database: -
curl -X PUT $COUCH_URL/$COUCH_DATABASE
and populated it: -
cat $COUCH_DATABASE.csv | couchimport
This worked well, but each document was given a system-generated _id field, whereas I wanted to use my own ID field: -
{
"_id": "e143bcd25bc620e6aa8f2adc206cf21c",
"_rev": "1-0152a3e6867ad34da6e882a80f0fbeff",
"id": "1",
"givenName": "Maggie",
"familyName": "Simpson"
}
{
"_id": "82c1068c830759a904cfdd02ab41b980",
"_rev": "1-6bbb94301323a3c3f6ff54f1c3c765e5",
"id": "2",
"givenName": "Lisa",
"familyName": "Simpson"
}
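Judging from the output above, each CSV row becomes one document, keyed by the header names, with every value kept as a string and no _id of my own. Here is a rough sketch of that mapping to make it concrete. This is my own illustration, not couchimport's actual code, and rowsToDocs is a name I made up: -

```javascript
// Illustrative only: a naive header-based CSV-to-document mapping.
// It does not handle quoted fields or delimiters embedded in values.
function rowsToDocs(csvText, delimiter) {
  var lines = csvText.trim().split('\n')
  var headers = lines[0].split(delimiter)
  return lines.slice(1).map(function (line) {
    var values = line.split(delimiter)
    var doc = {}
    // Pair each header with the value in the same column
    headers.forEach(function (name, i) {
      doc[name] = values[i]
    })
    return doc
  })
}

var docs = rowsToDocs('id,givenName,familyName\n1,Maggie,Simpson\n2,Lisa,Simpson', ',')
console.log(docs)
```

Since none of these documents carries an _id, the database has to generate one for each of them on insert.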
Thankfully, Glynn kindly advised me how to use a JavaScript transform function to fix this: -
vi ~/transform_cartoon.js
var transform = function(doc) {
  doc._id = doc.id
  delete doc.id
  return doc
}
module.exports = transform
which sets the _id field to the value of the id field ( as taken from the CSV ) and then drops the original id field.
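The transform is just a plain function, so it can be tried locally against a hand-written row before going anywhere near the database. Here is a minimal sketch of that. The guard for a missing or empty id is my own addition rather than part of Glynn's advice, on the assumption that such rows should simply fall back to a generated _id: -

```javascript
// Same idea as the transform above, plus a guard (my addition) so that
// rows without a usable id are left alone and get a generated _id instead.
var transform = function (doc) {
  if (doc.id !== undefined && doc.id !== '') {
    doc._id = String(doc.id) // _id has to be a string
    delete doc.id
  }
  return doc
}

// Quick local check with a hand-written row; no database needed
var sample = { id: '1', givenName: 'Maggie', familyName: 'Simpson' }
var result = transform(sample)
console.log(JSON.stringify(result))

module.exports = transform
```

Running the file with node prints the transformed sample, which is a cheap way to catch a typo in the function before dropping and reloading a database.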
I dropped the DB: -
curl -X DELETE $COUCH_URL/$COUCH_DATABASE
and recreated it: -
curl -X PUT $COUCH_URL/$COUCH_DATABASE
and then repopulated it: -
cat $COUCH_DATABASE.csv | couchimport --transform ~/transform_cartoon.js
and now we have this: -
{
"_id": "1",
"_rev": "1-0e77dbadefba2a95e5cde5bda2ecd695",
"givenName": "Maggie",
"familyName": "Simpson"
}
{
"_id": "2",
"_rev": "1-fc746edc394ac98b013b7788cc1cca5d",
"givenName": "Lisa",
"familyName": "Simpson"
}
If needed, I could modify my transform: -
var transform = function(doc) {
  doc._id = doc.id
  return doc
}
module.exports = transform
to avoid dropping the original id field, to give me this: -
{
"_id": "1",
"_rev": "1-0152a3e6867ad34da6e882a80f0fbeff",
"id": "1",
"givenName": "Maggie",
"familyName": "Simpson"
}
{
"_id": "2",
"_rev": "1-6bbb94301323a3c3f6ff54f1c3c765e5",
"id": "2",
"givenName": "Lisa",
"familyName": "Simpson"
}
so I have choices :-)
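The two variants differ only in whether the id field is deleted, so they could live in one file. Here is a sketch of how that might look. makeTransform and keepId are hypothetical names of my own, nothing that couchimport provides: -

```javascript
// makeTransform and keepId are hypothetical names, not part of couchimport.
function makeTransform(keepId) {
  return function (doc) {
    doc._id = doc.id
    if (!keepId) {
      delete doc.id // drop the original field unless asked to keep it
    }
    return doc
  }
}

// Export whichever variant is wanted; what gets exported is still
// a plain function taking a doc and returning a doc.
module.exports = makeTransform(true)
```

Switching between the two behaviours is then a one-word change on the last line.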
For more insights, please go here: -