Thursday, 14 June 2018

Cloudant - Continuing to tinker

I'm importing data from a Comma Separated Value (CSV) file into Cloudant, using the most excellent CouchDB tools provided by my IBM colleague, Glynn Bird.

Having created a CSV: -

vi cartoon.csv 

id,givenName,familyName
1,Maggie,Simpson
2,Lisa,Simpson
3,Bart,Simpson
4,Homer,Simpson
5,Fred,Flintstone
6,Wilma,Flintstone
7,Barney,Rubble
8,Betty,Rubble


( with due respect to the creators and owners of The Simpsons and The Flintstones )

I set up my environment: -

export ACCOUNT=0e5c777542c5e2cc2418013429e0824f-bluemix:d088ff753c9e258add92e45128cd161dacbffedbcec0c8f78b216368ba0503ab


export HOST=d088ff753c9e258add92e45128cd161d-bluemix.cloudant.com

export COUCH_URL=https://$ACCOUNT@$HOST

export COUCH_DATABASE="CARTOON"

export COUCH_DATABASE=$(echo "$COUCH_DATABASE" | tr '[:upper:]' '[:lower:]')

export COUCH_DELIMITER=","

and created my database: -

curl -X PUT $COUCH_URL/$COUCH_DATABASE

and populated it: -

cat $COUCH_DATABASE.csv | couchimport

This worked well, but my data had a system-generated _id field, whereas I wanted to use my own id field: -

{
  "_id": "e143bcd25bc620e6aa8f2adc206cf21c",
  "_rev": "1-0152a3e6867ad34da6e882a80f0fbeff",
  "id": "1",
  "givenName": "Maggie",
  "familyName": "Simpson"
}

{
  "_id": "82c1068c830759a904cfdd02ab41b980",
  "_rev": "1-6bbb94301323a3c3f6ff54f1c3c765e5",
  "id": "2",
  "givenName": "Lisa",
  "familyName": "Simpson"
}
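
The reason is that each CSV column simply becomes a field of the same name, so the id column lands in a field called id and the _id is generated by the system. A rough sketch of that mapping in plain JavaScript; the rowToDoc helper is hypothetical, is not couchimport's actual implementation, and ignores quoting and escaping: -

```javascript
// Hypothetical helper: map one delimited line onto the header's column names.
// Illustration only; real CSV parsing must also handle quotes and escapes.
function rowToDoc(headerLine, dataLine, delimiter) {
  const names = headerLine.split(delimiter)
  const values = dataLine.split(delimiter)
  const doc = {}
  names.forEach((name, i) => {
    doc[name] = values[i]   // values stay strings, e.g. "1" rather than 1
  })
  return doc
}

console.log(rowToDoc('id,givenName,familyName', '1,Maggie,Simpson', ','))
```

Nothing in that mapping produces an _id field, which is why one is generated when the document is stored.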

Thankfully, Glynn kindly advised me how to use a JavaScript transform function to fix this: -

vi ~/transform_cartoon.js

var transform = function(doc) {
  doc._id = doc.id
  delete doc.id
  return doc
}

module.exports = transform

which sets the _id field to the value of the id field ( as taken from the CSV ) and then drops the original id field.
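
The transform can be tried locally with Node.js before touching the database. A minimal sketch, assuming a sample document shaped like the rows in cartoon.csv; the sample object is made up for illustration: -

```javascript
// Same transform as in ~/transform_cartoon.js
var transform = function(doc) {
  doc._id = doc.id
  delete doc.id
  return doc
}

// Sample document, shaped like a row from cartoon.csv (illustrative only)
var sample = { id: '1', givenName: 'Maggie', familyName: 'Simpson' }
var result = transform(sample)

// Print the transformed document for inspection
console.log(JSON.stringify(result, null, 2))
```

Note that the transform modifies the document in place, so sample and result refer to the same object.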

I dropped the DB: -

curl -X DELETE $COUCH_URL/$COUCH_DATABASE

and recreated it: -

curl -X PUT $COUCH_URL/$COUCH_DATABASE

and then repopulated it: -

cat $COUCH_DATABASE.csv | couchimport --transform ~/transform_cartoon.js

and now we have this: -

{
  "_id": "1",
  "_rev": "1-0e77dbadefba2a95e5cde5bda2ecd695",
  "givenName": "Maggie",
  "familyName": "Simpson"
}

{
  "_id": "2",
  "_rev": "1-fc746edc394ac98b013b7788cc1cca5d",
  "givenName": "Lisa",
  "familyName": "Simpson"
}

If needed, I could modify my transform: -

var transform = function(doc) {
  doc._id = doc.id
  return doc
}

module.exports = transform

so that it keeps the original id field, which gives me this: -

{
  "_id": "1",
  "_rev": "1-0152a3e6867ad34da6e882a80f0fbeff",
  "id": "1",
  "givenName": "Maggie",
  "familyName": "Simpson"
}

{
  "_id": "2",
  "_rev": "1-6bbb94301323a3c3f6ff54f1c3c765e5",
  "id": "2",
  "givenName": "Lisa",
  "familyName": "Simpson"
}

so I have choices :-) 
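
Both choices could live in one file. A sketch, assuming a hypothetical makeTransform helper and keepId option of my own naming ( neither is part of couchimport ); it also leaves rows without an id untouched, so that an _id is still generated for them: -

```javascript
// Hypothetical factory: returns a transform that copies id into _id.
// keepId is an illustrative option name, not a couchimport setting.
function makeTransform(options) {
  var keepId = Boolean(options && options.keepId)
  return function(doc) {
    // No usable id: leave the doc alone so an _id is generated on insert
    if (doc.id === undefined || doc.id === null || doc.id === '') {
      return doc
    }
    doc._id = String(doc.id)   // document IDs must be strings
    if (!keepId) {
      delete doc.id
    }
    return doc
  }
}

// Export whichever variant is wanted; here the original id field is kept
module.exports = makeTransform({ keepId: true })
```

The file would be passed to couchimport with --transform, exactly as before.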

For more insights, please go here: -


