Tuesday 13 August 2019

Python Learnings #1 - decoding Base64 strings

Long story short, I was seeing exceptions such as this: -

Traceback (most recent call last):
  File "decode.py", line 8, in
    decoded_string = base64.b64decode(string).decode('UTF-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 0: invalid start byte

and: -

Traceback (most recent call last):
  File "decode.py", line 8, in
    decoded_string = base64.b64decode(string).decode('UTF-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 25: invalid start byte

with a piece of inherited Python code.

I wasn't 100% clear how the decoder was being used, apart from this: -----

decoded_string = base64.b64decode(string).decode('UTF-8')

where string contained, I assume,  Base64 encoded string.

The actual string was a public key, generated using openssl : -

ssh-keygen -t rsa -b 4096 -f ~/.ssh/dummy -N ""

I wrote a test harness to simulate the problem and test potential solutions, and was posting the public key to the string variable: -

cat ~/.ssh/dummy.pub | awk '{print $2}'

AAAAB3NzaC1yc2EAAAADAQABAAACAQDXmbB7kUK4G0Fqm+5SSDztAMR5mV+0irWGLFuZN7Pbj30Kyi67TZ3J1cEhC3PsDyFW4hkvMRpdOoSlUfL2yVb1IxvbidcPF0ihtHgnMD2pn3W8xwFpbutpPWUgPd679Yq1C/bzFx2lIDWBpy5bSj/TpTWRsdFy7Z1Esja2ST8RfUByAl5zsg6fuyFFySzY8bVgH/Oc+eS82tICS1ZqdXJy6atsJQ2OnP7zTrw4Txz+vwpmQeddWSjL1wUs77ea0FJjU2MMFHm6+uW+cAr2woYlA4Lac6d+Mq9t5Ibt77J8BijkjJ+U79JhNSky0A2rSeThdWuD7uW/Kju43m6fb5ss/ATKbra/M3hUPg0F0YwtiDmPratCkE11uJnFfyYaPpt58LrgvYZzosliQe96AeCWru6IzEkGoGErSfl/PwielDWzDWXuNxY00gQ0Rtx3I76g6gV01gbxKcBusLTFh51GC0PvVEikhk5cI+drbT1uMDjLHi6Tr2MO+uRdu2BpwVQIZgSUke3OpnjQ2rDTIcaKy6e5lfJ7Hpw0kIw0Bi9j9YDMod90TRQXdElWFKeKQ+ZlaH9Ytr2FeDk+9H69kf52rXtn8q9Uy/NtlIdKsYa2pGdv7N1IFumGX+GbYplewTta/05OaJXI3iia1CV09oFryag+5MYQmJRCijSlUBIFjQ==

Can you see where I was going wrong ?

Yes, the public key is NOT Base64 encoded ....

The solution was to encode the public key: -

cat ~/.ssh/dummy.pub | awk '{print $2}' | base64

QUFBQUIzTnphQzF5YzJFQUFBQURBUUFCQUFBQ0FRRFhtYkI3a1VLNEcwRnFtKzVTU0R6dEFNUjVtViswaXJXR0xGdVpON1BiajMwS3lpNjdUWjNKMWNFaEMzUHNEeUZXNGhrdk1ScGRPb1NsVWZMMnlWYjFJeHZiaWRjUEYwaWh0SGduTUQycG4zVzh4d0ZwYnV0cFBXVWdQZDY3OVlxMUMvYnpGeDJsSURXQnB5NWJTai9UcFRXUnNkRnk3WjFFc2phMlNUOFJmVUJ5QWw1enNnNmZ1eUZGeVN6WThiVmdIL09jK2VTODJ0SUNTMVpxZFhKeTZhdHNKUTJPblA3elRydzRUeHordndwbVFlZGRXU2pMMXdVczc3ZWEwRkpqVTJNTUZIbTYrdVcrY0FyMndvWWxBNExhYzZkK01xOXQ1SWJ0NzdKOEJpamtqSitVNzlKaE5Ta3kwQTJyU2VUaGRXdUQ3dVcvS2p1NDNtNmZiNXNzL0FUS2JyYS9NM2hVUGcwRjBZd3RpRG1QcmF0Q2tFMTF1Sm5GZnlZYVBwdDU4THJndllaem9zbGlRZTk2QWVDV3J1Nkl6RWtHb0dFclNmbC9Qd2llbERXekRXWHVOeFkwMGdRMFJ0eDNJNzZnNmdWMDFnYnhLY0J1c0xURmg1MUdDMFB2VkVpa2hrNWNJK2RyYlQxdU1EakxIaTZUcjJNTyt1UmR1MkJwd1ZRSVpnU1VrZTNPcG5qUTJyRFRJY2FLeTZlNWxmSjdIcHcwa0l3MEJpOWo5WURNb2Q5MFRSUVhkRWxXRktlS1ErWmxhSDlZdHIyRmVEays5SDY5a2Y1MnJYdG44cTlVeS9OdGxJZEtzWWEycEdkdjdOMUlGdW1HWCtHYllwbGV3VHRhLzA1T2FKWEkzaWlhMUNWMDlvRnJ5YWcrNU1ZUW1KUkNpalNsVUJJRmpRPT0K

at which point my code started working: -

python3 decode.py

which returns the original unencoded public key: -

AAAAB3NzaC1yc2EAAAADAQABAAACAQDXmbB7kUK4G0Fqm+5SSDztAMR5mV+0irWGLFuZN7Pbj30Kyi67TZ3J1cEhC3PsDyFW4hkvMRpdOoSlUfL2yVb1IxvbidcPF0ihtHgnMD2pn3W8xwFpbutpPWUgPd679Yq1C/bzFx2lIDWBpy5bSj/TpTWRsdFy7Z1Esja2ST8RfUByAl5zsg6fuyFFySzY8bVgH/Oc+eS82tICS1ZqdXJy6atsJQ2OnP7zTrw4Txz+vwpmQeddWSjL1wUs77ea0FJjU2MMFHm6+uW+cAr2woYlA4Lac6d+Mq9t5Ibt77J8BijkjJ+U79JhNSky0A2rSeThdWuD7uW/Kju43m6fb5ss/ATKbra/M3hUPg0F0YwtiDmPratCkE11uJnFfyYaPpt58LrgvYZzosliQe96AeCWru6IzEkGoGErSfl/PwielDWzDWXuNxY00gQ0Rtx3I76g6gV01gbxKcBusLTFh51GC0PvVEikhk5cI+drbT1uMDjLHi6Tr2MO+uRdu2BpwVQIZgSUke3OpnjQ2rDTIcaKy6e5lfJ7Hpw0kIw0Bi9j9YDMod90TRQXdElWFKeKQ+ZlaH9Ytr2FeDk+9H69kf52rXtn8q9Uy/NtlIdKsYa2pGdv7N1IFumGX+GbYplewTta/05OaJXI3iia1CV09oFryag+5MYQmJRCijSlUBIFjQ==

No comments:

Visual Studio Code - Wow 🙀

Why did I not know that I can merely hit [cmd] [p]  to bring up a search box allowing me to search my project e.g. a repo cloned from GitHub...