Sunday, September 8, 2019

AWS EMR some basic learnings (python and spark)

As a new user of this service, I found it a bit confusing to start with, especially as there seems to be an endless number of contradictory articles about how to add steps and what they should execute.

My main issue was with this part of the documentation:

Steps=[
    {
        'Name': 'string',
        'ActionOnFailure': 'TERMINATE_JOB_FLOW'|'TERMINATE_CLUSTER'|'CANCEL_AND_WAIT'|'CONTINUE',
        'HadoopJarStep': {
            'Properties': [
                {
                    'Key': 'string',
                    'Value': 'string'
                },
            ],
            'Jar': 'string',
            'MainClass': 'string',
            'Args': [
                'string',
            ]
        }
    },
],


It looks like Jar is not an optional parameter, so how am I supposed to run python? I don’t have any jars...

So here is what I discovered:
AWS EMR provides two jars:

script-runner.jar and command-runner.jar

script-runner.jar can execute scripts, so it will get a script file as a parameter.
command-runner.jar is similar to opening an SSH connection and running commands.
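To make the difference between the two jars concrete, here is a minimal sketch of how each kind of step could be built as a plain Python dict. The helper names are made up for illustration, and the S3 location of script-runner.jar is the region-based path as I understand it from the EMR documentation, so treat it as an assumption and check it for your region.

```python
def script_runner_step(name, script_uri, region):
    """Step that runs a script stored in S3 through script-runner.jar."""
    # Assumption: script-runner.jar lives in a region-specific EMR bucket.
    jar = f's3://{region}.elasticmapreduce/libs/script-runner/script-runner.jar'
    return {
        'Name': name,
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': jar,
            # The script to execute is passed as the first argument.
            'Args': [script_uri],
        },
    }


def command_runner_step(name, command):
    """Step that runs a command (given as a list of words) through command-runner.jar."""
    return {
        'Name': name,
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': list(command),
        },
    }


if __name__ == '__main__':
    # Hypothetical bucket and file names.
    print(script_runner_step('Run script', 's3://my-bucket/setup.sh', 'us-east-1'))
    print(command_runner_step('Run command', ['spark-submit', 's3://my-bucket/job.py']))
```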

For my use case I think command-runner was the best fit, so for the simplest case of running spark-submit with a Python file, my step becomes:


{
    "Name": "Python Step",
    "ActionOnFailure": "CONTINUE",
    "HadoopJarStep": {
        "Properties": [],
        "Jar": "command-runner.jar",
        "Args": [
            "spark-submit",
            "s3://bucket/my_spark_python_file.py"
        ]
    }
}

The elements of Args are joined with spaces, so the command that is going to be executed is: "spark-submit s3://bucket/my_spark_python_file.py"

Any additional parameter I want to pass is added as another element of the Args list, so the command is constructed just as if I were running it in bash.

For example: '--executor-memory', '5g', '--something-else', 'else'
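Here is a rough sketch of how such a step could be assembled and submitted from Python with boto3. The function names, the bucket and the file name are hypothetical; add_job_flow_steps is the boto3 EMR call for adding steps to a running cluster. Note that spark-submit options have to come before the script path, and anything after the path is passed to the script itself.

```python
import shlex


def spark_submit_step(script_uri, spark_options=(), script_args=(), name='Python Step'):
    """Build a command-runner step that runs spark-submit on a Python file."""
    args = ['spark-submit', *spark_options, script_uri, *script_args]
    return {
        'Name': name,
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': args,
        },
    }


def preview_command(step):
    """Show the command line the step's Args correspond to."""
    return shlex.join(step['HadoopJarStep']['Args'])


def submit_step(cluster_id, step, region_name):
    """Add the step to a running cluster and return the new step id."""
    import boto3  # imported here so the helpers above work without boto3 installed

    client = boto3.client('emr', region_name=region_name)
    response = client.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response['StepIds'][0]


if __name__ == '__main__':
    step = spark_submit_step(
        's3://my-bucket/job.py',  # hypothetical script location
        spark_options=['--executor-memory', '5g'],
    )
    print(preview_command(step))
```

Previewing the command before submitting is a cheap way to check that the list really spells out the bash command I had in mind.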

Also, to enable logging and debugging, another step needs to be added:
{
    'Name': 'Setup Hadoop Debugging',
    'ActionOnFailure': 'TERMINATE_CLUSTER',
    'HadoopJarStep': {
        'Jar': 'command-runner.jar',
        'Args': ['state-pusher-script']
    }
}
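A small sketch of how the debugging step could be combined with the other steps when creating a cluster. The helper names are my own; LogUri and Steps are parameters of boto3's run_job_flow. As far as I understand, debugging only helps if the cluster also writes its logs to S3, so the sketch assumes a LogUri is set as well.

```python
import copy

DEBUG_STEP = {
    'Name': 'Setup Hadoop Debugging',
    'ActionOnFailure': 'TERMINATE_CLUSTER',
    'HadoopJarStep': {
        'Jar': 'command-runner.jar',
        'Args': ['state-pusher-script'],
    },
}


def with_debugging(steps):
    """Return the steps with the debugging step first, unless it is already there."""
    steps = list(steps)
    already_present = any(
        s.get('HadoopJarStep', {}).get('Args') == ['state-pusher-script']
        for s in steps
    )
    if already_present:
        return steps
    return [copy.deepcopy(DEBUG_STEP), *steps]


def logging_options(log_uri, steps):
    """Keyword arguments to merge into a run_job_flow call (assumed usage)."""
    return {'LogUri': log_uri, 'Steps': with_debugging(steps)}


if __name__ == '__main__':
    # Hypothetical log location; the rest of the run_job_flow arguments are omitted.
    print(logging_options('s3://my-bucket/emr-logs/', []))
```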

Thursday, August 22, 2019

OnePlus 6T Windows 10 file access issue (Shows Drivers CD only)

OnePlus 6T – when connecting to USB port shows only the CD Drive that contains the drivers.

Tried:

1. Installing the drivers from CD drive

2. Enabling Developer options and File Access.

3. Installing and uninstalling MTP from device manager (shows as issue).

4. Rebooting both the device and the PC

5. Many more – nothing helped.


Solution:

Installing “Media Feature Pack for N versions of Windows 10” and rebooting.

Now the device shows up as expected in both Device Manager and Windows Explorer.

Monday, June 24, 2019

Jenkins Groovy issue with disappearing quotes

One of the really annoying issues in Jenkins I’ve encountered recently is disappearing double and single quotes.
There are solutions out there that use double, triple or even more quotes, but they didn’t work for us.
My colleague David found a nice solution which has worked every time for us:

Surround your string with $/ {my string with quotes} /$ instead of outer quotes (this is Groovy’s dollar slashy string syntax). Now we can use any combination of quotes inside the string without any issues.

Thanks David!

Sunday, January 27, 2019

AWS Cognito - Get token with ADMIN_NO_SRP_AUTH (Python)



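import base64
import hashlib
import hmac

import boto3
import botocore.exceptions

# Replace the bracketed placeholders with your own values.
PROFILE_NAME = '[PROFILE_NAME]'
REGION = '[REGION]'
USER_POOL_ID = '[USER_POOL_ID]'
CLIENT_ID = '[CLIENT_ID]'
CLIENT_SECRET = '[CLIENT_SECRET]'


# creating secret hash
def get_secret_hash(username):
    # Base64-encoded HMAC-SHA256 of username + client id, keyed with the client secret.
    message = username + CLIENT_ID
    dig = hmac.new(CLIENT_SECRET.encode('utf-8'), msg=message.encode('utf-8'),
                   digestmod=hashlib.sha256).digest()
    return base64.b64encode(dig).decode()


def get_token(username, password):
    client = boto3.Session(profile_name=PROFILE_NAME).client(
        'cognito-idp', region_name=REGION)
    try:
        return client.admin_initiate_auth(
            UserPoolId=USER_POOL_ID,
            ClientId=CLIENT_ID,
            AuthFlow='ADMIN_NO_SRP_AUTH',
            AuthParameters={
                'USERNAME': username,
                'PASSWORD': password,
                'SECRET_HASH': get_secret_hash(username),
            },
        )
    except botocore.exceptions.ClientError as e:
        # On failure, return the error payload instead of raising.
        return e.response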
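
The call above returns the whole response, so one more step is needed to get at the actual tokens. Below is a minimal sketch of a helper for that (extract_tokens is a hypothetical name of mine). It assumes the usual shape of the admin_initiate_auth response: the tokens sit under AuthenticationResult on success, a ChallengeName appears when Cognito wants another step (for example a forced password change), and an Error entry is present when the ClientError payload was returned instead.

```python
def extract_tokens(response):
    """Pull the tokens out of an admin_initiate_auth response dict."""
    result = response.get('AuthenticationResult')
    if result is None:
        if 'ChallengeName' in response:
            reason = 'challenge required: ' + response['ChallengeName']
        elif 'Error' in response:
            error = response['Error']
            reason = '{}: {}'.format(error.get('Code'), error.get('Message'))
        else:
            reason = 'no AuthenticationResult in response'
        raise ValueError('Authentication did not return tokens (' + reason + ')')
    # RefreshToken may be missing depending on the flow, hence .get().
    return {
        'access_token': result.get('AccessToken'),
        'id_token': result.get('IdToken'),
        'refresh_token': result.get('RefreshToken'),
    }


if __name__ == '__main__':
    # Fake response with made-up values, only to show the shape.
    fake = {'AuthenticationResult': {'AccessToken': 'aaa', 'IdToken': 'bbb'}}
    print(extract_tokens(fake))
```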