About IBM:

IBM’s greatest invention is the IBMer. We believe that through the application of intelligence, reason, and science, we can improve business, society, and the human condition, bringing the power of an open hybrid cloud and AI strategy to life for our clients and partners around the world.

Job description

As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions.

You will be involved in data engineering activities such as creating source-to-target pipelines and workflows and implementing solutions that address clients' needs.

Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements.

Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing, data-driven organization.

Coordinate data access and security so that data scientists and analysts can easily access data whenever they need to.

Eligibility Criteria:

Any Bachelor’s degree.

Preferred skills:

Developed PySpark code for AWS Glue jobs and for EMR. Worked on scalable distributed data systems using the Hadoop ecosystem on AWS EMR and the MapR distribution.

Developed Python and PySpark programs for data analysis. Good working experience with Python, including developing a custom framework for generating rules (similar to a rules engine).

Developed Hadoop Streaming jobs using Python to integrate Python API-supported applications.

Developed Python code to gather data from HBase and designed the solution for implementation in PySpark. Used Apache Spark DataFrames and RDDs to apply business transformations, and HiveContext objects to perform read/write operations.

Rewrote some Hive queries in Spark SQL to reduce the overall batch time.